Carnegie Mellon study of Twitter sentiments yields results similar to public opinion polls
May 11, 2010
Computer analysis of sentiments expressed in a billion Twitter messages during 2008-2009 yielded measures of consumer confidence and of presidential job approval similar to those of well-established public opinion polls, Carnegie Mellon University researchers report.
The findings suggest that analyzing the text found in streams of tweets could become a cheap, rapid means of gauging public opinion on at least some subjects, said Noah Smith, assistant professor of language technologies and machine learning in the School of Computer Science. But tools for extracting public opinion from social media text are still crude and social media remain in their infancy, he cautioned, so the extent to which these methods could replace or supplement traditional polling is still unknown.
"With seven million or more messages being tweeted each day, this data stream potentially allows us to take the temperature of the population very quickly," Smith said. "The results are noisy, as are the results of polls. Opinion pollsters have learned to compensate for these distortions, while we're still trying to identify and understand the noise in our data. Given that, I'm excited that we get any signal at all from social media that correlates with the polls."
The study findings will be presented May 25 at the Association for the Advancement of Artificial Intelligence's International Conference on Weblogs and Social Media in Washington, D.C.
In the study, Smith and his colleagues collected a billion microblog messages, averaging about 11 words each, posted to Twitter during 2008 and 2009. They used simple text analysis techniques to identify messages that pertained to the economy or to politics, and then looked for words within the text that indicated whether the writer expressed positive or negative sentiments.
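As a rough illustration of the kind of simple text analysis described above, the following Python sketch filters messages by topic keyword and then counts how many contain positive or negative words. It is not the researchers' code: the keyword set, the two word lists, the `tokenize` helper and the choice to summarize sentiment as a ratio of positive to negative messages are all assumptions made for this example.

```python
import re

# Hypothetical topic keywords and sentiment word lists, for illustration only.
# A real analysis would rely on a much larger, curated lexicon.
TOPIC_KEYWORDS = {"economy", "job", "jobs"}
POSITIVE_WORDS = {"good", "great", "hope", "better"}
NEGATIVE_WORDS = {"bad", "worse", "fear", "lost"}


def tokenize(message):
    """Lowercase a message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def sentiment_ratio(messages, topic_keywords=TOPIC_KEYWORDS):
    """Return (# positive on-topic messages) / (# negative on-topic messages).

    A message is on topic if it contains any topic keyword. It counts as
    positive (or negative) if it contains at least one positive (or negative)
    word, so a single message can count as both. Returns None when no
    on-topic message contains a negative word.
    """
    positive = 0
    negative = 0
    for message in messages:
        words = set(tokenize(message))
        if not words & topic_keywords:
            continue  # skip messages that are off topic
        if words & POSITIVE_WORDS:
            positive += 1
        if words & NEGATIVE_WORDS:
            negative += 1
    return positive / negative if negative else None


if __name__ == "__main__":
    sample = [
        "The economy looks great today",
        "Lost my job, bad news",
        "I fear the economy is getting worse",
        "Great weather today",  # off topic, ignored
    ]
    print(sentiment_ratio(sample))  # 1 positive vs. 2 negative -> 0.5
```

Word counting of this kind misjudges many individual messages (sarcasm and negation, for example), which is why it is only meaningful in aggregate over a large number of messages.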
Results regarding consumer confidence were compared with the Index of Consumer Sentiment (ICS) from Reuters/University of Michigan Surveys of Consumers and the Gallup Organization's Economic Confidence Index. Political sentiments regarding President Obama were compared with Gallup's daily tracking poll on presidential job approval, and views regarding the 2008 U.S. presidential election were compared with a compilation of 46 different polls prepared by Pollster.com. The ICS, Gallup and Pollster.com measurements were all obtained from telephone surveys using traditional polling techniques.
The Twitter-derived sentiment measurements were much more volatile day-to-day than the polling data, but when the researchers "smoothed" the results by averaging them over a period of days, the results often correlated closely with the polling data, said Brendan O'Connor, a graduate student in Carnegie Mellon's Language Technologies Institute and first author of the study. Consumer confidence, for instance, followed the same general slide through 2008 and the same rebound in February/March of 2009 as was seen in the poll data. The researchers noted that the ICS and Gallup data had a correlation of 86 percent over the period; the Twitter-derived sentiments had between 72 percent and 79 percent correlation with the Gallup data, depending on the number of days averaged to smooth the data.
Likewise, both the Twitter-derived sentiments and the traditional polls reflected declining approval of President Obama's job performance during 2009, with a 72 percent correlation between them.
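The smoothing and comparison steps described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions, not the study's implementation: it assumes a trailing moving average (each day averaged with the days before it) and the Pearson correlation coefficient, and the function names `moving_average` and `pearson` as well as the sample numbers are invented for illustration.

```python
from statistics import mean


def moving_average(values, window):
    """Smooth a daily series by averaging each value with up to
    `window - 1` preceding values (a trailing moving average)."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up numbers: a noisy daily sentiment score and a slow-moving poll.
    daily_sentiment = [1.2, 0.4, 1.5, 0.3, 1.1, 0.2, 0.9, 0.1]
    poll_index = [60, 58, 57, 55, 54, 52, 51, 49]

    for window in (1, 3, 5):
        smoothed = moving_average(daily_sentiment, window)
        print(window, round(pearson(smoothed, poll_index), 2))
```

A longer window suppresses more of the day-to-day noise but also makes the smoothed series slower to react to real shifts, which is one reason a reported correlation can vary with the number of days averaged.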
But the researchers found that their sentiment analysis did not correlate as well with election polling during 2008. For instance, increased mentions of "Obama" tended to correlate with rises in Barack Obama's polling numbers, but increased mentions of "McCain" also correlated with rises in Obama's popularity. Improved computational methods for understanding natural language, particularly the unusual lexicon of microblogs, will be necessary before Twitter feeds can be reliably mined to predict elections, the researchers concluded.
"The Web is so mainstream now that there's no question that the Web is representative somehow of the population," O'Connor said. But pinning down Web demographics is still difficult, he acknowledged, noting that Twitter traffic alone increased by a factor of 50 during the two-year span of the study.
Using computer programs to judge the sentiments of microblogs is fraught with potential error, but even with the crude tools used in this exploratory research, the accuracy is better than can be achieved by chance, O'Connor said. "The massive amount of data was crucial in making this work," he explained. "We don't need to get the sentiment of every individual right to understand sentiments in aggregate."
Improved natural language processing tools, as well as query-driven analysis and use of demographic and time stamp data available on some social media sites, could increase the sophistication and reliability of microblog analysis.
More information: Download a copy of the paper here, http://www.cs.cmu. … du/~nasmith/