Monday, November 5, 2012

Obama wins the election! (on Twitter)

Can Twitter predict the outcome of the US election tomorrow? If our results are anything to go by, then Barack Obama will be reelected. The data presented below are the result of some research that Adham Tamer, Ning Wang, Scott Hale and I (Mark Graham) carried out in order to see how visible the two major presidential candidates are on Twitter.

We collected about 30 million geocoded tweets between Oct 1 and Nov 1 and pulled out all references to Obama and Romney. You can see the initial results in the map below.
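For readers curious how references might be pulled out of a tweet collection, here is a minimal sketch in Python. It is not the code we actually used: the record layout (a `text` field and a `state` field), the sample tweets, and the simple case-insensitive whole-word match are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical matching rule: a case-insensitive whole-word match on the surname.
CANDIDATES = ("obama", "romney")
PATTERNS = {name: re.compile(rf"\b{name}\b", re.IGNORECASE) for name in CANDIDATES}


def count_mentions(tweets):
    """Return {state: Counter({candidate: number of tweets mentioning them})}."""
    counts = defaultdict(Counter)
    for tweet in tweets:
        for name, pattern in PATTERNS.items():
            # A tweet counts once per candidate, however often the name appears.
            if pattern.search(tweet["text"]):
                counts[tweet["state"]][name] += 1
    return counts


# Made-up sample records, not real data from the study.
sample_tweets = [
    {"state": "TX", "text": "Obama rally downtown tonight"},
    {"state": "TX", "text": "Watching the debate: Obama vs Romney"},
    {"state": "UT", "text": "Romney signs everywhere"},
    {"state": "UT", "text": "Nothing about politics here"},
]

mention_counts = count_mentions(sample_tweets)
for state, counter in sorted(mention_counts.items()):
    print(state, dict(counter))
```

Note that a tweet naming both candidates is counted once for each of them under this rule, which is one of several reasonable choices.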


We see that if the election were decided purely on the basis of Twitter mentions, then Obama would be re-elected. In fact, the only states that Romney would win are Maine, Massachusetts, New Mexico, Oregon, Pennsylvania, Utah, and Vermont. Romney also wins in the District of Columbia. (Unfortunately, we didn't collect data on Alaska or Hawaii.)

However, this drubbing that Romney receives in the Twitter electoral college belies the close nature of the final popular (Twitter) vote. There are a total of 132,771 tweets mentioning Obama and 120,637 mentioning Romney, giving Obama only 52.4% of the total (and Romney 47.6%), a breakdown that is remarkably similar to current opinion polls.
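The percentages follow directly from the two tweet totals. A quick check in Python, using only the counts quoted above (the variable names are just for illustration):

```python
# Tweet totals quoted in the post.
obama = 132_771
romney = 120_637

total = obama + romney

# Each candidate's share of the two-candidate total, as a percentage.
obama_share = round(100 * obama / total, 1)
romney_share = round(100 * romney / total, 1)

print(f"Total tweets: {total:,}")
print(f"Obama: {obama_share}%  Romney: {romney_share}%")
```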

If you want to explore the data in more detail, please play around with the interactive map below:

We can also map the data using a sliding scale in order to better see how close the margin of victory is in each state.
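One way to put each state on such a sliding scale is to compute a signed margin from the two mention counts. The sketch below is an assumption about how this could be done, not a description of how our map was built, and the per-state counts in it are invented for illustration.

```python
def margin(obama, romney):
    """Signed margin in [-1, 1]: positive favours Obama, negative favours Romney."""
    total = obama + romney
    if total == 0:
        return 0.0  # no mentions at all, so treat the state as a tie
    return (obama - romney) / total


# Invented counts as (Obama mentions, Romney mentions); not the study's data.
example_counts = {
    "State A": (300, 100),
    "State B": (40, 60),
    "State C": (55, 55),
}

margins = {state: margin(o, r) for state, (o, r) in example_counts.items()}

# List states from the most Obama-leaning to the most Romney-leaning.
for state, m in sorted(margins.items(), key=lambda item: item[1], reverse=True):
    leader = "Obama" if m > 0 else "Romney" if m < 0 else "tie"
    print(f"{state}: {m:+.2f} ({leader})")
```

Dividing by the state's own total keeps large and small states comparable, since the margin then reflects how lopsided the conversation is rather than how many tweets the state produces.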


Romney's largest margins of victory are in Pennsylvania and Massachusetts. Obama's largest victories are in California and, strangely, Texas.

It is also worth noting that we compared Twitter mentions of both Vice-Presidential candidates: Biden and Ryan. Ryan, interestingly, wins the head-to-head competition in every single state. This makes for a rather boring map, so I decided to instead compare references to Ryan and Romney in the map below (Romney shaded in grey for his ebullient personality, and Ryan in pink as a result of his staunch support for gay rights).


As might be expected, there are more references to Romney in most states (Kansas, Michigan, North Dakota, Rhode Island, South Dakota, and Vermont being the exceptions here). However, when looking at total references, we again don't see a large gap between the two men. Ryan has 94,707 tweets compared to Romney's 120,637.

What do these data really tell us? I doubt that they will accurately predict that Obama will win in Texas or that Romney will win in Massachusetts. But they certainly do reveal that many internet users in California, Texas, and much of the country would rather talk about Obama than about Romney. We would need to employ sentiment analysis or manually read a large number of the election-related tweets in order to figure out whether we are seeing messages of support or more critical posts.

Some of the results seem to be interesting reflections of social and political characteristics of particular places. It makes sense that Romney has captured more of the public imagination in Utah (perhaps due to the state's large Mormon population) and Massachusetts (the state that he once governed).

Other results are harder to extract meaning from. Romney's (Twitter) win in Pennsylvania will perhaps also presage interesting results in that state on election day. But who knows what Obama's Texas win demonstrates?

Maybe the most revealing aspect of these data is the 'popular vote' split between the two candidates. While the social and political data shadows that we are picking up may not accurately tell us much about the electoral college results, when aggregated across the country they may be a rough indicator of outcomes tomorrow.

While this work may seem like a contemporary attempt at soothsaying, the data will also serve as a useful benchmark in order to allow us to see what social media data shadows actually might reflect.

2 comments:

Yvonne-Anne Pignolet said...

Another study considers celebrities on Twitter and the presidential election:

After following celebrities on Twitter for 8 months we finally finished building a system that can predict the political opinion of many of these celebrities (and our tests indicate that we do so with high accuracy). Our analysis on www.celebs-vote.com is not only based on the content of tweets but also on the social network of the most popular Twitter users.

Anonymous said...

How is a tweet "geo-coded" i.e. what data is used to determine location of tweet?