Monday, November 5, 2012

Virtuous Visible Circles: mapping views to place-based Wikipedia articles

We know that Wikipedia matters to the construction of geographical imaginations of place, and content in the encyclopaedia has immense power to augment our spatial understandings and interactions. However, I have yet to see many empirical analyses of the visibility of the 'peer-produced' information in Wikipedia.

Our team therefore decided to map the views that every country in the world receives on Wikipedia. Specifically, we constructed a list of every article about a place (towns, monuments, historical events, rivers, buildings, etc.) in the top 42 Wikipedia language versions, and then queried the number of views that each of those articles received over a two-year period (2009-2011). We then aggregated those data by country in order to get a sense of how visible the information layers over each country are. In other words, we can ask: what parts of the world are people looking at and virtually experiencing?
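As a rough illustration of the aggregation step described above, here is a minimal sketch in Python. It is not the code we used: the record layout, the function names (views_by_country, rank_countries) and the view counts are all made up for the example, and it assumes each place article has already been assigned to a single country.

```python
from collections import defaultdict

# Hypothetical records: (language edition, article title, country code, views).
# The numbers are invented for illustration; they are not results from the study.
ARTICLE_VIEWS = [
    ("en", "Grand Canyon", "US", 1200),
    ("de", "Grand Canyon", "US", 300),
    ("en", "Stonehenge", "GB", 900),
    ("fr", "Tour Eiffel", "FR", 700),
    ("en", "Eiffel Tower", "FR", 500),
]


def views_by_country(records):
    """Sum article views per country across all language editions."""
    totals = defaultdict(int)
    for _lang, _title, country, views in records:
        totals[country] += views
    return dict(totals)


def rank_countries(totals):
    """Return (country, views) pairs ordered from most to least viewed."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


totals = views_by_country(ARTICLE_VIEWS)
ranking = rank_countries(totals)

for country, views in ranking:
    print(country, views)
```

In this toy version, views of the same place in different language editions simply add up under the country the place belongs to, which is the sense in which the views are 'aggregated by country' here.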

We already know that the geographies of both the production of content (i.e. where edits come from) and the subject of that content (i.e. the locations of articles) are highly geographically uneven and centred in the world's most privileged countries. But it is interesting to see how the geography of views compares. 



What we broadly see is that attention also tends to focus on some of the world's biggest economies. As might be expected, the US garners the most attention. The US is then followed by the UK, Germany, France, Japan, Italy, Poland, Spain, Russia, and then Canada. In other words, articles about places in these countries are viewed more than articles about places anywhere else.

If we compare the list of 'views to countries' with the list of 'views from countries' (i.e. where people are using Wikipedia from), the list of articles about countries, and the list of edits from countries, we see a lot of similarity.



It is interesting that there are so few surprises and a relative lack of countries that aren't the 'usual suspects' when it comes to centrality in the world's information economies. 

But maybe it is less surprising that all of these metrics seem to correlate with one another (at least at the top of the scale). It makes sense that virtuous circles exist between the availability, use, and production of geographic information.

In the next few posts, I'll dig deeper into these data in Africa, Europe, and the Middle East. In the meantime I'd welcome any thoughts or comments on these results.

1 comment:

Stefan Hahmann said...

In a similar study we found that the article "USA" has the most distinct incoming links within the network of (German) Wikipedia articles. Incoming links may also be used to measure visibility (and hence are used by search engines for this purpose - you know that...). In fact, in the German Wikipedia among the top 10 articles referenced from other articles, there are 6 countries (1 USA, 2 Germany, 4 France, 6 Austria, 7 Italy, 8 Switzerland). The spatial autocorrelation between Wikipedia language version and content is obvious. It would be interesting to see the results of this metric in other Wikipedia language versions. The metric of incoming links might be a bit more stable over time, since page views are partially biased by short-term effects. For example, we observed high access rates to articles about Japan in the period after the tsunami in 2011, which would bias even a two-year aggregate of access rates. However, we also found the correlation between incoming links and article access rates to be only 0.44, which shows that the two metrics are, in fact, different. Some of this was published at this year's GIScience conference (http://www.giscience.org/proceedings/abstracts/giscience2012_paper_84.pdf).