Friday, February 17, 2012

Where do Wikipedia edits come from?

Our team recently decided to look at the origins of edits to Wikipedia articles. The results are striking, but given what we already know about the uneven geographies of Wikipedia, they are perhaps not that shocking.

To make these maps we took quarterly data about the total number of edits (to all Wikipedia versions) to emerge from any territory (i.e. the amount of content that people are producing in each country) and averaged it over a two-year period (2010-2011). The inequalities in the amount of content produced are stark: the US, Germany, the UK and France all have an average of over a million edits each quarter.
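For readers who want to reproduce the averaging step, here is a minimal sketch in Python. It assumes the quarterly counts are already available as a list of eight values per territory (2010-2011); the function name, the territory labels and the numbers are hypothetical placeholders, not our data.

```python
from statistics import mean


def average_quarterly_edits(quarterly_edits):
    """Return the mean number of edits per quarter for each territory.

    quarterly_edits maps a territory name to a list of quarterly edit
    counts (eight values for a two-year period).
    """
    return {territory: mean(counts) for territory, counts in quarterly_edits.items()}


# Hypothetical placeholder counts, NOT the figures behind the maps.
example_counts = {
    "Territory A": [1200, 1300, 1100, 1400, 1250, 1350, 1150, 1250],
    "Territory B": [40, 60, 50, 50, 45, 55, 50, 50],
}

averages = average_quarterly_edits(example_counts)
for territory, avg in averages.items():
    print(territory, avg)
```

Averaging over eight quarters smooths out one-off spikes in any single quarter, which matters most for territories with very few edits.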

But then you get most of Africa and the Middle East, where the average number of edits per quarter is only a few thousand. Interestingly, more edits originate in Hong Kong each quarter than in the entire continent of Africa.

Much of this variation can actually be explained by Internet population (i.e. the total number of Internet users in a country). However, even accounting for their generally low Internet populations, most countries in the MENA region and Sub-Saharan Africa still fall below their expected number of edits (we are currently working on some statistical models and writing a paper about this topic).
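To make the idea of an "expected number of edits" concrete, here is one simple way it could be sketched: fit a straight line to log(edits) against log(Internet users) and compare each country with the fitted value. This is an illustrative assumption only, not the model we are developing for the paper, and the function names and numbers below are hypothetical.

```python
import math


def fit_log_log(internet_users, edits):
    """Ordinary least squares fit of log(edits) = intercept + slope * log(users)."""
    xs = [math.log(u) for u in internet_users]
    ys = [math.log(e) for e in edits]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def expected_edits(intercept, slope, internet_users):
    """Edits predicted by the fitted line for a given Internet population."""
    return math.exp(intercept + slope * math.log(internet_users))


# Hypothetical placeholder data, NOT real country figures.
users = [1_000_000, 10_000_000, 100_000_000]
edits = [2_000, 20_000, 200_000]

intercept, slope = fit_log_log(users, edits)

# A hypothetical country with 5 million Internet users and 4,000 observed edits.
predicted = expected_edits(intercept, slope, 5_000_000)
observed = 4_000
print("below expected" if observed < predicted else "at or above expected")
```

A country whose observed edits sit below the fitted value is, in this simplified sense, producing less content than its Internet population alone would suggest.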

Zooming into the MENA region, the scale of some of these disparities becomes even clearer. In the map above, you can see that there are almost as many Wikipedia edits that come from Israel (215,333) as from the rest of the entire region combined (254,089)!
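The comparison can be checked with a couple of lines of arithmetic. The sketch below uses only the two figures quoted above; the variable names are my own.

```python
# Figures quoted in the post.
israel_edits = 215_333
rest_of_region_edits = 254_089

# Israel's edits relative to the rest of the region combined.
ratio_to_rest = israel_edits / rest_of_region_edits

# Israel's share of all edits from the region (Israel included).
share_of_region = israel_edits / (israel_edits + rest_of_region_edits)

print(f"Ratio to rest of region: {ratio_to_rest:.2f}")
print(f"Share of regional total: {share_of_region:.2%}")
```

In other words, Israel's count is about 85% of the combined total for the rest of the region, or roughly 46% of all edits from the region.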

Finally, it might also make sense to look at the number of views per edit in the region. This gives us a sense of how much consumption vs. production of information is happening. Israel again stands out with fundamentally different characteristics from the rest of the region. In Israel there is a much higher rate of information production on Wikipedia (as compared to number of views/consumption) than in many of its neighbours.

Libya and Iran also score well in this regard. In the case of Iran, we also see a lot of edits originating in the country (Iran has the second highest number of edits in the region). With Libya, it may also be the case that there is genuinely a high number of edits per view, but we may also be dealing with rounding errors given the very small number of both views and edits that we see in the country.
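To illustrate why small counts make the views-per-edit ratio unreliable, here is a minimal sketch. It assumes, purely for illustration, that both counts are rounded to the nearest 1,000; the post does not say how the underlying data are rounded, and the function name and numbers are hypothetical.

```python
def views_per_edit_bounds(views, edits, rounding_unit):
    """Range of views-per-edit ratios consistent with rounded counts.

    Assumes each reported count may differ from the true value by up to
    half of rounding_unit in either direction.
    """
    half = rounding_unit / 2
    lowest = (views - half) / (edits + half)
    # If the true edit count could be zero, the ratio has no upper bound.
    highest = (views + half) / (edits - half) if edits - half > 0 else float("inf")
    return lowest, highest


# Hypothetical small country: the plausible range is very wide.
small_low, small_high = views_per_edit_bounds(50_000, 1_000, 1_000)

# Hypothetical large country with the same nominal ratio of 50 views per edit.
large_low, large_high = views_per_edit_bounds(50_000_000, 1_000_000, 1_000)

print(small_low, small_high)
print(large_low, large_high)
```

With the small counts the plausible ratio ranges from 33 to 101 views per edit, while for the large counts it stays within a fraction of a percent of 50, which is why a ratio built on very small counts should be read with caution.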

As always, there will be more maps and analysis up here soon. But I wanted to share these and ask for thoughts and comments as we start writing our paper.


Sagi S said...

Very interesting.
Do you think it would be worthwhile to show edits "density", as in # of edits per # of inhabitants of a particular country, instead of looking at absolute numbers?
For example, the US is sure to stand out in terms of sheer # of edits, but where does it stand relative to, say, Israel, in terms of relative # of edits per thousand inhabitants?

Mark Graham said...

Good point. We've actually been playing with those data and using them in some models. I can upload some maps of these data normalised by Internet users or total population in the next few weeks if you'd like (a couple of conferences next week are going to slow me down a bit I'm afraid).

Anonymous said...

Wikipedia is widely accessed. It is said to be as reliable as Encyclopaedia Britannica.
Maybe the number of edits can be the result of hacktivism, in order to set the vision about places, religions, dates and heroes.

Sagi S said...

@Mark Graham: Would indeed be great if you could upload normalized data
@ Anonymous: I agree it would be good to see if there is foul play involved here, perhaps looking at # of unique users or weeding out users who have a suspiciously high number of edits would do the trick? not sure.

Dror K said...

I'm not surprised that Iran scores high in terms of views and edits. Internet accessibility in Iran is very high and the country has a developed Internet culture. However, the government there blocks access to a wide range of sites. Wikipedia used to be among them; I think this is no longer the case, but I'm not sure. I suppose many Internet users find ways to circumvent the governmental web filtering.

I never checked this scientifically, but I have a strong feeling that Wikipedia's popularity is quite low in Arab countries, and it is sometimes even treated with suspicion. This is in strong contrast to its popularity in neighboring Israel. In Israel, Wikipedia has become part of the popular culture, with many references and citations in TV shows, newspapers, books etc. This is not the case at all in the Arab World, and even in the Arab community of Israel.

Anonymous said...

Yes Mark,
The stat of Edits per capita will be greatly appreciated.
Thank you,

Ahmad Osman said...

Yeah of course you're gonna flaunt Israel's Wikipedia activity, but don't you point out at the fact that Lebanon stands out in terms of number of views / number of edits. It's sticking out like the sore thumb that is your racism.

Ahmad Osman said...

"Finally, it might also make sense to look at the number of views per edit in the region. This gives us a sense of how much consumption vs. production of information is happening. Israel again stands out" <--- oh and by the way this is a lie. Anyone with eyeballs can see that Israel does *not* stand out in this regard. What a retard.