Tuesday, May 14, 2013

We're hiring: Quantitative Internet Geographer/Sociologist

We're hiring a full-time researcher to work with Grant Blank, Bernie Hogan and me at the Oxford Internet Institute (on a one-year contract in the first instance).

The successful candidate will work on two projects: (1) helping us to continue our work on the geographies of Wikipedia, i.e. modelling and mapping patterns in Wikipedia data (read more here or here); and (2) carrying out a new project on digital inequality in Britain. Here, the researcher will gather and analyse a range of Internet-related data in order to help us better understand local-scale geographies of digital engagement in the UK.

We think this will be quite an exciting position for someone with a background in statistics (and ideally GIS). More info and an application package are available at this link, but feel free to get in touch if you have any questions about the job.

Monday, April 29, 2013

Wikipedians without borders


Our team recently held a workshop for Wikipedia editors in Amman in order to discuss barriers to participation and representation in Wikipedia (with a focus on the Middle East and North Africa). The event had participants from all over the region (from Morocco, Algeria, Iraq, Iran, Pakistan, Israel, The Palestinian Territories, Jordan, Egypt, Syria, Tunisia, Saudi Arabia, and Lebanon).

However, this very diversity of participants initially proved slightly controversial, with some editors worried that it would be unhelpful to include both Arabs and Israelis, for instance. Despite these concerns, we managed to have very productive discussions and realise that there are some general goals and hopes related to knowledge sharing that united all of us.

As a result, some of the participants have decided to put together a group called 'Wikipedians without borders.' The stated goal of the group is "Wikipedians from around the world reaching out, individually and collectively, to share their thoughts, ideas and projects for increasing human knowledge and cultural understanding."

The group might not solve any of the structural inequalities and uneven power relations that we see in the encyclopaedia, but will at least provide a platform for a diverse group of editors to come together and better understand each other's perspectives. This might then ultimately work to make Wikipedia a more balanced and representative source of knowledge.  

I wish them the best, and invite anyone interested in getting involved to check out their new Facebook page.

Thursday, April 25, 2013

New article published - Thai Silk Dot Com: Authenticity, Altruism, Modernity and Markets in the Thai Silk Industry


An article that I had accepted into Globalizations has made its way into print:

Graham, M. 2013. Thai Silk Dot Com: Authenticity, Altruism, Modernity and Markets in the Thai Silk Industry. Globalizations 10(2): 211-230.

The abstract is below, and you can access a pre-publication version at this link.

The production of silk occupies a unique place in Thai cultural and economic practices. However, the practice is rarely passed on to the younger generation and is widely considered to be a dying craft. In response, influential organizations have proposed use of the internet as a way to reinvigorate the industry and attract new customers. This paper looks at the discourses used to sell silk and the ways in which sellers are either framing Thai silk as a traditional craft in need of saving or as an enterprise that efficiently engages with the commercial needs of the global economy. The paper reviews the range of, often problematic, emotions, images, and associations used to sell a dying craft. Ultimately, it argues that, in contrast to many of the theorized effects of the internet, it seems to be neither encouraging mass homogenization nor pushing sellers to effectively integrate themselves into global markets.

Friday, April 19, 2013

New Article published - Beyond the geotag: situating 'big data' and leveraging the potential of the geoweb


An article that I worked on with Jeremy Crampton, Ate Poorthuis, Taylor Shelton, Monica Stephens, Matt Wilson, and Matt Zook -- Beyond the geotag: situating 'big data' and leveraging the potential of the geoweb -- has just been published in Cartography and Geographic Information Science as part of a special issue on "Mapping Cyberspace and Social Media."


The abstract and full citation for the paper are below:
This article presents an overview and initial results of a geoweb analysis designed to provide the foundation for a continued discussion of the potential impacts of ‘big data’ for the practice of critical human geography. While Haklay's (2012) observation that social media content is generated by a small number of ‘outliers’ is correct, we explore alternative methods and conceptual frameworks that might allow for one to overcome the limitations of previous analyses of user-generated geographic information. Though more illustrative than explanatory, the results of our analysis suggest a cautious approach toward the use of the geoweb and big data that are as mindful of their shortcomings as their potential.

More specifically, we propose five extensions to the typical practice of mapping georeferenced data that we call going ‘beyond the geotag’: (1) going beyond social media that is explicitly geographic; (2) going beyond spatialities of the ‘here and now’; (3) going beyond the proximate; (4) going beyond the human to data produced by bots and automated systems, and (5) going beyond the geoweb itself, by leveraging these sources against ancillary data, such as news reports and census data. We see these extensions of existing methodologies as providing the potential for overcoming existing limitations on the analysis of the geoweb.

The principal case study focuses on the widely reported riots following the University of Kentucky men's basketball team's victory in the 2012 NCAA championship and its manifestation within the geoweb. Drawing upon a database of archived Twitter activity – including all geotagged tweets since December 2011 – we analyze the geography of tweets that used a specific hashtag (#LexingtonPoliceScanner) in order to demonstrate the potential application of our methodological and conceptual program. By tracking the social, spatial, and temporal diffusion of this hashtag, we show how large databases of such spatially referenced internet content can be used in a more systematic way for critical social and spatial analysis.
Crampton, J.W., M. Graham, A. Poorthuis, T. Shelton, M. Stephens, M.W. Wilson and M. Zook. 2013. Beyond the Geotag: Situating ‘Big Data’ and Leveraging the Potential of the Geoweb. Cartography and Geographic Information Science 40(2): 130-139.

Or you can freely access a pre-publication version from SSRN: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2253918

Saturday, March 30, 2013

Die Welt in der Wikipedia als Politik der Exklusion


My first ever German-language publication has just been released. For those of you who can read German, wander over to the following link:
Graham, M. 2012. Die Welt in Der Wikipedia Als Politik der Exklusion: Palimpseste des Ortes und selective Darstellung. In Wikipedia. eds. S. Lampe, and P. Bäumer. Bundeszentrale für politische Bildung/bpb, Bonn. 
For those who can't, the piece is mostly based on my 2011 chapter in the CPOV reader, and concerns the geographies of geographic information in Wikipedia:
Graham, M. 2011. Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, 269-282.

Monday, March 25, 2013

What percentage of edits to English-language Wikipedia articles are from local people?


As part of our on-going efforts to explore the geographies of participation in Wikipedia, we have calculated the percentage of local edits to articles about places. In other words, this map illustrates the percentage of edits about any country that come from people with strong associations to that country.

For more on the method that we employed, have a read through the post on "who edits Wikipedia", in which I explained our data collection efforts in detail. The data are undoubtedly somewhat imprecise, but we are confident that they offer us the best overview of the geography of authorship that can be obtained with publicly-available data.
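To make the metric concrete, here is a minimal sketch of how a "local edit percentage" could be computed once each edit has been assigned both an article country and an editor country. The function name, the input format, and the toy data are all assumptions made for illustration; this is not the code we used.

```python
from collections import defaultdict

def local_edit_share(edits):
    """Return {article_country: percent of edits made from that same country}.

    `edits` is an iterable of (article_country, editor_country) pairs,
    one pair per geolocated edit. The country codes are hypothetical inputs.
    """
    totals = defaultdict(int)
    local = defaultdict(int)
    for article_country, editor_country in edits:
        totals[article_country] += 1
        if article_country == editor_country:
            local[article_country] += 1
    return {c: 100.0 * local[c] / totals[c] for c in totals}

# Toy data, invented for illustration only.
sample = [("EG", "EG"), ("EG", "US"), ("EG", "US"), ("EG", "GB"),
          ("US", "US"), ("US", "US"), ("US", "GB"), ("US", "US")]
shares = local_edit_share(sample)
print(shares)
```

Note that only edits that could be geolocated enter the denominator in this sketch, so the result describes located edits rather than all edits.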

What do these results tell us?

Unsurprisingly, they show that in predominantly English-speaking countries most edits tend to be local. That is, we see that most edits to Wikipedia articles about the US (85%) come from America, and most edits to articles about the UK likewise come from the UK (78%). The Philippines (68%) and India (65%) score well in this regard, likely because of the role that English plays as an official language in both countries. But why then do we see relatively low numbers in other countries that also have English as an official language, such as Nigeria (16%) or Kenya (9%)?

We also, interestingly, see relatively high local edit percentages from a handful of countries that don't count English as an official language: Finland (50%), Norway (56%), Romania (54%), and Bulgaria (53%).

Then we also observe large parts of the world in which very few English-language descriptions of local places are created by local people. Almost all of Sub-Saharan Africa falls into this category.

The key question is whether these data actually tell us anything meaningful. For instance, just because most edits about the United States likely come from the United States does not necessarily mean that those articles are representative, include a diversity of viewpoints, or avoid excluding people, places, and processes.

But the data nonetheless, in a very broad way, do tell a story about voice and representation. Some parts of the world are represented on one of the world's most-used websites predominantly by local people, while others are almost exclusively created by foreigners - something to bear in mind next time you read a Wikipedia article.

Friday, March 22, 2013

Article in R-Link: Geographies of Information in Africa

The Rwandan ICT magazine, R-Link, just published a short piece that I wrote for them titled "Geographies of Information in Africa." The magazine is only available in print form (and presumably only available in Rwanda), so I'm including a scan at the link below:

Graham, M. 2013. Geographies of Information in Africa: Wikipedia and User-Generated Content. In R-Link: Rwanda’s Official ICT Magazine. Kigali: Rwanda ICT Chamber 40-41.

Talk and Panel Schedule at the Association of American Geographers

The annual AAG meeting is fast approaching, and I'd like to share the list of sessions and panels that I'm participating in:

1) Global City Challenges: Debating a Concept, Improving the Practice

Here I will be talking about my recent chapter for a forthcoming book that shares a name with the panel (Graham, M. 2013. Virtual Geographies and Urban Environments: Big data and the ephemeral, augmented city. In Global City Challenges: debating a concept, improving the practice. eds. M. Acuto and W. Steele. London: Palgrave. (in press).)

2) Digital Divides, Digital Domination, and Digital Divisions of Labour

This session, co-organised with Monica Stephens and Alan McConchie, is concerned with the geographies, networks, and power relations of the digital inequalities of the geoweb. We have five very interesting papers scheduled by Alan McConchie, Matthew Kelley, Gregory Donovan, Sarah Williams, and Quiyang Xu.

3) DOLLY and The Questing Beast: Adventures in Twitterspace

This session is devoted to discussing some of the methodological details of Twitter collection efforts at ESRI, The University of Kentucky, and the Oxford Internet Institute.  

4) More Data, More Problems? Geography and the Future of 'Big Data'

Here I am co-chairing a panel with Taylor Shelton on the role of 'big data' in geographic scholarship. We aim to discuss whether 'big data' will actually provide us with more insights than more conventional methodologies. Will it limit us, liberate us, or lock us into particular ways of understanding the world? The panel will consist of Trevor Barnes, Rob Kitchin, Michael Goodchild, Sean Gorman, and Mike Batty.

5) From Dangling Cranes to Flooded Tunnels: Hurricane Sandy and the Geographies of Twitter

Taylor will be presenting this paper (co-authored with Ate Poorthuis, Matt Zook, and myself). The title gives away most of the content.

Tuesday, March 19, 2013

Who edits Wikipedia? A map of edits to articles about Egypt



This map displays where edits to English Wikipedia articles about Egypt come from. Detail about the method is included at the end of this post; but first a quick discussion of results.

One of the first things we notice in the map is that there is broad geographic interest in Egyptian-related topics. Editors from all over the world have played some part in writing about Egypt. In fact there is only a handful of countries that have never hosted an editor who wanted to write about Egypt.

However, this isn't to say that there aren't large differences in the origins of those edits. Whilst we might expect most edits about Egyptian-related topics to come from Egypt, we see that only 13% of all edits actually originate in the country. The US, in contrast, is home to 38% of all edits about Egypt (and the UK is home to 15%).

Because we're only looking at the English version of Wikipedia, the heavy presence of America-based editors is not that surprising, but it does give us important empirical insights into just where geographic knowledge is created from.

Methods

The simplicity of the map belies the complexity of data collection that went into it.

First, we collected a list of all geotagged articles in Wikipedia. In other words, we collected a list of every single article about a place, event, or anything else that has a location, and then calculated which country each one of those articles is in. This process has already been described here and some of our results are available here.

Second, for every article, we constructed a list of every editor that made an edit to the article. Some of these editors are logged-in and identifiable by user-names and others edited anonymously and only left behind an IP address. We wanted to estimate, to the best of our abilities, the rough location (at the national-scale), for each of these editors. While we already know where edits come from at the national level, until now we have known little about what those editors are writing about.

The IP addresses were fairly straightforward to geolocate, and we placed 99.68% of them at the national scale (accounting for about 52.5% of the total of 24,087,257 geolocated edits to geotagged articles).

The logged-in edits are more challenging to place because user-profiles are mostly unstructured and contain few consistent types of geographic data.

First, we gathered the GEOnet Names Server geographic gazetteer, which included approximately 2.7 million place names, and used the place names signified by Wikipedia's geo-coded articles to refine that gazetteer further in terms of the number of distinct locations mapped to any place name.

Second, we adopted the following two approaches to extract locations from Wikipedia user pages.

I. We parsed the Wikipedia meta-current dump to extract userboxes that signify user locations and mapped each to a particular ISO code. For example, the userbox {{User Georgia}} signifies that the user comes from Georgia, United States. This allowed us to directly map a user with any such userbox to a particular location with near-perfect accuracy.
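As a rough illustration of the userbox approach, the sketch below pulls userbox templates out of a page's wikitext with a regular expression and looks them up in a small table. The lookup table, the regular expression, and the function name are assumptions made for this example (only the {{User Georgia}} entry comes from the description above); the real mapping was far larger and this is not our actual parser.

```python
import re

# Hypothetical lookup from userbox name to ISO 3166-1 alpha-2 country code.
# Only "User Georgia" reflects the example in the post; the others are invented.
USERBOX_TO_ISO = {
    "User Georgia": "US",
    "User Egypt": "EG",
    "User Finland": "FI",
}

# Matches {{User Something}} and {{User Something|param=value}}.
USERBOX_RE = re.compile(r"\{\{\s*(User [^|{}]+?)\s*(?:\|[^{}]*)?\}\}")

def userbox_countries(wikitext):
    """Return the ISO codes of all recognised location userboxes, in page order."""
    found = []
    for match in USERBOX_RE.finditer(wikitext):
        iso = USERBOX_TO_ISO.get(match.group(1))
        if iso is not None and iso not in found:
            found.append(iso)
    return found

page = "Hello! {{User Georgia}} {{User en-3}} {{ User Egypt |small=yes}}"
codes = userbox_countries(page)
print(codes)
```

Userboxes that are not in the table (such as the language box {{User en-3}} here) are simply ignored, which is why this route can be accurate but only covers users who display a location userbox.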

II. To parse the user's unstructured text, we started off by generating a list of common preceding and succeeding patterns to locations in sentences. To generate that list, the following steps were carried out:
  • (A) Scan all user pages for matches of any of the place names in the gazetteer, after removing location userboxes that were already detected.
  • (B) For each place name match, increment the counts for the preceding and succeeding unigrams, bigrams and trigrams in two separate dictionaries, one for preceding and the other for succeeding statements.
  • (C) Sort the counts for each combination in descending order and keep the frequently occurring predecessors and successors as a filtered-down subset to use when tagging locations, resulting in patterns such as {I|Username} * live in {placename} or Wikipedians in {placename}.
  • (D) Since the common preceding and succeeding statements were manageable in size, manually tag each statement with a location relationship signifier. For example, 'resides in' was tagged as a 'lives in' relation, whereas 'wikipedians from' was tagged as a 'born/from' relation. These were the two relationship types that statements were mapped to.
  • (E) Map every user page containing a place name within a pattern that satisfies the listed predecessors and successors to the location corresponding to that place name.
  • (F) Some place names were ambiguous, in that they could be mapped to two distinct locations (for example, Alexandria, Cairo and Alexandria, Virginia). If the place's parent location was mentioned right after the name, disambiguation was straightforward. In other cases, we combined two signals to decide which parent location to assign to a user: a PageRank score computed for each ISO location (by generating a network of the place names in the article and connecting them to their ISO location through their subnational locations), and the probability of the parent location given the place name, as inferred from their co-occurrences in all user pages.
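Steps (A) to (C) above can be sketched roughly as follows. This is a simplified illustration under several assumptions: text is tokenised by whitespace, place names are single lowercase tokens, and the example pages are invented. Multi-word place names, the manual tagging in (D), and the disambiguation in (F) are not covered.

```python
from collections import Counter

def context_ngrams(tokens, index, max_n=3):
    """Return (preceding, succeeding) n-grams, n = 1..max_n, around tokens[index]."""
    before, after = [], []
    for n in range(1, max_n + 1):
        if index - n >= 0:
            before.append(" ".join(tokens[index - n:index]))
        if index + 1 + n <= len(tokens):
            after.append(" ".join(tokens[index + 1:index + 1 + n]))
    return before, after

def count_contexts(pages, place_names, max_n=3):
    """Count n-grams that precede and follow any place-name match (steps A and B)."""
    preceding, succeeding = Counter(), Counter()
    for text in pages:
        tokens = text.lower().split()
        for i, token in enumerate(tokens):
            # Strip trailing punctuation so "nairobi." still matches "nairobi".
            if token.strip(".,;:!?") in place_names:
                before, after = context_ngrams(tokens, i, max_n)
                preceding.update(before)
                succeeding.update(after)
    return preceding, succeeding

# Invented user-page snippets and a tiny stand-in for the gazetteer.
pages = ["I live in Cairo and edit at night",
         "I currently live in Oxford",
         "Born and raised: I live in Nairobi."]
places = {"cairo", "oxford", "nairobi"}

preceding, succeeding = count_contexts(pages, places)
# Step C: sort by descending frequency and keep the most common contexts.
print(preceding.most_common(3))
```

On real data the most frequent contexts would include many uninformative n-grams (such as a bare "in"), which is one reason a manual tagging pass like step (D) is useful.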
From that procedure we were able to assign 122,888 users to countries, which allowed us to geolocate 11,437,436 edits by registered users to geotagged articles. This adds up to 33.65% of the total of 33,991,052 registered-user edits to all geotagged articles.

So by combining geolocations from IP addresses and geolocated registered users, we were able to geolocate 51.6% (24,087,257 edits) of all edits to geotagged articles in the English Wikipedia, out of a total of 46,681,386 edits.
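The percentages quoted in this section can be checked with a few lines of arithmetic. The four input figures below are taken from the text; the two derived counts rest on the assumption that every edit is either anonymous (IP) or registered, so that the IP figures can be obtained by subtraction. Variable names are illustrative.

```python
# Figures reported in the post.
total_edits = 46_681_386            # all edits to geotagged articles
registered_edits = 33_991_052       # edits by logged-in users
geolocated_registered = 11_437_436  # registered-user edits we could place
geolocated_total = 24_087_257       # all edits we could place

# Derived by subtraction (assumes edits are either anonymous or registered).
anonymous_edits = total_edits - registered_edits
geolocated_ip = geolocated_total - geolocated_registered

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)

registered_coverage = pct(geolocated_registered, registered_edits)
overall_coverage = pct(geolocated_total, total_edits)
ip_share_of_geolocated = pct(geolocated_ip, geolocated_total)
ip_coverage = pct(geolocated_ip, anonymous_edits)

print(registered_coverage, overall_coverage, ip_share_of_geolocated, ip_coverage)
```

Under that assumption the derived values line up with the reported ones: roughly a third of registered-user edits, just over half of all edits, and nearly all anonymous edits could be placed.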

We should stress a few things about the results. Most importantly, we are not publishing any individual user locations, and are instead focusing entirely on aggregate data. We are also aware of the significant limitations of this method. In some ways, we are simply reproducing existing geographic inequalities. Absences in the gazetteer and in Wikipedia's coverage can be reproduced in our tagging method. The method might therefore somewhat underestimate the number of edits coming from places in the world's informational peripheries.

It will be interesting to see how well geographic patterns in the IP edit data correspond with the parsed user data, as the IP data are less likely to suffer from the embedded biases mentioned above.

Over the next few weeks and months, I'll be sifting through our data and publishing some of the more interesting findings on the blog.

Read more from the project:

Graham, M. 2011. Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, 269-282.

Graham, M., M. Zook., and A. Boulton. 2012. Augmented Reality in the Urban Environment: contested content and the duplicity of code. Transactions of the Institute of British Geographers. DOI: 10.1111/j.1475-5661.2012.00539.x

Graham, M. 2013. The Knowledge Based Economy and Digital Divisions of Labour. In Companion to Development Studies, 3rd edition, eds V. Desai, and R. Potter. Hodder (in press).

Friday, February 15, 2013

Mapping Twitter in Francophone Africa

There was a lot of interest in the series of eleven maps of tweets in African cities that I posted yesterday. So, here are ten more from Francophone Africa: Algiers, Bamako, Abidjan, Nouakchott, Kinshasa, Ouagadougou, Libreville, Dakar, Conakry, and Douala.

Compared to places like Cairo, Johannesburg, or Nairobi, we see very little activity in most of these cities - with the notable exception of Abidjan.

Remember that we are only mapping geocoded tweets here. But these patterns might nonetheless give us a very crude indication of some of the distinct geographies of contemporary digital divides.










Relevant articles:

Graham, M. 2013. Virtual Geographies and Urban Environments: Big data and the ephemeral, augmented city. In Global City Challenges: debating a concept, improving the practice. eds. M. Acuto and W. Steele. London: Palgrave. (in press).

Graham, M and M. Zook. 2013. Augmented Realities and Uneven Geographies: Exploring the Geo-linguistic Contours of the Web. Environment and Planning A 45(1) 77-99.

Graham, M., M. Zook., and A. Boulton. 2012. Augmented Reality in the Urban Environment: contested content and the duplicity of code. Transactions of the Institute of British Geographers. DOI: 10.1111/j.1475-5661.2012.00539.x