Skip to main content
Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Targeted Social Mobilization in a Global Manhunt

  • Alex Rutherford ,

    Contributed equally to this work with: Alex Rutherford, Manuel Cebrian, Iyad Rahwan

    Affiliation Computing and Information Science, Masdar Institute of Science and Technology, Abu Dhabi, United Arab Emirates

  • Manuel Cebrian ,

    Contributed equally to this work with: Alex Rutherford, Manuel Cebrian, Iyad Rahwan

    Affiliations National Information and Communications Technology Australia, Melbourne, Victoria, Australia, Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, United States of America

  • Iyad Rahwan ,

    Contributed equally to this work with: Alex Rutherford, Manuel Cebrian, Iyad Rahwan

    Affiliations Computing and Information Science, Masdar Institute of Science and Technology, Abu Dhabi, United Arab Emirates, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom

  • Sohan Dsouza,

    Affiliation Computing and Information Science, Masdar Institute of Science and Technology, Abu Dhabi, United Arab Emirates

  • James McInerney,

    Affiliation School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom

  • Victor Naroditskiy,

    Affiliation School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom

  • Matteo Venanzi,

    Affiliation School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom

  • Nicholas R. Jennings,

    Affiliation School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom

  • J. R. deLara,

    Affiliation George Washington University, Washington D.C., United States of America

  • Eero Wahlstedt,

    Affiliation University of Oxford, Oxford, United Kingdom

  • Steven U. Miller

    Affiliation Champlain College, Burlington, Vermont, United States of America


Social mobilization, the ability to mobilize large numbers of people via social networks to achieve highly distributed tasks, has received significant attention in recent times. This growing capability, facilitated by modern communication technology, is highly relevant to endeavors which require the search for individuals that possess rare information or skills, such as finding medical doctors during disasters, or searching for missing people. An open question remains, as to whether in time-critical situations, people are able to recruit in a targeted manner, or whether they resort to so-called blind search, recruiting as many acquaintances as possible via broadcast communication. To explore this question, we examine data from our recent success in the U.S. State Department's Tag Challenge, which required locating and photographing 5 target persons in 5 different cities in the United States and Europe – in under 12 hours – based only on a single mug-shot. We find that people are able to consistently route information in a targeted fashion even under increasing time pressure. We derive an analytical model for social-media fueled global mobilization and use it to quantify the extent to which people were targeting their peers during recruitment. Our model estimates that approximately 1 in 3 messages were of targeted fashion during the most time-sensitive period of the challenge. This is a novel observation at such short temporal scales, and calls for opportunities for devising viral incentive schemes that provide distance or time-sensitive rewards to approach the target geography more rapidly. This observation of ′12 hours of separation' between individuals has applications in multiple areas from emergency preparedness, to political mobilization.


The Internet and online social media are now credited with the unprecedented ability to coordinate the mobilization of large masses of people to achieve remarkable feats that require coverage of large geographical and informational landscapes in a very limited time. Social media has been used to mobilize volunteers to map natural disasters in real-time [1], to conduct large-scale search-and-rescue missions [2], and to locate physical objects within extremely short time frames [3].

Despite the numerous successes attributed to the Internet, mobile communication and social media, we still lack a comprehensive understanding of the dynamics of technology-mediated social mobilization. Open questions remain about essential aspects that determine the success of social mobilization. One such aspect is the relationship between social interaction and geography. Social interaction is an essential driver of recruitment and coordination. However, social interaction is constrained by geography [4], and such constraints exhibit fundamentally different characteristics for large communities [5]. Further, geography is influenced by the nature of the task at hand, as we discuss below.

Consider the task of mobilizing protesters as part of the Occupy Wall Street movement [6]. It has recently been shown that social interaction exhibits a disproportionately high degree of geographical locality, reflecting the movement's efforts to mobilize resources in their local neighborhoods and cities [7], [8].

On the other hand, mobilization for large search-and-rescue operations demands the opposite approach, namely spreading the message and recruiting participants in geographically distant locations. In the DARPA Network Challenge (a.k.a. Red Balloon Challenge), organized by the Defense Advanced Research Projects Agency, teams competed to locate and submit the coordinates of 10 tethered weather balloons dispersed at random locations all over the continental United States. The winning team, based at MIT, won the challenge by locating all balloons in less than 9 hours [9]. The team used an incentive scheme to kick start an information and recruitment cascade that resulted in 4,400 sign-ups to the team's Web site within 48 hours. Our earlier analysis revealed that the recursive incentive scheme may have played an important role in maximizing the speed and branching of the diffusion to limits above what is normally observed in viral propagation schemes [10]. Further, data reveals that people managed to recruit acquaintances who are more distant than expected, thus contributing to the rapid coverage of a large geographical area [3].

Another class of mobilization tasks requires geographical propagation that simultaneously spans large distances, while exhibiting targeted spatial dynamics. An example of this is search for a missing person or an object with a known approximate location. Milgram's landmark “small world” experiment showed that people are, in principle, able to find a target individual using 6 hops on the global social network [11]. This result has been reaffirmed in the Internet age in an email-based version of Milgram's experiment [12]. This phenomenon relies on people's ability to form reliable estimates of distance to the target, in order to exploit the large jumps afforded by small world networks as they forward the message to their acquaintances [13][15]. In particular, people rely on heuristic information (simple rules of thumb for guiding choice) in the routing of information by the recruitment of acquaintances. Geographical distance, along with non-geographical distance measures – such as similarity of occupation to the target individual – form particularly effective heuristics [16]. For example, if the target is known to be a Professor residing in Kyoto, Japan, one might send it to a friend who lives in Tokyo, Japan, as they are more likely to know someone who lives in Kyoto, who in turn may know someone in academia, and so on.

An open question remains as to whether in time-critical situations, such as public response to natural disasters, an abduction, or search for a missing child, people are still able to spread information in such a heuristic manner. Humans have a limited amount of time per day to dedicate to social interaction [17], which poses a limit on the effort one can invest in persuading an acquaintance to act. Further, time pressure can affect the way in which people process environmental information [18]. Consequently, people may be expected to resort to so-called blind search, focusing simply on the recruitment of as many acquaintances as possible via broadcast messaging [19]. However, while this strategy may be effective at delivering the message to a broad audience, it results in lower effort in finding and mobilizing those recruits that have high affinity with the task (due to their location or other characteristics), and are therefore more likely to propagate the message or participate in the required action [20].

We examined the spatial dynamics of global recruitment in the State Department's Tag Challenge, which required competing teams to locate and photograph 5 target “thieves” (actors) in 5 different cities in the US and Europe, based only on a mug shot released at 8:00am local time in each respective city [21]. The targets were only visible for 12 hours, and followed pre-arranged itineraries around the cities of Stockholm, London, Bratislava, New York City and Washington D.C. Our team successfully located 3 of the 5 suspects [22], winning the competition by remotely mobilizing volunteers through social media using a recursive incentive mechanism that encourages recruitment [23], [24]. This was achieved despite the fact that none of our team members were based in any of the target cities [25].

The challenge provided a rare opportunity to quantify the dynamics of large-scale, global social mobilization in a time-critical scenario from a spatial and temporal perspective. The 12 hour deadline provided clear urgency. Furthermore, the announcement of the challenge, 2 months in advance, provided a chance to quantify the growth of awareness over time, as we approach the actual day of the challenge, March 31st, 2012. Finally, due to its geographical dispersal over multiple countries and languages, no single small team of acquaintances can conceivably achieve the task without the help of others not directly connected to them. Consequently, people were required to forward messages to acquaintances who are either in the target cities, or whom they believed would be more likely to forward messages towards those cities. Despite the DARPA Network Challenge being very close in aim, it did not provide this opportunity, as there was no information available to the searchers whatsoever about the location of the balloons.

We collected data about the general awareness of the challenge which is not specific to the efforts of a particular team, measured by number of hits to the main challenge organizers' Web site, as well as on major social media sites (Twitter and Facebook). We also captured data about the winning team's presence on major social media sites (Twitter and Facebook). This provides a quantitative view of the growth dynamics of mobilization over time as the deadline approaches. More importantly, by mapping the approximate geographical locations of different social media messages, we were able to quantify the geographical convergence towards the target cities.

Twitter, the popular micro-blogging service, is an ideal barometer for investigating blind versus heuristic (targeted) mobilization strategies as both modes of communication are available. Users may tweet messages to all their friends (the content is also publicly available if the user chooses this option). Alternatively, a user may mention one or more other users specifically, regardless of whether they are friends or not, by adding the symbol @ followed by the target user's Twitter name. For example, to target a person with user name alex, one simply includes the string @alex in the message. If a tweet is of this second variety, the mentioned user receives a specific alert and is generally obliged to respond, or at least pay more attention to the message. Often, such targeted messaging also leads to subsequent public or private conversations. In the case of the Tag Challenge, such conversations can be seen as an effort exerted on behalf of the recruiter to persuade the recruit to join the cause. Although some tweets can be considered to be specific to a particular team e.g. ‘Snap a picture of a traveler in this digital scavenger hunt and you could win $ for you/charity. @TagTeam_’ as opposed to general tweets simply raising awareness e.g. ‘ 5 thieves, 5 cities, 12 hours: Can Twitter catch them? –’ Here we aggregate all tag-related tweets together, so we may consider our findings to be general between teams.

By classifying each challenge-related message to either the broadcast or targeted variety, we were able to investigate the extent of conscious effort towards targeted mobilization over time as the deadline approaches. In addition, by combining this information with the approximate geographical location of the target audience, it was also possible to investigate whether this targeting was effective in converging towards the target cities geographically.

It is important to disentangle two potential explanations of the phenomenon of targeted recruitment in this time-critical social mobilization. One explanation is the explicit effort on behalf of participants to identify and recruit acquaintances who are closer to the target geography. But another explanation is also possible, namely the intrinsic structure of global communication and its role in routing information automatically towards hubs. This is particularly relevant, since two of the target cities, London and New York City, are recognized global hubs, with disproportionately higher social, financial, and social media ties to the rest of the world. To disentangle the roles played by global communication structure and by individual participant choices, we developed a biased routing model that parameterizes the degree of explicit heuristic targeting, and use it to quantify the behavior observed.


Media Exposure

Figure 1 shows the daily volume of Tweets related to the Tag Challenge and traffic to the official website (see Materials & Methods). The dates of major media articles concerning the challenge are also indicated. There is clearly some degree of correlation between media coverage and social media traffic. However significant traffic persists on days with no media coverage suggesting that there is also a slower process of peer-to-peer sharing of information about the challenge.

Figure 1. Daily volumes of Tag Challenge related Tweets and Web hits on up to the challenge day. Major media coverage events are highlighted.

We also see from Figure 2 that our team's social media presence, measured by the daily number of impressions of our presence on Facebook, provided access to daily volumes of several thousand potential searchers. Although this measure counts repeated exposure by the same users, the total sums to over 29,000. The official Tag Challenge Facebook page also created over 86,000 impressions. We can therefore infer the presence of a hidden network of ‘passive recruits’ – people who are aware of the challenge, yet are not sufficiently motivated to sign up and recruit others, but who will report sightings of the target. Such a mechanism was found to be a necessary condition for successful social mobilisation in geographical search [26].

Figure 2. Daily number of impressions on Facebook for the winning team CrowdscannerHQ, and the official Tag Challenge organizers.

The vertical dotted line denotes the release of the first mug shots.

Evidence of Targeted Mobilization

Figure 3 shows the distance scaling behaviour of traffic to the Tag Challenge Web site in the days leading up to the challenge. The distance from the originating Internet Protocol (IP) address to the nearest Tag Challenge city was calculated for each unique visitor. After filtering distance independent traffic and smoothing (see Materials & Methods), we observe a strong trend of geographical convergence towards the target cities over time, quantified by the Pearson coefficient .

Figure 3. Distance convergence toward Tag Challenge cities of web hits on

We consider a moving average of distance filtered daily tweet traffic (MA(prop(t))) (grey circles), which is fit with a linear regression (red line) giving a correlation of .

Figure 4 considers the rate at which individual users are specifically targeted (i.e. @-mentioned) in the Tweets related to the Tag Challenge. This distinguishes messages which broadcast to all followers from those which target specific users perceived to be useful for locating the targets (we exclude Tweets from the participating teams from this analysis). The proportion of Twitter traffic targeting individuals increases in the 6 days leading up to the Tag Challenge .

Figure 4. The total daily number of Tweets (black line), the number targeting individuals via @-mentions (blue line) and their proportion (red line).

Correlation of targeted proportion with time was found as .

This trend is additionally supported by Figure 5, which considers the location of users specifically targeted (@-mentioned) in Tweets. The effect of spurious noise was mitigated with the use of a 4 day moving average. The daily proportion of these targeted users located in the tag cities (defined as 25 km from the city centre), with respect to the total number of daily targeted users, was seen to increase approaching the challenge day. A strong correlation with time was found ( using the raw, unsmoothed data). This result suggests that Twitter users successfully routed information geographically towards users more likely to locate a target.

Figure 5. Daily proportion of @-mentioned users which are located within a tag city.

Noise is eliminated by smoothing with a 4.

The increase in both the rate of targeted messaging and its geographical convergence suggests that, as time becomes more critical, people become surprisingly more rather than less targeted in their social mobilization heuristic. This is a novel observation at such short temporal scales (days to hours), and calls for devising viral incentive schemes that provide distance- or time-sensitive rewards to approach the target geography more rapidly, with applications in multiple areas from emergency preparedness [1], [19] to political mobilization [27], [28].

At this point, we emphasise that these results are not team-specific. All tweets related to the challenge were collected and analysed together. Each of these tweets might refer to a particular team or may simply wish to draw attention to the challenge. While we present the exposure of our team€s Facebook page in Figure 2, we also compare it to the official page of the challenge itself which may be considered team-agnostic. Figure 3 presents the geographical convergence of traffic to the official Tag Challenge website, this traffic cannot be attributed to any particular team.

Disentangling Targeting Behavior

The results above suggest the existence of a significant effort by people to mobilize others in a targeted manner, moving towards the target cities. However, it is reasonable to suspect that this observed behavior may be, at least in part, an artefact of the importance of major cities like New York and London – which may receive a disproportionately large amount of traffic regardless of the propagation process. Thus it is important to quantify the extent to which we can expect to reach those cities without any deliberate targeting (top hubs are listed in Table S1 and Table S2 in File S1), then use this baseline to quantify the amount of targeting needed to produce the observed behavior in the Tag Challenge.

To investigate this issue, we construct a network of communications between global Metropolitan Statistical Areas (MSA). We use flight frequency data between MSAs as a proxy for social media communication intensity, which have been shown to correlate well (and more strongly than distance) with traffic from Twitter data [29]. Air traffic connections reflect the cultural/linguistic and even post-colonial and post-Commonwealth expatriate ties that have been found to be present in social networks [30], [31] as well as inter-city economic relations [32] and internet connectivity [33]. The raw flight numbers and proportions of flights between cities are represented in Figures S1 and S2 in File S1. An additional advantage of using the air flight network is that we are able to capture the structure of what is a combination of different social media platforms which make up a fragmented global social media ecosystem. This includes not only email but also Facebook, Orkut and Weibo which dominate in North America and Europe, the Lusosphere and China respectively along with many others.

We simulate a random walk over the MSA network, which represents the diffusion of social mobilization using social media and other means of communication (see Materials and Methods for more details; the list of cities can be found in Tables S3-7 in File S1). To capture the effect of different mixing of targeted and broadcasting behaviour, we assign some degree of geographical greediness (targeting) in making the mobilization decisions. Such a random walk does not attempt to replicate the dynamics of information flows over time. Rather it seeks to determine the static centrality of specific nodes indirectly from the stationary occupation probability distribution of a walker through resampling. The level of greediness is fixed in each simulation, and given that the degree of targeting increases as the challenge approaches, each simulation measures this property of the network at a fixed moment in time.

With probability a random walker on a node chooses to move (i.e. send a message) to a connected node randomly according to the outgoing edge weights (including self-edges capturing local communication within the MSA). With probability the walker instead moves greedily to one of its neighbours which enjoys the network-constrained, closest geographic position to any Tag Challenge city (it does this independently of the edge weight). Note that this will generally lead to an overestimation of the centralities of the Tag Challenge cities since it assumes that people can successfully leverage any link to a Tag Challenge city no matter how weak it might be. Therefore the degree of greediness (targeting) we report to reproduce our observations should be considered a lower bound. The agent takes very large steps in space only when there exist long haul flight connections. The greedy behavior represents an agent who actively chooses to leverage social ties which are perceived to be more likely to find a target due to privileged location in space [11]. The distance to the nearest tag city may be considered a heuristic which agents use to target a particular city. Note that moving to the geographically closest city may be sub-optimal, since the new, closer city may not in fact be well connected to the target city. However agents are unlikely to have perfect knowledge of the network and so the shortest path to the target city. When a walker chooses to move greedily and has more than one Tag Challenge city among its neighbours, it chooses one at random.

We perform simulations to determine the stationary probability distributions of the above random walk (10 steps per simulation), given various degrees of greedy targeting towards Tag Challenge cities. From this stationary probability we infer the effective centralities of the different cities.

Figure 6 (red) shows the unbiased centralities without any greedy targeted mobilization. The figure highlights the existence of clear peaks at hub cities, including some tag cities themselves. This random walk, corresponding to untargeted broadcast mobilization by participants, leads to of traffic ending up in one of the Tag Challenge cities. While this is a significant proportion in a global network of metropolitan areas, largely driven by the centralities of London and New York, it is significantly lower than the observed proportion. In particular, as shown in Figure 5 the proportion of targeted tweets with @-mentions increases to as the deadline approaches. The proportion of those tweets that are in one of the target cities is (Figure 5). This means that the proportion of messages reaching the target cities is approximately , almost an order of magnitude higher than what would be expected by an unbiased, non-targeting random flow of messages.

Figure 6. Plot of stationary distribution during a random walk on global MSA network, with increasing degree of greediness (targeting) moving clockwise from top left.

The red line represents an pure, untargeted random walk, corresponding to pure random mobilization via broadcast messaging. (Top left) The horizontal dashed line represents the uniform distribution of centralities expected in a fully connected graph. The black line in other plots represents a greedy random walk. (Bottom right) When the greediness is increased to we match the observed proportion of targeted messages reaching the Tag Challenge cities. The shading represents MSAs from different continents. The 5 tag cities are marked with vertical, dashed blue lines.

Figure 6 (bottom left, black) highlights that a significant degree of targeting behavior, corresponding to , is required to approach the approximate proportion of time spent in the Tag Challenge cities as observed in the data (Figure 7 shows the unbiased and targeted centralities on a map). In other words, people not only need to target others with personalized recruitment messages, but they also need to do so using a geographically informed heuristic at least of the time. Even when restricting the communication network to North America and Europe (see Figure S3 in File S1), to mitigate the affects of linguistic barriers, significant targeting remains necessary to reproduce the observed proportions of traffic. However the diverse originating locations of global traffic to our team's site suggests that awareness of the challenge did transcend linguistic barriers, justifying consideration of the full global network (see Figure S4 in File S1).

Figure 7. Map showing communication network within Europe and North America following an unbiased random walk (upper) and under 30% targeting (lower).

The area of red circles are proportional to centrality.


Sixty years ago, social psychologist Stanley Milgram redefined our notion of social distance with his landmark Six Degrees of Separation experiment [11], showing that we are, on average, only 6 hops of friendship away from anyone else on earth. Facebook found the degree of separation to be only 4 in their digital network [34]. Endeavors like the Tag Challenge are set to redefine our conception of the temporal and spatial limits of technology-mediated social mobilization in the Internet age, showing that using a network of mobilised people, in principle, we can find any person (who is not particularly hiding) in less than 12 hours.

The data analysed in this paper includes public tweets, data collected via our own team's efforts and data aceesible to us due to our privileged status as eventual winners. However we can confidently conclude that this relatively small sample is representative of the much larger flow of private communications. The conclusions drawn are not specific to our team and since we do not have access to data collected by other teams for comparison we neccesarily limit our analysis to this.

We have shown that this 12 hours of separation phenomenon relies crucially on the ability of social networks to mobilize in a targeted manner, using geographical information in recruiting participants. The data provides significant support for the presence of geographical targeting, even under time pressure. In fact, we observe that targeting increases as a function of time pressure, as the challenge approaches its deadline.

We were also able to quantify the intensity of targeted mobilization behavior, in comparison with the baseline of untargeted flow of global social media communication. This supports the general notion that social networks are able to tune their geographical communication to suit the task at hand. For example, using Twitter data, it was shown that the Occupy Wall Street social movement in the United States exhibits significant localization (at the state level) when it comes to messages that facilitate resource mobilization and coordination, with reference protest action and specific places and times. In contrast, information flows across state boundaries are more likely to contain framing language to develop narrative frames that reinforce collective purpose at the national level [7], [8]. Our findings complement these results, by contributing towards a general theory that link the purpose of social mobilization to the temporal and spatial dynamics of different forms of communication.

Within high volume social media communications, considerable effort is required to persuade people about the importance of a particular message or cause or even to notice it at all. Both considerations are crucial for a successful mobilisation process. Previous work has shown that shared news stories of interest become obsolete on a timescale [35] and that the amount of cognitive resources an individual dedicates to online communications is limited and inelastic [36], meaning that the intrinsic importance of the message cannot be relied upon to overcome informational overload and to motivate its sharing. In addition, active interaction with a task requires much more attentional cost to an individual than simple observation [37] and connected individuals vital for propagation also have an associated high inertia [38]. The importance of targeted personal interactions (typified by Twitter @ mentions) can be seen in this context; personalised messages obligate greater cognitive effort from the receiver overcoming the inevitable slide into obsolescence of a single subject over time. Geographical targeting now has an additional advantage beyond the increased chance of recruiting a first hand searcher as the targeting converges; increased personal affiliation of the receiver with the message. The empirical evidence presented above suggests that large distributed communities intuitively understand these considerations and can leverage them in a timely and powerful manner.

Materials and Methods

1. Twitter

The Web site Twitter is an extremely popular micro-blogging service which also incorporates a social network. Users create short messages (‘tweets’) of 140 characters or less which contain text and/or shortened hyperlinks to other webpages or images of interest. Users tweets appear in the feed of all other users who have chosen to follow her. A user may also opt to make the content of their tweets visible to the public. Tweets contain hashtags to signify that the tweet is relevant to a particular topic i.e. # playTag was a popular hashtag for the Tag Challenge. Users may also choose to target a Tweet to a particular user, regardless of whether the users are connected by a follower/following link, rather than simply broadcasting to her followers. This is done by including a user's Twitter handle e.g. @crowdscannerhq.

We collected the full set of relevant tweets from the period 13 February to 10 April using a paid service [39] according to appropriate hash tags and keywords or targeted mentions (@ mentions) of competing teams. Tweets originating from @TagTeam_, @CrowdscannerHQ, @TagChallenge, @Tagteam and @Tag_Challenge were discarded. Tweets from the participating teams were excluded from these daily totals since the teams had an interest in increasing the daily tweet volumes. The tweets were then manually filtered for relevance by relevant hashtags such as # playTag, # tagchallenge, # tag and any links contained within the tweet. 1263 tweets out of 2181 remained after the filtering process.

Tweets from users with no reliable location information which could be geo-coded were discarded, further care was taken to recognise and eliminate artefacts of the geocoding process which led to spurious latitude/longitude coordinates. e.g. ‘The world’ becoming ‘(0.0,0.0)’. Tweets originating from within 25 km of the defined city centres [40] were considered to originate from the city.

2. Facebook

As the largest Web-based social network in the world, Facebook has over 1 billion active users. The daily number of impressions were sourced using the Facebook Insights Application Programming Interface (API) [41]. This covers any user engagement with Tag Challenge page, such as posts on one's “wall” or expressions of approval by friends using the “like” button, etc.

3. Google Analytics

The traffic to the official website was recorded between 14 February and 4 April. A total of 1000 unique users and their IP addresses were recorded in this period. We used an online service [42] to derive approximate location coordinates from this IP. To mitigate the effect of noise due to the variable volumes of traffic, a moving average was taken for each day, using a sliding window defined as , where is the proportion of distance ordered tweets within the -th percentile on day which were within a tag city and is the order of the moving average. Figure 2 corresponds to and .

Even the full set of unsmoothed data (, ) reveals a geographically convergent trend ). We excluded tweets from the Tag teams since the teams may have actively pursued a strategy of geographical convergence skewing the results.

4. Simulation

A coarsened network of air travel connections was constructed as follows. Firstly the largest 280 Metropolitan Statistical Areas (MSA) were considered across all continents. Polycentric MSAs such as the New York Tri-state area in New York were collapsed into one node in the network. A full list of global airports and connections between them was taken from Open Flights [43]. In order to coarsen the data, airports were agglomerated to the geographically closest MSA using open data [44] [29]. Now the many airports of Greater London; Heathrow, Stanstead, Luton, Gatwick etc are all considered together. This coarsening helps mitigate the effect of anomalous behavior within sparsely populated regional clusters with unusual locality, such as Alaska [45].

The network edge weights are based on a normalised number of flights between every 2 cities, with self loop weights set to 0.39 representing the probability of communication within the same MSA [29]. We construct an adjacency matrix representation of the network, namely an square matrix , where is the number of MSAs, and is the weight of the directed edge between cities and . The adjacency matrix was row normalised, such that row represents a probability distribution over the target node reached by a random walker leaving node . This results in an adjacency matrix which is nearly symmetric.

We then simulated a random walk over this network. With probability , so called greediness bias, we move towards the closest Tag Challenge cities. And with probability we take a pure random walk with probabilities proportional to the outgoing edge weights. A random walk, with corresponds to the eigenvector centrality vector of the different MSAs (see File S1 for further details).

Supporting Information

File S1.

Supporting information for the manuscript, including Figures S1–S4 and Tables S1–S7. Figure S1, Heat map of raw ight numbers. Figure S2, Heat map of normalised and locality-adjusted adjacency matrix with greediness set to 0.3. Figure S3, Plot of stationary distribution during a random walk on reduced MSA network (comprising only North America and Europe), with increasing degree of greediness moving clockwise from top left. Figure S4, Heatmap showing tra_c to on 48 hours approaching the challenge. Table S1, Table of regional hub MSA's and centralities. Table S2, Table of MSA's with highest centrality values after locality adjustment (and tag cities for comparison). Table S3, List of cities in North America. Table S4, List of cities in South America. Table S5, List of cities in Europe. Table S6, List of cities in Asia. Table S7, List of cities in Africa



We would like to thanks David Alan Grier for discussion. We are grateful to all our volunteers that joined the CrowdScanner team.

Author Contributions

Conceived and designed the experiments: IR MC AR SD VN JRd EW SM. Performed the experiments: IR MC AR SD. Analyzed the data: AR IR MC JM MV. Wrote the paper: IR AR MC NRJ. Built the software platform: SD JM.


  1. 1. Gao H, Barbier G, Goolsby R (2011) Harnessing the crowdsourcing power of social media for disaster relief. IEEE Intelligent Systems 26: 10–14.
  2. 2. Hellerstein J, Tennenhouse D (2011) Searching for Jim Gray: a technical overview. Communications of the ACM 54: 77–87.
  3. 3. Pickard G, Pan W, Rahwan I, Cebrian M, Crane R, et al. (2011) Time-critical social mobilization. Science 334: 509–512.
  4. 4. Liben-Nowell D, Novak J, Kumar R, Raghavan P, Tomkins A (2005) Geographic routing in social networks. Proceedings of the National Academy of Sciences 102: 11623–11628.
  5. 5. Onnela JP, Arbesman S, González MC, Barabási AL, Christakis NA (2011) Geographic constraints on social network groups. PLoS one 6: e16939.
  6. 6. Chomsky N (2012) Occupy. Penguin Books.
  7. 7. Conover MD, Davis C, Ferrara E, McKelvey K, Menczer F, et al. (2013) The geospatial characteristics of a social movement communication network. PLOS ONE 8: e55957.
  8. 8. Conover MD, Ferrara E, Menczer F, Flammini A (2013) The digital evolution of occupy wall street. PloS one 8: e64679.
  9. 9. Tang J, Cebrian M, Giacobe N, Kim H, Kim T, et al. (2011) Reecting on the darpa red balloon challenge. Communications of the ACM 54: 78–85.
  10. 10. Iribarren JL, Moro E (2009) Impact of Human Activity Patterns on the Dynamics of Information Diffusion. Physical Review Letters 103: 038702+.
  11. 11. Milgram S (1967) The small world problem. Psychology Today 61: 60–67.
  12. 12. Dodds PS, Muhamad R, Watts DJ (2003) An experimental study of search in global social networks. Science 301: 827–829.
  13. 13. Kleinberg J (2000) Navigation in a small world. Nature 406: 845.
  14. 14. Kleinberg J (2000) The small-world phenomenon: an algorithm perspective. In: Proceedings of the thirty-second annual ACM symposium on Theory of computing. ACM, 163–170.
  15. 15. Adamic LA, Adar E (2005) How to search a social network. Social Networks 27: 187–203.
  16. 16. Watts D, Dodds P, Newman M (2002) Identity and search in social networks. Science 296: 1302–1305.
  17. 17. Miritello G, Moro E, Lara R, Martínez-López R, Belchamber J, et al.. (2013) Time as a limited resource: Communication strategy in mobile phone networks. Social Networks
  18. 18. Ozel F (2001) Time pressure and stress as a factor during emergency egress. Safety Science 38: 95–107.
  19. 19. Watts D, Cebrian M, Elliot M (2013) Dynamics of social media. In: Council NR, editor, Public Response to Alerts andWarnings Using Social Media: Report of aWorkshop on Current Knowledge and Research Gaps, Washington, D.C.: The National Academies Press.
  20. 20. Iribarren JL, Moro E (2011) Affinity paths and information diffusion in social networks. CoRR abs/1105.3316.
  21. 21. (2012). Tag Challenge Press Release. Available: Accessed 2013 Mar 13.
  22. 22. Firth N (2012) Social media web snares ‘criminals’. New Scientist.
  23. 23. Kleinberg J, Raghavan P (2005) Query incentive networks. In: Proceedings of the IEEE Symposium on Foundations of Computer Science. 132–141.
  24. 24. Cebrian M, Coviello L, Vattani A, Voulgaris P (2012) Finding red balloons with split contracts: Robustness to individual's selfishness. In: Proceedings of the ACM Symposium on Theory of Computing. 775–788.
  25. 25. Rahwan I, D'souza S, Rutherford A, Naroditskiy V, McInerney J, et al. (2013) Global manhunt pushes the limits of social mobilization. IEEE Computer 46 (4): 68–75.
  26. 26. Rutherford A, Cebrian M, D'souza S, Moro E, Pentland A, et al. (2013) The Limits of Social Mobilisation. Proceedings of National Academy of Science 110 (16): 6281–6286.
  27. 27. González-Bailón S, Borge-Holthoefer J, Rivero A, Moreno Y (2011) The dynamics of protest recruitment through an online network. Scientific Reports 1.
  28. 28. Bond R, Fariss C, Jones J, Kramer A, Marlow C, et al. (2012) A 61-million-person experiment in social influence and political mobilization. Nature 489: 295–298.
  29. 29. Takhteyev Y, Gruzd A, Wellman B (2012) Geography of twitter networks. Social Networks 34: 73–81.
  30. 30. State B, Park P, Weber I, Mejova Y, Macy M (2013) The mesh of civilisations and international email flows. arXiv 1303.0045.
  31. 31. Ugander J, Karrer B, Backstrom L, Marlow C (2011) The anatomy of the facebook social graph. CoRR abs/1111.4503.
  32. 32. Beaverstock JV, Smith RG, Taylor PJ (1999) A roster of world cities. Cities 16: 445–458.
  33. 33. Choi JH, Barnett GA, Chon BS (2006) Comparing world city networks: a network analysis of internet backbone and air transport intercity linkages. Global Networks 6: 81–99.
  34. 34. Backstrom L, Boldi P, Ugander J, Vigna S (2011) Four degrees of separation. arXiv 111.4570.
  35. 35. Wu F, Huberman BA (2007) Novelty and collective attention. Proceedings of the National Academy of Sciences 104: 17599–17601.
  36. 36. Miritello G, Lara R, Cebrian M, Moro E (2013) Limited communication capacity unveils strategies for human interaction. Scientific reports 3.
  37. 37. Backstrom L, Bakshy E, Kleinberg J, Lento TM, Rosenn I (2011) Center of attention: How facebook users allocate attention across friends. In: In International Conference onWeblogs and Social Media (ICWSM.
  38. 38. Hodas NO, Lerman K (2012) How visibility and divided attention constrain social contagion. In: Privacy, Security, Risk and Trust (PASSAT), 2012 International Conference on and 2012 International Confernece on Social Computing (SocialCom). IEEE, 249–257.
  39. 39. Hootsuite. Hootsuite. Available: Accessed 2013 Apr 9.
  40. 40. Geohack. Abailable: Accessed 2013 Apr 9.
  41. 41. Facebook insights.Available: 2013 Apr 9.
  42. 42. Ipinfo. Available: Accessed 2013 Apr 9.
  43. 43. Openights. Available: Accessed 2013 Apr 9.
  44. 44. Geonames. Available: Accessed 2013 Apr 9.
  45. 45. Guimera R, Mossa S, Turtschi A, Amaral LAN (2005) The worldwide air transportation network: Anomalous centrality, community structure, and cities' global roles. Proceedings of the National Academy of Sciences 102: 7794–7799.