The Geospatial Characteristics of a Social Movement Communication Network

  • Michael D. Conover,

    midconov@indiana.edu

    Affiliation Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America

  • Clayton Davis,

    Affiliation Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America

  • Emilio Ferrara,

    Affiliation Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America

  • Karissa McKelvey,

    Affiliation Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America

  • Filippo Menczer,

    Affiliation Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America

  • Alessandro Flammini

    Affiliation Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America


Abstract

Social movements rely in large measure on networked communication technologies to organize and disseminate information relating to the movements’ objectives. In this work we seek to understand how the goals and needs of a protest movement are reflected in the geographic patterns of its communication network, and how these patterns differ from those of stable political communication. To this end, we examine an online communication network reconstructed from over 600,000 tweets from a thirty-six week period covering the birth and maturation of the American anticapitalist movement, Occupy Wall Street. We find that, compared to a network of stable domestic political communication, the Occupy Wall Street network exhibits higher levels of locality and a hub and spoke structure, in which the majority of non-local attention is allocated to high-profile locations such as New York, California, and Washington D.C. Moreover, we observe that information flows across state boundaries are more likely to contain framing language and references to the media, while communication among individuals in the same state is more likely to reference protest action and specific places and times. Tying these results to social movement theory, we propose that these features reflect the movement’s efforts to mobilize resources at the local level and to develop narrative frames that reinforce collective purpose at the national level.

Introduction

One of the most prominent American political movements of the past thirty years, Occupy Wall Street (‘Occupy’) is remarkable in the extent to which social media played a central role in its development and organization [1], [2]. In this study, we examine how the needs and constraints of social movements are reflected in the geospatial characteristics and information sharing practices of Twitter users engaged in communication about the Occupy movement. Specifically, we focus on the geographic distribution of these users and the ways in which the relationships among them diverge from those of users contributing to the two most popular streams for stable political discourse in the United States, ‘Top Conservatives on Twitter’ and ‘Progressives 2.0.’

The organizing forces underlying successful social movements have been studied extensively by sociologists and political scientists [3]–[7]. From this body of work common themes have emerged, including the problems of resource mobilization and collective framing, which together constitute two of the core issues any social movement must address in order to effect social or political change. Resource mobilization refers to the process by which a social movement must marshal the financial, material, and human resources required to sustain its activities [8]. Collective framing is a process whereby the constituents of a social movement, through formal or informal processes, come to establish the narratives, language, and imagery that capture the essential features of the movement’s purpose and struggle [9]. Effective framing helps to foster a sense of community and engagement, and can be a powerful response to countervailing social pressures from establishment organizations [10].

Here we study Occupy Wall Street, a social movement focused on issues relating to the uneven distribution of wealth, social inequality, corporate greed, and the regulation of major financial institutions. Since the first protest on September 17, 2011, a major feature of the movement has been the long-term physical occupation of high-visibility encampments, often found in parks, banks, libraries and foreclosed homes. As a result, the Occupy movement requires substantial supporting infrastructure, including housing and sanitation facilities, as well as access to communication technologies. In spite of this, Occupy has sustained a lasting presence in American cities including New York City, Oakland, Washington, D.C., and Boston, which also represent key loci of decision making and protest activity [1], [2]. Under the Occupy model, proposals are brought to a vote before a general assembly, a form of direct democracy in which any participant is free to comment or vote on any proposal under consideration. The most prominent among these organizational structures is the New York City General Assembly, which has been responsible for producing policy and key narrative frames such as the popular protest slogan, “We are the 99%,” which references the disproportionate concentration of wealth among the top 1% of the world’s population [11].

Social media have played a prominent role in facilitating communication and coordination throughout the development of the Occupy Wall Street movement. For example, the first call to action in the Canadian anticapitalist magazine ‘AdBusters’ used the Twitter ‘hashtag’ #occupywallstreet as one of just ten words featured in a full-page ad. Ever since, the Twitter platform has been used extensively by movement participants [2], with #ows being one of the hundred most popular hashtags on Twitter for the year 2011. In addition to Occupy, Twitter has also played a prominent role in several foreign social movements, most notably in the Egyptian revolutionary protests of 2011 [12]–[14].

In this work, we seek to understand the relationship between the geospatial dimensions of social movement communication networks and the organizational pressures facing such movements. To accomplish this, we use a state-of-the-art location inference technique to model relationships among users as a weighted directed network of communication flows between states, in which the weight of each edge corresponds to the volume of traffic between pairs of locations. Using this framework we investigate three distinct relationships: attention allocation and proximity to on-the-ground events, resource mobilization and localized information sharing, and the role of collective framing in long-distance communication.
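As a minimal sketch of the network model described above, the following Python code aggregates individual retweet events into weighted, directed state-to-state edges. The input layout (pairs of source and retweeting user names), the `user_state` lookup, and the edge direction (from the original author's state to the retweeter's state) are illustrative assumptions rather than details given in the text.

```python
from collections import Counter


def build_state_network(retweets, user_state):
    """Aggregate retweet events into weighted directed state-to-state edges.

    retweets: iterable of (source_user, retweeting_user) pairs (assumed layout).
    user_state: dict mapping a user name to an inferred state (assumed lookup).
    Returns a Counter keyed by (source_state, destination_state), where each
    value is the number of retweets observed between that pair of states.
    """
    edges = Counter()
    for source, retweeter in retweets:
        src = user_state.get(source)
        dst = user_state.get(retweeter)
        # Skip events where either endpoint lacks a state-level estimate.
        if src is None or dst is None:
            continue
        edges[(src, dst)] += 1
    return edges


def local_fraction(edges):
    """Share of total edge weight that stays within a single state."""
    total = sum(edges.values())
    if total == 0:
        return 0.0
    local = sum(w for (src, dst), w in edges.items() if src == dst)
    return local / total
```

A self-loop such as `("NY", "NY")` records intrastate traffic, so `local_fraction` gives one simple, hypothetical way to summarize how much of a stream's traffic is local.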

With respect to the issue of attention allocation, we find that compared to stable domestic political communication the Occupy Wall Street movement exhibits very high levels of geographic concentration, with users in New York, California, and Washington D.C. producing more than half of all retweeted content. Aside from these hubs, however, we find that the appeal of content relating to Occupy Wall Street has a disproportionately local audience. With extended, high profile encampments and large-scale protest action playing central roles in the Occupy movement, we propose that this structural feature reflects the importance of mobilizing human resources at the local level.

Finally, we report on evidence indicating that the content of communication at the national level is distinct from the content of communication among users in the same state. Comparing intrastate versus interstate communication, we find that the terms most overrepresented in interstate communication relate to the movement’s core framing language and the news media, while the terms most overrepresented in local communication reference physical places, protest action, and specific times. These results support the hypothesis that local-level communication activity is driven by the challenge of resource mobilization, while long-distance communication is more strongly associated with collective framing processes.

Materials and Methods

Twitter Platform

Twitter is a popular social networking and microblogging site extensively explored in recent literature [15]–[21]. Among other applications, it has been used to study influence and credibility [22]–[26] and social structure [27]–[29], and to monitor users’ sentiment [30]–[33]. Twitter users can post 140-character messages containing text and hyperlinks, called tweets, and interact with one another in a variety of ways. Communication on Twitter is characterized by directed, non-reciprocal social links that allow users to subscribe to the stream of content produced by another user. The content produced by every user an individual follows is aggregated into a single streaming feed, from which an individual can selectively rebroadcast content to his or her followers by choosing to retweet it. In this way, a retweet serves to broaden the potential audience of a piece of content, and signifies that information has been transmitted between two individuals. Hashtags, short tokens prepended with a pound sign (e.g. #taxes or #obama), constitute another important feature of the platform, and allow the content produced by many individuals to be aggregated into a custom, topic-specific stream including all tweets containing a given token.

Data

The analysis described in this article relies on data collected from the Twitter ‘gardenhose’ streaming API between July 2011 and March 2012 – a nine month period including the birth and maturation of the Occupy Wall Street movement. The gardenhose provides an approximately 10% sample of the entire Twitter stream in a machine-readable format. Gardenhose tweets include useful metadata, among them a unique tweet identifier, the content of the tweet (including hashtags and hyperlinks), a timestamp, the username of the account that produced the tweet, a free text ‘location’ string associated with the originating user’s profile, and for retweets, the account names of the other users associated with the tweet. Tweets from geolocation-enabled mobile devices also report latitude/longitude coordinates; however, the incidence of tweets carrying these data is too low for them to be useful as a general feature.

To isolate a representative sample of Occupy Wall Street content we flagged for collection any tweet containing hashtags associated with the Occupy movement, including #ows and #occupy{*} (e.g. #occupywallst, #occupyboston, etc.). To provide a baseline against which to compare our observations, we also extracted content originating from the two most popular communication channels associated with stable domestic political communication, #tcot (Top Conservatives on Twitter) and #p2 (Progressives 2.0). In total, this sampling procedure produced 1,522,415 tweets associated with Occupy Wall Street and 825,262 tweets associated with domestic political communication. As this analysis is concerned primarily with information spreading processes we consider only retweet events from this corpus, resulting in 676,369 retweets among 257,657 users associated with Occupy Wall Street, and 259,703 retweets among 68,049 users associated with stable domestic political communication. Henceforth, we consider these corpora to constitute representative samples of retweet interactions among users participating in the streams of content associated with the Occupy Wall Street movement and stable domestic political communication in the United States.
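The hashtag-based selection above can be sketched as follows. The regular expressions are an assumed reading of the stated criteria (#ows, any hashtag beginning with #occupy, and the #tcot and #p2 baselines), not the authors' actual collection code, and the stream labels are hypothetical names.

```python
import re

# Assumed patterns: "#ows" as a whole tag, or any tag starting with "#occupy".
OCCUPY_PATTERN = re.compile(r"#(?:ows|occupy\w*)\b", re.IGNORECASE)
# Baseline channels for stable domestic political communication.
BASELINE_PATTERN = re.compile(r"#(?:tcot|p2)\b", re.IGNORECASE)


def classify_stream(text):
    """Return the set of streams (hypothetical labels) a tweet's text belongs to."""
    streams = set()
    if OCCUPY_PATTERN.search(text):
        streams.add("occupy")
    if BASELINE_PATTERN.search(text):
        streams.add("political")
    return streams
```

The trailing word boundary keeps longer, unrelated tags from matching a shorter target; for instance, a tag that merely begins with the letters of #ows is not counted as Occupy content.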

Geocoding

To facilitate a geospatial analysis of communication activity associated with these content streams we require a high quality method to infer individual users’ locations. To accomplish this, we rely on self-reported location strings and the services of a commercial geocoding API. This technique, popularized in work by Onnela et al. [34], has been shown to produce high-resolution, high-quality geolocation data in the presence of geographically meaningful input.

A caveat to this technique, however, is that it relies on raw text generated by a broad swath of the Twitter population, and so we find geographically meaningless location descriptors included in the dataset. To address this issue we rely on an extensive hand-curated blacklist of popular non-geographical responses such as ‘everywhere’ and ‘the dance floor’. To produce this list we sorted all location strings by popularity and reviewed the thousand most popular strings manually, blacklisting those that did not correspond to geographically meaningful entities. Because the popularity of location strings follows a long-tailed distribution, 53% of all tweets in the data set are associated with a location among the 1,000 most popular responses, with 27% of all tweets containing one of the top hundred location strings. From this set of one thousand we blacklisted 161 non-location strings, corresponding to 6% of the tweets associated with the 1,000 most popular responses.
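The popularity ranking used to select strings for manual review can be sketched as below. The normalization (trimming whitespace and lowercasing) is an assumption, since the text does not say how strings were canonicalized, and each element of the input list is taken to be the location string attached to one tweet.

```python
from collections import Counter


def top_k_coverage(location_strings, k):
    """Rank location strings by popularity and measure coverage of the top k.

    location_strings: one free-text location string per tweet (assumed layout).
    Returns (top, coverage), where top is a list of (string, count) pairs and
    coverage is the fraction of non-empty strings accounted for by the top k.
    """
    counts = Counter(
        s.strip().lower() for s in location_strings if s and s.strip()
    )
    total = sum(counts.values())
    top = counts.most_common(k)
    covered = sum(count for _, count in top)
    return top, (covered / total if total else 0.0)
```

Calling `top_k_coverage(strings, 1000)` would yield the candidate list for hand review together with the share of tweets those candidates cover.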

To improve recall in the presence of novel input, we used a modified version of the Ratcliff-Obershelp algorithm [35] to detect fuzzy matches between free text location strings and the blacklist of popular non-location responses. As a result, because ‘the dance floor’ is in the set of blacklist responses, strings taking a slightly modified form, such as ‘on the dance floor,’ will also be classified as invalid input. The hand-coded blacklist combined with the Ratcliff-Obershelp fuzzy matching technique resulted in 9% of the free-text location strings being classified as non-location input.
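A rough illustration of this fuzzy blacklist check is given below. It uses Python's `difflib.SequenceMatcher`, whose matching algorithm is closely related to Ratcliff-Obershelp "gestalt pattern matching"; the authors' modifications are not described in the text, and the similarity threshold of 0.85 is an assumed value chosen only for the example.

```python
from difflib import SequenceMatcher


def is_blacklisted(location, blacklist, threshold=0.85):
    """Return True if a location string closely matches any blacklist entry.

    threshold is an assumed cutoff on SequenceMatcher.ratio(), which ranges
    from 0.0 (nothing in common) to 1.0 (identical strings).
    """
    candidate = location.strip().lower()
    for entry in blacklist:
        ratio = SequenceMatcher(None, candidate, entry.lower()).ratio()
        if ratio >= threshold:
            return True
    return False
```

With a blacklist containing ‘the dance floor’, the variant ‘on the dance floor’ shares a 15-character block with the entry and scores about 0.91, so it is rejected, while an ordinary place name falls well below the cutoff.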

From among the remaining responses we submitted location strings to the Bing.com geocoding API, which returns a best-guess estimate for the corresponding physical coordinates. This output is hierarchically formatted to describe the finest level of geographic resolution available. For example, if a user reports ‘Logan Square, Chicago’ as his or her location, the Bing API will return information about the likely zip code, city, state and country associated with that location. However, if the user reports only ‘USA,’ the information provided by the API describes only a country-level guess as to the user’s location. Owing to decreased coverage at the city-level and the proportionately few users associated with each individual city, we utilize the state-level location estimates for the geospatial components of this analysis.
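The decision to keep only state-level estimates can be sketched as a simple filter over the geocoder's hierarchical output. The dictionary layout and key names used here (`country`, `state`, `city`) are hypothetical stand-ins for illustration and do not reflect the actual response format of the Bing API.

```python
def state_level_estimate(geocode_result):
    """Return (country, state) when a result resolves to state level or finer.

    geocode_result: dict with optional 'country', 'state', and 'city' keys
    (a hypothetical layout, not the real geocoding API schema).
    Returns None when the result is missing or only country-level.
    """
    if not geocode_result:
        return None
    country = geocode_result.get("country")
    state = geocode_result.get("state")
    if country and state:
        return (country, state)
    return None
```

Under this sketch a result for ‘Logan Square, Chicago’ would reduce to its country and state, while a result for ‘USA’ alone would be discarded from the state-level analysis.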

In total, 68.4% of Occupy Wall Street users reported location strings, and we were able to obtain geolocation estimates for 55.7% of these accounts. Among this set of users, 60% of the resulting geolocation estimates included state-level metadata. Response rates were somewhat diminished for users associated with the stream of domestic political communication, with 36% of individuals reporting free-text location strings. Using the procedure described above, we were able to obtain geolocation estimates for 29.3% of all users in the