Collective response of human populations to large-scale emergencies

Despite recent advances in uncovering the quantitative features of stationary human activity patterns, many applications, from pandemic prediction to emergency response, require an understanding of how these patterns change when the population encounters unfamiliar conditions. To explore societal response to external perturbations we identified real-time changes in communication and mobility patterns in the vicinity of eight emergencies, such as bomb attacks and earthquakes, comparing these with eight non-emergencies, like concerts and sporting events. We find that communication spikes accompanying emergencies are both spatially and temporally localized, but information about emergencies spreads globally, resulting in communication avalanches that engage in a significant manner the social network of eyewitnesses. These results offer a quantitative view of behavioral changes in human activity under extreme conditions, with potential long-term impact on emergency detection and response.


Introduction
Current research on human dynamics is limited to data collected under normal and stationary circumstances [1], capturing the regular daily activity of individuals [2,2,4,5,3,1,8,9,10,11,12,13,14,15]. Yet there is an exceptional need to understand how people change their behavior when exposed to rapidly changing or unfamiliar conditions [1], such as life-threatening epidemic outbreaks [4,12], emergencies and traffic anomalies, as models based on stationary events are expected to break down under these circumstances. Such rapid changes in conditions are often caused by natural, technological or societal disasters, from hurricanes to violent conflicts [16].
The possibility of studying such real-time changes has emerged recently thanks to the widespread use of mobile phones, which track both user mobility [2,2,3,17] and real-time communications along the links of the underlying social network [1,18]. Here we take advantage of the fact that mobile phones act as in situ sensors at the site of an emergency, to study the real-time behavioral patterns of the local population under external perturbations caused by emergencies. Advances in this direction not only help redefine our understanding of information propagation [19] and cooperative human actions under externally induced perturbations, which is the main motivation of our work, but also offer a new perspective on panic [20,21,22,23] and emergency protocols in a data-rich environment [24].
Our starting point is a country-wide mobile communications dataset, culled from the anonymized billing records of approximately ten million mobile phone subscribers of a mobile company which covers about one-fourth of subscribers in a country with close to full mobile penetration. It provides the time and duration of each mobile phone call [1], together with information on the tower that handled the call, thus capturing the real-time locations of the users [2,3,25] (Methods, Supporting Information S1, Fig. A). To identify potential societal perturbations, we scanned media reports pertaining to the coverage area between January 2007 and January 2009 and developed a corpus of times and locations for eight societal, technological, and natural emergencies, ranging from bombings to a plane crash, earthquakes, floods and storms (Table 1). Approximately 30% of the events mentioned in the media occurred in locations with sparse cellular coverage or during times when few users are active (such as very early in the morning). The remaining events do offer, however, a sufficiently diverse corpus to explore the generic vs. unique changes in the activity patterns in response to an emergency. Here we discuss four events, chosen for their diversity: (1) a bombing, resulting in several injuries (no fatalities); (2) a plane crash resulting in a significant number of fatalities; (3) an earthquake whose epicenter was outside our observation area but affected the observed population, causing mild damage but no casualties; and (4) a power outage (blackout) affecting a major metropolitan area (Supporting Information S1, Fig. B).
To distinguish emergencies from other events that cause collective changes in human activity, we also explored eight planned events, such as sports games, a popular local sports race, and several rock concerts. We discuss here in detail a cultural festival and a large pop music concert as non-emergency references (Table 1, see also Supporting Information S1, Sec. B). The characteristics of the events not discussed here due to length limitations are provided in Supporting Information S1, Sec. I for completeness and comparison.

Results and Discussion
As shown in Fig. 1A, emergencies trigger a sharp spike in call activity (number of outgoing calls and text messages) in the physical proximity of the event, confirming that mobile phones act as sensitive local "sociometers" to external societal perturbations. The call volume starts decaying immediately after the emergency, suggesting that the urge to communicate is strongest right at the onset of the event. We see virtually no delay between the onset of the event and the jump in call volume for events that were directly witnessed by the local population, such as the bombing, the earthquake and the blackout. A brief delay is observed only for the plane crash, which took place in an unpopulated area and thus lacked eyewitnesses. In contrast, non-emergency events, like the festival and the concert in Fig. 1A, display a gradual increase in call activity, a noticeably different pattern from the "jump-decay" pattern observed for emergencies. See also Supporting Information S1, Figs. I and J.
To compare the magnitude and duration of the observed call anomalies, in Fig. 1B we show the temporal evolution of the relative call volume ∆V/⟨V_normal⟩ as a function of time, where ∆V = V_event − ⟨V_normal⟩, V_event is the call activity during the event and ⟨V_normal⟩ is the average call activity during the same time period of the week. As Fig. 1B indicates, the magnitude of ∆V/⟨V_normal⟩ correlates with our relative (and somewhat subjective) sense of the event's potential severity and unexpectedness: the bombing induces the largest change in call activity, followed by the plane crash, whereas the collective reactions to the earthquake and the blackout are somewhat weaker and comparable to each other. While the relative change was also significant for non-emergencies, the emergence of the call anomaly is rather gradual and spans seven or more hours, in contrast with the jump-decay pattern lasting only three to five hours for emergencies (Fig. 1B; Supporting Information S1, Figs. I and J). As we show in Fig. 1C (see also Supporting Information S1, Sec. C), the primary source of the observed call anomaly is a sudden increase of calls by individuals who would normally not use their phone during the emergency period, rather than increased call volume by those who are normally active in the area.
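The relative call volume defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the binned input lists and the example counts are all hypothetical, and the baseline is assumed to be a plain average over the supplied normal weeks.

```python
from statistics import mean

def relative_call_volume(v_event, v_normal_weeks):
    """Return dV/<V_normal> for each time bin.

    v_event        -- call counts per time bin on the event day
    v_normal_weeks -- one list of call counts per normal week, same bins
    (both inputs are hypothetical, pre-binned time series)
    """
    rel = []
    for i, v in enumerate(v_event):
        baseline = mean(week[i] for week in v_normal_weeks)  # <V_normal> in bin i
        # Bins with no normal activity have no defined relative change.
        rel.append((v - baseline) / baseline if baseline > 0 else float("nan"))
    return rel

# Made-up counts: a jump in the second bin followed by a decay.
v_event = [100, 340, 220, 130]
v_normal_weeks = [[90, 110, 100, 120], [110, 90, 100, 80]]
rel = relative_call_volume(v_event, v_normal_weeks)
```

With these made-up numbers the baseline is 100 calls in every bin, so the second bin carries the largest relative change.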
The temporally localized spike in call activity (Fig. 1A,B) raises an important question: is information about the events limited to the immediate vicinity of the emergency or do emergencies, often immediately covered by national media, lead to spatially extended changes in call activity [23]? We therefore inspected the change in call activity in the vicinity of the epicenter, finding that for the bombing, for example, the magnitude of the call anomaly is strongest near the event, and drops rapidly with the distance r from the epicenter (Fig. 2A). To quantify this effect across all emergencies, we integrated the call volume over time in concentric shells of radius r centered on the epicenter (Fig. 2B). The decay is approximately exponential, ∆V(r) ∼ exp(−r/r_c), allowing us to characterize the spatial extent of the reaction with a decay rate r_c (Fig. 2C). The observed decay rates range from 2 km (bombing) to 10 km (plane crash), indicating that the anomalous call activity is limited to the event's vicinity. An extended spatial range (r_c ≈ 110 km) is seen only for the earthquake, which lacks a narrowly defined epicenter. Meanwhile, a distinguishing pattern of non-emergencies is their highly localized nature: they are characterized by a decay rate of less than 2 km, implying that the call anomaly was narrowly confined to the venue of the event. This systematic split in r_c between the spatially extended emergencies and well-localized non-emergencies persists for all explored events (see Table 1, Supporting Information S1, Fig. K).
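As an illustration of how a decay rate r_c could be extracted from shell-integrated call volumes, the sketch below fits a straight line to ln ∆V versus r by ordinary least squares. The text does not state which fitting procedure was used, so the log-linear fit, the function name and the synthetic data are assumptions.

```python
import math

def fit_decay_length(r_km, dv):
    """Fit ln(dV) = a - r / r_c by least squares and return r_c in km.

    r_km -- shell distances from the epicenter (km)
    dv   -- total anomalous call volume in each shell (must be > 0 to be used)
    The log-linear fit is an assumed method, not taken from the paper.
    """
    pts = [(r, math.log(v)) for r, v in zip(r_km, dv) if v > 0]
    n = len(pts)
    mean_r = sum(r for r, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((r - mean_r) ** 2 for r, _ in pts)
    sxy = sum((r - mean_r) * (y - mean_y) for r, y in pts)
    slope = sxy / sxx          # equals -1 / r_c for a pure exponential
    return -1.0 / slope

# Synthetic shells generated with r_c = 2 km, so the fit should recover it.
r = [1, 2, 4, 6, 8]
dv = [500 * math.exp(-x / 2.0) for x in r]
r_c = fit_decay_length(r, dv)
```

On real data the shells far from the epicenter are noisy and may contain non-positive anomalies, which this sketch simply drops before fitting.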
Despite the clear temporal and spatial localization of anomalous call activity during emergencies, one expects some degree of information propagation beyond the eyewitness population [26].
We therefore identified the individuals located within the event region (G_0), as well as a G_1 group consisting of individuals outside the event region who receive calls from the G_0 group during the event, a G_2 group that receives calls from G_1, and so on. We see that the G_0 individuals engage their social network within minutes, and that the G_1, G_2, and occasionally even the G_3 group show an anomalous call pattern immediately after the anomaly (Fig. 3A). This effect is quantified in Fig. 3B, where we show the increase in call volume for each group as a function of their social-network-based distance from the epicenter (for example, the social distance of the G_2 group is 2, being two links away from the G_0 group), indicating that the bombing and plane crash show strong, immediate social propagation up to the third and second neighbors of the eyewitness G_0 population, respectively. The earthquake and blackout, less threatening emergencies, show little propagation beyond the immediate social links of G_0, and social propagation is virtually absent in non-emergencies.
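The construction of the groups G_0, G_1, G_2, ... amounts to a breadth-first expansion over the calls placed during the event. The sketch below is a simplified illustration with hypothetical names and toy data; in particular it ignores the time ordering of calls, which a faithful analysis would presumably need to respect.

```python
def cascade_groups(calls, g0, max_depth=3):
    """Return [G_0, G_1, ...] as a list of sets of user IDs.

    calls     -- iterable of (caller, callee) pairs observed during the event
    g0        -- set of users located in the event region
    max_depth -- how many social steps beyond G_0 to follow (assumed cutoff)
    Simplification: call timestamps are ignored.
    """
    contacts = {}
    for caller, callee in calls:
        contacts.setdefault(caller, set()).add(callee)

    groups = [set(g0)]
    seen = set(g0)
    for _ in range(max_depth):
        nxt = set()
        for user in groups[-1]:
            # Users called by the previous group and not yet in any group.
            nxt |= contacts.get(user, set()) - seen
        if not nxt:
            break
        groups.append(nxt)
        seen |= nxt
    return groups

# Toy call list: "a" is the only eyewitness; ("x", "y") is unrelated traffic.
calls = [("a", "b"), ("a", "c"), ("b", "d"), ("d", "e"), ("c", "a"), ("x", "y")]
groups = cascade_groups(calls, {"a"})
cascade_size = sum(len(g) for g in groups)   # total size, sum over |G_i|
```

Here the groups come out as {a}, {b, c}, {d}, {e}, and the unrelated pair never enters the cascade.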
The nature of the information cascade behind the results shown in Fig. 3A,B is illustrated in Fig. 3C, where we show the individual calls between users active during the bombing. In contrast with the information cascade triggered by the emergencies witnessed by the G_0 users, there are practically no calls between the same individuals during the previous week. To quantify the magnitude of the information cascade we measured the length of the paths emanating from the G_0 users, finding them to be considerably longer during the emergency (Fig. 3D) than during five non-emergency periods, demonstrating that the information cascade penetrates deep into the social network, a pattern that is absent during normal activity [27]. See also Supporting Information S1, Figs. E, F, G, H, L, M, N, and O, and Table A.
The existence of such prominent information cascades raises tantalizing questions about who contributes to information propagation about the emergency. Using self-reported gender information available for most users (see Supporting Information S1), we find that during emergencies female users are more likely to make a call than expected based on their normal call patterns. This gender discrepancy holds for the G_0 (eyewitness) and G_1 groups, but is absent for non-emergency events (see Supporting Information S1, Sec. E, Fig. C). We also separated the total call activity of G_0 and G_1 individuals into voice and text messages (including SMS and MMS). For most events (the earthquake and blackout being the only exceptions), the voice/text ratios follow the normal patterns (Supporting Information S1, Fig. D), indicating that users continue to rely on their preferred means of communication during an emergency.
The patterns discussed above allow us to dissect complex events, such as an explosion in an urban area preceded by an evacuation starting approximately one hour before the blast. While a call volume anomaly emerges right at the start of the evacuation, it levels off and the jump-decay pattern characteristic of an emergency does not appear until the real explosion (Fig. 4A). The spatial extent of the evacuation response is significantly smaller than the one observed during the event (r_c = 1.6 km for the evacuation compared with r_c = 9.0 km for the explosion, see Fig. 4B). During the evacuation, social propagation is limited to the G_0 and G_1 groups only (Fig. 4C,D), while after the explosion we observe a communication cascade that activates the G_2 users as well. The lack of strong propagation during evacuation indicates that individuals tend to be reactive rather than proactive and that a real emergency is necessary to initiate a communication cascade that effectively spreads emergency information.
The results of Figs. 1-4 not only indicate that the collective response of the population to an emergency follows reproducible patterns common across diverse events, but they also document subtle differences between emergencies and non-emergencies. We therefore identified four variables that take different characteristic values for emergencies and non-emergencies: (i) the midpoint fraction f_mid = (t_mid − t_start)/(t_stop − t_start), where t_start and t_stop are the times when the anomalous activity begins and ends, respectively, and t_mid is the time when half of the total anomalous call volume has occurred; (ii) the spatial decay rate r_c capturing the extent of the event; (iii) the relative size R of each information cascade, representing the ratio between the number of users in the event cascade and the cascade tracked during normal periods; (iv) the probability for users to contact existing friends (instead of placing calls to strangers).
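To make the midpoint fraction concrete, the following sketch computes f_mid from a binned anomaly time series. The function name, the binning and the two example series are hypothetical; t_mid is taken here as the first bin at which the cumulative anomalous volume reaches half of its total, which is one reasonable reading of the definition.

```python
def midpoint_fraction(times, dv, t_start, t_stop):
    """Return f_mid = (t_mid - t_start) / (t_stop - t_start).

    times -- bin times in hours relative to the event
    dv    -- anomalous call volume dV in each bin
    t_mid is the first bin time at which the cumulative anomaly reaches
    half of the total anomaly inside [t_start, t_stop] (assumed convention).
    """
    window = [(t, v) for t, v in zip(times, dv) if t_start <= t <= t_stop]
    total = sum(v for _, v in window)
    if total <= 0:
        raise ValueError("no positive anomalous volume in the window")
    running = 0.0
    for t, v in window:
        running += v
        if running >= total / 2:
            return (t - t_start) / (t_stop - t_start)

times = [0, 1, 2, 3, 4]
# Made-up "jump-decay" series: most of the anomaly arrives early.
f_emergency = midpoint_fraction(times, [40, 30, 15, 10, 5], 0, 4)
# Made-up gradual build-up: most of the anomaly arrives late.
f_gradual = midpoint_fraction(times, [5, 10, 20, 40, 25], 0, 4)
```

The jump-decay series yields the lower f_mid of the two, matching the interpretation that a lower value indicates a faster onset.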
In Fig. 5 we show these variables for all 16 events, finding systematic differences between emergencies and non-emergencies. As the figure indicates, a multidimensional variable, relying on the documented changes in human activity, can be used to automatically distinguish emergency situations from non-emergency-induced anomalies. Such a variable could also help real-time monitoring of emergencies [24], from information about the size of the affected population to the timeline of the events, and could help identify mobile phone users capable of offering immediate, actionable information, potentially aiding search and rescue.
Rapidly evolving events such as those studied throughout this work require dynamical data with ultra-high temporal and spatial resolution and high coverage. Although the populations affected by emergencies are quite large, occasionally reaching thousands of users, due to the demonstrated localized nature of the anomaly, this size is still small in comparison to other proxy studies of human dynamics, which can exploit the activity patterns of millions of internet users or webpages [13,14,15,27]. Meanwhile, emergencies occur over very short timespans, a few hours at most, whereas much current work on human dynamics relies on longitudinal datasets covering months or even years of activity for the same users (e.g., [2,3,9]), integrating out transient events and noise. But in the case of emergencies, such transient events are precisely what we wish to quantify. Given the short duration and spatially localized nature of these events, it is vital to have extremely high coverage of the entire system, to maximize the availability of critical information during an event. To push human dynamics research into such fast-moving events requires new tools and datasets capable of extracting signals from limited data. We believe that our research offers a first step in this direction.
In summary, similar to how biologists use drugs to perturb the state of a cell to better understand the collective behavior of living systems, we used emergencies as external societal perturbations, helping us uncover generic changes in the spatial, temporal and social activity patterns of the human population. Starting from a large-scale, country-wide mobile phone dataset, we used news reports to gather a corpus of sixteen major events, eight unplanned emergencies and eight scheduled activities. Studying the call activity patterns of users in the vicinity of these events, we found that unusual activity rapidly spikes for emergencies, in contrast with non-emergency-induced anomalies that build up gradually before the event; that the call patterns during emergencies are exponentially localized regardless of event details; and that affected users will only invoke the social network to propagate information under the most extreme circumstances. When this social propagation does occur, however, it takes place in a very rapid and efficient manner, so that users three or even four degrees from eyewitnesses can learn of the emergency within minutes.
These results not only deepen our fundamental understanding of human dynamics, but could also improve emergency response. Indeed, while aid organizations increasingly use the distributed, real-time communication tools of the 21st century, much disaster research continues to rely on low-throughput, post-event data, such as questionnaires, eyewitness reports [28,29], and communication records between first responders or relief organizations [30]. The emergency situations explored here indicate that, thanks to the pervasive use of mobile phones, collective changes in human activity patterns can be captured in an objective manner, even at surprisingly short time-scales, opening a new window on this neglected chapter of human dynamics.

Dataset
We use a set of anonymized billing records from a Western European mobile phone service provider [1,2,3]. The records cover approximately 10M subscribers within a single country over 3 years of activity. Each billing record, for voice and text services, contains the unique identifiers of the caller placing the call and the callee receiving the call; an identifier for the cellular antenna (tower) that handled the call; and the date and time when the call was placed. Coupled with a dataset describing the locations (latitude and longitude) of cellular towers, we have the approximate location of the caller when placing the call. For full details, see Supporting Information S1, Sec. A.

Identifying events
To find an event in the mobile phone data, we need to determine its time and location. We have used online news aggregators, particularly the local news.google.com service, to search for news stories covering the country and time frame of the dataset. Keywords such as 'storm', 'emergency', 'concert', etc. were used to find potential news stories. Important events such as bombings and earthquakes are prominently covered in the media and are easy to find. Study of these reports, which often included photographs of the affected area, typically yields precise times and locations for the events. Reports would occasionally conflict about specific details, but this was rare. We take the reported start time of the event as t = 0.
To identify the beginning and end of an event, t_start and t_stop, we adopt the following procedure. First, identify the event region (a rough estimate is sufficient) and scan all its calls during a large time period covering the event (e.g., a full day), giving V_event(t). Then, scan calls for a number of "normal" periods, those modulo one week from the event period, exploiting the weekly periodicity of V(t). These normal periods' time series are averaged to give ⟨V_normal⟩. (To smooth time series, we typically bin them into 5-10 minute intervals.) The standard deviation σ(V_normal) as a function of time is then used to compute z(t) = ∆V(t)/σ(V_normal). Finally, we define the interval from t_start to t_stop as the longest contiguous run of time intervals where z(t) > z_thr, for some fixed cutoff z_thr. We chose z_thr = 1.5 for all events.
For full details, see Supporting Information S1, Sec. B.
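The detection procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and inputs are hypothetical, the time series are assumed to be already binned, the sample standard deviation is used for σ(V_normal), and bins with zero variance are treated as non-anomalous.

```python
from statistics import mean, stdev

def event_window(v_event, v_normal_weeks, z_thr=1.5):
    """Return (start_bin, stop_bin) of the longest run with z(t) > z_thr.

    v_event        -- binned call counts covering the event period
    v_normal_weeks -- binned call counts for each normal week (same bins)
    Returns None if no bin exceeds the threshold.
    """
    z = []
    for i, v in enumerate(v_event):
        samples = [week[i] for week in v_normal_weeks]
        sigma = stdev(samples)                 # sample std (assumed choice)
        z.append((v - mean(samples)) / sigma if sigma > 0 else 0.0)

    best, run_start = None, None
    # The trailing sentinel closes a run that extends to the last bin.
    for i, zi in enumerate(z + [float("-inf")]):
        if zi > z_thr:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if best is None or (i - run_start) > (best[1] - best[0] + 1):
                best = (run_start, i - 1)
            run_start = None
    return best

# Made-up data: every bin has normal mean 100 and standard deviation 10.
v_normal_weeks = [
    [100, 100, 100, 100, 100, 100],
    [110, 90, 110, 90, 110, 90],
    [90, 110, 90, 110, 90, 110],
]
v_event = [105, 130, 180, 150, 112, 140]
window = event_window(v_event, v_normal_weeks)
```

In this toy series bins 1 to 3 form the longest run above threshold; the isolated excursion in the last bin is shorter and is therefore not selected.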
Table 1: Summary of the studied emergencies and non-emergencies. The columns provide the duration of the anomalous call activity (Fig. 1), the spatial decay rate r_c (Fig. 2), the number of users in the event population |G_0|, and the total size of the information cascade ∑_i |G_i| (Fig. 3). Events discussed in the main text are italicized, the rest are discussed in the supplementary material. 'Jet scare' refers to a sonic boom interpreted by the local population and initial media reports as an explosion.
Figure 1: Call anomalies during emergencies. A, The time dependence of call volume V(t) in the vicinity of four emergencies and two non-emergencies (see Table 1). B, The temporal behavior of the relative call volume ∆V/⟨V_normal⟩ of the events shown in A, where ∆V = V_event − ⟨V_normal⟩, V_event is the call volume on the day of the event (shown in red in A), and ⟨V_normal⟩ is the average call volume during the same period of the week (the call volume during the previous week is shown in black in B). C, The relative change in the average number of calls placed per user (ρ) and the total number of users (N) making calls from the region indicates that the call anomaly is primarily due to a significant increase in the number of users that place calls during the events.

Figure 2: The spatial impact of an emergency. A, Maps of total anomalous call activity (activity during the event minus expected normal activity) for two-hour periods before (−2 < t < 0), during (0 < t < 2), and after (2 < t < 4) the bombing. The color code corresponds to the total change ∑_t ∆V(t), where the sum runs over the particular time period. B, Changes in call volume in regions at various distances r from the event epicenter. Note that the peak of the call volume anomaly for the bombing within the observed 1 < r < 5 km region is delayed by approximately 10 minutes compared to the r < 1 km epicenter region. No call anomaly is observed for r > 10 km. The earthquake covers a large spatial range so we instead choose three event regions A-C, at distances of 310 km, 340 km, and 425 km from the seismic epicenter (which was outside the studied region). C, To measure the distance dependence of the anomaly, we computed the total anomalous call volume in B before (−∆t < t < 0) and after (0 < t < ∆t) each event as a function of the distance r (km) from the epicenter, revealing approximately exponential decay, ∆V(r) ∼ exp(−r/r_c). Non-emergencies are spatially localized, with r_c < 2 km.
Figure 3: Social characteristics of information cascades. A, Changes in call volume for users directly affected by the event (G_0), users that receive calls from G_0 but are not near the event (G_1), users contacted by G_1 but not in G_1 or G_0 (G_2), etc. For the bombing and plane crash, populations respond very rapidly, within minutes. B, The total amount of anomalous call activity in A before (during −∆t < t < 0) and after (during 0 < t < ∆t) the event for each user group G_i quantifies the impact on the social network. We see that information propagates deeply into the social network for the bombing and plane crash. C, Top panel: the contact network formed between affected users during the bombing. Bottom panel: the call pattern between users that are active during the emergency during the previous week, indicating that the information cascade observed during the bombing is out of the ordinary. D, The distribution of shortest paths within the contact network quantifies the anomalous information cascade induced by the bombing.

Figure 4: Analyzing a composite event (evacuation preceding an explosion). A, Call activity V(t) as a function of the time since the event, t (hours), increases during the evacuation (−1 < t < 0) but levels off after the initial warning, until the explosion at t = 0 causes a much larger increase in call activity. B, Spatially, the evacuation causes a sharply localized activity spike (r_c = 1.6 km), but the explosion increases the spatial extent dramatically (r_c = 9.0 km). C-D, The evacuation only activates the G_0 (eyewitness) and G_1 groups, meaning that information fails to propagate significantly beyond the initial group and their immediate ties. However, the blast not only leads to a further increase in call activity in the G_0 and G_1 groups, but also triggers the second neighbors G_2.
Figure 5: A, The midpoint fraction f_mid quantifying the onset speed of anomalous call activity (a lower f_mid indicates a faster onset). Emergencies display a more abrupt call anomaly than non-emergencies, which feature gradual buildups of anomalous call activity. B, The spatial extent of the events, quantified by r_c, indicates that non-emergency events are far more centrally localized than unexpected emergencies. C, The relative cascade size R = N_event/⟨N_normal⟩, where N = ∑_i |G_i| is the number of users in the social cascade. D, z_F = (P_event − ⟨P_normal⟩)/σ(P_normal), where P is the probability of calling an acquaintance and σ(P) is the standard deviation of P.

Supporting Information
Collective response of human populations to large-scale emergencies, by James P. Bagrow, Dashun Wang, and Albert-László Barabási

A Dataset
We use a set of anonymized billing records from a Western European mobile phone service provider [1,2,3]. The records cover approximately 10M subscribers within a single country over 3 years of activity. Each billing record, for voice and text services, contains the unique identifiers of the caller placing the call and the callee receiving the call; an identifier for the cellular antenna (tower) that handled the call; and the date and time when the call was placed. Coupled with a dataset describing the locations (latitude and longitude) of cellular towers, we have the approximate location of the caller when placing the call. Unless otherwise noted, a "call" can be either voice or text (SMS, MMS, etc.), and "call volume" or "call activity" includes both voice calls and text messages.
After identifying the start time and location of an event, we then scan these billing records to determine the activity of nearby users. The mobile phone activity patterns of the affected users can then be followed in the weeks preceding or following the event, to provide control or baseline behavior.
Self-reported gender information is available for approximately 90% of subscribers.

A.1 Market share
These records cover approximately 20% of the country's mobile phone market. However, we also possess identification numbers for phones that are outside the provider but that make calls to or receive calls from users within the company. While we do not possess any other information about these lines, nor anything about their users or calls that are made to other numbers outside the service provider, we do have records pertaining to all calls placed to or from these ID numbers involving subscribers covered by our dataset. This information was used to study social propagation (see Sec. H.1).

B Identifying events
To find an event in the mobile phone data, we need to determine its time and location. We have used online news aggregators, particularly the local news.google.com service, to search for news stories covering the country and time frame of the dataset. Keywords such as 'storm', 'emergency', 'concert', etc. were used to find potential news stories. Important events such as bombings and earthquakes are prominently covered in the media and are easy to find. Study of these reports, which often included photographs of the affected area, typically yields precise times and locations for the events. Reports would occasionally conflict about specific details, but this was rare. We take the reported start time of the event as t = 0. Most events are spatially localized, so it is important to consider only the immediate event region. Otherwise, the event signal is masked by normal activity (Fig. A). The local event region can be estimated as a circle centered on the identified epicenter with a radius chosen based on r_c. For the blackout, however, we chose all towers within the affected city (contained within the city's postal codes).
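Selecting the towers inside a circular event region can be sketched with the standard haversine great-circle distance. The tower identifiers, coordinates and radius below are invented for illustration, and the text does not specify which distance formula was actually used.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    r_earth = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r_earth * math.asin(math.sqrt(a))

def towers_in_region(towers, epicenter, radius_km):
    """Return IDs of towers within radius_km of the epicenter.

    towers    -- dict mapping tower ID to (lat, lon); hypothetical layout
    epicenter -- (lat, lon) of the event
    """
    lat0, lon0 = epicenter
    return [tid for tid, (lat, lon) in towers.items()
            if haversine_km(lat0, lon0, lat, lon) <= radius_km]

# Invented coordinates: T2 is about 1 km away, T3 about 22 km away.
towers = {"T1": (48.000, 2.000), "T2": (48.010, 2.000), "T3": (48.200, 2.000)}
region = towers_in_region(towers, (48.000, 2.000), radius_km=5.0)
```

All calls handled by the returned towers would then form the event-region time series used in the detection procedure.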
To identify the beginning and the end of an event, t_start and t_stop, we adopt the following procedure. First, identify the event region (a rough estimate is sufficient) and scan all its calls during a large time period covering the event (e.g., a full day), giving V_event(t). Then, scan calls for a number of "normal" periods, those modulo one week from the event period, exploiting the weekly periodicity of V(t). These normal periods' time series are averaged to give ⟨V_normal⟩. (To smooth time series, we typically bin them into 5-10 minute intervals.) The standard deviation σ(V_normal) as a function of time is then used to compute z(t) = ∆V(t)/σ(V_normal). Finally, we define the interval from t_start to t_stop as the longest contiguous run of time intervals where z(t) > z_thr, for some fixed cutoff z_thr. We chose z_thr = 1.5 for all events. There is also the concern that an emergency may be so severe that it interferes with the operations of the mobile phone system itself. Only the blackout caused any damage to the mobile phone system, where some towers were temporarily disabled. (No calls appear to have been lost as other towers picked up the slack.) See Fig. B. Likewise, no towers reached maximum capacity, which would have prevented important calls from being routed. Such effects may occur during larger, more serious emergencies.

B.1 Missing events
It is possible that newsworthy events may not be discoverable using mobile phones. Indeed, while there are sixteen events documented in main text Table 1, there were a number of events discovered in news reports that we could not identify in the data. The majority of these were forest fires. While they affected large regions and numbers of people, we could not find them with mobile phones. A large wind storm and a gas main explosion were also not confirmed; both occurred late at night. Two other events, a chemical leak causing an evacuation and a fire at a remote factory causing noxious fumes, were discovered in the dataset, but the affected populations were very small, so we decided to discount them.
The absence of these events in the dataset provides important information about the strengths and weaknesses of using mobile phones to study emergencies. Since mobile phone records rely on user activity, events that occur late at night, when most people are asleep, may be difficult to study. Likewise, events that are severe but diffuse, providing a slight effect over a very broad area, may not be distinguishable from the background of normal activity (although the earthquake is an exception to this). Events in remote locations with little cellular coverage will also be more difficult to study than events in well-covered and well-populated regions.
Figure B: Spatial changes in call activity due to the blackout. We integrate call activity V(t) over two time windows before and after the blackout occurs (left, shaded). Studying total call load spatially we see a nonlinear response, with a region near the city center (star) suffering a drop in calls due to the blackout (right). This region is surrounded by areas that display a significant increase in calls, implying that most load was shifted onto nearby cell towers and was not lost.
Finally, since we are especially interested in studying how information propagates socially, we avoided national events, such as televised sports matches or popular public holidays, as these make distinguishing the different event populations G i unreliable.

C Source of call anomaly
The anomalous call activity raises an important question: is the observed spike due to individuals who normally do not use their phone in the event region and now suddenly choose to place calls, or are those who normally use their phone in the respective timeframe prompted to call more frequently than under normal circumstances? We determined in the event region (i) the relative change in the average number of calls placed per user, ∆ρ/⟨ρ_normal⟩, and (ii) the relative increase in the number of individuals that use their phone in this period, ∆N/⟨N_normal⟩. Main text Fig. 1C shows that during the bombing we see a 36% increase in phone usage per user, whereas the number of individuals that make a call increases by 232%. Other emergencies show a similar pattern: the plane crash, earthquake and blackout show increases in ρ (N) of 21% (67.5%), 1.36% (17.4%) and 4.97% (20.8%), respectively. Taken together, these results indicate that the primary source of the observed call anomaly is a sudden increase of calls by individuals who would normally not use their phone during the emergency period, a behavioral change triggered by the witnessed event.
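The two quantities compared above, the relative change in calls per user and in the number of active users, can be computed from a list of caller IDs as in the sketch below. The function name, input layout and toy data are hypothetical, and the normal baseline is assumed to be a simple average over the supplied normal periods.

```python
def user_activity_changes(event_callers, normal_callers_by_week):
    """Return (dN/<N_normal>, d_rho/<rho_normal>).

    event_callers          -- caller ID of every call in the event window
    normal_callers_by_week -- the same list for each normal period
    N is the number of distinct callers; rho is calls per distinct caller.
    """
    def n_and_rho(callers):
        n = len(set(callers))
        return n, (len(callers) / n if n else 0.0)

    n_evt, rho_evt = n_and_rho(event_callers)
    stats = [n_and_rho(c) for c in normal_callers_by_week]
    n_norm = sum(n for n, _ in stats) / len(stats)
    rho_norm = sum(rho for _, rho in stats) / len(stats)
    return (n_evt - n_norm) / n_norm, (rho_evt - rho_norm) / rho_norm

# Toy data: 12 calls from 5 users during the event,
# versus 2 to 3 users making 2 calls each in normal weeks.
event_callers = ["a"] * 3 + ["b"] * 2 + ["c"] * 2 + ["d"] * 2 + ["e"] * 3
normal_weeks = [["a", "a", "b", "b"], ["a", "b", "c", "c", "c", "a"]]
dn_rel, drho_rel = user_activity_changes(event_callers, normal_weeks)
```

In this toy example the number of active users doubles while calls per user rise only modestly, the same qualitative pattern described above.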

D Whom do people call
To see if affected users tend to call existing friends or contact strangers, we measured the probability P for a user in G_0 to make their first call between t_start and t_stop to a friend, where 'friends' are the set of individuals that have had phone contact with the G_0 user during the previous full three months (not including the month of the event). Computing the mean and standard deviation of P over normal time periods (all weekdays of the month of the event, except the day of the event) allows us to quantify the relative change during the event with
$$z_F = \frac{P_{\mathrm{event}} - \langle P_{\mathrm{normal}} \rangle}{\sigma(P_{\mathrm{normal}})}.$$
In all emergencies we observe an increase in the number of calls placed to friends (and a corresponding decrease in calls to non-friends). Many non-emergencies show the opposite trend: users are less likely to call a friend, although this change is seldom large. See Sec. G and Fig. 5d.
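The comparison against normal periods can be sketched as a standard score. The snippet below is only an illustration: it assumes the significance takes the usual form (event value minus normal-period mean, divided by the normal-period standard deviation), it uses the sample standard deviation, and the names `friend_call_probability` and `significance` as well as the dictionary inputs are hypothetical.

```python
from statistics import mean, stdev

def friend_call_probability(first_calls, friends):
    """Fraction of users whose first call in the window went to a friend.

    first_calls: dict mapping each G_0 user to the callee of their first call.
    friends: dict mapping each user to the set of contacts seen in the
        previous three months (hypothetical layout).
    """
    hits = sum(1 for user, callee in first_calls.items()
               if callee in friends.get(user, set()))
    return hits / len(first_calls)

def significance(p_event, p_normal):
    """Standard score of the event value against normal-period values.

    Assumes at least two normal periods; uses the sample standard deviation.
    """
    return (p_event - mean(p_normal)) / stdev(p_normal)
```

The same `significance` helper would apply unchanged to any other fraction compared against normal periods, provided the normal-period values are computed over the same time window.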

E Gender response during events
To investigate how the population response to events depends on demographic factors, we used the self-reported gender information, available for the majority (∼88%) of the users. For each event we compute the significance z_female and z_male in the fraction of female and male users active during the event, compared with normal time periods, as described in Sec. D, for both directly affected users (G_0) and those one step away (G_1). As shown in Fig. C, we see that nearly all emergencies cause an increase in the fraction of affected female users; this increase is significant for half of the emergencies. Non-emergencies do not result in deviations in the gender breakdown of affected populations. These results hold for both the directly affected users and their neighbors one step away.
Figure C: Gender response during emergency and non-emergency events. For each event we compute the significance z_female and z_male in the fraction of female and male users active during the event, compared with normal time periods, for both directly affected users (G_0) and those one step away (G_1). We see that nearly all emergencies cause an increase in the fraction of affected female users; this increase is significant for half of the emergencies. Non-emergencies do not result in deviations in the gender breakdown of affected populations.

Figure D: Voice and text usages during emergencies. Most events do not show a significant change in the fraction of voice calls compared to text messages. The earthquake and blackout are exceptions, as are the plane crash and festival 3 (G_1 only).

F Voice versus text usage during events
Similar to the quantities z_F and z_female, we can assess whether users have changed their means of communication due to an event. To do so we compute the significance z_voice of the fraction of voice calls compared to text messages during the event, for both populations G_0 and G_1 (Fig. D). The earthquake and blackout show significantly increased text usage and the plane crash shows an increase in voice, while most other events do not show a significant change. Since the earthquake and blackout were both relatively minor (low danger) events, this result suggests that a spike in primarily text messaging activity may indicate that the event is a low-threat/non-critical emergency.

G Systematic response mechanisms during emergencies
As mentioned in the main text, to summarize our current understanding of these events, we computed temporal, spatial, and social properties for each anomaly, plotted in main text Fig. 5. Temporally, we study the midpoint fraction: the fraction of time required for half of the anomalous call activity to occur, where t_mid satisfies
$$\int_{t_{\mathrm{start}}}^{t_{\mathrm{mid}}} \Delta V(t)\,dt = \frac{1}{2}\int_{t_{\mathrm{start}}}^{t_{\mathrm{stop}}} \Delta V(t)\,dt,$$
with ΔV(t) denoting the anomalous call activity. The midpoint fraction is more robust to noisy and non-sharply peaked time series, where estimating t_peak is difficult, than the peak fraction. Spatially, we use the anomaly's r_c. Socially, we compute two quantities: the relative size of the cascade R (the number of people in the event cascade divided by the number generated under normal periods; see Sec. H.2 and Fig. H) and z_F, the significance in the probability of calling a friend compared with a non-friend (see Sec. D).
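For a binned time series, the midpoint fraction can be approximated by accumulating the anomalous activity until half of the total is reached. The sketch below assumes equally wide time bins spanning the event interval and non-negative anomalous activity in each bin; the function name `midpoint_fraction` is hypothetical and the discretization is an assumption of this example.

```python
def midpoint_fraction(delta_v):
    """Fraction of the event interval needed to accumulate half of the
    anomalous call activity.

    delta_v: anomalous call activity per equally wide time bin between
        t_start and t_stop (assumed non-negative).
    """
    total = sum(delta_v)
    if total <= 0:
        raise ValueError("no anomalous activity to locate a midpoint in")
    running = 0.0
    for k, value in enumerate(delta_v, start=1):
        running += value
        if running >= total / 2:
            # k bins out of len(delta_v) were needed to reach the halfway point
            return k / len(delta_v)
```

A sharply front-loaded anomaly gives a small value, while activity spread evenly across the interval gives a value near one half.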
Figure 5 shows a distinct separation in these measures between emergencies and non-emergencies, indicating that there are universal response patterns underlying societal dynamics independent of the particular event details.

H Calculating social propagation
In this section we detail the procedure for extracting the contact network between users after an event (Sec. H.1) and how to control for various factors to demonstrate whether or not the contact network or its information cascade is anomalous due to the event (Sec. H.2).

H.1 Constructing the time-dependent contact network
We use the following process to generate the contact network between users due to an event. We follow all messages in order of occurrence during the event's time interval [t_start, t_stop]. While we do not know the content of the messages, we assume that any related messages do transmit information pertaining to the event. A user u who does not know about the event becomes "infected" with knowledge due to communication at some time t ∈ [t_start, t_stop] if (1) u initiated communication from a tower within the event region or (2) u communicated with a user that was already infected. We place u in the set G_0 if u communicated from the event region, otherwise we place u into G_{i+1} where the infected user transmitting knowledge to u was in G_i. Following these calls and generating the G_i with this procedure then forms the contact network. Note that there are two types of communications in the dataset, voice and text. We assume that voice is bidirectional whereas text messages are not (a user who sends a text message to someone with knowledge of the event will learn nothing from that particular communication). An illustration of this process is depicted in Fig. E.
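The procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the analysis: the record layout `(time, caller, callee, caller_tower, kind)`, the function name `build_contact_groups`, and the tie-breaking for simultaneous records are all hypothetical.

```python
def build_contact_groups(records, event_towers):
    """Assign each informed user to a group index i (user is in G_i).

    records: iterable of (time, caller, callee, caller_tower, kind) tuples,
        with kind "V" for voice and "T" for text (hypothetical layout).
    event_towers: set of tower ids inside the event region.
    """
    generation = {}
    for time, caller, callee, tower, kind in sorted(records):
        # Rule 1: an uninformed user initiating communication from the
        # event region joins G_0.
        if caller not in generation and tower in event_towers:
            generation[caller] = 0
        # Rule 2: knowledge passes from an informed user to an uninformed one.
        if caller in generation and callee not in generation:
            # Voice or text: the initiator informs the recipient.
            generation[callee] = generation[caller] + 1
        elif kind == "V" and callee in generation and caller not in generation:
            # Voice is bidirectional: an informed recipient informs the caller.
            # A text sent to an informed user teaches the sender nothing.
            generation[caller] = generation[callee] + 1
    return generation
```

Each user keeps the group assigned at the moment they first become informed, which matches the in-order processing of messages described in the text.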
The contact network itself can be studied using a number of network science tools. One way to analyze the size and scale of this network is through the distribution of shortest (or geodesic) path lengths [4]. In Fig. ?? we presented the distribution of paths emanating from G_0 users within the giant connected component (GCC) of the bombing's network. One can also analyze the distribution for all users, not just G_0, and for all components of the network. These possibilities are shown in Fig. F.
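A distribution of shortest path lengths of this kind can be obtained with a breadth-first search from each source user. The sketch below treats the contact network as undirected and unweighted, which is an assumption of this example, and the function name `path_length_distribution` and the edge-list input are hypothetical.

```python
from collections import Counter, deque

def path_length_distribution(edges, sources):
    """Count shortest path lengths from each source to every reachable user.

    edges: iterable of (u, v) pairs of the contact network, treated here
        as undirected and unweighted.
    sources: users from which paths emanate (for example, the G_0 users).
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    counts = Counter()
    for source in sources:
        distance = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search
            u = queue.popleft()
            for w in adjacency.get(u, ()):
                if w not in distance:
                    distance[w] = distance[u] + 1
                    queue.append(w)
        counts.update(d for d in distance.values() if d > 0)
    return counts
```

Passing every user as a source gives the all-users variant, and restricting `edges` to a single component gives the component-restricted variant.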
Finally, the cascade of information through a contact network is a non-local process which may be highly affected by sampling/percolation [4]. Indeed, the mobile phone dataset contains only users of a single phone company, and a number of propagation paths may be missing. However, the dataset actually contains all users who make or receive calls to users within the company, even those outside the company. This means we have all cascade paths of one or two steps that begin and end with in-company users (regardless of whether they travel through a company user or not) and that those paths are the actual shortest paths, providing an effective lower bound on the cascade. In other words, if we can demonstrate the existence of a cascade over users {G_0, G_1, G_2}, then the actual cascade containing those users can only be larger.

H.2 Controlling for social propagation
A contact network between users can always be constructed, even when no event takes place. There may appear to be propagating cascades as well, since there are temporal correlations between users receiving and then placing calls. This must be properly controlled for.
Suppose we are studying an event and have identified its affected users G_0, those users that made calls during some time window (t, t + ∆t). We expect that G_0 users will call other users (G_1) outside the event, G_1 users will call G_2 users, etc., generating a cascade {G_0, G_1, ...}. Tracking calls starting from some G will always generate such groups, even during normal periods. Then the question is, do the users in G_i, i > 0 show increased activity during an emergency or other event? If so, that is evidence of the social propagation of situational awareness.
To answer this question we need to consider several points:
• Suppose that the total call activity for a region is V(t) (Fig. GA). Now select a group of users G_∆t that each place at least one call during a small time window ∆t. Tracking only their call activity generates the conditional time series V(t|G_∆t). This time series has a "selection bias" that creates the appearance of a large increase in activity during the time window since all the users must place calls then (Fig. GB). This must be accounted for when studying selected users during an event.
• When tracking V(t|G) for a group of N = |G| users, the overall level of activity will depend on N (Fig. GC). A rescaling is necessary when comparing the activity levels of different size groups.
• The selection bias will also depend on the length of the time window ∆t. If ∆t = 24 hours had been used in Fig. G, no bias would be evident. All events are compared to time periods with the same ∆t, so this effect is automatically controlled for.
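The selection bias described in the points above can be reproduced with a toy simulation. The sketch below is purely illustrative: it assumes every user independently places a call in each hour with the same probability, which is not a model of real call behavior, and the names `hourly_volume` and `simulate` and all parameter values are hypothetical.

```python
import random

def hourly_volume(users):
    """Total calls per hour of day for a list of users' call hours."""
    volume = [0] * 24
    for calls in users:
        for hour in calls:
            volume[hour] += 1
    return volume

def simulate(n_users=2000, p_call=0.1, window=(12, 14), seed=0):
    """Return (V(t), V(t|G), |G|) for a toy population.

    Each user calls in each hour independently with probability p_call
    (an assumption of this sketch). G is the set of users who placed at
    least one call inside the half-open hour window.
    """
    rng = random.Random(seed)
    users = [[h for h in range(24) if rng.random() < p_call]
             for _ in range(n_users)]
    selected = [calls for calls in users
                if any(window[0] <= h < window[1] for h in calls)]
    return hourly_volume(users), hourly_volume(selected), len(selected)
```

Although the full population has a flat call volume in this toy model, the conditional series for the selected group shows a pronounced bump inside the window, simply because membership in the group requires a call there.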
Having considered these aspects, we turn our attention to the problem of calculating the cascade itself. For the event period, tracking the outgoing calls of {G_0, G_1, ...} is straightforward. To determine how unusual this activity is, we need controls for comparison. There are several possibilities:
Control 1. One option is scanning the event region during the same time of the week, collecting a control population of users, and then following their cascade. However, this does not account for changes in the composition or number of users in the event region (some studied events were quite remote and typically contained very few users).
Control 2. Another possibility is to simply follow the activity of the same users G_i from the event's cascade during normal time periods. This choice keeps the population unchanged but it does not account for changes in who is being called; G_0 users may have chosen to call very different people during an emergency. Further, it does not account for the selection bias that is present during the event but not during the normal periods, which may exaggerate the change in call activity.
Control 3. Finally, one can study new cascades generated by the same event users G_0 during normal periods, creating a different cascade {G_0, g_1, g_2, ...} for each normal period. The activities of each G_i can then be compared to those of the corresponding g_i's. This directly tests the effect that the initiating population G_0 has on the cascade, by studying those cascades the population would normally induce, and accounts for selection bias since this bias is present during the event and the normal periods.
We have chosen to use Control 3. Note that G_i will typically be larger than the normal g_i's and that G_i users may be more active than those in g_i, so V(t|g_i) must be rescaled when being compared to the event's V(t|G_i). To do this, we multiply V(t|g_i) by a constant scaling factor
$$a_i = \frac{\int_{\delta t} V(\tau|G_i)\,d\tau}{\int_{\delta t} V(\tau|g_i)\,d\tau},$$
where both integrals run over the same "calibration interval" δt and τ = 0 is the start of the selection window. For most events we integrate over a 24-hour period two days before the window, δt = (−48, −24). If the event is on a weekday, we ensure the calibration interval is not a weekend and vice versa. This factor a_i was chosen such that the total number of calls during normal time periods for V(t|G_i) is approximately equal to a_i V(t|g_i), equalizing the smaller time series and removing bias due to |G_i| > |g_i|.
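The rescaling can be illustrated for binned time series. In the sketch below the scaling factor is taken as the ratio of total calls in the calibration interval for the two groups, so that the rescaled control series matches the event group's total over that interval; representing the calibration interval as a pair of list indices, and the names `scaling_factor` and `rescale`, are assumptions of this example.

```python
def scaling_factor(v_event_group, v_control_group, calibration):
    """Ratio of total calls in the calibration interval.

    v_event_group: binned series V(t|G_i) for the event cascade group.
    v_control_group: binned series V(t|g_i) for a normal-period group.
    calibration: (start, stop) bin indices of the calibration interval,
        half-open, identical for both series (hypothetical layout).
    """
    start, stop = calibration
    numerator = sum(v_event_group[start:stop])
    denominator = sum(v_control_group[start:stop])
    if denominator == 0:
        raise ValueError("control group placed no calls in the calibration interval")
    return numerator / denominator

def rescale(v_control_group, factor):
    """Multiply the control series by the constant scaling factor."""
    return [factor * value for value in v_control_group]
```

Because the factor is fixed from a quiet interval before the selection window, any remaining difference between the two series during the event reflects a change in activity rather than a difference in group size.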

I Results on the event corpus
Sixteen events were identified for this work (see main text Table 1), but six events were focused upon in the main text. Here we report the results for all events. In Figs. I and J we provide the call activities V(t) for all sixteen events used in this study (compare to Fig. 1). In    indicating that the call anomaly for those events was caused only by a greater-than-expected number of users all making an expected number of calls.

Figure A: Regional and national visibility of events. On a national level (a-c), the spike in call activity due to the bombing is lost, but it clearly emerges when we focus only on the immediately local vicinity of the event (d-f). The strong weekly periodicity in V(t) is also visible.

Figure E: Extracting the time-dependent contact network from call data. A, A cartoon example of the dataset's call records, representing eight calls during 15 minutes following an event. The region of the event contains two towers, T1 and T2. Call type is (T) for text message and (V) for voice call. B, Three instances of the evolving contact network, extracted from the example call log shown in A. Two users made calls from the region during the first five timesteps, initiating a cascade.

Figure F: Distributions of shortest paths found within the bombing's contact network. One can compute shortest paths emanating from all users (top) or only users in G_0 (bottom) and for paths within the entire network (left) or only the network's giant component (right).

Figure G: Understanding selection bias. A, The call volume V(t) of a major city during an ordinary 24-hour period. B, The call volume V(t|G) of N = 10^4 randomly selected users from A who all placed one or more calls between 12:00 and 14:00 (highlighted). The 'bias' of this conditional time series is clear. C, The same as B for different values of N. Rescaling by the population size (inset) indicates that the relative scale of the bias of V(t|G) during the time window is independent of the population size.

Figure I: Regional call activity for the eight emergencies analyzed. The first four are also shown in main text Fig. ??. Shaded regions indicate ±2 standard deviations.

Figure J: The same as Fig. I for the eight non-emergencies. Concert 3 takes place at an otherwise unpopulated location and the normal activity is not visible on a scale showing the event activity.

Figure K: Spatial call activity for the ten events not shown in main text Fig. ??.

Figure L: Social propagation for the four main text emergencies. Shown are the activity patterns (conditional time series) for G_0 through G_3 during the event (black curve) and normally (shaded regions indicate ±2 s.d.). Normal activity levels were rescaled to account for population and selection bias (see Sec. H). The bombing and plane crash show increased activities for multiple G_i while the earthquake and blackout do not.