Abstract
The COVID-19 pandemic offers an unprecedented natural experiment providing insights into the emergence of collective behavioral changes of both exogenous (government mandated) and endogenous (spontaneous reaction to infection risks) origin. Here, we characterize collective physical distancing (mobility reductions, minimization of contacts, and shortening of contact duration) in response to the COVID-19 pandemic in the pre-vaccine era by analyzing de-identified, privacy-preserving location data for a panel of over 5.5 million anonymized, opted-in U.S. devices. We define five indicators of users’ mobility and proximity to investigate how the emerging collective behavior deviates from typical pre-pandemic patterns during the first nine months of the COVID-19 pandemic. We analyze both the dramatic changes due to government-mandated mitigation policies and the more spontaneous societal adaptation to a new (physically distanced) normal in fall 2020. Using the indicators defined here, we show that: a) during the COVID-19 pandemic, collective physical distancing displayed different phases and was heterogeneous across geographies; b) metropolitan areas displayed stronger reductions in mobility and contacts than rural areas; c) stronger reductions in commuting patterns are observed in geographical areas with a higher share of teleworkable jobs; d) commuting volumes during and after the lockdown period negatively correlate with unemployment rates; and e) increases in contact indicators correlate with future values of new deaths at a lag consistent with epidemiological parameters and surveillance reporting delays. In conclusion, this study demonstrates that the framework and indicators presented here can be used to analyze large-scale social distancing phenomena, paving the way for their use in future pandemics to analyze and monitor the effects of pandemic mitigation plans at the national and international levels.
Author summary
The COVID-19 pandemic resulted in some of the most significant disruptions to collective human behavior. In this study, we quantified the nature and scale of these disruptions during the first nine months of the pandemic by estimating changes in daily routines related to mobility, commuting, and social contacts. We used high-resolution mobility data that describe the physical movements of over 5.5 million individuals in the United States. Our findings indicate that the strength of the behavioral responses varied during different phases of the pandemic, across locations (states and cities), and across social settings (urban versus rural). We also found that reductions in commute flows were correlated with employment characteristics and that our proposed indicators for social interactions could be used as early warnings of potential future negative health outcomes (e.g., new daily deaths), thus opening up the possibility of using these metrics as additional situational awareness tools in future outbreaks.
Citation: Klein B, LaRock T, McCabe S, Torres L, Friedland L, Kos M, et al. (2024) Characterizing collective physical distancing in the U.S. during the first nine months of the COVID-19 pandemic. PLOS Digit Health 3(2): e0000430. https://doi.org/10.1371/journal.pdig.0000430
Editor: Yuan Lai, Tsinghua University, CHINA
Received: February 22, 2023; Accepted: December 11, 2023; Published: February 6, 2024
Copyright: © 2024 Klein et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The mobility, commuting, and contact indexes introduced in this publication are accessible at the following live dashboard: https://covid19.gleamproject.org/mobility, while accompanying codes and aggregated index data are available at https://github.com/mobs-lab/covid19-mobility. Cuebiq raw data are available upon request to Cuebiq’s Spectus Social Impact program (https://spectus.ai/social-impact/).
Funding: MC and AV acknowledge support from COVID Supplement CDC-HHS-6U01IP001137-01 and Google Cloud and Google Cloud Research Credits program to fund this project. AV acknowledges support from the McGovern Foundation and the Chleck Family Foundation. The findings and conclusions in this study are those of the authors and do not necessarily represent the official position of the funding agencies, the National Institutes of Health or U.S. Department of Health and Human Services. TER, LT, and TL were supported in part by NSF IIS-1741197, Combat Capabilities Development Command Army Research Laboratory under Cooperative Agreement Number W911NF-13-2-0045, and Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. None of the funders played any role in the study design, data collection, analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The near-ubiquity of mobile phone usage, coupled with state-of-the-art techniques for data anonymization and user privacy [1, 2], has led to unprecedented opportunities to gain insight into the social response to the COVID-19 pandemic [3–11] and to improve our understanding of human behavior by quantifying reductions in mobility and changes in consumer behavior [9, 12]. It has also contributed to the debate around the effectiveness of the different policies and guidelines introduced to mitigate the spread of COVID-19 [13–20].
Here, we present a framework aimed at characterizing collective patterns of physical distancing and we show some of its applications by looking at behavioral changes over time and locations, and by examining the relationship between the observed changes and employment characteristics and health outcomes. The proposed approach consists of several measures of mobility and physical proximity: 1) the daily range of mobility for each user; 2) the fraction of users that commute to work; 3) the fraction of users that travel between metropolitan areas; 4) the number of unique contacts outside of home and work; and 5) the average duration of those contacts. We compute these measures over a sample of anonymized, privacy-preserving aggregated location data for a panel of approximately 5.5M users selected from more than 40 million mobile devices geolocated in the United States between January and September 2020. Together, these complementary measures provide a macroscopic signature of what happens to a population when millions of individuals reduce their mobility and physical proximity. These measures allow us to provide one possible working definition of collective physical distancing in the United States during the first nine months of the COVID-19 pandemic, pre-vaccine era, and to quantify how it emerged (and, to some extent, persisted) following work-from-home policies, mobility restrictions, shelter-in-place orders, and other policy interventions implemented and promoted during the COVID-19 pandemic [21–23]. We show that the defined measures capture relevant differences in behavioral changes between urban and rural settings and that they are statistically associated with unemployment and teleworking rates. Notably, we also find that the measures characterizing reduction in individual contacts are early indicators of COVID-19 deaths.
These findings suggest that the proxy measures identified here can, in turn, be used to calibrate epidemic transmission models aimed at defining the burden of the COVID-19 epidemic [3, 24–26]. An interactive version of the results and measures included in this manuscript (as well as access to the anonymized, aggregated dataset) is made publicly available through the following online dashboard: https://covid19.gleamproject.org/mobility.
Materials and methods
Mobility data
Mobility data are provided by Cuebiq Inc., a location intelligence and measurement company. Through its Data for Good program (https://www.cuebiq.com/about/data-for-good/), Cuebiq Inc. provides access to aggregated and privacy-enhanced mobility data for academic research and humanitarian initiatives. Cuebiq Inc. collects its data primarily through its proprietary location-based Software Development Kit (SDK) that partners embed in their mobile apps. In other words, Cuebiq Inc. mobility data are collected from smartphone applications where location is at the core of the app’s functionality. This includes app categories such as maps, navigation, weather, and geo-specific retail. All data are collected with the informed consent of fully anonymized users under GDPR and CCPA compliant frameworks. As part of the opt-in process, users consent to share their anonymized data directly with Cuebiq Inc. for research purposes. Users may opt out at any time, request copies of their data, and request that their data be permanently deleted under portability and erasure clauses. In addition to fully anonymizing the device IDs of each user by utilizing an encrypted hash, the mobility data undergo additional privacy protections by utilizing patented privacy enhancing technologies. First, the inferred coordinates of users’ home and work locations are up-leveled so that the new coordinates correspond to the centroid of their corresponding Census block group [27], thereby precluding identification of individual users based on home or work address data, while preserving sociodemographic inference capabilities based on publicly available census data. Then, the mobility data are also subject to cleansing to remove visits to sensitive points of interest, including military bases, sexual and reproductive health centers, places of worship, elementary schools, and other places with heightened levels of privacy sensitivity.
While this mobility dataset has been released to researchers in an effort to assist COVID-19 response and epidemic modeling efforts, the data collection process in itself has not been specifically tailored for public health studies. In this study, we characterize collective physical distancing for a panel of 5,506,590 users who were active between January 7th and June 30th, 2020. Specifically, users are included in the panel if all the following conditions apply. First, each user must be active for at least 21 days in each month from January until June 2020. Second, on average, the spatial coordinates of each user must be recorded at least once per hour (averaged over the number of days in which a user is active). Lastly, the average geolocation accuracy for each device needs to be less than 50 meters for the period of coverage. The considered panel of users is representative of the U.S. population for several socio-demographic characteristics such as age, sex, race, educational attainment, and earnings (see S1 Text).
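The three inclusion criteria can be expressed as a simple per-user filter. The sketch below is illustrative only: the `UserStats` record and its field names are hypothetical and do not correspond to the actual Cuebiq data schema, and it assumes that "accuracy less than 50 meters" refers to the reported positional error radius.

```python
from dataclasses import dataclass

MONTHS = (1, 2, 3, 4, 5, 6)  # January through June 2020


@dataclass
class UserStats:
    """Hypothetical per-user summary; field names are illustrative."""
    active_days_by_month: dict   # month number -> number of active days
    mean_pings_per_hour: float   # averaged over the user's active days
    mean_accuracy_m: float       # average geolocation error radius, in meters


def in_panel(user: UserStats) -> bool:
    """Apply the three panel inclusion criteria described in the text."""
    # 1) Active for at least 21 days in every month from January to June.
    active_enough = all(
        user.active_days_by_month.get(month, 0) >= 21 for month in MONTHS
    )
    # 2) Location recorded at least once per hour, on average.
    sampled_enough = user.mean_pings_per_hour >= 1.0
    # 3) Average geolocation accuracy below 50 meters.
    accurate_enough = user.mean_accuracy_m < 50.0
    return active_enough and sampled_enough and accurate_enough
```

A user who falls below 21 active days in any single month is excluded even if the other two criteria are met.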
Demographic and employment data
County-level demographic data, including the rural-urban designation as well as demographic data used for statistical corrections, are from the United States Census and the American Community Survey (https://www.census.gov). County-level unemployment data are from the United States Bureau of Labor Statistics. Data about the percent of teleworkable jobs are from Dingel and Neiman [28] and Dey et al. [29], who provide estimates of the percent of jobs that can transition to telework for Metropolitan Statistical Areas (MSAs) based on the occupation distribution within each region.
State-level COVID-19 testing data and reopening data
Data about COVID-19 testing and cases are from the COVID Tracking Project [30], which compiles data directly from state health authorities. Data about the dates on which states initially began to reopen were collected from the New York Times [23].
Collective physical distancing indicators
In this section, we introduce five different mobility and proximity indicators that we use to quantify daily changes in collective physical distancing: 1) commute volume, 2) mobility range (radius of gyration), 3) inter-CSA transit, 4) distinct contacts per user, and 5) average contact duration.
Estimating short-range traveling using daily commute volume.
Daily commute volumes count the total number of home to work trips originating from a given county within 24 hours. Cuebiq Inc. provides a list of obfuscated “personal areas” for each user. Observations geolocated from within these locations are deemed to be coming from either the home or the work location of the individual and are therefore up-leveled to preserve user privacy. That is, these coordinates are aggregated to the centroid of the Census block group that each observation falls into. In order to quantify the changes in commuting behavior to and from work, we classify personal areas into home or work locations so that we can count commute flows. In particular, we consider the most commonly visited personal area during nighttime hours (9:00pm to 5:00am) as the home location of the user, while the most common non-home personal area visited during daytime hours (9:00am to 5:00pm) is classified as the work location of the user. This method is imperfect (e.g., it may misclassify users who exclusively work night shifts), but it is based on assumptions about the typical worker in the United States. Then, one commute is defined as a user visiting both their “home” and “work” locations in a given day. Lastly, in this study we take as reference the definition and location of personal areas as identified in the period immediately prior to the lockdown measures. Therefore, our commute metric reflects changes with respect to the status quo existing prior to the COVID-19 pandemic.
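The home/work classification and the commute count can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input is assumed to be a list of `(timestamp, area_id)` visits to one user's personal areas in local time, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def is_night(ts: datetime) -> bool:
    """Nighttime window used for home detection (9:00pm to 5:00am)."""
    return ts.hour >= 21 or ts.hour < 5


def is_day(ts: datetime) -> bool:
    """Daytime window used for work detection (9:00am to 5:00pm)."""
    return 9 <= ts.hour < 17


def classify_home_work(visits):
    """Return (home, work) area ids for one user's personal-area visits."""
    night = Counter(area for ts, area in visits if is_night(ts))
    home = night.most_common(1)[0][0] if night else None
    # Work is the most common daytime personal area that is not home.
    day = Counter(area for ts, area in visits if is_day(ts) and area != home)
    work = day.most_common(1)[0][0] if day else None
    return home, work


def commute_days(visits, home, work):
    """Dates on which the user visited both the home and the work area."""
    if home is None or work is None:
        return set()
    areas_by_day = {}
    for ts, area in visits:
        areas_by_day.setdefault(ts.date(), set()).add(area)
    return {day for day, areas in areas_by_day.items() if {home, work} <= areas}
```

Summing the number of commuting users per day over a county's residents would then give a daily commute volume. In practice the classification would be fit on the pre-lockdown reference period and held fixed, as described above.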
Estimating mobility range using the radius of gyration.
The radius of gyration [31] characterizes the extent of a given user’s trajectory in a single day and its formal definition is:

$$r_g = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left\lVert \vec{r}_i - \vec{r}_{cm} \right\rVert^2} \qquad (1)$$

where $n$ is the user’s number of observations on that day, $\vec{r}_i$ is the $i$th observed position of the user, $i = 1, 2, \ldots, n$, and $\vec{r}_{cm} = \frac{1}{n} \sum_{i=1}^{n} \vec{r}_i$ is the center of mass of the trajectory. This measure gives us a standardized way to tell how far an individual is traveling from their average daily position (center of mass), most likely their home and work locations, in a given day. That is, a larger radius of gyration corresponds to a trajectory with positions that are further away from the person’s center of mass. The radius of gyration provides us with a way to measure the “characteristic distance” [31] traveled daily by each individual and how their spatial range changed during the different phases of the pandemic. From an epidemiological standpoint, this measure is particularly relevant, as reductions in people’s spatial ranges will reduce the rate at which an epidemic will diffuse between (possibly distant) locations. Lastly, in order to compute typical mobility within a given region, we sum the daily radii of gyration of the users in that region.
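As an illustration, the radius of gyration is the root-mean-square distance of a day's observations from their center of mass. The sketch below assumes positions have already been projected to planar coordinates in meters; working directly with latitude and longitude would require a geodesic distance instead, and the function name is illustrative.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of (x, y) points from their center of mass.

    Assumes planar coordinates in meters (an assumption of this sketch).
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one observation is required")
    # Center of mass of the day's trajectory.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean squared distance from the center of mass, then square root.
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)
```

A user who stays in one place all day has a radius of gyration of zero, while a user who splits observations evenly between two points 2 km apart has a radius of 1 km.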
Estimating long-range traveling using inter-CSA transit.
To study the changes in long-range traveling, we calculated the number of users who visited at least two separate U.S. Census Combined Statistical Areas (CSAs) in a single day. CSAs are defined so that each CSA groups together neighboring areas that share significant employment interchange and that show a substantial degree of economic or social connection between them, often measured by commuting patterns. In other words, we measure the volume of long-range trips between major metropolitan areas, which allows us to capture, among other things, variations in air traffic and in long-range train and road trips.
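The inter-CSA count can be sketched as below. The input layout, an iterable of `(user_id, day, csa_code)` tuples with `csa_code` set to `None` for observations outside any CSA, is an assumption made for illustration.

```python
def inter_csa_travelers(observations):
    """Count, per day, the users observed in at least two distinct CSAs.

    observations: iterable of (user_id, day, csa_code); csa_code is None
    when the observation falls outside every CSA (hypothetical layout).
    """
    csas_seen = {}
    for user, day, csa in observations:
        if csa is not None:
            csas_seen.setdefault((user, day), set()).add(csa)
    counts = {}
    for (_user, day), csas in csas_seen.items():
        if len(csas) >= 2:
            counts[day] = counts.get(day, 0) + 1
    return counts
```

Days on which no user crossed between CSAs are simply absent from the returned dictionary.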
Estimating social mixing using daily contacts.
For the purpose of contact tracing, the CDC defined a close contact as someone who was “within 6 feet of an infected person for at least 15 minutes” [32]. Using this guidance, we operationalize the definition of contact as two devices being within the same rectangular area of approximately 38m × 19m for at least 15 minutes and we define two measures quantifying contact mixing between individuals. Even though defining contacts in this way can be noisy or imprecise due to the spatial resolution considered, we show in the following that the measures introduced in this section positively correlate with key epidemiological indicators, e.g., new deaths.
The method works as follows. Each time the time-stamped spatial coordinates of a user are recorded, we associate the longitude-latitude coordinates with an 8-character geohash. A geohash is a short string of letters and digits that encodes coordinates into a hierarchical spatial data structure that tessellates the world surface into a grid. In our case, we consider geohashes at 8-character resolution, which encode rectangular cells of approximately 38m × 19m at the equator [33]. We define two users to be co-located if they are observed in the same geohash for at least 15 consecutive minutes. For each user, we compute the number of unique users that a device is in contact with during a single day and the total dwell time of their daily contacts that fit the above criteria. We average these values across users in a given region to arrive at county, state, and nationwide daily average contact duration and average number of distinct contacts. Note that this method can record only contacts occurring outside of personal areas, due to Cuebiq’s privacy protection procedures that obfuscate the exact coordinates for location events occurring within personal areas (e.g., users’ home and work locations).
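The geohash-based co-location procedure can be sketched in two steps: encoding coordinates with the standard geohash algorithm, then pairing users whose stays in the same cell overlap by at least 15 minutes. The `daily_contacts` function and its input layout (pre-computed stays given as `(user, geohash, start_minute, end_minute)`) are simplifying assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Standard geohash encoding; bits alternate longitude, latitude."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, nbits, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:  # every 5 bits map to one base-32 character
            chars.append(_BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)


def daily_contacts(stays, min_overlap=15):
    """Return {user: (distinct_contacts, total_contact_minutes)} for one day.

    stays: list of (user, geohash, start_minute, end_minute); this layout
    is a hypothetical simplification of the raw location events.
    """
    by_cell = defaultdict(list)
    for user, cell, start, end in stays:
        by_cell[cell].append((user, start, end))
    partners = defaultdict(set)
    minutes = defaultdict(int)
    for cell_stays in by_cell.values():
        for (u, s1, e1), (v, s2, e2) in combinations(cell_stays, 2):
            if u == v:
                continue
            overlap = min(e1, e2) - max(s1, s2)
            if overlap >= min_overlap:  # co-located for 15+ minutes
                partners[u].add(v)
                partners[v].add(u)
                minutes[u] += overlap
                minutes[v] += overlap
    return {u: (len(partners[u]), minutes[u]) for u in partners}
```

Averaging the two returned quantities over the users of a region would give the region's daily number of distinct contacts and contact duration. Because cells have hard boundaries, two devices a few meters apart but on opposite sides of a cell edge are not counted, which is one source of the noise mentioned above.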
Typical behavior.
In this manuscript, we report the metrics as percent of typical activity. We select the period between January 16 and February 28, excluding holidays, as our baseline, therefore defining what constitutes, in the context of this work, typical behavior. For each measure, we divide the indicator daily value by its average value during the baseline period for the same day of the week (i.e., Mondays are compared to the average Monday during the baseline period). Therefore, values of 100% denote typical behavior. For example, the percent of typical behavior for the commuting indicator at time $t$ is computed as

$$100 \times \frac{C_t / N_t}{C_0 / N_0}$$

where $C_t$ denotes the number of commutes observed at time $t$, $N_t$ denotes the number of active users at time $t$, $C_0$ denotes the number of commutes in the baseline reference period, and $N_0$ denotes the number of active users in the baseline reference period.
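The weekday-matched normalization can be sketched as follows. One assumption is made explicit here: the baseline for each weekday is taken as the mean of the daily per-user rates over the baseline period, which is one reasonable reading of the text and may differ in detail from the authors' computation. The function name and input layout are illustrative.

```python
from datetime import date
from statistics import mean


def percent_of_typical(daily, baseline_start, baseline_end, holidays=frozenset()):
    """Express daily per-user commute rates as a percent of the baseline.

    daily: {date: (commutes, active_users)} (hypothetical layout).
    Each day is compared with the baseline average for the same weekday.
    """
    rates_by_weekday = {}
    for day, (commutes, users) in daily.items():
        in_baseline = baseline_start <= day <= baseline_end
        if in_baseline and day not in holidays and users > 0:
            rates_by_weekday.setdefault(day.weekday(), []).append(commutes / users)
    baseline = {wd: mean(rates) for wd, rates in rates_by_weekday.items()}
    result = {}
    for day, (commutes, users) in daily.items():
        base = baseline.get(day.weekday())
        if users > 0 and base:  # skip days with no usable baseline
            result[day] = 100.0 * (commutes / users) / base
    return result
```

Dividing by the number of active users before normalizing keeps the indicator from drifting when the size of the observed panel changes from day to day.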
Results
Quantifying spatial mobility and social mixing
On March 16, 2020, the United States government issued guidelines promoting nonpharmaceutical interventions (NPIs) to reduce the spread of COVID-19 [21]. Such interventions included school closures, state of emergency declarations requiring non-essential businesses to close, and shelter-in-place orders to minimize person-to-person contacts. By April 7, 95% of people in the United States were being urged by their states’ governors to stay home due to the pandemic [22], and by early May 2020, daily commuting volume across the United States had fallen by approximately 65% relative to typical daily values (Fig 1a, commute volume). Notably, in our data, the aggregate trend in commute volume has remained relatively stable since early May, at about a 60–70% reduction, though it is beginning to trend upwards again as of early September. We suspect this trend reflects both the “new normal” of work-from-home policies and the increases in unemployment due to COVID-19 in the fall of 2020. Commuting behavior informs us about the fraction of individuals that still go to their workplaces, and this information is also relevant for the modeling of disease transmission in workplace settings [34–36]. At the same time, we also observe a sharp decline in long-range traveling, as measured by the number of users traveling between CSAs (Fig 1b), relative to the baseline in every CSA included in this analysis. At the peak of this decline, the number of inter-CSA transits among the users in our panel had decreased by almost 50%, on average.
Graphs show deviations from typical behavior for the same weekday in the United States. (a) Mobility: Individual mobility (radius of gyration), commute volume, and inter-CSA transit. (b) Contacts: Number of distinct contacts and average contact duration events outside of work and home. By the national declaration of emergency (March 13), reductions in spatial mobility measures had begun, reaching approximately 50% of typical values by April 1; while contact measures show a reduction greater than 75% by the same date. A 7-day rolling average is shown alongside each measure. Two grey vertical dashed lines denote the introduction and expiration, respectively, of the CDC non-pharmaceutical interventions guidelines.
Lastly, we capture the change in the range of individual daily traveled distance during the COVID-19 pandemic and we show that by early May, the average radius of gyration of users in our panel decreased by 45–55% relative to a typical weekday, as shown in Fig 1a (mobility range). Similar results have been reported previously for New York City [5]. The range of distance traveled increases steadily throughout May and June, and by early July returns to about 95% of the typical behavior. This increase follows the rescinding of stay-at-home orders and the steady reopening of businesses across the country, meaning increases in mobility for both employees and consumers. However, it is also likely related to increased confidence among the general public that activities requiring traveling, such as trips to the beach and hikes, could be done while practicing social distancing, making them safe to engage in. Indeed, this return to near-typical mobility range is not accompanied by a return to near-typical person-to-person contact events, which supports the interpretation that the public’s confidence in the safety of low-contact activities increased over time.
As a first measure of social mixing we considered the number of distinct contacts that a user has in a given day, outside of work or home. These contacts quantify the opportunity for disease transmission to or from distinct individuals, for example when visiting the same coffee shop or interacting at a grocery store. On average, there was a dramatic decline in the number of distinct contacts that users had in a day, with the onset of this decline around March 11 (see Fig 1). Users in our panel had approximately 75% fewer distinct contacts per day by mid-April. Unique contacts increased steadily starting in May and through June, leveling off for the remainder of the summer at approximately a 40–50% reduction. This trend reflects a general loosening of physical distancing, consistent with the reopening of businesses as well as increased comfort with outdoor gathering. Importantly, however, we do not see a full return to typical behavior, suggesting that even when faced with newly reopened amenities (shops, restaurants, etc.) people in the United States remained reluctant to return to pre-pandemic levels of social activity.
Characterizing effective contacts for disease transmission must take into account that the probability of transmission also increases with the duration of the contact [37]. For this reason we measured each user’s average total duration of contacts with other users, based on how long their devices were located near each other. The total duration of contacts per day followed a similar pattern to the number of unique contacts. By mid-April, the duration of contacts was reduced by about 75% compared to typical behavior before social distancing measures took effect. Through May and June there was a steady increase up to about a 45% reduction from typical. The fact that total duration of contacts was reduced further than distinct contacts per day indicates that the increase in distinct users met is not always accompanied by an increase in time spent together. Again, some of this could be due to increased comfort with outdoor, socially distanced behavior, such as passing others on a walk through the park. Similar to the mobility range trends discussed earlier, throughout May and June there was a steady increase in contact events. However, the trend does not approach typical behavior by the end of July, instead hovering between 50% and 65% of typical.
The phases of collective physical distancing
The COVID-19 pandemic has brought some of the most substantial disruptions to collective human behavior in living memory. The timeline of these behavioral disruptions is clearly visible in Fig 2, where we report an aggregate index for each state computed as the average percentage reduction with respect to the typical activity of the five previously introduced mobility and proximity indicators. Using this aggregate index, we can characterize four distinct phases of collective physical distancing behavior in the United States:
- Typical activity. From late January to late February, we observe a baseline period that defines the typical activity.
- Peak reductions. During this period we observe the dramatic reduction of all indicators following the federal and state mandates and mitigations.
- Heterogeneous reopenings. The time window from early May to late June was marked by states’ reopening of businesses and schools according to different schedules and strictness in residual NPIs.
- New normal. From July onward, mobility range and inter-CSA transit increased to values comparable to pre-pandemic levels, while commuting flows (i.e., people going to work) and contact measures generally remained at values lower than typical pre-pandemic levels, characterizing a new stage for people living through a pandemic.
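The aggregate index used to delineate these phases is described above as the average percentage reduction, relative to typical activity, across the five indicators. A minimal sketch of that average is shown below; the indicator keys are illustrative names, and equal weighting of the five indicators is assumed.

```python
from statistics import mean

# Illustrative keys for the five indicators (names are assumptions).
INDICATORS = (
    "commute_volume",
    "mobility_range",
    "inter_csa_transit",
    "distinct_contacts",
    "contact_duration",
)


def aggregate_reduction(percent_typical):
    """Average percentage reduction from typical across the five indicators.

    percent_typical: {indicator_name: percent of typical activity}.
    """
    return mean(100.0 - percent_typical[name] for name in INDICATORS)
```

For instance, a state at 40%, 50%, and 60% of typical on the three mobility indicators and at 25% of typical on both contact indicators would have an aggregate reduction of 60%.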
(Top) County-level maps of collective physical distancing, with each county colored by an average of its typical daily commute volume, individual mobility range, inter-CSA transit, unique contacts outside of home and work, and total duration of contacts for the time frame listed. (Bottom) Heatmap of reductions in contacts and mobility, emphasizing key dates in every state and the key phases of the pandemic in the U.S. (reopening data from [23]).
Across these four time frames, we see nationwide reductions in commuting volume to and from work. People’s daily social routines changed dramatically as well, with daily mobility being reduced by up to 60% in April, along with approximately 80% fewer contacts with others per day at the peak of physical distancing. It remains a challenge to identify any single cause of these changes in behavior. However, when looked at together, they offer a way to characterize the evolution of our collective behavior, giving us a baseline for understanding how societies react to such a massive disruption. In the Supporting Information, we provide measure-specific curves for each state (S12 Fig) and for several major metropolitan areas (S13 to S29 Figs).
Commuting, working from home, and unemployment.
Commuting volume decreased dramatically in early March and, in most states, did not rebound in the way that the other mobility and contact measures did. This is likely due to several factors, from the historic waves of unemployment in the spring and early summer to a dramatic increase in teleworking. Indeed, throughout the COVID-19 pandemic in the U.S., we see a strong negative correlation between the percentage of teleworkable jobs and commute volume in metropolitan statistical areas (MSAs). For public health officials planning for future pandemics, this relationship between telework and commute volume is especially insightful; many of the NPIs that have been introduced throughout the pandemic were designed to limit the number of workplace infections, and as such, reductions in commute volume that are due to increased telework (as opposed to increased unemployment) could illustrate a preferable balance of economic and public health priorities.
During the COVID-19 pandemic, many employers eliminated in-person interactions, although many jobs in the United States cannot easily transition to remote work [28, 29, 38]. Furthermore, it is difficult to quantify the ubiquity of remote work across the United States and various surveys have been conducted trying to estimate this number [39]. In addition, typical commuting patterns were also impacted by an unprecedented surge in unemployment [40]. Therefore, we explore the relationships between commuting, unemployment, and teleworking by combining our commute metric with two additional data sources: the Bureau of Labor Statistics’s Local Area Unemployment Statistics (LAUS) dataset, which provides monthly estimates of county-level unemployment rates; and Dingel and Neiman’s and Dey et al.’s estimates of the proportion of jobs in an area that can be feasibly worked from home [28, 29]. The teleworkability estimates are at the Metropolitan Statistical Area (MSA) level, and the LAUS data at the county level. For this reason, in our analysis, we aggregate our measurements to the MSA level, excluding rural counties. Note that while local unemployment and commute volume vary month-to-month, the estimated proportion of teleworkable jobs is largely a static quantity.
In Fig 3, we present the relationship between commute volume, unemployment, and teleworkability in February, April, and July, with each month corresponding to a distinct phase of the pandemic. In February, the baseline period that we use to define “typical” activity, there is no correlation between the percent of typical commutes in MSAs and the percent of teleworkable jobs, nor is there a correlation between commuting and unemployment at that point (Fig 3a and 3d). This is expected, as the United States had not yet experienced large disruptions resulting from the COVID-19 crisis. Then, in April, following the updated guidelines about physical distancing and the massive surges in unemployment, commuting volume dropped substantially across the United States (on average, to about 40% of baseline levels, see Figs 1a and 3b). During this lockdown period, the percentage of jobs that can transition to telework shows a -0.533 correlation with commute volumes (Fig 3b); we also observe a significant negative correlation between commute volume and unemployment rate during this period (Fig 3e).
Grouping county-level employment data to the Metropolitan Statistical Area (MSA), we correlate commute volume with the percent of jobs that can readily transition to teleworking according to Dingel and Neiman [28] (top row) and unemployment rate (bottom row) over time. (a & d): February, during the baseline period; (b & e): April, during the peak lockdown; (c & f): July, after unemployment declined but commuting remained low, during the “new normal” phase.
Collective physical distancing in rural and urban areas.
We also observe different levels of collective physical distancing in different parts of the country, which reflects the heterogeneity in policy response, disease incidence, geography, and population structure across the US (see e.g. [41]). By grouping our collective physical distancing measures with the National Center for Health Statistics (NCHS) urban-rural county classification scheme [42], we can compare the responses of people living in urban versus rural settings. We observe that large central metro (code 1) areas showed the largest reductions in our collective physical distancing measures for contacts and mobility range (Fig 4a–4c); the more rural the county, the smaller the reduction from typical behavior. However, users living in more urban counties also had lower baselines for these measures (Fig 4d–4f), which we show using a standardized index as opposed to the “percent of typical” values shown in Fig 4a–4c. As a point of comparison, by the beginning of April, the median reduction of the number of distinct contacts for users in large/medium metro counties approached a level similar to the baseline of a typical user in rural, micropolitan counties (Fig 4f).
Each county in the United States is assigned a rural-urban code, ranging from 1 (large central metro) to 6 (highly rural, “non-core” counties). We average the percent of typical behavior per user (top row) and a standardized index (bottom row) across counties grouped by these six rural-urban code designations. The standardized index obscures raw values but preserves relative differences between groups; we compute it by normalizing each value by the median value across all counties. (a & d): mobility range; (b & e): contact duration; (c & f): distinct contacts. Seven-day rolling averages are plotted in bold above raw values plotted as thin curves.
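The two transformations mentioned in this caption, normalizing by the median across counties and smoothing with a seven-day rolling average, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names and toy values are hypothetical, and the use of a trailing (rather than centered) window is an assumption, since the caption does not specify it.

```python
from statistics import mean, median


def standardized_index(raw_by_county):
    """Divide each county's raw value by the median across all counties.

    Ratios between counties are unchanged, but the raw scale is hidden.
    """
    med = median(raw_by_county.values())
    if med == 0:
        raise ValueError("median is zero; the index is undefined")
    return {county: value / med for county, value in raw_by_county.items()}


def rolling_mean(values, window=7):
    """Trailing rolling average; the first window - 1 entries are None.

    A trailing window is an assumption of this sketch.
    """
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(mean(values[i + 1 - window : i + 1]))
    return smoothed


# Toy, made-up raw values for three hypothetical counties on one day.
raw = {"county_a": 8.0, "county_b": 4.0, "county_c": 2.0}
index = standardized_index(raw)

# Toy daily series for one group, smoothed with a seven-day window.
daily = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
smoothed = rolling_mean(daily, window=7)
```

Here `county_a` is four times `county_c` both before and after normalization, which is the sense in which the index preserves relative differences between groups.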
More-urban areas also began physical distancing behaviors earlier than more-rural areas; for example, large central and large fringe metro areas dipped to 80% of typical for their contact measures about five days earlier than small metropolitan areas, micropolitan areas, and non-core areas. During the first week of May, micropolitan and non-core counties showed mobility range that was around 75% of typical, while large central and fringe metro areas remained at around 50% of typical (Fig 4a). A similar rural-urban gap is seen in the percent of typical behavior for both measures of contacts (Fig 4b and 4c).
Collective physical distancing and the toll of COVID-19.
Lastly, we validate the use of the proximity/contact measures introduced in this manuscript as coarse-grained approximations for true person-to-person contacts. As such, we would expect to find a positive correlation between these measures and key epidemiological indicators, such as new deaths. More specifically, we would expect that a lagged correlation would best capture this relationship since it accounts for: a) the time from exposure to symptom onset (about 6 days); b) the median number of days from symptom onset to death (between 13 and 17 days depending on the age group considered); and c) the median number of days from death to reporting date (varying from 19 to 21 days depending on the age group) [43]. Because of these factors, the median delay we would expect in the correlation between our proxies for contacts and new deaths is in the range 38–44 days. Nationwide, we see that this is indeed the case.
In Fig 5, we plot the nationwide percents of typical average contact duration (a) and distinct contacts (b) against the daily number of new deaths per 100,000 people nationwide. The contact measures are correlated at a delay d ∈ [37, 44] days for each measure, which was selected by maximizing the correlation between contact patterns and new deaths. We do the same comparison for the number of new infections in the S4 Text (S11 Fig), highlighting the robustness of this correlation nationwide.
Here we correlate daily contact measures nationwide with new reported deaths [30] between April 30 and November 5, 2020. The horizontal axes correspond to the percent of typical contact patterns, while the vertical axes correspond to the (lagged) number of new deaths per 100,000. (a) Average contact duration; (b) Distinct contacts. A lag of d days was selected for each state so as to maximize the correlation between new deaths and contact measures. Maximum correlation is observed at d ∈ [37, 44] days (d = 44 is visualized), which is consistent with CDC estimates [43] that account for disease dynamics and reporting delays. In each subplot, darker colors indicate later dates and marker size corresponds to an estimate of the median effective reproductive number (Rt) across all 50 states and the District of Columbia (source: rt.live). These contact measures are also positively correlated with new reported cases (but at a shorter lag, see S11 Fig).
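The lag selection described here can be illustrated with a short Python sketch: scan candidate delays d and keep the one that maximizes the correlation between contacts on day t and deaths on day t + d. This is a sketch under stated assumptions, not the procedure used in the study: the function names, the synthetic sinusoidal series, and the built-in 40-day delay are all hypothetical. The expected window is computed from the delays quoted in the text.

```python
from math import pi, sin, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def best_lag(contacts, deaths, lags):
    """Return (d, r) maximizing corr(contacts[t], deaths[t + d]) over lags."""
    if len(contacts) != len(deaths):
        raise ValueError("series must cover the same days")
    best_d, best_r = None, float("-inf")
    for d in lags:
        x = contacts[: len(contacts) - d]  # contacts on day t
        y = deaths[d:]                     # deaths on day t + d
        r = pearson(x, y)
        if r > best_r:
            best_d, best_r = d, r
    return best_d, best_r


# Expected delay window from the text: exposure to onset (6 days), onset to
# death (13 to 17 days), and death to report (19 to 21 days).
expected_window = (6 + 13 + 19, 6 + 17 + 21)  # (38, 44)


def toy_contacts(t):
    """Synthetic percent-of-typical contacts (made-up, for illustration)."""
    return 60.0 + 20.0 * sin(2 * pi * t / 90.0)


TRUE_LAG = 40  # hypothetical delay built into the synthetic deaths series
days = range(200)
contacts = [toy_contacts(t) for t in days]
deaths = [0.02 * toy_contacts(t - TRUE_LAG) for t in days]

lag, r = best_lag(contacts, deaths, range(0, 61))
print(f"best lag: {lag} days (r = {r:.3f}); expected window: {expected_window}")
```

Because the synthetic deaths series is an exact delayed copy of the contacts series, the scan recovers the built-in delay; with real surveillance data the maximum would be much less sharp.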
The color of the markers in Fig 5 corresponds to time; darker colors indicate later dates. Here, we see an important relationship between our contact measures and the course of the pandemic. Namely, as contact patterns increased in the early summer (lighter colored markers), new infections and new deaths followed; this, in turn, was followed by decreases in contact events, followed again by decreases in new infections by late August (darker colored markers). This is approximately the same time as when the curves in Fig 1b (the contact measures) started to level off, while mobility and inter-city transit continued to rise (Fig 1a). What this disconnect between mobility and contact patterns suggests is that our collective social behavior can reduce the rate of new infections and, as a result, new deaths. This finding is possibly trivial to epidemiologists and public health officials, but it is nonetheless important for understanding how our collective behavior impacts the trajectory of a pandemic and for validating our contact measures as proxies for true person-to-person contacts; it is also consistent with other findings throughout the literature on COVID-19 [11, 44–46]. The ability to measure these patterns in almost real time shows the potential benefits of using mobile device data in forecasting (or “nowcasting”) the trajectory of a virus, and moving forward, these patterns provide a baseline for our collective behavioral response to future pandemics.
Discussion
The massive efforts to comply with the CDC’s physical distancing guidelines came at a substantial cost to the economic and social well-being of people in the United States. By quantifying these nationwide behavioral changes, we get a glimpse into the relationship between large-scale collective behavior and the course of the pandemic. Learning from these patterns is necessary to prepare for future pandemics, most notably because, despite large-scale collective physical distancing, the United States reported over 13 million cases of COVID-19 during the time window from February to December 2020. Furthermore, considering the period up to December 31st, 2020, more than 385,000 deaths were attributed to COVID-19 on death certificates [47]. This suggests that in the pre-vaccine era, the timing, magnitude, and synchrony [48, 49] of collective physical distancing in the United States were ultimately insufficient to completely mitigate the nationwide outbreak. However, studies have shown that combining social distancing with other interventions such as extensive testing, quarantines, and contact tracing can help to keep epidemic incidence at lower levels, thus helping local healthcare systems while allowing for a certain degree of relaxation of social distancing measures [34]. Indeed, that was the approach that some countries, such as South Korea, Taiwan, and China, followed [50–52]. Ultimately, for such a combined approach to work, it is essential to be able to quantify collective social distancing using near-real-time indicators, like the ones discussed in this work.
During the “new normal” period from July to December 2020, there were millions of new cases and hundreds of thousands of new deaths in the United States; during this same time period, we see mobility patterns return to 100% of baseline levels while contacts remained at around 65% of typical activity. This suggests two key things. First, a national average of approximately 65% of typical contacts was not sufficient to avoid the large number of cases seen during that period. Further modeling efforts are needed to estimate the potential effects that larger decreases in contacts would have had (e.g., to 60% or 50% of typical instead of 65%). Second, over the course of the pandemic, people may have learned to adapt their behavior in a way that allows them to travel while still limiting opportunities for contact with others. For example, visiting a park or hiking are activities that are likely associated with higher mobility but not necessarily more contacts. Indeed, in many cities across the United States, we see a relative rise in visits to parks [9] during this time period. Learning from this might inform goals or benchmarks for policy responses to this or future pandemics.
In this work, we quantified the unprecedented behavioral response to COVID-19 in the first 9 months of the COVID-19 pandemic in the United States—collective physical distancing at a nationwide scale—using five different measures of mobility and contact patterns. By studying the daily mobility patterns of millions of anonymous mobile phone users, we show how people altered their typical behavior, limiting daily interactions with others to comply with policy interventions and in an effort to reduce their chances of becoming infected with the virus. Understanding precisely and quantifying how individuals’ behavior changed over the course of the pandemic is critical, and in this work we present several measures that transform large-scale mobile device data into near real-time epidemiological insights. Of particular importance, the contact proximity measures introduced here correlate with the onset of new deaths nationwide; this correlation is maximized at a delay of 37–44 days, in line with the range reported by the CDC [43].
Recent work has shown that a more nuanced understanding of typical human mixing patterns can have dramatic effects on the spread of a disease and our models of the spread of a disease; it is particularly useful to understand age-based, setting-specific contact patterns within a population [36, 53]. The current study is limited by the absence of these data, and in many ways traditional surveying methods may offer more robust estimates (see [53]). However, the measures of collective physical distancing behavior that we introduce can potentially be generalized by using differences in Census tract age distributions to estimate (in aggregate) age-specific mobility and contact reductions. Lastly, we quantify contacts based on geographic proximity and we do not attempt to link locations to information about the setting where these contacts take place (e.g., at a restaurant, workplace, park, etc.); this information is particularly relevant because the odds of disease transmission are much higher with contacts in closed spaces compared to open-air environments [35, 54]. This can be addressed by measuring contact events within a pre-identified list of points-of-interest.
Cuebiq Inc.’s mobility data used in this work have both strengths and weaknesses. As shown in this study, these data can play a significant role in enriching our understanding of the effects of policy interventions and therefore enhance our ability to realistically model and predict disease spreading during an ongoing outbreak by providing us with a real-time situational awareness tool that can allow us to monitor changes in physical behaviors and mobility. However, these data also have some limitations: reliability may be impacted by the user opt-out mechanism if a significant number of users decide to opt out of the data sharing agreement; we do not have information regarding the specific list of apps that Cuebiq users are using and therefore we cannot directly control for potential self-selection biases in the user population; GPS data can sometimes be imprecise, particularly in densely populated urban areas or locations with poor GPS signal, which could affect the quality of location-based data; and the generalizability of the data might be limited due to its collection from apps where location is central to functionality. In this study, we addressed the data limitations by building a selected panel of users and by assessing its representativeness with respect to key sociodemographic characteristics. Nevertheless, we have shown that it is possible to quantify collective physical distancing, at scale, during an ongoing pandemic using high-resolution location data. Moving forward and despite the limitations listed above, the use of mobility and proximity indicators like the ones proposed in this work will enable us to devise more precise and effective mitigation strategies, allowing for the possibility of integrating social distancing approaches with other interventions, and ultimately informing our actions and policies in the face of future pandemics.
Supporting information
S2 Text. Correlating physical distancing measures across datasets.
https://doi.org/10.1371/journal.pdig.0000430.s002
(PDF)
S3 Text. Sensitivity analysis: Teleworkable jobs and commuting.
https://doi.org/10.1371/journal.pdig.0000430.s003
(PDF)
S4 Text. Correlating contact patterns with new positive tests.
https://doi.org/10.1371/journal.pdig.0000430.s004
(PDF)
S4 Fig. Schematic of statistical procedure for assigning county-level weights.
https://doi.org/10.1371/journal.pdig.0000430.s009
(PDF)
S5 Fig. Changes in mobility and person-to-person contacts over time (unweighted panel).
https://doi.org/10.1371/journal.pdig.0000430.s010
(PDF)
S6 Fig. Collective physical distancing and new deaths (unweighted panel).
https://doi.org/10.1371/journal.pdig.0000430.s011
(PDF)
S7 Fig. Collective Physical Distancing: Weighted vs unweighted.
https://doi.org/10.1371/journal.pdig.0000430.s012
(PDF)
S8 Fig. Correlations across mobility datasets.
https://doi.org/10.1371/journal.pdig.0000430.s013
(PDF)
S11 Fig. Collective physical distancing and new infections.
https://doi.org/10.1371/journal.pdig.0000430.s016
(PDF)
S12 Fig. Collective physical distancing across every state.
https://doi.org/10.1371/journal.pdig.0000430.s017
(PDF)
S13 Fig. Changes in mobility and person-to-person contacts over time in Atlanta–Athens-Clarke County–Sandy Springs, GA-AL.
https://doi.org/10.1371/journal.pdig.0000430.s018
(PDF)
S14 Fig. Changes in mobility and person-to-person contacts over time in Boston-Worcester-Providence, MA-RI-NH-CT.
https://doi.org/10.1371/journal.pdig.0000430.s019
(PDF)
S15 Fig. Changes in mobility and person-to-person contacts over time in Chicago-Naperville, IL-IN-WI.
https://doi.org/10.1371/journal.pdig.0000430.s020
(PDF)
S16 Fig. Changes in mobility and person-to-person contacts over time in Dallas-Fort Worth, TX-OK.
https://doi.org/10.1371/journal.pdig.0000430.s021
(PDF)
S17 Fig. Changes in mobility and person-to-person contacts over time in Denver-Aurora, CO.
https://doi.org/10.1371/journal.pdig.0000430.s022
(PDF)
S18 Fig. Changes in mobility and person-to-person contacts over time in Detroit-Warren-Ann Arbor, MI.
https://doi.org/10.1371/journal.pdig.0000430.s023
(PDF)
S19 Fig. Changes in mobility and person-to-person contacts over time in Los Angeles-Long Beach, CA.
https://doi.org/10.1371/journal.pdig.0000430.s024
(PDF)
S20 Fig. Changes in mobility and person-to-person contacts over time in Miami-Port St. Lucie-Fort Lauderdale, FL.
https://doi.org/10.1371/journal.pdig.0000430.s025
(PDF)
S21 Fig. Changes in mobility and person-to-person contacts over time in New Orleans-Metairie-Hammond, LA-MS.
https://doi.org/10.1371/journal.pdig.0000430.s026
(PDF)
S22 Fig. Changes in mobility and person-to-person contacts over time in New York-Newark, NY-NJ-CT-PA.
https://doi.org/10.1371/journal.pdig.0000430.s027
(PDF)
S23 Fig. Changes in mobility and person-to-person contacts over time in Orlando-Lakeland-Deltona, FL.
https://doi.org/10.1371/journal.pdig.0000430.s028
(PDF)
S24 Fig. Changes in mobility and person-to-person contacts over time in Philadelphia-Reading-Camden, PA-NJ-DE-MD.
https://doi.org/10.1371/journal.pdig.0000430.s029
(PDF)
S25 Fig. Changes in mobility and person-to-person contacts over time in Phoenix-Mesa, AZ.
https://doi.org/10.1371/journal.pdig.0000430.s030
(PDF)
S26 Fig. Changes in mobility and person-to-person contacts over time in San Jose-San Francisco-Oakland, CA.
https://doi.org/10.1371/journal.pdig.0000430.s031
(PDF)
S27 Fig. Changes in mobility and person-to-person contacts over time in Seattle-Tacoma, WA.
https://doi.org/10.1371/journal.pdig.0000430.s032
(PDF)
S28 Fig. Changes in mobility and person-to-person contacts over time in St. Louis-St. Charles-Farmington, MO-IL.
https://doi.org/10.1371/journal.pdig.0000430.s033
(PDF)
S29 Fig. Changes in mobility and person-to-person contacts over time in Washington-Baltimore-Arlington, DC-MD-VA-WV-PA.
https://doi.org/10.1371/journal.pdig.0000430.s034
(PDF)
Acknowledgments
We thank Ciro Cattuto, Michele Tizzoni, and Zachary Cohen for their help understanding the details of Cuebiq data and Esteban Moro for his comments. We also thank Chia-Hung Yang for coding assistance. We thank Agastya Mondal and Robel Kassa for the development of the online dashboard. Geographical boundaries utilized in Fig 2 have been obtained from the U.S. Census Bureau.
References
- 1. Oliver N, Lepri B, Sterly H, Lambiotte R, Deletaille S, De Nadai M, et al. Mobile phone data for informing public health actions across the COVID-19 pandemic life cycle. Science Advances. 2020;0764. pmid:32548274
- 2. De Montjoye YA, Gambs S, Blondel V, Canright G, de Cordes N, Deletaille S, et al. Comment: On the privacy-conscientious use of mobile phone data. Scientific Data. 2018;5:1–6.
- 3. Buckee CO, Balsari S, Chan J, Crosas M, Dominici F, Gasser U, et al. Aggregated mobility data could help fight COVID-19. Science. 2020;368(6487):145–146. pmid:32205458
- 4. Pepe E, Bajardi P, Gauvin L, Privitera F, Lake B, Cattuto C, et al. COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown. Scientific Data. 2020;7(1):3–9. pmid:32641758
- 5. Bakker M, Berke A, Groh M, Pentland AS, Moro E. Social Distancing in New York City; 2020. http://curveflattening.media.mit.edu/posts/social-distancing-new-york-city/.
- 6. Glanz J, Carey B, Holder J, Watkins D, Valentino-DeVries J, Rojas R, et al. Where America Didn’t Stay Home Even as the Virus Spread; 2020. https://nyti.ms/3aAql0E.
- 7. Valentino-DeVries J, Lu D, Dance GJX. Location Data Says It All: Staying at Home During Coronavirus Is a Luxury; 2020. https://www.nytimes.com/interactive/2020/04/03/us/coronavirus-stay-home-rich-poor.html.
- 8. Canipe C. The social distancing of America; 2020. https://graphics.reuters.com/HEALTH-CORONAVIRUS/USA/qmypmkmwpra/index.html.
- 9. Google. See how your community is moving around differently due to COVID-19; 2020. https://www.google.com/covid19/mobility/.
- 10. Gao S, Rao J, Kang Y, Liang Y, Kruse J, Dopfer D, et al. Association of Mobile Phone Location Data Indications of Travel and Stay-at-Home Mandates With COVID-19 Infection Rates in the US. JAMA Network Open. 2020;3(9):e2020485. pmid:32897373
- 11. Chang S, Pierson E, Koh PW, Gerardin J, Redbird B, Grusky D, et al. Mobility network models of COVID-19 explain inequities and inform reopening. Nature. 2021;589(7840):82–87. pmid:33171481
- 12. United States Department of Commerce. Advance Monthly Sales for Retail Trade and Food Services; 2020. https://www.census.gov/retail/marts/www/marts_current.pdf.
- 13. Flaxman S, Mishra S, Gandy A, Unwin HJT, Mellan TA, Coupland H, et al. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature. 2020;584(7820):257–261. pmid:32512579
- 14. Haug N, Geyrhofer L, Londei A, Dervic E, Desvars-Larrive A, Loreto V, et al. Ranking the effectiveness of worldwide COVID-19 government interventions. Nature Human Behaviour. 2020;4(12):1303–1312. pmid:33199859
- 15. Di Domenico L, Pullano G, Sabbatini CE, Boëlle PY, Colizza V. Impact of lockdown on COVID-19 epidemic in Île-de-France and possible exit strategies. BMC Medicine. 2020;18(1):240. pmid:32727547
- 16. Fang H, Wang L, Yang Y. Human mobility restrictions and the spread of the Novel Coronavirus (2019-nCoV) in China. Journal of Public Economics. 2020;191:104272. pmid:33518827
- 17. Perra N. Non-pharmaceutical interventions during the COVID-19 pandemic: A review. Physics Reports. 2021;913:1–52. pmid:33612922
- 18. Liu Y, Morgenstern C, Kelly J, Lowe R, Jit M. The impact of non-pharmaceutical interventions on SARS-CoV-2 transmission across 130 countries and territories. BMC Medicine. 2021;19(1):1–12. pmid:33541353
- 19. Hunter PR, Colón-González FJ, Brainard J, Rushton S. Impact of non-pharmaceutical interventions against COVID-19 in Europe in 2020: a quasi-experimental non-equivalent group and time series design study. Eurosurveillance. 2021;26(28):2001401. pmid:34269173
- 20. Brauner JM, Mindermann S, Sharma M, Johnston D, Salvatier J, Gavenčiak T, et al. Inferring the effectiveness of government interventions against COVID-19. Science. 2021;371(6531):eabd9338. pmid:33323424
- 21. The White House. Coronavirus Guidelines for America; 2020. https://www.whitehouse.gov/briefings-statements/coronavirus-guidelines-america/.
- 22. Mervosh S, Lu D, Swales V. See Which States and Cities Have Told Residents to Stay at Home; 2020. https://nyti.ms/2y5j9LN.
- 23. Mervosh S, Lee JC, Gamio L, Popovich N. See Which States Are Reopening and Which Are Still Shut Down; 2020. https://nyti.ms/2Y37Ezj.
- 24. Davis JT, Chinazzi M, Perra N, Mu K, Pastore y Piontti A, Ajelli M, et al. Cryptic transmission of SARS-CoV-2 and the first COVID-19 wave. Nature. 2021;600(7887):127–132. pmid:34695837
- 25. Ray EL, Wattanachit N, Niemi J, Kanji AH, House K, Cramer EY, et al. Ensemble forecasts of coronavirus disease 2019 (COVID-19) in the US. medRxiv. 2020.
- 26. Woody S, Tec MG, Dahan M, Gaither K, Lachmann M, Fox S, et al. Projections for first-wave COVID-19 deaths across the US using social-distancing measures derived from mobile phones. medRxiv. 2020.
- 27. United States Census Department. Glossary; 2019. https://www.census.gov/programs-surveys/geography/about/glossary.html.
- 28. Dingel JI, Neiman B. How many jobs can be done at home? National Bureau of Economic Research; 2020. 26948. pmid:32834177
- 29. Dey M, Frazis H, Piccone DS Jr, Loewenstein MA. Teleworking and lost work during the pandemic: new evidence from the CPS. Monthly Labor Review, U.S. Bureau of Labor Statistics; 2021.
- 30. Miller K, Curry K. The COVID Tracking Project; 2020. https://github.com/COVID19Tracking.
- 31. González MC, Hidalgo CA, Barabási AL. Understanding Individual Human Mobility Patterns. Nature. 2008;453(7196):779–782. pmid:18528393
- 32. Centers for Disease Control and Prevention. Operational considerations for adapting a contact tracing program to respond to the COVID-19 pandemic; 2020. https://www.cdc.gov/coronavirus/2019-ncov/downloads/global-covid-19/operational-considerations-contact-tracing.pdf.
- 33. Niemeyer G. Geohash; 2008. https://en.wikipedia.org/wiki/Geohash.
- 34. Aleta A, Martín-Corral D, Pastore y Piontti A, Ajelli M, Litvinova M, Chinazzi M, et al. Modelling the impact of testing, contact tracing and household quarantine on second waves of COVID-19. Nature Human Behaviour. 2020;4(9):964–971. pmid:32759985
- 35. Aleta A, Martín-Corral D, Bakker MA, y Piontti AP, Ajelli M, Litvinova M, et al. Quantifying the importance and location of SARS-CoV-2 transmission events in large metropolitan areas. Proceedings of the National Academy of Sciences. 2022;119(26):e2112182119. pmid:35696558
- 36. Mistry D, Litvinova M, Pastore y Piontti A, Chinazzi M, Fumanelli L, Gomes MFC, et al. Inferring high-resolution human mixing patterns for disease modeling. Nature Communications. 2021;12(1):323. pmid:33436609
- 37. Centers for Disease Control and Prevention. COVID-19 Understanding Exposure Risks; 2022. https://www.cdc.gov/coronavirus/2019-ncov/your-health/risks-exposure.html.
- 38. U.S. Bureau of Labor Statistics. Workers who could work at home, did work at home, and were paid for work at home, by selected characteristics, averages for the period 2017-2018; 2019. https://www.bls.gov/news.release/flex2.t01.htm.
- 39. Brynjolfsson E, Horton JJ, Ozimek A, Rock D, Sharma G, TuYe HY. COVID-19 and remote work: An early look at US data. National Bureau of Economic Research; 2020. 27344.
- 40. U.S. Department of Labor. April 30, 2020: Unemployment Insurance Weekly Claims Report; 2020. https://www.dol.gov/sites/dolgov/files/OPA/newsreleases/ui-claims/20200774.pdf.
- 41. Rader B, Scarpino SV, Nande A, Hill AL, Adlam B, Reiner RC, et al. Crowding and the shape of COVID-19 epidemics. Nature Medicine. 2020;26(12):1829–1834. pmid:33020651
- 42. National Center for Health Statistics. NCHS Urban-Rural Classification Scheme for Counties; 2013. https://www.cdc.gov/nchs/data_access/urban_rural.htm.
- 43. Centers for Disease Control and Prevention. COVID-19 Pandemic Planning Scenarios; 2020. https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html.
- 44. Xiong C, Hu S, Yang M, Luo W, Zhang L. Mobile device data reveal the dynamics in a positive relationship between human mobility and COVID-19 infections. Proceedings of the National Academy of Sciences. 2020;117(44):27087–27089.
- 45. Badr HS, Du H, Marshall M, Dong E, Squire MM, Gardner LM. Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study. The Lancet Infectious Diseases. 2020;20(11):1247–1254. pmid:32621869
- 46. Monod M, Blenkinsop A, Xi X, Hebert D, Bershan S, Tietze S, et al. Age groups that sustain resurging COVID-19 epidemics in the United States. Science. 2021;371(6536). pmid:33531384
- 47. Centers for Disease Control and Prevention. Provisional Death Counts for Coronavirus Disease 2019 (COVID-19); 2023. https://www.cdc.gov/nchs/covid19/mortality-overview.htm.
- 48. Althouse BM, Wallace B, Case B, Scarpino SV, Berdahl A, White ER, et al. The unintended consequences of inconsistent pandemic control policies. medRxiv. 2020. pmid:32869043
- 49. Holtz D, Zhao M, Benzell SG, Cao CY, Rahimian MA, Yang J, et al. Interdependence and the cost of uncoordinated responses to COVID-19. Proceedings of the National Academy of Sciences. 2020;117(33):19837–19843. pmid:32732433
- 50. Kang S-J, Kim S, Park K-H, Jung SI, Shin M-H, Kweon S-S, et al. Successful control of COVID-19 outbreak through tracing, testing, and isolation: Lessons learned from the outbreak control efforts made in a metropolitan city of South Korea. Journal of Infection and Public Health. 2021;14(9):1151–1154. pmid:34364306
- 51. Wang CJ, Ng CY, Brook RH. Response to COVID-19 in Taiwan: Big Data Analytics, New Technology, and Proactive Testing. JAMA. 2020;323(14):1341–1342. pmid:32125371
- 52. Whitelaw S, Mamas MA, Topol E, Van Spall HGC. Applications of digital technology in COVID-19 pandemic planning and response. The Lancet Digital Health. 2020;2(8):e435–e440. pmid:32835201
- 53. Zhang J, Litvinova M, Liang Y, Wang Y, Wang W, Zhao S, et al. Changes in contact patterns shape the dynamics of the COVID-19 outbreak in China. Science. 2020;368(6498):1481–1486. pmid:32350060
- 54. Nishiura H, Oshitani H, Kobayashi T, Saito T, Sunagawa T, Matsui T, et al. Closed environments facilitate secondary transmission of coronavirus disease 2019 (COVID-19). medRxiv. 2020.