Characterizing collective physical distancing in the U.S. during the first nine months of the COVID-19 pandemic

The COVID-19 pandemic offers an unprecedented natural experiment providing insights into the emergence of collective behavioral changes of both exogenous (government mandated) and endogenous (spontaneous reaction to infection risks) origin. Here, we characterize collective physical distancing—mobility reductions, minimization of contacts, shortening of contact duration—in response to the COVID-19 pandemic in the pre-vaccine era by analyzing de-identified, privacy-preserving location data for a panel of over 5.5 million anonymized, opted-in U.S. devices. We define five indicators of users’ mobility and proximity to investigate how the emerging collective behavior deviates from typical pre-pandemic patterns during the first nine months of the COVID-19 pandemic. We analyze both the dramatic changes due to the government mandated mitigation policies and the more spontaneous societal adaptation into a new (physically distanced) normal in the fall 2020. Using the indicators here defined we show that: a) during the COVID-19 pandemic, collective physical distancing displayed different phases and was heterogeneous across geographies, b) metropolitan areas displayed stronger reductions in mobility and contacts than rural areas; c) stronger reductions in commuting patterns are observed in geographical areas with a higher share of teleworkable jobs; d) commuting volumes during and after the lockdown period negatively correlate with unemployment rates; and e) increases in contact indicators correlate with future values of new deaths at a lag consistent with epidemiological parameters and surveillance reporting delays. In conclusion, this study demonstrates that the framework and indicators here presented can be used to analyze large-scale social distancing phenomena, paving the way for their use in future pandemics to analyze and monitor the effects of pandemic mitigation plans at the national and international levels.



Introduction
The near-ubiquity of mobile phone usage, coupled with state-of-the-art techniques for data anonymization and user privacy [1,2], has led to unprecedented opportunities to gain insight into the social response to the COVID-19 pandemic [3][4][5][6][7][8][9][10][11]. These types of data are useful for informing public policies and improving our understanding of human behavior by quantifying reductions in mobility and changes in consumer behavior (e.g., spending less on retail [12] or transitioning to a more sedentary lifestyle [9]). In short, quantifying individuals' physical distancing behavior using mobile device data can give rise to an understanding of physical distancing as a large-scale collective phenomenon: what it looks like at the macroscopic level when individuals' behaviors change at such a dramatic scale. In the case of a major crisis like the COVID-19 pandemic, the insights generated by mobility and proximity data provide researchers and policy makers with critical near real-time situational awareness that can help in managing our societal response. Furthermore, these data contribute to the debate around the effectiveness of the different policies and guidelines introduced to mitigate the spread of the disease [13][14][15][16][17][18][19][20]: what drives the reduction of daily interactions with others? By how much? What are the implications of the change in behavior both on the trajectory of the epidemic and on our projections of its spread?
Here, we present a framework aimed at characterizing the collective patterns of physical distancing emerging in a society through several measures of mobility and physical proximity: 1) the daily range of mobility for each user; 2) the fraction of users that commute to work; 3) the fraction of users that travel between metropolitan areas; 4) the number of unique contacts outside of home and work; and 5) the average duration of those contacts. We compute these measures over a sample of anonymized, privacy-preserving aggregated location data selected from more than 40 million mobile devices of users geolocated in the United States between January and September, 2020. Together, these complementary measures provide a macroscopic signature of what happens to a population when millions of individuals reduce their mobility and physical proximity. These measures allow us to provide a working definition of collective physical distancing in the United States during the first nine months of the COVID-19 pandemic (the pre-vaccine era), and to quantify how it emerged (and, to some extent, persisted) following work-from-home policies, mobility restrictions, shelter-in-place orders, and other policy interventions implemented and promoted during the COVID-19 pandemic [21][22][23]. We show that the defined measures capture relevant differences in behavior change between urban and rural settings, and are statistically associated with unemployment and teleworking rates. Notably, we also find that the measures characterizing reduction in individual contacts are early indicators of COVID-19 deaths. These findings suggest that the proxy measures identified here can, in turn, be used to calibrate epidemic transmission models aimed at defining the burden of the COVID-19 epidemic [3][24][25][26]. An interactive version of the results and measures included in this manuscript (as well as access to the anonymized, aggregated dataset) is made publicly available through the following online dashboard: https://covid19.gleamproject.org/mobility.

Results
In the following, we consider longitudinal mobility data provided by Cuebiq Inc. through its Data for Good program (https://www.cuebiq.com/about/data-for-good/). Cuebiq Inc. provides access to aggregated and privacy-enhanced mobility data for academic research and humanitarian initiatives, collected from users who have opted in to provide access to their GPS location data anonymously, through a GDPR-compliant framework (see Materials and Methods section for a more complete description of these data).
The use of a convenience sample of mobile users to study behavioral variations in time requires caution, and we took measures to reduce two major sources of potential bias: user attrition and user selection bias. First, we measure our five collective physical distancing metrics using a stable longitudinal panel of Cuebiq Inc. users who were consistently active between January and June, 2020 (see Supplemental Information (SI) for details). This subset includes approximately 5.5 million anonymous users. Second, we have checked the socio-demographic representativeness of the panel and conducted two separate analyses with and without county-specific sampling weights that control for aggregated socio-demographic characteristics such as age, sex, race, educational attainment, and earnings (see SI). Our approach allowed us to create a statistically representative and stable sample of users to adequately measure collective physical distancing in the United States at national, state, and metropolitan levels of aggregation. In addition, as a robustness check, we provide the correlations between our measures of collective physical distancing and several other datasets that have been made publicly available (see SI). In this comparison, we observe the expected correlations between similarly defined measures. For example, the reduced mobility trends captured in our measures are strongly correlated with Google's stay-at-home measure as well as other comparable transit measures. Critically, the measures proposed in this work provide additional insights that do not seem to be fully captured by other publicly available datasets, especially when looking at publicly available proxies for human interactions.
[...] to close, and shelter-in-place orders to minimize person-to-person contacts. By April 7, 95% of people in the United States were being urged by their states' governors to stay home due to the pandemic [22]. To quantify the effects of these measures and the aggregated changes of human mobility across the United States, we defined the following mobility indicators.

Measures of spatial mobility
Short-distance traveling: daily commutes. We measured commuting flows by computing the total number of home-to-work trips originating from a given county within 24 hours. Commuting behavior is particularly interesting not only as a measure of short-range traveling but also because it can be used as a proxy for potential opportunities of transmission in the workplace. In other words, it indirectly informs us about the fraction of individuals that still go to their workplace. This information is also relevant for the modeling of disease transmission in workplace settings [27][28][29]. By early May 2020, our measure shows a reduction of approximately 65% relative to typical daily values across the United States (Figure 1a, commute volume). Notably, in our data, the aggregate trend in commute volume has remained relatively stable since early May, at about a 60-70% reduction, though it is beginning to trend upwards again as of early September. We suspect this trend reflects both the reality of the "new normal" of work-from-home policies and the increases in unemployment due to COVID-19 in the fall of 2020.
Long-range traveling: inter-city trips. To study the changes in long-range traveling, we calculated the number of anonymous users who visited at least two separate Census Statistical Areas (CSAs) in a single day. In other words, we measure the volume of long-range trips between major metropolitan areas, which allows us to capture, among other things, variations in air traffic and long-range train and road trips. Indeed, by looking at inter-CSA mobility, we observe a sharp decline in the number of users traveling between CSAs (Figure 1b) as compared to the baseline in every CSA included in this analysis. At its peak, the reduction in inter-CSA transits among the users in our panel was almost 50%, on average.
Individual traveled distance: radius of gyration. Lastly, we capture the change in the range of individual daily traveled distance during the COVID-19 pandemic by calculating the radius of gyration [30] for each (anonymized) mobile device in the panel of users selected for this study (see Materials & Methods for a formal definition of the radius of gyration). This measure gives us a standardized way to tell how far an individual is traveling from their average daily position. In other words, it measures how far a user moves from their typical center of mass, most likely their home and work locations, in a given day. By early May, the average radius of gyration of users in our panel had decreased by 45-55% relative to a typical weekday, as shown in Figure 1a (mobility range). Similar results have been reported previously for New York City [5]. The range of distance traveled increases steadily throughout May and June, and by early July returns to about 95% of the typical behavior. This increase follows the rescinding of stay-at-home orders and the steady reopening of businesses across the country, meaning increases in mobility for both employees and consumers. However, it is also likely related to increased confidence among the general public that activities requiring traveling, such as trips to the beach and hikes, could be done while practicing social distancing, making them safe to engage in. Indeed, this return to near-typical mobility range is not accompanied by a return to near-typical person-to-person contact events, which supports the view that the public's confidence in the safety of low-contact activities increased over time.
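To make the mobility-range measure concrete, the following is a minimal sketch of a daily radius of gyration for one device. It assumes the standard root-mean-square definition (distance of each observation from the day's mean position); the formal definition used in this work is given in Materials & Methods and may differ in detail. The function names, the use of a simple mean of coordinates as the center of mass, and the haversine distance are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_of_gyration_km(points):
    """Root-mean-square distance of one day's observations from their mean position.

    `points` is a list of (lat, lon) pairs for a single device and day.
    Assumption: the center of mass is approximated by the mean latitude and
    mean longitude, which is reasonable for short daily trajectories.
    """
    if not points:
        raise ValueError("at least one observation is required")
    n = len(points)
    center_lat = sum(lat for lat, _ in points) / n
    center_lon = sum(lon for _, lon in points) / n
    mean_sq = sum(haversine_km(lat, lon, center_lat, center_lon) ** 2 for lat, lon in points) / n
    return math.sqrt(mean_sq)
```

A device that never moves has a radius of gyration of zero, while a device that splits its day between two locations has a radius equal to half the distance between them.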

Measures of contacts among individuals
For the purpose of contact tracing, the CDC defined a close contact as someone who was "within 6 feet of an infected person for at least 15 minutes" [31]. Using this guidance, we operationalize the definition of contact as two devices being within the same 8-digit geohash (a tile of approximately 38m × 20m) for at least 15 minutes (see Materials & Methods), and we define two measures quantifying contact mixing between individuals. Even though defining contacts in this way can be noisy or imprecise due to the spatial resolution considered, we show that the measures introduced in this section positively correlate with key epidemiological indicators, e.g., new deaths (see Section 2.3).
Number of distinct contacts outside home and work. As a first measure of social mixing, we considered the number of distinct contacts that a user has in a given day, outside of work or home. These contacts quantify the opportunity for disease transmission to/from distinct individuals, being at the same coffee place, interacting at a grocery store, and so on. On average, there was a dramatic decline in the number of distinct contacts that users had in a day, with the onset of this decline around March 11 (see Figure 1). Users in our panel had approximately 75% fewer distinct contacts per day by mid-April. Unique contacts increased steadily starting in May and through June, leveling off for the remainder of the summer at approximately a 40-50% reduction. This trend reflects a general loosening of physical distancing, consistent with reopening of businesses as well as increased comfort with outdoor gathering. Importantly, however, we do not see a full return to typical behavior, suggesting that even when faced with newly reopened amenities (shops, restaurants, etc.) people in the United States remained reluctant to return to pre-pandemic levels of social activity.
Average contact duration outside home and work. Characterizing effective contacts for disease transmission must take into account that the probability of transmission also increases with the duration of the contact. For this reason, we measured each user's average total duration of contacts with other users, based on how long their devices were located near each other. The total duration of contacts per day followed a similar pattern to the number of unique contacts. By mid-April, the duration of contacts was reduced by about 75% compared to typical behavior before social distancing measures took effect. Through May and June there was a steady increase up to about a 45% reduction from typical. The fact that total duration of contacts was reduced further than distinct contacts per day indicates that the increase in distinct users met is not always accompanied by an increase in time spent together. Again, some of this could be due to increased comfort with outdoor, socially distanced behavior, such as passing others on a walk through the park. Similar to the mobility range trends discussed earlier, throughout May and June there was a steady increase in contact events. However, the trend does not approach typical behavior by the end of July, instead hovering between 50% and 65% of typical.
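The two contact measures above can be illustrated with a minimal sketch. It assumes that each device's day has already been reduced to "stays" of the form (device, 8-digit geohash, start minute, end minute), with home and work stays removed; this input format, the function name, and the choice to apply the 15-minute threshold to each co-location episode separately are illustrative assumptions, not a description of the actual processing pipeline.

```python
from collections import defaultdict
from itertools import combinations

MIN_CONTACT_MINUTES = 15  # threshold taken from the CDC close-contact guidance


def daily_contacts(stays):
    """Return {device: (number of distinct contacts, total contact minutes)} for one day.

    `stays` is an iterable of (device_id, geohash8, start_min, end_min) tuples.
    Two devices are in contact when their stays in the same geohash tile
    overlap for at least MIN_CONTACT_MINUTES.
    """
    by_tile = defaultdict(list)
    for device, tile, start, end in stays:
        by_tile[tile].append((device, start, end))

    # Minutes of qualifying co-location for each unordered pair of devices.
    pair_minutes = defaultdict(float)
    for visits in by_tile.values():
        # Pairwise comparison within a tile; quadratic, but tiles are small.
        for (d1, s1, e1), (d2, s2, e2) in combinations(visits, 2):
            if d1 == d2:
                continue
            overlap = min(e1, e2) - max(s1, s2)
            if overlap >= MIN_CONTACT_MINUTES:
                pair_minutes[tuple(sorted((d1, d2)))] += overlap

    partners = defaultdict(set)
    minutes = defaultdict(float)
    for (a, b), m in pair_minutes.items():
        partners[a].add(b)
        partners[b].add(a)
        minutes[a] += m
        minutes[b] += m
    return {d: (len(partners[d]), minutes[d]) for d in partners}
```

Devices with no qualifying overlap are simply absent from the result, so a per-user average needs a separate count of active users as its denominator.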

Collective physical distancing in the contiguous United States
The COVID-19 pandemic has brought some of the most substantial disruptions to collective human behavior in living memory. The timeline of these behavioral disruptions is clearly visible in Figure 2, where we report an aggregate index for each state computed as the average percentage reduction with respect to the typical activity of the five mobility and proximity indicators previously introduced (i.e., the expectation E_v over the five measures v of the percent of typical activity). We can characterize four distinct phases of collective physical distancing behavior in the United States:
• Typical activity. From late January to late February, we observe a baseline period that defines the typical activity.
• Peak reductions. During this period we observe the dramatic reduction of all indicators following the federal and state mandates and mitigations.
• Heterogeneous reopenings. The time window from early May to late June was marked by states' reopening of businesses and schools according to different schedules and strictness in residual NPIs.
• New normal. From July onward, mobility range and inter-CSA transit increased to values comparable to pre-pandemic levels, while commuting flows (i.e., people going to work) and contact measures generally remained at values lower than typical pre-pandemic levels, characterizing a new stage for people living through a pandemic.
[Figure caption fragment: Percent of typical activity, 7-day rolling average (combined average: mobility, transit, and contacts measures). (Bottom) Heatmap of reductions in contacts and mobility, emphasizing key dates in every state and the key phases of the pandemic in the U.S. (reopening data from [23]).]
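As a rough illustration of the aggregate index described above, the sketch below averages the percent-of-typical series of the individual measures day by day and then smooths the result with a 7-day rolling mean. The function names, the equal weighting of measures, and the use of a trailing window (with shorter windows at the start of the series) are assumptions made for illustration.

```python
def combined_index(measures):
    """Average several percent-of-typical series day by day.

    `measures` maps a measure name to a list of daily percent-of-typical
    values; all lists are assumed to cover the same days in the same order.
    """
    series = list(measures.values())
    if not series:
        raise ValueError("at least one measure is required")
    n_days = len(series[0])
    if any(len(s) != n_days for s in series):
        raise ValueError("all measures must cover the same days")
    return [sum(s[t] for s in series) / len(series) for t in range(n_days)]


def rolling_mean(values, window=7):
    """Trailing rolling mean; the first days use however many values exist so far."""
    smoothed = []
    for t in range(len(values)):
        chunk = values[max(0, t - window + 1): t + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

For example, `rolling_mean(combined_index(state_measures))` would give one smoothed curve per state, where `state_measures` is a hypothetical dictionary holding that state's five series.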
During these four time frames, we see nationwide reductions in commuting volume to/from work. People's daily social routines changed dramatically as well, with daily mobility being reduced by up to 60% in April, along with approximately 80% fewer contacts with others per day at the peak of physical distancing. It remains a challenge to identify any single cause of these changes in behavior. However, when looked at together, they offer a way to characterize the evolution of our collective behavior, giving us a baseline for understanding how societies react to such a massive disruption. In the SI, we provide measure-specific curves for each state.

Commuting, working from home, and unemployment
Commuting volume decreased dramatically in early March and, in most states, did not increase in the same way that the other mobility/contact measures have. This is likely due to several reasons, from the historic waves of unemployment in the spring and early summer to a dramatic increase in teleworking. Indeed, throughout the COVID-19 pandemic in the U.S., we see a strong negative correlation between the percentage of teleworkable jobs and commute volume in metropolitan statistical areas (MSAs). For public health officials planning for future pandemics, this relationship between telework and commute volume is especially insightful; many of the NPIs that have been introduced throughout the pandemic were designed to limit the number of workplace infections, and as such, reductions in commute volume that are due to increased telework (as opposed to increased unemployment) could illustrate an ideal balance of economic and public health priorities.
During the COVID-19 pandemic, many employers eliminated in-person interactions, although many jobs in the United States cannot easily transition to remote work. In 2018, the U.S. Bureau of Labor Statistics estimated that almost 25% of workers could work from home [32], a number that varies widely by race, education level, and industry [33]. Despite widespread video-conferencing and teleworking software, it is difficult to quantify the ubiquity of these practices across the United States, though major Internet Service Providers reported traffic increases of 20-30% [34] early on in the pandemic; various surveys have been conducted trying to estimate this number as well [35]. Also, crucially, the typical commuting patterns of millions of people in the United States were impacted by an unprecedented surge in joblessness; over 42 million unemployment claims were filed in the United States between March and June [36].
Our measurements allow us to explore the relationships between commuting, unemployment, and teleworking by combining our commute metric with two additional data sources: the Bureau of Labor Statistics's Local Area Unemployment Statistics (LAUS) dataset, which provides monthly estimates of county-level unemployment rates; and Dingel and Neiman's estimates of the proportion of jobs in an area that can be feasibly worked from home [33]. The Dingel and Neiman teleworkability estimates are at the Metropolitan Statistical Area (MSA) level, and the LAUS data at the county level. For this reason, in our analysis, we aggregate our measurements to the MSA level, excluding rural counties. Note that while local unemployment and commute volume vary month-to-month, the estimated proportion of teleworkable jobs is largely a static quantity.
In Figure 3, we present the relationship between commute volume, unemployment, and teleworkability in February, April, and July, with each month corresponding to a distinct phase of the pandemic. In February, the baseline period that we use to define "typical" activity, there is no correlation between the percent of typical commutes in MSAs and the percent of teleworkable jobs, nor is there a correlation between commuting and unemployment at that point (Figure 3a and 3d). This is expected, as the United States had not yet experienced large disruptions resulting from the COVID-19 crisis. Then, in April, following the updated guidelines about physical distancing and the massive surges in unemployment, commuting volume dropped substantially across the United States (on average, to about 40% of baseline levels; Figures 1a and 3b). During this lockdown period, the percentage of jobs that can transition to telework shows a -0.533 correlation with commute volumes (Figure 3b); we also observe a significant negative correlation between commute volume and unemployment rate during this period (Figure 3e).

Collective physical distancing in rural and urban areas
We also observe different levels of collective physical distancing in different parts of the country, which reflects the heterogeneity in policy response, disease incidence, geography, and population structure across the US (see e.g. [37]). By grouping our collective physical distancing measures with the National Center for Health Statistics (NCHS) urban-rural county classification scheme [38], we can compare the responses of people living in urban versus rural settings. We observe that large central metro (code 1) areas showed the largest reductions in our collective physical distancing measures for contacts and mobility range (Figure 4a-c); the more rural the county, the less reduction from typical behavior. However, users living in more urban counties also had lower baselines for these measures (Figure 4d-f), which we show using a standardized index as opposed to the "percent of typical" values shown in Figure 4a-c. As a point of comparison, by the beginning of April, the median reduction of the number of distinct contacts for users in large/medium metro counties approached a level similar to the baseline of a typical user in rural, micropolitan counties (Figure 4f).
More-urban areas also began physical distancing behaviors earlier than more-rural areas; for example, large central and large fringe metro areas dipped to 80% of typical for their contact measures about five days earlier than small metropolitan areas, micropolitan areas, and non-core areas. During the first week of May, micropolitan and non-core counties showed mobility range that was around 75% of typical, while large central and fringe metro areas remained at around 50% of typical (Figure 4a). A similar rural-urban gap is seen in the percent of typical behavior for both measures of contacts (Figure 4b-c).

Collective physical distancing and the toll of COVID-19
Lastly, we validate the use of the proximity/contact measures introduced in this manuscript as coarse-grained approximations for true person-to-person contacts. As such, we would expect to find a positive correlation between these measures and key epidemiological indicators, such as new deaths. More specifically, we would expect that a lagged correlation would best capture this relationship since it accounts for: a) the time from exposure to symptom onset (about 6 days); b) the median number of days from symptom onset to death (between 13 and 17 days depending on the age group considered); and c) the median number of days from death to reporting date (varying from 19 to 21 days depending on the age group) [39]. Because of these factors, the median delay we would expect in the correlation between our proxies for contacts and new deaths is in the range 38-44 days. Nationwide, we see that this is indeed the case.
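The expected range of delays follows from adding the three components listed above, taking the lower and upper ends of each interval. This is a back-of-the-envelope sum of the reported values rather than a formal derivation of a median:

```latex
\begin{aligned}
d_{\min} &= 6 + 13 + 19 = 38 \ \text{days},\\
d_{\max} &= 6 + 17 + 21 = 44 \ \text{days}.
\end{aligned}
```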
In Figure 5, we plot the nationwide percent of typical average contact duration (a) and distinct contacts (b) against the daily number of new deaths per 100,000 people nationwide. The contact measures are correlated at a delay d ∈ [37,44] days for each measure, which was selected by maximizing the correlation between contact patterns and new deaths. We do the same comparison for the number of new infections in the SI (Figure A.10), highlighting the robustness of this correlation nationwide.
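Selecting the delay by maximizing the correlation can be sketched as a simple scan over candidate lags. The example below is a minimal illustration that assumes two equally long daily series and uses the Pearson correlation; the function names and the choice of Pearson rather than another correlation coefficient are assumptions, since the text does not specify them.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences (nan if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def best_lag(contacts, deaths, lags):
    """Return (lag, correlation) maximizing corr(contacts[t], deaths[t + lag]).

    `contacts` and `deaths` are daily series aligned on the same start date.
    """
    best = (None, float("-inf"))
    for lag in lags:
        x = contacts[: len(contacts) - lag] if lag > 0 else contacts
        y = deaths[lag:]
        n = min(len(x), len(y))
        if n < 3:
            continue  # too few overlapping days to be meaningful
        r = pearson(x[:n], y[:n])
        if not math.isnan(r) and r > best[1]:
            best = (lag, r)
    return best
```

In practice one would call something like `best_lag(contact_series, death_series, range(0, 61))` and check whether the maximizing lag falls inside the epidemiologically expected window.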
The color of the markers in Figure 5 [...] dates. Here, we see an important relationship between our contact measures and the course of the pandemic. Namely, as contact patterns increased in the early summer (lighter colored markers), new infections and new deaths followed; this, in turn, was followed by decreases in contact events, followed again by decreases in new infections by late August (darker colored markers). This is approximately the same time as when the curves in Figure 1b (the contact measures) started to level off, while mobility and inter-city transit continued to rise (Figure 1a). What this disconnect between mobility and contact patterns suggests is that our collective social behavior can reduce the rate of new infections and, as a result, new deaths. This finding is possibly trivial to epidemiologists and public health officials, but it is nonetheless important for our understanding of how our collective behavior impacts the trajectory of a pandemic and for validating our contact measures as proxies for true person-to-person contacts, and it is also consistent with other findings throughout the literature on COVID-19 [11][41][42][43]. The ability to measure these patterns in almost real time shows the potential benefits of using mobile device data in forecasting (or "nowcasting") the trajectory of a virus, and moving forward, they present a baseline for our collective behavioral response to future pandemics.
[Figure residue: panel (f); panel title "Distinct contacts (15+ min.)"; note: counties with population smaller than 2000 are excluded.]

Discussion
The massive efforts to comply with the CDC's physical distancing guidelines have come at a substantial cost to the economic and social well-being of people in the United States. By quantifying these nationwide behavioral changes, we get a glimpse into the relationship between large-scale collective behavior and the course of the pandemic. Learning from these patterns is necessary to prepare for future pandemics, most notably because, despite large-scale collective physical distancing, during the time window from February to December 2020, the United States reported over 13 million cases of COVID-19 and, as of December 2020, over 340,000 reported deaths (with estimates of the true number being far higher [44]). This suggests that in the pre-vaccine era, the timing, magnitude, and synchrony [45,46] of collective physical distancing in the United States were ultimately insufficient to completely mitigate the nationwide outbreak. This is especially true when physical distancing is not combined with a vigorous testing and contact tracing regimen [27], as was the case in countries like South Korea, Taiwan, and China. During the "new normal" period from July to December 2020, there were millions of new cases and hundreds of thousands of new deaths in the United States; during this same time period, we see mobility patterns return to 100% of baseline levels while contacts remained at around 65% of typical activity. This suggests two key things. First, a national average of approximately 65% of typical contacts was not sufficient for avoiding the large number of cases seen during that period. Further modeling efforts are needed to estimate the potential effects that larger decreases in contacts would have had (e.g., 60% or 50% of typical contacts instead of 65%). Second, this suggests that over the course of the pandemic, people may have learned to adapt their behavior in a way that allows them to travel while still limiting opportunities for contact with others. For example, visiting a park or hiking are activities that are likely associated with higher mobility but not necessarily more contacts. Indeed, in many cities across the United States, we see a relative rise in visits to parks [9] during this time period. Learning from this might inform goals or benchmarks for policy responses to this or future pandemics.
In this manuscript, we quantified the unprecedented behavioral response to COVID-19 in the first 9 months of the COVID-19 pandemic in the United States (collective physical distancing at a nationwide scale) using five different measures of mobility and contact patterns. By studying the daily mobility patterns of millions of anonymous mobile phone users, we show how people altered their typical behavior, limiting daily interactions with others to comply with policy interventions and in an effort to reduce their chances of becoming infected with the virus. Understanding precisely and quantifying how individuals' behavior changed over the course of the pandemic is critical, and in this work we present several measures that transform large-scale mobile device data into near real-time epidemiological insights. Of particular importance, the contact proximity measures introduced here correlate with the onset of new deaths nationwide; this correlation is maximized at a delay of 37-44 days, in line with the range reported by the CDC [39].
Recent work has shown that a more nuanced understanding of typical human mixing patterns can have dramatic effects on the spread of a disease and our models of the spread of a disease; it is particularly useful to understand age-based, setting-specific contact patterns within a population [29,47]. The current study is limited by the absence of these data, and in many ways traditional surveying methods may offer more robust estimates (see [47]). However, the measures of collective physical distancing behavior that we introduce can potentially be generalized by using differences in Census tract age distributions to estimate (on aggregate) age-specific mobility and contact reductions. Lastly, we quantify contacts based on geographic proximity and we do not attempt to link locations to information about the setting where these contacts take place (e.g., at a restaurant, workplace, park, etc.); this information is particularly relevant because the odds of disease transmission are much higher with contacts in closed spaces compared to open-air environments [28,48]. This can be addressed by measuring contact events within a pre-identified list of points-of-interest.
Our lives and everyday activities have been fundamentally reshaped during the COVID-19 pandemic. Defining and quantifying the collective physical distancing that took place over the course of the COVID-19 pandemic becomes especially vital when planning mitigation strategies for the future.

Description of data sources
Mobile device data. Mobility data are provided by Cuebiq Inc., a location intelligence and measurement company. Through its Data for Good program (https://www.cuebiq.com/about/data-for-good/), Cuebiq Inc. provides access to aggregated and privacy-enhanced mobility data for academic research and humanitarian initiatives. These first-party data are collected from users who have opted in to provide access to their GPS location data anonymously, through a GDPR-compliant framework. In order to preserve users' privacy, Cuebiq Inc. adds noise to users' "personal areas" (i.e., home and work locations) by upleveling the coordinates of these areas to the centroid of their corresponding Census block group [49]. This allows for demographic analysis while obfuscating the true home and work location of anonymous users and preventing misuse of data.
Demographic and employment data County-level demographic data, including the rural-urban designation as well as demographic data used for statistical corrections in the SI are from the United States Census and the American Community Survey (https://www.census.gov).County-level unemployment data are from the United States Bureau of Labor Statistics.Data about the percent of teleworkable jobs are from Dingel and Neiman [33], where they estimate the percent of jobs that can transition to telework for a given Core-Based Statistical Area based on the occupation distribution within the each region.
State-level COVID-19 testing data and reopening data Data about the COVID-19 testing and cases are from the COVID Tracking Project [40], which compiles data directly from state health authorities.Data about the dates that states initially began to reopen was collected from the New York Times [23].

Collective physical distancing measures
Below we define the different mobility and contact measures used in this work. We convert each of these measures to a per-user measure by dividing the cumulative value of each measure at each spatial resolution (e.g., a state) by either the number of users with "home" locations in that region (for commute volume, inter-CSA transit, and mobility) or the number of users with contacts in that region (for average contact duration and number of distinct contacts). In addition, in this manuscript we report the metrics as a percent of typical activity. We select the period between January 16 and February 28, excluding holidays, as our baseline, thereby defining what constitutes typical behavior in the context of this work. Then, for each measure, we divide its daily value by the baseline average for the corresponding day of the week (i.e., Mondays are compared to the average baseline Monday). In other words, values of 100% denote typical behavior.
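The normalization to percent of typical activity can be illustrated with a short sketch. It is not the authors' code: the function name, the input layout (a dictionary from calendar date to per-user value), and the choice of 2020 as the baseline year are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Baseline window from the text; the year 2020 is an assumption of this sketch.
BASELINE_START = date(2020, 1, 16)
BASELINE_END = date(2020, 2, 28)


def percent_of_typical(daily_values, holidays=frozenset()):
    """Express each daily per-user value as a percent of its weekday baseline.

    daily_values: dict mapping datetime.date -> per-user value of one measure.
    holidays: dates excluded from the baseline (hypothetical input).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily_values.items():
        in_baseline = BASELINE_START <= day <= BASELINE_END
        if in_baseline and day not in holidays:
            sums[day.weekday()] += value
            counts[day.weekday()] += 1
    # Average baseline value for each day of the week (0 = Monday).
    baseline = {wd: sums[wd] / counts[wd] for wd in sums}
    return {
        day: 100.0 * value / baseline[day.weekday()]
        for day, value in daily_values.items()
        if day.weekday() in baseline
    }
```

With this convention, a Monday in April whose per-user value is half of the average baseline Monday is reported as 50%.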

Estimating daily commute volume
Cuebiq Inc. provides a list of obfuscated "personal areas" for each user. Observations geolocated within these areas are deemed to come from either the home or the work location of the individual and are therefore up-leveled to preserve user privacy. That is, these coordinates are aggregated to the centroid of the Census block group that each observation falls into. In order to quantify the changes in commuting behavior to and from work, we classify personal areas as home or work locations so that we can count commute flows. In particular, we consider the most commonly visited personal area during nighttime hours (9:00pm-5:00am) to be the home location of the user, while the most common non-home personal area visited during daytime hours (9:00am-5:00pm) is classified as the work location of the user. This method is imperfect (e.g., it may misclassify users who exclusively work night shifts), but it is based on assumptions about the typical worker in the United States. Then, one commute is defined as a user visiting both their "home" and their "work" location on a given day. Lastly, in this study we take as reference the definition and location of personal areas as identified in the period immediately prior to the lockdown measures. Therefore, our commute metric reflects changes with respect to the status quo that existed prior to the COVID-19 pandemic.
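The home/work classification rule and the commute definition can be sketched as follows. This is an illustrative reading of the rule, not Cuebiq's or the authors' implementation; the input format (a list of personal-area identifiers paired with the local hour of each observation) and both function names are hypothetical.

```python
from collections import Counter


def classify_personal_areas(visits):
    """Return (home, work) for one user; either may be None.

    visits: list of (area_id, hour_of_day) pairs, hour_of_day in 0..23.
    Home is the most visited personal area at night (9pm-5am); work is the
    most visited non-home personal area during the day (9am-5pm).
    """
    night = Counter(area for area, hour in visits if hour >= 21 or hour < 5)
    home = night.most_common(1)[0][0] if night else None
    daytime = Counter(
        area for area, hour in visits if 9 <= hour < 17 and area != home
    )
    work = daytime.most_common(1)[0][0] if daytime else None
    return home, work


def made_commute(areas_visited_today, home, work):
    """A commute is counted when both home and work are visited that day."""
    if home is None or work is None:
        return False
    return {home, work} <= set(areas_visited_today)
```

Ties between equally visited areas are broken by first occurrence here, which is a design choice of the sketch rather than something the text specifies.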

Estimating inter-CSA transit
As described previously, we estimate the change in inter-CSA transit by calculating the number of anonymous users who visited at least two separate CSAs in a single day. Such inter-CSA transit could stem from long-range commutes, from travel (e.g., on federal holidays, such as Presidents' Day in February, we observe a spike in inter-CSA transit, suggesting tourism or vacation), or from other miscellaneous transit including, for example, airline travel, train or bus trips, or long-range road trips.
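As a small illustration of this count, the sketch below assumes that each observation has already been mapped to a CSA identifier; the function name and the (user, CSA) input layout are hypothetical.

```python
from collections import defaultdict


def count_inter_csa_users(daily_observations):
    """Count users observed in at least two distinct CSAs on one day.

    daily_observations: iterable of (user_id, csa_id) pairs for a single day.
    """
    csas_by_user = defaultdict(set)
    for user_id, csa_id in daily_observations:
        csas_by_user[user_id].add(csa_id)
    return sum(1 for csas in csas_by_user.values() if len(csas) >= 2)
```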

Estimating individual mobility using the radius of gyration
As defined in [30], the radius of gyration characterizes the extent of a given user's trajectory in a single day. It measures the root mean square distance from the trajectory's center of mass to the locations reached that day. Formally,

$$r_g = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \lVert \mathbf{r}_i - \mathbf{r}_{cm} \rVert^2},$$

where $n$ is the user's number of observations on that day, $\mathbf{r}_i$ is the $i$-th observed position of the user, $i = 1, 2, \ldots, n$, and $\mathbf{r}_{cm} = \sum_i \mathbf{r}_i / n$ is the center of mass of the trajectory. A larger radius of gyration corresponds to a trajectory with positions that are far away from the trajectory's average position. In the current context, a smaller radius of gyration indicates that a user travels a shorter distance away from their daily average position in a city. In order to compute typical mobility within a given region, we sum the daily radii of gyration of the users in that region. The radius of gyration can be interpreted as the dispersion (the square root of the variance) of a user's positions around the day's center of mass; as with most measures defined here, there exist other measures of an individual's typical path deviation over multiple days that may provide additional insight [50].
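A minimal sketch of the daily radius of gyration, taken here as the root mean square distance of a user's positions from their center of mass (the usual definition). It assumes positions are already projected to planar coordinates in a common unit such as kilometers; raw latitude/longitude pairs would need a projection or a great-circle distance instead.

```python
import math


def radius_of_gyration(points):
    """Root mean square distance of the points from their center of mass.

    points: non-empty list of (x, y) planar coordinates for one user-day.
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one observation is required")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq_dist = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq_dist)
```

A user who stays at one location all day has a radius of gyration of zero, while a user who splits their observations evenly between two places 2 km apart has a radius of 1 km.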

Estimating daily contacts outside personal areas
The method for estimating contacts outside personal areas (i.e., outside users' home and work locations) is as follows. For each location event (also known as a "ping") recorded, we associate its longitude-latitude coordinates with an 8-character geohash. A geohash is a short string of letters and digits that encodes coordinates into a hierarchical spatial data structure that tessellates the world's surface into a grid. In our case, we consider geohashes at an 8-character resolution, which encode rectangular cells of approximately 38m × 19m at the equator [51]. We define two users to be co-located if they are observed in the same geohash for at least 15 consecutive minutes. Note that this duration can be set arbitrarily, but we use 15 minutes following CDC guidance [31]. For each user, we compute the number of unique users that their device is in contact with during a single day and the total dwell time of the daily contacts that fit the above criteria. We average these values across users in a given region to arrive at county, state, and nationwide daily average contact duration and average number of distinct contacts.

We use sampling weights, which allow us to create more statistically representative and, thus, more generalizable inferences from our data at the state and national levels of aggregation. Specifically, when aggregating the indexes at the state or national resolutions, we weight the mobility and proximity metrics computed for each county (e.g., commutes) not only by the number of users present in our panel for a given location but also by a weight that corrects for the potential selection bias resulting from under- or over-sampling of users with certain sociodemographic characteristics. The weights are estimated using the method outlined in [3], which allows us to estimate the parameters of a target population using data from a potentially biased sample, provided that the determinants of the selection bias are available for both the target population and the sample. In the context of this study, we are estimating mobility and contact patterns of the U.S. population (i.e., the target population) using our panel data. We assume that users' socio-demographic characteristics influence their probability of inclusion in the Cuebiq sample. Therefore, we use information about the distribution of these demographics in both the U.S. population and our panel of users to compute bias-reducing weights. The schematic of the adopted statistical procedure is provided in Figure A.4 and described in the following.
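The co-location rule used for the contact measures (two users in the same 8-character geohash for at least 15 consecutive minutes) can be sketched as below. The sketch assumes that pings have already been collapsed into "stays", each a (user, geohash, start minute, end minute) tuple for one day; that preprocessing step, the tuple layout, and the function name are assumptions made for illustration, not the authors' pipeline.

```python
from collections import defaultdict
from itertools import combinations

MIN_OVERLAP_MINUTES = 15  # threshold taken from the text (CDC guidance)


def daily_contacts(stays):
    """Return {user: (distinct_contacts, total_contact_minutes)} for one day.

    stays: iterable of (user_id, geohash8, start_min, end_min), where the
    minutes are counted from midnight and the geohash has 8 characters.
    Users without any qualifying contact are omitted from the result.
    """
    by_cell = defaultdict(list)
    for user, cell, start, end in stays:
        by_cell[cell].append((user, start, end))

    partners = defaultdict(set)
    minutes = defaultdict(int)
    for visits in by_cell.values():
        for (u, s1, e1), (v, s2, e2) in combinations(visits, 2):
            if u == v:
                continue
            # Length of the time window both users spent in the same cell.
            overlap = min(e1, e2) - max(s1, s2)
            if overlap >= MIN_OVERLAP_MINUTES:
                partners[u].add(v)
                partners[v].add(u)
                minutes[u] += overlap
                minutes[v] += overlap
    return {u: (len(partners[u]), minutes[u]) for u in partners}
```

The pairwise comparison within each cell is quadratic in the number of stays per cell, which is acceptable for a sketch but would need an interval-sweep approach at panel scale.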

Additional information
[Figure A.4 schematic. Weights estimation: (1) Sampling: sample sociodemographic characteristics from the ACS data; for each tract, draw the same number of samples (synthetic users) as there are Cuebiq users. (2) Aggregation: aggregate samples at the county level to create a synthetic county population (e.g., Synthetic County A). (3) Estimation: estimate weights for each county using a generalized linear model (GLM logit) to reduce differences in the distributions of socio-demographics between synthetic and true counties when aggregated at a national level. (4) Repeat steps 1-3 10,000 times for each county and select the average weight, $w_i$, for each county.]

First, we associate each user with a Census tract by using the location of their home personal area (see Section 4.2.1 in Materials & Methods). This allows us to assign to each user a probability distribution of their socio-demographic characteristics by looking at the empirical distributions reported for each Census tract in the 2014-2018 5-year American Community Survey (ACS) data [4]. Second, we create a synthetic population of users in each Census tract, which is a sub-unit of a county, using tract-level data from the ACS. The synthetic population's size in each tract is determined by the number of panel users that are assigned to that tract. Then, to each user we randomly assign age, sex, race, educational attainment, and earnings by sampling their values from the census data. After generating one synthetic population, we compute the mean age, the proportion of males, the mean earnings, the proportion with a college degree or higher, and the proportion of white users for each synthetic county. This process is then repeated 10,000 times, thereby generating 10,000 synthetic datasets. Third, we use a generalized linear model (GLM) with a binomial family and a logit link function to estimate the probability that a given county is a synthetic county (as generated from the sample of panel users) or a "census" county, for which we directly use the census mean values of the different indicators. In other words, we treat this as a classification problem in which our regression uses the computed county-specific socio-demographic summary statistics to predict whether a county has been simulated using the Cuebiq sample or not. The intuition behind this procedure is that if our sample is unbiased in terms of demographics, then the demographic information should not allow us to predict whether a county is a synthetic county or a "census" county (in which case, none of the estimated β coefficients should be statistically significantly different from zero). Conversely, if the sample is biased, demographics will produce meaningful predictions.
Using this family of GLMs in the process of reducing sampling bias is standard in well-established techniques such as inverse probability weighting and propensity score matching [2,5]. To estimate county-specific weights that reduce bias at the national level, accounting for state effects, we use the following model specification:

$$\operatorname{logit}(p_{synthetic}) = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{college} + \beta_3\,\text{white} + \beta_4\,\text{earnings} + \beta_5\,\text{male} + \text{state}_s,$$

where age denotes the county mean age, college denotes the proportion of the total population having a college degree or higher, white denotes the proportion of the total population being white, earnings denotes the average earnings, male denotes the proportion of males in a county, and $\text{state}_s$ is a state-specific fixed effect. All variables are z-score standardized. Lastly, we fit our statistical model to each one of the 10,000 synthetic datasets, compute the probability that a given county is a "synthetic" county ($p_{synthetic}$), and convert it into a county-specific weight $w^i_c$ following [3], where $i$ denotes the dataset used and $c$ denotes the county. We then obtain our final county-specific weight $w_c$ as the average of all the estimated weights:

$$w_c = \frac{1}{I}\sum_{i=1}^{I} w^i_c,$$

where $I = 10{,}000$ (see Figure A.3).
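The final averaging step and the use of the county weights in state-level aggregation can be sketched as follows. How the number of panel users and the bias-correcting weight are combined is not spelled out in the text; multiplying the two, as done here, is an assumption of this sketch, and both function names are hypothetical.

```python
def average_weight(weights_across_datasets):
    """Average the per-dataset weights of one county into its final weight."""
    if not weights_across_datasets:
        raise ValueError("at least one weight is required")
    return sum(weights_across_datasets) / len(weights_across_datasets)


def aggregate_counties(counties):
    """Weighted average of a per-user county metric over a state or nation.

    counties: list of (metric_per_user, n_panel_users, county_weight).
    Assumption: each county contributes in proportion to users * weight.
    """
    total = sum(n_users * weight for _, n_users, weight in counties)
    if total == 0:
        raise ValueError("total weight must be positive")
    weighted_sum = sum(
        metric * n_users * weight for metric, n_users, weight in counties
    )
    return weighted_sum / total
```

For two counties with equal panel sizes, a county with three times the weight pulls the aggregate three times as strongly toward its own value.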

A.1.3 Sensitivity Analysis
To evaluate the effects of the weighted resampling procedure, here we replicate some of the results presented in the main text using the unweighted panel. In particular, in Figure A.5 we show the changes in mobility and contacts over time, and in Figure A.6 we report the correlation between our measures of collective physical distancing and new deaths. In both cases, results are in line with the ones obtained using the weighted sampling procedure.
In Figure A.7, instead, we show the effect of using the unweighted versus the weighted panel in computing the time series of the average collective physical distancing (as defined in Figure 2) for each state. While there are slight differences in some states, the overall picture and the insights from our analysis do not change.

A.2 Correlating physical distancing measures across datasets
In this work, we use data from Cuebiq Inc., but one feature of the COVID-19 pandemic is that mobile providers and other large technology companies have been providing access to aggregated measures of mobility and contacts. For this reason, we include here a series of correlations between the measures studied here and those from a number of other platforms. As a proof-of-principle validation, the measures we include strike a key balance between correlating with existing publicly available mobility measures (e.g., Google's "residential" measure negatively correlates with each of our measures, which makes sense, as we do not use location pings from within users' home locations) and still providing unique information. The datasets included in Figures A.8 and A.9 are from: Google (https://www.google.com/covid19/mobility/), Apple (https://covid19.apple.com/mobility/), PlaceIQ (https://www.nber.org/papers/w27560), Waze (https://www.waze.com/covid19), and the U.S. Bureau of Transportation Statistics (https://www.bts.gov/covid-19).
Broadly, there is correspondence between the measures introduced here and those used by Apple, Google, PlaceIQ, and the U.S. Dept. of Transportation (Figure A.8). The measures that we expect to be highly correlated are indeed highly correlated: for example, Google's "workplace" measure and our commute volume have a Pearson correlation of 0.87. Similarly, our mobility measure is highly correlated with Google's and Apple's "transit" measures, and it is negatively correlated with Google's "residential" measure.
Another point of validation can be seen when comparing the various time series of activity (Figure A.9). For major holidays, when we would expect movement to be disrupted, we see broad alignment between the various measures. For example, in early September (Labor Day), we see an increase in the Dept. of Transportation's "Number of trips 100 miles+" measure; we see the same spike in our inter-CSA transit measure.
There are endless ways to compare the myriad measures of mobility that have been studied during the COVID-19 pandemic. Despite this, the measures included in this work strike a balance between offering a novel, informative lens for understanding collective physical distancing and corresponding closely to measures that have already been proposed in the literature.

A.3 Correlating contact patterns with new positive tests
The contact measures introduced in this manuscript are meant to be used as a lower-resolution approximation of true person-to-person contacts in the U.S. population. As we showed in Section 2.3, the average contact duration and distinct contact measures are both positively correlated with (lagged) new reported deaths. In Figure A.10, we show that this pattern also holds for new positive tests at the national level and for most states (data from the COVID Tracking Project [6]). Again, we plot the lagged correlation that maximizes the R-squared; in this case, the lag that maximizes the R-squared for average contact duration is d = 9 days, while for distinct contacts it is d = 14 days. While testing data are typically noisier and depend on local testing policies, the presence of these positive correlations between contact patterns and lagged new cases again suggests that the measures used in this work can serve as coarse indicators of large-scale human behavior.
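The selection of the lag d can be illustrated with a short sketch that pairs the contact measure on day t with the outcome on day t + d and keeps the lag with the largest squared Pearson correlation. The function names and the plain-list inputs are assumptions; the authors' exact fitting procedure (for example, any smoothing applied beforehand) is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(contacts, outcome, max_lag):
    """Return (lag, r_squared) maximizing the squared lagged correlation.

    contacts[t] is paired with outcome[t + lag] for lag in 0..max_lag.
    """
    best = None
    for lag in range(max_lag + 1):
        ys = outcome[lag:]
        xs = contacts[: len(ys)]
        ys = ys[: len(xs)]
        if len(xs) < 3:
            break  # too few overlapping days to correlate
        r_squared = pearson(xs, ys) ** 2
        if best is None or r_squared > best[1]:
            best = (lag, r_squared)
    return best
```

For a simple linear regression with one predictor, the R-squared equals the squared Pearson correlation, which is why the two criteria select the same lag in this sketch.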

Figure 1: Changes in mobility and person-to-person contacts over time. Graphs show deviations from typical behavior for the same weekday in the United States. (a) Mobility: Individual mobility (radius of gyration), commute volume, and inter-CSA transit. (b) Contacts: Number of distinct contacts and average contact duration for events outside of work and home. By the national declaration of emergency (March 13), reductions in spatial mobility measures had begun, reaching approximately 50% of typical values by April 1, while contact measures show a reduction greater than 75% by the same date. A 7-day rolling average is shown alongside each measure. Grey vertical lines denote weekends.

Figure 2: The phases of collective physical distancing in the United States. (Top) County-level maps of collective physical distancing, with each county colored by an average of its typical daily commute volume, individual mobility range, inter-CSA transit, unique contacts outside of home and work, and total duration of contacts for the time frame listed. (Bottom) Heatmap of reductions in contacts and mobility, emphasizing key dates in every state and the key phases of the pandemic in the U.S. (reopening data from [23]).

(Figure A.11) and for several major metropolitan areas (Figure A.12 to Figure A.28).

Figure 3: Unemployment, teleworking and commuting patterns. Grouping county-level employment data to the Metropolitan Statistical Area (MSA), we correlate commute volume with the percent of jobs that can readily transition to teleworking (top row) and unemployment rate (bottom row) over time. (a & d): February, during the baseline period; (b & e): April, during the peak lockdown; (c & f): July, after unemployment declined but commuting remained low, during the "new normal" phase.

Figure 4: Differences in county-level behavior based on rural-urban codes. Each county in the United States is assigned a rural-urban code, ranging from 1 (large central metro) to 6 (highly rural, "non-core" counties). We average the percent of typical behavior per user (top row) and a standardized index (bottom row) across counties grouped by these six rural-urban code designations. The standardized index obscures raw values but preserves relative differences between groups; we compute it by normalizing by the median value across all counties. (a & d): mobility range; (b & e): contact duration; (c & f): distinct contacts. Seven-day rolling averages are plotted in bold above raw values plotted as thin curves.

Figure 5: Collective physical distancing and new deaths. Here we correlate daily contact measures nationwide with new reported deaths [40] between April 30 and November 5, 2020. The horizontal axes correspond to the percent of typical contact patterns, while the vertical axes correspond to the (lagged) number of new deaths per 100,000. (a) Average contact duration. (b) Distinct contacts. A lag of d days was selected for each state so as to maximize the correlation between new deaths and contact measures. Maximum correlation is observed at d ∈ [37, 44] days (d = 44 is visualized), which is consistent with CDC estimates [39] that account for disease dynamics and reporting delays. In each subplot, darker colors indicate later dates and marker size corresponds to an estimate of the median effective reproductive number ($R_t$) across all 50 states and the District of Columbia (source: rt.live). These contact measures are also positively correlated with new reported cases (but at a shorter lag, see SI Figure A.10).

Figure A.1: Panel membership coverage. By inferring the counties of users' "home" personal areas from the data, we can see the extent to which we are over- or under-representing users on a per-county basis. (a) Histogram of the fraction of the population included in our panel of users for each county. (b) Scatterplot correlating the number of home users in a county against the total population of the county.

Figure A.3: County-level weights. Distribution of the county-specific sampling weights $w_c$.

Figure A.4: Schematic of statistical procedure for assigning county-level weights. Through repeatedly simulating synthetic populations at the census tract level (based on the number of Cuebiq users with "home" personal areas in each census tract), we assign weights to the county level in such a way that minimizes the bias with respect to demographic variables of interest. After 10,000 simulations, we select the average weight for each county.

Figure A.5: Changes in mobility and person-to-person contacts over time (unweighted panel). Graphs show deviations from typical behavior for the same weekday in the United States, using the unweighted panel. (a) Mobility: Individual mobility (radius of gyration), commute volume, and inter-CSA transit. (b) Contacts: Number of distinct contacts and average contact duration for events outside of work and home. By the national declaration of emergency (March 13), reductions in spatial mobility measures had begun, reaching approximately 50% of typical values by April 1, while contact measures show a reduction greater than 75% by the same date. A 7-day rolling average is shown alongside each measure. Grey vertical lines denote weekends.

Figure A.6: Collective physical distancing and new deaths (unweighted panel). Replication of the analysis in Figure 5 using the unweighted panel to correlate daily contact measures nationwide with new reported deaths [6] between April 30 and November 5, 2020.

Figure A.7: Collective distancing: weighted vs unweighted. For each state we report the aggregate measure of collective physical distancing, defined as the average of the typical daily commute volume, individual mobility range, inter-CSA transit, unique contacts outside of home and work, and total duration of contacts for the time frame listed, using the weighted panel (solid line) and the unweighted panel (dashed line).

A.4 Collective physical distancing at state and metropolitan levels

Figure A.11: Collective physical distancing across every state. Grid cartogram including the five measures shown in Figure 1, for all 50 states and the District of Columbia.

Figure A.12: Atlanta-Athens-Clarke County-Sandy Springs, GA-AL.

Figure: Miami-Port St. Lucie-Fort Lauderdale, FL.