Abstract
Social distancing, defined as maintaining a minimum interpersonal distance (often 6 ft or 1.83 m), is a non-pharmaceutical intervention to reduce infectious disease transmission. While numerous quantitative studies have examined people’s social distancing behaviors using mobile phone data, large-scale quantitative analyses of adherence to suggested minimum interpersonal distances are lacking. We analyzed pedestrians’ social distancing behaviors using 3 years of street view imagery collected in a metropolitan city (Seattle, WA, USA) during the COVID-19 pandemic. We employed computer vision techniques to locate pedestrians in images, and a geometry-based algorithm to estimate physical distance between them. Our results indicate that social distancing behaviors correlated with key factors such as vaccine availability, seasonality, and local socioeconomic data. We also identified behavioral differences at various points of interest within the city (e.g., parks, schools, faith-based organizations, museums). This work represents a first of its kind longitudinal study of outdoor social distancing behaviors using computer vision. Our findings provide key insights for policymakers to understand and mitigate infectious disease transmission risks in outdoor environments.
Citation: Martell M, Salazar C, Errett NA, Miles SB, Wartman J, Choe JY (2024) Outdoor social distancing behaviors changed during a pandemic: A longitudinal analysis using street view imagery. PLoS ONE 19(12): e0315132. https://doi.org/10.1371/journal.pone.0315132
Editor: Kenju Akai, Shimane Daigaku, JAPAN
Received: April 7, 2024; Accepted: November 20, 2024; Published: December 5, 2024
Copyright: © 2024 Martell et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All images collected throughout this longitudinal study are available on mapillary.com under the username ‘uwrapid’. Full instructions and code on our pedestrian detection process are available at https://github.com/marte292/rapid-data-pipeline. Full code for the social distance estimation is available at https://github.com/salezaraus/SocDistAlgo. The processed output necessary to reproduce the regression analyses in this paper is within the supporting files.
Funding: The U.S. National Science Foundation (Grant Number 2031119) provided financial support for this research. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF. Data was collected using instrumentation provided by NSF as part of the RAPID Facility, a component of the Natural Hazards Engineering Research Infrastructure, under Award No. CMMI: 2130997. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The COVID-19 pandemic prompted unprecedented responses from national and local governments worldwide, including emergency declarations and the implementation of social distancing and mask-wearing measures to mitigate virus transmission [1]. In the United States, following the first confirmed case in Washington state in January 2020 [2], the federal government swiftly enacted restrictions. By March 2020, gatherings of more than 10 people were discouraged, and remote work was strongly advised. Concurrently, Washington state imposed more stringent measures, including the closure of schools, dine-in restaurants, and entertainment venues, coupled with a statewide stay-at-home order. The state later mandated face masks in public spaces in June 2020. The United States Centers for Disease Control and Prevention (CDC) recommended a physical distance of at least six ft both indoors and outdoors to prevent the spread of the virus. During the early days of the pandemic, some of the largest superspreading events in the US occurred when these recommendations were violated by large groups of people [3]. The CDC recommendations were not relaxed until August 2022, although most states officially ‘reopened’ well before then, including Washington State on June 30, 2021.
While it is known that it is easier for the disease to spread indoors when compared to outdoors, there are still many examples of outdoor transmission occurring, sometimes at a large scale [4–6]. Studies that have attempted to understand how outdoor transmission occurs and what factors reduce transmission risk showed that there is a strong risk of outdoor transmission under the right circumstances. Hang et al. [7] studied outdoor transmission in a street canyon environment and showed that outdoor transmission risk is highest in downwind areas. Fan et al. [8] likewise studied outdoor transmission in a street canyon environment using simulation methods. They recommended an outdoor distance of 4 meters when relative humidity and winds are low. Lastly, Kia et al. [9] studied outdoor transmission risk on the University of Houston campus, focusing on areas where air is not quickly ventilated. They showed that there were ‘hot spots’ on campus where air could take as long as 1,000 seconds to ventilate. Given the risk of outdoor transmission, it is useful to understand the public’s outdoor social distancing behaviors during the pandemic. This can help governments craft targeted policies based on known information to prevent the spread of future infectious diseases.
Though many agencies worldwide issued similar recommendations regarding preventative measures against COVID-19 transmission, compliance has varied, as evidenced by observational studies. For example, Rahimi et al. [10, 11] conducted a comprehensive assessment of the impact of socioeconomic status on face mask usage among pedestrians during the pandemic in Ahvaz, Iran. Additionally, Davis and Esposito [12] revealed that social disparities, including income, education, and race, significantly reduce social distancing behaviors in diverse and divided communities. These observations have spurred further research, such as the study by Hoeben et al. [13], which used CCTV (closed-circuit television) footage from the Netherlands to demonstrate that adherence to the 1.5-meter distancing guideline initially met compliance but diminished over time. This variability in adherence has prompted researchers to explore various aspects of public compliance with social distancing measures.
Additionally, there are many studies that analyze community mobility and congregating behaviors using mobile phone data or self-reported survey data [1, 14–18]. One finding from these studies is that, after getting vaccinated, pedestrians were less likely to strictly follow COVID-19 risk-reducing behaviors, including avoiding congregating [18, 19]. Another is that there are gaps in social distancing behaviors between different socioeconomic and racial groups [14]. While the findings of the above studies are extremely useful for understanding people’s congregating behaviors, there is a gap in understanding how people are adhering to interpersonal distance recommendations. Instead of measuring physical distances, these studies typically use foot traffic at specific locations of interest, and survey responses on social distancing behaviors, as measures of adherence. While efforts are made to avoid it, there are known representation issues associated with mobility data captured by cell phones [20, 21], and sampling bias associated with surveys.
While the aforementioned research utilized indirect measurements to study social distancing behavior, other studies have employed more direct methods. For example, Seres et al. [22] analyzed how face mask compliance influenced adherence to social distancing in queues. Although insightful, studies like these and the CCTV analysis by Hoeben et al. [13] capture only momentary behaviors, not long-term patterns. To overcome this limitation, new deep learning models have been developed. These large-scale models not only classify pedestrians but also estimate social distancing [23, 24], providing valuable tools for social scientists as outlined by Bernasco et al. [25]. However, these methods have primarily been applied in CCTV settings over a study period of months. CCTV data, while valuable, is location-limited, and requires permission to either access data, or install a camera. Additional insights could be gained from a data set providing greater coverage of an area of interest over a longer period of time.
In this study, the large-scale, longitudinal data set of street-view imagery captured in the city of Seattle, as part of the study described in Martell et al. [26], is used to track trends in social distancing over time. Physical distance between pedestrians is estimated using an algorithmic approach [27]. The end result of this process is a set of images, with the pedestrians identified, and the distances between them estimated, as seen in Fig 1, allowing for an empirical analysis of social distancing behaviors. The data set used in this study extends from May 2020 through July 2023. It can be used to track overall trends in community mobility, in addition to social distancing, using the estimated distances generated from the street view imagery. Our main contribution is a first of its kind method for empirically studying longitudinal outdoor social distancing behaviors. Given the geospatial nature of the data, this method allows for subsets of the data at specific locations of interest to be studied. In turn, we can compare different geographic areas such as census tracts to understand the different drivers of social distancing behaviors. Secondary contributions include empirical confirmation of results from qualitative studies on social distancing behaviors, and additional perspective on COVID-19 related inequities in the US.
There are 2 pedestrians, with an estimated 14.38 ft between them. Additional sample images are available in the S1 File.
Methods
This section describes the methods used, including data collection, and statistical methods. The original data set in this work is featured in Martell et al. [26]. We do not describe the process for the generation of that data set here, but rather focus on the new features we added, most notably the physical distance estimates.
Data description and processing
The primary data set utilized in this study is the longitudinal street view image data set from Martell et al. [26]. The data set consists of 37 street-view surveys of the city of Seattle, beginning in May 2020 and ending in July 2023 (typically 2–4 weeks apart, as shown in Fig 2). There are over four million time-stamped, location-tagged images across the 37 surveys. Of the 37 data collection surveys, 36 were used in this study, as a heavy rain event caused the survey on 10-29-2020 to be stopped early. The route design is featured in Errett et al. [28]. As described in Martell et al. [26], we created a data pipeline that took the 360-degree street view images and identified pedestrians in them using the Pedestron algorithm [29]. The output of this pipeline was a set of bounding boxes (each of which is a set of 4 coordinates surrounding a pedestrian, as seen in Fig 1). Additional data features included latitude, longitude, GEOID [30] of the census tract where the image was captured, and various demographic data related to that census tract from the 2019 American Community Survey. Other data features include day of the week, and season of the year, both directly derived from the image timestamp. We consulted with the University of Washington Human Subjects Division to determine that this study was not considered human subjects research. As such, it did not require Institutional Review Board approval. The images captured in Martell et al. [26] show people in public places, where they cannot expect personal privacy. The image data was published through Mapillary, which automatically obscures faces as an added precaution. For additional ethical reflections, please see the Discussion section.
The blue line represents the number of distances under 9 ft (2.74 m) per image. For example, 1.0 means that there is one pair of pedestrians on average per image, whose distance is under 9 ft.
Summary statistics for each of the predictor variables are shown in Tables 1 and 2. To operationalize this data, we converted the tract-specific data to binary indicators based on the Jenks natural breaks optimization used during the route design [28]. The day of the week variable was converted to a binary variable for whether it was the weekend or not. Based on previous analysis [26], we converted the season variable to a binary variable for whether it was summer or not.
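As a minimal sketch of this indicator coding, the following Python snippet derives the weekend and summer indicators from an image timestamp and the income indicator from a tract's median income. The function names, the June-through-August definition of summer, and the sample timestamp are assumptions made for illustration; only the $80,820 income breakpoint comes from the regression model description in this paper.

```python
from datetime import datetime

# Assumption: summer is taken to be June through August; the paper does not
# state its exact season boundaries.
SUMMER_MONTHS = {6, 7, 8}
# Median income breakpoint reported in the paper's regression section.
INCOME_BREAKPOINT = 80_820


def is_weekend(ts: datetime) -> int:
    """Return 1 if the timestamp falls on Saturday or Sunday, else 0."""
    return int(ts.weekday() >= 5)  # Monday is 0, Sunday is 6


def is_summer(ts: datetime) -> int:
    """Return 1 if the timestamp falls in an assumed summer month, else 0."""
    return int(ts.month in SUMMER_MONTHS)


def is_high_income(median_income: float) -> int:
    """Return 1 if the tract median income is above the breakpoint, else 0."""
    return int(median_income > INCOME_BREAKPOINT)


ts = datetime(2021, 7, 10, 14, 30)  # hypothetical image timestamp
print(is_weekend(ts), is_summer(ts), is_high_income(95_000))
```

Coding each predictor as 0 or 1 keeps every regression coefficient interpretable as the change associated with switching that single condition on.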
In addition to the data pipeline outputs, this study makes use of publicly available geolocation data on community capital transects from the city of Seattle. These capitals are theorized to be closely tied to community resilience [31] and were used in the design of the data collection survey route. The capitals examined in this study are parks, schools, faith-based organizations, museums, hospitals, medical clinics, and transit stops. Data for these capitals are publicly available through King County GIS Data Hub [32], Washington State Department of Health [33], the City of Seattle [34], and the Association of Religion Data Archives [35]. These locations were chosen to provide coverage over natural, cultural, and built capitals [36], and to provide a view into patterns at public health infrastructure.
We further processed the data by calculating distances between detected bounding boxes. We used the technique of Salazar [27] to obtain estimates of distances between bounding boxes, even in a 2D image. This technique is based on geometric properties and utilizes the Pythagorean Theorem to estimate distances. To validate the algorithm, we applied Salazar’s method to an experimental ground truth dataset. The Root Mean Square Error (RMSE) on this data set was 1.13 ft. The ground truth data set was collected using the same equipment and methods as the data set utilized in this study. Applying this method to our data resulted in almost 5 million distance calculations. Lastly, we subset the distances to look at only those estimated by the algorithm to be under 9 ft (2.74 m) in length. We decided on this subset as a conservative estimate, using the six ft apart guideline provided by the CDC and the 1.13 ft (0.34 m) RMSE of the distance calculation algorithm [27]. With this RMSE, we can be confident that anything measured as 9 ft or greater by the algorithm is at least 6 ft apart in reality.
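The margin implied by this threshold can be checked with a short calculation. The sketch below, with hypothetical function names and made-up distance estimates, expresses the 3 ft gap between the 9 ft cutoff and the 6 ft guideline in multiples of the reported RMSE, and then applies the cutoff to a few example values.

```python
CDC_GUIDELINE_FT = 6.0  # recommended minimum interpersonal distance
CUTOFF_FT = 9.0         # conservative cutoff used for the subset
RMSE_FT = 1.13          # reported RMSE of the distance estimation algorithm


def margin_in_rmse(cutoff: float, guideline: float, rmse: float) -> float:
    """Gap between the cutoff and the guideline, in multiples of the RMSE."""
    return (cutoff - guideline) / rmse


def under_cutoff(distances_ft, cutoff=CUTOFF_FT):
    """Keep only the estimated distances strictly below the cutoff."""
    return [d for d in distances_ft if d < cutoff]


margin = margin_in_rmse(CUTOFF_FT, CDC_GUIDELINE_FT, RMSE_FT)
print(f"margin = {margin:.2f} RMSE")  # prints "margin = 2.65 RMSE"

estimates = [4.2, 8.9, 9.0, 14.38]  # hypothetical estimates in feet
print(under_cutoff(estimates))      # prints [4.2, 8.9]
```

A gap of roughly 2.65 times the RMSE is what makes the 9 ft cutoff conservative relative to the 6 ft guideline.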
After we calculated the distances, we paired the distances with various capitals based on geolocation. A distance is assumed to be at a given capital if it is within 200 ft (60.96 m) of the shapefile footprint of that capital. The one exception is transit stops, for which we allowed only a 15 ft (4.57 m) buffer, as pedestrians usually wait very near to, or inside of, transit stops. Additionally, these structures have much smaller footprints than the other analyzed capitals. The last step in data preparation was to normalize the distance counts by the number of images captured in a given census tract. While the survey route was the same from run to run, the exact number of images captured at a given location may not be. Normalizing the number of distances per image for each survey run can help alleviate this inconsistency.
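The pairing and normalization steps can be illustrated with a simplified sketch. It assumes that each capital footprint is reduced to an axis-aligned rectangle in a planar coordinate system measured in feet, which is a simplification of real shapefile geometry; the function names and example coordinates are hypothetical, and only the 200 ft and 15 ft buffers come from the text.

```python
from math import hypot

DEFAULT_BUFFER_FT = 200.0           # buffer for most capitals
BUFFER_FT = {"transit_stop": 15.0}  # exception noted in the text


def distance_to_footprint(x, y, box):
    """Distance in feet from point (x, y) to a rectangular footprint.

    Assumption: the footprint is an axis-aligned box
    (xmin, ymin, xmax, ymax) in planar coordinates measured in feet.
    Points inside the box have distance 0.
    """
    xmin, ymin, xmax, ymax = box
    dx = max(xmin - x, 0.0, x - xmax)
    dy = max(ymin - y, 0.0, y - ymax)
    return hypot(dx, dy)


def is_at_capital(x, y, box, capital_type):
    """Return True if the point lies within the buffer for this capital type."""
    buffer_ft = BUFFER_FT.get(capital_type, DEFAULT_BUFFER_FT)
    return distance_to_footprint(x, y, box) <= buffer_ft


def distances_per_image(n_distances, n_images):
    """Normalize a distance count by the number of images in the tract."""
    return n_distances / n_images if n_images else 0.0


footprint = (0.0, 0.0, 500.0, 300.0)  # hypothetical footprint
print(is_at_capital(650.0, 100.0, footprint, "park"))          # 150 ft away
print(is_at_capital(650.0, 100.0, footprint, "transit_stop"))  # outside 15 ft
print(distances_per_image(120, 400))
```

In practice a GIS library would compute distances to the true polygon geometry; the rectangle here only keeps the example self-contained.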
Exploratory data analysis
Before conducting regression analyses on social distancing behaviors, we conducted an exploratory data analysis to better understand trends in the data set. Fig 2 was a key input into our initial modeling decisions. First, we noticed that the number of distances under 9 ft increases sharply in April 2021 and stays high over the rest of the surveys. This corresponds with when the initial COVID-19 vaccine became publicly available to everyone over the age of 16 in the state of Washington. While it is not strongly visible, we also expected some seasonality to be present in the data, particularly an increase in traffic during the summer months. Lastly, while not easily visible in the figure, we expected a relationship between the day of the week and community mobility. Beyond what we could glean from the graph, there are known inequities in how the COVID-19 pandemic affected different racial groups and people of different income levels [37]. Thus, we wanted to make sure to include all of the above factors as predictors in our models.
Regression analyses
Based on the initial analysis, we developed the following regression model to identify which factors are statistically significant (α = .05):
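\[ Y = \beta_0 + \beta_1 I_{\text{vaccine}} + \beta_2 C_{\text{season}} + \beta_3 I_{\text{weekend}} + \beta_4 C_{\text{incomelevel}} + \beta_5 I_{\text{demographicindicator}} \qquad (1) \]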
where Y is the number of distances under 9 ft per image for each date/census tract combination; I_vaccine is an indicator for whether the vaccine was available on that date; C_season is an indicator for whether it is summer; I_weekend is an indicator for whether it is the weekend; C_incomelevel is an indicator for whether a given census tract has a median income above $80,820; and I_demographicindicator is an indicator for whether the population is 55.5% white or more. The $80,820 and 55.5% breakpoints were chosen based on Jenks natural breaks optimization. β_0 is the baseline number of distances per image on a weekday, not in the summer, with the vaccine unavailable, in a census tract with income below $80,820 and a population less than 55.5% white. β_1 represents the change in the number of distances per image from the vaccine becoming available, and β_2 represents the change for it being the summer. β_3 represents the change from a weekday to the weekend, and β_4 represents the change to the higher income bracket. Lastly, β_5 represents the change from an area that is less than 55.5% white to an area that is more.
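To make the structure of this model concrete, the sketch below fits an ordinary least squares regression with an intercept and five binary indicators, using only the Python standard library. The data are synthetic and noise-free, and the coefficient values are hypothetical placeholders rather than the estimates reported in this paper; the helper names `solve` and `ols` are likewise assumptions made for illustration.

```python
from itertools import product


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    x = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical coefficients (intercept first), NOT the paper's estimates.
true_beta = [0.5, 0.3, 0.1, 0.2, -0.15, 0.05]
# One synthetic row per combination of the five indicators, in the order
# (vaccine, summer, weekend, higher income, 55.5% white or more).
rows = list(product([0, 1], repeat=5))
y = [true_beta[0] + sum(b * v for b, v in zip(true_beta[1:], r)) for r in rows]

beta_hat = ols(rows, y)
print([round(b, 3) for b in beta_hat])
```

Because the synthetic response is generated without noise, the fit recovers the hypothetical coefficients up to floating-point error. With real data, a statistics package would also be needed for the standard errors and p-values behind the significance tests.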
We utilized this model on the entire data set, as well as data subsets located at the various community capitals of interest. We thought it possible that different community capitals would experience different trends over time, and different seasonality effects. As an example, traffic around schools is likely to decrease in the summers rather than increase like we expected to see in the full model. We also developed a similar model for the proportion of the distances under 9 ft. While this model does violate the linearity assumption of a linear regression model, the predicted outputs are all well within the 0–1 bounds of a proportion. We employ a simple form of regression model, as our focus is on drawing inferences rather than making predictions. The only other change from the above model is that the regression coefficients are interpreted as changes in predicted proportion, rather than number of distances per image.
In addition to the above regression models, we examined the baseline trends in social distancing patterns at different capital locations. We did this by first calculating the overall proportion of distances under 9 ft for the entire data set and at each community capital. Then we compared the proportions at each capital to the overall proportion using a difference in proportions test. This allowed us to have a baseline understanding of the differences between the community capitals.
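The text does not specify which form of the difference in proportions test was used. As one common possibility, the sketch below implements a pooled two-sided z-test for two proportions with the Python standard library. The function name and the counts are hypothetical, and the test treats the two samples as independent, which is a simplification when one sample is a subset of the other.

```python
from math import erf, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for a difference in two proportions.

    x1 and x2 are counts of distances under 9 ft; n1 and n2 are the
    corresponding total numbers of distances.
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF evaluated at |z| via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts for illustration only: a capital with 300 of 1,000
# distances under 9 ft, compared with 2,640 of 10,000 overall.
z, p = two_proportion_z_test(x1=300, n1=1000, x2=2640, n2=10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value indicates that the proportion at the capital differs from the comparison proportion by more than sampling variation alone would suggest.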
Results
Distances under 9 ft per image
The regression analysis results for the number of distances under 9 ft per image across the entire data set are displayed in Table 3. Vaccine availability, whether it was the weekend or not, it being summer or not, and proportion of the population that identified as white were all significant, positive predictors. Income level was a significant, negative predictor. Results for individual capitals are summarized in Table 4, with full regression outputs in the S1 File.
Please note that as there is some overlap that occurs between images, the coefficients here can only be interpreted relative to each other. Stating that the vaccine effect is 3 times as great as the summer effect in the same direction is useful, but the absolute interpretation of these coefficients’ sizes is not. Full documentation for the Python package used to make this tabular output is available from the developers [38].
The numbers reported are the regression coefficients. Significance level: 0.05*, 0.01**, or <0.001***.
There were a few notable results from the regression analyses at community capitals. One immediately noticeable result from Table 4 was that vaccine availability was significant in all models, except at hospitals. In fact, hospitals did not have any significant predictors at all. Museums were the only capital to have a significant, positive income effect. Schools had a significant, negative effect from it being summer, and faith-based organizations had comparably sized vaccine and weekend effects. Lastly, when compared to the other capitals, museums had substantially larger coefficients for vaccine availability, the weekend effect, and the income effect.
Proportion of distances under 9 ft
To obtain the proportion of distances less than 9 ft, we simply calculated the ratio between the number of distances under 9 ft, and the total number of distances. For the regression analysis, we subset the data at the survey and census tract levels first, then calculated the proportions. The proportion of distances less than 9 ft across the entire data set is 0.264. We then performed the same procedure for only distances at each specific capital and compared them to the total using a difference in proportions test. The results are displayed in Table 5.
The regression analysis results for the proportion of distances under 9 ft across the entire data set are displayed in Table 6. Vaccine availability, whether it was the weekend or not, and income level were all significant, positive predictors. It being summer was a significant, negative predictor. Results for individual capitals are summarized in Table 7, with full regression outputs in the S1 File.
The coefficients here can be directly interpreted as the relative change in the proportion of distances under 9 ft when a given predictor changes from 0 to 1.
The numbers reported are the regression coefficients. Significance level: 0.05*, 0.01**, or <0.001***.
A notable trend from the community capitals regressions is that the weekend effect was significant in 4 different models, with a larger regression coefficient than the vaccine effect. Demographic variables had less of an effect when compared to the regression for number of distances under 9 ft per image. The income effect was only significant in 3 models, and the proportion of the population identifying as white was not significant in any of the models. The vaccine effect was only significant in 4 models, compared to 6 in the number of distances per image models. In all cases, the effect was still positive when significant.
Discussion
Implications
Our results represent important knowledge for researchers and policymakers. First, our approach represents a novel way of understanding outdoor social distancing behaviors. Past research includes looking at worldwide trends over a shorter period of time [23], smaller areas of a city over shorter periods of time (months) [13, 24], and frequently relies on CCTV data being available from public sources [13, 23, 25], or manual data collection and labeling [39]. In comparison, our study analyzes outdoor social distancing over a period of three years across a large portion of a metropolitan area. Our regression analysis shows that across the entire city of Seattle, vaccine availability correlated with increased pedestrian activity. Across community capitals, there was an increase in pedestrians being within 9 ft of each other following the public availability of the vaccine in April 2021. Additionally, the overall proportion of distances that were 9 ft or less also increased. This suggests that, even outdoors, people were more likely to accept the risks of being near one another after the vaccine became available. These results also confirm findings from other studies that show that people were more willing to engage in risky behaviors such as visiting crowded places and being near others after being vaccinated [18, 19]. This is not the only additional risk people in the Seattle area are taking. As the pandemic has continued, people have been more willing to embrace infection risk by not getting updated vaccines [40]. Fig 3 shows a stark contrast in the number of people who got the original COVID-19 vaccine series compared to the more recent bivalent boosters. In general, as time has gone on it appears that people are less concerned about COVID-19, possibly due to pandemic fatigue [41, 42].
Data collected from the King County Department of Public Health [40].
There are known inequities in how COVID-19 affected different communities, including higher mortality rates and infection rates for historically disadvantaged groups [14]. Across income groups, driving factors include the inability to work from home, challenges securing housing at all, and challenges securing quality healthcare for lower income groups [43]. Our analysis shows that lower income areas had more pedestrians traveling within 9 ft of each other, and more pedestrian traffic overall, leading to greater risk of exposure. In contrast, higher income areas did have a higher proportion of distances less than 9 ft.
Racial demographics were not shown to significantly predict the proportion of distances under 9 ft. In contrast, census tracts that had a larger population identifying as white had more distances under 9 ft per image across the entire data set, and at parks, medical clinics, and faith-based organizations. This is a somewhat surprising result, as is the higher proportion of distances under 9 ft in higher income areas. The known racial disparity in COVID-19 health outcomes in King County (see Fig 4) and racial and socioeconomic disparities in the US as a whole [37], would lead us to expect the opposite. However, it is also known that communities of color are more likely to wear a mask, and that the drivers behind higher infection rates and death are systemic inequities in wealth, income, underlying health conditions, social capital, and lower quality health care [44–46]. In this light, it is less surprising that rates of social distancing adherence across areas with different racial demographics and income levels did not directly correlate with the divergent health outcomes in those groups.
(NHPI, Native Hawaiian or Pacific Islander; AIAN, American Indian or Alaska Native). The right side graph shows the expected proportion of distances under 9 ft per image across our entire data set during the summer, in a lower-income census tract, on a weekday, before the vaccine became publicly available, based on our regression results. It is shown here that the more white areas did not have a statistically significant difference from the less white areas.
Lastly, the findings from this study show some expected results with regard to weekend effects and seasonality. Across the data set, weekend surveys had higher numbers of distances under 9 ft, and the proportion of distances under 9 ft was also higher. These results held true at many, but not all, of the capitals of interest. In the summer, the effect was mixed, with the total number of distances under 9 ft increasing across the entire data set, even though the proportion of distances under 9 ft decreased. This suggests that increased foot traffic does not necessarily imply decreased social distancing adherence outdoors. In general, it is easier to maintain physical distance outdoors than indoors, even if foot traffic is somewhat higher.
When comparing the social distancing results at different community capitals, two results stand out. First is the substantially lower proportion of distances less than 9 ft at schools. Schools have authority to enforce strict social distancing rules over their students. Indoors this could prove problematic for a variety of reasons [48], but outdoors, in our study area, these rules are much easier to enforce. Schools also experienced expected results in the regression analyses, with summer being a driver for a decrease in distances under 9 ft. In contrast, faith-based organizations were the location with the highest proportion of detections within 9 ft of each other. The relationship between religious groups and COVID-19 related restrictions in the U.S. can be described as tenuous at best [49], with frequent legal battles over restrictions. Additionally, there is a known link between highly religious Americans (particularly evangelicals) and less concern and support for policy recommendations from public health officials [50]. In this light, it is somewhat expected that social distancing recommendations would be less strictly followed at faith-based organizations compared to other community capitals. However, the regression results at faith-based organizations showed that both the number of distances under 9 ft and the proportion under 9 ft increased after the vaccine became available. Thus, there still was some level of risk-avoiding behavior at faith-based organizations prior to vaccine availability. Faith-based organizations experienced an expected result of an increase in distances on weekends.
There were other notable results related to different community capitals from the regression analyses. First, hospitals did not have any significant predictors for the number of distances under 9 ft. Hospital traffic is largely driven by demand rather than other factors, which would lead to none of the seasonality effects being significant. In terms of socioeconomic predictors, hospitals are not limited to serving people who live in the immediate area, so the demographics of the census tract in which the institution is located would not be expected to impact foot traffic. The COVID-19 infection rate may be a better predictor for foot traffic at hospitals, although more research is needed to confirm this. In contrast, medical clinics had a vaccine effect, income effect, and racial demographic effect that were similar to the overall data set. This is likely due to the outdoor nature of our data. Medical clinics are much more common and cover more of the city than hospitals do, leading them to trend more with the overall data set. This is another capital where more research could be done to better understand social distancing behaviors. Lastly, the large income effect at museums is likely due to the fact that the majority of museums in Seattle are located in wealthy areas, leading to a natural bias.
Another important item to address is the set of ethical concerns raised by this type of research. The data collected for this study was naturalistic observation in public places, and thus not considered human subjects research. This is in line with the definition put forward by the American Sociological Association’s code of ethics [51]. However, this type of large-scale image data collection still raises many privacy concerns [25]. While expected levels of privacy, and privacy regulations, differ across cultures, privacy is valued globally [52]. Thus, while the methods outlined in this study are considered acceptable in the United States, they may not be elsewhere.
Similarly, there are significant differences in what is considered acceptable across disciplines [25]. There is some tension between reproducibility in computer vision research and privacy concerns, with reproducibility usually winning out. The practice of collecting and annotating large amounts of video data in public places, without consent of those recorded, is commonplace in this field. These data sets are shared amongst researchers, frequently without anonymization, to verify and improve upon each other’s work. This is in contrast with social science research, where this practice would be met with harsh scrutiny. Without suggesting that one practice is better than the other, this example does illustrate the gap that currently exists between fields for this type of data collection. As interdisciplinary research becomes more common, these contrasting views are more likely to come into conflict.
Limitations
One challenge with using computer vision is that the data product created, in this case the number of estimated distances, cannot be interpreted as the actual number of distances at a given location. There is overlap in the image data described in Martell et al. [26]: pedestrians who appear in the foreground of one image may appear in the background of another. However, our results in this study are sufficient to demonstrate that, while the raw number of distances may not be exact, the relative change over time is meaningful. Additionally, this type of data collection only captures people near drivable roads. Because some locations of interest, such as hospitals and parks, may have large footprints away from roads, some pedestrian traffic near these areas will be missed that could be captured by other methods, such as cell phone data. Our methods serve as a complement to data of this type, not a replacement.
The distance estimation algorithm also has limitations [27]. The authors used an experiment to determine the optimal values for the algorithm’s parameters through a grid search. In their tests, the ground truth pedestrians stood no more than 12 ft apart, and the RMSE of 1.13 ft mentioned earlier in the paper was obtained under those conditions. In our data set, however, pedestrians are frequently farther apart than 12 ft and farther than 20 ft from the vehicle. Under those conditions, the algorithm’s validity decreases, sometimes to the point where it overestimates distances by multiple orders of magnitude. Given the nature of our study, this overestimation is acceptable. Because we only consider pedestrians less than 9 ft apart, the distances of interest fall within the range over which the algorithm’s parameters were tuned. When overestimation occurs, the estimated distance receives the same classification it would have otherwise: 9 ft or greater.
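The thresholding argument above can be illustrated with a minimal Python sketch. This is not the implementation used in the study; the function name `is_under_threshold` and the sample distances are hypothetical, and the only value taken from the text is the 9 ft cutoff. The sketch shows that an estimate inflated far beyond the cutoff receives the same class as a moderate estimate beyond the cutoff.

```python
THRESHOLD_FT = 9.0  # cutoff discussed in the text


def is_under_threshold(distance_ft: float, threshold_ft: float = THRESHOLD_FT) -> bool:
    """Return True when an estimated pairwise distance falls under the cutoff."""
    return distance_ft < threshold_ft


# Hypothetical estimates in feet; the last value mimics a gross overestimate.
estimates_ft = [4.2, 8.9, 15.0, 15000.0]

# Binary labels: only the class matters, not the magnitude beyond the cutoff.
labels = [is_under_threshold(d) for d in estimates_ft]

# Count of distances under the cutoff, the kind of quantity used as a regression outcome.
count_under = sum(labels)
```

In this sketch, 15.0 ft and 15000.0 ft both map to the "9 ft or greater" class, so the count of distances under the cutoff is unchanged by the size of the overestimate.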
Another challenge with this method is the inability to differentiate between people from the same social bubble and strangers. During the height of the pandemic, households would form ‘bubbles’ of small non-overlapping groups that could come into contact with each other while still maintaining social distancing benefits [53]. While our methods cannot account for these social bubbles, Danon et al. [53] show that there are still substantial risks of transmission when this tactic is employed. Given this insight, we argue that this limitation does not substantially harm our findings.
Lastly, while the survey route was designed to provide a good representation of the city of Seattle, it is not without its flaws [28]. There are some key locations in the city that were missed, such as “The Ave” in the University District, that would have been valuable areas to study. Additionally, the route completely omitted West Seattle due to driving time constraints. These omissions do not directly harm the validity of the results in this paper, but they do somewhat limit their scope.
Extensions
This methodology has the potential to be applied to future pandemic events to understand the impacts on social distancing behaviors and community mobility as a whole. If possible, conducting an occasional baseline survey would allow pandemic-era data to be compared to pre-pandemic levels, something that was not done for this study. Additionally, this type of analysis can be easily extended to other image data sets, whether indoor or outdoor, to understand changes in social distancing behaviors over time at key locations.
There are also more insights that can be gleaned from this data set. For example, some capitals of interest that were not included in this study, such as cycling infrastructure [54, 55], could be topics of further study. Additionally, large-scale longitudinal street view imagery has a host of applications outside public health, such as studying the built environment and urban analytics [56, 57]. Lastly, with a larger data set, more predictor variables could be analyzed, such as more specific racial demographics than white vs nonwhite, English proficiency, and age, as well as interaction terms between predictors.
Finally, improvements in generalizable pedestrian detection algorithms or distance estimation in 2-dimensional images would allow for higher accuracy in model outputs. More accurate pedestrian counts could allow for real-world interpretable outputs. Such improvements would also allow for analysis of pedestrians who are farther apart from each other, with more confidence in the model outputs for these cases. Similarly, it would become possible to study social distancing patterns for extremely short distances (e.g., less than 3 ft), which is currently not possible given the RMSE of the distance estimation algorithm we used. Lastly, algorithms that detect characteristics such as mask-wearing or gender [22] could be used for an individual-level analysis, as opposed to the group-level analysis done here. In a similar vein, street crowding is known to impact pedestrian behavior, as people who are put in situations where distancing is difficult are more likely to violate recommendations [58, 59]. An individual-level variable for crowding would allow our model to provide additional insights.
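As a rough illustration of the point about very short distances, the reported RMSE of 1.13 ft can be compared with the size of the cutoff. The Python sketch below is only an assumption-laden illustration: the ratio is not a metric used in this study, and the function name `rmse_fraction_of_cutoff` is hypothetical.

```python
RMSE_FT = 1.13  # RMSE reported for the distance estimation algorithm [27]


def rmse_fraction_of_cutoff(cutoff_ft: float, rmse_ft: float = RMSE_FT) -> float:
    """Express the RMSE as a fraction of a distance cutoff (illustrative only)."""
    return rmse_ft / cutoff_ft


ratio_9ft = rmse_fraction_of_cutoff(9.0)  # roughly 0.13 of the 9 ft cutoff
ratio_3ft = rmse_fraction_of_cutoff(3.0)  # roughly 0.38 of a 3 ft cutoff
```

Under this simple comparison, the same absolute error is about three times larger relative to a 3 ft cutoff than to the 9 ft cutoff, which is consistent with the need for a more accurate estimator before such short distances can be studied.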
Conclusions
This study represents a first-of-its-kind effort to track outdoor social distancing behaviors. We used a longitudinal street-view image survey, computer vision, and distance estimation techniques to generate over four million estimated distances across a 3-year survey period. We show that vaccine availability was a key driver of outdoor social distancing in the city of Seattle, with an increase in the number of distances under 9 ft after the vaccine became publicly available. Our results also highlight some of the systemic inequities within the city that match broader trends in the US. Lower-income areas had higher numbers of pedestrians in proximity to each other, as did whiter areas, even though white people still had better COVID-19-related health outcomes. Lastly, we were able to quantify the differences between community capitals with regard to social distancing behaviors. We observed that faith-based organizations had the lowest levels of social distancing adherence, while schools had the highest.
Supporting information
S1 File. Supplementary information for outdoor social distancing behavior changed during a pandemic.
Additional sample images and full regression output for capitals analysis.
https://doi.org/10.1371/journal.pone.0315132.s001
(PDF)
S1 Dataset. Dataset used to obtain regression results across the entirety of Seattle.
https://doi.org/10.1371/journal.pone.0315132.s002
(CSV)
S2 Dataset. Dataset used to obtain regression results at faith-based organizations.
https://doi.org/10.1371/journal.pone.0315132.s003
(CSV)
S3 Dataset. Dataset used to obtain regression results at hospitals.
https://doi.org/10.1371/journal.pone.0315132.s004
(CSV)
S4 Dataset. Dataset used to obtain regression results at medical clinics.
https://doi.org/10.1371/journal.pone.0315132.s005
(CSV)
S5 Dataset. Dataset used to obtain regression results at museums.
https://doi.org/10.1371/journal.pone.0315132.s006
(CSV)
S6 Dataset. Dataset used to obtain regression results at parks.
https://doi.org/10.1371/journal.pone.0315132.s007
(CSV)
S7 Dataset. Dataset used to obtain regression results at schools.
https://doi.org/10.1371/journal.pone.0315132.s008
(CSV)
S8 Dataset. Dataset used to obtain regression results at transit stops.
https://doi.org/10.1371/journal.pone.0315132.s009
(CSV)
Acknowledgments
The authors gratefully acknowledge DesignSafe and the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing the cyberinfrastructure that enabled the research results reported within this paper.
References
- 1. Mendolia S, Stavrunova O, Yerokhin O. Determinants of the community mobility during the COVID-19 epidemic: The role of government regulations and information. Journal of economic behavior & organization. 2021;184:199–231. pmid:33551525
- 2. Holshue ML, DeBolt C, Lindquist S, Lofy KH, Wiesman J, Bruce H, et al. First Case of 2019 Novel Coronavirus in the United States. New England Journal of Medicine. 2020;382(10):929–936. pmid:32004427
- 3. Dave D, McNichols D, Sabia JJ. The contagion externality of a superspreading event: The Sturgis Motorcycle Rally and COVID‐19. Southern economic journal. 2021;87(3):769–807. pmid:33362303
- 4. Bulfone TC, Malekinejad M, Rutherford GW, Razani N. Outdoor Transmission of SARS-CoV-2 and Other Respiratory Viruses: A Systematic Review. The Journal of infectious diseases. 2021;223(4):550–561. pmid:33249484
- 5. Qi L, Tang W, Wang J, Xiong Y, Yuan Y, Li B, et al. An Outbreak of SARS-CoV-2 Omicron Subvariant BA.2.76 in an Outdoor Park—Chongqing Municipality, China, August 2022. China CDC Weekly. 2022;4(46):1039–1042. pmid:36483192
- 6. Szabo S, Walker S, Edgar T, Kretnzel J, Johnston K, Stone B. EPH7 A Survey of the Occurrence and Transmission of COVID-19 in Outdoor Education Settings in Canada. ISPOR Europe 2022 Abstracts. 2022;25(12, Supplement):S192.
- 7. Hang J, Yang X, Ou CY, Luo ZW, Fan XD, Zhang XL, et al. Assessment of exhaled pathogenic droplet dispersion and indoor-outdoor exposure risk in urban street with naturally-ventilated buildings. Building and Environment. 2023;234:110122.
- 8. Fan X, Zhang X, Weerasuriya AU, Hang J, Zeng L, Luo Q, et al. Numerical investigation of the effects of environmental conditions, droplet size, and social distancing on droplet transmission in a street canyon. Building and Environment. 2022;221:109261.
- 9. Zanganeh Kia H, Choi Y, Nelson D, Park J, Pouyaei A. Large eddy simulation of sneeze plumes and particles in a poorly ventilated outdoor air condition: A case study of the University of Houston main campus. The Science of the total environment. 2023;891:164694. pmid:37290661
- 10. Rahimi Z, Mohammadi MJ, Araban M, Shirali GA, Cheraghian B. Socioeconomic correlates of face mask use among pedestrians during the COVID-19 pandemic: An ecological study. Frontiers in Public Health. 2022;10:921494. pmid:36466470
- 11. Rahimi Z, Shirali GA, Araban M, Mohammadi MJ, Cheraghian B. Mask use among pedestrians during the Covid-19 pandemic in Southwest Iran: an observational study on 10,440 people. BMC Public Health. 2021;21:1–9. pmid:33446172
- 12. Davis L, Esposito J. Social disparities and social distancing during the COVID pandemic. Eastern Economic Journal. 2023;49(2):129. pmid:37051464
- 13. Hoeben EM, Bernasco W, Suonperä Liebst L, Van Baak C, Rosenkrantz Lindegaard M. Social distancing compliance: A video observational analysis. PloS one. 2021;16(3):e0248221. pmid:33720951
- 14. Weill JA, Stigler M, Deschenes O, Springborn MR. Social distancing responses to COVID-19 emergency declarations strongly differentiated by income. Proceedings of the National Academy of Sciences. 2020;117(33):19658–19660. pmid:32727905
- 15. Morita H, Nakamura S, Hayashi Y. Changes of Urban Activities and Behaviors Due to COVID-19 in Japan. SSRN Electronic Journal. 2020.
- 16. Gibson LP, Magnan RE, Kramer EB, Bryan AD. Theory of Planned Behavior Analysis of Social Distancing During the COVID-19 Pandemic: Focusing on the Intention–Behavior Gap. Annals of behavioral medicine. 2021;55(8):805–812. pmid:34228112
- 17. Haddawy P, Lawpoolsri S, Sa-ngamuang C, Su Yin M, Barkowsky T, Wiratsudakul A, et al. Effects of COVID-19 government travel restrictions on mobility in a rural border area of Northern Thailand: A mobile phone tracking study. PLOS ONE. 2021;16(2):e0245842. pmid:33534857
- 18. Si R, Yao Y, Zhang X, Lu Q, Aziz N. Investigating the Links Between Vaccination Against COVID-19 and Public Attitudes Toward Protective Countermeasures: Implications for Public Health. Frontiers in public health. 2021;9:702699. pmid:34368065
- 19. Hossain ME, Islam MS, Rana MJ, Amin MR, Rokonuzzaman M, Chakrobortty S, et al. Scaling the changes in lifestyle, attitude, and behavioral patterns among COVID-19 vaccinated people: insights from Bangladesh. Human vaccines & immunotherapeutics. 2022;18(1):2022920. pmid:35061569
- 20. Liu Z, Maneekul P, Pendergrast C, Doubleday A, Miles SB, Errett NA, et al. Physical activity monitoring data following disasters. Sustainable Cities and Society. 2022;81:103814.
- 21. Roy A, Nelson TA, Fotheringham AS, Winters M. Correcting Bias in Crowdsourced Data to Map Bicycle Ridership of All Bicyclists. Urban Science. 2019;3(2).
- 22. Seres G, Balleyer AH, Cerutti N, Friedrichsen J, Süer M. Face Mask Use and Physical Distancing before and after Mandatory Masking: No Evidence on Risk Compensation in Public Waiting Lines. SSRN Electronic Journal. 2021. pmid:34840368
- 23. Ghodgaonkar I, Chakraborty S, Banna V, Allcroft S, Metwaly M, Bordwell F, et al. Analyzing Worldwide Social Distancing through Large-Scale Computer Vision. arXiv preprint arXiv:2008.12363. 2020.
- 24. Rezaei M, Azarmi M. Deepsocial: Social distancing monitoring and infection risk assessment in covid-19 pandemic. Applied Sciences. 2020;10(21):7514.
- 25. Bernasco W, M Hoeben E, Koelma D, Liebst LS, Thomas J, Appelman J, et al. Promise into practice: Application of computer vision in empirical research on social distancing. Sociological Methods & Research. 2023;52(3):1239–1287.
- 26. Martell M, Terry N, Sengupta R, Salazar C, Errett NA, Miles SB, et al. Open-source data pipeline for street-view images: A case study on community mobility during COVID-19 pandemic. PloS one. 2024;19(5):e0303180. pmid:38728283
- 27. Salazar C. Estimating Distance between Pedestrians from Street View Images Using Geometric Properties; 2021.
- 28. Errett NA, Wartman J, Miles SB, Silver B, Martell M, Choe Y. Street View Data Collection Design for Disaster Reconnaissance. arXiv preprint arXiv:2308.06284. 2023.
- 29. Hasan I, Liao S, Li J, Akram SU, Shao L. Pedestrian Detection: Domain Generalization, CNNs, Transformers and Beyond. arXiv preprint arXiv:2201.03176. 2022.
- 30. US Census Bureau. Understanding Geographic Identifiers (GEOIDs); 2021. Available from: https://www.census.gov/programs-surveys/geography/guidance/geo-identifiers.html.
- 31. Miles SB. Foundations of community disaster resilience: Well-being, identity, services, and capitals. Environmental Hazards. 2015;14(2):103–121.
- 32. King County GIS Center. King County GIS Data Hub; 2017. Available from: https://kingcounty.gov/services/gis/GISData.aspx.
- 33. Washington State Department of Health. Data & Statistical Reports; 2016. Available from: https://doh.wa.gov/data-statistical-reports/data-systems/geographic-information-system.
- 34. City of Seattle. Seattle Open Data; 1995. Available from: https://data.seattle.gov/.
- 35. Finke R, Bader C, Whitehead A. The Association of Religion Data Archives; 1997. Available from: https://www.thearda.com/.
- 36. Emery M, Flora C. Spiraling-up: mapping community transformation with community capitals framework. Community development (Columbus, Ohio). 2006;37(1):19–35.
- 37. Seto E, Min E, Ingram C, Cummings B, Farquhar SA. Community-Level Factors Associated with COVID-19 Cases and Testing Equity in King County, Washington. International Journal of Environmental Research and Public Health. 2020;17(24):9516. pmid:33353095
- 38. Seabold S, Perktold J. statsmodels: Econometric and statistical modeling with python. In: 9th Python in Science Conference; 2010.
- 39. Shirali GA, Rahimi Z, Araban M, Mohammadi MJ, Cheraghian B. Social-distancing compliance among pedestrians in Ahvaz, South-West Iran during the Covid-19 pandemic. Asian Journal of Social Health and Behavior. 2021;4(4):131–136.
- 40. King County Department of Public Health. Summary of COVID-19 vaccination among King County Residents. Available from: https://kingcounty.gov/en/dept/dph/health-safety/disease-illness/covid-19/data/vaccination.
- 41. Wu X, Lu Y, Jiang B. Built environment factors moderate pandemic fatigue in social distance during the COVID-19 pandemic: A nationwide longitudinal study in the United States. Landscape and Urban Planning. 2023;233:104690. pmid:36687504
- 42. Sandlin EW, Simmons DJ. Polarized perceptions: how time and vaccination status modify Republican and Democratic COVID-19 risk perceptions. Journal of Elections, Public Opinion and Parties. 2023; p. 1–19. pmid:39391365
- 43. Green H, Fernandez R, MacPhail C. The social determinants of health and health outcomes among adults during the COVID‐19 pandemic: A systematic review. Public Health Nursing. 2021;38(6):942–952. pmid:34403525
- 44. Hearne BN, Niño MD. Understanding How Race, Ethnicity, and Gender Shape Mask-Wearing Adherence During the COVID-19 Pandemic: Evidence from the COVID Impact Survey. Journal of Racial and Ethnic Health Disparities. 2022;9(1):176–183. pmid:33469866
- 45. Smedley BD, Stith AY, Committee on Understanding and Eliminating Racial and Ethnic Disparities in Health Care. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington: National Academies Press; 2002.
- 46. Sullivan L, Meschede T. How Measurement of Inequalities in Wealth by Race/Ethnicity Impacts Narrative and Policy: Investigating the Full Distribution. Race and social problems. 2018;10(1):19–29.
- 47. King County Department of Public Health. COVID-19 race and ethnicity data. Available from: https://kingcounty.gov/en/legacy/depts/health/covid-19/data/race-ethnicity.aspx.
- 48. Uscher-Pines L, Schwartz HL, Ahmed F, Zheteyeva Y, Tamargo Leschitz J, Pillemer F, et al. Feasibility of Social Distancing Practices in US Schools to Reduce Influenza Transmission During a Pandemic. Journal of public health management and practice. 2020;26(4):357–370. pmid:32437117
- 49. Morley G. Rights and religion: Canadian and American courts face challenges over COVID restrictions on religious gatherings. Inroads (Ottawa). 2021;(49):36.
- 50. Schnabel L, Schieman S. Religion Protected Mental Health but Constrained Crisis Response During Crucial Early Days of the COVID‐19 Pandemic. Journal for the scientific study of religion. 2022;61(2):530–543. pmid:34230686
- 51. American Sociological Association. Code of Ethics. Washington, DC: American Sociological Association; 2018.
- 52. Altman I. Privacy Regulation: Culturally Universal or Culturally Specific? Journal of social issues. 1977;33(3):66–84.
- 53. Brooks-Pollock E, Danon L, Jombart T, Pellis L. Modelling that shaped the early COVID-19 pandemic response in the UK. Philosophical Transactions of the Royal Society B: Biological Sciences. 2021;376(1829):20210001. pmid:34053252
- 54. Kraus S, Koch N. Provisional COVID-19 infrastructure induces large, rapid increases in cycling. Proceedings of the National Academy of Sciences. 2021;118(15):e2024399118. pmid:33782111
- 55. Park S, Kim B, Lee J. Social Distancing and Outdoor Physical Activity During the COVID-19 Outbreak in South Korea: Implications for Physical Distancing Strategies. Asia-Pacific journal of public health. 2020;32(6-7):360–362. pmid:32667221
- 56. Rzotkiewicz A, Pearson AL, Dougherty BV, Shortridge A, Wilson N. Systematic review of the use of Google Street View in health research: Major themes, strengths, weaknesses and possibilities for future research. Health & Place. 2018;52:240–246. pmid:30015181
- 57. Li Y, Peng L, Wu C, Zhang J. Street View Imagery (SVI) in the Built Environment: A Theoretical and Systematic Review. Buildings. 2022;12(8):1167.
- 58. Kooistra EB, van Rooij B. Pandemic Compliance: A Systematic Review of Influences on Social Distancing Behaviour during the First Wave of the COVID-19 Outbreak. SSRN Electronic Journal. 2020.
- 59. Liebst LS, Ejbye-Ernst P, de Bruin M, Thomas J, Lindegaard MR. No evidence that mask-wearing in public places elicits risk compensation behavior during the COVID-19 pandemic. Scientific reports. 2022;12(1):1511. pmid:35087100