How did COVID-19 case distribution associate with the urban built environment? A community-level exploration in Shanghai focusing on non-linear relationship

  • Jingyi Gao,

    Contributed equally to this work with: Jingyi Gao, Yifu Ge

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Architecture and Building Science, Graduate School of Engineering, Tohoku University, Sendai, Japan

  • Yifu Ge,

    Contributed equally to this work with: Jingyi Gao, Yifu Ge

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Writing – review & editing

    Affiliation School of Architecture and Urban Planning, Nanjing University, Nanjing, China

  • Osamu Murao,

    Roles Project administration, Supervision, Validation, Writing – review & editing

    Affiliation International Research Institute of Disaster Science, Tohoku University, Sendai, Japan

  • Yitong Dong,

    Roles Data curation, Resources, Software

    Affiliations Department of Architecture and Building Science, Graduate School of Engineering, Tohoku University, Sendai, Japan, Shanghai Urban Planning and Design Co., Ltd. of Shanghai Planning Institute, Shanghai, China

  • Guofang Zhai

    Roles Conceptualization, Project administration, Supervision, Validation, Writing – review & editing

    guofang_zhai@nju.edu.cn

    Affiliation School of Architecture and Urban Planning, Nanjing University, Nanjing, China

Abstract

Several associations between the built environment and COVID-19 case distribution have been identified in previous studies. However, few studies have explored the non-linear associations between the built environment and COVID-19 at the community level. This study employed the March 2022 Shanghai COVID-19 pandemic as a case study to examine the association between built-environment characteristics and the incidence of COVID-19. A non-linear modeling approach, namely the boosted regression tree model, was used to investigate this relationship. A multi-scale study was conducted at the community level based on buffers of 5-minute, 10-minute, and 15-minute walking distances. The main findings are as follows: (1) Relationships between built environment variables and COVID-19 case distribution vary across scales of analysis at the neighborhood level. (2) Significant non-linear associations exist between built-environment characteristics and COVID-19 case distribution at different scales. Population, housing price, normalized difference vegetation index, Shannon’s diversity index, number of bus stops, floor–area ratio, and distance from the city center played important roles at different scales. These non-linear results provide a more refined reference for pandemic responses at different scales from an urban planning perspective and offer useful recommendations for a sustainable COVID-19 post-pandemic response.

1. Introduction

On May 5, 2023, the World Health Organization (WHO) declared that COVID-19 was no longer a public health emergency of international concern [1]. This indicates that the global community has entered the post-pandemic era. According to the WHO’s updated Strategic Preparedness and Response Plan for 2023–2025, national responses to COVID-19 should shift from emergency measures to long-term management, making the urban environment more resilient, equal, diverse, and sustainable [2]. The associations between COVID-19 case distribution and the urban environment have become a subject of interest for urban planners.

The neighborhood built environment is recognized as a key environmental factor affecting public health [3, 4]. It is the major environmental context influencing the COVID-19 case distribution. Researchers often delineate buffer zones at specific scales and use them as spatial representations of residents’ activity areas. Previously, buffer zones with radii ranging from 50 m to 5 km have been used in different studies [5, 6]. This variability can cause the strength and direction of correlations between variables to change depending on the spatial scale of analysis. This issue is known as the modifiable areal unit problem (MAUP) [7]. According to Openshaw (1996) [8], the solution to the MAUP is to identify appropriate geospatial scales and geographic unit divisions for specific studies. Additionally, there may be scaling effects between COVID-19 case distribution and the neighborhood built environment that have not been identified. Therefore, an analysis based on different spatial scales is necessary.

Many studies of the built environment and COVID-19 spread tend to predefine the relationship as linear, potentially misestimating or oversimplifying complex associations [9]. Most studies have used linear models, such as multivariable linear regression [10], generalized linear models [11, 12], weighted least squares (LS) [13], hierarchical multiple regression [14], and ordinary least squares (OLS) [12, 15]. These models can capture only linear relationships, which limits the associations they can reveal. Previous linear models have indicated a notable correlation between the built environment and the spread of COVID-19. However, Luo et al. identified a significant nonlinear relationship between the built environment and the spread of COVID-19 in the United States. This relationship exhibited a significant threshold effect, and the linear model proved inadequate for accurately estimating the spatial variability of this nonlinear relationship [16]. Similarly, Ma et al.’s investigation into the relationship between the built environment and the spread of COVID-19 in Chinese townships found that the density of gyms and sports centers had the greatest impact at 2/km², with the effect remaining constant with further increases in density [17]. In this context, identifying non-linear results might provide more specific urban planning guidance compared to linear results. Prior studies of the built environment and COVID-19 are further discussed in Section 2.

Considering the lack of research on non-linear relationships at different scales, this study focused on the community level and explored the following two research questions.

  1. Do relationships between built environment variables and COVID-19 case distribution vary across scales of analysis?
  2. What are the non-linear relationships between built-environment characteristics and COVID-19 case distribution at different scales?

2. Previous studies on the COVID-19 case distribution associated with the built environment

Factors influencing COVID-19 case distribution in urban areas have been explored in previous studies. For example, population density has been incorporated as an important indicator in research related to COVID-19 spread in various countries [12, 14, 15, 18]. High population density [13] and increased air pollution caused by excessive population [9] are often correlated with the rapid spread of the pandemic [19]. Additionally, researchers often associate socio-economic factors with spatial location. Differences in economic development levels have been regarded as exacerbating the spread risk of COVID-19 [20]. For instance, researchers identified the key role of economic activities in COVID-19 transmission in Italy. More productive areas with intensive economic activities (e.g., a higher percentage of employment in manufacturing) have higher possibilities of triggering COVID-19 spread [21]. A study that considered variables including the number of visitors, tourists staying overnight, and the generated income from their activities found a significant correlation between the number of cases and these economic factors [22]. Several scholars believe that COVID-19 spread is positively correlated with gross domestic product (GDP) [10].

Urban-planning attributes, including architecture, urban density, land use, and landscape, have also been found to correlate with COVID-19 case distribution [23–25]. The significance of greenspace is frequently discussed, though its impact varies in different studies. For example, Kwok et al. found no significant relationship between greenspace and COVID-19 cases in Hong Kong [26]. Conversely, greenspace was associated with a reduced risk of COVID-19 mortality in the United States [27]. Although scholars hold different views on the relationship between COVID-19 cases and urban-density variables, the density indicators selected by researchers have common features, such as road density [28]; floor–area ratio (FAR), which characterizes building density [29]; and the number of points of interest (POIs), representing the facility density [30]. Regarding land use, the Shannon diversity index (SHDI) is commonly used to characterize land-use mixtures. A study conducted in Melbourne, Australia, analyzed the geographic distribution of COVID-19 cases and its correlation with built-environment variables. This study validated and highlighted the importance of considering the diversity of the built environment and mixed-use development when investigating the geographical distribution of COVID-19 cases within a city [31].

The aforementioned research provides a theoretical basis for selecting variables in this study and inspires the methodological approach. In previous studies, conventional spatial analysis models were predominantly used. Many studies first employed spatial autocorrelation analysis to obtain global spatial autocorrelation (Moran’s I) and local indicators of spatial association to measure the spatial distribution characteristics of cases [10, 13, 31–33]. Subsequently, statistical methods such as Spearman’s or Pearson’s correlation were used to test the relationship between the selected indicators and the dependent variable [15, 18, 19, 22]. Finally, a spatial regression model was constructed [34]. As an alternative to conventional regression models, some scholars have used machine learning methods to obtain more accurate results in studying COVID-19. This includes using the elastic net machine-learning algorithm to optimize conventional linear models for predicting and verifying the significance of variables or using the multilayer-perceptron model to model and predict the number of cases [35]. However, the non-linear relationship between independent and dependent variables has not been well identified, which may lead to spatial attribute thresholds being overlooked. Ignoring the spatial thresholds reflected by non-linear associations may further cause biased perceptions and the formulation of incorrect policies [36, 37].

Considering the community transmission characteristics of COVID-19, communities must be prepared for COVID-19 challenges [38]. In 2020, experts urged countries worldwide to strengthen monitoring of community COVID-19 infection and changing trends [39]. In the early days of the COVID-19 pandemic, controlling community transmission to reduce the risk of infection was the top priority. In recent years, as the COVID-19 pandemic has subsided, the focus of community preparation in the post-pandemic era has shifted to making communities more resilient and sustainable [40]. Many studies have examined COVID-19 case distribution at the community level [41–46]. Although these studies provide a rationale for community-based COVID-19 research, they are mostly large-scale, involving nations, regions, cities, and districts/counties. Small-scale studies focusing on communities, scale differences, and the non-linear relationship between the built environment and COVID-19 case distribution are still insufficient.

3. Methods

The COVID-19 pandemic outbreak in Shanghai in March 2022 was selected as a case study, and officially announced COVID-19 addresses were collected. Built-environment variables associated with COVID-19 spread were selected based on a literature review. We used multi-source data to characterize the built environment and constructed a database combining the collected COVID-19 addresses at the community level. Boosted regression tree (BRT) models were developed to explore the non-linear associations between built-environment characteristics and COVID-19 case distribution. In this section, we sequentially outline the variable selection, the collection and processing of COVID-19 and community data, and the steps involved in BRT model construction.

3.1 Data collection and processing

3.1.1 Data collection and variable selection.

Shanghai, a mega-city in China, was selected as the site for this case study. It has a high population and population density, which potentially increase human contact and viral transmission. A large-scale COVID-19 outbreak occurred in Shanghai in March 2022. Initially, Shanghai did not implement strict measures such as lockdowns to control the spread of the pandemic. The period selected for this study was from March 6 to March 31, a time when official data on case reports were available, and strict control measures had not yet been implemented.

Data reflecting the COVID-19 case distribution in Shanghai were collected from the daily pandemic information released by the Shanghai Municipal Health Commission (https://wsjkw.sh.gov.cn/yqtb/), which included the number of infected individuals and their residential addresses. Community data, including names and addresses for buffer zone construction in the main urban area, were retrieved from Shanghai Bendibao (http://sh.bendibao.com/), a platform that publishes local information.

Based on a review of previous studies on built environments [47, 48] and COVID-19-related research [25, 49], a total of four categories covering 18 variables were determined for this study. The first category focused on facilities closely related to residents’ lives, including various POIs and bus stops [50, 51]. The second category addressed density factors such as road density, population density, and bus line density [50, 52–55]. The third category encompassed land use characteristics, including land-use mixture (represented by Shannon’s Diversity Index), Normalized Difference Vegetation Index (NDVI), Floor–Area Ratio (FAR), and distance from the city center [56–59]. The fourth category included socio-economic factors, specifically housing prices [50]. Data necessary for calculating these variables were collected. Detailed descriptions of each variable and their original data sources are provided in Table 1. Given the correlation of these variables with COVID-19 case distribution, we hypothesized differences in their contributions to the number of COVID-19 addresses across different scales of analysis. Non-linear findings can elucidate their associations with COVID-19 case distribution at various scales.
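As an illustration of the land-use mixture variable, the following is a minimal sketch of a Shannon diversity calculation for one buffer zone. It assumes the common natural-logarithm form, SHDI = −Σ pᵢ ln pᵢ, where pᵢ is the area share of land-use class i; the paper does not state the logarithm base or the land-use classes it used, so the class names and areas below are hypothetical.

```python
import math

def shannon_diversity(areas):
    """Shannon diversity index, -sum(p_i * ln p_i), over classes with a nonzero area share."""
    areas = list(areas)
    total = sum(areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    shares = [a / total for a in areas if a > 0]
    return -sum(p * math.log(p) for p in shares)

# Hypothetical land-use areas (m^2) inside one buffer; not taken from the study data.
example_areas = {"residential": 120000.0, "commercial": 40000.0,
                 "green": 30000.0, "public": 10000.0}
shdi = shannon_diversity(example_areas.values())
```

Under this assumed form, the index is bounded by ln k for k classes, so a value above 1.5 is only reachable when at least five classes are present (ln 4 ≈ 1.386, ln 5 ≈ 1.609).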

Table 1. Variables and data sources of the built environment.

https://doi.org/10.1371/journal.pone.0309019.t001

3.1.2 Data processing.

3.1.2.1 Processing of COVID-19 data and generation of COVID-19 addresses. In what follows, the number of reported addresses with COVID-19 cases (abbreviated as “COVID-19 addresses”) is used as the dependent variable to explore the associations between COVID-19 case distribution and the built environment, which is the primary focus of this study. Since no matching reported information was available from March 1–5, cases with addresses reported from March 6–31 were considered for COVID-19 address extraction. Due to a significant rise in cases, the Shanghai Municipal Health Commission no longer provided the specific address of each infected individual separately from March 18, instead publishing collective addresses. Therefore, duplicate address data before March 18 were removed to mitigate potential errors in this study caused by the inconsistency in official reporting standards between the two periods. Ultimately, 15,593 data points representing the addresses of those who were infected with COVID-19 (the COVID-19 addresses) were identified throughout the entire area of Shanghai city (Fig 1). The COVID-19 addresses in the study area were geocoded into points with geographic coordinates using the geocoding application programming interface (API) of the Gaode open platform (https://lbs.amap.com/). It is important to note that all data used in this study are sourced from open-access platforms, and the data collection and analysis methods adhere to the terms and conditions stipulated by the data sources. Specifically, the COVID-19 addresses used in this study do not disclose personal privacy or directly re-identify infected individuals.

Fig 1. Number of COVID-19 addresses within the time range of the study.

(Sources: Original data from Shanghai Municipal Health Commission, processed and visualized by authors according to the methods explained in the text).

https://doi.org/10.1371/journal.pone.0309019.g001

3.1.2.2 Processing of community data and generation of community buffer zones. The study area in this paper encompasses communities within the main urban area of Shanghai, delimited according to boundaries specified in the Shanghai Urban Master Plan (2017–2035). Once the study area boundaries were defined, we processed the information collected about the communities. Initially, we utilized the Gaode open platform’s API to geocode community addresses and obtain the spatial coordinates of each community’s geographic center. Due to the limitations in the precision of some raw data, some addresses were resolved to the same coordinates during the geocoding process. Therefore, while filtering community coordinates within the main urban area boundaries, we ensured that duplicate coordinates were removed to prevent redundancy in later buffer zone creation. A total of 3,105 community geographic centers were identified and presented as points with coordinates extracted from their addresses. Considering the MAUP, the appropriate geographic scale was determined based on a previous study by Liu et al. (2022) on COVID-19, community life circles, and the built environment in Wuhan, China [60]. Subsequently, these 3,105 screened communities were used to establish buffer zones with radii of 300, 500, and 800 m, corresponding to 5-, 10-, and 15-minute walking distances, respectively. The mean values of the variables selected in Section 3.1.1 within the buffer zones served as independent variables, while the count of COVID-19 addresses within buffer zones was taken as the dependent variable. In dense urban areas, where some communities have smaller areas, overlapping buffer zones may occur, potentially resulting in COVID-19 addresses being counted in multiple buffer zones simultaneously for analysis purposes.
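The buffer-based counting described above can be sketched as follows. This is a minimal illustration, not the authors' GIS workflow: it assumes circular buffers measured with great-circle (haversine) distance on WGS-84 longitude/latitude, and the community centers and address points are hypothetical. It also shows how one address can be counted in more than one overlapping buffer.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical approximation)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def count_in_buffers(centers, addresses, radii_m=(300, 500, 800)):
    """For each community center, count the address points within each buffer radius."""
    counts = {}
    for name, (clon, clat) in centers.items():
        dists = [haversine_m(clon, clat, alon, alat) for alon, alat in addresses]
        counts[name] = {r: sum(d <= r for d in dists) for r in radii_m}
    return counts

# Hypothetical (lon, lat) points for illustration only.
centers = {"A": (121.4700, 31.2300), "B": (121.4750, 31.2300)}
addresses = [(121.4710, 31.2300), (121.4740, 31.2300), (121.4700, 31.2360)]
counts = count_in_buffers(centers, addresses)
```

In this toy layout the first two addresses fall inside the 500 m buffers of both A and B, which is the double counting that overlapping buffers produce.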

It should be noted that the data collected from open platforms such as Gaode mostly use the GCJ-02 coordinate system, which is the mandatory encrypted coordinate system for domestic use in China. During data processing, all spatial data, including COVID-19 addresses and communities, were uniformly converted from Gaode (GCJ-02) to the World Geodetic System (WGS-84) coordinate system before spatial analysis, to obtain more accurate results in subsequent steps.

3.2 Model building

Boosted Regression Tree (BRT) is a technique that enhances the performance of a single model by fitting multiple models and combining them for predictions [61]. It integrates concepts and techniques from statistics and machine learning, focusing on algorithmic modeling and treating the data-generating mechanism as complex and unknown rather than starting from specific statistical models [62]. Unlike traditional statistical models, BRT does not require assumptions about the underlying statistical distribution of the response variable. This versatility allows BRT to accommodate various data types, including categorical, count, and continuous variables [63]. BRT models are widely used in the fields of ecology [64], geography [65], urban studies [66], disaster management [67], and epidemiology [68–70] to explore environmental impacts. Their advantage lies in their ability to identify non-linear relationships and assess the contributions of measurement indicators. Given the complex relationship between the built environment and COVID-19, a BRT model is well-suited for this study.

In this study, three BRT models were developed across three different scales using the gbm package (version 2.1.9) in the R programming environment. These models aimed to explore the non-linear relationship between the built environment and the number of COVID-19 addresses. The parameters tree complexity, learning rate, maximum trees, and bag fraction were set to 5, 0.005, 20,000, and 0.5, respectively. To evaluate the models, 10-fold cross-validation (CV) was employed for its computational efficiency and robustness in model evaluation [71]. The relative contribution of each variable to the total number of COVID-19 addresses at different scales was analyzed by using the split frequency importance method in BRT models. This method records how frequently each variable is used to split nodes across all trees in the model and the associated reduction in the loss function [63, 72]. These importance values were normalized to represent each variable’s percentage contribution to the model’s predictions. Differences in variable importance across scales were compared to understand their varying impacts. Based on the ranked importance scores provided by the BRT model, significant variables contributing to predicting the target variable were identified. This process facilitated the selection of important variables for further analysis. Subsequently, the BRT model was used to generate partial dependence plots for the top five influential variables identified in the previous steps. Finally, several metrics were used to assess each model’s performance and validity, including the results of 10-fold cross-validation. This comprehensive workflow is illustrated in Fig 2. The code used in this study can be found in S1 File.
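To make the boosting and variable-contribution steps concrete, here is a deliberately simplified sketch in Python. It is not the authors' R code (which is in S1 File) and not the gbm implementation: it assumes squared-error loss, uses one-split trees (stumps) rather than a tree complexity of 5, and runs far fewer trees with a larger learning rate so that it finishes quickly. Relative contribution is taken here as each variable's summed reduction in squared error, normalized to 100%. The data are synthetic, with a built-in threshold at 1.5 on the first variable.

```python
import random

def fit_stump(X, residuals):
    """Find the single split (feature, threshold) with the largest squared-error reduction."""
    n, p = len(X), len(X[0])
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for j in range(p):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_brt(X, y, n_trees=200, learning_rate=0.05, bag_fraction=0.5, seed=0):
    """Stochastic gradient boosting of stumps; returns (intercept, stumps, relative influence %)."""
    rng = random.Random(seed)
    n = len(X)
    intercept = sum(y) / n
    pred = [intercept] * n
    stumps, influence = [], [0.0] * len(X[0])
    for _ in range(n_trees):
        bag = rng.sample(range(n), max(2, int(bag_fraction * n)))  # random subsample per tree
        best = fit_stump([X[i] for i in bag], [y[i] - pred[i] for i in bag])
        if best is None:
            break
        gain, j, thr, left_mean, right_mean = best
        stumps.append((j, thr, learning_rate * left_mean, learning_rate * right_mean))
        influence[j] += gain
        for i in range(n):
            pred[i] += learning_rate * (left_mean if X[i][j] <= thr else right_mean)
    total = sum(influence) or 1.0
    return intercept, stumps, [100.0 * v / total for v in influence]

def predict(model, x):
    """Sum the intercept and the shrunken contribution of every stump."""
    intercept, stumps, _ = model
    return intercept + sum(left if x[j] <= thr else right for j, thr, left, right in stumps)

# Synthetic data: the response jumps when feature 0 exceeds 1.5; feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.uniform(0, 3), data_rng.uniform(0, 1)] for _ in range(200)]
y = [(10.0 if row[0] > 1.5 else 2.0) + data_rng.gauss(0, 0.5) for row in X]
model = fit_brt(X, y)
rel_influence = model[2]
```

The small learning rate and the random subsample per tree play the same roles as the learning rate and bag fraction reported above; the response distribution used in the study is not stated in the text, so squared error is an assumption of this sketch only.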

Fig 2. Workflow of the analysis and scale of buffer zones at the community level.

https://doi.org/10.1371/journal.pone.0309019.g002

4. Results

Table 2 and Fig 3 display the relative importance of each independent variable to the dependent variable in the models, along with their respective rankings. A higher ranking indicates greater importance of the variable in predicting the model’s outcomes. This ranking also illustrates how the contribution of each variable to the number of COVID-19 addresses varies across different scales. Some patterns were consistent across the three scales. While the order of relative contribution rankings varied between scales, distance from the city center (X17), housing price (X18), and FAR (X16) consistently ranked among the top five at all scales. Housing price showed increasing contribution as the scale expanded, whereas SHDI showed decreasing contribution. Notably, distance from the city center ranked highest at the 500 m scale. Although this variable takes the same value at all three scales for each community (as it represents the distance from the city center), its contribution to the model still differs by scale. For instance, its contribution to the cumulative number of COVID-19 addresses notably decreased at the 800 m scale. In contrast, the contribution of the number of bus stops significantly increased at the 800 m scale.

Fig 3. The ranking and the changes of the relative contributions (%) of each variable under different scales.

https://doi.org/10.1371/journal.pone.0309019.g003

Table 2. Ranking of each variable’s relative contribution (%) at each scale, calculated by the BRT model.

https://doi.org/10.1371/journal.pone.0309019.t002

Fig 4 depicts the partial dependence of the top five variables across different scales. The partial dependence plots reveal varying trends for each variable at different scales. The variable distance from the city center exhibits a consistent pattern across scales. Initially, the number of COVID-19 addresses remains high within a 5 km radius of the city center, decreases gradually, and then shows a rebound between 20–25 km. The housing price variable shows a similar trend at the 300 m and 500 m scales, with a notable increase in COVID-19 addresses in areas close to 100,000 yuan/m², which becomes more pronounced at the 800 m scale. For the SHDI, at the 300 m scale, there are minimal fluctuations in COVID-19 addresses when SHDI is below 1.5, but a significant increase is observed when SHDI exceeds 1.5. At the 500 m scale, the number of COVID-19 addresses fluctuates, but the overall trend shows that COVID-19 addresses decrease as SHDI increases below 1.5, with significant increases observed above 1.5. At the 800 m scale, smaller fluctuations and a downward trend are seen for SHDI values below 1.5, with larger fluctuations observed above 1.5. The FAR variable shows higher COVID-19 cases in its lower range, with most fluctuations occurring when FAR is below 10. Regarding NDVI, values in the range of 0.2–0.3 tend to be associated with more COVID-19 cases. At the 800 m scale, there is a sharp increase in COVID-19 addresses as NDVI increases to approximately 0.4.
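For readers unfamiliar with partial dependence, the quantity plotted on each curve can be sketched with the brute-force definition: fix one variable at a grid value for every observation, average the model's predictions, and repeat across the grid. The sketch below assumes nothing about the authors' plotting code; `toy_predict` is a hypothetical stand-in for a fitted model, with a step at 1.5 in the first variable.

```python
def partial_dependence(predict, X, feature, grid):
    """Average prediction over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)       # copy so the original data are untouched
            modified[feature] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(X))
    return curve

def toy_predict(row):
    """Hypothetical model: a threshold effect in feature 0 plus a linear effect of feature 1."""
    return (5.0 if row[0] > 1.5 else 1.0) + 2.0 * row[1]

X = [[0.5, 0.0], [1.0, 1.0], [2.0, 2.0], [2.5, 3.0]]
curve = partial_dependence(toy_predict, X, feature=0, grid=[1.0, 1.4, 1.6, 2.0])
```

Because the other variables are averaged out, a flat segment followed by a jump in the curve is what a threshold such as the SHDI value of 1.5 looks like in this kind of plot.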

Fig 4. Partial dependence plots for the Top 5 influential variables in the BRT model.

(X-axis: the value of the variable; Y-axis: the number of COVID-19 addresses).

https://doi.org/10.1371/journal.pone.0309019.g004

Table 3 presents key metrics that evaluate the performance and validity of the BRT models. In terms of cross-validation, the CV correlation assesses how well the model predicts unseen data, with values closer to 1 indicating better performance. The CV correlations for the three BRT models are 0.665, 0.842, and 0.855, indicating strong predictive capabilities across all models. RMSE and MSE values indicate the magnitude of deviation between model predictions and actual observations. Lower RMSE and MSE values suggest smaller prediction errors. Across the three models, deviations in model predictions are relatively small at the 300 m and 500 m scales. R² measures how well the model fits the observed data, with values closer to 1 indicating a better fit. All three models demonstrate a high degree of fit.
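The metrics named above follow standard definitions, sketched below together with a simple 10-fold index split. This is an illustration under assumptions: the observed and predicted values are made up, the fold assignment is a generic shuffled split rather than the one used in the study, and the CV correlation is taken to be the Pearson correlation between held-out predictions and observations.

```python
import math
import random

def mse(y, yhat):
    """Mean squared error between observations and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    """Root mean squared error, in the units of the response."""
    return math.sqrt(mse(y, yhat))

def r_squared(y, yhat):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def pearson(y, yhat):
    """Pearson correlation between observations and predictions."""
    my, mp = sum(y) / len(y), sum(yhat) / len(yhat)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, yhat))
    var_y = sum((a - my) ** 2 for a in y)
    var_p = sum((b - mp) ** 2 for b in yhat)
    return cov / math.sqrt(var_y * var_p)

def kfold_indices(n, k=10, seed=0):
    """Shuffle row indices and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Made-up observed and predicted counts for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
metrics = {"MSE": mse(y_obs, y_pred), "RMSE": rmse(y_obs, y_pred),
           "R2": r_squared(y_obs, y_pred), "r": pearson(y_obs, y_pred)}
folds = kfold_indices(3105, k=10)  # 3,105 communities, as in the study
```

In a cross-validated evaluation, each fold is held out in turn, the model is fitted on the remaining nine, and the held-out predictions are pooled before computing the correlation.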

5. Discussion

The study produced several notable findings whose underlying mechanisms warrant discussion. Firstly, the BRT model revealed significant non-linear relationships among variables across different scales. The variable "distance from the city center" consistently ranked high across several scales. This may be attributed to factors such as higher density and more complex human mobility patterns nearer to the city center, reflecting the intricate mechanisms of COVID-19 transmission. Furthermore, the non-linear results provided a clearer depiction of how COVID-19 addresses vary with different variables. While previous studies have generally acknowledged a positive correlation between population density and COVID-19 transmission [73–75], this study elucidated non-linear relationships by examining COVID-19 addresses associated with low-, medium-, and high-density areas. These findings are pivotal for future research, particularly in densely populated mega-cities like Shanghai, where understanding such non-linear associations is crucial for effective pandemic management strategies.

Secondly, the research results were compared with conclusions drawn from previous studies. Previous research often concluded that a higher NDVI level indirectly mitigates the severity of the COVID-19 pandemic. For instance, Peng et al. explored the relationship between green space and COVID-19 incidence across 266 cities in China, finding a negative correlation between NDVI and COVID-19 incidence [76]. Similarly, Venter et al. suggested that green space contributes to increased social distancing, thereby indirectly reducing COVID-19 transmission [77]. However, this study not only found that higher NDVI values were associated with fewer COVID-19 cases but also pinpointed specific NDVI ranges linked to higher case numbers.

Regarding the SHDI, which reflects mixed land use, previous findings suggested that higher levels of mixed land use were associated with fewer COVID-19 cases, particularly at larger spatial scales (e.g., 15-min walking distance). In this study, at the 300 m scale, SHDI values below 1.5 showed minimal fluctuation in the number of COVID-19 addresses, with a downward trend observed as SHDI increased at the 500 m scale. However, when SHDI values reached 1.5 at both scales, COVID-19 addresses exhibited increased fluctuations. On a broader scale, such as a 15-min walking distance, a higher degree of land-use mixture generally correlated with a reduced number of COVID-19 addresses in the area.

Regarding the FAR, it is important to note that the value range discussed, such as the node at FAR 10 in the results section, may exceed typical neighborhood FAR values. This variation can be attributed to the inclusion of public buildings with higher FAR in certain buffer zones within the main urban area of the mega-city Shanghai. Therefore, when interpreting the model, partial dependence plots within the range of typical neighborhood FAR values (generally below 10) more accurately reflect associations at the community level. Thus, the FAR partial dependence at the 800 m scale remains valuable for further interpretation, despite the BRT model at this scale not necessarily outperforming the other two models. For instance, the partial dependence indicates the corresponding number of COVID-19 addresses when FAR is below 6, a value not significantly higher than the general community FAR range. In essence, interpreting variables in the model should account for these objective circumstances. Furthermore, future research could weight variables based on their contribution and evaluate different regions’ risk levels based on known non-linear relations. This approach could enhance preparedness for future public health emergencies akin to COVID-19.

This study had several limitations, notably uncontrolled confounding. The built environment encompasses various aspects of daily life beyond visible features like buildings, utilities, transportation systems, and land use. Unobserved socio-economic attributes of population groups, including age, health status, and income, add to the difficulty of understanding COVID-19 in terms of equity [78]. These uncontrolled confounders across communities might also have associations with the number of COVID-19 infections. Future research will need to address uncontrolled confounding by considering equity and disparities in the built environment, differences in lockdown enforcement, COVID-19 susceptibility, case reporting rates, and initial/imported COVID-19 case numbers to further enhance the accuracy of the dependent variable. The interactions among the variables should also be considered. The mechanism of heterogeneity in different community types can also be further explored in future research.

In addition to a more comprehensive selection of variables, improving the COVID-19 case data is crucial. First, the research period for obtaining valid data was limited; in this study, it spanned only 25 days. Although the relevant model metrics provide evidence of good performance and validity, future studies should aim to collect data over longer periods to strengthen the findings. Second, the COVID-19 case distribution was characterized by COVID-19 addresses. Going forward, if the government can standardize its data release protocols and provide more accurate and open data while ensuring resident privacy protection, it would greatly benefit the advancement of relevant academic research.

6. Conclusions

This study used BRT models to explore the relationship between built-environment variables and COVID-19 case distribution across different scales. The main findings are as follows: (1) Relationships between built-environment variables and COVID-19 case distribution vary across scales of analysis. Both the relative contribution of built-environment characteristics and their partial dependence differ across scales, indicating varying strengths and directions of association. (2) Non-linear relationships between built-environment characteristics and the COVID-19 case distribution were identified. Key variables such as distance from the city center, population density, housing price, NDVI, SHDI, number of bus stops, and FAR demonstrated specific patterns in their partial dependence. Despite variability in model validity across scales, these non-linear results provide refined insights for pandemic response from the urban planning perspective at different scales.

Based on our results, several policy recommendations can be proposed for community-based healthy city construction in the post-pandemic era. First, urban planning responses to public health emergencies should prioritize the important factors identified by the relative importance ranking of variables. Second, the critical nodes and thresholds identified through the marginal effect curves of non-linear relationships should be incorporated into urban planning considerations. In addition, different planning strategies should be adopted for variable ranges associated with different numbers of COVID-19 cases. For instance, in this study, an SHDI threshold of 1.5 serves as a critical demarcation point: neighborhoods below this threshold require distinct planning measures compared to those above it. Third, moderately compact neighborhood environments with ample land use should be promoted, with particular focus on enhancing green spaces in areas lacking them. Furthermore, community-level healthcare facilities should be enhanced, especially in neighborhoods closer to the city center. These recommendations aim to optimize urban environments to better respond to future public health challenges, drawing on the study’s findings.
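To make the SHDI demarcation concrete, the sketch below computes Shannon’s diversity index in its standard form, SHDI = −Σ p_i ln(p_i), where p_i is the share of category i, and sorts a neighborhood into one of the two planning groups around the 1.5 threshold reported above. The land-use categories, the area values, and the use of the natural logarithm are assumptions made for illustration; the categories actually used to compute SHDI in this study may differ.

```python
import math

SHDI_THRESHOLD = 1.5  # demarcation point reported in the text

def shannon_diversity(areas):
    """Shannon's diversity index, -sum(p_i * ln(p_i)), over category shares.

    `areas` maps a category name to a non-negative amount (e.g. area).
    Categories with zero area contribute nothing to the index.
    """
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("at least one category must have a positive area")
    shdi = 0.0
    for area in areas.values():
        if area > 0:
            share = area / total
            shdi -= share * math.log(share)
    return shdi

def planning_group(shdi, threshold=SHDI_THRESHOLD):
    """Label a neighborhood by its position relative to the SHDI threshold."""
    return "below threshold" if shdi < threshold else "at or above threshold"

# Hypothetical neighborhoods with evenly split land uses (illustrative only).
four_uses = {"residential": 25, "commercial": 25, "green": 25, "public": 25}
five_uses = {"residential": 20, "commercial": 20, "green": 20,
             "public": 20, "industrial": 20}

for name, areas in (("four uses", four_uses), ("five uses", five_uses)):
    value = shannon_diversity(areas)
    print(f"{name}: SHDI = {value:.3f} ({planning_group(value)})")
```

With evenly split categories the index equals ln(k) for k categories, so under these assumptions a neighborhood needs at least five evenly mixed categories to reach the 1.5 threshold (ln 4 ≈ 1.386, ln 5 ≈ 1.609).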

Supporting information

Acknowledgments

The authors would like to thank the editor and anonymous reviewers who read the paper and provided helpful comments for improvement.

References

  1. Lenharo M. WHO declares end to COVID-19’s emergency phase. Nature. 2023;882. pmid:37147368
  2. Leach M, MacGregor H, Scoones I, Wilkinson A. Post-pandemic transformations: How and why COVID-19 requires us to rethink development. World Dev. 2021;138: 105233. pmid:33100478
  3. WHO. From emergency response to long-term COVID-19 disease management: Sustaining gains made during the COVID-19 pandemic. [Cited 01 May 2023].
  4. Rundle A, Diez Roux AV, Free LM, Miller D, Neckerman KM, Weiss CC. The urban built environment and obesity in New York City: A multilevel analysis. Am J Health Promot. 2007;21 Supplement: 326–334. pmid:17465178
  5. Houston D. Implications of the modifiable areal unit problem for assessing built environment correlates of moderate and vigorous physical activity. Appl Geogr. 2014;50: 40–47.
  6. Zhang L, Tan PY. Associations between urban green spaces and health are dependent on the analytical scale and how urban green spaces are measured. Int J Environ Res Public Health. 2019;16: 578. pmid:30781534
  7. Tao Y, Kou L, Chai Y, Kwan M-P. Associations of co-exposures to air pollution and noise with psychological stress in space and time: A case study in Beijing, China. Environ Res. 2021;196: 110399. pmid:33157109
  8. Openshaw S. Developing GIS-relevant zone-based spatial analysis methods. Spat Anal Modell GIS Environ. 1996: 55–73.
  9. Liu J, Wang B, Xiao L. Non-linear associations between built environment and active travel for working and shopping: An extreme gradient boosting approach. J Transp Geogr. 2021;92: 103034.
  10. Ramírez-Aldana R, Gomez-Verjan JC, Bello-Chavolla OY, García-Peña C. Spatial epidemiological study of the distribution, clustering, and risk factors associated with early COVID-19 mortality in Mexico. PLOS ONE. 2021;16: e0254884. pmid:34288952
  11. Liu J, Zhou J, Yao J, Zhang X, Li L, Xu X, et al. Impact of meteorological factors on the COVID-19 transmission: A multi-city study in China. Sci Total Environ. 2020;726: 138513. pmid:32304942
  12. Andersen LM, Harden SR, Sugg MM, Runkle JD, Lundquist TE. Analyzing the spatial determinants of local Covid-19 transmission in the United States. Sci Total Environ. 2021;754: 142396. pmid:33254938
  13. Ives AR, Bozzuto C. Estimating and explaining the spread of COVID-19 at the county level in the USA. Commun Biol. 2021;4: 60. pmid:33402722
  14. Coşkun H, Yıldırım N, Gündüz S. The spread of COVID-19 virus through population density and wind in Turkey cities. Sci Total Environ. 2021;751: 141663. pmid:32866831
  15. Buja A, Paganini M, Cocchio S, Scioni M, Rebba V, Baldo V. Demographic and socio-economic factors, and healthcare resource indicators associated with the rapid spread of COVID-19 in Northern Italy: An ecological study. PLOS ONE. 2020;15: e0244535. pmid:33370383
  16. Luo Y, Yan J, McClure S. Distribution of the environmental and socioeconomic risk factors on COVID-19 death rate across continental USA: a spatial nonlinear analysis. Environmental Science and Pollution Research. 2020;28: 6587–6599. pmid:33001396
  17. Ma S, Li S, Zhang J. Diverse and nonlinear influences of built environment factors on COVID-19 spread across townships in China at its initial stage. Scientific Reports. 2021;11. pmid:34127713
  18. Han Y, Yang L, Jia K, Li J, Feng S, Chen W, et al. Spatial distribution characteristics of the COVID-19 pandemic in Beijing and its relationship with environmental factors. Sci Total Environ. 2021;761: 144257. pmid:33352341
  19. Aabed K, Lashin MMA. An analytical study of the factors that influence COVID-19 spread. Saudi J Biol Sci. 2021;28: 1177–1195. pmid:33262677
  20. Li M, Zhang Z, Cao W, Liu Y, Du B, Chen C, et al. Identifying novel factors associated with COVID-19 transmission and fatality using the machine learning approach. Sci Total Environ. 2021;764: 142810. pmid:33097268
  21. Bloise F, Tancioni M. Predicting the spread of COVID-19 in Italy using machine learning: Do socio-economic factors matter? Struct Chang Econ Dyn. 2021;56: 310–329. pmid:35317020
  22. Tantrakarnapa K, Bhopdhornangkul B, Nakhaapakorn K. Influencing factors of COVID-19 spreading: A case study of Thailand. Z Gesundh Wiss. 2022;30: 621–627. pmid:32837844
  23. Harris P, Harris-Roxas B, Prior J, Morrison N, McIntyre E, Frawley J, et al. Respiratory pandemics, urban planning and design: A multidisciplinary rapid review of the literature. Cities. 2022;127: 103767. pmid:35663146
  24. Hussein HAA. Investigating the role of the urban environment in controlling pandemics transmission: Lessons from history. Ain Shams Eng J. 2022;13: 101785.
  25. Alidadi M, Sharifi A. Effects of the built environment and human factors on the spread of COVID-19: A systematic literature review. Sci Total Environ. 2022;850: 158056. pmid:35985590
  26. Kwok CYT, Wong MS, Chan KL, Kwan MP, Nichol JE, Liu CH, et al. Spatial analysis of the impact of urban geometry and socio-demographic characteristics on COVID-19, a study in Hong Kong. Sci Total Environ. 2021;764: 144455. pmid:33418356
  27. Russette H, Graham J, Holden Z, Semmens EO, Williams E, Landguth EL. Greenspace exposure and COVID-19 mortality in the United States: January–July 2020. Environ Res. 2021;198: 111195. pmid:33932476
  28. Khavarian-Garmsir AR, Sharifi A, Moradpour N. Are high-density districts more vulnerable to the COVID-19 pandemic? Sustain Cities Soc. 2021;70: 102911. pmid:36567891
  29. Tong H, Li M, Kang J. Relationships between building attributes and COVID-19 infection in London. Build Environ. 2022;225: 109581. pmid:36124292
  30. Jin X, Long Y, Sun W, Lu Y, Yang X, Tang J, et al. Evaluating cities’ vitality and identifying ghost cities in China with emerging geographical data. Cities. 2017;63: 98–109.
  31. Gaisie E, Oppong-Yeboah NY, Cobbinah PB. Geographies of infections: Built environment and COVID-19 pandemic in metropolitan Melbourne. Sustain Cities Soc. 2022;81: 103838. pmid:35291308
  32. Wang Q, Dong W, Yang K, Ren Z, Huang D, Zhang P, et al. Temporal and spatial analysis of COVID-19 transmission in China and its influencing factors. Int J Infect Dis. 2021;105: 675–685. pmid:33711521
  33. Xie Z, Qin Y, Li Y, Shen W, Zheng Z, Liu S. Spatial and temporal differentiation of COVID-19 epidemic spread in mainland China and its influencing factors. Sci Total Environ. 2020;744: 140929. pmid:32687995
  34. Benita F, Gasca-Sanchez F. The main factors influencing COVID-19 spread and deaths in Mexico: A comparison between phases I and II. Appl Geogr. 2021;134: 102523. pmid:34334843
  35. Zawbaa HM, El-Gendy A, Saeed H, Osama H, Ali AMA, Gomaa D, et al. A study of the possible factors affecting COVID‐19 spread, severity and mortality and the effect of social distancing on these factors: Machine learning forecasting model. Int J Clin Pract. 2021;75: e14116. pmid:33639032
  36. Yang H, Zhang Q, Helbich M, Lu Y, He D, Ettema D, et al. Examining non-linear associations between built environments around workplace and adults’ walking behaviour in Shanghai, China. Transportation Research Part A: Policy and Practice. 2022;155: 234–246.
  37. Tao T, Wang J, Cao X. Exploring the non-linear associations between spatial attributes and walking distance to transit. Journal of Transport Geography. 2020;82: 102560.
  38. Khanna RC, Cicinelli MV, Gilbert SS, Honavar SG, Murthy GSV. COVID-19 pandemic: Lessons learned and future directions. Indian J Ophthalmol. 2020;68: 703–710. pmid:32317432
  39. Daughton C. The international imperative to rapidly and inexpensively monitor community-wide Covid-19 infection status and trends. Sci Total Environ. 2020;726: 138149. pmid:32315842
  40. Doyle A, Hynes W, Purcell SM. Building Resilient, Smart Communities in a Post-COVID Era. International Journal of E-Planning Research. 2021;10: 18–26.
  41. Zhai W, Liu M, Fu X, Peng Z. American inequality meets COVID-19: Uneven spread of the disease across communities. Ann Am Assoc Geogr. 2021;111: 1–21.
  42. Wali B, Frank LD. Neighborhood-level COVID-19 hospitalizations and mortality relationships with built environment, active and sedentary travel. Health Place. 2021;71: 102659. pmid:34481153
  43. Gao Z, Wang S, Gu J, Gu C, Liu R. A community-level study on COVID-19 transmission and policy interventions in Wuhan, China. Cities. 2022;127: 103745. pmid:35582597
  44. Hong B, Bonczak BJ, Gupta A, Thorpe LE, Kontokosta CE. Exposure density and neighborhood disparities in COVID-19 infection risk. Proc Natl Acad Sci U S A. 2021;118: e2021258118. pmid:33727410
  45. Kwok KO, Li KK, Chan HHH, Yi YY, Tang A, Wei WI, et al. Community responses during early phase of COVID-19 epidemic, Hong Kong. Emerg Infect Dis. 2020;26: 1575–1579. pmid:32298227
  46. Ryan BJ, Coppola D, Canyon DV, Brickhouse M, Swienton R. COVID-19 community stabilization and sustainability framework: An integration of the Maslow hierarchy of needs and social determinants of health. Disaster Med Public Health Prep. 2020;14: 623–629. pmid:32314954
  47. Li X, Li Y, Jia T, Zhou L, Hijazi IH. The six dimensions of built environment on urban vitality: Fusion evidence from multi-source data. Cities. 2022;121: 103482.
  48. Paköz MZ, Işık M. Rethinking urban density, vitality and healthy environment in the post-pandemic city: The case of Istanbul. Cities. 2022;124: 103598. pmid:35125597
  49. Wang J, Yang Y, Peng J, Yang L, Gou Z, Lu Y. Moderation effect of urban density on changes in physical activity during the coronavirus disease 2019 pandemic. Sustain Cities Soc. 2021;72: 103058. pmid:34840936
  50. Li B, Peng Y, He H, Wang M, Feng T. Built environment and early infection of COVID-19 in urban districts: A case study of Huangzhou. Sustain Cities Soc. 2021;66: 102685. pmid:33520609
  51. Chen Z, Dong B, Pei Q, Zhang Z. The impacts of urban vitality and urban density on innovation: Evidence from China’s Greater Bay Area. Habitat Int. 2022;119: 102490.
  52. Lamíquiz PJ, López-Domínguez J. Effects of built environment on walking at the neighbourhood scale. A new role for street networks by modelling their configurational accessibility? Transp Res A. 2015;74: 148–163.
  53. Zeng C, Song Y, He Q, Shen F. Spatially explicit assessment on urban vitality: Case studies in Chicago and Wuhan. Sustain Cities Soc. 2018;40: 296–306.
  54. Sung H, Go D, Choi CG. Evidence of Jacobs’s street life in the great Seoul city: Identifying the association of physical environment with walking activity on streets. Cities. 2013;35: 164–173.
  55. Zumelzu A, Barrientos-Trinanes M. Analysis of the effects of urban form on neighborhood vitality: Five cases in Valdivia, Southern Chile. J Hous Built Environ. 2019;34: 897–925.
  56. Deng C, Ma J. Viewing urban decay from the sky: A multi-scale analysis of residential vacancy in a shrinking U.S. city. Landsc Urban Plan. 2015;141: 88–99.
  57. Sung H, Lee S. Residential built environment and walking activity: Empirical evidence of Jane Jacobs’ urban vitality. Transp Res D. 2015;41: 318–329.
  58. Wu J, Ta N, Song Y, Lin J, Chai Y. Urban form breeds neighborhood vibrancy: A case study using a GPS-based activity survey in suburban Beijing. Cities. 2018;74: 100–108.
  59. Braun LM, Malizia E. Downtown vibrancy influences public health and safety outcomes in urban counties. J Transp Health. 2015;2: 540–548.
  60. Liu W, Zheng S, Hu X, Wu Z, Chen S, Huang Z, et al. Effects of spatial scale on the built environments of community life circles providing health functions and services. Build Environ. 2022;223: 109492.
  61. Lin H, Lam TY, Peng P, Chiu C. Embedding Boosted Regression Trees approach to variable selection and cross-validation in parametric regression to predict diameter distribution after thinning. Forest Ecol Manag. 2021;499: 119631.
  62. Breiman L. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Stat Sci. 2001;16: 199–231.
  63. Elith J, Leathwick JR, Hastie T. A working guide to boosted regression trees. J Anim Ecol. 2008;77: 802–813. pmid:18397250
  64. Elith J, Leathwick J. Boosted Regression Trees for ecological modeling. R documentation. https://cran.r-project.org/web/packages/dismo/vignettes/brt.pdf [Accessed on 12 June 2011].
  65. Youssef AM, Pourghasemi HR, Pourtaghi ZS, Al-Katheeri MM. Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides. 2016;13: 839–856.
  66. Hu Y, Dai Z, Guldmann JM. Modeling the impact of 2D/3D urban indicators on the urban heat island over different seasons: A boosted regression tree approach. J Environ Manage. 2020;266: 110424. pmid:32392133
  67. Abedi R, Costache R, Shafizadeh-Moghadam H, Pham QB. Flash-flood susceptibility mapping based on XGBoost, random forest and boosted regression trees. Geocarto Int. 2022;37: 5479–5496.
  68. Zhang W, Du Z, Zhang D, Yu S, Hao Y. Boosted regression tree model-based assessment of the impacts of meteorological drivers of hand, foot and mouth disease in Guangdong, China. Sci Total Environ. 2016;553: 366–371. pmid:26930310
  69. Zhang D, Guo Y, Rutherford S, Qi C, Wang X, Wang P, et al. The relationship between meteorological factors and mumps based on Boosted regression tree model. Sci Total Environ. 2019;695: 133758. pmid:31422317
  70. Cheong YL, Leitão PJ, Lakes T. Assessment of land use factors associated with dengue cases in Malaysia using Boosted Regression Trees. Spat Spatiotemporal Epidemiol. 2014;10: 75–84. pmid:25113593
  71. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. Springer Series in Statistics. New York, NY: Springer New York; 2009.
  72. Friedman JH. Greedy function approximation: A gradient boosting machine. The Annals of Statistics. 2001;29: 1189–1232.
  73. Feng Y, Li Q, Tong X, Wang R, Zhai S, Gao C, et al. Spatiotemporal spread pattern of the COVID-19 cases in China. PLOS ONE. 2020;15: e0244351. pmid:33382758
  74. Kadi N, Khelfaoui M. Population density, a factor in the spread of COVID-19 in Algeria: Statistic study. Bull Natl Res Cent. 2020;44: 138. pmid:32843835
  75. Zhang Y, Deng Z, Supriyadi A, Song R, Wang T. Spatiotemporal spread characteristics and influencing factors of COVID‐19 cases: Based on big data of population migration in China. Growth Change. 2022;53: 1694–1715.
  76. Peng W, Dong Y, Tian M, Yuan J, Kan H, Jia X, et al. City-level greenness exposure is associated with COVID-19 incidence in China. Environ Res. 2022;209: 112871. pmid:35123969
  77. Venter ZS, Barton DN, Gundersen V, Figari H, Nowell M. Urban nature in a time of crisis: Recreational use of green space increases during the COVID-19 outbreak in Oslo, Norway. Environ Res Lett. 2020;15: 104075.
  78. Seyedrezaei M, Becerik-Gerber B, Awada M, Contreras S, Boeing G. Equity in the built environment: A systematic review. Building and Environment. 2023;245: 110827.