City size and the spreading of COVID-19 in Brazil

The current outbreak of the coronavirus disease 2019 (COVID-19) is an unprecedented example of how fast an infectious disease can spread around the globe (especially in urban areas) and the enormous impact it causes on public health and socio-economic activities. Despite the recent surge of investigations about different aspects of the COVID-19 pandemic, we still know little about the effects of city size on the propagation of this disease in urban areas. Here we investigate how the number of cases and deaths by COVID-19 scale with the population of Brazilian cities. Our results indicate small towns are proportionally more affected by COVID-19 during the initial spread of the disease, such that the cumulative numbers of cases and deaths per capita initially decrease with population size. However, during the long-term course of the pandemic, this urban advantage vanishes and large cities start to exhibit higher incidence of cases and deaths, such that every 1% rise in population is associated with a 0.14% increase in the number of fatalities per capita after about four months since the first two daily deaths. We argue that these patterns may be related to the existence of proportionally more health infrastructure in the largest cities and a lower proportion of older adults in large urban areas. We also find the initial growth rate of cases and deaths to be higher in large cities; however, these growth rates tend to decrease in large cities and to increase in small ones over time.


Introduction
Human activities have become increasingly concentrated in urban areas. A direct consequence of this worldwide urbanization process is that more people have been living in cities than in rural regions since 2007 [1], and projections indicate that the urban share of the world population could exceed 90% by the end of this century [2]. Besides being increasingly urbanized, we live in an unprecedentedly connected and highly mobile world, with air passengers exceeding 4 billion in 2018 [3]. On the one hand, a highly connected and highly urbanized society has brought us innovation, economic growth, and more access to education and healthcare; on the other, it has also led to pollution, environmental degradation, privacy concerns, more people living in substandard conditions, and suitable conditions for the dissemination of infectious diseases over the globe. In particular, the emergence of infectious disease outbreaks has significantly increased over time, and the majority of these events are caused by pathogens originating in wildlife [4], which in turn has been associated with changes in environmental conditions and land use, agricultural practices, and the rise of large human population settlements [5]. The ongoing outbreak of the novel coronavirus (SARS-CoV-2) seems to fit this context well: it was first identified in December 2019 in Wuhan, an influential Chinese city exceeding 11 million inhabitants, and apparently originated from the recombination of bat and Malayan pangolin coronaviruses [6]. The coronavirus disease 2019 (COVID-19) initially spread in Mainland China but rapidly caused outbreaks in other countries, making the World Health Organization first declare a "Public Health Emergency of International Concern" in January 2020 and then reclassify the outbreak as a pandemic in mid-March. As of 16 August 2020, over 21.2 million cases of COVID-19 have been confirmed in almost all countries, and the worldwide death toll exceeds 761 thousand people [7].
The COVID-19 pandemic poses unprecedented health and economic threats to our society, and understanding its spreading patterns may reveal important factors for mitigating or controlling the outbreak.
Recent works have focused on modeling the initial spreading of COVID-19 [8] or the fatality curves [9], projecting the outbreak peak and hospital utilization [10], and understanding the effects of mobility [11], demography [12], travel restrictions [13], behavior change on the virus transmission [14], mitigation strategies [15], non-pharmaceutical interventions [16], and network-based strategies for social distancing [17], among many others. Despite the surge of scientific investigations on the subject, little attention has been paid to understanding the effects of city size on spreading patterns of cases and deaths by COVID-19 in urban areas. The idea that size (as measured by population) affects different city indicators has been extensively studied and can be summarized by the urban scaling hypothesis [18][19][20][21]. This theory states that urban indicators are non-linearly associated with city population such that socio-economic indicators tend to present increasing returns to scale [18,22,23], infrastructure indicators often display economy of scale [18,19], and quantities related to individual needs usually scale linearly with city population [18,19].
Urban scaling studies of health-related quantities have shown that the incidence and mortality of diseases are non-linearly related to the city population [24][25][26][27]. Despite the existence of several exceptions [27], non-infectious diseases (such as diabetes) are usually less prevalent in large cities, while infectious diseases (such as AIDS) are relatively more common in large urban areas. This difference is likely to reflect the fact that people living in large cities tend to have proportionally more contacts and a higher degree of social interactions than those living in small towns [19,28]. Within this context, the recent work of Stier, Berman, and Bettencourt [29] has indicated that large cities in the United States experienced more pronounced growth rates of COVID-19 cases during the first weeks after the introduction of the disease. Similarly, Cardoso and Gonçalves [30] found that the per capita contact rate of COVID-19 increases with the size and density of cities in the United States, Brazil and Germany. These findings have serious consequences for the evolution of COVID-19 and suggest that large metropolises may become infection hubs with potentially higher and earlier peaks of infected people. Investigating whether this behavior generalizes to other places and how different quantities such as the number of cases and deaths scale with city size are thus important elements for a better understanding of the spreading of COVID-19 in urban areas.
Here we investigate how population size is associated with cases and deaths by COVID-19 in Brazilian cities. Brazil is the sixth most populous country in the world, with over 211 million people, of which more than 85% live in urban areas. While it is likely that the novel coronavirus was already circulating in Brazil in early February 2020 [31], the first confirmed case in the country dates back to 26 February 2020, in the city of São Paulo. Between the first case and 12 August 2020, Brazil has confirmed 3,088,670 cases of COVID-19 (second-largest number) spread out over 98.9% of the 5,570 Brazilian cities. This disease caused 102,817 deaths (second-largest number) with 3,892 cities reporting at least one casualty as of 12 August 2020.

Results
We start by briefly presenting our data set (see Methods for details). Our investigations rely on the daily reports published by the Health Offices of each of the 27 Brazilian federative units. These daily reports update the number of confirmed cases ($Y_c$) and the number of deaths ($Y_d$) caused by COVID-19 in every Brazilian city from 25 February 2020 (date of the first case in Brazil) to 12 August 2020 (date of our last update). From these data, we create time series of the number of cases $Y_c(t_c)$ for each city, where $t_c$ refers to the number of days since the first two daily cases reported in each city. Similarly, we create time series of the number of deaths $Y_d(t_d)$, where $t_d$ refers to the number of days since the first two daily deaths reported in each city. By doing so, we group all cities according to their stage of disease propagation (as measured by $t_c$ or $t_d$) to investigate the evolution of allometric relationships between total cases or deaths and city population. We have also considered different numbers of daily cases or deaths as the reference point, and our results are robust against different choices (from one to seven daily cases or daily deaths; see S1 Appendix). Fig 1A shows the association between the number of confirmed cases and the city population on logarithmic scale ($\log Y_c$ versus $\log P$) for different numbers of days since the first two daily cases. The results indicate that the number of cases is well described by a power-law function of the city population,
$$Y_c \sim P^{\beta_c}, \tag{1}$$
where $\beta_c$ is the so-called urban scaling exponent [18]. Similarly, Fig 1B shows the association between the number of casualties and the city population on logarithmic scale ($\log Y_d$ versus $\log P$) for different numbers of days since the first two daily deaths ($t_d$ = 15, 50, 85 and 120 days). Again, the results indicate that the number of deaths is approximated by a power-law function of the city population,
$$Y_d \sim P^{\beta_d}, \tag{2}$$
where $\beta_d$ represents the urban scaling exponent for the number of deaths.
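The grouping of cities by stage of disease propagation can be illustrated with a short Python sketch. This is not the authors' code: the function name `align_by_onset` and the parameter `min_daily` are hypothetical, and the sketch assumes that "the first two daily cases" means the first day on which at least two new cases (or deaths) were reported, with that day taken as day zero of the aligned series.

```python
import numpy as np

def align_by_onset(cumulative, min_daily=2):
    """Trim a cumulative series so it starts on the onset day.

    Assumption (not stated in the source): the onset day is the first day
    with at least `min_daily` newly reported cases or deaths.
    Returns an empty array if the threshold is never reached.
    """
    cumulative = np.asarray(cumulative, dtype=float)
    # Daily counts are first differences of the cumulative series; the
    # count for the first day is the first cumulative value itself.
    daily = np.diff(cumulative, prepend=0.0)
    onset_days = np.flatnonzero(daily >= min_daily)
    if onset_days.size == 0:
        return cumulative[:0]
    return cumulative[onset_days[0]:]

# Hypothetical cumulative counts for one city (illustrative values only).
aligned = align_by_onset([0, 1, 1, 3, 6])
never_reached = align_by_onset([0, 1, 2])
```

After this step, position k of each aligned series corresponds to k days since onset, so cities can be compared at equal values of the elapsed time regardless of calendar date.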
The results of Fig 1 also show the adjusted allometric relationships (dashed lines) and the best fitting scaling exponents $\beta_c$ and $\beta_d$ (see Methods for details). These exponents exhibit an increasing trend with time so that $\beta_c$ and $\beta_d$ exceed one after some number of days since the first two daily cases or deaths. This dynamic behavior is better visualized in Fig 2, where we depict $\beta_c$ and $\beta_d$ as a function of the number of days since the first two daily cases ($t_c$) or deaths ($t_d$). The scaling exponent for the number of cases $\beta_c$ is sub-linear ($\beta_c < 1$) during the first four months and appears to approach a super-linear plateau ($\beta_c > 1$) as the number of days $t_c$ further increases. The dynamic behavior of the scaling exponent for deaths $\beta_d$ is similar to that of $\beta_c$; however, $\beta_d$ appears to be approaching a plateau larger than the one observed for $\beta_c$.
The evolution of the scaling exponents for cases and deaths indicates that small cities are proportionally more affected by COVID-19 during the first four months. However, this initial apparent advantage of living in large cities vanishes with time and becomes a disadvantage after about four months. This becomes clearer when we estimate the number of cases per capita from Eq (1), that is, $Y_c/P \sim P^{\beta_c - 1}$. Similarly, we can estimate the number of deaths per capita from Eq (2), yielding $Y_d/P \sim P^{\beta_d - 1}$. Thus, we expect the number of COVID-19 cases or deaths per capita to decrease with the city population if $\beta_c < 1$ and $\beta_d < 1$; conversely, these per capita numbers are expected to increase with the city population if $\beta_c > 1$ and $\beta_d > 1$. For instance, because $\beta_c \approx 0.77$ and $\beta_d \approx 0.85$ after 75 days since the first two daily cases or deaths, the number of cases and deaths per capita decreases with population as $Y_c/P \sim P^{-0.23}$ and $Y_d/P \sim P^{-0.15}$. At those particular values of $t_c$ and $t_d$, a 1% rise in the population is associated with a ≈0.23% decrease in the incidence of COVID-19 cases and a ≈0.15% reduction in the incidence of deaths. In a concrete example for $t_c = t_d = 75$ days, we expect a metropolis such as São Paulo (with ≈12 million people) to have ≈54% fewer cases and ≈39% fewer deaths per capita than a medium-sized city such as Maringá/PR (with ≈420 thousand people, ≈1/30 of São Paulo), which in turn is expected to have ≈41% fewer cases and ≈29% fewer deaths per capita than a small-sized city such as Paranaíba/MS (with ≈42 thousand people, ≈1/10 of Maringá).
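The per capita comparisons above follow directly from the scaling relation: if a quantity scales as the population raised to an exponent, the ratio of per capita values between two cities is the population ratio raised to that exponent minus one. The following Python sketch reproduces this arithmetic; the helper name `per_capita_ratio` is hypothetical, and the populations are the rounded values quoted in the text, so the results agree with the quoted percentages only to within about one percentage point.

```python
def per_capita_ratio(pop_large, pop_small, beta):
    """Ratio of per capita counts (large city over small city) when Y ~ P**beta."""
    return (pop_large / pop_small) ** (beta - 1.0)

# Rounded populations quoted in the text.
sao_paulo, maringa, paranaiba = 12e6, 420e3, 42e3

# Exponents quoted for 75 days since the first two daily cases or deaths.
beta_cases, beta_deaths = 0.77, 0.85

cases_sp_vs_maringa = per_capita_ratio(sao_paulo, maringa, beta_cases)
deaths_sp_vs_maringa = per_capita_ratio(sao_paulo, maringa, beta_deaths)
cases_maringa_vs_paranaiba = per_capita_ratio(maringa, paranaiba, beta_cases)
deaths_maringa_vs_paranaiba = per_capita_ratio(maringa, paranaiba, beta_deaths)

# A ratio below one means the larger city has fewer cases or deaths per capita.
print(f"Sao Paulo vs Maringa, cases: {1 - cases_sp_vs_maringa:.1%} fewer per capita")
print(f"Sao Paulo vs Maringa, deaths: {1 - deaths_sp_vs_maringa:.1%} fewer per capita")
```

The same function applies to super-linear exponents, in which case the ratio exceeds one and the larger city has more cases or deaths per capita.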
However, both scaling exponents increase with time, such that this urban advantage vanishes and becomes a disadvantage during the long course of the pandemic. By considering our latest estimates for the scaling exponents, we find $\beta_c \approx 1.04$ ($t_c$ = 144 days) and $\beta_d \approx 1.12$ ($t_d$ = 120 days). Thus, at these particular values of $t_c$ and $t_d$, we expect the number of cases per capita to slightly increase with population ($Y_c/P \sim P^{0.04}$) and the number of fatalities per capita to increase with population as $Y_d/P \sim P^{0.12}$. For $\beta_d \approx 1.12$ at $t_d$ = 120 days, we therefore expect a metropolis such as São Paulo (≈12 million people) to have ≈50% more deaths per capita than Maringá/PR (≈420 thousand people), which in turn is expected to have ≈32% more deaths per capita than Paranaíba/MS (≈42 thousand people). Figs 8-14 in S1 Appendix show that the scaling relations for the number of cases and deaths per capita support the previous discussion.
The latest estimates of $\beta_c$ found for cases of COVID-19 are smaller than those reported for the 2009 H1N1 Pandemic in Brazil ($\beta_c \approx 1.2$) and HIV in Brazil and the United States ($\beta_c \approx 1.4$) [27]. Similarly to what we observe for the cases of COVID-19, the allometric exponent for HIV cases in Brazil was initially sub-linear during the 1980s, became super-linear after the 1990s, and started to approach a super-linear plateau after the 2000s [27]. However, the evolution of the allometry for HIV has been much slower than what we have observed for COVID-19. Another interesting point reported by Rocha, Thorson, and Lambiotte [27] is that the number of H1N1 cases in Brazil started to scale linearly with city population in 2010 (one year after the first outbreak). These authors also argue that this reduction in the scaling exponent possibly reflects a better response to the spread of H1N1 after the pandemic outbreak. If the behavior observed in the 2009 H1N1 Pandemic generalizes (at least in part) to the current COVID-19 pandemic, we would expect a decrease in the values of $\beta_c$ in the future. The latest estimates of $\beta_d$ for COVID-19 deaths are larger than those reported for diabetes ($\beta_d \approx 0.8$), heart attack ($\beta_d \approx 1$) and cerebrovascular accident ($\beta_d \approx 1$) in Brazil after the 2000s [27]. Conversely, scaling exponents related to disease mortality in Brazil displayed a decreasing trend with time, and values as high as 1.25 were observed for diabetes in 1996 ($\beta_d \approx 1.22$) and heart attack in 1981 ($\beta_d \approx 1.25$) [27]. The convergence of these exponents to linear or sub-linear regimes may reflect the increasing access to medical facilities in urban areas [27].
Based on currently available data (Fig 2), it is hard to confidently assert whether the values of $\beta_c$ and $\beta_d$ will remain larger than one during the long-term course of the pandemic. However, the persistence of this behavior would indicate that large cities are likely to be more affected at the end of the COVID-19 outbreak. Part of this behavior may be due to large cities testing for COVID-19 proportionally more than small ones. Results for the United States indicate that more rural states have lower testing rates and detect disproportionately fewer cases of COVID-19 [32]. As Brazilian cities are likely to suffer from this bias, we would expect a decrease in the scaling exponent $\beta_c$ after the observed increasing trend, depending on the magnitude of this effect (that is, as small cities increase their testing capabilities, their number of cases tends to increase and bend the scaling law downwards).
On the other hand, it is clearer that large cities were proportionately less affected during the initial months (since the first two daily cases or deaths) of the pandemic. We believe there are at least two possible explanations for this behavior. First, it may reflect an "increasing urban advantage" where the larger the city, the greater the access to medical facilities and thus the chance of receiving appropriate treatment against the coronavirus disease. A second cause can be associated with changes in age demographics with the city population; specifically, a smaller proportion of older adults at high risk for severe illness and death from COVID-19 leads to a reduced number of deaths per capita. Another possibility is that the strategies and policy responses of large and small cities to COVID-19 are different, which in turn may lead to different efficiency in containing the pandemic. These responses are highly heterogeneous at the national level [33,34] as well as among counties in the United States [35]. Among these three possibilities, we did not explore the possible effects of different city strategies against COVID-19, but in light of the findings for the United States [35], this effect is likely to play an important role in the Brazilian case and may deserve further investigation.
To test for an increasing urban advantage for the treatment of COVID-19 during the initial spread of the disease, we investigate the scaling relation between the number of hospital intensive care unit (ICU) beds and city population. Because critically ill patients frequently require mechanical ventilation [36,37], the number of ICU beds has proved to be crucial for the treatment of COVID-19. Fig 3A shows the allometric relationship between the number of ICU beds from private and public health systems ($Y_{icu}$, as of April 2020) and the population, where a super-linear relationship emerges with scaling exponent $\beta_{icu} \approx 1.16$. The super-linear scaling of ICU beds indicates that large Brazilian cities are better structured to deal with critically ill patients, which in turn may partially explain the reduction of deaths per capita with the city size during the initial three to four months since the first two daily deaths. It is worth noting that the Brazilian Public Unified Health System (Sistema Único de Saúde, SUS) is decentralized and composed of "health regions", contiguous groups of cities usually formed by a large city and its neighboring cities [38]. Cities within the same health region may share medical services, which may in turn partially explain the reduction of the structural advantages of large urban areas during the long-term course of the pandemic.
We have also investigated how the age demographic distribution changes with city population. Estimates have shown that the case fatality rate of COVID-19 is substantially higher in people older than 60 years (0.32% for those younger than 60 years versus 6.5% for those older than 60 years [39]). Thus, the age demographics of cities represent an important factor for the number of deaths caused by COVID-19. Fig 3B and 3C show how the number of people older ($P_{hr}$, the high-risk population) and younger ($P_{lr}$, the low-risk population) than 60 years changes with the total population ($P$). We note that the high-risk population increases sub-linearly with city size with an exponent $\beta_{hr} \approx 0.91$, while the low-risk population scales linearly ($\beta_{lr} \approx 1$) with city size. This result shows that large cities have a lower prevalence of adults older than 60 years, such that a 1% increase in city population is associated with a 0.91% rise in the high-risk population. In a more concrete example, we expect a city with one million people to have proportionally ≈19% fewer adults older than 60 years when compared with a city of 100 thousand inhabitants. Thus, a lower prevalence of older adults in large urban areas may also partially explain the initial reduction of the number of deaths per capita with the increase of city population.
In addition to addressing the urban scaling of cases and deaths of COVID-19, we have investigated associations between the growth rates of cases and deaths and the city population (Figs 16-22 in S1 Appendix). As mentioned, the work of Stier, Berman, and Bettencourt [29] shows that the initial growth rates of COVID-19 cases in metropolitan areas of the United States scale as a power-law function of the population with an exponent between 0.11 and 0.20. By using our data and as detailed in Methods, we have estimated the growth rates of cases ($r_c$) and deaths ($r_d$) for Brazilian cities. In agreement with the United States case, our results also indicate that COVID-19 cases initially grow faster in large cities (Fig 23 in S1 Appendix), such that $r_c \sim P^{b_{r_c}}$ with $b_{r_c}$ between ≈0.1 and ≈0.3 during the first three months ($t_c \lesssim 90$, Fig 23 in S1 Appendix). We also found similar behavior for the growth rate in the number of deaths $r_d$, where a power-law relation $r_d \sim P^{b_{r_d}}$ is a reasonable description for the empirical data with a scaling exponent $b_{r_d}$ between ≈0.1 and ≈0.5 during the first three months ($t_d \lesssim 90$, Fig 23 in S1 Appendix).
The growth rate depicts a more instantaneous picture of the COVID-19 spreading process, and its association with size may change during the long-term evolution of the pandemic. These changes may reflect the different actions taken by each city to face the COVID-19 pandemic and other particularities affecting the COVID-19 spreading. For the spreading of COVID-19 in the United States, Heroy [40] has reported that large cities appear to enter an exponential spreading regime earlier than small ones. To better investigate these possibilities in our data, we have estimated the average relationship between the growth rate of cases ($r_c$) and deaths ($r_d$) and the city rank $s$ ($s = 1$ represents the largest city in the data, $s = 2$ the second-largest, and so on) at different periods. Fig 4A shows the results for the growth rates in the number of cases ($r_c$). In agreement with the power-law association between $r_c$ and the city population (Figs 16-22 in S1 Appendix), we note that lower values of the city rank $s$ are associated with higher growth rates $r_c$ in the initial days since the first two daily cases. However, as time goes by, the growth rate of cases starts to decrease in large cities (low-rank values) and to increase in small ones (high-rank values). This result appears to agree with the findings of Heroy [40] in the sense that there is a delay in the emergence of high growth rates of cases between large and small cities. Fig 4B shows the same analysis for the growth rate in the number of deaths $r_d$. While we also observe a decrease in $r_d$ for large cities and an increase for small ones, the differences in $r_d$ are less pronounced than in $r_c$. These findings also emerge when investigating the scaling exponents associated with the growth rates of cases ($b_{r_c}$) and deaths ($b_{r_d}$). The results of Fig 23 in S1 Appendix show that these exponents start to decrease around $t_c \approx t_d \approx 100$ days and become negative in our latest estimates.
It is worth remembering that the time $t_c$ (or $t_d$) is measured in days since the first two daily cases (or first two daily deaths) for each city; thus, the results of Fig 4 do not reflect delays in the emergence of the first case in each city.

Discussion
We have studied scaling relations for the number of COVID-19 cases and deaths in Brazilian cities. Similarly to what happens for other diseases, we found the number of cases and deaths to be power-law related to the city population. During the initial three to four months since the first two daily cases or deaths, we found a sub-linear association of cases and deaths by COVID-19 with city population, meaning that the per capita numbers of cases and deaths tend to decrease with population in this initial stage of the pandemic. We believe this behavior can be partially explained by an "increasing urban advantage" where large cities have proportionally more ICU beds than small ones. In addition, changes in age demography with city size show that large cities have proportionally fewer elderly people who are at high risk of developing severe illness and dying from COVID-19. This may also partially explain the initial reduction of fatalities per capita with the city population. Finally, we have argued that the strategies and policy responses of large and small cities to COVID-19 may also be different and lead to different efficiency in containing the pandemic.
However, we found that this "urban advantage" vanishes in the long-term course of the pandemic, such that the association of cases and deaths by COVID-19 with population becomes super-linear in our latest estimates since the first two daily cases or deaths. If this pattern persists, large cities are expected to be proportionally more affected at the end of the COVID-19 pandemic. This result is in line with the findings for other infectious diseases [25,27] and probably reflects the existence of a higher degree of interaction between people in large cities [19,28]. Because social distancing is currently the only available measure to mitigate the impact of COVID-19, our results suggest that large cities may require more severe degrees of social distancing policies.
In agreement with the results for metropolitan areas in the United States [29], we have found that large cities usually display higher growth rates in the number of cases during the initial spread of COVID-19. However, our results also show that these growth rates tend to decrease in large cities and to increase in small ones in the long-term course of the pandemic. This behavior suggests the existence of a delay in the emergence of high growth rates between large and small cities. Similar behavior was also found in the United States [40], where large cities appear to enter an exponential growth regime earlier than small towns. The existence of this delay suggests that the initial slow-spreading pace of COVID-19 in small cities is likely to be a transient behavior.
Together with the recent findings of Stier, Berman, and Bettencourt [29] and Heroy [40] for the United States, as well as those of Cardoso and Gonçalves [30] for the United States, Brazil and Germany, our results suggest that social distancing policies and other actions against the pandemic should take into account the non-linear effects of city size on the spreading of COVID-19.

Data
The primary data set used in this work was collected from the brasil.io API [41]. This API retrieves information from COVID-19 daily reports published by the Health Offices of each of the 27 Brazilian federative units (26 states and one federal district) and makes it freely available. This data set comprises information about the cumulative number of cases and deaths of COVID-19 from 25 February 2020 (date of the first case in Brazil) until 12 August 2020 (date of our last update) for all Brazilian cities reporting at least one case of COVID-19. The brasil.io API also provides population data of Brazilian cities, which in turn relies on population estimates for the year 2019 released by the Brazilian Institute of Geography and Statistics (IBGE). There is a total of 5,507 Brazilian cities with at least one reported case of COVID-19 on 12 August 2020, corresponding to 98.9% of the country's total number of cities. In addition, 3,892 cities suffered casualties from this disease, representing 69.9% of the total. To ensure that our estimates rely on at least 50 cities, we consider a suitable upper threshold for the time series length (Fig 24 in S1 Appendix). The data about age demographics refer to the latest Brazilian census that took place in 2010, while the data about the number of ICU beds are from April 2020. These two data sets are maintained and made freely available by the Department of Informatics of the Brazilian Public Health System (DATASUS) [42].

Fitting urban scaling laws
Urban scaling [18] usually refers to a power-law association between a city property $Y$ and the city population $P$, and it is expressed by
$$Y = Y_0 P^{\beta}, \tag{3}$$
where $Y_0$ is a constant and $\beta$ is the urban scaling exponent. Eq (3) can be linearized by taking the logarithm of both sides, that is,
$$\log Y = \log Y_0 + \beta \log P, \tag{4}$$
where $\log Y$ and $\log P$ are the dependent and independent variables of the corresponding linear relationship. We have estimated the power-law exponents in Eq (3) by using the probabilistic approach of Leitão et al. [43]. Specifically, we have found the probabilistic model with lognormal fluctuations, in which the fluctuations in $\log Y$ are independent of $P$, to be the best description of our data in the majority of scaling laws. Thus, we assume these lognormal fluctuations in all adjusting procedures in order to estimate the values of $\beta$. It is worth mentioning that this maximum-likelihood estimate for scaling exponents is analogous to the one obtained via usual least-squares with the log-transformed variables ($\log Y$ versus $\log P$).
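As an illustration of the least-squares analogue mentioned above, the following Python sketch fits a straight line to log-transformed values. It is a simplified stand-in, not the probabilistic procedure of Leitão et al. [43] used in this work; the function name `fit_scaling_exponent` is hypothetical, and the data are synthetic values generated only to check that a known exponent is recovered.

```python
import numpy as np

def fit_scaling_exponent(population, y):
    """Ordinary least squares on log10(Y) versus log10(P).

    Returns (beta, y0) for the power law Y = y0 * P**beta.
    Cities with non-positive values are dropped before the log transform.
    """
    population = np.asarray(population, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = (population > 0) & (y > 0)
    slope, intercept = np.polyfit(np.log10(population[keep]), np.log10(y[keep]), 1)
    return slope, 10.0 ** intercept

# Synthetic example (not real data): a power law with exponent 1.12 and
# multiplicative lognormal fluctuations that are independent of population.
rng = np.random.default_rng(0)
pop = 10 ** rng.uniform(3, 7, size=500)
deaths = 0.01 * pop ** 1.12 * rng.lognormal(mean=0.0, sigma=0.3, size=500)

beta, y0 = fit_scaling_exponent(pop, deaths)
print(f"estimated exponent: {beta:.3f}")
```

Dropping cities with zero counts before taking logarithms is a design choice of this sketch and may bias the estimate when many small cities report no cases or deaths.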

Logarithmic growth rates of cases and deaths
Let us consider that $x_t$ ($t = 1, \ldots, n$) represents the cumulative number of cases ($Y_c$) or the cumulative number of deaths ($Y_d$) for COVID-19 in a given city at time $t$ (number of days since the first two daily cases, $t_c$, or deaths, $t_d$). The logarithmic growth rate $r_t$ at time $t$ is defined as
$$r_t = \frac{1}{\tau} \log\left(\frac{x_t}{x_{t-\tau}}\right), \tag{5}$$
where $\tau$ is a time delay. If we assume the numbers of cases or deaths to initially increase exponentially ($x_t \sim e^{rt}$, where $r$ is the exponential growth rate), $r_t$ represents an estimate for the growth rate of this initial exponential behavior ($r$). We have estimated $r_t$ for the number of cases ($r_c$) and deaths ($r_d$) up to values of $t_c$ and $t_d$ ensuring a sample size of at least 50 cities for the allometric relations between these growth rates and the city population (Fig 24 in S1 Appendix). All results in the main text were obtained for $\tau = 14$, but our discussion is robust for $\tau$ between 9 and 21 days (Figs 25-38 in S1 Appendix).
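A short Python sketch may help make the growth-rate estimate concrete. It assumes that the logarithmic growth rate is the difference of the logarithms of the cumulative counts at times t and t minus tau, divided by tau; this form is an assumption chosen because it returns exactly r for a series that grows as exp(r t), as the text requires. The function name `log_growth_rate` is hypothetical.

```python
import numpy as np

def log_growth_rate(x, tau=14):
    """Logarithmic growth rate of a strictly positive cumulative series.

    Assumed form: r_t = (log x_t - log x_{t - tau}) / tau.
    Returns one value for each t with a valid lagged observation,
    or an empty array if the series is not longer than `tau`.
    """
    x = np.asarray(x, dtype=float)
    if tau < 1 or x.size <= tau:
        return np.empty(0)
    return (np.log(x[tau:]) - np.log(x[:-tau])) / tau

# Synthetic check (not real data): pure exponential growth with r = 0.1.
days = np.arange(40)
series = 3.0 * np.exp(0.1 * days)
rates = log_growth_rate(series, tau=14)
```

Because the series is aligned to start at the first two daily cases or deaths, the cumulative counts are positive and the logarithm is well defined; a series containing zeros would need to be trimmed first.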