Global convergence of COVID-19 basic reproduction number and estimation from early-time SIR dynamics

The SIR (‘susceptible-infectious-recovered’) formulation is used to uncover the generic spread mechanisms observed by COVID-19 dynamics globally, especially in the early phases of infectious spread. During this early period, potential controls were not effectively put in place or enforced in many countries. Hence, the early phases of COVID-19 spread in countries where controls were weak offer a unique perspective on the ensemble-behavior of COVID-19 basic reproduction number Ro inferred from SIR formulation. The work here shows that there is global convergence (i.e., across many nations) to an uncontrolled Ro = 4.5 that describes the early time spread of COVID-19. This value is in agreement with independent estimates from other sources reviewed here and adds to the growing consensus that the early estimate of Ro = 2.2 adopted by the World Health Organization is low. A reconciliation between power-law and exponential growth predictions is also featured within the confines of the SIR formulation. The effects of testing ramp-up and the role of ‘super-spreaders’ on the inference of Ro are analyzed using idealized scenarios. Implications for evaluating potential control strategies from this uncontrolled Ro are briefly discussed in the context of the maximum possible infected fraction of the population (needed to assess health care capacity) and mortality (especially in the USA given diverging projections). Model results indicate that if intervention measures still result in Ro > 2.7 within 44 days after first infection, intervention is unlikely to be effective in general for COVID-19.


Introduction
A heated dispute about the effectiveness versus risk of smallpox inoculation was playing out in eighteenth-century France, which was to launch the use of mathematical models in epidemiology. This dispute moved inoculation from the domain of philosophy, religion, and disjointed trials plagued by high uncertainty into a debate about mathematical models-put forth by Daniel Bernoulli (in 1766) and Jean-Baptiste le Rond D'Alembert (in 1761), both dealing with competing risks of death and interpretation of trials [1]. Since then, the mathematical description of infectious diseases continues to draw significant attention from researchers and practitioners in governments and health agencies alike. Even news agencies are now seeking out explanations to models so as to offer advice and clarity to their audiences during the (nearcontinuous) coverage of the spread of COVID-19 [2]. The prospect of using mathematical models in conjunction with data is succinctly summarized by the Nobel laureate Ronald Ross, whose 1916 abstract [3] continues to enlighten the role of mathematics in epidemiology today. A quotation from this abstract below, which foreshadows the requirements and challenges for mathematical models to describe emerging epidemics such as COVID-19 [4,5], needs no further elaboration: It is somewhat surprising that so little mathematical work should have been done on the subject of epidemics, and, indeed, on the distribution of diseases in general. Not only is the theme of immediate importance to humanity, but it is one which is fundamentally connected with numbers, while vast masses of statistics have long been awaiting proper examination. But, more than this, many and indeed the principal problems of epidemiology on which preventive measures largely depend, such as the rate of infection, the frequency of outbreaks, and the loss of immunity, can scarcely ever be resolved by any other methods than those of mathematical analysis.
The classic susceptible-infectious-recovered (SIR) paradigm, initiated in the late 1920s [6], now provides a mathematical framework that describes the core transmission dynamics of a wide range of human diseases [7][8][9][10][11][12], including COVID-19 [13]. A key parameter in the SIR paradigm is the basic reproduction number (R o ). The R o is defined by the average number of secondary cases arising from a typical primary case in an entirely susceptible population of size S o [14][15][16]. The usefulness of R o and uncertainty in its estimation are not a subject of debate, as reviewed elsewhere [17], and therefore are not further discussed here.
The R o for COVID-19 and other diseases is commonly estimated directly from case data or by fitting the SIR model or one of its many variants to the data [14,[18][19][20][21][22]. Unsurprisingly, the R o estimates often exhibit large uncertainty. For COVID-19, mean estimated R o ranges between 1.95 and 6.47, with corresponding error estimates giving upper and lower bounds of 1.4 and 8.9 (Fig 1). For a given virus, such variability in R o is attributed to local spatio-temporal variability in public health resources, interventions, and how individuals in a population interact, among others, as well as the estimation method used [15,17]. Due to this local variability, it is commonly held that R o is a site-specific parameter that cannot be directly transferred between sites. The rapid availability of global data for COVID-19 has allowed an unprecedented comparison across a diverse array of populations, which is absent from the prior literature on R o estimation.
In the analysis herein, the SIR model is used to uncover generic spread mechanisms observed by COVID-19 dynamics globally, especially in the early phases of infectious spread. During this early period, potential controls were not effectively put in place or enforced in many countries around the world despite early warning signals from China, Iran, and later on, Italy. Hence, the early phases of COVID-19 spread in many countries where controls were weak offer a unique perspective on the ensemble-behavior of COVID-19 R o . The analysis shows that there is global convergence (i.e. across many nations) to an uncontrolled R o = 4.5 for COVID-19 describing early times spread from the SIR model. This value is compared to a number of published R o estimates for COVID-19 with a timeline summary featured in Fig 1. These published estimates along with the methods used to infer R o , the published uncertainty, and the original data source are provided in the Appendix.
Clearly, such wide ranging values of R o in Fig 1 motivate further analysis of R o variability across populations, the objective of this analysis. Other aspects that are considered in the estimation of R o from SIR are the effects of ramp-up in testing at the early phases of disease spread and the role of super-spreaders. These two effects are briefly discussed using idealized scenarios and model calculations. The implications for evaluating potential control strategies from this uncontrolled R o are considered in the context of mortality and maximum infections.

Definitions and nomenclature
Mathematical models of disease spread assume that a population within a compartment (e.g., city, region, country) can be subdivided into a set of distinct classes [11]. The SIR model classifies individuals in the compartment as one of three classes: susceptible (S), infectious (I), and recovered or removed (R). Infectious individuals spread the disease to susceptible individuals and remain in the infectious class for a given period of time known as the infectious period before moving into the recovered (or removed) class. Individuals in the recovered class are assumed to be immune for an extended period (or removed from the population). For the total population N = S + I + R, the dynamical system describing the SIR equations are given as where β(I/N) is known as the force of infection and coefficients β and γ must be externally supplied. Moreover, this system requires the specification of three initial conditions at time t = 0, S(0), I(0), and R(0). For COVID-19, it is assumed that R(0) = 0 and I(0)�S(0) at the initial outbreak. For the initial conditions selected here, N = S(0) + I(0) + R(0) � S(0), which is labeled S o for notational simplicity and consistency with the SIR literature. The basis of the latter assumption is that the number of deceased individuals is �S o . The dynamical system in Eqs (1), (2), (3) has only one equilibrium point: I = 0 for any S > 0, which is a disease-free stable equilibrium (i.e. as t ! 1, I(t) ! 0).

SIR model assumptions
The SIR model makes a number of assumptions, including a closed system with no changes in natural births or natural deaths occurring during the short-lived outbreak. The infection is assumed to have negligible latent period so that an individual becomes infectious when infected. For this reason, the SIR model might underestimate R o but the uncertainty surrounding the incubation period possibly precludes any advantage of adding a 'incubation' compartment to the SIR model [23] at this stage. Recovering from infection is also assumed to confer long-term immunity, yet to be verified for COVID-19. The most objectionable assumption in SIR dynamics is the use of the so-called 'mass-action' principle. As with all compartment models, mass action assumes that the rate of encounter between I and S is proportional to their product. For this assumption to hold, it requires that members of I and S be uniformly distributed in the space of the compartment [24]. Individuals -unlike molecules in an ideal solution within a closed container-do not mix homogeneously. Nonetheless, the use of the mass action principle serves as one reference to estimate R o in a consistent manner across differing countries using the SIR framework. The presence of super-spreaders on R o estimates using spatially-extended analysis of SIR is to be discussed later on.
The parameters γ and β encode the main properties of the epidemics and the population response to it. The γ = 1/D is generally interpreted as the inverse of the mean recovery time D. The D varies with the nature of the disease and the recovery from it, which depends on the medical facilities and resources available. For COVID-19, the best information on the speed of recovery comes from a World Health Organization study examining more than 55,000 cases in China [25]. They found that for mild illness, the time from the onset of symptoms to natural recovery is, on average, 14 days. This estimate was also supported in other published studies (e.g., [26]), though as much as 6-8 weeks were recorded for severe infections. Because I is dominated by mild cases thus far, D = 14 d is selected here.
With this assumption, the remaining model parameter β must be determined empirically or from separate studies. The β reflects the multiplicative effect of two factors: (1) the transmissibility of the infectious disease (= T r ) or the probability of disease transmission after an encounter between a susceptible and an infected and (2) the number of contacts per unit time k each infected individual has with susceptibles. Hence, β = kT r . Factors such as hand-washing and sanitizing or wearing masks reduce T r whereas social distancing, self-isolation, and closure of public or crowded spaces reduce k. From Eq (2), it is evident that dI/dt will be positive (outbreak) or negative (epidemic contained) depending on the sign of (β(S/S o ) − γ), which is one of the main reasons the basic reproduction number is sought.

The basic reproduction number R o
As earlier stated, the average rate of recovery is set to γ = 1/D. Given the value of D (in days), the probability that an individual remains infected in an infinitesimal time period δτ is 1 − γ (δτ). Therefore, the probability that this individual remains infected for an amount of time τ is lim δτ ! 0 (1 − γδτ) τ/(δτ) = exp(−γτ). In other words, τ, the time that an infected individual remains infected, is exponentially distributed with an average of D = 1/γ. In a compartmental model such as the SIR, every individual is initially susceptible and the average number of susceptibles that encounter an infected individual over a period τ is simply βτ. It follows that the average number of new infections caused by an infected individual, which is the basic reproduction number R o , is given by [27] where the γ after the second equality is to normalize p(τ)dτ. Two assumptions underlying the compartmental SIR model are unrealistic, but a proven correspondence between the compartmental SIR and a Poisson graph SIR model justifies its applicability. In the compartmental (or fully-mixed) SIR model, the recovery times are exponentially distributed and every individual has an equal chance per unit time of encountering all other S o − 1 individuals [27]. However, COVID-19 recovery times have been determined to concentrate around D = 1/γ = 14 days. Moreover, infected individuals come in contact with only a handful of other people. But, it was shown that the dynamics of a discrete-time SIR compartmental model (Reed-Frost model) and those of the SIR on a Poisson random graph are equivalent [28]. The Poisson random graph model assumes a constant recovery time and a Poisson distributed degree distribution (i.e., number of contacts for every individual), both more realistic assumptions. It is this correspondence between the SIR model and the random graph that makes the SIR model an appropriate tool to explore the early-time dynamics of COVID-19 spread.

Early-times dynamics of the SIR system
As common with dynamical systems, non-dimensional variables are preferred in the analysis of the phase-space to be conducted next. Here, γ and S o are obvious choices for normalizing time and population pools. Hence, a dimensionless time t� = γt and a dimensionless fraction of individuals s = S/S o , i = I/S o , and r = R/S o are introduced so that the original SIR system is now In this early phase, s(� 1) can be 'de-coupled' from i resulting in an autonomous budget for i given as When R o > 1, di/dt� > 0 leading to an epidemic or, conversely, a containment of the disease. The solution of Eq (8) is an exponential The R o may be determined by regressing log(i) against t, and the slope of this regression determines R o when γ can be separately estimated. More sophisticated fitting procedures can also be conducted on sampled I(t) versus t. A major limitation to this exercise is that I(t) at early times, often determined from reported confirmed cases, is uncertain and depends on testing frequency that may vary in time as I increases. An alternative is to regress di/dt� upon i at early times to detect the highest slope, which can then be used to infer R o . This approach is featured in Fig 2c, which illustrates that the SIR dynamics exhibit rapid deviations from a linear di/dt� with i set by early times thereby underestimating R o (for a given γ). Evidently, inference of R o requires estimates of early time slope, which cannot be easily detected in practice. A non-conventional approach is to present confirmed infection data using a double-log representation of di/dt� versus i, which is featured in Fig 2d. This presentation has a number of advantages and limitations in the analysis of COVID-19 discussed elsewhere [29]. The main advantage is that the early time slope (= R o − 1) visually persists over much of the graph. A significant decline in di/dt� is also required before 'registering' a drop in such a representation. This insensitivity to moderate declines in di/dt� from its initial value may be advantageous in R o estimates. The other main limitation, which is inherent to all such analyses, is shifts in testing frequency at high i, and thus the increase in confirmed cases due to expanded testing. It is to be noted that a log-log representation will be more robust to these shifts, because the overall graph will be biased by the initial slope prior to the initialization of expanded testing. Such bias should lead to increases in di/dt� versus i, not declines from the initial slope (R o − 1) that can be detected. As later shown, such an increase has been noted in several data sets.
With this representation, it is now shown that initial inaction to COVID-19 across many countries around the globe allowed an ensemble estimate of the uncontrolled R o . Because R o is likely to be at maximum when no action to COVID-19 are implemented early on, a maximum theoretical 'boundary-line' can then be derived to describe the spread of COVID-19 for large S o (on log-log representation). This boundary-line analysis can then be used as a logical reference to assess whether measures to reduce β are effective.

Estimating an early-time R o
The same log-log scheme featured in Fig 2d is now applied to the global data set supplied by the European Center for Disease Prevention and Control (ECDPC). The data source provides daily confirmed infections I(t) and deaths reported for each country. The population of each country, used to estimate S o (i.e. all members are susceptible), was obtained from the 2018 United Nations census and provided as part of the ECDPC data base. While daily data are supplied, not all countries report consistently on a daily I(t). For this reason, daily data on infections were smoothed with a 7 day block-average and dI/dt was estimated from the smoothed data. The 7-day block smooths out some of the spurious reporting during certain time periods (e.g. over weekends or during days when the health-care system was overwhelmed and processing along with reporting delayed by few days). It is to be noted here that the abscissa and ordinate are normalized by the same country-level population, meaning that the actual magnitude of S o is not essential. However, such normalization allows for country-to-country comparisons in the same phase-space. The results show a global convergence to R o = 4.5 from early time-analysis in Fig 3. Examples for specific countries are also featured in Fig 4 illustrating the same early slope patterns. Mindful of all the pitfalls in determining R o [17], the global estimate here of R o = 4.5 is roughly commensurate with other entirely independent estimates for COVID-19 discussed in the appendix and featured in Fig 1. The most recent update from a China study suggests an R o = 4.1 [30] whereas for France, the most recent estimate for early times is R o = 4.9 [31]. The initially reported and the much cited R o = 2.2 value [4] from Wuhan, China appears to be low [32] as already noted in Fig 1. A more elaborate estimate of R o based on case reports, incubation periods, high-resolution real-time human travel data, infection data combined with agent-based mathematical models result in R o = 4.7 − 6.6 [32]. Other studies report values between 3.3 and 6.6 [33]. It must be emphasized that the R o determined here reflects 'country-scale' early times assuming the entire country population to be S o , γ = (1/14)d −1 and does not accommodate any early measures enacted to reduce β or increase γ, which were undertaken in China [13] and other countries (e.g. Germany).

Sub-national dynamics and interventions
The same analysis performed for World countries is now applied at a sub-national level, considering Upper Tier Local Authorities (UTLAs) in the UK and provinces in Italy (Fig 5). Results show a higher variability than country-level data (as expected) but the theoretical 'boundary-line' of R o = 4.5 is shown to hold also at finer spatial scales. Cases reported at the beginning of April demonstrate that UK regions are at an early phase of the epidemics (with more ramp-up in testing as later discussed), while Italian provinces are approaching the peak of infections due to strict interventions put in place by national authorities.
Non-pharmaceutical interventions (e.g., social distancing, hand washing, universal masking) are the only measures currently available to limit the spread of COVID-19 [34], while contact tracing and isolation have been implemented to contain infected individuals [35]. Simulation results [35] showed that a COVID-19 outbreak can be controlled within 3 months if such strategies are put in place rapidly and effectively. This has been confirmed in another study [36] that showed that the containment measures employed in China and aimed at reducing human-to-human aerial transmission, succeeded in reducing the reproduction number to below unity within 30 days from implementation.
To consider the impact of interventions that reduce the infection rate over time and have direct effects on local scale dynamics [13], a time-varying R o , labelled as R o,d (i.e. dynamic) can be implemented in the SIR model. A logistic function captures temporal patterns in R o,d , the effective reproduction number, consistent with those estimated for other outbreaks [36][37][38][39], where R c is the controlled value of the dynamic R o,d , k c is the steepness of the intervention curve, and t 50 is the time when R o,d = (R c + R o )/2. It is assumed that, through interventions, the initial R o = 4.5 is reduced to a controlled value of R c = 1.1 after 2/γ days. Model results accounting for different intervention scenarios (Fig 6) resemble the trends observed in the Italian provinces with the timing and magnitude of R o reductions shifting the linear relation down and decreasing the maximum fraction of infected individuals. Such jumps are smoothed over at the national level where a clear deviation from exponential is observed (Fig 4).

An alternative hypothesis: Power-law vs exponential
Whether these results are suggestive of a global convergence to an uncontrolled R o = 4.5 or to some other dimensionless property must not be overlooked. A linear relation on a log-log representation may also be indicative of power-law solutions at early times, already documented in a number of studies for COVID-19 [40,41]. In fact, published analysis of infection data from the top 25 affected countries reveals approximate power-law behavior of the form I(t) * t a (or log(i) = a log(t) + b) with two different growth patterns [40]: steady power law growth with moderate scaling exponents (i.e., a = 3-5) or explosive power law growth with dramatic scaling exponents (i.e., a = 8-11).
Within the confines of the SIR dynamical system framework here, we ask: what are the necessary modifications to obtain power-law solutions at early times? Such a solution, while not unique, may be possible by revising the force of infection as βi m . The original SIR model is recovered when m = 1. For this non-linear force of infection, the SIR system becomes This revision ensures that the total population maintains its constant value (� S o here). The early times dynamics (i.e. S(t) � S o ) for the non-dimensional infection compartment i are now

PLOS ONE
Global convergence of COVID-19 basic reproduction number and estimation from early-time SIR dynamics governed by When m < 1, maintaining a definition of R o > 1 (epidemic), and noting that i � 1, the first term on the right-hand side of Eq (13) is much larger than the second term. In fact, to obtain a maximum exponent enveloping the early-time relation between di/dt� and i, the linear term can be dropped so that di/dt� � R o i m (only a growth phase). On a log-log representation, log (di/dt�) = m log(i) + log(R o ). A constant slope such as those featured in Figs 3 and 4 may simply be estimates of m (instead of R o ). The initial conjecture is that a power-law solution emerges from the modified SIR dynamics when m < 1. However, the slope here (= 3.5) actually exceeds unity contradicting this revised analysis. This finding supports the view that a global convergence to an uncontrolled R o = 4.5 is a more likely explanation than a power-law alternative arising from a non-linear force of infection. To be clear, there are other causes for power-law solutions (e.g. a stochastic β as discussed elsewhere [42]), but those fall outside the domain of deterministic SIR approaches adopted here. Nonetheless, and as a bridge between the studies reporting power-law growth in time for i and the modified SIR here, a relation between m and a is sought. The solution to Eq (13) can be expressed as which is a power-law in t.

PLOS ONE
Global convergence of COVID-19 basic reproduction number and estimation from early-time SIR dynamics (m < 1). It directly follows that m = (a − 1)/a < 1 (as expected), where a > 1 is determined by regressing early-times log(i) versus log(t). Reported a for what has been termed as 'explosive' cases such as the US, UK Canada, Russia, among others [40] all yield an a > 8 (with the US a > 16). Such high a simply confirms that m � 1 (and without much variations), and the early time SIR dynamics does describe reasonably those cases. For low a values, termed as 'steady', the mean a � 4.8, and thus yields an m � 0.8, still not too far from unity. The shortcoming of analyzing I(t) upon t is that absolute figures of I(t) are sensitive to increased COVID-19 testing in time, which is considered next.

Impact of testing ramp-up
A further explanation of early-time deviation from the SIR model (noted in several data sets here) may be time-dependent ramp-up of testing, which reveals existing infections at a rate faster than the infection spread. This hypothesis can be implemented in the SIR model considering the temporal dynamics of the testing capacity, f. Data show that testing capacity rates of increase depend on the country and follow linear or saturating trends [43]. To model a testing capacity that starts small and saturates over time, we assumed the maximum fraction of individuals that can be tested is f = 1, tests are 100% true and evenly distributed across compartments, and testing capacity grows exponentially at a rate k, independent of I, giving, f(t) = 1 − exp(−kt). Therefore, the apparent number of infections, i a , initially grows according to the superposition of the infectious spread rate and testing capacity increase rate, i.e., exp[ . From the data [43], we estimate a typical value of k is 0.02 d −1 , which is negligible when compared to R o − 1. The small value of k relative to R o − 1 indicates that the imprint of testing ramp-up likely does not strongly impact the observed early-time dynamics and the observed convergent slope remains a robust indicator of the early phases of virus dynamics.

The role of super-spreaders on R o estimates
For COVID-19, there is currently no general agreement on a precise definition of a superspreader. In its broadest interpretation, a super-spreader has the propensity to infect a larger than average number of susceptible individuals. This definition does not have a unique link to a precise mechanism and appears to encompass biological, behavioral and environmental variables relevant to disease transmission. We consider a narrow view of super-spreaders as those infectious individuals with high mobility. Within this narrower scope, the concern is to assess how long-distance mobility of few infected individuals (i.e. super-spreaders) impacts the estimates of R o when the phase-space analysis of early times dynamics is used. Because these infectious individuals are highly mobile and depending on the mobility network in each country or region, detailed country-by-country investigation is beyond the scope here. However, the mobility of these super-spreaders can impact the much discussed mass-action assumption in SIR and thus estimates of R o from early times dynamics. We address this effect using an idealized yet generic analysis similar to the ramp-up testing effect earlier discussed. To do so, a country is first divided into identical and equal sized regions each of area given by dx and dy. These regions are 'isolated' and experience their own identical SIR dynamics assuming the same R o and γ. Few infectious individuals arrive into this country (treated as a lattice in an x − y plane) at random locations thereby initially infecting few regions. Area wise, under 0.01% of the country area experiences an infectious individual at t = 0. To amplify the role of super-spreaders, mobility is only allowed between regions by super-spreaders, whereas mass action is still assumed between I and S within each region. To allow for large mobility of these super-spreaders in the SIR analysis, the budget equation for I in each region is revised to become an integro-differential equation (IDE) given as [44] @I @t where all state variables now evolve in space (x, y) and time (t) so that I, S, and R represent I(x, y, t), S(x, y, t), and R(x, y, t) (unless otherwise stated), N o is the initial population in dx by dy region (S o is the entire country population), ϕ(<< 1) is the fraction of infectious individuals that are mobile in a region within a given time step dt (super-spreaders of the I budget) and p (x 0 , y 0 ) describes their spread kernel defined by the probability that infectious individuals at position x, y move to position x 0 , y 0 in a time increment dt. The spread kernel must satisfy the normalizing condition Z þ1 When ϕ = 0 (no super-spreaders), the IDE approach reduces to spatially independent or autonomous SIR models operating in compartments dx × dy with no connectivity or spatial interaction between compartments (i.e. the entire country comprised of regions will experience the same R o ). A number of choices can be made about p(x 0 , y 0 ), which all depend on the mobility network in each country (airports, roads, trains, public-transport, etc.) and analyzing all of them is beyond the scope here. For simplicity, we selected a distance-dependent spatial spread kernel [45] pðrÞ where r 2 = (x − x o ) 2 + (y − y o ) 2 , σ is a measure of the spread of the spatial kernel, and α N is a normalizing constant. Since the interest here is in spatial spread kernels with finite support Ra (i.e. super-spreaders cannot travel to all the corners of the domain in a single dt), α N is determined so that Z r¼Ra This condition yields [45] a N ðRa; sÞ ¼ Other spatial kernels can be specified and subjected to the same normalizing conditions thereby making the IDE approach flexible in terms of choices about spatial spread of infectious individuals. Also, it is possible to include time dependency in the spreading properties, meaning p(x 0 , y 0 , t) changes in time through temporal variations in Ra or σ or both (e.g. to allow for diurnal variations in mobility habits). Model calculations using the Gaussian kernel in Fig 7 show that super-spreaders (ϕ = 0.001) will infect almost the entire domain by t� = 5. The main difference between Fig 2 and the spatially aggregated SIR model in Fig 8 is an initial delay and the occurrence of a rapid rise resembling the data from the UK and Italy regions at early times. However, beyond this initial delay in spreading, the early times dynamics with super-spreaders and without them are commensurate. That is, the effects of super-spreaders resembles the impact of testing ramp-up earlier discussed on the phase-space of di/dt� versus i but with delays.

Size of the epidemic
The maximum infections I max (where dI/dt = 0) can be derived as a function of S o and R o by first dividing the budgets of dS/dt and dI/dt, solving the resulting equation, and noting that which when integrated between t = 0 and t = 1 yields, The solution for S(1)/S o can be analytically derived and linked to the total infected individuals I T using Different definitions of the global mortality rate α m have been considered but, for the purposes of the present study, the ratio of deaths to both symptomatic and asymptomatic cases is most appropriate. This ratio is dubbed the Infection Fatality Rate (IFR) and is different from the Case Fatality Rate (CFR), which is the ratio of deaths to the number of confirmed cases. The IFR is more appropriate because the 'Infected' compartment of the SIR model comprises of both symptomatic and asymptomatic individuals. According to the latest IFR estimates, α m ranges between 0.53 and 0.82% with a mean value of 0.68% [46].
For the USA, the epicenter of COVID-19 at the time of submission of this article on April 10, 2020, we asked how much R o should be reduced by deliberate intervention to maintain mortality below a certain threshold size M o . With S(0) = 327M, we determine how much R o should be reduced as a function of M o assuming different values of α m . These results are featured in Fig 11 and suggest that to maintain mortality below 1 million, R o < 1.5 when α m = 0.53%, a reduction factor of 3 over its uncontrolled value. Under the same scenario, to limit the death toll below 300 thousand people, R o should be reduced to 1.1, a reduction factor of 4. These reduction factors are not entirely unreasonable using non-pharmaceutical measures (social distancing, masks, hand sanitizing, etc. . .).
A natural extension of this exercise is to consider temporal changes in R o following the logistic form in Eq (9). The maximum number of infected I max at time t and cumulative number of infections R(1) � R(t� � 14) can be made to vary as the slope k c and t 50 are changed (Eq (9)). A larger k c signifies more rapid enforcement of intervention policies and a larger t 50 For reference, the South Korea data suggests that early intervention, even when rapidly enforced shortly after the outbreak, resulted in R o = 1.5 [47]. The effectiveness of interventions and any delays can now be linked to mortality and severity by varying t 50 and t 80 . Fig  12 presents how R(t� � 14) and I max are contained for only a restricted envelope of speed and timeliness of policy enforcement. The R(t� � 14) represents the cumulative number of fatalities and I max is proportional to the degree to which resources, like hospital intensive care units, will be overwhelmed at peak infection rate. The results in Fig 12 indicate that if R o > 2.7 within t = 3.5/γ (about 49 d here), a more than 10% reduction relative to S o in I max or R(1) is unlikely.
An implication of Fig 12 is that if R o does not decrease to at least 2.7 by 44 days after first infection, more than a million people are expected to die with an assumed constant mortality rate of 0.53%. For mortality to be confined to a range in the 100,000, then a reduction of R o from 4.5 to 2.7 must be achieved within 14 days of first infection, which did not occur in the USA. As of July 30, 2020, confirmed fatalities in the USA have exceeded 150,000.
Last, it is to be noted that the fraction of individuals that must be immune (either through vaccination or recovery from prior COVID-19 infections) must exceed the herd immune threshold (HIT), which is given by This estimate of HIT sets the limit on the immune population needed to overcome another

Conclusion
The work here has shown a global convergence of R o = 4.5 when no deliberate intervention was taken for COVID-19. This R o was shown to describe reasonably the maximum initial exponential growth rate of COVID-19 (=(R o − 1)γ, where γ = (1/14)d −1 ) in many countries that did not initiate preventive measures within γt = 2. The findings here further support the growing consensus that the initial R o = 2.2 estimate from Wuhan, China is low. The value of R o = 4.5 is much more in line with other estimates (R o = 4 − 6) derived from far more complex models. Model calculations and theoretical considerations offered here delineate the conditions when this R o estimate is robust to the inclusion of other mechanisms such as super- spreaders and ramp-up in initial testing. The critical herd immunity level that must be reached is 78% to ensure COVID-19 does not become an epidemic again. This estimate sets a maximum limit on the vaccination required.

Appendix: Compilation of R o estimates from previous studies
A recent review [21] showed that existing estimates of R o range from 1.4 to 6.49, with a mean of 3.28 and an interquartile range of 1.16. Early studies reported lower R o values [21] and results often vary with the estimation method. Stochastic, statistical, and mathematical methods provided mean estimates of 2.44, 2.67, and 4.2, respectively [21], and SIR-based approaches are likely to provide higher values compared to other methods [36]. Similar results were presented in another meta-analysis [48] reporting a mean R 0 of 3.38 (values ranging between 1.90 to 6.49) but did not find a significant effect of different estimation methods. A summary of these results is provided in Table 1.