Evaluating the effect of Chinese control measures on COVID-19 via temporal reproduction number estimation

Control measures are necessary to contain the spread of serious infectious diseases such as COVID-19, especially in its early stage. We propose to use temporal reproduction number an extension of effective reproduction number, to evaluate the efficacy of control measures, and establish a Monte-Carlo method to estimate the temporal reproduction number without complete information about symptom onsets. The province-level analysis indicates that the effective reproduction numbers of the majority of provinces in mainland China got down to < 1 just by one week from the setting of control measures, and the temporal reproduction number of the week [15 Feb, 21 Feb] is only about 0.18. It is therefore likely that Chinese control measures on COVID-19 are effective and efficient, though more research needs to be performed.


Introduction
Emerged from Wuhan City, the novel coronavirus diseases rapidly expanded since December 2019. Early analyses indicated that COVID-19 has middle-to-high transmissibility, with preliminary estimation of basic reproduction number R 0 lying in the range [2.0, 4.0], e.g., 1.4-3.9 [1], 2.47-2.86 [2] and 2.8-3.9 [3]. After a period of stealthy spread, on 20 January 2020, COVID-19 was identified as a B-type infectious disease in China, and the control measures were set according to the standard of A-type infectious disease. Roughly speaking, 21 January 2020 can be considered as the starting date of control, on which every province in China took COVID-19 spread as an emergency event and launched strong control measures according to directives of the central government. These control measures have achieved remarkable success, with daily number of confirmed cases quickly decreasing after a short expansion lasting about two weeks from 21 January 2020.
In general, basic reproduction number R 0 can be used to characterize the transmissibility of infectious diseases. It refers to the average number of individuals who will be infected by one infected case in a fully susceptible population without external interventions. Without control, infectious diseases will gradually die out if R 0 < 1, will spread exponentially and become epidemics if R 0 > 1, and will become endemic in the population if R 0 � 1. The basic reproduction number is far different for different infectious diseases, for example, Zika: 1.4-6.6 [4], H1N1: 1.4-3.1 [5], dengue: 1.52-3.90 [6], Ebola: 1.3-2.7 [7], SARS: 2.2-3.7 [8], MERS: 2.0-6.7 [9], smallpox: 3.5-6.0 [10], measles: 12-18 [11], pertussis: 12-17 [12], etc. Usually, it is difficult to directly measure the value of R 0 since R 0 is affected by numerous biological, sociobehavioral, and environmental factors [13], and thus statistical models are widely applied to estimate R 0 [14][15][16][17]. We always assume the population is fully susceptible without control measures in estimating the value of R 0 . However, during the epidemic spreading, various control measures will be introduced to contain the spread, so we should adopt time-related reproduction number to quantify the temporal situation of the spread and the control efficacy. The most intuitive metric is the effective reproduction number R t , which is defined as the average number of secondary cases infected by an infected case with symptom onset at day t. Various methods to estimate R t under different scenarios were proposed in the literature [18][19][20][21][22][23].
If complete information about who infects whom is known, R t can be determined by simply counting secondary cases. However, tracing information is usually incomplete or not timely available, and thus statistical approaches are required. Willinga and Teunis [24] proposed a likelihood-based method to estimate R t from the epidemic curve and the distribution of generation intervals, which works only for the period in which all secondary cases would have been detected, thus resulting in a time lag about 19 days for COVID-19 (95th percentile of the distribution of generation intervals [1]). By accounting for yet unobserved secondary cases via Bayesian inference, Cauchemez et al. [25] extended the Wallinga-Teunis method to provide real-time estimates of R t .
In real world, the situation may be even worse, where not only the complete tracing records, but also the full epidemic curves are unknown. In order to deal with such situation, we proposed a Monte-Carlo method to estimate the full epidemic curve by using a small number of cases with known symptom onsets, and then to estimate the reproduction number.

Estimation of R t
Distribution of generation intervals and epidemic curve are two main inputs to estimate R t , where generation intervals refer to time intervals between symptom onsets of index cases and their infected cases, and the epidemic curve records the number of cases with symptom onsets at each day. According to the empirical observations [1], the distribution of generation intervals, q(t g ), can be approximated by a Gamma distribution [26]: where α � 4.866 is the shape parameter and β � 0.649 is the inverse scale parameter. Given two cases i and j with symptom onset times being t i and t j , the likelihood that case i is infected by case j (t i > t j ) is thus Wallinga and Teunis [24] suggested that the expected number of secondary cases infected by case j can be estimated by the sum of likelihoods, as

PLOS ONE
The effective reproduction number can thus be estimated as where C t is the set of cases with symptom onsets at day t. Obviously, R t = R j if j 2 C t since in the Wallinga-Teunis method, cases with the same symptom onset time have the same expected number of secondary cases. We further consider the task to calculate the effective reproduction number R t given the last known onset time T. Obviously, only if T > t, this task is possible. If T � t þ t max g with t max g denoting the maximum generation interval, we can directly apply the Wallinga-Teunis method. However, if t < T < t þ t max g , we need to introduce an additional step with Bayesian inference [25]. Assuming the mean number of secondary cases infected by a case with symptom onset at day t can be decomposed by two parts as where R À t ðTÞ and R þ t ðTÞ are the mean numbers of secondary cases with symptom onsets before or at T and after T, respectively. The value of R À t ðTÞ can be directly estimated by using the Wallinga-Teunis method, and thus we can infer the effective reproduction number as Temporal reproduction number In this paper, we also consider a slightly different reproduction number, called the temporal reproduction number, to include the period-dependent metric R ½t 1 ;t 2 � ðt 1 � t 2 Þ that is defined as the average number of secondary cases infected by an infected case with symptoms onset during the time period [t 1 , t 2 ] [27]. Accordingly, R t is a special case of R ½t 1 ;t 2 � when t 1 = t 2 = t. Similar to the effective reproduction number, the temporal reproduction number can be estimated as where C ½t 1 ;t 2 � is the set of cases with symptom onsets in the range [t 1 , t 2 ].

Inferring the epidemic curve
For both methods proposed by Willinga and Teunis [24] and Cauchemez et al. [25], the epidemic curve must be given so as to estimate the effective reproduction number or temporal reproduction number. However, we usually face an even-worse condition about data accessibility, where not only the complete tracing records, but also the full epidemic curve is unknown. For example, the number of confirmed cases of COVID-19 for each province in mainland China is made public every day, while the symptom onset of each case is not reported by Chinese CDC. Using the collected records with both known symptom onsets and confirmed dates from scattered reports, we can obtain the empirical distribution of time intervals between symptom onsets and laboratory confirmations, say p(t Δ ). Then, we develop a Monte-Carlo method to infer the epidemic curve. Given a case i confirmed at day t (i) , sample a time interval t ðiÞ D according to the distribution p(t Δ ) and set i 0 s symptom onset as t i ¼ t ðiÞ À t ðiÞ D . Specifically, the uniform stochastic model U(0, 1) is used to sample time intervals between symptom onsets and laboratory confirmations. that is, we use uniform stochastic model U(0, 1) to return a random number z between 0 and 1, and then the time interval t ðiÞ D is defined by the constrain Pðt ðiÞ is the cumulative distribution corresponding to p(t Δ ). Combining it with the methods mentioned above, we can estimate effective reproduction number and temporal reproduction number, and thus evaluate the efficacy of control measures.
In this paper, we implement S = 10000 independent runs to obtain the mean values and confidence intervals. Furthermore, we take the interval time between the symptom onsets and laboratory confirmations as the statistic variable X, and use K-S test [28] to estimate the marginal error ε, as s ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi 1 S where S is the sample size (i.e., the number of independent runs), σ is the standard deviation, α is the significance level, and D α is the critical value. In our work, the marginal error is ε = 0.0379 subject to α = 0.05 and S = 10000. In summary, the proposed method can be decomposited into three parts, namely, inputs, output and processes. The inputs include the distribution of generation intervals, the symptom onsets of some cases, and the laboratory confirmations of all cases. The output of the model is the estimated effective reproduction number R t . In the processes, we estimate the distribution of intervals between symptom onsets and laboratory confirmations based on the cases with known symptom onsets and laboratory confirmations and apply the Monte Carlo sampling method to estimate the symptom onsets of other cases based on their laboratory confirmations. So that, the epidemic curve of all cases can be approximately obtained. Finally, the effective reproduction number is estimated according to the epidemic curve and the distribution of generational intervals. The inputs, output and processes of the proposed method are illustrated in Fig 1.

Results
We have collected all 76936 confirmed cases reported in official websites, which are the known ensemble for the mainland China from 11 January 2020 to 22 February 2020. The detailed quantitative information of daily number of confirmed cases is from National Health Commission of China whose URL address is http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml. A very small fraction (4.74%) of these confirmed cases (i.e. 3650 cases) with known symptom onsets are collected from the six provinces that have reported such information. Since all provinces except Hubei applied almost the same control measures, the samples are representative. The confirmed cases for Tibet and Qinghai are only 1 and 15, so we do not analyze these two provinces.
Based on the six provinces with records of symptom onsets, we have checked that individual distributions are close to each other and can be well resembled by the synthesized distribution (see Fig 2).
Moreover, as shown in Fig 3, the synthesized distribution p(t Δ ) can be well fitted by a translational Weibull distribution [29]: where the shape parameter α � 1.48, the scale parameter β � 7.03, and the translational parameter γ = 0.10. We introduce the translational parameter because some cases are confirmed immediately so p(0) > 0, while the original Weibull distribution gives p(0) = 0 for any shape parameter and scale parameter. The province-level results are shown in Table 1. These results demonstrate the impressive achievement by control measures, namely R t for the majority of provinces decreased to < 1 within one week from the starting date of control. Even for Hubei, the epidemic was under control (R t < 1) in just two weeks. In addition, within a month, the average temporal reproduction number over all provinces already decayed to 0.18, a very small value corresponding to a dying phase of the epidemic. Fig 4 reports the estimated R t for each province from 10 January 2020 to 21 February 2020 by using the present method.
Furthermore, we propose a so-called 5Γ-model with N = 1, 000, 000 individuals to illustrate the reliability of the present method. The spreading starts with 10 initially infected individuals, and all infected and susceptible individuals are fully mixed. In the simulation, in each time step (i.e., a day), the number of contacted individuals of each infected case is independently https://doi.org/10.1371/journal.pone.0246715.g001

PLOS ONE
Evaluating the effect of Chinese control measures on COVID-19 via temporal reproduction number estimation drawn from the Gamma distribution Γ 1 . For each contact between an infected individual and a susceptible individual, the infected probability is independently drawn from the Gamma distribution Γ 2 . The time intervals between symptom onsets and laboratory confirmations obey the Gamma distribution Γ 3 . The generation intervals obey the Gamma distribution Γ 4 . The time intervals between laboratory confirmations and removals from the dynamics (i.e., died, recovered, effectively isolated, etc.) obey the Gamma distribution Γ 5 . The means and variances of all the five Gamma distributions are listed in Table 2.
We assume that the symptom onsets of 20% randomly selected confirmed cases are known, and the laboratory confirmations of all cases are known. The effective reproduction number R t can be directly counted by the simulation model as all transmission chains are known. We compare the accuracy of our method and that of the Wallinga-Teunis method, with simulation results being the benchmark. As shown in Fig 5, the effective reproduction numbers estimated by our method are very close to the benchmark values and remarkably more accurate than those obtained by the Wallinga-Teunis method. We have also checked that our estimations work well subject to other reasonable settings of distributions and parameters.  The results are averaged over 10000 independent runs, and the cyan areas denote the 95% confidence intervals. In each run, the Monte-Carlo sampling method is applied to infer the symptom onsets. The gray shadows emphasize the situations where the epidemic is under control (R t < 1). https://doi.org/10.1371/journal.pone.0246715.g004

Discussion
A Monte-Carlo method is proposed to infer the epidemic curve, and then estimate the temporal reproduction number. Our results suggest that Chinese control measures are likely to be effective and efficient, with daily number of confirmed cases quickly decreasing after a short expansion lasting about two weeks from 21 January 2020. By introducing a Monte-Carlo method to estimate the symptom onsets of confirmed cases based on a small number of cases with known symptom onsets, our method can utilize the information of all cases to calculate the effective reproduction number. In comparison, the Wallinga-Teunis method can only make use of the cases with both known symptom onsets and laboratory confirmations. As shown in Fig 5, our method produces obviously more accurate results than the Wallinga-Teunis method. One underlying assumption in our method is that the small number of samples are representative of all cases. This is a reasonable assumption for mainland China since control measures in different provinces are very much the same, all executing directives from the central government. However, in general, if the samples and the inferred cases are in different spreading stages or different areas, the reliability of the present method has to be carefully  checked before any applications. For example, in US, cases in a few states cannot represent the whole country since different states may adopt different controlling strategies and launch different control measures. The distribution p(t Δ ) is not stable, usually with smaller and smaller mean and standard deviation in the progress of an epidemic [18]. Fig 6 compares the estimates of effective reproduction numbers by the true and inferred records of symptom onsets for the six provinces with known symptom onsets. At the very beginning, the estimates from inferred data are smaller than the ones from true records, but they are getting closer and closer and show almost the same t � in the later stage. Indeed, we still overestimate the reproduction number in the early stage, because a large fraction of cases (except Hubei) are importations [18,30]. Fortunately, the present method shows accordance with the one accounting for importations. For example, R t of the three example provinces (Guangdong, Hunan and Shandong) approach 1 at 23 January 2020, 26 January 2020 and 30 January 2020 by the method in [30] and at 25 January 2020, 25 January 2020 and 27 January 2020 by the present method. In a word, this method can be further improved by considering importations [18,30] and using Markov-Chain Monte-Carlo algorithm based on independent transmission assumption [31][32][33].
Government-led actions likely played a role in the reduction of new COVID-19 cases. In order to block transmission and reduce public health hazards, the "five early" measures, namely "early detection, early report, early investigation, early isolation and early treatment", are implemented. Early detection.-Rapid detection and diagnosis to promote the timely and effective management of confirmed and suspected cases. Early report.-Immediate report to the disease control department about confirmed and suspected cases to start investigation and treatment as soon as possible. Early investigation.-Quick epidemiological investigation on the exposure and detailed contacts of confirmed and suspected cases. Through such investigation, we can find out the transmission chain of each case, so as to comprehensively manage all possible infected individuals related to each case. Early isolation.-All confirmed and suspected cases, as well as their close contacts will be isolated as soon as possible. Early treatment.-Quick providing of proper treatment (symptomatic treatment, supportive treatment, antiviral treatment via traditional Chinese medicine, etc.) to prevent the development of symptom. To efficiently and effectively implement the "five early" measures, some advanced information techniques are employed to trace the epidemic spreading. For example, in many cities, the QR codes [34,35] (similar to these used for online payments) are posted in public transport means (buses, subway stations, taxies, etc.), places with possible crowds (supermarkets, bazaars, restaurants, office buildings, etc.) and places worth particular attention (drugstores). People are asked to scan the codes before entering, so the administrators can get the corresponding check-in records with identifications (mobile phone ID). Therefore, if a person is laboratory confirmed or identified as a suspected case, the administrators will know immediately and exactly the persons who have possible contacts with this case by simply searching the check-in records. This operation is completely automatic with private information being protected if an individual is not laboratory confirmed, suspected or having close contacts with the above two kinds of people (even one is confirmed, her/his personal information is only used in fighting the disease). Fig 7 illustrates an example of the QR codes, which was posted in a bus in Chengdu City of Sichuan Province, and people are required to scan the code before getting on the bus. Therefore, if a confirmed or suspected case has taken this bus, we can immediately find out people who have also taken this bus in the same time period. This is in our opinion a simple but perfect tool in the epidemiological perspective to efficiently and effectively block the spread through communities.