Estimating the basic reproduction rate of HFMD using the time series SIR model in Guangdong, China

Hand, foot, and mouth disease (HFMD) has caused a substantial burden of disease in China, especially in Guangdong Province. Based on notifiable cases, we use the time series Susceptible-Infected-Recovered model to estimate the basic reproduction rate (R0) and the herd immunity threshold, understanding the transmission and persistence of HFMD more completely for efficient intervention in this province. The standardized difference between the reported and fitted time series of HFMD was 0.009 (<0.2). The median basic reproduction rate of total, enterovirus 71, and coxsackievirus 16 cases in Guangdong were 4.621 (IQR: 3.907–5.823), 3.023 (IQR: 2.289–4.292) and 7.767 (IQR: 6.903–10.353), respectively. The heatmap of R0 showed semiannual peaks of activity, including a major peak in spring and early summer (about the 12th week) followed by a smaller peak in autumn (about the 36th week). The county-level model showed that Longchuan (R0 = 33), Gaozhou (R0 = 24), Huazhou (R0 = 23) and Qingxin (R0 = 19) counties have higher basic reproduction rate than other counties in the province. The epidemic of HFMD in Guangdong Province is still grim, and strategies like the World Health Organization’s expanded program on immunization need to be implemented. An elimination of HFMD in Guangdong might need a Herd Immunity Threshold of 78%.


Introduction
Hand, foot and mouth disease (HFMD) is a major public health issue in China, affecting over two million children annually [1,2]. Particularly, the incidence of HFMD in Guangdong Province exceeded 30/10,000 per year, which was more than three times the national average [3,4]. An efficient intervention, a necessary and important action to prevent and control the spread of diseases, hinges on a complete understanding of the transmission and persistence of HFMD.
PLOS ONE | https://doi.org/10.1371/journal.pone.0179623 July 10, 2017 1 / 11 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 Transmission of HFMD cases occurs by direct contact with the mucus, saliva, or feces of an infected individual, or through indirect contact via contaminated surfaces. The basic reproduction rate (R0) is used to measure the transmission potential of a disease. R0 is the number of expected secondary infections resulting from a single infectious case [5]. In general, if R0 is greater than one, the disease will continue to spread, and if R0 is less than one, the disease will eventually disappear. Also, R0 can be used to estimate the herd immunity threshold (HIT) needed to stop transmission of communicable diseases [6]. Both R0 and HIT are the key measures used in infectious disease control, immunization and eradication.
Dynamic modelling of infectious diseases has contributed greatly to estimating R0 [7]. Several types of mathematical models have become valuable aid to understanding and fighting transmission of HFMD: A Susceptible-Infected-Recovered (SIR) epidemic model to predict the number of infected in Sarawak, Malaysia, was proposed by Tiing et al. [8]. Roy et al. established a Susceptible-Exposed-Infectious-Recovered (SEIR) model to understand the dynamics of HFMD among young children in Khulna, Bangladesh [9]. In addition, Yang et al. established a Susceptible-Exposed-Infectious-Quarantined-Recovered (SEIQR) model and estimated the R0 for the HFMD transmission in mainland China [10].
However, the transmission dynamics varied across geolocation due to different socio-economic situations, demography, health resources, and people's lifestyles. Targeted strategies for prevention and control would benefit from understanding the transmission dynamics for different geographical levels (i.e. province, city, and county). It is necessary to study the transmission dynamics of HFMD in Guangdong due to its serious disease burden. We used a time series Susceptible-Infected-Recovered model (TSIR) to estimate the R0 and HIT of HFMD in Guangdong based on varying province, city, and county geographical levels. The TSIR was developed by including the time series as a covariate in a nonparametric autoregressive modelling approach to improve forecasting performance, and can be used to estimate transmission dynamics [11]. Moreover, the TSIR model is easier to understand because it is based on simple linear regression [12].
To our knowledge, these principal epidemiological parameters (R0 and HIT) of HFMD in Guangdong Province are yet to be addressed and have not been estimated. In this paper, we adopted the TSIR model to study the transmission dynamics of HFMD in Guangdong. We estimated the spatial pattern of transmission dynamics based on different geographical scales including province, city, and county levels.

Ethics statement
This study was based on official HFMD surveillance data in Guangdong, China. Analyses were conducted at aggregate level and no confidential information was involved. The research study protocol was approved by the Institutional Review Board of the School of Public Health, Sun Yat-sen University. All methods were performed in accordance with the relevant ethical guidelines and regulations.

Study site
Guangdong Province, situated at latitude 20˚15' 0" to 25˚51' 0" N and longitude 109˚75' 0" to 117˚33' 0" E, has a population of 104 million (from 2010 census data). According to the Annual Statistical Report of Guangdong (http://www.gdstats.gov.cn/tjsj/gdtjnj/), the province can be generally divided into four parts: the Pearl River Delta Region, Eastern Region, Western Region, and Mountainous Region. These four regions have different socio-economic profiles, and the Pearl River Delta Region, located around the center of the province, bears a much higher HFMD burden than that borne by the other three areas [13], the accumulative incidence from 2009 to 2012 being 39/10,000 in the Pearl River Delta Region; 7/10,000 in the Eastern Region; 12/10,000 in the Western Region; and 14/10,000 in the Mountainous Region. The population in Guangdong was non-vaccinated from 2009 to 2012 [14].

Data collection
Case-based HFMD surveillance data from 2009 to 2012 were obtained from the National Center for Public Health Surveillance and Information Services and the China Center for Disease Control and Prevention (China CDC) (S1 Dataset). This enhanced national surveillance system has been described in detail elsewhere [15]. Population (pop) and birth statistics were obtained from the Annual Statistical Report of Guangdong and the National Population and Health Science Data Sharing Platform (S1 Dataset) [16]. The cases t was adjusted to the infected part I adj,t for the TSIR model by multiplying by a derivative function: We calculated the cumulate number of cases (X cumCases ) and births (Y cumBiths ) with the numbers of HFMD cases (cases t ) and the births. TheŶ cumBirths was obtained by fitting the locally weighted scatterplot smoothing (LOWESS) non-parametric regression: The susceptible part (S) for the TSIR model was calculated by multiplying pop and a proportion (p) which was simulated from 1% to 40% in our study (S = p × pop). To get a visible spatial pattern of the transmission dynamics of HFMD in Guangdong, we acquired the county-level shapefile map of Guangdong Province from OpenStreetMap (OpenStreetMap Foundation, London, United Kingdom).

Time series susceptible-infected-recovered model
The TSIR model is a discrete-time version of the continuous-time SIR model in which individuals are born and enter the susceptible class of individuals, become infected and infectious with a disease, and recover and are removed thereafter [12]. For HFMD, the characteristic time scale of the disease (i.e. the duration of the transition from infection to recovery and temporary immunity) is about two weeks [17]. Therefore, any new infection must arise from an interaction between a susceptible and an infected individual sometime within the previous biweek. So, we aggregated the data into bi-weekly time steps, and there are twenty-six period (period 1,. . .,26 ) per year, then the future number of infected can be explained as a function of the previous number of infected. The function contained three components including the seasonal transmission (β 1 ×period 1 +. . .+β 26 ×period 26 ), the non-seasonal transmission (α×I adj,t-1 ) and the priori known component (S + Z). The appropriate discrete dynamic two-dimensional compartment model for a childhood disease is thus given by: Here the Z was calculated byŶ cumBirths À Y cumBirths , indicating the newborn population. We were interested in estimating the seasonal transmission parameters β 1,. . .,26 and the mixing parameters rate α which is a correction parameter accounting for non-seasonal heterogeneities [11,18]. The estimated values of α and β 1,. . .,26 obtained from each model were available in S1 Table. Basic reproduction rate and herd immunity threshold R0 is used to measure the transmission potential of a disease, and thought of as the number of secondary infections produced by a typical case of an infection in a population that is totally susceptible [19]. It can therefore be measured by counting the number of secondary cases following the introduction of an infection into a totally susceptible population. For those mathematical models with differential equations (i.e. Li et al. [20,21] and Wu et al. [22]), the R0 was derived according to the concepts of the next generation matrix. While the next generation matrix was replaced by the seasonal transmission parameters β 1,. . .,26 in our study. The β 1,. . .,26 were the regression coefficients of the Eq 4, and could be estimated by using the algorithm of iteratively reweighted least squares (IRLS) [23]. Thus, the R0 in our study is given by: For each model (i.e. serotype-, city-, and county-specific models), we calculated the median R0 of R0 1,. . .,26 as the final result.
HIT is the proportion of a population that must be vaccinated to become immune so that an infectious disease can become stable in that community [6]. When the proportion is reached by vaccination, each case leads to a single new case and the infection becomes stable. We calculated the HIT by:

Software and packages
All of the statistical analyses were conducted in R version 3.3.2 (R Core Team, Vienna, Austria), using packages including base, tsiR, rgeos, maptools, RColorBrewer, and stddiff.

The goodness of fit of the TSIR model for HFMD in Guangdong
The bi-weekly HFMD cases fitted by the TSIR model were matched to the reported cases in Guangdong Province (Fig 1). We calculated the standardized difference (<0.2 represents the balance between reported and fitted groups) for the overall duration (0.009). Other fitting indicators, including mean absolute percentage error (MAPE, 40.94%) and Pearson's productmoment correlation (r = 0.858, P<0.001), were also calculated.

The serotype-specific R0 for HFMD in Guangdong
The median R0 of total cases in Guangdong was 4 The city-specific R0 and HIT for HFMD in Guangdong From the heatmap of R0 (Fig 2), HFMD showed semiannual peaks of activity, including a major peak in spring and early summer (about the 12 th week) followed by a smaller peak in  autumn (about the 36 th week). Based on the patterns of bi-weekly R0, 21 cities were clustered into three groups which might share similar transmission dynamics. Four of these, Zhaoqing, Shaoguan, Shenzhen and Foshan, were clustered into one group with the higher median R0 than other cities. However, these groups did not match geographically. The median R0 of each of the 21 cities ranged from 1.488 to 3.651. In terms of HIT, the proportion of each city's population that need to be vaccinated ranged from 33% to 73%. (Table 2) The county-specific R0 for HFMD in Guangdong The county-level TSIR model showed that R0 among counties varied greatly, ranging from 1 to 33 (Fig 3). Longchuan county had the highest median R0 (33). Huazhou and Gaozhou counties had the second median R0 (23 and 24). A high median R0 (19) also was found in Qingxin county. All these counties are located in the Eastern Region, Western Region, and Mountainous Region.

Discussion
In this paper, we have estimated two principal epidemiological parameters including R0 and HIT of HFMD transmission in Guangdong Province based on surveillance data from 2009 to 2012. According to our study, the HFMD in Guangdong had a median R0 of 4.621 (Table 1), the transmission potential of which was similar to that other infectious diseases including diphtheria (R0: 6-7), mumps (R0: 4-7), polio (R0: 5-7) and rubella (R0: 6-7) [5]. All of these infectious diseases are notifiable diseases in China and were included in the WHO's expanded program on immunization (EPI), with the exception of HFMD. For HFMD cases reported from March 2009 to February 2012 in mainland China, Yang et al. estimated an R0 of 1.392, which indicated an outbreak of HFMD will occur [10]. Our results were thus apt, because the incidence of HFMD in Guangdong was almost three times the national average [1,24] and we updated the data to December 2012 including the increasing disease burden. Thus, to control the spread of HFMD in Guangdong, strategies such as the implementation of EPI vaccination programs should be applied. Our efforts are targeted at increasing the herd immunity to the HIT of 78% (Table 1). For city-specific results, two peaks of R0 were found which corresponded to the observed time series of HFMD, following the Xing et al. findings published in The Lancet's Infectious Diseases journal [15]. Based on the patterns for bi-weekly R0, cities were divided into three groups by clustering analysis (Fig 2). In this way, the HFMD in 21 cities can be prevented and controlled by group and more attention can be paid to the two upcoming peaks. For countyspecific results, we clarified the spatial distribution of the R0 for HFMD in Guangdong (Fig 3). The counties with higher R0 were in the Eastern Region, Western Region, and Mountainous Region, all of which have lower socio-economic profiles than that of the Pearl River Delta Region. Appropriate and enhanced prevention strategies should be implemented in these counties due to the higher R0 indicating the higher potential of outbreak. This study stood out from previous studies by its strengths. First, the TSIR model used in this paper had an acceptable goodness of fit due to bringing the historical time series into the model. The indicators for goodness of fit, including the standardized difference (0.009), the MAPE (40.94%) and Pearson's product-moment coefficient (0.858), were all acceptable; other models, especially the simple SIR, might not render such a high correlation [8]. Second, by exploring the spatial pattern of R0 across the province and indicating the spatial variance from province to county level. Resources can be targeted effectively by the spatial patterns, an improvement over previous studies which usually focused on the temporal perspective [25]. Moreover, our study's HIT calculations for province cities can provide the evidence to support the implementation of the country's urgently needed vaccination program (i.e. EV71 vaccination) [14]; to the best knowledge, previous studies do not put forward this important parameter for disease control.
However, there are also some limitations to this study. The absence of age structure in the TSIR model did not allow us to assess the degree of age-focusing of vaccination. Subsequent models could be further refined to allow for transmission by age. Fortunately, the majority (95%) of HFMD cases comprised children under the age of five and our results reflect on this susceptible population. In addition, the current time series were not updated to the latest year (2015, before the EV71 vaccination was available). Since the cases of HFMD increased annually, we might have slightly underestimated its transmission. However, the quality of the fit suggested that our appeal of strong control and prevention is reasonable as a first step.

Conclusions
Using a relatively simple mathematical model (TSIR), we detected a robust R0 and a signature of herd immunity, driving the outbreak dynamics of HFMD. Our result indicated that the epidemic of HFMD in Guangdong continues to pose a serious threat as all the R0 from models of different geographical levels were greater than one. The R0 of CVA16 cases was higher than that of EV71 cases. Counties with higher R0 including Longchuan, Gaozhou, Huazhou, and Qingxin should be paid more attention to. Targeted strategies for prevention and control such as those of WHO's expanded programs should be implemented. An HIT of 78% might achieve an elimination of HFMD in Guangdong.  Table. The estimated values of α and β obtained from each model. (XLSX)