Reproductive Number and Serial Interval of the First Wave of Influenza A(H1N1)pdm09 Virus in South Africa

Background/Objective: Describing transmissibility parameters of past pandemics from diverse geographic sites remains critical to planning responses to future outbreaks. We characterize the transmissibility of influenza A(H1N1)pdm09 (hereafter pH1N1) in South Africa during 2009 by estimating the serial interval (SI), the initial effective reproductive number (initial Rt) and the temporal variation of Rt.
Methods: We make use of data from a central registry of all pH1N1 laboratory-confirmed cases detected throughout South Africa. Whenever date of symptom onset is missing, we estimate it from the date of specimen collection using a multiple imputation approach repeated 100 times for each missing value. We apply a likelihood-based method (method 1) for simultaneous estimation of initial Rt and the SI; estimate initial Rt from SI distributions established from prior field studies (method 2); and apply the Wallinga and Teunis method (method 3) to model the temporal variation of Rt.
Results: 12,630 confirmed pH1N1 cases were reported in the central registry. During the period of exponential growth of the epidemic (June 21 to August 3, 2009), we simultaneously estimate a mean Rt of 1.47 (95% CI: 1.30–1.72) and mean SI of 2.78 days (95% CI: 1.80–3.75) (method 1). Field studies found a mean SI of 2.3 days between primary cases and laboratory-confirmed secondary cases, and 2.7 days when considering both suspected and confirmed secondary cases. Incorporating the SI estimate from field studies using laboratory-confirmed cases, we found an initial Rt of 1.43 (95% CI: 1.38–1.49) (method 2). The mean Rt peaked at 2.91 (95% CI: 0.85–3.99) on June 21, as the epidemic commenced, and Rt>1 was sustained until August 22 (method 3).
Conclusions: Transmissibility characteristics of pH1N1 in South Africa are similar to estimates reported by countries outside of Africa. Estimations using the likelihood-based method are in agreement with field findings.


Introduction
During 2009, influenza A(H1N1)pdm09 (pH1N1) emerged and spread worldwide [1]. Rapid and timely estimation of the transmission parameters of this novel virus played an important role in informing transmission potential and mitigation interventions during the 2009 pandemic period. The post-pandemic documentation of these parameters is equally important, as many previous estimates were established from analyses conducted during the early stages of epidemics and often from preliminary data [2,3]. Additionally, enhancing our knowledge of past pandemics provides greater insight into how to prepare for and respond to future outbreaks.
Four key measures are typically used to describe the transmissibility of an infectious disease. First, the serial interval (SI) describes the mean time between illness onset of two successive cases in the chain of transmission. Second, the secondary attack rate (SAR) describes the proportion of susceptible contacts that acquire infection from an infectious person. Third, the basic reproductive number (R0) is defined as the average number of secondary cases per primary case in an idealised entirely susceptible population in the absence of control measures. Finally, the effective reproductive number (Rt) at any given time point represents the actual average number of secondary cases per primary case observed in a population. Rt reflects the impact of control measures and the depletion of susceptible persons over time. The initial Rt may approximate R0 in pandemic situations [2-5].
In previous work, we estimated the SAR and SI of pH1N1 among the first 100 cases detected in South Africa by prospectively examining virus transmission between household contacts [22]. We found a SAR of 10% and a mean SI of 2.3 days (SD ±1.3, range 1-5) between successive laboratory-confirmed cases in the transmission chain. When suspected secondary cases were additionally included in the analysis, the SAR increased to 17% and the SI to 2.7 days (SD ±1.5, range 1-6). In this work we incorporate data collected on all laboratory-confirmed cases detected during the 2009 pH1N1 epidemic in South Africa, with the aim of describing the transmissibility characteristics (initial Rt and temporal variation of Rt) of the epidemic in the country and comparing its dynamics with those observed in other countries in the same year.

Data
During 2009, the National Institute for Communicable Diseases (NICD), of the National Health Laboratory Service (NHLS), South Africa, maintained a central registry of all pH1N1 laboratory-confirmed cases detected throughout the country. The methodology of collating this data has previously been described in detail [23]. Briefly, we collated individual case-based data from all laboratories offering pH1N1 testing throughout South Africa, which included patient age, sex, dates of illness onset and specimen collection, and the administrative location (province) of the healthcare facility where the patient presented. Testing was performed by accredited laboratories, including: the National Influenza Centre (NICD-NHLS), NHLS public-sector laboratories or private-sector laboratories. All testing laboratories performed detection and characterisation of pH1N1 virus by real-time PCR by either the protocol developed by the WHO Collaborating Centre for Influenza, U.S. Centers for Disease Control and Prevention [24], or using commercially available kits.

Imputation of Missing Data
Wherever the date of symptom onset was missing, we estimated it from the date of specimen collection using a multiple imputation approach. First, we modelled the lag time from date of symptom onset to date of specimen collection, using cases with complete data, via a Poisson regression model with predictors significant at p<0.05. The covariates assessed in the model were patient age, gender, province, date of specimen collection, and collection of a specimen on a weekend day (i.e. Saturday or Sunday). Second, we obtained an estimated lag time for each observation with a missing date of symptom onset by random sampling from a Poisson distribution centred on the predicted value from the Poisson regression model. A Poisson distribution was selected because the lag time is count data (whole days). Third, we imputed missing dates of symptom onset by subtracting the estimated lag time from the date of specimen collection. The imputation process was repeated 100 times for each missing value, creating 100 datasets with information on the onset date (imputed or observed) for 12,630 laboratory-confirmed cases.
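The second and third imputation steps can be illustrated with a minimal Python sketch. It assumes the predicted mean lag for a case has already been obtained from a fitted regression model; the specimen date, the predicted lag of 2.5 days and the function names are hypothetical and are not taken from the study.

```python
import math
import random
from datetime import date, timedelta

def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def impute_onset(collection_date, predicted_lag, rng):
    """Impute onset as the collection date minus a Poisson-distributed lag in days."""
    return collection_date - timedelta(days=sample_poisson(predicted_lag, rng))

def impute_many(collection_date, predicted_lag, n_imputations=100, seed=0):
    """Repeat the imputation to obtain one onset date per imputed dataset."""
    rng = random.Random(seed)
    return [impute_onset(collection_date, predicted_lag, rng)
            for _ in range(n_imputations)]

# Hypothetical case: specimen collected on 15 July 2009, predicted mean lag of 2.5 days.
collection = date(2009, 7, 15)
draws = impute_many(collection, predicted_lag=2.5)
mean_lag = sum((collection - d).days for d in draws) / len(draws)
```

Each of the 100 draws would populate one imputed dataset, so downstream estimates vary across datasets in proportion to the uncertainty in the missing onset dates.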

Estimation of Initial Rt and Temporal Variation in Rt
We based the estimation of initial Rt and temporal variation of Rt on date of symptom onset (observed and imputed). In all analyses we modelled the SI via a multinomial distribution. When estimating initial Rt, we focus our analysis on the exponential growth phase of the epidemic in South Africa (i.e. the period from the first occurrence of five consecutive days with confirmed cases reported to the epidemic peak). The parameters were estimated using three methods:
Method 1. We make use of the likelihood-based method for the simultaneous estimation of initial Rt and the SI described in [25]. This method is well suited for estimation of initial Rt and SI in real time with observed aggregated daily counts of new cases, denoted by $N = (N_0, N_1, \ldots, N_T)$, where $T$ is the last day of observation and $N_0$ is the initial number of seed cases that begin the outbreak. The $N_i$ are assumed to be composed of a mixture of cases that were generated by the previous $k$ days, where $k$ is the maximal value of the serial interval. We denote these as $X_{ji}$, the number of cases that appear on day $i$ that were infected by individuals with onset of symptoms on day $j$. We assume that the number of infectees generated by infectors with symptoms on day $j$ follows a Poisson distribution with parameter $R_t N_j$. Additionally, $X_j = (X_{j,j+1}, X_{j,j+2}, \ldots, X_{j,j+k+1})$, the vector of cases infected by the $N_j$ individuals, follows a multinomial distribution with parameters $p$, $k$ and $X_j$. Here $p$ is a vector of probabilities that denotes the serial interval distribution. Using these assumptions, the following likelihood is obtained:
$$L(R_t, p \mid N) = \prod_{i=1}^{T} \frac{e^{-\mu_i}\, \mu_i^{N_i}}{N_i!}$$
where $\mu_i = R_t \sum_{j=1}^{k} p_j N_{i-j}$. Parameter estimates are obtained using maximum likelihood methods. For this method we used 6 days as the maximal value of the SI ($k$), which is consistent with the length of the SI observed in field investigations in South Africa [22].
In addition, we implemented a sensitivity analysis to assess how the initial Rt estimates vary when k is set to 4 days and 8 days, respectively.
Method 2. We assume a known distribution of the SI in South Africa and we estimate the initial Rt using the maximum likelihood estimator for known SI described in [9,25]. The estimator of initial Rt in this case is a modification of Method 1 and is given by:
$$\hat{R}_t = \frac{\sum_{i=1}^{T} N_i}{\sum_{i=1}^{T} \sum_{j=1}^{k} p_j N_{i-j}}$$
For this analysis we use the two SI distributions observed from investigations of the first 100 pH1N1 cases in South Africa [22]: (1) the SI distribution between primary cases and laboratory-confirmed secondary cases only (39%, 24%, 14%, 17%, 3% and 3% for day 1 to 6 respectively), and (2) the SI distribution between primary cases and suspected plus laboratory-confirmed secondary cases (30%, 17%, 20%, 23%, 7% and 3% for day 1 to 6 respectively). We define suspected secondary cases as individuals who developed ILI symptoms within 14 days of the symptom onset of a confirmed index case within the same household.
Method 3. We make use of the Wallinga and Teunis method for estimation of Rt from the imputed data [26]. This method uses the daily case counts and assumes the serial interval is known. We make the same assumptions for the serial interval as in method 2. The method calculates the relative probability that a case on day $i$ infects a case on day $j$ as:
$$q_{ij} = \frac{p_{j-i}}{\sum_{l<j} N_l\, p_{j-l}}$$
where $p_k$ is the probability of a serial interval of length $k$ (taken as zero outside the range 1 to the maximal SI). Then the estimate for the reproductive number for case $i$ is:
$$R_i = \sum_{j>i} N_j\, q_{ij}$$
This method requires that we make use of the entire epidemic curve. We calculate Rt as the average of the $R_i$ when $i$ is in the epidemic period, as previously defined.
Table 1. Observed lag-time between date of symptom onset and date of specimen collection, incidence rate ratio (IRR) and significance value of the covariates significant in the Poisson regression model.
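The Poisson structure described for Method 1 can be made concrete with a short Python sketch. It evaluates the log-likelihood for daily counts with expected value equal to Rt times the SI-weighted sum of earlier counts, and then maximises it over a coarse grid of Rt values for one fixed candidate SI distribution. This is only an illustration under stated assumptions: the full method also maximises over the SI probabilities, the grid search stands in for a proper optimiser, and the daily counts below are hypothetical.

```python
import math

def expected_cases(counts, si_probs, rt, day):
    """Expected count on `day`: rt * sum_j p_j * N_{day-j}, skipping days before the series starts."""
    return rt * sum(p * counts[day - lag]
                    for lag, p in enumerate(si_probs, start=1)
                    if day - lag >= 0)

def log_likelihood(counts, si_probs, rt):
    """Poisson log-likelihood of counts[1:], treating counts[0] as the seed cases."""
    total = 0.0
    for day in range(1, len(counts)):
        mu = expected_cases(counts, si_probs, rt, day)
        if mu <= 0.0:
            if counts[day] > 0:
                return float("-inf")  # cases observed with no possible infector
            continue
        total += counts[day] * math.log(mu) - mu - math.lgamma(counts[day] + 1)
    return total

def grid_search_rt(counts, si_probs, lo=0.5, hi=3.0, step=0.01):
    """Return the grid value of rt with the highest log-likelihood (SI held fixed)."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + step * s for s in range(n_steps + 1)]
    return max(grid, key=lambda rt: log_likelihood(counts, si_probs, rt))

toy_counts = [1, 2, 3, 4, 6, 9, 13, 19]            # hypothetical daily onsets
si_probs = [0.39, 0.24, 0.14, 0.17, 0.03, 0.03]    # SI for days 1 to 6 (k = 6)
best_rt = grid_search_rt(toy_counts, si_probs)
```

Repeating the search with SI vectors of length 4 and 8 would mirror the sensitivity analysis on k.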
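When the SI distribution is fixed, maximising a Poisson likelihood in Rt has a closed form: total observed cases divided by the total SI-weighted count of earlier cases. The Python sketch below assumes that form and also checks that the two field SI distributions quoted above have means of about 2.3 and 2.7 days. The function names and the doubling series are illustrative only.

```python
def estimate_rt_known_si(counts, si_probs):
    """Closed-form estimate: sum of cases after the seed day over summed infection pressure."""
    numerator = sum(counts[1:])
    denominator = sum(p * counts[day - lag]
                      for day in range(1, len(counts))
                      for lag, p in enumerate(si_probs, start=1)
                      if day - lag >= 0)
    return numerator / denominator

def si_mean(si_probs):
    """Mean serial interval in days for probabilities given for day 1, 2, ..."""
    return sum(day * p for day, p in enumerate(si_probs, start=1))

# SI distributions for days 1 to 6 as quoted in the text.
SI_CONFIRMED = [0.39, 0.24, 0.14, 0.17, 0.03, 0.03]
SI_CONFIRMED_PLUS_SUSPECTED = [0.30, 0.17, 0.20, 0.23, 0.07, 0.03]

mean_confirmed = si_mean(SI_CONFIRMED)
mean_with_suspected = si_mean(SI_CONFIRMED_PLUS_SUSPECTED)

# Hypothetical check: if every case infects exactly one day later and counts double daily,
# the estimator should return 2.
doubling_rt = estimate_rt_known_si([1, 2, 4, 8], [1.0])
```

A longer assumed SI yields a higher estimate for the same epidemic curve, which is consistent with the two field SI distributions producing slightly different initial Rt values.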
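A day-indexed version of the Wallinga and Teunis calculation can be sketched in Python as follows. It assumes the SI probabilities are known and that cases are aggregated by day of onset; the function name and the example counts are hypothetical.

```python
def wallinga_teunis(counts, si_probs):
    """Per-day reproductive numbers from daily onset counts and a known SI distribution."""
    k = len(si_probs)
    n_days = len(counts)
    # pressure[j]: SI-weighted number of earlier cases that could have infected a case on day j.
    pressure = [sum(counts[j - s] * si_probs[s - 1]
                    for s in range(1, k + 1) if j - s >= 0)
                for j in range(n_days)]
    r_by_day = []
    for i in range(n_days):
        total = 0.0
        for s in range(1, k + 1):
            j = i + s
            if j < n_days and pressure[j] > 0:
                # Expected number of day-j cases attributed to one case on day i.
                total += counts[j] * si_probs[s - 1] / pressure[j]
        r_by_day.append(total)
    return r_by_day

toy_counts = [1, 2, 3, 4, 6, 9, 13, 19]            # hypothetical daily onsets
si_probs = [0.39, 0.24, 0.14, 0.17, 0.03, 0.03]    # SI for days 1 to 6
r_by_day = wallinga_teunis(toy_counts, si_probs)
```

Values for the last few days are biased downward because their secondary cases have not yet appeared in the series, which is one reason the method needs the entire epidemic curve.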
Estimates are reported as the means across the 100 imputations. For all estimates, we calculate bootstrap confidence intervals as has been described previously [9,26]. We combine the results from all 100 imputations to obtain a confidence interval that incorporates both imputation error, as well as random error [27].
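One common way to combine results across imputed datasets is Rubin's rules, which add the average within-imputation variance to an inflated between-imputation variance. The Python sketch below assumes that approach and a normal-approximation interval; the text does not state that this is the exact procedure used with the bootstrap intervals here, and the input numbers are hypothetical.

```python
import math
import statistics

def pool_estimates(estimates, within_variances, z=1.96):
    """Pool per-imputation estimates with Rubin's rules (normal approximation for the CI)."""
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(within_variances)    # average sampling variance (random error)
    between = statistics.variance(estimates)      # variance across imputations (imputation error)
    total = within + (1 + 1 / m) * between
    half_width = z * math.sqrt(total)
    return pooled, total, (pooled - half_width, pooled + half_width)

# Hypothetical Rt estimates and bootstrap variances from three imputed datasets.
pooled, total_var, interval = pool_estimates([1.4, 1.5, 1.6], [0.01, 0.01, 0.01])
```

With 100 imputations the inflation factor (1 + 1/m) is close to 1, so the total variance is approximately the sum of the within and between components.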
All analyses were performed using R version 2.14.

Data and Imputation
12,630 laboratory-confirmed pH1N1 cases were captured by the South African central registry during 2009. The overall demographic, spatial and temporal distribution of these cases has been previously described [23]. Data on date of symptom onset were available for 758 (6%) cases and date of specimen collection for 12,500 (99%) cases. The first case reported illness onset on June 12, 2009. The lag-time between symptom onset and specimen collection was significantly associated with the provincial location of specimen collection, as well as the collection of a specimen on a weekend day (Table 1). We used these two covariates in the multiple imputation to predict the date of symptom onset for all cases where it was missing (Figure 1). Other available variables, including date of specimen collection (period during the epidemic), patient age and sex, were not significantly associated with the lag-time between symptom onset and specimen collection and were therefore not included in the final model. Analyses to simultaneously estimate initial Rt and serial interval, and to estimate initial Rt given a known serial interval, were performed over the exponential growth phase of the epidemic from June 21 to August 3, 2009.

Simultaneous Estimation of Rt and Serial Interval
Using the likelihood-based method to simultaneously estimate initial Rt and the SI across 100 imputations of the dataset (Method 1), we estimated an initial Rt of 1.47 (95% CI: 1.30-1.72) and a mean SI of 2.78 days (95% CI: 1.80-3.75) (Figure 2). Rt estimates ranged from 1.31 (95% CI: 1.21-1.48) to 1.54 (95% CI: 1.37-2.03) when the maximal value of the SI ranged from 4 to 8 days.

Estimation of Rt Assuming Known Serial Intervals
We first utilised the SI established from the aforementioned field investigations of the initial 100 cases in estimating Rt, as described in method 2. When performing the analysis using the SI distribution observed for laboratory-confirmed pH1N1 secondary cases only (mean 2.3 days, SD ±1.3, range 1-5) [22], we found an initial Rt of 1.43 (95% CI: 1.38-1.49) (Figure 3A). When performing the analysis using the SI distribution observed for both confirmed and suspected secondary cases (mean 2.7 days, SD ±1.5, range 1-6) [22], we found an initial Rt of 1.49 (95% CI: 1.44-1.55) (Figure 3B).

Temporal Variation in Rt
Figure 4 shows the variation in Rt with the progression of the outbreak over time (method 3). We observed relatively high Rt values following the introduction of pH1N1 virus into South Africa, corresponding to high rates of transmission and exponential growth of the local epidemic during this period. Rt peaked on the first day of the epidemic growth period (June 21) at 2.91 (95% CI: 0.85-3.99). Rt began to drop from July 27 onward and remained consistently below one after August 22. This corresponds with the decline in the daily incidence of new cases detected. Averaging the Rt values obtained during the epidemic growth period (June 21 to August 3, 2009), we estimate initial Rt to be 1.42 (95% CI: 1.20-1.71).

Discussion
Utilising temporal data on illness onset and specimen collection, and the epidemic curve derived from these data, we provide estimates of the transmissibility parameters of pH1N1 during the first wave experienced in South Africa. Our results focus primarily on the use of analytical techniques to estimate initial Rt and SI without incorporating contact tracing or household transmission studies. However, when parameters from field studies are available, we show that these can be incorporated to provide robust estimates of transmission parameters. We found that initial Rt estimates established using the likelihood-based method for the simultaneous estimation of Rt and SI (method 1: initial Rt: 1.47, SI: 2.78 days) are in agreement with those obtained using the SI observed in field investigations [22] (method 2: initial Rt: 1.43 and 1.49 using the observed SI for laboratory-confirmed cases, or for laboratory-confirmed and suspected cases, respectively). In addition, the mean SI estimate obtained with method 1 (2.78 days) is in agreement with field findings (SI: 2.3-2.7 days using the observed SI for laboratory-confirmed cases, or for laboratory-confirmed and suspected cases, respectively). Previous estimates of initial Rt and the mean SI for pH1N1 have ranged between 1.3-2.9 and 2.5-3.3 days, respectively [2,6-11,14-19]. Our estimates are consistent with these findings, regardless of the method used for the analysis and despite differences in climate, demography and health systems across these countries. It appears that, once established, the transmission characteristics of pH1N1 are very consistent. Differences in transmission rates may occur within smaller subgroups of the overall population; however, this has not been well studied.
Previous estimates of the epidemiological parameters of seasonal influenza epidemics found an SI of 2-4 days [28-30] and an Rt a little over 1, with slight variation between climates: Rt = 1.03 in Brazil [31] versus Rt = 1.1-1.3 in more temperate climates [32]. A number of studies have retrospectively estimated the transmissibility of influenza pandemics. During the 1918 Spanish influenza A(H1N1) pandemic, when assuming an SI of 4 days, R0 estimates range from 2.0-4.3 in community settings [33,34], with even higher values (R0 = 2.6-10.6) in confined settings such as ships and prisons [34]. A separate analysis predicted a slightly lower SI of 3.3 days in community settings and an SI of 3.81 days in confined settings during the 1918 pandemic, and subsequently estimated R0 values of 1.34-3.21 and 4.97 in these respective settings [35]. R0 estimates from the 1957 Asian influenza A(H2N2) pandemic range from 1.65-1.68 [36,37]. During the first wave of the 1968-1969 Hong Kong influenza A(H3N2) pandemic, estimates of R0 range from 1.06-2.06, and increased to 1.21-3.58 during the second wave [38].
Given our findings, the overall transmissibility of pH1N1 in South Africa during 2009 was more similar to that of seasonal influenza strains than to that of the 1918 pandemic, and comparable to lower-end estimates of the latter pandemics. However, by showing variation in transmissibility with time, we demonstrate that shortly after introduction of pH1N1 into the country, transmission of the virus reached an Rt of 2.9, resulting in exponential growth of the local epidemic and widespread illness. Nonetheless, we show that after a period of less than 2 months of heightened transmission, Rt dropped below 1, corresponding to a decline in the incidence of new cases; this was likely a result of a combination of factors, including herd immunity.
There are several limitations in this analysis which merit discussion. First, we assume that all cases are known and reported. It has been shown previously that, if cases are not reported, this may bias estimates generated using this method [39]. If the proportion of cases reported remains consistent over the study, then the estimates of transmissibility will not be biased; however, if the reporting fraction varies through time, then biased estimates of the reproductive number and serial interval may result. Likewise, variation in case ascertainment with time may bias our estimates of the temporal variation of Rt. Generally, higher reporting rates may be anticipated in the early phase, with reporting fatigue later becoming a factor. Second, data for this study are derived from laboratory-based surveillance data from several regions across South Africa, a large and diverse country. Our findings do not incorporate heterogeneities (such as spatial and demographic differences) that likely exist in transmission patterns, nor do they assess the degree to which these impact aggregate measures of initial Rt. Methodologies that incorporate heterogeneities inherent in public health data warrant further study.
Despite these limitations, the post-pandemic estimates presented here add to the body of knowledge of pH1N1 transmissibility parameters, which were previously dominated by estimates from developed nations and often based on preliminary data. It remains important that revised parameters, from complete datasets and diverse geographies, are incorporated into planning mitigation strategies for future pandemics. Nonetheless, the methods used in this study would be adaptable to generating real-time estimates during future epidemics. As we continue to build epidemiological capacity in developing nations, including South Africa, we must keep in mind the need for rapid assessments of transmissibility of novel pathogens, in addition to disease severity, to better inform public health interventions.