The Transmission Dynamics of Tuberculosis in a Recently Developed Chinese City

Background Hong Kong is an affluent subtropical city with a well-developed healthcare infrastructure but an intermediate TB burden. Declines in notification rates through the 1960s and 1970s have slowed since the 1980s to the current level of around 82 cases per 100 000 population. We studied the transmission dynamics of TB in Hong Kong to explore the factors underlying recent trends in incidence. Methodology/Principal Findings We fitted an age-structured compartmental model to TB notifications in Hong Kong between 1968 and 2008. We used the model to quantify the proportion of annual cases due to recent transmission versus endogenous reactivation of latent infection, and to project trends in incidence rates to 2018. The proportion of annual TB notifications attributed to endogenous reactivation increased from 46% to 70% between 1968 and 2008. Age-standardized notification rates were projected to decline to approximately 56 per 100 000 in 2018. Conclusions/Significance Continued intermediate incidence of TB in Hong Kong is driven primarily by endogenous reactivation of latent infections. Public health interventions which focus on reducing transmission may not lead to substantial reductions in disease burden associated with endogenous reactivation of latent infections in the short- to medium-term. While reductions in transmission with socio-economic development and public health interventions will lead to declines in TB incidence in these regions, a high prevalence of latent infections may hinder substantial declines in burden in the longer term. These findings may therefore have important implications for the burden of TB in developing regions with higher levels of transmission currently.

The recovery rate for active TB cases at time t is given by φ(t). A recovered individual has no immunity and can be reinfected. A reinfected individual either develops active TB disease directly, becoming a recent reinfected infectious TB case (RIITB) or a recent reinfected non-infectious TB case (RINTB), or remains latently infected for a period and develops active TB disease later. In both cases, the infection rate for recovered individuals is assumed to be the same as that of susceptible individuals. A recovered individual may also relapse to become a reactivated infectious TB case (RAITB) or a reactivated non-infectious TB case (RANTB), both at a rate of ω.

$$\cdots\; (1-\alpha)\,\lambda(a,t)\,\bigl(S(a,t)+m_S(a,t)\bigr) \;-\; \underbrace{\alpha\theta\,\lambda(a,t)\,\bigl(S(a,t)+m_S(a,t)\bigr)}_{S \to RIITB} \;-\; \underbrace{\alpha(1-\theta)\,\lambda(a,t)\,\bigl(S(a,t)+m_S(a,t)\bigr)}_{S \to RINTB}$$

The numbers of cases due to recent transmission and reactivation

Initial state
For the initial states of the model in year 1961, we assumed that the prevalence of latent TB infection across age groups followed a logistic distribution, with the prevalence being 0.7 in people aged over 35 years. The numbers of people with active TB disease in each age group were derived from the TB notification data in Hong Kong by assuming that the number of prevalent active TB cases is about 2.5 times the number of notified cases of the same age. The distribution across age groups of individuals who had recovered from active TB disease was also assumed to follow a logistic distribution.
We assumed that the prevalence of individuals in Hong Kong in 1961 who had been infected with TB for over 1, 2, 3, 4, or 5 years without developing active TB disease, $L_i(t)$ (i = 1, 2, 3, 4, 5), followed a logistic function of age. The prevalence of individuals who have been infected for over 1 year but less than 2 years is $L_1(t) - L_2(t)$, over 2 years but less than 3 years is $L_2(t) - L_3(t)$, over 3 years but less than 4 years is $L_3(t) - L_4(t)$, and over 4 years but less than 5 years is $L_4(t) - L_5(t)$. The parameters used in the logistic function are: $L_{\max,i}$, the maximum of the logistic function (i = 1, 2, 3, 4, 5); a = 7, b = 0.05, and c = 17, parameters for the curvature of the logistic function; δ, a small number (we used 0.01 in our model); and $n_j$, the age of subjects.
The age-specific mid-year populations were obtained from official statistics published by the Census and Statistics Department of the Hong Kong government.

Population movement
To account for the potential impact of migration on the incidence of TB in Hong Kong, we incorporated immigrants and emigrants into our model. We derived the number of migrants aged a at time t in Hong Kong in 1961-2008 from data published by the Census and Statistics Department. The net movement of population, m(a, t), was calculated from these data. Because population numbers in Hong Kong are reported by 5-year age group, we averaged the 5-year age group data to approximate the population at each single year of age. However, calculating age-specific populations in this way can make the difference between the numbers in the 5a-th and (5a + 1)-th age groups much larger than the differences between other adjacent age groups. To avoid these sudden jumps, we averaged m(a, t) over every 5 years of age to obtain the number of migrants aged a at time t. As a result, the net movement of population in each 5-year age group in our model is exactly the same as in the officially published data.
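The averaging step can be illustrated with a minimal sketch. It assumes that each 5-year total is spread evenly over its five single years of age, which is one simple reading of the description above. The function names and the example values are hypothetical and are not taken from the study data.

```python
def spread_five_year_groups(group_totals, width=5):
    """Spread each 5-year age-group total evenly over its single years of age."""
    by_age = []
    for total in group_totals:
        by_age.extend([total / width] * width)
    return by_age


def regroup(by_age, width=5):
    """Re-aggregate single-year values into consecutive 5-year groups."""
    return [sum(by_age[i:i + width]) for i in range(0, len(by_age), width)]


# Hypothetical net migrants for three 5-year age groups (negative = net emigration).
net_migrants_by_group = [1200.0, -350.0, 80.0]
net_migrants_by_age = spread_five_year_groups(net_migrants_by_group)
```

Re-aggregating `net_migrants_by_age` with `regroup` returns the original group totals, which mirrors the property that the net movement in each 5-year age group matches the published data.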
To simplify the model, we assumed that the migrants were either susceptible or long-term latently infected. If m(a, t) in equation (2) is positive, there are net immigrants in the corresponding age group; if it is negative, there are net emigrants. We also assumed that the prevalence of TB differs between immigrants and emigrants, as they originate from places with different disease prevalence. The prevalence of TB in mainland China (especially Southern China) was assumed to be higher than that in Hong Kong because of differences in public health infrastructure, as also suggested by published studies.
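A minimal sketch of how net migrants might be allocated to the two compartments is given below. The function name and the prevalence values are hypothetical. The sketch assumes that a single prevalence of latent infection applies to all net immigrants and another to all net emigrants, which is a simplification of the assumption stated above.

```python
def split_net_migrants(net_migrants, prev_immigrants, prev_emigrants):
    """Split net migrants for one age and year into (susceptible, long-term latent).

    A positive net_migrants value means net immigration and a negative value
    means net emigration; the matching prevalence of latent infection is applied.
    """
    prevalence = prev_immigrants if net_migrants >= 0 else prev_emigrants
    latent = net_migrants * prevalence
    susceptible = net_migrants - latent
    return susceptible, latent


# Hypothetical prevalences: higher among immigrants than among emigrants.
s_in, l_in = split_net_migrants(1000.0, prev_immigrants=0.4, prev_emigrants=0.3)
s_out, l_out = split_net_migrants(-500.0, prev_immigrants=0.4, prev_emigrants=0.3)
```

The two returned flows always sum to the net number of migrants, so the total population movement is preserved.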

Disease progression rate
We assumed that the disease progression rate within or more than 5 years after TB  Figure S2.

Parameter estimation
We estimated 6 key parameters relating to the transmission dynamics, together with 1 overdispersion parameter for the distribution specifying the likelihood. Estimation was carried out using the optim function in R by maximizing a negative binomial likelihood, with the simulated age-specific TB notifications fitted against the observed age-specific TB notifications. Multiple sets of initial values were used to ensure that the solution obtained was optimal in the plausible solution space. The log likelihood is obtained by grouping simulated data into age groups and years, and is given by:

$$\log L(p, k) = \sum_{a \le a_{\max}} \sum_{t \le T} \log P_{\mathrm{negbin}}\bigl(m(a,t);\, m^{*}(a,t \mid p),\, k\bigr)$$

where $a_{\max}$ is the maximal age considered in the model, T is the study period, $m^{*}(a, t \mid p)$ is the simulated TB notifications in age group a at time t based on parameters p, m(a, t) is the observed TB notifications in age group a at time t, and k is the dispersion parameter of the negative binomial distribution. $P_{\mathrm{negbin}}$ is the probability mass function of a negative binomial distribution. In the model, we set T = 58 and $a_{\max}$ = 80.
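The log likelihood described above can be sketched as follows. The sketch is written in Python rather than R and assumes the mean-dispersion parameterisation of the negative binomial distribution (mean μ, variance μ + μ²/k), which the text does not state explicitly. The function names and the small example arrays are hypothetical.

```python
import math


def negbin_logpmf(y, mu, k):
    """Log probability of count y under a negative binomial with mean mu > 0
    and dispersion k > 0 (assumed parameterisation: variance = mu + mu**2 / k)."""
    return (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
            + k * math.log(k / (k + mu))
            + y * math.log(mu / (k + mu)))


def log_likelihood(observed, simulated, k):
    """Sum the log probabilities over all (age group, year) cells."""
    return sum(
        negbin_logpmf(y, mu, k)
        for obs_row, sim_row in zip(observed, simulated)
        for y, mu in zip(obs_row, sim_row)
    )


# Hypothetical notifications: rows are age groups, columns are years.
observed = [[12, 15], [30, 28]]
simulated = [[10.5, 14.0], [31.2, 27.5]]
ll = log_likelihood(observed, simulated, k=15.3)
```

In a fitting routine, `simulated` would be regenerated from the transmission model for each candidate parameter vector, and the optimiser would search for the parameters and dispersion that maximize `ll`.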
To evaluate the uncertainty in the parameter estimates, we constructed marginal 95% confidence intervals. We used the function hessian in R to evaluate numerically the information matrix based on the likelihood function. The variance-covariance matrix of the estimated parameters was then derived, and we identified the confidence hyperellipsoid of parameter vectors p which satisfied the following relation:

$$(p - p^{*})^{\mathsf{T}}\, V_p^{-1}\, (p - p^{*}) \le \chi^{2}_{7}(1 - \alpha)$$

where $p^{*}$ is the estimated parameter vector, $V_p$ is the variance-covariance matrix, $\chi^{2}_{7}(1 - \alpha)$ is the $(1 - \alpha)$ quantile of the chi-squared distribution with 7 degrees of freedom, and α is the significance level.
For ease of presentation we took the boundary of the hyperellipsoid with respect to each estimated parameter as the marginal confidence intervals.
The dispersion parameter was estimated to be 15.3 (95% CI: 13.8-17.2). Other parameter estimates are summarized in Table 2.
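The construction of the marginal intervals can be illustrated with a short sketch. It assumes that the hyperellipsoid is defined by the usual quadratic form (p − p*)ᵀ V_p⁻¹ (p − p*) ≤ c, with c the 0.95 quantile of the chi-squared distribution with 7 degrees of freedom. For such an ellipsoid, the extent along the i-th parameter axis is p*_i ± sqrt(c · V_ii). The function name and the example numbers are hypothetical.

```python
import math

# 0.95 quantile of the chi-squared distribution with 7 degrees of freedom,
# taken from standard tables.
CHI2_7_95 = 14.067


def marginal_intervals(estimates, covariance, chi2_quantile=CHI2_7_95):
    """Return (lower, upper) bounds of the confidence ellipsoid along each axis.

    The bounds are estimates[i] -/+ sqrt(chi2_quantile * covariance[i][i]).
    """
    intervals = []
    for i, estimate in enumerate(estimates):
        half_width = math.sqrt(chi2_quantile * covariance[i][i])
        intervals.append((estimate - half_width, estimate + half_width))
    return intervals


# Hypothetical two-parameter example (the model itself has 7 parameters).
estimates = [1.0, 2.0]
covariance = [[0.04, 0.01],
              [0.01, 0.09]]
bounds = marginal_intervals(estimates, covariance)
```

Only the diagonal of the variance-covariance matrix enters each marginal bound. The off-diagonal terms affect the orientation of the ellipsoid but not its extent along each axis.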

Sensitivity analysis
We performed one-way sensitivity analysis to examine the influence of each of the fixed parameters on the trends in TB notifications predicted by the model. We varied the proportion of active TB disease which is infectious (θ), the proportion of active TB disease from latent TB infection (RLTBI or LLTBI) which is infectious (γ), the probability of relapse for recovered patients (ω), the recovery rate for TB patients in 1961 ($\phi_0$), the prevalence of latent TB in 1961 ($P_{L0}$), and the ratio of TB prevalence to incidence in 1961 ($\pi_{T0}$). Each parameter was varied between minimum and maximum plausible values as determined from local data or the literature (Table 1).
Further, to assess the combined effect of the above parameters on our estimates, a multivariate sensitivity analysis based on Latin hypercube sampling was carried out.
We subdivided the range for each variable as specified in Table 1 into equiprobable intervals, based on a uniform distribution. We then simulated 100 sets of samples in which each variable was drawn randomly and without replacement from these intervals. Based on the samples, we simulated the annual number of new active TB cases, the numbers of cases due to recent transmission and to endogenous reactivation, and the proportion of cases from recent transmission in each year, as shown in Figure 4.
Table S2 shows the correlation matrix of the estimated parameters, from which we observed high dependency between the transmission-related variables b, $M_1$, and $M_2$. The negative correlation between α and $p_r$, $p_l$ indicates that the model was able to maintain the tradeoff between the risk of directly developing active TB and the disease progression rate from latent infection in order to generate trends in TB cases consistent with the data. All of the correlations are consistent with the TB transmission dynamics in the model, conditional on the given data.
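The Latin hypercube sampling step described above could be sketched as follows. The sketch assumes that the number of equiprobable intervals equals the number of samples (100), which is the standard construction but is not stated explicitly in the text. The function name, the parameter bounds, and the random seed are hypothetical.

```python
import random


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples parameter sets by Latin hypercube sampling.

    Each range is cut into n_samples equal-width intervals (equiprobable under
    a uniform distribution). Every interval is used exactly once per variable,
    in an independently shuffled order for each variable.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)  # sampling intervals without replacement
        width = (high - low) / n_samples
        columns[name] = [low + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Hypothetical plausible ranges for two of the fixed parameters.
ranges = {"theta": (0.4, 0.6), "omega": (0.001, 0.01)}
samples = latin_hypercube(ranges, n_samples=100)
```

Each element of `samples` is one parameter set that would be passed to the transmission model. Because every interval of every variable is sampled exactly once, the 100 sets cover each range evenly.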