Estimating dengue transmission intensity from serological data: A comparative analysis using mixture and catalytic models

Background Dengue virus (DENV) infection is a global health concern of increasing magnitude. To target intervention strategies, accurate estimates of the force of infection (FOI) are necessary. Catalytic models have been widely used to estimate DENV FOI and rely on a binary classification of serostatus as seropositive or seronegative, according to pre-defined antibody thresholds. Previous work has demonstrated the use of thresholds can cause serostatus misclassification and biased estimates. In contrast, mixture models do not rely on thresholds and use the full distribution of antibody titres. To date, there has been limited application of mixture models to estimate DENV FOI. Methods We compare the application of mixture models and time-constant and time-varying catalytic models to simulated data and to serological data collected in Vietnam from 2004 to 2009 (N ≥ 2178) and Indonesia in 2014 (N = 3194). Results The simulation study showed larger mean FOI estimate bias from the time-constant and time-varying catalytic models (-0.007 (95% Confidence Interval (CI): -0.069, 0.029) and -0.006 (95% CI -0.095, 0.043)) than from the mixture model (0.001 (95% CI -0.036, 0.065)). Coverage of the true FOI was > 95% for estimates from both the time-varying catalytic and mixture model, however the latter had reduced uncertainty. When applied to real data from Vietnam, the mixture model frequently produced higher FOI and seroprevalence estimates than the catalytic models. Conclusions Our results suggest mixture models represent valid, potentially less biased, alternatives to catalytic models, which could be particularly useful when estimating FOI from data with largely overlapping antibody titre distributions.

Introduction Dengue fever is caused by infection with one or more of four antigenically distinct serotypes of dengue virus (DENV1-4), a Flavivirus carried by Aedes mosquitoes [1,2]. DENV infects approximately 105 million people each year [3], primarily in tropical and sub-tropical regions. The geographical range of DENV is increasing [1,4,5] and it is expected that the spread of dengue will be influenced by rising global temperatures and increasing urbanisation [1,6]. Intervention measures to date rely essentially on vector control due to the absence of antiviral treatment, challenges in the use of the first licensed dengue vaccine for widespread dengue prevention and control [7], as well as in the use of rapid diagnostic tests for screening [8]. The current and expected future burden of dengue on health-systems is therefore high, demonstrating a continuing need for increased understanding of DENV transmission.
Estimating epidemiological parameters such as the force of infection (FOI, the per capita rate at which a susceptible person is infected) and population seroprevalence (the proportion of people in a population exposed to a virus, as determined by the detection of antibodies in the blood) allow us to gain insights into the subsets of populations most at risk of infection and disease [9], to assess the predicted impact of an intervention strategy [10], and to inform public health policy [11,12].
Both the FOI and seroprevalence can be estimated using mathematical models calibrated to age-stratified serological data measuring IgG antibody levels (also called titres) from blood samples. IgG titres are obtained using Enzyme-Linked Immunosorbent Assays (ELISAs) and are often classified into qualitative, binary test results (seropositive or seronegative) based on the manufacturer's threshold.
Catalytic models, first proposed in 1934, estimate disease FOI from age-stratified serological or case notification data [13]. In these models, large rates of increase in seroprevalence between individuals who are age a versus age a+1 are explained by high age-specific FOI (assuming the FOI is constant in time) or high time-specific FOI experienced by individuals of all ages during the period a to a+1 years ago [14]. Catalytic models have been used extensively for measles [15], rubella [16], Hepatitis A [17], Chagas disease [18], and DENV [12,14,[19][20][21]. Whilst commonly used, previous work suggests that catalytic models risk generating biased estimates due to data-loss and/or misclassification [22][23][24]. For example, samples with titres greater than the seronegative threshold but lower than the seropositive threshold are classified as 'equivocal' and discarded from the analysis. Furthermore, titre levels of seropositive individuals in a given population may be affected by factors including host response, the degree of exposure to the pathogen and infection timing, which could lead to misclassification.
Mixture models are flexible statistical models that can be applied to continuous data from different clusters or populations, called components. Mixture models can therefore be applied to the absolute antibody titre values in serology datasets, rather than to the counts of titres in each of two classes (seropositive/seronegative) as is necessary for catalytic models [22]. The components' distributions and their defining parameters (e.g., the mean titre of each component distribution) are inferred from a fitted mixture model which is used to estimate the FOI and population seroprevalence [22,25]. To date, mixture models have been applied to serological data to estimate the seroprevalence of infectious diseases such as parvovirus B19 and rubella in England [26,27], human papillomavirus in the Netherlands [23], measles in Italy [28], and a selection of arboviruses inlcuding DENV in Zambia [29]. In addition, mixture models have been used to develop frameworks capable of distinguishing between primary and post-primary DENV infections [30,31], and recent and historical influenza A infections [32]. Recently, DENV FOI was estimated using catalytic and mixture models applied to serological data collected in three locations in Vietnam (N > = 266) and in Chennai, India (N = 799) [31]. In this study, the estimates from mixture models were deemed more robust than those from binary catalytic models [31].
Here, we implement a simulation study to assess the ability of mixture and catalytic models to reconstruct the FOI value used for simulating the data. Furthermore, we add to the growing body of evidence exploring the use of mixture models by presenting a comparitative analysis of the DENV trasmission intensity estimates obtained from mixture and catalytic models applied to age-stratified serological datasets from Vietnam (N � 2178, for years 2004-2009) and Indonesia (N = 3194, for 2014).

Ethics statement
Ethical approval for the secondary analysis of the age-stratified seroprevalence datasets was granted by the Imperial College Research Ethics Committee (Approval Reference 21IC7066).

Data
Age-stratified seroprevalence data. DENV IgG data were collected in Long Xuyen, Vietnam, during a prospective epidemiological study that was conducted to assess the suitability of the area for future CYD-TDV vaccine efficacy trials, as described previously [33]. Samples were collected from children under 11 years old in 2004 and then from children under 15 years old during September to February in 2004-2005, 2005-2006, 2006-2007, 2007-2008 and 2008-2009 (Datasets A-1 to A-6, Table 1). The titres were measured using in-house ELISA assays (Arbovirus Laboratory of Pasteur Institute, Ho Chi Minh City). DENV IgG data from 30 urban subdistricts in Jakarta, Indonesia were collected from 3,194 children under 18 years old as part of a cross-sectional seroprevalence survey in 2014 [34] (Dataset B). Given the small spatial scale of the range of data collection, we did not account for spatial differences when modelling. IgG titres were measured using the commercial Panbio Dengue IgG Indirect ELISA kit.
Simulated datasets. We simulated 540 antibody titre datasets (Dataset C), with the same age-distribution and sample size as the Indonesian seroprevalence survey data (Dataset B). For each simulation the distributions used for sampling seronegative and seropositive log(titres + 1) were selected from a normal, gamma or Weibull distribution. This gave 9 possible distribution pairs for seronegative and seropositive log(titre + 1) values, and we generated an equal number of simulations (N = 60) for each combination. Normal, gamma and Weibull distributions were chosen based on preliminary work on our antibody titre datasets showing that these mistures were most frequently selected among a wider set of distributions. Parameter values were randomly drawn from uniform distributions with limits as shown in S1 Table. The serostatus of each individual was drawn from a Bernoulli distribution with probability 1−e −λa , where a is the age of the individual and λ is the FOI (which is assumed to be constant with age and time), and therefore λa represents the cumulative FOI experienced by individuals over their lifetime. Log(titre + 1) values for each individual were subsequently randomly drawn from the respective component distributions. The analysis was conducted in the statistical programming language R [35].

Catalytic model
Data preparation. Catalytic models rely on data that are binarily classified as seropositive or seronegative. For Datasets A-1 to A-6, a background/control titre (t) was measured for each assay. An individual titre was classified as seronegative if �t and seropositive otherwise. For Dataset B, samples with titres � 9 PanBio units were classified as seronegative and �11 as seropositive. Titres >9 and <11 were discarded (28 out of 3,194 samples). For simulated Dataset C, titres were classified as seronegative if they were �X and seropositive if they were �Y. X and Y are thresholds that were optimised using the 'true' simulated serostatuses: the optim function in R, using the Nelder Mead algorithm, was used to calulate the X and Y values per simulated dataset resulting in the fewest titre misclassifications. The optimisation process occassionally failed to estimate realistic classification thresholds (X < 25% quantile of seronegative titres or Y > 75% quantile of seropositive titres) and these simulations were excluded from analysis (N = 31). For the remaining 509 simulations, titres >X and <Y were classified as equivocal and discarded (mean = 2 out of 3,194 samples, median = 1, interquartile range Table 1. Description of the datasets used in the analyses. Summary statistics including notation, region, the assay used, the year of testing, the age range of the children participating to the study and the sample sizes. (IQR) = 2, range = 0 to 30). The mean percentage misclassification error rate of titres across the simulations was 6.6% (median = 2.8%, IQR = 9.8%, range = 0% to 41.9%).

Parameter estimation.
We used a catalytic model as previously described [14,36]. The yearly FOI, i.e., the per capita rate of infection experienced by individuals in a given year X−i, where X is the year the serosurvey was conducted, is assumed to be either constant in time (constant across the years) or time-varying (piecewise constant across the years). When we assumed a time-varying FOI, the number of FOI estimates is equal to the number of single year age groups available in the datasets (maximum age group A -minimum age group M). The proportion of seropositive individuals in age group a during year X (π a,X ), was estimated as in Eq 1. Here, the yearly FOI (λ X−i ) is summed over the lifetime of the individuals in age group a to give the cumulative FOI experienced by the individuals in this age group. If the minimum age group M does not equal 1, then we estimated an average FOI for the M years without age-specific data (X−M to X) denoted λ X−M .
When we assume a time-constant FOI over the whole study period, Eq 1 can be expressed as shown in Eq 2: A binomial log-likelihood was assumed for the FOI (Eq 3), where N a is the total number of individuals and P a is the number of seropositive individuals in age group a during year X [19]. The optim function in R was used to find the maximum likelihood estimate of the FOI using Eq 3.
When we assumed a time-varying FOI, the λ X−i values were averaged to produce a mean FOI experienced over the years in the study period (A−M) (Eq 4) which was compared to the FOI estimated by the time-constant catalytic model and the mixture model.
We estimated 95% Confidence Intervals (CI) using a bootstrap method, where the titre data were sampled with replacement and the age-stratified proportion of seropositive indivuals was calculated, 500 times. The 95% CI was given by the 2.5% to 97.5% quantiles of the estimates from the catalytic models applied to the bootstrap samples.

Mixture model
Applying the mixture models to the titre distributions. Mixture models were applied to the bimodal distribution of individual antibody titres as described in Bollaerts et al., 2012 andHens et al., 2012 [22,25]. All individual antibody titre measurements were used in each dataset, which differs from the data used for the catalytic model where equivocal titres were discarded and titre measurements were classified as either seropositive or seronegative. The mixture model defines the distribution (z) of the log(titres + 1) as a mixture of two distinct distributions: one for susceptible individuals (seronegative, z s ) and one for individuals who have been previously infected (seropositive, z I ). The two-component mixture model is represented by: where f s and f I represent the probability density function of the seronegative and seropositive components, respectively, and where μ and σ represent the mean and standard deviation of each component, and π a,X represents the age-specific seroprevalence during year X, when the serosurvey was conducted.
The mixdist R package was used to fit the mixture models to the titre data by maximum likelihood using an Expectation Maximisation (EM) algorithm [37]. The package was adapted to allow fitting of different distributions for the seronegative and seropositive titre components: normal, gamma and Weibull distributions, giving 9 possible combinations. The best fitting mixture was chosen using the Akaike Information Criterion (AIC). For Dataset C, the estimated means (μ s and μ I ) and standard deviations (σ S and σ I ) for the seronegative and seropositive components of the best mixture were compared against the true parameter values used for simulating the data. We explored multiple parameterisations, including fixing the standard deviation of the two mixture components. For the Vietnamese datasets, we optimised the model having constrained the standard deviation of the seropositive component to multiple different values (for Dataset A-4 we set σ I equal to all values from 0.02 to 0.08 in steps of 0.01, for the other five Datasets we set σ I equal to all values from 0.05, to 0.15 in steps of 0.01). For the Indonesian dataset (Dataset B) the standard deviations of both components were constrained (σ S was set equal to 0.10 to 0.15 in steps of 0.001, and σ I was set equal to 0.15 to 0.3 in steps of 0.05).
Parameter estimation. The relationship between the age-dependent mean log(titre + 1) (μ a,X ), the age-specific seroprevalence (π a,X ) and the means of the mixture components (μ s and μ I ) is described in Eq 6. We estimated μ a,X by least-squares regression using a monotonically increasing P spline [22,25,38] using the mpspline.fit function from the serostat R package [39]. Equally spaced cubic polynomial segments (degree = 3) made up the spline. The optimal smoothing parameter (α) and number of segments (knots) were determined using the Bayesian Information Criterion, having explored combinations of α values (set equal to 0.001, 0.01, 0.1, 0.5, 1, 5, 10, 50, 100, 500) and knots (set equal to values in the sequence: 5 to the maximum number of x-axis age categories, step size = 1). The seroprevalence was calculated using Eq 7. m a;X ¼ ð1 À p a;X Þm s þ p a;X m I : ð6Þ The time-varying FOI was derived from the age-specific seroprevalence as described in Eq 8 [22], where the rate of change in the seroprevalence between two sequential age groups (a−1 and a) is divided by the proportion of seronegative individuals in age group a, to give the FOI experienced in the year X−a (λ X−a ). Eq 8 can in turn be expressed as a function of the underlying antibody titre distribution as shown in Eq 9, where μ 0 a,X represents the derivative of the age-specific mean log(titre + 1). The μ 0 a,X terms were calculated by taking the gradient of the fitted μ a,X spline at each age group a. The time-varying FOI can be averaged across the years in the study period to give the total FOI λ (Eq 4).
The 95% CI around the FOI and seroprevalence estimates were calculated using a boostrap method, where the titre data were sampled with replacement 5000 times. The 95% CI were given by the 2.5% to 97.5% quantiles of the estimates from the bootstrap samples. Bias in the mixture and catalytic model estimates of FOI and seroprevalence for Dataset C was calculated as the estimated value minus the true simulated value of the parameter. Uncertainty was calculated as the width of the 95% CIs around the parameter estimates. Coverage was calculated as the percentage of simulations where the estimated 95% CIs contained the true parameter value. Code for the simulation study analysis is available at: https://github.com/ Tori-Cox/Mixture-catalytic-models.

Long Xuyen, Vietnam data
When we applied the mixture model (Fig 3) to the data from Long Xuyen, Vietnam, the total population-level seroprevalence estimates ranged from 0.163 (95% CI 0.138-0. The seroprevalence estimates from all three models were consistent (as determined by the 95% CIs) for 4 out of 6 datasets (Fig 4, S3 Table). The general trend in the age-specific seroprevalence estimates, specifically for Datasets A-2:A-5, differed significantly between the mixture model and the catalytic models, with the mixture model estimating higher seroprevalence at the older ages (Fig 5).   Table). There is a higher degree of uncertainty around the estimates from the catalytic model when assuming a time-varying FOI compared to the time-constant FOI assumption (Fig 4). We observe greater differences in the estimates from each model when comparing the year-specific FOI as opposed to the averaged total FOI (S4 Fig).

Discussion
In this analysis, we explored the accuracy and bias of FOI and seroprevalence estimates obtained from mixture and catalytic models applied to serological data. The catalytic models were applied assuming a time-constant or time-varying FOI. We performed a simulation study to compare the performance of each model with known parameter values used to generate the simulated data, and we observed significantly greater accuracy in FOI and seroprevalence estimates from the mixture and time-varying catalytic models than time-constant catalytic models. We observed reduced bias and uncertainty in estimates from the mixture compared to the time-varying catalytic model.
In our simulation study, larger bias in the catalytic model estimates of FOI and seroprevalence (Figs 1 and 2), was associated with increased serostatus misclassification (S3 Fig). Serostatus misclassification occurred more often in simulations where the difference between the mean log(titre + 1) for the susceptible/seronegative component and the mean log(titre + 1) for the infected/seropositive component was lower (S2 Fig), indicating greater overlap between the distributions of the two components. Our results are consistent with previous work which showed greater bias in seroprevalence estimates using methods which employ cut-off thresholds to classify simulated antibody data as opposed to mixture models, when there was high overlap in the underlying components [24].
Differences in the degree of overlap between components in real serological datasets are likely impacted by many factors, including differences in the ELISA tests used to measure antibody titres, the age groups sampled and the underlying age structure of the population as well as the transmission setting and spatiotemporal heterogeneities in the risk of infection at the local scale. In datasets where there is clear separation in the bimodal distribution of antibody titres, catalytic and mixture models are expected to produce more similar estimates of FOI and seroprevalence as fewer samples are misclassified during the binary classification of the data needed to calibrate catalytic models [22,24]. This is consistent with the results from our simulation study and with the reduced variability we observe in our FOI and seroprevalence estimates from each model when they were applied to serological data from Indonesia compared to Vietnam, where the former had higher separation of titre distributions (Figs 3 and 4, S3 Table).
The estimates for Indonesia from each model were consistent with each other and with previously published FOI estimates from catalytic models fitted to case-notification data from 2008-2017 in Jakarta, Indonesia (0.130, 95% CI: 0.129-0.131) [12], and seroprevalence estimates from time-constant catalytic models applied to the same serology dataset (Dataset B) [34,40]. Our results show that the mixture and catalytic models do not significantly differ in their FOI and seroprevalence estimates in this setting. In contrast, the mixture model applied to the six datasets from Vietnam produced more variable estimates (FOI range = 0.026-0.099, seroprevalence range = 0.16.3-0.376) than the catalytic models (FOI range = 0.023-0.037 and 0.024-0.050, seroprevalence range = 0.190-0.300 and 0.189-0.299 under the assumption of a time-constant or time-varying FOI respectively). The variance was even greater in the age-specific seroprevalence and yearly FOI estimates (Figs 5 and S4). As expected, the time-varying catalytic model and the mixture model (which implicitly models FOI as time-varying), were better able to capture the age-specific seroprevalence than the time-constant catalytic model. The estimates from the mixture model tended to exceed those obtained from the catalytic models (Fig 4, S3 Table). Given the greater negative bias observed for the catalytic models in our simulation study, we expect the higher mixture model estimates to be more accurate for the Vietnamese setting. Lam et al., similarly observed higher FOI estimates when applying mixture models compared to catalytic models to serological data from Vietnam, for example 0.12 (95% CI: 0.11-0.14) compared to 0.07 (95% CI: 0.06-0.09), in Ho Chi Minh City [31].
A major advantage of the mixture model is the comparative ease with which it can be applied to serological data to estimate transmission intensity without the need to use thresholds to process the data. Furthermore, to generate robust estimates, there are fewer data requirements for mixture models than for catalytic models: in the former, the data are pooled, and age is used only to calculate the age-specific mean log(titre + 1) using a spline, meaning that there are no constraints on the number of participants per age category. However, it is important to consider the bias that will be introduced if the mixture distributions fit the titre data poorly [24]. In this study we accounted for this by using an information criterion to select the best fitting models from a range of options. In the future we will explore implementing the models in a Bayesian framework [22] which would allow us to perform posterior predictive checks to more robustly assess model fit. It would also be interesting to explore the FOI estimates obtained when applying a mixture model with more than two mixture distributions, which may better account for the complex immunity profiles observed in areas where multiple DENV serotypes circulate. For example, Biggs et al. and Lam et al. fit three-component mixture distributions to DENV antibody titre data in the Philippines and Vietnam respectively, to develop frameworks capable of distinguishing between post and primary DENV infection [30,31] by specifying mixture components for seronegative, seropositive with a primary infection and seropositive with post-primary infections.
In summary, our results suggest that mixture models represent a good alternative to catalytic models to quantify DENV time-varying FOI and seroprevalence from age-stratified serological data, with potentially less bias and less uncertainty. They may be particularly useful when estimating FOI from data where there is high overlap between the component distributions, where the risk of serostatus misclassification and bias introduction when using cut-off threshold methods is greater (S2 and S3 Figs).
We have provided code to run the simulation study to encourage further exploration and comparison of the different methodologies. Critically, further investigation of the use of mixture models depends on the availability of raw antibody titre data. For these reasons, we would encourage current and future seroprevalence studies on DENV, as well as other infectious diseases, to publish anonymised individual-level antibody titre data where it is possible to do so. Shading represents the 95% Confidence Intervals (CI). The grey points show the observed seroprevalence per age group calculated from the binarily classified IgG data (seropositive individuals / tested individuals), with error bars indicating the 95% exact binomial CIs. The seroprevalence data and model estimates are overlayed for the purpose of comparison. However, it is important to note that the mixture model was not fitted to the data (grey points), as the former does not depend on the titre classification. The size of the grey data points represents the number of individuals tested in each age group.  Table. Force of infection (FOI) and total population level seroprevalence (SP) estimates from the mixture model and the catalytic models fitted to the observed data. The observed data is serology data collected in Vietnam (Datasets A-1:A-6) and Indonesia (Dataset B). 95% Confidence Intervals (CI) were calculated by the bootstrap method. (DOCX) S1 Fig. (A) True versus estimated parameter values from the mixture model fitted to the simulated datasets (Dataset C). The estimated parameters are the mean log(titre + 1) value of the seronegative/ susceptible (S) and seropositive/infected (I) components (μs and μI respectively) and the corresponding standard deviations (σs and σI). Red indicates the estimates where the true parameter value was not captured by the estimates (i.e., the 95% Confidence Interval of the estimate did not contain the true value). Note that the axes limits differ for each panel.