Dynamics of trachoma infection in West Africa revealed by a hidden state model

  • Jake Carson ,

    Roles Formal analysis, Methodology, Writing – original draft

    Jake.Carson@warwick.ac.uk

    Affiliation Mathematics Institute, University of Warwick, Coventry, United Kingdom

  • Thomas Crellen,

    Roles Formal analysis, Writing – review & editing

    Affiliations Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore, Nuffield Department of Medicine, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, United Kingdom, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom

  • Anna Borlase,

    Roles Formal analysis, Writing – review & editing

    Affiliation Department of Biology, University of Oxford, Oxford, United Kingdom

  • Joaquin M. Prada,

    Roles Formal analysis, Writing – review & editing

    Affiliation Faculty of Health and Medical Sciences, University of Surrey, Surrey, United Kingdom

  • Robin Bailey,

    Roles Conceptualization, Data curation

    Affiliation Faculty of Infectious & Tropical Disease, London School of Hygiene & Tropical Medicine, London, United Kingdom

  • T. Déirdre Hollingsworth,

    Roles Conceptualization, Funding acquisition, Writing – review & editing

    Affiliations Nuffield Department of Medicine, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, United Kingdom, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom

  • Simon E. F. Spencer

    Roles Conceptualization, Funding acquisition, Methodology, Writing – original draft

    Affiliation Department of Statistics, University of Warwick, Coventry, United Kingdom

This is an uncorrected proof.

Abstract

Trachoma is estimated to be the leading infectious cause of blindness globally, predominantly affecting low-income populations with poor sanitation and hygiene. Over a decade of mass drug administration with antibiotics has led to substantial progress in control and elimination, but hotspots remain where infection persists or rebounds following mass drug administration for reasons that remain unclear. Transmission modelling is a key component of understanding these dynamics, but the complex dynamics of infection and reinfection with Chlamydia trachomatis are challenging to infer from cross–sectional surveys. Here, we analyze longitudinal data collected over six months in 1991 using multiple diagnostics from two villages in The Gambia by developing and fitting a Bayesian epidemiological model that classifies individuals into disease states at each time point using a forward-filtering backward-sampling algorithm. We find that infection risk is clustered within households and the weekly probability of transmission within a shared room is 40–fold higher than in a shared village. Infected children are estimated to contribute disproportionately to transmission, accounting for 70–90% of the force of infection within the observed period. We estimate the basic reproduction number, R0, to be 2.2 by simulation and find that the distribution of secondary cases per individual is less aggregated than for other directly-transmitted pathogens. We further quantify heterogeneity in predisposition to becoming infected and estimate the sensitivity and specificity for PCR, antigen detection tests, and clinical examinations. Our study uncovers the natural history of trachoma infection, with implications for simulating pathogen dynamics and designing interventions to halt transmission and prevent avoidable cases of blindness.

Author summary

Trachoma is an infectious disease that can lead to blindness through repeated infections over time. Understanding trachoma transmission is important for designing surveys, evaluating the impact of different intervention strategies, and allocating resources for control programmes. Here, we infer transmission properties by analysing data from two villages in The Gambia, in which the same cohort of individuals was followed for six months. By developing and fitting an individual-level model to the 1,410 individuals, we estimate the effects of household structure and age on trachoma transmission. Our analysis finds that infection risk is strongly affected by household structure, with transmission between individuals sharing a room being 40 times higher than between individuals sharing only a village. We also find that transmission is dominated by children, who contribute over 70% of the force of infection over the study period. We further quantify differences in predisposition to infection between individuals. Finally, we determine the error rates of PCR, antigen detection tests, and clinical examinations, which were used during the study.

Introduction

Trachoma is an infectious cause of blindness caused by ocular strains of the gram-negative bacterium Chlamydia trachomatis and is classified as a neglected tropical disease [1]. Repeated infection leads to chronic inflammation of the conjunctiva and to scarring of the inner eyelid (trachomatous scarring; TS) [2]. Over time this scar tissue contracts, which can cause distortion of the eyelids, entropion (inward rotation of the eyelid), leading to contact between the eyelashes and the surface of the eye (trachomatous trichiasis; TT). This is an acutely painful condition, and can ultimately lead to corneal opacity, visual impairment and irreversible blindness. Transmission is mediated person-to-person by ocular or nasal discharge via hands, fomites (e.g., bedding) and eye–seeking flies [3]. As of March 2026, trachoma is endemic in 30 countries and is responsible for blindness or visual impairment in approximately 1.9 million people, with the majority of cases in African countries [4]. Ambitious targets have been set for the control of trachoma worldwide, with all remaining endemic countries aiming for “elimination as a public health problem” by 2030, which requires a prevalence of trachomatous inflammation–follicular (TF; a marker of current or recent infection) <5% in children 1–9 years old, and a prevalence of TT < 0.2% in people 15 years old and over [5]. Mass treatment with the broad-spectrum antibiotic azithromycin is the main tool used to clear infection and prevent onward transmission [6,7]. Other disease control strategies include facial cleanliness, environmental improvement to reduce fly density, and the provision of clean water [8]. Surgical interventions can treat trichiasis [3], with evidence that health systems can manage TT cases being a third criterion for elimination as a public health problem.

Transmission modelling plays an important role within trachoma elimination programmes for survey design, evaluating the impact of interventions, and allocating future resources [7,9,10]. Understanding the mechanisms of pathogen transmission underpins these large-scale predictive models [11]. Key aspects of trachoma transmission that are important to quantify from epidemiological data include i) how the number of times a person has been infected, and their age, affects the course of subsequent infections, e.g., the observation of shorter infectious periods and lower bacterial load in older patients [12], ii) heterogeneity in susceptibility [13] and iii) declining transmission over time in untreated regions [14]. However, most data take the form of cross-sectional surveys, rather than repeated observations, making it challenging to infer hidden or partially observed epidemiological processes.

Here we fit an individual-level stochastic transmission model to detailed longitudinal data collected from endemic communities in West Africa in the early 1990s. This dataset, observed prior to the demonstration of the efficacy of single-dose azithromycin, provides a unique resource for modelling the dynamics of trachoma infection because no intervention was applied for the six month observation period [13,15]. Previous analyses of this cohort have focused on the duration of infection in adults and children, diagnostic efficacy and household transmission [12,13,16,17]. We are the first to infer the epidemiological state of individuals, characterize heterogeneity in susceptibility, and quantify the contribution of adults and children to transmission. The model structure and parameters estimated here will inform a predictive model of trachoma transmission throughout the African continent to inform disease control efforts and policy planning [18].

Materials and methods

Ethics statement

The data used in this study derive from a previously conducted cohort study, which received ethical approval from the Joint Gambia Government and Medical Research Council Ethics Committee (SCC 508) [16]. In the original study, formal verbal informed consent was obtained from all participants, with consent witnessed and documented in accordance with standard procedures at the time. No new data collection involving human participants was undertaken for the present study.

Longitudinal data

We use data from a cohort study in two villages in The Gambia between April and November 1991, Jali (J; population 893) in Kiang West district and Berending (B; population 517) in Kombo South district, which lie approximately 90 km apart [13,15]. The dataset includes trachoma diagnostics and a clinical eye examination at baseline; the age of the surveyed participants; the residential structure (compounds within villages, rooms within compounds); and, for people included in the cohort, subsequent diagnostics taken every two weeks and a further clinical examination at the end of the study period. The village structure and sample size are summarised in Table 1.

Table 1. Summary of the household data used in this study, showing the number of residential compounds, rooms and occupancy within the two study villages (J and B) in The Gambia.

https://doi.org/10.1371/journal.pcbi.1014313.t001

At the start of the observed period, the eyes of all available individuals were clinically examined, and conjunctival swabs taken for antigen detection tests and PCR targeting a plasmid sequence. During the clinical examinations signs of trachoma were graded according to degree of inflammation, presence of follicles, and sequelae (including conjunctival scarring, entropion and corneal opacity). The PCR and antigen detection tests indicate the presence or absence of Chlamydia trachomatis for each individual.

Following the baseline survey, 68 individuals from Berending and 188 individuals from Jali were surveyed longitudinally over a six-month period. Participants were included from twenty randomly selected households, based on the criteria of having at least one confirmed case of active trachoma and at least four household members without active trachoma [13]. Follow-ups occurred every two weeks, in which participants were clinically examined for signs of trachoma and swabs were taken for antigen detection tests. There are some instances of missing data due to absences during the survey and diagnostics failing to return readable results. In the final week, all available individuals underwent a second clinical eye examination. Note that we exclude data from individuals under 1 year of age, as clinical signs of trachoma are considered unreliable in this age group. These individuals are still included in the model described in the next section, but are treated as having missing data.

Trachoma transmission model

We adapted previously published transmission frameworks as an individual-based stochastic model, informed by our understanding of the natural history of trachoma infection [9,12,16,19]. In endemic regions, people undergo repeated trachoma infections throughout their lifetime and those with past infections are known to have shorter infection clearance and disease recovery periods, possibly due to an acquired immune response [12]. We consider four possible epidemiological states in our model: susceptible (S), infectious (I), infectious and diseased (ID), and diseased (D). Here, ‘diseased’ refers to TF, from which individuals may recover in the absence of reinfection. Individuals transition from S to I after exposure to C. trachomatis; from I to ID following an incubation period; from ID to D when the infection is cleared; and finally from D to S after a recovery period. While in state D individuals may also be reinfected, albeit with a modified level of susceptibility governed by a relative reinfection risk parameter. To account for differences in susceptibility and duration of infectiousness, we consider an individual’s first infection separately from all subsequent infections and therefore denote naïve states as S0, I0, ID0, D0, and non-naïve states as S1, I1, ID1, D1. The model structure is shown in Fig 1.

Fig 1. Structure of the trachoma transmission model.

Individuals progress through susceptible (S), infectious (I), infectious and diseased (ID), and diseased states (D). We distinguish between first infection (0) and any later infections (1). Transmission is represented by the p and q parameters, and are durations.

https://doi.org/10.1371/journal.pcbi.1014313.g001

Our epidemiological framework represents a simplification of previous differential equation models, which used distinct states for each subsequent infection with C. trachomatis, e.g., In for the nth infection, referred to as a “ladder of infection” [9]. One reason for simplifying this to naïve (I0) and non-naïve (I1) is that it is impossible to identify an individual’s exact number of previous infections within the six-month observation period, particularly as many rungs of the ladder exhibit similar dynamics. Our stochastic model also introduces heterogeneity to the durations in each state, removing the need for a ladder to encompass different values.

For individuals in state S (taken to mean either S0 or S1) we define the probability of transitioning to state I in a single time step (one week). Let vi, ci, ri denote the village, compound, and room of person i respectively. Furthermore, let , , be the number of infectious people (in states I0, I1, ID0, ID1, assumed equally infectious) at time t in village vi, compound ci, and room ri respectively. We define

(1)

as the probability of transitioning between time t and time t + 1. This probability is derived from a Poisson process and represents the probability of at least one event occurring in unit time with rate . The parameters , , and are coefficients that determine the infectiousness between individuals that are known to share a village, compound, or room. Individuals that share a room must share a compound and individuals in the same compound share a village. Values of indicate that infectious individuals sharing a compound have a greater contribution to the infection pressure than individuals who only share a village. Values of indicate that infectious individuals sharing a room have a greater contribution to the infection pressure than individuals who only share a compound and/or village. Separate parameters are used for the two villages (denoted by the superscript vi), and we assume no mixing between them.
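As a minimal sketch of the infection probability described above, the function below converts a weekly infection rate into the probability of at least one infection event. The coefficient names (`beta_village`, `beta_compound`, `beta_room`), the additive nested form of the rate, and the numerical values are assumptions made for illustration only; they are not the notation or estimates of the fitted model.

```python
import math

def weekly_infection_probability(n_village, n_compound, n_room,
                                 beta_village, beta_compound, beta_room):
    """Probability of at least one infection event in one week.

    Assumption (hypothetical form): the rate is a sum of coefficient * count
    terms, with nested counts, so an infectious room-mate is also counted
    in the compound and village totals.
    """
    rate = (beta_village * n_village
            + beta_compound * n_compound
            + beta_room * n_room)
    # For a Poisson process, P(no event in unit time) = exp(-rate).
    return 1.0 - math.exp(-rate)

# Illustrative coefficients only (not estimates from the study).
coefficients = dict(beta_village=0.001, beta_compound=0.005, beta_room=0.02)

# One infectious room-mate: counted at village, compound and room level.
p_room = weekly_infection_probability(1, 1, 1, **coefficients)
# One infectious person elsewhere in the village: counted at village level only.
p_village = weekly_infection_probability(1, 0, 0, **coefficients)
```

Under this form, with no infectious individuals present the rate is zero and the probability of infection is exactly zero.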

For individuals in state I we define the duration spent in this state before progressing to state ID. We denote the duration as and the mean duration as . Then is distributed according to a translated negative binomial distribution with mean and size 2:

(2)

An equivalent formulation would be to duplicate state I and model transitions through two sequential infectious compartments with geometric durations, yielding a Markov structure [20]. However, because the inference algorithm we employ scales quadratically with the number of model states [21], this formulation would substantially increase computational cost. We therefore adopt the negative binomial specification, where is defined by the following probability mass function:

(3)

where m is the mean and n is the size parameter of the negative binomial distribution. By translating the negative binomial distribution we ensure that the duration is at least one week. Negative binomial distributions are commonly employed in infectious disease modelling to give unimodal probability distributions with a mode bigger than zero. Here, the mode of is . We make no distinction for individuals being naïve or non-naïve, and the parameters are assumed to apply to both villages.
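The translated negative binomial duration can be sketched numerically as follows. The probability mass function uses the standard mean and size parameterisation; the function names are hypothetical, and the sketch assumes that the mean m refers to the untranslated count, so that the mean duration is m + 1 weeks. If the mean is instead defined for the translated duration, the shift would need to be absorbed into m.

```python
import math

def neg_binom_pmf(x, mean, size):
    """P(X = x), x = 0, 1, 2, ..., for a negative binomial with given mean and size."""
    if x < 0:
        return 0.0
    log_p = (math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
             + size * math.log(size / (size + mean))
             + x * math.log(mean / (size + mean)))
    return math.exp(log_p)

def translated_pmf(d, mean, size=2):
    """P(D = d) where D = 1 + X, so every duration is at least one week.

    Assumption: `mean` is the mean of the untranslated count X.
    """
    return neg_binom_pmf(d - 1, mean, size)

# Numerical check over a long truncated support (illustrative mean of 4 weeks).
total = sum(translated_pmf(d, mean=4.0) for d in range(0, 500))
mean_duration = sum(d * translated_pmf(d, mean=4.0) for d in range(0, 500))
```

With size 1 the same function reduces to a shifted geometric distribution, which is the Markovian special case.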

For individuals in state ID we define the duration spent in this state before progressing to state D. As with state I, we model these durations using translated negative binomial distributions. However, here a distinction is made between naïve and non-naïve individuals, with the mean duration being for naïve individuals, and for non-naïve individuals:

(4)

Individuals in state D can either progress to state S via recovery, or to state ID via reinfection. The duration spent in state D before progressing to state S follows a translated negative binomial distribution, with the mean duration being for naïve individuals, and for non-naïve individuals:

(5)

Over these durations, individuals may become reinfected with a modified infection probability. Specifically, the infection rate is multiplied by a parameter , which can make infection more () or less () likely than in the S state:

(6)

Our model estimates the initial states for each individual. This consists of two parts, first we define the probability that an individual is naïve or non-naïve at the start of the survey given their age and village, then given an individual’s naïve status we define the probability that they are in state S, I, ID, or D. We assume that each person’s history of prior trachoma infection (before the start of the survey) follows a village-dependent Poisson process with rate . The probability that individual i is naïve at the start of the study is then:

(7)

where ai is the age of individual i, and the additional 0.5 is a continuity correction. We define , , , and as the probability of a naïve individual starting in state S0, I0, ID0, D0 respectively. Likewise, we define , , , and as the probability of a non-naive individual starting in state S1, I1, ID1, D1 respectively. Additionally, ages are missing for fourteen individuals, and so these are included as extra model parameters.
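The probability of being naïve at baseline follows directly from the Poisson assumption: it is the probability of zero prior infections over the individual's lifetime. A minimal sketch, with hypothetical argument names and assuming that age is recorded in whole years and the rate is expressed per year:

```python
import math

def prob_naive(age_years, infection_rate):
    """Probability of no prior infection at the start of the survey.

    Under a Poisson process with `infection_rate` infections per year, the
    probability of zero events over (age + 0.5) years is exp(-rate * time).
    The 0.5 is the continuity correction for ages recorded in whole years.
    """
    return math.exp(-infection_rate * (age_years + 0.5))

# Illustrative rate of one infection per year (not an estimate from the study).
p_child = prob_naive(age_years=2, infection_rate=1.0)
p_adult = prob_naive(age_years=30, infection_rate=1.0)
```

The probability falls quickly with age, so under any appreciable historic infection rate almost all adults are treated as non-naïve.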

Likelihood and observation process

We construct a likelihood function to assign individuals to an epidemiological state at each time point based on their diagnostic and clinical observations. We use three types of observation: i) PCR targeting a plasmid sequence, ii) antigen detection tests, and iii) the results of clinical examinations. PCR and antigen detection tests indicate C. trachomatis infection, that is, whether individuals are in I or ID; clinical examinations indicate TF, that is, whether individuals are in ID or D. We estimate the sensitivity and specificity of each observation type, assuming these are the same between the two study sites. This gives six diagnostic parameters (Table 2).

Table 2. Prior probability distributions used for diagnostic parameters.

https://doi.org/10.1371/journal.pcbi.1014313.t002
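The observation process can be illustrated with a short sketch that returns the probability of a set of test results given a hidden state. The function names, the dictionary layout and the sensitivity and specificity values are hypothetical, and the sketch assumes that the tests are conditionally independent given the state; missing results (`None`) contribute no information.

```python
INFECTED_STATES = {"I", "ID"}   # states targeted by PCR and antigen detection
DISEASED_STATES = {"ID", "D"}   # states targeted by clinical examination (TF)

def prob_test_result(result_positive, truly_positive, sensitivity, specificity):
    """P(observed result | true status) for a single binary test."""
    p_positive = sensitivity if truly_positive else 1.0 - specificity
    return p_positive if result_positive else 1.0 - p_positive

def observation_likelihood(state, results, params):
    """P(all observed results | hidden state).

    results: test name -> True, False, or None (missing).
    params:  test name -> (sensitivity, specificity).
    Assumes conditional independence of tests given the state.
    """
    likelihood = 1.0
    for test, outcome in results.items():
        if outcome is None:
            continue  # missing data: no contribution
        target = INFECTED_STATES if test in ("pcr", "antigen") else DISEASED_STATES
        sensitivity, specificity = params[test]
        likelihood *= prob_test_result(outcome, state in target,
                                       sensitivity, specificity)
    return likelihood

# Illustrative values only (not estimates from the study).
params = {"pcr": (0.9, 0.99), "antigen": (0.8, 0.98), "clinical": (0.7, 0.95)}
results = {"pcr": True, "antigen": False, "clinical": True}
likelihood_by_state = {s: observation_likelihood(s, results, params)
                       for s in ("S", "I", "ID", "D")}
```

Normalising `likelihood_by_state` against prior state probabilities gives the kind of per-week state weighting that the filtering step relies on.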

Inference methodology

We denote as the model parameters, as hidden infection states of all individuals for weeks 1,...,T, and as diagnostic and clinical observations. We sample from the posterior distribution

(8)

where is the combined density of the prior distributions, is given by the stochastic transmission model, and is given by the sensitivity and specificity of the observations. We implement a Markov chain Monte Carlo (MCMC) algorithm to obtain samples of the posterior distribution, alternating between updating model parameters conditional on the hidden infection states and updating the hidden infection states conditional on the model parameters. The full inference procedure is summarised in S1 Text, with key details outlined below.

Although some model parameters are village dependent, we fit the model to the data from both villages in a single analysis. Combining data from both villages increases the information available for the shared parameters, directly improving the precision of their estimates. Since the village-dependent parameters may depend on these shared quantities, estimates of these parameters may also benefit indirectly from the joint analysis.

When updating the model parameters conditional on the hidden infection states we use a variety of MCMC updates that utilise available conjugacy and provide efficient mixing. For the initial state probabilities, sensitivity, and specificity parameters, our choice of conjugate prior distributions means that the full conditional distributions are available, allowing efficient Gibbs updates. For the unknown age parameters we use independence Metropolis proposals using their prior distributions. We find that there is little information in the data to estimate these ages, meaning that the posterior distributions closely resemble the prior distributions, and so this is a more efficient approach than using random-walk proposals. For the remaining parameters, the full conditional distributions are not analytically available, and so we use random-walk Metropolis updates. Transmission parameters for each village are updated jointly in blocks to reduce computational cost. As the three transmission parameters contribute to the infection probabilities, which are needed to compute the likelihood for these updates, blocking these parameters avoids multiple calculations of each infection probability. The mean durations are re-parametrised to the probability parameters of the negative binomial distributions to improve mixing. The random-walk proposal distributions are constructed using empirical covariance estimates obtained from pilot runs. The proposal covariance matrix for each block of transmission parameters is taken to be the empirical covariance matrix scaled by a factor of 2.38² / 3 (the block dimension being 3), following [27]. Likewise, for the remaining parameters the proposal variance is scaled by 2.38², as these are univariate proposals.
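The proposal scaling can be sketched as follows, using the widely used rule of multiplying the pilot covariance by 2.38² divided by the dimension of the block. The function name and the pilot covariance values are hypothetical.

```python
def scaled_proposal_covariance(pilot_cov):
    """Scale an empirical covariance matrix by 2.38^2 / d.

    pilot_cov: square list-of-lists covariance estimate from a pilot run,
    where d is the number of parameters updated together in the block.
    """
    d = len(pilot_cov)
    factor = 2.38 ** 2 / d
    return [[factor * value for value in row] for row in pilot_cov]

# A block of three transmission parameters (illustrative pilot covariance).
pilot_block = [[1.0, 0.2, 0.0],
               [0.2, 1.0, 0.1],
               [0.0, 0.1, 1.0]]
block_proposal = scaled_proposal_covariance(pilot_block)

# A univariate update: the factor is 2.38^2 / 1.
univariate_proposal = scaled_proposal_covariance([[0.04]])
```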

In order to update the hidden infection states conditional on the model parameters, we implement the individual forward-filtering backward-sampling algorithm (IFFBS), as previously described [21]. We loop through individuals i = 1,2,...,1410 and update the hidden infection states of individual i, conditional on the current sample of infection states for all other individuals () and model parameters ():

(9)

As negative binomial durations yield a semi-Markov model, this cannot be done directly. Instead, we first construct a Markovian approximation to our stochastic transmission model by setting the size parameters of the negative binomial distributions to 1 (corresponding to geometric distributions) while maintaining the same mean durations. For each individual, we propose a set of hidden infection states using IFFBS with the approximate model, and then apply a Metropolis-Hastings accept-reject step to correct for the discrepancy between the approximate and correct models. This accept-reject step ensures that the resulting Markov chain has the correct stationary distribution, and posterior samples are drawn from the true semi‑Markov model rather than from the Markov approximation. As demonstrated in [21], this is a highly efficient way to sample the hidden states of semi-Markov models, particularly when the size parameters are close to 1, as is the case here.
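To illustrate the forward-filtering backward-sampling idea on which the proposal is built, the sketch below samples a hidden path for a single individual in a generic hidden Markov model. It is a deliberate simplification: the transition matrix is fixed rather than depending on the states of other individuals, there are only two states, and the Metropolis-Hastings correction for the semi-Markov durations is omitted. All names and values are hypothetical.

```python
import random

def ffbs(observations, init, trans, emit, rng):
    """Sample a hidden state path given observations.

    observations: list of observed symbols, or None where data are missing.
    init[k]:      prior probability of state k at the first time point.
    trans[j][k]:  probability of moving from state j to state k.
    emit[k][y]:   probability of observing symbol y in state k.
    """
    n_states = len(init)
    n_times = len(observations)

    # Forward filtering: P(state at t | observations up to t).
    filtered = []
    for t in range(n_times):
        if t == 0:
            predicted = list(init)
        else:
            predicted = [sum(filtered[t - 1][j] * trans[j][k]
                             for j in range(n_states))
                         for k in range(n_states)]
        y = observations[t]
        weights = [predicted[k] * (1.0 if y is None else emit[k][y])
                   for k in range(n_states)]
        total = sum(weights)
        filtered.append([w / total for w in weights])

    # Backward sampling: draw the last state, then each earlier state
    # conditional on the state sampled after it.
    path = [0] * n_times
    path[-1] = rng.choices(range(n_states), weights=filtered[-1])[0]
    for t in range(n_times - 2, -1, -1):
        weights = [filtered[t][j] * trans[j][path[t + 1]]
                   for j in range(n_states)]
        path[t] = rng.choices(range(n_states), weights=weights)[0]
    return path

# States: 0 = uninfected, 1 = infected. Illustrative values only.
trans = [[0.9, 0.1], [0.2, 0.8]]
imperfect_test = [[0.98, 0.02], [0.2, 0.8]]   # rows: state, columns: result
sampled = ffbs([0, None, 1, 1, None, 0], [0.8, 0.2], trans,
               imperfect_test, random.Random(1))
```

In the full algorithm each individual's transition probabilities would also depend on the current states of everyone else, and the sampled path would be accepted or rejected to correct for the geometric approximation.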

Particular attention is paid to the durations from the start of the survey. Since our model is non-Markovian, we need to estimate how long each individual has been in their current state, and calculate the remaining duration accordingly. These durations will be upwardly biased due to length-biased sampling [28]. This occurs when the probability that an observation is included in a sample increases with the length (or duration) of the phenomenon being measured. In our setting, the longer the duration of infection, the more likely it is to overlap with the start of the survey. As a result, longer infection durations are overrepresented, and the sample is not representative of the underlying population. For the durations from I to ID, and from ID to D, the corresponding length-biased distributions can be derived analytically [28]. The length-biased distributions from state D are more challenging, as individuals can transition to S via recovery, or to ID via reinfection. Formal derivation of the length-biased durations would therefore require each person’s reinfection risk prior to the survey period, which is computationally burdensome to estimate. Instead, for the purpose of approximating the length-biased distribution, we assume a constant reinfection risk for all individuals. This means that the durations from D are the smaller of the durations from: a negative binomial distribution governing the transition from D to S (recovery), and a geometric distribution governing the transition from D to ID (reinfection). Under this simplification we can determine the length-biased duration distributions from state D numerically.
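The effect of length-biased sampling can be shown with the textbook construction, in which the probability of each duration is reweighted in proportion to its length. This is only a sketch of the general principle with hypothetical names; the distributions actually used in the model are those derived in [28] and may include further detail, such as the time already spent in the state.

```python
def length_biased(pmf):
    """Length-biased version of a duration distribution.

    pmf: dict mapping duration (weeks) -> probability.
    Returns a dict with probabilities proportional to duration * probability.
    """
    mean = sum(d * p for d, p in pmf.items())
    return {d: d * p / mean for d, p in pmf.items()}

def mean_of(pmf):
    return sum(d * p for d, p in pmf.items())

# Geometric durations on 1, 2, 3, ... with weekly exit probability 0.5,
# truncated at a point where the remaining mass is negligible.
geometric = {d: 0.5 ** d for d in range(1, 200)}
biased = length_biased(geometric)

mean_before = mean_of(geometric)   # mean duration in the population
mean_after = mean_of(biased)       # mean duration among those caught at baseline
```

The length-biased mean equals E[D²]/E[D], which always exceeds the ordinary mean unless all durations are identical.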

Prior distributions

We use informative prior distributions for the observation sensitivities and specificities based on previous studies (Table 2). Although informative prior distributions can be constructed for the PCR tests from previous studies, comparable data are not available for antigen detection tests in this setting, and we therefore use the same prior distributions for these two diagnostics. This choice reflects prior uncertainty about the diagnostic performance in this setting, while avoiding favouring one test over another a priori. For the remaining model parameters, we use weakly informative prior distributions to constrain the parameters within sensible ranges.

For the six transmission parameters, we use exponential priors with rate 5, allowing transmission probabilities to be small or large. For example, if we consider a single susceptible person sharing a village with a single infected person, the 95% prior highest density region (HDR) credible interval for the weekly infection probability would be (0.00, 0.45). For the means of the durations, we re-parameterise to the probability parameter of the negative binomial distribution, and use a uniform prior distribution on the range (0, 1). We constrain the mean durations in states ID and D for naïve individuals to be larger than the corresponding means for non-naïve individuals, as naïve infections take longer to clear on average. For the relative reinfection risk parameter we use a prior distribution that allows the infection risk to be lower or greater for a diseased individual compared to a susceptible individual. The parameters used to approximate the length-biased distribution from state D are given exponential prior distributions with rate 20, which translates to a 95% prior HDR credible interval of (0.00, 0.15).
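The two prior intervals quoted above can be checked with a few lines of arithmetic. Because the exponential density is decreasing, its highest density region starts at zero and ends at the corresponding quantile. The function name is hypothetical, and the conversion to a probability assumes that a single infectious contact at village level contributes a rate equal to the coefficient itself.

```python
import math

def exponential_hdr_upper(rate, mass=0.95):
    """Upper end of the highest density region of an Exponential(rate) prior."""
    return -math.log(1.0 - mass) / rate

# Transmission coefficient prior, rate 5.
upper_coefficient = exponential_hdr_upper(5)             # about 0.60
upper_probability = 1.0 - math.exp(-upper_coefficient)   # about 0.45

# Prior with rate 20 used for the length-biased approximation from state D.
upper_reinfection = exponential_hdr_upper(20)            # about 0.15
```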

For historic infection rates we use a distribution, which broadly covers 1 infection every 6 years to 6 infections per year. The prior distributions for the probability vectors and are non-informative Dirichlet distributions with all concentration parameters equal to 0.5.

A total of 14 individuals have unknown ages, and so these unknown ages are included as additional parameters. The prior distribution for each age parameter follows the empirical age distribution of the relevant village.

Model fit

We ran four parallel MCMC chains for 150,000 iterations, which took approximately 150 hours on a 3 GHz processor. We examined the posterior chains and obtained a multivariate Gelman-Rubin statistic of 1.01, indicating good convergence.

Results

Prevalence of C. trachomatis and ocular disease

We fit the epidemiological model (described in Methods) to cohort data on trachoma infection from two villages in West Africa (Table 1). We start by inferring the prevalence of infection with C. trachomatis, (NI + NID) / Ntot, and the prevalence of TF, (NID + ND) / Ntot, where NI, NID, ND, and Ntot are the number of individuals in state I, state ID, state D, and the total number of individuals respectively. These numbers are derived through time from the inference of hidden epidemiological states for each individual (an example is shown in Fig 2). Both villages were endemic for trachoma during the study period, see Fig 3. In the larger village (J), the prevalence of infection with C. trachomatis started at 18%, declining to 12% by the end of the observed period. In the second village (B) the prevalence of infection was more stable at around 11% over the observed period. The prevalence of TF was higher than active infection in both villages, exceeding 20% in the larger village (J) for most of the study period, while the prevalence of TF was around 16% in the second village (B).

Fig 2. Example of state posterior estimates for nine people sharing a room.

Each person is represented by four rows of circles. The first row shows antigen detection test results (hollow for negative, black solid for positive), plus the results of the baseline PCR tests offset to the left of the baseline antigen detection test result. Missing circles indicate missing data. The second row shows 25 circles indicating the marginal probability of the person being infected (I or ID states) for each week. Darker shades indicate higher probabilities. The third row shows the marginal posterior probabilities of being diseased (ID or D states). Finally, the fourth row shows the results of clinical examinations for ocular disease.

https://doi.org/10.1371/journal.pcbi.1014313.g002

Fig 3. Prevalence of conjunctival Chlamydia trachomatis infection and trachomatous inflammation–follicular (TF) in two villages in The Gambia over 25 weeks, as inferred by a hidden state epidemiological model.

https://doi.org/10.1371/journal.pcbi.1014313.g003

Transmission of trachoma

Our results indicate that the probability of infection with C. trachomatis is strongly affected by the proximity and nature of contact in domestic settings. The posterior mean probability of a susceptible person becoming infected () given one infectious case residing within the same room is 2.1 (95% credible interval [CrI] 3.2 , 4.8 ) per week, which is 39-fold greater than the mean probability of transmission given one infected person residing within the same village (5.4 , 95% CrI 2.4 ). These values are from the larger village (J), though the effect is even greater in the second village (B) where the probability of transmission increases 48-fold in a shared room compared to a shared village, see Table 3.

Table 3. Risk of a susceptible person acquiring conjunctival Chlamydia trachomatis infection per week, both as probabilities (see Equation 1) and relative risk, given one infectious case in a shared village, compound, or room.

https://doi.org/10.1371/journal.pcbi.1014313.t003

We assess whether individuals who have previously been infected and have ocular disease (D) are more likely to become reinfected than susceptible individuals (S). The posterior distribution of the reinfection parameter is broad, with a mean of 1.24 (95% CrI 0.27, 2.79), though the corresponding effect on infection risk is non-linear due to the exponential function (Eq 6). While the posterior mean exceeds 1, suggesting higher susceptibility among individuals with TF in contrast to earlier analyses [29], this estimate is highly uncertain. Notably, 41% of the posterior mass lies below 1, indicating substantial support for the alternative possibility that individuals in the D state are less susceptible than those in the S state. We observe no strong posterior correlation between the reinfection parameter and other model parameters, indicating that this uncertainty is not driven by parameter confounding but rather by limited information in the data. Although there are plausible biological mechanisms that could increase infection risk among diseased individuals (e.g., increased eye touching or eyelid damage), the data do not strongly constrain this parameter, and conclusions regarding differences in susceptibility should therefore be interpreted with caution.

Basic reproduction number

We estimate the basic reproduction number (R0) for conjunctival C. trachomatis and the distribution of secondary cases by simulation. We repeatedly sample a set of parameters from the posterior distribution, as well as an individual to act as the initial infected individual in an otherwise susceptible population. For this individual, we simulate their total infectious duration (progressing from I to D, assuming they are naïve) and the number of individuals they infect over this period. Averaging over a large number of samples provides an estimate of R0. We estimate R0 for the larger village (J) as 2.2 and for the second village (B) as 2.4.

The simulated distribution of secondary cases is shown in Fig 4. Fitting the negative binomial distribution to the secondary cases, we obtain a size parameter k of 1.81 for village J and 1.61 for village B. A small value of k (e.g., < 1) indicates highly heterogeneous transmission, with most secondary cases arising from a small number of individuals (“superspreaders”). As k increases, transmission becomes more evenly distributed across individuals, and the secondary case distribution converges towards a Poisson distribution. A previous study estimated the values of k for several past outbreaks (including SARS, measles, monkeypox, and pneumonic plague) to be in the range of [30]. Our results therefore indicate a lower dispersion of secondary cases compared with other directly transmitted pathogens.
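To illustrate how the size parameter k shapes the secondary case distribution, the following minimal Python sketch draws secondary cases from a negative binomial distribution with mean 2.2 and k = 1.81 (the values reported for village J). It is a stand-in only: it does not reproduce the individual-based simulation from the fitted transmission model, and the function names, the gamma-Poisson construction, and the method-of-moments estimator for k are assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_secondary_cases(r0, k, rng):
    """Negative binomial draw as a gamma-Poisson mixture with mean r0 and size k."""
    # Individual infectiousness ~ Gamma(shape=k, scale=r0/k), so its mean is r0.
    rate = rng.gammavariate(k, r0 / k)
    return sample_poisson(rate, rng)


def nb_zero_probability(r0, k):
    """P(no secondary cases) under a negative binomial with mean r0 and size k."""
    return (1.0 + r0 / k) ** (-k)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
cases = [sample_secondary_cases(2.2, 1.81, rng) for _ in range(100_000)]

mean_cases = sum(cases) / len(cases)
var_cases = sum((c - mean_cases) ** 2 for c in cases) / (len(cases) - 1)
# Method-of-moments estimate of k (an illustrative choice): var = mean + mean^2 / k.
k_hat = mean_cases ** 2 / (var_cases - mean_cases)
p_zero = nb_zero_probability(2.2, 1.81)

print(f"simulated mean: {mean_cases:.2f}")
print(f"moment estimate of k: {k_hat:.2f}")
print(f"P(zero secondary cases): {p_zero:.3f}")
```

Under these assumptions the simulated mean recovers the input reproduction number, and the analytical probability of zero secondary cases is about 0.24, so roughly three quarters of primary cases cause at least one onward infection.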

Fig 4. Estimated distribution of secondary cases from a single infected individual in a fully susceptible population, inferred by simulation, and the basic reproduction number (mean of distribution; dashed line).

https://doi.org/10.1371/journal.pcbi.1014313.g004

Heterogeneity in susceptibility within the cohort

The mean posterior probability of becoming infected (, given by ) per week is 0.0098 across both villages, which reflects the continuous infection pressure in both sites throughout the observed period. The distribution of these weekly probabilities for the 1,410 individuals in the cohort shows considerable heterogeneity, ranging 9-fold from 0.0031 to 0.028, which is consistent with the parametric distribution beta(6, 600) fitted by maximum likelihood to the posterior output (Fig 5). The cumulative probability of becoming infected by week t for person i is defined as . This rises to a mean of 0.22 over 25 weeks of exposure, and the cumulative individual risk at the end of the observed period varies 5-fold from 0.085 to 0.42 (Fig 5). We quantify this variance in individual infection risk by fitting the parametric model ; gives the intercept, t the study week, and gi represents individual-level predisposition to infection, where . Through Bayesian inference, we obtain mean posteriors as and . This fitted function is shown as the dashed line in Fig 5.
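As a rough arithmetic check on these figures, the short Python sketch below compounds a constant weekly infection probability over the study period. The constant-probability formula and the function name are assumptions made for illustration; in the fitted model each person's weekly probability changes over time with the infection status of their contacts, so the sketch only approximates the cohort mean.

```python
def cumulative_risk(weekly_prob, weeks):
    """Probability of at least one infection, assuming a constant weekly probability."""
    return 1.0 - (1.0 - weekly_prob) ** weeks


# Mean weekly probability reported across both villages, compounded over 25 weeks.
mean_cumulative = cumulative_risk(0.0098, 25)

# Mean of the fitted beta(6, 600) distribution: a / (a + b).
beta_mean = 6 / (6 + 600)

print(f"cumulative risk over 25 weeks: {mean_cumulative:.3f}")
print(f"mean of beta(6, 600): {beta_mean:.4f}")
```

Both quantities agree with the reported values to two significant figures (a cumulative risk of about 0.22 and a mean weekly probability of about 0.0099).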

Fig 5. Heterogeneity in trachoma susceptibility.

Left: The distribution of the weekly posterior probability of becoming infected as a histogram, with an overlaid parametric distribution fitted by maximum likelihood; beta(6, 600). Right: The cumulative posterior probability of infection () for all 1,410 individuals (i) over the 25 weeks (t) of the cohort study, where the dashed line gives the posterior mean of the fitted function , where and .

https://doi.org/10.1371/journal.pcbi.1014313.g005

Contribution of age groups to transmission

We examine the relative contribution of different age groups to the weekly probability of trachoma transmission, given by the force of infection (Eq 1). We find that young children (0–9 years) contribute disproportionately to transmission, accounting for more than 75% of the infection pressure in the larger village (J) and more than 70% in the second village (B) over the observed period, despite accounting for less than half of the population (J: 36%, B: 40%; Fig 6). Adults aged 16+ years contribute under 20% of the force of infection in both villages, which is a consequence of their shorter duration of infectiousness due to past infections.

Fig 6. Relative contribution of age groups to the force of infection (Eq 1) with each panel representing a village (Jali [J] and Berending [B]) in The Gambia (Table 1).

https://doi.org/10.1371/journal.pcbi.1014313.g006

Observation sensitivities and specificities

We obtain posterior means (95% credible intervals) for the specificities , , , and sensitivities , , . The marginal posterior distributions are shown in Fig 7 along with each prior distribution. In general, the data are informative about these parameters, but the estimates are more precise for the specificity parameters than for the sensitivity parameters.

Fig 7. Marginal posterior distributions for the specificities and sensitivities.

Histograms show the posterior distributions, and the lines show the prior distributions.

https://doi.org/10.1371/journal.pcbi.1014313.g007

For the PCR test, the specificity posterior mean is similar to the prior, but the sensitivity estimates are at the higher end of previously published values [22]. We determine that the specificity of the antigen detection test is very similar to that of PCR, but the sensitivity is substantially lower. For the clinical examinations, both the specificity and sensitivity are high.
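To make concrete how a sensitivity and a specificity link an imperfect test result to a hidden state, the following Python sketch applies Bayes' rule to a sequence of results from a single test. The numerical values (sensitivity 0.8, specificity 0.98, prior 0.1) are hypothetical and are not the estimates from this study; the function names and the assumption that results are conditionally independent given the true state are likewise illustrative.

```python
def prob_positive(in_target_state, sensitivity, specificity):
    """Probability of a positive result given the true (hidden) state."""
    return sensitivity if in_target_state else 1.0 - specificity


def posterior_in_state(prior, results, sensitivity, specificity):
    """Posterior probability of being in the target state after a list of test results.

    Results are booleans (True = positive) and are treated as conditionally
    independent given the true state, which is an assumption of this sketch.
    """
    weight_in, weight_out = prior, 1.0 - prior
    p_in = prob_positive(True, sensitivity, specificity)
    p_out = prob_positive(False, sensitivity, specificity)
    for positive in results:
        weight_in *= p_in if positive else 1.0 - p_in
        weight_out *= p_out if positive else 1.0 - p_out
    return weight_in / (weight_in + weight_out)


# Hypothetical values, chosen only to illustrate the calculation.
after_one_positive = posterior_in_state(0.1, [True], sensitivity=0.8, specificity=0.98)
after_one_negative = posterior_in_state(0.1, [False], sensitivity=0.8, specificity=0.98)

print(f"posterior after one positive result: {after_one_positive:.3f}")
print(f"posterior after one negative result: {after_one_negative:.3f}")
```

With a high specificity, a single positive result moves the posterior probability sharply upwards, whereas with a moderate sensitivity a single negative result leaves appreciable residual probability of being in the target state.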

Initial infection probabilities

The probability that an individual starts in any of the eight infection states has two components. The first is the probability of being classed as naïve at the start of the survey (depending on age and village), which is determined by the historic yearly infection rates and for villages B and J respectively. The second is the probability, conditional on an individual’s village and naïve status, of being in state S, I, ID, or D, giving 16 parameters for and .

The posterior mean (95% credible interval) for is 0.14 (0.11, 0.18) and for is 0.11 (0.07, 0.16). These values suggest that the historic infection rate is slightly higher in village J than in village B. In Fig 8 we compare the probabilities that individuals of a given age are classed as naïve at the start of the survey. Most people appear to have experienced at least one infection in early childhood in both villages. However, we should be careful not to over-interpret these parameters, as it is possible that some non-naïve individuals exhibit longer than anticipated recovery durations and are thus categorised as naïve in the model (or vice versa).

Fig 8. Age-dependent probability that an individual has no previous infections at the start of the observed period.

https://doi.org/10.1371/journal.pcbi.1014313.g008

For naïve individuals in village J, the probabilities of starting in states S, I, ID, and D are estimated from the posterior mean to be 0.25, 0.01, 0.61, and 0.13 respectively. For non-naïve individuals these probabilities are 0.96, 0.02, 0.01, and 0.01. For village B, the naïve probabilities are 0.55, 0.01, 0.32, and 0.12, and the non-naïve probabilities are 0.96, 0.01, 0.03, and 0.00. These probabilities relate to the durations for each state, as naïve individuals take longer to progress from the ID state than non-naïve individuals and so are more likely to be sampled in this state. The probability that a naïve individual starts in the S state is much lower in village J than in village B (Fig 8), further supporting the notion that the historic infection rate is higher in village J.

Durations in epidemiological states

The posterior means (95% credible intervals) for the duration parameters are presented in Table 4. As expected, the clearance and recovery durations are substantially shorter for non-naïve individuals.

Table 4. Posterior means and 95% credible intervals for the duration parameters in weeks. corresponds to the infectious states (I0 and I1), and correspond to the naïve and non-naïve infectious and diseased states (ID0 and ID1), and and correspond to the naïve and non-naïve diseased states (D0 and D1).

https://doi.org/10.1371/journal.pcbi.1014313.t004

Individual durations are stochastic, following a negative binomial distribution with size parameter 2, and so there is large variation. Naïve individuals clear their first infection (progress from I0 to D0; mean (95% credible interval)) in 37 (5, 104) weeks, and non-naïve individuals clear infections (progress from I1 to D1) in 6.2 (2, 16) weeks. The estimates of and may not be indicative of individual durations from D to S, due to the possibility of reinfection. We simulate realisations of individual durations from D to S by randomly selecting individuals and using their average weekly infection probability as a constant weekly reinfection probability, leading to geometrically distributed reinfection durations. For naïve individuals we obtain recovery durations (progress from D0 to S1) of 28 (4, 83) weeks, and for non-naïve individuals we obtain recovery durations (progress from D1 to S1) of 1.8 (1, 5) weeks.
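The spread of individual durations implied by a negative binomial distribution with size parameter 2 can be illustrated with a short Python sketch. The parametrisation used here is an assumption: each duration is the sum of two geometric waiting times counted in whole weeks (each at least one week), with the success probability chosen to match a target mean. The function names and the target mean of 37 weeks (the reported mean clearance time for naïve individuals) are used for illustration only, and the sketch ignores posterior uncertainty in the duration parameters.

```python
import math
import random


def sample_geometric(p, rng):
    """Weeks up to and including the first success, support 1, 2, ... (inverse transform)."""
    u = rng.random()
    return int(math.log(1.0 - u) / math.log(1.0 - p)) + 1


def sample_duration(mean_weeks, rng, size=2):
    """Negative binomial duration as the sum of `size` geometric waiting times."""
    if mean_weeks <= size:
        raise ValueError("mean_weeks must exceed size under this parametrisation")
    p = size / mean_weeks  # each stage has mean 1/p, so the total has mean size/p
    return sum(sample_geometric(p, rng) for _ in range(size))


rng = random.Random(2)  # fixed seed so the sketch is repeatable
n = 200_000
durations = sorted(sample_duration(37.0, rng) for _ in range(n))

mean_duration = sum(durations) / n
lower = durations[int(0.025 * n)]  # empirical 2.5% quantile
upper = durations[int(0.975 * n)]  # empirical 97.5% quantile

print(f"mean duration: {mean_duration:.1f} weeks")
print(f"central 95% of simulated durations: {lower} to {upper} weeks")
```

Even with the mean fixed, a size parameter of 2 produces a wide, right-skewed range of individual durations, which is why the intervals for individual clearance times are broad relative to their means.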

Discussion

We have developed a Bayesian inference framework to infer hidden epidemiological states from longitudinal observations and applied this to understanding the natural history of trachoma infection from a unique cohort in West Africa. Our statistical framework combines detailed data on the structure of the population (organised into rooms, compounds and villages), participants’ age, and (imperfect) diagnostic test results to calculate a posterior probability of infection for each individual in each week of the study period. Such a detailed, individual-level analysis provides unique insights into the precise epidemiology of infection in this population.

Our key findings are that the basic reproduction number R0 for C. trachomatis is estimated as 2.2 and 2.4 in the two study villages (J and B respectively), and that the force of infection is disproportionately driven by children ≤9 years of age (Figs 4 and 6). These results are supported by the finding that children 1–9 years old are disproportionately likely to have active trachoma and ocular C. trachomatis, as determined by quantitative PCR, in a study in Ethiopia [31]. There are relatively few estimates of R0 for trachoma, owing to a lack of longitudinal studies, though a previous analysis estimated R0 in children 1–9 years old using serological data from Tanzania [32], finding that R0 ranged from 2.8 to 28 across three types of settings. Our R0 results are consistent with the results from hypoendemic (low) transmission settings (95% confidence interval 1.6–4.0 [32]), though our simulations use the full age distribution within communities, including adults who have lower transmission rates.

The household nature of trachoma transmission is known from epidemiological studies in the 1980s and 1990s [33]; however, we are the first to quantify this effect by estimating the probability of infection as a function of the number of infectious individuals within the same room, compound or village.

We are the first to simulate the distribution of secondary cases, and we find that relatively few primary cases (around 24%) result in zero subsequent infections. This relatively low heterogeneity reflects the long infectious period for trachoma and the household structure (Table 1), whereby multiple people sharing a room leads to transmission events (Table 3).

Our results suggest that transmission‑blocking interventions targeted at young children could substantially reduce trachoma transmission within communities. Current elimination programmes are largely based on the SAFE strategy (surgery for trichiasis, antibiotics, facial cleanliness, and environmental improvement), which encompasses WASH (water, sanitation, and hygiene) [8,34]. Annual mass treatment with azithromycin remains the cornerstone of control efforts, and has proven effective at reducing trachoma infection and the resulting blindness [3,6]. While observational studies have reported associations between household water access or facial cleanliness and lower odds of active trachoma [35,36], randomised trials in Ethiopia have not demonstrated a measurable impact of WASH provision on trachoma prevalence [10,37]. Despite major progress towards elimination, persistent endemic foci remain, most notably in Ethiopia, prompting renewed interest in novel interventions such as vaccines against Chlamydia trachomatis [38]. A candidate antigen, CTH522, is currently in clinical trials [39], and our findings add to the evidence base supporting the prioritisation of transmission‑blocking vaccines targeted at children.

This study also has several important limitations. Transmission events are not observed directly, but are instead inferred from longitudinal diagnostic and clinical observations by fitting a stochastic transmission model. While the availability of repeated measurements alongside the detailed household structure of the population allows us to infer transmission dynamics with greater clarity than cross‑sectional data, we still cannot unambiguously identify who infected whom. Furthermore, we lack complementary data such as pathogen genome sequences that could provide circumstantial evidence of direct transmission between individuals. The transmission parameters therefore represent average effects at the level of shared environments, rather than confirmed person‑to‑person transmission events. Future studies combining dense longitudinal data with pathogen genomic data could help resolve transmission links more precisely, and further validate model‑based inference.

Overall, this study demonstrates how a principled statistical approach applied to detailed longitudinal data can be used to jointly infer hidden infection states and key natural history parameters, yielding new insights into the underlying dynamics of trachoma infection.

Supporting information

S1 Text. Algorithmic description of the Bayesian inference procedure.

Summary of the Markov chain Monte Carlo scheme and hidden infection state sampling (Algorithm S1).

https://doi.org/10.1371/journal.pcbi.1014313.s001

(PDF)

References

  1. Feasey N, Wansbrough-Jones M, Mabey DCW, Solomon AW. Neglected tropical diseases. Br Med Bull. 2010;93:179–200. pmid:20007668
  2. Burton MJ. Trachoma: an overview. Br Med Bull. 2007;84:99–116. pmid:18175788
  3. Solomon AW, Burton MJ, Gower EW, Harding-Esch EM, Oldenburg CE, Taylor HR, et al. Trachoma. Nat Rev Dis Primers. 2022;8(1):32.
  4. World Health Organization. Trachoma. Accessed: 27-03-2026. https://www.who.int/news-room/fact-sheets/detail/trachoma
  5. World Health Organization. Ending the neglect to attain the Sustainable Development Goals: a road map for neglected tropical diseases 2021–2030. World Health Organization; 2021.
  6. Bailey RL, Arullendran P, Whittle HC, Mabey DC. Randomised controlled trial of single-dose azithromycin in treatment of trachoma. Lancet. 1993;342(8869):453–6. pmid:8102427
  7. Godwin W, Prada JM, Emerson P, Hooper PJ, Bakhtiari A, Deiner M, et al. Trachoma Prevalence After Discontinuation of Mass Azithromycin Distribution. J Infect Dis. 2020;221(Suppl 5):S519–24. pmid:32052842
  8. Emerson PM, Lindsay SW, Alexander N, Bah M, Dibba S-M, Faal HB, et al. Role of flies and provision of latrines in trachoma control: cluster-randomised controlled trial. Lancet. 2004;363(9415):1093–8. pmid:15064026
  9. Pinsent A, Hollingsworth TD. Optimising sampling regimes and data collection to inform surveillance for trachoma control. PLoS Negl Trop Dis. 2018;12(10):e0006531. pmid:30307939
  10. Srivathsan A, Abdou A, Al-Khatib T, Apadinuwe S-C, Badiane MD, Bucumi V, et al. District-Level Forecast of Achieving Trachoma Elimination as a Public Health Problem By 2030: An Ensemble Modelling Approach. Clin Infect Dis. 2024;78(Suppl 2):S101–7. pmid:38662700
  11. Lietman TM, Pinsent A, Liu F, Deiner M, Hollingsworth TD, Porco TC. Models of Trachoma Transmission and Their Policy Implications: From Control to Elimination. Clin Infect Dis. 2018;66(suppl_4):S275–80. pmid:29860288
  12. Gambhir M, Basáñez M-G, Burton MJ, Solomon AW, Bailey RL, Holland MJ, et al. The development of an age-structured model for trachoma transmission dynamics, pathogenesis and control. PLoS Negl Trop Dis. 2009;3(6):e462. pmid:19529762
  13. Bailey R, Duong T, Carpenter R, Whittle H, Mabey D. The duration of human ocular Chlamydia trachomatis infection is age dependent. Epidemiol Infect. 1999;123(3):479–86. pmid:10694161
  14. House J, Gaynor B, Taylor H, Lietman TM. The real challenge: can we discover why trachoma is disappearing before it’s gone? Int Ophthalmol Clin. 2007;47(3):63–76.
  15. Bailey RL, Hampton TJ, Hayes LJ, Ward ME, Whittle HC, Mabey DC. Polymerase chain reaction for the detection of ocular chlamydial infection in trachoma-endemic communities. J Infect Dis. 1994;170(3):709–12. pmid:8077735
  16. Grassly NC, Ward ME, Ferris S, Mabey DC, Bailey RL. The natural history of trachoma infection and disease in a Gambian cohort with frequent follow-up. PLoS Negl Trop Dis. 2008;2(12):e341. pmid:19048024
  17. Blake IM, Burton MJ, Bailey RL, Solomon AW, West S, Muñoz B, et al. Estimating household and community transmission of ocular Chlamydia trachomatis. PLoS Negl Trop Dis. 2009;3(3):e401. pmid:19333364
  18. Vasconcelos A, King JD, Nunes-Alves C, Anderson R, Argaw D, Basáñez M-G, et al. Accelerating Progress Towards the 2030 Neglected Tropical Diseases Targets: How Can Quantitative Modeling Support Programmatic Decisions? Clin Infect Dis. 2024;78(Suppl 2):S83–92. pmid:38662692
  19. Pinsent A, Gambhir M. Improving our forecasts for trachoma elimination: What else do we need to know? PLoS Negl Trop Dis. 2017;11(2):e0005378.
  20. Barbour AD. Networks of queues and the method of stages. Adv Appl Probab. 1976;8(3):584–91.
  21. Touloupou P, Finkenstädt B, Spencer SEF. Scalable Bayesian Inference for Coupled Hidden Markov and Semi-Markov Models. J Comput Graph Stat. 2019;29(2):238–49. pmid:32939192
  22. Solomon AW, Peeling RW, Foster A, Mabey DCW. Diagnosis and assessment of trachoma. Clin Microbiol Rev. 2004;17(4):982–1011, table of contents. pmid:15489358
  23. Liu F, Porco TC, Amza A, Kadri B, Nassirou B, West SK, et al. Short-term forecasting of the prevalence of clinical trachoma: utility of including delayed recovery and tests for infection. Parasit Vectors. 2015;8:535. pmid:26489933
  24. Koukounari A, Moustaki I, Grassly NC, Blake IM, Basáñez M-G, Gambhir M, et al. Using a nonparametric multilevel latent Markov model to evaluate diagnostics for trachoma. Am J Epidemiol. 2013;177(9):913–22. pmid:23548755
  25. Keenan JD, See CW, Moncada J, Ayele B, Gebre T, Stoller NE, et al. Diagnostic characteristics of tests for ocular Chlamydia after mass azithromycin distributions. Invest Ophthalmol Vis Sci. 2012;53(1):235–40. pmid:22159017
  26. See CW, Alemayehu W, Melese M, Zhou Z, Porco TC, Shiboski S, et al. How reliable are tests for trachoma?--a latent class approach. Invest Ophthalmol Vis Sci. 2011;52(9):6133–7. pmid:21685340
  27. Gelman A, Roberts GO, Gilks WR. Efficient Metropolis Jumping Rules. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM, editors. Bayesian Statistics 5: Proceedings of the Fifth Valencia International Meeting. Oxford University Press; 1996. p. 599–607.
  28. Sheaffer RL. Size-biased sampling. Technometrics. 1972;14(3):635–44.
  29. Shattock AJ, Gambhir M, Taylor HR, Cowling CS, Kaldor JM, Wilson DP. Control of trachoma in Australia: a model based evaluation of current interventions. PLoS Negl Trop Dis. 2015;9(4):e0003474. pmid:25860143
  30. Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438(7066):355–9. pmid:16292310
  31. Last A, Versteeg B, Shafi Abdurahman O, Robinson A, Dumessa G, Abraham Aga M, et al. Detecting extra-ocular Chlamydia trachomatis in a trachoma-endemic community in Ethiopia: Identifying potential routes of transmission. PLoS Negl Trop Dis. 2020;14(3):e0008120.
  32. Martin DL, Wiegand R, Goodhew B, Lammie P, Black CM, West S, et al. Serological Measures of Trachoma Transmission Intensity. Sci Rep. 2015;5:18532. pmid:26687891
  33. Muñoz B, West S. Trachoma: the forgotten cause of blindness. Epidemiol Rev. 1997;19(2):205–17. pmid:9494783
  34. Last AR, Shafi Abdurahman O, Greenland K, Robinson A, Collin C, Czerniewska A, et al. Cluster randomised controlled trial of double-dose azithromycin mass drug administration, facial cleanliness and fly control measures for trachoma control in Oromia, Ethiopia: the stronger SAFE trial protocol. BMJ Open. 2024;14(12):e084478. pmid:39719287
  35. Stocks ME, Ogden S, Haddad D, Addiss DG, McGuire C, Freeman MC. Effect of water, sanitation, and hygiene on the prevention of trachoma: a systematic review and meta-analysis. PLoS Med. 2014;11(2):e1001605. pmid:24586120
  36. Abebe TA, Tucho GT. The impact of access to water supply and sanitation on the prevalence of active trachoma in Ethiopia: A systematic review and meta-analysis. PLoS Negl Trop Dis. 2021;15(9):e0009644. pmid:34499655
  37. Aragie S, Wittberg DM, Tadesse W, Dagnew A, Hailu D, Chernet A, et al. Water, sanitation, and hygiene for control of trachoma in Ethiopia (WUHA): a two-arm, parallel-group, cluster-randomised trial. Lancet Glob Health. 2022;10(1):e87–95. pmid:34919861
  38. Hu VH, Holland MJ, Burton MJ. Trachoma: protective and pathogenic ocular immune responses to Chlamydia trachomatis. PLoS Negl Trop Dis. 2013;7(2):e2020. pmid:23457650
  39. Pollock KM, Borges ÁH, Cheeseman HM, Rosenkrands I, Schmidt KL, Søndergaard RE, et al. An investigation of trachoma vaccine regimens by the chlamydia vaccine CTH522 administered with cationic liposomes in healthy adults (CHLM-02): a phase 1, double-blind trial. Lancet Infect Dis. 2024;24(8):829–44. pmid:38615673