Estimating the proportion of clinically suspected cholera cases that are true Vibrio cholerae infections: A systematic review and meta-analysis

Background Cholera surveillance relies on clinical diagnosis of acute watery diarrhea. Suspected cholera case definitions have high sensitivity but low specificity, challenging our ability to characterize cholera burden and epidemiology. Our objective was to estimate the proportion of clinically suspected cholera that are true Vibrio cholerae infections and identify factors that explain variation in positivity. Methods and findings We conducted a systematic review of studies that tested ≥10 suspected cholera cases for V. cholerae O1/O139 using culture, PCR, and/or a rapid diagnostic test. We searched PubMed, Embase, Scopus, and Google Scholar for studies that sampled at least one suspected case between January 1, 2000 and April 19, 2023, to reflect contemporary patterns in V. cholerae positivity. We estimated diagnostic test sensitivity and specificity using a latent class meta-analysis. We estimated V. cholerae positivity using a random-effects meta-analysis, adjusting for test performance. We included 119 studies from 30 countries. V. cholerae positivity was lower in studies with representative sampling and in studies that set minimum ages in suspected case definitions. After adjusting for test performance, on average, 52% (95% credible interval (CrI): 24%, 80%) of suspected cases represented true V. cholerae infections. After adjusting for test performance and study methodology, the odds of a suspected case having a true infection were 5.71 (odds ratio 95% CrI: 1.53, 15.43) times higher when surveillance was initiated in response to an outbreak than in non-outbreak settings. Variation across studies was high, and a limitation of our approach was that we were unable to explain all the heterogeneity with study-level attributes, including diagnostic test used, setting, and case definitions. Conclusions In this study, we found that burden estimates based on suspected cases alone may overestimate the incidence of medically attended cholera by 2-fold. 
However, accounting for cases missed by traditional clinical surveillance is key to unbiased cholera burden estimates. Given the substantial variability in positivity between settings, extrapolations from suspected to confirmed cases, which are necessary to estimate cholera incidence rates without exhaustive testing, should be based on local data.


Updated search
We updated the search on April 19, 2023 to include additional studies that had been published or indexed since our original search. We used the same search terms as above, with the added restriction that records were entered into the database (PubMed and Embase) or published (Scopus and medRxiv) after October 16, 2021 [1]. These additions are highlighted in red below. We did not update the search for Google Scholar because we were not able to restrict the search by the date studies were entered or indexed in the database. We searched "cholera", restricted the posting date to after October 16, 2021, and reviewed all 146 results; 4 were exported to EndNote for possible inclusion.

PubMed
In total, 944 new records were identified in this updated search. After upload to Covidence, 157 duplicates were removed, leaving 787 records for screening.
Note on medRxiv and pre-prints
Although we originally included pre-prints in our screening, our final analyses excluded pre-prints that had not been peer-reviewed by the time of the updated search or whose published version no longer contained cholera positivity data (as was the case for the one pre-print study that was initially included).

Statistical model
To estimate the proportion of suspected cases that are true V. cholerae infections, we performed a hierarchical meta-analysis using CmdStanR version 0.5.2 as an interface to Stan for R [2,3].
In this model, $p^{obs}_{i,j}$, the probability of observing a positive test result by test $j$ for observation $i$, is a function of the true percent positive $p_i$ of observation $i$ and the sensitivity $\theta^{+}_{j}$ and specificity $\theta^{-}_{j}$ of test $j$:

$$y_{i,j} \sim \mathrm{Binomial}\left(n_{i,j},\; p^{obs}_{i,j}\right)$$

$$p^{obs}_{i,j} = \theta^{+}_{j}\, p_i + \left(1 - \theta^{-}_{j}\right)\left(1 - p_i\right)$$

where $n_{i,j}$ is the number of suspected cases tested and $y_{i,j}$ is the number that tested positive. The true percent positive $p_i$ for each observation was modeled as a function of covariates $X_i$ (i.e., sampling strategy for testing, age constraint in the case definition, and whether or not surveillance was initiated in response to an outbreak) and a random effect by observation, $\eta_i$:

$$\mathrm{logit}(p_i) = \alpha + X_i \beta + \sigma\, \eta_i$$

We used a Normal(0,2) prior on the coefficients $\beta$ and the global intercept $\alpha$ and a standard normal prior on the random effects $\eta_i$. In a sensitivity analysis, we used a Normal(0.9,2) prior on the global intercept $\alpha$.
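As a rough illustration of how the pieces of this model fit together, the sketch below evaluates the adjustment for test performance for a single observation: a linear predictor on the logit scale gives the true positivity, sensitivity and specificity convert it into the probability of a positive test, and a binomial log-likelihood scores the observed counts. This is a minimal Python sketch, not the Stan model used in the analysis; the function names, the scale parameter `sigma`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def inv_logit(x: float) -> float:
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def true_positivity(alpha, beta, x, sigma, eta):
    """True positivity from intercept, covariate effects, and a scaled random effect."""
    linear = alpha + sum(b * xi for b, xi in zip(beta, x)) + sigma * eta
    return inv_logit(linear)


def observed_positivity(p_true, sens, spec):
    """Probability of a positive test: true positives plus false positives."""
    return sens * p_true + (1.0 - spec) * (1.0 - p_true)


def binomial_log_lik(y, n, p):
    """Log-likelihood of y positives among n suspected cases tested."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(p) + (n - y) * math.log(1.0 - p)


# Hypothetical inputs: one binary covariate (outbreak surveillance = 1),
# an illustrative test with 80% sensitivity and 95% specificity.
p_true = true_positivity(alpha=0.0, beta=[1.0], x=[1.0], sigma=1.0, eta=0.0)
p_obs = observed_positivity(p_true, sens=0.80, spec=0.95)
log_lik = binomial_log_lik(y=55, n=100, p=p_obs)
print(f"true positivity={p_true:.3f}, observed positivity={p_obs:.3f}, log-lik={log_lik:.3f}")
```

With imperfect sensitivity, the probability of a positive test falls below the true positivity when positivity is high, while imperfect specificity keeps it above zero even when no suspected case is truly infected.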
In each iteration of this model, we used a randomly selected draw from the posterior distribution of sensitivity and specificity in the JAGS model output (described in main text) to incorporate uncertainty in test performance. For all models implemented with Stan, we ran 4 chains, each with 3,000 total iterations (1,000 warm-up iterations followed by 2,000 sampling iterations). Convergence of all models was assessed by R-hat values and visual inspection of traceplots.

Estimation of V. cholerae positivity overall and by study methodology
To estimate the proportion of suspected cases that represented true V. cholerae infections across all studies and for specific strata, $s$, we marginalized over the study-level random effects, following similar methods to [4], as follows:

$$p_s = \int_0^1 \mathrm{logit}^{-1}\left(\alpha + X_s \beta + \sigma\, \Phi^{-1}(u)\right) du$$

where $\alpha$ is the global intercept, $X_s \beta$ is the covariate contribution for stratum $s$, $\Phi^{-1}(\cdot)$ is the normal distribution quantile function, and $\sigma^2$ is the variance of the random effect distribution.
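Marginalizing over a normally distributed random effect on the logit scale has no closed form, so it is typically approximated numerically. The sketch below is one possible way to do this in Python, assuming the random effect enters the linear predictor as a standard normal deviate multiplied by a scale `sigma`: it averages the inverse-logit of the linear predictor over evenly spaced quantiles of the standard normal distribution (a midpoint rule on the unit interval). The function name, grid size, and input values are hypothetical; in a Bayesian workflow this calculation would be repeated for each posterior draw of the intercept, coefficients, and scale.

```python
import math
from statistics import NormalDist


def inv_logit(x: float) -> float:
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_positivity(mu: float, sigma: float, n_grid: int = 2000) -> float:
    """Average inv_logit(mu + sigma * z) over the standard normal distribution.

    mu is the fixed part of the linear predictor for a stratum (intercept plus
    covariate effects); sigma is the random-effect standard deviation.
    """
    std_normal = NormalDist()
    total = 0.0
    for k in range(n_grid):
        u = (k + 0.5) / n_grid          # midpoints avoid u = 0 and u = 1
        z = std_normal.inv_cdf(u)       # normal quantile function
        total += inv_logit(mu + sigma * z)
    return total / n_grid


# Hypothetical stratum: linear predictor of 1.0 on the logit scale.
conditional = inv_logit(1.0)
marginal = marginal_positivity(mu=1.0, sigma=2.0)
print(f"conditional on a zero random effect: {conditional:.3f}; marginal: {marginal:.3f}")
```

Because the inverse-logit is nonlinear, the marginal estimate is pulled toward 50% relative to the value obtained by setting the random effect to zero, and the pull grows with the random-effect variance.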

Figures Fig A. Data coverage by geography
Number of observations in the primary dataset at each administrative level as defined by GADM (https://gadm.org/) by country. There is more than one observation per study when the study reported data for more than one sampling method, surveillance type, and/or country. Countries with >10 observations are displayed as 10. Haiti includes 1 observation at the national level, 2 at the first administrative division, and 4 at the second.

Fig C. Vibrio cholerae positivity by incidence and suspected case characteristics
Relationship between reported V. cholerae positivity and A) proportion of suspected cholera cases tested that were under 5 years of age, B) suspected cholera incidence rate per 10,000 at each study site in Africa, C) proportion of suspected cases severely dehydrated, and D) proportion of suspected cases on antibiotics prior to testing. Size of the points is proportional to the number of cases tested, and shapes indicate which diagnostic test was used to confirm V. cholerae infection. Confidence intervals for Spearman rank correlation coefficients were estimated using bootstrapping (nrep=1000). Smoothing method is loess (without weights). In B), estimated suspected cholera incidence rates in Africa for 2010-2016 [5] were aggregated to the administrative division that best represented each study's catchment area by dividing the total estimated cholera cases in each area by its estimated population. RDT = Rapid Diagnostic Test; PCR = Polymerase Chain Reaction.

Fig D. Posterior distributions of Vibrio cholerae positivity
Posterior distributions of V. cholerae positivity estimated using the random-effects model, corresponding to results in Table D. A) Unadjusted. B) Adjusted for test performance. C) Adjusted for test performance, sensitivity analysis shifting the prior on alpha from Normal(0,2) as in A-B to Normal(0.9,2). On the left, the prior (green) and posterior distribution (purple) of the global intercept are shown in logit space. On the right, histograms of estimated V. cholerae positivity.


Fig B. Data coverage over time
Number of observations in the primary dataset at each administrative level as defined by GADM (https://gadm.org/) by country within different time periods. There is more than one observation per study when the study reported data for more than one sampling method, surveillance type, and/or country. Countries with >10 observations are displayed as 10. Year represents the year sampling was completed. Excludes 9 studies missing a study end date. Haiti includes 3 studies during 2010-2014 and 4 during 2015-2022.

Table B. Characteristics of suspected cholera cases reported in each observation


Table A. Priors used in the latent class meta-analysis to estimate sensitivity and specificity of each diagnostic test

Table C. Estimated sensitivity and specificity of each diagnostic test
Estimates are the median sensitivity and specificity pooled across four studies that reported results for all diagnostic tests, as described in Methods. Parentheses show 95% credible intervals.

Table D. Estimated underlying Vibrio cholerae positivity
"Unadjusted" is mean V. cholerae positivity (95% credible interval) from the random-effects meta-analysis without adjustments for test performance. "Adjusted" refers to V. cholerae positivity estimates additionally adjusted for test performance, where the primary analysis includes a Normal(0,2) prior on the global intercept and a sensitivity analysis includes the prior shifted to Normal(0.9,2). "Stratified estimate for high quality…" corresponds to post-stratified estimates of V. cholerae positivity for studies that used high-quality sampling methods, by whether an age minimum was set in the suspected case definition and whether surveillance was initiated in response to an outbreak.

Table E. Odds of Vibrio cholerae positivity by age and outbreak context
Odds that a suspected cholera case seeking testing or care has a true V. cholerae O1/O139 infection with low sampling quality compared to high sampling quality, with a minimum age set in the case definition compared to no minimum age, and with surveillance initiated in response to an outbreak compared to non-outbreak surveillance (i.e., routine or post-vaccination surveillance).