A Systematic Review and Meta-Analysis of the Performance of Two Point of Care Typhoid Fever Tests, Tubex TF and Typhidot, in Endemic Countries

Background In the absence of well-equipped laboratory infrastructure in many developing countries, the accurate diagnosis of typhoid fever is challenging. Rapid diagnostic tests (RDTs) with good performance indicators would help to improve clinical management of suspected cases. We performed a systematic literature review and meta-analysis to determine the performance of TUBEX TF and Typhidot for the diagnosis of typhoid fever using PRISMA guidelines. Methods Titles and abstracts were reviewed for relevance. Articles were screened for language, reference method and completeness. Studies were categorized according to the control groups used. Meta-analysis was performed only for categories where enough data were available to combine sensitivity and specificity estimates. Sub-analysis was performed for the Typhidot test to determine the influence of indeterminate results on test performance. Results A total of seven studies per test were included. The sensitivity of TUBEX TF ranged between 56% and 95%, and the specificity between 72% and 95%. Meta-analysis showed an average sensitivity of 69% (95% CI: 45–85) and an average specificity of 88% (95% CI: 83–91). A formal meta-analysis for Typhidot was not possible due to the limited data available. Across the extracted studies, sensitivity and specificity estimates ranged from 56% to 84% and 31% to 97% respectively. Conclusion The observed performance does not support the use of either rapid diagnostic test exclusively as the basis for diagnosis and treatment. There is a need to develop an RDT for typhoid fever that has a performance level comparable to malaria RDTs.


Introduction
Salmonella enterica serovar Typhi (Salmonella Typhi), the causative agent of typhoid fever, has been estimated to have caused more than 21,000,000 episodes of typhoid fever at a 1% mortality rate in the year 2000 [1]. The major disease burden lies in developing countries.
Due to the lack of reliable diagnostic tools, the estimated incidence rate may be an underestimate for the African continent, as more recent data indicate [2,3]. Since typhoid fever has a nonspecific clinical picture [4,5], accurate diagnosis remains a challenge in resource-poor settings [6]. Blood culture is the current reference method for diagnosis; however, results are only available after >48 hours, and the procedure is expensive and requires extensive laboratory equipment and technical expertise. Its sensitivity is estimated to be between 40% and 70% [7,8,9,10,11,12]. Culture from bone marrow is known to be more sensitive [8,9,10]; however, its invasive character renders the procedure inappropriate for large-scale application. Rapid diagnostic tests (RDTs) with good performance indicators at a low price are therefore desirable to provide a reliable diagnosis.
Typhidot (Malaysian Biodiagnostic Research, Malaysia) and Tubex TF (IDL, Sweden) are among the most widely used RDTs within the more recently developed diagnostic devices for typhoid fever. A number of other tests are available, such as the Typhidot-M (Malaysian Biodiagnostic Research, Malaysia), the Multi-Test Dip-S-Ticks (Panbio INDX, US), SD Bioline (Standard Diagnostics, Korea) and Mega Salmonella (Mega Diagnostics, US); however, little data on their performance is available [13,14,15,16].
Tubex TF is based on an inhibition reaction between patient antibodies (IgM) and monoclonal antibodies included in the test that bind to a Salmonella Typhi specific O9 lipopolysaccharide. A macroscopically visible de-colorization of patient serum in test reagent solution through magnetic particle separation indicates a positive result. In contrast the Typhidot is based on a qualitative dot-blot enzyme-linked immunosorbent assay that separately detects the presence of IgM and IgG in patient sera against a Salmonella Typhi specific 50 kD outer membrane protein.
Several studies have assessed the performance of either test for the diagnosis of symptomatic patients, but no formal meta-analysis of the available data has been performed to date.
We therefore aimed to analyze the diagnostic performance of Tubex TF (IDL, Sweden) and Typhidot (Malaysian Biodiagnostic Research, Malaysia) for the diagnosis of typhoid fever in patients in typhoid endemic regions.

General
We performed a review and meta-analyses using the PRISMA guidelines [17] for systematic reviews and meta-analyses (Checklist S1).

Search method and inclusion criteria
We performed a literature search in the MEDLINE database through PubMed using ''Tubex'' and ''Typhidot'' as search terms. Searches were restricted to publications from 1998 to date to cover the time since introduction of either test to the market. In addition we conducted supplementary searches in the references of the retrieved articles. Titles and abstracts were reviewed for relevance.
Only articles evaluating the performance of one or both of the tests were included. Articles were excluded based on title, abstract, language other than English, or lack of automated blood culture as the reference method. The last criterion assumed that automated blood culture has a better yield in patients with previous antimicrobial treatment, and was applied to assure standardization across the different studies [18]. Articles were further excluded if the presented data were insufficient and the authors did not reply to our queries. Whenever both automated and manual blood culture had been used as reference methods, only results of the automated blood culture were included. Corresponding authors were contacted via email for additional information whenever necessary. Information provided by the authors was anonymized. If no answer was provided within eight weeks of the first email and two additional follow-up emails (sent without an error report), the respective studies were excluded.

Data retrieval and definitions
The numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) were retrieved from each article by two investigators independently and entered into an Excel datasheet. Discordant findings were assessed jointly, and authors were asked for verification when in doubt. We obtained sensitivity, specificity and accuracy estimates for each included study, considering blood culture as the reference method. To address the poor sensitivity of blood culture [7,8,9,10,11,12] we repeated the analysis applying different control groups whenever possible. Control groups to determine true negatives were defined as follows: category 1 - samples with known etiology other than Salmonella Typhi; category 2 - samples with unknown etiology (blood culture negative); category 3 - categories 1 and 2 combined.
Results for IgM and IgG for the Typhidot were assessed separately. Whenever articles evaluating the Typhidot did not present results for IgG and IgM separately, authors were contacted and asked to provide the respective data. Based on these data the following outcomes were defined: presence of IgM alone = positive (diseased); presence of both IgG and IgM = positive; absence of both IgG and IgM = negative; presence of IgG alone = indeterminate. If information regarding the number of indeterminate results among cases and controls was not provided in the article, the respective numbers were retrieved from the authors.
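The outcome definitions above amount to a small decision rule. A minimal sketch in Python, where the function name `classify_typhidot` and its boolean inputs are illustrative assumptions and not part of the original analysis:

```python
def classify_typhidot(igm_present: bool, igg_present: bool) -> str:
    """Map Typhidot IgM/IgG readings to the outcome categories defined in the text.

    Hypothetical helper for illustration only.
    """
    if igm_present:
        # IgM alone, or IgM together with IgG, is read as positive.
        return "positive"
    if igg_present:
        # IgG without IgM is read as indeterminate.
        return "indeterminate"
    # Neither antibody class detected.
    return "negative"


if __name__ == "__main__":
    for igm in (True, False):
        for igg in (True, False):
            print("IgM:", igm, "IgG:", igg, "->", classify_typhidot(igm, igg))
```

Only the IgG-alone combination yields an indeterminate result, which is why the handling of that single category drives the sub-analysis.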

Risk of bias
The QUADAS checklist [19] was completed for all included studies (Table S1). Given the limited number of studies included, we did not perform a sensitivity analysis excluding lower quality studies. However, a sensitivity analysis for the most likely source of bias, the handling of indeterminate results, was performed as described below.

Tubex TF
For Tubex TF we plotted estimates of the sensitivity and specificity in forest plots as well as in receiver-operating characteristic (ROC) space using RevMan 5 [20] for each category. Meta-analysis was performed only for categories where enough data were available to produce average sensitivity and specificity estimates. Estimates were calculated using logistic regression separately for sensitivity and specificity, correcting for heterogeneity among studies using robust standard errors (generalized estimating equations), an approach similar to random effects meta-analysis [21].
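To make the pooling step concrete, the following is a minimal sketch of an intercept-only logistic model with a cluster-robust (sandwich) standard error, treating each study as one cluster. It is not the authors' code: it assumes an independence working correlation, the function name `pooled_logit_estimate` is hypothetical, and the study counts are invented purely for illustration.

```python
import math


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def pooled_logit_estimate(events, totals, z=1.96):
    """Pooled proportion with a cluster-robust confidence interval.

    events[i] / totals[i] is the per-study proportion, e.g. TP / (TP + FN)
    for sensitivity. With an independence working correlation the point
    estimate equals the pooled proportion; heterogeneity between studies
    only widens the interval. Requires 0 < pooled proportion < 1.
    """
    n = sum(totals)
    p = sum(events) / n
    beta = math.log(p / (1 - p))      # estimate on the logit scale
    bread = n * p * (1 - p)           # model-based information
    # Sum of squared per-study score residuals (the "meat" of the sandwich).
    meat = sum((x - m * p) ** 2 for x, m in zip(events, totals))
    se = math.sqrt(meat) / bread      # robust standard error of beta
    return p, expit(beta - z * se), expit(beta + z * se)


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the included studies.
    true_positives = [45, 30, 60, 20, 55]
    culture_positives = [50, 50, 80, 40, 60]
    estimate, lower, upper = pooled_logit_estimate(true_positives, culture_positives)
    print(f"pooled sensitivity {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Because the interval is built on the logit scale and back-transformed, it is asymmetric around the point estimate, as is the interval reported for the pooled Tubex TF sensitivity. A full analysis would normally use a statistical package with a GEE implementation rather than this hand-rolled version.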

Typhidot
For Typhidot no formal meta-analysis was performed, firstly due to the low number of studies included in each control group and secondly because information on the inclusion or exclusion of indeterminate results could not be retrieved for all studies. For studies where information on the number of indeterminate results was available, sensitivity, specificity and accuracy estimates were calculated using three different approaches. Firstly, we excluded the indeterminate results completely from the analyses. Secondly, we defined the indeterminate results as negative results (TN and FN respectively):

$$\text{Sensitivity} = \frac{TP}{TP + FN + \text{Indeterminate cases}}$$

$$\text{Specificity} = \frac{TN + \text{Indeterminate controls}}{TN + FP + \text{Indeterminate controls}}$$

Thirdly, we added indeterminate results only to the denominator, resulting in a new formula for specificity only but not for sensitivity when compared to the second approach:

$$\text{Specificity} = \frac{TN}{TN + FP + \text{Indeterminate controls}}$$
For studies where information on the number of indeterminate results was not available, the results are presented as given by the respective authors. 95% confidence intervals were calculated according to Wilson's score method, and the difference between accuracy estimates was assessed using the chi-squared test, considering p<0.05 as significant.
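The three ways of handling indeterminate results, together with the Wilson score interval, can be sketched as follows. The function names and the example counts are assumptions made for illustration; none of the numbers come from the included studies.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def typhidot_estimates(tp, fn, tn, fp, ind_cases, ind_controls):
    """Return (sensitivity, specificity) under each of the three approaches."""
    return {
        # 1) Indeterminate results dropped entirely.
        "excluded": (tp / (tp + fn), tn / (tn + fp)),
        # 2) Indeterminate results counted as negative (FN among cases, TN among controls).
        "as_negative": (
            tp / (tp + fn + ind_cases),
            (tn + ind_controls) / (tn + fp + ind_controls),
        ),
        # 3) Indeterminate results added to the denominator only.
        "denominator_only": (
            tp / (tp + fn + ind_cases),
            tn / (tn + fp + ind_controls),
        ),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    results = typhidot_estimates(tp=40, fn=10, tn=60, fp=20, ind_cases=10, ind_controls=20)
    for approach, (sens, spec) in results.items():
        print(f"{approach}: sensitivity {sens:.2f}, specificity {spec:.2f}")
    print("Wilson 95% CI for 40/50:", wilson_interval(40, 50))
```

Sensitivity is identical under the second and third approaches, while specificity is highest when indeterminate controls are counted as true negatives and lowest when they enter the denominator only. The chi-squared comparison of accuracy estimates is not reproduced here.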

Tubex TF
A total of seven (30.4%) studies evaluating Tubex TF with different control groups were included in the analyses. One of the studies used two different control groups as comparison and was therefore included in two different categories [35] with the respective results. A total of five studies using febrile controls with known etiology [13,29,31,35,36] were therefore included in category 1, two studies using controls with unknown febrile diseases [15,35] were included in category 2 and one study that used controls with known and unknown etiology [37] was included in category 3. Characteristics of the studies are summarized in Table 1.
Meta-analysis of the data in category 1 showed an average sensitivity of 69% (95% CI: 45–85) and a specificity of 88% (95% CI: 83–91) (Figure 3). No meta-analysis was performed for the other categories due to the low number of studies included.

Typhidot
For the evaluation of the Typhidot a total of seven (29.2%) studies were included in the analyses. Two studies used two different control groups and were therefore included in two categories with the respective results [14,35]. Therefore a total of three studies could be included in category 1 [13,14,35], four studies in category 2 [14,15,35,38] and two studies in category 3 [37,39]. Additional characteristics of the included studies are shown in Table 2.
The number of indeterminate results (presence of IgG alone) obtained when using Typhidot varied greatly among studies. Kawano et al. [15] reported 55 indeterminate results each among cases and controls (out of 366 total results). Fadeel et al. [35] reported five indeterminate samples among cases and one among controls (out of a total of 140 and 210 results depending on the control group), Olsen et al. [13] reported six indeterminate results, three among cases and three among controls (out of a total of 77 results), and Keddy et al. [37] reported no indeterminate results (out of 80 results).
Depending on how indeterminate results are classified, sensitivity and specificity can vary. The highest number of indeterminate results for the Typhidot was reported by Kawano et al. [15], with a total of 30% of all results being indeterminate. Accordingly, sensitivity of the test was 82% when indeterminate results were excluded, 56% when they were considered negative and 56% when they were included in the denominator only. Specificity was 44%, 60% and 31% respectively (p<0.05 for accuracy).
For studies with smaller numbers of indeterminate results no significant differences in accuracy were found; sensitivity varied between 63% and 84% and specificity between 74% and 97%, depending on control group and definition of indeterminate results (Table 3). Results from studies where no information on indeterminate results was available are listed in Table 4.

Discussion
Our meta-analysis for Tubex TF showed an average sensitivity of 69% and a specificity of 88%. Even though no meta-analysis was performed for the Typhidot, sensitivity varied between 46% and 79% and specificity between 31% and 96% when including indeterminate results in the denominator only and across all three control groups. The number of indeterminate results varied between 0% [37] and 30% [15] of the entire study population. However, we found that apart from the study conducted by Kawano et al. [15] the number of indeterminate results was low and did not significantly affect test accuracy (p>0.05) (Table 3). This study only considered sensitivity, specificity, and accuracy for analysis but not predictive values. Predictive values are much more heavily affected by the prevalence of disease within the study population than sensitivity and specificity, making it difficult to compare predictive values of different studies.
Malaria and typhoid fever may be considered among the most important non-viral infectious diseases in developing countries. For malaria a plethora of RDTs is available, and current WHO recommendations for the use of those RDTs as an exclusive method of diagnosis postulate a specificity >90% in order to be used on a wider scale [40]. While the average performance of the Tubex TF does not qualify according to these criteria, a few individual studies for Tubex TF [29,31] and Typhidot [35,39] report performance above the given threshold.
Since typhoid fever is a potentially fatal disease that is easily treatable with affordable antibiotics, its treatment threshold is very low. Moreover, no clinical signs with sufficient predictive value are available, and consequently in most situations the disease is treated presumptively. In order for a typhoid RDT to be superior to presumptive treatment, such a test would require a high sensitivity in order not to miss possibly fatal cases. On the other hand, even a moderate specificity would avoid many of the false positives inherent to the presumptive strategy, which lead to unnecessary antibiotic overuse and, in turn, to resistance on a population scale. The question remains whether RDTs based on antibodies are sufficiently sensitive at early presentation. Malaria tests are based on antigen detection, an approach that yields positive results earlier after infection, as the result is not delayed by a host immunological response.
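The dependence of predictive values on prevalence follows directly from Bayes' rule. The sketch below applies it to the pooled Tubex TF estimates (sensitivity 69%, specificity 88%); the prevalence values are hypothetical and chosen only to show the trend.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # Prevalence values are assumptions for illustration, not study data.
    for prevalence in (0.05, 0.20, 0.50):
        ppv, npv = predictive_values(0.69, 0.88, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is why predictive values from studies with different case mixes are not directly comparable.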
Parry et al. [41] suggest testing paired samples to improve performance of the RDTs. Assuming that false positive results occur on an independent basis, this will increase specificity. Likewise, if samples are taken at a timely interval this is likely to improve sensitivity due to higher antibody titers within the course of the disease. The latter approach might be useful for epidemiological purposes but its value in a clinical setting is limited.
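The specificity argument can be illustrated with a short calculation. The sketch assumes a decision rule in which a patient is called positive only if both samples test positive, and that false positives on the two samples are independent; both the rule and the function name `paired_specificity` are assumptions made here, not details given by Parry et al. The effect on sensitivity depends on antibody kinetics and is not modelled.

```python
def paired_specificity(single_specificity):
    """Specificity when a positive call requires two independent positive results."""
    false_positive_rate = 1 - single_specificity
    # Both samples must be falsely positive for the paired result to be falsely positive.
    return 1 - false_positive_rate ** 2


if __name__ == "__main__":
    # 0.88 is the pooled Tubex TF specificity reported in this review.
    print(f"paired specificity: {paired_specificity(0.88):.4f}")
```

Under these assumptions a single-test specificity of 88% would correspond to a paired specificity of about 98.6%; correlated false positives, for example from persisting antibodies after an earlier infection, would reduce this gain.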
The major limitation of the presented data is the small number of study results available. While sufficient publications were retrieved to calculate performance indicators for the Tubex TF test, this was not possible for Typhidot. Different methods in defining and including controls have made it difficult to standardize earlier collected data and have further reduced the number of data that we could compare directly.
The unknown sensitivity of blood culture is likely to have affected the analyzed results. We excluded all studies where manual blood culture was used as a reference method, assuming that automated blood culture has a higher yield in patients with previous antibiotic treatment and to assure some standardization of the workflow across the different studies included [18]. However also automated blood culture results are dependent on skills and knowledge of the performing laboratory staff as well as the condition of local laboratory equipment and consumables. Moreover choosing the most appropriate control group for an RDT evaluation remains a challenge when blood culture is the reference method. Including blood culture negative patients in the control group bears the risk of including undetected Salmonella Typhi cases due to poor sensitivity of blood culture among the controls affecting the specificity of the evaluated test. On the other hand including only febrile cases with a confirmed laboratory diagnosis other than typhoid fever results in an unrealistic control group.
Additional limitations in the longitudinal test evaluation are inter-batch variation as well as minor test modifications by the manufacturers that do not lead to changes in the brand name and are not made public [42,43]. Indeed, the Tubex TF test had been modified within the evaluated time period without changes to the product name (IDL personal communication). The study by Olsen et al. [13] had evaluated the former version of the test (IDL personal communication); however, when repeating the analysis and excluding the respective publication, we found similar results for average sensitivity and specificity (data not shown).
In light of the poor sensitivity of current blood culture procedures, their high costs, and the considerable expertise and long time to diagnosis they require, the demand for a reliable RDT in clinical settings remains high. Apart from good performance indicators, such a test would require good operational characteristics as well as moderate pricing comparable to currently used malaria RDTs. In addition, a diagnostic device to detect Salmonella carriers would be a powerful tool to estimate the true disease burden and the potential for transmission [41].

Conclusion
The performance of Typhidot and TUBEX TF does not support the use of either rapid diagnostic test exclusively as a basis for diagnosis and treatment. Although more time consuming and associated with higher expenses and logistics, blood culture and molecular biologic techniques remain the reference methods of choice, despite their limitations. There is a need to develop an RDT for typhoid fever that has a performance level comparable to malaria RDTs.