Scrub typhus point-of-care testing: A systematic review and meta-analysis

Background
Diagnosing scrub typhus clinically is difficult, hence laboratory tests play a very important role in diagnosis. As performing sophisticated laboratory tests in resource-limited settings is not feasible, accurate point-of-care testing (POCT) for scrub typhus diagnosis would be invaluable for patient diagnosis and management. Here we summarise the existing evidence on the accuracy of scrub typhus POCTs to inform clinical practitioners in resource-limited settings of their diagnostic value.

Methodology/principal findings
Studies on POCTs which can be feasibly deployed in primary health care or outpatient settings were included. Thirty-one studies were identified through PubMed and manual searches of reference lists. The quality of the studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2). About half (n = 14/31) of the included studies were of moderate quality. Meta-analysis showed the pooled sensitivity and specificity of commercially available immunochromatographic tests (ICTs) were 66.0% (95% CI 37.0–86.0%) and 92.0% (95% CI 83.0–97.0%), respectively. There was a significant and high degree of heterogeneity between the studies (I² = 97.48%, 95% CI 96.71–98.24 for sensitivity and I² = 98.17%, 95% CI 97.67–98.67 for specificity). Significant heterogeneity was observed for total number of samples between studies (p = 0.01), study design (whether using case-control design or not, p = 0.01), blinding during index test interpretation (p = 0.02), and QUADAS-2 score (p = 0.01).

Conclusions/significance
There was significant heterogeneity between the scrub typhus POCT diagnostic accuracy studies examined. Overall, the commercially available scrub typhus ICTs demonstrated better performance when 'ruling in' the diagnosis. There is a need for standardised methods and reporting of diagnostic accuracy to decrease between-study heterogeneity and increase comparability among study results, as well as development of an affordable and accurate antigen-based POCT to tackle the inherent weaknesses associated with serological testing.


Introduction
Scrub typhus is a febrile illness caused by the obligate intracellular bacterium Orientia tsutsugamushi. It is transmitted by the bite of infected larvae of a number of trombiculid mite species known to be prevalent in Asia, the Pacific Rim islands, pockets in the north of Australia, and some areas of Chile [1-4]. In 2010, a novel species from the same genus, Orientia chuto sp. nov., was identified in an acutely febrile patient infected in Dubai [5]. Scrub typhus responds to certain antibiotics (i.e. doxycycline, tetracycline, azithromycin, chloramphenicol), but if left untreated, the mortality rate may reach 70% [6]. One estimate, based on scant data, is that one billion people are at risk of this disease, with one million clinical cases annually in Southeast Asia alone [1]. Although the exact prevalence of scrub typhus is not known, several studies have shown that the disease burden in rural Asia is high, accounting in some areas for over 20.0% of febrile illness admissions to hospital [7,8].
Infected patients usually present with acute fever; lymphadenopathy (regional or generalised) and sensorineural hearing loss may occur, neither of which is sensitive or specific enough for establishing diagnosis [1]. With few distinguishing clinical characteristics, scrub typhus is difficult to differentiate from other tropical febrile illnesses, such as dengue, typhoid fever, leptospirosis, and murine typhus [1,9]. The presence of the pathognomonic eschar, the painless black crust at the site of mite inoculation, can help in establishing clinical diagnosis due to its high specificity (98.9%); however, its presence in patients varies widely (7.0%-97.0%) [1,10-12].
Therefore, the role of laboratory tests in establishing diagnosis in scrub typhus cases is very important. Laboratory tests for scrub typhus often have limited diagnostic accuracy and are generally in limited supply in resource-limited or outpatient settings [9]. Failure in diagnosing scrub typhus may result in prolonged illness, complications including pneumonitis, acute respiratory distress syndrome, renal failure, meningoencephalitis, and unnecessary treatment with inappropriate antibiotics [1,9,13].
Serology remains the mainstay of diagnosis. The immunofluorescent assay (IFA) and indirect immunoperoxidase test (IIP) are considered imperfect gold standards, in view of their limitations, which include high expense, requirement for substantial training to perform, inter-operator variability in result interpretation, and the often-retrospective nature of diagnosis that does not help in directing treatment [1,9,12,14,15]. Another antibody detection method, the enzyme-linked immunosorbent assay (ELISA), has been developed and shown to have both sensitivity and specificity of greater than 90.0%; however, this is highly dependent on endemicity and the application of a previously investigated and geographically-based cut-off [1,16]. Besides antibody-based diagnostics, molecular detection methods, including the polymerase chain reaction (PCR) to detect various gene targets (e.g. 47 kDa, 56 kDa, groEL, 16S rRNA genes), have also been developed; however, they have limitations in terms of diagnostic sensitivity due to the limited period of rickettsaemia [1,9,12,14]. PCR is still deemed impractical in resource-limited endemic areas because it requires considerable training and expense [1,9,12,14]. The bacteria can be isolated through in vitro and in vivo cultivation methods, such as cell culture and mouse inoculation, respectively [1,9]. These methods need considerable training and biosafety level 3 (BSL 3) laboratory containment facilities for large-scale propagation, and usually take several weeks, which contributes to the retrospective nature of the diagnosis [1,9]. Therefore, there is clearly a need for affordable point-of-care testing (POCT) for scrub typhus diagnosis in endemic settings with resource constraints. There are varied definitions of POCT, but fundamentally POCT should provide quick results to inform patient management and be convenient enough to be performed close to the patient (i.e. not in a central laboratory) [17,18].
Immunochromatographic tests (ICTs), dot-blot, and loop-mediated isothermal amplification (LAMP) assays all have the principal qualities of POCT. As serology-based tests, ICTs and dot-blot tests share the inherent problems of IFA (e.g. the retrospective nature of diagnosis in cases where diagnosis relies on a convalescent sample, delicate cut-off setting), while offering more simplicity and speed [1,9]. LAMP is an alternative technique which involves amplification and detection of bacterial DNA. Although similar in principle to conventional PCR assays, the LAMP assay does not require intricate DNA extraction, a thermocycler, or special equipment to read the result [9,19,20].
This study aims to summarise the existing evidence on the accuracy of scrub typhus POCTs to inform clinical practitioners of their diagnostic value when providing care in resource-limited settings where scrub typhus is endemic.

Eligibility criteria
This review included articles on POCTs that would be feasible in primary health care provider or outpatient settings. Only articles published in English were included. To ensure feasibility in resource-limited settings, studies evaluating methods which were inherently more complicated, requiring relatively high levels of expertise and/or specialised equipment were excluded. Articles on POCTs not performed on human samples were excluded. Studies on the Weil-Felix test were excluded due to its established poor diagnostic accuracy and the lengthy time required to perform [9,21]. Meta-analysis and meta-regression were performed on studies of commercially available POCTs with an extractable diagnostic accuracy 2 by 2 table.

Search strategy
After a preliminary search, ICT, dot-blot, and LAMP were searched for specifically. The search was conducted on articles cited in PubMed up to 2 February 2017, combining the search terms 'scrub typhus', 'immunochromatography', 'dot blot immunoassay', and 'loop mediated isothermal amplification' without any other restrictions (i.e., "scrub typhus" AND (rapid diagnosis OR immunochromatograph* OR dot blot immunoassay OR loop mediated isothermal amplification)). The titles and abstracts were screened and the full text of relevant articles was reviewed. Manual screening of the reference lists of relevant articles was also performed.

Quality assessment
The quality of the studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) [22]. QUADAS-2 was used as a quality scoring system to determine the risk of bias and the applicability of each paper [22]. It evaluates four main areas: 'patient selection', 'index test', 'reference standard', and 'flow and timing' [22]. These are assessed by using seven 'signalling questions' (e.g., "was a case-control design avoided?") with 'yes', 'no', and 'unclear' answers [22]. The answers to these 'signalling questions' were then used to judge whether the risk of bias was low and whether there was low concern for the applicability of the research [22]. Where the response to a risk of bias or applicability question was 'low risk' or 'low concern', the article was given one point. The articles were grouped based on their score into high (6-7 points), moderate (4-5 points), and low (0-3 points) quality categories.
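The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' procedure: it assumes the seven scored items are the four risk-of-bias and three applicability judgements of QUADAS-2, and the key names and function names are hypothetical labels introduced here.

```python
from typing import Mapping

# Assumed scored items: four risk-of-bias judgements and three applicability
# judgements. The key names are hypothetical labels, not taken from the paper.
JUDGEMENTS = (
    "bias_patient_selection",
    "bias_index_test",
    "bias_reference_standard",
    "bias_flow_and_timing",
    "applicability_patient_selection",
    "applicability_index_test",
    "applicability_reference_standard",
)


def quadas2_score(judgements: Mapping[str, str]) -> int:
    """Give one point per item judged 'low' (low risk or low concern).

    Missing items are treated as 'unclear' and therefore score zero.
    """
    return sum(
        1 for key in JUDGEMENTS
        if judgements.get(key, "unclear").strip().lower() == "low"
    )


def quality_category(score: int) -> str:
    """Map a 0-7 score to the categories used in the text."""
    if not 0 <= score <= len(JUDGEMENTS):
        raise ValueError("score must be between 0 and 7")
    if score >= 6:
        return "high"      # 6-7 points
    if score >= 4:
        return "moderate"  # 4-5 points
    return "low"           # 0-3 points


if __name__ == "__main__":
    # Hypothetical study: two items not rated 'low'.
    example = {key: "low" for key in JUDGEMENTS}
    example["bias_index_test"] = "unclear"
    example["bias_reference_standard"] = "high"
    score = quadas2_score(example)
    print(score, quality_category(score))
```

Treating a missing or 'unclear' judgement as zero points mirrors the conservative reading of the rule; a different convention would shift some studies between categories.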

Data extraction
Data were extracted primarily by one author (KS) and, where the results were unclear, a second author (SB) was consulted. The data were recorded on a form developed through an iterative process to ensure that all the required data could be collected for future reference. The parameters extracted include: citation information, methodology (i.e., study design, participant characteristics, index and reference test details), and the diagnostic accuracy results (including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and, if available, the numbers required to construct a 2 by 2 contingency table).
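For readers less familiar with how the extracted accuracy measures relate to a 2 by 2 contingency table, the standard definitions can be sketched as follows. The class name, field names, and the example counts are hypothetical and are not drawn from any included study.

```python
import math
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-classified against the reference standard."""
    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    fn: int  # index test negative, reference positive
    tn: int  # index test negative, reference negative

    def sensitivity(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    def specificity(self) -> float:
        return _ratio(self.tn, self.tn + self.fp)

    def ppv(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    def npv(self) -> float:
        return _ratio(self.tn, self.tn + self.fn)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    table = TwoByTwo(tp=45, fp=5, fn=15, tn=95)
    print(f"sensitivity={table.sensitivity():.3f}")
    print(f"specificity={table.specificity():.3f}")
    print(f"PPV={table.ppv():.3f}  NPV={table.npv():.3f}")
```

Note that PPV and NPV depend on the proportion of true cases in the sample, so values from case-control studies do not transfer directly to clinical populations.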

Statistical data analysis and reporting
The extracted data were compiled into summary tables and analysed through narrative synthesis. Meta-analysis and meta-regression were performed on commercially available POCT diagnostic accuracy data, excluding studies of low quality (i.e. QUADAS-2 score of 3 or less). Tests in the development or prototype stage were not included in the meta-analysis and meta-regression, but were included in the narrative synthesis. If one study yielded more than one 2 by 2 table, each table was extracted as separate data. However, if one study used more than one reference test cut-off titre, only data using one cut-off value above 1:3,200 were used to ensure accuracy [12]. In performing the meta-regression, relevant signalling questions with 'unclear' as the answer were entered as 'no' to turn these into dichotomous variables. Statistical analysis was done with STATA/IC 14.0 (College Station, TX) using the MIDAS and METANDI commands. In the meta-analyses, heterogeneity was assessed using the Chi-square statistic, with higher values of the Chi-square (and hence lower p-values) being consistent with heterogeneity. In the summary statistics of the resulting forest plot, overall sensitivity and specificity were estimated and reported alongside the 95% confidence interval. In addition, study-specific estimates were provided in the same plot to visualise how the estimates from each of the studies deviate from the overall estimate. Most of the results were presented graphically. The data were analysed, summarised, and presented following the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement as much as possible [23]. This review was registered in the International Prospective Register for Systematic Review (PROSPERO) with registration number CRD42017056727.
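To illustrate how a Chi-square heterogeneity statistic relates to an I² value, the following sketch computes Cochran's Q and the commonly used I² = max(0, (Q - df)/Q) × 100 from per-study estimates on the logit scale. It is a simplified univariate, inverse-variance illustration; the MIDAS and METANDI commands used in the review may compute these quantities differently, and all function names here are hypothetical.

```python
import math
from typing import Sequence, Tuple


def logit_and_variance(events: int, total: int) -> Tuple[float, float]:
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is applied to both cells so that
    proportions of exactly 0 or 1 remain finite.
    """
    a = events + 0.5
    b = (total - events) + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def cochran_q(estimates: Sequence[float], variances: Sequence[float]) -> float:
    """Weighted sum of squared deviations from the inverse-variance pooled estimate."""
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need equal-length, non-empty inputs")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))


def i_squared(q: float, n_studies: int) -> float:
    """Percentage of total variation attributed to between-study heterogeneity."""
    df = n_studies - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Hypothetical (true positives, reference positives) pairs, not study data.
    studies = [(30, 100), (85, 100), (60, 90)]
    pairs = [logit_and_variance(e, n) for e, n in studies]
    q = cochran_q([y for y, _ in pairs], [v for _, v in pairs])
    print(f"Q={q:.2f}, I2={i_squared(q, len(studies)):.1f}%")
```

Under this definition, I² values close to 100% mean that almost all of the observed variation between study estimates exceeds what sampling error alone would be expected to produce.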

Search results
There were 133 articles in total identified through database searching and reference list screening (Fig 1). After title and abstract screening and full text review, we included 31 relevant articles. There were six articles excluded after full text review. Since this study only focused on human diagnostics, one article using rabbit sera in its negative sera panel was excluded. One study involved DNA extraction, which is not applicable as a POCT in resource-limited settings. Four other papers were excluded due to language (not in English, n = 2) and study design (not experimental diagnostic accuracy studies, n = 2). There were 20 articles on ICTs, eight articles on dot-blot assays, four articles on LAMP assays, and one each on a passive hemagglutination assay, an IgM dot immunobinding assay, and a latex agglutination test. Four articles evaluated more than one type of diagnostic test. There were 11 studies that evaluated diagnostic tests still in development and 21 studies on prototype/commercial tests.

Characteristics of the included studies
In total, there were 6,772 samples analysed. The samples were taken from 12 countries, with most studies recruiting in Thailand (n = 15, 48.4% of included studies), India (n = 5, 16.1%), Laos (n = 3, 9.7%), and Korea (n = 3, 9.7%). There was one study each (3.2%) conducted on samples from Sri Lanka, Nepal, Malaysia, Peru, Indonesia, United States of America, and Australia. There were five studies (16.1%) with unclear sample collection location. Ten (32.2%) studies collected paired samples (acute and convalescent phase), while 15 studies (48.4%) did not provide sufficient details on sample collection timing (Table 1).

Quality of articles
There were 7, 14, and 10 articles of high, moderate, and low quality, respectively (Table 1). Two articles fulfilled all of the QUADAS-2 main criteria (S1 Dataset). Most of the articles  (Fig 2). The main reason for this is that most of the articles did not mention explicitly whether they performed blinding during the conduct and interpretation of both the index test (n = 23, 74.2%) and reference test (n = 25, 80.6%) (S2 Dataset). There were 20 studies (64.5%) with a case-control study design.

Performance of POCT
IgM ICT. There were five manufacturers of IgM ICT identified (Table 2), namely: AccessBio, InBios, PanBio, Standard Diagnostics (SD), and ImmuneMed. There were two studies (6.6%) that assessed in-house tests [24,25]. The sensitivity ranged from 23.3% to 100.0%, and the specificity ranged from 73.0% to 100.0% (Fig 3, [26,27]). For a given manufacturer, accuracy varied across studies. The InBios IgM ICTs showed >80.0% sensitivity and >90.0% specificity on average. The ImmuneMed IgM ICT demonstrated sensitivity of 99.0% and specificity of 98.0%; however, there was only one data point for this manufacturer. One of the studies reported sensitivity and specificity for the AccessBio IgM ICT of 97.0% and 93.0%, respectively. However, the other AccessBio studies did not demonstrate such a high degree of accuracy (Fig 3).
Total antibody ICT. The ICTs with IgG, combination of IgG and IgM, and combination of IgG, IgM, and IgA as the detection target were grouped together under 'total antibody ICT' (Table 3). There were five manufacturers identified, namely: AccessBio, ImmuneMed, InBios, PanBio, and SD. The remaining studies (n = 3, 6.6%) assessed the diagnostic performance of in-house tests in development. The sensitivity and specificity ranged from 20.9% to 99.1% and 67.9% to 100.0%, respectively (Fig 4; results from [26,27] were not plotted since only sensitivity values were presented). As in the case of IgM ICT, the accuracy across data points of total antibody ICT of the same manufacturer varied. The ImmuneMed total antibody ICT demonstrated >95.0% specificity and >80.0% sensitivity.
Dot-blot. Aside from the in-house tests assessed by four studies, there were two dot-blot assay manufacturers, namely Integrated Diagnostics and PanBio. The range was 59.6% to 100.0% and 83.0% to 98.7% for sensitivity and specificity, respectively (Table 4).
LAMP. The Loopamp LAMP kit (Eiken Chemical, Japan) was assessed in two studies (Table 4). The other two studies evaluated in-house tests. The sensitivity ranged from 66.7% to 100.0% and specificity ranged from 63.2% to 100.0%.
Other methods. There was one study each on a passive hemagglutination assay, an IgM dot-immunobinding assay, and a latex agglutination assay (Table 4). These tests were all in the development phase.

Meta-analysis results
There were 11 data points extracted from four studies included in the meta-analysis (Fig 5). In the resulting forest plot (Fig 5), the top three data points were extracted from studies assessing total antibody ICT (Blacksell et al, 2010; Watthanaworawit et al, 2015) [28,29]. The rest of the 2 by 2 table data were extracted from studies assessing IgM ICT diagnostic performance. The pooled sensitivity and specificity were 66.0% (95% CI 37.0-86.0%) and 92.0% (95% CI 83.0-97.0%), respectively. The overall Chi-square heterogeneity statistics showed significant heterogeneity (p < 0.001). There was a high degree of heterogeneity present (I² = 97.48%, 95% CI 96.71-98.24 for sensitivity and I² = 98.17%, 95% CI 97.67-98.67 for specificity). Meta-regression on several covariates was performed in an attempt to explain this heterogeneity. Significant heterogeneity was observed for total number of samples (p = 0.01), study design (whether using case-control design or not, p = 0.01), blinding during index test interpretation (p = 0.02), and QUADAS-2 score (p = 0.01). No significant heterogeneity was observed for blinding during reference test interpretation (p = 0.21) or the antibody detection target of the tests (p = 0.22). All of these studies used IFA as their reference standard, except Blacksell et al, 2010, which used IFA with the addition of PCR and culture [28]. None of the meta-analysed studies used an IFA cut-off lower than 1:400 as the reference comparator.

Discussion
There were 31 relevant articles included in this review. Almost half of the included articles were of moderate quality. The meta-analysis showed moderately low pooled sensitivity and good specificity of the current commercially available scrub typhus POCT. However, the studies were heterogeneous, with the I² value indicating a high degree of heterogeneity. Hence, the pooled sensitivity and specificity values need to be interpreted with caution. The systematic review and meta-analysis highlighted the methodological and clinical heterogeneity across scrub typhus POCT diagnostic accuracy studies. These differences made it difficult to pool results and compare studies. Meta-regression for other covariates of interest (e.g. sample collection timing) could not be performed because of limited information presented in the original articles.
Almost a quarter of the responses gathered in the main seven questions on QUADAS-2 quality assessment were 'unclear'. Although we did not assess the quality of reporting in this review, this finding indicates that the quality of reporting in the included studies is still arguably poor. Poorly conducted and controlled diagnostic accuracy studies are a waste of time, resources, and effort; moreover, if research is not accurately reported, it can hinder critical appraisal, replication, and meta-analysis of studies [30]. The launch of reporting guidelines such as Standards for Reporting Diagnostic Accuracy (STARD) and PRISMA is a starting point in improving the quality of reporting, although they are not applied as much as they should be [30,31]. Besides adhering to reporting guidelines, regulations should be adapted to incentivise better and more complete reporting [31]. Creating a reporting infrastructure and building the capability of both authors and reviewers are also necessary to encourage better reporting [31].
Approximately two thirds of the included studies used a case-control design. Compared with studies that recruit patients consecutively, a case-control design (evaluating the index test in separate diseased and control groups) overestimates diagnostic performance [32]. Therefore, the results presented need to be interpreted with caution.
Aside from the LAMP assay, all of the other POCTs reviewed here relied on antibody detection. Serological diagnosis is problematic in several ways. First, the primary serum collection may not contain sufficient antibodies, since it takes time for antibodies to increase to a detectable level, creating a "false negative" result. Second, in endemic populations with significant background immunity, an appropriate cut-off needs to be established to ensure accurate diagnosis; otherwise there is the possibility of "false positives". Furthermore, the choice between IgM and whole antibody as the detection target is very much dependent on the situation. Conventional thought is that whole antibodies may give a higher number of "false positive" results in endemic situations due to the presence of residual IgG from previous scrub typhus infections, as was the case with the AccessBio ICTs tested in Thailand and Laos [19,28]. These shortcomings of serology highlight the need to develop alternative diagnostic strategies. Although pooled from heterogeneous studies, the lower bound of the pooled specificity confidence interval was above 80.0%. This indicates that commercially available ICTs have value in "ruling in" the diagnosis of scrub typhus. However, it is difficult to draw conclusions with confidence based on the currently available evidence given the high degree of heterogeneity amongst the studies.
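One way to make the "ruling in" interpretation concrete is through likelihood ratios, which follow directly from sensitivity and specificity. The sketch below is an illustration added for the reader, not an analysis performed in this review; it plugs in the pooled point estimates (66.0% and 92.0%) as if they were fixed values, which the heterogeneity described above cautions against. The function name is hypothetical.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


if __name__ == "__main__":
    # Pooled point estimates from the meta-analysis, used here for illustration only.
    lr_pos, lr_neg = likelihood_ratios(0.66, 0.92)
    print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs the positive likelihood ratio is about 8.25 and the negative likelihood ratio about 0.37, so a positive result shifts the odds of disease considerably more than a negative result lowers them, which is consistent with the tests being more useful for ruling in than for ruling out.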
Another important consideration when performing a diagnostic accuracy study is the choice of reference comparator. Often the IFA, the serological "gold standard", is selected. However, this test is far from perfect. IFA result interpretation is subjective and there remains a lack of consensus on the optimum positivity cut-off. A cut-off is often adopted without sufficient local evidence, potentially resulting in incorrect diagnostic accuracy measures for the diagnostic under test [15]. To evaluate the true accuracy of IFA, Bayesian latent class models (LCM) have been used. The models showed that the IFA IgM has sensitivity and specificity of 70.0% and 83.8%, respectively, therefore suggesting it to be unfit as a reference standard [12]. An alternative reference comparator is the composite Scrub Typhus Infection Criteria (STIC), which have been proposed as a more robust method to diagnose scrub typhus with more confidence, by including a panel of parameters with high specificity [20]. The panel includes: (i) isolation of O. tsutsugamushi, (ii) at least two positive out of the three PCR assays targeting the 56kDa, 47kDa, and groEL genes, (iii) admission IFA IgM titre of ≥1:12,800, (iv) 4-fold rise in IgM titre from paired samples [12,19,20]. At least one of these criteria needs to be fulfilled for a positive scrub typhus diagnosis. However, the Bayesian LCM also showed that STIC's sensitivity and specificity are less than optimal for a reference test (90.5% and 82.5%, respectively).
This study has several limitations. First, the search was performed in one database and in English only, which might have resulted in non-inclusion of relevant articles. Second, the article inclusion and quality assessment were completed primarily by one person, though we attempted to decrease the risk of bias by routine discussion of contentious studies amongst the authors. Third, several studies did not present all of the parameters that we wished to extract, which limited the meta-analysis because not all relevant papers could be included.
In reality, there is only a small number of scrub typhus POCTs that are commercially available. The PanBio POCTs are no longer available and therefore the market is dominated by a few companies including InBios, Standard Diagnostics and AccessBio. Selecting the most appropriate POCT is very much dependent on a few key factors such as local availability of the product, the price, and the shelf life. In this study we have not examined these local practical aspects that should be considered when selecting POCTs.
In the absence of robust POCTs, the presence of an eschar can be a valuable clinical sign strongly suggesting a diagnosis of scrub typhus. Although its reported presence is very variable, it is still regarded as pathognomonic. An eschar may go unnoticed by the patient since it is painless, often does not itch, can be small, and looks similar to a post-trauma scab. In addition, it may be located in a concealed area, such as the perineum or under the breasts. This emphasises the importance of performing a thorough physical examination.
Treating patients empirically based on the pathogen pattern in an area is common practice, but bears the risk of unnecessary or inappropriate treatment (with the attendant risks of side effects) and promotion of antimicrobial resistance. POCTs can play an important role in reducing the number of patients treated empirically and increasing the proportion of patients treated appropriately.
There is an urgent need to develop an affordable and accurate POCT. However, if POCTs still rely on serological measures only, they might not be able to provide diagnosis in time to inform treatment. Future research should also be directed towards developing new antigen-based tests to improve diagnostic accuracy in the early period of disease.