Tourniquet Test for Dengue Diagnosis: Systematic Review and Meta-analysis of Diagnostic Test Accuracy

Background Dengue fever is a ubiquitous arboviral infection in tropical and sub-tropical regions, whose incidence has increased over recent decades. In the absence of a rapid point of care test, the clinical diagnosis of dengue is complex. The World Health Organisation has outlined diagnostic criteria for making the diagnosis of dengue infection, which includes the use of the tourniquet test (TT). Purpose To assess the quality of the evidence supporting the use of the TT and perform a diagnostic accuracy meta-analysis comparing the TT to antibody response measured by ELISA. Data Sources A comprehensive literature search was conducted in the following databases to April, 2016: MEDLINE (PubMed), EMBASE, Cochrane Central Register of Controlled Trials, BIOSIS, Web of Science, SCOPUS. Study Selection Studies comparing the diagnostic accuracy of the tourniquet test with ELISA for the diagnosis of dengue were included. Data Extraction Two independent authors extracted data using a standardized form. Data Synthesis A total of 16 studies with 28,739 participants were included in the meta-analysis. Pooled sensitivity for dengue diagnosis by TT was 58% (95% Confidence Interval (CI), 43%-71%) and the specificity was 71% (95% CI, 60%-80%). In the subgroup analysis sensitivity for non-severe dengue diagnosis was 55% (95% CI, 52%-59%) and the specificity was 63% (95% CI, 60%-66%), whilst sensitivity for dengue hemorrhagic fever diagnosis was 62% (95% CI, 53%-71%) and the specificity was 60% (95% CI, 48%-70%). Receiver-operator characteristics demonstrated a test accuracy (AUC) of 0.70 (95% CI, 0.66–0.74). Conclusion The tourniquet test is widely used in resource poor settings despite currently available evidence demonstrating only a marginal benefit in making a diagnosis of dengue infection alone. Registration The protocol for this systematic review was registered at PROSPERO: CRD42015020323.

Virus transmission can cause a spectrum of illness from subclinical to severe dengue infection characterized by plasma leakage, haemorrhage and end-organ impairment. Characterisation of specific phenotypes of infection is complex and has recently changed [11][12][13][14]. Clinically, dengue fever presents as an acute febrile disease with symptoms of headache, bone or joint and muscular pains, rash and leukopenia. Traditionally a further two stages were described, consisting of dengue haemorrhagic fever (DHF), characterized by high fever, haemorrhagic phenomena, often with hepatomegaly. In severe cases, further signs of circulatory failure may develop culminating in dengue shock syndrome (DSS), which is associated with poor outcomes. More recent consensus guidance [12] recommends distinction of dengue illness into dengue (with or without warning signs which may precede the development of more severe infection) and severe dengue (encompassing the manifestations of severe plasma leakage, severe bleeding or severe end-organ involvement) [11,12].
The clinical diagnosis of dengue is challenging as the symptoms are non-specific and common to many other infections [10][11][12], notably malaria and other arboviral infections. To aid diagnosis, specifically during the initial, acute, febrile phase which may last 2-7 days after the development of fever, the WHO recommends the use of the Tourniquet Test (TT, also known as the Rumpel-Leede or Hess test) to support diagnostic decision-making [13][15][16][17][18][19][20][21]. As an inexpensive, quick and easy-to-perform procedure, the TT has become widespread in clinical practice globally. The TT is a marker of capillary fragility and is performed by inflating a blood pressure cuff around the upper arm to a point midway between the individual's systolic and diastolic blood pressures and leaving it inflated for 5 minutes. The cuff is then released and, after two minutes, the number of petechiae below the antecubital fossa is counted. The test is positive if more than 10 petechiae are present within a square inch of skin on the arm [11,12]. The clinical diagnosis of dengue may be confirmed by laboratory testing, which in many settings involves the measurement of an antibody response (IgM or IgG) by ELISA [3], for years considered to be the diagnostic standard [22]. This test is less sensitive in the first 5 days after exposure and frequently relies on testing of paired sera samples. Newer tests available in some centres include reverse-transcriptase PCR (polymerase chain reaction) or direct antigen detection (non-structural protein 1). While these tests are likely to offer an improvement in diagnostic accuracy, their cost and the current limitation of not detecting all serotypes limit their application.
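The two numerical rules in the TT procedure (the cuff pressure and the positivity criterion) can be written out as a minimal sketch. The function names and the `cutoff` parameter are hypothetical and introduced only for illustration; the default of 10 follows the "more than 10 petechiae" rule stated above.

```python
def cuff_target_pressure(systolic_mmhg: float, diastolic_mmhg: float) -> float:
    """Return the pressure midway between systolic and diastolic readings."""
    if systolic_mmhg <= diastolic_mmhg:
        raise ValueError("systolic pressure must exceed diastolic pressure")
    return (systolic_mmhg + diastolic_mmhg) / 2.0


def tourniquet_test_positive(petechiae_per_sq_inch: int, cutoff: int = 10) -> bool:
    """Positive when the count per square inch is strictly greater than the cutoff."""
    if petechiae_per_sq_inch < 0:
        raise ValueError("petechiae count cannot be negative")
    return petechiae_per_sq_inch > cutoff


# Example: a reading of 120/80 mmHg gives a target cuff pressure of 100 mmHg.
print(cuff_target_pressure(120, 80))
print(tourniquet_test_positive(11), tourniquet_test_positive(10))
```

Keeping the cutoff as a parameter makes it straightforward to apply a different threshold where a study reports one.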
The evidence to support the recommendation and widespread use of the TT to aid the diagnosis of dengue fever is mixed, with variable sensitivity and specificity being reported previously [15][16][17][18][19][20][21]. The aim of this study is to map the evidence, assess the quality of the studies and perform a diagnostic accuracy meta-analysis of the diagnosis of dengue using the TT compared to ELISA.

Data Sources and Searches
The protocol for this systematic review was registered at PROSPERO (International prospective register of systematic reviews, http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42015020323). We searched the following databases through OVID: Medical Literature Analysis and Retrieval System Online (Medline), Excerpta Medica Database (EMBASE), Allied and Complementary Medicine Database (AMED), Global Health, and Biological Abstracts/Reports, Reviews, Meetings (BIOSIS). We searched Latin American and Caribbean Health Sciences (LILACS) and the Cochrane Library through their websites. All searches covered relevant publications until April 2016.
Additionally, we searched the WHO ICTRP (International Clinical Trials Registry Platform) and ClinicalTrials.gov for completed and ongoing studies.
The search, performed according to the Cochrane Highly Sensitive Search Strategy, used the following terms: "Rumpel-Leede" OR capillary OR "blood pressure cuff" OR petechiae OR tourniquet OR "Hess" AND dengue. The search was sensitive: we used no study filters and no language or publication restrictions. We checked the reference lists of all primary studies included for additional references.
We included cross-sectional and cohort studies that evaluated the diagnostic accuracy of the tourniquet test for dengue infection. Both retrospective and prospective studies that consecutively or randomly selected patients were included, together with studies that used delayed verification by the gold standard. We included studies of patients presenting with fever who were subsequently tested for dengue using both the TT (index test) and ELISA detection of antibody response (reference standard).
For this review, definitions of dengue were used according to those proposed by the WHO [11,12], as these were the definitions used during the time period from which studies were drawn. For the purposes of this meta-analysis, 'dengue' was considered to consist of non-severe 'dengue fever' and 'haemorrhagic dengue fever', defined as follows. Dengue fever included fever plus 2 or more symptoms of nausea/vomiting, rash, aches and pains. Dengue hemorrhagic fever (DHF) was considered as infection accompanied by haemorrhagic manifestations such as petechiae and mucosal or gastro-intestinal bleeding [11,12].
Three comparisons were performed: TT vs. ELISA to diagnose dengue (i.e. both non-severe dengue fever and DHF); TT vs. ELISA to diagnose dengue fever; and TT vs. ELISA to diagnose DHF.

Study Selection
Two review authors (AJG, HR) independently assessed all studies identified from the database searches by screening titles and abstracts using the Review Management website Covidence (http://www.covidence.org). We separated potential studies for full-text reading. A third review author (ET) resolved any disagreements, and reasons for including and excluding trials were recorded.

Data Extraction and Quality Assessment
Two review authors (AJG, HR) independently extracted data from the included studies using a standard data extraction form. With this form we extracted information on study design, participant description, index test description, reference test description, dengue classification and total number of participants. A 2x2 table was created for each study comparing both tests.
All included studies were assessed for their methodological quality using the quality assessment tool for diagnostic accuracy studies (QUADAS-2) [23]. The tool is composed of 17 items regarding patient selection, index test, reference standard, and flow and timing. For each domain there are items for risk of bias and applicability. Items were scored as positive (low risk of bias), negative (high risk of bias), or insufficient information (unclear). Each assessment is described in the Results section.

Data Synthesis and Statistical Analysis
For each study, a 2x2 contingency table was constructed. We calculated sensitivity, specificity and likelihood ratios (LRs). When a primary study had 0 in a cell of the 2x2 table, the value of 1 was added so that calculations could be done [24]; this happened in only one study [17]. We planned to exclude primary studies reporting two cells with 0, but this did not occur.
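The per-study calculations can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the class and function names are hypothetical, the example counts are invented, and the zero-cell correction is assumed to add 1 to all four cells (the text does not say whether the value was added to the empty cell only or to every cell).

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: float  # TT positive, ELISA positive
    fp: float  # TT positive, ELISA negative
    fn: float  # TT negative, ELISA positive
    tn: float  # TT negative, ELISA negative

    def corrected(self) -> "TwoByTwo":
        """Assumed correction: add 1 to every cell if any cell is zero."""
        if 0 in (self.tp, self.fp, self.fn, self.tn):
            return TwoByTwo(self.tp + 1, self.fp + 1, self.fn + 1, self.tn + 1)
        return self


def accuracy(table: TwoByTwo) -> dict:
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    t = table.corrected()
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
    }


# Invented counts for illustration only (not taken from any included study).
example = accuracy(TwoByTwo(tp=58, fp=29, fn=42, tn=71))
print({name: round(value, 2) for name, value in example.items()})
```

After the correction every cell is at least 1, so none of the divisions can fail on a table with a single empty cell.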
The sensitivity, specificity and LRs were pooled across studies and a forest plot was generated with 95% confidence intervals. Due to the variability in diagnostic data, we logit-transformed sensitivity and specificity for each primary study and for the aggregate result, considering both within-study and between-study variability. The output results are random effects estimates of the mean sensitivity and specificity with corresponding 95% CIs. The weighting used the inverse of the standard error and was therefore indirectly related to the sample size reported in the studies. Inconsistency (I²) was explored as an indicator of statistical heterogeneity [24]. Summary receiver operating characteristic (ROC) curves were generated with calculation of the area under the curve (AUC) as an indicator of test accuracy. To assess the possibility of publication bias, we constructed funnel plots to visually assess for signs of asymmetry [25].
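The general idea of pooling a logit-transformed proportion with a random effects model can be sketched as below. This is not the authors' model: it assumes a univariate DerSimonian-Laird estimator with conventional inverse-variance weights, whereas the text does not name the estimator and describes its weighting differently. The function name and the example counts are hypothetical.

```python
import math


def pool_random_effects(events, totals):
    """Pool proportions (e.g. sensitivities) on the logit scale.

    events[i] is the number of correct test results out of totals[i];
    each must satisfy 0 < events[i] < totals[i].
    Returns (pooled proportion, lower 95% limit, upper 95% limit, I2 in %).
    """
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]  # logit
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]    # within-study variance
    w = [1 / vi for vi in v]

    # Fixed-effect mean and Cochran's Q, needed for tau^2 and I^2.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights combine within- and between-study variance.
    w_star = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), i2


# Invented data: two studies with very different sensitivities.
pooled, low, high, i2 = pool_random_effects([30, 80], [100, 100])
print(round(pooled, 2), round(low, 2), round(high, 2), round(i2, 1))
```

Back-transforming the limits from the logit scale keeps the confidence interval within 0 and 1, which is one common reason for using the transformation.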

Results
We identified 1610 studies of which 637 were excluded as duplicates (Fig 1). A total of 973 studies were assessed on the basis of the title and abstract, of which 883 were excluded because they did not fulfill inclusion criteria. Full text studies were retrieved for 90 titles, of which 74 were excluded (Table 1): unable to extract absolute numbers of true positives, false positives, false negatives, and true negatives (n = 46), wrong test (n = 15) and wrong study designs (n = 13) [4][5][6][7][8].
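The study flow reported above can be checked with simple arithmetic. This is only a consistency check on the counts given in the text; the variable names and the dictionary labels are illustrative.

```python
identified = 1610
duplicates = 637
screened = identified - duplicates  # records screened by title and abstract

excluded_on_screening = 883
full_text = screened - excluded_on_screening  # full texts retrieved

full_text_exclusions = {
    "no extractable 2x2 data": 46,
    "wrong test": 15,
    "wrong study design": 13,
}
included = full_text - sum(full_text_exclusions.values())

print(screened, full_text, included)  # prints: 973 90 16
```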
All analyses showed high levels of heterogeneity, represented by an I² ranging from 75% to 100%. Given this considerable heterogeneity between studies, we performed a random effects meta-analysis, presented below.

Methodological Quality of Included Studies
We used the QUADAS-2 instrument, which is composed of four quality categories (patient selection, reference standard, index test, and flow and timing), to critically appraise each included study (Fig 2). Six studies (37.5%) were considered to have high risk of bias in patient selection due to inclusion of patient data from a database, raising the possibility of bias from multiple assessors, or selection of patients with pre-existing disease. Two studies (12.5%) had not adequately described their sampling methods, so were classified as unclear risk. Eight studies (50%) were low risk of bias for patient selection.
Considering the Reference standard category (ELISA), all studies were considered low risk of bias. For the Index test category, four studies (25%) had not clearly described the process used to conduct the TT, blind assessors or train assessors.
For the flow and timing category, only two studies (12.5%) were considered at high risk of bias as the TT was repeated multiple times over a period of several days. Four studies (25%) were considered unclear risk due to lack of information of withdrawals and appropriate sequencing of tests.

Study | Reason for exclusion
Ahmed 2001 [27] | No data to be extracted
Arif 2009 [28] | No data to be extracted
Awasthi 2012 [29] | Wrong test
Ayyub 2006 [30] | No data to be extracted
Gregory 2010 [96] | No data to be extracted
Guzman 1996 [46] | No data to be extracted
Hanif 2011 [47] | No data to be extracted
Horstick 2012 [48] | Wrong study design
Muhammad 2006 [7] | No data to be extracted
Munir 1982 [8] | No data to be extracted
Munir 2014 [60] | No data to be extracted
Namvongsa 2013 [61] | No data to be extracted

Diagnostic Odds Ratio was 2.08 (95% CI, 1.15-6.82). The area under the curve was 0.66 (95% CI, 0.62-0.70) (Fig 4C).

Dengue Shock Syndrome
None of the included studies reported data comparing TT and ELISA for patients with dengue shock syndrome.

Subgroup Analysis for Dengue vs ELISA
We conducted a subgroup analysis of the included studies considering only children and adolescents aged 6 months to 15 years. No analysis of adults was conducted, since none of the 16 included studies examined adult participants exclusively; where adults were analysed, their data were mixed with those of children and adolescents. In this subgroup analysis, we included eight studies covering both non-severe dengue fever and DHF cases. The pooled sensitivity for dengue diagnosis was 0.71 (95% CI, 0.59-0.82) and the specificity was 0.59 (95% CI, 0.47-0.70) (Fig 7). The positive likelihood ratio was 1.66 (95% CI, 1.45-1.91). The negative likelihood ratio was 0.52 (95% CI, 0.43-0.64). The Diagnostic Odds Ratio was 3.44 (95% CI, 2.25-5.25). The area under the curve was 0.69 (95% CI, 0.65-0.73).
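As a rough cross-check of the ratio measures, likelihood ratios and the diagnostic odds ratio can be derived from the pooled sensitivity and specificity point estimates. This is only an approximation: ratios computed from pooled point estimates need not equal ratios that were pooled separately across studies, so small differences from the reported values are expected. The function name is hypothetical.

```python
def ratio_measures(sensitivity: float, specificity: float) -> dict:
    """Likelihood ratios and diagnostic odds ratio from sensitivity/specificity."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {"lr_positive": lr_pos, "lr_negative": lr_neg, "dor": lr_pos / lr_neg}


# Pooled subgroup estimates reported above: sensitivity 0.71, specificity 0.59.
measures = ratio_measures(0.71, 0.59)
print({name: round(value, 2) for name, value in measures.items()})
```

Unlike predictive values, which are proportions bounded by 0 and 1, a positive likelihood ratio for an informative test is greater than 1 and a negative likelihood ratio is less than 1.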

Sensitivity Analysis
We conducted the following sensitivity analyses of the Dengue vs ELISA analysis:
1. To analyse the impact of the mix of cut-off points reported by studies (10 petechiae per square inch and 20 petechiae per square inch), we repeated the analysis using only studies that applied the criterion of 20 petechiae per square inch.
2. We conducted another sensitivity analysis removing all studies with high risk of selection bias.

Investigations of Publication Bias and Heterogeneity
The funnel plot asymmetry test revealed evidence of publication bias (Fig 10). The I² statistics were, as expected in diagnostic meta-analyses, over 95% in all three comparisons made (Figs 3, 5 and 6) [26].

Discussion
Dengue fever is an infection of significant global public health importance. Increasing urbanization and crowding in endemic areas, coupled with failing vector control programs, have resulted in a significant increase in cases and major outbreaks since the 1950s [11]. This is the first systematic review and meta-analysis to specifically investigate the utility of the tourniquet test for diagnosing dengue infection compared to the widely used standard laboratory ELISA testing.
Overall, our results demonstrate that the TT is a relatively poor diagnostic test for dengue of any severity. When assessed by ROC analysis, low AUCs (0.70) suggest that the TT, as an isolated diagnostic test and in comparison to ELISA, should be classified as a "relatively poor" test [103]. Funnel plot analysis suggests that there may be a major element of publication bias in the previous reporting of tourniquet test usefulness. Many studies were observed to have extreme results, i.e. a large effect either for or against the TT. Reasons for this may include small sample sizes with wide standard errors, or other problems in study design, with resulting overemphasis in reporting positive or confirmatory results. Resulting estimates of TT efficacy may therefore have been skewed towards overly positive effects.
In our subgroup analysis, we removed the studies that mixed adults with children and kept only those of children and adolescents from 6 months to 15 years of age. We did not find any significant change in the utility of the TT in diagnosing dengue infection. We did further sensitivity analyses considering both a diagnostic cut-off of 20 petechiae per square inch and a repeat analysis after removing studies at high risk of bias for patient selection. Neither of these analyses led to any significant difference in our findings. This is of interest, as using a higher, stricter cutoff would generally reduce sensitivity but increase specificity for diagnosis. The lack of an effect when increasing the threshold to 20 petechiae suggests limited biological correlation of this clinical observation. Additional analyses demonstrated that data from individual studies were scattered and included a wide range of participant age-groups and geographic regions. Ideally, we would also have liked to investigate the performance of the TT in different age-groups using a range of cutoff thresholds, for example, 10 petechiae in children and 20 petechiae in adults. However, it was not possible to extract these age-group specific data from many of the primary studies.
Considering the subgroup analysis, we hypothesized that increasing the threshold for diagnosing dengue would decrease sensitivity and increase specificity. This hypothesis was not confirmed, and the lack of effect on the outcomes shows the need for a more rigorous test to diagnose dengue.
Using the GRADE approach to assess the quality of the evidence generated in this study, we can classify it as low, which means that "further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate or any estimate of effect is very uncertain". The evidence was downgraded due to imprecision (wide confidence intervals) and inconsistency (widely different estimates) across the included studies.
Further limitations to our analysis of data currently available in the literature may arise from heterogeneity in the time periods at which the tests were performed, or the number of occasions on which the test was repeated prior to getting a positive result. Additional practical reasons for previous overestimation of the efficacy of the TT may include difficulties in interpreting a positive result in individuals with different skin pigmentation or variation in the virulence or pathogenicity of strains resulting in higher rates of capillary permeability, for example in South East Asian genotypes of the DEN-2/3 serotypes.
Here, we have assessed and presented the best available evidence for use of the TT in making a diagnosis of dengue infection. Clearly the TT should not be used in isolation for making a diagnosis of dengue. Moreover, given the evidence available, it is doubtful whether the test offers any additional benefit over and above careful clinical evaluation. Inconsistencies in data reporting in the primary study datasets also make it currently infeasible to assess whether the TT may be useful for particular population or disease subgroups.

Conclusions
The clinical diagnosis of dengue is challenging as disease presentation is almost indistinguishable from many other infections commonly found in the tropics [104]. Current WHO recommendations suggest a combination of clinical history, leukopenia and the tourniquet test result to make a diagnosis if ELISA testing is not available or prior to the availability of results. Given the requirement for paired sera samples in many areas where dengue is endemic to demonstrate an increase in antibody titre, reliance on clinical diagnosis will be still greater. While the tourniquet test is still widely used, our analyses suggest that the data supporting its routine use are, at best, relatively poor. It is important to consider, however, that the quality of the evidence is low due to imprecision and inconsistency across the included studies. Furthermore, the data used to underpin current international recommendations likely overestimate its utility. Over-reliance on the TT to support a clinical diagnosis of dengue infection may result in misdiagnosis of patients and inaccurate estimates of disease incidence; relatively low sensitivity but higher specificity suggest that disease incidence may be underestimated if the TT is overly relied on. While current recommendations should be re-examined in light of these findings, replacement of the tourniquet test in routine clinical practice will only come once improved point-of-care diagnostics are made more widely available, especially in resource-poor areas.

Author Contributions
Conceived and designed the experiments: AJG HR ET CF TCD. Performed the experiments: AJG HR ET. Analyzed the data: AJG HR ET CF TCD. Contributed reagents/materials/analysis tools: AJG HR ET CF TCD. Wrote the paper: AJG HR ET CF TCD.