Rapid diagnostic tests and ELISA for diagnosing chronic Chagas disease: Systematic revision and meta-analysis

Objective To determine the diagnostic validity of the enzyme-linked immunosorbent assay (ELISA) and rapid diagnostic tests (RDT) among individuals with suspected chronic Chagas disease (CD). Methodology We searched PubMed, Web of Science, Scopus, and LILACS for studies published between 2010 and 2020 that reported validity estimates for ELISA and RDT assays. We extracted the data and assessed the risk of bias and applicability of the studies using the QUADAS-2 tool. A bivariate random-effects model was used to estimate the overall sensitivity and specificity, which were displayed in forest plots and ROC space, and heterogeneity between studies was assessed visually. Meta-regressions were performed for subgroup analysis. We used Deeks’ test to assess the risk of publication bias. Results 43 studies were included: 27 assessed ELISA tests, 14 assessed RDTs, and 2 assessed both ELISA and RDTs, against different reference standards. Of these, 51.2% used a non-comparative observational design and 46.5% a comparative clinical design (“case-control” type). High risk of bias was detected for patient selection and reference standard. The ELISA tests had a sensitivity of 99% (95% CI: 98–99) and a specificity of 98% (95% CI: 97–99), whereas the RDTs had values of 95% (95% CI: 94–97) and 97% (95% CI: 96–98), respectively. Deeks’ test showed asymmetry for the ELISA assays. Conclusions ELISA and RDT tests have high validity for diagnosing chronic Chagas disease. The analysis of these two types of evidence in this systematic review and meta-analysis constitutes an input for their use. The limitations included the difficulty in extracting data due to the lack of information in the articles, and the comparative clinical-type design of some studies.

Introduction American Trypanosomiasis or Chagas disease (CD), caused by the protozoan Trypanosoma cruzi, continues to be an important cause of illness, disability and death [1]. In recent years, CD has become the main parasitic disease in Latin America and one of the 13 most neglected tropical diseases [2]. It is estimated that about 100 million people in the region are at risk of being infected with T. cruzi and that about 8 to 10 million are already infected, with 30,000 new cases per year from all forms of transmission and 12,000 annual deaths [3]. In addition, international migration has spread infected individuals from Latin America all over the world, which now makes the disease a problem for global health systems [4].
CD has two phases: acute and chronic. The acute phase is usually asymptomatic or can present as a nonspecific, self-limited febrile syndrome that resolves in approximately 90% of untreated infected individuals [5]. In the chronic phase, around 60% to 70% of patients do not present any apparent symptoms, whereas 30% develop cardiomyopathy with varied clinical manifestations, including arrhythmias, aneurysms, dilated cardiomyopathy, and sudden death [6].
It is essential to diagnose T. cruzi infection using laboratory tests in order to prescribe the best treatment and thereby stop the progression of the disease and prevent its transmission [7,8]. However, one limitation is the complexity of the diagnostic process, which is sometimes hampered by the lack of a reference standard, by the availability of multiple types of assays with different sensitivity and specificity values, and by the great difficulty of detecting the parasite in the chronic phase of the disease [9]. The World Health Organization (WHO) recommends using two conventional tests for diagnosing chronic CD, based on different principles and the detection of different antigens. Furthermore, in the case of ambiguous or inconclusive results, a third technique should be used [10]. Thus, serological tests such as indirect immunofluorescence, indirect hemagglutination, enzyme-linked immunosorbent assay (ELISA), and immunochromatographic tests or rapid diagnostic tests (RDT) are used [11]. They can be qualitative or semi-quantitative and are based on different antigens; some use a multi-epitope antigen and others a combination of recombinant proteins [12]. The Pan American Health Organization states that the evidence on the validity of tests for diagnosing CD is considered high for ELISA tests and chemiluminescence analysis, and moderate for RDTs [13]. Each technique differs in the antigenic targets used, the population evaluated, the cut-off points and the equipment used, which makes a direct comparison of test performance difficult [14]. In light of the above, the purpose of this study was to summarize the available evidence on the diagnostic validity of ELISA and immunochromatographic tests (RDT) in individuals with suspected chronic CD.

Protocol and registration
This systematic review and meta-analysis was carried out according to the PRISMA-DTA guidelines (Preferred Reporting Items for Systematic Reviews and Meta-analysis of Diagnostic Test Accuracy Studies -The PRISMA-DTA Statement) [15] for the abstract and the body of the manuscript (S1 and S2 Checklists). The protocol was registered in the PROSPERO database (International Prospective Register of Systematic Reviews) with number CRD42020186588.

Eligibility criteria
The search included studies that estimated the sensitivity and specificity of ELISA or RDT index tests for chronic CD; that enrolled participants over five years old, both patients with chronic CD and patients without the disease; that were conducted in endemic or non-endemic areas for CD; that described the reference standards used; that had a cross-sectional or case-control design; that were written in English, Spanish or Portuguese and published between 2010 and 2020; and that were carried out with volunteers and with human samples. Studies indicating that patients were receiving treatment for CD, those related exclusively to acute infection or to newborns, and those with mixed data on patients with acute and chronic infection were excluded.

Data sources
The search was carried out from May to August 2020 in the following databases: PubMed/MEDLINE, Scopus, ISI Web of Science, and LILACS. The corresponding authors of the included articles were contacted by email to inquire about missing data or to request clarification on their studies.

Study search and selection
The standard search strategy described in The Joanna Briggs Institute Reviewers' Manual 2015 [16] was used. Thus, there was an initial limited search to identify relevant keywords and indexing terms, followed by a comprehensive search in the databases included with strategies for each of the search engines (S1 Database). Two reviewers (SHSC-LXRL) assessed article titles and abstracts in an independent and blinded manner. Disagreements in the inclusion of studies were resolved by consensus, taking into account that the abstracts should meet the proposed eligibility criteria. Subsequently, the articles were reviewed in full text.

Data collection process
Two authors (SHSC-LXRL) extracted the following data independently: author(s), year of publication, type of participants, study area, index test, reference test, study period, country of implementation, number of patients and healthy subjects, total number of participants, sensitivity and specificity, risk of bias and applicability.

Definitions for data extraction
The subjects included in the different studies were classified into: patients who had lived or resided in an endemic area for CD and patients who reside in a non-endemic area.
The study area was considered endemic if CD occurred in that geographic area, and non-endemic otherwise. The index tests were considered commercial when they were part of a brand of laboratory diagnostic reagents, validated by medical device regulatory agencies and available on the market; they were considered in-house tests when the studies indicated that the immunoadsorption assays had been designed with different peptides or proteins using non-standard "internal" methods. RDTs were defined as immunochromatographic assays that yield qualitative results and can be read visually.
Reference tests met the standard if they included a combination of serological tests with different antigens detecting antibodies against T. cruzi, and an additional test to reach a definitive diagnosis if the results were inconclusive.
The study design was considered clinical-comparative or case-control type if a group of participants diagnosed with chronic CD and a group without this diagnosis had been included; and it was considered non-comparative if a consecutive and representative series of patients with suspected CD had taken the test to be evaluated, as well as the reference test.

Risk of bias and applicability
Three authors (SHSC-LXLR-CSC) assessed the methodological quality and risk of bias of the included studies, in a blinded and independent manner, using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool, which comprises four domains: patient selection, index test, reference standard, and flow and timing [17]. Each domain was assessed for risk of bias, and the first three domains were also assessed for applicability.
The QUADAS-2 tool was adjusted to the needs of this review as follows. The risk of bias in patient selection was considered high if a consecutive or random sample of patients had not been used, and unclear if patient recruitment was not specified. The risk of bias related to the index test was considered unclear if the study did not specify that the index test results were interpreted without knowledge of the reference test results. The risk of bias related to the reference test was considered high if the reference test was interpreted with knowledge of the index test results, or if a single reference test had been used (the WHO establishes that serological diagnosis in the chronic phase of CD should be based on positive results in two tests that rely on different immunological principles and, in case of inconsistency, on a third test).

Diagnostic accuracy measures
The reported measures were sensitivity and specificity for each of the index tests assessed for diagnosing chronic CD. When the studies did not have these two measures, they were calculated based on the number of true positives and negatives, as well as on the number of false positives and negatives and the total number of patients.
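The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' extraction procedure; the function name and the counts are hypothetical and are not taken from any included study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    tp/fn: reference-positive participants with a positive/negative index test.
    tn/fp: reference-negative participants with a negative/positive index test.
    """
    diseased = tp + fn        # all participants positive by the reference standard
    non_diseased = tn + fp    # all participants negative by the reference standard
    if diseased == 0 or non_diseased == 0:
        raise ValueError("each group needs at least one participant")
    return tp / diseased, tn / non_diseased


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=190, fp=10)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

The total number of patients serves only as a consistency check here: it should equal the sum of the four cells.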

Summary of results
Sensitivity and specificity were modeled jointly with a bivariate binomial-normal random-effects model, under a gold standard (GS) assumption and also under an imperfect gold standard (IGS) model. The GS models were fitted with both Bayesian and classical approaches, and the IGS model with a Bayesian approach only. Models were compared with the deviance information criterion (DIC) for the Bayesian models and with the likelihood ratio test for the classical models. Six candidate GS models were evaluated, defined by the distribution of the random effects (normal or mixed normal) and the link function (logit, cloglog or probit), and the best model was the one with the smallest DIC, by a difference of at least two points. The specification of the best-fitting model (in bamdit metadiag) was reproduced in the other packages (meta4diag: binomial-normal with probit; metandi and IGS: binomial-normal with logit) to facilitate comparisons.
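The DIC selection rule can be illustrated with a short sketch. The function name and the `min_gap` parameter are hypothetical; the two DIC values are those reported for the ELISA analysis in the synthesis of results, assuming they are listed in the same order as the links (logit, then probit). The DIC values of the other four candidate models are not given in the text and are therefore omitted.

```python
def select_by_dic(dic_by_model, min_gap=2.0):
    """Pick the model with the smallest DIC.

    Returns (best_model, equivalent_models), where equivalent_models are those
    whose DIC is less than `min_gap` points above the best one, i.e. models
    that the two-point rule cannot separate from the best.
    """
    if not dic_by_model:
        raise ValueError("no models to compare")
    ranked = sorted(dic_by_model.items(), key=lambda item: item[1])
    best_name, best_dic = ranked[0]
    equivalent = [name for name, dic in ranked[1:] if dic - best_dic < min_gap]
    return best_name, equivalent


# DIC values reported for ELISA (order of links assumed as stated above).
elisa_dic = {"binomial-normal, logit": 630, "binomial-normal, probit": 631}
best, equivalent = select_by_dic(elisa_dic)
print(best, equivalent)
```

With a one-point difference, the rule flags the logit and probit specifications as equivalent, which is consistent with how the two are treated in the results.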
The bivariate random effects model was used to estimate the overall sensitivity and specificity and their respective 95 % confidence intervals (CI). The results were plotted in forest-plots and ROC space (R DTAplots program), and heterogeneity between studies was assessed visually. R 1.3 software (DTAplots, bamdit::plotcompare and meta4diag::meta-regression) [18], Stata 15 (metandi) [19], midas and JAGS were used to conduct the meta-analysis.

Additional analyses
Meta-regressions were carried out with potential modifiers of diagnostic validity (bamdit plotcompare and meta4diag meta-regression). The variables of interest were study design (clinical comparative or non-comparative), study area (endemic or non-endemic), study risk (low or high risk of bias), sample type (serum, whole blood or not applicable) for the RDTs, and type of test (commercial or in-house) for the ELISA tests but not for the RDTs because of the low number of studies, which made it impossible to estimate them.
All variables were categorized at two levels in both the ELISA and RDT assays to facilitate the comparison of predictive regions and validity estimates. A QUADAS-2 assessment was applied to each study to allow analysis by subgroups. The three levels of the QUADAS-2 were collapsed into two: low risk and high risk (which included the high risk and unclear categories). Of the 7 items of the tool, only item 1 (patient selection) and item 3 (reference standard) were considered, since they were the only ones with a sufficient number of studies at high risk of bias. In the remaining items, most studies were at low risk.
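The two-level recoding of the QUADAS-2 judgements can be written as a small mapping. This is an illustrative sketch; the function name and the example list of judgements are hypothetical.

```python
def collapse_quadas_risk(judgement):
    """Collapse a three-level QUADAS-2 judgement into the two levels used for subgroups.

    "low" stays "low"; "high" and "unclear" are both recoded as "high".
    """
    key = judgement.strip().lower()
    if key == "low":
        return "low"
    if key in {"high", "unclear"}:
        return "high"
    raise ValueError(f"unexpected QUADAS-2 judgement: {judgement!r}")


# Hypothetical judgements for one domain across four studies.
judgements = ["Low", "Unclear", "High", "low"]
print([collapse_quadas_risk(j) for j in judgements])
```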
A sensitivity analysis was carried out excluding influential outliers. Influential studies were identified on the basis that the posterior interval of each study's weight should include one. Publication bias was assessed using Deeks' asymmetry test, which was considered statistically significant at p < 0.1 [20].
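For readers unfamiliar with Deeks' test, the sketch below shows the regression behind it: the log diagnostic odds ratio of each study is regressed on the inverse square root of its effective sample size (ESS), weighted by ESS, and asymmetry is judged from the slope. This is a plain-Python illustration under stated assumptions, not the Stata midas implementation used in the review; the function names, the 0.5 continuity correction and the 2x2 counts are assumptions made for the example, and the p-value step (a t test on the slope with n - 2 degrees of freedom) is left out.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def log_dor(tp, fn, tn, fp):
    """Log diagnostic odds ratio; 0.5 is added to every cell if any cell is zero."""
    if 0 in (tp, fn, tn, fp):
        tp, fn, tn, fp = (cell + 0.5 for cell in (tp, fn, tn, fp))
    return math.log((tp * tn) / (fp * fn))


def weighted_line(xs, ys, ws):
    """Weighted least squares fit of y = a + b*x; returns (a, b, standard error of b)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three studies")
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        raise ValueError("all studies have the same effective sample size")
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2 for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope


def deeks_regression(tables):
    """tables: iterable of (tp, fn, tn, fp). Returns (slope, standard error of slope)."""
    xs, ys, ws = [], [], []
    for tp, fn, tn, fp in tables:
        ess = effective_sample_size(tp + fn, tn + fp)
        xs.append(1.0 / math.sqrt(ess))
        ys.append(log_dor(tp, fn, tn, fp))
        ws.append(ess)
    _, slope, se_slope = weighted_line(xs, ys, ws)
    return slope, se_slope


# Hypothetical 2x2 tables (tp, fn, tn, fp), for illustration only.
tables = [(90, 10, 180, 20), (45, 5, 95, 5), (190, 10, 380, 20), (28, 2, 57, 3)]
slope, se_slope = deeks_regression(tables)
print(f"slope = {slope:.3f}, SE = {se_slope:.3f}")
```

A slope that differs from zero at the chosen threshold (p < 0.1 in this review) suggests that small and large studies report systematically different accuracy.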

Study selection
As shown in Fig 1, 897 publications were initially identified, of which 739 were eliminated as duplicates across the databases. Of the remaining 158 publications, 75 did not meet the selection criteria in the review by title and abstract. Of the remaining 83 articles, 40 were excluded for the following reasons: 17 had an inadequate study design; 9 did not meet the diagnostic reference test criteria suggested by the WHO; 8 evaluated an index test other than those under study in this review (ELISA or RDT); 4 used a population in the acute phase of the disease or analyzed subjects in both the acute and chronic phases; and 2 included patients previously treated for CD. Finally, 43 full-text articles were used for qualitative and quantitative analysis [9,10,.
Thirty-three articles contained several substudies (76 for ELISA tests and 39 for RDTs), covering aspects such as tests with different peptides, populations from different countries, participation of several reference laboratories, or evaluation of different index tests. These data are presented and analyzed separately in the present investigation. Six articles did not report sensitivity and specificity [22,24,25,37,38,51] but were included because they reported the numbers of true and false positives and negatives. A total of 30,356 participants were reported in the 43 articles (from 56 to 10,284 subjects per study).
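The selection flow can be checked arithmetically with the counts given above. The variable names and the short labels for the exclusion reasons are introduced here for illustration only.

```python
identified = 897
duplicates = 739
excluded_by_title_abstract = 75

# Full-text exclusions, by reason, as listed in the text.
full_text_exclusions = {
    "inadequate study design": 17,
    "reference test criteria not met": 9,
    "index test other than ELISA or RDT": 8,
    "acute-phase or mixed-phase population": 4,
    "previous treatment for CD": 2,
}

after_deduplication = identified - duplicates                         # 158
full_text_reviewed = after_deduplication - excluded_by_title_abstract  # 83
excluded_full_text = sum(full_text_exclusions.values())                # 40
included = full_text_reviewed - excluded_full_text                     # 43

print(after_deduplication, full_text_reviewed, excluded_full_text, included)
```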

Risk of bias and applicability
The quality assessment of the studies included in the analysis of ELISA tests is shown in S1 and S2 Figs, and that of the RDT studies in S3 and S4 Figs. The risk of bias was assessed in the four domains:
1. Patient selection was assessed as high risk of bias in 19 articles for ELISA tests and in 6 for RDTs because a consecutive or random sample of patients was not used; as unclear in 5 articles for ELISA tests because patient recruitment was not specified; and as low risk in 9 studies for RDTs.
2. The risk of bias related to the index test was assessed as unclear in 19 studies for ELISA and in 7 for RDTs because they did not state clearly whether the index test results were interpreted without knowledge of the reference standard results.
3. The risk of bias related to the reference test was classified as high in 19 studies for ELISA tests and in 7 for RDTs, because the result of the reference test was interpreted with knowledge of the index test results, or because a single diagnostic test was used as the reference standard [62,63].
4. Flow and timing were assessed as high risk of bias in four studies for ELISA tests, as not all patients received the same reference standard, while all included studies for RDTs were found to be at low risk of bias in this domain.
Regarding applicability in the first three domains, 100% of the articles, for both ELISA tests and RDTs, were classified as low risk because they matched the review question.

Synthesis of results
Selecting the model for ELISA and Rapid Diagnostic Tests. The Bayesian GS model that best fit the ELISA data was the binomial-normal with either the logit or the probit link function (DIC = 630 and DIC = 631, which can be considered equivalent [S1 Table]). For the RDT analysis, the best fit was the binomial-normal probit model (DIC = 416 [S2 Table]); thus, a probit link was used for both tests.
Sensitivity and specificity of ELISA tests. The ELISA tests had an overall sensitivity of 99% (95% CI: 98-99) and an overall specificity of 98% (95% CI: 97-99) (Fig 2). Some studies were outliers in sensitivity [33] and specificity [25]. In the predictive region, greater variability is observed for specificity than for sensitivity (Fig 3), which indicates greater heterogeneity in specificity than in sensitivity.
Sensitivity and specificity of Rapid Diagnostic Tests. The RDTs had an overall sensitivity of 95% (95% CI: 94-97) and an overall specificity of 97% (95% CI: 96-98). Some studies had atypical values or outliers [10,54], that is, values that did not follow the pattern of most studies and strayed from the overall trend (Fig 4). The predictive region (Fig 5) is more symmetric, and slightly higher variability is observed for sensitivity, which indicates slightly greater heterogeneity for sensitivity.

PLOS NEGLECTED TROPICAL DISEASES
Subgroup analysis. For the ELISA tests, sensitivity estimates were similar across subgroups. The point estimates ranged from 98% to 99%, and the CIs ranged from 97% to 99%. Specificity ranged between 95% and 100%; the subgroup with low risk of bias showed the lowest specificity, 95% (95% CI: 91-97), and the non-endemic area subgroup the highest, 100% (95% CI: 99-100) (S5 Fig). For the ELISA tests, the moderators that showed the greatest difference in the predictive region were study design (clinical comparative or non-comparative), where the comparative studies presented low heterogeneity and the non-comparative studies greater heterogeneity; and study risk (low or high risk of bias), where the studies at risk of bias were similar to one another but showed lower precision (S6 Fig).
Regarding the RDTs, the moderators that showed the greatest difference in the predictive region were study area (endemic or non-endemic), sample type (serum, whole blood or not applicable), and study risk (low or high risk of bias) (S8 Fig).
Analysis of influential observations. Four studies [9,25,33,34] were the most influential for the ELISA tests, and five [10,23,40,52,54] for the RDTs. For each test, two models were fitted (one with all studies and one excluding the influential ones) in order to observe the effect of excluding influential studies on the accuracy and predictive region. Both models showed a similar predictive region. The influential studies of ELISA tests were always studies with a clinical-comparative design, whereas those of RDTs were non-comparative studies (S9 Fig).
Publication bias. Asymmetry was observed in the funnel plots for ELISA tests, whereas no asymmetry was observed for the RDTs. The result of the Deeks' asymmetry test was statistically significant for ELISA tests (p < 0.01), but not for RDTs (p = 0.64).

Summary of evidence
The combined sensitivity and specificity of ELISA tests were 99% (95% CI: 98-99) and 98% (95% CI: 97-99), whereas those of RDTs were 95% (95% CI: 94-97) and 97% (95% CI: 96-98). The overall sensitivity of RDTs was lower than that of ELISA tests. According to the results obtained in this meta-analysis, the sensitivity and specificity of ELISA tests were higher than those reported in another meta-analysis in which ELISA tests were compared with RDTs [64], where the sensitivity was 97.7% (96.7%-98.5%) and the specificity 96.3% (94.6%-97.6%). The sensitivity of ELISA tests was also higher than in another meta-analysis of different types of tests for diagnosing CD, which reported a sensitivity of 90% (89%-91%) and a specificity of 98% (98%-98%) [65], the latter matching this study. Regarding RDTs, the sensitivity and specificity obtained in this study were lower than those documented in another meta-analysis that included clinical trials recruiting cohorts of individuals at risk of exposure to T. cruzi, which were 96.6% (95% CI: 91.3-98.7) and 99.3% (95% CI: 98.4-99.7), respectively [66].
Regarding heterogeneity, in the ELISA tests it was higher for specificity than for sensitivity, which is similar to what was reported by Brazil [64]. In the RDTs, heterogeneity was limited and slightly higher for sensitivity, which coincides with the systematic review carried out by Angheben [39] and differs from the results reported by Afonso [65].
This meta-analysis showed greater specificity for ELISA tests in non-endemic areas, and greater sensitivity for endemic areas. The diagnostic performance of RDTs was the same for endemic and non-endemic areas, unlike what Angheben et al. [66] reported, where performance was higher in endemic than non-endemic areas.
No differences in sensitivity or specificity were observed between the in-house and commercial ELISA tests. This differs from the meta-analysis by Afonso et al. [65], where the commercial tests were more sensitive than the in-house ones; the specificity, however, was similar between the two, as observed in this meta-analysis.

All ELISA assays were performed on serum, with the exception of the study by Cervantes-Landín et al. [46], who used venous blood impregnated on filter paper before coagulation. RDTs, in contrast, were performed on all types of samples: mostly serum [23,26,32,36,50,54,56,60], but also venous blood and serum [23,43,52]; venous blood [40,61]; capillary blood [10,37]; and serum, venous blood, and capillary blood [47]; one study did not specify the sample type [39]. RDTs are frequently used as point-of-care tests (immediate diagnostic analyses), which are performed outside the laboratory, closer to the patient, with easily transportable material and equipment, and whose results are available within minutes or in less than one hour. Their application is greater in developing countries [67], since samples such as capillary blood, which is easy to collect without collection tubes or centrifugation, would be the best choice for RDTs. The results obtained in this study by sample type allow us to infer that the diagnostic performance of the RDTs was good regardless of the type of sample.
The trypanosomatids that affect humans in the Americas belong to the Leishmania and Trypanosoma genera. ELISA tests have been valuable for diagnosing these two agents, but their specificity may be low due to cross-reactivity between the two parasites; this is important to take into account, especially when estimating the prevalence of these two diseases in endemic areas [68]. Furthermore, most of the areas where T. cruzi is found are co-endemic for Leishmania sp. and T. rangeli, which complicates the diagnosis of CD [69]. This is consistent with 13 of the studies analyzed, in which the percentage of cross-reaction with various diseases, most notably leishmaniasis, ranged between 0% and 62.5%.

Limitations
The items in which a risk of bias was detected in the most articles were patient selection and reference standard, for both types of tests assessed; furthermore, the risk of bias associated with the index test was unclear in more than 50% of the studies for ELISA tests. On the other hand, in the applicability items, the selected studies did not present a risk, similar to what was described in the meta-analysis by Angheben et al. [66] in relation to the quality of the RDT tests. For the ELISA tests, the sensitivity in the studies with low or high risk of bias was very similar, unlike what was described by Afonso et al. [65], who reported higher sensitivity in low-risk studies than in high-risk ones. Specificity was higher in the articles with a high risk of bias, again unlike that same meta-analysis, where the low-risk studies showed greater specificity. 46.5% of the studies were comparative, and 2.3% were mixed (comparative and non-comparative). This affects the quality of the included studies, because the best design for assessing the validity of diagnostic tests is the non-comparative one (cohort or cross-sectional), in which a consecutive and representative series of patients with suspected disease receive both the test under evaluation and the reference test, applied in a blind and independent manner and interpreted without any additional clinical information that would not be available when the test is used in practice.
In the literature found on this topic, most studies evaluating diagnostic tests take into account the blinding of the reference test results, but few of them have used this non-comparative design [70]. In this meta-analysis, lower sensitivity and specificity values have been found in RDT studies with a comparative clinical design, where a group of patients diagnosed with the disease and a group without this diagnosis were included. This is an unexpected result since studies with a clinical comparative or case-control design tend to overestimate sensitivity and specificity; however, it is possible that this effect only occurs when severe cases are included in the case group [71]. Additionally, a high risk of publication bias was identified for ELISA tests, which coincides with the study by Afonso et al. [65].
The meta-analysis could have been affected by the search made in only four databases because not all studies related to the topic might have been included. Regarding the studies included, ELISA and RDT tests were not compared because only one study [36] had done so; the same was true for the in-house and commercial RDT subgroups which could not be assessed since only one study [26] included in-house RDTs.
The quality of the systematic review was also influenced by the results of the selected articles and their design (some of which were clinical-comparative); that is, some variables could not be explored in the subgroup analysis because they were not reported in many studies, for instance, study period, cross reactions, distribution by rural and urban area, generation or version of diagnostic tests, estimation of cut-off points, geographic areas of the strains used as sources of antigens, and the type of antigen (multiepitope or combination of recombinant proteins).
Subsequent studies should follow the instructions given by the WHO and carry out two tests in parallel using different antigens due to the immunogenic diversity of the different strains of the parasite, the immune response of the patients, and the existence of cross-reactions with other trypanosomatids that coexist in endemic areas [72]. Likewise, a sufficient number of samples should be included to evaluate the cross-reactions between chronic CD and Leishmania infection, considering that patients with either of the two infections, or with mixed infection, may be misdiagnosed given the crossed serological reactions when combinations of uncharacterized antigens are used [22].
Regarding the current status of the implementation of RDTs for diagnosing chronic CD in endemic areas of Latin America, these are used as tests of choice in screening programs for its detection, early treatment and control, and they represent a first approach at point of care for the rapid diagnosis of CD in endemic countries [73]. In addition, since RDTs are easier to use than ELISA, it would be feasible to use them more often in screening programs, which would facilitate the detection of CD cases without ignoring the current recommendations to confirm positive results through conventional methods.

Conclusions
According to this systematic review, ELISA and rapid diagnostic tests (RDTs) have a high validity for diagnosing chronic CD; however, the overall sensitivity of RDTs was lower than that of ELISA, so it is important to better study the variables that influence the validity of RDTs, some of which were not taken into account in some of the included studies.
The usefulness of RDTs for screening for CD in epidemiological contexts, such as endemic regions that are difficult to access or non-endemic regions with a high prevalence of chronic CD, should also be assessed, as well as the inclusion of RDTs in the diagnostic algorithms used for its detection, in order to improve access to treatment from the first level of primary health care.