Accuracy of molecular biology techniques for the diagnosis of Strongyloides stercoralis infection—A systematic review and meta-analysis

Background Strongyloides stercoralis infection is a neglected tropical disease which can lead to severe symptoms and even death in immunosuppressed people. Unfortunately, its diagnosis is hampered by the lack of a gold standard, as the sensitivity of traditional parasitological tests (including microscopic examination of stool samples and coproculture) is low. Hence, alternative diagnostic methods, such as molecular biology techniques (mostly polymerase chain reaction, PCR), have been implemented. However, there are discrepancies in the reported accuracy of PCR. Methodology A systematic review with meta-analysis was conducted in order to evaluate the accuracy of PCR for the diagnosis of S. stercoralis infection. The protocol was registered with PROSPERO International Prospective Register of Systematic Reviews (record: CRD42016054298). Fourteen studies, 12 of which evaluated real-time PCR, were included in the analysis. The specificity of the techniques was high (ranging from 93 to 95%, depending on the reference test(s) used). When all molecular techniques were compared to parasitological methods, the sensitivity of PCR was 71.8% (95% CI 52.2–85.5), decreasing to 61.8% (95% CI 42.0–78.4) when serology was added to the reference tests. Similarly, the sensitivity of real-time PCR was 64.4% (95% CI 46.2–77.7) when compared to parasitological methods only, and 56.5% (95% CI 39.2–72.4) when serology was included. Conclusions PCR might not be suitable for screening purposes, whereas it might have a role as a confirmatory test.

Introduction Strongyloides stercoralis is a soil-transmitted helminth (STH) affecting around 370 million people worldwide, particularly in remote rural areas [1].
Chronic strongyloidiasis is characterized by non-specific, mostly mild symptoms involving the gastrointestinal tract (abdominal pain, diarrhea), the respiratory system (symptoms resembling asthma or chronic obstructive pulmonary disease), and the skin (pruritus, rash) [2]. However, in immunosuppressed individuals the infection can become severe, with complications due to a heavier load of parasites, including intestinal obstruction, paralytic ileus, respiratory failure, and death [2]. Hence it is recommended to diagnose and treat strongyloidiasis while it is still in the chronic, indolent phase. First-line treatment is ivermectin, which has demonstrated a good safety profile and is highly effective for chronic infection [3]. On the other hand, treatment of the severe syndrome is more complicated, as failures tend to occur with the standard regimens [4].
A gold standard for the diagnosis of strongyloidiasis is still lacking [5]. Microscopic examination of stools has insufficient sensitivity, and enrichment techniques (Ritchie's method, for instance) and examination of multiple samples can only partially improve the performance of the method [6]. The Baermann method has a sensitivity about four times higher than the formol-ether concentration technique (FECT); however, it is cumbersome and its sensitivity is still inadequate [5,6]. The sensitivity of agar plate culture (APC; in particular, the technique described by Koga) is comparable to that of the Baermann method [5]. There are different serological tests, some of which are commercially available. Overall, serology has demonstrated high sensitivity (ranging from 70 to 95%, depending on the test, according to a diagnostic study on multiple serological tests) [7], but there are concerns about its specificity, because of possible cross-reactions with other parasites and long-term persistence of antibodies after effective treatment. A recombinant antigen (NIE) has been used to increase the specificity of serological methods such as ELISA and the luciferase immunoprecipitation system (LIPS) [8,9].
The latter test (NIE-LIPS) in particular demonstrated high specificity and a sensitivity equivalent to that of the other serological tests, but the technique is not commercially available at the moment and has so far been used only for research purposes [7].
Molecular methods have been implemented in this context, with the aim of achieving the highest sensitivity while preserving high specificity. However, different studies report either better [10] or worse [11] accuracy of molecular methods compared to other fecal-based methods. Some variables (such as the setting in which the research was conducted, the population, the type of molecular technique, and the comparator) might influence the overall evaluation of their accuracy. Therefore, the accuracy of molecular biology techniques for the diagnosis of S. stercoralis infection should be better defined, and so should their role in different settings.
The aim of this work was to review the accuracy of molecular biology techniques for the diagnosis of S. stercoralis infection.

Methods
The protocol was registered with PROSPERO International Prospective Register of Systematic Reviews (record: CRD42016054298) on December 29th, 2016. All relevant studies were reviewed, regardless of language or publication status (published, unpublished, in press, and ongoing). The reference lists of all included studies were screened for other potentially relevant research, and the authors' personal collections (grey literature) were also reviewed.

Selection of studies
Inclusion criteria:
• Cohort studies
• All studies evaluating a molecular biology technique (either conventional polymerase chain reaction (PCR), nested PCR, real-time PCR (qPCR), or loop-mediated isothermal amplification (LAMP)) in comparison to serology and/or fecal-based methods "specific" for the diagnosis of S. stercoralis infection (Baermann method, agar plate culture, Harada-Mori culture, combination of fecal methods).
• Studies that pooled multiple intestinal parasites into one outcome measure (for example, multiplex PCR including other soil-transmitted helminths) were included when it was possible to disaggregate the data.
• Studies conducted in endemic as well as in non-endemic areas.
• Studies conducted on either immunocompetent or immunosuppressed patients.

Exclusion criteria:
• Case-control studies
• Non-human studies
• Duplicate publications

Two authors, DB and ARM, reviewed the titles and abstracts yielded by the search, and identified all studies that potentially met the inclusion criteria. DB contacted some authors requesting additional information on published data and/or other potentially relevant unpublished data. After obtaining the full-text articles of the records selected as potentially relevant, DB and ARM independently assessed whether or not each study met the inclusion criteria using an eligibility form in Excel. When DB and ARM did not reach a consensus, a third reviewer (AA) made the final inclusion decision.

Data collection process
DB and ARM independently performed the data extraction, which included sensitivity and specificity values and other covariates, namely: reference test (divided into four categories: serology, culture, Baermann, combination of parasitological exams), setting (endemic/non-endemic area), population (children, adults, all ages, not specified), and immunological status (immunocompetent, immunosuppressed, not specified). For the purposes of the study, subjects were classified as infected or not infected if they tested positive or negative, respectively, to the reference standard test(s). The sensitivity of the index test(s) was calculated as the proportion of true positives (positive at the index test over all infected), and the specificity as the proportion of true negatives (negative at the index test over all not infected). Studies evaluating more than one molecular method or using more than one reference standard test were split into sub-studies. Any disagreements regarding the data extraction were resolved by discussion between the two authors. When necessary, a third review author (AA) facilitated the discussion until consensus was reached.
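The definitions above can be illustrated with a short sketch. This is not the extraction form used by the review authors; the `TwoByTwo` class and the counts in the usage line are hypothetical and serve only to show how sensitivity and specificity follow from a 2x2 table of index test versus reference standard.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of index-test results cross-classified by the reference standard."""
    tp: int  # index positive, reference positive (infected)
    fn: int  # index negative, reference positive (infected)
    fp: int  # index positive, reference negative (not infected)
    tn: int  # index negative, reference negative (not infected)


def sensitivity(t: TwoByTwo) -> float:
    # Proportion of infected subjects that the index test detects.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Proportion of not-infected subjects that the index test classifies as negative.
    return t.tn / (t.tn + t.fp)


# Hypothetical counts, for illustration only (not taken from any included study).
example = TwoByTwo(tp=30, fn=10, fp=5, tn=95)
print(sensitivity(example), specificity(example))
```

A study split into sub-studies, as described above, would simply contribute one such table per combination of index test and reference standard.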

Risk of bias (quality) assessment
DB and ARM independently assessed the methodological quality of each included study using the QUADAS-2 tool [12]. Hence, four key domains were evaluated in terms of risk of bias: patient selection, index test, reference standard, flow and timing. When necessary, a third review author, AA, facilitated discussion until consensus was reached. All assessments were summarized in 'Risk of bias' tables.

Statistical analysis
The values of sensitivity and specificity were automatically computed in RevMan 2014 (Version 5.3) [13]. Individual study results were graphically expressed by plotting the estimates of sensitivity and specificity and their 95% confidence intervals (CIs) in both forest plots and receiver operating characteristic (ROC) space. Heterogeneity was first evaluated by inspecting forest plots to detect overlapping 95% CIs, then by using a bivariate random-effects model [14] to obtain estimates of the between-study variation in sensitivity and specificity and of the correlation between the two. The same bivariate model was used to assess the operating point sensitivity and specificity of the diagnostic tests under scrutiny, together with likelihood ratios and the summary diagnostic odds ratio (DOR), taking both heterogeneity and threshold effect into account. Also, for each study, we estimated the true prevalence using the apparent prevalence, test sensitivity and specificity, as described by Rogan and Gladen [15]. Finally, we used the hierarchical summary ROC (HSROC) model [16] to obtain an adjusted ROC curve that summarized the results of all studies. All analyses were performed using all articles first, then they were repeated considering only those with parasitological methods (defined as the use of either stool culture, Baermann, or a combination of the two) as the reference test. This was considered the primary analysis. In order to have a more precise estimate of the influence of real-time PCR, we also conducted a secondary analysis, repeating the primary analysis only on studies that used real-time PCR as the index test. All analyses were performed using Stata IC 13.0.
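As an illustration of two of the quantities mentioned above, the following sketch implements the Rogan and Gladen estimator of true prevalence and the plug-in formulas for likelihood ratios and DOR. It is a simplified stand-in, not the Stata analysis run by the authors: in the review these summary measures come from the bivariate model, whereas here they are computed directly from a single pair of sensitivity and specificity values. The function names, the clamping to [0, 1], and the apparent prevalence of 10% are assumptions made for the example.

```python
def rogan_gladen(apparent_prevalence: float, sensitivity: float, specificity: float) -> float:
    """Estimate true prevalence from apparent prevalence (Rogan-Gladen estimator)."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent_prevalence + specificity - 1.0) / denominator
    # Clamping is an added convenience (assumption): the raw estimator
    # can fall outside [0, 1] when the apparent prevalence is very low or high.
    return min(1.0, max(0.0, estimate))


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float, float]:
    """Return (LR+, LR-, DOR) computed directly from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Sensitivity and specificity are the pooled values reported in this review
# for PCR versus parasitological methods; the apparent prevalence is hypothetical.
print(rogan_gladen(0.10, 0.718, 0.934))
print(likelihood_ratios(0.718, 0.934))
```

Because the bivariate model accounts for between-study variation and the correlation between sensitivity and specificity, its summary likelihood ratios and DOR need not coincide with these plug-in values.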

Results
The electronic search identified 1334 records from the following databases: MEDLINE (448 records retrieved), Embase (516 records), CENTRAL (Cochrane library, 4 records), Lilacs (362 records); a search of trial registries identified 4 further studies. The study flow is summarized in Fig 1. Eventually, 14 studies were included in both the quantitative and the qualitative analyses. However, some studies evaluated either more than a single molecular method on the same pool of patients (in comparison to the same reference test) or a single molecular method on different subsets of patients (according to the results of different reference tests). In particular: two studies evaluated more than one molecular method (de Paula et al [17] tested both conventional and real-time PCR, Sharifdini et al [18] tested both nested and real-time PCR), one study [19] evaluated the same real-time PCR method performed in two different laboratories, and one study [20] evaluated the same index test (real-time PCR) both on patients positive to serology and on patients positive to APC. To handle and examine all these cases, we considered any experiment reported in a published paper as a separate study (Table 1).
Therefore, 4 out of the 14 included studies generated more than one set of sensitivity and specificity estimates. Globally, the included studies comprised a total of 3060 participants (from 54 [21] to 466 [18] individuals tested). Of note, 12 of 14 studies evaluated a real-time PCR technique, and all of them used the method described by Verweij et al [10], which employs primers targeting the S. stercoralis 18S ribosomal RNA gene. A different target DNA was used in a couple of studies [21,23] only. Four studies evaluated either conventional [17,23] or nested [18,21] PCR. In addition, information on immunological status of the individuals tested was collected: only one study was conducted in immunocompromised patients [28]. Three studies compared PCR with serology. In a couple of cases the serology was a commercial ELISA test based on somatic antigens from Strongyloides L3 larvae [21,28], while the other study used an in-house IFAT based on intact S. stercoralis filariform larvae [20].
The information about the methods for the preservation of the biological samples is reported in the supporting information (S1 Table). The samples were mostly kept frozen or preserved in ethanol until DNA extraction. In a few studies, the samples were kept at room temperature or refrigerated, and processed within a short time. Only one study did not report the method for preserving the stool sample before the DNA extraction [28]. Another study protocol entailed the use of filter papers [23]. DNA extraction was performed with a commercial kit in almost all cases (S1 Table). Only Sharifdini et al [18] used an in-house method described previously [29]. The DNA extraction method was not reported in one case [28]. Most studies reported the use of controls for PCR inhibition (9 studies out of 14, S1 Table), and seven studies included controls for DNA extraction. Neither PCR inhibition nor DNA extraction controls were reported by four studies. The validation of the PCR methods included the determination of a limit of detection (LOD) in four studies [17,18,23,26] only. Shar et al [25] reported the determination of LOD in the methods, but the value was not specified in the results. Figs 2 and 3 show the results of the qualitative evaluation, in terms of rating for each included study and overall methodological quality, respectively.
As reported in the introduction, the evaluation of diagnostic tests for S. stercoralis is hampered by the lack of a gold standard. Therefore, the risk of bias associated with the reference test (possible incorrect classification) was assessed as unclear for all studies. Only two studies [11,20] applied any of the methods suggested for reporting diagnostic accuracy in the absence of a gold standard [14]. In particular, Buonfrate et al [20] used a composite reference standard (CRS), while Knopp et al [11] applied a Bayesian latent class analysis (BLCA). Data from these studies were extracted, as for the other studies, in relation to the comparison of PCR to the other tests (without considering CRS or BLCA), in order to obtain a more homogeneous evaluation of the index test. However, the results of CRS and BLCA were then compared to the global results of the included studies. In the domain of patient selection, the risk of bias was assessed as unclear for 7 studies. For 6 out of 7 studies, the reason was that the papers did not clearly report some relevant details about the patient sampling: whether the sampling was random or consecutive, or whether inappropriate exclusions were avoided. For one paper, the unclear risk was mainly due to the retrospective design of the study [20]. Finally, one paper clearly reported that patient sampling was not random, hence the risk of bias was assessed as high [11]. However, applicability concerns were assessed as low for all studies except one that evaluated the PCR accuracy in a cohort of cancer patients [28]. Fig 4 shows the accuracy reported in each study.
The forest plot showed discrepancies in the results of the studies, particularly regarding sensitivity. As we included studies comparing PCR with different reference tests, this heterogeneity was partially expected. Nonetheless, we assessed the between-study variation in sensitivity and the degree of correlation between sensitivity and specificity by using the bivariate random-effects approach introduced by Reitsma et al [14]. The variance of the logit of the sensitivity was 2.50 (95% CI: 1.12 to 5.49) and the correlation between the logit of sensitivity and the logit of specificity was -0.51 (95% CI: -0.82 to 0.02). Thus, we fitted a bivariate model to take heterogeneity into account as much as possible and to obtain pooled accuracy estimates of PCR versus all other techniques. The summary ROC curve obtained through the hierarchical random-effects approach (HSROC) [30] is displayed in Fig 5. When studies comparing PCR with serology-positive patients were excluded from the analysis, the sensitivity was 71.8% (95% CI: 52.2 to 85.5), and the specificity 93.4% (95% CI: 90.3 to 95.6). Real-time PCR techniques were then analyzed separately (Table 2), showing sensitivity and specificity values of 56.5% (95% CI 39-72) and 95.4% (95% CI 92-97), respectively.
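Since the bivariate model works on the logit scale, the reported intervals are not symmetric around the point estimates on the percentage scale. The sketch below back-transforms the midpoint of an interval on the logit scale to show this. It assumes that the interval was built symmetrically around the estimate on the logit scale, which is a common construction but is not stated in the text; the function names are chosen for this example only.

```python
import math


def logit(p: float) -> float:
    # Log-odds transform used by the bivariate model.
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    # Inverse of the logit transform.
    return 1.0 / (1.0 + math.exp(-x))


def logit_midpoint(lower: float, upper: float) -> float:
    """Back-transform the midpoint, on the logit scale, of an interval of proportions."""
    return expit((logit(lower) + logit(upper)) / 2.0)


# Interval reported above for sensitivity versus parasitological methods.
print(logit_midpoint(0.522, 0.855))
```

For the interval 52.2 to 85.5, the back-transformed midpoint lies close to the reported 71.8%, whereas the arithmetic midpoint of the two limits (68.85%) does not.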

Discussion
Conventionally, PCR for S. stercoralis is considered 100% specific on the basis of the intrinsic characteristics of the technique. Although not confirming this value, the meta-analysis demonstrated a high specificity of PCR for the diagnosis of S. stercoralis infection, ranging from 93 to 95% according to the reference test. Moreover, it must be considered that the different reference standards used in the studies (implying that a sample PCR positive, but negative to all other fecal tests, is classified as a false positive) have most probably caused some underestimation of the specificity. On the other hand, the sensitivity was unsatisfactory, regardless of the reference test used: from 56% when real-time PCR was compared to any other methods (including serology), to 71% when the results of any PCR technique (either conventional, nested or real-time) were compared to fecal methods only. One possible explanation for this low sensitivity, particularly when compared with serological tests, is the irregular larval output observed in chronic strongyloidiasis. Therefore, PCR techniques might face the same problem as the conventional parasitological techniques. As a matter of fact, PCR has not proven to be diagnostically superior to other parasitological techniques such as the Baermann method or APC, particularly in low-density infections where the larval output is low and irregular [5]. Moreover, one cause of the low sensitivity of PCR might be the small quantity of fecal sample analyzed [31], which is particularly relevant when larvae are scarcely shed in feces.
Unfortunately, only a few included studies assessed the LOD of their techniques, which could permit a more accurate evaluation of the sensitivity of PCR in relation to different levels of larval shedding. This information would also be useful to compare different techniques used in different studies, and should be better reported. Of note, the only study using a CRS (including serology) to assess the accuracy of real-time PCR demonstrated a sensitivity of 56.8% [20], which is almost the same value found with the meta-analysis. On the other hand, the only study using a Bayesian approach [11] demonstrated an extremely low sensitivity (11.6%) of real-time PCR.
The sample preservation methods were reported by all but one of the included studies: they were all adequate, and presumably did not affect the results of the PCR. Also, DNA extraction was almost always conducted with commercial kits based on silica-membrane DNA purification. All the automated methods used were highly reliable, and the studies were homogeneous in this respect. Only one study reported an in-house method for DNA isolation, which involves organic solvent extraction and alcohol precipitation.
One reason for the low sensitivity might be the presence of PCR inhibitors, which are commonly found in fecal samples. In fact, some authors did not report the use of controls for PCR inhibition. Knopp et al [11], who found the lowest sensitivity value of real-time PCR (when not considering the studies comparing PCR with serology), declared that the absence of controls for PCR inhibition was one of the limitations of their study. Therefore, we cannot exclude that PCR inhibition occurred and affected the results of some studies. However, most included papers reported the use of controls for PCR inhibition, and even so sensitivity was variable and seldom reached 90%. In any case, these controls are of primary importance to confirm the correct execution of the PCR, and are therefore recommended both in research studies and in routine practice.
Analogously, the use of controls for DNA extraction was not reported by all authors, and it cannot be ruled out that a low efficiency in DNA extraction affected the results of PCR. Also in this case, the use of such controls is recommended both in routine and in research activities.
The interpretation of the results from a clinical point of view is summarized in the summary of findings table (Table 3). PCR is not adequate for universal screening of strongyloidiasis, as it would entail an excessive risk of missing diagnoses of a potentially fatal infection. It could rather be a valid option as a confirmatory test in case of positive serology. Moreover, it could be used as an alternative to other fecal-based tests for the screening of immunosuppressed patients, for whom the sensitivity of serology decreases [32]. However, in this latter group too it should be used in addition to serology, in order to increase case detection in these patients, who are particularly at risk of developing severe infection.
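The screening argument can be made concrete by translating sensitivity and specificity into expected outcomes per 1000 people tested. The sketch below is only an illustration and does not reproduce Table 3: the sensitivity and specificity are the pooled values reported in this review for PCR versus parasitological methods, while the 10% prevalence and the function name `per_thousand` are hypothetical.

```python
def per_thousand(prevalence: float, sensitivity: float, specificity: float, n: int = 1000) -> dict:
    """Expected test outcomes in a cohort of n people, plus predictive values."""
    infected = n * prevalence
    not_infected = n - infected
    tp = infected * sensitivity       # infected and detected
    fn = infected - tp                # infected but missed
    tn = not_infected * specificity   # not infected, test negative
    fp = not_infected - tn            # not infected, test positive
    return {
        "tp": tp, "fn": fn, "tn": tn, "fp": fp,
        "ppv": tp / (tp + fp),  # probability of infection given a positive test
        "npv": tn / (tn + fn),  # probability of no infection given a negative test
    }


# Hypothetical prevalence of 10%; accuracy values taken from the pooled estimates.
result = per_thousand(prevalence=0.10, sensitivity=0.718, specificity=0.934)
print(result)
```

Under these assumptions, about 28 of the 100 infected people per 1000 tested would be missed, which is the kind of risk that argues against using PCR alone for screening.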
Unfortunately, as emerged from the qualitative evaluation of the included studies, the lack of a gold standard for the diagnosis may compromise the results of diagnostic studies. This problem is frequently encountered in parasitology. The comparison of PCR with the fecal methods which have proved to be sufficiently sensitive for the diagnosis of strongyloidiasis (namely, Baermann and APC) could be seen as the best option to validate the accuracy of PCR, as they all rely on larval shedding, indicating the presence of active infection. However, the sensitivity of Baermann and APC is still too low to safely rule out the infection when they are negative. For this reason, using them as reference tests tends to result in an overestimation of the sensitivity of PCR. Serology detects antibodies against larval antigens, hence it does not rely on the presence of larvae in stool, which is often inconstant. Despite the possibility of false positive results (as reported in the introduction), we decided to add the comparison with serology to highlight that the sensitivity of PCR is presumably lower than that found when compared with the other fecal methods.
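The overestimation described above follows from the law of total probability, as the sketch below shows. All three input values are hypothetical and chosen only for illustration, and the calculation assumes a reference test that never gives false positives. Infected people who shed few larvae are likely to be missed both by the fecal reference test and by PCR, so the sensitivity measured among reference-positive subjects exceeds the sensitivity among all infected subjects.

```python
def sensitivity_among_all_infected(
    reference_sensitivity: float,
    pcr_pos_given_ref_pos: float,
    pcr_pos_given_ref_neg_infected: float,
) -> float:
    """PCR sensitivity among all infected subjects (law of total probability).

    Assumes a perfectly specific reference test, so every reference-positive
    subject is truly infected (simplifying assumption).
    """
    detected_by_reference = reference_sensitivity * pcr_pos_given_ref_pos
    missed_by_reference = (1.0 - reference_sensitivity) * pcr_pos_given_ref_neg_infected
    return detected_by_reference + missed_by_reference


# Hypothetical values: the reference detects half of the infections; PCR detects
# 70% of reference-positive and 30% of reference-negative infected subjects.
apparent = 0.70
overall = sensitivity_among_all_infected(0.50, apparent, 0.30)
print(apparent, overall)
```

With these illustrative numbers the apparent sensitivity against the fecal reference is 70%, while the sensitivity among all infected subjects is 50%.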
Although methods to assess test accuracy in the absence of a gold standard have been proposed [14], they are seldom applied, as our review also showed (only a couple of studies proposed an alternative model for the classification of the results). Indeed, our investigation highlighted that, in the absence of a validated reference standard, different studies considered different reference tests for the evaluation of the accuracy of PCR, leading to difficulties in the direct comparison of the results.
Another limitation of our study is that it was not possible to analyze the influence of setting and age on the accuracy, because of the relatively low number of studies included in the meta-analysis. Due to the distinct pools of patients (defined through the different reference tests) of the PCR experiments included in the analysis, a certain degree of heterogeneity was inevitably expected. Indeed, the measure of correlation between sensitivity and specificity provided evidence of a heterogeneity that should not be ignored. This heterogeneity may also be largely caused by variations between tests in terms of country setting or population age, or by a threshold effect. Nonetheless, the use of statistical techniques that take this heterogeneity into account for the estimation of summary measures, such as the bivariate model by Reitsma et al [14], allowed for exhaustive and robust estimates, as shown in Table 2. As the number of studies included did not allow for a proper analysis of all possible sub-cases of index-reference tests, these estimates should be considered as pooled accuracy measures of the PCR techniques versus all other techniques.

Conclusions
In summary, the results of this review suggest that, although the PCR technique is highly specific, it should not yet be recommended for universal screening, nor as a stand-alone method for the individual diagnosis of S. stercoralis infection. However, PCR has a role as a confirmatory test. Additional studies investigating the accuracy of this and other diagnostic tests for this infection, using appropriate methods to cope with the absence of a gold standard, are needed to improve the screening and management of this neglected infection.