Evaluation of PCR on Bronchoalveolar Lavage Fluid for Diagnosis of Invasive Aspergillosis: A Bivariate Metaanalysis and Systematic Review

Background: Nucleic acid detection by polymerase chain reaction (PCR) is emerging as a sensitive and rapid diagnostic tool. PCR assays on serum have the potential to be a practical diagnostic tool. However, PCR on bronchoalveolar lavage fluid (BALF) has not been well established. We performed a systematic review of published studies to evaluate the diagnostic accuracy of PCR assays on BALF for invasive aspergillosis (IA).
Methods: Relevant published studies were shortlisted to evaluate the quality of their methodologies. A bivariate regression approach was used to calculate pooled values of the method sensitivity, specificity, and positive and negative likelihood ratios. Hierarchical summary receiver operating characteristic curves were used to summarize overall performance. We calculated the post-test probability to evaluate clinical usefulness. Potential heterogeneity among studies was explored by subgroup analyses.
Results: Seventeen studies comprising 1191 at-risk patients were selected. The summary estimates of the BALF-PCR assay for proven and probable IA were as follows: sensitivity, 0.91 (95% confidence interval (CI), 0.79–0.96); specificity, 0.92 (95% CI, 0.87–0.96); positive likelihood ratio, 11.90 (95% CI, 6.80–20.80); and negative likelihood ratio, 0.10 (95% CI, 0.04–0.24). Subgroup analyses showed that the performance of the PCR assay was influenced by PCR assay methodology, primer design and the methods of cell wall disruption and DNA extraction.
Conclusions: PCR assay on BALF is highly accurate for diagnosing IA in immunocompromised patients and is likely to be a useful diagnostic tool. However, further efforts towards devising a standard protocol are needed to enable formal validation of BALF-PCR.


Introduction
Invasive aspergillosis (IA) is the most common opportunistic invasive fungal infection in immunocompromised patients, especially those with prolonged neutropenia [1]. In patients with hematological malignancies (HM), the prevalence of IA ranges from 1% to 15% and mortality can reach 90%, despite the availability of several active antifungal agents [1].
Early diagnosis of IA remains a challenge, and few diagnostic methods are available. The galactomannan (GM) assay may be useful in establishing an earlier diagnosis and may result in improved outcomes for immunocompromised patients [2,3]. The specificity of GM detection in serum generally reaches over 90%. In bronchoalveolar lavage fluid (BALF), GM detection can lead to an earlier diagnosis of IA in patients with HM, yielding an increased sensitivity when compared with serum, e.g., 85-100% versus 47% [4].
Molecular diagnostic techniques such as nucleic acid detection by polymerase chain reaction (PCR) are emerging as potentially more sensitive and rapid than conventional techniques for the diagnosis of IA [5,6]. In addition to being helpful in IA diagnosis, DNA amplification with an integrated system for species-level identification, based on melt-curve profiles or an additional probe, would save time and refine the diagnosis of specific infections, allowing targeted antifungal therapy to be administered [7]. A systematic review that assessed the use of PCR on blood and serum samples showed its clinical value and recommended standardization of PCR platforms [8].
BALF is routinely used to assess for the presence of fungi at the site of pulmonary infection. Conventional microbiological techniques like culture and histology of BALF are most commonly used for the diagnosis of IA, but these techniques have suboptimal sensitivity [9]. Detection of Aspergillus DNA in BALF yields high sensitivity and specificity. One review evaluated 15 clinical studies investigating the diagnostic efficiency of performing PCR on the BALF of IA patients and showed a promising clinical significance although with some methodological limitations [10]. Using a bivariate regression approach, we have now undertaken a systematic review of all eligible, and more recent, clinical studies to assess the accuracy of BALF-PCR as a diagnostic test for IA in immunocompromised patients.

Materials and Methods
Two investigators searched MEDLINE and EMBASE for relevant articles published up to December 2010, using Medical Subject Headings and text words that included the search terms "aspergillosis," "aspergillus," "polymerase chain reaction" and "bronchoalveolar lavage". The syntax for the MEDLINE searches was as follows: "aspergillosis" OR "aspergillus" AND "polymerase chain reaction" AND "bronchoalveolar lavage". The reference lists of the included studies and review articles were also checked for further relevant studies. Searches were restricted to English-language literature on human subjects only; abstracts and meeting proceedings were excluded. The results were then manually searched for eligible studies. Full-text publications concerning PCR on BALF were included if (1) they used the European Organization for Research and Treatment of Cancer/Mycoses Study Group (EORTC/MSG) or the revised EORTC/MSG criteria as a reference standard [11,12] or, (2) for studies published before the designation of these criteria in 2002, they used equivalent but nonidentical criteria as a reference standard, (3) they reported the data separately on true-positive, false-positive, false-negative, and true-negative results of the diagnostic tests, and (4) they included immunocompromised or at-risk patients. To avoid selection bias, studies with fewer than 10 patients were excluded.
Data extraction was performed independently by two reviewers and any uncertainties or disagreements were resolved by discussion. The quality of the selected studies was assessed as recommended in the Standards for the Reporting of Diagnostic Accuracy (STARD) by using 14 items of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) lists [13,14].

Statistical analysis
To calculate test accuracy, we defined proven and probable patients as having IA, and possible and no IA patients as not having IA, to construct two-by-two tables, according to the revised EORTC/MSG criteria [12]. We also constructed other two-by-two tables (proven IA vs. probable, possible, and no IA).
By undertaking a bivariate regression approach, we calculated pooled estimates of sensitivity (SEN) and specificity (SPE) as the main outcome measures, and constructed hierarchical summary receiver operating characteristic (SROC) curves [15]. Based on random-effects models, this bivariate approach investigates potential between-study heterogeneity and incorporates the possible correlation between the SEN and the SPE. Using the pooled SEN and SPE, positive and negative likelihood ratios (PLR and NLR, respectively) were also calculated.
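As a minimal sketch of how the accuracy measures relate to a single two-by-two table, the following Python snippet computes SEN, SPE, PLR, and NLR from the four cell counts. The class name `TwoByTwo`, the function name `accuracy_measures`, and the example counts are hypothetical and for illustration only; this is not the bivariate random-effects model used for pooling, which additionally accounts for between-study variation and the correlation between SEN and SPE.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical container)."""
    tp: int  # PCR positive, classified as IA (proven/probable)
    fp: int  # PCR positive, classified as not IA (possible/no IA)
    fn: int  # PCR negative, classified as IA
    tn: int  # PCR negative, classified as not IA


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return SEN, SPE, PLR and NLR for a single two-by-two table.

    Assumes the study has at least one patient with IA and one without.
    """
    sen = t.tp / (t.tp + t.fn)
    spe = t.tn / (t.tn + t.fp)
    # The ratios are undefined when the denominator is zero;
    # infinity is returned in that case.
    plr = sen / (1 - spe) if spe < 1 else math.inf
    nlr = (1 - sen) / spe if spe > 0 else math.inf
    return {"SEN": sen, "SPE": spe, "PLR": plr, "NLR": nlr}


example = TwoByTwo(tp=18, fp=4, fn=2, tn=76)  # made-up counts
for name, value in accuracy_measures(example).items():
    print(f"{name}: {value:.3f}")
```

Because the pooled likelihood ratios in a bivariate analysis come from the fitted model, they need not equal the ratios obtained by plugging rounded pooled SEN and SPE into these formulas.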
Heterogeneity was assessed through the inconsistency index (I²) of the pooled diagnostic odds ratios (DORs) [16]. The DOR indicates the accuracy of a diagnostic test and corresponds to particular pairings of SEN and SPE: it is the odds of a positive test result in participants with the disease relative to the odds of a positive result in those without the disease. The mean DOR was used as an accuracy index and was obtained by classic metaanalytic pooling [17]. Potential heterogeneity was explored by subgroup analyses [18,19]. Covariates clearly reported by more than 80% of studies were analyzed: population (only HM vs. mixed/other), design (cohort vs. case-control), data collection (prospective vs. retrospective), EORTC/MSG criteria (yes vs. no), PCR method (quantitative vs. other), primer design (A. fumigatus-specific vs. other), cell wall disruption method (commercial vs. "in house"), and DNA extraction method (commercial vs. "in house"). Deeks' funnel plot was used to inspect publication bias [20]. Post-test probability (PTP) was calculated by using the overall prevalence of 19% with Fagan nomograms [21]. All analyses were performed using STATA, version 10 (Stata Corp., College Station, TX) with the module "Midas" [21]. All statistical tests were two-sided, with P values less than 0.05 denoting statistical significance.
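The DOR and I² calculations can be illustrated with a short Python sketch. It assumes a fixed-effect inverse-variance pooling of log DORs with a 0.5 continuity correction for tables containing a zero cell; these choices, the function names, and the input layout are assumptions made for illustration, and the Midas module may implement the pooling differently.

```python
import math


def log_dor(tp, fp, fn, tn, cc=0.5):
    """Log diagnostic odds ratio and its approximate variance for one study."""
    # Assumption: add a continuity correction to every cell if any cell is zero.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + cc for x in (tp, fp, fn, tn))
    ln_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return ln_dor, variance


def pooled_dor_and_i2(tables):
    """Pool DORs by inverse-variance weighting; return (DOR, Cochran's Q, I2 in %)."""
    effects = [log_dor(*table) for table in tables]
    weights = [1 / var for _, var in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled log DOR.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(tables) - 1
    # I2 is the share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled), q, i2


# Made-up (tp, fp, fn, tn) counts for three hypothetical studies.
studies = [(18, 4, 2, 76), (5, 20, 15, 30), (30, 1, 1, 60)]
dor, q, i2 = pooled_dor_and_i2(studies)
print(f"pooled DOR = {dor:.1f}, Q = {q:.2f}, I2 = {i2:.1f}%")
```

I² values above 50% are commonly read as substantial heterogeneity, which is the reading applied to the pooled estimates in this review.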

Results
The characteristics of the 17 included studies are detailed in Table 1. Overall, there were 1296 clinical BALF samples from 1191 patients. Most patients with HM received either chemotherapy or a hematopoietic stem-cell transplant. The average prevalence of proven and probable IA across cohort studies was 19% (range: 5.77-47.37%), which was higher than that reported in some previous reviews [3,40]. The higher prevalence might result from differences in the populations studied. The participants of the included studies were mainly patients with HM accompanied by pulmonary infiltration.
Ten studies used the EORTC/MSG criteria, whereas seven studies used similar criteria that were not identical to the EORTC/MSG criteria. Three studies included eligible patients exhibiting persistent fever who were unresponsive to first-line broad-spectrum antibiotic treatment [25,27,35].
The details of the PCR techniques used are summarized in Table 2. BALF was used in all studies (volume range: 0.1-5 ml). Six studies used quantitative PCR (qPCR) in Aspergillus DNA determination and the remainder used end-point PCR or semiquantitative PCR. The quality of all studies was generally high, meeting on average 10 of the 14 QUADAS criteria (Fig. 1).

Data synthesis and metaanalysis
For all the studies, the pooled DOR was 122 (95% confidence interval (CI) 41-363). Table 3 shows the SEN, SPE, PLR, NLR, and DOR. The wide range of SEN was mainly attributable to variation in the number of IA cases across studies. We found substantial heterogeneity among studies for all modalities, with all I² values above 50%.
The SROC curve represents the relationship between SEN and SPE across studies, determining the presence of a threshold effect. Based on the bivariate approach which estimates not only the strength but also the shape of the correlation between SEN and SPE, a 95% confidence ellipse and a 95% prediction ellipse were drawn (Fig. 2). The area under the SROC curve (AUC) was 0.97 (95% CI 0.95-0.98), signifying the high discriminatory ability of BALF-PCR.
Subgroup analyses showed that the SEN of qPCR was significantly lower than that of other types of PCR (Fig. 3). Studies using commercial kits for cell wall disruption and DNA extraction achieved significantly higher SPE than those using an "in house" method. These covariates did not, however, affect the SEN. We also found that using A. fumigatus species-specific primers led to a significantly lower SPE than did using other primers (mostly genus-specific primers).
No publication bias was detected by Deeks' funnel plot asymmetry test (P = 0.55) [19]. The nomogram of Fagan demonstrated that the PCR assay increased the probability of IA nearly five-fold when the results were positive, and decreased the probability to 1.5% when the results were negative (Fig. 4) [21].
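The arithmetic behind a Fagan nomogram is Bayes' theorem in odds form: the pre-test odds are multiplied by the likelihood ratio to give the post-test odds. The Python sketch below shows this conversion; the function name `post_test_probability` and the example inputs are hypothetical. Feeding it rounded summary estimates will not necessarily reproduce the figures read from the nomogram, which are derived from the fitted model.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Probabilities are fractions strictly between 0 and 1.
    """
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Illustrative inputs: a 20% pre-test probability, a positive likelihood
# ratio of 4 and a negative likelihood ratio of 0.25.
print(f"after a positive test: {post_test_probability(0.20, 4.0):.3f}")
print(f"after a negative test: {post_test_probability(0.20, 0.25):.3f}")
```

A likelihood ratio of 1 leaves the probability unchanged, which is why ratios far from 1 in either direction are the clinically informative ones.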

Discussion
The frequency of IA has increased with the increasing number of high-risk patients. The most common site of infection is pulmonary and IA can easily disseminate to other organs [41]. Clinically, the gold standard for the diagnosis of IA still requires an invasive procedure to provide histopathologic or cytopathologic evidence. Unfortunately, the patient's status often prohibits the use of invasive techniques. Culturing of the causative agent can result in false-negative or false-positive results. Bronchoscopy with BALF seems to be a feasible diagnostic tool, with a high yield and low complication rates [42-44].
Serum GM detection and PCR assays seem to have diagnostic potential. Clinical trials have shown that GM assays on serum or BALF and PCR assays on serum have variable performance in different at-risk patient populations. The metaanalyses of Pfeiffer et al. and Leeflang et al. reached the similar conclusion that serum GM detection was moderately useful for diagnosing IA, although there were some methodological limitations in the included studies [3,40]. In the study of Leeflang et al., in which only proven and probable cases were considered, the overall SEN increased by 17% (from 62% to 79%) with a decrease in the threshold from 1.5 to 0.5, and the overall SPE decreased by 13% (from 95% to 82%). Mengoli et al. determined the SEN and SPE of PCR for two consecutive positive samples to be 0.75 and 0.87, respectively, and concluded that two positive tests are required to confirm the diagnosis, whereas a single PCR-negative result is sufficient to exclude a diagnosis of IA [8]. Wang et al. found that, as a diagnostic tool, the BALF-GM assay was better than serum GM detection [45]. That study showed that the BALF-GM assay had a SEN of 0.90 and a SPE of 0.94 for proven and probable IA, which were both higher than the corresponding values for the serum GM assay.
We undertook the present metaanalysis to establish the overall accuracy of the BALF-PCR assay for the diagnosis of IA and to help standardize the PCR assay and normalize clinical performance. Our primary finding was that the BALF-PCR assay is a powerful tool for the diagnosis of IA in patients with hematological malignancy, and that the use of particular procedures can improve diagnostic accuracy. However, the increased SEN and SPE could be related to the high sensitivity of PCR, and possible existing fungal colonization in the bronchial tree of high-risk cases might contribute to the elevated SEN. Compared with the diagnostic tools examined in all former metaanalyses, the BALF-PCR assay had the highest area under the SROC curve (0.97), which indicates that this assay had the best discriminative ability. We also investigated likelihood ratios, which take into account the interaction between the SEN and the SPE in their calculation and describe the discriminatory properties of positive and negative test results. PLR >10 and NLR <0.1 are usually considered convincing evidence to rule in or rule out diagnoses, respectively [46]. Although three metaanalyses did not report these results [3,8,40], the conclusions of these studies suggested that serum GM and serum PCR could not effectively discriminate IA. In our metaanalysis, for proven and probable cases, the PLR and NLR both reached these thresholds and generated large and often conclusive shifts from pre-test probability to PTP.
It is an important objective of a metaanalysis to explore the likely causes of heterogeneity [18,19]. In our subgroup analyses, we identified several characteristics that might account for the observed heterogeneity. The significantly lower SEN of qPCR relative to other formats might be explained by the use, in the comparison group, of PCR assays modified to improve SEN, such as nested PCR. Nested PCR formats have been widely used for Aspergillus in an attempt to optimize analytical sensitivity, but the requirement to open the reaction tubes means that there is considerable risk of contamination and the subsequent generation of false-positive results [47]. Recently, the real-time quantitative format has become dominant in PCR-based diagnostic studies of fungal infections [7]. The limit of its sensitivity was found to be five copies of Aspergillus DNA per milliliter; this is comparable to that of the commonly used nested PCR assay [48]. In addition, quantification of the fungal burden by real-time PCR may be helpful to distinguish between colonization and infection, and could possibly allow therapeutic monitoring [32]. At present, real-time qPCR is the best PCR format for clinical diagnostic application.
There is a multitude of fungal DNA extraction techniques. The selected extraction method represents a compromise between efficiency, freedom from exogenous contamination, and applicability to routine high-throughput testing. The efficiency of extraction of fungal DNA may vary considerably among commercial kits [49], and this can hinder comparisons among studies. We found that studies that used commercial kits for cell wall disruption and DNA extraction achieved significantly higher SPE than those that used "in-house" methods. The uniform commercial kit for the GM assay contributes to its stable operation and comprehensive use, and we believe that a uniform PCR assay system may also promote the application of qPCR.
When designing primers for clinical diagnostic purposes, the detection of a broad range of fungi is important, as is the ability to increase SEN and ultimately identify the specific pathogen [50]. In our study, we found that using A. fumigatus species-specific primers led to a significantly lower SPE. This unexpected finding, that narrow-spectrum primers yielded a lower rather than a higher SPE, could be caused by poor primer design. Some primers, which were designed based on sequencing data that were not comprehensively validated, have shown complementarity to the genomes of several other microorganisms, such as Candida glabrata and Aspergillus oryzae [24,32]. During primer design, it is necessary to conduct a thorough in silico analysis based on multiple alignment of validated sequences from various databases, covering as many Aspergillus sequences as possible, plus human sequences and those of closely related fungal pathogens such as Penicillium and Candida. Since several species of Aspergillus are human pathogens, A. fumigatus species-specific primers may lower the accuracy of PCR assays. We recommend choosing Aspergillus genus-specific primers to improve PCR assay accuracy. The optimal approach, in this regard, involves the application of genus-specific primers with post-amplification analysis for species determination. Genus-specific primers are directed toward conserved regions, usually within multicopy genes, which flank sequences containing species-specific polymorphisms that can be exploited in post-amplification analysis [47]. However, Aspergillus genus-specific primers might lead to a new problem. Penicillium, a genus phylogenetically close to Aspergillus, has a high likelihood of cross-reactivity within a PCR assay, which might increase the false-positive rate [6]. All primers should be validated by a thorough in silico analysis using multiple databases.
Another potential problem with PCR is sample contamination from airborne particles. Fungal spores are ubiquitous in the environment and may cause false-positive results in PCR. Accordingly, measures to reduce exogenous fungal contamination are critical [7]. A laminar flow hood in an independent laboratory should be used exclusively for DNA extraction and pre-PCR processing. Other measures to reduce contamination are a unidirectional workflow pattern (pre- to post-PCR), physical separation of the laboratories for pre- and post-PCR analysis, and the use of aerosol-resistant pipette tips. In addition, using the uracil-DNA glycosylase enzyme and dUTP instead of dTTP in the PCR master mix can limit carry-over contamination by destroying amplicons from previous reactions prior to PCR [7]. Moreover, controls can be used to rigorously monitor contamination. Negative control reactions, comprising all the PCR reagents except the template DNA, are essential, and can be introduced at any point in the assay, such as at sample acquisition, handling, storage, and DNA extraction.
It is still difficult for BALF-PCR to distinguish colonization from invasive infection. More importantly, the isolation of Aspergillus from the respiratory tract should arouse vigilance, especially in high-risk patients. The isolation of Aspergillus from the respiratory tract may represent one of three scenarios: (1) evidence of current disease, (2) true colonization, or (3) a marker for the probable development of invasive disease. A previous study demonstrated that a positive PCR result from BALF at the time of bone marrow transplant conditioning was predictive of the subsequent development of invasive pulmonary aspergillosis [51].
The PCR assay, especially qPCR, is becoming popular in the clinical diagnosis "toolbox", such as for Pneumocystis pneumonia diagnosis [52]. As the operation of PCR becomes more automated, and extraction methods and targets become commercially available, this tool should play an ever-greater role. However, because at present most laboratories perform in-house PCR assays, extensive validation and standardization are required. An initiative is currently in progress to devise a standard for Aspergillus PCR screening [8]. Once this has been achieved, formal validation should be possible. As long as these standardized assays are unavailable, a combination of various methods is still needed to improve the accuracy of IA diagnosis.
Our study had some limitations. First, we acknowledge that the overall number of patients included in our review was relatively small. Although we aimed to incorporate all available relevant data, it is hard to ensure that no data were missed, especially unpublished data. Second, we might have introduced bias by excluding non-English-language studies and studies with fewer than 10 patients. To test the latter, we reanalyzed the data including these small studies and obtained similar overall results. Third, we used the revised EORTC/MSG criteria as a reference standard, which are widely accepted but are not the "gold standard" for diagnosis of every patient, especially for probable IA. The disease group could be expanded when defining probable IA.