Comparison of 18F-FDG PET/CT and DWI for detection of mediastinal nodal metastasis in non-small cell lung cancer: A meta-analysis

Background Accurate clinical staging of mediastinal lymph nodes in patients with lung cancer is important for determining therapeutic options and prognosis. We aimed to compare the diagnostic performance of diffusion-weighted magnetic resonance imaging (DWI) and 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) in detecting mediastinal nodal metastasis of lung cancer. Methods Relevant studies were systematically searched in the MEDLINE, EMBASE, PUBMED, and Cochrane Library databases. Based on the extracted data, the pooled sensitivity, specificity, and positive and negative likelihood ratios (PLR and NLR) were calculated with their 95% confidence intervals. In addition, publication bias was assessed with Deeks' funnel plot asymmetry test. Potential heterogeneity was explored by threshold effect analysis and subgroup analyses. Results Forty-three studies were finally included. For PET/CT, the pooled sensitivity and specificity were 0.65 (0.63–0.67) and 0.93 (0.93–0.94), respectively. The corresponding values of DWI were 0.72 (0.68–0.76) and 0.97 (0.96–0.98), respectively. The overall PLR and NLR of DWI were 13.15 (5.98–28.89) and 0.32 (0.27–0.39), respectively. For PET/CT, the corresponding values were 8.46 (6.54–10.96) and 0.38 (0.33–0.45), respectively. Deeks' test revealed no significant publication bias. Study design and patient enrollment were potential sources of heterogeneity among the DWI studies, and the threshold effect was a potential source among the PET/CT studies. Conclusion Both modalities are beneficial in detecting lymph node metastases in lung cancer, without significant differences between them. DWI might be an alternative modality for evaluating the nodal status of NSCLC.


Introduction
Lung cancer is the leading cause of cancer-related death worldwide [1]. Non-small-cell lung cancer (NSCLC) is the main type of lung cancer, accounting for 80% of all cases. NSCLC typically metastasizes to the hilar and mediastinal lymph nodes (MLNs), and nodal metastasis is a very important prognostic factor. The 5-year survival rates are 54.0% for patients without any metastases and 26.5% for patients with MLN metastases [2]. The choice of treatment, such as surgery, radiotherapy or chemotherapy, depends mainly on TNM staging. Therefore, accurate assessment of MLNs is necessary for TNM staging and optimal treatment selection.
Various diagnostic techniques, such as computed tomography (CT), positron emission tomography (PET), PET/CT, mediastinoscopy, and magnetic resonance imaging (MRI), are used for nodal staging assessment of NSCLC. CT is the most widely used technique for assessing the nodal status of lung cancer based on lymph node size, although lymph node size is not reliable for the evaluation of metastatic involvement [3]. FDG PET, a functional imaging modality, can detect potential tumor activity and facilitate earlier recognition of metastases [4]; however, this method has been limited by the low spatial resolution of stand-alone PET images [5]. Integrated PET/CT, which combines anatomical detail with functional status, is now commonly used for NSCLC staging.
Diffusion-weighted imaging (DWI), an MRI technique, detects the restricted diffusion of water molecules in tissues at the cellular level, which can be quantified by the apparent diffusion coefficient (ADC) value [5]. DWI and ADC values have been widely used in brain imaging for the evaluation of acute ischemic stroke, intracranial tumors and demyelinating disease [6]. However, DWI is highly sensitive to motion artifacts caused by breathing and by movement of the heart and aorta, which has limited its application [7]. Recently, the rapid development of MRI techniques, such as echo-planar imaging sequences, multichannel coils and parallel imaging, has allowed DWI to be applied in anatomical regions prone to motion artifacts, such as the mediastinum [8]. Several studies have shown that the diagnostic accuracy of DWI for nodal assessment in the mediastinum is 76-95% [9][10][11][12][13].
To our knowledge, the relative performance of DWI and FDG PET/CT in nodal staging has yet to be determined. Some studies validated the potential of DWI for N stage assessment and the characterization of mediastinal lymph nodes in patients with NSCLC, with a capability similar to that of 18F-FDG PET/CT [14]. Some studies showed advantages of DWI over FDG PET/CT [4,5], whereas other studies showed that DWI had lower capability than FDG PET/CT [8,11]. Therefore, we performed a meta-analysis to compare the diagnostic performance of DWI and FDG PET/CT in lymph node staging in patients with NSCLC.

Inclusion and exclusion criteria
The inclusion criteria were as follows: (i) the diagnostic performance of 18F-FDG PET/CT or DWI in detecting nodal metastases in lung cancer was reported in the article; (ii) pathological analysis, surgical biopsy, mediastinoscopy or follow-up results were used as the gold standard of diagnosis; (iii) the values of true positive (TP), false positive (FP), false negative (FN) and true negative (TN) could be obtained from the original data in the article; (iv) the studies were based on a per-lesion analysis; and (v) the article with the most details or the most recent article was selected when similar data appeared in more than one article.
The exclusion criteria were as follows: (i) studies that focused on therapy response or prognosis rather than on disease diagnosis; (ii) studies of mediastinal tumors or pleural diseases other than lung cancer; (iii) case reports, meeting abstracts, reviews, letters, comments, animal experiments, or studies with fewer than 10 samples.

Data extraction
The following information was extracted from the included studies: the first author, year of publication, study design (prospective or retrospective), country of the study, patient enrollment, technique characteristics, reference standard, and blinding method. The TP, FP, TN, and FN results were also extracted.
Two reviewers independently extracted the relevant data from each study. Any disagreements were resolved by discussion with a third reviewer.

Statistical analysis
For lesion-based analyses, we obtained the pooled sensitivities and specificities of PET/CT and DWI, as well as their 95% confidence intervals, using the weighted average method. We also calculated the pooled positive and negative likelihood ratios (PLR and NLR) with their 95% confidence intervals. The data were finally summarized in summary receiver-operating characteristic (SROC) curves, and the area under the curve (AUC) and the Q* index were obtained.
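As a minimal sketch of how these quantities relate to the extracted 2x2 counts, the Python below pools sensitivity, specificity, PLR and NLR by summing cells across studies. This is a simplification for illustration: the `Study` class, the `pooled_accuracy` function and the counts are hypothetical, and the estimator used by the software in this study may differ.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Per-lesion 2x2 counts for one study."""
    tp: int
    fp: int
    fn: int
    tn: int


def pooled_accuracy(studies):
    """Pool accuracy measures by summing the 2x2 cells across studies.

    Summing cells weights each study by its number of lesions. This is an
    illustrative assumption, not necessarily the estimator used in the paper.
    """
    tp = sum(s.tp for s in studies)
    fp = sum(s.fp for s in studies)
    fn = sum(s.fn for s in studies)
    tn = sum(s.tn for s in studies)
    sens = tp / (tp + fn)          # proportion of metastatic nodes detected
    spec = tn / (tn + fp)          # proportion of benign nodes called negative
    plr = sens / (1 - spec)        # positive likelihood ratio
    nlr = (1 - sens) / spec        # negative likelihood ratio
    return {"sensitivity": sens, "specificity": spec, "PLR": plr, "NLR": nlr}


# Hypothetical counts, not taken from the included studies.
studies = [Study(30, 5, 10, 95), Study(40, 5, 20, 95)]
result = pooled_accuracy(studies)
print(result)
```

With these made-up counts the pooled sensitivity is 0.70 and the pooled specificity is 0.95, which shows how a high specificity can produce a large PLR while a modest sensitivity leaves the NLR well above 0.1.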
We used the I² index for heterogeneity assessment. If the I² index was higher than 50%, a random-effects model was used; otherwise, a fixed-effects model was used. In this study, we used the random-effects model to pool estimates. To explore the sources of heterogeneity, we performed subgroup analyses based on factors such as sample size (≥250 vs. <250), study design (retrospective vs. prospective), country (Asia vs. non-Asia), subject enrollment (consecutive vs. nonconsecutive), and analysis method (qualitative, quantitative, or both). The threshold effect analysis was also performed, and publication bias was examined with Deeks' funnel plot.
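The heterogeneity rule can be sketched as follows, assuming an inverse-variance Cochran's Q computed on the log diagnostic odds ratio. The choice of effect measure, the function names and the example numbers are assumptions made for illustration; the study itself reports I² for sensitivity and specificity.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its approximate variance.

    Adds 0.5 to every cell when any cell is zero (a common continuity
    correction, assumed here).
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def i_squared(effects, variances):
    """Return Cochran's Q and I^2 (as a percentage)."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


def choose_model(i2, cutoff=50.0):
    """Rule stated in the text: random effects when I^2 exceeds 50%."""
    return "random" if i2 > cutoff else "fixed"


# Hypothetical effects and variances, for illustration only.
q, i2 = i_squared([0.0, 2.0], [0.1, 0.1])
print(round(q, 2), round(i2, 1), choose_model(i2))
```

In this toy case two equally precise studies disagree strongly, so I² is far above 50% and the rule selects the random-effects model.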
The statistical computations were performed using Stata software version 12.0 (StataCorp LP, Texas, USA) and MetaDisc version 1.4 (Unit of Clinical Biostatistics, Ramón y Cajal Hospital, Madrid, Spain). The level of statistical significance for P values was set to 5%.

Quality assessment
We used QUADAS-2 to analyze the quality of the studies [54]. The methodological results are displayed in Fig 2. Participant selection was judged to be at low risk of bias in 16 of the studies and at high or unclear risk of bias in the remaining 27 studies. The majority of selected studies did not provide information regarding consecutive enrollment and did not avoid a case-control design. These inclusion restrictions artificially narrowed the range of patients who would undergo PET/CT in standard practice, which gave rise to a high concern about the applicability of these studies. For the index test and reference standard, the common weakness was that blinding was not reported or not used when interpreting the results. With regard to flow and timing, 12 articles displayed unclear or high risk because they lacked an explicit description of the time interval between the index test and the reference standard. In short, substantial underreporting in the included studies resulted in "unclear" or "high" ratings for bias or concern, which lowered the methodological quality.

Diagnostic accuracy of DWI and FDG-PET/CT
No differences were found in the pooled specificity, sensitivity, PLR and NLR between DWI and FDG-PET/CT (P > 0.05). Using a fitted SROC curve, the overall AUCs for DWI and FDG-PET/CT were 0.79 and 0.88, respectively (Fig 5). For nodal staging of NSCLC, the diagnostic capacities of these two modalities were not significantly different. However, based on the PLR and NLR, a positive DWI finding can rule in malignancy, whereas a negative DWI finding alone might not exclude it. PET/CT, by contrast, can neither rule in nor rule out the disease.

Heterogeneity analysis
Our analysis revealed strong heterogeneity in sensitivity and specificity among the studies (P < 0.05, I² > 90%). The Spearman rank correlation test indicated an absence of threshold effect in the DWI studies (coefficient = 0.364, P = 0.301) and showed a significant threshold effect in the PET/CT studies (coefficient = 0.556, P = 0.001). The threshold effect of PET/CT might arise from the different SUV cutoff values used across the included studies to differentiate malignant lesions from benign ones. Because of the small number of DWI studies, we only performed subgroup analyses based on sample size, study design and patient enrollment. Six studies using a prospective design showed higher specificity (0.98 vs. 0.81, P < 0.05), and studies with consecutive enrollment showed higher specificity for nodal staging (0.98 vs. 0.81, P < 0.05). With regard to the PET/CT studies, more factors, including sample size, study design, country, patient enrollment, blinding method, and analysis method, were explored in subgroup analyses; however, all these factors failed to explain the heterogeneity (P > 0.05). The results of the subgroup analyses are presented in Table 2.
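The threshold-effect check can be illustrated with a short sketch, assuming the usual formulation in which the Spearman coefficient is computed between the logit of sensitivity and the logit of (1 - specificity) across studies. The helper names and the per-study values below are hypothetical, and the P value calculation is omitted.

```python
import math


def logit(p):
    """Log odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_effect_rho(sens, spec):
    """Spearman rho between logit(sensitivity) and logit(1 - specificity)."""
    x = [logit(s) for s in sens]
    y = [logit(1 - s) for s in spec]
    return pearson(ranks(x), ranks(y))


# Hypothetical per-study values: sensitivity rises as specificity falls,
# the pattern expected when studies use different positivity cutoffs.
sens = [0.55, 0.65, 0.75, 0.85]
spec = [0.98, 0.95, 0.90, 0.80]
rho = threshold_effect_rho(sens, spec)
print(rho)
```

A strongly positive coefficient, as in this constructed example, is the pattern read as a threshold effect; a coefficient near zero suggests that differing cutoffs do not explain the between-study variation.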
Deeks' funnel plot asymmetry tests indicated no significant publication bias (P = 0.277 for DWI and P = 0.098 for PET/CT) (Fig 6).
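For readers unfamiliar with the asymmetry test, the following is a rough sketch of its core regression, assuming the standard formulation: the log diagnostic odds ratio is regressed on the inverse square root of the effective sample size (ESS), weighted by ESS. The function name and the counts are hypothetical, and the P value (which comes from a t distribution on the slope) is not computed here.

```python
import math


def deeks_slope(studies):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), with weights ESS.

    Each study is a tuple (tp, fp, fn, tn).
    ESS = 4 * n1 * n2 / (n1 + n2), where n1 and n2 are the numbers of
    diseased and non-diseased lesions.
    Returns (slope, standard_error); a slope far from zero relative to its
    standard error suggests funnel plot asymmetry.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        if 0 in (tp, fp, fn, tn):  # continuity correction for empty cells
            tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        n1, n2 = tp + fn, fp + tn
        ess = 4 * n1 * n2 / (n1 + n2)
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Weighted residual sum of squares with n - 2 degrees of freedom
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se = math.sqrt(rss / (len(xs) - 2) / sxx)
    return slope, se


# Hypothetical counts: the smallest study has the largest odds ratio.
asymmetric = [(9, 1, 1, 9), (16, 4, 4, 16), (30, 10, 10, 30)]
slope, se = deeks_slope(asymmetric)
print(slope, se)
```

In this constructed example the slope is positive because the small study reports the strongest effect, which is the pattern the test is designed to flag.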

Discussion
Because integrated PET/CT directly combines PET data on metabolic changes with highly detailed anatomic CT information, this technique can detect lesions earlier and provide more precise location information than CT or PET alone [55]. DWI is a magnetic resonance imaging (MRI) technique based on imaging the molecular mobility of water [56]. Using this technique, the diagnoses of prostate cancer [57], urinary bladder cancer [58], uterine [...] performance and compared it with the diagnostic performance of 18F-FDG PET/CT. Our results in the present meta-analysis showed that the pooled sensitivity and specificity of DWI were 0.70 and 0.97 for node-based data, and the corresponding values of PET/CT were 0.69 and 0.93, respectively; these results indicated that both 18F-FDG PET/CT and DWI were beneficial in detecting mediastinal lymph node metastases in lung cancer, without statistically significant differences in diagnostic capacity. Furthermore, the diagnostic profile of both modalities (low sensitivity and high specificity) suggests that positive lymph nodes would be missed too often, so neither modality alone can evaluate nodal status accurately enough to guide treatment planning, especially for patients with potentially resectable NSCLC. Instead, both modalities can help guide the next step: either mediastinoscopy with minimally invasive sampling or direct surgery. The SROC curve and its AUC present the relationship between sensitivity and specificity across studies and an overall estimate of test performance. The AUC for DWI (0.93, 95% CI: 0.91-0.95) was slightly higher than the AUC for 18F-FDG PET/CT (0.89, 95% CI: 0.86-0.91), indicating that DWI might be more accurate in N staging in patients with NSCLC.
By combining sensitivity and specificity into a single number, the diagnostic odds ratio (DOR) can be regarded as a single measure of diagnostic accuracy, and higher values indicate better discriminatory test performance [63]. The DOR of DWI was greater than that of 18F-FDG PET/CT, indicating that DWI might be more accurate in assessing the mediastinal lymph nodes in NSCLC. Likelihood ratios (LRs), which are more clinically meaningful estimates, are commonly used to rule in and rule out disease. A good diagnostic test might have a PLR greater than 10 and an NLR less than 0.1 [48]. In our study, the PLR of DWI was 13.15 and the NLR was 0.32, meaning that DWI could help confirm metastatic lymph nodes but could not exclude them. With a PLR of 8.46 and an NLR of 0.38, PET/CT could neither confirm nor rule out metastatic lesions. The heterogeneity between studies was notable for both PET/CT and DWI. To investigate the sources of heterogeneity, diagnostic threshold analyses and subgroup analyses were performed. The Spearman correlation coefficient (0.439, P = 0.011) suggests the existence of a threshold effect for PET/CT in our meta-analysis; one possible explanation is that different diagnostic methods and thresholds were used in the individual studies. The PET/CT images were analyzed quantitatively, qualitatively or both. Even among the studies that used quantitative methods, the SUV thresholds differed. Of the included PET/CT studies using quantitative methods, only 7 studies [15,20,21,33,35,41,48] adopted 2.5 as the SUV cutoff value, whereas the other studies used variable values. To date, the ideal SUV cutoff value for diagnosing malignant MLNs has not been determined. In addition, there is no standard reference for visual interpretation. For DWI, the threshold analysis showed no significant threshold effect.
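The rule of thumb above can be made concrete with a small sketch that applies it to the pooled likelihood ratios reported in the text. It also uses the identity DOR = PLR / NLR; the values it derives are only indicative and need not equal a DOR pooled directly from the individual studies. The function names are illustrative.

```python
def diagnostic_odds_ratio(plr, nlr):
    """DOR expressed through the likelihood ratios: DOR = PLR / NLR."""
    return plr / nlr


def interpret(plr, nlr, rule_in=10.0, rule_out=0.1):
    """Apply the rule of thumb quoted in the text (PLR > 10, NLR < 0.1)."""
    return {"rules_in": plr > rule_in, "rules_out": nlr < rule_out}


# Pooled likelihood ratios reported in the text.
dwi = {"plr": 13.15, "nlr": 0.32}
pet_ct = {"plr": 8.46, "nlr": 0.38}

dor_dwi = diagnostic_odds_ratio(**dwi)        # about 41.1
dor_pet_ct = diagnostic_odds_ratio(**pet_ct)  # about 22.3

print("DWI:", round(dor_dwi, 1), interpret(**dwi))
print("PET/CT:", round(dor_pet_ct, 1), interpret(**pet_ct))
```

Under these thresholds DWI meets the rule-in criterion but not the rule-out criterion, and PET/CT meets neither, which matches the interpretation given in the paragraph above.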
We also conducted subgroup analyses based on factors including study design, country, sample size, analysis method, patient enrollment, and blinding. However, these factors failed to explain the heterogeneity between PET/CT studies. For the heterogeneity in DWI studies, study design and patient enrollment were potential sources. In addition, the differences in the technique characteristics of PET/CT and DWI were potential sources of heterogeneity.
In clinical practice, DWI and 18F-FDG PET/CT have satisfactory specificity, and these two highly specific techniques are suitable for confirming disease, especially diseases with distinctive clinical manifestations or diseases that are fatal. However, given their disappointing sensitivity, many patients with nodal metastases would be missed because of the relatively high rate of false-negative results. DWI appears to have several advantages over FDG PET/CT, including no radiation exposure, no fasting and a short examination time [9,38]. With comparable diagnostic capacity, a DWI examination costs approximately one third as much as a PET/CT examination. Although DWI shows some advantages over PET/CT, its real value for evaluating the nodal status of NSCLC in clinical practice has not been determined. Much work remains to confirm the diagnostic value of DWI and to establish whether it can replace PET/CT for N staging of NSCLC.
The current analysis has several limitations. First and foremost, the number of DWI studies included in this meta-analysis was small, and more studies are needed in this field. Second, a wide variation in imaging techniques likely affected the assessment of the diagnostic accuracy of DWI and PET/CT and contributed to heterogeneity. Because of limited information, these factors were not analyzed. Third, although no publication bias was found using Deeks' funnel plot, potential publication bias could still exist, especially given the exclusion of conference abstracts and case reports during study selection. Finally, there was no single reference standard strategy for the histopathologic analyses, and a wide variation in patient histopathologic types was found across the studies. This factor was not analyzed because the types were too mixed and difficult to classify.

Conclusion
Our meta-analysis indicated that 18F-FDG PET/CT and DWI had high specificity and low sensitivity for identifying metastatic mediastinal lymph nodes in NSCLC, and they are noninvasive imaging methods that might aid in confirming the diagnosis of metastases in clinical practice. However, the true value of DWI in clinical practice remains unknown, although DWI did show some advantages over PET/CT in some aspects. Therefore, large-scale, prospective studies are needed to further justify the diagnostic value of DWI in comparison with 18F-FDG PET/CT.