Meta-analysis of the radiological and clinical features of Usual Interstitial Pneumonia (UIP) and Nonspecific Interstitial Pneumonia (NSIP)

Purpose: To conduct a meta-analysis to determine specific computed tomography (CT) patterns and clinical features that discriminate between nonspecific interstitial pneumonia (NSIP) and usual interstitial pneumonia (UIP).
Materials and methods: The PubMed/Medline and Embase databases were searched for studies describing the radiological patterns of UIP and NSIP in chest CT images. Only studies involving histologically confirmed diagnoses or a consensus diagnosis by an interstitial lung disease (ILD) board were included in this analysis. The radiological patterns and patient demographics were extracted from suitable articles. We used random-effects meta-analysis by DerSimonian & Laird and calculated pooled odds ratios for binary data and pooled mean differences for continuous data.
Results: Of the 794 search results, 33 articles describing 2,318 patients met the inclusion criteria. Twelve of these studies included both NSIP (338 patients) and UIP (447 patients). Patients with NSIP were significantly younger (NSIP: median age 54.8 years, UIP: 59.7 years; mean difference (MD) -4.4; p = 0.001; 95% CI: -6.97 to -1.77), less often male (NSIP: median 52.8%, UIP: 73.6%; pooled odds ratio (OR) 0.32; p<0.001; 95% CI: 0.17 to 0.60), and less often smokers (NSIP: median 55.1%, UIP: 73.9%; OR 0.42; p = 0.005; 95% CI: 0.23 to 0.77) than patients with UIP. The CT findings from patients with NSIP revealed significantly lower levels of the honeycombing pattern (NSIP: median 28.9%, UIP: 73.4%; OR 0.07; p<0.001; 95% CI: 0.02 to 0.30) with less peripheral predominance (NSIP: median 41.8%, UIP: 83.3%; OR 0.21; p<0.001; 95% CI: 0.11 to 0.38) and more subpleural sparing (NSIP: median 40.7%, UIP: 4.3%; OR 16.3; p = 0.005; 95% CI: 2.28 to 117).
Conclusion: Honeycombing with a peripheral predominance was significantly associated with a diagnosis of UIP. The NSIP pattern showed more subpleural sparing. The UIP pattern was predominantly observed in elderly males with a history of smoking, whereas NSIP occurred in a younger patient population.


Introduction
Idiopathic pulmonary fibrosis (IPF) constitutes the most prevalent type of idiopathic interstitial pneumonia (IIP), accounting for 55% of IIP cases [1]. IPF is known to occur in adults older than 50 years and affects more men than women [1][2][3]. In addition, IPF is thought to be associated with cigarette smoking, as many patients with IPF are former or current smokers [1][2][3]. The prevalence of IPF is reported to be 63 cases per 100,000 population in the USA and up to 23.4 cases per 100,000 population in Europe. The incidence of IPF ranges from 6.8 to 17.4 per 100,000 population in the USA and from 0.22 to 7.4 per 100,000 population in Europe [4]. The median survival time reported in recent studies ranges from 2 to 5 years, starting at the time of diagnosis; this survival time is worse than in patients with many types of cancer [5]. IPF is associated with the radiographic and pathological patterns known as usual interstitial pneumonia (UIP). The UIP pattern can be associated with several other entities, such as rheumatoid arthritis, certain medications or chronic hypersensitivity pneumonitis.
Nonspecific interstitial pneumonia (NSIP), on the other hand, represents a pathological subtype of IIP that can mimic IPF in its clinical presentation and has a more favorable prognosis, with a median survival time of more than 9 years. NSIP accounts for 25% of IIP cases, constituting the second most common type of IIP after IPF [3]. NSIP shows a slight female predominance and typically occurs in a younger patient population than IPF [6]. As with UIP, a secondary NSIP pattern on computed tomography (CT) can also be linked to collagen vascular disease and other entities within the spectrum of autoimmune diseases.
The diagnosis of IIP requires background clinical information. Several studies have shown significant inter- and intraobserver variability of up to 50% in the radiological diagnosis of IIPs [7], which affects the overall diagnostic accuracy.
The main CT features of IPF are reported to be basal and peripheral reticulations, which are most typically associated with honeycombing potentially predicting patient outcomes [8][9][10]. Ground-glass opacities (GGOs) are also common, but less extensive. For NSIP, the reported characteristic CT patterns overlap with those of UIP and consist of GGOs and/or reticular patterns, while honeycombing is rare. However, chronic NSIP might develop into a fibrotic form termed fibrosing NSIP. When typical UIP patterns are present, an IPF diagnosis is made based on high-resolution CT (HRCT) images. In these cases, histopathological confirmation may not be required, according to recent guidelines. However, if HRCT findings are equivocal, a biopsy is still necessary [11]. Overall, the diagnostic accuracy of HRCT for UIP and NSIP has been reported to be up to 70% [12].
By achieving a reliable diagnosis based on imaging features, patients potentially avoid the risks of bleeding and general anesthesia and the high costs associated with a surgical biopsy [13][14][15][16]. Although UIP and NSIP imaging features have been described extensively in the literature and were incorporated in the diagnostic guidelines, no systematic review of the literature has been conducted. Our aims were to review the literature and summarize the most pertinent findings for UIP and NSIP and to provide an evidence-based approach.
The goal of this systematic review was to provide an overview of the prevalence and location of CT patterns and typical medical variables (age, sex, and smoking status) for UIP and NSIP. We sought to determine the patterns and variables that best discriminated between UIP and NSIP. For exact numbers, suitable studies were pooled into this review.

Materials and methods
The reporting of the results from this systematic review was organized according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [17].

Eligibility criteria
The following criteria were applied to select the studies: (1) a dedicated research article (no letters or abstracts were considered); (2) adequate imaging studies (HRCT) including volume scans or HRCT sequences with a slice thickness of less than 2 mm; (3) a detailed description of radiological NSIP and/or UIP patterns on CT images according to the established guidelines [2]; and (4) a confirmed diagnosis of UIP or NSIP based on biopsy specimens or a board decision (dedicated ILD board composed of specialized pneumologists, radiologists and pathologists in a tertiary care setting, as recommended by the American Thoracic Society (ATS), European Respiratory Society (ERS), and Fleischner Society [12,18,19]). The following exclusion criteria were applied: (1) case reports and (2) studies with fewer than 10 cases. Additionally, review articles and studies with insufficient subject identification were excluded from the analysis. In the case of redundant reporting of patient populations, we only included the study with the largest sample size. We considered only studies published in English or French. All suitable studies were stored in portable document format and transferred to Papers software (ReadCube, The Netherlands). The titles and abstracts of all manuscripts were screened by one author (A.C., who has 18 years of experience in chest radiology). Manuscripts were then separately analyzed for eligibility by the same author (A.C.). A second validation of the preselected articles was conducted by a different author (L.E., who has 5 years of expertise in chest imaging). Duplicates were removed from the article list.

Information sources and search strategy
We performed a literature search of the PubMed/Medline and Embase databases. We applied the following search terms to titles and abstracts: "Lung Diseases, Interstitial/diagnosis", "Lung Diseases, Interstitial/diagnostic imaging", "Pulmonary Fibrosis/diagnosis", "Pulmonary Fibrosis/diagnostic imaging" and (pattern, reticula*, honeycombing, ground-glass). Detailed information on the Embase search strategy can be retrieved from Appendix 1. We searched for articles published between 1992 and 2017. The complete literature search was conducted in May 2017. The search dates reflect the extensive amount of data that had to be screened by clinical radiologists and validated by experts; the subsequent statistical analysis was also time consuming. However, to the best of our knowledge, no substantial contributions to the existing data have been published since then. In addition, we are not aware of any similar meta-analysis that has been published on this topic to date.

Study selection and data collection processes
Data extraction and coding were performed by one investigator (A.C.). All data were collected in a standardized worksheet (Microsoft Excel). This analysis included studies deemed relevant, including randomized controlled trials, cohort studies, and cross-sectional studies. The extracted parameters included patient demographics and smoking history (current smokers, former smokers and never smokers). The radiological patterns denoted by CT included honeycombing, GGOs, consolidation, and reticulation. In addition, the dominant pattern was recorded if mentioned in the manuscript. The extent of these patterns was noted by the percent (%) of the total lung volume. Studies indicating a radiologist's estimate were pooled with studies mentioning only a per-lobe analysis (middle lobe or lingula counted as 1/4 of the right or left lung, and the lower lobe counted as 1/2 of the lung). The pattern distribution was recorded axially (inner 2/3 of the lungs versus the outer 1/3; the peribronchial distribution was considered the inner 2/3 of the lungs) and along the z-axis (upper or lower lobe predominance (below the level of the carina) or diffuse lung involvement). Data regarding involvement below the level of the carina were pooled with data regarding involvement of the lower lobe, middle lobe or lingula.
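The pooling of per-lobe estimates into a total lung extent can be illustrated with a minimal sketch. The weights for the middle lobe or lingula (1/4 of a lung) and the lower lobe (1/2 of a lung) are taken from the description above. The weight of the upper lobe (the remaining 1/4) and the treatment of each lung as half of the total lung volume are assumptions made for illustration only, as are the function name and the example values.

```python
# Per-lung weights. Middle lobe/lingula = 1/4 and lower lobe = 1/2 follow the text;
# the upper-lobe weight (remaining 1/4) is an assumption for this sketch.
LOBE_WEIGHTS = {"upper": 0.25, "middle_or_lingula": 0.25, "lower": 0.5}


def total_lung_extent(right, left):
    """Convert per-lobe extents (% of each lobe affected) to % of total lung volume.

    Assumes both lungs contribute equally (1/2 each) to the total lung volume.
    Lobes missing from a dictionary are treated as unaffected (0%).
    """
    def lung_extent(lobes):
        return sum(LOBE_WEIGHTS[name] * lobes.get(name, 0.0) for name in LOBE_WEIGHTS)

    return 0.5 * lung_extent(right) + 0.5 * lung_extent(left)


# Hypothetical per-lobe extents of a pattern, in percent of each lobe.
example_extent = total_lung_extent(
    right={"upper": 0, "middle_or_lingula": 20, "lower": 60},
    left={"upper": 0, "middle_or_lingula": 10, "lower": 40},
)
print(f"Pattern extent: {example_extent:.2f}% of total lung volume")
```

With these hypothetical values, the right lung contributes 35% and the left lung 22.5%, which averages to 28.75% of the total lung volume.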

Risk of bias in individual studies
The quality of the included studies in this systematic review was assessed with Quality Assessment of Studies of Diagnostic Accuracy included in Systematic Reviews (QUADAS) [20].

Data extraction, quality assessment and statistical analysis
We compared study characteristics among UIP and NSIP studies. The number of complete observations for each arm and the median, minimum and maximum were summarized. The Wilcoxon test was applied to identify significant differences. For the 12 studies with both arms (UIP and NSIP), the number of complete observations and the median, minimum and maximum for each arm are displayed. We used the random-effects meta-analysis described by DerSimonian & Laird, implemented in the Stata command metan, to calculate pooled odds ratios (ORs) for binary data and pooled mean differences (MDs) for continuous data, together with 95% confidence intervals and p-values. Due to the low number of included studies, we also assessed the robustness of the pooled estimates with the random-effects method of Paule-Mandel, implemented in the Stata command admetan. The Wilcoxon-Mann-Whitney test was applied to compare the results of all-arm studies with those of two-arm studies.
Heterogeneity was quantified using the I² measure [21,22]. An I² larger than 50% denotes moderate heterogeneity, and a value larger than 75% indicates severe heterogeneity. The effect measures of the individual studies, the pooled measures, and the I² measure and its p-value are shown in plots. The dashed red vertical line depicts the overall pooled OR or MD. The width of the diamond represents the 95% CI. All analyses were performed using Stata 14 software (Stata Corporation, College Station, Texas).
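To make the pooling step concrete, the following is a minimal Python sketch of inverse-variance random-effects pooling of log odds ratios with the DerSimonian-Laird estimate of the between-study variance. It is not the Stata metan implementation and is not expected to reproduce its output exactly. The function names, the 0.5 continuity correction for zero cells, the normal-approximation 95% CI, and the 2x2 tables are all hypothetical choices made for illustration.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) for a 2x2 table.

    a/b: patients with/without the finding in arm 1 (e.g. NSIP),
    c/d: patients with/without the finding in arm 2 (e.g. UIP).
    A 0.5 continuity correction is applied if any cell is zero (assumption).
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q}


def pooled_or_with_ci(tables):
    """Pooled OR with a normal-approximation 95% CI from a list of 2x2 tables."""
    effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
    res = dersimonian_laird(effects, variances)
    low = res["pooled"] - 1.96 * res["se"]
    high = res["pooled"] + 1.96 * res["se"]
    return math.exp(res["pooled"]), math.exp(low), math.exp(high)


# Hypothetical tables: (NSIP with, NSIP without, UIP with, UIP without) a finding.
hypothetical_tables = [(5, 20, 18, 7), (8, 30, 25, 10), (3, 12, 15, 9)]
or_pooled, ci_low, ci_high = pooled_or_with_ci(hypothetical_tables)
print(f"Pooled OR {or_pooled:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Pooling is done on the log scale because the sampling distribution of the log OR is closer to normal; the pooled estimate and its CI limits are exponentiated only at the end.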
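As a brief illustration of the heterogeneity measure, the sketch below computes I² from Cochran's Q and the number of studies using the standard definition I² = max(0, (Q - df) / Q) x 100%, and applies the thresholds stated above (larger than 50% moderate, larger than 75% severe). The function names and the label "low" for values of 50% or less are choices made for this example.

```python
def i_squared(q, n_studies):
    """I^2 in percent from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2_percent):
    """Classify I^2 with the thresholds used in this analysis."""
    if i2_percent > 75:
        return "severe"
    if i2_percent > 50:
        return "moderate"
    return "low"


# Hypothetical example: Q = 20 from 5 studies gives I^2 = (20 - 4) / 20 = 80%.
example_i2 = i_squared(20.0, 5)
print(f"I^2 = {example_i2:.0f}% ({heterogeneity_label(example_i2)})")
```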

Study selection
After screening 639 abstracts, only 33 articles met the inclusion criteria. Of these 33 articles, only 12 studies were two-arm studies that simultaneously analyzed the prevalence of CT patterns in patients with NSIP and UIP [23][24][25][26][27][28][29][30][31][32][33][34]. A flow diagram was generated for the inclusion and exclusion criteria according to the PRISMA guidelines (Fig 1), and the characteristics of the included two-arm studies are listed in Table 1.

Risk of bias within studies
Due to the small number of studies, we performed a sensitivity analysis using the method of Paule-Mandel to assess robustness of pooled estimates, which yielded very similar results (data shown in appendix 2). A comparison of patterns between studies with only idiopathic cases and idiopathic and secondary UIP or NSIP cases did not show any significant differences (all p>0.1). Likewise, significant differences were not observed between biopsy-proven and ILD board-proven studies (all p>0.1). Therefore, the pooling of these subgroups is likely free of bias. The QUADAS-2 results are shown in Figs 2 and 3.
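For readers unfamiliar with the Paule-Mandel approach used in the sensitivity analysis, the following sketch shows its central idea: the between-study variance tau² is chosen so that the generalized Q statistic equals its expected value, the number of studies minus one. This is not the Stata admetan implementation; the bisection solver, the function name, the tolerance and the example data are assumptions made for illustration.

```python
def paule_mandel_tau2(effects, variances, tol=1e-10, max_iter=200):
    """Estimate tau^2 by solving Q(tau^2) = k - 1 with bisection.

    effects: study effect sizes (e.g. log odds ratios or mean differences)
    variances: their within-study variances
    """
    k = len(effects)

    def generalized_q(tau2):
        w = [1 / (v + tau2) for v in variances]
        mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
        return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))

    # If Q is already at or below its expectation, no excess heterogeneity.
    if generalized_q(0.0) <= k - 1:
        return 0.0

    # Q decreases as tau^2 grows, so expand the upper bound until it brackets the root.
    low, high = 0.0, 1.0
    while generalized_q(high) > k - 1:
        high *= 2

    for _ in range(max_iter):
        mid = (low + high) / 2
        if generalized_q(mid) > k - 1:
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return (low + high) / 2


# Hypothetical data: three studies with equal variances of 0.1.
# Here Q(tau^2) = 2 / (0.1 + tau^2), so Q = k - 1 = 2 is solved by tau^2 = 0.9.
example_tau2 = paule_mandel_tau2([0.0, 1.0, 2.0], [0.1, 0.1, 0.1])
print(f"Paule-Mandel tau^2 = {example_tau2:.3f}")
```

The resulting tau² would then replace the DerSimonian-Laird estimate in the random-effects weights 1 / (variance + tau²), which is why similar pooled estimates under both methods support the robustness of the results.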

Study characteristics
Thirty-three studies were selected according to the inclusion criteria, including 725 patients with NSIP and 1,593 patients with UIP. Twelve studies included both patients with UIP (447 patients) and NSIP (338 patients), eight studies included only patients with NSIP, and 13 studies included only patients with UIP. In 85% (17/20) of the NSIP study arms, the diagnosis was confirmed by a biopsy. In 30% (6/20) of the NSIP study arms, patients were diagnosed by a multidisciplinary ILD board of lung specialists (pulmonologists, pathologists and radiologists). In 50% (10/20) of the NSIP study arms, the final diagnosis of idiopathic NSIP was reached by excluding any other secondary etiologies. In 72% (18/25) of the UIP study arms, the diagnosis was confirmed by biopsy and histology (possible or incompatible UIP pattern on CT). In 48% (12/25) of the UIP study arms, patients were diagnosed by the dedicated ILD board. In 80% (20/25) of the UIP study arms, the diagnoses were classified as idiopathic without an identifiable etiology.
In patients with NSIP, the median central or peribronchovascular disease predominance (axial inner 2/3 of the lungs) was 41%, while in patients with UIP, the median was only 8% (OR 6.2; p≤0.001). Additionally, a diffuse distribution with equal involvement of both inner and outer regions of the lungs was more often observed in patients with NSIP than in patients with UIP (median 15.4% vs. 4.5%; OR 2.3; p = 0.033). Along the z-axis, the median upper lung predominance was zero in both groups, whereas the median lower lobe predominance was greater than 90%.

Results of individual studies
The individual results from each included study are summarized in forest plots for the most important variables that exhibited the best classification capabilities (Figs 4-12).

Heterogeneity (I²) of two-arm studies
The studies included in this meta-analysis displayed low heterogeneity in terms of clinical characteristics: age, sex and smoking habits had I² values of 31%, 49% and 35%, respectively. The heterogeneity of the disease distribution was generally low (<35%), except for subpleural sparing, for which the I² was moderate at 67%. The largest heterogeneity was observed in the pattern identified by the radiologists: GGOs showed severe heterogeneity (I² = 92%), and honeycombing and consolidation displayed moderate heterogeneity (I² = 67% and 78%, respectively).

Discussion
The present results confirm that patient demographics, CT patterns and pattern distributions differ significantly between patients with UIP and those with NSIP. The main discriminating factors were the presence of honeycombing, extent of GGOs, axial distribution, sex and age. Additionally, subpleural sparing and consolidation were mainly observed in patients with NSIP. Although these findings reflect the current diagnostic criteria [18,19], to date, a comprehensive or formal meta-analysis of the published data has not been conducted. In this review of published data, we applied rigid inclusion criteria that relied on the aforementioned consensus statement. When applying the recommendations of the ATS, ERS and Fleischner Society [11,18,19], only a few studies upheld this standard. For instance, an appropriate diagnosis of ILDs, mainly UIP, requires a multidisciplinary approach that takes into account clinical, radiological and pathological findings. By applying an interdisciplinary approach, the diagnostic accuracy and consequently the timely treatment of patients can be substantially improved. However, of more than 600 articles, only 33 studies adhered strictly to the proposed workflow and criteria for the management of these two specific patient populations. Additionally, the current recommendations regarding imaging patterns were determined by an expert panel.
Currently, the diagnostic algorithm is changing. Very recently, the Fleischner Society released a consensus paper introducing new diagnostic categories for the UIP pattern, which have been adopted by the American Thoracic Society, European Respiratory Society, Japanese Respiratory Society, and Latin American Thoracic Society [35]. The former imaging categories were reviewed, and a new diagnostic category was introduced: CT pattern indeterminate for UIP. However, typical and probable UIP patterns are still considered categories with a very high probability of UIP, even though honeycombing might be absent in probable UIP cases. Although a new category has been introduced, histopathological confirmation of the categories of "CT pattern indeterminate for UIP" and "CT features most consistent with a non-IPF diagnosis" is still required. Although the diagnostic category of a "probable UIP pattern" does not include honeycombing on CT images, in the data presented in the current study, honeycombing appears to be the most reliable factor discriminating between UIP and NSIP patterns. This finding is most likely attributable to the weighting of honeycombing patterns among radiologists. In this meta-analysis, the pattern distribution did not prove to be particularly helpful for differentiating UIP and NSIP. Both entities showed a more basal and peripheral predominance. Additionally, subpleural sparing was an inconsistent finding in patients with NSIP. Important factors for differentiating UIP and NSIP are age and smoking history. IPF peaks at a significantly older age than NSIP [1][2][3][4][5][6]. This finding is supported by the data retrieved from the present analysis. This result supports the need for multidisciplinary ILD boards to incorporate a broad spectrum of clinical factors and to reach a final diagnosis.
In summary, the present data show not only that honeycombing is the most significant discriminator between UIP and NSIP but also that clinical factors are important. Based on these results, we encourage radiologists to incorporate age, sex and smoking history into their diagnostic routine. Conversely, referring clinicians can benefit the most from the radiological reports when asking the reading radiologist to incorporate critical clinical data in the interpretation.
Currently, computer-aided detection (CAD) systems based on artificial intelligence have become major topics of discussion in diagnostic radiology [36][37][38][39][40][41][42][43][44][45]. In previous studies using convolutional neural networks, computerized detection of CT patterns became feasible [46][47][48]. By combining automated CT pattern recognition algorithms and clinical and demographic characteristics of patients, the diagnosis of ILDs and, in particular, the differentiation of UIP patterns and NSIP patterns by machine learning algorithms could be feasible. The present meta-analysis of the imaging features of UIP and NSIP will also provide a necessary foundation for the further development of these algorithms, ultimately improving the diagnosis of ILDs and patient care. For instance, the ORs might be included in a Bayesian model, which would provide a probability-based diagnostic approach for UIP.
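A probability-based approach of this kind could be sketched as follows. Note that this example does not use the pooled ORs directly; instead, it derives likelihood ratios from the median feature frequencies reported in this meta-analysis and updates the odds of UIP versus NSIP in a naive Bayes fashion. The prior probability of 0.5, the assumption that the features are conditionally independent, the restriction to a two-way choice between UIP and NSIP, and all names in the code are assumptions for illustration only; the sketch is not a validated diagnostic tool.

```python
# Median frequencies of each CT finding as (UIP, NSIP) fractions, taken from
# the pooled results of this meta-analysis.
FEATURE_FREQ = {
    "honeycombing": (0.734, 0.289),
    "peripheral_predominance": (0.833, 0.418),
    "subpleural_sparing": (0.043, 0.407),
}


def posterior_uip(findings, prior_uip=0.5):
    """Posterior probability of UIP (versus NSIP) given present/absent findings.

    findings: dict mapping a feature name to True (present) or False (absent).
    prior_uip: assumed pre-test probability of UIP (hypothetical default).
    Features are treated as conditionally independent (naive Bayes assumption).
    """
    odds = prior_uip / (1 - prior_uip)
    for feature, present in findings.items():
        p_uip, p_nsip = FEATURE_FREQ[feature]
        if present:
            odds *= p_uip / p_nsip              # positive likelihood ratio
        else:
            odds *= (1 - p_uip) / (1 - p_nsip)  # negative likelihood ratio
    return odds / (1 + odds)


p_honeycombing = posterior_uip({"honeycombing": True})
p_sparing = posterior_uip({"honeycombing": False, "subpleural_sparing": True})
print(f"P(UIP | honeycombing) = {p_honeycombing:.2f}")
print(f"P(UIP | no honeycombing, subpleural sparing) = {p_sparing:.2f}")
```

In a fuller model, age, sex and smoking history could enter as additional terms, and the prior would be set from the local case mix rather than fixed at 0.5.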

Limitations
Our analysis has several limitations. First, the total number of studies and patients included was small. Although numerous studies have been published in the field, only a few studies fit our rigid inclusion criteria. Publication bias could not be assessed because we only had 2 to 7 studies in each meta-analysis. Another limitation might be the reference standards; some patient populations were confirmed by biopsy and some were diagnosed by imaging and ILD board consensus alone. Although these reference standards are consistent with clinical guidelines, this heterogeneity in the selection criteria might be questioned, presenting an incorporation bias. Some variables showed severe heterogeneity, probably due to the known interreader variability in CT patterns and the use of slightly different reference standards for cases that were diagnosed by an ILD board or histology. The different levels of experience of the radiologists in each study may also have influenced the results: in the available method descriptions, the median experience in chest imaging of the radiologists involved was 15.9 years, ranging from 10 to 22 years. The UIP and NSIP patients were analyzed by the same radiologists in all two-arm studies, which may have helped counteract this effect. Furthermore, the fact that some studies included secondary forms of fibrosis while others adhered to the idiopathic forms may have confounded the results of the study, although this hypothesis was not statistically confirmed.
The included studies did not investigate end-stage lungs affected by NSIP. In patients with terminal NSIP, honeycombing is almost always present, making discrimination from UIP nearly impossible. However, in end-stage lung disease, a radiological diagnosis and clinical options are very limited.

Conclusions
In conclusion, this meta-analysis provides an overview of the main clinical features and CT patterns that discriminate between UIP and NSIP. Specifically, the honeycombing pattern is still the most specific factor discriminating between UIP and NSIP.