Can CT Screening Give Rise to a Beneficial Stage Shift in Lung Cancer Patients? Systematic Review and Meta-Analysis

Objectives To portray the stage characteristics of lung cancers detected in CT screenings, and to explore whether there is universal stage superiority over other methods for various pathological types, using available data worldwide in a meta-analysis approach. Materials and Methods EMBASE and MEDLINE were searched for studies on lung cancer CT screening in natural populations through July 2015 without language or other filters. Twenty-four studies (8 trials and 16 cohorts) involving 1875 CT-detected lung cancer patients were enrolled and assessed by QUADAS-2. Pathology-confirmed stage information was carefully extracted by two reviewers. Stage I or limited stage proportions were pooled using a random-effects model with the Freeman-Tukey double arcsine transformation. Results Pooled stage I cancer proportion in CT screenings was 73.2% (95% confidence interval: 68.6%, 77.5%), with a significant rising trend (Ptrend<0.05) from baseline (64.7%) to ≥5 repeat rounds (87.1%). Relative to chest radiograph and usual care, the increased stage I proportions in CT were 12.2% (P>0.05), and 46.5% (P<0.05), respectively. Pathology-specifically, adenocarcinomas (66%) and squamous cell lung cancers (17%) composed the majority of CT-detected lung cancers, and had significantly higher stage I proportions relative to chest radiograph (bronchioloalveolar adenocarcinomas, 80.9% vs 51.4%; other adenocarcinomas, 58.8% vs 38.3%; squamous cell lung cancers, 52.3% vs 38.3%; all P<0.05). However, the percentage of small cell lung cancer was lower using CT than other detection routes, and no significant difference in limited stage proportion was observed (6.8% vs 10.8%, P>0.05). Conclusion CT screening can detect more early stage non-small cell lung cancers, but not all of these detections are necessarily beneficial, as a considerable number are indolent cancers such as bronchioloalveolar adenocarcinomas. Still, current evidence is lacking regarding small cell lung cancers.


Introduction
With over one million annual fatal cases [1], lung cancer is incontestably the leading cancer burden worldwide. Historically, there has been long-standing interest in the early detection of pulmonary malignancy for the improvement of treatment effectiveness and prognosis [2]. Consequently, an extensive number of studies focusing on the efficacy and feasibility of mass screening have been or are being conducted. Among these efforts, the National Lung Screening Trial (NLST) is the first and only randomized controlled trial (RCT) to date demonstrating that lung cancer mortality can be reduced by conducting computed tomography (CT) mass screening [15]. Radical screening practices have been explored, accompanied by criticisms concerning high false positive rates, screening related anxiety, and cost-effectiveness [27], and questions remain about technical issues such as selection criteria and effective nodule management protocols [28]. However, before balancing all of these benefits and potential adverse effects, and before carrying out solutions to barriers in implementing a screening program [28], one fundamental issue that is the theoretical precondition of any screening benefit [30], namely the stage characteristics of CT screen-detected cancers and their stage superiority over other approaches, has been raised but not yet fully explored [20,29].
Previous studies were seldom designed, or were too small, to give a comprehensive view of the cancer stage distribution, as the numbers of cancer cases are generally small. The purpose of this systematic review is to shed light on the stage characteristics of subjects screened for lung cancers with CT, by using a meta-analysis approach to synthesize available data from existing screening programs. The analysis is organized in the following order. First, we attempted to address two basic questions: (a) How frequently are early stage cancers detected using CT, with regard to different populations and screening schemes? (b) Do the baseline and repeat screening rounds show different stage distribution patterns, and is there any changing trend over time as the screening program proceeds? After this descriptive analysis, we go on to evaluate: (c) To what extent can CT detect more early stage cancers than other cancer detection routes? Then, by pathology-specific analysis, we explore the most topical and important question: (d) Do the additionally detected early lung cancers represent a real stage shift, or simply a mirage of over-diagnosis [31] (according to Patz et al [32], more than 18% of the lung cancers detected by CT in the NLST were over-diagnosed indolent cancers compared with chest radiograph, and this rate was even higher, namely 30%-65% according to Young et al [33], in European trials when compared with usual care)?

Information Source and Search Strategy
Cohort (uncontrolled) or randomized controlled studies that reported the use of CT for lung cancer screening, in natural populations of any age, without language restriction, were considered in this systematic review and meta-analysis. To identify potential studies, an expert librarian was consulted in developing the search strategy, and two reviewers (ZW and YH) conducted a comprehensive electronic search in the MEDLINE and EMBASE databases from January 1990 (the year CT was introduced for lung cancer screening) through July 2015. The index words "lung cancer", "CT", "screening", and their synonyms (obtained by referring to their MeSH terms) with the field names "title" or "abstract", were used without language or other filters (a full record of the electronic search is available in S1 Text). Reference lists of the articles already identified were further reviewed for related articles. Authors and their affiliations, and the names of the identified screening programs, were checked to obtain up-to-date results.

Eligibility Criteria and Study Selections
Lung cancer screening studies using CT were considered which: (a) were conducted in a community based, hospital centered, or nationwide population; (b) reported at least ten lung cancers; and (c) provided pathologically confirmed stage information (by bronchoscopic, aspiration, surgical biopsies, etc.) for the baseline screen, repeat rounds or as a whole (as clinical staging overestimates early cancer proportions and will introduce heterogeneity in pooled estimates). Studies were excluded if they: (a) focused on special occupational groups (exposed to asbestos, mineral dust, or nuclear fuel), or on patients such as those with HIV infection or tuberculosis; (b) took CT as a subsequent examination (e.g. CT screening in chest radiograph negative subjects); or (c) failed to obtain pathological diagnoses for the screen detected lung cancers after the authors had been contacted twice (only one author of the five studies concerned responded, and could not provide this information). Two reviewers (ZW and YH) performed the study selection and reached perfect agreement on study enrollment when comparing the studies against the explicit eligibility criteria listed above.

Data Extraction and Quality Assessment
Using a data collection form that was piloted and refined on five randomly selected studies, basic study information (author, initial screening year, design), population characteristics (actual number screened, age range, male proportion, smoking status criteria), screening protocol (maximum number of rounds, threshold for further examination and biopsy, and other criteria for suspicious nodules), individual patient data (gender, age and nodule size at the time of detection), and the outcome of interest (stage and pathology-specific stage information concerning both lung cancer patients and nodules, in both baseline and repeat rounds) were carefully extracted by two reviewers (ZW and YH, shown in S2 Table), first individually and then by discussion to resolve discrepancies. Study quality assessment was carried out using the QUADAS-2 instrument (which measures the risks of bias and concerns of applicability in the domains of Patient Selection, Index Test, Reference Standard, and Flow and Timing) [34], independently of the data extraction process, and also by two reviewers (ZW and YH); a third opinion from LW was sought when discrepancies remained unresolved. Specific index questions are available in S3 Table.

Summary Measures and Statistical Analysis
As there have been no significant changes in the definition of stage I lung cancer in the TNM staging system from version five to seven [35], we used the pathological stage I proportion (IA and IB combined) as the main summary measure for representing early stage cancers (stages II-IV were not analyzed due to variations in the staging systems applied by the included studies). Small cell lung cancers were excluded from this calculation unless otherwise specified, but were all included and classified as 'limited' or 'extensive' stage for the pathology-specific analysis. For patients with two or more cancers, patient level data were used to assess the screening outcomes from the public health perspective, and nodule level data were used for the pathological analysis. Specifically, we maintained in this report the classification of bronchioloalveolar adenocarcinoma (BAC) as a special subgroup of adenocarcinomas, though this concept has been discontinued since 2011 in the IASLC/ATS/ERS Classification of lung adenocarcinoma [36], because all but one of the studies included in this analysis were initiated before that change and it was not practical to reclassify the BACs.
Fisher's exact probability method was applied to obtain the 95% confidence interval (95% CI) for proportions. The Freeman-Tukey double arcsine transformation was applied in consideration of the fact that some proportions were close to the margins [37]. As heterogeneity in diagnosis/screening studies was perceived to be high (and was also tested using the I² index), a random-effects model was used to obtain conservative pooled estimates, and subgroup analyses of heterogeneity sources concerning study level information (region, design, quality, population age and smoking criteria) and patient level information (gender, age, and nodule size) were conducted. Multi-level logistic regression was used for the patient level data analysis. Additionally, Egger's test was performed to explore any small-study effect/publication bias. P values smaller than 0.05 were considered significant. All statistical analyses were planned by a senior statistician (JJ) and performed by YW and WH using STATA 13.0 software.
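The pooling procedure described above can be illustrated with a minimal Python sketch. It is not the authors' STATA code: it assumes the sum form of the Freeman-Tukey double arcsine transformation with variance 1/(n + 0.5), a DerSimonian-Laird estimate of the between-study variance, and Miller's back-transformation evaluated at the harmonic mean of the study sizes. The function names and the study counts are hypothetical and serve only as an illustration.

```python
import math

def ft_transform(events, n):
    """Freeman-Tukey double arcsine transform (sum form) and its variance."""
    t = (math.asin(math.sqrt(events / (n + 1)))
         + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (n + 0.5)

def ft_inverse(t, n):
    """Miller's back-transformation from the transformed scale to a proportion."""
    t_min = math.asin(math.sqrt(1.0 / (n + 1)))              # value at 0 events
    t_max = math.asin(math.sqrt(n / (n + 1))) + math.pi / 2  # value at n events
    if t <= t_min:
        return 0.0
    if t >= t_max:
        return 1.0
    s = math.sin(t)
    inner = 1.0 - (s + (s - 1.0 / s) / n) ** 2
    sign = 1.0 if math.cos(t) >= 0 else -1.0
    return 0.5 * (1.0 - sign * math.sqrt(max(inner, 0.0)))

def pool_random_effects(studies):
    """DerSimonian-Laird pooling of (events, n) pairs on the transformed scale."""
    k = len(studies)
    ts, vs = zip(*(ft_transform(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]
    fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sum(w)
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, ts))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0       # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]
    pooled_t = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    n_harm = k / sum(1.0 / n for _, n in studies)              # harmonic mean of n
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0   # I² in percent
    return {
        "pooled": ft_inverse(pooled_t, n_harm),
        "ci_low": ft_inverse(pooled_t - 1.96 * se, n_harm),
        "ci_high": ft_inverse(pooled_t + 1.96 * se, n_harm),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical (stage I cases, total cancers) per study; not data from this review.
example = [(60, 100), (45, 50), (150, 200), (20, 40)]
print(pool_random_effects(example))
```

Pooling on the transformed scale keeps the variance nearly independent of the proportion itself, which is why the transformation is preferred when some study proportions lie close to 0 or 1.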

Inclusion of studies and basic results
The search strategy initially yielded 3839 articles. Exclusion and selection processes are detailed in Fig 1. After de-duplication (n = 1285) and exclusion (n = 2511), a total of 24 CT screening studies from 43 reports (reporting on different screening rounds) were enrolled, including 16 cohort studies and eight RCTs that compared CT with chest radiograph (n = 2) or usual care (n = 6). These 24 programs (involving 105,007 CT screening participants and 1875 detected lung cancer patients) covered a wide geographic distribution, namely Asia (Japan, Israel, South Korea, and China), Europe (Germany, Spain, Italy, Netherlands, Belgium, Denmark, and Poland), and North America (United States and Canada). The basic characteristics of the included studies are outlined in Table 1. Study quality was generally acceptable (ten ranked as high, 12 as moderate and only two as low quality for the study purpose; specific rankings are in S4 Table).
The summary estimate for the proportion of pathological stage I cancers was 73.2% (95% CI: 68.6%, 77.5%), with a significant increase from the baseline screens to the repeat rounds (70.1% and 75.6%, respectively; P<0.05). Egger's test did not reveal any small-study bias in the baseline (P = 0.698), repeat rounds (P = 0.711), or summary result (P = 0.613). However, because the studies differed in many characteristics, all of these estimates showed substantial heterogeneity (all I² > 50%).
Further subgroup analyses identified some potential heterogeneity sources (Tables 2 and 3). In the study level analysis, lower proportions of stage I cancer were reported in studies that adopted an RCT design (compared with uncontrolled cohorts, 66.8% vs 77.0%), in studies that were conducted in North America (63.0%, compared with 73.0% in Europe and 83.5% in Asia), in studies whose populations were restricted to those aged ≥50 years (compared with no age limits, 69.8% vs 80.2%), and in studies that were conducted only in smokers/ex-smokers (compared with no smoking status limits, 70.5% vs 83.5%), all P<0.05. No heterogeneity was observed between studies that were initiated before and after the year 2000, or among studies that were rated as high, moderate or low quality. Generally, separate subgroup analyses on baseline and repeat rounds showed similar results, though the statistical inferences were unstable due to smaller sample sizes.
Six studies [4-6,9,19,20] provided individual patient data that made a more detailed patient level subgroup analysis possible (Table 3). There were decreasing trends of stage I cancer proportion with age and nodule diameter at the time of detection, from 85.9% for those aged <50 years to 74.2% for those aged ≥60 years (P trend = 0.0509), and from 82.3% for nodules <10 mm to 62.4% for nodules ≥15 mm (P trend = 0.0120). Females (82.1%) showed a higher stage I cancer proportion than males (71.8%), but this difference was not statistically significant (P = 0.1795).
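As an illustration of the Egger's test reported above, the following minimal Python sketch regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero points to funnel plot asymmetry. This is a sketch under stated assumptions rather than the authors' STATA procedure: the function name and the input values are hypothetical, and the returned t statistic would still need to be compared against a t distribution with k - 2 degrees of freedom to obtain a P value.

```python
import math

def egger_intercept(effects, std_errors):
    """Egger's regression: standardized effect on precision, ordinary least squares.

    Returns (intercept, standard error of the intercept, t statistic).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (k - 2)
    se_int = math.sqrt(resid_var * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat

# Hypothetical transformed effects and standard errors; not data from this review.
effects = [2.05, 1.90, 2.20, 1.75, 2.10]
std_errors = [0.10, 0.14, 0.07, 0.16, 0.09]
print(egger_intercept(effects, std_errors))
```

With Freeman-Tukey transformed proportions, the standard error depends only on the study size, so the regression effectively asks whether smaller studies report systematically different stage I proportions.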

Round specific changing trends over time in CT screens
To describe round-specific changes as the screening programs progressed, we pooled the results from six studies [11,15,18-20,23] that provided detailed stage information for each screening round (Fig 2). Although the proportion of early cancers in single studies fluctuated over time, the pooled estimate showed an almost steadily rising trend from baseline screening (64.7%) to the fifth and higher repeat rounds (87.1%; P trend <0.05). This rising trend did not seem to be explained by the influences of patient age or nodule size, as data extracted from the ITALUNG study [19] (the single study that reported such round-specific patient information, in which a rising stage I cancer proportion of 75.0-80.0-87.5% in the three repeat rounds was reported) showed no similar change in either average age (67.5-66.0-67.0 years) or average nodule diameter (8.3-10.6-8.4 mm), both P>0.05.
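The text does not state which test produced the P trend value. One common choice for a trend in proportions across ordered groups is the Cochran-Armitage test, sketched below in Python as an assumption rather than as the authors' method. The function name, the equally spaced round scores, and the counts are all hypothetical.

```python
import math

def cochran_armitage(events, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions.

    events[i] of totals[i] cases are stage I in ordered group i.
    Returns (z statistic, two-sided P value from the normal approximation).
    """
    if scores is None:
        scores = list(range(len(events)))      # equally spaced scores 0, 1, 2, ...
    n_total = sum(totals)
    p_bar = sum(events) / n_total              # overall stage I proportion
    t_stat = sum(s * (x - n * p_bar) for s, x, n in zip(scores, events, totals))
    var = p_bar * (1 - p_bar) * (
        sum(n * s * s for s, n in zip(scores, totals))
        - sum(n * s for s, n in zip(scores, totals)) ** 2 / n_total
    )
    z = t_stat / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical stage I counts and total cancers per screening round; not study data.
stage1 = [65, 35, 30, 27]
cancers = [100, 50, 40, 30]
print(cochran_armitage(stage1, cancers))
```

A positive z indicates that the stage I proportion rises with the round score; a pooled, study-stratified analysis such as the one in Fig 2 would need a more elaborate model than this single-table sketch.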

Comparison between cancers detected using CT and other detection routes
As compared with chest radiograph (Fig 3), the proportion of stage I lung cancer increased by 12.2% using CT, based on pooled results from NLST and its pilot trial (LSS) (P>0.05). When comparing results in the usual care arms, this advantage increased to 46.5% based on three

Pathological distribution and pathology-specific stages
Adenocarcinoma was the most common lung cancer type (Fig 4), accounting for 39% of the cancers detected using chest radiograph in the NLST, and 43% in the control arms of other RCTs (NLST data are shown separately because it was the most extensive study and greatly influenced the pooled results). In the CT screens, these percentages increased significantly to 47% (NLST), 62% (other RCTs), and 66% (with cohort studies added) (all P<0.05 for comparison). Next was squamous cell lung cancer, whose absolute number increased when moving from the control arm to the CT arm in the NLST and other RCTs, but whose composition percentage fell because of the large increase in adenocarcinoma in RCTs other than the NLST. As to the most aggressive type, namely small cell lung cancer, the composition percentages shrank in both the NLST and other RCTs using CT (17% vs 13% [P<0.05], and 18% vs 7% [P>0.05], respectively). Pathology-specific differences in the proportion of early cancer between baseline and the repeat rounds are presented in Fig 5. Adenocarcinomas and squamous cell lung cancers predominated regarding the increase in the proportion of early cancer from baseline to the repeat rounds (75.6% to 80.6%, and 59.5% to 70.3%, respectively, both P>0.05), while in contrast the proportions for the infrequent pathological types (e.g. limited stage proportion of small cell lung cancer, from 55.6% to 43.8%) fell slightly in the repeat rounds (all P>0.05).
Data were only available from the NLST for comparison of the pathology-specific early stage cancer proportions between screening methods [15]. Significantly higher proportions of early stage adenocarcinomas (BACs: 80.9% vs 51.4%, P = 0.0006; other adenocarcinomas: 58.8% vs 38.3%, P<0.0001), squamous cell lung cancers (52.3% vs 39.0%, P = 0.0051) and some other non-small cell carcinomas were detected using CT than using chest radiograph (other non-small cell lung cancers combined: 40.1% vs 21.8%, P<0.0001); however, there was no such significant "shift" regarding the detection of early stage small cell lung cancers (6.8% vs 10.8%, P = 0.2277).

Discussion
For over half a century, attempts to reduce lung cancer mortality using screening modalities were based on the assumption that early detection and treatment of malignant pulmonary nodules leads to improved prognosis [2]. A number of reviews are now available regarding the summary of current knowledge on the mortality outcome in CT screening [38,39]. Because of the paucity of outcome evidence and difficulties in uniformly measuring treatment effectiveness, the present study focused on the topic of the first "detection" step and provided several deep insights regarding lung cancer stage. The results showed that >70% of non-small cell lung cancer patients detected using CT were at pathological stage I, and there is a tendency for this proportion to increase as screening continues. Relative to chest radiograph screening and usual care, the proportion of stage I cancer detected using CT was higher by more than 12% and 45%, respectively. Regarding pathology, almost all types of non-small cell lung cancer can increasingly be detected at an early stage in a CT rather than in a chest radiograph screening; however, evidence is lacking regarding this advantage in small cell lung cancer patients. As can also be inferred from the current study, different populations and study designs can to a great extent affect the stage characteristics of the detected lung cancers. Relative to patients in North America and Europe, the proportion of Stage I cancer was higher in Asian patients (83.5% in baseline and repeat summary), with three Japanese studies dominating the results. This could possibly be explained by the long history of lung screening in Japan [40], whose high proportion of early stage cancer observed complied well with the trend of long-term repeat screens (Fig 2). 
Studies that were restricted to populations aged ≥50 years or to smokers/ex-smokers had lower proportions of early stage cancer, potentially indicating a high proportion of late stage cancers in these subgroups; this is supported by the individual patient data subgroup analysis on age in this study, and by the finding from three Chinese cities with a high lung cancer incidence (stage I cancer accounted for as little as 34.3% of the CT detected lung cancers) [41]. In addition, differences in study designs and study qualities reflect variations in modality parameters, threshold for follow-up, scan intervals (ranging from twice a year in the ALCAP to annual/biennial in the MILD, and a 1-3-5.5 year scheme in the NELSON trial) [3,18,22], and some other aspects of varied protocols that could have an impact on detection performance. All of these differences in screening yields stress the necessity that a cancer screening service should be delivered in a culturally sensitive and methodologically sound manner to optimize public health benefits [26].
Notably, the changing pattern of early cancer proportion also holds some implications for screening practice. Although the absolute number of lung cancers detected decreased in the repeat round relative to the initial screening (the case in almost all of the included studies), the proportion of early cancers could rise as screening continues. This is especially true when the ideal status is reached where only the most recently developed cancers are left to be detected in long-term screening (e.g., approaching 90% in the fifth and higher repeat rounds; however, it should be noted that this estimate came from just one study). Thus, it is suggested that the "detect early to increase curability" vision can only be more ideally fulfilled when screening is continued over time, rather than as a one-off practice, which is effectively no more than a "prevalence survey".
Currently there are two opinions on the observed "stage shift" in CT screening as compared with other detection routes (e.g., the stage I cancer proportion was 46.5% higher than in usual care in this study, representing a more than 2.5-fold difference). Traditional lung cancer natural history theory holds that screening reveals many of the cancers that can be treated at an early stage before they become incurable (stage shift) [42]. However, the alternative opinion is less optimistic, suspecting that histologically indolent cancers make up the majority of "early detected" cancers (i.e. "histology shift" instead of real stage shift) [43], thus weakening the effectiveness of screening and resulting in unnecessary diagnosis and even over-treatment [32,43,44]. Therefore, analyses involving pathology could provide some insightful information regarding the discrepancy between "detecting cancer at an early stage" and "detecting early cancers" [45]. Veronesi et al reported that slow-growing cancer comprised about 25% of the CT detected lung cancers [46], and several similar studies found that about 80% of such indolent cancers were BAC or adenocarcinomas [43,46,47]. In the study by Vazquez et al, the majority (95%) of adenocarcinomas had a BAC component, and the 10-year survival rates after resection were as high as 90%-100% [48]. It has also been indicated that an increase in the recorded incidence of BAC in the past 30 years was partly attributable to CT scanning [49]. In an NLST subgroup, Young et al further revealed that BAC cancers were preferentially identified by CT screening and almost exclusively found in those with no airflow limitation [43]. In our pooled study, about 18.6% of adenocarcinomas (the most common type of cancer detected in CT screening, 66% of all cancers) were classified as BAC (consistent with a review article) [49]. All of these findings strongly suggest that indolent cancers such as BAC could have contributed a non-negligible portion of the additionally detected cancers.
With prostate and breast cancer screening as predecessors, caution should be taken when interpreting the impressive "detection improvement" statistics [50].
In addition, the need should be stressed to report BAC separately from other adenocarcinomas (for example, separate analyses were not performed in Figs 4 and 5 because nine studies did not distinguish between BAC and other adenocarcinomas), or to report according to the new classification system [36] the subtypes of cancers that have a favorable prognosis and were traditionally diagnosed as BAC (such as adenocarcinoma in situ, minimally invasive adenocarcinoma, and lepidic-predominant adenocarcinoma), because lumping them into overall adenocarcinoma means that their relevance to the true efficiency of screening programs may be lost.
In contrast to the non-small cell lung cancers, there is currently no evidence showing the superiority of early detection using CT for the more aggressive small cell lung cancer, and the most recent report from the MILD study showed that CT did not improve survival for such cancer types (no small cell lung cancer survivors after 3 years) [51]. The marked difference in early cancer detectability in CT screening between indolent and aggressive cancers could possibly provide one explanation for the phenomenon that early detection does not always relate to decreased mortality in small RCTs [19,20,22,23]. We note that one study, Lung-SEARCH, which aimed to demonstrate a stage shift towards early stage cancers, had just finished its follow-up by March 2016 [52], and we hope it will provide more histology-specific insights into this difference. Further, as volume doubling time can serve as a good way to monitor cancer behaviors [18,46], more studies on the heterogeneity of volume doubling time for different histological types of lung cancers (especially the non-small cell ones) are warranted for better defining cancer biology, predicting screening outcomes, and refining screening protocols.
Limitations remain in this study. First, though a random-effects model was applied to obtain conservative pooled results of the early stage lung cancer proportion, the averaged estimates could not cover all possibilities of CT screening findings, as the patients' backgrounds, CT techniques, and screening workflows (impractical to categorize for analysis) varied considerably across the included studies, especially when considering the diversity of cancers with different biological behaviors and future developments in screening techniques. Second, sources of heterogeneity were not fully explored, as indicated by the heterogeneity that persisted within most small subgroups in the study level analysis; the individual patient level data analysis can provide more insights with greater accuracy, but it was limited by the small number of available subjects. Third, though no small-study/publication bias was detected, language bias may exist, as all the included studies were in English or Chinese. Last, due to the lack of sufficient evidence on the ultimate outcome of screening programs (lung cancer mortality), and due to the inconsistency in the methods used to determine over-diagnosis [39], it is still impractical to directly estimate the degree of correlation between "stage shift" and mortality reduction.
In conclusion, CT has superiority over chest radiograph and usual care for detecting a higher proportion of early stage non-small cell lung cancers, including a number of indolent cancers such as BAC. However, evidence is currently lacking for the same beneficial stage shift of the more aggressive small cell lung cancers.
Supporting Information S1