Recurrent Genomic Gains in Preinvasive Lesions as a Biomarker of Risk for Lung Cancer

Lung carcinoma development is accompanied by field changes that may have diagnostic significance. We have previously shown the importance of chromosomal aneusomy in lung cancer progression. Here, we tested whether genomic gains in six specific loci, TP63 on 3q28, EGFR on 7p12, MYC on 8q24, 5p15.2, and centromeric regions for chromosomes 3 (CEP3) and 6 (CEP6), may provide further value in the prediction of lung cancer. Bronchial biopsy specimens were obtained by LIFE bronchoscopy from 70 subjects (27 with prevalent lung cancers and 43 individuals without lung cancer). Twenty-six biopsies were read as moderate dysplasia, 21 as severe dysplasia and 23 as carcinoma in situ (CIS). Four-micron paraffin sections were submitted to a 4-target FISH assay (LAVysion, Abbott Molecular) and reprobed for TP63 and CEP3 sequences. Spot counts were obtained in 30–50 nuclei per specimen for each probe. Increased gene copy number in 4 of the 6 probes was associated with increased risk of being diagnosed with lung cancer both in unadjusted analyses (odds ratio = 11, p<0.05) and after adjustment for histology grade (odds ratio = 17, p<0.05). The most informative 4 probes were TP63, MYC, CEP3 and CEP6. The combination of these 4 probes offered a sensitivity of 82% for lung cancer and a specificity of 58%. These results indicate that specific cytogenetic alterations present in preinvasive lung lesions are closely associated with the diagnosis of lung cancer and may therefore have value in assessing lung cancer risk.


Introduction
Early diagnosis of lung cancer is thought to lead to improved survival. Yet, fewer than 25% of patients are diagnosed at clinical stage 1, where expected survival is around 70% at 5 years. This survival rate is much higher than overall survival in advanced disease, estimated at 15% at 5 years. Early diagnosis of lung cancer remains a major challenge. The onset of the disease process is extremely slow (months to years) and no means of evaluating its rate of progression are available.
A variety of lung cancer screening techniques have been studied to determine their utility in early-stage disease. These include chest X-ray, sputum cytology and molecular biomarkers in various biological specimens. None of these early detection strategies has been shown to reduce cancer-related mortality. Low-dose spiral computed tomography (CT) may provide an accurate picture of the anatomic extent of early lung carcinoma [1]. Yet, although low-dose CT is appealing as an early detection strategy [2,3,4,5,6], the results of randomized, controlled studies are not yet known. In addition, most preinvasive lesions in the central airways will remain undetected by chest CT.
Molecular detection strategies from airway specimens are challenging because of the relatively difficult access, paucity of tissue and lack of molecular changes predictive of cancer. While the molecular biology of lung cancer has been extensively studied, no reliable diagnostic molecular correlates exist [7]. Lung cancer development is characterized by sequential accumulation of epigenetic and genetic aberrations in somatic cells [8]. These aberrations include single nucleotide point mutations, changes in chromosome copy number [9,10], and specific genomic amplifications or deletions that are implicated in the pathogenesis of lung tumor development and progression through the activation of oncogenes and inactivation of tumor suppressor genes [11,12,13,14,15,16].
Because not all preinvasive lesions develop into invasive tumors, it is critical to identify the molecular determinants that drive progression to an invasive phenotype [16]. Fluorescence in situ hybridization (FISH) is emerging as a potentially useful clinical tool for the assessment of diagnosis, prognosis, and response to therapy in lung cancer [17,18,19,20,21,22]. Chromosomal aneuploidy has been found to be closely associated with the diagnosis of lung cancer. Recently we tested gain in copy number of two out of four selected DNA targets, taken as a reflection of genomic instability and a marker for risk of lung cancer development [10,23]. In the present study, we hypothesized that a selected set of cytogenetic alterations in preinvasive lesions may be a better predictor of lung cancer. Therefore, we determined whether the results of the cytogenetic analysis were associated with disease progression in the selected individuals. We selected six DNA targets commonly amplified in lung cancer [12,14,15,24,25], including two centromeric probes (CEP3 and CEP6) and four probes to areas of frequent genomic amplification, i.e. 3q28 (TP63) [11,13], 5p15.2 (D5S23 and D5S721 markers) [26], 8q24 (MYC) [27], and 7p12 (EGFR) [28]. With those validated FISH probes, we performed a quantitative evaluation of genomic locus copy number per nucleus in preinvasive lesions and of its association with a diagnosis of invasive lung carcinoma.

Patient population characteristics
The population included 70 subjects recruited at the University of Colorado Cancer Center (UCCC), the British Columbia Cancer Agency (BCCA) and the University of Iceland Hospitals (UIH). This population represents a subgroup of patients previously investigated [23] and includes all subjects from that study with diagnosed moderate dysplasia (4 with lung cancer, 22 controls), severe dysplasia (6 with lung cancer, 15 controls) or carcinoma in situ (17 with lung cancer, 6 controls), for whom bronchial sections were available.
The subjects studied were all considered to be at high risk for lung cancer based on a history of at least 30 pack-years of smoking and spirometric evidence of airflow obstruction documented by an FEV1/FVC ratio of less than 75% and an FEV1 of less than 70% of predicted. Former smokers were defined as having quit at least one year before the time of enrollment. Pack-years were defined as the average number of packs smoked per day multiplied by the number of years smoked. Flexible fiberoptic bronchoscopy was performed with both autofluorescence and white light examination of the airways using either a Xillix LIFE II or OncoLIFE system at the UCCC and BCCA sites; white light examination alone was performed at the UIH. The BCCA cases had been diagnosed in a prospective study of early lung cancer using autofluorescence and white light bronchoscopy, or the subjects were enrolled as part of two National Cancer Institute sponsored chemoprevention trials. The study population included 27 patients with a clinical diagnosis of invasive carcinoma (prevalent cases) and 43 subjects who remained free of invasive tumor for at least one year of follow-up (controls). Detailed questionnaire data derived from personal interview were available on all study subjects, including demographic characteristics and smoking history. The study was approved by the local Institutional Review Boards of Vanderbilt University, the University of Colorado Health Sciences Center, the BCCA-University of British Columbia Clinical Research Ethics Board, the National Bioethics Committee of Iceland and the Icelandic Data Processing Commission.
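The enrollment criteria described above can be written out as a short calculation. The following is a minimal Python sketch, not the study's own software; the function names and argument names are hypothetical, and only the thresholds (30 pack-years, FEV1/FVC below 75%, FEV1 below 70% of predicted) come from the text.

```python
def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Average number of packs smoked per day multiplied by years smoked."""
    return packs_per_day * years_smoked


def meets_high_risk_criteria(pack_yrs: float,
                             fev1_fvc_pct: float,
                             fev1_pct_predicted: float) -> bool:
    """Apply the three stated criteria (hypothetical helper).

    pack_yrs            -- smoking history in pack-years (at least 30)
    fev1_fvc_pct        -- FEV1/FVC ratio in percent (less than 75)
    fev1_pct_predicted  -- FEV1 as percent of predicted (less than 70)
    """
    return pack_yrs >= 30 and fev1_fvc_pct < 75 and fev1_pct_predicted < 70


if __name__ == "__main__":
    # Illustrative subject, not taken from the study data.
    py = pack_years(packs_per_day=1.5, years_smoked=40)
    print(py, meets_high_risk_criteria(py, fev1_fvc_pct=68, fev1_pct_predicted=62))
```

All three conditions are combined with a logical AND because the text requires both the smoking history and both spirometric findings.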

Histology and Selection of Areas of Interest
All biopsies obtained at bronchoscopy were scored according to the most recent WHO classification [29]. Biopsies with diagnoses ranging from moderate dysplasia to carcinoma in situ were selected for FISH analysis. Diagnostic areas within individual biopsies were reviewed and imaged by a pathologist (WAF), who marked the areas of interest to be specifically examined by FISH.
FISH for CEP3, TP63 (3q28), D5S23, D5S721 (5p15.2), CEP6, EGFR (7p12), and MYC (8q24)
Four-micron paraffin sections were initially submitted to a 4-target FISH assay (LAVysion, Abbott Molecular, Des Moines, IL) including sequences encompassing the DNA markers D5S23 and D5S721 at 5p15.2, centromere 6, the EGFR gene at 7p12 and the MYC gene at 8q24, according to a protocol described elsewhere [23]. All sections analyzed by FISH were sequential to the respective H&E section. Individual nuclei were assessed for the number of fluorescent signals corresponding to the copy number of the gene of interest. Signals were counted as individual spots when they were separated by at least the size of one fluorescent signal. After the analysis of these 4 genomic regions, the same sections were stripped of their fluorescence signals in 70% formamide solution at 72°C for 10 minutes and then re-probed with a 2-target FISH assay including TP63 at 3q28 (BAC clones RP11-53D15 and RP11-373I6, digoxigenin-labeled, detected by FITC) and Spectrum Orange-CEP3 sequences (Abbott Molecular, Des Moines, IL). Immunochemical detection procedures were as described previously [30].
Hybridized slides were examined in fluorescence microscopes equipped with proper interference filters and coupled with a CytoVision Genetic workstation (Applied Imaging). Over the areas of interest marked by the pathologist, spot counts were obtained for 30-50 nuclei per specimen for each probe and representative images captured digitally. Considering that the bronchial sections had truncated nuclei, for each DNA target the specimen was defined as abnormal when the mean copy number per cell was greater than two.
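The abnormality rule stated above (mean copy number per cell greater than two) can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the scoring software used in the study: the function name, the returned field names, and the example counts are hypothetical. The sketch also reports the proportion of nuclei with more than two signals, a secondary summary.

```python
from statistics import mean


def summarize_probe(spot_counts, threshold=2.0):
    """Summarize per-nucleus spot counts for one probe in one specimen.

    spot_counts -- list of signal counts, one per scored nucleus
                   (the study scored 30-50 nuclei per specimen and probe)
    threshold   -- copy-number cutoff; the text uses a mean greater than two
    """
    if not spot_counts:
        raise ValueError("at least one scored nucleus is required")
    mean_copies = mean(spot_counts)
    fraction_gain = sum(1 for c in spot_counts if c > threshold) / len(spot_counts)
    return {
        "n_nuclei": len(spot_counts),
        "mean_copies": mean_copies,
        "fraction_gt_threshold": fraction_gain,
        "abnormal": mean_copies > threshold,
    }


if __name__ == "__main__":
    # Hypothetical counts for 30 nuclei: 20 with two signals, 10 with three.
    print(summarize_probe([2] * 20 + [3] * 10))
```

Because sectioning truncates nuclei and tends to lower observed counts, a mean above two is a conservative signal of gain under this rule.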
Statistical analysis
We focused first on the distributions of demographic factors, such as age, sex, smoking status (current or ex-smoker) and histologic grades. Next, we examined the differences in mean copy number of each FISH marker based on the histology. Multiple comparison tests using a bootstrap technique were performed to assess significance. Third, the associations between FISH markers and cancer status were assessed individually and in multiplicity models after controlling for clinical and biological parameters, such as smoking status and histology grade, using multiple logistic regression. Associations were reported as odds ratios with corresponding 95% confidence intervals (CI). A mean number of ≤2 copies per nucleus was considered as the FISH reference value. Finally, we used ROC analysis (c-statistics) to investigate the contributions of marker groups in combined models to differentiate cases and controls. All analyses were carried out in Statistical Analysis Software (Version 9.1, SAS Institute Inc, Cary NC). The comparisons of areas under the ROC curves between models were examined using the "ROC Macro" SAS tool [31,32].
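For readers unfamiliar with the c-statistic, the sketch below shows one standard way to compute it: as the proportion of case-control pairs in which the case receives the higher model score, with ties counted as one half. This is a plain Python illustration, not the SAS procedure used in the study; the function name and the example scores are hypothetical.

```python
def c_statistic(case_scores, control_scores):
    """Area under the ROC curve computed by pairwise comparison.

    Each (case, control) pair is concordant if the case has the higher
    predicted score; tied pairs contribute one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical predicted probabilities from a fitted model.
    print(c_statistic([0.8, 0.4], [0.6, 0.2]))
```

A value of 0.5 corresponds to no discrimination between cases and controls, and 1.0 to perfect separation.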

Results
Patient clinical information and pathological characteristics of the lesions are summarized in Table 1. There was no statistical difference for age, gender, study center and current smoking status between cases and controls. However, the average smoking intensity was greater among the cases (pack year (PKY) mean = 80.1, SD = 46.1) compared with the controls (mean PKY = 56.6, SD = 24.3). Similarly, the distribution of preinvasive lesions was skewed towards higher grades in patients with cancer.
The mean copy number per cell of selected genomic candidate biomarkers is reported by case and control status for different grades of preinvasive lesions in Table 2, showing a tendency for copy number to increase as the histology grade advances. While the association with histology grade was significant for TP63, MYC, CEP6, and CEP3 (p<0.01), it was not for EGFR and 5p15.2. When these relationships were analyzed according to the proportion of cells that contained more than two copies of each individual marker, we observed associations limited to these same 4 markers. Interestingly, amplification of sequences was detected in a total of 8 specimens, for four of the tested targets: TP63, 5p15.2, EGFR and MYC. Amplification was represented by small to medium size clusters of signals (EGFR) or by more than 50% of cells carrying more than 5 copies of the signals (MYC, 5p15.2 and TP63). Representative images are presented in Figure 1.
The percentage of lesions abnormal for each FISH marker (according to case or control status), with abnormality defined as mean copy number per cell greater than two, is presented in Table 3. The presence of a malignancy in the airways was only moderately associated with the rate of copy number abnormalities except for MYC, which was more frequently amplified in preinvasive lesions of patients with lung cancer.
Because access to normal bronchial epithelium from the same individuals was not possible, moderate dysplasia was used as the reference baseline to measure the association between copy number abnormalities and lung cancer status. As shown in Table 4, the odds of having lung cancer given that a preinvasive lesion had gain for genomic regions increased from 4.23 (95% CI, 1.21-14.8) when 1 or 2 markers of 4 (CEP3, TP63, CEP6, MYC) were abnormal to 11 (95% CI, 2.63-45.9) when 3 or 4 markers showed elevated copy numbers. Further adjustment for age, gender, center, current smoking status and histology grade of the lesion increased the odds ratio to 17.
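The reported intervals can be sanity-checked with a short calculation. The Python sketch below assumes, as is common for logistic regression output but not stated in the text, that the 95% confidence intervals are Wald intervals that are symmetric on the log-odds scale. Under that assumption the odds ratio is the geometric mean of the two bounds, and the standard error of the log odds ratio follows from the interval width. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def implied_or_and_se(lower: float, upper: float, z: float = Z_95):
    """Recover the point estimate and log-scale SE implied by a Wald CI.

    Assumes the interval is exp(log(OR) +/- z * SE), an assumption about
    how the interval was built rather than a detail given in the text.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2.0)   # geometric mean of bounds
    se = (log_hi - log_lo) / (2.0 * z)
    return point, se


if __name__ == "__main__":
    for label, lo, hi in [("1-2 markers", 1.21, 14.8), ("3-4 markers", 2.63, 45.9)]:
        point, se = implied_or_and_se(lo, hi)
        print(f"{label}: OR ~ {point:.2f}, SE(log OR) ~ {se:.2f}")
```

The recovered point estimates are approximately 4.23 and 11, in agreement with the reported odds ratios. The wide intervals reflect the small number of subjects in each category.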
When assessed as a candidate biomarker signature predictive of lung cancer, the sensitivity of presence of abnormality in those 4 markers was 82% and specificity was 58%. The receiver operating characteristic (ROC) curves shown in Figure 2 demonstrate the added value of histology and epidemiological information, ultimately achieving an area under the curve of 92.6%. The demographic information represents gender, age, pack years of smoking history, and smoking status. The differences between the curves were significant between demographics vs. demographics and cytology (p = 0.02) or vs. demographics, cytology and 4 FISH biomarkers (p = 0.002). Although showing a trend, the difference was not significant between demographics and cytology vs. demographics, histology and 4 FISH biomarkers (p = 0.11).
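Sensitivity and specificity, as reported above, are simple proportions from a two-by-two table of marker status against cancer status. The Python sketch below shows the calculation. The counts in the usage example are hypothetical and chosen only to give round numbers; they are not the study's data, and the function name is likewise an illustration.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity) from two-by-two table counts.

    tp -- cases with an abnormal marker signature
    fn -- cases with a normal signature
    tn -- controls with a normal signature
    fp -- controls with an abnormal signature
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts (50 cases, 50 controls), not taken from the study.
    sens, spec = sensitivity_specificity(tp=41, fn=9, tn=29, fp=21)
    print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

A signature with high sensitivity and modest specificity, as here, misses few cancers but flags many subjects who do not have lung cancer at the time of testing.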

Discussion
Molecular approaches for the early detection of lung cancer have targeted the blood, the sputum, the exhaled breath and bronchial biopsies [7,33]. Preinvasive bronchial lesions are well-established markers of risk, and yet the histological grade does not necessarily predict the outcome [16,34,35]. Our goal was to determine whether specific genomic alterations in preinvasive lesions may be predictive of having a lung cancer in high-risk individuals. While genomic instability was addressed previously based on a quantitative analysis [23], a more refined molecular signature is expected to better associate with the diagnosis of lung cancer. Our results indicate that specific cytogenetic alterations present in preinvasive lung lesions, such as amplification or overrepresentation of the TP63 and MYC genes, are highly associated with the diagnosis of lung cancer and therefore suggest a role of those markers in assessing lung cancer risk. Unlike p53, TP63 is rarely mutated in lung cancer, but a significant fraction of tumor and premalignant lesions are amplified for both TP63 and MYC genes.
Although chromosomal alterations have been linked to most solid tumors and serve as a hallmark of human cancer [36], they are becoming increasingly complicated; i.e., major patterns are being diluted by many variants [10,37,38]. Genomic alterations in the airway epithelium occur both stochastically and later in a clonal manner. Clonal (identical alteration in 2 or more cells) and non-clonal alterations associated with smoking history participate in the tumor initiation process. Some alterations may indicate risk better than others. Thus far, the majority of these alterations have been considered consequences rather than causes of lung cancer development. Some of these low-level aberrations have been called random noise but may reflect a true measure of instability. Non-clonal chromosomal alterations (NCCAs) such as defective mitotic figures, chromosomal fragmentation, missegregation or non-recurrent genomic alterations indicate a dynamic process leading to instability [39]. In this study, we found that the high frequency of non-coding centromeric alterations (CEP3 and CEP6) was independently associated with a diagnosis of lung cancer (Table 2); therefore, although not specifically linked to tumorigenesis, it is probably part of the genomic instability that accompanies the disease process. In contrast, specific genomic amplifications or deletions [12,15,40] have been implicated in the pathogenesis of tumor development, in part through the activation of oncogenes [41] and inactivation of tumor suppressor genes. Some genomic signatures seem to persist after tumor development, throughout tumor progression [42] and histological differentiation. Preinvasive lesions have shown copy number alterations for several chromosomal regions including 3q28, 5p15 and 8q24 [10,13,23,43,44], and these alterations were also found in the present study.
Genomic imbalances have been extensively investigated in invasive NSCLCs using CGH or SNP array methodology and numerous loci were found commonly amplified or over-represented in either or both squamous and adenocarcinoma of the lung [14,15,24,25,45,46,47,48,49]. If some of them are demonstrated to be early events, this may improve the test performance in the context of diagnosis of lung cancer.
We hypothesized that genomic alterations in TP63, 5p15.2, EGFR, MYC, CEP6 and CEP3 may provide a measure of risk assessment. Increased copy number (over-representation or amplification) is technically more reliable to detect, with fewer false positives, than genomic loss, particularly when using a FISH assay in sectioned specimens exhibiting nuclear truncation; we therefore focused on loci that were involved in genomic gain. The cutoff for "normal" copy number was set at ≤2 copies per nucleus based on the fact that normal disomic cells have two copies of each genomic target and that these cells were sectioned at 4 µm, which generated truncated nuclei. Although this value may not be the optimal cutoff, it is conservative and assures that samples classified as exhibiting genomic gain are actually abnormal.
TP63 is an appealing target located at the tip of one of the most prevalent regions of amplification in lung cancer. TP63 is a homologue of p53, which plays a role in development and oncogenesis by regulating proliferation and differentiation. Interest in TP63 stems from this "two genes in one" concept, with agonist and antagonist properties that may be involved in tumor development [50]. TP63 is a complex gene that has multiple transcriptional isoforms, some of which are tumor suppressors (the TATP63 isoforms), while the others are oncogenes (ΔTP63; ΔNTP63) [11]. The TATP63 isoforms can bind to DNA through p53-responsive elements and are therefore "p53-like". The ΔNTP63 isoform exerts dominant negative effects over p53 and is proposed to be oncogenic. We found that there is an early and frequent genomic amplification of TP63 in the development of squamous carcinoma of the lung and that patients with NSCLC showing amplification and overexpression of TP63 have prolonged survival [13]. The ΔNTP63α splice variant is the most commonly expressed isoform in squamous epithelia [11,13]. The oncogenic activity of the TP63 isoforms may explain why we see the amplification and overexpression of this protein. MYC is also an important oncogene in lung cancer. It is expressed in a large number of NSCLCs [51]. Gene amplification at 8q24 and the resultant increased expression of MYC are common in carcinomas [48,52,53]. This leads to increased formation of the MYC:Max heterodimer transcription factors, which alter gene expression in large part by recruiting histone-modifying enzymes [47]. Our data suggest that the changes observed for 5p15 and EGFR were less predictive of cancer, but these changes seemed to happen earlier in the dysplastic process, at least in smokers. Among 20 bronchial specimens with normal histology from never smokers, none had elevated EGFR or 5p15 copy number. Average copy numbers for these 20 specimens were 1.77 (SD 0.53) for EGFR and 1.73 for 5p15 [10].
These observations are also consistent with the observation of frequent EGFR mutations (24-43%) found in the airway epithelium in the vicinity of tumors [54,55], and with the data showing frequent early events on 5p in squamous differentiation of lung cancer [55,56].
Our design included controls who did not present with lung cancer for at least 12 months after endobronchial biopsy. Since genomic instability occurs not only among cases but also among controls, some of these high-risk controls may eventually develop lung cancer, and our cross-sectional study design does not address this risk. Other limitations of the study include the nature of the tissues examined, a relatively small sample size (although the study of 70 fully annotated high-grade preinvasive lesions required three centers and is one of the largest reported to date), and the inability to study the progression of these samples over time.
The use of bronchial biopsies for assessment of lung cancer risk is unlikely to be of optimal clinical use, although it may be useful to predict future cancer in subjects who happen to undergo a biopsy showing high grade bronchial preneoplasia. This type of molecular analysis may have to move to surrogate tissue in the airways including the histologically normal airway epithelium. The small size of the preinvasive lesions and the potential therapeutic effect of biopsies make the evaluation of the progression rate of the aberrations in these tissues rather challenging. Cross-sectional studies allow the investigation of the association between alterations and disease state. Yet, to prove clinical utility, the candidate biomarkers will require further validation in prospective cohort studies.
The accuracy of our cytogenetic signature may be improved in different ways. Genome-wide analysis of copy number alterations in invasive and preinvasive lesions may allow the selection of regions more specifically associated with the diagnosis of lung cancer. Increasing the number of targets studied in small tissue samples is challenging, but newer technologies may help reach this goal. Ultimately, refining a genomic signature observed in preinvasive lesions that predicts who is likely to develop lung cancer would be very informative. In this context, repeated measurement of such alterations and of the rate of their accumulation may be particularly valuable in predicting the likelihood of developing lung cancer.
In this study we took advantage of advances in lung cancer molecular genetics and demonstrated that there is a strong association between targeted genomic alterations in preinvasive bronchial lesions and the diagnosis of lung cancer. These alterations can be reliably assessed by FISH, and may represent a method to measure the risk of developing lung cancer. The predictive value of these alterations deserves further evaluation in the airway epithelium of high risk individuals in longitudinal studies.