MicroRNA-218 Is Deleted and Downregulated in Lung Squamous Cell Carcinoma

MicroRNAs (miRNAs) are a family of small, non-coding RNA species functioning as negative regulators of multiple target genes including tumour suppressor genes and oncogenes. Many miRNA gene loci are located within cancer-associated genomic regions. To identify potential new amplified oncogenic and/or deleted tumour suppressing miRNAs in lung cancer, we inferred miRNA gene dosage from high dimensional arrayCGH data. From miRBase v9.0 (http://microrna.sanger.ac.uk), 474 human miRNA genes were physically mapped to regions of chromosomal loss or gain identified from a high-resolution genome-wide arrayCGH study of 132 primary non-small cell lung cancers (NSCLCs) (a training set of 60 squamous cell carcinomas and 72 adenocarcinomas). MiRNAs were selected as candidates if their immediately flanking probes or host gene were deleted or amplified in at least 25% of primary tumours using both Analysis of Copy Errors algorithm and fold change (≥±1.2) analyses. Using these criteria, 97 miRNAs mapped to regions of aberrant copy number. Analysis of three independent published lung cancer arrayCGH datasets confirmed that 22 of these miRNA loci showed directionally concordant copy number variation. MiR-218, encoded on 4p15.31 and 5q35.1 within two host genes (SLIT2 and SLIT3), in a region of copy number loss, was selected as a priority candidate for follow-up as it is reported as underexpressed in lung cancer. We confirmed decreased expression of mature miR-218 and its host genes by qRT-PCR in 39 NSCLCs relative to normal lung tissue. This downregulation of miR-218 was found to be associated with a history of cigarette smoking, but not human papilloma virus. Thus, we show for the first time that putative lung cancer-associated miRNAs can be identified from genome-wide arrayCGH datasets using a bioinformatics mapping approach, and report that miR-218 is a strong candidate tumour suppressing miRNA potentially involved in lung cancer.

In lung cancer, studies have identified the tumour suppressing let-7 family and the oncogenic miR-17-92 cluster of miRNAs. The let-7 family is consistently downregulated in primary lung tumours and lung cancer cell lines, which abrogates let-7's control on RAS, allowing RAS overexpression, thereby contributing to lung carcinogenesis [6,7,17]. Significant overexpression of the oncogenic miR-17-92 cluster of miRNAs occurs in lung cancer cell lines and tumours [18], where miR-17-92 miRNAs are upregulated by oncogenic c-Myc and act as part of a regulatory network balancing cell death and proliferation with c-Myc and E2F1 [19,20,21].
MiRNAs demonstrate complex patterns of genomic organisation with both intergenic and intragenic miRNA genes [22,23]. Calin et al reported that most known miRNAs are in regions of genomic aberration associated with cancer [17]. In keeping with this, let-7 and miR-17-92 are linked to chromosomal deletions and gains respectively, indicating copy number variation as a potential mechanism for their dysregulation in lung tumours [17,21,24].
We reasoned that novel dysregulated lung cancer miRNAs can be identified by virtue of somatically acquired aberrant gene dosage, and that this allows the large number of publicly available, unbiased genome-wide arrayCGH datasets to be exploited to identify miRNAs involved in disease, even though the original studies were not specifically designed to interrogate miRNA loci. Using this strategy we report, for the first time, identification of miR-218 (hsa-miR-218; MIMAT0000275), located within regions of genomic loss (4p15.31 and 5q35.1), as a putative tumour suppressor in non-small cell lung cancer (NSCLC). MiR-218 was recently identified as a tumour suppressor in cervical cancer, where its downregulation was linked with human papilloma virus (HPV) [25]. Although we observed a significant reduction in miR-218 in subjects with a history of cigarette smoking, we found no relationship between HPV and miR-218 in lung SCCs.

Clinical Specimens
Resected primary NSCLC and corresponding normal lung tissue were obtained with informed written consent from patients undergoing lung resection at The Prince Charles Hospital between 1990 and 2007 (Human Research Ethics Committee 9124). Clinicopathological data were available for all specimens as previously published, including disease recurrence [26,27] and asbestos fibre burden [28].

ArrayCGH Detection of Chromosomal Aberration
ArrayCGH data for 132 curatively resected primary NSCLCs (training set) were provided by Dr JE Larsen (manuscript in preparation) (Table 1). The data were generated on the Agilent Human Genome CGH Microarray 44B (Agilent, G4410B) platform, with tumour samples hybridised against normal female DNA (Human Genomic DNA Female, Promega G152A) according to the manufacturer's instructions. Filtered, normalised signal log ratios between lung tumour DNA and normal female reference DNA were used for analyses.
For robust detection of DNA copy number aberrations, two independent bioinformatic approaches were used: i) Fold change (FC) of ±1.2 (high discovery sensitivity) in individual probe intensity using signal log ratios (Affymetrix, Statistical Algorithms Reference Guide); and ii) CGH Explorer Analysis of Copy Errors (ACE) algorithm [29] controlling significance and false discovery rate at <0.001 to avoid Type 1 error for multiple comparisons. Adenocarcinomas (AC) (n = 72) and squamous cell carcinomas (SCC) (n = 60) were analysed separately. Analysis of chromosome X was conducted in the female data set (AC n = 23 and SCCs n = 17). Chromosome Y, which does not contain any known miRNA genes, was not examined. Selected thresholds for all analyses are summarised in Table S1.
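The fold-change criterion can be illustrated with a minimal sketch. It assumes that the signal log ratios are base-2 (tumour over reference) and that losses are expressed as negative fold changes; the function names `signed_fold_change` and `call_probe` are hypothetical and are not taken from the study's own pipeline. The ACE step is not reproduced here.

```python
def signed_fold_change(log2_ratio: float) -> float:
    """Convert a log2 signal ratio into a signed fold change.

    Assumption: ratios are log2(tumour / reference), and a loss is
    reported as a negative fold change (log2 ratio of -1 gives -2.0).
    """
    if log2_ratio >= 0:
        return 2.0 ** log2_ratio
    return -(2.0 ** -log2_ratio)


def call_probe(log2_ratio: float, threshold: float = 1.2) -> str:
    """Label one probe as 'gain', 'loss' or 'neutral' by fold change alone."""
    fc = signed_fold_change(log2_ratio)
    if fc >= threshold:
        return "gain"
    if fc <= -threshold:
        return "loss"
    return "neutral"
```

Under these assumptions a probe needs a log2 ratio of roughly ±0.26 to cross the ±1.2 threshold, which is why the fold-change filter is described as the more sensitive of the two methods.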
MiRNA genes were classified as having copy number aberrations if both flanking arrayCGH probes were: i) located within 35 kb (average spatial resolution of the platform) of the miRNA or positioned within the miRNA's host gene; and ii) identified as concordant gain or loss by both bioinformatic methods (i.e. FC and ACE). These were then called as "significant" if both flanking probes were concordantly altered in ≥25% but discordant in <10% of primary tumours. This produced a list of miRNAs in regions of copy number variation.
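The selection rule for a single miRNA can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: each flanking-probe call is assumed to already reflect agreement between FC and ACE, "discordant" is interpreted here as one flanking probe gained and the other lost, and the names `FlankCall` and `classify_mirna` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlankCall:
    """Calls ('gain', 'loss' or 'neutral') for the two probes flanking
    one miRNA in one tumour. Assumed to be FC/ACE consensus calls."""
    left: str
    right: str


def classify_mirna(calls: List[FlankCall],
                   min_concordant: float = 0.25,
                   max_discordant: float = 0.10) -> Optional[str]:
    """Return 'gain', 'loss' or None for one miRNA across all tumours."""
    n = len(calls)
    if n == 0:
        return None
    gains = sum(1 for c in calls if c.left == c.right == "gain")
    losses = sum(1 for c in calls if c.left == c.right == "loss")
    # Assumed meaning of "discordant": the flanking probes disagree in direction.
    discordant = sum(1 for c in calls if {c.left, c.right} == {"gain", "loss"})
    if discordant / n >= max_discordant:
        return None
    gain_ok = gains / n >= min_concordant
    loss_ok = losses / n >= min_concordant
    if gain_ok and not loss_ok:
        return "gain"
    if loss_ok and not gain_ok:
        return "loss"
    return None  # below threshold, or ambiguous in both directions
```

A miRNA whose flanking probes are both lost in 3 of 10 tumours and never discordant would be returned as "loss"; the same miRNA with one discordant tumour in 10 would be rejected because the discordance rate is not below 10%.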

Validation of miRNAs from Independent Test Sets
We also interrogated published arrayCGH data from three independent primary NSCLC cohorts [33,34,35] (Table S2). Regions of chromosomal aberration were compiled and miRNAs in areas of gain or loss were identified using genomic positioning information as described above. This list of miRNAs was then directly compared with candidates derived from the TPCH (training) dataset to select those miRNAs within regions of aberration consistent in the training set and at least one test set.
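The comparison between training and test sets amounts to a direction-aware lookup, sketched below. The data structures and the names `validated_candidates` and `discordant_candidates` are hypothetical; each dataset is assumed to have been reduced to a mapping from miRNA name to 'gain' or 'loss'.

```python
from typing import Dict, List, Set


def validated_candidates(training: Dict[str, str],
                         test_sets: List[Dict[str, str]]) -> Dict[str, str]:
    """Keep training-set miRNAs whose direction of change is matched
    in at least one independent test set."""
    return {
        mirna: direction
        for mirna, direction in training.items()
        if any(test.get(mirna) == direction for test in test_sets)
    }


def discordant_candidates(training: Dict[str, str],
                          test_sets: List[Dict[str, str]]) -> Set[str]:
    """Report miRNAs called in a test set in the opposite direction."""
    return {
        mirna
        for mirna, direction in training.items()
        if any(mirna in test and test[mirna] != direction for test in test_sets)
    }
```

Reporting discordant calls separately makes it possible to check directly whether any candidate changes direction between cohorts.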

Prioritising miRNAs with Gene Dosage and Expression Concordance
To select candidate miRNAs with concordantly altered dosage and expression, we interrogated four public datasets of mature miRNA expression from independent NSCLC cohorts [6,7,8,36] (Table S3). Candidate miRNAs with changes in expression that were concordant with their inferred copy number changes were prioritised for further study.

Biological Verification of Aberrant miRNA Expression
Mature miRNA expression was measured using TaqMan qRT-PCR assays (Applied Biosystems) in 39 paired NSCLCs and normal lung (SCCs n = 18 and ACs n = 21) (Table 2). Total RNA was isolated from 30 mg of tissue using TRIzol (Invitrogen) and DNase treated (Ambion). MiRNA-specific reverse transcription for miR-218 and U6 snRNA (internal control) used stem-loop TaqMan primer/probe sets for real-time PCR (qRT-PCR) [37]. Triplicate TaqMan qRT-PCR assays were performed on a Corbett Research Rotor-Gene 6000 with reference total RNA

Correlating Dysregulated miRNA and Host Gene Expression
Host gene expression was measured by qRT-PCR using SYBR Green chemistry in the same cohort of 39 NSCLCs. DNase treated total RNA was reverse transcribed using Superscript III (Invitrogen). Primers were designed using Primer Express (Applied Biosystems) and are listed in Table S4. Standard curves were generated to determine the linear dynamic range and reaction efficiency. All reactions were performed in triplicate on a Corbett Research Rotor-Gene 6000 with melt curve analysis to confirm detection of a single amplicon, with Universal Human Reference RNA (Stratagene) as reference. Relative gene expression was calculated using the Pfaffl method [38] and normalised against internal controls (18S rRNA, ACTN4 and BAT1) [39].
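For readers unfamiliar with the Pfaffl method, the calculation can be sketched as below. The efficiency-corrected ratio is the standard Pfaffl formula; combining several reference genes by their geometric mean is an assumption made for this sketch, since the text does not state how the three internal controls were combined. Function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def efficiency_from_slope(slope: float) -> float:
    """Amplification efficiency from a standard-curve slope
    (Ct versus log10 input); a slope near -3.32 gives E close to 2."""
    return 10.0 ** (-1.0 / slope)


def pfaffl_ratio(e_target: float,
                 ct_target_calibrator: float,
                 ct_target_sample: float,
                 references: Iterable[Tuple[float, float, float]]) -> float:
    """Efficiency-corrected relative expression of a target gene.

    references: (efficiency, Ct in calibrator, Ct in sample) per
    reference gene. Assumption: reference terms are combined by
    geometric mean.
    """
    target_term = e_target ** (ct_target_calibrator - ct_target_sample)
    ref_terms = [e ** (cal - smp) for e, cal, smp in references]
    ref_norm = math.exp(sum(math.log(t) for t in ref_terms) / len(ref_terms))
    return target_term / ref_norm
```

With perfect efficiency (E = 2), a target that crosses threshold two cycles later in the sample than in the calibrator, against an unchanged reference gene, gives a ratio of 0.25, i.e. a four-fold reduction.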

HPV Testing
Presence of HPV DNA was assessed in a cohort of 72 lung SCCs accessed from TPCH Tumour Bank, including the 18 cases with miR-218 expression data. HPV testing was performed by Gribbles Pathology, Victoria, Australia, using the Genera Biosystems PapType RUO Kit, which detects 14 high-risk (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66 and 68) and two low-risk (6 and 11) HPV subtypes. This assay is based on two rounds of PCR amplification of HPV and control gene DNA (alkali myosin light chain protein) with addition of a fluorescent reporter dye, and utilises silica detection beads which fluoresce when hybridised to their specific target DNA.

Statistical Analysis
Statistical analyses were performed as above or with SPSS Version 13.0 (SPSS Inc Chicago, IL, USA) using Mann-Whitney U and Pearson correlations. p values <0.05 were considered statistically significant. Survival analysis was undertaken with log rank statistic of constructed Kaplan-Meier curves.

MiRNA Target Prediction and Pathway Analysis
Predicted mRNA targets of miR-218 were identified by using four online miRNA target prediction programs: PicTar (5 species conservation) [40], TargetScan 4.1 [41,42,43], miRBase Targets Version 5 [30,31,32] and miRNAMap 2.0 [44,45]. Overrepresented gene ontologies (GO) and functional classes for putative target genes of miR-218 predicted by two or more of the algorithms were identified using the Database for Annotation, Visualization and Integrated Discovery (DAVID, April 2008 release) gene-GO enrichment analysis (ranked by the Expression Analysis Systematic Explorer (EASE) score threshold) and gene functional classification (performed with the highest stringency setting) [46,47]. Ingenuity Pathway Analysis (Ingenuity® Systems, www.ingenuity.com) was performed on both the entire miR-218 target gene list and a prioritised list of miR-218 targets found to be enriched ≥2.0 by DAVID gene functional classification.
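Selecting targets predicted by two or more algorithms is a simple voting step, sketched below. The function name `consensus_targets` and the toy gene labels are hypothetical; the sketch assumes each tool's output has already been reduced to a list of gene symbols using a common naming scheme.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def consensus_targets(predictions: Dict[str, Iterable[str]],
                      min_tools: int = 2) -> Set[str]:
    """Return genes predicted as targets by at least `min_tools` programs.

    predictions maps a tool name to the gene symbols it predicts.
    Each gene is counted once per tool, even if the tool reports
    several target sites within that gene.
    """
    votes = Counter(
        gene for genes in predictions.values() for gene in set(genes)
    )
    return {gene for gene, n in votes.items() if n >= min_tools}


# Toy example with placeholder gene labels (not real predictions).
example = {
    "PicTar": ["GENE_A", "GENE_B"],
    "TargetScan": ["GENE_B", "GENE_B", "GENE_C"],
    "miRBase Targets": ["GENE_C", "GENE_D"],
    "miRNAMap": ["GENE_A"],
}
shared = consensus_targets(example)
```

Counting each gene once per tool matters because some programs report target sites rather than target genes, so a single gene can appear several times in one tool's output.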

MiRNAs are Located in Regions of Genomic Alteration in Lung Cancer
From TPCH arrayCGH data (training set), 89 of 474 (19%) miRNAs were within chromosomal regions deleted (41 miRNAs) or gained (48 miRNAs) in 132 primary NSCLC (Table S5). Within the SCC subtype, 77 miRNAs were associated with DNA copy number changes, with 40 miRNAs in areas of gain and 37 in regions of loss. Twenty-six miRNAs were identified for ACs, with 21 and five miRNAs in regions of gain and loss respectively. Fourteen miRNAs were located within regions of copy number variation common to both SCC and ACs (one associated with DNA loss and 13 associated with DNA gain). On the X chromosome, five and four miRNAs were found to be associated with regions of loss in ACs and SCCs respectively.

Validation of Candidate NSCLC miRNAs from Public Data Test Sets
MiRNA dosage. To validate candidate miRNAs, we repeated the positional mapping analysis using published independent arrayCGH test sets [33,34,35] and found 51 miRNAs, with 28 and 23 located within regions of gain and loss respectively (Table S6). Eighteen of these were among the 89 miRNAs identified by positional mapping to our arrayCGH training set (Figure 1). No miRNAs were found to be located in genomic regions of copy number variation that were discordant between our cohort (training set) and published arrayCGH cohorts (test sets).
MiRNA Expression. To enrich for miRNAs with concordant dysregulated expression and dosage, we reviewed the literature and found 80 mature miRNAs reported to have altered expression levels (44 increased, 32 decreased and four with conflicting reports) in primary NSCLCs [6,7,8,36]. Sixteen were represented in the 89 miRNA set, with 10 showing expression dysregulation concordant with the predicted miRNA gene copy number change. Seven miRNAs were among the 51 miRNAs identified from published arrayCGH studies, but only three had concordant miRNA expression and dosage between the public datasets. Two miRNAs, miR-218 and miR-216, were common to the training and all test sets; however, only miR-218 demonstrated concordant loss of copy number and expression (Figure 1). A further 38/89 candidate NSCLC miRNAs and 19/51 published arrayCGH NSCLC miRNAs have only recently been annotated in miRBase, and their expression has not yet been studied in lung cancer.

MiR-218 Expression is Reduced in NSCLCs Compared with Paired Normal Lung
To confirm decreased miR-218 dosage and expression in lung cancer, we measured mature miR-218 expression in 21 ACs and 18 SCCs from our arrayCGH training set and their paired normal lung. MiR-218 expression was down-regulated in 85% (33/39) of NSCLC tumours compared with paired normal lung. Statistically significant decreases were observed in both SCCs (mean FC = −4.4, p<1.0e-4) and to a lesser extent ACs (mean FC = −2.0, p = 0.001) (Figure 2a).

MiR-218, SLIT2 and SLIT3 Downregulation is Associated with SLIT2 and SLIT3 Gene Dosage
Next, we compared SLIT2 and SLIT3 copy number with miR-218, SLIT2 and SLIT3 expression to determine if their expression was due to gene dosage. The 15 ACs and 15 SCCs found to have concomitant reduced miR-218 and host gene expression also demonstrated copy number loss in SLIT2 and/or SLIT3. Complete concordance (loss of both host genes plus a decrease in miR-218 expression) between SLIT2 and SLIT3 copy number and miR-218, SLIT2 and SLIT3 expression, was observed in 9/19 (47.4%) ACs and 10/18 (55.6%) SCCs (Figure S2 and Table S8).
For one AC and two SCCs, miR-218 expression was increased despite reduced host gene copy numbers and expression.  Table 2) and HPV status not found to be associated with a reduction in miR-218 expression (Figure 3b).

Association of miR-218 with Clinicopathological Phenotypes
Asbestos. Altered miR-218 expression in NSCLC was also examined in relation to asbestos exposure. Asbestos fibre burden has previously been assessed for this cohort, and tumours with >20 asbestos bodies per gram wet weight lung tissue (AB/gww) were considered 'asbestos exposed' and those with 0 AB/gww were considered to be 'not exposed' [28]. Altered miR-218 expression was not found to be associated with asbestos exposure (Figure 3c) nor asbestos fibre burden (number of AB/gww lung tissue; data not shown).
Recurrence and Survival. Following the earlier publication by Larsen et al [26,27], recurrent NSCLCs were those that had recurred within 3-18 months post-surgical resection and nonrecurrent subjects were those who had remained disease free for a period of at least 36 months post-resection. No association was observed between miR-218 expression levels and disease recurrence or NSCLC survival post-resection (Figure 3d-e).

Putative Targets of miR-218 Support its Role in Carcinogenesis
We identified predicted target genes of miR-218 using four online miRNA target prediction algorithms. PicTar [40] and miRNAMap 2.0 [44,45] identified 575 and 688 target genes respectively, with 645 conserved sites and 133 poorly conserved sites. TargetScan 4.1 [41,42,43] and miRBase Targets V5 [30,31,32] reported 570 and 946 miR-218 target sites respectively, where multiple target sites could be present in the one gene. In total, 1794 individual target genes were identified by these four target prediction algorithms (578 identified by two or more of the algorithms) (available on request).
Gene ontologies and biological function of the 578 target genes were explored using DAVID (April 2008 Release) to determine biological relevance [46,47]. GO enrichment analysis revealed that biological processes such as cell adhesion, protein modifications and transport, development and cell signalling were overrepresented amongst miR-218 targets (p<0.001). Over-represented molecular functions (p<0.01) were predominantly focused on protein binding function, in particular to actin, cytoskeletal proteins, cyclic nucleotides, protein phosphatase 2A, cAMP, and metal ions. Similar to gene ontology analysis, functional annotation clustering reflected enrichment of groups of genes largely involved in cell adhesion (enrichment score = 4.66), protein modification relating to ubiquitin cycle (enrichment scores = 3.67 and 3.33) or kinase activity (enrichment score = 2.88), and regulation of transcription (including proto-oncogenes, enrichment score = 2.22) (Table S9).
Network analysis of the 578 target genes by Ingenuity Pathway Analysis identified 39 gene networks involved in biological functions such as cell-to-cell signalling, amino acid metabolism, post-translational modifications, cancer and respiratory system development and function (p<0.05) (Figure S3a and Table S10). Canonical pathway analysis found numerous cancer signalling pathways including Wnt/β-catenin signalling, ERK/MAPK signalling and Notch signalling (p<0.05) (Figure S3b). To attempt to define a more specific network or pathway associated with miR-218, we performed a focused analysis using 121 enriched genes (≥2.00) identified with functional annotation clustering in DAVID. This identified 10 gene networks that reflected similar cancer-related biological functions to those identified using the larger target gene list (Table S10). For instance, in the 'gene expression, cancer and cell morphology' network, miR-218 may target genes directly and indirectly linked to two oncogenes, MYC and SRC (Figure S4).

Discussion
We demonstrate that genomic profiling can be used to identify cancer related miRNAs. We found 19% of miRNAs were located in regions of copy number variation in primary NSCLC, which is lower than the proportions reported for ovarian cancer (37%), breast cancer (73%) and melanoma (86%) [48]. Aside from the difference in tumour type, this may be because more recently identified miRNAs are less well represented in cancer-associated genomic regions, because of our high stringency bioinformatics, or both. We limited ourselves to miRNAs located within host genes or next to flanking probes that were within the spatial resolution of the platform (35 kb), because the array used in this study did not have any probes directly representing any miRNA genes. Next generation, higher-resolution arrayCGH platforms will refine this approach by specifically detecting miRNA genomic loci. Nonetheless, we successfully identified 89 potential lung cancer miRNAs, including established oncogenic and tumour suppressing miRNAs. Three members of the known tumour suppressing let-7 family of miRNAs were identified in regions of loss (let-7g, let-7f-2 and miR-98), and oncogenic miR-21 was found within a genomic region of amplification.
Over half of the 89 miRNAs (65%) are intragenic; interestingly, many of the host genes have reported roles in cancer. For example, MCM7 (minichromosome maintenance protein 7) is the host gene for three of the miRNAs we identified (miR-25, miR-93 and miR-106b) and has been reported to be amplified or over-expressed in human malignancies including prostatic, pancreatic, thyroid, cervical and colorectal cancers [49,50,51,52,53,54]. In prostate cancer, increased levels of these three miRNAs occurred with amplified and over-expressed MCM7 [55]. This observation supports the proposal that miRNAs and their host genes may be jointly affected by copy number changes [48].
Only one of the 89 miRNAs, miR-218, demonstrated concordant changes in copy number and expression. Additional candidates may be identified with publication of new miRNA expression studies in NSCLC as half of the miRNAs with concordant copy number changes (10/18) were excluded from this first pass analysis as their mature expression levels have not been measured. Furthermore, miRNAs are subject to complex regulatory mechanisms so it is not surprising to observe numerous miRNAs with discordant copy number and expression (such as miR-216). However, as with any genome-wide approach for gene discovery, there is the potential to identify false positives. We have attempted to alleviate this risk by using high stringency criteria for identification of candidate miRNAs from our aCGH data and subsequently using published NSCLC cohorts with copy number and miRNA expression data to independently validate these candidates. As such, miR-218 emerged as a strong candidate tumour suppressor, within a region of copy number loss in greater than two NSCLC studies and with demonstrated loss of expression in a third independent cohort.
Produced from two separate precursors, miR-218 is found within host genes SLIT2 and SLIT3, with both the miRNA and its host genes demonstrating reduced expression in the majority (80%) of NSCLCs. Zhang et al reported copy number losses of mir-218-1 and SLIT2 in ovarian cancer (16%), breast cancer (36%) and melanoma (33%); however, the alternative precursor and host gene were not mentioned [48]. It has been suggested that intronic miRNAs share transcriptional regulatory control with their host genes, resulting in coexpression [22,23]. Despite a reduction in both miR-218 and host gene expression in the majority of NSCLCs, only a moderate correlation between miR-218 and SLIT2 expression was detected, with no significant relationship with SLIT3. This may reflect disproportionate production of miR-218 from each genomic site or alterations in miRNA biogenesis, or miR-218 may have a promoter separate from that of its host gene. Similarly, no significant associations between miR-218 or host gene expression and host gene copy number were identified, but complete concordance (copy number loss of both host genes with reduced miRNA and host gene expression) was observed in 51% of lung cancers studied. These findings suggest that genetic loss may contribute to the observed reduction in miR-218, SLIT2 and SLIT3 expression; however, it is likely that aberrations in multiple regulatory mechanisms are involved, including alterations to miRNA biogenesis and epigenetic controls. For example, altered epigenetic control of SLIT2 and SLIT3 has also been reported in lung cancer, with hypermethylation of SLIT2 and, to a lesser extent, SLIT3 [56,57]. Thus, epigenetic silencing is an alternative or additional 'hit', and strengthens the case for miR-218 and its host genes as potential tumour suppressor genes.
Additional support for a role for miR-218 in NSCLC comes from the potential association between miR-218 downregulation and cigarette smoke exposure. Schembri et al have also demonstrated a link between miR-218 downregulation and smoking: exposing human bronchial epithelial cells to cigarette smoke extract decreased miR-218 expression levels [58]. Our observation that, in both subtypes of NSCLC, miR-218 expression is significantly reduced in subjects with a history of cigarette smoking provides additional support for the notion that miR-218 may be involved in tobacco-related carcinogenesis [58]. However, the number of never smokers in this study is insufficient to confirm this relationship, and further investigation with larger cohorts and functional validation is required to explore a causative relationship. We also examined whether the decrease in miR-218 expression was associated with asbestos, a well known lung carcinogen, but found no significant correlation. As miR-218 may play a role in metastasis, with Leite et al reporting high levels of miR-218 in high grade prostate cancer compared to significantly reduced levels in metastatic prostate cancers [59], we examined the survival of our cases with miR-218 dysregulation. In lung cancer, we found no relationship between miR-218 expression levels and NSCLC recurrence or patient survival.
MiR-218 has also been shown to be downregulated in cervical carcinoma through the action of the HPV 16 E6 oncogene and may be important in early cervical tumourigenesis [25]. We have previously identified a limited association for HPV infection in lung cancer, with reports of HPV prevalence ranging from 5-22% [60,61]. We did not find any link between HPV status and miR-218 expression, but one reason may be that neither of our HPV positive samples was HPV type 16, and reduced miR-218 has only been reported with the HPV 16 E6 oncogene. However, the small numbers preclude strong conclusions and more samples need testing. MiR-218 has also been reported to be reduced in gastric cancer, with reduced expression linked to Helicobacter pylori infection and carcinogenesis [62].
Target prediction found that miR-218 can target numerous oncogenes, such as KIT, RET, BCL9, DCUN1D1 and PDGFRA. Gene ontology and functional annotation clustering of the predicted targets revealed enrichment of genes involved in cell adhesion, protein modification, development, cell signalling and regulation of transcription, all processes that may contribute to carcinogenesis. These findings were reflected in the Ingenuity Pathway Analysis, with miR-218 potentially regulating genes involved in cancer-related biological functions and three recognised cancer signalling pathways. Focused pathway analysis also revealed that miR-218 may regulate genes directly and indirectly related to MYC and SRC, two well characterised oncogenes [20,21,63,64,65]. Recent studies have begun to authenticate miR-218 target genes, including LAMB3 (laminin-5 β3) [25], ECOP (epidermal growth factor receptor-coamplified and overexpressed protein) [58,62] and the transcription factor MAFG (v-maf musculoaponeurotic fibrosarcoma oncogene homolog G) [58].
In summary, arrayCGH cancer datasets are increasingly available with unbiased probes or selected cancer associated genes. We combined bioinformatic and experimental approaches to test whether miRNAs located in regions of gene dosage alteration had concordantly altered expression, and were therefore likely to be of importance. With this approach we identified the putative tumour suppressor miR-218, experimentally confirmed its downregulation in NSCLC, and provided additional support for an association between reduced miR-218 expression in NSCLC and cigarette smoke exposure. In addition, we identified relevant cancer-related target genes and pathways targeted by miR-218, supporting a potential role as a tumour suppressor gene for NSCLC, especially SCCs. Nonetheless, further evidence including structural mutations and demonstrable tumour suppressor functions should be gathered before miR-218 can be appropriately designated as a tumour-suppressor gene for lung cancer.