Genome Wide DNA Copy Number Analysis of Serous Type Ovarian Carcinomas Identifies Genetic Markers Predictive of Clinical Outcome

Ovarian cancer is the fifth leading cause of cancer death in women. Ovarian cancers display a high degree of complex genetic alterations involving many oncogenes and tumor suppressor genes. Analysis of the association between genetic alterations and clinical endpoints such as survival will lead to improved patient management via genetic stratification of patients into clinically relevant subgroups. In this study, we aim to define subgroups of high-grade serous ovarian carcinomas that differ with respect to prognosis and overall survival. Genome-wide DNA copy number alterations (CNAs) were measured in 72 clinically annotated, high-grade serous tumors using high-resolution oligonucleotide arrays. Two clinically annotated, independent cohorts were used for validation. Unsupervised hierarchical clustering of copy number data derived from the 72 patient cohort resulted in two clusters with significant difference in progression free survival (PFS) and a marginal difference in overall survival (OS). GISTIC analysis of the two clusters identified altered regions unique to each cluster. Supervised clustering of two independent large cohorts of high-grade serous tumors using the classification scheme derived from the two initial clusters validated our results and identified 8 genomic regions that are distinctly different among the subgroups. These 8 regions map to 8p21.3, 8p23.2, 12p12.1, 17p11.2, 17p12, 19q12, 20q11.21 and 20q13.12; and harbor potential oncogenes and tumor suppressor genes that are likely to be involved in the pathogenesis of ovarian carcinoma. We have identified a set of genetic alterations that could be used for stratification of high-grade serous tumors into clinically relevant treatment subgroups.


Introduction
Epithelial ovarian carcinoma represents the fifth leading cause of cancer death among women in the United States [1,2]. It is estimated that there will be 21,550 cases of invasive ovarian cancer diagnosed and 14,660 deaths attributed to ovarian cancer in 2009 [3]. The five-year survival rate of ovarian cancer ranges from 30 to 92%, depending on the spread of the disease at the time of diagnosis [3]. While early-stage ovarian cancers are highly curable, over 70% of ovarian cancer patients are diagnosed with advanced disease, which has lower cure rates and is associated with significant morbidity and mortality [4]. Over the past decades there have been significant advances in ovarian cancer treatment as a result of improved surgical techniques and chemotherapy regimens through multiple clinical trials [5,6]. Debulking surgery has become the standard treatment for advanced stage ovarian carcinoma; a residual tumor size of greater than 2 cm is associated with a survival of 12-16 months, compared with 40-45 months if the tumor is less than 2 cm [7,8]. Adjuvant chemotherapy with platinum and taxane based regimens improves both disease free survival and overall survival in all patient subgroups; however, the longest survival periods are observed in optimally debulked patients. Up to 80% of patients with advanced stage disease experience an initial response to chemotherapy but eventually relapse with a median progression free survival of 18 months [9,10,11,12,13]. A number of resistance mechanisms have been defined in vitro [14,15,16]. However, the importance of these resistance mechanisms in patients remains unclear. Thus, there is a need to better understand the underlying genetic alterations involved in the pathogenesis of ovarian cancer. Identification of prognostic/predictive markers can improve patient management and allow development of molecularly targeted therapeutics.
The serous type ovarian carcinoma accounts for approximately 70% of ovarian cancer cases and is one of the clinically aggressive subtypes [17]. High-grade serous tumors differ from all other ovarian carcinomas in terms of their pathology, pathogenesis, prognosis and underlying genetic alterations [18,19]. The most frequently documented mutation is in the TP53 tumor suppressor gene.
Array-based comparative genomic hybridization (aCGH) allows detection of DNA copy number alterations (CNA) and provides a global assessment of molecular events in the genome [40]. Several studies have been reported utilizing either conventional metaphase chromosome-based CGH [41,42,43] or array-based high resolution genomic technologies for identifying genome wide CNAs in ovarian cancer [23,44,45,46,47]. The above mentioned studies have identified frequent regions of increased copy number along 1q, 3q26, 7q32-q36, 8q24, 17q32 and 20q13; and regions of decreased copy number along 1p36, 4q, 13q, 16q, 18q and Xq12. However, specific genetic markers that are predictive of clinical outcome are yet to be identified for high-grade ovarian cancers. The rationale for our study is based on the idea that genetic alterations are the cause of tumor development and progression. Therefore, it is likely that combination of specific genetic alterations will be predictive of clinical behavior [48,49]. In this study, using high-resolution aCGH, we sought to identify potentially useful DNA-based prognostic marker/s to delineate high-grade serous type ovarian cancer patients into molecularly defined clinically relevant subgroups.

Tumor samples and clinical data
The study group included tumor samples from 72 patients identified within prospectively collected MGH Gynecological Tissue Repository and Cedars-Sinai Women's Cancer Research Program Tissue Bank under IRB approved protocols at Massachusetts General Hospital and Cedars-Sinai Medical Center from 1991 to 2008 (Table 1). Under these protocols, patients with suspected ovarian cancer are consented in writing for tissue collection and prospective clinical data collection prior to surgical exploration. Frozen tumor tissues were collected, catalogued and anonymized. In each case, a small piece of tissue adjacent to the tissue that was used for DNA extraction, was paraffin embedded and H&E stained for histological validation. All samples were reviewed by a pathologist to confirm the presence of viable tumor cells in the tissue sample. Only samples with more than 70-80% viable tumor tissue were chosen for this study. Clinical data were then paired to the assigned catalogue number of each sample. Clinical factors including age at diagnosis, stage of disease, grade of tumor, origin of tumor (ovary, peritoneum, fallopian tube), specific surgical therapy, specific chemotherapy, platinum sensitivity, recurrence, progression free survival (PFS), overall survival (OS) were recorded and paired with the molecular data for correlation.
For reference DNA, buffy coats from 5 anonymous donors were purchased from the Massachusetts General Hospital Blood Bank.

Validation datasets
Two independent datasets were used for validation. The first dataset included a panel of 160 high-grade serous tumors from UCSF and the Gynecology Oncology Group (UCSF-GOG). These samples were analyzed using a 1 Mb BAC array platform. For these patients, overall survival information was available. The second dataset, obtained from The Cancer Genome Atlas (TCGA) project, included 246 high-grade serous tumors that were analyzed using a custom designed 415 k oligonucleotide array from Agilent. Clinical information for these samples was obtained with permission from the TCGA data committee.

Oligonucleotide array CGH
High molecular weight genomic DNA was isolated from 72 primary ovarian tumor samples and normal whole blood from 5 anonymous female donors using a routine protocol. Array CGH was performed to determine DNA copy number changes using Agilent Human 105 K oligonucleotide microarrays (014698_D_20070820) following the manufacturer's instructions (http://www.home.agilent.com/agilent/home.jspx). Genomic coordinates for this array are based on the NCBI build 36, March 2006 freeze of the assembled human genome (UCSC hg18), available through the UCSC Genome Browser. This array provides comprehensive probe coverage spanning both coding and non-coding regions, with emphasis on well-known genes, promoters, micro RNAs, and telomeric regions, at an average spatial resolution of 21.7 kb. Array hybridization, washing and image processing were performed following the protocol described in Gabeau-Lacet et al. 2009 [50].

aCGH data analysis methods
All 5 normal reference DNA samples were hybridized one at a time to identify the common polymorphisms (CNVs) [51]. These CNVs were flagged during image analysis and were eliminated from subsequent analysis. DNA copy number alteration (CNA) was identified through dynamic thresholding of segmented aCGH data. Circular binary segmentation (CBS) was used to segment each hybridization into regions of common mean [52]. For each hybridization, the median absolute deviation (MAD) across all segments was then obtained. Probes assigned to segments with mean value greater than a scaled MAD were identified as gain. Likewise, probes corresponding to segments with mean value less than a scaled MAD were identified as loss. A default MAD scaling factor of 1.11 was utilized for both gains and losses [53]. Both UCSF-GOG and TCGA data sets were subjected to CBS-MAD algorithms followed by GISTIC analysis to identify amplifications and deletions. Following segmentation and classification, data were further reduced, without compromising the continuity and breakpoints, to facilitate downstream analyses [54]. This reduced dataset was used for all subsequent analyses.
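The MAD-based calling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the MAD is taken around the median of the segment means, that the loss threshold is the negative of the scaled MAD, and it uses made-up segment means. The function name `call_segments` is hypothetical.

```python
from statistics import median

def call_segments(segment_means, scale=1.11):
    """Label each CBS segment mean as 'gain', 'loss' or 'neutral'.

    Assumption: the threshold is scale * MAD, where the MAD is computed
    around the median of all segment means in one hybridization, and the
    loss threshold is the negative of the gain threshold.
    """
    center = median(segment_means)
    mad = median(abs(m - center) for m in segment_means)
    threshold = scale * mad
    calls = []
    for m in segment_means:
        if m > threshold:
            calls.append("gain")
        elif m < -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Hypothetical segment means (log2 ratios) for a single hybridization.
segment_means = [0.00, 0.05, -0.05, 0.60, -0.50, 0.02, -0.02]
calls = call_segments(segment_means)
```

With these toy values the MAD is 0.05, so only the segments at 0.60 and -0.50 fall outside the scaled threshold of about 0.056.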
To identify minimal regions of common alteration across all hybridizations, the Genomic Identification of Significant Targets in Cancer (GISTIC) approach [55] was utilized on each data set. Threshold selection for the GISTIC procedure was based, conservatively, on the maximum threshold for alteration (across all hybridizations) identified under the MAD approach described above; 0.4 was selected as the gain and loss threshold and 0.25 was selected as the significance threshold. Each analyzed CBS segment consisted of at least four markers. Segments that contained fewer than four markers were combined with the adjacent segment closest in segment value. A q-value was then obtained for each region. Each peak (i.e., region associated with a low q-value) was tested to determine whether the signal was primarily due to broad events, focal events or overlapping events of both types.
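The rule that segments with fewer than four markers are combined with the adjacent segment closest in value can be sketched as follows. This is an illustrative reading of that rule only: the marker-weighted mean used for the merged segment, the tuple layout and the function name `merge_short_segments` are assumptions, and the example covers a single chromosome arm so that no merge crosses a chromosome boundary.

```python
def merge_short_segments(segments, min_markers=4):
    """Merge segments with fewer than min_markers markers into a neighbor.

    segments: list of (n_markers, mean_log2_ratio) tuples in genomic order.
    Assumption: the merged segment takes the marker-weighted mean.
    """
    segs = [list(s) for s in segments]
    while len(segs) > 1:
        short = [i for i, (n, _) in enumerate(segs) if n < min_markers]
        if not short:
            break
        i = short[0]
        # Candidate neighbors are the segments immediately left and right.
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(segs)]
        # Pick the neighbor whose segment value is closest.
        j = min(neighbors, key=lambda k: abs(segs[k][1] - segs[i][1]))
        n_total = segs[i][0] + segs[j][0]
        mean = (segs[i][0] * segs[i][1] + segs[j][0] * segs[j][1]) / n_total
        segs[j] = [n_total, mean]
        del segs[i]
    return [tuple(s) for s in segs]

# Hypothetical segments: (number of markers, segment mean).
segments = [(10, 0.0), (2, 0.5), (8, 0.6), (3, -0.4), (12, -0.5)]
merged = merge_short_segments(segments)
```

Here the 2-marker segment joins its right-hand neighbor (0.6 is closer to 0.5 than 0.0 is), and the 3-marker segment joins the 12-marker segment at -0.5, leaving three segments.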
Identification of markers associated with survival (PFS and OS) was conducted through utilization of cluster analysis. Unsupervised clustering was first conducted on the set of log2 ratios from the reduced data set described above. Markers on the X chromosome were excluded from the analysis. The Euclidean distance metric was employed in conjunction with the Ward approach for agglomerative clustering. Resultant clusters were then assessed for differences in survival under the Cox proportional hazards model. Because significant differences were identified, GISTIC was performed to identify markers uniquely associated with each subgroup.
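As an illustration of agglomerative clustering with Euclidean distance and the Ward criterion, the sketch below repeatedly merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares. It is a naive pure-Python version written for readability, not the software used in the study; the function name `ward_clusters` and the toy profiles are hypothetical.

```python
def ward_clusters(profiles, k=2):
    """Group profiles (lists of log2 ratios) into k clusters using Ward's criterion.

    Returns a list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    dim = len(profiles[0])

    def centroid(members):
        return [sum(profiles[i][d] for i in members) / len(members)
                for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        pairs = [(merge_cost(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical log2-ratio profiles for five tumors over three markers.
profiles = [
    [0.50, 0.60, -0.40],
    [0.60, 0.50, -0.50],
    [0.00, 0.10, 0.00],
    [-0.10, 0.00, 0.10],
    [0.55, 0.55, -0.45],
]
groups = ward_clusters(profiles, k=2)
```

On this toy input the three altered profiles (samples 0, 1 and 4) separate from the two near-diploid profiles (samples 2 and 3). For real data sets a library implementation would be preferable, since this version recomputes every pairwise cost at each step.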
To validate the identified set of discriminating markers, supervised clustering was then conducted separately on the UCSF-GOG and TCGA data sets through use of Support Vector Machines [56]; genomic regions in each of the two validation data sets corresponding to the identified discriminating markers were utilized to guide the clustering. For each data set, resultant clusters were then assessed for differences with regard to both overall and progression-free survival.

Clinical Characteristics of OVCA patients
The median age at the time of diagnosis of the 72-patient cohort was 60 years (range 37-90) (Table 1). Mean follow-up time was 37 months (range 1-212). The majority (93%) of the population presented with advanced stage disease. Surgical staging was utilized as upfront therapy for all patients in the cohort, and this intervention was described as optimal with less than 1 cm of residual disease in 67 patients (88%). Extensive surgical cytoreduction including peritoneal stripping and bowel resection was utilized in 64% of the cohort in order to achieve an optimal debulking. Only 1 patient did not receive a taxane and platinum-containing regimen as adjuvant therapy after surgery. Six patients were lost to follow-up less than 2 months after surgical exploration. Platinum sensitivity, defined as a progression free survival of greater than 6 months following the last dose of adjuvant chemotherapy, was observed in 42 of 70 (60%) patients, with 12 patients (17%) demonstrating progressive disease despite chemotherapy. Median progression free survival was 8 months, with a median overall survival of 38 months. Univariate survival analysis identified platinum sensitive disease (p<0.0001), optimal cytoreduction (p<0.0001), lack of recurrence or progression (p<0.001) and presenting CA-125 <500 U/mL (p<0.04) as prognostic clinical factors predicting an overall survival advantage. A Cox proportional hazards model incorporating these clinical factors adjusted for age revealed that platinum sensitive disease (hazard ratio 0.06) and optimal cytoreduction (hazard ratio 0.12) were independent prognostic factors associated with an improved survival.

Global DNA copy number alterations
Genomic copy number for each probe was determined by calculating the log2 ratio of median signal intensities of the tumor and normal reference DNA. High signal to noise ratios were observed in all samples due to good quality tumor DNA. Representative profiles for five different tumors are shown in Figure 1. A large number of tumors showed some degree of genetic heterogeneity in the background along with distinct increase and decrease of DNA copy numbers involving large portions of chromosome arms ( Figure 1A, C, and D). High-level amplifications of regions including 3q26.2 and 8q24.2 were frequently observed ( Figure 1B-D). Some tumors displayed more than 10 regions of high-level amplifications ( Figure 1E). A genome-wide view of the CNAs in the 72 tumors is shown in Figure 1F and the frequency of amplification and deletion is shown in Figure 1G. In order to identify frequent regions of copy-number alterations, and to define the minimal regions of gains and losses, the statistical method Genomic Identification of Significant Targets In Cancer (GISTIC) was applied to the entire dataset ( Figure 1H and I).
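For orientation, the per-probe log2 ratio can be sketched as below. The intensities are idealized and hypothetical: they assume signal strictly proportional to copy number, a pure tumor sample and a diploid reference, which real hybridizations only approximate. The function name `log2_ratio` is illustrative.

```python
import math

def log2_ratio(tumor_signal, reference_signal):
    """log2 of the tumor-to-reference median signal intensity for one probe."""
    return math.log2(tumor_signal / reference_signal)

# Hypothetical intensities: (tumor, reference) for three idealized probes.
examples = {
    "two copies": (1000.0, 1000.0),
    "single-copy gain": (1500.0, 1000.0),  # 3 copies vs 2
    "single-copy loss": (500.0, 1000.0),   # 1 copy vs 2
}
ratios = {name: log2_ratio(t, r) for name, (t, r) in examples.items()}
```

Under these assumptions a single-copy gain gives a log2 ratio of about 0.58 and a single-copy loss gives exactly -1; admixed normal cells pull observed values toward zero.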
GISTIC analysis identified 19 regions of gains along 18 chromosome arms (Figure 1H) and 18 regions of losses along 17 chromosome arms (Figure 1I) distributed throughout the genome. Several chromosomal arms had more than one minimal region of gain and loss. For each alteration, the peak region (i.e., the highest frequency and amplitude of events) was selected as the region most likely to contain a cancer gene. Several oncogenes and tumor suppressor genes previously known to have copy number changes in human ovarian cancer, such as MYCL1, EVI1, BRAF, MYC, KRAS, CCNE1, TP73, RB1, and MN1, were readily identified by GISTIC. Chromosomal locations, frequencies, genomic intervals, gene contents and candidate cancer genes of these changes are highlighted in Table 2. In total, 19 regions of gain and 18 regions of loss (with significant q values) were identified, with the number of genes ranging from 2 to 61. The size of deletions ranged from 400 kb to 3 Mb and the number of genes mapping to these regions ranged from 6 to 106. In addition, gain and loss of entire chromosome arms were frequently observed. Genes with known or possible function in cancer are highlighted in Figure 1H and 1I.
Amplification of 3q26.2 including EVI1 gene and 8q24.12 including MYC oncogene were the most frequent alterations occurring in 72-75% of tumors suggesting a role for these genes in tumor maintenance or dissemination process. The most frequently deleted regions (78%) were located on 16q24.2 including FBXO31 and BANP genes and on 22q13.33 (Table 2). Other amplified regions were observed in 28-58% of tumors and deleted regions were observed in 30-70% of tumors respectively. In addition to the identification of regions of gain and loss common to the entire set of tumors, it was also of interest to identify regions of copy number alteration significantly associated with differences in OS and PFS which was assessed using clustering algorithms.

Cluster analysis
In order to identify a robust genomic signature and to define clinically relevant genetic subgroups among the high-grade tumors, we applied an unsupervised hierarchical clustering algorithm to unfiltered aCGH data from 72 serous type tumors. Figure 2A illustrates the two subgroups that resulted from unsupervised clustering. The two primary subgroups differed significantly with regard to progression free survival (PFS) (p = 0.0008) and marginally with regard to OS (p = 0.07); Figure 2B shows the PFS Kaplan-Meier plot for the two groups. Figure 2C illustrates differences between clusters with regard to clinical covariates. Formal comparison under the Cox proportional hazards model revealed a significant difference between the two subgroups with regard to platinum sensitivity (p = 0.016) and peritoneal stripping (p = 0.011).
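For readers unfamiliar with how a Kaplan-Meier curve is built, the product-limit estimate can be sketched as follows. The follow-up times and event flags are invented for illustration and are not taken from the cohort; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient (e.g. months).
    events: 1 if progression/death was observed at that time, 0 if censored.
    Returns [(time, survival_probability)] at each distinct event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at t leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up (months) for five patients; 0 marks censoring.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

In this toy example the estimated survival steps down to 0.8 at 2 months, 0.6 at 4 months and 0.3 at 6 months; censored patients reduce the number at risk without producing a step.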
To identify CNAs associated with each subgroup, and to determine whether these markers predict outcome independent of grade, we conducted a separate GISTIC analysis of grade 3 tumors only from each cluster. Figure 3A and B show amplifications and deletions identified by GISTIC for tumors in cluster 1 (worse prognosis) and 3C and D for tumors in cluster 2 (better prognosis) respectively. Amplification and deletion peaks unique to each group were readily identified by GISTIC and are indicated by green stars. We used these unique probe sets, listed in Supplementary Table S1, to build a prediction model for conducting supervised clustering. We then evaluated the model against our tumor panel, including grades 2 and 3, using the leave-one-out cross-validation method. This resulted in an 80% accuracy rate in classifying the tumors into good and poor outcome subgroups.
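The leave-one-out procedure can be sketched as below. Note the substitution: the study used Support Vector Machines, whereas this sketch uses a simple nearest-centroid classifier as a stand-in so that it stays self-contained. The feature values, labels and function names are hypothetical, and the resulting accuracy says nothing about the 80% figure reported above.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to the sample."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(features, labels):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)

# Hypothetical log2 ratios at two discriminating markers for six tumors.
features = [[0.6, -0.5], [0.5, -0.4], [0.7, -0.6],
            [0.0, 0.1], [-0.1, 0.0], [0.1, 0.0]]
labels = ["poor", "poor", "poor", "good", "good", "good"]
accuracy = leave_one_out_accuracy(features, labels)
```

Because each tumor is classified by a model that never saw it, the resulting accuracy is a less optimistic estimate than scoring the model on its own training data.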

Validation of independent datasets
Two independent datasets of high-grade serous tumors with clinical follow-up information were used for validation. The UCSF-GOG dataset included 160 high-grade tumors, with overall survival information, randomly selected from the Gynecology Oncology Group. Copy number information for this dataset was generated using a 1 Mb BAC array. Data were analyzed using CBS-MAD followed by GISTIC (Figure S3). In order to perform a proper comparison, we pulled targets from the BAC array corresponding to unique probe sets identified from our analysis as described in methods (Table S2 lists BACs used for clustering). Supervised clustering using our discriminating markers resulted in two subgroups with a statistically significant difference in overall survival (p = 0.028) (Figure 4). Since validation datasets were generated using different array formats, frequencies of amplifications and deletions were compared in all three datasets prior to analysis (Table S3).
The second dataset included 246 high-grade serous tumors from the TCGA project that were analyzed by a custom made Agilent 415 K oligonucleotide array. Supervised clustering using our discriminating markers resulted in three subgroups with a significant difference in PFS (p = 0.0017) and in OS (p = 0.0098) (Figure 5A and B) (Figure S1). Further analysis of the subgroups showed a difference in PFS (p<0.001) and OS (p = 0.0028) between subgroup 2 and combined subgroups 1 and 3 (Figure 5A1 and B1), suggesting that cluster 2 includes patients with the worst outcome. Results from the GISTIC analysis of TCGA clusters are shown in Figure S2 A-F. Note that the amplification and deletion peaks of original cluster 1 resembled the amplification and deletion peaks of TCGA cluster 2. To identify genetic alterations specific to each group, we compared CNAs in each cluster ( Figure 5C-E1

Discussion
In this study, we first evaluated global DNA copy number alterations in a panel of 72 clinically annotated high-grade serous ovarian carcinomas to identify specific genetic alterations associated with clinical outcome. Unsupervised hierarchical clustering identified two distinct genomic subgroups with significant difference in clinical outcome. Unique genomic regions identified from each group were then able to successfully divide two independent datasets into clinically distinct subgroups with a significant difference in survival.
Previous studies that attempted to identify the molecular determinants of clinical outcome have focused on single genes because of the frequent involvement of these genes/pathways in serous type ovarian cancers [57,58]. However, these genes, although frequently associated in ovarian carcinomas, failed to predict outcome compared to the conventional clinical indicator such as the extent of surgery [59,60]. Gene expression based studies have been useful in predicting clinical phenotypes such as histologic types and stage for various tumor types [61], including breast [62,63] and ovarian cancers [64,65,66,67,68].
Several groups have applied aCGH-based genomic technology to identify CNA patterns predictive of platinum resistance [23,45], and to identify potential driver genes contributing towards ovarian cancer pathogenesis [29,30,39]. However, these studies have not established a correlation between CNA pattern and clinical endpoints such as PFS and OS. Some limitations that could have affected the outcome of these studies are sample size, a heterogeneous mixture of samples from different histology/grades, difficulty in combining data from various platforms due to minimal overlap of the results, and lack of a robust dataset for validation. To our knowledge, our study is the first to link a distinct set of CNAs to clinically relevant patient subgroups of high-grade serous ovarian cancers with a significant difference in PFS and OS.
Based on GISTIC analysis, we identified a set of discriminating markers from a cohort of 72 high-grade serous ovarian cancers. Next, we applied those discriminating markers to a dataset generated from a cohort of 160 high-grade serous cancers that were analyzed using a 1 Mb BAC array and identified three clusters, which is likely due to the larger sample size. Analysis of the three resulting clusters showed a significant difference in overall survival between cluster 1 and combined clusters 2 and 3 (p = 0.028) (Figure 4) (Figures S4 and S5). We then used a cohort of 246 tumors from TCGA that were analyzed using Agilent 415 k oligonucleotide arrays. Using the same discriminating markers, we identified three clusters with a significant difference in both PFS (p = 0.0017) and OS (p = 0.0098) (Figure 5). To further define the groups, we compared the groups in combination. Combination of clusters 1 and 2 versus cluster 3 showed a marginally significant p value of 0.048 for PFS and 0.077 for OS. However, comparison of cluster 2 versus clusters 1 and 3 resulted in a significant difference both in PFS (p<0.001) and OS (p = 0.0028) (Figure 5). Of note, alterations in cluster 1 of our dataset resembled the alterations in cluster 1 of the UCSF-GOG dataset and cluster 2 of the TCGA dataset, further confirming our initial results.
In order to identify markers specific for each group, we utilized TCGA dataset since it provided the highest resolution and larger sample size. First, we compared the frequency of losses, gains and high-level amplifications and deletions in each cluster ( amplifications. This is likely due to the lower resolution of the array used for these samples. Similarly, the deletions along 8p and 17p were also present in high frequencies in the other two clusters (Supplementary Figure S4).
The minimal region of deletions including homozygous deletions along 17p included the mitogen-activated protein kinase kinase 3 (MAP2K3) and mitogen-activated protein kinase kinase 4 (MAP2K4) genes. MAP2K3 is activated by mitogenic and environmental stress, and participates in the MAP kinase-mediated signaling cascade. MAP2K4 is a central mediator in the stress activated protein kinase signaling pathway that responds to a number of cellular and environmental stress factors [69]. By phosphorylating MAP kinases such as JNK, MAP2K4 can ultimately transmit stress signals to nuclear transcription factors that mediate various processes including proliferation, apoptosis and differentiation. The majority of metastatic ovarian cancers show significantly reduced MAP2K4 expression, suggesting that MAP2K4 protein levels are down regulated when cells acquire the ability to grow at a metastatic site [70]. Analysis of a number of human ovarian cancer cell lines showed that MAP2K4 expression is not detectable in 3 cell lines (SKOV3ip.1, SKOV-3 and HEY-A8) known to be metastatic in vivo, while other members of the MAP2K4 pathway, including MEKK1, MKK7, JNK and c-JUN, are intact. In addition, key members of the p38 pathway including MKK6, MKK3 and p38 were also present. These results implicate dysregulation of the stress-activated protein kinase signaling cascade in ovarian cancer metastasis and support the hypothesis that MAP2K4 regulates metastatic colonization in ovarian cancer. Several studies have reported somatic mutations in the MAP2K4 gene in multiple cancer types including ovarian cancer [71,72,73]. Kan et al. 2010 stably expressed MAP2K4 mutants in mammalian cells to test their transforming activity. They found that several of the mutants promoted anchorage-independent growth. However, a majority of the MAP2K4 mutants showed reduced activity compared with wild-type kinase.
These results suggest that the MAP2K4 mutants may function in a dominant-negative manner and promote anchorage-independent growth in a manner similar to a synthetic dominant-negative MAP2K4 previously reported [74]. From a translational perspective, this finding suggests that modulation of the MAP2K4 pathway, either by restoration of MAP2K4 function alone or in combination with therapeutic agents, could have a clinical benefit.
Figure 1. A-E. Representative aCGH profiles of 5 ovarian carcinomas. Log2 ratios (y axis) are plotted along the chromosomes (x axis). Each tumor shows many CNAs including gain and loss of entire chromosomes and/or chromosome arms, interstitial deletions, and high-level amplifications (indicated by red arrows). Some tumors had more than 10 high-level amplifications. F. Genomic profiles of 72 primary ovarian carcinomas generated by oligonucleotide array CGH. Each column in the left panel represents a tumor sample and rows represent losses and gains of DNA sequences along the length of chromosomes 1 through X as determined by the segmentation analysis of normalized log2 ratios. The color scale ranges from blue (loss) through white (two copies) to red (gain). The right panel indicates the frequencies of gain and loss of oligonucleotide probes on a probe-by-probe basis for all autosomes and the X chromosome. The color scale ranges from white (no changes) to blue (frequent changes). Amplification of 3q26.2 and 8q24.12 including the EVI1 and MYC oncogenes and deletion of 16q24.2 and 22q13.33 were the most frequent alterations observed in 75% and 78% of the ovarian carcinomas respectively. G. Overall frequency of CNAs in 72 high-grade serous ovarian carcinomas. H and I. GISTIC analysis of copy number gains (H) and losses (I) in ovarian carcinomas. The statistical significance of the aberrations identified by GISTIC is displayed as false discovery rate q values to account for multiple hypothesis testing (q values; green line is 0.25 cut-off for significance). Scores for each alteration are plotted along the x-axis and the genomic positions are plotted along the y-axis; dotted lines indicate the centromeres. H) GISTIC revealed twenty broad and focal regions of gain (copy number threshold = log2 ratio ≥0.4). I) Loss of both broad and focal regions was identified by GISTIC (copy number threshold = log2 ratio ≤0.4 for broad and ≤0.1 for focal events). Twenty broad and focal regions of losses, including seven focal events, were identified in the background of broad regions. Candidate genes for some broad and focal events are noted. Green stars indicate known or presumed copy number polymorphisms. doi:10.1371/journal.pone.0030996.g001
The second cluster included the worse outcome subgroup. In this cluster, four regions along 12p12.1, 19q12, 20q11.21, and 20q13.12 were amplified in a significantly high proportion of samples (Figure 5). The peak region on 12p12.1 included 4 genes: SRY (sex determining region Y)-box 5 isoform b (SOX5), branched chain aminotransferase 1, cytosolic (BCAT1), cancer susceptibility candidate 1 isoform a (CASC1), and c-K-ras2 protein isoform a precursor (KRAS). The SOX5 gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins [75]. The functional consequence of SOX5 amplification in human cancers has not been explored. One report suggests that overexpression of SOX5 enhances nasopharyngeal carcinoma progression and correlates with poor survival [76]. However, its role in ovarian cancer is unexplored.
The Bcat1 gene was isolated in mouse by a subtraction/coexpression strategy with Myc-induced tumors of transgenic mice, and was shown to be a direct genetic target for Myc regulation in mouse [77]. The Bcat1 gene is highly expressed early in embryogenesis, and during organogenesis its expression is localized to the neural tube, the somites, and the mesonephric tubules. The gene is also expressed in several MYC-based tumors. As in mouse, the BCAT1 gene is a target for MYC activity in the oncogenesis process in human [77]. Using expression profiling, Ju et al. 2009 reported differential expression of the BCAT1 gene in chemoresistant ovarian cancer compared to chemosensitive tumors [78]. Depletion of BCAT1 by RNA interference in nasopharyngeal cancer cells effectively blocked the proliferation of cells, suggesting a role for BCAT1 in tumorigenesis [79]. In colorectal cancer, immunohistochemical analysis of BCAT1 protein showed significantly higher levels of expression in tumor tissues with distant metastasis compared to those without, and BCAT1 expression was shown to be highly predictive of distant metastasis [80]. The Casc1 gene was identified as a strong candidate lung tumor susceptibility gene through whole genome analyses in inbred mice [81]. About 20-40% of human tumors carry a mutation in KRAS [82]. The Kras G12D conditional knock-in mouse model has been extensively used to study the mechanisms of Ras-induced tumor development [83,84]. Conditional expression of Kras G12D in mice, when combined with other mutations, leads to malignant tumorigenesis in various tissues, including ovarian surface epithelium (OSE). The responses of cells to RAS activation appear to be context dependent such that cells may either undergo oncogenic transformation or become senescent [85]. Although there are rare documented cases of RAS mutations in serous carcinomas, the amplification of this gene may ultimately activate the same pathways that mutant RAS turns on.
A better understanding of the molecular targets of RAS in OSE will help identify potential therapeutic targets.
The region on 19q12 included focal amplification of the cyclin E1 (CCNE1) gene. High-levels of CCNE1 protein, an activating subunit of the cyclin dependent kinase 2 (CDK2), are often observed in patients with ovarian cancer [86]. Deregulation of cell cycle control is thought to be a prerequisite for tumor development, and several studies have shown an accelerated entry into S phase because of constitutive expression of CCNE1 [87,88]. Furthermore, CCNE1 is able to induce chromosome instability by inappropriate initiation of DNA replication, and centrosome duplication [89]. Amplification of CCNE1 in ovarian cancer correlates with drug resistance [23] and poor clinical outcome [90]. Our finding confirmed the above-mentioned studies and identified amplification of CCNE1 as a marker of poor outcome and a possible therapeutic target.
Amplifications of two distinct regions, on 20q11.21 and 20q13.12, were associated with the poor outcome subgroup. The region on 20q11.21 included two notable genes among others: inhibitor of DNA binding 1 (ID1) and BCL2-like 1 (BCL2L1). ID1 is a member of a family of 4 proteins (ID1-4) known to inhibit the activity of basic helix loop helix transcription factors by blocking their ability to bind DNA. ID1 has been implicated in a variety of cellular processes including cell growth, differentiation, angiogenesis, and neoplastic transformation. It has been shown that ID1 is de-regulated in multiple cancers and up-regulation of ID1 is correlated with high grade and poor prognosis in human cancers [91,92]. ID1 has also been shown to be an effector of the p53-dependent DNA damage response pathway [93]. In ovarian cancer, the level of ID1 protein expression correlates with malignant potential and is associated with poor differentiation and aggressive tumor behavior leading to poor clinical outcome [94]. BCL2L1 is a BCL2-related gene and can function as a BCL2-independent regulator of programmed cell death [95]. Both BCL2 and BCL2L1 are antiapoptotic and downstream targets of p53. Overexpression of BCL2L1 suppresses mitochondrial-mediated apoptosis and enhances cancer cell survival in cancer models [96]. Several studies report the expression of BCL2L1 in 60-70% of ovarian cancers and that BCL2L1 expression is associated with chemoresistant and recurrent disease [97].
Previous studies using conventional CGH have reported consistent high-level amplification of the 20q13.12 region encompassing many genes that may play a causal role in ovarian cancer pathogenesis [42,98,99]. In this study, we have identified a 2.8 Mb region including 61 genes. Among others, the likely candidates are MMP9, PI3, NCOA5, TP53RK, ZMYD8 [100,101,102,103,104]. Based on integrated analysis of DNA copy number and expression profiling results, the 20q11.22-q13.12 region has been reported to be associated with poor response to primary treatment [23]. More recently, another study using a tissue microarray composed of late stage, high-grade serous ovarian carcinomas correlated PI3 expression with poor overall survival [101].
Finally, cluster 3 samples predominantly showed losses on the 8p21.3 and 8p23.2 regions. Several candidate tumor suppressor genes that are less well known to be implicated in human cancers, including DOCK5 [105] and CSMD1 [106], map to these regions. Based on the available literature, the above-mentioned genes are likely to play important roles, but future studies are required to define their roles in the pathogenesis of serous type ovarian carcinomas.
Whether expression of all candidate genes described above is altered in high grade serous ovarian cancer is not yet known and is currently under investigation in our laboratory. Our study may also have missed rare copy number variants, including duplications and deletions, in predisposing cancer susceptibility genes, since the normal reference DNA was derived from healthy donors rather than from matched normal DNA from each patient. However, this is less likely given the very large deletions and amplifications we identified in these tumors.
In summary, the results from this study illustrate the unique molecular landscape of the genetic subgroups that exist within the high-grade tumors. In the future, these genomic markers could be used to stratify high-grade serous tumors into clinically relevant subgroups, help develop new diagnostic strategies and eventually lead to targeted therapy.