Molecular Integrative Clustering of Asian Gastric Cell Lines Revealed Two Distinct Chemosensitivity Clusters

Cell lines recapitulate cancer heterogeneity without the interfering tissue found in primary tumors. Their heterogeneity is reflected in multiple genetic abnormalities and variable responsiveness to drug treatment. To understand the heterogeneity observed in Asian gastric cancers, we performed array comparative genomic hybridization (aCGH) on 18 Asian gastric cell lines. Hierarchical clustering and single-sample Gene Set Enrichment Analysis were performed on the aCGH data together with public gene expression data for the same cell lines obtained from the Cancer Cell Line Encyclopedia. We found a large number of genetic aberrations, with some cell lines having 13-fold more aberrations than others. Frequently mutated genes and cellular pathways were identified in these Asian gastric cell lines. The combined analyses of aCGH and expression data demonstrate a correlation between gene copy number variations and expression profiles in human gastric cancer cells. The gastric cell lines can be grouped into two integrative clusters (ICs). Gastric cells in IC1 are enriched for genes associated with mitochondrial activities and oxidative phosphorylation, while cells in IC2 are enriched for genes associated with cell signaling and transcriptional regulation. The two clusters of cell lines showed distinct responsiveness to several chemotherapeutic agents, such as PI3K and proteasome inhibitors. Our molecular integrative clustering provides insight into critical genes and pathways that may be responsible for the differences in survival in response to chemotherapy.


Introduction
Gastric cancer is the second leading cause of cancer death worldwide, and is particularly common in East Asia [1]. It receives less attention than other cancers because of its lower incidence in the West. Although the incidence of this cancer is decreasing, rates in Asia remain among the highest in the world. In Singapore, it is the third most common cancer in males and the fifth most common cancer in females [2], and it claims approximately 330 lives every year. Gastric cancer is usually diagnosed at a late stage of the disease, when treatment options are limited and often unsuccessful. It is therefore critical to improve early detection and treatment of the cancer.
Traditionally, classification of gastric cancers is based on histopathological findings. The widely used Lauren's classification divides gastric cancers into two major histological types, namely intestinal type or diffuse type [3]. Intestinal type cancers have recognizable gland formation which ranges from well to poorly differentiated. The tumors grow in expanding, rather than infiltrative, patterns. They are believed to arise from chronic atrophic gastritis. In contrast, diffuse type cancers have noncohesive tumor cells diffusely infiltrating the stroma of the stomach and often exhibit deep infiltration of the stomach wall with little or no gland formation. They may arise out of single-cell mutations within normal gastric glands, and are associated with worse prognosis [4]. Advances in molecular biology have made available molecular classifications based either on genomic aberrations or gene expression profiles, or an integration of both [5][6][7][8][9][10].
Cell lines formed the foundation of cancer biology and the quest for drug treatments. Common alterations in cell lines, which include gains and losses of entire arms of chromosomes, are often the same ones found in primary tissue [11]. Cell lines also do not contain non-cancerous cells found in primary tumor tissues, making the cultivated lines ideal for finding mutations in the cancer genome [12]. Thus, comparing signatures between cell lines would more likely reflect intrinsic differences between tumor cells with minimal potentially confounding effects from neighboring non-cancer cells [5].
Many Asian gastric cell lines are available commercially and they have contributed to progress in gastric cancer biology and treatment. Despite being relatively homogeneous and devoid of tissue complexity, these Asian gastric cell lines show heterogeneous responses or susceptibility to drug treatment (personal unpublished data, and public data in the Cancer Cell Line Encyclopedia (CCLE) and Sanger COSMIC). We reasoned that subtle genomic variations may contribute to the underlying differences observed among these gastric cell lines. Unlike in tumor tissue, these differences would be intrinsic to, and a signature of, the biology of the particular gastric cell line, since there is no interference from other cell types.
We analyzed gene copy number and loss of heterozygosity (LOH) in 18 Asian gastric cell lines. Coupling our results to gene expression profiles of these cell lines available from CCLE, we identified two distinct genomic signatures based on genetic aberrations among these Asian gastric cancer cell lines. The clustering was further validated by an in vitro chemotherapy sensitivity study done in our lab and with data publicly available from the CCLE and Sanger COSMIC databases. A molecular classification of gastric cancer patients would have great clinical impact, as it could lead to more accurate prediction of prognosis and allow targeted therapy based on the underlying biology of each subgroup.

Array comparative genomic hybridization (aCGH) and copy number determination
CGH array analysis was performed by Origen Labs (Singapore) using the Affymetrix CytoScan HD array platform. Cell pellets containing 1×10⁶ cells were used for the aCGH hybridization. The genomic DNA quality, hybridization signal strengths and internal controls met the standards required by Affymetrix at each step before proceeding to the next. Data from the 18 gastric cell lines on the Affymetrix CytoScan HD array were pre-processed using Affymetrix Chromosome Suite 1.2.2.271 with the single sample analysis workflow and default settings. Hidden Markov model segmentation was applied to call DNA copy number gain, loss or loss of heterozygosity (LOH) status. For matching purposes, the Entrez Gene ID was assigned to each gene name in this data set using NCBI's Entrez Gene Database [13]. DNA copy number gain, loss and LOH were compiled by counting the number of mutations for each cell line and mutation type. The kinome gene list and human pathways were downloaded from public databases [14,15].
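The compilation step described above (counting aberrations per cell line and per mutation type) can be sketched as follows. This is only an illustration under stated assumptions: the input format, the gene-level calls and the rule of counting each gene once per cell line and type are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical gene-level calls: (cell_line, gene, call_type).
# These rows are illustrative placeholders, not data from this study.
calls = [
    ("AZ-521", "EGFR", "gain"),
    ("AZ-521", "MAP3K15", "loss"),
    ("NUGC-3", "EGFR", "gain"),
    ("NUGC-3", "CDK13", "gain"),
    ("NUGC-3", "GUCY2F", "LOH"),
    ("NUGC-3", "GUCY2F", "LOH"),  # duplicate call, counted once
]


def count_aberrations(calls):
    """Return {cell_line: Counter({call_type: number_of_genes})}."""
    counts = defaultdict(Counter)
    seen = set()
    for cell_line, gene, call_type in calls:
        key = (cell_line, gene, call_type)
        if key in seen:
            # Assumption: a gene is counted once per cell line and type.
            continue
        seen.add(key)
        counts[cell_line][call_type] += 1
    return counts


def total_aberrations(counts):
    """Sum gain, loss and LOH counts for each cell line."""
    return {cell_line: sum(c.values()) for cell_line, c in counts.items()}


counts = count_aberrations(calls)
totals = total_aberrations(counts)
print(dict(counts["NUGC-3"]), totals)
```

Restricting `calls` to genes in a kinome list before counting would give the per-kinome totals in the same way.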

Public microarray data pre-processing

Hierarchical clustering and single-sample Gene Set Enrichment Analysis (ssGSEA)
Unsupervised hierarchical clustering (1 - Spearman distance, average linkage) was performed on the cell lines using the aCGH data. Putative driver genes, whose copy number aberrations correlated with mRNA expression, were identified to determine subtypes or clusters that are driven by different mechanisms. This was done using the Mann Whitney U-test with p < 0.05 and the Spearman correlation coefficient test with Rho > 0.6. We then performed consensus clustering [17] on the gene expression data of the 27 gastric cancer cell lines from CCLE using these putative driver genes. We selected k = 2 as it gives a sufficiently stable similarity matrix. In order to assign new samples to this integrative cluster, significance analysis of microarray (SAM) [18] with threshold q < 2.0 was used to generate a subtype signature based on the mRNA expression data of the 1762 genes from the 27 gastric cancer cell lines in CCLE.
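The Spearman part of the driver-gene filter (Rho > 0.6) and the 1 - Spearman distance used for clustering can be illustrated with a small, self-contained sketch. The gene names and values below are hypothetical, and the Mann Whitney step of the filter is not reproduced here; this is not the code used in the study.

```python
import math


def midranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of midranks."""
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, treat as 0
    return sxy / math.sqrt(sxx * syy)


def spearman_distance(x, y):
    """The 1 - Spearman distance named in the clustering description."""
    return 1.0 - spearman(x, y)


# Hypothetical per-cell-line values for two placeholder genes.
copy_number = {"GENE_A": [1, 2, 2, 3, 4], "GENE_B": [1, 2, 3, 4, 5]}
expression = {"GENE_A": [5.1, 6.0, 6.2, 7.5, 9.0],
              "GENE_B": [9.0, 7.0, 8.0, 5.0, 6.0]}

RHO_CUTOFF = 0.6  # threshold stated in the text
drivers = [g for g in copy_number
           if spearman(copy_number[g], expression[g]) > RHO_CUTOFF]
print(drivers)
```

In this toy input only `GENE_A` passes the cutoff, since the expression of `GENE_B` falls as its copy number rises.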
ssGSEA was used to estimate pathway activities of the gastric cancer cell lines for gene sets in the Molecular Signatures Database v3.1 (MSigDB v3.1) [19,20]. The pathway activities are represented as enrichment scores, which were rank normalized to [0.0, 1.0]. SAM analysis was performed with threshold q < 0.2 and fold change > 2.0 (for up-regulated pathways) or < 0.5 (for down-regulated pathways) to obtain subtype-specific pathways from the 27 gastric cell lines in CCLE.
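The two numerical steps named above, rank normalization to [0.0, 1.0] and the q-value and fold-change thresholds, can be sketched as below. The exact normalization formula is not given in the text, so mapping rank to rank/(n - 1) is an assumption, as are the function names; the q-values are taken as given inputs rather than computed by SAM.

```python
def rank_normalize(scores):
    """Map scores to [0, 1] by rank (assumes distinct values).

    Assumption: the lowest score maps to 0.0 and the highest to 1.0.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    normalized = [0.0] * n
    for rank, idx in enumerate(order):
        normalized[idx] = rank / (n - 1) if n > 1 else 0.0
    return normalized


def classify_pathway(q_value, fold_change, q_max=0.2, up=2.0, down=0.5):
    """Apply the thresholds stated in the text to one pathway."""
    if q_value >= q_max:
        return "not significant"
    if fold_change > up:
        return "up"
    if fold_change < down:
        return "down"
    return "not significant"


print(rank_normalize([0.3, -1.2, 2.5]))
print(classify_pathway(0.05, 2.4), classify_pathway(0.05, 0.4))
```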

Drug treatment
A panel of 39 compounds against various cellular targets was obtained from Selleck Chemicals (Table S1 in File S1). Cells were seeded in 50 µl medium in 96-well plates at 8000 cells/well and incubated overnight. Compounds were serially diluted 1:4, starting from 200 µM. Serially diluted compounds (50 µl) were added to the cells and incubated for 48 hours. To measure cell viability, CellTiter-Glo (Promega) was added to the wells at a 1:1 ratio. After 10 minutes of incubation at room temperature, luminescence was measured with a Safire II plate reader (Tecan). The experiments were carried out in triplicate for each dilution point and independently replicated on a separate occasion. Data were analyzed with GraphPad Prism software to determine the half maximal inhibitory concentration (IC50). P-values were computed using the Mann Whitney U-test.
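The dilution scheme can be worked through numerically. The sketch below assumes that "1:4" means each step is one quarter of the previous concentration, that 200 µM is the concentration before addition to the wells, and that eight dilution points were used; none of these details is stated explicitly in the text.

```python
def dilution_series(start_um=200.0, fold=4.0, n_points=8):
    """Concentrations (µM) of a serial dilution before addition to cells.

    n_points is an assumed value; the number of points is not given.
    """
    return [start_um / fold ** i for i in range(n_points)]


def final_in_well(conc_um, added_ul=50.0, well_ul=50.0):
    """Concentration after adding added_ul of compound to well_ul of medium."""
    return conc_um * added_ul / (added_ul + well_ul)


series = dilution_series()
in_well = [final_in_well(c) for c in series]
print(series[:3])   # first three pre-addition concentrations
print(in_well[:3])  # the same points after 1:1 mixing in the well
```

Under these assumptions, adding 50 µl of compound to 50 µl of medium halves each concentration, so the highest concentration seen by the cells would be 100 µM.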

Genetic aberrations in the Asian gastric cell lines
In this study, we analyzed the copy number aberrations and LOH in Asian gastric cell lines by aCGH. Genes that frequently show gain, loss or LOH were identified. Although continuous cell lines tend to harbor more mutations than the primary tumors from which they were derived [6], cell lines are able to recapitulate major patterns of tumor heterogeneity [21,22]. The total number of genetic aberrations (gain, loss and LOH) detected in the Asian gastric cell lines ranged from 1724 in AZ-521 to 22631 in NUGC-3 (Table 1). Reflecting this trend, AZ-521 has the fewest genetic aberrations in the human kinome (37 out of a total of 531 kinases) [14], while NUGC-3 has the most (510) (Table 1).
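The figures quoted above can be checked with simple arithmetic. The short sketch below uses only the numbers given in the text (1724, 22631, 37, 510 and 531); the variable names are illustrative.

```python
# Totals and kinome counts quoted in the text (Table 1).
total_aberrations = {"AZ-521": 1724, "NUGC-3": 22631}
kinome_aberrations = {"AZ-521": 37, "NUGC-3": 510}
KINOME_SIZE = 531  # number of kinases in the kinome list [14]

# Ratio between the most and least aberrant cell lines.
fold_difference = total_aberrations["NUGC-3"] / total_aberrations["AZ-521"]

# Percentage of kinome genes carrying an aberration in each cell line.
kinome_percent = {
    cell_line: 100.0 * n / KINOME_SIZE
    for cell_line, n in kinome_aberrations.items()
}

print(round(fold_difference, 1))
print({k: round(v, 1) for k, v in kinome_percent.items()})
```

The ratio of about 13 matches the 13-fold difference in aberration counts, and the kinome fractions are roughly 7% for AZ-521 and 96% for NUGC-3.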
We found that 72% of the Asian gastric cell lines have gene copy number gain in the CDK13, EGFR, PAK1 and STK17A kinases (Table 2). Amplification of CDK13 has been associated with gastric and liver cancers [23]. High EGFR and PAK1 expression levels are closely correlated with the incidence and development of gastric cancer in East Asians [24,25], while an unanticipated role for STK17A as a candidate promoter of cell proliferation and survival was recently identified [26]. On the other hand, 44% of these cell lines have gene copy number loss in MAP3K15, an apoptosis-facilitating factor [27], and RPS6KA6, a potent tumor suppressor in multiple cancers [28]. LOH is associated with inactivation or loss of a normal allele. We detected LOH of GUCY2F in 61% of the Asian gastric cancer cell lines. GUCY2F is needed to repress transcription of several growth factor genes and inhibits growth of gastric carcinoma [29]. LOH of MYLK3 was found in 56% of the Asian gastric cell lines. MYLK3 is implicated in gastric acid secretion (KEGG entry 91807), and reduced secretion of gastric acid due to atrophic mucosa is observed in gastric cancer.
Table 1. Total and kinome genetic aberrations (consisting of gain, loss and LOH) in the 18 Asian gastric cell lines.
The NCI-Nature Pathway Interaction Database [15] has 137 human pathways representing 9248 interactions. The top 10 pathways with the largest number of gains, losses and LOH for each cell line from our array CGH analysis are summarized in Fig. 1. We found that genes in the PDGFR-beta and caspase pathways contained the largest number of genetic aberrations (gain, loss and LOH). Deregulation of the PDGFR-beta pathway affects angiogenesis in gastric cancer and the depth of cancer cell invasion into the gastric subserosal layer [30]. Down-regulation of caspase activities has been detected in various human gastric cancer-derived cell lines [31].
On the other hand, genes in the nuclear beta-catenin pathway have the largest number of losses and LOH, with no gain of genetic material, in these Asian cell lines. Deregulation of the Wnt/beta-catenin pathway due to loss of membranous E-cadherin has been reported in gastric cancers [32].

Integrative cluster identification and signature generation
Additionally, we investigated the correlation between copy number aberrations and gene expression data of these cell lines in the public domain. The integrated analysis of DNA copy number variations and corresponding gene expression data would allow identification of significant genes and cellular pathways critical to gastric cancer pathophysiology. Of the 18 gastric cell lines on which we performed aCGH, only 14 have corresponding mRNA expression data in CCLE. CCLE had a total of 27 Asian gastric cell lines at the time of analysis (March 2013). We used the Mann Whitney U-test and the Spearman correlation coefficient test to identify 1762 putative driver genes whose copy number aberrations correlate with mRNA expression (Table S2 in File S1). Consensus clustering using these putative driver genes revealed two clusters of gastric cell lines. We named them integrative clusters (IC) 1 and 2 (Fig. 2A).
Figure 2 (caption, beginning not recovered). ... putative driver genes from 14 gastric cell lines. (C) ssGSEA pathway enrichment score (mean-centered) heatmap for 380 subtype-specific pathways using 27 gastric cell lines from CCLE. Only selected pathways/gene sets are labeled. Color code for mRNA expression: red = high expression, green = low expression. Color code for copy number: green = copy number loss, red = copy number gain, black = normal copy number. Color code for pathway enrichment: red = high enrichment, green = low enrichment. doi:10.1371/journal.pone.0111146.g002
An overall strong correlation between DNA copy number and mRNA expression was observed (Fig. 2B). A similar study with human gastric tumor samples [6] also noted this correlation. These findings suggest that DNA copy number variation is a key contributor to the expression variation of these genes. Cells in IC1 have higher expression of genes involved in oxidoreductase and mitochondrial activities. Cells in IC2 have higher expression of genes involved in diverse cellular signaling functions. The roles of these genes in the two clusters of gastric cancer need to be explored further. SAM was then performed to generate 114 subtype signature genes based on the mRNA expression data of the 1762 genes from the 27 gastric cancer cell lines in CCLE (Table S3 in File S1).
The cell lines in our two integrative clusters correlated strongly with a molecular clustering system reported by Tan et al. [5]. Cell lines in our IC1 and IC2 groups are almost identical to cell lines in their G-INT and G-DIF groups, respectively. The only differences are that cell lines Fu-97 and SNU-1, which fall in our IC1, are grouped into G-DIF instead of G-INT. Tan et al. performed the classification based on gene expression data only, whereas we combined in-house aCGH data with public gene expression data. The additional genomic information from aCGH may result in re-arrangement of the hierarchical tree. Tan et al. also found that their classification was significantly associated with Lauren's classification but remained distinct, with an overall concordance of 64% with Lauren's histopathological classification. The discrepancies between molecular and histological classification could be due to the ability of genetic classification to capture salient features of the tumor that are less likely to be discerned by light microscopy [5].

Pathway analysis for the Integrative Clusters
ssGSEA was used to estimate pathway activities of the gastric cancer cell lines for gene sets in MSigDB v3.1. SAM analysis revealed 380 subtype-specific pathways (Table S4 in File S1). The pathway enrichment score heatmap of the 380 subtype-specific pathways from the 27 gastric cell lines in CCLE is shown in Fig. 2C. Cell lines in the IC1 cluster have enrichment of genes associated with oxidative phosphorylation and mitochondrial functions. On the other hand, cell lines in the IC2 cluster have enrichment of genes associated with inflammatory response, epithelial-mesenchymal transition, and TGF-beta, Notch, RAS and NFkB signaling. IC1 thus emphasizes metabolism and energy generation, whereas IC2 emphasizes cell signaling and transcriptional regulation, suggesting that there are two mechanistically very distinct groups of gastric cancers.

Drug sensitivity of Asian gastric cell lines based on the integrative clustering
Drug sensitivity data (50% growth inhibitory concentration, IC50) were obtained from CCLE (Fig. 3A) and Sanger COSMIC (Fig. 3B). In both CCLE and COSMIC [33], cells were treated with compounds for approximately 72 hours. The in-house drug sensitivity assay was performed for 48 hours (Table S5 in File S1). Growth inhibition was compared between the two clusters of Asian gastric cancer cell lines. Only compounds showing significant differential sensitivity between the two molecular clusters in CCLE, COSMIC and our in-house data are shown. We verified that the effects of 17-AAG and dasatinib in our collection of cell lines are similar to the results obtained from CCLE and COSMIC, even though the incubation time with the compounds differed between our data and the public data (Fig. 3C).
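A comparison of IC50 values between two clusters with the Mann Whitney U-test can be sketched as follows. The IC50 values are hypothetical, and the exact permutation form of the test is an assumption made to keep the example dependency-free; the text does not state whether an exact or asymptotic p-value was used.

```python
from itertools import combinations


def midranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U for x, exact two-sided p-value) by enumerating rank sums.

    Suitable only for small groups, since all C(n1 + n2, n1) splits
    of the pooled ranks are enumerated.
    """
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n1 = len(x)
    observed = sum(ranks[:n1])
    expected = n1 * (len(pooled) + 1) / 2
    deviation = abs(observed - expected)
    total = extreme = 0
    for combo in combinations(range(len(pooled)), n1):
        total += 1
        if abs(sum(ranks[i] for i in combo) - expected) >= deviation - 1e-12:
            extreme += 1
    u_stat = observed - n1 * (n1 + 1) / 2
    return u_stat, extreme / total


# Hypothetical IC50 values (µM) for one compound in each cluster.
ic50_ic1 = [0.8, 1.1, 1.5, 2.0]
ic50_ic2 = [3.2, 4.5, 5.1, 6.0]

u_stat, p_value = mann_whitney_exact(ic50_ic1, ic50_ic2)
print(u_stat, round(p_value, 4))
```

With four cell lines per group and complete separation, only 2 of the 70 possible rank assignments are as extreme as the observed one, which bounds how small the p-value can be for groups of this size.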
5-Fluorouracil, a thymidylate synthase inhibitor currently used to treat gastric cancer, was found to be slightly more effective in cell lines in IC1 than in IC2 (p = 0.047) (Fig. 3C). Similar results were observed by Tan et al. [5], where cells in G-INT were more sensitive to 5-fluorouracil than cells in G-DIF. Tan et al. also reported a significant benefit from adjuvant 5-fluorouracil therapy in the G-INT subtype compared to the G-DIF subtype in retrospective patient cohorts. Furthermore, a 10-year follow-up study found that 5-fluorouracil therapy with radiation could benefit all but the diffuse subtype based on Lauren's classification [34].
Since we observed gene loss and LOH in the nuclear beta-catenin pathway, we postulated that targeting this pathway may have a therapeutic effect in gastric cancer [35]. However, we found that gastric cell lines in both the IC1 and IC2 clusters are generally not responsive (IC50 ~100 µM) to XAV939, a tankyrase inhibitor [36] which selectively inhibits beta-catenin-mediated transcription (Fig. 3C). This suggests that genetic aberrations in the beta-catenin pathway may be dispensable for the survival of the gastric cancer cells.
Cells in IC1 have enrichment of genes associated with oxidative phosphorylation and mitochondrial functions. We found that cells in IC1 are more resistant to the proteasome inhibitors bortezomib and MG132 (Fig. 3B). Proteasome inhibitors induce reactive oxygen species generation [37], which contributes to oxidative damage and cell death. Enrichment of genes associated with mitochondrial function in IC1 cells may enhance the ability of these cells to withstand oxidative damage. In contrast, the Hsp90 inhibitors NVP-AUY922 and 17-AAG were found to be more effective in inhibiting growth of cell lines in IC1 (p = 0.024 and 0.014, respectively). Mitochondrial Hsp90 is involved in a complex signaling pathway that prevents initiation of induced apoptosis. The increased sensitivity of cells in IC1 to Hsp90 inhibitors further suggests that mitochondrial activity is important for the survival of this cluster of cell lines [38].
Interestingly, we found that a subset of gastric cells within IC1 are more sensitive to the MEK/ERK inhibitor PD0325901 (p = 0.015; Fig. 3A). The MEK-ERK pathway is required for the S727 phosphorylation of mitochondrial STAT3, which is critical for electron transport chain activity and ATP abundance [39]. The pan-histone deacetylase inhibitor panobinostat is also more toxic to gastric cells in IC1 (p = 0.020). In addition to its mitochondrial modulatory effect and induction of apoptosis, panobinostat could also undermine the chaperone function of Hsp90 through hyperacetylation of Hsp90 [40]. Gastric cells in IC1 are also more sensitive to TKI258 than cells in IC2 (p = 0.034). TKI258 is a multi-targeted receptor tyrosine kinase inhibitor with activity against FGFR, VEGFR, PDGFR, FLT3 and KIT. Inhibition of these kinases will indirectly decrease Y243 phosphorylation of mitochondrial pyruvate dehydrogenase kinase 1, leading to inactivation of the pyruvate dehydrogenase complex and decreased cell proliferation [41].
In support of our finding that the gastric cells in IC2 are enriched for genes involved in cell signaling, we found that cells in IC2 are generally more sensitive to kinase inhibitors than cells in IC1. Cell lines in IC2 are more sensitive to treatment with the PI3K inhibitors BEZ235, ZSTK474, PI-103 and PIK-75 (p = 0.032, 0.018, 0.021 and 0.018, respectively) (Fig. 3C). This reflects a central role for the PI3K pathway in cancer cell proliferation [42]. The PI3K/AKT pathway may represent an important therapeutic target for gastric cancer [43]. We also found significantly lower IC50 values in gastric cells in IC2 than in cells in IC1 when treated with the kinase inhibitors dasatinib (Bcr-Abl and Src inhibitor) (p = 0.027), GSK269962A (Rho kinase inhibitor) (p = 0.019) and midostaurin (Flt3 and multiple kinase inhibitor) (p = 0.012).
The absolute magnitude of the differential drug sensitivities between the clusters ranges from 2- to 10-fold in the gastric cell lines. These modest differences may still be clinically meaningful given the small therapeutic windows associated with cytotoxicity, even in targeted chemotherapy. A large patient cohort study will be needed to confirm the value of the molecular clustering strategies, by us or others, in predicting chemosensitivity and prognosis.
In conclusion, combination of aCGH and gene expression analysis to identify potential candidate oncogenes or tumor suppressor genes is a powerful and proven approach that has been reported in other cancer studies. This study provides insight into DNA copy number variations and their correlation to gene expression profiles in Asian gastric cell lines. A schematic diagram of the overall workflow is shown in Fig. S1 located in File S1. We report here the discovery of signature genes and cellular pathways associated with two genomic clusters of these cell lines. The two clusters of cell lines responded differentially to targeted therapeutic agents. Our results provide new insights into the molecular pathogenesis of this malignancy and could potentially augment the conventional histological classification of gastric cancers.

Supporting Information
File S1
Table S1: A panel of target-specific compounds for the chemosensitivity study in the Asian gastric cell lines.
Table S2: Putative driver genes selected using correlated genes between copy number aberration and mRNA gene expression (Mann Whitney test, p < 0.05; Spearman correlation coefficient, Rho > 0.6).
Table S3: Integrative cluster signature (copy number and mRNA correlated genes).
Table S4: Integrative cluster-specific pathways.
Table S5: Compounds showing significant differences in sensitivity between the two integrative clusters of Asian gastric cell lines.
Figure S1: A schematic diagram of the analysis workflow.
(XLSX)