Phosphoproteome of Human Glioblastoma Initiating Cells Reveals Novel Signaling Regulators Encoded by the Transcriptome

Background Glioblastoma is one of the most aggressive tumors with poor prognosis. Although various studies have been performed so far, there are no effective treatments for patients with glioblastoma. Methodology/Principal Findings In order to systematically elucidate the aberrant signaling machinery activated in this malignant brain tumor, we investigated phosphoproteome dynamics of glioblastoma initiating cells using a high-resolution nanoflow LC-MS/MS system in combination with SILAC technology. Through phosphopeptide enrichment by titanium dioxide beads, a total of 6,073 phosphopeptides from 2,282 phosphorylated proteins were identified based on the two peptide fragmentation methodologies of collision induced dissociation and higher-energy C-trap dissociation. The SILAC-based quantification described 516 up-regulated and 275 down-regulated phosphorylation sites upon epidermal growth factor stimulation, including the comprehensive status of the phosphorylation sites on stem cell markers such as nestin. Very intriguingly, our in-depth phosphoproteome analysis led to identification of novel phosphorylated molecules encoded by the undefined sequence regions of the human transcripts, one of which was regulated upon external stimulation in human glioblastoma initiating cells. Conclusions/Significance Our results unveil an expanded diversity of the regulatory phosphoproteome defined by the human transcriptome.


Introduction
Stem cells have been known to exist in each tissue of multicellular organisms and have the ability to differentiate into various cell types based on their self-renewal and differentiation potency [1]. Although the existence of cancer stem cells had been postulated for decades, there had been no experimental evidence for their presence. Recent studies demonstrated the existence of cancer stem cells in glioblastoma [2][3][4][5][6], the most aggressive brain tumors with median survival of less than 12 months after diagnosis [7,8]. At present, the major therapies for glioblastoma are limited to radiation and anti-cancer drugs to target proliferating cells. Due to the resistance of glioblastoma stem cells to these treatments [9][10][11], however, little effect was observed for survival of patients. Therefore, in order to develop potential therapeutic strategies for the treatment of glioblastoma, functional roles of glioblastoma stem/initiating cells in tumor progression are required to be understood.
Signal transduction systems transmit cellular information to the nucleus in response to external stimuli via posttranslational modifications (PTMs) [12][13][14] and play a critical role in regulating fundamental biological events such as cell growth, proliferation, and differentiation. Above all, reversible phosphorylation events are widely recognized as central players in tumor growth. For example, the ErbB receptor family, one of the most studied families of receptor tyrosine kinases in vivo and in vitro, is activated by various types of ligands including epidermal growth factor (EGF), transforming growth factor alpha (TGF-α), and heregulin (HRG), leading to widespread phosphorylation of representative downstream signaling cascades such as the mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase (PI3K)/AKT signaling pathways [14][15][16] to promote various types of tumor development.
Although phosphorylation events in cancer cell signaling have been studied in numerous biological contexts for many years, a network-wide description of signaling dynamics is needed to define the signaling machinery at the system level. Recent mass spectrometry-based proteomics technology enables us to identify and quantify thousands of proteins based on shotgun strategies using SILAC, an in vivo protein labeling technique for precisely evaluating the quantitative behavior of signaling molecules [17][18][19]. Through phosphopeptide enrichment by strong cation exchange (SCX) and titanium dioxide (TiO2) chromatography, a previous analysis quantitatively described the dynamics of phosphorylation sites in EGF-stimulated HeLa cells using a high-resolution LC-MS/MS system in combination with SILAC technology, which provided a global view of cellular regulation via phosphorylation [20].
As amplification of the EGF receptor frequently occurs in glioblastoma tumors, elevated EGF signaling is considered to make a substantial contribution to the malignant character of glioblastoma stem cells. Based on the methodology developed for adherent culture of glioblastoma stem cells [21], we applied SILAC technology to primary cultured initiating cells established from glioblastoma patients and performed a global phosphoproteomics analysis in response to EGF stimulation to uncover the system-wide mechanisms that promote aggressive brain tumorigenesis. As a result, 6,073 phosphopeptides on 2,282 human proteins were identified from glioblastoma initiating cell lysates using a high-resolution nanoflow LC-MS/MS system. Characterization by gene ontology classification indicated that the two most frequent protein subgroups belonged to the terms of transcriptional activity and nucleic acid binding, which is in accordance with previous phosphoproteome reports on human embryonic stem cells [22]. Our large-scale phosphoproteome data also demonstrated that the well-known markers of glioblastoma initiating cells and mesenchymal cells were highly phosphorylated. Very interestingly, further exploration of the human transcriptome database provided direct evidence that some proteins translated from novel open reading frames were also phosphorylated and regulated upon EGF stimulation in a cell-type dependent manner.

Identification and Quantification of the Phosphoproteome in Glioblastoma Initiating Cells
As amplification or mutation of the EGF receptor frequently occurs in glioblastoma tumors, aberrant EGF signaling is considered to make a substantial contribution to the malignant character of glioblastoma stem cells. In order to investigate EGF-induced signaling alteration at the network level, we performed global phosphoproteome analyses of SILAC-encoded glioblastoma initiating cells using a high-resolution nanoflow LC-MS/MS system (Figure 1). In our shotgun phosphoproteome analyses of human glioblastoma initiating cells, a total of 60,455 redundant phosphopeptides were identified, whereas 8,424 non-phosphorylated peptides were detected with redundancy, which indicated that enrichment of phosphorylated molecules from peptide mixtures achieved high selectivity (88%) in our sample preparation. Our results revealed 6,073 non-redundant phosphopeptides derived from 2,282 proteins in total using the two different peptide fragmentation methodologies of CID and HCD (Table S1). The dataset included 5,497 singly and 576 multiply phosphorylated peptides. Intriguingly, thirty-six phosphorylation sites on nestin, a well-known marker of glioblastoma stem cells, were identified, including eleven novel sites (Table 1). Many of the novel phosphorylated residues were conserved across species, which indicates functional constraint on these sites (Figure 2). A large number of phosphopeptides derived from vimentin, which is commonly regarded as a marker of mesenchymal cells, were also identified in our phosphoproteome analysis, consistent with previous evidence that cells undergoing epithelial to mesenchymal transition (EMT) show characteristics of cancer stem cells [23]. In addition, four phosphorylation sites on two phosphopeptides of the EGF receptor were also identified. The EGF receptor signaling pathway was reported to affect differentiation and migration of glioblastoma stem cells [24].
These results indicate that SILAC-encoded cells established in our study maintained the main characteristics of glioblastoma initiating cells.
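The enrichment selectivity quoted above follows directly from the redundant peptide counts. A minimal sketch of the arithmetic in Python; the function name is illustrative and does not come from the study, and selectivity is assumed here to mean the fraction of all redundant peptide identifications that are phosphopeptides:

```python
def enrichment_selectivity(phospho_count: int, non_phospho_count: int) -> float:
    """Fraction of redundant peptide identifications that are phosphopeptides."""
    total = phospho_count + non_phospho_count
    if total == 0:
        raise ValueError("no peptides identified")
    return phospho_count / total


# Redundant peptide counts reported in the text.
selectivity = enrichment_selectivity(60_455, 8_424)
print(f"selectivity = {selectivity:.1%}")

# Singly and multiply phosphorylated peptides sum to the non-redundant total.
assert 5_497 + 576 == 6_073
```

Rounded to the nearest percent, the computed value agrees with the 88% reported.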
SILAC-based relative quantification of the activation fold changes revealed 516 up-regulated and 275 down-regulated phosphorylation sites that showed more than 1.5-fold change upon EGF stimulation (Table S2). Representative MS spectra of SILAC-encoded phosphopeptides derived from four phosphoproteins (nestin, 40S ribosomal protein S6, filamin-A isoform 1 and zyxin) are shown (Figure S1), and western blot analyses using phospho-specific antibodies provided further evidence for the elevated phosphorylation of the corresponding sites of ribosomal protein S6 and filamin-A upon EGF stimulation (Figure 3). Furthermore, the reproducibility of our quantitative data from four independent experiments was confirmed for representative phosphopeptides identified in all of the measurements (Table S3).
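The thresholding described above can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the function name, the example ratios, and the symmetric cutoff for down-regulation (a ratio below 1/1.5) are assumptions, since the text states only that regulated sites changed by more than 1.5-fold. Ratios are taken as heavy (15 min EGF) over light (0 min), following the labeling scheme given in the methods.

```python
FOLD_CHANGE_CUTOFF = 1.5  # threshold stated in the text


def classify_site(heavy_to_light: float, cutoff: float = FOLD_CHANGE_CUTOFF) -> str:
    """Label a phosphorylation site from its SILAC heavy/light intensity ratio."""
    if heavy_to_light <= 0:
        raise ValueError("SILAC ratio must be positive")
    if heavy_to_light > cutoff:
        return "up"
    # Assumption: down-regulation uses the reciprocal cutoff.
    if heavy_to_light < 1.0 / cutoff:
        return "down"
    return "unchanged"


# Hypothetical ratios, for illustration only.
example_ratios = {"site_A": 4.0, "site_B": 0.5, "site_C": 1.2}
labels = {site: classify_site(ratio) for site, ratio in example_ratios.items()}
print(labels)
```

Working on log2 ratios would make the up and down cutoffs symmetric around zero, which is a common alternative to the reciprocal test used here.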

Gene Ontology Analysis of the Glioblastoma Phosphoproteome
In order to obtain a system-level view of the glioblastoma phosphoproteome networks, all the phosphorylated proteins identified in our analyses were classified in terms of biological process, cellular component and molecular function. According to the annotation, the largest subgroup in terms of molecular function was the factors with transcription activity, constituting 13% of the identified phosphoproteins. The group contained a variety of transcription-related factors, such as histone deacetylase, histone-lysine N-methyltransferase, and a variety of zinc finger proteins. The classification regarding cellular component showed that the phosphorylated molecules were localized in various organelles, especially in the nucleus and cytoplasm (48% and 31% of all the identified proteins, respectively). The classification by biological process showed that the molecules belonging to signal transduction (23%), cell regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism (23%), and cell growth and/or maintenance (8%) were the major populations. For further dissection, the Database for Annotation, Visualization and Integrated Discovery (DAVID) [26] software was used to classify the phosphoproteome data into the related signaling pathways (Table 2). As a result, the phosphoproteins were assigned to diverse signaling pathways including the Notch and EGF receptor signaling pathways. Notch signaling has been found to be associated with various types of tumors including gliomas and to play crucial roles in regulating self-renewal and determining the fate of neural stem cells, while EGF receptor signaling is known to be required for maintaining stemness of glioblastoma [27]. Regarding the ErbB signaling pathway, some phosphorylation sites on GRB2-associated binding protein 1 (GAB1), SHC (Src homology 2 domain containing) transforming protein 1 (SHC1) and v-raf-1 murine leukemia viral oncogene homolog 1 (RAF1) were found to be up-regulated upon EGF stimulation in our phosphoproteome analyses (Table S4).
Very interestingly, ribosomal protein S6 (RPS6) in the mTOR signaling pathway was observed to be highly phosphorylated upon EGF stimulation in human glioblastoma initiating cells.

Exploration for Novel EGF Effectors Encoded by the Human Transcriptome
In order to search for novel phosphopeptides that were not derived from known protein sequences, we further searched against the human RNA database. As a result, three novel phosphopeptides were additionally identified from the nucleotide sequences presumed to be non-coding regions (Table 3). Two phosphopeptides were derived from regions different from already defined protein-coding sequences on mRNAs, whereas the other was from a region on a non-coding RNA (Table S5). The surrounding amino acids of the novel phosphorylated residue identified from nuclear protein localization 4 homolog (S. cerevisiae) were well conserved across species, which indicates functional constraint on the phosphorylation site (Figure S2). Regarding the novel effector encoded by supervillin-like (LOC645954), we also analyzed sequence similarity with the already known supervillin protein by aligning the corresponding amino acid sequences and found that these sequences, including the novel phosphorylation residue, showed high similarity (Figure S3A). The phosphorylation level of this novel molecule was increased by 4-fold on average in response to EGF stimulation (Table S5), whereas, very intriguingly, its phosphorylation level was not altered upon EGF treatment in other human cancer cells (Figure S3B). Our results indicate that this novel EGF effector identified from glioblastoma initiating cells behaved in a cell-type dependent manner.

Discussion
Cancer stem/initiating cells are known to have the ability of self-renewal, multi-lineage differentiation and proliferation. Due to their characteristics of chemotherapy and radiotherapy resistance, current conventional treatments have limited efficacy in curative therapies for each cancer. Especially, glioblastoma is recognized as the most aggressive tumors with poor prognosis. In order to develop potential therapeutic targets for glioblastoma initiating cells, it is essential to understand the molecular mechanisms underlying aberrant signaling behavior in these cells. As reversible phosphorylation-dephosphorylation signaling events in cancer cells are known to play a pivotal role in transmitting signals from receptors to the nucleus, we applied high-resolution mass spectrometry-based proteomics technology to comprehensively unravel global phosphoproteome dynamics in glioblastoma initiating cells.
There are some recent reports in which proteomics approaches were used to characterize molecular mechanisms of stem cells [22,28,29] and cancer stem cells [30,31]. The tyrosine-phosphoproteome in cells with highly expressed EGF receptor variant III (EGFRvIII), the most common variant of the EGF receptor observed in glioblastoma multiforme, was quantitatively analyzed upon EGF stimulation [32]. Quantitative phosphorylation events focused on the signal transducer and activator of transcription 3 (STAT3)/interleukin 6 (IL6)/hypoxia-inducible factor 2α (HIF2α) autocrine loop were also reported for glioblastoma stem cells [33]. The strategies for quantification of activation fold change in the studies above were based on in vitro labeling technologies, which generate less accurate quantitative information on activation change than in vivo labeling such as SILAC.
Based on the two different peptide fragmentation methodologies of CID and HCD, we detected 6,073 phosphopeptides from 2,282 phosphoproteins in total, including various factors mainly related to signal transduction and transcriptional regulation. Among the proteins identified, 635 phosphorylated molecules including GAB1, SHC1, EIF4EBP1, RAF1 and RPS6 in the context of ErbB and mTOR signaling showed more than 1.5 fold change upon EGF stimulation (Table S4). The phosphorylated status of EIF4EBP1 and RPS6 is widely known to facilitate protein translation and the related mTOR signaling pathway is closely involved in regulating stem cell proliferation [34,35]. Therefore, these results suggest that EGF might play a part in regulating glioblastoma initiating cell proliferation.
Our phosphoproteome data also indicated that some marker molecules for glioblastoma initiating cells, such as nestin and vimentin, were highly phosphorylated, with many novel phosphorylation sites in addition to previously reported ones on these key molecules. These results indicate that the SILAC-encoded glioblastoma initiating cells established in our study showed the characteristics of stemness even after repeated subculture.
The International Human Genome Sequencing Consortium has indicated that there are only ~20,000 protein-coding genes in the human genome [36]. The number of genes does not differ significantly between humans and many other species, suggesting that non-coding RNAs might be responsible for the complexity of cancer-related cellular signaling systems. Although the biological functions of short RNAs such as microRNAs remain largely unknown, recent studies on human embryonic stem cells showed that microRNA-145 directly targeted pluripotency factors such as OCT4, SOX2 and KLF4 and induced cell differentiation [37]. Another report, in which autoantibodies derived from prostate cancer tissues were screened, showed that, among the 22 peptides used as a detector, 18 were generated from the untranslated regions of the transcripts [38].
Thus we investigated whether protein molecules encoded by previously undefined regions in the human transcriptome were involved in the phosphoproteome dynamics in EGF signaling of human glioblastoma initiating cells. Indeed, our shotgun phosphoproteome analysis based on high-resolution mass spectrometry led to identification of three phosphorylated peptides defined by novel coding regions on the human transcripts. Very interestingly, the phosphorylation level on one novel peptide encoded by supervillin-like (LOC645954) was altered upon external EGF stimulation in a cell-type dependent manner. This finding provided the first direct evidence that novel phosphorylated molecules encoded by the transcripts acted as signaling effectors in response to external stimulation, which indicates the possibility of the involvement of such previously unrecognized factors in tumorigenic potential of cancer stem-like cells. Further comprehensive analyses of phosphoproteome dynamics in many types of cancer stem-like cells will contribute to systematic definition of the core signaling machinery responsible for cancer progression.

Cell Culture
Glioblastoma initiating cells were originally established from glioblastoma brain tissues at the University of Tokyo Hospital based on written informed consent from the patients to undertake genetic and molecular analyses, which was approved by the Research Ethics Committee at the Institute of Medical Science, University of Tokyo. Among the cell populations established from the brain tissues provided by each patient, a representative one was selected on the basis of the CD15 (SSEA-1) expression level. CD15 (SSEA-1) is one of the most reliable markers for glioblastoma stem cell enrichment and the fluorescence-activated cell sorting (FACS) analysis showed that the CD15-positive rate of the selected glioblastoma initiating cell line was 12% [39]. The cells were cultured in Dulbecco's modified Eagle's medium: Nutrient Mixture F-12 (DMEM/F12) media (Pierce) with 2% B27 supplement minus vitamin A, 20 ng/ml EGF (Upstate Biotechnology) and 20 ng/ml fibroblast growth factor (FGF) (Roche) as previously described [21] and then applied to SILAC media (labeled with either normal L-lysine or 13C6 L-lysine, respectively). The incorporation of stable isotopes into cellular proteins was validated by MS measurement of some representative proteins including α-tubulin.
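For orientation, the mass offset expected between the light and heavy forms of a lysine-containing peptide can be computed from the isotope masses. The sketch below is illustrative Python rather than part of the published workflow; the function name is hypothetical, and it assumes complete 13C6 labeling of every lysine in the peptide.

```python
# Mass difference between 13C and 12C in daltons.
C13_MINUS_C12 = 1.0033548
# Each 13C6-lysine replaces six 12C atoms with 13C.
LYS6_SHIFT_DA = 6 * C13_MINUS_C12


def silac_mz_spacing(n_lysines: int, charge: int) -> float:
    """m/z distance between the light and heavy peaks of a SILAC peptide pair."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    if n_lysines < 0:
        raise ValueError("lysine count cannot be negative")
    return n_lysines * LYS6_SHIFT_DA / charge


# A doubly charged peptide with one labeled lysine.
print(f"{silac_mz_spacing(1, 2):.4f}")
```

A peptide without lysine gives a spacing of zero, so it carries no SILAC pair and cannot be quantified with a lysine-only label.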

Protein Digestion
The cells were starved in the same media without B27 supplement, EGF and FGF and then treated with 20 ng/ml EGF for 0 min (normal L-lysine) or 15 min (13C6 L-lysine). The cells were washed three times with PBS, harvested, and suspended in 8 M urea containing PhosSTOP (Roche Diagnostics) and Benzonase (Novagen). After the mixture was kept on ice for 1 h, cellular debris was pelleted by centrifugation at 15,000 rpm for 30 min. Each lysate was diluted 200-fold, quantified using the BCA Protein Assay Kit (Thermo Scientific) and mixed in equal ratio.
The proteins were reduced with 1 mM dithiothreitol (DTT) for 90 min and subsequently alkylated with 5.5 mM iodoacetamide (IAA) for 30 min. After digestion with lysyl endopeptidase (Lys-C) (1:50 w/w) (Wako) at 37 °C for 3 h, the resulting peptide mixtures were diluted with 10 mM Tris-HCl (pH 8.2) to achieve a final concentration below 2 M urea and subsequently digested with modified trypsin (1:50 w/w) (sequencing grade, Promega) at 37 °C for 3 h. An equal amount of trypsin was then added once more for overnight digestion.
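The dilution step before trypsin digestion is simple arithmetic. The following Python sketch is only an illustration: the function name and the 100 µl sample volume are hypothetical, and it assumes the sample is still at the 8 M urea of the lysis buffer and that the diluent contains no urea.

```python
def min_diluent_volume(sample_volume_ul: float, start_molar: float, target_molar: float) -> float:
    """Volume of urea-free buffer that brings the sample exactly to target_molar."""
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and not exceed the starting concentration")
    # C1 * V1 = C2 * (V1 + V_added)  =>  V_added = V1 * (C1 / C2 - 1)
    return sample_volume_ul * (start_molar / target_molar - 1.0)


# Hypothetical 100 µl sample at 8 M urea, diluted to the 2 M limit.
needed_ul = min_diluent_volume(100.0, 8.0, 2.0)
print(needed_ul)
```

Because the text calls for a concentration below 2 M, the buffer volume actually added would have to exceed this boundary value, which corresponds to more than a 4-fold dilution under the stated assumptions.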

Sample Preparation for Mass Spectrometry
Phosphopeptides were enriched with the Titansphere Phos-TiO Kit (GL Sciences) according to the manufacturer's instructions. In short, after equilibration of the column, the peptide mixtures from glioblastoma initiating cell lysates were applied to the Spin Tip, mixed with the buffer containing 2-hydroxypropanoic acid and centrifuged. After the column was washed, captured peptides were eluted with 5% ammonium solution and 5% pyrrolidine solution, successively. The enriched phosphopeptide solutions were acidified with 10% TFA, desalted with ZipTip C18 (Millipore) and concentrated in a vacuum concentrator.

Mass Spectrometric Analysis
Shotgun proteomic analyses of the Titansphere eluates were performed by a linear ion trap-orbitrap mass spectrometer (LTQ-Orbitrap Velos, Thermo Fisher Scientific) coupled with a nanoflow LC system (Dina-2A, KYA Technologies). Peptides were injected into a 75 μm reversed-phase C18 column at a flow rate of 10 μl/min and eluted with a linear gradient of solvent A (2% acetonitrile and 0.1% formic acid in H2O) to solvent B (40% acetonitrile and 0.1% formic acid in H2O) at 300 nl/min. Peptides were sequentially sprayed from a nanoelectrospray ion source (KYA Technologies) and analyzed by collision induced dissociation (CID) and higher-energy C-trap dissociation (HCD) [40], respectively. The analyses were carried out in data-dependent mode, switching automatically between MS and MS/MS acquisition. For CID analyses, full-scan MS spectra (from m/z 380 to 2,000) were acquired in the orbitrap with resolution of 100,000 at m/z 400 after ion count accumulation to the target value of 1,000,000. The 20 most intense ions at a threshold above 2,000 were fragmented in the linear ion trap with normalized collision energy of 35% for activation time of 10 ms. For HCD analyses, full-scan MS spectra (from m/z 380 to 2,000) were acquired in the orbitrap with resolution of 30,000 at m/z 400 after accumulation to the target value of 1,000,000. The 10 most intense ions at a threshold above 5,000 were fragmented in the orbitrap with normalized collision energy of 35% for activation time of 0.1 ms. The orbitrap analyzer was operated   . Carbamidomethylation of cysteine residues was set as a fixed modification, whereas methionine oxidation, protein N-terminal acetylation, pyro-glutamination for N-terminal glutamine, phosphorylation (Ser, Thr, and Tyr) and stable isotopes of Lys 6 were set as variable modifications. A maximum of two missed cleavages was allowed in our database search.
The tolerance for mass deviation fragmented by CID and HCD was set to 3 parts per million (ppm) for peptide masses and 0.8 Da (CID)/0.01 Da (HCD) for MS/MS peaks, respectively. In the process of peptide identification, we conducted forward and reverse database searching by Mascot and applied a filter to satisfy a false positive rate lower than 1%. Regarding the proteins supported by a single peptide identification, we applied an