Analysis of SOX2-Regulated Transcriptome in Glioma Stem Cells

Introduction Glioblastoma is the most malignant brain tumor in adults and is associated with poor survival despite multimodal treatments. Glioma stem-like cells (GSCs) are cells functionally defined by their self-renewal potential and the ability to reconstitute the original tumor upon orthotopic implantation. They have been postulated to be the culprit of glioma chemo- and radio-resistance ultimately leading to relapse. Understanding the molecular circuits governing the GSC compartment is essential. SOX2, a critical transcription regulator of embryonic and neural stem cell function, is deregulated in GSCs however; the precise molecular pathways regulated by this gene in GSCs remain poorly understood. Results We performed a genome-wide analysis of SOX2-regulated transcripts in GSCs, using a microarray. We identified a total of 2048 differentially expressed coding transcripts and 261 non-coding transcripts. Cell adhesion and cell-cell signaling are among the most enriched terms using Gene Ontology (GO) classification. The pathways altered after SOX2 down-modulation includes multiple cellular processes such as amino-acid metabolism and intercellular signaling cascades. We also defined and classified the set of non-coding transcripts differentially expressed regulated by SOX2 in GSCs, and validated two of them. Conclusions We present a comprehensive analysis of the transcriptome controlled by SOX2 in GSCs, gaining insights in the understanding of the potential roles of SOX2 in glioblastoma.


Introduction
Glioblastoma is the most common and deadly primary brain tumors in adults and despite multiple treatments, the survival of glioma patients remains poor, with a median time between 12-15 months [1][2][3] Glioblastoma contains a subpopulation of tumor propagating stem-like cells, known as glioma stem-like cells (GSCs) [4], which display the ability to self-renew, to differentiate into distinct lineages and to efficiently initiate and propagate tumors in xenografts models that recapitulate the phenotypic characteristics of the initial tumor from which they were derived [5,6]. Moreover, GSCs have been shown to increase resistance to radio-and chemotherapy [7,8], explaining in part the poor overall survival despite multiple treatments.
SOX2, a member of the SRY gene family, is a key transcription factor in the regulation of stemness properties, and it is essential in early embryonic development [9]. SOX2 has been reported to be deregulated in several human cancers [10][11][12] including glioblastoma where is over expressed due to several mechanisms such as amplification and promoter hypomethylation [13]. SOX2 is enriched in human-derived GSCs where it sustains stemness, migration, invasion and maintenance of tumorigenicity [13,14]. Although SOX2 response program in a glioblastoma cell line has been analyzed [15], to the best of our knowledge, an exhaustive analysis of SOX2-regulated molecular circuitries in GSCs has not been performed. Deciphering the molecular circuitries controlled by SOX2 in GSCs will provide insights about glioma development, biology and possible novel molecular therapies.
Besides coding genes, long non-coding RNAs (lncRNAs) are an emerging class of RNAs with no functional protein-coding ability that consists of more than 200 nucleotides [16]. Recent discoveries have proven that they play important roles regulating gene expression and function. These non-coding RNAs actively participate in many pathological processes in human malignancies [17][18][19] including cancer where a number of lncRNAs have been shown to act as oncogenes or tumor suppressors [20]. Recently, different groups published a signature of lncRNAs with aberrant expression in glioblastoma [21] and a set of prognostic lncRNAs that could have clinical implications in the sub-classification of this disease [22]; though the functional effect of lncRNAs in glioblastoma is not well understood.
Given that SOX2 is predominantly expressed in the GSCs compartment, which plays prominent roles in driving the growth, treatment resistance and recurrence of glioblastoma, the elucidation of the transcriptome and the molecular pathways involved in the generation and maintenance of this recalcitrant population is critical to understand the molecular underpinnings of glioblastoma malignancy. The aim of this work was to characterize the transcriptome regulated by SOX2 in GSCs. We set out to describe not only the coding genes but also the lncRNAs, which have been shown to play predominant roles in cancer. In this study we present a comprehensive analysis of the transcriptome controlled by SOX2 in GSCs, gaining insights in the understanding of the potential roles of SOX2 in glioblastoma.
To inhibit SOX2 expression, transient transfection assays were performed using two commercially available, specific siRNA against human SOX2 (si-SOX2, s13295 and s13294, Ambion) and a non-targeting control siRNA (si-scramble) (Ambion) in four independent experiments. The siRNA transfections were performed according to the manufacturer's instructions using Lipofectamine 2000 (Invitrogen). The cells were then cultured for 72 h after transfections and subjected to different analysis.

RNA extraction and Real Time PCR analysis
Total cellular RNA was isolated from the cultured cells using a Trizol reagent (Ambion) according to the manufacturers' protocols. For lncRNAs analysis, total RNA was subjected to DNase I treatment to digest the DNA. RNA quantity and quality were measured by NanoDrop ND-1000. The RNA samples were then reversely transcribed into cDNA using the Taqman miRNA Reverse Transcription Kit (Applied Biosystems) according to the manufacturer's instructions. Real Time PCR (RT-PCR) was performed using the Sybr Green Fast Master Mix (Applied Biosystems) in the ABI 7700 sequence detection system (Applied Biosystems, Foster City, CA). The quality of the products was controlled by the melting curve. Transcript levels were normalized against human GAPDH. Transcripts expression levels relative to GAPDH were calculated using the ddCt method. Primers for lncRNA detection and quantification were designed at Universal ProbeLibrary Assay Design Center (http://www.roche-applied-science. com/). All primer sequences are listed below (Tables 1 and 2):

Immunoblotting Assay
For the western blot assay, cells were lysed in RIPA buffer (Triton and PBS) for 30 min on ice. Samples containing identical amounts of protein (30 μg) were resolved in a 12% polyacrylamide gel, transferred to polyvinylidene membranes, and blocked in 5% nonfat milk in phosphate-buffered saline/Tween-20. Membranes were incubated with the following antibodies: GCAGGGCAACAAAGCAGACTA CAGGGCGCTGTTTCTCCAA SOX2 (Cell Signaling, Danvers, MA) and α-Tubulin (Sigma-Aldrich) using 1:1000 dilution. The membranes were developed according to the protocol for enhanced chemiluminiscence from Perkin Elmer.

Microarray expression analysis
Total RNA was isolated from scrambled and SOX2-siRNA GSC11 cells using Trizol extraction. RNA was purified by the QIAGEN RNAeasy mini kit (QIAGEN) according to the manufac-turer´s protocol. One-color Cy3 RNA labeling, array hybridization to Agilent SurePrint G3 8 × 60 K Human Gene Expression Arrays (Agilent Technologies), data collection, and analysis were performed at the Bioinformatics Unit (Fundación para la Investigación Médica Aplicada, CIMA, Pamplona, Spain). Normalization of microarray data was performed using quantile algorithm. After quality assessment a filtering process was carried out to eliminate low expression probe sets. Applying the criterion of an expression value greater than 64 in 2 samples of at least one of the experimental conditions, 40986 probe sets were selected for statistical analysis. LIMMA (Linear Models for Microarray Data) [23] was used to find out the probe sets that showed significant differential expression between experimental conditions. Genes were selected as significant using a B statistic cut off B>0. Data processing and statistical analyses were performed with R and Bioconductor [24]. The microarray data from this study are publicly available at Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession number GSE79302.

Functional group analysis
In this study we applied Gene Ontology (GO) analysis to find the primary function of the differential expression of mRNAs regulated by SOX2, using online software DAVID (Database for Annotation, Visualization and Integrated Discovery, http://david.abcc.ncifcrf.gov/). GO analysis can organized genes into hierarchical categories (Gene Ontology Consortium). To find out the significant pathway of the differential genes participating we performed gene regulatory network analysis using Ingenuity Pathway Analysis (IPA) software (http://www.ingenuity. com), which can integrate gene-expression data with other molecular databases to facilitate the development of new and more complete pathway maps. Fisher's exact test was used to select the significant GO categories. The threshold of significance was defined by P value with a cutoff set in 0.05.

Statistical analysis
Experimental data are represented as the mean ± SD of three biologic replicates and were compared using Student's t-test. Significant P-values are indicated with asterisks as follows:

Transcripts regulated by SOX2
Since SOX2 is a key driver in the maintenance of the GSCs phenotype and therefore in the perpetuation of this devastating tumor we down-regulated the expression of this gene in the GSC11 cell line in four independent experiments, using a SOX2 specific siRNA ( Fig 1A).
The efficiency of SOX2 knockdown was assessed by real-time PCR and western blot ( Fig 1B) and was also confirmed in our array results. Microarray data identified a total of 2048 differentially expressed coding transcripts and 261 non-coding transcripts (B value >0) (Fig 1A and S1 Table).

SOX2 controls a wide spectrum of protein-coding genes and pathways in GSCs
To further narrow the coding transcripts data a cut-off 1 logarithmic fold difference between SOX2 knockdown and scrambled GSC11 cells was set, identifying 35 up-regulated and 100 down-regulated genes, which suggest that SOX2 act primarily as a transcriptional activator. In Table 3 we showed the top-10 up or down-regulated protein coding-genes, and select the top 5 candidates of each group for further validation by qRT-PCR using the GSC11 and GSC23 cell lines. We confirmed the observed microarray expression changes in 5 out of 5 down-regulated coding-genes ( Fig 2B) and in one out of 5 up-regulated coding genes in GSC11 cells (Fig 2A).
Regarding GSC-23 cells, we down-modulate SOX2 expression using two different siRNAs against human SOX2, and we validate the expression of 5 out of 5 up-regulated coding-genes ( Fig 2A) and of 4 out of 5 down-regulated coding genes ( Fig 2B), partially validating the microarray results.
To understand the significance of differential gene expression, bioinformatics analysis related to Gene Ontology (GO) Classification and pathway analysis were performed. GO classifications using the DAVID web tool and pathway analysis using Ingenuity Pathway Analysis (IPA) was performed. For these analysis, gene lists were classified based upon decreased (logFC < -1) or increased (logFC > 1) expression relative to control and analyzed altogether as a single list.  Enrichment analysis of GO categories including biological process (BP), molecular function (MF), and cellular component (CC) were obtained using DAVID web tool (Fig 3). We observed the highest enrichment in the categories related to "cell adhesion", "biological adhesion", "cell-cell signaling", "extracellular region" and "calcium ion binding".
We used IPA analysis to undercover the canonical pathways regulated by SOX2 in GSCs. Our results showed 13 pathways significantly altered ( Table 4). Most of them related with SOX2 Transcriptome in GSCs amino-acid metabolism pathways, such as "histamine biosynthesis", where histamine is an important regulator of numerous physiological processes including neurotransmission in the central nervous system (CNS) [25]; "L-cysteine degradation process" where cystathionine γlyase (CTH) activity has been related with glioblastoma treatment [26] and "serotonin receptor signaling pathway" being serotonin an important neurotransmitter in the CNS during neuronal development [27]. Other enriched pathways were "hematopoiesis from multipotent stem cells", where KITLG has been reported to regulate neoplastic processes such as growth and invasion [28]; apoptosis [29] and cell adhesion [30]. Role of JAK2 in Hormone-like Cytokine Signaling stood out because GHR (growth hormone receptor) and IRS1 (insulin receptor substrate 1) has been linked with glioma progression [31,32]. A well-characterized pathway frequently altered in tumors is the NOTCH signaling cascade, which was also enriched in our analysis. The NOTCH pathway is a conserved intercellular signaling route that has been implicated in different developmental processes. Interestingly, NOTCH pathway is deregulated in human glioblastoma and plays a key role in maintaining the growth, the undifferentiated state of glioma cells and tumorigenesis [33][34][35]. The integrated analysis of SOX2 enriched canonical pathways revealed the link between this transcription factor and multiple cellular processes such as amino-acid metabolism and intercellular signaling cascades, like NOTCH pathway.
The IPA analysis also showed the most relevant biological functions and diseases in our data set. The most significant bio-functions altered following SOX2 down-modulation are showed in Table 5. The set of SOX2-associated genes were assigned mainly to the following networks: "cancer", "organismal injury and abnormalities", "cellular movement", "tissue morphology", "cellular development" and "hematopoiesis". Interestingly, most of these networks involved very well-known functions of SOX2 such as morphology determination [36], development [37] and cellular proliferation and migration in glioma [13,13]. Fig 4 shows the most relevant selection of bio-function categories: disease and disorders, molecular and cellular functions and physiological system development and function, obtained by using IPA software organized by p-value.  These results established a signature of protein coding-genes regulated by SOX2 in GSCs with biological functions relevant to glioblastoma growth and maintenance of its malignant phenotype. The tight overlap between the existing literature and our enrichment analysis highlights the robustness of our results and predicts that this approach will be an excellent discovery platform to identify novel SOX2 targets.

SOX2-regulated non-coding RNAs in GSC
Reprogramming transcription factors, including SOX2, have been shown to regulate both coding and non-coding RNAs [38]. LncRNAs are emerging as key regulators of biological processes and disease [39] therefore, seems reasonable to hypothesize that SOX2 will regulate this class of genes as well. The strength of our data-sets allowed us to identify potential non-coding transcripts differentially expressed (B value > 0) regulated by SOX2 in GSCs. After biotype distribution analysis we identify protein coding RNAs (44% for up-regulated and 41% for down-regulated), while the rest were classified as different types of non-coding transcripts. Out of the total number of transcripts differentially expressed we identify 80 upregulated and 181 down-regulated and we classify them as intergenic RNAs, antisense, processed transcripts, transcripts derived from pseudogenes and unassigned transcripts ( Fig  5). The transcripts classified as "others" correspond to transcripts derived from miRNAs, rRNAs, sense-overlaping and sense intronic transcripts. The lncRNA annotation was performed with the Bioconductor package ChIPpeakAnno [40] and using Gencode v19 as reference [41]. The gene type corresponding to the gene that overlaps with the lincRNA locus was assigned to each lincRNA. Table 6 shows the top 25 non-coding transcripts regulated by SOX2 in GSCs.
The top four differentially expressed lncRNAs ( Table 7) that presented chromatin marks and high abundance in brain were selected using GRCh37/hg19 assembly in UCSC Genome Browser, and their expression was validated using qRT-PCR in GSC11 cells, comparing SOX2-siRNA versus scrambled control. Our data indicated that the expression of chr19:28,281,401-28,284,848 (TCONS_00027256) was significantly down-regulated (p value = 0,018), while chr11:121899032-121899389 (TCONS_00020142) was significantly upregulated (p value = 0.042) after SOX2 inhibition (Fig 6A). The results were consistent with the microarray data (Fig 6B).  Altogether, these results identified and confirmed the non-coding transcript profile controlled by SOX2 in GSCs. Characterizing the functional relevance of these lncRNAs will undoubtedly impact our understanding of glioblastoma biology.

Discussion
Our work provides a comprehensive view of the genome wide SOX2 regulated transcripts in GSCs, illustrating a complex scenario where SOX2 is the central player regulating different molecules and pathways in glioblastoma.
In this study, we used state-of-the art microarray technology to query the SOX2 coding and non-coding RNA transcriptome in GSCs. It is interesting to note that among the down-regulated genes following SOX2 knockdown, F11R has been shown to be overexpressed in glioblastoma cells [42,43]. F11R is necessary and sufficient for GSC maintenance and self-renewal and of clinical significance is associated with increased malignancy and poor patient prognosis [42,43]. On the other hand, we found several interesting over-expressed candidates controlled by SOX2; for example, PPP1R1B is a well-known striatal projection neuron signature marker [44]. The fact that its expression increases following SOX2 inhibition is in line with its role in neuronal differentiation.
In a previous work where the SOX2 response program in a glioblastoma cell line was analyzed [15], authors identified 489 genes whose expression were altered in response to SOX2 knockdown, using several genomic technologies. Interestingly several of these genes are also differentially expressed in our array data, such as NGFR, CEBPA, BCL2, BNIP3, EBF4, ALCAM, protocadherins and solute carrier family members. Overall these results highlight the strength of our array data and the link between SOX2 and GSCs biology. The work of Fang and colleagues exhibits some similarities with our study, such as the analysis of SOX2 regulatedcoding genes. However, we focused in the molecular circuitries controlled by SOX2 in GSCs, meanwhile they performed their study in an established glioma cell line. Additionally, our study addressed the SOX2-regulated lncRNAs in GSCs. Altogether both works provide clues regarding SOX2 functions in glioblastoma.
We also analyzed the canonical pathways regulated by SOX2 in GSCs. Pathways related with amino-acid metabolism were among the most deregulated, illustrating that SOX2 expression is critical for maintaining metabolic homeostasis in the GSC population, and plays important role in different tumor microenvironment conditions, such as hypoxic stress conditions  [49]. Other enriched pathway altered in our analysis was the NOTCH pathway, where Hes5 and Hey1 had the most significantly down-modulated expression. Hes5 is a marker of neural multipotent progenitors with stem cell properties [50] where it sustains progenitors proliferative state inhibiting their differentiation into neurons [51]. On the other hand Hey1 has been related to a subset of molecules directly associated with hypoxia in glioblastoma tumors [52]; and might be used as a marker to distinguish glioblastoma patients with a relative good prognosis (negative Hey1 expression) [53]. Furthermore, Hey 1 is up-regulated in glioma samples correlating with tumor grade, and functionally its down-regulation results in a proliferation reduction [54], suggesting a role in the progression of glioblastoma. Taking all this into account, the canonical pathways more significantly altered after SOX2 inhibition are those related with intracellular signaling cascades and amino-acid metabolism pathways associated with tumor propagation. Consistent with what is known about SOX2 biological function, our data-set is enriched with genes involved in morphology determination, development and cellular proliferation and migration. Interestingly, "proliferation of tumor cells" (S2 Table), is one of the most repeated subcategories for which IPA analysis assigned an activation Z score close to -2, predicting it´s inhibition, which is in line with a putative role of SOX2 in cell proliferation.
One of the most exciting aspects of our study involved expanding our knowledge of the SOX2 transcriptome into the realm of lncRNAs. In this study we showed and classified the lncRNA landscape regulated by SOX2 in GSCs. Our microarray results showed a strong correlation with published reports, demonstrating the strength of our approach and providing confidence that we can use this data-set for de novo discovery of novel SOX2 targets, including lncRNAs. One previous study determined the differentially expressed lncRNAs between glioblastoma and brain tissues, showing 654 lncRNAs upregulated and 654 down-regulated [55]. To our knowledge, this is the first study that evaluates the differentially expression of lncRNAs in GSCs controlled by SOX2.
Among the transcripts regulated by SOX2 we found that SOX2OT was down-regulated in our data set, even though we did not validate it. SOX2OT is a lncRNA which harbors SOX2 gene in its intronic region and is transcribed in the same orientation as SOX2 [56]. Several studies have demonstrated a role of SOX2OT in the regulation of SOX2 gene in human stem cells [57,58] although little is known about the exact role of this non-coding RNA. SOX2OT has been associated with carcinogenesis and, for example in breast cancer is involved in the induction and/or maintenance of SOX2 expression [59], in esophageal squamous cell carcinoma has been shown to play a role in tumor initiation and/or progression as well as in regulation of the pluripotent state of stem cells [58], and proliferation in lung cancer cells [60]. Askarian-Amiri et al demonstrated that SOX2OT has a positive effect on SOX2 expression [59]. Published data suggest the mediation of lncRNA SOX2OT in pluripotency and tumorigenesis events, probably through regulation of SOX2 expression. These data together with our own results suggest a possible role of SOX2OT in the malignant phenotype of glioblastoma, however further functional and mechanistic studies will be necessary to elucidate the precise role of SOX2OT and other lncRNA candidates in the tumorigenicity of glioblastoma.
Although advance in managing and treating glioblastoma have been made, tumor recurrence and treatment resistance remains the major cause of glioblastoma mortality. Understanding how factors such as SOX2 drive the glioblastoma tumor phenotype will aid in the development of new therapeutic approaches based on targeting GSCs. Our study integrates for the first time the coding and non-coding transcriptome controlled by SOX2 in GSCs, gaining new insights about the molecular circuitries governing glioblastoma biology.

Conclusion
In conclusion we have performed a comprehensive analysis of differential expression of coding and non-coding transcripts controlled by SOX2 in GSCs. We performed gene set enrichment analysis to find the most relevant pathways and biological functions altered in our data set. This integrated analysis allows for a better understanding of the SOX2 transcriptome in GSCs.