Identification of a 6-Cytokine Prognostic Signature in Patients with Primary Glioblastoma Harboring M2 Microglia/Macrophage Phenotype Relevance

Background Glioblastomas (GBM) are comprised of a heterogeneous population of tumor cells, immune cells, and extracellular matrix. Interactions among these different cell types and pro-/anti-inflammatory cytokines may promote tumor development and progression. Aims The objective of this study was to develop a cytokine-related gene signature to improve outcome prediction for patients with primary GBM. Methods Here, we used Cox regression and risk-score analysis to develop a cytokine-related gene signature in primary GBMs from the whole transcriptome sequencing profile of the Chinese Glioma Genome Atlas (CGGA) database (n=105). We also examined differences in immune cell phenotype and immune factor expression between the high-risk and low-risk groups. Results Cytokine-related genes were ranked based on their ability to predict survival in the CGGA database. The six genes showing the strongest predictive value were CXCL10, IL17R, CCR2, IL17B, IL10RB, and CCL2. Patients with a high-risk score had poor overall survival and progression-free survival. Additionally, the high-risk group was characterized by increased mRNA expression of M2 microglia/macrophage markers and elevated levels of IL10 and TGFβ1. Conclusion The six cytokine-related gene signature is sufficient to predict survival and to identify a subgroup of primary GBM exhibiting the M2 cell phenotype.

cn) [15,16]. Tumor tissue samples were obtained by surgical resection. All patients (age range: 18-81 years) provided written informed consent. The study was approved by the institutional review boards of Capital Medical University, the Second Affiliated Hospital of Harbin Medical University and Beijing Institute for Brain Disorders Brain Tumor Center, and written informed consent was obtained from all patients. Survival data were collected by clinics during patient visits and/or phone interviews. Patients who underwent biopsy alone were not followed up at our center and were therefore excluded from the survival analysis. The Cancer Genome Atlas (TCGA) database (n = 518) was used as the validation set (http://cancergenome.nih.gov) [17].

Whole transcriptome sequencing
Total RNA was isolated using the RNeasy Mini Kit (Qiagen) according to the manufacturer's instructions. A pestle and a QIAshredder (Qiagen) were used to disrupt and homogenize frozen tissue. RNA integrity was assessed using the 2100 Bioanalyzer (Agilent Technologies), and only high quality samples with an RNA Integrity Number (RIN) greater than or equal to 7.0 were used to construct the sequencing library. The subsequent steps included end repair, adapter ligation, size selection, and polymerase chain reaction enrichment. DNA fragment lengths were measured using a 2100 Bioanalyzer, with median insert sizes of 200 nucleotides. The libraries were sequenced on the Illumina HiSeq 2000 platform using the 101-bp pairedend sequencing strategy. Short sequence reads were aligned to the human reference genome (Hg 19 Refseq) using the Burrows-Wheeler Aligner (BWA, Version 0.6.2-r126) [18].

Statistical Analysis
RNA sequencing data was downloaded from the CGGA dataset. Gene expression was calculated using the RPKM method (reads per kilobase transcriptome per million reads) [22,23]. The RPKM method eliminates the influence of varying gene lengths and sequencing discrepancies from the calculation of gene expression. Therefore, the calculated gene expression can be directly used to compare the differences in gene expression among samples [24].
To integrate cytokine-related gene sets, 821 cytokine-related genes were first extracted from canonical biological pathways in the Molecular Signatures Database v4.0 (MSigDB) (http:// www.broad.mit.edu/gsea/msigdb/) [25,26] and then combined with 137 cytokine-related genes from three publications in Journal of Allergy and Clinical Immunology [27][28][29]. After removing overlapping genes between the two gene sources, the cytokine-related gene set contained a total of 593 genes. The prognostic value of these genes in patient survival was calculated by the Kaplan-Meier method with the two-sided log-rank test by two packages (survival and KMsurv) of R. The permuted P-value for each gene was corrected by multiple comparison correction using the Benjamini-Hochberg false discovery rate (FDR). The genes with corrected permutation P-values < 0.01 were selected as candidate genes. In the end, six genes remained.
To evaluate the prognostic effectiveness of the six genes, a risk-score formula for predicting survival was developed based on a linear combination of the mRNA expression level (expr) weighted by the regression coefficient derived from the univariate Cox regression analysis (β) [30,31]. The risk score for each patient was calculated as follows: We next divided patients in the training dataset into high-risk and low-risk groups using the median mRNA signature risk score as the cutoff point. Patients with higher risk scores are expected to have poor survival. The same β was applied to the validation cohort.
Overall survival (OS) was defined as the period from the first operation to death or last follow-up. The progression-free survival (PFS) was defined as the period from the first operation to the time of tumor recurrence or evidence of progression based on magnetic resonance imaging (MRI). The differences in OS and PFS between high-risk patients and low-risk patients were estimated using the Kaplan-Meier method and 2-sided log-rank test in GraphPad Prism Version 6.01. Cox proportional hazards regression analyses were performed to assess the independent contribution of the mRNA signature and clinicopathologic variables to survival prediction.
GSEA was performed using Gene Set Enrichment Analysis v2 software downloaded from the Broad Institute (www.broadinstitute.org/gsea). The mRNA expression profile of GBM samples from the CGGA dataset was analyzed by GSEA [25,26]. For GSEA, risk score was treated as a binary variable divided into low or high risk by a criterion of whether the score was greater than the median value. To determine detailed immunologic functional gene sets for GSEA, we used the immunologic signature gene sets from MSigDB [26]. All other parameters were set based on their default values.
Significant difference in the comparison of two experimental groups was determined by t test. Excel was used for correlation analysis. Principal component analysis (PCA) was conducted to assess patterns in gene expression, and PCA was conducted with the R programming language (http://cran.r-project.org). STRING, a database of protein-protein interactions, was also used. STRING takes a list of gene products and returns a diagram of known and potential interactions [32]. Chi-Square test and Fisher's exact test were used to compare the frequencies between groups in SPSS version 13.0 software for Windows (SPSS). All differences were considered statistically significant at the level of p<0.05.

Results
Prediction of survival based on six cytokine-related genes signature in the CGGA database (the training cohort) In 105 CGGA primary GBM samples, we used Cox regression to analyze each of 593 cytokinerelated genes in the training set and identified six genes (CCL2, CCR2, CXCL10, IL10RB, IL17B, and IL17R) that were significantly associated with OS (S1 Table, P<0.01). We then performed the risk score method to construct a model for the prediction of survival based on these six genes [30,31]. We calculated the signature risk score for each of the 105 patients in the training set and divided them into a high-risk group (n = 52) and a low-risk group (n = 53); the median risk score was used as the cutoff point. Table 1 shows the clinical characteristics of the patients in each of the two risk groups. The rates of overall survival at one year in the low-risk and high-risk groups were 59% and 34%, respectively; the two-year survival rates were 40% and 11%, respectively. The median survival time in the low-risk group was 514 days, and that in the high-risk group was 295 days (Fig 1A, P<0.001). The high-risk score group also had shorter progression-free survival than the group with low-risk score (Fig 1B, P<0.01).
Clinical characteristics (age, gender, pre-operational Karnofsky Performance Scale (KPS) score, and treatment) and molecular information (IDH mutation and MGMT methylation status) were obtained from the CGGA dataset and are shown in Table 1.

Confirmation of the signature for survival prediction in the TCGA database (the validation cohort)
To validate the prognostic value of the 6-cytokine signature, we applied the risk score formulas from the CGGA dataset to patients from the TCGA dataset and then ranked the patients with  known risk scores in each respective situation. These patients were also divided into low-and high-risk groups, using the median risk score as a cut-off. The prognostic value of this signature was validated in the TCGA dataset (Fig 1C and 1D).
In patients treated with radiotherapy plus temozolomide-the standard treatment for primary GBM, the risk score predicted overall survival and progression-free survival in both datasets. Patients with a high-risk score had poor survival (Fig 1E and 1F The distribution of patient risk scores, OS, and mRNA expression in GBM for both the training set and the validation set is provided in Fig 2A and 2B. Patients with high-risk scores expressed high levels of these six genes. Principal component analysis (PCA) shows a good separation between the high-risk group and the low-risk group in the primary GBM specimens according to the six cytokine signature (Fig 2C).
The 6-cytokine signature is an independent prognostic factor in primary GBM We conducted univariate Cox regression analysis using clinical and genetic variables for the CGGA cohort ( Table 2) and observed that the signature (risk score), radiotherapy, chemotherapy (temozolomide), pre-operative KPS score, and MGMT promoter methylation status were statistically associated with OS. We also found that risk score, chemotherapy, pre-operative KPS score, and MGMT promoter methylation status were associated with PFS. Multivariate Cox regression analysis indicated that the signature was an independent prognostic factor in the CGGA cohort ( The high-risk score group from the 6-cytokine signature is associated with the M2 microglia/macrophage phenotype and exhibits increased expression of IL10 and TGFβ1 To better understand the relationship between immunology and inflammation, we conducted gene expression analysis between the low-risk group and the high-risk group in the CGGA samples. Gene set enrichment analysis (GSEA) [26] revealed that the subgroup with high risk score had increased expression of activated microglia-associated genes (Fig 3A, 3B and 3C, NES = 1.76, P<0.05; B, NES = 2.31, P<0.001; C, NES = 2.27, P<0.001). Using the mRNA sequencing gene expression profile, we observed that the M2 microglia/macrophage markers (CD68, CD163, CD204, CD206) [11][12][13] were significantly up-regulated in the high-risk subgroup compared to the low-risk subgroup (Fig 3D, P<0.0001, P<0.0001, P<0.0001, P<0.001, respectively). Likewise, MDSC-specific transcripts (CD11b, CD14, CD15, CD33) [33] were found at elevated levels in the high-risk subgroup (Fig 3E, P<0.0001, P<0.001, P<0.0001, P<0.0001, respectively). M2 markers, CD163 and CD204, and MDSC markers, CD11b, CD14, CD15, and CD 33, showed a similar pattern of increased expression in the high-risk subgroup of the TCGA dataset (S1 Fig; CD68 and CD206 were not included in the TCGA dataset.).
Compared to the low-risk group, IL10 and TGFβ1 mRNA were significantly elevated in the high-risk score group from the CGGA dataset and the TCGA dataset (Fig 3F, S1G and S1H  Fig). However, the M1 markers, iNOS and IL12, showed no significant difference between groups (Fig 3G, S1I and S1J Fig).
Using a more comprehensive panel of genes, we probed for associations between cell phenotype and immunomodulatory factors in the mRNA expression profile; results are displayed as a heat map (Fig 4A). The extent of Spearman correlation between the expression levels of the different genes was statistically significant. There was a strong association between the expression of M2 macrophage markers (CD68, CD163, CD204, and CD206) and MDSC markers (CD11b, CD14, CD15, and CD33). Additionally, expression of CCL2, CCR2, IL10RB, and IL17R positively correlated with the M2 macrophage markers and MDSC markers. Fig 4A also shows that mRNA expression of monocyte markers positively correlates with the expression of IL10 and TGFβ1. In contrast, iNOS and IL12 negatively correlate with the M2 cell marker CD163 and the MDSC marker CD33. In addition, STRING was used to visualize the gene list, containing cell markers and immunoregulatory factors; it was also used to show potential gene product interactions (Fig 4B).
PCA was conducted to determine whether or not there may be latent associations common to the genes encoding M2 (CD11b, CD68, CD163, CD204, CD206) and MDSC (CD15, CD14, CD33) phenotypic markers. The six cytokine-related genes, IL10, and TGFβ1, classical immunosuppressors, were also examined. The scores for individual samples (symbols) and the amount by which the expression of each gene "loads" on, or correlates with, the components (represented by the direction and length of the loading plot vectors) are shown in Fig 4C. The expression of genes that are near each other in the vector plots, for example, CD68, CD33, IL-17R, and TGFβ1, is expected to be associated. Moreover, there is a geometric link between the arrows and symbols indicating that higher levels of cell phenotype markers and immunoregulatory genes were significant in characterizing primary GBM samples (Fig 4C).
Consistent with previous work [34], we also observed that mesenchymal glioblastomas were enriched in the high-risk subgroups, supporting the report that immune genes exhibited differential expression in glioblastoma subtypes, and immunosuppression dominated in mesenchymal glioblastomas (S2 Table; CGGA: P<0.0001; TCGA: P<0.0001).

Discussion
In our present work, mRNA expression profile of 105 primary GBM samples was examined by whole transcriptome sequencing, because RNA-seq can add benefits for gene expression analysis such as quantitation of transcripts, improved dynamic range, and additional capabilities for detecting expressed single nucleotide variants (SNVs), translocations, and transcript isoform switches compared with microarray and immunochemistry [15,35,36]. We identified a six gene cytokine-related signature (CCL2, CCR2, CXCL10, IL10RB, IL17B, and IL17R) capable of dividing glioblastoma patients into a low risk score group with favorable survival and a high risk score group with poor survival. CCL2 (MCP1) is a cognate ligand for chemokine receptor CCR2, which is expressed by monocytes in peripheral blood and plays multiple roles in cancer, including chemoattraction of circulating CCR2-positive monocytes/macrophages to the tumor vicinity [37,38]. IL17 is a pro-inflammatory cytokine, which induces production of other proinflammatory cytokines, chemokines, and prostaglandins. IL17 includes six families (IL17A through F) expressed by lots of innate and adaptive immune cells, including epithelial cells, invariant natural killer T cells, natural killer cells, lymphoid-tissue inducer-like cells, neutrophils, T cells and so on [39]. In non-small-cell lung cancer patients, higher levels of IL17 within the tumor correlates with higher blood vessel density and shorter survival [40]. We have shown that expression of both IL17B and IL17 receptor are associated with survival of patients with primary GBM. Previous work has also demonstrated that endogenous CXCL10 could inhibit glioma development and promote tumor infiltration of CD8 + T cells in transgenic mice; additionally, myeloid-derived suppressor cells (MDSCs) produce CXCL10, indicating that they are pleiotropic [38,41]. We found that GBM patients harboring high mRNA expression of  CXCL10 had shorter overall survival in both CGGA and TCGA datasets. More work needs to be done to better understand the role of CXCL10 in glioma tumorigenesis. We identified that the high-risk score group from the 6-cytokine signature was associated with the M2 microglia/macrophage phenotype and exhibited increased expression of IL10 and TGFβ1. These data suggested that malignant glioma cells might induce TAMs to create a favorable microenvironment for glioma progression. Microglia are abundant in the CNS and comprise approximately 5-10% (depending on the region) of all brain cells [42]. A hallmark of glioblastoma is the large number of immune cells (i.e. microglia/macrophages) that accumulate in the tumor mass [43,44]. The number of TAMs (tumor associated microglia/macrophages) can be very high and constitute up to 30% of the tumor mass in GBM [5,14]. Moreover, TAM number is typically higher in glioblastomas compared to grade II or III gliomas; it also closely associates with vascular density in tumors [12]. TAMs that populate gliomas can originate from either activated intrinsic parenchymal microglia or from macrophages freshly derived from bone marrow precursors in the blood [14,45]. Peripheral blood-derived macrophages are largely restricted to the perivascular areas, meninx, and choroid plexus in the tumor-free brain; however, in GBM, these macrophages tend to assemble following the breakdown of the BBB [46]. Glioma-microglia/macrophage synergy drives a self-amplifying process in the tumor microenvironment. Glioma cells and microglial cells have a symbiotic relationship that becomes highly skewed in favor of the glioma. Glioma cells produce chemotactic factors, such as CCL2 and M-CSF, resulting in the recruitment of microglia and peripheral blood-derived macrophages. Glioma cells further promote the proliferation of microglia/macrophages. The immunosuppressive microenvironment created by molecules such as TGFβ and IL10 polarizes tumor-infiltrating microglia/macrophages toward the M2 phenotype. Lots of studies still delineated that TGFβ1 played a key role in affecting the malignant phenotype, including cell proliferation-promotion, migration and invasion-increasing, and apoptosis-inhibiton [47][48][49][50][51][52]. TGFβ1 increased glioma-initiating cells (GICs) self-renewal through the induction of LIF and the JAK-STAT pathway [53], and TGFβ inhibitors were emerging as compounds targeting GICs [54]. IL10 was associated with glioma progression. High grade gliomas expressed higher level of IL10 produced by glioma infiltrating macrophages (GIMs) [55]. IL10 was an important cytokine during glioma progression as a result that it promoted glioma invasion as well as an immunosuppressive environment [56]. In turn, M2 microglia/macrophages, influenced by the tumor itself, promote glioma growth, progression, invasion, and angiogenesis ( Fig 4D) [14]. Tumor associated macrophages typically are shifted toward the M2 phenotype, tumor promoting, and immunosuppressive [57,58]. miR-142-3p administration resulted in glioma growth inhibition by modulating M2 macrophages through the transforming growth factor beta signaling pathway [25]. Based on our result, targeting tumor-associated microglia/macrophages may be a novel, useful therapeutic approach for the high risk score group patients.
In conclusion, our signature reflects the proportion of M2 microglia/macrophages between the high-risk group and the low-risk group. The signature can be used to assess glioblastoma patient prognosis. It also provides new information regarding immunotherapy and the infiltration of immune cells into glioma tumors. line-database evidence; black line-co-expression evidence. mRNA expression of genes from (C) were used for PCA analysis, which is represented by 2-dimensional visualization. The symbols represent independent patient data (blue-low risk group; pink-high risk group). PCA projections of the first 2 principal components are shown. Arrows represent individual genes with the points directed at their loading coordinates. (D) Tumor-derived molecules, such as TGFβ and M-CSF, can polarize glioma-associated microglia/microphages (MMs) toward the M2 phenotype and stimulate the production of antiinflammatory molecules. Other glioma-derived molecules, such as CCL2 and VEGF, can recruit myeloid cells into the tumor site. TAMs refer to tumorassociated microglia/macrophages. doi:10.1371/journal.pone.0126022.g004 Supporting Information S1 Fig. Confirmation of mRNA expression (cell phenotype and immunomodulatory genes) in the high and low subgroups in the TCGA dataset. CD163 and CD204 are significantly upregulated in the high-risk group (A, B; CD68 and CD206 were not included in the TCGA dataset). Expression of CD11b, CD14, CD15, and CD33 were increased in the high-risk group (C, D, E, F). TGFβ1 and IL10 are expressed at higher levels in the high-risk group (G, H). iNOS and IL12 are not significantly different between the low-risk and high-risk groups (I, J). (TIF) S1