Conventional differential expression analyses have been successfully employed to identify genes whose levels change across experimental conditions. One limitation of this approach is the inability to discover central regulators that control gene expression networks. In addition, while methods for identifying central nodes in a network are widely implemented, the bioinformatics validation process and the theoretical error estimates that reflect the uncertainty in each step of the analysis are rarely considered. Using the betweenness centrality measure, we identified Etv5 as a potential tissue-level regulator in murine neurofibromatosis type 1 (Nf1) low-grade brain tumors (optic gliomas). As such, the expression of Etv5 and Etv5 target genes were increased in multiple independently-generated mouse optic glioma models relative to non-neoplastic (normal healthy) optic nerves, as well as in the cognate human tumors (pilocytic astrocytoma) relative to normal human brain. Importantly, differential Etv5 and Etv5 network expression was not directly the result of Nf1 gene dysfunction in specific cell types, but rather reflects a property of the tumor as an aggregate tissue. Moreover, this differential Etv5 expression was independently validated at the RNA and protein levels. Taken together, the combined use of network analysis, differential RNA expression findings, and experimental validation highlights the potential of the computational network approach to provide new insights into tumor biology.
Citation: Pan Y, Duron C, Bush EC, Ma Y, Sims PA, Gutmann DH, et al. (2018) Graph complexity analysis identifies an ETV5 tumor-specific network in human and murine low-grade glioma. PLoS ONE 13(5): e0190001. https://doi.org/10.1371/journal.pone.0190001
Editor: Ilya Ulasov, Northwestren University, UNITED STATES
Received: September 3, 2017; Accepted: December 6, 2017; Published: May 22, 2018
Copyright: © 2018 Pan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Details of Datasets RNA-Seq Data for 3-month-old murine non-neoplastic and OPG-1, OPG-2, OPG-3, OPG-4 models: RNA-Seq murine data that formed the basis of the original and follow-up analyses. The non-neoplastic samples are also denoted as FF (flox/flox control); OPG-1 model is also denoted as FMC (Gfap-Cre NF1 mutants); the OPG-2 model is also denoted as F18C (germline F18C NF1 mutants); the OPG-3 is also denoted as FMPC (Gfap-Cre NF1/PTEN mutants); and the OPG-4 model is also denoted as FMOC (Olig2-Cre NF1 mutants). The samples are publicly available at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE102345. RNA-Seq Data for 6-month-old murine non-neoplastic and OPG-1 model: RNA-Seq murine data on 6-month-old mice. The OPG-1 model is also denoted as FMC (Gfap=Cre NF1 mutants). The samples are publicly available at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE102345. qPCR Data for 3-month-old murine OPG-1: In these experiments, we chose 12 genes (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1, Nlgn3, Dusp6, Lrp4, Rsbn1l, and Socs2) for qPCR validation in optic nerves from normal healthy and optic glioma-bearing mice (n=3 mice/group). Cell-type based expression data from the brain RNA-Seq database: Data analyzed from the brain RNA-seq database . The samples are publicly available at: http://web.stanford.edu/group/barres_lab/brain_rnaseq.html. Microarray Data for Pediatric Pilocytic Astrocytoma: The dataset is available at the NCBI GEO repository, GSE42656 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42656). We used the samples from 14 pediatric pilocytic astrocytomas and 8 fetal cerebellum samples. The gene expression samples were measured using Illumina HumanHT-12 V3.0 expression beadchip.  Microarray Data for Juvenile Pilocytic Astrocytoma: The dataset is available at the NCBI GEO repository, GSE12907 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12907). We used the samples from 21 juvenile pilocytic astrocytomas and four normal cerebellum samples. The gene expression samples were measured using an Affymetrix Human Genome U133A array. 
Funding: This work was funded by grants from the James S. McDonnell Foundation (to DHG); and the National Cancer Institute (1R01-CA195692-01 to DHG, AR, and JH). YP was supported by a McDonnell Center fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Similar to other ecological systems, mammalian tissues can also be considered as complex biological systems, composed of a multitude of cellular and acellular elements that each contribute to overall biosystem function. In this regard, both normal and disease tissues contain numerous distinct cell types and molecular components that bi-directionally interact to establish new functional states for tumor tissue relative to their normal non-neoplastic counterparts. One natural outgrowth of this conceptualization is the idea that normal healthy and disease tissues can be defined in an objective manner using computational approaches. Algorithms have been used to classify diseased states, to assess risk as a function of specific factors such as gender, age, and environmental exposures [1–3]; and to identify individualized treatments based on gene expression profiles [4–6].
In addition to disease state classification, computational modeling can also be employed to identify ecosystem relationships that exist in the tissue as a whole. One type of model consists of representing relationships as a network, where the nodes are proteins, transcription factors, or genes, and the edges between nodes signify communication pathways. Networks highlight core differences between normal healthy and tumor tissues, and, as such, might serve to identify unanticipated regulatory pathways germane to the maintenance of the tumor. To investigate potential networks, we leveraged one authenticated murine model of a brain tumor (optic glioma) that arises in children with the neurofibromatosis type 1 (NF1) cancer predisposition syndrome [1, 2]. These optic glioma tumors are low-grade (World Health Organization grade I pilocytic astrocytomas) neoplasms, which develop early in childhood . Since these tumors are not removed or biopsied in children with NF1, we specifically chose this Nf1 genetically-engineered mouse low-grade glioma model system, because it recapitulates many of the features seen in the human condition and has been successfully employed to evaluate promising targeted therapies now in clinical trial for children with the tumors [3–6] (http://clinicaltrials.org; NCT01089101, NCT01158651 and NCT01734512).
Applications of network analysis to the understanding of cancer biology are relatively new, and can be divided into three methodological types: (1) enrichment of fixed gene sets, (2) de novo subnetwork construction and clustering and (3) network-based modeling . While we applied all three methods to characterize murine Nf1 optic gliomas, the network-based approach (the third type) was the only one with sufficient power to leverage relational network information to define specific regulator connections.
In this report, we describe one specific type of network analysis based on measures of network complexity . We focused on a standard measure of network complexity predicated on the idea of centrality, as measured by betweenness. The general idea is that network complexity analysis reveals sub-networks that are the biggest contributors to genetic complexity. Genetic complexity equates to a “surplus of genotypic diversity over phenotypic diversity” , and is associated with networks that are more difficult to interpret. Our proof-of-principle study uses complexity analysis of weighted transcription networks to identify transcription factors that comprise a regulatory network unique to low-grade brain tumors arising in the optic nerves of Nf1 mutant mice (optic gliomas). The edges of the transcription network are given weights based on transcription data, so that the RNA expression data from a normal control group result in a “normal” network, while data from a group of tumor samples are used to define a “tumor” network. Having a normal network and a tumor network side-by-side is important for comparing the two experimental conditions. Using both networks simultaneously, we were able to identify the Etv5 network as a defining feature of the neoplastic state in mouse and human low-grade glioma tumors.
Materials and methods
Nf1flox/flox (N), Nf1flox/null; GFAP-Cre (OPG-1), Nf1flox/R681*; GFAP-Cre (OPG-2), Nf1flox/null; Ptenflox/wt; GFAP-Cre (OPG-3), and Nf1flox/null; Olig2-Cre (OPG-4) mice [2, 11–13] were maintained on a C57Bl/6 background. Mice were housed in standard caging systems, which included irradiated commercial mouse chow and acidified water available at all times. Temperatures were kept at 70–72°F with 50–60% humidity on static racks. Light cycles were set at 5:00am on and 7:00pm off. All cages include pressed cotton squares for environmental enrichment that were changed multiple times per week. Every animal was checked at least once per day, and mating cages and cages with litters were checked twice per day for appearance, movement and overall health to monitor well-being. Mice were euthanized with 200 mg/kg sodium pentobarbital via i.p. prior to tissue isolation. All procedures were performed in accordance with an approved Animal Studies Committee protocol at Washington University with associated ethics committee approval.
Mice were perfused, followed by optic nerves/chiasms microdissection, and preservation in TRIzol solution. RNA was isolated by phenol/chloroform extraction followed by isopropanol precipitation. RNA from primary astrocytes was isolated using the RNeasy Mini Kit (Qiagen) according to the manufacturer’s protocol.
Quantitative real-time PCR
Primary astrocyte culture
RNA-sequencing of optic nerves/chiasms isolated from 6-weeks-old mice was performed as previously described .
Regulatory network construction
The glioma network was inferred from the Rembrandt microarray data set (available from GEO GSE68848 ). The Rembrandt data were generated through the Glioma Molecular Diagnostic Initiative and include 874 glioma specimens. With the Rembrandt data, the network was built using the ARACNe-AP algorithm [22, 23].
The ARACNe-AP algorithm reconstructs gene regulatory networks from large-scale gene expression data . The steps used in this algorithm are summarized below:
- The input to the algorithm is a list of transcription facts and gene expression profile data.
- The gene expression data is pre-processed in order to determine a mutual information (MI) threshold. Specifically, all pairwise MI scores between gene expression profiles are estimated, and then their significance assessed by comparing them to a null dataset. The significance level depends on the sample size of the input.
- The bootstrapping step: selecting a random sample from the input gene expression profiles.
- The gene expression profiles are rank-transformed, and then the MI for each Transcription Factor/Target pair is calculated.
- The MI threshold from Step 2 is used to remove any connections that are not statistically significant.
- Indirect interactions are removed using the Data Processing Inequality tolerance filter described in .
- A network is constructed using consensus, by keeping only edges that were detected a significant number of times across all bootstrap runs in Step 3. Significance is determined using a Poisson distribution, keeping only those edges with a p-value less than 0.05.
The ARACNe algorithm, first described in , has been successfully used for over a decade. The Adaptive Paritioning version of the algorithm improves on the original method by eliminating the use of pre-determined bins of fixed sizes for estimating the mutual information. The resulting ARACNe-AP algorithm and its validation are described in detail in . Various networks constructed using the algorithm are available online from bioconductor.org as the aracne.networks data package. The regulatory network used in this study is provided as part of the supplementary materials (S2 Table).
RNA expression data from optic nerves of Nf1flox/flox (N, normal control group, n = 4) and Nf1flox/mut; GFAP-Cre (OPG-1, tumor group, n = 11) were generated previously  and used to create separate weighted networks. Note that the network structure is the same for the two groups, but the weights assigned to the edges differ. Genes that were not expressed in one of the groups were removed from the network. The network weights are given by the distance between two nodes (i.e., between two genes). The distances between two genes (j and k) are calculated as 1 − | ρ(xj, xk)| where ρ(xj, xk) is the minimum of the Pearson and Spearman correlations between the expression levels of gene j and gene k.
The betweenness centrality of gene i is the fraction of shortest paths in the network that go through gene i. Note that our original choice of transcriptome network built by the ARACNe-AP algorithm has a consequence that only regulators (and not their targets) have high centrality measures. Centrality measures were calculated using the igraph package in R .
Differential expression analysis
The differential expression and fold change calculations on the RNA-Seq datasets (including all of the OPG datasets) were performed using the R package DESeq2 version 1.18.1 , including use of Benjamini-Hochberg adjusted p-values  with an adjusted significance level of 0.01. Published microarray data were analyzed by comparing pilocytic astrocytomas and normal control samples. The publicly available data were already normalized, and we performed standard two-sample t-tests on all probes in each dataset. After analysis at the probe level, data were collated based on the gene to which the probes mapped. Because the analysis was confirmatory for ETV5 and its network (and not exploratory), we did not adjust for multiple comparisons.
Details of the datasets employed
- RNA-Seq Data for 3-month-old murine non-neoplastic and OPG-1, OPG-2, OPG-3, OPG-4 models. RNA-Seq murine data that formed the basis of the original and follow-up analyses. The non-neoplastic samples are also denoted as FF (Nf1flox/flox); OPG-1 model is also denoted as FMC (Nf1flox/null; GFAP-Cre); the OPG-2 model is also denoted as F18C (Nf1flox/R681*; GFAP-Cre); the OPG-3 is also denoted as FMPC (Nf1flox/null; Ptenflox/wt; GFAP-Cre); and the OPG-4 model is denoted as FMOC (Nf1flox/null; Olig2-Cre). The samples are publicly available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE102345.
- RNA-Seq data for 6-week-old murine non-neoplastic and Nf1 OPG-1 model0 RNA-Seq murine data on 6-week-old OPG-1 (FMC) mice. The OPG-1 model is also denoted as FMC. The samples are publicly available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE102345.
- qPCR data for 3-month-old murine Nf1 OPG-1 model. In these experiments, we chose 12 genes (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1, Nlgn3, Dusp6, Lrp4, Rsbn1l, and Socs2) for qPCR validation in optic nerves from normal healthy and optic glioma-bearing mice (n = 3 mice/group), 8 genes that demonstrated differential expression patterns similar to the original RNA-seq data (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1 and Nlgn3, Fig 1), while 4 demonstrated different patterns or no change (Dusp6, Lrp4, Rsbn1l and Socs2, data not shown). The p-values were calculated using a t-test (GraphPad, Prism).
- Cell-type based expression data from the brain RNA-Seq database. Data analyzed from the brain RNA-seq database , Fig 2 compares the expression levels in various cell types. The samples are publicly available at http://web.stanford.edu/group/barres_lab/brain_rnaseq.html.
- Microarray data for pediatric pilocytic astrocytomas. The dataset is available at the NCBI GEO repository, GSE42656 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42656), which included 14 pediatric pilocytic astrocytomas and 8 fetal cerebellum samples. The gene expression samples were measured using Illumina HumanHT-12 V3.0 expression beadchip . In addition, GSE12907 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12907), contained 21 juvenile pilocytic astrocytomas and four normal cerebellum samples. The gene expression samples were measured using an Affymetrix Human Genome U133A array .
Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1 and Nlgn3 RNA expression is increased in optic glioma samples (OPG-1) relative to the normal murine optic nerve (N). n.d., not detected. FC, fold change. p = 0.0003 (Gldc), p = 0.0258 (Spry4), p<0.0001 (Fabp5), p = 0.0005 (Spry2), p = 0.0045 (Shc3), p = 0.0111 (Spred1), p = 0.0008 (Nlgn3). The unadjusted p-values were calculated separately using a t-test.
(A) Etv5 RNA expression in astrocytes, neurons, OPCs (oligodendrocyte progenitor cells), newly formed oligodendrocytes, myelinating oligodendrocytes, microglia and endothelial cells demonstrates preferential enrichment in astrocytes, OPCs and microglia. Data analyzed from the brain RNA-Seq database  (http://web.stanford.edu/group/barres_lab/brain_rnaseq.html). (B) Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1 and Nlgn3 expression was enriched in astrocytes, OPCs and microglia. Spred1 was the only gene expressed in microglia.
The network complexity analysis in this case study is divided into two steps: (1) discovery of Etv5 with computational methods and (2) validation of Etv5 with independent experimental techniques. The edges of the transcription network are given weights based on expression data such that expression data from a normal control group (N, optic nerves from Nf1flox/flox mice, which are equivalent to wild-type mice ), result in a “normal” network, while data from a group of optic glioma tumor samples (OPG-1, optic nerves from Nf1flox/null; GFAP-Cre mice, ) comprise a “tumor” network. While the individual bioinformatics steps are not new, our novel approach illustrates how this practical approach can be implemented to identify unknown regulatory networks.
First, we use network complexity to identify genes that are central to the tumor network. Herein, the term central is used in the context of network complexity analysis: a node’s centrality measure is determined by the frequency with which it appears in paths between other nodes in the network, as described below. Next, we identify a single transcription factor, Etv5, not previously implicated in low-grade glioma, as the gene most differentially central: As such, Etv5 was the gene most central in the tumor network, and was not central in the normal network. Finally, we validated the differential expression of Etv5 in several other Nf1 murine models of optic glioma, as well as in human low-grade glioma (pilocytic astrocytoma, PA) datasets. This validation was accomplished by comparing expression levels of Etv5 and its target genes in normal healthy optic nerves and optic gliomas, as well as by determining the cell type expression profiles of those genes.
Bioinformatic discovery of Etv5
In our analysis, we employed two distinct computational techniques (network and differential expression analyses) to provide different information about the relationships between the genes within the networks and across the experimental conditions. The discovery of Etv5 and its network was only possible using the combination of the different computational techniques (see Algorithm 1).
Network and complexity analysis
In order to identify genes that might play a role in disease evolution, we sought to determine what has changed in the intra-cellular network that describes cellular processes. We also sought to identify genes whose role had significantly changed as the tissue evolved from a healthy to a diseased state. This study describes how to combine a centrality analysis with differential expression to identify potential genes of interest. In order to do this, we required a network that represented possible interactions between genes. We utilized expression data from samples in different groups (healthy and diseased, for example) to add weights to an existing interactome to create a weighted network for each group. This allowed us to identify “important” subnetworks in each network, and to determine which genes are central to the important subnetworks in the diseased tissue.
A gene regulatory network (GRN) depicts how some genes encoding regulatory molecules, such as transcription factors or microRNAs, regulate the expression of other genes . As the base network for the Etv5 analysis, we used a regulatory network based on biological interactions between transcription factors and their targets. This network was developed in prior research by another group, and is described more fully in . The nodes of the network represent regulator and target genes, and the edges represent protein-DNA interactions. The regulatory network was derived from transcription (RNA) data, representing a variety of different gliomas, including low-grade (benign) and high-grade (malignant) gliomas, to capture pan-glioma regulatory interactions. Four hundred twenty-seven human glioma gene expression profiles were obtained from the Rembrandt data repository , and these were combined to create a network using the ARACNe-AP (Algorithm for the Reconstruction of Accurate Cellular Networks). This network was selected as the most appropriate to use, since it is based on transcription data from gliomas. We added edge weights to this network using RNA expression data, as described below. The details of the ARACNe algorithm are provided in : The basic procedure uses mutual information (MI) to measure similarity across genes and then bootstrapping to measure the strength and consistency of the similarities. A step-by-step description of the algorithm is presented in the Materials and Methods section. The regulatory network is available as S2 Table.
Centrality measures are commonly used to identify nodes that could potentially play important roles in weighted networks . Some frequently used centrality measures are closeness, betweenness, and entropy; for example West et al. use differential entropy between tumor and normal tissue to detect relevant genes . The betweenness measure was selected as the best for distinguishing between normal control and tumor-weighted networks, since its large range of values led to a clear discrimination between the tumor and control networks. Fig 3 and S1 Fig show the comparison of using betweenness and closeness to differentiate the tumor and normal networks. The betweenness metric identifies genes that are substantially different across the two networks; the closeness metric does not identify any such stand-out genes. It should be noted that Etv5 was not among the 50 most differentially-expressed genes by standard differential expression analysis, and it is unlikely that it would have been identified as a potential glioma network regulator.
Filled (red) circles indicate genes whose betweenness measure is at least 1.1 times as large in the tumor network as in the normal network and either a tumor betweenness or normal betweenness value greater than 1e6. These genes are listed in Table 1, and shown in pink in Fig 4.
The Betweenness Centrality of gene i (i.e., node i), bi, is the fraction of shortest paths in the network that go through gene i: where nj,k is the number of shortest paths connecting genes j and k, and nj,k(i) is the number of shortest paths that also go through gene i. The length of the path between any two genes is given by the sum of the distances, or weights, of its edges, where the distances were computed using the expression data. Because the edge values are based on the expression data, the betweenness measures will differ between the normal and tumor expression networks. Importantly, the genes with larger betweenness values in the tumor expression network relative to the normal expression network were identified.
Using the 3-month-old murine OPG-1 RNA-Seq dataset, 22 genes were identified as having betweenness values at least 1.1 times as large in the tumor expression network as in the normal expression network and either a tumor betweenness or normal betweenness value greater than 1e6. These genes are shown as solid, red circles in Fig 3; listed in Table 1; and denoted as “central” to the expression network.
Algorithm 1: Identifying Etv5 network
- A reference network is identified for all transcription factors. In this study we use the regulatory network created with the ARACNe-AP algorithm
- Weights are added to the edges of the reference network using one minus the minimum of the Pearson or Spearman correlation of RNA-Seq data. Two different weighted networks are created—one with tumor RNA-Seq data and one with normal RNA-Seq data
- From each weighted network, the betweenness value for each gene is calculated
- Genes with betweenness values 1.1 times as big in the tumor network relative to the normal network and having either a tumor betweenness or a normal betweenness greater than 1e6 are identified. These genes are considered “central” in the tumor network (Table 1).
- Of the “central” genes, the most differentially expressed gene is identified (Etv5).
- Genes which represent targets of the Etv5 transcription factor and are differentially expressed across the two conditions were identified (Table 2).
Identification of ETV5
Once the twenty-three “central” genes were identified, we determined which of them behaved differently in the two groups with respect to average RNA expression. One gene, Etv5, emerged as having a large differential effect across the two expression datasets on the 3-month-old murine OPG-1 samples. Using the original regulatory network, we identified the target genes for each of the regulators listed in Table 1. Differential expression analysis was employed to identify which of the central genes and which of their target genes were significantly different between the tumor (OPG-1) and the normal (N) 3-month-old samples.
Importantly, of the regulator genes that were determined to be “central” according to the betweenness measure (Table 1), the gene that was mostly clearly differentially expressed was Etv5 (Table 3). Additionally, Etv5 also had the highest percentage of differentially-expressed targets of all of the central genes (Table 4). A list of the 31 differentially-expressed Etv5 target genes is shown in Table 2. The network consisting of the central regulators, along with the differentially expressed targets of Etv5 is shown in Fig 4. Henceforth, we call the subnetwork consisting of Etv5 and its differentially expressed targets as “the Etv5 network”.
The subnetwork is comprised of Etv5 (in lavender in the center), and its differentially-expressed targets (shown in yellow). The remaining central genes, identified by their high betweenness measures relative to the normal network, are shown on the periphery in pink.
To extend the in silico findings, and to validate Etv5 as a differentially-expressed tumor-specific gene, we performed a series of complementary experiments. These experiments showed that Etv5 and its targets are differentially expressed in different cell types, at different stages of tumor growth, and in different organisms, and lend further credence to the notion that Etv5 and its associated target genes comprise a potential regulatory network in neoplastic tissue relative to their normal healthy, non-neoplastic counterparts.
In addition to RNA-Seq experiments, we also used quantitative real-time PCR (qPCR) to analyze Etv5 RNA expression in independently-generated non-neoplastic (Nf1flox/flox ; abbreviated N for normal) and tumor-bearing (Nf1flox/null GFAP-Cre ; “OPG-1”) optic nerves. For these studies, four nerves and chiasms from each group were removed from 3-month-old mice at a time when optic gliomas in these mice are first detected . Consistent with the RNA-seq data, Etv5 expression was detected in the tumor-bearing optic nerves, but not in their non-neoplastic counterparts (Fig 5A). Moreover, Etv5 protein expression was also observed in optic glioma tissue, but not in normal healthy optic nerves, by immunohistochemistry with an Etv5-specific antibody on paraffin-embedded formalin-fixed specimens (Fig 5B). Collectively, these results validate the tumor-specific expression of Etv5 at both the RNA and protein levels.
(A) Etv5 expression was detected in in one representative murine Nf1 optic glioma model (OPG-1) but was not detected (n.d.) in the normal murine optic nerve (N) by qPCR. (B) Immunohistochemistry using Etv5-specific antibodies demonstrates Etv5 protein expression in the optic glioma (OPG-1, shown in brown), but not in the normal optic nerve (N). Nuclei are stained with hematoxylin (shown in blue). Scale bar, 20μm.
As an additional method to strengthen the specificity of the observed differential Etv5 expression in the neoplastic tissue, we performed RNA-seq analysis on 6-week-old optic nerves from normal healthy and optic glioma-bearing mice (n = 3 mice/group). At this stage of glioma development, the tumors are just beginning to form, as evidenced by an increase in tumor proliferation and glial cell numbers, but without a change in optic nerve volume . Similar to the results obtained from 3-month-old mouse specimens, there was differential Etv5 RNA-Seq expression in 6-week-old optic glioma-bearing mice, which contain a developing, but not mature, tumor  (S2 Fig). These observations suggest that differential expression of Etv5 defines a ground state of neoplasia that exists at a time when it is not yet possible to clearly classify the mouse optic nerve as a tumor.
Numerous prior studies from our laboratory have employed differential RNA expression methods to identify genes unique to murine Nf1 optic glioma. These include an analysis of neoplastic and non-neoplastic cell types; however, none of these previous studies identified Etv5 as a differentially-expressed gene [16, 34]. In this respect, previous studies including gene expression profiles for isolated cells in optic glioma did not find that Etv5 was differentially regulated in any specific single cell type (e.g., astrocytes, microglia). These particular findings were confirmed by the analysis of Nf1-deficient neoplastic cells (astrocytes) by qPCR (S3 Fig) and tumor-associated microglia (non-neoplastic cells ) using isolated cells and data previously acquired. Taken together, these observations argue that differential Etv5 expression reflects a “whole tumor” property, rather than representing the individual contributions from any single cell type within the tumor.
Next, we sought to determine whether the target genes of Etv5, which were differentially expressed in the tumor, are similarly differentially expressed in tumor-bearing optic nerves relative to normal healthy non-neoplastic optic nerve. In these experiments, we chose 12 genes (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1, Nlgn3, Dusp6, Lrp4, Rsbn1l, and Socs2) for independent validation by qPCR (Fig 1). Of the 12 genes chosen, 67% demonstrated differential expression patterns similar to the original RNA-seq data (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1 and Nlgn3), while four demonstrated different patterns or no change (Dusp6, Lrp4, Rsbn1l and Socs2).
Because the expression of Etv5 targets in the tumor may reflect the expression of these genes in different cell types, we queried an established brain cell type-specific transcriptome database . We focused on Etv5 and those targets that were significantly differentially expressed (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1 and Nlgn3). Using this brain RNA-Seq database , we found enrichment of Etv5 expression in astrocytes, oligodendrocyte progenitor cells (OPC) and microglia relative to endothelial cells, neurons, and more differentiated oligodendrocytes (Fig 2A). These findings are consistent with biological studies demonstrating that astrocytes , o-GSCs (optic glioma stem cells) , and microglia [6, 16] are all critical mediators of glioma formation and growth. Interestingly, seven of the eight Etv5 target genes were expressed predominantly in o-GSCs and astrocytes, whereas Spred1 was expressed in all three cell types (Fig 2B). These observations additionally support the idea that the network formed by Etv5 and its targets reflects a global change in the tumor ecosystem, rather than differential gene expression by only one cell type in the tumor.
To demonstrate that differential Etv5 expression is a hallmark of murine Nf1 optic glioma, we leveraged prior data from our laboratory in which multiple distinct genetically-engineered mouse models of optic glioma were generated. Each of these models differs in important respects, including the germline Nf1 gene mutation (OPG-2; c.2041C>T; p.R681X as seen in several patients with NF1-OPG ), the presence of additional genetic mutations (OPG-3; additionally harboring heterozygous Pten loss ), and the tumor cell of origin (OPG-4; Nf1flox/mut; Olig2-Cre in which tumors arise from Olig2+ cells). While the four optic glioma models are histologically similar to each other, they represent molecularly-distinct tumors, as assessed by RNA-seq analysis . In each of the murine Nf1 OPG models, we observed increased Etv5 RNA expression (Table 5). Moreover, in at least three out of the four optic glioma models, there is increased expression of the eight validated Etv5 target genes (Gldc, Spry4, Fabp5, Pcdhgc3, Spry2, Shc3, Spred1 and Nlgn3) (Table 5). These findings further strengthen the conclusion that differential Etv5 expression is a hallmark of the neoplastic state, as it is shared amongst multiple distinct murine Nf1 optic glioma models.
Finally, to determine whether differential ETV5 expression is also a feature of human pediatric low-grade gliomas (pilocytic astrocytoma; PA), we leveraged the only two human RNA microarray datasets that contained reference non-neoplastic tissue for comparison (GSE42656 and GSE12907). We chose these datasets for two major reasons: First, there are no currently-available datasets with RNA expression data on NF1-associated optic glioma and normal optic nerve, since these tumors are rarely biopsied as part of routine medical care. Second, while the genetic etiology of sporadic and NF1-associated PA are distinct, they both result in activation of the same growth control pathways [3, 35–39] and are histologically similar .
In both the pediatric and the juvenile pilocytic astrocytoma datasets, ETV5 was differentially expressed in the tumor groups relative to the non-neoplastic samples (Fig 6), similar to that observed in the mouse low-grade gliomas. Moreover, using the thirty-one Etv5 target genes that were differentially expressed in the murine tumors (Table 2), a large fraction of the probes for those targets were also differentially expressed (unadjusted p-values) in the two human datasets (56.1% in the pediatric pilocytic astrocytoma dataset and 40.8% of the probes in the juvenile pilocytic astrocytoma dataset; Fig 7).
ETV5 is differentially expressed in the tumor groups from both datasets. Using the pediatric pilocytic astrocytoma data, both of the two Illumina probes for ETV5 are significantly differentially expressed (p < 0.0001 for both probes). Using the juvenile pilocytic astrocytoma data, all three of the Affymetrix probes are significantly differentially expressed (p = 0.001, p = 0.00005, p = 0.00003 for the three probes). Vertical axis is on a log2-scale.
Histograms of (unadjusted) p-values for differential expression of ETV5 target genes in two human data sets; vertical lines are drawn at p = 0.05. In the GSE42656 dataset, 56.1% of all probes and 64.5% of the target genes are significantly differentially expressed (top panel). In the GSE12907 dataset, 40.8% of all probes and 38.7% of the target genes were significantly differentially expressed (middle panel). In the 6-week-old murine OPG-1 data set, 48.3% of the target genes were significantly differentially expressed (lower panel).
Leveraging the betweenness network analysis and RNA-sequencing of murine optic glioma tissues, we discovered Etv5 and Etv5-associated genes as a differentially-regulated transcriptional network in low-grade brain tumors relative to non-neoplastic brain tissue. We further validated the differential expression of Etv5 and its target genes in several other murine models of Nf1 optic glioma, as well as in human pilocytic astrocytomas. This type of network analysis is extremely useful in identifying potential network regulators unique to the neoplastic state, since multiple differentially-expressed genes were identified in tumor versus normal healthy tissue. In addition, the deployment of multiple computational and statistical analysis strategies, as well as biological validation across species and time further strengthen the idea that Etv5 may be a central control element in establishing and/or maintaining low-grade gliomas. Moreover, the identification of Etv5 and its network as a tissue-level signature is supported by the lack of differential Etv5 expression in isolated Nf1-deficient astrocytes or Nf1-mutant microglia, which are the major neoplastic and non-neoplastic cells in the optic glioma ecosystem.
ETV5 is a transcription factor that belongs to the conserved E26 transformation-specific family of transcription factors. Relevant to our study, ETV5 levels are elevated in many tumors, including glioma [41–43]. As such, ETV5 has been hypothesized to function downstream of RAS , which is consistent with the major role of the Nf1 protein (neurofibromin) as a negative regulator of RAS in the brain [3, 19]. In addition, ETV5 function is required to sustain Nf1-deficient high-grade glioma growth , but had not been previously implicated in low-grade gliomas. Since ETV5 expression is not a hallmark of any particular neoplastic cell type in the low-grade gliomas, it is most likely that the network established reflects aberrant RAS activation in numerous cell types in the tumor. Future mechanistic studies will be required to address the relationships between NF1/RAS/MEK/ERK pathway function, ETV5 network regulation, and NF1-associated low-grade glioma formation or maintenance.
Taken together, we present a proof-of-concept study that establishes betweenness network analysis as a valuable tool for identifying central genes unique to the tumor ecosystem. While this approach nicely illustrates the value of computational and bioinformatic approaches for characterizing the neoplastic state relative to its non-neoplastic counterpart, further investigation is required to understand the mechanistic role of Etv5 in glioma formation and maintenance. Based on these findings, we recommend the use of network analysis for similar expression studies designed to differentiate various tumor states.
S1 Fig. Comparison of closeness measures in the normal and tumor networks.
Filled (red) circles indicate genes whose betweenness measure is at least 1.1 times as large in the tumor network as in the normal network and either a tumor betweenness or normal betweenness value greater than 1e6 (identified in the body of the manuscript). These genes are listed in Table 1, and shown in pink in Fig 4. Note that the closeness metric does not differentiate the tumor and normal networks as well as betweenness (see Fig 3).
S2 Fig. Etv5 expression in 6-week-old optic glioma-bearing mice.
Etv5 RNA expression is higher in 6-week-old optic glioma-bearing mice relative to control non-neoplastic optic nerves (three samples for each experimental group; p = 0.0113). Note that there is no overlap in the expression of ETV5 for the control versus optic glioma-bearing nerves.
S3 Fig. Nf1-independent regulation of Etv5 expression in primary astrocytes.
No differences in Etv5 mRNA expression were observed between wild-type and Nf1-deficient astrocytes, as assessed by quantitative real-time RT-PCR. Three independently-generated pairs of wild-type and Nf1-/- primary brainstem astrocytes were generated, and maintained as previously described [1–3].
S1 Table. Primers used for real-time quantitative PCR.
S2 Table. The gene regulatory network used as a reference network.
We thank Courtney Corman for expert technical assistance. We thank Mariano Alvarez for providing us with the glioma ARACNe network. All mice were maintained on a C57Bl/6 background and procedures performed in accordance with an approved Animal Studies Committee protocol at Washington University.
- 1. Bajenaru ML, Garbow JR, Perry A, Hernandez MR, Gutmann DH. Natural history of neurofibromatosis 1-associated optic nerve glioma in mice. Ann Neurol. 2005;57(1):119–27. pmid:15622533.
- 2. Bajenaru ML, Hernandez MR, Perry A, Zhu Y, Parada LF, Garbow JR, et al. Optic nerve glioma in mice requires astrocyte Nf1 gene inactivation and Nf1 brain heterozygosity. Cancer Res. 2003;63(24):8573–7. pmid:14695164.
- 3. Kaul A, Toonen JA, Cimino PJ, Gianino SM, Gutmann DH. Akt- or MEK-mediated mTOR inhibition suppresses Nf1 optic glioma growth. Neuro Oncol. 2015;17(6):843–53. pmid:25534823
- 4. Hegedus B, Banerjee D, Yeh TH, Rothermich S, Perry A, Rubin JB, et al. Preclinical cancer therapy in a mouse model of neurofibromatosis-1 optic glioma. Cancer Res. 2008;68(5):1520–8. pmid:18316617.
- 5. Daginakatte GC, Gianino SM, Zhao NW, Parsadanian AS, Gutmann DH. Increased c-Jun-NH2-kinase signaling in neurofibromatosis-1 heterozygous microglia drives microglia activation and promotes optic glioma proliferation. Cancer Res. 2008;68(24):10358–66. pmid:19074905.
- 6. Daginakatte GC, Gutmann DH. Neurofibromatosis-1 (Nf1) heterozygous brain microglia elaborate paracrine factors that promote Nf1-deficient astrocyte and glioma growth. Hum Mol Genet. 2007;16(9):1098–112. pmid:17400655.
- 7. Listernick R, Ferner RE, Liu GT, Gutmann DH. Optic pathway gliomas in neurofibromatosis-1: controversies and recommendations. Ann Neurol. 2007;61(3):189–98. pmid:17387725.
- 8. Creixell P, Reimand J, Haider S, Wu G, Shibata T, Vazquez M, et al. Pathway and network analysis of cancer genomes. Nat Methods. 2015;12(7):615–21. pmid:26125594
- 9. Breitkreutz D, Hlatky L, Rietman E, Tuszynski JA. Molecular signaling network complexity is correlated with cancer patient survivability. Proc Natl Acad Sci U S A. 2012;109(23):9209–12. pmid:22615392
- 10. Thompson EG, Galitski T. Quantifying and analyzing the network basis of genetic complexity. PLoS Comput Biol. 2012;8(7):e1002583. pmid:22792055
- 11. Kaul A, Toonen JA, Gianino SM, Gutmann DH. The impact of coexisting genetic mutations on murine optic glioma biology. Neuro Oncol. 2015;17(5):670–7. pmid:25246427
- 12. Toonen JA, Anastasaki C, Smithson LJ, Gianino SM, Li K, Kesterson RA, et al. NF1 germline mutation differentially dictates optic glioma formation and growth in neurofibromatosis-1. Hum Mol Genet. (in press).
- 13. Solga AC, Toonen JA, Pan Y, Cimino PJ, Ma Y, Castillon GA, et al. The cell of origin dictates the temporal course of neurofibromatosis-1 (Nf1) low-grade glioma formation. Oncotarget. 2017;(in press).
- 14. Hegedus B, Yeh TH, Lee DY, Emnett RJ, Li J, Gutmann DH. Neurofibromin regulates somatic growth through the hypothalamic-pituitary axis. Hum Mol Genet. 2008;17(19):2956–66. pmid:18614544
- 15. Pan Y, Bush EC, Toonen JA, Ma Y, Solga AC, Sims PA, et al. Whole tumor RNA-sequencing and deconvolution reveal a clinically-prognostic PTEN/PI3K-regulated glioma transcriptional signature. Oncotarget. 2017.
- 16. Solga AC, Pong WW, Kim KY, Cimino PJ, Toonen JA, Walker J, et al. RNA Sequencing of Tumor-Associated Microglia Reveals Ccl5 as a Stromal Chemokine Critical for Neurofibromatosis-1 Glioma Growth. Neoplasia. 2015;17(10):776–88. pmid:26585233
- 17. Sandsmark DK, Zhang H, Hegedus B, Pelletier CL, Weber JD, Gutmann DH. Nucleophosmin mediates mammalian target of rapamycin-dependent actin cytoskeleton dynamics and proliferation in neurofibromin-deficient astrocytes. Cancer Res. 2007;67(10):4790–9. pmid:17510408.
- 18. Smithson LJ, Gutmann DH. Proteomic analysis reveals GIT1 as a novel mTOR complex component critical for mediating astrocyte survival. Genes Dev. 2016;30(12):1383–8. pmid:27340174
- 19. Dasgupta B, Li W, Perry A, Gutmann DH. Glioma formation in neurofibromatosis 1 reflects preferential activation of K-RAS in astrocytes. Cancer Res. 2005;65(1):236–45. pmid:15665300.
- 20. Pan Y, Smithson LJ, Ma Y, Hambardzumyan D, Gutmann DH. Ccl5 establishes an autocrine high-grade glioma growth regulatory circuit critical for mesenchymal glioblastoma survival. Oncotarget. 2017. pmid:28380429.
- 21. Madhavan S, Zenklusen JC, Kotliarov Y, Sahni H, Fine HA, Buetow K. Rembrandt: helping personalized medicine become a reality through integrative translational research. Mol Cancer Res. 2009;7(2):157–67. pmid:19208739
- 22. Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, et al. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics. 2006;7 Suppl 1:S7. pmid:16723010.
- 23. Lachmann A, Giorgi FM, Lopez G, Califano A. ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information. Bioinformatics. 2016;32(14):2233–5. pmid:27153652
- 24. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006;Complex Systems:1695.
- 25. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. pmid:25516281
- 26. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Series B (Methodological). 1995;57(No. 1): 289–300.
- 27. Zhang Y, Chen K, Sloan SA, Bennett ML, Scholze AR, O'Keeffe S, et al. An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex. J Neurosci. 2014;34(36):11929–47. pmid:25186741
- 28. Henriquez NV, Forshew T, Tatevossian R, Ellis M, Richard-Loendt A, Rogers H, et al. Comparative expression analysis reveals lineage relationships between human and murine gliomas and a dominance of glial signatures during tumor propagation in vitro. Cancer Res. 2013;73(18):5834–44. pmid:23887970.
- 29. Wong KK, Chang YM, Tsang YT, Perlaky L, Su J, Adesina A, et al. Expression analysis of juvenile pilocytic astrocytomas by oligonucleotide microarray reveals two potential subgroups. Cancer Res. 2005;65(1):76–84. pmid:15665281.
- 30. Narang V, Ramli MA, Singhal A, Kumar P, de Libero G, Poidinger M, et al. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks. PLoS Comput Biol. 2015;11(9):e1004504. pmid:26393364
- 31. Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005;4:Article17. pmid:16646834.
- 32. West J, Bianconi G, Severini S, Teschendorff AE. Differential network entropy reveals cancer system hallmarks. Sci Rep. 2012;2:802. pmid:23150773
- 33. Toonen JA, Ma Y, Gutmann DH. Defining the temporal course of murine neurofibromatosis-1 optic gliomagenesis reveals a therapeutic window to attenuate retinal dysfunction. Neuro Oncol. 2016. pmid:28039362.
- 34. Chen YH, McGowan LD, Cimino PJ, Dahiya S, Leonard JR, Lee da Y, et al. Mouse low-grade gliomas contain cancer stem cells with unique molecular and functional properties. Cell Rep. 2015;10(11):1899–912. pmid:25772366
- 35. Pfister S, Janzarik WG, Remke M, Ernst A, Werft W, Becker N, et al. BRAF gene duplication constitutes a mechanism of MAPK pathway activation in low-grade astrocytomas. J Clin Invest. 2008;118(5):1739–49. pmid:18398503
- 36. Jones DT, Kocialkowski S, Liu L, Pearson DM, Ichimura K, Collins VP. Oncogenic RAF1 rearrangement and a novel BRAF mutation as alternatives to KIAA1549:BRAF fusion in activating the MAPK pathway in pilocytic astrocytoma. Oncogene. 2009;28(20):2119–23. pmid:19363522
- 37. Kaul A, Chen YH, Emnett RJ, Dahiya S, Gutmann DH. Pediatric glioma-associated KIAA1549:BRAF expression regulates neuroglial cell growth in a cell type-specific and mTOR-dependent manner. Genes Dev. 2012;26(23):2561–6. pmid:23152448
- 38. Kouhara H, Hadari YR, Spivak-Kroizman T, Schilling J, Bar-Sagi D, Lax I, et al. A lipid-anchored Grb2-binding protein that links FGF-receptor activation to the Ras/MAPK signaling pathway. Cell. 1997;89(5):693–702. pmid:9182757.
- 39. Manns M, Gunturkun O, Heumann R, Blochl A. Photic inhibition of TrkB/Ras activity in the pigeon's tectum during development: impact on brain asymmetry formation. Eur J Neurosci. 2005;22(9):2180–6. pmid:16262656.
- 40. Collins VP, Jones DT, Giannini C. Pilocytic astrocytoma: pathology, molecular mechanisms and markers. Acta Neuropathol. 2015;129(6):775–88. pmid:25792358
- 41. Power PF, Mak IW, Singh S, Popovic S, Gladdy R, Ghert M. ETV5 as a regulator of matrix metalloproteinase 2 in human chondrosarcoma. J Orthop Res. 2013;31(3):493–501. pmid:22968857.
- 42. Firlej V, Ladam F, Brysbaert G, Dumont P, Fuks F, de Launoit Y, et al. Reduced tumorigenesis in mouse mammary cancer cells following inhibition of Pea3- or Erm-dependent transcription. J Cell Sci. 2008;121(Pt 20):3393–402. pmid:18827017.
- 43. Breunig JJ, Levy R, Antonuk CD, Molina J, Dutra-Clarke M, Park H, et al. Ets Factors Regulate Neural Stem Cell Depletion and Gliogenesis in Ras Pathway Glioma. Cell Rep. 2015;12(2):258–71. pmid:26146073.
- 44. Li X, Newbern JM, Wu Y, Morgan-Smith M, Zhong J, Charron J, et al. MEK Is a Key Regulator of Gliogenesis in the Developing Brain. Neuron. 2012;75(6):1035–50. pmid:22998872