The Leukemia-Specific Fusion Gene ETV6/RUNX1 Perturbs Distinct Key Biological Functions Primarily by Gene Repression

Background: ETV6/RUNX1 (E/R) (also known as TEL/AML1) is the most frequent gene fusion in childhood acute lymphoblastic leukemia (ALL) and also most likely the crucial factor for disease initiation; its role in leukemia propagation and maintenance, however, remains largely elusive. To address this issue we performed a shRNA-mediated knock-down (KD) of the E/R fusion gene and investigated the ensuing consequences on genome-wide gene expression patterns and deducible regulatory functions in two E/R-positive leukemic cell lines.

Findings: Microarray analyses identified 777 genes whose expression was substantially altered. Although approximately equal proportions were either up- (KD-UP) or down-regulated (KD-DOWN), the effects on biological processes and pathways differed considerably. The E/R KD-UP set was significantly enriched for genes included in the “cell activation”, “immune response”, “apoptosis”, “signal transduction” and “development and differentiation” categories, whereas in the E/R KD-DOWN set only the “PI3K/AKT/mTOR signaling” and “hematopoietic stem cells” categories became evident. Comparable expression signatures obtained from primary E/R-positive ALL samples underline the relevance of these pathways and molecular functions. We also validated six differentially expressed genes representing the categories “stem cell properties”, “B-cell differentiation”, “immune response”, “cell adhesion” and “DNA damage” with RT-qPCR.

Conclusion: Our analyses provide the first preliminary evidence that the continuous expression of the E/R fusion gene interferes with key regulatory functions that shape the biology of this leukemia subtype. E/R may thus indeed constitute the essential driving force for the propagation and maintenance of the leukemic process irrespective of potential consequences of associated secondary changes. Finally, these findings may also provide a valuable source of potentially attractive therapeutic targets.


Introduction
The ETV6/RUNX1 (E/R) fusion gene (also known as TEL/AML1) is the hallmark of one of the most common genetic subtypes of B-cell precursor acute lymphoblastic leukemia (BCP ALL) in children [1,2]. The fusion gene encodes a chimeric transcription factor that comprises the N-terminal portion of ETV6 and almost the entire RUNX1 protein and is thought to convert RUNX1 from a transcriptional modulator to a transcriptional repressor of RUNX1 target genes [3]. The current multistep model implies that this gene fusion already occurs during fetal development and constitutes the initiating, although not sufficient, event for neoplastic transformation [4,5]. The idea that the ensuing gene product might also be relevant for maintenance of the malignant phenotype derives from recent experiments, which showed that RNAi-mediated silencing of the endogenous fusion gene reduces in vitro cell proliferation and cell survival and significantly impairs the in vivo repopulation capacity of the treated cells in a xenotransplant mouse model [6] (Fuka et al., manuscript submitted).
Microarray technologies made it possible to define the specific gene expression signatures of specific ALL subgroups, including those with an E/R fusion gene [7][8][9][10][11][12]. These diagnostically and clinically relevant molecular patterns derive from the comparison of a differentially expressed set of genes in a given type of leukemia relative to other subgroups included in such analyses. Since particular genetic subgroups can be clearly delineated and distinguished with this approach, it seems likely that primary underlying genetic defects, as for instance E/R, are the main determinants of the respective gene expression signature, although the transcriptional derangements will most likely also be modified to a certain extent by other factors, such as secondary genetic alterations. To investigate the specific impact of the chimeric E/R protein on overall gene expression, we knocked down the endogenous fusion gene in two leukemia cell lines utilizing fusion transcript specific short hairpin RNAs (shRNA) and compared the native and suppressed gene expression signatures. We also compared the E/R KD signature with that obtained from primary childhood ALL cases and validated the expression of selected target genes that represented various pathways or cellular functions, which were identified with this approach.

Results and Discussion
Defining target genes of E/R knockdown
We silenced the endogenous fusion protein by lentiviral transduction of shRNA-encoding vectors in the leukemia cell lines REH and AT-2. Detailed information on the experimental design is provided in Text S1. Expression profiling was performed in cells that were selected for viral integration and stable fusion gene suppression, which resulted in a chimeric protein reduction of 50-80% between different experiments (Figure S1). Differentially expressed genes were determined by microarray analyses using three and two biological replicates from independent knock-down (KD) experiments of the REH and AT-2 cell lines, respectively, as well as appropriate control cells that were transduced with a non-targeting shRNA vector. Despite the dissimilar genetic background imposed by different secondary changes in the two cell lines, there was a significant correlation of differential gene expression in both models (r = 0.31, P<0.0001) (Figure 1). A joint analysis identified 777 genes that were significantly (P<0.05) and concordantly up- (KD-UP; n = 403) or down-regulated (KD-DOWN; n = 374) after the knockdown of the E/R fusion gene (Table S1). The top 50 regulated genes are listed in Table 1, along with the log2-fold changes from the array analysis. They include, for instance, the two direct RUNX1 targets ID2 and PTPRCAP. ID2 encodes a proposed inhibitor of tissue-specific gene expression and PTPRCAP is a key regulator of lymphocyte activation (Table S1) [13,14]. Consistent with the notion that E/R acts as a constitutive repressor of RUNX1 target genes [3], these two genes are repressed in E/R-positive leukemias and up-regulated upon fusion gene KD. In contrast to our findings, Wotton et al. report that RUNX1-induced repression of ID2 is abrogated by E/R. This seemingly contradictory result might be explained by context-dependent gene regulation, since Wotton et al. used 3T3 murine fibroblast cells in their experiments. In line with our data, PTPRCAP transcription was found to be repressed by RUNX1-MTG8 and -MTG16 fusion genes, two RUNX1 fusions that are frequently found in acute myeloid leukemia [14]. Furthermore, the regulation of two other genes that are differentially expressed in E/R-positive ALL is also consistent with our E/R KD results. CALN1, a brain-specific member of the calmodulin superfamily, is exclusively over-expressed [10], while MS4A1 (CD20), a regulator of B-cell activation and proliferation, appears repressed in E/R-positive ALL [15].

Functional annotation and pathway analysis of differentially expressed genes in the KD model
To systematically assess the molecular functions that are modulated by E/R, we annotated all significantly regulated genes from the E/R KD experiments according to their regulation by the fusion gene. For this purpose, we used the ''Database for Annotation, Visualization and Integrated Discovery'' (DAVID) [16] to classify gene lists into functionally related gene groups. The raw output from DAVID, derived from the analysis of up- and down-regulated genes (Table S2 and Table S3), was further parsed to bring out more clearly the significance levels of the annotation terms and their affiliation to broader functional groups (Figure 2). A first inspection of these functional annotations revealed a large discrepancy between E/R KD up- and down-regulated genes (Figure 2; right and left panel, respectively). While KD-UP genes significantly associate with various cellular functions and pathways, the KD-DOWN gene set, after correction for multiple testing, yielded no significant annotation term at all (the highest ranking term with P<0.3 was the KEGG pathway 04070:Phosphatidylinositol signaling system). These striking differences indicate that, despite the similar number of up- and down-regulated genes, only the KD-UP ones relate to a high degree to similar functions and were therefore enriched in the DAVID analysis. The KD-DOWN genes, on the other hand, do not cluster into common functions and therefore not a single term was found to be significant. Hence, the channeling of KD-UP genes to specific pathways suggests that E/R exerts its distinct and relevant gene de-regulation through repression of specific classes of target genes. Conversely, the general lack of such a KD-DOWN-related ''pathway-channeling'' implies that the E/R-associated up-regulation of genes might be biologically far less relevant. Alternatively, KD-DOWN genes may encode signaling pathway components that are mostly regulated by posttranslational modifications, as is, for instance, the case in the phosphoinositide-3-kinase (PI3K)/AKT/mammalian target of rapamycin (mTOR) pathway.
To test for potential direct targets of E/R, we first looked for RUNX1 consensus motifs in the promoter regions of de-regulated genes. Using gene set enrichment analysis (GSEA) and overrepresentation analysis, we could not detect an enrichment of such motifs in up- or down-regulated genes (data not shown). Second, we compiled RUNX1 targets from two very recent ChIP-seq studies [17,18], which were derived from the analysis of human megakaryocytes and murine hematopoietic stem/progenitor cells. GSEA revealed that genes with ChIP-seq hits from both data sets are significantly up-regulated in our knockdown data. Of note, the Tijssen et al. [17] data set showed a more pronounced enrichment, which could be attributable to its origin from human tissue, as opposed to mouse tissue in the Wilson et al. study [18] (Table S5). Focusing on the KD-UP and KD-DOWN genes, we also found a significantly higher percentage of genes with ChIP-seq hits among the KD-UP genes than among the KD-DOWN genes (54.8% vs. 46.8%; P = 0.026, Fisher's exact test) (Table S1). These results are consistent with the notion that E/R regulates RUNX1 target genes primarily through repression [3].
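The comparison of ChIP-seq hit frequencies between the two gene sets can be sketched as a two-sided Fisher's exact test on a 2x2 table. The counts below are an assumption: they are back-calculated from the reported percentages and set sizes (54.8% of 403 KD-UP genes, 46.8% of 374 KD-DOWN genes) and are not taken from Table S1; the function name is likewise illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x hits in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Assumed counts, reconstructed from the reported percentages (not from Table S1).
kd_up_hits, kd_up_total = 221, 403
kd_down_hits, kd_down_total = 175, 374

p_value = fisher_exact_two_sided(
    kd_up_hits, kd_up_total - kd_up_hits,
    kd_down_hits, kd_down_total - kd_down_hits,
)
print(f"two-sided P = {p_value:.3f}")
```

With the real per-gene ChIP-seq annotations, the four cell counts would be tallied directly from the gene lists rather than inferred from rounded percentages.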
Given their apparent biological relevance, we focused our further analysis on the 403 KD-UP genes and their molecular functions as well as their involvement in pathways. Based on the gene-level clustering, the top 100 annotation terms were manually curated into 14 functional meta-groups (Figure 3). Note that the name of each meta-group reflects only the most prominent annotation terms that it comprises. A list including all terms within the 14 meta-groups is shown in Table S2. Applying stringent statistical criteria (P<0.05), only the meta-groups ''cell activation'', ''immune response'', ''apoptosis'', ''development and differentiation'', ''GTPase regulation'', and ''protein phosphorylation and phosphate metabolism'' were found to contain at least one significant annotation term (Figure 3A). The regulation of individual genes within these top six meta-groups upon E/R KD is shown in Figure 3B. The remaining groups (''cell proliferation'', ''response to wounding'', ''nucleic acid binding'', ''DNA damage response'', ''cell adhesion and migration'', ''chemical homeostasis'', ''RNA synthesis'' and ''enzyme binding'') contained no nominally significant annotation term.
The DAVID pathway analysis was based on the overrepresentation of ''significant genes'' in certain gene sets and pathways. To assess the functional impact of differentially expressed genes from the E/R KD experiments independent of a specific P-value threshold, we performed GSEA. This analysis identified many more GO terms that were up-regulated (147) than down-regulated (13) upon E/R KD. Importantly, these GO terms largely mapped to the same meta-groups identified in the DAVID analysis (Table S4). The same discrepancy (324 vs. 49 gene sets) held true for a large collection of >2,500 gene sets that were obtained from experimental data (''curated gene sets, C2'' from MSigDB) (Table S5). The conclusions from the DAVID analysis were thus qualitatively confirmed by GSEA. Moreover, in the GSEA analysis the ''Jaatinen hematopoietic stem cell UP'' signature, derived from the gene expression profile of sorted cord blood CD133 (PROM1)-positive versus CD133-negative cells, emerged as the most significantly enriched set associated with the E/R KD-DOWN genes. This result may be an indicator of an intriguing new function of E/R, namely that it induces genes that are normally expressed in cord blood-derived hematopoietic stem cells [19]. To corroborate these findings, we supplemented our comparison with two other gene sets that were obtained from sorted CD34+/lineage-negative versus CD34− normal bone marrow cells, designated ''Andersson-UP'' and ''Andersson-DOWN'' (microarray data were kindly provided by Andersson et al. [10]). In line with the above results, the ''Andersson-UP'' gene set also scored significantly in the GSEA analysis (Table S5). Combining the data from Jaatinen's and Andersson's gene sets, CALN1, PROM1, KIT and CDK6 were the most highly up-regulated genes and they are similarly induced by E/R. With the exception of CALN1, whose function in the hematopoietic system is currently not known, all other genes are considered to be associated with hematopoietic stem or progenitor cells [20].
To the best of our knowledge, only one other group has previously analyzed the expression patterns of primary E/R-positive ALL cases and assigned them to GO categories [11]. Consistent with our data, they also observed a distinct association with the categories ''cell differentiation'', ''cell proliferation'', ''apoptosis'', ''cell motility'' and ''response to wounding''. Moreover, ectopic E/R expression in a 3T3 mouse cell line model induced the categories ''adhesion'' and ''survival'' [13].

Genes concordantly modulated by E/R KD in leukemia model cell lines and primary ALL
Next we investigated to what extent gene expression changes that result from an E/R KD might also be reflected in a reciprocal fashion in primary ALL samples. For this purpose we used previously published data sets [8] that were generated by comparing expression profiles from E/R-positive with E/R-negative BCP ALL cases.
The ensuing ''E/R ALL signature'' was then compared with the E/R KD signature. Note that of the 777 significantly regulated KD genes only 409 (175 KD-DOWN and 234 KD-UP) were represented in the primary ALL arrays and passed initial quality filters (Table S6). Taking into account the specific regulation of these genes in primary ALL, we identified a set of genes whose expression is inversely correlated in the KD and ALL signatures. This set comprises 66 of the 175 KD-DOWN and 71 of the 234 KD-UP genes, which together account for approximately one third (137/409) of the E/R signature genes present in both data sets (Table S7). In this data set, we also found a significantly higher percentage of genes with ChIP-seq hits in KD-UP genes compared with KD-DOWN genes (Table S7). The top 50 regulated genes of this set are listed in Table 2. Two thirds (272/409) of the E/R KD signature genes that concurred with the ALL data set were not specific for E/R-positive ALL, but were also evident in the other subgroups. This observation suggests two, not mutually exclusive, explanations, namely that these genes are either de-regulated in a similar fashion in a variety of ALL subtypes or that they represent a kind of basic but essential ''BCP-ALL housekeeping gene set''. The notion that other initiating genetic events can elicit a similar gene de-regulation effect as E/R is, for instance, supported by the fact that PROM1 is also up-regulated in MLL-rearranged and high-hyperdiploid ALL cases, thereby counterbalancing its low expression in other ALL subtypes. Consequently, PROM1 de-regulation was not considered a specific feature of E/R-positive ALL (data not shown).

Establishing a ''malignancy signature'' from the E/R KD model
E/R KD leads to profound phenotypic changes, which comprise impaired cellular proliferation, survival and leukemia reconstitution in a xenotransplant mouse model (Fuka et al., manuscript submitted). We therefore postulated the presence of a potential ''malignancy signature'' in the E/R KD data, whose loss would render the expression profile of treated cells again comparable to that of their normal counterparts. To test this hypothesis, we generated 10 new gene sets by comparing microarray data from primary E/R-ALL [8] with those from 5 sorted normal bone marrow derived B-cell precursor subsets [21]. Consistent with our notion, all five GSEA comparisons revealed that genes that are up-regulated in E/R-ALL vs. normal B-cell precursors are overall down-regulated after the KD and vice versa (Table S5). This result therefore strongly suggests that, on the gene expression level, the E/R KD renders ALL cells more similar to their physiological B-cell precursor counterparts.

Validation of selected E/R target genes by RT-qPCR
We validated the differential expression of several selected candidate genes contained in the KD signature, which were previously either not associated with E/R-positive ALL (PROM1, PECAM1, IFITM1; Figure 4A) or concordantly regulated in both systems (SPIB, MDM2 and DDIT4; Figure 4B). These genes were chosen because of their potential biological relevance, since they play an important role in the context of stemness and differentiation, adhesion and migration, immune response, DNA damage response as well as apoptosis. Notably, their differential expression in the context of E/R is novel. Quantification results of these transcripts in both cell lines from independent KD experiments concurred with those of the microarray experiments (Figure 4).

Functions of selected E/R regulated genes and potential implications for leukemia pathogenesis
Given that GSEA analysis of E/R KD regulated genes highlighted gene sets that are also up-regulated in hematopoietic stem cells, we chose PROM1 (CD133) as the most prominent and attractive candidate from this set. PROM1 is implicated in maintaining stem cell properties by suppressing differentiation and has recently gained much attention as a marker of tumor-initiating cells in a variety of human cancers [22]. The fact that E/R might regulate the expression of this gene is new and intriguing and provides additional arguments to the ongoing debate dealing with the structural hierarchy of ALL and its potential replenishment from rare leukemic stem cells [23]. In favor of this notion is a recent observation, which indicates that primitive leukemia-initiating cells with long-term in vitro and in vivo proliferation capabilities are exclusively found in the CD133+CD19−CD38− cell compartment [24]. However, this observation is in contrast to the scenario proposed by le Viseur et al., which suggests that the vast majority of ALL blasts may maintain the propensity to reconstitute leukemia in vivo [25]. The E/R-induced ''stemness'' expression signature, represented for instance by PROM1 and the stem cell factor receptor KIT in our model, therefore supports the latter view.
The E/R-induced overexpression of stem cell markers in the respective leukemias can either be interpreted as a residual relict of a transformed primitive stem cell or, more likely, as the reflection of a continuously active stem cell program [26]. Although neither possibility excludes that the gene fusion process already occurs in a primitive hematopoietic stem cell [27], the latter requires that inappropriate stemness genes remain active or become perhaps reactivated at the level of maturation in which the bulk of the leukemic cells is arrested. This interpretation is supported by the fact that -similar to the Andersson data -E/R-positive leukemias cluster best with normal large pre-B II cells even after suppression of the fusion gene (data not shown) [10,21]. It is thus tempting to speculate that up-regulation of PROM1 may play a critical role in E/R-positive ALL. This possibility is also relevant for our recent finding that the E/R fusion gene is apparently required for the in vivo propagation of the respective cells (Fuka et al. manuscript submitted). PECAM1 (CD31) encodes a homophilic adhesion receptor that mediates adhesion between endothelial cells and leukocytes and could therefore probably influence adhesion and migration of leukemic cells across the micro-vascular endothelium in various niches [28]. Since it is also contained in Andersson's CD34+ stem cell signature, we envision that its over-expression also contributes to the stem cell properties of E/R-positive ALL.
In contrast to the up-regulation of stem cell signature genes, genes encoding B lineage differentiation markers are frequently repressed in BCP-ALL. Thus, it is not surprising that SPIB, a B-lymphoid restricted transcription factor, is one of the genes that is strongly suppressed by E/R. Being directly induced by paired box 5 (PAX5), the master regulator of B-lineage commitment, SPIB is a key player in B-cell development and B-cell receptor signaling [29]. This SPIB down-regulation could thus contribute to the impaired B-cell differentiation in E/R-positive ALL.
The E/R-associated down-regulation of IFITM1, a transcriptional target of interferon (IFN) gamma [30], also fits well with one particular point of the current concept of childhood ALL etiology, namely that certain forms of childhood ALL may be the unfortunate consequence of an abnormal immune response to common infections [31]. The proposed mechanism implies that inflammatory cytokines suppress the growth of normal hematopoietic cells, whereas they do not exert such an effect on, for instance, E/R-expressing cells. Consequently, fusion gene carrying cells may experience a relative growth advantage. In support of this notion it was recently shown that E/R-expressing cells are more resistant to the anti-proliferative effects of transforming growth factor (TGF) beta [32]. Since TGF beta and IFN gamma are both key modulators of the immune system, one expects that the suppression of IFITM1 either concurs with or even augments these effects in response to an interferon release during common infections. In line with the proposed function of TGF beta, the suppression of IFITM1 may thus additionally fuel the expansion of an E/R-expressing leukemic clone [5].
Taking into account further mechanisms that might impair an IFN gamma associated inhibition of proliferation, it is noteworthy that CDKN1A is induced upon activation of the p53 tumor suppressor pathway and leads to a G1 cell cycle arrest [33]. The attenuation of p53 activity together with the transcriptional repression of its direct target p21, the gene product of CDKN1A, either by E/R-mediated repression of IFITM1 or by up-regulation of the p53 inhibitor MDM2, as implied in our KD model, adds another fascinating layer of complexity to the E/R-mediated gene regulation process. Given that p53 acts as a gatekeeper of genome integrity [34], p53 down-regulation by any of the above outlined means may thus favor leukemia development. Intriguingly, MDM2 is induced by RUNX1-RUNX1T1 and may therefore be involved in the route of transformation in a similar fashion in other RUNX1-associated leukemias, as for instance the E/R-positive ones [35]. Furthermore, MDM2 may also promote tumorigenesis via a p53 independent mechanism [36]. Such findings are not only crucial for our understanding of leukemia development per se, but may be particularly helpful for the identification of especially relevant targets for tailored future therapies.
Another E/R-down-regulated gene that is involved in the p53 pathway is DDIT4 (also known as REDD1). It is primarily induced by stress and negatively regulates the mTOR pathway. DDIT4 is activated by DNA damage via p53-dependent and -independent mechanisms, but also by hypoxia or energy stress [37]. This latter feature is particularly interesting in the context of E/R-positive leukemia, because the majority of affected children are anemic at diagnosis, which seemingly grants hypoxic conditions a central role in their pathogenesis [38]. Notably, E/R-associated DDIT4 suppression may further contribute to the observed PI3K/AKT/mTOR pathway activation and an improved cell survival [39]. Whether this suppression is a direct p53-related consequence that, as recently observed in breast cancer, also leads to hypoxia inducible factor (HIF) 1 alpha accumulation, is currently not known [40].
Taken together, the above clues reinforce the essential role that E/R plays in the entire process of leukemia development and maintenance: i) it induces genes that confer stem cell properties, endowing cells with unlimited self-renewal capability, and simultaneously represses genes that otherwise promote differentiation; ii) it alters the DNA damage response by attenuating the p53 pathway, which in addition enables the survival and clonal expansion of cells with accumulating secondary genetic changes; iii) it triggers proliferation and cellular growth via PI3K/AKT/mTOR pathway activation, which in turn adapts extracellular signaling as well as stress and hypoxia response accordingly; iv) it also attenuates the response to inflammatory signals. All these features, sustaining proliferative signaling, evading growth suppression, resisting cell death, and induced genome instability, are typical and well established hallmarks of cancer in general [41].
Based on the analyses of our KD model, we have established a functional map of the consequences of E/R expression in an endogenous background. The modulation of various specific and more general key processes that are pivotal for leukemia pathogenesis was thus highlighted. These processes include ''development and differentiation'', ''apoptosis'', ''adhesion and migration'' as well as ''DNA damage response''. Finally, these data provide also a valuable source of interesting targets and pathways whose functional validation will provide further insights into the biology of E/R-positive leukemia and possibly also promote the identification of novel targets for treatment.

Quantitative RT-PCR (RT-qPCR)
Total RNA was isolated from biological replicates of E/R-silenced REH (n = 3) and AT-2 (n = 3) cells obtained from independent KD experiments using Trizol reagent (Life Technologies, Carlsbad, CA). cDNA was synthesized with SuperScript II Reverse Transcriptase according to the manufacturer's recommendations (Invitrogen, Carlsbad, CA). Transcripts were quantified by TaqMan

Figure 3. Meta-groups of functional annotations for up-regulated genes upon E/R KD. Meta-groups were curated based on gene-clustering of annotation terms. A: Top 100 annotation terms from KD-UP genes, their P-values and their affiliation to meta-groups. Similarity of the meta-groups was based on the number of shared genes. For distance calculations between the meta-groups, genes from all contributing terms were taken together. B: Change in expression of individual genes in meta-groups that contain significant annotation terms. The color code at the bottom of the figure indicates the extent of log2-fold changes in gene expression. doi:10.1371/journal.pone.0026348.g003

Gene expression analysis by microarray technology
Gene expression changes upon knockdown of E/R were measured on Affymetrix HG-U133-PLUS2 arrays (Affymetrix, Inc., Santa Clara, CA). cRNA target synthesis and GeneChip® processing were performed in the Gene Expression Profiling Unit of the Medical University Innsbruck according to standard protocols (Affymetrix, Inc., Santa Clara, CA). Microarray experiments were performed in compliance with MIAME guidelines and the data were submitted to GEO (accession number GSE29639). All further analyses were performed in the R statistical environment using Bioconductor packages [43].
Affymetrix CEL files were preprocessed as described previously [44], yielding a final number of 9,498 probesets that were used for all further analyses.
Differentially expressed genes were determined using a moderated t-test in the R package ''limma'' [45]. All P-values were corrected for multiple testing using the ''Benjamini-Hochberg'' correction method. Significantly changing genes in the E/R KD vs. control experiments were determined by calculating ratios for each gene between the two conditions for each experiment separately, thus yielding five biological replicates of relative expression for each gene (REH, n = 3; AT-2, n = 2). Then, for each gene, significance was determined using a weighted one-sample t-test against the null hypothesis of no expression change (μ = 0).
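The per-gene testing and multiple-testing correction described above can be illustrated with a minimal sketch. It makes two simplifying assumptions: the t-statistic is an ordinary, unweighted one-sample statistic (the weighting scheme and the moderation applied by limma are not reproduced here), and the log ratios and P-values are made-up toy numbers. The function names are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(log_ratios, mu=0.0):
    """Unweighted one-sample t-statistic against the null of no change (mu)."""
    n = len(log_ratios)
    t = (mean(log_ratios) - mu) / (stdev(log_ratios) / sqrt(n))
    return t, n - 1  # statistic and degrees of freedom


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy log2 ratios for one gene: five replicates (e.g. three plus two experiments).
t_stat, df = one_sample_t([0.5, 0.7, 0.6, 0.4, 0.8])

# Toy raw P-values for four genes.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In an actual analysis the raw P-values would come from the t-distribution with the corresponding degrees of freedom, and the adjustment would be applied across all probesets at once.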
For the re-analysis of the primary ALL data set from Ross et al. [8], CEL files were downloaded from the St. Jude's data server and microarray data were pre-processed as described previously [44], generating a data set of 12,068 genes. In this data set E/R-positive vs. E/R-negative BCP ALL samples were compared, which yielded 1,980 differentially regulated genes (P<0.05, moderated t-test), 1,008 of which were under- and 972 over-expressed in E/R-positive ALL. Combining the data sets from Ross [8] and the KD experiments, a total of 5,119 genes were represented on both platforms independent of their regulation and passed initial quality filters (Table S6). This gene set was then used to look for genes that are regulated by E/R in KD experiments and primary ALL.
To test for differences in malignant vs. non-malignant cells, we analyzed E/R-positive ALL from the Ross data set [8] together with microarray data from five normal bone marrow B-cell precursor subsets [21] (http://franklin.et.tudelft.nl/).

Functional annotation
The ''Database for Annotation, Visualization and Integrated Discovery'' (DAVID) was used to annotate the 403 up-and 374 down-regulated genes from the joint analysis of the E/R

Hierarchical clustering of annotation terms
For further analysis and visualization of the similarity among annotation terms, the functional charts were first sorted by their P-value (corrected for multiple testing by the Benjamini-Hochberg method) and then, to determine the relationships of the top 100 annotation terms, similarity between all terms was measured by the number of their shared genes (gDist as described in Kauer et al. 2009) [44]. The matrix of pairwise gDist values (as dissimilarities: 1-gDist) for the 100 most significant terms was used as input for hierarchical clustering using the R function ''hclust'' in combination with the ''average linkage'' algorithm. Finally, the similarity among the annotation terms was visualized as a dendrogram in combination with a heatmap indicating significance levels of the clustered terms. Names of meta-groups were chosen or modified from upstream gene ontology terms (http://www.geneontology.org/).
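The clustering step can be sketched as follows. This is an illustration under stated assumptions, not the original procedure: the gDist measure is defined in Kauer et al. [44] and is not reproduced here, so a Jaccard overlap of gene sets stands in for it, and the term names and gene memberships are toy values. The average-linkage agglomeration is written out naively for clarity.

```python
from itertools import combinations


def shared_gene_dissimilarity(genes_a, genes_b):
    """1 - Jaccard overlap; a stand-in for 1 - gDist (assumption)."""
    union = genes_a | genes_b
    return 1.0 - len(genes_a & genes_b) / len(union) if union else 0.0


def average_linkage(terms):
    """Agglomerate terms (name -> gene set); return merges as (a, b, height)."""
    names = list(terms)
    dist = {
        frozenset((a, b)): shared_gene_dissimilarity(terms[a], terms[b])
        for a, b in combinations(names, 2)
    }
    clusters = [(name,) for name in names]
    merges = []

    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise dissimilarity between cluster members.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy annotation terms with made-up gene memberships.
toy_terms = {
    "apoptosis": {"BAX", "CASP8", "FAS"},
    "cell death": {"BAX", "CASP8", "FAS", "TNF"},
    "immune response": {"IFITM1", "CD79A"},
}
merges = average_linkage(toy_terms)
```

The merge heights correspond to the branch heights of a dendrogram; terms that share most of their genes join first, which is the property used to curate meta-groups.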

Gene set enrichment
To define functional categories of de-regulated genes independent of a P-value cutoff for ''significant genes'', we performed gene set enrichment analysis (GSEA) using the ''pGSEA'' package in the Bioconductor/R environment [46][47][48]. Gene-wise log2 expression ratios (logFC) of knockdown versus control for the cell lines REH and AT-2, and for the mean of their logFCs, were used as input for pGSEA. Gene sets were downloaded from the MSigDB v3.0 (http://www.broad.mit.edu/gsea/msigdb/; Cambridge, USA). We tested two different gene set collections available from MSigDB: curated gene sets from canonical pathways and experimental data (C2) and GO terms (C5). To validate the enrichment of genes involved in hematopoietic stem cells, we added two more gene sets to the C2 group: genes up- and down-regulated in the Andersson et al. 2005 data set (CD34+/lineage-negative vs. CD34− hematopoietic cells) [10].
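A threshold-free enrichment score of this kind can be sketched as a parametric z-score: the mean logFC of a gene set is compared with the mean over all genes and scaled by the set size. This is an assumption about the general form of parametric gene set enrichment and is not a reproduction of the package's implementation; the function name, gene identifiers and values below are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def parametric_z_score(log_fc, gene_set):
    """z-score of a gene set's mean logFC relative to all measured genes."""
    members = [log_fc[g] for g in gene_set if g in log_fc]
    if not members:
        raise ValueError("gene set has no measured members")
    mu = mean(log_fc.values())       # mean logFC over all genes
    sigma = pstdev(log_fc.values())  # spread of logFC over all genes
    return (mean(members) - mu) * sqrt(len(members)) / sigma


# Toy logFC values (knockdown vs. control) for four hypothetical genes.
toy_log_fc = {"A": 1.0, "B": 1.0, "C": -1.0, "D": -1.0}
z_up = parametric_z_score(toy_log_fc, {"A", "B"})
z_down = parametric_z_score(toy_log_fc, {"C", "D"})
```

A positive score indicates that the set's genes are on average up-regulated after knockdown, a negative score the opposite; because every gene contributes, no cutoff for ''significant genes'' is needed.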
To test whether E/R knockdown renders the gene expression of ALL cells more similar to non-malignant cells, we added new gene sets: for each of the five comparisons of E/R ALL vs. normal B-cell precursor subsets we defined significantly (P<0.01, logFC>1.5) up- and down-regulated genes, resulting in 10 gene sets. The results for all gene sets can be found in Table S5.
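The construction of these ten gene sets can be sketched as a simple filter applied to each comparison. The symmetric cutoff for down-regulated genes (logFC below -1.5) is an assumption, and the comparison names, gene identifiers and values are toy data.

```python
def split_by_direction(results, p_cutoff=0.01, lfc_cutoff=1.5):
    """Split one comparison (gene -> (logFC, P)) into up and down gene sets."""
    up = {g for g, (lfc, p) in results.items()
          if p < p_cutoff and lfc > lfc_cutoff}
    # Assumed symmetric threshold for down-regulated genes.
    down = {g for g, (lfc, p) in results.items()
            if p < p_cutoff and lfc < -lfc_cutoff}
    return up, down


def build_gene_sets(comparisons, **cutoffs):
    """Return two named gene sets (UP and DOWN) per comparison."""
    gene_sets = {}
    for name, results in comparisons.items():
        up, down = split_by_direction(results, **cutoffs)
        gene_sets[f"{name}_UP"] = up
        gene_sets[f"{name}_DOWN"] = down
    return gene_sets


# Toy results reused for five hypothetical precursor-subset comparisons.
toy = {"G1": (2.0, 0.001), "G2": (-2.3, 0.004),
       "G3": (1.8, 0.2), "G4": (0.4, 0.0001)}
comparisons = {f"subset_{i}": toy for i in range(1, 6)}
gene_sets = build_gene_sets(comparisons)
```

Five comparisons yield ten sets, which can then be scored against the knockdown logFC values in the same way as the MSigDB collections.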
To test for enrichment of putative direct RUNX1 binding targets, RUNX1 ChIP-seq data was downloaded from two sources: Tijssen et al. [17]; Wilson et al. [18].

Supporting Information
Text S1 Materials and methods. (DOC)
Figure S1 shRNA-mediated silencing of E/R leads to chimeric protein depletion. The E/R-positive leukemia cell lines REH and AT-2 were transduced by lentiviral constructs encoding either the E/R specific shRNA G1 (G1) or a non-targeting shRNA (control). Protein levels of E/R (A) and RUNX1 (B) were detected by immunoblotting using anti-ETV6 and anti-RUNX1 antibodies, respectively. GAPDH was used to ensure equal loading. Numbers between bands represent the ratio between tested proteins and GAPDH quantification. A vertical line has been inserted to indicate where a gel lane was cut. These gels came from identical experiments. Shown are results from one of at least three independent E/R knockdown experiments per cell line.