Constitutive Function of the Ikaros Transcription Factor in Primary Leukemia Cells from Pediatric Newly Diagnosed High-Risk and Relapsed B-precursor ALL Patients

We examined the constitutive function of the Ikaros (IK) transcription factor in blast cells from pediatric B-precursor acute lymphoblastic leukemia (BPL) patients using multiple assay platforms and bioinformatics tools. We found no evidence of diminished IK expression or function for primary cells from high-risk BPL patients including a Philadelphia chromosome (Ph)+ subset. Relapse clones as well as very aggressive in vivo clonogenic leukemic B-cell precursors isolated from spleens of xenografted NOD/SCID mice that developed overt leukemia after inoculation with primary leukemic cells of patients with BPL invariably and abundantly expressed intact IK protein. These results demonstrate that a lost or diminished IK function is not a characteristic feature of leukemic cells in Ph+ or Ph- high-risk BPL.

The PCR products were cleaned using the Qiagen QIAquick PCR Purification Kit (Cat No. 28104) and submitted to the DNA Sequencing Facility of Genewiz Inc. (CA) for sequencing with the corresponding forward primer and the Applied Biosystems dye-based (BigDye™ V3.1) DNA sequencing method. DNA sequencing was performed on an ABI 3730 DNA Analyzer using a long read protocol. The sequence obtained from each genomic PCR product was analyzed and aligned using SeqMan II contiguous alignment software in the LaserGene suite from DNASTAR Inc. and the MegAlign multisequence alignment software, in comparison with the wild-type IKZF1 sequence (NCBI Reference Sequence: NT_007819.17 Homo sapiens chromosome 7, Genome Reference Consortium Human Build 37 (GRCh37.p9) primary reference assembly, www.ncbi.nih.gov) [1], [2].
We performed real-time quantitative (q) PCR for exons 4-7 of IKZF1 on genomic DNA samples isolated from primary leukemic cells of 32 high-risk pediatric BPL patients, including 4 Ph+ BPL patients, and normal hematopoietic cells from 2 normal non-leukemic bone marrow samples. The rabbit polyclonal antibody for Ikaros (IK) 1 (H-100, sc-13039) was purchased from Santa Cruz Biotechnology, Inc. (Santa Cruz, CA) for Western blot analysis of IK using previously reported procedures [1]. The mouse monoclonal anti-IK antibody used for confocal imaging was prepared in our laboratory [1]. Goat

SCID Mouse Xenograft Model of Human BPL.
We used an NOD/SCID mouse model of human B-precursor ALL [2]. NOD/SCID mice (NOD.CB17-Prkdcscid/J; 4-6 weeks of age at the time of purchase, female) were obtained from the Jackson Laboratory (Sacramento, CA). The research was conducted according to Institutional Animal Care and Use Committee (IACUC) Protocol #280-09, which was approved by the IACUC of CHLA on 11-24-2009, and its 3-year rewrite application 280-12, which was approved on 7-10-2012. All animal care procedures conformed to the Guide for the Care and Use of Laboratory Animals (National Research Council, National Academy Press, Washington DC 1996, USA). NOD/SCID mice (6-8 weeks old, female, same age in all cohorts in each independent experiment) were inoculated with primary leukemic cells from patients with BPL by injecting 0.5-1×10⁶ leukemia cells in 0.2 mL PBS i.v. via the tail vein with a 27-gauge needle. Mice were monitored daily and electively euthanized at the indicated time points by CO₂ asphyxia. At the time of their death or elective sacrifice, mice were necropsied to confirm leukemia-associated marked splenomegaly. Spleens were removed and sized, and cell suspensions were prepared for determination of mononuclear cell counts and immunophenotype. Leukemia cells isolated from spleens of xenografted mice in 7 different xenograft cases were examined for IK expression using Western blot analysis as well as confocal imaging and intracellular flow cytometry. The xenograft models and the immunophenotypic features of all BPL xenograft cases used in the present study were recently reported [2].

Bioinformatics and Statistical Analysis of Gene Expression Profiles. The publicly available archived GSE32311 database [3] was used to compare gene expression changes in control thymocytes from IKZF1 wildtype mice (N=3; GSM800500, GSM800501, GSM800502) vs. IK-deficient thymocytes (N=8) from IKZF1 null mice (N=3; GSM800503, GSM800504, GSM800505) with the same genetic background (C57BL/6 x 129S4/SvJae). Gene expression changes were screened using probe-level RMA signal intensity values from the mouse 430_2.0 Genome Array to identify the gene signatures for up-regulated and down-regulated transcripts in IKZF1 null mice by filtering for changes greater than 2-fold and T-test P-values less than 0.05 (T-test, unequal variances, Excel formula). Application of this filter identified 1158 transcripts representing 924 genes that were down-regulated in IKZF1 null mice, with a subset of 201 transcripts representing 137 genes exhibiting >2-fold decreased expression levels. By cross-referencing this IK-regulated gene set with the archived ChIP-seq data (GSM803110) using the Integrative Genomics Browser [4], we identified 45 IK target genes that harbored IK binding sites [1].
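The fold-change and unequal-variance T-test filter described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Excel workflow: the function names are hypothetical, the RMA signals are assumed to be on a log2 scale (so a 2-fold change is a difference of 1), and the P-value is obtained by numerically integrating the Student's t density.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value: Student's t density integrated by Simpson's rule."""
    if t == 0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def is_downregulated(wt_log2, null_log2, min_fold=2.0, alpha=0.05):
    """True if the null group is lower by more than min_fold with P < alpha."""
    log2_fc = mean(null_log2) - mean(wt_log2)
    t, df = welch_t(null_log2, wt_log2)
    return log2_fc < -math.log2(min_fold) and t_two_sided_p(t, df) < alpha
```

For example, `is_downregulated([10.0, 10.1, 9.9], [8.0, 8.1, 7.9])` evaluates a transcript whose mean signal drops by 2 log2 units (4-fold) in the null group. The sketch does not handle groups with zero variance.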
The GenePattern software (http://www.broadinstitute.org/cancer/software/genepattern) was utilized to extract expression values for human lymphocyte precursors from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database [5]. We compiled the archived gene expression profiling (GEP) data on 1486 primary leukemia specimens from pediatric ALL patients from 7 independent studies (GSE3912, N=113; GSE18497, N=82; GSE4698, N=60; GSE7440, N=99; GSE13159, N=750; GSE11877, N=207; GSE12995, N=175) that combined datasets from U133A and U133 Plus 2 genechips using common IKZF1 probe sets to focus our analysis on 45 validated IK target genes as well as 20 lymphoid-priming genes [1]. For each study, the gene expression values were transformed into standard deviation units calculated from the mean and standard deviation of the expression values for all the samples in that study. Standardized values compiled from the 7 studies were rank ordered according to the mean expression of 3 highly correlated transcripts for IKZF1 (205038_at, 205039_s_at and 216901_s_at). Consensus/exemplar sequences for two of the IKZF1 probe sets (205038_at and 205039_s_at) aligned to exons 0-7 of the IKZF1 sequence, and 1 probe set (216901_s_at) aligned to exons E2-E7 of the IKZF1 sequence (UCSC genome browser alignment track utilizing blat followed by pslReps).
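A minimal sketch of the per-study standardization and IKZF1-based ranking is given here. It assumes that standardization is applied to each probe set separately within a study and that data are held in nested dictionaries; both the data layout and the function names are hypothetical. The `assign_group` helper applies the ±0.5 standard deviation cut-offs that the following paragraph uses to define the "high" and "low" IKZF1 expression groups.

```python
from statistics import mean, stdev

# Probe set identifiers are taken from the text; the data layout is hypothetical.
IKZF1_PROBES = ("205038_at", "205039_s_at", "216901_s_at")


def standardize(study):
    """Convert {probe: {sample: value}} to standard deviation units per probe set."""
    z = {}
    for probe, values in study.items():
        mu, sd = mean(values.values()), stdev(values.values())
        z[probe] = {sample: (v - mu) / sd for sample, v in values.items()}
    return z


def rank_by_ikzf1(studies):
    """Standardize each study, pool the samples, and rank them (ascending)
    by the mean of the three IKZF1 probe sets."""
    scores = {}
    for name, study in studies.items():
        z = standardize(study)
        for sample in z[IKZF1_PROBES[0]]:
            scores[(name, sample)] = mean(z[p][sample] for p in IKZF1_PROBES)
    return sorted(scores.items(), key=lambda item: item[1])


def assign_group(score, cutoff=0.5):
    """Label a sample by its distance (in SD units) from the mean of zero."""
    if score > cutoff:
        return "high"
    if score < -cutoff:
        return "low"
    return "intermediate"
```

Because every study is standardized before pooling, the pooled mean is zero and the cut-offs can be applied directly to the standardized score.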
Prospective power analysis was utilized to determine the standard deviation cut-off for "high IKZF1 expression" and "low IKZF1 expression" samples in the data sets. We set the unadjusted critical P-value at 2.5×10⁻⁶ (0.05/20,000) to control the False Positive Rate (FPR) at 0.05 in order to detect significant differences in any one of the IKZF1 transcripts out of approximately 20,000 transcripts common across the 7 Affymetrix platforms. At this critical P-value, a total sample size greater than 254 would be sufficient to detect a difference of 1 standard deviation unit with 99.9% power for IKZF1 transcripts. Therefore, samples were assigned to the "high IKZF1 expression" group if their expression level was >0.5 standard deviation unit higher than the mean expression level (N=390) and to the "low IKZF1 expression" group if their expression level was >0.5 standard deviation unit lower than the mean expression level (N=407). Forty-five validated IK target genes were represented by 60 transcripts common across the 7 Affymetrix platforms. T-tests were performed using standardized expression values combined from the 7 datasets (2-sample, unequal variance correction, P-values<0.05 deemed significant). Forty-two transcripts representing 29 genes were up-regulated in the 390 "high IKZF1" samples compared to the 407 "low IKZF1" samples. We used a one-way agglomerative hierarchical clustering technique with the average distance linkage method to organize expression patterns such that genes (rows) having similar expression across patients were grouped together. Dendrograms were drawn to illustrate similar gene expression profiles by joining pairs of closely related profiles, whereby genes joined by short branch lengths showed the most similarity in expression profile across all samples.
Expression levels of IK target genes were also examined for primary leukemic cells from matched-pair diagnosis vs. relapse specimens of BPL patients using archived GEP datasets from 2 independent studies (GSE3912, GSE18497) [6], [7]. Matched-pair gene expression values were compiled for leukemia cells obtained from 59 BPL patients at diagnosis and then at relapse (combined from GSE3912, N=32 and GSE18497, N=27). RMA-normalized values for the GSE18497 dataset and the MAS5 signal intensity values for the GSE3912 dataset were log10-transformed and mean-centered to the average value for the diagnosis samples for each gene transcript in each study. To determine the differential expression of each gene, paired T-tests were performed for the combined mean-centered values from the GSE3912 and GSE18497 datasets (unequal variance correction, P<0.05 deemed significant). We also compared the IK target gene expression levels in leukemic cells from initial diagnosis specimens of patients who subsequently experienced an early relapse (N=40; <36 months) versus a late relapse (N=19;
We compiled the archived "The Microarray Innovations in Leukemia" (MILE) study gene expression profiling (GEP) data on primary leukemic cells (GSE13159) from 122 pediatric Ph+ BPL patients, 237 pediatric Ph- BPL patients, 576 BPL, 174 T-lineage ALL and 74 normal bone marrow specimens [8]. Transcript signal values obtained from hybridization onto the Affymetrix Human Genome U133 Plus 2.0 Arrays were calculated using the non-central trimmed mean of differences between perfect match and mismatch intensities with quantile normalization (DQN3, signal normalized with quantiles of the beta distribution with parameters p=1.2 and q=3) [9]. Differential expression of the 45 IK target genes [1] was compared in T-tests utilizing the DQN3 values (2-sample, unequal variance correction, P-values<0.05 deemed significant). Gene expression values were transformed into standard deviation units calculated from the mean and standard deviation of the expression values for all the samples in each study, and effect sizes were reported as differences in standard deviation units between comparison groups.
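The matched-pair preprocessing described earlier in this section (log10 transformation, then centering each transcript on the mean of the diagnosis samples within each study, followed by a paired T-test on the combined values) can be sketched as follows for a single transcript. The function names and input layout are hypothetical, and only the paired t statistic and its degrees of freedom are computed; the P-value would be looked up from Student's t distribution.

```python
import math
from statistics import mean, stdev


def center_to_diagnosis(diagnosis, relapse):
    """log10-transform paired signals and subtract the mean log10 diagnosis value."""
    log_dx = [math.log10(v) for v in diagnosis]
    log_rl = [math.log10(v) for v in relapse]
    baseline = mean(log_dx)
    return [v - baseline for v in log_dx], [v - baseline for v in log_rl]


def paired_t(dx, rl):
    """Paired t statistic and degrees of freedom for relapse minus diagnosis."""
    diffs = [r - d for d, r in zip(dx, rl)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def combined_paired_t(studies):
    """studies: list of (diagnosis, relapse) signal lists, one tuple per study.

    Each study is centered on its own diagnosis mean before pooling, so that
    platform- and normalization-specific offsets do not enter the comparison.
    """
    all_dx, all_rl = [], []
    for diagnosis, relapse in studies:
        dx, rl = center_to_diagnosis(diagnosis, relapse)
        all_dx += dx
        all_rl += rl
    return paired_t(all_dx, all_rl)
```

Note that the within-pair differences are unaffected by the centering itself; its role is to put values from the two studies on a comparable scale for reporting and display.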
We also compared expression levels of IK target genes in primary leukemic cells from 155 pediatric BCR-ABL- BPL patients and 20 BCR-ABL+ BPL patients in the Mullighan study (GSE12995) [10]. Transcript signal values were obtained from hybridization onto the Affymetrix Human Genome U133A genechip arrays. The trimmed mean target intensity of each array was globally scaled to 500 (MAS5 values) as the normalization method. T-tests were performed using log10

IKZF1 Gene Expression Analysis Using Multiple Probe Sets for Identification of IKZF1 Deletions. BLAT analysis on IKZF1 target sequences deposited in the Affymetrix NetAffx™ Analysis Center (http://www.affymetrix.com/analysis/index.affx) mapped these probe sets onto specific IKZF1 exons visualized using the UCSC genome browser (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start) [11]. This analysis is designed to locate sequences of 95% and greater similarity of length 25 bases or more in the entire genome [11]. The exon designations obtained by comparing the BLAT analysis to 3 reference sequences (UCSC genes, Ensembl gene predictions, Human mRNA GenBank) were as follows:
We compiled 6 archived gene expression profiling datasets that measured expression in pediatric B-precursor ALL patients hybridized to the Human Genome U133 Plus 2.0 Array, which contains the 10 IKZF1 probe sets, for determination of exon-specific expression (GSE11877, N=207; GSE13159, N=823; GSE13351, N=107; GSE18497, N=82; GSE28460, N=98; GSE7440, N=99; Total, N=1416). To enable comparison of samples across studies, a normalization procedure was performed that merged the raw data (CEL files) from the 6 datasets. Perfect Match (PM) signal values for probe sets were extracted from the raw CEL files matched with probe identifiers obtained from the Affymetrix-provided CDF file (HG-U133_Plus_2.cdf), as implemented by the Aroma Affymetrix statistical packages run in the RStudio environment (Version 0.97.551, RStudio Inc., running with R 3.01) [12]. The PM signals were quantified using Robust Multiarray Analysis. Normalization across all 6 studies and 1416 samples was achieved using a two-pass procedure: first the empirical target distribution was estimated by averaging the (ordered) signals over all arrays, followed by normalization of each array toward this target distribution.
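The two-pass normalization described above corresponds to quantile normalization, and its logic can be sketched in plain Python. This is an illustrative sketch only, not the Aroma Affymetrix implementation: the function names are hypothetical, all arrays are assumed to contain the same number of probes, and ties are resolved by input order rather than by averaging.

```python
def target_distribution(arrays):
    """Pass 1: average the sorted signals across arrays, rank by rank."""
    sorted_arrays = [sorted(a) for a in arrays]
    n_arrays = len(arrays)
    return [sum(column) / n_arrays for column in zip(*sorted_arrays)]


def normalize_to_target(array, target):
    """Pass 2: replace each signal with the target value at the same rank."""
    order = sorted(range(len(array)), key=array.__getitem__)
    normalized = [0.0] * len(array)
    for rank, index in enumerate(order):
        normalized[index] = target[rank]
    return normalized


def quantile_normalize(arrays):
    """Run both passes; every output array shares the target distribution."""
    target = target_distribution(arrays)
    return [normalize_to_target(a, target) for a in arrays]
```

After normalization every array has an identical empirical distribution, while the rank order of probes within each array is preserved, which is what makes signals comparable across the merged studies.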
A subset of pediatric BPL cases was further analyzed to compare leukemic specimens from Ph+ BPL cases (N=122 from GSE13159 and N=1 from GSE13351) with leukemic specimens from Ph- BPL cases (N=236 from GSE13159 and N=91 from GSE13351) as well as normal bone marrow specimens (N=74 from GSE13159) using 5 IKZF1 probe sets that interrogated IKZF1 exons 1-4. We performed a one-way hierarchical clustering technique to organize expression patterns (log2-transformed RMA-normalized values for 524 samples, mean-centered to normal samples) such that transcripts (rows) with similar expression profiles across patients were grouped together using the average distance metric. A mixed model analysis of variance with three fixed effects ("Diagnosis" (Normal, Ph-, Ph+), "Probeset" (5 IKZF1 probe sets), and an interaction term for Diagnosis x Probeset) and a random factor, "case", for sample identification was utilized for the analysis of differential IKZF1 gene expression levels. Planned linear contrasts were performed using the fitted parameters from the interaction term to compare Ph+ versus Ph- BPL or normal samples for each of the 5 IKZF1 probe sets. All calculations were performed using the JMP statistical package (JMP v10, SAS, Cary, NC). We performed pairwise correlations of the genes (158 probe sets) reported by Iacobucci et al. to be down-regulated in adult Ph+ BPL patients with IKZF1 deletions [16] and the 5 IKZF1 probe sets for IKZF1 exons 1-4 in pediatric BPL patients using the RMA-normalized database that combined 123 Ph+ BPL cases and 327 Ph- BPL cases. Correlation coefficients (r) were determined between all gene pairs, and cluster analysis was applied to the matrix of correlation coefficients for both rows and columns of probe set identifications (JMP Software, SAS, Cary, NC). Modular structure of the probe set correlations was also deduced for the most significantly down-regulated probe sets in the Iacobucci study [16] (P<0.01), and for a subset of 29 IK target genes that harbored IK binding sites in mice, identified by cross-referencing this gene set with the archived ChIP-seq data (GSM803110, archived in GSE32311 [3]) using the Integrative Genomics Browser [4].
Gene Set Enrichment Analysis (GSEA). Rank-ordered differences in standard deviation units for BCR-ABL+ BPL samples (N=20) compared to other samples (N=155) in the Mullighan study (GSE12995) [10] and Ph+ BPL (N=122) versus Ph- BPL (N=237) in the MILE study (GSE13159) [8] were processed for enrichment of lymphoid priming genes (18 genes represented on the gene chips) and IK target genes (39 genes represented in the Mullighan study, 45 genes represented in the MILE study) using a supervised approach implemented in GSEA v2.08 (Broad Institute). These rank-ordered genes were screened for enrichment of gene sets in 13321 genes (22283 transcripts) for the Mullighan study and 20606 genes (54613 transcripts) for the MILE study using the weighted Kolmogorov-Smirnov statistics implemented in GSEA v2.08 (Broad Institute) [17]. Genes with multiple probes were collapsed using the "Max_probe" algorithm provided. GSEA evaluated the significance of the over-representation of gene sets correlated or anti-correlated with expression in Ph+ samples by calculating the Enrichment Score (ES), which represents the deviation of the observed rankings from the expected null. The null assumed a random rank distribution and was generated by an empirical permutation test procedure that randomly assigned gene names to the rank-ordered standard deviation values ("GSEA Pre-ranked" algorithm). Leading edge genes were identified up to and including the peak of the ES profile. Nominal P-values were computed by comparing the tails of the ES scores for observed and permutation-generated null distributions following 1000 permutations.
Rank-ordered differences in log2-transformed RMA expression values between leukemic specimens from Ph+ ALL patients (N=123) and normal bone marrow specimens (N=74) from 2 studies (viz.: GSE13159 and GSE13351) were processed for enrichment of 158 probe sets that showed downregulation within the Ph+ subset of the Iacobucci study [16] using a supervised approach implemented in GSEA v2.08 (Broad Institute). Significance was assessed using the weighted Kolmogorov-Smirnov statistics. GSEA evaluated the significance of the over-representation of probe sets correlated or anti-correlated with IKZF1 expression in Ph+ samples by calculating the Enrichment Score (ES), which represents the deviation of the observed rankings from the expected null. The null distribution assumed a random rank distribution and was generated by an empirical permutation test procedure that randomly assigned probe set names to the rank-ordered differences in expression ("GSEA Preranked" algorithm). Leading edge genes were identified up to and including the peak of the ES profile. Nominal P-values were computed by comparing the tails of the ES scores for observed and permutation-generated null distributions following 1000 permutations.
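The pre-ranked enrichment procedure described above (weighted Kolmogorov-Smirnov running sum, leading edge up to the ES peak, and a nominal P-value from label permutations) can be illustrated with a short sketch. This is not the GSEA software's code: the function names are hypothetical, the weighting exponent of 1 is an assumption, and the sketch requires that the gene set contain at least one but not all of the ranked genes and that scores are not all zero.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted Kolmogorov-Smirnov running-sum statistic.

    ranked_genes and scores are parallel lists sorted by score (descending).
    Returns (ES, index of the ES peak).
    """
    hits = [g in gene_set for g in ranked_genes]
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    miss_step = 1.0 / (len(ranked_genes) - sum(hits))
    running, es, peak = 0.0, 0.0, -1
    for i, (s, h) in enumerate(zip(scores, hits)):
        running += abs(s) ** weight / hit_total if h else -miss_step
        if abs(running) > abs(es):
            es, peak = running, i
    return es, peak


def leading_edge(ranked_genes, gene_set, es, peak):
    """Set members up to and including the peak (from the peak on if ES < 0)."""
    window = ranked_genes[: peak + 1] if es >= 0 else ranked_genes[peak:]
    return [g for g in window if g in gene_set]


def nominal_p(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Permute gene names over the fixed ranked scores to build the null ES."""
    rng = random.Random(seed)
    observed, _ = enrichment_score(ranked_genes, scores, gene_set)
    labels = list(ranked_genes)
    same_sign = extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        null_es, _ = enrichment_score(labels, scores, gene_set)
        # Compare only against the tail of the null with the observed sign
        if (null_es >= 0) == (observed >= 0):
            same_sign += 1
            extreme += abs(null_es) >= abs(observed)
    return extreme / same_sign if same_sign else 1.0
```

In this sketch the permutation shuffles gene labels while the ranked values stay fixed, mirroring the description of the pre-ranked procedure in the text; it does not compute normalized enrichment scores or multiple-testing corrections.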
Western Blot Analysis of Ikaros Expression. Western blot analysis of whole cell lysates and nuclear protein extracts for IK expression was performed by immunoblotting using the ECL chemiluminescence detection system (Amersham Life Sciences), as described previously [1].

Electrophoretic Mobility Shift Assays (EMSAs).
EMSAs were performed on nuclear extracts (NE) from BPL cells using the Thermo Scientific LightShift Chemiluminescent EMSA Kit (Catalog No. 20148) (Pierce, Rockford, IL, USA) following the manufacturer's protocol, as previously described [1], [18]. Preparation of nuclear extracts was carried out using the NE- was complete, biotin-labeled DNA was cross-linked to the membrane at 120 mJ/cm² using a Spectrolinker XL-1000 UV cross-linker with 254-nm UV light bulbs. The biotin-labeled DNA was detected using a stabilized streptavidin-horseradish peroxidase (HRP) conjugate and a highly sensitive chemiluminescent substrate according to the manufacturer's instructions [1]. The membrane was exposed to X-ray film and developed with a film processor.