ARID5B, IKZF1 and Non-Genetic Factors in the Etiology of Childhood Acute Lymphoblastic Leukemia: The ESCALE Study

Genome-wide association studies (GWAS) have shown that common polymorphisms in ARID5B and IKZF1, two genes involved in lymphoid differentiation, increase the risk of childhood acute lymphoblastic leukemia (ALL). These findings markedly modified the current field of research on the etiology of ALL. In this new context, the present exploratory study investigated the possible interactions between these at-risk alleles and the non-genetic suspected ALL risk factors that were of sufficient prevalence in the French ESCALE study: maternal use of home insecticides during pregnancy, preconception paternal smoking, and some proxies for early immune modulation, i.e. breastfeeding, history of common infections before age one year, and birth order. The analyses were based on 434 ALL cases and 442 controls of European origin, drawn from the nationwide population-based case-control study ESCALE. Information on non-genetic factors was obtained by standardized telephone interview. Interactions between rs10740055 in ARID5B or rs4132601 in IKZF1 and each of the suspected non-genetic factors were tested, with the SNPs coded as counts of minor alleles (trend variable). Statistical interactions were observed between rs4132601 and maternal insecticide use (p = 0.012), breastfeeding (p = 0.017) and repeated early common infections (p = 0.0070), with allelic odds ratios (OR) that were increased only among the children not exposed to insecticides (OR = 1.8, 95%CI: 1.3, 2.4), those who had been breastfed (OR = 1.8, 95%CI: 1.3, 2.5) and those who had had repeated early common infections (OR = 2.4, 95%CI: 1.5, 3.8). The allelic ORs were close to one among children exposed to insecticides, who had not been breastfed and who had had no or few common infections. Repeated early common infections interacted with rs10740055 (p = 0.018) in the case-only design.
Further studies are needed to evaluate whether these observations of a modification of the effect of the at-risk alleles by non-genetic factors are chance findings or reflect true underlying mechanisms.


Introduction
Acute lymphoblastic leukemia (ALL) accounts for 80% of all childhood leukemia and for approximately one quarter of all childhood neoplasms in developed countries [1]. Until recently, only Down's syndrome, a few inheritable predisposing diseases and high-dose ionizing radiation were known to increase the risk of ALL [2,3]. Genome-wide association studies (GWAS) have revealed common single nucleotide polymorphisms (SNPs) associated with childhood ALL risk [4][5][6][7][8][9][10][11][12]. The two loci associated with the greatest ALL risk are in the AT-rich interactive domain 5b gene (ARID5B, chromosomal region 10q21.2) and the Ikaros family zinc finger 1 gene (IKZF1, chromosomal region 7p12.2). IKZF1 codes for a zinc finger transcription factor, Ikaros, which is a key regulator of hematopoiesis. In particular, Ikaros is required for the development of the earliest B-cell progenitors and at later stages for V(D)J recombination and B-cell receptor expression [13,14]. Ikaros also acts as a tumor suppressor [15] and genetic alteration of IKZF1 in leukemia cells is associated with a poor outcome in B-cell-progenitor ALL [16,17]. ARID5B is a member of the AT-rich interaction domain family of transcription factors that plays an important role in embryogenesis and growth regulation [18]. The potential role of ARID5B in immune response and lymphoid lineage differentiation has been less investigated than that of IKZF1. A role is supported by data from homozygous knockout mice that had immune abnormalities and reduced numbers of B-cell progenitors [19].
The discovery of some ALL susceptibility loci calls for revisiting the suspected environmental risk factors for childhood ALL in light of the possible interactions those factors may have with the variant alleles of ARID5B and IKZF1. The relationships between ALL and the exposures may vary depending on ARID5B and IKZF1 polymorphisms and, conversely, the genetic effects may be modified by environmental exposures. Indeed, growing evidence suggests that environmental factors may cause diseases by modifying the expression of genes involved in key cell functions, potentially through epigenetic deregulation [20][21][22][23][24][25][26][27]. Modeling their combined effect may enable insights into the underlying biologic processes and the potential implication of the environmental factors in ALL risk [28]. Several exposures are candidate risk factors for childhood ALL: pesticides, low dose ionizing radiation, extremely low frequency electromagnetic fields, benzene exposure, paternal smoking, absence of folic acid supplementation, infections transmitted through population mixing, a lack of early exposure to infectious agents, and absence of breastfeeding [2,3].
The exploratory study reported herein investigated the potential interactions between the polymorphisms rs10740055 of ARID5B and rs4132601 of IKZF1 and the non-genetic suspected ALL risk factors that, for power reasons, were selected because they were of sufficient prevalence in the ESCALE case-control study: maternal use of home insecticides during pregnancy, preconception paternal smoking, and some proxies for early immune modulation, i.e. breastfeeding, history of repeated common infections before age one year, and birth order [29][30][31].

Case and control ascertainment
The ESCALE study is a population-based case-control study designed to assess the role of several environmental and genetic factors in childhood cancer, including ALL [29][30][31]. The ALL cases were children diagnosed with ALL between January 1, 2003, and December 31, 2004, as defined by the registration criteria of the National registry of childhood hematopoietic malignancies (RNHE). To be eligible, children had to be less than 15 years old and residing in France at the time of diagnosis. Children were not eligible if they had been adopted, or if their biological mother had died (n = 8), did not speak French (n = 19), had a serious psychiatric disorder (n = 12), or could not be interviewed for ethical reasons because the child was in palliative care (n = 2) or had died (n = 17). Among the 714 eligible ALL cases, 648 cases (91%) participated in the ESCALE study. Information on leukemia subtype and cytogenetic characteristics was subsequently obtained from the RNHE information system.
The population controls, children free from cancer, were contemporaneously selected in 2003-2004, using a quota sampling method. A base of 60,000 phone numbers representative of telephone subscribers was randomly extracted from the telephone directory (S1 Fig.). Then, controls were recruited using gender and age quotas in sixteen strata (boys and girls from the age strata 0-1, 2, 3, 4, 5-6, 7-8, 9-11 and 12-14 years) reflecting the expected distribution of all the cancer cases included in the ESCALE study, based on rates from the RNHE [32] and the Regional Childhood Cancer Registries [33]. Additional quotas were applied in order to make the number of children less than 15 years old living in the household similar to that of the French population, conditionally on age. Children who had been adopted, or whose biological mother had died, did not speak French, or had a serious psychiatric disorder were not eligible. Finally, 1681 of the 2360 eligible controls (71%) were enrolled in the ESCALE study.
A specimen was requested from each subject, and consisted of a blood sample for the cases, and a saliva sample collected using a swab brush for the controls. After obtaining parental consent, biological samples were taken from 619 (96%) of the 648 included ALL cases and 810 (48%) of the 1681 included controls, and 513 ALL cases and 570 controls had sufficient DNA for genotyping (S1 Table).

Data collection
The mothers were interviewed by telephone, using the same standardized questionnaire for cases and controls. The mothers were asked for socio-demographic information, child and family medical history, and details on the child's environment and lifestyle, birth characteristics and infancy care. The mothers were also asked whether they had used products against insects at home during pregnancy [30], whether the father had ever smoked and the year he quit [29], whether their child had been breastfed, and whether the child had had none, 1 to 3, or 4 or more infectious episodes for each of the following sites before age one year: tonsillitis, otitis, upper respiratory tract infections, gastroenteritis, bronchiolitis and other lower respiratory tract infections, and urinary tract infections [31]. A history of repeated early common infections was defined as 4 or more episodes of infection of at least one given site or 1-3 episodes of infection of at least 4 sites in infancy [31].
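The definition of repeated early common infections can be written as a small classification rule. The following is only an illustrative sketch, not the study's code: the category labels ("none", "1-3", "4+"), the site keys, the function name and the choice to treat an unreported site as "none" are all assumptions made for the example, and the second criterion is read as "at least 4 sites each with 1-3 episodes".

```python
# Site keys paraphrase the questionnaire items; the labels are hypothetical.
SITES = (
    "tonsillitis",
    "otitis",
    "upper_respiratory",
    "gastroenteritis",
    "lower_respiratory",   # bronchiolitis and other lower respiratory tract infections
    "urinary_tract",
)


def repeated_early_infections(episodes: dict) -> bool:
    """Classify a child from per-site episode categories before age one year.

    `episodes` maps a site key to "none", "1-3" or "4+".
    Sites missing from the mapping are treated as "none" (an assumption).
    """
    categories = [episodes.get(site, "none") for site in SITES]
    # Criterion 1: 4 or more episodes at at least one site.
    four_or_more_at_one_site = any(c == "4+" for c in categories)
    # Criterion 2: 1-3 episodes at at least 4 different sites.
    one_to_three_at_four_sites = sum(c == "1-3" for c in categories) >= 4
    return four_or_more_at_one_site or one_to_three_at_four_sites
```

For instance, a child with 4 or more episodes of otitis alone is classified as having repeated infections under this rule, whereas a child with 1-3 episodes at only three sites is not.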
Quality control excluded 109 controls with an individual call rate less than 95% and 36 ALL cases with an individual call rate less than 97%, low or high heterozygosity or discrepancies between stated gender and sex chromosomes [6]. Six cases were excluded because they had Down's syndrome. Mean individual call rates were 99.9% for the cases and 98.5% for the controls. The SNPs of interest did not deviate from the Hardy-Weinberg equilibrium in the control sample.
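The text does not say which test was used to check Hardy-Weinberg equilibrium. As a hedged illustration, a common choice is a one-degree-of-freedom chi-square goodness-of-fit test on the genotype counts of the controls. The function name and the counts used with it are hypothetical, since the study's genotype counts are not reported here, and the sketch assumes both alleles are present in the sample.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple:
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Takes observed genotype counts (AA, AB, BB) and returns (chi2, p_value).
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A large p-value indicates no detectable deviation from equilibrium. With small expected counts an exact test would be preferable to this approximation.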
The analyses were restricted to the 434 cases and 442 controls who had at least two European-born grandparents, given that this criterion proved to be a good proxy to identify children of European descent in the present study [6]. The cases consisted of 10 pro-B ALL, 355 common B-cell ALL, 19 mature B-cell ALL, 40 T-cell ALL and 10 unspecified ALL (S1 Table). Eighty-two common B-cell ALL cases were ETV6-RUNX1 positive and 157 were hyperdiploid ( 47 chromosomes).

Statistical analysis
The present analysis focused on rs4132601 for IKZF1 and on rs10740055 for ARID5B, as these SNPs were the most significantly associated with ALL in the ESCALE genotyped sample [6]. We selected the suspected non-genetic factors associated with ALL in our previous analyses, and to which at least 30% of the control population were exposed. The factors included paternal smoking since the year prior to the child's birth [29], breastfeeding, repeated early common infections and birth order (2 and more versus 1) [31], and maternal use of home insecticides during pregnancy, which was the pesticide exposure most strongly associated with acute leukemia in the ESCALE study [30]. The analyses addressing common infections were restricted to the children aged one year or more (398 controls and 426 ALL).
Odds ratios (OR) and their 95% confidence intervals (CI) were estimated using unconditional logistic regression models including the child's age and gender, and parental socio-professional category. ORs were calculated for each stratum of the variables combining each polymorphism with each of the 5 non-genetic factors of interest. Interaction odds ratios (IOR) between the polymorphism and the exposure of interest were estimated by both case-control and case-only analyses, with the SNPs coded as counts of minor alleles (trend variable), and tested under the multiplicative model. The IORs that could be detected with a power of 80% and a two-tailed 5% level of significance were equal to 1.5 and 1.8 for the case-only and the case-control design, respectively, assuming a minor allele frequency of 30%, an allelic OR of 1.5, an exposure prevalence of 30% and an OR of association between ALL and the exposure of 1.5.
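The multiplicative interaction model described above can be written out explicitly. The following is a sketch in our own notation, not equations taken from the study: D denotes case status, G the count of minor alleles (0, 1 or 2), E the binary exposure, and Z the adjustment covariates (age, gender, parental socio-professional category). The exact specification of the case-only model is an assumption.

```latex
% Case-control model with a multiplicative interaction term
\begin{align}
\operatorname{logit} P(D = 1 \mid G, E, Z)
  &= \beta_0 + \beta_G\, G + \beta_E\, E + \beta_{GE}\, G E + \gamma^{\top} Z, \\
\mathrm{IOR} &= \exp(\beta_{GE}), \\
\mathrm{OR}_{\text{allelic} \mid E = 0} &= \exp(\beta_G), \qquad
\mathrm{OR}_{\text{allelic} \mid E = 1} = \exp(\beta_G + \beta_{GE})
  = \mathrm{OR}_{\text{allelic} \mid E = 0} \times \mathrm{IOR}.
\end{align}

% Case-only model (cases only, D = 1), assumed form
\begin{align}
\operatorname{logit} P(E = 1 \mid G, D = 1) &= \alpha_0 + \alpha_G\, G, \qquad
\mathrm{IOR}_{\text{case-only}} = \exp(\alpha_G).
\end{align}
```

Under this formulation, an IOR below one means the allelic OR is smaller among the exposed than among the unexposed, and an IOR above one means the reverse. The case-only estimate approximates the case-control IOR only if genotype and exposure are independent in the source population, which is the usual assumption behind the design's gain in power.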
The research was conducted in accordance with the principles of the Declaration of Helsinki (World Medical Association, 2004) and complied with all applicable international regulatory requirements. In particular, the study was reviewed and approved by the French National Institute of Health and Medical Research ethics committee and the Direction Générale de la Santé institutional review board before the study began (DGS No. 2003/0259). Written informed consent was obtained from the parents of the children enrolled in the study.
All regression analyses were conducted in SAS version 9.3 and minimal detectable IORs were estimated using Quanto version 1.2.4 [34]. Table 1 reports the associations previously observed between childhood ALL and the polymorphisms rs4132601 in IKZF1 and rs10740055 in ARID5B in the ESCALE study. The table also shows the associations with maternal use of home insecticides during pregnancy, preconception paternal smoking, birth order, breastfeeding and repeated early common infections observed both in the genotyped subsample used for the present analysis and in the whole ESCALE sample [29][30][31], with similar results.

Results
The combined variables and the joint effects are shown in Table 2.
The most marked interactions were observed between rs4132601 and maternal insecticide use (IOR = 0.5, 95%CI: 0.3-0.9, p = 0.012), breastfeeding (IOR = 1.7, 95%CI: 1.1-2.7, p = 0.017) and repeated early common infections (IOR = 2.0, 95%CI: 1.2-3.4, p = 0.0070) in the case-control design (Table 3). The magnitudes of the interactions were similar using the case-only design (Table 3). The allelic ORs were increased only among the subjects not exposed to the suspected risk factor (or exposed to the suspected protective factor), i.e. the children whose mother did not use home insecticides during pregnancy (allelic OR = 1.8, 95%CI: 1.3, 2.4), those who had been breastfed (allelic OR = 1.8, 95%CI: 1.3, 2.5) and those who had had repeated common infections in their first year of life (allelic OR = 2.4, 95%CI: 1.5, 3.8) (Table 4). Conversely, the relationship between ALL and the G allele of rs4132601 was not present among the children whose mother used home insecticides during pregnancy (allelic OR = 1.0, 95%CI: 0.7, 1.5), those who had not been breastfed (allelic OR = 1.0, 95%CI: 0.8, 1.4) and those who had had no or few early common infections (Table 4).

Table 2. Joint associations between childhood acute lymphoblastic leukemia (ALL) and the variables combining the non-genetic factors of interest with rs10740055 (ARID5B) and rs4132601 (IKZF1) polymorphisms. The analyses addressing common infections were restricted to the children aged one year or more (398 controls and 426 ALL), while the analyses addressing the other exposures were based on the whole sample (442 controls and 434 ALL).

Table 3. Interaction between the SNP rs4132601 (IKZF1) and maternal use of home insecticides during pregnancy, preconception paternal smoking, breastfeeding, repeated early common infections and birth order, in their relation with childhood acute lymphoblastic leukemia.
Similar differences in the effect of rs4132601 by the non-genetic factors were observed for the most frequent ALL subtypes, i.e., common B-cell ALL, hyperdiploid common B-cell ALL and ETV6-RUNX1 common B-cell ALL (Table 4). Regarding ARID5B, the at-risk allele of rs10740055 also interacted with maternal insecticide use, although not significantly. A negative interaction was observed with repeated early common infections (IOR = 0.7, 95%CI: 0.5-0.9, p = 0.018) in the case-only design (Table 5).
However, the tests were no longer significant when the Bonferroni correction for multiple testing was applied (10 tests for 2 SNPs and 5 non-genetic factors; corrected alpha risk = 0.005).
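The Bonferroni arithmetic can be checked directly. In this minimal sketch the p-values are those reported in the text, while the function name and dictionary labels are ours and the family-wise alpha of 0.05 is the conventional level implied by the corrected threshold of 0.005.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


# 2 SNPs x 5 non-genetic factors = 10 interaction tests.
threshold = bonferroni_threshold(alpha=0.05, n_tests=2 * 5)

# Nominally significant interaction p-values reported in the Results.
reported_p_values = {
    "rs4132601 x maternal insecticide use": 0.012,
    "rs4132601 x breastfeeding": 0.017,
    "rs4132601 x repeated early common infections": 0.0070,
    "rs10740055 x repeated early common infections (case-only)": 0.018,
}

# An interaction survives the correction only if its p-value is below the threshold.
survives_correction = {
    name: p < threshold for name, p in reported_p_values.items()
}
```

None of the four p-values falls below the corrected threshold, which matches the statement that no test remained significant after correction.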

Discussion
In recent years some IKZF1 and ARID5B SNPs have been shown to be clearly associated with childhood ALL. The present study showed that the at-risk allele of IKZF1 interacted negatively with maternal use of home insecticides during pregnancy, and positively with factors related to early immune modulation, breastfeeding and history of repeated early common infections. There was no interaction with preconception paternal smoking.
However, chance findings cannot be ruled out as the study was exploratory with respect to the interaction between the at-risk alleles and the non-genetic suspected risk factors. We have limited multiple testing by focusing only on the 2 loci most significantly associated with ALL in the present study and the literature [4][5][6][7][8][9][10][11][12], and on the 5 non-genetic suspected ALL risk factors that were of sufficient prevalence in the ESCALE case-control study. Therefore, the other ALL risk loci identified by the GWAS, in CEBPE, CDKN2A, GATA3 and PIP4K2A, were not investigated. Larger consortium-based studies are needed to comprehensively investigate the interactions between each of the ALL susceptibility polymorphisms and non-genetic suspected risk factors.

Table 5. Interaction between the SNP rs10740055 (ARID5B) and maternal use of home insecticides during pregnancy, preconception paternal smoking, breastfeeding, repeated early common infections and birth order, in their relation with childhood acute lymphoblastic leukemia.

There was a substantial number of losses in the present study, related to non-participation, refusal to provide genetic material, and an insufficient amount of DNA, leading to the potential for selection bias. Employment in the most qualified category 'Intellectual and scientific jobs, managers and intermediate professions' was significantly more frequent in the parents of the genotyped controls (56%) than in those of the other non-genotyped Caucasian controls (39%), but significantly less frequent in the parents of the genotyped cases (34%) than in those of the other non-genotyped Caucasian cases (45%). However, the frequency of the at-risk alleles of rs4132601 and rs10740055 differed only slightly according to whether the parents belonged to the most qualified professional category, among both the controls and the cases.
This, and the fact that the associations of ALL with the non-genetic factors were similar in the genotyped and non-genotyped children, suggest no or little effect of the loss of subjects on the estimation of the interactions.
The cases and controls were genotyped separately using different methods, but the genotypes obtained with the pangenomic and customized BeadChips were shown to be fully concordant in a sample of 96 cases genotyped using both platforms [6]. Past exposures may not be remembered perfectly by the mothers and some misclassifications inherent in the retrospective design of the study could not be prevented. The potential for recall bias is likely to be greater for a history of common infections and the use of pesticides during pregnancy, which are sporadic events, than for breastfeeding, paternal smoking or birth order. In particular, underreporting of medically diagnosed infections has been reported [35][36][37], and, in the United Kingdom Childhood Cancer Study, an inverse association was observed between ALL and a history of clinically diagnosed infections as reported by the mothers, but a positive association was observed when relying on information from medical records [37,38]. In addition, studies based on medical records or health claims databases reported null [39,40] or positive associations [41][42][43] between a history of clinically diagnosed infection and ALL, suggesting that children who develop leukemia are more likely to have had clinically diagnosed infections in infancy. The present study reported an inverse association between ALL and maternal report of repeated common infections, medically diagnosed or not, considered as a surrogate for frequent exposure to infectious agents. The influence of case/control status on that recall, and the direction of the bias remain difficult to predict. However, with regard to the specific goal of the present analyses, the exposure misclassifications are not expected to differ according to the polymorphisms of interest and are not likely to have resulted in strong bias in the estimates of the interactions.
To our knowledge, the interactions between the genetic variants identified by the GWAS and some non-genetic factors have been investigated in only two previous studies. Linabery et al. reported no interaction between some ARID5B and IKZF1 SNPs and some selected demographic factors, birth weight and maternal age, using a case-only design [44]. In a recent Australian study, there was evidence of an interaction between the rs4132601 risk genotype at IKZF1 and paternal preconception smoking (p = 0.05) and maternal use of folic acid (p = 0.04) [12]. In the present study, we did not observe an interaction with paternal smoking, and the power was too limited for investigating interaction with folic acid supplementation, whose prevalence was low in France during the study period [45]. Interestingly, in the present study and in the Australian study, the increase in ALL risk in carriers of the G allele of IKZF1 rs4132601 was not observed in the event of exposure to the suspected risk factors (or in the absence of exposure to the suspected protective factor). Biologically plausible explanations for our observations are lacking and further studies are needed to replicate (or not replicate) our observations. One possible explanation may be the modification of IKZF1 expression by the environment, as growing evidence suggests that environmental factors, including pesticides and breastfeeding, may modify the expression of genes involved in key cell functions, in particular through epigenetic deregulation [20][21][22][23][24][25][26][27]. To our knowledge, no experimental study has yet investigated the possibility that exposure to pesticides or a lack of early immune stimulation may reduce the expression of IKZF1 or other genes involved in lymphoid differentiation, or that the impact of the non-genetic factors may differ depending on the gene polymorphisms.
A decrease in the expression of one or several genes involved in a given cell function, following exposure to an environmental factor, might result in the same effect as carrying a specific polymorphism associated with lower expression of those genes, potentially leading to a negative interaction. Statistical interactions on a multiplicative scale might also be observed if environmental and genetic factors led to leukemia through different tumorigenic pathways.
In conclusion, this study suggests the possibility of a negative interaction between the ALL susceptibility locus in IKZF1 and prenatal insecticide exposure, and of positive interactions between the at-risk allele and breastfeeding and a history of repeated early common infections. Further studies are needed to evaluate whether these observations are chance findings or reflect true underlying mechanisms.