IL-4 Haplotype -590T, -34T and Intron-3 VNTR R2 Is Associated with Reduced Malaria Risk among Ancestral Indian Tribal Populations

Background
Interleukin 4 (IL-4) is an anti-inflammatory cytokine that regulates the balance between TH1 and TH2 immune responses, immunoglobulin class switching and humoral immunity. Polymorphisms in this gene have been reported to affect the risk of infectious and autoimmune diseases.

Methods
We analyzed three regulatory IL-4 polymorphisms (-590C>T, -34C>T and the 70 bp intron-3 VNTR) in 4216 individuals, including: (1) 430 individuals in ethnically matched case-control groups (173 severe malaria, 101 mild malaria and 156 asymptomatic); (2) 3452 individuals from 76 linguistically and geographically distinct endogamous populations of India; and (3) 334 individuals of different ancestry from outside India (84 Brazilian, 104 Syrian and 146 Vietnamese).

Results
The -590T, -34T and intron-3 VNTR R2 alleles were found to be associated with reduced malaria risk (P < 0.001 for -590C>T and -34C>T, and P = 0.003 for VNTR). These three alleles were in strong LD (r² > 0.75), and the TTR2 (-590T, -34T and intron-3 VNTR R2) haplotype appeared to be a protective factor against malaria (P = 0.009, OR = 0.552, 95% CI = 0.356-0.854). Allele and genotype frequencies differed significantly between caste, nomadic, tribe and ancestral tribal populations (ATP). The distribution of the protective haplotype TTR2 differed significantly among these groups (χ² = 182.95, df = 3, p-value < 0.001); its frequency was highest in ATP (40.5%), intermediate in tribes (33%), and lowest in caste (17.8%) and nomadic (21.6%) populations.

Conclusions
Our study suggests that IL-4 polymorphisms regulate host susceptibility to malaria and disease progression. The TTR2 haplotype, which gives protection against malaria, is frequent among ATPs. Since these populations live in isolation, mainly practice hunter-gatherer lifestyles and are exposed to various parasites, the IL-4 TTR2 haplotype might be under positive selection.


Introduction
Plasmodium falciparum malaria is one of the major causes of morbidity and mortality in tropical and sub-tropical areas [1]. Despite significant advances in disease control, Plasmodium falciparum malaria accounts for 1-3 million deaths annually [2]. Plasmodium falciparum infections vary in severity and include phenotypes such as hyper- or asymptomatic parasitaemia, mild malaria, severe malaria and cerebral malaria [3,4], and the host genetic architecture contributes to these malarial phenotypes [5]. Increasing epidemiological and experimental evidence suggests that host genetic variation plays an essential role in thwarting, actively or passively, parasite invasion [4]. The fundamental attribute of the innate immune system is to recognize pathogens and react swiftly to contain early infection while signaling to the specific adaptive immune response. Studies have investigated the role of innate immune genes, such as those encoding Toll-like receptors (TLR2, TLR4, TLR9), chemokines and cytokines, in Plasmodium falciparum malaria [6]. In addition, a plethora of studies have documented that genetic heterogeneity in many immune genes is associated with malaria susceptibility [3].
Malarial infection is characterized by pro-inflammatory responses during the early stages of infection, followed by anti-inflammatory responses during disease progression [7]. Human Interleukin 4 (IL-4), encoded on chromosome 5 (5q31-33), is an anti-inflammatory cytokine produced by CD4+ Th2 cells, basophils and mast cells. IL-4 regulates a variety of cell types [8] and plays an essential role in the differentiation of Th2 effector cells, suppression of Th1 signaling, promotion of humoral immunity and Ig class switching, and a dominant role in immunopathology [9,10,11]. Studies have revealed that IFN-γ levels were significantly elevated during the early stages of malaria, whereas IL-4 levels were elevated during the intermediate and late stages, indicating a switch towards a Th2 response [12]. A significant inverse correlation between the IL-4 to IFN-γ ratio and peripheral parasitaemia in malaria patients has been documented [13].
The human IL-4 gene promoter contains six conserved binding sites for NFAT (nuclear factor of activated T-cells), a transcription factor that, along with activator protein 1 (AP1), regulates IL-4 transcription [14]. Studies have shown that the -590C/T transition creates a seventh NFAT binding site and synergistically up-regulates the IL-4 transcription rate up to 3-fold [15]. Elevated IgG and IgE antibody levels against malaria antigens, parasitaemia and malaria susceptibility have been associated with the IL-4 -590T allele in several African populations [8,16,17,18,19]. The -590T allele is believed to be under positive selection in various populations, indicating local adaptation to diverse pathogenic challenges [20]. Further, an association of the -34T promoter polymorphism with elevated total serum IgE levels has been demonstrated [8]. In addition, the H3K27Ac mark has been observed near an active regulatory element of the intron-3 VNTR region in different cell lines (www.ucsc.edu). Studies have established that IL-4 is a key regulator in malaria, and three regulatory IL-4 polymorphisms (-590C/T, -34C/T and the intron-3 VNTR) have been shown to regulate serum IL-4 levels, IgG, IgE, disease progression and survival [16,17,18,19,21,22,23,24,25]. These three regulatory polymorphisms in the IL-4 locus have also been associated with end stage renal disease, multiple sclerosis, autoimmune Graves' disease, chronic polyarthritis, rheumatoid arthritis, asthma, rhinitis and atopic dermatitis [21,26,27,28,29].
Although several studies have documented the functional significance of IL-4 polymorphisms in different ethnicities, to the best of our knowledge no studies have investigated the contribution of IL-4 variants in Indian populations. Indian populations have remained isolated from the rest of the world for thousands of years, are unique in their origin and have accumulated a unique set of mutations, and the variants influencing disease susceptibility among Indian populations differ from those in other ethnicities [30,31]. In this study, we aimed to investigate the association of three functional IL-4 polymorphisms, rs2243250 (-590 C>T, promoter), rs2070874 (-34 C>T, 5'UTR) and rs79071878 (intron-3, 70 bp VNTR), with P. falciparum malaria infection in well-defined malaria cases and in ethnically matched controls. Since the entire Indian subcontinent represents a malaria endemic region, we extended our investigation of the three functional IL-4 polymorphisms to different linguistically and geographically isolated Indian populations and compared the observed differences to those of different ethnicities representing world populations.

Study Subjects
A total of 4216 individuals were investigated for IL-4 gene polymorphisms (-590C/T, -34C/T and the intron-3 VNTR). Of these, 173 individuals were clinically characterized as having severe Plasmodium falciparum malaria, 101 as having mild malaria and 156 as asymptomatic; all were from the P. falciparum endemic states of Orissa and Chhattisgarh (Table 1). We also included 3452 individuals from 76 distinct populations representing caste (n = 1568), nomadic (n = 114), tribes (n = 517) and ancestral tribal populations [(ATP) (n = 1253)] of India. In addition, three world populations representing tropical regions, Brazil (n = 84), Syria (n = 104) and Vietnam (n = 146), were included in this study (Table 2).

Sampling
All individuals representing the malaria cohort were clinically classified. Classification of malaria was carried out according to WHO guidelines. Severe malaria (n = 173) was defined as severe anemia (hemoglobin <50 g/l) and/or hyper-parasitemia (>250,000 parasites/µl, corresponding to >10% infected erythrocytes), a Blantyre coma score ≤2 and other facultative signs of severe malaria such as cerebral malaria, convulsions, hypoglycemia and respiratory distress. All individuals were hospitalized for treatment. Mild malaria (n = 101) was defined as parasitemia of 1000-50,000/µl on admission, no schizontaemia, circulating leukocytes containing malarial pigment <50/µl, not homozygous for hemoglobin S, hemoglobin >80 g/l, platelets >50/nl, leukocytes <12/nl, lactate <3 mmol/l and blood glucose >50 mg/dl. Asymptomatic individuals (n = 156) were characterized as individuals harboring parasites without clinical signs at the time of sample collection. A venous blood sample (~5 ml) was collected from each individual admitted at Ispat General Hospital, Rourkela, India, and Pt. Jawaharlal Nehru Memorial Medical College, Raipur, India.

Genotyping of SNP and VNTR Variants
Genomic DNA was extracted from whole blood using the protocol previously described [32]. To re-sequence 800 bp of the IL-4 promoter region, we utilized the reference sequence from ENSEMBL (ID: ENSG00000113520; www.ensembl.org). The sequence-specific primer pairs were designed using the Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast), MacVector (MacVector, Inc. USA) and Amplify 3X (http://engels.genetics.wisc.edu/amplify) software platforms.
DNA was amplified using a primer pair spanning the promoter region to detect the -590C/T and -34C/T polymorphisms. The primer pairs employed were IL-4_promo_F: 5'-TATGGACCTGCTGGGACCCAAACTA-3', and IL-4_promo_R: 5'-CACCTTCTGCTCTGTGAGGCTGTTC-3' (Eurofins MWG Operon). In brief, 5 ng of genomic DNA was amplified in a 10 µl reaction volume using the Qiagen long-range PCR kit following the manufacturer's instructions (Qiagen, Germany) on a GeneAmp 9700 Thermal cycler (ABI, USA). Thermal cycling parameters for amplification were: initial denaturation at 94°C for 3 min, followed by 35 cycles of 30 sec at 93°C denaturation, 25 sec at 66°C annealing and 1 min 30 sec at 68°C extension, followed by a final extension of 5 min at 68°C. PCR products were cleaned up using Exo-SAP-IT (USB, Affymetrix, USA) and 1 µl of the purified product was directly used as template for sequencing, using the BigDye terminator v. 3.1 cycle sequencing kit (Applied Biosystems, USA) on an ABI 3730XL DNA sequencer, according to the manufacturer's instructions. DNA polymorphisms were identified when assembled with the reference sequence using AutoAssembler software (Applied Biosystems).
The variable number tandem repeat (VNTR) region in intron-3 was amplified using the primer pair IL-4_del_70_F: 5'-GCCTTTAGATTCCACCACGAGTATG-3' and IL-4_del_70_R: 5'-GGTCATCTTTTCCTCCCCTGTATCTTA-3'. PCR products were size fractionated on a 2% agarose gel to detect the repeat polymorphisms. PCR amplicons of 389 bp (two repeats, each of 70 bp) were designated as R2, whereas amplicons of 459 bp (three repeats, each of 70 bp) were designated as R3. A subset of samples was reconfirmed and validated for R2 and R3 polymorphisms by direct sequencing.
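The size-based allele designation described above can be illustrated with a minimal sketch. It assumes band sizes have already been estimated from the gel; the function names and the 20 bp tolerance are hypothetical choices for illustration and are not part of the published protocol.

```python
# Illustrative sketch only: helper names and the tolerance are assumptions.
REPEAT_BP = 70
AMPLICON_BP = {"R2": 389, "R3": 459}  # amplicon sizes stated in the text

# Both amplicons imply the same flanking length: 389 - 2*70 = 459 - 3*70 = 249 bp.
FLANK_BP = AMPLICON_BP["R2"] - 2 * REPEAT_BP


def call_allele(band_bp, tolerance_bp=20):
    """Return the allele whose expected amplicon is closest to band_bp,
    or None if no expected size lies within tolerance_bp."""
    best = min(AMPLICON_BP, key=lambda a: abs(AMPLICON_BP[a] - band_bp))
    if abs(AMPLICON_BP[best] - band_bp) <= tolerance_bp:
        return best
    return None


def call_genotype(bands_bp, tolerance_bp=20):
    """Call a VNTR genotype from one band (homozygote) or two bands (heterozygote).
    Returns None if there are no bands or any band cannot be assigned."""
    calls = [call_allele(b, tolerance_bp) for b in bands_bp]
    if not calls or None in calls:
        return None
    alleles = sorted(set(calls))
    if len(alleles) == 1:
        return alleles[0] * 2  # e.g. "R2R2"
    return "".join(alleles)    # "R2R3"
```

For example, a single band near 389 bp would be called R2R2 and two bands near 389 bp and 459 bp would be called R2R3; ambiguous bands return None and would need confirmation by sequencing, as was done for a subset of samples.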

Statistical Analysis
The allele and genotype frequencies were estimated by simple gene counting and the expectation-maximization (EM) algorithm, and the significance of deviations from Hardy-Weinberg equilibrium was tested using the random-permutation procedure as implemented in the Arlequin v.3.5.1.2 software (http://cmpg.unibe.ch/software/arlequin3/) [33]. Pairwise Fst values and co-ancestry coefficients were calculated in Arlequin using un-phased data. Linkage disequilibrium (LD) analysis was performed using Haploview v4.2 software [34]. Allele and genotype distributions in the different sample sets were compared by chi-square test using SPSS (ver. 20). In all analyses, a two-tailed p-value less than 0.05 was considered significant. Chi-square contingency-table test results were interpreted by the standardized residual method of post hoc analysis [35]. The probable effects of sex stratification and sample size were verified by bootstrap (10000 random sampling events) with the bias-corrected and accelerated (BCa) method, using SPSS (ver. 20).
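As an illustration of gene counting and a Hardy-Weinberg check, the sketch below estimates allele frequencies from genotype counts and applies the textbook one-degree-of-freedom chi-square goodness-of-fit test. This is a simpler stand-in, not the random-permutation procedure implemented in Arlequin that the study used; the function names and any counts passed to them are hypothetical.

```python
import math


def allele_freqs(n_aa, n_ab, n_bb):
    """Gene counting for a biallelic locus: each homozygote contributes two
    copies of its allele and each heterozygote one copy of each."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return p, 1.0 - p


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test against Hardy-Weinberg proportions.
    Returns (statistic, p_value) with 1 degree of freedom.
    NOTE: a textbook approximation, not Arlequin's permutation test."""
    n = n_aa + n_ab + n_bb
    p, q = allele_freqs(n_aa, n_ab, n_bb)
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic locus: nothing to test
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

With hypothetical counts of 25, 50 and 25 for the three genotypes, the observed counts equal the Hardy-Weinberg expectations, so the statistic is 0 and the p-value is 1.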

Role of IL-4 Variants in Malaria
The distributions of IL-4 genotype and allele frequencies are summarized in Table 3 and Figure 1. Mainly two and three copies of the 70 bp repeat (intron-3 VNTR) have been observed in humans and are designated as R2 and R3, respectively, whereas only a single copy of the 70 bp repeat has been observed in other primates (Figure S1, www.ensembl.org). The genotype frequencies of the intron-3 VNTR polymorphism differed significantly among ethnically matched asymptomatic controls and individuals with mild and severe malaria (χ² = 42.2, df = 4; p < 0.001). The genotype frequencies of the promoter polymorphisms -590 C>T and -34 C>T also differed significantly among the studied malarial subgroups (χ² = 19.5, df = 4; p < 0.001 and χ² = 25.3, df = 4; p < 0.001). Similarly, allele distributions differed significantly among these groups (intron-3 VNTR: χ² = 30.3, df = 2; p < 0.001; -590 C>T: χ² = 13.6, df = 2; p < 0.001; and -34 C>T: χ² = 11.3, df = 2; p = 0.003). All three studied loci were in HW equilibrium and in strong LD (r² > 0.75) (Figure 2), with two major haplotypes, CCR3 and TTR2, identified (Table S2). Therefore, further analysis was performed only with the VNTR polymorphism or the resulting haplotypes. A significant difference was observed in the distribution of these haplotypes between cases and asymptomatic controls (OR = 0.552, 95% CI = 0.356-0.854, p = 0.009) (Table S2), with TTR2 as the protective haplotype.
As the proportion of male samples was higher in both the asymptomatic and case groups, we further analyzed the distribution of R2/R3 in these groups for a possible effect of sex stratification. The frequency of the protective allele R2 was significantly lower in both the male and female groups with mild and severe malaria than in asymptomatic controls, which rules out an effect of sex stratification (Table S3).
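The degrees of freedom reported with the chi-square statistics follow from the table dimensions, (rows - 1) × (columns - 1): three clinical groups by three genotypes gives 4, and three groups by two alleles gives 2. The sketch below, with a hypothetical function name and hypothetical counts, computes the Pearson statistic and its degrees of freedom for any contingency table; it is an illustration of the test, not a reproduction of the SPSS analysis.

```python
def chi2_contingency_stat(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table
    of counts (list of rows). Expected counts are row_total * col_total / n."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical genotype counts (columns: R2R2, R2R3, R3R3) for three groups
# (rows: asymptomatic, mild, severe). These are NOT the study's data.
example = [
    [40, 70, 46],
    [10, 40, 51],
    [15, 60, 98],
]
stat, df = chi2_contingency_stat(example)  # df == 4 for a 3 x 3 table
```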

Association of Haplotype TTR2 with Ancestral and Tribal Indian Population
Among 3613 individuals from a total of 76 endogamous Indian ethnic populations investigated, we found a striking pattern of genotype prevalence based on social status and habitat. Most of the studied populations were in HW equilibrium (Table 5); those that were not were excluded from further analysis. Based on their social status, habitat, co-ancestry coefficient and pairwise Fst (Figure S2), we initially grouped these populations into three groups, namely caste, nomadic and tribal populations. Caste populations were further grouped into traditionally upper, middle and lower caste populations to see whether there was any difference between these groups. Our study did not find any significant difference in genotype distribution among these groups (Figure S3). Further, the tribal group was divided into two groups: the tribe (which started agriculture recently) and the ancestral tribal population [(ATP), which includes all hunter-gatherer tribal populations]. The genotype distributions of the three loci (-590C/T, -34C/T and the intron-3 VNTR) differed significantly among the four classified subgroups (χ² = 501.1, df = 6; p < 0.001, χ² = 326.9, df = 6; p < 0.001 and χ² = 323.2, df = 6; p < 0.001, respectively) (Table 6). In addition, allelic distributions also differed significantly (Table 6). We found these markers to be in strong LD, with r² > 0.87 (Figure 2), and with a significant difference in haplotype distribution among the four subgroups (χ² = 182.95, df = 3, p-value < 0.001) (Table S2). The frequency of the protective haplotype TTR2 was highest in ATP (40.5%), intermediate in tribes (33%), and lowest in caste (17.8%) and nomadic (21.6%) populations (Figure 3).
Since there are four major linguistic families in India, we classified the samples into four groups, namely Indo-European (IE), Dravidian (DV), Austro-Asiatic (AA) and Tibeto-Burman (TB). Each linguistic family was further classified into caste, tribe and ATP. Interestingly, the difference in genotype distribution among caste, tribe and ATP follows the same pattern as the TTR2 haplotype distribution, with ATP highest, then tribe, and caste lowest (Figure 4).
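For readers unfamiliar with the r² measure of linkage disequilibrium quoted above, the following sketch computes it for two biallelic loci from haplotype frequencies. The study obtained its values from Haploview; this stand-alone function, its name and the example frequencies are illustrative assumptions only.

```python
def ld_r2(hap_freq):
    """r^2 between two biallelic loci from haplotype frequencies.
    hap_freq maps two-letter haplotypes to frequencies, where the first letter
    is the allele at locus 1 and the second the allele at locus 2
    (keys "TT", "TC", "CT", "CC"); frequencies must sum to 1."""
    f_tt = hap_freq.get("TT", 0.0)
    p1 = f_tt + hap_freq.get("TC", 0.0)  # T frequency at locus 1
    p2 = f_tt + hap_freq.get("CT", 0.0)  # T frequency at locus 2
    d = f_tt - p1 * p2                   # disequilibrium coefficient D
    return d * d / (p1 * (1.0 - p1) * p2 * (1.0 - p2))


# Hypothetical frequencies in which most chromosomes carry TT or CC:
example_r2 = ld_r2({"TT": 0.28, "TC": 0.02, "CT": 0.01, "CC": 0.69})
```

When only the two "pure" haplotypes are present, r² equals 1; when alleles at the two loci combine at random, r² is 0. A value above 0.75, as reported for these markers, means that the allele at one locus predicts the allele at the other well, which is why the analysis could be carried out on the haplotypes.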

Geographical Distribution of IL-4 Variants
Comparison of all three studied variants with other world populations (Syrian, Brazilian and Vietnamese) revealed diverse geographical patterns. We retrieved previously reported data and compared them with the Indian populations. We found that the Vietnamese have a very high R2R2 genotype frequency (65.7%), similar to other Southeast Asian countries. However, the R2R2 genotype frequencies in Syrians and Brazilians were only 4.8% and 6%, respectively (Figure S4).

Discussion
We screened the regulatory polymorphisms of the IL-4 promoter region (rs2243250, -590C>T and rs2070874, -34C>T) and the intron-3 repeat region (rs79071878, 70 bp VNTR) to investigate their possible role in survival against disease, pathogens or infection, employing three different approaches: (1) a case-control malaria cohort from Orissa and Chhattisgarh, malaria endemic regions of India; (2) a comprehensive assessment in seventy-six ethnically, linguistically and geographically diverse populations living across India; and (3) evaluation of samples from Brazil, Syria and Vietnam and comparison of the results with Indian populations.
We observed significant differences in genotype and allele frequency distribution, along with haplotype carriage, between the three groups of the malaria case-control study. The mild and severe groups did not differ from each other, but both differed significantly from the asymptomatic controls. We also found that these regulatory markers are in strong LD (r² > 0.75) and that the resultant haplotype TTR2 was associated with asymptomatic status and CCR3 with malaria cases (OR = 0.552, 95% CI = 0.356-0.854, p = 0.009) (Table S2). Malaria outcome is the result of a complex interaction of a large number of factors, and the pathogenesis might be regulated by various mechanisms. However, our findings show that the high IL-4 producing haplotype TTR2 protects against, or increases survival from, Plasmodium falciparum malaria. It has been reported that carriers of TT with cerebral malaria had elevated total IgE compared to non-carriers, suggesting that IL-4 plays a regulatory role in the pathogenesis of malaria in Ghanaian children [8]. Further, an association between the IL-4 -590T allele and a lower prevalence of Plasmodium falciparum infection in the asymptomatic Fulani population of Mali has been documented [36]. Several findings have shown an association of -590T, -34T and the intron-3 VNTR R2 polymorphism with high levels of serum IL-4 and consequently high levels of total IgG, IgE, anti-plasmodium IgG and IgE, and with severity of infection in several populations of malaria endemic regions across the globe [16,17,18,19,21,22,23,24,25,37].
It has been established that the high levels of pro-inflammatory cytokines (TNF-α, IFN-γ) produced during malaria infection lead to severe pathogenesis [38,39,40,41,42,43]. These cytokines up-regulate the expression of endothelial adhesion molecules (ICAM-1) in brain and kidney, which facilitates increased sequestration of parasitized RBC within the microvasculature of these organs [38,39]. This increased sequestration of parasitized RBC leads to cerebral malaria and renal failure. Further, an inhibitory effect of TNF-α on erythropoiesis and subsequent severe malarial pathogenesis has been demonstrated in humans and in the mouse model of experimental cerebral malaria (ECM) [41,42,43]. In the ECM mouse model, treatment with the anti-inflammatory cytokines IFN-α and IFN-β inhibited cerebral malaria and reduced the parasite burden [39,40]. IFN-β treatment down-regulates the pro-inflammatory mediators TNF-α, IFN-γ, ICAM-1, CXCL9 and CXCR3 [40]. These studies indicate that a check on pro-inflammatory cytokines by anti-inflammatory cytokines can lead to enhanced survival against Plasmodium falciparum malaria. This supports our finding that the high IL-4 producing haplotype, TTR2, confers decreased susceptibility to Plasmodium infection. In contrast to the generally accepted view that IL-4 secreting CD4+ T cells are anti-inflammatory mediators and suppress the pro-inflammatory response, CD4+ T cells have been found to be crucial to the development of the pro-inflammatory CD8+ T cell response against Plasmodium sporozoite infected hepatocytes [44,45,46]. It has been observed that protective circumsporozoite protein (CSP) specific CD8+ T cells originate early in the cutaneous lymphoid tissue of the infected site and then migrate to other sites, including the liver [47,48]. Development of this immunity requires IL-4 mediated cross talk between CD4 and CD8 cells [44].
The CSP specific CD8+ T cells, which are primed in the presence of IL-4 signals, differentiate into effector memory CD8+ T cells, whereas in the absence of IL-4 the response fails to develop further after a few days and is reduced by more than 90% compared with that in the presence of CD4+ T cells. These recent reports further support our findings [44].
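To make the reported odds ratio easier to interpret, the sketch below computes an odds ratio with a Wald (Woolf) 95% confidence interval from a 2 x 2 table. The study does not state which interval method was used, so this choice is an assumption, and the function name and counts are hypothetical. An odds ratio below 1 with an interval that excludes 1 indicates that the haplotype is less common among cases, that is, protective.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2 x 2 table.
    a: cases carrying the haplotype      b: cases not carrying it
    c: controls carrying the haplotype   d: controls not carrying it
    All four counts must be positive. z = 1.96 gives a 95% interval."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical haplotype counts; NOT the study's data.
or_, lo, hi = odds_ratio_ci(10, 20, 20, 10)
```

A Wald interval is symmetric on the log scale, so the odds ratio is the geometric mean of its limits. The reported limits give sqrt(0.356 × 0.854) ≈ 0.551, close to the reported OR of 0.552, which is consistent with an interval of this general form.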
In this study, among Indian populations, we found that the R2R2 and R2R3 genotypes were overrepresented in ATP and underrepresented in caste populations, and vice versa. No significant difference in genotype or allele distribution was found between caste and nomadic populations. Our Y-chromosomal marker based population study suggests that this deviation from the general perception is due to recent admixture and gene flow between the caste and nomadic populations (our unpublished data). However, significant differences were found between all other groups. Every Indian population maintains its unique genetic architecture, mainly due to the practice of endogamous marriage over the last thousands of years. This is well supported by our earlier studies using mtDNA, Y chromosome and autosomal genetic markers [30,31]. We found that these three markers are in strong LD (r² > 0.87), with two main haplotypes, TTR2 and CCR3. The protective haplotype TTR2 was significantly more frequent in ATP than in tribe and caste populations, with tribes at an intermediate frequency. The ATPs live in isolated forests and mainly practice a hunter-gatherer lifestyle; hence, they are under constant exposure to helminths and various other parasites. Therefore, positive selection [20] might be operating on the IL-4 locus of the ATPs compared to caste populations, who practice a modern lifestyle and are exposed to modern medicine. This is the first study of its kind to examine IL-4 variation in such depth in diverse Indian populations. We also observed that a few populations (Bodo, Gadaba, Baiga, Toda, Malai Kuruwar) depart significantly from HW equilibrium, which might be the result of positive selection or a founder effect, as their population sizes are very small and they follow very strict endogamy.
Apart from their role in controlling malaria and other pathogenic diseases, the IL-4 polymorphisms (-590T, -34T and intron-3 VNTR R2) have been found to be associated with end stage renal disease, multiple sclerosis, autoimmune Graves' disease, polyarthritis, rheumatoid arthritis, asthma, rhinitis and atopic dermatitis [21,26,27,28,29]. The Th2 response also mediates the inflammatory response to helminth infection [49]. This indicates that ATP and tribal populations have greater survival potential than caste populations not only against autoimmune and allergic disease but also against extracellular helminthic infection. In conclusion, the IL-4 -590T, -34T and intron-3 VNTR R2 alleles are associated with enhanced survival against malaria and other extracellular pathogens in Indian populations. However, their role needs to be assessed further for other infectious, inflammatory and autoimmune diseases. This observation may assist in identifying individuals at high risk and hence in disease management. These linked markers, along with other markers in LD, can shift the balance of the cytokine profile, and hence the TH1 and TH2 response in an individual, to an extent where it can also be deleterious. Thus, a delicate balance of various cytokines is more important than any specific one. Hence, for a detailed understanding, other regulators need to be studied among Indian populations. Our study also emphasizes the importance of host genetics in resistance or susceptibility to infectious disease.

Figure S1 Multiple sequence alignment of the IL4 intron-3 VNTR (70 bp repeat) region of six primates. Mostly two and three copies of the repeat have been observed in humans, whereas only a single copy of the 70 bp repeat has been observed in other primates (www.ensembl.org).