Geographical distribution of complement receptor type 1 variants and their associated disease risk

Background: Pathogens exert selective pressure which may lead to substantial changes in host immune responses. The human complement receptor type 1 (CR1) is an innate immune recognition glycoprotein that regulates the activation of the complement pathway and removes opsonized immune complexes. CR1 genetic variants in exon 29 have been associated with expression levels, C1q or C3b binding and increased susceptibility to several infectious diseases. Five distinct CR1 nucleotide substitutions determine the Knops blood group phenotypes, namely Kn a/b, McC a/b, Sl1/Sl2, Sl4/Sl5 and KCAM+/-.
Methods: CR1 variants were genotyped by direct sequencing in a cohort of 441 healthy individuals from Brazil, Vietnam, India, Republic of Congo and Ghana.
Results: The distribution of the CR1 alleles, genotypes and haplotypes differed significantly among geographical settings (p≤0.001). CR1 variants rs17047660A/G (McC a/b) and rs17047661A/G (Sl1/Sl2) were polymorphic only in the African populations and not in the groups from Asia and South America, strongly suggesting that these two SNPs may be subject to selection. This is further substantiated by a high linkage disequilibrium between the two variants in the Congolese and Ghanaian populations. A total of nine CR1 haplotypes were observed. The CR1*AGAATA haplotype was found more frequently among the Brazilian and Vietnamese study groups; the CR1*AGAATG haplotype was frequent in the Indian and Vietnamese populations, while the CR1*AGAGTG haplotype was frequent among Congolese and Ghanaian individuals.
Conclusion: The African populations included in this study might have a selective advantage conferred by immune genes involved in pathogen recognition and signaling, possibly contributing to disease susceptibility or resistance.

Introduction
Complement receptor type 1 (CR1) is widely recognized to play a role in disease pathophysiology, diagnosis, prognosis and therapy [1]. The gene encoding human CR1 is located on chromosome 1 (1q32.2; OMIM 120620) [2][3][4]. CR1 belongs to the regulator of complement activation (RCA) family and is a single-chain type 1 transmembrane glycoprotein that occurs in either membrane-bound or soluble form [2,5]. CR1 is predominantly involved in the transport of circulating immune complexes to the reticuloendothelial system. CR1 acts as a regulator in the three pathways of the complement system [2], namely the classical, the lectin and the alternative pathway. It enhances phagocytosis of opsonized particles together with the complement components C3b, C4b, C1q, mannose-binding lectin and ficolin-2, thereby facilitating clearance of opsonized immune complexes. In the presence of Factor I, CR1 suppresses the complement cascade by inactivating C3b and C4b [6]. CR1 comprises 30 short consensus repeats (SCRs), also known as complement control protein repeats (CCPs). Four protein isoforms have been identified based on their molecular weight and the number of CR1 exons [3]. Groups of seven CCPs are organized into four long homologous repeats (LHRs A to D) [7,8].
CR1 is also expressed on cells involved in both innate and adaptive immune responses [9][10][11]. Erythrocyte CR1 binds to circulating immune complexes and to complement-coated particles and transports them to the liver or spleen for subsequent phagocytosis [2,3]. CR1-deficient mice showed decreased and delayed IgM and IgG responses to West Nile virus, resulting in increased mortality [12]. Moreover, in vitro studies have shown that CR1 has distinct adjuvant properties [13][14][15][16], probably due to its involvement in the uptake of antigen by antigen-presenting cells [17].
To evade the host's immune system, pathogens bind to complement receptors and other regulatory proteins to facilitate their uptake by host cells. This may considerably downregulate and impair the function of the complement system [24]. For instance, CR1 has been reported to facilitate the entry of intracellular pathogens into host cells, and CR1 protein levels are associated with disease susceptibility. Among protozoan parasites,

CR1 genotyping
To assess the distribution of six functional variants [rs17259045, rs41274768 (Kn a/b), rs17047660 (McC a/b), rs17047661 (Sl1/Sl2), rs4844609 (Sl4/Sl5), rs6691117 (KCAM+/-)], the complete CR1 exon 29, including its intron-exon boundaries, was screened by direct sequencing in the 441 DNA samples (Table 1). A fragment of 884 bp in exon 29 of the CR1 gene was amplified by polymerase chain reaction (PCR) using the CR1 locus-specific primers CR1F (5'-TCT TCA TAA ATA ATG CCA GAA GTG G-3') and CR1R (5'-TGC CAA TTT CAT AGT CCT TAT ACA C-3'). PCR amplifications were carried out in a 25 μl reaction mixture containing 10X PCR buffer, 3.0 mM MgCl2, 0.2 mM dNTPs, 0.2 μM of each primer, 1 unit of Taq polymerase (Qiagen GmbH, Hilden, Germany) and 20 ng of genomic DNA on a TProfessional Basic Thermocycler (Biometra GmbH, Göttingen, Germany). Cycling parameters were initial denaturation at 94 °C for 5 minutes, followed by 40 cycles of denaturation at 94 °C for 30 seconds, annealing at 55 °C for 30 seconds and elongation at 72 °C for 1 minute, and a final elongation step at 72 °C for 10 minutes. PCR fragments were stained with SYBR Safe DNA Gel Stain (Invitrogen, Carlsbad, USA) and visualized on 1.5% agarose gels. PCR products were subsequently purified using Exo-SAP-IT (USB, Affymetrix, Santa Clara, CA, USA), and the purified products were used directly as templates for sequencing with the Big-Dye terminator v. 1.1 cycle sequencing kit (Applied Biosystems, Foster City, CA, USA) on an ABI 3130XL DNA sequencer according to the manufacturer's instructions. DNA polymorphisms were identified by assembling the sequences with the CR1 reference sequence (NM_000573) using Geneious v9.1.4 software (Biomatters Ltd, Auckland, New Zealand) and reconfirmed visually from their respective electropherograms.
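As a quick plausibility check on the cycling parameters described above, the programmed hold time of the PCR run can be added up. The following is a minimal sketch in Python; the constant and function names are illustrative and not part of the original protocol, and ramp times between temperatures are ignored, so the actual run would take longer.

```python
# Hold times taken from the cycling parameters in the text, in seconds.
INITIAL_DENATURATION_S = 5 * 60
CYCLES = 40
STEPS_PER_CYCLE_S = {"denaturation": 30, "annealing": 30, "elongation": 60}
FINAL_ELONGATION_S = 10 * 60


def programmed_hold_seconds() -> int:
    """Sum of all programmed holds; temperature ramp times are not included."""
    per_cycle = sum(STEPS_PER_CYCLE_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S


total = programmed_hold_seconds()
print(f"{total} s ({total // 60} min)")  # prints: 5700 s (95 min)
```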

Statistical analysis
Statistical analyses were performed using the GraphPad Prism 3.0 software package (GraphPad Software, La Jolla, CA, USA) and Stata 12.0 (StataCorp, College Station, TX, USA). Chi-square and two-tailed Fisher's exact tests were calculated to determine the differences of

Results
The frequencies of CR1 genotypes in the five populations were in Hardy-Weinberg equilibrium (p>0.05). The allele and genotype frequencies of the CR1 SNPs rs17259045, rs17047660 (McC a/b), rs17047661 (Sl1/Sl2) and rs6691117 (KCAM+/-) differed significantly among the groups (p 0.01) (Table 1). Genotype frequencies of the CR1 variants rs41274768 (Kn a/b) and rs4844609 (Sl4/Sl5) did not differ. The rs17259045AG genotype and the rs17259045G allele were more frequent in the Brazilian population. Moreover, G carriers (AG and GG) and the G allele of variants rs17047660 (McC a/b), rs17047661 (Sl1/Sl2) and rs6691117 (KCAM+/-) were observed more commonly in the two African populations (Republic of Congo, Ghana). Interestingly, among Congolese and Ghanaian individuals the minor allele of SNPs rs17259045A/G, rs41274768G/A (Kn a/b) and rs4844609T/A (Sl4/Sl5) did not occur at all; this allele was observed exclusively in Brazilian individuals. Except for rs6691117 (KCAM+/-), the Vietnamese population was monomorphic. The Indian group was monomorphic for three of the SNPs, but not for rs17259045, rs41274768 (Kn a/b) and rs6691117 (KCAM+/-). Brazilian individuals were polymorphic for all SNPs (Table 1). The Knops blood antigen distribution among the studied populations is summarized in Table 2.
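The Hardy-Weinberg check reported above can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test for a single biallelic SNP. This is a minimal sketch only: the function name and the genotype counts are hypothetical and are not taken from Table 1, and the test actually used by the authors' software may differ (for rare alleles an exact test is usually preferred).

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (chi2, p_value) for Hardy-Weinberg equilibrium at a biallelic SNP.

    n_aa, n_ab, n_bb are observed genotype counts (two homozygotes and the
    heterozygote). The p-value uses the chi-square distribution with 1 df.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1.0 - p                      # frequency of allele b
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Classes with zero expectation (monomorphic SNP) contribute nothing.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(50, 40, 10)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05, as in this hypothetical example, would be read as no detectable deviation from Hardy-Weinberg proportions.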
Haplotypes were reconstructed from the six CR1 variants. A total of nine haplotypes were observed. The haplotype distributions are summarized in Table 3 and Fig 1. The CR1*AGAATA haplotype was more frequent among the Brazilian and Vietnamese populations; CR1*AGAATG occurred frequently among the Indian and Vietnamese groups, while CR1*AGAGTG was observed frequently among Congolese and Ghanaian individuals. The CR1*AGGGTG and CR1*AGAGTG haplotypes were observed only in Brazil and Africa, being far more frequent among the Congolese and Ghanaian groups. Interestingly, CR1*GGAATA was exclusively observed in the Brazilian population. Linkage disequilibrium (LD) analysis between SNPs revealed medium levels of LD for SNPs rs17047661 (Sl1/Sl2) and rs6691117 (KCAM+/-) and for rs17047660 (McC a/b) and rs17047661 (Sl1/Sl2) in the Congolese and Ghanaian study groups (Fig 2).
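Pairwise LD of the kind summarized above is commonly expressed as D, D' and r². The sketch below computes these from phased two-locus haplotype counts. It is an illustration under stated assumptions: the function name and the counts are hypothetical, the haplotypes are assumed to be already phased, and the text does not say which software or LD measure was used for Fig 2.

```python
from typing import Tuple


def pairwise_ld(n11: int, n12: int, n21: int, n22: int) -> Tuple[float, float, float]:
    """Return (D, D_prime, r2) for two biallelic loci.

    n11..n22 are counts of phased haplotypes carrying alleles
    (A1,B1), (A1,B2), (A2,B1) and (A2,B2), respectively.
    """
    n = n11 + n12 + n21 + n22
    p_a1 = (n11 + n12) / n  # frequency of allele A1
    p_b1 = (n11 + n21) / n  # frequency of allele B1
    p_a2, p_b2 = 1.0 - p_a1, 1.0 - p_b1
    if min(p_a1, p_a2, p_b1, p_b2) == 0:
        raise ValueError("LD is undefined when either locus is monomorphic")
    d = n11 / n - p_a1 * p_b1
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a1 * p_b2, p_a2 * p_b1)
    else:
        d_max = min(p_a1 * p_b1, p_a2 * p_b2)
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a1 * p_a2 * p_b1 * p_b2)
    return d, d_prime, r2


# Hypothetical haplotype counts, for illustration only.
d, d_prime, r2 = pairwise_ld(60, 10, 10, 20)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

D' close to 1 indicates that at least one of the four possible haplotypes is essentially absent, whereas r² also reflects how similar the allele frequencies at the two loci are, so the two measures can rank SNP pairs differently.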

Discussion
Pathogens exert strong selective pressure on the human host while evading its immune responses, leading to substantial changes in host immune regulation. This study used samples from populations exposed to diverse infectious diseases, in which the pathogens exert strong selective pressure on human immune loci. The samples come from different case-control cohorts investigated for possible associations of CR1 variants with different infectious diseases (unpublished data). The Brazilian, Vietnamese and Indian samples are from areas endemic for Chagas disease, viral hepatitis and leprosy, respectively. The Congolese and Ghanaian samples are from malaria holoendemic sites. CR1 genetic variants in exon 29 are associated with CR1 expression levels, C1q or C3b binding activity and increased susceptibility to various infectious diseases. This study investigated the entire exon 29 of CR1 in five diverse populations in order to assess the distribution of Knops blood group antigens and the distinct functional CR1 SNPs. Such studies of geographically diverse populations can provide insights into how these CR1 alleles have spread in populations and contribute to the understanding of natural selection.
Allele and genotype frequencies of CR1 variants in exon 29 [rs17259045, rs41274768 (Kn a/b), rs17047660 (McC a/b), rs17047661 (Sl1/Sl2), rs4844609 (Sl4/Sl5), rs6691117 (KCAM+/-)], as well as their haplotype frequencies, were differently distributed among the Brazilian, Vietnamese, Indian, Congolese and Ghanaian study groups. So far, the frequencies of these variants, and especially the distribution of the blood group antigens, have not been described explicitly for Central African populations.
CR1 variants rs17047660A/G (McC a/b) and rs17047661A/G (Sl1/Sl2) were polymorphic only in the African groups and not in those from Asia and Brazil, suggesting that the frequencies of these two SNPs result from strong selective pressure exerted by exposure to distinct pathogens, especially Plasmodium falciparum. This is substantiated by a high linkage disequilibrium between the two variants. Of the reconstructed CR1 haplotypes, CR1*AGAGTG and CR1*AGGGTG were observed almost exclusively in the Congolese and Ghanaian groups. CR1*AGAGTG carries the rs17047660A allele; this locus determines the Knops blood group antigen McC a/b. Studies have demonstrated that this blood group antigen is dominant among many ethnic groups of African ancestry living in malaria endemic regions [34].
Higher rates of adaptive evolution are expected to occur especially in genes involved in the immune system, as these gene loci coevolve with pathogens. Two factors largely contribute to this: the genetics of the population and natural selection. Immune genes tend to evolve rapidly because selection pressure changes continuously with the various pathogenic challenges. Therefore, positive selection at the rs17047660A/G (McC a/b) and rs17047661A/G (Sl1/Sl2) loci is expected in sub-Saharan African populations exposed to distinct pathogenic challenges (e.g. falciparum malaria). Such a selective advantage occurs mainly in immune genes involved in pathogen recognition and signaling, and CR1 is one such gene involved in innate immunity.
In addition, the frequencies of these two loci, rs17047660A/G (McC a/b) and rs17047661A/G (Sl1/Sl2), reported in this study were in accordance with frequencies observed in other
There is growing evidence of ethnic differences in susceptibility to some infectious diseases and of genetic adaptation to diverse pathogens [18,35]. This study investigated five antigens of the Knops blood group, including the Knops (rs41274768, Kn a/b, p.N1540S), the McCoy (rs17047660, McC a/b, p.K1590E), the Swain-Langley/Villien (rs17047661, Sl1/Sl2, p.R1601G), the Swain-Langley (rs4844609, Sl4/Sl5, p.T1610S), and the KCAM antigens (rs6691117, KCAM+/-, p.I1615V) [19][20][21][22][23]. These Knops blood group polymorphisms have been associated with various infectious diseases (Table 4). In particular, the two Knops blood group variants McC b (rs17047660G, K1590E) and Sl2 (rs17047661G, R1601G) have specific distributions among African populations, which have been related to selective pressure by malaria in Africa [36][37][38][39][40][41][42]. The substitution of lysine by glutamic acid at amino acid position 1590 modulates the epitope conformation and serologic reactivity because the residue is surface-exposed, affecting the overall CR1 binding capacity [22]. A high frequency of the rs17047661G (Sl2) allele was observed in the African groups. The high frequency of the rs6691117G (KCAM-, I1615V) allele in Africa and India indicates that this allele, similar to the rs17047660G (McC b) and rs17047661G (Sl2) alleles, might also be subject to selection. The presence of rs17047661G (Sl2), which is almost limited to African populations, suggests that rs17047661A (Sl1) may be the ancestral allele [43]. A differential distribution of the rs6691117A/G (KCAM+/-) variants was also observed. For instance, in the Vietnamese and Brazilian groups, rs6691117A (KCAM+) is the major allele, while rs6691117G (KCAM-) was the major allele in Africa.
A study from India compared exon 29 CR1 variants in endemic and non-endemic populations and concluded that a differential association with falciparum malaria in regions of varying disease endemicity exists [44]. However, the Indian samples from the present study originate from an area not endemic for malaria.
Taken together, this study revealed significant differences in allele, genotype and haplotype frequencies of CR1 SNPs in five populations. A limitation of this study might be its small sample size. However, this study, the first to include a population from Central Africa, may provide an increased understanding of the contribution of red blood cell phenotypes and the complement regulator protein with regard to possible associations with infectious diseases. Further studies with larger sample sizes are warranted to determine the role of CR1 in disease associations and pathogenesis mechanisms.