Determining the Population Frequency of the CFHR3/CFHR1 Deletion at 1q32

In this study we have used multiplex ligation-dependent probe amplification (MLPA) to measure the copy number of CFHR3 and CFHR1 in DNA samples from 238 individuals from the UK and 439 individuals from the HGDP-CEPH Human Genome Diversity Cell Line Panel. We have then calculated the allele frequency and frequency of homozygosity for the copy number polymorphism represented by the CFHR3/CFHR1 deletion. There was a highly significant difference between geographical locations in both the allele frequency (χ² = 127.7, DF = 11, P-value = 4.97 × 10⁻²²) and frequency of homozygosity (χ² = 142.3, DF = 22, P-value = 1.33 × 10⁻¹⁹). The highest frequency for the deleted allele (54.7%) was seen in DNA samples from Nigeria and the lowest (0%) in samples from South America and Japan. The observed frequencies, in conjunction with the known association of the deletion with AMD, SLE and IgA nephropathy, are in keeping with differences in the prevalence of these diseases in African and European Americans. This emphasises the importance of identifying copy number polymorphism in disease.


Introduction
Complement genes within the RCA (Regulators of Complement Activation) cluster at chromosome 1q32 are arranged in tandem within two groups [1]. In a centromeric 360 kb segment lie the genes for factor H (CFH) (OMIM 134370) and five factor H-related proteins: CFHR1 (OMIM 134371), CFHR2 (OMIM 600889), CFHR3 (OMIM 605336), CFHR4 (OMIM 605337) and CFHR5 (OMIM 608593). Sequence analysis of this region shows evidence of segmental duplications (SDs) resulting in a high degree of sequence identity between CFH and the genes for the five factor H-related proteins [2,3,4]. SDs such as those seen in the RCA cluster are frequently associated with genomic rearrangements [5]. These usually occur as a result of non-allelic homologous recombination (NAHR) between SDs but can also be a result of gene conversion and microhomology-mediated end joining (MMEJ) [6]. Genomic rearrangements at this locus have affected CFH and the CFHRs in a number of ways. Deletions as a result of NAHR lead to the loss of CFHR1, CFHR3 and CFHR4. Deletions within genes, occurring through both NAHR and MMEJ, result in the formation of hybrid genes (CFH/CFHR1, CFHR1/CFH, CFH/CFHR3, CFHR3/CFHR1) associated with diseases such as atypical haemolytic uraemic syndrome (aHUS) and membranoproliferative glomerulonephritis (MPGN) [7,8,9]. Complete deficiency of factor H-related proteins 1 and 3 had been found in protein studies to occur in ~4% of a European population before DNA studies of the region were undertaken [10]. This DNA copy number polymorphism (CNP) has been extensively characterised in health and disease. It has been shown that the deletion is associated with the presence of factor H autoantibodies in aHUS [11,12], with an increased risk of SLE [13] and with a decreased risk of age-related macular degeneration [14] and IgA nephropathy [15,16].
That there might be differences in the population frequency of the CFHR3/1 deletion was suggested by a study published in 2006 which showed that the prevalence of homozygous deletion in African populations was ~16% [17]. Population differences in the frequency of the deletion have been confirmed in subsequent studies [13,18,19]. In this study we have measured copy number of CFHR3 and CFHR1 with multiplex ligation-dependent probe amplification (MLPA) [20] in a range of populations derived from the HGDP-CEPH Human Genome Diversity Panel (http://www.cephb.fr/en/hgdp/diversity.php) [21].

Materials and Methods
The UK samples, from the Health Protection Agency [22,23], were originally obtained from a control population of randomly selected, non-related UK Caucasian blood donors. The full collection of samples within the HGDP-CEPH panel consists of 1051 individuals from 51 world populations (http://www.cephb.fr). We selected for analysis 439 samples from 17 different countries comprising 25 different populations (Table 1). We did not include populations for which data are already available (for example European populations such as France) or for which sample numbers were too small to be representative. Some of the selected populations still had a small number of samples, including those from the sub-Saharan region of Africa. The populations were therefore combined into 11 geographical locations (Table 2) for subsequent analysis. In each of these locations the number of samples was greater than 20. In total 133 samples from African populations were analysed, including 83 from sub-Saharan countries. CFHR1 and CFHR3 copy number was measured as described previously [24] by multiplex ligation-dependent probe amplification (MLPA) [20], using a kit from MRC Holland (www.mlpa.com) (SALSA MLPA kit P236-A1 ARMD) and in-house probes.

Statistics
Chi-square analysis was used to test whether there was deviation from Hardy-Weinberg equilibrium in the geographical locations. A p value of <0.05 was considered not to be consistent with Hardy-Weinberg equilibrium. Chi-square analysis was undertaken to determine whether there was a significant difference between geographical locations in the allele frequency of the CFHR3/CFHR1 deletion, the genotype frequencies (del++/del+−/del−−) or the frequency of a homozygous deletion (del++) of CFHR3/CFHR1. Fisher's exact tests were undertaken to determine whether, in different geographical locations, the allele frequency of the CFHR3/CFHR1 deletion, the genotype frequencies (del++/del+−/del−−) or the frequency of a homozygous deletion of CFHR3/CFHR1 were significantly different from the values for these variables in the UK population, or from their values in all other populations combined.
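As an illustration of the Hardy-Weinberg test described above, the following is a minimal sketch rather than the analysis code used in this study. It assumes genotype counts are available per location; the function name and the example counts are hypothetical. The deleted-allele frequency is taken from the genotype counts, expected counts are derived from it, and the chi-square statistic is compared with a distribution with one degree of freedom.

```python
import math

def hwe_chi_square(n_del_del, n_del_wt, n_wt_wt):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Arguments are genotype counts: homozygous deletion (del++),
    heterozygous deletion (del+-) and no deletion (del--).
    Returns (deleted-allele frequency, chi-square statistic, p value).
    """
    n = n_del_del + n_del_wt + n_wt_wt
    # Each homozygote carries two deleted alleles, each heterozygote one.
    q = (2 * n_del_del + n_del_wt) / (2 * n)
    p = 1 - q
    observed = (n_del_del, n_del_wt, n_wt_wt)
    expected = (q * q * n, 2 * p * q * n, p * p * n)
    # Skip classes with zero expected count (deletion absent or fixed).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value

# Hypothetical counts for one location, for illustration only.
freq, chi2, p_value = hwe_chi_square(10, 20, 70)
print(f"deleted-allele frequency = {freq:.3f}, chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

With the threshold stated above, a p value below 0.05 from such a calculation would be read as not consistent with Hardy-Weinberg equilibrium.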

Results
The allele frequency of the CFHR3/CFHR1 deletion in the various geographical locations and the individual populations within these locations is shown in Table 2. There was no deviation from Hardy-Weinberg equilibrium in any of the geographical locations. The CFHR3/CFHR1 deletion was not present in either the South American or Japanese locations. The highest allele frequency for the deletion was 54.7% in Nigeria. The deletion was Africa and Russia but was significantly higher in sub-Saharan Africa and Nigeria. The CFHR3/CFHR1 deletion was not found in homozygosity in Mexico, South America, China, Japan, Pakistan or Siberia. The frequency of homozygous deletion was 3.4% in the UK, between 5% and 10% in Italy, Russia, North Africa and sub-Saharan Africa, and 33.3% in Nigeria. Differences in genotype frequencies between geographical locations were highly significant (χ² = 142.3, DF = 22, P-value = 1.33 × 10⁻¹⁹). Differences in the frequency of the homozygous del++ genotype were also highly significant (χ² = 56.8, DF = 11, P-value = 3.66 × 10⁻⁸). The level of statistical significance derived using Fisher's exact test for the del++/del+−/

Discussion
In this study we have used multiplex ligation-dependent probe amplification (MLPA) to determine the copy number of CFHR3 and CFHR1 in a variety of different geographical locations derived from the HGDP-CEPH collection. MLPA has the advantage over other techniques that have been used in that it provides a specific determination of copy number. We measured copy number of both CFHR3 and CFHR1 to determine the deleted allele frequency because a reduced CFHR1 copy number alone is not specific to this allele: it also occurs with the CFHR1/CFHR4 deletion [24,25]. Using MLPA we have been able to determine both the allele frequency of the deletion and the frequency of a homozygous deletion. For statistical purposes we have set the UK population as our reference. The value of 3.4% for the frequency of a homozygous deletion in the UK population in this study is similar to values that we have obtained in previous studies [11,24] (Table 4). The allele frequency of 18.3% that we have obtained in this study is likewise similar to the value that we obtained on introduction of the MLPA assay (17.3% in Moore et al [24]).
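The relationship between the two quantities reported here can be checked with a short calculation. The sketch below is illustrative only and assumes Hardy-Weinberg proportions, under which the expected frequency of a homozygous deletion is the square of the deleted-allele frequency. The function name is hypothetical; the allele and homozygous frequencies are those reported in this study for the UK and Nigeria.

```python
def expected_homozygous_frequency(allele_frequency):
    """Expected frequency of homozygous deletion under Hardy-Weinberg proportions."""
    return allele_frequency ** 2

# (location, deleted-allele frequency, observed homozygous frequency),
# values as reported in this study.
reported = [("UK", 0.183, 0.034), ("Nigeria", 0.547, 0.333)]

for location, allele_freq, observed in reported:
    expected = expected_homozygous_frequency(allele_freq)
    print(f"{location}: expected {expected:.1%}, observed {observed:.1%}")
```

For the UK the expected value is about 3.3%, close to the observed 3.4%. For Nigeria it is about 29.9% against an observed 33.3%.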
The values for the allele frequency of the deletion, the genotype frequencies, and the frequency of a homozygous deletion that we obtained for world-wide populations using the HGDP-CEPH collection show marked population differences, with the highest frequencies being seen in African populations. The findings in the African groups are consistent with those reported by Hageman et al [17] in African Americans and validate their findings in HGDP-CEPH African samples, which were based on a gene-specific PCR method that measured the frequency of a homozygous deletion. Subsequently there have been several other studies documenting the frequency of the CFHR3/CFHR1 deletion in a range of populations. The results from these are shown in Table 4. The values in this study for both allele frequency and the frequency of homozygous deletion are consistent with previous studies, particularly for the UK, Japanese, Chinese and Nigerian populations. We chose in this study to combine several populations from sub-Saharan Africa as the numbers for each group were small. However, the study of Sivakumaran et al [19] suggests that for this region there are significant differences in the frequency of the deleted allele between tribes. For instance, they found an allele frequency of 23.8% in the Maasai tribe of Kenya compared to 42.3% in the Luhya. As can be seen in Table 2 we also observed differences in the allele frequencies of the different populations within this geographical location. For instance, in the Biaka Pygmies the allele frequency was 8.7% compared to 50% in the Kenyan Bantu and Senegal Mandenka tribes. Recent studies documenting the genetic variation in this region show evidence of at least two different genetic groups derived from the North and South of the Kalahari [26,27]. This may explain the differences in the allele frequency that we have seen in sub-Saharan Africa.
It is possible that ancestral African populations with a low allele frequency of the deletion were the ones which participated in the "out of Africa" dispersal, with the associated bottleneck reinforcing the low allele frequency. That the current African populations with a low allele frequency of the deletion are generally hunter-gatherers is compatible with this [28,29]. The high allele frequency of the deletion in the African-American population is compatible with the allele frequency seen in the Yoruba and Mandenka [27,30]. How in evolution has this deletion arisen and how can the population differences be explained? The alternative pathway of complement is thought to be the oldest component of the innate immune system [31]. The earliest components of the alternative complement pathway to have been recognised are activators such as C3, which has been identified in a coral [32], suggesting their presence in the Cnidaria. Regulatory components were first recognised in the Agnatha with, for instance, identification of a C3-cleaving short consensus repeat protein in lamprey [33]. A protein (called SBP1) with a high degree of homology to human factor H was first described in a teleost, the sand bass [34,35]. Factor H has also been identified in the zebrafish [36]. In the zebrafish, the mouse and humans there are genes encoding SCR proteins with a high degree of homology to factor H in close proximity to the gene encoding factor H. In man there are the five factor H-related proteins (CFHR1-5), in the mouse there are three factor H-related proteins and in the zebrafish there are four factor H-like proteins. Sequence analysis of this region in man suggests that these genes have arisen through a series of segmental duplication events [2]. Analysis of primate genomes undertaken by Sivakumaran et al suggests that chimps have more extensive duplication in this region than humans.
The analysis also suggests that the duplications arose in a common ancestor of the chimp and humans after divergence from the orang-utan [19]. The duplicated segments predispose to both non-allelic homologous recombination (NAHR) and gene conversion [37]. The available evidence would suggest that the CFHR3/CFHR1 deletion has arisen through NAHR after the initial formation of the SDs. Sivakumaran et al used phylogenetic and linkage disequilibrium analysis to determine the ancestral origin of the deletion [19] and found a single origin in Caucasians and Asians but a recurrent origin in Africans. We believe that in certain populations the deletion has resulted in an evolutionary benefit. There is evidence to suggest that polymorphisms in complement proteins are associated with susceptibility to infection [38]. For instance, mannose-binding lectin (MBL) binds to microbes and activates the lectin pathway. Allelic variants in the gene (MBL2) encoding this protein are associated with differences in both the serum level and function of MBL. The frequency of these allelic variants differs between populations, and the same variants are associated with a differential risk of pneumococcal disease and leprosy. Recently it has been shown that variants in CFH and CFHR3 are associated with susceptibility to meningococcal disease [39]. These observations, taken with the knowledge that complement plays a significant role in the pathogenesis of other diseases such as malaria [40], would suggest that infection has driven the geographical variability seen in complement variants such as the CFHR3/1 deletion.
Since the CFHR3/CFHR1 deletion was first described a number of studies have documented strong linkage disequilibrium (LD) of the deletion with common CFH haplotypes [41,42]. In some populations the deletion is present on haplotypes H1-5 and absent on H6-7 [41]. In other populations the H2 haplotype perfectly tags the deletion [15]. Likewise in some populations individual SNPs have been shown to be in complete LD with the deletion. Zhao et al found that the deletion was in complete LD with rs6677604 in European Americans but not in African Americans (r² = 0.60). Whether the deletion confers an independent risk for AMD, SLE and IgA nephropathy or is simply associated with protective/at-risk haplotypes is an area of controversy [19,41,43]. However, factor H-related protein 1 blocks the C5 convertase but binds, in competition with factor H, to host surfaces through its C-terminal regulatory domain [44]. We are, therefore, of the opinion that deletion of CFHR1 has a dual effect, with reduced inhibition of terminal complement pathway activity but increased regulation by factor H of the alternative pathway. This may also explain why in some diseases (AMD and IgA nephropathy) the deletion is protective whilst in others (SLE) it is associated with increased risk.
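To make the distinction between complete and incomplete LD quoted above concrete, the following sketch computes the standard r² measure between two biallelic loci from a haplotype frequency and the two allele frequencies. It is an illustration only: the function name and the frequencies are hypothetical and are not taken from Zhao et al or from this study, and "complete LD" is read here as r² = 1.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r-squared between two biallelic loci.

    p_ab: frequency of haplotypes carrying both the deletion and the SNP allele
    p_a:  frequency of the deletion
    p_b:  frequency of the SNP allele
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical case 1: the SNP allele occurs only, and always, with the deletion.
print(round(ld_r_squared(0.2, 0.2, 0.2), 3))

# Hypothetical case 2: every SNP-allele haplotype carries the deletion,
# but some deleted haplotypes lack the SNP allele, so r-squared falls below 1.
print(round(ld_r_squared(0.3, 0.4, 0.3), 3))
```

In the second case the SNP would no longer be a reliable proxy for the deletion, which is the practical consequence of the lower r² reported in African Americans.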
It is also possible that CFHR3 has functional activity that contributes to the disease association seen with the CFHR3/1 deletion. In African Americans, who have a higher frequency of the deletion, the prevalence of AMD and IgA nephropathy is lower than in European Americans [45,46] whereas the prevalence of SLE is higher [47]. Thus studying the population frequency of disease-associated CNPs such as the CFHR3/CFHR1 deletion provides novel insights into the pathogenesis of such diseases. However, at an individual level we do not think that screening for the deletion in the normal population is currently of any clinical benefit.