HLA Class I and Class II Conserved Extended Haplotypes and Their Fragments or Blocks in Mexicans: Implications for the Study of Genetic Diversity in Admixed Populations

Major histocompatibility complex (MHC) genes are highly polymorphic and informative in disease association, transplantation, and population genetics studies, with particular importance for understanding human population diversity and evolution. The aim of this study was to describe the HLA diversity in Mexican admixed individuals. We studied the polymorphism of MHC class I (HLA-A, -B, -C) and class II (HLA-DRB1, -DQB1) genes using a high-resolution sequence-based typing (SBT) method, and we structured the blocks and conserved extended haplotypes (CEHs) in 234 non-related admixed Mexican individuals (468 haplotypes) by a maximum likelihood method. We found that HLA blocks and CEHs are primarily of Amerindian and Caucasian origin, with a smaller contribution of African and recent Asian ancestry, demonstrating a great diversity of HLA blocks and CEHs in Mexicans from the central area of Mexico. We also analyzed the degree of admixture in this group using short tandem repeats (STRs) and HLA-B, which correlated with the frequency of most probable ancestral HLA-C/-B and -DRB1/-DQB1 blocks and CEHs. Our results contribute to the analysis of the diversity and ancestral contribution of HLA class I and HLA class II alleles and haplotypes of Mexican admixed individuals from Mexico City. This work will serve as a reference to improve future studies in Mexicans regarding allotransplantation, immune responses and disease associations.


Introduction
The human major histocompatibility complex (MHC) is located within chromosomal region 6p21.3 and spans at least 3.4 Mb of DNA containing as many as 420 genes, including the HLA system, other immune-related genes and pseudogenes [1]. The extensive polymorphism of the HLA genes within populations could have resulted from selective pressures, including functional adaptation particularly to bacteria, viruses and parasites [2][3][4][5]. In addition, the hypothesis of heterozygote advantage proposes that individuals who are heterozygous at HLA loci respond more efficiently to pathogens in pathogen-enriched environments [6]. Nevertheless, studies of the genetics of infectious diseases are difficult to replicate because of the complex nature of the environmental factors and the degree of genetic diversity among human populations. In this regard, MHC genes are important because they are involved in immune responses and are essential markers to study genetic diversity, disease susceptibility and allotransplantation [7].
Different studies using polymorphic DNA markers such as short tandem repeats (STRs), low and intermediate resolution HLA typing, ABO, MN and Rr-Hr blood groups, serum haptoglobin, albumin, and Factor Bf types have described the complexity of the genetic admixture of Mexican populations. These studies have revealed a non-homogeneous combination of Amerindian, Caucasian, and African genes in Mexican admixed individuals [8][9][10]. In this context, an important role of ethnicity in the susceptibility to different inflammatory and infectious diseases has been attributed to the incorporation of MHC alleles by admixture with Caucasian, Asian and African populations [11].
An important aspect of MHC genetics is the inheritance of non-randomly associated alleles, known as linkage disequilibrium (LD) [12]. Extensive studies on the existence of small blocks and other relatively fixed genetic fragments within the human MHC have been conducted [7,13]. Specific DNA blocks with specific alleles of two or more MHC loci are often haplospecific for particular conserved extended haplotypes (CEHs). The frequency of CEHs and specific block combinations varies between major ethnic groups and/or between geographic locations; these variations in the frequency of CEHs and blocks can be used as measurements of genetic diversity of the MHC [13]. However, little is known about the distribution of MHC blocks and the combinations of conserved haplotypes in Latin American admixed human groups. Thus, the aim of the present study is to describe the distribution of HLA class I and class II blocks and the HLA CEHs using high-resolution typing in a group of Mexican admixed individuals from Mexico City.

Conserved Extended HLA Haplotypes
We listed known CEHs in Table 4  (HF = 0.0107), and A*02:01/C*07:02/B*39:05/DRB1*04:07/DQB1*03:02 (HF = 0.0128), the first four of them having been identified within samples of Native American people from all over the Americas, and the last one not yet found in other populations. Importantly, six Caucasian and one African CEHs were found. A set of 38 haplotypes was classified as not previously reported (unknown); some of them resulted from recombination between Caucasian and Amerindian blocks. Interestingly, one CEH that is frequent in the Ashkenazi Jewish population was also observed in our sample (A*26:01/C*12:03/B*38:01/DRB1*04:02/DQB1*03:02).

HLA Genetic Diversity in Mexicans
The extensive polymorphism of the HLA loci in this group of Mexicans was confirmed by polymorphism information content (PIC) values >0.5. The HLA-B and -DRB1 loci were the most polymorphic, with PIC values of 0.9544 and 0.9123, respectively. The HLA-C and -A loci were relatively less polymorphic, with PIC values of 0.8845 and 0.8776, respectively, and the least polymorphic locus was HLA-DQB1 (PIC = 0.8020). The degree of polymorphism of the HLA loci was also corroborated by the power of discrimination (PD) values. A lower observed heterozygosity (OH) than expected heterozygosity (EH) was found for the HLA-DRB1 locus (Table 6).

Mexican Admixed Individuals have a Significant Proportion of Amerindian and Caucasian Genetic Components
The admixture estimations using HLA-B revealed an Amerindian contribution of 59.97%; Caucasian contribution of 25.71%; African contribution of 14.13%; and Asian contribution of 0.18%. These results were similar to the estimations obtained using STRs: Amerindian contribution: 60.5%; Caucasian: 25.9% and African: 13.6%.
In addition, the results using the ABF revealed a frequency of Amerindian HLA-C/-B blocks of 41.3%, followed by Caucasian 25.8%, African 5.5% and Asian 3.3% blocks. The ABF of MHC class II blocks were as follows: Amerindian 51.2%, Caucasian 41.7%, African 3.4% and Asian 2.1%. Further evidence of the distribution of immunogenetic diversity can be observed in the principal component analysis (PCA) plot (Figure 1), in which our Mexican admixed sample (Mex) clusters together with Native American and Asian populations (which cannot be clearly differentiated from each other when HLA-B frequencies are taken as the variable of the factor analysis), and not with the African or European clusters.

Discussion
Here, we analyzed the diversity and ancestry of MHC class I (HLA-C/B) and class II (HLA-DRB1/DQB1) blocks, and the frequency of CEHs from HLA-C/B/DRB1/DQB1 and their extension to HLA-A, in a total of 468 haplotypes of individuals from Mexico City. We found that 41.0% of the HLA-C/-B blocks in our group were of Amerindian origin. In addition, some of these HLA-C/-B blocks have also been described in Asian populations (e.g., C*08:01/B*48:01), including Ivatan from the Philippines [15] and several ethnic groups from Taiwan [14]. These findings may indicate that those haplotypes were frequent in an ancestral group from which both Amerindians and South-East Asians originated. The Amerindian HLA-C/-B blocks observed in the present study have also been reported, with high frequencies, among Amerindian groups including Zapotecs, Mixe, and Mixtec from Oaxaca State in the southeast of Mexico [16]; Tarahumara from Chihuahua State in the north of Mexico [17]; Native Americans from the US [18]; and Yucpa from Venezuela [19]. Genetic admixture estimations were similar to previously reported data from Mexico City [9]. We detected 13.2% of haplotypes of Caucasian MPA and 11.1% were predominantly Caucasian but shared with other populations including the   [13,18,20,21,22]. In the PCA, our Mexican admixed sample (Mex) clearly separated from the European and African clusters and was located within a loose cluster including populations from Asia and Native human groups from America. Notably, the "Mestizo" sample from Mexico (MMM) and the sample from Guadalajara (Gua) were closer to the European cluster; Guadalajara population samples have shown a high degree of European genetic component in other works [23,24]. Differences among admixed populations show the importance of not taking "Mestizo" as a global grouping category for individuals or populations with shared ancestry derived from the demographic history of the colonial period.
Also, lack of available data with high resolution HLA typing is evident in Native American groups.
Genetic diversity parameters confirm the high degree of polymorphism of the HLA genes in the studied sample. HLA-B and HLA-DRB1 were the most polymorphic loci according to PIC and PD values, followed by the HLA-C locus. However, a lower OH than EH was found for the HLA-DRB1 locus. This may indicate that  observed [31]. Migration patterns into Mexico City in the last 60 years also have to be taken into account to adequately explain the low number of heterozygous individuals, as they represent an important source of incorporation of alleles and haplotypes (mainly from indigenous populations), hence modifying the allelic diversity. In our study the admixture estimations using STRs confirm the greater contribution of Amerindian and Caucasian genes and a small contribution of African and Asian genes. The results obtained using the ABF of HLA-C/-B blocks also demonstrated a greater contribution of Amerindian (41.3%), followed by Caucasian (24.6%), African (6.7%), and Asian (3.0%) genes in the admixed Mexicans. Also, the estimations using the ABF of MHC class II blocks revealed that 51.2% of them were of Amerindian and 40.4% of Caucasian MPA. These findings suggest that the ABF method is applicable to the analysis of the genetic diversity and ancestral structure of admixed populations. From this perspective, the genetic admixture of Mexicans could have resulted from the Spaniards, who arrived in Mexico early in the 16th century. The Caucasian component consisted of conquerors and colonizers from Andalucía, Leon, Extremadura, and the Castillas, as well as Portugal and Genoa. Spaniards settled extensively all over the Viceroyalty of New Spain, and a massive migration of colonizers began in the 17th century and continued through the next two centuries. The presence of Caucasian-MPA or Caucasian-shared blocks or haplotypes may be explained by these demographic traits.
The preponderance of haplotypes commonly found in Caucasian populations may be due to the fact that more Caucasian human groups than African or Asian ones have been studied, or may simply reflect a lower genetic diversity among Caucasians. Another hypothesis is that population replacement, together with the collapse of Native American groups that took place due to infectious diseases [32] and the wars of conquest, may explain the high prevalence of Caucasian genetic blocks within Mexican admixed individuals [33]. The African contribution, although subtle, is present in admixed Mexicans due to slaves introduced to Mexico from Africa during the first three centuries of Spanish colonial domination. All African-specific associations present in this study are found in sub-Saharan Africa [14,18,34], from where slaves were taken by colonial slave traders [18,35]. For example, C*07:01/B*49:01, C*04:01/B*53:01, C*06:02/B*58:02, DRB1*13:01/DQB1*03:03, and DRB1*08:04/DQB1*03:01 blocks have been found in Africa, for instance in Bandiagara from Mali, Bantu from Congo, Bioko from Equatorial Guinea, Luo and Nandi from Kenya, Lusaka from Zambia, Ugandans and Kampala from Uganda, and Yaounde from Cameroon [18,36,37,38,39], and have also been reported in the African American population from the US [40].
On the other hand, the presence of Asian genes in the Mexican population possibly resulted from the relatively recent immigration of Chinese traders and slaves by transpacific travel from the eastern shores of Asia to the western coasts of Mexico, mainly disembarking in the port of Acapulco. Thus, the Nao de China (the Manila Galleon) route, together with a foreign investment policy starting in the 19th century, helped the Chinese community to become the largest non-Spaniard community in Mexico by the mid-1920s [41]. The Asian contribution to the gene pool of Mexico is modest, mainly due to lack of admixture between Asian immigrants and Mexicans; however, classical Asian associations were found in our sample, such as C*04:01/B*35:16 [42] and C*08:01/B*15:02 [18,27,43]. The admixture estimations using different indicators support a trihybrid model of Amerindian, Caucasian and African ancestry in Mexicans. However, we were also able to detect a small Asian component in Mexicans.
It is well known that MHC diversity influences the susceptibility or resistance to a wide variety of autoimmune disorders and infectious diseases caused by viruses, yeasts, bacteria and parasites. It has been suggested that pathogen-mediated selection might explain the maintenance of MHC diversity at the population level [44,45]. However, the role of MHC diversity associated with the admixture between different ethnic groups in the resistance or susceptibility to autoimmune or infectious disease remains unclear. Furthermore, recent studies have suggested that genes that confer susceptibility to autoimmune diseases might be maintained in specific ethnic groups because they primarily confer protection against infectious agents, the major factor driving selection and influencing human adaptation to local environments [2][3][4][5][6][46][47][48]. Functional studies are necessary to define whether the genetic diversity of HLA is influenced by pathogen-enriched environments. Analyses of HLA diversity in the context of pathogen richness have shown a positive correlation between HLA class I allele diversity and pathogen richness and a negative correlation between HLA class II diversity, particularly at the HLA-DQB1 locus, and pathogen richness, suggesting that HLA class I and class II genes have distinct evolutionary strategies to confer immunity against infectious agents [5]. In this context, the higher diversity of HLA class I genes may result from the high mutation rate of intracellular pathogens, particularly viruses. In contrast, the lower diversity of MHC class II genes might result from the fixation of some alleles that provide efficient immune protection against highly prevalent extracellular pathogens in specific populations (e.g., parasites).
In Mexicans, we found a high frequency of some MHC class II alleles that predispose to rheumatoid arthritis (RA) (DRB1*04:04, DRB1*14:02, and DRB1*01:02), to systemic lupus erythematosus (SLE) (DRB1*03:01) [11] and to systemic sclerosis (SSc) (DRB1*11:04) (Rodriguez-Reyna TS et al., unpublished data). It is possible that class II MHC alleles associated with autoimmunity, together with alleles found in Native American populations, increased in frequency due to past selective processes or infectious and parasitic diseases that developed in different environments, and thus explain in part the susceptibility to develop autoimmune diseases in Mexico or the clinical characteristics of these diseases in the Mexican population.
In summary, Mexican admixed individuals from the central area of Mexico have an important component of Amerindian and Caucasian MHC class I (HLA-C/-B) and class II (HLA-DRB1/-DQB1) blocks and HLA CEHs. A relatively low frequency of African and Asian HLA blocks and CEHs was detected. In line with these results, admixture estimations using STRs and HLA-B revealed a greater proportion of Amerindian, followed by Caucasian and African ancestry in this population. The high frequency of certain relatively fixed haplotypes might result from

Subjects
A total of 234 unrelated Mexican admixed individuals were studied, including a group of 80 Mexican admixed participants belonging to 40 families. A total of 468 haplotypes were analyzed in this study. Every participant came from Mexico City and was of Mexican ancestry, with parents and grandparents born in Mexico. The mean age of the studied individuals was 38.2 ± 15.3 years. There were 120 females (51%) and 114 males (49%).

Ethics Statement
The Institutional Review Board of the National Institute of Respiratory Diseases (INER) reviewed and approved the protocols for genetic studies. All subjects provided written informed consent for these studies, and they authorized the storage of their DNA samples at INER repositories for this and future studies. In this study we did not collect samples from minors/children; only young adults older than 17 years were included.

HLA Typing
Genomic DNA was obtained from peripheral blood mononuclear cells (PBMC), using the QIAamp DNA mini kit (Qiagen, Valencia, CA, USA). High-resolution HLA class I and class II typing was performed by a sequence-based typing (SBT) method as previously described [49]. Briefly, we amplified exons 2 and 3 of HLA-A, -B and -C and exon 2 of HLA-DRB1 and -DQB1. Each polymerase chain reaction (PCR) contained 1.5 mM KCl, 1.5 mM MgCl2, 10 mM Tris-HCl (pH = 8.3), 200 µM concentrations of each of dATP, dTTP, dGTP, and dCTP; 10 pM concentration of each primer, 30 ng of DNA and 0.5 U of Taq DNA polymerase in a final volume of 25 µl. Amplification was done on a PE9700 thermal cycler (Applied Biosystems, Foster City, CA, USA) using the following cycling conditions: 95°C for 30 s, 65°C for 30 s, 72°C for 1 min, preceded by 5 min at 95°C, and followed by a final elongation at 72°C for 5 min. Amplified products were sequenced independently in both directions using BigDye Terminator™ chemistry in an ABI PRISM® 3730xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Data were analyzed using match tools allele assignment software (Applied Biosystems, Foster City, CA, USA) using the IMGT/HLA sequence database alignment tool (http://www.ebi.ac.uk/imgt/hla/align.html) [50]. Ambiguities were resolved using group-specific sequencing primers (GSSPs) that have been reported and validated previously [49].

HLA Blocks and Conserved Extended Haplotypes Assignment
HLA allelic and haplotypic frequencies were obtained by gene counting; 160 of the 468 haplotypes were obtained by direct observation, because they were derived from HLA typing of the parents and siblings of 40 families, while the rest were inferred from HLA genotyping of 154 non-related individuals. Haplotypes were estimated by maximum likelihood methods using the computer program Arlequin ver. 3.0 [51]. This software was also used to calculate Hardy-Weinberg equilibrium (HWE), OH, and EH at a locus-by-locus level with 1 × 10^6 steps in the Markov chain and 1 × 10^5 dememorization steps. p-values ≤0.05 indicated a statistical difference between OH and EH and thus a deviation from HWE. Listed HLA-C/B, HLA-DRB1/DQB1 and CEHs and their extension to the HLA-A locus of Mexican origin were estimated by the maximum likelihood method based on the D′ between alleles of two loci and between the two blocks and/or the extension to the HLA-A region, as previously described [18]. Haplotypes or DNA blocks of African, Asian and Caucasian MPA were assigned based on previously reported frequencies [7,13,18]. Delta (D) and relative delta (D′) values, which measure LD (the nonrandom association of alleles at two or more loci), and their statistical significance were calculated using previously described methods [18]. An absolute D′ value of 1 indicates complete LD; 0 corresponds to no LD. As many of these associations may return |D′| values of 1.000 (even though that value may be the result of a random association between two infrequent alleles), we used the statistical parameter t to validate all D′ data, adjusted by sample size and the number of times that each allele appeared in the sample [52]. Only t values ≥2.0 were considered significant.
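As an illustration of the LD measures named above, the following minimal Python sketch computes delta (D) and relative delta (D′) for one pair of alleles using the standard textbook normalization of D by its maximum attainable value. It is not the authors' implementation, the function names are hypothetical, and the counts in the usage lines are invented for illustration only. The validation statistic t from [52] is not reproduced here because its formula is not given in the text.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D') for one allele pair at two loci.

    p_ab : frequency of haplotypes carrying both alleles
    p_a  : frequency of the allele at the first locus
    p_b  : frequency of the allele at the second locus
    """
    d = p_ab - p_a * p_b  # deviation from random association
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # D' rescales D to the range [-1, 1]; guard against fixed alleles.
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


def ld_from_counts(n_ab, n_a, n_b, n_total):
    """Same calculation starting from haplotype counts (gene counting)."""
    return linkage_disequilibrium(n_ab / n_total, n_a / n_total, n_b / n_total)


# Hypothetical counts among 468 haplotypes (illustrative, not study data).
print(ld_from_counts(n_ab=20, n_a=30, n_b=25, n_total=468))

# Two alleles each seen once, on the same haplotype: |D'| reaches 1 even
# though the association rests on a single observation.
print(ld_from_counts(n_ab=1, n_a=1, n_b=1, n_total=468))
```

The second call shows why an additional significance filter is useful: rare alleles that co-occur once already give the maximum D′.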

HLA Genetic Diversity Calculations
Genetic diversity of each HLA locus was assessed by two previously described forensic parameters, PIC and PD [53][54][55], which were computed using the PowerStat ver. 1.2 spreadsheet (Promega Corporation, Fitchburg, WI, USA) as described elsewhere [56]. PIC measures the strength of a genetic marker for linkage studies by indicating the degree of polymorphism of a locus. A locus with PIC >0.5 is considered highly polymorphic [54]. PD is defined as the probability of finding two random individuals with different genotypes for that locus in the studied population, and values higher than 0.8 indicate high polymorphism in the studied population context [57]. The OH and EH of all HLA loci were also calculated [53].
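The diversity parameters described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the PowerStat or Arlequin implementation: PIC follows the commonly used Botstein formula, EH is computed as 1 minus the sum of squared allele frequencies without any small-sample correction, and PD is taken as 1 minus the sum of squared genotype frequencies. The function names and the four toy genotypes are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Gene counting: each individual contributes two alleles."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pic(freqs):
    """Polymorphism information content (Botstein formula).

    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2, where the last term
    equals (sum p_i^2)^2 - sum(p_i^4).
    """
    p = list(freqs)
    s2 = sum(x ** 2 for x in p)
    s4 = sum(x ** 4 for x in p)
    return 1 - s2 - (s2 ** 2 - s4)


def expected_heterozygosity(freqs):
    """EH under Hardy-Weinberg proportions (no sample-size correction)."""
    return 1 - sum(x ** 2 for x in freqs)


def observed_heterozygosity(genotypes):
    """OH: fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def power_of_discrimination(genotypes):
    """PD: chance that two random individuals differ in genotype."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = len(genotypes)
    return 1 - sum((c / n) ** 2 for c in counts.values())


# Toy HLA-B genotypes for four individuals (illustrative, not study data).
toy = [("B*35:01", "B*39:05"), ("B*35:01", "B*35:01"),
       ("B*39:05", "B*40:02"), ("B*40:02", "B*35:01")]
freqs = allele_frequencies(toy).values()
print(pic(freqs), expected_heterozygosity(freqs),
      observed_heterozygosity(toy), power_of_discrimination(toy))
```

With real data, a locus would be flagged as highly polymorphic when `pic` exceeds 0.5 and `power_of_discrimination` exceeds 0.8, matching the thresholds quoted in the text.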