Polymorphisms of HLA-DRB1, -DQA1 and -DQB1 in Inhabitants of Astana, the Capital City of Kazakhstan

Background
Kazakhstan has been inhabited by different populations, such as the Kazakh, Kyrgyz, Uzbek and others. Here we investigate allelic and haplotypic polymorphisms of human leukocyte antigen (HLA) genes at the DRB1, DQA1 and DQB1 loci in the Kazakh ethnic group, and their genetic relationship with world populations.
Methodology/Principal Findings
A total of 157 unrelated Kazakh ethnic individuals from Astana were genotyped using sequence-based typing (SBT) for the HLA-DRB1, -DQA1 and -DQB1 loci. Allele frequencies, neighbor-joining dendrograms, and multidimensional scaling analysis were used for comparison with other world populations. Statistical analyses were performed using Arlequin v3.11. The resulting genetic distance matrix was used for a multidimensional scaling (MDS) analysis in PAST v. 2.17. In total, 37, 17 and 19 alleles were observed at the HLA-DRB1, -DQA1 and -DQB1 loci, respectively. The most frequent alleles were HLA-DRB1*07:01 (13.1%), HLA-DQA1*03:01 (13.1%) and HLA-DQB1*03:01 (17.6%). In the observed group of Kazakhs, DRB1*07:01-DQA1*02:01-DQB1*02:01 (8.0%) was the most common three-locus haplotype. DRB1*10:01-DQB1*05:01 showed the strongest linkage disequilibrium. The Kazakh population shows genetic kinship with the Kazakhs from China, Uyghurs, Mongolians, Todzhinians and Tuvinians, as well as with other Siberians and Asians.
Conclusions/Significance
The HLA-DRB1, -DQA1 and -DQB1 loci are highly polymorphic in the Kazakh population, and this population has the closest relationship with other Asian and Siberian populations.


Introduction
The Kazakh Khanate (Kazakhskoye khanstvo) was established as the first Kazakh state in 1456 (1465/66) and was located in the territory of the present-day Republic of Kazakhstan (Fig 1). The country is located in Central Asia, on the border of Europe and Asia. This area was the intersection of many transport routes: west to Europe, and east to Asia and Siberia. Kazakhstan is therefore located in an area whose population is characterized by different languages, religions and cultures. Many ancient tribes were involved in the formation of the Kazakhs. Anthropologists believe that the initial formation of a distinct Kazakh population began in the first millennium AD, yielding what is considered an ancient Kazakh anthropological type with features distinct from those of European or Mediterranean anthropological types. In subsequent periods, during the Mongol invasions, intensive mixing resulted in Kazakhs acquiring Mongolian traits [1]. Subsequently, the modern Kazakh population was formed from many different ancestor groups, including Turkic tribes (Kipchaks, Argyns, Khazars etc.), Turko-Mongol tribes (Dughlat, Jalayir, Naimans etc.), and other Asian tribes. Even though Kazakhstan is characterized as a polyethnic country, the majority of the population (more than 60%) are Kazakhs. Kazakhs are a Turkic-speaking people living in several Central Asian countries, including Kazakhstan, Uzbekistan, Kyrgyzstan, Russia, Mongolia, and China.
The aims of our study were: HLA typing of the HLA-DRB1, -DQA1 and -DQB1 loci in the Kazakh population living in the new capital city of Kazakhstan; investigation of allele and haplotype frequencies in relation to HLA-DRB1 polymorphism; and comparison with other world populations with different historical backgrounds, in order to better understand the genetic background and origin of the Kazakh population. The HLA class I and II genes are highly polymorphic and are recognized as essential components of the immune response. More than 10,000 alleles are listed in the latest version 3.15 (2013-07) of the IMGT/HLA Database, which provides a specialised database for sequences of the HLA complex and the official sequences for the WHO Nomenclature Committee for Factors of the HLA System [2].
Results of HLA studies in populations with different ethnic backgrounds form the basis for developments in clinical transplantation, diagnostics and forensics, and can serve as an anthropological guide. This is the motivation for research into HLA diversity in the population of Kazakhstan. The distribution of specific HLA genes in a healthy group can be used as a set of reference markers in the search for genetic predispositions to various diseases in the Kazakh ethnic group. It could also serve as a theoretical basis for clinical transplantation and for finding allogeneic bone marrow donors from the same ethnic group. In our study, we focused on HLA-DRB1 alleles in the Kazakh population living in Astana. The distributions of alleles at other classical HLA class II loci were also examined. The aim of this work was to investigate HLA genetic heterogeneity among Kazakhs by studying allele and haplotype frequencies in relation to the HLA-DRB1 locus, chosen for its high polymorphism. We hypothesized that the origin of the Kazakh population can be inferred from the distribution of HLA alleles.

Ethical Statement
This project was approved by the Ethics Committee of the National Center for Biotechnology, Kazakhstan ( 10, 14.02.2010). The ethics committee approved the informed consent for this study. The investigation was conducted in accordance with the humane and ethical research principles of the National Center for Biotechnology. All 314 study participants were required to be healthy, provided informed consent, and completed a questionnaire that included information regarding family history, lineage, etc. We confirm in our consent statement that consent was provided by 314 healthy individuals.

HLA Genotyping
For the HLA-DRB1, -DQB1 and -DQA1 loci, allele polymorphisms were typed using the sequence-based typing (SBT) method. Genomic DNA was extracted from whole blood samples using a DNA Purification Kit (PROMEGA, Madison, WI) according to the manufacturer's protocol. The concentration of DNA was 50-100 ng/µl, with the purity of the extracted DNA ranging from 1.5 to 1.8 OD value. PCR and sequencing were performed for exon 2 of the HLA-DRB1, -DQA1 and -DQB1 genes using the SBT method and locus-, group-, and sequence-specific primers according to multiple sources [46][47][48][49][50][51][52]. The thermal cycling profile for the amplification began with initial denaturation for 5 min at 94°C, followed by 10 cycles of 30 s at 94°C and 50 s at 65°C, then 20 subsequent cycles, each consisting of primer annealing at 62°C for 50 s and elongation for 60 s at 72°C, with a final elongation for 5 min at 72°C. Polymerase chain reaction (PCR) was performed in 50 µl reaction mixtures containing 100 mM Tris-HCl (pH 8.0), 2.5 mM MgCl2, 100 µM of each dNTP, 10 pmol of each primer, 2.0 U of Taq DNA polymerase and DNA at 50 ng/µl. A 96-well thermocycler (BioRad, Hercules, CA) was used for amplification. Amplification was verified by 2% agarose gel electrophoresis. Sequencing was performed on a 96-capillary Genetic Analyzer (Applied Biosystems, Foster City, CA) using BigDye Terminator v3.1 chemistry (Applied Biosystems). The HLA alleles were identified using the international IMGT/HLA database [53] and the program dbMHC SBT Input [54]. This typing procedure has been published (Kuranov et al., 2014).

Statistical Analysis
Allelic frequencies of the HLA-DRB1, -DQB1 and -DQA1 loci were estimated by the direct counting method. Allele frequencies, haplotype frequencies, neighbor-joining dendrograms and multidimensional scaling analysis were obtained for comparing Kazakhs with worldwide populations. Statistical analyses were performed using Arlequin v3.11. The resulting genetic distance matrix was used for a multidimensional scaling (MDS) analysis in two dimensions. MDS for pairwise populations was computed from allele frequencies, based on the Euclidean distance matrix [55,56], using the software PAST v. 2.17. Haplotype frequencies were estimated from allele frequencies using the expectation-maximization (EM) method in Arlequin v3.11. Tests of Hardy-Weinberg equilibrium and linkage disequilibrium (LD) were also performed using this software. The LD coefficient (D) was used to estimate the strength of LD (>0.80 strong LD, ≥0.5 moderate LD, ≥0 weak LD) [57]. Because D values may reflect two rare alleles that are linked only by chance, all D data were validated using the statistical parameter t [58] (t values >2.0) [59]. Phylogenetic dendrograms were created from allelic frequencies using the neighbor-joining (NJ) method with Nei distances in the phylogeny program Phylip [60].
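The direct-counting and LD calculations described above can be sketched as follows. This is a minimal illustration and not the Arlequin implementation: the function names, the toy genotypes and the example frequencies are hypothetical. The normalised coefficient D' (D divided by its maximum attainable value) is computed alongside D on the assumption that the strength thresholds quoted above refer to a normalised measure; the text itself does not state this.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Direct counting: every diploid individual contributes two alleles."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())  # equals 2N for N individuals
    return {allele: n / total for allele, n in counts.items()}


def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D') for alleles A and B.

    p_ab is the frequency of the A-B haplotype; p_a and p_b are the
    frequencies of the two alleles at their respective loci.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Hypothetical toy data: four individuals typed at a single locus.
toy_genotypes = [
    ("07:01", "15:01"),
    ("07:01", "03:01"),
    ("15:01", "15:01"),
    ("07:01", "13:01"),
]
toy_freqs = allele_frequencies(toy_genotypes)

# Hypothetical haplotype and allele frequencies, for illustration only.
toy_d, toy_d_prime = linkage_disequilibrium(p_ab=0.08, p_a=0.10, p_b=0.10)
```

In practice the haplotype frequency passed as `p_ab` would come from an EM estimate, since phase is not observed directly in unrelated individuals.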

Neighbor-Joining Dendrogram
The neighbor-joining dendrogram was created using the allelic frequencies at the HLA-DRB1 locus of various populations, including the Kazakh group (Fig. 2). DRB1 allele frequencies of the Kazakh population were compared with those of other world populations (European, Scandinavian, Mediterranean, Siberian and Asian populations). The results showed a clear divergence among these world populations. The genetic distance dendrogram (Fig. 2) shows that the Kazakh population clusters together with Asian and Siberian populations, separate from European, Scandinavian and Mediterranean populations. The genetic structure of the Kazakhs is therefore closest to that of the Asian and Siberian populations.
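To make the input to such a dendrogram concrete, the sketch below computes a pairwise genetic distance matrix from single-locus allele frequencies. It assumes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)); the text does not specify which of Nei's distances was used, and the function names and toy frequencies are hypothetical. The tree-building step itself is not reproduced here.

```python
import math


def nei_standard_distance(freq_x, freq_y):
    """Nei's standard genetic distance between two populations at one locus.

    freq_x and freq_y map allele names to frequencies; alleles missing
    from a population are treated as having frequency 0.
    """
    alleles = set(freq_x) | set(freq_y)
    j_xy = sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    j_x = sum(p * p for p in freq_x.values())
    j_y = sum(p * p for p in freq_y.values())
    if j_xy == 0:
        return math.inf  # no shared alleles: the distance is unbounded
    return -math.log(j_xy / math.sqrt(j_x * j_y))


def distance_matrix(populations):
    """Symmetric matrix of pairwise distances, in the order of the input list."""
    return [[nei_standard_distance(x, y) for y in populations] for x in populations]


# Hypothetical toy frequencies for three populations at one locus.
toy_populations = [
    {"07:01": 0.5, "15:01": 0.5},
    {"07:01": 0.9, "15:01": 0.1},
    {"07:01": 0.4, "15:01": 0.4, "03:01": 0.2},
]
toy_matrix = distance_matrix(toy_populations)
```

A matrix of this shape is what a neighbor-joining program takes as input; populations with similar allele frequency profiles get small distances and end up on neighbouring branches.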

Multidimensional Scaling Analysis
Because of the multiethnic background of the Kazakh population, a multidimensional scaling analysis of the Kazakhs together with different worldwide populations was performed. The multidimensional scaling analysis of the 74 ethnic groups was based on the allelic frequencies of the HLA-DRB1 locus and is shown in Fig. 3. The results show that all the ethnic groups can be divided into five clusters: Asian and Siberian, American, Scandinavian, European and Mediterranean populations.
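As an illustration of how a two-dimensional MDS map can be derived from a distance matrix, the sketch below implements classical (metric) MDS by double-centring the squared distances and extracting the two leading eigenpairs. It is not the PAST implementation, and whether PAST was run in metric or non-metric mode is not stated in the text. Power iteration is used only to keep the example free of dependencies; a library eigen-solver such as `numpy.linalg.eigh` would normally be preferred. All names and the toy distance matrix are hypothetical.

```python
import math


def _matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def _top_eigenpair(matrix, iterations=1000):
    """Dominant eigenpair of a symmetric positive semi-definite matrix."""
    v = [1.0 / (i + 1) for i in range(len(matrix))]  # arbitrary start vector
    for _ in range(iterations):
        w = _matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # matrix is numerically zero: nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, _matvec(matrix, v)))
    return eigenvalue, v


def classical_mds(dist, dims=2):
    """Embed n items in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        lam, vec = _top_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate so that the next pass finds the next-largest eigenpair.
        b = [[b[i][j] - lam * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return coords


# Hypothetical distances between three populations (a 3-4-5 triangle).
toy_dist = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]
toy_coords = classical_mds(toy_dist, dims=2)
```

Each row of `toy_coords` is a point on the MDS map; populations that are close in the distance matrix land close together, which is how clusters such as those in Fig. 3 become visible.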
A study of mitochondrial DNA polymorphism (Berezina G, 2011) shows that Western Europe (55%) and Eastern Europe (41%) mtDNA lineages are present in the Kazakh population. It indicated that intensive gene exchange has occurred between the Kazakh population and the populations of Russia to the North-West, North, North-East and East of Kazakhstan (Berezina G, 2011). It has also been shown that Kazakh Y-chromosome markers belong largely to the C3*, C3c and O3 haplogroups, which derive from people of southern Siberian or Mongolian lineage [66]. The highest frequencies (from 3 to 30%) of the C3* star-cluster, which is ascribed to the descendants of Genghis Khan, were observed in Altaian Kazakhs [67]. Haplogroup C is very common in Mongolia (15%) and in populations of Central Asia (7-18%) [68].
In 1991, a study of HLA allelic diversity was conducted on a few Caucasoid, Mongolian and mixed ethnic groups living in the territory of the former USSR. Based on these results, several authors concluded that the data on HLA markers were broadly consistent with the anthropological information [69]. Kazakhs are characterized by the presence of HLA alleles that are also present in Caucasians and in Asians, although each of these populations has its own particular HLA profile. The distribution of HLA alleles in Kazakhs agrees broadly with similar data for populations from Mongolia. This supports the hypothesis of gene flow between European, Asian and Siberian peoples, which may be due to the migration of peoples from Asia and/or Siberia into Europe. Cultural features of people in Eurasia corroborate genetic contacts between Asia and Siberia. These results suggest that Kazakhs are genetically admixed with Caucasian, Siberian and Asian populations. Kazakhs, Uyghurs, Buryats, Mongolians and inhabitants of northern China represent an intermediate group, which is gradually losing the HLA specificities characteristic of the European groups and accumulating HLA alleles specific to south-east Asian populations [70,71].
The data of the HLA class II neighbor-joining tree show the relatedness of world populations to the Kazakh population (Fig. 3). Populations are grouped in two main, related branches. On one side are clustered Kazakhs (Astana and Tarbagatay), Asians and Siberians, Chinese Kazakhs, Tuvinians, and Todzhinians. On the other side are grouped European, Mediterranean, and Scandinavian ethnic groups. The Kazakh population (Astana) shows the closest genetic relation to Siberians and Asians. This study of HLA polymorphisms, together with historical and genetic data, supports the view that the Kazakh population is characterized by the features of the Central Asian anthropological type, formed under the influence of different groups such as the Asian, Siberian, and European anthropological types. Migration and mixing of many different ethnic groups are the major factors determining the genetic diversity of the Kazakh population. Finally, Kazakhs are genetically different from other Asians (Figs. 2 and 3), as their HLA gene pool contains alleles from European, Asian and Siberian populations. Kazakhs (Astana) are related to Kazakhs (Tarbagatay), and also to Kazakhs (China) and Uyghur groups (Fig. 3). A genetic distance-based analysis clustered the populations into groups according to their geographic origin. The genetic variation of the Kazakh population tended to follow distinct geographic patterns, in agreement with the distance clusters.
It should be noted that the relatively high degree of heterogeneity in Asian and Siberian populations compared to European populations may be associated with their wider area of habitation. Asian peoples, especially the Siberian peoples, are relatively isolated from each other, whereas European populations live in a more compact and limited area, which allows more intense interactions. Previous studies and the current results support a unique genetic origin of the Kazakhs, and this population could be a genetic admixture of three groups: Europeans, Siberians and Asians. Our results show that HLA loci and haplotypes in the Kazakh population are highly polymorphic. They can be used in future to find HLA-matched donors, specifically for bone marrow transplantation, which underlines the clinical relevance of our and future research in the Kazakh population. Such studies are in high demand, as the data for this region are very limited. These data can be used for any research into HLA and disease; of particular relevance, they have already been used in a study of tuberculosis in the Kazakh population [72].