High resolution human leukocyte antigen class I allele frequencies and HIV-1 infection associations in Chinese Han and Uyghur cohorts.

Background: Host immunogenetic factors such as HLA class I polymorphism are important to HIV-1 infection risk and AIDS progression. Previous studies using high-resolution HLA class I profile data of Chinese populations appeared insufficient to provide information for HIV-1 vaccine development and clinical trial design. Here we report HLA class I associations with HIV-1 susceptibility in a Chinese Han and a Chinese Uyghur cohort.
Methodology/Principal Findings: Our cohort included 327 Han and 161 Uyghur ethnic individuals. Each cohort included HIV-1 seropositive and HIV-1 seronegative subjects. Four-digit HLA class I typing was performed by sequencing-based typing and high-resolution PCR-sequence specific primer. We compared the HLA class I allele and inferred haplotype frequencies between HIV-1 seropositive and seronegative groups. A neighbor-joining tree between our cohorts and other populations was constructed based on allele frequencies of HLA-A and HLA-B loci. We identified 58 HLA-A, 75 HLA-B, and 32 HLA-Cw distinct alleles from our cohort and no novel alleles. The frequencies of HLA-B*5201 and A*0301 were significantly higher in the Han HIV-1 negative group. The frequency of HLA-B*5101 was significantly higher in the Uyghur HIV-1 negative group. We observed statistically significant increases in expectation-maximization (EM) algorithm predicted haplotype frequencies of HLA-A*0201-B*5101 in the Uyghur HIV-1 negative group, and of Cw*0304-B*4001 in the Han HIV-1 negative group. The B62s supertype frequency was found to be significantly higher in the Han HIV-1 negative group than in the Han HIV-1 positive group.
Conclusions: At the four-digit level, several HLA class I alleles and haplotypes were associated with lower HIV-1 susceptibility. Homozygosity at HLA class I loci and Bw4/Bw6 heterozygosity were not associated with HIV-1 susceptibility in our cohort. These observations contribute to the Chinese HLA database and could prove useful in the development of HIV-1 vaccine candidates.


Introduction
Since the first case of AIDS was identified in 1981, over 21 million people have died from HIV infection. According to an estimate issued jointly by China's Ministry of Health and the Joint United Nations Program on HIV/AIDS (UNAIDS), 780,000 people were living with HIV in China by the end of 2011, including 154,000 with severe symptomatic AIDS [1]. The high prevalence of HIV and limited access to treatment for many in the developing world strongly underline the need for a preventive or therapeutic HIV vaccine. In the design and development of an effective vaccine, one must consider the role of T cell immunity in HIV-1 infection. Over the past few years, a number of studies have demonstrated that the CD8+ cytotoxic T lymphocyte (CTL) response plays a central role in immune control of HIV. The protective effects of CD8+ CTL responses in primary HIV-1 infections [2,3] and in long term nonprogressors [4,5] have been documented. Associations between HIV-specific CD8+ CTL responses and HIV-exposed but uninfected status have also been observed [6,7,8]. Intracellular epitopes presented to CD8+ cells are usually bound by HLA class I (HLA-A, B, and Cw) molecules, which are encoded by HLA-I genes. The HLA genes comprise the most polymorphic loci in the human genome. At the population level, HLA diversity has an impact on susceptibility to and the outcome of HIV-1 infection, owing to the associated diversity of antigen recognition and presentation [9,10,11,12]. The relationship between the frequency of HLA genes within different racial and ethnic groups and HIV-1 susceptibility and disease progression deserves further investigation.
Recently, large investigations employing high resolution HLA typing of various Chinese populations have been conducted [13,14,15,16,17,18,19]. However, most of these studies focused exclusively on the distribution of HLA alleles and haplotypes, and not on their association with HIV infection. HIV infection and HLA profile association studies have been conducted in Chinese populations, but have been based on 2-digit HLA typing [20,21]. The scarcity of research on the association between high resolution HLA alleles and haplotypes and HIV-1 susceptibility may be an impediment to the development of an effective HIV vaccine in China.
In this study, we carried out high resolution HLA class I typing on a Chinese Han cohort and a Chinese Uyghur cohort, most of whom were HIV-1 positive. Associations between HIV-1 susceptibility and host immunogenetics were investigated by analysis of allele frequencies and EM algorithm predicted haplotype frequencies in HIV-1 positive and HIV-1 negative groups. The allele frequencies were also compared to other Chinese populations. The purpose of this study was to contribute to the database of Chinese HLA allele and haplotype distribution, which may be useful in HIV vaccine development and in the selection of subject populations in future clinical trials.

Allele distribution of HLA class I
From the 488 Chinese individuals, 58 HLA-A, 75 HLA-B, and 32 HLA-Cw distinct alleles were identified. In our Chinese Uyghur cohort, 21 common HLA-A alleles with frequencies higher than 0.01 accounted for 91.9% of the total HLA-A alleles. In addition, 28 HLA-B and 19 HLA-Cw alleles with frequencies higher than 0.01 comprised 88.5% of HLA-B and 95.3% of HLA-Cw alleles. In our Chinese Han cohort, 15 common HLA-A alleles with frequencies higher than 0.01 accounted for 93.0% of the total HLA-A alleles. Similarly, 26 HLA-B and 15 HLA-Cw alleles comprised 88.4% of HLA-B and 93.6% of HLA-Cw alleles, respectively (Table 1). Complete data are available at allelefrequencies.net. No novel alleles were identified in these two cohorts.
On the HLA-A locus of our Uyghur cohort, the HLA-A*02 group represented 27.3% of the total HLA-A alleles, followed by the A*24 group (12.4%) and the A*11 group (11.8%). In our Han cohort, the most predominant group was also the A*02 group (31.5%), followed by A*11 (18.7%) and A*24 (14.4%).
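The frequency summaries above (allele frequencies, the share covered by common alleles, and two-digit allele group frequencies) follow from direct counting. The Python sketch below illustrates the arithmetic on hypothetical toy genotypes, not on the study data; the function names and the example threshold are illustrative assumptions.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Direct counting: every individual contributes two alleles at the locus."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def common_allele_share(freqs, threshold=0.01):
    """Return alleles with frequency above the threshold and their combined share."""
    common = {a: f for a, f in freqs.items() if f > threshold}
    return common, sum(common.values())

def group_frequencies(freqs):
    """Collapse four-digit alleles (e.g. A*0201) to two-digit groups (e.g. A*02)."""
    groups = Counter()
    for allele, f in freqs.items():
        locus, digits = allele.split("*")
        groups[f"{locus}*{digits[:2]}"] += f
    return dict(groups)

# Hypothetical HLA-A genotypes for four individuals (not study data).
toy_hla_a = [
    ("A*0201", "A*1101"),
    ("A*0201", "A*2402"),
    ("A*1101", "A*1101"),
    ("A*0201", "A*0301"),
]
freqs = allele_frequencies(toy_hla_a)
# The 0.2 threshold is used only because the toy sample is tiny;
# the text above uses 0.01.
common, share = common_allele_share(freqs, threshold=0.2)
groups = group_frequencies(freqs)
```

With real data the threshold would be 0.01, as in the text, and the genotype list would hold one allele pair per typed individual.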

Haplotypes of HLA class I
HLA class I haplotypes predicted by the EM algorithm with estimated frequencies of at least 2.0% are summarized in Table 2. In our Han cohort, two HLA-A-B haplotypes, three HLA-A-Cw haplotypes, and three HLA-Cw-B haplotypes had an estimated frequency greater than 5.0%, as did two HLA-A-Cw-B haplotypes. In our Uyghur cohort, only two HLA-Cw-B haplotypes had an estimated frequency of at least 5.0%, and no HLA-A-B, HLA-A-Cw, or HLA-A-Cw-B haplotype had a frequency higher than 5.0%.

HLA class I allele frequency comparison between HIV-1 positive and negative groups
In our Uyghur cohort, the frequency of HLA-B*5101 (OR = 0.18, 95% CI: 0.06 to 0.49; p = 0.002, q = 0.056) was significantly higher in the HIV-1 negative group. We did not find HLA-A or HLA-Cw alleles with significantly different frequencies between the HIV-1 positive and HIV-1 negative groups (Figure 1). In our Han cohort, a statistically significant increase in allele A*0301 (OR = 0.25, 95% CI: 0.09 to 0.64; p = 0.002, q = 0.030) and B*5201 (OR = 0.24, 95% CI: 0.10 to 0.57; p = 0.001, q = 0.026) was observed in the HIV-1 negative group. We did not find significant differences in Han HLA-Cw allele frequencies when comparing the HIV-1 positive to the HIV-1 negative group (Figure 2). In the Uyghur cohort, among all common alleles with a frequency higher than 0.01, twenty-one alleles were found exclusively in the HIV-1 positive group (Table S1). In both the Uyghur and Han cohorts, there were several low frequency alleles (frequency less than 0.01) found only in the HIV-1 positive group or only in the HIV-1 negative group. None of these alleles, whether common or low frequency, showed significant differences when comparing the HIV-1 positive group to the HIV-1 negative group (Tables S1 and S2).
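To make the reported effect sizes easier to read, the sketch below shows how an odds ratio and a 95% confidence interval can be obtained from a 2x2 table of allele counts. It assumes the common logit (Woolf) interval; the text states only that PASW was used, so the exact interval method is an assumption, and the counts are hypothetical. With this orientation, an OR below 1 means the allele is relatively more frequent in the HIV-1 negative group, which matches the direction of the ORs reported above.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval.

    a: copies of the allele in the HIV-1 positive group
    b: copies of all other alleles in the HIV-1 positive group
    c: copies of the allele in the HIV-1 negative group
    d: copies of all other alleles in the HIV-1 negative group
    All four cells must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical allele counts (not study data).
or_, lo, hi = odds_ratio_ci(a=10, b=290, c=8, d=32)
```

When a cell is zero or very small, an exact method such as Fisher's test is the usual alternative, as noted in the statistical analysis.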

HLA class I haplotype frequency comparison between HIV-1 positive and negative groups
We compared 2-locus and 3-locus inferred haplotype frequencies in the HIV-1 positive groups and HIV-1 negative groups of our cohorts. HLA class I haplotypes predicted by the EM algorithm with an estimated frequency higher than or equal to 0.02 in our Uyghur cohort are presented in Figure 3. Among the inferred HLA-A-B haplotypes, the estimated frequency of A*0201-B*5101 (OR = 0.02, 95% CI: 0.003 to 0.21; p < 0.01, q < 0.01) was significantly higher in the HIV-1 negative group than in the HIV-1 positive group. In our Han cohort, Cw*0304-B*4001 (OR = 0.21, 95% CI: 0.08 to 0.57; p = 0.003, q = 0.039) had a significantly higher estimated frequency in the HIV-1 negative group (Figure 4). We also compared estimated 2-locus and 3-locus haplotypes with frequencies greater than or equal to 0.01 in HIV-1 positive and HIV-1 negative groups (Tables S3 and S4). No statistically significant differences (with a q value lower than 0.2) were observed in these inferred haplotypes in the Uyghur or Han cohort.
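The q values quoted above come from a false discovery rate adjustment of the p values. The following sketch shows the standard Benjamini-Hochberg step-up adjustment and the q < 0.2 acceptance threshold used in this study; the p values in the example are hypothetical, and the function name is illustrative.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p values (q values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that each q value
    # is the minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p values from four comparisons (not study data).
p_vals = [0.03, 0.001, 0.5, 0.002]
q_vals = bh_q_values(p_vals)
accepted = [i for i, q in enumerate(q_vals) if q < 0.2]
```

Because q depends on the number of comparisons in the family, the same p value can yield different q values in analyses with different numbers of tested alleles or haplotypes.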

Discussion
Immunogenetic determinants of host susceptibility and resistance to HIV-1 infection have been an area of intense investigation. Increasingly, studies have reported associations between HIV-1 acquisition and low resolution HLA genotypes [20,21,32,33,34,35]. However, 2-digit level data appear insufficient to provide a complete understanding of the role of HLA genotypes [36]. Four-digit HLA typing may provide more insight into associations between HLA class I profiles and HIV-1 susceptibility and/or disease progression [37]. Few investigations have dealt with the association of HIV-1 susceptibility and 4-digit HLA class I genotypes in Chinese populations.
In this study, we investigated high resolution HLA class I allele distribution and EM predicted haplotype frequencies in Chinese Uyghur and Chinese Han populations. Uyghur blood samples were collected from the Xinjiang Uyghur Autonomous Region, China. Han blood samples were collected from the Sichuan, Henan and Shanxi provinces. Xinjiang, Sichuan, and Henan are among the five provinces in China with the highest reported cases of HIV [38]. Our cohort included members of a large population from a broad region of China; however, analysis of the association of genetic factors with HIV susceptibility was supplemented with previously conducted research on single ethnic groups from specific regions within China [20,21,39].
In view of the accumulated evidence suggesting that HLA alleles influence both HIV-1 susceptibility and AIDS progression, the allele frequencies and inferred haplotype frequencies were compared between HIV-1 positive and HIV-1 negative groups within each cohort. In the Han cohort, the frequency of HLA-B*5201 was significantly higher in the HIV-1 negative group. In a study of haemophiliacs in Italy using 2-digit level HLA typing, HLA-B52 was found to be negatively associated with the risk of HIV seroconversion [40]. Epitopes restricted by HLA-B52 are poorly defined. Only one epitope (RMYSPTSI, p24 143-150) has been identified in the HIV Molecular Immunology Database as being restricted by HLA-B*5201 (http://www.hiv.lanl.gov/content/immunology). HLA-Cw18 presents a similar epitope (VRMYSPVSI, p24 142-150). HLA-Cw*18 was found to be protective against HIV in Zambia [41] and Tanzania [37]. Further study of HLA-B*5201 may be relevant to populations beyond China.
In our Uyghur cohort, the frequency of HLA-B*5101 was significantly higher in the HIV-1 negative group. A novel HIV-1 vaccine candidate designed for B*5101 positive individuals protected mice against chimaeric HIV-1 virus [43]. This supports the idea that B*5101 confers some protection from HIV-1 infection. It has also been reported that HLA-B*27, B*57 and B*51 are associated with delayed onset of AIDS [42]. However, the strong association between B*51 and slow disease progression may decline with the adaptation of HIV-1 to HLA induced immune responses [44]. While there may be different mechanisms that result in an allele being associated with protection or slowed disease progression, it is possible that B*5101 is associated with both.
Based on our analysis of published data on the Chinese population [13,14,15,16,17,18,19,22,23,24,25,26], allele frequencies of HLA-B*5101 (averaged to 0.044 in northern Chinese vs. 0.038 in southern Chinese) and HLA-B*5201 (averaged to 0.030 in northern Chinese vs. 0.043 in southern Chinese) are not much different in northern and southern Chinese. However, HLA-A*0301, which appeared to be a protective factor in our Han cohort, is a relatively more frequent allele in northern Chinese than in southern Chinese (averaged to 0.042 in northern Chinese vs. 0.005 in southern Chinese). The varied distribution of these alleles in the Chinese population and their associations with HIV-1 susceptibility should be taken into consideration in HIV vaccine development and clinical trial design in China.
HLA phenotypes can be grouped into the Bw4 and Bw6 serotypes according to amino acid residues at positions 77-83 in the α1 domain. In a study with subjects who were mainly Caucasians, infected men carrying the Bw4 allele had lower transmission rates to their female sex partners [45]. The frequencies of Bw4 homozygosity, Bw6 homozygosity and Bw4/Bw6 heterozygosity did not differ significantly between the HIV-1 positive and negative groups in our cohort. In another study in a Chinese population in Chongqing, Bw4/Bw6 serotype frequencies were not significantly different when HIV-1 positive and negative groups were compared [21]. It is possible that in Chinese populations, Bw4/Bw6 heterozygosity or homozygosity may be more associated with AIDS progression [46] or HIV-1 transmission than with susceptibility to infection.
Our data further indicated that 81 Han and 32 Uyghur individuals were homozygous at one or more loci at the 4-digit level. Within this homozygous group, 21 subjects were homozygous at two loci, and 10 were homozygous at all three HLA class I loci at the 4-digit level. Homozygosity at HLA class I loci was not found to be related to susceptibility to HIV-1 in our cohort.
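The homozygosity counts above can be tallied directly from four-digit typing results. The sketch below is a minimal illustration on hypothetical typings, not on study data; the data layout (one dictionary per individual, keyed by locus) is an assumption made for the example.

```python
from collections import Counter

def homozygous_loci(typing):
    """Return the loci at which both four-digit alleles are identical."""
    return [locus for locus, (a1, a2) in typing.items() if a1 == a2]

def homozygosity_tally(cohort):
    """Count individuals by their number of homozygous class I loci (0 to 3)."""
    return Counter(len(homozygous_loci(t)) for t in cohort)

# Hypothetical typings for three individuals (not study data).
toy_cohort = [
    {"A": ("A*0201", "A*0201"), "B": ("B*5101", "B*4001"), "Cw": ("Cw*0304", "Cw*0702")},
    {"A": ("A*1101", "A*1101"), "B": ("B*4001", "B*4001"), "Cw": ("Cw*0304", "Cw*0304")},
    {"A": ("A*0201", "A*2402"), "B": ("B*5201", "B*1501"), "Cw": ("Cw*0102", "Cw*0702")},
]
tally = homozygosity_tally(toy_cohort)
at_least_one = sum(n for k, n in tally.items() if k >= 1)
```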
Considering the significant increase in the frequency of allele HLA-B*5101 in the Uyghur HIV-1 negative group, it is not surprising that there was a statistically significant increase in haplotype A*0201-B*5101 in the Uyghur HIV-1 negative group. Alleles classified in the same supertype can recognize and present epitopes with similar residue hydrophobicity in specific positions. When comparing frequencies of all ten identified supertypes [47], only supertype B62s was found to have a significantly higher frequency (OR = 0.54, 95% CI: 0.36 to 0.81; p = 0.003, q = 0.012) in the Han HIV-1 negative group than in the Han HIV-1 positive group.
In the neighbor-joining (NJ) phylogenetic tree, our Han cohort was grouped in the northern Chinese cluster. Though it had the lowest estimated genetic distance to another northern Han population from Hebei province, the inclusion of subjects from Sichuan may have shifted our cohort slightly closer to Southern Han. Our Uyghur cohort, also in the northern Chinese cluster, included subjects distributed throughout the Xinjiang region. It had the lowest estimated genetic distance to another Uyghur population from Yining, followed by German and Mongolian populations. Previous studies showed that the Uyghur population was a hybrid of 60% European ancestry and 40% East Asian ancestry [48]. The proximity of the Uyghur to the German and Mongolian populations is consistent with these studies and the migration history of the Uyghur population.
Little is known about the mechanism of HLA protection from HIV-1 infection, and it is important to note that association is not equal to a causal relationship. We used statistical analysis to infer associations between alleles and HIV-1 infection or resistance, and this carries a risk of error. Our results are potentially confounded by factors including ethnic population subsets, uneven allele distribution, and proportionally smaller HIV negative cohorts. While efforts were made to select appropriate controls, they may have had a coincidentally lower exposure to HIV. These factors may have minimized some allele associations. Due to the cross-sectional nature of this study, allele associations with protection from HIV-1 infection cannot be distinguished from associations with a delay in seroconversion, and survivorship bias is a risk. We believe the associations presented here are important to consider in our further study, design, and development of HIV vaccine candidates for the Chinese population.

Ethics statement
The ethnicity of the subjects' parents was identified using questionnaire surveys. All participants provided written informed consent for participation in the study. Subject enrollment and sampling were approved by the Institutional Review Board (IRB) of the National Center for AIDS and Sexually Transmitted Disease Control and Prevention (NCAIDS) or the IRB of Tangdu Hospital as appropriate. The investigation was conducted in accordance with humane and ethical research principles of Nankai University, China.

Population
A total of 488 Chinese adults were enrolled in this study at NCAIDS in Beijing and at the Tangdu Hospital affiliated with the Fourth Military Medical University in Xi'an, China. Subjects were confirmed unrelated by collection and comparison of grandparent names. Three hundred and twenty-seven Han subjects were recruited from Henan, Shanxi and Sichuan provinces, including 255 HIV-1 seropositive individuals and 72 HIV-1 seronegative individuals. One hundred and sixty-one Uyghur subjects were recruited from the Xinjiang Uyghur Autonomous Region, including 142 HIV-1 seropositive individuals and 19 HIV-1 seronegative individuals. The subjects ranged from 18 to 65 years of age. To minimize the effects of non-HLA genetic diversity and differences in the risk of exposure, cohorts from each region or province included HIV-1 seropositive and seronegative subjects.

HLA class I typing
Genomic DNA was extracted from 0.5 ml frozen whole blood using a Qiagen FlexiGene DNA Kit (QIAGEN China Co., Ltd, Shanghai), according to the manufacturer's protocol. HLA-A, HLA-B, and HLA-Cw allele level typing was performed by directly sequencing exon 2 and exon 3 amplification products of HLA loci. Most of the ambiguous allele combinations could be resolved by amplifying and sequencing exon 4 of the HLA-A and HLA-B loci. The HLA specific primers were a gift from Dr. W. H. Hildebrand at the University of Oklahoma Health Sciences Center. The sequence based typing was accomplished with Assign version 3.5 (Conexio Genomics, Applecross, Western Australia, Australia). Class I Direct to High Res SSP UniTray (Invitrogen, USA) was used to resolve the remaining ambiguities.

Statistical analysis
The HLA class I allele frequencies were calculated by direct counting of the sequencing based typing results. ARLEQUIN software v3.11 [49] was used to estimate the haplotypes (2-locus and 3-locus) using the maximum-likelihood method, with the iterative EM algorithm. POPTREE2 was used to calculate genetic distance between different populations. A neighbor-joining phylogenetic tree based on allele frequencies of HLA-A and HLA-B loci was constructed with POPTREE2 [50].
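For readers unfamiliar with EM-based haplotype inference, the sketch below illustrates the general technique for two loci: unphased genotypes are resolved into compatible haplotype pairs, weighted by the current frequency estimates (E-step), and the frequencies are then re-estimated from the expected counts (M-step). It is a simplified illustration under Hardy-Weinberg assumptions with a single uniform starting point, not ARLEQUIN's implementation; the genotypes and function names are hypothetical.

```python
from collections import defaultdict

def phase_options(geno_a, geno_b):
    """Distinct unordered haplotype pairs compatible with a two-locus genotype."""
    a1, a2 = geno_a
    b1, b2 = geno_b
    options = {
        tuple(sorted([(a1, b1), (a2, b2)])),
        tuple(sorted([(a1, b2), (a2, b1)])),
    }
    return sorted(options)  # one option unless both loci are heterozygous

def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes by EM."""
    options = [phase_options(ga, gb) for ga, gb in genotypes]
    haplotypes = sorted({h for opts in options for pair in opts for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for opts in options:
            # E-step: weight each phase resolution by the product of its
            # haplotype frequencies (the factor of 2 for heterozygous pairs
            # is shared by all options of a genotype and cancels out).
            weights = [freq[h1] * freq[h2] for h1, h2 in opts]
            total = sum(weights)
            for (h1, h2), w in zip(opts, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by chromosome count.
        new_freq = {h: counts[h] / n_chrom for h in haplotypes}
        delta = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if delta < tol:
            break
    return freq

# Hypothetical unphased HLA-A / HLA-B genotypes (not study data).
toy = [
    (("A*0201", "A*0201"), ("B*5101", "B*5101")),
    (("A*1101", "A*1101"), ("B*4001", "B*4001")),
    (("A*0201", "A*1101"), ("B*5101", "B*4001")),  # phase is ambiguous
]
hap_freq = em_haplotype_frequencies(toy)
```

In this toy example the two homozygous individuals pull the ambiguous double heterozygote toward the A*0201-B*5101 / A*1101-B*4001 resolution. Production software typically also uses multiple starting points, because EM can converge to local maxima.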
The comparison of Chinese Han HLA class I allele frequencies between HIV-1 seropositive and seronegative groups was performed using the Cochran-Mantel-Haenszel Chi-squared test with PASW Statistics 18.0. Since our Chinese Han subjects were selected from three provinces in China, the location was selected as a layer variable. The comparison of Chinese Uyghur HLA class I allele frequencies between HIV-1 seropositive and seronegative groups was performed using the Chi-squared test. The comparisons of Chinese Han and Uyghur HLA class I haplotype frequencies between HIV-1 seropositive and seronegative groups were also performed using the Chi-squared test. Fisher's exact test (two tailed) or Yates's continuity correction was applied when necessary. The strength of an association was indicated by an odds ratio (OR) with a 95% confidence interval (CI) calculated with PASW software. In consideration of the multiple comparisons performed, p values were used to calculate q values to control the false discovery rate (FDR) with the method described by Benjamini and Hochberg. Comparisons with q values less than 0.2 were accepted as statistically significant associations.
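The stratified comparison used for the Han cohort can be sketched as follows: one 2x2 table per province is combined into a single Cochran-Mantel-Haenszel statistic (1 degree of freedom) and a Mantel-Haenszel pooled odds ratio. This is a minimal illustration with hypothetical counts; whether PASW's reported statistic includes the continuity correction is not stated here, so the `correction` flag is an assumption left to the reader.

```python
import math

def cmh_test(strata, correction=False):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables (a, b, c, d).

    a: allele copies, HIV-1 positive    b: other alleles, HIV-1 positive
    c: allele copies, HIV-1 negative    d: other alleles, HIV-1 negative
    Returns (chi-squared statistic, p value, Mantel-Haenszel pooled OR).
    """
    sum_a = sum_expected = sum_var = 0.0
    or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_expected += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    diff = abs(sum_a - sum_expected)
    if correction:
        diff = max(diff - 0.5, 0.0)
    stat = diff ** 2 / sum_var
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, or_num / or_den

# Hypothetical allele counts for three strata (not study data).
toy_strata = [(10, 90, 20, 80), (5, 95, 15, 85), (8, 92, 12, 88)]
stat, p_value, or_mh = cmh_test(toy_strata)
```

Stratifying by location keeps regional differences in allele frequency and in HIV-1 prevalence from being mistaken for an allele association.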

Supporting Information
Table S1 Distribution of common HLA-A*, Cw* and B* alleles among Chinese Uyghur HIV-1 positive and negative subjects. (DOCX)