Association of HLA-A and Non-Classical HLA Class I Alleles

The HLA-A locus is surrounded by HLA class Ib genes: HLA-E, HLA-H, HLA-G and HLA-F. HLA class Ib molecules are involved in immuno-modulation with a central role for HLA-G and HLA-E, an emerging role for HLA-F and a yet unknown function for HLA-H. Thus, the principal objective of this study was to describe the main allelic associations between HLA-A and HLA-H, -G, -F and -E. Therefore, HLA-A, -E, -G, -H and -F coding polymorphisms, as well as HLA-G UnTranslated Region haplotypes (referred to as HLA-G UTRs), were explored in 191 voluntary blood donors. Allelic frequencies, Global Linkage Disequilibrium (GLD), Linkage Disequilibrium (LD) for specific pairs of alleles and two-loci haplotype frequencies were estimated. We showed that HLA-A, HLA-H, HLA-F, HLA-G and HLA-G UTRs were all in highly significant pairwise GLD, in contrast to HLA-E. Moreover, HLA-A displayed restricted associations with HLA-G UTR and HLA-H. We also confirmed several associations that were previously found to have a negative impact on transplantation outcome. In summary, our results suggest complex functional and clinical implications of the HLA-A genetic region.


Introduction
A complete Human Leukocyte Antigen (HLA) match is associated with better long-term survival in transplant patients; however, the effects of HLA-A, -B and -DRB1 matching are unequal and also depend upon the organ being transplanted. HLA-A mismatches were found to have less influence on kidney allografts than HLA-B mismatches [1]. This difference could reflect the higher number of HLA-B alleles compared to HLA-A and/or the cytotoxic T-cell allo-repertoire for HLA-A and -B antigens, as lower frequencies of Cytotoxic T-Lymphocyte precursors (CTLp) are found for HLA-A mismatches than for HLA-B mismatches [2,3]. Few amino acid polymorphisms at functionally important positions of the antigen recognition site of the HLA-A molecule have been shown to have a significant influence on clinical outcome [4]. In liver transplant patients, however, only HLA-A mismatching was associated with a negative effect on patient survival and a higher recurrence of the hepatitis C virus, whereas HLA-B and -DR mismatches had no effect [5]. In Lung Transplantation (LTx), the risk of Bronchiolitis Obliterans Syndrome only increased in the presence of two HLA-A mismatches, not for 0 or 1 mismatch or for HLA-B or -DR mismatches [6]. These data support the hypothesis that HLA-A mismatching leads to a particular immunological situation in organ transplantation compared to other loci. The HLA-A molecule displays specific features compared to HLA-B that could account for its particular role: HLA-A and HLA-B alleles carry an unpaired cysteine at different positions of the cytoplasmic tail domain (at positions 339 and 325, respectively), which is reported to be involved in recycling, targeting for degradation and influencing recognition by NK receptors, as well as in the formation of fully folded MHC class I dimers in exosomes [7]. Other genes, in association/linkage disequilibrium with HLA-A, could also induce either alloreactivity or tolerance.
Indeed, the HLA-A locus on chromosome 6 is surrounded by HLA class Ib (non-classical) genes: HLA-E (at the centromeric end), HLA-G, -H and HLA-F (at the telomeric end).
HLA class Ib molecules, HLA-G, HLA-E, HLA-H and HLA-F, are described for their involvement in immune effector cell interactions. They display specific features that differentiate them from HLA class Ia (classical) molecules: a highly conserved peptide binding groove with restricted coding polymorphism and specific expression patterns [8,9]. Moreover, HLA-G lacks the canonical MHC class I cytoplasmic tail, as it is truncated due to the deletion of 6 amino acids. Yet, HLA-G is capable of initiating cellular signaling events upon cross-linking, as the HLA-G protein associates with lipid rafts and can thus induce cell proliferation [10,11].
The immuno-modulating and proliferative properties of HLA class Ib are of great interest in the clinical field, particularly in transplantation. Notably, higher HLA-G expression has been correlated with a better clinical outcome after heart, kidney and Hematopoietic Stem Cell Transplantation (HSCT) (see [12] and [13] for reviews). However, the exploration of HLA-G protein expression is currently limited because micro-environmental expression analysis requires invasive techniques. A further complication is that HLA-G is expressed as many different protein isoforms, generated by alternative splicing, dimer formation, and association with β2 microglobulin. Unfortunately, these isoforms are not accurately discriminated by commercial antibodies although, most importantly, they are reported to display differential functionality. Primary HLA-G mRNA splicing generates both membrane-bound (HLA-G1 to -G4) and soluble (HLA-G5 to -G7) isoforms [14], and membrane-bound isoforms can also be shed via proteolytic cleavage by matrix metalloproteinase-2 [15]. The HLA-G1 and -G5 isoforms display a similar structure to HLA class Ia molecules (comprising α1, α2 and α3 domains), whereas HLA-G2 and -G6 are only composed of α1 and α3 domains, -G3 and -G7 contain an α1 domain, and -G4 is composed of α1 and α2 domains [14]. These truncated isoforms appear to be processed differently, as HLA-G2, -G3 and -G4 are sensitive to endoglycosidase H, suggesting a non-involvement of the Golgi apparatus, in contrast to HLA-G1 and classical HLA class I [16]. Both HLA-G1 and -G5 isoforms reduce NK cell cytotoxicity with an additive effect [17]. Two studies reported functional activity for HLA-G2, -G4 and -G6 [16], whereas conflicting results were published concerning HLA-G3 [16,18]. Besides these differentially spliced isoforms, HLA-G forms disulfide-linked dimers via Cys-42 of the HLA-G α1 domain.
Most of the dimers are reported to be expressed on the cell surface, but soluble HLA-G is also able to dimerize [19]. HLA-G dimerization, which may depend on cell-surface density, has an impact on HLA-G receptor affinity [20]. Hence, possibly due to the many HLA-G isoforms and the lack of appropriate serological tools, conflicting results have been reported regarding the association between HLA-G expression and clinical outcome [21][22][23][24]. As a consequence, many studies focus on HLA-G genetic polymorphisms to predict clinical outcome.
The many HLA-G regulatory region polymorphisms (5'URR and 3'UTR) are in strong Linkage Disequilibrium (LD) with each other, forming haplotypes referred to as HLA-G UTRs in this manuscript [25]. These HLA-G UTRs are well conserved in different physiological populations, as well as in different clinical cohorts [22][23][24]. Several clinical studies have described favorable or deleterious effects of HLA-G alleles and HLA-G UTRs. For example, in LTx, HLA-G*01:06~UTR2 was associated with a worse evolution of cystic fibrosis and HLA-G*01:04~UTR3 impaired long-term survival [24]. HLA-G*01:06 was associated with pregnancy complications in French [26] and Singaporean women [27], and an increase of HLA-G*01:04 frequency was observed in couples with recurrent miscarriages [28,29]. Conversely, HLA-G*01:04 has been shown to be protective in acute renal rejection and end-stage renal disease (chronic inflammatory disease) [30].
Another non-classical HLA class I molecule, HLA-E, regulates NK cell function and is the only known ligand for the C-type lectin receptor CD94 combined with different NKG2 subunits expressed on NK and CD8+ αβ T cells [31,32]. HLA-E mRNA is expressed in most tissues [33] and its cell surface expression is controlled by leader peptides of HLA class I molecules, except HLA-F, the highest affinity being reported for HLA-G. Similarly to HLA-G, HLA-E's intrinsic peptide repertoire is constrained by a larger number of primary anchor sites compared to most class Ia molecules, which further supports a specific immunological role for this molecule [34]. HLA-E can also bind peptide ligands from stress proteins and viruses [9,35]. Although HLA-E possesses fewer anchor sites than HLA-G, the HLA-E peptide repertoire appears to be more restricted than that of HLA-G. Equal frequencies of the HLA-E coding alleles (HLA-E*01:01 and HLA-E*01:03, R107G, associated with low and high protein expression, respectively) suggest that balancing selection may have maintained heterozygosity of high- and low-expressing alleles [36][37][38][39]. In HSCT, HLA-E*01:03 homozygosity was associated with a lower risk of Graft Versus Host Disease (GVHD), decreased mortality and a higher disease-free survival rate, whereas HLA-E*01:01 homozygosity was associated with an increased risk of bacterial infection (see [13] for a review). In LTx, HLA-E*01:03 was correlated with Chronic Lung Allograft Dysfunction (CLAD) occurrence and HLA-E homozygosity was associated with worse survival compared to heterozygosity (Di Cristofaro et al., submitted).
Much less is known about the other HLA class Ib molecules. HLA-F mRNA is expressed in most cell types and the protein is localized in the Endoplasmic Reticulum (ER) and Golgi apparatus. Unlike other HLA class I molecules, the function of HLA-F seems to be independent of peptide loading in the ER [31,40], as it is exclusively expressed in an open conformer form (OC, deprived of binding peptide and/or β2 microglobulin): no peptide could be eluted from either membrane-bound or intracellular captured HLA-F proteins [41]. HLA-F is expressed at the surface of the extravillous trophoblast and is upregulated upon activation in monocytes, NK, B and T cells, but not in T-Reg CD4+ CD25+ cells [33,42]. HLA-F seems to possess different functions: interaction with ILT-2 and ILT-4 and with KIR receptors [9,43]; interaction with HLA-E in the OC form, possibly modifying the interaction with its receptors [44]; and an implication in an original pathway of antigen (Ag) cross-presentation by HLA class I molecules [45]. HLA-F is characterized by 4 alleles defined at a 4-digit resolution, with F*01:01 representing 90% of allelic diversity.
Finally, HLA-H has been defined as a pseudogene because of a deletion in exon 4, causing a premature stop codon before cysteine 164, impairing the Ag-presenting function [46]. Of note, HLA-H may be mistaken for the HFE gene (associated with haemochromatosis) [9, 47,48], but, although the HFE protein is similar to the proteins of the MHC class I family, it does not seem to inhibit NK cell activity [48]. Increased HLA-H nucleotide variation could be the consequence of balancing selection acting on nearby loci or of relaxed functional pressures on this pseudogene [49]. No data is currently available concerning HLA-F or HLA-H alleles and clinical outcome.
Many interactions are described between HLA class Ib molecules and immunological effector cells, particularly NK cells. Thus, the principal objective of this study is to describe the main allelic associations between the HLA class Ib loci and HLA-A in a healthy population and to address their putative involvement in the synergy of these proteins.

Clinical samples
Ethylene Diamine Tetra Acetate-anticoagulated (EDTA) peripheral blood samples and sera were collected after written informed consent from 191 voluntary blood donors from southeastern France. The donations were collected in accordance with the French blood donation regulations and ethics and with the French Public Health Code (art L1221-1). The samples were approved by the French Committee for the Protection of Persons in Biomedical Research (CCPPRB) and the entire French collection was also declared to and approved by the French Ministry of Higher Education and Research. Blood samples were anonymized according to the French Blood Center (Etablissement Français du Sang) procedure. Genomic DNA (gDNA) was extracted from a 200-μl total blood sample using the QIAamp DNA Blood Mini Kit (Qiagen, Courtaboeuf, France) according to the manufacturer's instructions.

HLA-A genotyping
Luminex™ technology (HLA-A, One Lambda LABType SSO, InGen, Chilly Mazarin, France) was used to determine HLA-A alleles at a low resolution level, i.e. 2-digit or first field level typing, using the manufacturer's kit.

[Table columns: Locus; NCBI Gene ID; Start (bp from pter); End (bp from pter); Size (bp); Distance from HLA-A (kbp); Location relative to HLA-A]
HLA-E and HLA-F genotyping

HLA allelic assignment
The HLA-A, -G, -H, -F and -E allelic assignments were based on the HLA sequences listed in the official IMGT/HLA database [2].
HLA-G and HLA-H SNPs were automatically converted from output files (.txt) exported from GeneMapper 4.0 (HLA-G) or from Codon Code Aligner (HLA-H) into alleles and HLA-G UTRs using an in-house computer program readable by the 'Phenotype' application of the Gene[Rate] computer tool package (http://hla-net.eu/tools/) [60][61][62]. The HLA-G interpretation table used by Gene[Rate] is described in [53] and the HLA-H interpretation table is described in S1 Table. HLA-E and HLA-F genotypes were assigned by the NGSengine program (GenDx, The Netherlands).
For allelic and two or more loci frequency estimations, all HLA-G and HLA-A putative homozygotes were considered either true homozygotes or heterozygotes for both the observed allele and an undefined or undetectable ('blank') allele. Due to the deletion encompassing the HLA-H locus, all HLA-H homozygous samples were encoded either as true homozygotes, as heterozygotes for both the observed allele and a 'blank' allele or as heterozygotes for both the observed allele and a deleted ('del') allele. For HLA-E and HLA-F, this procedure was not applied because typing was performed from NGS data, which allows the identification of any unknown allele (clonal amplification).
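The encoding rule described above can be illustrated with a minimal Python sketch. It only enumerates the alternative genotypes that a putative homozygote may stand for; it is not the Gene[Rate] implementation, and the function name `candidate_genotypes` and the two locus sets are hypothetical names introduced for illustration. The labels 'blank' and 'del' follow the text.

```python
# Hypothetical helper, not part of Gene[Rate]: lists the genotypes that
# an observed typing result may correspond to under the rule in the text.

# Loci typed from NGS data: observed homozygotes are taken at face value.
NGS_TYPED_LOCI = {"HLA-E", "HLA-F"}
# Loci covered by a known deletion: a 'del' allele is also possible.
DELETION_LOCI = {"HLA-H"}


def candidate_genotypes(locus, allele_1, allele_2):
    """Return the list of genotypes compatible with an observed typing."""
    # Heterozygotes and NGS-typed loci are not re-encoded.
    if allele_1 != allele_2 or locus in NGS_TYPED_LOCI:
        return [(allele_1, allele_2)]
    # Putative homozygote: true homozygote, or heterozygote carrying an
    # undefined or undetectable ('blank') allele.
    candidates = [(allele_1, allele_1), (allele_1, "blank")]
    if locus in DELETION_LOCI:
        # For HLA-H, the second chromosome may also carry the deletion.
        candidates.append((allele_1, "del"))
    return candidates
```

In a frequency estimation, each candidate would then be weighted by its likelihood under the current allele frequencies rather than chosen a priori.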

Statistical analysis
Missing data at a locus led to the exclusion of the concerned sample from further analyses at the given locus. No multiple imputations were used.
Allelic and two or more loci haplotype frequencies were estimated using an EM algorithm implemented in the Gene[Rate] computer tools [62]. Deviations from Hardy-Weinberg equilibrium (HWE) were tested using a nested likelihood model [63]. Two-loci Global Linkage Disequilibrium (GLD) was assessed by a likelihood-ratio test on the frequency estimations [60,61]. LD for specific pairs of alleles was provided as a list of standardized residuals [64] for each observed haplotype, and an absolute value greater than 2 was considered to be a significant deviation [60,61].
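To make the estimation procedure more concrete, the following Python sketch shows a textbook EM algorithm for two-locus haplotype frequencies from unphased genotypes, followed by a Pearson-type residual for each haplotype. It is an illustration under stated assumptions, not the Gene[Rate] code: it ignores 'blank' and 'del' alleles, the function names are hypothetical, and the residual formula (observed minus expected count, divided by the square root of the expected count) is an assumption that may differ from the exact definition used in [64].

```python
from collections import defaultdict
from itertools import product
from math import sqrt


def em_two_locus(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies by EM.

    genotypes: list of ((a1, a2), (b1, b2)), one unphased pair per individual.
    Returns a dict {(allele_a, allele_b): frequency}.
    """
    alleles_a = sorted({a for ga, _ in genotypes for a in ga})
    alleles_b = sorted({b for _, gb in genotypes for b in gb})
    haps = list(product(alleles_a, alleles_b))
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point

    for _ in range(n_iter):
        counts = defaultdict(float)
        for (a1, a2), (b1, b2) in genotypes:
            # The two possible phase resolutions of this genotype.
            phases = [((a1, b1), (a2, b2)), ((a1, b2), (a2, b1))]
            weights = [freq[h1] * freq[h2] for h1, h2 in phases]
            total = sum(weights)
            # E-step: split the individual between phases in proportion
            # to their probability under the current frequencies.
            for (h1, h2), w in zip(phases, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected counts divided by the number of chromosomes.
        new_freq = {h: counts[h] / (2 * len(genotypes)) for h in haps}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq


def pearson_residuals(freq, n_individuals):
    """Residual of each haplotype count against linkage equilibrium."""
    n_chrom = 2 * n_individuals
    p_a, p_b = defaultdict(float), defaultdict(float)
    for (a, b), f in freq.items():
        p_a[a] += f  # marginal allele frequency at the first locus
        p_b[b] += f  # marginal allele frequency at the second locus
    residuals = {}
    for (a, b), f in freq.items():
        expected = n_chrom * p_a[a] * p_b[b]
        residuals[(a, b)] = (n_chrom * f - expected) / sqrt(expected)
    return residuals
```

Under this sketch, a haplotype whose residual exceeds 2 in absolute value would be flagged as deviating from linkage equilibrium, in line with the threshold stated above.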

HLA-A, -E, -H, -F and -G allelic frequencies
HLA-A, -E, -H, -F and -G allele frequencies are given in Table 2. The null hypothesis of Hardy-Weinberg equilibrium was never rejected. Sixteen 2-digit HLA-A alleles were found, and their frequencies were concordant with previously published data in France [61], with HLA-A*02, HLA-A*03, HLA-A*24 and HLA-A*01 being the most frequent alleles. Six 4-digit HLA-G alleles and 8 HLA-G UTRs were observed and their frequencies were similar to previously published data [22,53]. Putative undefined or undetectable blank 'alleles' represented at the most 1.6% of HLA-G UTRs according to the EM estimation (

Two loci linkage disequilibrium and haplotypes
Analysis of two-loci Global Linkage Disequilibrium (GLD) (Table 3) showed that HLA-A, HLA-H, HLA-F, HLA-G and HLA-G UTRs were all in highly significant pairwise GLD (p<0.001). In contrast, the HLA-E locus was not in significant GLD with any of them. However, when defined at a 4-digit resolution level, HLA-E displayed a significant GLD with HLA-F, but the p-value of 1% was not significant after adjusting for multiple testing (Table 3). Haplotypes were estimated between each pair of loci and their combined frequencies are given in Table 4. Strikingly, a strong LD was found between HLA-A, HLA-G UTRs and HLA-H, as all haplotypes involving these pairs of loci with a frequency above 3% were in significant LD. These loci were also found to be highly associated with HLA-F, as most haplotypes with a frequency above 3% were also in LD.
The detailed description of the haplotypes observed with a frequency greater than 3% (or 2% for those involving HLA-A; Table 5) reveals significant allelic associations between HLA-A, HLA-G UTRs and HLA-H, respectively, with very high standardized residual values. HLA-F also displays significant allelic associations with HLA-A, HLA-G UTRs and HLA-H, but to a lower extent. In contrast, HLA-G alleles display few significant associations with HLA-A, HLA-H, HLA-F and HLA-G UTRs, with the exception of HLA-G*01:03, *01:04 and *01:06. Finally, HLA-E reveals some significant associations, but the standardized residuals are very close to the threshold value for significance. The tight association found between HLA-A, HLA-G UTRs and HLA-H is further supported by the three-loci haplotype frequency estimation shown in Table 6. HLA-G UTR3 is associated with HLA-H*del and HLA-A*23/24 alleles, HLA-G UTR6 is associated with HLA-H*02:02 and HLA-A*29, and HLA-G UTR7 is associated with HLA-H*02:04new and HLA-A*11, whereas HLA-G UTR2 displays no exclusive association.

Discussion
This study was devoted to the analysis of allelic associations between the HLA class Ia locus HLA-A, and the HLA class Ib loci HLA-E, -H, -G, HLA-G UTRs and -F which are all physically located within a region of 724 kb on chromosome 6. HLA-A seems to be involved in transplantation outcome in a different way from HLA-B or HLA-DRB1 [1,3]. This singular feature could be due to the HLA-A molecule per se and/or be a consequence of yet unknown patterns of linkage disequilibrium between HLA-A and nearby class Ib genes.
Currently, interactions between HLA class Ib molecules are barely known, and even less so in the clinical setting. Genetic phylogeny is one approach that might shed some light on this subject, as it has been suggested that 1) HLA-A and -H share a more recent ancestor than HLA-A with either HLA-B or -C and that 2) HLA-E is more closely related to HLA-H than to -F or -G [46]. Moreover, HLA-E alleles are thought to pre-date most of the HLA-A and HLA-B polymorphisms [36]. However, an evolutionary perspective on class Ib polymorphisms would probably not be very informative regarding clinical implications. Another possible approach to tackle this question was adopted here by looking at patterns of association previously described to have a clinical impact. Furthermore, our results add to the interest of analyzing HLA-G UTRs in clinical studies, as some of them display very specific associations, such as HLA-G UTR-4 associated with HLA-A*03, HLA-H*02:04 and HLA-F*01:03; HLA-G UTR-6 associated with HLA-A*29, HLA-H*02:02 and HLA-F*01:01; and HLA-G UTR-7 associated with HLA-A*11, HLA-H*02:04new and HLA-F*01:01. In turn, HLA-G UTR-2 certainly needs further investigation as it was observed in association with several HLA-A, -H and -F alleles, probably consisting of at least 2 clusters with respect to HLA-H and HLA-G.
Of interest, several studies have described a deleterious association of HLA-G*01:06~UTR2 and HLA-G*01:04~UTR3 in lung transplantation and pregnancy [24,[26][27][28][29]. In this study we confirmed the exclusive association of HLA-G*01:04 with HLA-G UTR3, HLA-A*23/*24 and the deletion encompassing HLA-H. HLA-E and HLA-F did not display an exclusive association with HLA-G*01:04, as this allele was associated with both F*01:01:03 (6%) and F*01:03:01 (3.3%), and with E*01:01:01 (5.4%) and E*01:03 (combined frequencies: 5.7%). Thus, the exclusive association between HLA-A*23/*24 and E*01:01 [36] could not be confirmed here. Whether the worse LTx progression reported for the HLA-G*01:04 allele is due to HLA-A*23/24 or to the deletion (encompassing HCG4P5, HLA-U, HLA-K, HCG4B, DDX39BP, MCCD1P1, HLA-T, HLA-H, HCG4P7, MICF and HCG3-2) remains to be explored. Notably, this deletion, always associated with HLA-A*23 and A*24, was also found in HLA-G*01:01 carriers (4.4%). Interestingly, the HLA-H locus, translated into a truncated protein, has a signal sequence similar to that of HLA-A, except that the second position (Valine) is identical to HLA-G and the third position displays a specific amino acid (Val>Leu) compared to other HLA class I proteins (Table 7) [2]. Moreover, the amino acid in the 10th position is identical to HLA-A (Leu) and different from HLA-G (Phe). This characteristic of the HLA-G signal peptide has been put forward to explain a higher affinity of the HLA-G-derived nonamer-HLA-E complex for the CD94/NKG2C receptor complex [66]. Whether the HLA-H signal sequence is loaded by HLA-E and preferentially protects cells from NK lysis remains to be explored, but the absence of HLA-H could potentially have an impact on the interaction of HLA-E with its receptors.
Concerning HLA-G*01:06, it displayed a restricted association pattern as it was associated with HLA-G UTR-2, HLA-A*01, HLA-H*02:01:01 and HLA-F*01:01:03, and was in positive LD with HLA-E*01:01:01 and negative LD with HLA-E*01:03:02. Thus, the negative clinical effect of HLA-G*01:06 could be the combined result of lower immune tolerance, due to reduced HLA-G expression expected from the association with HLA-G UTR-2 [23] and to reduced HLA-E expression expected from the association with HLA-E*01:01 [35], and of yet unknown features of HLA-H*02:01:01 and HLA-F*01:01:03. Confirming such a hypothesis demands a much larger cohort, because of the low allelic frequency of HLA-G*01:06 (3.9%), and would also greatly benefit from functional studies on HLA-H and HLA-F.
Looking now at the peculiar pattern of linkage disequilibrium revealed by HLA-E, its large physical distance from other HLA class Ib loci (>660 kb), as well as balancing selective forces acting on this locus, could be put forward to explain the absence of GLD. However, the functional and clinical implications of HLA-E alleles not being associated with other HLA class Ib loci remain to be formally investigated.
In an HLA class Ia matching transplant context, however, the significant association observed between E*01:01 and A*01 on the one hand, and between E*01:03 and A*29 on the other hand, as shown in [36], implies that no HLA-E heterozygosity should be expected for most HLA-A*01 and A*29 carriers.
Finally, besides protein modifications and differential levels of expression that could impact their respective interaction and affinity with their specific receptors, non-classical HLA polymorphisms could also influence antibody production. Non-native forms of HLA-Ib, caused by alternative splicing or loss of β-2 microglobulin, may expose immunogenic cryptic epitopes and trigger an immune response [67]. Naturally occurring anti-HLA-E antibodies were shown to mimic anti-HLA-Ia antibodies and such cross-reactive binding was reported to affect the screening of anti-HLA-Ia antibodies in renal and liver transplant recipients [68]. Anti-HLA-F IgG, associated with the inflammatory status of systemic lupus erythematosus, may represent a biomarker of primary importance in transplantation, although their exact function has not yet been fully elucidated [69].
In conclusion, this study offers a unique view of non-classical HLA class I allelic associations that may be helpful for future functional studies in deciphering HLA class Ib protein interactions. Since HLA-A typing of recipients and donors is systematically and routinely performed in immunogenetics laboratories, these results could also potentially improve transplant matching strategies.