Comparing HLA Shared Epitopes in French Caucasian Patients with Scleroderma

Although many studies have analyzed HLA allele frequencies in patients with scleroderma (SSc) from several ethnic groups, none has been done in French Caucasian patients, and none has evaluated which of the two common amino acid sequences, 67FLEDR71, shared by HLA-DRB susceptibility alleles, or 71TRAELDT77, shared by HLA-DQB1 susceptibility alleles in SSc, contributes most to disease development. HLA-DRB and DQB typing was performed for a total of 468 healthy controls and 282 patients with SSc, allowing FLEDR and TRAELDT analyses. Results were stratified according to patients' clinical subtypes and autoantibody status. Moreover, standardized HLA-DRβ1 and DRβ5 reverse transcriptase TaqMan PCR assays were developed to quantify β1 and β5 mRNA in 20 subjects with HLA-DRB1*15 and/or DRB1*11 haplotypes. The FLEDR motif is highly associated with diffuse SSc (χ² = 28.4, p<10⁻⁶) and with anti-topoisomerase antibody (ATA) production (χ² = 43.9, p<10⁻⁹), whereas the TRAELDT association is weaker in both subgroups (χ² = 7.2, p = 0.027 and χ² = 14.6, p = 0.0007, respectively). Moreover, the FLEDR motif association among patients with diffuse SSc remains significant only in the ATA-positive subgroup. The risk of developing ATA-positive SSc is higher with a double dose of FLEDR than with a single dose (adjusted standardized residuals of 5.1 and 2.6, respectively). The increase in the FLEDR motif is mostly due to the higher frequency of HLA-DRB1*11 and DRB1*15 haplotypes. Furthermore, FLEDR is always carried by the most abundantly expressed β chain: β1 in HLA-DRB1*11 haplotypes and β5 in HLA-DRB1*15 haplotypes. In French Caucasian patients with SSc, FLEDR is the main presenting motif influencing ATA production in dcSSc. These results open a new field of potential therapeutic applications aimed at interacting with the FLEDR peptide binding groove to prevent ATA production, a hallmark of severity in SSc.


Introduction
Systemic sclerosis (SSc), or scleroderma, is a chronic autoimmune disease of unknown aetiology, characterized by fibrosis, vascular alterations and autoantibodies. Scleroderma is stratified by clinical criteria into two subtypes: limited cutaneous SSc (lcSSc), mainly affecting the hands, arms and face, and diffuse cutaneous SSc (dcSSc), which affects a large area of the skin and carries an increased risk of cardiac disease, interstitial lung disease, renal crisis and early death. Specific autoantibodies correlate with subtypes. Anticentromere antibodies (ACA) and anti-topoisomerase antibodies (ATA) are hallmarks of lcSSc and dcSSc respectively, although they are not always detected and are sometimes observed in the other subtype. Other autoantibodies, such as anti-RNA polymerase, antifibrillin (AFA) or anti-U3RNP, are associated with particular clinical manifestations [1,2,3,4].
HLA-DR molecules are heterodimers composed of a nonpolymorphic α chain, encoded by DRA, and a highly polymorphic β chain (β1 to β5) encoded by DRB1*, DRB3*, DRB4* or DRB5* genes. In HLA-DR15 molecules, the α chain associates either with a β1 or a β5 chain, coded respectively by DRB1*15 or DRB5*01. The 67FLEDR71 motif is only expressed on the β5 chain. Conversely, in HLA-DR11 molecules, the 67FLEDR71 motif is expressed on the β1 chain encoded by DRB1*11 and not on the co-expressed β3 chain encoded by DRB3*02. Consequently, two heterodimeric HLA-DR molecules are co-expressed at the cell surface: for HLA-DR15, the α chain associates with β1 or β5, and for DR11, the α chain associates with β1 or β3. However, often only heterodimers with β1 chains are taken into consideration for antigen presentation, as the other chains are often considered accessory chains. If protein expression is indeed higher for β1 chains, a person carrying the FLEDR shared epitope on the HLA-DR15 haplotype would have lower expression and deficient FLEDR-restricted antigen presentation compared to a person carrying it on the HLA-DR11 haplotype.
In the current study, we evaluated for the first time the relative risk conferred by one or two 67FLEDR71 and/or one or two 71TRAELDT77 motifs among our French Caucasian patients with SSc. Because previous studies have shown that classification by clinical subtype and by autoantibody profile can influence results, we conducted a complete analysis with both classifications to determine whether FLEDR and/or TRAELDT motifs influence susceptibility to the limited or diffuse form of SSc and/or to autoantibody subsets. We further evaluated the importance of the β chain carrying the FLEDR shared epitope by quantifying levels of β chain transcripts.

Patients and Controls
We included 282 Caucasian patients with SSc with no overlapping disease, all with defined SSc type (94 diffuse SSc; 188 limited SSc) and auto-antibody status known for 243 of them (80 without ATA or ACA, 89 with ACA and 74 with ATA). Patients were enrolled in collaboration with 7 French hospitals from Paris, Marseille and Lille and fulfilled the criteria of LeRoy for SSc [10]. Altogether 235 women and 47 men were analysed for HLA-DR and DQ allele frequencies. Mean age at diagnosis was 49.3 ± 14.3 years (mean ± SD).
In parallel, 468 healthy Caucasian controls were recruited at the Centre d'Examen de Santé de l'Assurance Maladie (CESAM), Marseille, France (N = 154, mean age at inclusion was 52.5 ± 7.5 years [mean ± SD]) and at Claude Huriez Hospital in Lille, France (N = 314, mean age at inclusion was 35.4 ± 10.2 years [mean ± SD]). None of the controls had any symptom or familial history of autoimmune disorder.

Ethics Statements
Controls from Marseille are registered at INSERM under the Biomedical Research Protocol number RBM-04-10.
Controls from Lille were drawn from a DNA bank created in the biological laboratory in 1993. Written consent forms obtained according to the Declaration of Helsinki were signed [11].

HLA-DRB1 and DQB1 Typing
Genomic DNA was extracted from peripheral blood by standard methods and HLA genotyping was performed either by using PCR-RFLP, as previously described for samples received in Lille or sequence-specific oligonucleotide (SSO) HLA-DRB1 and HLA-DQB1 typing kit, (RELI SSO, Dynal, Invitrogen, Bromborough, Wirral, UK) according to manufacturer's protocol as previously described for samples received in Marseille [12,13]. Allelic typing for DRB1 was done at Etablissement Français du Sang (EFS, Marseille, France) for samples received in Marseille.

RNA Preparation and cDNA Synthesis
Total RNA was extracted from PBMC using GenElute Mammalian total RNA Miniprep Kit (Sigma-Aldrich, St Louis MO, USA). For each RNA sample, DNase I digestion was included as recommended (Sigma-Aldrich, St Louis MO, USA).  Concentration of RNA was spectrophotometrically measured and quality was ascertained with Agilent Analyser (Agilent RNA 6000 Nano Kit, Agilent Technologies, Germany). Total RNA was reverse-transcribed using Enhanced Avian HS RT-PCR kit (Sigma-Aldrich, St Louis MO, USA) according to manufacturer's recommendations.
As quantitative comparisons between mRNAs are not reliable, since no perfect reference gene exists, we cloned PCR products from each assay (DRB1*15, DRB5*01, DRB1*11 or DRB3*02) in pCR4-TOPO plasmids (pCR4-TOPO TA Cloning Kit, Invitrogen, Carlsbad, CA) to calibrate standard curves from these constructs, as explained below. After bacterial transformation and plasmid extraction (QIAprep Spin Miniprep kit, Qiagen), each construct was checked by sequencing (Cogenics, Grenoble, France). Plasmids were then linearized with NcoI restriction enzyme (Invitrogen, Carlsbad, CA) and concentrations for the 4 constructs were adjusted by using a common set of primers and probe specific to the plasmid sequence and designed as follows: forward primer 5′-CAGAATTAACCCTCACTAAAGGGACT-3′, reverse primer 5′-ATAGGGCGAATTGAATTTAGCG-3′ and probe 5′FAM-TCCTGCAGGTTTAAACGAATTCGCC-TAMRA3′. The 4 standards were then serially diluted. Normalization of concentrations between the 4 constructs was possible because inserts were very similar in size (210 bp and 193 bp for HLA-DRB1*15 and -DRB5*01 respectively, and 114 bp and 111 bp for HLA-DRB1*11 and -DRB3*02 respectively). Standards were run with their own set of primers and probes. Each curve was adjusted for PCR efficiency, and the relative quantities of HLA-DRB1*15/DRB5*01 mRNA and HLA-DRB1*11/DRB3*02 mRNA were calculated.
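The standard-curve quantification described above can be illustrated with a minimal Python sketch. It assumes the usual linear relation between threshold cycle (Ct) and log10 of template copies; the function names, the dilution series and all Ct values are hypothetical and are not taken from the study.

```python
from math import log10

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def efficiency(slope):
    """Amplification efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1 / slope) - 1

def copies_from_ct(ct, slope, intercept):
    """Interpolate a copy number from a sample Ct on a fitted standard curve."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical serial dilutions of two plasmid standards (NOT study data).
dilutions = [1e6, 1e5, 1e4, 1e3]
ct_beta5_standard = [15.0, 18.4, 21.8, 25.2]
ct_beta1_standard = [16.0, 19.3, 22.6, 25.9]

slope5, intercept5 = fit_standard_curve(dilutions, ct_beta5_standard)
slope1, intercept1 = fit_standard_curve(dilutions, ct_beta1_standard)

# Hypothetical sample Ct values measured with each assay's own primers and probe.
beta5_copies = copies_from_ct(20.0, slope5, intercept5)
beta1_copies = copies_from_ct(23.0, slope1, intercept1)
ratio_beta5_to_beta1 = beta5_copies / beta1_copies
```

Because each transcript is read off its own curve, differences in amplification efficiency between assays are absorbed by the fitted slopes rather than by a shared reference gene.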

Statistical Analysis
Pearson chi-square tests with the adjusted standardized residual method [15] were used to compare frequencies and give an indication of the strength of the association for each shared epitope (FLEDR and TRAELDT) between different groups. The χ² test indicates whether there is an association between two categorical variables, but it does not in itself indicate the strength of the association. In order to identify the cells (one dose, two doses…) with the largest differences between observed and expected frequencies, we used the adjusted standardized residuals. These differences are referred to as residuals, and they can be standardized and adjusted to follow a Normal distribution with mean 0 and standard deviation 1 [2]. The adjusted standardized residuals, d_ij, are given by:

$$d_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}\,\left(1 - \frac{n_i}{N}\right)\left(1 - \frac{n_j}{N}\right)}}$$

where O_ij is the observed frequency in the cell in row i and column j, E_ij is the expected frequency in the cell in row i and column j, n_i is the total frequency for row i, n_j is the total frequency for column j, and N is the overall total frequency. The larger the absolute value of the residual, the larger the difference between the observed and expected frequencies, and therefore the more significant the association between the two variables.
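As an illustration of the adjusted standardized residuals described above, the following Python sketch computes the Pearson χ² statistic and the residual of every cell of a contingency table. The function name and the counts are hypothetical and serve only to show the arithmetic; they are not study data.

```python
from math import sqrt

def adjusted_residuals(table):
    """Return (chi2, residuals) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Adjust the raw residual so that it is approximately N(0, 1).
            scale = sqrt(expected
                         * (1 - row_totals[i] / n)
                         * (1 - col_totals[j] / n))
            residual_row.append((observed - expected) / scale)
        residuals.append(residual_row)
    return chi2, residuals

# Hypothetical counts (NOT study data).
# Rows: controls, patients. Columns: 0 dose, 1 dose, 2 doses of a motif.
counts = [[50, 40, 10],
          [20, 40, 40]]
chi2, d = adjusted_residuals(counts)
```

In a table with two rows, the residuals of the two groups in a given column are equal in magnitude and opposite in sign, so it is enough to read the patient row.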
An adjusted standardized residual >1.96 indicates that the number of cases in that cell is significantly larger than would be expected if the null hypothesis were true, with a significance level of 0.05. An adjusted residual < −2.0 indicates that the number of cases in that cell is significantly smaller than would be expected if the null hypothesis were true (Table 3).

FLEDR Motif is Highly Associated with dcSSc
HLA-DRB1 allelic typing was performed on 468 controls and 282 patients with SSc (94 patients with dcSSc and 188 patients with lcSSc) to identify the FLEDR motif coded by some HLA-DRB alleles (Table 1 and Table S1). Similarly, HLA-DQB1 allelic typing was performed on the same number of controls and patients to identify the TRAELDT motif coded by some HLA-DQB1 alleles and the amino acid present at position 30 (Table 2 and Table S2). The presence of FLEDR or TRAELDT is scored as 1 dose when the motif is on one haplotype, 2 doses when it is on both haplotypes and 0 dose when it is on neither. When patients are divided by clinical subtypes (Table 3), standardized adjusted residuals give an indication of association for each motif, as explained above. The presence of 2 doses of FLEDR confers a more significant risk of developing dcSSc than a single dose (respective adjusted standardized residuals: 3.7 and 2.3, see Methods). For TRAELDT, having 2 doses also confers a more significant risk of developing dcSSc than a single dose (respective adjusted standardized residuals: 2.9 and −1.4).
However, if we compare which of the two motifs is associated with the highest risk of developing dcSSc, the FLEDR motif confers a higher risk than TRAELDT (respectively χ² = 28.4, p<10⁻⁶ and χ² = 7.2, p = 0.027).
The increase of FLEDR motif in patients with dcSSc is mostly due to the higher frequency of HLA-DRB1*11 and DRB1*15 alleles (Table S1).
As it had been previously proposed that tyrosine at position 30 (30Y) could strengthen the TRAELDT association, we analysed this possibility. Indeed, the 30Y residue added to TRAELDT slightly strengthens the TRAELDT association with dcSSc (χ² = 10.5, p = 0.0015, data not shown), but the association remains weaker than the FLEDR association (χ² = 28.4, p<10⁻⁶).
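The dose coding used in this section (0, 1 or 2 doses of a motif per individual) can be sketched in Python as follows. The carrier list below is an assumption limited to the two allele groups named in the text (DRB1*11 and DRB1*15); the complete list of FLEDR-carrying alleles is the one given in Table 1, and the function name and example genotypes are hypothetical.

```python
# Illustrative, incomplete carrier list (assumption): only the two allele
# groups named in the text. Table 1 holds the full list of FLEDR alleles.
FLEDR_CARRIER_PREFIXES = ("DRB1*11", "DRB1*15")

def motif_dose(genotype, carrier_prefixes=FLEDR_CARRIER_PREFIXES):
    """Count how many of the two haplotypes (0, 1 or 2) carry the motif."""
    if len(genotype) != 2:
        raise ValueError("a genotype must list exactly two alleles")
    # str.startswith accepts a tuple of prefixes; True counts as 1 in sum().
    return sum(allele.startswith(carrier_prefixes) for allele in genotype)

# Hypothetical genotypes, one tuple of two DRB1 alleles per individual.
double_dose = motif_dose(("DRB1*11:04", "DRB1*15:01"))
single_dose = motif_dose(("DRB1*11:04", "DRB1*07:01"))
zero_dose = motif_dose(("DRB1*03:01", "DRB1*07:01"))
```

Counting doses per individual in this way yields the 0, 1 and 2 dose columns of the contingency tables analysed with the χ² test.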

FLEDR Motif is Highly Associated with Patients with ATA
Patients were divided into three subgroups according to their autoantibody status (Table 4, Tables S3 and S4): patients without ACA or ATA (Abneg); patients with ACA (ACApos); patients with ATA (ATApos).
A double dose of the FLEDR motif is significantly more frequent among patients with ATA (28.4%) than among healthy controls (9.0%). The risk of developing ATA-positive SSc is higher when a double dose of FLEDR is present (adjusted standardized residual: 5.1) than when only one dose is present (adjusted standardized residual: 2.6). The increase of FLEDR motif in patients with ATA is, as for patients with dcSSc, mostly due to the higher frequency of HLA-DRB1*11 and DRB1*15 alleles (Table S3). The risk of developing ATA-positive SSc is higher when a double dose of TRAELDT is present (adjusted standardized residual: 3.6) than when only one dose is present (adjusted standardized residual: −1.2). The increase of TRAELDT motif in patients with ATA is mostly due to the higher frequency of HLA-DQB1*03 alleles (Table S4).

Prevalence of FLEDR in ATA Positive dcSSc and Not ATA Negative dcSSc
Although ATA is a hallmark of dcSSc, not all patients with dcSSc have ATA. We therefore wondered whether the FLEDR association was mostly with the clinical subtype (dcSSc, ATA+ or ATA−) or with the autoantibody profile (Table 5). Among the 94 patients with dcSSc, autoantibody status was available for 85, of whom 52 were ATA positive and 33 ATA negative. When both groups (dcSSc ATApos and dcSSc ATAneg) are compared to healthy controls, the only remaining association is with the group positive for ATA. Again, FLEDR is the most prevalent shared epitope in this group (χ² = 35.2, p<10⁻⁷) compared with TRAELDT (χ² = 9.4, p = 0.009). For both motifs, having 2 doses confers the highest risk of developing dcSSc with ATA, as 2 doses have the highest standardized adjusted residuals (respectively for FLEDR and TRAELDT: 4.8 and 2.6).
However, TRAELDT alone is statistically associated with dcSSc and with ATA positivity, as the frequency of individuals negative for FLEDR (0 dose) but positive (1 or 2 doses) for TRAELDT is statistically higher in patients with dcSSc (χ² = 12.9, p = 0.0003) and in patients with ATA (χ² = 19.4, p = 0.000013) compared to controls (Tables S5 and S6). This last result indicates that TRAELDT has its own contribution to disease susceptibility and autoantibody specificity.

Increased mRNAs of Beta Chains with FLEDR Motif in Patients and Controls
The HLA-DRB1*1104/DRB1*1501 genotype was present in 7 out of 19 patients with ATA positive dcSSc with double dose FLEDR but rarely seen in healthy controls with double dose FLEDR (3/42, two-tailed Fisher's test, p<0.007, data not shown).
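For readers who wish to re-run this comparison, a two-tailed Fisher's exact test can be written with the Python standard library alone. The sketch below assumes the 2×2 table implied by the counts above (7 of 19 patients versus 3 of 42 controls carrying the genotype) and the common convention of summing all tables that are no more probable than the observed one; the function name is hypothetical and the software used in the study is not stated.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at most as likely as the observed one
    # (a small tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Rows: patients, controls. Columns: genotype present, genotype absent.
p_value = fisher_exact_two_tailed(7, 19 - 7, 3, 42 - 3)
```

With these counts the sketch gives a two-tailed p of about 0.007.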
The FLEDR motif is expressed on the β1 chain in some HLA-DR haplotypes (e.g. DR11), and on the β5 chain in other HLA-DR haplotypes (e.g. DR15, see Table 1). Ratios of HLA-DRB5*01 (FLEDR-positive)/HLA-DRB1*15 (FLEDR-negative) mRNA expression were compared in 14 patients with SSc and 6 healthy controls (Figure 1). Levels of β5 mRNA (FLEDR-positive) were systematically higher than levels of β1 mRNA, with a mean ratio of 5.6. This difference was observed in patients and controls.
As a control of our experiments and validation of our quantitative comparisons, we checked whether levels of β1 mRNA were higher than β3 mRNA in HLA-DR11 haplotypes, as previously described. Indeed, analyses on 5 subjects (2 patients with SSc and 3 controls) confirmed an increased quantity of β1 mRNA (FLEDR-positive), with a mean ratio of 4.5, compared to β3 mRNA (data not shown).

Discussion
The most important genetic factors for scleroderma, as for many autoimmune diseases, are in the HLA locus. Indeed, results from worldwide cohorts of patients with SSc, from early HLA allele frequency analyses to more recent genome-wide association studies, all show HLA association with SSc. Most studies have used the classification of diffuse or limited disease, while others, more recently, have analysed patient subsets classified according to their autoantibody status. Overall, the consensus is that HLA-DRB1*11:04 is a risk factor for diffuse SSc and presence of ATA in numerous Caucasian populations [3,5], as are HLA-DRB1*15:02 and DRB1*08:02 in Asian populations [7]. Sequence identity on SSc-associated HLA-DRβ chains led to a model of a FLEDR shared epitope at amino acid positions 67 to 71 [16]. Similarly, SSc-associated HLA-DQβ1 chains have a common 71TRAELDT77 sequence, which has been associated with patients with dcSSc and ATA. Although many studies have analysed HLA allele frequencies in several ethnic groups including North American Caucasians, Japanese and Choctaw Indians [2], no study has evaluated the strength of the FLEDR motif compared to the TRAELDT motif. Moreover, to our knowledge, this is the first study analysing HLA class II shared epitopes in French Caucasian patients with SSc stratified by clinical subsets and autoantibodies. We confirmed two previous findings in Japanese and Korean patients [16] and showed that FLEDR is the most prevalent shared epitope for the most severe type of SSc (dcSSc ATA+) in French Caucasians. We further showed that the association is linked to autoantibody production rather than clinical subtype. Finally, by using the standardized adjusted residual method, we showed that having 2 doses of FLEDR confers the highest risk of developing dcSSc with ATA.
Since the TRAELDT association is weaker than the FLEDR association, and since TRAELDT is often in linkage disequilibrium (LD) with FLEDR, one could think that the TRAELDT association is solely due to LD. However, we showed that TRAELDT has its own contribution to disease susceptibility and autoantibody specificity. A tyrosine at position 30 (30Y), previously shown to reinforce the strength of the TRAELDT association with patients with ATA [9], added significance to the association, which nevertheless remained weaker than the FLEDR association. Similarly, on the DRβ chain, using a novel approach that consists in subdividing the chain into biologically relevant smaller sequence features and their variant types, Karp et al. showed that additional amino acid residues played a role in the risk of developing SSc [17]. Risk alleles had the sequence 26F-28D_30Y_37Y_67F/I_70D_71R_86V. However, this additional effect of amino acid residues on the DRβ chain was not as obvious in our cohort, as most double dose FLEDR genotypes were HLA-DRB1*11:04/*15:01, and HLA-DRB1*15:01 does carry FLEDR but not the whole risk sequence described above (i.e. 86V).
A parallel can be made between the shared epitope in rheumatoid arthritis (RA) and in SSc. In patients with RA, the shared epitope 70QK/RRAA74 has a strong effect on the risk of developing anti-citrullinated peptide antibody (ACPA) positive RA, whereas this association, although significant, is weaker in patients without ACPA [18]. This observation supports the accepted hypothesis that a particular HLA shared epitope presents particular auto-antigenic peptides, triggering a T helper cell response, which in turn leads to production of a particular autoantibody.
The FLEDR motif, by its position in the peptide binding groove, is a determinant of efficient presentation of antigenic peptides to T cells. Interestingly, we showed that this motif would be overexpressed when carried by DRβ5 chains in HLA-DR15 molecules, as well as when carried by DRβ1 in HLA-DR11 molecules. Indeed, our results, very similar to those recently reported by Prat et al. in patients with multiple sclerosis and controls [14], showed a 5-fold increase of the β5 chain at the mRNA level. Prat et al. further showed that this mRNA increase correlated with a two-fold increase at the protein level. β5 chains might then be sufficiently expressed at the cell surface to combine with the DRα chain to form additional DR molecules on the cell surface and be involved in antigen presentation [14]. These "accessory chains" serve to extend and complement the peptide repertoire of DRB1 in antigen presentation [19].
In the current study, not only did we confirm that some HLA-DRB1 and DQB1 alleles are highly associated with the production of ATA, but for the first time we statistically evaluated the strength of the common motifs shared by these HLA alleles. FLEDR is the main presenting motif involved in ATA production. A better knowledge of the motifs involved in peptide and autoantibody production could allow the development of blocking therapies to prevent ATA production, a hallmark of higher risk for severe organ involvement, for internal malignancies and for reduced survival. Indeed, in a recent publication, by using an in silico molecular docking program to screen a large "druglike" chemical library, Michels et al. were able to find small molecules capable of occupying the pockets along the I-A g7 binding groove in the NOD mouse model of spontaneous autoimmune diabetes [20].
The focus of this paper has been the amino-acid sequence from position 67 to 71 encoded by HLA-DRB and the amino-acid sequence from position 71 to 77 encoded by HLA-DQB1, but classification of HLA-DRB1 genotypes according to their risk should provide diagnostic markers for SSc. Indeed we found that HLA-DRB1*1104/DRB1*1501 was the most common FLEDR double dose genotype among patients with ATA and was rarely seen in healthy controls. This highlights a synergistic effect of different alleles from each haplotype. Double dose of shared epitope but also compound heterozygosity, may confer a higher risk to disease as it has been shown in rheumatoid arthritis, type 1 diabetes, celiac disease and systemic lupus erythematosus suggesting a common autoimmune pathway [19,21,22,23,24].
Future larger studies should also focus on classification by HLA genotypes at risk for SSc to provide help in clinical practice for a disease still difficult to diagnose.

Supporting Information
Table S1 HLA-DRB1 allele frequencies in patients with SSc divided by clinical subtypes and compared with healthy controls. a Odds ratios (OR) and confidence intervals [CI] are given only for HLA-DRB1 allele frequencies statistically higher (susceptibility alleles) or statistically lower (protective alleles) in patients compared with controls. b Otherwise statistics are noted as non-significant (ns). c p<0.05 after correction for multiple comparisons. (DOCX)
Table S2 HLA-DQB1 allele frequencies in patients with SSc divided by clinical subtypes and compared with healthy controls. a Odds ratios (OR) and confidence intervals [CI] are given only for HLA-DRB1 allele frequencies statistically higher (susceptibility alleles) or statistically lower (protective alleles) in patients compared with controls. b Otherwise statistics are noted as non-significant (ns). c p<0.05 after correction for multiple comparisons. (DOCX)
Table S3 HLA-DRB1 allele frequencies in patients with SSc divided by autoantibody status and compared with healthy controls. a Odds ratios (OR) and confidence intervals [CI] are given only for HLA-DRB1 allele frequencies statistically higher (susceptibility alleles) or statistically lower (protective alleles) in patients compared with controls. b Otherwise statistics are noted as non-significant (ns). c p<0.05 after correction for multiple comparisons.