The PTPN22 Locus and Rheumatoid Arthritis: No Evidence for an Effect on Risk Independent of Arg620Trp

Objectives The Trp620 allotype of PTPN22 confers susceptibility to rheumatoid arthritis (RA) and certain other classical autoimmune diseases. Other variants within the PTPN22 locus have been reported to alter risk of RA: protective haplotype ‘5’, haplotype group ‘6–10’ and susceptibility haplotype ‘4’. This suggests that PTPN22 variants other than R620W (rs2476601) may be independently involved in the pathogenesis of RA. Our aim was to further investigate this possibility. Methods A total of 4,460 RA cases and 4,481 controls, all European, were analysed. Single nucleotide polymorphisms rs3789607, rs12144309, rs3811021 and rs12566340 were genotyped over New Zealand (NZ) and UK samples. Publicly available Wellcome Trust Case Control Consortium (WTCCC) genotype data were used. Results The protective effect of haplotype 5 was confirmed (rs3789607; OR = 0.91, P = 0.016), and a second protective effect (possibly of haplotype 6) was observed (rs12144309; OR = 0.90, P = 0.021). The previously reported susceptibility effect of haplotype 4 was not replicated; instead a protective effect was observed (rs3811021; OR = 0.85, P = 1.4×10⁻⁵). Haplotypes defined by rs3789607, rs12144309 and rs3811021 coalesced with the major allele of rs12566340 within the adjacent BFK (B-cell lymphoma 2 (BCL2) family kin) gene. We therefore tested rs12566340 for association with RA conditional on rs2476601; there was no evidence for an independent effect at rs12566340 (P = 0.76). Similarly, there was no evidence for an independent effect at rs12566340 in type 1 diabetes (P = 0.85). Conclusions We have no evidence for a common variant additional to rs2476601 within the PTPN22 locus that influences the risk of RA. Arg620Trp is almost certainly the single common causal variant.


Introduction
Genome-wide association (GWA) scans have emphasised the importance of the PTPN22 gene, encoding the phosphatase LYP, in susceptibility to rheumatoid arthritis (RA) in European and European-ancestry populations, with HLA and PTPN22 locus SNPs dominating association at the genome-wide level. [1][2][3] The Arg620Trp variant of the protein tyrosine phosphatase-22 (PTPN22) gene, encoded by SNP rs2476601 (C>T), is a prominent determinant of some autoimmune phenotypes, including RA. Strong association of the Trp620 variant with RA, type 1 diabetes (T1D), Graves' disease (GD) and systemic lupus erythematosus (SLE) has been repeatedly demonstrated in European populations. [4] The Trp620 allele is very rare in Asian populations. [5] The Trp620 effect seems to be restricted to autoimmune phenotypes in which a defined auto-reactivity is evidenced by specific autoantibodies. The LYP protein is part of a complex that down-regulates signalling from the activated T-cell receptor (TCR), with biochemical studies suggesting that PTPN22 inhibits T-cell activation through dephosphorylation of the LCK and ZAP-70 kinases. [6] The Trp620 allotype of LYP is unable to interact with CSK, and generates a more active phosphatase that is more effective in inhibiting TCR signalling than the Arg620 allotype. [7] In the thymus this might result in the positive selection of autoreactive T-cells that would otherwise be deleted or, in the periphery, in reduced TCR signalling in T-regulatory cells and hence reduced regulation of autoreactive T-cells. The Trp620 allotype also impairs signalling from the B-cell receptor. [8] The PTPN22 gene maps to chromosome 1p13.2 within a haplotype block of conserved linkage disequilibrium (LD) spanning over 300 kb. [1]
To investigate the possibility that further genetic variants within PTPN22 have a role in RA, coding regions within the gene were resequenced in US individuals of white European ancestry and ten common haplotypes were tested for association with RA. [9] This analysis confirmed the predominant role in disease susceptibility conferred by the haplotype tagged by the Trp620 allele (minor allele of rs2476601: C>T) (haplotype '2'). The analysis also provided evidence for a second haplotype that increases risk of RA (haplotype '4') independent of the Arg620Trp variant. However, this effect was not replicated in Norwegian, Dutch or UK sample sets. [10][11][12] Interestingly, some variants in PTPN22 provided evidence for a protective association with RA. A haplotype uniquely defined by SNP rs12760457 was associated with protection from RA (haplotype '5'), [9] although this association was not replicated in UK and Dutch RA sample sets. [10,11] Whilst this association was evident in T1D, it was not independent of the Arg620Trp effect. [13] In RA, analysis of intermarker LD and extent of association of SNPs within the extended PTPN22 haplotype block in a GWA scan of pooled genomic DNA samples from New Zealand and the UK suggested the presence of a second susceptibility determinant that was not explained by LD with the Trp620 variant, and perhaps related to the protective 'haplotype 5' identified by Carlton et al. [1,9] Collectively, the previous reports [1,9] provided evidence for a second (protective) common RA risk allele or haplotype within the PTPN22 locus that may map outside of the PTPN22 gene. Indeed, association of a rare functional PTPN22 variant (Arg263Gln, rs33996649: G>A, minor allele frequency = 2.6% in Caucasians) with SLE and RA has been reported, with the A allele conferring a protective effect independent of rs2476601. [14,15]
Here, our aim was to further investigate the possibility of allelic heterogeneity at PTPN22 in RA, focusing on the possible existence of a common RA-protective haplotype independent of Arg620Trp. Despite replicating the protective association of haplotype '5' with RA, we found no evidence that this association was independent of rs2476601.

Results
The study of Carlton et al [9] first reported association of haplotype '5' with RA (P = 1.5×10⁻⁵) in 1122 cases and 1767 controls. In an attempt to replicate the association of haplotype 5 with RA, a haplotype 5-defining SNP (rs3789607) was genotyped over the separate NZ and UK (London) RA case-control sample sets (Table 1). These data were combined with those of Wesoly et al [11] and with imputed data from the publicly available Wellcome Trust Case Control Consortium (WTCCC) RA genome-wide association (GWA) scan sample set. [2] The data of Hinks et al [10] were not used as they overlap with those of the WTCCC. The combined analysis of the UK, Dutch and NZ data provided independent evidence supporting protective association of haplotype 5 with RA (M-H pooled OR = 0.91 [0.85-0.98], P = 0.016).
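As an illustration of how a Mantel-Haenszel (M-H) pooled odds ratio of this kind is formed from several case-control sample sets, the following is a minimal sketch. It is not the STATA routine used in this study, and the allele counts are hypothetical placeholders, not data from Table 1; the function name and the tuple layout are likewise assumptions made for the example.

```python
def mantel_haenszel_or(tables):
    """Fixed-effects Mantel-Haenszel pooled odds ratio.

    Each table is (a, b, c, d):
      a = minor-allele count in cases,    b = major-allele count in cases,
      c = minor-allele count in controls, d = major-allele count in controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d          # total alleles in this sample set
        numerator += a * d / n     # weighted "concordant" cross product
        denominator += b * c / n   # weighted "discordant" cross product
    return numerator / denominator


# Hypothetical allele counts for three sample sets (NOT the study data).
example_tables = [
    (180, 1020, 210, 990),
    (400, 2600, 450, 2550),
    (150, 850, 160, 840),
]
print(round(mantel_haenszel_or(example_tables), 3))
```

Each sample set contributes its own 2×2 cross products weighted by its size, so a large set such as the WTCCC influences the pooled estimate more than a small one. A pooled OR below 1 for the minor allele corresponds to a protective effect.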
Given some evidence for an RA protective effect conferred by haplotype 6 in the Carlton et al [9] sample sets (P = 0.15 in sample set 1; P = 5×10⁻⁵ in sample set 2; combined one-tailed P by Fisher's method <0.0001), we further investigated this finding in the NZ, UK (London) and WTCCC cohorts. Haplotypes 6-10 can be distinguished from haplotypes 1-5 by rs1217414. [ SNPs, none of which have been genotyped in CEPH CEU by HapMap (www.hapmap.org; release 23a), are required to distinguish haplotype 6 from haplotypes 7-10, meaning it would not be possible to use imputation to test these SNPs for association in the WTCCC data. Instead we used rs12144309 to investigate further association of the haplotype 6-10 group with RA; rs12144309, which maps outside of the region studied by Carlton et al [9], identifies one major haplotype within the 6-10 group (Figure 1). In the CEU samples, rs12144309 tagged one haplotype of frequency 14.2%, with the remaining 6-10 haplotypes constituting 5.8%. rs12144309 was tested for association with RA in the WTCCC dataset by use of the publicly available imputation data, and by genotyping across the NZ and UK (London) sample sets (Table 2). The resultant data (M-H pooled OR = 0.90 [0.82-0.98], P = 0.021) confirmed association of a protective haplotype within the haplotype 6-10 group with RA. However, it is not possible in this case to ascribe this association to the same haplotype identified by Carlton et al (haplotype 6). [9] These data could not be combined with those of Wesoly et al [11]; however, in their sample P for association of haplotype 6 with RA was 0.48 (frequency of 10.7% in cases, 11.9% in controls). Carlton et al [9] reported positive association of haplotype '4' (tagged by rs3811021) with RA (combined OR = 1.20, P = 0.009).
We genotyped this SNP in the NZ and UK London cohorts and analysed association with RA, with the inclusion of data from the WTCCC [2], Wesoly et al [11] and Viken et al [12] studies (Table 3). The previously reported susceptibility effect was not replicated; instead, a protective effect was observed (OR = 0.85, P = 1.4×10⁻⁵). Thus, within the PTPN22 gene there was replicated evidence for two protective effects, as defined by haplotype 5 and within the haplotype 6-10 group, and the protective effect we observed from haplotype 4. Analysis of association at the three tagging SNPs (rs3789607, rs3811021 and rs12144309) conditional on genotype at rs2476601 in the combined WTCCC and NZ samples revealed no evidence for association with RA at any of the SNPs independent of rs2476601 (all P > 0.05). Haplotypes 4, 5 and 6-10 are unrelated to each other over the physical boundaries of the PTPN22 gene [9] (Figure 1), and all contain Arg620. We invoked an alternative explanation to account for the protective effects in RA observed by us and Carlton et al [9]: that the protective effects are caused by a single allele in the PTPN22 haplotype block, but outside of the region encompassing PTPN22 that was assessed by Carlton et al [9] (Figure 1), and are not due to the Arg620 allele. The LD relationship of haplotypes 4, 5 and 6-10 with markers in the extended PTPN22 haplotype block was examined; all these haplotypes coalesce with the major allele of a group of markers exhibiting very strong inter-marker LD (Figure 1; r² > 0.90; rs12566340, rs7529353, rs11102691). These markers all map within the 3′ untranslated region (UTR) of the BFK (B-cell lymphoma 2 (BCL2) family kin) gene (C1orf178). We hypothesized that this group of markers was responsible for the protective effects of haplotypes 4, 5 and 6-10 within the PTPN22 region. rs12566340 was genotyped in the NZ RA sample set and imputed in the WTCCC RA sample set. Conditional analysis revealed weak evidence for association of rs12566340 independent of rs2476601 in the separate NZ and WTCCC datasets (P = 0.036 and 0.033, respectively).
Two-marker rs2476601-rs12566340 haplotypes were then estimated. In the NZ sample set, comparison of the risk conferred by the C-C haplotype to that conferred by the C-T haplotype (both haplotypes contain the major allele at rs2476601) suggested a marginal protective effect independent of rs2476601 (OR = 0.79, P = 0.05). However, this effect was not replicated in the WTCCC sample set (OR = 1.13, P = 0.06) (Table 4). Consistent with this, conditional analysis on the combined NZ/WTCCC dataset did not support association of rs12566340 independent of rs2476601 (P = 0.76).
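The combined P value quoted above for haplotype 6 in the two Carlton et al sample sets (P = 0.15 and P = 5×10⁻⁵, combined P < 0.0001 by Fisher's method) can be checked with a short calculation. The sketch below assumes the two P values are entered exactly as quoted; whether the original authors first converted them to one-tailed values is not stated here, so the result is indicative only. The function name is an assumption for the example.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper-tail probability has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half / i
        series += term
    return statistic, math.exp(-half) * series


# P values for haplotype 6 as quoted for sample sets 1 and 2.
statistic, combined_p = fisher_combined_p([0.15, 5e-5])
print(statistic, combined_p)
```

With these inputs the combined P value falls just below 0.0001, in line with the figure quoted in the text.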
Previously, >150 SNPs in the PTPN22 haplotype block and flanking 400 kb were genotyped in a British T1D case-control sample set with the aim of identifying putative T1D risk variants independent of rs2476601 [13]. There was no evidence for allelic heterogeneity, with rs2476601 remaining the best candidate for sole causal variant. However, none of the SNPs rs12566340, rs7529353 or rs11102691 was included in this analysis. We therefore genotyped rs12566340 over the same T1D samples previously studied by Smyth et al [13], with no evidence for an effect at rs12566340 independent of rs2476601 (P = 0.85). Equality of risk conferred by the C-C haplotype in comparison to the C-T haplotype at rs2476601-rs12566340 was also observed (Table 4; OR = 1.01 [0.94-1.09], P = 0.78), consistent with the absence of common allelic heterogeneity at PTPN22 in T1D.
We also examined the rare T-C haplotype of rs2476601-rs12566340, testing for an effect on disease risk using the T-T haplotype as reference. In T1D there was evidence for an independent protective effect of the C allele ( Recently Steck et al [16] published evidence for a 6-marker protective haplotype in T1D, defined by markers in C1orf178, from a case versus control analysis of chromosomes containing only the Arg620 allele. The haplotype was tagged by the minor allele of rs1539438 (A>G), which had been genotyped by Smyth et al [13] and its association with T1D shown to be dependent on rs2476601. We genotyped rs1539438 in the NZ RA sample set and tested for association independent of rs2476601 in the NZ samples, and in the publicly available WTCCC samples, with no evidence for independent association detected (P = 0.42 and 0.25, respectively).

Discussion
The Trp620 allele of the PTPN22 SNP rs2476601 is strongly associated with both RA and T1D. [4,13] Smyth et al [13] demonstrated that this allele explains the association of the PTPN22 locus with T1D. They analysed 46 PTPN22 SNPs, and 111 further SNPs from the PTPN22 haplotype block and 400 kb flanking the haplotype block. Our approach in RA differed from that taken by Smyth et al [13]: it was driven by the findings of Carlton et al [9] and began with the specific haplotypes that they reported to alter RA risk (haplotypes 4, 5 and 6-10). By generation of new data, and meta-analysis, we also found that these haplotypes influence the risk of RA, although with an opposing direction of effect in the case of haplotype 4. However, neither the haplotype-defining SNPs, nor rs12566340 (to whose major allele haplotypes 4, 5 and 6-10 converge), were independently associated with RA. There was also no evidence for independent association of this SNP with T1D (neither rs12566340, nor any surrogate SNPs, had previously been analysed by Smyth et al [13]). We conclude that it is unlikely that allelic heterogeneity at the PTPN22 locus, driven by common variants, exists in RA. The published functional data [7], including correlations between TCR signalling and carriage of the different alleles of Arg620Trp, strongly support the conclusion that this non-synonymous SNP is the causal variant. However, given that the entire LD region has not yet been resequenced, there is still a possibility that as yet unidentified variant(s) could play a role in disease etiology. It is important to note that our approach examined common Arg620Trp-independent variants. We did not address the question of whether or not there are rare Arg620Trp-independent variants, which are known to exist in RA and SLE at least (Arg263Gln). [14,15] There was evidence for inequality of risk for the rare T-C haplotype compared to the T-T haplotype in T1D, but this was not supported in the RA sample set. Further study of this rare haplotype will be difficult, owing to its scarcity. The 263Gln allele, which confers a protective effect in RA independent of Arg620Trp, [15] is nearly exclusively contained on haplotypes containing the major (620Arg) allele at rs2476601 in Caucasians, [15] meaning it cannot explain any possible independent protective effect of the T-C haplotype.
Table 2. Analysis of association of PTPN22 'haplotype 6-10' group (rs12144309: C>T) with rheumatoid arthritis.
Carlton et al [9] concluded that the Arg620Trp variant did not fully explain the association between PTPN22 and RA. Using a haplotype method of analysis and conditional logistic regression, they concluded that haplotype '4' (tagged by the G allele of rs3811021) was primarily responsible for their observation that Arg620Trp did not fully explain the association of the PTPN22 locus with RA. By testing for equality of risk between haplotypes containing rs2476601 and rs12566340, we found no evidence for an RA risk effect independent of rs2476601, nor for any of the haplotype 4, 5 and 6-10 tagging SNPs. Our apparently conflicting findings need to be considered in light of the heterogeneity in association with RA of haplotype 4 between the USA European-ancestry samples studied by Carlton et al [9] (susceptibility effect), and the British and European samples studied here (protective effect; note that the European population of NZ is predominantly derived from immigrants from Britain and Europe). Acknowledging that different statistical methods were used, it is possible that population-specific effects are obscuring investigation of an Arg620Trp-independent effect on RA risk at the PTPN22 locus. Certainly the LD between the haplotype '4' and '5'-defining SNPs differs between the samples studied by Carlton et al [9] (r² < 0.4 [9]) and the WTCCC and HapMap CEU samples (r² < 0.1). It is unlikely that the difference in haplotype 4 results is caused by fluctuation in control frequencies, as was previously noted with respect to observation of a protective haplotype in a study of the PTPN22 locus in Graves' disease [13,17]; changes in both the control and case rs3811021 frequencies are evident between the Carlton et al [9] and the newly analysed data (Table 3). Genotyping of rs3811021, rs12566340, rs2476601, rs3789607 and other relevant SNPs over additional USA and British/European sample sets is warranted.
The possibility of clinical heterogeneity between sample sets playing a role in the different outcomes of our and the Carlton et al studies should also not be overlooked. However, it is not possible to comprehensively consider this possibility presently, owing to the paucity of clinical data available for the relevant RA sample sets (refer to Samples and Methods and Carlton et al [9]).

Ethical Statement
Ethical approval for the NZ study was given by the Multi-Region and Lower South Ethics Committees, and for the UK London RA study by the Lewisham Hospital and Guy's and St. Thomas' Hospitals local research ethics committees. Participants with T1D were enrolled under study protocols approved by the Institutional Review Board of each UK institution that contributed (see http://www.childhood-diabetes.org.uk/grid.shtml), and the 1958 birth cohort controls for the T1D comparison were approved by the London Multiregion Ethics Committee. All subjects, or the parents/guardians of those considered too young to consent, gave written informed consent.

Genotyping and imputation
Subjects from the NZ and UK London sample sets were genotyped in this study using PCR-RFLP: rs3789607 using primers GGCTGTTTTATTTCCCCTGT and GAGCTAGTTTGCTATCACTG that result in cleavage of the 160 bp product into 100/60 bp fragments using TaqαI; rs12566340 using primers TGATCAATCTGATGGCAGTATATAGGACAA and CCTCAGTCATTTTTACCTTG that result in cleavage of the 210 bp product into 179/31 bp fragments using Tsp509I; and rs12144309 using primers ATGGCACCTCAGATGCATTA and AGTATTTACATATTTAATCCACCTGGAATC that result in cleavage of the 176 bp product into 145/31 bp fragments using TaqαI. rs1539438 was genotyped over the NZ samples by TaqMan (Applied Biosystems) using probe C_1900118_10 and primers GTAATTTTATTAAGAAATACTTCCT[C/T] and GACTTCTTAGGTCCTGCACATGGTA.
Subjects from the T1D sample set were genotyped with TaqMan, which was carried out in accordance with the manufacturers' protocols. All genotyping data were scored blind to case-control status; TaqMan genotyping was double scored by a second operator to minimize error.
Access to genotype data was granted by the WTCCC (www.wtccc.org.uk). Genotypes for rs12144309 and rs12566340 were imputed using IMPUTE v0.2.0 [19] with default parameters and a 10 Mb window centred on the PTPN22 locus, and were called using a quality threshold of 0.9.
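Calling imputed genotypes at a quality threshold can be pictured as follows. This is a minimal sketch that assumes each individual has three posterior genotype probabilities (for the AA, AB and BB genotypes) and that a genotype is called only when the largest probability is at least 0.9; the function name, the input layout and the use of "at least" rather than "greater than" are assumptions for illustration, not details taken from the imputation software.

```python
def call_genotype(probabilities, threshold=0.9):
    """Return the index of the called genotype (0 = AA, 1 = AB, 2 = BB),
    or None when no genotype reaches the quality threshold."""
    best = max(range(3), key=lambda i: probabilities[i])
    if probabilities[best] >= threshold:
        return best
    return None  # treated as a missing genotype


# Hypothetical posterior probabilities for three individuals.
individuals = [
    (0.97, 0.02, 0.01),   # confidently AA
    (0.50, 0.45, 0.05),   # ambiguous, left uncalled
    (0.02, 0.06, 0.92),   # confidently BB
]
calls = [call_genotype(p) for p in individuals]
print(calls)
```

Uncalled genotypes are dropped from the association test, which trades a small loss of sample size for fewer miscalled genotypes.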

Data analysis
All genotype data were checked for deviation from Hardy-Weinberg equilibrium (HWE) using http://ihg.gsf.de/cgi-bin/hw/hwa1.pl. The NZ control samples exhibited a small deviation from HWE for markers rs12566340 and rs12144309 (P = 0.009 and 0.02, respectively). Consequently, 20% of genotypes for each of rs12566340 and rs12144309 were repeated, with 100% concordance for both. SHEsis [20] was used to generate basic summary statistics (allelic and genotypic P values by Fisher's chi-square, testing for deviation from HWE). UNPHASED [21] was used to test for equality of risks between haplotypes and to test for association at one marker conditional upon a second. STATA 8.0 was used to calculate Mantel-Haenszel (M-H) pooled ORs and test for heterogeneity between datasets, using a fixed effects model in the absence, and a random effects model in the presence, of heterogeneity. The power to detect a putative independent causal effect of weak magnitude (OR = 1.2/0.83, alpha = 0.01) owing to rs12566340 was 97% for RA and 100% for T1D.
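A check for deviation from HWE can be sketched as a Pearson chi-square test with one degree of freedom on the three genotype counts. This is an illustrative assumption about the form of the test; the online tool cited above may report additional or different statistics, and the genotype counts below are hypothetical, not those of the NZ controls.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Assumes both alleles are present (a monomorphic marker would give a
    zero expected count). Returns (chi-square statistic, P value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control genotype counts (NOT the study data).
chi2, p_value = hwe_chi_square(420, 150, 30)
print(chi2, p_value)
```

A small P value, as seen for two markers in the NZ controls, can reflect genotyping error as well as chance, which is why re-genotyping a subset of samples is a sensible follow-up.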