A Refined Study of FCRL Genes from a Genome-Wide Association Study for Graves’ Disease

To pinpoint the exact location of the etiological variant/s present at 1q21.1 harboring FCRL1-5 and CD5L genes, we carried out a refined association study in the entire FCRL region in 1,536 patients with Graves’ disease (GD) and 1,516 sex-matched controls by imputation analysis, logistic regression, and cis-eQTL analysis. Among 516 SNPs with P<0.05 in the initial GWAS scan, the strongest signals associated with GD and correlated to FCRL3 expression were located at a cluster of SNPs including rs7528684 and rs3761959. And the allele-specific effects for rs3761959 and rs7528684 on FCRL3 expression level revealed that the risk alleles A of rs3761959 and C of rs7528684 were correlated with the elevated expression level of FCRL3 whether in PBMCs or its subsets, especially in CD19+ B cells and CD8+ T subsets. Next, the combined analysis with 5,300 GD cases and 4,916 control individuals confirmed FCRL3 was a susceptibility gene of GD in Chinese Han populations, and rs3761959 and rs7528684 met the genome-wide association significance level (Pcombined = 2.27×10−12 and 7.11×10−13, respectively). Moreover, the haplotypes with the risk allele A of rs3761959 and risk allele C of rs7528684 were associated with GD risk. Finally, our epigenetic analysis suggested the disease-associated C allele of rs7528684 increased affinity for NF-KB transcription factor. Above data indicated that FCRL3 gene and its proxy SNP rs7528684 may be involved in the pathogenesis of GD by excessive inhibiting B cell receptor signaling and the impairment of suppressing function of Tregs.

FC receptor-like-3 (FCRL3, also known as CD307c) on 1q21.1 encodes a member of the immunoglobulin receptor superfamily and is one of several Fc receptor-like glycoproteins. The encoded protein of FCRL3 contains immunoreceptor-tyrosine activation motifs and immunoreceptor-tyrosine inhibitory motifs in its cytoplasmic domain and may play a role in regulation of the immune system. The 1p21-23 region, in which the FCRL family resides, has been identified as a candidate locus for multiple autoimmune disorders in both human and murine models [15]. Mutations in FCRL3 have been reported to be associated with a plethora of autoimmune diseases, such as rheumatoid arthritis, systemic lupus erythematosus, and AITD [16][17][18]. Recently, two genome-wide association studies (GWAS) from Wellcome Trust Case Control Consortium (WTCCC) and our group, both identified FCRL3 as a susceptibility gene of GD in individuals of European ancestry and Chinese Han populations, respectively [19,20]. After the WTCCC GWAS, a case-control association study investigating twelve tag SNPs within FCRL5 was performed in 2,504 UK Caucasian patients with GD and 2,688 geographically matched controls and the results suggested that the association of FCRL5 with GD is secondary to the effect of FCRL3 [21]. Nevertheless, a refined association study in the entire FCRL region is required to determine the exact location of the etiological variant/s present.
In this study, we refined the association in the 1q21.1 region harborbing FCRL1-5 and CD5L, and confirmed FCRL3 was a susceptibility gene of GD in Chinese Han populations and the most significant signals associated with GD and correlated to FCRL3 expression were located at a cluster of SNPs including rs3761959 and rs7528684. Moreover, the haplotypes with the risk allele A of rs3761959 and the risk allele C of rs7528684 were associated with the predisposition of GD and can up-regulate the mRNA expression level of FCRL3, whether in peripheral blood mononuclear cells (PBMCs) or the subsets of PBMCs, especially in CD19 + B cells and CD8 + T subsets. Finally, the risk allele C of rs7528684 can increase the binding with NF-KB transcription factor, resulting in the pathogenesis of GD.

Subjects
All samples were recruited from Chinese Han population through collaboration with multiple hospitals in China. This study was approved by the local ethics committee from Ruijin Hospital, the Central Hospital of Xuzhou, the first affiliated hospital of Bengbu Medical College, Medical School Hospital of Qingdao University, and Linyi People's Hospital, respectively. And all subjects in this study provided written informed consent using protocols approved by local ethics committee. As mentioned in our previous GWAS paper, 1,536 patients with GD and 1,516 sexmatched controls were recruited for the initial GWAS stage, and additional 3,994 patients with GD and 3,510 sex-matched controls were recruited for the replication study [4,10,20]. Diagnosis of GD was based on documented clinical and biochemical evidence of hyperthyroidism, diffuse goiter, and the presence of at least one of the following: positive TRAb tests, diffusely increased 131 I uptake in thyroid gland, or exophthalmos [4,10,20]. All individuals classified as GD were interviewed and examined by experienced clinicians.
All the 1,516 controls in the GWAS stage were individuals with neither GD nor family history of GD, and without any other autoimmune disorders. Control subjects were matched for sex with cases and were over 35 years. Since GD or other AITD has a preponderance in the young female population, this age criteria could reduce the number of controls who might develop GD later on. To exclude clinical or sub-clinical AITD, the levels of sensitive TSH (sTSH) and TPOAb in control subjects were measured using chemiluminescence immunoassay (CLIA) in our laboratory. Of the 1,832 healthy controls whose levels of sensitive TSH and TPOAb were measured, 257 individuals with the levels of TPOAb $5.61 U/ml, and 94 subjects with the levels of sensitive TSH $4.94 mU/ml or #0.35 mU/ml were excluded, the remaining 1,516 served as the control cohort in the GWAS stage [20].

Genotyping and Quality Control
GWAS was performed by Illumina Human660-Quad Bead-Chips [20]. Genotype clustering was conducted using Illumina GenomeStudio V2011.1 software based on the 660W-Quad_v1_H manifest files. This software, which is used to convert the fluorescence intensities into SNP genotypes, was different from the software used in the previously published GWAS paper [20]. The mean call rate across all samples was 99.8%. Quality filtering was performed on SNPs and samples before analysis to ensure robust association tests. Cryptic relationships between genotyped individuals were examined using pairwise identity-by-descent (IBD) estimation by PLINK software [20]. To maintain the maximum number of available samples, all the pairwise relationships were evaluated and the person who formed the node that related to the most other nodes in the family trees was first excluded. This process was iterated several times until the remaining samples were not related to one another.
Of the 655,214 markers assayed, 3,185 that were from the Y or mitochondrial chromosomes or were CNV-related were excluded. Next, 168,082 markers with Hardy-Weinberg equilibrium P#10 26 , with genotype call rates below 98%, or with a minor allele frequency (MAF) ,0.01 were discarded, leaving 483,947 SNPs for subsequent analysis. After removing samples with low call rates (,98%, n = 23), gender inconsistencies (n = 6), and cryptic relatedness (n = 113), 2,910 samples were available for further association analysis.
In the replication cohort, six SNPs on 1q21.1 were genotyped using TaqMan SNP Genotyping Assays in Fludigm EP1 platform [20], and one SNP (rs7528684) was genotyped using ABI 7900HT platform. Of the seven SNPs genotyped, none of SNPs was removed for further association analysis. Ultimately, 3,655 GD patients and 3,385 controls with a 100% call rate were analyzed in the replication cohorts.

Statistical Analysis
After quality control [20], we used the genotypes of 67 SNPs on 1q21.1 in 1,442 patients with GD and 1,468 controls for association analysis using the Cochran-Armitage trend test by PLINK [22]. The forward and two locus logistic regression analysis were performed using R statistics packages. The linkage disequilibrium (LD) block was analyzed by Haploview software version 4.2.
The genotype imputation was performed using IMPUTE2 software [23] and the updated 1000G phase 1 integrated variant set (Mar 2012) were used as a reference. Of the imputed SNPs, we analyzed only those that could be imputed with a relatively high confidence (estimated probability .0.9), had a MAF .1%, a genotype call rate .98%, and a Hardy-Weinberg equilibrium Pvalue .10 26 . To take into account the uncertainty of imputed SNPs, the association analysis of the imputed SNPs was carried out utilizing the SNPTEST v2 software [24].
We inspected three eQTL databases from European Caucasian population. One was developed by Dixon et al. and contained 405 children of British descent organized into 206 sibships including 297 sib pairs and 11 half-sib pairs [25]. Another database assessed the transcriptome of circulating monocytes from 1,490 German individuals [26]. The third database was the cell type-specific eQTLs relevant to immunity and inflammation in paired samples of primary monocytes and B cells, purified by positive selection directly from 283 healthy British individuals [27].
For the replication stage, the Cochran-Armitage test for trend was used to examine the associations. Association analysis in the combined samples was performed by Cochran-Mantel-Hanezel stratification analysis [22]. We examined heterogeneity among studies using the Breslow-Day test [22,28]. The genome-wide significance level was set at 5610 28 , in keeping with the current consensus of the field. Samples with more blood volume (100 ml) were donated by these 95 individuals for gene expression assay in distinct subpopulations of PBMCs. The genotypes of rs761959 and rs7528684 among these 95 individuals were determined using the ABI 7900HT System (AA and TT, n = 29; AG and TC, n = 47; and GG and CC, n = 19). The CD4 + , CD8 + , CD14 + and CD19 + subsets of PBMCs were isolated using MACS Column kits (Miltenyi Biotec) according to the manufacturer's instructions. The purity of each cell subpopulation was determined by an LSR II Flow Cytometer (BD Biosciences), and the cell subpopulations with the purity of over 90% were used for real time RT-PCR [20]. The mean purities of the CD4 + , CD8 + , CD14 + and CD19 + subsets were 98.361.7%, 98.361.7%, 96.961.9% and 96.262.5%, respectively [20]. cDNAs were made from 1 mg RNA templates using reverse transcriptase and oligo(dT) primer (Promega). Quantitative RT-PCRs for FCRL3 at 1q21.1 were performed in duplicate using the SYBR Green and an ABI 7900HT Fast Real-Time PCR System. Expression of all samples was normalized to the relative expression level of GAPDH. Primer sequences for real-time PCR were as follows: human FCRL3 primers (forward, 59 TGGAGA-TAGCAACCCGATTTATTC 39; reverse, 59 CTGTAAGTTCCTCATGCTCTTGATG 39) and GAPDH (forward, 59 GAAGGTGAAGGTCGGAGTC 39; reverse, 59 GAA-GATGGTGATGGGATTTC 39). We performed statistical analysis for expression data using ANOVA and an unpaired Student's t-test (the two tail P value is indicated on the figures).

Results
The association and cis-gene expression (cis-eQTL) analysis in the initial GWAS scan cohort.
Our previous two-staged GWAS illustrated that FCRL3 was a predisposing gene of GD on 1q21.1 harboring FCRL1-5 and CD5L genes [20]. To pinpoint the exact location of the etiological variant/s present, we carried out the imputation analysis in the initial scan cohorts with 1,442 affected individuals and 1,468 controls. Within the ,414-kb region of high LD at 1q21.1, there were 67 genotyped SNPs and 972 imputed SNPs ( Figure 1A, Table S1). The mean variant density was 2.5 SNPs per 1-kb and can well tag all currently known common variation within this region. Among the 1,039 SNPs, there were 516 SNPs with P,0.05 and 124 SNPs with P,0.0001, respectively ( Figure 1A, Table S1). Remarkably, the P levels of 44 SNPs located in FCRL3-CD5L gene region were less than 1.00610 27 in the initial scan cohorts ( Figure 1A, Table S1, and Figure S1A).
To investigate the independent variant associated with GD on 1q21.1, we carried out the forward logistic regression analysis for the 124 SNPs with P,0.0001 using R statistics packages and the results displayed that rs3761959 was an independent variant in the original scan cohort. Next, the two locus logistic regression analysis was performed to confirm the independent variant. As shown in Figure 1B, rs3761959 selected as the best SNP on 1q21.1, was put individually into the regression models, and all other markers were sequentially added to see if a second locus could improve the model. The two locus logistic regression results in the original scan manifested that none of 1,031 SNPs (except 7 SNPs with high LD with rs3761959, r 2 $0.99, Figure S1A) improved the model with rs3761959 at the P,0.001 level ( Figure 1B). Conversely, the majority of SNPs, except for 50 SNPs with significant P value with the GD risk (P,2.30610 26 ; Figure S1B) and within a high LD region (r 2 .0.70; Figure S1A), were improved by adding rs3761959 ( Figure 1B). The logistic regression analysis could not pinpoint which SNP was the independent variant among a cluster of SNPs in a high LD block.
To further determine the most significant variant correlated with the FCRL1-5 expression within 1q21.1 region, we carried out the cis-eQTL analysis for the 516 SNPs with P,0.05 in the GWAS scan cohort ( Figure 1A, Table S1). From the cis-eQTL analysis using transcriptome data from about 400 lymphoblastoid cell lines [25], we found that the expression level of FCRL3 was associated with a cluster of SNPs, with the strongest signal at rs7528684 (P = 3.38610 211 for probeset 231093_at; Figure 1C). Other three SNPs (rs2210913, rs3761959, and rs945635), which were all in high LD with rs7528684 (r 2 = 1 in our data, Figure  S1A), also exhibited the significant correlation with FCRL3 expression (P,1.00610 210 for probeset 231093_at; Figure 1C). Furthermore, the four SNPs also showed the eQTL effect on the expression of FCRL5 (P,5.77610 28 for probeset 224405_at; Figure 1C). Nonetheless, among a cluster of SNPs with the association with FCRL5 expression, the strongest signal was located at rs6679793 (P = 5.90610 218 for probeset 224405_at; Figure 1C), which was not in high LD with any SNP associated with the FCRL3 expression and was weakly associated with GD in the original GWAS scan, in FCRL5 (r 2 ,0.12, P GWAS = 0. 0252; Figure 1A, Figure S2, and Table S1).
Additionally, we inspected a cis-eQTL database assessing the transcriptome of circulating monocytes from 1,490 individuals [26] and found rs7528684 was correlated with FCRL3 expression level (P = 1.41610 28 ; Figure 1D). Also, two SNPs (rs6681271 and rs7522061), which were in high LD with rs7528684 and rs3761959 (r 2 = 0.99, Figure S1A), displayed the high correlation with FCRL3 (P = 1.89610 28 and 1.65610 28 ; Figure 1D). cyan. Physical positions are based on NCBI build 36. Panel B shows the two locus logistic regression analysis results for 1,039 SNPs at 1q21.1 in the GWAS samples. The SNPs were improved by adding rs3761959 were shown in red points; whileas, the SNPs improved the model with rs3761959 were showed in blue triangles. Panels C,D,E, and F show the cis-eQTL analysis of 516 SNPs with P,0.05 in the initial GWAS scan from three different cis-eQTL databases. Panel C shows plot of linkage disequilibrium (LD) structures at 1q21.1 and correlation of SNPs to transcript abundances of FCRL3 and FCRL5 genes at 1q21.1 in transcriptome data from about 400 lymphoblastoid cell lines. The LD structures of 1,039 SNPs at 1q21.1 were analyzed by Haploview software version 4.2 based on our imputed data. The LD color scheme is stratified according to the logarithm of the odds (LOD) score and D9: LOD ,2 (white for D9 ,1 and blue for D9 = 1) or LOD .2 (shades of pink/red for D9 ,1 and bright red for D9 = 1). Two different red crosses indicate the association results of SNPs to the expression level of FCRL3. Three different blue signs indicate the correlation results of SNPs to the expression level of FCRL5. Panel D shows the correlation of 10 SNPs to transcript abundances of FCRL3 in the transcriptome of circulating monocytes from 1,490 individuals. Panel E shows the correlation of SNPs to the expression of FCRL3 in both-cis dataset of the cell type-specific cis-eQTL database. The expression levels of FCRL3 in B cell and monocyte were shown in red and blue color, respectively. FCRL3 expression detected by different probe was shown in two different crosses. Panel F shows the correlation of SNPs to the expression of FCRL3 and FCRL5 in B-cis dataset of the cell type-specific cis-eQTL database. The expression of FCRL3 was shown in different red signs and that of FCRL5 was shown in blue. Panel G shows the relative mRNA expression levels of FCRL3 for different genotypes of rs3761959 and rs7528684 in PBMCs from 95 individuals (GG and TT, n = 29; GA and TC, n = 47; and AA and CC, n = 19). Panel H shows the relative mRNA expression levels of FCRL3 for different genotypes of rs3761959 and rs7528684 in the subset of PBMCs from above 95 individuals. *, P,0.05; **, P,0.01; ***, P,0.001. doi:10.1371/journal.pone.0057758.g001 Table 1. The association results of seven SNPs on 1q23.1 in the combined populations.  Ultimately, we also inspected a cell type-specific cis-eQTL database relevant to immunity and inflammation in purified B-cell and monocyte populations [27]. Of note, in the both-cis dataset, among a cluster of SNPs correlated with the expression of FCRL3, rs7528684 displayed higher correlation in the B cell    (P = 7.43610 233 for nuID KV7kDSLO4uggquLXB4 and P = 5.93610 226 for nuID 67unrLpPnjv_uzOezU; Figure 1E) than that in the mono-cell (P = 2.35610 26 for nuID KV7kDSLO4ugg-quLXB4 and P = 0.0001 for nuID 67unrLpPnjv_uzOezU; Figure 1E). In the B cell-cis dataset, rs7528684 also manifested the highest correlation with FCRL3 expression (P = 2.37610 230 for nuID ZUclXxUi6VEoBJeRT8; Figure 1F).
To confirmed the cis-eQTL analysis, We then evaluated allelespecific effects for rs3761959 and rs7528684 on the mRNA expression of FCRL3 gene in PBMCs from 95 individuals and the result revealed both genotypes were correlated with the expression levels of FCRL3 (P ANOVA = 0.0009) ( Figure 1G). We then detected the expression of FCRL3 in distinct PBMC populations. Although FCRL3 was expressed in all subsets of PBMCs, there were higher expression levels of FCRL3 in CD19 + B cells and CD8 + T subsets than those in CD4 + T subsets and CD14 + monocytes ( Figure 1H). More specifically, both risk alleles A of rs3761959 and C of rs7528684 can significantly up-regulate the mRNA level of FCRL3 in all subsets of PBMCs, especially in CD19 + B cells and CD8 + T subsets isolated from 95 healthy volunteers (P ANOVA : 0.0001 in CD4 + T subsets, 1.72610 25 in CD8 + T subsets, 0.0122 in CD14 + monocytes, and 9.85610 27 in CD19 + B cells, respectively; Figure 1H).

The Association Analysis in the Replication and Combined Cohort
Among the 124 SNPs with P,0.0001 in FCRL3-CD5L gene region, 11 SNPs were genotyped and 113 SNPs were imputed in the initial scan ( Figure 1A, Table S1, and Figure S1A). Next, six genotyped SNPs tagging 11 genotyped SNPs with P,0.0001 and one imputed SNP (rs7528684), which were all related to FCRL3 expression, were genotyped in the second cohort ( Figure 2A). After quality control, the most significant association signal was observed at rs7528684 in 3,655 patients with GD and 3,385 controls (allele frequencies P replication = 5.44610 27 Figure 2B). Meanwhile, rs3761959 also met the genome-wide association level in the combined cohort (allele frequencies P combined = 2.27610 212 , OR = 1. 22 Figure 2B). In addition, the forward and two-locus logistic regression analysis in the combined population demon- Bold letters indicate those haplotypes with significant differences between GD and normal subjects. All data shown here are haplotypes whose frequencies are more than 2%. doi:10.1371/journal.pone.0057758.t002 strated that rs7528684 could not improve the model with rs3761959 and rs3761959 also could not improve the model with rs7528684 ( Figure 2C). Because multiple SNPs may act in combination to increase the risk of disease, haplotypes of the SNPs in the combined population were investigated and their frequencies in the GD and control groups were compared. The results displayed that seven haplotypes with a frequency of more than 4% were formed from seven SNPs and accounted for about 95% of all haplotypes ( Table 2). Four of seven haplotypes exhibited significantly higher frequencies among individuals with GD than the control group. As shown in Table 2, the haplotype TGACGGG manifested the highest statistical difference (P = 1.15610 25 , OR = 1.20, 95%CI = 1.11-1.30; Table 2), followed by haplotypes TGACAGA and TGACAAA (P = 0.0214, OR = 1.10, 95%CI = 1.01-1.20; and P = 0.0044, OR = 1.18, 95%CI = 1.05-1.33, respectively; Table 2). In contrast, haplotypes CAGTAAA and TGGTAAA were more frequently observed in controls than in patients with GD (P = 4.73610 28 , OR = 0.86, 95%CI = 0.81-0.91; and P = 0. 0262, OR = 0.87, 95%CI = 0.76-0.98, respectively; Table 2). Notably, all the risk haplotypes of GD contained the risk allele A of rs3761959 and risk allele C of rs7528684 and all the protected haplotypes of GD contained the protected allele G of rs3761959 and protected allele T of rs7528684 (Table 2).
Ultimately, we used the ENCODE databases of epigenetic study to narrow down the candidate regulatory regions and polymorphisms (http://genome.ucsc.edu/ENCODE/) [29]. The data from ENCODE manifested the chromosome region containing rs7528684 was without a DNaseI hypersensitivity, however, can bind the transcription factor ( Figure 3). Also, we found the risk allele C of rs7528684 can bind the transcription factor NF-KB utilizing the Searching Transcription Factor Binding Sites (TFSEARCH, ver 1.3) (score: 96.9) [30]. The previous study also found rs7528684 could affect the FCRL3 expression in the luciferase assay [16]. Whereas, the chromosome region containing rs3761959 could not bind the transcription factors (Figure 3), suggesting the association between rs3761959 and GD because of its high LD with rs7528684.
The false positive report probability (FPRP) of the SNPs with significant association to GD in the combined Chinese Han cohort was also analyzed. In the present study, the FPRP value was calculated for each genetic variant using the assigned prior probability range, the statistical power to detect an odds ratio of 1.5, and detected odds ratios and P values. As shown in Table 3, among the seven genetic variants, the FPRP values of five SNPs were below 0.2 for the prior probability from 0.25 to 0.00001, which was a relatively high prior probability range. In addition, the FPRP values for rs7528684 were still the lowest even for a prior probability of 0.00001 (Table 3). Interestingly, the casecontrol association study for these seven SNPs with significant differences in allele frequencies between the 5,107 patients with GD and 4,853 control individuals has 100% statistical power to detect a SNP with an a level equal to their reported P value, corresponding to relative risks of 1.5 for GD (Table 3).

Discussion
Our refined association study of the SNPs on 1q21.1 region verified that FCRL3 was the susceptibility gene for GD in the Chinese Han population. Moreover, the logistic regression analysis revealed that 8 SNPs including rs3761959 may be the most likely susceptibility variant. Cis-eQTL analysis from three databases indicated that the most significant signals correlated to the expression of FCRL3 were located at a cluster of SNPs including rs3761959 and rs7528684. In the combined population analysis, the risk haplotypes containing the risk allele A of rs3761959 and risk allele C of rs7528684 were associated with the predisposition of GD. Furthermore, the risk allele A of rs3761959 and risk allele C of rs7528684 increased FCRL3 expression whether in PBMCs or in its subsets, especially in CD19 + B cells and CD8 + T subsets. However, only rs7528684 can bind the NF-KB transcription factor to affect the FCRL3 expression. Intriguingly, the FPRP value for SNP rs7528684 was very low for the prior probability range and was quite robust even for low prior probabilities. These results suggested that rs7528684 in the promoter of FCRL3 was associated with GD etiology in the combined Chinese Han population.
The first reported variant about the association of FCRL3 with GD was rs7528684 located at position -169 in promoter of FCRL3 gene in a Japanese population (P = 7.4610 25 ) [16], which was confirmed by the later study in 1,056 UK patients with GD and 864 controls (P = 0.024) [31]. Our two-stage GWAS analysis also confirmed rs7528684 was associated with GD in Chinese Han population. Although there were two negative conclusions regarding association of rs7528684 in FCRL3 with GD in two small samples studies: one was in a Chinese population with 436 cases and 316 controls [32], and the other was in a UK population based on 625 cases and 490 controls (Table S2) [33], we still regarded SNP rs7528684 as a susceptibility GD locus in FCRL3 region in the Chinese Han population. Meanwhile, all of the risk haplotypes of GD contained the risk allele C of rs7528684, which can increase FCRL3 expression both from cis-eQTL analysis and from the real-time PCR results. Later, WTCCC in an analysis including 2,500 UK GD cases and 2,500 controls found an association at rs3761959 (a perfect proxy of rs7528684, r 2 = 1 in our data; Figure 2A) with GD (P = 0.0094; Table S2) [19]. Also, rs3761959 showed the significant association with GD in our two-stage GWAS analysis and could affect the expression of FCRL3 from our cis-eQTL analysis and real-time PCR. Noteworthy, in the WTCCC study, a stronger association was found with rs11264798 (in high LD with rs7528684 in our data, r 2 = 0.99, P = 1.6610 25 ; Table S2), located in the intron 8 of FCRL3 [19]. More recently, the WTCCC genotyped 743 SNPs across FCRL3 in 7,894 control samples and about 2,000 GD subjects to define the causal GD-associated SNPs using Bayes theorem [34]. Unfortunately, the fine mapping data from WTCCC failed to refine the signal in FCRL3 due to the 95% credible set containing 114 SNPs, albeit the top SNP, rs11264798, accounting for 7% of the posterior weight (Table S2) [34]. The imputation analysis manifested that SNP rs11264798 was also associated with GD in our initial scan cohort (P GWAS = 4.29610 27 ; Table S2). At the same time, we compared our data with those in the study by Simmonds et al that also fine-mapped the FCRL5 region in a UK population and found none of 11 SNPs (excluding SNP rs3900700 because of MAF ,0.01 in Chinese Han population) showed the association with GD in the Chinese Han population at the P,0.05 level (Table S2). Above results suggests that rs7528684 should be the likely FCRL etiological variant in our Chinese Han GD population, albeit the possible roles for the proxy of rs3761959 or rs11264798 maybe exist.
Although the three previous eQTL studies were all from the Caucasians rather than from the Han Chinese population, however, the LD analysis between the two populations displayed a similar LD block harboring CD5L and all three significant SNPs located in FCRL3 (rs11264798, rs3761959 and rs7528684) falling in the same LD block ( Figure S3). Interestingly, there were two LD blocks and the five genes (FCRL1-5) were located in the first LD block in the HapMap CEU populations ( Figure S3A). Whilst, among three LD blocks in CHB and JPT populations, two genes (FCRL5-4) and three genes (FCRL3-1) were located in two different LD blocks, respectively ( Figure S3B).
FCRL3 gene is one of five FCRL genes that shares significant structural homology to classical receptors for immunoglobulin constant chains (Fc receptors) [35] and encodes a protein containing an immunoreceptor-tyrosine activation motif and immunoreceptor-tyrosine inhibitory motif in its cytoplasmic domain [36]. FCRL3 is highly expressed in lymphoid organs, particular strongly on the surface of the B-cells, but also on that of the T-cells [16]. Among B-cell subsets, FCRL3 molecule is present on mature, germinal center, memory, plasma cells, and bone marrow immature B cells suggesting for its key role in the development, maturation, and function of B-lymphocytes [37]. The pathogenic activation of FCRL3 expression is suggested to lead to the down-regulation of B cell receptor-mediated signaling, incomplete induction of anergy and deletion in autoreactive Bcells, and, finally, to breakdown of B-cell tolerance [38]. Presence of FCRL3 was also demonstrated on the surface of a subset of Treg cells characterized by lower relative response to antigenic stimulation and reduced suppressor activity [39]. In the original report, rs7528684 was suggested to have functional significance as the disease-associated C allele increased affinity for NF-KB transcription factor and showed enhanced transcription rate in luciferase assay [16]. In our study, CD19 + B-cells had the highest expression of FCRL3 among the four subsets of PBMCs and the risk allele C of rs7528684 was significantly correlated with elevated mRNA expression level of FCRL3 mainly in CD19 + B cells and CD8 + T subsets, secondly in CD4 + T subsets and CD14 + monocytes. Moreover, our epigenetic analysis from ENCODE database and TFSEARCH software analysis also manifested that the risk allele C of rs7528684 could bind to NF-KB transcription factor. Therefore, we proposed that FCRL3 gene and its proxy SNP rs7528684 can be involved in the pathogenesis of GD by excessive inhibiting B cell receptor signaling and the impairment of suppressing function of Tregs.
In summary, our study provided the unequivocal evidence that FCRL3 was the susceptibility gene of GD and its proxy SNP rs7528684 may be the etiology variant to predispose to GD in Chinese Han population. Table S1 Association results of the imputed and typed SNPs in 1q21.1 region with GD in initial genome-wide scan.