Genetic Variants Associated with Serum Thyroid Stimulating Hormone (TSH) Levels in European Americans and African Americans from the eMERGE Network

Thyroid stimulating hormone (TSH) levels are normally tightly regulated within an individual; thus, relatively small variations may indicate thyroid disease. Genome-wide association studies (GWAS) have identified variants in PDE8B and FOXE1 that are associated with TSH levels. However, prior studies lacked racial/ethnic diversity, limiting the generalization of these findings to individuals of non-European ethnicities. The Electronic Medical Records and Genomics (eMERGE) Network is a collaboration across institutions with biobanks linked to electronic medical records (EMRs). The eMERGE Network uses EMR-derived phenotypes to perform GWAS in diverse populations for a variety of phenotypes. In this report, we identified serum TSH levels from 4,501 European American and 351 African American euthyroid individuals in the eMERGE Network with existing GWAS data. Tests of association were performed using linear regression and adjusted for age, sex, body mass index (BMI), and principal components, assuming an additive genetic model. Our results replicate the known association of PDE8B with serum TSH levels in European Americans (rs2046045 p = 1.85×10−17, β = 0.09). FOXE1 variants, associated with hypothyroidism, were not genome-wide significant (rs10759944: p = 1.08×10−6, β = −0.05). No SNPs reached genome-wide significance in African Americans. However, multiple known associations with TSH levels in European ancestry were nominally significant in African Americans, including PDE8B (rs2046045 p = 0.03, β = −0.09), VEGFA (rs11755845 p = 0.01, β = −0.13), and NFIA (rs334699 p = 1.50×10−3, β = −0.17). We found little evidence that SNPs previously associated with other thyroid-related disorders were associated with serum TSH levels in this study. These results support the previously reported association between PDE8B and serum TSH levels in European Americans and emphasize the need for additional genetic studies in more diverse populations.

Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. Data files are available from dbGaP under accession number phs000360.
Funding: The eMERGE Network is funded by NHGRI, with additional funding from NIGMS through the following grants: U01HG04599 and U01HG006379 to Mayo Clinic; U01HG004610 and U01HG006375 to Group Health Cooperative; U01HG004608 to Marshfield Clinic; U01HG006389 to Essentia Institute of Rural Health; U01HG004609 and U01HG006388 to Northwestern University; U01HG04603 and U01HG006378 to Vanderbilt University; U01HG006385 to the Coordinating Center; U01HG006382 to Geisinger Clinic; U01HG006380 to Icahn School of Medicine at Mount Sinai; U01HG006830 to The Children's Hospital of Philadelphia; and U01HG006828 to Cincinnati Children's Hospital and Boston Children's Hospital. Group Health/University of Washington received additional funding through Group Health/UW ADPR/ ACT grant UO1 AG 0681. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors of this manuscript have read the journal's policy and have the following competing interests: Dr. Dana Crawford is an academic editor of PLOS ONE. Dr. Crawford is not involved in the review of this manuscript per journal policy. This disclosed competing interest does not alter the authors' adherence to PLOS ONE editorial policies and criteria. The remaining authors have declared that no competing interests exist.

Introduction
Hyperthyroidism and hypothyroidism are important endocrine diseases caused by over- or under-production of thyroid hormone, which is regulated by thyroid stimulating hormone (TSH) produced in the anterior pituitary gland. Hypothyroidism, the most common thyroid disease, can be caused by iodine insufficiency, autoimmunity, pregnancy, pituitary disease (leading to increased TSH production), or other conditions. Thyroid diseases occur more often in women than in men [1] and the risk of developing hypothyroidism increases with age [2,3]. Diagnosis of thyroid diseases involves measuring TSH levels and circulating thyroxine (T4) and triiodothyronine (T3) in the blood; elevated TSH levels and depressed T4 levels signify clinical hypothyroidism [2,4] while elevated TSH levels and normal T4 levels indicate mild (subclinical) hypothyroidism [5]. TSH is produced by a normally functioning pituitary gland in response to decreased thyroid hormone levels; as thyroid hormone levels decrease, TSH signals to the thyroid to produce additional thyroid hormone. When the thyroid gland does not maintain sufficient production of thyroid hormone, serum TSH levels become elevated, and the individual develops hypothyroidism. Similarly, elevated thyroid hormone levels from primary hyperthyroidism result in decreased TSH levels.
Both genetic and environmental factors influence serum TSH levels. Neonatal TSH levels have been associated with maternal characteristics such as nulliparity, preeclampsia, and induced labor [6]. Among adults, physical and emotional stress, poor nutrition, increased body mass index (BMI), smoking, and pregnancy are all risk factors for elevated serum TSH levels [7][8][9]. Normal serum TSH levels range from 0.3 mIU/mL-4.0 mIU/mL but are tightly regulated within an individual, suggesting a genetic 'set point' for individual thyroid hormone levels [5,10,11]. A cross-sectional population study demonstrated differences in mean TSH levels between race/ethnicities, with higher mean TSH levels in non-Hispanic whites than in Mexican Americans or non-Hispanic blacks [5]. The etiology behind the observed differences in mean TSH levels across ethnic groups has not been elucidated, and it is unclear if those differences lead to lower prevalence of hypothyroidism in populations of diverse ancestry. A recent study identified differences in prevalence of thyroid cancer across ethnic groups living in England [12], and TSH antibodies were demonstrably lower in non-Hispanic blacks compared to non-Hispanic whites or Mexican Americans in the National Health and Nutrition Examination Survey (NHANES) III [13]; however, studies evaluating hypothyroidism or hyperthyroidism burden among different racial/ethnic groups have not been performed. Twin and family-based studies have suggested heritability estimates of 32%-67% for TSH, T4, and T3 levels [14][15][16], and a recent study found heritability for TSH to be 58% in newborn twins [17]. These data taken together suggest TSH level variation is largely a product of genetic factors, corroborating the hypothesis that each individual maintains a set point for TSH levels. Several genetic association studies have been performed, including two meta-analyses of GWAS [18,19].
These studies have identified common variants associated with serum TSH levels: rs2046045 (PDE8B), rs10917477 (CAPZB), rs10028213 (NR3C2), and rs3813582 (16q23) [18,20]. Altogether, the known loci explain <5% of the variance in TSH levels [19]. However, these GWAS and meta-analyses have been performed in populations of European ancestry, and it is unclear if these findings generalize to other races/ethnicities.
In this study, we sought to identify variants associated with normal variability of serum TSH levels in euthyroid (thyroid disease free) European Americans and African Americans from the Electronic Medical Records and Genomics (eMERGE) Network. We also sought to replicate known associations between SNPs and serum TSH levels. We hypothesized that variants associated with serum TSH levels might also be associated with thyroid disorders, such as hyperthyroidism (Graves' disease), hypothyroidism (Hashimoto's disease), and thyroid cancer. Given that increased BMI is a risk factor for elevated serum TSH levels, we also tested for evidence that TSH-associated SNPs are modified by BMI in this study of euthyroid European and African Americans from the eMERGE Network.

Methods

eMERGE
The eMERGE Network is a collaboration of institutions with biobanks linked to EMRs. The data for these analyses included Phase I of the eMERGE Network whose members included Group Health Cooperative/University of Washington, Marshfield Clinic, Mayo Clinic, Northwestern University, Vanderbilt University and the eMERGE Administrative Coordinating Center [21].

Study Population
This study was performed in the eMERGE Network, which includes approximately 17,000 individuals who were phenotyped and genotyped for previous studies investigating a variety of complex diseases (e.g., dementia, cataracts, peripheral arterial disease (PAD), type 2 diabetes) and medically relevant quantitative traits (e.g., cardiac conduction) [22]. To qualify for euthyroid designation in this analysis, individuals were required to have at least one test of thyroid function (i.e., TSH and T3 or T4 if available) with no abnormal results, must not have any billing codes for hypothyroidism or history of myasthenia gravis in his/her EMR or evidence of thyroid replacement medication, and must have at least two past medical history sections (non-acute visits) and medication lists. For individuals with multiple TSH tests, the median TSH level was used in the analysis. Individuals were excluded if they had any cause of hypothyroidism or hyperthyroidism, any other thyroid diseases (e.g., Graves' disease, thyroid cancer) as indicated by billing (ICD-9) codes, procedure (CPT) codes or text word diagnoses, or were on thyroid-altering medication (e.g., lithium) [22]. From this group, 6,086 European Americans and 633 African Americans qualified as euthyroid, of which 4,501 European Americans and 351 African Americans had body mass index (BMI) data. The appropriate institutional review board at each participating study site approved all procedures.
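As an illustration only (this is not the eMERGE phenotyping algorithm), the per-individual summary described above, taking the median of repeated TSH tests and then log-transforming it for analysis, can be sketched as follows. The identifiers, the TSH values, and the choice of the natural logarithm are assumptions made for the example; the text does not state which log base was used.

```python
import math
from statistics import median

def summarize_tsh(tests_by_person):
    """Return {person_id: (median TSH, natural log of that median)}."""
    summary = {}
    for person_id, values in tests_by_person.items():
        med = median(values)  # collapse repeated tests to one value per person
        summary[person_id] = (med, math.log(med))  # natural log is an assumption
    return summary

# Hypothetical TSH results, one list of tests per individual.
example_tests = {"P1": [1.2, 1.8, 1.5], "P2": [2.0, 2.4], "P3": [0.9]}
tsh_summary = summarize_tsh(example_tests)
```

Using the median rather than the mean limits the influence of a single outlying laboratory value on an individual's trait value.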

Genotyping
Genotyping was performed using the Illumina Human660W-Quadv1_A and the Illumina1M BeadChips for European Americans and African Americans, respectively, as previously described [22]. Of the SNPs on each array, 474,366 SNPs and 905,285 SNPs, respectively, passed quality control filters for tests of genotyping efficiency (>99% call rate) and minor allele frequency (>5%). Details of eMERGE quality control have been previously published [23,24]. eMERGE Network data have been deposited into the Database for Genotypes and Phenotypes (dbGaP).
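The two SNP-level filters named above (call rate and minor allele frequency) can be sketched in a few lines. This is a minimal illustration, not the published eMERGE quality control pipeline [23,24]; the function names, the genotype coding (0, 1, or 2 copies of one allele, `None` for a missing call), and the example SNPs are all hypothetical.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF from allele counts (0, 1, or 2 copies), ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)

def passes_qc(genotypes, min_call_rate=0.99, min_maf=0.05):
    """Apply the >99% call rate and >5% MAF thresholds stated in the text."""
    return (call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)

# Hypothetical SNPs genotyped in 200 samples each.
common_snp = [0] * 150 + [1] * 40 + [2] * 9 + [None]
rare_snp = [0] * 198 + [1] * 2
poorly_called_snp = [0] * 100 + [1] * 80 + [2] * 15 + [None] * 5
```

In this toy data only `common_snp` passes both filters; `rare_snp` fails on allele frequency and `poorly_called_snp` fails on call rate.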

Statistical Analysis
Quality control and data analysis were performed using a combination of PLINK [25,26] and R software, and data were plotted using R code obtained from the Getting Genetics Done website [27,28], Stata [29] and Synthesis-View [30]. Power calculations were performed using Quanto [31]. Linear regression was performed assuming an additive genetic model to test for associations between individual SNPs and log-transformed median serum TSH levels. Tests were performed stratified by race/ethnicity, unadjusted and adjusted for age, sex, BMI, and first principal component (PC1) calculated with EIGENSTRAT [32]. Control for population stratification was evaluated with Q-Q plots and calculation of the lambda statistic using R packages qqman and GenABEL [33]. No evidence of residual population stratification was observed in the European Americans (λ = 1.04) or African Americans (λ = 1.00). Additional tests of association were performed in European Americans stratified by BMI using two schemes (normal: BMI 18.5-24.9 versus overweight: BMI >25; and normal, overweight: BMI >25-30, obese: BMI >30) and adjusted for age, sex, and PC1. We also performed formal tests of interaction between SNPs associated with TSH levels at a significance threshold of p < 1×10−4 and BMI category (normal versus overweight), stratified by race/ethnicity, in adjusted (age, sex, PC1, and main effects) models. We considered a SNP-BMI interaction significant at a threshold of p < 0.05. For the normal/overweight BMI analysis, Wilcoxon rank-sum tests were performed to compare median TSH levels between BMI categories at each genotype for each SNP; for the normal/overweight/obese BMI analysis, Bonferroni-corrected multiple pairwise comparisons following ANOVA were used.
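For readers unfamiliar with the lambda statistic, the following sketch shows one common way to compute it from a set of association p-values: each p-value is converted to a 1-df chi-square statistic, and the observed median is divided by the median expected under the null. This is an illustrative reimplementation using only the Python standard library, not the GenABEL code used in the study; the function names and example p-values are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a 1-df chi-square distribution (about 0.455).
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2

def p_to_chi2(p_value):
    """Convert a two-sided p-value in (0, 1] to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p_value / 2.0)  # lower tail keeps precision for tiny p
    return z * z

def genomic_inflation(p_values):
    """Lambda = median observed chi-square / median expected under the null."""
    return median(p_to_chi2(p) for p in p_values) / EXPECTED_MEDIAN_CHI2

# Hypothetical p-values: a uniform (null-like) set and an inflated set.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
inflated = [p * p for p in null_like]
lambda_null = genomic_inflation(null_like)
lambda_inflated = genomic_inflation(inflated)
```

Values of lambda close to 1, such as those reported above, indicate that the bulk of the test statistics follow the null distribution; values well above 1 suggest residual stratification or other systematic inflation.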
In addition to GWAS discovery, we sought to replicate and generalize previously reported genetic associations for TSH levels. We considered a SNP replicated in European Americans if the tested SNP was identical to the index SNP, or a proxy in strong linkage disequilibrium (LD) (r² > 0.7) with the index SNP in the 1000 Genomes CEU reference panel, and the direction of effect was consistent with the previous report after taking into account coding allele differences. We considered a SNP generalized to African Americans if the tested SNP was identical to, or a proxy in strong LD with (r² > 0.7), the index SNP in the 1000 Genomes CEU reference panel, and the direction of effect was consistent with European Americans. For the replication/generalization analysis, significance was defined at a threshold of p < 0.05. Power calculations were performed assuming the genetic effect sizes reported in the literature, the present study sample size, and the present study coded allele frequencies.
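The replication/generalization criteria above amount to three checks: LD with the index SNP, direction of effect after aligning coded alleles, and the p < 0.05 threshold. A minimal sketch of that decision rule follows; the function names and the numeric inputs in the test cases are hypothetical and are not taken from the study's tables.

```python
def same_direction(beta_study, beta_reported, same_coded_allele=True):
    """Compare effect directions after aligning to the same coded allele."""
    aligned = beta_study if same_coded_allele else -beta_study
    return aligned * beta_reported > 0

def replicates(r2_with_index, beta_study, beta_reported, p_value,
               same_coded_allele=True, min_r2=0.7, alpha=0.05):
    """Apply the three criteria; r2_with_index is 1.0 for the index SNP itself."""
    in_strong_ld = r2_with_index > min_r2
    consistent = same_direction(beta_study, beta_reported, same_coded_allele)
    return in_strong_ld and consistent and p_value < alpha

# Hypothetical examples (values invented for illustration).
case_proxy_ok = replicates(0.95, -0.09, -0.10, 0.03)
case_weak_ld = replicates(0.49, -0.09, -0.10, 0.03)
case_opposite = replicates(1.0, 0.09, -0.10, 0.03)
case_flipped_allele = replicates(1.0, 0.09, -0.10, 0.03, same_coded_allele=False)
case_not_significant = replicates(1.0, -0.09, -0.10, 0.20)
```

The `same_coded_allele` flag captures the "coding allele differences" mentioned in the text: when two studies code opposite alleles, the sign of one effect estimate must be flipped before directions are compared.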

Study participants
All eMERGE participating sites contributed data for European Americans and all sites except Marshfield Clinic contributed data for African Americans (Table S1) [1,5,34]. The age, BMI, and sex ratio differences between the groups observed here most likely reflect ascertainment differences resulting from the characteristics of the source populations at each eMERGE site, rather than true differences at the overall population level.

TSH levels: Discovery
We performed standard single SNP tests of association stratified by race/ethnicity and adjusted for sex, age (decade of birth), BMI, and PC1. For European Americans, we identified six SNPs in PDE8B on chromosome 5 as associated with TSH levels at genome-wide significance (Figure 1; Table 2). Our most significant result, rs1382879, was a perfect proxy for previously identified [19] rs2046045 (r² = 1.00) and was in moderate-to-high LD (r² > 0.30) with the other significant PDE8B SNPs. No novel genotype-phenotype associations were identified at genome-wide significance in this sample of European Americans. However, an additional 111 SNPs were suggestively associated with serum TSH levels (p < 1×10−4), including seven SNPs in PDE8B, ten SNPs near FOXE1, three SNPs in PDE10A, four SNPs in THBS4, and eight SNPs in NRG1 (Table S2). The majority of these SNPs are located in noncoding regions of the genome (intronic, upstream, downstream); however, rs3745746 (CABP5, p = 4.93×10−5) is a missense mutation, and rs1443434 (FOXE1, p = 6.53×10−5) is located in the 3′ untranslated region.
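To make the additive model concrete, the sketch below fits log TSH on minor-allele count (0, 1, or 2) plus age, sex, BMI, and PC1 by ordinary least squares. It is a toy illustration under stated assumptions, not the PLINK analysis used here: the eight individuals, the coefficient values, and the helper `ols` are all hypothetical, the outcome is generated without noise so that the fit is exact, and standard errors and p-values are omitted.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for row in range(k):
            if row != col:
                factor = A[row][col]
                A[row] = [a - factor * b for a, b in zip(A[row], A[col])]
    return [A[i][k] for i in range(k)]

# Hypothetical individuals: (minor-allele count, age, sex, BMI, PC1).
people = [
    (0, 45, 0, 22.0, 0.12), (1, 52, 1, 27.5, -0.20),
    (2, 61, 0, 31.0, 0.31), (0, 38, 1, 24.5, -0.08),
    (1, 70, 0, 29.0, 0.05), (2, 49, 1, 23.0, 0.22),
    (1, 57, 1, 35.0, -0.27), (0, 66, 0, 26.0, 0.40),
]
# Toy coefficients: intercept, SNP, age, sex, BMI, PC1 (invented values).
TOY_BETA = [0.3, 0.09, 0.002, -0.05, 0.004, 0.1]

X = [[1.0, *person] for person in people]           # design matrix with intercept
log_tsh = [sum(b * x for b, x in zip(TOY_BETA, row)) for row in X]
beta = ols(X, log_tsh)
snp_beta = beta[1]  # per-allele change in log TSH under the additive model
```

Under the additive model the SNP coefficient is the expected change in log TSH per additional copy of the coded allele, holding the covariates fixed, which is how the β values reported in this section should be read.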

Trans-population genetic associations
Given the smaller sample size of African Americans with serum TSH levels, the GWAS was underpowered to detect associations at genome-wide significance with expected small to moderate effect sizes. Therefore, we evaluated the 31 most  had genetic effects in the same direction between the two populations ( Figure 2).

Replication and Generalization
At least 24 SNPs have been associated with serum TSH levels in European descent populations in the literature [18][19][20]35]. We considered a SNP replicated if the direction of effect was the same as previously reported and it was associated with serum TSH levels at a liberal threshold of p < 0.05. In European Americans, we replicated 22/25 (88%) SNPs previously associated with serum TSH levels (Table 3). As previously mentioned, the most significant association with TSH levels in European Americans replicated the published reports for PDE8B SNPs rs2046045 and rs6885099 (Table 3). Beyond PDE8B, we replicated two SNPs on chromosome 1 in CAPZB previously implicated as associated with serum TSH levels (Table 3). One SNP, rs12138950, was a perfect proxy for previously reported CAPZB rs10917469 (1000 Genomes CEU r² = 1.00, β = −0.05, p = 8.97×10−5) (Table 3). In African Americans, 5/24 (25%) SNPs previously associated with TSH levels in European-descent populations generalized at a liberal significance threshold of p < 0.05 and a consistent direction of effect (Table S4). PDE8B rs2046045, a proxy for rs6885099 (1000 Genomes CEU r² = 1.00, YRI r² = 0.945), was associated with serum TSH levels in African Americans (β = −0.09, p = 0.03) (Table S4). NFIA rs334713, a proxy for rs334699 (1000 Genomes CEU r² = 1.00, YRI r² = 0.774), was associated with serum TSH levels in eMERGE African Americans (p = 1.50×10−3) with a similar effect size (β = −0.17) as previously reported European-descent populations. Notably, the coded allele frequency of this SNP was greater in African Americans (coded allele frequency = 0.17; Table S4) compared with either eMERGE European Americans (0.08) or the previously reported European descent population (0.05) (Table 3). Intronic ABO rs657152 was significant at p = 0.03, and the magnitude and direction of effect were similar to previously published European American data (Table S4).
VEGFA rs11755845 was significant at p = 0.01 (Table S4) with an effect size nearly double that of the previously reported result in European Americans (Table S4). SNP rs13020935 upstream of IGFBP5, a proxy for rs13015993 (r² = 1.00), was significant at p = 1.82×10−4 (Table S4).

SNPs previously associated with thyroid disease
Next, we investigated SNPs that had previously been associated with a thyroid disease phenotype, specifically hypothyroidism, thyroid cancer, and Graves' disease [36][37][38], since variation in TSH levels may indicate thyroid disease. Six SNPs in the FOXE1 region, including rs925489, generalized to euthyroid European American subjects (Table S5). An additional SNP in FOXE1, rs965513, previously associated with hypothyroidism [22,36], generalized to serum TSH levels in European Americans (p = 1.09×10−6, β = −0.05) (Table S5). FOXE1 rs1877432, previously associated with hypothyroidism, generalized to serum TSH levels in African Americans (p = 9.73×10−3, β = 0.11) (Table S6). RHOH/CHRNA9 rs6832151, previously associated with Graves' disease, generalized to serum TSH levels in African Americans (p = 0.01, β = −0.10) (Table S6). None of the SNPs previously associated with thyroid cancer [38] were associated with serum TSH levels in either European Americans or African Americans at a liberal significance threshold of p < 0.05 (Tables S5 and S6). Broadly, we found little evidence of association with serum TSH levels for SNPs, apart from FOXE1, that have been associated with other thyroid-related phenotypes.

Interaction with BMI
BMI is significantly positively associated with TSH levels and changes in BMI can be a symptom of thyroid disease, with hypothyroid persons gaining weight and hyperthyroid persons losing weight [39]. We observed that the addition of BMI into the linear regression model yielded more significant p-values for the SNPs in PDE8B and others, and the results from the stratified analyses differed within each race/ethnicity (Table S7, Table S8). Therefore, we performed formal tests of interaction between BMI and all SNPs (n = 118) with p < 1×10−4 from the age, sex, PC1, and BMI adjusted model in European Americans and considered evidence for an interaction at p < 0.05. Three SNPs met our significance threshold in European Americans for an interaction with BMI: NFIA rs10489909, NRG1 rs2466067 and rs4298457. An additional NRG1 SNP was just outside the p < 0.05 significance threshold for the interaction: rs10954859 (Table S9, Figure 3). The NRG1 SNPs are in moderate-to-high LD with each other (r² > 0.70). We compared median TSH levels by BMI category for each genotype by SNP and observed lower median TSH levels for individuals with the AA genotype for rs10489909 who were of normal BMI compared to individuals with overweight BMI (p < 0.005). We observed similar trends for rs2466067 (CC genotype), rs10954859 (GG genotype), and rs4298457 (GG genotype) (p < 0.05), which suggests serum TSH levels may be attenuated based on BMI for these homozygous genotypes. To understand if the observed interaction effect was a threshold effect of overweight or obese BMI, or a dose-dependent effect, we further stratified the overweight BMI category into overweight (BMI 25-30) and obese (BMI >30) in the European Americans (Figure S3). For rs10489909, we observed lower median TSH levels for individuals with the GG genotype who were of normal BMI compared to individuals with overweight BMI (p < 0.01) (Figure S3).
We observed similar trends between individuals with normal BMI compared to obese BMI for rs4298457 (GG genotype) and rs2466067 (CC genotype) (p < 0.05) (Figure S3). These data suggest the variation observed in serum TSH levels for these genotypes may result from a threshold effect of obese BMI. We also performed tests of interaction in African Americans for BMI and the 87 most significant SNPs (p < 1×10−4 from the age, sex, PC1, and BMI adjusted model). We observed five SNPs at the p < 0.05 significance threshold (Table S9, Figure S2). MYT1L rs6728613 and rs4073401 are in perfect LD with each other (r² = 1.00) and were the most significant in this interaction analysis (p = 2.28×10−3) (Table S9, Figure S2). While other interaction terms were significant in the African American sample, small sample sizes and low counts made comparisons across genotypes and BMI categories difficult to interpret (Figure S2).

Discussion
The eMERGE Network was established in 2007 to determine whether electronic medical records could be used to identify disease susceptibility in diverse patient populations for complex traits/diseases. At each study site, DNA linked to an EMR was genotyped for a GWAS for specific complex diseases (e.g., type II diabetes) and medically relevant quantitative traits (e.g., cardiac conduction). A recent eMERGE Network GWAS demonstrated that these study-specific genotype data can be "reused" for additional GWAS for binary outcomes (hypothyroidism) extracted from the EMR [22]. As an extension of this exercise, we performed a GWAS for an additional medically relevant quantitative trait: thyroid stimulating hormone (TSH) levels, in 4,501 European American and 351 African American euthyroid individuals.
Several studies have shown associations between TSH levels and PDE8B (briefly: [11,35,40,41]). PDE8B is a phosphodiesterase gene that encodes a cAMP-specific protein expressed in thyroid tissue [42]. PDE8B upregulates cAMP through interaction with the TSH receptor on thyroid cells [11,42]. In this study, we have replicated the results recently obtained by several groups finding association of TSH levels and several SNPs in the PDE8B region in European Americans [35,40]. Variants in PDE8B were the only SNPs in this analysis to reach genome-wide significance in European Americans after accounting for multiple testing. In African Americans, rs2046045 (in high/perfect LD with rs6885099 and rs4704397) was nominally significant. These findings support the strong association of PDE8B with TSH levels in European Americans and suggest this association is generalizable to African Americans as well. Future studies to consider the association of PDE8B in other diverse populations are warranted. The FOXE1 region was not as strongly associated with TSH levels as PDE8B in European Americans, a result similar to that obtained by Medici et al. [40] and Alul et al. in neonates [41]. FOXE1 encodes a thyroid transcription factor with a characteristic forkhead motif believed to be important in thyroid morphogenesis [43,44]. Mutations in FOXE1 have been implicated in hypothyroidism [22,36,38] and thyroid cancer [45,46]. No SNPs in FOXE1 reached genome-wide significance in this study, though several were associated at the 10−6 threshold in European Americans and at the 10−3 threshold in African Americans. As the prior association with FOXE1 is for a disease state (hypothyroidism), it is unsurprising that we failed to find association at the genome-wide significant level in a euthyroid (non-thyroid disease) population.
Given the relationship between TSH levels and specific clinical outcomes, we hypothesized that serum TSH levels would also be associated with SNPs previously associated with hypothyroidism, Graves' disease, or thyroid cancer by GWAS or candidate gene studies [36][37][38]. Patients with these disorders exhibit abnormal TSH levels and there is a strong autoimmune component to the diseases [30]. No SNPs in previously identified gene regions (CTLA-4, TSHR, TTF1, HLA, and PTPN22) were significantly associated with TSH levels in either European Americans or African Americans from the eMERGE Network (Tables S5 and S6), suggesting the contribution to these disorders from these genes may be specific to disease risk and not natural variation in TSH levels.
Obesity (BMI >30) has been implicated in higher TSH levels and change in an individual's set point [47,48]. We performed additional analyses adjusting for age, sex, PC1, and BMI in both the European American and African American cohorts and stratified analyses by BMI (normal versus overweight). In the European Americans, adjusting for BMI did not appreciably modify the results, though the results in both PDE8B and FOXE1 were more highly significant (Table S7). These results led us to consider potential SNPxBMI interactions. After performing tests of association for an interaction in the most significant results from the primary analysis, we identified two loci with SNPxBMI interactions in European Americans: NFIA and NRG1. NFIA, a transcription factor, has not previously been associated with thyroid-related traits. NRG1 encodes neuregulin, a signaling protein recently identified in a study to be associated with thyroid cancer, potentially mediated by regulation of TSH levels [38]. Neuregulin is expressed in papillary thyroid carcinomas and has been found to regulate cell proliferation in a rat thyroid cell model [49]. Further studies on the role NRG1 may play in regulating TSH levels are warranted. In the African American subjects, significant interactions at a liberal threshold (p < 0.05) were identified, but small sample sizes and low genotype counts per BMI category made comparisons across groups difficult.
We compared results from the African Americans to those of the European Americans in our study and observed several differences. While several SNPs in PDE8B reached genome-wide significance in European Americans, none were significant in African Americans, and only two PDE8B variants identified in previous GWAS generalized to this population at a liberal significance threshold of p < 0.05. Of the 32 most significant SNPs in European Americans, 21 had the same direction of effect and similar effect sizes in African Americans, suggesting the small sample size and resulting lack of power were responsible for our inability to generalize previously identified variants to the eMERGE African Americans.
A major limitation of this study is sample size. Among both populations, we excluded individuals in eMERGE with an abnormal TSH level given this study sought to identify genetic determinants of the normal distribution for TSH levels. Despite excluding individuals with abnormal TSH values, the mean (standard deviation) observed here for European Americans [1.90 (0.93)] was well within the range of previous TSH level genetic association studies: 1.5 (0.80) to 2.7 (4.1) mIU/mL [18]. The addition of the few individuals with abnormal TSH levels would be unlikely to increase statistical power to detect additional genome-wide associations or substantially impact the overall trait distribution. In comparison, the African American sample size was very small, which impacted our ability to generalize previous findings to this population. In eMERGE African Americans, we were only adequately powered (>80%) for one test of association: PDE8B rs4704397. This SNP was not directly genotyped in the eMERGE African American dataset; it is in very high LD with genotyped rs2046045 in the 1000 Genomes CEU panel (r² = 0.94) but not in the 1000 Genomes YRI panel (r² = 0.49). The small sample size coupled with lower linkage disequilibrium resulted in underpowered tests of association for the African American dataset.
We also observed striking differences in minor allele frequencies (MAF) between European Americans and African Americans that may have impacted our ability to replicate and generalize previously associated variants (Figure 2). In European Americans, most of the minor allele frequencies were comparable to those in previously published studies (Table S10), and we were adequately powered (80%) to replicate 18/25 SNPs previously associated with serum TSH levels at a liberal significance threshold of 0.05 (Table S10). All 18 of these adequately powered SNPs replicated in the eMERGE European American dataset, validating prior associations for these SNPs with TSH levels in European Americans. The utility of these variants in the clinical setting to predict serum TSH levels has not yet been calculated; future studies considering the predictive capacity of these SNPs for a clinical application may be beneficial.
This study further demonstrates the feasibility of using genotypes linked to EMRs to perform secondary analyses for quantitative traits in complex diseases in diverse populations [50,51]. We identified SNPs associated with serum TSH levels and replicated findings from earlier GWAS for TSH levels and thyroid-related traits to the eMERGE European American euthyroid population. We further suggest BMI may modify genetic associations with serum TSH levels and that this may occur as a threshold effect with obese BMI for some genotypes. Consistent with other reports, we found few associations with SNPs associated with serum TSH levels that have effects on other thyroid-related traits/diseases, suggesting the development of thyroid disease and variation of TSH levels occurs primarily through different mechanisms. Importantly, we identified suggestive associations with biologically plausible SNPs and generalized several SNPs from previous GWAS to the eMERGE African American euthyroid population, suggesting additional studies in diverse populations are warranted. Figure S1. Manhattan plot of tests of association with serum TSH levels in African Americans in eMERGE. Data shown are p-values from 905,285 single SNP tests of association for serum TSH levels in a model adjusted for age, sex, principal component (PC) 1, and body mass index in euthyroid African Americans in eMERGE Network (n5351). Y axis represents the -log10 (p-value); horizontal lines represent Bonferroni corrected significance level (5610208) (top) and suggestive significance level (1610204) (bottom). Chromosomes are arranged on the x axis. doi:10.1371/journal.pone.0111301.s001 (TIF) Figure S2. Body mass index as a modifier of serum TSH levels genetic associations in eMERGE African Americans. 
Interaction analyses were performed using the SNPs with p,1610204 significance levels in the model adjusted for age, sex, PC1, and BMI in African Americans (n5351); the model was stratified by race/ethnicity and by normal/overweight BMI (normal: BMI 18-24.9; overweight: BMI 25+). We considered a SNPxBMI interaction significant at a threshold of p,0.05. Shown are p-values from Wilcoxon rank-sum tests comparing median TSH values between BMI categories at each genotype. doi:10.1371/journal.pone.0111301.s002 (TIF) Figure S3. Body mass index as a modifier of serum TSH levels genetic associations in eMERGE African Americans. Interaction analyses were performed using the SNPs with p,161024 significance levels in the model adjusted for age, sex, PC1, and BMI in European Americans (n54,501); the model was stratified by race/ethnicity and by normal/overweight/obese BMI (normal: BMI 18-24; overweight: BMI 25-30; obese: BMI 30+). We considered a SNPxBMI interaction significant at a threshold of p,0.05. Shown are Bonferroni-corrected p-values from multiple pairwise comparisons after ANOVA, comparing median TSH values between BMI categories at each genotype. doi:10.1371/journal.pone.0111301.s003 (TIF)   )   Table S5. Comparison of associations in eMERGE European Americans with previously published SNP associations for thyroid-related traits. SNP rs number, chromosomal location, nearest gene/gene region, coded allele (CA), coded allele frequency (CAF), and association summary statistics (odds ratio (OR) and p-values) are given for each previously reported association with thyroidrelated traits in European Americans. For SNPs not directly genotyped in this study, the proxy in highest linkage disequilibrium in 1000 Genomes CEU samples was identified. Results of adjusted (age, sex, body mass index, and principal component 1) tests of association are given for each previously reported SNP or its proxy in this European American dataset (n5,501). 
doi:10.1371/journal.pone.0111301.s008 (DOCX)

Table S6. Comparison of associations in eMERGE African Americans with previously published SNP associations for thyroid-related traits. SNP rs number, chromosomal location, nearest gene/gene region, coded allele (CA), coded allele frequency (CAF), and association summary statistics (odds ratio (OR) and p-values) are given for each previously reported association with thyroid-related traits in European Americans. For SNPs not directly genotyped in this study, the proxy in highest linkage disequilibrium in 1000 Genomes CEU samples was identified. Results of adjusted (age, sex, body mass index, and principal component 1) tests of association are given for each previously reported SNP or its proxy in this African American dataset (n = 351). doi:10.1371/journal.pone.0111301.s009 (DOCX)

Table S7. Comparison of SNP associations (p < 10−4) in regression models with and without body mass index covariates for serum TSH levels in euthyroid eMERGE study European Americans (n = 4,501). For each SNP, p-values and betas are given for models that include or exclude BMI as a covariate. All models are linear regressions assuming an additive genetic model adjusted for age, sex, and principal component 1. doi:10.1371/journal.pone.0111301.s010 (DOCX)

Table S8. Comparison of SNP associations (p < 10−4) in regression models with and without body mass index covariates for serum TSH levels in euthyroid eMERGE study African Americans (n = 351). For each SNP, p-values and betas are given for models that include or exclude BMI as a covariate. All models are linear regressions assuming an additive genetic model adjusted for age, sex, and principal component 1. doi:10.1371/journal.pone.0111301.s011 (DOCX)

Table S9. Body mass index as a modifier of serum TSH levels genetic associations.
Interaction analyses were performed using the SNPs with p < 1×10−4 significance levels in the model adjusted for age, sex, principal component (PC) 1, and BMI in African Americans (n = 351); the model was stratified by race/ethnicity and by normal/overweight BMI (normal: BMI 18-24.9; overweight: BMI 25+). We considered a SNPxBMI interaction significant at a threshold of p < 0.05. Displayed are significant interaction results at p = 0.05. doi:10.1371/journal.pone.0111301.s012 (DOCX)

Table S10. Power calculations for replication/generalization in eMERGE TSH levels study. Power calculations for replication/generalization of SNPs previously associated with serum TSH levels to eMERGE euthyroid European Americans (EA) and African Americans. SNP rs number, chromosomal location, nearest gene/gene region, coded allele (CA), coded allele frequency (CAF), association summary statistics (betas and p-values), and PubMed ID (PMID) are given for each previously reported association with serum TSH levels in European Americans. Starred (*) CAF represents mean CAF from Taylor et al. Power was calculated for each race/ethnicity using Quanto assuming the previously reported effect size, an additive genetic model, a liberal significance threshold of 0.05, the eMERGE minor allele frequencies, and the eMERGE sample sizes. Power calculations labeled with an asterisk indicate proxy SNPs listed in Table 3 (European Americans) and