Adiponectin GWAS loci harboring extensive allelic heterogeneity exhibit distinct molecular consequences

Loci identified in genome-wide association studies (GWAS) can include multiple distinct association signals. We sought to identify the molecular basis of multiple association signals for adiponectin, a hormone involved in glucose regulation secreted almost exclusively from adipose tissue, identified in the Metabolic Syndrome in Men (METSIM) study. With GWAS data for 9,262 men, four loci were significantly associated with adiponectin: ADIPOQ, CDH13, IRS1, and PBRM1. We performed stepwise conditional analyses to identify distinct association signals, a subset of which are also nearly independent (lead variant pairwise r2<0.01). Two loci exhibited allelic heterogeneity, ADIPOQ and CDH13. Of seven association signals at the ADIPOQ locus, two signals colocalized with adipose tissue expression quantitative trait loci (eQTLs) for three transcripts: trait-increasing alleles at one signal were associated with increased ADIPOQ and LINC02043, while trait-increasing alleles at the other signal were associated with decreased ADIPOQ-AS1. In reporter assays, adiponectin-increasing alleles at two signals showed corresponding directions of effect on transcriptional activity. Putative mechanisms for the seven ADIPOQ signals include a missense variant (ADIPOQ G90S), a splice variant, a promoter variant, and four enhancer variants. Of two association signals at the CDH13 locus, the first signal consisted of promoter variants, including the lead adipose tissue eQTL variant for CDH13, while a second signal included a distal intron 1 enhancer variant that showed ~2-fold allelic differences in transcriptional reporter activity. Fine-mapping and experimental validation demonstrated that multiple, distinct association signals at these loci can influence multiple transcripts through multiple molecular mechanisms.


Introduction
Genome-wide association studies (GWAS) have identified thousands of associations between genomic loci and complex traits [1,2]. However, identification of mechanisms underlying the association signals is often complicated by numerous potential target genes and patterns of linkage disequilibrium (LD) [3]. The complexity of GWAS loci also includes the consequences of allelic heterogeneity [4]. At loci harboring allelic heterogeneity, multiple functional variants can result in multiple distinct association signals. Depending on the haplotype on which each new functional variant allele arose [5], lead variants at these signals may exhibit moderate pairwise LD or may be nearly independent (r 2 <0.01). Allelic heterogeneity can include both cisregulatory and coding variants that may act alone or in combination to affect the phenotype [6]. With increasing sample sizes and use of larger imputation reference panels in array-and sequence-based GWAS, the extent of allelic heterogeneity at complex trait loci is becoming more apparent [7]. This heterogeneity creates challenges in identifying causal variants and target genes, but also leads to more comprehensive characterization of mechanisms by which GWAS variants influence the associated trait(s).
Interpretation of GWAS results includes elucidating the molecular mechanisms by which genetic variants affect gene expression or function. Early candidate gene and genome-wide association studies identified multiple coding variants associated with a complex trait within the same gene [8][9][10]. Other studies have reported loci where multiple regulatory variants in strong LD with each other likely affect one gene [11,12], loci where multiple regulatory variants may affect different genes [13], and loci for which both coding and noncoding variants in the same region are independently associated with a phenotype [5,14]. Examples of the dissection and characterization of multiple, distinct, cis-regulatory association signals at one locus remain few [11,15,16], especially when the signals are not statistically independent of each other.
To detect and characterize the molecular mechanisms at GWAS loci harboring allelic heterogeneity, we tested genetic associations with circulating plasma adiponectin levels in the Metabolic Syndrome in Men (METSIM) study. Adiponectin, encoded by the ADIPOQ gene, is a hormone involved in glucose regulation and secreted almost exclusively by adipose tissue [17]. We identified additional association signals within 1 Mb of significant lead GWAS variants (P<5x10 -8 ) using stepwise conditional analyses. To identify candidate functional variant(s) and gene(s) at these association signals, we used coding annotations, colocalized adipose expression quantitative trait locus (eQTL) signals, chromatin accessibility, and epigenomic marks of transcriptional regulation, and demonstrated allelic effects of variants on regulatory activity in functional assays. Taken together, this study demonstrates that independent and distinct association signals within one locus may affect different transcripts and act via multiple distinct molecular mechanisms.

Genetic variants associated with adiponectin levels
We conducted a genome-wide association study (GWAS) for adiponectin levels in 9,262 nondiabetic men from the METSIM study [18]. We assumed an additive genetic model and tested for association with~16.6 million variants with MAF � 0.08% (minor allele count � 15). We confirmed (P<0.05) twelve adiponectin loci previously identified in European and East Asian individuals (S1 Table), four of which achieved genome-wide significance (rs12051272 within CDH13 intron 1, P = 1.8 x 10 −68 ; rs199938283 near ADIPOQ, P = 8.2 x 10 −55 ; rs149689033, 630 kb upstream of IRS1, P = 3.0 x 10 −9 ; and rs2276824 within an intron of PBRM1, P = 3.5 x 10 −8 ; Table 1; S2 Table;   At these four loci, we identified additional association signals by performing stepwise conditional analyses and, for comparison, an approximate conditional analysis. We defined "distinct" as an association signal that met a locus-wide significance threshold of P cond <1x10 -5 within 1 Mb of an initial lead GWAS variant after conditioning. When the pairwise linkage disequilibrium (LD) between the signal's lead variant and all other identified lead variants at the locus was low (r 2 <0.01), we defined the signal more specifically as "independent". Only one association signal was identified at the IRS1 and PBRM1 loci (S2 Table and S4-S6 Figs). We identified additional association signals at both the ADIPOQ (seven total signals) and CDH13 (two total signals) loci, described in more detail below. After adding the seven additional signals identified at the ADIPOQ and CDH13 loci, the four loci explained 10.3% of the variation in adiponectin levels.

Four independent and three distinct association signals at ADIPOQ
ADIPOQ encodes the adiponectin protein measured in GWAS and is expressed almost exclusively in mature adipocytes [19]. At the ADIPOQ locus, stepwise conditional analysis revealed seven distinct signals associated with adiponectin levels (Table 1; S2 and S3 Tables; Fig 1; S3 and S7 Figs). After conditioning on the lead GWAS variant (rs199938283; signal 'A'), the next strongest adiponectin-associated variant was rs4632532 (P cond = 1.3x10 -35 ; signal 'B'). Variants at signal 'B' were previously reported by the ADIPOGen GWAS meta-analysis to have the strongest association with adiponectin levels. However, studies contributing to that analysis were imputed to the HapMap2 reference panel, which did not include any variants representing signal 'A' [20]. Our subsequent conditional analyses identified four additional association signals (lead variants rs16861209, signal 'C'; rs73187787, signal 'D'; rs17366653, signal 'E'; and rs17846866, signal 'F'; Table 1) that each reached locus-wide significance (P<1x10 -5 ). The next strongest signal (rs62625753, signal 'G') did not meet the significance threshold but is a known missense variant in exon 3 of ADIPOQ (G90S; Table 1) shown previously to influence adiponectin multimerization [21]. No additional coding variants within the ADIPOQ gene reached the locus-wide significance threshold for association with adiponectin levels (S4 Table). Altering the order by which variants were entered into the model as covariates did not alter the lead variants at any association signal, nor did selection of a different variant to represent a particular association signal or use of a different initial imputation reference panel (S5 Table). Together, the seven signals explained 5.8% of the phenotypic variance compared to 2.7% for the lead ADIPOQ signal alone.
To characterize the distinct signals, we examined the LD between lead variants at pairs of signals. We found the pairwise LD to be low (r 2 �0.01) for all signal pairs except for signals 'A' and 'B' (r 2 = 0.05) and signals 'A' and 'F' (r 2 = 0.33), suggesting that signals 'C', 'D', 'E', and 'G' are statistically near-independent, while signals 'A', 'B', and 'F' are distinct, yet the trait-raising alleles are sometimes present on shared haplotypes. To determine the extent to which the three signals act independently of each other, we next performed haplotype association analyses with adiponectin levels in METSIM using the lead variants from signals 'A', 'B', and 'F' (S8 Fig). Compared to a reference haplotype containing the adiponectin-increasing alleles at both signals 'A' and 'B' (rs199938283-Ins and rs4632532-T), haplotypes containing at least one adiponectin-decreasing allele exhibited lower adiponectin levels (e.g. Ins-C, β = -0.193; CT, β = -0.779; S8A Fig). A similar pattern of association was observed between signals 'A' and 'F' (S8B Fig). These results confirm that within haplotypes with the same allele at signal 'A', variants at both signals 'B' and 'F' still contribute to variation in adiponectin levels.
We extended the haplotype association analyses with adiponectin levels to use the lead variants for all seven signals (S9 and S10 Figs). Compared to a reference haplotype containing the Locus with seven adiponectin GWAS signals and subcutaneous adipose eQTL signals for three transcripts. (A) Seven signals near ADIPOQ (labeled 'A'-'G') were identified through stepwise conditional analyses for association with adiponectin levels (n = 9,262). The signals are labeled in the order in which they were identified in the stepwise process and are distinguished by both color and shape. Variants are shaded based on LD with the lead variants, shown as diamonds. (B) Two variants in perfect LD (r 2 = 1.00) with adiponectin GWAS signal 'A' (rs76786086 and rs115527175) show the strongest association with expression levels of ADIPOQ and LINC02043 in subcutaneous adipose tissue in 434 METSIM participants. The strongest adiponectin-associated variant from signal 'F' (rs17846866) also shows the strongest association with expression levels of ADIPOQ-AS1 in subcutaneous adipose tissue in 434 METSIM participants. adiponectin-increasing alleles at each signal (rs199938283-Ins, rs4632532-T, rs16861209-A, rs73187787-T, rs17366653-T, rs17846866-T, and rs62625753-G), haplotypes containing at least one adiponectin-decreasing allele exhibited lower adiponectin levels (e.g. Ins-TACTTG, β = -0.043; Ins-TCTTTG, β = -0.278). Haplotypes containing the largest number of adiponectin-decreasing alleles exhibited the lowest levels of adiponectin when compared to the baseline haplotype. To account for potential errors in chromosome phasing, we repeated haplotype analyses using only the 2,637 METSIM participants homozygous for all seven signals; estimated effect sizes were consistent with those based on the full set of participants (S10 Fig), providing support that our estimated effect size estimates are not influenced by haplotyping errors.

Multiple candidate transcripts at ADIPOQ
We next examined molecular mechanisms at the ADIPOQ locus, including variants in ADI-POQ predicted to affect the adiponectin protein ( Table 2). Signals 'E' and 'G' showed the largest effects on adiponectin levels (Table 1), and both have been shown previously to alter ADIPOQ protein sequence. rs62625753 at signal 'G' encodes Gly90Ser, which decreases adiponectin multimerization [21]. Because we measured total adiponectin, we do not have data on multimers to validate the functional consequences shown previously. rs17366653 at signal 'E' was reported to increase the proportion of ADIPOQ isoform that splices out exons 2 and 3 (50 bp), leading to nonsense-mediated decay [22]. We did not have a sufficient number of study participants in the RNA-seq data that expressed the ADIPOQ isoform without exons 2 and 3 to examine the association between rs17366653 and expression level of the isoform.
To aid in the identification of candidate genes, we examined whether the association signals for adiponectin are colocalized (see Methods) with association signals for expression levels of nearby (<1 Mb) transcripts in subcutaneous adipose tissue from a subset of 434 METSIM participants (S6 Table)[23]. Adiponectin association signal 'A' colocalized with eQTL signals for ADIPOQ (rs76786086-T; β = -1.69; P = 1.0x10 -9 ) and LINC02043 (rs115527175-T; β = -1.19; P = 1.9x10 -5 ), an adjacent long non-coding RNA expressed primarily in brain, adipose, testis, arterial, and splenic tissues ( Table 2;  . No additional eQTL signals were detected for these three transcripts, and no other colocalized cis-eQTL associations were identified for these variants (S7 Table). Variants associated with lower plasma adiponectin level at signal 'A' were associated with lower expression levels of both ADIPOQ and LINC02043, while variants associated with lower plasma adiponectin level at signal 'F' were associated with higher expression level of ADIPOQ-AS1 (S12 Fig). Expression levels of ADIPOQ and ADIPOQ-AS1 are only weakly positively correlated, potentially reflecting the very low relative expression level of ADIPOQ-AS1 and the wide distribution of expression level of ADIPOQ. In comparison, data from GTEx v8[25] showed different lead variants associated with expression level of ADIPOQ that correspond to GWAS signal 'F' and of ADIPOQ-AS1 that correspond to GWAS signal 'E'; these results may be influenced by differences in cell type composition, sex, sampling variability, and/or expression measurement/ analysis (S8 Table).
We further investigated signals 'A' and 'F' at the ADIPOQ locus to evaluate their statistically distinct contributions to gene expression. Within the METSIM adipose eQTL data, we used further conditional analyses to investigate whether the associations between signals 'A' and 'F' and expression levels of ADIPOQ and ADIPOQ-AS1 may have different molecular effects (S9 Table). The association between signal 'A' variant rs143784260 (LD r 2 = 1.00 with lead signal 'A' variant rs199938283) and ADIPOQ expression level persists after conditioning on signal 'F' variant rs17846866 (P initial = 1.0x10 -9 ; P cond = 1.3x10 -6 ). Similarly, the association between signal 'F' variant rs17846866 and ADIPOQ-AS1 expression level persists after conditioning on signal 'A' variant rs143784260 (P initial = 7.6x10 -10 ; P cond = 5.6x10 -12 ). The distinct contributions of each signal on transcript expression were confirmed using haplotype association analyses (S13 Fig). Compared to the reference haplotypes containing the transcript expression-increasing alleles at both ADIPOQ (rs143784260-C, rs17846866-T) and ADIPOQ-AS1 (rs143784260-C, rs17846866-G), haplotypes containing at least one transcript expression-decreasing allele exhibited lower expression levels, providing further support that these two association signals have molecularly distinct effects.

Candidate regulatory variants at ADIPOQ
Based on the position of signal 'A' and 'B' candidate variants in accessible chromatin and chromatin marks of active enhancers in adipocytes/adipose tissue, we tested variants for allelic effects on transcriptional activity (Fig 2; S14 Fig). Of 14 candidate variants at signal 'A' (lead GWAS variant and variants in LD r 2 �0.80), rs76071583, in a 462-base pair element located ~2.5 kb upstream from the ADIPOQ transcription start site (Fig 2A), showed the strongest allelic differences in transcriptional activity. In differentiated 3T3-L1 adipocytes, the element containing the adiponectin-increasing allele rs76071583-A showed 2.5-fold increased enhancer activity compared to the element containing the rs76071583-G allele (forward, P<0.0001; reverse, P = 0.0003; Fig 2B), suggesting that the alleles alter function of a cis-regulatory enhancer element. The sequence containing rs76071583-A is predicted to include a consensus core-binding motif for CEBP-α, and a ChIP-seq peak for CEBP-α also overlaps this region. In electrophoretic mobility shift assays (EMSA) using CEBP-α protein, we observed an allele-specific band for rs76071583-A (S15 Fig), suggesting that CEBP-α binds to rs76071583 to increase enhancer activity at this locus. Additional variants tested representing signals 'A' and 'B' did not exhibit allelic differences in transcriptional activity (S14  (B) rs76071583-A, associated with higher adiponectin levels, showed greater transcriptional activity in both forward and reverse orientation with respect to ADIPOQ in 3T3L1 differentiated adipocytes compared to rs76071583-G and an "empty vector" containing a minimal promoter. (C) rs73187787 and six candidate variants in high pairwise LD (r 2 �0.80) span a 10 kb region within ST6GAL1 intron 2. (D) A haplotype of variant alleles rs13322149-G and rs55958900-A, associated with lower adiponectin levels, showed greater transcriptional activity in both forward and reverse orientation in 3T3L1 differentiated adipocytes compared to the TC haplotype and an "empty vector" containing a minimal promoter. https://doi.org/10.1371/journal.pgen.1009019.g002

PLOS GENETICS
variants from signal 'A'. However, we did not observe any strong annotations of regulatory elements (S7 Fig) and the functional consequences of signals 'B' and 'C' remain to be detected.
Of candidate variants at ADIPOQ signal 'D', we examined transcriptional activity in reporter assays for a 445-base pair region spanning rs13322149 and rs55958900 located 135 kb distal to the ADIPOQ promoter within the first intron of ST6GAL1 ( Fig 2C). In differentiated 3T3-L1 cells, the haplotype containing both alleles associated with lower adiponectin (rs13322149-G and rs55958900-A) exhibited � 2.4-fold higher enhancer activity (forward, P = 0.012; reverse, P = 0.009) compared to the haplotype containing the alleles associated with higher adiponectin (Fig 2D), suggesting that one or both variants within this haplotype alter a distal cis-regulatory element. While we do not have direct mechanistic evidence of a link to any transcript, the direction of effect of adiponectin decreasing alleles on transcriptional activity is consistent with an effect on ADIPOQ-AS1, not ADIPOQ or LINC02043. The allelic effects on transcriptional activity, distance (>100 kb) from the other signals, and direction of effect consistent with ADIPOQ-AS1 all provide support for functional consequences molecularly distinct from the other signals.

Two association signals at CDH13
Expressed in multiple cell types throughout the cardiovascular system, CDH13 encodes the cadherin-13 receptor, a cell surface receptor for hexameric and high-molecular weight adiponectin known to influence levels of circulating adiponectin [26]. Positive feedback exists between adiponectin and the cadherin-13 receptor [27]. At the CDH13 locus, we identified two distinct adiponectin association signals. Conditioning on the lead GWAS variant (rs12051272, signal 'A') revealed a second distinct signal (rs4782722, P cond = 3.2x10 -8 , signal 'B') ( Table 1; S2  Table; Fig 3A; S2 Fig). Pairwise LD between the lead variants at the two signals was low (r 2 = 0.04). No CDH13 coding variants reached locus-wide significance with adiponectin levels after conditioning on rs12051272 and rs4782722 (S10 Table). Compared to a reference haplotype containing the adiponectin-increasing allele at both variants (rs12051272-G and rs4782722-T), haplotypes containing at least one adiponectin-decreasing allele exhibited lower adiponectin levels (GG, β = -0.072; TT, β = -0.381; Fig 3D). Haplotypes containing both adiponectindecreasing alleles exhibited the lowest levels of adiponectin (TG, β = -0.494) when compared to the baseline haplotype. Together, the two signals explained 3.8% of the variance in adiponectin levels compared to 3.5% for the lead CDH13 signal alone.

Function of candidate regulatory variants at CDH13
We next tested the CDH13 signals for colocalization with subcutaneous adipose eQTL signals for nearby transcripts. The adiponectin-decreasing allele rs2239857-G, a variant in perfect LD (r 2 = 1.0) with the lead adiponectin GWAS variant of signal 'A' (rs12051272; r 2 = 1.00), is most strongly associated with increased CDH13 expression level (β = 1.00; P = 4.35x10 -26 ; Table 2; S6 Table; Fig 3E). While we did not find significant evidence of a second signal associated with expression level of CDH13 or colocalizations with expression level of any other transcripts, larger eQTL studies would have more power to detect additional signals.
To investigate whether the two statistically distinct adiponectin-association signals at CDH13 have distinguishable consequences, we investigated potential molecular mechanisms. At both signals, no candidate variants (LD r 2 �0.80 with rs12051272 or rs4782722) are located in coding regions. The lead signal 'A' variant rs12051272, is located in intron 1,~3 kb from the CDH13 transcription start site, in an accessible chromatin region with chromatin marks of active enhancers in multiple cell types including HeLa (Fig 4A and 4B; S16 Fig). We used HeLa cells to examine transcriptional activity in reporter assays for a 775-bp region spanning PLOS GENETICS rs12051272 (Fig 4A and 4B). The adiponectin-increasing allele rs12051272-G showed an average of 1.7-fold increased enhancer activity compared to the rs12051272-T (P = 0.004), suggesting that rs12051272-G may be responsible for increased transcriptional activity of a target gene leading to higher adiponectin levels. The rs12051272-G allele that showed increased enhancer activity in HeLa cells was unexpectedly associated with decreased CDH13 adipose tissue expression levels, suggesting that the effect observed in HeLa cells does not fully represent the adipose tissue signal, that CDH13 expression level reflects feedback [27,28], or that CDH13 is not the target gene.
At CDH13 signal 'B', we investigated three intronic variants located~12 kb distal to the CDH13 transcription start site in accessible chromatin regions with marks of active enhancers (rs4782722, rs12444113, and rs3910232; Fig 4C and 4D; S17 and S18 Figs). In HeLa cells, a

PLOS GENETICS
398-bp region containing the adiponectin-decreasing alleles of two variants (rs4782722-G and rs12444113-G) showed 1.7-fold increased enhancer activity compared to the haplotype containing the non-risk alleles (P = 0.0001). Additional haplotype constructs containing rs4782722-G showed a 1.4-fold increase in enhancer activity compared to haplotype constructs containing rs4782722-T (S17 Fig), suggesting that rs4782722-G is responsible for increased transcriptional activity of a target gene leading to lower adiponectin levels. Taken together, these data show that rs4782722 at CDH13 signal 'B' exhibits allelic differences in transcriptional enhancer activity and suggest it functions within a distal intron 1 cis-regulatory element, distinct from the proximal intron 1 variants at CDH13 signal 'A'.

Discussion
Here we described statistically distinct and independent complex trait association signals that act through molecularly distinct mechanisms. At the ADIPOQ GWAS locus, which harbors extensive allelic heterogeneity, seven association signals appear to act through two different coding mechanisms (missense variant altering protein multimerization and splice variant leading to a nonfunctional transcript) and multiple transcriptional regulatory mechanisms acting on three target transcripts, ADIPOQ and two long non-coding RNAs. At the CDH13 GWAS locus, two signals associated with decreased plasma adiponectin levels likely act via proximal intron 1 promoter (signal 'A') and distal intron 1 enhancer (signal 'B') variants to affect expression levels of CDH13. Using molecular-based assays, we demonstrated variants at several association signals at both the ADIPOQ and CDH13 loci are located in regulatory elements and exhibit significant allelic effects on transcription. Identification and molecular dissection of multiple association signals at a GWAS locus provides a more accurate measure of the explained trait variability, enabling additional molecular mechanisms and target genes to be identified, furthering our understanding of target gene regulation and the biology underlying complex traits.
Allelic heterogeneity is a common characteristic of complex traits [29], and identifying multiple signals at GWAS loci serves several purposes. First, each signal can help identify which genes influence the trait [13]. Even at loci for which an effector gene is obvious, such as ADI-POQ for adiponectin levels, each signal can help elucidate gene function and/or a regulatory mechanism. In addition, each association signal can account for additional trait variation and explain more trait heritability [5]. Finally, the prediction accuracy of polygenic risk scores relies on the full set of variants that affect a trait, and including variants from multiple association signals at a locus can improve prediction accuracy [30].
As GWAS sample sizes increase and more signals are identified, the intricacies of how to characterize, annotate, and/or dissect multiple signals at individual loci are becoming more apparent [4]. eQTL sample sizes are also increasing, gaining sufficient statistical power to identify and differentiate multiple signals [25]. Several methods exist for identifying multiple signals at a GWAS locus[31, 32]. We had individual-level genotype and trait data available; therefore, we performed exact stepwise conditional analyses. This method proved robust to different modifications in consistently identifying the same lead variants and order in which the signals appeared. When only summary association data are available, such as from GWAS meta-analyses, distinct signals can be detected using approximate conditional analysis with GCTA with estimated LD from a provided reference sample[32]. When we applied GCTA using the genotype information from the GWAS study samples and METSIM as a reference for LD, GCTA and stepwise conditional analyses yielded nearly identical results across all loci, with the exception of one very rare signal (ADIPOQ signal 'G'). While all methods for identification of multiple association signals have limitations (e.g. computationally intensive, data availability, accurate representation for LD), identifying potentially novel additional signals can better capture the genetic architecture of GWAS loci.
Adiponectin is an adipose-tissue derived hormone that plays a central role in energy homeostasis [17]. Higher levels of adiponectin can protect from obesity, type 2 diabetes, atherosclerosis, and cardiovascular disease [33]. Interaction of adiponectin with its receptors activates signaling pathways that affect insulin signaling, nitric oxide production, adipogenesis, glucose uptake, fatty acid oxidation, lipogenesis, glycolysis, and gluconeogenesis [34]. Many loci associated with adiponectin levels are shared with other insulin resistance traits and risk of type 2 diabetes, including loci near IRS1, LYPAL1, ARL15/FST, VEGFA, CMIP, and PEPD [7]. The insulin-sensitizing, anti-diabetic, anti-atherogenic, and anti-inflammatory properties of adiponectin suggest its potential relevance as a therapeutic target for diabetes and metabolic syndrome [34].
Given the obvious biological link between the ADIPOQ gene and circulating plasma adiponectin levels, both candidate gene studies and GWAS have previously identified pronounced associations between variants in and near ADIPOQ and adiponectin levels [35][36][37]. Previous GWAS meta-analyses have consistently reported the strongest associations for variants in high LD (r 2 �0.80) with signal 'B' from this study [20,38]. However, these meta-analyses used genotype data imputed to the relatively sparse HapMap2 reference panel that does not include representative variants from signals 'A', 'D', 'E', 'F', or 'G'. Similar to our observation, another study using the denser UK10K reference panel identified stronger adiponectin association for variants within signal 'A' [39]. Experimental studies have also identified variants that functionally affect adiponectin levels, including rs17366653 (signal 'E') and rs62625753 (signal 'G') and other coding variants not polymorphic or available in this study [22,40,41]. Our results suggest that many statistically independent (signals 'C', 'D', 'E', 'G') and distinct (signals 'A', 'B', and 'F') association signals exist that act via distinct molecular mechanisms to affect adiponectin levels. Association analyses in larger studies will likely detect additional signals and may reveal additional potential molecular mechanisms.
ADIPOQ association signal 'A' was colocalized with eQTL signals for ADIPOQ and LINC02043, while ADIPOQ association signal 'F' was colocalized with an eQTL for ADIPO-Q-AS1, suggesting that the functional variant(s) at each signal may regulate expression of a non-coding transcript. Very little is known about LINC02043, which was very lowly expressed, but long non-coding RNAs can regulate nearby genes in cis through mechanisms such as transcriptional initiation, splicing, mRNA stability, translation, and the regulation of epigenetic modifications [42,43]. The alleles associated with decreased plasma adiponectin levels at ADI-POQ signal 'A' are associated with decreased expression of both ADIPOQ and LINC02043, suggesting that LINC02043 may play a role in coordinating the regulatory activity of ADIPOQ, although LINC02043 may simply be affected by nearby enhancers but not influence adiponectin levels. Previous work has suggested that ADIPOQ-AS1 plays a post-transcriptional role in the regulation of ADIPOQ, inhibiting adipogenesis through the formation of an ADIPO-Q-AS1/ADIPOQ mRNA complex to suppress the translation of ADIPOQ mRNA [24,44]. In addition, overexpression of ADIPOQ-AS1 was shown to inhibit the differentiation of preadipocytes, suggesting ADIPOQ-AS1 plays a crucial role in adipogenesis [24]. The directions of association we observe between variants at ADIPOQ signal 'F' with both decreased plasma adiponectin levels and increased expression level of ADIPOQ-AS1 are consistent with this role.
Complex loci can include multiple association signals that are located close in physical position yet are independent or distinct. Some GWAS loci are relatively straightforward, with a single association signal, functional variant, and mechanism. We demonstrate that multiple signals within a locus may also be functionally examined separately from each other, even when they are not statistically independent, and that multiple signals can exhibit distinct effects through multiple mechanisms and multiple transcripts. Full characterization of GWAS loci with regards to signal identification, fine-mapping, and functional annotation can provide insight into the genetic architecture of the loci and the corresponding traits.

Ethics statement
The METSIM study was approved by the Ethics Committee of the University of Eastern Finland and Kuopio University Hospital in Kuopio, Finland and carried out in accordance with the Helsinki Declaration. Written informed consent was obtained from all participants.

Adiponectin measurements
All METSIM participants participated in a one-day outpatient visit to the Clinical Research Unit at the University of Eastern Finland for data collection, which included an interview for their medical history and a blood sample following a 12-hour fast. Plasma adiponectin was measured using the Human Adiponectin ELISA kit (LINCO Research, St. Charles, MO, USA).

Genotyping and imputation
We genotyped the METSIM participants using the HumanOmniExpress-12v1_C BeadChip and Infinium HumanExome-12 v1.0 BeadChip [45]. Genotypes were imputed using the HRC reference panel [46], and separately using the GoT2D reference panel [45], a dense reference panel of >19 million variants (SNPs, in-dels, and large deletions) based on the whole genome sequencing of 2,657 Europeans individuals from Germany, Sweden, Finland, and Britain. Following quality control [45], we tested 16,607,452 variants for association with adiponectin.

Single-variant analysis
We tested for adiponectin association using the imputed dosages for all variants with summed minor allele count dosage >1 assuming an additive model of inheritance and accounting for cryptic relatedness using the EMMAX linear mixed model approach implemented in EPACTS [47]. Circulating adiponectin levels were adjusted for BMI, age, and age 2 , and then residuals were inverse normalized. Sensitivity analyses were also performed adjusting for age, age 2 , and fat mass percentage; results were similar (S5 Table).

Conditional analyses
At loci that exhibited genome-wide evidence of association (P<5x10 -8 ), we performed a series of stepwise conditional analyses by adding the most strongly associated variant into the regression model as a covariate and testing for association with all remaining variants within 1 Mb of the initial lead GWAS variant at each locus. At loci with genome-wide evidence of association, we set a less stringent, locus-wide significance threshold of P<1x10 -5 to define additional signals based on an approximate 5,000 tests performed at a locus. [7] We performed stepwise conditional analyses until the strongest association variant showed a conditional P-value > 1x10 -5 or had no annotation evidence that suggested a functional role. Sensitivity analyses were also performed altering the order at which signals were included in the stepwise conditional analysis and altering which variant was selected as the representative for each association signal; results were similar.
We also used Genome-wide Complex Trait Analysis[32] (GCTA) approximate regression models to identify multiple association signals at locus-wide significance (P<1x10 -5 ) using genotype data from 10,197 METSIM participants as the LD reference panel. Haplotype analyses were performed using Haplostats in R (https://cran.r-project.org/web/packages/haplo. stats/index.html). We created regional association plots using LocusZoom [48]. Unless otherwise noted, all LD estimates were calculated based on the METSIM data from 10,197 individuals; reported allele frequencies were calculated based on the 9,262 individuals included in this analysis.
We evaluated the proportion of variance explained by a single variant or any given locus by including the variant or a set of variants into a linear regression model with all covariates used in association analysis and calculating the R 2 for the full model. We performed these analyses using SAS version 9.4 (SAS Institute, Cary, NC, USA).

Subcutaneous adipose eQTLs
To aid in the identification of candidate genes at the most strongly associated signals, we examined whether any of the variants associated with adiponectin were also associated with expression of nearby transcripts in subcutaneous adipose tissue. The expression quantitative trait locus (eQTL) data were generated from a subset of 434 METSIM participants with RNAsequencing data [23]. A false discovery rate (FDR) <1% was used to identify cis-eQTLs. We also performed stepwise conditional analyses at each colocalized locus, conditioning stepwise on the variant most strongly associated with gene expression, to identify additional eQTL signals for each transcript. GWAS and eQTL signals were considered to be colocalized when (a) the GWAS variant and the variant most strongly associated with expression level of the corresponding transcript exhibit high pairwise LD (r 2 �0.80), and (b) the association of the eQTL signal became insignificant (FDR>1%) after conditioning on the GWAS variant. For colocalization analyses with ADIPOQ signal 'A', we used proxy variant rs143784260 to represent rs199938283 (r 2 = 1.00; calculated in METSIM), which was not available in the cis-eQTL dataset.

Variant annotation
To establish a set of candidate functional variants at the ADIPOQ and CDH13 loci, we used transcriptional regulatory chromatin marks (accessible chromatin and histone modification states) to predict which variants may affect the transcription of the nearby genes. We compiled regulatory elements from ENCODE [49], ChromHMM [50], Roadmap Epigenomics Project [51] data available through the UCSC Genome Browser [52], and accessible chromatin data from adipose tissue, preadipocytes, and adipocytes [53,54]. We defined candidate variants as those located within accessible chromatin peaks, Roadmap ChromHMM chromatin states of enhancers and promoters, and chromatin-immunoprecipitation sequencing (ChIP-seq) peaks of histone modifications H3K4me1, H3K4me3, and H3K27ac, and transcription factor binding sites in adipose-derived mesenchymal stem cells and adipocyte nuclei (ADIPOQ locus) and HeLa cells (CDH13 locus only).

Cell culture
Cell lines were selected based on expression patterns of nearby genes. Because ADIPOQ is expressed almost exclusively in mature adipocytes[25], we used 3T3-L1 cells differentiated into mature adipocytes for experiments with variants at the ADIPOQ locus. CDH13 is expressed throughout most of the tissues in the body, including vascular cells and adipose tissue [25]. Because variants at the CDH13 locus were observed to be located in accessible chromatin regions in HeLa cells, we performed experiments with these variants in HeLa cells. 3T3-L1 cells (ATCC, CL-173) were cultured and differentiated as described in the ATCC protocol. HeLa cervical cancer cells (ATCC CCL-2) were cultured in DMEM/Ham's F-12 (Corning) supplemented with 10% FBS. All cell lines were maintained at 37˚C with 5% CO 2 .

Transcriptional reporter assays
We designed oligonucleotide primers (S11 Table) with KpnI and XhoI restriction sites. Using Q5 High-Fidelity DNA Polymerase (New England BioLabs), we PCR-amplified fragments surrounding each variant from DNA of individuals homozygous for each allele and cloned the PCR product into the multiple cloning site of a minimal promoter firefly luciferase reporter vector (pGL4.23, Promega) in both orientations with respect to the reporter gene. For CDH13 signal 'B', rs4782722 in the 398 bp construct was altered to create missing haplotypes using the QuikChange II Site Directed Mutagenesis Kit (Agilent Technologies). Three to five isolated clones for each allele for each orientation were sequenced to verify genotype. Cells were seeded with approximately 2×10 5 (3T3-L1) or 6×10 5 (HeLa) cells per well in 24-well plates. Each clone was co-transfected with Renilla luciferase vector (phRL-TK, Promega) in duplicate (HeLa) or triplicate (differentiated 3T3-L1) using Lipofectamine 2000 (Life Technologies) according to manufacturer's recommendations. After 48 hours transfection, we lysed cells with passive lysis buffer (Promega) and measured luciferase activity using the Dual-Luciferase Reporter Assay System (Promega). We normalized firefly luciferase readings to Renilla luciferase readings and normalized all readings to the average of two empty pGL4.23 vector transfections. We used 2-sided Student's t-tests to compare allelic differences in firefly luciferase activity. All transfections were repeated independently and yielded comparable results.

Electrophoretic mobility shift assay (EMSA)
We designed 17-bp biotin-labeled and unlabeled complementary oligonucleotide probes for both rs76071583 alleles (S11 Table). As previously described [55], using the LightShift Chemiluminescent EMSA Kit (Thermo Fisher Scientific), 300 femtomole of biotin-labeled doublestranded oligos were incubated with 450 ng recombinant CEBP-a protein (Creative Biomart CEBPA-153H). To test the specificity of the protein complexes to the allele, 50x excess unlabeled probes were added to the binding reactions. We repeated EMSA experiments on independent days and obtained consistent results. indicates the number of estimated haplotypes. Alleles associated with lower adiponectin are shown in purple while alleles associated with higher adiponectin are shown in green. Haplotype association was performed with adiponectin inverse normalized residuals after adjusting for age, age 2 , and BMI using the most prevalent haplotype as the reference. The order of haplotype effect sizes is consistent with the order of haplotype effect sizes for all 9,262 study participants shown in S9 signal 'A' show rs12051272-G is consistently associated with greater transcriptional activity in both the forward and reverse orientations with respect to CDH13 in HeLa cells compared to rs12051272-T and an "empty vector" containing a minimal promoter. Using a linear regression model to analyze data from all three transcriptional activity experiments provides further support that rs12051272-G is associated with greater transcriptional activity in both the forward (P = 0.02) and reverse (P = 0.004) directions. (TIF) S17 Fig. The second signal at CDH13 (signal 'B') exhibits allelic differences in transcriptional activity. (A) rs4782722 and all five candidate variants in high pairwise LD (r 2 �0.80) span a 2.5 kb region in CDH13 introns 1 and 2. (B) At CDH13 signal 'B', additional haplotypes of rs4782722 and rs12444113 created by site-directed mutagenesis of rs4782722 implicate rs4782722 as a regulatory variant. Transcriptional reporter assays of a regulatory region spanning rs4782722 and rs12444113 in HeLa cells show that haplotypes containing rs4782722-G (GG and GC) exhibited greater transcriptional activity than haplotypes containing rs4782722-T (TG and TC) in the forward and reverse orientations. Each dot represents transcriptional activity of an independent experimental clone. (TIF) S18 Fig. rs3910232 at CDH13 signal 'B' does not exhibit allelic differences in transcriptional activity in HeLa cells. rs3910232 is a proxy of rs4782722 but does not appear to contribute to transcriptional activity differences at this locus. (TIF) S1