Regional fat depot masses are influenced by protein-coding gene variants

Waist-to-hip ratio (WHR) is a prominent cardiometabolic risk factor that increases cardio-metabolic disease risk independently of BMI and for which multiple genetic loci have been identified. However, WHR is a relatively crude proxy for fat distribution and it does not capture all variation in fat distribution. We here present a study of the role of coding genetic variants on fat mass in 6 distinct regions of the body, based on dual-energy X-ray absorptiometry imaging on more than 17k participants. We find that the missense variant CCDC92S70C, previously associated with WHR, is associated specifically increased leg fat mass and reduced visceral but not subcutaneous central fat. The minor allele-carrying transcript of CCDC92 is constitutively more highly expressed in adipose tissue samples. In addition, we identify two coding variants in SPATA20 and UQCC1 that are associated with arm fat mass. SPATA20K422R is a low-frequency variant with a large effect on arm fat only, and UQCC1R51Q is a common variant reaching significance for arm but showing similar trends in other subcutaneous fat depots. Our findings support the notion that different fat compartments are regulated by distinct genetic factors.


Introduction
While generalised adiposity, as measured by body mass index (BMI), is a well-established and major risk factor for cardio-metabolic diseases, body fat distribution is increasingly being recognised as an even stronger determinant of metabolic risk [1,2]. For example, Yusuf et al. [1] showed that waist-to-hip ratio (WHR) is a stronger predict of myocardial infarction than BMI. To date, the overwhelming majority of genome and exome-wide association studies on fat distribution have focussed on waist and hip circumference and WHR [3,4]. While these measures are easy and cheap to obtain on a large scale, they do not capture all variation in fat a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 distribution. For example, WHR does not capture peripheral fat stored in the upper limbs and the distribution of overall central fat over the subcutaneous and visceral compartments, of which the latter have been suggested to have discordant effects on cardio-metabolic risk [5][6][7][8]. Furthermore, circumference-based estimates of fat accumulation do not take into account differences in lean mass and bone structure and mass. Therefore, genetic association studies based on direct measures of regional fat mass may help unpick mechanisms underlying the expansion of distinct fat depots.
Quantification of fat mass in distinct regions of the body requires imaging methods with post-image processing to derive delineation of tissues, such as magnetic resonance imaging or dual x-ray absorptiometry (DXA). DEXA-based measures of regional body fat have been shown to correlate more strongly with metabolic risk factors than traditional anthropometric risk factors, such as BMI and waist and hip circumference [9]. We therefore investigated associations of common and rare non-synonymous variants covered on the exome chip with 6 distinct fat compartments, derived from DXA scans on 17,212 participants from 3 cohorts. We hypothesise that by identifying fat depot-specific genetic loci we may gain better insight into the site-specific role of adipose tissue on disease aetiology.

Results and discussion
We tested the associations of all non-synonymous genetic variants covered on the Illumina Human Exome Bead chip with DXA-derived fat mass in the 6 body fat regions: arm fat, leg fat, gynoid fat, total android fat, visceral abdominal fat and subcutaneous abdominal fat (S1 Table). The regional fat phenotypes were adjusted for the first 4 genetic principal components, age and % body fat and the residuals were rank-based inverse normally transformed for men and women separately. The exome chip meta-analyses included up to 17,212 participants of European ancestry from the Oxford Biobank [10], Fenland [11] and EPIC-Norfolk [12] cohorts (Table 1 and S1 Table). Three non-synonymous variants reached exome-wide significance (p<2×10 −7 ) (Fig 1, Table 2 and S2 Table): rs11057401, a common missense variant in Coiled-Coil Domain Containing 92 (CCDC92 S70C ); rs62621401, a novel low-frequency missense variant in Spermatogenesis Associated 20 (SPATA20 K422R ) and rs4911494, a common missense variant in Ubiquinol-Cytochrome C Reductase Complex Assembly Factor 1 (UQCC1 R51Q ). Heterogeneity tests based on Cochran's Q statistics indicate no significant differences among the effect estimates in the 6 contributing datasets (p for Cochran's Q statistic for rs11057401 = 0.19, for rs4911494 = 0.61 and rs62621401 = 0.42). The three loci identified here show the same direction of effect in both men and women. However, the effect size of rs11057401 (CCDC92) on visceral fat was two-fold stronger in men than in women whereas the association with lower body fat mass showed no difference between the sexes. An additional 30 non-synonymous variants reached suggestive significance across 38 tests (p<10 −6 , S3 Table), including  [13]. The minor allele of rs11057401 shows significant opposing effects on android, specifically visceral fat mass, and lower body fat, in the sex-combined additive model (total abdominal fat mass: β = -0.065, SE = 0.012, p = 3.5×10 −8 ; visceral abdominal fat mass: β = -0.063, SE = 0.012, p = 1.3×10 −7 ; leg fat mass: β = 0.075, SE = 0.020, p = 4.9×10 −10 ) ( Table 2). These data extend WHR associations reported by Justice, et al. [3] and Lotta et al. [11] at this locus. Lotta et al. [11] describe the contribution of leg fat mass; we here demonstrate an additional opposing effect specifically on abdominal visceral fat mass but not for abdominal subcutaneous fat mass, which would correspond to the already observed association with increased waist circumference but with the present analysis showing that the effect is confined to the intra-abdominal fat depot only. Found on chromosome 12q24, CCDC92 is ubiquitously expressed with highest levels in adipose tissue, brain and testes. It is a nuclear (EAF = 0.016) and C. rs4911494 in UQCC1 (EAF = 0.62). Data is presented for the 6 DXA measures under investigation and is presented as the beta value ± SD. The meta-analysis significance level using an additive model for gender combined (All) as well as for gender stratified analysis, together with the N indicated to the right of the data in parentheses. DXA measures are Arm fat mass (Arm), Total android fat mass (Android), Subcutaneous android fat mass (Abdominal subcut), Visceral android fat mass (Abdominal visceral), Gluteal fat mass (Gynoid) and Leg fat mass protein interacting with the centriole-ciliary interface [11] and may also be involved in DNA repair [14]. The lead variant tags a large haplotype of at least 99 SNPs (r 2 >0.9) across a number of genes including the putative transcription factor Zinc Finger Protein 664 (ZNF664) and Dynein Axonemal Heavy chain 10 (DNAH10) ( Table 2). There is also strong evidence for multiple eQTL signals across this haplotype which includes three genes, i.e. CCDC92, DNAH10 and ZNF664 [15]. Previous GWAS studies have also associated SNPs in this haplotype with a reduction in insulin resistance [11], improvements in metabolic syndrome [16], reduced WHRadjBMI [3,4], increased adiponectin levels [17] and with increased plasma HDL-cholesterol and reduced triglyceride concentrations [18][19][20]. Ablation of CCDC92 and DNAH10 in mouse OP9-K cells impairs adipogenesis and reduced lipid accumulation [11]. To further define the likely causative gene or genes in this complex region we undertook a number of gene expression studies in human regional adipocytes and whole adipose tissue. CCDC92 and ZNF664 showed very similar expression profiles between abdominal subcutaneous adipose tissue(ASAT), gluteal subcutaneous adipose tissue(GSAT) and arm subcutaneous adipose tissue derived (ARM) cDNA (Figs 2 and 3, S1-S3 Figs), whilst DNAH10 expression could not be detected in cDNA from either of the diverse human adipose tissues or cultured primary human preadipocytes, making it an unlikely effector transcript. Across a panel of 52 paired ASAT and GSAT cDNA samples (S1 Fig), qPCR showed small differences in expression of CCDC92 and ZNF664 between ASAT and GSAT as well as between lean and obese individuals. In a cultured human primary pre-adipocyte differentiation time course experiment, both CCDC92 and ZNF664 showed a significant upregulation by day 4 of differentiation (Fig 2A  and 2B) but no difference in expression levels was observed between preadipocytes of ASAT and GSAT origin. Whilst this study focusses on exonic coding variants, recent studies have highlighted that coding variants are not always the causal variant at the locus, but may be associated with the trait simply because they are correlated with the causal (non-coding) variant [21]. To that end, we also sought to investigate the reported eQTL signals at this locus, for both CCDC92 and ZNF664 (GTEx project [15] and [11]) using allele-specific qPCR; a method that allows us to assess expressed allelic imbalance in heterozygous individuals and thus an eQTL. This showed a highly statistically significant increased expression of transcripts found on the minor allele haplotype for both genes (ASAT 5.8%, GSAT 4.9%, Fig 3A and 3B). Of functional importance is that this allelic expression imbalance would result in the increased expression of the predicted deleterious serine-70-cysteine amino acid substitution in the CCDC92 protein.
Interestingly, zinc finger proteins such as ZNF664 have been suggested to regulate the expression of near-by genes [16]. The observed co-regulatory expression pattern of ZNF664 and CCDC92 could then possibly be due to the eQTL acting on ZNF664 which then upregulates the CCDC92 transcript carrying the deleterious coding variant. Further work needs to be done to investigate this. The rs62621401 SPATA20 K422R is a rare variant (EAF 0.016), which has to our knowledge not been reported for any other phenotypes. This amino acid substitution is predicted to be , SPATA20 (C), UQCC1 (D) and GDF5 (E) was measured over a 14-day adipogenic differentiation time-course using primary preadipocytes from abdominal subcutaneous adipose tissue (ASAT) and gluteal subcutaneous adipose tissue (GSAT) fat depots [40]. Data are shown as DDCt values (normalized to PPIA and PGK1; n = 6, mean ± SEM). A multivariate general linear model was used to test for statistical significance between depots and time, and to assess depot x time interactions. p-values are presented in the shaded boxes, NS: nonsignificant.
https://doi.org/10.1371/journal.pone.0217644.g002 Allelic expression was measured on 4 candidate transcripts in our three exome-wide significant regions using allele-specific qPCR. Data is presented as the % of the minor allele detected compared to the major allele, as described in the methods, with a line indicating the mean and 95%CIs. To assess the rs11057401 eQTL haplotype the proxy SNP rs9863 was assessed for benign and allele-specific qPCR on paired ASAT and GSAT cDNA samples of five Oxford Biobank participants heterozygous for rs62621401 did not reveal any association with expression of SPATA20 (Fig 3C). This variant shows a large effect size and is the first locus to be associated with arm fat mass (β = -0.29, SE = 0.04, p = 1.5×10 −11 , Table 2 and Fig 1B) with an estimated per-allele effect size in the Oxford Biobank (n = 4,606) of 125g less arm fat mass (-5.8%, 95% confidence interval (CI) = [-9.3%,-2.3%]) in men and 67g (-2.6%, 95% CI = [-5.3%,0.4%]) in women (approximate fat mass (grams) per allele after adjusting for covariates with % change in parentheses, S1 Table). SPATA20, linked to spermatogenesis in mice [22], is highly expressed in human testes but also ubiquitously expressed, including in adipose tissue. SPATA20 is a putative member of the thioredoxin family and other members of this family have been shown to be involved in preadipocyte proliferation [23] and pro-adipogenic Wnt signalling [24]. SPATA20 expression was higher in men than women, although this was only significant for GSAT (p = 0.01) (S1 Fig). Expression of SPATA20 during adipocyte differentiation showed an increase between days 0 to 7 of adipogenesis then a drop back to pre-differentiation levels between days 7 and 14 (p = 8.3×10 −8 , Fig 2C) suggesting a role for this gene in adipocyte development. Surprisingly, despite the arm-specific association, expression of SPATA20 was similar between arm fat, ASAT and GSAT (Panel A in S2 Fig). However, qPCR assessment of a number of developmental genes (Homeobox genes) in arm adipose tissue compared to ASAT and GSAT showed significant differences (Panel B in S2 Fig), which suggests that arm fat is developmentally distinct from the other fat depots assessed. It is therefore possible that SPATA20 is involved in an arm fat-specific developmental pathway.
The rs4911494 UQCC1 R51Q (EAF = 0.62) variant was also inversely associated with arm fat mass ( Table 2 and Fig 1C) but was not predicted as damaging by predictSNP. Whilst exomewide significance is only observed for arm fat, the arm fat-lowering allele is suggestively associated with less fat in all peripheral and subcutaneous fat depots and in both men and women ( Fig 1C and S1 Table). UQCC1 plays a role in mitochondrial respiratory chain complex III protein expression [25] and is structurally similar to the mouse Bfzb controlling mouse brown fat [26]. Previous associations at this locus, either the arm fat reducing allele of rs4911494 reported here or SNPs in LD have shown; reduced height [27], reduced body weight [28], increased WHRadjBMI [3] and increased osteoarthritis risk [29]. Another missense variant in the nearby canonical Wnt signalling gene GDF5 (rs224331) is in LD with rs4911494 (r 2 >0.9) and reaches suggestive significance with arm fat mass in women (β = 0.13, SE = 0.03, p = 8.97×10 −7 , S3 Table). During adipogenesis, expression of UQCC1 increases (p = 2.4×10 −22 , Fig 2D) but with no difference between ASAT and GSAT in cultured pre-adipocytes. Allele-specific qPCR showed that the minor allele (rs4911494) was associated with a small but statistically significant decrease in expression of UQCC1 in ASAT (per-allele percentage change in expression = -1.25%, p = 8.7×10 −4 ; Fig 3D). It is unclear whether this small change is biologically relevant. GDF5 expression was not detected in adipose tissue cDNA samples. However, during adipogenesis GDF5 showed transient expression at day 2, highlighting the possibility that GDF5 is a regulator of early adipocyte differentiation. CCDC92 (A) and the transcribed region proxy SNP rs1054852 for ZNF664 (B). The index SNP rs62621401 was used to assess the SPATA20 transcript (C) and the index SNP rs4911494 for UQCC1. Paired samples were compared between abdominal subcutaneous adipose tissue (ASAT) and gluteal subcutaneous adipose tissue (GSAT) and genomic DNA (gDNA). For each transcript ABI Taqman genotyping assays were selected that fall within the transcribed sequence. gDNA selected from the same individuals as the cDNAs acts as a paired control with presumed equal allele expression. Deviation from 50% for gDNA, particularly pronounced in SPATA20 (C), represents inherent imbalance in assay technical performance and position of optimal Ct between Vic and Fam fluorescence. By using paired gDNAs to select cDNAs allelic expression imbalance can be resolved by comparing cDNA to its paired gDNA. Significance was assessed with paired t-test in SPSSv24. Mean differences between comparisons and statistical significance is presented in shaded boxes. NS: Non-significant. The single outlier seen for SPATA20 (C) was replicated in a second cDNA synthesis and both ASAT and GSAT. No phenotype differences were observed for this individual and no obvious genetic differences were observed. https://doi.org/10.1371/journal.pone.0217644.g003 Regional fat depot masses and protein-coding gene variants

Conclusion
This study represents the largest exome chip meta-analysis on DXA-derived discrete fat depot masses to date. The value of better-defined fat depot regions is illustrated at the CCDC92 locus. The CCDC92 S70C variant shows a clear effect on visceral fat mass, but not on subcutaneous abdominal fat mass, while an opposing effect on lower body fat mass was observed. These findings refine our understanding of the role of the CCDC92 locus on distinct fat compartments compared to what could be derived from genetic analyses on circumference-based measures of fat distribution. Whether the opposing effects on visceral versus leg fat mass are because genetic variation at the CCDC92 locus plays a direct role in both depots, or because the opposite effect on one depot is simply a consequence of a primary effect on the other depot is unclear and will require further investigation. A previous study investigating computerised tomography (CT) scan-derived visceral and subcutaneous fat mass also found associations at the CCDC92 locus with ZNF664 (rs1048497) and DNAH10 (rs1316952) [30] but these SNPs are both in low LD with the index variant in this study (r 2 0.34 and 0.18, respectively) and may represent independent signals. No other previously reported loci associated with CT-derived visceral fat measures [30] were replicated to suggestive significance. The limitation of using an algorithm-based approximation of visceral fat mass, as presented here with the DXA platform should be acknowledged in comparison with the direct measurements provided by CT or MRI methods. However, DXA-derived estimation of visceral fat mass is very strongly correlated with CT-derived measures (r 2 = 0.96) [31] as has associations with T2D and CVD risk profiles comparable with CT and MRI methods [9]. These data and the depth of previous GWAS findings at the CCDC92-ZNF664 locus highlight this as an important region in regulating adipose tissue distribution. In addition, we report two coding variants associated with arm fat mass: a novel low-frequency variant in SPATA20 with a large effect size that seems to only affect arm fat mass and a common variant in UQCC1 that has additional but weaker effects on other subcutaneous fat depots. Overall, when comparing these data to exome chip analyses on standard anthropometric traits, such as WHR [3], we replicate few of the reported loci, likely due to the relatively small sample size of our analyses. However, we have identified a locus, not found for any traditional anthropometric traits, for arm fat mass and refined the tissue-specific association for another locus (CCDC92), highlighting the value of more defined regional fat measures.
Regional fat depots have distinct physiological regulation with an impact on the whole body metabolic homeostasis, with distinct transcriptomes demonstrating functional differences and differences in origin [7]. This study further supports genetic evidence for overall, distinct regulation of regional fat depots.

Population cohorts
Oxford biobank. The Oxford Biobank (OBB) cohort (http://www.oxfordbiobank.org.uk) consists of an age-stratified random sample of men and women (aged 30 to 50 years) of European ancestry resident in Oxfordshire, UK, as described previously [10]. All participants gave written, informed consent to participate, and studies were approved by the Oxfordshire Research Ethics Committee (08/H0606/107+5). A total of 3,281 individuals from the Oxford Biobank had both measures of fat mass with GE Lunar iDXA [9] and Illumina Human Exome Beadchip genotypes after QC checks. An additional 1,325 individuals contained DXA data and Affymetrix UK Biobank Axiom array genotype data from which overlapping ExomeChip data was extracted for the purposes of this study (Table 1).
Fenland. The Fenland study is a longitudinal cohort consisting of 12,435 participants born between 1950 and 1975. Participants were recruited from the general population through GP surgeries in Cambridge, Ely and Wisbech (Cambridgeshire, East Anglia) [32] and underwent detailed metabolic phenotyping and genome-wide genotyping. Ethical approval for the study was was given by the Cambridge Local Ethics committee (ref. 04/Q0108/19) and all participants gave their written consent prior to entering the study.
A total of 1,145 individuals from the Fenland cohort had both measures of fatmass with GE Lunar iDXA [9] and Illumina Human Exome BeadChip genotypes after QC checks (Fenland-ExomeChip, Table 1), a further 997 had Illumina Infinium Core Exome data and 7,363 had Affymetrix UK Biobank Axiom array genotype data from which overlapping ExomeChip data was extracted for the purpose of this study (Table 1).
EPIC-Norfolk. EPIC-Norfolk is one of the two UK-based centres of the European Prospective Investigation into Cancer and nutrition (EPIC) study-an ongoing multi-centre European cohort study set up to study the role of nutrition in cancer risk [33] The EPIC-Norfolk study is a cohort of 25,000 participants, aged 45-74 at baseline, who were recruited through GP surgeries in Norfolk (East Anglia, UK) from the general population between 1993 and 1997 [12]. The study was approved by the Norfolk Research Ethics Committee (ref. 05/Q0101/ 191) and all participants gave their written consent before entering the study. A total of 3,101 individuals had Affymetrix UK Biobank Axiom array genotype data from which overlapping ExomeChip data was extracted for the purposes of this study (Table 1).

DXA-derived depot-specific fat mass measures
For all cohorts depot-specific fat mass was quantified using GE Lunar iDXA (GE Healthcare, Bucks, UK). As previously described [9] these give high precision estimates of body composition. The standard setting of the Encore software (version 14.0; GE Healthcare, Bucks, UK) was used to automatically define regions of interest ensuring that boundaries were consistent between cohorts. The descriptives for the DXA measures used are presented in S1 Table. Visceral fat mass and android subcutaneous fat mass were not measured directly. Visceral fat mass was calculated using an algorithm within the Encore software as described elsewhere [9,31] and the android subcutaneous fat mass was calculated by subtracting the visceral fatmass from total android fat mass. The DXA scanning was calibrated as per manufacturer's instructions. For arm and leg fat masses the combined masses of the left and right limbs were used, as calculated in the software.

Exome-wide genotype analysis
Datasets. Six data sets from three cohorts, Oxford Biobank [10], Fenland [34] and EPIC--Norfolk [12] (Table 1), equalling a total of 17,212 individuals of European ancestry were compiled for this analysis. The Illumina Exome BeadChip v1.0 genotype content was used as the base content. Where other genotype arrays were used (see Table 1) only the content overlapping with the Illumina ExomeChip were selected. The breakdown of descriptives for each of the 6 datasets can be found in S1 Table. Standardized, between cohorts, Exome-Chip protocols and quality control (QC) metrics were employed for genotype calling for each cohort separately using zCall [35] and individuals and loci that failed QC removed before association analysis. Allele calls, chromosome positions and strand were harmonised between cohorts with a common reference files using PLINK and converted to .vcf files in plinkseq. A standard protocol was also shared between cohorts for harmonised phenotype transformations.
single-variant analysis. Principal component analysis was performed within each study on LD-pruned genotype data with MAF>5% obtained using GWAS arrays. Models were adjusted for the first 4 genetic principal components (PCs) in order to adjust for population stratification and relatedness, which is commonly done in exome and genome-wide association studies [36]. All DXA-derived phenotypes were log-transformed, adjusted for the first 4 PCs, age and percentage total fat mass (calculated as the percentage of total fat mass (grams) to total mass (grams)) and the residuals inverse normal transformed in the R statistical environment to remove systematic differences between studies/DXA instruments/operators. Percentage total fat mass adjusted for age and PC1-4 was also included in the analysis to assess collider bias. Individual datasets were analysed separately in sex-combined and sex-specific analyses using RAREMETALWORKER [37] (http://genome.sph.umich.edu/wiki/ RAREMETALWORKER). To account for cryptic relatedness, kinship matrices were first calculated and added into the analysis. Single-variant analysis was performed with, additive, recessive and dominant models.
Meta-analysis. Meta-analysis was carried out centrally using RAREMETAL [38]. Variants were excluded if they had a call rate <90%, Hardy-Weinberg equilibrium p<10 −7 and markers on Y chromosome or mitochondrial genome. Exome-wide significance for the single-variant analysis was set, based on the full ExomeChip content, as p<2×10 −7 . A suggestive significance was set to p<10 −6 .
For this analysis we focussed on non-synonymous variants only, therefore all non-coding variants and synonymous variants were filtered out post meta-analysis together with loci that had a total heterozygous count of less than 10. The exome-wide significant findings are presented in Fig 1 and S1 Table; the additional suggestive significant findings are presented in S3 Table.

Additional informatics
For the three exome-wide significant loci the amino-acid substitutions was assessed for functional significance using the predictSNP online consensus tool [13] (https://loschmidt.chemi. muni.cz/predictsnp1/). This allows for assessment across a number of different tools to generate a consensus assessment. For CCDC92 the S70C missense variant was assessed; for UQCC1 the R51Q was assessed and for SPATA20 three different proteins as products of different splice variants were assessed (K422R, K406R and K362R).
For tissue panels, subcutaneous adipose tissue biopsies were collected by needle biopsy as previously described [39]. For cell-cultured human primary preadipocytes, of both abdominal subcutaneous adipose tissue (ASAT) and gluteal subcutaneous adipose tissue (GSAT) origin, a differentiation time course (n = 6) was performed as described in Todorčević, et al. [40]. All biopsies and cells were homogenized in Tri-reagent (cat. no. T9424, Sigma-Aldrich, UK) and RNA was extracted with a standard Tri-reagent protocol. A total of 500ng RNA was used for cDNA synthesis following standard protocols and random hexamer primers using the cDNA Reverse Transcription Kit (Life Technologies, UK). Real-time PCR reactions were performed on a 1/40 cDNA dilution using Taqman Assays-on-Demand (Applied Biosystems) and Kapa Probe Fast Mastermix (Kapa Biosystems) in triplicate in a 6μl final volume and run on an Applied Biosystems 7900HT machine. Expression was assessed within each panel using a relative qPCR approach [41] and normalised using the previously assessed stably expressed endogenous control genes [39]. For the Lean/Obese Oxford Biobank panel (S1 Fig) the geometric mean of PPIA, PGK1, PSMB6 and IPO8 were used. IPO8 was not used in a paired arm, ASAT and GSAT panel (S2 Fig) as it was not stably expressed between arm and the other depots. PPIA and PGK1 were used as endogenous controls for primary cell culture experiments.
Neither DNAH10 nor GDF5 could be detected above background in whole tissue cDNA panels. GDF5 was however detected in a 14-day in vitro adipocyte differentiation time course.
Data for a panel of 52 paired ASAT and GSAT biopsy samples was used to assess expression between sexes, between ASAT and GSAT fat depots, and between lean and obese individuals. Descriptives for this panel are presented in S1 Fig. As both SPATA20 rs62621401 and UQCC1 rs4911494 were associated with arm fat mass their expression, along with CCDC92 and ZNF664 was assessed in a paired arm, ASAT and GSAT cDNA panel. As there is no published data on arm subcutaneous adipose tissue transcriptomics the additional HOX gene transcripts  HOXA5, HOXB8, HOXC8, HOXC9 and HOXC11 were assessed as these are known to be differentially expressed between ASAT, GSAT and visceral fat (Data comparing arm subcutaneous adipose tissue(ARM), ASAT and GSAT gene expression are presented in S2 Fig).
The setup of a human primary adipocyte differentiation time course is described elsewhere [40]. Relative qPCR was run as above on the adipocyte panel for CCDC92, ZNF664, SPATA20, UQCC1 and GDF5. Data is presented in Fig 2.

Allele-specific qPCR
Both the CCDC92 and the UQCC1 loci are associated with multiple eQTL signals. Whilst we only consider non-synonymous variants in this analysis this does not discount that the coding locus is also under the influence of an eQTL. To assess the available data from resources such as the GTEx portal and to assess any eQTL effect between ASAT and GSAT fat depots we used the combined resources available within the Oxford Biobank.
Allele specific qPCR was run essentially as described in Fogarty et al. [42]. Taqman genotyping assays (Applied Biosystems) were selected to fall within the transcripts under investigation. For CCDC92 the index SNP assay performed poorly so the Proxy SNP rs9863 (ABI assay, C_206415_30) was selected. To assess the nearby gene ZNF664 a SNP in high LD with the CCDC92 index SNP that fell within the ZNF664 transcript, rs1054852 (ABI assay, C_1169873_10), was selected. For SPATA20 the index SNP was used (rs62621401, ABI assay C_25983779_10) as was for UQCC1 (rs4911494, ABI assay, C_25472999_10). As was previously stated neither DNAH10, nor GDF5 could be detected in whole adipose tissue panels. Therefore, allele specific qPCR could not be assessed for these two genes.
From a panel of 200 paired ASAT and GSAT cDNA samples available from the Oxford Biobank, heterozygous individuals were selected. For CCDC92 and ZNF664 28 paired ASAT and GSAT samples were selected, for UQCC1 there were 34 and for SPATA20 there were 5. Genomic DNA (gDNA) for these individuals were also retrieved and diluted to 1.5ng/μl. The gDNA is used as the control comparison to the cDNA samples as there is an equal quantity of both alleles in heterozygous gDNA samples. By comparing the ratio of the Ct values from each allele (the ratio of the genotype assay Vic or Fam fluorophore signals) between cDNA and gDNA any allelic expression differences observed in the cDNA samples can be resolved. This is particularly relevant as technical variation exists with each genotyping assay; particularly pronounced in SPATA20 (Fig 3C).
Data are presented as the percentage of the minor allele Ct value compared to the major allele Ct. This is calculated by first generating a standard curve and regression statistic for each assay. A standard curve is generated from genomic DNA for individuals homozygous for the major allele (BB) and minor allele (bb). Genomic DNAs are diluted to 1.5ng/μl then BB and bb homozygotes are combined to ratios 80:20, 60:40, 50:50,40:60,80:20. Following qPCR analysis using the dual-labelled TaqMan Genotyping assays the ratio of the B to b Ct values are calculated (Ct B minus Ct b) then plotted against the percentage of the minor allele in the dilution series. The linear regression statistic from this standard curve is then used to calculate the percentage minor allele expression of the unknown heterozygous individuals. The standard curves are presented in S3 Fig panels A-D and allele-specific qPCR data for heterozygous individuals are presented in Fig 3. For CCDC92, ZNF664 and UQCC1 there were sufficient cDNAs in the 200 panel, however for SPATA20 there were only 5 individuals. Therefore, to improve the accuracy of the SPATA20 analysis, each sample was run in triplicate 4x across the assay plate and the average of all 4 sets of triplicates calculated. A single outlier in the SPATA20 data was followed up in a second cDNA synthesis and persisted in both ASAT and GSAT samples. No phenotype differences were observed for this individual and no obvious genetic differences were found.

Statistical analysis
Statistical significance was assessed for each experiment in SPSS v24. For estimates of per-allele grams fat mass change, log phenotype data was analysed in a general linear model and adjusted for age, PC1-4 and total %fat mass then estimated marginal means were calculated (S1 Table). . Data are shown as the mean ± SEM DDCt values (normalized to the geometric mean of the endogenous control genes PPIA, PGK1, IPO8 and PSMB6) as described previously [39,41]. A multivariate general linear model was used to test for statistical significance between gender, fat depots and obesity and to assess interactions. Pvalues are presented in the shaded box, NS: non-significant. There were small but significant differences in expression of CCDC92, ZNF664 and UQCC1 between fat depots in lean individuals but this difference was lost and expression was significantly reduced, in obese individuals. This is in keeping with a general quiescent state observed in transcripts associated with adipocyte metabolic activity in obesity. The standard curve and regression statistic used to calculate the percentage minor allele expression with allele-specific qPCR is shown above for CCDC92 (A), ZNF664 (B), SPATA20 (C) and UQCC1 (D). To quantify any allelic expression imbalance for the four genes a standard curve was generated from genomic DNA for individuals homozygous for the Major allele (BB) and Minor allele (bb). Genomic DNAs are diluted to 1.5ng/μl then BB and bb homozygotes were combined to ratios 80:20, 60:40, 50:50,40:60,80:20 to generate a standard curve. Following qPCR analysis using dual labelled TaqMan Genotyping assays the ratio of the B to b allele Ct values are calculated (Ct B minus Ct b) then plotted against the percentage of the minor allele in the dilution series. The linear regression statistic from this (A, B, C and D above) is then used to calculate the percentage minor allele expression of our unknown individuals. For CCDC92 (A), ZNF664 (B) and UQCC1 (D) three different pairs of homozygote individuals were used to generate each standard curve and a Mean ± SEM plotted for each dilution (A, B and D). For SPATA20 only one genomic DNA homozygote minor allele individual was available so an error bar cannot be displayed. As discussed in the main text there was an observed co-regulatory pattern of expression between CCDC92 and ZNF664 across different cDNA panels. To assess any correlation between these two genes within the samples, the allele-specific qPCR paired data points were plotted and regression statistic calculated (Graphs E and F). For both abdominal subcutaneous adipose tissue (ASAT) (E) and gluteal subcutaneous adipose tissue (GSAT) (F) there was a significant correlation, further supporting the co-regulatory pattern of expression. (TIF) S1  Table. Exome-wide significant loci. Detailed data on the three exome-wide significant loci described. DXA parameters are included for all measures and meta-analysis statistics for the additive model. DXA measures are arm fatmass (Arm), Total android fat mass (Android), Subcutaneous android fat mass (Subcut), Visceral android fat mass (Visceral), Gluteal fat mass (Gluteal) and Leg fat mass (Leg). Effect size data for suggestive exome-wide significance (p< = 10 −6 ) is shown in bold. Exome-wide significant data (p<2E -7 ) are in bold and underlined. a The impact of missense variants were assessed using the PREDICTsnp online consensus tool [13] (https://loschmidt.chemi.muni.cz/predictsnp1/). b Approximate fat mass (grams) changes per allele is shown where test reaches suggestive significance and were calculated as marginal means after adjusting for age, PCs1-4 and %fatmass as covariates in a general linear model, implemented in SPSS v24 (DOCX) S3 Table. Exome-wide loci showing suggestive level of statistical significance. Additional non-synonymous loci where statistical tests did not reach exome-wide significance but did reach a suggestive significance cut off of p< = 10-6 are included above. a Where it reaches suggestive significance the model is shown as Additive (add), Recessive(rec) or Dominant (Dom). b The impact of missense variants were assessed using the predictSNP online consensus tool [13] (https://loschmidt.chemi.muni.cz/predictsnp1/). c The cluster of 8 Missense SNPs found at the SPPL2C-MAPT-KANSL1 locus on chromosome 17 are part of a single haplotype that extends across~400kb in this region containing >2300 SNPs (r 2 >0.9), rather than independent signals. (DOCX)