High-Throughput Characterization of Blood Serum Proteomics of IBD Patients with Respect to Aging and Genetic Factors

To date, no large scale, systematic description of the blood serum proteome has been performed in inflammatory bowel disease (IBD) patients. By using microarray technology, a more complete description of the blood proteome of IBD patients is feasible. It may help to achieve a better understanding of the disease. We analyzed blood serum profiles of 1128 proteins in IBD patients of European descent (84 Crohn’s Disease (CD) subjects and 88 Ulcerative Colitis (UC) subjects) as well as 15 healthy control subjects, and linked protein variability to patient age (all cohorts) and genetic components (genotype data generated from CD patients). We discovered new, previously unreported aging-associated proteomic traits (such as serum Albumin level), confirmed previously reported results from different tissues (i.e., upregulation of APOE with aging), and found loss of regulation of MMP7 in CD patients. In carrying out a genome wide genotype-protein association study (proteomic Quantitative Trait Loci, pQTL) within the CD patients, we identified 41 distinct proteomic traits influenced by cis pQTLs (underlying SNPs are referred to as pSNPs). Significant overlaps between pQTLs and cis eQTLs corresponding to the same gene were observed and in some cases the QTL were related to inflammatory disease susceptibility. Importantly, we discovered that serum protein levels of MST1 (Macrophage Stimulating 1) were regulated by SNP rs3197999 (p = 5.96E-10, FDR<5%), an accepted GWAS locus for IBD. Filling the knowledge gap of molecular mechanisms between GWAS hits and disease susceptibility requires systematically dissecting the impact of the locus at the cell, mRNA expression, and protein levels. The technology and analysis tools that are now available for large-scale molecular studies can elucidate how alterations in the proteome driven by genetic polymorphisms cause or provide protection against disease. 
Herein, we demonstrated this directly by integrating proteomic and pQTLs with existing GWAS, mRNA expression, and eQTL datasets to provide insights into the biological processes underlying IBD and pinpoint causal genetic variants along with their downstream molecular consequences.


Introduction
The study of molecular mechanisms is of great importance for understanding the etiology of disease. Genome wide association studies (GWAS) help to identify genetic loci that are likely to contain causal variants for human diseases. Investigation of molecular phenotypes and how they relate to disease susceptibility can help close the gap in understanding between variations in the human genome that associate with disease and the biological processes that lead to disease. The integration of these two lines of research has proven particularly fruitful with the availability of high-throughput technologies (e.g., microarray and RNASeq), which allow for the measurement of the expression of genes comprising the entire transcriptome simultaneously across populations of individuals.
Circulating protein levels are known to be an important readout for diagnosing disease and tracking disease progression. Nevertheless, only recently have researchers begun employing high-throughput screening technologies to measure circulating protein levels in large human populations [1][2][3][4]. In this study, we employed a microarray technology (SOMAscan, Materials and Methods) to assess variations in the levels of 1128 proteins in the blood serum of three cohorts representing different disease conditions: Crohn's Disease (CD, n = 84), Ulcerative Colitis (UC, n = 88) and Normal Controls (NC, n = 15). Descriptive summaries of the study cohorts with respect to age, sex and disease condition are reported in S1 Table. Molecular impact of aging has been extensively studied at the epigenetic [5] and transcriptome level [6]. However, the high throughput proteome aging profile has only been studied in healthy subjects [4]. We attempted to close this gap by computing for the first time the aging profile of UC and CD patients, and comparing them with their normal counterpart. Further, we generated genome-wide genotype data (12.6 million SNPs, assayed and imputed) on the CD patients and systematically characterized the genetic variance component for each of the serum proteomic traits (proteomic quantitative trait loci, pQTL).

A serum proteomics aging signature in IBD patients
We studied the relationship between age and expression levels for 1128 proteins measured in the serum of the 15 NC individuals (all between 39 and 62 years old), 88 UC patients (all between 18 and 77 years old), and 84 CD patients (all between 18 and 64 years old). A normal linear regression was performed for each probe representing each protein, using the log2-transformed probe intensity as the outcome variable. Sex, batch, and time point were included as covariates (Materials and Methods).
At a 10% false discovery rate (FDR), we observed no proteomic traits in NC, 32 in CD (16 positive and 16 negative), and 130 traits in UC (87 positive and 43 negative) associated with age (S2 Table). The lack of a significant aging signature in NC could mainly be attributed to the small sample size and the narrow age range of the subjects. We detected fewer age-associated traits in CD patients than in UC patients (S1 Fig), despite similar sample sizes in the two disease groups. Similar differences were observed for sex-associated traits in CD and UC (S1 Fig). Because CD and UC subjects were assayed on different SOMAscan plates, we were not able to determine whether these differences arose because fewer genes were influenced by age and sex in CD than in UC, because of a batch effect, or both. A more definitive answer would require further investigation with an adequate study design.
The serum proteomic traits most strongly associated with age in UC, CD, and NC are depicted as a heatmap in Fig 1 (p ≤ 1E-4 in at least one cohort), alongside previously reported proteomic results from kidney [7] and skeletal muscle [8]. We observed generally good agreement of results among all three cohorts, despite the limited sample sizes. We further intersected our aging signatures with a proteomics aging signature derived from a study of healthy individuals [4], for which only the top 10 significant results were released (Table 1). The overlap was significant in CD (OR = 6.48, p = 0.011) and UC (OR = 6.29, p = 0.006). In our healthy cohort, only one gene from the previously published proteomics aging signature [4], CHRDL1, was confirmed, and the overlap was not statistically significant (OR = 2.84, p = 0.323). We conducted gene set enrichment analysis (GSEA) on 23 MSigDB curated gene sets related to aging (S3 Table). At a 10% FDR, 2 gene sets were positively enriched in UC: "LEE AGING CEREBELLUM UP" and "DEMAGALHAES AGING UP". Positive enrichment of the same 2 gene sets was also observed in NC, though neither reached statistical significance. No gene set showed significant enrichment in CD at a 10% FDR.
Interestingly, both UC and CD patients displayed a slow but consistent increase in Albumin levels with age (Fig 2). The estimated log2 fold change (log2 FC) per 10-year increase in age was 0.11 (SE = 0.02) in UC and 0.12 (SE = 0.03) in CD. We found that APOE was upregulated in the blood serum of older subjects, in agreement with previous reports on APOE mRNA levels in kidney [7]. APOE is known for its role in arteriosclerosis, Alzheimer's disease, Parkinson's disease and cardiovascular diseases [9]. The positive association (i.e., log2 FC estimates) between serum APOE levels and age was fairly consistent across cohorts with different disease conditions. The increase in APOE concentration was most pronounced in CD and NC subjects, with levels roughly doubling over 40 years (Fig 3).
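To make the per-decade estimates easier to read, a slope of b log2 units per 10 years corresponds to a multiplicative change of 2^(b × years / 10). The short sketch below is an illustration only; the function name is hypothetical, and the slope of 0.25 is not a reported estimate but simply the value implied by a doubling over 40 years.

```python
def fold_change(log2fc_per_decade: float, years: float) -> float:
    """Multiplicative change in protein level over `years`, given a slope
    expressed as log2 fold change per 10 years of age."""
    return 2.0 ** (log2fc_per_decade * years / 10.0)


# Albumin in UC: reported slope of 0.11 log2 units per decade.
albumin_uc_per_decade = fold_change(0.11, 10)

# Illustrative slope (an assumption, not a reported value) that would
# correspond to a doubling of serum levels over 40 years.
doubling_over_40_years = fold_change(0.25, 40)
```

Under this reading, the Albumin slope in UC corresponds to an increase of somewhat less than 10% per decade.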
As previously reported in kidney [7], we observed upregulation of matrix metalloproteinase-7 (MMP7) with aging among the UC patients (log2 FC = 0.09, SE = 0.02, p = 5.31E-5). However, the association between age and MMP7 was absent within the CD patients (log2 FC = -0.01, SE = 0.03, p = 0.866) and non-significant in the healthy controls (log2 FC = 0.15, SE = 0.13, p = 0.2756). A Cochran's Q test of heterogeneity between the estimates obtained from our 3 cohorts was significant (Q = 7.16, p = 0.0278), suggesting that the observed differences may not be attributable to sampling variability alone. MMP7 breaks down the extracellular matrix during embryonic development, reproduction, and tissue remodeling, as well as in disease processes such as arthritis. MMP7 is also known to be involved in inflammation and wound healing. In mouse studies, MMP7 has been shown to regulate the intestinal bacterial microbiome, and is thus an important gene for the immune response and homeostasis in the intestine [10]. Chronic stress on the immune system among CD patients may disrupt the slow increase of MMP7 levels with increasing age.
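For readers unfamiliar with the heterogeneity test, the sketch below shows the standard inverse-variance form of Cochran's Q, which compares each cohort estimate with the pooled estimate. It is a minimal illustration, assuming the usual fixed-effect weights; the exact implementation used in this study is not specified, and the function names are hypothetical. Because the estimates and standard errors quoted in the text are rounded to two decimals, plugging them in gives a somewhat larger Q than the reported 7.16. The reported p-value is consistent with a chi-square distribution with 2 degrees of freedom evaluated at Q = 7.16.

```python
import math


def cochran_q(estimates, std_errors):
    """Cochran's Q for k independent estimates with inverse-variance weights.

    Returns (Q, degrees of freedom)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return q, len(estimates) - 1


def chi2_sf_df2(q):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees
    of freedom, for which the closed form exp(-q / 2) holds."""
    return math.exp(-q / 2.0)


# Rounded MMP7 age slopes and standard errors quoted in the text (UC, CD, NC).
q_rounded, df = cochran_q([0.09, -0.01, 0.15], [0.02, 0.03, 0.13])

# p-value corresponding to the reported statistic Q = 7.16 with 2 df.
p_reported = chi2_sf_df2(7.16)
```

The closed-form tail probability applies only to the three-cohort case (2 degrees of freedom); a general chi-square routine would be needed for other numbers of cohorts.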

Genetics of proteomic traits in serum
We performed proteomic-QTL mapping in 51 Caucasian CD subjects for which genome-wide genotype data were available. Because proteomic profiling was carried out on each CD patient at two time points, baseline and after 22 weeks, a total of 102 samples were used in the analysis. A statistical approach similar to eQTL mapping was employed (Materials and Methods). At a 10% FDR, cis pQTLs for 41 proteomic traits were mapped (Table 2). A full list of pQTLs at a 50% FDR is provided in S4 Table. We explored the concordance between serum pQTLs and eQTLs in various tissues (Table 3). Interestingly, serum pQTLs and whole blood eQTLs did not overlap more than would be expected by chance, whereas liver eQTLs significantly overlapped with serum pQTLs (fold enrichment = 2.33, p = 5.31E-5). Proteins circulating in blood represent peptides from many tissues, with liver, but not blood lymphocytes, representing one of the primary sources of circulating serum proteins. Further, transcriptome profiling in blood is not a close surrogate of serum proteomics (Table 3). Thus, pQTLs carry orthogonal information not captured by mRNA/eQTL data and have the potential to provide unique insights into the molecular etiology of IBD and other diseases.

Serum pQTLs were enriched for GWAS loci of IBD and inflammatory diseases
It is well established that eSNPs are significantly enriched for GWAS SNPs [11,12]. To explore whether pSNPs were also enriched for GWAS human disease SNPs, we inspected the ranks of the pSNPs within published GWAS (Materials and Methods) to test whether pSNPs were enriched for small GWAS p-values (Fig 4). Interestingly, while serum pSNPs were enriched for CD and UC GWAS SNPs [13], they were not enriched for other disease-associated traits or diseases such as Body Mass Index (BMI) [14], Schizophrenia (SCZ) [15], Ischemic Stroke (Stroke) [16] and Type-2 Diabetes (T2D) [17]. This specificity for IBD GWAS may be attributed both to the study cohort (CD patients) and to the tissue's relevance to the disease. The significant enrichment of pSNPs for UC and CD GWAS SNPs highlights the potential utility of pSNPs for elucidating IBD etiology.
Siglec-3, encoded by the CD33 gene, is a transmembrane receptor expressed on cells of myeloid lineage [33], and its serum levels were strongly controlled by a pQTL (p = 4.02E-11, Table 2). CD33 is an established susceptibility locus for Alzheimer's disease [34][35][36][37][38][39], where the risk allele has been found to alter monocyte function and amyloid biology [36]. In this study, we found that the CD33 serum level was influenced by an Alzheimer's disease GWAS SNP, with the risk allele, rs12459419-G, associated with a higher serum CD33 level. This suggests that rs12459419 may influence CD33 transcription, translation, or post-translational control of the CD33 product (Siglec-3), and in turn modify Alzheimer's disease risk.

MST1 as a mediator of CD and UC risk
Our pQTL analysis revealed a chromosome 3 SNP (rs3197999), located within the MST1 (Macrophage Stimulating 1) gene, associated with MST1 protein levels (p = 5.96E-10). This locus is known to be associated with CD and UC susceptibility [13]. Prompted by this finding, we extended our pQTL analysis to fully cover the region chr3:48Mb-51Mb (S6 Table). The pattern of significance of the association between genotype and serum MST1 levels closely matches that of the association with CD and UC risk (Fig 5), a strong indication that MST1 protein levels and IBD share a common causal genomic variant.
The lead CD risk SNP in this region is rs3197999, a nonsynonymous mutation located within exon 18 of MST1. The minor allele 'A' is associated with an increased risk of both CD and UC (p = 6.20E-17 and 1.86E-17, respectively), and with a decrease in MST1 protein levels (p = 5.96E-10). In our CD cohort, the risk allele 'A' has a frequency of 24.51%, which is in line with the observed frequency in the 1000 Genomes Project CEU population (25.76%). A strong association of this SNP has also been reported with MST1 mRNA levels in liver (p = 7.65E-10) and subcutaneous fat (p = 1.20E-7), although interestingly not in blood, again demonstrating that peptides circulating in blood can reflect the activity or abundance of tissues other than blood. GTEx data showed that MST1 expression was 31.7-fold higher in liver compared to the average of all other tissues (S3 Fig).

Discussion
GWAS analysis has identified more than one hundred genome-wide significant loci for IBD [13,40]. Systems biology approaches (e.g., eQTLs and gene networks) have been used to fill the knowledge gap between GWAS SNPs and IBD susceptibility. However, most of these analyses have been applied at the mRNA expression level. Today, the technology and analytic tools are in place for large-scale proteomic analysis in IBD-relevant tissues. In this report, we leverage high-throughput analysis of the serum proteome to provide insight into the molecular etiology of IBD and to reveal possible mechanisms of GWAS SNPs. Novel insights into the biology of disease can be missed when analyses are restricted to the mRNA level or to low-throughput protein assays. Our results argue for the importance of large-scale systems biology studies of the proteome space to reveal the complete picture of molecular-level alteration and disease predisposition in IBD.
We observed a large degree of overlap between the aging signatures from our 3 discovery cohorts (Fig 1) and between our signatures and one previously published from healthy individuals by Menni et al. [4]. This suggests that the circulating blood proteome has a robust aging pattern which is consistent across populations with diverse disease conditions. For example, the concentration of albumin, which constitutes a large fraction of the blood serum protein content, increases slowly but consistently with aging in CD and UC patients. In contrast, we observed a positive association between age and MMP7 (matrix metalloproteinase-7) levels in UC, and a positive but non-significant trend in normal controls, whereas this association was markedly absent in CD patients. MMP7 is known to be involved in inflammation and wound healing. The loss of the age-MMP7 association among CD patients may reflect disease progression; in other words, chronic stress on the immune system among CD patients may disrupt and slow the increase of serum MMP7 levels with aging.

In this study, we employed multiple SOMAscan plates to assay all specimens, with CD, UC and normal control subjects assayed on different plates at different time points. The proteomic profiles showed systematic differences among the plates (S6 and S7 Figs). In the Principal Components Analysis we see separation between the different disease groups. However, it is challenging to distinguish whether the differences observed were due to batch effects or to true biological differences between UC, CD and normal subjects. This design problem prevented us from directly studying the serum proteomic signature of UC and CD. Instead, we investigated age- and sex-associated genes within CD, UC and normal controls.

Table 3. Overlaps between blood serum protein-QTLs and previously published eQTLs from several tissues (10% FDR).
To our knowledge, this study is the first systematic mapping of proteomic QTLs in a cohort of Crohn's Disease patients. At 10% FDR, we found 41 distinct proteins showing evidence of association with a nearby (cis) SNP. Some of these genes and loci were previously discussed in relation to diseases and other molecular QTL studies, such as BST1, a gene known to be implicated in Parkinson's Disease [18][19][20].
Many of our 10% FDR pQTL results were previously reported as eQTLs in various tissues. However, the overlap between our blood serum pQTL results and mRNA eQTLs derived from several large tissue sets (including whole blood) was not higher than expected by chance (Table 3). Interestingly, liver eQTLs showed significant overlap with serum pQTLs. In the present study we screened protein products circulating in the blood serum, as opposed to mRNA extracted from cells in solid and soft tissue biopsies, such as liver, fat and brain sections. In other words, the blood serum proteome captures secretions from multiple, distant tissues and cell types; observations from blood serum are therefore expected to depart from those made in studies focused on the mRNA levels of a single tissue or cell type, and to contain substantial molecular information not otherwise covered by mRNA surveys.
We further systematically surveyed for the presence of eQTLs and/or pQTLs among known IBD risk loci collected from the NHGRI-EBI GWAS catalog [42]. In particular, we examined eQTLs of blood [12], brain (Harvard Brain collection, www.brainbank.mclean.org), liver [41], omental fat [41] and subcutaneous fat [41] tissues. Out of 393 published IBD risk loci, 149 were neither eQTLs nor pQTLs for any of the surveyed tissues, 241 were eQTLs for one or more tissues, and 3 were both eQTLs and pQTLs (all 3 in the MST1 locus). Full results of our survey, SNP by SNP, are reported in S9 Table.

In this paper, we demonstrated the potential of pQTLs as a powerful tool to interpret GWAS findings. Crohn's disease and ulcerative colitis susceptibility has been mapped to a wide locus on 3p21. Possible genes underlying this GWAS locus include BSN (bassoon), MST1 (macrophage stimulating-1), MST1R (MST1 Receptor), etc. [43]. The lead GWAS SNP, rs3197999, is associated with the expression level of many genes in various tissues (e.g., UBA7 and HPEH in blood, CAPN5 and RBM6 in adipose, and MST1 in liver and adipose tissues) [12,41]. The MST1 gene encodes Macrophage Stimulating Protein (MSP), which binds to the MSP receptor (also known as the RON receptor). The rs3197999 SNP results in an Arg689Cys amino acid substitution within the β-chain of MSP (MSPβ) [44]. Therefore, rs3197999 (MSPβ Arg689Cys) can possibly function through at least two mechanisms: (1) affecting the protein structure and function; and (2) regulating the protein levels in vivo.
Evidence of MSPβ Arg689Cys's effect on protein function remains inconsistent to date. Gorlatova et al. showed that the binding affinity of MSPβ Cys689 (GWAS risk allele) to RON is approximately 10-fold lower than that of the wild-type MSPβ (Arg689) [44]. However, in a eukaryotic cell system, the Cys689 allele significantly increased the stimulatory effect of MSP on chemotaxis and proliferation of THP-1 cells, indicating a gain of function associated with the Cys689 allele [45]. In this study, we point to another possible mechanism: the GWAS SNP (rs3197999) may contribute to IBD by regulating the protein level of MSP. As shown in S8 Fig, the risk allele (rs3197999-A, which codes for Cys689) profoundly decreases the serum MSP level (p = 5.96E-10). It is unclear whether lower serum MSP contributes to IBD risk, but it has been reported that MST1R expression was significantly downregulated in another immune disease (i.e., multiple sclerosis) in both mouse and human subjects [43]. We also noticed that the MST1 pQTL peak is almost identical to the IBD GWAS peak in the 3p21 locus in terms of location and shape, despite the pQTL and GWAS studies being carried out in completely independent cohorts (Fig 5). In this study, we measured several additional proteins in the 3p21 locus with the SOMAscan platform (IMDH2, MSP R and MAPKAPK3), but none of them showed pQTLs (S3 Fig). Furthermore, the MST1 eQTL and MSP pQTL are consistent in direction, with the risk allele (rs3197999-A) linked to lower MST1 expression and lower MSP serum level. It is possible that SOMAscan results can be affected by non-synonymous mutations. Although the exact binding site of the MST1 SOMAscan probe is not known, no distinct binding/non-binding (+/-) pattern of the MST1 probe across sample groups was observed in SomaLogic Inc. development/validation samples, indicating that the non-synonymous variant does not, at least, have a profound impact on the probe binding properties.
In parallel, an association of the rs3197999 risk allele with lower MST1 protein concentration in serum was also recently reported in a cohort of 4900 healthy individuals from the Gutenberg Health Study using an ELISA assay [46], which further corroborates our results. Taken together, our data suggest that the IBD association at the 3p21 locus is attributable to the MST1 gene, and that a possible mechanism is that the risk allele reduces MST1 mRNA abundance in relevant tissues as well as MSP protein levels. Lower MSP levels may in turn modify macrophage activity and contribute to IBD risk.

Subjects
Blood serum proteomics profiles were available for 15 normal controls (NC) between 39 and 62 years old. Serum samples were available from the baseline pre-treatment visit of 88 Ulcerative Colitis (UC) patients between 18 and 77 years old who were enrolled in the PURSUIT study [47], as well as from the baseline and 22-week follow-up visits of 84 moderate to severe Crohn's Disease (CD) patients between 18 and 64 years old who were enrolled in the CERTIFI study [48]. All subjects were of self-reported Caucasian ancestry.

Proteomics
Proteins were measured using a SOMAmer-based capture array called "SOMAscan" [2,49] (web site: http://www.somalogic.com/Products-Services/SOMAscan). A total of 1,128 proteins were measured by an approach that uses chemically modified nucleotides to convert a protein signal to a nucleotide signal that is measured as relative fluorescence units using a custom DNA microarray.

Genotyping
Genotyping of CD subjects was performed at the Medical Genetics Institute at Cedars-Sinai Medical Center using Illumina OmniExpress chips (Human610-Quadv1 Chips; Illumina, San Diego, CA, USA). Genotypes were determined based on clustering of the raw intensity data for the two dyes using Illumina BeadStudio software. Six samples run in duplicate yielded >99% concordance. In total, 733,120 SNPs were successfully genotyped. Genotype imputation was performed using the 1000G reference following the MaCH pipeline [50].

Differential protein expression analysis
Differential protein expression analysis was performed with linear regression models, using the log2-transformed protein level as the outcome variable (y) and age plus other covariates as regressors.
Specifically, the following ordinary least squares regression was performed in UC and NC: y~Age + Sex + PlateID. Within the CD cohort, as two separate measures were available from two different time points, a mixed effects model was estimated: y~Age + Sex + PlateID + TimePoint + (1|SubjectID), where '1|SubjectID' represents the random intercept associated with each CD subject. In all cases, significance of the association with Age was quantified with the two-sided Wald test on the 'Age' coefficient.
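As an illustration of the ordinary least squares model used for UC and NC, the sketch below fits y ~ Age + Sex on a small made-up dataset and computes a two-sided Wald p-value for the Age coefficient. It is a minimal sketch under stated assumptions: the data are invented, PlateID is omitted for brevity (it would enter as dummy-coded columns), the helper names are hypothetical, and the Wald statistic is referred to a standard normal distribution, which may differ from the reference distribution used in this study. The mixed effects model used for CD requires a dedicated fitting routine and is not covered here.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_wald(X, y, j):
    """Fit y = X b by OLS; return (b_j, SE_j, two-sided Wald p-value)."""
    n, p = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(p)) for i in range(n)]
    sigma2 = sum(r * r for r in resid) / (n - p)  # residual variance
    unit = [1.0 if a == j else 0.0 for a in range(p)]
    se = math.sqrt(sigma2 * solve(xtx, unit)[j])  # j-th diagonal of (X'X)^-1
    z = beta[j] / se
    # Normal approximation to the Wald statistic (an assumption of this sketch).
    return beta[j], se, math.erfc(abs(z) / math.sqrt(2.0))


# Made-up example: log2 protein level rising by 0.01 per year of age.
ages = [20, 30, 40, 50, 60, 70, 25, 45]
sexes = [0, 1, 0, 1, 0, 1, 1, 0]
noise = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01, 0.0, -0.01]
y = [10 + 0.01 * a + 0.2 * s + e for a, s, e in zip(ages, sexes, noise)]
X = [[1.0, float(a), float(s)] for a, s in zip(ages, sexes)]  # intercept, Age, Sex

beta_age, se_age, p_age = ols_wald(X, y, j=1)
```

With small cohorts such as NC, a t reference distribution would give more conservative p-values than the normal approximation used in this sketch.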
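We estimated the False Discovery Rate using a previously reported empirical permutation approach [51][52][53], and N = 1000 permutation iterations were run. Specifically, FDR was computed for each probe as:

$$\mathrm{FDR}(t) = \frac{\operatorname{avg}\left(\#\{\text{permuted p-values} \le t\}\right)}{\#\{\text{observed p-values} \le t\}}$$

where $t$ is the observed p-value of the probe and the average is taken over the permutation iterations.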
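The permutation-based FDR estimate can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this study: the function name is hypothetical, the comparison is taken to be "less than or equal to", the estimate is capped at 1, and a threshold with no observed discoveries is reported as 0 by convention.

```python
def empirical_fdr(observed, permuted, t):
    """Empirical FDR at p-value threshold t.

    observed: list of p-values from the real data
    permuted: list of lists, one list of p-values per permutation iteration
    """
    n_observed = sum(p <= t for p in observed)
    if n_observed == 0:
        return 0.0  # no discoveries at this threshold (convention assumed here)
    # Average number of null p-values passing the threshold per permutation.
    avg_permuted = sum(sum(p <= t for p in run) for run in permuted) / len(permuted)
    return min(1.0, avg_permuted / n_observed)  # cap at 1 (assumption)


# Made-up p-values: 5 probes, 2 permutation iterations.
observed = [0.001, 0.002, 0.2, 0.5, 0.9]
permuted = [[0.3, 0.6, 0.0015, 0.8, 0.4],
            [0.7, 0.1, 0.5, 0.9, 0.25]]
fdr_at_probe = empirical_fdr(observed, permuted, t=0.002)
```

In practice the function would be evaluated once per probe, with t set to that probe's observed p-value and 1000 permutation iterations rather than the 2 used in this toy example.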

Gene set enrichment analysis
Gene Set Enrichment Analysis of differential expression results was performed using the GSEA software from the Broad Institute, v2.2.0, and the MSigDB c2 (curated gene signatures) gene sets database, gene symbols, v5.0 (http://software.broadinstitute.org/gsea). Results from each cohort were analyzed separately, using the 'preranked gene list' method. False Discovery Rate was evaluated by running 1000 permutations.

Proteomic-QTL mapping
We performed proteomic-QTL mapping on 51 Caucasian CD subjects with available imputed genotype data. A total of 102 samples were available for the analysis (all subjects had 2 proteomics assays available, at baseline and at the 22-week follow-up). A random effects linear regression model was adopted to map cis protein-QTLs (pQTLs): y~EffectiveAlleleCopyNumber + Age + Sex + TimePoint + (1|SubjectID), where 'y' is the inverse-normal transformed protein expression level, 'EffectiveAlleleCopyNumber' is the imputed allele copy number for a specific SNP, and '1|SubjectID' represents the random intercept associated with each CD subject. Significance of the genotype effect was quantified with a two-sided Wald test on the Maximum Likelihood estimator of its coefficient. The distribution of the Wald test p-value across all cis effects under the null hypothesis of no correlation between genotype and protein expression was estimated by re-running the analysis on a null dataset obtained by permuting the genotype subject identifiers. A self-contained, re-usable R script was written to fit the random effects models using the 'lme4' R package. The full code is available at github.com/antoniofabio/eqtl-ranef. FDR was quantified by comparing the observed distribution of the test statistic with that estimated from the permuted data, as previously described [51][52][53].
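The inverse-normal transformation applied to the protein levels can be illustrated with a rank-based sketch. The exact variant used in this study is not stated; the version below assumes average ranks for ties and the offset c = 0.5, so that a value with rank r among n maps to the standard normal quantile of (r - 0.5) / n. The function name is hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=0.5):
    """Rank-based inverse-normal transform with average ranks for ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Made-up protein levels for three samples.
z = inverse_normal_transform([5.0, 1.0, 3.0])
```

The transform depends only on the ranks of the measurements, which makes the downstream regression less sensitive to outlying protein levels.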
Additional regressions were run for probe SL005202 (gene symbol: MST1) against all SNPs in chromosome 3, between 49 and 51 mega-bases (hg19), conditioning first on the peak pSNP rs9836291 (chr3:49697459) and then on the IBD risk SNP rs3197999 (chr3:49721532), in addition to the covariates already used for the main model.

Enrichment for GWAS signals in lists of SNPs
Enrichment for GWAS signals in proteomic-QTL hits was assessed as follows. First, full GWAS results (variant positions and p-values) were retrieved from their original publications: Crohn's Disease and Ulcerative Colitis (CD and UC, [13]), Body Mass Index (BMI, [14]), Schizophrenia (SCZ, [15]), Ischemic Stroke (Stroke, [16]), and Type-2 Diabetes (T2D, [17]). The full GWAS tables were then reduced to the subset of SNPs covered by our pQTL study. Within each reduced table, the relative rank of the p-value of each SNP was computed (e.g., in a table of 1E5 SNPs, the smallest p-value has relative rank 1E-5, the second smallest has relative rank 2E-5, etc.). Finally, we plotted the relative ranks of our protein-SNPs within each table, and compared them with a uniform distribution using a rank-rank plot.
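The relative-rank computation can be sketched as follows. This is a minimal illustration with made-up SNP identifiers and p-values; the function names are hypothetical, ties are broken by input order, and the uniform reference quantiles k/m are an assumption about how the rank-rank plot is laid out.

```python
def relative_ranks(gwas_pvalues):
    """Map each SNP to rank / N, where rank 1 is the smallest GWAS p-value."""
    n = len(gwas_pvalues)
    ordered = sorted(gwas_pvalues, key=gwas_pvalues.get)
    return {snp: (i + 1) / n for i, snp in enumerate(ordered)}


def rank_rank_points(gwas_pvalues, psnps):
    """Pairs (expected uniform quantile, observed relative rank) for pSNPs."""
    rel = relative_ranks(gwas_pvalues)
    observed = sorted(rel[s] for s in psnps if s in rel)
    m = len(observed)
    expected = [(k + 1) / m for k in range(m)]
    return list(zip(expected, observed))


# Made-up reduced GWAS table (SNP id -> p-value) and pSNP list.
gwas = {"snpA": 0.5, "snpB": 1e-8, "snpC": 0.03, "snpD": 0.2}
points = rank_rank_points(gwas, ["snpB", "snpC"])
```

Points falling below the diagonal of the resulting plot indicate that the pSNPs have smaller GWAS p-values than expected under a uniform distribution of ranks.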

Ethics statement
The current study is approved by the Icahn School of Medicine at Mount Sinai IRB with the approval number HSM11-01669. The study is also listed at ClinicalTrials.gov with reference number NCT00771667, and the protocol was approved by the institutional review board at each study center. All the participants received written consent forms.

Table. MST1 proteomic-QTL results in the region chr3:49Mb-51Mb. Variants are annotated with MST1 association statistics, CD and UC risk statistics, rsIDs, gene and function (from annovar). (XLSX)
S7 Table. MST1 proteomic-QTL results in the region chr3:49Mb-51Mb, alternatively conditioning on the peak pSNP rs9836291 (chr3:49697459) and on the IBD risk SNP rs3197999 (chr3:49721532). (XLSX)
S8 Table. Distribution of baseline blood samples across microarray plates, by cohort and sex. (XLSX)
S9 Table. Known IBD risk loci and 10% FDR mRNA expression-QTLs (eQTLs) and protein-QTLs (pQTLs) from different tissues. IBD risk loci were obtained from the NHGRI-EBI GWAS catalog (version 1.0.1 e84, 2016-06-12) and lifted to the hg19 genome build. For each locus, we surveyed 10% FDR cis or trans eQTL and pQTL studies from several tissues. Brain eQTLs (Prefrontal Cortex, Visual Cortex and Cerebellum) were obtained from the Harvard Brain collection (www.brainbank.mclean.org); Blood eQTLs from [12]; Liver, Omental fat and Subcutaneous fat from [41]; Blood serum pQTLs from the present study. (XLSX)
S10 Table. Protein expression summary statistics. Expression measured as log2-probe intensity. (XLSX)
S11 Table. Allele frequencies of all pSNPs with FDR ≤ 0.5.

Author Contributions
Conceptualization: AFDN SET CB JC EES AK RD KH.