Modeling of Environmental Effects in Genome-Wide Association Studies Identifies SLC2A2 and HP as Novel Loci Influencing Serum Cholesterol Levels

Genome-wide association studies (GWAS) have identified 38 larger genetic regions affecting classical blood lipid levels without adjusting for important environmental influences. We modeled diet and physical activity in a GWAS in order to identify novel loci affecting total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels. The Swedish (SE) EUROSPAN cohort (N SE = 656) was screened for candidate genes and the non-Swedish (NS) EUROSPAN cohorts (N NS = 3,282) were used for replication. In total, 3 SNPs were associated in the Swedish sample and were replicated in the non-Swedish cohorts. While SNP rs1532624 was a replication of the previously published association between CETP and HDL cholesterol, the other two were novel findings. For the latter SNPs, the p-value for association was substantially improved by inclusion of environmental covariates: SNP rs5400 (p SE,unadjusted = 3.6×10−5, p SE,adjusted = 2.2×10−6, p NS,unadjusted = 0.047) in the SLC2A2 (Glucose transporter type 2) and rs2000999 (p SE,unadjusted = 1.1×10−3, p SE,adjusted = 3.8×10−4, p NS,unadjusted = 0.035) in the HP gene (Haptoglobin-related protein precursor). Both showed evidence of association with total cholesterol. These results demonstrate that inclusion of important environmental factors in the analysis model can reveal new genetic susceptibility loci.


Introduction
Genome-wide association studies (GWAS) have identified more than 38 larger genetic regions which influence blood levels of total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C) and triglycerides (TG) [1][2][3]. These studies modeled basic anthropometric confounders, such as sex and age, while leaving out important environmental influences, such as diet and activity. This strategy is statistically suboptimal since unexplained variation in the phenotype inflates the error variance and, as a result, requires larger sample sizes to detect a significant effect. Manolio [4] argued strongly for modeling of environmental covariates in GWAS and recommended lipid levels as a paradigmatic phenotype for studying the genetic and environmental architecture of quantitative traits.
In order to explore the usefulness of including both environmental and genetic factors in the analysis model, we used lipid measurements from the EUROSPAN study, comprising 3,938 individuals for whom genome-wide SNP data (N SNP = 311,388) were available [5]. We measured daily intake of food and physical activity at work and at leisure and modeled the influence of those environmental covariates on serum lipid levels in a GWAS. First, data from the Northern Sweden Population Health Study (NSPHS) were used as a discovery cohort to screen for SNPs that displayed the lowest p-values when the model was adjusted for environmental covariates. We then used the other, non-Swedish EUROSPAN cohorts for replication of our strongest associations in a candidate gene association study (CGAS).
We chose a population living in northern Sweden for the selection of candidate loci because it shows strong natural heterogeneity in certain lifestyle factors (e.g. diet, activity), but homogeneity in other environmental aspects such as climate [6]. Whereas one group lives a modern, sedentary lifestyle also found in the southern part of Sweden and other western European countries, a subgroup of Swedes follows a traditional, semi-nomadic way of life based on reindeer herding. Reindeer herders typically show higher intake of game meat (reindeer, moose), which has a high protein and low fat content, and lower intake of non-game meat, fish, and dairy products, among other, lesser differences. They are also more physically active at work, tending their reindeer herds, but less active at leisure [7].

Exploratory GWAS in NSPHS
We performed a GWAS with a lifestyle-adjusted model which included not only sex and age, but also daily intake of game meat, non-game meat, fish, and milk products as well as physical activity at work and at leisure as covariates. We focused on the 0.05% of all SNPs with the lowest p-values in the diet- and activity-adjusted model (corresponding to about 150 SNPs per lipid). For total cholesterol, 88 of these were located in a gene and 14 in genes that have been associated with energy metabolism (http://www.ncbi.nlm.nih.gov/omim/). For LDL-C, 65 SNPs were located in a gene, of which 8 were functionally relevant. Several of the SNPs for LDL-C were identical with those affecting total cholesterol, as expected from the high correlation (r = 0.91) between both phenotypes. For HDL-C, SNP rs2292883, located in the MLPH gene (Melanophilin), showed a genome-wide significant p-value (p = 1.06×10−7). In addition, 69 SNPs for HDL-C were located in a gene and 14 of those genes were reported as having a metabolic effect. Finally, for triglycerides, 63 SNPs were located in a gene, but only 4 SNPs in genes with a functional annotation of interest (Table 1 and Table S1A, S1B, S1C, S1D).

P-value changes
In order to evaluate the effect of including diet and activity covariates in the association analysis, we overlaid the p-values in the Manhattan plots from the NSPHS for the unadjusted and adjusted GWAS models (Figure 1, Figure 2, Figure 3, Figure 4). More refined GWAS results separating the effect of adjusting for either diet or physical activity are presented in Figure S1A, S1B, S1C, S1D; and Figure S2A, S2B, S2C, S2D. As expected, the p-values for a number of SNPs were sensitive to the inclusion of both diet and activity covariates in the model. We matched the 0.05% of SNPs with the lowest p-values (top SNP list) between the unadjusted and the adjusted model. For TC, 83 (53%) SNPs were found in both top SNP lists. Those lists contained 102 (64%) identical SNPs for LDL-C and 103 (65%) for HDL-C. The analyses resulted in the same 74 (47%) top SNPs for TG levels (Table S1A, S1B, S1C, S1D). Finally, we compared the p-values of the resulting 39 candidate SNPs that are located in genes with a metabolic effect between the diet- and activity-adjusted (full) model and the unadjusted (restricted) model, which showed an up to 27-fold p-value decrease (Table 1).
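The top-list comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function names `top_fraction` and `overlap` and the toy p-values are hypothetical, and only the 0.05% cut-off and the number of SNPs (311,388) are taken from the text.

```python
import math


def top_fraction(pvalues, fraction=0.0005):
    """Return the SNP ids with the lowest p-values (top `fraction` of all SNPs)."""
    n_top = max(1, math.floor(len(pvalues) * fraction))
    ranked = sorted(pvalues, key=pvalues.get)  # SNP ids in ascending p-value order
    return set(ranked[:n_top])


def overlap(unadjusted, adjusted, fraction=0.0005):
    """Count SNPs shared by the top lists of two models and their share of the list."""
    top_u = top_fraction(unadjusted, fraction)
    top_a = top_fraction(adjusted, fraction)
    shared = top_u & top_a
    return len(shared), len(shared) / len(top_a)


# Size of a 0.05% top list for the 311,388 SNPs common to both arrays.
n_top_snps = math.floor(311_388 * 0.0005)

# Toy example (hypothetical p-values; a 50% cut-off keeps two SNPs per list).
toy_unadj = {"snp1": 1e-5, "snp2": 3e-4, "snp3": 0.2, "snp4": 0.7}
toy_adj = {"snp1": 2e-6, "snp2": 0.4, "snp3": 1e-3, "snp4": 0.9}
n_shared, share = overlap(toy_unadj, toy_adj, fraction=0.5)
```

With the real data, the two dictionaries would hold one p-value per SNP from the unadjusted and the adjusted model for a given lipid trait.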

Confirmatory CGAS in EUROSPAN
A food- and activity-adjusted candidate gene association study of the final 39 candidate SNPs was performed in the Scottish (SC) sample (N = 714) using similar lifestyle covariates (Table 2; Table S1E, S1F, S1G, S1H; Table S2). We replicated the effect of rs2000999 (p SC,unadj = 6.16×10−3, p SC,adj = 4.33×10−3) in the HP gene (Haptoglobin-related protein precursor) on TC level and the effect of rs1532624 (p SC,unadj = 2.40×10−9, p SC,adj = 1.96×10−9) in CETP (Cholesteryl ester transfer protein) on HDL-C. In the Swedish cohort (SE), the unadjusted genetic effect of rs2000999 in the HP gene is equivalent to a moderately large difference in average TC level of 20.21 mg/dl between the homozygous genotypes (Mean-
We also performed an unadjusted candidate gene analysis of the 39 candidate SNPs in all non-Swedish (NS) EUROSPAN cohorts (Scotland, Croatia, The Netherlands, and Italy, N NS = 3,282) and aggregated the results in a meta-analysis (Table 2; Table S1I, S1J, S1K, S1L). We confirmed the effect of rs5400 (p NS = 4.68×10−2) in SLC2A2 on TC. We again found that rs2000999 (p NS,unadj = 3.54×10−2) in HP influences TC levels and rs1532624 (p NS,unadj = 2.87×10−20) in CETP (Cholesteryl ester transfer protein) affects HDL-C levels. The unadjusted genetic effect of rs5400 is equivalent to a moderately large difference in mean TC level of 27.11 mg/dl between homozygous genotypes (M SE,unadj (TC|A/A) − M SE,unadj (TC|G/G) = 249.30 mg/

Author Summary
In this article we report a genome-wide association study on cholesterol levels in the human blood. We used a Swedish cohort to select genetic polymorphisms that showed the strongest association with cholesterol levels adjusted for diet and physical activity. We replicated several genetic loci in other European cohorts. This approach extends present genome-wide association studies on lipid levels, which did not take these lifestyle factors into account, to improve statistical results and discover novel genes. In our analysis, we could identify two genetic loci in the SLC2A2 (Glucose transporter type 2) and the HP (Haptoglobin-related protein precursor) gene whose effects on total cholesterol have not been reported yet. The results show that inclusion of important environmental factors in the analysis model can reveal new insights into genetic determinants of clinical parameters relevant for metabolic and cardiovascular disease.

No other associations, including LDL cholesterol or triglyceride levels, were replicated (all p > 0.05). The genome-wide significant SNP rs2292883 in the Melanophilin (MLPH) gene found in the Swedish cohort was not confirmed.

Discussion
Environmental covariates may act as moderators, mediators or even suppressors, thereby affecting the discovery of genetic susceptibility loci [8,9]. Therefore, we conducted a GWAS, modeling genetic and important environmental effects, such as food intake and physical activity, on serum levels of classical lipids. To our knowledge, this is the first GWAS on blood lipid levels modeling environmental factors, in particular major food categories and physical activity, in international cohorts. Our analysis replicated one known locus in the CETP gene [1] and identified two other loci, in the SLC2A2 and HP genes, respectively, which are involved in energy metabolism but were not previously reported to be associated with cholesterol levels.
SLC2A2 encodes the facilitated glucose transporter member 2 (GLUT-2, Solute carrier family 2) and is predominantly expressed in the liver. Mice deficient in GLUT-2 are hyperglycemic and have elevated plasma levels of glucagon and free fatty acids [10].
Mutations in GLUT-2 cause the Fanconi-Bickel syndrome (FBS), characterized by hypercholesterolemia and hyperlipidemia [11,12]. Cerf [13] argued that a high-fat diet causes a decreased expression of the GLUT-2 glucose receptor on β-cell islets. As a result, glucose stimulation of insulin exocytosis is impaired, causing hyperglycemia, a clinical hallmark of type 2 diabetes. In addition, Kilpeläinen et al. [14] found that physical activity moderates the genetic effect of SLC2A2 on type 2 diabetes. These studies suggest that these lifestyle factors could have masked genetic effects in previous, unadjusted GWAS. This is emphasized by the strong increase in statistical significance of the SLC2A2 polymorphisms after adjusting for diet and physical activity, indicating that the examined lifestyle factors modified the effect of this gene. Our supplemental results show that physical activity markedly moderated the genetic effect on total cholesterol.
The HP gene encodes the Haptoglobin-related Protein Precursor (Hp), which binds hemoglobin (Hb) to form a stable Hp-Hb complex and thereby prevents Hb-induced oxidative tissue damage. Asleh et al. [15] identified severe impairment in the ability of Hp to prevent oxidation caused by glycosylated Hb. Diabetes is also associated with an increase in the non-enzymatic glycosylation of serum proteins, so these authors suggested that there is a specific interaction between diabetes, cardiovascular disease and the Hp genotype. This interaction results from the increased need to rapidly clear glycosylated Hb-Hp complexes from the subendothelial space before they oxidatively modify low-density lipoprotein to form the atherogenic oxidized low-density lipoprotein. The p-value for association between the HP SNP rs2000999 and total serum cholesterol concentration decreased in the model adjusted for diet and physical activity, suggesting that the genetic effect is moderated by diet and physical activity. Our supporting material points to the moderating role of physical activity in particular.
We also observed a highly significant association between rs1532624 in CETP and HDL-C levels. The CETP protein catalyzes the transfer of insoluble cholesteryl esters among lipoprotein particles. Variation in CETP is known to affect the susceptibility to atherosclerosis and other cardiovascular diseases [16]. Adjustment for diet and physical activity in our model caused an increase of the p-value of this SNP. Our supporting results indicate that the genetic effect is mediated by diet or by physical activity in a similar way.
This study also has some limitations. First, we are aware that our candidate gene association approach covers only a very small fraction of all genomic loci, which is one of the potential reasons why some classical lipid-influencing genes, such as APOE, are not represented in our candidate SNP list. Therefore, our approach is not comprehensive and may have failed to identify other relevant lifestyle-sensitive genetic variants. Nonetheless, we decided to apply this approach to make the best out of the available lifestyle data. Second, our study provides only limited information on the role of individual lifestyle factors for a genetic variant. However, in this study we aimed at amplifying genetic effects by adjusting for a maximum amount of environmental variance in a single model and, therefore, we neglected some of these aspects here. Third, we did not model genetic covariates in known lipid-relevant genes which may also moderate the effect of other genetic predictors. This is due to the focus of this paper on gene-environment relationships.
In summary, we have demonstrated that modeling environmental factors, in particular major food categories and physical activity, can improve statistical power and lead to the discovery of novel susceptibility loci. Such models also provide an understanding of the complex interplay of genetic and environmental factors affecting human quantitative traits. Inclusion of environmental covariates represents a much needed next step in the quest to model the complete environmental and genetic architecture of complex traits.

Ethics statement
All EUROSPAN studies were approved by the appropriate research ethics committees according to the Declaration of Helsinki [17].

Participants
The examined subjects stem from five different population-representative, pedigree-based cohorts from the EUROSPAN consortium (http://www.eurospan.org). All studies include a comprehensive collection of data on family structure, lifestyle, blood samples for clinical chemistry, RNA and DNA analyses, medical history, and current health status. All participants gave their written informed consent [18]. A brief description of each population is given below.
The Northern Swedish Population Health Study (NSPHS) represents a cross-sectional study conducted in the community of Karesuando in the subarctic region of the County of Norrbotten, Sweden, in 2006 [5]. This parish has about 1500 eligible inhabitants of whom 740 participated in the study. The final sample consisted of 309 men and 347 women who were aged between 14 and 91 years. The inclusion of diet and activity covariates in the analytical model, and the corresponding missing values, reduced the effective sample size by less than 5%.
The Orkney Complex Disease Study (ORCADES) is a longitudinal study in the isolated Scottish archipelago of Orkney [19]. Participants from a subgroup of ten islands (N = 719) were used for the presented analysis. The sample comprised 334 men and 385 women aged between 18 and 100 years. The inclusion of diet and activity covariates in the analytical model, and the corresponding missing values, reduced the effective sample size by less than 5%.
The VIS study is a cross-sectional study in the villages of Vis and Komiza on the Dalmatian island of Vis, Croatia, and was conducted between 2003 and 2004 [20][21][22]. A total of 795 participants who had both genotype and phenotypic data available were analysed. This cohort included 328 men and 467 women aged between 18 and 93 years.
The Microisolates in South Tyrol Study (MICROS) is a cross-sectional study carried out in the villages of Stelvio, Vallelunga, and Martello, Venosta valley, South Tyrol, Italy, from 2001 to 2003 [23]. The 1,097 participants (475 males, 622 females, age between 18 and 88 years) presented in this study are those for whom both relevant genotype and phenotype data were available.
The Erasmus Rucphen Family Study (ERF) is a longitudinal study on a population living in the Rucphen region, the Netherlands, in the 19th century [24]. Fasting total cholesterol, HDL cholesterol and triglyceride levels were available. LDL cholesterol was estimated using the Friedewald formula [25]. The 918 individuals included in this study consisted of the first series of participants with 354 men and 564 women aged between 18 and 92 years.
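For readers unfamiliar with the Friedewald estimate mentioned above, the following short Python sketch shows the standard form of the formula for values in mg/dl (LDL-C = TC − HDL-C − TG/5). The function name and the example values are hypothetical, and the 400 mg/dl triglyceride limit is the conventional restriction on the formula; whether it was applied in the ERF cohort is not stated in the text.

```python
def friedewald_ldl(tc, hdl, tg):
    """Estimate LDL cholesterol (mg/dl) as TC - HDL - TG/5.

    By convention the estimate is not used when TG is 400 mg/dl or higher.
    """
    if tg >= 400:
        raise ValueError("Friedewald estimate is not recommended for TG >= 400 mg/dl")
    return tc - hdl - tg / 5.0


# Hypothetical fasting values in mg/dl: 200 - 50 - 150/5
ldl = friedewald_ldl(tc=200.0, hdl=50.0, tg=150.0)
```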

Genotyping
DNA samples were genotyped according to the manufacturer's instructions on Illumina Infinium HumanHap300v2 or HumanCNV370v1 SNP bead microarrays. Both arrays have 311,388 SNP markers in common that are distributed across the human genome. Analysis of the raw data was done in the BeadStudio software with the recommended parameters for the Infinium assay and using the genotype cluster files provided by Illumina. Individuals with a call rate below 95% and SNPs with a call rate below 98%, deviating from Hardy-Weinberg equilibrium (p HWE < 1×10−6) or with a minor allele frequency of less than 1% were excluded from the analysis.
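The SNP-level exclusion criteria above can be illustrated with a small Python sketch that works on genotype counts. It is a simplified illustration, not the software used in the study: the function names and example counts are hypothetical, and the Hardy-Weinberg check uses a 1-df chi-square test as an assumption, since the text does not state which test was applied.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.98, min_maf=0.01, hwe_alpha=1e-6):
    """Apply the three SNP-level filters named in the text to genotype counts."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha


# Hypothetical genotype counts.
ok_snp = snp_passes_qc(n_aa=360, n_ab=240, n_bb=40, n_missing=5)    # in HWE, common
rare_snp = snp_passes_qc(n_aa=640, n_ab=5, n_bb=0, n_missing=0)     # MAF below 1%
no_het_snp = snp_passes_qc(n_aa=300, n_ab=0, n_bb=300, n_missing=0)  # strong HWE deviation
```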

Lipids
Total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) were quantified by enzymatic photometric assays using an ADVIA 1650 clinical chemistry analyzer (Siemens Healthcare Diagnostics GmbH, Eschborn, Germany) at the Institute for Clinical Chemistry and Laboratory Medicine, Regensburg University Medical Center, Germany.

Diet
In the NSPHS cohort, we collected data with a food frequency questionnaire based on the Northern Sweden 84-item Food Frequency Questionnaire (NoS-84-FFQ) [26]. We included in the questionnaire several items on foods specific for the lifestyle in this geographic region, in particular on game consumption (reindeer, moose). The answer options consisted of an 11-point format: 0 = "Never", 1 = "less than 1 time per month", 2 = "1 to 3 times per month", 3 = "1 time per week", 4 = "2 to 4 times per week", 5 = "5 to 6 times per week", 6 = "1 time per day", 7 = "2 to 3 times per day", 8 = "4 to 5 times per day", 9 = "6 to 8 times per day", 10 = "9 to 10 times per day". The questionnaire was administered in electronic format by a trained study nurse as an interviewer. For each food item we calculated daily intake in grams per day as a standardized unit of measurement and aggregated the items to food categories, such as game meat, non-game meat, fish, and dairy products. We evaluated the construct validity (known-groups validity) of the added items on game consumption in the NoS-84-FFQ questionnaire. We compared reindeer herders (N = 94) versus non-reindeer herders (N = 505). We observed highly significant, large effect sizes in men (ES = 1.25, p = 9.7×10−4) and women (ES = 1.15, p = 2.9×10−5) in the expected direction, corresponding to an approximately three times higher absolute overall game intake in reindeer herders compared to others. A similar approach was used for the measurement and analysis of dietary data collected with a food frequency questionnaire in the Scottish cohort (Table S2).
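The conversion from answer codes to grams per day can be sketched as follows. This is only an illustration of the general idea: the mapping of each answer range to a single frequency (range midpoints, and 0.5 times per month for "less than 1 time per month"), the 30-day month, the item names, and the portion sizes are all assumptions made here and are not given in the text.

```python
# Assumed frequency (times per day) for each answer code 0..10.
# Ranges are replaced by their midpoints; "less than 1 time per month"
# is set to 0.5 times per month. These choices are illustrative assumptions.
DAYS_PER_MONTH = 30.0
DAYS_PER_WEEK = 7.0
TIMES_PER_DAY = {
    0: 0.0,
    1: 0.5 / DAYS_PER_MONTH,
    2: 2.0 / DAYS_PER_MONTH,
    3: 1.0 / DAYS_PER_WEEK,
    4: 3.0 / DAYS_PER_WEEK,
    5: 5.5 / DAYS_PER_WEEK,
    6: 1.0,
    7: 2.5,
    8: 4.5,
    9: 7.0,
    10: 9.5,
}


def daily_intake_g(answer_code, portion_g):
    """Daily intake in grams for one food item."""
    return TIMES_PER_DAY[answer_code] * portion_g


def category_intake_g(answers, portions):
    """Sum daily intake over all items of one food category."""
    return sum(daily_intake_g(answers[item], portions[item]) for item in answers)


# Hypothetical items and portion sizes for the game meat category.
game_answers = {"reindeer": 4, "moose": 3}
game_portions = {"reindeer": 140.0, "moose": 140.0}
game_g_per_day = category_intake_g(game_answers, game_portions)
```

For the hypothetical respondent above, the two items add up to roughly 80 g of game meat per day (3/7 and 1/7 of a 140 g portion).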

Physical activity
In the NSPHS cohort, we used two self-report scales to measure overall physical activity at work and at leisure. The Work Activity Scale (WAS, 6 items) addresses typical occupational physical activities: sitting, standing, walking, lifting, and general indicators of physical activity, i.e. sweating and tiredness after work. The Leisure Activity Scale (LAS, 4 items) asks about various typical free-time activities such as walking, cycling, other sporting activities, and sweating as a general indicator of physical activity. Participants reported the frequency of each activity on a 5-point rating scale (1 = "never", 2 = "seldom", 3 = "sometimes", 4 = "often", and 5 = "always"). Both scales showed satisfactory internal consistency with Cronbach's α(WAS) = 0.73 and Cronbach's α(LAS) = 0.70. A similar approach was used for the measurement and analysis of data on physical activity collected with a self-report questionnaire in the Scottish cohort (Table S2).
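The internal-consistency measure reported above, Cronbach's α, is computed from the item variances and the variance of the sum score: α = k/(k − 1) · (1 − Σ var(item) / var(total)). The following Python sketch shows the calculation on made-up ratings; the function name and the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per item)."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var / variance(totals))


# Hypothetical ratings (1-5) of five respondents on a three-item scale.
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(toy_items)
```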

Statistical analysis
Model selection. Sex and age were chosen as standard moderators of medical outcomes. Food and physical activity covariates were selected based on findings on natural variation in lifestyle factors in this (data not presented) and other [7] northern Swedish populations between a modern, sedentary and a traditional, semi-nomadic lifestyle based on reindeer herding. Mostly significant associations between diet and activity covariates and lipid levels were found in the examined Swedish EUROSPAN cohort in the following ranges:  S3). We finally selected sex, age, game meat, non-game meat, fish, dairy products, physical activity at work, and physical activity at leisure as covariates in our diet- and activity-adjusted model ("adjusted" model) in the Swedish EUROSPAN sample. Sex and age were used as covariates in the "unadjusted" model. We tested whether the inclusion of those covariates in the explanatory model led to a statistically significant improvement of the goodness of model fit compared to a restricted model by applying a maximum likelihood ratio (MLR) test. We inferred a significantly better model fit of the full model if the difference of the χ² value between both models had an equal or lower probability than p = 0.05 (one-sided, upper tail) on a χ² distribution with k degrees of freedom. The degrees of freedom k are equal to the difference of the number of parameters in each model. The difference of χ² values between both models is calculated according to the following formula, with MLE indicating the maximum likelihood estimates per model: $\chi^2_{\text{rest}-\text{full}} = -2\,\bigl(\log_{10}(\mathrm{MLE}_{\text{rest}}) - \log_{10}(\mathrm{MLE}_{\text{full}})\bigr)$. The comparison of the goodness of fit between the unadjusted and the diet- and activity-adjusted full model, using an MLR test, showed a statistically significant improvement for all four lipid traits (TC: The confounding effect of treatment with statins on total cholesterol level and LDL cholesterol level was adjusted for by imputing untreated lipid concentrations of medicated individuals using the npsubtreated() function of the R/GenABEL package, which implements the algorithm of Tobin et al. [27]. Additionally, we conducted the same analysis in subsamples which did not receive any lipid-lowering treatment and found overall converging, but somewhat weaker results for rs2000999 (p  S4).
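The model comparison described above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions: the log-likelihood values are hypothetical, natural-log likelihoods are assumed (the usual convention for this test), and the closed-form tail probability used here only holds for an even number of degrees of freedom, which fits the six added covariates (four food categories and two activity scales).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square distribution with an even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


def likelihood_ratio_test(loglik_restricted, loglik_full, df):
    """Return the LR statistic and its one-sided (upper-tail) p-value."""
    stat = -2.0 * (loglik_restricted - loglik_full)
    return stat, chi2_sf_even_df(max(stat, 0.0), df)


# Hypothetical natural-log likelihoods; df = 6 because the adjusted model
# adds six covariates (four food categories, two activity scales).
stat, p_value = likelihood_ratio_test(-930.0, -920.0, df=6)
full_model_fits_better = p_value <= 0.05
```

For odd degrees of freedom a general chi-square tail function (for example from a statistics library) would be needed instead of the closed form.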
Genome-wide association analysis. First, deviations from normality for all quantitative traits (lipids, age, diet, and physical activity) were corrected by inverse-normal transformation without adjusting for covariates. Second, linear mixed effects models were fitted for the transformed outcomes (TC, LDL-C, HDL-C, TG) using the above-mentioned covariates in the Swedish EUROSPAN sample and corresponding measures in the Scottish EUROSPAN sample (Table S2). The analysis was performed using the "polygenic" linear mixed effects model function polygenic() of the R/GenABEL package. Third, genome-wide association analysis was performed using a score test, a family-based association test [28], implemented in the mmscore() function of R/GenABEL. It uses the residuals and the variance-covariance matrix from the polygenic model and additionally the SNP fixed effect coded under an additive model (0 = A/A, 1 = A/B, 2 = B/B). Fourth, genome-wide significance of a genetic locus was based on a local type I error of α = 0.05/311,388 SNPs = 1.6×10−7 according to a Bonferroni adjustment.
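Two of the steps above, the inverse-normal transformation and the Bonferroni threshold, can be illustrated with a short Python sketch. It is not the GenABEL implementation: the function names and example values are hypothetical, and the rank offset of 0.5 and the averaging of tied ranks are assumptions, since the text does not specify these details.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Map values to standard normal quantiles of their (average) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni adjustment."""
    return alpha / n_tests


# Hypothetical total cholesterol values (mg/dl).
transformed = inverse_normal_transform([180.0, 250.0, 210.0, 195.0, 320.0])
# Threshold for the 311,388 SNPs tested, about 1.6e-7.
threshold = bonferroni_alpha(0.05, 311_388)
```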
Candidate gene association analysis. The same statistical approach was used for association analysis of candidate loci with a local type I error of α = 0.05. No Bonferroni adjustment was applied to protect against α inflation since this method would be biased for the following reasons. The applied selection procedure for candidate loci makes the assumption of a global null hypothesis highly unlikely. Additionally, the phenotypes and some of the genotypes are highly correlated, decreasing the number of independent tests. Instead, all confirmatory tests are reported to allow the reader to evaluate the overall significance of the findings [29].
Relatedness. λ coefficients of the lifestyle-adjusted genome-wide analysis varied in a low range between 1.00 and 1.04 in the Swedish cohort (see QQ-plots, Figures S3A, S3B, S3C, S3D, and Figure S4A, S4B, S4C, S4D) and between 1.00 and 1.01 in the Scottish cohort across all lipid traits. λ values for the unadjusted model used in the other three EUROSPAN cohorts did not exceed 1.01. These values indicate that our statistical model adequately handled relatedness in our pedigree-based samples since deflation of λ values is expected after correction for family structure.
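As an illustration of how a λ coefficient of this kind can be obtained, the Python sketch below uses the common median-based estimator: the median of the observed 1-df test statistics divided by the median of a χ² distribution with 1 degree of freedom (about 0.455). The text does not state which estimator was used, so the choice of the median-based one, the function name, and the toy statistics are assumptions.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: median test statistic over its null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical 1-df test statistics for a handful of SNPs.
toy_stats = [0.02, 0.15, 0.40, 0.47, 0.95, 2.10, 3.80]
lam = genomic_inflation(toy_stats)
```

Values close to 1, as reported above, indicate that the test statistics are not systematically inflated.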

Figure 1. Manhattan plot of genome-wide effects on total cholesterol levels in the Swedish discovery cohort. Results for two GWAS analysis models are presented. The unadjusted model (dark blue and light blue circles) included only sex and age as covariates. The adjusted model (red and orange squares) additionally contained food intake and physical activity as predictors. The dashed line indicates the local Bonferroni-adjusted α error = 1.6×10−7. doi:10.1371/journal.pgen.1000798.g001

The Northern Swedish Population Health Study (NSPHS) was approved by the local ethics committee at the University of Uppsala (Regionala Etikprövningsnämnden, Uppsala). The Scottish ORCADES study was approved by the NHS Orkney Research Ethics Committee and the North of Scotland REC. The Croatian VIS study was approved by the ethics committee of the medical faculty in Zagreb and the Multi-Centre Research Ethics Committee for Scotland. The Dutch ERF study was approved by the Erasmus institutional medical ethics

Figure 2. Manhattan plot of genome-wide effects on LDL cholesterol levels in the Swedish discovery cohort. Results for two GWAS analysis models are presented. The unadjusted model (dark blue and light blue circles) included only sex and age as covariates. The adjusted model (red and orange squares) additionally contained food intake and physical activity as predictors. The dashed line indicates the local Bonferroni-adjusted α error = 1.6×10−7. doi:10.1371/journal.pgen.1000798.g002

Figure 3. Manhattan plot of genome-wide effects on HDL cholesterol levels in the Swedish discovery cohort. Results for two GWAS analysis models are presented. The unadjusted model (dark blue and light blue circles) included only sex and age as covariates. The adjusted model (red and orange squares) additionally contained food intake and physical activity as predictors. The dashed line indicates the local Bonferroni-adjusted α error = 1.6×10−7. doi:10.1371/journal.pgen.1000798.g003

Figure 4. Manhattan plot of genome-wide effects on triglyceride levels in the Swedish discovery cohort. Results for two GWAS analysis models are presented. The unadjusted model (dark blue and light blue circles) included only sex and age as covariates. The adjusted model (red and orange squares) additionally contained food intake and physical activity as predictors. The dashed line indicates the local Bonferroni-adjusted α error = 1.6×10−7. doi:10.1371/journal.pgen.1000798.g004

Figure S1 Manhattan plots of genome-wide effects on total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels in the Swedish discovery cohort. Results for two GWAS analysis models are presented. The unadjusted model (dark blue and light blue circles) included only sex and age as covariates. The adjusted model (red and orange squares) additionally contained dietary measures (game meat, non-game meat, fish, milk products) as predictors. The dashed line indicates the local Bonferroni-adjusted α error = 1.6×10−7. Found at: doi:10.1371/journal.pgen.1000798.s001 (0.31 MB DOC)
Figure S2 Manhattan plots of genome-wide effects on total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels in the Swedish discovery cohort. Results for two GWAS analysis models are presented. The unadjusted model (dark blue and light blue circles) included only sex and age as covariates. The adjusted model (red and orange squares) additionally contained physical activity measures (job, leisure) as predictors. The dashed line indicates the local Bonferroni-adjusted α error = 1.6×10−7. Found at: doi:10.1371/journal.pgen.1000798.s002 (0.31 MB DOC)
Figure S3 QQ-Plots for the unadjusted GWAS on total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels in the Swedish discovery cohort. The analysis model was only adjusted for sex and age, but not for diet and activity measures (black line = expected slope under no inflation, red line = slope fitted to observations). Found at: doi:10.1371/journal.pgen.1000798.s003 (0.12 MB DOC)
Figure S4 QQ-Plots for the adjusted GWAS on total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels in the
All candidate SNPs show strongest associations (p-value, top 0.05% SNPs per lipid trait) and are located in a gene which has been reported to be relevant for energy metabolism. SNPs are sorted by p-value ratio (unadjusted:adjusted).

Table 2. SNPs (n = 3) discovered in a Swedish and replicated in a non-Swedish EUROSPAN cohort.