Association of Common Genetic Variants with Lipid Traits in the Indian Population

Genome-wide association studies (GWAS) have been instrumental in identifying novel genetic variants associated with altered plasma lipid levels. However, these quantitative trait loci have not been tested in the Indian population, where there is a poorly understood and growing burden of cardiometabolic disorders. We present the association of six single nucleotide polymorphisms in 1671 sib pairs (3342 subjects) with four lipid traits: total cholesterol, triglycerides, high density lipoprotein cholesterol (HDL-C) and low density lipoprotein cholesterol (LDL-C). We also investigated the interaction effects of gender, location, fat intake and physical activity. Each copy of the risk allele of rs964184 at APOA1 was associated with 1.06 mmol/l increase in triglycerides (SE = 0.049; p = 0.006), rs3764261 at CETP with 1.02 mmol/l increase in both total cholesterol (SE = 0.042; p = 0.017) and HDL-C (SE = 0.041; p = 0.008), rs646776 at CELSR2-PSRC1-SORT1 with 0.96 mmol/l decrease in cholesterol (SE = 0.043; p = 0.0003) and 0.15 mmol/l decrease in LDL-C levels (SE = 0.043; p = 0.0003) and rs2954029 at TRIB1 with 1.02 mmol/l increase in HDL-C (SE = 0.039; p = 0.047). A combined risk score of APOA1 and CETP loci predicted an increase of 1.25 mmol/l in HDL-C level (SE = 0.312; p = 0.0007). Urban location and sex had strong interaction effects on the genetic association of most of the studied loci with lipid traits. To conclude, we validated four genetic variants (identified by GWAS in western populations) associated with lipid traits in the Indian population. The interaction effects found here may explain the sex-specific differences in lipid levels and their heritability. Urbanization appears to influence the nature of the association with GWAS lipid loci in this population. However, these findings will require replication in other Indian populations.


Introduction
Coronary heart disease is projected to be the leading cause of death for adult Indians by 2020 [1] due to the rising prevalence of cardiometabolic disorders [2,3]. Plasma lipid concentrations are established risk factors for coronary artery disease (CAD) [4] and are also targets for therapeutic interventions [5]. While genome-wide association studies (GWAS) have been instrumental in identifying the quantitative trait loci (QTL) associated with altered levels of plasma lipids [6][7][8][9], these new discoveries require validation in different population groups in order to understand their wider potential for application and clinical benefits.
Only two previous validations of a limited sub-set of GWAS lipid findings have been reported for Indian populations [10,11]. During the discovery and replication phases, samples from the LOLIPOP cohort have been widely used to validate GWAS loci for Indian populations, but this cohort comprises Indians residing in the UK and demonstrated a replication rate of 35% [8].
Further, considerable Asian/European differences in lipid profiles have been reported for Asian Indians exhibiting an adverse lipid pattern consisting of low high density lipoprotein cholesterol (HDL-C) and high triglycerides irrespective of diabetic status [12]. Moreover, none of the published reports addressed the complexity of numerous endogamous groups where the average allele frequency differentiation across different groups is known to be 3-fold greater than that observed in European population groups [13]. This indicates a gap in the understanding of the aetiology of lipid traits in Indian populations.
In addition to plasma lipids, other risk factors (e.g. obesity, diabetes and hypertension) are independently and interactively associated with increased risk of cardiovascular diseases [14][15][16] which are further associated with dyslipidemia [17]. Gottesman and colleagues [18] investigated the overlap of genetic variants related to cardiometabolic traits and reported 44 positional genes that have pleiotropic effects. With these findings in mind, we hypothesize that dyslipidemia and metabolic phenotypes such as hyperglycemia, hypertension and anthropometric traits have a common genetic basis.
In our previous study, we reported an association analysis of five lipid-related QTLs in the Indian population [10]. Since that report, a genome-wide meta-analysis [8] has reported 95 loci associated with lipid levels with an impact in three non-European populations including South Asians. Simultaneously, a non-coding genetic variant in the SORT1 gene was observed that led to clinical phenotypes, thus suggesting a novel regulatory pathway [9]. Further, a genome-wide meta-analysis found five new loci associated with CAD in European and South Asian populations [19]. In the present study, we raise the following questions: (i) are lipid-related QTLs discovered since our earlier study also associated with altered plasma lipid levels in Indian populations? and (ii) are these genetic loci associated with other cardiometabolic traits in Indians? Answering these questions will help in determining whether cardiometabolic traits have a common pathophysiology across different population groups.

Ethics statement
Ethical approval for the Indian Migration Study (IMS) was obtained from the All India Institute of Medical Sciences (AIIMS), New Delhi, India (reference number A-60/4/8/2004). Written informed consent was obtained from each participant before data collection began.

Study population
The present study was carried out using trait data and DNA from the IMS, in which migrant and non-migrant factory workers and their co-resident spouses were recruited along with their rural-dwelling sibs [20,21]. The fieldwork for the IMS took place from 2005 to 2007 in four factories located in different cities of India (Lucknow, Nagpur, Hyderabad and Bangalore).

Data collection
Phenotyping details are described in File S1. Briefly, blood pressure, height, weight, waist and hip girth and skin folds were measured on the sib-pairs in the same clinic by trained clinicians and the % body fat was derived from the skin folds. Data on diet and physical activity were recorded on interviewer-administered questionnaires. Fasting blood samples were collected from the participants and the time of the last meal was recorded. Serum and plasma samples were used for generating data on glycemic and lipid profile.

Genotyping and quality control
Genotyping was performed during 2011-2012 using the Fluidigm platform with single-plex 96.96 chips wherein 96 established GWAS single nucleotide polymorphisms (SNPs) related to cardiometabolic traits were analyzed. Two pairs of duplicates and negative controls (water) were run with every 96 samples for quality control purposes. The genotyping success rate was >95% and duplicate samples had >99% concordance. Out of 96 SNPs, fourteen loci were selected from three major studies on lipid levels [8,9] and CAD [19]. This limited set of loci was selected from these studies based on their biological importance and p-values (≤1×10^−40 for lipid loci and <1×10^−8 for CAD loci). Out of the 14 SNPs genotyped, nine passed quality control during the data cleaning process and finally six loci were found to be in Hardy-Weinberg equilibrium (HWE) (Table S1 in File S1), for which the results are presented.
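The HWE filter mentioned above can be illustrated with a chi-square goodness-of-fit test on genotype counts. The source does not state which HWE test or threshold was used, so the following is only a minimal sketch under that assumption; the function name `hwe_chi_square` is hypothetical, and the test treats genotypes as independent observations, which sib-pairs are not.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Arguments are observed genotype counts (AA, AB, BB).
    Returns (chi2, p_value). Illustrative only; not the authors' procedure.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic SNP) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A SNP would be flagged as departing from HWE when `p_value` falls below whatever threshold the analyst chooses.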

Sample Size and power calculation
We analyzed 1671 sib pairs (3342 individuals) after excluding: (i) singletons; (ii) cousin/friend pairs; (iii) pairs with one or both sibs having missing phenotypes; (iv) pairs with one or both sibs having missing genotyping data on >7 SNPs; and (v) pairs where one or both sibs self-reported cardiovascular diseases, to avoid phenotypic heterogeneity that could distort relationships with lipid traits. Power estimates were derived using the Genetic Power Calculator with the option "QTL association for sibships and singletons" [22]. Given the minor allele frequency (MAF) of 21% (minimum MAF in IMS) and the sample size of 1671 sib-pairs, this study had 80% power at α = 0.05 to detect a QTL explaining 1% of the variation of a trait. Sex-specific associations were estimated among 632 male and 364 female sib-pairs.

Statistical analysis
After log transformation of variables with skewed distributions (see File S1), the association analysis was done using an orthogonal family-based model described by Fulker et al. [23], assuming an additive model of inheritance and considering a sib-pair as the unit of analysis (described in File S1). We applied multi-level models adjusted for age, sex, site (i.e. city) and location (i.e. rural/urban) for analyses on all quantitative traits because these covariates were associated with various outcomes in the study population and differences were found across the sites and locations [20]. Since physical activity and fat intake are important determinants of the lipid profile [24][25][26], we also adjusted for these two variables when estimating the associations. Association of the six selected loci was estimated for four lipid traits [total cholesterol, triglycerides, HDL-C and low density lipoprotein cholesterol (LDL-C)] and also for other metabolic traits related to obesity [body mass index (BMI), waist-hip ratio (WHR), waist circumference (WC) and % body fat], hypertension [systolic blood pressure (SBP) and diastolic blood pressure (DBP)] and diabetes (fasting glucose and fasting insulin) after adjusting for lipid traits and also for WHR in the case of BMI, to detect the independent associations. Correction for multiple testing was not applied for lipid traits as the studied SNPs are established loci [8,9,19], whereas for all other metabolic traits inferences were made on the basis of a corrected α of 0.0083 based on a Bonferroni correction [27] for six tests.
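The corrected threshold quoted above follows from dividing the nominal significance level by the number of tests. The nominal level of 0.05 is taken from the power calculation and is assumed here to be the level that was corrected:

```latex
\alpha_{\text{corrected}} = \frac{\alpha}{m} = \frac{0.05}{6} \approx 0.0083
```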
Sex-specific associations were also examined given prior evidence for dimorphic patterns of association [8,28]. We also tested for interaction effects by sex, location, fat intake and physical activity by including interaction terms within the fixed effect component of the Fulker association model (see details in File S1). Stratified analysis by location, fat intake and physical activity could not be performed due to limited sample size available in these groups.
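The idea behind the within sib-pair estimates can be sketched by splitting each sib's genotype into a between-pair mean and a within-pair deviation, then regressing the trait on the deviation. This is a deliberately simplified illustration: it omits the covariates, interaction terms and multi-level structure described above, and the function names are hypothetical rather than taken from the authors' code.

```python
def decompose_pair(g1, g2):
    """Split two sibs' allele counts into a between-pair mean and within-pair deviations."""
    between = (g1 + g2) / 2.0
    return between, (g1 - between, g2 - between)


def within_pair_beta(pairs):
    """Least-squares slope of trait on the within-pair genotype deviation.

    pairs: iterable of ((g1, y1), (g2, y2)), where g is an allele count (0, 1, 2)
    and y is the (possibly log-transformed) trait value.
    Simplified sketch: no covariates, no random effects.
    """
    num = 0.0
    den = 0.0
    for (g1, y1), (g2, y2) in pairs:
        _, (w1, w2) = decompose_pair(g1, g2)
        y_mean = (y1 + y2) / 2.0
        # Centering the trait within the pair removes anything the sibs share.
        num += w1 * (y1 - y_mean) + w2 * (y2 - y_mean)
        den += w1 ** 2 + w2 ** 2
    if den == 0:
        raise ValueError("no genotype-discordant pairs")
    return num / den
```

Only genotype-discordant pairs contribute to the slope, and because both trait and genotype are centered within the pair, factors shared by the sibs (such as family background) cancel out.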
To estimate the combined effect of loci on lipid levels, risk scores were calculated using loci associated with each of the lipid traits examined in the present study. Weighted risk scores (with trait-specific β coefficients as weights) based on the associated loci observed [29] were fitted into the Fulker model for estimating within sib-pair effects. Since additional samples for estimating the effect of the risk score were not available, the present data set was divided into two random halves representing the discovery and validation samples to validate the weighted risk scores.
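A weighted risk score and a random split into two halves can be sketched as follows. The SNP labels and weights in any call are placeholders supplied by the user, not estimates from this study, and the choice to split by sib-pair identifier (so that both sibs stay in the same half) is an assumption rather than something the text states.

```python
import random


def weighted_risk_score(allele_counts, weights):
    """Sum of risk-allele counts (0, 1 or 2) weighted by per-SNP effect sizes.

    allele_counts and weights are dicts keyed by SNP label (hypothetical labels).
    """
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


def split_halves(pair_ids, seed=0):
    """Randomly split sib-pair identifiers into two halves (discovery, validation)."""
    ids = list(pair_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]
```

For example, `weighted_risk_score({"snp_a": 2, "snp_b": 1}, {"snp_a": 0.5, "snp_b": 0.25})` sums 0.5 × 2 and 0.25 × 1.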

Results and Discussion
Over 100 SNPs associated with altered plasma lipid levels have been discovered using GWAS [6][7][8][9]. Considering that these studies were mostly conducted in populations of European descent and that the minor alleles and their frequency, haplotype background and environmental influences vary across ethnic groups [30], we investigated the role of these loci on four lipid and other traits that predict cardiovascular disease risk in the Indian population. Validation of the effects of GWAS loci will likely be more valuable in populations such as Asian Indians [31] that have a high disease burden and where conducting GWAS is a difficult task. Table S2 in File S1 shows the comparison between the effect alleles and their frequency observed in European populations and that observed in our study samples and highlights the considerable variation between them. However, the allele frequencies we observed were consistent with those reported for Gujarati Indians in Houston (GIH) in the HapMap database. The general characteristics of the study population and outcome variables are summarized in Table 1. Significant differences were found between males and females for various cardiometabolic traits, except for total cholesterol and fasting glucose (Table 1).
Table 2. Within sib-pair association estimates for lipid traits.
Table 4. Within sib-pair interaction estimates for lipid traits by location, average daily fat intake and physical activity.

Association of six loci with lipid levels
In an earlier report, rs662799 at APOA5, rs10503669 at LPL, rs780094 in GCKR, rs562338 in APOB and rs4775041 in LIPC were validated in the present study population [10]. In the current analyses, we found associations between genetic variants on/near four loci (APOA1, CETP, CELSR2-PSRC1-SORT1 and TRIB1) and the four lipid traits in the Indian population (Table 2). Although the directions of associations were consistent with those reported worldwide, the effect sizes in the Indian population were larger than those observed for European populations but consistent with other Asian populations (Table S3 in File S1). Of these, rs964184 at the APOA1 locus was associated with 1.06 mmol/l higher triglycerides (SE = 0.049; p = 0.006); rs3764261 at CETP with 1.02 mmol/l higher total cholesterol (SE = 0.042; p = 0.017) and 1.02 mmol/l higher HDL-C (SE = 0.041; p = 0.008); rs646776 at CELSR2-PSRC1-SORT1 with 0.96 mmol/l lower total cholesterol (SE = 0.043; p = 0.0003) and 0.15 mmol/l lower LDL-C (SE = 0.043; p = 0.0003); and rs2954029 at TRIB1 with 1.02 mmol/l higher HDL-C (SE = 0.039; p = 0.047) levels.
Apolipoprotein A-1 is the major protein component of HDL and promotes cholesterol efflux from tissues to the liver for excretion. The APOA1 locus was reported to be associated with increased triglycerides and lower HDL-C in the discovery phase of various studies [32]. In subsequent GWAS and meta-analyses, the APOA1 locus was confirmed to be associated with higher triglycerides, total cholesterol, LDL-C and lower HDL-C levels in Europeans [8]. The association of APOA1 variants with higher triglyceride levels has also been established in Tibetans [33] as well as in Punjabi and US cohorts [11]. We have also observed significant association of this locus with higher triglyceride levels in the present analyses.
The CETP locus codes for cholesteryl ester transfer protein that facilitates the transfer of cholesteryl esters and triglycerides between lipoproteins. CETP was found to be associated with high HDL-C in GWAS discovery [34], which was further replicated among Europeans [8], Americans [11] and Punjabi cohorts [11,35] and with higher total cholesterol levels among Caucasians [8]. Lower triglycerides and LDL-C in a European GWAS meta-analysis were also observed to be associated with CETP [8]. In the present study, we validated its association with higher total cholesterol and HDL-C levels.
The third locus is mapped near the CELSR2-PSRC1-SORT1 gene cluster and emerged from a GWAS of LDL-C conducted in a British population [36]. Its association with lower LDL-C levels was also replicated in Austrians [37] and Pakistanis [38], and with high total cholesterol in a Dutch population [39]. In the present study, CELSR2-PSRC1-SORT1 was associated with lower levels of total cholesterol and LDL-C.
The TRIB1 locus codes for tribbles homologue 1 protein that regulates the activation of mitogen activated protein kinases. This locus was first reported to be associated with triglycerides [40] and subsequently with low total cholesterol, low LDL-C and high HDL-C in European populations [8,41]. Here, we observed its association with higher HDL-C levels, which is in agreement with that seen for Europeans. In contrast, the TRIB1 locus was associated with lower HDL-C levels in a Danish population [42].
Since lifestyle factors, especially diet and physical activity, are strongly associated with individual serum lipid profiles [24][25][26], daily dietary fat intake and physical activity (total MET score) were included as additional covariates to explore the possible associations of the studied QTLs. In the studied population, additional adjustment for these two covariates did not alter the direction or size of the effects compared with the analyses not adjusted for them (Table S4 in File S1).
The cumulative effect of genetic variants for lipids is known to be associated with subclinical and clinical cardiovascular outcomes [43]. In the present study, multiple loci were associated with HDL-C and total cholesterol, but the directions of the effects were the same only for SNPs associated with HDL-C (Table 2). Thus, an attempt was made to estimate the combined effect of the two significant loci (rs2954029 at TRIB1 and rs3764261 at CETP) on HDL-C levels. The weighted risk score was associated with a 1.25 mmol/l higher HDL-C level for risk alleles at both variants (SE = 0.312; p = 0.0007) as opposed to a 1.02 mmol/l increase that could be explained by independent SNPs.

Association of six loci with related metabolic traits
We further investigated these GWAS loci related to lipids for their association(s) with other metabolic traits, which would help in identifying the causal pathways that are common to these outcomes. While there is sufficient epidemiological and clinical evidence that supports the relationship among dyslipidemia, cardiovascular disease, diabetes, obesity and hypertension, the common genetic mechanisms underlying these diseases are not well established [18]. Evidence of weak associations between lipid-related genetic variants in LPL and GCKR and hypertension, and between variants in LPL and fasting glucose, fasting insulin and systolic blood pressure, has been reported earlier [10]. In the present study, three out of the six investigated loci were associated with metabolic disorders (Tables S5-S7 in File S1). While performing association analyses, adjustments were made for lipid traits (in addition to age, sex, site and location) to avoid bias that could occur due to phenotypic heterogeneity.
Of interest was that the two loci APOA1 and TRIB1 that affected HDL-C levels also influenced waist circumference. We noted an overlapping association between lipid levels and waist circumference, which would point towards a common pathophysiology between lipids and obesity traits. In addition, we found a weak association between the PDGFD locus and diastolic blood pressure, which echoes the pattern of association with other traits in a previous study [18] that found that PDGFD was implicated in a variety of functions, especially angiogenesis. Recently, Schierer and colleagues [35] in a similar attempt reported that CETP was associated with a decrease in systolic blood pressure (β = −0.08, p = 0.002) among Asian normoglycemic controls.
However, these loci need to be assessed in a larger set of samples in order to draw more meaningful inferences, as none of the genetic variants retained the association after correction for multiple testing.

Sex-specific association of six loci related to lipid levels
There is evidence that points towards sex heterogeneity in the association of lipid-related loci with lipid parameters [8]. We found sex-specific associations with various lipid traits (Table 3). Out of the four loci that were associated in the combined analyses, CETP was associated with 1.05 mmol/l higher HDL-C (SE = 0.071; p = 0.001) and CELSR2-PSRC1-SORT1 was associated with 0.94 mmol/l lower triglycerides (SE = 0.081; p = 0.074), 0.95 mmol/l lower total cholesterol (SE = 0.076; p = 0.007) and 0.15 mmol/l lower LDL-C (SE = 0.072; p = 0.033) among male sib-pairs only. On the other hand, APOA1 was associated with 1.05 mmol/l higher triglycerides (SE = 0.089; p = 0.015) and TRIB1 with 1.04 mmol/l higher HDL-C (SE = 0.077; p = 0.007) among female sib-pairs only. In addition, LIPA was associated with 1.03 mmol/l higher HDL-C (SE = 0.078; p = 0.041) only in female sib-pairs, an association that did not emerge in the combined analyses. We also previously reported sex-specific associations for lipid traits [10] and now postulate that these findings might explain the sex differences in lipid levels and their heritability.
The exploratory interaction analyses provide evidence that the genetic effects of all six loci were influenced by gender and these associations were consistent even after adjustments for fat intake and physical activity (Table 3). Modifications in the genetic effects of two loci were seen, where the effects were stronger among males in the case of APOA1 with triglycerides (β = 0.168, SE = 0.051, p = 0.001) and CELSR2-PSRC1-SORT1 with total cholesterol (β = −0.135, SE = 0.045, p = 0.003) and LDL-C (β = −0.099, SE = 0.046, p = 0.030). In addition, a few conditional associations with sex were found, such as the association of LIPA with HDL-C (β = −0.100, SE = 0.033, p = 0.002) (Table 3), which was not evident in the main effects.

Effects of environmental factors on lipid loci
Rural to urban migration has been suggested to be associated with increased fat intake and reduced physical activity [20]. Thus, we tested for effect modification by location, fat intake and physical activity while allowing for the main effects of four loci that were associated with the lipid traits. Genetic associations of four loci with lipids were stronger in urban dwellers than in their rural sibs after adjusting for daily fat intake and physical activity (Table 4), suggesting interaction. The genetic effect of APOA1 on triglycerides (β = 0.147, SE = 0.044, p = 0.001) and CETP on total cholesterol (β = 0.110, SE = 0.035, p = 0.002) increased while interacting with location when compared to the main effects (see Table 2). Further, conditional associations with urban location were found, such as the association of LIPA with total cholesterol (β = 0.082, SE = 0.030, p = 0.006), which was not evident in the main effects.
Similarly, in comparison to the main effects (Table 2), reductions in the genetic effects of APOA1 on triglycerides (β = 0.107, SE = 0.052, p = 0.040) and CETP on total cholesterol (β = 0.097, SE = 0.041, p = 0.018) were seen in people consuming high dietary fat after adjusting for physical activity (Table 4). Further, conditional associations with dietary fat were seen for CETP on LDL-C (β = 0.085, SE = 0.041, p = 0.042) and TRIB1 on total cholesterol (β = 0.114, SE = 0.038, p = 0.003) and LDL-C (β = 0.111, SE = 0.039, p = 0.004), which were absent in the main effects.
The genetic effect of TRIB1 on HDL-C (β = 0.087, SE = 0.038; p = 0.021) was found to be stronger among physically active participants (Table 4) than the main effects (Table 2) after adjusting for fat intake. A conditional association of LIPA on triglycerides (β = −0.077, SE = 0.033; p = 0.021) was also seen among physically active individuals (Table 4).
To conclude, we confirm that four previously discovered QTLs in Europeans also influence lipid levels in the Indian population. Two of these loci (TRIB1 and CELSR2-PSRC1-SORT1) have been validated in the Indian population for the first time. However, the present findings will need to be replicated in larger samples. Sex-specific associations were also observed in the studied population along with strong interaction effects for all six loci studied. Genetic associations with lipid traits were stronger in urban dwellers compared to their rural sibs, suggesting interaction. Some evidence was also seen for interaction by dietary fat intake and physical activity on the genetic association of lipid traits.

Supporting Information
File S1 Supporting Information containing details of methodology and Tables S1-S7. (DOC)