Conceived and designed the experiments: DKS REF MIK. Performed the experiments: LFB. Analyzed the data: CEA. Contributed reagents/materials/analysis tools: DKS CEA LFB. Wrote the paper: DKS SR GSW NKM JRS REF MIK LFB CEA. National Heart Lung and Blood Institute (NHLBI)'s Mammalian Genotyping Service (Contract Number HV48141) performed genotyping for this study.
The authors have declared that no competing interests exist.
In this investigation, we have carried out an autosomal genome-wide linkage analysis to map genes associated with type 2 diabetes (T2D) and five quantitative traits of blood lipids including total cholesterol, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, very low-density lipoprotein (VLDL) cholesterol, and triglycerides in a unique family-based cohort from the Sikh Diabetes Study (SDS). A total of 870 individuals (526 male/344 female) from 321 families were successfully genotyped using 398 polymorphic microsatellite markers with an average spacing of 9.26 cM on the autosomes. Results of non-parametric multipoint linkage analysis using the S_all statistic (implemented in Merlin) did not reveal any chromosomal region to be significantly associated with T2D in this Sikh cohort. However, linkage analysis for lipid traits using QTL-ALL analysis revealed promising linkage signals with p≤0.005 for total cholesterol, LDL cholesterol, and HDL cholesterol at chromosomes 5p15, 9q21, 10p11, 10q21, and 22q13. The most significant signal (p = 0.0011) occurred at 10q21.2 for HDL cholesterol. We also observed linkage signals for total cholesterol at 22q13.32 (p = 0.0016) and 5p15.33 (p = 0.0031) and for LDL cholesterol at 10p11.23 (p = 0.0045). Interestingly, some of the linkage regions identified in this Sikh population coincide with plausible candidate genes reported in recent genome-wide association and meta-analysis studies for lipid traits. Our study provides the first evidence of linkage for loci associated with quantitative lipid traits at four chromosomal regions in this Asian Indian population from Punjab. More detailed examination of these regions with more informative genotyping, sequencing, and functional studies should lead to rapid detection of novel targets of therapeutic importance.
Type 2 diabetes (T2D) is a major public health problem of the 21st century and the fifth leading cause of death worldwide. According to Global Burden of Disease Study predictions, India, China, and the USA will be the three leading countries for the prevalence of diabetes
Elevated serum lipid levels are important risk factors for the development of cardiovascular disease (CVD). The genetic basis of several monogenic forms of lipid disorders has been determined, including familial lipoprotein lipase (LPL) deficiency, apoC-II deficiency, defective apoB, familial hypercholesterolemia, and familial triglyceridemia
Recent genome-wide association studies (GWAS) are revolutionizing the dissection of the genetic determinants of complex traits, including T2D and serum lipids. Although these studies keep adding to the list of reliably associated common loci controlling T2D, blood lipids, and other complex traits, these loci explain only a small portion of the heritable component of these traits. Clearly, additional loci that can explain a larger proportion of the variation await discovery.
Asian Indians, one quarter of the global population, have unusually high CVD mortality and very high prevalence of insulin resistance and T2D
This study was carried out on an endogamous community of Khatri Sikhs living in the Northern Indian states of Punjab, Haryana, and New Delhi. The Khatri population was chosen because of its relatively higher prevalence of diabetes compared with other Sikh castes. Khatri Sikhs are generally more affluent, live in cities, and are traders by profession. In general, Sikhs do not smoke for religious and cultural reasons, and about 50% of the study participants are life-long vegetarians. A total of 1,115 individuals from 338 families were extensively phenotyped
Trait | Sex* | T2D cases | Unaffected relatives | p value**
Number of individuals | | 685 (412M/273F) | 185 (116M/69F) |
Age at recruitment (years) | M | 52.42±11.36 | 44.71±15.20 | <0.0001
Age at recruitment (years) | F | 55.93±10.88 | 48.60±13.60 | <0.0001
Age at onset (years) | M | 45.80±10.65 | – | –
Age at onset (years) | F | 48.62±10.40 | – | –
Duration of T2D (years) | M | 7.82±7.42 | – | –
Duration of T2D (years) | F | 7.33±6.65 | – | –
BMI (kg/m²) | M | 26.90±4.23 | 26.89±4.53 | 0.876
BMI (kg/m²) | F | 28.50±5.20 | 27.58±4.79 | 0.140
Waist (cm) | M | 95.50±10.40 | 93.2±11.55 | 0.040
Waist (cm) | F | 92.70±11.00 | 88.1±10.60 | <0.0001
Hip (cm) | M | 95.8±8.20 | 96.0±8.50 | 0.875
Hip (cm) | F | 99.3±11.00 | 97.6±9.70 | 0.135
WHR† | M | 0.99±0.07 | 0.97±0.07 | <0.0001
WHR† | F | 0.94±0.07 | 0.90±0.07 | <0.0001
Fasting glucose (mg/dl) | M | 185.81±70.66 | 95.30±11.38 | <0.0001
Fasting glucose (mg/dl) | F | 193.23±74.58 | 97.77±9.31 | <0.0001
Total cholesterol (mg/dl) | M | 177.18±44.34 | 174.54±45.86 | 0.525
Total cholesterol (mg/dl) | F | 187.24±47.42 | 177.58±38.94 | 0.049
Triglycerides (mg/dl) | M | 197.63±113.50 | 160.71±85.49 | <0.0001
Triglycerides (mg/dl) | F | 172.66±95.34 | 155.74±69.72 | 0.078
HDL cholesterol (mg/dl) | M | 38.11±12.44 | 39.21±10.01 | 0.321
HDL cholesterol (mg/dl) | F | 41.76±12.59 | 43.70±11.37 | 0.187
LDL cholesterol (mg/dl) | M | 99.98±35.68 | 98.36±32.96 | 0.619
LDL cholesterol (mg/dl) | F | 110.17±39.81 | 102.89±35.56 | 0.098
VLDL cholesterol (mg/dl) | M | 39.75±24.05 | 32.45±18.43 | 0.001
VLDL cholesterol (mg/dl) | F | 34.77±19.18 | 31.13±13.84 | 0.056
*M = male, F = female; †Waist-to-hip ratio; **Difference between T2D cases and unaffected relatives.
A total of 557 families were investigated, of which 236 were excluded because they did not meet the eligibility criteria for the study. A total of 321 families containing 870 individuals (526 males/344 females), who were successfully genotyped (call rate
Group | Phenotyped | | | Genotyped and phenotyped for T2D** | | | Genotyped and phenotyped for lipid levels** | |
Characteristic | Total | Males | Females | Total | Males | Females | Total | Males | Females
Number of families | 338 | – | – | 321 | – | – | 316 | – | –
Family size | 6.51(*) | – | – | 2.71(**) | – | – | 2.68 | – | –
Generations (average) | 2.46 | – | – | 2.49 | – | – | 2.49 | – | –
Number of individuals in pedigrees | 2,199(*) | 1,248 | 951 | 870(**) | 526 | 344 | 846(**) | 511 | 335
Founders | 979(*) | – | – | 85 | – | – | 82 | – | –
Number of affecteds in pedigrees | 1,202(*) | 710 | 492 | 685 | 412 | 273 | | |
Number of individuals with blood available | 1,115 | 684 | 431 | | | | | |
Number of affecteds with blood available | 868 | 530 | 338 | | | | | |
Number of unaffected with blood available | 247 | 154 | 93 | | | | | |
*Including deceased; **excluding deceased.
The diagnosis of T2D was confirmed by (a) searching medical records for indications of symptoms of diabetes or measures of blood glucose levels, (b) use of diabetic medication, and (c) measuring fasting glucose levels following the guidelines of the American Diabetes Association
Body mass index (BMI) was calculated as weight (kg)/[height (m)]², and waist-to-hip ratio (WHR) was calculated as the ratio of abdomen or waist circumference to hip circumference. Despite having comparable BMI (27.5±4.0 in T2D cases vs. 27.3±4.7 in controls), patients had pronounced abdominal adiposity, as reflected by their significantly higher WHR (0.97±0.07 vs. 0.94±0.07; p<0.0001). Interestingly, WHR in Khatri Sikh men (BMI 26–27 kg/m²) was higher than in obese Mexican American men (BMI>32 kg/m²); Sikhs (0.97±0.05) vs. Mexican Americans (0.95±0.06)
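The two anthropometric indices defined above are simple ratios. As a minimal sketch (the function names and input checks are our own assumptions for illustration, not part of the study's analysis pipeline), they can be computed as follows:

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def waist_hip_ratio(waist_cm, hip_cm):
    """Waist-to-hip ratio: waist circumference divided by hip circumference."""
    if waist_cm <= 0 or hip_cm <= 0:
        raise ValueError("circumferences must be positive")
    return waist_cm / hip_cm


# Hypothetical individual, not taken from the study data.
example_bmi = bmi(81.0, 1.80)
example_whr = waist_hip_ratio(97.0, 100.0)
print(round(example_bmi, 2), round(example_whr, 2))
```

Both indices are computed per individual before group means are taken, so the ratio of two group means in a summary table need not equal the mean WHR reported for that group.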
Education (highest level completed) was scored 1–4 where 1 = primary or none, 2 = high school, 3 = bachelor degree, and 4 = post graduate degree. Job grade was scored 1–3 based on education and economic status where 1 = high income, 2 = middle income, and 3 = lower-middle and lowest income class; category 1 was used as the reference group. Smoking information covered past smoking, current smoking status, duration of smoking, and number of cigarettes smoked per day. Alcohol consumption was scored 0–4 where 0 = no alcohol, 1 = 50 to 100 ml/day, 2 = 100 to 400 ml/day, 3 = 400 to 1000 ml (1 L)/day, and 4 = >1 L/day. Physical activity was scored 1–3 based on the level of activity performed where 1 = very active, 2 = moderately active, and 3 = quite inactive. About 83% of T2D patients were taking oral hypoglycemic agents. Some were maintaining glycemic control by diet and exercise. Individuals on lipid-lowering medications were not included in the analysis. Further recruitment details are available elsewhere
Serum lipids [total cholesterol, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, very low-density lipoprotein (VLDL) cholesterol, and triglycerides] were quantified using standard enzymatic methods (Roche, Basel, Switzerland). Fasting serum insulin was measured by radioimmunoassay (Diagnostic Products, Cypress, USA). All quantitative parameters were determined following the manufacturers' instructions using a Hitachi 902 auto-analyzer (Roche, Basel, Switzerland).
DNA was extracted from buffy coats using QIAamp blood kits (Qiagen, Chatsworth, USA) or by the salting out procedure
A variety of statistical software was used to complete this study. To set up the files for analysis, we extensively used the statistical software R (version 2.0.1). Data cleaning was performed following several steps. To check for inconsistencies in the self-reported family structures, we carried out relationship testing using PREST
To adjust for the confounding effects of environmental influences on the lipid traits, we included age, age², sex, BMI, dietary and lifestyle factors (smoking, alcohol consumption, and physical activity), and socio-economic status (education and job grade) as covariates. To select significant covariates, both stepwise regression and backward elimination were used in the genetic models. The significant covariates considered for selection in the model were age, age², sex, job grade, and level of alcohol consumption. Additionally, the analysis was performed both including and excluding BMI in the model, despite its elimination in stepwise regression. Univariate analysis was performed to obtain summary statistics for each trait (online supplementary Table S2). A classical multiple linear regression model:
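As a sketch of the general form such a model takes (the notation below is assumed for illustration and is not taken from the supplementary material), a multiple linear regression of a trait value on K covariates can be written as:

```latex
y_i = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)
```

Here y_i would be the (possibly transformed) lipid trait of individual i, x_{ik} the value of covariate k (for example age, age², sex, job grade, or alcohol consumption score), β_k the corresponding regression coefficient, and ε_i the residual error.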
We used the S_all statistic
In this study, families containing individuals affected by T2D are preferentially over-sampled, so this sample is non-randomly ascertained with respect to T2D. Therefore, to the extent that lipid traits are correlated with T2D status, the sample is also non-randomly ascertained with respect to the lipid traits. Thus, it would not be appropriate to use the usual variance-component based linkage analysis methods on these data. Instead we used score-based linkage statistics as implemented in the QTL-ALL program for this data set
Family structure data and X-linked genotypes at 27 markers were combined to detect possible gender errors by looking for males who were more heterozygous than expected and females who were more homozygous than expected. Five males were heterozygous at more than two markers; 16 females were more than 80% homozygous. All suspect participants were rechecked to ensure there was no misreporting of gender. We used RELPAIR and PREST to check the accuracy of self-reported family relationships. Misclassifications of half-siblings as full siblings, and of unrelated individuals as cousins, were detected and resolved. Participants with unresolved relationship errors were removed from families before analysis. We also used PEDCHECK to check for Mendelian inconsistencies at each marker, and erroneous data were omitted from further analysis.
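The X-linked heterozygosity check described above can be sketched in a few lines. This is an illustration only: the data layout (a list of allele pairs per person), the function names, and the use of the reported figures (more than two heterozygous markers in males, more than 80% homozygosity in females) as flagging thresholds are our assumptions, not the study's actual implementation.

```python
def typed(genotypes):
    """Keep only markers where both alleles were called (None marks a missing call)."""
    return [(a, b) for a, b in genotypes if a is not None and b is not None]


def heterozygous_count(genotypes):
    """Number of typed X-linked markers with two different alleles."""
    return sum(1 for a, b in typed(genotypes) if a != b)


def homozygous_fraction(genotypes):
    """Fraction of typed X-linked markers with two identical alleles, or None if untyped."""
    calls = typed(genotypes)
    if not calls:
        return None
    return sum(1 for a, b in calls if a == b) / len(calls)


def flag_sex_discrepancy(reported_sex, genotypes, max_male_het=2, min_female_hom=0.80):
    """Return True when X-linked genotypes look inconsistent with the reported sex.

    Thresholds are assumed defaults mirroring the figures quoted in the text.
    """
    if reported_sex == "M":
        # Males carry one X chromosome, so heterozygous calls are unexpected.
        return heterozygous_count(genotypes) > max_male_het
    if reported_sex == "F":
        # Females carry two X chromosomes, so near-complete homozygosity is suspicious.
        fraction = homozygous_fraction(genotypes)
        return fraction is not None and fraction > min_female_hom
    raise ValueError("reported_sex must be 'M' or 'F'")


# Hypothetical genotypes at 27 X-linked markers.
suspect_male = [(1, 2)] * 3 + [(1, 1)] * 24
print(flag_sex_discrepancy("M", suspect_male))
```

A flag from such a check only marks a participant for rechecking; it does not by itself establish that the reported sex is wrong, since genotyping errors can also produce unexpected calls.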
As shown in online
Univariate analysis of the lipid traits identified some individuals with very high or very low outlier values, which were removed from the analysis. As needed, a Box-Cox transformation was applied to bring the error distribution of the data closer to normality
Trait | Job grade | Alcohol consumption | Sex | Age | Age² |
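For reference, the Box-Cox transformation maps a positive value y to (y^λ − 1)/λ when λ ≠ 0 and to ln(y) when λ = 0. The sketch below applies it for a given λ; the function name is ours, and the choice of λ (typically made by maximum likelihood) is not shown and is not taken from the study.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox power transformation to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limit of (y**lam - 1) / lam as lam approaches 0 is the natural log.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical triglyceride values in mg/dl, transformed with an assumed lambda of 0.
print(box_cox([120.0, 160.0, 410.0], 0))
```

With λ = 1 the transformation only shifts the data by one unit, so the shape of the distribution is unchanged; values of λ below 1 progressively compress a long right tail.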
Total Cholesterol |
|
||||
Triglycerides |
|
||||
HDL Cholesterol |
|
||||
LDL Cholesterol |
|
|
|
||
VLDL Cholesterol |
|
*represents significant covariate used for each lipid trait.
QTL-ALL analysis, using the Score.Max statistic, was performed for the five quantitative traits. An overview of the significant linkage signals for the serum lipid traits is given in
Linkage plots show significant signals at four chromosomal regions with allele sharing LOD (−log10 p value) on Y axis and chromosome distance (cM) on X axis. Significant linkage includes chromosome 5 near marker D5S2488 (p = 0.0031) for total cholesterol; chromosome 9 near marker D9S1122 (p = 0.0039) for LDL cholesterol; chromosome 10q21.2 near D10S1225 (p = 0.0011) for HDL cholesterol; and chromosome 22 near marker TCTA015M (p = 0.0016) for total cholesterol.
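The Y-axis quantity in these plots is defined in the caption as −log10 of the p value. A minimal sketch of that conversion, applied to the four p values quoted in the caption, is given below; the function name and dictionary layout are our own choices for illustration.

```python
import math


def neg_log10_p(p_value):
    """Convert a p value in (0, 1] to the -log10 scale used on the plot's Y axis."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p value must lie in (0, 1]")
    return -math.log10(p_value)


# p values as reported in the caption, keyed by (trait, closest marker).
reported_signals = {
    ("Total cholesterol", "D5S2488"): 0.0031,
    ("LDL cholesterol", "D9S1122"): 0.0039,
    ("HDL cholesterol", "D10S1225"): 0.0011,
    ("Total cholesterol", "TCTA015M"): 0.0016,
}

for (trait, marker), p in reported_signals.items():
    print(f"{trait} near {marker}: p = {p}, -log10(p) = {neg_log10_p(p):.2f}")
```

On this scale, the p≤0.005 threshold used to report promising signals corresponds to a value of about 2.3.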
Trait | Chromosome | Cytogenetic position | Physical position* (bp) | Closest marker | Genetic position (cM) | Score.Max p value |
Total Cholesterol | 5 | 5p15.33 | 180390 | D5S2488 | 0.00 | 0.0031 |
LDL Cholesterol | 9 | 9q21.13 | 78878414 | D9S1122 | 75.88 | 0.0039 |
LDL Cholesterol | 10 | 10p11.23 | 30535660 | D10S1426 | 65.61 | 0.0045 |
HDL Cholesterol | 10 | 10q21.1 | 57199892 | D10S1221 | 84.44 | 0.0041 |
HDL Cholesterol | 10 | 10q21.2 | 64425005 | D10S1225 | 89.69 | 0.0011 |
Total Cholesterol | 22 | 22q13.32 | 47925896 | TCTA015M | 66.96 | 0.0016 |
*NCBI Build 36.1 positions from the UCSC browser.
Our study represents the first large-scale genome-wide effort to identify chromosomal regions with putative loci affecting T2D and lipid traits in a unique community of Asian Sikhs from Northern India. This diabetic cohort from a genetically homogeneous subgroup was collected with the initial goal of identifying T2D predisposing genes. However, the results of our non-parametric linkage scan did not identify any chromosomal region to be significantly linked to T2D
The other aim of this investigation was to identify genomic regions affecting lipid-related phenotypes in this cohort. We performed QTL-ALL analysis on this non-randomly ascertained dataset, which revealed several suggestive linkage signals associated with serum lipid levels
A linkage peak for total serum cholesterol (p = 0.0031) was detected near marker D5S2488 at the distal tip of chromosome 5p (5p15.33). This region was previously linked to LDL cholesterol in the NHLBI Family Heart Study
The region of the linkage signal at chromosome 22q13.32 near marker TCTA015M (p = 0.0016), detected here for total cholesterol, was previously linked with familial hypercholesterolemia in a Utah study
Our study does not represent a common replication attempt to identify lipid loci in an independent population. Rather, this investigation has been carefully carried out in this unique family-based cohort using a conservative statistical approach, applying score-based statistics to map quantitative lipid traits in a non-randomly ascertained dataset. Exceeding our expectations, this study has identified linkage regions, primarily for HDL cholesterol (10q21.1–21.2) and total cholesterol (22q13.32), that were previously reported for lipid traits or CVD. The most interesting part of this study is that some of these linkage signals also harbor important candidate loci (e.g.,
Unlike previous studies, our genome-wide linkage scan could not identify any significant chromosomal region associated with T2D in this unique family cohort of Punjabi Sikhs with an increased risk of developing T2D and cardiovascular illnesses. Our study, however, provides for the first time evidence of linkage for loci controlling quantitative lipid traits at four chromosomal regions in this Asian Indian population. The strongest linkage signal was seen for HDL cholesterol on chromosome 10q21.2. Our data also revealed linkage signals for total cholesterol on chromosomes 5p15.33 and 22q13.32, and for LDL cholesterol on 10p11.23 and 9q21.13. Some of these regions have been linked to lipid-related traits in recent GWA studies and contain other plausible candidate genes. The strongest peak for HDL cholesterol (p = 0.0011 at 10q21.2) suggests that this region may contain novel gene(s) influencing serum HDL cholesterol levels and other lipid traits. Denser and more informative genotyping in each of these regions will be important for discovering functional loci influencing blood lipids.
Genome-wide non-parametric linkage scans for type 2 diabetes using 321 diabetic pedigrees and 398 microsatellite markers (average spacing 9.26 cM). Each plot shows linkage signals (Kong and Cox LOD score) on the Y axis and microsatellite markers on the X axis. No chromosomal region showed a signal associated with T2D in these pedigrees.
(TIF)
Plot of Box-Cox coefficient lambda and the distribution of five quantitative traits including total cholesterol, triglycerides, HDL cholesterol, LDL cholesterol, and VLDL cholesterol before and after transformation.
(TIF)
Genome-wide autosomal linkage scan for five blood lipid phenotypes. Each plot shows the allele sharing LOD (−log10 p value) on the Y axis and chromosome distance (cM) on the X axis.
(TIF)
Linear regression model for quantitative traits.
(DOC)
This study would not have been possible without the generous support of the families who participated in this study and the non-governmental, social, and religious organizations who assisted in the recruitment and ascertainment of the SDS families. The authors are also thankful to Mr. Jagtar S. Sanghera for his repeated help in subject recruitment and for providing guidance and support in logistical and ethical issues. We are thankful to the National Heart Lung and Blood Institute (NHLBI)'s Mammalian Genotyping Service (Contract Number HV48141) for genotyping our study population.