Interactions among genes involved in reverse cholesterol transport and in the response to environmental factors in dyslipidemia in subjects from the Xinjiang rural area

Gene-gene and gene-environment interactions may be partially responsible for dyslipidemia, but studies investigating such interactions in the reverse cholesterol transport (RCT) system are limited. We explored these interactions in a Xinjiang rural population (690 patients, 743 controls) by genotyping five SNPs in APOA1, ABCA1, and LCAT, which are involved in RCT, using the SNPShot technique. We conducted unconditional logistic regression analysis to evaluate associations and generalized multifactor dimensionality reduction (GMDR) to evaluate interactions. Results revealed significant differences in rs670 and rs2292318 allele frequencies between cases and controls (P<0.025). rs670 G allele carriers were more likely to develop dyslipidemia than A allele carriers (OR = 1.315, 95% CI: 1.067–2.620; P = 0.010). rs2292318 T allele carriers were more likely to develop dyslipidemia than C allele carriers (OR = 1.264, 95% CI: 1.037–1.541; P = 0.020). The gene-gene interaction model APOA1rs670-ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166 (P = 0.0107) and the gene-environment interaction model ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166-obesity-smoking (P = 0.0107) were the optimal predictors of dyslipidemia, suggesting that these factors can interact. Differences in A-C-A-C-A and G-G-G-T-G haplotype frequencies were observed between cases and controls (P<0.05). Serum lipid profiles could be partly attributed to RCT gene polymorphisms. Thus, dyslipidemia is influenced by APOA1, ABCA1, LCAT, environmental factors, and their interactions.


Introduction
Coronary heart disease and cerebrovascular disease are common public health problems in both high- and low-income countries [1]. Dyslipidemia is an important risk factor for these diseases [2] and is characterized by elevated levels of triglycerides (TG), total cholesterol (TC), and low-density lipoprotein cholesterol (LDL-C) and reduced levels of high-density lipoprotein cholesterol (HDL-C). Dyslipidemic individuals have also been reported to have a higher risk of coronary artery disease [3][4][5][6].
Dyslipidemia is a complex human disease that is influenced by both genetic and environmental factors [7]. Understanding the risk of developing complex diseases in a population involves studying a variety of factors. In particular, genetic and environmental factors interact to influence disease risk. Gene-gene interactions or gene-environment interactions are believed to produce combined effects in one or more genes or with one or more environmental factors that cannot be fully explained by their separate marginal effects [8]. Therefore, identification of gene-gene and gene-environment interactions will improve understanding of missing heritability [9,10].
Reverse cholesterol transport (RCT) is the process by which cholesterol is effluxed from peripheral tissues through acceptor particles, primarily HDL, to the plasma for subsequent uptake by the liver. RCT involves numerous lipid transfer proteins, enzymes, apolipoproteins, and membrane-bound receptors [11,12]. ATP-binding cassette A1 (ABCA1), apolipoprotein A-1 (APOA-1), and lecithin-cholesterol acyltransferase (LCAT) are known to play crucial roles in the RCT system. The genes APOA1, ABCA1, and LCAT, which encode these proteins, as well as genes responsible for regulating their transcription, are candidates for influencing variation in plasma lipid levels. Mutations in APOA1, ABCA1, and LCAT can influence transcription and translation, leading to abnormal lipid metabolism [13][14][15]. Recent studies have suggested a relationship between lipid levels and APOA1, ABCA1, and LCAT polymorphisms; however, results remain inconsistent across different races [16][17][18][19][20][21][22]. One major reason for the observed inconsistency is that different studies focused on the association of single gene polymorphisms of APOA1, ABCA1, or LCAT and risk factors with lipid levels [13-15, 20, 21, 23-26]. However, abnormal blood lipid profiles are caused by multiple genes and environmental factors, each of which can contribute a minor marginal effect. Based on previous observations, we hypothesized that dyslipidemia is jointly caused by the APOA1, ABCA1, and LCAT genes and their interactions with environmental factors. Therefore, to explore gene-gene and gene-environment interactions underlying dyslipidemia, we investigated five single nucleotide polymorphisms (SNPs; rs670, rs1800976, rs4149313, rs2292318, and rs1109166) located in APOA1, ABCA1, and LCAT based on a model RCT pathway. The study population consisted of 1,433 individuals comprising 690 dyslipidemic subjects and 743 normal controls.
In addition, we investigated the relationship between APOA1, ABCA1, and LCAT gene variants and serum lipid levels.

Study population
The present study was conducted between March 2009 and February 2010. A total of 3,049 subjects aged 18 years and above who were residents of Jiashi County, Kashgar Prefecture, and 5,692 subjects aged 18 years and above who were residents of Xinyuan County, Yili region, Xinjiang, were selected using the stratified cluster sampling method. Xinyuan County and Jiashi County are at a distance of ~3692 km (2294 miles) and ~4329 km (2690 miles) from Beijing, respectively. A total of 690 dyslipidemic patients were assigned to the case group, and 743 subjects with normal blood lipid levels were assigned to the control group. The majority of participants in the control group were age- and sex-matched to those in the case group at an approximate ratio of 1:1. Subjects with hypertension, diabetes, liver diseases, renal diseases, or malignant tumors were excluded from the study.
The protocol was approved by the Institutional Ethics Review Board (IERB) of the First Affiliated Hospital of Shihezi University School of Medicine (IERB No. SHZ2010LL01). Written informed consent was obtained from each participant.

Epidemiological survey and biochemical index determination
All study subjects completed a demographic information questionnaire during a face-to-face interview, which included details on past and/or current disease(s), family history, smoking and drinking habits, diet, and physical exercise status. Blood pressure, height, weight, waist circumference, and hip circumference were measured according to standardized methods [27].
All blood samples were tested for various biochemical parameters according to the 2007 China Adult Dyslipidemia Prevention Guide [28]. Laboratory analyses of blood samples included tests for fasting TG, LDL-C, TC, HDL-C, and fasting plasma glucose (FPG) levels, all of which were analyzed using an automatic biochemical analyzer (AU400, Olympus, Tokyo, Japan).

Diagnostic criteria
According to the China Adult Dyslipidemia Prevention Guide (2007) [28], dyslipidemia is defined as the presentation of any one or more of the following four items: (1) TG ≥ 2.26 mmol/L (200 mg/dL); (2) HDL-C < 1.04 mmol/L (40 mg/dL); (3) LDL-C ≥ 4.14 mmol/L (160 mg/dL); and (4) TC ≥ 6.22 mmol/L (240 mg/dL). The drinking rate was calculated by dividing the number of drinkers by the total number of participants in each group.
The Working Group on Obesity in China defines obesity as a BMI ≥ 28 kg/m² [29,30].
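The thresholds above can be expressed as a small classifier. The following is a minimal Python sketch; the function names `is_dyslipidemic` and `is_obese` and the example values are illustrative assumptions and are not part of the study's analysis pipeline.

```python
def is_dyslipidemic(tg, hdl_c, ldl_c, tc):
    """Apply the four 2007 guideline criteria (all values in mmol/L).

    A subject is classified as dyslipidemic if any one criterion is met.
    """
    return (
        tg >= 2.26        # (1) elevated triglycerides
        or hdl_c < 1.04   # (2) low HDL cholesterol
        or ldl_c >= 4.14  # (3) elevated LDL cholesterol
        or tc >= 6.22     # (4) elevated total cholesterol
    )


def is_obese(weight_kg, height_m):
    """Obesity per the Working Group on Obesity in China: BMI >= 28 kg/m^2."""
    bmi = weight_kg / height_m ** 2
    return bmi >= 28.0


# Hypothetical subjects, for illustration only.
print(is_dyslipidemic(tg=1.50, hdl_c=1.30, ldl_c=3.00, tc=5.00))
print(is_dyslipidemic(tg=1.50, hdl_c=0.95, ldl_c=3.00, tc=5.00))
print(is_obese(weight_kg=90.0, height_m=1.70))
```

Because the definition is an "any one or more" rule, a single abnormal value (here the low HDL-C of the second subject) is enough to assign case status.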

DNA extraction and genotyping analysis
Fasting venous blood (200 μL) was taken from each study subject, and a blood genomic DNA isolation kit (non-centrifugal columnar, Tiangen, Beijing, China) was used to isolate the genomic DNA. The extracted DNA was verified by gel electrophoresis (0.7% agarose). A NanoDrop spectrophotometer (Thermo Scientific, Waltham, MA, USA) was used for quantitative determination of DNA concentration and purity: a concentration ≥ 30 ng/μL and a purity (optical density [OD] ratio, OD260/OD280) of 1.7-2.0 were considered acceptable. Samples that met these criteria were diluted to 10-30 ng/μL using double-distilled water and then stored at -80 °C. The polymerase chain reaction (PCR) amplification, purification, and single-base extension steps are described in detail in our published paper [31] and at dx.doi.org/10.17504/protocols.io.mmtc46n. All SNP genotyping experiments were performed using SNPShot technology on an ABI3730xl system (Applied Biosystems, Foster City, CA, USA). ABI GeneMapper was used to call the genotypes and present the results.
For the five SNPs, the success rates were all 100%, and no evidence of departure from Hardy-Weinberg equilibrium was observed in control and case subjects using chi-squared-test analysis.
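As an illustration of the Hardy-Weinberg check mentioned above, the following minimal Python sketch runs a chi-square goodness-of-fit test on the genotype counts of one biallelic SNP. The function name `hwe_chi_square` and the example counts are hypothetical; the study's actual genotype counts are those reported in Table 2.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes at one biallelic SNP
    and returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    # Gene counting: a homozygote carries two copies, a heterozygote one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic locus: nothing to test
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(n_aa=120, n_ab=340, n_bb=283)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above the chosen significance level gives no evidence of departure from equilibrium, which is the outcome reported here for all five SNPs.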

Statistical analysis
Epidata 3.02 software (EpiData Association, Odense, Denmark) was used to establish a database, and the double entry method was used for data input and logic error detection.
Descriptive statistics, the t-test or Mann-Whitney test, and ANOVA or the Kruskal-Wallis test were used as appropriate after checking for normality with the Kolmogorov-Smirnov test. The chi-square test was used to evaluate differences between groups for categorical variables. The gene counting method was used to calculate genotype and allele frequencies. The chi-square test was used to test for Hardy-Weinberg equilibrium. Unconditional logistic regression analysis was conducted to calculate the odds ratio (OR) and 95% confidence interval (95% CI) with adjustments for age and sex. Gene-gene and gene-environment interactions were identified using GMDR 0.9 [32], which is based on the score of a generalized linear model, allows adjustment for discrete and quantitative covariates, and is applicable to both dichotomous and continuous data; in this study, ten-fold cross-validation was performed. The confounding factors, namely, age, sex, height, weight, and drinking, were included as covariates for interaction analysis. SHEsis software was used to analyze haplotypes [33]. Statistical significance was set at P<0.05. For multiple comparisons, the significance threshold was adjusted by Bonferroni correction and set at P<0.025 (0.05/2 = 0.025).
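To make the OR and Bonferroni calculations concrete, the sketch below computes an unadjusted allelic odds ratio with a Wald 95% CI from a 2x2 table, together with the corrected significance threshold. This is a simplification: the study used logistic regression adjusted for age and sex, so this unadjusted version would not reproduce the reported estimates. The function names and the example counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald CI from a 2x2 allele table.

    a: risk-allele count in cases      b: other-allele count in cases
    c: risk-allele count in controls   d: other-allele count in controls
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical allele counts, for illustration only.
or_, lo, hi = odds_ratio_wald_ci(a=900, b=480, c=880, d=606)
print(f"OR = {or_:.3f}, 95% CI: {lo:.3f}-{hi:.3f}")
print(bonferroni_threshold(0.05, 2))
```

The interval is built on the log scale, so it is symmetric around log(OR) rather than around the OR itself.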

General comparison of study subjects
The characteristics of the study individuals are provided in Table 1. There were significant differences between the two groups in anthropometric characteristics, such as height, weight, BMI, waist circumference, hip circumference, and waist-to-hip ratio, as well as in lipid profiles. Moreover, the case group had higher mean blood pressure levels than the control group (P<0.05), and the rates of obesity and smoking were lower in the control group than in the case group (P<0.001). No differences in drinking rates were detected between the two groups (P>0.05). In Table 1, values are presented as mean±SD or n (%). P values were obtained using the t-test or Wilcoxon rank-sum test for continuous variables and the chi-square test for categorical variables; P<0.05 was considered significant.

Genotype and allele frequencies
A comparison of the genotype and allele frequencies of the five SNPs between the control and case groups is presented in Table 2. No significant differences in the genotype and allele frequencies of ABCA1rs1800976, ABCA1rs4149313, and LCATrs1109166 were observed (P>0.025). Compared with carriers of the APOA1rs670 A allele, G allele carriers were significantly more likely to develop dyslipidemia (OR = 1.315, 95% CI: 1.067-2.620; P = 0.010). Carriers of the LCATrs2292318 T allele were more likely to develop dyslipidemia than C allele carriers (OR = 1.264, 95% CI: 1.037-1.541; P = 0.020).
Table 3 shows the genotypes of the five SNPs and the corresponding serum lipid profiles in normal and dyslipidemic individuals, presented as median and interquartile range (25th, 75th percentile) and adjusted for sex and age. In the control group, carriers of the rs670 AA genotype showed higher TG levels than AG and GG genotype carriers (P = 0.014). Among normal individuals, those with the rs2292318 TT genotype showed higher TG levels than those with the CT and CC genotypes (P = 0.003). For the remaining loci, neither the control group nor the case group showed significant differences in TG, TC, LDL-C, or HDL-C levels across genotypes.
Table 4 summarizes the results of the generalized multifactor dimensionality reduction (GMDR) analysis of the association of gene-gene and gene-environment interactions with dyslipidemia. GMDR was used to assess gene-gene interactions among the five SNPs located in the APOA1, ABCA1, and LCAT genes with adjustments for age, sex, smoking and drinking status, and blood pressure. After adjusting for covariates, the best combination was determined to be APOA1rs670-ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166, suggesting that these four variants jointly contributed to the etiology of dyslipidemia.
For the gene-environment analysis, which included age, sex, and blood pressure as covariates, GMDR identified the five-factor interaction model ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166-obesity-smoking as the best model for dyslipidemia, with a testing balanced accuracy of 0.5313 and a maximum cross-validation consistency of 10/10 (P = 0.0107).
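The two summary statistics quoted for the best model can be illustrated with a short Python sketch. Balanced accuracy is the mean of sensitivity and specificity, so 0.5 corresponds to chance-level classification; cross-validation consistency counts how many of the folds selected the same model. The function names and inputs below are hypothetical, and the sketch does not reproduce the score-based, covariate-adjusted classification performed by GMDR itself.

```python
from collections import Counter


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity on one testing fold."""
    sensitivity = tp / (tp + fn)  # fraction of cases classified as high risk
    specificity = tn / (tn + fp)  # fraction of controls classified as low risk
    return (sensitivity + specificity) / 2


def cross_validation_consistency(best_model_per_fold):
    """Return the most frequently selected model and the number of folds choosing it."""
    model, count = Counter(best_model_per_fold).most_common(1)[0]
    return model, count


# Hypothetical fold results, for illustration only.
print(balanced_accuracy(tp=40, fn=29, tn=38, fp=36))
print(cross_validation_consistency(["model_A"] * 9 + ["model_B"]))
```

Using balanced accuracy rather than raw accuracy keeps the measure interpretable when the numbers of cases and controls differ, as they do in this sample.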

Haplotype analysis
Haplotype analysis results for the five SNPs are shown in Table 5. Global haplotype frequencies were significantly different between the control and case groups (P<0.001). The case group had a significantly higher frequency of the A-C-A-C-A haplotype than the control group, whereas the G-G-G-T-G haplotype was less frequent in the case group than in the control group (P<0.05).

Discussion
Dyslipidemia is a complex human disease that is influenced by both genetic and environmental factors [7]. Interactions among multiple genetic and environmental factors are known to exert joint effects, which serve as an important biological basis for understanding complex diseases and phenotype variation [34][35][36][37]. Recent studies have focused on detecting factors that, owing to their interactions with other genetic (or environmental) factors, cannot be identified using standard single-locus tests. Thus, going beyond single-marker analysis to unravel the so-called "missing heritability" [38] will facilitate understanding of the mechanisms by which genetic and environmental factors contribute to the formation of complex traits. In the current study, we explored associations of the APOA1, ABCA1, and LCAT genes with dyslipidemia and focused on gene polymorphisms involved in the RCT system. A total of 690 patients and 743 control subjects were analyzed to evaluate gene-gene and gene-environment interactions between dyslipidemia and single-nucleotide polymorphisms (SNPs) that influence lipid metabolism and to elucidate the mechanisms underlying missing heritability. We investigated the contribution of SNPs from RCT-related genes to dyslipidemia susceptibility. The APOA1rs670 and LCATrs2292318 polymorphisms were associated with dyslipidemia risk. The gene-gene interaction APOA1rs670-ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166 and the gene-environment interaction ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166-obesity-smoking were identified based on GMDR analysis. Our findings suggest that interactions among APOA1, ABCA1, and LCAT confer genetic susceptibility to dyslipidemia in the Xinjiang rural area. We consider the present study to be explorative and hypothesis-generating and thus find the results interesting. However, a more convincing conclusion can only be reached by further independent replication studies.
APOA-1, ABCA1, and LCAT play important roles in the RCT system, as well as in lipid metabolism. APOA-1 functions as a cholesterol acceptor, and APOA1 variants are known to be associated with lipid levels. ABCA1 transports cellular cholesterol across the cell membrane to plasma acceptor particles, such as APOA-1 [39,40]. LCAT converts unesterified cholesterol to cholesterol esters during the process of HDL maturation [14]. Mutations in APOA1, ABCA1, and LCAT can cause alterations in transcription and translation, eventually leading to abnormal lipid metabolism [13][14][15].
Dyslipidemia is determined by a plurality of factors. In the present study, we found that anthropometric characteristics, such as height, weight, BMI, waist circumference, hip circumference, and waist-to-hip ratio, were significantly higher in the dyslipidemia case group than in the control group. Furthermore, the two groups showed significant differences in blood pressure measurements, consistent with the results of a previous study in a different population, in which an association between blood pressure and serum lipid levels was stably observed [41]. Individuals in the case group also had a higher incidence of obesity than those in the control group. These findings are consistent with other reports, in which obesity based on waist circumference and weight was significantly more common in cases than in controls [23][24][25]. In addition, the smoking rate was lower in the control group than in the case group, which indicates that lifestyle factors, such as smoking, are associated with dyslipidemia. The above findings are consistent with the results of a previous study [42], which likewise detected no differences in drinking habits between the two groups. This implies that drinking status is not the primary driver of abnormal blood lipid levels, which is inconsistent with the results reported by Rao et al. [43].
Our study showed that the genotype frequencies were in Hardy-Weinberg equilibrium, suggesting that the sample can represent the target population. Allelic frequencies of the five target SNPs also varied among different races in the NCBI database (URL: https://www.ncbi.nlm.nih.gov/snp/), indicating significant racial/ethnic variation in allelic frequencies in the APOA1 and LCAT genes. Results also showed significant differences in the distributions of rs670 and rs2292318 between the control and case groups. For rs670, G allele carriers were significantly more likely to develop dyslipidemia than A allele carriers. Carriers of the rs2292318 T allele were more likely to develop dyslipidemia than C allele carriers. These results suggest an association between APOA1 and LCAT polymorphisms and dyslipidemia.
Previous studies have demonstrated the association of polymorphisms in APOA1, ABCA1, and LCAT with serum lipid levels in humans. However, the obtained data remain inconsistent. In previous reports, the APOA1 rs670 polymorphism was found to be correlated with dyslipidemia [22] and HDL-C levels [20,21]. The ABCA1 rs1800976 polymorphism was associated with HDL-C levels in the Suita population, but a different study found no association between the ABCA1 rs1800976 genotype and lipid levels [18,19]. The ABCA1 rs4149313 polymorphism showed favorable effects on HDL-C levels [44,45]. Significant associations were observed between LCAT rs2292318 polymorphisms and diabetic dyslipidemia [17], whereas the association between LCAT rs2292318 and HDL cholesterol was not statistically significant in a sample population of French Canadian ancestry [16]. In the present study, we demonstrated an association between the APOA1, ABCA1, and LCAT polymorphisms and plasma lipid levels. Our results are consistent with several previous studies that provided evidence of the association between these gene variants and serum lipid levels.
Abnormal blood lipid profiles are influenced by multiple genes and environmental factors, each of which can contribute a minor marginal effect. To further assess their combined effects, gene-gene and gene-environment interactions were explored using GMDR, which is based on the score of a generalized linear model and of which the original MDR method is a special case [32]. MDR has proven to be a useful statistical tool for detecting gene-gene interactions while avoiding the "curse of dimensionality" [36]. However, existing MDR approaches do not allow for covariates. GMDR, which is based on MDR, allows adjustment for discrete and quantitative covariates and is applicable to both dichotomous and continuous data [46]. Given that the GMDR method is immune to bias and invalidity in the presence of population heterogeneity [47], it provides greater flexibility for a population-based study design [32] and is therefore highly useful for revealing genetic architecture in terms of the gene-gene and gene-environment interactions that are responsible for dyslipidemia.
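The dimensionality-reduction step at the core of MDR can be sketched as follows. This is a minimal illustration of the classic MDR labelling rule only, under the assumption of a simple case:control ratio threshold; GMDR replaces the raw case and control counts with covariate-adjusted score statistics, which this sketch does not attempt. The function names, genotype coding, and example data are hypothetical.

```python
from collections import Counter


def mdr_high_risk_cells(genotypes, status):
    """Label multilocus genotype cells as high risk, as in classic MDR.

    genotypes: one tuple of genotype codes per subject, e.g. (0, 2) for two SNPs
    status:    1 for case, 0 for control, in the same order as genotypes
    A cell is high risk when its case:control ratio is at least the overall
    case:control ratio of the sample.
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    high_risk = set()
    for cell in set(cases) | set(controls):
        # Cross-multiplied form of cases/controls >= n_cases/n_controls,
        # which avoids dividing by zero in cells without controls.
        if cases[cell] * n_controls >= controls[cell] * n_cases:
            high_risk.add(cell)
    return high_risk


def classify(genotype, high_risk):
    """Reduce a multilocus genotype to one binary attribute: 1 = high risk."""
    return 1 if genotype in high_risk else 0


# Hypothetical two-SNP data: four cases followed by four controls.
genotypes = [(0, 0), (0, 0), (0, 0), (1, 1), (0, 0), (0, 1), (1, 1), (1, 1)]
status = [1, 1, 1, 1, 0, 0, 0, 0]
high_risk = mdr_high_risk_cells(genotypes, status)
print(high_risk, classify((0, 0), high_risk))
```

Collapsing all genotype combinations into a single high-risk/low-risk attribute is what lets the method evaluate multi-factor models without estimating a separate parameter for every cell.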
In the present study, results of the GMDR analysis for the gene-gene interactions showed that the combination APOA1rs670-ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166 was the best model irrespective of whether a dichotomous (sex, smoking, and drinking habits) or continuous (age and blood pressure) variable was measured after adjusting for covariates; this suggests that these four variants together are associated with the etiology of dyslipidemia. For the gene-environment analysis adjusted for covariates, including age, sex, and blood pressure, the five-factor interaction model, ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166-obesity-smoking, was determined to be the best model for dyslipidemia. Results indicated an interaction comprising ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166-obesity-smoking, which could have influenced dyslipidemia outcomes. Thus, our findings agree with previous reports showing that dyslipidemia is a complex trait caused by multiple environmental and genetic factors and their interactions [48][49][50]. Conflicting results were obtained from studies that focused on the association of single gene polymorphisms in genes involved in RCT and other risk factors with lipid levels [13-15, 20, 21, 23-26]. Thus, the contributions of gene-gene and gene-environment interactions to dyslipidemia can provide a more mechanistic explanation for this condition.
Haplotype analysis of APOA1, ABCA1, and LCAT using SHEsis software showed that the A-C-A-C-A haplotype was significantly more frequent in the case group than in the control group, whereas the G-G-G-T-G haplotype was less frequent in the case group than in the control group. These results indicate that the A-C-A-C-A haplotype can serve as a risk factor for dyslipidemia, whereas the G-G-G-T-G haplotype can serve as a protective factor against dyslipidemia.
Admittedly, the present study has certain limitations. First, serum lipid levels are regulated by multiple environmental and genetic factors and their interactions. However, we only investigated the effects on dyslipidemia of five SNPs located in three genes, several environmental factors, and their interactions. Other environmental and genetic factors and their interactions were not examined; thus, "missing heritability" is still expected. Second, the current study was conducted in the Xinjiang region, which is located in the remote western part of China. No independent replication studies on the interactions between SNPs and environmental factors influencing dyslipidemia have been conducted for this population. Thus, the observed associations and interactions between these SNPs and dyslipidemia in the studied population may not represent the major characteristics of this condition in other population groups. Elucidating epistasis in multidimensional space to infer gene function remains an interpretative challenge. It may be more practical to derive biological explanations for epistasis by investigating the physiological effects of the target polymorphisms [46,51].

Conclusions
Our findings showed that the APOA1rs670 and LCATrs2292318 polymorphisms are associated with dyslipidemia. The gene-gene interaction model comprising APOA1rs670-ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166 and the gene-environment interaction model comprising ABCA1rs1800976-ABCA1rs4149313-LCATrs1109166-obesity-smoking were found to be the best predictive models for dyslipidemia. In addition, the A-C-A-C-A haplotype can act as a risk factor for dyslipidemia, whereas the G-G-G-T-G haplotype can serve as a protective factor against dyslipidemia.

Writing - original draft: Xinping Wang.