Longitudinal Replication Studies of GWAS Risk SNPs Influencing Body Mass Index over the Course of Childhood and Adulthood

Genome-wide association studies (GWAS) have identified multiple common variants associated with body mass index (BMI). In this study, we tested 23 genotyped GWAS-significant SNPs (p-value<5*10-8) for longitudinal associations with BMI during childhood (3–17 years) and adulthood (18–45 years) for 658 subjects. We also proposed a heuristic forward search for the best joint effect model to explain the longitudinal BMI variation. After using false discovery rate (FDR) to adjust for multiple tests, childhood and adulthood BMI were found to be significantly associated with six SNPs each (q-value<0.05), with one SNP associated with both BMI measurements: KCTD15 rs29941 (q-value<7.6*10-4). These 12 SNPs are located at or near genes either expressed in the brain (BDNF, KCTD15, TMEM18, MTCH2, and FTO) or implicated in cell apoptosis and proliferation (FAIM2, MAP2K5, and TFAP2B). The longitudinal effects of FAIM2 rs7138803 on childhood BMI and MAP2K5 rs2241423 on adulthood BMI decreased as age increased (q-value<0.05). The FTO candidate SNPs, rs6499640 at the 5 ′-end and rs1121980 and rs8050136 downstream, were associated with childhood and adulthood BMI, respectively, and the risk effects of rs6499640 and rs1121980 increased as birth weight decreased. The best joint effect model for childhood and adulthood BMI contained 14 and 15 SNPs each, with 11 in common, and the percentage of explained variance increased from 0.17% and 9.0*10−6% to 2.22% and 2.71%, respectively. In summary, this study evidenced the presence of long-term major effects of genes on obesity development, implicated in pathways related to neural development and cell metabolism, and different sets of genes associated with childhood and adulthood BMI, respectively. The gene effects can vary with age and be modified by prenatal development. The best joint effect model indicated that multiple variants with effects that are weak or absent alone can nevertheless jointly exert a large longitudinal effect on BMI.


Introduction
Obesity, a major global public health concern, is a deleterious factor associated with different diseases, including cardiovascular events, type 2 diabetes mellitus, sleep -breathing abnormalities, and some cancers [1,2]. The prevalence of obesity in children and adults has increased strikingly over the past decades, leading to an increased likelihood of serious obesity-related complications and decreased life expectancy [3]. Thus, reducing obesity prevalence in adults and preventing overweight from early childhood is of great importance in terms of public health.
The predisposition to obesity varies widely in the population and is partially genetically determined. Intrauterine growth, measured by birth weight, can influence the probability of suffering from obesity in future life [4,5]. Heritability of the underlying genetic components ranges from 25% to 40% according to family studies, and from 50% to 80% according to twin studies.
Longitudinal studies in twins have shown genetic effects on obesity through middle age (over the course of 43 years) [6]. Recent advances in genome-wide association studies (GWAS) have evidenced the association of common variants of several different genes with body mass index. Among these genes, FTO has been repeatedly identified [7][8][9][10][11], and KCTD15, TMEM18, MTCH2, and NEGR1 have been considered to confer obesity risk through effects in the central nervous system [9]. Novel functions of three other genes, FAIM2, MAP2K5, and TFAP2B, were also found to be associated with obesity development [11].
The longitudinal effects of these GWAS findings over the course of childhood and adulthood, however, have not been widely studied, and each variant can explain only a very small proportion of BMI variance. It is not yet clear if there are joint effects among these variants and if these effects are age-dependent and modifiable by birth weight. To address these questions, we conducted a longitudinal replication study in the Bogalusa Heart Study cohort, using BMI data collected from 1973 to the present. We previously identified the longitudinal effects of FTO tag variants associated with birth weight [12]. In this follow-up study, we will examine 23 additional GWAS-significant SNPs and explore their joint effects using a heuristic forward searching strategy.

Materials and Methods
The Bogalusa Heart Study is a community-based study of the natural history of cardiovascular disease since childhood in the community of Bogalusa, Louisiana. The study is longitudinal, and the initial cross-sectional study began in 1973-1974. Subsequent cross-sectional surveys were conducted every 3-4 years during childhood and young adulthood. At present, seven major crosssectional surveys of children aged 3-17 years and six of adults aged 18-44 years, who were previously examined as children, have been conducted. This enabled us to study the evolution of obesity over a 40-year lifespan from childhood through middle-age. Information on personal health and medication history was obtained from participants through questionnaires. Standard protocols approved by the Institutional Review Board of the Tulane University Health Sciences Center were used for the collection of all data [13,14]. Written informed consent was obtained from the participants, or their parents (guardians) if the participants were children.
Participating subjects taken from the BHS for this replication study consist of 658 whites (308 males and 350 females). The birth weight data (kg) and gestational age (weeks) from menstrual history were retrieved from Louisiana State birth certificates. Their anthropometric measures were made in both childhood and adulthood. All examinations followed essentially the same protocols. Height and weight were measured twice to 60.1 cm and to 60.1 kg, respectively. The means of two independent measurements for height (in centimeters) and weight (in kilograms) were used to calculate BMI (kilograms per meter squared) as the measure of obesity. On average, study subjects were surveyed 4.3 times over childhood and 4.5 times over adulthood. Genotyping of BHS participants was based on the Illumina Human610 BeadChip and HumanCVD BeadChip (also referred to as IBC array), and completed in the laboratory of Genomic Medicine at the Scripps Research Institute in La Jolla, California [15]. Total 23 SNPs previously reported to be significantly associated with BMI in GWAS (p-value#5*10 28 ) were selected for this replicated longitudinal study, based on PubMed literature review and NHGRI GWAS Catalog [16,17].
Statistical analyses were performed using R (version 2.10.1). Linear modeling was used to examine the major effect of each polymorphism on birth weight, with gestational age and sex controlled as confounding covariates. Longitudinal effects of candidate SNPs on childhood (age,18 years) and adulthood (age$18 years) BMI were examined by linear mixed model using the R package lme4, with birth weight and age as confounding covariates. A random effect of age was included to control for repeated measures of BMI at different ages. Interaction tests were conducted to measure the modifications of variant effects by age and birth weight. For all studies, both general and additive genetic models were applied, which respectively treat a SNP as a categorical factor and as a continuous variable coded as 0, 1, or 2 copies of a particular reference allele. The false discovery rate (FDR) was taken to adjust for multiple tests [18][19][20]. The FDR qvalue of a variant association was defined as the smaller of two genetic models and a q-value#0.05 was considered significant.
A heuristic forward search was proposed to identify the best joint effect model among candidate variants through the addition of SNPs one by one. At each forward step, the next SNP and its corresponding reference allele were identified by the criterion that the number of reference alleles over all added SNPs must have the minimum modified Akaike Information Criterion (AIC) [21] of the linear mixed model. The modified AIC is defined as: 2*(the number of SNPs+the number of non-SNP parameters) -2*ln(L), where L is the maximized likelihood for the estimated model. The search will stop when the modified AIC of the best joint effect model cannot be further reduced in the next step. The BMI was permuted 10,000 times to generate distribution of a random best joint effect model and the empirical p-value was calculated as the percentage of minimized AIC#observed minimized AIC.

Results
Characteristics of participants at birth, first measurement during childhood, last measurement during adulthood, and p values for differences between genders are summarized in Table 1. Descriptions of 23 replicated SNPs are presented in Table 2. Hardy-Weinberg Equilibrium (HWE) was met for all SNPs in the study cohort. After adjustment for multiple tests, no significant associations were observed between candidate SNPs and birth weight (results not shown).
The significant interactions of SNP6age and SNP6birth weight are presented in Table 3. There were significant interactions between age and rs7138803 (FAIM2) for childhood BMI (qvalue = 7.56*10 24 , p-value = 2.46*10 25 ) and between age and rs2241423 (MAP2K5) for adulthood BMI (q-value = 0.02, pvalue = 0.002) with adjustment for sex and birth weight. Each year of increase in age is estimated to decrease genotype AG and GG effects by 0.07 (b1) and 0.17 (b2) for rs7138803, respectively, and genotype AG and GG by 0.05 (b1) and 0.10 (b2) for rs2241423, respectively. There were also significant interactions between birth weight and rs6499640 (FTO) for childhood BMI (q-value = 0.037, p-value = 0.008) and between birth weight and rs1121980 (FTO) for adulthood BMI (q-value = 0.043, p-value = 0.009), with adjustment for age and sex. Each kilogram of increase in birth weight is estimated to decrease genotype AG and GG effects by 0.91 (b1) and 0.04 (b2) for rs6499640, respectively, and genotype CT and TT by 0.88 (b1) and 1.75 (b2) for rs1121980, respectively.
The best joint effect model for longitudinal BMI was generated by sequentially adding SNPs, with adjustment for age, sex, and birth weight; these results are presented in Figure 4. The best model for childhood BMI contained 14 SNPs and increasingly explained the variance, from 0.17% for the first added SNP (CADM2 rs13078807) to 2.2% for the last added SNP (ETV5 rs7647305). Every additional copy of the reference allele was estimated to increase BMI by 0.27, with an empirical p-value = 0. The interaction test showed that a one year increase in age decreased the risk variant effect on BMI by 0.01 (p,0.05). No significant interaction was observed between the best model and birth weight. The best model for adulthood BMI contained 15 SNPs and increasingly explained the variance from 9.0*10 26 % for the first added SNP (NEGR1 rs2815752) to 2.7% for the last added SNP (ETV5 rs7647305). Each copy of the reference allele was estimated to increase BMI by 0.51, with an empirical pvalue = 0. No significant interaction was observed between the best model and age or between the best model and birth weight (p.0.05).

Discussion
Recent genome-wide association studies have identified several variants associated with BMI. We previously confirmed that FTO tag SNPs, including rs9939609, were associated with longitudinal adulthood BMI [12]. In this study, we extended the test to 23 additional GWAS-significant SNPs that were genotyped in the Bogalusa Heart Study cohort. BMI was repeatedly measured from age 3.32 to18 years for childhood, and from 18 to 45.30 years for adulthood. In contrast to other studies that focus on one measure of BMI per subject or measures at fixed ages, the repeated measures of BMI at broadly varied ages should cover more useful variation in the development of obesity determined by genetic components. Moreover, although the study sample included only 658 white subjects, multiple measures of BMI at different ages per subject increased the sample size of BMI observations to 2,708 for childhood and 2,844 for adulthood. By applying QUANTO [23] software, we confirmed that this study has 80% power to detect genetic variant or interaction effects, explaining 2.3% of the variance of the trait, given the following conditions: sample size of 658, type I error of 0.002 based on Bonferroni adjustment for 23 variants, and an additive genetic model. However, the study can detect much smaller variant or interaction effects than estimated here, because the longitudinal repeated measures serve to increase the number of observations about 4-fold, and both additive and general genetic models were examined, in order to reduce the risk of missing risk variants.
In this study, none of the 23 candidate variants was associated with birth weight (results now shown). The study observed that variants near or at certain genes were associated with BMI at different age ranges. Variants near or at BDNF (rs6265 and rs10767664), FAIM2 (rs7138803), TFAP2B (rs987237), and FTO (rs6499640) were associated with childhood BMI. Variants near or at MAP2K5 (rs2241423), MTCH2 (rs10838738), TMEM18 (rs2867125), and FTO (rs1121980 and rs8050136) were associated with adulthood BMI. Meanwhile, variants near or at KCTD15 rs29941 were associated with both childhood and adulthood BMI. Among these genes, BDNF [7,24,25], KCTD15 [9], TMEM18 [7,26], and FTO [7,27] were highly expressed in the central nervous system, and FAIM2 [28], TFAP2B [29], MAP2K5 [30], and MTCH2 [9,31] were involved in cell apoptosis and proliferation. These findings suggested that genes implicated in pathways related to neural development and cell metabolism can have long-term effects on obesity development, and that the genes underlying childhood and adulthood BMI can be different.
FTO, a gene on chromosome 16, is highly expressed in the brain, and may influence feeding regulation [7]. The associations of FTO   variants with BMI have been widely replicated in different studies. In this study, we confirmed that the FTO effects were longitudinal, influencing both childhood and adulthood BMI, repeatedly measured from 3-48 years of age. The results also indicated that the effects of FTO on obesity, due to a reduction in sensitivity of the appetite control system [32], can start from a young age, and can even be modified during prenatal development. However, no previous study provided the kind of direct evidence for an association between FTO and birth weight that was observed in this study. The FTO candidate SNPs (rs6499640, rs1121980, and rs8050136) were located at the 59 end, upstream of the tag SNPs (rs9939609, rs17820875, and rs860713) that were explored previously [12] ( Figure 2). These findings indicate that different regions of FTO are involved in childhood and adulthood obesity development; the upstream variant (rs6499640) was found to influence childhood BMI, and the downstream variants (rs1121980, rs8050136, rs9939609 [12], rs17820875 [12], and rs860713 [12]) were found to influence adulthood BMI. The risk variants for adulthood BMI, rs1121980, rs8050136, and rs9939609 [12], located between the first two exons of FTO, had strong pairwise LD (r 2 $0.85). Regional  recombination and LD plots show that they are located at the LD block between the upstream (25.46 cM/Mb) and downstream recombination hotspots (67.32 cM/MB). In contrast, the other three SNPs were located at or around the hotspot regions.
The interaction study showed that the longitudinal effects of FAIM2 rs7138803 on childhood BMI and MAP2K5 rs2241423 on adulthood BMI depend on age. The risk effects of the rs7138803 allele G (q-value = 7.6*10 24 , p-value = 2.5*10 25 ) and the rs2241423 allele G (q-value = 0.02, p-value = 0.004) were significantly reduced with increasing age. Both FAIM2 [28] and MAP2K5 [30] regulate cell metabolism. The results indicate that age-dependent genetic pathways related to cell apoptosis and proliferation can regulate longitudinal BMI changes. This interaction study also showed that effects of FTO rs6499640 on childhood BMI and rs1121980 on adulthood BMI were associated with birth weight. There was a significant negative association between birth weight and risk genotype effect, and our findings indicate that low birth weight increases the risk effect of rs6499640 GG and rs1121980 TT. Birth weight is an indicator of intrauterine development of the fetus. There is evidence that intrauterine development can influence the risk of later onset obesity [3]. The findings of the present study suggest that fetal nutrition can change the predisposition to obesity by modifying these obesity-related gene effects.
SNPs in high LD and the tests of same variants with different genetic models can all cause correlated results. To efficiently control for correlated multiple tests, we selected the FDR method to estimate the fraction of false positives among all tests (q-value). In contrast to the false positive rate (p-value) a measure of the rate of null associations deemed significant, the FDR cutoff point of q = 0.05 ensures that the error rate for findings deemed significant, but that are actually false positives, is smaller than 5%. In this study, both general and additive genetic models were used to examine the longitudinal effects of candidate SNPs. This strategy will improve the potential to identify inherited risk variants. For example, tests of FTO variants (rs6499640, rs1121980, and rs8050136) based on the general genetic model presented much stronger associations with longitudinal BMI (q-value = 1.8*10 24 ,0.03) than tests based on the additive genetic model (q-value = 0.07,0.10). Estimates of genotype effects based on the general model indicated that the pattern of inheritance is closer to codominance (Figure 3). Joint effects of multiple common variants at GWAS are often estimated as a cumulatively additive effect for the number of risk alleles, using weighted or unweighted methods [27,33,34]. However, these methods generally require that all variants are independently associated with risk [33]. In addition, because of allele heterogeneity, the risk alleles may not be the same in different populations. We therefore proposed the forward heuristic search method to identify the best joint effect model and the corresponding risk variants among the candidate variants. This method does not require mutual independence of the candidate variants. It attempts to identify a list of variants that are sequentially selected to make the best joint effect model and to continuously improve prediction for target phenotype based on minimization of AIC. In contrast to searching for SNPs that have the largest effects at each step, the forward searching step minimizing AIC has the advantage of maximally explaining BMI variation by analyzing joint effects of multiple common variants and confounding effects of non-genetic covariates together. For example, in the best joint effect model search analysis of longitudinal childhood BMI, rs2815752, the first selected SNP, explained a much smaller percentage of BMI variance (9.0*10 26 %) than rs2241423, which explained the largest percentage of BMI variance (0.90%). However, together with the three covariates, age, sex, and birth weight, rs2815752 best predicted the longitudinal BMI with the minimized AIC. The p-value for the best joint effect model is based on a permutation test that also controls for potential correlation among candidate variants.
Population structure and heterogeneity are critical issues in causing false-positive associations. In this study, all subjects were randomly selected from the white population living in the rural community of Bogalusa, Louisiana. All 23 variants have frequencies close to their frequencies in the HapMap white population (CEU) and meet Hardy-Weinberg equilibrium. All of the subjects had lived in a small town and in the same neighborhood since childhood, and environmental factors within the community were potentially homogeneous [35,36]. Thus, confounding effects due to unknown environment factors, genetic heterogeneity due to different ethnicity, and false-positive associations due to biased selection were reduced in this study. However, due to the restriction of available data, we did not consider socioeconomic status, physical activity, calorie intake, smoking, diabetes, hypertension, or family history for their potential influences on this longitudinal genetic study. Residual confounding due to these factors and the potential existence of unknown subpopulation stratification within the studied white population may affect our findings. The influence of these factors on the findings may require evaluation in follow-up studies.
In summary, we performed a longitudinal study to examine the long-term effects of candidate SNPs that had been previously reported as BMI-related risk variants in published GWAS. The present study confirmed that risk variants of genes implicated in pathways related to neural development and cell metabolism exert major longitudinal effects on BMI. We also found that there are different sets of risk variants associated with childhood and adulthood BMI; that the effects of FAIM2 and MAP2K5, both genes from a cell metabolism-related pathway, vary significantly with age; that FTO influences both childhood and adulthood BMI; that effects on obesity through appetite changes can be modified during prenatal development; and that multiple variants with weak or absent effects alone can jointly exert a large longitudinal effect. These findings provide new insights into biological processes influencing obesity development over the course of childhood and adulthood, and will help to guide future study of genetic pathways related to obesity development.