Novel Genetic Loci Identified for the Pathophysiology of Childhood Obesity in the Hispanic Population

Genetic variants responsible for susceptibility to obesity and its comorbidities among Hispanic children have not been identified. The VIVA LA FAMILIA Study was designed to genetically map childhood obesity and associated biological processes in the Hispanic population. A genome-wide association study (GWAS) entailed genotyping 1.1 million single nucleotide polymorphisms (SNPs) using the Illumina Infinium technology in 815 children. Measured genotype analysis was performed between genetic markers and obesity-related traits i.e., anthropometry, body composition, growth, metabolites, hormones, inflammation, diet, energy expenditure, substrate utilization and physical activity. Identified genome-wide significant loci: 1) corroborated genes implicated in other studies (MTNR1B, ZNF259/APOA5, XPA/FOXE1 (TTF-2), DARC, CCR3, ABO); 2) localized novel genes in plausible biological pathways (PCSK2, ARHGAP11A, CHRNA3); and 3) revealed novel genes with unknown function in obesity pathogenesis (MATK, COL4A1). Salient findings include a nonsynonymous SNP (rs1056513) in INADL (p = 1.2E-07) for weight; an intronic variant in MTNR1B associated with fasting glucose (p = 3.7E-08); variants in the APOA5-ZNF259 region associated with triglycerides (p = 2.5-4.8E-08); an intronic variant in PCSK2 associated with total antioxidants (p = 7.6E-08); a block of 23 SNPs in XPA/FOXE1 (TTF-2) associated with serum TSH (p = 5.5E-08 to 1.0E-09); a nonsynonymous SNP (p = 1.3E-21), an intronic SNP (p = 3.6E-13) in DARC identified for MCP-1; an intronic variant in ARHGAP11A associated with sleep duration (p = 5.0E-08); and, after adjusting for body weight, variants in MATK for total energy expenditure (p = 2.7E-08) and in CHRNA3 for sleeping energy expenditure (p = 6.0E-08). Unprecedented phenotyping and high-density SNP genotyping enabled localization of novel genetic loci associated with the pathophysiology of childhood obesity.


Introduction
Obesity is a complex disease influenced by genetic and environmental factors and their interactions. The current surge in childhood obesity in the U.S. is attributable to an interaction between a genetic predisposition toward efficient energy storage and a permissive environment of readily available food and sedentary behaviors [1]. Genetic architecture of common polygenic childhood obesity remains largely unknown. In genetic studies, the phenotypic description of the obese child usually has been limited to body mass index (BMI). BMI represents a composite trait of fat free mass (FFM) and fat mass (FM) and thus loci influencing BMI may differ from more direct measures of adiposity. In addition, markers of biological processes underlying the development of obesity such as dietary intake, energy expenditure and nutrient partitioning may be more effectual in identifying causal genetic variants [2]. In epidemiology studies, childhood obesity has been shown to be genetically correlated with glucose intolerance, hypertension, dyslipidemia, insulin resistance, chronic inflammation, and risk for fatty liver disease [3,4]. Identification of genes underlying these distinct patterns of association also may unravel important biological pathways involved in the pathophysiology of childhood obesity.
Genome-wide association studies (GWAS) have the potential to localize genetic loci contributing to obesity down to a few 100 kb [5]. In fact, a recent meta-analysis of the adult GIANT Consortium established 32 susceptibility BMI loci [6], several of which were confirmed in French and German children with extreme obesity [7] and European adults with early-onset obesity [8]. Two novel loci near OLFM4 and within HOXB5 were recently reported based on a meta-analysis of 14 pediatric studies of BMI [9]. These pediatric GWAS were confined to BMI and cohorts of European ancestry.
Here, we present findings from a GWAS designed to identify genetic variants influencing childhood obesity and its comorbidities in the Hispanic population. We have published evidence of heritability, pleiotropy amongst traits, and chromosomal regions implicated in obesity among Hispanic children in our VIVA LA FAMILIA Study [4,[10][11][12][13][14][15] In-depth phenotyping was performed to characterize the children, including anthropometry, body composition, growth, metabolites, hormones, inflammation, diet, energy expenditure and substrate utilization and physical activity. Our high-density SNP genotyping and phenotypes representing not only adiposity, but also biological processes associated with the development and consequences of childhood obesity enabled localization of novel genetic loci associated with the pathophysiology of childhood obesity.

Materials and Methods
The VIVA LA FAMILIA Study was designed to identify genetic variants influencing pediatric obesity and its comorbidities. Family recruitment and phenotyping were conducted in 2000-2005 in Houston, TX. All enrolled children and parents gave written informed consent or assent. The protocol was approved by the Institutional Review Boards for Human Subject Research for Baylor College of Medicine and Affiliated Hospitals and for Texas Biomedical Research Institute.
The VIVA LA FAMILIA study design and methodology have been described in detail elsewhere 4 . GWAS was performed on 815 children from 263 Hispanic families. The number of families by sibships was: 8 (one child), 40 (two children), 155 (three children), 48 (four children), 6 (five children), 3 (six children), 2 (seven children) and 1 (eight children). Each family was ascertained on an obese proband, defined as a BMI.95 th percentile, between the ages 4-19 y. Once identified, the obese proband and all siblings, 4 to 19 y of age, and their parents were invited to the Children's Nutrition Research Center for a tour and full explanation of the study prior to consenting. The cross-sectional, longitudinal study design consisted of baseline measurements, with a one-year followup to track children's growth and body compositional changes. Indepth baseline phenotyping included vital signs, anthropometry and body composition, diet and physical fitness, 24-h calorimetry, eating behavior, physical activity, fasting blood sampling for DNA and other biochemistries.
Briefly, blood pressure, heart rate and temperature were taken using an automated monitor. Anthropometric measurements were performed using standardized techniques according to Lohman [16]. Body composition was determined by dual-energy x-ray absorptiometry. Repeated measures after one year were used to compute growth velocities and changes in FM, FFM and energy storage [17]. Methods used to measure fasting blood and 24-h urinary biochemistries are described elsewhere [14,18,19]. A multiple-pass 24-h dietary recall was recorded on two occasions using Nutrition Data System (NDS) [20]. Eating behavior was assessed with a dinner meal and eating in the absence of hunger [21]. Room respiration calorimetry was used to make 24-h measurements of energy expenditure and substrate oxidation [13]. Maximum oxygen consumption (VO 2 max) and heart rate maximum (HRmax) were measured on a treadmill [22]. Actiwatch accelerometers were used to measure frequency, duration and intensity of physical activity [23].

Genotyping
The Illumina HumanOmni1-Quad v1.0 BeadChips were used to genotype 1.1 million single nucleotide polymorphisms (SNPs) in 815 children enrolled in the VIVA LA FAMILIA Study. Genotype calls were obtained after scanning on the Illumina BeadStation 500GX and analysis using the GenomeStudio software. Our genotyping error rate (based on duplicates) was 2 per 100,000 genotypes. Illumina has included sample-dependent and -independent controls on their BeadChips to test for accuracy of the procedure. The average call rate for all SNPs per individual sample was 97%.
SNP genotypes were checked for Mendelian consistency using the program SimWalk2 [24]. The estimates of the allele frequencies and their standard errors were obtained using SOLAR [25]. Specific SNPs were removed from analysis if they had call rates ,95% (n = 6,596), were uncommon alleles in less than 5 participants (26,537), deviated from Hardy-Weinberg equilibrium (n = 0), or were monoallelic (n = 56,448). The number of SNPs that passed quality control and were included in the GWA analysis was 899,892.

Genome-wide Association Analysis
Measured genotype analysis (MGA) was performed using the SOLAR program [25]. All phenotypes were transformed by inverse normalization to meet assumptions of normality. We obtained residuals using linear regression models adjusted for age, sex, their interaction and higher order terms. Because energy expenditure is strongly influenced by body weight across the age range of 4 to 19 years of age, total energy expenditure and sleeping energy expenditure were additionally adjusted for body weight. Also, observed energy intakes consumed in a snack and dinner were adjusted for total energy expenditure or estimated energy requirement, again to compensate for the wide range of ages in our cohort.
Each SNP genotype was converted in SOLAR to a covariate measure equal to 0, 1, or 2 copies of the minor allele (or, for missing genotypes, the weighted covariate based on imputation). These covariates were included in the variance-components mixed models for measured genotype analyses [26] versus null models that incorporated the random effect of kinship and fixed effects such as age, sex, their interaction and higher order terms. For the initial GWA screen, we tested each SNP covariate independently as a 1 degree of freedom likelihood ratio test. The p-value threshold for genome-wide significance (alpha = 0.05) was set at 1.01610 27 . The p-value threshold for genome-wide significance was computed for our family-based cohort that takes into account pedigree structure. The effective number of SNPs given linkage disequilibrium (LD) was calculated by the method of Moskvina and Schmidt [27] as implemented in SOLAR. LD was computed in SOLAR using all available information (all genotyped SNPs on all individuals). The average ratio of SNP effective number/actual number obtained from analysis of 1,989 non-overlapping bins of SNPs was used to calculate the genome-wide effective number of tests and thus the significance threshold for genome-wide association. We performed quantitative transmission disequilibrium test (implemented in SOLAR) to test for population stratification.

Results
GWAS was performed on 815 children from 263 Hispanic families. Mean 6 SD (range) was 25.267.5 (13.4 to 61.9) for the children's BMI, 85.6620.8 (4 to100) for BMI percentile and 1.5261.01 (23.0 to 4.5) for BMI z-score. Measured genotype analysis examined 129 obesity-related traits including BMI and adiposity as well as biological processes associated with the pathophysiology of childhood obesity; a description of the phenotypes is provided in Table S1. In our GWAS, population stratification was not significant and therefore did not confound our associations. A complete listing of all suggestive (p,1.0E-06) and genome-wide significant (p,1.0E-07) genetic variants and their associated traits are presented by chromosomal position in Table S2.

Anthropometry and Body Composition
BMI status, body composition and the growth process were assessed from repeated measurements of body weight, height, FFM and FM at baseline and after one-year ( Table 1). A nonsynonymous SNP (rs1056513; G1178S (NP_005790.2)) in INADL on chromosome 1 attained near genome-wide significance (p = 1.2E-07) for weight, and BMI, FFM, FM, trunk FM, and hip circumference (p = 8.3E-06 to 1.6E-07). SNP rs1056513 is common (MAF = 0.50) and accounted for 3% of the variance in body weight and body composition in this cohort. Weight z-score change was significantly associated with an intronic variant in COL4A1 on chromosome 13 (p = 4.7E-08) (Supporting Information Figure S1). Linear growth (height change) was associated with a variant in the 59UTR region of TSEN34 on chromosome 19 (p = 4.5E-08).

Diet, Energy Expenditure, Substrate Utilization and Physical Activity
Genome-wide significant variants were associated with energy intake, energy expenditure and substrate utilization, and physical activity ( Table 4). An intronic variant in TMEM229B on chromosome 14 was associated with ad libitum energy intake at dinner (p = 5.1E-08). An intronic variant in ARHGAP11A on chromosome 15 was associated with sleep duration (p = 5.0E-08). A SNP in the intronic region of C21orf34 was detected for respiratory quotient (RQ) during sleep (p = 5.3E-08). A variant was identified for accelerometer-measured light activity in the 59UTR region of RPL7P3 on chromosome 9 (p = 7.5E-08) and sedentary- light activity in 39UTR of CTCFL on chromosome 20 (p = 3.6E-08). In addition, adjusting for body weight, highly significant variants were identified for total energy expenditure (p = 2.7E-08) for MATK on chromosome 19 and sleeping energy expenditure (p = 6.0E-08) for CHRNA3 on chromosome 15.

Discussion
Extensive phenotyping and high-density SNP genotyping enabled localization of novel genetic loci associated with the pathophysiology of obesity in Hispanic children. Our unprecedented phenotypes represent not only adiposity, but also biological processes underlying the development of childhood obesity. The number of genome-wide significant genetic variants detected substantiates our analytical strategy and statistical power to identify variants.
Genome-wide significant and suggestive genetic variants were associated with anthropometric indices, body composition and growth rate. The common nonsynonymous SNP rs1056513 in INADL accounted for 3% of the variance in body weight and body composition, and is highly conserved across mammalian species. INADL encodes a PDZ domain-containing protein thought to play a role in tight junctions and adipocyte differentiation [28]. Weight z-score change observed over one-year was associated with an intronic variant in COL4A1, which encodes a basement membrane collagen. Height change was associated with a 59UTR variant inTSEN34, which is involved in tRNA splicing, a fundamental process required for cell growth and division [29].
Fasting serum glucose was associated with an intronic variant (MAF = 0.20; effect size 4.8%) in MTNR1B which encodes one of the melatonin receptors expressed in the retina and brain. Corroborating our findings, this variant (rs10830963) has been strongly associated with fasting glucose levels in adults [30,31] and children [32][33][34][35]. Melatonin, the ligand to MTNR1B, has an inhibitory effect on insulin secretion resulting in elevated fasting glucose.
Fasting serum triglycerides were associated with variants in the intron and 39UTRs within the APOA5-ZNF259 region. APOA5 is an important determinant of circulating triglyceride levels [36]. SNPs detected in our GWAS have been associated with triglycerides in other populations [37]. In the Kosrae population, triglyceride levels were associated with seven SNPs near APOC3/ A5 [38]. Variants in the APOA5-ZNF259 region (including rs3741298) were associated with HDL-C and ApoA-1 response to therapy with statins and fenofibric acid in patients with dyslipidemia [39]. Association with total antioxidants was shown for a variant in PCSK2, a proprotein convertase that is involved in proteolytic processing of neuropeptide and hormone precursors. PCSK2 is highly expressed in the islets of Langerhans, where it plays a role in the conversion of proinsulin to insulin.
A LD block of 23 SNPs in the XPA/FOXE1 (TTF-2) region was highly associated with fasting serum TSH (MAF = 0.25-0.29, effect size = 2.9-3.8%). The association is likely attributed to FOXE1 rather than XPA which is involved in DNA excision repair [40] and the skin disease xeroderma pigmentosum [41]. FOXE1 or thyroid transcription factor 2 (TTF-2) belongs to the 'forkhead' gene family and is involved in promoting the migration process or in repressing differentiation of the thyroid follicular cells until migration has occurred [42] and has been associated with thyroid cancer [43][44][45][46]. In the Kosrae population, plasma TSH levels were strongly associated with 10 SNPs in a region encompassing TTF-2 on chromosome 9 [38]. In a cohort of Caucasian adults, genetic variation in FOXE1(TTF-2) showed significant effects on Table 3. Measured genotype analysis for inflammation markers.  free T4 levels and borderline effects on serum TSH [47]. Three of the SNPs (rs925488, rs1877431, rs1588635) detected in the VIVA cohort was also reported by Lowe et.al (38); there was no overlap with Medici et al (43). Also, three of the SNPs (rs2805809, rs2668804, rs2808693) reported by Lowe et al. (38) was in high linkage disequilibrium (1.00) with rs2805771, rs2808699, rs7875482 detected in the VIVA cohort. This is a region of high LD, located in the 59UTR-region of the gene which has been shown to influence transcriptional regulation of FOXE1(TTF-2).
The functional variant is likely to be situated at this locus, but its exact localization remains to be elucidated in future studies, involving in-depth resequencing.
A unifying role of inflammation in chronic diseases including cardiovascular disease, diabetes, hypertension and obesity is emerging. In the VIVA GWAS, variants in six genes were associated with proinflammatory markers. Our findings support a major role of Duffy antigen receptor for chemokines (DARC) in the regulation of the circulating levels of the cysteine-cysteine (CC) chemokine, MCP-1 (effect size = ,10%). In circulation, MCP-1 is bound to erythrocyte DARC that acts as a chemokine receptor/ reservoir of proinflammatory cytokines [48]. Our strongest association for MCP-1 was with a nonsynonymous, highly conservedSNP in DARC (rs12075; MAF = 0.44) which replicated results from the Framingham Heart Study GWAS in Caucasian adults [49]. MCP-1 was also associated with SNPs in GREB1, DFNB31, RASGEF1A, and CCR3. Huber et al. found increased expression of CCR3 in subcutaneous and visceral adipose tissue in obese patients compared to lean controls [50]. Studies by Schnabel et al [49] and Naitza et al. [51] found suggestive associations of serum MCP-1 with CCR2 which is also a MCP-1 receptor and is in the same region of chromosome 3 as CCR3, and referred to as the CCR2/CCR3 cytokine receptor gene cluster.
Fasting serum IL-6 levels were associated with variants in the ABO gene that determines blood group. ABO blood group has been found to be associated with a number of biomarkers such as von Willebrand factor levels, Factor VIII levels, thrombomodulin, TNF-a and ICAM-1 [52,53], and in our case IL-6. The mechanism by which the A and B alleles affect these biomarkers is uncertain. In Caucasians with and without type 1 diabetes, a variant rs579459 near the ABO blood group gene accounted for 19% of the variance in E-selectin levels [52]; in our GWAS, this same variant was associated with IL-6 levels (p = 1.7E-07; Table  S2).
Total energy expenditure, adjusted for body weight, was significantly associated with rs12104221 in MATK which encodes a protein-tyrosine kinase involved in signal transduction pathways [54]. Sleeping energy expenditure, adjusted for body weight, was associated with rs8040868 in CHRNA3 (cholinergic receptor, neuronal nicotinic, alpha polypeptide 3), a member of a superfamily of ligand-gated ion channels that mediate fast signal transmission at synapses. After binding to acetylcholine, the receptor responds by opening ion-conducting channels across the plasma membrane, suggesting a plausible role of the coding variant (rs8040868) in energy metabolism [55]. Acetylcholine receptors activate proopiomelanocortin neurons that in turn activate melanocortin-4 receptors that are involved in the regulation of energy intake and expenditure [56], [57].
Sleep duration was associated with an intronic SNP in ARHGAP11A (rho GTPase activating protein 11A) that encodes a 1,023-amino acid protein that has a rhoGAP domain and tyrosine phosphorylation site. Evidence is emerging that obesity affects sleep, and that sleep patterns and disorders may have an effect on weight [58]. Although the mechanism is unclear, sleep disturbances are characteristic of Prader Willi Syndrome, caused by a deletion in 15q11-q13 that encompasses ARHGAP11A [59].
Sedentary-light physical activity was associated with a variant in CTCFL, an 11-zinc-finger factor involved in gene regulation. CTCFL forms methylation-sensitive insulators that regulate Xchromosome inactivation which may play a role in epigenetic regulation [60].
Variants in the 11 genes known to cause extreme early-onset obesity also may contribute to milder forms of obesity [61,62]. None of the genotyped variants in genes for monogenic obesity reached genome-wide significance in our GWAS, although several variants in CRHR1, CRHR2, MCHR1, MC3R, MC4R and POMC were nominally associated. Similarly, variants in or nearby the susceptibility genes for obesity did not attain genome-wide significance but several -CHST8, KCTD15, MTCH2, SFRS10, SH2B1 and TMEM18were nominally associated with obesityrelated traits, consistent with GWAS in children of European-American [63] or European ancestry [9]. The lack of genomewide significant findings for monogenic causes of obesity or susceptibility genes may be a function of our sample size and statistical power, or the presence of rare variants in the Hispanic population not represented in the Illumina platform.
The VIVA LA FAMILIA Study is unique in its consideration of a pediatric Hispanic population. We believe that our extensive phenotyping and genotyping enabled localization of novel genetic loci associated with obesity in Hispanic children, despite our relatively small sample size. Our phenotypes represent not only adiposity, but also biological processes underlying the development and consequences of childhood obesity. We applied a standard and stringent approach to our measured genotype analysis, but do agree replication is desirable. We believe we have identified genes that warrant further investigation.
In conclusion, unprecedented in-depth phenotyping and highdensity SNP genotyping enabled the localization of novel genetic loci associated with the pathophysiology of obesity in Hispanic children. Identified genome-wide significant loci: 1) corroborated genes implicated in other studies (MTNR1B, ZNF259/APOA5, XPA/FOXE1 (TTF-2), DARC, CCR3, ABO); 2) localized novel genes in plausible biological pathways (PCSK2, ARHGAP11A, CHRNA3); and 3) revealed novel genes with unknown function in obesity pathogenesis (MATK, COL4A1). As with other GWAS, the variants identified are likely not the actual causal variants but rather markers for genomic regions or loci in which the causal variants lie. Characterization of the underlying functional genetic variants contributing to this serious public health problem in Hispanic children will involve additional study. Figure S1 GWAS Manhattan plots are displayed for three phenotypes: total energy expenditure, adjusted for body weight, measured by 24-h room calorimetry; fasting serum thyroid stimulating hormone; and 1-y change in weight z-score. The genomic coordinates are shown along the X-axis, and the negative logarithm of the association pvalue for each SNP on the Y-axis. (TIFF)

Supporting Information
Table S1 Description of the phenotypes used in the genome-wide association study. (DOCX)