Analysis of coding variants in the human FTO gene from the gnomAD database

  • Mauro Lúcio Ferreira Souza Junior ,

    Contributed equally to this work with: Mauro Lúcio Ferreira Souza Junior, Jaime Viana de Sousa, João Farias Guerreiro

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft

    Affiliation Laboratory of Human and Medical Genetics, Institute of Biological Sciences, Federal University of Pará, Belém, PA, Brazil

  • Jaime Viana de Sousa ,

    Contributed equally to this work with: Mauro Lúcio Ferreira Souza Junior, Jaime Viana de Sousa, João Farias Guerreiro

    Roles Conceptualization, Data curation, Methodology, Resources, Software, Supervision, Validation, Visualization

    Affiliation Federal Rural University of Amazon, Capanema Campus, PA, Brazil

  • João Farias Guerreiro

    Contributed equally to this work with: Mauro Lúcio Ferreira Souza Junior, Jaime Viana de Sousa, João Farias Guerreiro

    Roles Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Validation, Visualization, Writing – review & editing

    joao.guerreiro53@gmail.com

    Affiliation Laboratory of Human and Medical Genetics, Institute of Biological Sciences, Federal University of Pará, Belém, PA, Brazil

Abstract

Single nucleotide polymorphisms (SNPs) in the first intron of the FTO gene reported in 2007 continue to be the known variants with the greatest effect on adiposity in different human populations. Coding variants in the FTO gene, on the other hand, have been little explored, although data from complete sequencing of the exomes of various populations are available in public databases and provide an excellent opportunity to investigate potential functional variants in FTO. In this context, this study aimed to track nonsynonymous variants in the exons of the FTO gene in different population groups employing the gnomAD database and analyze the potential functional impact of these variants on the FTO protein using five publicly available pathogenicity prediction programs. The findings revealed 345 rare mutations, of which 321 are missense (93%), 19 are stop gained (5.6%) and five mutations are located in the splice region (1.4%). Of these, 134 (38.8%) were classified as pathogenic, 144 (41.7%) as benign and 67 (19.5%) as unknown. The available data, however, suggest that these variants are probably not associated with BMI and obesity, but instead, with other diseases. Functional studies are, therefore, required to identify the role of these variants in disease genesis.

Introduction

The fat mass and obesity-associated gene, also known as FTO (alpha-ketoglutarate-dependent dioxygenase), was the first obesity susceptibility gene identified through genome-wide association studies (GWAS) and remains the locus with the greatest effect on adiposity in different human populations. Four independent GWAS published in 2007 reported a significant association of body mass index (BMI) and body fat with common FTO variants, specifically a group of single nucleotide polymorphisms (SNPs) in the first intron of the gene. The association was first identified in Europeans in 2007 [1], and shortly thereafter the link with BMI and obesity risk was confirmed by three other studies [2–4]. This association has been replicated in other populations (Asians, Hispanics and Native Americans), although conflicting results have been observed in African/African-American populations [5, 6]. The frequencies of risk alleles vary substantially between ethnic groups, which may explain, to some degree, the differences in the estimated effects of these alleles on BMI. Different populations are characterized by specific patterns of tightly linked SNP haplotypes associated with the phenotype [7].

The FTO gene, located on chromosome 16q12.2, is expressed in a wide range of tissues, as it is a maintenance gene that maintains the CpG islands in gene promoters [6]. It contains nine exons and spans approximately 410 kb, which is unusually large for a maintenance gene. It encodes a 2-oxoglutarate-dependent oxygenase that performs oxidative RNA/DNA demethylation, and available data suggest that FTO plays a role in the arcuate nuclei of the hypothalamus, where it mediates energy balance and eating behavior [7]. The common SNPs associated with BMI and obesity lie within a 47 kb region that covers parts of the first two introns and exon 2 of FTO [1]. This intronic location indicates that FTO does not exert its effects on BMI through functional mutations that alter the amino acid sequence of the FTO protein. The region is more likely to play a role in transcription regulation through its effect on the expression of the FTO gene and/or neighboring genes, such as IRX3 and IRX5, specifically in adipocytes. Experimental data [8] have confirmed that FTO intron 1 is involved in enhancer activation, as previously described by another study [9], and regulates the expression of the IRX3 and IRX5 loci, which are vital for adipocyte maturation [6, 7].

Data obtained in 2017 from the NHGRI-EBI GWAS Catalog [10], an online database that offers a curated collection of published GWAS evaluating at least 100,000 single nucleotide polymorphisms (SNPs), revealed a group of 15 SNPs associated with obesity in intron 1 of the FTO gene [11]. A total of 61 different intronic SNPs associated with BMI, body fat distribution and other obesity characteristics were identified from GWAS. Almost all are present in Europeans, African/African-Americans, East Asians, South Asians and Latino/Admixed Americans (admixed populations in Latin America), and a smaller share (19/61) in Native Americans (Peruvian Amerindians) [12]. Although the available data indicate that the SNPs associated with obesity are located in the first intron of the FTO gene, it is important to understand exon mutations in order to evaluate their effects not only on obesity but also on other genetic diseases, as indicated by some studies. For example, a rare non-synonymous exonic mutation (p.Arg322Gln) has been associated with congenital malformations in two siblings from a consanguineous Yemeni family [13], whereas another rare non-synonymous exonic mutation (p.Arg316Gln) has been associated with a lethal autosomal recessive syndrome that impairs the normal development of the central nervous and cardiovascular systems [14]. The FTO p.Ala134Thr variant has been associated with thiopurine-induced leukopenia in patients with inflammatory bowel disease [15], and two missense variants (p.Cys326Ser and p.Ser256Asn) were associated with reduced semen quality [16]. Another missense mutation in the FTO gene has been associated with microcephaly, developmental delay, behavioral abnormalities, dysmorphic facial features, hypotonia and several other phenotypic abnormalities in a five-year-old girl born from a consanguineous marriage [17].

At present, complete exome sequencing data for continental populations are available in public databases, such as the Genome Aggregation Database (gnomAD), 1000 Genomes and the NHLBI Exome Sequencing Project (Exome Variant Server), providing an excellent opportunity to investigate the pathogenicity of these mutations and potentially functional alleles in the FTO gene. In this context, the aim of this study was to track nonsynonymous variants in FTO gene exons in different population groups using the gnomAD database and to analyze the potential functional effects of these variants.

Methodology

FTO gene data available in gnomAD 2.1 were downloaded on May 1, 2021 from https://gnomad.broadinstitute.org/. gnomAD, the Genome Aggregation Database, was developed by an international coalition of researchers to aggregate and harmonize exome and genome sequencing data from a wide range of large-scale sequencing projects and to make data summaries available to the scientific community. Formerly known as the Exome Aggregation Consortium (ExAC), the project began in 2012 and expanded on the work of the 1000 Genomes Project and others that cataloged human genetic variation [18, 19]. The reference genome used for sequence alignment was GRCh37/hg19, and alignment was performed using the GATK tool [20]. Variants were analyzed using the following publicly available pathogenicity prediction programs: FATHMM [21], PROVEAN [22], SIFT [23], POLYPHEN-2 [24] and PANTHER [25]. Only variants described at the protein level, in the form p.Ala405Val, p.Tyr23Cys, p.Ser256Asn and so on, were submitted to the predictors. Synonymous mutations and variants in intronic regions were excluded from the analyses, as were the c.-60C>T, c.-56dupG and c.-48C>T mutations. The criteria employed to classify the mutations were as follows: benign, when three or more predictors classified the variant as benign; pathogenic, when three or more predictors classified the variant as pathogenic; inconclusive, when at least one predictor was unable to analyze the variant while two classified it as pathogenic and two others classified it as benign, or when multiple predictors made no prediction.
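
The classification criteria above amount to a majority vote over the five predictors. The Python sketch below is one possible encoding of that rule and is not the authors' own script: the function name `classify_consensus`, the label strings and the use of `None` for a predictor that returned no call are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def classify_consensus(calls: Sequence[Optional[str]]) -> str:
    """Combine per-predictor calls into a single label.

    `calls` holds one entry per predictor: "pathogenic", "benign",
    or None when the predictor could not analyze the variant
    (the None convention is an assumption of this sketch).
    """
    counts = Counter(c for c in calls if c is not None)
    if counts["pathogenic"] >= 3:
        return "pathogenic"
    if counts["benign"] >= 3:
        return "benign"
    # Remaining cases: a 2-vs-2 split with one missing call, or
    # several predictors that made no prediction at all.
    return "inconclusive"


# Hypothetical call vectors (not taken from the study's tables), ordered
# FATHMM, PROVEAN, SIFT, POLYPHEN-2, PANTHER.
examples = {
    "majority pathogenic": ["benign", "benign", "pathogenic", "pathogenic", "pathogenic"],
    "majority benign": ["benign", "benign", "benign", "pathogenic", None],
    "split with one missing": ["pathogenic", "benign", "pathogenic", "benign", None],
    "mostly unanalyzed": [None, None, None, "pathogenic", "benign"],
}

for name, calls in examples.items():
    print(f"{name}: {classify_consensus(calls)}")
```

With five predictors, at most one label can reach three votes, so the order of the two threshold checks does not affect the result.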

ClinVar [26] is one of the most commonly used databases for the clinical and pathological analysis of mutations. Although the vast majority of the mutations identified here are not reported in this database, those classified as pathogenic or benign were searched for in ClinVar to support the findings of this study or published literature reports.

Results

In total, 345 nonsynonymous mutations were identified in the gnomAD database, of which 321 were missense mutations (93%), 19 were stop-gain mutations (5.6%) and five were splice region mutations (1.4%). Of the 345 identified mutations, 134 (38.8%) were classified as pathogenic, 144 (41.7%) as benign and 67 (19.5%) as inconclusive, based on in silico analyses by five pathogenicity predictors (Table 1).

Table 1. Number of types of mutations and their characterizations according to the employed predictors.

https://doi.org/10.1371/journal.pone.0248610.t001
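
The counts summarized in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the helper name `share` is an assumption for illustration. Values recomputed this way may differ from the reported percentages in the last decimal place because of rounding.

```python
def share(count: int, total: int) -> float:
    """Return `count` as a percentage of `total`, rounded to one decimal."""
    return round(100 * count / total, 1)


TOTAL = 345  # nonsynonymous variants reported in the text

by_type = {"missense": 321, "stop gained": 19, "splice region": 5}
by_class = {"pathogenic": 134, "benign": 144, "inconclusive": 67}

# Both breakdowns should account for every variant exactly once.
assert sum(by_type.values()) == TOTAL
assert sum(by_class.values()) == TOTAL

for label, n in {**by_type, **by_class}.items():
    print(f"{label}: {n}/{TOTAL} = {share(n, TOTAL)}%")
```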

Information on the position, nucleotide change, amino acid change, type of mutation, allele count, number of alleles and frequency of each variant in Latino/Admixed American, South Asian, East Asian, African/African-American, European (non-Finnish) and European (Finnish) populations is presented in S1 Table. Of the 38 mutations identified in South Asians, 27 were classified as pathogenic, 27 as benign and 16 as inconclusive. The most frequent mutation (p.Arg123Trp), classified as inconclusive, was the only one detected at a frequency ≥ 1% (1.7%); the other mutations were detected at very low frequencies. The most common pathogenic mutation was p.Glu325Val (0.69%). In East Asian populations, 45 very rare mutations were identified, 12 classified as pathogenic, 22 as benign and 11 as inconclusive. The most common mutation, p.Ala134Thr (0.02), was classified as benign. The variants classified as pathogenic exhibited frequencies in the range of 0.0001. Eighty-eight mutations with very low frequencies, most in the range of 1/10,000 or more, were detected in African/African-American populations. Of these, 25 were classified as pathogenic, 42 as benign and 21 as inconclusive. In this population, three mutations exhibited frequencies greater than 1%, namely p.Ala405Val, p.Tyr23Cys and p.Gly182Ala, estimated at 0.02, 0.04 and 0.01, respectively. In Europeans (non-Finnish), a greater number of mutations was detected (188), but all at very low frequencies, most in the range of 1/10,000 or more. Of these, 69 were classified as pathogenic, 82 as benign and 37 as inconclusive. In Europeans (Finnish), on the other hand, only 21 variants were found, 12 classified as benign, six as pathogenic and three as inconclusive. The most common variant, p.Asp332Gly, classified as pathogenic, exhibited a frequency of 0.005, while the other variants displayed frequencies in the range of 0.0001.

A total of 75 rare mutations were identified in Latino/Admixed American populations, 30 classified as pathogenic, 29 as benign and 14 as inconclusive. The most common variant in Latino/Admixed American populations was p.Ser256Asn (0.002), classified as benign. Globally, the most common variant, p.Tyr23Cys, was found in 773 individuals (0.003). It was most common in African/African-Americans (0.021) and was also detected in Europeans (non-Finnish) and Latino/Admixed Americans. However, the pathogenicity of this variant was classified as inconclusive by all five predictors employed herein (S2 Table). A total of 12 variants were detected in the Ashkenazi Jewish population, a name used to refer to Jews from Central and Eastern Europe: eight were benign mutations, one was pathogenic and three were inconclusive. A benign p.Ala163Thr mutation was found in 70 Ashkenazi Jewish individuals, with a frequency of 0.006.
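
The frequencies quoted above are the ratio of the allele count to the number of alleles genotyped in each population. As a minimal sketch of that calculation and of the 1% cut-off used to call a variant rare, the code below uses placeholder variant names and counts; the `VariantCount` class, its field names and the `is_rare` helper are assumptions for illustration and do not mirror the gnomAD export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCount:
    variant: str        # protein-level name (placeholder values below)
    population: str     # population label (placeholder values below)
    allele_count: int   # number of alternate alleles observed
    allele_number: int  # number of alleles genotyped at the site

    @property
    def frequency(self) -> float:
        """Allele frequency = allele count / number of alleles."""
        if self.allele_number == 0:
            return 0.0  # no genotyped alleles, so no frequency estimate
        return self.allele_count / self.allele_number


def is_rare(v: VariantCount, threshold: float = 0.01) -> bool:
    """Flag variants below the 1% frequency cut-off."""
    return v.frequency < threshold


# Hypothetical rows, not taken from S1 Table.
rows = [
    VariantCount("p.Example1", "POP_A", allele_count=3, allele_number=30_000),
    VariantCount("p.Example2", "POP_B", allele_count=400, allele_number=20_000),
]

for row in rows:
    label = "rare" if is_rare(row) else "not rare"
    print(f"{row.variant} in {row.population}: {row.frequency:.4f} ({label})")
```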

Among the variants classified as pathogenic, 69 were found in Europeans (non-Finnish), 27 in South Asians, 30 in Latino/Admixed Americans, 25 in African/African-Americans, 12 in East Asians, six in Europeans (Finnish) and only one in Ashkenazi Jewish individuals. Globally, the most common pathogenic variant was the p.Glu325Val substitution (0.0008), found at a frequency of 0.006 in South Asians and detected at a very low frequency in Latino/Admixed Americans. Sixty-five pathogenic mutations are shared by more than one population group, 17 of which are found in Europeans and in one or two other continental populations, suggesting a European origin for these variants, spread by migration, as follows: variants p.Arg84Ser, p.Pro117Ser, p.Tyr333Cys, p.Pro399Ala and p.His62Arg for Latino/Admixed Americans; p.Pro93Leu, p.Arg445Cys and p.Arg322Ter for Latino/Admixed Americans and South Asians; p.Val65Phe, p.Arg322Gln and p.Arg388Ter for South Asians; p.Cys338Arg and p.Thr115Met for East Asians, African/African-Americans and Ashkenazi Jewish; p.Gln306Lys, p.Met207Val and p.Pro93Arg for African/African-Americans; and p.Asn143Ser for Latino/Admixed Americans, African/African-Americans and South Asians.
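
The sharing pattern listed above can be tabulated by grouping each variant under the set of non-European populations in which it was also reported. The sketch below encodes only the variants and population groupings named in the preceding paragraph, read literally; the dictionary layout and the helper name `group_by_populations` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set

LAT = "Latino/Admixed American"
SAS = "South Asian"
EAS = "East Asian"
AFR = "African/African-American"
ASJ = "Ashkenazi Jewish"

# Pathogenic variants found in Europeans, mapped to the other
# populations listed for them in the text.
shared_with: Dict[str, Set[str]] = {
    "p.Arg84Ser": {LAT}, "p.Pro117Ser": {LAT}, "p.Tyr333Cys": {LAT},
    "p.Pro399Ala": {LAT}, "p.His62Arg": {LAT},
    "p.Pro93Leu": {LAT, SAS}, "p.Arg445Cys": {LAT, SAS}, "p.Arg322Ter": {LAT, SAS},
    "p.Val65Phe": {SAS}, "p.Arg322Gln": {SAS}, "p.Arg388Ter": {SAS},
    "p.Cys338Arg": {EAS, AFR, ASJ}, "p.Thr115Met": {EAS, AFR, ASJ},
    "p.Gln306Lys": {AFR}, "p.Met207Val": {AFR}, "p.Pro93Arg": {AFR},
    "p.Asn143Ser": {LAT, AFR, SAS},
}


def group_by_populations(mapping: Dict[str, Set[str]]) -> Dict[FrozenSet[str], List[str]]:
    """Group variants that share the same set of populations."""
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for variant, pops in mapping.items():
        groups[frozenset(pops)].append(variant)
    return dict(groups)


groups = group_by_populations(shared_with)
print(f"{len(shared_with)} variants in {len(groups)} sharing patterns")
for pops, variants in groups.items():
    print(" + ".join(sorted(pops)), "->", ", ".join(variants))
```

The 17 entries match the number of variants the text attributes to Europeans and other continental populations.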

Discussion

As expected, the five prediction programs employed herein diverged in their pathogenicity assessments of the variants (S3 Table). In general, both the FATHMM and PROVEAN programs classified most variants as benign (177; 51.3%), while the SIFT, POLYPHEN and PANTHER programs classified most variants as pathogenic (137, 39.7%; 192, 55.6%; and 167, 48.4%, respectively) (Fig 1).

Fig 1. Performance (%) of the employed pathogenicity prediction programs in the analysis of 158 nonsynonymous variants found at the gnomAD database.

https://doi.org/10.1371/journal.pone.0248610.g001

Pathogenicity prediction programs allow the effect of amino acid substitutions on protein structure or function to be evaluated without performing functional studies, and the available data indicate that the average accuracy of pathogenicity predictors is 85%. However, as different pathogenicity prediction programs vary widely in their methods and in their ability to predict the pathogenicity of a given sequence change, significant disagreements among programs in the identification of mutational effects and pathogenicity are noted [27, 28]. In total, 67 exonic variants were classified as pathogenic by all five predictors employed in this study (S3 Table). Exonic FTO gene variants have been little explored, and the few available studies that screened for variants by sequencing the exons of the FTO gene have found no evidence that the identified variants confer an increased risk of obesity. A total of 34 variants were identified in obese European children (English, French, Belgian and Swiss) [29], but only seven non-synonymous variants were found in Chinese (Han) children with early-onset obesity [30] and four in obese African/African-American children [31]. Likewise, next-generation sequencing (NGS) of the FTO gene in severely obese Swedish children has revealed little evidence of functional variants in the coding region of this gene [32]. These data corroborate the suggestion that the FTO gene does not exert its effects on BMI and obesity through functional mutations, and that this effect is more likely to be exerted by intron 1 of the FTO gene regulating the expression of the IRX3 and IRX5 loci, which are vital for adipocyte maturation [6, 7].

The complete exome sequencing data from large populations available in the Genome Aggregation Database (gnomAD) reveal a substantial number of rare coding variants classified as pathogenic or potentially pathogenic by different pathogenicity prediction programs. These variants are not detected by GWAS because of low linkage disequilibrium and because GWAS are limited in capturing rare variants present in less than 1.0% of the investigated population. However, the available data [29–31] suggest that these variants are probably not associated with BMI and obesity but, instead, with other diseases [13–17]. Functional studies are, thus, required to identify the role of these variants in disease genesis.

The obvious limitation of this work is that it does not explore the non-synonymous exonic variants identified at the gene expression level in an attempt to identify the biological effects underlying these 134 potentially pathogenic mutations, which constitutes a challenge to be addressed in due course. Additionally, these variants should be explored in an exome database of indigenous and non-indigenous Brazilians, which are not included in the gnomAD.

Supporting information

S1 Table. Missense variants in the FTO gene found in the gnomAD database by population.

FATHMM, PANTHER, SIFT, PROVEAN and POLYPHEN-2.

https://doi.org/10.1371/journal.pone.0248610.s001

(DOCX)

S2 Table. Missense variants in the FTO gene found in the gnomAD database in the global population.

* Five predictors: FATHMM, PANTHER, SIFT, PROVEAN and POLYPHEN-2.

https://doi.org/10.1371/journal.pone.0248610.s002

(DOCX)

S3 Table. Missense variants in the FTO gene found in the gnomAD database and pathogenicity based on five predictor programs.

https://doi.org/10.1371/journal.pone.0248610.s003

(DOCX)

References

  1. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, Lindgren CM, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007. pmid:17434869
  2. Scuteri A, Sanna S, Chen WM, Uda M, Albai G, Strait J, et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet. 2007. pmid:17658951
  3. Dina C, Meyre D, Gallina S, Durand E, Körner A, Jacobson P, et al. Variation in FTO contributes to childhood obesity and severe adult obesity. Nat Genet. 2007. pmid:17496892
  4. Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL, et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science. 2007. pmid:17463248
  5. Loos RJF, Yeo GSH. The bigger picture of FTO: the first GWAS-identified obesity gene. Nature Reviews Endocrinology. 2014. pmid:24247219
  6. Babenko V, Babenko R, Gamieldien J, Markel A. FTO haplotyping underlines high obesity risk for European populations. BMC Med Genomics. 2019. pmid:30871540
  7. Kolačkov K, Łaczmański Ł, Lwow F, Ramsey D, Zdrojowy-Wełna A, Tupikowska M, et al. The frequencies of haplotypes of FTO gene variants and their association with the distribution of body fat in non-obese Poles. Adv Clin Exp Med. 2016. pmid:26935496
  8. Claussnitzer M, Dankel SN, Kim K-H, Quon G, Meuleman W, Haugen C, et al. FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med. 2015. pmid:26287746
  9. Ragvin A, Moro E, Fredman D, Navratilova P, Drivenes Ø, Engström PG, et al. Long-range gene regulation links genomic type 2 diabetes and obesity risk regions to HHEX, SOX4, and IRX3. Proc Natl Acad Sci U S A. 2010. pmid:20080751
  10. Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014. pmid:24316577
  11. Mao L, Fang Y, Campbell M, Southerland WM. Population differentiation in allele frequencies of obesity-associated SNPs. BMC Genomics. 2017. pmid:29126384
  12. Araújo GS, Lima LHC, Schneider S, Leal TP, Da Silva APC, Vaz De Melo POS, et al. Integrating, summarizing and visualizing GWAS-hits and human diversity with DANCE (Disease-ANCEstry networks). Bioinformatics. 2016. pmid:26673785
  13. Rohena L, Lawson M, Guzman E, Ganapathi M, Cho MT, Haverfield E, et al. FTO variant associated with malformation syndrome. Am J Med Genet A. 2016;170A(4):1023–1028. pmid:26697951
  14. Boissel S, Reish O, Proulx K, Sedgwick B, Yeo GSH, Meyre D, et al. Loss-of-function mutation in the dioxygenase-encoding FTO gene causes severe growth retardation and multiple malformations. Am J Hum Genet. 2009;85(1):106–111. pmid:19559399
  15. Kim HS, Cheon JH, Jung ES, Park J, Aum S, Park SJ, et al. A coding variant in FTO confers susceptibility to thiopurine-induced leukopenia in East Asian patients with IBD. Gut. 2017;66(11):1926–1935. pmid:27558924
  16. Landfors M, Nakken S, Fusser M, Dahl JA, Klungland A, Fedorcsak P. Sequencing of FTO and ALKBH5 in men undergoing infertility work-up identifies an infertility-associated variant and two missense mutations. Fertil Steril. 2016;105(5):1170–1179.e5. pmid:26820768
  17. Çağlayan AO, Tüysüz B, Coşkun S, Quon J, Harmancı AS, Baranoski JF, et al. A patient with a novel homozygous missense mutation in FTO and concomitant nonsense mutation in CETP. J Hum Genet. 2016;61(5):395–403. pmid:26740239
  18. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581(7809):434–443. pmid:32461654
  19. The Gnomad Consortium Releases First Studies of Human Genetic Variation. Am J Med Genet A. 2020;182(9):1999–2000. pmid:33043608
  20. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–1303. pmid:20644199
  21. Shihab HA, Gough J, Cooper DN, Stenson PD, Barker GLA, Edwards KJ, et al. Predicting the Functional, Molecular, and Phenotypic Consequences of Amino Acid Substitutions using Hidden Markov Models. Hum Mutat. 2013;34(1):57–65. pmid:23033316
  22. Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the Functional Effect of Amino Acid Substitutions and Indels. PLoS One. 2012;7(10):e46688. pmid:23056405
  23. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4(7):1073–1081. pmid:19561590
  24. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nature Methods. 2010;7(4):248–249. pmid:20354512
  25. Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, et al. PANTHER: A library of protein families and subfamilies indexed by function. Genome Res. 2003;13(9):2129–2141. pmid:12952881
  26. Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, et al. ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42(Database issue):D980–D985. pmid:24234437
  27. Walters-Sen LC, Hashimoto S, Thrush DL, Reshmi S, Gastier-Foster JM, Astbury C, et al. Variability in pathogenicity prediction programs: Impact on clinical diagnostics. Mol Genet Genomic Med. 2015;3(2):99–110. pmid:25802880
  28. Pshennikova VG, Barashkov NA, Romanov GP, Teryutin FM, Solov’ev AV, Gotovtsev NN, et al. Comparison of Predictive in Silico Tools on Missense Variants in GJB2, GJB6, and GJB3 Genes Associated with Autosomal Recessive Deafness 1A (DFNB1A). Sci World J. 2019;2019:5198931. pmid:31015822
  29. Meyre D, Proulx K, Kawagoe-Takaki H, Vatin V, Gutiérrez-Aguilar R, Lyon D, et al. Prevalence of loss-of-function FTO mutations in lean and obese individuals. Diabetes. 2010;59(1):311–318. pmid:19833892
  30. Zheng Z, Hong L, Huang X, Yang P, Li J, Ding Y, et al. Screening for coding variants in FTO and SH2B1 genes in Chinese patients with obesity. PLoS One. 2013;8(6):e67039. pmid:23825611
  31. Deliard S, Panossian S, Mentch FD, Kim CE, Hou C, Frackelton EC, et al. The missense variation landscape of FTO, MC4R, and TMEM18 in obese children of African/African-American Ancestry. Obesity (Silver Spring). 2013;21(1):159–163. pmid:23505181
  32. Sällman Almén M, Rask-Andersen M, Jacobsson JA, Ameur A, Kalnina I, Moschonis G, et al. Determination of the obesity-associated gene variants within the entire FTO gene by ultra-deep targeted sequencing in obese and lean children. Int J Obes (Lond). 2013;37(3):424–431. pmid:22531089