Estimation of Newborn Risk for Child or Adolescent Obesity: Lessons from Longitudinal Birth Cohorts

  • Anita Morandi,

    Affiliations Unité Mixte de Recherche 8199, Centre National de Recherche Scientifique (CNRS) and Pasteur Institute, Lille, France, Regional Centre for Juvenile Diabetes, Obesity and Clinical Nutrition, University of Verona, Verona, Italy

  • David Meyre,

    Affiliations Unité Mixte de Recherche 8199, Centre National de Recherche Scientifique (CNRS) and Pasteur Institute, Lille, France, Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Canada

  • Stéphane Lobbens,

    Affiliation Unité Mixte de Recherche 8199, Centre National de Recherche Scientifique (CNRS) and Pasteur Institute, Lille, France

  • Ken Kleinman,

    Affiliation Obesity Prevention Program, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, United States of America

  • Marika Kaakinen,

    Affiliation Institute of Health Sciences and Biocenter, University of Oulu, Oulu, Finland

  • Sheryl L. Rifas-Shiman,

    Affiliation Obesity Prevention Program, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, United States of America

  • Vincent Vatin,

    Affiliation Unité Mixte de Recherche 8199, Centre National de Recherche Scientifique (CNRS) and Pasteur Institute, Lille, France

  • Stefan Gaget,

    Affiliation Unité Mixte de Recherche 8199, Centre National de Recherche Scientifique (CNRS) and Pasteur Institute, Lille, France

  • Anneli Pouta,

    Affiliations Department of Children, Young People and Families, National Institute for Health and Welfare, Helsinki, Finland, Institute of Clinical Medicine/Obstetrics and Gynecology, University of Oulu, Oulu, Finland

  • Anna-Liisa Hartikainen,

    Affiliation Institute of Clinical Medicine/Obstetrics and Gynecology, University of Oulu, Oulu, Finland

  • Jaana Laitinen,

    Affiliation Finnish Institute of Occupational Health, Helsinki, Finland

  • Aimo Ruokonen,

    Affiliation Department of Clinical Sciences and Clinical Chemistry, University of Oulu, Oulu, Finland

  • Shikta Das,

    Affiliation Department of Epidemiology and Biostatistics, School of Public Health, Imperial College, London, United Kingdom

  • Anokhi Ali Khan,

    Affiliation Department of Epidemiology and Biostatistics, School of Public Health, Imperial College, London, United Kingdom

  • Paul Elliott,

    Affiliations Department of Epidemiology and Biostatistics, School of Public Health, Imperial College, London, United Kingdom, Centre for Environment and Health, School of Public Health, Imperial College, London, United Kingdom

  • Claudio Maffeis,

    Affiliation Regional Centre for Juvenile Diabetes, Obesity and Clinical Nutrition, University of Verona, Verona, Italy

  • Matthew W. Gillman,

    Affiliation Obesity Prevention Program, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, United States of America

  • Marjo-Riitta Järvelin,

    Contributed equally to this work with: Marjo-Riitta Järvelin, Philippe Froguel

    Affiliations Institute of Health Sciences and Biocenter, University of Oulu, Oulu, Finland, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College, London, United Kingdom, Centre for Environment and Health, School of Public Health, Imperial College, London, United Kingdom, Department of Life Course and Services, National Institute for Health and Welfare, Oulu, Finland

  • Philippe Froguel

    Contributed equally to this work with: Marjo-Riitta Järvelin, Philippe Froguel

    Affiliations Unité Mixte de Recherche 8199, Centre National de Recherche Scientifique (CNRS) and Pasteur Institute, Lille, France, Department of Genomics of Common Disease, School of Public Health, Imperial College, London, United Kingdom




Prevention of obesity should start as early as possible after birth. We aimed to build clinically useful equations estimating the risk of later obesity in newborns, as a first step towards focused early prevention against the global obesity epidemic.


We analyzed the lifetime Northern Finland Birth Cohort 1986 (NFBC1986) (N = 4,032) to draw predictive equations for childhood and adolescent obesity from traditional risk factors (parental BMI, birth weight, maternal gestational weight gain, behaviour and social indicators), and a genetic score built from 39 BMI/obesity-associated polymorphisms. We performed validation analyses in a retrospective cohort of 1,503 Italian children and in a prospective cohort of 1,032 U.S. children.


In the NFBC1986, the cumulative accuracy of traditional risk factors predicting childhood obesity, adolescent obesity, and childhood obesity persistent into adolescence was good: AUROC = 0.78[0.74–0.82], 0.75[0.71–0.79] and 0.85[0.80–0.90] respectively (all p<0.001). Adding the genetic score produced discrimination improvements ≤1%. The NFBC1986 equation for childhood obesity remained acceptably accurate when applied to the Italian and the U.S. cohorts (AUROC = 0.70[0.63–0.77] and 0.73[0.67–0.80] respectively) and the two additional equations for childhood obesity newly drawn from the Italian and the U.S. datasets showed good accuracy in their respective cohorts (AUROC = 0.74[0.69–0.79] and 0.79[0.73–0.84]) (all p<0.001). The three equations for childhood obesity were converted into simple Excel risk calculators for potential clinical use.


This study provides the first example of handy tools for predicting childhood obesity in newborns by means of easily recorded information, while it shows that currently known genetic variants have very little usefulness for such prediction.


Childhood and adolescent overweight and obesity, which are leading causes of early type 2 diabetes and cardiovascular disease, have become major public health problems both in westernized and more recently in developing countries [1]. Traditional approaches for the management of overweight and obesity have had poor long term efficacy and therefore prevention is currently the most promising strategy for controlling the obesity epidemic [1].

Prevention of obesity should start as early as possible after birth. Longitudinal studies have shown a strong association between early infancy weight gain rate or adiposity and childhood and even adult body weight, fat mass and body mass index (BMI) [2][3]. Moreover, the efficacy of preventive behavioural and nutrition interventions targeting school children, either in primary schools or at home, is very limited [4][5]. Finally, in many countries pre-school and school children are already burdened by a high prevalence of overweight or obesity [4].

Assessing the risk for future overweight or obesity in newborns may be a basis for focused preventive interventions for at-risk individuals during the very first months of their life. Even though several sociodemographic and anthropometric predictors, as well as several common genetic variants, have been associated with childhood overweight/obesity, no longitudinal study has attempted to explore the cumulative predictive properties of these known early life risk factors, or to propose possible tools to predict childhood obesity at birth [6]–[20].

We aimed to build such predictive algorithms for the early identification of newborns at an increased risk for childhood and adolescent overweight/obesity. For this purpose, we estimated the ability of clinical, socio-demographic, and genetic risk factors to predict childhood and adolescent overweight/obesity in a large Finnish birth cohort. We then confirmed the promising usefulness of socio-demographic and anthropometric factors in predicting childhood obesity in two independent paediatric cohorts.


Ethics Statement

The study conducted on the NFBC1986 cohort was approved by the Ethical Committee of Northern Ostrobothnia Hospital District. The retrospective study of the Veneto cohort was approved by the Ethical Committee of the University of Verona, and Project Viva was approved by the Human Subjects Committees of Harvard Pilgrim Health Care, Brigham and Women’s Hospital, and Beth Israel Deaconess Medical Center.

Written informed consent was obtained from parents or guardians of all participants and all clinical investigations were conducted according to the principles expressed in the Declaration of Helsinki.


Development sample.

The Northern Finland Birth Cohort 1986 (NFBC1986) was followed prospectively from the 12th gestational week, and several well-known early risk factors for childhood obesity were recorded systematically. Participants who had their weight and height recorded at seven and sixteen years of age and met data completeness criteria (see below, N = 4,032) were used to build the models. We separately predicted childhood obesity (obesity at 7 years of age), childhood overweight/obesity (overweight or obesity at 7 years of age), adolescent obesity (obesity at 16 years of age), adolescent overweight/obesity (overweight or obesity at 16 years of age), and the severe sub-phenotypes of childhood obesity persistent into adolescence (obesity at 7 and 16 years of age) and childhood overweight/obesity persistent into adolescence (overweight or obesity at 7 and 16 years of age) (Table 1, Tables S1 and S2). Overweight and obesity were defined by the IOTF BMI cut-offs [21].

The traditional predictors used for building the predictive models (gender, pre-pregnancy parental BMI, parental professional category, single parenthood, gestational weight gain, pre-pregnancy maternal smoking, gestational smoking, number of household members, birth weight) were selected a priori among all available baseline NFBC1986 variables according to their association with early obesity in previous literature (Table 1) [2], [6]–[11]. Forty-four obesity-predisposing single-nucleotide polymorphisms (SNPs) were selected according to the following criterion: genome-wide significant level of association (P<5×10⁻⁸) for BMI and/or obesity reported in a population of European ancestry [12]–[20]. Genotyping was performed by TaqMan (Applied Biosystems, Foster City, CA): the average genotyping success was 99.4% (95.1–100) and the average consensus rate from 255 duplicates was 99.8% (99.2–100) (Table S3, S4, S5, S6).

Five SNPs were discarded during the genotyping procedure, since they did not pass the genotyping quality control criteria, leaving 39 SNPs. All 39 SNPs were in Hardy-Weinberg equilibrium (P>0.05). We assumed an additive model and constructed a cumulative genotype score by summing the number of risk alleles (0–78).
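The additive score described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example individual are hypothetical, and only the counting rule (0, 1 or 2 risk alleles at each of the 39 SNPs, summed to a score between 0 and 78) comes from the text.

```python
N_SNPS = 39  # SNPs retained after genotyping quality control (from the text)


def genotype_score(risk_allele_counts):
    """Sum risk-allele counts (0, 1 or 2 per SNP) under an additive model."""
    if len(risk_allele_counts) != N_SNPS:
        raise ValueError("expected one risk-allele count per SNP")
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must carry 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


# Hypothetical individual who is heterozygous at every SNP.
example_individual = [1] * N_SNPS
example_score = genotype_score(example_individual)
```

The possible range follows directly from the counting rule: a person with no risk alleles scores 0 and a person homozygous for the risk allele at every SNP scores 78.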

Validation samples.

We used a school-based retrospective sample of 1,503 children aged 4–12 from Veneto, Italy, as one of the two validation samples to explore whether results from the NFBC1986 could be applied to a European paediatric cohort contemporary to the NFBC1986, with similar obesity prevalence (4%) but different cultural background [22]. The second validation set used was a prospective sample of 1,032 children (7 years) from Massachusetts (United States) from Project Viva, to explore whether results would remain valid when applied to a very recent U.S. child cohort, with higher obesity prevalence (8%) and very different cultural background. Genetic variants were not available for the validation analyses. All children meeting the international criterion for obesity definition at the time of recruitment in the Italian sample and at 7 years of age in the U.S. sample were classified as affected by childhood obesity [21].

Statistical Analysis

Development phase.

Predictive models were fitted by stepwise logistic regression analysis (criterion for variable entry: p<0.05, for variable removal: p>0.10) using traditional risk factors only, genetic score only and traditional risk factors plus genetic score for each obesity outcome. Each risk factor entering the analysis as continuous or ordinal scale variable showed a linear relationship with the logit-risk of childhood obesity in a preliminary linear regression analysis. For persistent childhood obesity, not all the a priori selected traditional predictors were used for the stepwise analysis but only the five with the strongest association with persistent childhood obesity in a preliminary univariate analysis, in order to avoid possible model over-fitting due to the relatively small number (forty-seven) of outcome events.

The discrimination accuracy of each model was evaluated by the area under the receiver operating characteristic curve (AUROC) of the modeled risk [23]. Models with AUROCs larger than 0.7 were considered potentially clinically useful and those with AUROCs larger than 0.8 were considered to have excellent accuracy [23]. The model calibration, that is the “precision” or correlation between the predicted and observed event rate, was assessed by the Hosmer-Lemeshow test [23]. The possible accuracy improvement associated with adding the genetic score to the traditional risk factors was evaluated by calculating the integrated discrimination improvement (IDI) compared to the traditional risk factors alone [24].
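The two discrimination measures named above can be illustrated with a short sketch. It assumes the standard definitions: AUROC as the probability that a randomly chosen case has a higher modeled risk than a randomly chosen non-case (ties counted as one half), and IDI as the difference in discrimination slopes between two models. The function names and the six-subject toy data are hypothetical and are not taken from the study.

```python
def auroc(risks, outcomes):
    """Share of case/control pairs in which the case has the higher risk."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(cases) * len(controls))


def discrimination_slope(risks, outcomes):
    """Mean modeled risk in cases minus mean modeled risk in controls."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def idi(baseline_risks, extended_risks, outcomes):
    """Integrated discrimination improvement of the extended model."""
    return (discrimination_slope(extended_risks, outcomes)
            - discrimination_slope(baseline_risks, outcomes))


# Hypothetical toy data: 1 = obese, 0 = not obese.
outcomes = [0, 0, 0, 1, 0, 1]
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]   # e.g. traditional factors only
extended = [0.1, 0.2, 0.3, 0.5, 0.5, 0.7]   # e.g. with an added predictor

toy_auroc = auroc(baseline, outcomes)
toy_idi = idi(baseline, extended, outcomes)
```

In this toy example seven of the eight case/control pairs are concordant, so the AUROC is 0.875, and the extended model raises the mean risk of cases by 0.1 without changing that of controls, giving an IDI of 0.1.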

For each model a risk threshold was arbitrarily adopted at the 75th percentile of the modeled risk, identifying the top 25% as being at increased risk and the thresholds’ predictive properties (sensitivity, specificity and predictive values) were calculated.
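A minimal sketch of this thresholding step is given below. The percentile rule (nearest rank) and the convention that a subject is flagged when the modeled risk is strictly above the threshold are assumptions, since the text does not specify them; the function names and the eight-subject data are hypothetical.

```python
import math


def top_quartile_threshold(risks):
    """75th percentile of modeled risk (nearest-rank rule, an assumption)."""
    ordered = sorted(risks)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def predictive_properties(risks, outcomes, threshold):
    """Sensitivity, specificity, PPV and NPV of 'risk above threshold'."""
    tp = fp = tn = fn = 0
    for risk, outcome in zip(risks, outcomes):
        flagged = risk > threshold
        if flagged and outcome == 1:
            tp += 1
        elif flagged and outcome == 0:
            fp += 1
        elif not flagged and outcome == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical modeled risks and outcomes (1 = obese) for eight newborns.
risks = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.10, 0.20]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1]

threshold = top_quartile_threshold(risks)
properties = predictive_properties(risks, outcomes, threshold)
```

With these toy values the two newborns with the highest modeled risk are flagged; a real dataset would need guards against empty cells before dividing.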

An average of 1.67% (0–11.4%) of data was missing for each traditional risk factor, while an average of 0.72% (0–4.95%) of genotypes was missing for each SNP. We included participants with zero or one missing traditional baseline variable and three or fewer missing SNPs. Multiple imputation was performed for the remaining missing values, in order to avoid possible bias associated with missing potentially important information [25]. Win MICE (Multiple Imputation by Chained Equations) V0.1 was used for multiple imputation [25]. In the MICE procedure, imputed values for missing data are drawn by modelling them on the basis of the other considered variables, with logistic regression if the variable to impute is dichotomous, polytomous logistic regression if it is categorical with three or more categories and linear regression if it is continuous [25]. Each missing value is thus replaced by an estimated value modelled on the other variables. More precisely, the method estimates a distribution for each missing variable, taking all aspects of uncertainty in the imputations into account. From this distribution, values are sampled and filled in for the missing data. Every imputation cycle therefore produces, for each missing datum, one estimated value sampled among several possible ones, giving rise to a unique dataset which cannot be reproduced by subsequent imputation cycles [25].

Five imputation cycles were run so that five values were imputed for each missing datum to get variation in the imputed values, thus reflecting the uncertainty introduced by imputation itself. Inference was based on the five resulting datasets [25]: pooled AUROCs were obtained by averaging the estimates from the five single datasets, while 95% confidence intervals were delimited by the two overall most extreme boundaries, the lowest and the highest [25]. All the coefficients, the AUROCs and the 95% C.I. boundaries were identical up to the first or second decimal for any considered variable across the five datasets.
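The pooling rule described above (average of the five point estimates, with the interval bounded by the lowest lower limit and the highest upper limit) can be sketched as follows. The function name and the five triplets are hypothetical illustrative values, not results from the study.

```python
def pool_auroc(per_dataset_estimates):
    """Pool (auroc, ci_low, ci_high) triplets from the imputed datasets.

    The point estimate is the mean of the single-dataset AUROCs; the
    interval runs from the lowest lower bound to the highest upper bound.
    """
    aurocs = [auroc for auroc, _, _ in per_dataset_estimates]
    lower_bounds = [low for _, low, _ in per_dataset_estimates]
    upper_bounds = [high for _, _, high in per_dataset_estimates]
    pooled = sum(aurocs) / len(aurocs)
    return pooled, min(lower_bounds), max(upper_bounds)


# Hypothetical estimates from five imputed datasets.
imputed_results = [
    (0.78, 0.74, 0.82),
    (0.78, 0.74, 0.82),
    (0.79, 0.75, 0.83),
    (0.77, 0.73, 0.81),
    (0.78, 0.74, 0.82),
]

pooled_auroc, pooled_low, pooled_high = pool_auroc(imputed_results)
```

Taking the most extreme bounds gives an interval at least as wide as any single-dataset interval, which is a conservative way of carrying the imputation uncertainty into the reported range.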

Validation and replication phase.

Only the model developed for childhood obesity was used for validation because the model for prediction of childhood overweight/obesity was not considered accurate enough to be clinically useful and the models concerning adolescent phenotypes required older cohorts than Veneto and Project Viva. The NFBC1986 equation was applied to the validation cohorts after recalculation of the intercept according to the cohort-specific phenotype prevalence and mean values of predictors. In the Veneto sample, number of household members and gestational smoking were not available.
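One simple way to recalculate an intercept from a cohort's prevalence and predictor means is sketched below. It is an assumption, not necessarily the authors' procedure: the intercept is chosen so that a child with average predictor values receives a risk equal to the cohort prevalence, while the regression coefficients are left unchanged. All coefficient values, predictor names and means are hypothetical; only the 8% prevalence echoes the figure quoted for the U.S. cohort.

```python
import math


def logit(probability):
    return math.log(probability / (1.0 - probability))


def inverse_logit(value):
    return 1.0 / (1.0 + math.exp(-value))


def recalibrated_intercept(prevalence, coefficients, predictor_means):
    """Intercept giving risk == prevalence at the cohort's mean predictors."""
    linear_part_at_means = sum(
        coefficients[name] * predictor_means[name] for name in coefficients
    )
    return logit(prevalence) - linear_part_at_means


def predicted_risk(intercept, coefficients, values):
    """Risk from a logistic equation with the given intercept and slopes."""
    linear_part = sum(coefficients[name] * values[name] for name in coefficients)
    return inverse_logit(intercept + linear_part)


# Hypothetical coefficients (log odds ratios) and cohort means.
coefficients = {"maternal_bmi": 0.15, "paternal_bmi": 0.12}
cohort_means = {"maternal_bmi": 23.0, "paternal_bmi": 25.0}
cohort_prevalence = 0.08

intercept = recalibrated_intercept(cohort_prevalence, coefficients, cohort_means)
risk_at_means = predicted_risk(intercept, coefficients, cohort_means)
risk_high_bmi = predicted_risk(
    intercept, coefficients, {"maternal_bmi": 30.0, "paternal_bmi": 30.0}
)
```

Because only the intercept moves, the ranking of children by risk is unchanged, so this kind of adjustment affects calibration but not the AUROC.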

A replication analysis was also performed in which the model for childhood obesity was re-built in the two validation samples by stepwise logistic regression using the available traditional risk factors.

Statistics were performed with R 2.11.0, SPSS 18 (IBM Company, Chicago, Illinois) and SAS 9.3 (SAS Institute, Cary, North Carolina).


Parental BMI, birth weight, maternal gestational weight gain, number of household members, maternal professional category and smoking habits were independent predictors of all or most of the six obesity outcomes (Tables 2 and 3).

Table 2. Stepwise multiple logistic models for prediction of overweight phenotypes: ORs and p values associated with predictors, AUROC and P of Hosmer-Lemeshow test in the final models (bold characters) and AUROCs and P of Hosmer-Lemeshow of each step (italic characters).

The equations to estimate the risk for the obesity outcomes from these traditional risk factors are represented in supporting information (Dataset S1).

Table 3. Stepwise multiple logistic models for prediction of obesity phenotypes: ORs and p values associated with predictors, AUROC and P of Hosmer-Lemeshow test in the final models (bold characters) and AUROCs and P of Hosmer-Lemeshow of each step (italic characters).

Discrimination accuracy of the risk calculation from traditional risk factors was excellent for persistent childhood obesity (AUROC = 0.85[0.80–0.90], p<0.001), clinically meaningful for persistent childhood overweight/obesity (AUROC = 0.75[0.73–0.78], p<0.001), childhood obesity (AUROC = 0.78[0.74–0.82], p<0.001), adolescent obesity (AUROC = 0.75[0.71–0.79], p<0.001) and adolescent overweight/obesity (AUROC = 0.71[0.69–0.73], p<0.001), and below the threshold for clinical usefulness for childhood overweight/obesity (AUROC = 0.67[0.65–0.69], p<0.001) (Figure 1 and Tables 2 and 3) [23]. All six models developed from traditional risk factors were adequately calibrated (all p for Hosmer-Lemeshow test >0.05).

Figure 1. Estimates of risk percentages for childhood obesity for given pairs of parental BMIs according to the NFBC1986 equation.

Estimates are provided for three different combinations of birth weight, maternal professional category, number of household members and maternal gestational smoking, corresponding to three progressively higher risk backgrounds. Grey cells correspond to risk estimates within the highest risk quartile in the overall population.

Parental BMI was the main contributor to discrimination accuracy while other predictors contributed moderately to the model discrimination effectiveness but increased the overall model calibration (Tables 2 and 3).

For any given pair of parental BMIs, estimation of the probability of childhood obesity varied greatly, depending on the combination of other predictors (Figure 1).

The genetic score was an independent predictor of all six considered outcomes, with ORs associated with unitary score increase ranging from 1.05[1.03–1.08] to 1.09[1.03–1.14] (0.05> all P>4×10⁻⁸), but its discrimination accuracy was poor, with AUROCs ranging from 0.56[0.54–0.58] to 0.59[0.54–0.64] (Table S7). Adding the genetic score to the traditional risk factors did not produce better AUROCs than using traditional risk factors alone and was associated with modest IDIs not larger than 1% (Figure S1). The genetic score composed of only the twenty SNPs identified for childhood obesity traits exhibited similar associations with early obesity phenotypes (Table S8). Therefore, only the models developed from traditional risk factors were taken into consideration for further analyses. Predictive properties of the risk thresholds corresponding to the highest risk quartile for each obesity phenotype are represented in Table 4. Positive predictive values were low, due to the low prevalence of predicted conditions, while negative predictive values were high (Table 4).

Table 4. Risk threshold and predictive properties corresponding to the 75th percentile of calculated risk for the obesity phenotypes in the NFBC1986.
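The dependence of predictive values on prevalence follows from Bayes' theorem and can be illustrated numerically. In the sketch below the sensitivity (0.70) and specificity (0.77) are hypothetical round figures chosen for illustration and are not the values of Table 4; the two prevalences correspond to the 4% and 8% quoted for the cohorts in the Methods.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(outcome | flagged as at risk), by Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


def negative_predictive_value(sensitivity, specificity, prevalence):
    """P(no outcome | not flagged), by Bayes' theorem."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Hypothetical test characteristics, evaluated at two prevalences.
SENSITIVITY = 0.70
SPECIFICITY = 0.77

ppv_at_4_percent = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.04)
ppv_at_8_percent = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.08)
npv_at_4_percent = negative_predictive_value(SENSITIVITY, SPECIFICITY, 0.04)
```

Under these assumptions the positive predictive value is roughly 11% at a prevalence of 4%, while the negative predictive value exceeds 98%, which is the pattern described in the text: a rare outcome keeps the positive predictive value low even when most future cases are captured.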

The version of the NFBC1986 equation for childhood obesity lacking gestational smoking and number of household members (AUROC = 0.73[0.69–0.77] in the NFBC1986) had an AUROC = 0.70[0.63–0.77] (p<0.001) when applied to the Veneto cohort, with acceptable calibration accuracy (p for Hosmer-Lemeshow test = 0.12).

The NFBC1986 equation for childhood obesity had an acceptable AUROC = 0.73[0.67–0.80] (p<0.001) when applied to the Project Viva children. However, calibration in the Project Viva sample was not satisfactory (p for Hosmer-Lemeshow test = 0.02).

The Veneto equation, i.e. the equation to predict childhood obesity derived from the Italian sample (model replication), included parental BMIs and gender (Dataset S1), had an AUROC of 0.74[0.69–0.79] (p<0.001) in the Veneto sample and was adequately calibrated (p for Hosmer-Lemeshow test = 0.11).

The Project Viva equation, i.e., the equation to predict childhood obesity issued from the U.S. sample (model replication), included parental BMI, race, gestational smoking and gestational weight gain (Dataset S1), had an AUROC of 0.79[0.73–0.84] (p<0.001) in the Project Viva sample and was adequately calibrated (p for Hosmer-Lemeshow test = 0.91).

The three equations predicting childhood obesity in the three studied cohorts were converted into an electronic automatic risk calculator for potential clinical use (Dataset S2).


Our study provides the first example of a predictive tool for assessing the risk of developing early obesity phenotypes, based on readily available traditional risk factors about newborns. The potential inclusion of genetic variants was explored, but due to their modest contribution to predictive accuracy, they were not included in the final models.

Analysis of the NFBC1986 showed that traditional risk factors performed better in prediction of severe rather than mild obesity phenotypes. Importantly, the predictive accuracy of the models did not decline from childhood to adolescence, suggesting that the association between the traditional risk factors and obesity is stable until early adulthood. This is consistent with recent evidence about the relationship between single early risk factors and adolescent and adult obesity [6], [9], [10]. The risk of childhood obesity was largely driven by parental BMI. However, other predictors moderately improved the discrimination accuracy and increased the precision of risk estimation. They also produced large ranges of possible risk estimates for any given parental BMI, significantly improving risk classification at any level of parental BMI (Tables 2 and 3, Figure 1 and Dataset S2).

Predictive tools need to satisfy important requisites before they can be applied in clinical settings. First, significant preventive advantages should derive from prediction. Although medical societies have been called on to provide reasonable guidance on prevention based on available data and the American Academy of Paediatrics has recently underlined the emergent need of finding effective clinical tools to enable primary care providers to contribute to obesity prevention [26][27], there is no compelling evidence of any efficient obesity preventive strategy involving infancy. Therefore, robust trials proving the effectiveness of strategies of early prevention are still needed to justify the adoption of early obesity prediction in everyday clinical practice. Should trials prove the efficacy of preventive strategies implying special interventions going beyond paediatric counselling and public health campaigns routinely provided to the general population, a predictive tool like that proposed here would offer the important advantage of excluding a large proportion of infants from such interventions, thanks to its good negative predictive value. This would improve the cost/effectiveness ratio of preventive actions.

However, the few available controlled prevention trials suggest that interventions directly involving parents of pre-school children outside education settings are more effective than school or community-based interventions targeting later ages, supporting the hypothesis that involving parents in the prevention of their offspring’s obesity as early as possible is likely to be a good strategy [1]. In this view, it has been suggested that the “Let’s Move” campaign against child obesity, which is a U.S. government-sponsored obesity prevention program targeting children aged 2–10, might be more effective if children under 2 could be identified as prevention targets [4].

Parents of newborns are particularly sensitive to information given about their child’s health. Once informed of their baby’s increased risk for obesity, they might be more receptive to routine advice provided from birth during the first two years of life within population-wide prevention: breastfeeding, feeding on demand, weaning no earlier than the sixth month with recommended meal patterns and food portions, avoiding television and sugar-sweetened beverages [28]. Moreover, families of newborns at risk could be enrolled in more intensive schedules of growth monitoring and nutritional counselling than those offered to the general population, in order to avoid excessive weight gain in infancy. Encouraging strategies aiming at significantly decreasing energy intake in infants should be avoided, however, both because of the well-known difficulties encountered by parents in doing so and because of the potential, unknown harmful effects of early caloric restriction. In contrast, recent evidence suggests that some obesity prevention strategies based on educating mothers could be useful to limit excessive infant weight gain, by promoting appropriate maternal responses to satiety cues and decreasing non-responsive feeding behaviours which over-ride satiety cues, such as food rewards, non-food rewards to encourage the infant to eat, etc… [29]. Such strategies do not imply a direct food restriction, but rather a limitation of “passive” (not hunger-driven) infant over-eating.

Obviously, even in case of proven efficacy of early obesity prevention, the targeted approach should also be carefully assessed by means of trials with a “focused intervention” design, before any dissemination of early obesity prediction into broad clinical practice. In fact, a targeted approach might also imply deleterious effects, among which, for example, stigmatization of families of infants classified as “at risk” or false reassurance of other families. Indeed, early prediction should not mean a “diagnostic” attitude towards either of the two categories of families. In particular, the assessment of age and BMI at adiposity rebound, which are good predictors of childhood and adult obesity, should be carried out in young children in order to optimize the overall detection rate of those likely to become obese and possibly sensitize families previously “missed” by the neonatal score [30].

Accuracy is another important requisite for a predictive tool. The model predicting persistent obesity had excellent accuracy (AUROC = 0.85) while the models predicting obesity and persistent overweight had clinically useful discrimination accuracy (AUROCs = 0.75 to 0.78) [23], similar to that of widely used tools for predicting multifactor medical conditions, such as the Framingham risk score for coronary heart disease (AUROC = 0.74 to 0.77 depending on gender and type of scoring adopted) [31]. Due to the low prevalence of the obesity phenotypes in the NFBC1986, the fourth quartiles of predicted risk had low to moderate prevalence of cases even if they “captured” most or a high percentage of cases (low positive predictive value despite good sensitivity) (Table 4). This represents a possible drawback of preventive strategies based on risk assessment [32]. Nevertheless, risk thresholds conceived for prediction and focused prevention are not required to be “diagnostic” but rather cost-effective. Thus, the criteria we propose to select newborns at risk for obesity could have a strong impact on public health, despite their low specificity/positive predictive value, because they could justify cost-effective preventive strategies on a subsection of the general population, similarly to several sensitive but poorly specific selection criteria used for widespread preventive interventions, such as: age over 30 years as criterion to recommend the pap test against cervical cancer, age over 50 years as criterion to recommend the faecal occult blood test against colon cancer, etc…[33][34]. The adequate discrimination and calibration accuracy achieved by the equations presented in the manuscript imply that a high percentage of future obese children (more than two-thirds) is included in the highest quartile of calculated risk. Thus, using the highest quartile of calculated risk as the selection criterion would allow focused preventive strategies to reach 70–75% of potential future cases though involving only 25% of newborns. Should these strategies have just about 50% effectiveness, the number of future obese children would decrease by 35–38%, which would represent much greater success compared with results obtained to date by large scale preventive strategies involving later infancy and childhood [5].
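The arithmetic behind the 35–38% figure can be made explicit. The sketch below simply multiplies the share of future cases reached by the assumed effectiveness of the intervention, using the values quoted in the paragraph above; the function name is hypothetical and the calculation assumes the effectiveness applies uniformly to all cases reached.

```python
def expected_case_reduction(share_of_cases_reached, intervention_effectiveness):
    """Expected relative decrease in future cases across the whole population."""
    return share_of_cases_reached * intervention_effectiveness


# Values quoted in the text: 70-75% of future cases fall in the top risk
# quartile, and the intervention is assumed to prevent about half of them.
lower_estimate = expected_case_reduction(0.70, 0.50)
upper_estimate = expected_case_reduction(0.75, 0.50)
```

This gives 35% and 37.5% respectively, the latter rounding to the 38% quoted.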

The models using traditional risk factors had good calibration, which suggests that it may be possible to use the newborns’ calculated risks in addition to the two risk categories. This would add precision to prediction and potential further effectiveness to related prevention.

Finally, the equations we present use easily accessible information, do not incur additional costs to clinical care, and only require minimal time to calculate, if converted into simple automatic calculators like those we propose in the Supporting Information. Such electronic risk calculators could be part of an electronic medical record system and/or be housed within computer-assisted standardised programs of obesity prevention, which are promising tools for the prevention and care of paediatric obesity [35].

The results of the validation/replication analyses allow for important considerations. First of all, traditional risk factors have a good cumulative accuracy (AUROC = 0.79) in the recent U.S. paediatric cohort, which has a significantly higher prevalence of childhood obesity than the NFBC1986. This demonstrates that the environmental pressure towards obesity has not weakened the role of early risk factors. Moreover, it supports the hypothesis that, at the current phase of the obesity pandemic, the use of “familial and personal” risk factors for early prediction may be useful, in addition to population-wide interventions, in those regions, like Massachusetts, where the prevalence of obesity is still moderate and characterised by ethnic and social disparities rather than influenced by country-related risk factors [36]. In these regions, focused preventive strategies based on personal risk stratification may effectively integrate large scale interventions based on nationwide characteristics [32], [36]. Interestingly, since 2010 the U.S. Government has been supporting a preventive strategy against childhood obesity involving low-income children from Boston, indicating efforts towards focused prevention. Employing focused strategies involving newborns whose risk is high according to diverse factors beyond social parameters could lead to earlier, more effective prevention of overweight/obesity in children.

The NFBC1986 equation for childhood obesity remained acceptably discriminative when applied to both validation cohorts, but showed a loss of calibration when applied to the Viva cohort, suggesting that its adoption in the U.S. would have acceptable validity to discriminate newborns at risk for early obesity but not to perform exact risk estimations. This is probably due to inconsistency of some predictors, such as maternal professional category and number of household members. Accordingly, the Project Viva equation lacks these variables while it includes race, which is not present among obesity predictors in the NFBC1986 equation, because of the high ethnic homogeneity of the NFBC1986. Inconsistency of the role of SES variables across different populations is expected and it is the main reason why it would be very difficult to build a highly accurate and calibrated score that also has complete widespread validity [36].

Overall, the validation analysis suggests that “local” equations, including parental BMI but also other locally important early predictors, may have good accuracy in predicting childhood obesity at birth, even in countries like the U.S., with high environmental pressure towards early weight excess, and should be preferred, whenever possible, to the universal adoption of the NFBC1986 equation. Interestingly, parental BMI, which partly reflects the degree of familial genetic predisposition to obesity, had very similar effect size and accuracy in the three studied cohorts, consistent with the evidence that the growing obesity epidemic has not lowered the heritability of childhood adiposity [37].

Our study also explored, with the largest list of obesity SNPs ever used, the performance of genetics in predicting early obesity phenotypes, showing very modest predictive accuracy of the assessed genetic variants, consistent with previous evidence on adult obesity [20]. Even though a modest predictive accuracy of the studied genetic variants was expected, the accuracy estimates obtained in this study rule out, for the first time, the hypothesis that genetics may perform a little better in predicting early obesity than adult obesity, owing to a presumed lower impact of environmental determinants during childhood than later in life. This result is consistent with recent evidence that polygenic risk and BMI show substantially similar correlation coefficients in childhood and adulthood, and it further contributes to the growing evidence that common genetic variants are not yet “ready for use” for the prediction of several complex diseases, because the newly discovered variants still explain only a small proportion of heritability [30], [38]. It is possible that next-generation sequencing techniques will significantly reduce the gap of “missing heritability” of obesity, by identifying rare causative variants and clarifying the role of epigenetics through the genome-wide characterisation of DNA methylation patterns in foetuses or infants who do or do not later develop obesity [39].

Finally, the most important evidence obtained by including currently known SNPs in our analyses is that common genetic variants not only have very low accuracy in predicting early obesity, but also produce very little improvement in prediction when combined with clinical factors. This is particularly important because, although the notion that genetic variants have poor value in predicting common diseases is quite well established, the possible utility of including polygenic risk scoring within management strategies for complex diseases is a topical subject of current research, and genetic testing services that include obesity are being offered to consumers by private companies [39], [40].
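The improvement obtained by adding a genetic score to clinical factors is quantified in this study by the integrated discrimination improvement (IDI) [24]. As a minimal sketch, the IDI can be computed as the change in discrimination slope (mean predicted risk in cases minus mean predicted risk in non-cases) between the two models. The predicted probabilities below are hypothetical toy values chosen only to illustrate a small gain, not results from the NFBC1986, and the function names are illustrative.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def discrimination_slope(y_true, y_prob):
    """Mean predicted risk among cases minus mean predicted risk among non-cases."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    return mean(cases) - mean(controls)


def idi(y_true, prob_old, prob_new):
    """Integrated discrimination improvement of the new model over the old one."""
    return discrimination_slope(y_true, prob_new) - discrimination_slope(y_true, prob_old)


# Hypothetical outcomes (1 = obese, 0 = not obese) and predicted risks.
outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
clinical_only = [0.40, 0.30, 0.20, 0.10, 0.15, 0.05, 0.20, 0.10]
clinical_plus_genetic = [0.42, 0.31, 0.20, 0.10, 0.14, 0.05, 0.19, 0.10]

print("IDI:", round(idi(outcomes, clinical_only, clinical_plus_genetic), 3))
```

An IDI close to zero, as in this toy example, means that the added marker barely separates the predicted risks of cases and non-cases beyond what the clinical factors already achieve.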

The main limitations of our manuscript are the lack of external validation for the equations predicting adolescent and persistent obesity, due to the young age of our validation cohorts, and the use, in one of the validation analyses, of a retrospective paediatric cohort with some missing variables and an age of assessment that does not exactly match that of the original cohort (4–12 years versus 7 years).

The main strengths include the novelty and the potentially strong public health impact of multivariate obesity prediction tools valid for newborns, and the optimization of the reliability and robustness of the results through the adoption of several recommended methods that were recently shown to be lacking in many high-impact prediction studies [41]: external geographical and temporal validation (for the model predicting childhood obesity), use of multiple imputation for missing values, avoidance of predictor dichotomisation, assessment of model calibration, and avoidance of model over-fitting.

In summary, our study provides the first example of at-birth prediction of early obesity by means of traditional, routinely available risk factors, and it should guide future efforts towards randomized trials of very early preventive approaches for identified high-risk individuals, to help combat the obesity epidemic.

Supporting Information

Figure S1.

ROC curves of combined traditional risk factors (blue), genetic score (beige) and traditional risk factors + genetic score (green) predicting six obesity outcomes in the NFBC1986. Integrated discrimination improvements (IDIs) associated with adding the genetic score to the traditional risk factors are also provided.


Table S1.

Metabolic differences between obese adolescents with or without a history of childhood obesity in the NFBC1986.


Table S2.

Metabolic differences between overweight/obese adolescents with or without a history of childhood overweight/obesity in the NFBC1986.


Table S3.

SNPs selected for building the genotype score with the relative genotyping quality control parameters.


Table S4.

Associations between single SNPs and childhood obesity and overweight/obesity in the NFBC1986.


Table S5.

Associations between single SNPs and adolescent obesity and overweight/obesity in the NFBC1986.


Table S6.

Associations between single SNPs and persistent obesity and overweight/obesity in the NFBC1986.


Table S7.

Association, discrimination and calibration parameters of the 39-SNP genetic score predicting the six obesity outcomes in the NFBC1986.


Table S8.

Association, discrimination and calibration parameters of the genetic score composed of the 20 “childhood obesity SNPs” predicting the six obesity outcomes in the NFBC1986.


Dataset S1.

Equations predicting the obesity phenotypes from traditional risk factors.


Dataset S2.

Example of automatic calculator of risk for childhood obesity.



Acknowledgments

We thank the GIANT Consortium (Metabolism Initiative and Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA) for giving us access to unpublished obesity-associated SNPs that were genotyped in the current study. We thank Nabila Bouatia-Naji (CNRS-8199-Lille North of France University, Pasteur Institute, Lille, France) for advice on SNP selection and genotyping, and Marion Marchand (CNRS-8199-Lille North of France University, Pasteur Institute, Lille, France) for performing part of the genotyping. We thank M. Deweirder and F. Allegaert (CNRS-8199-Lille North of France University, Pasteur Institute, Lille, France) for DNA bank management.

We are indebted to all subjects who participated in these studies.

Author Contributions

Conceived and designed the experiments: AM DM PF. Performed the experiments: SL SG VV. Analyzed the data: AM KK SLR-S. Contributed reagents/materials/analysis tools: SL MK SG VV. Wrote the paper: AM. Cohort investigators: M-RJ CM MG KK SLR-S MK AP A-LH JL AR SD AAK. Supervised the study: PE M-RJ PF. Equally contributed as last authors: M-RJ PF. Equal corresponding authors: AM PF.


References

  1. Waters E, de Silva-Sanigorski A, Hall BJ, Brown T, Campbell KJ, et al. (2011) Interventions for preventing obesity in children. Cochrane Database of Systematic Reviews 12: DOI:
  2. Stocks T, Renders CM, Bulk-Bunschoten AM, Hirasing RA, van Buuren S, et al. (2011) Body size and growth in 0- to 4-year-old children and the relation to body size in primary school age. Obes Rev 12(8): 637–52.
  3. Druet C, Stettler N, Sharp S, Simmons RK, Cooper C, et al. (2012) Prediction of childhood obesity by infancy weight gain: an individual-level meta-analysis. Paediatr Perinat Epidemiol 26(1): 19–26.
  4. Wojcicki JM, Heyman MB (2010) Let’s move – Childhood Obesity Prevention from Pregnancy and Infancy Onward. N Engl J Med 362: 1457–1459.
  5. Summerbell CD, Waters E, Edmunds LD, Kelly S, Brown T, et al. (2011) Interventions for preventing obesity in children. Cochrane Database Syst Rev 12: CD001871.
  6. Whitaker RC, Wright JA, Pepe MS, Seidel KD, et al. (1997) Predicting obesity in young adulthood from childhood and parental obesity. N Engl J Med 337: 869.
  7. Yu ZB, Han SP, Zhu GZ, Zhu C, Wang XJ, et al. (2011) Birth weight and subsequent risk of obesity: a systematic review and meta-analysis. Obes Rev 12(7): 525–42.
  8. Ino T (2010) Maternal smoking during pregnancy and offspring obesity: meta-analysis. Pediatr Int 52(1): 94–9.
  9. Plachta-Danielzik S, Landsberg B, Johannsen M, Lange D, Müller MJ (2010) Determinants of the prevalence and incidence of overweight in children and adolescents. Public Health Nutr 13(11): 1870–81.
  10. Mamun AA, O’Callaghan M, Callaway L, Williams G, Najman J, et al. (2009) Associations of gestational weight gain with body mass index and blood pressure at 21 years of age: evidence from a birth cohort study. Circulation 119: 1720.
  11. Smith GD, Steer C, Leary S, Ness A (2007) Is there an intrauterine influence on obesity? Evidence from parent-child associations in the Avon Longitudinal Study of parents and children (ALSPAC). Arch Dis Child 92: 876.
  12. Benzinou M, Creemers JWM, Choquet H, Lobbens S, Dina C, et al. (2008) Common nonsynonymous variants in PCSK1 confer risk of obesity. Nat Genet 40: 943–945.
  13. Dina C, Meyre D, Gallina S, Durand E, Körner A, et al. (2007) Variation in FTO contributes to childhood obesity and severe adult obesity. Nat Genet 39(6): 724–6.
  14. Loos RJF, Lindgren CM, Li S, Wheeler E, Zhao JH, et al. (2008) Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet 40: 768–775.
  15. Willer CJ, Speliotes EK, Loos RJF, Li S, Lindgren CM, et al. (2009) Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet 41: 25–34.
  16. Thorleifsson G, Walters GB, Gudbjartsson DF, Steinthorsdottir V, Sulem P, et al. (2009) Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet 41: 18–24.
  17. Meyre D, Delplanque J, Chèvre J-C, Locoeur C, Lobbens S, et al. (2009) Genome-wide association study for early-onset and morbid adult obesity identifies three new risk loci in European populations. Nat Genet 41: 157–159.
  18. Liu YJ, Liu XG, Wang L, Dina C, Yan H, et al. (2008) Genome-wide association scans identified CTNNBL1 as a novel gene for obesity. Hum Mol Genet 17: 1803–1813.
  19. Scherag A, Dina C, Hinney A, Vatin V, Scherag S, et al. (2010) Two new loci for body-weight regulation identified in a joint analysis of genome-wide association studies for early onset extreme obesity in French and German study groups. PLoS Genet e1000916.
  20. Speliotes EK, Willer CJ, Berndt SI, Monda K, Thorleifsson G, et al. (2010) Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 42(11): 937–48.
  21. Cole TJ, Bellizzi MC, Flegal KM, Dietz WH (2000) Establishing a standard definition for child overweight and obesity worldwide: international survey. BMJ 320: 1–6.
  22. Maffeis C, Schutz Y, Piccoli R, Gonfiantini E, Pinelli L (1993) Prevalence of obesity in children in north-east Italy. Int J Obesity 17: 287–294.
  23. Hosmer DW, Lemeshow S (2000) Applied logistic regression. 2nd ed. Wiley-Interscience Publication.
  24. Pencina MJ, D’Agostino RB, D’Agostino RB, Vasan RS (2008) Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Statistics in Medicine 27: 157–172.
  25. Langkamp DL, Lehman A, Lemeshow S (2010) Techniques for Handling Missing Data in Secondary Analyses of Large Surveys. Acad Pediatr 10(3): 205–10.
  26. Steering Committee on Quality Improvement and Management (2008) Towards Transparent Clinical Policies. Pediatrics 121: 643–646.
  27. Haemer M, Cluett S, Hassink SG, Liu L, Mangarelli C, et al. (2011) Building Capacity for Childhood Obesity Prevention and Treatment: Call to Action. Pediatrics 128: S71.
  28. US Department of Health and Human Services, US Department of Agriculture (2005) Dietary Guidelines for Americans, 2005. 6th ed. Washington, DC: Government Printing Office.
  29. Daniels LA, Mallan KM, Battistutta D, Nicholson JM, Perry R, et al. (2012) Evaluation of an intervention to promote protective infant feeding practices to prevent childhood obesity: outcomes of the NOURISH RCT at 14 months of age and 6 months post the first of two intervention modules. Int J Obes 36(10): 1292–8.
  30. Belsky DW, Moffitt TE, Houts R, Bennett GG, Biddle AK, et al. (2012) Polygenic Risk, Rapid Childhood Growth, and the Development of Obesity. Arch Pediatr Adolesc Med 166(6): 515–521.
  31. Wilson PWF, D’Agostino RB, Levy D, Belanger AM, Silbershatz H, et al. (1998) Prediction of Coronary Heart Disease Using Risk Factor Categories. Circulation 97: 1837–1847.
  32. Rose G (1985) Sick individuals and sick populations. International Journal of Epidemiology 14: 32.
  33. Saslow D, Runowicz CD, Solomon D, Killackey M, Kulasingam SL, et al. (2002) American Cancer Society. American Cancer Society guideline for the early detection of cervical neoplasia and cancer. CA Cancer J Clin 52(6): 342–362.
  34. Levin B, Lieberman DA, McFarland B, Smith RA, Brooks D, et al. (2008) Screening and Surveillance for the Early Detection of Colorectal Cancer and Adenomatous Polyps, 2008: A Joint Guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. CA Cancer J Clin 58: 130–160.
  35. Rattay KT, Ramakrishnan M, Atkinson A, Gilson M, Drayton V (2009) Use of an Electronic Medical Record System to Support Primary Care Recommendations to Prevent, Identify, and Manage Childhood Obesity. Pediatrics 123: S100–S107.
  36. Bethell C, Read D, Goodman E, Johnson J, Besl J, et al. (2009) Consistently Inconsistent: A Snapshot Across and Within State Disparities in the Prevalence of Childhood Overweight and Obesity. Pediatrics 123: S277–S286.
  37. Wardle J, Carnell S, Haworth CM, Plomin R (2008) Evidence for a strong genetic influence on childhood adiposity despite the force of the obesogenic environment. Am J Clin Nutr 87: 398–404.
  38. Kraft P, Hunter DJ (2009) Genetic risk prediction - Are we there yet? N Engl J Med 360: 1701.
  39. Manco M, Dallapiccola B (2012) Genetics of pediatric obesity. Pediatrics 130(1): 123–33.
  40. Waxler JL, O’Brien KE, Delahanty LM, Meigs JB, Florez JC, et al. (2012) Genetic Counseling as a Tool for Type 2 Diabetes Prevention: A Genetic Counseling Framework for Common Polygenetic Disorders. J Genet Couns [Epub ahead of print]
  41. Bouwmeester W, Zuithoff NP, Mallett S, Geerlings MI, Vergouwe Y, et al. (2012) Reporting and methods in clinical prediction research: a systematic review. PLoS Med 9(5): e1001221.