Metrics of early childhood growth in recent epidemiological research: A scoping review

Metrics to quantify child growth vary across studies of the developmental origins of health and disease. We conducted a scoping review of child growth studies in which length/height, weight or body mass index (BMI) was measured at ≥ 2 time points. From a 10% random sample of eligible studies published between Jan 2010-Jun 2016, and all eligible studies from Oct 2015-June 2016, we classified growth metrics based on author-assigned labels (e.g., ‘weight gain’) and a ‘content signature’, a numeric code that summarized the metric’s conceptual and statistical properties. Heterogeneity was assessed by the number of unique content signatures, and label-to-content concordance. In 122 studies, we found 40 unique metrics of childhood growth. The most common approach to quantifying growth in length, weight or BMI was the calculation of each child’s change in z-score. Label-to-content discordance was common due to distinct content signatures carrying the same label, and because of instances in which the same content signature was assigned multiple different labels. In conclusion, the numerous distinct growth metrics and the lack of specificity in the application of metric labels challenge the integration of data and inferences from studies investigating the determinants or consequences of variations in childhood growth.


Introduction
There is substantial ongoing investment in research into the early life factors that influence the development of chronic diseases such as obesity and cardiovascular disease. Of particular interest is the hypothesis that a child's size at birth and the subsequent infant and early childhood growth (i.e., change in size over time) influence the risk of later metabolic and cardiovascular conditions [1]. Epidemiologic studies of the developmental origins of health and disease (DOHaD) hypothesis often rely on quantitative measures of early childhood growth that distinguish children with respect to their relative rates of growth (e.g., weight gain, length/height increases) during critical and sensitive windows of development. Studies are typically focused on growth as an exposure causing later childhood or adult conditions [2][3][4][5][6][7][8][9][10][11][12], or as an outcome caused by earlier factors [4][13][14][15][16][17]. The evidence that relatively slow versus fast growth in early life influences the risk of later health conditions has been conflicting [18]. Between-study inconsistencies may be largely due to differences in statistical methods, as exemplified by considering the variability among studies of the hypothesized association between early child growth and future blood pressure. First, a wide range of statistical models have been used to address the association between growth and other health outcomes; for example, a previous review of growth models demonstrated that different approaches (e.g., lifecourse plots and models versus latent growth curve models) can yield varying inferences regarding the association of growth with later systolic blood pressure [19].
Second, discrepancies in effect estimates can sometimes be attributed to subtle differences in the parameterization of repeated measures of size in regression models [20]; for example, regression models that adjust for current size [21,22] and those that condition on earlier measurements of size [23,24] yield different contrasts, but it can be challenging to reconcile the nuanced distinctions in the interpretations of the regression coefficients.
The varying definitions and statistical formulations used to quantify growth metrics may complicate efforts to integrate evidence across studies, particularly in the context of meta-analyses [4]. Recent reviews have narratively described the lack of a standardized approach for analyzing growth [25][26][27][28], yet no study to our knowledge has empirically characterized the extent of the definitional variation of growth in the recent published literature. Therefore, the specific objectives of this review were: 1) to generate an empirical framework for categorizing operational definitions of child growth, and 2) to use this framework to describe the range and frequency of metrics used to quantify early postnatal growth in recent epidemiological research.

Study inclusion and exclusion criteria
We conducted a scoping review to systematically summarize the variability in metrics of early childhood growth in recently published human growth research, following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) guidelines (S1 File) [29]. We sought to include peer-reviewed longitudinal studies published from January 2010 to June 2016 in which child growth was used as an exposure (independent) variable or outcome (dependent) variable and the analytical approach used ≥ 2 serial measures of length/height, weight or body mass index (BMI), with at least one measure taken in the period between birth and 5 years (up to and including 60 months of age). Multiple studies involving the same cohort were eligible for inclusion, as the metric of growth or age interval (i.e., timing of follow-up measures) can vary across published articles. We excluded: studies of animal growth, review articles or meta-analyses that did not present original individual-level analyses, studies involving only data simulations/mathematical models rather than empiric analyses of individual-level data, and studies that were published in any language other than English, as language is essential to data extraction and classification (attaching growth labels to definitions would be too complicated across multiple languages).

Search strategy
MEDLINE and Embase electronic databases were searched for relevant articles in June 2016. The search syntax included a comprehensive list of keywords, Medical Subject Headings (MeSH; MEDLINE) and Emtree terms (Embase) identifying the study design, participant age group, anthropometric measure, and growth metrics (S2 File).

Study selection
Study selection was conducted in two stages. The first stage consisted of title and abstract screening based on the eligibility criteria. Abstracts were excluded if they did not meet all the inclusion criteria. If there was insufficient data or if it was unclear from the title and abstract whether a study met the inclusion criteria, it was included in the full-text screening. In the second stage, we conducted full-text screening for 1) a 10% random sample of studies identified from the title and abstract screening, as it was unfeasible to full-text screen all articles given the large numbers identified in the electronic databases, and 2) all studies identified in the most recent 9 months of the search (October 2015 to June 2016) to ensure we did not miss any up-to-date strategies due to our random sampling approach.
For title/abstract and full-text screening, two reviewers independently screened each article using the web-based platform, COVIDENCE [30]. Any disagreements about inclusion/exclusion at the screening stage were flagged for a third reviewer to make the final decision on the eligibility of the study.
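The two-part selection for full-text screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article records, the `select_for_full_text` function, the fixed seed and the simulated pool are all hypothetical, and the sketch assumes that the 10% sample was drawn from the whole pool and then combined with every article from the most recent window.

```python
import random

def select_for_full_text(articles, fraction=0.10, recent_from=(2015, 10), seed=0):
    """Return ids of articles chosen for full-text screening.

    articles: list of dicts with an 'id' and a 'published' (year, month) tuple.
    A random fraction of the whole pool is combined with every article
    published on or after recent_from (hypothetical reading of the design).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    n_sample = round(len(articles) * fraction)
    sampled = {a["id"] for a in rng.sample(articles, n_sample)}
    recent = {a["id"] for a in articles if a["published"] >= recent_from}
    return sampled | recent

# Simulated pool: 200 articles spread over the 78 months from Jan 2010 to Jun 2016.
pool = []
for i in range(200):
    m = i % 78
    pool.append({"id": i, "published": (2010 + m // 12, m % 12 + 1)})

selected = select_for_full_text(pool)
print(len(selected))  # between 20 and 38, depending on overlap with recent articles
```

Taking the union means that an article from the recent window that also falls in the random sample is screened once, not twice.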

Data abstraction
Data from eligible studies were abstracted using a standardized data abstraction tool designed for this study. The tool captured the relevant information on key study characteristics and detailed information on all metrics used to estimate/describe growth based on at least two data points per child/group (although the tool can also accommodate metrics based on cross-sectional analyses) anywhere in the article, including metrics that were mentioned in the narrative yet for which results were not shown. The metric 'label' was considered to be the word or phrase used by the study authors to identify a particular growth metric (e.g., "weight gain", "length velocity"), where one metric could potentially have multiple labels. The metric 'content' consisted of the conceptual and statistical properties of the metric (i.e., the derivation/estimation, application and interpretation of the growth parameter), which we deconstructed into a 6-component (8-digit) 'content signature' (Fig 1): standardization (metric based on raw measurements or z-scores), level of analysis (individual or group as the unit of analysis), metric type (expressed as a continuous or categorical variable), quantity of data (minimum number of size measurements per individual/group that were used in the derivation of the growth metric), metric subtype (further classification of the manner by which the metric was quantified and expressed) and analytical approach (categorization, calculation or estimation method). As an example, Escribano et al. (2016) operationalized growth in weight as an incremental rate of change (grams per month) by taking the difference in unstandardized weight between birth and 6 months and dividing it by the duration of time, which they described as 'weight gain velocity' [31].
Using our framework (Fig 1), this metric would be classified as follows: '1-Raw' for standardization, '2-Individual' for level of analysis, '1-Continuous' for metric type, '2-2 data points' for quantity of data, '14-Incremental rate of change' for metric subtype and '11-Manual or simple calculation' for analytical approach. These component codes can then be concatenated to generate the content signature for this metric: '12121411'. More detailed descriptions of the 6-component framework can be found in S3 File.
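The concatenation of component codes into an 8-digit signature can be illustrated with a short Python sketch. The names `ContentSignature`, `encode`, `decode` and `incremental_rate` are hypothetical, as is the assumption that the first four components occupy one digit each and the last two occupy two digits each (with zero-padding if a code had a single digit); only the codes quoted in the example above come from the framework. The weights used for the rate calculation are invented for illustration.

```python
from typing import NamedTuple

# Assumed digit widths, in the order the components are listed in the text.
WIDTHS = (1, 1, 1, 1, 2, 2)

class ContentSignature(NamedTuple):
    standardization: int  # e.g., 1 = raw
    level: int            # e.g., 2 = individual
    metric_type: int      # e.g., 1 = continuous
    quantity: int         # e.g., 2 = two data points
    subtype: int          # e.g., 14 = incremental rate of change
    approach: int         # e.g., 11 = manual or simple calculation

def encode(sig: ContentSignature) -> str:
    """Concatenate the six component codes into an 8-digit string."""
    return "".join(f"{code:0{width}d}" for code, width in zip(sig, WIDTHS))

def decode(text: str) -> ContentSignature:
    """Split an 8-digit signature back into its six component codes."""
    if len(text) != sum(WIDTHS) or not text.isdigit():
        raise ValueError(f"not an 8-digit signature: {text!r}")
    codes, pos = [], 0
    for width in WIDTHS:
        codes.append(int(text[pos:pos + width]))
        pos += width
    return ContentSignature(*codes)

def incremental_rate(size_start: float, size_end: float, months: float) -> float:
    """Difference in size divided by elapsed time (e.g., grams per month)."""
    return (size_end - size_start) / months

weight_gain_velocity = ContentSignature(1, 2, 1, 2, 14, 11)
print(encode(weight_gain_velocity))           # 12121411
print(incremental_rate(3300.0, 7500.0, 6.0))  # 700.0 (hypothetical weights in grams)
```

Keeping the components as separate fields, and only concatenating them for display, makes it straightforward to group metrics by any single component.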
Two reviewers independently extracted data from each eligible article. Any disagreements were resolved through discussion between the two reviewers or further adjudication by a third reviewer. Data abstraction was implemented using REDCap [32], a customizable web-based data capture application.

Data analysis
We quantified the heterogeneity among growth metrics by comparing the relative frequency of use of each unique content signature in six strata defined by anthropometric parameter (length/height, weight or BMI) and whether the metric was used as an exposure or outcome variable in the growth analysis. In the stratified analyses, we also examined label-to-content concordance by constructing a matrix of the content signatures by the author-assigned labels. We also used the content signature components to construct decision trees to illustrate the most common approaches to growth analyses in the literature given the relative frequency of use of each unique metric based on content signatures only for a particular anthropometric parameter.
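As a rough illustration of these tabulations, the Python sketch below computes the relative frequency of each content signature within strata and cross-tabulates signatures by author-assigned labels. The records are invented (they reuse signatures and labels quoted elsewhere in this article but not the actual counts), the function names are hypothetical, and the label-to-content matrix is pooled across strata for brevity rather than built per stratum.

```python
from collections import Counter, defaultdict

# Hypothetical abstracted records: (parameter, role, content signature, author label).
records = [
    ("length", "exposure", "22121311", "change"),
    ("length", "exposure", "22121311", "gain"),
    ("length", "exposure", "22121311", "growth"),
    ("weight", "outcome", "12121411", "weight gain velocity"),
    ("BMI", "outcome", "22131417", "change"),
    ("BMI", "outcome", "22131417", "growth"),
    ("BMI", "outcome", "22121311", "change"),
]

def signature_frequencies(records):
    """Relative frequency of each signature within each (parameter, role) stratum."""
    counts = defaultdict(Counter)
    for parameter, role, signature, _label in records:
        counts[(parameter, role)][signature] += 1
    return {
        stratum: {sig: n / sum(c.values()) for sig, n in c.items()}
        for stratum, c in counts.items()
    }

def label_content_matrix(records):
    """Cross-tabulate signatures (rows) by author-assigned labels (columns)."""
    matrix = defaultdict(Counter)
    for _parameter, _role, signature, label in records:
        matrix[signature][label] += 1
    return matrix

def discordance(matrix):
    """Labels attached to more than one signature, and signatures with more than one label."""
    by_label = defaultdict(set)
    for signature, labels in matrix.items():
        for label in labels:
            by_label[label].add(signature)
    shared_labels = {lab: sorted(sigs) for lab, sigs in by_label.items() if len(sigs) > 1}
    multi_label = {sig: sorted(labs) for sig, labs in matrix.items() if len(labs) > 1}
    return shared_labels, multi_label

freqs = signature_frequencies(records)
shared, multi = discordance(label_content_matrix(records))
print(freqs[("BMI", "outcome")])
print(shared)
print(multi)
```

The two dictionaries returned by `discordance` correspond to the two forms of label-to-content discordance considered in this review: one label covering several signatures, and one signature carrying several labels.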

Results
Among 6477 articles retrieved from electronic databases, 122 studies of child growth were eligible and randomly selected for inclusion in the scoping review (Fig 2; S4 File for details). Most of the studies included in this review were cohort studies, conducted in the Americas or the European regions, and enrolled pregnant women or infants within the first month of life (Table 1).
In the 122 included articles, we identified a total of 235 early childhood growth metrics, among which there were 40 unique metrics based on content signatures (Fig 3). There was substantial overlap in the use of these 40 metrics across the three anthropometric parameters (length, weight, BMI) and between exposure and outcome variables. Of the 40 unique metrics, 3 were only used as exposure variables, 24 were only used as outcome variables, and 13 were used at least once as both exposure and outcome variables. Among 16 unique metrics used at least once as an exposure variable, 2 were used only for length, 3 only for weight, 3 only for BMI, and 8 for more than one anthropometric parameter. Among 37 unique metrics used at least once as an outcome variable, 3 were used only for length, 7 only for weight, 8 only for BMI, and 19 for more than one anthropometric parameter. All unique content signatures are listed in S5 File, and decision trees constructed on the basis of the content signature components are shown in Figs 4-6.
Label-to-content discordance was common, both because distinct signatures carried the same author-assigned label and because different authors assigned different labels to the same content signature (Tables 2 and 3). For example, the most common 8-digit signature for growth in length as an exposure (22121311; incremental change between 2 points) was labeled as 'change', 'gain', and 'growth' (Table 2). However, these same labels were also commonly used to refer to the second most common signature for growth in BMI as an outcome (22131417; child-specific rate of change estimated from a linear mixed model) (Table 3). Label-to-content matrices can be found in S6 File.

Discussion
In the present scoping review, we found that a diverse array of statistical metrics has been used in recent published literature to quantify early childhood growth. Metrics with simple derivations, such as the estimation of the incremental change in an anthropometric parameter between two time points, were much more commonly used than those that require either more complex statistical methods, such as latent class analysis, or a deeper understanding of the theoretical assumptions required to make inferences, such as conditional growth models.
Investigators in the field of human growth research are often aware of the nuances that influence the selection of particular analytical approaches that best suit the research question. However, we found that an explicit justification of the choice of approach (e.g., raw BMI "is more appropriate for analyzing change over time" [33], or the use of the Berkey-Reed 1st-order model to "reflect the actual pattern of change that child health practitioners will observe" [34]) was the exception rather than the norm. That is, specific growth model selection was not well justified with a narrative rationale, even if investigators selected a suitable analytical approach to address their research question (e.g., the use of conditional growth models to investigate independent associations between consecutive growth periods and later health outcomes).
In principle, the use of distinct metrics or statistical approaches to growth analyses may partly explain the inconsistent findings relating child growth to later health and economic outcomes. For example, adjusting for size at the beginning of the growth interval in a regression model and using the classical formulation of 'conditional growth' [23][35][36][37] (two approaches that are algebraically interchangeable) both yield estimates of growth effects that are conditional on baseline size; therefore, inferences from these models are expected to differ from analyses of the same growth-outcome association in which there was no adjustment for baseline or earlier size [38]. Conversely, many growth metrics that appear superficially distinct (and were assigned different content signatures in our analysis) are in fact interchangeable reparameterizations of the longitudinal data, such that the between-child variance in growth may be similarly captured by the different approaches. For example, inferences from an incremental change over time calculated manually may be similar to a child-specific slope derived using a linear mixed effects model.
We found wide variability in label and content signature combinations. Many content signatures were associated with the same label, and there were also instances in which the same content signature was assigned multiple different labels (S6 File). Many commonly used generic labels (e.g., growth, gain) are suitably applied to a range of metrics, but lack precision. The term 'velocity' was widely used with highly variable meanings. For example, a conventional use of the term refers to a 'rate of change', implying a denominator that represents a time interval; however, in other cases it was used to describe changes in z-scores. Since z-scores are centered on zero rather than being consistently positive, the term 'velocity' may be less intuitive when used to quantify the extent to which a child's growth curve deviates from the trajectory predicted on the basis of a population growth reference/standard.
The discordance between metric labels and their statistical formulations poses a particular methodological challenge for systematic reviews and meta-analyses, in which the terms used in a search strategy may not fully capture the true scope of the relevant literature. For instance, a systematic review summarizing the effect of probiotics on child growth only used the terms 'growth' and 'stunt' in their search strategy [39], while another review assessing factors associated with accelerated growth in childhood only used the terms 'catch-up' and 'rapid weight gain' [40]. The limited range of search terms used by investigators may bias the selection of studies for inclusion, and therefore may have implications for evidence synthesis.
Table 2. Common content signatures and their associated author-specified labels for growth as an exposure, by anthropometric parameter. Columns: Parameter; n/N (%); Signature description; Author-specified labels.
Table 3. Common content signatures and their associated author-specified labels for growth as an outcome, by anthropometric parameter. Columns: Parameter; n/N (%); Signature description; Author-specified labels.
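The contrast between growth that is conditional on baseline size and a simple unadjusted change can be made concrete with a small Python sketch. It assumes the commonly described form of conditional growth, in which the metric is the residual from a regression of later size on earlier size; the function names and the z-score values are hypothetical and chosen only for illustration.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def conditional_growth(z_early, z_late):
    """Residual of later size after regressing it on earlier size."""
    intercept, slope = ols(z_early, z_late)
    return [zl - (intercept + slope * ze) for ze, zl in zip(z_early, z_late)]

def covariance(x, y):
    """Population covariance, used here only to check association with baseline."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

# Hypothetical weight-for-age z-scores for five children at birth and 6 months.
z_birth = [-1.2, -0.5, 0.0, 0.4, 1.1]
z_6mo = [-0.6, -0.7, 0.3, 0.2, 1.5]

conditional = conditional_growth(z_birth, z_6mo)
simple_change = [late - early for early, late in zip(z_birth, z_6mo)]

print(covariance(z_birth, conditional))    # approximately zero by construction
print(covariance(z_birth, simple_change))  # generally non-zero
```

By construction the residuals are uncorrelated with baseline size, whereas the simple change generally is not, which is one reason the two metrics can support different inferences about the same growth-outcome association.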
The content signatures that we designed to classify growth metrics in this review may be used to formulate decision trees to inform investigators of the most common approaches to growth analyses in the literature, given the particular anthropometric parameter of interest and the data available for analysis (Figs 4-6). For example, in a study to investigate early growth in length as a risk factor for a future health outcome, such as blood pressure in mid-childhood, whereby length was assessed at only two time points across the age interval of interest, it may be instructive for investigators to know that most previous studies with the same data structure standardized the anthropometric parameter and then calculated an incremental change (Fig 4). Alternatively, in a study to examine growth in weight in relation to a set of antecedent risk factors, in which weight is assessed at more than two time points in the age interval of interest, the most commonly used approach among previous studies was to estimate the incremental rate of change in the unstandardized anthropometric parameter using a linear mixed effects model (Fig 5). The routine reporting of analyses using the most common metric, if it appropriately addresses the question of interest, may promote more straightforward comparisons and synthesis of results across studies, even if authors additionally report other less common or novel analytical approaches that they consider to be particularly suited to their research question or study design. However, we also suggest that authors be as explicit as possible with regard to their research question and provide suitable justification for their specific choice of growth metric and modeling approach, so that the coherent fit of the modeling approach to the research question is apparent.
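The first scenario above (standardize, then take an incremental change between two time points) can be sketched as follows. The reference means and standard deviations are invented placeholders, not values from any published growth standard, and the function names are hypothetical; published references use age- and sex-specific parameters and may apply skewness-adjusted formulas rather than the simple standardization shown here.

```python
def z_score(value, ref_mean, ref_sd):
    """Standardize a measurement against a reference mean and SD (simplified)."""
    return (value - ref_mean) / ref_sd

def z_score_change(first, ref_first, second, ref_second):
    """Incremental change in z-score between two time points for one child."""
    return z_score(second, *ref_second) - z_score(first, *ref_first)

# Hypothetical child: length 49.0 cm at birth and 67.0 cm at 6 months.
# Hypothetical reference (mean, SD) at each age; NOT real reference values.
ref_birth = (50.0, 2.0)
ref_6mo = (66.0, 2.0)

delta_z = z_score_change(49.0, ref_birth, 67.0, ref_6mo)
print(delta_z)  # 1.0: the child moved from z = -0.5 to z = +0.5
```

A metric of this kind is standardized, individual-level, continuous, based on two data points and obtained by simple calculation, which appears to correspond to the signature 22121311 described in the Results.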
Several limitations of the review should be acknowledged. First, we may have missed recent or relatively uncommon methods used to analyze early childhood growth because of our sampling approach; it was not feasible to screen and abstract data from all published articles given the large numbers identified in the electronic databases and the laborious process of extracting and classifying each metric. For example, the SITAR method [41][42][43] and WHO velocity charts [44] have been used in some recent studies, but because they are used infrequently, they did not appear in our random selection of studies. However, since our pool of studies comprises a random sample, we considered it to be a fair representation of the variability with which researchers currently operationalize and quantify growth. Another weakness of the review was our focus on growth in length, weight and BMI only, yet there are numerous other anthropometric parameters that may be relevant to human growth research (e.g., head circumference, weight-for-length, ponderal index, skin-fold thickness), for which there may be different content signatures. Thus, we may have underestimated the true heterogeneity in growth metrics in the recent literature. Finally, our classification of growth metrics was based on six components; there may be other relevant components that we did not incorporate in our analysis that would further differentiate metrics.
In summary, this scoping review was not designed to identify a set of ideal metrics to summarize growth, as the choice of growth model is contingent on both available data and the specific research question. However, our findings indicate the need for greater consensus on standardized approaches to summarizing growth for specific questions of interest. Variations in growth metrics complicate comparisons of findings across studies, and discordance between metric labels and their statistical formulation further challenges the integration of inferences. We conclude that the implications of child growth metric heterogeneity should be considered when aggregating and/or designing studies of the causal determinants or consequences of variations in early childhood growth.