Utilizing Longitudinal Measures of Fetal Growth to Create a Standard Method to Assess the Impacts of Maternal Disease and Environmental Exposure

Impaired or suboptimal fetal growth is associated with an increased risk of perinatal morbidity and mortality. By utilizing readily available clinical data on the relative size of the fetus at multiple points in pregnancy, including delivery, future epidemiological research can improve our understanding of the impacts of maternal, fetal, and environmental factors on fetal growth at different windows during pregnancy. This study presents mean and standard deviation ultrasound measurements from a clinically representative US population that can be utilized for creating Z-scores to this end. Between 2006 and 2012, 18, 904 non-anomalous pregnancies that received prenatal care, first and second trimester ultrasound evaluations, and ultimately delivered singleton newborns at Brigham and Women’s hospital in Boston were used to create the standard population. To illustrate the utility of this standard, we created Z-scores for ultrasound and delivery measurements for a cohort study population and examined associations with factors known to be associated with fetal growth. In addition to cross-sectional regression models, we created linear mixed models and generalized additive mixed models to illustrate how these scores can be utilized longitudinally and for the identification of windows of susceptibility. After adjustment for a priori confounders, maternal BMI was positively associated with increased fetal size beginning in the second trimester in cross-sectional models. Female infants and maternal smoking were associated with consistently reduced fetal size in the longitudinal models. Maternal age had a non-significant association with increased size in the first trimester that was attenuated as gestation progressed. As the growth measurements examined here are widely available in contemporary obstetrical practice, these data may be abstracted from medical records by investigators and standardized with the population means presented here. This will enable easy extension of clinical data to epidemiologic studies investigating novel maternal, fetal, and environmental factors that may impact fetal growth.


Introduction
Impaired or suboptimal fetal growth is associated with an increased risk of perinatal morbidity and mortality as well as an increased risk of later life conditions including type II diabetes, hypertension, obesity and impaired neurodevelopment [1][2][3][4][5]. The epidemiologic evaluation of predictors of fetal growth has typically involved the assessment of birth weight alone, birth weight for gestational age at delivery, and in some cases the use of other anthropometric measures, such as infant head circumference at birth. However, these studies do not adequately consider the time points during pregnancy when growth is slowed or affected. Identifying these specific windows has the potential to improve the understanding of how both endogenous and exogenous factors potentially influence fetal development, which could in turn inform intervention and prevention strategies.
A second major issue with studies of fetal growth is that they are frequently underpowered, though we acknowledge smaller studies may be unavoidable if the study aims require examination of expensive biomarkers or similar cost prohibitive tests. These limitations, however, are not insurmountable.
In the present paper, we present mean and standard deviations of ultrasound measurements and birth weight from an unbiased and clinically representative population of pregnant women who delivered at Brigham and Women's Hospital (BWH) in Boston, Massachusetts, USA, from 2006-2012. Measures from this standard population can be utilized in epidemiologic studies of longitudinal fetal growth, and may be particularly useful for researchers with populations in the US. We also illustrate the epidemiologic utility of these data by creating Z-scores for measures from a cohort study population (N = 868) and examining the effects of maternal age, maternal BMI, fetal gender, and maternal tobacco use during pregnancy on fetal growth across gestation.

The standard population
The American College of Obstetricians and Gynecologists' (AJOG) current standards for prenatal care suggest that all women undergo ultrasound evaluations both at the end of the first trimester, for the estimation of aneuploidy risk, and then again at the mid-point of the second trimester to assess fetal anatomy [6]. Information abstracted from these measurements, therefore, represents an unbiased source of data since nearly all women are required to undergo these exams.
Between 2006 and 2012 there were 18, 904 non-anomalous pregnancies which received prenatal care, first and second trimester ultrasound evaluations, and ultimately delivered singleton newborns at BWH. The only requirements for inclusion in this analysis were: 1) a live singleton birth; and 2) that the delivery occurred at BWH. (Note: this differs from the cohort study population, which required that women receive their prenatal care at the BWH academic practice.) Crown rump length (CRL) was abstracted from the first trimester aneuploidy screening ultrasound. Subsequently, occipito-frontal diameter, head circumference, abdominal circumference, biparietal diameter, and femur length were abstracted from the second trimester morphology survey. Infant weight at delivery was abstracted from the electronic delivery record. All ultrasound measurements were made by faculty of the Radiology and Maternal-Fetal Medicine Departments at BWH, who are all experienced sonologists with active Society of Maternal-Fetal Medicine certification. Patients with more than one qualifying pregnancy between 2006 and 2012 contributed data from a gestation chosen at random. Dating was established in a two-step hierarchical fashion consistent with current AJOG recommendations [6]. First, in cases of in vitro fertilization (IVF) the estimated date of confinement (EDC) was fixed based on the known date of conception. Second, last menstrual period (LMP) was used if it agreed with the first trimester ultrasound (i.e., if dating by LMP differed by no more than 8% from that determined by ultrasound). If the LMP was unknown or was inconsistent with the first trimester ultrasound, the first trimester ultrasound estimate was used for gestational dating.
The crown rump lengths from the standard population were stratified by the week of measurement for weeks 9 through 12. Fetal anthropometric measurements from the second trimester were stratified for weeks [16][17][18][19][20][21][22]. These measurements included: abdominal circumference (mm), biparietal diameter (mm), femur length (mm), occipitofrontal diameter (mm), and head circumference (mm). In addition to individual measurements, we also calculated estimated fetal size (g) at the second trimester using the formula of Hadlock, which combines biparietal diameter, abdominal circumference, and femur length [7]. Birth weight (g) was stratified by week of gestation at delivery (weeks 23 to 42). Means and standard deviations were calculated for each metric for the relevant weeks of gestation ( Table 1).

The cohort study population
The study population was derived from a pregnancy cohort, begun in 2006, that prospectively enrolls women seeking care at the faculty, midwifery, or resident practices of BWH. Women are enrolled early in the first trimester (median 10 weeks gestation) and followed until delivery. Women are eligible for participation in the study if they are at least 18 years of age, seek prenatal care prior to 15 weeks gestation, and plan to deliver at the BWH hospital. Thus, this sample is a subset of the standard population; however we ensured that none of the cohort study participants were included in the standard population.
At enrollment, each woman completes a survey to provide demographic, socioeconomic, and behavioral information. At delivery, medical history and events during the pregnancy were abstracted from medical records. CRL, second trimester ultrasound measurements, and birth weight were also abstracted from the clinical record for each patient in the cohort. Estimated due date was established using the same hierarchy as described above. Body mass index (BMI; kg/m 2 ) was calculated based on height and weight at the first prenatal visit. Maternal age was also taken from the time of the first prenatal visit. Maternal smoking status was positive (dichotomized as Yes/No) if the patient reported any cigarette use that persisted throughout pregnancy. Only singleton non-anomalous pregnancies were included in this analysis. The study protocol was approved by institutional review board at BWH, and written informed consent was obtained from all participating women.

Z-score calculations
Using means and standard deviations from the Standard Population stratified by gestational week, Z-scores were calculated for CRL (Observation Time 1), estimated fetal size (Observation Time 2) and birth weight (Observation Time 3) for each subject in the Cohort Study Population. The calculation was made in the following fashion: where Observed refers to the metric under consideration at week i for the study subject, sMean refers to the Standard Population mean for that metric at week i, and sStandard Deviation refers to the standard deviation of the metric at week i from the Standard Population. We were thus able to create a longitudinal series of size Z-scores for our Cohort Study Population corresponding to the Observation Times 1 to 3.

Statistical analysis
First, we performed cross-sectional analysis to examine the relationship between Z-scored fetal growth measurements at each Observation Time 1 (1 st trimester ultrasound), 2 (2 nd trimester ultrasound), and 3 (delivery) in association with maternal factors of interest, including maternal age, BMI, fetal gender, and maternal smoking status. Linear regression models were additionally adjusted for maternal race/ethnicity, health insurance provider, alcohol use during pregnancy, use of IVF for conception, gestational age at growth measurement, and diagnosis of preeclampsia in current pregnancy a priori. Second, we performed a longitudinal analysis, examining fetal growth Z-scores from each observation time in the same model. We combined Z-scores of CRL (Observation Time 1), estimated fetal weight (Observation Time 2), and birth weight (Observation Time 3) into one vector for this analysis. Using a generalized least squares model with unstructured residual correlation, we regressed the Z-score vector on gestational age at ultrasound and the variables examined in the cross-sectional analysis. Z-score vector and gestational age at ultrasound were treated as time-varying factors and covariates were examined as fixed effects. The model implemented was as follows: where y ij is Z-scored fetal growth measurement for subject i at time j, x 1ij is gestational age (centered at 20 weeks) for subject i at time j, x 2i ,x 3i ,x 4i , and x 5i are maternal age, BMI, fetal gender, and maternal smoking status for subject i, respectively, and z i is the vector of the additional covariates for subject i. The error term ε i is defined as shown with a multivariate normal distribution and an unstructured correlation matrix for the within-subject correlation of the repeated Z-score measures on the same subject (S(α)). In subsequent models we examined interaction terms between either fetal gender or maternal smoking during pregnancy and gestational age at the time of measurement. Gestational age was centered at 20 weeks for interpretability of interaction terms. Third, to explore potential non-linear association between measures of fetal growth and other covariates of interest (including gestational time) we used a generalized additive mixed model (GAMM) with an unstructured correlation structure. The intent of this analysis is to examine the relationship between gestational age and fetal growth in a non-linear fashion with potential effect modifiers maternal age, maternal BMI, fetal gender, or maternal smoking status, so that associations can be examined visually and sensitive time points when fetal growth measurements are most strongly impacted can be identified. Three possible interaction structures between gestational age and these covariates (nonlinear interaction, linear interaction, and no interaction) were examined and the best-fit structure based on Bayesian information criterion was incorporated into the final GAMM. The model with nonlinear interaction terms was defined as: where g 1 (.),g 2 (.),g 3 (.),g 4 (.), and g 5 (.) are smooth functions for the relationship between the gestational age and each of the non time-varying covariates of interest. The random effects were defined the same as in the marginal model with an unstructured correlation matrix. All statistical analyses were performed using R version 3.0.2 (R Foundation for Statistical Computing, Vienna, Austria) and SAS version 9.2 (SAS Institute Inc., Cary, NC). P-values less than 0.05 were deemed statistically significant.

The standard population
Characteristics of the Standard Population (N = 18,904) are representative of singleton patients seeking care at BWH over this interval. Mean maternal age was 32 years. Primiparous births represented 47% of the population. Over half of the population (55%) self-reported as White, while approximately 17%, 15%, and 10% self-reported as African-American, Hispanic, or Asian, respectively. Means and standard deviations from the Standard Population stratified by gestational week are presented in Table 1, where Observation Time 1 (weeks 9-13) presents CRL, Observation Time 2 (weeks 14-22) presents individual and summed (i.e., estimated fetal weight) anthropometric measurements, and Observation Time 3 (weeks 23-42) presents birth weight only. We additionally calculated means and standard deviation of anthropometric measurements from ultrasound scans taken between weeks 23-42. Because these scans were taken outside the clinically proscribed window for second trimester assessment of fetal anatomy, these values are presented as a supplement (S1 Table). While these measurements are more likely to be from pregnancies with suspected complications, they still may be of utility as a reference.

The cohort study population
Characteristics of the Cohort Study Population (N = 868) are presented in Table 2. The Cohort Study Population was largely white (61.8%), had attained greater than a high school level education (85.1%), and were similar in age to the Standard Population (mean age 32.1 years). Among the Cohort Study Population subjects, 26 (3.0%) indicated they smoked during pregnancy and 435 (50.1%) delivered male infants. All of participants included in the Cohort Study Population had growth measurements available for each of the three Observation Time points. Table 3 demonstrates the cross-sectional associations between pregnancy characteristics of interest (maternal age, BMI, smoking status, and infant gender) Z-scores at each of the observation times. Models were adjusted a priori for variables known to influence fetal size, including maternal race/ethnicity, health insurance provider, alcohol use during pregnancy, use of IVF for conception, gestational age at observation time, and diagnosis of preeclampsia in current pregnancy. At the first trimester measurement (Observation Time 1), no significant associations were observed between the characteristics of interest and raw or Z-scored growth measurements, although maternal age was suggestively associated (p = 0.09) with the Z-scored measure (one year increase in maternal age associated with 0.018 mm increase in crown rump length). At the second trimester measurement (Observation Time 2) maternal BMI was positively associated with both raw (one kg/m 2 increase associated with 1.56 gram increase in estimated fetal size) and Z-score (one kg/m 2 increase associated with 0.017 increase in estimated fetal weight) measures, and female infants had significantly lower raw and Z-score growth indicators as well. At delivery (Observation Time 3), infant gender and smoking were inversely associated with raw as well as Z-score growth measures, and maternal pre-pregnancy BMI was positively associated with the Z-score growth measure.

Longitudinal analysis: linear interaction
Results from the linear longitudinal analysis using the stacked Z-score fetal growth measures are presented in Table 4. Linear mixed models were adjusted for the same set of covariates as were included in linear regression models. Maternal age or maternal BMI did not show any interaction with gestational age. Model 1 included an interaction between infant gender and gestational age at Observation Time and Model 2 included an interaction between maternal smoking and gestational age at Observation Time. A suggestive (p = 0.11) interaction was observed between infant gender and gestational age in Model 1, indicating that female infants had slightly lower Z-score fetal growth profiles across the entire gestation compared to males. In Model 2, the interaction between maternal smoking and gestational age was statistically significant (p = 0.002), indicating that fetuses of mothers who smoked during pregnancy had significantly lower Z-score growth profiles across all points of gestation compared to mothers who did not smoke. Notably, these results indicate significant linear differences in growth across pregnancy in these groups.

Longitudinal analysis: non-linear interaction
To examine potential non-linear differences in growth trajectories by maternal smoking and fetal gender, we created generalized additive models. Again, covariates were consistent with those included in linear regression and linear mixed models. Model 1 included an interaction term between infant gender and gestational age at Observation Time, and Model 2 included an interaction term between maternal smoking and gestational age at Observation Time. Predicted values for fetal growth Z-score in association with gestational age by fetal gender and maternal smoking status are presented in Fig 1. For fetal gender and maternal smoking status, differences in fetal growth Z-scores across groups appeared to be non-linear. Male and female Table 3. Cross-sectional associations between pregnancy characteristics and fetal growth measures at Observation Times 1-3. Growth measurements are modeled both as A) Raw growth measurements, and B) Z-scored growth measurements to the Standard Population. fetuses were similar early in pregnancy but as gestation progressed decreased growth in females became apparent (p for interaction = 0.003). Fetuses of smokers had slightly higher fetal growth Z-scores early in gestation, but toward the end of gestation fetal growth scores dropped to far below those from non-smoking mothers. Smaller p-values for GAM compared to linear mixed model interaction terms indicate the improved ability for the non-linear model to detect significant differences in growth Z-score trajectories between groups.

Discussion
In this study we present means and standard deviations of first and second trimester ultrasound estimates of fetal size and corresponding birth weights in pregnancies ending in singleton live births from a large contemporary North American clinical service. We suggest that these can be utilized by researchers who have an interest in leveraging data that are automatically available in the clinical setting, because of their routine inclusion in prenatal care. These 1 st and 2 nd trimester ultrasound as well as birth weight measurements can be abstracted from medical records and standardized to Z-scores using the means and standard deviations presented here. Thus, these clinical data can be utilized for investigation of research questions pertaining to maternal, fetal, or environmental factors that may influence fetal growth. Additionally, as these measures are collected at multiple time points, they may be used for identification of windows of susceptibility during pregnancy when these factors may be impacting growth. We illustrate how these data may be utilized by presenting associations with maternal age, BMI, and smoking status, as well as fetal gender, in a Boston birth cohort with growth measurements standardized to this population. Several other groups have performed large studies to collect ultrasound and birthweight data longitudinally for the creation of reference populations for standardization purposes. One example is from the INTERGROWTH-21 st project, designed specifically to create international reference curves for fetuses [8]. The standards were created from ultrasound scans and birth measurements taken from 4,321 pregnancies in 8 countries, and had very stringent exclusion criteria so that the population would represent healthy pregnancies with very low risk of adverse outcomes. Another example is the Generation R study (Rotterdam, the Netherlands) [9], a prospective birth cohort of 8,800 pregnancies. Our reference population differs from these notably because it represents a clinically representative population of patients receiving routine care, rather than subjects recruited and willing to participate in a research cohort. On a similar note, scans from our reference population are not selected to represent only healthy pregnancies (as in the INTERGROWTH-21 st project) and are thus more representative of realistic fetal growth. This difference means that Z-scores calculated from our standard will be less precise, and calculations of differences will have lower power. However, using this more heterogeneous population will provide better control for Type I error; i.e., differences in Z-scores detected will be less likely to be spurious. Another difference and advantage of our population is that it is specifically representative of growth in the US which may differ slightly from European populations. To our knowledge, this study is the largest to present ultrasound scans longitudinally across pregnancy to date in the US.
Recalling that Observation Time 1 (the ultrasound evaluation for CRL) corresponds to the end of the first trimester, Time 2 to the latter portion of the second trimester when the fetal survey is performed and Time 3 to delivery, our analysis indicates a variety of growth responses to the several exposures and conditions we have examined. In cross-sectional analysis of an inherently maternal condition (BMI), increased maternal BMI was significantly associated with increased estimated fetal size (Time 2) during the second trimester and birth weight, but not CRL (Time 1) in the first trimester. Longitudinal modeling did not support any alteration of this effect uniformly over time. Similar associations have been found by past researchers for birth weight and fetal growth parameters later in gestation [10][11][12][13][14].
When assessing an inherent fetal condition (fetal gender) we found differences in measurements taken in utero as well as in the gender specific pattern in birth weight that has been extensively supported in past literature [15][16][17][18][19]. This difference appeared to develop between 16 to 22 weeks of gestation but was absent in the initial CRL measurement. Longitudinal modeling suggests that this association may also be enhanced with increased gestational age, since this pattern follows a non-linear relationship.
An exogenous source of exposure, maternal cigarette smoking, demonstrates a different pattern with regard to fetal growth. In our cross-sectional analysis the findings that decreases in birth weight, but not estimated fetal size or CRL, are related to maternal cigarette smoking during pregnancy parallel findings in multiple prior studies [20][21][22][23]. We also show, in longitudinal models, a consistent negative interaction term with gestational age and cigarette smoking. This finding suggests that with increased gestational age, an increasingly stronger adverse association with maternal smoking is observed. Past studies that have assessed ultrasound-based growth parameters at varying time points during pregnancy find significant adverse effects with cigarette smoking clustered more consistently later in pregnancy as opposed to earlier [24][25]. It has been hypothesized that the reduction in birth weight or late pregnancy fetal growth measured associated with smoking may be related to developmental adaptations in placental vasculature and/or fetal arterial resistance [25][26].
Knowing the timing of an effect on fetal growth offers insights into possible underlying mechanisms. Lin & Santolaya-Forgas divide fetal cell growth into three phases [27]. The initial phase broadly corresponds with the embryonic period and extends to 16 weeks. This phase is characterized by cellular hyperplasia and involves a rapid increase in overall cell number. The second phase ranges from 16 to 32 weeks and consists of continued cellular hyperplasia but now with increasing hypertrophy of cellular size. The final phase occurs after 32 weeks when cellular numbers are broadly set and hypertrophy of cellular size increases. It is during this final phase that an increase in cellular glycogen and fetal fat deposition occurs. The rate of fetal growth increases with each of these phases from 5g per day at 15 weeks, 20g per day at 24 weeks to 35g per day at 34 weeks [28]. Different fetal and maternal characteristics or exogenous exposures are unlikely to affect these growth phases equivalently. Examination between such characteristics or exposures and fetal Z-scores will indicate the portion of pregnancy where the effect manifests and thus indicate a possible mechanism. The adverse effects of exposure to cigarette smoke manifest in later pregnancy, suggesting that cellular hypertrophy and possibly glycogen deposition is preferentially affected. Male gender begins to exert a positive effect on growth at a point in gestation where cellular hypertrophy becomes increasingly important. A similar pattern is associated with increasing maternal BMI suggesting, again, an effect on cellular hypertrophy with less of an effect cellular hyperplasia.
The limitations of this analysis should be considered within the context of its methodology. As mentioned above, our study population includes all pregnancies and is not restricted to those that are low-risk. We suggest that this perceived limitation may actually represent an underlying strength to the design, since the data is clinically relevant and for statistical analyses will reduce the likelihood of Type I error. Further, the nature of the reference population, so long as it is broadly representative of general clinical experience, is less of a defining factor with regard to its utility. The reference population is essentially a 'measuring stick' for comparisons and ordering of strata within the observation population of interest. Regardless of height being measured in inches or centimeters, a valid comparison and ranking of short versus tall can still be achieved. Our standard population may also be limited by inclusion of only the women who receive a first trimester ultrasound; however we expect that this would be true for only a small number of pregnancies and for the aforementioned reasons this would not impact the use of our standard population for comparison purposes in statistical analysis.
Another potential limitation to our study includes our use of the Hadlock formula for the creation of a metric of fetal size between 16 and 22 weeks, which can be criticized since the formula was not intended for use in that gestational age range. However, we used this as a means of summarizing the individual measurements routinely gathered at that exam into a single metric. While the formula may only imperfectly relate to actual weight at this gestation age range, we suggest that the rank order of fetal size is unlikely to be affected and thus its utility as a reference for comparison is intact. This analysis has a number of characteristics that may be useful in epidemiological research. The metrics used are, under contemporary obstetrical practice, widely available and constitute standard obstetrical practice. In addition, these metrics can be obtained through a retrospective chart review instead of a more time-consuming prospective data collection. Finally, by allowing a longitudinal examination of fetal size from the first trimester onward, the stage of gestation when differences in growth occur can be determined. More specifically we demonstrate, and parallel contemporary research, that maternal BMI is positively associated with fetal size longitudinally while both fetal gender and maternal smoking are inversely associated with fetal size in a longitudinal manner. Maternal age was not found to influence fetal size in our analysis. In conclusion using available clinical data and in comparison to this reference population, epidemiological studies can examine maternal, fetal, and environmental factors in a way that help to inform mechanistic pathways.
Supporting Information S1 Table. Means and standard deviations (SD) of ultrasound parameters outside the clinically proscribed window of gestation.