The World Health Organization Fetal Growth Charts: A Multinational Longitudinal Study of Ultrasound Biometric Measurements and Estimated Fetal Weight

Background
Perinatal mortality and morbidity continue to be major global health challenges strongly associated with prematurity and reduced fetal growth, an issue of further interest given the mounting evidence that fetal growth in general is linked to degrees of risk of common noncommunicable diseases in adulthood. Against this background, WHO made it a high priority to provide the present fetal growth charts for estimated fetal weight (EFW) and common ultrasound biometric measurements intended for worldwide use.

Methods and Findings
We conducted a multinational prospective observational longitudinal study of fetal growth in low-risk singleton pregnancies of women of high or middle socioeconomic status and without known environmental constraints on fetal growth. Centers in ten countries (Argentina, Brazil, Democratic Republic of the Congo, Denmark, Egypt, France, Germany, India, Norway, and Thailand) recruited participants who had reliable information on last menstrual period and gestational age confirmed by crown–rump length measured at 8–13 wk of gestation. Participants had anthropometric and nutritional assessments and seven scheduled ultrasound examinations during pregnancy. Fifty-two participants withdrew consent, and 1,387 participated in the study. At study entry, median maternal age was 28 y (interquartile range [IQR] 25–31), median height was 162 cm (IQR 157–168), median weight was 61 kg (IQR 55–68), 58% of the women were nulliparous, and median daily caloric intake was 1,840 cal (IQR 1,487–2,222). The median pregnancy duration was 39 wk (IQR 38–40) although there were significant differences between countries, the largest difference being 12 d (95% CI 8–16). The median birthweight was 3,300 g (IQR 2,980–3,615). There were differences in birthweight between countries, e.g., India had significantly smaller neonates than the other countries, even after adjusting for gestational age. Thirty-one women had a miscarriage, and three fetuses had intrauterine death. The 8,203 sets of ultrasound measurements were scrutinized for outliers and leverage points, and those measurements taken at 14 to 40 wk were selected for analysis. A total of 7,924 sets of ultrasound measurements were analyzed by quantile regression to establish longitudinal reference intervals for fetal head circumference, biparietal diameter, humerus length, abdominal circumference, femur length and its ratio with head circumference and with biparietal diameter, and EFW. There was asymmetric distribution of growth of EFW: a slightly wider distribution among the lower percentiles during early weeks shifted to a notably expanded distribution of the higher percentiles in late pregnancy. Male fetuses were larger than female fetuses as measured by EFW, but the disparity was smaller in the lower quantiles of the distribution (3.5%) and larger in the upper quantiles (4.5%). Maternal age and maternal height were associated with a positive effect on EFW, particularly in the lower tail of the distribution, of the order of 2% to 3% for each additional 10 y of age of the mother and 1% to 2% for each additional 10 cm of height. Maternal weight was associated with a small positive effect on EFW, especially in the higher tail of the distribution, of the order of 1.0% to 1.5% for each additional 10 kg of bodyweight of the mother. Parous women had heavier fetuses than nulliparous women, with the disparity being greater in the lower quantiles of the distribution, of the order of 1% to 1.5%, and diminishing in the upper quantiles. There were also significant differences in growth of EFW between countries. In spite of the multinational nature of the study, sample size is a limiting factor for generalization of the charts.

Conclusions
This study provides WHO fetal growth charts for EFW and common ultrasound biometric measurements, and shows variation between different parts of the world.

Author Summary

Why Was This Study Done?
• Small size at birth is associated with perinatal mortality, child morbidity, and adult health risks, all major global health challenges prioritized by the World Health Organization.
• Ultrasound estimation of fetal weight before birth is today very widely used in clinical practice, and, while essential for the identification and management of high-risk pregnancies, the current reference ranges used worldwide are largely based on single populations from a few high-income countries and are therefore of uncertain general applicability.
• WHO therefore requested new fetal growth charts based on multiple populations to be made available for general use and at the same time provide a foundation for the growing initiative to prevent noncommunicable diseases and promote a healthy life course starting before birth.

What Did the Researchers Do and Find?
• In all, 1,387 healthy women with low-risk pregnancies and unconstrained nutritional and social background from ten countries in Africa, Asia, Europe, and South America were included in a longitudinal study of fetal growth.
• During pregnancy, repeated ultrasound measurements were used to establish international fetal growth charts for head and abdominal circumference, length of the thigh bone, and fetal weight, estimated using a combination of the three measurements.
• Fetal growth showed considerable natural variation, differing significantly between countries. Growth was to a small extent influenced by maternal age, height, weight, and parity, and by fetal sex.
• Similarly, birthweight varied significantly between countries, even after adjustment for differences in the length of pregnancy.

What Do These Findings Mean?
• We suggest that these WHO charts for growth in estimated fetal weight are more suitable for international use than those commonly applied today. However, the differences between countries, with maternal factors, and with fetal sex mean that these growth charts may need to be adjusted for local clinical use to increase their diagnostic and predictive performance.
• The considerable variation in fetal growth and birthweight which occurs even under optimal conditions, and which is not explicable in terms of maternal and population factors, may suggest, first, that such natural variation in offspring size is a collective adaptive strategy that has proved extremely successful from an evolutionary point of view and, second, that major determinants of variation in human development before birth are still to be determined.
• Although the present study encompasses ten countries, it still represents only a small selection when the substantial anthropometric variations existing even within continents are taken into account.

Introduction
Global mortality for children under age 5 y halved from 90 to 43 deaths per 1,000 live births between 1990 and 2015. This is the result of a tremendous global effort to achieve the UN Millennium Development Goals [1] and the goals of the UN Secretary-General's Every Woman Every Child initiative [2]. Neonatal mortality in the first 28 d declined (by 47%) from 5.0 to 2.6 million deaths annually over this period. Unfortunately, inequality between countries persists, with 98% of neonatal deaths occurring in low- and middle-income countries [3]. Importantly, more than 60% of such deaths are associated with low birthweight due to intrauterine growth restriction or preterm birth or both [4,5]. Ultrasound imaging has become an essential tool for assuring correct gestational age and for fetal size assessment, increasingly so even in societies with restricted resources. Correspondingly, evidence is emerging at the population level that use of ultrasound biometry increases the rate of detection of fetal growth restriction and the identification of those at increased risk of neonatal morbidity [6]. Birthweight, closely linked to fetal growth, is also a marker of risks for noncommunicable diseases in adult life, with cardiovascular diseases, type II diabetes, and obesity being the most prominent [7,8]. While the birthweight gradient across the entire population reflects the distribution of degrees of such risk, it is increasingly evident that it is the developing physiology associated with fetal growth, rather than birthweight per se, that conditions cardiovascular, metabolic, endocrine, and neural functions for the life course, and thus long-term health and disease risks [9]. For this reason, fetal growth data and aspects of intrauterine development need to be included as an important part of an early-life noncommunicable disease prevention initiative, as this targets the time when the effect of an intervention is greatest [10].
A meeting of experts convened by WHO in 2002 reviewed current knowledge on birthweight as a health outcome and identified a need for research to develop fetal growth charts for international use [11]. In 2006, WHO published the multicenter WHO Child Growth Standards [12] using a prescriptive concept that assumes that, under optimal socioeconomic and nutritional conditions, all children follow one growth standard, regardless of ethnic background. Some support for this concept was drawn from previous studies [13,14]. Although widely adopted, the applicability of these child growth standards has been questioned on the grounds of lack of fit to some populations [15,16], especially for the head circumference standards [17].
Recently, a large multicenter study, the Fetal Growth Longitudinal Study of the Intergrowth-21st Project [18], applied the same concept and approach to fetal growth. The study presented growth standards using ultrasound biometric measurements but did not include estimated fetal weight (EFW), even though this is the single most widely used clinical assessment of fetal growth today. Another large recent study, the NICHD Fetal Growth Studies, showed significant differences in fetal growth with ethnicity, and established ethnic-specific growth charts [19]. This contradicts the prescriptive concept that one standard fits all. The study was, however, restricted to four self-reported ethnic groups of Asian, Hispanic, black, and white women in the US.
The present study is the fetal component of the WHO Multicentre Growth Reference Study, which aimed to establish growth charts for clinical use based on populations recruited from multiple countries [20].

Methods

Design
This was a multinational observational study approved by the WHO Research Project Review Panel (RP2) and the WHO Research Ethics Review Committee, secondarily approved by the national or local ethics review committee for each study center, and correspondingly carried out according to the Helsinki declaration on ethical principles for medical research in humans [20,21]. All women were recruited specifically for this study, gave written informed consent at inclusion, and otherwise followed their conventional antenatal care program separately from study sessions. Study measurements were revealed to the clinician when the information was thought to be of importance for the management of the pregnancy. The study protocol was published previously [20], so here we present a condensed account of the methods. The study selected participating centers from a range of ethnic and geographical settings, and intended to recruit 1,400 participants. The sample size calculation procedure was published previously [20].

Setting
The following centers participated in the study based on the proficient use of ultrasonography:

Participants
Participants without known health, environmental, and/or socioeconomic constraints were invited to participate in the study. Further inclusion criteria were used: living at an altitude lower than 1,500 m and near the study area (intended to promote compliance for the duration of the study and any possible follow-up studies); age ≥ 18 y and ≤ 40 y; body mass index (BMI) 18-30 kg/m²; singleton pregnancy; gestational age at entry between gestational week 8+0 d and 12+6 d according to reliable information on last menstrual period (LMP) and confirmed by ultrasound measurement of fetal crown-rump length; no history of chronic health problems; no long-term medication (including fertility treatment); no environmental or economic constraints likely to impede fetal growth; not smoking currently or in the previous 6 mo; no history of recurrent miscarriages; no previous preterm delivery (<37 wk) or birthweight < 2,500 g; and no evidence in the present pregnancy of congenital disease or fetal anomaly at study entry. Fetal anomalies detected during pregnancy or at birth were noted and verified postnatally. Pregnancies in which small-for-gestational-age fetuses were observed or intrauterine growth restriction was suspected were also noted. All mothers recruited were followed up until the end of the study, apart from those withdrawing consent.

Study Procedures
Women in the first trimester (before week 12+6 d of gestation) attending antenatal care clinics were approached by members of the study team and asked to participate. They were informed about the study objectives and procedures. Those who signed the consent form were enrolled in the study. After the ultrasound scan to assess agreement between gestational age based on LMP and that based on crown-rump length, they were scheduled for fetal biometry scans at monthly intervals.
All infants had an anthropometric assessment after delivery, including measurement of birthweight. All pregnant women in the study were asked for a 24-h dietary recall at entry into the study (and at 28 and 36 wk of gestation) [22]. Clinically relevant conditions (e.g., hypertension, preeclampsia, and diabetes) occurring during pregnancy and childbirth were noted. Otherwise, no further procedures were added to the routine antenatal care provided at the study centers.

Gestational Age Assessment
Gestational age was confirmed by measuring the crown-rump length between gestational week 8 + 0 d and 12 + 6 d based on LMP and recorded as the average of three measurements. To acquire the crown-rump length, the midline sagittal section of the whole fetus was visualized with the fetus horizontal on the screen at 90 degrees to the angle of insonation. Gestational age was assessed by using the reference charts published by Robinson and Fleming [23]. The woman was eligible for the study provided that gestational age by crown-rump length confirmed LMP-based age within 7 d. The LMP-based age was used for the analyses.
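The eligibility rule described above can be illustrated with a minimal Python sketch. It is not the study's software: the function names are hypothetical, the crown-rump-length-based age (`crl_ga_days`) is assumed to have been read already from the Robinson and Fleming chart [23], and "within 7 d" is assumed to mean an absolute difference of at most 7 d.

```python
from datetime import date

ENTRY_MIN_DAYS = 8 * 7          # 8 wk + 0 d
ENTRY_MAX_DAYS = 12 * 7 + 6     # 12 wk + 6 d
MAX_DISCREPANCY_DAYS = 7        # assumed reading of "within 7 d"


def lmp_gestational_age_days(lmp: date, scan_date: date) -> int:
    """Gestational age in days, counted from the last menstrual period."""
    return (scan_date - lmp).days


def format_ga(days: int) -> str:
    """Render gestational age in the 'weeks+days' notation used in the text."""
    return f"{days // 7}+{days % 7}"


def is_eligible(lmp: date, scan_date: date, crl_ga_days: int) -> bool:
    """Check the entry window and the LMP versus crown-rump-length agreement."""
    ga = lmp_gestational_age_days(lmp, scan_date)
    in_window = ENTRY_MIN_DAYS <= ga <= ENTRY_MAX_DAYS
    agrees = abs(ga - crl_ga_days) <= MAX_DISCREPANCY_DAYS
    return in_window and agrees


# Hypothetical participant: LMP on 1 January 2010, dating scan on 12 March 2010.
ga_days = lmp_gestational_age_days(date(2010, 1, 1), date(2010, 3, 12))
print(format_ga(ga_days), is_eligible(date(2010, 1, 1), date(2010, 3, 12), 74))
```

In this sketch the LMP-based age is the value carried forward, mirroring the statement that the LMP-based age was used for the analyses.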

Ultrasound Measurements
The first visit (dating scan) was between 8 + 0 and 12 + 6 wk, and subsequent visits for fetal biometry were scheduled at approximately 4-wk (±1 wk) intervals at 14, 18, 24, 28, 32, 36, and 40 wk. All scanning appointments were arranged at the time of the dating scan and study enrollment. All participants were scanned in the lateral recumbent position.
The compulsory ultrasound measurements obtained at all visits included the following biometric parameters: biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC), femur length (FL), and humerus length (HL). At each examination, all measurements were obtained three times from three separately generated ultrasound images and uploaded electronically (with the associated images) to the data management system. The median of the three measurements of each parameter was used in the analyses.
In addition, a full morphological evaluation (anomaly scan) was conducted at 18-24 wk following standard practice at each center. Fetuses diagnosed with any anomaly were managed according to local clinical guidelines. Their ultrasound measurements were included in the study, and the possible effect on the percentiles derived was evaluated. The following measurement techniques were used. BPD was measured as the outer-inner distance of the parietal bones in a cross-sectional view of the fetal head at the level of the thalami and cavum septi pellucidi or cerebral peduncles. The cerebellum was not included in the section. The measurement was obtained from an image with the midline echo as close as possible to the horizontal plane, 90 degrees to the ultrasound beam. HC was obtained from the same image as BPD as follows: calipers were placed on the outer borders of the occipital and frontal edges of the bone at the point of the midline of the skull, and the ellipse facility was used to follow the outer perimeter of the skull to calculate HC. AC was measured in the transverse section of the fetal abdomen that was as close as possible to circular and that included the stomach and the junction of the umbilical vein and portal sinus. The anteroposterior and transverse diameters were then measured with calipers placed on the outer borders of the body outline. The anteroposterior diameter was measured from the spine to the anterior abdominal wall, and the transverse diameter at a right angle to the anteroposterior diameter. The ellipse facility was used to calculate AC as outlined above. FL was measured from an image of the full femoral shaft in a plane close to 90 degrees to the ultrasound beam. The distal femoral epiphysis was excluded.
Similarly, HL was measured from an image of the full humeral shaft in a plane close to 90 degrees to the ultrasound beam.
The participating centers used identical ultrasound machines during the project (Voluson Expert E8, General Electric, Kretz Ultrasound, Zipf, Austria) equipped with two curvilinear transabdominal transducers (4-8 MHz and 1-5 MHz) and a transvaginal transducer (6-12 MHz), observing that the energy output was set so that thermal index (TI) was <1.0. The TI was automatically recorded and transmitted to the web-based data management system by the ultrasound machine.
Measurement results were stored electronically, with the images together with all information collected from the mother and the perinatal outcomes. EFW was calculated by including HC, AC, and FL in Hadlock et al.'s third formula [24]. To facilitate assessment of relative fetal head size and growth, the ratios FL/HC and FL/BPD were established.
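For orientation, the EFW calculation and the two ratios can be sketched in Python as follows. The coefficients are not given in the text; those below are the commonly quoted form of the Hadlock formula based on HC, AC, and FL (measurements in cm, weight in g) and should be treated as an assumption to be verified against reference [24]. The function names and the example measurements are hypothetical.

```python
def hadlock3_efw_grams(hc_cm: float, ac_cm: float, fl_cm: float) -> float:
    """Estimated fetal weight (g) from HC, AC, and FL given in centimeters.

    Assumed coefficients (commonly quoted form, check against [24]):
    log10(EFW) = 1.326 - 0.00326*AC*FL + 0.0107*HC + 0.0438*AC + 0.158*FL
    """
    log10_efw = (
        1.326
        - 0.00326 * ac_cm * fl_cm
        + 0.0107 * hc_cm
        + 0.0438 * ac_cm
        + 0.158 * fl_cm
    )
    return 10.0 ** log10_efw


def size_ratios(fl_cm: float, hc_cm: float, bpd_cm: float) -> tuple[float, float]:
    """Return the ratios FL/HC and FL/BPD used to assess relative head size."""
    return fl_cm / hc_cm, fl_cm / bpd_cm


# Hypothetical third-trimester measurements, for illustration only.
efw = hadlock3_efw_grams(hc_cm=32.0, ac_cm=31.0, fl_cm=7.0)
fl_hc, fl_bpd = size_ratios(fl_cm=7.0, hc_cm=32.0, bpd_cm=9.0)
print(round(efw), round(fl_hc, 3), round(fl_bpd, 3))
```

Because the formula is defined on the log10 scale, the negative AC × FL interaction term damps the estimate when both measurements are large.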

Training and Quality Assurance
The choice of participating centers was based on their proficient use of ultrasound by experienced sonographers. The sonographers participating in the study received specific training for the study and were certified as proficient under the supervision of a qualified instructor, according to a standard protocol. All the ultrasound operators had their scans assessed for quality during their early period in the project. Instruments and techniques used in all centers were standardized, i.e., equipment and training were provided to each of the measurement teams.

Maternal Anthropometric and Nutritional Assessment and Birthweight
Maternal weight was measured, with the woman wearing light clothing, using a beam balance with nondetachable weights and recorded to the nearest 0.1 kg. Height of the mother was measured in the standing position using a stadiometer and recorded to the nearest millimeter. If the reading fell between two values, the lower was recorded.
The 24-h diet recall assessment was carried out by a specifically trained nutritionist or nurse who asked the study participant about food and beverages consumed during the previous 24 h [22]. Further details are available elsewhere [20]. Birthweight was assessed at delivery, and neonatal morphometry carried out within 24 h according to the protocol [20].

Data Management
Data were collected via a web-based data management system developed by Centro Rosarino de Estudios Perinatales, Rosario, Argentina. All data (clinical, anthropometric, nutritional, and fetal biometry measurements plus 2-D/3-D images) were stored in a central server compliant with good clinical practice. Data transmission was encrypted to assure data integrity and patient confidentiality. Access to the web system was password protected, and only authorized users had access. Data changes were documented by a complete audit trail record kept automatically by the web system (recording when, by whom, and why data were changed). Data entered into the web system were checked by the coordinating unit at Centro Rosarino de Estudios Perinatales for completeness, accuracy, reliability, and consistent intended performance. Different kinds of validation procedures were carried out (checking missing values and outliers, cross-checks, cross-time verifications among scanning appointments, and protocol compliance). Measurements and 2-D/3-D images corresponding to fetal biometry had special processing. In collaboration with General Electric Healthcare, Germany, ViewPoint software was installed at all participating centers, allowing a standard interface/procedure for scans and an automatic transfer of fetal biometry measurements/images to the web-based system. Thus, all fetal biometry measurements considered by the protocol were automatically transferred instead of being entered manually (except for D. R. Congo; there, a complete checking of values was done by the comparison of images and values entered into the web-based system).
The above-mentioned web-based system and procedures have been used in five previous HRP (UNDP/UNFPA/UNICEF/WHO/World Bank Special Programme of Research, Development and Research Training in Human Reproduction)/WHO multicenter studies and are proven to be efficient and compliant with HRP/WHO Standard Operating Procedures as well as with Title 21 CFR Part 11 of the Code of Federal Regulations, which deals with United States Food and Drug Administration guidelines on electronic records.

Adjustments of Analyses Compared with the Protocol and Justifications
Compared with the original protocol [20], the following aspects of the study were adjusted. Reliable information on LMP (confirmed by a measurement of crown-rump length), rather than ultrasound measured crown-rump length alone, was used as the basis for gestational age calculation for the following reasons: there is no evidence that ultrasound dating more accurately determines gestational age than a reliable LMP confirmed by crown-rump length; reliable LMP is the basis for establishing crown-rump length charts for dating; crown-rump length dating translates natural variation of size into variation of gestational age, which is not desirable for a study of growth; and LMP, not crown-rump length, is the accessible, low-cost method for gestational age assessment for all women in the world, and for the low-income areas usually the only one.
The sample size calculation was based on the assumption of normality for the distribution of ultrasound measurements. However, we used quantile regression, which calculates quantiles (i.e., percentiles) directly from the observed measurements without making assumptions about the distribution.
Maternal and fetal conditions occurring during pregnancy were not excluded from the analysis. The rationale for this was that the reference intervals of this study are intended primarily for clinical use and therefore should reflect the population for which they are intended as closely as possible. The pregnancy conditions (e.g., complications) that the study population experienced are those common to low-risk pregnancies around the world. Likewise, excluding all neonates below the 10th percentile of birthweight, as suggested in the protocol [20], would by definition remove the 10% of the participants at the bottom of the range (the vast majority being healthy in this low-risk cohort) and cause a corresponding distortion of the new growth charts, i.e., a substantial upward shift of all the lowest percentiles (10, 5, 2.5, and 1) in the direction of supernormal.
Given the plethora of measurements, we prioritized clinical usefulness in the analyses and results presented here (e.g., EFW and common biometric measurements) and left the following for secondary studies and publications: transverse cerebellar diameter, fetal foot length, 3-D ultrasound acquisitions, maternal anthropometric measurements except height and weight, the second and third sets of dietary 24-h-recall data (at 28 and 36 wk of gestation), and newborn anthropometric measurements except birthweight.

Data Analysis and Statistical Methods
Descriptive statistics were calculated for the women's characteristics at study entry, for mode of delivery, for birth events, and for fetal, neonatal, and maternal conditions, by country and overall. Protocol compliance was evaluated by comparing the dates of the windows of gestational age defined in the protocol with the dates of actual measurements.
The ultrasound measurements were used to estimate reference curves for individual parameters (BPD, HC, AC, FL, HL, FL/HC, FL/BPD) and EFW based on Hadlock et al.'s formula 3 [24]. Reference curves were fitted using quantile regression for reference models, as described by Wei et al. [25] from the work of Koenker [26,27].
The development of reference curves has up to now in general used parametric models, based on assumptions about distribution and on transformation of the observations to normal distributions. Advances brought by computer power and by the work of Koenker and others have made it possible to estimate the distributions directly by estimating their quantiles. Quantile regression is now a well-established technique [26,27], and statistical software is available to fit quantile regression models. Quantile regression fits a function to each chosen quantile using linear programming and has the advantage of not imposing any distributional assumptions. The asymmetry and kurtosis of the fitted distributions may thus assume any form dictated by the data, even changing with gestational age. In addition, quantile regression is more robust against the influence of outliers in the data. The flexibility of the fitting and the fact that any inference drawn is entirely data-driven led us to choose quantile regression as the method for the construction of reference curves.
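The idea behind quantile regression can be illustrated with the check (pinball) loss that it minimizes. The Python sketch below is a deliberately simplified illustration, not the SAS or JMP procedure used in the study: it fits only a constant, for which the minimizer of the total check loss is a sample quantile, whereas the actual analysis fits functions of gestational age by linear programming. The function names and the EFW values are hypothetical.

```python
def check_loss(residual: float, tau: float) -> float:
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(values: list[float], q: float, tau: float) -> float:
    """Sum of check losses when every observation is predicted by the constant q."""
    return sum(check_loss(v - q, tau) for v in values)


def best_constant_quantile(values: list[float], tau: float) -> float:
    """Constant minimizing the total check loss.

    A minimizer is always attained at an observed value, so searching the
    data points is sufficient for this toy case.
    """
    return min(sorted(values), key=lambda q: total_loss(values, q, tau))


# Hypothetical EFW values (g) at one gestational age, skewed to the right.
efw = [2900.0, 3100.0, 3300.0, 3500.0, 3700.0, 4000.0, 4600.0]
for tau in (0.1, 0.5, 0.9):
    print(tau, best_constant_quantile(efw, tau))
```

No distribution is assumed at any point: the asymmetric weighting of positive and negative residuals alone determines which quantile is estimated, which is why the fitted percentiles are free to be skewed.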
The estimated quantiles were smoothed by polynomial functions of gestational age. Full models fitted a polynomial on gestational age for each country by including interaction terms between gestational age polynomial and country. Additive terms were included for other covariates.
The models were checked by the residual analysis produced by the software. Hypotheses on the overall importance of covariates were formally tested using likelihood ratio or Wald chi-square tests. In addition, visual inspection of quantile profilers was used to assess the relevance of each covariate in explaining the variation. To compare the distributions of the different countries with the overall distribution, we used quantile-quantile plots. We calculated 95% confidence intervals for the difference between country and global EFW percentiles for particular gestational ages, using the result that the parameter estimates from quantile regression were asymptotically normally distributed [28].
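A confidence interval based on asymptotic normality has the familiar form of estimate ± z × standard error. The Python sketch below shows only this generic construction; how the standard error of a country-versus-global difference was obtained is not described here, so the function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def wald_ci(estimate: float, std_error: float, level: float = 0.95) -> tuple[float, float]:
    """Normal-approximation confidence interval: estimate +/- z * std_error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical difference (g) between a country percentile and the global
# percentile at one gestational age, with a hypothetical standard error.
lower, upper = wald_ci(estimate=-120.0, std_error=40.0)
print(round(lower, 1), round(upper, 1))
```

An interval that excludes zero, as in this made-up example, would correspond to a difference that is significant at the 5% level.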
Logarithms of ultrasound parameters and EFW were used for the fitting. This was done only to achieve better numerical accuracy and faster convergence of the fitting algorithm. After the fitting, the results were retransformed to the original scale. To describe growth asymmetry, we used the Bowley coefficient of asymmetry [29], based on differences of semi-quartile ranges relative to the quartile range, for the gestational ages 15 and 40 wk.
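Written out, the Bowley coefficient compares the upper and lower semi-quartile ranges: B = ((Q3 − Q2) − (Q2 − Q1)) / (Q3 − Q1), where Q1, Q2, and Q3 are the 25th, 50th, and 75th percentiles. A minimal Python sketch follows; the function name and the quartile values are hypothetical and are not taken from the study tables.

```python
def bowley_coefficient(q1: float, q2: float, q3: float) -> float:
    """Bowley coefficient of asymmetry from the three quartiles.

    Positive values mean the upper half of the interquartile range is wider
    (right asymmetry); negative values mean the lower half is wider.
    """
    if q3 <= q1:
        raise ValueError("q3 must be larger than q1")
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)


# Hypothetical quartiles of EFW (g) at one gestational age.
print(round(bowley_coefficient(3000.0, 3300.0, 3700.0), 3))
```

The coefficient is bounded between −1 and +1 and equals 0 when the median lies exactly midway between the quartiles.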
Data were analyzed using SAS Software version 9.4 (SAS Institute, Cary, North Carolina, US) and JMP Pro 12 (SAS Institute, Cary, North Carolina, US).

Participants
A total of 1,439 women were enrolled between October 2009 and September 2014, with data collection being completed with the last childbirth in April 2015. Of these, 52 (3.6%) withdrew consent, leaving 1,387 women and their fetuses participating in the study. Table 1 shows the numbers of women recruited, those withdrawing consent, those lost to follow-up, and those having miscarriages or intrauterine deaths, by country. Among women lost to follow-up and with miscarriage or intrauterine death, 10 and 15, respectively, did not contribute ultrasound information. All women other than those withdrawing consent were included in the growth curve analyses if they contributed ultrasound information, with the number in this analysis being 1,362.
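The participant counts above can be cross-checked with simple arithmetic. The short Python sketch below only restates the numbers given in this paragraph; the variable names are illustrative.

```python
enrolled = 1439
withdrew_consent = 52
participating = enrolled - withdrew_consent            # 1,387 women

# Women who contributed no ultrasound information.
no_ultrasound_lost_to_follow_up = 10
no_ultrasound_miscarriage_or_death = 15
in_growth_curve_analysis = (
    participating
    - no_ultrasound_lost_to_follow_up
    - no_ultrasound_miscarriage_or_death
)                                                       # 1,362 women

withdrawal_pct = round(100 * withdrew_consent / enrolled, 1)   # 3.6%

print(participating, in_growth_curve_analysis, withdrawal_pct)
```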

Population Characteristics
Statistics for participating women's characteristics, their daily caloric intake, and ethnicity are presented in Table 2. Median age at study entry was 28 y but varied between 24 y (Argentina and Egypt) and 32 y (France). Median maternal height ranged from 155 cm (India) to 169 cm (Germany), and weight from 54 kg (Thailand) to 66 kg (Germany). While overall median BMI was 23.1 kg/m², the median by country ranged from 21.6 kg/m² in Thailand to 25.9 kg/m² in Egypt. Median daily caloric intake in the study group was 1,848 calories according to the 24-h dietary recall assessment, with Thailand having the lowest median, 1,232 calories, and Egypt having the highest median, 2,094 calories. The ethnic distribution of the study group was roughly 20% African (including the peri-Mediterranean Egypt), 20% Asian, and 60% white. Table 3 shows delivery information. The overall rate of spontaneous onset of birth was 67.3%, with a wide range by country: 28.5% in Brazil to 94.5% in D. R. Congo. There was an overall cesarean section rate of 32.1%, with a considerable range from 5.5% in D. R. Congo to 70.1% in Brazil. The occurrence of Apgar score < 7 at 5 min was similar in all countries, i.e., 0%-2.2%. Most of the countries had a similar distribution between female and male neonates except for Egypt, Germany, and Norway, where about 40% of neonates were female. The incidence of preterm birth varied from 3.6% in Germany to 14.7% in Egypt (p = 0.03 for differences among countries). It was lowest in D. R. Congo, Denmark, Germany, and Norway and highest in Egypt and India.

Gestational Age at Birth and Birthweight
Gestational age at birth varied between countries from a median of 38 wk 4 d in India to 40 wk 3 d in Norway (p < 0.001 for differences among countries) (Table 3). Norway had the highest median birthweight (3,575 g), and Denmark and Germany had birthweights approximately 100 g less, while Argentina, Brazil, and France had birthweights 200 g less. A further group of countries (D. R. Congo, Egypt, and Thailand) had median birthweights 400 g less than that of Norway, and lastly India had a median birthweight 500 g less. The differences in birthweight between countries were highly significant for all percentiles (p < 0.001 for all). When adjusted for gestational age at birth, the differences were still significant for all the percentiles (p = 0.0018 for the 5th percentile and p < 0.001 for the 10th, 25th, 50th, 75th, 90th, and 95th percentiles). The estimated birthweight according to neonatal sex and gestational age is shown in Table 4.

Maternal Complications and Perinatal Conditions
Conditions occurring in the mother during pregnancy are shown in Table 5, together with fetal malformations and neonatal conditions. In addition to globally experienced maternal complications such as preeclampsia, pregnancy-induced hypertension, gestational diabetes, and anemia, 42 women had malaria identified. There was no maternal death. Four small-for-gestational-age fetuses were identified clinically, of which two were examined using Doppler ultrasound; none had abnormal recordings in the umbilical artery or middle cerebral artery, and all were kept in the analysis. Transfers of neonates to the neonatal intensive care unit were recorded; these were commonly due to prematurity, respiratory distress syndrome, infections, or jaundice. There were three intrauterine deaths and three neonatal deaths, representing a perinatal mortality of 0.4%.

Compliance with Ultrasound Scans
The median number of ultrasound scans (excluding the study entry screening scan) in all women was 6 (range 0-7). Compliance by gestational age window as defined in the protocol is presented in S1 Table, by country and for all countries combined ("Total"). Compliance for all countries combined in each gestational age window was between 89.1% and 100%; 72% of the participants had a complete set of all the scheduled scans. In addition, for each of the measurements BPD, HC, AC, FL, and HL, scans were obtained ≥2 times for at least 95% of participants.

Thermal Index
Of the 8,372 scan sessions in the project, 115 had no scans stored and 54 belonged to women who withdrew consent, leaving 8,203 for the statistics. The median TI was 0.2, and none had TI ≥ 1.0.
The growth charts are presented in Tables 6-13 and in csv format in S1 File. The distribution of EFW starts with a slight asymmetry to the left (i.e., lower percentiles) in early pregnancy and ends with a very noticeable right asymmetry (i.e., higher percentiles) in later pregnancy. The Bowley coefficient of asymmetry [29], based on differences of semi-quartile ranges relative to the quartile range, was −0.016 for gestational age 15 wk and +0.111 for 40 wk.
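The Bowley coefficient mentioned above can be illustrated with a minimal sketch. It assumes the standard quartile form, ((Q3 − Q2) − (Q2 − Q1)) / (Q3 − Q1); the function name and the sample values are hypothetical and are not study data, and the study's own computation on fitted quantiles may differ in detail.

```python
from statistics import quantiles


def bowley_skewness(values):
    """Quartile-based asymmetry: ((Q3 - Q2) - (Q2 - Q1)) / (Q3 - Q1)."""
    # "inclusive" treats the sample as containing its own minimum and maximum
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)


# Hypothetical weights in grams, chosen only to show the sign convention
symmetric = [90, 95, 100, 105, 110]
right_skewed = [90, 95, 100, 110, 130]
left_skewed = [70, 90, 100, 105, 110]

print(bowley_skewness(symmetric))     # zero for a symmetric sample
print(bowley_skewness(right_skewed))  # positive: upper quartile further from the median
print(bowley_skewness(left_skewed))   # negative: lower quartile further from the median
```

A negative value corresponds to the slight left asymmetry reported for early pregnancy, and a positive value to the right asymmetry reported near term.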

Influence of Covariates on Growth Percentiles
Fetal sex. Male fetuses were larger than female fetuses as measured by EFW, but the disparity was smaller in the lower quantiles of the distribution (3.5%) and larger in the upper quantiles (4.5%) (Fig 2 and S2 Table, without adjustment for country differences). This difference in size by fetal sex was significant at the 5% level for all percentiles. EFW reference values were also established for female and male fetuses separately (Tables 14 and 15) to allow assessment customized according to fetal sex. For example, at gestational week 37, the median EFW of female fetuses is 84 g lower than that of male fetuses.
Country. Countries differed in EFW (Fig 3). Using country as a covariate in a quantile regression model, including interaction terms with gestational age, showed significance at the 5% level for all percentiles 5th, 10th, 25th, 50th, 75th, 90th, and 95th (S2 and S3 Tables). This variation due to country was adjusted for maternal characteristics (mother's age, parity, height, and weight, or with BMI substituting the latter two) and sex of the fetus. To assess the relative contribution of these variables to the variation in EFW, the Wald chi-square statistics in S2 and S3 Tables are informative. For example, for the 5th percentile (quantile 0.05, first table in S2 Table), most of the variation (Wald chi-square = 1,797, 1 df) is, as expected, due to gestational age (linear) as the fetus grows, and there is significant curvature (Wald chi-square = 207, 1 df). Country variation gives Wald chi-square = 36 (9 df); sex of the fetus, 29 (1 df); mother's height, 26 (1 df); and mother's age, 22 (1 df), while the Wald chi-square value for weight is negligible. In the same table, the level of significance is listed for these variables, e.g., p < 0.001 for country, highly significant. It is clear that variation due to country also occurs independently of Table. The clinical relevance of the differences between the country quantiles and the global quantiles can be assessed in quantile-quantile plots (Fig 4). These plots are intended to enable the reader to derive the magnitude of difference in grams for any size and country and percentile. For example, consider the quantile-quantile plot for the individual country 0.05 quantile (i.e., the 5th percentile) for EFW versus the global 0.05 quantile: the 5th percentiles at low values of EFW cannot be differentiated because of the relative smallness of EFW at early pregnancy (Fig 4). However, at the end of gestation (high values of EFW), the 5th percentile for Norway is 3,200 g, while the overall 5th percentile is 2,800 g; for France it is 2,800 g, and for Egypt, 2,700 g. Similarly, it can be seen that while the 10th percentile for EFW at the end of gestation for Norway is 3,400 g, it is 2,700 g for India (versus about 3,100 g for the global 10th percentile), showing that a fetus weighing 3,200 g would be below the 10th percentile for Norway but well above it for India. The magnitude of the differences among countries can also be appreciated in Fig 5, where selected country percentiles are shown with the corresponding global percentile curve.
Maternal age and maternal height. Maternal age and height seem to be associated with a positive effect on EFW, especially in the lower tail of the distribution, significant at the 5% level, of the order of 2% to 3% for each additional 10 y of age of the mother and 1% to 2% for each additional 10 cm of height (S1D and S1F Fig, without adjusting for country differences).
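The country example above (a fetus of 3,200 g near term judged against the Norwegian, Indian, and global 10th percentiles) can be written out as a short sketch. The three cutoffs are the approximate values quoted in the text; the dictionary and function names are hypothetical and serve only as illustration.

```python
# Approximate 10th-percentile EFW (g) at the end of gestation, as quoted in the text
P10_END_OF_GESTATION = {"Norway": 3400, "India": 2700, "global": 3100}


def below_10th(efw_g, reference):
    """True if the estimated fetal weight falls below the chosen reference's 10th percentile."""
    return efw_g < P10_END_OF_GESTATION[reference]


# The same fetus is classified differently depending on the reference used
for reference in P10_END_OF_GESTATION:
    print(reference, below_10th(3200, reference))
```

The same measurement is flagged as small against the Norwegian reference but not against the Indian or global one, which is the misclassification risk the quantile-quantile plots are meant to make visible.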
Maternal weight. Maternal weight seems to be associated with a small positive effect on EFW, especially in the higher tail of the distribution, significant at the 5% level, of the order of 1% to 1.5% for each additional 10 kg of weight of the mother (S1E Fig, without adjusting for country differences).
Parity (0 versus ≥1). Parous women had heavier fetuses than nulliparous women, with the disparity being much higher in the lower quantiles of the distribution, of the order of 1% to 3%, significant at the 5% level, and subsiding in the upper quantiles (S1C Fig, without adjusting for country differences).
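To relate the Wald chi-square statistics quoted for the 5th percentile to their significance levels, the following sketch computes upper-tail chi-square probabilities using the standard closed-form series for integer degrees of freedom. The function name is hypothetical, and the statistics are the rounded values given in the text, so the resulting p-values are indicative only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_{r=1}^{(df-1)/2} x^(r-1/2) / (1*3*...*(2r-1))
    total = 0.0
    term = math.sqrt(x)  # r = 1 term
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


# Wald chi-square statistics (value, df) quoted in the text for the 0.05 quantile
wald = {"country": (36, 9), "fetal sex": (29, 1), "maternal height": (26, 1), "maternal age": (22, 1)}
for name, (stat, df) in wald.items():
    print(f"{name}: p = {chi2_sf(stat, df):.2g}")
```

Under these assumptions each of the four covariates gives p < 0.001, in line with the significance level reported for country.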

Influence of Clinical Conditions on Growth Percentiles
Participants for whom clinical conditions occurred during pregnancy and childbirth were retained in the study. We then assessed the effect of excluding them on the parameter estimates of the quantiles. We excluded successively maternal conditions, fetal malformations, and neonatal conditions and assessed the fit for the global EFW percentiles. The parameter estimates obtained were indistinguishable. In order to illustrate variation of the clinically relevant 10th and 90th percentiles for EFW, we compiled the values (without any formal comparison) for 24, 28, 32, and 36 wk of gestation from the present study, the NICHD Fetal Growth Studies [19], a study from D. R. Congo [30], and another study from Norway [31] (Table 16). Since the other existing multinational study, the Fetal Growth Longitudinal Study of the Intergrowth-21st Project, did not publish EFW but rather AC, which is a major determinant for EFW, we also compiled 10th and 90th percentiles for AC from relevant studies [18,19,30,32-34] (Table 17).

Discussion
In this paper we present the WHO fetal growth charts for EFW and common ultrasound biometric measurements intended for international use. They reveal a wide range of variation in human fetal growth across different parts of the world. Significant differences in fetal growth between countries are confirmed by differences in birthweight. Furthermore, the study shows that intrauterine growth is influenced by fetal sex and by maternal age, height, weight, and parity, although these influences explain only partially the differences in growth between countries.
The primary motivation for this study, the fetal component of the WHO Multicentre Growth Reference Study [11], was the need for clinical reference intervals applicable internationally, including for areas of the world where perinatal morbidity and mortality are high, hence the multinational design. Driven by the same motivation, we prioritized ultrasound measurements in common clinical use worldwide, the most prominent being EFW (Fig 1;  Table 11). The use of estimated weight in grams is simple and intelligible, which enhances clinical management, facilitates communication within the health care system, and is valuable when counselling patients. In addition to the other common measurements in daily use (BPD, HC, AC, and FL) (Fig 1; Tables 6-9), we established reference intervals for the ratios FL/HC and FL/BPD aimed at facilitating the identification and monitoring of disproportionate fetal head development, e.g., hydrocephaly or microcephaly (Fig 1; Tables 12 and 13). The diagnosis in pregnancies complicated by such conditions is often hampered by uncertainty about gestational age since head size (BPD and HC) is also commonly used for the dating of the pregnancy. FL/HC and particularly FL/BPD are less dependent on gestational age after 20 wk of gestation (Fig 1) and may therefore have diagnostic utility. A strength of the new growth charts provided by the study (Tables 6-15) is that they are based on multinational data, i.e., ten countries, and therefore are more likely to be applicable internationally than previously published reference intervals for EFW based on single countries. A recent sizeable study found significant variation in fetal growth between Asian, black, Hispanic, and white ethnic groups, with Asian fetuses being the smallest and white fetuses the largest, justifying ethnic-specific growth charts [19]. However, that study was confined to the US. 
Table 16 demonstrates the relation between studies for the clinically important 10th and 90th percentiles for EFW. The WHO growth chart for all countries lies in the middle of them. Although the present study was not designed to investigate ethnic differences, a limited record of participants' ethnicity showed a distribution largely according to country (Table 2). Interestingly, there was a significant difference in the growth of EFW between countries that was not explained by maternal factors (Fig 3; S2 Table). While ethnic differences may play a role in this variation, as for the US-based study [19], variation could also be due to differences in diet and cultural and socioeconomic factors commonly associated with particular ethnic groups. These may also have played a role in the US-based study.
Table 10. Percentiles by gestational age (weeks). doi:10.1371/journal.pmed.1002220.t010
Weeks   2.5     5    10    25    50    75    90    95  97.5
14       10    11    11    12    14    15    16    16    17
15       13    13    14    15    16    18    19    19    20
16       16    16    17    18    19    21    22    22    23
17       19    19    20    21    23    24    25    25    26
18       22    22    23    24    26    27    28    28    29
19       25    25    26    27    28    30    31    31    32
20       27    28    29    30    31    32    33    34   ...
...
32       49    50    51    53    54    56    57    59    59
33       51    52    53    54    56    58    59    60    61
34       53    53    54    56    58    59    61    62    63
35       54    55    56    57    59    61    62    63    64
36       55    56    57    59    61    62    64    65    66
37       56    57    58    60    62    64    65    66    67
38       57    58    59    61    63    65    66    67    68
39       58    59    60    62    64    65    67    68    69
40       57    58    60    62    64    66    68    69    69
Another recently published multinational study by the Intergrowth-21st Project presented biometric growth but not EFW data [18]. We therefore present variation in AC, which is closely linked to EFW and is an important predictor of perinatal outcome [6], for the commonly used cutoffs, the 10th and 90th percentiles (Table 17).
Interestingly, the 10th percentile for the Intergrowth-21st Project results seems to fall below that of the WHO study, even though the Intergrowth-21st Project study was carried out according to a strictly "prescriptive" concept to establish so-called optimal fetal growth (low-risk pregnancies with no environmental and nutritional constraints, and excluding all conditions during pregnancy and childbirth that may be associated with effects on fetal growth). The WHO study had a similar recruitment but retained in the analysis pregnancies with maternal, fetal, and neonatal clinical conditions, based on the principle that reference intervals should reflect as closely as possible the population to which they will be applied. Furthermore, we assessed the effect of removing such pregnancies from the dataset and found no identifiable effect on the percentiles. As seen from Table 17, it is as if rigorous selection and exclusions have limited effect, and other uncontrolled factors are responsible for the variation between studies and countries. Apart from random error, systematic error due to differences in ultrasound measurement techniques could influence the differences between the studies. However, these studies had well-trained ultrasound operators specifically instructed for the research procedure using internationally accepted techniques, and this should minimize such error.
Table 11. Growth chart for estimated fetal weight regardless of fetal sex. Estimated fetal weight (g) by percentile and gestational age (weeks).
Weeks   2.5     5    10    25    50    75    90    95  97.5
14       70    73    78    83    90    98   104   109   113
15       89    93    99   106   114   124   132   138   144
16      113   117   124   133   144   155   166   174   181
17      141   146   155   166   179   193   207   217   225
18      174   181   192   206   222   239   255   268   278
19      214   223   235   252   272   292   313   328   340
20      260   271   286   307   330   355   380   399   ...
...
Another strength of the present WHO study is the use of quantile regression to establish the reference intervals. Quantile regression makes an inference about regression coefficients for the conditional quantiles of a variable without making assumptions about its distribution: there is no need to assume a particular distribution and to estimate its moments. In consequence, it provides a more direct representation of the observed measurements. This is nicely demonstrated in a recent large study establishing population-specific fetal growth charts [35]. The technique is especially useful when the quantiles vary differently with a covariate such as, in the present study, gestational age. In addition, the method is robust against the effect of outliers and can capture important features of the data that might be missed by models that average across the conditional distribution [25].
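The distribution-free character of quantile regression described above can be illustrated with its underlying "check" (pinball) loss. The sketch below is not the study's model: it fits only a constant, with hypothetical function names and made-up EFW values, and shows that minimizing the check loss recovers a sample quantile without any distributional assumption. Quantile regression generalizes this by replacing the constant with a function of covariates such as gestational age.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values, tau):
    """Observed value that minimizes the total check loss for quantile level tau."""
    return min(values, key=lambda c: sum(pinball_loss(y - c, tau) for y in values))


# Hypothetical near-term EFW values (g), deliberately right-skewed; not study data
efw = [2500, 2700, 2800, 2900, 3000, 3050, 3100, 3200, 3300, 3600, 4200]

for tau in (0.1, 0.5, 0.9):
    print(tau, best_constant(efw, tau))
```

Because each quantile level is fitted on its own, the lower and upper percentiles are free to sit at different distances from the median, which is how an asymmetric distribution can be represented directly.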
Quantile regression is particularly useful in studying distribution changes, and shows in the present study that fetal growth in the population is not symmetrical with gestation. Starting with a higher distribution towards the lower percentiles, EFW shifts to an expanded distribution among the higher percentiles and ends with a noticeable asymmetry near term. The Bowley coefficient for asymmetry changed from −0.016 to +0.111 during that period. We are not sure of the nature of the small negative asymmetry in early pregnancy, but speculate that regulatory functions, such as the process of maternal constraint of fetal growth, change through gestation, i.e., fetuses in the higher percentiles may be exposed to greater influences, which vary with maternal characteristics. This corroborates the differential effects of covariates across the percentiles shown in S1 Fig. We believe that studying distribution dynamics may yield more information on the control of fetal growth. The study confirmed the biologically interesting facts that fetal sex and maternal height, weight, parity, and age significantly influence fetal growth [31,36,37]. Together with the country differences, the ethnic differences shown in the US population [19], and, not least, the substantial variation in birthweight among carefully selected low-risk pregnancies, these findings document a diversity and plasticity in human prenatal growth dynamics that is only partially understood. There is increasing evidence linking fetal development, and proxies of development such as birthweight, to postnatal health and life course risk of disease [7,9]. This issue is prioritized by the UN and WHO at a time when noncommunicable diseases are becoming global epidemics [10,38]. For example, in our study, birthweights in India were significantly lower than in the other countries, and Indian participants also had the lowest fetal growth and were the shortest mothers. 
It is known that body composition in Indian newborns contains relatively more fat [39], a pattern that passes across generations [40] and that is linked to increased risk of subsequent type 2 diabetes [41]. It seems clear that the understanding of "optimal" fetal growth needs to incorporate more than birthweight.
To have a single fetal growth chart that fits all pregnancies across the world would require that all fetuses had the same genetic background for growth, that this genetic background was reliably expressed in the mother, and that influences such as nutrition, physical activity, stress, toxicants, and other environmental conditions had similar effects on the genotype in all embryos and fetuses. This is very unlikely: recent research has revealed a range of interactions between the developmental environment and genetic and epigenetic processes [9]. Even influences on fetal growth classically thought to be primarily genetic, such as maternal and paternal height, are complicated by environmental factors. Altitude, climate, geography, other environmental conditions, and the challenges of daily life and nutrition vary around the world. Humans adapt across generations to local conditions, and fetal development adds an important adaptive refinement for the next generation. Secular changes in birthweight and child growth patterns have been shown to accompany social changes [42,43]. Fetal growth charts may thus need to be adjusted to fit the diversity of individuals and populations if they are to be of the greatest clinical utility. While including ten countries in the present WHO study was a strength compared to previous studies, it still has limitations. The ten population samples, including two in South-East Asia and two in Africa, were included to increase generalizability, but they are still a very limited sample of the global human population. Africa alone has a greater genetic diversity than has the rest of the world [44], and anthropometric variation on that continent is substantial. The present study showed population differences within the pooled dataset, and so the extent to which the results can be extrapolated to other populations, which possibly have other growth dynamics, is at present unknown.
A limitation of the study is that ultrasound measurements were accompanied by a corresponding gestational age displayed on the screen, which could have led to undue changes in the management of the pregnancy and pregnancy duration. However, it was common practice among the sonographers and midwives doing the examination not to pay attention to this gestational age because the department was using reference values other than those on the screen. On the other hand, part of the ethical commitment of the study was to inform the mother of any abnormality or deviation of importance discovered, so that it could be taken into account for the management of the pregnancy, and to refer the case to the managing clinician. However, the reported referrals were few and were found not to influence the statistics.
Pooling data is not ideal in the presence of variation among populations, and a single overall growth chart will only partially reflect the individual populations included. Figs 4 and 5 show the variation of country-specific percentiles compared with the corresponding overall percentiles of the study and provide an opportunity to assess the magnitude and clinical relevance of the observed variation. Tables 16 and 17 illustrate a similar pattern when compiling the 10th and 90th percentiles for EFW and AC from various relevant high-quality studies available for clinical use. Although no formal statistical comparison was undertaken, the results of these studies illustrate the distribution that can be found around the world. This gives an impression of a wider spread for the 90th percentile than for the 10th. A similar pattern is found within the WHO study itself: a more obvious diversity between the countries for the 90th percentile than for the 10th percentile (Fig 3).
As seen from these figures, variation between countries may increase to several hundred grams towards the end of pregnancy, and may cause misclassifications when the overall percentile is used. Secondly, it seems that population variation in growth is more reflected in the 90th percentile than in the lowest percentiles. Thus, it is possible that the 10th, 5th, and 2.5th percentiles of a pooled study are more universally applicable, while the upper percentiles-90th, 95th, and 97.5th-vary more according to population characteristics and accordingly will be more in need of adjustment, i.e., customization, for use at the population level [37]. It follows that whenever the WHO growth charts, or any reference intervals, are applied to a population, their performance should be checked or tested in order to ensure appropriate use. It is possible to adjust them by changing cutoffs (e.g., from 10th to 5th percentile) to fit clinical needs better, and it is possible to customize the percentiles to country, maternal characteristics, and fetal sex to improve diagnostic performance [45]. A further refinement would be to introduce conditioning terms when using repeated ultrasound measurements for monitoring growth [46,47], i.e., narrowing the expected reference interval for an assessment by conditioning it using a previous measurement. WHO is working on these methods to make them generally available with the growth chart. If such adjustments and refinements do not suffice to make the growth charts fit clinical needs appropriately, then it may be necessary to establish new high-quality reference intervals for a population. For example, the WHO growth charts and many others are based on populations living at altitudes < 1,500 m. However, millions of people live at higher altitudes, and their physiological adaptations include pregnancy and fetal development. It might be that specific charts will be needed for such populations.
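One simple way to check the performance of a chart in a local population, as recommended above, is to count how often local measurements fall below a chosen percentile and compare that share with the nominal value. The sketch below is a minimal illustration under stated assumptions: the 10th-percentile values are those of Table 11 for 14-19 wk, while the function name and the local measurements are hypothetical.

```python
# 10th-percentile EFW (g) by gestational week, taken from Table 11 (14-19 wk)
P10_EFW = {14: 78, 15: 99, 16: 124, 17: 155, 18: 192, 19: 235}


def fraction_below_p10(observations):
    """Share of (gestational_week, efw_g) pairs lying below the chart's 10th percentile."""
    flags = [efw_g < P10_EFW[week] for week, efw_g in observations]
    return sum(flags) / len(flags)


# Hypothetical local measurements, for illustration only
local = [(14, 75), (14, 92), (15, 110), (16, 120), (16, 150),
         (17, 180), (18, 200), (18, 230), (19, 240), (19, 280)]

print(fraction_below_p10(local))
```

A share well above or below 10% in a sufficiently large local sample would suggest that the cutoff needs adjustment or that the chart needs customization for that population; a sample as small as this one could not support such a conclusion.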
The concept of a "standard," whether international or national, is often used for instruments and methods to make procedures uniform and to reduce random and systematic error, rather than to set a standard for a biological parameter such as height or bodyweight for the population globally. We are inclined to the view that, while the methodology to define reference ranges or charts for fetal growth needs to be standardized, fetal growth itself is a biological parameter expected to reflect adaptive processes and to change with development, time, location, and environmental conditions. Variation in fetal growth within and between populations should therefore not be ignored.
To apply any growth chart sensibly requires insight, critical attitude, and pragmatism. We believe that the present WHO fetal growth charts can be used internationally, particularly where no local data exist. However, once they are in use, it will be prudent to test the performance of the charts in a particular setting in case adjustments, customization, or replacement with population-specific high-quality reference intervals is needed. With the currently varying degrees of resources, health, and needs around the world, health care professionals have the
Graphs of the 10th, 50th, and 90th percentiles for the ultrasound measure HL in millimeters for the ten participating countries. (TIF)
S1 File. Growth charts for the fetal ultrasound measurements biparietal diameter, head circumference, abdominal circumference, femur length, and humerus length; for estimated fetal weight; and for the ratios femur length/head circumference and femur length/biparietal diameter in one Excel file. (XLSX)
S1 Table. Compliance of ultrasound visits with protocol, measured by observed versus expected. (DOCX)
S2 Table. Variation of estimated fetal weight quantiles due to country, maternal characteristics (age, height, weight, and parity), and sex of the fetus. Output from quantile multivariate regression showing Wald chi-square tests for gestational age; country; the interaction of gestational age and country; sex of the fetus; and maternal characteristics. (DOCX)
S3 Table. Variation of estimated fetal weight quantiles due to country, maternal characteristics (age, BMI, and parity), and sex of the fetus. Output from quantile multivariate regression showing Wald chi-square tests for gestational age; country; the interaction of gestational age and country; sex of the fetus; and maternal characteristics. (DOCX)
S4 Table. Comparison of country percentiles with overall percentiles. The 10th, 50th, and 90th percentiles for overall EFW, and the 95% confidence intervals for the difference between each country's percentiles and the overall percentiles at 20, 24, 28, 32, and 36 wk of gestational age. The results should be interpreted with caution (the study was not powered for this analysis; multiplicity of inferences implies that the confidence is much lower than 95%).