Validity of U.S. Nutritional Surveillance: National Health and Nutrition Examination Survey Caloric Energy Intake Data, 1971–2010

Importance: Methodological limitations compromise the validity of U.S. nutritional surveillance data and the empirical foundation for formulating dietary guidelines and public health policies.

Objectives: Evaluate the validity of the National Health and Nutrition Examination Survey (NHANES) caloric intake data throughout its history, and examine trends in the validity of caloric intake estimates as the NHANES dietary measurement protocols evolved.

Design: Validity of data from 28,993 men and 34,369 women, aged 20 to 74 years, from NHANES I (1971–1974) through NHANES 2009–2010 was assessed by (1) calculating physiologically credible energy intake values as the ratio of reported energy intake (rEI) to estimated basal metabolic rate (BMR), and (2) subtracting estimated total energy expenditure (TEE) from NHANES rEI to create ‘disparity values’.

Main Outcome Measures: (1) Physiologically credible values expressed as the ratio rEI/BMR and (2) disparity values (rEI–TEE).

Results: The historical rEI/BMR values for men and women were 1.31 and 1.19 (95% CI: 1.30–1.32 and 1.18–1.20), respectively. The historical disparity values for men and women were −281 and −365 kilocalories per day (kcal/day) (95% CI: −299, −264 and −378, −351), respectively. These results are indicative of significant under-reporting. The greatest mean disparity values were −716 kcal/day and −856 kcal/day for obese (i.e., ≥30 kg/m2) men and women, respectively.

Conclusions: Across the 39-year history of the NHANES, EI data on the majority of respondents (67.3% of women and 58.7% of men) were not physiologically plausible. Improvements in measurement protocols after NHANES II led to small decreases in underreporting and artifactual increases in rEI, but only trivial increases in validity in subsequent surveys. The confluence of these results and other methodological limitations suggests that the ability to estimate population trends in caloric intake and generate empirically supported public policy relevant to diet-health relationships from U.S. nutritional surveillance is extremely limited.


Introduction
The rise in the population prevalence of obesity has focused attention on U.S. nutritional surveillance research and the analysis of trends in caloric energy intake (EI). Because these efforts provide the scientific foundation for many public health policies and food-based guidelines, poor validity in dietary measurement protocols can have significant long-term implications for our nation's health.
In the U.S., population-level estimates of EI are derived from data collected as part of the National Health and Nutrition Examination Survey (NHANES), a complex, cross-sectional sample of the U.S. population. The primary method used in NHANES to approximate EI is the 24-hour dietary recall interview (24HR) [1]. The data collected are based on the subject's self-reported, retrospective perceptions of food and beverage consumption in the recent past. To calculate EI estimates, these subjective data are translated into nutrient food codes and then assigned numeric energy (i.e., caloric) values from food and nutrient databases. Prior to 2001-2002, the NHANES relied upon databases of varying quality and composition for the post-hoc conversion of food and beverage consumption (24HR) data into energy values [2][3][4][5]. After 2001-2002, the NHANES and the U.S. Department of Agriculture's (USDA) Continuing Survey of Food Intakes by Individuals were integrated into the "What We Eat in America" program [6], and the translation process was standardized via use of successive versions of the USDA's National Nutrient Database for Standard Reference (NNDS) [7].

Misreporting
Given the indirect, pseudo-quantitative nature of the method (i.e., assigning numeric values to subjective data without objective corroboration), nutrition surveys frequently report a range of energy intakes that are not representative of the respondents' habitual intakes [8], and estimates of EI that are physiologically implausible (i.e., incompatible with survival) have been demonstrated to be widespread [9][10][11]. For example, in a group of "highly educated" participants, Subar et al. (2003) demonstrated that when total energy expenditure (TEE) via doubly labeled water (DLW) was compared to reported energy intake (rEI), the raw correlations between TEE and rEI were 0.39 for men and 0.24 for women. Men and women underreported energy intake by 12-14% and 16-20%, respectively. The level of underreporting increased significantly after correcting for the weight gain of the sample over the study period [9], and underreporting was greater for fat than for protein, thereby providing additional support for the well-documented occurrence of the selective misreporting of specific macronutrients (e.g., fat and sugars) [12][13][14][15]. These results are consistent with earlier work, in which the correlations between DLW-derived TEE and seven 24HR and the average of two seven-day dietary recalls were 0.33 and 0.30, respectively [16].
Because the NHANES collected dietary data over the period in which the population prevalence of obesity was increasing, these data have been used (despite the widely acknowledged issues [17]) to examine the association of trends in EI with increments in mean population body mass index (BMI) and rates of obesity (e.g., [18][19][20]). Given that implausible rEI values and the misreporting of total dietary intake render the relationships between dietary factors, BMI and other indices of health ambiguous [21], and diminish the usefulness of nutrition data as a tool to inform public health policy, this report examines the validity of U.S. nutrition surveillance EI data from NHANES I (1971-1974) through NHANES 2010 (nine survey periods) using two protocols: the ratio of reported energy intake (rEI) to basal metabolic rate (rEI/BMR) [22,23] and the disparity between rEI and estimated total energy expenditure (TEE) from the Institute of Medicine's (IOM) predictive equations [24].

Population
Data were obtained from the National Health and Nutrition Examination Surveys for the years 1971-2010 [1]. The NHANES is a complex multi-stage, cluster sample of the civilian, noninstitutionalized U.S. population conducted by the Centers for Disease Control and Prevention (CDC). The National Center for Health Statistics ethics review board approved protocols and written informed consent was obtained from all NHANES participants.

Inclusion Criteria
The study sample was limited to adults who were aged ≥20 and <74 years at the time of the NHANES in which they participated, had a body mass index (BMI) ≥18 kg/m2, and had complete data on age, sex, height, weight, and dietary energy intake.

Dietary Data
Estimates of EI were obtained from a single 24HR from each of the nine NHANES study periods [1]. Energy content of the self-reported food consumption was determined by NHANES using nutrient databases based on previous versions of the USDA National Nutrient Database for Standard Reference (NNDS) [7].

Determination of Physiologically Credible rEI Values
A ratio of rEI to BMR (rEI/BMR) <1.35 [22,23,25] was used to identify EI values that were implausible. BMR was estimated via the Schofield predictive equations [26]. The <1.35 cut-off for implausible EI values was used because "it is highly unlikely that any normal, healthy free-living person could habitually exist at a PAL [i.e., TEE/BMR] of less than 1.35" [22].
It is important to note that the <1.35 cut-off does not assess all forms of misreporting (e.g., over-reporting). To avoid the confounding effects of potential over-reporting, all rEI/BMR values >2.40 [27] were excluded from analyses of underreporting. One form of misreporting that neither cut-off addresses is the underreporting of EI from a high caloric intake associated with elevated levels of physical activity.

* Physical activity (PA) values were 1.12 and 1.14 for normal-weight (NW) men and women, respectively, and 1.12 and 1.16 for overweight/obese (OW/OB) men and women, respectively. The use of these values assumes a physical activity level (PAL) of ≥1.4 and <1.6, which is indicative of a "low active" population [24].
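The rEI/BMR plausibility screen described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the study's SAS/SPSS code: it assumes rEI and estimated BMR are already available in kcal/day, and the function name `classify_report` and the label strings are hypothetical. Ratios exactly equal to a cut-off are treated as plausible, which follows from the stated rules (below 1.35 and above 2.40).

```python
UNDER_CUTOFF = 1.35  # rEI/BMR below this is treated as implausible (under-reporting)
OVER_CUTOFF = 2.40   # rEI/BMR above this is treated as potential over-reporting


def classify_report(rei_kcal: float, bmr_kcal: float) -> str:
    """Label one respondent's reported energy intake by its rEI/BMR ratio.

    Hypothetical helper for illustration; the labels are not from the paper.
    """
    if bmr_kcal <= 0:
        raise ValueError("BMR must be positive")
    ratio = rei_kcal / bmr_kcal
    if ratio < UNDER_CUTOFF:
        return "under-reporter"
    if ratio > OVER_CUTOFF:
        return "over-reporter"
    return "plausible"
```

For instance, a reported intake of 1,800 kcal/day against an estimated BMR of 1,500 kcal/day gives a ratio of 1.2 and would be flagged as under-reporting under this rule.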

Statistical Analyses
Data processing and statistical analyses were performed using SAS®, V 9.2 and SPSS® V.19 in 2012-2013. Analyses accounted for the NHANES' complex survey design via the incorporation of stratification, clustering and post-stratification weighting to maintain a nationally representative sample for each survey period. All analyses included adjusted means, and α <0.05 (2-tailed) was used to identify statistical significance.

As Table 1 depicts, the 95% confidence intervals (CI) suggest that all mean rEI values for women and six of nine mean rEI values for men were apparently implausible. Table 2 depicts the rEI/BMR index for all women by BMI categories from NHANES I through NHANES 2009-2010.

Examination of Underreporting via rEI/BMR
As Table 2 depicts, the 95% CI suggest that in 20 of the 27 measurement categories (i.e., three BMI categories and nine surveys) the rEI values were not in the physiologically plausible range. The overall mean for rEI/BMR values for the total sample of women (n = 33,431) across all NHANES was 1.19 (95% CI: 1.18, 1.20) and therefore not physiologically plausible. Table 3 depicts the rEI/BMR index for all men by BMI categories from NHANES I through NHANES 2009-2010.
As shown in Table 3, the 95% CI suggest that in 12 of 27 measurement categories (i.e., three BMI categories and nine surveys), the rEI values were not in the physiologically plausible range. The overall mean value for rEI/BMR for the total sample of men (n = 27,285) across all NHANES was 1.31 (95% CI: 1.30, 1.32), and therefore not in the physiologically plausible range. As Figure 1 depicts, across the entire study period (i.e., 1971-2010) the majority of respondents did not report plausible rEI values in any survey. When stratified by sex and BMI categories, plausible reporting in OB women ranged from a low of ~12% in NHANES I and II to a high of 31%.

Table 4 depicts the disparity of rEI and TEE for men and women (20-74 years). These values were calculated by subtracting the IOM TEE from the NHANES rEI. Negative values indicate the kilocalorie-per-day (kcal/day) value of underreporting. As Table 4 depicts, in no survey group (i.e., men & women in 9 surveys) does the 95% CI for the disparity between rEI and TEE include zero. This suggests that underreporting of EI occurred in both men and women, and across all surveys. The overall mean value for the disparity of rEI and IOM TEE for the total sample of women (n = 33,431) across all NHANES was −365 kcal/day (95% CI: −378, −351), or ~18% of TEE, and for the total sample of men (n = 27,285) was −281 kcal/day (95% CI: −299, −264), or ~10% of TEE.
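The disparity calculation used for Table 4 (rEI minus estimated TEE, also expressed as a share of TEE) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, TEE is taken as an input rather than computed from the IOM equations, and the example numbers in the note below are made up rather than drawn from NHANES.

```python
def disparity(rei_kcal: float, tee_kcal: float) -> float:
    """Disparity value in kcal/day: reported intake minus estimated expenditure.

    Negative results indicate under-reporting.
    """
    return rei_kcal - tee_kcal


def disparity_pct_of_tee(rei_kcal: float, tee_kcal: float) -> float:
    """Disparity expressed as a percentage of estimated TEE."""
    if tee_kcal <= 0:
        raise ValueError("TEE must be positive")
    return 100.0 * disparity(rei_kcal, tee_kcal) / tee_kcal
```

With a hypothetical respondent reporting 1,800 kcal/day against an estimated TEE of 2,200 kcal/day, `disparity(1800, 2200)` is −400 kcal/day, or roughly −18% of TEE.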

Trends in Underreporting
After the removal of over-reporters, both protocols, that is, rEI/BMR (Figure 1) and the disparity between rEI and IOM TEE (Table 4), exhibited significant decreases in underreporting from NHANES II to NHANES III (p<0.001). There were significant negative linear trends for both men and women in changes in underreporting total caloric intake from NHANES I to NHANES 2009-2010 (rEI/BMR: p<0.001, and disparity: p = 0.028).

Trends in Over-reporting
Across the study period, approximately 4.9% of men and 2.9% of women reported rEI/BMR values suggestive of over-reporting (i.e., rEI/BMR >2.4) with no significant trends. The greatest increase in the percentage of over-reporters between survey periods occurred from NHANES II to NHANES III, with men increasing from 4.1% to 6.4%, and women from 1.7% to 3.4% (both p<0.001). The greatest absolute percentage of over-

Validity of NHANES EI Data
Our results suggest that across the 39-year history of U.S. nutrition surveillance research, rEI data on the majority of respondents (67.3% of women and 58.7% of men) were not physiologically plausible. The historical average rEI/BMR values for all men and women were 1.31 and 1.19, respectively (Table 1). These values are indicative of substantial underreporting. The expected average values for healthy, free-living men and women are ~1.55, with a range of >1.35 to <2.40 [23,27]. In no survey did at least 50% of the respondents report plausible EI values (Figure 1). These data are consistent with previous research demonstrating that the misreporting of EI in nutrition surveys is widespread [9,11,28-34]. Goldberg et al. (1991) demonstrated that in 37 studies across 10 countries, >65% of the mean rEI/BMR values were below the study-specific plausibility cut-off [23]. In addition to the extensive underreporting in our sample, 4.9% of men and 2.9% of women reported rEI/BMR values suggestive of over-reporting (i.e., rEI/BMR >2.40).

Disparity between NHANES rEI and IOM Derived TEE
Throughout the study period (i.e., 1971-2010), the disparities between rEI and TEE values were large and variable across BMI and sex categories, suggesting substantial systematic biases in underreporting (Tables 4, 5, 6). The overall mean disparity values for men and women were −281 kcal/day and −365 kcal/day, respectively. The greatest mean disparity values were −717 kcal/day (25% of TEE) and −856 kcal/day (41% of TEE) in OB men and women, respectively.

Trends in the Validity and Inferences from NHANES rEI Data
As depicted in Tables 1 and 2, and Figure 1, there were large decreases in underreporting between NHANES II and NHANES III. This is clearly evidenced by the increase in rEI/BMR index (Table 1), the large and significant increase in the percent of plausible reporters (Figure 1), and the reduction in the disparity between NHANES rEI and NAS/IOM EER (Table 4). This decrement in underreporting between NHANES II and subsequent surveys across all sex and BMI categories is likely the result of improvements in survey protocols for NHANES III, such as the inclusion of more days of dietary recall (i.e., weekends), automated multi-pass methodology, and increased staff training and quality control (see [35]). The extent of these improvements is notable; for example, the percentage of OB women reporting implausible values decreased from ~88% in NHANES II to 74% in NHANES III.
These changes in measurement protocols led to an apparent increase in mean rEI values that has been reported as an actual increase in population-level EI despite caveats that the "Interpretation of trends in energy and nutrient intakes is difficult when methodologic changes occur between surveys" [36]. Nevertheless, Briefel and Johnson state (without caveat) in their abstract, "During the 30-year period, mean energy intake increased among adults…" [37]. The data presented in the present report refute this inference. When the NHANES dietary measurement protocols were altered after NHANES II, the improved method captured a higher percentage of actual intakes. The apparent increase in mean rEI was merely an artifact of improved measurement protocols and not indicative of a true increase in caloric consumption. Despite this fact, the apparent increase has been regularly published and uncritically accepted as a true upward trend in caloric consumption (e.g., [37,38]) and the cause of the obesity epidemic (e.g., [39,40]).

Changes in Underreporting and Public Policy Recommendations
In addition to the ubiquity of misreporting, there is strong evidence that the reporting of 'socially undesirable' (e.g., high fat and/or high sugar) foods has changed as the prevalence of obesity has increased [12][13][14][15]. Additionally, research has demonstrated that interventions emphasizing the importance of 'healthy' behaviors may lead to increased misreporting as participants alter their reports to reflect the adoption of the 'healthier' behaviors independent of actual behavior change [17,41]. It appears that lifestyle interventions "teach" participants the socially desirable or acceptable responses [17,42]. As such, the ubiquity of public health messages to 'eat less and exercise more' may induce greater levels of misreporting and may explain the recent downward bias in both self-reported EI [20] and body weight [17,43], especially given that social desirability bias is often expressed in the underreporting of calorically dense foods [44].
Selective misreporting of specific macronutrients has important ramifications for epidemiological research and nutrition surveillance. Heitmann and Lissner (2005) demonstrated that the selective misreporting of dietary fat by groups at an increased risk of chronic non-communicable diseases may result in an overestimated association between fat consumption and disease [45]. If the potentially negative effects of high-fat diets are overestimated due to selective misreporting, current recommendations for fat intake may be overly conservative [45].

Additional Systematic Biases of Nutrition Surveillance Data
In addition to known sources of systematic reporting error, there are numerous sources of systematic bias in nutrition surveillance research protocols that are not addressed via our data. Another potentially large source of error is the translation of food and beverage consumption data (e.g., 24HR) into nutrient energy values via nutrient composition databases. The accuracy of this translation relies on a number of assumptions that are rarely justified. As cited earlier, research on misreporting shows that reports do not accurately reflect the quantity or number of foods consumed, and are not representative of usual intakes [12][13][14][15][46][47][48][49][50]. Given that the basic methodological assumptions are violated, it is not surprising that research has demonstrated that food data to nutrient energy conversions are "riddled with potential pitfalls at all stages" that "hamper the interpretability of the results" [51][52][53], and represent a major source of systematic error in national nutrition surveillance efforts [2].
Throughout its history, the NHANES has relied upon databases of varying quality and composition for the post-hoc conversion of food and beverage consumption (i.e., 24HR) data into energy values [2][3][4][5][53]. This makes the analysis of trends extremely complex because the nutrient energy (i.e., caloric) values in the databases varied considerably over time [54,55]. Additionally, research has demonstrated that the energy content of restaurant food (and especially food from fast-food outlets) varies significantly when compared to the industry values used in the NNDS [56], and an internal quality review of NHANES 2003-2004 data led to ~400 substantive changes in nutrient and energy values [57]. The results of these limitations are discussed in detail elsewhere [4,5,58].
As with the improvements in the NHANES survey protocols, the progressive alterations to the nutrient database combined with changes in the types of foods that are available for consumption led to artifactual differences in nutrient and energy consumption estimates that frustrate efforts to examine trends in caloric consumption [58]. To account for these changes, researchers must maintain the real differences in the composition of foods while correcting for artifactual differences attributable to improvements in the quality of nutrient data [58]. Given the lack of comprehensive crossover studies and metrics for adjustment as the food and nutrient databases evolved, papers examining trends in caloric consumption must be treated with skepticism [51,58].

Commercially Prepared Foods and Meals Away From Home
One of the most prominent systematic errors from 24HR data-to-nutrient energy conversions is due to the increased reliance on the food service industry and the substantial rise in meals eaten 'away from home' [59][60][61]. As stated previously, the vast majority of foods and beverages in the NNDS have not been evaluated empirically and research has demonstrated that the energy and macro/micro nutrient content of commercially prepared foods varies significantly compared to the industry values used in the NNDS [56]. When foods or commodities are not in the database, substitutions are necessitated. For these interpolations to be accurate, the analogues must be similar in composition to the consumed food or beverage. This is extremely difficult to perform in practice because no two foods or commodities are identical, and local vs. imported foods/commodities differ significantly. For example, in survey data collection, knowledge of the specific preparation and cut of beef is essential since the energy content of generic beef substitutions may differ dramatically (e.g., 166 kcals per 100 grams in round steak to 257 kcals in top sirloin [62]) [63,64]. Given these realities, USDA estimates of caloric consumption may be increasingly inaccurate as the number of foods and beverages supplied by the commercial sector expands rapidly.
Recent research has attempted to quantify the changes in consumer packaged foods and beverages, and their impact on the American diet [65]. Nevertheless, these efforts suffer from the same limitations as all food data-to-nutrient energy value conversions via nutrient composition databases. Additionally, the translation of "as-purchased" foods and beverages (using information from the commercial sector) to "as-consumed" energy and macro/micronutrient content for national surveillance relies on the accurate quantification of food preparation and waste [65]. Unfortunately, these data are limited and highly variable [52,66]. In a report from the USDA's Economic Research Service, Muth et al. (2011) state that the current data are incomplete and overstate actual consumption because the level of "documentation of food losses… ranged from little to none for estimates at the retail and customer levels" [67]. These results clearly demonstrate the conceptual and methodological complexity of translating food and beverage purchases into nutrient energy and macro/micronutrient intake in the context of a rapidly evolving food supply.

Methods of Adjustment for Systematic Biases
There are various methods that attempt to improve estimates of caloric consumption derived from self-reported dietary intake [32,68-72]. While these methods may improve the shape of the distribution of the estimates, none can address the significant systematic biases described in this report. For example, the National Research Council and the Iowa State University methods provide significantly improved estimates of the shape of the distribution, but do not substantially improve estimates of mean energy intake (10-15% underestimation) or protein consumption (6-7% underestimation) [70].

Strengths and Limitations
A strength of the present study was the use of the established rEI/BMR method for the determination of physiologically implausible EI values. We used a liberal cutoff (i.e., <1.35) that is below the study-specific theoretical cutoff for our smallest subgroup (i.e., n >400). The use of the more conservative cutoff of rEI/BMR <1.50 recommended by Goldberg et al. (1991) [22] increased underreporting by 10% in women and 7% in men across all surveys. A second strength was the use of a rEI/BMR >2.4 for the elimination of potential over-reporters to correct the limitations of previous research [29].
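The sensitivity of the underreporting estimate to the choice of cut-off can be illustrated with a short Python sketch. It is a simplified, unweighted illustration: the helper `share_below` and the example ratios are hypothetical, not NHANES data, and the survey weighting applied in the actual analyses is ignored here.

```python
from typing import Iterable


def share_below(ratios: Iterable[float], cutoff: float) -> float:
    """Unweighted fraction of rEI/BMR ratios falling below a given cut-off."""
    values = list(ratios)
    if not values:
        raise ValueError("need at least one ratio")
    return sum(r < cutoff for r in values) / len(values)


# Hypothetical ratios for illustration only; not NHANES data.
example_ratios = [0.95, 1.10, 1.30, 1.40, 1.45, 1.60, 1.80, 2.10]
liberal = share_below(example_ratios, 1.35)       # 3 of the 8 ratios fall below 1.35
conservative = share_below(example_ratios, 1.50)  # 5 of the 8 ratios fall below 1.50
```

Raising the cut-off can only keep or increase the share of respondents classified as under-reporters, which is why the 1.35 threshold is the more liberal of the two.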
Finally, the use of the IOM factorial equations for estimating TEE for specific subgroups (i.e., OW & OB respondents) in the calculation of disparity values is a significant strength. The results of this additional protocol demonstrated significant underreporting in all surveys, and that the disparity values closely paralleled the implausible values in 15 of the 18 sub-groups (i.e., men & women in 9 surveys). The close agreement between these two dissimilar protocols increases confidence in our results and conclusions.
A potential limitation to our analysis was the use of the Schofield predictive equation for estimating BMR. The Schofield predictive equations may overestimate BMR in some populations [73,74]. If the Schofield equation overestimated BMR, a greater percentage of survey respondents would be classified as underreporters. To address this potential limitation, we performed the analyses using the Mifflin equation [75], which has been validated in OW and OB populations such as that of the U.S. [74]. The results of those analyses were similar to those obtained using the Schofield equation, with substantial underreporting (>50%) in all surveys, significant trends in changes in underreporting, and a small increase in over-reporting. To remain consistent with past research on implausible rEI and underreporting [29,33], we chose to present the results from the Schofield predictive equations.
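For readers who wish to reproduce this kind of sensitivity check, the Mifflin equation can be sketched in Python as below. The coefficients are those of the commonly published Mifflin-St Jeor equation (weight in kg, height in cm, age in years, result in kcal/day); they are not stated in this report, and whether the analyses applied the equation in exactly this form is an assumption. The function name `mifflin_bmr` is hypothetical.

```python
def mifflin_bmr(weight_kg: float, height_cm: float, age_yr: float, sex: str) -> float:
    """Estimate resting metabolic rate (kcal/day) with the Mifflin-St Jeor equation.

    Assumes the commonly published coefficients; used here as a stand-in for BMR.
    """
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_yr
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female'")
```

Dividing a respondent's rEI by this estimate, instead of by the Schofield estimate, yields the alternative rEI/BMR ratio; a lower BMR estimate raises the ratio and so classifies fewer respondents as under-reporters.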

Conclusions
Throughout its history, NHANES dietary measurement protocols have failed to provide accurate estimates of the habitual caloric consumption of the U.S. population. Furthermore, successive changes to the nutrient databases used for the 24HR data-to-energy conversions and improvements in measurement protocols make it exceedingly difficult to discern temporal patterns in caloric intake that can be related to changes in population rates of obesity. As such, there are no valid population-level data to support speculations regarding trends in caloric consumption and the etiology of the obesity epidemic. Because under-reporting and physiologically implausible rEI values are a predominant feature of U.S. nutritional surveillance, the ability to generate empirically supported public policy and dietary guidelines relevant to the obesity epidemic based on these data is extremely limited.