Assessing dietary intakes from household budget surveys: A national analysis in Bangladesh

Background: Accurate national information on dietary intakes, including heterogeneity among individuals, is critical to inform health implications and policy priorities. In low- and middle-income countries, household expenditure surveys constitute the major source of food data, but with uncertain validity for individual-level intakes.

Objective: To investigate how individualized dietary consumption estimated from household survey data compared with individual-level 24-hr dietary recalls (24hR); and to assess potential heterogeneity by method for individualizing household intakes, dietary indicator, and individual characteristics (age, sex, education, religion, household income).

Methods: We evaluated data from the 2011–2012 Bangladesh Integrated Household Survey (BIHS), which included household-level consumption data (5,503 households) and individual-level dietary data based on 24hR from these households (22,173 participants). Household and 24hR estimates were standardized and harmonized for 33 dietary indicators, including 9 food groups, total energy, 8 macronutrients, and 15 micronutrients. Individual consumption was estimated from household data using two approaches, the Adult Male Equivalent (AME) and the per capita (PC) approach. For each dietary indicator, differences in household vs. individual mean estimates were evaluated overall and by strata of individual characteristics, using Spearman's correlations and univariate and multivariate linear regression models.

Results: Individualized household estimates overestimated individual intakes from 24hR for all dietary factors using either estimation method (P<0.001 for each), except for starchy vegetables (AME: P = 0.15; PC: P = 0.85). For foods, overestimation ranged from 4% for seafood to about 240% for fruits, and for nutrients from 11% for carbohydrates and polyunsaturated fats to 55% for vitamin C, with similar overestimation for the AME and the PC method. By strata, overestimation was modestly higher in men vs. women, in children (0-10y) vs. adolescents (11-19y) and adults (20-44y, ≥45y), among adults of higher (≥6y) vs. lower (<6y) education, in Muslims vs. other religions (Christians, Hindus), and for the lowest vs. all other income groups. This overestimation was notably higher in young children (0-5y) vs. all other age groups and in the lowest vs. all other income groups. Underestimation was rarely observed, for example for milk intake (-56%) in young children (0-5y). The PC approach did not capture heterogeneity in validity of estimation of different dietary factors by age, mainly in children (0-5y, 6-10y). Spearman's correlations between individualized household estimates and 24hR data were higher for the AME (0.30–0.70) than the PC (0.20–0.50) approach. Findings were similar with and without multivariate regression, with proportions of variance (R²) in 24hR intakes explained by the AME being generally greater than those explained by PC estimates, yet still low to modest.

Conclusions: In this national survey, established methods for estimating individual-level intakes from household surveys produced overestimation of intakes of nearly all dietary indicators, with significant variation depending on the dietary factor and modest variation depending on individual characteristics. These findings suggest a need for new methods to estimate individual-level consumption from household survey estimates.



Introduction
The major global health and economic impacts of food insecurity and undernutrition have been recognized, now joined by tremendous diet-induced burdens of non-communicable diseases (NCDs) [1-5]. In nearly every region of the world, suboptimal diet is a leading modifiable risk factor for mortality and morbidity, exceeding the burdens attributable to most other global health challenges [6-8]. Even modest dietary changes are associated with improvements in maternal and child undernutrition and micronutrient deficiencies [9-13], as well as meaningful reductions in NCDs [14-17]. Based on the crucial role of nutrition in health, a better understanding of patterns and distributions of dietary habits globally is critical to inform and establish dietary priorities and improvement goals [6,18]. For most countries around the world [18,19], particularly low- and middle-income countries (LMICs), limited survey data are available on individual-level dietary intakes. Because of this, methods have been developed to utilize all available individual-level dietary data worldwide, together with estimates of national food-supply availability (food balance sheets) from the UN Food and Agriculture Organization (FAO) [20-22], to estimate individual-level dietary intakes in every country globally [6,19,23-27]. Yet, the utility of another major potential source of dietary information, household-level consumption and expenditure surveys (HCES), is not well established. Household surveys have the advantage of being done regularly (typically every 3-5 years in most countries) and including large samples, providing potentially relevant data to augment existing estimates of individual-level dietary intakes.
However, such surveys are designed mainly to evaluate financial and living conditions of households, rather than specifically nutrition or food habits; they also collect data on overall household food consumption, not individual intakes [28-30]. Other potential limitations of household surveys include short recall reference periods, a limited number of foods assessed, absence of data on within-household distributions of intakes, and insufficient accounting for food waste, for foods acquired for purposes other than household consumption (e.g., for storage, guests, livestock), for foods consumed away from home, and for cooking effects on food weight and nutrient content [30-33]. As such, their potential validity for estimating individual-level dietary intake distributions, as well as potential variability in this validity according to individual characteristics such as age, sex, education, or income, is not established.
Two main methods have been proposed for estimating individualized dietary intakes from household data: the adult male equivalent (AME) and the per capita (PC) approach [20,34-42]. The PC approach assumes that each person within the household has equal access to and intake of food, while the AME assumes that each person's intake is proportional to their age- and sex-specific caloric requirements. However, the validity of such approaches in predicting individual dietary intakes, including for total caloric/energy intake, major foods, macro- and micronutrients, and by various population subgroups, remains uncertain. Few prior studies have used and tested the PC [39] or AME approach [34,35,40,41]. These have focused on total energy and a limited number of macro- or micronutrients (e.g., protein, fat, fiber, iron) [34,40,41] or specific foods associated with malnutrition [35,39]; have included only specific population groups, such as women of reproductive age and young children (up to 5y) [34,35] rather than the general population [39-41]; and have used heterogeneous sources of household-level dietary data, including acquisition [35,39], consumption [34], or data computed from individual intakes [40,41]. Multiple other foods, macronutrients and micronutrients linked to the double burden of malnutrition, as well as potential differences by individual characteristics or method of estimation (PC, AME), have not been evaluated. To address these gaps in knowledge, we compared estimates of individualized dietary intakes from household data to 24-hr recall (24hR) dietary estimates among the same individuals in the nationally representative Bangladesh Integrated Household Survey (BIHS). We evaluated validity overall as well as according to method of estimation (AME, PC), dietary factor, and key individual characteristics.

Dietary survey
We utilized the BIHS 2011-2012 [43], a comprehensive nationally representative survey from an LMIC that includes both household survey and individual-level 24hR dietary estimates from the same individuals. BIHS data are publicly available [43], and household-level dietary data further meet the International Household Survey Network (IHSN) reliability and relevance assessment criteria (Table A in S1 File) [30]. BIHS used a two-stage stratified sampling design and covered a total of 6,503 households including 27,285 individuals (47.6% men, 0-120 years, mean age 26.6 (SD: 19.9) years) [43,44]. Of those, 5,503 households were representative of rural Bangladesh and 2,040 of southwest Bangladesh as part of the Feed the Future (FTF) global hunger and food security initiative; 1,040 households contributed to the representativeness of both national and FTF zone samples [44]. For the present analysis, and in line with local experts, we used the BIHS national sample, which included 5,503 households and 23,135 individuals. Of these, we excluded 962 individuals with missing 24hR data who were not home at the time of the interview, did not report any consumption of foods or beverages for unknown reasons, or were exclusively breastfed babies. Excluded individuals did not differ in key sociodemographics compared to the overall sample, yet as expected, they were younger (17.0±19.6 years), since 26% were exclusively breastfed babies. The final analytical sample consisted of 5,503 households and 22,173 individuals (Table 1). Training of the researchers Bangladeshi consulting firm with expertise in complex surveys and data analysis. IFPRI researchers and the consulting firm experts trained experienced enumerators, researchers, and editors to edit the completed questionnaires during the survey.
24hR data were collected in-person by trained interviewers using an open-ended recall. The household member responsible for preparing the meals (women 98.8%, 36.2±12.3 years) reported the foods (single-ingredient, mixed dishes) consumed during the previous day in the household from any source, including own cooking, purchased foods, and gifts. Information was collected on the total "as consumed" weight of each food item, the disaggregated ingredients and corresponding raw weights in mixed dishes, and on how much of these food items was consumed by each household member, stored as leftovers, thrown away, or given to guests, others, and animals/livestock.
Household-level consumption was assessed by a 7-day 287-food item questionnaire, including 37 food items for food consumed away from home. Foods consumed and their quantity for the household (raw weight for single-ingredient foods; cooked weight for mixed dishes) were reported by the same household member as for the 24hR. Dietary data collection was performed from December 2011 to March 2012. Details on the BIHS administration and questionnaires can be found elsewhere [43].

Dietary dataset harmonization
Dietary data were harmonized within and between the 24hR and household datasets. This process involved 7 key steps (Appendix A in S1 File):
1. Dataset retrieval: identification and retrieval of relevant dietary and sociodemographic BIHS datasets and variables.
2. Unique food item identification and description: identifying the unique food items (single-ingredient or disaggregated ingredient) across the diet assessment methods by matching their available food description, further accounting for food consumed away from home.
3. Food matching: matching food items to available food composition data [45,46] for nutrient profiling (if nutrient composition was available for the overall recipe/mixed dish then that was preferred), further accounting for alterations in nutrient content during cooking (use of retention factors [45,47,48]) [49,50].
4. Unit standardization: accounting for non-edible portions and cooking alterations (use of yield factors [45,47,51,52]), converting and reporting in standardized "as consumed" metrics, i.e., g/day for foods and macronutrients (other than cholesterol), and mg/day or μg/day for micronutrients.
5. Food classification: classifying unique food items (including disaggregated ingredients) to food groups (e.g., fruits, vegetables) using previously established methods [19,53,54].
6. Individualization of household consumption: household food and nutrient consumption was individualized by the AME [20] and PC [55] approach (Appendix B in S1 File).
7. Final dataset preparation: merging and creating a complete dataset including individual-level dietary and sociodemographic information.
Local experts provided advice on each of those steps, particularly for steps 2, 3, and 5.
The AME method [20] was our primary approach for individualizing household consumption [34,35,37]. This method assumes that the intra-household food distribution is proportional to the individual's share of total household energy requirements, and as such household members do not receive an equal share of the food available for consumption. The energy requirements of household members of different age, sex, and status (pregnant/lactating women) were expressed in proportion to an adult male's energy requirements (Appendix B in S1 File). In secondary analysis, we used the PC approach [56] to estimate the per capita consumption, assuming that the available food in the household is equally distributed among household members.
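The two individualization rules described above can be illustrated with a minimal Python sketch. The energy requirements, the household composition, and the function names below are hypothetical and chosen only for illustration; they are not the FAO-based values of Appendix B, and pregnancy/lactation adjustments are omitted.

```python
# Hypothetical daily energy requirements (kcal/day); illustrative values only,
# NOT the requirement tables used in the survey analysis.
ADULT_MALE_KCAL = 3000.0  # assumed reference adult male


def ame_weights(requirements_kcal):
    """Express each member's energy requirement relative to the adult male."""
    return [kcal / ADULT_MALE_KCAL for kcal in requirements_kcal]


def individualize_ame(household_total, requirements_kcal):
    """Split a household total in proportion to each member's AME weight."""
    weights = ame_weights(requirements_kcal)
    total_weight = sum(weights)
    return [household_total * w / total_weight for w in weights]


def individualize_pc(household_total, n_members):
    """Split a household total equally among all members (per capita)."""
    return [household_total / n_members] * n_members


# Assumed household: adult man, adult woman, young child.
requirements = [3000.0, 2400.0, 1200.0]
rice_g_per_day = 1100.0  # hypothetical household consumption, g/day

ame_shares = individualize_ame(rice_g_per_day, requirements)
pc_shares = individualize_pc(rice_g_per_day, len(requirements))

print("AME shares (g/day):", [round(s, 1) for s in ame_shares])
print("PC shares (g/day): ", [round(s, 1) for s in pc_shares])
```

With these assumed inputs, the AME rule assigns the child about 200 g/day while the PC rule assigns about 367 g/day. Both rules redistribute the same household total, so neither can correct a household total that is itself too high.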

Selection of dietary targets
We initially considered 48 dietary factors (18 food groups, 11 macronutrients, 18 micronutrients, and total energy) based on evidence for etiological effects on a) major chronic diseases (e.g., type 2 diabetes, stroke, heart disease) and related risk factors (e.g., blood lipids, blood pressure, obesity) [7,8,57-62], or b) deficiency-related health conditions and mortality (e.g., anemia, blindness, maternal mortality) [1,10,12,13]. Among these 48 factors, the final selection of dietary indicators was based on observed intake levels (foods) and available food composition data (nutrients) in this survey. For example, if intake for a selected food group was low, it was combined into a broader category (e.g., whole grains and refined grains were combined into total grains due to low whole grain intake), or omitted if very rarely consumed (e.g., sugar-sweetened beverages). Nutrients were not analyzed if food composition data were not available (iodine), missing for >70% of foods (trans fats, omega-6 fats, vitamin B12, selenium) or missing for major dietary sources (omega-3 contents for seafood).

Statistical analysis
Average dietary consumption was estimated and compared for the individual 24hR intakes and individualized household estimates overall and by population strata, including by age (≤5, 6-10, 11-19, 20-44, and ≥45 years), sex (men, women), education (<6 years of education, ≥6 years of education), religion (Muslims, other religions), and monthly household income (quintiles). To assess how well individualized household estimates ranked participants in comparison to 24hR estimates, Spearman's correlations were used.
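As an illustration of the ranking comparison, the following Python sketch computes Spearman's correlation as the Pearson correlation of average ranks. The intake values are hypothetical, and the function names are ours; the survey analysis itself was run in a statistical package, not with this code.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical intakes (g/day) for five individuals.
recall_24h = [310.0, 450.0, 280.0, 520.0, 390.0]
ame_estimate = [400.0, 480.0, 350.0, 500.0, 610.0]

rho = spearman(recall_24h, ame_estimate)
print(round(rho, 2))
```

Because the statistic depends only on ranks, a household estimate that is uniformly too high can still rank individuals well; ranking and mean agreement are therefore assessed separately.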
To assess differences in dietary means and also the proportion of variance explained, the relation between 24hR and AME estimates was assessed using univariate and multivariable random-intercept linear regression analysis, which accounted for household clusters [19]:

$$\text{24hR estimate}_{ij} = \beta_0 + \beta \times \text{AME estimate}_{ij} + \beta' \times \text{covariates}_{ij}$$

where $\text{24hR estimate}_{ij}$ represents intake estimates for individual $i$ and household $j$; $\beta_0$ represents the intercept; $\beta$ (slope) represents the difference in the 24hR mean for a 1-unit difference in the AME mean; $\text{AME estimate}_{ij}$ represents individualized intake estimates for individual $i$ and household $j$; $\text{covariates}_{ij}$ were covariates specific to individual $i$ and household $j$; $\beta'$ represents a set of regression coefficients for differences in the 24hR mean for a 1-unit difference in covariates; and random effects were modeled for $\beta_0$ and fixed effects were modeled for the $\beta$s. Analyses were repeated for the PC approach. For the multivariable models, we selected covariates which would be available in household surveys, including basic demographics, such as age and sex, in the minimally adjusted model, and additionally education, religion, household income, respondent's characteristics (age, sex, education), and household characteristics (household size, number of children, and wastage percentage) in the fully adjusted model.
To assess potential heterogeneity, regression analyses (univariate, multivariate) were performed by sex, age, sex and age, education, religion, and household income. There was no correlation between education and income, as assessed by Cohen's kappa coefficient (κ = -0.02), which justified stratified analysis by each variable separately. Stratification by all demographic factors simultaneously was not performed because of small sample sizes and unstable estimates in some strata. Missing covariate values for education (n = 17, 0.08%) were imputed with a single regression imputation as the missing values were very few, assuming education was missing at random [63], and using age, sex, and household size as predictors; predictors had no missing values.
Analyses were performed using STATA 14 (College Station, TX: StataCorp LP). Results from statistical tests were considered significant with two-sided alpha = 0.05.

Results
Considering relative rankings (Spearman's correlations) of individuals, correlations between individualized household estimates and 24hR data were generally higher for AME compared with PC estimates (Table 2). For AME estimates, correlations were generally modest (ranging from about 0.30 to 0.70), while for PC estimates correlations were lower (ranging from about 0.20 to 0.50).
For all dietary indicators except starchy vegetables, mean individualized household estimates significantly exceeded individual intakes (P<0.001 each) (Table 2). The degree of overestimation was generally very similar between the AME and PC approaches (Fig 1). Among different foods, the overestimation was smallest for seafood (AME: 4%, PC: 4%) and total grains (8%, 7%), and greatest for non-starchy vegetables (56%, 54%) and fruits (242%, 239%). Total energy was overestimated by about 12%. Among macro- and micronutrients, the degree of overestimation ranged from about 11% to 55% and was highest for vitamins A (50%, 49%) and C (55%, 54%) and lowest for carbohydrates (12%, 11%) and protein (13%, 12%). The shape of the dietary factor distribution, including narrowness, was generally similar between the AME and 24hR estimates (Figure A in S1 File). The magnitude of the narrowness, as assessed by the standard deviations, was also quite similar between the two types of estimates (Table 2). The AME estimates were generally characterized by less variable distributions.
In unadjusted linear regression analyses, proportions of the variance (R²) in 24hR intakes explained by individualized household estimates (AME) were generally modest to low (Table 3, Table C in S1 File). For foods, for example, modest values were seen for total grains (R² = 0.48), milk (0.24) and fats/oils (0.23), and lower values for fruits (0.06), legumes (0.13) and meat/eggs (0.13). For total energy and nutrients, variation explained was highest for niacin (0.50), vitamin B6 (0.49), energy (0.46) and carbohydrates (0.46), and lowest for sodium (0.08) and vitamin C (0.11). Proportions explained by the PC estimates were generally smaller. Adjustment for sex and age improved the variation explained for all dietary factors, mainly for energy, protein, carbohydrates, fiber, potassium and magnesium. Additional adjustments did not appreciably change observed relations.
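To make the "proportion of variance explained" concrete, the sketch below fits a single-predictor least-squares line and reports R². It is a deliberate simplification: it uses ordinary least squares and ignores the household random intercept and covariates of the models reported here, and the data and function name are hypothetical.

```python
def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give the share of variance explained.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical intakes (g/day): household-based estimate as predictor,
# 24hR intake as outcome.
ame_estimate = [400.0, 480.0, 350.0, 500.0, 610.0]
recall_24h = [310.0, 450.0, 280.0, 520.0, 390.0]

intercept, slope, r_squared = simple_ols(ame_estimate, recall_24h)
print(round(slope, 3), round(r_squared, 3))
```

A low R² here means the household-based estimate tracks little of the between-person variation in recalled intake, independently of whether the means agree.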

Findings by sex
Overall dietary consumption patterns were broadly similar by sex, although men often had higher mean intakes across foods, especially for milk, meat/eggs, legumes, and total grains, and certain nutrients, mainly B vitamins, cholesterol, fatty acids and calcium (Tables D and E in S1 File). The overestimation of AME household estimates was modestly higher in men than women for most factors (Fig 1) except for meats/eggs, milk, and dietary cholesterol. In contrast, the PC estimates often produced higher overestimation in women than men, highest for milk (men: 32%, women: 66%), fruits (226%, 255%), meats/eggs (31%, 58%), and cholesterol (19%, 39%). In men vs. women, the variance explained for each dietary factor by individualized household estimates was similar to that seen for the overall population (Table T in S1 File).

Findings by age
Overall dietary consumption patterns were similar by age groups (Tables F-J in S1 File), except that younger adults (20-44 years) generally had higher consumption levels compared to other ages; legume, vitamin A, and vitamin D consumption was higher at older ages, and milk and fruit consumption were lower at older ages. Variability was evident in the relationship between the household estimates and individual intakes by age, especially among children (≤10 years) (Fig 1, Tables F-J in S1 File). For 0-5 year-olds, underestimation was seen for milk (-56%) with AME estimates, while milk was overestimated in all other ages (e.g., 114% in 20-44 year-olds); intakes of most other foods were overestimated to a greater extent in 0-5 year-olds compared with older children and adults. Notably, overestimation in children was substantially higher with the PC method, which did not capture heterogeneity in validity of estimation of dietary factors by age (Fig 1). Variance explained by household estimates for each dietary factor by age was generally lower than that observed for the overall population; across age groups it was higher among younger children (0-5y) vs. all other ages (Table U in S1 File).

Findings by education, religion and household income
Stratification by education revealed similar overall dietary consumption patterns, although adults with higher education (≥6 years) had generally higher intakes, especially for milk, cholesterol, meat/eggs, and fruits (Tables K and L in S1 File). The overestimation of AME estimates was modestly higher in individuals of higher vs. lower education (<6 years). PC estimates were quite similar by education, but diverged considerably from those of the overall population, notably for starchy vegetables (individuals with higher vs. lower education: -12%, -12%), total grains (-8%, -6%), and seafood (-7%, -9%), which were underestimated. Variance explained with the AME method tended to be higher among adults of lower vs. higher education (Table V in S1 File). The opposite was observed with the PC approach, but this was reversed with sex and age adjustment.
Intakes were only modestly higher among other religions (Christians, Hindus) vs. Muslims (Tables M and N in S1 File). The overestimation by both individualized household estimates was similar and generally higher in Muslims vs. other religions, except for seafood intakes (6%, 53%, respectively) (Table W in S1 File). Higher variance was explained in other religions vs. Muslims.
Consumption patterns were similar by household income, with higher intakes generally seen in the highest income group (Tables O-S in S1 File). Differences between the two individualized household estimates (AME, PC) were unremarkable, and overestimation was higher for the lowest vs. all other income groups. Proportions of variance explained in 24hR intakes were generally modestly higher in the middle (2nd, 3rd quintile) income groups (Table X in S1 File).

Table 2 notes: (1) Dietary factors presented had adequate data/information for the present analysis (see Selection of dietary targets). (2) Sample sizes differ because we performed paired analysis for each dietary factor, i.e., for each analysis we used only the individuals with available intake data for both diet assessments. (3) Bangladesh Integrated Household Survey (BIHS) 2011-2012 provided household-level dietary data from a 7-day household consumption questionnaire and individual-level data from 24-hour recalls (24hR). Household consumption was individualized by applying a) the Adult Male Equivalent (AME) method [20] as proposed by FAO [56], assuming moderate physical activity, and b) the per capita (PC) approach assuming equal distribution among household members (Appendix B in S1 File). Individual intake was estimated from 24hR. (4) Spearman correlation coefficients (rho). All correlations were significant (P<0.001).
Regression table notes: [...] and education (<6 years of education, ≥6 years of education), household size, number of children within household, and food wastage percentage (using 24hR data, we calculated for each household the percent of food wastage, i.e., the sum of food waste and food given to guests, others and animals, relative to total consumed food (mean: 11.6%, SD: 13.6)). βs represent the change in the individual intake (24hR) for every unit increase in the respective mean of household estimates. SEs for the intercept and βs are presented. R² represents the coefficient of determination for the overall model.

Discussion
This investigation of household-level data and 24-hour dietary recall information from the same households in rural Bangladesh showed that individualized household estimates significantly exceeded individual intakes for nearly all dietary factors assessed, including by 12% for total energy, 0-242% for major food groups, 11%-30% for macronutrients and 13%-55% for micronutrients. The degree of overestimation varied by both sex and estimation method, with larger overestimation by AME in men and by PC in women; and also for young children (0-5 years), where milk intake was underestimated and intakes of other dietary factors were greatly overestimated. For all dietary factors, low to modest variation in intakes was explained by individualized household estimates, higher for the AME than the PC approach. These novel findings suggest that current methods to utilize household-level survey data are problematic for estimating individual dietary intakes.
Smallest to modest overestimation (<10-15%) was observed for key staple foods, including starchy vegetables, seafood, total grains, legumes and fats/oils; total energy; macronutrients; and specific vitamins and minerals, including niacin and zinc. This small to modest overestimation was similar between the AME and PC approach, and across all strata, except for 0-5 year-old children and individuals of low income in whom all dietary indicators were greatly overestimated. Interestingly, overestimation was higher among adults of higher vs. lower education, but given this population is mostly less educated, these results should be interpreted within that context. These findings suggest that established methods for estimating individual intakes from household surveys could be used to approximate specific dietary indicators, such as most frequently consumed foods, energy and macronutrients. Yet, intakes were still overestimated with further differences noted by individual characteristics and estimation method, and thus future applications of these methods should acknowledge and potentially try to account for this likely limitation.
In contrast, the largest overestimation (≥20%) was seen for foods with high seasonal variation and increased variability between individuals, such as fruits and non-starchy vegetables; and for less commonly consumed foods with fewer questions assessing their consumption in the household questionnaire, such as milk and meats/eggs. Almost all vitamins and minerals assessed were substantially overestimated, particularly those found in the above food sources, such as vitamin A, vitamin C, folate, and calcium. Overestimation was also larger for sodium, consistent with increased variability in salt use between individuals, irrespective of total energy intake levels [64]. Similar findings were generally observed by individual characteristics (other than younger children) and estimation method. Our findings suggest that household estimates may not reasonably estimate dietary intakes of foods that may be underrepresented in the household questionnaire, as well as foods and most vitamins and minerals that are generally characterized by high individual variability. These findings are consistent with methods used to generate such data, which were developed to capture household consumption rather than actual individual intake, and do not account for food wastage, food preparation alterations in weight and nutrient content, or food eaten away from home [65].
Notably, in very young children (0-5 years) all dietary factors were substantially overestimated, except for milk, which was greatly underestimated. Household-level data are challenging for accurately estimating dietary intakes in infants and toddlers [34,35,40]. It is quite usual for children to leave a substantial proportion of their food uneaten [66], and this potential discrepancy between what is offered and what is consumed could contribute to an overall overestimation of household consumption. Furthermore, food consumption by different household members is not necessarily proportional to their energy requirements, particularly for young children and/or for specific foods. Our results showed that 24hR milk intake was highest in very young children compared to all other ages. Yet with the AME method, 0-5 year-olds were assigned the smallest proportion of the household milk consumption (relative to their energy requirements), while with the PC method they were treated as any other individual (assuming equal consumption). These findings confirm that household estimates are not appropriate for estimating dietary intakes of young children, whose dietary patterns vary from the rest of the population.
Both the AME and PC approaches appear to be quite problematic for estimating individual intakes from household-level data, with key differences noted by individual characteristics. The AME approach improved estimation in women and children (0-5 and 6-10 years), whereas the PC approach improved estimation in men, adolescents and adults, and individuals of low or high education, with no substantial differences in the overall population estimates or by religion or income. Women and children (≤10 years) in particular are two of the population groups of greatest interest to several international organizations and priority guidelines [67-70]. These vulnerable populations are more likely to develop nutrient deficiencies, especially in LMICs [71], while there is increasing evidence [72-75] of obesity and NCD originating in early development stages and as a potential consequence of maternal and childhood malnutrition. Our results support the usefulness of the AME over the PC method for these populations, a finding consistent with the method used to derive PC estimates. The PC method is by design crude and assumes equal consumption among household members, thus limiting its ability to capture variations in individual intake. For all dietary factors, the variation explained by the AME estimates was consistently higher than that explained by the PC estimates, though it was still low to moderate. Potential improvements in the AME estimations, further dependent on data availability, could include the use of caloric equations that account for each individual's anthropometrics and actual physical activity levels, to enable more accurate redistribution of household consumption.
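The contrast between the two individualization approaches can be illustrated with a minimal sketch. It assumes a simplified reading of the methods: PC divides household consumption equally among members, and AME divides it in proportion to each member's energy requirement relative to a reference adult male. The function names, the three-person household, the energy requirements, and the milk quantity are all hypothetical and chosen for illustration only; they are not BIHS data or the FAO values used in the analysis.

```python
# Illustrative sketch only: all numbers below are hypothetical, not BIHS data.

def per_capita_shares(household_total, members):
    """PC approach: split household consumption equally among members."""
    share = household_total / len(members)
    return {name: share for name in members}


def ame_shares(household_total, energy_kcal, reference_kcal):
    """AME approach: split household consumption in proportion to each
    member's energy requirement relative to a reference adult male."""
    weights = {name: kcal / reference_kcal for name, kcal in energy_kcal.items()}
    total_weight = sum(weights.values())
    return {name: household_total * w / total_weight for name, w in weights.items()}


def percent_difference(estimate, recall):
    """Percent over- (positive) or under- (negative) estimation vs. 24hR."""
    return 100.0 * (estimate - recall) / recall


# Hypothetical three-person household and daily energy requirements (kcal).
energy_kcal = {"man": 3000.0, "woman": 2400.0, "child_0_5": 1200.0}
household_milk_g = 660.0  # hypothetical household milk consumption, g/day

pc = per_capita_shares(household_milk_g, energy_kcal)
ame = ame_shares(household_milk_g, energy_kcal, reference_kcal=3000.0)

for name in energy_kcal:
    print(f"{name}: PC = {pc[name]:.0f} g, AME = {ame[name]:.0f} g")
```

In this hypothetical household the young child is assigned about 120 g of milk under AME and 220 g under PC. If the child's actual milk intake exceeded both values, both methods would underestimate it, and AME would do so by more, which mirrors the pattern described above for milk in 0-5 year-olds.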
Household surveys have several appropriate uses for their household consumption estimates, including constructing food balance sheets, providing food security indicators and poverty measurements, and informing food and nutrition interventions [30]; yet, their use for assessing dietary quality or examining diet-disease burden relations [42,76-78] can be problematic, as further supported by the present findings. Given that several LMICs rely solely on household surveys for their food consumption estimates [38], these results highlight the need to further adapt existing individualization methods or develop new ones for better approximating individual-level intakes from household-level data. They also highlight the need to investigate the reasons behind the observed overestimations, particularly for the dietary factors and population groups with the largest discrepancies.
The present findings support and greatly expand on prior reports that compared individualized household estimates to individual dietary data for energy and selected macro- and micronutrients such as protein, vitamin C, and iron, but not foods, or that did not evaluate differences by individualization approach and key population subgroups, such as sex, age, education, and income [34-36, 39-41, 79]. In Uganda, AME underestimated the energy-adjusted intakes of key nutrients (e.g., vitamin C, folate, calcium) related to deficiencies in women (15-49y) and young children (2-5y) compared to 24hR estimates [34]. In Cameroon, AME estimates overestimated 24hR intakes for the key foods assessed (vegetable oil, sugar, bouillon cube) in women (15-49y), and either over- (vegetable oil, bouillon cube) or under- (wheat flour, sugar) estimated intakes in children (1-5y) [35]. These studies used different surveys as sources for individual- and household-level data, which were not always comparable (i.e., both nationally representative); the sample was limited to households with only women and/or children of certain ages; AME nutrient estimates were compared with energy-adjusted individual intakes; or only a few nutrients or single food items (for purposes of fortification) were assessed.
In a prior analysis using the same BIHS survey, individual intakes from the 24hR were summed back to the household level rather than using actual household-level data [40]; subsequent application of the AME approach efficiently redistributed energy, iron, zinc, vitamin A, and calcium among household members aged 4 years and above. In similar analyses using computed household dietary data in Ethiopia and Bangladesh, AME estimates compared well with individual intakes of energy, iron, and protein in adults and children, but not in women of reproductive age and infants (<2y), where substantial overestimation was seen [41]. These analyses do not test the validity of the AME approach in individualizing actual household data, which is particularly important when household questionnaires are the only source of dietary data.
Our investigation has several strengths. We systematically quantified differences between estimated individual dietary intake from household-level data and individual 24-hr recalls for multiple dietary indicators and different estimation methods, evaluating rankings within the population, differences in means, and variation explained. We further assessed heterogeneity in this validity according to several key individual characteristics, and for a range of population subgroups, including children, women, and men. We included a wide range of nutrients related to chronic diseases as well as to deficiencies, undernourishment, and child-maternal outcomes [10,12,13]. We adjusted our analyses for food wastage, as reported in the 24hR, and accounted for food consumed away from home in all estimates. To maximize comparability between diet assessment methods, we followed a series of standardized steps to harmonize description, classification, and quantification of food and nutrient intakes, including application of nutrient retention factors and yield factors, highlighting key preparatory methodological steps in individualizing household consumption.
Despite comprehensive approaches to harmonize dietary data, there are inherent limitations in the dietary collection methods. We could not assess certain foods or nutrients, such as sugar-sweetened beverages, iodine, and omega-3 and omega-6 fats. In this survey, as in other LMIC settings [38,40,41,80], the main person responsible for cooking reported 24hR intakes for all other household members; though this is standard for children, it is generally not recommended for other adults, as it could lead to systematic reporting biases [81]. Only one 24hR was administered, which may have affected the accuracy of the estimated individual intakes. Yet, a single 24hR can provide valid estimates of the absolute mean "usual" intake of a population subgroup, as assessed in the present analysis. Energy requirements for the AME approach were based on standard FAO equations that may be less sensitive in capturing individual variation. Seasonality (monga period in rural Bangladesh) was not covered [82], and our findings may differ for certain foods with substantial seasonal variation. Our results are based on a rural low-income population and their generalizability to urban or middle-income populations may be limited. Conversely, rural low-income populations globally are more likely to lack individual-level surveys and are, thus, most relevant to assess in the present analysis.
Future studies are needed to replicate the validity of and extend our approaches in different settings and populations over time.
In conclusion, household estimates substantially overestimated individual intakes in a national survey in rural Bangladesh, with significant heterogeneity according to sex, age, education, and income. The methodology developed in the present analysis showed that current methods for estimating individual intakes from household-level data are problematic, yet it confirmed the usefulness of the AME vs. the PC approach in better approximating dietary intakes for key populations, mainly children and women. These findings will facilitate future use of household consumption estimates by scientists and policy makers to more accurately estimate dietary intakes when household questionnaires are the only source of dietary data. Leveraging national household surveys already in place to routinely collect individual-level dietary data, even in a reasonably powered subset of the population, would be of great value to LMIC settings [83]. Given the importance of diet as a global risk factor for health, disparities, and sustainability, national investment in the routine collection of individual-level dietary data should be prioritized for accurate diet assessment, burden analyses and policy implementation.
Economics, University of Dhaka, for assistance with the BIHS dataset and dietary dataset preparation.