Prediction of undernutrition and identification of its influencing predictors among under-five children in Bangladesh using explainable machine learning algorithms

Md. Merajul Islam; Nobab Md. Shoukot Jahan Kibria; Sujit Kumar; Dulal Chandra Roy; Md. Rezaul Karim

doi:10.1371/journal.pone.0315393

Abstract

Background and objectives

Child undernutrition is a leading global health concern, especially in low and middle-income developing countries, including Bangladesh. Thus, the objectives of this study are to develop an appropriate model for predicting the risk of undernutrition and identify its influencing predictors among under-five children in Bangladesh using explainable machine learning algorithms.

Materials and methods

This study used the latest nationally representative cross-sectional Bangladesh demographic health survey (BDHS), 2017–18 data. The Boruta technique was implemented to identify the important predictors of undernutrition, and logistic regression, artificial neural network, random forest, and extreme gradient boosting (XGB) were adopted to predict undernutrition (stunting, wasting, and underweight) risk. The models’ performance was evaluated through accuracy and area under the curve (AUC). Additionally, SHapley Additive exPlanations (SHAP) were employed to illustrate the influencing predictors of undernutrition.

Results

The XGB-based model outperformed the other models, with accuracy and AUC of 81.73% and 0.802 for stunting, 76.15% and 0.622 for wasting, and 79.13% and 0.712 for underweight, respectively. Moreover, the SHAP method demonstrated that father’s education, wealth, mother’s education, BMI, birth interval, vitamin A, watching television, toilet facility, residence, and water source are the influential predictors of stunting; BMI, mother’s education, and BCG are the influential predictors of wasting; and father’s education, wealth, mother’s education, BMI, birth interval, toilet facility, breastfeeding, birth order, and residence are the influential predictors of underweight.

Conclusion

The proposed integrating framework will be supportive as a method for selecting important predictors and predicting children who are at high risk of stunting, wasting, and underweight in Bangladesh.

Citation: Islam MM, Kibria NMSJ, Kumar S, Roy DC, Karim MR (2024) Prediction of undernutrition and identification of its influencing predictors among under-five children in Bangladesh using explainable machine learning algorithms. PLoS ONE 19(12): e0315393. https://doi.org/10.1371/journal.pone.0315393

Editor: Benojir Ahammed, Khulna University, BANGLADESH

Received: May 27, 2024; Accepted: November 25, 2024; Published: December 6, 2024

Copyright: © 2024 Islam et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data set used and analyzed in this study is freely available at the Demographic and Health Surveys (DHS) program website https://dhsprogram.com/data/available-datasets.cfm. Interested researchers can freely obtain the data by registering at the website https://dhsprogram.com/data/new-user-registration.cfm. The step-by-step instructions on how to register and download the data are provided at: https://dhsprogram.com/data/Access-Instructions.cfm.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors declare that they have no competing interests.

1. Introduction

Malnutrition refers to deficiencies, excesses, or imbalances in an individual’s energy and/or nutrient intake [1]. It encompasses two main types of medical conditions. The first is "undernutrition," which includes stunting, wasting, and underweight [2]. The other is non-communicable diseases linked to unhealthy diets, such as overweight, obesity, and related problems. Malnutrition, particularly undernutrition in early childhood, has an adverse effect on children’s physical and mental development and poses a significant risk for various chronic diseases, including both communicable and non-communicable [3–5]. Undernutrition can lead to individuals becoming undernourished, making them more susceptible to illness, increasing their chances of infection, and raising the risk of fractures [6]. The nation’s economy suffers long-term consequences from this problem, which also seriously impedes advancement. It is estimated that undernutrition accounts for one-third of sickness and mortality among children aged 59 months and under, and nearly 3.5 million fatalities worldwide [7, 8]. UNICEF reports that the current population of Bangladesh is 169.8 million, with 16.3 million being children under the age of five. It has been reported that approximately 9.5 million (54%) children are stunted, 17% are wasted, and 56% are underweight [9]. Despite the decrease in rates of undernutrition over the past few decades, child undernutrition remains a significant issue for Bangladesh. To enhance the management and control of undernutrition risk among children under five, it could be beneficial to employ a smart system that utilizes modern technologies to identify undernourished children early on [10]. Early detection of the associated risk factors as well as accurate diagnosis of the risk of undernutrition can play a key role in timely intervention with implementation in preventing undernutrition and other associated diseases. 
Thus, early detection of undernourished children and identification of the variables contributing to their condition require the implementation of such a smart system.

Nowadays, machine learning (ML) is a modern technology that falls under the umbrella of artificial intelligence (AI). It is designed to identify patterns within data autonomously and utilize this information to make predictions. ML-based automated models that have been developed recently are gaining more and more interest for their ability to predict the risk of malnutrition among children under the age of five in various countries [11–19]. Over the past decade, a few studies have been done to develop a predictive tool for predicting the risk of undernutrition in Bangladesh [20–22]. The development of the prediction models for undernutrition is influenced by various factors that show considerable variation across different countries or regions over time. Rahman et al. [21] applied some ML algorithms to identify the risk factors of malnutrition based on Bangladesh Demographic Health Survey (BDHS), 2014 data. They utilized logistic regression (LR) model to identify the important factors of undernutrition. This study uses the latest BDHS, 2017–18 data and considers more predictors than those included in the referenced study [21]. The Boruta approach, which is a wrapper-based feature selection approach based on random forest (RF) classifier and can handle complex non-linear data with correlated features more effectively than the LR method, is applied here for selecting the important features. The paper utilizes the popular ML-based algorithms (LR, ANN, RF) and additionally extreme gradient boosting (XGB) for comparing models’ performance. Also, determining the influencing predictors that contribute to the outcome prediction using the SHapley Additive exPlanations (SHAP) method is one of the main focuses of this study.

An attempt has been made in this study to use an explainable ML-based model for the prediction of undernutrition status among under-five children in Bangladesh. Therefore, this study’s objective was to develop an appropriate model using explainable ML algorithms that predicts the risk of undernutrition among under-five children in Bangladesh. Furthermore, for model interpretation, the study determined the influential predictors that contribute to the prediction of undernutrition using SHAP, a post hoc model interpretation technique that is theoretically based on the Shapley value. This information can then be used as a guide for individualized prevention and treatment to prevent the development of undernutrition among under-five children in Bangladesh. The diagrammatic representation of the proposed framework is displayed in Fig 1.

Fig 1. Diagrammatic representation of the proposed framework.

https://doi.org/10.1371/journal.pone.0315393.g001

The remainder of this paper is organized as follows: Section 2 introduces the materials and methods utilized. The results are shown in Section 3, and a thorough discussion is given in Section 4. Finally, the conclusions are presented in Section 5.

2. Materials and methods

2.1 Data source and study design

The dataset utilized in this study was obtained from the BDHS, 2017–18. This is the most recent comprehensive survey that covers all of the enumeration areas (EAs) of the nation. The samples of this survey were collected from different households using two-stage stratified cluster sampling [23]. A total of 675 EAs were chosen in the 1^st stage, with the probability of selection proportional to the size of the EA. In the 2^nd stage, 30 households were chosen with a systematic procedure from each selected EA. About 18,000 ever-married women were selected to participate in the interview, with 17,863 (99%) of them completing the interview successfully. In the 2017–18 survey, a total of 8,759 children under the age of five were identified as eligible for anthropometric measurement. Finally, after excluding refusals, "don’t know" responses, flagged cases, and others (technical problems), a total of 7,796 stunting, 7,777 wasting, and 7,838 underweight cases were included in the analysis.

2.2. Ethical approval

This study made use of a publicly available survey dataset from the BDHS, 2017–18. The BDHS surveys received ethical approval from the ICF Macro Institutional Review Board, Maryland, USA, and the National Research Ethics Committee of the Bangladesh Medical Research Council (BMRC), Dhaka, Bangladesh. Therefore, this study does not require any additional ethical approval.

2.3. Outcome variables

The outcome variable of this study was undernutrition (stunting, wasting, and underweight) among under-five children. Stunting was measured by height-for-age Z-score (HAZ), wasting by weight-for-height Z-score (WHZ), and underweight by weight-for-age Z-score (WAZ) [24, 25]. According to WHO, children with HAZ < -2 standard deviations (SD) were considered stunted, those with WHZ < -2 SD were considered wasted, and those with WAZ < -2 SD were considered underweight [26]. These variables were encoded in binary form, with values of 1 and 0 (1 for stunted, 0 for not stunted), (1 for wasted, 0 for not wasted), and (1 for underweight, 0 for not underweight).
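The binary encoding can be illustrated with a minimal Python sketch. It assumes the WHO convention that stunting is based on HAZ, wasting on WHZ, and underweight on WAZ, each with a cut-off of -2 SD. The function name and the example Z-scores are hypothetical and are not taken from the study data.

```python
# Minimal sketch: binary encoding of the three indicators from Z-scores.
# Assumption: WHO cut-off of -2 SD; names and example values are hypothetical.
CUTOFF = -2.0

def encode_outcomes(haz, whz, waz, cutoff=CUTOFF):
    """Return 1/0 flags for stunting (HAZ), wasting (WHZ), underweight (WAZ)."""
    return {
        "stunted": int(haz < cutoff),
        "wasted": int(whz < cutoff),
        "underweight": int(waz < cutoff),
    }

# A child exactly at -2 SD is not flagged, because the rule is strictly "< -2".
example = encode_outcomes(haz=-2.4, whz=-1.1, waz=-2.0)
print(example)
```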

2.4. Predictors

This study considered different demographic, socioeconomic, behavioral, and medical-related explanatory variables as predictors for undernutrition based on the accessibility of the dataset, self-efficacy, and previous studies [27–32]. The predictors are age, residence, region, sex, mother education, father education, working, BMI, members, religion, wealth, breastfeeding, decision contraception, twin child, child sex, total children, birth order, birth interval, age at 1^st birth, BCG, vitamin A, diarrhea, fever, cough, water source, cooking fuel, toilet facility, and watching television. Table 1 presents a detailed description and categorization of the chosen predictors.

Table 1. Description and categorization of the predictors.

https://doi.org/10.1371/journal.pone.0315393.t001

2.5. Statistical analysis

The study participants’ background characteristics were reported as numbers (%) for the chosen predictors. Pearson’s χ² test was executed to examine the association between different predictors and stunting, wasting, and underweight. Statistical package SPSS and programming languages R and Python were applied to analyze the data.

2.5.1. Handling missing value.

Missing data is the absence of values or information that should ideally be present in a dataset. In any data collection or analysis process, missing data can occur for a wide range of reasons, such as technical issues, human error, and non-response. Addressing missing data is an essential aspect of data analysis that has the potential to affect accuracy and reliability [33]. In this study, predictors with < 40% missing values were taken into consideration, while predictors with ≥ 40% missing values were eliminated from the dataset [34, 35]. There are several methods for handling missing data, including dropping missing values and filling in missing values. This study employed the widely used K-NN method to impute missing values [36–38].

2.6. Data partition

The dataset was partitioned into two sets: training and testing, using a random partitioning method with an 8:2 ratio. There were 6237 children in the training set and 1559 in the testing set for stunting, 6222 children in the training set and 1555 in the testing set for wasting, 6270 children in the training set, and 1568 in the testing set for underweight.
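As a worked check of the 8:2 partition, the following Python sketch splits shuffled indices and reports the resulting set sizes. The seed and the rule of rounding 0.8 × n to the nearest integer are assumptions, since the text does not state them. Under that rounding rule, the sketch reproduces the training and testing sizes reported above.

```python
import random

def split_indices(n, train_fraction=0.8, seed=0):
    """Randomly partition the indices 0..n-1 into training and testing sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # seed is an arbitrary assumption
    n_train = round(train_fraction * n)    # rounding rule is an assumption
    return indices[:n_train], indices[n_train:]

# Sample sizes reported in the text for each indicator.
sizes = {}
for name, n in {"stunting": 7796, "wasting": 7777, "underweight": 7838}.items():
    train, test = split_indices(n)
    sizes[name] = (len(train), len(test))

print(sizes)
```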

2.7. Feature selection

Feature selection, also known as variable/attribute selection in statistics as well as ML, plays a vital role in developing an effective prediction model by choosing the most important features. It can also lead to enhanced performance of the model, better generalization, speedier computing, and greater interpretability [39]. We utilized the Boruta feature selection method in this work to identify the important predictors of stunting, wasting, and underweight in the training phase. Boruta is a wrapper-based approach that makes use of the RF classifier and outperforms others because it is consistent and unbiased [40]. The following steps were used in the Boruta method to identify the important predictors:

Step 1: Create a shadow dataset by shuffling the values of each predictor randomly.
Step 2: Merge the original and shadow datasets to make a single dataset.
Step 3: Train an RF classifier by utilizing the merged dataset and assess each predictor’s significance using a variable importance measure.
Step 4: Calculate the Z-score for each predictor by utilizing the predictor’s importance values. The Z-score can be determined using the following formula: Z-score = (Predictor Importance − Mean(Shadow Predictor Importance)) / Standard Deviation(Shadow Predictor Importance).
Step 5: Predictors exceeding a specific threshold Z-score (typically positive) are labeled as "Confirmed," while predictors falling below this threshold are labeled as "Rejected."
Step 6: Repeat this process until all predictors are either confirmed or rejected.
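
Steps 1, 4, and 5 above can be sketched in plain Python as follows. This is an illustration of the steps as written in the text, not the implementation of any particular Boruta library. The importance values are hypothetical placeholders for the output of an RF classifier, and the Z-score threshold of 0 is an assumption.

```python
import random
import statistics

def make_shadow(columns, seed=0):
    """Step 1: shuffle each predictor column to break its link with the outcome."""
    rng = random.Random(seed)
    shadow = {}
    for name, values in columns.items():
        shuffled = list(values)
        rng.shuffle(shuffled)
        shadow["shadow_" + name] = shuffled
    return shadow

def boruta_labels(importance, shadow_importance, threshold=0.0):
    """Steps 4-5: Z-score each predictor against the shadow importances."""
    mean_shadow = statistics.mean(shadow_importance)
    sd_shadow = statistics.stdev(shadow_importance)
    labels = {}
    for name, value in importance.items():
        z = (value - mean_shadow) / sd_shadow
        status = "Confirmed" if z > threshold else "Rejected"
        labels[name] = (status, round(z, 2))
    return labels

# Hypothetical data and importance values, for illustration only.
columns = {"wealth": [1, 2, 3, 4, 5]}
shadow = make_shadow(columns)
importance = {"wealth": 0.30, "mother_education": 0.22, "religion": 0.03}
shadow_importance = [0.04, 0.05, 0.06]
labels = boruta_labels(importance, shadow_importance)
print(labels)
```

In practice, the importances would be re-estimated from a newly trained RF in every iteration (Step 6) rather than supplied by hand.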

2.8. Machine learning algorithms

The current study adopted three distinct types of popular ML-based algorithms to predict undernutrition risk among under-five children in Bangladesh (Table 2).

Table 2. Types of machine learning algorithms.

https://doi.org/10.1371/journal.pone.0315393.t002

2.8.1. Logistic regression.

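Logistic regression (LR) is a commonly utilized statistical method in predictive modeling for predicting the outcome of a categorical dependent variable [41]. The LR method uses the sigmoid function to determine the probability of the outcome variable based on the input predictors. The logistic regression equation can be defined as follows:

$$\log\left(\frac{p_i}{1-p_i}\right) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} \tag{1}$$

Where, p_i indicates the probability of undernutrition for the i^th child and 1−p_i indicates the probability of non-undernutrition; x_ki is the k^th input predictor of the i^th child, β_0 is the intercept, and β_k is the k^th regression coefficient. The maximum likelihood method was utilized to estimate the model parameters for the logistic regression equation. Eq (1) can be represented as

$$p_i = \frac{\exp(\beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki})}{1 + \exp(\beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki})} \tag{2}$$

and the odds as

$$\frac{p_i}{1-p_i} = \exp(\beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki}) \tag{3}$$

If p_i exceeds the chosen classification cut-off (conventionally 0.5), predict class 1 (undernutrition); otherwise, predict 0 (no undernutrition).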
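
The logistic model described above can be illustrated with a short Python sketch that computes the predicted probability, the odds, and the class label. The coefficients and predictor values are hypothetical, and the 0.5 cut-off is the conventional default, assumed here because the text does not state the threshold.

```python
import math

def predict_probability(intercept, coefficients, predictors):
    """Sigmoid of the linear predictor: p = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))

def odds(p):
    """Odds of the outcome: p / (1 - p)."""
    return p / (1.0 - p)

def classify(p, cutoff=0.5):
    """Assumed decision rule: class 1 if p reaches the cut-off, else class 0."""
    return 1 if p >= cutoff else 0

# Hypothetical coefficients and predictor values, for illustration only.
p = predict_probability(intercept=-1.0, coefficients=[0.8, -0.5], predictors=[2.0, 1.0])
print(round(p, 4), round(odds(p), 4), classify(p))
```

Here the linear predictor is 0.1, so the odds equal exp(0.1), which matches the relationship between the probability and odds forms of the model.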

2.8.2 Artificial neural network.

An artificial neural network (ANN) is a type of non-linear ML technique that can be utilized to perform a variety of tasks, including classification and regression [42]. It consists of interconnected nodes, called neurons, which are arranged into three types of layers: an input layer, one or more hidden layers, and an output layer. During training, the network adjusts the weights and biases linked to each neuron to minimize error. This process is carried out by an optimization algorithm, such as gradient descent, that iteratively updates the weights and biases, with the sigmoid function serving as the activation function. The function can be expressed as follows:

$$\sigma(z) = \frac{1}{1 + e^{-z}} \tag{4}$$

Here, z is the input. The procedure is repeated until the values no longer change between iterations.

2.8.3 Random forest.

Random forest (RF), introduced by Breiman, is a versatile ensemble-based ML algorithm [43]. The RF model was constructed using the following steps:

Step 1: Select sample data using the bootstrap method from the training set.
Step 2: Construct a decision tree (DT) for each sample data.
Step 3: Build a forest with 500 trees or more by repeating Step 1 and Step 2.
Step 4: Take into account the predictions made by each constructed DT, then use a majority vote to determine the final prediction.
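
The four steps can be sketched in plain Python as shown below. To keep the example short, one-split decision stumps on a randomly chosen binary feature stand in for full decision trees. This is a deliberate simplification and not the RF configuration used in the study. The toy data, the seed, and all function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Step 1: draw len(rows) rows with replacement from the training set."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(sample, rng):
    """Step 2 (simplified): a one-split tree on one randomly chosen binary feature."""
    feature = rng.randrange(len(sample[0][0]))
    votes = {0: Counter(), 1: Counter()}
    for x, y in sample:
        votes[x[feature]][y] += 1
    overall = Counter(y for _, y in sample).most_common(1)[0][0]
    # Each leaf predicts the majority class of the sampled rows that reach it.
    leaf = {v: (c.most_common(1)[0][0] if c else overall) for v, c in votes.items()}
    return feature, leaf

def fit_forest(rows, n_trees=500, seed=0):
    """Step 3: repeat Steps 1 and 2 to build the forest."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng), rng) for _ in range(n_trees)]

def predict(forest, x):
    """Step 4: majority vote over the predictions of all trees."""
    ballots = Counter(leaf[x[feature]] for feature, leaf in forest)
    return ballots.most_common(1)[0][0]

# Toy data: the label equals the first feature; the second feature is noise.
rows = [((1, 0), 1), ((1, 1), 1), ((0, 0), 0), ((0, 1), 0)] * 5
forest = fit_forest(rows)
print(predict(forest, (1, 0)), predict(forest, (0, 1)))
```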

2.8.4 Extreme gradient boosting.

Extreme gradient boosting (XGB) is a highly effective ensemble learning algorithm commonly used for tasks such as classification, regression, and ranking [44]. The algorithm is developed based on the principles of the gradient boosting framework. It works iteratively through the use of decision trees, each aimed at correcting the errors of the previous trees. For binary classification, a logistic loss function combined with a logistic transformation is used to derive the predicted probabilities from the model predictions. The logistic loss function is defined as

$$L = -\sum_{i}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right] \tag{5}$$

Where, y_i is the true class label of the i^th child and p_i is the predicted probability that the i^th child belongs to the positive class.
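
A minimal Python sketch of the logistic loss is given below. It uses the summed form of the loss; dividing by the number of children gives the averaged variant that is also common. The labels and probabilities are hypothetical, and the clipping constant is an assumption added only to avoid taking the logarithm of zero.

```python
import math

def logistic_loss(y_true, p_pred, eps=1e-15):
    """Summed logistic loss: -sum(y*log(p) + (1-y)*log(1-p))."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total

# Hypothetical labels and predicted probabilities, for illustration only.
confident = logistic_loss([1, 0, 1], [0.9, 0.1, 0.8])
uncertain = logistic_loss([1, 0, 1], [0.5, 0.5, 0.5])
print(round(confident, 4), round(uncertain, 4))
```

Confident and correct predictions yield a smaller loss than uninformative predictions of 0.5, which is the behavior that boosting exploits when each new tree corrects the errors of the previous ones.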

2.9. Hyperparameters tuning

Hyperparameters in machine learning are variables whose values are set before the start of the learning process. They control the execution of the learning algorithm, affecting factors such as learning rate, regularization strength, and model complexity. Tuning these hyperparameters is crucial for optimizing model performance. The grid search approach with a 10-fold cross-validation (CV) protocol was employed to tune the hyperparameter values in the training phase.
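
The grid search with 10-fold CV can be outlined with the following Python sketch. The grid, the toy data, and the `evaluate` callback are hypothetical. In a real analysis, `evaluate` would fit the model on the training folds and score it on the held-out fold, whereas the toy rule here has nothing to fit.

```python
import itertools
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def grid_search_cv(grid, evaluate, n, k=10):
    """Return the hyperparameter combination with the best mean CV score."""
    folds = k_fold_indices(n, k)
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(n) if i not in held]
            scores.append(evaluate(params, train, held_out))
        mean_score = sum(scores) / k
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Toy data: the label is 1 when x >= 60 (hypothetical, for illustration only).
xs = list(range(100))
ys = [1 if x >= 60 else 0 for x in xs]

def evaluate(params, train_idx, test_idx):
    # The toy rule needs no fitting, so train_idx is unused here.
    hits = sum(1 for i in test_idx if int(xs[i] >= params["cutoff"]) == ys[i])
    return hits / len(test_idx)

best_params, best_score = grid_search_cv({"cutoff": [20, 40, 60, 80]}, evaluate, n=len(xs))
print(best_params, best_score)
```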

2.10. Performance evaluation metrics

The model’s performance was assessed by accuracy, precision, recall, and F-score in the testing set [45–47]. The values of these performance metrics are calculated from the confusion matrix via four measurements: true positive (TP), false negative (FN), false positive (FP), and true negative (TN). Also, the area under the curve (AUC) is considered for the evaluation of the models. The AUC is a single value representing the area under the ROC curve, demonstrating the model’s ability to discriminate between undernutrition and non-undernutrition. It is mathematically represented as

$$\mathrm{AUC} = \int_{0}^{1} \mathrm{TPR}\; d(\mathrm{FPR}) \tag{6}$$

where TPR is the true positive rate (sensitivity) and FPR is the false positive rate (1 − specificity).

The probability curve, known as the ROC curve, shows the relationship between sensitivity and 1-specificity at various classification cut-off points. It is a widely used metric for evaluating the predictive effectiveness of machine learning models in medical diagnostics [48].
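
The evaluation metrics can be computed as in the following Python sketch. The AUC is obtained here through its rank interpretation, namely the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which equals the area under the ROC curve. The labels, scores, and 0.5 cut-off are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn

def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F-score from the confusion matrix."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

def auc(y_true, scores):
    """Share of positive-negative pairs ranked correctly (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
area = auc(y_true, scores)
print(counts, metrics, area)
```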

2.11. Predictor’s assessment using SHAP analysis

The traditional output of the XGB model only ranks the importance of variables; it does not provide a way to assess the direction and magnitude of their impact on outcomes. SHAP is a widely used framework for interpretability in machine learning [49]. It attributes the prediction of a model to its individual features, determining how much each feature contributes to the final outcome through visualization. It is an additive feature attribution method based on Shapley values, which were originally introduced by Lloyd Shapley in the field of game theory [50]. This approach provides a fair attribution for each predictor in the model and satisfies several desirable properties, including consistency, efficiency, dummy, and additivity. The efficiency property of the SHAP method leads to more reliable outcomes when compared to alternative methods, like local interpretable model-agnostic explanations. Predictors that have a positive SHAP value push the prediction toward undernutrition, while predictors with a negative SHAP value push the prediction toward no undernutrition. In particular, the importance of an individual predictor, say the k^th predictor, is ascertained through the Shapley value, which is computed using the following formula:

$$\phi_k(v) = \sum_{S \subseteq M \setminus \{k\}} \frac{|S|!\,(|M| - |S| - 1)!}{|M|!}\left[v(S \cup \{k\}) - v(S)\right] \tag{7}$$

Where, M is the set of all predictors; S represents a subset of predictors that does not contain the predictor for which we are determining the value of φ_k(v); S∪{k} represents the group of predictors that includes S as well as the k^th predictor; v(S) represents the outcome of an ML-based model that utilizes the predictors from S; and S⊆M\{k} ranges over all subsets S of M that exclude the k^th predictor.
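
The Shapley value can be computed exactly for a small number of predictors, as the following Python sketch shows. The value function `v` and its numbers are hypothetical and stand in for the output of a trained model. SHAP libraries use faster approximations or tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_value(k, predictors, v):
    """Weighted average of the marginal contribution of k over all subsets S."""
    others = [m for m in predictors if m != k]
    n = len(predictors)
    total = 0.0
    for size in range(len(others) + 1):
        # Weight |S|! (|M| - |S| - 1)! / |M|! for subsets of this size.
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            s = frozenset(subset)
            total += weight * (v(s | {k}) - v(s))
    return total

def v(subset):
    """Hypothetical model output for a subset of predictors (illustration only)."""
    value = 0.0
    if "wealth" in subset:
        value += 0.10
    if "bmi" in subset:
        value += 0.06
    if "wealth" in subset and "bmi" in subset:
        value += 0.04   # interaction, shared equally by the two predictors
    return value

predictors = ["wealth", "bmi", "religion"]
phi = {k: shapley_value(k, predictors, v) for k in predictors}
print(phi)
```

The three values sum to v(M) − v(∅) = 0.20, which illustrates the efficiency property, and the predictor that never changes the output receives a value of zero, which illustrates the dummy property.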

3. Results

3.1 Background characteristics

Table 3 presents the background characteristics of the study participants. The overall prevalence of stunting was 31.3%, wasting 8.5%, and underweight 22.5%. The average height, weight, and age of the children were 83.07±14.60 cm, 10.77±3.41 kg, and 28.61±17.58 months, respectively, and most children resided in rural areas. Children of mothers aged 45–49 years showed the highest percentage (66.7%) of stunting, whereas children of mothers aged 40–44 years showed the largest percentage (10.6%) of wasting, and children of mothers aged 45–49 years showed the largest percentage (53.3%) of underweight. Sylhet division showed the highest percentage of stunting (41.2%), wasting (9.9%), and underweight (30.8%) compared to other divisions in Bangladesh. Children of uneducated mothers exhibited the largest percentage of stunting (44.3%), wasting (12.4%), and underweight (36.3%), while children of higher educated mothers had the lowest percentage of stunting (15.1%), wasting (6.2%), and underweight (10.9%). Children of underweight mothers had a greater percentage of stunting (41.8%), wasting (13.7%), and underweight (33.2%). Table 3 shows that age, residence, region, sex, mother education, father education, working, BMI, members, wealth, contraception, twin child, total children, birth order, birth interval, age at 1^st birth, vitamin A, water source, cooking fuel, toilet facility, and watching television were significantly associated with stunting; mother education, BMI, child sex, BCG, vitamin A, and fever were significantly associated with wasting; and age, residence, region, sex, mother education, father education, working, BMI, members, wealth, breastfeeding, contraception, twin child, total children, birth order, birth interval, age at 1^st birth, fever, cough, water source, cooking fuel, toilet facility, and watching television were significantly associated with underweight (p-value < 0.05).

Table 3. Background characteristics.

https://doi.org/10.1371/journal.pone.0315393.t003

3.2 Predictor’s selection by Boruta

The predictor selection results based on Boruta for stunting, wasting, and underweight are displayed in Figs 2–4. The method revealed that there are 17 important predictors associated with stunting out of 21, 5 predictors for wasting out of 7, and 17 predictors for underweight out of 23. The selected predictors of stunting are water source, residence, toilet facility, cooking fuel, twin child, age, contraception, total children, watching television, birth interval, division, birth order, vitamin A, BMI, wealth, mother education, and father education (Fig 2); those of wasting are fever, BCG, BMI, father education, and mother education (Fig 3); and those of underweight are residence, watching television, water source, toilet facility, contraception, birth interval, total children, fever, region, cooking facility, birth order, age, twin child, BMI, wealth, mother education, and father education (Fig 4). The selected predictors were used for predicting the risk of undernutrition (stunting, wasting, and underweight) among under-five children in Bangladesh.

Fig 2. Stunting predictors selection using the Boruta method.

https://doi.org/10.1371/journal.pone.0315393.g002

Fig 3. Wasting predictors selection using the Boruta method.

https://doi.org/10.1371/journal.pone.0315393.g003

Fig 4. Underweight predictors selection using the Boruta method.

https://doi.org/10.1371/journal.pone.0315393.g004

3.3. Performance comparison of ML-based models

The predictive performance of four ML-based models is presented in Table 4.

Table 4. Performance (%) comparison of four models for stunting, wasting, and underweight.

https://doi.org/10.1371/journal.pone.0315393.t004

The XGB model attained the highest prediction accuracy of 81.73%, precision of 88.28%, recall of 89.41%, and F-score of 88.84% for stunting, while LR obtained the lowest accuracy of 76.40%, precision of 84.11%, recall of 85.56%, and F-score of 84.83%. The XGB model also achieved the highest accuracy for wasting (76.15%) and for underweight (79.13%); the corresponding precision, recall, and F-score values are given in Table 4.

The corresponding ROC curves for stunting, wasting, and underweight are portrayed in Figs 5–7 and indicate that the XGB-based model attained a larger area under the ROC curve than the other models: LR, ANN, and RF. Hence, the XGB-based model appears to be the most appropriate choice for predicting indicators of undernutrition among under-five children in Bangladesh.

Fig 5. ROC curves of four models for stunting.

https://doi.org/10.1371/journal.pone.0315393.g005

Fig 6. ROC curves of four models for wasting.

https://doi.org/10.1371/journal.pone.0315393.g006

Fig 7. ROC curves of four models for underweight.

https://doi.org/10.1371/journal.pone.0315393.g007

3.4 Influencing predictors for undernutrition

To examine the importance of each predictor in the prediction, a SHAP summary plot was made for the best-performing XGB model using SHAP values. In the SHAP summary plot, the x-axis represents the SHAP values, while the y-axis lists the predictors according to their contribution. A predictor with a higher SHAP value contributes more strongly to the predicted occurrence of undernutrition. The red dots indicate higher predictor values, while the blue dots indicate lower predictor values. The SHAP summary plots of the XGB model for stunting, wasting, and underweight are depicted in S1–S3 Figs. The SHAP method revealed that father education, wealth, mother education, BMI, birth interval, vitamin A, watching television, toilet facility, residence, and water source possess higher SHAP values exceeding zero, thereby indicating that they are the influential predictors of stunting (S1 Fig). Similarly, BMI, mother education, and BCG (S2 Fig) are the influential predictors of wasting, and father education, wealth, mother education, BMI, birth interval, toilet facility, breastfeeding, birth order, and residence (S3 Fig) are the influential predictors of underweight.

4. Discussion

Nutrition is crucial for maintaining good health and promoting the growth and well-being of the human body at every stage of life. Severe malnutrition can lead to life-threatening consequences such as inhibited growth, impaired immune systems, and even death. Thus, this study highlighted the usefulness of several ML algorithms utilizing the most recent BDHS, 2017–2018 data to explore an appropriate explainable model that predicts the risk of undernutrition among children under five and determines the predictors that influence it. For each undernutrition indicator, four widely used ML-based algorithms were trained using the important predictors obtained by the Boruta method. The models’ performance was evaluated through accuracy, precision, recall, F-score, and ROC curve with AUC value. Based on the performance metrics, the XGB-based model was found superior to others for predicting the risk of undernutrition. The latest study conducted by Anku in Ghana, demonstrated that the XGB model was the best performer in predicting undernutrition among under-five children [51]. Other investigations also reported that the XGB-based model was the most precise for predicting undernutrition among children under five [52, 53]. The superiority of the XGB model may be due to its operation within the gradient boosting framework, which sequentially adds weak learners (typically DTs) and iteratively corrects errors by the preceding weak learners to achieve accurate prediction and it has the capability to effectively handle high-dimensional and complex data for classification [25]. However, the SHAP method in the XGB-based model reveals that the predictors of undernutrition vary across the three different indicators. Nevertheless, mother education and BMI are the coexistent predictors across three indicators of stunting, wasting, and underweight. This result is in line with the most recent research carried out in different nations [14, 54–57]. 
A mother who has received education may have a better awareness of the nutritional needs of her children. Better child feeding techniques, such as introducing supplementary foods to infants on time and exclusively breastfeeding during the first six months of a newborn’s life, are strongly linked to a decreased incidence of undernutrition in children [52]. Furthermore, mothers with higher levels of education are more likely to employ family planning [16], to use resources for the family effectively [19, 53], and to improve their children’s access to healthcare [25, 53]. The growth and development of a child greatly depend on the nutritional status of his/her mother. Children of underweight mothers face a much greater risk of stunting and wasting than children of mothers who have a normal weight [58]. Children of mothers with normal or above BMI have a lower risk of being underweight. Therefore, policymakers should prioritize the nutritional status of children to reduce malnutrition among them effectively. The coexistent predictors of stunting and underweight are the father’s education, wealth, birth interval, toilet facility, and residence. The socioeconomic status of the family influences the growth and development of the child, as well as their access to food security. Children from low-income families have more difficulty accessing food and medical care, which increases their risk of illness and death. This study demonstrated that the risk of stunting and underweight was highest in the poorest households, which coincided with recent research from neighboring countries including Bangladesh [56, 59, 60]. Birth spacing also has an impact on the nutritional status of under-five children. A lengthy gap between births is beneficial for the health and nutrition of both mothers and children, which is corroborated by previous studies [61, 62]. Children living in rural areas with poor sanitation are more likely to experience stunting and underweight. 
Improving access to clean and safe toilet facilities, along with promoting proper sanitation and hygiene practices, is essential for preventing childhood undernutrition and promoting overall health and well-being. Additionally, vitamin A, watching television, and water source are influencing predictors of stunting; BCG is an influencing predictor of wasting; and breastfeeding and birth order are influencing predictors of underweight. These findings are aligned with earlier studies [63, 64]. Insufficient levels of vitamin A can impact various elements of growth and maturation, such as cellular growth, immune response, skeletal development, and hormonal equilibrium, all of which play a key role in the hindered growth of children [65]. The first-born siblings were prone to nurturing a deep sense of responsibility, the middle siblings a hunger for attention, and the youngest siblings a thirst for adventure and rebellion [66].

5. Conclusion

This study used four ML-based algorithms to identify an appropriate explainable model for predicting undernutrition among under-five Bangladeshi children. Our experimental findings indicate that, of the four models, the XGB model is the most appropriate for predicting undernutrition in children. The SHAP method reveals that father’s education, wealth, mother’s education, BMI, birth interval, vitamin A, watching television, toilet facility, residence, and water source are the influential predictors of stunting among under-five children in Bangladesh. The influential predictors of wasting are BMI, mother’s education, and BCG, and those of underweight are father’s education, wealth, mother’s education, BMI, birth interval, toilet facility, breastfeeding, birth order, and residence. The proposed integrated framework may be used to create an automated tool for clinical settings that correctly detects undernourished children at an early stage. With this information, healthcare providers can make appropriate decisions and formulate patient-specific treatment plans that reduce wait times and healthcare expenses. Ultimately, our research may greatly enhance the care of undernourished children and assist decision-makers in taking appropriate initiatives to fulfill the Sustainable Development Goal (SDG) of decreasing pediatric undernutrition in Bangladesh by 2030.
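
To illustrate how Shapley-value attribution of the kind underlying SHAP [49, 50] ranks predictors, the following minimal Python sketch computes exact Shapley values for a toy risk function. Everything in it is an assumption made for illustration: the function `toy_risk`, its coefficients, and the three binary feature names are hypothetical and are not the fitted XGB model or the BDHS variables used in this study. The sketch enumerates all feature orderings, which is feasible only for a handful of features; practical SHAP implementations for tree models rely on faster algorithms.

```python
from itertools import permutations

# Hypothetical binary predictors (1 = risk factor present); names are illustrative only.
FEATURES = ["low_father_education", "poorest_wealth_quintile", "short_birth_interval"]


def toy_risk(x):
    """Hypothetical risk score; NOT the fitted XGB model from the study."""
    score = 0.10
    score += 0.15 * x["low_father_education"]
    score += 0.20 * x["poorest_wealth_quintile"]
    score += 0.05 * x["short_birth_interval"]
    # Interaction: poverty and short birth spacing together add extra risk.
    score += 0.10 * x["poorest_wealth_quintile"] * x["short_birth_interval"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)       # start from the reference profile
        previous = model(current)
        for name in order:
            current[name] = instance[name]   # switch one feature to the child's value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


child = {name: 1 for name in FEATURES}       # all risk factors present
reference = {name: 0 for name in FEATURES}   # no risk factors present
phi = shapley_values(toy_risk, child, reference)

# Rank predictors by absolute contribution, as a SHAP summary plot does.
ranking = sorted(phi, key=lambda name: abs(phi[name]), reverse=True)
for name in ranking:
    print(f"{name}: {phi[name]:+.3f}")
```

In this toy setting the attributions sum to the difference between the child's score and the reference score, and the interaction term is split equally between the two features involved, which is why wealth ranks above father's education here.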

5.1. Limitations of the study

This study is cross-sectional in nature, which restricts our capacity to establish causal relationships. Although we investigated several plausible factors, data were unavailable for some other significant predictors, such as poor consumption of vitamin supplements and immunization that is not up to date. Including such important predictors of undernutrition would aid in obtaining more precise results and enhanced model interpretability.

Supporting information

S1 Fig. SHAP summary plot of the XGB model for stunting.

https://doi.org/10.1371/journal.pone.0315393.s001

(TIF)

S2 Fig. SHAP summary plot of the XGB model for wasting.

https://doi.org/10.1371/journal.pone.0315393.s002

(TIF)

S3 Fig. SHAP summary plot of the XGB model for underweight.

https://doi.org/10.1371/journal.pone.0315393.s003

(TIF)

Acknowledgments

This study analyzed the dataset obtained from the Bangladesh Demographic and Health Survey (BDHS), 2017–18. The authors are thankful to the DHS Program for granting access to BDHS data. Also, the authors would like to thank the editor and the two anonymous reviewers for providing valuable comments and suggestions on the earlier version of the manuscript.

References

1. Ersado TL. Causes of malnutrition. In Combating Malnutrition through Sustainable Approaches 2022. IntechOpen.
2. Scrinis G. Reframing malnutrition in all its forms: a critique of the tripartite classification of malnutrition. Global Food Security. 2020;26:100396.
3. Grey K, Gonzales GB, Abera M, Lelijveld N, Thompson D, Berhane M, et al. Severe malnutrition or famine exposure in childhood and cardiometabolic non-communicable disease later in life: a systematic review. BMJ global health. 2021;6(3):e003161. pmid:33692144
4. Soliman A, De Sanctis V, Alaaraj N, Ahmed S, Alyafei F, Hamed N, et al. Early and long-term consequences of nutritional stunting: from childhood to adulthood. Acta Bio Medica: Atenei Parmensis. 2021;92(1). pmid:33682846
5. Cerf ME. Healthy lifestyles and noncommunicable diseases: nutrition, the life‐course, and health promotion. Lifestyle Medicine. 2021;2(2):e31.
6. Morales F, Montserrat-de la Paz S, Leon MJ, Rivero-Pino F. Effects of Malnutrition on the Immune System and Infection and the Role of Nutritional Strategies Regarding Improvements in Children’s Health Status: A Literature Review. Nutrients. 2023;16(1):1. pmid:38201831
7. Dukhi N. Global prevalence of malnutrition: evidence from literature. Malnutrition. 2020;1:1–6.
8. Hossain S, Chowdhury PB, Biswas RK, Hossain MA. Malnutrition status of children under 5 years in Bangladesh: A sociodemographic assessment. Children and Youth Services Review. 2020;117:105291.
9. Rahman MT, Alam MJ, Ahmed N, Roy DC, Sultana P. Trend of risk and correlates of under-five child undernutrition in Bangladesh: an analysis based on Bangladesh Demographic and Health Survey data, 2007–2017/2018. BMJ open. 2023;13(6):e070480. pmid:37308267
10. Govender I, Rangiah S, Kaswa R, Nzaumvila D. Malnutrition in children under the age of 5 years in a primary health care setting. South African Family Practice. 2021;63(1).
11. Thangamani D, Sudha P. Identification of malnutrition with use of supervised data mining techniques–decision trees and artificial neural networks. Int J Eng Comput Sci. 2014;3(09).
12. Kuttiyapillai D, Ramachandran R. Improved text analysis approach for predicting effects of nutrient on human health using machine learning techniques. IOSR J Comput Eng. 2014;16(3):86–91.
13. Krishna PV, Gurumoorthy S, Obaidat MS, Mani JJ, Rani Kasireddy S. Population classification upon dietary data using machine learning techniques with IOT and big data. Social Network Forensics, Cyber Security, and Machine Learning. 2019:9–27.
14. Bitew FH, Sparks CS, Nyarko SH. Machine learning algorithms for predicting undernutrition among under-five children in Ethiopia. Public health nutrition. 2022;25(2):269–80. pmid:34620263
15. Fenta HM, Zewotir T, Muluneh EK. A machine learning classifier approach for identifying the determinants of under-five child undernutrition in Ethiopian administrative zones. BMC Medical Informatics and Decision Making. 2021;21:1–2.
16. Anku EK, Duah HO. Predicting and identifying factors associated with undernutrition among children under five years in Ghana using machine learning algorithms. Plos one. 2024;19(2):e0296625. pmid:38349921
17. Shen H, Zhao H, Jiang Y. Machine learning algorithms for predicting stunting among under-five children in Papua New Guinea. Children. 2023;10(10):1638. pmid:37892302
18. Chilyabanyama ON, Chilengi R, Simuyandi M, Chisenga CC, Chirwa M, Hamusonde K, et al. Performance of machine learning classifiers in classifying stunting among under-five children in Zambia. Children. 2022;9(7):1082. pmid:35884066
19. Talukder A, Ahammed B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition. 2020;78:110861. pmid:32592978
20. Shahriar MM, Iqubal MS, Mitra S, Das AK. A Deep Learning Approach to Predict Malnutrition Status of 0–59 Month’s Older Children in Bangladesh. In 2019 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT) 2019; (pp. 145–149). IEEE.
21. Rahman SJ, Ahmed NF, Abedin MM, Ahammed B, Ali M, Rahman MJ, et al. Investigate the risk factors of stunting, wasting, and underweight among under-five Bangladeshi children and its prediction based on machine learning approach. Plos one. 2021;16(6):e0253172. pmid:34138925
22. National Institute of Population Research and Training (NIPORT), & ICF. Bangladesh demographic and health survey 2017‐18. Dhaka, Bangladesh, and Rockville, Maryland, USA.
23. Al-Sadeeq AH, Bukair AZ, Al-Saqladi AW. Assessment of undernutrition using Composite Index of Anthropometric Failure among children aged< 5 years in rural Yemen. Eastern Mediterranean Health Journal. 2018;24(12).
24. Kassie GW, Workie DL. Exploring the association of anthropometric indicators for under-five children in Ethiopia. BMC public health. 2019;19:1–6.
25. Khan S, Zaheer S, Safdar NF. Determinants of stunting, underweight and wasting among children< 5 years of age: evidence from 2012–2013 Pakistan demographic and health survey. BMC public health. 2019;19:1–5.
26. Wondiye K, Asseffa NA, Gemebo TD, Astawesegn FH. Predictors of undernutrition among the elderly in Sodo zuriya district Wolaita zone, Ethiopia. BMC nutrition. 2019;5:1–7.
27. Modjadji P, Madiba S. Childhood undernutrition and its predictors in a rural health and demographic surveillance system site in South Africa. International journal of environmental research and public health. 2019;16(17):3021. pmid:31438531
28. Chowdhury MR, Rahman MS, Billah B, Rashid M, Almroth M, Kader M. Prevalence and factors associated with severe undernutrition among under-5 children in Bangladesh, Pakistan, and Nepal: a comparative study using multilevel analysis. Scientific Reports. 2023;13(1):10183. pmid:37349482
29. Kiarie J, Karanja S, Busiri J, Mukami D, Kiilu C. The prevalence and associated factors of undernutrition among under-five children in South Sudan using the standardized monitoring and assessment of relief and transitions (SMART) methodology. BMC nutrition. 2021;7(1):25. pmid:34044874
30. Danso F, Appiah MA. Prevalence and associated factors influencing stunting and wasting among children of ages 1 to 5 years in Nkwanta South Municipality, Ghana. Nutrition. 2023;110:111996. pmid:37003173
31. Menalu MM, Bayleyegn AD, Tizazu MA, Amare NS. Assessment of prevalence and factors associated with malnutrition among under-five children in Debre Berhan town, Ethiopia. International Journal of General Medicine. 2021:1683–97. pmid:33976568
32. Hoffman DJ, Kassim I, Ndiaye B, McGovern ME, Le H, Abebe KT, et al. Childhood Stunting and wasting following independence in South Sudan. Food and Nutrition Bulletin. 2022;43(4):381–94. pmid:36245391
33. Kwak SK, Kim JH. Statistical data preparation: management of missing values and outliers. Korean journal of anesthesiology. 2017;70(4):407. pmid:28794835
34. Samuel O, Zewotir T, North D. Application of machine learning methods for predicting under-five mortality: analysis of Nigerian demographic health survey 2018 dataset. BMC Medical Informatics and Decision Making. 2024;24(1):86. pmid:38528495
35. Zhang Q, Wan NJ. Simple Method to Predict Insulin Resistance in Children Aged 6–12 Years by Using Machine Learning. Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy. 2022:2963–75. pmid:36193541
36. Faisal S, Tutz G. Multiple imputation using nearest neighbor methods. Information Sciences. 2021;570:500–16.
37. Emmanuel T, Maupong T, Mpoeleng D, Semong T, Mphago B, Tabona O. A survey on missing data in machine learning. Journal of Big data. 2021;8:1–37.
38. Beretta L, Santaniello A. Nearest neighbor imputation algorithms: a critical evaluation. BMC medical informatics and decision making. 2016;16:197–208. pmid:27454392
39. Chen RC, Dewi C, Huang SW, Caraka RE. Selecting critical features for data classification based on machine learning methods. Journal of Big Data. 2020 Jul 23;7(1):52.
40. Islam MM, Alam MJ, Maniruzzaman M, Ahmed NF, Ali MS, Rahman MJ, et al. Predicting the risk of hypertension using machine learning algorithms: A cross sectional study in Ethiopia. PLoS One. 2023;18(8):e0289613. pmid:37616271
41. Ranganathan P, Pramesh CS, Aggarwal R. Common pitfalls in statistical analysis: logistic regression. Perspectives in clinical research. 2017;8(3):148–51. pmid:28828311
42. Hassoun MH. Fundamentals of artificial neural networks. MIT press; 1995.
43. Breiman L. Random forests. Machine learning. 2001;45:5–32.
44. Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining 2016; 785–794.
45. Buczak AL, Guven E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Communications surveys & tutorials. 2016;18(2):1153–1176.
46. Dahiya P, Srivastava DK. Intrusion detection system on big data using deep learning techniques. Int J Innov Technol Exploring Eng. 2020;9(4):3242–3247.
47. Hagar AA, Gawali BW. Implementation of Machine and Deep Learning Algorithms for Intrusion Detection System. In G. Rajakumar et al. (eds.), Intelligent Communication Technologies and Virtual Mobile Networks. Springer Nature Singapore. 2023:1–20.
48. Hajian-Tilaki K. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation. Caspian journal of internal medicine. 2013;4(2):627. pmid:24009950
49. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Advances in neural information processing systems. 2017;30.
50. Kuhn HW, Tucker AW, editors. Contributions to the Theory of Games. Princeton University Press; 1953.
51. Steinfath M, Vogl S, Violet N, Schwarz F, Mielke H, Selhorst T, et al. Simple changes of individual studies can improve the reproducibility of the biomedical scientific process as a whole. PLoS One. 2018;13(9):e0202762. pmid:30208060
52. Antipov EA, Pokryshevskaya EB. Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values. Journal of revenue and pricing management. 2020;19:355–64.
53. Sumon IH, Hossain M, Ar Salan S, Kabir MA, Majumder AK. Determinants of coexisting forms of undernutrition among under‐five children: Evidence from the Bangladesh demographic and health surveys. Food Science & Nutrition. 2023;11(9):5258–69. pmid:37701232
54. Khan RE, Raza MA. Determinants of malnutrition in Indian children: new evidence from IDHS through CIAF. Quality & Quantity. 2016;50:299–316.
55. Akombi BJ, Agho KE, Merom D, Hall JJ, Renzaho AM. Multilevel analysis of factors associated with wasting and underweight among children under-five years in Nigeria. Nutrients. 2017;9(1):44. pmid:28075336
56. Vijay J, Patel KK. Malnutrition among under-five children in Nepal: A focus on socioeconomic status and maternal BMI. Clinical Epidemiology and Global Health. 2024;27:101571.
57. Ahmmed F, Hasan MN, Hossain MF, Khan MT, Rahman MM, Hussain MP, et al. Association between short birth spacing and child malnutrition in Bangladesh: a propensity score matching approach. BMJ Paediatrics Open. 2024;8(1). pmid:38499349
58. Ntambara J, Zhang W, Qiu A, Cheng Z, Chu M. Optimum birth interval (36–48 months) may reduce the risk of undernutrition in children: A meta-analysis. Frontiers in Nutrition. 2023;9:939747. pmid:36712519
59. Ssentongo P, Ba DM, Ssentongo AE, Fronterre C, Whalen A, Yang Y, et al. Association of vitamin A deficiency with early childhood stunting in Uganda: A population-based cross-sectional study. PloS one. 2020;15(5):e0233615. pmid:32470055
60. Das S, Gulshan J. Different forms of malnutrition among under five children in Bangladesh: a cross sectional study on prevalence and determinants. BMC Nutrition. 2017;3:1–2.
61. Hossain MM, Yeasmin S, Abdulla F, Rahman A. Rural-urban determinants of vitamin a deficiency among under 5 children in Bangladesh: Evidence from National Survey 2017–18. BMC Public Health. 2021;21:1–0.
62. Ahmed R, Ejeta Chibsa S, Hussen MA, Bayisa K, Tefera Kefeni B, Gezimu W, et al. Undernutrition among exclusive breastfeeding mothers and its associated factors in Southwest Ethiopia: A community-based study. Women’s Health. 2024;20:17455057241231478.
63. Hossain MM, Abdulla F, Rahman A. Prevalence and risk factors of underweight among under-5 children in Bangladesh: Evidence from a countrywide cross-sectional study. PLoS One. 2023;18(4):e0284797. pmid:37093817
64. Chandna A, Bhagowalia P. Birth order and children’s health and learning outcomes in India. Economics & Human Biology. 2024;52:101348. pmid:38237431
65. Mutumba R, Pesu H, Mbabazi J, Greibe E, Olsen MF, Briend A, et al. Correlates of iron, cobalamin, folate, and vitamin A status among stunted children: A cross-sectional study in Uganda. Nutrients. 2023;15(15):3429. pmid:37571364
66. Yu T, Chen C, Jin Z, Yang Y, Jiang Y, Hong L, et al. Association of number of siblings, birth order, and thinness in 3-to 12-year-old children: a population-based cross-sectional study in Shanghai, China. BMC Pediatrics. 2020;20:1–3.

[ref1] 1. Ersado TL. Causes of malnutrition. In Combating Malnutrition through Sustainable Approaches 2022. IntechOpen.

[ref2] 2. Scrinis G. Reframing malnutrition in all its forms: a critique of the tripartite classification of malnutrition. Global Food Security. 2020;26:100396.
View Article
Google Scholar

[3] View Article

[4] Google Scholar

[ref3] 3. Grey K, Gonzales GB, Abera M, Lelijveld N, Thompson D, Berhane M, et al. Severe malnutrition or famine exposure in childhood and cardiometabolic non-communicable disease later in life: a systematic review. BMJ global health. 2021;6(3):e003161. pmid:33692144
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref4] 4. Soliman A, De Sanctis V, Alaaraj N, Ahmed S, Alyafei F, Hamed N, et al. Early and long-term consequences of nutritional stunting: from childhood to adulthood. Acta Bio Medica: Atenei Parmensis. 2021;92(1). pmid:33682846
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref5] 5. Cerf ME. Healthy lifestyles and noncommunicable diseases: nutrition, the life‐course, and health promotion. Lifestyle Medicine. 2021;2(2):e31.
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref6] 6. Morales F, Montserrat-de la Paz S, Leon MJ, Rivero-Pino F. Effects of Malnutrition on the Immune System and Infection and the Role of Nutritional Strategies Regarding Improvements in Children’s Health Status: A Literature Review. Nutrients. 2023;16(1):1. pmid:38201831
View Article
PubMed/NCBI
Google Scholar

[17] View Article

[18] PubMed/NCBI

[19] Google Scholar

[ref7] 7. Dukhi N. Global prevalence of malnutrition: evidence from literature. Malnutrition. 2020;1:1–6.
View Article
Google Scholar

[21] View Article

[22] Google Scholar

[ref8] 8. Hossain S, Chowdhury PB, Biswas RK, Hossain MA. Malnutrition status of children under 5 years in Bangladesh: A sociodemographic assessment. Children and Youth Services Review. 2020;117:105291.
View Article
Google Scholar

[24] View Article

[25] Google Scholar

[ref9] 9. Rahman MT, Alam MJ, Ahmed N, Roy DC, Sultana P. Trend of risk and correlates of under-five child undernutrition in Bangladesh: an analysis based on Bangladesh Demographic and Health Survey data, 2007–2017/2018. BMJ open. 2023;13(6):e070480. pmid:37308267
View Article
PubMed/NCBI
Google Scholar

[27] View Article

[28] PubMed/NCBI

[29] Google Scholar

[ref10] 10. Govender I, Rangiah S, Kaswa R, Nzaumvila D. Malnutrition in children under the age of 5 years in a primary health care setting. South African Family Practice. 2021;63(1).
View Article
Google Scholar

[31] View Article

[32] Google Scholar

[ref11] 11. Thangamani D, Sudha P. Identification of malnutrition with use of supervised data mining techniques–decision trees and artificial neural networks. Int J Eng Comput Sci. 2014;3(09).
View Article
Google Scholar

[34] View Article

[35] Google Scholar

[ref12] 12. Kuttiyapillai D, Ramachandran R. Improved text analysis approach for predicting effects of nutrient on human health using machine learning techniques. IOSR J Comput Eng. 2014;16(3):86–91.
View Article
Google Scholar

[37] View Article

[38] Google Scholar

[ref13] 13. Krishna PV, Gurumoorthy S, Obaidat MS, Mani JJ, Rani Kasireddy S. Population classification upon dietary data using machine learning techniques with IOT and big data. Social Network Forensics, Cyber Security, and Machine Learning. 2019:9–27.
View Article
Google Scholar

[40] View Article

[41] Google Scholar

[ref14] 14. Bitew FH, Sparks CS, Nyarko SH. Machine learning algorithms for predicting undernutrition among under-five children in Ethiopia. Public health nutrition. 2022;25(2):269–80. pmid:34620263
View Article
PubMed/NCBI
Google Scholar

[43] View Article

[44] PubMed/NCBI

[45] Google Scholar

[ref15] 15. Fenta HM, Zewotir T, Muluneh EK. A machine learning classifier approach for identifying the determinants of under-five child undernutrition in Ethiopian administrative zones. BMC Medical Informatics and Decision Making. 2021;21:1–2.
View Article
Google Scholar

[47] View Article

[48] Google Scholar

[ref16] 16. Anku EK, Duah HO. Predicting and identifying factors associated with undernutrition among children under five years in Ghana using machine learning algorithms. Plos one. 2024;19(2):e0296625. pmid:38349921
View Article
PubMed/NCBI
Google Scholar

[50] View Article

[51] PubMed/NCBI

[52] Google Scholar

[ref17] 17. Shen H, Zhao H, Jiang Y. Machine learning algorithms for predicting stunting among under-five children in Papua New Guinea. Children. 2023;10(10):1638. pmid:37892302
View Article
PubMed/NCBI
Google Scholar

[54] View Article

[55] PubMed/NCBI

[56] Google Scholar

[ref18] 18. Chilyabanyama ON, Chilengi R, Simuyandi M, Chisenga CC, Chirwa M, Hamusonde K, et al. Performance of machine learning classifiers in classifying stunting among under-five children in Zambia. Children. 2022;9(7):1082. pmid:35884066
View Article
PubMed/NCBI
Google Scholar

[58] View Article

[59] PubMed/NCBI

[60] Google Scholar

[ref19] 19. Talukder A, Ahammed B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition. 2020;78:110861. pmid:32592978
View Article
PubMed/NCBI
Google Scholar

[62] View Article

[63] PubMed/NCBI

[64] Google Scholar

[ref20] 20. Shahriar MM, Iqubal MS, Mitra S, Das AK. A Deep Learning Approach to Predict Malnutrition Status of 0–59 Month’s Older Children in Bangladesh. In 2019 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT) 2019; (pp. 145–149). IEEE.

[ref21] 21. Rahman SJ, Ahmed NF, Abedin MM, Ahammed B, Ali M, Rahman MJ, et al. Investigate the risk factors of stunting, wasting, and underweight among under-five Bangladeshi children and its prediction based on machine learning approach. Plos one. 2021;16(6):e0253172. pmid:34138925
View Article
PubMed/NCBI
Google Scholar

[67] View Article

[68] PubMed/NCBI

[69] Google Scholar

[ref22] 22. National Institute of Population Research and Training (NIPORT), & ICF. Bangladesh demographic and health survey 2017‐18. Dhaka, Bangladesh, and Rockville, Maryland, USA.

[ref23] 23. Al-Sadeeq AH, Bukair AZ, Al-Saqladi AW. Assessment of undernutrition using Composite Index of Anthropometric Failure among children aged< 5 years in rural Yemen. Eastern Mediterranean Health Journal. 2018;24(12).
View Article
Google Scholar

[72] View Article

[73] Google Scholar

[ref24] 24. Kassie GW, Workie DL. Exploring the association of anthropometric indicators for under-five children in Ethiopia. BMC public health. 2019;19:1–6.
View Article
Google Scholar

[75] View Article

[76] Google Scholar

[ref25] 25. Khan S, Zaheer S, Safdar NF. Determinants of stunting, underweight and wasting among children< 5 years of age: evidence from 2012–2013 Pakistan demographic and health survey. BMC public health. 2019;19:1–5.
View Article
Google Scholar

[78] View Article

[79] Google Scholar

[ref26] 26. Wondiye K, Asseffa NA, Gemebo TD, Astawesegn FH. Predictors of undernutrition among the elderly in Sodo zuriya district Wolaita zone, Ethiopia. BMC nutrition. 2019;5:1–7.
View Article
Google Scholar

[81] View Article

[82] Google Scholar

[ref27] 27. Modjadji P, Madiba S. Childhood undernutrition and its predictors in a rural health and demographic surveillance system site in South Africa. International journal of environmental research and public health. 2019;16(17):3021. pmid:31438531
View Article
PubMed/NCBI
Google Scholar

[84] View Article

[85] PubMed/NCBI

[86] Google Scholar

[ref28] 28. Chowdhury MR, Rahman MS, Billah B, Rashid M, Almroth M, Kader M. Prevalence and factors associated with severe undernutrition among under-5 children in Bangladesh, Pakistan, and Nepal: a comparative study using multilevel analysis. Scientific Reports. 2023;13(1):10183. pmid:37349482
View Article
PubMed/NCBI
Google Scholar

[88] View Article

[89] PubMed/NCBI

[90] Google Scholar

[ref29] 29. Kiarie J, Karanja S, Busiri J, Mukami D, Kiilu C. The prevalence and associated factors of undernutrition among under-five children in South Sudan using the standardized monitoring and assessment of relief and transitions (SMART) methodology. BMC nutrition. 2021;7(1):25. pmid:34044874
View Article
PubMed/NCBI
Google Scholar

[92] View Article

[93] PubMed/NCBI

[94] Google Scholar

[ref30] 30. Danso F, Appiah MA. Prevalence and associated factors influencing stunting and wasting among children of ages 1 to 5 years in Nkwanta South Municipality, Ghana. Nutrition. 2023;110:111996. pmid:37003173
View Article
PubMed/NCBI
Google Scholar

[96] View Article

[97] PubMed/NCBI

[98] Google Scholar

[ref31] 31. Menalu MM, Bayleyegn AD, Tizazu MA, Amare NS. Assessment of prevalence and factors associated with malnutrition among under-five children in Debre Berhan town, Ethiopia. International Journal of General Medicine. 2021:1683–97. pmid:33976568
View Article
PubMed/NCBI
Google Scholar

[100] View Article

[101] PubMed/NCBI

[102] Google Scholar

[ref32] 32. Hoffman DJ, Kassim I, Ndiaye B, McGovern ME, Le H, Abebe KT, et al. Childhood Stunting and wasting following independence in South Sudan. Food and Nutrition Bulletin. 2022;43(4):381–94. pmid:36245391
View Article
PubMed/NCBI
Google Scholar

[104] View Article

[105] PubMed/NCBI

[106] Google Scholar

[ref33] 33. Kwak SK, Kim JH. Statistical data preparation: management of missing values and outliers. Korean journal of anesthesiology. 2017;70(4):407. pmid:28794835
View Article
PubMed/NCBI
Google Scholar

[108] View Article

[109] PubMed/NCBI

[110] Google Scholar

[ref34] 34. Samuel O, Zewotir T, North D. Application of machine learning methods for predicting under-five mortality: analysis of Nigerian demographic health survey 2018 dataset. BMC Medical Informatics and Decision Making. 2024;24(1):86. pmid:38528495
View Article
PubMed/NCBI
Google Scholar

[112] View Article

[113] PubMed/NCBI

[114] Google Scholar

[ref35] 35. Zhang Q, Wan NJ. Simple Method to Predict Insulin Resistance in Children Aged 6–12 Years by Using Machine Learning. Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy. 2022:2963–75. pmid:36193541
View Article
PubMed/NCBI
Google Scholar

[116] View Article

[117] PubMed/NCBI

[118] Google Scholar

[ref36] 36. Faisal S, Tutz G. Multiple imputation using nearest neighbor methods. Information Sciences. 2021;570:500–16.
View Article
Google Scholar

[120] View Article

[121] Google Scholar

[ref37] 37. Emmanuel T, Maupong T, Mpoeleng D, Semong T, Mphago B, Tabona O. A survey on missing data in machine learning. Journal of Big data. 2021;8:1–37.
View Article
Google Scholar

[123] View Article

[124] Google Scholar

[ref38] 38. Beretta L, Santaniello A. Nearest neighbor imputation algorithms: a critical evaluation. BMC medical informatics and decision making. 2016;16:197–208. pmid:27454392
View Article
PubMed/NCBI
Google Scholar

[126] View Article

[127] PubMed/NCBI

[128] Google Scholar

[ref39] 39. Chen RC, Dewi C, Huang SW, Caraka RE. Selecting critical features for data classification based on machine learning methods. Journal of Big Data. 2020 Jul 23;7(1):52.
View Article
Google Scholar

[130] View Article

[131] Google Scholar

[ref40] 40. Islam MM, Alam MJ, Maniruzzaman M, Ahmed NF, Ali MS, Rahman MJ, et al. Predicting the risk of hypertension using machine learning algorithms: A cross sectional study in Ethiopia. PLoS One. 2023;18(8):e0289613. pmid:37616271
View Article
PubMed/NCBI
Google Scholar

[133] View Article

[134] PubMed/NCBI

[135] Google Scholar

[ref41] 41. Ranganathan P, Pramesh CS, Aggarwal R. Common pitfalls in statistical analysis: logistic regression. Perspectives in clinical research. 2017;8(3):148–51. pmid:28828311
View Article
PubMed/NCBI
Google Scholar

[137] View Article

[138] PubMed/NCBI

[139] Google Scholar

[ref42] 42. Hassoun MH. Fundamentals of artificial neural networks. MIT press; 1995.

[ref43] 43. Breiman L. Random forests. Machine learning. 2001;45:5–32.
View Article
Google Scholar

[142] View Article

[143] Google Scholar

[ref44] 44. Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining 2016; 785–794.

[ref45] 45. Buczak AL, Guven E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Communications surveys & tutorials. 2016;18(2):1153–1176.
View Article
Google Scholar

[146] View Article

[147] Google Scholar

[ref46] 46. Dahiya P, Srivastava DK. Intrusion detection system on big data using deep learning techniques. Int J Innov Technol Exploring Eng. 2020;9(4):3242–3247.
View Article
Google Scholar

[149] View Article

[150] Google Scholar

[ref47] 47. Hagar AA, Gawali BW. Implementation of Machine and Deep Learning Algorithms for Intrusion Detection System. In G. Rajakumar et al. (eds.), Intelligent Communication Technologies and Virtual Mobile Networks. Springer Nature Singapore. 2023:1–20.

[ref48] 48. Hajian-Tilaki K. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation. Caspian journal of internal medicine. 2013;4(2):627. pmid:24009950
View Article
PubMed/NCBI
Google Scholar

[153] View Article

[154] PubMed/NCBI

[155] Google Scholar

[ref49] 49. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Advances in neural information processing systems. 2017;30.
View Article
Google Scholar

[157] View Article

[158] Google Scholar

[ref50] 50. Kuhn HW, Tucker AW, editors. Contributions to the Theory of Games. Princeton University Press; 1953.

[ref51] 51. Steinfath M, Vogl S, Violet N, Schwarz F, Mielke H, Selhorst T, et al. Simple changes of individual studies can improve the reproducibility of the biomedical scientific process as a whole. PLoS One. 2018;13(9):e0202762. pmid:30208060

[ref52] 52. Antipov EA, Pokryshevskaya EB. Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values. Journal of revenue and pricing management. 2020;19:355–64.

[ref53] 53. Sumon IH, Hossain M, Ar Salan S, Kabir MA, Majumder AK. Determinants of coexisting forms of undernutrition among under‐five children: Evidence from the Bangladesh demographic and health surveys. Food Science & Nutrition. 2023;11(9):5258–69. pmid:37701232

[ref54] 54. Khan RE, Raza MA. Determinants of malnutrition in Indian children: new evidence from IDHS through CIAF. Quality & Quantity. 2016;50:299–316.

[ref55] 55. Akombi BJ, Agho KE, Merom D, Hall JJ, Renzaho AM. Multilevel analysis of factors associated with wasting and underweight among children under-five years in Nigeria. Nutrients. 2017;9(1):44. pmid:28075336

[ref56] 56. Vijay J, Patel KK. Malnutrition among under-five children in Nepal: A focus on socioeconomic status and maternal BMI. Clinical Epidemiology and Global Health. 2024;27:101571.

[ref57] 57. Ahmmed F, Hasan MN, Hossain MF, Khan MT, Rahman MM, Hussain MP, et al. Association between short birth spacing and child malnutrition in Bangladesh: a propensity score matching approach. BMJ Paediatrics Open. 2024;8(1). pmid:38499349

[ref58] 58. Ntambara J, Zhang W, Qiu A, Cheng Z, Chu M. Optimum birth interval (36–48 months) may reduce the risk of undernutrition in children: A meta-analysis. Frontiers in Nutrition. 2023;9:939747. pmid:36712519

[ref59] 59. Ssentongo P, Ba DM, Ssentongo AE, Fronterre C, Whalen A, Yang Y, et al. Association of vitamin A deficiency with early childhood stunting in Uganda: A population-based cross-sectional study. PLoS One. 2020;15(5):e0233615. pmid:32470055

[ref60] 60. Das S, Gulshan J. Different forms of malnutrition among under five children in Bangladesh: a cross sectional study on prevalence and determinants. BMC Nutrition. 2017;3:1–2.

[ref61] 61. Hossain MM, Yeasmin S, Abdulla F, Rahman A. Rural-urban determinants of vitamin a deficiency among under 5 children in Bangladesh: Evidence from National Survey 2017–18. BMC Public Health. 2021;21:1–0.

[ref62] 62. Ahmed R, Ejeta Chibsa S, Hussen MA, Bayisa K, Tefera Kefeni B, Gezimu W, et al. Undernutrition among exclusive breastfeeding mothers and its associated factors in Southwest Ethiopia: A community-based study. Women’s Health. 2024;20:17455057241231478.

[ref63] 63. Hossain MM, Abdulla F, Rahman A. Prevalence and risk factors of underweight among under-5 children in Bangladesh: Evidence from a countrywide cross-sectional study. PLoS One. 2023;18(4):e0284797. pmid:37093817

[ref64] 64. Chandna A, Bhagowalia P. Birth order and children’s health and learning outcomes in India. Economics & Human Biology. 2024;52:101348. pmid:38237431

[ref65] 65. Mutumba R, Pesu H, Mbabazi J, Greibe E, Olsen MF, Briend A, et al. Correlates of iron, cobalamin, folate, and vitamin A status among stunted children: A cross-sectional study in Uganda. Nutrients. 2023;15(15):3429. pmid:37571364

[ref66] 66. Yu T, Chen C, Jin Z, Yang Y, Jiang Y, Hong L, et al. Association of number of siblings, birth order, and thinness in 3-to 12-year-old children: a population-based cross-sectional study in Shanghai, China. BMC Pediatrics. 2020;20:1–3.
