Identifying influential determinants of women’s empowerment in Bangladesh using machine learning algorithms

  • Md. A. Salam,

    Roles Conceptualization, Data curation, Formal analysis, Writing – original draft

    Affiliation Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh

  • Samiul Islam,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh

  • Md. Mahfuz Uddin,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh

  • Tamanna Rahman Shraboni,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh

  • Antora Das,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh

  • Md. Merajul Islam,

    Roles Conceptualization, Data curation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh, Bangladesh

  • Md. Rezaul Karim

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    mrkarim@ru.ac.bd

    Affiliation Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh

Abstract

Background and objectives

Women’s empowerment is a vital issue in lower-middle-income developing countries like Bangladesh, where it plays a pivotal role in advancing development across the nation. Thus, this study aimed to identify the influential determinants of women’s empowerment in Bangladesh using machine learning (ML) algorithms.

Materials and methods

The data for this study were obtained from the Bangladesh Demographic and Health Survey (BDHS) 2022, which included a nationally representative sample of 18,600 ever-married women aged 15–49 years. The important variables for women’s empowerment were identified using logistic regression and the Boruta feature selection method. Subsequently, eight popular machine learning algorithms - Decision Tree, Random Forest (RF), Naïve Bayes, Artificial Neural Network, Logistic Regression, Extreme Gradient Boosting, Gradient Boosting, and Support Vector Machine - were employed to predict women’s empowerment status. Model performance was assessed using accuracy, F1-score, and the area under the curve (AUC). Additionally, the most suitable model with SHAP analysis was used to identify the influential determinants driving women’s empowerment.

Results

The RF-based model demonstrated the best performance, achieving an accuracy of 71.07%, an F1-score of 81.58%, and an AUC of 0.676. The analysis revealed age, division, wealth index, working status, household members, husband’s education, and respondent’s education as the most influential determinants of women’s empowerment.

Conclusion

This study provides the best predictive model and identifies influential determinants of women’s empowerment in Bangladesh, offering valuable insights for achieving Sustainable Development Goal 5 (SDG-5) by 2030 through targeted actions and policies.

1. Introduction

Women’s empowerment is a fundamental pillar of global development, playing a critical role in fostering sustainable economic growth, poverty reduction, and overall societal advancement [1,2]. It encompasses women’s active and equal participation in the financial, political, educational, and social spheres, ensuring they can contribute meaningfully to all aspects of life. Empowerment is about providing opportunities and creating an environment where women can exercise their rights, make informed decisions, and achieve personal growth. Women’s empowerment promotes gender equality, a prerequisite for achieving human rights and societal development. Empowering women with equal access to resources, education, healthcare, and economic opportunities allows them to overcome barriers and become active agents of change in their personal, family, and community lives. This participation drives positive outcomes such as improved family well-being, increased economic productivity, and stronger community resilience. Empowering women contributes to breaking cycles of poverty, as empowered women often invest in the health, education, and well-being of their children, fostering long-term intergenerational progress [3–5]. Women’s representation in political and decision-making processes also strengthens governance and leads to more inclusive and equitable policies.

Furthermore, women’s empowerment is directly linked to achieving key global development goals, such as the United Nations Sustainable Development Goals (SDGs), particularly Goal 5: Gender Equality [6]. Empowering women contributes to improved child health, reduced maternal mortality, enhanced education outcomes, and better economic resilience at both local and global levels. For instance, when women are educated and empowered, they are more likely to ensure their children receive education, improving the prospects of future generations. Overall, women’s empowerment is not merely a women’s issue; it is a societal imperative. It drives economic progress, strengthens communities, and ensures more inclusive, equitable, and sustainable development. It is therefore essential to identify the factors that affect women’s empowerment. Several studies have previously investigated these factors globally, including in Bangladesh [7–12]. However, most of these studies have used traditional statistical methods, such as logistic regression (LR), to identify the factors influencing women’s empowerment. LR assumes a linear relationship between the predictors and the log odds of the outcome, which may not be appropriate when the data involve more intricate or complex patterns. Advanced methods such as machine learning (ML) are therefore needed. ML algorithms can handle large datasets and identify complex, non-linear relationships that traditional statistical methods may overlook. These algorithms can learn from data to uncover hidden patterns and interactions that could be crucial to understanding the factors influencing women’s empowerment. By examining the factors through robust methods, this research provides actionable insights to promote social transformation and help women attain autonomy and self-reliance in Bangladesh. However, no ML-based study has been conducted to identify the factors influencing women’s empowerment in Bangladesh. Therefore, this study explores the application of ML algorithms to identify influential determinants of women’s empowerment in Bangladesh. The study’s findings highlight the significant role of ML in identifying and predicting the factors influencing women’s empowerment, providing valuable insights to guide policy decisions that promote gender equality and women’s empowerment in Bangladesh. By focusing on the influential determinants identified by the most suitable ML-based model, targeted interventions can be designed to further empower women. These interventions can contribute to achieving the SDGs, particularly SDG 5, which emphasizes gender equality and women’s empowerment.

The remainder of the paper is organized as follows: Section 2 describes the materials and methods, including the data source, outcome variable, explanatory variables, statistical analysis, feature selection, machine learning algorithms, cross-validation, hyperparameter tuning, and model evaluation. Section 3 presents the results of the analysis. Section 4 discusses the findings, and Section 5 summarizes the conclusions.

2. Materials and methods

2.1. Data source

This study used data from the most recent Bangladesh Demographic and Health Survey (BDHS), conducted in 2022 [13]. The data collection followed a two-stage stratified random sampling approach. In the first stage, 675 enumeration areas (EAs) were selected using probability proportional to size, including 438 rural and 237 urban areas. The survey was conducted among 30,375 residential households, including 10,665 in urban areas and 19,710 in rural areas. Of these 30,375 households, about 30,340 were interviewed, including ever-married women aged 15–49 years. After eliminating records with missing, unknown, or irrelevant information, 18,600 women were included in the final dataset for this study.

2.1.1. Outcome variable.

The outcome variable of this study was women’s empowerment. Women’s empowerment was defined as involvement in either some or all of the three decisions: (1) their healthcare, (2) significant household purchases, and (3) visits to family or relatives [14]. It was coded as a binary variable, with empowerment categorized as Yes (1) if the woman participated in any of the decisions and No (0) if she did not participate in any of them.
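
The coding rule above can be illustrated with a minimal Python sketch. The function and argument names are hypothetical and do not correspond to BDHS variable names; each argument is assumed to indicate whether the woman takes part in that decision.

```python
def empowerment_status(healthcare: bool, purchases: bool, visits: bool) -> int:
    """Return 1 (empowered) if the woman takes part in at least one decision, else 0."""
    return int(healthcare or purchases or visits)


# Hypothetical respondents: (own healthcare, major purchases, family visits)
print(empowerment_status(False, True, False))   # 1: participates in one decision
print(empowerment_status(False, False, False))  # 0: participates in none
```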

2.1.2. Explanatory variables.

We considered different types of explanatory variables, including demographic, socio-economic, environmental, and other factors, based on insights from previous studies and the availability of data in the BDHS, 2022 database [8,9,11,14]. The variables were respondent’s Age (15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49), Division (Barisal, Chittagong, Dhaka, Khulna, Mymensingh, Rajshahi, Rangpur, and Sylhet), Residence (urban, rural), Education (none, primary, secondary, and higher), Religion (Muslim, non-Muslim), Sex of the household head (male, female), Reading a newspaper (not at all, less than once a week, and at least once a week), Watching TV (not at all, less than once a week, and at least once a week), Wealth index (poorest, poorer, middle, richer, and richest), Husband’s education (none, primary, secondary, and higher), Respondent currently working (no, yes), Number of household members (1–2, 3–4, ≥ 5), Age at first marriage (10–14, 15–19, 20–24, and ≥ 25), Number of living children (0, 1–2, ≥ 3), and Wife-beating (no, yes). Wife-beating was coded as yes if the respondent agreed with at least one reason for which a husband is justified in hitting his wife: she goes out without telling him, neglects the children, refuses sexual intercourse, or burns the food.

2.1.3. Statistical analysis.

Descriptive statistics of the study participants were reported as frequency (%) for the selected variables. The chi-square test was used to assess the bivariate associations between the response and explanatory variables, with a significance level of p-value < 0.05. Data analysis was carried out by using SPSS and the R programming language. The dataset was split into two sets: 80% for training and 20% for testing.
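
As an illustration of the bivariate test described above, the following sketch computes the Pearson chi-square statistic for a contingency table in plain Python. The counts are invented for illustration and are not BDHS data, and the function name is hypothetical; the analyses in this study were carried out in SPSS and R.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented 2x2 table: rows = working (yes, no), columns = empowered (yes, no)
toy_table = [[30, 10], [20, 40]]
statistic, dof = chi_square_statistic(toy_table)
# With 1 degree of freedom, the 5% critical value is 3.841.
print(round(statistic, 3), dof, statistic > 3.841)
```

A statistic above the critical value corresponds to p < 0.05, the significance level used in this study.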

2.2. Feature selection

Feature selection is the process of selecting the most important features in a dataset by analyzing the underlying relationships within the data. This helps improve model performance, reduce complexity, and enhance interpretability [15]. In this study, we applied LR and Boruta-based feature selection methods to select the most important features of women’s empowerment [16–21].

2.3. Machine learning algorithms

2.3.1. Decision tree.

The decision tree (DT) is a tree-structured machine learning algorithm [16,22]. It splits a dataset into subsets based on the values of input features, forming a tree structure. Nodes represent tests on attributes (features), branches represent test outcomes, and leaves represent final outputs (class labels or values). The process starts with a root node for the entire dataset, which splits based on the most significant feature. Internal nodes continue splitting the data, and leaf nodes provide the final predictions.
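
The choice of the "most significant feature" at a node can be sketched as follows. This example assumes Gini impurity as the split criterion, which is a common choice but is not stated in this study; the feature names, data, and function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini_after_split(rows, labels, feature):
    """Size-weighted impurity of the child nodes after splitting on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    n = len(labels)
    return sum(len(group) / n * gini(group) for group in groups.values())


def best_feature(rows, labels):
    """Feature whose split leaves the lowest weighted impurity."""
    return min(rows[0], key=lambda f: weighted_gini_after_split(rows, labels, f))


# Invented toy data: empowerment label follows working status exactly
rows = [
    {"working": "yes", "residence": "urban"},
    {"working": "yes", "residence": "rural"},
    {"working": "no", "residence": "urban"},
    {"working": "no", "residence": "rural"},
]
labels = [1, 1, 0, 0]
print(best_feature(rows, labels))  # working
```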

2.3.2. Random forest.

Random forest (RF) is an ensemble ML algorithm that combines multiple DTs to enhance accuracy and reduce overfitting [23]. It uses bootstrap sampling (bagging) to create multiple subsets of the training data and uses random feature selection to ensure diversity among the trees. Each tree in the forest independently predicts an outcome, and the final prediction is made by majority voting [24].
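
The two ingredients named above, bootstrap sampling and majority voting, can be sketched in a few lines of Python. This is only an illustration of the idea, not the implementation used in this study; the helper names and the seed are hypothetical, and tree training itself is omitted.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (one bootstrap sample)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # arbitrary seed for reproducibility
sample = bootstrap_sample(list(range(10)), rng)
print(sample)  # typically some rows repeat and others are left out

# Hypothetical votes from five trees for one woman
print(majority_vote([1, 0, 1, 1, 0]))  # 1
```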

2.3.3. Naïve Bayes.

Naïve Bayes (NB) is a probabilistic ML algorithm based on Bayes’ theorem. It assumes that all features are independent of each other. Despite this “naïve” assumption, it performs well in many applications [25]. The algorithm calculates the probability of each class given the input features and predicts the class with the highest probability [26].
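
A minimal sketch of this calculation for categorical features is given below. It assumes Laplace (add-one) smoothing, which is not specified in this study, and uses invented data and hypothetical function names.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count classes, feature values per class, and the levels of each feature."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> value counts
    levels = defaultdict(set)            # feature -> observed values
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            value_counts[(feature, label)][value] += 1
            levels[feature].add(value)
    return class_counts, value_counts, levels


def predict_nb(model, row):
    """Return the class with the highest posterior under feature independence."""
    class_counts, value_counts, levels = model
    n = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / n)  # log prior
        for feature, value in row.items():
            # Add-one smoothing avoids zero probabilities (an assumption here)
            numerator = value_counts[(feature, label)][value] + 1
            log_p += math.log(numerator / (count + len(levels[feature])))
        scores[label] = log_p
    return max(scores, key=scores.get)


# Invented toy data
rows = [{"working": "yes"}, {"working": "yes"}, {"working": "no"},
        {"working": "no"}, {"working": "yes"}]
labels = [1, 1, 0, 0, 1]
model = train_nb(rows, labels)
print(predict_nb(model, {"working": "yes"}), predict_nb(model, {"working": "no"}))
```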

2.3.4. Artificial neural network.

The Artificial neural network (ANN) is an ML algorithm inspired by the structure and functioning of the human brain [27]. The ANN consists of nodes organized into three main layers: input, hidden, and output. The data enters through the input layer, passes through one or more hidden layers where the neurons apply weights, biases, and activation functions, and finally produces an output through the output layer [28].

2.3.5. Logistic regression.

Logistic regression (LR) is a widely used probabilistic classification model in ML. It assumes a linear relationship between the features and the log odds of the output variable [29]. During the training phase, LR estimates the model’s parameters by maximum likelihood estimation (MLE), that is, by choosing the parameter values under which the observed class labels are most probable. Once trained, the model predicts the probability of the positive class by applying the logistic function to a linear combination of the input features. If the predicted probability exceeds a threshold (usually 0.5), the model classifies the input as the positive class; otherwise, it classifies it as the negative class.
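
The prediction step described above can be sketched as follows. The coefficient values are invented for illustration and are not estimates from this study; the function names are hypothetical.

```python
import math


def predict_probability(intercept, coefficients, features):
    """Apply the logistic function to the linear predictor (the log odds)."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(probability, threshold=0.5):
    """Positive class (1) only if the probability exceeds the threshold."""
    return 1 if probability > threshold else 0


# Invented model with one binary feature (e.g. currently working = 1)
p = predict_probability(0.0, [2.0], [1.0])
print(round(p, 4), classify(p))  # 0.8808 1
```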

2.3.6. Extreme gradient boosting.

Extreme gradient boosting (XGBoost) is an optimized version of gradient boosting [30]. It builds decision trees as base learners in sequence, each trained to address the prediction error left by the preceding tree, leading to improved predictions. This sequential approach provides an alternative method for developing sophisticated, precise models with trees, enabling control over individual tree depth and complexity.

2.3.7. Gradient boosting.

Gradient boosting is an ensemble ML method that builds a strong predictive model by sequentially combining multiple weak learners, typically DTs [31]. The process begins with a simple model, and in each subsequent step, new models are trained to predict the residual errors from the previous model. These new models are added to improve the overall prediction, with each model contributing to the final output based on a learning rate. The approach is guided by gradient descent to minimize the loss function.

2.3.8. Support vector machine.

Support vector machine (SVM) is a widely used ML algorithm for classification and regression tasks [32]. It identifies the optimal hyperplane that maximizes the margin between classes in the dataset. SVM effectively handles linear and non-linear data using kernel functions, such as linear, polynomial, and radial basis function (RBF), transforming the data into higher-dimensional spaces [33]. This study employed the RBF kernel to achieve the largest possible margin between support vectors representing the closest data points from each class.
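
For reference, the RBF kernel measures the similarity of two observations as exp(-gamma * ||x - z||^2). The sketch below is a plain-Python illustration; the parameter name gamma and its value are assumptions, since the kernel width used in this study is a tuned hyperparameter.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: 1 for identical points, approaching 0 as they move apart."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))            # 1.0
print(round(rbf_kernel([0.0, 0.0], [1.0, 0.0]), 4))  # 0.3679
```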

2.3.9. Cross-validation and tuning hyperparameters.

Cross-validation (CV) is a widely used protocol for assessing a model’s performance more robustly. In K-fold CV, the dataset is divided into K equal-sized subsamples (folds); each fold is used once as the validation set, while the remaining K-1 folds are combined to form the training set. The DT, RF, NB, ANN, LR, XGB, GB, and SVM models have hyperparameters that must be set by the user prior to training. Hyperparameter tuning was performed using a grid search combined with 10-fold (K = 10) CV on the training dataset, so that in each fold the data were split into training and validation subsets at a 9:1 ratio. The caret package (version 6.0-93) in R was employed to identify the optimal hyperparameter combinations for all eight models, as summarized in Table 1.

Table 1. Hyperparameter tuning ranges and criteria for eight ML-based models.

https://doi.org/10.1371/journal.pone.0338037.t001
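
The K-fold partitioning described in this subsection can be sketched in plain Python as follows. This only illustrates how the folds are formed; it is not the caret implementation, and the function name and seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(100, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 90 10
```

In a grid search, each candidate hyperparameter combination would be fitted on the training indices of every fold and scored on the corresponding validation indices, and the combination with the best average score would be retained.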

2.4. Model evaluation

Model evaluation is the process of assessing a machine learning model’s performance to ensure it generalizes effectively to unseen data [34]. We used the confusion matrix (Table 2), accuracy, sensitivity, specificity, ROC curve, and AUC to evaluate the models.

Accuracy: The proportion of correctly predicted instances to the total instances.

Sensitivity: The proportion of actual positive cases correctly identified by the model.

Specificity: The proportion of actual negative cases correctly identified by the model.
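
These three definitions translate directly into formulas on the confusion-matrix counts. The sketch below uses invented counts (TP, FP, TN, FN) and a hypothetical function name.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Invented counts for illustration
metrics = classification_metrics(tp=60, fp=10, tn=20, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```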

2.4.1. ROC curve.

The ROC curve is a graphical tool used to evaluate the performance of a binary classifier. It plots the true positive rate (sensitivity) against the false positive rate (1-specificity) across different threshold values, providing insight into how well the model can distinguish between positive and negative cases. The ROC curve helps assess the model’s diagnostic ability and the trade-offs between sensitivity and specificity [34].
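
The area under the ROC curve equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes the AUC through this pairwise formulation; the scores are invented and the function name is hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Invented predicted probabilities for two positive and two negative cases
print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # 0.75
```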

2.4.2. Model explainability.

SHAP (SHapley Additive exPlanations) was introduced by Lundberg and Lee (2017) to quantify the contribution of each feature to the predictions of a machine learning model. SHAP values explain feature importance both locally (for an individual prediction) and globally (across the dataset), and the method is widely used as an interpretability and visualization technique [35].
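
The idea behind SHAP values can be illustrated with an exact Shapley computation on a toy model: each feature's value is its marginal contribution to the prediction, averaged over all orderings in which features can be added. The sketch below enumerates every ordering, which is feasible only for a handful of features; practical SHAP implementations rely on approximations or model-specific algorithms. The toy model, feature names, and numbers are hypothetical.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        present = set()
        for f in ordering:
            before = value_fn(frozenset(present))
            present.add(f)
            totals[f] += value_fn(frozenset(present)) - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_model(present):
    """Hypothetical prediction when only the features in `present` are known."""
    value = 0.0
    if "age" in present:
        value += 0.2
    if "working" in present:
        value += 0.1
    if {"age", "working"} <= present:
        value += 0.1  # interaction effect, shared equally by both features
    return value


phi = shapley_values(toy_model, ["age", "working"])
print(phi)
```

The values sum to the difference between the prediction with all features and the prediction with none, which is the additivity property that gives SHAP its name.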

3. Results

3.1. Background characteristics of the study participants

Table 3 presents the bivariate associations between sociodemographic characteristics and women’s empowerment. Younger women (15–19 years) showed lower empowerment (69.4%), while older women (35–44 years) showed higher levels (89.2%–90.2%). Women living in the Sylhet division had the lowest empowerment (77.9%), whereas those in Dhaka and Rangpur had the highest (88.4% and 88.1%, respectively). Urban women had a higher empowerment rate (87.2%) than rural women (84.0%). Women with higher education showed greater empowerment (88.3%) than those with no education (85.8%), primary education (84.8%), or secondary education (84.2%). Women in female-headed households reported slightly higher empowerment (88.7%) than those in male-headed households (84.6%). Women who read newspapers or watched TV more frequently exhibited higher empowerment (87.8%). The richest women were the most empowered (88.3%), while the poorest were the least (83.6%). Women with more educated husbands were slightly more empowered (86.4%). Working women reported significantly higher empowerment (90.6%) than non-working women (82.8%). Women in smaller households (1–2 members) were more empowered (88.6%) than those in larger households (82.3%). Women who married later, at age ≥ 25, had the highest empowerment (91.1%), while those who married at ages 15–19 had the lowest (84.2%). Women with no children had the lowest empowerment (74.8%), while those with ≥ 3 living children were more empowered (87.5%). Women who opposed wife-beating showed higher empowerment (85.5%) than those who accepted it (82.6%).

Table 3. Association between explanatory variables and women’s empowerment.

https://doi.org/10.1371/journal.pone.0338037.t003

3.2. Identified important features using logistic regression

Table 4 presents the results of the logistic regression analysis. Women aged 15–19 (OR: 0.404, 95% CI: 0.323–0.506; p < 0.001) and 20–24 (OR: 0.570, 95% CI: 0.469–0.694; p < 0.001) years had lower odds of empowerment than those aged 45–49 years. In contrast, women in the 40–44 age group (OR: 1.314, 95% CI: 1.082–1.597; p = 0.006) had higher odds of empowerment. Women living in the Barisal (OR: 1.257, 95% CI: 1.064–1.489; p = 0.007), Chittagong (OR: 1.932, 95% CI: 1.642–2.272; p < 0.001), Dhaka (OR: 1.885, 95% CI: 1.594–2.229; p < 0.001), Mymensingh (OR: 1.469, 95% CI: 1.238–1.743; p < 0.001), Rajshahi (OR: 1.671, 95% CI: 1.400–1.994; p < 0.001), and Rangpur (OR: 1.889, 95% CI: 1.578–2.260; p < 0.001) divisions were more likely to be empowered than those living in the Sylhet division. Women residing in urban areas (OR: 1.103, 95% CI: 0.999–1.218; p = 0.048) had higher odds of empowerment than those living in rural areas. Women with no education (OR: 0.533, 95% CI: 0.425–0.669; p < 0.001), primary education (OR: 0.616, 95% CI: 0.510–0.744; p < 0.001), or secondary education (OR: 0.734, 95% CI: 1.165–1.618; p < 0.001) had lower odds of being empowered than women with higher education. Women in households headed by males (OR: 0.733, 95% CI: 0.635–0.846; p < 0.001) were less likely to be empowered than those in households headed by females.

Table 4. Risk factor identification of women’s empowerment using logistic regression.

https://doi.org/10.1371/journal.pone.0338037.t004

Women who watched TV either not at all (OR: 0.772, 95% CI: 0.701–0.849; p = 0.01) or less than once a week (OR: 0.768, 95% CI: 0.653–0.903; p = 0.01) had lower odds of being empowered than those who watched TV at least once a week. Women in the middle (OR: 0.825, 95% CI: 0.712–0.956; p = 0.01) and richer (OR: 0.853, 95% CI: 0.741–0.982; p = 0.01) wealth categories had lower odds of empowerment than the richest women. Women who were not currently working (OR: 0.608, 95% CI: 0.547–0.676; p < 0.001) had lower odds of being empowered than their working counterparts. Women with 1–2 household members (OR: 1.581, 95% CI: 1.268–1.971; p < 0.001) and 3–4 household members (OR: 1.464, 95% CI: 1.332–1.609; p < 0.001) had greater odds of being empowered than those with ≥ 5 household members. Women with no living children (OR: 0.691, 95% CI: 0.573–0.832; p < 0.001) had lower odds of empowerment than those with ≥ 3 children. Women who did not agree that wife-beating is justified (OR: 1.279, 95% CI: 1.136–1.440; p < 0.001) were more likely to be empowered than those who agreed with at least one justification.
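
For readers less familiar with how odds ratios are reported, an OR and its 95% confidence interval are conventionally obtained by exponentiating a logistic regression coefficient and its Wald interval, OR = exp(beta) and CI = exp(beta ± 1.96 × SE). The sketch below assumes this Wald construction, which is not stated explicitly in this study; the coefficient and standard error are invented and the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Odds ratio and Wald confidence interval from a log-odds coefficient."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return odds_ratio, lower, upper


# Invented coefficient and standard error, not taken from Table 4
odds_ratio, lower, upper = odds_ratio_with_ci(beta=-0.5, standard_error=0.1)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

Because the interval is symmetric on the log scale, the OR is the geometric mean of the two interval limits.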

3.3. Identified important features using Boruta

Fig 1 displays the results of the Boruta-based feature selection method. The Boruta method showed that education, age, husband’s education, wealth index, number of living children, household members, division, residence, watching television, age at first marriage, currently working, sex of household head, and newspaper are important features for women’s empowerment. The variables deemed significant or important by either the LR or Boruta methods were utilized in the development of the ML-based models.

3.4. Performance comparison of ML models

Fig 2 shows the confusion matrices of ML-based models. In this figure, Model a represents DT, Model b represents RF, Model c represents NB, Model d represents ANN, Model e represents LR, Model f represents XGB, Model g represents GB, and Model h represents SVM.

Table 5 presents the prediction performance of the different ML models. The RF-based model achieved the highest accuracy (71.07%), sensitivity (75.26%), precision (89.98%), F1-score (81.58%), and AUC (0.676). In contrast, the DT-based model exhibited the highest specificity (64.01%). All ML models were trained on 80% of the data and tested on the remaining 20%, with a random seed of 16160. The ROC curves of the ML models, shown in Fig 3, revealed that the RF-based model had the largest area under the curve. Therefore, we propose the RF-based model as the best model for predicting women’s empowerment.

Table 5. Comparisons of the predicted performance of different ML models.

https://doi.org/10.1371/journal.pone.0338037.t005

Additionally, we computed uncertainty estimates by running each model across 5-, 10-, and 30-fold cross-validations to assess the robustness of our findings. The k-fold cross-validation results (Tables 6 and 7), which provide performance estimates across multiple runs, also demonstrate the superior performance of the RF model across 5-, 10-, and 30-fold settings. Based on these findings, the RF model is considered the most effective model for predicting empowered women in Bangladesh.

Table 6. Results of K-fold cross-validations of ML models based on accuracy.

https://doi.org/10.1371/journal.pone.0338037.t006

Table 7. Results of K-fold cross-validation of ML models based on F1-score.

https://doi.org/10.1371/journal.pone.0338037.t007

The contributing predictors in the RF-based model for predicting women’s empowerment are presented in Fig 4, which shows the global importance of each determinant, and Fig 5, which displays the direction of the relationship between each determinant and the outcome. These figures show that age group, division, wealth index, husband’s education, and working status are the top five features that influence women’s empowerment. These findings suggest that these five features are the most influential determinants of women’s empowerment in Bangladesh.

Fig 4. Importance of influential determinants based on mean absolute SHAP values.

https://doi.org/10.1371/journal.pone.0338037.g004

Fig 5. Importance of influential determinants based on local explanation summary.

https://doi.org/10.1371/journal.pone.0338037.g005

4. Discussion

This study investigates the use of ML algorithms to identify the influential determinants of women’s empowerment in Bangladesh, with the aim of supporting the achievement of SDG-5, which focuses on gender equality and the empowerment of women for sustainable development [36]. To this end, the study used the most recent BDHS data (2022), which provide comprehensive socio-economic, health, and demographic information. Two popular feature selection methods were applied to identify the important features related to women’s empowerment. Subsequently, eight ML algorithms were used to build models on the training data and to predict empowerment status on the test data. The performance of each model was assessed using several metrics, including accuracy, sensitivity, specificity, precision, AUC, and F1-score. Among the models, the RF-based model achieved the highest accuracy, precision, F1-score, and AUC, making it the most effective model for predicting women’s empowerment status. Several previous studies have found that Random Forest (RF) was the most accurate and robust method for analyses based on BDHS or other datasets [37–39]. The superiority of RF can be attributed to (i) its ensemble nature, (ii) its robustness to various feature types and outliers, (iii) its ability to capture nonlinear, interaction, and complex relationships, and (iv) its efficient handling of high-dimensional data with intrinsic feature selection. SHAP analysis was then performed on the optimal prediction model (RF) to identify interpretable and influential predictors of women’s empowerment. The RF-based model with SHAP analysis revealed that age, division, wealth index, working status, husband’s education, and women’s education are the most influential determinants of women’s empowerment. To assess the stability of our results, we conducted a robustness check by running each model with K-fold cross-validation for different values of K (K = 5, 10, 30). The results, provided in Tables 6 and 7, confirm that RF is robust for predicting women’s empowerment with this dataset.

Polin et al. [40] performed a study using ML-based algorithms to predict the impact of microcredit on women’s empowerment in rural areas of Bangladesh. To assess this impact, they collected data from three villages across three districts (Narayanganj, Kishoreganj, and Jessore). The dataset included 43 samples with 21 explanatory variables, focusing on women aged 18–50 years. They applied five machine learning algorithms—NB, k-Nearest Neighbors (k-NN), DT, RF, and Sequential Minimal Optimization (SMO)—to predict women’s empowerment status. Among these models, the DT-based approach achieved the highest accuracy of 83.75% and AUC 0.836. They demonstrated that microcredit positively influences women’s empowerment. Our findings differ somewhat from those of Polin et al. due to variations in objective, dataset, feature selection, geographical coverage, and timeframes.

Age is an influential determinant of women’s empowerment, and this finding aligns with previous studies [9,10,41]. They showed that women’s empowerment often increases with age as they gain more experience, education, and social influence. In younger years, women often focus on education and career development, which can enhance their autonomy and decision-making abilities. Additionally, social norms and expectations often change with age; younger women may experience societal pressures around marriage, appearance, and career choices, while older women may gain greater freedom from these expectations, enabling greater empowerment. However, as women age, especially during childbearing or caregiving years, they may face increased responsibilities that could limit their access to opportunities and independence, potentially impacting their empowerment. Furthermore, empowering older generations of women can positively affect younger women, as they benefit from stronger legal rights, better educational opportunities, and greater workforce participation, thereby enhancing their overall empowerment [9].

Divisional variation in women’s empowerment was observed: women residing in the Chittagong, Rangpur, Dhaka, Rajshahi, Mymensingh, and Barisal divisions exhibited higher levels of empowerment than those in Sylhet, which can be attributed to a combination of education, economic opportunities, social norms, and infrastructure [36]. Education plays a crucial role, as higher female literacy and access to schooling contribute to greater awareness. These divisions have higher female literacy rates and more economic prospects, which help increase women’s participation in decision-making and financial independence [42]. In comparison, Sylhet has traditionally lagged in female education due to cultural norms and migration trends that prioritize male employment abroad over local female empowerment. Dhaka and Chittagong, as major urban centers, offer women more employment opportunities in industries such as garments, banking, and the service sector [43]. Rangpur and Mymensingh have seen active interventions from government and non-governmental organizations (NGOs) promoting women’s empowerment through microfinance, healthcare, and educational programs, which have been less prevalent in Sylhet. In contrast, Sylhet has a high rate of male outmigration, leading to an economy largely supported by remittances rather than local female labor participation. This reduces the necessity for women to seek employment, ultimately limiting their financial independence and empowerment. Dhaka and Chittagong benefit from a strong presence of government-led initiatives promoting women’s education and entrepreneurship. Cultural factors further influence the gap in women’s empowerment between these regions. Sylhet remains more conservative regarding gender roles, with societal norms often restricting women’s mobility and decision-making power.
In contrast, Dhaka, Chittagong, Rangpur, Rajshahi, and Mymensingh have undergone gradual cultural shifts that support greater gender equality and encourage female participation in various sectors. Infrastructure and urbanization also contribute to the disparities. Dhaka and Chittagong offer better mobility, safety, and access to essential resources like education, healthcare, and financial services, which directly support women’s empowerment. Political engagement and legal awareness also vary across these divisions. In urban areas like Dhaka and Chittagong, women are more engaged in politics and local governance, allowing them to advocate for their rights and make informed decisions. Sylhet, however, has lower levels of female political participation and awareness of legal rights, which further hinders overall empowerment.

The wealth index is an influential determinant of women’s empowerment, as it directly affects their access to essential resources and opportunities [44]. Women from wealthier households are more likely to have access to quality education, which enables them to acquire the skills and knowledge necessary for greater independence and decision-making power. Financial resources also enable women to pursue economic opportunities, such as jobs or businesses, fostering economic independence and greater influence within the household and society [45]. Additionally, wealth improves access to healthcare services, contributing to better health and well-being, which is important for women to lead active and empowered lives. Wealthier women are also more likely to have greater social mobility, enabling them to participate in public life, challenge traditional gender norms, and advocate for their rights. Furthermore, access to technology and information, which is often more readily available to wealthier women, enhances their ability to connect with opportunities and engage in social and economic activities. Overall, the wealth index plays a central role in enhancing women’s autonomy, decision-making power, and overall empowerment.

Working status is also an important determinant of women’s empowerment, and promoting women’s employment opportunities is therefore a key policy aim for improving their general empowerment and socio-economic well-being.

Education is an influential determinant of women’s empowerment, and this finding is corroborated by earlier studies [46,47]. The education of both the husband and the woman plays a crucial role in shaping women’s empowerment, as it significantly influences a woman’s autonomy, decision-making ability, and access to resources. An educated husband is more likely to support his wife’s autonomy, including her education, career, and participation in community activities. This support fosters a more equal partnership, enabling women to have a voice in important family decisions, such as finances, healthcare, and child-rearing. Additionally, educated husbands tend to adopt more progressive attitudes toward gender roles, promoting gender equality within the household. Women’s education is perhaps the most direct factor influencing their empowerment. When women are educated, they gain the skills, knowledge, and confidence to make informed decisions about their own lives. Education enables women to enter the workforce, achieve financial independence, and gain a voice in family and community matters. It also helps them challenge traditional gender norms and fight against discriminatory practices. Educated women are more likely to be aware of their rights, demand better health services, and invest in their children’s education and well-being, creating a cycle of empowerment that can be passed on to future generations. Both the husband’s and the woman’s education contribute to creating an environment of mutual respect, shared responsibilities, and equal opportunities, making them key determinants of women’s empowerment.

5. Conclusion

This study sought to explore the influential determinants of women’s empowerment in Bangladesh using ML algorithms. Eight widely used ML algorithms were applied to build predictive models, with variables selected using two feature selection techniques: LR and Boruta. The RF-based model outperformed the others in predictive performance, revealing that age, geographic division, wealth index, and both the husband’s and the woman’s educational levels are influential determinants of women’s empowerment in Bangladesh. These findings emphasize the importance of education, socio-economic status, and regional factors in empowering women in Bangladesh and offer valuable insights for targeted policy interventions. By improving educational opportunities for women, addressing socio-economic disparities, and accounting for regional variations, policymakers can design more effective strategies to promote gender equality and empower women across the country.

Acknowledgments

The authors thank the Bangladesh Demographic and Health Survey (BDHS) for providing the nationally representative data collected in 2022. The authors also thank the editor and reviewers for their valuable comments, which helped improve the manuscript.

References

  1. Bayeh E. The role of empowering women and achieving gender equality to the sustainable development of Ethiopia. Pacific Science Review B: Humanities and Social Sciences. 2016;2(1):37–42.
  2. Batool H, Afzal M. Role of women’s empowerment in achieving sustainable development goals: empirical evidence from Central Punjab-Pakistan. Webology. 2021;18(6).
  3. Ogbari ME, Folorunso F, Simon-Ilogho B, Adebayo O, Olanrewaju K, Efegbudu J, et al. Social Empowerment and Its Effect on Poverty Alleviation for Sustainable Development among Women Entrepreneurs in the Nigerian Agricultural Sector. Sustainability. 2024;16(6):2225.
  4. Dhiman DB. Education’s role in empowering women and promoting gender inequality: A critical review. 2023.
  5. UN Women. Empowering women at work: government laws and policies for gender equality. 2021.
  6. Ernst KP, Pagot R, Prá JR. Sustainable development goal 5: Women’s political participation in South America. World Development Sustainability. 2024;4:100138.
  7. Kabir R, Rahman S, Monte-Serrat DM, Arafat SY. Exploring the decision-making power of Bangladeshi women of reproductive age: Results from a national survey. South East Asia Journal of Medical Sciences. 2017;4–8.
  8. Srivastava V, Srivastava A. Determinants of Women Empowerment in Uttarakhand. International Journal of Research in Economics and Social Sciences. 2017;7:303–14.
  9. van Dongen D-M, Obrizan M, Shymanskyi V. Determinants of women’s empowerment in Nepal. PLoS One. 2024;19(9):e0310266. pmid:39259759
  10. Shabnaz S, Shajahan B. Factors influencing empowerment of employed women: A case study on RMG sector of Bangladesh. 2017.
  11. Zinia J. The factors affecting on women participation in household decision making in Dhaka city: A sociological study. GSJ. 2022;10(11).
  12. Akter S, Hosen MS, Khan MS, Pal B. Assessing the pattern of key factors on women’s empowerment in Bangladesh: Evidence from Bangladesh Demographic and Health Survey, 2007 to 2017–18. PLoS One. 2024;19(3):e0301501.
  13. National Institute of Population Research and Training (NIPORT), ICF. Bangladesh Demographic and Health Survey 2022: Final Report. Dhaka, Bangladesh: NIPORT and ICF. 2024.
  14. Kirkwood EK, Raihana S, Alam NA, Dibley MJ. Women’s participation in decision‐making: Analysis of Bangladesh Demographic and Health Survey data 2017–2018. J of Intl Development. 2024;36(1):26–42.
  15. Cheng X. A Comprehensive Study of Feature Selection Techniques in Machine Learning Models. Ins Comput Signal Syst. 2024;1(1):65–78.
  16. Manikandan G, Pragadeesh B, Manojkumar V, Karthikeyan AL, Manikandan R, Gandomi AH. Classification models combined with Boruta feature selection for heart disease prediction. Informatics in Medicine Unlocked. 2024;44:101442.
  17. Al_Bairmani ZAA, Ismael AA. Using Logistic Regression Model to Study the Most Important Factors Which Affects Diabetes for The Elderly in The City of Hilla / 2019. J Phys: Conf Ser. 2021;1818(1):012016.
  18. Sserwanja Q, Mukunya D, Habumugisha T, Mutisya LM, Tuke R, Olal E. Factors associated with undernutrition among 20 to 49 year old women in Uganda: a secondary analysis of the Uganda demographic health survey 2016. BMC Public Health. 2020;20(1):1644. pmid:33143673
  19. Maxwell KL. Logistic regression analysis to determine the significant factors associated with substance abuse in school-aged children. 2009.
  20. Kursa MB, Rudnicki WR. Feature Selection with the Boruta Package. J Stat Soft. 2010;36(11).
  21. Bhalaji N, Kumar KBS, Selvaraj C. Empirical study of feature selection methods over classification algorithms. IJISTA. 2018;17(1/2):98.
  22. Lu Y, Ye T, Zheng J. Decision Tree Algorithm in Machine Learning. In: 2022 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA), 2022.
  23. Liu Y, Wang Y, Zhang J. New machine learning algorithm: Random Forest. In: 2012. 246–52.
  24. Biau G, Scornet E. A random forest guided tour. TEST. 2016;25(2):197–227.
  25. Bayes T. Naive bayes classifier. Article Sources and Contributors. 1968;1–9.
  26. Peretz O, Koren M, Koren O. Naive Bayes classifier – An ensemble procedure for recall and precision enrichment. Engineering Applications of Artificial Intelligence. 2024;136:108972.
  27. Agatonovic-Kustrin S, Beresford R. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J Pharm Biomed Anal. 2000;22(5):717–27. pmid:10815714
  28. Hassoun MH. Fundamentals of artificial neural networks. MIT Press. 1995.
  29. LaValley MP. Logistic regression. Circulation. 2008;117(18):2395–9. pmid:18458181
  30. Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016. 785–94.
  31. Natekin A, Knoll A. Gradient boosting machines, a tutorial. Front Neurorobot. 2013;7:21. pmid:24409142
  32. Gaye B, Zhang D, Wulamu A. Improvement of support vector machine algorithm in big data background. Mathematical Problems in Engineering. 2021;2021(1):5594899.
  33. Suthaharan S. Support vector machine. Machine learning models and algorithms for big data classification: thinking with examples for effective learning. 2016:207–35.
  34. Rainio O, Teuho J, Klén R. Evaluation metrics and statistical tests for machine learning. Sci Rep. 2024;14(1):6086. pmid:38480847
  35. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems. 2017;30.
  36. Langworthy M. Women’s microenterprise and the SDGs: reframing success in women’s economic development in Sri Lanka. Journal of International Women’s Studies. 2024;26(1):7.
  37. Prima AT, Thity NT, Rois R. Risk predictors selection and predict for the first-day neonatal mortality in Bangladesh using machine learning techniques. J Clin Images Med Case Rep. 2022;3(1).
  38. Taye EA, Woubet EY, Hailie GY, Arage FG, Zerihun TE, Zegeye AT, et al. Application of the random forest algorithm to predict skilled birth attendance and identify determinants among reproductive-age women in 27 Sub-Saharan African countries; machine learning analysis. BMC Public Health. 2025;25(1):901. pmid:40050868
  39. Thity NT, Rahman A, Dulmini A, Yasmin MN, Rois R. An illustration of multi-class roc analysis for predicting internet addiction among university students. PLoS One. 2025;20(7):e0325855. pmid:40690508
  40. Polin JA, Sarker MdFH, Dolon MDK, Hasan N, Rahman MdM, Vasha ZN. Predicting the effects of microcredit on women’s empowerment in rural Bangladesh: using machine learning algorithms. Bulletin EEI. 2024;13(4):2699–706.
  41. Jamil M, Bukhari K. Intergenerational Comparison of Women Empowerment and its Determinants. Int j women emp. 2020;6:30–46.
  42. Mou SN. Women’s Empowerment through Higher Education and Employment in Bangladesh. jgcs. 2024;4(2):39–66.
  43. Mamun MAA, Hoque MM. The impact of paid employment on women’s empowerment: A case study of female garment workers in Bangladesh. World Development Sustainability. 2022;1:100026.
  44. Voronca D, Walker RJ, Egede LE. Relationship between empowerment and wealth: trends and predictors in Kenya between 2003 and 2008-2009. Int J Public Health. 2018;63(5):641–9. pmid:29159537
  45. Akhmadi A, Amaliyah E. Women empowerment and its relationship with wealth index and COVID-19 prevention. IJPHS. 2022;11(2):391.
  46. Bushra A, Wajiha N. Assessing the Socio-economic Determinants of Women Empowerment in Pakistan. Procedia - Social and Behavioral Sciences. 2015;177:3–8.
  47. Jannah N. Analyzing the role of education in women empowerment in Bangladesh. KDI School. 2020.