
Microfinance institutions failure prediction in emerging countries, a machine learning approach

Abstract

This study is about what matters: predicting when microfinance institutions might fail, especially in places where financial stability is closely linked to economic inclusion. The challenge is creating something practical and usable. This is where the Adjusted Rough-Granular Model (ARGM) comes in. It combines techniques such as granular computing and machine learning to handle messy and imbalanced data, ensuring that the model is not just a theoretical concept but a practical tool that can be used in the real world. Data from 56 financial institutions in Peru were analyzed over almost a decade (2014–2023), and the results were promising: the model detected failures with nearly 90% accuracy and was right more than 95% of the time in identifying safe institutions. What does this mean in practice? When tested, the model flagged six institutions (20% of the evaluation sample) as high risk. The impact of such a tool on emerging markets would be significant: financial regulators could act in advance, potentially preventing financial disasters. This is not just a theoretical exercise but a practical answer to a pressing problem in markets where every failure has domino effects on small businesses and clients in local communities, who may see their life savings lost when these institutions collapse. Ultimately, this research is not just about a machine learning model or about using statistics to evaluate results. It is about giving regulators and supervisors of financial institutions a tool they can rely on to act before it is too late when microfinance institutions fall into bad financial shape, and to make immediate decisions in the event of a possible collapse.

Introduction

Throughout this research, it becomes abundantly clear that microfinance institutions, or MFIs, have an important job helping in places where money and resources are scarce. They fill a surprisingly large gap left by traditional banks that ignore smaller, often overlooked businesses [1]. By offering loans, savings options, insurance, and even lessons on managing finances, MFIs sincerely intend to improve things in communities that need a boost, making things fairer regarding who has money and who doesn’t [2].

They are focused on creating more jobs, boosting businesses, and ensuring more people have a fair chance at growth, but while MFIs are doing important work, they still struggle to win new clients with their products [3]. Their efforts to reach clients who do not normally get enough attention from mainstream banks mean that they are heavily involved in places where the financial situation can change very quickly [4]. This is a major problem because if an MFI fails, it is not simply a mistake: it can lead to serious consequences, such as people losing their savings immediately, and it can make it even harder for everyone to have a fair shot at economic opportunities in the future [5]. This is why MFIs are a big deal. In theory, if not always in practice, they are supposed to break down barriers to make economies stronger and more inclusive.

That part is key: they are in this to fight the monetary inequality that keeps many people in poverty [3]. Here, in this situation, that noble goal is in trouble. The problem is that the things that make MFIs able to help people also put them at risk of collapsing if things go wrong, particularly during tough economic times.

This fear is not limited to a few people losing out; it is mostly about the possibility of implementing strategies that will improve the lives of many people and even entire countries. The research is focused on building an intelligent and informed predictive model just for microfinance [6].

On the other hand, the usual ways of predicting when financial institutions might fail do not capture all the special aspects of microfinance in economies that are still developing. Understanding when these institutions are likely to get into trouble is vitally important to keep everyone’s money safe, to verify that the economy remains on track, and to preserve the remarkably positive contributions that microfinance has made. For that, we need better analytical tools.

Something needs to be created that could do the job. This proposal looks at specific things that warn us if a microfinance institution might be in trouble. Giving these more accurate tools to individuals or people who need to be on the lookout for financial risks means they can intervene earlier and more effectively. In addition, the work helps to understand how to keep microfinance strong, which is an important issue in helping poorer areas grow economically without leaving anyone behind.

In doing all this, the study significantly changes the way the concentrated environment, or world, of microfinance is understood and protected. This is not just about theory; it is primarily about ensuring that these banks and microfinance institutions can continue to do their work without unexpectedly failing completely.

Theoretical and empirical literature review

Financial problems are rapidly aggravated by the difficulties suffered by institutions, exacerbating macroeconomic deterioration. When banks are in trouble, things spiral out of control: jobs disappear, loans become extremely difficult to obtain, and the system descends into chaos [7]. Because of this, regulatory institutions established stricter rules to prevent banks from losing financial soundness and collapsing so easily [8]. Furthermore, predicting whether a bank or microfinance institution will fail has become much more complicated over the years [9,10].

The evolution of financial failure prediction models reflects the progressive development of increasingly sophisticated analytical techniques. From Altman’s seminal work with discriminant analysis [11], which established the importance of financial ratios in failure prediction, progress was made towards Logit and Probit models that improved predictive capacity. In parallel, the CAMEL model [12] emerged as a comprehensive framework for assessing the financial health of banking institutions, proving particularly effective in emerging markets [13] despite its limitations in capturing systemic risk [14].

Since the 1990s, machine learning, a technology that learns from patterns in data, has advanced considerably. Researchers began using neural networks [15,16] and decision trees [17] to improve the prediction of bank failures.

These algorithms became very accurate [18]. Still, the problem is that they can be difficult to interpret, or they can learn the wrong lessons from hidden problems in their training data [18]. Put simply, we have become experts at predicting when banks might fail, but understanding what these technological tools are actually doing is another story [9].

To set the stage, we review what the literature on predicting financial failure has focused on. Three families of indicators stand out: financial ratios [19], macroeconomic variables [8], and capital market indicators [20]. In emerging markets, where many microfinance institutions are not publicly traded, predictive models have focused primarily on financial ratios, combining traditional metrics from the Altman model [11] with financial sector-specific indicators such as those from the CAMEL framework [12].

Recent studies have demonstrated the effectiveness of this approach in similar contexts [21,22], particularly when the indicators are adapted to the specific characteristics of the local market [23].

Researchers have mixed classic components from Altman’s Z-score [24] with indicators tailored to the lending business, courtesy of the CAMEL checklist. This has worked well, especially when tweaked to fit the local market. More recently, failure prediction in these markets has undergone a major makeover: the field has moved from Altman-style ratio analysis into machine learning, which is now taking center stage.

Combining Random Forest and Gradient Boosting [25] after SMOTE preprocessing noticeably improves accuracy [26]. Going further, deep learning [27] and feature selection with XGBoost [28] yield very high-performing models.

The selected financial ratios (S1 Appendix) are grouped into five categories following the CAMEL framework: Capital Adequacy (8 ratios), Asset Quality (11 ratios), Management Capability (5 ratios), Profitability (4 ratios), and Liquidity (4 ratios). This selection integrates traditional Altman indicators with metrics specific to the microfinance sector. Although this model has limitations inherent to financial statements, the reliability of these reports means that the selected variables have adequate predictive power [29,30].

Materials and methods

Study sample

The Peruvian microfinance sector represents a relevant case study due to its high level of development and its pioneering role in the region [new high-impact reference on microfinance in emerging markets]. This allows for evaluating the effectiveness of predictive models in a mature market with characteristics of emerging economies.

Data processing

Table 1 displays the status of Peruvian financial institutions between 2014 and 2023. Many of them were affected by economic crises, regulatory changes introduced by successive governments, and the Covid-19 pandemic. Each sample is described by 39 financial-ratio attributes. All data were grouped into a single dataset to be assessed by the model (S1 Database). Each dataset analyzed by period has unbalanced characteristics (S2 Appendix). In addition, the datasets have missing values: the percentage of cases with missing values ranged from 13% to 15%, which made proper preprocessing of the original data challenging. A more detailed analysis of the data decomposition complexity was performed using the ARGM algorithm.

Therefore, the financial ratios to be applied in our model were reduced to 32, eliminating those with 40% or more of missing data to ensure more accurate results in the classification model.
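As an illustration, the filtering step described above (dropping ratios with 40% or more missing data) can be sketched in pandas. The column names and data here are hypothetical stand-ins, not the study’s actual ratios:

```python
import numpy as np
import pandas as pd

# Hypothetical ratio columns; "sparse_ratio" stands in for an attribute
# with too many missing values.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)),
                  columns=["roa", "npl_ratio", "leverage", "sparse_ratio"])
df.loc[df.sample(frac=0.5, random_state=0).index, "sparse_ratio"] = np.nan

# Keep only ratios with less than 40% missing data.
threshold = 0.40
missing_share = df.isna().mean()
kept = df.loc[:, missing_share < threshold]
print(sorted(kept.columns))  # "sparse_ratio" is dropped
```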

Quality metrics

Inter-class imbalance is an inherent challenge in financial failure prediction [31], where failing institutions (minority class) represent a small but critical sample fraction. This imbalance affects the predictive performance of traditional machine learning models [32], requiring specific balancing techniques and more suitable evaluation metrics than conventional accuracies, such as ROC and Precision-Recall curves [33].

To attenuate class imbalance, the literature [31] identifies four important treatments: (1) machine learning algorithms that modify the learning process, (2) oversampling techniques such as SMOTE, (3) cost-sensitive methods that adjust the classification weights of the imbalanced model, and (4) ensembles of machine learning algorithms that combine multiple classifiers.
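To illustrate treatment (2), a minimal SMOTE-style interpolation can be sketched in a few lines of NumPy. This is a simplified illustration of the idea (interpolating between a minority sample and one of its nearest minority neighbours), not the reference SMOTE implementation, and the minority-class data here are synthetic:

```python
import numpy as np

def smote_like(X_min, n_new, k=3, seed=0):
    """Minimal SMOTE-style sketch: each synthetic point lies on the
    segment between a minority sample and one of its k nearest
    minority neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)  # distances to sample i
        neighbours = np.argsort(d)[1:k + 1]           # skip the sample itself
        j = rng.choice(neighbours)
        gap = rng.random()                            # interpolation factor
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

rng = np.random.default_rng(1)
X_minority = rng.normal(loc=2.0, size=(12, 5))  # e.g. failing institutions
X_new = smote_like(X_minority, n_new=30)
print(X_new.shape)  # (30, 5)
```

In practice, the `SMOTE` class from the imbalanced-learn library performs this oversampling for real pipelines.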

This study’s approach, which integrates SMOTE with machine learning techniques (Random Forest and Gradient Boosting), leads to a notable improvement in predictive capacity on imbalanced data [34]. In the data collected from 56 financial institutions in Peru from 2014 to 2023, including banks and microfinance institutions that serve small communities, only a small slice of cases, 11.44% to be exact, corresponded to institutions in distress. This stark disparity underscores the necessity for specific balancing techniques and justifies this study’s approach.

The sample comprises 56 Peruvian financial institutions (including banks, microfinance institutions, and municipal/rural financial institutions) observed during the period 2014–2023, with an imbalance ratio (IR) of 18.67. This asymmetry in the data required specific assessment metrics for imbalanced data, including sensitivity, specificity, F1 score, MCC, and AUC [23].

Traditional metrics such as accuracy prove inadequate in financial failure prediction contexts, where misclassification costs are asymmetric [9]. Therefore, our analysis focuses mainly on the area under the ROC curve (AUC) and the Matthews correlation coefficient (MCC), complemented with specific metrics for unbalanced data such as sensitivity and specificity. This combination of metrics allows us to evaluate the model performance robustly, especially considering the critical cost of misclassifying a failing institution as healthy [33].
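These metrics can all be computed with scikit-learn. As a sanity check, reconstructing the baseline test-set confusion matrix reported below (71 true negatives, 2 false positives, 5 false negatives, 4 true positives) reproduces the headline figures and shows why accuracy alone misleads:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score,
                             matthews_corrcoef, recall_score)

# Reconstruct the baseline test-set confusion matrix
# (71 TN, 2 FP, 5 FN, 4 TP; failing institutions are the positive class).
y_true = np.array([0] * 73 + [1] * 9)
y_pred = np.array([0] * 71 + [1] * 2 + [0] * 5 + [1] * 4)

sensitivity = recall_score(y_true, y_pred)               # TP / (TP + FN)
specificity = recall_score(y_true, y_pred, pos_label=0)  # TN / (TN + FP)

print(round(accuracy_score(y_true, y_pred), 4))    # 0.9146 -- looks great...
print(round(sensitivity, 4))                       # 0.4444 -- ...misses failures
print(round(specificity, 4))                       # 0.9726
print(round(f1_score(y_true, y_pred), 4))          # 0.5333
print(round(matthews_corrcoef(y_true, y_pred), 4))
```

The gap between accuracy and sensitivity is exactly the asymmetry discussed above: high overall accuracy can coexist with failing to detect most at-risk institutions.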

Proposed approach: Rough granular computing modeling (RGCM)

Building upon previous research on failure prediction in financial institutions [25], we propose an enhanced approach that addresses the class imbalance problem through selective synthetic sampling [35]. Our work extends recent developments in financial credit risk assessment [36] by incorporating a granular approach using manually adjusted machine learning ensemble models to generate minority class examples in homogeneous feature spaces. This approach is particularly relevant for highly imbalanced financial datasets, where traditional methods often underperform [37].

Hence, the proposed Adjusted Rough-Granular Model (ARGM) integrates rough set theory and granular computing with ensemble techniques like Random Forest (RF) and Gradient Boosting (GB) [34,38,39]. This hybrid approach aims to improve classification performance in uncertain and high-dimensional imbalanced data scenarios while maintaining model interpretability. The ARGM consists of six key steps:

  1. Model Ensemble: Combines RF and GB for robust prediction
  2. Data Granules Formation: Segments data using granular computing principles
  3. Granules Evaluation: Identifies uncertainty areas using rough sets
  4. Processing Mode Selection: Optimizes classifier selection per granule type
  5. Synthetic Sampling: Generates balanced training data using SMOTE
  6. Inconsistent Sample Filtering: Removes noise through tolerance-based rough sets.

The model validation employed 5-fold cross-validation with GridSearchCV optimization, achieving accuracy scores of 98.51% and 95.52% for Random Forest and Gradient Boosting, respectively.
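A sketch of this validation setup, assuming scikit-learn, synthetic data, and an illustrative parameter grid (the paper’s actual grid is not reproduced here):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic imbalanced data standing in for the financial-ratio dataset.
X, y = make_classification(n_samples=300, n_features=10, weights=[0.9, 0.1],
                           random_state=0)

# Stratified 5-fold CV preserves the class proportions in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="accuracy",
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 4))
```

The same wrapper applies unchanged to a `GradientBoostingClassifier`, which is how the two ensemble members can be tuned and compared.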

Results and discussion

An analysis of the attribute distributions showed that 25 attributes are symmetrical, suggesting that most of the data are evenly distributed, which suits the model developed in this research. Subsequently, we split the dataset using random sampling, with 70% of the data for training and 30% for testing.

Model performance comparison

In Table 2, the base logistic regression model’s performance on the imbalanced dataset (IR = 18.67) revealed a significant bias toward the majority class. While achieving 91.46% overall accuracy, it showed poor failure detection capability (sensitivity = 0.4444).

SMOTE rebalancing dramatically improved minority class detection while maintaining high specificity. Most notably, sensitivity increased by 45.76 percentage points in the test set, with minimal specificity loss (-1.18%). The F1-Score improvement (+0.3960) and AUC gain (+0.2225) confirm the enhanced model’s superior discriminative power (see Fig 1).

The ROC (Receiver Operating Characteristic) curve of the base logistic regression model on the imbalanced dataset (IR = 18.67) confirms the significant bias toward the majority class.

After the 32 financial ratios were prepared, the model was first evaluated without applying the SMOTE balancing technique. From the sample of 271 records, 189 (70%) were used for training and 82 (30%) for testing, as shown in Table 3. We constructed a simple logistic regression classifier and examined its results without SMOTE.
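A minimal sketch of this baseline, using synthetic data of the same shape (271 records, 32 attributes, roughly 9:1 imbalance, 70/30 stratified split) rather than the study’s dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the study's sample.
X, y = make_classification(n_samples=271, n_features=32, weights=[0.9, 0.1],
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)  # 189 train / 82 test

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

On imbalanced data such a baseline typically produces few false positives but a high share of false negatives, the pattern reported in Table 3.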

Table 3. Confusion matrix for training and test data set.

https://doi.org/10.1371/journal.pone.0321989.t003

Table 3 shows the model’s performance in training and test sets. In the training set, with 189 records, the model correctly predicted 167 healthy institutions with no false positives. However, it misclassified ten failing institutions as healthy (false negatives). In the test set containing 82 records, the model correctly classified 71 healthy and four failing institutions. However, five failing institutions were classified as healthy (false negatives), and two healthy institutions were incorrectly classified as failing (false positives). The ROC curve shown is in Fig 2.

Fig 2. ROC curves of the training and test datasets without SMOTE.

https://doi.org/10.1371/journal.pone.0321989.g002

Initial logistic regression without SMOTE revealed limitations in handling the imbalanced dataset (IR = 18.67). While achieving high overall accuracy (91.46% in the test set), the model showed poor performance in identifying institutions at risk of failure, with a sensitivity of only 0.4444 in the test set (see Table 4).

Table 4. Comparison of the base model versus smote-enhanced.

https://doi.org/10.1371/journal.pone.0321989.t004

The SMOTE-enhanced model significantly improved minority class detection while maintaining high specificity. Most notably, sensitivity increased from 0.4444 to 0.9020 in the test set, demonstrating robust generalization. The substantial improvement in F1-Score (0.5333 to 0.9293) and AUC (0.7085 to 0.9310) confirms the effectiveness of the balanced approach. Fig 3 shows the ROC curve comparison, highlighting the enhanced discriminative power of the SMOTE-based model.

Fig 3. ROC curves of the training and test datasets with SMOTE.

https://doi.org/10.1371/journal.pone.0321989.g003

Overall, the model shows good performance, but false negatives in both sets (training and testing) still represent a risk, as they might not detect institutions in danger of bankruptcy, which is critical in a financial environment.

Table 5 shows an accuracy of 94.71% on the training dataset and 91.46% on the test dataset. One could say the model has performed exceptionally well, but one should not rely on this first result; the other reported metrics must be examined. The model has high specificity in both sets (0.99 and 0.9726), indicating that it identifies healthy institutions well. However, sensitivity is low, especially in the test set (0.4444), reflecting difficulties in detecting failures. Although precision is high in training (0.99), it decreases significantly in the test set (0.6667), indicating a deterioration in performance outside of training. The drop in F1-Score and MCC suggests that the class balance needs to be improved, possibly using techniques such as SMOTE.

Table 5. Train and test classification evaluation metrics.

https://doi.org/10.1371/journal.pone.0321989.t005

With the same 32 financial ratios, a new analysis was performed, this time balancing the classes with the SMOTE technique. Table 6 shows that in the training set the model correctly classified all instances of both classes (116 healthy and 116 failing), suggesting perfect performance. In the test set, the model correctly classified 51 healthy and 51 failing institutions.

The balanced model achieves an accuracy of 93.97% on the training dataset and 93.14% on the test dataset. To better assess the results, we review the evaluation metrics for the training and testing datasets in Table 7.

Table 7. Training and test classification evaluation metrics with SMOTE.

https://doi.org/10.1371/journal.pone.0321989.t007

The SMOTE metrics table shows the model’s performance in the training and test sets. In the training set, with a specificity of 0.9741 and a sensitivity of 0.9052, the model correctly identifies healthy and failing institutions. The precision (0.9722) and F1-Score (0.9375) reflect a good balance between precision and sensitivity, with an MCC of 0.8810 indicating a strong correlation. Although the metrics decrease slightly in the test set, performance remains high, with a specificity of 0.9608 and a sensitivity of 0.9020, demonstrating good generalization. The precision (0.9583) and F1-Score (0.9293) are equally robust, and the MCC (0.8640) indicates that the model maintains good predictive ability. The AUC values (0.940 in training and 0.931 in test) confirm the model’s ability to discriminate between classes in both sets.

Construction of ARGM model

The Rough-Granular Approach [40] combines rough set theory and granular computing to handle uncertainty and complexity in data, especially in unbalanced data contexts. It partitions the feature space into granules based on similarity, improving classification in difficult problems. The model proposed in this research, the Adjusted Rough-Granular Model (ARGM), is a simplified variant using Random Forest [41] and Gradient Boosting [42], as shown in Fig 4. It applies incremental and granular learning, combining multiple “granules”, or smaller models, to create a global view and improve performance on unbalanced data.
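A deliberately simplified sketch of the granular idea (an illustration of the principle, not the paper’s exact ARGM): partition the feature space into granules, fit one classifier per granule, and route each new sample to the model of its nearest granule. The k-means clustering step and the per-granule Random Forest are assumptions made for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic imbalanced data standing in for the financial-ratio dataset.
X, y = make_classification(n_samples=400, n_features=8, weights=[0.85, 0.15],
                           random_state=0)

# Step 1: form "granules" by partitioning the feature space.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Step 2: fit one classifier per granule (Random Forest here; the paper's
# ensemble also uses Gradient Boosting).
models = {}
for g in range(3):
    mask = km.labels_ == g
    models[g] = RandomForestClassifier(n_estimators=50,
                                       random_state=0).fit(X[mask], y[mask])

# Step 3: route each new sample to the model of its nearest granule.
def predict(X_new):
    granules = km.predict(X_new)
    return np.array([models[g].predict(x.reshape(1, -1))[0]
                     for g, x in zip(granules, X_new)])

print(predict(X[:5]))
```

The local models specialize in homogeneous regions of the feature space, which is the intuition behind improving classification where classes are imbalanced and heterogeneous.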

Five-fold cross-validation was used; the stratified k-fold version of the method ensured that the percentage of samples from each class was preserved in every fold. The GridSearchCV search and optimization routine from the scikit-learn Python library was applied to tune the model hyperparameters (Figs 5 and 6).

Fig 6. Gradient boosting search and optimization technique.

https://doi.org/10.1371/journal.pone.0321989.g006

With the calibrated data from the previous process, we searched for the best hyperparameter scores (accuracy) for each model and obtained values of 98.51% and 95.52%, with processing times of 15.2 and 6.41 seconds, for Random Forest and Gradient Boosting respectively. The model with the highest score was selected.

Table 8 compares the classification results of two approaches, SMOTE and ARGM (Adjusted Rough-Granular Model), evaluating their performance regarding correct and incorrect predictions for Healthy and Failure classes.

The SMOTE model performs well overall, with few false positives and false negatives, handling both the majority and minority classes well after oversampling. On the other hand, the ARGM model is stronger in predicting the minority (failure) class, although with more false negatives. It is more effective in scenarios where correctly identifying failing institutions is crucial. To validate the results, several metrics were used in Table 9.

As shown in Table 9, ARGM reflects a good performance of the classification model. With a specificity of 0.9552 and a sensitivity of 0.8955, the model correctly identifies healthy and failing institutions. The precision of 0.9524 and F1-Score of 0.9231 indicate a good balance between precision and sensitivity. The MCC (0.8520) and AUC (0.9250) confirm a strong correlation between predictions and actual classes and an excellent ability to discriminate between classes. Fig 7 plots the AUC for the SMOTE and ARGM.

Theoretical and methodological implications

The Adjusted Rough-Granular Model (ARGM) significantly advances the literature on microfinance bankruptcy prediction by addressing traditional models’ critical limitations. Previous studies, such as Altman’s seminal work on Z-scores and discriminant analysis [11], and subsequent refinements using Logit and Probit models [16,19], demonstrated the importance of financial ratios in predicting bankruptcy but struggled with interpretability when extended to more complex machine learning techniques [15,27]. While recent advancements like Gradient Boosting and XGBoost [25,28] achieve high predictive accuracy, they often compromise interpretability, making them less practical for regulatory applications. By contrast, ARGM integrates granular computing principles and rough set theory to improve predictive accuracy (AUC = 0.9250) and interpretability. This dual advantage is critical in microfinance, where decision-makers require transparent models to assess risks effectively. The ARGM approach thus bridges a vital gap identified in the literature, combining state-of-the-art machine learning with practical usability, as highlighted by its robust performance even in highly imbalanced datasets.

Failure predictions with new dataset

We prepared a new sample of 30 records, including banks, finance companies, microfinance companies, municipal microfinance, and rural savings banks. As shown in Table 10, we maintained the same financial attributes used to validate the RGCM model and predict possible institutions with failure problems.

Table 10. Micro financial institutions assessed with RGCM.

https://doi.org/10.1371/journal.pone.0321989.t010

The Adjusted Rough-Granular Model, applied to evaluate 30 financial institutions with 32 financial ratio attributes validated in the previous data analysis, correctly identified that six institutions (20%) are experiencing financial health problems or failure. This indicates that the approach has effectively detected institutions at risk, crucial in financial risk prevention and management. Given the level of granularity in the prediction, the model seems to have accurately captured the entities most likely to face failure, contributing to early detection that could facilitate timely intervention. The 20/80 balance suggests that the model has a good ability to classify between failing and healthy entities.

Conclusions and managerial implications

This study uncovered three key insights that add to our understanding of predicting failures in microfinance institutions in emerging markets. First, the methodology we developed, the Adjusted Rough-Granular Model (ARGM), proved to be a significant improvement over traditional approaches. It achieved impressive results, with nearly 90% accuracy in identifying institutions at risk of failure and over 95% in spotting safe ones. This is important for a sector that relies heavily on early warning systems to mitigate financial risks and improve strategic decision-making [43,44].

Also, combining SMOTE balancing techniques with granular computing gave the model a serious boost in handling the usual class imbalance that makes failure prediction tricky. What stood out to us was how well the ARGM captured the intricate patterns behind financial distress without losing the interpretability that is so important for practical use [43]. With an AUC score of 0.9250, this hybrid approach offers both power and clarity, making it a valuable tool for anyone working to keep microfinance institutions stable and sustainable.

The Adjusted Rough-Granular Model (ARGM) brings something fresh and much-needed to predicting failures in microfinance institutions. For years, traditional models like Altman’s Z-scores and discriminant analysis have shown how useful financial ratios can be for this purpose. However, when these methods evolved into more complex approaches, like Logit or Probit models, and even modern machine learning techniques like Gradient Boosting and XGBoost, they faced a big challenge: interpretability. These newer models are great at making accurate predictions, but their complexity often makes it hard for decision-makers, especially regulators, to understand how the predictions are made. That’s where ARGM shines: integrating granular computing and rough set theory strikes a balance between accuracy and transparency.

Predicting financial risks in microfinance institutions is difficult, especially with imbalanced datasets. That’s where the Adjusted Rough-Granular Model (ARGM) steps in. This model combines techniques like granular computing and machine learning (think Random Forest and Gradient Boosting) to identify patterns linked to financial distress.

Here’s the thing: it doesn’t just give accurate predictions; it also makes them easy to understand. In one test with a dataset of 30 institutions, the model flagged six as high-risk. That 20% is a big deal if you’re trying to catch problems early. Better still, it helps decision-makers trust the results, so regulators and microfinance managers can act before things get out of hand. It’s not just a theoretical tool; it’s something people can use to make smarter decisions.

There’s a lot to unpack from our findings, especially for emerging markets. These regions face unique challenges: volatile economies, weaker regulatory systems, and the ease with which systemic shocks spread [1,7,8,45]. Microfinance institutions in these areas need tools to identify risks early to protect financial inclusion and avoid setbacks [2,3]. That’s where ARGM shows its value. It can spot financial distress with impressive accuracy, even when working with difficult datasets. But that’s not all. The model isn’t just accurate; it’s flexible too. It can adapt to other regions facing similar issues, like South Asia or Sub-Saharan Africa, where microfinance is essential for connecting underserved communities with financial services [5,9]. Notably, the ARGM uses granular computing to make its predictions clear and easy to understand, which means regulators and managers can use it to design interventions that match specific regional needs, based on local financial indicators [13,21]. This combination of adaptability and reliability makes the ARGM more than just a theoretical framework: it is a practical tool that can help strengthen financial stability and support economic growth in diverse emerging markets worldwide.

The model offers a valuable tool for regulators and supervisors to improve their monitoring of microfinance institutions. Using it, they could step in earlier and target interventions more effectively. Its high sensitivity means it is less likely to miss struggling institutions. For microfinance managers, the model’s interpretability provides clear insights into which financial indicators signal trouble, allowing them to take proactive steps and address risks before they become bigger problems. Our findings also suggest that regulatory frameworks should be more flexible, considering the unique challenges and vulnerabilities that microfinance institutions face in emerging markets. The model could likewise be useful for investors, helping them evaluate the financial health of institutions with more precision. That said, there are some limitations to point out. The dataset we used (271 observations from 56 institutions) is relatively small, which could affect how generalizable the results are. This isn’t unusual for studies in emerging markets, where data can be hard to come by, but it’s still something to keep in mind.

While the statistical analysis demonstrates that our model achieves strong performance metrics—with sensitivity, specificity, and AUC values indicating its effectiveness—it is important to discuss the inherent trade-offs, particularly in specificity. The enhanced sensitivity achieved through SMOTE rebalancing comes with a modest decrease in specificity. This indicates that, while the model is more adept at identifying at-risk institutions (reducing false negatives), it may slightly increase the misclassification of healthy institutions (false positives). In the context of microfinance, where the cost of overlooking a failing institution can be severe, this trade-off is considered acceptable; however, it warrants further investigation to ensure balanced decision-making. Furthermore, although our dataset of 56 Peruvian financial institutions collected over nearly a decade provides valuable insights, potential biases could arise from its representativeness and the data splitting process. Despite using stratified random sampling to preserve class distributions, variations in economic conditions and institutional characteristics across different regions and time periods might influence the model’s generalizability. Future research should aim to incorporate more diverse datasets and explore alternative cross-validation strategies to further mitigate any bias introduced during data partitioning.

Another limitation of the model is its reliance solely on financial ratios, which, while objective, might miss critical aspects such as governance or management effectiveness. Expanding the dataset to include more regions or adding non-financial variables could provide a more complete view of failure risks in microfinance institutions. Even so, this study offers valuable insights and practical tools for identifying risks early. The ARGM has proven reliable in handling imbalanced datasets and delivering accurate predictions. For regulators and managers, its validation with new data shows it is more than just theory; it is a tool with real-world potential. Future research could explore integrating macroeconomic trends and qualitative factors to make these models even more robust.
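In a deployment of the kind described, flagging high-risk institutions reduces to applying a decision threshold to each institution's predicted failure probability. A minimal sketch (institution names, scores, and the 0.5 cutoff are all hypothetical; real probabilities would come from the trained classifier):

```python
def flag_high_risk(scores, threshold=0.5):
    """Return the institutions whose predicted failure probability meets
    or exceeds the threshold.

    `scores` maps institution identifiers to predicted probabilities,
    e.g. taken from a classifier's predict_proba output.
    """
    return sorted(name for name, p in scores.items() if p >= threshold)

# Hypothetical predicted failure probabilities for illustration
scores = {"inst_A": 0.82, "inst_B": 0.12, "inst_C": 0.55, "inst_D": 0.30}
flagged = flag_high_risk(scores, threshold=0.5)
print(flagged)  # ['inst_A', 'inst_C']
```

The threshold itself is a supervisory choice: lowering it trades more false alarms for fewer missed failures, mirroring the sensitivity-specificity trade-off discussed above.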
