
RETRACTED: Advancing enterprise risk management with deep learning: A predictive approach using the XGBoost-CNN-BiLSTM model

Retraction

The PLOS One Editors retract this article [1] because it was identified as one of a series of submissions for which we have concerns about compromised peer review.

All authors either did not respond directly or could not be reached.

23 Feb 2026: The PLOS One Editors (2026) Retraction: Advancing enterprise risk management with deep learning: A predictive approach using the XGBoost-CNN-BiLSTM model. PLOS ONE 21(2): e0343285. https://doi.org/10.1371/journal.pone.0343285 View retraction

Abstract

Enterprise risk management is a key element to ensure the sustainable and steady development of enterprises. However, traditional risk management methods have certain limitations when facing complex market environments and diverse risk events. This study introduces a deep learning-based risk management model utilizing the XGBoost-CNN-BiLSTM framework to enhance the prediction and detection of risk events. This model combines the structured data processing capabilities of XGBoost, the feature extraction capabilities of CNN, and the time series processing capabilities of BiLSTM to more comprehensively capture the key characteristics of risk events. Through experimental verification on multiple data sets, our model has achieved significant advantages in key indicators such as accuracy, recall, F1 score, and AUC. For example, on the S&P 500 historical data set, our model achieved a precision rate of 93.84% and a recall rate of 95.75%, further verifying its effectiveness in predicting risk events. These experimental results fully demonstrate the robustness and superiority of our model. Our research is of great significance, not only providing a more reliable risk management method for enterprises, but also providing useful inspiration for the application of deep learning in the field of risk management.

Introduction

With the rapid development of the global economy, the number of listed companies in China continues to rise. However, changes in policy and economic environment, such as fluctuations in national policies or market supply and demand, can adversely affect the normal business activities of enterprises, leading to negative effects on economic development [1]. Enterprises face various risks, which may arise from factors such as market fluctuations, competitive pressures, policy changes, natural disasters, etc., posing potential threats to the stable operation and sustainable development of enterprises [2]. In today’s rapidly changing and uncertain business environment, effectively managing enterprise risks has become an important challenge for business managers and decision-makers. Traditional risk management methods often rely on historical data and experiential judgment, but when faced with increasingly complex and volatile risk environments, these methods have proven to be inadequate [3]. Existing approaches often fail to address the limitations posed by large-scale, unstructured, and time-series data, making it challenging for enterprises to accurately and timely predict risks. With the rapid development of deep learning technology, more and more researchers have begun to use deep learning techniques to address challenges in enterprise risk management. Deep learning, with its powerful feature learning and pattern recognition capabilities, can unearth potential patterns and correlations from massive data, providing more accurate and timely risk prediction and identification for enterprises [4]. In current research, people have attempted to use deep learning techniques to identify various types of risk events, including financial risks, market risks, supply chain risks, etc., and have achieved certain results. 
Among them, time series forecasting, as one of the important application areas of deep learning in enterprise risk management, has attracted widespread attention. Time series data contains data points that change over time, such as stock prices, sales volumes, weather changes, etc. These data have obvious time correlation and trend characteristics, reflecting the risk situation at different points in time [5,6]. Therefore, using deep learning techniques for time series forecasting can help enterprises to timely discover and respond to potential risks, thereby improving the efficiency and accuracy of risk management.

In recent years, the application of deep learning technology in the field of enterprise risk management has shown rapid growth. Here are four recent studies, each using a specific deep learning model, to explore enterprise risk management: One study used Long Short-Term Memory (LSTM) to predict stock price changes in financial markets [7]. Researchers utilized a large amount of historical stock price data to build an LSTM-based time series prediction model to forecast future stock price trends. However, the model faces challenges in handling extreme market conditions and sudden events, leading to decreased prediction accuracy, thus requiring further improvement to enhance robustness. Another study employed Convolutional Neural Networks (CNN) to identify potential risk factors in internal textual data of enterprises [8]. Researchers constructed a CNN-based text classification model to automatically identify risk-related information in internal documents of enterprises. However, the model has limitations in dealing with semantic information and highly uncertain textual data, necessitating more semantic modeling techniques to improve accuracy. Yet another study utilized deep reinforcement learning methods to optimize risk control strategies in supply chain management. Researchers proposed a Supply Chain Risk Management model based on Deep Q-Networks (DQN), which learns the optimal risk control strategy through interaction with the environment [9]. However, due to the high training complexity of deep reinforcement learning algorithms, the model may encounter issues of excessive computational costs in practical applications. Finally, some scholars explored the use of Bidirectional Long Short-Term Memory Networks (BiLSTM) for predicting financial market risk events [10]. Researchers built an event sequence prediction model based on BiLSTM to identify abnormal fluctuations and sudden events in financial markets. 
However, the model still needs improvement in handling long-term dependencies and abnormal events in the sequence, especially in noise and uncertainty processing.

These related works provide valuable insights into the application of deep learning technology in enterprise risk management. However, they also expose some challenges and limitations. Many existing models struggle with integrating different types of data (e.g., structured, unstructured, and time-series), handling long-term dependencies, and addressing the dynamic nature of risk events. Future research should focus on improving the performance and robustness of deep learning models to better address the complex and dynamic enterprise risk environment.

Building upon the deficiencies highlighted in the aforementioned studies, we propose the XGBoost-CNN-BiLSTM network to address the challenges present in the field of enterprise risk management. Our model integrates three components, XGBoost, CNN, and BiLSTM, combining the strengths of each to overcome the limitations of existing models and enhance the prediction and identification of risk events. The XGBoost model is applied to process structured data and time-series features, aiming to improve prediction accuracy. The CNN model is utilized to process textual and image data, facilitating the identification of potential risk factors. The BiLSTM model processes time-series data and sequential textual data, capturing long-term dependencies and enhancing identification accuracy. By integrating the strengths of these three components, our model effectively leverages information from structured data, textual data, and time-series data, thereby enhancing the efficiency and accuracy of enterprise risk management. This combination addresses the limitations identified in prior work, enabling our model to achieve superior performance in both predictive accuracy and risk event identification, even in complex and dynamic risk environments.

This paper makes the following contributions:

  • A Novel XGBoost-CNN-BiLSTM Framework for Risk Prediction: We propose a unique integration of XGBoost, CNN, and BiLSTM, specifically designed to comprehensively leverage structured data, image data, and time-series data for risk prediction. This approach surpasses traditional methods by more effectively capturing complex correlations and patterns among risk events, providing a more accurate and reliable prediction model for enterprise risk management.
  • Enhanced Real-Time Monitoring and Detection: By utilizing deep learning techniques, our model significantly improves the timeliness and accuracy of real-time risk monitoring and detection. Our research offers a solution for processing large-scale and complex data efficiently, which allows enterprises to identify risk signals promptly and adjust their risk management strategies in a timely manner, minimizing potential losses. This improvement in real-time capabilities sets our model apart from existing solutions.
  • Broader Application and Generalizability: Our model is not limited to specific datasets but demonstrates robust performance across diverse financial data sources, such as S&P 500, Bloomberg, and Thomson Reuters datasets. The generalizability of our model is validated across multiple datasets, showing its applicability in various enterprise risk management scenarios. This enhanced adaptability makes it more suitable for dynamic and complex environments than previously developed models.

Related work

The application of traditional methods in enterprise risk prediction

Traditional methods play a significant role in enterprise risk prediction. These methods are typically based on statistical principles and mathematical models, including but not limited to regression analysis and time series analysis. In regression analysis, models are built based on historical data to identify the correlation between different variables, thereby predicting future risk trends [11]. Time series analysis focuses on analyzing the regularity of data changes over time to predict future risk situations. In enterprise risk management, traditional methods are often used to analyze structured data such as financial statements and market data [12,13]. These methods have advantages such as simplicity, computational efficiency, and applicability to certain risk event prediction and identification scenarios. For example, regression analysis and time series analysis are computationally efficient for structured financial data, but their limited ability to model complex and nonlinear relationships reduces their effectiveness when applied to more dynamic and uncertain environments. However, traditional methods also have some limitations, such as weaker handling of nonlinear relationships and lower sensitivity to data [14]. In addition, these methods often struggle with processing unstructured data, such as text or images, and cannot effectively integrate different data types (e.g., structured, unstructured, and time-series data) to provide comprehensive risk insights. With the continuous increase and complexity of enterprise data, traditional methods may appear inadequate when facing large-scale, high-dimensional, and unstructured data, which has spurred the development and application of emerging technologies like deep learning.

The application of machine learning methods in enterprise risk prediction

Machine learning methods have wide applications in enterprise risk prediction. These methods utilize algorithms and models to learn patterns and rules from historical data, enabling the prediction of future potential risk events. In the field of enterprise risk management, machine learning methods include but are not limited to Support Vector Machine (SVM), Random Forest, and XGBoost [15]. Support Vector Machine (SVM) is a commonly used supervised learning method that has been widely applied in enterprise risk prediction [16,17]. SVM classifies or regresses data by finding the optimal hyperplane in high-dimensional space, suitable for handling structured data and nonlinear problems, effectively identifying and predicting potential risk factors. Random Forest is an ensemble learning algorithm consisting of multiple decision trees, capable of handling large-scale data and high-dimensional features, with strong generalization ability and robustness [18]. In enterprise risk prediction, Random Forest can be used for tasks such as feature selection, pattern recognition, and anomaly detection, providing comprehensive risk assessment and warning for enterprises [19]. However, these traditional machine learning models often rely heavily on feature engineering, which can limit their flexibility and scalability when dealing with complex and dynamic risk environments. Furthermore, their ability to handle unstructured data, such as textual or image data, is limited, making them less effective in providing comprehensive insights from multimodal data sources.

In recent years, advanced machine learning models like XGBoost have been increasingly applied due to their ability to handle structured data more efficiently and with better accuracy. XGBoost, with its gradient boosting framework, has been shown to outperform traditional models in certain risk prediction tasks. However, similar to SVM and Random Forest, it still faces challenges in processing time-series and unstructured data, highlighting the need for more integrated approaches that can leverage diverse data types for improved risk prediction.

In summary, machine learning methods play an important role in enterprise risk prediction, helping enterprises identify potential risk factors, reduce losses from risks, and improve operational efficiency and competitiveness. Despite their importance, current machine learning methods still have limitations in effectively integrating multimodal data and capturing long-term dependencies in time-series data, necessitating the exploration of hybrid models that can better address these challenges.

Application of multimodal data fusion in enterprise risk prediction

Multimodal data fusion involves integrating data from multiple sources and types, such as structured data, text data, and image data, to improve the accuracy and comprehensiveness of risk prediction [20,21]. In enterprise risk management, different types of data can provide rich information, but using only one type of data may be constrained by data limitations and model constraints [22]. Therefore, integrating multiple data types can complement the deficiencies of individual data and thus improve the performance and effectiveness of prediction models. Multimodal data fusion is crucial for delivering a comprehensive analysis of enterprise risk. It converges varied data types, like structured figures showcasing financial health and market standing, with textual and visual data that offer granular insights into specific circumstances [23,24]. This convergence not only fills in the gaps left by single data types but also bolsters the resilience and adaptability of predictive models against diverse data distributions and potential data insufficiencies. One of the key challenges in enterprise risk management is extracting meaningful insights from this diverse data landscape. Traditional models often struggle to fully integrate structured and unstructured data, leading to incomplete risk assessments. Multimodal data fusion addresses this limitation by combining information from various modalities, enabling a more holistic understanding of risk events and their potential impact.

Furthermore, recent advancements in enterprise risk management have leveraged more sophisticated models. For instance, the study on quantified risk networks (QRNs) proposed by [25] introduces a complex adaptive systems approach, which aligns well with our efforts to address multifaceted enterprise risks. Similarly, adversarial attack detection in deep learning models, as explored in [26], highlights the importance of robustness, a factor we have considered by utilizing our hybrid XGBoost-CNN-BiLSTM framework. In addition, the work of [27] on deep learning for enterprise sales prediction, using CNN-LSTM-Attention, further supports our approach by demonstrating the effectiveness of hybrid models in real-world applications. These studies provide valuable insights and support the novelty and application of our proposed framework.

Our XGBoost-CNN-BiLSTM framework effectively builds upon these advancements by integrating structured, unstructured, and time-series data to improve prediction accuracy and resilience. This hybrid model leverages the strengths of each modality, enhancing the capability to identify potential risks across diverse data types and delivering a more reliable and comprehensive enterprise risk prediction solution.

Materials and methods

Overview of our network

Our model adopts a hybrid XGBoost-CNN-BiLSTM structure for risk prediction and identification in enterprise risk management. Specifically, the model consists of the following three parts:

  • XGBoost Model: XGBoost is employed to handle structured data such as financial indicators and market data. Its role is to screen and extract critical structured features, thus providing the foundation for risk prediction.
  • CNN Model: Leveraging the excellent feature extraction capabilities of the CNN model, secondary extraction of feature indicators is conducted to better cover relevant factors and support the accuracy of subsequent prediction models.
  • BiLSTM Model: Bidirectional Long Short-Term Memory (BiLSTM) networks are utilized for processing textual and time-series data, such as sentiment text data and historical time series data. BiLSTM can capture the temporal dependencies and semantic information in the data, contributing to more accurate risk prediction.

These three components are combined to comprehensively utilize information from different types of data, enhancing the effectiveness of risk management. In the process of network construction, we first prepare the structured and unstructured data required for enterprise risk management, including financial indicators, market data, and sentiment text. Structured data is passed to the XGBoost model for feature selection and initial risk event prediction, yielding representative and predictive features. The features selected and extracted by XGBoost are then passed to the CNN model for secondary feature extraction and preprocessing, aiming to better cover relevant factors. Meanwhile, the feature data processed by CNN, together with the unstructured data, is passed to the BiLSTM model to capture temporal dependencies and semantic information. Finally, the XGBoost, CNN, and BiLSTM models are integrated to leverage their combined strengths, enhancing the effectiveness and sophistication of enterprise risk management. The overall structure diagram of the model is shown in Fig 1.
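As a rough illustration of this staged hand-off, the pipeline can be sketched in NumPy. Every stage here is a deliberately simplified stand-in of our own, not the paper's implementation: variance-based ranking stands in for XGBoost feature importance, a shared 1-D convolution stands in for the CNN stage, and a mean over the feature sequence stands in for the BiLSTM summary.

```python
import numpy as np

def select_features(X, k):
    """Stage 1 stand-in for XGBoost: keep the k features with the
    highest variance (a crude proxy for feature importance)."""
    order = np.argsort(X.var(axis=0))[::-1][:k]
    return X[:, order]

def extract_features(X, kernel):
    """Stage 2 stand-in for the CNN: 1-D valid convolution of each
    sample's feature vector with a shared kernel, followed by ReLU."""
    out = np.array([np.convolve(row, kernel, mode="valid") for row in X])
    return np.maximum(out, 0.0)

def temporal_summary(X):
    """Stage 3 stand-in for the BiLSTM: summarize each sample's
    feature sequence (here, a simple mean over the sequence axis)."""
    return X.mean(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 10))               # 8 samples, 10 structured features
h1 = select_features(X, k=6)               # (8, 6) selected features
h2 = extract_features(h1, np.ones(3) / 3)  # (8, 4) extracted features
risk_score = temporal_summary(h2)          # (8,) one score per sample
print(risk_score.shape)
```

The point of the sketch is only the data flow: each stage consumes the previous stage's output, so the final score reflects all three processing steps.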

This hybrid model structure is crucial for enterprise risk management. It comprehensively utilizes different deep learning technologies to analyze and predict risks more comprehensively. This means that companies can identify potential risks more accurately and take timely action to reduce the adverse impact of risks on the company, thereby ensuring the robust development of the enterprise.

XGBoost model

XGBoost (Extreme Gradient Boosting) is an ensemble learning algorithm based on decision trees. It sequentially trains multiple weak learners (often decision trees), where each training corrects the errors of the previous model, continuously improving the performance of the model [28]. The core principle of XGBoost is to gradually build tree models by optimizing a loss function and introducing regularization terms in each training round to prevent overfitting. Additionally, XGBoost leverages the advantages of gradient boosting algorithms to more effectively handle errors during training, thus improving the model’s generalization ability and stability [29]. The flow chart of XGBoost is shown in Fig 2.

For our model, XGBoost plays a crucial role in handling structured data. It efficiently filters and extracts key structured features, laying the foundation for risk prediction. Through preprocessing with XGBoost models, we can extract the most representative and predictive features from massive structured data, providing essential inputs for subsequent risk prediction models. XGBoost operates on several mathematical foundations:

XGBoost Loss Function: computes the overall loss of the XGBoost model, combining the individual data point losses and the regularization term:

$\mathcal{L} = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$ (1)

Where: $y_i$ represents the true label, $\hat{y}_i$ denotes the predicted label, $l$ is the loss function, $K$ stands for the number of trees, $\Omega$ indicates the regularization term, and $f_k$ refers to the k-th tree.

Gradient of XGBoost Loss Function: represents the first derivative of the loss function with respect to the predicted label:

$g_i = \dfrac{\partial\, l(y_i, \hat{y}_i)}{\partial \hat{y}_i}$ (2)

Where: $g_i$ denotes the gradient of the loss function with respect to the predicted label.

Hessian of XGBoost Loss Function: denotes the second derivative of the loss function with respect to the predicted label:

$h_i = \dfrac{\partial^2 l(y_i, \hat{y}_i)}{\partial \hat{y}_i^2}$ (3)

Where: $h_i$ represents the Hessian of the loss function with respect to the predicted label.

XGBoost Tree Model Prediction Function: calculates the predicted label of a data point as the sum of the outputs of all trees:

$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$ (4)

Where: $\hat{y}_i$ denotes the predicted label, $f_k$ represents the k-th tree, and $x_i$ stands for the input features.

Score of XGBoost Tree Model Leaf Node: computes the optimal score of a leaf node in the XGBoost tree model:

$w_j^{*} = -\dfrac{G_j}{H_j + \lambda}$ (5)

Where: $w_j^{*}$ represents the score of the j-th leaf node, $G_j = \sum_{i \in I_j} g_i$ indicates the sum of first derivatives over the instances $I_j$ assigned to the j-th leaf node, $H_j = \sum_{i \in I_j} h_i$ denotes the sum of second derivatives at the j-th leaf node, and $\lambda$ stands for the regularization coefficient.
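The leaf-score rule of Eq (5) can be checked numerically. As a minimal sketch (assuming a squared-error loss for concreteness, which gives $g_i = \hat{y}_i - y_i$ and $h_i = 1$):

```python
import numpy as np

def leaf_score(g, h, lam):
    """Optimal weight of one leaf: w* = -G / (H + lambda), where G and H
    are the sums of first and second derivatives over the leaf's instances."""
    return -np.sum(g) / (np.sum(h) + lam)

# Squared-error loss l = 0.5 * (y_hat - y)^2 gives g = y_hat - y and h = 1.
y = np.array([1.0, 0.0, 1.0])        # true labels at this leaf
y_hat = np.array([0.2, 0.4, 0.6])    # current model predictions
g = y_hat - y                        # first derivatives
h = np.ones_like(y)                  # second derivatives
w = leaf_score(g, h, lam=1.0)
print(round(w, 4))  # 0.2
```

Here G = -0.8 and H = 3, so the leaf weight is 0.8 / 4 = 0.2; the regularization term λ shrinks the weight toward zero, which is how Eq (5) controls overfitting.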

CNN model

Convolutional Neural Network (CNN) is a type of deep learning model primarily used for processing spatially structured data such as images and videos. The core principle of CNN involves extracting features from input data and hierarchical abstraction through convolutional layers, pooling layers, and fully connected layers to ultimately classify or recognize the data [30]. Convolutional layers extract local features through convolution operations with filters, pooling layers reduce data dimensions through downsampling while retaining critical information, and fully connected layers integrate and classify features [31]. The structure of CNN is shown in Fig 3.

Here are the core mathematical equations for Convolutional Neural Network:

Convolutional Layer: performs a convolution operation on the input data using filters:

$y = f(w * x + b)$ (6)

where: $x$ represents input data, $w$ represents filter weights, $b$ represents bias, $*$ denotes the convolution operation, and $f$ denotes the activation function.

Pooling Layer: reduces the spatial dimensions of the convolutional layer output, here via max pooling over each region:

$y_{i,j} = \max_{(p,q) \in \mathcal{R}_{i,j}} x_{p,q}$ (7)

where: $x$ represents the input data and $\mathcal{R}_{i,j}$ is the pooling region associated with output position $(i, j)$.

Fully Connected Layer: performs matrix multiplication followed by bias addition and activation:

$y = f(Wx + b)$ (8)

where: $x$ represents input data, $W$ represents the weight matrix, $b$ represents the bias vector, and $f$ represents the activation function.

ReLU Activation Function: introduces non-linearity to the network:

$f(x) = \max(0, x)$ (9)

Where: $x$ represents input data.

Softmax Activation Function: converts raw scores into probabilities:

$\mathrm{softmax}(z_i) = \dfrac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}}$ (10)

Where: $z_i$ represents the raw score for class i and $N$ represents the number of classes.

The formulations provided encapsulate the essential mathematical operations driving feature extraction, spatial dimension reduction, and classification, pivotal for leveraging convolutional neural networks (CNNs) in advancing enterprise risk management through enhanced risk prediction and event identification capabilities.
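A minimal NumPy sketch of Eqs (6)-(10), using a 1-D convolution and non-overlapping max pooling for simplicity (the layer sizes and weights below are illustrative only; note that `np.convolve` flips the kernel, which is immaterial for this sketch):

```python
import numpy as np

def conv1d(x, w, b):
    """Eq (6): valid 1-D convolution plus bias, then Eq (9): ReLU."""
    z = np.convolve(x, w, mode="valid") + b
    return np.maximum(z, 0.0)

def max_pool(x, size=2):
    """Eq (7): non-overlapping max pooling over windows of `size`."""
    n = len(x) // size
    return x[: n * size].reshape(n, size).max(axis=1)

def dense_softmax(x, W, b):
    """Eqs (8) and (10): fully connected layer with softmax output."""
    z = W @ x + b
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

x = np.array([1.0, -2.0, 3.0, 0.5, -1.0, 2.0])   # toy input signal
h = conv1d(x, w=np.array([0.5, -0.5]), b=0.1)    # length 5 feature map
p = max_pool(h)                                  # length 2 pooled map
probs = dense_softmax(p, W=np.ones((3, 2)), b=np.zeros(3))
print(probs.sum())   # the softmax output sums to 1
```

Each function maps onto one of the numbered equations above, so the composition mirrors the convolution, pooling, and classification stages of the CNN described in this section.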

For our model, CNN plays a pivotal role in feature extraction. By leveraging the remarkable capabilities of the CNN model, we can perform secondary feature extraction on feature indicators, better covering relevant factors and providing technical support for improving the accuracy of subsequent prediction models. Particularly in handling image data, CNN effectively captures spatial information and local patterns. Therefore, in our model, we can employ CNN for feature extraction from image data to enhance the model’s understanding and analytical capabilities for image data in risk management.

BiLSTM model

The Bidirectional Long Short-Term Memory network (BiLSTM) is a specialized type of recurrent neural network (RNN) designed to effectively manage sequential data, including text and time series. BiLSTM, by incorporating mechanisms like forget gates, input gates, and output gates, effectively captures long-term dependencies and semantic information in sequential data [32]. Compared to traditional RNNs, BiLSTM exhibits stronger memory capabilities and better handling of the vanishing/exploding gradient problem, making it perform exceptionally well in processing sequential data [33]. Fig 4 shows the process of BiLSTM.

For our model, BiLSTM plays a crucial role in processing both textual and time-series data. We can leverage BiLSTM models to process sentiment text data and historical time-series data, thus improving the accuracy of risk prediction. BiLSTM excels at capturing temporal dependencies and semantic information in data. Therefore, in our model, we can employ BiLSTM for modeling tasks involving unstructured and time-series data, thereby enhancing the accuracy and stability of risk management models.

Here’s an overview of the fundamental computations in a BiLSTM cell:

Input Gate:

$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i)$ (11)

Where: $i_t$ is the input gate activation, $x_t$ is the input at time step t, $h_{t-1}$ is the hidden state at the previous time step, $W_{xi}$ and $W_{hi}$ are the weight matrices, $b_i$ is the bias vector, and $\sigma$ is the sigmoid activation function.

Forget Gate:

$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f)$ (12)

Where: $f_t$ is the forget gate activation, $W_{xf}$ and $W_{hf}$ are the weight matrices, and $b_f$ is the bias vector.

Output Gate:

$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)$ (13)

Cell Input Modulation:

$\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$ (14)

Where: $\tilde{c}_t$ is the cell input modulation, $W_{xc}$ and $W_{hc}$ are the weight matrices, and $b_c$ is the bias vector.

Cell State Update:

$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$ (15)

Where: $c_t$ is the cell state and $\odot$ denotes element-wise multiplication.

Hidden State Update:

$h_t = o_t \odot \tanh(c_t)$ (16)

Where: $h_t$ is the hidden state.

In summary, the provided equations encapsulate the key mechanisms of a Bidirectional Long Short-Term Memory (BiLSTM) cell. These equations govern the activation and modulation of gates, the update of cell state, and the computation of hidden states. Together, they enable BiLSTM networks to capture temporal dependencies in sequential data by efficiently processing input information and updating internal states across time steps.
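Eqs (11)-(16) can be verified with a single NumPy forward step of one (unidirectional) LSTM cell; a BiLSTM runs one such cell forward and a second one backward over the sequence and concatenates their hidden states. The parameter shapes below are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b each stack the four gate parameter
    blocks (input, forget, output, cell) along their first axis."""
    Wi, Wf, Wo, Wc = W
    Ui, Uf, Uo, Uc = U
    bi, bf, bo, bc = b
    i_t = sigmoid(Wi @ x_t + Ui @ h_prev + bi)       # Eq (11): input gate
    f_t = sigmoid(Wf @ x_t + Uf @ h_prev + bf)       # Eq (12): forget gate
    o_t = sigmoid(Wo @ x_t + Uo @ h_prev + bo)       # Eq (13): output gate
    c_tilde = np.tanh(Wc @ x_t + Uc @ h_prev + bc)   # Eq (14): cell input
    c_t = f_t * c_prev + i_t * c_tilde               # Eq (15): cell state
    h_t = o_t * np.tanh(c_t)                         # Eq (16): hidden state
    return h_t, c_t

rng = np.random.default_rng(1)
d_in, d_h = 4, 3                                 # toy input / hidden sizes
W = rng.normal(size=(4, d_h, d_in))              # input-to-gate weights
U = rng.normal(size=(4, d_h, d_h))               # hidden-to-gate weights
b = np.zeros((4, d_h))                           # gate biases
x_t = rng.normal(size=d_in)
h, c = lstm_step(x_t, np.zeros(d_h), np.zeros(d_h), W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```

Because the output gate and tanh in Eq (16) are both bounded, every component of the hidden state stays strictly inside (-1, 1), which is part of what keeps gradients stable across many time steps.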

Results

Datasets

In this study, experiments were conducted using four datasets: S&P 500 Historical Dataset [34], Bloomberg Terminal Dataset [35], Capital IQ Dataset [36], and Thomson Reuters Eikon Dataset [37]. Incorporating these four diverse datasets provides a comprehensive and multi-dimensional perspective, enhancing the robustness and accuracy of our risk prediction model by leveraging authoritative financial market data from different sources.

S&P 500 Historical Dataset: The S&P 500 historical dataset comprises historical trading data of the constituents of the S&P 500 index, including stock prices, trading volumes, market capitalization, among other information. These data, sourced from Standard & Poor’s, are recognized as authoritative financial market data. This dataset covers multiple industries and companies, reflecting the overall trends and fluctuations in the stock market. In this paper, the S&P 500 historical data is utilized to analyze the impact of market fluctuations on enterprise risk, providing crucial market references for the risk prediction model. The data is collected through professional financial market data providers and subsequently organized and processed.

Bloomberg Terminal Dataset: The Bloomberg Terminal dataset originates from the Bloomberg Terminal, encompassing global financial market data, economic data, company data, and more. Renowned for its high credibility and accuracy, this dataset is widely used by financial practitioners and academic researchers. It boasts rich financial and economic indicators, facilitating in-depth analysis of enterprise financial status, market performance, and risk exposure. In this paper, the Bloomberg Terminal dataset is employed to extract enterprise financial and market data, furnishing vital input variables for the risk prediction model.

Capital IQ Dataset: The Capital IQ dataset, provided by Standard & Poor’s Global Market Intelligence, includes financial data, trading data, analyst forecasts, and other information of globally listed companies. Covering detailed information on millions of companies, this dataset encompasses financial statements, historical stock prices, analyst forecasts, and more. It also includes data on company ownership structure, historical trends of key indicators, industry comparisons, among others. Renowned for its high credibility and authority, the Capital IQ dataset serves as a crucial data source for financial analysis, investment decisions, and academic research. In this paper, the Capital IQ dataset is utilized to acquire enterprise financial, market, and industry data, offering comprehensive inputs and references for the risk prediction model.

Thomson Reuters Eikon Dataset: The Thomson Reuters Eikon dataset is a comprehensive financial market analysis tool, containing global market data on stocks, bonds, commodities, currencies, along with relevant news and analytical reports. Covering over 12 million financial products and more than 20,000 market indices globally, this dataset includes nearly two decades of historical data. In addition to market data, it also comprises company financial statements, analyst forecasts, news sentiment, and other multidimensional data. Renowned for its high reliability and timeliness, the Thomson Reuters Eikon dataset is a vital data source for financial professionals and academic researchers. In this paper, the Thomson Reuters Eikon dataset is utilized to acquire enterprise financial, market, and sentiment data, providing multidimensional inputs and references for the risk prediction model.

The indicator system

The indicator system for this study draws mainly on the indicator variables used by credit rating agencies, combined with a summary of prior literature. Following the basic principles of indicator construction, and taking into account comprehensiveness, comparability, and scientific rigor, we study five aspects that affect corporate credit risk status: debt repayment, profitability, operations, growth, and cash acquisition capability. At the same time, adding the "Management Discussion and Analysis" text sentiment score to the credit risk evaluation index system allows the company's past operating conditions and future development direction to be grasped more directly from the management's perspective. Through the analysis and evaluation of these aspects, the specific indicator system shown in Table 1 is obtained, which comprehensively and objectively reflects the actual status of corporate business risks.

Experimental details

In this paper, four datasets are selected for training. The training process is as follows:

Step 1: Data Processing

  • Data Cleaning: After obtaining the enterprise operational data, the first step is to clean it and check for missing values. For numerical variables such as the asset-liability ratio, cash flow coverage ratio, and return on capital, missing values are filled with the global mean of the corresponding indicator series, minimizing the impact of missing data.
  • Data Standardization: Through this process, each feature within the dataset is scaled independently to zero mean and unit variance, thereby mitigating the potential dominance of certain features due to their larger scales. By standardizing the data, we enhance the convergence speed of optimization algorithms and reduce the model’s sensitivity to the scale of input features, ultimately improving the stability and performance of our enterprise risk prediction model.
  • Data Splitting: Split into training and testing sets: We will split the dataset into training and testing sets, using a ratio such as 80:20, to train the model on a subset of the data and evaluate its performance on unseen data. This helps assess the model’s generalization ability and prevents overfitting.

The meticulous execution of the aforementioned steps guarantees the integrity and uniformity of the data, laying a solid foundation for model training and assessment.
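As an illustration, the three preprocessing steps above can be sketched with scikit-learn. This is a minimal example on synthetic data: the indicator names and the 80:20 ratio follow the text, while the toy values and the random seed are assumptions.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

def preprocess(X, test_size=0.2, seed=42):
    """Mean-impute missing values, standardize to zero mean / unit
    variance, and split 80:20, as described in Step 1."""
    # Fill missing entries with the global mean of each indicator column.
    X = SimpleImputer(strategy="mean").fit_transform(X)
    # Scale each feature independently to mean 0, standard deviation 1.
    X = StandardScaler().fit_transform(X)
    # Hold out 20% of the rows as the unseen test set.
    return train_test_split(X, test_size=test_size, random_state=seed)

# Toy data: 5 firms x 3 indicators (asset-liability ratio, cash flow
# coverage ratio, return on capital), with one missing value.
X = np.array([[0.6, 1.2, 0.08],
              [0.4, np.nan, 0.12],
              [0.7, 0.9, 0.05],
              [0.5, 1.1, 0.10],
              [0.3, 1.4, 0.15]])
X_train, X_test = preprocess(X)
```

Note that in a production pipeline the imputer and scaler would be fitted on the training split only and then applied to the test split, to avoid leaking test-set statistics into training.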

Step 2: Model Training

We will provide a detailed explanation of the model training process, including specific hyperparameter settings, model architecture design, and training strategies.

  • Network Parameter Settings: We meticulously configured the network parameters to ensure optimal model convergence and computational efficiency. Specifically, we set the learning rate to 0.001 and the batch size to 64. These values were chosen through rigorous experimentation and fine-tuning to strike a balance between convergence speed and computational resource utilization. Additionally, to prevent overfitting and enhance model robustness, we employed a dropout rate of 0.5. This regularization technique effectively regularizes the model during training by randomly dropping out a proportion of units from the network, thereby reducing the risk of overfitting.
  • Model Architecture Design: Our model architecture, comprising XGBoost, CNN, and BiLSTM components, was meticulously crafted to handle diverse types of data effectively. The XGBoost component, tailored for structured data, consists of two fully connected layers with 128 and 64 hidden units, respectively, followed by rectified linear unit (ReLU) activation functions. On the other hand, the CNN component, adept at processing image data, comprises two convolutional layers with 32 and 64 filters, each followed by max-pooling layers and ReLU activation functions. Lastly, the BiLSTM component, designed for sequential data, incorporates two BiLSTM layers with 64 hidden units in each direction, followed by a dropout layer. This intricate design enables our model to extract and learn meaningful representations from diverse data modalities effectively.
  • Model Training Process: Our model training process is characterized by meticulous attention to detail and strategic optimization strategies. We initialized the model parameters randomly and employed the Adam optimizer with default parameters for gradient descent optimization. Throughout the training process, the model was iteratively fed with batches of training data, and the loss function was minimized using backpropagation. Concurrently, we closely monitored the model’s performance on a separate validation set. To prevent overfitting, we implemented early stopping with a patience of 10 epochs, halting training when the model’s performance on the validation set ceased to improve. Finally, after completing the specified number of epochs, the trained model underwent rigorous evaluation on the testing set to assess its performance and generalization capability.
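The early-stopping rule used in Step 2 (halt training after 10 epochs without validation improvement) is framework-agnostic and can be sketched in a few lines. The validation-loss curve below is illustrative, not taken from the experiments.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved for
    `patience` consecutive epochs (Step 2 uses patience = 10)."""
    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best:
            self.best = val_loss      # new best: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1      # no improvement this epoch
        return self.bad_epochs >= self.patience

# Illustrative curve: the loss improves for four epochs, then plateaus.
losses = [0.9, 0.7, 0.6, 0.55] + [0.56] * 12
stopper = EarlyStopping(patience=10)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In practice one would also checkpoint the model weights at each new best, so that the weights from the best epoch (rather than the last one) are restored when training halts.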

Step 3: Model Evaluation

In this critical step, we evaluate the performance of the XGBoost-CNN-BiLSTM model using specific evaluation metrics to measure the effectiveness of enterprise risk prediction. We focus on two key aspects:

  • Evaluation Metrics: Our evaluation focuses on several key metrics to assess the performance of the XGBoost-CNN-BiLSTM model in enterprise risk prediction, including Accuracy, F1 Score, AUC, Recall, Training Time (s), Inference Time (ms), and Parameters (M). These metrics collectively evaluate the model’s accuracy, efficiency, complexity, and computational cost, providing a comprehensive assessment of its performance in enterprise risk prediction.
  • Cross-Validation: To ensure the reliability and robustness of our model, we employ cross-validation techniques. Specifically, we leverage k-fold cross-validation, dividing the dataset into k subsets, or folds, and training the model k times. In each iteration, a different fold is utilized as the validation set, while the remaining folds serve as the training set. By repeating this process, we mitigate the variability in model performance stemming from random data splitting, obtaining a more accurate estimation of the model’s performance on unseen data. Through comprehensive cross-validation, we ascertain the model’s consistency and generalization ability, affirming its efficacy in real-world enterprise risk prediction scenarios.
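The k-fold procedure and the fold-wise metric aggregation can be sketched with scikit-learn. A logistic regression stands in for the full XGBoost-CNN-BiLSTM model here, and the synthetic binary risk labels are purely illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic binary "risk event" data; 200 firms, 10 indicator features.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

aucs, f1s = [], []
# Each of the k = 5 folds serves once as the validation set.
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in kfold.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    prob = clf.predict_proba(X[val_idx])[:, 1]   # risk-event probability
    aucs.append(roc_auc_score(y[val_idx], prob))
    f1s.append(f1_score(y[val_idx], clf.predict(X[val_idx])))

# Averaging across folds reduces the variance of the performance estimate.
mean_auc, mean_f1 = np.mean(aucs), np.mean(f1s)
```

Stratified folds preserve the class ratio in every split, which matters when risk events are rare relative to normal observations.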

Experimental results and analysis

As shown in Table 2, a comparative analysis across four distinct financial market datasets highlights the advantages of our method across several key performance metrics. In the S&P 500 Historical Dataset, our method achieves an accuracy rate of 93.84%, ranking second, while leading in recall with 95.75%, outperforming the nearest competitor, LSTM-GRU, by 3.68 percentage points. The F1 Score and AUC for our method are 92.62% and 92.43%, approximately 2% and 1% higher than the average of all methods. On the Bloomberg Terminal Dataset, our method achieves the highest accuracy (95.29%), recall (92.83%), F1 Score (94.12%), and AUC (96.29%), with the AUC outperforming the second-best model, XGBoost-MLP, by 2.88 percentage points. For the Capital IQ Dataset, our method demonstrates strong performance with an accuracy of 98.11%, 2.39 percentage points higher than the second-ranked OWA-LSTM. Recall, F1 Score, and AUC stand at 93.76%, 92.25%, and 93.89%, respectively. However, the slightly lower recall in detecting certain minority class events indicates potential areas for improvement in fine-tuning model sensitivity to rare events. Similarly, in the Thomson Reuters Eikon Dataset, our method achieves the highest accuracy (96.52%), recall (95.11%), F1 Score (92.85%), and AUC (92.96%), surpassing SMOTE-CNN by 1.78 percentage points in AUC. While our method exhibits robust performance, it incurs higher computational complexity compared to simpler models like XGBoost-MLP, which may limit its applicability in real-time financial contexts. Optimizing the model to reduce computational cost is a potential future direction. Overall, our method consistently achieves excellent results across multiple datasets, validating its accuracy and reliability in different financial markets. Future work could focus on improving computational efficiency and enhancing the model’s handling of rare risk events to increase its practicality in real-world applications. 
Fig 5 provides a visual representation of the key results, enhancing clarity and accessibility.

Fig 5. Comparison of the Accuracy, Recall, F1 Score, and AUC performance of different models on the S&P 500 Historical, Bloomberg Terminal, Capital IQ, and Thomson Reuters Eikon datasets.

https://doi.org/10.1371/journal.pone.0319773.g005

Table 2. Comparison of model performance across S&P 500, Bloomberg, Capital IQ, and Thomson Reuters datasets.

https://doi.org/10.1371/journal.pone.0313772.t002

Table 3 presents a comparison of model parameters and FLOPs for the different methods across the four datasets. On the S&P 500 Historical Dataset, our method has 329.53M parameters and 5.51G FLOPs. For the Bloomberg Terminal Dataset, it has 307.92M parameters and requires 5.77G FLOPs. On the Capital IQ Dataset, our method’s parameters amount to 326.5M with 5.5G FLOPs, while on the Thomson Reuters Eikon Dataset it stands at 308.47M parameters and 5.78G FLOPs. Comparatively, our method demonstrates relatively low parameter counts and FLOPs across all datasets, indicating its advantage in terms of model complexity and computational cost. However, despite this relative efficiency, the overall computational requirements of our hybrid XGBoost-CNN-BiLSTM model remain higher than those of simpler models such as XGBoost-MLP and Random Forest, which may be more suitable for real-time applications. This suggests that while our model is efficient relative to other complex deep learning models, there is still room for optimization, particularly for deployment in time-sensitive financial environments. Techniques such as model pruning and quantization could be explored in future work to further reduce model size and computational demands. Furthermore, our method exhibits the lowest parameter count and FLOPs on both the S&P 500 Historical Dataset and the Capital IQ Dataset, suggesting its efficiency in model training and inference on these datasets. These results validate the effectiveness and feasibility of our approach. Finally, Fig 6 visualizes the table contents, further illustrating the advantages of our method in terms of model parameters and FLOPs.

Table 3. Comparison of Parameters (M) and FLOPs (G) performance of different models on datasets.

https://doi.org/10.1371/journal.pone.0313772.t003

Fig 6. Comparison of Parameters (M) and FLOPs (G) performance of different models on datasets.

https://doi.org/10.1371/journal.pone.0319773.g006

Table 4 details the results of the BiLSTM model tests, highlighting the substantial performance improvements of our XGBoost-CNN-BiLSTM hybrid model over the GRU, Transformer, and LSTM models. For instance, in the S&P 500 dataset, our model achieved an accuracy of 93.84%, which is a notable increase of 6.83% over the GRU model’s accuracy of 87.01%. Similarly, the recall improved by 4.32%, rising from 91.43% with the GRU model to 95.75% with our model. The F1 score and AUC also saw substantial improvements, with the F1 score increasing by 5.55% and the AUC by 6.27% compared to the GRU model. In the Bloomberg Terminal dataset, our model’s accuracy of 95.29% demonstrates a significant improvement of 4.38% compared to the GRU model’s 90.91%. The recall improved by 5.98%, and the F1 score saw a significant increase of 10.21%, with the AUC improving by 5.75%, indicating enhanced precision and reliability in the model’s predictions. Overall, the consistent improvements across these datasets strongly validate the robustness and effectiveness of our XGBoost-CNN-BiLSTM model in enterprise risk prediction. These results underscore the model’s superiority in providing more accurate and reliable risk predictions, significantly outperforming the GRU, Transformer, and LSTM models. Fig 7 offers a visualization of the tabulated content, making these comparative results more accessible and easier to grasp for the reader. This graphical representation effectively illustrates the performance differences between the models, providing a clear and intuitive comparison of their strengths and weaknesses.

Table 4. Ablation experiments on the BiLSTM model using different datasets.

https://doi.org/10.1371/journal.pone.0313772.t004

Fig 7. Efficient comparison of BiLSTM with other models on different datasets.

https://doi.org/10.1371/journal.pone.0319773.g007

The ablation experiments on the XGBoost model using different datasets, as summarized in Table 5, highlight the superior performance of our proposed XGBoost-CNN-BiLSTM hybrid model compared to other boosting models. For the S&P 500 dataset, our model achieved an accuracy of 93.84%, which is a notable improvement over the other models. The recall and F1 scores also showed significant increases, indicating a robust ability to identify and predict risk events accurately. Specifically, the recall improved by approximately 9% over the best-performing alternative model, and the F1 score saw an increase of around 7%. This improvement suggests that our hybrid model excels in both identifying risk events and minimizing false negatives. In the Bloomberg Terminal dataset, our model’s accuracy of 95.29% represents a substantial increase compared to the other models, with improvements in recall and F1 score of about 6% and 8%, respectively. These improvements are indicative of our model’s enhanced precision and robustness in dealing with large-scale financial data. For the Capital IQ dataset, our model achieved an outstanding accuracy of 98.11%, significantly outperforming the other models. The recall improved by approximately 6%, and the F1 score by about 5%, underscoring the model’s effectiveness in handling complex financial data. In the Thomson Reuters Eikon dataset, our model maintained its superiority with an accuracy of 96.52%, an improvement of around 7% over the next best model. The recall and F1 score also showed significant enhancements, with increases of about 8% and 7%, respectively. This consistent improvement across datasets demonstrates the hybrid model’s ability to generalize across different financial environments and effectively capture diverse risk patterns. Overall, these results confirm the robustness and effectiveness of our XGBoost-CNN-BiLSTM model in enterprise risk prediction. 
The consistent improvements across all datasets highlight the model’s superior predictive power and reliability, making it a highly valuable tool for enterprise risk management. Fig 8 illustrates the table’s content, providing a clear visualization of the performance gains and reinforcing the superiority of our model in comparison to the other methods.

Table 5. Ablation experiments on the XGBoost model using different datasets.

https://doi.org/10.1371/journal.pone.0313772.t005

Fig 8. Comparison of XGBoost with other models across different datasets.

https://doi.org/10.1371/journal.pone.0319773.g008

Table 6. Ablation experiments on the CNN model using different datasets.

https://doi.org/10.1371/journal.pone.0313772.t006

Table 6 demonstrates that our proposed model achieves significant performance improvements over traditional CNN architectures across various datasets. For the S&P 500 dataset, our model achieved an accuracy of 93.84%, surpassing LeNet, AlexNet, and ResNet by 5.47%, 4.29%, and 4.27%, respectively. This pattern of improvement is also evident in recall, F1 score, and AUC, where our model consistently outperforms the others. In the Bloomberg dataset, our model’s accuracy of 95.29% represents a substantial increase compared to LeNet (89.02%), AlexNet (88.13%), and ResNet (87.67%), showing improvements of 6.27%, 7.16%, and 7.62%, respectively. The consistent improvement across all metrics highlights our model’s robustness and adaptability in handling diverse financial data. For the Capital IQ dataset, our model achieved an accuracy of 98.11%, which is notably higher than LeNet (86.47%), AlexNet (87.58%), and ResNet (86.38%). The accuracy improvement ranges from 11.64% to 11.73%, with corresponding increases in recall, F1 score, and AUC metrics. Such significant differences in performance can be attributed to the ability of our hybrid model to effectively capture intricate patterns in financial data that traditional CNN models struggle with, particularly in complex and high-dimensional datasets. In the Thomson Reuters dataset, our model’s accuracy of 96.52% exceeds LeNet (90.56%), AlexNet (89.80%), and ResNet (88.88%) by 5.96%, 6.72%, and 7.64%, respectively. The recall, F1 score, and AUC metrics further demonstrate the model’s robust performance. These improvements suggest that our hybrid architecture, combining CNN and BiLSTM components, not only enhances predictive accuracy but also provides better generalization across diverse datasets. In summary, across all datasets, our model significantly outperforms LeNet, AlexNet, and ResNet, showcasing its effectiveness and robustness in financial risk prediction tasks.
These results validate the superiority of our hybrid model, particularly in terms of its ability to handle time-series financial data, where traditional CNN architectures fall short. Fig 9 visualizes the content of the table, providing a graphical comparison that accentuates our model’s advantages and allows for a clear visual representation of these performance disparities.

Fig 9. Comparing CNN with other models across different datasets.

https://doi.org/10.1371/journal.pone.0319773.g009

Conclusion

In this study, we proposed an enterprise risk prediction method based on the XGBoost-CNN-BiLSTM hybrid model and conducted experiments on four different financial market datasets. The experimental results demonstrate significant advantages of our model in key metrics such as accuracy, recall, F1 score, and AUC, indicating its effectiveness and robustness across multiple datasets. However, despite these promising results, several limitations must be acknowledged. Firstly, our model may face challenges in handling large-scale data, particularly in terms of computational resource consumption during model training and inference. While the XGBoost-CNN-BiLSTM framework performs well in the tested scenarios, the integration of these models increases computational complexity, which may limit its scalability to even larger datasets or real-time applications without further optimization. Secondly, there might be limitations in processing unstructured data, especially in dealing with complex semantic information and time-series data. Future work will focus on addressing these limitations and further enhancing the performance and robustness of the model.

Firstly, we will explore more efficient model compression and acceleration methods to reduce computational complexity and improve the model’s applicability to large-scale data. Techniques such as quantization and pruning will be considered to minimize model size and computational cost, allowing for faster inference and reduced memory usage. Secondly, we will delve deeper into the capability of deep learning models in handling unstructured data, exploring more flexible and effective methods to extract and utilize information from unstructured data. Additionally, we will explore more financial market datasets and further validate the model’s performance and generalizability in different scenarios.

Furthermore, another limitation is that our current experiments are primarily focused on financial market datasets. Although the results show robustness across different financial datasets, further evaluation on diverse types of datasets beyond financial markets would be beneficial to fully demonstrate the generalizability of the model. This study is important not only for introducing a novel enterprise risk prediction method but also for offering valuable insights into advancing financial risk management. Future work will continue to explore the potential of deep learning in the financial domain, contributing to the development of more intelligent and reliable risk management systems. Through relentless efforts and continuous exploration, we believe that we can make greater contributions to the stability and healthy development of financial markets.

References

  1. Cao Y, Shao Y, Zhang H. Study on early warning of E-commerce enterprise financial risk based on deep learning algorithm. Electron Commerce Res. 2022;22(1):21–36.
  2. Fraser JRS, Quail R, Simkins BJ. Questions asked about enterprise risk management by risk practitioners. Bus Horiz. 2022;65(3):251–60.
  3. Olson D, Wu D. Enterprise risk management in supply chains. In: Enterprise risk management models: focus on sustainability. Springer; 2023.
  4. Olson D, Wu D. Data mining models and enterprise risk management. In: Enterprise risk management models: focus on sustainability. 2023. p. 119–41.
  5. Yang B, Yang M. Research on enterprise knowledge service based on semantic reasoning and data fusion. Neural Comput Appl. 2022;34(12):9455–70. pmid:34456516
  6. Yang D. Evaluation of enterprise financial risk level under digital transformation with artificial neural network. Security and Communication Networks. 2022;2022.
  7. Yang W, Jia C, Liu R. Construction and simulation of the enterprise financial risk diagnosis model by using dropout and BN to improve LSTM. Security and Communication Networks. 2022;2022.
  8. Sekhar S, Kovvuri S, Vyshnavi K, Uppalapati S, Yaswanth K, Teja R. Risk modelling and prediction of financial management in macro industries using CNN based learning. 2023.
  9. Kalva S, Satuluri N. Stock market investment strategy using deep-Q-learning network. 2023.
  10. Yang M, Wang J. Adaptability of financial time series prediction based on BiLSTM. Procedia Comput Sci. 2022;199:18–25.
  11. Li X, Wang J, Yang C. Risk prediction in financial management of listed companies based on optimized BP neural network under digital economy. Neural Comput Appl. 2023;35(3):2045–58.
  12. Ak M, Yucesan M, Gul M. Occupational health, safety and environmental risk assessment in textile production industry through a Bayesian BWM-VIKOR approach. Stoch Environ Res Risk Assess. 2022;36(2):629–42.
  13. Tian S, Li W, Ning X, Ran H, Qin H, Tiwari P. Continuous transfer of neural network representational similarity for incremental learning. Neurocomputing. 2023;545:126300.
  14. Wang D, Li L, Zhao D. Corporate finance risk prediction based on LightGBM. Inf Sci. 2022;602:259–68.
  15. Gao B. The use of machine learning combined with data mining technology in financial risk prevention. Comput Econ. 2022;59(4):1385–405.
  16. Ma Z, Wang X, Hao Y. Development and application of a hybrid forecasting framework based on improved extreme learning machine for enterprise financing risk. Exp Syst Appl. 2023;215:119373.
  17. Yao G, Hu X, Wang G. A novel ensemble feature selection method by integrating multiple ranking information combined with an SVM ensemble model for enterprise credit risk prediction in the supply chain. Exp Syst Appl. 2022;200:117002.
  18. Uddin M, Chi G, Al Janabi M, Habib T. Leveraging random forest in micro-enterprises credit risk modelling for accuracy and interpretability. Int J Financ Econ. 2022;27(3):3713–29.
  19. Zhu W, Zhang T, Wu Y, Li S, Li Z. Research on optimization of an enterprise financial risk early warning method based on the DS-RF model. Int Rev Financ Anal. 2022;81:102140.
  20. Albahri A, Duhaim A, Fadhel M. A systematic review of trustworthy and explainable artificial intelligence in healthcare: Assessment of quality, bias risk, and data fusion. Inf Fusion. 2023.
  21. Gao T, Wang C, Zheng J, et al. A smoothing group lasso based interval type-2 fuzzy neural network for simultaneous feature selection and system identification. Knowl-Based Syst. 2023;280:111028.
  22. Che W, Wang Z, Jiang C, Abedin M. Predicting financial distress using multimodal data: An attentive and regularized deep learning method. Inf Process Manag. 2024;61(4):103703.
  23. Cheng L, Du L, Liu C, Hu Y, Fang F, Ward T. Multi-modal fusion for business process prediction in call center scenarios. Inf Fusion. 2024;102362.
  24. Zhang W, Yan S, Li J, Tian X, Yoshida T. Credit risk prediction of SMEs in supply chain finance by fusing demographic and behavioral data. Transp Res Part E: Logist Transp Rev. 2022;158:102611.
  25. Sheth A, Sinfield JV. Advancing the complex adaptive systems approach to enterprise risk management with quantified risk networks (QRNs). Sci Rep. 2024;14(1):22312. pmid:39333144
  26. Tian J, Shen C, Wang B, Xia X, Zhang M, Lin C, et al. LESSON: Multi-label adversarial false data injection attack for deep learning locational detection. IEEE Trans Depend Secure Comput. 2024.
  27. Liu Y. Deep learning for enterprise sales prediction: Harnessing CNN-LSTM-Attention model. In: 2024 IEEE 3rd International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA). IEEE; 2024.
  28. Li Y-y, Xu Z-m, Feng C, Jiang Y. XGBoost-based survival analysis in business risk prediction. In: 2022 International Conference on High Performance Big Data and Intelligent Systems (HDIS). 2022.
  29. Zhang T, Zhu W, Wu Y, Wu Z, Zhang C, Hu X. An explainable financial risk early warning model based on the DS-XGBoost model. Financ Res Lett. 2023;56:104045.
  30. Zhou D, Zhuang X, Zuo H, Cai J, Zhao X, Xiang J. A model fusion strategy for identifying aircraft risk using CNN and Att-BiLSTM. Reliabil Eng Syst Safety. 2022;228:108750.
  31. Jain A, Rao A, Jain P, Hu Y-C. Optimized levy flight model for heart disease prediction using CNN framework in big data application. Exp Syst Appl. 2023;223:119859.
  32. Zhang X, Ma Y, Wang M. An attention-based Logistic-CNN-BiLSTM hybrid neural network for credit risk prediction of listed real estate enterprises. Exp Syst. 2024;41(2):e13299.
  33. Du S, Jiang X, Guo A, Zuo K, Zhang T. Clinical application of early warning scoring based on BiLSTM-attention in emergency obstetric preexamination and triage. J Healthc Eng. 2022;2022:6274230. pmid:35340245
  34. S&P 500 Historical Dataset. [cited 2024 July 08]. https://www.nasdaq.com/zh/market-activity/index/spx/historical?page=1&rows_per_page=10&timeline=y5
  35. Bloomberg Terminal Dataset. Bloomberg Professional. [cited 2024 July 08]. https://www.bloomberg.com/professional/solution/bloomberg-terminal/
  36. Capital IQ Dataset. S&P Global Market Intelligence. [cited 2024 July 08]. https://www.spglobal.com/marketintelligence/en/solutions/sp-capital-iq-platform
  37. Thomson Reuters Eikon Dataset. Refinitiv Eikon. [cited 2024 July 08]. https://www.refinitiv.com/en/products/eikon-trading-software
  38. Hussain W, Raza M, Jan M, Merigó J, Gao H. Cloud risk management with OWA-LSTM and fuzzy linguistic decision making. IEEE Trans Fuzzy Syst. 2022;30(11):4657–66.
  39. Li H, Liu H, Hu Y. Prediction of unbalanced financial risk based on GRA-TOPSIS and SMOTE-CNN. Scientific Program. 2022;2022.
  40. Agga A, Abbou A, Labbadi M, El Houm Y, Ali I. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electric Power Syst Res. 2022;208:107908.
  41. Dai H, Huang G, Zeng H, Zhou F. PM2.5 volatility prediction by XGBoost-MLP based on GARCH models. J Clean Prod. 2022;356:131898.
  42. Rao C, Liu Y, Goh M. Credit risk assessment mechanism of personal auto loan based on PSO-XGBoost model. Complex Intell Syst. 2023;9(2):1391–414.
  43. Ahmadzadeh E, Kim H, Jeong O, Kim N, Moon I. A deep bidirectional LSTM-GRU network model for automated ciphertext classification. IEEE Access. 2022;10:3228–37.