
Prediction of stress-strain behavior of rock materials under biaxial compression using a deep learning approach

Abstract

Deep learning has advanced significantly in the prediction of stress-strain curves. However, due to the complex mechanical properties of rock materials, existing deep learning methods suffer from insufficient accuracy when predicting their stress-strain curves. This paper proposes a deep learning method based on a long short-term memory autoencoder (LSTM-AE) for predicting stress-strain curves of rock materials in discrete element numerical simulations. The LSTM-AE approach uses LSTM networks to construct both the encoder and the decoder: the encoder extracts features from the input data, and the decoder generates the target sequence for prediction. The mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2) between the predicted and true values are used as evaluation metrics. The proposed LSTM-AE network is compared with the LSTM network, a recurrent neural network (RNN), a BP neural network (BPNN), and an XGBoost model. The results indicate that the LSTM-AE network outperforms LSTM, RNN, BPNN, and XGBoost in accuracy. Furthermore, the robustness of the LSTM-AE network is confirmed by predicting 10 sets of special samples. However, the scalability of the LSTM-AE network to large datasets and its applicability to laboratory data require further verification. Nevertheless, this study provides a valuable reference for improving the prediction accuracy of stress-strain curves of rock materials.

1. Introduction

Stress-strain curves reveal the mechanical behavior of materials under external forces. Analyzing the stress-strain behavior of rock materials is crucial for evaluating safety in geotechnical engineering, e.g., stability analysis of rock slopes [1,2] and sensitivity studies of geological hazards [3,4]. Currently, the discrete element method (DEM) is widely used to model the stress-strain behavior of rock materials [5–8]. Over the past decades, however, constitutive models based on mechanical assumptions have often been used to capture the mechanical behavior of rock materials [9]. Due to the complex mechanical characteristics of rock materials, the development of a unified theoretical model remains an ongoing challenge [10–14]. In particular, most constitutive models require many parameters that are sometimes difficult to determine [15–17].

With the advancement of machine learning, obtaining constitutive relationships for materials through data-driven methods is a promising way to address this challenge [18–21]. Compared with traditional laboratory and numerical simulation methods, data-driven deep learning approaches can quickly predict the nonlinear behavior of materials under different conditions. Deep learning methods leverage multi-layer neural network structures that can effectively address highly complex nonlinear problems through automated feature learning and end-to-end training. These methods excel at capturing nonlinear relationships, but they come with higher computational costs and longer training times. In contrast, traditional machine learning methods offer better computational efficiency and easier implementation, making them widely applicable across fields such as granular materials [22], energy storage [23], computational physiology [24], and renewable energy [25]. However, they tend to be less accurate when handling large-scale datasets or complex problems [26]. Therefore, deep learning methods are particularly suited to analyzing stress-strain data of materials with strong nonlinear characteristics. Ghaboussi et al. [27] modeled the mechanical behavior of concrete using neural networks to predict various load paths under biaxial loading. Ellis et al. [28] simulated the stress-strain characteristics of soil using neural networks. Banimahd et al. [29] used artificial neural networks to model the stress-strain behavior of in situ sandy soils containing non-plastic fines and pointed out that the ANN model does not provide information on how the inputs affect the outputs. However, constitutive behavior is fundamentally a time-sequence problem [30]: considering the history of stress and strain enhances prediction accuracy. Significant progress in artificial intelligence has been made on time series forecasting problems [31–33].
Models such as recurrent neural networks (RNNs) and the long short-term memory (LSTM) neural network can effectively handle long-term dependencies and yield favorable results in time series forecasting [34,35]. Temporal neural networks have since been widely used to capture the mechanical behavior of various materials. Zhang et al. [17] employed the LSTM model to reproduce the stress-strain behavior of soil, discovering a phenomenon of “bias at low-stress levels”: the bias in the predicted results is greater at low-stress levels. Wang et al. [18] explored and validated the capability of recurrent architectures to predict the constitutive behavior of granular materials by constructing a temporal convolutional network. Shi et al. [36] predicted the deformation of rock materials under different loading conditions using LSTM, proposing a method that integrates the trained LSTM model as a constitutive relationship within the finite element method (FEM) to model the mechanical behavior of sandstone. Li et al. [37] proposed a fractional long short-term memory (F-LSTM) neural network prediction model, which effectively shortens the lengthy process of obtaining the stress-strain behavior of rubber under stretching. Li et al. [38] proposed a data-driven method based on the LSTM model to simulate the mechanical response of frozen soil.

Although the LSTM method has been widely used to capture the mechanical behavior of various materials, it has rarely been applied to the study of large-scale rock mass mechanical properties. Additionally, existing studies have shown a significant bias in the predictions of traditional neural network models and LSTM models at low-stress levels. This problem is related to error backpropagation during training: the small weight gradients of samples at low-stress levels can lead to insufficient learning of these samples, resulting in large prediction errors [17]. Consequently, the accuracy of stress-strain curve prediction remains an open problem. To address this issue, this paper introduces a novel LSTM autoencoder (LSTM-AE) model for predicting the stress-strain behavior of rock materials in DEM numerical simulations. Unlike traditional LSTM models, the LSTM-AE model improves prediction accuracy by incorporating an autoencoder structure and utilizing LSTM units to construct the encoder and decoder. The main contributions of this work are as follows:

  1. An LSTM-AE network is proposed in this paper for predicting the stress-strain curve of rock materials in DEM numerical simulations. In the proposed model, the encoder and decoder are implemented using LSTM cells, which effectively capture temporal dependencies between sequences.
  2. The proposed LSTM-AE network is compared with LSTM, RNN, BPNN, and XGBoost models to analyze their prediction accuracy and validate the effectiveness of the proposed model.
  3. The robustness of the proposed model is analyzed, and the impact of microscopic parameters in DEM on the model’s prediction accuracy is explored, thereby verifying its reliability in predicting special samples.

2. Discrete element modeling and data preparation

2.1. Discrete element modeling of sample

In this paper, the Hertz-Mindlin contact mechanics model and the bond contact mechanics model are employed to simulate the deformation behavior of brittle rocks. Combining the Hertz-Mindlin model with bond contact effectively accounts for the bonding characteristics between particles, thereby accurately simulating the formation and propagation of rock cracks. Compared with other models, it better reflects the mechanical properties of rock materials and is more suitable for simulating rock mass failure [39]. The dataset used for model training and testing was generated through discrete element numerical simulations. Biaxial compression tests were conducted using the high-performance 2D discrete element software ZDEM [40–44], which produced stress-strain curve data. At large scales, rock mass often exhibits fractures and cracks [45]; therefore, selecting appropriate particle properties in the DEM is crucial to reproducing the actual mechanical behavior of the rock mass [46]. To simulate the structural deformation behavior of geological layers in nature and accurately characterize the mechanical properties of the rock mass at this scale, biaxial specimens measuring 4 km × 8 km were constructed, with the aim of capturing the overall mechanical properties of the rock mass [42,46,47]. The microscopic parameters used in the biaxial tests follow Li [42] and Morgan [47]. Modeling consists of two main processes: initial model generation and biaxial compression (Fig 1).

The initial model generation of the biaxial specimens employed the radii expansion method [42,49,50]. The process is as follows: First, 2100 particles with radii of 30 and 40 meters were randomly generated in a 4 × 8 km area, with approximately equal numbers of each size, as shown in Fig 1A. Next, the microscopic parameters of the rock material were assigned to the particles; their values are listed in Table 1. The particle radii were then doubled using the radii expansion method, and a system in equilibrium was obtained by calculation, as shown in Fig 1B. Finally, the walls on both sides were loosened, allowing the system to expand again under a given confining pressure (σ3). The model within the dashed line was taken as the initial model, as shown in Fig 1C.

The biaxial compression process is described as follows: The initial model was placed into an area of the same size, and the microscopic parameters of the particles were set as shown in Table 1. In addition, the microscopic and bond parameters of the particles influence the overall mechanical properties of the specimen. Therefore, the bond parameters [42,47] of the rock layers were set as shown in Table 2. Subsequently, biaxial compression tests were conducted, and the deformation of the rock material was recorded. Finally, stress-strain data were collected for a series of biaxial compression tests.

2.2. Data preparation

Discrete element simulations were conducted through the above modeling process. Different microscopic parameters affect the mechanical properties of rock material in various ways [51]. To acquire sufficient stress-strain data, 96 sets of biaxial compression tests with varying friction coefficients, tensile strengths, and shear strengths were carried out at each of five confining pressures. Each sample contains 20 loading steps, i.e., 20 stress-strain pairs, yielding 9600 (96 × 5 × 20) stress-strain pairs in total. Fig 2 illustrates the stress-strain curves obtained from the biaxial compression tests at the five confining pressures.

Fig 2. The stress-strain curves obtained by biaxial compression test.

https://doi.org/10.1371/journal.pone.0321478.g002

3. Methodology

3.1. Long short-term memory neural networks

In recent years, deep learning has achieved significant advances in processing nonlinear and high-dimensional data, owing to its data processing capabilities and automatic feature extraction [52]. By constructing multi-layer neural networks, deep learning can autonomously extract higher-order features from data and recognize complex patterns, which makes it perform well on complex tasks [53–56]. However, in time series analysis, a key challenge for deep learning is effectively capturing long-term dependencies in the data. LSTM networks have emerged as a successful solution to this issue [35]. The LSTM network is a neural network structure designed for processing long sequences, addressing the gradient explosion and vanishing problems of traditional RNNs [35,57]. In an LSTM, internal recurrent loops enable it to retain earlier data and establish temporal dependencies between consecutive data points. Its core is the cell state and the “gate” structure. The cell state is a crucial variable that carries information from earlier steps through the entire network. The process consists of three key steps, forgetting, inputting, and outputting, which are realized by three “gates”: the forget gate, the input gate, and the output gate [58–60]. The structure of the LSTM is shown in Fig 3. The basic principle is as follows:

  1. The forget gate ft, the input gate it, and the output gate ot all take as inputs the current input xt and the previous output ht-1, and use a sigmoid activation function so that each output lies between 0 and 1. The gates differ only in their input weights and biases. The relevant formulas are as follows:

$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$  (1)
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$  (2)
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$  (3)
$\sigma(x) = 1/(1 + e^{-x})$  (4)

where Wf, bf, Wi, bi, Wo, and bo denote the input weights and biases of the forget gate, the input gate, and the output gate, respectively, and ht-1 and xt denote the output at moment t-1 and the input at moment t.

  2. ct represents the cell state, which enables the LSTM model to learn long-term dependencies. The formulas are defined as follows:

$\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$  (5)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$  (6)

where ft is the output of the forget gate, ct-1 is the cell state at moment t-1, it is the output of the input gate, and Wc and bc are the weights and biases of the candidate cell state.

  3. ht represents the final output of the LSTM model at time step t. The formula is defined as follows:

$h_t = o_t \odot \tanh(c_t)$  (7)

where ct is the cell state at time t and ot is the output of the output gate.
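As a concrete illustration, the gate equations above can be sketched as a single forward step in numpy. This is a minimal sketch with hypothetical weight names, not the implementation used in this study:

```python
import numpy as np

def sigmoid(x):
    # Gate activation: squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM forward step following Eqs (1)-(7). W and b map the
    concatenated [h_{t-1}, x_t] to each gate's pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])        # Eq. (1): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])        # Eq. (2): input gate
    o_t = sigmoid(W["o"] @ z + b["o"])        # Eq. (3): output gate
    c_hat = np.tanh(W["c"] @ z + b["c"])      # Eq. (5): candidate cell state
    c_t = f_t * c_prev + i_t * c_hat          # Eq. (6): cell state update
    h_t = o_t * np.tanh(c_t)                  # Eq. (7): hidden output
    return h_t, c_t

# Tiny example with random weights: hidden size 4, input size 3.
rng = np.random.default_rng(0)
H, D = 4, 3
W = {k: rng.standard_normal((H, H + D)) for k in "fioc"}
b = {k: np.zeros(H) for k in "fioc"}
h_t, c_t = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, b)
```

Because ot lies in (0, 1) and tanh in (-1, 1), every component of the returned hidden state is bounded in magnitude by 1.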

3.2. LSTM autoencoder

The autoencoder (AE) is a widely used unsupervised learning algorithm designed to achieve dimensionality reduction and feature extraction by learning an efficient encoding of unlabeled data [62,63]. It consists of an encoder and a decoder. The encoder converts the input data x into a low-dimensional representation; the decoder maps this low-dimensional representation back to the original space to reconstruct the input [64,65]. This type of model improves performance by modifying its hidden layers in various ways; when the network is deep, stacking hidden layers helps address the vanishing gradient problem [66]. The structure of the AE model is shown in Fig 4. The LSTM-AE combines the ideas of LSTM and AE: it uses LSTM units to build the encoder and the decoder, enabling both to process time series or otherwise time-dependent data [67,68]. The LSTM-AE framework is shown in Fig 5. The input is the time series data within a time step t. The hidden state ht and memory cell state ct at the last moment of the encoder are obtained after the computation of the LSTM units and are passed into the first LSTM of the decoder as initial states for prediction.
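The encoder-decoder flow described above can be sketched in numpy: the encoder consumes the input sequence and hands its final (h, c) state to the decoder, which rolls the prediction out step by step, feeding each output back as its next input. All shapes, weight names, and the linear read-out are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def lstm_seq(xs, h, c, W, b):
    """Run a single-layer LSTM over a sequence; return the final (h, c) state."""
    H = h.size
    for x_t in xs:
        z = W @ np.concatenate([h, x_t]) + b          # all four pre-activations at once
        f, i, o = (1.0 / (1.0 + np.exp(-z[:3 * H]))).reshape(3, H)  # sigmoid gates
        c = f * c + i * np.tanh(z[3 * H:])            # cell state update
        h = o * np.tanh(c)                            # hidden output
    return h, c

def lstm_ae_predict(xs, n_out, enc_W, enc_b, dec_W, dec_b, out_W):
    """Encoder compresses xs into (h, c); the decoder, initialised with that
    state, rolls out n_out steps, feeding each prediction back as its input."""
    H = enc_b.size // 4
    h, c = lstm_seq(xs, np.zeros(H), np.zeros(H), enc_W, enc_b)   # encode
    y, preds = np.zeros(out_W.shape[0]), []
    for _ in range(n_out):                                        # decode
        h, c = lstm_seq([y], h, c, dec_W, dec_b)
        y = out_W @ h                                             # linear read-out
        preds.append(y)
    return np.array(preds)

# Shape check with random weights: 6 input features, hidden size 8, 1 output.
rng = np.random.default_rng(1)
H, D_in, D_out = 8, 6, 1
enc_W = rng.standard_normal((4 * H, H + D_in)); enc_b = np.zeros(4 * H)
dec_W = rng.standard_normal((4 * H, H + D_out)); dec_b = np.zeros(4 * H)
out_W = rng.standard_normal((D_out, H))
preds = lstm_ae_predict(rng.standard_normal((5, D_in)), 3, enc_W, enc_b, dec_W, dec_b, out_W)
```

The key design point is that the decoder receives only the encoder's final (h, c) pair, so all information about the input sequence must pass through that compressed state.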

3.3. Indicators for the assessment

Four evaluation metrics commonly used in regression tasks are used to assess the goodness of the model; they are defined in Eqs 8–11. The mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) all measure the error between the predicted and true values, and smaller values are better. The coefficient of determination (R2) measures the ability of the model to fit the data; its value range is [0, 1], and the closer it is to 1, the better the fit. Due to their different calculation methods, each metric is affected differently by outliers. Thus, the combined use of these four indicators helps to assess the model more comprehensively.

$\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2$  (8)
$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2}$  (9)
$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|$  (10)
$R^2 = 1 - \frac{\sum_{i=1}^{m}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{m}(y_i - \bar{y})^2}$  (11)

where $y_i$ represents the true value, $\hat{y}_i$ the predicted value, $\bar{y}$ the sample average, and m the number of samples.
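The four metrics follow directly from their definitions; a minimal numpy sketch (the function name is illustrative):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and R^2 as defined in Eqs (8)-(11)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)                       # Eq. (8)
    rmse = np.sqrt(mse)                           # Eq. (9)
    mae = np.mean(np.abs(err))                    # Eq. (10)
    # Eq. (11): fraction of variance around the sample mean explained by the model
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mse, rmse, mae, r2

mse, rmse, mae, r2 = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
# mse = 1/3, rmse = sqrt(1/3), mae = 1/3, r2 = 1 - 1/2 = 0.5
```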

4. Experiments and results

This section presents the prediction results of the LSTM-AE model and examines its accuracy in predicting stress-strain curves of rock materials. Fig 6 illustrates the flowchart for predicting the stress-strain curve. The details of the experiment, the results, and a discussion of the results follow.

Fig 6. Flowchart of LSTM-AE network in stress-strain curve prediction.

https://doi.org/10.1371/journal.pone.0321478.g006

4.1. Data preprocessing and modeling details

To make the data collected in Section 2.2 compatible with the model inputs, the dataset must be preprocessed. The coefficient of friction (μ), confining pressure (σ3), tensile strength (Tb), shear strength (Cb), strain (ε), and the deviatoric stress (σ1-σ3) at the previous moment are taken as inputs, and the output is the deviatoric stress at the next moment. To ensure that the model trains well, the input data are scaled to a specific range, i.e., standardized. Additionally, the data of each curve are transformed so that they can be used for training and testing the network model [37]. As shown in Fig 6, the time series data are converted into LSTM training data using the sliding window method [69].
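The sliding-window conversion can be sketched as follows; this is a minimal illustration on a single feature, and the window length is a tunable hyperparameter, not a value reported in the paper:

```python
import numpy as np

def sliding_windows(series, window):
    """Split a sequence into (input window, next value) pairs,
    as in the sliding window method of Fig 6."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])   # past `window` steps as model input
        y.append(series[i + window])     # the step to predict
    return np.array(X), np.array(y)

# A curve with 20 loading steps and window size 3 yields 17 training pairs.
X, y = sliding_windows(np.arange(20.0), 3)
```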

In the model-building process, the prepared data are first divided into training, validation, and testing sets, with the training set comprising 70% of the data and the validation and testing sets each comprising 15%. Next, the model structure, consisting of an encoder and a decoder, is constructed using LSTM. During training, the model’s performance is evaluated on the validation set, and training is stopped when there is no significant improvement after several iterations. Table 3 lists the parameter settings of the LSTM-AE model, which were manually tuned through iterative trials, with the optimal configuration selected based on the validation score. Among these, the learning rate and dropout rate are two key hyperparameters. The learning rate controls the step size of the model’s parameter updates during training. An appropriate learning rate accelerates convergence, improves training efficiency, and enhances the generalization ability of the model. If the learning rate is too large, the update steps may overshoot the optimal solution and the model may fail to converge; if it is too small, convergence is stable but training is slow, and the model may fall into a local optimum. The dropout rate controls the proportion of nodes randomly dropped in the neural network. By randomly dropping some neurons during training, it reduces the model’s dependence on specific nodes and thus enhances generalization. If the dropout rate is set too high, too many neurons are discarded in each training pass, which may make training unstable; if it is set too low, the model may overfit the training set, resulting in poor generalization.
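The 70/15/15 split described above can be sketched as follows; the curve count of 480 follows from the 96 × 5 tests of Section 2.2, while the seed and function name are illustrative:

```python
import numpy as np

def split_dataset(n_samples, seed=0):
    """Shuffle sample indices and split them 70/15/15 into
    training, validation, and testing sets."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(0.70 * n_samples)
    n_val = int(0.15 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_dataset(480)   # 480 simulated curves (96 tests x 5 pressures)
# → 336 training, 72 validation, 72 testing curves
```

Splitting by curve rather than by individual stress-strain pair keeps all windows from one test in the same set, which avoids leakage between training and evaluation.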
Fig 7 illustrates the time series prediction details for a time step. It is important to highlight that the time step discussed here pertains to the time series data, whereas in Table 1, it refers to a parameter associated with the DEM.

Fig 7. Processing time series data using the sliding window method, where t represents the label of the time series segment.

https://doi.org/10.1371/journal.pone.0321478.g007

4.2. Predictive results of the proposed model

Fig 8 shows the loss of the model during training. As shown in Fig 8, the model’s loss decreases gradually in the initial stage. As the number of training epochs increases, both the training loss and validation loss stabilize at a low level, indicating that the model has converged. Table 4 shows the evaluation metrics for the different models, and Fig 9 shows the prediction results of the different models for specific samples. From the table, the LSTM model has lower MSE, RMSE, and MAE and higher R2 than the RNN, BPNN, and XGBoost models. However, the LSTM model still requires further improvement in predicting the stress-strain curves of rock materials. Overall, the improvement of the proposed model is significant: compared with the LSTM model, the LSTM-AE model reduces the MSE, RMSE, and MAE by 0.0191, 0.08494, and 0.04745, respectively, and improves the R2 by 0.01599. Fig 10 illustrates the MSE distribution of the different models on the testing set. The LSTM-AE model exhibits a lower MSE than the other models, with a more concentrated error distribution. These results indicate that the proposed model has a smaller error between predicted and true values and stronger prediction and data-fitting ability. In contrast, the predictions of the other four models on specific samples show significant deviations.

Fig 8. Loss on the training and validation sets in the training process.

https://doi.org/10.1371/journal.pone.0321478.g008

Fig 9. Prediction results of different models on specific samples.

(a) LSTM-AE, (b) LSTM, (c) RNN, (d) BPNN, (e) XGBoost.

https://doi.org/10.1371/journal.pone.0321478.g009

Fig 10. Boxplots for the MSE of the different models in the testing set.

https://doi.org/10.1371/journal.pone.0321478.g010

4.3. The robustness of the LSTM-AE network

In practical applications, some samples affect the prediction accuracy of stress-strain curves. Therefore, this section analyzes the influence of the microscopic parameters of the rock material on the prediction accuracy of the stress-strain curve and verifies the robustness of the LSTM-AE model. Samples with coefficients of determination below 0.8 were analyzed first; the coefficients of determination of the different models for each group of samples on the testing set are shown in Fig 11. It can be seen that the LSTM, RNN, BPNN, and XGBoost models have lower prediction accuracy on individual samples. Analysis of the samples with lower prediction accuracy reveals that all of them involve a low confining pressure or a low friction coefficient. Thus, to investigate the effect of confining pressure and friction coefficient on prediction accuracy, the coefficients of determination of the five models were examined at different confining pressures and friction coefficients. Fig 12 shows that for the LSTM, RNN, BPNN, and XGBoost models, the coefficient of determination increases with the friction coefficient: bias exists at low friction coefficients, and high friction coefficients improve the accuracy of these four models. A similar pattern is not observed for the confining pressure. The LSTM-AE model, by contrast, achieves high accuracy on all these samples. To further verify the accuracy of the model, 10 sets of microscopic parameters with low friction coefficients and low confining pressures were designed to generate stress-strain data. The prediction results of the LSTM-AE are shown in Fig 13. The proposed LSTM-AE model still exhibits relatively high robustness at low friction coefficients and low confining pressures.

Fig 11. Coefficient of determination of each group of samples for different models on the testing set.

https://doi.org/10.1371/journal.pone.0321478.g011

Fig 12. Coefficients of determination for different models at different friction coefficients.

https://doi.org/10.1371/journal.pone.0321478.g012

Fig 13. Coefficients of determination of different models for special samples.

https://doi.org/10.1371/journal.pone.0321478.g013

5. Discussion

Currently, various numerical techniques are widely available to simulate the stress-strain behavior of rock materials. However, due to the complexity of rock materials, obtaining a rigorous constitutive model is extremely challenging. The development of deep learning provides a data-driven approach for researching constitutive models of rock materials. In this study, a method based on the LSTM-AE model is proposed for predicting stress-strain curves of rock materials in discrete element numerical simulations. The LSTM-AE model achieves better prediction accuracy and robustness than the methods commonly used for prediction. The prediction bias of traditional machine learning methods primarily arises from their limited ability to capture temporal dependencies, particularly on complex time-series problems. Additionally, the RNN model is susceptible to the vanishing gradient problem, while the small weight gradients in LSTM models may cause undertraining on low-stress and low-friction samples, resulting in large prediction biases. By leveraging the properties of the autoencoder, the LSTM-AE effectively mitigates the influence of noise and better extracts key features from time-series data, thereby improving prediction accuracy for low-stress and low-friction-coefficient samples. To further analyze the bias in these samples, methods such as Explainable Artificial Intelligence (XAI) and SHapley Additive exPlanations (SHAP) can be integrated in the future to provide deeper insight into the neural network’s inference process [70,71].

However, this study only discusses the prediction of stress-strain curves for rock materials in discrete element numerical simulations. In the future, the method can be extended to predict the stress-strain behavior of more complex DEM experiments, and the effects of more microscopic parameters on prediction accuracy can be explored, with the aim of improving the accuracy and reliability of data-driven stress-strain curve prediction for granular materials. In terms of application, the method offers an idea for researchers in the field of DEM numerical simulation. When performing DEM experiments, it is often necessary to determine a set of microscopic parameters by repeatedly adjusting them until the simulation produces the required stress-strain curve. This process is time-consuming and cumbersome, and a data-driven approach makes it possible to shortcut it: once a prediction model has been trained on a large amount of data, the stress-strain curve can be obtained quickly. The accuracy of the prediction is therefore particularly important in this process.

In the future, we aim to explore more efficient techniques to replace or enhance existing stress-strain curve prediction models, thereby improving the accuracy and reliability of data-driven approaches. Specifically, we plan to integrate additional microscopic parameters into the model, apply SHAP for feature analysis, and incorporate advanced methods such as attention mechanisms and transfer learning to further improve prediction accuracy and generalization. Furthermore, we will train, calibrate, and validate the model using laboratory experimental data. Our goal is to develop a more accurate, robust, and adaptable prediction model, contributing to research on stress-strain curve prediction for rock materials.

6. Conclusion

The stress-strain curve of a rock material is one of the important indicators for analyzing its mechanical properties. However, it is difficult to describe the stress-strain relationship of rock materials with a single constitutive law, and existing deep learning methods cannot guarantee the accuracy of stress-strain curve prediction for rock materials. In this study, with the help of discrete element numerical simulation, a deep learning method based on the LSTM-AE is proposed for predicting stress-strain curves of rock materials. The proposed model was compared with four models commonly used for prediction tasks: LSTM, RNN, BPNN, and XGBoost. The results show that the MSE, RMSE, MAE, and R2 of the LSTM-AE model are 0.00489, 0.06993, 0.03993, and 0.99591, respectively, and that all four evaluation indexes are better than those of the other four models, which verifies the accuracy of the LSTM-AE model. In addition, special samples were designed to further validate the accuracy of the LSTM-AE model. The proposed method improves the accuracy and reliability of stress-strain curve prediction. In the future, this method can be extended to 3D discrete element numerical simulations and applied to predict the stress-strain behavior of rock materials at different scales.

Supporting information

S1 File. Parameters of biaxial compression tests.

https://doi.org/10.1371/journal.pone.0321478.s001

(TXT)

References

  1. 1. Li J, Zhou K, Liu W, Zhang Y. Analysis of the effect of freeze–thaw cycles on the degradation of mechanical parameters and slope stability. Bull Eng Geol Environ. 2017;77(2):573–80.
  2. 2. Pan Y, Wu G, Zhao Z, He L. Analysis of rock slope stability under rainfall conditions considering the water-induced weakening of rock. Comput Geotech. 2020;128:103806.
  3. 3. Liu CN, Wu CC. Integrating GIS and stress transfer mechanism in mapping rainfall-triggered landslide susceptibility. Eng Geol. 2008;101(1–2):60–74.
  4. 4. Urmi ZA, Saeidi A, Yerro A, Chavali RVP. Prediction of post-peak stress-strain behavior for sensitive clays. Eng Geol. 2023;323:107221.
  5. 5. Cundall PA. A computer model for simulating progressive, large-scale movement in blocky rock system. Proc Int Symp Rock Mech. n.d.;8:129–36.
  6. 6. Cundall PA, Strack ODL. A discrete numerical model for granular assemblies. Géotechnique. 1979;29(1):47–65.
  7. 7. Hao Y, Hao H. Numerical investigation of the dynamic compressive behaviour of rock materials at high strain rate. Rock Mech Rock Eng. 2012;46(2):373–88.
  8. 8. Liu Q, Liu D, Tian Y, Liu X. Numerical simulation of stress-strain behaviour of cemented paste backfill in triaxial compression. Eng Geol. 2017;231:165–75.
  9. 9. Wu Z, Liu C. The stress distribution around a thick-walled cylinder by a proposed constitutive model for rocks. PLoS One. 2024;19(8):e0307878. pmid:39146261
  10. 10. Qu T, Di S, Feng YT, Wang M, Zhao TT, Wang MQ. Deep learning predicts stress–strain relations of granular materials based on triaxial testing data. Comput Model Eng Sci. 2021;128(1):129–44.
  11. 11. Zhang P, Yin Z-Y, Jin Y-F. State-of-the-art review of machine learning applications in constitutive modeling of soils. Arch Computat Methods Eng. 2021;28(5):3661–86.
  12. 12. Garaga A, Latha GM. Intelligent prediction of the stress–strain response of intact and jointed rocks. Comput Geotech. 2010;37(5):629–37.
  13. 13. Zhang QB, Zhao J. A review of dynamic experimental techniques and mechanical behaviour of rock materials. Rock Mech Rock Eng. 2013;47(4):1411–78.
  14. 14. Zhao Y, Zhang Z. Mechanical response features and failure process of soft surrounding rock around deeply buried three-centered arch tunnel. J Cent South Univ. 2015;22(10):4064–73.
  15. 15. Jin Y-F, Yin Z-Y, Wu Z-X, Daouadji A. Numerical modeling of pile penetration in silica sands considering the effect of grain breakage. Finite Elem Anal Des. 2018;144:15–29.
16. Zhang K, Shen S, Zhou A. Dynamic brittle fracture with eigenerosion enhanced material point method. Int J Numer Methods Eng. 2020;121(17):3768–94.
17. Zhang N, Shen S-L, Zhou A, Jin Y-F. Application of LSTM approach for modelling stress–strain behaviour of soil. Appl Soft Comput. 2021;100:106959.
18. Wang M, Qu T, Guan S, Zhao T, Liu B, Feng YT. Data-driven strain–stress modelling of granular materials via temporal convolution neural network. Comput Geotech. 2022;152:105049.
19. Maurizi M, Gao C, Berto F. Predicting stress, strain and deformation fields in materials and structures with graph neural networks. Sci Rep. 2022;12(1):21834. pmid:36528676
20. Zhang Z, Liu Q, Wu D. Predicting stress–strain curves using transfer learning: knowledge transfer across polymer composites. Mater Des. 2022;218:110700.
21. Zhang T, Li S, Yang H, Zhang F. Prediction of constrained modulus for granular soil using 3D discrete element method and convolutional neural networks. J Rock Mech Geotech Eng. 2024.
22. Shentu J, Lin B. A novel machine learning framework for efficient calibration of complex DEM model: a case study of a conglomerate sample. Eng Fract Mech. 2023;279:109044.
23. Emad-Eldeen A, Azim MA, Abdelsattar M, AbdelMoety A. Utilizing machine learning and deep learning for enhanced supercapacitor performance prediction. J Energy Storage. 2024;100:113556.
24. Ali M, Abdelsallam A, Rasslan A, Rabee A. Predictive modeling of heart rate dynamics based on physical characteristics and exercise parameters: a machine learning approach. Int J Phys Educ Fit Sports. 2024:1–14.
25. Abdelsattar M, AbdelMoety A, Emad-Eldeen A. Machine learning-based prediction of illuminance and ultraviolet irradiance in photovoltaic systems. Int J Holist Res. 2024;0(0):1–14.
26. Chauhan NK, Singh K. A review on conventional machine learning vs deep learning. 2018 Int Conf Comput Power Commun Technol (GUCON). 2018:347–52.
27. Ghaboussi J, Garrett Jr JH, Wu X. Knowledge-based modeling of material behavior with neural networks. J Eng Mech. 1991;117(1):132–53.
28. Ellis GW, Yao C, Zhao R, Penumadu D. Stress-strain modeling of sands using artificial neural networks. J Geotech Eng. 1995;121(5):429–35.
29. Banimahd M, Yasrobi SS, Woodward PK. Artificial neural network for stress–strain behavior of sandy soils: knowledge based verification. Comput Geotech. 2005;32(5):377–86.
30. Qu T, Di S, Feng YT, Wang M, Zhao TT. Towards data-driven constitutive modelling for granular materials via micromechanics-informed deep learning. Int J Plast. 2021;144:103046.
31. Wang L, Wang Z, Qu H, Liu S. Optimal forecast combination based on neural networks for time series forecasting. Appl Soft Comput. 2018;66:1–17.
32. Yang B, Yin K, Lacasse S, Liu Z. Time series analysis and long short-term memory neural network to predict landslide displacement. Landslides. 2019;16(4):677–94.
33. Salinas D, Flunkert V, Gasthaus J, Januschowski T. DeepAR: probabilistic forecasting with autoregressive recurrent networks. Int J Forecast. 2020;36(3):1181–91.
34. Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature. 1986;323(6088):533–6.
35. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.
36. Shi LL, Zhang J, Zhu QZ, Sun HH. Prediction of mechanical behavior of rocks with strong strain-softening effects by a deep-learning approach. Comput Geotech. 2022;152:105040.
37. Li D, Liu J, Zhang Z, Yan M, Dong Y, Liu J. Fractional calculus & machine learning methods based rubber stress-strain relationship prediction. Mol Simul. 2022;48(10):944–54.
38. Li K-Q, Yin Z-Y, Zhang N, Liu Y. A data-driven method to model stress-strain behaviour of frozen soil considering uncertainty. Cold Reg Sci Technol. 2023;213:103906.
39. Yang W, Wang M, Zhou Z, Li L, Yang G, Ding R. Research on the relationship between macroscopic and mesoscopic mechanical parameters of limestone based on Hertz Mindlin with bonding model. Geomech Geophys Geo-energ Geo-resour. 2020;6(4).
40. Li C, Yin H, Liu C, Cai S. Design and test of parallel discrete element method program of shared memory type. J Nanjing Univ Nat Sci. 2017;53:1161–70.
41. Li C, Yin H, Jia D, Zhang J, Wang W, Xu S. Validation tests for discrete element codes using single-contact systems. Int J Geomech. 2018;18(6).
42. Li C. Quantitative analysis and simulation of structural deformation in the fold and thrust belt based on discrete element method. 2019.
43. Li C, Yin H, Wu C, Zhang Y, Zhang J, Wu Z. Calibration of the discrete element method and modeling of shortening experiments. Front Earth Sci. 2021;9:636512.
44. Li C, Yin H, Xu W, Wu Z, Guan S, Jia D, et al. Quantitative analysis and simulation of compressive tectonics based on discrete element method. Geotecton et Metallog. 2022;46:645–61.
45. Germanovich LN, Salganik RL, Dyskin AV, Lee KK. Mechanisms of brittle fracture of rock with pre-existing cracks in compression. Pure Appl Geophys. 1994;143(1–3):117–49.
46. Botter C, Cardozo N, Hardy S, Lecomte I, Escalona A. From mechanical modeling to seismic imaging of faults: a synthetic workflow to study the impact of faults on seismic. Mar Pet Geol. 2014;57:187–207.
47. Morgan J. Effects of cohesion on the structural and mechanical evolution of fold and thrust belts and contractional wedges: discrete element simulations. J Geophys Res Solid Earth. 2015;120(5):3870–96.
48. Maxwell SA. Deformation styles of allochthonous salt sheets during differential loading conditions: insights from discrete element models. 2009.
49. Itasca Consulting Group. PFC2D Users' Manual (version 3.1). 2004.
50. Wang Y, Tonon F. Modeling Lac du Bonnet granite using a discrete element model. Int J Rock Mech Min Sci. 2009;46(7):1124–35.
51. Hsieh Y-M, Li H-H, Huang T-H, Jeng F-S. Interpretations on how the macroscopic mechanical behavior of sandstone affected by microscopic properties—Revealed by bonded-particle model. Eng Geol. 2008;99(1–2):1–10.
52. Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E. Deep learning applications and challenges in big data analytics. J Big Data. 2015;2(1):1–21.
53. Fujiyoshi H, Hirakawa T, Yamashita T. Deep learning-based image recognition for autonomous driving. IATSS Res. 2019;43(4):244–52.
54. Akbar S, Raza A, Zou Q. Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model. BMC Bioinformatics. 2024;25(1):102. pmid:38454333
55. Akbar S, Hayat M, Iqbal M, Jan MA. iACP-GAEnsC: evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space. Artif Intell Med. 2017;79:62–70. pmid:28655440
56. Akbar S, Zou Q, Raza A, Alarfaj FK. iAFPs-Mv-BiTCN: predicting antifungal peptides using self-attention transformer embedding and transform evolutionary based multi-view features with bidirectional temporal convolutional networks. Artif Intell Med. 2024;151:102860. pmid:38552379
57. Jozefowicz R, Zaremba W, Sutskever I. An empirical exploration of recurrent network architectures. Proc Int Conf Mach Learn. 2015:2342–50.
58. Yu Y, Si X, Hu C, Zhang J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019;31(7):1235–70. pmid:31113301
59. Lu W, Li J, Li Y, Sun A, Wang J. A CNN-LSTM-based model to forecast stock prices. Complexity. 2020;2020:1–10.
60. Wang JQ, Du Y, Wang J. LSTM based long-term energy consumption prediction with periodicity. Energy. 2020;197:117197.
61. Zhang J, Zhu Y, Zhang X, Ye M, Yang J. Developing a long short-term memory (LSTM) based model for predicting water table depth in agricultural areas. J Hydrol. 2018;561:918–29.
62. Li D, Li L, Li X, Ke Z, Hu Q. Smoothed LSTM-AE: a spatio-temporal deep model for multiple time-series missing imputation. Neurocomputing. 2020;411:351–63.
63. Wei Y, Jang-Jaccard J, Xu W, Sabrina F, Camtepe S, Boulic M. LSTM-autoencoder-based anomaly detection for indoor air quality time-series data. IEEE Sensors J. 2023;23(4):3787–800.
64. Hnamte V, Nhung-Nguyen H, Hussain J, Hwa-Kim Y. A novel two-stage deep learning model for network intrusion detection: LSTM-AE. IEEE Access. 2023;11:37131–48.
65. Nguyen HD, Tran KP, Thomassey S, Hamad M. Forecasting and anomaly detection approaches using LSTM and LSTM Autoencoder techniques with the applications in supply chain management. Int J Inf Manage. 2021;57:102282.
66. Sabri M, El Hassouni M. Photovoltaic power forecasting with a long short-term memory autoencoder networks. Soft Comput. 2023;27(15):10533–53.
67. Du S, Li T, Yang Y, Horng S-J. Multivariate time series forecasting via attention-based encoder–decoder framework. Neurocomputing. 2020;388:269–79.
68. Zhou H, Zhang S, Peng J, Zhang S, Li J, Xiong H. Informer: beyond efficient transformer for long sequence time-series forecasting. Proc AAAI Conf Artif Intell. 2021;35(12):11106–15.
69. Koh SJA, Lee HP. Molecular dynamics simulation of size and strain rate dependent mechanical response of FCC metallic nanowires. Nanotechnology. 2006;17(14):3451–67. pmid:19661590
70. Barredo Arrieta A, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, et al. Explainable Artificial Intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf Fusion. 2020;58:82–115.
71. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. arXiv preprint arXiv:1705.07874. 2017.