## Figures

## Abstract

In view of the strong randomness and non-stationarity of complex system, this study suggests a hybrid multi-strategy prediction technique based on optimized hybrid denoising and deep learning. Firstly, the Sparrow search algorithm (SSA) is used to optimize Variational mode decomposition (VMD) which can decompose the original signal into several Intrinsic mode functions (IMF). Secondly, calculating the Pearson correlation coefficient (PCC) between each IMF component and the original signal, the subsequences with low correlation are eliminated, and the remaining subsequence are denoised by Wavelet soft threshold (WST) method to obtain effective signals. Thirdly, on the basis of the above data noise reduction and reconstruction, our proposal combines Convolutional neural network (CNN) and Bidirectional short-term memory (BiLSTM) model, which is used to analyze the evolution trend of real time sequence data. Finally, we applied the CNN-BiLSTM-SSA-VMD-WST to predict the real time sequence data together with the other methods in order to prove it’s effectiveness. The results show that SNR and CC of the SSA-VMD-WST are the largest (the values are 20.2383 and 0.9342). The performance of the CNN-BiLSTM-SSA-VMD-WST are the best, MAE and RMSE are the smallest (which are 0.150 and 0.188), the goodness of fit *R*^{2} is the highest(its value is 0.9364). In contrast with other methods, CNN-BiLSTM-SSA-VMD-WST method is more suitable for denoising and prediction of real time series data than the traditional and singular deep learning methods. The proposed method may provide a reliable way for related prediction in various industries.

**Citation: **Su H, Ma S, Xu X (2024) The multi-strategy hybrid forecasting base on SSA-VMD-WST for complex system. PLoS ONE 19(4):
e0300142.
https://doi.org/10.1371/journal.pone.0300142

**Editor: **Qichun Zhang,
Buckinghamshire New University - High Wycombe Campus: Buckinghamshire New University, UNITED KINGDOM

**Received: **November 13, 2023; **Accepted: **February 21, 2024; **Published: ** April 18, 2024

**Copyright: ** © 2024 Su et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **All relevant data are within the paper.

**Funding: **This work was supported by the grants from the National Natural Science Foundation of China (No. 12362005), Ningxia higher education first-class discipline construction funding project (NXYLXK2017B09), Major Special project of North Minzu University (No. ZDZX201902). The funders had role in supervision, writing, decision to publish and review of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## 1 Introduction

With the noise influence of complex time series data, the accuracy of prediction in many industries can not reach the ideal effect. To address the strong noise problem of complex time series [1, 2], traditional denoising methods have already been used widely, such as Wavelet analysis [3], Singular spectrum analysis [4] and Fourier transform [5]. Nguyen et al. [3] used wavelet analysis to reduce the noise of natural gas price time series and input the reduction sequence into the prediction model to achieve higher prediction accuracy. Wen et al. [4] applied Singular spectrum analysis to decompose stock time series and predicted the trend based on Support vector machine.

In relation to the traditional methods, empirical mode decomposition(EMD) has obvious advantages in processing complex time series data [6], which can decompose complex time series into signal components from high frequency to low frequency. However, sometimes EMD would suffer from the mode aliasing and end effect problems [7, 8]. Wu and Huang introduced the ensemble empirical mode decomposition(EEMD) as a solution to the drawbacks of EMD [9], in which adaptive sequential diagnostic algorithm combined with EEMD was used to improve the stability and generalization capacity of the prediction model with lower error. However, a lot of redundant information was generated. To address the problems mentioned above, Drag-omiretskiy et al. [10] presented variational mode decomposition(VMD) which can separate signals into their modal components at various frequencies. VMD method can effectively solve EMD mode aliases and end effects, which has stronger robustness [11, 12]. Nevertheless, the parameters in VMD are indeterminate and different parameter settings will give different results. Therefore, finding the best solution for the parameters in VMD is currently a significant problem [13]. Yang et al. [14] employed a genetic algorithm to maximize the kurtosis of the index set as the objective function in order to estimate the VMD parameters in an adaptive manner. Yan and co-author [15] utilized the cuckoo search algorithm to acquire VMD parameters with the minimal mean envelope entropy as the optimization aim. Zhou et al. [16] utilized the particle swarm optimization approach and the average permutation entropy as the fitness function to optimize. While the previous techniques have yielded some satisfactory outcomes, they were all designed with a single assessment index as the goal function to be optimized rather than a more comprehensive global optimization. To deal with the uncertain effects from collecting historical data, many scholars used the wavelet transform (WT) [17–19] which can remove random fluctuations for prediction. Artificial neural network was utilized to predict wind power based on WT [20]. A hybrid model based on Discrete wavelet threshold was presented to prediction [21]. He et al. [22] successfully predicted monthly precipitation and climatic indices using a hybrid wavelet neural network model. In fact the single denoising methods can’t eliminate noise well [23, 24] for the complex time series data, so scholars began to pay attention to hybrid denoising methods. Chao et al. [25] used TVF-EMD-ENN to predict precipitation time series data, which achieved higher prediction accuracy than EMD. Lv et al. [26] choosed WT and Singular value decomposition effectively to suppress multi-source noise. Lin et al. [27] applied EMD and WT to process the noise signal. Dao et al. [28] used WT and EEMD to complete the effective removal of uncertain noise in fault signals. Song et al. [29] choosed the principle of multiple wavelet denoising to denoise NMR spectra, which showed that the hybrid method effectively to removes a series of spikes comparing with WT method.

Based on above analysis, the VMD parameter optimization model with comprehensive evaluation index is proposed in this paper, which can obtain the optimal VMD parameters in the global scope. Additionally, in order to overcome the noise redundancy problems for single denoising method, a novel time series denoising strategy is presented in this paper based on the hybrid optimized VMD and wavelet soft threshold(WST) to accurate trend evolution. This paper is presented as follows. In section 2, we build a model for optimizing VMD with SSA by establishing a new comprehensive fitness function. In section 3, a hybrid noise reduction model is given through PCC. Finally, we verify the theoretical model with real time series data by numerical experiment, CNN-BiLSTM-SSA-VMD-WST method is proved more suitable for denoising and prediction of real time series data than the traditional and single deep learning methods.

## 2 Correlation theory

### 2.1 Variational mode decomposition

VMD is used to decompress the original signal into IMF components and find the best variational mode solution, which is a completely non-recursive model [30].

For the optimal solution of the constrained variational model, the principle is as follows [30]:

The mode equation is explained as an amplitude frequency signal, which is defined as follows Eq (1):
(1)
where *a*_{k}(*t*) is corresponding vibration amplitude of the IMF, the derivative of *ϕ*(*t*) is the frequency of the IMF. The variational solution principle of VMD is as follows:

- The analytic signal corresponding to each
*s*_{k}(*t*) is calculated using Hilbert transform to get frequency of single-sided defined as follows Eq (2): (2) where*δ*(*t*) is the impact function, and its expression is as follows Eq (3): (3) - Add the
*e*^{−jwkt}and pass the spectrum of each modal component to the corresponding baseband is defined as follows Eq (4): (4) - The principle of VMD can be rewritten as an optimization problem with constraints as follows Eq (5):
(5)
where
*s*_{k}(*t*) is show in Eq (1),*w*_{k}is the frequency center of every modal component, K is the number of iterations,*f*(*t*) is the original signal,*s*.*t*is the bound term, * is the convolution calculation symbol. - Add a penalty factor
*α*and an improved Lagrange formula to solve Eq (5), convert ordinary variation problems into unconstrained variation problems. An extended Lagrange expression is obtained as follows Eq (6): (6) where*α*is the second penalty factor which can be adjusted to achieve the completeness of the VMD. - After processing, the optimal solution expression of the constrained variational model is obtained as follows Eq (8): (7) (8)

### 2.2 SSA algorithm for parameter optimization based on VMD

There are two important parameters that need to be set in the VMD, namely the number of signal decomposition *K* and the penalty parameter *α*. If the *K* is set to a small value, it will lead to insufficient signal decomposition; If the *K* is greater than the number of useful component signals contained in the analyzed signal, it will cause the analyzed signal to be over decomposed. In addition, the penalty parameter *α* in the VMD is used to ensure the accuracy of signal reconstruction. In this paper, the parameter optimization of VMD based on SSA algorithm is constructed, which can get the optimized parameters quickly and accurately. SSA was first proposed by Xue [31] in 2020. This algorithm imitates the foraging behavior of the sparrow population for target optimization, and has strong optimization ability and fast convergence speed.

The original signal is decomposed into *K* modal components by VMD. If the modal component contains less noise, the information related to the signal of the original time series data will be more obvious and the sample entropy will be smaller. The ratio of the sample entropy and PCC is used as a comprehensive evaluation index to construct objective function. In this paper, the procedure for optimizing VMD parameters is transformed into seek the ratio of minimum sample entropy and PCC by SSA.

Sample entropy can measure the irregularity and complexity of the signals. The lower the entropy, the stronger the regularity of information. PCC is a good measure for the correlation between two data sets shown as follows Eq (9):
(9)
where *N* is the sequence length, *B*^{m}(*r*) is the probability of matching *m* point before and after signal reconstruction under tolerance *r*. The value of Eq (9) is related to the selection of *m*, *r*, *N*. Different embedding dimensions *m* and similar tolerances *r* correspond to different sample entropy. In general, *m*=1 or 2, *r*=0.1∼0.25*SD*_{x}, *SD*_{x} is the standard deviation of the new sequence after decomposition. In this paper, we choose *m*=2, *r*=0.25*SD*_{x}. *R* is the degree of correlation between the subsequence and the original sequence is defined as follows Eq (10),
(10)
where *X* and *Y* are the sequences before and after reconstruction. *Fit* is fitness function is defined as follows Eq (11),
(11)
where *D* = *log* 10(*d*), *d* is the number of estimated center frequencies for the modes.

The minimum of *Fit* means that sample entropy is minimum and *R* is maximum. At the moment, the new sequence signal is simpler. We can make decomposition layers *K* ∈ [2, 15] and quadratic penalty factor *α* ∈ [200, 4000] [32]. The following Fig 1 shows the detailed procedure for SSA-VMD.

### 2.3 Wavelet Soft threshold denoising

Donoho [17, 18] proposed the wavelet threshold denoising, reducing irrelevant signals in the original signal is its basic idea. The process of wavelet threshold denoising is shown in Fig 2.

In the two processes of wavelet decomposition and reconstruction, the different types of wavelet bases (dbN, symN, coifN, N is an integer) directly affect signal decomposition and reconstruction. Generally, the number of decomposition layers is chosen based on the properties of the signal. The choice of wavelet threshold value and threshold function also affects the denoising impact. There are four kinds of wavelet threshold value: heuristic threshold, fixed threshold, unbiased risk estimation threshold and minimax threshold. The unbiased risk threshold is more suitable for situations where the noise is similar to additive white Gaussian noise. Minimax threshold is more suitable for signal filtering with low signal-to-noise ratio (SNR) [33]. The heuristic threshold is the synthesis of unbiased risk estimation threshold and fixed threshold. If the threshold value is too small, the denoising effect will be unsatisfactory. If the threshold value is too large, some effective signals will be eliminated. The principle of threshold is as follows [27].

The fixed threshold which is shown as follows Eq (12):
(12)
where *N* is the original signal length. Standard deviation of the noise is defined as follows Eq (13):
(13)
where *w*_{j,k} is the original wavelet coefficient.

The noise reduction effect and threshold function selection are closely related. Different threshold functions have different noise reduction effects, mainly including hard threshold function and soft threshold function. Hard threshold function is presented as follows Eq (14):
(14)
Soft threshold function is shown as Eq (15) (15)
where *sgn*(⋅) is a symbolic function, and the expression is shown as Eq (16):
(16)
According to Eqs (14) and (15), the hard threshold function can eliminate the wavelet coefficient which lower than the threshold value in the space. The soft threshold function is to further improve the residence of the hard threshold value, and the signal with the soft threshold value has better smoothness after denoising. Therefore, this paper selects the soft threshold function to denoise the time series signal.

### 2.4 CNN-BiLSTM model

#### 2.4.1 Convolutional neural network.

CNN for time series prediction have been paid more attention recently. Three components typically make up CNN: the convolution layer, the pooling layer, and the fully connected layer [34]. In this study, after the experimental data is input into CNN, the convolution layer analyzes the input experimental data using various convolutions, and then the activation function endow nonlinear features for the experimental data. As a result, the experimental data gets the local feature information and achieves the purpose of extracting features. Subsequently, the pooling layer will reduce dimensionality sampling and connect more important data from the convolutional output for making the output feature components more stable. Finally, the fully connected layer creates column vectors of a specific length from the features output from the pooling layer and transmits to the Bidirectional long short-term memory (BiLSTM) network for subsequent operations.

#### 2.4.2 Bidirectional Long Short-Term Memory network.

BiLSTM is able to further processed the results of the CNN, and the time features of experimental data are extracted with the memory function. BiLSTM is developed from the Recurrent neural network (RNN) [35], which can enter data into the network for calculation at every time point, and every hidden layer sends its output immediately to the following time point and the next layer of the network. Its structure is shown in Fig 3.

Each hidden layer consists of two Long short term memory (LSTM), a forward LSTM can obtain the output data from the previous moment and compute it for the following moment. A backward LSTM can obtain the output data from the following moment and compute it for the previous moment. There are four gated units in each LSTM computing unit, they are input gate (*i*_{t}), output gate(*o*_{t}), control gate(*c*_{t})and forget gate(*f*_{t}), and M is the network module. The relationship between the various gates are as follow [34, 35]:

The forget gate determines how much the input information *h*_{t−1}of the previous moment is stored in the current control gate(*c*_{t}), the relationship between the output *f*_{t} and the current time input *x*_{t} is shown as follows Eq (17):
(17)
where *w*_{f} and *b*_{f} are the weights and biases entered for the forget gate respectively, *σ* is the sigmoid activation function that gives the nonlinear properties of the network.

How much of the input data is currently reserved for *c*_{t} is determined by the input gate. The intermediate variable is used as confirmed whether the cell state *i*_{t} will be added. The related functional connection is shown as follows Eqs (18) and (19):
(18) (19)
where *w*_{c} and *w*_{i} are the middle weight parameters variables , *i*_{t} respectively. *b*_{c} and *b*_{i} are the intermediate variables respectively, *i*_{t} is the bias parameters, tanh is the activation function.

The amount that the current control gate *c*_{t} is transferred to the current time output *h*_{t} is determined by the output gate. The related functional connection is shown as follows Eq (22):
(20) (21) (22)
where *w*_{o} and *b*_{o} are the weight parameters and bias parameters of the output gate respectively, *c*_{t} is the outcome of control gate. The result of the LSTM computation unit is denoted by *h*_{t}.

The sum of the output values from the forward and backward LSTM units at time *t* is the overall output value of the BiLSTM computing unit. The particular formula for calculating is as follows Eq (25):
(23) (24) (25)
where ⊙ represents the concatenation of two vectors.

#### 2.4.3 CNN-BiLSTM model.

CNN-BiLSTM has not only the advantages of CNN, but also the strong information memory ability of BiLSTM. In this paper, CNN is first used for convolution processing of the input time series data, and then BiLSTM is extracted the features of the time series data, which retains the original information of the data to a large extent. Finally, the results are output from the fully connected layer, which significantly raises prediction accuracy. The process is shown in Fig 4.

## 3 Hybrid denoising strategy for SSA-VMD-WST

In this paper, we optimize two parameters which can affect the VMD decomposition based on SSA algorithm, and then carry out WST denoising by analyzing the PCC of each component and the original sequence. The process is as follows in Fig 5: We optimize VMD using SSA algorithm to get the optimal parameter decomposition layer and penalty parameter. The optimized VMD is used to decompose the original signal. Then the PCC of the decomposed IMF component was calculated. The smaller PCC component was removed, and then the retained IMF component was denoised by wavelet threshold. The PCC definition of the IMF component is shown as Eq (26) (26) Reconstruct the signal to obtain the final signal.

## 4 Numerical experiment analysis

In order to verify the validity of proposed model, the daily maximum temperature data (Data from the http://data.cma.cn/data/cdcdetail/dataCode/A.0019.0001.S002.html) during 2019–2022 were selected as the samples in this paper.

### 4.1 VMD parameter optimization and signal filtering analysis

Bring in the daily maximum temperature data and use the SSA algorithm to optimize the parameters of VMD. Make the following settings before the experiment, the population number is set to 20, the maximum number of iterations is set to 15. The minimum *Fit* value is equal to 0.11569 after two iterations, the output decomposition layer *K* = 13 and quadratic penalty coefficient *α* = 1549.

From Table 1, based on the root mean-square error(RMSE) and the mean absolute error(MAE) [36], the data error of decomposition parameters *K* = 13, *α* = 1549 are the smallest, MAE and RMSE are 0.326 and 0.416. Under the same *α*, as *K* value increases to 15, the data error of before and after denoising increased by 53.16% and 53.83%. So it can be considered that the signal has been completely decomposed when *K* = 13. Parameters *K* and *α* optimized by the SSA algorithm were input into Matlab, and 13 IMF components were shown as in Fig 6 after SSA-VMD decomposition.

In this paper, the threshold of PCC is set to 0.15. It is believed that when the PCC of IMF is greater than 0.15, we consider it is an effective mode. The retained IMF can express the features of the original signal with less information loss. The PCC of each IMF component are calculated shown in Table 2. It can be found that the PCC of IMF1∼IMF8 are all less than 0.15, and the PCC of reserved IMF9∼IMF13 are all greater than 0.15. It can be considered that IMF9∼IMF13 can preserve the signal characteristics greatly.

### 4.2 Analysis of denoising effect

By substituting the data into the program, it is obtained that the most suitable wavelet threshold is fixed threshold, and the number of decomposition layers is 5 layers. At the same time, the best wavelet basis is db3. The results are shown in Fig 7. The comparison results between the original signal and the other denoising signal results are shown in Fig 8.

Compared with the above denoising signal diagram in Fig 8, the effect of SSA-VMD-WST denoising is smoother than SSA-VMD and WST methods. This shows that only after SSA-VMD denoising, noise is still left. In order to further evaluate the advantages of the SSA-VMD-WST, the signal denoising is quantitatively analyzed by means of four common signal index, MAE, RMSE [36], cross-correlation(CC) [37] and signal-to-noise ratio(SNR) [38].

The index comprehensively evaluate the effect of signal denoising. SNR refers to the ratio of useful signal to noise. Therefore, the higher SNR and CC of the signal after denoising, the smaller RMSE and MAE with original signal, the better denoising effect is. The evaluation index values of the three denoising methods are shown in Table 3.

In Table 3, the RMSE, CC and SNR of the SSA-VMD-WST denoising method are 1.0727, 0.9342 and 20.2383, respectively, which are the best model. The SNR is the largest which indicates that the SSA-VMD-WST can retain the original signal information better than other methods and remove more noise than SSA-VMD and WST methods, and the CC is the largest which shows that the SSA-VMD-WST can ensure consistent data fluctuation before and after denoising.

### 4.3 Prediction of daily maximum temperature based on CNN-BiLSTM

Aiming at accuracy and rationality of the CNN-BiLSTM, the daily maximum temperature prediction is take an example to shown in Fig 9 and Table 4 compared with different denosing methods and different deep leaning methods.

After WST denosing, the daily maximum temperature predictions with real data, RNN, CNN, BiLSTM and CNN-BiLSTM are shown in Fig 9(a). From the figure we can find the CNN-BiLSTM with WST denosing has better fitting effect than RNN, CNN and BiLSTM with WST denosing. In Fig 9(b), we can know that there is better fitting effect for the CNN-BiLSTM with SSA-VMD denosing than RNN, CNN and BiLSTM. After SSA-VMD-WST denosing, the daily maximum temperature predictions with real data, RNN, CNN, BiLSTM and CNN-BiLSTM are shown in Fig 9(c). From the figure we can find the CNN-BiLSTM with SSA-VMD-WST denosing has better trend evolution. In Fig 9(d), we can know that there is better fitting effect for CNN-BiLSTM with real data for the SSA-VMD-WST compared with other denosing methods.

The trend fitting of the SSA-VMD-WST based on CNN-BiLSTM in Fig 9 is smoother and more accurate. At the same time, according to the data in the Table 4, the hybrid strategy which is combined SSA-VMD-WST with CNN-BiLSTM has the best prediction effect, RMSE and MAE are the smallest which are 0.188 and 0.150, the goodness of fit *R*^{2} is the highest, and its value is 0.9364. Compared with CNN-BiLSTM-WST error decreased by 72.47% and 84.03%. Meanwhile, in relation to CNN-BiLSTM-SSA-VMD error decreased by 54.80% and 53.42%. In Table 4, we also find that the data without noise reduction has the worst prediction effect, indicating that the data itself has a greater noise interfere.

## 5 Conclusions

With the aim to overcome the strong noise problem of complex time series data, a new hybrid denoising model SSA-VMD-WST is proposed based on the traditional complex time series denoising. Firstly, a new fitness function is constructed by using sample entropy and PCC, which is embedded into SSA to realize the comprehensive optimization of VMD uncertain parameters and establish SSA-VMD model. Then, VMD is applied to decompose and denoise the complex time series data. Considering the effectiveness of each subsequence, PCC method is used to filter the subsequence. The SSA-VMD-WST model is established after denoising the reserved subsequences by WST method. Finally, CNN-BiLSTM model is used to fit the trend of the denoised data. Compared with CNN-BiLSTM-SSA-VMD and CNN-BiLSTM-WST, the RMSE of CNN-BiLSTM-SSA-VMD-WST decreased by 54.80% and 72.47% respectively, and the MAE decreased by 53.42% and 84.03%, which indicates that the proposed method is suitable for nonlinear and nonstationary complex time series data denoising and prediction. In addition, some outliers may be removed as noise in the above process. Therefore, the prediction accuracy can be improved by further processing in the future research.

## Acknowledgments

The authors would like to thank the editors and reviewers for their valuable suggestions on the logic and preciseness of this article.

## References

- 1. Shirazi A H, Raoufy M R, Ebadi H, et al. Quantifying memory in complex physiological time-series[J]. PLoS One, 2013, 8(9): e72854. pmid:24039811
- 2. Zhang C Y, Lan Q J, Mi X T, Zhou Z D, Ma C Q, Mi X H. A denoising method based on the nonlinear relationship between the target variable and input features[J]. Expert Systems with Applications, 2023, 218: 119585.
- 3. Nguyen H T, Nabney I T. Short-term electricity demand and gas price forecasts using wavelet transforms and adaptive models[J]. Energy, 2010, 35(9): 3674–3685.
- 4. Fenghua W E N, Jihong X, Zhifang H E, et al. Stock price prediction based on SSA and SVM[J]. Procedia Computer Science, 2014, 31: 625–631.
- 5. Radojičić D, Kredatus S. The impact of stock market price fourier transform analysis on the gated recurrent unit classifier model[J]. Expert Systems with Applications, 2020, 159: 113565.
- 6. Huang N E, Shen Z, Long S R, et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis[J]. Proceedings of the Royal Society of London. Series A: mathematical, physical and engineering sciences, 1998, 454(1971): 903–995.
- 7. Wu Z, Huang N E. Ensemble empirical mode decomposition: a noise-assisted data analysis method[J]. Advances in adaptive data analysis, 2009, 1(01): 1–41.
- 8. Zheng J, Tong J, Ni Q, et al. Partial ensemble approach to resolve the mode mixing of extreme-point weighted mode decomposition[J]. Digital Signal Processing, 2019, 89: 70–81.
- 9. Sun N, Zhou J, Chen L, et al. An adaptive dynamic short-term wind speed forecasting model using secondary decomposition and an improved regularized extreme learning machine[J]. Energy, 2018, 165: 939–957.
- 10. Dragomiretskiy K, Zosso D. Variational mode decomposition[J]. IEEE transactions on signal processing, 2013, 62(3): 531–544.
- 11. Abdoos A A. A new intelligent method based on combination of VMD and ELM for short term wind power forecasting[J]. Neurocomputing, 2016, 203: 111–120.
- 12. Zhang D, Peng X, Pan K, et al. A novel wind speed forecasting based on hybrid decomposition and online sequential outlier robust extreme learning machine[J]. Energy conversion and management, 2019, 180: 338–357.
- 13. Jin Z Z, He D, Ma R, et al. Fault diagnosis of train rotating parts based on multi-objective VMD optimization and ensemble learning[J]. Digital Signal Processing, 2022, 121: 103312.
- 14. Zhou F, Yang X, Shen J, et al. Fault diagnosis of hydraulic pumps using PSO-VMD and refined composite multiscale fluctuation dispersion entropy[J]. Shock and Vibration, 2020, 2020: 1–13.
- 15. Yan X, Jia M. Application of CSA-VMD and optimal scale morphological slice bispectrum in enhancing outer race fault detection of rolling element bearings[J]. Mechanical Systems and Signal Processing, 2019, 122: 56–86.
- 16. Zhou X, Li Y, Jiang L, et al. Fault feature extraction for rolling bearings based on parameter-adaptive variational mode decomposition and multi-point optimal minimum entropy deconvolution[J]. Measurement, 2021, 173: 108469.
- 17. Donoho D L. De-noising by soft-thresholding[J]. IEEE transactions on information theory, 1995, 41(3): 613–627.
- 18. Donoho D L, Johnstone I M. Adapting to unknown smoothness via wavelet shrinkage[J]. Journal of the american statistical association, 1995, 90(432): 1200–1224.
- 19. Gurley K, Kareem A. Applications of wavelet transforms in earthquake, wind and ocean engineering[J]. Engineering structures, 1999, 21(2): 149–167.
- 20. Catalão J P S, Pousinho H M I, Mendes V M F. Short-term wind power forecasting in Portugal by neural networks and wavelet transform[J]. Renewable energy, 2011, 36(4): 1245–1251.
- 21. Azimi R, Ghofrani M, Ghayekhloo M. A hybrid wind power forecasting model based on data mining and wavelets analysis[J]. Energy conversion and management, 2016, 127: 208–225.
- 22. He X, Guan H, Qin J. A hybrid wavelet neural network model with mutual information and particle swarm optimization for forecasting monthly rainfall[J]. Journal of Hydrology, 2015, 527: 88–100.
- 23. Li J, Zhang X, Tang J. Noise suppression for magnetotelluric using variational mode decomposition and detrended fluctuation analysis[J]. Journal of Applied Geophysics, 2020, 180: 104127.
- 24. Zhang X, Li L, Xu C. Large-scale landslide inventory and their mobility in Lvliang City, Shanxi Province, China[J]. Natural Hazards Research, 2022, 2(2): 111–120.
- 25. Song C, Chen X, Wu P, et al. Combining time varying filtering based empirical mode decomposition and machine learning to predict precipitation from nonlinear series[J]. Journal of Hydrology, 2021, 603: 126914.
- 26. Lv P, Wu X, Zhao Y, et al. Noise removal for semi-airborne data using wavelet threshold and singular value decomposition[J]. Journal of Applied Geophysics, 2022, 201: 104622.
- 27. Lin T, Yu S, Wang P, et al. Removal of a series of spikes from magnetic resonance sounding signal by combining empirical mode decomposition and wavelet thresholding[J]. Review of Scientific Instruments, 2022, 93(2). pmid:35232142
- 28. Dao F, Zeng Y, Qian J. A novel denoising method of the hydro-turbine runner for fault signal based on WT-EEMD[J]. Measurement, 2023, 219: 113306.
- 29. Song P, Xu J, Liu X, et al. Stationary Wavelet denoising of solid-state NMR spectra using multiple similar measurements[J]. Journal of Magnetic Resonance, 2023: 107615. pmid:38310668
- 30. Liu Y, Wang L H, Yang L B, et al. Drought prediction based on an improved VMD-OS-QR-ELM model[J]. Plos one, 2022, 17(1): e0262329. pmid:34990468
- 31. Xue J, Shen B. A novel swarm intelligence optimization approach: sparrow search algorithm[J]. Systems science, 2020, 8(1): 22–34.
- 32. Geng G, He Y, Zhang J, et al. Short-Term Power Load Forecasting Based on PSO-Optimized VMD-TCN-Attention Mechanism[J]. Energies, 2023, 16(12): 4616.
- 33. Mazumder R, Radchenko P, Dedieu A. Subset selection with shrinkage: Sparse linear modeling when the SNR is low[J]. Operations Research, 2023, 71(1): 129–147.
- 34. Arkin E, Yadikar N, Xu X, et al. A survey: Object detection methods from CNN to transformer[J]. Multimedia Tools and Applications, 2023, 82(14): 21353–21383.
- 35. Wang M, Rahardja S, Fränti P, et al. Single-lead ECG recordings modeling for end-to-end recognition of atrial fibrillation with dual-path RNN[J]. Biomedical Signal Processing and Control, 2023, 79: 104067.
- 36. Chai T, Draxler R R. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature[J]. Geoscientific model development, 2014, 7(3): 1247–1250.
- 37. Dong X, Fan Q, Li D. Detrending moving-average cross-correlation based principal component analysis of air pollutant time series[J]. Chaos, Solitons & Fractals, 2023, 172: 113558.
- 38. Johnson D H. Signal-to-noise ratio[J]. Scholarpedia, 2006, 1(12): 2088.