Figures
Abstract
Copper is an important mineral and fluctuations in copper prices can affect the stable functioning of some countries’ economies. Policy makers, futures traders and individual investors are very concerned about copper prices. In a recent paper, we use an artificial intelligence model long short-term memory (LSTM) to predict copper prices. To improve the efficiency of long short-term memory (LSTM) model, we introduced a simulated annealing (SA) algorithm to find the best combination of hyperparameters. The feature engineering problem of the AI model is then solved by correlation analysis. Three economic indicators, West Texas Intermediate Oil Price, Gold Price and Silver Price, which are highly correlated with copper prices, were selected as inputs to be used in the training and forecasting model. Three different copper price time periods, namely 485, 363 and 242 days, were chosen for the model forecasts. The forecast errors are 0.00195, 0.0019 and 0.00097, respectively. Compared with the existing literature, the prediction results of this paper are more accurate and less error. The research in this paper provides a reliable reference for analyzing future copper price changes.
Citation: Chen J, Yi J, Liu K, Cheng J, Feng Y, Fang C (2023) Copper price prediction using LSTM recurrent neural network integrated simulated annealing algorithm. PLoS ONE 18(10): e0285631. https://doi.org/10.1371/journal.pone.0285631
Editor: Nebojsa Bacanin, Univerzitet Singidunum, SERBIA
Received: January 3, 2023; Accepted: April 26, 2023; Published: October 30, 2023
Copyright: © 2023 Chen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data are mainly derived from the website of Investing (https://investing.com/). The data are collected from the Investing website (https://cn.investing.com/commodities/copper-historical-data).
Funding: This study is supported by the National Natural Science Foundation of China (NSFC) (No. 71991482). The funder played an important role in the research design, publication decision and manuscript preparation of the study.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Copper, a critically important metal, has always been a concern of policy-makers. For example, Chile, the largest exporter of copper in the world, produced 28% (estimated 5.6 million metric tons) of the global copper production in 2019 [1]. The copper price is the economic cornerstone for Chile. According to the estimation of copper demand, the demand for copper will significantly increase by between 275% and 350% by 2050 [2]. Some countries’ economies rely on copper. The fact that the copper price increases or decreases has a profound influence on some countries’ economies. In addition, as a kind of financial subject matter, copper futures are widely traded. There are hundreds of millions of institutional and individual investors all over the world. Through copper futures contracts, speculation, arbitrage, and hedge trades are undergoing. Hence, for a long time, financial researchers have been devoted to studying how to efficiently and precisely forecast the price of copper to help them avoid risk and make profit.
Traditionally, the copper price can be forecasted through the statistical time series model. One of the basic models is the autoregressive integrated moving average (ARIMA) model. Although the ARIMA model works easily [3–5], it can only be used to capture the linear relationship with time series. In addition, time series must be stationary, which requires that the mean and variance of the time series not dramatically fluctuate. Although differencing transformation is implemented to ensure stationarity, the times of differencing transformation are unknown. All of these factors pose an obstacle to applying the ARIMA model. Considering the different standard deviations of the yield rate during different periods of time, the generalized autoregressive conditional heteroskedasticity (GARCH) model is introduced to predict the time series variance [6, 7], but the weakness of GARCH is that the model is symmetric in terms of modeling volatility. In the real market, volatility increases in the downward trend of returns with bad news, and volatility decreases in the upward trend of returns with good news [8]. GARCH cannot be used to explain this kind of effect.
As artificial intelligence is developed so fast, advanced algorithms have already begun to be applied in the price forecasting field. Usually, the overarching advantage of AI technology is that it works nonlinearly; in other words, AI can simulate any nonlinear function because the activation function differs from traditional simple perceptrons. A feed-forward neural network (FFNN), for example, is used to forecast uranium prices [9]. A back propagation (BP) neural network is also used to predict stock prices [10, 11]. However, a data point of time series is not isolated; in other words, a historical time point can contribute to the current price, which is the cornerstone of time series analysis. Researchers believe that there is autocorrelation in the stationary time series after trend removal. FFNN and BP cannot deal with this kind of data series very well. A recurrent neural network (RNN) can predict this kind of time series with good result [12–14]. This is because the hidden layer of RNN involves the information from the previous moment, which treats the time series as a whole part rather than a single point. The weakness of RNN is gradient vanishing. The information is passed moment by moment using factorial multiplication. However, after many multiplication operations, the value of the historical information weight will approach 0, which means that the current moment only depends on the nearby moments. When the range of data is years, RNN does not work effectively.
A long short-term memory (LSTM) neural network is introduced to solve the vanishing gradient problem [15]. There is a cell state line crossing the whole time series, and there are three special gates (forget gate, input gate and output gate) in the LSTM. At each single time point, unnecessary information is eliminated from the cell state, and new information is added to the cell state. Through the addition operation rather than the multiplication operation, every single moment of valid information can be conserved. Long-term memory can be withdrawn. Compared with the RNN, LSTM can provide a more accurate prediction of time series. In recent years, LSTM has become popular in the price forecasting field [16–21]. For example, predicting the close price of a stock market [22], combining LSTM and the salp swarm algorithm to improve accuracy of crude oil price forecasting [23], LSTM and multi-head attention to forecast a given time series [24], graph Long Short-Term Memory (GLSTM) neural network to predict the air quality characteristics [25].
In the field of copper price forecasting, many scholars have carried out innovative research. Such as bat algorithm (BA) [26], support vector regression (SVR) [27], adaptive neuro-fuzzy inference system (ANFIS) [28]. However, the convergence and stability of BA need to be further improved. When solving nonlinear problems, SVR is difficult to choose a suitable kernel function. ANFIS is convenient and efficient, while the lack of adaptive capability of fuzzy systems limits its application. Therefore, long short-term memory (LSTM) also drew a lot of attention [29–31].
Although LSTM works well in forecasting, one of the common problems of artificial intelligence models is tuning hyperparameters. Hyperparameters include the batch size, number of neurons, number of hidden layers and so on [32]. Hyperparameters can dramatically influence the performance of the AI model. Traditionally, grid searching and random searching are methods to address this issue. Grid searching is too time-consuming, and random searching could miss the optimal solution. Researchers have attempted to use many other methods, for example, particle swarm optimization [14, 33], beetle antennae search algorithm (BAS) [34], teaching and learning based optimization (TLBO) model is applied to tune the output unit of the LSTM [35]. A common problem in other methods is that the local optimal solution can be found easily, and the global optimal solution is most likely missed. The simulated annealing (SA) algorithm can be used to sufficiently address this problem. In other words, SA allows the acceptance of a bad set of parameters under a probability at the beginning, which means that every direction existing in the optimal solution will be searched. At the end of all iterations, the solution is probably the global best solution.
Another key point that makes an AI model successful is feature engineering. Many factors influence the price of copper. An AI model cannot consider all of these factors, resulting in the “curse of dimensionality”. However, too few features can cause insufficient machine learning. Researchers have attempted to combine principal component analysis (PCA) and AI models [36], but PCA focuses on dimension reduction; furthermore, it is a linear transformation. As long as the number of dimensions is not extremely large, PCA is not necessary. Instead, a few of the most relevant features can be selected. Hence, correlation analysis can work sufficiently well.
In the current paper, our innovation is to use simulated annealing (SA) algorithm to find the best set of hyperparameters for the LSTM model, which makes the LSTM model more efficient and accurate. Meanwhile, correlation analysis is used to solve the feature engineering problem of AI models. Then three economic indicators, WTI crude oil price, closing gold price and closing silver price, are used as the inputs of the model. In the model prediction, three different copper price time periods, 485 days, 363 days and 242 days, are chosen. The prediction error is 0.00195, 0.0019, and 0.00097, respectively.
The subsequent parts of this paper are organized as follows: Part two introduces the research methods, Part three describes the data processing process, Part four is the analysis and discussion of the prediction results, and Part five is the conclusion and future outlook of the study.
Methodologies
Long short-term memory (LSTM)
In recent years, artificial intelligence technologies have become increasingly popular. Basically, the recurrent neural network (RNN) is good at dealing with the time series problem because of the special cell structure. In detail, the information held by the previous neural cell state could contribute weight to the current cell state. This special phenomenon allows RNNs to tackle the time series problem, for example, the natural language process.
Compared with the RNN, the neural cell of LSTM can memorize more historical information because the vanishing gradient existing in the RNN is eliminated in LSTM. Technically, weight propagation is completed through addition operations in LSTM rather than factorial multiplication, which means that the weight will not approach 0 and the information can be retained.
Fig 1 shows the scheme of the LSTM cell. There are several gates in the LSTM that make it special. A gate is a function that gives a result ranging from 0 to 1. A value of 0 means that no information passes through the gate, and a value of 1 means that all information passes through the gate. The gates decide which information should be passed to the next moment and which information is forgotten. However, there are some processes that are used to connect these three gates.
Forget gate
The forget gate is represented by the red dashed rectangle in Fig 1. First, the output from the last moment, ht−1, concatenates with the new input xt to a larger vector. Through this gate, the proportion of forgotten information can be obtained. Eq 1 shows this process mathematically.
(1)
where ft is the proportion of the forgotten information, sigmoid is the activation function, Wf is the weight matrix forget gate, [ht−1, xt] is the concatenation of vectors ht−1 and xt, and bf is the bias matrix of the forget gate.
Input gate
The input gate is the represented by the blue dashed rectangle in Fig 1. The output from the last moment, ht−1, is concatenated with the new input xt to a larger vector. Through this gate, the proportion of current cell information will be retained. Eq 2 shows this process mathematically.
(2)
where it is the proportion of new information retained, sigmoid is the activation function, Wi is the weight matrix of the input gate, [ht−1, xt] is the concatenation of vectors ht−1 and xt, and bi is the bias matrix of the input gate.
Current memory
The dark dashed brown rectangle represents the current memory in Fig 1. This memory includes both unnecessary information and important information. Eq 3 mathematically shows this process.
(3)
where
is the current memory, tanh is the activation function, Wc is the weight matrix of the current memory, [ht−1, xt] is the concatenation of vectors ht−1 and xt, and bc is the bias matrix of the current memory.
Renewing the current cell state
Through the input gate and forget gate, the cell state can be renewed, as shown by the purple box in Fig 1. This cell state will retain the information to the next moment. Eq 4 also shows that information from the input gate is added to the cell state. This addition operation ensures that valid information can be memorized by the model in each moment, which is an important characteristic of the LSTM model.
(4)
where ct is the current cell state, ft is the proportion of forgotten information, ct−1 is the cell state of the last moment, it is the proportion of new information retained,
is the current memory, and tanh is the activation function.
Output gate
The output gate is represented by the yellow box in Fig 1. The amount of information that should be output at a given moment is dependent on this gate. Similarly, Eq 5 shows this process mathematically. The result is a proportion value.
(5)
where ot is the proportion value that tells the cell state what information should be outputted at this moment, sigmoid is the activation function, ot is the weight matrix of the output gate, [ht−1, xt] is the concatenation of vectors ht−1 and xt, and bo is the bias matrix of the input gate.
Output of the current moment
In the last step, the current cell state is multiplied by the output value from the output gate. The current cell state includes all of the information from the beginning moment to the current moment. Hence, the output of this moment considers all of the given conditions. Eq 6 shows this process mathematically.
(6)
where ht is the output of the current moment, tanh is the activation function, and ct is the current cell state.
Through the above description of the scheme of the internal LSTM neural cell, it is clear that the LSTM model can predict time series considering historical information. Moreover, the behavior of LSTM is much better than that of the RNN.
Simulated annealing algorithm
Although many artificial intelligence models have been successfully used in the economy field, one of the quintessential problems of these models is hyperparameter searching. In the present paper, the simulated annealing (SA) algorithm is applied to search the best set of hyperparameters. Compared with other optimization algorithms, one of the most atypical advantages of SA is that the global optimal solution can be found easily, which means that the LSTM model can run more efficiently.
Fig 2 demonstrates the flow of the simulated annealing algorithm, the steps are as follows:
Step 1.
Set the initial parameters of the artificial intelligence model and SA model. The deadline conditions (final temperature T and iteration number i) of the SA model are also set.
Step 2.
According to the initial parameters, the new set of parameters can be calculated by using Eq 7.
(7)
where xtemporary is the temporary new parameter, xold is the parameter of the last step, T is the current temperature, and u is a random number ranging from -1 to 1. Before the first step, xold and Tcurrent are the initial values.
Because the hyperparameters are in a range, the value of xtemporary must be processed to be in an acceptable range. The sigmoid function is monotonically increasing and ranges from 0 to 1. The sigmoid function is called to obtain the new hyperparameter (Eq 8).
(8)
where xnewed is the processed Xtemporary, and a is the artificial setting range.
Step 3.
Both xnew and xold are substituted into the model, and then two error values for each set of hyperparameters are calculated by the error function. The difference between these two error values can be represented by the symbol ΔE in Eq 9.
There are two different situations of ΔE (metropolis rule).
- ΔE<0
This situation illustrates that the error calculated from the model result run by xnew is smaller than the error from the last step. In other words, the hyperparameter of the model is improved. Hence, the new set of hyperparameters is kept replacing the old hyperparameters. - ΔE>0
This situation illustrates that the error calculated from the model result run by xnew is larger than the error from the last step. However, to avoid the local optimal value being accepted, the model needs to be run based on this poor set of hyperparameters. The possibility that the model accepts the poor hyperparameters can be calculated by Eq 10
(10)
where possibility is the possibility that a poor set of hyperparameters could be accepted, ΔE is the error difference between the new model and the old model, and Tcurrent is the current temperature.
It can be seen that the poor set of hyperparameters can be easily accepted at high temperature, which means that the global optimal parameters are more likely to be found.
Data preprocessing
Data acquisition
For the prediction of the copper price, four groups (eight different economic indicators) are chosen and shown in Table 1. Through the statistical analysis, the high correlation indicators with copper price are chosen and used in the LSTM forecasting model.
All data are collected from the Investing website (https://cn.investing.com/commodities/copper-historical-data), and closed price/yield rates are chosen. The period chosen is from January 1st, 1990, to December 31st, 2009. If one of these eight indicators’ data is missing in the single trade day, all of the data in that day will be deleted. In total, 4870 pieces of data are collected and used in the data analysis.
Table 2 shows descriptive statistics of the features. The mean closed copper price is 1.38 dollars with a standard deviation of 0.87.
Feature selection
Fig 3 shows the historical copper price. The mean copper price is $1.38, and the standard deviation is $0.866. Through the KS test, the statistic value is 0.2367, and the p value is 1.6e-237. Hence, it can be seen that the distribution of the copper price (Fig 4) is not a normal distribution, which means that Pearson correlation cannot be used to find the correlation coefficient between the copper price and other economic indicators.
Because of the nonnormal distribution of copper prices, the correlation coefficient is calculated through the Spearman method. As Fig 5 shows, the Spearman correlation coefficients between the copper price and the WTI crude oil price, closed gold price, and closed silver price are 0.89, 0.84, and 0.91, respectively, which are higher than 0.8. These three economic indices have a strong positive correlation with the copper price. Hence, these three economic indices are chosen as the input of the model.
Data normalization
To increase the speed of the model running and the model’s accuracy, the chosen data are normalized. All numbers are narrowed down to between 0 and 1 using Eq 12
(12)
As Fig 6 shows, these four sets of normalized data have similar trends. These data are split into a training set, validation set and test set.
As Fig 7 shows, there are only few data points that outside of the 95% confidence interval; however, it is believed that these points represent sudden changes in price and are retained in the dataset.
Discussion
Simulated annealing algorithm performance
There are hundreds of hyperparameter scenarios for the LSTM model. Finding the most efficient method is challenging. In the current paper, the simulated annealing algorithm is used to perform this task. Before the hyperparameter searching task, the basic and fundamental parameters, which are not changed in the course of the whole prediction process, of the LSTM model are mentioned below. The batch size of the model is 64 and the number of epochs is 100. A dropout layer is introduced into the model in the case of overfitting. The dropout layer is located after the first layer of the dense layer, and the dropout rate is set as 0.2.
First, the proposed model is given through the simulated annealing algorithm. The initial temperature, final temperature, and rate of cooling are 1, 0.1, and 0.95, respectively. In addition to these parameters, each simulation process is implemented using 50 iterations. Initially, the learning rate, number of neurons in the first LSTM layer, and number of neurons in the second LSTM layer are set to 0.0005, 15, and 30, respectively. Specifically, the range of the learning rate is set between 0 and 0.005 to ensure that the simulated annealing algorithm can converge. If the learning rate exceeds these bounds, it is set to the default value, 0.00005. The number of neurons must be an integer. In total, there are 650 scenarios in this whole hyperparameter searching job, as shown in Fig 8.
It is clear that the losses of the LSTM model are more discrete before the 400th (before the beige line) simulated annealing step, especially between 120 and 300 steps (black rectangle box). After 400 steps, the loss values of LSTM become stable and plateau at approximately 0.01. The above processing can be seen as a typical simulated annealing process. In other words, the searching algorithm was testing some ‘poor’ scenarios in case the global optimum solution was missed. When cooling down, the model becomes more conservative, the risky options are rejected, and the whole model begins to converge.
Additionally, Table 3 and Figs 9–11 shows the comparison of the different loss graphs of LSTM at different steps of the simulated annealing model. It can be seen that the worst scenario occurs at 98 steps, for which LSTM loss is 9.37%. Compared with other scenarios, an incorrect numbers of neurons causes this problem. Fortunately, SA finds more and better solutions. For example, at 202 steps, even though the loss is 5.38%, the loss graph looks smoother and more sensible. There is still a small amount of underfitting because the loss from the test is smaller than that from the training set. Until 222 steps, the graph looks good, and the machine learning process is close to perfect.
In the whole hyperparameter searching process, it must be noted that the initial learning rate does not change during the whole process. In other words, the learning rate is always set to the default value (0.00005). An attempt was made to eliminate the bound at the very beginning of the code, but SA crashed, and the loss value of LSTM increased to an unacceptable amount. In fact, many learning rates from outside of the set range were attempted to be used by SA, which is why the learning rate was set to the default value.
LSTM prediction
At the end of the hyperparameter searching task, the simulated annealing algorithm provides the best scenario. The learning rate is 0.00005, the number of neurons in the first LSTM layer is 39, and the number of neurons in the second LSTM layer is 111. According to this scenario, the loss of the model is reduced to 0.000569. In addition, based on the loss graph, the training of the model is appropriate. There is no underfitting or overfitting. Hence, this scenario of the LSTM hyperparameters can be applied in real prediction work.
As discussed above, the basic structure parameters of the LSTM model are shown in Table 4 and Fig 12 below.
In the prediction process, there are three different time periods of predicted copper prices. In other words, the training data are split by ratios of 0.9, 0.925, and 0.95. The period of the copper price prediction is 485 days, 363 days and 242 days, respectively.
Fig 13 shows the prediction result of the 485-day copper price. Based on this graph, two observations can be made. First, the real price fluctuates more than the predicted price. This could be normal because of noise in the real price data. If a denoising technique is used on these original data, the prediction result could be smoother. Second, the prediction model can provide a relatively good prediction of the copper price. Although a precise price could not be captured by the model, the total trend of the copper price is well represented. The mean square error of the prediction result is 1.95%.
Fig 14 shows that the copper price trend can be predicted accurately for a shorter time period. The mean square error for these two predictions is 0.0019 and 0.00097. As the prediction time period becomes shorter, the prediction accuracy increases. This is reasonable because there are more data remaining to help the model learn. In other words, more information can be learned by the model when less data enter the test set. However, there is a critical number for this process, and the model is used to predict the 50-day copper price. The error increases again, which means that the model is probably overfitting the data. The dropout rate of the dense layer or any other technique used for overfitting should be applied.
Prediction result of 363-day (left) and 242-day copper price (right).
Conclusion
Copper, a critically important metal in the modern world, is widely used in industry. As more researchers learn about copper, it is important to pay attention to its financial features. One of the attractive points of copper is the copper price. Unlike past prediction methods, there are now more state-of-the-art techniques used in price prediction. One of these techniques is artificial intelligence. As artificial intelligence has become more common to use, researchers have focused on how to make it work sufficiently.
In a recent paper, three tasks were completed. First, through correlation methods, it was found that the WTI oil price, gold price and silver price have a great influence on the copper price. Hence, these three kinds of data were collected to predict the copper price.
The simulated annealing model is used to find hyperparameters of LSTM. The learning rate of LSTM is 5e-5, the number of neurons in the first LSTM layer is 39, and the number of neurons in the second LSTM layer is 111. In the training phase, the data are split by a ratio of 0.9. The proposed model shows the best behavior, and the error is 0.00569.
In the model prediction, three different copper price time periods, 485 days, 363 days and 242 days, are chosen. The prediction error is 0.00195, 0.0019, and 0.00097, respectively. The more data that are used to train the model, the lower the error. In a recent paper, the model yielded a prediction with an error below 2%.
Although the SA-LSTM model has improved the prediction effect of the LSTM model on copper prices, we believe that the model can be further improved. First, due to the presence of noise in the time series, these noises may have a negative impact on price forecasting. We plan to use signal processing techniques, such as wavelet noise reduction, empirical mode decomposition and other methods, to denoise the time series in the future to highlight the characteristics of price trends. In addition, in view of the problem that the LSTM model does not capture the periodicity and seasonal features of the time series model, we plan to combine the LSTM model with the SARIMA model to try to highlight the seasonality of the time series and facilitate the learning of the LSTM model.
References
- 1. Alves B. (2020). Chile: copper mine production 2009–2019.
- 2. Elshkaki A, Graedel TE, Ciacci L, Reck BK. Copper demand, supply, and associated energy use to 2050. Global Environmental Change. 2016;39: 305–315.
- 3. Kriechbaumer T, Angus A, Parsons D, Rivas CM. An improved wavelet–ARIMA approach for forecasting metal prices. Resources Policy. 2014;39(C): 32–41.
- 4. Ohyver M, Pudjihastuti H. Arima Model for Forecasting the Price of Medium Quality Rice to Anticipate Price Fluctuations. Procedia Computer Science, 2018;135: 707–711.
- 5. Riesgo GM, Alicja K, del Campo M Á M, García-Miranda CE, Lasheras FS. Rare earth elements price forecasting by means of transgenic time series developed with ARIMA models. Resources Policy. 2018;59: 95–102.
- 6. Lin L, Jiang Y, Xiao H, Zhou Z. Crude oil price forecasting based on a novel hybrid long memory GARCH-M and wavelet analysis model. Physica A: Statistical Mechanics and its Applications. 2020;543: 123532.
- 7. Lu W, Feng M, Jing L, Lin Y. Forecasting stock price volatility New evidence from the GARCH MIDAS model. International Journal of Forecasting. 2020;36(2): 684–694.
- 8. Dritsaki C. An empirical evaluation in GARCH volatility modeling: Evidence from the Stockholm stock exchange. Journal of Mathematical Finance. 2017;7(2): 366–390.
- 9. Moshkbar-Bakhshayesh K. Performance study of bayesian regularization based multilayer feed-forward neural network for estimation of the uranium price in comparison with the different supervised learning algorithms. Progress in Nuclear Energy. 2020;127: 103439.
- 10. Sun W, Huang C. A carbon price prediction model based on secondary decomposition algorithm and optimized back propagation neural network. Journal of Cleaner Production. 2020;243: 118671.
- 11. Yu Z, Qin L, Chen Y, Parmar MD. Stock price forecasting based on LLE-BP neural network model. Physica A: Statistical Mechanics and its Applications. 2020;553: 124197.
- 12. Berradi Z, Mohamed Lazaar. Integration of Principal Component Analysis and Recurrent Neural Network to Forecast the Stock Price of Casablanca Stock Exchange. Procedia Computer Science. 2019;148: 55–61.
- 13. Jianwei E, Ye J, He L, Jin H. Energy price prediction based on independent component analysis and gated recurrent unit neural network. Energy. 2019;189: 116278.
- 14. Wang X, Wen J, Zhang Y, Wang Y. Real estate price forecasting based on SVM optimized by PSO. Optik. 2014;125(3): 1439–1443.
- 15. Sen J, Mehtab S. Long‐and‐Short‐Term Memory (LSTM) Networks Architectures and Applications in Stock Price Prediction. Emerging Computing Paradigms: Principles, Advances and Applications. 2022; 143–160.
- 16. Hu Y, Ni J, Wen L. hybrid deep learning approach by integrating LSTM-ANN networks with GARCH model for copper price volatility prediction. Physica A: Statistical Mechanics and its Applications. 2020;557: 124907.
- 17. Chang Z, Zhang Y, Chen W. Electricity price prediction based on hybrid model of adam optimized LSTM neural network and wavelet transform. Energy. 2019;187: 115804.
- 18. Ji L, Zou Y, He K, Zhu B. Carbon futures price forecasting based with ARIMA-CNN-LSTM model. Procedia Computer Science. 2019;162: 33–38.
- 19. Kim HY, Won CH. Forecasting the volatility of stock price index: A hybrid model integrating LSTM with multiple GARCH-type models. Expert Systems with Applications. 2018;103: 25–37.
- 20. Peng L, Liu S, Liu R, Wang L. Effective long short-term memory with differential evolution algorithm for electricity price prediction. Energy. 2018;162: 1301–1314.
- 21. Lu Q, Sun S, Duan H, Wang S. Analysis and forecasting of crude oil price based on the variable selection-LSTM integrated model. Energy Inform. 2021; 4: 1–20.
- 22. Stoean C, Paja W, Stoean R, Sandita A. Deep architectures for long-term stock price prediction with a heuristic-based strategy for trading simulations. PLoS ONE. 2019; 14(10): e0223593. pmid:31600306
- 23. Jovanovic L, Jovanovic D, Bacanin N, Jovancai Stakic A, Antonijevic M, Magd H, et al. Multi-Step Crude Oil Price Prediction Based on LSTM Approach Tuned by Salp Swarm Algorithm with Disputation Operator. Sustainability, 2022; 14(21): 14616.
- 24. Abbasimehr H, Paki R. Improving time series forecasting using LSTM and attention models. Journal of Ambient Intelligence and Humanized Computing. 2022;13: 673–691.
- 25. Bacanin N, Sarac M, Budimirovic N, Zivkovic M, AlZubi AA, Bashir AK. Smart wireless health care system using graph LSTM pollution prediction and dragonfly node localization. Sustainable Computing: Informatics and Systems. 2022;35: 100711.
- 26. Dehghani Hesam, Bogdanovic Dejan. Copper price estimation using bat algorithm. Resources Policy. 2018; 55: 55–61.
- 27. Astudillo G, Carrasco R, Fernández-Campusano C, Chacón M. Copper Price Prediction Using Support Vector Regression Technique. Applied Sciences. 2020; 10(19):6648.
- 28. Khoshalan HA, Shakeri J, Najmoddini I, Asadizadeh M. Forecasting copper price by application of robust artificial intelligence techniques. Resources Policy. 2021; 73: 102239.
- 29. Liu K, Cheng J, Yi J. Copper price forecasted by hybrid neural network with Bayesian Optimization and wavelet transform. Resources Policy. 2022;75: 102520.
- 30. Luo H, Wang D, Cheng J, Wu Q. Multi-step-ahead copper price forecasting using a two-phase architecture based on an improved LSTM with novel input strategy and error correction. Resources Policy. 2022;79: 102962.
- 31. Ni J, Xu Y, Li Z, Zhao J. Copper price movement prediction using recurrent neural networks and ensemble averaging. Soft Computing. 2022;26(17): 8145–8161.
- 32. Predić B, Vukić U, Saračević M, Karabašević D, Stanujkić D. The possibility of combining and implementing deep neural network compression methods. Axioms. 2022;11(5): 229.
- 33. Pousinho HMI, Mendes VMF, Catalão JPS. Short-term electricity prices forecasting in a competitive market by a hybrid PSO–ANFIS approach. International Journal of Electrical Power & Energy Systems. 2012;39(1): 29–35.
- 34. Zivkovic M, Bacanin N, Venkatachalam K, et al. COVID-19 cases prediction by using hybrid machine learning and beetle antennae search approach. Sustainable cities and society. 2021; 66: 102669. pmid:33520607
- 35. Swathi T, Kasiviswanath N, Rao A A. An optimal deep learning-based LSTM for stock price prediction using twitter sentiment analysis. Applied Intelligence. 2022; 52(12): 13675–13688.
- 36. Zhang Y, Chen B, Pan G, Zhao Y. A novel hybrid model based on VMD-WT and PCA-BP-RBF neural network for short-term wind speed forecasting. Energy Conversion and Management. 2019;195: 180–197.
- 37. Zhou AH, Zhu LP, Hu B, Deng S, Song Y, Qiu H, et al. Traveling-salesman-problem algorithm based on simulated annealing and gene-expression programming. Information. 2018;10(1), 7.