
Forecasting crude oil futures price with energy uncertainty: Evidence from machine learning methods

Abstract

Energy related uncertainty has significant influence on crude oil market. To explore the influence, this paper investigates the predictive ability of the Energy-Related Uncertainty Index (EUI), over and above standard macroeconomic predictors, in forecasting crude oil prices using an array of machine learning methods. We find that EUI has a significant impact on crude oil prices. Moreover, machine learning methods combined with EUI performed better than the linear regression method due to a lower rate of prediction errors. Among these methods, the Random Forest (RF) model with EUI performs better in the short term, while the Attention-enhanced Long Short-Term Memory (Attention-LSTM) model with EUI has more substantial predictive power in the long term. These empirical results pass a series of robustness tests. Our findings have important implications for both regulators and investors in the crude oil market.

Introduction

As one of the most important commodities in the world, crude oil is the foundation of modern industry and the global economy. Over the last few decades, researchers have found that crude oil prices affect many goods and services that have a direct impact on the economy [1,2]. However, accurately predicting crude oil prices remains a significant challenge, both theoretically and practically.

Since the introduction of economic policy uncertainty (EPU) by Baker et al. [3], the impact of uncertainty on the crude oil market and its application to forecasting crude oil prices have gained considerable attention from scholars and policymakers. To date, a large number of studies have investigated the linkage between uncertainty and the crude oil market using uncertainty-related indices. For example, Li et al. [4] use multiple uncertainty indicators to forecast crude oil volatility and find that the U.S. petroleum market equity market volatility tracker index (PMEMV) performs better in forecasting short-term crude oil volatility, while the geopolitical risk index (GPR) has better predictive power over longer horizons. Dai et al. [5] investigate whether global economic policy uncertainty (GEPU) and the change of GEPU (ΔGEPU) have different impacts on crude oil futures volatility under a single-factor model and a two-factor model. The findings show that the one-factor model with GEPU or ΔGEPU is consistently effective in predicting the volatility of crude oil futures, while the two-factor model with ΔGEPU has a much stronger forecasting ability than that with GEPU. Nonejad [6] evaluates whether the geopolitical risk (GPR) index can improve the prediction accuracy of crude oil price volatility and which forms of nonlinearity provide the largest forecast accuracy gains, finding that GPR increases above recent maximums offer the highest accuracy gains, while GPR decreases below recent minimums do not deliver such gains.

While previous studies show that uncertainty can not only affect crude oil markets but also predict crude oil prices, few studies have examined the predictive power of energy-related uncertainty on crude oil prices. Moreover, existing studies mainly explore the predictive ability of uncertainty on crude oil prices through traditional methods such as the linear regression model (LR) [7,8], the autoregressive model (AR) [9,10], and the generalized autoregressive conditional heteroskedasticity model (GARCH) [11–13]. These methods depend heavily on the quality of the data, which needs to be stationary [14,15]. Furthermore, they usually ignore the possible non-linear relationship between uncertainty and crude oil prices, which carries the risk of underfitting [16,17]. To address these problems, this study examines the predictive ability of the energy-related uncertainty index (EUI) on crude oil prices using machine learning methods. Compared to traditional methods, machine learning methods are effective not only in easing data pre-processing requirements but also in capturing the non-linear relationship between uncertainty and crude oil prices.

The main purpose of this study is to answer two vital questions: 1. Will the inclusion of EUI improve the accuracy of crude oil price prediction? 2. Which model will most effectively predict crude oil prices? Following this logic, we first collect data on the EUI and other factors that may influence crude oil prices. Then, we employ the Least Absolute Shrinkage and Selection Operator (LASSO) regression test to select other variables with significant predictive power. Finally, we use six types of models, including one traditional model (Linear Regression model (LR)), one machine learning method (support vector regression (SVR)), two deep learning methods (Long Short-Term Memory (LSTM), Attention-enhanced Long Short-Term Memory (Attention-LSTM)), and two tree-based techniques (Random Forest (RF) and Extreme Gradient Boosting (XGBoost)), to predict crude oil prices. The LR model is employed as a simple, interpretable baseline. The selection of the other five models is motivated by their ability to capture the diverse and complex characteristics of crude oil price movements. Specifically, LSTM and Attention-LSTM are adept at learning from sequential data, while SVR, RF, and XGBoost have strong capability in capturing non-linear patterns and complex interactions among features. The results indicate that the RF model that incorporates EUI excels in short-term prediction by robustly handling the noisy, non-linear feature interactions that dominate immediate price movements. Conversely, the Attention-LSTM model combined with EUI dominates long-term forecasts, as its sequential learning and attention mechanism are better suited to distill the underlying long-range dependencies that govern crude oil price trends.

This study makes three primary contributions to the energy finance literature. First, we are among the first to integrate LASSO feature selection with the Energy Uncertainty Index (EUI) for crude oil forecasting, thereby providing a novel and tailored framework to assess uncertainty’s impact on the oil market. Second, we identify the most relevant predictors from a broad set of variables, which not only enhances forecasting precision but also improves model interpretability and guards against overfitting. Finally, we deploy a diverse suite of machine learning algorithms to separately forecast short-term and long-term crude oil prices, enabling a comprehensive comparison of their predictive performance across different time horizons. From a practical standpoint, our findings offer actionable insights for investors and policy regulators in managing oil price risks and formulating data-driven strategies.

The remainder of this paper is organized as follows. Section 2 describes the data used in this study. Section 3 describes the methodologies utilized. Section 4 shows the empirical results. Section 5 presents various robustness tests. Section 6 provides our conclusions.

Data

Crude oil futures prices

We focus on Brent oil futures, a key global benchmark, as it represents the international pricing standard for crude oil. The monthly closing price for Brent is collected from the U.S. Energy Information Administration (EIA). The sample period spans from January 1996 to October 2022, encompassing major events such as the Iraq War (2003), the global financial crisis (2008–2009), the international oil price crash (2014), and the COVID-19 pandemic (2020). As illustrated in Fig 1, these events induced substantial price volatility. This underscores the practical need for market participants and policymakers to forecast future price levels for investment and planning, rather than relying solely on period-to-period returns.

Consequently, our study predicts crude oil price levels directly. Although price levels may exhibit non-stationarity, the machine learning methods we employ are adept at learning from such data without relying on the strict stationarity assumptions required by traditional econometric models. Their ability to capture complex, persistent patterns is a key advantage in this context. Additionally, we predict returns, as detailed in the Robustness Tests section to address any concerns regarding potential non-stationarity.

EUI and other predictors

In this paper, we use two kinds of global EUI introduced by Dang et al. [18] to investigate the impact of energy uncertainty on crude oil prices: the equal-weighted EUI (EUI_equally) and the GDP-weighted EUI (EUI_GDP). The EUI data are available from the official website of EPU (http://www.policyuncertainty.com/energy_uncertainty.html), and the sample range is likewise January 1996 to October 2022.

Furthermore, many other factors affect the price of crude oil futures. Therefore, we must consider extensive aspects in our prediction model. Based on previous studies, we also include 24 potential factors in five categories as predictors, as shown in Table 1. Concerning the bond market, we select five benchmark interest rates—the federal funds rate, treasury bill rate, London Interbank Offered Rate, default yield spread, and government long-term bond yield. The selection of these specific benchmarks is justified by their documented empirical significance in previous research as key drivers of crude oil prices [19–22]. For the stock market, we choose three factors: the S&P500 index, the Dow Jones Industrial Average, and stock variance. This aligns with Xu et al. [19], and Welch and Mensi [23]. For the futures market, we choose the S&P GSCI Non-Energy index and the CRB Rind index, following Xu et al. [19]. Concerning the economic state, we choose nine factors that are commonly used to predict crude oil prices: unemployment rate, capacity utilization, the Chicago Fed’s national activity index, inflation rate, money supply, industrial production index, US dollar index, MSCI World Index, and ISM Manufacturing Index [22]. Additionally, Zhang et al. [21], and Dai and Kang [22] have capitalized on crude oil production, crude oil imports, and crude oil stocks in predicting crude oil prices. We thus select five factors to represent the impact of the related crude oil market: growth of global crude oil production, growth of US crude oil production, growth of US crude oil stock, growth of US crude oil imports, and growth of global crude oil imports.

Table 1. Potential factors affecting crude oil futures price.

https://doi.org/10.1371/journal.pone.0341496.t001

Finally, the collected data is normalized and divided chronologically into three parts to prevent look-ahead bias. The first part is the training set (60% of the sample), which is used for preprocessing, feature selection, and model training. The second part is the validation set (20% of the sample), which is used to tune hyperparameters through a grid search. The third part is the testing set (20% of the sample), which is used to evaluate the performance of the prediction model. The complete dataset supporting the findings of this study is available in S1 File.
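The chronological split and train-set-only normalization described above can be sketched as follows. The function name and the min–max scaling choice are illustrative assumptions; the paper does not specify its normalization scheme.

```python
import numpy as np

def chrono_split_scale(X, train_frac=0.6, val_frac=0.2):
    """Split a feature matrix chronologically into train/validation/test
    and min-max normalize with statistics from the training set only,
    so no future information leaks backward in time."""
    n = len(X)
    i_train = int(n * train_frac)
    i_val = int(n * (train_frac + val_frac))
    train, val, test = X[:i_train], X[i_train:i_val], X[i_val:]
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = np.where(hi - lo == 0, 1.0, hi - lo)  # guard constant columns
    norm = lambda a: (a - lo) / scale
    return norm(train), norm(val), norm(test)

X = np.arange(100, dtype=float).reshape(-1, 1)  # e.g., 100 monthly observations
tr, va, te = chrono_split_scale(X)
```

Because the scaler is fitted on the training block alone, the validation and test blocks may legitimately fall outside the [0, 1] range.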

Methodology

Feature selection methods

Feature selection plays a crucial role in machine learning: selecting the most informative features from raw data improves model performance, reduces overfitting, and speeds up model training and prediction. It is especially important for large, high-dimensional datasets, as unnecessary features increase computational complexity and introduce redundant information. Overfitting may also occur when employing a prediction model, especially when some variables are strongly correlated. Therefore, we adopt LASSO regression [24] to filter the variables. LASSO is a linear regression method that applies L1-regularization, which drives the weights of some learned features to exactly zero, achieving sparsity and feature selection.

The LASSO estimate is defined by the equation shown below:

\hat{\beta} = \arg\min_{\beta_0, \beta} \left\{ \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}   (1)

where N is the total number of observations, \lambda represents a nonnegative regularization parameter corresponding to one value of Lambda, y_i is the dependent variable, p is the number of independent variables x_{ij}, \beta_0 symbolizes the intercept, and \beta_j stands for the other parameters.

The loss function expression of LASSO is:

L(\beta) = \| y - X\beta \|_2^2 + \lambda \|\beta\|_1   (2)

where \lambda is the constant coefficient that needs to be tuned, and \|\beta\|_1 is the L1 norm.

LASSO combines variable selection and regularization while fitting a generalized linear regression model, so it can model and predict continuous, binary, or multi-class discrete dependent variables. LASSO can handle multicollinearity and performs automatic feature selection on high-dimensional data sets. It improves the generalization ability of the model and guards against overfitting. Moreover, the sparsity of its coefficients makes the resulting model more interpretable.
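The mechanism by which the L1 penalty produces exact zeros can be seen in the orthonormal-design special case, where the LASSO solution reduces to soft-thresholding of the OLS coefficients. The coefficient values below are hypothetical and purely illustrative.

```python
import numpy as np

def soft_threshold(beta_ols, lam):
    """LASSO solution under an orthonormal design: shrink each OLS
    coefficient toward zero by lam and set small ones exactly to zero --
    the mechanism behind LASSO's automatic feature selection."""
    return np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)

# Hypothetical OLS coefficients for five predictors
beta = np.array([2.5, -0.3, 0.05, -1.8, 0.2])
beta_lasso = soft_threshold(beta, lam=0.5)
selected = np.nonzero(beta_lasso)[0]  # indices of predictors that survive
```

With lam=0.5, the three small coefficients are zeroed out and only the first and fourth predictors are retained, which is exactly the sparsity property exploited for feature selection.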

Predictive models

Long short-term memory model (LSTM).

Recurrent neural networks (RNNs) are powerful and well suited for processing sequences of inputs. These networks have a certain persistence, which enables information to be passed from one time step to the next. However, their weakness is that they struggle to capture long-term historical information because they cannot store data for an extended period. Moreover, when trained by backpropagation through long sequences, RNNs suffer from exploding or vanishing gradients.

To overcome these shortcomings, the long short-term memory model was proposed by Hochreiter and Schmidhuber [25]. It has gated units that adaptively regulate the information flow inside the unit. Unlike earlier neural networks such as ANNs, LSTM can master demanding learning tasks that require long-term memory of events.

The specific process is clarified in the following equations:

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)   (3)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)   (4)
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)   (5)
h_t = o_t \odot \tanh(c_t)   (6)

where f_t, i_t, and o_t represent the forget gate, input gate, and output gate computed with the sigmoid function \sigma, respectively. At time t, x_t represents the input vector and h_t signifies the hidden state vector, which is also the output vector of the LSTM unit; the elementwise product is signified as \odot, and c_t is the internal cell state. W_f, W_i, W_o and b_f, b_i, b_o are weight matrices and bias parameters that must be learned during training, a process in which an LSTM neuron receives the hidden state h_{t-1} of the previous step and the current input x_t. After passing through the gate units, the neural network obtains the output and passes the hidden state on to the next unit; together, the gates control how much information is discarded or retained.
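As a concrete illustration, one LSTM forward step can be written in a few lines of NumPy. The layer sizes and the stacked-weight layout below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM forward step implementing the gate equations.
    W maps the concatenated [h_prev, x_t] to four stacked gate
    pre-activations: forget, input, output, candidate."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = len(h_prev)
    f = sigmoid(z[0:H])          # forget gate
    i = sigmoid(z[H:2*H])        # input gate
    o = sigmoid(z[2*H:3*H])      # output gate
    g = np.tanh(z[3*H:4*H])      # candidate cell state
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state / output
    return h, c

rng = np.random.default_rng(0)
H, D = 4, 3                      # toy hidden and input sizes
W = rng.standard_normal((4 * H, H + D)) * 0.1
b = np.zeros(4 * H)
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, b)
```

Because the output is o ⊙ tanh(c), every component of the hidden state is bounded in magnitude by 1, which is part of what keeps gradients stable across time steps.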

In this paper, the LSTM model is implemented and tuned with the following hyperparameters: the number of LSTM layers: [1, 2], the number of hidden units per layer: [32, 64, 128], the dropout rate: [0.1, 0.2, 0.3], the optimizer: [‘Adam’, ‘RMSprop’], the learning rate: [0.001, 0.005], and the batch size: [16, 32, 64]. We employ early stopping with a patience of 15 epochs to determine the training epochs. The final model architecture that yields the best performance consists of two stacked LSTM layers with 64 and 32 hidden units respectively, followed by a fully connected output layer. We incorporate dropout layers with a rate of 0.2 after each LSTM layer. The model is trained using the Adam optimizer with a learning rate of 0.001 and a batch size of 32. The typical number of training epochs ranges between 80 and 120.

Attention-enhanced long short-term memory (Attention-LSTM).

Attention-LSTM model is a hybrid neural architecture that merges the sequential modeling capabilities of LSTMs with adaptive attention mechanisms to overcome limitations in processing long-term dependencies. Originating from the attention framework proposed by Bahdanau et al. [26] for neural machine translation, Attention-LSTM computes attention scores to dynamically weight hidden states, addressing the “information bottleneck” in standard LSTMs. In vanilla LSTM, the hidden state at time t evolves through gating mechanisms:

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)   (7)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)   (8)
\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)   (9)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t   (10)
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)   (11)
h_t = o_t \odot \tanh(c_t)   (12)

where \sigma denotes the sigmoid function and \odot is element-wise multiplication. However, this structure struggles to prioritize critical time steps in long sequences. Attention-LSTM addresses this by introducing a context vector u_t, calculated as a weighted sum of encoder hidden states h_j:

u_t = \sum_{j=1}^{T} \alpha_{tj} h_j   (13)

where the attention weights \alpha_{tj} are derived from a softmax-normalized alignment score e_{tj}:

\alpha_{tj} = \frac{\exp(e_{tj})}{\sum_{k=1}^{T} \exp(e_{tk})}   (14)

The alignment score function e_{tj} = score(h_{t-1}, h_j), often implemented in additive form (v_a^\top \tanh(W_a h_{t-1} + U_a h_j)) or multiplicative form (h_{t-1}^\top W_a h_j), enables the model to focus on relevant input regions dynamically.
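A minimal NumPy sketch of the additive (Bahdanau-style) attention computation follows; all dimensions and weight names are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

def additive_attention(h_dec, H_enc, W, U, v):
    """Additive attention: score each encoder state against the decoder
    state, softmax-normalize the scores into weights alpha, and return
    the context vector as the weighted sum of encoder states."""
    scores = np.tanh(H_enc @ U.T + h_dec @ W.T) @ v  # alignment scores e_tj
    alpha = np.exp(scores - scores.max())            # stable softmax
    alpha = alpha / alpha.sum()                      # attention weights
    context = alpha @ H_enc                          # weighted sum of states
    return alpha, context

rng = np.random.default_rng(1)
T, H = 5, 4                          # toy sequence length and hidden size
H_enc = rng.standard_normal((T, H))  # encoder hidden states h_1..h_T
h_dec = rng.standard_normal(H)       # current decoder state
W = rng.standard_normal((H, H))
U = rng.standard_normal((H, H))
v = rng.standard_normal(H)
alpha, context = additive_attention(h_dec, H_enc, W, U, v)
```

The weights alpha form a probability distribution over the T encoder steps, which is also what makes attention interpretable as a measure of which past periods the model deems relevant.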

By integrating LSTM’s memory retention with attention’s adaptive focus, the hybrid architecture achieves superior performance in tasks requiring fine-grained temporal reasoning, such as outlier detection [27] or financial forecasting [2830]. The attention weights also provide interpretable insights into feature importance, bridging the gap between model complexity and transparency. This synergy positions Attention-LSTM as a versatile solution for modern sequence modeling challenges.

In this paper, the Attention-LSTM implementation is configured and tuned with the following hyperparameters: the number of LSTM layers in the encoder: [1, 2], the number of encoder hidden units: [50, 100, 150], the type of attention mechanism: [‘additive’, ‘multiplicative’], the dropout rate: [0.2, 0.3, 0.4], the optimizer: [‘Adam’], the learning rate: [0.0005, 0.001], and the batch size: [16, 32]. We employ early stopping with a patience of 20 epochs. The final model configuration that achieves the best results uses a single LSTM layer with 100 hidden units as the encoder, an additive attention mechanism (Bahdanau), and a dropout rate of 0.3 applied to the LSTM outputs. Training is conducted using the Adam optimizer with a learning rate of 0.0005 and a batch size of 16. The typical training epoch range is 100–150.

Support vector regression (SVR).

SVR, an extension of Support Vector Machines (SVMs) proposed by Vapnik et al. [31] in the 1990s, is a supervised learning algorithm designed for regression tasks. Its primary objective is to predict continuous outcomes by identifying a hyperplane that maximizes the margin around predicted values while tolerating small deviations controlled by a parameter \epsilon. Unlike traditional regression methods that minimize mean squared error, SVR focuses on minimizing the \epsilon-insensitive loss function, which disregards errors within a predefined tolerance band. This approach enhances robustness to outliers and noisy data, making SVR particularly effective in high-dimensional or non-linear problems, such as financial forecasting and industrial process modeling [32,33].

The core principle of SVR revolves around mapping input data into a higher-dimensional feature space via a kernel function (e.g., the radial basis function K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)), where a linear regression model is constructed. The optimization objective is formulated as:

\min_{w, b, \xi, \xi^*} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)   (15)
\text{s.t.} \quad y_i - w^\top \phi(x_i) - b \le \epsilon + \xi_i   (16)
w^\top \phi(x_i) + b - y_i \le \epsilon + \xi_i^*   (17)
\xi_i, \xi_i^* \ge 0   (18)

where w is the weight vector, C controls the trade-off between margin width and error tolerance, and \xi_i, \xi_i^* are slack variables. Predictions are made using f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(x_i, x) + b, where \alpha_i, \alpha_i^* are Lagrange multipliers. This dual formulation allows SVR to handle non-linearity efficiently without explicitly computing high-dimensional transformations.
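The dual-form prediction can be illustrated in a few lines of NumPy. The support vectors, multiplier differences, and bias below are hypothetical values for demonstration, not fitted ones.

```python
import numpy as np

def rbf_kernel(X, x, gamma):
    """K(x_i, x) = exp(-gamma * ||x_i - x||^2) for each row x_i of X."""
    return np.exp(-gamma * np.sum((X - x) ** 2, axis=1))

def svr_predict(x, X_sv, dual_coef, b, gamma):
    """Dual-form SVR prediction: f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.
    dual_coef holds the differences (alpha_i - alpha_i*) per support vector."""
    return dual_coef @ rbf_kernel(X_sv, x, gamma) + b

# Hypothetical support vectors and dual coefficients
X_sv = np.array([[0.0, 0.0], [1.0, 1.0]])
dual_coef = np.array([0.5, -0.25])
y = svr_predict(np.array([0.0, 0.0]), X_sv, dual_coef, b=0.1, gamma=0.1)
```

Note that only the support vectors and their kernel evaluations enter the prediction; the high-dimensional map \phi is never computed explicitly.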

In this paper, the SVR model is implemented using the Radial Basis Function (RBF) kernel. Hyperparameters are tuned via grid search on the validation set. The search ranges are: regularization parameter C: [0.1, 1, 10, 100], kernel coefficient γ: [‘scale’, 0.01, 0.1, 1], and tolerance ∊: [0.01, 0.1, 0.5]. The final chosen values that yield the best performance are C = 10, γ = 0.1, and ∊=0.1.

Random forest (RF).

Decision trees can be used for various machine learning applications. However, trees grown deep to learn highly irregular patterns tend to overfit the training set. Moreover, a slight noise in the data may cause the tree to grow in a completely different manner.

Random Forest is an upgrade of decision trees, constructed as a combination of decision tree classifiers [34]. It overcomes this problem by training multiple decision trees on different subspaces of the feature space, at the cost of slightly increased bias; none of the trees in the forest sees the entire training data. The data is recursively split into partitions: at a particular node, the split is made by asking a question about an attribute, and the splitting criterion is chosen based on an impurity measure such as Shannon entropy or Gini impurity.

Gini impurity is used as the function to measure the quality of the split in each node. Gini impurity at node N is given by

G(N) = 1 - \sum_{i} p_i^2   (19)

where p_i is the proportion of the population with class label i. Another function that can be used to judge split quality is Shannon entropy, which measures the disorder in the information content. In decision trees, Shannon entropy measures the unpredictability of the information contained in a particular node of a tree (in this context, how mixed the population in a node is). The entropy in a node can be calculated as follows

H(N) = -\sum_{i=1}^{c} p_i \log_2 p_i   (20)

where c is the number of classes considered and p_i is the proportion of the population labeled as i. Entropy is highest when all classes are present in equal proportion in the node, and lowest when only one class is present (when the node is pure).

The natural heuristic for choosing the best splitting decision at a node is the one that reduces impurity as much as possible. In other words, the best split is characterized by the highest gain in information, i.e., the largest reduction in impurity. The information gain due to a split can be calculated as follows

\Delta I = I(N) - p_L \, I(N_L) - p_R \, I(N_R)   (21)

where I(N) is the impurity measure (Gini or Shannon entropy) of node N, p_L is the proportion of the population in node N that goes to the left child N_L after the split and, similarly, p_R is the proportion that goes to the right child N_R.
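The impurity and information-gain formulas can be checked with a short NumPy sketch; the labels and function names are toy illustrations.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum_i p_i^2 over class proportions p_i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def information_gain(parent, left, right, impurity=gini):
    """Impurity reduction from splitting `parent` into `left` and `right`:
    I(N) - p_L * I(N_L) - p_R * I(N_R)."""
    p_left = len(left) / len(parent)
    return (impurity(parent)
            - p_left * impurity(left)
            - (1 - p_left) * impurity(right))

parent = [0, 0, 1, 1]                            # maximally mixed: Gini = 0.5
gain = information_gain(parent, [0, 0], [1, 1])  # a perfect split
```

A perfect split of a maximally mixed binary node yields two pure children, so the gain equals the parent's full impurity of 0.5.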

Since only a random subset of the features is considered at each split rather than the whole feature set, Random Forest trains faster. Besides, each decision tree in a Random Forest selects its training samples and features randomly, which makes the forest less prone to overfitting, more adaptable to data, and able to handle both discrete and continuous variables.

In this paper, the RF model is configured and tuned with the following hyperparameters: the number of trees (n_estimators): [100, 200, 300], the maximum depth of trees (max_depth): [5, 10, 15, None], the minimum samples required to split a node (min_samples_split): [2, 5, 10], and the minimum samples required at a leaf node (min_samples_leaf): [1, 2, 4]. The final model uses 200 trees, a maximum depth of 10, a minimum of 5 samples to split a node, and 2 samples per leaf. The Gini impurity is used as the splitting criterion.

Extreme gradient boosting (XGBoost).

XGBoost, proposed by Tianqi Chen and Carlos Guestrin in 2016 [35], is an advanced ensemble learning algorithm designed to optimize predictive accuracy and computational efficiency. Built upon the gradient boosting framework, it addresses limitations in traditional gradient boosting machines (GBMs) by introducing regularization, parallel processing, and hardware optimization. Specifically, the method aims to iteratively combine weak learners (typically decision trees) into a strong ensemble model, minimizing prediction errors while controlling overfitting.

At its core, XGBoost optimizes a regularized objective function comprising a loss function L (e.g., mean squared error or log loss) and a regularization term :

\text{Obj} = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)   (22)

where \hat{y}_i is the ensemble prediction, f_k represents the k-th tree, and \Omega(f_k) = \gamma T + \frac{1}{2} \lambda \|w\|^2. Here, T is the number of leaves, w denotes the leaf weights, and \gamma, \lambda control the regularization strength.

During training, trees are added sequentially to correct residuals from previous iterations. The gradient and Hessian of the loss guide tree construction, with the optimal weight for leaf j computed as:

w_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}   (23)

where g_i and h_i are the first- and second-order gradients of the loss, and I_j is the set of instances assigned to leaf j.
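A quick numeric check of the leaf-weight formula, assuming a squared-error loss so that g_i = \hat{y}_i - y_i and h_i = 1 for every instance in the leaf (toy residuals, illustrative names):

```python
import numpy as np

def optimal_leaf_weight(g, h, lam):
    """XGBoost optimal leaf weight: w* = -sum(g) / (sum(h) + lambda)."""
    return -np.sum(g) / (np.sum(h) + lam)

# Squared-error loss: g_i = y_hat_i - y_i, h_i = 1 per leaf instance
y_true = np.array([10.0, 12.0, 11.0])
y_pred = np.array([9.0, 9.0, 9.0])      # current ensemble predictions
g = y_pred - y_true                      # gradients: [-1, -3, -2]
h = np.ones_like(g)                      # Hessians are 1 for squared error
w = optimal_leaf_weight(g, h, lam=1.0)   # mean residual, shrunk by lambda
```

With squared error the optimal weight is the mean residual shrunk toward zero by \lambda: here the raw mean residual is 2.0, shrunk to 1.5, which is exactly how the regularization term tempers each correction.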

By balancing model accuracy, computational speed, and interpretability, XGBoost outperforms conventional machine learning methods while avoiding the complexity of deep neural networks. From credit risk modeling to financial prediction, XGBoost’s versatility and efficiency continue to drive its adoption across academia and industry [3638].

In this paper, the XGBoost model’s hyperparameters are optimized through grid search. Key parameters and their search ranges include: n_estimators: [100, 200, 300], max_depth: [3, 6, 9], learning_rate: [0.01, 0.05, 0.1], subsample: [0.8, 0.9, 1.0], and colsample_bytree: [0.8, 0.9, 1.0], gamma: [0, 0.1, 0.2], and lambda: [1, 1.5, 2]. The final model configuration was: n_estimators = 200, max_depth = 6, learning_rate = 0.05, subsample = 0.9, colsample_bytree = 0.9, gamma = 0.1, and lambda = 1.5.

Evaluation methods

To assess the predictive ability of the forecasting models, following Lu and Xu [30], and Jiang et al. [39], four main evaluation metrics are selected: out-of-sample R^2 (R^2_{oos}), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The larger the value of R^2_{oos} and the smaller the values of RMSE, MAE, and MAPE, the better the model performance. The equations of these evaluation metrics are shown below:

R^2_{oos} = 1 - \frac{\sum_{t=m+1}^{N} (y_t - \hat{y}_t)^2}{\sum_{t=m+1}^{N} (y_t - \bar{y}_t)^2}   (24)
\text{RMSE} = \sqrt{\frac{1}{N-m} \sum_{t=m+1}^{N} (y_t - \hat{y}_t)^2}   (25)
\text{MAE} = \frac{1}{N-m} \sum_{t=m+1}^{N} |y_t - \hat{y}_t|   (26)
\text{MAPE} = \frac{100\%}{N-m} \sum_{t=m+1}^{N} \left| \frac{y_t - \hat{y}_t}{y_t} \right|   (27)

where y_t, \hat{y}_t, \bar{y}_t represent the true value, the predicted value of the prediction model, and the benchmark prediction of the historical average model at time t, respectively; m is the number of in-sample observations, and N is the total number of samples.
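The four metrics can be sketched directly from their formulas. The values are toy numbers, and for illustration the benchmark is simply the sample mean rather than a recursively updated historical average.

```python
import numpy as np

def oos_r2(y, y_hat, y_bench):
    """Out-of-sample R^2 relative to a benchmark forecast."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y_bench) ** 2)

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def mape(y, y_hat):
    return np.mean(np.abs((y - y_hat) / y)) * 100

y = np.array([50.0, 60.0, 80.0])         # toy true prices
y_hat = np.array([52.0, 58.0, 79.0])     # toy model predictions
y_bench = np.full_like(y, y.mean())      # illustrative average benchmark
```

A positive R^2_{oos} means the model beats the benchmark's squared-error performance over the evaluation window.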

Furthermore, we employ the Diebold–Mariano (DM) test introduced by Diebold and Mariano [40] to evaluate the forecasting performance of the machine learning methods. The null hypothesis of the DM test is that the predictive accuracy of the tested model equals that of the alternative model. Consistent with the previous evaluation step, RMSE, MAE, and MAPE are used as the DM loss functions, and the historical average model is used as the alternative model. Given the multiple comparisons inherent in testing across horizons and models, we follow the forecasting literature’s common practice of reporting unadjusted p-values, as our conclusions rely on consistent performance patterns rather than isolated test results, and adjustments would unduly reduce statistical power.
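A simplified sketch of the DM statistic under squared-error loss follows; the forecast errors are synthetic, and for simplicity no HAC correction is applied to the variance of the loss differential, as a multi-step-horizon application would require.

```python
import numpy as np

def dm_statistic(e1, e2):
    """Diebold-Mariano statistic: mean of the loss differential
    d_t = e1_t^2 - e2_t^2 divided by its standard error. Negative
    values favor the first forecast; compare against N(0, 1)."""
    d = e1 ** 2 - e2 ** 2
    return d.mean() / np.sqrt(d.var(ddof=1) / len(d))

rng = np.random.default_rng(42)
z = rng.standard_normal(120)   # hypothetical benchmark forecast errors
e_bench = z
e_model = 0.5 * z              # hypothetical model that halves every error
stat = dm_statistic(e_model, e_bench)  # strongly negative: model wins
```

Because the model's squared error is smaller at every observation, the loss differential is uniformly non-positive and the statistic is clearly negative, rejecting the null of equal accuracy.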

Empirical results

LASSO feature selection

Table 2 presents the correlation coefficients among some predictors. The correlation between EUI (EUI_equally, EUI_GDP) and other predictors is small, suggesting that the energy-related uncertainty contains additional information. Moreover, the correlation coefficients of some macroeconomic predictors are shown to be relatively high. Therefore, we need to use the LASSO regression model to select the predictors with strong predictive power, especially the predictors with high correlations.

By adopting the LassoCV method, we choose the optimal value of the penalty coefficient \lambda from the candidate set [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1]. Table 3 shows the predictors selected through the LASSO regression method, in addition to EUI_equally and EUI_GDP.

One-month-ahead forecasting results

In this subsection, we run 6 × 3 models, namely, six prediction methods (LR, LSTM, Attention-LSTM, SVR, RF, XGBoost), with three energy uncertainty settings (EUI_equally, EUI_GDP, EUI_without), to predict crude oil prices one month ahead. We use the Adam optimizer and the RMSE loss function for the neural network models. Other parameters that need to be tuned, such as the kernel and the value of epsilon for the SVR model, are determined through trial and error. All experiments are implemented in Python 3.7.

Table 4 shows the one-month-ahead forecasting results, while Tables 5 and 6 present the improvement percentages of the EUI index and of the best machine learning model, respectively. It can be concluded that the RF model with EUI is superior to other models in the short term. Specifically, the results can be summarized as follows:

Table 4. One-month-ahead forecasting performance for all models.

https://doi.org/10.1371/journal.pone.0341496.t004

Table 5. Improvement percentages of the EUI index in one-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t005

  (1) The inclusion of the energy uncertainty index (EUI) can improve the predictive accuracy of all models, as shown in Tables 4 and 5. For instance, using the RF model, EUI_equally can improve the performance of R^2_{oos}, RMSE, MAE, and MAPE by 1.064%, 1.110%, 1.520%, and 1.422%, respectively, compared to the RF model without EUI. EUI_GDP can improve the performance of R^2_{oos}, RMSE, MAE, and MAPE by 0.912%, 0.892%, 3.422%, and 0.943%, respectively, compared to the RF model without EUI.
  (2) Based on the DM test reported in Table 4, all prediction models are helpful in predicting crude oil prices, and the RF models outperform other models in the short term. Here, the results of the DM test are indicated with an asterisk. According to Table 4, all prediction models perform better than the baseline model, which is a historical average model in this study. Moreover, according to Table 6, no matter which EUI is used in the prediction models, the RF models always perform better than the others. For example, using EUI_equally in the models, the RF model can improve the performance of R^2_{oos} by 122.408%, 7.431%, 6.742%, 192.952%, and 1.838% compared to the LR, LSTM, Attention-LSTM, SVR, and XGBoost models, respectively. Using EUI_GDP in the models, the RF model can improve the performance of R^2_{oos} by 145.926%, 6.752%, 3.912%, 204.587%, and 4.239% compared to the LR, LSTM, Attention-LSTM, SVR, and XGBoost models, respectively. Similar conclusions can be drawn using the other evaluation metrics.

Combining the results summarized above, it is concluded that the RF model with EUI is superior to other types of models in the short term. We attribute this advantage to two interrelated factors: the intrinsic capabilities of the Random Forest algorithm and the distinct role of EUI in high-frequency price dynamics. RF excels in capturing the complex, nonlinear interactions among transient market features—such as inventory shocks, speculative flows, and currency fluctuations—through its ensemble structure and feature selection mechanism, which effectively filters high-frequency noise while identifying the most relevant short-term drivers.

In this context, EUI functions as a critical barometer of immediate market sentiment and energy-specific risk. It captures real-time uncertainty stemming from geopolitical events, supply disruptions, or sudden policy shifts in the energy sector, which often trigger rapid repricing in oil futures. By incorporating EUI, the RF model gains access to a high-frequency proxy for the “fear factor” that drives short-term volatility—a dimension often missed by conventional macroeconomic variables. Thus, the predictive superiority of RF with EUI not only reflects the model’s algorithmic strength but also underscores the essential role of energy uncertainty as a near-term risk amplifier in crude oil markets.

Three-month-ahead forecasting results

In this subsection, we conduct a three-month-ahead prediction to further investigate the impact of EUI on the crude oil market in the long term. The empirical results presented in Tables 7–9 show that the Attention-LSTM model with EUI is superior to other competing models over this horizon. This can be explained by the model’s architectural strengths in capturing temporal dependencies and the evolving nature of EUI’s influence beyond short-term noise.

Table 7. Three-month-ahead forecasting performance for all models.

https://doi.org/10.1371/journal.pone.0341496.t007

Table 8. Improvement percentages of the EUI index in three-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t008

Table 9. Improvement percentages of the LSTM model in three-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t009

Specifically, the Attention-LSTM combines the long-range memory of LSTM with a dynamic attention mechanism, enabling it to identify and weigh historically significant periods—such as sustained uncertainty regimes or structural breaks—that shape medium- to long-term price trends. In such a framework, EUI transitions from being a high-frequency sentiment indicator to a forward-looking gauge of fundamental risks affecting investment, production, and consumption decisions. A persistent rise in EUI may signal delays in oilfield development and the energy transition, shifts in the global political landscape, or policy uncertainty that gradually alters supply-demand balances. The Attention-LSTM model, by assigning learned importance to relevant past states of EUI and its interactions with slow-moving fundamentals, effectively captures how energy uncertainty propagates into long-term price dynamics. Therefore, the outperformance of Attention-LSTM with EUI not only demonstrates its technical merit but also validates EUI’s role as a persistent driver in the formation of medium-term crude oil price trends.

Six-month-ahead forecasting results

To validate the results in the long term, we conduct a six-month-ahead prediction in this subsection. The empirical results, shown in Tables 10–12, indicate that the Attention-LSTM model with EUI still outperforms the other models in the long run. These results are consistent with those in the previous subsection.

Table 10. Six-month-ahead forecasting performance for all models.

https://doi.org/10.1371/journal.pone.0341496.t010

Table 11. Improvement percentages of the EUI index in six-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t011

Table 12. Improvement percentages of the LSTM model in six-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t012

Twelve-month-ahead forecasting results

To further validate the results in the long term, we conduct a twelve-month-ahead prediction in this subsection. The empirical results, shown in Tables 13–15, indicate that the Attention-LSTM model with EUI remains superior to the other models in the long term. These results are consistent with those in the previous subsections.

Table 13. Twelve-month-ahead forecasting performance for all models.

https://doi.org/10.1371/journal.pone.0341496.t013

Table 14. Improvement percentages of the EUI index in twelve-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t014

Table 15. Improvement percentages of the LSTM model in twelve-month-ahead forecasting.

https://doi.org/10.1371/journal.pone.0341496.t015

Forecasting uncertainty and interval analysis

To address the uncertainty surrounding our point prediction and enhance the practical relevance of our findings, we supplement our analysis with an evaluation of prediction intervals. Following the common practice in previous literature [41,42], we employ the residual-based bootstrapping method to construct empirical prediction intervals for our best-performing models: the RF model for one-month-ahead forecasts and the Attention-LSTM model for three-, six-, and twelve-month-ahead forecasts.

The procedure is as follows. First, we fit the model on the training set. Second, after obtaining the final model, we compute the distribution of residuals on the training set. Third, for each point forecast in the test set, we generate 1000 bootstrap samples by adding randomly sampled residuals to the point forecast. Fourth, we form the 95% empirical prediction interval from the 2.5th and 97.5th percentiles of this bootstrap distribution. Fifth, we assess the quality of these intervals by their coverage rate—the percentage of test-set observations that fall within the constructed interval. The results are summarized in Table 16.
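The five-step procedure can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's setup: it uses a linear model (rather than RF or Attention-LSTM) so that the in-sample residuals are representative of out-of-sample error, and the sample sizes are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=300)

# chronological split: first 240 observations train, last 60 test
X_tr, y_tr, X_te, y_te = X[:240], y[:240], X[240:], y[240:]

# steps 1-2: fit the model and collect training-set residuals
model = LinearRegression().fit(X_tr, y_tr)
resid = y_tr - model.predict(X_tr)

# step 3: for each point forecast, add 1000 randomly resampled residuals
point = model.predict(X_te)
B = 1000
boot = point[:, None] + rng.choice(resid, size=(point.size, B))

# step 4: 95% empirical interval from the 2.5th / 97.5th percentiles
lo, hi = np.percentile(boot, [2.5, 97.5], axis=1)

# step 5: coverage rate = share of test observations inside the interval
coverage = np.mean((y_te >= lo) & (y_te <= hi))
print(f"95% interval coverage on the test set: {coverage:.3f}")
```

Note that flexible learners such as RF tend to fit the training set closely, so their in-sample residuals can understate out-of-sample error and produce intervals that under-cover, consistent with the slight under-coverage reported in Table 16.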

Table 16. Prediction Interval Coverage Rates for Best Models.

https://doi.org/10.1371/journal.pone.0341496.t016

As shown in Table 16, the empirical coverage rates for these models are close to the nominal 95% level. This indicates that the bootstrapped prediction intervals are well-calibrated and reliably capture the forecast uncertainty. The slight under-coverage, particularly for the longer-horizon Attention-LSTM model, is common in financial forecasting and reflects the challenge of encapsulating all sources of uncertainty in highly volatile markets.

Feature importance analysis

In the previous section, LASSO regression is utilized for feature selection. In this subsection, we use other advanced methods, including SHAP (SHapley Additive exPlanations) values and attention scores, to further investigate feature importance at each horizon. Specifically, in the 1-month-ahead prediction, we use the RF model and SHAP values to rank the selected features. In the 3-month-ahead, 6-month-ahead, and 12-month-ahead predictions, we use the Attention-LSTM model and attention scores to rank these features. The results are shown in Table 17, in which the top seven features are sorted in descending order.
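Computing SHAP values requires the third-party `shap` package. As a dependency-light sketch of the same idea (attributing a model's predictions to its input features), the snippet below uses scikit-learn's permutation importance with an RF model on synthetic data. The feature names are hypothetical placeholders, and permutation importance is a stand-in for, not equivalent to, SHAP:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
# feature 0 is made dominant, feature 1 weak, features 2-3 are pure noise
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=400)

# hypothetical names echoing the kinds of predictors in Table 17
names = ["oil_stock_growth", "oil_production_growth",
         "stock_variance", "usd_index"]

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# permute each column in turn and measure the drop in model score
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
ranking = sorted(zip(names, imp.importances_mean), key=lambda t: -t[1])
for name, score in ranking:
    print(f"{name:>22s}  {score:.3f}")
```

The dominant synthetic driver should surface at the top of the ranking, mirroring how Table 17 orders the top seven features per horizon.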

Table 17. Feature importance of crude oil futures prices.

https://doi.org/10.1371/journal.pone.0341496.t017

The feature importance rankings in Table 17 reflect a clear temporal hierarchy, where short-term oil price dynamics (1-month horizon) are dominated by immediate supply-demand imbalances and market sentiment. The top-ranked feature, Growth of US crude oil stock, directly captures inventory fluctuations that signal near-term supply tightness or surplus, while Growth of US crude oil production reflects shale operators’ rapid response to price signals; both drive short-term price volatility. Moreover, high-frequency financial indicators such as stock variance and the ISM manufacturing index further amplify these effects by quantifying speculative trading intensity and real-time industrial demand shocks. Concurrently, the US dollar index and the S&P GSCI Non-Energy index introduce short-term noise through currency valuation shifts and cross-commodity arbitrage, respectively. These features collectively highlight the dominance of operational and transactional factors over structural and macroeconomic policy drivers in short-term forecasting.

As the forecasting horizon extends to 3–12 months, structural economic indicators and policy anticipation supersede transient supply shocks. For instance, capacity utilization and inflation rate emerge as critical drivers at the 3-month horizon, which gradually reshape energy investment and consumption patterns. At the 6-month horizon, lagged effects of systemic variables and monetary policy such as industrial production index and treasury bill rates, gain prominence, reflecting the delayed transmission of global trade adjustments and credit conditions. Over 12 months, features such as default yield spread and CRB Rind index dominate, as they encapsulate macroeconomic stability risks and commodity supercycle trends. This transition from operational and transactional features to structural and policy drivers underscores the increasing relevance of macroeconomic stability, policy cycles, and structural demand rebalancing in longer-term predictions.

Robustness tests

Different feature selection method

The out-of-sample performance is sensitive to the feature selection method. In this subsection, we consider Random Forest (RF) as an alternative feature selection method in crude oil price prediction. We first fit an RF model on the training set to obtain feature importance scores and then select the top 17 features to form the modified predictor set; the number of features is set to match that chosen by LASSO in the main analysis. The data splitting scheme remains strictly chronological—training (60%), validation (20%), and testing (20%)—to prevent look-ahead bias. The evaluation metric in this subsection is RMSE, and the prediction results are shown in Table 18. The results provide strong empirical evidence that the RF model with EUI outperforms other models in the short term (one-month-ahead), while the Attention-LSTM model with EUI is superior to other models in the long term (three-month-ahead and longer). These results are consistent with previous findings.
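The selection step can be sketched as follows, on synthetic data with k = 3 retained features instead of the paper's 17; the data-generating coefficients and sizes are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n, p, k = 300, 10, 3
X = rng.normal(size=(n, p))
# only features 0, 3, and 8 carry signal; the rest are noise
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + X[:, 8] + rng.normal(scale=0.2, size=n)

n_train = int(0.6 * n)                      # chronological 60% training slice
rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(X[:n_train], y[:n_train])            # fit on training portion only

# rank by impurity-based importance and keep the top-k columns
top_k = np.argsort(rf.feature_importances_)[::-1][:k]
X_selected = X[:, np.sort(top_k)]           # modified predictor set
print("selected feature indices:", sorted(top_k.tolist()))
```

Fitting the ranking model only on the training slice, as above, is what prevents the look-ahead bias mentioned in the text.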

Different crude oil futures

In the crude oil market, Brent crude oil and WTI crude oil are two essential varieties that attract considerable attention. They are not only the benchmarks for global oil pricing but also investments that investors follow closely. In this subsection, we re-examine the forecasting performance of the RF model with EUI and the Attention-LSTM model with EUI using WTI crude oil futures. The WTI crude oil futures price data are collected from the EIA and range from February 1996 to October 2022, covering 321 months. The same set of predictors selected by LASSO in the main analysis is used, and the identical chronological split is applied to the WTI data. The evaluation metric is again RMSE, and the prediction results are shown in Table 19. The RF model with EUI yields a smaller RMSE in the short term, while the Attention-LSTM model with EUI has higher predictive accuracy in the long term. These results are consistent with previous results.

Table 19. Forecasting performance using WTI crude oil futures.

https://doi.org/10.1371/journal.pone.0341496.t019

Different data splits

While our primary analysis employs a fixed 60/20/20 chronological split, we acknowledge that the choice of split ratio may influence model performance. To assess the sensitivity of our findings, we conduct additional experiments using a 70%/15%/15% chronological split. This configuration allows us to examine whether our core conclusions remain stable under different data allocations. Using the same LASSO-selected predictors and RMSE as the evaluation metric, the results summarized in Table 20 demonstrate remarkable consistency. Specifically, RF with EUI maintains superior performance for one-month-ahead forecasts, while Attention-LSTM with EUI continues to achieve the lowest RMSE for three-, six-, and twelve-month-ahead forecasts.
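The two split schemes compared above can be sketched as follows; no shuffling is applied, so later observations never leak into training. The series length of 321 months matches the WTI sample described earlier but is otherwise an illustrative choice:

```python
import numpy as np

def chrono_split(n, train_frac, val_frac):
    """Return index arrays for a chronological train/validation/test split."""
    i1 = int(n * train_frac)
    i2 = i1 + int(n * val_frac)
    idx = np.arange(n)
    return idx[:i1], idx[i1:i2], idx[i2:]

n = 321  # monthly observations, e.g. February 1996 to October 2022
for tf, vf in [(0.60, 0.20), (0.70, 0.15)]:
    tr, va, te = chrono_split(n, tf, vf)
    print(f"{tf:.0%}/{vf:.0%} split -> "
          f"train={tr.size}, val={va.size}, test={te.size}")
    assert tr.max() < va.min() < te.min()   # strictly chronological order
```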

Table 20. Forecasting performance under different data splits.

https://doi.org/10.1371/journal.pone.0341496.t020

Different forecasting target

To address concerns that our findings might be driven by the persistence of crude oil price levels rather than genuine predictive relationships, we conduct a robustness check by forecasting returns instead of price levels. We define returns as the logarithmic first difference r_t = ln(P_t) - ln(P_{t-1}), where P_t is the crude oil price at time t. Using the same LASSO-selected predictors and RMSE as the evaluation metric, the results presented in Table 21 demonstrate that our key findings remain robust. Specifically, models incorporating EUI continue to outperform those without it, and the relative advantage of RF for short-term forecasts and Attention-LSTM for long-term forecasts persists. This confirms that the predictive power of our models and the value of EUI are not artifacts of non-stationarity in price levels but reflect meaningful relationships that extend to the returns series.
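The return transformation amounts to one line of array code; the short price series below is illustrative, not the Brent or WTI data:

```python
import numpy as np

# r_t = ln(P_t) - ln(P_{t-1}): log first differences of a price series
prices = np.array([70.0, 72.1, 69.8, 74.3])
log_returns = np.diff(np.log(prices))
print(np.round(log_returns, 4))
```

Differencing shortens the series by one observation, so the first usable target is the second month of the sample.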

Different market conditions

When predicting crude oil prices, it is critical to consider extreme events such as the global financial crisis, the Eurozone debt crisis, and the COVID-19 pandemic, because these events trigger extreme demand destruction, market panic, and policy interventions that drastically distort traditional supply-demand patterns. In this subsection, we further validate the performance of the prediction models during the global financial crisis (2008–2009) and the COVID-19 pandemic (2020–2023). We use the same LASSO-selected predictors and maintain the forecasting scheme with chronological splits, ensuring no future information is leaked. The evaluation metric is again RMSE, and the prediction results are shown in Tables 22 and 23. The results show that under extreme events, the RF model with EUI still performs better in the short term, while the Attention-LSTM model with EUI still outperforms the other prediction models in the long term, except for the 1-month-ahead XGBoost model without the EUI index. We suspect that this is because XGBoost can handle nonlinear relationships and noisy data through gradient-boosted decision trees, which robustly adapt to abrupt market fluctuations and outliers. Additionally, its regularization techniques and feature importance prioritization reduce overfitting, enhancing accuracy in capturing rapid, localized price shifts during volatile periods. These results are broadly consistent with previous results.

Table 22. Forecasting performance under global financial crisis.

https://doi.org/10.1371/journal.pone.0341496.t022

Table 23. Forecasting performance under COVID-19 pandemic.

https://doi.org/10.1371/journal.pone.0341496.t023

Conclusion

This study uses the energy-related uncertainty index (EUI) to predict crude oil prices and employs machine learning methods to improve prediction accuracy. Existing empirical studies on the impact of uncertainty on crude oil have not considered energy uncertainty and rarely use machine learning methods to assess the predictive power of uncertainty. Therefore, we explore whether the EUI and machine learning methods (i.e., LSTM, Attention-LSTM, SVR, RF, XGBoost) can help generate more accurate predictions. In this study, LR is used as the benchmark model.

Several interesting findings are highlighted here. First, the empirical results show that EUI has a significant impact on crude oil prices. Second, the RF model has stronger predictive power than the other competing methods in the short term, while the Attention-LSTM model exhibits better predictive performance in the long term. Additionally, our results are robust to alternative feature selection methods, crude oil futures varieties, data splits, forecasting targets, and market conditions. These findings will help policymakers and investors better understand the crude oil futures market and make more informed decisions about the energy market, including investment and risk management decisions.

While this paper contributes to the existing literature, there are still limitations to be overcome in the future. First, when analyzing the predictive capability of EUI, we comprehensively consider financial and economic factors while overlooking the influence of political factors. The political climate across the world is undergoing unprecedented changes, and the related factors are highly likely to impact crude oil prices in the near term, which deserves attention. In subsequent research, we will evaluate the predictive effectiveness of energy uncertainties after incorporating political factors and conduct an in-depth analysis of their interactive effects on crude oil prices. Second, while our analysis demonstrates robustness to alternative data splits, employing rolling or expanding window schemes in future work would better simulate real-time forecasting conditions. Finally, we have analyzed the crude oil market mainly based on Brent crude oil and WTI crude oil. In future work, we will study the crude oil market more specifically, such as the Chinese crude oil market.

Supporting information

References

1. Busari GA, Lim DH. Crude oil price prediction: A comparison between AdaBoost-LSTM and AdaBoost-GRU for improving forecasting performance. Computers & Chemical Engineering. 2021;155:107513.
2. Roman M, Górecka A, Domagała J. The Linkages between Crude Oil and Food Prices. Energies. 2020;13(24):6545.
3. Baker SR, Bloom N, Davis SJ. Measuring Economic Policy Uncertainty. The Quarterly Journal of Economics. 2016;131(4):1593–636.
4. Li X, Liang C, Chen Z, Umar M. Forecasting crude oil volatility with uncertainty indicators: New evidence. Energy Economics. 2022;108:105936.
5. Dai P-F, Xiong X, Zhang J, Zhou W-X. The role of global economic policy uncertainty in predicting crude oil futures volatility: Evidence from a two-factor GARCH-MIDAS model. Resources Policy. 2022;78:102849.
6. Nonejad N. Forecasting crude oil price volatility out-of-sample using news-based geopolitical risk index: What forms of nonlinearity help improve forecast accuracy the most? Finance Research Letters. 2022;46:102310.
7. Ding Y, Liu Y, Failler P. The Impact of Uncertainties on Crude Oil Prices: Based on a Quantile-on-Quantile Method. Energies. 2022;15(10):3510.
8. Shahbaz M, Sharif A, Belaid F, Vo XV. Long-run co-variability between oil prices and economic policy uncertainty. Int J Fin Econ. 2021;28(2):1308–26.
9. Li X, Wu M, Yuan L, Xiao M, Zhong R, Yu M. Uncertainties and oil price volatility: Can lasso help? Finance Research Letters. 2024;61:104963.
10. Xu B, Fu R, Lau CKM. Energy market uncertainty and the impact on the crude oil prices. J Environ Manage. 2021;298:113403. pmid:34365183
11. Gupta R, Pierdzioch C. Forecasting the Volatility of Crude Oil: The Role of Uncertainty and Spillovers. Energies. 2021;14(14):4173.
12. Ringim SH, Alhassan A, Güngör H, Bekun FV. Economic Policy Uncertainty and Energy Prices: Empirical Evidence from Multivariate DCC-GARCH Models. Energies. 2022;15(10):3712.
13. Yu M, Umair M, Oskenbayev Y, Karabayeva Z. Exploring the nexus between monetary uncertainty and volatility in global crude oil: A contemporary approach of regime-switching. Resources Policy. 2023;85:103886.
14. Foroutan P, Lahmiri S. Deep learning systems for forecasting the prices of crude oil and precious metals. Financ Innov. 2024;10(1).
15. Tiwari AK, Sharma GD, Rao A, Hossain MR, Dev D. Unraveling the crystal ball: Machine learning models for crude oil and natural gas volatility forecasting. Energy Economics. 2024;134:107608.
16. Liu W, Xu X. Forecasting crude oil price: A deep forest ensemble approach. Finance Research Letters. 2024;69:106153.
17. Yu L, Zhang X, Lin Y, Yu Y, Wu J, Dai D. Forecasting Crude Oil Prices: Evidence From WOA-VMD-FE-Transformer Model. Comput Econ. 2025;66(6):4645–76.
18. Dang TH-N, Nguyen CP, Lee GS, Nguyen BQ, Le TT. Measuring the energy-related uncertainty index. Energy Economics. 2023;124:106817.
19. Xu Y, Guan B, Lu W, Heravi S. Macroeconomic shocks and volatility spillovers between stock, bond, gold and crude oil markets. Energy Economics. 2024;136:107750.
20. Wei Y, Shi C, Zhou C, Wang Q, Liu Y, Wang Y. Market volatilities vs oil shocks: Which dominate the relative performance of green bonds? Energy Economics. 2024;136:107709.
21. Zhang Y, Ma F, Wang Y. Forecasting crude oil prices with a large set of predictors: Can LASSO select powerful predictors? Journal of Empirical Finance. 2019;54:97–117.
22. Dai Z, Kang J. Bond yield and crude oil prices predictability. Energy Economics. 2021;97:105205.
23. Mensi W, Ziadat SA, Rababa’a ARA, Vo XV, Kang SH. Oil, gold and international stock markets: Extreme spillovers, connectedness and its determinants. The Quarterly Review of Economics and Finance. 2024;95:1–17.
24. Tibshirani R. Regression Shrinkage and Selection Via the Lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology. 1996;58(1):267–88.
25. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. pmid:9377276
26. Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473; 2014.
27. Liu Y, Young R, Jafarpour B. Long–short-term memory encoder–decoder with regularized hidden dynamics for fault detection in industrial processes. Journal of Process Control. 2023;124:166–78.
28. Dip Das J, Thulasiram RK, Henry C, Thavaneswaran A. Encoder–Decoder Based LSTM and GRU Architectures for Stocks and Cryptocurrency Prediction. Journal of Risk and Financial Management. 2024;17(5):200.
29. Wang Y, Qin L, Wang Q, Chen Y, Yang Q, Xing L, et al. A novel deep learning carbon price short-term prediction model with dual-stage attention mechanism. Applied Energy. 2023;347:121380.
30. Lu M, Xu X. TRNN: An efficient time-series recurrent neural network for stock price prediction. Information Sciences. 2024;657:119951.
31. Drucker H, Burges CJ, Kaufman L, Smola A, Vapnik V. Support vector regression machines. Advances in Neural Information Processing Systems. 1996;9.
32. Chen L, Pan Y, Zhang D. Prediction of Carbon Emissions Level in China’s Logistics Industry Based on the PSO-SVR Model. Mathematics. 2024;12(13):1980.
33. Wu S, Wang W, Song Y, Liu S. An EEMD-LSTM, SVR, and BP decomposition ensemble model for steel future prices forecasting. Expert Systems. 2024;41(11).
34. Breiman L. Random Forests. Machine Learning. 2001;45(1):5–32.
35. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016;785–94.
36. Liu J, Zhang S, Fan H. A two-stage hybrid credit risk prediction model based on XGBoost and graph-based deep neural network. Expert Systems with Applications. 2022;195:116624.
37. Qin R. The Construction of Corporate Financial Management Risk Model Based on XGBoost Algorithm. Journal of Mathematics. 2022;2022(1).
38. Tan B, Gan Z, Wu Y. The measurement and early warning of daily financial stability index based on XGBoost and SHAP: Evidence from China. Expert Systems with Applications. 2023;227:120375.
39. Jiang Z, Zhang L, Zhang L, Wen B. Investor sentiment and machine learning: Predicting the price of China’s crude oil futures market. Energy. 2022;247:123471.
40. Diebold FX, Mariano RS. Comparing Predictive Accuracy. Journal of Business & Economic Statistics. 2002;20(1):134–44.
41. Eğrioğlu E, Fildes R. A New Bootstrapped Hybrid Artificial Neural Network Approach for Time Series Forecasting. Comput Econ. 2020;59(4):1355–83.
42. Ivanyuk V. The method of residual-based bootstrap averaging of the forecast ensemble. Financ Innov. 2023;9(1).