Shipping indices are extremely volatile, non-stationary, unstructured and non-linear, which makes them more difficult to forecast than other common financial time series. Based on the idea of "decomposition-reconstruction-integration", this article puts forward a combined forecasting model for shipping indices, CEEMD-PSO-BiLSTM, which overcomes the linearity limitation of traditional models. CEEMD is used to decompose the original sequence into several IMF components and a RES sequence, and the IMF components are then recombined by reconstruction. Each sub-sequence is predicted by a PSO-BiLSTM neural network, and the predicted value of the original sequence is obtained by summing the predicted values of the sub-sequences. Using six major shipping indices in China’s shipping market, such as FDI and BDI, as test data, the CEEMD-PSO-BiLSTM model is systematically compared with other mainstream time-series models in terms of forecasting performance. The results show that the model outperforms the other models on all indicators, which indicates its applicability across different shipping markets. These results can deepen the understanding of shipping indices and have important implications for risk management and decision-making in the shipping market.
Citation: Li C, Wang X, Hu Y, Yan Y, Jin H, Shang G (2023) Forecasting shipping index using CEEMD-PSO-BiLSTM model. PLoS ONE 18(2): e0280504. https://doi.org/10.1371/journal.pone.0280504
Editor: Qichun Zhang, University of Bradford, UNITED KINGDOM
Received: July 6, 2022; Accepted: January 2, 2023; Published: February 2, 2023
Copyright: © 2023 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: Our funding came from Hebei GEO University and Guizhou University of Finance and Economics and two projects. Two projects: Key Project of Philosophy and Social Science Planning in Guizhou Province (No.: 20GZD61); 2020 Graduate program of Guizhou University of Finance and Economics (No.: 2020ZXSY09). The sponsor is Chengang Li and Xuan Wang respectively. The role of Chengang Li in the research is as follows: Conceptualization, Funding acquisition and Supervision. The role of Xuan Wang in the research is as follows: Funding acquisition, writing-original draft, writing-Review & editing.
Competing interests: The authors have declared that no competing interests exist.
In modern times, international shipping has become an important part of world trade and of economic exchange between countries. Compared with other means of transportation, shipping has the advantages of large carrying capacity and low operating costs. Today, more than 90% of the world’s trade is carried by sea. The shipping index, a price index constructed from actual shipping market rates, not only reflects the overall level of freight rates in the maritime transportation market, but also objectively reflects the degree of fluctuation in the shipping market and, to a certain extent, the economic trend of a country. With the development of the shipping market and the continuous growth of world trade volume, a country’s economy and its demand for shipping trade are becoming ever more closely linked. More and more scholars link shipping indices with the global economic development trend [1–3], and BDI and other shipping indices have been used by various countries as economic indicators of world trade. Many experts and scholars at home and abroad have worked on anticipating future changes in the shipping market by forecasting shipping indices. With a reasonable forecasting model, investors can obtain excess returns, and managers and decision makers can align the company’s strategic decisions with the expected trend, thus effectively avoiding market risks.
Previous studies have pointed out that shipping indices are non-linear, highly noisy and periodic [5–7]. In recent years, with the development of big data, a series of breakthroughs have been made in machine learning and deep learning, which are now widely used in time series forecasting. Deep learning, a concept first proposed by Hinton et al., refers to multi-hidden-layer neural networks built on artificial neural networks (ANN), which have shown powerful capabilities in extracting the essential features of complex input samples. However, the shipping market is closely linked to the international economic situation and to macroeconomic cycles, so shipping indices are cyclical, multi-dimensional and highly complex. If a deep learning model is applied directly to such a series, it tends to fit the complex and chaotic noise in the original series during training, which weakens its generalization ability and prevents it from learning the key features of the series. To solve this problem, many scholars have tried to explain the characteristics of financial time series through signal decomposition. Among these methods, the empirical mode decomposition proposed by Huang et al. and its modified variants effectively separate the high-frequency and low-frequency components of financial time series, which has steadily improved the prediction accuracy of neural network models.
This article adopts the idea of "decomposition, reconstruction and integration" to construct a combined model, the CEEMD-PSO-BiLSTM model, which aims to analyze and predict the internal characteristics and trend of shipping indices, to grasp the dynamics of the shipping market, and to guard against the major risks that the shipping market may bring. At the same time, the predictive performance of deep learning on shipping indices is explored to fill the gap in the use of neural networks for shipping index prediction. The contributions of this article are as follows: (1) We introduce neural network algorithms into shipping market prediction and decompose the original sequence with the CEEMD method, which creates favorable conditions for accurately fitting the nonlinear and highly noisy characteristics of shipping indices and significantly reduces the difficulty of neural network prediction. (2) We choose the BiLSTM model, a deep learning model with strong generalization ability, as the framework and combine it with CEEMD from the field of signal decomposition to build a high-precision combined prediction model for shipping indices, providing a practical and reliable modeling scheme. (3) We introduce PSO to optimize the BiLSTM model, which significantly improves the prediction accuracy for shipping indices.
The structure of this article is as follows: the “Related works” section motivates this article through previous research; the “Models” section introduces the sub-models used in this article and the construction of the combined CEEMD-PSO-BiLSTM model; the “Empirical Analysis” section uses the CEEMD-PSO-BiLSTM model to predict and analyze the shipping indices; the “Results and discussion” section compares the proposed model with other single and combined models to illustrate its effectiveness; the “Summary” section presents the research conclusions.
2. Related works
As the shipping market is always full of uncertain events, the quantitative analysis of ocean freight indices has long been a focus of international scholars, and a growing body of literature has proposed methods for freight index prediction. With the development of computer technology, academic analysis of shipping indices has become increasingly active, at first using traditional models such as the ARMA, ARIMA, GARCH and VAR models. Cullinane et al. conducted a univariate time series prediction analysis on the Baltic Freight Index (BFI) and demonstrated that the ARIMA model performs well in short-term prediction of BFI. Veenstra and Frans established a vector autoregressive model for marine dry bulk freight samples and showed that there is a stable long-term relationship between different series of freight rates. Li established an ARMA model after eliminating trend and seasonal factors, which further improved the short-term prediction of BFI. Yang et al. selected four vessel types of the Baltic Dry Index to build a GARCH (1,1) model, which reflected the sensitivity and persistence pattern of its fluctuation. Chen et al. used ARIMA and VAR models to predict the spot rates of three major dry bulk routes, and they found that the VAR model predicted significantly better than the ARIMA model. Traditional models, although well founded theoretically, require their prescribed statistical assumptions to be satisfied before a statistical model can be built, and they do not apply to high-dimensional, noise-laden time series. When dealing with complex time series such as shipping data, traditional econometric models are unable to learn and fit their non-stationary, non-linear characteristics, resulting in unsatisfactory forecasting results.
With the advent of the Big Data era, the surge in data volume and the breakthrough in computing power, the rise of the artificial intelligence industry has brought new solutions to the financial time series forecasting problem. Machine learning has gradually replaced traditional econometric models in the analysis of financial time series. Yang et al. selected CCBFI, CCFI and BFI as early warning criteria and used SVM models to measure the prediction interval of the degree of freight alarm. Cho and Lin used a fuzzy neural network model to analyze and forecast the BDI, and the results showed that the model had high forecasting accuracy. Pérez-Cruz et al. showed that, when forecasting the return volatility of the stock market, estimating the parameters of the GARCH model with an SVM gave higher forecasting ability than the ordinary maximum likelihood (ML) method. Liu et al. proposed an AR-SVR-GARCH model and an AR-SVR-GJR model, and the empirical results show that both models have better volatility forecasting ability for the dry bulk shipping market, the crude oil shipping market and the shipping stock market. Artificial neural networks (ANN), machine learning algorithms built from multiple layers of neurons, have received considerable attention because they are well suited to complex problems. Li and Parsons showed that neural networks can significantly outperform time series models, especially in long-term forecasting. Kamal et al. used deep neural networks to transform the BDI forecasting problem into a high-dimensional multiple regression problem. Zeng et al. showed that the proposed ANN method outperformed VAR in BDI forecasting. Sahin et al. showed that the ANN method is an important modelling and forecasting method for the BDI and demonstrated its applicability. Zhang et al. showed that ANN-based algorithms yielded smaller errors and higher directional matching rates than econometric algorithms when forecasting weekly and monthly data.
Schmidhuber and Hochreiter put forward the LSTM network, which improves on the RNN model: by adding memory units and forgetting units to the network structure, it overcomes the vanishing and exploding gradient problems and achieves significantly better prediction results on time series than the traditional RNN model. In the study of time series prediction, many papers have shown that LSTM has obvious advantages. Kim et al. compared time series analysis with deep learning methods and found that LSTM can effectively improve the prediction performance for the BDI. Xiao et al. combined LSTM with ensemble learning techniques to forecast CBCFI, and the resulting model outperformed other methods when dealing with information involving dramatic market downturns. In stock market research, Nelson et al. proposed a stock price prediction model based on LSTM, which was used to simulate transactions and compared with benchmark models to evaluate its prediction performance. The experimental results showed that the LSTM-based model had lower risk than the other models in the trading simulation. Nabipour et al. found that the LSTM model has stronger fitting ability than other machine learning models. Although the LSTM model exhibits strong predictive ability on time series, it can only obtain information from a single, forward timeline during weight training. The BiLSTM model makes up for this shortcoming, and many scholars have confirmed that BiLSTM performs better than LSTM in prediction [28, 29].
As shipping indices form a complex system influenced by multiple external factors, and the shipping price series itself contains a large amount of noise, more and more scholars recognize that a single neural network model cannot adequately extract the key features of such a complex series. Combining existing forecasting methods with a decomposition of the time series data to improve forecasting performance has therefore become a challenge. Scholars have integrated machine learning models with methods from disciplines such as financial econometrics and signal engineering in the hope of forecasting financial time series accurately, and analytical methods such as the Fourier transform and the wavelet transform have begun to be applied to financial time series. Yang et al. built a combined model of wavelet transform and support vector machine to predict the BPI, and the results showed that the combined model had higher accuracy than the original single model. Han et al. also built a model combining wavelet transform and SVM to solve the prediction problem of shipping indices. Leonov and Nikolov used a hybrid model of wavelets and neural networks to study the fluctuation of freight rates on the Baltic Panama 2A and Baltic Panama 3A routes. However, because the Fourier transform has difficulty handling abrupt changes in the data, wavelet analysis is not adaptive, and shipping indices are highly volatile and noisy, these two methods are not effective for this type of financial time series. Empirical mode decomposition techniques are usually applied to non-stationary and non-linear signals. Huang et al., members of the US Academy of Engineering, proposed the Empirical Mode Decomposition (EMD) method, which can adaptively decompose a non-linear signal into multiple Intrinsic Mode Functions (IMFs) and can effectively suppress continuous noise such as Gaussian noise.
Therefore, the combination of EMD and forecasting models has gradually attracted attention. Zeng et al. used a combined EMD and ANN model to forecast the Baltic Dry Index. Chen et al. combined empirical mode decomposition (EMD), component reconstruction and grey wave prediction to simulate the China Container Freight Index (CCFI). Li et al. constructed a forecasting model that combines EMD decomposition with GMDH of the AC algorithm to forecast NYMEX crude oil futures, and the results showed that the model significantly improved forecasting accuracy. Awajan et al. used EMD-HW bagging to predict stock market data, and the results show that the accuracy of the model improves significantly after EMD decomposition. However, EMD cannot suppress intermittent noise and mixed noise. To solve this problem, Wu et al. proposed Ensemble Empirical Mode Decomposition (EEMD), an improved form of empirical mode decomposition that addresses the mode mixing caused by noise. In the EEMD algorithm, a set of white noise realizations is superimposed on the original signal, and each noisy copy is decomposed into several IMFs. The average of the corresponding IMFs is taken as the final result, so that EEMD separates the noise in different IMFs from the components of the original signal and thereby eliminates the mode mixing phenomenon. However, the EEMD approach requires a large number of averages to reduce the Gaussian white noise added during processing, which makes its computational cost significantly higher. Yeh and Shieh proposed the Complementary Ensemble Empirical Mode Decomposition (CEEMD), which adds a pair of opposite white noise signals to the original signal and then performs the EMD decomposition on each separately. This ensures that the decomposition is at least as good as that of EEMD, while reducing the reconstruction error caused by the white noise, and it can significantly reduce the prediction difficulty for the model.
In general, most of the existing literature on shipping index forecasting relies on traditional econometric models or simple machine learning models, and few papers combine empirical mode decomposition with neural network models and apply them to shipping market forecasting. Since deep learning has shown good prediction performance in various fields, we explore its effectiveness for shipping index prediction in this article. Building on the literature above, this article proposes a combined model, the CEEMD-PSO-BiLSTM model, which follows the idea of "decomposition-reconstruction-integration" and combines the empirical mode decomposition method with a neural network model. The method first decomposes the shipping index by CEEMD, recombines the IMFs by reconstruction, then builds a PSO-BiLSTM model to predict each sub-sequence in turn, and finally integrates the sub-sequence predictions to obtain the final prediction. In this article, six major shipping indices in the Chinese shipping market (Far Eastern Dry Index, Baltic Dry Index, Sea Bulk Composite Freight Index, China Coastal Coal Freight Index, New York Crude Oil Index and China Import Crude Oil Freight Index) are selected as test objects, and the proposed method is compared with existing mainstream machine learning forecasting models, as well as with different combined models based on them, to verify its effectiveness.
3.1 CEEMD model
CEEMD is based on EMD. To address the mode aliasing (mode mixing) problem of EMD in IMF decomposition, several groups of independent and identically distributed Gaussian white noise are added to the original sequence in positive and negative pairs, and EMD decomposition is then carried out to obtain the IMF components. The paired noise also reduces the residual noise that EEMD leaves in the reconstructed sequence. The specific CEEMD algorithm process is as follows.
- Generate T pairs of positive and negative Gaussian white noise and add them to the original sequence, so that there are 2T signals:
$$x_1^{i} = X + \varepsilon_0\,\omega_i, \qquad x_2^{i} = X - \varepsilon_0\,\omega_i, \qquad i = 1, \dots, T \tag{1}$$
x1 and x2 are the signals after the addition of the positive and negative paired white noise, respectively; X is the original time series; ωi is the Gaussian white noise, which follows a normal distribution; ε0 is the standard deviation of the added noise; T is the number of times the noise is added (or the number of pooled samples).
- Decompose each signal with the EMD method, so that each signal yields a group of IMF components, where the jth component in the ith decomposition is denoted as $IMF_{ij}$. The corresponding IMF components are averaged over the 2T decompositions to obtain each IMF component $IMF_j$:
$$IMF_j = \frac{1}{2T} \sum_{i=1}^{2T} IMF_{ij} \tag{2}$$
- The final decomposition result of CEEMD, with n IMF components and the residual RES, is:
$$X(t) = \sum_{j=1}^{n} IMF_j(t) + RES(t) \tag{3}$$
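The role of the positive and negative noise pairs can be illustrated with a small sketch. It does not perform EMD itself; it only builds the 2T noisy copies of a series and checks that their pointwise average returns the original series, which is the property that lets CEEMD keep the reconstruction error from the added noise small. The function names, the toy series and the use of `eps0` as an absolute noise standard deviation are assumptions made for illustration.

```python
import random

def make_noise_pairs(x, T, eps0, seed=0):
    """Build the 2T noisy copies of x: x + eps0*w and x - eps0*w for T noise draws."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(T):
        w = [rng.gauss(0.0, 1.0) for _ in x]            # one white-noise realisation
        plus = [xi + eps0 * wi for xi, wi in zip(x, w)]
        minus = [xi - eps0 * wi for xi, wi in zip(x, w)]
        pairs.append((plus, minus))
    return pairs

def ensemble_mean(pairs):
    """Average all 2T signals point by point."""
    n = len(pairs[0][0])
    total = [0.0] * n
    for plus, minus in pairs:
        for k in range(n):
            total[k] += plus[k] + minus[k]
    return [s / (2 * len(pairs)) for s in total]

x = [float(k % 7) for k in range(50)]                   # toy series, not shipping data
pairs = make_noise_pairs(x, T=100, eps0=0.2)
mean_signal = ensemble_mean(pairs)
max_residual = max(abs(m - xi) for m, xi in zip(mean_signal, x))
print(max_residual)                                     # noise cancels up to rounding error
```

In a full CEEMD run, each of the 2T signals would be passed to an EMD routine before averaging the resulting IMFs; with unpaired noise, as in EEMD, the same average would only approach the original series as T grows.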
3.2 LSTM model
The LSTM network was first proposed and designed by Hochreiter, and then improved by Schmidhuber et al. As shown in Fig 1, forget gates are added to the model to avoid the gradient explosion or vanishing gradient problems of RNNs, which makes it suitable for sequential prediction. A simple recurrent neural network consists of an input layer, a hidden layer and an output layer.
Specifically, one neuron in the LSTM model contains one cell and three gate mechanisms. The cell state is the key to the LSTM model; it acts as the memory space of the model. The cell state changes with time, and the information it records is determined and updated by the gate mechanisms. A gate is a way to let information pass through selectively, and it is realized by a sigmoid function and an element-wise multiplication. The sigmoid function takes a value between 0 and 1, and the multiplication determines how much of each part of the information is passed on. When the sigmoid output is 0, the information is discarded; when it is 1, the information is fully transferred (i.e. fully remembered).
The LSTM has three gates to protect and control the cell state: a forget gate, an update (input) gate and an output gate. The forget gate determines how much of the memory state from the previous moment is removed, that is, how much memory is retained; it takes the hidden state a<t−1> of the previous moment and the input x<t> of the current moment and, through sigmoid activation, selectively removes old information. The update gate determines how much new input information needs to be stored in the memory state at the current time; it also takes a<t−1> and x<t> and applies sigmoid activation. The output gate determines how much of the memory state is used for output at the current time; it takes a<t−1> and x<t> and uses both tanh and sigmoid activations. Information transmission is completed by these three gates.
- 1. Forget gate
The network determines which information is forgotten from the cell state through the sigmoid function of the forget gate, with the following equation.
$$\Gamma_f = \sigma\left(w_f\left[a^{\langle t-1\rangle}, x^{\langle t\rangle}\right] + b_f\right) \tag{4}$$
a<t−1> is the output at moment t−1; x<t> is the input to this layer at moment t; wf is the weight of each variable; bf is the bias term; σ is the sigmoid function of the form σ(x) = (1 + e^(−x))^(−1). Γf assigns to each value in the cell state c<t−1> a number between 0 and 1, where 1 means "keep all" and 0 means "discard all".
- 2. Update gate
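The update gate is responsible for updating the information stored in the cell in a three-step operation.
Step 1: Compute the sigmoid function of the update gate Γu to determine which values need to be updated.
Step 2: Create a new vector of candidate values with the tanh function, to be added to the cell state.
Step 3: Multiply the old cell state by the forget gate (Γf) to forget some of the past information, as human memory does, and then add the new information to the memory. The exact formulas are shown below.
$$\Gamma_u = \sigma\left(w_u\left[a^{\langle t-1\rangle}, x^{\langle t\rangle}\right] + b_u\right) \tag{5}$$
$$\tilde{c}^{\langle t\rangle} = \tanh\left(w_c\left[a^{\langle t-1\rangle}, x^{\langle t\rangle}\right] + b_c\right) \tag{6}$$
$$c^{\langle t\rangle} = \Gamma_f * c^{\langle t-1\rangle} + \Gamma_u * \tilde{c}^{\langle t\rangle} \tag{7}$$
tanh is the hyperbolic tangent function of the form tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)); wu, bu, wc and bc are the corresponding weights and bias terms; c<t−1> is the value of the cell state at moment t−1; Γu is the information to be remembered, extracted from the input information at moment t; c̃<t> is the new candidate value; and c<t> is the updated cell state value.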
- 3. Output gate
The output gate determines the information that is output, based on the current cell state. The sigmoid function determines which part of the information is output, and the tanh function processes c<t>; Γo and tanh(c<t>) are then multiplied to obtain the output value at moment t. The equations are shown below.
$$\Gamma_o = \sigma\left(w_o\left[a^{\langle t-1\rangle}, x^{\langle t\rangle}\right] + b_o\right) \tag{8}$$
$$a^{\langle t\rangle} = \Gamma_o * \tanh\left(c^{\langle t\rangle}\right) \tag{9}$$
The internal processing of one neuron is accomplished through the three gate mechanisms: forget gate, update gate and output gate. An LSTM model formed by multiple neurons in series has a selective memory function, which allows the model to retain information from long stretches of past data.
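The three gates described in this section can be traced in a minimal scalar sketch of one LSTM time step. For readability it uses a single hidden unit, so every weight is a number rather than a matrix; the parameter layout (`p` maps a gate name to a tuple of weights and bias) and all names are assumptions made for illustration, not the configuration used in the experiments.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, a_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to (w_a, w_x, b)."""
    def pre(name):
        w_a, w_x, b = p[name]
        return w_a * a_prev + w_x * x_t + b

    gamma_f = sigmoid(pre("f"))                  # forget gate: share of old state kept
    gamma_u = sigmoid(pre("u"))                  # update gate: share of candidate stored
    c_tilde = math.tanh(pre("c"))                # candidate value
    c_t = gamma_f * c_prev + gamma_u * c_tilde   # new cell state
    gamma_o = sigmoid(pre("o"))                  # output gate
    a_t = gamma_o * math.tanh(c_t)               # new hidden output
    return a_t, c_t

# With all weights and biases at zero, every gate equals 0.5 and the candidate is 0.
zero = {name: (0.0, 0.0, 0.0) for name in ("f", "u", "c", "o")}
a_t, c_t = lstm_step(x_t=1.0, a_prev=0.3, c_prev=2.0, p=zero)

# A strongly negative forget-gate bias discards almost all of the old cell state.
forgetful = dict(zero, f=(0.0, 0.0, -50.0))
_, c_forgot = lstm_step(x_t=1.0, a_prev=0.3, c_prev=2.0, p=forgetful)
```

Running the step repeatedly over a sequence, feeding `a_t` and `c_t` back in as `a_prev` and `c_prev`, gives the chained behaviour described above.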
3.3 BiLSTM model
The LSTM model is only a unidirectional neural network. A unidirectional network can only receive information transmitted in the forward direction, whereas in practical applications the output may be influenced by both the preceding and the following information. In time series prediction, a bi-directional LSTM model can effectively constrain the range of results on the training set and make the network generalise better, thus improving the fit to the test set. As shown in Fig 2, the BiLSTM network structure consists of four layers: Input Layer, Forward Layer, Backward Layer and Output Layer.
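As can be seen in the network structure in Fig 2, the Forward and Backward layers together influence the output layer, where w1−w6 are six shared weight values. The data is computed once in the Forward layer in the forward direction, and the output of the forward hidden layer is saved for each moment. The data is computed once in the Backward layer in the reverse direction, and the output of the backward hidden layer is stored for each moment. The final output is obtained by combining the results of the Forward and Backward layers at the corresponding moments. The mathematical expression of the process is as follows, where x_t is the input at moment t, h_t and h'_t are the outputs of the forward and backward hidden layers, o_t is the final output, and f and g are activation functions.
$$h_t = f\left(w_1 x_t + w_2 h_{t-1}\right) \tag{10}$$
$$h'_t = f\left(w_3 x_t + w_5 h'_{t+1}\right) \tag{11}$$
$$o_t = g\left(w_4 h_t + w_6 h'_t\right) \tag{12}$$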
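The forward and backward passes can be sketched in a few lines. To keep the example short, a plain tanh recurrent unit stands in for the LSTM cell, and the output layer is linear; both choices, and the way the six weights w1 to w6 are assigned to connections, are assumptions made for illustration.

```python
import math

def bidirectional_pass(xs, w1, w2, w3, w4, w5, w6):
    """Run a forward and a backward recurrent pass and combine them per time step."""
    n = len(xs)

    # Forward layer: the state at t depends on x_t and the state at t-1.
    fwd = [0.0] * n
    h = 0.0
    for t in range(n):
        h = math.tanh(w1 * xs[t] + w2 * h)
        fwd[t] = h

    # Backward layer: the state at t depends on x_t and the state at t+1.
    bwd = [0.0] * n
    h = 0.0
    for t in reversed(range(n)):
        h = math.tanh(w3 * xs[t] + w5 * h)
        bwd[t] = h

    # Output layer: combine both hidden states at the same moment.
    return [w4 * f + w6 * b for f, b in zip(fwd, bwd)]

xs = [0.1, -0.4, 0.7, 0.2]
weights = dict(w1=0.5, w2=0.3, w3=-0.2, w4=1.0, w5=0.4, w6=1.0)
out = bidirectional_pass(xs, **weights)

# Changing only the last input also changes the first output, via the backward layer.
out_changed = bidirectional_pass([0.1, -0.4, 0.7, 0.9], **weights)
```

Setting `w6` to zero removes the backward contribution, and the first output then no longer reacts to later inputs, which is the unidirectional behaviour the section contrasts with.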
3.4 Particle swarm optimization algorithm
Particle Swarm Optimization (PSO) is an algorithm proposed by Eberhart and Kennedy in 1995. It was originally constructed by simulating the foraging behavior of a flock of birds. The idea of PSO is as follows: particles are placed in a multi-dimensional space, each particle can be regarded as an individual bird foraging in that space, and the coordinates of the food can be regarded as the parameters of the global optimal solution. The search for an optimal solution by a particle can be likened to the flight of a bird in search of food. The flying velocity of a particle is adjusted dynamically according to the historical optimal position of the particle and the historical optimal position of the population. A particle has only two properties, velocity and position: velocity represents the speed of movement, and position represents the direction of movement. The optimal solution found individually by each particle is called the individual extremum, and the best individual extremum in the swarm is taken as the current global optimal solution. During the iterations, the global optimal solution is finally obtained by updating the velocities and positions. The basic formulas of PSO are as follows:
$$v_{ij}^{t+1} = \omega\, v_{ij}^{t} + c_1 r_1 \left(p_{ij}^{t} - x_{ij}^{t}\right) + c_2 r_2 \left(p_{gj}^{t} - x_{ij}^{t}\right) \tag{13}$$
$$x_{ij}^{t+1} = x_{ij}^{t} + v_{ij}^{t+1} \tag{14}$$
t is the number of iterations, i is the ith particle, and j is the jth dimension; $v_{ij}^{t}$ is the velocity of particle i in the jth dimension at the tth iteration; ω is the inertia weight; c1 and c2 are the two acceleration coefficients; $x_{ij}^{t}$ is the spatial position of particle i at iteration t; $p_{ij}^{t}$ and $p_{gj}^{t}$ are the individual and global extreme points at iteration t; r1 and r2 are random numbers uniformly distributed in [0,1]. According to the formula, the velocity of the ith particle in the jth dimension at time t+1 is determined by three terms: the first is the velocity of the particle at time t, representing the inertia of the particle; the second is the influence of the particle’s own previous trajectory on the direction of its subsequent movement; the third is the influence of the trajectories of all particles on the direction of each particle’s subsequent motion.
The PSO procedure can be divided into the following 6 steps:
- Step 1: Initialize parameters. This includes setting the upper and lower limits of the search space, the two acceleration coefficients c1 and c2, the maximum number of iterations max_episode, and the maximum and minimum velocity of each particle. The initial position and velocity of each particle are set randomly.
- Step 2: Define the fitness function, calculate the fitness at the initial position of each particle, and save the initial global optimum fitness together with the current position and velocity of each particle.
- Step 3: Update the velocity and position of each particle for the current iteration according to Eq (13) and Eq (14).
- Step 4: Evaluate the fitness of each particle after the move and compare it with the particle’s historical individual optimum. If the current fitness is superior, the historical individual optimum fitness is replaced by the current fitness, and the particle’s best position is updated at the same time. If the historical individual optimum is superior, no replacement is performed.
- Step 5: Determine the best fitness among all particles after the update. If it is better than the historical global optimum, the global optimum is updated.
- Step 6: Check whether the stopping condition is met (for example, the accuracy required by the experiment has been reached, or the maximum number of iterations has been reached). If it is met, output the current optimal value and optimal parameters. If it is not met, repeat Step 3 to Step 6 until it is.
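The six steps above can be sketched as a short, self-contained routine. It minimises a toy sphere function rather than a BiLSTM validation error, and the parameter values (30 particles, inertia weight 0.7, acceleration coefficients 1.5, velocity limit 1.0) are illustrative assumptions, not the settings used in this article.

```python
import random

def pso(fitness, dim, bounds, n_particles=30, max_episode=200,
        omega=0.7, c1=1.5, c2=1.5, v_max=1.0, seed=0):
    """Minimise `fitness` over a box; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Steps 1-2: random initial positions and velocities, initial fitness and optima.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(max_episode):                      # Step 6: fixed iteration budget
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 3: velocity update (inertia, individual term, swarm term).
                v = (omega * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))
                pos[i][j] = max(lo, min(hi, pos[i][j] + vel[i][j]))
            # Step 4: update the individual optimum if the new position is better.
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
        # Step 5: update the global optimum from the individual optima.
        g = min(range(n_particles), key=lambda i: pbest_fit[i])
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    return gbest, gbest_fit

def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)

best_pos, best_fit = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

For hyperparameter tuning, `fitness` would instead train a network with the hyperparameters encoded in the particle position and return a validation error, which makes each evaluation far more expensive than in this toy case.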
3.5 Model construction of CEEMD-PSO-BiLSTM for shipping index
As discussed above, shipping indices are non-stationary, non-linear and highly complex, and a single deep learning prediction method cannot accurately capture their main characteristics. The modeling process of the CEEMD-PSO-BiLSTM model is shown in Fig 3. Following the idea of "decomposition-reconstruction-integration", the original time series is first decomposed into multiple intrinsic mode function components by the CEEMD decomposition method. Different intrinsic mode function components reveal the characteristics of the shipping index at different time scales, and the high-frequency and low-frequency components are distinguished by a one-sample t-test. The low-frequency components are then recombined. The recombined sequences are each predicted by a deep learning model: drawing on the strong long-memory performance of BiLSTM in time series prediction, each component is characterized, learned and fitted in order to predict it accurately. Finally, the prediction results of the components are combined to form the final prediction. The specific modeling steps are as follows:
Step 1: Carry out CEEMD decomposition on the original sequence of the shipping index to obtain n intrinsic mode function components (IMF1, IMF2, ……, IMFn in turn) and the trend term RES.
Step 2: Apply a one-sample t-test against a zero mean to all intrinsic mode function components, and identify the first component with α > 0.05 (denoted IMFm) together with its subsequent components as the low-frequency components (IMFm, ……, IMFn), which reflect the cyclical trend of the shipping index; combine the low-frequency components into a new component LF. The components before IMFm are the high-frequency components (IMF1, IMF2, ……, IMFm-1), which indicate random fluctuations in the short term. RES is the trend term, which reflects the long-term trend of the original series.
Step 3: The IMF1, IMF2, ……, IMFm-1, LF and RES subsequences are each modeled and predicted with a BiLSTM neural network, yielding a predicted value for each subsequence, and the PSO algorithm is used to optimize the hyperparameters of each prediction model.
Step 4: Combine the predictions from Step 3. The prediction of the original sequence is obtained by integrating, that is summing, the subsequence predictions.
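The reconstruction and integration steps can be sketched as follows. The sketch takes the split index m as given (in the procedure above it comes from the t-test in Step 2) and uses a naive "repeat the last value" forecaster as a stand-in for the PSO-BiLSTM predictor; the function names and the toy components are assumptions made for illustration.

```python
def regroup(imfs, res, m):
    """Keep IMF1..IMF(m-1) as high-frequency series, sum IMFm..IMFn into LF, keep RES."""
    high = [list(c) for c in imfs[:m - 1]]
    low = imfs[m - 1:]
    lf = [sum(vals) for vals in zip(*low)]       # pointwise sum of low-frequency IMFs
    return high + [lf, list(res)]

def persistence_forecast(series):
    """Placeholder predictor: the next value is assumed equal to the last one."""
    return series[-1]

def integrate(subseries, forecaster):
    """Forecast every sub-series separately and add the forecasts together."""
    return sum(forecaster(s) for s in subseries)

# Toy components (not real CEEMD output): three IMFs and a trend term.
imfs = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, 0.5, -0.5, -0.5],
    [0.2, 0.1, 0.0, -0.1],
]
res = [10.0, 11.0, 12.0, 13.0]

subseries = regroup(imfs, res, m=2)              # IMF1, LF = IMF2 + IMF3, RES
forecast = integrate(subseries, persistence_forecast)
```

Because the sub-series sum back to the original series, any error in the final forecast is the sum of the component errors, which is why each component is fitted with its own tuned model.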
4. Empirical analysis
4.1 Data sources
This article focuses on six major Chinese shipping indices: the Baltic Dry Index (BDI), the Consolidated Maritime Bulk Freight Index (CBFI), the China Coastal Coal Freight Index (CBCFI), the New York Crude Oil (CFD), the Far East Dry Bulk Composite Index (FDI) and the China Crude Oil Import Freight Index (CTFI). The price series of all six indices are taken from the RESSET database. Table 1 shows the descriptive statistics of the six shipping indices, together with the start date and end date of the data selected in this article; CBFI is measured weekly, and the remaining five shipping indices are measured daily. The skewness and kurtosis values of the six shipping indices indicate that the indices are not normally distributed.
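As a reminder of how skewness and kurtosis signal non-normality, the following sketch computes both from population moments. A normal distribution has skewness 0 and excess kurtosis 0, so values far from these suggest a non-normal series. The function name and the sample values are illustrative assumptions and are not taken from Table 1.

```python
def skewness_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis computed from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt

# A symmetric toy sample: zero skewness, flatter than a normal distribution.
skew, kurt = skewness_and_excess_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])

# A sample with one large value on the right is positively skewed.
right_skew, _ = skewness_and_excess_kurtosis([1.0, 1.0, 1.0, 10.0])
```

Statistical packages often apply small-sample corrections to these estimators, so reported values may differ slightly from the population formulas used here.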
4.2 Shipping index decomposition based on CEEMD
Because the six shipping indices are constructed in similar ways, the Far East Dry Bulk Composite Index (FDI) is used here as a representative for analysis. Both the EMD and CEEMD operations in this experiment were performed in Matlab 2020a. In this article, the CEEMD ensemble number NE is set to 1000, the ratio of additional noise standard deviation to the original series standard deviation is 0.2, and the maximum number of iterations for each component is 1000. The CEEMD decomposition results are shown in Fig 4.
As can be seen from the left of Fig 4, the first series is the raw trading price data of the Far East Dry Bulk Composite Index from 28 November 2017 to 27 October 2021. CEEMD decomposes the data into 9 series: the first 8 are intrinsic mode functions (IMF1~IMF8), whose fluctuation frequency gradually decreases from top to bottom, and the last one is the residual res. The first IMF components show complex, irregular, high-frequency fluctuations around a mean of 0. They are called high-frequency components and can be regarded as the short-term fluctuations of the FDI time series, driven by factors such as short-term investor sentiment and short-term policy changes that move shipping market prices in the short term. The later IMF components fluctuate at lower frequencies and are called low-frequency components; they reflect major events in the shipping market, the implementation of important policies or the impact of economic cycles, and thus the long-term change pattern of the time series. The last item is the trend term: Fig 4 shows the res component turning from flat to rising, reflecting the steady long-term increase of the Far East Dry Bulk Composite Index over the four years.
This article further plots the spectrum of each component using the fast Fourier transform (FFT), which provides a more intuitive analysis of the impact of each IMF on the FDI price series. The horizontal axis of the spectrum is frequency and the vertical axis is amplitude: the lower the frequency of a component, the longer-lasting its influence on the original series, and the larger its amplitude, the greater its influence. The spectrum on the right of Fig 4 shows that from IMF1 to IMF8 the frequency decreases and the amplitude increases, indicating that the lower-frequency components, which are characterized by periodic fluctuations, have a more significant and far-reaching effect than the higher-frequency components.
To determine the high-frequency and low-frequency components of the IMF series, the method used by Li and Feng [39] is followed here, in which the high-frequency components are assumed to fluctuate around a mean of 0. A one-sample t-test with a mean of 0 is conducted for each IMF, and the first IMF series whose mean deviates significantly from 0 is used as the boundary between the high-frequency and low-frequency components.
From Table 2, the one-sample t-test for the IMF1 sequence has a t-value of -0.106, a mean difference of -0.041, and a p-value of 0.916, which is greater than 0.05, indicating that the mean of this sequence is not significantly different from zero at the α = 0.05 level. The IMF5 sequence is judged to be a low-frequency component because its p-value of 0.001 is less than 0.05, indicating that the mean of IMF5 is significantly different from zero. Note that the p-values of the IMF6 and IMF7 sequences are 0.218 and 0.759 respectively, which are greater than the significance level of 0.05, so their means are not significantly different from zero. We still judge them to be low-frequency components, because IMF5 was already judged to be a low-frequency component in the previous step and the frequency of the IMF sequences decreases gradually, as shown in the spectrum of Fig 4.
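The splitting rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`one_sample_t`, `split_components`, `merge_low_frequency`) are hypothetical, the series are synthetic stand-ins rather than IMFs of the FDI, and the two-sided p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable only for long series.

```python
import math
from statistics import NormalDist, mean, stdev


def one_sample_t(series, mu0=0.0):
    """Return (t, p) for the null hypothesis mean(series) == mu0.

    Assumption: the two-sided p-value is taken from the standard normal
    distribution, a large-sample approximation to the t distribution.
    """
    n = len(series)
    t = (mean(series) - mu0) / (stdev(series) / math.sqrt(n))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


def split_components(imfs, alpha=0.05):
    """Split IMFs (ordered IMF1..IMFn) into high- and low-frequency groups.

    The first IMF whose mean differs significantly from zero, and every
    IMF after it, is treated as low frequency.
    """
    for k, imf in enumerate(imfs):
        _, p = one_sample_t(imf)
        if p < alpha:
            return imfs[:k], imfs[k:]
    return imfs, []  # no IMF deviates significantly from zero


def merge_low_frequency(low):
    """Sum the low-frequency IMFs point by point into one LF series."""
    return [sum(values) for values in zip(*low)]


# Synthetic stand-ins for IMFs (not data from the paper).
n = 400
imf_a = [math.sin(2 * math.pi * 40 * i / n) for i in range(n)]       # mean 0
imf_b = [0.5 + math.sin(2 * math.pi * 4 * i / n) for i in range(n)]  # mean 0.5
imf_c = [math.sin(2 * math.pi * 1 * i / n) for i in range(n)]        # mean 0

high, low = split_components([imf_a, imf_b, imf_c])
lf = merge_low_frequency(low)
print(len(high), len(low))  # 1 2
```

As in the text, the last synthetic series has a mean close to zero but is still placed in the low-frequency group, because it follows the first series that fails the zero-mean test.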
4.3 Selection of evaluation indicators
The experiments in this article were carried out in Python 3.7, with models built and run using Keras and TensorFlow 2.5. The FDI series was decomposed and reconstructed into high-frequency components, a low-frequency component, and the trend term res. A BiLSTM prediction model was constructed for each subsequence separately, and the models were then used to make rolling predictions for each subsequence over the prediction interval. Four evaluation metrics were used: root mean square error (RMSE), mean absolute error (MAE), symmetric mean absolute percentage error (SMAPE) and coefficient of determination (R2).
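- 1. RMSE
The range is [0, +∞). When the predicted values are exactly equal to the true values, the RMSE is 0, which indicates a perfect model; the greater the error, the greater the RMSE.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{15}$$
- 2. MAE
The range is [0, +∞). When the predicted values are exactly equal to the true values, the MAE is 0, which indicates a perfect model; the greater the error, the greater the MAE.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{16}$$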
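- 3. SMAPE
The range is [0, 200%]. If the value of SMAPE is 0, it is a perfect model, and if the value of SMAPE is greater than 100%, it means an inferior model.
$$\mathrm{SMAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\frac{\left|\hat{y}_i - y_i\right|}{\left(\left|\hat{y}_i\right| + \left|y_i\right|\right)/2} \tag{17}$$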
- 4. R2
The range is [0, 1]. R2 reflects the extent to which the independent variable x explains the changes in the dependent variable y. The closer its value is to 1, the better the model fits.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{18}$$
Here $y_i$ is the true value of the series, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the true values, and n is the total number of predicted data points.
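For readers who want to compute the four indicators, the following is a minimal Python sketch assuming the standard definitions of RMSE, MAE and R2. The SMAPE variant used here (the mean of the absolute true and predicted values in the denominator) is an assumption, since several variants exist. The sample values are made up for illustration and are not results from this article.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant, see lead-in).

    Undefined when a true value and its prediction are both zero.
    """
    n = len(y_true)
    return 100.0 / n * sum(
        abs(p - t) / ((abs(t) + abs(p)) / 2.0) for t, p in zip(y_true, y_pred)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [100.0, 110.0, 120.0, 130.0]
y_pred = [102.0, 108.0, 123.0, 129.0]

print(rmse(y_true, y_pred), mae(y_true, y_pred))
print(smape(y_true, y_pred), r2(y_true, y_pred))
```

Because the component series are standardized before prediction, the metrics are only comparable across models when they are computed on the same scale, for example after mapping the forecasts back to the original units.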
In order to eliminate the influence of different scales between sequences and improve the computational efficiency of the model, the Z-score is used to standardize each sequence before prediction. The first 70% of the sample is used as the training set, the validation set is the first 10% of the training set, and the last 30% of the sample is the test set. Rolling prediction is adopted: the price of the Far East Dry Bulk Composite Index in the next period is predicted from the data of the previous 10 days, so look_back is set to 10. The standardized series are each predicted by a BiLSTM model. Taking FDI as an example, after repeated adjustment and comparison of parameters, the first layer is an LSTM layer with 32 nodes, the second layer is a BiLSTM layer with 32 nodes, the batch size batch_size is set to 6, the number of training epochs is set to 50, and the optimizer is Adam.
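The preprocessing described above (Z-score standardization, a chronological 70/30 split and sliding windows with look_back = 10) can be sketched in plain Python as follows. The helper names are hypothetical and the input is placeholder data. One assumption is worth flagging: the sketch standardizes the whole series at once, as the text suggests, whereas computing the mean and standard deviation on the training part only would avoid leaking test-set statistics.

```python
from statistics import mean, pstdev


def zscore(series):
    """Standardize a series; also return (mu, sigma) to undo the scaling."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series], (mu, sigma)


def train_test_split(series, train_ratio=0.7):
    """Chronological split: the first part trains, the rest tests."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def make_windows(series, look_back=10):
    """Build (inputs, targets): each target is predicted from the
    look_back values immediately before it."""
    inputs, targets = [], []
    for i in range(len(series) - look_back):
        inputs.append(series[i:i + look_back])
        targets.append(series[i + look_back])
    return inputs, targets


series = [float(v) for v in range(100)]  # placeholder data, not FDI prices
scaled, (mu, sigma) = zscore(series)
train, test = train_test_split(scaled)
x_train, y_train = make_windows(train, look_back=10)
print(len(train), len(test), len(x_train))  # 70 30 60
```

Each row of `x_train` holds 10 consecutive standardized values and the matching entry of `y_train` is the value that follows them, which is the input shape a recurrent model expects after adding a feature dimension.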
4.4 Reorganization and optimization of IMF sequence
The analysis in the previous section shows that the high-frequency components contain a lot of noise and short-term random influences, while the low-frequency components are smoother time series. If the high-frequency components are combined for prediction, the large amount of noise in them does not cancel out but instead amplifies the effect of their random factors, so the high-frequency components are predicted one by one. The low-frequency components, being smoother, are combined into a new series (named LF) for prediction in order to improve the prediction efficiency of the model. The specific formula is as follows:
$$\hat{y} = \sum_{i=1}^{m} \widehat{\mathrm{IMF}}_i + \widehat{\mathrm{LF}} + \widehat{\mathrm{res}} \tag{19}$$
Here, m denotes the number of high-frequency components, $\widehat{\mathrm{IMF}}_i$ denotes the predicted value of the i-th high-frequency component, $\widehat{\mathrm{LF}}$ denotes the predicted value of LF after recombination of the low-frequency components, $\widehat{\mathrm{res}}$ denotes the predicted value of the trend term res, and $\hat{y}$ is the final predicted value.
In this article, we take the FDI as an example, and the reconstructed components are shown in Fig 5. The data of the other five shipping indices are processed in a similar way, which is not explained in detail due to space limitations. The four high-frequency components IMF1, IMF2, IMF3 and IMF4 of the FDI after CEEMD decomposition, the new series LF obtained by recombining the low-frequency components, and the trend term res are each predicted by the BiLSTM model, and the prediction results of all series are added together to form the final prediction. This combined prediction method is named CEEMD-BiLSTM1.
In order to verify the superiority of the IMF recombination method in this article, three alternatives are compared. Following previous practice, the high-frequency components are recombined into a new sequence named HF; HF, LF and the trend term res are each predicted by the BiLSTM model and summed, and this method is named CEEMD-BiLSTM2. HF, the four low-frequency components IMF5 to IMF8, and res are each predicted by the BiLSTM model and summed, and this method is named CEEMD-BiLSTM3. Finally, the decomposed components are predicted directly by the BiLSTM model without any recombination, and this method is named CEEMD-BiLSTM4.
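The four recombination strategies differ only in which components are merged before forecasting. A minimal Python sketch of that grouping is given below, using hypothetical helper names (`add_series`, `recombine`) and toy constant series instead of real IMFs. In the full method each returned subsequence would be forecast by its own model and the forecasts would then be summed; the sketch only checks that every grouping still adds up to the original decomposition.

```python
def add_series(series_list):
    """Point-by-point sum of equally long series."""
    return [sum(values) for values in zip(*series_list)]


def recombine(imfs, res, n_high, strategy):
    """Return the list of subsequences to forecast under one strategy.

    imfs is ordered IMF1..IMFn, n_high is the number of high-frequency
    IMFs, and strategy 1-4 follows the CEEMD-BiLSTM1..4 naming.
    """
    high, low = imfs[:n_high], imfs[n_high:]
    if strategy == 1:  # high-frequency IMFs kept separate, LF merged
        return high + [add_series(low), res]
    if strategy == 2:  # HF merged, LF merged
        return [add_series(high), add_series(low), res]
    if strategy == 3:  # HF merged, low-frequency IMFs kept separate
        return [add_series(high)] + low + [res]
    if strategy == 4:  # no recombination
        return imfs + [res]
    raise ValueError("strategy must be 1, 2, 3 or 4")


# Toy components (constant series), for illustration only.
imfs = [[float(k)] * 3 for k in range(1, 9)]  # stand-ins for IMF1..IMF8
res = [10.0] * 3
original = add_series(imfs + [res])

for strategy in (1, 2, 3, 4):
    parts = recombine(imfs, res, n_high=4, strategy=strategy)
    # Summing the subsequences always recovers the original series.
    print(strategy, len(parts), add_series(parts) == original)
```

With four high-frequency and four low-frequency IMFs, the strategies yield 6, 3, 6 and 9 subsequences respectively, which is also the number of models that have to be trained for each strategy.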
Using the model parameters determined in this article, the optimal IMF combination was determined from the prediction evaluation indicators. As shown in Table 3, after the FDI was decomposed by CEEMD, processed with the four different IMF combination methods and predicted by the BiLSTM model, the first combination method is the optimal one: its RMSE, SMAPE, MAE and R2 improve by 21.45%, 24.23%, 31.63% and 0.4% respectively over the second combination method, by 58.61%, 64.25%, 65.23% and 2.95% respectively over the third combination method, and by 53.4%, 62.1%, 63.55% and 2.18% respectively over the fourth combination method. The first combination approach chosen in this article is therefore advantageous in forecasting the Far East Dry Bulk Composite Index. Fig 6 shows the forecasting effect under the four IMF combination methods.
4.5 PSO to optimize the BiLSTM
According to the experiments in the previous section, we will adopt the combination of CEEMD-BiLSTM1 for prediction. However, since the complexity of each sequence after CEEMD decomposition and recombination is different, it is unscientific to use the same hyperparameters in the model. Therefore, in order to match the model network structure with the characteristics of shipping index, we used PSO algorithm to optimize the model hyperparameters separately when each sequence was predicted by the BiLSTM model.
We set the number of neurons in the LSTM layer, the number of neurons in the BiLSTM layer and the batch size as the optimization objects of the PSO algorithm, so the search dimension is 3: the range of the LSTM layer is 16~64, the range of the BiLSTM layer is 16~64, and the range of the batch size is 6~16. The inertia weight ω decreases linearly from 0.9 to 0.4, c1 decreases linearly from 2.5 to 0.5, c2 increases linearly from 0.5 to 2.5, and the number of particles is 10. To avoid PSO falling into a local optimum, we dynamically adjust the acceleration according to the iteration number. The acceleration function is: (20)
Here c stands for the acceleration coefficients c1 and c2, episode_i is the current iteration number, and episode_max is the maximum iteration number. We set the maximum iteration number in this experiment to 20.
The function of particle fitness is the average percentage error: (21)
Here N is the length of the test set data, $\hat{y}_i$ is the predicted value, and $y_i$ is the actual value.
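To make the optimization loop concrete, the following is a minimal, self-contained PSO sketch in Python using the settings stated above: 10 particles, 20 iterations, three integer hyperparameters with the given ranges, and linearly scheduled ω, c1 and c2. Several parts are assumptions rather than the authors' implementation: the exact form of the linear schedule (`linear`), the reading of the fitness as a mean absolute percentage error (`mape`), the rounding of positions to integers, and above all `toy_fitness`, a hypothetical stand-in for "train a BiLSTM with these hyperparameters and score it" that pretends (32, 48, 8) is the best setting.

```python
import random


def linear(start, end, step, max_steps):
    """Linearly interpolate from start (first step) to end (last step)."""
    return start + (end - start) * step / max(max_steps - 1, 1)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (assumed fitness)."""
    return sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def pso(fitness, bounds, n_particles=10, max_iter=20, seed=0):
    """Minimize fitness over integer hyperparameters within bounds."""
    rng = random.Random(seed)
    dim = len(bounds)

    def to_params(position):
        return [int(round(v)) for v in position]

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(to_params(p)) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]

    for it in range(max_iter):
        w = linear(0.9, 0.4, it, max_iter)   # inertia weight
        c1 = linear(2.5, 0.5, it, max_iter)  # cognitive acceleration
        c2 = linear(0.5, 2.5, it, max_iter)  # social acceleration
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            fit = fitness(to_params(pos[i]))
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return to_params(gbest), gbest_fit, history


def toy_fitness(params):
    """Hypothetical stand-in for training and scoring a BiLSTM."""
    target = (32, 48, 8)
    bias = sum(abs(p - t) for p, t in zip(params, target)) / 100.0
    y_true = [100.0, 120.0, 90.0]
    y_pred = [y * (1.0 + bias) for y in y_true]
    return mape(y_true, y_pred)


# Search ranges: LSTM units, BiLSTM units, batch size.
bounds = [(16, 64), (16, 64), (6, 16)]
best, best_fit, history = pso(toy_fitness, bounds)
print(best, best_fit)
```

Starting with a large c1 and a small c2 lets particles explore around their own best positions early on, while the reversed weights late in the run pull the swarm toward the global best. In a real run every fitness call trains a network, so 10 particles over 20 iterations cost roughly 210 training runs per subsequence.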
As can be seen from Table 4, all four evaluation indicators improve significantly after PSO optimization, and CEEMD-PSO-BiLSTM is the optimal combination. Compared with CEEMD-BiLSTM, the RMSE, SMAPE and MAE of CEEMD-PSO-BiLSTM are reduced by 15.12%, 13.77% and 7.44%, respectively, and the R2 is increased by 0.16%. It can be seen that the PSO algorithm improves the accuracy of the model.
5. Results and discussion
As the construction process of the CBFI, CBCFI, CFD, BDI and CTFI prediction models is similar to that of FDI, it is not repeated here due to space limitations. In this article, we constructed Support Vector Regression (SVR), Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM) and Bi-directional Long Short-Term Memory (BiLSTM) models as a comparison to verify the advantages of the BiLSTM model. Meanwhile, to verify the effectiveness of the CEEMD decomposition method on shipping data, this article also decomposes the original sequences with the empirical mode decomposition (EMD) method; the ensemble number NE is uniformly set to 1000, and the ratio of additional noise standard deviation to the original series standard deviation is uniformly set to 0.2. Five combined models (CEEMD-PSO-BiLSTM, CEEMD-LSTM, CEEMD-BiLSTM, EMD-LSTM, EMD-BiLSTM), including the model in this article, were constructed. Through these ten models, we aim to demonstrate the advantages of the CEEMD-PSO-BiLSTM model constructed in this article. Fig 7 shows the comparison of the prediction results of the five combined models.
As can be seen from Table 5, forecasting the six shipping indices with the "decomposition-reconstruction-integration" approach is significantly better than forecasting with a single model. After CEEMD decomposition, the prediction accuracy of the LSTM and BiLSTM models is greatly improved. Among the ten methods, the CEEMD-PSO-BiLSTM model achieves the best results on all four evaluation indicators for the six shipping indices. For series with more data, the advantage of the LSTM model in long time series can be fully exploited, and the prediction performance of the LSTM and BiLSTM models is similar. For CFD, for example, BiLSTM improves on LSTM only slightly, by 2.19%, 2.03%, 2.47% and 0.06% in RMSE, MAE, SMAPE and R2 respectively. This is because the forward LSTM layer plays the major role in such series, while the reverse layer can still capture information that may be overlooked by the forward layer, thus improving the forecasting performance of the model. For series with less data, the advantage of the LSTM model in long time series cannot be fully exploited, and the BiLSTM model obtains better forecasting results than LSTM after adding the reverse layer. Taking FDI as an example, because the amount of FDI data is small, the single BiLSTM model improves RMSE, MAE, SMAPE and R2 by 21.40%, 30.31%, 26.58% and 0.62% respectively compared with the LSTM model.
Among the combined models before optimization, the CEEMD-BiLSTM model, which is first decomposed by CEEMD and then recombined, performs best, with RMSE, SMAPE, MAE and R2 improved by 23.02%, 15.75%, 22.87% and 0.41%, 20.02%, 34.62%, 47.96%, 0.33%, respectively, over the EMD-LSTM model, by 8.63%, 10.91%, 15.72% and 0.12%, respectively, over the EMD-BiLSTM model, and by 19.28%, 20.51%, 15.98% and 0.32%, respectively, over the CEEMD-LSTM model. After PSO optimization, the CEEMD-PSO-BiLSTM model has good prediction accuracy for each shipping index. Fig 8 shows the SMAPE metrics for the six shipping indices under the five hybrid models, and it can be seen that the CEEMD-PSO-BiLSTM model has good prediction results for both long and short time series.
Analyzing the prediction performance of the above models on the six shipping indices, it can be found that: (1) the prediction effect of the integrated model constructed by the "decomposition-reconstruction-integration" method is significantly better than that of the single model; (2) the decomposition result of the CEEMD method for shipping data is better than that of the EMD method, indicating that CEEMD is better at feature decomposition; (3) the BiLSTM model obtains better prediction results than the LSTM model, and its advantage is larger for short time series; (4) the model optimized by PSO has better fitting effect and prediction accuracy. In summary, the CEEMD-PSO-BiLSTM model proposed in this article is a very effective method for analyzing and forecasting highly volatile and non-linear shipping index data, and has obvious advantages over other methods.
Due to the non-stationary and unstructured characteristics of shipping indices, this article proposes a CEEMD-PSO-BiLSTM model based on the idea of "decomposition-reconstruction-integration", and makes the modelling and forecasting of non-linear, multi-scale and highly complex time series feasible and efficient through a unique IMF restructuring approach. In this framework, the price data of shipping indices are first processed by Complementary Ensemble Empirical Mode Decomposition, and the prediction efficiency of the subsequent model is improved by decomposing the time series into multiple sub-series. The IMF series decomposed by CEEMD are then reorganized with a more efficient IMF reorganization strategy proposed in this article: the high-frequency and low-frequency components are identified by a one-sample t-test, the low-frequency components are combined into a new series LF, and PSO-BiLSTM forecasts are made for LF, the high-frequency components and the trend term. Finally, the individual forecasts are summed to obtain the final forecast. In this article, six mainstream shipping indices in the Chinese shipping market (FDI, CBFI, CBCFI, BDI, CFD, CTFI) are selected to test the prediction accuracy of each model with four evaluation indicators (RMSE, SMAPE, MAE, R2). Comparing the results with those of the other models, we find that:
(1) It can be seen from the four evaluation indicators that the IMF restructuring strategy proposed in this article significantly outperforms other IMF restructuring strategies and provides a more efficient method for other related financial time series forecasting; (2) The CEEMD-BiLSTM model optimized by PSO has a better prediction effect; (3) The forecasting results for all six shipping indices show that the CEEMD-PSO-BiLSTM model constructed in this article outperforms other models under all evaluation indicators.
In summary, this article further improves the deep learning framework for time series forecasting, enhances the forecasting accuracy of deep learning models in time series, demonstrates that the deep learning framework still has excellent forecasting results in shipping indices other than common forecasting objects (stock market, bond market, exchange rate, etc.), and hopes to continue to advance the integration of deep learning and the field of financial forecasting, providing future research with practical and reliable modelling solutions. However, there are still some limitations in this study, for example, this article is only based on the analysis of shipping index price data, and does not introduce factors such as the influence of macro policies of various countries and the influence of investor sentiment, which is also the focus of future improvement of the forecasting model.
- 1. Hummels D, Lugovskyy V, Skiba A. The trade reducing effects of market power in international shipping. Journal of Development Economics. 2009; 89(1): 84–97.
- 2. Kim H.G. Study about how the Chinese economic status affects to the Baltic Dry Index. International Journal of Business and Management. 2011; 6(3): 116–123.
- 3. Liu J, Li Z, Hao S, Yu L, Gao W. Volatility forecasting for the shipping market indexes: an AR-SVR-GARCH approach. Maritime Policy & Management. 2021; 1–18.
- 4. Bakshi G, Panayotov G, Skoulakis G. The Baltic Dry Index as a predictor of global stock returns, commodity returns, and global economic activity. Social Science Electronic Publishing. 2011.
- 5. Lu J, Chen Q. Study on fluctuation of Baltic Freight Index. Journal of Dalian Maritime University. 2003; 29(1): 1–4.
- 6. Duru O. A fuzzy integrated logical forecasting model for dry bulk shipping index forecasting: An improved fuzzy time series approach. Expert Systems with Applications. 2010; 37(7): 5372–5380.
- 7. Ruan Q, Wang Y, Lu X, Qin J. Cross-correlations between Baltic Dry Index and crude oil prices. Physica A Statistical Mechanics & Its Applications. 2016; 453: 278–289.
- 8. Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science. 2006 Jun 22; 313: 504–507. pmid:16873662
- 9. Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, et al. The empirical mode decomposition and the Hilbert Spectrum for Nonlinear and Non-stationary Time Series Analysis. Proceedings of the Royal Society of London. 1998 Nov 4; 454: 903–995.
- 10. Cullinane KPB, Mason KJ, Cape M. A Comparison of models for forecasting the Baltic Freight Index: Box-Jenkins Revisited. International Journal of Maritime Economics. 1999; 1(2): 15–39.
- 11. Veenstra AW, Franses PH. A co-integration approach to forecasting freight rates in the dry bulk shipping sector. Transportation Research Part A: Policy and Practice 1997; 31(6): 447–458.
- 12. Li Z. ARMA forecasting model of Baltic freight index. Journal of Shanghai Maritime University. 2004; 4: 69–72.
- 13. Yang H, Liu J, Fan Y. Study of volatility of Baltic Dry Bulk Freight Index based on GARCH model. Navigation of China. 2011; 34(3): 84–88.
- 14. Chen S, Meersman H, Voorde E. Forecasting spot rates at main routes in the dry bulk market. Maritime Economics & Logistics. 2012; 14(4): 498–537.
- 15. Yang H, Dong F, Ogandaga M. Forewarning of Freight Rate in Shipping Market Based on Support Vector Machine. Traffic and Transportation Studies. 2008; 104: 295–303.
- 16. Chou CC, Lin KS. A Fuzzy Neural Network Model for Analysing Baltic Dry Index in the Bulk Maritime Industry. The International Journal of Maritime Engineering. 2007; 159(A2): 167–174.
- 17. Pérez-Cruz F, Afonso-rodríguez J, Giner J. Estimating GARCH models using support vector machines. Quantitative Finance. 2003; 3(3): 163–172.
- 18. Li J, Parsons MG. Forecasting tanker freight rate using neural networks. Maritime Policy & Management. 1997; 24(1): 9–30.
- 19. Kamal IM, Bae H, Sim S, Kim H, Kim D, Choi Y, et al. Forecasting High-dimensional Multivariate Regression of Baltic Dry Index (BDI) Using Deep Neural Networks (DNN). ICIC Express Letters. 2019; 13(5): 427–434.
- 20. Zeng Q, Qu C, Ng A.K, Zhao X. A new approach for Baltic Dry Index forecasting based on empirical mode decomposition and neural networks. Maritime Economics & Logistics. 2016; 18(2): 192–210.
- 21. Sahin B, Gurgen S, Unver B, Altin I. Forecasting the Baltic Dry Index by using an artificial neural network approach. Turkish Journal of Electrical Engineering & Computer Sciences. 2018; 26(3): 1673–1684.
- 22. Zhang X, Xue T, Stanley HE. Comparison of econometric models and artificial neural networks algorithms for the prediction of Baltic Dry Index. IEEE Access. 2019; 7: 1647–1657.
- 23. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997; 9(8): 1735–1780.
- 24. Kim D, Kim H, Sim S, Choi Y, Bae H, Yun H. Prediction of dry bulk freight index using deep learning. Journal of Korean Institute of Industrial Engineers. 2019; 45(2): 111–116.
- 25. Xiao W, Xu C, Liu H, Liu X. A hybrid LSTM-Based ensemble learning approach for China coastal bulk coal freight index prediction. Journal of Advanced Transportation. 2021; 4: 1–23.
- 26. Nelson D.M, Pereira A, Oliveira R. Stock market’s price movement prediction with LSTM neural networks. International Joint Conference on Neural Networks (IJCNN). 2017: 1419–1426.
- 27. Nabipour M, Nayyeri P, Jabani H, Mosavi A, Salwana E. Deep learning for stock market prediction. Entropy. 2020; 22: 840. pmid:33286613
- 28. Lin Z, Tang Y, Zhang Y. Joint Deep Model with Multi-Level Attention and Hybrid-Prediction for Recommendation. Entropy. 2019; 21: 143. pmid:33266859
- 29. Abduljabbar RL, Dia H, Tsai PW. Development and evaluation of bidirectional LSTM freeway traffic forecasting models using simulation data. Sci. Rep. 2021; 11:23899. pmid:34903780
- 30. Yang Z, Jin L, Wang M. Forecasting Baltic Panamax index with support vector machine. Journal of Transportation Systems Engineering & Information Technology. 2011; 11(3): 50–57.
- 31. Han Q, Yan B, Ning G, Yu B. Forecasting dry bulk freight index with improved SVM. Mathematical Problems in Engineering 2014; Article ID: 460684.
- 32. Leonov Y, Nikolov V. A wavelet and neural network model for the prediction of dry bulk shipping indices. Maritime Economics & Logistics. 2012; 14: 319–333.
- 33. Chen Y, Liu B, Wang T. Analysing and forecasting China containerized freight index with a hybrid decomposition-ensemble method based on EMD grey wave and ARMA. Grey Systems Theory and Application. 2020; 11(3): 358–371.
- 34. Li C, Tian Y, He J. Prediction model of AC algorithm based on EMD decomposition combined with GMDH and its application. Journal of Systems & Management. 2012; 21(1): 105–110.
- 35. Awajan AM, Ismail MT, Wadi SA. Improving forecasting accuracy for stock market data using EMD-HW bagging. Plos one. 2018; 13(7): e0199582. pmid:30016323
- 36. Wu Z, Huang N.E. Ensemble empirical mode decomposition: a noise-assisted data analysis method. Advances in Adaptive Data Analysis. 2009; 1(1): 1–41.
- 37. Yeh JR, Shieh JS, Huang NE. Complementary ensemble empirical mode decomposition: a novel noise enhanced data analysis method. Advances in Adaptive Data Analysis. 2010; 2(2): 135–156.
- 38. Eberhart R, Kennedy J. A new optimizer using particle swarm theory. In Micro Machine and Human Science, 1995. MHS’95. Proceedings of the Sixth International Symposium on. IEEE, 1995: 39–43.
- 39. Li H, Feng C. Relationship between investor sentiment and stock indices fluctuation based on EEMD. Systems Engineering-Theory & Practice. 2014; 34(10): 2495–2503.