Information aggregation based trend prediction of energy structure via an improved compositional data time series forecasting model and its application

Abstract

To improve the prediction accuracy of compositional data time series (CDTSs), the aggregation of compositional data is considered and applied to construct a combination forecasting model. In contrast to the existing arithmetic-mean-based aggregation of compositional data, an aggregation method based on the induced ordered weighted averaging (IOWA) operator is put forward, and properties of this aggregation method are discussed. Since the prediction accuracies of individual forecasting models vary over time, the forecasting error between the aggregated CDTS and the original CDTS is used as the objective function to be minimized with respect to the aggregation weights. The genetic algorithm is used to derive the optimal weights associated with the individual forecasting models. Correspondingly, an improved time-varying combination mode and an IOWA operator based combination mode are developed. Finally, a numerical study on China’s primary energy production structure is presented. The results show that the developed varying weight combination model is superior to the benchmark model in terms of prediction accuracy, illustrating the feasibility and validity of the developed combination forecasting model.

1. Introduction

Generally, primary energy includes non-renewable energy (e.g., fossil fuels and nuclear energy) and renewable energy (e.g., hydropower, wind energy, solar energy and biomass energy). The energy structure is a key object for describing the development and evolution of regional energy. Currently, fossil fuels still dominate China’s energy consumption. Emerging energy sources, including nuclear energy, hydropower, wind energy and solar energy, have achieved sustained development, but their proportion still needs to increase further to meet the carbon peaking and carbon neutrality goals. Forecasting the regional energy structure is therefore of significant importance.

China is one of the world’s largest energy producers and consumers, and its energy resources are characterized by rich coal, poor oil and low gas. Coal is the main energy source for China’s energy production and consumption. From the perspective of world energy development trends, the development and utilization of new and renewable energy has become an important direction. An accurate forecast of the primary energy production structure gives a clearer picture of the future trend and can guide future energy development plans.

Fig 1 shows the annual data of the primary energy production structure in China during 2000–2021 (data source: http://www.stats.gov.cn/). Overall, the percentage of crude coal has decreased over the past two decades, although coal is still the main energy source in China. Crude oil shows a similar trend. In contrast, the percentages of natural gas and the other energy sources show an increasing trend.

Fig 1. Annual data of primary energy production structure in China from 2000 to 2021.

https://doi.org/10.1371/journal.pone.0351310.g001

To analyze the energy production structure, Hou and Song [1] considered the optimization of the energy structure in China and suggested measures including energy pricing or a carbon tax. Liu et al. [2] analyzed the influences of consuming coal, oil, natural gas, hydropower, solar and wind energy, nuclear power, and other renewable energy sources on carbon emissions in the European Union. By using a dynamic stochastic general equilibrium model, Jin et al. [3] quantified the macroeconomic effects of an unconventional monetary policy–targeted refinancing, and a conventional monetary policy–general refinancing on the optimization of the energy supply structure. Li et al. [4] introduced the Markov chain to set up an improved energy structure prediction model. Vorontsova et al. [5] considered innovative energy producing structures in Russia. Currently, most researchers pay attention to the energy consumption structure [6,7,8,9,10,11], while forecasting of the energy producing structure has not been widely discussed.

To forecast observations with different components and corresponding proportional information, the concept of compositional data is a valid tool [12]. Since compositional data are subject to the constant-sum constraint (the components sum to 1), they can reflect many scenarios and have been widely utilized to predict real-world events [13–16]. Current studies on CDTSs mainly focus on the processing of compositional data, while forecasting of CDTSs relies on common forecasting models and the inverse transformation of the transformed data. It’s worth noting that combination forecasting has been shown to be a valid tool to improve forecasting accuracy [17–19]. Besides, since the time-varying weights of combination forecasting can also be seen as a CDTS, the integration of combination forecasting and compositional time series could provide a novel forecasting mode.

By reviewing existing publications, the following research gaps can be summarized: 1) To make a scientific and reasonable energy plan, forecasting of the energy producing structure is needed. Knowledge about the energy producing structure would help to improve the soundness of possible energy plans. 2) From the theoretical perspective, how to design a CDTS forecasting model with high accuracy is still an open and interesting problem.

In this paper, the combination forecasting model is chosen for the forecasting of energy producing structure presented by CDTSs. Relying on individual forecasting results, it is common to obtain prediction results that are closer to actual observation data by integrating individual results. As an extension of aggregation model, the induced ordered weighted averaging (IOWA) operator [20] was chosen to aggregate individual predicted CDTSs.

2. Literature review

With the background mentioned above, this section mainly summarizes current studies on compositional data and corresponding time series analysis, combination forecasting and IOWA operator.

2.1. Compositional time series analysis

Compositional data [21,22] provide a valid tool for describing observed data with proportional information, for instance the energy producing structure [7, 23], the energy consumption structure [8, 9, 10, 24] and market share [14]. A compositional data time series (CDTS) is a sequence in which the observation at each time point is a compositional data vector, i.e., $X = \{x_t = (x_{t1}, x_{t2}, \ldots, x_{tp}) : t = 1, 2, \ldots, T\}$ with $x_{tj} \geq 0$ and $\sum_{j=1}^{p} x_{tj} = 1$, where p is the dimension and T is the sample size.

To forecast compositional time series, data preprocessing is considered first. Typical data preprocessing models include the logarithm transformation [21] and the hyperspherical transformation [25]. The logarithm transformation models mainly include the asymmetric logarithmic ratio transformation (ALRT) and the central logarithmic ratio transformation (CLRT). However, these models cannot handle compositions with 0 or 1 as components. Hence, Wang et al. [25] developed the hyperspherical transformation model, which can deal with compositions with 0 as component(s). The application areas of compositional data were thereby enlarged. So far, diverse forecasting methods for CDTSs have been developed. Petra et al. [13] considered the generalization of vector autoregressive models. Snyder et al. [14] developed a state space approach. Zhou et al. [26] put forward the autoregressive Dirichlet estimation method. The usage of the VARIMA model was extended by Carles et al. [27]. Chang et al. [28] proposed a new class of seasonal time series models based on a stable seasonal composition assumption. It can be seen that classical statistical forecasting models have been widely generalized to CDTSs.

2.2. Combination forecasting

To improve the forecasting accuracy, Bates and Granger [29] introduced the concept of combination forecasting. By weighting the results produced by individual forecasting models, the combination forecasting result is commonly closer to the real observations. Since then, combination forecasting has been widely studied in theory and applied to real-world problems. Liu et al. [30] proposed a combination forecasting model based on the hybrid interval multi-scale decomposition method, which is applied to forecasting interval-valued carbon prices. Radchenko et al. [31] provided the first thorough investigation of negative weights in combination forecasting. Qian et al. [17] proposed an AI-AFTER method, which can automatically determine and achieve the appropriate goal of forecast combination. Kang et al. [18] suggested a change of focus from the historical data to the produced forecasts, so that the features in forecast combinations can be extracted. Following the combination forecasting mode given by Bates and Granger [29], the essence of combination forecasting is data fusion, which is commonly expressed through types of mean values [19].

Currently, in the field of CDTS forecasting, the combination forecasting mode has not yet received sufficient attention. Besides, applications of the fusion of compositional data have rarely been discussed [32]. Furthermore, current information aggregation of compositional data is designed by using fundamental mathematical means, which cannot be used for complex scenarios. The induced ordered weighted averaging operator (IOWA for short) [20] is a generalization of several types of mathematical means, including the common weighted arithmetic mean, and has also been applied to build combination forecasting models [33]. Hence, the IOWA is chosen as a further tool for the fusion of individual forecasted CDTSs.

Actually, the application of aggregation operators in combination forecasting has been reported [34,35]. By weighting different individual forecasting models according to the accuracies of these models at different time points, strengths of the individual forecasting model with larger accuracies could be enlarged. Hence, the combination forecasting of CDTSs from the perspective of information aggregation could be considered [36,37].

2.3. Aim, contribution and organization

Based on the literature review and analysis above, the aim of this paper is to develop a combination forecasting model for CDTSs from the perspective of information aggregation. The contributions of this paper can be summarized as follows:

  1) Theoretically, a novel compositional data aggregation process is discussed. Pérez-Fernández et al. [32] considered a general betweenness-based aggregation framework for compositional data. Although theoretical studies on this framework have been developed, concrete compositional data aggregation functions need to be further detailed. By integrating the concept of the induced ordered weighted averaging operator with compositional data, a novel compositional data aggregation model is provided, and properties of the developed aggregation functions are analyzed.
  2) From the methodological perspective, a combination forecasting model of CDTSs is developed. Currently, how to improve the prediction accuracy of CDTSs is an open problem. To exploit the strengths of individual forecasting models at different time points, a time-varying weighted combination forecasting structure and an IOWA based combination forecasting structure of CDTSs are put forward in this paper.
  3) Technologically, for the out-of-sample forecasting of CDTSs, a bilayer forecasting procedure is developed, i.e., the forecasting of time-varying weighting vectors and the forecasting of individual CDTS models. By treating the weighting vectors as a CDTS, the problem of out-of-sample time-varying combination forecasting can be solved.

The rest of this paper is structured as follows: Fundamental notions, notations and the main results are shown in the Main Results section. Application of the developed combination forecasting model for CDTS in the field of energy producing structure in China is shown in the Numerical Study section. The Conclusion section summarizes the conclusions and possible future work.

3. Main results

3.1. Induced ordered weighted compositional data aggregation function

From the perspective of mathematics, a compositional data is presented as a p-dimensional vector, denoted as $x = (x_1, x_2, \ldots, x_p)$, where $x_j \geq 0$ and $\sum_{j=1}^{p} x_j = 1$. Since compositional data satisfy the sum-to-1 constraint, catastrophic consequences would be produced if this characteristic were intentionally or unintentionally ignored. Thus, Aitchison [38] introduced the logratio transformation for the analysis of compositional data. Since then, other logratio transformation methods, including the centered logratio transformation and the isometric logratio transformation, were developed. However, the logratio transformation models are limited in scenarios with 0 or 1 as a component. Wang et al. [25] introduced the hyperspherical transformation model, i.e.,

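For a given compositional data $x = (x_1, x_2, \ldots, x_p)$, let $y_j = \sqrt{x_j}$ $(j = 1, 2, \ldots, p)$, then $\sum_{j=1}^{p} y_j^2 = 1$. The vector $y = (y_1, y_2, \ldots, y_p)$ can therefore be regarded as a point on the unit hypersphere. Then,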

(1)

By inverting Eq. (1), it can be obtained that:

(2)

It can be seen that compositions with 0 as component(s) can be handled by the hyperspherical transformation model. By Eq. (2), the initial compositional data with p dimensions is transformed into p-1 angle series. Compared with the logarithmic transformation method, the range of the p-1 sequences after hyperspherical transformation is smaller and therefore more stable, which is beneficial for accurate prediction.
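The transformation and its inverse can be illustrated with a short sketch. It assumes the commonly used form of the hyperspherical transformation, in which the square roots of the components are written in spherical coordinates with angles θ_2, …, θ_p; the function names `to_angles` and `from_angles` and the angle indexing are illustrative assumptions and are not taken from the original equations.

```python
import numpy as np

def to_angles(x):
    """Map a composition x (p parts, summing to 1) to p-1 angles (assumed form)."""
    y = np.sqrt(np.asarray(x, dtype=float))      # point on the unit hypersphere
    theta, sin_prod = np.zeros(y.size - 1), 1.0  # sin_prod: product of sines found so far
    for j in range(y.size - 1, 0, -1):           # work from the last part backwards
        # If the remaining mass is zero the angle is arbitrary; 0 is chosen here.
        ratio = y[j] / sin_prod if sin_prod > 0 else 1.0
        theta[j - 1] = np.arccos(np.clip(ratio, -1.0, 1.0))
        sin_prod *= np.sin(theta[j - 1])
    return theta

def from_angles(theta):
    """Inverse map: p-1 angles back to a composition with p parts."""
    theta = np.asarray(theta, dtype=float)
    y, sin_prod = np.zeros(theta.size + 1), 1.0
    for j in range(theta.size, 0, -1):
        y[j] = np.cos(theta[j - 1]) * sin_prod
        sin_prod *= np.sin(theta[j - 1])
    y[0] = sin_prod                              # first part takes the remaining sines
    return y ** 2                                # squares sum to 1 by construction

x = np.array([0.7, 0.1, 0.0, 0.2])               # a zero component is allowed
print(to_angles(x))
print(from_angles(to_angles(x)))
```

Because the inverse map squares spherical coordinates, any set of angles yields non-negative parts that sum to 1, which is why forecasts can be produced in angle space and mapped back.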

To aggregate compositional data, Pérez-Fernández et al. [32] introduced the betweenness-based aggregation mode [39], which can be defined in the following:

Definition 2.1. Let (X, B, S) be a bounded beset and , a function is named as an (n-ary) aggregation function on (X, B, S) if

  (1) it satisfies the boundary conditions, i.e., for any ;
  (2) it is monotone, i.e., for any and any , , the fact that for any implies that , where .

Herein, the concepts of betweenness relation B and beset (X, B) can be found in Pérez-Fernández et al. [39].

Given that $x_1, x_2, \ldots, x_n$ are n compositional data vectors and $w = (w_1, w_2, \ldots, w_n)$ is a suitable weighting vector satisfying $w_i \geq 0$ and $\sum_{i=1}^{n} w_i = 1$, Pérez-Fernández et al. [39] introduced the following natural aggregation function, i.e.,

(3)

for any .

With the concept of IOWA operator, we have

Definition 2.2. Let be n 2-tuple compositional data arrays composed by n compositional data vectors and corresponding induced values , the induced ordered weighted compositional data aggregation function, denoted as , can be defined according to,

(4)

where is the compositional data vector corresponding to the i-th largest induced value .

Herein, the induced values are commonly numerical values that represent the importance of associated compositional data.
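A minimal sketch of this aggregation is given below. It assumes that the aggregation is the weighted arithmetic mean of the compositional data vectors after reordering them by decreasing induced value, which is how the IOWA operator is usually defined; the function name `iowa_composition` and the example numbers are hypothetical.

```python
import numpy as np

def iowa_composition(compositions, induced, weights):
    """IOWA-style aggregation of n compositions (assumed arithmetic-mean form)."""
    X = np.asarray(compositions, dtype=float)    # shape (n, p), each row sums to 1
    u = np.asarray(induced, dtype=float)         # one induced value per composition
    w = np.asarray(weights, dtype=float)         # weights attached to rank positions
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    order = np.argsort(-u, kind="stable")        # largest induced value first
    return w @ X[order]                          # weighted mean of reordered rows

X = [[0.6, 0.3, 0.1],
     [0.5, 0.3, 0.2],
     [0.7, 0.2, 0.1]]
u = [0.90, 0.95, 0.80]                           # e.g. accuracy of each forecast
w = [0.5, 0.3, 0.2]
print(iowa_composition(X, u, w))                 # components sum to 1
```

In a combination forecasting setting the induced values would typically be the accuracies of the individual forecasts at a given time point, so the largest weight goes to whichever model was most accurate at that point rather than to a fixed model.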

According to Eq. (4), the following results can be derived from the perspective of compositional data aggregation.

Property 1. The induced ordered weighted compositional data mean defined by Eq. (4) is a valid betweenness-based aggregation function.

Proof. Firstly, we prove that satisfies the boundary conditions.

Let be any element in S, then if , i.e., , we have

Thus, the boundary conditions hold.

Secondly, with given associated induced values, denoted as , the proof of monotonicity is similar to Proposition 7 in Pérez-Fernández et al. [39]. Thus, the proof is omitted.☐

Property 2. The aggregated result produced by Eq. (4) is also a compositional data.

Proof. By Eq. (4), it can be seen that the induced ordered weighted compositional data aggregation function is a weighted averaging mean in essence.

For the sum of the components of the aggregated compositional data, denoted as , the following result is valid,

Thus, it’s proved that the aggregated result produced by Eq. (4) is a compositional data.☐
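The closure argument can be written out explicitly. The notation below is an assumption made for illustration: $w_i$ are the aggregation weights, $\sigma(i)$ is the index of the compositional data vector with the i-th largest induced value, and $x_{\sigma(i)j}$ is its j-th component.

```latex
\sum_{j=1}^{p} \left( \sum_{i=1}^{n} w_i \, x_{\sigma(i)j} \right)
  = \sum_{i=1}^{n} w_i \left( \sum_{j=1}^{p} x_{\sigma(i)j} \right)
  = \sum_{i=1}^{n} w_i \cdot 1
  = 1
```

Each aggregated component is also non-negative, because it is a sum of products of non-negative weights and non-negative components.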

Property 3. When , the induced ordered weighted compositional data aggregation function degenerates to the common aggregation function of compositional data vectors.

3.2. Compositional data time series combination forecasting: From the perspective of data aggregation

In this subsection, a hybrid forecasting structure for CDTSs from the perspective of information aggregation is developed. The forecasting procedure is shown in Fig 2.

Fig 2. Hybrid forecasting structure for compositional data time series.

https://doi.org/10.1371/journal.pone.0351310.g002

As shown in Fig 2, a time-varying combination forecasting procedure is developed. The weighting vectors are also treated as a CDTS so that out-of-sample forecasting of the observed CDTS can be realized. Next, the whole hybrid forecasting procedure is illustrated in detail.

3.2.1. Determination of weighting vectors.

Let be a compositional data time series. Assume that are the predicted compositional data time series given by K individual forecasting models. To realize the combination forecasting of CDTSs, the weighting mode is firstly introduced.

Time invariant weighting mode. In this mode, individual forecasting results are aggregated with a time-invariant weighting vector. Given that is the time-invariant weighting vector corresponding to the individual forecasting models, while is an aggregation function with . Thus, the combined forecasting results corresponding to X can be obtained according to

(5)

To derive the time-invariant weighting vector , the following optimization model is commonly used:

(6)

where and are respectively the dissimilarity and similarity measures between any two compositional data time series.

The optimization model provides an objective way to derive the weighting vector. It’s worth noting that there are other, non-optimal weighting methods, including entropy weights and equal weights. Since the optimization model can achieve a specified combination forecasting criterion, such as minimizing the sum of squared errors, the optimization model is selected.

Time-varying weighting mode. In this mode, individual forecasting models are weighted with associated time-varying weighting vector, i.e., the weighting vectors are different with the changing of time. Let be the weighting vector corresponding to the t-th time point. Thus, the combined forecasting compositional data at the t-th time point, denoted as can be derived according to

(7)

Correspondingly, determination of the time-varying weighting vectors can be obtained by the following optimization model:

(8)

To obtain the optimal time-varying weighting vectors, the genetic algorithm is utilized [40]. The main idea of the algorithm is to encode the parameters to be optimized as genes and then construct a fitness function according to the objective function. Through genetic operations such as selection, crossover and mutation, the population is iteratively improved until the optimal solution is obtained.

The process of genetic algorithm to seek the optimal time-varying weighting vectors is summarized as follows:

  (1) Population initialization: An initial population containing K individuals is randomly generated. Each chromosome contains n genes, where n is the number of weights. Random numbers are selected in [0,1] as the initial weights.
  (2) Coding: Binary coding is adopted in this paper to encode each chromosome, and each individual is composed of binary strings.
  (3) Construct the fitness function: The genetic algorithm searches for the maximum fitness, whereas the objective function in the developed model is to minimize the error. The fitness function is thus set as the negative of the objective function.
  (4) Selection: Individuals are selected to form a new group according to a certain probability. The probability of an individual being selected is defined as:
(9)

where is the fitness of the i-th individual.

  (5) Genetic operators: Crossover operators and mutation operators are utilized.

The crossover operation selects two individuals and exchanges genes at corresponding positions according to the crossover probability . The mutation operation selects an individual in the population and changes the gene in the corresponding position according to the mutation probability .

  (6) Set the number of iterations and repeat steps (3)-(5) until the end of the iteration.

As a population-based stochastic search method, the genetic algorithm excels at navigating the complex, potentially non-convex landscape of the weight space, especially when the objective function is discontinuous, noisy, or lacks closed-form gradients. Such a method naturally respects the simplex constraints through tailored encoding (e.g., normalizing candidate weights to sum to 1) and does not rely on gradient information, making it robust to ill-behaved objective functions.
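The steps above can be sketched as follows. This is a simplified illustration and not the paper's implementation: it uses real-valued genes instead of binary coding, the sum of squared errors as the objective, fitness shifted to be positive so that fitness-proportional selection is well defined, and elitism; the function names and the default mutation rate are assumptions.

```python
import numpy as np

def combine(weights, forecasts):
    """Weighted arithmetic combination; forecasts has shape (K, T, p)."""
    return np.tensordot(weights, forecasts, axes=1)           # -> (T, p)

def sse(weights, forecasts, actual):
    """Sum of squared errors between combined and actual compositions."""
    return float(np.sum((combine(weights, forecasts) - actual) ** 2))

def ga_weights(forecasts, actual, pop_size=20, p_cross=0.6, p_mut=0.05,
               generations=200, seed=0):
    rng = np.random.default_rng(seed)
    K = forecasts.shape[0]
    pop = rng.random((pop_size, K))
    pop[0] = 1.0                                              # include equal weights
    pop /= pop.sum(axis=1, keepdims=True)                     # project onto the simplex
    for _ in range(generations):
        err = np.array([sse(w, forecasts, actual) for w in pop])
        best = pop[np.argmin(err)].copy()
        fitness = err.max() - err + 1e-12                     # larger is better, positive
        prob = fitness / fitness.sum()                        # fitness-proportional selection
        parents = pop[rng.choice(pop_size, size=pop_size, p=prob)]
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):                   # single-point crossover
            if K > 1 and rng.random() < p_cross:
                cut = rng.integers(1, K)
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        mask = rng.random(children.shape) < p_mut             # mutation: redraw a gene
        children[mask] = rng.random(mask.sum())
        children[0] = best                                    # elitism keeps the best
        sums = children.sum(axis=1, keepdims=True)
        pop = children / np.where(sums > 0, sums, 1.0)
    err = np.array([sse(w, forecasts, actual) for w in pop])
    return pop[int(np.argmin(err))]

rng = np.random.default_rng(1)
F = rng.dirichlet(np.ones(3), size=(3, 5))                    # 3 models, 5 times, 3 parts
actual = combine(np.array([0.7, 0.2, 0.1]), F)                # synthetic target
print(ga_weights(F, actual))
```

Passing all time points gives one time-invariant weighting vector; calling `ga_weights(F[:, t:t+1], actual[t:t+1])` for each t gives a time-varying sequence of weighting vectors.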

3.2.2. Construction of the combination forecasting.

As shown in Eqs. (5) and (7), hybrid CDTS forecasting can be realized by using the developed compositional data aggregation functions.

Next, let f be the natural aggregation function or the IOWA operator mentioned in subsection 3.1, the hybrid forecasting of CDTSs will be analyzed both within and outside the sample period.

For the time-invariant mode within the sample period, the hybrid fitted compositional data sequence can be directly derived according to Eq. (5).

For the time-invariant mode outside the sample period, since the fixed weighting vector is still valid, the hybrid forecasted compositional data sequence can also be directly derived according to Eq. (5).

For the time-varying mode within the sample period, the hybrid fitted compositional data sequence can be directly derived according to Eq. (7).

Herein, the weights within the sample period can be obtained according to Eq. (8). Since each time-varying weighting vector can also be seen as a compositional data vector, the sequence of weighting vectors forms a CDTS. To derive the out-of-sample weights, this sequence is itself predicted by the developed forecasting model.

The combinational forecasting procedure can be summarized as below:

Algorithm 1: Combination forecasting of CDTSs

1: Input: Observed CDTSs
2: Output: Predicted CDTSs
3: Start
4: Transform the observed CDTSs by using Eq. (2)
5: Forecast the transformed multiple sequences by using individual forecasting models
6: Transform the individual predicted results by the inverse transformation model
7: Calculate time-invariant weights by Eq. (6) and time-varying weights by Eq. (8)
8: Obtain the combination forecasting results by Eq. (5) or Eq. (7)
9: Compare different forecasting models by error indicators
10: The end

For the time-varying mode outside the sample period, the out-of-sample predictions additionally require the time-varying weighting vectors to be determined. The reason is that the actual compositional data are not available during this period, so the weighting vectors cannot be obtained by optimization. On the other hand, the weighting vectors also satisfy the sum-to-1 constraint. As a result, the time-varying weighting vectors are collected and treated as a CDTS, and the out-of-sample time-varying weighting vectors can thus be predicted. These predicted vectors are then combined with the out-of-sample predictions of the individual models to derive the out-of-sample combined compositional data predictions.
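The weight-forecasting layer of this bilayer procedure can be sketched as below. The sketch assumes the spherical-coordinate form of the hyperspherical transformation and Holt's linear trend method with fixed smoothing parameters; the helper names, the parameter values and the clipping of forecast angles to [0, π/2] are illustrative assumptions.

```python
import numpy as np

def to_angles(x):
    """Composition -> p-1 angles (assumed spherical-coordinate form)."""
    y = np.sqrt(np.asarray(x, dtype=float))
    theta, sin_prod = np.zeros(y.size - 1), 1.0
    for j in range(y.size - 1, 0, -1):
        ratio = y[j] / sin_prod if sin_prod > 0 else 1.0
        theta[j - 1] = np.arccos(np.clip(ratio, -1.0, 1.0))
        sin_prod *= np.sin(theta[j - 1])
    return theta

def from_angles(theta):
    """p-1 angles -> composition; the result always sums to 1."""
    y, sin_prod = np.zeros(len(theta) + 1), 1.0
    for j in range(len(theta), 0, -1):
        y[j] = np.cos(theta[j - 1]) * sin_prod
        sin_prod *= np.sin(theta[j - 1])
    y[0] = sin_prod
    return y ** 2

def holt_forecast(series, horizon, alpha=0.8, beta=0.2):
    """Holt's linear trend method with a simple initialisation."""
    y = np.asarray(series, dtype=float)
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev = level
        level = alpha * obs + (1 - alpha) * (prev + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return level + trend * np.arange(1, horizon + 1)

def forecast_weights(weight_history, horizon):
    """Treat in-sample weights (T, K) as a CDTS and predict future weights."""
    W = np.asarray(weight_history, dtype=float)
    angles = np.array([to_angles(w) for w in W])               # (T, K-1)
    future = np.column_stack([holt_forecast(angles[:, j], horizon)
                              for j in range(angles.shape[1])])
    future = np.clip(future, 0.0, np.pi / 2)                   # keep angles valid
    return np.array([from_angles(a) for a in future])          # (horizon, K)

history = np.array([[0.50, 0.30, 0.20], [0.52, 0.29, 0.19],
                    [0.55, 0.27, 0.18], [0.57, 0.26, 0.17]])
print(forecast_weights(history, horizon=2))
```

The predicted weighting vectors would then be applied to the out-of-sample forecasts of the individual models, for example as `w_future[h] @ individual_forecasts[:, h, :]` for each horizon step h.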

4. Numerical study

4.1. Evaluation and comparison on the developed compositional data time series analysis model

Besides traditional prediction error indexes such as RMSPE and MAPE, the prediction error of compositional data time series is often measured by the Aitchison distance in the simplex space. The norm of a compositional data vector and the distance between two compositional data vectors in the simplex space are defined as:

(10)(11)

where .
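For reference, the Aitchison norm and distance can be computed with the centered logratio (clr) coordinates, which is equivalent to the usual pairwise logratio form. The sketch below uses these standard definitions; it may differ in notation from Eqs. (10) and (11), and it requires strictly positive components.

```python
import numpy as np

def clr(x):
    """Centered logratio coordinates of a strictly positive composition."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

def aitchison_norm(x):
    """Aitchison norm: Euclidean norm of the clr coordinates."""
    return float(np.linalg.norm(clr(x)))

def aitchison_distance(x, y):
    """Aitchison distance: Euclidean distance between clr coordinates."""
    return float(np.linalg.norm(clr(x) - clr(y)))

actual = np.array([0.6, 0.3, 0.1])
predicted = np.array([0.2, 0.5, 0.3])
print(aitchison_norm(actual), aitchison_distance(actual, predicted))
```

The distance depends only on ratios between components, so rescaling either vector by a positive constant leaves it unchanged.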

To comprehensively show the superiority of the developed model, two traditional error evaluation indexes RMSPE and MAPE and four evaluation indexes MSD, SSD, MCPE and CVPE are selected. Table 1 shows the error indicators used in this manuscript.
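The two traditional indexes can be sketched with their usual definitions, assuming the errors are averaged over all components and time points. MSD, SSD, MCPE and CVPE are not sketched here because their definitions are given in Table 1, which is not reproduced in this text.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error (as a fraction, not in percent)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((a - p) / a)))

def rmspe(actual, predicted):
    """Root mean square percentage error (as a fraction, not in percent)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(((a - p) / a) ** 2)))

actual = np.array([100.0, 200.0])
predicted = np.array([110.0, 180.0])
print(mape(actual, predicted), rmspe(actual, predicted))
```

Both indexes divide by the actual value, so they are undefined when a component of the observed composition is exactly 0.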

4.2. Experimental results

In this subsection, three individual forecasting models, namely Holt’s Linear Trend Method (HLTM), Support Vector Regression (SVR) and Extreme Learning Machine (ELM), were trained on the training samples (the main code can be found in the supporting information S1 File), i.e.,

HLTM: This method is applicable to observed time series with a linear trend but no seasonality. HLTM adds trend estimation to simple exponential smoothing and smooths the level and the trend separately through two equations, i.e.,

HLTM is suitable for mid-term forecasting and can capture linear trends.
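The two smoothing equations can be illustrated with the textbook form of Holt's method. The smoothing parameters `alpha` and `beta`, their values and the initialisation below are assumptions chosen for illustration, not the settings used in the paper.

```python
import numpy as np

def holt_forecast(series, horizon, alpha=0.8, beta=0.2):
    """Holt's linear trend method: separate level and trend smoothing."""
    y = np.asarray(series, dtype=float)
    level, trend = y[0], y[1] - y[0]              # simple initialisation
    for obs in y[1:]:
        prev = level
        # Level equation: blend the observation with the one-step projection.
        level = alpha * obs + (1 - alpha) * (prev + trend)
        # Trend equation: blend the latest level change with the old trend.
        trend = beta * (level - prev) + (1 - beta) * trend
    # h-step-ahead forecasts extrapolate the last level along the last trend.
    return level + trend * np.arange(1, horizon + 1)

series = 2.0 + 0.5 * np.arange(10)                # exactly linear series
print(holt_forecast(series, horizon=3))           # continues the line: 7.0, 7.5, 8.0
```

On an exactly linear series the level and trend track the line without error, so the forecasts continue the line regardless of the smoothing parameters.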

SVR: A classic variant of the Support Vector Machine (SVM) for regression problems. The core issue is to find the optimal fitting hyperplane while allowing for controllable errors. It offers both nonlinear modeling capability and resistance to overfitting. Different from linear regression and decision tree regression, SVR does not pursue a perfect fit of all data points but focuses on key support vectors, making it suitable for numerical prediction problems with small and medium-sized datasets and complex nonlinear relationships.

ELM: ELM is a learning algorithm used for single hidden layer feedforward neural networks. The characteristic of this model is that the input weights and bias terms are randomly assigned, and the training process can be completed by adjusting the output weights. Such characteristic gives ELM extremely high computational efficiency and good generalization ability.

The initial observed CDTS is transformed into three series. In this paper, the first, second and third order lags of each transformed series were taken as independent variables for SVR and ELM. The radial basis function was chosen as the kernel function for SVR, with the penalty factor and kernel parameter set to the default values of the R language (1 and 0.333). The sigmoid function was selected as the activation function for ELM. The number of hidden neurons was increased in steps of 1 during training. The results indicated that the best result was achieved when the number of hidden neurons was 6, 5, and 6 for the first, second and third transformed series, respectively.
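The ELM setup described here (three lags as inputs, sigmoid activation, a handful of hidden neurons) can be sketched as follows. The class and function names, the normal distribution used for the random input weights, and the synthetic series are assumptions for illustration; the actual code is in the paper's S1 File.

```python
import numpy as np

def lagged(series, n_lags=3):
    """Build a design matrix whose columns are lags 1..n_lags of the series."""
    s = np.asarray(series, dtype=float)
    X = np.column_stack([s[n_lags - k - 1: len(s) - k - 1] for k in range(n_lags)])
    return X, s[n_lags:]

class ELM:
    """Single-hidden-layer network with random, untrained input weights."""
    def __init__(self, n_hidden=6, seed=0):
        self.n_hidden, self.seed = n_hidden, seed

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid activation

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.W = rng.normal(size=(X.shape[1], self.n_hidden)) # random input weights
        self.b = rng.normal(size=self.n_hidden)               # random biases
        # Only the output weights are learned, by least squares (pseudo-inverse).
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

series = 0.5 + 0.3 * np.sin(np.linspace(0, 3, 20))            # synthetic angle series
X, y = lagged(series, n_lags=3)
model = ELM(n_hidden=6).fit(X, y)
print(np.sqrt(np.mean((model.predict(X) - y) ** 2)))          # in-sample RMSE
```

Because training reduces to one least-squares solve, trying hidden-layer sizes one by one, as described above, is computationally cheap.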

Based on the results of the three individual models, the proposed optimization model is solved by using the genetic algorithm. The population size, crossover rate and mutation rate of the GA are set to 20, 0.6 and 0.001, respectively. The varying weights of the combination model on the training set are thereby obtained and are presented in Table 2. To determine the varying combination weights on the testing set, the method of compositional data prediction is employed: each group of varying weights over the training period is treated as a CDTS. After the spherical coordinate transformation, the Holt exponential smoothing method is used to predict the weights’ CDTS, and the varying weights used on the testing set are obtained by performing the inverse transformation on the predicted results. These results are also displayed in Table 2. Additionally, the time-invariant optimal weights obtained using the IOWA operator are w1 = 0.7526, w2 = 0.1482, w3 = 0.0992.

Table 2. Weights information of varying weight combination model.

https://doi.org/10.1371/journal.pone.0351310.t002

Table 3 shows the overall performance of the five prediction methods. The varying weight combination model and the IOWA operator based combination model both yield favorable results on the training set. The error indicators of the varying weight combination model are lower than those of the three individual forecasting models. The genetic algorithm based varying weight combination model outperforms the traditional time-invariant combination model. The error indicators of the IOWA operator based combination model are uniformly lower than those of the individual models and the time-invariant weight combination model.

Table 3. Comparisons among different forecasting models.

https://doi.org/10.1371/journal.pone.0351310.t003

On the training set (compositional data from 2003 to 2017), the genetic algorithm based varying weight combination model performs the best on RMSPE and MAPE. On the other hand, the IOWA operator based combination model performs better on MSD, SSD, CVPE, and MCPE, achieving reductions of 0.50%, 0.29%, 0.02%, and 0.02%, respectively, compared to the varying weight combination model. On the testing set (compositional data from 2018 to 2021), both the varying weight and IOWA operator based combination models significantly reduced prediction errors compared to the individual models and the time-invariant weight combination model. Specifically, compared to the time-invariant weight combination model, the varying weight combination and IOWA operator based combination models reduced RMSPE by 1.35% and 3.62%, MAPE by 3.31% and 5.22%, MSD by 0.51% and 1.18%, and CVPE by 0.03% and 0.10%, respectively. Furthermore, the varying weight combination model outperformed the IOWA operator version in terms of prediction accuracy. This indicates that the varying weight combination and IOWA operator based combination models have higher prediction accuracies compared to the other models.

Fig 3 shows the comparison between the predicted and actual values of China’s primary energy production structure using the two combination methods. The predicted values closely align with the actual values, suggesting that the proposed combination prediction method exhibits a strong predictive capability. The evolving trend of China’s primary energy production structure from 2000 to 2021 is effectively captured.

Fig 3. Actual and predicted values of primary energy production structure.

https://doi.org/10.1371/journal.pone.0351310.g003

4.3. Discussion

From Table 3 and Fig 3, it can be seen that the combination forecasting results nearly coincide with the original observations, which shows the strengths of the developed combination forecasting model. The error level on the test set is greater than that on the training set, which is acceptable. Besides, the test period (20% of the observations) is short because the total dataset is not large. On the other hand, since the accuracies of the forecasting results on the training dataset and the testing dataset are at the same numerical level, no overfitting problem is observed. Furthermore, the combination forecasting is realized by weighting the individual forecasting results. As a result, the generalization ability of the developed combination forecasting model can be maintained or even improved relative to the generalization ability of the individual forecasting models.

From the trend perspective, traditional energy is gradually shifting from being the main base load energy to the role of peak shaving guarantee. Advanced coal production capacity is released in an orderly manner to ensure stable supply. The oil and gas sector continues to promote the increase of reserves and production. At the same time, natural gas, as the bridge of energy transformation, plays a key role in peak shaving guarantee, helping to ensure energy supply security and green transition. Besides, new energy has become the core driving force for optimizing the energy structure, and the proportion of non-fossil energy consumption continues to increase. The installed capacity of wind and photovoltaic power generation has surpassed that of thermal power for the first time, and the installed capacity of new energy storage remains the largest in the world. The total amount of hydropower, nuclear power, wind power, and solar power generation has significantly increased. The integration of artificial intelligence and the energy industry is accelerating, promoting the upgrading of energy production and consumption towards intelligence and efficiency. Emerging energy such as hydrogen energy has entered a critical stage of large-scale application, and clean energy technology is increasingly shared globally, reshaping the global energy supply chain. The predicted results suggest that current energy plans and policies are effective in transforming the energy structure in a greener direction.

In the future, it is suggested to promote the orderly transformation of traditional energy, release advanced coal production capacity to ensure stable supply, and strengthen the shift of coal-fired power towards a basic guarantee and flexible regulation function. Modern coal chemical technology can be used to convert coal from a fuel into raw materials and clean fuels, and oil and gas reserves and production capacity should be enhanced to secure baseline supply capability. At the same time, natural gas should be leveraged as a bridge for the energy transition to ensure the resilience of energy supply. In addition, new energy can be taken as the core driving force for structural optimization by expanding the installed capacity of wind and photovoltaic power. It is also suggested to focus on the construction of large-scale bases in the Shage Desert region and on distributed and decentralized energy sources, consolidating the globally leading position in energy storage and in wind and solar installed capacity, which has surpassed that of thermal power. Finally, nuclear power should be developed safely and in an orderly manner, and hydropower development should be coordinated with ecological protection, so that the proportion of non-fossil energy consumption can be increased and a multi-energy complementary system can be formed.

5. Conclusion

In light of the limitations of existing combined models for forecasting CDTSs, two novel combination forecasting methods are proposed: one uses varying weights represented as compositional data, and the other is based on the IOWA operator. In both cases, the weights are optimized by a genetic algorithm. The effectiveness of these methods is verified by applying them to China’s primary energy production structure. The data are divided into training and testing sets, and the weights are obtained by minimizing the forecasting error using the genetic algorithm. Compared with the individual forecasting models, the prediction errors of the varying weight and IOWA operator based combination models are generally smaller. Furthermore, the varying weight combination model exhibits better prediction performance on the test set, giving higher prediction accuracy than the IOWA operator combination model. These findings suggest that the varying weight and IOWA operator based combination models outperform the other models considered in predicting compositional data time series. In the future, the decomposition method [41] can be combined with the developed model for compositional data time series analysis. The robustness of the combination forecasting model is also an open and interesting problem. In addition, because of the constant-sum constraint, a systematic modeling process is needed. Finally, how to measure the credibility of CDTS forecasts has not been considered in previous work. The traditional confidence interval method cannot be used directly in this field, so it is necessary to develop a parallel or extended theoretical framework for such analysis.
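As an illustration of how IOWA-based aggregation and genetic-algorithm weight fitting fit together, the following is a minimal sketch and not the implementation used in this study. It assumes that the inducing variable is each model's in-sample accuracy at each period (here the negative Euclidean error), that the combination is a weighted arithmetic mean followed by closure, and that a simple elitist genetic algorithm with arithmetic crossover and Gaussian mutation searches the weight simplex. All data, function names, and parameter values are hypothetical.

```python
import numpy as np

def iowa_combine(forecasts, accuracy, w):
    """IOWA aggregation: at each period, rank models by accuracy
    (most accurate first) and apply position weights w to the ranking.

    forecasts: (m, T, D) compositions; accuracy: (m, T); w: (m,).
    """
    forecasts = np.asarray(forecasts, dtype=float)
    order = np.argsort(-np.asarray(accuracy), axis=0)          # (m, T)
    ranked = np.take_along_axis(forecasts, order[:, :, None], axis=0)
    combined = np.tensordot(np.asarray(w, dtype=float), ranked, axes=(0, 0))
    return combined / combined.sum(axis=1, keepdims=True)      # closure

def fit_weights_ga(loss, m, pop_size=40, generations=100, sigma=0.1, seed=0):
    """Minimal elitist genetic algorithm over the weight simplex."""
    rng = np.random.default_rng(seed)
    pop = rng.dirichlet(np.ones(m), size=pop_size)   # random simplex points
    for _ in range(generations):
        fitness = np.array([loss(w) for w in pop])
        elite = pop[np.argsort(fitness)[: pop_size // 2]]   # selection
        n_children = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        alpha = rng.random((n_children, 1))
        children = alpha * pa + (1 - alpha) * pb            # crossover
        children = np.abs(children + rng.normal(0, sigma, children.shape))
        children /= children.sum(axis=1, keepdims=True)     # back to simplex
        pop = np.vstack([elite, children])
    fitness = np.array([loss(w) for w in pop])
    return pop[np.argmin(fitness)]

# Hypothetical data: model A is better early, model B is better late.
actual = np.array([[0.70, 0.20, 0.10],
                   [0.68, 0.21, 0.11],
                   [0.66, 0.22, 0.12],
                   [0.64, 0.23, 0.13],
                   [0.62, 0.24, 0.14]])
offset = np.array([0.02, -0.01, -0.01])
scale_a = np.array([0.2, 0.2, 0.2, 1.0, 1.0])
scale_b = np.array([1.0, 1.0, 1.0, 0.2, 0.2])
forecasts = np.stack([actual + scale_a[:, None] * offset,
                      actual + scale_b[:, None] * offset])   # (2, 5, 3)
accuracy = -np.linalg.norm(forecasts - actual, axis=2)       # (2, 5)

def loss(w):
    """Mean Euclidean error of the IOWA combination (assumed objective)."""
    combined = iowa_combine(forecasts, accuracy, w)
    return float(np.mean(np.linalg.norm(combined - actual, axis=1)))

best_w = fit_weights_ga(loss, m=2)
print("position weights:", best_w, "loss:", loss(best_w))
```

Because the weights attach to accuracy ranks rather than to specific models, the fitted weights in this toy case concentrate on the first position, that is, on whichever model is most accurate in each period. For out-of-sample periods the true accuracy is unknown, so a proxy such as recent past accuracy would have to be assumed.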

Supporting information

Acknowledgments

The author would like to thank the editor and the anonymous reviewers for their recognition of, and valuable comments on, this work.

References

  1. Hou G, Song H. Whether the directed technical change promotes the improvement of the energy structure in China. Front Environ Sci. 2022;10.
  2. Liu Y, Xie X, Wang M. Energy structure and carbon emission: analysis against the background of the current energy crisis in the EU. Energy. 2023;280:128129.
  3. Jin Y, Wang S, Bu L, Zhai P. Unconventional, conventional monetary policies, and optimal energy supply structure in China. Fin Res Lett. 2023;54:103732.
  4. Li X, Ge Y, Mao X, Xue W, Xu N. Improved Energy Structure Prediction Model Based on Energy Demand Forecast. In: 2018 2nd IEEE Conference on Energy Internet and Energy System Integration (EI2). 2018. pp. 1–5. https://doi.org/10.1109/EI2.2018.8582170
  5. Vorontsova OV, Ukraintsev VB, Sedykh YA. Innovative Energy Producing Structures in Russia. In: 2019 International Science and Technology Conference “EastConf”. 2019. pp. 1–4. https://doi.org/10.1109/EastConf.2019.8725426
  6. Feng T, Sun L, Zhang Y. The relationship between energy consumption structure, economic structure and energy intensity in China. Energy Policy. 2009;37(12):5475–83.
  7. Yang X, Wang S, Zhang W, Li J, Zou Y. Impacts of energy consumption, energy structure, and treatment technology on SO2 emissions: A multi-scale LMDI decomposition analysis in China. Appl Energy. 2016;184:714–26.
  8. Jiang P, Yang H, Li H, Wang Y. A developed hybrid forecasting system for energy consumption structure forecasting based on fuzzy time series and information granularity. Energy. 2021;219(C):119599.
  9. Li H, Li B, Niu D. Prediction on the energy consumption structure in liaoning province based on system dynamics. Pol J Environ Stud. 2021.
  10. Liu J, Ma H, Wang Q, Tian S, Xu Y, Zhang Y, et al. Optimization of energy consumption structure based on carbon emission reduction target: A case study in Shandong Province, China. Chin J Popul Resourc Environ. 2022;20(2):125–35.
  11. Guan Y, Yang J, Wang R, Zhang L, Wang M. Exploring the role of energy consumption structure and digital transformation in urban logistics carbon emission efficiency. Atmosphere. 2025;16(8):929.
  12. Martín-Fernández JA, Hron K, Templ M, Filzmoser P, Palarea-Albaladejo J. Model-based replacement of rounded zeros in compositional data: classical and robust approaches. Comput Stat Data Anal. 2012;56(9):2688–704.
  13. Kynčlová P, Filzmoser P, Hron K. Modeling compositional time series with vector autoregressive models. J Forecast. 2015;34(4):303–14.
  14. Snyder RD, Ord JK, Koehler AB, McLaren KR, Beaumont AN. Forecasting compositional time series: a state space approach. Int J Forecast. 2017;33(2):502–12.
  15. Liang W, Wu Y, Ma X. Robust sparse precision matrix estimation for high-dimensional compositional data. Stat Probab Lett. 2022;184:109379.
  16. Tian Y, Majahar Ali MK, Wu L, Li T. Imputation method based on adaptive group lasso for high-dimensional compositional data with missing values. Mal J Fund Appl Sci. 2025;21(1):1551–65.
  17. Qian W, Rolling CA, Cheng G, Yang Y. Combining forecasts for universally optimal performance. Int J Forecast. 2022;38(1):193–208.
  18. Kang Y, Cao W, Petropoulos F, Li F. Forecast with forecasts: diversity matters. Eur J Operat Res. 2022;301(1):180–90.
  19. Genre V, Kenny G, Meyler A, Timmermann A. Combining expert forecasts: can anything beat the simple average? Int J Forecast. 2013;29(1):108–21.
  20. Yager RR, Filev DP. Induced ordered weighted averaging operators. IEEE Trans Syst Man Cybern B Cybern. 1999;29(2):141–50. pmid:18252288
  21. Billheimer D. Compositional Data. Encyclopedia of Environmetrics. 2006.
  22. Kim Y, Heon S, Kenyon J, Kim J, Geller J, Redeker NS. Associations of 24-hour movement behaviors with cardiorespiratory fitness in heart failure patients: a compositional data analysis. Med Sci Sports Exerc. 2025;57(10S):82–82.
  23. Wu Y, Zhang D, Zhang Y, Zhang H, Zhou L, Liu Y, et al. Water wave energy-harvesting accordion structure triboelectric nanogenerators for self-driven corrosion protection. Nano Energy. 2025;142:111207.
  24. Zhang H, Jin J. Assessing the effect of green finance, energy consumption structure and environmental sustainable development: a moderated mediation model. Econ Change Restruct. 2024;57(2).
  25. Wang H, Liu Q, Mok HMK, Fu L, Tse WM. A hyperspherical transformation forecasting model for compositional data. Eur J Operat Res. 2007;179(2):459–68.
  26. Zhou G, Luo P, He Q. Predicting compositional time series via autoregressive Dirichlet estimation. Sci China Inf Sci. 2018;61(9).
  27. Barceló-Vidal C, Aguilar L, Martín-Fernández JA. Compositional VARIMA time series. Compositional Data Analysis: Theory and Applications. 2011.
  28. Chang K, Chen R, Fomby TB. Prediction‐based adaptive compositional model for seasonal time series analysis. J Forecast. 2017;36(7):842–53.
  29. Bates JM, Granger CWJ. The combination of forecasts. J Operat Res Soc. 1969;20(4):451–68.
  30. Liu J, Wang P, Chen H, Zhu J. A combination forecasting model based on hybrid interval multi-scale decomposition: Application to interval-valued carbon price forecasting. Exp Syst Appl. 2022;191:116267.
  31. Radchenko P, Vasnev AL, Wang W. Too similar to combine? On negative weights in forecast combination. Int J Forecast. 2023;39(1):18–38.
  32. Pérez-Fernández R, Gagolewski M, De Baets B. On the aggregation of compositional data. Inf Fus. 2021;73:103–10.
  33. Li B, Ding J, Yin Z, Li K, Zhao X, Zhang L. Optimized neural network combined model based on the induced ordered weighted averaging operator for vegetable price forecasting. Exp Syst Appl. 2021;168:114232.
  34. Clemen RT. Combining forecasts: a review and annotated bibliography. Int J Forecast. 1989;5(4):559–83.
  35. Plocoste T, Regis S, Nuiro SP, Sankaran A. Application of aggregation operators for forecasting PM10 fluctuations: from available Caribbean data sites to unequipped ones. Atmosph Pollut Res. 2024;15(6):102116.
  36. Huang H, Tian Y, Tao Z. Multi-rule combination prediction of compositional data time series based on multivariate fuzzy time series model and its application. Exp Syst Appl. 2024;238:121966.
  37. Lu S, Wang H, Zhao J. Graph convolutional network for compositional data. Inf Fus. 2025;117:102798.
  38. Aitchison J. The Statistical Analysis of Compositional Data. London - New York: Chapman and Hall; 1986.
  39. Pérez-Fernández R, Baets BD. Aggregation theory revisited. IEEE Transac Fuzzy Syst. 2021;29(4):797–804.
  40. Tan F, Wang J, Jiao Y-Y, Ma B, He L. Suitability evaluation of underground space based on finite interval cloud model and genetic algorithm combination weighting. Tunnel Undergr Space Technol. 2021;108:103743.
  41. Xiao F, Yang S, Li X, Ni J. Branch error reduction criterion-based signal recursive decomposition and its application to wind power generation forecasting. PLoS One. 2024;19(3):e0299955. pmid:38517881