
Enhanced multi-horizon photovoltaic power forecasting: A novel approach integrating ICEEMDAN decomposition with hierarchical frequency neural networks

  • Yaopeng Han,

    Roles Conceptualization, Data curation, Investigation, Methodology, Visualization, Writing – original draft

    Affiliations College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang, Liaoning, China, Extended Energy Big Data and Strategy Research Center, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, China, Shandong Energy Institute, Qingdao, Shandong, China, Qingdao New Energy Shandong Laboratory, Qingdao, Shandong, China

  • Chenxi Li,

    Roles Data curation, Resources

    Affiliation College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang, Liaoning, China

  • Siqi Chen,

    Roles Data curation, Resources

    Affiliation College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang, Liaoning, China

  • Jinghao Zhao,

    Roles Conceptualization, Formal analysis, Methodology, Supervision, Validation, Writing – review & editing

    Affiliations Extended Energy Big Data and Strategy Research Center, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, China, Shandong Energy Institute, Qingdao, Shandong, China, Qingdao New Energy Shandong Laboratory, Qingdao, Shandong, China

  • Yajun Tian,

    Roles Funding acquisition, Supervision, Writing – review & editing

    Affiliations Extended Energy Big Data and Strategy Research Center, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, China, Shandong Energy Institute, Qingdao, Shandong, China, Qingdao New Energy Shandong Laboratory, Qingdao, Shandong, China

  • Jun Wang

    Roles Project administration, Writing – review & editing

    wj_software@hotmail.com

    Affiliation College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang, Liaoning, China

Abstract

As a crucial renewable energy source, solar photovoltaic (PV) power generation drives environmental protection and energy transformation. However, existing forecasting models struggle to accurately capture the complex dynamics of PV power, primarily due to monolithic modeling paradigms and inadequate representation of temporal information. To address these challenges, this paper proposes a novel hybrid model that leverages data decomposition and frequency-stratified prediction. The model employs the advanced ICEEMDAN algorithm to address complex non-stationarity. Additionally, it introduces a frequency-stratified heterogeneous network for precise component-wise modeling and integrates Improved Relative Positional Encoding (IRPE) to accurately capture temporal dependencies. To comprehensively evaluate model performance, this study employs quantile regression to generate probabilistic prediction intervals, using the median output as the baseline for point predictions. The model’s performance is validated through ablation experiments and comparisons of single-step and multi-step predictions with recent benchmark models. The results indicate that the model excels under the MIMO strategy, achieving nMAE values of 0.1142 and 0.1490 for 120-minute and 2880-minute forecasts on the DKASC and Solar I datasets, respectively, surpassing recent baseline models by 14.6% and 8.1%. Furthermore, the model’s statistical stability and robustness are confirmed through 30 independent Wilcoxon signed-rank tests, as well as an uncertainty analysis conducted under various weather conditions. In summary, the model’s high accuracy and stability provide robust support for power plant operations and planning.

1 Introduction

Energy is the fundamental driving force behind modern global progress, and electricity demand is an important aspect of it [1]. Renewable energy is becoming increasingly attractive due to the high costs, limited reserves, and environmental impact of traditional energy sources. In recent years, the economic and environmental advantages of photovoltaic (PV) power generation have attracted widespread attention, and its market penetration rate has also significantly increased [2]. This has positioned PV to play a crucial role in meeting global energy demand and achieving dual-carbon goals. Despite the enormous potential of photovoltaic power generation, accurate estimation of its power generation remains a challenge due to the inherent instability and periodicity of the technology, which may affect the reliability and resilience of the power backbone network [3]. Therefore, it is imperative to improve the accuracy of photovoltaic power generation prediction.

In recent years, researchers have investigated a variety of technical approaches to achieve accurate predictions, encompassing physical methods, statistical techniques, and artificial intelligence [4]. While deep learning models, especially hybrid models that integrate signal decomposition techniques, have gained popularity and enhanced prediction accuracy to some degree [5], a thorough analysis of existing literature indicates that current research continues to face two fundamental limitations (a comprehensive discussion of related work is provided in Sect 2).

  1. The current paradigm for signal decomposition and modeling has significant limitations in accurately capturing the complex dynamic characteristics of PV power. While signal decomposition is widely utilized, research often adheres to a homogenized modeling approach, applying a uniform model to both high-frequency components—characterized by peaks, valleys, and transient changes—and low-frequency components, which represent the fundamental trend. This method neglects the unique physical properties of each component, resulting in a sluggish model response to sharp fluctuations in PV power and ultimately limiting overall prediction accuracy [6]. Additionally, the choice of decomposition algorithm is critical for preserving these dynamic details. Compared to the commonly used CEEMDAN and VMD algorithms, the more advanced ICEEMDAN algorithm better suppresses modal aliasing and retains dynamic signal information [7].
  2. In time series forecasting, the sequential information of data is critically important. The loss of positional information within sequences and inadequate model representation are significant challenges in this field. Although models based on the Attention Mechanism perform well in time series tasks, their standard attention modules cannot effectively capture the absolute or relative positional information of the input [8]. However, most existing studies either overlook this issue or utilize absolute positional encodings that can be rendered ineffective by linear transformations [9]. This limitation hinders the model’s ability to learn temporal dependencies, ultimately constraining its predictive accuracy.

To address these limitations, this paper proposes a hierarchical frequency modeling network, as shown in Fig 1. In Step 1, the PCC [10] method is used to select climatic factors that significantly impact PV power. In Step 2, the ICEEMDAN algorithm decomposes the PV power sequence, generating a high-frequency error sequence and a low-frequency sum sequence. These sequences are combined with the selected climatic factors to form a PV power prediction dataset. In Step 3, the ConvBiGRU-IRPE-A model and the LSTM model predict the high-frequency error sequence and the low-frequency sum sequence, respectively. Finally, the model parameters are optimized using the PSO [11] algorithm, and the results of each frequency sequence are summed to obtain the final PV power prediction.

To quantify the prediction uncertainty of the model, this study employs quantile regression to generate probabilistic prediction intervals and uses the median of the model’s output as the point forecast. The performance of the proposed model, referred to as ICIAL, is then comprehensively evaluated on the DKASC and Solar I datasets. The evaluation includes single-step forecasting, ablation studies, multi-step forecasting under multiple strategies (MIMO, Direct, and Recursive) [12], and a comparative analysis against recent baseline models. The results demonstrate that the ICIAL model achieves its best performance under the MIMO strategy. Notably, in the multi-step MIMO forecasting experiments, it achieved a 10.2% reduction in nMAE compared to the ITransformer model [13] for the T+24 (120 min) task on the DKASC dataset, and a 5.9% reduction in nRMSE compared to the DA-GRU model [14] for the T+192 (2880 min) task on the Solar I dataset. To verify the stability of the MIMO predictions, 30 rounds of model training and prediction were performed on both the DKASC and Solar I datasets and compared using Wilcoxon signed-rank tests [15]. The results reveal median p-values well above 0.05, with the proportion of significant differences being extremely low, peaking at 6.67%. This indicates high statistical consistency across multiple runs, confirming the consistent performance and robustness of the ICIAL model. Furthermore, an uncertainty analysis conducted under various weather conditions shows that the prediction intervals are both reliable, as measured by the Prediction Interval Coverage Probability (PICP), and sharp, as indicated by a narrow Prediction Interval Average Width (PIAW).

The contributions presented in this paper are as follows:

  1. This study proposes a novel hybrid forecasting paradigm that combines signal decomposition with a frequency-stratified network. The proposed model, ICIAL, first employs ICEEMDAN to capture the complex fluctuation patterns of the PV power series. Subsequently, it utilizes a heterogeneous neural network integrated with Improved Relative Positional Encoding (IRPE) to perform stratified forecasting on the different frequency components, thereby significantly enhancing the model’s ability to represent temporal features.
  2. This study effectively addresses the limitations of traditional models in accurately capturing the complex dynamic characteristics of PV power. By applying targeted modeling techniques to both high- and low-frequency signals, the new paradigm significantly enhances the model’s precision in capturing sharp fluctuations, including peaks, valleys, and transient changes.
  3. This study systematically evaluated the model’s performance and robustness across various multi-step forecasting strategies. Comparative experiments conducted in multiple scenarios confirmed the long-term forecasting performance advantages of the proposed model, particularly under the MIMO strategy. Furthermore, the Wilcoxon signed-rank test was employed to ensure the statistical consistency and reliability of the prediction results.
  4. By introducing a quantile regression-based framework for uncertainty analysis conducted under various weather conditions, the model generates prediction intervals that demonstrate both high coverage probability (PICP) and robust sharpness (PIAW). This approach provides more comprehensive informational support for grid risk management and decision-making.

The remainder of this paper is organized as follows: Sect 2 introduces related works; Sect 3 details the core methodologies of the study, including the proposed ICIAL model and the validation techniques; Sect 4 provides a comprehensive description of the sources of the experimental data, the pre-processing procedures, and the methodologies employed for model training; Sect 5 presents a comprehensive experimental evaluation of the proposed ICIAL model, which includes performance comparisons against baseline models, stability validation via the Wilcoxon signed-rank test, and an assessment of its uncertainty quantification capabilities under various weather scenarios. Finally, Sect 6 summarizes the study and provides insights into potential directions for future research.

2 Related work

This section will provide a comprehensive review of prediction methods pertinent to this study, emphasizing the analysis of their strategies and limitations concerning hybrid model architectures, signal processing, and time series information modeling.

2.1 Evolution of mainstream PV prediction methods

The development of photovoltaic prediction technology has progressed from traditional methods to intelligent algorithms. Initially, physical and statistical methods were the predominant techniques. Physical methods rely on measured physical data; for example, Ding et al. [16] predicted PV output power from meteorological data obtained from weather forecasts. However, physical models are highly dependent on data availability, which often limits their forecasting performance [17]. In contrast, statistical approaches offer greater flexibility and faster computation. Vishal et al. [18] proposed a SARIMA-RVFL hybrid statistical model for very short-term solar PV power generation forecasting. Nevertheless, both physical and statistical methods have proved limited in the forecast accuracy they can achieve.

Recently, machine learning has shown great potential in PV power prediction, especially in the nonlinear correlation between PV power data and meteorological data [19]. Scott et al. [20] analysed a variety of machine learning models that are widely used in PV power forecasting, including models such as SVM and random forests, to provide insight into the advantages of different models in PV power forecasting. However, individual machine learning models often struggle to effectively capture intricate data relationships. Therefore, to improve the representation of complex time series, deep learning—particularly hybrid models that integrate various advantageous components—has increasingly become the predominant tool in the field of photovoltaic prediction [21]. Ali et al. [22] proposed two hybrid models (CNN-LSTM and ConvLSTM) to predict the power generation of PV plants. Tovar et al. [23] put forward a short-term power load forecasting model combining CNN and LSTM for photovoltaic power prediction with real data from a site in Mexico. However, none of these studies effectively address the non-stationarity of PV power data.

To address this issue, researchers have introduced signal decomposition techniques [5]. Feng et al. [24] proposed a PV power prediction model based on CEEMDAN-LSTM to improve the accuracy of PV power prediction results. Tao et al. [25] proposed sequence decomposition using VMD, where the decomposed subsequences and preprocessed historical meteorological data are fed into several LSTMs integrating one-dimensional convolution and attention mechanisms, respectively. The prediction results corresponding to each subsequence are then summed to obtain the final prediction result. Despite the application of decomposition techniques, current research has generally failed to adopt differentiated modeling approaches for the decomposed subsequences with varying physical characteristics, instead commonly employing homogeneous modeling paradigms. Table 1 summarizes these related studies, compares them with the ICIAL model proposed in this paper, and highlights the key limitations of existing methods.

Table 1. Comparative analysis of ICIAL model and advanced hybrid forecasting models.

https://doi.org/10.1371/journal.pone.0334828.t001

2.2 Position representation in time series modeling

In photovoltaic time series forecasting, effectively representing the position of data points within the sequence to capture temporal dependencies is a core challenge, particularly in non-recurrent architectures based on the attention mechanism. As illustrated in Table 1, existing models address this issue in various ways. Traditional statistical models and recurrent neural networks implicitly model temporal order through their inherent sequential processing structures. However, to harness the powerful parallel computation and feature interaction capabilities of the attention mechanism, Transformer-based models must implement explicit positional encoding schemes [8]. Currently, the most common approach is absolute positional encoding. However, a substantial body of research indicates that the precise positional information encoded in this manner may be weakened or lost after multiple layers of non-linear transformations, thereby affecting the model’s ability to learn temporal dynamics [9].

In contrast, relative positional encoding (RPE), which directly models the “query-key” relative temporal relationships between data points, has proven to be a more robust and suitable strategy for capturing dynamic changes [26]. However, in the field of photovoltaic forecasting, this crucial difference has not been adequately addressed. As clearly illustrated in the comparison presented in Table 1, most existing advanced hybrid prediction models rely on implicit positional representations or basic absolute positional encoding. To address these limitations and further enhance the model’s accuracy in capturing temporal dynamics, this paper introduces an improved relative positional encoding (IRPE) mechanism.

In summary, existing photovoltaic prediction models, as shown in Table 1, struggle to accurately capture the complex dynamics of power output due to dual bottlenecks in signal processing and temporal representation, which limits both prediction accuracy and reliability. To address these challenges, this research proposes a novel ICIAL prediction framework. This framework aims to resolve the issue of inaccurately capturing dynamic characteristics by introducing a frequency-hierarchical heterogeneous network architecture designed for the targeted modeling of signal components at different frequencies. Additionally, it incorporates an improved relative positional encoding (IRPE), which significantly enhances the model’s capacity to represent dynamic temporal dependencies, ultimately achieving greater accuracy and robustness in multi-step prediction tasks.

3 Methods

This section introduces the methodology of the study. First, a heterogeneous prediction framework based on ICEEMDAN decomposition (ICIAL) is described, including its overall workflow and the function of each module. Finally, the Wilcoxon signed-rank test, which is employed to ensure the statistical significance of the results, along with the quantile regression method for quantifying prediction uncertainty, is presented.

3.1 ICEEMDAN algorithm

The ICEEMDAN algorithm refines EMD by adding adaptive white noise. This approach reduces mode mixing, ensures frequency continuity across scales, and improves both the accuracy and interpretability of the decomposition. In the iterative decomposition process of ICEEMDAN, each IMF is extracted from the current residual, which is then passed to the subsequent step, rather than by reconstructing the original dataset. This mechanism prevents the introduction of future data when decomposing individual time series. Additionally, to avoid incorporating future information while processing long sequences, this study applies the ICEEMDAN algorithm within a sliding window, which further enhances prediction accuracy. The ICEEMDAN algorithmic process [39] is described in Table 2:

Table 2. ICEEMDAN algorithm steps with implementation details.

https://doi.org/10.1371/journal.pone.0334828.t002

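where $a^{(i)}$ and $r_1$ are given by Eqs (1) and (2):

\[ a^{(i)} = a + \beta_0\, E_1\left(w^{(i)}\right), \qquad \beta_0 = \varepsilon_0\, \frac{\sigma(a)}{\sigma\left(E_1\left(w^{(i)}\right)\right)} \tag{1} \]

\[ r_1 = \frac{1}{L} \sum_{i=1}^{L} M\left(a^{(i)}\right) \tag{2} \]

In both equations, $a^{(i)}$ represents the i-th ensemble member generated by adding the i-th set of white noise $w^{(i)}$ (which has a zero mean and unit variance) to the original sequence $a$. Here, $\varepsilon_0$ is the weighting constant for the initial noise level, $\sigma(a)$ denotes the standard deviation of $a$, $E_1(\cdot)$ extracts the first EMD mode of its argument, $M(\cdot)$ returns the local mean of a signal, $r_1$ is the first residual following the initial decomposition stage of ICEEMDAN, and $L$ represents the ensemble size.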

3.2 PSO algorithm

The Particle Swarm Optimization (PSO) algorithm is a swarm intelligence method that simulates the social foraging behavior of bird flocks. It is based on the observation that birds tend to fly in a swarm, with each bird following a set of simple rules to reach an optimal solution. In particular, the algorithm guides the movement of particles within the search space by updating their velocities and positions so as to bring them as close as possible to the optimal solution. The formulae for calculating the velocity and position of the particles are as follows [11].

\[ v_i^{k+1} = w\, v_i^{k} + c_1 r_1 \left(p_i^{k} - x_i^{k}\right) + c_2 r_2 \left(g^{k} - x_i^{k}\right) \tag{3} \]

\[ x_i^{k+1} = x_i^{k} + v_i^{k+1} \tag{4} \]

where $v_i^{k}$ and $v_i^{k+1}$ are the velocities of particle i at consecutive iterations; $w$ is the inertia weight; $x_i^{k}$ and $x_i^{k+1}$ are the positions of particle i at consecutive iterations; $c_1$ and $c_2$ are the learning factors; $r_1$ and $r_2$ are random numbers in the range 0 to 1; and $p_i^{k}$ and $g^{k}$ are the individual and global historical best positions at iteration k, respectively.

Fig 2 illustrates the detailed algorithmic flow of the PSO algorithm for the ICIAL model applications.

Fig 2. Flow chart of particle swarm optimisation algorithm.

https://doi.org/10.1371/journal.pone.0334828.g002
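The velocity and position updates of Eqs (3) and (4) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pso_minimize`, the default values of `w`, `c1`, and `c2`, the box clamping, and the toy objective are all assumptions chosen for illustration. In a hyperparameter search, `objective` would instead be a validation loss evaluated for a candidate parameter vector.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over the box `bounds` with the basic PSO update."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # individual best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best position

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive term + social term.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search box (an assumption).
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in p)


best, best_val = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the global best is only replaced when a strictly better value is found, the returned objective value never increases with the number of iterations for a fixed seed.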

3.3 Wilcoxon signed-rank test method

This study employs the Wilcoxon signed-rank test [15] to ensure statistical consistency and mitigate the randomness inherent in deep learning models. This non-parametric statistical method is designed to assess whether significant differences exist between two paired datasets, making it particularly suitable for scenarios where data distributions do not conform to normality assumptions.

The core formulation of the Wilcoxon signed-rank test is as follows. Consider two paired datasets $X_i$ and $Y_i$ ($i = 1, 2, \ldots, n$), with differences defined as $d_i = X_i - Y_i$. Samples with $d_i = 0$ are excluded, and the absolute differences $|d_i|$ are ranked in ascending order, with the ranks denoted as $R_i$. The positive rank sum $W^{+}$ and negative rank sum $W^{-}$ are calculated based on the signs of $d_i$:

\[ W^{+} = \sum_{d_i > 0} R_i, \qquad W^{-} = \sum_{d_i < 0} R_i \tag{5} \]

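The test statistic $W$ is taken as the smaller of $W^{+}$ and $W^{-}$. For large samples ($n > 20$), $W$ is approximately normally distributed, with its standardized form given by:

\[ Z = \frac{W - \mu_W}{\sigma_W} \tag{6} \]

where $\mu_W = \frac{n(n+1)}{4}$ and $\sigma_W = \sqrt{\frac{n(n+1)(2n+1)}{24}}$.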

The p-value is calculated from the standardized statistic Z. For a two-tailed test, it is determined using the cumulative distribution function (CDF) of the standard normal distribution:

\[ p = 2\left(1 - \Phi\left(|Z|\right)\right) \tag{7} \]

where $\Phi$ is the CDF of the standard normal distribution, and p is the probability, under the null hypothesis, of observing a test statistic at least as extreme as the one obtained. The significance level (α), typically set at 0.05, is the predefined threshold against which the p-value is compared to assess statistical significance.

This study employs p-values and a significance level α as evaluation criteria. A p-value less than α indicates a significant difference, whereas a p-value greater than α provides no evidence of a difference and is taken here to indicate statistical consistency. To enhance the robustness of evaluations, the median p-value is employed as a key metric. This measure reflects the central tendency of p-values across multiple experiments, is less sensitive to outliers, and more reliably represents the statistical consistency of model performance. This approach is particularly effective when analyzing repeated runs, ensuring that evaluations are not influenced by random fluctuations [41].
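The test procedure described in this subsection (drop zero differences, rank the absolute differences, form the rank sums, standardize, and convert to a two-tailed p-value) can be sketched in plain Python. The function name `wilcoxon_signed_rank` and the example data are assumptions for illustration; the sketch uses average ranks for ties and omits the tie and continuity corrections that statistical packages usually apply, so its p-values may differ slightly from library output.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-tailed Wilcoxon signed-rank test using the normal approximation.

    Returns (W, Z, p). Intended for n > 20 non-zero paired differences.
    """
    d = [a - b for a, b in zip(x, y) if a != b]  # exclude zero differences
    n = len(d)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank |d_i| in ascending order, giving tied values their average rank.
    order = sorted(range(n), key=lambda idx: abs(d[idx]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    w_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    w = min(w_plus, w_minus)

    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # standard normal CDF
    p = 2 * (1 - phi)
    return w, z, p


# Hypothetical paired values whose differences alternate in sign.
x = [i if i % 2 else 0 for i in range(1, 25)]
y = [0 if i % 2 else i for i in range(1, 25)]
w, z, p = wilcoxon_signed_rank(x, y)
```

For repeated experiments, the p-values returned by each run could be collected in a list and summarized with `statistics.median`, in line with the median p-value criterion described above.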

3.4 Uncertainty quantification and interval prediction method

To address the limitation of point prediction in quantifying the inherent uncertainties of PV power generation, this study introduces an uncertainty quantification framework based on quantile regression. This method is particularly suitable for PV forecasting because it does not require an assumed error distribution, a key advantage given that prediction errors in PV systems typically exhibit non-Gaussian characteristics [42]. By generating probabilistic prediction intervals, the approach provides critical information for risk assessment and operational decision-making in power systems.

3.4.1 Quantile regression and pinball loss function.

Quantile regression is a statistical technique used to estimate the conditional quantiles of a response variable [43]. It can model the entire conditional distribution, rather than just focusing on the conditional mean as in traditional regression. This offers significant advantages in photovoltaic power forecasting: First, it does not require assumptions about the distribution of prediction errors, thus demonstrating strong robustness when dealing with the skewed and heavy-tailed error distributions commonly found in photovoltaic data due to weather fluctuations. Second, by modeling the upper and lower bounds at a given confidence level, it can directly generate prediction intervals. To train a model for quantile prediction, a quantile loss function (also known as the pinball loss function) must be used. This function applies asymmetric penalties for overestimation and underestimation, guiding the model to learn specific conditional quantiles. For a target quantile q (where q is between 0 and 1), the loss function Lq is defined as follows:

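\[ L_q\left(y, \hat{y}_q\right) = \begin{cases} q\,\left(y - \hat{y}_q\right), & y \ge \hat{y}_q \\ (1 - q)\,\left(\hat{y}_q - y\right), & y < \hat{y}_q \end{cases} \tag{8} \]

where $y$ denotes the actual value, $\hat{y}_q$ represents the predicted quantile value, and $q$ is the target quantile.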
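The asymmetric penalty of the pinball loss can be made concrete with a short sketch. The function name `pinball_loss` and the averaging over samples are assumptions for illustration; a training framework would apply the same expression to tensors.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a target quantile q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-prediction (diff >= 0) is weighted by q,
        # over-prediction (diff < 0) is weighted by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# For q = 0.9, predicting too low costs nine times more than predicting too high.
loss_under = pinball_loss([10.0], [8.0], 0.9)   # 0.9 * 2
loss_over = pinball_loss([10.0], [12.0], 0.9)   # 0.1 * 2
```

With q = 0.5 the loss reduces to half the mean absolute error, which is why the median output can serve as the point forecast.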

3.4.2 Interval prediction evaluation metrics.

To quantify the performance of the generated prediction intervals, two standard metrics are employed: Prediction Interval Coverage Probability (PICP) and Prediction Interval Average Width (PIAW).

PICP measures the percentage of actual values that fall within the prediction intervals, reflecting the reliability of the predictions. A higher PICP indicates more reliable intervals. The formula for PICP is:

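\[ \mathrm{PICP} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left(y_i \in [L_i, U_i]\right) \tag{9} \]

where $N$ is the total number of samples, $y_i$ is the i-th actual value, $[L_i, U_i]$ is the prediction interval for the i-th sample, and $\mathbb{1}(\cdot)$ is the indicator function.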

PIAW measures the average width of the prediction intervals. Under the premise of a satisfactory PICP, narrower intervals imply more precise and sharper predictions. The formula for PIAW is:

\[ \mathrm{PIAW} = \frac{1}{N} \sum_{i=1}^{N} \left(U_i - L_i\right) \tag{10} \]

where Ui and Li are the upper and lower bounds of the prediction interval for the i-th sample, respectively.
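Both interval metrics are simple averages over the test samples, as the following sketch shows. The function names `picp` and `piaw` and the sample values are assumptions for illustration; PICP is returned here as a fraction between 0 and 1 rather than a percentage.

```python
def picp(y_true, lower, upper):
    """Fraction of actual values that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


def piaw(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical actual values and interval bounds.
y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 2.5, 2.0, 3.0]
upper = [2.0, 3.0, 4.0, 3.5]
coverage = picp(y_true, lower, upper)   # two of four values are covered
width = piaw(lower, upper)
```

The two metrics should be read together: widening every interval raises PICP trivially, so a narrow PIAW is only meaningful once the coverage is acceptable.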

3.5 High frequency forecasting model

The high-frequency error sequence information displays significant fluctuations and high-frequency variation characteristics. Consequently, the hybrid model ConvBiGRU-IRPE-A captures these notable high-frequency fluctuations more effectively by integrating the one-dimensional convolutional layer, the BiGRU unit, the IRPE mechanism, and the attention mechanism, thereby enhancing the accuracy of the prediction results.

3.5.1 Attentional mechanism.

Scaled dot-product attention [8] is one of the most widely used attention mechanisms. It computes similarity through the dot product of query and key vectors and scales the result by the square root of the vector dimension. The Softmax function is then applied for normalisation to generate the attention weights. Finally, these weights are used to form a weighted sum of the value vectors, which gives the attention output. The standard formula for the attention calculation is:

\[ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d}}\right)V, \qquad Q = XW^{Q},\quad K = XW^{K},\quad V = XW^{V} \tag{11} \]

where $\mathrm{Attention}(Q, K, V)$ is the self-attention output; $Q$, $K$, and $V$ are the query, key, and value matrices, respectively; and $X$ is the input. The matrices $W^{Q}$, $W^{K}$, and $W^{V}$ are the trainable linear transformation matrices for Q, K, and V, respectively, and $d$ denotes the dimension of the query and key vectors.
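As an illustration of the attention calculation, the sketch below implements scaled dot-product self-attention on nested lists so that it runs without external libraries. The helper names (`matmul`, `softmax`, `scaled_dot_product_attention`) and the identity projection matrices in the example are assumptions; a practical model would use a tensor library and learned projections.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(x, w_q, w_k, w_v):
    """Return (output, weights) of single-head self-attention on input x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])                                # query/key dimension
    k_t = [list(col) for col in zip(*k)]         # transpose of K
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]   # each row sums to 1
    return matmul(weights, v), weights


# Two time steps with two features each; identity projections for clarity.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0]]
output, weights = scaled_dot_product_attention(x, identity, identity, identity)
```

Note that nothing in this computation depends on the order of the rows of `x` beyond permuting the output, which is the reason attention-based models need an explicit positional signal.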

3.5.2 Improved relative position encoding (IRPE).

Absolute position encoding is a method of generating a unique representation for each position in a sequence using the following formula:

\[ T_i = x_i + p_i \tag{12} \]

where $T_i$ denotes the model input vector at the i-th position after position coding; $x_i$ denotes the data embedding vector of the i-th position; and $p_i$ denotes the absolute position coding vector of the i-th position.

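With absolute position encoding, the encoded value of the i-th position in the even dimension 2j and the odd dimension 2j+1 is calculated as follows [9]:

\[ p_{i,2j} = \sin\left(\frac{i}{10000^{2j/d}}\right), \qquad p_{i,2j+1} = \cos\left(\frac{i}{10000^{2j/d}}\right) \tag{13} \]

where $d$ is the dimension of the embedding vector.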

Absolute positional coding primarily considers the absolute positions of elements within a sequence. In contrast, relative positional coding allows the model to effectively account for the positional relationships between elements during the learning process, particularly focusing on the relationships across different sequences. This approach resolves the limitations of analyzing elements solely within a single sequence.
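For reference, the absolute scheme of Eqs (12) and (13) can be sketched as follows, assuming the widely used sinusoidal form with base 10000. The function name `sinusoidal_positional_encoding` and the toy embeddings are assumptions for illustration.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position codes."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        # `j` walks over the even dimensions; j + 1 is the paired odd dimension.
        for j in range(0, d_model, 2):
            angle = pos / (10000 ** (j / d_model))
            pe[pos][j] = math.sin(angle)
            if j + 1 < d_model:
                pe[pos][j + 1] = math.cos(angle)
    return pe


# Adding the code to toy embeddings, as in Eq (12).
embeddings = [[0.5, 0.5, 0.5, 0.5] for _ in range(3)]
codes = sinusoidal_positional_encoding(3, 4)
encoded = [[e + p for e, p in zip(e_row, p_row)]
           for e_row, p_row in zip(embeddings, codes)]
```

Each position receives a fixed code that depends only on its index, so any notion of distance between two positions has to be inferred by the model from the sum of embedding and code.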

This paper proposes an improved XLNet-style [44] position coding method that introduces two trainable relative position parameters, u and v, which enter the attention score through a linear transformation. These parameters are optimized continuously via backpropagation throughout the training phase. In each training round, the model updates u and v based on the training weights, producing new values $u_1$ and $v_1$, which then serve as the initial values for the subsequent round. As training progresses, the weights continue to evolve and u and v are progressively refined until the model is optimally trained, reaching $u_n$ and $v_n$. Additionally, the positional information between the sequences is adapted to facilitate the dynamic adjustment of the u and v values. This process optimises relative position coding, thereby improving its ability to capture detailed relative position relationships between different time points. The improved relative position coding formula, with the linear transformation of trainable parameters, is as follows:

\[ S_{i,j} = q_i^{\top} k_j + q_i^{\top} R_{ij} + u_i^{\top} k_j + v_i^{\top} R_{ij} \tag{14} \]

where $S_{i,j}$ denotes the attention score computed between the i-th element of Q and the j-th element of K; $R_{ij}$ denotes the relative positional distance between the query vector $q_i$ and the key vector $k_j$; and $u_i$ and $v_i$ are two trainable parameter vectors.
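A hedged sketch of an XLNet-style relative attention score is given below. It is not the authors' IRPE implementation: it assumes the four-term decomposition known from Transformer-XL and XLNet (content-content, content-position, global content bias, global position bias), a single shared pair of vectors `u` and `v`, and a hypothetical sinusoidal embedding of the offset i - j. In a trained model, `u` and `v` would be updated by backpropagation rather than set by hand.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def relative_embedding(offset, d):
    """Hypothetical sinusoidal embedding of the signed offset i - j."""
    vec = []
    for m in range(0, d, 2):
        angle = offset / (10000 ** (m / d))
        vec.append(math.sin(angle))
        if m + 1 < d:
            vec.append(math.cos(angle))
    return vec


def relative_attention_scores(q, k, u, v):
    """Unnormalised scores S[i][j] with relative position terms."""
    d = len(q[0])
    scores = []
    for i, q_i in enumerate(q):
        row = []
        for j, k_j in enumerate(k):
            r_ij = relative_embedding(i - j, d)
            row.append(dot(q_i, k_j)      # content-content
                       + dot(q_i, r_ij)   # content-position
                       + dot(u, k_j)      # global content bias (u)
                       + dot(v, r_ij))    # global position bias (v)
        scores.append(row)
    return scores


# With zero queries and u = 0, only the v-term remains, so the score
# depends solely on the offset i - j and not on absolute positions.
q = [[0.0, 0.0] for _ in range(3)]
k = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = relative_attention_scores(q, k, u=[0.0, 0.0], v=[0.0, 1.0])
```

The example isolates the position bias term to show the defining property of relative encoding: pairs of time steps separated by the same lag receive the same positional contribution.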

3.5.3 ConvBiGRU-IRPE-A.

The ConvBiGRU-IRPE-A model integrates several advanced components to achieve precise predictions. First, a Conv1D serves as the initial feature extractor, capturing local fluctuation patterns in photovoltaic power caused by abrupt weather changes. Subsequently, a BiGRU processes the sequence in both forward and reverse directions to capture comprehensive contextual information, thereby modeling long-term dependencies and overall trends [45]. Building on this, an attention mechanism dynamically assigns weights to different time steps, allowing the model to focus on the most critical historical information. Finally, to address the limitations of standard attention mechanisms in perceiving sequence order, this study designed an IRPE. By dynamically encoding the relationships between time steps, the IRPE significantly enhances the model’s representation of temporal dynamics.

Fig 3 illustrates the workflow of the model. Initially, the high-frequency error sub-series is input into a Conv1D layer for local feature extraction, followed by a fully connected layer for feature integration. The processed features are then passed to a stacked BiGRU network to capture bidirectional temporal dependencies. A Dropout layer is inserted between consecutive BiGRU layers to randomly deactivate a fraction of the neurons, effectively reducing model complexity and mitigating overfitting. Subsequently, the IRPE mechanism injects dynamic relative position information into the output states of the BiGRU. These temporally informed states are sent to an attention layer that computes and assigns weights to emphasize critical time steps. Finally, the weighted features are mapped to the final output space via another fully connected layer, generating the predicted values for the high-frequency error series.

Fig 3. The architectural design of the ConvBiGRU-IRPE-A model.

https://doi.org/10.1371/journal.pone.0334828.g003

Fig 3 is divided into two sections. The upper section illustrates the primary data flow of the model, detailing the core components from the Input Layer to the Output Layer, which include the Conv1D layer, the BiGRU network, Dropout layers, and the Attention Layer. In this diagram, ht represents the hidden state output of the BiGRU at time step t, while Wt denotes the corresponding weight computed for this state by the attention mechanism. Meanwhile, the lower section explains the principles of the IRPE mechanism. This mechanism dynamically generates XLNET-style relative position encodings by continuously updating two trainable base parameters, u and v, during each training epoch. This process significantly enhances the model’s ability to capture temporal dynamics.
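The roles of ht and Wt described above can be illustrated with a minimal NumPy sketch of attention pooling over a sequence of hidden states. This is an illustration only, not the authors' implementation: the scoring vector `score_vector` and the random inputs are hypothetical, and the IRPE position term is deliberately omitted.

```python
import numpy as np

def attention_pool(hidden_states, score_vector):
    """Weight T hidden states (T, d) by softmax scores and return (weights, context)."""
    scores = hidden_states @ score_vector        # one scalar score per time step
    scores = scores - scores.max()               # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ hidden_states            # weighted sum of hidden states, shape (d,)
    return weights, context

rng = np.random.default_rng(0)
h = rng.normal(size=(12, 8))       # stand-in for BiGRU outputs h_t, t = 1..12
q = rng.normal(size=8)             # hypothetical learnable scoring vector
W, context = attention_pool(h, q)  # W[t] plays the role of W_t in Fig 3
```

The weights sum to one, so the context vector is a convex combination of the hidden states; time steps with larger scores contribute more to the final prediction.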

3.6 Low frequency forecasting model

Low-frequency summed series are typically characterised by slow variation and pronounced trends or periodic patterns. A traditional LSTM structure is therefore adopted as the forecasting model for the low-frequency summed series, since its gated memory is well suited to capturing and retaining these characteristics [46].

3.6.1 LSTM.

LSTM was first proposed by Hochreiter and Schmidhuber. Its core idea is to alleviate the vanishing-gradient problem by combining a ‘gate’ structure that handles short-term memory with a cell state that carries long-term memory. The architecture of the LSTM is illustrated in Fig 4, and the computational procedures of the different gates are as follows [47]:

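$$f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \tag{15}$$

$$i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \tag{16}$$

$$\tilde{C}_t = \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \tag{17}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{18}$$

$$o_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \tag{19}$$

$$h_t = o_t \odot \tanh(C_t) \tag{20}$$

Here $h_{t-1}$ and $h_t$ are the hidden-layer states at the previous and current time steps; $C_{t-1}$ and $C_t$ are the previous and current cell memory states, respectively; $f_t$, $i_t$, $\tilde{C}_t$, and $o_t$ are the activation vectors of the forget gate, the input gate, the cell input, and the output gate; $W_f$, $W_i$, $W_c$, and $W_o$ denote the corresponding weight matrices applied to the input vector $x_t$ together with $h_{t-1}$; $b_f$, $b_i$, $b_c$, and $b_o$ are the bias vectors of the respective gates or cells; $\odot$ denotes the Hadamard product; and $\sigma$ and $\tanh$ denote the sigmoid and hyperbolic tangent activation functions, respectively.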
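As a worked illustration of the gate computations, the following NumPy sketch performs LSTM updates in the standard formulation (forget gate, input gate, cell input, cell state, output gate, hidden state). The dimensions, random weights, and the choice to concatenate the previous hidden state with the input are assumptions made for the example, not values taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM update. W[k] has shape (hidden, hidden + input); b[k] has shape (hidden,)."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate cell input
    c_t = f_t * c_prev + i_t * c_tilde         # new cell state (Hadamard products)
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4                             # hypothetical sizes
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):         # run five time steps
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the hidden state is an output gate in (0, 1) multiplied by a tanh, every component of `h` stays strictly inside (-1, 1).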

4 Experiment

Two distinct datasets were employed in this study, and pre-processing operations, including the filling of missing values, handling of outliers, and normalization of data, were conducted on each of these two datasets. Subsequently, the ICEEMDAN algorithm was applied to the pre-processed datasets. The goal of these operations is to ensure data integrity and reliability for subsequent analysis.

4.1 DKASC dataset

The Desert Knowledge Australia Solar Centre (DKASC) is situated in the arid desert environment of Alice Springs in the Northern Territory. The region receives little precipitation and has one of the highest solar resources in Australia. The dataset is publicly available [48]. The data analysed in this study were collected between 1 January 2019 and 31 December 2019. Table 3 presents the technical particulars of the PV site. The temporal resolution of the data is 5 minutes. The training set comprised 80% of the full-year 2019 data, the validation set 15%, and the test set 5%. The PV power prediction models were trained on these data, and the resulting predictions were compared with the actual data.

Table 3. Technical parameters of PV power generation system.

https://doi.org/10.1371/journal.pone.0334828.t003

The PCC between PV power and each weather variable was calculated [10], and a heat map was generated. The results demonstrate a notable interdependence between PV power and irradiance. The variables whose absolute correlation coefficient in the heat map is greater than 0.1 were selected as the weather influencing factors for the experiment: global horizontal irradiance, scattered radiation, relative humidity, air temperature, and wind speed.
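The selection rule can be sketched with pandas as follows. This is a minimal illustration under stated assumptions: the column names (`power`, `ghi`, `noise`) and the synthetic data are hypothetical, and only the 0.1 threshold on the absolute correlation comes from the text.

```python
import numpy as np
import pandas as pd

def select_by_pcc(df, target, threshold=0.1):
    """Return columns whose absolute Pearson correlation with `target` exceeds `threshold`."""
    corr = df.corr(method="pearson")[target].drop(target)
    return corr[corr.abs() > threshold].index.tolist()

# Synthetic stand-in data; real inputs would be the measured weather variables.
rng = np.random.default_rng(0)
n = 500
ghi = rng.uniform(0, 1000, n)
demo = pd.DataFrame({
    "power": 0.005 * ghi + rng.normal(0, 0.2, n),  # power driven mainly by irradiance
    "ghi": ghi,
    "noise": rng.normal(size=n),                   # unrelated variable
})
selected = select_by_pcc(demo, "power")
```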

4.2 Solar I dataset

The Solar I dataset was derived from the Renewable Energy Power Generation Forecasting Competition [49], which was organised by the State Grid of China. Compared with the DKASC dataset, the Solar I dataset is smaller and contains more anomalous weather conditions. It comprises PV data from eight solar stations located within a single geographical region of China, spanning 2019 and 2020. The 2019 PV power generation data from Solar Station 1, which has a nominal capacity of 50 MW, is used as the experimental dataset, with a time step of 15 minutes. The dataset was divided into three subsets: the training set (70% of the full-year 2019 data), the validation set (10%), and the test set (the remaining 20%). The training data were used to fit the PV power prediction model, and the resulting predictions were compared with the actual data.
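The two partitions can be expressed as index ranges. The sketch below assumes the splits are chronological (the text does not say whether the data were shuffled) and uses the post-cleaning row counts reported in Section 4.3.1; the helper name is hypothetical.

```python
def chronological_split(n_rows, train_frac, val_frac):
    """Return (train, val, test) slices that partition range(n_rows) in time order."""
    train_end = round(n_rows * train_frac)
    val_end = train_end + round(n_rows * val_frac)
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n_rows)

# Row counts after pre-processing, as reported in Section 4.3.1.
dkasc_train, dkasc_val, dkasc_test = chronological_split(104_256, 0.80, 0.15)
solar_train, solar_val, solar_test = chronological_split(35_040, 0.70, 0.10)
```

Keeping the test set at the end of the year avoids leaking future observations into training, which matters for time-series forecasting.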

4.3 Preprocessing

The data were pre-processed using characteristic equations to fill in missing values and remove anomalies, so that no missing or anomalous values remained in the dataset [50]. The data were then normalised to a uniform scale to facilitate subsequent modelling and experimentation. Finally, the ICEEMDAN algorithm was used to decompose the PV power series in the normalised dataset, which improves the precision and reliability of the subsequent analysis.

4.3.1 Feature engineering.

Raw data inevitably contain missing values as a consequence of equipment wear and maintenance. Gaps of up to 10 consecutive data points are filled by linear interpolation from the surrounding data. Where more than 10 consecutive data points are missing, for instance because of equipment malfunction or scheduled maintenance, the data for the affected days are removed. After this pre-processing, the DKASC dataset comprises 104,256 data rows, while the Solar I dataset contains 35,040 data rows.
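The gap-handling rule can be sketched with pandas as below. This is one possible reading of the procedure, not the authors' code: the function name, the synthetic series, and the decision to drop every calendar day touched by a long gap are assumptions.

```python
import numpy as np
import pandas as pd

def fill_or_drop_gaps(series, max_gap=10):
    """Interpolate NaN runs of length <= max_gap; drop whole days containing longer runs."""
    is_na = series.isna()
    run_id = (is_na != is_na.shift()).cumsum()                     # label consecutive runs
    run_len = is_na.astype(int).groupby(run_id).transform("sum")   # NaN-run length per row
    long_gap = (is_na & (run_len > max_gap)).to_numpy()
    days = series.index.normalize()
    bad_days = days[long_gap].unique()                             # days touched by a long gap
    kept = series[~days.isin(bad_days)]
    return kept.interpolate(method="linear")

# Three synthetic days at 5-minute resolution (288 points per day).
idx = pd.date_range("2019-01-01", periods=3 * 288, freq="5min")
power = pd.Series(np.arange(len(idx), dtype=float), index=idx)
power.iloc[10:15] = np.nan      # 5-point gap on day 1: interpolated
power.iloc[300:320] = np.nan    # 20-point gap on day 2: whole day removed
clean = fill_or_drop_gaps(power)
```

In this example the second day is removed, leaving two full days with no missing values.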

4.3.2 Data normalization.

To eliminate the impact of different data scales and to account for the distinct physical boundaries of photovoltaic (PV) power data (e.g., zero power at night), this study employs Min-Max Scaling for data preprocessing [10]. This method linearly maps the original data, x, to the interval [0, 1] using the following formula:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{21}$$

Where $x$ represents the original data, $x'$ is the normalized data, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the original dataset, respectively. The mapping preserves the shape of the original distribution and respects the physical lower bound of the data, which is beneficial for the subsequent quantile regression interval forecasting task. All predicted results are inversely normalized prior to the calculation of evaluation metrics.
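A minimal NumPy sketch of Min-Max scaling and its inverse follows. The example values are hypothetical; in practice the minimum and maximum would typically be fitted on the training portion only to avoid information leakage, which is an assumption here because the text refers to the original dataset.

```python
import numpy as np

def fit_minmax(train):
    """Return (x_min, x_max) computed from the fitting data."""
    return float(np.min(train)), float(np.max(train))

def transform(x, x_min, x_max):
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def inverse_transform(x_scaled, x_min, x_max):
    return np.asarray(x_scaled, dtype=float) * (x_max - x_min) + x_min

power_kw = np.array([0.0, 1.5, 3.0, 6.0])        # hypothetical PV power values in kW
lo, hi = fit_minmax(power_kw)
scaled = transform(power_kw, lo, hi)             # zero night-time power maps to 0
restored = inverse_transform(scaled, lo, hi)     # applied to predictions before scoring
```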

4.3.3 ICEEMDAN decomposition results of datasets.

The PV power sequence was decomposed in a stepwise manner using the ICEEMDAN algorithm. To obtain the multi-scale modal components of the original power series, the signal-to-noise ratio was set to 0.2 and the maximum number of iterations to 100. This yielded 17 IMFs and 1 residual (Res). Fig 5 shows the decomposition of the 2019 PV power series for the DKASC dataset, which comprises 104,256 data points. The zero-crossing rate [51] of each component (i.e., how often the signal crosses zero within a specified time interval) is used to characterise its frequency, and is calculated in accordance with Eq (22). Based on whether the zero-crossing rate is greater than or equal to 0.02 [52], the components are divided into nine high-frequency sequences and eight low-frequency sequences, plus a Res sequence that cannot be further decomposed. The remaining component of the original PV power sequence that is not involved in decomposition is defined as the error component [53]. The high-frequency sub-sequences generally have higher volatility and account for sudden changes in the original PV power sequence. In contrast, the low-frequency sub-sequences change more smoothly and reflect the periodicity of the original PV power sequence. The residual component is a monotonic curve, and its amplitude in 2019 is about 0.0025 kW. An error component usually fluctuates at high frequency; in this study, however, it fluctuates only between -0.1 kW and 0.1 kW, which supports the effectiveness of the ICEEMDAN method in decomposing the signal, capturing its main components, and reducing the impact of noise.

$$Z_i = \frac{n_i}{N_i} \tag{22}$$

Where $Z_i$ is the zero-crossing rate of the i-th decomposition sequence, $n_i$ is the number of times the sequence crosses zero, and $N_i$ is the total number of data points in the sequence.
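The grouping rule can be sketched as follows. The two sinusoids stand in for decomposed components and are purely illustrative; the 0.02 threshold is taken from the text, while counting a crossing as a sign change between adjacent samples is an assumption about how crossings are detected.

```python
import numpy as np

def zero_crossing_rate(component):
    """Number of sign changes between adjacent samples divided by the sequence length."""
    signs = np.sign(np.asarray(component, dtype=float))
    crossings = np.count_nonzero(signs[:-1] * signs[1:] < 0)
    return crossings / len(signs)

def split_by_frequency(components, threshold=0.02):
    """Return index lists (high, low) according to the zero-crossing-rate threshold."""
    high, low = [], []
    for k, comp in enumerate(components):
        (high if zero_crossing_rate(comp) >= threshold else low).append(k)
    return high, low

t = np.arange(1000)
fast = np.sin(2 * np.pi * t / 10 + 0.3)    # crosses zero about 200 times
slow = np.sin(2 * np.pi * t / 500 + 0.3)   # crosses zero only a few times
high_idx, low_idx = split_by_frequency([fast, slow])
```

The fast component has a rate of roughly 0.2 and is assigned to the high-frequency group, whereas the slow component falls well below 0.02 and is assigned to the low-frequency group.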

The ICEEMDAN algorithm was also applied to the PV power series in the Solar I dataset, yielding 16 IMFs and 1 Res. Fig 5 illustrates the decomposition of the PV power series in the 2019 Solar I dataset, which comprises 35,040 data points. Under the zero-crossing-rate criterion of 0.02, the decomposition results comprise eight high-frequency sub-sequences, seven low-frequency sub-sequences, and one residual sequence. The residual sequence amplitude in 2019 is approximately 0.005 kW, while the error component ranges from -0.5 kW to 0.5 kW. These findings indicate that the ICEEMDAN approach remains effective even when applied to sequences containing anomalous meteorological data.

4.4 Parameter settings

Hyperparameter tuning is essential for achieving optimal model performance; however, an exhaustive search can be computationally prohibitive. To balance efficiency with effectiveness and enhance experimental reproducibility, this study employed a two-stage optimization strategy. First, an evidence-based initial search space for the core hyperparameters was established through a systematic review of recent literature on photovoltaic forecasting [54–57]. Subsequently, within this empirically defined range, the PSO algorithm was utilized for efficient automated tuning to determine the optimal parameter combination for the quantile regression interval forecasting task. The resulting hyperparameters are detailed in Table 4.

Following a series of experimental trials, the number of epochs was set to 100. The batch size was 64 and Adam was selected as the optimizer. Initially, the learning rate was set to 0.001 and then it gradually decayed.
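The second tuning stage can be illustrated with a compact particle swarm optimiser that searches within fixed bounds. Everything specific in this sketch is an assumption: the swarm settings (`w`, `c1`, `c2`), the two tuned quantities, their bounds, and the toy objective that stands in for a validation loss are hypothetical and are not the values used in the study.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over a box defined by `bounds` = [(low, high), ...]."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    pos = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                   # personal bests
    best_val = np.array([objective(p) for p in pos])
    g = best_pos[best_val.argmin()].copy()                  # global best
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)                    # stay inside the search space
        val = np.array([objective(p) for p in pos])
        improved = val < best_val
        best_pos[improved], best_val[improved] = pos[improved], val[improved]
        g = best_pos[best_val.argmin()].copy()
    return g, float(best_val.min())

def toy_validation_loss(params):
    """Hypothetical stand-in for a validation loss over (log10 learning rate, hidden units)."""
    log_lr, hidden = params
    return (log_lr + 3.0) ** 2 + ((hidden - 64.0) / 64.0) ** 2

best, loss = pso_minimize(toy_validation_loss, bounds=[(-5, -1), (16, 256)])
```

In a real tuning run each objective evaluation would train and validate a model, so the number of particles and iterations directly controls the computational budget.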

4.5 Model training

Ablation experiments were conducted on the ICIAL model, and the outcomes were compared. Furthermore, the benchmark models (LSTM, TCN [11], Informer [58], ITransformer, and DA-GRU) and the ICIAL model proposed in this paper were trained and tested on the DKASC and Solar I datasets. The evaluation metrics of the benchmark and ICIAL models under various weather conditions are then compared. Additionally, the evaluation metrics and visualisation results for these models in single-step and multi-step prediction tasks are presented, highlighting the differences in prediction accuracy among the benchmark and ICIAL models.

The hardware environment used in this experiment comprises an Intel Core i9-13980HX CPU and 16 GB of RAM. The software environment is Windows 11 with Python 3.9.18 and PyTorch 2.0.0.

5 Evaluation

The goal of this section is to evaluate the prediction performance, robustness, and uncertainty quantification of each model for practical applications. The evaluation begins with ablation experiments to assess the contributions of the ICIAL model’s individual components. Next, the ICIAL model is benchmarked against baselines for single-step prediction accuracy, with further tests assessing its adaptability under varied weather conditions. This is followed by multi-step prediction experiments across different datasets and strategies to explore the model’s long-horizon forecasting capability. To ensure the statistical reliability of these findings, Wilcoxon signed-rank tests are employed to verify the model’s robustness against the randomness inherent in deep learning and PSO algorithms. Finally, to overcome the limitations of traditional point predictions, the study introduces uncertainty quantification by generating and evaluating probabilistic prediction intervals under various weather conditions, assessing their reliability (PICP) and sharpness (PIAW) to demonstrate the model’s practical capabilities.
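The two interval metrics named above can be sketched as follows. PICP is taken here as the fraction of observations falling inside the predicted interval and PIAW as the mean interval width in the original units; whether the study normalises the width is not stated in this section, so the un-normalised form is an assumption, as are the example values.

```python
import numpy as np

def picp(y_true, lower, upper):
    """Prediction interval coverage probability: share of targets inside [lower, upper]."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

def piaw(lower, upper):
    """Prediction interval average width (un-normalised)."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))

y = np.array([1.0, 2.0, 3.0, 4.0])       # hypothetical observations
lo = np.array([0.5, 1.5, 3.2, 3.0])      # hypothetical lower bounds
hi = np.array([1.5, 2.5, 4.0, 5.0])      # hypothetical upper bounds
coverage = picp(y, lo, hi)               # third observation falls below its interval
width = piaw(lo, hi)
```

A useful interval forecast combines high coverage with narrow width, so the two metrics are read together rather than in isolation.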

5.1 Evaluation metrics of model performance

The experiment employed various evaluation metrics to assess the predictive performance of different models. The formulas for calculating MAE, nMAE, RMSE, nRMSE, and SMAPE are as follows [59]:

(23)(24)(25)(26)(27)

where $y_i$, $\hat{y}_i$, and $\bar{y}$ represent the true value, predicted value, and mean of the true values, respectively, while n denotes the total number of observations. Lower values of MAE, nMAE, RMSE, nRMSE, and SMAPE indicate higher accuracy. MAE and RMSE measure the average absolute error and the root mean squared error, respectively. nMAE normalizes MAE by the mean of the true values, whereas nRMSE normalizes RMSE by the range of the true values, facilitating comparisons across different scales. SMAPE, a symmetric percentage error, is less sensitive to extreme values and, with a bias of 0.1, remains stable when both the true and predicted values are zero.
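The five metrics can be computed as in the sketch below. The normalisations follow the description above (mean of the true values for nMAE, range of the true values for nRMSE). The exact SMAPE form is an assumption: here the 0.1 bias is added to the denominator so that the term is defined when both values are zero, and the result is reported as a fraction rather than a percentage.

```python
import numpy as np

def point_metrics(y_true, y_pred, smape_offset=0.1):
    """Return MAE, nMAE, RMSE, nRMSE and an offset-stabilised SMAPE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Assumed SMAPE form: offset in the denominator avoids 0/0 at night-time zeros.
    smape = np.mean(np.abs(err) / ((np.abs(y_true) + np.abs(y_pred)) / 2 + smape_offset))
    return {
        "MAE": mae,
        "nMAE": mae / np.mean(y_true),
        "RMSE": rmse,
        "nRMSE": rmse / (np.max(y_true) - np.min(y_true)),
        "SMAPE": smape,
    }

m = point_metrics([0.0, 1.0, 2.0, 3.0], [0.0, 1.5, 2.0, 2.5])
```

As a consistency check against Table 6, dividing the reported ICIAL MAE of 0.0526 kW by its nMAE of 0.0408 gives a mean true power of about 1.29 kW, which matches the value implied by the DA-GRU figures.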

5.2 Evaluation on DKASC dataset

The ICIAL model proposed in this paper was evaluated through a series of experiments conducted on the DKASC dataset, comparing its performance against several benchmark models. These experiments included ablation studies, single-step ahead forecasting, single-step ahead forecasting under varying weather conditions, and multi-step ahead forecasting. In the multi-step ahead forecasting experiments, three distinct strategies were employed to address the prediction tasks: the recursive strategy, the direct strategy, and the MIMO strategy. A comparative analysis of experimental metrics and visualization results across the three forecasting strategies demonstrates that the proposed ICIAL model significantly outperforms other benchmark models in all forecasting tasks, confirming its greater capability in PV power forecasting.

5.2.1 Ablation experiment.

Ablation experiments were conducted on the ICIAL model, and the results were compared. The models were trained in the experimental environment and settings described above, using the previously determined optimal hyperparameters. Subsequently, the trained models were employed to predict the test sets. The predicted outcomes were compared to the actual values, and the results were analyzed in detail. Table 5 reports the predictive performance of each model on the dataset.

Based on the results presented in Table 5, the traditional BiGRU model lacks a mechanism to account for the varying importance of time steps, which leads to suboptimal performance. In contrast, the ConvBiGRU model enhances predictive accuracy, reducing MAE by 1.8% compared to BiGRU. The ConvBiGRU-A model, which includes an additional attention mechanism, further decreases the MAE by 4.1% relative to BiGRU. The ConvBiGRU-PE-A model, incorporating position encoding, achieves an 8.6% MAE reduction, while the ConvBiGRU-RPE-A model, which utilizes relative position encoding, reduces the MAE by 11.9% compared to BiGRU. The ICIAL model, which employs the ICEEMDAN algorithm and the IRPE mechanism, demonstrates the best performance, achieving a 20.7% reduction in MAE compared to BiGRU. This improvement stems from the model’s capacity to capture complex data relationships and learn effective feature representations.

The u/v parameter scaling in IRPE significantly enhances the model’s focusing capability under relatively stable conditions. As shown in the attention heatmaps for the Average, Peak, and Trough scenarios (Fig 6), the ICIAL model exhibits sharper and brighter diagonal patterns compared to the ConvBiGRU-RPE-A model, which uses standard RPE. This indicates that while standard RPE applies static attention to all relative positions, the u/v scaling mechanism in ICIAL enables adaptive weighting during training. This mechanism amplifies critical periodic dependencies, enhances the perception of peak-trough dynamics, and suppresses irrelevant noise, thereby improving prediction accuracy.

Fig 6. Comparison of attention heatmaps between the proposed ICIAL model and the baseline model across four typical scenarios.

https://doi.org/10.1371/journal.pone.0334828.g006

Furthermore, the proposed mechanism demonstrates robust dynamic adaptability, particularly in the Ramp scenario. When faced with rapid photovoltaic power fluctuations, the standard RPE model maintained a fixed periodic attention pattern similar to other scenarios. In contrast, the ICIAL model adapted dynamically: while preserving key periodic memories through sub-diagonals, it substantially intensified the brightness of the main diagonal. This demonstrates that the model can shift its attentional focus to recent historical data in response to input dynamics, allowing it to accurately capture the “slope” and local trends of the changes.

5.2.2 One-step ahead forecasting.

The ICIAL model was compared with benchmark models in a one-step-ahead forecasting experiment. Fig 7 illustrates the prediction results and the corresponding scatter distributions for each model. Table 6 presents the MAE, nMAE, RMSE, nRMSE and SMAPE values for the various models in the one-step-ahead prediction.

The ICIAL model demonstrates the highest prediction accuracy among all benchmark models, and its prediction curve closely aligns with the true value of photovoltaic power (see Fig 7(a)). The scatter plots in Fig 7(b)–7(f) illustrate the ICIAL model’s high accuracy, as its predicted values form the most compact cluster compared to all other models. Specifically, as detailed in Table 6, ICIAL has made significant improvements in all key metrics when compared to the second-best performing model, DA-GRU. ICIAL’s nMAE is 0.0408, which is 4.7% lower than DA-GRU’s nMAE of 0.0428. Additionally, its MAE is 0.0526 kW, representing a 4.5% reduction compared to DA-GRU’s MAE of 0.0551 kW. Furthermore, the nRMSE and RMSE were also decreased by 2.8% and 1.9%, respectively.

Furthermore, the comparison with the worst-performing model, LSTM, underscores the robustness of the ICIAL model. The data presented in Table 6 and the visualization results in Fig 7(a) demonstrate that LSTM’s predictions exhibit significant deviations from the actual values during periods of high volatility. In contrast, ICIAL’s nMAE and RMSE are 9.3% and 8.0% lower than those of LSTM, respectively. This indicates that LSTM has a limited capacity to capture complex, high-dimensional temporal dependencies, whereas ICIAL’s architectural design allows it to adapt more effectively to rapid changes, thereby significantly reducing prediction errors.

To thoroughly evaluate the predictive performance of the ICIAL model in real-world scenarios, comparative experiments were conducted under three different lighting conditions: gradual lighting, fluctuating lighting, and intense lighting.

As shown in Table 7, the prediction accuracy of the ICIAL model surpasses that of all benchmark models under stable, fluctuating, and highly variable lighting conditions. Notably, under highly variable conditions characterized by significant lighting fluctuations, the performance enhancement of the ICIAL model is particularly pronounced, with its MAE and RMSE optimized by approximately 2.3% and 3.5%, respectively, compared to the LSTM model. Among the benchmark models, the DA-GRU model demonstrates relatively strong performance under fluctuating conditions, outperforming all other models except for ICIAL, with an MAE of 0.0249 kW, representing a 50.1% reduction compared to the ITransformer model. However, the overall predictive performance of the DA-GRU model remains inferior to that of the ICIAL model. This discrepancy primarily arises from the DA-GRU model’s limited adaptability to rapid fluctuations, inadequate extraction of short-term variational features, and its tendency to overfit long-term trends.

Table 7. Comparison of metrics for multiple models in three weather conditions.

https://doi.org/10.1371/journal.pone.0334828.t007

In contrast, the outstanding performance of the ICIAL model under these complex conditions can be primarily attributed to its synergistic high- and low-frequency prediction model, as well as the IRPE mechanism integrated into its architecture. These innovations allow the ICIAL model to more effectively capture the intricate multi-dimensional features of photovoltaic power data and adapt to its inherent variability and non-linear dynamics, thereby significantly enhancing prediction accuracy.

5.2.3 Multi-step ahead forecasting.

In smart grids, multi-step forecasting is essential for optimizing renewable energy use, maintaining grid stability, and conducting reliable electricity market bidding [12]. To provide a more comprehensive performance assessment, this study introduces a method of uncertainty quantification based on quantile regression as an alternative to traditional point forecast comparisons. The median of the quantile regression (50th percentile) is utilised as the new point forecast value [42]. This value is then benchmarked against baseline models under multi-step forecasting horizons using recursive, direct, and MIMO multi-step forecasting strategies. This approach enables us to evaluate not only predictive accuracy but also forecast uncertainty, providing a more robust assessment of a model’s long-term forecasting capabilities across various strategies.
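Quantile regression is commonly trained with the pinball (quantile) loss, and the sketch below shows why the 50th-percentile output can serve as a point forecast: at q = 0.5 the loss equals half the mean absolute error. The use of the pinball loss in the study is an assumption, and the sample values are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

y = np.array([1.0, 2.0, 4.0])       # hypothetical observations
pred = np.array([1.5, 2.0, 3.0])    # hypothetical predictions
loss_median = pinball_loss(y, pred, 0.5)
loss_upper = pinball_loss(y, pred, 0.9)   # penalises under-prediction more heavily
mae = float(np.mean(np.abs(y - pred)))
```

Training one output per quantile level (for example 0.05, 0.5, and 0.95) yields both the median point forecast and the bounds of a prediction interval from the same model.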

As indicated in Table 8, the MIMO strategy generally outperforms both the Direct and Recursive strategies across all models. Notably, the ICIAL model, when paired with the MIMO strategy, achieved the highest performance across all forecast horizons. For instance, in the most challenging T+24 long-term forecast, ICIAL’s nMAE (0.1142) and RMSE (0.3382 kW) are 10.2% and 17.3% lower, respectively, than those of the runner-up model, ITransformer, demonstrating greater predictive accuracy. Although some models, such as LSTM, show a slight advantage with the Direct strategy in short-term (T+3) forecasts, the performance advantage of the MIMO strategy becomes more pronounced over longer horizons, highlighting its greater stability and accuracy.

Table 8. Advanced multi-step prediction experiments for the DKASC dataset.

https://doi.org/10.1371/journal.pone.0334828.t008

From a computational cost perspective, the Direct strategy is exceedingly expensive due to its “one model per step” mechanism. For the T+24 task, the ICIAL model under the Direct strategy requires four times the number of parameters and 3.9 times the training time compared to its MIMO counterpart. Although ICIAL’s intrinsic computational overhead (7.13 MB of parameters and approximately 39 minutes of training) is not the lowest, it is comparable to other advanced Transformer models (e.g., Informer), achieving an effective balance between acceptable costs and significant performance gains.

The results also reveal significant compatibility issues between the model and strategy. The poor performance of ITransformer and TCN under the Recursive strategy stems from a fundamental conflict between their architectures—such as ITransformer’s inversion mechanism and TCN’s reliance on parallel convolution across complete sequences—and the strategy’s iterative, point-by-point mechanism. This finding highlights the importance of aligning model architecture with the forecasting paradigm.
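The mechanics of the three strategies can be contrasted with a small NumPy sketch. A linear least-squares model is used purely as a stand-in for the neural networks, and the synthetic series, lag count, and horizon are hypothetical. With a linear model, MIMO and Direct give the same predictions; the practical differences discussed above arise with neural networks, where MIMO shares one set of learned representations across all steps.

```python
import numpy as np

def make_windows(series, n_lags, horizon):
    """Build (inputs, targets): n_lags past values -> next `horizon` values."""
    X, Y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        Y.append(series[t:t + horizon])
    return np.array(X), np.array(Y)

def fit_linear(X, Y):
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

rng = np.random.default_rng(0)
series = np.sin(np.arange(400) * 0.1) + 0.05 * rng.normal(size=400)
n_lags, horizon = 8, 4
X, Y = make_windows(series, n_lags, horizon)

# MIMO: a single model emits all horizon steps in one pass.
mimo_pred = X[-1] @ fit_linear(X, Y)

# Direct: one separate model per forecast step.
direct_pred = np.array([X[-1] @ fit_linear(X, Y[:, k]) for k in range(horizon)])

# Recursive: a one-step model is fed its own predictions, so errors can accumulate.
one_step = fit_linear(X, Y[:, 0])
window = X[-1].copy()
recursive_pred = []
for _ in range(horizon):
    nxt = window @ one_step
    recursive_pred.append(nxt)
    window = np.append(window[1:], nxt)
recursive_pred = np.array(recursive_pred)
```

The Direct loop makes the “one model per step” cost explicit: the number of fitted models grows linearly with the horizon, whereas MIMO and Recursive each fit a single model.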

The data not only confirms the high accuracy of the MIMO-ICIAL combination but also highlights the intrinsic robustness of the ICIAL model itself. A key observation is the stark performance divergence among models under the error-prone recursive strategy. In contrast to the performance collapse of the ITransformer and the significant decline of the LSTM, ICIAL’s performance degradation is far more controlled and gradual (T+24 nMAE: 0.2004). This suggests that ICIAL’s unique decomposition and hierarchical architecture provide a stronger inherent capability to suppress noise and error propagation.

Furthermore, an evaluation of the ICIAL model’s performance under different multi-step forecasting strategies, as detailed in Fig 8 and Table 8, reveals its robust stability. Among all strategies, the MIMO strategy exhibited the highest long-term stability: from T+3 to T+24, it demonstrated the smallest increases in nMAE (0.0606) and nRMSE (0.0268), while its absolute error values consistently remained the lowest. The Direct strategy followed with moderate stability (nMAE/nRMSE increases of 0.0745/0.0393), whereas the Recursive strategy resulted in the largest final absolute errors due to error accumulation. The heatmap in Fig 8 visually corroborates these performance disparities. Under the MIMO strategy, the predicted points are compactly clustered around the diagonal identity line (ground truth) across all time steps. Although the Direct strategy performs comparably to MIMO in the short term (T+3), its point cloud becomes significantly more dispersed as the forecast horizon extends, revealing a key limitation in its long-range capability. In contrast, the point cloud of the Recursive strategy is the most scattered and exhibits systematic bias resulting from error propagation. More critically, the ICIAL model demonstrates profound intrinsic robustness even under the inherently flawed Recursive strategy. Compared to the baseline models referenced in Table 8, ICIAL’s performance degrades more gracefully, avoiding a collapse of the core predictive trend. This behavior highlights the architecture’s robust feature extraction and noise resistance, which enables it to maintain relatively stable performance even under such adverse conditions.

Fig 8. Density heatmaps of multi-step-ahead predictions under different strategies for the ICIAL model.

https://doi.org/10.1371/journal.pone.0334828.g008

Finally, Fig 9 presents a visualisation of the predicted values under representative weather scenarios. Under relatively stable weather conditions (e.g., T+6, T+24), ICIAL’s prediction curve demonstrates a high degree of fidelity to the actual values (grey shaded area). It particularly excels in peak prediction accuracy, effectively addressing the peak underestimation problem commonly observed in other baseline models, such as LSTM. Conversely, in abrupt weather scenarios with drastic changes in photovoltaic power (e.g., T+3, T+12), ICIAL exhibits a strong ability to capture dynamic trends. Its prediction curve (solid blue line) closely tracks rapid fluctuations and sharp drops in power, showing significantly less prediction lag compared to other models. Furthermore, as the forecast horizon extends (from T+3 to T+24), ICIAL’s prediction curve remains consistently smoother and more stable than those of the other baseline models. This is particularly evident under the Recursive strategy, where ICIAL’s output retains its overall shape, while other models exhibit severe noise fluctuations due to error accumulation, once again demonstrating the robustness of its architecture. This capability for peak prediction and rapid trend response holds significant practical importance, as it directly contributes to optimising resource allocation, enhancing operational efficiency, and improving adaptability to market volatility.

5.3 Evaluation on Solar I dataset

To mitigate the limitations arising from a single dataset, this study conducts experiments across multiple datasets to enhance the reliability and generalizability of the results. Compared to the DKASC dataset, the Solar I dataset is smaller in scale and exhibits greater volatility, posing more stringent demands on the model’s feature extraction and temporal information processing capabilities. Consequently, evaluations on the Solar I dataset provide a more rigorous test of the model’s robustness and adaptability.

As illustrated in Table 9, the experimental results reproduce the core conclusions observed on the DKASC dataset: the ICIAL model employing the MIMO strategy remains the best-performing combination, demonstrating strong cross-dataset adaptability. Taking the T+192 (2880-minute) long-term forecast as an example, the ICIAL model’s nMAE is 61.2% lower than that of the classic LSTM-Recursive baseline, and its nRMSE is 68.7% lower than that of the worst-performing TCN-Recursive combination. Furthermore, it surpasses the strongly performing DA-GRU model (MIMO strategy), with nMAE and nRMSE reductions of 8.1% and 5.9%, respectively. This highlights its discriminative ability on complex data. In terms of training efficiency, the ICIAL model’s training time of approximately 13 minutes under the MIMO strategy delivers the highest accuracy while maintaining computational efficiency comparable to other advanced models.

Table 9. Advanced multi-step prediction experiments for the solar I dataset.

https://doi.org/10.1371/journal.pone.0334828.t009

5.4 Experimental evaluation of model stability using Wilcoxon signed-rank test

The proposed forecasting model with the MIMO prediction strategy integrates deep learning and PSO, both of which are stochastic. To evaluate its stability against this inherent randomness, an experiment was conducted to quantify the consistency of predictions across multiple runs on two distinct datasets (DKASC and Solar I) and various forecasting horizons (T+1, T+3, T+6, T+12, and T+24). The Wilcoxon signed-rank test, a non-parametric statistical method, was used to compare prediction results over 30 independent runs.

The DKASC and Solar I datasets were selected to represent diverse photovoltaic power generation scenarios, with DKASC exhibiting relatively stable patterns and Solar I characterized by higher variability due to environmental factors. For each dataset and forecasting horizon, the model was executed 30 times, and the differences between predicted and true values (Predicted - True) were recorded. The Wilcoxon signed-rank test was applied to evaluate the statistical significance of differences between runs, with p-values calculated for each comparison. The median p-value and the proportion of significant differences (p < 0.05) were reported as key indicators of stability. Additionally, box plots and histograms were used to visualize the distribution of prediction errors and p-values, respectively, offering a comprehensive perspective on the model’s performance and consistency.
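The stability indicators can be computed with SciPy's `scipy.stats.wilcoxon`, as sketched below. The pairing scheme is an assumption, since the text does not specify which runs are compared: here every run is tested against the first run as a reference. The error arrays are synthetic stand-ins for the recorded (Predicted - True) differences.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_runs, n_points = 30, 200
# Synthetic stand-in for the prediction errors of 30 independent runs.
errors = rng.normal(0.0, 0.1, size=(n_runs, n_points))

reference = errors[0]                          # assumed reference run
p_values = np.array([wilcoxon(reference, run).pvalue for run in errors[1:]])

median_p = float(np.median(p_values))                  # reported stability indicator
share_significant = float(np.mean(p_values < 0.05))    # proportion with p < 0.05
```

A high median p-value together with a small share of significant comparisons indicates that repeated runs produce statistically indistinguishable error distributions.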

The results, presented in Table 10, highlight the model’s robust stability across all forecasting horizons and datasets. The median p-values from the Wilcoxon signed-rank test consistently exceed the significance threshold of 0.05, ranging from 0.3626 to 0.6989, indicating that the prediction results across multiple runs are statistically indistinguishable in most cases. Specifically, for short-term forecasts (T+1 and T+3), the proportion of significant differences (p < 0.05) is 0.00% for both the DKASC and Solar I datasets, underscoring the model’s robust consistency in these scenarios. For longer forecasting horizons (T+6, T+12, and T+24), the proportion of significant differences remains low. The DKASC dataset exhibits 0.00% significant differences from T+1 to T+12, increasing slightly to 3.33% at T+24, while the Solar I dataset shows a gradual rise to 3.33% at T+6, 6.45% at T+12, and 6.67% at T+24. These findings suggest that the model maintains strong stability even as the forecasting horizon extends, with only a marginal increase in variability observed for the more challenging Solar I dataset.

Table 10. Median p-values and significant differences for forecasting time steps.

https://doi.org/10.1371/journal.pone.0334828.t010

Fig 10 further validates the model’s stability by depicting the distribution of prediction errors across 30 runs. For the DKASC dataset, the error distributions are tightly clustered, ranging from ±0.8 kW at T+1 to ±1.0 kW at T+24, with median errors consistently near zero, indicating negligible systematic bias. In contrast, the Solar I dataset exhibits slightly greater variability, with error ranges increasing from ±0.6 kW at T+1 to ±1.5 kW at T+24; however, the distributions remain symmetric and centered around zero, with minimal outliers. This visual analysis underscores the model’s high consistency across runs, with the DKASC dataset demonstrating greater stability than Solar I.

Fig 10. Distribution of prediction errors across 30 runs for DKASC and Solar I datasets.

https://doi.org/10.1371/journal.pone.0334828.g010

Histograms of p-values (Fig 11) offer deeper insight into the statistical consistency of the model. For the DKASC dataset, p-values are predominantly distributed above 0.2 across all forecasting horizons, with only a few instances falling below the 0.05 threshold, aligning with the low proportion of significant differences reported in Table 10. Conversely, the Solar I dataset displays a slightly broader distribution, with a modest increase in p-values below 0.05 at longer horizons (T+12 and T+24), consistent with the observed increase in significant differences. However, even in the most challenging case (Solar I at T+24), the proportion of p-values below 0.05 remains low at 6.67%, indicating that the model’s predictions are largely stable across runs.

Fig 11. Distribution of p-values across forecasting horizons for DKASC and Solar I datasets.

https://doi.org/10.1371/journal.pone.0334828.g011

The experimental evaluation demonstrates that the proposed photovoltaic power forecasting model exhibits high stability across diverse datasets and forecasting horizons, effectively mitigating the randomness introduced by deep learning and PSO algorithms. Results from the Wilcoxon signed-rank test, supported by median p-values consistently exceeding 0.05 and a low proportion of significant differences (maximum of 6.67%), confirm the statistical consistency of the model’s predictions across multiple runs. Box plots and p-value histograms further substantiate this finding, revealing tightly clustered error distributions and a predominance of high p-values, respectively. Although the Solar I dataset shows slightly elevated variability in longer-term forecasts, the overall impact of randomness remains minimal, with the model sustaining robust performance. These findings underscore the reliability of the proposed approach for photovoltaic power forecasting, positioning it as a promising solution for real-world applications where stability and consistency are paramount.

5.5 Uncertainty quantification and interval prediction under various weather conditions

To address the limitations of deterministic point prediction and enhance the practical utility of models in PV power time-series forecasting, this study conducts uncertainty quantification by generating probabilistic prediction intervals. A quantile regression method is adopted to construct these intervals, with the PICP and PIAW introduced as core evaluation metrics. This section aims to comprehensively and rigorously evaluate the model’s uncertainty and interval prediction performance under different weather conditions (including stable, fluctuating, and abrupt weather) from two dimensions: reliability and sharpness.
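The two interval metrics can be written down compactly. The sketch below assumes the common definitions, with PICP as the percentage of observations that fall inside their prediction interval (reliability) and PIAW as the mean interval width in the units of the target (sharpness); the function names and the toy numbers are illustrative only.

```python
from typing import Sequence


def picp(y_true: Sequence[float],
         lower: Sequence[float],
         upper: Sequence[float]) -> float:
    """Coverage: percentage of true values inside [lower, upper]."""
    inside = sum(lo <= y <= up for y, lo, up in zip(y_true, lower, upper))
    return 100.0 * inside / len(y_true)


def piaw(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Sharpness: mean width of the prediction intervals."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


# Toy example (made-up numbers, in kW).
y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 5.0]
print(picp(y_true, lower, upper))  # 75.0: the second point lies below its interval
print(piaw(lower, upper))          # 1.5
```

A well-calibrated model has a PICP close to the nominal confidence level; among models with comparable coverage, a smaller PIAW indicates sharper, more informative intervals.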

As shown in Fig 12, the 95% prediction intervals indicate that the model quantifies uncertainty robustly across different datasets and diverse weather conditions. On the Solar I dataset in particular, where weather conditions are more complex and the prediction time steps are longer (up to 360 minutes), the generated prediction intervals (shaded areas) effectively envelop the actual power values (black line), even under the dual challenges of long-term prediction and abrupt weather changes.

Fig 12. Prediction intervals and prediction steps for uncertain photovoltaic power from the DKASC and Solar I datasets at a 95% confidence level.

https://doi.org/10.1371/journal.pone.0334828.g012

Meanwhile, on the DKASC dataset, the model exhibits similar adaptive capabilities: the interval width increases reasonably as the prediction horizon extends from 15 to 120 minutes. The ability to provide reliable and dynamically adjusted prediction intervals across two datasets with distinct characteristics fully validates the model’s strong performance in uncertainty estimation and its practical application value.

As shown in Table 11, the quantitative evaluation of prediction intervals confirms these visual observations. For the DKASC dataset, PICP values are well calibrated, closely matching the nominal confidence levels. At the 95% confidence level, the PICP values at prediction steps 3, 6, 12, and 24 are 96.56%, 95.89%, 95.70%, and 94.01%, respectively, remaining high and stable, which demonstrates a high degree of reliability. The more volatile Solar I dataset poses greater challenges, especially at the 24-step prediction horizon (PICP of 90.44%), yet the coverage remains robust, supporting the model’s effectiveness. Where PICP values slightly exceed their nominal confidence levels, the model is conservative yet reliable, which is desirable for risk-averse operational planning in power systems. In terms of sharpness, PIAW values increase as the prediction horizon extends and the confidence level rises. For the 5.2 kW system in DKASC, at the 95% confidence level, the PIAW increases from 0.5747 kW to 0.7678 kW as the prediction horizon extends from T+3 to T+24, indicating that the intervals are narrow enough to be practically valuable.

Table 11. Prediction interval performance (PICP and PIAW) on DKASC and Solar I datasets at different confidence levels.

https://doi.org/10.1371/journal.pone.0334828.t011

The in-depth analysis of the prediction error distribution in Fig 13 provides strong evidence for the rationality and reliability of the model in uncertainty quantification. Each subplot in the figure includes a prediction error histogram, a non-parametric kernel density estimation (KDE) curve, and a Gaussian fitting curve.

Fig 13. Prediction error distributions for photovoltaic power from the DKASC and Solar I datasets under various weather conditions and at different prediction horizons.

https://doi.org/10.1371/journal.pone.0334828.g013

The analysis reveals two key points. First, all error distributions are centered at zero, indicating low systematic bias in the model predictions. Second, and more importantly, the KDE curves show that the prediction errors exhibit markedly non-Gaussian characteristics, such as sharp peaks and heavy tails, especially in the two realistic and challenging scenarios of fluctuating weather and abrupt weather changes. Because it adopts quantile regression, the model is not constrained by traditional Gaussian assumptions and can capture this complex real-world error distribution. Additionally, the error distribution widens as the prediction horizon extends and weather conditions deteriorate, which is consistent with the variation pattern of the PIAW metric and further indicates that the proposed model captures the dynamic growth of uncertainty.
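The reason quantile regression needs no Gaussian assumption can be seen from its usual training objective, the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The sketch below is illustrative: the text does not state the exact loss used, so the pinball loss is an assumption here, as are the helper name `interval_quantiles` and the choice of a central interval.

```python
from typing import Sequence, Tuple


def pinball_loss(y_true: Sequence[float],
                 y_pred: Sequence[float],
                 tau: float) -> float:
    """Mean pinball loss of predicted tau-quantiles (0 < tau < 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def interval_quantiles(confidence: float) -> Tuple[float, float]:
    """Lower and upper quantile levels of a central prediction interval."""
    return (1.0 - confidence) / 2.0, (1.0 + confidence) / 2.0


# A 95% interval is bounded by the 2.5% and 97.5% quantiles.
print(interval_quantiles(0.95))

# Toy example: the same absolute error costs more when a high quantile
# is under-predicted than when it is over-predicted.
print(pinball_loss([3.0], [2.0], tau=0.9))  # under-prediction
print(pinball_loss([1.0], [2.0], tau=0.9))  # over-prediction
```

Minimizing this loss for two quantile levels yields the lower and upper interval bounds directly from the data, so skewed or heavy-tailed error distributions are reflected in the interval shape rather than forced into a symmetric Gaussian band.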

6 Conclusion

This study proposed a novel hybrid prediction model (ICIAL) that combines signal decomposition with a frequency-stratified network to improve photovoltaic (PV) power forecasting. The key findings of the research are as follows:

  1. The model first employs the ICEEMDAN algorithm to decompose the original photovoltaic power series into multi-scale components, effectively capturing the intrinsic characteristics of the data. Ablation experiments further confirm the efficacy of this method in improving prediction accuracy. Subsequently, a heterogeneous hierarchical network is implemented: an LSTM model is utilized to predict the low-frequency trend series, while a ConvBiGRU-A model, integrated with IRPE, is employed for the high-frequency error series. The integrated IRPE mechanism allows the model to adapt to external factors, such as changing light intensity, thereby maintaining high accuracy across various weather conditions.
  2. By applying distinct models to high- and low-frequency components, this approach effectively captures complex PV dynamics, overcoming a key limitation of traditional methods. The method first employs the ICEEMDAN algorithm for signal decomposition and subsequently performs predictions through a hierarchical network that integrates the IRPE mechanism. A visual analysis of the model’s attention mechanism clearly reveals the effectiveness of IRPE: in critical scenarios such as peaks, valleys, and transient changes, the attention heatmaps of the ICIAL model exhibit sharper and more focused patterns, thereby verifying its ability to accurately identify and capture these complex dynamic changes. This advantage is ultimately reflected in the model’s prediction performance. Comparative experimental results show that, whether under stable, fluctuating, or drastically changing light conditions, the prediction curve of the ICIAL model closely aligns with the actual power values, particularly excelling in peak prediction accuracy and effectively addressing the common issue of peak underestimation observed in other baseline models.
  3. Through comprehensive experiments conducted on different datasets (DKASC and Solar I) using three multi-step prediction strategies (MIMO, Direct, and Recursive), this study systematically verified the robustness of the proposed model. The results consistently indicated that the ICIAL model, under the MIMO strategy, exhibits strong long-term prediction capabilities. In the 120-minute and 2880-minute prediction tasks on the DKASC and Solar I datasets, its nMAE was reduced by 14.6% and 8.1%, respectively, compared to the second-best model. Furthermore, Wilcoxon signed-rank tests over thirty independent runs confirmed the model’s statistical stability, demonstrating consistent performance across runs.
  4. To overcome the shortcomings of deterministic point predictions in quantifying future risks, this study introduces an uncertainty analysis framework based on quantile regression. Experiments have demonstrated that the model can generate reliable and sharp probabilistic prediction intervals. Whether applied to the DKASC dataset, which exhibits relatively stable data patterns, or the Solar I dataset, characterized by more complex weather conditions, the prediction intervals produced by the model strike a balance between a high PICP and a reasonable PIAW at different confidence levels. A comprehensive analysis of the prediction error distribution further indicates that the model effectively captures the non-Gaussian error characteristics commonly observed in the real world, confirming its practical value in power grid risk assessment and operational planning.

In conclusion, this study has fully demonstrated the outstanding performance of the ICIAL model in multi-step photovoltaic power prediction. By integrating the ICEEMDAN algorithm with a hierarchical frequency modeling neural network, the model has exhibited strong medium- to long-term prediction capabilities and robustness across diverse experimental scenarios. However, long-term prediction of photovoltaic power continues to pose a significant challenge for the efficient operation of power stations. Future work will focus on optimizing the model and extending its application to long-term forecasting scenarios.

Furthermore, the ICIAL model demonstrates broad generalization potential, with its core advantage stemming from its universal architectural principles rather than being confined to a specific domain. First, as a domain-agnostic decomposition tool, ICEEMDAN can effectively handle any non-stationary time series. Second, hierarchical frequency modeling—specifically, the customization of different models for low-frequency trends and high-frequency fluctuations—serves as a fundamental principle that is universally applicable to the analysis of complex systems.

Therefore, the framework could likely be transferred to other challenging prediction tasks. For instance, in the field of renewable energy, the model could be applied to wind power forecasting, which also faces the challenge of intermittency. Its potential applications could also extend to other domains that require high-precision predictions, such as grid load forecasting, financial market analysis (e.g., stock price forecasting), and climate modeling. Exploring these applications will be a key direction for future research, aiming to further validate and expand the model’s generalization capabilities and to contribute a robust analytical paradigm to the field of complex time series analysis.

References

  1. Wang J, Yang W, Du P, Li Y. Research and application of a hybrid forecasting framework based on multi-objective optimization for electrical power system. Energy. 2018;148:59–78.
  2. Zhang T, Nakagawa K, Matsumoto K. Evaluating solar photovoltaic power efficiency based on economic dimensions for 26 countries using a three-stage data envelopment analysis. Applied Energy. 2023;335:120714.
  3. Hu F, Mou S, Wei S, Liping Qiu, Hu H, Zhou H. Research on the evolution of China’s photovoltaic technology innovation network from the perspective of patents. Energy Strategy Reviews. 2024;51:101309.
  4. Massidda L, Bettio F, Marrocu M. Probabilistic day-ahead prediction of PV generation. A comparative analysis of forecasting methodologies and of the factors influencing accuracy. Solar Energy. 2024;271:112422.
  5. Salman D, Direkoglu C, Kusaf M, Fahrioglu M. Hybrid deep learning models for time series forecasting of solar power. Neural Comput & Applic. 2024;36(16):9095–112.
  6. Yu C, Qiao J, Chen C, Yu C, Mi X. TFEformer: a new temporal frequency ensemble transformer for day-ahead photovoltaic power prediction. Journal of Cleaner Production. 2024;448:141690.
  7. Zhou Y, Wang J, Li Z, Lu H. Short-term photovoltaic power forecasting based on signal decomposition and machine learning optimization. Energy Conversion and Management. 2022;267:115944.
  8. Vaswani A. Attention is all you need. Advances in Neural Information Processing Systems. 2017.
  9. Yan H, Deng B, Li X, Qiu X. TENER: adapting transformer encoder for named entity recognition. 2019.
  10. Choumal A, Rizwan M, Jha S. Big data analytics for photovoltaic and electric vehicle management in sustainable grid integration. Journal of Renewable and Sustainable Energy. 2025;17(1).
  11. Li F, Zuo W, Zhou K, Li Q, Huang Y. State of charge estimation of lithium-ion batteries based on PSO-TCN-attention neural network. Journal of Energy Storage. 2024;84:110806.
  12. D’Aversa A, Pio G, Ceci M. Leveraging spatio-temporal locality in linear model trees for multi-step time series forecasting. In: 2024 IEEE International Conference on Big Data (BigData). 2024. p. 1282–7. https://doi.org/10.1109/bigdata62323.2024.10826062
  13. Pei J, Liu N, Shi J, Ding Y. Tackling the duck curve in renewable power system: a multi-task learning model with iTransformer for net-load forecasting. Energy Conversion and Management. 2025;326:119442.
  14. Su Z, Gu S, Wang J, Lund PD. Improving ultra-short-term photovoltaic power forecasting using advanced deep-learning approach. Measurement. 2025;239:115405.
  15. El-kenawy E-SM, Khodadadi N, Mirjalili S, Abdelhamid AA, Eid MM, Ibrahim A. Greylag goose optimization: nature-inspired optimization algorithm. Expert Systems with Applications. 2024;238:122147.
  16. Ding M, Yang R. Research on short-term prediction of PV output power based on weather forecast. Renewable Energy Resources. 2014.
  17. Wang J, Wang N, Wang H, Cao K, El-Sherbeeny AM. GCP: a multi-strategy improved wireless sensor network model for environmental monitoring. Computer Networks. 2024;254:110807.
  18. Kushwaha V, Pindoriya NM. A SARIMA-RVFL hybrid model assisted by wavelet decomposition for very short-term solar PV power generation forecast. Renewable Energy. 2019;140:124–39.
  19. Fu X, Zhang C, Xu Y, Zhang Y, Sun H. Statistical machine learning for power flow analysis considering the influence of weather factors on photovoltaic power generation. IEEE Trans Neural Netw Learn Syst. 2025;36(3):5348–62. pmid:38587954
  20. Scott C, Ahsan M, Albarbar A. Machine learning for forecasting a photovoltaic (PV) generation system. Energy. 2023;278:127807.
  21. Ferkous K, Guermoui M, Menakh S, Bellaour A, Boulmaiz T. A novel learning approach for short-term photovoltaic power forecasting - A review and case studies. Engineering Applications of Artificial Intelligence. 2024;133:108502.
  22. Agga A, Abbou A, Labbadi M, El Houm Y. Short-term self consumption PV plant power production forecasts based on hybrid CNN-LSTM, ConvLSTM models. Renewable Energy. 2021;177:101–12.
  23. Tovar M, Robles M, Rashid F. PV power prediction, using CNN-LSTM hybrid neural network model. case of study: Temixco-Morelos, México. Energies. 2020;13(24):6512.
  24. Feng H, Yu C. A novel hybrid model for short-term prediction of PV power based on KS-CEEMDAN-SE-LSTM. Renewable Energy Focus. 2023;47:100497.
  25. Tao K, Zhao J, Wang N, Tao Y, Tian Y. Short-term photovoltaic power forecasting using parameter-optimized variational mode decomposition and attention-based neural network. Energy Sources, Part A: Recovery, Utilization, and Environmental Effects. 2024;46(1):3807–24.
  26. Gan L, Xiao Y. Knowledge base question answering based on multi-head attention mechanism and relative position coding. J Phys: Conf Ser. 2022;2203(1):012056.
  27. Patil Y, Shruti T, K. R. Time-series forecasting using ARIMA and SARIMA models for solar NASA POWER data. In: 2025 3rd International Conference on Intelligent Systems, Advanced Computing and Communication (ISACC). 2025. p. 946–52. https://doi.org/10.1109/isacc65211.2025.10969243
  28. Guruge PB, Priyadarshana YHPP. Time series forecasting-based Kubernetes autoscaling using Facebook prophet and long short-term memory. Front Comput Sci. 2025;7.
  29. Zhu H, Zhang X, Wu J, Hu S, Wang Y. A novel solar irradiance calculation method for distributed photovoltaic power plants based on K-dimension tree and combined CNN-LSTM method. Computers and Electrical Engineering. 2025;122:109990.
  30. Minh LTT, Tri Si N, Truc NTK. Exploring the efficiencies of Bi-LSTM model and attention mechanism in solar power forecasting. J Phys: Conf Ser. 2025;2949(1):012063.
  31. Liu M, Wang X, Zhong Z. Ultra-short-term photovoltaic power prediction based on BiLSTM with wavelet decomposition and dual attention mechanism. Electronics. 2025;14(2):306.
  32. Lai L, Wang J, Li F, Zou E, Yang W, Zhang Y. Thermal performance prediction of rainwater-vented composite green roofs using the VMD-TCN-GRU model. Journal of Building Engineering. 2025;103:112152.
  33. Lin H, Gao L, Cui M, Liu H, Li C, Yu M. Short-term distributed photovoltaic power prediction based on temporal self-attention mechanism and advanced signal decomposition techniques with feature fusion. Energy. 2025;315:134395.
  34. Surribas-Sayago G, Fernández-Rodríguez JD, Dominguez E. Advancing photovoltaic forecasting with neural networks: integrating N-beats and sequential models with Fourier analysis. In: Proceedings of the Ibero-American Conference on Artificial Intelligence. 2024. p. 287–97.
  35. Yu WJ, Dai YM, Ren T, Leng MM. Short-time photovoltaic power forecasting based on informer model integrating attention mechanism. Applied Soft Computing. 2025; p. 113345.
  36. Chen J, Peng T, Qian S, Ge Y, Wang Z, Nazir MS, et al. An error-corrected deep autoformer model via Bayesian optimization algorithm and secondary decomposition for photovoltaic power prediction. Applied Energy. 2025;377:124738.
  37. Zhai C, He X, Cao Z, Abdou-Tankari M, Wang Y, Zhang M. Photovoltaic power forecasting based on VMD-SSA-Transformer: Multidimensional analysis of dataset length, weather mutation and forecast accuracy. Energy. 2025;324:135971.
  38. Wu S, Guo H, Zhang X, Wang F. Short-term photovoltaic power prediction based on CEEMDAN and hybrid neural networks. IEEE J Photovoltaics. 2024;14(6):960–9.
  39. Zhao H, Huang X, Xiao Z, Shi H, Li C, Tai Y. Week-ahead hourly solar irradiation forecasting method based on ICEEMDAN and TimesNet networks. Renewable Energy. 2024;220:119706.
  40. Wu S, Jiang H, Gao Q. Short-term prediction of photovoltaic power generation based on ICEEMDAN-SE-GAPSO-LSTM. J Phys: Conf Ser. 2024;2728(1):012027.
  41. Foucart A, Elskens A, Decaestecker C. Ranking the scores of algorithms with confidence. In: ESANN 2025; 2025.
  42. Upadhaya A, Telle J-S, Schlüters S, Saber M, von Maydell K. A robust approach to extend deterministic models for the quantification of uncertainty and comprehensive evaluation of the probabilistic forecasting. International Journal of Energy Research. 2025;2025(1):4460462.
  43. Chung Y, Neiswanger W, Char I, Schneider J. Beyond pinball loss: quantile methods for calibrated uncertainty quantification. In: Advances in Neural Information Processing Systems. vol. 34; 2021. p. 10971–84.
  44. Topal MO, Bas A, van Heerden I. Exploring transformers in natural language generation: GPT, BERT, and XLNet. 2021.
  45. Luo D, Wang T, Han J, Niu D. Study of three-dimensional distribution of chloride in coral aggregate concrete: a CNN-BiGRU-attention data-intelligence model driven by beluga whale optimization algorithm. Construction and Building Materials. 2025;458:139740.
  46. Wang J, Ge C, Li Y, Zhao H, Fu Q, Cao K, et al. A two-layer network intrusion detection method incorporating LSTM and stacking ensemble learning. CMC. 2025;83(3):5129–53.
  47. Sami Khafaga D, Ali Alhussan A, M. El-kenawy E-S, Ibrahim A, H. Abd Elkhalik S, Y. El-Mashad S, et al. Improved prediction of metamaterial antenna bandwidth using adaptive optimization of LSTM. Computers, Materials & Continua. 2022;73(1):865–81.
  48. DKASC Alice Springs. DKASC. 2022.
  49. Chen Y, Xu J. Solar and wind power data from the Chinese State grid renewable energy generation forecasting competition. Sci Data. 2022;9(1):577. pmid:36130945
  50. Wang J, Luo D, Chen W, Peng F, Li Z. Deployment method of wireless sensor networks based on MaOEA/P-GM algorithm. Soft Comput. 2024.
  51. Luo SY, Long H. Noise classification algorithm based on short-term energy, zero-crossing rate. Advancements in Mechatronics and Intelligent Robotics: Proceedings of ICMIR 2020. Singapore: Springer. 2021. p. 287–92.
  52. Chen Y, Bai X. A model of day-ahead market electricity price forecasting considering 2D variation of time series. In: Proceedings of the CSU-EPSA. 2024. p. 22–9.
  53. Yuan C, Hu Z, Liu Y, He S, Du J. Application of ICEEMDAN to noise reduction of near-seafloor geomagnetic field survey data. Journal of Applied Geophysics. 2023;209:104933.
  54. Vatanchi SM, Etemadfard H, Maghrebi MF, Shad R. A comparative study on forecasting of long-term daily streamflow using ANN, ANFIS, BiLSTM and CNN-GRU-LSTM. Water Resour Manage. 2023;37(12):4769–85.
  55. Tahir MF, Yousaf MZ, Tzes A, El Moursi MS, El-Fouly THM. Enhanced solar photovoltaic power prediction using diverse machine learning algorithms with hyperparameter optimization. Renewable and Sustainable Energy Reviews. 2024;200:114581.
  56. Sutarna N, Tjahyadi C, Oktivasari P, Dwiyaniti M, Tohazen T. Hyperparameter tuning impact on deep learning bi-LSTM for photovoltaic power forecasting. Journal of Robotics and Control. 2024;5(3):677–93.
  57. Zhou N, Shang B, Xu M, Peng L, Feng G. Enhancing photovoltaic power prediction using a CNN-LSTM-attention hybrid model with Bayesian hyperparameter optimization. Global Energy Interconnection. 2024;7(5):667–81.
  58. Zhou H, Zhang S, Peng J, Zhang S, Li J, Xiong H, et al. Informer: beyond efficient transformer for long sequence time-series forecasting. AAAI. 2021;35(12):11106–15.
  59. Chicco D, Warrens MJ, Jurman G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput Sci. 2021;7:e623. pmid:34307865