
Fig 1.

Overall architecture of KALFormer.

The framework integrates Long Short-Term Memory (LSTM) units for temporal encoding, a self-attention mechanism for capturing global dependencies, a Graph Neural Network (GNN)-based Knowledge-Augmented Network (KAN) for nonlinear relational interactions, and a multi-layer Transformer encoder–decoder for feature fusion and sequence prediction. The model outputs are normalized and passed through a linear layer and Softmax for forecasting probabilities.
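The component ordering in this caption (LSTM encoding → self-attention → KAN → Transformer encoder → normalization → linear + Softmax) can be sketched as a minimal PyTorch skeleton. Everything here is an assumption beyond that ordering: the `KALFormerSketch` name, all layer sizes, and the small MLP standing in for the KAN block, whose exact form the caption does not specify.

```python
import torch
import torch.nn as nn

class KALFormerSketch(nn.Module):
    """Hypothetical skeleton following the caption's pipeline; not the
    authors' implementation. The KAN block is stood in for by an MLP."""
    def __init__(self, n_features, d_model=64, n_heads=4, n_layers=2, horizon=24):
        super().__init__()
        self.lstm = nn.LSTM(n_features, d_model, batch_first=True)   # temporal encoding
        self.self_attn = nn.MultiheadAttention(d_model, n_heads,
                                               batch_first=True)     # global dependencies
        self.kan = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                                 nn.Linear(d_model, d_model))         # stand-in for KAN
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)     # feature fusion
        self.norm = nn.LayerNorm(d_model)                             # output normalization
        self.head = nn.Linear(d_model, horizon)                       # forecasting head

    def forward(self, x):                      # x: (batch, seq_len, n_features)
        h, _ = self.lstm(x)
        a, _ = self.self_attn(h, h, h)
        z = self.encoder(self.kan(a))
        out = self.head(self.norm(z[:, -1]))   # last time step -> horizon outputs
        return torch.softmax(out, dim=-1)      # Softmax, per the caption
```

A forward pass on a `(batch, seq_len, n_features)` tensor yields a `(batch, horizon)` probability vector, matching the caption's description of the output stage.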


Fig 2.

Structure of the multi-head attention mechanism.

Each attention head performs a scaled dot-product attention using individual query (Q), key (K), and value (V) projections. The outputs from all heads are concatenated and linearly transformed to form the final attention representation, enabling the model to jointly attend to information from different subspaces.
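The mechanism in this caption (per-head scaled dot-product attention over Q/K/V projections, followed by concatenation and a linear output projection) can be illustrated with a minimal NumPy sketch; the function names and the single-sequence shapes are illustrative choices, not the paper's code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (n_heads, seq_len, d_k)
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)  # (n_heads, T, T)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # (n_heads, T, d_k)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    # x: (seq_len, d_model); Wq/Wk/Wv/Wo: (d_model, d_model)
    T, d_model = x.shape
    d_k = d_model // n_heads

    def project(W):  # project, then split into heads
        return (x @ W).reshape(T, n_heads, d_k).transpose(1, 0, 2)

    heads = scaled_dot_product_attention(project(Wq), project(Wk), project(Wv))
    concat = heads.transpose(1, 0, 2).reshape(T, d_model)  # concatenate heads
    return concat @ Wo                                     # final linear transform
```

Splitting `d_model` into `n_heads` subspaces of size `d_k` is what lets each head attend to a different representation subspace, as the caption notes.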


Table 1.

Main characteristics of each dataset used in the experiments.


Table 2.

Experimental environment, model configuration, and training hyperparameters.


Table 3.

Performance comparison between KALFormer and state-of-the-art models.


Fig 3.

Performance comparison across benchmark datasets.

Heatmap visualization of MSE and MAE values for different models on six public datasets. Darker blue shades indicate lower error values. KALFormer consistently achieves the lowest MSE and MAE across all datasets, demonstrating superior generalization and robustness.


Table 4.

Ablation study results for different module combinations.


Fig 4.

Ablation study on MAE and MSE performance.

Each configuration represents a variant of the KALFormer architecture, isolating the contribution of individual modules. Results are reported as mean ± standard deviation over three independent runs. The full KALFormer achieves the lowest error, confirming the effectiveness of multi-level fusion.


Fig 5.

Accuracy and loss comparison in ablation experiments.

The figure reports trend-based accuracy (%) and loss (mean ± standard deviation) for each model variant. KALFormer attains the highest accuracy (95.19%) and the lowest loss (0.1631), highlighting its predictive precision and training stability relative to the other configurations.


Fig 6.

Attention mechanism visualization on representative datasets.

Normalized attention maps of KALFormer on the ETTm1 and Electricity datasets for 96- and 192-step forecasts, showing a shift from a local diagonal focus toward broader periodic patterns across variables as the horizon grows.


Fig 7.

Evolution of attention energy across temporal scales.

As the forecasting horizon extends, attention becomes more diffuse, reflecting adaptive balance between short-term precision and global contextual awareness.


Table 5.

Complexity and efficiency comparison among LSTM, BiLSTM, and KALFormer.
