Redefining multi-target weather forecasting with a novel deep learning model: Hierarchical temporal convolutional long short-term memory with attention (HTC-LSTM-Attn) in Bangladesh | PLOS One

Fig 1.

Overview of the data preprocessing pipeline for multi-target weather forecasting.
Starting from raw monthly weather data (1961–2022, 24 stations) from the Bangladesh Agricultural Research Council (BARC), the pipeline handles missing data via KNN imputation (k = 5), adds seasonal features (e.g., Month_sin, Month_cos) and lagged statistics (e.g., 1-3 month lags, rolling means and standard deviations), detects outliers using the IQR and replaces them with the nearest non-outlier values, applies correlation-based feature selection (removing highly correlated features with a Pearson’s correlation coefficient > 0.9), normalizes the data (Min-Max scaling to [0, 1]), and validates data quality (e.g., using Matplotlib and Seaborn). The data are split temporally and spatially into training (1961–2012), validation (2013–2015), and test (2016–2022) sets, sorted by station code and year, with flat stations (e.g., Tangail, Syedpur, Mongla) excluded. This prevents information leakage and prepares sequential inputs (time steps = 12) for the HTC-LSTM-Attn model.
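Several of the steps in this caption can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names are hypothetical, the sine/cosine formula for Month_sin and Month_cos is a common convention assumed here, and taking the Min-Max bounds from the training period only is an assumption consistent with the caption's no-leakage claim.

```python
import math

def month_cyclic(month):
    """Map a calendar month (1-12) to (Month_sin, Month_cos)."""
    angle = 2.0 * math.pi * month / 12.0  # assumed encoding, not stated in the caption
    return math.sin(angle), math.cos(angle)

def lag_features(values, lags=(1, 2, 3)):
    """For each month, collect the values 1-3 months earlier (None if unavailable)."""
    return [[values[i - k] if i >= k else None for k in lags]
            for i in range(len(values))]

def min_max_scale(values, lo, hi):
    """Scale with bounds lo/hi, assumed to come from the training period only."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def temporal_split(records):
    """Split records (dicts with a 'year' key) into the caption's three periods."""
    train = [r for r in records if 1961 <= r["year"] <= 2012]
    val = [r for r in records if 2013 <= r["year"] <= 2015]
    test = [r for r in records if 2016 <= r["year"] <= 2022]
    return train, val, test

def make_windows(features, targets, time_steps=12):
    """Pair 12 consecutive months of features with the following month's target."""
    windows = []
    for end in range(time_steps, len(features)):
        # features[end - time_steps:end] stops at month end - 1; the target is month end
        windows.append((features[end - time_steps:end], targets[end]))
    return windows
```

Because each window stops one month before its target, no value from the forecast month enters the input. Values outside the training range would scale outside [0, 1] under this assumption.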

Fig 2.

Architecture of the proposed Hierarchical Temporal Convolutional Long Short-Term Memory with Attention (HTC-LSTM-Attn) model for one-month-ahead forecasting of maximum temperature and humidity.
The input window consists of a sequence of 12 time steps with multivariate features (e.g., solar radiation, PET, sunshine hours, wind speed, cloud coverage, rainfall, and seasonal/lagged components). Hierarchical Temporal Convolutional (HTC) layers extract multi-scale patterns using Conv1D filters (e.g., 96, 64, and 96 filters at different kernel sizes), followed by batch normalization, ReLU activation, and concatenation. Bidirectional LSTM layers (96 units) capture forward and backward sequential dependencies, with an attention mechanism that computes softmax-weighted scores, context vectors, and concatenated hidden states. The output layers include a dense layer (256 units) with ReLU, dropout (0.1), and final predictions for temperature and humidity. The design uses only historical data up to month t to predict the next month (horizon = 1), which prevents future data leakage.
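The attention step described in this caption can be sketched generically as follows. This is an illustration, not the model's actual scoring function, which the caption does not preserve: the dot-product score and the `score_vector` parameter are assumptions, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(hidden_states, score_vector):
    """Weight each time step's hidden state and sum them into one context vector.

    hidden_states: list of T vectors (one per time step, e.g. BiLSTM outputs).
    score_vector: hypothetical learned vector; the score is a plain dot product.
    """
    scores = [sum(h * v for h, v in zip(state, score_vector))
              for state in hidden_states]
    weights = softmax(scores)  # one non-negative weight per time step, summing to 1
    dim = len(hidden_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context
```

With a 12-step input window, `weights` holds 12 values that sum to 1, one per time step.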

Fig 3.

Hyperparameter tuning process using Keras Tuner for the HTC-LSTM-Attn model.
Using the training and validation sets (Adam optimizer, MSE loss, batch size 32, maximum 50 epochs, patience 10), a RandomSearch tuner runs 15 trials (2 executions per trial) to minimize validation MSE. The search optimizes key parameters, including HTC filters (96, 64, and 96 for the respective kernel sizes), Conv1D filters (128), LSTM units (96), dense units (256), and dropout rates (0.1 for HTC/LSTM and 0.5 for dense). The tuned hyperparameters are then used to build the final HTC-LSTM-Attn model for forecasting, with the aim of robust performance on both the temporal and spatial test sets and reduced overfitting.
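The trial structure described here (15 trials, 2 executions per trial, lowest validation MSE wins) can be sketched as a plain random search. This is a stand-in that mirrors the logic and does not use the Keras Tuner API. The `evaluate` callback and the search space are hypothetical, since the caption reports only the selected values, and averaging the executions of a trial is an assumption of this sketch.

```python
import random

def random_search(search_space, evaluate, max_trials=15,
                  executions_per_trial=2, seed=0):
    """Sample configurations at random and keep the one with the lowest mean score.

    search_space: dict mapping a parameter name to a list of candidate values.
    evaluate: hypothetical callback (config, run_index) -> validation MSE.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(max_trials):
        config = {name: rng.choice(options)
                  for name, options in search_space.items()}
        # Repeat each configuration to smooth out run-to-run training noise.
        scores = [evaluate(config, run) for run in range(executions_per_trial)]
        mean_score = sum(scores) / len(scores)
        if mean_score < best_score:
            best_config, best_score = config, mean_score
    return best_config, best_score
```

In practice `evaluate` would train the model with the sampled configuration and return its validation MSE. A call such as `random_search({"lstm_units": [64, 96, 128]}, evaluate)` uses illustrative candidates, not the authors' search ranges.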

Table 1.

Overall station-level metrics for the HTC-LSTM-Attn model (Temporal Test).

Table 2.

Overall station-level metrics for the HTC-LSTM-Attn model (Spatial Test: 5 unseen stations).

Table 3.

Selected station-level metrics for the HTC-LSTM-Attn model (Temporal).

Table 4.

Selected station-level metrics for the HTC-LSTM-Attn model (Spatial).

Table 5.

Performance comparison for maximum temperature forecasting.

Table 6.

Performance comparison for humidity forecasting models.

Fig 4.

Temperature prediction comparisons for selected stations (Temporal).

Fig 5.

Humidity prediction comparisons for selected stations (Temporal).

Fig 6.

Prediction (Temperature & Humidity) comparisons for selected stations (Temporal), bar chart.

Fig 7.

Temperature prediction comparisons for selected stations (Spatial).

Fig 8.

Humidity prediction comparisons for selected stations (Spatial).

Fig 9.

Prediction (Temperature & Humidity) comparisons for selected stations (Spatial), bar chart.

Fig 10.

Overall time-series comparison of temperature and humidity predictions (Temporal).

Fig 11.

Overall time-series comparison of temperature and humidity predictions (Temporal), bar chart.

Fig 12.

Overall time-series comparison of temperature and humidity predictions (Spatial).

Fig 13.

Overall time-series comparison of temperature and humidity predictions (Spatial), bar chart.

Fig 14.

Scatter plot comparison of predicted vs. actual values (Temporal).

Fig 15.

Scatter plot comparison of predicted vs. actual values (Spatial).

Table 7.

Ablation study results comparing HTC-LSTM-Attn and HTC-LSTM (no attention).

Fig 16.

Ablation Study.

Table 8.

Ablation study results comparing HTC-LSTM-Attn and HTC-LSTM (no attention) across five runs for temperature and humidity forecasting.

Fig 17.

Attention Weights Across 12 Time Steps.

Fig 18.

Weather Stations in Bangladesh with Temperature and Forecast Accuracy.
This map was generated using data from Natural Earth [1:10m Cultural Vectors]. Natural Earth data is in the public domain; therefore, the map is presented under a CC BY 4.0 license.

Table 9.

Calibrated Uncertainty Metrics (95% Prediction Intervals).
