Regional PM2.5 pollution forecasting using a hybrid model based on multi-scales feature fusion and deep learning algorithms

Yong Zhang; Wenya Zhang; Bo Wu; Weichen Yi

doi:10.1371/journal.pone.0333489

Abstract

Regional haze pollution has become increasingly prominent, yet early warning models for it remain scarce. To accurately predict regional PM_2.5 pollution, this study uses hourly average pollutant concentration data from January 1, 2021, to December 31, 2023 in the Chengdu-Chongqing urban agglomeration, along with concurrent surface meteorological data, to build a multi-scales feature fusion regional pollution prediction network (MSFRPM) based on a multi-input-multi-output deep learning framework. This model can simultaneously forecast PM_2.5 concentrations for all cities in the region. The results show that the annual and seasonal prediction evaluation metrics of the MSFRPM model are significantly better than those of the baseline models. This can be attributed to the ability of the MSFRPM model to effectively capture the temporal dependency of historical PM_2.5, the complex nonlinear relationships between other pollutants and meteorological factors within cities, and the multi-scales spatiotemporal dependencies of PM_2.5 transport between cities in the urban agglomeration. In 2023, the Chengdu-Chongqing urban agglomeration experienced 15 days of mild regional pollution, 21 days of moderate regional pollution, and 2 days of severe regional pollution, with moderate pollution being the dominant type of PM_2.5 pollution. Seasonally, regional PM_2.5 pollution in the Chengdu-Chongqing urban agglomeration is mainly concentrated in winter. The annual and seasonal assessments of regional PM_2.5 pollution produced by the MSFRPM model for the Chengdu-Chongqing urban agglomeration in 2023 are generally consistent with actual observations. Accurate prediction of regional PM_2.5 pollution is of great significance for the coordinated management and early warning of regional pollution.

Citation: Zhang Y, Zhang W, Wu B, Yi W (2025) Regional PM_2.5 pollution forecasting using a hybrid model based on multi-scales feature fusion and deep learning algorithms. PLoS One 20(10): e0333489. https://doi.org/10.1371/journal.pone.0333489

Editor: Fausto Cavallaro, Universita degli Studi del Molise, ITALY

Received: April 27, 2025; Accepted: September 15, 2025; Published: October 9, 2025

Copyright: © 2025 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

With the comprehensive implementation and promotion of the Airborne Pollution Action Plan, PM_2.5 pollution in China has eased significantly [1–3]. PM_2.5 concentrations in the Beijing-Tianjin-Hebei, Yangtze River Delta, Pearl River Delta, and Chengdu-Chongqing urban agglomerations have decreased at the same time [4]. However, changes in the mass concentration ρ(PM_2.5) alone cannot fully reflect the trend of regional PM_2.5 pollution. Haze pollution in urban agglomerations has shown significant regional characteristics. Regional PM_2.5 pollution not only reduces atmospheric visibility but, more importantly, poses serious threats to the physical and mental health of urban residents [5]. The numerical prediction of the evolution of regional PM_2.5 pollution is therefore both of academic significance and a pressing societal need, and it has become a key focus in the field of atmospheric environment research.

Air pollution events refer to the phenomenon where pollutants emitted by human activities exceed concentration limits after undergoing a series of physical and chemical processes in the atmosphere [6–7]. Regional pollution is primarily the result of local emissions and cross-regional transport interacting under unfavorable weather conditions. When emissions from regional pollution sources are relatively stable, the concentration of pollutants mainly depends on the atmospheric diffusion capacity, which is closely related to surface weather patterns and meteorological factors [8]. Variations in regional meteorological conditions have a direct impact on the accumulation of pollutants, as well as the formation and dissipation of severe pollution. During periods of large-scale stable weather, pollutants released from human activities accumulate and transform rapidly, leading to a quick increase in local pollution concentrations; through transport and mixing, they form large-scale, high-concentration polluted air masses, which is a typical triggering mechanism for regional pollution events [9–11]. Currently, various methods have been used to predict air pollution events, such as chemical transport models, statistical methods, and deep learning. Chemical transport models characterize the impact of emissions and meteorological conditions on air quality by adjusting emission inventories and meteorological settings, and designing multiple scenario analyses [12–13]. However, as numerical simulations, chemical transport models place relatively high demands on computing resources. Additionally, the uncertainty in emission inventories and meteorological data somewhat diminishes the reliability of the analysis results. Traditional statistical methods are built on the assumption that the relationships between variables are linear.
However, in real atmospheric systems, most time series of air pollution concentrations exhibit significant non-linear and non-stationary characteristics [14–16]. As a result, traditional statistical methods perform poorly in predicting air quality time series. Deep learning is a machine learning method that employs multi-layer deep neural network structures, where neural networks consist of a large number of interconnected neurons, and information is transmitted from one layer of neurons to the next through activation functions [17,18]. Deep learning can efficiently process massive and complex air quality data, leveraging stronger feature extraction and representation capabilities to capture the inherent patterns and complex relationships contained in air quality time series [19–27]. Current studies on air quality prediction using deep learning have achieved substantial advancements but have mainly focused on predicting PM_2.5 pollution in individual cities [28–36]. However, in recent years, PM_2.5 pollution has expanded beyond single cities to exhibit regional characteristics [37]. Since the formation of regional pollution processes is closely related to weather patterns, differences in meteorological conditions and pollutant transport can lead to significant variations in the impact range and evolution of pollution events, which poses significant challenges for assessment and prediction at the regional scale. Therefore, traditional predictions of PM_2.5 for individual cities can no longer address the current challenges of regional PM_2.5 pollution forecasting. Constructing effective deep learning predictive models that incorporate regional pollution characteristics has become a primary task for current regional pollution forecasting.

The Chengdu-Chongqing urban agglomeration is one of the most urbanized regions in western China and is identified as a key area for air pollution control under the “Three Zones and Ten Regions” initiative [38]. The Sichuan Basin is encircled by mountains, and its deep basin topography makes it highly susceptible to prolonged stable weather conditions in winter, resulting in poor vertical and horizontal diffusion conditions. The prolonged accumulation of pollutants, along with the input from the southern Sichuan region, makes regional pollution events likely to occur. According to the 2023 Sichuan Province Environmental Air Quality Annual Report, 11 cities in the Sichuan Basin still had annual average PM_2.5 concentrations exceeding the national secondary standard for ambient air quality (35 µg/m³), and PM_2.5 pollution has not been fully controlled. Therefore, building a regional PM_2.5 prediction model based on the Chengdu-Chongqing urban agglomeration has significant practical implications for preventing regional pollution in this area.

2. Data and methods

2.1. Data

The Chengdu-Chongqing urban cluster, with Chengdu and Chongqing at its core, constitutes a key economic and cultural center in western China, hosting a resident population of nearly 100 million. Geographically, the urban agglomeration spans approximately 101°57′-108°56′E and 27°40′-32°19′N, covering a total area of about 185,000 km². The air pollutant monitoring data includes hourly average concentration data for 16 cities in the Chengdu-Chongqing urban agglomeration (Chengdu, Zigong, Luzhou, Deyang, Mianyang, Suining, Neijiang, Leshan, Nanchong, Meishan, Yibin, Guang’an, Dazhou, Ya’an, Ziyang, and Chongqing), with the study period from January 1, 2021, to December 31, 2023. The primary pollutants monitored are SO₂, NO₂, CO, PM_2.5, PM_10, and O₃. The air pollution data is sourced from the National Urban Air Quality Real-time Release Platform (https://www.aqistudy.cn/). Missing pollution data was filled using the average of adjacent values. In addition, meteorological data for the same period comes from the China Meteorological Data Service Center (http://data.cma.cn/), primarily including meteorological factors such as temperature, relative humidity, atmospheric pressure, rainfall, and wind speed. Missing values in the datasets for each indicator were supplemented using the cubic spline interpolation method.

2.2. Methods

2.2.1. Data preprocessing.

To better capture the dependency features within different variables, this paper classifies the input data of the MSFRPM model into three categories: historical PM_2.5 concentrations, exogenous variables (SO₂, NO₂, CO, PM₁₀, O₃, and meteorological data), and PM_2.5 data from the various cities in the region. It is worth noting that the three categories of input data do not relate to future PM_2.5 concentrations in the same way. Historical PM_2.5 concentrations have a temporal dependency relationship with future PM_2.5 concentrations, exogenous variables show a nonlinear dependency relationship, and PM_2.5 data from the various cities in the region have a spatial dependency relationship with the future PM_2.5 concentrations in the target city. Different dependency features require different deep learning structures for processing and mapping. Accordingly, to capture the dependency relationships of the input variables more accurately, the input data are arranged separately for each structure. The historical PM_2.5 concentrations are fed to an LSTM model, and the exogenous variables are fed to a conventional deep neural network (DNN). For the PM_2.5 data from the various cities in the region, we adopt the preprocessing method proposed by Xiao et al. [39], which transforms the sequence data into a graph structure that is more conducive for the multi-scales pyramid network to capture the multi-scales spatial dependency characteristics of pollutant transport.

To mitigate the impact of different indicator scales and enhance model efficiency, all input data are normalized as follows:

$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$ (1)

where $x^{*}$ is the normalized data for each indicator, with a range of 0 to 1; $x$ refers to the original data values for each indicator; $x_{\min}$ indicates the minimum value; and $x_{\max}$ represents the maximum value.
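The min-max scaling described above can be illustrated with a short, standard-library Python sketch. The function name `min_max_normalize` and the sample values are illustrative assumptions, as is the choice to map a constant series to 0.0, which the paper does not discuss.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers to the range [0, 1] with min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series has no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical hourly PM2.5 readings in µg/m³ (not taken from the paper).
pm25 = [12.0, 35.0, 58.0, 104.0]
normalized = min_max_normalize(pm25)
print(normalized)
```

In practice the minimum and maximum would normally be computed on the training portion only and reused for validation and test data, so that no information from later periods leaks into training.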

2.2.2. Definition of regional PM_2.5 pollution.

This study refers to the “Technical Guide for the Evaluation of Ambient Air Quality Forecasting Performance” regarding the definition of regional pollution levels, using five adjacent prefecture-level or higher cities as the spatial scope for determining regional pollution [40]. Consistent with the HJ 633−2012 “Ambient Air Quality Index (AQI) Technical Regulation”, natural days are used as the time unit for determining regional pollution, following the regional PM_2.5 pollution definition and classification system developed for the Yangtze River Delta region [41]. The three levels are defined as follows. Mild regional pollution: in one natural day, more than five contiguous cities within the region experience PM_2.5 pollution, with no more than four cities classified as moderately polluted and the rest experiencing mild pollution. Moderate regional pollution: in one natural day, more than five contiguous cities in the region experience PM_2.5 pollution, with no more than four cities classified as heavily polluted or worse; if no cities experience heavy pollution or worse, at least five cities must be moderately polluted. Severe regional pollution: in one natural day, more than five contiguous cities in the region experience PM_2.5 pollution, with at least five cities reaching severe or higher pollution levels.
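The classification rules above can be sketched as a small decision function. This is only one reading of the definition, under several assumptions that are not in the paper: the integer encoding of city-level categories is hypothetical, the cities passed in are assumed to have already been checked for contiguity, and "severe or higher" in the last rule is read as "heavily polluted or worse" so that the three classes do not overlap.

```python
# Hypothetical integer encoding of a city's daily PM2.5 category (an assumption).
GOOD, MILD, MODERATE, HEAVY, SEVERE = 0, 1, 2, 3, 4


def classify_regional_day(city_levels, min_cities=6):
    """Classify one natural day for a group of contiguous cities.

    city_levels: daily category of each city in the contiguous group.
    min_cities:  "more than five" polluted cities means at least six.
    Returns "none", "mild", "moderate" or "severe".
    """
    polluted = [lvl for lvl in city_levels if lvl >= MILD]
    if len(polluted) < min_cities:
        return "none"
    n_moderate = sum(1 for lvl in polluted if lvl == MODERATE)
    n_heavy_plus = sum(1 for lvl in polluted if lvl >= HEAVY)
    if n_heavy_plus >= 5:
        return "severe"      # at least five cities heavily polluted or worse
    if n_heavy_plus >= 1 or n_moderate >= 5:
        return "moderate"    # one to four heavy cities, or five or more moderate ones
    return "mild"            # no heavy cities and at most four moderate ones


print(classify_regional_day([MILD] * 6))
print(classify_regional_day([MODERATE] * 5 + [MILD]))
print(classify_regional_day([HEAVY] * 5 + [MILD] * 2))
```

Counting the days of 2023 that fall into each class would then amount to applying this function to every day's city-level categories.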

2.2.3. Multi-scales feature fusion regional pollution prediction network.

In the atmospheric system, the evolution of PM_2.5 concentrations in a city is influenced not only by its historical PM_2.5 values, exogenous variables, but also by regional transport [42]. However, the patterns of influence on PM_2.5 may vary significantly across different factors. The historical values of PM_2.5 primarily have a temporal dependency relationship with future PM_2.5, while exogenous variables mainly present nonlinear dependencies. Regional transport of pollutants plays a crucial role in the formation and maintenance of air pollution events [11,43]. For emission source areas, it is generally believed that stagnant meteorological conditions such as weak surface winds, strong temperature inversions, subsidence, and low mixing layer heights are conducive to the accumulation of air pollutants, leading to heavy air pollution [44,45]. Under unique non-stagnant meteorological conditions of strong near-surface winds, no temperature inversion, and additional instability in the atmospheric boundary layer, regional transport of PM_2.5 exacerbates regional pollution [46,47]. Governed by strong winds in the lower troposphere and vertical diffusion, air pollutants can be transported over long distances from the pollution source to downwind receptor areas, significantly expanding the affected region [48,49]. Compared to source-area pollutants, the impact of pollutant transport is also not negligible. It is worth noting that the impact of regional transport on PM_2.5 evolution of a city is primarily spatially dependent. Due to the varying distances between different transport sources and the target city, the timing of transport shows significant differences. This results in significant differences in the spatial dependency of regional transport at different time scales. If only traditional methods are used, assuming consistent dependency relationships among variables influencing PM_2.5 evolution, significant errors may occur in PM_2.5 concentrations predictions. 
To address this, this paper uses a locally connected approach to design a multi-scales deep learning module for the spatial dependencies at different time scales, ultimately outputting accurate predictions through a fully connected layer. The detailed process is as follows:

Step 1: For the historical PM_2.5 values ($x_t$), the relationship with future PM_2.5 concentrations is primarily a temporal dependency relationship. In deep learning, LSTM effectively learns long-term dependencies in time series using a unique feedback structure and gating mechanism [50]. Therefore, a temporal dependency module (TDM) is built using the LSTM model to capture the temporal dependencies between $x_t$ and future PM_2.5 values, as illustrated in Fig 1c. The specific process is as follows:

Fig 1. The overall framework of MSFRPM.

https://doi.org/10.1371/journal.pone.0333489.g001

$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$ (2)

$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$ (3)

$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$ (4)

$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$ (5)

$c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t$ (6)

$h_t = o_t \circ \tanh(c_t)$ (7)

where, $x_t$ represents the historical value of PM_2.5 concentrations. $f_t$ indicates the forget gate. $o_t$ is the output gate. $i_t$ indicates the input gate. $h_t \in \mathbb{R}^{D}$ refers to the feature vector of temporal dependencies. $\tilde{c}_t$ indicates a neuron. $c_t$ represents a memory neuron. $\sigma$ is the sigmoid activation function. $W$, $U$ and $b$ represent the weight and bias matrices. D represents the spatial dimension. ◦ represents the Hadamard product.
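To make the gating mechanism of Step 1 concrete, the following sketch implements one step of a standard LSTM cell with a hidden size of 1, so that every quantity is a plain number. It is a textbook formulation rather than the authors' code; the parameter names (`W_f`, `U_f`, `b_f` and so on) and the toy values are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps parameter names to floats."""
    def gate(name, squash):
        # Each gate combines the current input and the previous hidden state.
        return squash(p["W_" + name] * x_t + p["U_" + name] * h_prev + p["b_" + name])

    f_t = gate("f", sigmoid)        # forget gate
    i_t = gate("i", sigmoid)        # input gate
    o_t = gate("o", sigmoid)        # output gate
    c_hat = gate("c", math.tanh)    # candidate memory
    c_t = f_t * c_prev + i_t * c_hat
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


def run_lstm(series, p):
    """Feed a whole series through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x_t in series:
        h, c = lstm_step(x_t, h, c, p)
    return h


names = [kind + "_" + g for g in "fioc" for kind in ("W", "U", "b")]
toy_params = {name: 0.1 for name in names}      # arbitrary illustrative weights
feature = run_lstm([0.2, 0.4, 0.3], toy_params)  # normalized PM2.5 history
print(feature)
```

The final hidden state plays the role of the temporal dependency feature that the TDM passes on to the fusion layer; a real model would use a vector-valued state and learned weights.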

Step 2: For the complex nonlinear relationship between exogenous variables () and PM_2.5 concentrations, a deep neural network (DNN) framework is used to capture the complex, nonlinear correlations between variables. The DNN possesses powerful feature learning and nonlinear processing capabilities. In the DNN, serves as the input data, fed into the DNN, undergoing layer-by-layer computations in the hidden layers and nonlinear transformations through the activation function, enabling the DNN to effectively extract complex nonlinear relationship features between other pollutants, meteorological factors, and PM_2.5 concentrations. Accordingly, a nonlinear relationship module (NRM) is further constructed based on the DNN structure, with its structure shown in Fig 1b. The calculation formula for the NRM module is as follows:

(8)

(9)

(10)

where, represents the exogenous variables within the city, such as pollutants like SO₂, NO₂, CO, PM₁₀, and O₃, as well as meteorological factors. , where and represent the input layer and output layer, respectively. represents the hidden layers. represents the nonlinear feature of the -th neuron in the -th layer after the activation function . and denote the weight and bias vectors. represents the feature vector of the nonlinear dependency of the exogenous variables.
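The layer-by-layer computation of the NRM described in Step 2 can be sketched as a plain feed-forward pass. The activation function is not named in the text, so ReLU is used here as an assumption, and the weights, biases and input values are invented for illustration.

```python
def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W · inputs + b), one row of W per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def dnn_forward(x, layers, activation=relu):
    """Pass the input through every (weights, biases) pair in turn."""
    h = x
    for weights, biases in layers:
        h = dense(h, weights, biases, activation)
    return h


# Two hypothetical normalized exogenous inputs (for example NO2 and relative humidity).
exogenous = [1.0, 2.0]
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer with two neurons
    ([[1.0, 1.0]], [0.1]),                    # output layer with one neuron
]
nonlinear_feature = dnn_forward(exogenous, layers)
print(nonlinear_feature)
```

The output of the last layer corresponds to the nonlinear dependency feature of the exogenous variables that is later fused with the temporal and spatial features.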

Step 3: To effectively extract the influence of the spatial dependencies of regional transport on PM_2.5 concentrations at different time scales, a module for extracting multi-scales spatial features (MSPM) has been designed based on a multi-scales pyramid network, illustrated in Fig 1a. According to the pyramid network structure, multiple convolutional layers are utilized to convert the original time series into feature representations hierarchically, from smaller scales to larger scales. This multi-scales structure provides us with the opportunity to observe the original time series at different time scales. Specifically, feature representations at smaller scales can retain spatial features of transport over shorter distances, while feature representations at larger scales can capture spatial features of transport over longer distances. The computation process of MSPM is as follows:

(11)

(12)

(13)

(14)

where, ⊗ represents the convolution operator. denotes the PM_2.5 concentrations values of various cities within the region. and denote the weights and biases for the first convolution operation. and represent the convolution kernel and bias vector of the k-th layer, respectively. ReLU denotes the ReLU activation function. Pooling is the pooling operation. represents the multi-scales spatial features.
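The pyramid idea in Step 3, where repeated convolution and pooling yield progressively coarser views of the same series, can be sketched in one dimension. This is a simplified illustration, not the MSPM itself: the paper operates on a graph-structured input across cities, whereas the sketch uses a single series, fixed averaging kernels and max pooling, all of which are assumptions.

```python
def conv1d(series, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation form, as is usual in deep learning)."""
    k = len(kernel)
    return [
        sum(kernel[j] * series[i + j] for j in range(k)) + bias
        for i in range(len(series) - k + 1)
    ]


def relu(seq):
    return [max(0.0, v) for v in seq]


def max_pool(seq, size=2):
    """Non-overlapping max pooling; a trailing incomplete window is dropped."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def pyramid_features(series, kernels):
    """Return one feature sequence per scale, from the finest to the coarsest."""
    features = []
    current = series
    for kernel in kernels:
        current = max_pool(relu(conv1d(current, kernel)))
        features.append(current)
    return features


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scales = pyramid_features(series, kernels=[[0.5, 0.5], [0.5, 0.5]])
print(scales)
```

Each successive level summarizes a longer stretch of the original series, which mirrors the text's point that smaller scales retain short-distance transport features while larger scales capture long-distance ones.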

Step 4: Finally, to achieve regional pollution prediction, a deep learning multi-output structure is employed to simultaneously output PM_2.5 pollution for all cities in the urban agglomeration, enabling regional pollution assessment. The multi-output structure mainly consists of two layers: one is the feature fusion layer, and the other is the multi-output layer, as shown in Fig 1d. The structural formulas are as follows:

(15)

(16)

where, , , and denote the weights and biases for the historical dependency features of PM_2.5, the nonlinear features of exogenous variables, and the multi-scales spatial features of pollution transport, respectively. represents the sigmoid activation function. represents the predicted PM_2.5 value for the -th city in the urban cluster. represents the number of cities within the urban agglomeration.

2.2.4. Loss function.

As MSFRPM is a multi-output deep learning model, conventional loss functions are inadequate for optimizing the MSFRPM model. To enable gradient descent-based optimization algorithms to be applicable to the MSFRPM model, a multi-output loss function (MOLF) is proposed. The specific formula for MOLF is as follows:

where M denotes the number of cities in the city cluster, N denotes the length of the PM_2.5 series, $y_{i,t}$ represents the observed value of city $i$ at time $t$, and $\hat{y}_{i,t}$ represents the predicted value of city $i$ at time $t$.
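The MOLF formula itself is not reproduced in this text, so its exact form cannot be confirmed here. As a hedged illustration only, the sketch below assumes a mean squared error averaged over all M cities and N time steps, which is one common way to define a single differentiable loss for a multi-output model; the function name and sample values are hypothetical.

```python
def multi_output_mse(observed, predicted):
    """Mean squared error over all cities and time steps.

    observed, predicted: lists of M city series, each of length N.
    Assumption: squared error averaged over M * N; the paper's MOLF may differ.
    """
    m = len(observed)
    n = len(observed[0])
    total = 0.0
    for obs_city, pred_city in zip(observed, predicted):
        for y, y_hat in zip(obs_city, pred_city):
            total += (y - y_hat) ** 2
    return total / (m * n)


# Two hypothetical cities with two time steps each.
observed = [[1.0, 2.0], [3.0, 4.0]]
predicted = [[1.0, 3.0], [2.0, 4.0]]
loss = multi_output_mse(observed, predicted)
print(loss)
```

Because the errors of all cities enter one scalar, a single gradient step updates the shared layers with respect to every city at once, which is the property the text requires of a multi-output loss.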

2.2.5. Experimental setup.

In the experiment, we split the preprocessed data into training, validation, and test sets in a ratio of 7:1:2. The first 70% of the data was used as the training set, the next 10% as the validation set, and the remaining 20% as the test set. We employed the MSFRPM prediction model with an input window length of 168 (one week of hourly data) and an output prediction length of 1. The number of training batches is set to 30, and the learning rate is set to 0.0001. The maximum number of training epochs is 100. The optimizer used for the MSFRPM is Adam, and the loss function is MOLF.
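The chronological 7:1:2 split and the construction of input windows of length 168 with a one-step target can be sketched as follows. The function names and the toy series are illustrative assumptions, and integer arithmetic is used for the split points to avoid floating-point rounding.

```python
def chronological_split(series, ratios=(7, 1, 2)):
    """Split a series into train/validation/test parts in time order."""
    n = len(series)
    total = sum(ratios)
    train_end = n * ratios[0] // total
    val_end = train_end + n * ratios[1] // total
    return series[:train_end], series[train_end:val_end], series[val_end:]


def sliding_windows(series, input_len=168, horizon=1):
    """Build (input window, target) pairs for supervised training."""
    samples = []
    for start in range(len(series) - input_len - horizon + 1):
        x = series[start:start + input_len]
        y = series[start + input_len:start + input_len + horizon]
        samples.append((x, y))
    return samples


hourly = list(range(1000))  # stand-in for an hourly PM2.5 series
train, val, test = chronological_split(hourly)
train_samples = sliding_windows(train)
print(len(train), len(val), len(test), len(train_samples))
```

Splitting in time order, rather than at random, keeps every test observation later than the training data, which matches the description of using the first 70% for training and the final 20% for testing.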

2.2.6. Evaluation metrics.

To assess the performance of the proposed multi-scales feature fusion regional pollution prediction model, ten-fold cross-validation is employed in this paper, which helps to detect and limit model overfitting [51]. The steps of ten-fold cross-validation are as follows. First, the input data set of the model is randomly divided into 10 subsets of equal length. Then, the model is trained 10 times, with each run using one of the subsets (without repetition) as the test set, while the remaining 9 subsets serve as the training set. Finally, the model performance is evaluated. In evaluating the model, R², Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) are selected as the three metrics. The R² value typically lies in the range [0, 1]; the closer the value is to 1, the better the model performance, and the closer it is to 0, the worse the performance. MAE and RMSE measure the error between the predicted and actual values, and the closer their values are to 0, the better the model performance. The formulas for the evaluation metrics are as follows:

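$R^2 = 1 - \frac{\sum_{t=1}^{N} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{N} (y_t - \bar{y})^2}$ (17)

$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| y_t - \hat{y}_t \right|$ (18)

$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (y_t - \hat{y}_t)^2}$ (19)

where, $N$ denotes the length of the pollutant series. $y_t$ refers to the PM_2.5 concentration value at time $t$. $\hat{y}_t$ represents the predicted PM_2.5 concentration at time $t$. $\bar{y}$ is the average value of the PM_2.5 concentrations.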
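The three evaluation metrics can be computed with a few lines of standard-library Python. The sketch uses the usual definitions (R² as one minus the ratio of residual to total sum of squares); the function names and sample values are illustrative assumptions.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

Note that with this definition R² can become negative when a model predicts worse than the observed mean, so values near 0 already indicate a poor fit.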

3. Results and discussion

3.1. Statistical analysis

Table 1 shows the basic statistics of the PM_2.5 concentration data in the Chengdu-Chongqing urban agglomeration. As shown in Table 1, the maximum PM_2.5 concentrations in the cities of the Chengdu-Chongqing urban agglomeration in 2023 were 201.000 µg/m³, 204.000 µg/m³, 215.000 µg/m³, 226.000 µg/m³, 207.000 µg/m³, 214.000 µg/m³, 226.000 µg/m³, 218.000 µg/m³, 205.000 µg/m³, 234.000 µg/m³, 195.000 µg/m³, 240.000 µg/m³, 209.000 µg/m³, 209.000 µg/m³, 316.000 µg/m³, and 169.000 µg/m³, respectively. The maximum values for all cities are far higher than the PM_2.5 annual concentration limit of 35 µg/m³ stipulated for the secondary standard in the National Ambient Air Quality Standard (GB 3095−2012). The average PM_2.5 concentrations in the cities of the Chengdu-Chongqing urban agglomeration in 2023 were 39.297 µg/m³, 41.619 µg/m³, 36.981 µg/m³, 38.318 µg/m³, 35.005 µg/m³, 30.069 µg/m³, 40.673 µg/m³, 32.012 µg/m³, 43.057 µg/m³, 43.853 µg/m³, 40.092 µg/m³, 36.572 µg/m³, 44.235 µg/m³, 31.397 µg/m³, 40.369 µg/m³, and 36.585 µg/m³, respectively. The average values in most cities (13 of 16) exceeded the national secondary PM_2.5 standard limit of 35 µg/m³ at the same time, indicating that haze pollution in the Chengdu-Chongqing urban agglomeration is still very serious. This suggests that PM_2.5 pollution in the Chengdu-Chongqing urban agglomeration has clearly developed into a regional pollution issue. The standard deviation for each city was well above 0, indicating that PM_2.5 concentrations in each city of the Chengdu-Chongqing urban agglomeration fluctuate strongly. Moreover, based on skewness and kurtosis, the PM_2.5 series deviates from a normal distribution. The skewness values are notably greater than 0, indicating right-skewed distributions in which most PM_2.5 concentrations lie below the mean while a long tail of high values pulls the mean upward.
Further applying the Lilliefors test to the PM_2.5 series showed that the Lilliefors statistics were markedly higher than the critical value (0.013). This indicates that the null hypothesis is rejected in the Lilliefors test, showing that the PM_2.5 concentrations sequences in each city of the Chengdu-Chongqing urban agglomeration in 2023 do not follow a normal distribution and exhibit significant nonlinear characteristics. This also suggests that the evolution of regional PM_2.5 pollution in the Chengdu-Chongqing urban agglomeration may be influenced by nonlinear dynamic mechanisms, and that traditional linear statistical prediction methods may not meet the requirements for regional pollution assessment.

Table 1. Basic statistical metrics of PM_2.5 pollution concentration.

https://doi.org/10.1371/journal.pone.0333489.t001

Furthermore, based on the definitions of regional pollution from the “Technical Guidelines for Evaluating the Effectiveness of Ambient Air Quality Forecasts” and the “Technical Specifications for Ambient Air Quality Index (AQI)”, the regional PM_2.5 pollution in the Chengdu-Chongqing urban agglomeration in 2023 was statistically analyzed. Annually, in 2023, the Chengdu-Chongqing urban agglomeration experienced 15 occurrences of mild regional pollution, 21 occurrences of moderate regional pollution, and 2 occurrences of severe regional pollution. This result indicates that moderate regional pollution is predominant in PM_2.5 pollution within the Chengdu-Chongqing urban agglomeration. In terms of seasons, during spring 2023, the Chengdu-Chongqing urban agglomeration experienced 1 occurrence of mild regional pollution and no occurrences of moderate or severe regional pollution. In summer, no mild, moderate, or severe regional pollution occurred. In autumn, there was 1 occurrence of mild regional pollution and no occurrences of moderate or severe regional pollution. In winter, the Chengdu-Chongqing urban agglomeration experienced 14 occurrences of mild regional pollution, 21 occurrences of moderate regional pollution, and 2 occurrences of severe regional pollution. These results indicate that mild, moderate, and severe regional pollution mainly occurred in winter.

3.2. Model validation

3.2.1. Model training and testing.

To achieve optimal performance of the MSFRPM model, this study primarily adjusts two important hyperparameters: the size of the convolutional kernel and the length of the input data. In the tuning process, the convolutional kernel sizes were set to 2, 4, 6, 8, and 10, and the input data lengths were set to 4, 12, 24, 120, and 168. It was found through experiments that the model achieved optimal performance with a convolutional kernel size of 4 and an input length of 24. The results are presented in Table 2, showing an average MAE of 4.04 µg/m³, an average R² of 0.97, and an average RMSE of 6.13 µg/m³ for all cities.

Table 2. Comparison results of predictions between MSFRPM model and benchmark models.

https://doi.org/10.1371/journal.pone.0333489.t002

3.2.2. Annual model performance.

To assess the validity of the MSFRPM model, this section conducts a comparative analysis using benchmark models including XGBoost, SVM, RF, BPNN, LSTM, and ConvLSTM. Fig 2 illustrates the PM_2.5 concentration predictions of the XGBoost, SVM, RF, BPNN, LSTM, ConvLSTM, and MSFRPM models in 2023. The R² values for XGBoost, SVM, and LSTM were relatively low, whereas the RF, BPNN, ConvLSTM, and MSFRPM models all surpassed an R² of 0.97. The three metrics for deep learning models (excluding LSTM) showed significant improvements over traditional machine learning models like XGBoost, SVM, and RF. This improvement may primarily be due to the ability of deep learning models to better capture the nonlinear relationships between PM_2.5 concentrations and other factors. The poor performance of LSTM may be related to its structure: LSTM primarily explores the inherent temporal dependencies in a time series, while the fluctuations in PM_2.5 concentration in the Chengdu-Chongqing urban agglomeration are less influenced by residual particles in the atmospheric system. Compared to the other deep learning models, MSFRPM takes into account the complex nonlinear relationships among various precursors and meteorological factors within cities, as well as the spatial dependency relationships of PM_2.5 concentrations across different scales among cities in the Chengdu-Chongqing region, yielding an MAE of 3.70 µg/m³, an R² of 0.98, and an RMSE of 5.09 µg/m³.

Fig 2. Prediction results of PM_2.5 concentrations by XGBoost, SVM, RF, BPNN, LSTM, ConvLSTM, and MSFRPM Models.

https://doi.org/10.1371/journal.pone.0333489.g002

Furthermore, the fitting graph shows that when PM_2.5 concentrations are low, the fitting line of predicted and monitored values is very close to the theoretical fitting line (y = x), indicating that all prediction models perform well at low PM_2.5 concentrations. When PM_2.5 concentrations are high, the fitting line of predicted and monitored values increasingly deviates from the theoretical fitting line as the concentration rises. This indicates that the performance of each prediction model declines as PM_2.5 concentrations rise. Notably, across most of the concentration range, the scatter points for predicted and monitored values from the MSFRPM model are evenly distributed around the theoretical fitting line. These results indicate that the MSFRPM model consistently outperforms the XGBoost, SVM, RF, BPNN, LSTM, and ConvLSTM models, whether pollution levels are low or high. This is primarily because the MSFRPM model takes into account not only the temporal dependencies of historical PM_2.5 values but also the complex nonlinear relationships between other pollutants and meteorological factors within cities, as well as the spatial dependencies of PM_2.5 concentrations at different scales among cities in the city cluster. However, when concentrations exceed 190 µg/m³, the MSFRPM model tends to underestimate. This phenomenon may primarily result from unmeasurable anthropogenic emission factors.

3.2.3. Seasonal performance of MSFRPM model.

Fig 3 shows the seasonal PM_2.5 concentration predictions of the MSFRPM model in the Chengdu-Chongqing urban agglomeration. Because the evolution patterns of PM_2.5 concentrations differ across seasons, the predictive performance of the MSFRPM model also varies. The MAE values for the MSFRPM model in spring, summer, autumn, and winter are 2.63 µg/m³, 2.10 µg/m³, 2.82 µg/m³, and 4.28 µg/m³, respectively; the R² values are 0.96, 0.91, 0.97, and 0.97; and the RMSE values are 3.67 µg/m³, 2.94 µg/m³, 4.06 µg/m³, and 6.77 µg/m³, respectively. Although the MAE and RMSE are lowest in summer, the R² is also the lowest. The lower MAE and RMSE may be attributed to favorable atmospheric transport conditions and the cleansing effect of rainfall in summer, which keep concentrations low. The lower R² suggests that the predictive performance of the MSFRPM model is weakest in summer. Despite higher MAE and RMSE values in spring, autumn, and winter, the R² values exceed 0.90 in all cases. The higher R² indicates that the MSFRPM model can effectively capture the evolution patterns of PM_2.5 concentrations across cities in the Chengdu-Chongqing urban agglomeration. This is mainly because the Chengdu-Chongqing urban agglomeration is located in the Sichuan Basin, where winter is characterized by calm winds, weak atmospheric transport capacity, and strong stable stratification, making it easier for the MSFRPM model to capture the evolution patterns of PM_2.5 concentrations. The higher errors in winter may be attributed to anthropogenic emissions, such as increased population mobility during winter holidays and higher emissions from firecrackers.

Fig 3. Seasonal prediction results of PM_2.5 concentrations using the MSFRPM model.

https://doi.org/10.1371/journal.pone.0333489.g003
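
The three seasonal metrics discussed in this subsection can be computed as follows. This is a minimal sketch assuming the standard definitions of MAE and RMSE and assuming that R² is the coefficient of determination (1 − SS_res/SS_tot); the exact R² definition used by the authors is not restated in this section. The two sample series are hypothetical and are chosen only to show how a season with a narrow concentration range can combine low MAE and RMSE with a lower R².

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical data (µg/m³): identical errors applied to a narrow
# "summer-like" range and a wide "winter-like" range of observations.
errors = [2.0, -2.0, 2.0, -2.0]
summer_obs = [14.0, 18.0, 20.0, 24.0]
winter_obs = [60.0, 80.0, 100.0, 120.0]
summer_pred = [o + e for o, e in zip(summer_obs, errors)]
winter_pred = [o + e for o, e in zip(winter_obs, errors)]

for name, obs, pred in [("summer-like", summer_obs, summer_pred),
                        ("winter-like", winter_obs, winter_pred)]:
    print(f"{name}: MAE={mae(obs, pred):.2f} RMSE={rmse(obs, pred):.2f} "
          f"R2={r2(obs, pred):.3f}")
```

Both series have the same MAE and RMSE because the errors are identical, but R² is lower for the narrow-range series because the same residuals are compared against a much smaller variance in the observations. This is one possible reading of the summer result, not a finding of the study.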

3.3. Comparison results of different machine learning algorithms

To evaluate the predictive performance of the MSFRPM model, this section further compares its PM_2.5 predictions for Chengdu with those of XGBoost, SVM, RF, BPNN, LSTM, and ConvLSTM, as shown in Fig 2 and Table 2. According to Table 2, the MSFRPM model exhibits the best predictive performance, followed by ConvLSTM, RF, BPNN, LSTM, XGBoost, and SVM. Compared to traditional machine learning methods (XGBoost, SVM, and RF), deep learning models (excluding LSTM) show better predictive performance for PM_2.5. This is because deep learning models can better uncover the nonlinear relationships between PM_2.5 concentrations and other factors. Compared to BPNN and LSTM, the ConvLSTM model exhibits improved predictive performance. This is because ConvLSTM combines the advantages of CNN and LSTM: it considers the temporal dependencies of PM_2.5 concentrations and also incorporates the spatial dependencies among cities through its CNN structure. Compared to ConvLSTM, the MSFRPM model integrates the advantages of BPNN, CNN, and LSTM, taking into account the temporal dependencies of historical PM_2.5 values, the complex nonlinear relationships between other pollutants and meteorological factors within cities, and the multi-scale spatiotemporal dependencies of PM_2.5 concentrations across different cities in the urban agglomeration. Therefore, the MSFRPM model achieves the best prediction results for PM_2.5 in the Chengdu-Chongqing urban agglomeration among the compared models.

3.4. Assessment of regional PM_2.5 pollution concentration

3.4.1. Annual variation of regional PM_2.5 pollution concentration.

The spatiotemporal evaluation of PM_2.5 for the Chengdu-Chongqing urban agglomeration in 2023 using the MSFRPM model is presented in Fig 4. Fig 4 indicates that the annual average PM_2.5 values assessed by the model differ notably among cities, ranging from 30.385 µg/m³ to 44.306 µg/m³, with an average of 37.980 µg/m³. The monitored annual averages range from 30.070 µg/m³ to 44.236 µg/m³, with a mean of 38.133 µg/m³. The MSFRPM assessment results are therefore closely consistent with the monitored concentration values.

Fig 4. Distribution of PM_2.5 concentrations in the Chengdu-Chongqing urban agglomeration in 2023.

https://doi.org/10.1371/journal.pone.0333489.g004

According to Fig 4, the annual average PM_2.5 concentrations in the Chengdu-Chongqing urban agglomeration exhibit significant spatial heterogeneity. The annual average PM_2.5 concentrations of the cities in 2023 can be roughly classified into four categories. The first category includes Yibin and Luzhou, with a concentration range of 43.40 to 48.00 µg/m³. The second category comprises Chengdu, Deyang, Leshan, Zigong, Neijiang, and Guang’an, with concentrations ranging from 38.80 to 43.40 µg/m³. The third category consists of Chongqing, Nanchong, Mianyang, Meishan, and Ziyang, with a range of 34.20 to 38.80 µg/m³. The fourth category includes Dazhou, Suining, and Ya’an, with a range of 29.60 to 34.20 µg/m³. The cities in the first category have higher annual average pollution concentrations, primarily because Yibin and Luzhou are major heavy industrial bases in the Chengdu-Chongqing urban agglomeration, where high-pollution industries such as petrochemicals, metal smelting, and cement manufacturing lead to severe pollution events. Within the second category, Zigong and Neijiang are significant industrial bases, and the pollution levels in Leshan are largely affected by PM_2.5 transport from the northwest of Yibin and the south of Zigong [52,53]. The pollution in Chengdu and Deyang is mainly attributed to a combination of vehicle emissions in urban areas, high industrial emissions, pollutant transport from southern Sichuan, and locally unfavorable meteorological conditions.
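
The four categories above are equal-width bins of 4.6 µg/m³ between 29.6 and 48.0 µg/m³. A minimal sketch of assigning a city to a category from its annual mean is shown below. The function name `category`, the sample city values, and the rule that a value lying exactly on a shared boundary falls into the more polluted category are assumptions; the text does not state how boundary values are handled.

```python
from bisect import bisect_right

# Category boundaries (µg/m³) as listed in the text, lowest first.
EDGES = [29.6, 34.2, 38.8, 43.4, 48.0]


def category(annual_mean):
    """Return 1 (most polluted) to 4 (least polluted) for an annual mean."""
    if not EDGES[0] <= annual_mean <= EDGES[-1]:
        raise ValueError(f"{annual_mean} µg/m³ is outside the listed ranges")
    # Bin index counted from the lowest range (0..3); the top edge itself
    # is kept in the top bin.
    bin_from_bottom = min(bisect_right(EDGES, annual_mean) - 1, len(EDGES) - 2)
    return len(EDGES) - 1 - bin_from_bottom


# Hypothetical annual means (µg/m³), not the values plotted in Fig 4.
sample = {"City A": 44.3, "City B": 40.1, "City C": 36.0, "City D": 30.4}
for city, value in sample.items():
    print(city, value, "-> category", category(value))
```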

3.4.2. Seasonal variation of regional PM_2.5 pollution concentration.

Fig 5 illustrates the spatial distribution of PM_2.5 concentrations in the Chengdu-Chongqing urban agglomeration across different seasons in 2023. As shown in Fig 5, PM_2.5 concentrations exhibit a significant seasonal effect, characterized by lower levels in summer and higher levels in winter, with transitional patterns in spring and autumn. In winter, the average PM_2.5 concentrations of the cities in the Chengdu-Chongqing urban agglomeration range from 59.16 to 83.61 µg/m³, with the heaviest pollution concentrated in southern Sichuan, mainly in Luzhou, Yibin, Zigong, and Neijiang. Southern Sichuan is a concentrated area of industrial bases within the Chengdu-Chongqing urban agglomeration, where high-pollution industries such as petrochemicals, metal smelting, and cement manufacturing intensify pollution and affect PM_2.5 fluctuations in other cities through pollutant transport. In summer, the average PM_2.5 concentrations range from 12.70 to 23.95 µg/m³, significantly lower than the annual average for the urban agglomeration, primarily due to Chengdu’s humid and rainy summer conditions, which facilitate wet deposition and diffusion of PM_2.5. The average concentrations for spring and autumn are 31.67 µg/m³ and 31.61 µg/m³, respectively, with higher pollution levels concentrated mainly in southern Sichuan (Luzhou, Yibin, Zigong, and Neijiang) as well as in Chengdu and Deyang. The main reason is again the concentration of industrial bases in southern Sichuan, which exacerbates air pollution in the region.

Fig 5. Distribution of PM_2.5 concentrations in the Chengdu-Chongqing urban agglomeration during spring, summer, autumn, and winter.

https://doi.org/10.1371/journal.pone.0333489.g005
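
Seasonal averages like those quoted in this subsection can be derived from timestamped concentration records by grouping on season. The sketch below is illustrative only: the month-to-season mapping (meteorological seasons, with December to February as winter) is an assumption, since this section does not state the season boundaries, and the records and the name `seasonal_means` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Assumed meteorological seasons; the text does not give month boundaries.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def seasonal_means(records):
    """Map (timestamp, concentration) pairs to {season: mean concentration}."""
    by_season = defaultdict(list)
    for timestamp, value in records:
        by_season[SEASON_OF_MONTH[timestamp.month]].append(value)
    return {season: mean(values) for season, values in by_season.items()}


# Hypothetical hourly PM_2.5 records (µg/m³) for one city.
records = [
    (datetime(2023, 1, 15, 8), 70.0),
    (datetime(2023, 1, 16, 8), 80.0),
    (datetime(2023, 4, 10, 8), 30.0),
    (datetime(2023, 7, 1, 8), 15.0),
    (datetime(2023, 7, 2, 8), 20.0),
    (datetime(2023, 10, 5, 8), 32.0),
]
print(seasonal_means(records))
```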

3.4.3. Assessment of regional PM_2.5 pollution.

The MSFRPM model's assessment of the frequency of regional PM_2.5 pollution events in the Chengdu-Chongqing urban agglomeration is generally consistent with the observations, missing only one mild and one severe regional pollution event, as shown in Table 3. The MSFRPM model estimates a total of 36 regional pollution events in the Chengdu-Chongqing urban agglomeration in 2023: 14 mild, 21 moderate, and 1 severe. In 2023, regional PM_2.5 pollution was therefore dominated by moderate events, followed by mild events. Regional pollution events differ notably among the seasons. Spring and autumn each had only one mild regional pollution event, and no regional pollution events occurred in summer. In winter, the Chengdu-Chongqing urban agglomeration recorded 12 mild, 21 moderate, and 1 severe regional pollution events, with moderate events occurring most often. The primary reason is the combined effect of the geographical location of the Chengdu-Chongqing urban agglomeration and the winter climate. The Chengdu-Chongqing urban agglomeration is mainly located in the Sichuan Basin, where near-surface atmospheric transport is weak, making it difficult for air pollutants to disperse. These conditions favor the accumulation of PM_2.5. During winter, the cities in the Chengdu-Chongqing urban agglomeration have high calm-wind frequencies and low rainfall, which further weaken pollutant transport and wet deposition, ultimately intensifying regional pollution.

Table 3. Regional PM_2.5 pollution assessment results.

https://doi.org/10.1371/journal.pone.0333489.t003
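
The agreement between assessed and observed event counts can be checked with simple arithmetic. The sketch below uses the assessed counts from this subsection (14, 21, 1) and the observed counts reported in the Conclusion (15, 21, 2). It assumes that the observed "days" and the assessed "instances" are counted in the same unit. The ratio of totals is not a hit rate, because the text does not say whether the assessed and observed events fall on the same days.

```python
# Regional pollution event counts for 2023 as reported in the text.
observed = {"mild": 15, "moderate": 21, "severe": 2}
assessed = {"mild": 14, "moderate": 21, "severe": 1}  # MSFRPM model

# Events missed by the model at each pollution level.
missed = {level: observed[level] - assessed[level] for level in observed}
total_observed = sum(observed.values())
total_assessed = sum(assessed.values())

print("missed per level:", missed)
print("totals (observed, assessed):", total_observed, total_assessed)
print(f"assessed/observed ratio: {total_assessed / total_observed:.3f}")
```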

4. Conclusion

This study proposes a multi-scales feature fusion regional pollution prediction network (MSFRPM) for accurate prediction of regional PM_2.5 pollution, based on data from 16 cities in the Chengdu-Chongqing urban agglomeration from January 1, 2021, to December 31, 2023, including SO₂, NO₂, PM₁₀, CO, O₃, temperature, rainfall, pressure, relative humidity, and wind speed. The main conclusions are as follows.

(1) In 2023, the Chengdu-Chongqing urban agglomeration experienced 15 days of mild, 21 days of moderate, and 2 days of severe regional pollution, indicating that PM_2.5 pollution in the Chengdu-Chongqing urban agglomeration is primarily characterized by moderate regional pollution. Seasonally, regional PM_2.5 pollution is mainly concentrated in winter and is rarely observed in spring, summer, and autumn.
(2) The annual prediction metrics of the MSFRPM model for 2023 are an MAE of 3.70 µg/m³, an R² of 0.98, and an RMSE of 5.09 µg/m³. For spring, summer, autumn, and winter, the MAE values are 2.63 µg/m³, 2.10 µg/m³, 2.82 µg/m³, and 4.28 µg/m³; the R² values are 0.96, 0.91, 0.97, and 0.97; and the RMSE values are 3.67 µg/m³, 2.94 µg/m³, 4.06 µg/m³, and 6.77 µg/m³, respectively. The predictive performance of the MSFRPM model is significantly superior to that of XGBoost, SVM, RF, BPNN, LSTM, and ConvLSTM.
(3) The annual average PM_2.5 concentrations of the cities in the Chengdu-Chongqing urban agglomeration in 2023 can be roughly divided into four categories. The first category includes Yibin and Luzhou, with a concentration range of 43.40–48.00 µg/m³. The second category includes Chengdu, Deyang, Leshan, Zigong, Neijiang, and Guang’an, with a range of 38.80–43.40 µg/m³. The third category includes Chongqing, Nanchong, Mianyang, Meishan, and Ziyang, with a range of 34.20–38.80 µg/m³. The fourth category includes Dazhou, Suining, and Ya’an, with a range of 29.60–34.20 µg/m³. The heavily polluted areas are primarily located in the old industrial regions of southern Sichuan.
(4) The MSFRPM model assessed a total of 36 regional pollution events in the Chengdu-Chongqing urban agglomeration in 2023: 14 mild, 21 moderate, and 1 severe. The model's assessment of regional PM_2.5 pollution closely aligns with the observations, missing only one mild and one severe regional pollution event.

Supporting information

S1 Data. Data in the experiment.

https://doi.org/10.1371/journal.pone.0333489.s001

(ZIP)

References

1. Sheehan P, Cheng E, English A, Sun F. China’s response to the air pollution shock. Nature Clim Change. 2014;4(5):306–9.
2. Zheng B, Tong D, Li M, Liu F, Hong C, Geng G, et al. Trends in China’s anthropogenic emissions since 2010 as the consequence of clean air actions. Atmos Chem Phys. 2018;18(19):14095–111.
3. Wang J, Zhao B, Wang S, Yang F, Xing J, Morawska L, et al. Particulate matter pollution over China and the effects of control policies. Sci Total Environ. 2017;584–585:426–47. pmid:28126285
4. Ding A, Huang X, Nie W, Chi X, Xu Z, Zheng L, et al. Significant reduction of PM_2.5 in eastern China due to regional-scale emission control: evidence from SORPES in 2011–2018. Atmos Chem Phys. 2019;19(18):11791–801.
5. Zhang Q, Li X, Li XM, Zhang RH, Ren B, Che HX. Lung cell injury risks of PM_2.5 exposure in the high humidity and low solar radiation environment of southwestern China. Atmos Environ. 2024;338:120794.
6. Feng XY, Zhang ZZ, Guo JP, Wang SG. Multilayer inversion formation and evolution during persistent heavy air pollution events in the Sichuan Basin, China. Atmos Res. 2023;286:106691.
7. Li T, Zhang Q, Peng Y, Guan X, Li L, Mu J, et al. Contributions of various driving factors to air pollution events: interpretability analysis from Machine learning perspective. Environ Int. 2023;173:107861. pmid:36898175
8. Sulaymon ID, Zhang Y, Hopke PK, Guo S, Ye F, Sun J, et al. Using the COVID-19 lockdown to identify atmospheric processes and meteorology influences on regional PM_2.5 pollution episodes in the Beijing-Tianjin-Hebei, China. Atmos Res. 2023;294:106940.
9. Huang X, Ding A, Wang Z, Ding K, Gao J, Chai F, et al. Amplified transboundary transport of haze by aerosol–boundary layer interaction in China. Nat Geosci. 2020;13(6):428–34.
10. Sulaymon ID, Zhang Y, Hopke PK, Hu J, Zhang Y, Li L, et al. Persistent high PM_2.5 pollution driven by unfavorable meteorological conditions during the COVID-19 lockdown period in the Beijing-Tianjin-Hebei region, China. Environ Res. 2021;198:111186. pmid:33930403
11. Xu GY, Ren XD, Xiong KN, Li LQ, Bi XC, Wu QL. Analysis of the driving factors of PM_2.5 concentration in the air: a case study of the Yangtze River Delta, China. Ecol Indic. 2020;110:105889.
12. Li K, Jacob DJ, Liao H, Shen L, Zhang Q, Bates KH. Anthropogenic drivers of 2013-2017 trends in summer surface ozone in China. Proc Natl Acad Sci U S A. 2019;116(2):422–7. pmid:30598435
13. Dang R, Liao H, Fu Y. Quantifying the anthropogenic and meteorological influences on summertime surface ozone in China over 2012-2017. Sci Total Environ. 2021;754:142394. pmid:33254879
14. Wu B, Liu C, Zhang J, Du J, Shi K. The multifractal evaluation of PM_2.5-O₃ coordinated control capability in China. Ecol Indic. 2021;129:107877.
15. Liu C, Liang J, Li Y, Shi K. Fractal analysis of impact of PM_2.5 on surface O₃ sensitivity regime based on field observations. Sci Total Environ. 2023;858(Pt 3):160136. pmid:36375545
16. Zhang J, Li Y, Liu C, Wu B, Shi K. A study of cross-correlations between PM_2.5 and O₃ based on Copula and Multifractal methods. Physica A. 2022;589:126651.
17. Liu S, Liu X, Lyu Q, Li F. Comprehensive system based on a DNN and LSTM for predicting sinter composition. Appl Soft Comput. 2020;95:106574.
18. Tehrani AA, Veisi O, Kia K, Delavar Y, Bahrami S, Sobhaninia S, et al. Predicting urban Heat Island in European cities: a comparative study of GRU, DNN, and ANN models using urban morphological variables. Urban Clim. 2024;56:102061.
19. He Z, Guo Q, Wang Z, Li X. Prediction of monthly PM_2.5 concentration in Liaocheng in China employing artificial neural network. Atmosphere. 2022;13(8):1221.
20. Guo Q, He Z, Wang Z. Predicting of daily PM_2.5 concentration employing wavelet artificial neural networks based on meteorological elements in Shanghai, China. Toxics. 2023;11(1):51. pmid:36668777
21. Guo Q, He Z, Wang Z. Prediction of hourly PM_2.5 and PM₁₀ concentrations in Chongqing City in China based on artificial neural network. Aerosol Air Qual Res. 2023;23(6):220448.
22. Guo Q, He Z, Wang Z. Prediction of monthly average and extreme atmospheric temperatures in Zhengzhou based on artificial neural network and deep learning models. Front For Glob Change. 2023;6.
23. Guo Q, He Z, Wang Z. Monthly climate prediction using deep convolutional neural network and long short-term memory. Sci Rep. 2024;14(1).
24. Guo Q, He Z, Wang Z, Qiao S, Zhu J, Chen J. A performance comparison study on climate prediction in Weifang City using different deep learning models. Water. 2024;16(19):2870.
25. He Z, Guo Q. Comparative analysis of multiple deep learning models for forecasting monthly ambient PM_2.5 concentrations: a case study in Dezhou City, China. Atmosphere. 2024;15(12):1432.
26. Guo Q, He Z, Wang Z. Assessing the effectiveness of long short-term memory and artificial neural network in predicting daily ozone concentrations in Liaocheng City. Sci Rep. 2025;15(1):6798. pmid:40000767
27. He Z, Guo Q, Wang Z, Li X. A hybrid wavelet-based deep learning model for accurate prediction of daily surface PM_2.5 concentrations in Guangzhou City. Toxics. 2025;13(4):254. pmid:40278570
28. Yang H, Wang W, Li G. Multi-factor PM_2.5 concentration optimization prediction model based on decomposition and integration. Urban Clim. 2024;55:101916.
29. Zhang J, Li S. Air quality index forecast in Beijing based on CNN-LSTM multi-model. Chemosphere. 2022;308(Pt 1):136180. pmid:36058367
30. Zhao J, Deng F, Cai Y, Chen J. Long short-term memory - Fully connected (LSTM-FC) neural network for PM_2.5 concentration prediction. Chemosphere. 2019;220:486–92. pmid:30594800
31. Wang W, Tang Q. Combined model of air quality index forecasting based on the combination of complementary empirical mode decomposition and sequence reconstruction. Environ Pollut. 2023;316(Pt 2):120628. pmid:36370980
32. Jamei M, Ali M, Jun C, Bateni SM, Karbasi M, Farooque AA, et al. Multi-step ahead hourly forecasting of air quality indices in Australia: application of an optimal time-varying decomposition-based ensemble deep learning algorithm. Atmos Pollut Res. 2023;14(6):101752.
33. Guo QC, He ZF, Li SS, Li XZ, Meng JJ, Hou ZF. Air pollution forecasting using artificial and wavelet neural networks with meteorological conditions. Aerosol Air Qual Res. 2020;20(6):1429–39.
34. Guo Q, Wang Z, He Z, Li X, Meng J, Hou Z, et al. Changes in air quality from the COVID to the post-COVID Era in the Beijing-Tianjin-Tangshan Region in China. Aerosol Air Qual Res. 2021;21(12):210270.
35. Guo Q, He Z, Wang Z. Change in air quality during 2014-2021 in Jinan City in China and its influencing factors. Toxics. 2023;11(3):210. pmid:36976975
36. Guo Q, He Z, Wang Z. The characteristics of air quality changes in Hohhot City in China and their relationship with meteorological and socio-economic factors. Aerosol Air Qual Res. 2024;24(5):230274.
37. Yin D, Song Q, Guo Y, Jiang Y, Dong Z, Zhao B, et al. Regional transport characteristics of PM_2.5 pollution events in Beijing during 2018-2021. J Environ Sci (China). 2025;152:503–15. pmid:39617571
38. Tan S, Xie D, Ni C, Zhao G, Shao J, Chen F, et al. Spatiotemporal characteristics of air pollution in Chengdu-Chongqing urban agglomeration (CCUA) in Southwest, China: 2015-2021. J Environ Manage. 2023;325(Pt A):116503. pmid:36274306
39. Xiao Y, Yin H, Zhang Y, Qi H, Zhang Y, Liu Z. A dual‐stage attention‐based Conv‐LSTM network for spatio‐temporal correlation and multivariate time series prediction. Int J Intell Syst. 2021;36(5):2036–57.
40. China National Environment Monitoring Centre. Technical guidelines for evaluating the effectiveness of environmental air quality forecasting. Beijing: China Environment Publishing Group; 2018.
41. Ministry of Ecology and Environment of the People’s Republic of China. Technical regulation on ambient air quality index. Beijing: China Environment Publishing Group; 2016.
42. Ming L, Jin L, Li J, Fu P, Yang W, Liu D, et al. PM_2.5 in the Yangtze River Delta, China: chemical compositions, seasonal variations, and regional pollution events. Environ Pollut. 2017;223:200–12. pmid:28131471
43. Gao M, Carmichael GR, Wang Y, Saide PE, Yu M, Xin J, et al. Modeling study of the 2010 regional haze event in the North China Plain. Atmos Chem Phys. 2016;16(3):1673–91.
44. Ding Y, Wu P, Liu Y, Song Y. Environmental and dynamic conditions for the occurrence of persistent Haze events in North China. Engineering. 2017;3(2):266–71.
45. Ning G, Wang S, Yim SHL, Li J, Hu Y, Shang Z, et al. Impact of low-pressure systems on winter heavy air pollution in the northwest Sichuan Basin, China. Atmos Chem Phys. 2018;18(18):13601–15.
46. Zhong J, Zhang X, Dong Y, Wang Y, Liu C, Wang J, et al. Feedback effects of boundary-layer meteorological factors on cumulative explosive growth of PM_2.5 during winter heavy pollution episodes in Beijing from 2013 to 2016. Atmos Chem Phys. 2018;18(1):247–58.
47. Yu C, Zhao T, Bai Y, Zhang L, Kong S, Yu X, et al. Heavy air pollution with a unique “non-stagnant” atmospheric boundary layer in the Yangtze River middle basin aggravated by regional transport of PM_2.5 over China. Atmos Chem Phys. 2020;20(12):7217–30.
48. Hu W, Zhao T, Bai Y, Kong S, Xiong J, Sun X, et al. Importance of regional PM_2.5 transport and precipitation washout in heavy air pollution in the Twain-Hu Basin over Central China: Observational analysis and WRF-Chem simulation. Sci Total Environ. 2021;758:143710. pmid:33223179
49. Lu M, Tang X, Wang Z, Wu L, Chen X, Liang S, et al. Investigating the transport mechanism of PM_2.5 pollution during January 2014 in Wuhan, Central China. Adv Atmos Sci. 2019;36(11):1217–34.
50. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. pmid:9377276
51. Jiang G, Wang W. Error estimation based on variance analysis of k-fold cross-validation. Pattern Recognit. 2017;69:94–106.
52. Zhou Y, Luo B, Li J, Hao Y, Yang W, Shi F, et al. Characteristics of six criteria air pollutants before, during, and after a severe air pollution episode caused by biomass burning in the southern Sichuan Basin, China. Atmos Environ. 2019;215:116840.
53. Guo Q, Wu D, Yu C, Wang T, Ji M, Wang X. Impacts of meteorological parameters on the occurrence of air pollution episodes in the Sichuan basin. J Environ Sci (China). 2022;114:308–21. pmid:35459494

[ref1] 1. Sheehan P, Cheng E, English A, Sun F. China’s response to the air pollution shock. Nature Clim Change. 2014;4(5):306–9.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Zheng B, Tong D, Li M, Liu F, Hong C, Geng G, et al. Trends in China’s anthropogenic emissions since 2010 as the consequence of clean air actions. Atmos Chem Phys. 2018;18(19):14095–111.
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Wang J, Zhao B, Wang S, Yang F, Xing J, Morawska L, et al. Particulate matter pollution over China and the effects of control policies. Sci Total Environ. 2017;584–585:426–47. pmid:28126285
View Article
PubMed/NCBI
Google Scholar

[8] View Article

[9] PubMed/NCBI

[10] Google Scholar

[ref4] 4. Ding A, Huang X, Nie W, Chi X, Xu Z, Zheng L, et al. Significant reduction of PM_2.5 in eastern China due to regional-scale emission control: evidence from SORPES in 2011–2018. Atmos Chem Phys. 2019;19(18):11791–801.
View Article
Google Scholar

[12] View Article

[13] Google Scholar

[ref5] 5. Zhang Q, Li X, Li XM, Zhang RH, Ren B, Che HX. Lung cell injury risks of PM_2.5 exposure in the high humidity and low solar radiation environment of southwestern China. Atmos Environ. 2024;338:120794.
View Article
Google Scholar

[15] View Article

[16] Google Scholar

[ref6] 6. Feng XY, Zhang ZZ, Guo JP, Wang SG. Multilayer inversion formation and evolution during persistent heavy air pollution events in the Sichuan Basin, China. Atmos Res. 2023;286:106691.
View Article
Google Scholar

[18] View Article

[19] Google Scholar

[ref7] 7. Li T, Zhang Q, Peng Y, Guan X, Li L, Mu J, et al. Contributions of various driving factors to air pollution events: interpretability analysis from Machine learning perspective. Environ Int. 2023;173:107861. pmid:36898175
View Article
PubMed/NCBI
Google Scholar

[21] View Article

[22] PubMed/NCBI

[23] Google Scholar

[ref8] 8. Sulaymon ID, Zhang Y, Hopke PK, Guo S, Ye F, Sun J, et al. Using the COVID-19 lockdown to identify atmospheric processes and meteorology influences on regional PM_2.5 pollution episodes in the Beijing-Tianjin-Hebei, China. Atmos Res. 2023;294:106940.
View Article
Google Scholar

[25] View Article

[26] Google Scholar

[ref9] 9. Huang X, Ding A, Wang Z, Ding K, Gao J, Chai F, et al. Amplified transboundary transport of haze by aerosol–boundary layer interaction in China. Nat Geosci. 2020;13(6):428–34.
View Article
Google Scholar

[28] View Article

[29] Google Scholar

[ref10] 10. Sulaymon ID, Zhang Y, Hopke PK, Hu J, Zhang Y, Li L, et al. Persistent high PM_2.5 pollution driven by unfavorable meteorological conditions during the COVID-19 lockdown period in the Beijing-Tianjin-Hebei region, China. Environ Res. 2021;198:111186. pmid:33930403
View Article
PubMed/NCBI
Google Scholar

[31] View Article

[32] PubMed/NCBI

[33] Google Scholar

[ref11] 11. Xu GY, Ren XD, Xiong KN, Li LQ, Bi XC, Wu QL. Analysis of the driving factors of PM_2.5 concentration in the air: a case study of the Yangtze River Delta, China. Ecol Indic. 2020;110:105889.
View Article
Google Scholar

[35] View Article

[36] Google Scholar

[ref12] 12. Li K, Jacob DJ, Liao H, Shen L, Zhang Q, Bates KH. Anthropogenic drivers of 2013-2017 trends in summer surface ozone in China. Proc Natl Acad Sci U S A. 2019;116(2):422–7. pmid:30598435
View Article
PubMed/NCBI
Google Scholar

[38] View Article

[39] PubMed/NCBI

[40] Google Scholar

[ref13] 13. Dang R, Liao H, Fu Y. Quantifying the anthropogenic and meteorological influences on summertime surface ozone in China over 2012-2017. Sci Total Environ. 2021;754:142394. pmid:33254879
View Article
PubMed/NCBI
Google Scholar

[42] View Article

[43] PubMed/NCBI

[44] Google Scholar

[ref14] 14. Wu B, Liu C, Zhang J, Du J, Shi K. The multifractal evaluation of PM_2.5-O₃ coordinated control capability in China. Ecol Indic. 2021;129:107877.
View Article
Google Scholar

[46] View Article

[47] Google Scholar

[ref15] 15. Liu C, Liang J, Li Y, Shi K. Fractal analysis of impact of PM_2.5 on surface O₃ sensitivity regime based on field observations. Sci Total Environ. 2023;858(Pt 3):160136. pmid:36375545
View Article
PubMed/NCBI
Google Scholar

[49] View Article

[50] PubMed/NCBI

[51] Google Scholar

[ref16] 16. Zhang J, Li Y, Liu C, Wu B, Shi K. A study of cross-correlations between PM_2.5 and O₃ based on Copula and Multifractal methods. Physica A. 2022;589:126651.
View Article
Google Scholar

[53] View Article

[54] Google Scholar

[ref17] 17. Liu S, Liu X, Lyu Q, Li F. Comprehensive system based on a DNN and LSTM for predicting sinter composition. Appl Soft Comput. 2020;95:106574.
View Article
Google Scholar

[56] View Article

[57] Google Scholar

[ref18] 18. Tehrani AA, Veisi O, kia K, Delavar Y, Bahrami S, Sobhaninia S, et al. Predicting urban Heat Island in European cities: a comparative study of GRU, DNN, and ANN models using urban morphological variables. Urban Climate. 2024;56:102061.
View Article
Google Scholar

[59] View Article

[60] Google Scholar

[ref19] 19. He Z, Guo Q, Wang Z, Li X. Prediction of monthly PM_2.5 concentration in Liaocheng in China employing artificial neural network. Atmosphere. 2022;13(8):1221.
View Article
Google Scholar

[62] View Article

[63] Google Scholar

[ref20] 20. Guo Q, He Z, Wang Z. Predicting of daily PM_2.5 concentration employing wavelet artificial neural networks based on meteorological elements in Shanghai, China. Toxics. 2023;11(1):51. pmid:36668777
View Article
PubMed/NCBI
Google Scholar

[65] View Article

[66] PubMed/NCBI

[67] Google Scholar

[ref21] 21. Guo Q, He Z, Wang Z. Prediction of hourly PM_2.5 and PM₁₀ concentrations in Chongqing City in China based on artificial neural network. Aerosol Air Qual Res. 2023;23(6):220448.
View Article
Google Scholar

[69] View Article

[70] Google Scholar

[ref22] 22. Guo Q, He Z, Wang Z. Prediction of monthly average and extreme atmospheric temperatures in Zhengzhou based on artificial neural network and deep learning models. Front For Glob Change. 2023;6.
View Article
Google Scholar

[72] View Article

[73] Google Scholar

[ref23] 23. Guo Q, He Z, Wang Z. Monthly climate prediction using deep convolutional neural network and long short-term memory. Sci Rep. 2024;14(1).
View Article
Google Scholar

[75] View Article

[76] Google Scholar

[ref24] 24. Guo Q, He Z, Wang Z, Qiao S, Zhu J, Chen J. A performance comparison study on climate prediction in weifang city using different deep learning models. Water. 2024;16(19):2870.
View Article
Google Scholar

[78] View Article

[79] Google Scholar

[ref25] 25. He Z, Guo Q. Comparative analysis of multiple deep learning models for forecasting monthly ambient PM_2.5 concentrations: a case study in Dezhou City, China. Atmosphere. 2024;15(12):1432.
View Article
Google Scholar

[81] View Article

[82] Google Scholar

[ref26] 26. Guo Q, He Z, Wang Z. Assessing the effectiveness of long short-term memory and artificial neural network in predicting daily ozone concentrations in Liaocheng City. Sci Rep. 2025;15(1):6798. pmid:40000767
View Article
PubMed/NCBI
Google Scholar

[84] View Article

[85] PubMed/NCBI

[86] Google Scholar

[ref27] 27. He Z, Guo Q, Wang Z, Li X. A hybrid wavelet-based deep learning model for accurate prediction of daily surface PM_2.5 concentrations in Guangzhou City. Toxics. 2025;13(4):254. pmid:40278570
View Article
PubMed/NCBI
Google Scholar

[88] View Article

[89] PubMed/NCBI

[90] Google Scholar

[ref28] 28. Yang H, Wang W, Li G. Multi-factor PM_2.5 concentration optimization prediction model based on decomposition and integration. Urban Clim. 2024;55:101916.
View Article
Google Scholar

[92] View Article

[93] Google Scholar

[ref29] 29. Zhang J, Li S. Air quality index forecast in Beijing based on CNN-LSTM multi-model. Chemosphere. 2022;308(Pt 1):136180. pmid:36058367
View Article
PubMed/NCBI
Google Scholar

[95] View Article

[96] PubMed/NCBI

[97] Google Scholar

[ref30] 30. Zhao J, Deng F, Cai Y, Chen J. Long short-term memory - Fully connected (LSTM-FC) neural network for PM_2.5 concentration prediction. Chemosphere. 2019;220:486–92. pmid:30594800
View Article
PubMed/NCBI
Google Scholar

[99] View Article

[100] PubMed/NCBI

[101] Google Scholar

[ref31] 31. Wang W, Tang Q. Combined model of air quality index forecasting based on the combination of complementary empirical mode decomposition and sequence reconstruction. Environ Pollut. 2023;316(Pt 2):120628. pmid:36370980
View Article
PubMed/NCBI
Google Scholar

[103] View Article

[104] PubMed/NCBI

[105] Google Scholar

[ref32] 32. Jamei M, Ali M, Jun C, Bateni SM, Karbasi M, Farooque AA, et al. Multi-step ahead hourly forecasting of air quality indices in Australia: application of an optimal time-varying decomposition-based ensemble deep learning algorithm. Atmos Pollut Res. 2023;14(6):101752.

[ref33] 33. Guo QC, He ZF, Li SS, Li XZ, Meng JJ, Hou ZF. Air pollution forecasting using artificial and wavelet neural networks with meteorological conditions. Aerosol Air Qual Res. 2020;20(6):1429–39.

[ref34] 34. Guo Q, Wang Z, He Z, Li X, Meng J, Hou Z, et al. Changes in air quality from the COVID to the post-COVID Era in the Beijing-Tianjin-Tangshan Region in China. Aerosol Air Qual Res. 2021;21(12):210270.

[ref35] 35. Guo Q, He Z, Wang Z. Change in air quality during 2014-2021 in Jinan City in China and its influencing factors. Toxics. 2023;11(3):210. pmid:36976975

[ref36] 36. Guo Q, He Z, Wang Z. The characteristics of air quality changes in Hohhot City in China and their relationship with meteorological and socio-economic factors. Aerosol Air Qual Res. 2024;24(5):230274.

[ref37] 37. Yin D, Song Q, Guo Y, Jiang Y, Dong Z, Zhao B, et al. Regional transport characteristics of PM_2.5 pollution events in Beijing during 2018-2021. J Environ Sci (China). 2025;152:503–15. pmid:39617571

[ref38] 38. Tan S, Xie D, Ni C, Zhao G, Shao J, Chen F, et al. Spatiotemporal characteristics of air pollution in Chengdu-Chongqing urban agglomeration (CCUA) in Southwest, China: 2015-2021. J Environ Manage. 2023;325(Pt A):116503. pmid:36274306

[ref39] 39. Xiao Y, Yin H, Zhang Y, Qi H, Zhang Y, Liu Z. A dual‐stage attention‐based Conv‐LSTM network for spatio‐temporal correlation and multivariate time series prediction. Int J Intell Syst. 2021;36(5):2036–57.

[ref40] 40. China National Environment Monitoring Centre. Technical guidelines for evaluating the effectiveness of environmental air quality forecasting. Beijing: China Environment Publishing Group; 2018.

[ref41] 41. Ministry of Ecology and Environment of the People’s Republic of China. Technical regulation on ambient air quality index. Beijing: China Environment Publishing Group; 2016.

[ref42] 42. Ming L, Jin L, Li J, Fu P, Yang W, Liu D, et al. PM_2.5 in the Yangtze River Delta, China: chemical compositions, seasonal variations, and regional pollution events. Environ Pollut. 2017;223:200–12. pmid:28131471

[ref43] 43. Gao M, Carmichael GR, Wang Y, Saide PE, Yu M, Xin J, et al. Modeling study of the 2010 regional haze event in the North China Plain. Atmos Chem Phys. 2016;16(3):1673–91.

[ref44] 44. Ding Y, Wu P, Liu Y, Song Y. Environmental and dynamic conditions for the occurrence of persistent Haze events in North China. Engineering. 2017;3(2):266–71.

[ref45] 45. Ning G, Wang S, Yim SHL, Li J, Hu Y, Shang Z, et al. Impact of low-pressure systems on winter heavy air pollution in the northwest Sichuan Basin, China. Atmos Chem Phys. 2018;18(18):13601–15.

[ref46] 46. Zhong J, Zhang X, Dong Y, Wang Y, Liu C, Wang J, et al. Feedback effects of boundary-layer meteorological factors on cumulative explosive growth of PM_2.5 during winter heavy pollution episodes in Beijing from 2013 to 2016. Atmos Chem Phys. 2018;18(1):247–58.

[ref47] 47. Yu C, Zhao T, Bai Y, Zhang L, Kong S, Yu X, et al. Heavy air pollution with a unique “non-stagnant” atmospheric boundary layer in the Yangtze River middle basin aggravated by regional transport of PM_2.5 over China. Atmos Chem Phys. 2020;20(12):7217–30.

[ref48] 48. Hu W, Zhao T, Bai Y, Kong S, Xiong J, Sun X, et al. Importance of regional PM_2.5 transport and precipitation washout in heavy air pollution in the Twain-Hu Basin over Central China: Observational analysis and WRF-Chem simulation. Sci Total Environ. 2021;758:143710. pmid:33223179

[ref49] 49. Lu M, Tang X, Wang Z, Wu L, Chen X, Liang S, et al. Investigating the transport mechanism of PM_2.5 pollution during January 2014 in Wuhan, Central China. Adv Atmos Sci. 2019;36(11):1217–34.

[ref50] 50. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. pmid:9377276

[ref51] 51. Jiang G, Wang W. Error estimation based on variance analysis of k-fold cross-validation. Pattern Recognit. 2017;69:94–106.

[ref52] 52. Zhou Y, Luo B, Li J, Hao Y, Yang W, Shi F, et al. Characteristics of six criteria air pollutants before, during, and after a severe air pollution episode caused by biomass burning in the southern Sichuan Basin, China. Atmos Environ. 2019;215:116840.

[ref53] 53. Guo Q, Wu D, Yu C, Wang T, Ji M, Wang X. Impacts of meteorological parameters on the occurrence of air pollution episodes in the Sichuan basin. J Environ Sci (China). 2022;114:308–21. pmid:35459494
