Spatial–seasonal characteristics and critical impact factors of PM2.5 concentration in the Beijing–Tianjin–Hebei urban agglomeration

As China’s political and economic centre, the Beijing–Tianjin–Hebei (BTH) urban agglomeration experiences serious environmental challenges on particulate matter (PM) concentration, which results in fundamental or irreparable damages in various socioeconomic aspects. This study investigates the seasonal and spatial distribution characteristics of PM2.5 concentration in the BTH urban agglomeration and their critical impact factors. Spatial interpolation are used to analyse the real-time monitoring of PM2.5 data in BTH from December 2013 to May 2017, and partial least squares regression is applied to investigate the latest data of potential polluting variables in 2015. Several important findings are obtained: (1) Notable differences exist amongst PM2.5 concentrations in different seasons; January (133.10 mg/m3) and December (120.19 mg/m3) are the most polluted months, whereas July (38.76 mg/m3) and August (41.31 mg/m3) are the least polluted months. PM2.5 concentration shows a periodic U-shaped variation pattern with high pollution levels in autumn and winter and low levels in spring and summer. (2) In terms of spatial distribution characteristics, the most highly polluted areas are located south and east of the BTH urban agglomeration, and PM2.5 concentration is significantly low in the north. (3) Empirical results demonstrate that the deterioration of PM2.5 concentration in 2015 is closely related to a set of critical impact factors, including population density, urbanisation rate, road freight volume, secondary industry gross domestic product, overall energy consumption and industrial pollutants, such as steel production and volume of sulphur dioxide emission, which are ranked in terms of their contributing powers. The findings provide a basis for the causes and conditions of PM2.5 pollution in the BTH regions. Viable policy recommendations are provided for effective air pollution treatment.

Introduction 41]. Zhang et al. (2013) [22] indicated that motor vehicle ownership is the key contributing factor to Beijing's air pollution, accounting for 63% of the carbonaceous components on PM 2.5 , whereas coal combustion is the main air pollutant source in Tangshan, accounting for 30% of PM 2.5 compositions. Ma and Zhang (2014) [16] investigated the PM 2.5 distribution characteristics in China from 2001 and 2010 based on satellite data developed by Battelle Memorial Institute. Their study confirmed that the spatial aggregation effects of PM 2.5 are apparent in China. Specifically, highly polluted areas are mainly concentrated in the BTJ region, Yangtze River Delta (YRD) and the linking zones.
Several studies have focused on the PM 2.5 characteristics in the BTH region. Using data from 1497 station-based monitoring sites, Shen and Yao (2017) [42] investigated the effects of demographic and economic factors on PM 2.5 concentration in four urban agglomerations, namely, BTH, YRD, Pearl River Delta (PRD) and Chengdu-Chongqing (CC). The estimated results indicated that a high correlation exists amongst population density, economic affluence and PM 2.5 concentration. Geography is another important determining factor for PM 2.5 concentration because high altitudes are usually associated with high PM2.5 concentrations. Zhou et al. (2017) [43] investigated the impact of economic and ecological factors on PM 2.5 concentrations by using a two-stage distribution lag model. Their estimated results indicated that the emission of atmospheric pollutants causes hysteresis effects on PM 2.5 concentrations. Specifically, coal consumption, industrial exhaust, value-added from heavy pollution industry and the ownership of 'yellow label car', which are heavy-polluting vehicles, are the key sources of PM 2.5 pollutant emissions. On the basis of panel data in the last 10 years, Li and Yin (2017) [44] utilised a panel threshold model to investigate the nonlinear changing patterns between socioeconomic development and PM 2.5 concentration. The study confirmed that the development of the manufacturing and construction sectors and the growth of automobile volumes aggravate PM 2.5 pollution when the value-added of the tertiary industry is below the threshold value of 6,080 billion. Moreover, the development of the second and third industries was noted to be an effective roadmap to alleviate PM 2.5 pollution when the value-added of the tertiary industry exceeds 6,080 billion. Su and Zhong (2015) [45] analysed the natural and man-made contributing factors of PM 2.5 in nine key economic circles in China, such as BTH, YRD, PRD and CC, by using a factor analysis method. They concluded that the effects of human activities are more significant than those of natural factors, and industrial activities are the important contributing factors of PM 2.5 .
These findings also stimulated another key question regarding the seasonal and spatial characteristics and critical impact factor of PM 2.5 concentrations. However, considering the insufficient long-term and large-scale PM 2.5 concentration data [46], especially the lack of real-time monitoring data [47], few studies have quantitatively estimated these factors in BTH regions [5,48]. Although the BTH urban agglomeration is recognised as one of the most severely polluted regions in China [24,49], analyses on the seasonal and spatial characteristics of PM2.5 spanning a long period are limited [50][51][52].
The present study aims to estimate the PM 2.5 concentration characteristics in the BTH urban agglomeration. Specifically, the key objectives of this study are as follows: 1) to investigate the relationships between seasons and PM 2.5 concentrations (i.e. seasonal variation characteristics), 2) to evaluate the spatial distribution of PM 2.5 concentrations in different cities and 3) to investigate the critical impact factors of PM 2.5 concentrations. Real-time monitoring data spanning December 2013 to May 2017 are collected from 80 atmospheric physics observation points by the China Meteorological Administration. Several studies have investigated Beijing's PM 2.5 concentration [8,27,53]. However, the data sources of these studies were mainly based on satellite images [47], and few studies collected PM 2.5 real-time monitoring data to significantly improve the estimation accuracy. The current study derives a reliable and accurate estimation on the spatial and seasonal characteristics of PM 2.5 concentration.
This study provides novel contributions to existing literature as follows: 1) A unique PM 2.5 data source with a long-time span from December 2013 to April 2017 is used. Real-time monitoring data are consistent with the reliable estimation of satellite data. 2) An array of novel and promising techniques, namely, remote sensing techniques, spatial interpolation and partial least squares (PLS) regression are employed, all of which provide an insightful analysis on PM 2.5 characteristics in the BTH region. Specifically, spatial interpolation is a powerful approach to replace the traditional inversion method in reflecting PM 2.5 concentrations at near-ground level [54]. PLS regression is an effective approach to cope with the multicollinearity issue when a large number of independent variables are included in the estimation. 3) Various variables are considered in identifying the critical impact factors of PM 2.5 . Considering that peripheral or distant sources commonly affect the air pollution of a city, this study provides a complete analysis on PM 2.5 characteristics of a city and a region.

Study area
The study area used is the BTH urban agglomeration, which is also called the Greater Beijing region (Fig 1). This area is located in Northeast China, at longitude 113˚04' to 119˚53' east and 36˚01' to 42˚37 ' north. It measures 218,000 km 2 and had more than 100 million residents as of 2016 (National Bureau of Statistics of China, 2016). The BTH urban agglomeration is the largest urban agglomeration and the most developed economic centre in northern China. Beijing is the political capital, cultural and information centre of China and is one of the largest megacities worldwide, with more than 21 million people and 5.7 million vehicles in 2016 [55]. Given the importance of the Greater Beijing region, severe air pollution has been the leading environmental challenge, with frequent occurrences of fog and haze. Related statistical consensus indicate that the total annual mean values of Beijing's PM 10 and PM 2.5 concentrations from 2012 to 2015were 138.5 ± 92.9 μg/m 3 and 2.3 ± 54.4 μg/m 3 , respectively [37].

Data sources
In terms of urban distribution and prefectural boundary, prefectural boundary layers at a scale of 1:4,000,000 are obtained from the National Geomatics Centre of China.  Table).
This monitoring dataset is obtained from the atmospheric physics sites of the 12 cities by the Ministry of Environmental Protection of China. The remote sensing data from the Atmospheric Composition Analysis Group (2016) [56] are initially used in BTH region. Compared with the remote sensing data on PM 2.5 (Atmospheric Composition Analysis Group, 2016) [56], spatial interpolation has higher accuracy than remote sensing data in reflecting PM 2.5 concentrations at near-ground level in this study.
These data are measured by 80 monitoring stations distributed throughout the BTH region (Fig 3). Each monitoring station automatically measures monthly PM 2.5 concentrations. Cangzhou, which is a small city, has three air quality monitoring stations. Other cities have more than five air quality monitoring stations that are distributed from the suburbs to downtown. The annual average of PM 2.5 concentration for each monitoring station is calculated  Table 1 describes the geographic information on several atmospheric physics observation points. The geographic information of all observation points is shown in S1 Table. In terms of data standardisation, the collected PM 2.5 values at 2:00, 8:00, 14:00 and 20:00 from different observation points are averaged to derive the daily  and monthly PM 2.5 concentrations of a city. The PM 2.5 concentration of 12 sites is the average of 80 sites. Table 2 describes an array of PM 2.5 -related variables summarised from an extensive literature review. These variables are tested to identify the critical impact factors for PM 2.5 concentration. Existing studies on identifying the key contributing factors of PM 2.5 concentration in China have mainly focused on demographic and economic aspects and other chemical air pollutants. Yang and Chen (2017) [57] used independent variables, namely, coal consumption, cement production, automobile volume, population and gross domestic product (GDP). Lu et al. (2017) [58] incorporated the following variables in their estimation, namely, population density, annual volume of bus passengers, road freight, proportion of secondary industry to overall GDP, volume of SO 2 emissions and volume of industrial soot emission. Ma and Xiao (2017) [59] considered urbanisation, energy consumption structure [60] and construction areas in their investigation. On the basis of an extensive literature review, 12 potential contributing factors for PM 2.5 concentrations are identified ( Table 2).
The possible critical impact factors of PM 2.5 concentration are selected (Tables 2 and 3), discussed and included in the estimation model. Since independent variable data in 2016 and 2017 has yet been published by the statistic consensus, this study used the latest data of 2015 in the PLS model to analyse the critical factors for PM2.5 concentration (For details, see S3 Table). Population density. A comparison of PM 2.5 concentrations with population reveals interesting findings that should be considered. Shijiazhuang, which is the most polluted city, features a relatively high population density of 788.14 persons/km 2 . Handan, the second most polluted city, has the highest population density of 870.29 persons/km 2 . Zhangjiakou, which has the best air quality, presents a low population density of 127.46 persons/km 2 . Chengde, with a similar pollution level as Shijiazhuang, also manifests a low population density of 96.73 persons/km 2 (Fig 4). A common pattern exists in which population density forms a certain positive relationship with PM 2.5 concentration. However, this pattern is affected by various critical impact factors that lead to certain variations. Therefore, a detailed investigation on critical PM 2.5 factors should be conducted for a thorough analysis of such particles.
Industrial and energy aspects. The estimation results reveal that PM 2.5 concentration in the BTH urban agglomeration exhibits a distinctive spatial distribution characteristic. Related literature shows that coal combustion accounts for 20%-30% of PM 2.5 pollution in Chinese cities [20,22,27,28]. In winter, PM 2.5 concentration is usually high because coal is used as the main energy for winter heating. In summer, the situation significantly differs in the BTH urban agglomeration. For example, motor vehicles account for 63% of the carbonaceous components of PM 2.5 in Beijing, while coal combustion accounts for 30.3% of PM 2.5 compositions because it is used as the major energy source for industrial production in the city [22]. Therefore, the present study uses industrial dust and industrial SO 2 emissions as parameters to investigate the air pollution contributions of heating and industrial development. Tangshan shows the highest volumes of industrial SO 2 emission, which amounted to 214,723 t in 2016, followed by Shijiazhuang and Handan with 113,652 and 110,193 t, respectively. Xingtai and Handan represent the top two contributors of industrial dust emissions, accounting for 191,713 and 100,738 t, respectively. Transportation. Several studies argue that transportation exerts a significant adverse influence on air pollution [29]. On the basis of available data from statistical consensus, 'passenger and freight volume of highway traffic' are used as a parameter for measuring PM 2.5 pollution from transportation. Data suggest that polluted cities are generally associated with high road freight volume. For example, Shijiazhuang, Tangshan and Handan, which are the most polluted cities, are associated with relatively high freight volumes with 3,695,410,000, 363,580,000 and 387,040,000 t, respectively.

Research methods
The research framework and main research steps are illustrated in Fig 6. Spatial interpolation. PM 2.5 concentration is a scalar description of atmospheric state significantly affected by local human activities. Although remote sensing has been improved by techniques such as regional correlations in recent years, several studies indicate that spatial interpolation is a powerful approach to replace the inversion method, leading to higher accuracy than remote sensing data in reflecting PM 2.5 concentrations at near-ground level [40,54,[64][65][66][67]. To address this limitation, spatial interpolation is employed and the results of the  Characteristics and critical factors of PM 2.5 concentration in Beijing-Tianjin-Hebei urban agglomeration inversion method are considered as references. Interpolation methods used in regional-scale factors include inverse distance interpolation (IDW) and Kriging interpolation method (OKM). OKM is a more widely recognised method for dealing with interpolation points than IDW [22].
This study uses OKM to simulate seasonal variations of PM 2.5 in the 12 cities of the BTH urban agglomeration. The supporting concept of OKM is that the interpolation results at the target point are the weighted sum of known attribute values of the samples [68]. In the study area, x represents the spatial location of point x. z(x i ) (i = 1, 2, Á Á Á, n) represents the property value of sampling point x i (i = 1, 2, Á Á Á, n), and annual mean PM 2.5 concentration is the property value of point x i . Then, the interpolation result at target point x 0 is z(x 0 ): Where λ i (i = 1, 2, Á Á Á, n) depends on undetermined coefficients. Assuming that the entire study area satisfies the second-order stationary assumption, that is, 'the mathematical expectation of z(x) exists and is equal to the constant, that is, E[z(x)] = m', the covariance function of variables z(x) exists and only depends on lag value (h), that is, On the basis of unbiased expectation refers to the spatial variation of PM 2.5 concentration in BTH by OKM in point x i , E[z Ã (x 0 )] denotes the spatial variation of PM 2.5 concentration in BTH by OKM in point x 0 , and z(x 0 ) is the PM 2.5 concentration in point x 0 . We can conclude that P n i¼1 l i ¼ 1. For regionalised variables that satisfy the second-order stationary conditions, the estimated variance can be calculated using the following formula: To obtain the minimum variance estimation under unbiased conditions, that is, MinfVar ½z Ã ðx 0 Þ À zðx 0 Þ À 2m P n i¼1 ðl i À 1Þg. The weight coefficients should satisfy the following equations: Then, we can calculate the value of λ i (i = 1, 2, Á Á Á, n) and obtain the attribute value z Ã (x 0 ) at sample point x 0 .
1. The degree of correlation between t 1 and u 1 should be the maximum.
The two conditions can be summarised as follows: After the first principal components t 1 and u 1 are extracted from X and Y, PLS performs linear regressions of X and Y on t 1 . In the PLS estimation, components t 1 and u 1 have typical component characteristics. A significant linear relationship between t 1 and u 1 indicates that X has a notable correlation with Y, and PLS is appropriate for estimating the contribution of X to Y. The algorithm is terminated when the regression equations reach satisfactory levels. Otherwise, the residuals of X and Y after regression on t 1 are used to extract the next principal component. The algorithm iterates until the results reach satisfactory levels.
Cross-validation (Q h 2 ) is used as the measurement criterion to determine whether the regression results reach the satisfactory level. For the number of extracted principal components h, rounding observation i (i = 1,2, Á Á Á, n) for each time, (i = 1,2, Á Á Á, n), the PLS model is built with the remaining (n−1) observations. Then, observation i is substituted in the fitted PLS regression equation to obtain the predicted value of y j (j = 1,2, Á Á Á, q) at observation i, and the predicted value is recorded as c y ðiÞj ðhÞ. The above calculation is repeated for each i (i = 1,2, Á Á Á, n). The sum of the squared errors (SSE) for dependent y j is obtained when h principal components are extracted and PRESS j ðhÞ ¼ After m principal components t 1 , t 2 , Á Á Á, t m are finally extracted from X, PLS first performs a regression of y k on t 1 , t 2 , Á Á Á, t m and converts it in the regression equation of y k on x 1 , The specific procedures of the PLS algorithm are summarised as follows: Step 1. To simplify the calculation and eliminate the effects of different units of variables, this study first standardises the original data matrices (X and Y), which are denoted by E 0 and F 0 .
Step 2. Let t 1 be the first principal component extracted from E 0 . The regression of E 0 and F 0 on t 1 is performed as follows: Where p 1 and r 1 refer to regression coefficient vectors, and E 1 and F 1 represent the corresponding residual matrices. The accuracy of the regression equation is calculated. The algorithm is terminated when the regression equations reach satisfactory levels. Otherwise, let E 0 = E 1 and F 0 = F 1 , and iterate the component extraction and regression analysis. Cross validation (Q h 2 ) is used to evaluate the model until the expected accuracy is obtained. Step4. The regression equation of E 0 and F 0 on t 1 , t 2 , Á Á Á, t m is derived if the model extracts m principal components. The following regression equation is developed through inverse transformation.
In the calculation of PLS, the principle component t h should both represent the variation information of X (x j (j = 1, 2, . . ., p)) and explain the information of Y (y k (k = 1, 2, Á Á Á, q)) as much as possible. To measure the explanatory power of t h for interpreting X and Y, we define various explanatory powers of t h as follows: 1. The explanatory power of t h to interpret X: Rd X; t h ð Þ ¼ 1 2. The cumulative explanatory power of t 1 , t 2 , Á Á Á, t m to interpret X: 4. The cumulative explanatory power of t 1 , t 2 , Á Á Á, t m to interpret Y: RdðY; t 1 ; t m Þ ¼ P m h¼1 RdðY; t h Þ. A significant advantage of PLS regression is the reliable choice of variables. When independent variable x j is used to explain the set of dependent variables Y, the variable importance in projection VIP j can be used to measure the importance of x j in interpreting Y [69]. hj ¼ p indicates that if the VIP j of all independent variables x j (j = 1,2, . . .,p) equals 1, then they all play the same role in interpreting Y. Otherwise, x j exerts a significant effect on interpreting Y when VIPj> 1.

Seasonal variation characteristics of PM 2.5 concentration
The average PM 2.5 concentration in the BTH urban agglomeration shows a notable periodical U-shaped variation from December 2013 to May 2017. The annual average PM 2.5 concentration in all cities is 77.79 mg/m 3 . The monthly variation of PM 2.5 concentrations in the BTH region (Fig 7) shows that the PM 2.5 concentration is below 35 mg/m 3 for only 8.3% of the time (Interim target-1 of WHO, 2005). Specifically, PM 2.5 concentration is high in autumn and winter and low in spring and summer. This finding is consistent with other findings on China's PM 2.5 [51,66,70]. PM 2.5 concentration from January to May shows a downward trend and from June to September maintains a stable level at 35-65 μg/m 3 , which is slightly lower than those from January to May. PM 2.5 concentration from October to December exhibits an upward trend, with an increase from 58.95 μg/m 3  Related studies have recorded that coal consumption for heating in autumn and winter is the main reason for the difference in values [66]. Coal combustion remains the main way of heating for residents in winter. Thus, government in the BTH regions are proactively promoting coal substation schemes, such as urban gasification projects. Coal is 'dirty' energy, and emissions of particle pollutants are enormous in winter. Meteorological factors also contribute to high PM 2.5 in winter [39]. Specifically, rainfall in winter is scarce relative to the other seasons, the flushing effect of rainfall to air is little, inhalable particles are easily suspended in air and low temperature in winter is not conducive to the diffusion of PM 2.5 particles.
This study employed OKM to analyse the seasonal variation of PM 2. Spatial variation characteristics of pm 2.5 concentration PM 2.5 concentration in the BTH urban agglomeration shows a significant spatial variation (Fig  9), which is also recorded in some studies [51,52,70]. From 2014 to 2016, PM 2.5 concentration is significantly high in the south and east of BTH urban agglomeration, particularly in Shijiazhuang. The mean value of PM 2.5 concentration in Shijiazhuang amounts to 104 μg/m 3 . PM 2.5  [66] also add evidence to the fact that air pollution in the BTH urban agglomeration generally showed a decreasing trend from January 2013 to December 2015.
As shown in Fig 10, the remote sensing data is used to analyse the relationship between PM 2.5 concentration and population density, landform, geography and elevation. Remote sensing data (Fig 10) shows that PM 2.5 concentration is low in mountainous and basin regions, such as Zhangjiakou and Chengde, which are situated more than 1000 feet above sea level. High sea level is an advantageous landform for blocking the invasion of PM 2.5 from peripheral regions [71]. Regions at high sea levels are also subject to northern and northwest winds, which benefit the dissemination of PM 2.5 pollutants.

Estimation results of PLS regression
Two key plots are considered when examining the performance of PLS regression. The first is t 1 /t 2 oval plot.
As the first two extracted linear combinations of x 1 , x 2 , . . ., x p , t 1 and t 2 represent the key information of X variables and exhibit remarkable explanatory power for Y variables [71,72]. The underpinning logic is straightforward: when all t 1 /t 2 points are covered in the oval, raw data are homogenous and appropriate for model calculations [72]. Fig 11 shows that all the 12 observations of sample points are covered in the oval, indicating that the PLS model is suitable for use in this study.
Another plot that should be considered is the t 1 /u 1 scatter plot. When t 1 /u 1 of the sample data shows a nearly linear relationship, PLS is appropriate for studying the issue [73]. As shown in Fig 12, the scattering of sample data generally shows a linear relationship. Thus, the PLS model is suitable for use in this study.  Table 4 illustrates the overview results of PLS regression. The PLS regression is estimated by using SIMCA-P (a data analytic software). Two principle components are extracted according to the value of cross validation. When two components are extracted, 'R 2 X(cum) = 0.510' indicates that the two components exhibit explanatory powers for 51.0% of the variance of independent variables. 'R 2 Y(cum) = 0.681' indicates that the two extracted components explain 68.1% of the information on dependent variables, indicating the acceptable explanation power of the PLS method.
VIP is a critical parameter for measuring the fitting performance of the PLS model [33,69]. It quantifies the statistical significance of independent variables (X) in explaining dependent variables. When the VIP value of a variable is higher than 0.8, then the variable is 'important' and exhibits significant explanatory powers on the independent variables [69]. Table 5 presents the significance of 12 independent variables in evaluating the PM 2.5 index amongst 12 cities of the BTH region. Table 5 shows that the VIP values of most variables reach more than 0.8, that is, most variables are significant in explaining the PM 2.5 concentration. The VIP values of population density, urbanisation rate and road freight volume are higher than 1.0. These results indicate that the variables are the three top contributors to PM 2.5 concentration in the 12 sample cities of the BTH urban agglomeration. The VIP values of secondary industry GDP, overall energy consumption, steel production and volume of sulphur dioxide emission are higher than 0.8. Therefore, these factors are also major drivers of PM 2.5 pollutant emission. The rest of the variables with VIP values in the spectrum of 0.5-0.8 cannot be assessed or are unimportant drivers of pollutant emission [72].
The VIP value of population density is 1.774, which is the highest explanatory power for PM 2.5 concentration. The VIP value of urbanisation rate in explaining PM 2.5 concentration is 1.476, which ranks second, as shown in Fig 13. Previous studies have demonstrated that PM 2.5 concentration is particularly high in large cities and urbanised regions [1,2,4,5]. BTH is one  Characteristics and critical factors of PM 2.5 concentration in Beijing-Tianjin-Hebei urban agglomeration of the most urbanised, populous and developed urban agglomerations in China, and human activities, such as transport, productions of secondary industry and energy consumption, are intensive in this region. Active human activities demand considerable resource and energy and create significant traffic daily. Numerous studies have reported that vehicular exhaust is the main source of PM 2.5 [61]. Coal and oil, as cost-effective energy, have long been used as main fuels for the secondary industry in developing countries [30]. In particular, in industrial zones of BTH, coal consumption is the main fuel for energy-and emission-intensive sectors, such as those of steel, cement and glass, which are key materials for China's remarkable infrastructure build-up in the past decades. Coal combustion is the main air pollution source in this region [59]. The VIP value of road freight volume totals 1.157 in explaining PM 2.5 concentration. Previous studies have demonstrated that road vehicular exhaust is highly related to PM 2.5 concentration [26,29,62,63]. According to literature, road freight volume, particularly those of heavy-duty trucks, is the main air pollution source in the transportation sector in some cities [64]. However, in the current study, the VIP value of road passenger traffic volume reaches only 0.504, which is much less than that of road freight volume. Therefore, this factor is not a primary air pollutant source.
The VIP value of the secondary industry GDP totals 0.953 in explaining PM 2.5 concentration. In the industrial zones of BTH, such as Tangshan, coal is the main energy in different  [61]. Secondary industry, heating for residences in winter and electricity from coal-fired stations rely heavily on coal combustion [20,22,27,28]. Thus, the VIP value of overall energy consumption in explaining PM 2.5 concentration is high at 0.890. The VIP value of steel production in BTH in explaining PM 2.5 concentration is 0.889, because metal elements correspond to other sources of PM 2.5 in China. The process of producing steel requires significant coal combustion and releases abundant metal elements and sulphur dioxide into air [61]. Studies have recorded that PM 2.5 concentrations exhibit a notable and positive relationship with associated air pollutants, such as SO 2 , NO 2 and O 3 , and suggested that those atmospheric pollutants can evolve from primary pollution to secondary pollution and form a vicious cycle [39,74]. As China's political and economic centre, the particulate matter (PM) pollution in the Beijing-Tianjin-Hebei (BTH) urban agglomeration attracts extensive attention of the scholars. Many recent studies investigate PM 2.5 concentration characteristics in the BTH region by using various methods and data sources. Similarities and new evidences of empirical findings of this study are compared with those of related studies.
First, some findings of this research are generally consistent with conclusions of existing studies. For example, this study reconfirmed that the annual PM 2.5 concentration experienced a slight downturn in recent years, which is also recorded by He et al (2018) [67]. For temporal characteristics of PM 2.5 concentration, the study captured a periodic U-shaped variation pattern in BTH urban agglomeration with high pollution levels in autumn and winter and low levels in spring and summer. Yan et al. (2018) [51] also recorded a pronounced characteristic of seasonal variation. They found that the concentrations increased from late autumn to early winter and that the PM 2.5 concentration decreased rapidly from late winter to early spring. For spatial aspects, empirical results of this study find that the south and east of the BTH urban agglomeration where are densely populated are suffered with highest PM 2.5 concentration and that PM 2.5 concentration is significantly low in the north. Similar spatial characteristics are also recorded in existing studies [52,67].
Second, this study employed data with a long period from December 2013 to May 2017, which ensures robust and complete understandings on PM 2.5 concentration characteristics in the region. Most studies on the BTH urban agglomeration used either one year cross-sectional data or daily time series data [50,51,70] or outdated data before 2015 [67]. For example, based on daily monitoring data from 1 January 2014 to 31 December 2014, Liu et al. (2018) [50] investigated dynamic interactions and relationships between PM 2.5 concentrations in different cities. Estimation results based on a short period of data undermined the understandings on the temporal characteristics of PM 2.5 . In addition, since 2015, Chinese government has committed great efforts and resources in PM 2.5 treatment and special focuses are dedicated in the BTH region. Findings with data before 2015 cannot reflect the recent characteristic of PM 2.5 concentrations and effectiveness of PM 2.5 treatment.
Third, this research combines the satellite sensing data and monitoring sites data. Some academics carried out their studies based on satellite sensing data [52,67,75], meanwhile others used the data obtained from surface monitoring stations [51,70]. In this study, the surface monitoring data of 12 cities in the regions are measured by 80 monitoring stations distributed throughout the BTH region, which is used for statistical modeling. In addition, remote sensing data is processed by OKM method to uncover the temporal and spatial characteristics. The remote sensing data are also used to analyze the relationship between PM 2.5 concentration and landform, geography and elevation. Remote sensing data (Fig 10) shows that PM 2.5 concentration is low in mountainous and basin regions, which are situated more than 1000 feet above sea level.
Fourth, this study focused on the driving factors of PM 2.5 concentration, besides one goal of investigating the temporal and spatial characteristics. Related studies on the BTH regions are generally salient on diving facts of PM 2.5 concentration. For example, Zheng et al. (2018) [70] focused on improvement of the real-time forecast. Liu et al.(2018) [50] aim to visualize the dynamic interactions and relationships between PM 2.5 in different cities in the BTH regions. Yan et al. (2018) [51] investigated the spatiotemporal pattern of PM 2.5 concentrations in China. In this study, a large number of socioeconomic factors are included in the investigation. Partial least squares (PLS) method to explore the critical driving factors of PM2.5, which is rarely used in PM 2.5 research. Estimation bias caused by multicollinearity of raw data can be avoided and it can lead to a robust estimation results. Empirical results demonstrate that the deterioration of PM 2.5 concentration in 2015 is closely related to a set of critical impact factors, including population, transport, industry production, and energy consumption aspects.

Conclusion
PM 2.5 is a challenging and urgent air pollutant that should be fully treated in China. The BTH region is heavily exposed to serious PM 2.5 pollution. Previous studies are limited by the lack of real-time monitoring data and poor consideration of multicollinearity issues amongst dependent variables. This study aims to quantitatively measure the spatial-seasonal concentration characteristics of PM 2.5 and identify critical impact factors. Empirical findings reveal the following. (1) Notable differences in PM 2.5 concentrations exist amongst different seasons. Specifically, a periodical U-shaped variation trend with high pollution levels is found in autumn and winter and one with low levels is observed in spring and summer. (2) An apparent spatial distribution pattern exists in which PM 2.5 concentration in the south is higher than in the northern regions in BTH. (3) The deterioration of air pollution is closely related to several critical impact factors, including population density, urbanisation rate, road freight volume, secondary industry GDP, overall energy consumption and industrial pollutants (e.g. steel production and volume of sulphur dioxide emission).
Numerous viable and concrete police recommendations are provided for effective PM 2.5 treatment. 1) Decentralisation policy is a viable alternative policy for improvement of PM 2.5 treatment. Decentralisation policy is an effective approach for downsizing urban population and relieving congestion [76,77]. Empirical studies verify a tight relation between population density and PM 2.5 concentration. For example, rapid population influx in Beijing leads to PM 2.5 concentration deterioration due to associated energy consumption and production. Population and emission-intensive industries should be decentralised outward under the cooperated plan of BTH integration. 2) Expansions of high-energy consumption and emission industries should be constrained. Non-coal or renewable energies, such as natural gas, solar, biomass and wide and regular hydro power, should be encouraged for wide commercialisation and deployment. Statuary standards for energy use efficiency and pollutant emission should be reinforced to raise the consciousness and capability of enterprises on PM 2.5 treatments. 3) Mass transportation should be encouraged by the government and the public. For example, a complete and convenient mass transportation facility should be developed by the government to improve public ridership rate; the number of fuel-based motor vehicle ownership should be restricted; non-fuel cars, such as pure electric and plug-in new energy, should be encouraged; and diverse low-emission transportation modes, such as walking, cycling, car sharing and subway, should be used by urban residents. 4) Winter is the most heavily polluted season in the BTH region. Heating and vast coal combustion are associated, to a large extent, to contributing to high PM 2.5 emission. In addition, meteorological conditions in winter are not conducive to the purification and diffusion of PM 2.5 particles, thus adding to PM 2.5 pollution. Ma and Zhang (2014) [16] emphasised that the wide use of imported low-calorie coal and lignite in industrial and domestic sectors are particularly adverse for treatments of PM 2.5 pollution. The Chinese government should reduce the use of low-calorie coal and promote energy substations for clean heating energy. 5) PM 2.5 pollution is more serious in the south than in the north. The government should formulate an industrial restructuring scheme to avoid excessive concentration of polluting industries in certain regions, plan scientific reallocation for highly polluting industries and upgrade the energy use efficiency of industrial sectors.
Supporting information S1 Table