Industrial agglomeration and air pollution: A new perspective from enterprises in Atmospheric Pollution Transmission Channel Cities (APTCC) of Beijing-Tianjin-Hebei and its surrounding areas, China

Air quality in China has gradually been improving in recent years; however, the Beijing-Tianjin-Hebei (BTH) region continues to be the most polluted area in China, with the worst air quality index. BTH and its surrounding areas experience high agglomeration of heavy-polluting manufacturers that generate electric power, process petroleum and coal, and carry out smelting and pressing of ferrous metals, raw chemical materials, chemical products, and non-metallic mineral products. This study presents evidence of the air pollution impacts of industrial agglomeration using the Ellison–Glaeser index, Herfindahl–Hirschman index, and spatial autocorrelation analysis. This was based on data from 73,353 enterprises in “2+26” atmospheric pollution transmission channel cities in BTH and its surrounding areas (herein referred to as BTH “2+26” cities). The results showed that Beijing, Yangquan, Puyang, Kaifeng, Taiyuan, and Jinan had the highest Ellison–Glaeser index among the BTH “2+26” cities; this represents the highest enterprise agglomeration. Beijing, Langfang, Tianjin, Baoding, and Tangshan also showed a low Herfindahl–Hirschman index of pollutant emissions, which have a relatively high degree of industrial agglomeration in BTH “2+26” cities. There was an inverted U-shaped relationship between enterprise agglomeration and air quality in the BTH “2+26” cities. This means that air quality improved with increased industrial agglomeration up to a certain level; beyond this point, the air quality begins to deteriorate with a decrease in industrial agglomeration.


Introduction
Industrial agglomeration promotes urban expansion, population concentration, and regional economic development; however, it also introduces a series of environmental problems [1][2][3]. As such, industrial agglomeration may have positive and negative environmental impacts. On the one hand, industrial agglomeration leads to greater regional resource consumption. It drives high density populations and the enlargement of the city scale which increase the consumption of raw materials, generating more pollution [4,5]. On the other hand, it may improve the efficiency of resource utilization. Within industrial clusters, industrial agglomeration promotes the centralized use of raw materials, enhances economic scale, and creates knowledge spillover effects and technological advantages. Thus, it improves the overall efficiency of raw material and energy use in the region, reducing pollutant emissions [6,7]. The industrial cluster theory first emerged in the mid-20th century, in which it proposed the concept of industrial externalities accompanied by geographic clusters [8][9][10][11]. Previous researchers have studied the effect of industrial aggregation; some recognize that industrial aggregation may promote the optimization of local industrial structures and improve economic and energy efficiency in agglomerated areas through economies of scale and knowledge spillover effects [12][13][14][15][16]. Specifically, industrial agglomeration promotes the scale effect of each industry in the region, introducing positive externalities, driving productivity growth and the energy efficiency of industries, and reducing local pollution [17,18].
Other researchers observed the insignificant or negative effects of industrial agglomeration on energy efficiency [19,20]. Researchers have reported that economic agglomeration and urbanization boosted energy demand and carbon emissions in more than 200 cities across the European Union (EU). An apparent positive correlation between industrial agglomeration and air pollution was observed; an increase in industrial agglomeration in 200 EU urban clusters was accompanied by serious air pollution. Researchers postulated that industrial agglomeration may increase the total amount of pollution within a certain geographic range. This would occur to the extent that it exceeds the local carrying capacity of the environment and generates negative externalities in terms of environmental pollution [21][22][23]. The scale of industrial production may inflate the discharge of industrial pollutants. The effect on industrial wastewater is considerably stronger than that on air pollution or waste [24].
Researchers have also observed that there is either a non-linear or unclear correlation between industrial agglomeration and environmental pollution [25][26][27][28]. A study found positive and negative relationships between agglomeration and energy efficiency from the paper and cement industries in Japan [29]. Another study suggested a non-linear relationship between agglomeration and productivity in Dutch firms [25]. It has been concluded that there is no definitive relationship between industrial agglomeration and pollution in the manufacturing industry [30]. A study examined the systematic linkage between industrial agglomeration and environmental performance using city-level data over the 2003-2010 period in China. They found a non-linear pattern between agglomeration and the emissions intensity of sulfur dioxide, and a "U-shaped" relationship between industrial agglomeration and environmental efficiency [2]. A study has also demonstrated that the relationship between emissions and manufacturing concentrations in the Beijing-Tianjin-Hebei (BTH) region is "inverted U-shaped," and this relationship is still in the preliminary stage [31]. While an increase in concentration will eventually deteriorate air quality, agglomeration may also promote the scale effect of the local area and greatly improve efficiencies in environmental technologies. In particular, an "inverted, U-shaped" relationship was observed between environmental pollution and income. This means that initially, environmental pollution worsened at lower income levels and then improved with economic growth exceeding a certain rate; this is known as the Environmental Kuznets Curve hypothesis [32].
In China, the BTH region experiences the worst air pollution in the country [33]. In 2019, 168 cities participated in an air quality index (AQI) evaluation; 16 of the 20 cities with the worst air quality were located in the BTH "2+26" cities, including Anyang, Xingtai, Shijiazhuang, Handan, Tangshan, Taiyuan, Zibo, Jiaozuo, Jincheng, Baoding, Jinan, Liaocheng, outlined in the Methods section (a) The urban economic and social development data (including total industrial output value, industrial structure, population density, and per capita GDP) are derived from the "China City Statistical Yearbook" issued by the National Bureau of Statistics in 2017, we can disclose this part in the attachment of supporting information. (b) the urban air quality data in the Beijing-Tianjin-Hebei region (including PM2. 5 Xinxiang, Hebi, Luoyang and Zhengzhou [34]. The average fine particulate matter (PM 2.5 ), coarse particulate matter (PM 10 ), sulfur dioxide (SO 2 ), and nitrogen dioxide (NO 2 ) concentrations in these 168 cities were 44 , 74 , and 33 μg/m 3 respectively. The average PM 2.5, PM 10, SO 2, and NO 2 concentrations in the BTH "2+26" cities were 57, 100 , 15, and 40 μg/m 3 . Highly polluting activities, such as steel making, iron making, and coking, alongside the cement, coal, and chemical industries are concentrated in the BTH and its surrounding areas [35]. The exposure pattern of the BTH cities showed that Beijing was prone to the highest air quality population exposure. Additionally, the exposure levels in Zhengzhou, Puyang, and Anyang were higher than the average of the BTH cities [36]. Due to variations in the contribution of unit emission reduction to pollutant concentration reduction between cities, there is heterogeneity in the marginal pollutant concentration reduction cost among various districts. A joint regional air pollution control policy in the BTH area is likely to save expenses associated with air pollution control compared with a locally based pollution control strategy [37].
In recent years, the government has been implementing the "Beijing-Tianjin-Hebei (BTH) Collaborative Development Plan" and the "Opinions on Strengthening the Construction of the Key Platform for the Transfer of the Beijing-Tianjin-Hebei (BTH) Industry;" prompting industrial agglomeration in BTH and its surrounding areas. A comprehensive mechanism to prevent and control air pollution in the BTH region and its surrounding areas is crucial to improve air quality [38,39]. The "Rules for the Implementation of the Action Plan for the Prevention and Control of Air Pollution in BTH and Surrounding Areas" proposes that BTH and its surrounding areas should significantly reduce the total emissions from SO 2 , nitrogen oxides (NO x ), PM, and volatile organic compounds (VOCs) [40]. The BTH region is a key area for air pollution prevention and control as it consumes over 33% of coal in China. Additionally, the main air pollutant emissions account for approximately 30% of the total emissions in China, and the emissions intensity per square kilometer is approximately four times the national average [41]. Rapid urbanization with high energy consumption has imposed significant pressure on ecosystems [42].
Although there has been substantial progress on collaborative governance policies in the BTH region, the comprehensive reduction of regional pollutant emissions remains the focus and the greatest challenge. The "Enhanced Measures for the Prevention and Control of Air Pollution in Beijing-Tianjin-Hebei (BTH) (2016-2017)" and the "Work Plan for Air Pollution Control in Beijing-Tianjin-Hebei (BTH) and Surrounding Areas in 2017," emphasize the air pollution control task for the BTH and its surrounding areas. With coordinated development of various industries, the agglomeration effect in the BTH has gained prominence over recent years. Whether industrial agglomeration has intensified air pollution in the BTH "2+26" cities is a topic worthy of in-depth study.
It is important to understand the overall impact of industrial agglomeration on the BTH regional environment, and whether it exacerbates air quality. To clarify this relationship, we employed data on 73,353 enterprises in the BTH region of China and compared the spatial emission characteristics of industrial pollution sources by using the Ellison-Glaeser index (EGI), Herfindahl-Hirschman index (HHI), and spatial autocorrelation analysis. We then constructed a spatial measurement model that empirically analyzed the relationship between industrial agglomeration and pollution aggregation. The effect of industrial aggregation on air quality was evaluated, providing a new perspective on regional atmospheric environmental governance.
This study differs from existing studies on agglomeration and environmental performance in two key ways. First, at the data level, previous studies on industrial agglomeration largely utilized city-level panel data; this study used the emissions data of 73,353 enterprises in BTH "2+26" cities. Then, we analyzed the spatial distribution of the agglomeration and its pollutant emission accumulation in different areas. Second, we constructed a spatial measurement model on the impact of industrial agglomeration on environmental pollution. This enabled further exploration into the impact of enterprise agglomeration, emission agglomeration, and air quality, characterizing the degree of pollution agglomeration in different areas. It aims to provide policy support for the coordinated management of air pollution in BTH and its surrounding cities in China.

Data sources and description
The data used in this study included industrial enterprise pollutant emission data, urban economic and social development data, and air quality data from the BTH "2+26" cities in 2016.
The enterprise and pollution source emission inventory datasets were derived from the China National Environmental Monitoring Centre (due to data confidentiality, this research data has not been disclosed in this study). Note that to obtain this data, the consent of the China National Environmental Monitoring Centre is required. Urban economic and social development data (including total industrial output values, industrial structure, population density, and per capita gross domestic product (GDP)) were collated from the National Bureau of Statistics from the Urban Statistical Yearbook of China. Air quality data (including PM 2.5 , PM 10 , SO 2 , NO 2 , ozone (O 3 ), and carbon monoxide (CO)) in the BTH "2+26" cities were collected from the National Urban Air Quality Real-time Publishing Platform of the Ministry of Ecology and Environment (http://106. 37.208.233:20035/). Table 1 provides basic information on the "2 +26" cities of the BTH and its surrounding areas.

Methods and models
2.3.1 Herfindahl index. The HHI was used measure the agglomeration of pollutant emissions of BTH "2+26" cities, and explore the different degrees of agglomeration between key and non-key survey micro-enterprises. Compared with the EGI, the HHI is simple to calculate and can more accurately reflect changes in industry or corporate market concentrations. HHI was selected to measure the agglomeration of pollutant emissions. This was because we had obtained the pollutant emission data of 73,353 micro-enterprises, and HHI is better able to measure the pollution agglomeration of micro-enterprises in the region. The HHI was calculated using Eq (1): Here, X i is the pollutant emission of enterprise, i, in a specific industry; X is the total pollutant emission of enterprise I; S i is the proportion of a specific pollutant discharge of enterprise I; and N is the total number of enterprises in this industry. The range of HHI is 0<H�1; that is, when H is equal to 1, pollutant emissions are highly concentrated within a specific city. When H is close to 0, the scale of enterprises in a specific city is similar. The study calculated the HHI for key and non-key surveyed enterprises separately, as enterprises with large emissions may have an extreme impact on HHI. In addition, pollutant emission levels in different industries vary; as such, the HHI in different industries was also calculated.

Ellison and Glaeser index.
Based on the EG coefficients proposed by Ellison and Glaeser in 1997 [43], an adjusted EG was used to calculate the degree of enterprise concentration in each city [44]. The minimum unit of calculation was at the county (district) level. Assuming that there are N enterprises in industry, i, and there are r counties (districts) in an administrative division, the EG coefficient of industry, i, was calculated using Eq (2): where G i is calculated using Eq (3), and H is calculated using Eq (4): Here, G is the Gini coefficient; H is the HHI; X j is the ratio of the output value of the district (county) of j to the total output value of all districts and counties of the city in that year; S ij is the ratio of the output value of industry, i, to the total output value of the industry in the district (county); Z k is the ratio of the output value of enterprise, k, to the total output value of industry, i. We assume that all enterprises in industry, i, have the same scale; this means their industrial output values are equal. This assumption was made in the formula to obtain HHI as detailed data on state-owned and non-state-owned industrial enterprises above a designated size are not publicized in China [45]. The adjusted formula for the HHI was calculated using Eq (5): Here, i represents an industry; j represents the district or county; r represents the total districts or counties in the city; n ij is the number of enterprises in county, j; Output ij is the total output value of the second industry of industry, i, in county j; Output i is the total output value of the second industry of industry, i, in the city. HHI may reflect changes in the industry or enterprise market concentration, market monopoly, and scale of competition.

Spatial correlation test.
We used global and local autocorrelation methods to analyze if there was spatial accumulation of "high pollution around high-pollution areas." This was carried out to verify the spatial correlation of pollutant emissions among different regions in the "2+26" cities. Typically, global autocorrelation is used to reflect the presence of research problems within the entire research scope. The Moran's I index and its numerical value were calculated to determine whether the research object has a spatial correlation. The Moran's I index was selected to analyze the spatial correlation of different pollutants in the "2+26" cities. The Global Moran's I was calculated using Eq (6): The Local Moran's I was calculated using Eq (7): The S 2 and � Y values were calculated using Eqs (8) and (9), respectively.
Here, Y i represents the pollutant concentration of city, i; n is the total number of cities; and W ij is the spatial weight matrix. When W ij = 1, this indicates that the two areas are adjacent. If W ij = 0, it indicates that the two areas are not adjacent or that i and j are the same area. Y i and Y j represent specific observations on enterprises in the BTH "2+26" cities.
The value range of Moran's I is (-1, 1); when the value of Moran's I is greater than zero, observations between the enterprises of the region have a positive correlation, and this spatial correlation strengthens as the value approaches 1. When the value is less than zero, observations between the regions have a negative correlation, and this negative correlation strengthens as the value tends to -1. When the value is zero, there is no spatial correlation between different cities.
For Local Moran's I, if the Local Moran's I >0, it indicates that there is a positive spatial effect. Cities with similar pollutant concentrations are aggregated together, showing high-high aggregation or low-low aggregation. If the Local Moran's I <0, it indicates that the spatial effect is negative, whereby different pollutant concentrations are clustered together; this shows high-low aggregation and low-high aggregation.
To accurately determine spatial correlation, the Moran's I index construction statistic was used to conduct the significance test. The expectations, E(I), and variances, VAR(I), of Moran's I were calculated using Eqs (10)-(15): Here; Z ¼ I À EðIÞ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi VARðIÞ p ð15Þ In Eqs (10)- (15), the E(I) and VAR(I) are the expectations and variances of Moran's I; n is the total number of cities; W ij , W ji , W ik , and W ki are the spatial weight matrices. We constructed a Z statistic that progressively obeyed a normal distribution. A value of the Z statistic greater than the critical value of the normal distribution at the 0.05 confidence level indicates that the evaluation index has spatial autocorrelation. The local autocorrelation test calculated the spatial agglomeration effect by calculating the local Moran's I using the Moran scattergram.

Spatial measurement model analysis.
The spatial lag model (SLM) was used to investigate whether the variable has spread (spillover effect) in the area, and its expression is given by Eq (16): Here, Y is the dependent variable (air pollution); X is the exogenous explanatory variable matrix of order n×k; and ρ is the spatial lag coefficient, which reflects the spatial dependence degree of the agglomeration between BTH "2+26" cities. It is the direction and extent of influence from the observation value, WY, in the adjacent area on the observation value, Y, in the specific area. W is a spatial weight matrix of order n×n, WY is the spatial lagging dependent variable, and ε is the random error term vector.
The mathematical expressions of the spatial error model (SEM) are shown in Eqs (17) and (18): Here, β is the influence of the explanatory variable, X, on the explained variable, Y; ε is the random error term vector; and λ is the spatial error coefficient, which measures the spatial dependence of BTH "2+26" cities. This means the influence of the observed value, Y, in neighboring regions on the observed value, Y, in the specific region.
Based on the quadratic items of enterprise agglomeration, we established the following basic spatial lag models (SLM) given in Eqs (19) and (20): Here, Aggregation represents the enterprise agglomeration index and the enterprise pollution concentration agglomeration index; and Z i represents the control variable, which is per capita GDP (perGDP), population density (POP), and industrial structure (IND). We used EG to represent the enterprise agglomeration index, and HHI to represent the enterprise pollution concentration agglomeration index ( Table 2).

Industrial enterprise clustering index
The EG index provides a well-defined range for measuring the geographical distribution of industrial agglomerations. According to Ellison and Glaeser [43], the EG index may be divided into three categories: (1) a low agglomeration is when the EG index is less than 0.02; (2) a moderate agglomeration is when the EG index exceeds 0.02 and is less than 0.05; and (3) a relatively high agglomeration is when the EGI is greater than 0.05.
Based on the adjusted EG index calculation, we used county (district) level data to analyze the agglomeration of industrial enterprises in BTH "2+26" cities in 2016; there are 303 districts and counties in the BTH "2+26" cities. The results show that there were 20 cities with an EG index below 0.02, four cities between 0.02 and 0.05, and four cities with an EG index exceeding 0.05 (Fig 2). The EG index of Beijing, Yangquan, Puyang, and Kaifeng had the highest concentrations among the "2+26" cities.
In general, the agglomeration of manufacturing enterprises in the BTH region was relatively low (Table 3). Among the 28 cities, 20 had a low EGI of less than 0.02, indicating that industrial agglomeration in this region was relatively low. According to the industry categories of enterprises, more than 85% of the BTH "2+26" cities were manufacturing enterprises (65,275 of 73,353 enterprises) (Table 4); however, BTH manufacturing enterprises are not highly agglomerated. The proportion of tertiary industry in the economic development of the BTH area was relatively high, consistent with the overall trend of transformation from secondary to tertiary industries all over the country. The threshold for entry into labor-intensive industries was relatively low [46]. Beijing was faster and better than other areas in terms of industrial structure transformation in the BTH area [47]. The employment and industrial structure in Hebei province was not coordinated, characterized by a prominent labor and resource-intensive nature.
The results show that there are two major types of cities with a high agglomeration of enterprises. The first category includes cities in the Shanxi, Henan, and Shandong provinces; these enterprises form economies of scale and a certain level of agglomeration. Although these regions have relatively higher industrial agglomeration, they are mostly labor and resourceintensive [48]. They were mostly in the second half of the medium term of industrialization or the first half of the late term of industrialization, with a relatively lagging tertiary industry and lower innovation output [49,50]. The other category includes the areas surrounding Beijing and Tianjin. Due to adjustments in the urban and industrial planning policies, enterprises in the core areas of Beijing have relocated to its peripheries. A considerable number of small and

Spatial density analysis of air pollutant emissions
We divided the enterprise statistics into industrial sectors according to the "National Economic Industry Classification GBT 4754-2017." The surveyed enterprises were from 44 industries (Table 4); among them, manufacturing enterprises accounted for more than 85%, covering non-metallic mineral products, metal products, general equipment manufacturing, and rubber and plastic products. There were 6,604 electricity, heat production and supply enterprises in the BTH "2+26" cities (Table 4), releasing the heaviest SO 2 and NO x emissions. We obtained the pollution discharge list of the 73,353 industrial enterprises in the BTH "2 +26" cities for 2016 and statistics on the pollutant emissions from different industries. The results of the analysis are presented in Table 5.
In terms of emission types, electricity, heat production and supply, ferrous metal smelting and rolling processing, and non-metallic mineral products were the major sources of SO 2 , NO x , and PM 2.5 emissions among the "2+26" cities. The SO 2 emissions from the electricity, heat production, and supply industries were 315,136.7 t, accounting for 31.8% of SO 2 emissions in the industry. The NO x emissions were 522,941.1 t, accounting for 31.8% of the overall industrial NO x emissions (Table 5). The results showed that SO 2 and NO x emissions from the power industry accounted for the largest proportion of total emissions in 2016. Although the government has adopted various measures to promote the de-capacity of steel, cement, and petrochemical plants in BTH during recent years, these industries continue to cause the highest SO 2 and NO x emissions and are the main sources of air pollution in the BTH [55]. Desulfurization and denitrification facilities have been installed in the coal-fired power plants in China, with 80% of plants completing ultra-low emission retrofits by 2019 [56][57][58]. The PM 2.5 emissions of ferrous metal smelting and rolling processing industries were 269,447 t, accounting for 31.6% of the overall industrial PM 2.5 emissions, and causing the highest PM 2.5 pollution (Table 5). Emissions from steel companies have a significant impact on urban air quality [59]; the air quality of the 20 cities with the highest production capacity in the steel industry (where capacity accounts for 51% of the total production capacity in China), exceeded the air quality standards of China [60]. Therefore, it is necessary to accelerate the ultra-low emission transformation in the steel industry to win the Blue-Sky Protection Campaign [61,62] and implement coordinated emission reduction policies for this industry in the BTH region [63].

Agglomeration index of pollutant emissions
We used the HHI to measure the agglomeration of pollutant emissions in different cities. Based on the classification standard of the Ministry of Ecology and Environment for air pollution control enterprises [64], a total of 73,353 industrial enterprises in BTH were categorized as 11,168 key surveyed enterprises and 62,185 non-key surveyed enterprises. The HHI was used to calculate the industrial pollution discharge agglomeration of the "2+26" cities; the results are presented in Table 6. The results showed that the overall concentration of pollutant emissions in the BTH "2+26" cities was high. In terms of different pollutants, the overall agglomeration in northwest BTH was low, while the central and southeastern parts were high; there were also specific differences. The overall agglomeration of key and non-key survey enterprises was similar, although identical trends were not observed for various cities (Fig 3).
Beijing, Langfang, Tianjin, Baoding, and Tangshan had a low HHI of pollutant emissions, which indicates a relatively high degree of pollution agglomeration. This is closely related to the AQI in these cities [65,66], as the large number of pollutant emission sources causes poor air quality [40]. The HHI of key industries in Tangshan, Zibo, Tianjin, Baoding, Changzhi, Xingtai, Beijing, Handan, and Shijiazhuang showed higher pollution concentration aggregation than other cities. As there were a large number of non-key industries, the HHI of these industries was determined separately. It was observed that Beijing, Tianjin, Langfang, Baoding, Hengshui, Xingtai, and Zhengzhou had high levels of pollution agglomeration; there was a correlation between industrial clusters and pollution clusters [2,3,7]. Therefore, follow-up studies analyzed and verified the correlation between industrial agglomeration and pollution agglomeration.

Spatial autocorrelation test.
The pollutant emissions of the BTH "2+26" cities demonstrated a certain level of spatial agglomeration. To further verify this spatial correlation, Moran's I was used to confirm the relationship between district and county-level pollutant emissions and spatial distribution in the BTH "2+26" cities. The spatial correlation may be measured by the Global and Local Moran's I, with the former reflecting the spatial correlation as a whole, and the latter measuring the local correlation. The global correlation may be used to describe the overall spatial correlation status of air pollution in the BTH "2+26" cities, and determine whether air pollution is spatially aggregated in nature. The Moran's I values for SO 2 , PM 2.5 , PM 10 , NO x , and VOCs were all positive and were significant at the 5% significance level (Table 7). This shows that the overall pollution emission from industrial enterprises was significantly positively correlated with the spatial distribution of the BTH "2+26" cities; air pollution demonstrated a spatial agglomeration effect in the BTH "2+26" cities. This meant that the pollutant emission sources among BTH APTCC cities affect each other, as does the pollution concentration of SO 2 , NO x ,

Local spatial autocorrelation test.
Moran scatter plots were able to describe local spatial correlations in the BTH "2+26" cities. The spatial autocorrelation analysis of SO 2 emissions demonstrated that 74% of the county-level pollutant emissions was positively spatially correlated, and NO x , PM 2.5 , and VOC emissions had positive spatial correlation ratios of 74%, 74%, and 68%, respectively. More than 50% of the polluting district and county agglomerations were located in the first and third quadrants; these are characterized by the "High-High" or "Low-Low" types. This indicates that there was a significant spatial autocorrelation in the pollutant emissions of industrial enterprises in the BTH "2+26" cities in 2016. Pollution agglomeration in most districts and counties was similar to that in neighboring counties. The districts and counties with high levels of pollution discharge were geographically adjacent, and the districts and counties with low pollution discharge agglomeration were also largely in close proximity to each other.
Anselin and Florax [70] proposed the local indicators of spatial association (LISA); analysis of the LISA cluster map may further reflect the spatial clustering significance of pollutant emissions in the BTH "2+26" cities. We established the LISA agglomeration map of different pollutant emissions of the BTH "2+26" cities (Fig 4). The Moran scatter plot was divided into four quadrants according to the four types of pollution control performance occurring in each region: high-high (H-H), low-high (L-H), low-low (L-L), and high-low (H-L), which correspond to the first, second, third, and fourth quadrants, respectively. The first quadrant represents the significant concentration of high pollutant emissions, and the third quadrant

Correlation analysis of enterprise agglomeration and pollution agglomeration.
We calculated the Pearson correlation coefficient between enterprise agglomeration and pollutant emission agglomeration. The results showed that the correlation coefficient between enterprise agglomeration and pollutant emission agglomeration was 0.631; this is significantly positive at the 1% level, indicating that there is a strong positive correlation between enterprise agglomeration and pollutant emission agglomeration in the "2+26" cities. The high concentration of enterprises in the spatial distribution means that the pollution discharge in this area is also highly concentrated. The Pearson correlation between enterprise agglomeration and pollution agglomeration is shown in Table 8.

Analysis of the impact of enterprise agglomeration on air quality.
A spatial measurement model was established to determine the relationship between air quality, enterprise  agglomeration, and economic development. In this spatial econometric model, the daily average of the AQI in 2016 was the explanatory variable, and the EGI index was used as the relevant indicator for agglomeration. Other control variables include per capita GDP (perGDP), population density (POP), and industrial structure (IND). The descriptive statistics of the variables is provided in Table 9.
To determine whether the SEM or the spatial lag (SAR) model was more suitable for the sample of this study, we initially conducted the ordinary least squares (OLS) estimation to observe the Lagrangian multiplier lag (LM), error, and robustness tests. The test contains two statistics: LM-Error and LM-Lag. If these two statistics are not significant, the OLS model is selected as the final model; if the robust LM-Error statistic is significant, it points to the SEM; and if the robust LM-Lag statistic is significant, it points to the SAR. The results of the OLS estimation and robustness tests are shown in Table 10.
The table shows that LM (lag) and LM (error) were not significant, indicating that it is not applicable to the SAR or SEM models. Additionally, the coefficient of the EG index was negative and insignificant. This result is inconsistent with those of previous spatial autocorrelation tests. Therefore, considering a possible non-linear relationship, we used the square of EG as the control variable for least squares regression.
The results of the OLS regression show that the LM (error) result is statistically more significant than the LM (lag); both RLM (lag) and RLM (error) were not significant. Therefore, based on the model judgment criteria, an SEM was selected for further testing. We used the structural equation modeling estimation, where the results are shown in Table 11. In industrial agglomeration areas, there is often a large scale of production that attracts labor and increases population density, resulting in more severe air pollution. However, in the post-industrial period, a service-oriented city attracts an increasing number of people and causes air pollution. The analysis results of the SEM found that the spatial error regression coefficient was highly statistically significant. This confirms the spatial correlation of air pollution in the BTH "2+26" cities and the reasonability of developing a spatial measurement model. The value of the spatial error regression coefficient was positive, indicating that air pollution has strong spatial dependence and there are spatial spillover effects. For the BTH APTCC, air pollution in one area induced air pollution in adjacent areas. As such, this spatial spillover effect indicates that the level of air pollution in a region is affected by the economic development within a region, demographic changes, and industrial enterprise agglomeration. It is also impacted by the economic development, demographic changes, and industrial enterprise agglomeration level of its neighboring regions.
The primary and secondary terms of the EG index passed the 1% significance level test, and the primary coefficient β 1 was negative, while the secondary coefficient β 2 was positive, indicative of an inverted, U-shaped relationship between enterprise agglomeration and air quality in the BTH "2+26" cities. First, as enterprise agglomeration grows, the scale of production increases, usually attracting more labor. With economic development, the industrial structure is continuously adjusted, and air quality is improved. However, as agglomeration increases excessively and the scale of production blindly expands, this generates an explosion in population growth. As such, it reaches a threshold where any further growth in agglomeration leads to deteriorating air quality. Alternatively, if the industrial structure is not well adjusted as the enterprises agglomerate blindly, air quality will deteriorate once it passes an inflection point [71][72][73].
This type of industrial agglomeration has an inverted, U-shaped relationship with environmental quality, which conforms to the characteristics of the Environmental Kuznets Curve [37], which describes the relationship between economic development and pollution. It postulates that as the economy develops, pollutant emissions increase until a certain threshold is reached; then, the high economic level contributes to technological innovations that aim to relieve pollution emissions, causing emissions to decrease. Therefore, there is an inverted, U-shaped curve between economic development and environmental pollution. This indicates the importance of promoting the rationalization of the industrial structure and establishing a long-term mechanism for environmental governance to effectively balance economic growth and environmental protection [74].

Conclusions
The relationship between industrial agglomeration and environmental pollution is complex. Previous studies on industrial agglomeration largely used city-level panel data. This study used the EGI, HHI, and spatial correlation analysis to analyze the spatial emission and agglomeration characteristics of pollution sources. This was carried out using data on 73,353 enterprises in the BTH "2+26" cities of China; we were able to provide new evidence of the impact of industrial agglomeration on environmental performance. Through the calculation of the adjusted EGI, we found that there were mainly two types of cities with relatively high concentrations of enterprises. The first category includes the areas surrounding Beijing and Tianjin, excluding the two cities themselves. Due to changes in government policies on urban and industrial development, enterprises have relocated from the core areas of Beijing to its peripheries; the transferred area surrounds the urban regions of Beijing and Tianjin. The other category includes cities such as Taiyuan, Zhengzhou, and Jinan, which are provincial capitals with a relatively dense distribution of industries; a large number of these industries are resource-intensive, forming scaled economies and a certain level of agglomeration. As these cities have not begun to evacuate or transfer industries, they are still considered economic centers. These two types of agglomeration characteristics represent differences in the development models and stages in different regions of China.
The results of Global and Local Moran's I showed that the overall pollution discharge of industrial enterprises was significantly positively correlated with the spatial distribution of the BTH "2+26" cities. Air pollution presented a spatial agglomeration effect, meaning that pollutant emissions in any of the BTH "2+26" APTCC will impact other surrounding areas. Additionally, the pollution concentrations of SO 2 , NO x , PM 2.5 , and PM 10 in adjacent cities will also affect nearby cities, which verifies the conclusions of previous studies. We should be alert to the problem of industrial cliffs caused by excessive industrial gradients among the BTH "2 +26" APTCC. Gradually, a cross-complementary and distinctive regional division of labor pattern should be formed, ultimately promoting the coordinated development of the BTH region [75].
There is a strong positive correlation between enterprise agglomeration and pollutant emission agglomeration in the BTH "2+26" cities. Through spatial measurement model analysis, we found that there was an inverted U-shaped relationship between the industrial enterprise agglomeration and air quality in the BTH. This is likely due to the increase in the agglomeration of enterprises; the scale effect or an industrial chain gradually forms between enterprises, introducing improvements in the technological level and energy saving. However, when the agglomeration reaches a certain point, the diffusion effect of pollution causes further serious air pollution to the surrounding areas as agglomeration proceeds.
Industrial agglomeration is inseparable from different forms of pollution. The spatial distribution of enterprises, such as the formation of industrial parks, may enhance air quality to a certain extent. However, urban pollution control needs to be combined with the optimization of local urban and industrial layouts. It is also necessary to consider the industrial development of nearby cities and the impact of pollutant emissions on the air quality of adjacent cities.
This study analyzed the relationship between atmospheric pollution and industrial agglomeration from the BTH APTCC and provided evidence for the diffusion effect of air pollution in urban agglomerations. These findings may be applied in air pollution management in urban agglomerations such that it is more objective, reliable, and powerful. It is recommended that future studies use more detailed data and related data mining technology to explore air pollution characteristics in different areas.