Abstract
Heavy metal pollution in coastal agricultural soils poses significant threats to food security, human health, and marine ecosystems. Effective prevention and control require systematic analysis of the spatial distribution and sources of these metals. This study integrated geostatistics, principal component analysis (PCA), positive matrix factorization (PMF), and finite mixture modeling (FMM) to comprehensively analyze the spatial variability and sources of five heavy metals (Cr, Pb, Cd, Hg, As) across 877 sampling sites in the coastal area of eastern Zhejiang. The results indicate that overall soil quality is good, though enrichment occurs at some sites due to anthropogenic activities. Pollution displays a spatial pattern of lower levels in the south and higher levels in the north. Pb is widely distributed, while Cd, Hg, and As are concentrated in agricultural plain areas. PMF-based source apportionment revealed that mobile sources (traffic) contributed the most (52.5%), followed by industrial sources (30.4%) and agricultural sources (17.1%). The consistency of multi-model results validated the reliability of source identification. Precise management strategies based on pollution source contributions are expected to curb further deterioration of heavy metal pollution in agricultural soils in Zhejiang Province, gradually improve soil environmental quality, and ensure the safety of agricultural products and the sustainable development of agriculture.
Citation: Ji J, Wu X (2026) Assessing spatial variability and source identification of heavy metals in agricultural soils: A geostatistical and multivariate analysis of coastal eastern Zhejiang, China. PLoS One 21(3): e0344184. https://doi.org/10.1371/journal.pone.0344184
Editor: Somayeh Soltani-Gerdefaramarzi, Ardakan University, IRAN, ISLAMIC REPUBLIC OF
Received: May 28, 2025; Accepted: February 17, 2026; Published: March 5, 2026
Copyright: © 2026 Ji, Wu. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Due to legal restrictions based on China’s Data Security Law and institutional data sharing agreements with the Zhejiang Provincial Ecological and Environmental Monitoring Center, the complete raw dataset containing precise geographic coordinates cannot be made publicly available. However, all processed data necessary to replicate the study findings are provided within the paper and its supporting information files. Researchers may request access to the minimal processed dataset (with anonymized locations) from the corresponding author (liyan522@zju.edu.cn) or submit inquiries to the Data Management Committee of Zhejiang Provincial Ecological and Environmental Monitoring Center (zjemc@zj.gov.cn). The authors confirm they had no special access privileges to the data.
Funding: This work was supported by the National Key Science and Technology Project for Comprehensive Environmental Management in the Beijing-Tianjin-Hebei Region (2025ZD1205801) and the National Key Research and Development Program of China (2024YFC3711805). There was no additional external funding received for this study.
Competing interests: The authors declare no conflict of interest.
Introduction
Amid rapid industrialisation and urbanisation, agricultural soil pollution is becoming increasingly severe, significantly impacting human health and life [1]. Among the various forms of agricultural soil pollution, heavy metal contamination is of particular concern due to its widespread sources, high bioaccumulation, toxicity, and resistance to degradation [2]. It has been reported that approximately 10 million soil-contaminated plots exist globally, with over 50% (about 20 million hectares of land) affected by heavy metals [3,4]. According to the report “The state of soils in Europe”, published on December 19, 2024 by the Joint Research Center of the European Commission and the European Environment Agency, about 19% of agricultural soils in the European Union (EU) contain at least one heavy metal that exceeds the limit values set by national legislation [5]. In the United States, 28% of soil samples contain heavy metal concentrations exceeding the Environmental Protection Agency’s (EPA) ecological soil screening levels [6]. In Russia, around 10% of the soil is deemed contaminated with “dangerous” or “moderately dangerous” heavy metals [7]. Many countries have made significant efforts to prevent and control soil heavy metal pollution [8]. In the 1970s, the U.S. investigated heavy metal accumulation in its soils and crops, conducting risk assessments in contaminated areas [9]. In April 2024, the EU adopted the Soil Monitoring Act, marking the first soil-specific legislation in EU history, aiming to promote sustainable soil use and establish a legal framework to achieve soil health by 2050 [10].
In March 2022, China’s Ministry of Ecology and Environment issued the “Opinions on Further Strengthening Prevention and Control of Heavy Metal Pollution,” setting two primary objectives: first, to reduce emissions of key heavy metal pollutants from major industries by 5% compared to 2020 levels by 2025; and second, to comprehensively improve heavy metal pollution management, environmental risk prevention, and regulatory capacity by 2035 [11]. An overview of governance actions across countries highlights that investigating the spatial and source characteristics of heavy metals in agricultural soils is crucial for enhancing pollution monitoring and prevention [4,12–15].
Sources of heavy metal contamination in agricultural soils fall into two main categories: natural and anthropogenic [16]. Natural sources include the weathering of parent rocks and vegetation formation [17], while anthropogenic sources are more diverse, including industrial activities (e.g., industrial waste emissions, mining waste), urbanisation (e.g., automobile exhaust), agricultural activities (e.g., excessive pesticide use, irrigation wastewater), and military operations [18]. In recent years, source analysis of heavy metals in agricultural soils has gained prominence as a critical research topic, serving as a foundation for efficient pollution monitoring and land remediation [19]. Initially, researchers primarily employed Principal Component Analysis (PCA) and correlation analysis to qualitatively determine the sources of soil elements [20]. More recently, methods such as the single pollution index, the Nemerow pollution index, the geo-accumulation index, and the enrichment factor have been used to assess the potential risks of multiple heavy metals to plants, animals, and humans [21–24]. Additionally, techniques such as the Positive Matrix Factorisation (PMF) model and multivariate statistical analysis models are commonly used to identify pollution sources and quantify the contribution of each source [25,26]. Each of these techniques provides valuable insights, but applied in isolation they often yield only a fragmented understanding. For instance, methods such as PCA and single pollution indices focus primarily on qualitative interpretation of pollution sources, whereas receptor models like PMF enable quantitative analysis of source contributions. Combining models with complementary emphases therefore offers a more comprehensive and practical understanding of pollution sources, which in turn supports the development of precise and effective remediation strategies.
In terms of sample selection, most studies on soil heavy metal pollution in China have focused on inland provinces, with less attention given to coastal areas. Furthermore, most studies concentrate on farmland around mining and industrial zones, with fewer studies examining farmland in river downstream areas or hilly regions [27–30]. To examine heavy metal contamination in agricultural soils more comprehensively, this study selected four cities in the eastern part of Zhejiang Province, China (Ningbo City, Taizhou City, Shaoxing City, and Jinhua City) as the sample area. The reasons for this selection are as follows. First, eastern Zhejiang is a key area in the Yangtze River estuary, situated within the River-Sea-Land Interaction Zone, and characterised by a mixed economic structure of agriculture, light industry, and trade, with rapid urbanisation [31,32]. Second, the Yangtze River Delta region, where eastern Zhejiang is located, has been identified as a priority area for environmental protection due to its ecological sensitivity and high population density [33,34]. Soil contamination in this region directly affects food safety and public health for millions of residents [35]. Historically, agricultural development in eastern Zhejiang was characterised by excessive use of chemical fertilisers, pesticides, and plastic films, leading to significant heavy metal pollution [35,36]. Third, eastern Zhejiang is also a common site for military exercises in the East China Sea, and as noted earlier, military operations can contribute to soil heavy metal pollution.
This study comprehensively applied geostatistical analysis, principal component analysis (PCA), positive matrix factorization (PMF), and finite mixture modeling (FMM) to investigate the spatial variability, pollution characteristics, and sources of heavy metals in agricultural soils of eastern Zhejiang. Based on 877 surface soil samples, the research aims to: (1) analyze the spatial distribution patterns of chromium (Cr), lead (Pb), cadmium (Cd), mercury (Hg), and arsenic (As); (2) perform source apportionment and quantify the contribution rates of heavy metal pollution in the study area; and (3) provide evidence-based targeted pollution control and sustainable soil management recommendations for coastal agricultural regions.
Materials and methods
Study area
Zhejiang Province (27°02′N–31°11′N, 118°01′E–123°10′E) is situated along China’s southeastern coast, forming the southern wing of the Yangtze River Delta. It is bordered by the East China Sea to the east, Shanghai and Jiangsu to the north, Fujian to the south, and Jiangxi and Anhui to the west. The province spans approximately 365,500 km², comprising 105,500 km² of land and 260,000 km² of marine territory. Its topography descends in steps from southwest to northeast: mountainous terrain dominates the southwest, hills the central region, and low-lying alluvial plains the northeast, a pattern summarised by the regional maxim “seven parts mountains, one part water, and two parts farmland (QiShan Yi Shui Liang Fen Tian)” [37]. Zhejiang experiences a subtropical monsoon climate with moderate humidity [38]. With per capita arable land at one-third of the national average, limited quantity, suboptimal quality, and scarce reserves, Zhejiang abandoned traditional agricultural models [39]. In 2003, guided by the “Lucid Waters and Lush Mountains Are Invaluable Assets (Lvshui Qingshan Jiushi Jinshan Yinshan)” philosophy, the province launched the “One Thousand Demonstration Villages and Rectification of Ten Thousand Villages (Qiancun Shifan Wancun Zhengzhi)” project to enhance rural ecosystems and livelihoods. In June 2019, Zhejiang became China’s sole pilot province for modern eco-circular agriculture. In May 2021, the Central Government of the People’s Republic of China designated Zhejiang as a national common prosperity demonstration zone [40]. The province now has upper-middle-income economy status, with urban and rural disposable incomes consistently ranking first among Chinese provinces. Notably, all its prefectural cities exceed national average income levels.
Administratively, Zhejiang comprises 11 prefectural cities, geographically categorised as northern (Hangzhou, Jiaxing, Huzhou), southern (Wenzhou, Lishui, Quzhou), and eastern regions (Ningbo, Taizhou, Shaoxing, Jinhua, Zhoushan). This study focuses on four eastern cities: Ningbo (predominantly plains and low hills; 2024 GDP: 1,814.77 billion yuan), Jinhua (predominantly hills and basins; 2024 GDP: 692.55 billion yuan), Shaoxing (mixed plains, hills, and mountains; 2024 GDP: 836.9 billion yuan), and Taizhou (predominantly coastal plains and hills; 2024 GDP: 665.64 billion yuan). Zhoushan was excluded as 93.7% of its area comprises marine territory, offering limited relevance to the study.
Soil sampling and sample determination
Sampling was conducted as part of the 2013 Zhejiang Province Agricultural Land Soil Heavy Metal Pollution Survey. Based on the distribution of basic farmland, topographic heterogeneity, and soil type variability, 877 agricultural soil sampling points were systematically established across the study area. Land use types at these points included paddy fields, upland fields, tea gardens, orchards, bamboo plantations, and forested areas. Paddy fields accounted for 48.35% of the total area, while upland fields constituted 27.94%. Soil types at sampling points encompassed red soil, paddy soil, coarse skeletal soil, purple soil, loess, and tidal soil, with red soil being the most prevalent at 51.31%. The spatial distribution of sampling points is illustrated in Fig 1. To reflect soil physicochemical properties more accurately, at least three samples were collected from each site and combined into a composite agricultural soil sample for that location. Sampling methods were selected based on field size and shape to ensure representativeness: spot sampling for small irregular fields, grid sampling for medium-sized regular fields, and serpentine sampling for large uniform fields. This approach followed standard soil sampling protocols to minimize spatial bias and enhance the representativeness of composite samples. For conventional crops, the sampling depth of the plow layer was 0–20 cm. For fruit trees and forest crops, the sampling depth was 0–60 cm. Differential GPS was used to locate sampling points precisely. All soil samples were transported to the laboratory and air-dried indoors. After removing plant debris and stones, samples were ground and sieved through a 100-micron mesh. They were then sealed in amber glass bottles and stored at −20°C until analysis.
(Note: Base map sources: Esri, HERE, Garmin, Intermap, Increment P Corp., GEBCO, USGS, FAO, NPS, NRCAN, GeoBase, IGN, Kadaster NL, Ordnance Survey, Esri Japan, METI, Esri China (Hong Kong), (c) OpenStreetMap contributors, and the GIS User Community [41]).
The indices for soil sample analysis included the total content of five heavy metal elements. Inductively coupled plasma optical emission spectrometry (ICP-OES) was used to determine Cr content, inductively coupled plasma mass spectrometry (ICP-MS) to determine Pb and Cd content, cold vapour atomic fluorescence spectrometry (CV-AFS) to determine Hg content, and hydride generation atomic fluorescence spectrometry (HG-AFS) to determine As content [42–45]. The detection limits were 3.60 μg·g⁻¹ for Cr, 0.91 μg·g⁻¹ for Pb, 0.019 μg·g⁻¹ for Cd, 0.53 ng·g⁻¹ for Hg, and 0.069 μg·g⁻¹ for As. Calibration curves were established with heavy metal standard solutions after every ten samples to ensure instrumental error remained within a 2% range. All reagents used in the analysis were of high purity. National standard reference materials, GSS-1 and GSS-4, were incorporated throughout the quality control process. All results were within the acceptable error tolerance range.
Data analysis methods
Descriptive statistics and geostatistical analyses.
Descriptive statistical analyses of soil heavy metals were conducted using SPSS 22.0, including measures such as the arithmetic mean, extreme values, standard deviation, coefficient of variation, kurtosis, and skewness. Because the Ordinary Kriging interpolation method in geostatistics requires normally distributed data, the data were tested for normality using the Kolmogorov-Smirnov (K-S) test in SPSS 22.0. Data that did not conform to a normal distribution were transformed using either a logarithmic or Box-Cox transformation. The Ordinary Kriging interpolation model in the geostatistical analysis module of ArcGIS was then used to determine the spatial variability of soil heavy metal elements.
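The transformation step can be illustrated with a minimal sketch. The study itself used SPSS and ArcGIS; the Python below is only an illustration of how a logarithmic or Box-Cox transformation reduces right skew. The function names and the sample concentrations are hypothetical and are not taken from the study.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform; lam == 0 reduces to the natural logarithm."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def skewness(values):
    """Population skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed concentrations (mg/kg), not data from the study.
hg = [0.03, 0.04, 0.05, 0.05, 0.06, 0.08, 0.10, 0.15, 0.40, 1.20]
log_hg = box_cox(hg, 0)

# The log-transformed series is less skewed than the raw series.
print(round(skewness(hg), 2), round(skewness(log_hg), 2))
```

In practice the Box-Cox parameter would be chosen by the statistical software, and normality would be re-checked after transformation before kriging.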
Principal component analysis (PCA).
Principal Component Analysis (PCA) is a multivariate statistical method used for dimensionality reduction and feature extraction [46]. In this study, PCA is employed for exploratory analysis of heavy metal pollution data and preliminary determination of the number of pollution sources. First, the standardized heavy metal concentration data is processed through PCA. By calculating the eigenvalues and eigenvectors of the covariance matrix, the major components that explain the data variability are extracted.
The optimal number of pollution sources is determined based on the Kaiser criterion (retaining components with eigenvalues greater than 1), the cumulative variance contribution (reaching over 80%), and the elbow method from the scree plot. By analyzing the component loading matrix, the contribution of each heavy metal element to the different principal components is identified, allowing for a preliminary judgment of pollution source types and their characteristic element combinations.
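The first two selection criteria can be sketched as simple arithmetic. The example below assumes PCA on standardized data, so that the total variance equals the number of variables (five here), and uses the three leading eigenvalues reported in the Results. The function names are hypothetical, and the percentages it yields may differ from the reported ones by rounding.

```python
def kaiser_count(eigenvalues):
    """Number of components with eigenvalue > 1 (Kaiser criterion)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

def cumulative_variance(eigenvalues, n_variables):
    """Cumulative explained-variance share, assuming standardized data
    so that total variance equals the number of variables."""
    shares, running = [], 0.0
    for ev in eigenvalues:
        running += ev / n_variables
        shares.append(running)
    return shares

leading = [1.66, 1.31, 0.91]  # first three eigenvalues reported in the Results
print(kaiser_count(leading))
print([round(100 * s, 1) for s in cumulative_variance(leading, n_variables=5)])
```

The elbow criterion is a visual judgment on the scree plot and is not reproduced here.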
Positive matrix factorization (PMF).
Positive Matrix Factorization (PMF) is a constrained non-negative matrix factorization method that can simultaneously provide quantitative estimates of both source profiles and source contributions [47]. This study uses the PMF model for precise identification and quantitative analysis of pollution sources. First, the heavy metal concentration data is log-transformed, and an uncertainty matrix is constructed, incorporating detection limits and analytical errors. Through an iterative factorization algorithm, the concentration matrix is decomposed into a source contribution matrix and a source profile matrix.
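As an illustration of the two ingredients named above, the sketch below builds an uncertainty value from a detection limit and an error fraction, and evaluates the PMF objective Q, the sum of squared, uncertainty-scaled residuals of X ≈ G·F. The uncertainty formula is the convention commonly used with the EPA PMF software and is an assumption here, since the paper does not state its exact formula. The function names, the error fraction, and the small matrices are hypothetical.

```python
import math

def pmf_uncertainty(conc, mdl, error_fraction):
    """Uncertainty for one measurement (assumed EPA-PMF-style convention)."""
    if conc <= mdl:
        return 5.0 / 6.0 * mdl
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)

def q_value(x, g, f, u):
    """Q = sum over samples i and species j of ((x_ij - sum_k g_ik f_kj) / u_ij)^2."""
    n_factors = len(f)
    q = 0.0
    for i in range(len(x)):
        for j in range(len(x[0])):
            modelled = sum(g[i][k] * f[k][j] for k in range(n_factors))
            q += ((x[i][j] - modelled) / u[i][j]) ** 2
    return q

# Hypothetical 2-sample, 2-species, 2-factor example (not study data).
g = [[1.0, 0.0], [0.5, 0.5]]        # source contributions (non-negative)
f = [[2.0, 4.0], [6.0, 8.0]]        # source profiles (non-negative)
x = [[2.5, 4.0], [4.0, 6.0]]        # observed concentrations
u = [[0.5, 1.0], [1.0, 1.0]]        # uncertainties
print(q_value(x, g, f, u))

# Uncertainty for a Cd value below the reported detection limit (0.019 ug/g).
print(pmf_uncertainty(0.010, 0.019, error_fraction=0.1))
```

A full PMF run iteratively adjusts G and F under non-negativity constraints to minimise Q; only the evaluation of Q is shown here.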
In the process of source identification, a multi-index decision system based on element enrichment characteristics and ratio relationships is established. For natural sources, the normalized concentrations of chromium and arsenic must both be greater than 0.15, with the chromium-arsenic ratio maintained within the typical range of 5−15. For industrial sources, at least two of the elements lead, cadmium, and mercury must have normalized concentrations greater than 0.1, and the lead-cadmium ratio should be between 50 and 200. For agricultural sources, a multi-index weighted scoring system is used, which comprehensively considers factors such as cadmium concentration, arsenic concentration, cadmium-arsenic co-enrichment characteristics, lead concentration range, mercury concentration levels, and the cadmium-lead ratio. The source is identified when the total score exceeds a threshold. For traffic sources, the normalized concentrations of lead and chromium must both be greater than 0.08, and the lead-chromium ratio should be maintained within the range of 0.3–0.8. Finally, the percentage contributions of each pollution source to the sample concentration are calculated through normalization, achieving a quantitative allocation of pollution sources.
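The threshold rules for natural, industrial, and traffic sources can be written down directly, as in the following sketch. The agricultural rule is omitted because its weights and score threshold are not given in the text. The function names and the example profiles are hypothetical; each profile is a dictionary of normalized concentrations keyed by element symbol.

```python
def is_natural(p):
    """Cr and As both > 0.15, and Cr/As within 5-15."""
    return p["Cr"] > 0.15 and p["As"] > 0.15 and 5 <= p["Cr"] / p["As"] <= 15

def is_industrial(p):
    """At least two of Pb, Cd, Hg > 0.1, and Pb/Cd within 50-200."""
    enriched = sum(1 for el in ("Pb", "Cd", "Hg") if p[el] > 0.1)
    if enriched < 2 or p["Cd"] <= 0:
        return False  # guard against division by zero
    return 50 <= p["Pb"] / p["Cd"] <= 200

def is_traffic(p):
    """Pb and Cr both > 0.08, and Pb/Cr within 0.3-0.8."""
    return p["Pb"] > 0.08 and p["Cr"] > 0.08 and 0.3 <= p["Pb"] / p["Cr"] <= 0.8

def to_percent(contributions):
    """Normalize raw source contributions so they sum to 100%."""
    total = sum(contributions)
    return [100.0 * c / total for c in contributions]

# Hypothetical factor profiles (normalized concentrations), not study results.
traffic_like = {"Cr": 0.50, "Pb": 0.20, "Cd": 0.05, "Hg": 0.05, "As": 0.20}
industrial_like = {"Cr": 0.04, "Pb": 0.80, "Cd": 0.01, "Hg": 0.15, "As": 0.00}
natural_like = {"Cr": 0.82, "Pb": 0.01, "Cd": 0.00, "Hg": 0.01, "As": 0.16}

print(is_traffic(traffic_like), is_industrial(industrial_like), is_natural(natural_like))
print(to_percent([2.0, 1.0, 1.0]))
```

Because the rules are checked independently, a profile could in principle satisfy more than one of them; how such ties are resolved is not described in the text.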
The finite mixture model (FMM).
The Finite Mixture Model (FMM) is implemented using the Gaussian Mixture Model (GMM) for probabilistic clustering analysis of samples [48]. The principal component scores obtained from the previous PCA analysis are used as input features. The optimal number of clusters is determined through the Bayesian Information Criterion (BIC) and silhouette coefficient, with the Expectation-Maximization (EM) algorithm employed to estimate the parameters of each Gaussian distribution.
In terms of interpreting the clustering results, the heavy metal composition characteristics of each cluster center are compared with the source profiles obtained from Positive Matrix Factorization (PMF). A cosine similarity measure is used, and clusters with a similarity above 0.7 are identified as corresponding to specific pollution source types. This method provides each sample with the posterior probability of belonging to each cluster, enabling probabilistic source contribution assessment.
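The matching step can be sketched as follows. Each cluster centre is compared with every source profile, and the best match is accepted only if its cosine similarity exceeds 0.7. The function names and all vectors are hypothetical (element order Cr, Pb, Cd, Hg, As); only the 0.7 threshold comes from the text.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_clusters(centres, profiles, threshold=0.7):
    """Map each cluster to its most similar source, or None if no
    similarity exceeds the threshold."""
    matches = {}
    for cluster, centre in centres.items():
        best_source, best_sim = None, threshold
        for source, profile in profiles.items():
            sim = cosine_similarity(centre, profile)
            if sim > best_sim:
                best_source, best_sim = source, sim
        matches[cluster] = best_source
    return matches

# Hypothetical profiles and cluster centres, order: Cr, Pb, Cd, Hg, As.
profiles = {
    "mobile": [0.62, 0.10, 0.08, 0.05, 0.15],
    "industrial": [0.03, 0.87, 0.03, 0.05, 0.02],
    "agricultural": [0.01, 0.01, 0.005, 0.005, 0.97],
}
centres = {
    "cluster_0": [0.55, 0.15, 0.05, 0.05, 0.20],
    "cluster_1": [0.20, 0.20, 0.20, 0.20, 0.20],
}
print(match_clusters(centres, profiles))
```

A cluster with no profile above the threshold is left unassigned rather than forced onto the nearest source.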
Results
Descriptive statistical analysis of heavy metal elements in agricultural soils
Descriptive statistics were performed on the concentrations of five heavy metals in 877 soil samples from the study area to obtain the minimum, maximum, mean, standard deviation, and coefficient of variation for each metal (see Table 1). The data revealed that the highest Cr concentration was 90.8 times greater than the lowest, with a mean value of 65.17 mg/kg. The highest Pb concentration was 77.7 times greater than the lowest, with a mean value of 35.08 mg/kg. The highest Cd concentration was 156 times greater than the lowest, with a mean value of 0.18 mg/kg. The highest Hg concentration was 291 times greater than the lowest, with a mean value of 0.10 mg/kg. The highest As concentration was 79.5 times greater than the lowest, with a mean value of 5.44 mg/kg. When compared with the background values of the soil environment in Zhejiang Province and national secondary standards, the average concentrations of the five heavy metals were lower than the environmental background values and the national standards. However, the highest concentrations exceeded these values. Specifically, the maximum concentrations of Cr, Pb, Cd, Hg, and As were 3.96, 11.73, 7.42, 12.65, and 7.29 times higher than the environmental background values for soil in Zhejiang Province, respectively. This suggests that the concentrations of these elements in the study area have significantly exceeded the natural levels and that soil quality at certain sample sites is compromised, likely due to industrial development and other factors.
The coefficient of variation (CoV) indicates the degree of variability across sample points. A CoV of less than 20% represents low variability, 20%−50% indicates medium variability, 50%−100% denotes high variability, and over 100% reflects extreme variability [49]. The data showed that Hg exhibited extreme variability, while Cr, Pb, Cd, and As exhibited high variability. This indicates that the heavy metal concentrations in the study area are highly variable and discontinuous, likely due to the influence of external factors, suggesting that human activities are the primary source of pollution for these heavy metals.
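The classification above is simple to express in code. The sketch below computes the CoV as the sample standard deviation divided by the mean and assigns the variability class. The function names and sample values are hypothetical, and the treatment of the shared boundaries (exactly 50% counted as high, exactly 100% counted as high) is an assumption, since the text does not specify it.

```python
import statistics

def cov_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def variability_class(cov):
    """Classify a CoV (in percent) using the thresholds given in the text."""
    if cov < 20:
        return "low"
    if cov < 50:
        return "medium"
    if cov <= 100:
        return "high"
    return "extreme"

# Hypothetical concentrations (mg/kg), not data from the study.
sample = [0.05, 0.08, 0.10, 0.12, 0.40]
cov = cov_percent(sample)
print(round(cov, 1), variability_class(cov))
```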
Spatial distribution of heavy metal contamination in agricultural soils
Spatial variation characteristic functions.
The nugget effect is a key indicator in geostatistics for measuring the randomness of spatial variability. A higher value suggests that the spatial distribution of heavy metals is predominantly influenced by localized, stochastic anthropogenic activities such as industrial emissions, transportation sources, or agricultural fertilization. Conversely, a lower value reflects the dominant role of structural factors like natural soil-forming processes. By comparing the parameters of each fitting test, the best-fitting models for the five heavy metal elements and their associated parameters were determined (see Table 2). These parameters reflect the variability characteristics of heavy metal content in soil. By comparing and analysing these parameters, it is possible to theoretically understand the spatial distribution characteristics of soil heavy metals [50,52,53].
The nugget value (C0) represents a type of variation not caused by the spacing of sampling points. It is random and reflects spatial variation influenced by random factors (e.g., socio-economic factors). The abutment value (C1 + C0), usually called the sill in geostatistics, refers to the plateau value of the semi-variance observed at different sampling spacings. It reflects spatial variation caused by natural factors (e.g., soil-forming parent material, topography) and socio-economic factors (e.g., fertiliser application, cropping systems). This value includes both random and structural variability. The ratio of the nugget value (C0) to the abutment value (C1 + C0) is an important indicator of the degree of spatial variability in regional variables, commonly referred to as the nugget effect. This ratio helps to determine whether natural (structural) or anthropogenic (stochastic) factors primarily influence spatial variation. When C0/(C1 + C0) < 25%, it suggests that spatial variation is dominated by structural factors, indicating a strong spatial correlation primarily controlled by natural factors with minimal human influence. When 25% ≤ C0/(C1 + C0) ≤ 75%, it indicates moderate spatial correlation. When C0/(C1 + C0) > 75%, it indicates that spatial variation is primarily random, resulting in weak spatial correlation influenced more by human factors. The data reveal the following nugget effects for the five heavy metals, from smallest to largest: Cr (21.54%), As (128.36%), Pb (147.23%), Cd (343.76%), and Hg (594.25%). Only Cr has a nugget effect of less than 25%, while the nugget effects of the remaining four metals exceed 75%.
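The ratio and its three classes can be expressed as a short sketch. The function names and the example values of C0 and C1 are hypothetical; only the 25% and 75% thresholds come from the text.

```python
def nugget_ratio(c0, c1):
    """Nugget-to-sill ratio C0 / (C0 + C1), in percent.
    With c0 >= 0 and c1 >= 0 the result lies between 0 and 100."""
    if c0 < 0 or c1 < 0 or c0 + c1 == 0:
        raise ValueError("c0 and c1 must be non-negative and not both zero")
    return 100.0 * c0 / (c0 + c1)

def spatial_dependence(ratio_percent):
    """Classify spatial correlation from the nugget-to-sill ratio."""
    if ratio_percent < 25:
        return "strong (structural factors dominate)"
    if ratio_percent <= 75:
        return "moderate"
    return "weak (random factors dominate)"

# Hypothetical semivariogram parameters, not values from Table 2.
for c0, c1 in [(1.0, 9.0), (1.0, 1.0), (9.0, 1.0)]:
    ratio = nugget_ratio(c0, c1)
    print(ratio, spatial_dependence(ratio))
```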
The maximum correlation distance, or range, refers to the distance at which the variability function reaches the abutment value, indicating the spatial autocorrelation range of the elements. Changes in this range reflect shifts in the primary variability process of the soil elements. A larger range suggests stronger homogeneity of the element in the soil, while a smaller range indicates weaker homogeneity, with more pronounced local variations and a more complex overall distribution. As shown in Table 2, the ranges for the five heavy metals (Cr, Pb, Cd, Hg, and As) are all small. This suggests that short-range, local variation in the distribution of these metals should not be overlooked.
Spatial distribution characteristics of heavy metal pollution in agricultural soil.
Using the geostatistical analysis function of the extended ArcGIS module (Geostatistical Analyst), the Ordinary Kriging interpolation method was applied to establish the spatial variability pattern of heavy metal elements in the soil (Fig 2). In the figure, the heavy metal content is divided into ten levels.
As shown in Fig 2, soil heavy metal content in the study area is generally low in the south and high in the north. The high-value areas of Cr appear in Shengzhou City and Xinchang County, an area listed among the country's top 100 industrial counties (cities) in 2018, which suggests that individual chemical enterprises in the region produce relatively serious point-source pollution. The high-value areas of Pb are more widely distributed, covering almost the entire eastern part of Zhejiang Province. Because Pb is attributed mainly to automobile exhaust, which is related to local population density and the number of fuel vehicles, the government should introduce policies that encourage the use of public transportation and new energy vehicles. The high-value areas of Cd are found in Yuyao, Shangyu, Shaoxing, and Tiantai, and those of Hg and As are found in Yuyao, Shangyu, and Shaoxing. These counties (cities), notably Yuyao City, Shangyu City, Shaoxing County, and Zhuji City, are located in the Ningshao Plain, which is one of the important grain, cotton, hemp, and freshwater fish production areas in Zhejiang Province. This suggests that the application of chemical fertilizers and pesticides remains high in agricultural activities in the Ningshao Plain, which needs further greening and ecological transformation.
Tracing heavy metal pollution in the soil
Preliminary identification of pollution sources based on PCA.
The correlation between the contents of the five heavy metals was analysed using the Pearson correlation coefficient, as shown in Table 3. There is a significant (P < 0.05) or highly significant (P < 0.01) positive correlation between Pb-Cd, Pb-Hg, Cd-Hg, and Hg-As, indicating that Pb, Cd, Hg, and As are homologous, that is, likely to share common sources.
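For reference, the Pearson coefficient underlying Table 3 can be computed as in the following sketch. The function name and the paired concentrations are hypothetical, and the significance test (P value) is not included.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired concentrations (mg/kg) at six sites, not study data.
pb = [28.0, 31.5, 35.0, 40.2, 44.8, 52.1]
cd = [0.12, 0.15, 0.14, 0.20, 0.22, 0.27]
print(round(pearson_r(pb, cd), 3))
```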
PCA, as a classical multivariate statistical dimensionality reduction technique, effectively identifies correlation patterns among heavy metal elements and provides a crucial basis for the preliminary identification of pollution sources. The results shown in Fig 3 reveal that the eigenvalues of the first three principal components are 1.66, 1.31, and 0.91, respectively. The first two components have eigenvalues greater than 1, satisfying the Kaiser criterion [54]. Although the eigenvalue of the third component is slightly below the Kaiser criterion threshold, it still holds some explanatory significance. Considering the cumulative variance contribution and the actual complexity of pollution sources, three principal components were selected for subsequent analysis, with a cumulative variance contribution rate of 77.7%.
Fig 4 further illustrates the distribution characteristics of the samples in the reduced-dimensional space. The PC1-PC2 score plot shows a relatively concentrated distribution of the samples, indicating that the soil heavy metal pollution in the study area exhibits a certain homogeneity. The PC1-PC3 and PC2-PC3 score plots demonstrate the spatial differentiation patterns of the samples under different combinations of principal components, providing spatial references for identifying the influence range of different pollution sources.
Fig 5 shows the results of the principal component analysis. The PC1 explains 33.1% of the total variance, with Pb (0.66) and Hg (0.52) showing high positive loadings, while Cr (−0.39) exhibits a negative loading. The high loading of Pb and Hg suggests that this component mainly reflects industrial pollution sources, particularly those arising from industrial processes such as non-ferrous metal smelting, coal-fired power generation, and chemical production.
The PC2 explains 26.3% of the total variance, with Cr (0.60) and As (0.69) displaying significant positive loadings. The co-enrichment pattern of Cr and As primarily reflects mobile source pollution, with Cr mainly originating from vehicle exhaust emissions and road wear, while As is associated with gasoline combustion and tire wear.
The PC3 explains 18.3% of the total variance, with Cd (0.84) exhibiting a very high loading, while As (−0.37) shows a negative loading. The high loading of Cd suggests its primary source is agricultural activities, particularly the use of phosphate fertilizers, pesticides, and livestock farming.
In summary, the PCA analysis preliminarily identifies three major pollution sources in the study area: industrial pollution, mobile source pollution, and agricultural pollution. This result lays a significant foundation for subsequent precise source apportionment analysis.
Pollution source apportionment and quantification based on PMF.
PMF, an advanced receptor model, can quantitatively identify pollution sources and estimate the contribution of each source to pollutant concentrations at receptor sites. Based on the three pollution sources identified by the preliminary PCA, this study used the PMF model for source apportionment and quantitative analysis of soil heavy metal pollution in the study area. As shown in Fig 6, the PMF source apportionment results were highly consistent with the sources preliminarily identified by PCA. The model resolved three pollution sources with clear physical significance. Among them, the mobile source was characterized primarily by Cr, with a standardized concentration of 0.62, indicating that the mobile source is the main contributor to Cr pollution in the soil. Mobile source pollution primarily originates from traffic-related activities such as vehicle exhaust emissions, tire wear, brake wear, and road dust. The enrichment of Cr in the mobile source reflects the widespread use of automotive catalytic converters, engine component wear, and chromium-containing alloy materials [55,56].
The industrial source exhibited very high enrichment of Pb, with a standardized concentration of 0.87, while Hg made a minor contribution (standardized concentration of 0.05). This combination typically reflects the composite pollution of industrial activities, mainly non-ferrous metal smelting, coal-fired power generation, chemical production, and other high-temperature industrial processes. The high loading of Pb indicates that the industrial source is the dominant factor for soil Pb pollution, while the presence of Hg is consistent with a contribution from high-temperature industrial processes such as coal combustion [57,58].
The agricultural source was dominated by As, with a standardized concentration of 0.97, a typical single-element signature. The extreme enrichment of As in the agricultural source is mainly related to the application of phosphate fertilizers, as phosphate ores naturally contain high concentrations of arsenic. Additionally, historically used arsenic-containing pesticides and the application of livestock manure may also contribute to soil arsenic pollution [59,60].
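The factor profiles described above come from a non-negative factorization of the concentration matrix. The following is a minimal sketch of that idea, not the solver used in this study: it swaps in weighted multiplicative updates in place of the Multilinear Engine used by the EPA PMF software, and the function name `pmf_sketch`, the synthetic data, and the uncertainty formula `0.1 * X + 0.01` are all assumptions made for illustration.

```python
import numpy as np

def pmf_sketch(X, U, n_factors=3, n_iter=500, seed=0):
    """Weighted non-negative factorization X ~ G @ F.

    Minimizes Q = sum(((X - G @ F) / U) ** 2) with G, F >= 0 using
    weighted multiplicative updates (an assumed stand-in for ME-2).
    """
    X = np.asarray(X, dtype=float)
    W = 1.0 / np.asarray(U, dtype=float) ** 2      # weights from uncertainties
    rng = np.random.default_rng(seed)
    n, m = X.shape
    G = rng.uniform(0.1, 1.0, size=(n, n_factors))  # source contributions
    F = rng.uniform(0.1, 1.0, size=(n_factors, m))  # source profiles
    eps = 1e-12                                     # avoids division by zero
    q_history = []
    for _ in range(n_iter):
        G *= ((W * X) @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ (W * X)) / (G.T @ (W * (G @ F)) + eps)
        q_history.append(float(np.sum(W * (X - G @ F) ** 2)))
    return G, F, q_history

# Synthetic, non-negative stand-in data: 60 samples, 5 elements, 3 sources.
rng = np.random.default_rng(1)
true_G = rng.uniform(0.0, 1.0, size=(60, 3))
true_F = rng.uniform(0.0, 1.0, size=(3, 5))
X_demo = (true_G @ true_F) * (1.0 + 0.05 * rng.standard_normal((60, 5)))
X_demo = np.clip(X_demo, 1e-6, None)
U_demo = 0.1 * X_demo + 0.01                        # hypothetical uncertainties
G_hat, F_hat, q_hist = pmf_sketch(X_demo, U_demo)
print(q_hist[0], q_hist[-1])                        # Q before and after fitting
```

Each row of `F_hat` plays the role of one source profile; normalizing an element's column across the rows gives the kind of relative values quoted for Cr, Pb, and As above.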
The quantitative PMF results also indicated significant differences in the relative contributions of the three pollution sources to soil heavy metal pollution in the study area. The mobile source contributed the most, accounting for 52.5% of the total pollution load, making it the primary source of soil heavy metal pollution in the study area. This result reflects the significant impact of a highly developed transportation network and continuously growing vehicle ownership on soil environmental quality in Zhejiang Province, an economically developed region. The high contribution of mobile sources suggests that soil environmental supervision should focus on traffic-intensive areas.
The industrial source contributed the second-highest proportion, accounting for 30.4% of the total pollution load, reflecting the important impact of industrial activities on soil heavy metal pollution. Zhejiang is a major manufacturing province with densely distributed industrial enterprises, and the development of its heavy chemical and metal processing industries in particular has exerted a continuous cumulative impact on the soil environment.
The agricultural source contributed the least, accounting for 17.1% of the total pollution load, mainly reflecting the contribution of fertilizer and pesticide application to the accumulation of heavy metals in soil. Although this proportion is relatively low, the study focuses on agricultural soils, so pollution from agricultural sources still requires sufficient attention.
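As a hedged illustration of how percentage contributions can be derived from a fitted factorization, the sketch below attributes the total reconstructed mass to each factor. This is only one common convention; the study does not state here which convention produced the 52.5%, 30.4%, and 17.1% values, and the function name `source_shares` and the toy matrices are hypothetical. Summing raw concentrations lets abundant elements dominate, so per-element normalization is often applied first.

```python
import numpy as np

def source_shares(G, F):
    """Percent of total reconstructed mass attributed to each factor."""
    G = np.asarray(G, dtype=float)   # (samples x factors) contributions
    F = np.asarray(F, dtype=float)   # (factors x elements) profiles
    # Mass from factor k over all samples i and elements j:
    # sum_i sum_j G[i, k] * F[k, j] = (column sum of G) * (row sum of F).
    mass = G.sum(axis=0) * F.sum(axis=1)
    return 100.0 * mass / mass.sum()

# Toy example with two factors and two elements.
G_toy = np.array([[1.0, 0.0],
                  [1.0, 2.0]])
F_toy = np.array([[1.0, 1.0],
                  [0.0, 3.0]])
shares = source_shares(G_toy, F_toy)
print(shares)

# The shares reported in the text form a complete partition of the load.
reported = [52.5, 30.4, 17.1]
print(round(sum(reported), 1))
```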
Probabilistic clustering of pollution sources based on PCA-FMM.
Building upon the prior PCA dimensionality reduction results, this study employs the FMM method to conduct probabilistic clustering analysis in the principal component space, aiming to reveal the spatial distribution patterns of soil samples under the influence of different pollution sources. As shown in Fig 7, the FMM probabilistic clustering analysis successfully identified three statistically significant clusters in the PC1–PC2 principal component space. The probability ellipse boundaries of each cluster are distinct, indicating clear statistical demarcations among regions influenced by different pollution sources, thereby providing a quantitative basis for spatial management of pollution sources.
Cluster 1 is primarily distributed in the middle-lower region of the PC1–PC2 space, with a centroid position of approximately (–0.5, –0.8). This cluster contains the largest number of samples and exhibits a relatively concentrated elliptical distribution pattern. Based on the loading characteristics of PC1 and PC2, this cluster corresponds to regions dominated by mobile source pollution, mainly reflecting the impact of transportation activities on heavy metal contamination in soil.
Cluster 2 is distributed in the upper-right region of the PC1–PC2 space, with a centroid position of approximately (1.8, 0.6), displaying relatively dispersed distribution characteristics. This cluster corresponds to regions dominated by industrial source pollution, reflecting the spatially heterogeneous influence of industrial activities. These areas are mainly concentrated around industrial parks, smelting enterprises, and historically contaminated industrial sites, with their spatial dispersion reflecting the characteristics of industrial point source pollution.
Cluster 3 is primarily distributed in the left region of the PC1–PC2 space, with a centroid position of approximately (–2.2, 1.0), exhibiting a moderate degree of spatial aggregation. This cluster corresponds to regions dominated by agricultural source pollution, mainly reflecting the contribution of agricultural activities to heavy metal contamination in soil. These areas typically include intensive agricultural production zones, facility agricultural areas, and farmlands with long-term phosphate fertilizer application.
Crucially, the FMM clustering results correspond closely to the PMF source apportionment results: the distribution of the three clusters in the principal component space aligns with the three types of pollution sources identified by PMF, supporting the spatial rationality of the source apportionment. Cluster 1 has the largest number of samples, corresponding to the highest contribution rate (52.5%) from mobile sources in the PMF results, while the sample proportions of Cluster 2 and Cluster 3 are also consistent with the contribution rates of industrial sources (30.4%) and agricultural sources (17.1%), respectively. The clear probability ellipse boundaries of each cluster indicate well-defined statistical limits for regions influenced by different pollution sources.
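The probabilistic clustering described in this subsection can be sketched as a Gaussian mixture fitted by expectation-maximization in a two-dimensional score space. This is a minimal sketch under stated assumptions: the synthetic points are generated around the approximate centroids quoted above with an arbitrary spread and equal cluster sizes, so they are not the study's PC scores, and `fit_gmm` and `gaussian_pdf` are hypothetical helper names.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)) / norm

def fit_gmm(X, k=3, n_iter=200, seed=0, reg=1e-6):
    """Fit a k-component Gaussian mixture with plain EM."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    means = X[rng.choice(n, size=k, replace=False)]   # random data points
    covs = np.array([np.cov(X.T) + reg * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(k)]
        )
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: update mixing weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs, resp

# Synthetic scores around the approximate centroids quoted in the text.
rng = np.random.default_rng(42)
centers = np.array([[-0.5, -0.8], [1.8, 0.6], [-2.2, 1.0]])
scores = np.vstack([c + 0.4 * rng.standard_normal((200, 2)) for c in centers])
weights, means, covs, resp = fit_gmm(scores, k=3)
labels = resp.argmax(axis=1)          # hard assignment from soft memberships
print(np.round(weights * 100, 1))     # cluster shares in percent
```

The returned `resp` matrix holds the membership probabilities from the final E-step; the fitted `covs` define the probability ellipses, and `weights` are the cluster proportions that the text compares with the PMF contribution rates.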
Discussion
Comparison and validation of multi-model results
As shown in Fig 8, the PMF and FMM methods demonstrate strong consistency in quantifying the contributions of the three pollution source categories, though some methodological differences exist. The contributions of mobile and industrial sources show minor differences, with PMF results slightly higher than those of FMM. In contrast, the agricultural source contribution exhibits the largest discrepancy, with PMF estimating 17.1% and FMM estimating 27.0%, a difference of 9.9 percentage points.
These differences can be attributed to the distinct algorithmic principles of the two methods. PMF is highly sensitive to concentration variations of source-specific elements, enabling precise capture of subtle changes in source contributions. In comparison, FMM focuses more on the distribution patterns of samples in multidimensional space and demonstrates stronger robustness to outliers and noise. PMF, based on factorization principles, directly decomposes source contributions by minimizing an objective function, while FMM, grounded in probabilistic statistics, identifies sample cluster distributions using Gaussian mixture models. The differing mathematical foundations of the two algorithms lead to variations when addressing complex pollution source mixtures [61].
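The contrast between the two formulations can be written compactly. The notation below is the standard textbook form and is an assumption here, since it may differ from the symbols used in the Methods: x_ij is the concentration of element j in sample i, u_ij its uncertainty, g_ik and f_kj the source contributions and profiles, and z a sample's vector of principal component scores.

```latex
% PMF: non-negative factorization that minimizes the uncertainty-weighted residual Q
\begin{align}
x_{ij} &= \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}, \qquad g_{ik} \ge 0,\; f_{kj} \ge 0, \\
Q &= \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \frac{e_{ij}}{u_{ij}} \right)^{2}.
\end{align}

% FMM: Gaussian mixture density whose parameters are estimated by maximum likelihood (EM)
\begin{align}
p(\mathbf{z}) &= \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(\mathbf{z} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
\qquad \pi_k \ge 0,\; \sum_{k=1}^{K} \pi_k = 1.
\end{align}
```

PMF therefore apportions each measured concentration among sources, whereas the mixture model assigns each sample a membership probability for every cluster, which is one reason the two sets of percentages need not coincide.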
Overall, the pollution source identification results obtained from these two independent methods are highly consistent, with a correlation coefficient of 0.943, validating the reliability of the source apportionment results. The quantitative precision of PMF and the spatial distribution analysis of FMM complement each other, providing a more comprehensive and robust scientific basis for pollution source analysis. The consistency observed despite differing algorithmic principles indicates that the identified three pollution source categories possess clear physical meaning and statistical significance.
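For readers who wish to run this kind of agreement check on their own results, a minimal sketch of a Pearson correlation is given below. The PMF shares and the 27.0% agricultural FMM share are taken from the text; the other two FMM values are placeholders chosen only so that the list sums to 100, and the sketch does not attempt to reproduce the reported coefficient of 0.943, whose exact inputs are not specified here. The function name `pearson_r` is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Order: mobile, industrial, agricultural.
pmf_shares = [52.5, 30.4, 17.1]        # from the text
# PLACEHOLDERS: only the agricultural value (27.0) comes from the text.
fmm_shares_placeholder = [48.0, 25.0, 27.0]

r_demo = pearson_r(pmf_shares, fmm_shares_placeholder)
agri_gap = fmm_shares_placeholder[2] - pmf_shares[2]   # percentage points
print(round(r_demo, 3), round(agri_gap, 1))
```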
Implications for soil pollution remediation
Based on source apportionment with the combined PCA-PMF-FMM approach, this study provides a scientific basis and practical guidance for remediating heavy metal pollution in agricultural soils in Zhejiang Province. The contribution pattern revealed by PMF, with mobile sources at 52.5%, industrial sources at 30.4%, and agricultural sources at 17.1%, offers important insights for developing differentiated and targeted pollution prevention and control strategies.
For mobile sources: Implement traffic emission reduction measures in identified high-lead contamination areas (particularly the northern region with elevated pollution levels), including: (a) promoting electric vehicle infrastructure in agricultural zones; (b) establishing green buffer zones along major highways traversing farmland; (c) conducting regular monitoring of roadside soils.
For industrial pollution sources: Strengthen oversight of industrial emissions in key pollution zones (Shenzhou and Xinchang counties) through: (a) strict enforcement of wastewater and exhaust emission standards for metallurgical and chemical enterprises; (b) mandatory soil remediation plans for historically polluted industrial zones; (c) establishment of real-time heavy metal emission monitoring systems in industrial parks.
For agricultural pollution sources: Promote sustainable agricultural practices in the Ning-Shao Plain region, specifically including: (a) reducing phosphorus fertilizer application through precision agriculture technologies; (b) implementing organic agriculture certification programs with heavy metal testing requirements; (c) developing arsenic-safe agricultural guidelines for affected areas.
By implementing targeted remediation strategies based on pollution source contributions, it is anticipated that further deterioration of heavy metal contamination in agricultural soils across Zhejiang Province can be effectively controlled. This approach will progressively improve soil environmental quality, safeguard the quality and safety of agricultural products, and ensure sustainable agricultural development.
Conclusions
This study employed a combination of analytical methods, including geostatistics, PCA, PMF, and FMM, to systematically investigate the spatial distribution patterns, pollution characteristics, and sources of heavy metals (Cr, Pb, Cd, Hg, As) in agricultural soils in the coastal region of eastern Zhejiang. The results indicate that heavy metal pollution in soils is predominantly anthropogenic, with significant spatial heterogeneity and generally higher pollution levels in the northern part of the study area compared to the south. Source apportionment using the PMF model revealed that mobile sources, industrial sources, and agricultural sources were the primary contributors, with respective contribution rates of 52.5%, 30.4%, and 17.1%. The FMM clustering results further confirmed the spatial differentiation among these pollution sources.
However, this study has certain limitations. First, the source apportionment relied on receptor models and statistical methods without validation against specific industrial enterprise locations and emission inventories, which constrains the precise quantification of source contributions. Second, the analysis was based solely on surface soil samples and did not capture the behavior of heavy metals in terms of vertical migration and bioavailability. Furthermore, the influence of temporal variations on pollution accumulation was not considered, making it difficult to dynamically assess pollution trends.
Future research should focus on the following aspects. First, high-precision industrial emission data should be integrated with spatial information technologies to establish quantitative source–receptor relationships. Second, studies on heavy metal speciation and vertical distribution should be conducted to enhance understanding of their environmental behavior and ecological risks. Third, long-term monitoring networks should be established to elucidate the spatiotemporal evolution of pollution, thereby providing a scientific basis for dynamic regulation and targeted management. Building on these efforts, a soil pollution prevention and control system based on major source identification and zonal management can be further developed to support sustainable regional agricultural development and the maintenance of ecological security.
References
- 1. Ahamad MI, Rehman A, Mehmood MS, Mahmood S, Zafar Z, Lu H, et al. Spatial distribution, ecological and human health risks of potentially toxic elements (PTEs) in river Ravi, Pakistan: a comprehensive study. Environ Res. 2024;263(Pt 3):120205. pmid:39442657
- 2. Lu D, Ou J, Qian J, Xu C, Wang H. Prediction of non-equilibrium transport of nitrate nitrogen from unsaturated soil to saturated aquifer in a watershed: Insights for groundwater quality and pollution risk assessment. J Contam Hydrol. 2025;274:104649. pmid:40554323
- 3. Runsheng H, Yan Z, Wenlong Q, Tianzhu D, Mingzhi W, Feng W. Geology and geochemistry of Zn-Pb(-Ge-Ag) deposits in the Sichuan-Yunnan-Guizhou Triangle area, China: a review and a new type. Front Earth Sci. 2023;11.
- 4. Sarwar A, Ali M, Israr M, Gulzar S, Khan MI, Ali MAS, et al. Mapping annual soil loss in the southeast of Peshawar basin, Pakistan, using RUSLE model with geospatial approach. Geol Ecol Landsc. 2024;9(3):1102–13.
- 5. European Environment Agency (EU body or agency), Arias-Navarro C, Baritz R, Jones A. The state of soils in Europe: fully evidenced, spatially organised assessment of the pressures driving soil degradation. Publications Office of the European Union; 2024. Available from: https://data.europa.eu/doi/10.2760/7007291
- 6. Sharma K, Basta NT, Grewal PS. Soil heavy metal contamination in residential neighborhoods in post-industrial cities and its potential human exposure risk. Urban Ecosyst. 2014;18(1):115–32.
- 7. Barsova N, Yakimenko O, Tolpeshta I, Motuzova G. Current state and dynamics of heavy metal soil pollution in Russian Federation-a review. Environ Pollut. 2019;249:200–7. pmid:30889503
- 8. Kumpiene J, Bert V, Dimitriou I, Eriksson J, Friesl-Hanl W, Galazka R, et al. Selecting chemical and ecotoxicological test batteries for risk assessment of trace element-contaminated soils (phyto)managed by gentle remediation options (GRO). Sci Total Environ. 2014;496:510–22. pmid:25108253
- 9. Wolnik KA, Fricke FL, Capar SG, Meyer MW, Satzger RD, Bonnin E, et al. Elements in major raw agricultural crops in the United States. 3. Cadmium, lead, and eleven other elements in carrots, field corn, onions, rice, spinach, and tomatoes. J Agric Food Chem. 1985;33(5):807–11.
- 10. Panagos P, Jones A, Lugato E, Ballabio C. A soil monitoring law for Europe. Glob Chall. 2025;9(3):2400336. pmid:40071225
- 11. Opinions on Further Strengthening the Prevention and Control of Heavy Metal Pollution. [cited 9 Sep 2025]. Available from: https://www.mee.gov.cn/xxgk2018/xxgk/xxgk03/202203/t20220315_971552.html
- 12. Lado LR, Hengl T, Reuter HI. Heavy metals in European soils: a geostatistical analysis of the FOREGS Geochemical database. Geoderma. 2008;148(2):189–99.
- 13. Su Y, Cui Y-J, Dupla J-C, Canou J. Soil-water retention behaviour of fine/coarse soil mixture with varying coarse grain contents and fine soil dry densities. Can Geotech J. 2022;59(2):291–9.
- 14. Wang L, Zhang Y, Han R, Li X. LA-ICP-MS analyses of trace elements in zoned sphalerite: A study from the Maoping carbonate-hosted Pb-Zn(-Ge) deposit, southwest China. Ore Geol Rev. 2023;157:105468.
- 15. Wu L, Bai X, Li C, Li H, Cao Y, Ran C, et al. Assessment of carbon sinks caused by the chemical weathering of carbonate rocks under the influence of exogenous acids: Methods, progress, and prospects. Sci China Earth Sci. 2025;68(6):1785–804.
- 16. Zhang Q, Wang C. Natural and human factors affect the distribution of soil heavy metal pollution: a review. Water Air Soil Pollut. 2020;231(7).
- 17. K A, S S, A M. Bio-remediation of Pb and Cd polluted soils by switchgrass: A case study in India. PubMed. [cited 9 Sep 2025]. Available from: https://pubmed.ncbi.nlm.nih.gov/26696008/
- 18. Wu X, Zhao Y. A novel heat pulse method in determining “effective” thermal properties in frozen soil. Water Resourc Res. 2024;60(12).
- 19. Liu H, Zhang Y, Yang J, Wang H, Li Y, Shi Y, et al. Quantitative source apportionment, risk assessment and distribution of heavy metals in agricultural soils from southern Shandong Peninsula of China. Sci Total Environ. 2021;767:144879. pmid:33550057
- 20. Wold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics Intell Lab Syst. 1987;2(1–3):37–52.
- 21. Linjin L, Baohui M, Rui P. Water quality evaluation of wenyu river based on single factor evaluation and comprehensive pollution index method. NEPT. 2021;20(3).
- 22. Hu W, Hao W, Liu H, Wang W, Jiao R. Water quality evaluation of Dahai Lake in Inner Mongolia based on improved Nemero pollution index method and principal component analysis. E3S Web Conf. 2023;393:03040.
- 23. Wronski EB, Humphreys N. A method for evaluating the cumulative impact of ground-based logging systems on soils. J For Eng. 1994;5(2):9–20.
- 24. Li Y, Zhou H, Gao B, Xu D. Improved enrichment factor model for correcting and predicting the evaluation of heavy metals in sediments. Sci Total Environ. 2021;755(Pt 1):142437. pmid:33011598
- 25. Comero S, Capitani L, Gawlik B. An introduction to the chemometric evaluation of environmental monitoring data using PMF. 2009. Available from: https://www.semanticscholar.org/paper/An-introduction-to-the-chemometric-evaluation-of-Comero-Capitani/24d37d67993228d05513f877b007a16d9a677ff0
- 26. USEPAO. Positive Matrix Factorization Model for Environmental Data Analyses. 2015 [cited 9 Sep 2025]. Available from: https://www.epa.gov/air-research/positive-matrix-factorization-model-environmental-data-analyses
- 27. Chen L, Ma K. Spatial and temporal distribution and source analysis of heavy metals in agricultural soils of Ningxia, Northwest of China. Sustainability. 2023;15(21):15360.
- 28. Wang S, Zhang Y, Cheng J, Li Y, Li F, Li Y, et al. Pollution assessment and source apportionment of soil heavy metals in a coastal industrial city, Zhejiang, Southeastern China. Int J Environ Res Public Health. 2022;19(6):3335. pmid:35329032
- 29. Shao S, Hu B, Fu Z, Wang J, Lou G, Zhou Y, et al. Source identification and apportionment of trace elements in soils in the Yangtze River Delta, China. Int J Environ Res Public Health. 2018;15(6):1240. pmid:29895746
- 30. Li F, Xiang M, Yu S, Xia F, Li Y, Shi Z. Source identification and apportionment of potential toxic elements in soils in an eastern industrial City, China. Int J Environ Res Public Health. 2022;19(10):6132. pmid:35627668
- 31. Fu L, Mao X, Mao X, Wang J. Evaluation of agricultural sustainable development based on resource use efficiency: empirical evidence from Zhejiang Province, China. Front Environ Sci. 2022;10.
- 32. Hu M, Wang Y, Xia B, Jiao M, Huang G. How to balance ecosystem services and economic benefits? - A case study in the Pearl River Delta, China. J Environ Manage. 2020;271:110917. pmid:32583803
- 33. Shi T, Zhang Y, Gong Y, Ma J, Wei H, Wu X, et al. Status of cadmium accumulation in agricultural soils across China (1975-2016): From temporal and spatial variations to risk assessment. Chemosphere. 2019;230:136–43. pmid:31103859
- 34. Huang Y, Liu Q, Jia W, Yan C, Wang J. Agricultural plastic mulching as a source of microplastics in the terrestrial environment. Environ Pollut. 2020;260:114096. pmid:32041035
- 35. Xl R, Gl Z, Yg Z, Dg Y, Yj W. Distribution and migration of heavy metals in soil profiles by high-resolution sampling. [cited 9 Sep 2025]. Available from: https://pubmed.ncbi.nlm.nih.gov/16850852/
- 36. Juhasz AL, Naidu R. Explosives: fate, dynamics, and ecological impact in terrestrial and marine environments. Rev Environ Contam Toxicol. 2007;191:163–215. pmid:17708075
- 37. Chen Y, Li R. Spatial distribution and type division of traditional villages in Zhejiang Province. Sustainability. 2024;16(12):5262.
- 38. Guo F, Jin J, Yong B, Wang Y, Jiang H. Responses of water use efficiency to phenology in typical subtropical forest ecosystems—A case study in Zhejiang Province. Sci China Earth Sci. 2019;63(1):145–56.
- 39. Jiao W, Fuller A, Xu S, Min Q, Wu M. Socio-ecological adaptation of agricultural heritage systems in modern China: three cases in Qingtian County, Zhejiang Province. Sustainability. 2016;8(12):1260.
- 40. Weihe Z. The Chinese path to rural common prosperity: based on an analysis of Jiangsu and Zhejiang rural demonstration projects. Soc Sci China. 2023;44(3):137–52.
- 41. Esri. “World Street Map” [basemap]. Scale Not Given. “World Street Map”. 2023. [cited 2023 Oct 1] Available from: https://www.arcgis.com/home/item.html?id=de26a3cf4cc9451298ea173c4b324736
- 42. Khan SR, Sharma B, Chawla PA, Bhatia R. Inductively coupled plasma optical emission spectrometry (ICP-OES): a powerful analytical technique for elemental analysis. Food Anal Methods. 2021;15(3):666–88.
- 43. Nr R, Mv T. An overview of recent applications of inductively coupled plasma-mass spectrometry (ICP-MS) in determination of inorganic impurities in drugs and pharmaceuticals. PubMed. 2007. [cited 9 Sep 2025]. Available from: https://pubmed.ncbi.nlm.nih.gov/16891084/
- 44. da Silva MJ, Paim APS, Pimentel MF, Cervera ML, de la Guardia M. Determination of mercury in rice by cold vapor atomic fluorescence spectrometry after microwave-assisted digestion. Anal Chim Acta. 2010;667(1–2):43–8. pmid:20441864
- 45. Chen Y-W, Belzile N. High performance liquid chromatography coupled to atomic fluorescence spectrometry for the speciation of the hydride and chemical vapour-forming elements As, Se, Sb and Hg: a critical review. Anal Chim Acta. 2010;671(1–2):9–26. pmid:20541638
- 46. Seghouane A-K, Shokouhi N, Koch I. Sparse principal component analysis with preserved sparsity pattern. IEEE Trans Image Process. 2019;28(7):3274–85. pmid:30703025
- 47. Paatero P, Hopke PK. Discarding or downweighting high-noise variables in factor analytic models. Anal Chimica Acta. 2003;490(1–2):277–89.
- 48. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Series B: Stat Methodol. 1977;39(1):1–22.
- 49. Lv J, Wang Y. Multi-scale analysis of heavy metals sources in soils of Jiangsu Coast, Eastern China. Chemosphere. 2018;212:964–73.
- 50. Yadav MBN, Patil PL, Hebbara M. Assessment of soil erosion risk in a hilly zone sub-watershed of Karnataka using geospatial technologies and the RUSLE model. Geol Ecol Landsc. 2024;9(3):1087–101.
- 51. Sun L, Guo D, Liu K, Meng H, Zheng Y, Yuan F, et al. Levels, sources, and spatial distribution of heavy metals in soils from a typical coal industrial city of Tangshan, China. CATENA. 2019;175:101–9.
- 52. Wei Z, Miao L, Peng J, Zhao T, Meng L, Lu H, et al. Bridging spatio-temporal discontinuities in global soil moisture mapping by coupling physics in deep learning. Remote Sens Environ. 2024;313:114371.
- 53. Zhang Y, Wu X. Global space-time patterns of sub-daily extreme precipitation and its relationship with temperature and wind speed. Environ Res Lett. 2025;20(8):084019.
- 54. Kaiser HF. The application of electronic computers to factor analysis. Educ Psychol Measure. 1960;20(1):141–51.
- 55. Harrison RM, Allan J, Carruthers D, Heal MR, Lewis AC, Marner B, et al. Non-exhaust vehicle emissions of particulate matter and VOC from road traffic: a review. Atmospheric Environ. 2021;262:118592.
- 56. Ravindra K, Bencs L, Van Grieken R. Platinum group elements in the environment and their health risk. Sci Total Environ. 2004;318(1–3):1–43. pmid:14654273
- 57. Meng W, Wang Z, Hu B, Wang Z, Li H, Goodman RC. Heavy metals in soil and plants after long-term sewage irrigation at Tianjin China: a case study assessment. Agric Water Manage. 2016;171:153–61.
- 58. Gawley DJ, Timberlake W, Lucas GA. Schedule constraint on the average drink burst and the regulation of wheel running and drinking in rats. J Exp Psychol Anim Behav Process. 1986;12(1):78–94.
- 59. Nriagu JO, Bhattacharya P, Mukherjee AB, Bundschuh J, Zevenhoven R, Loeppert RH. Arsenic in soil and groundwater: an overview. Trace Metals and other Contaminants in the Environment. Elsevier; 2007. pp. 3–60.
- 60. Ali D. Heavy Metals Toxicity in Agricultural Soil By Fertilisers & Pesticides. In: Chemspectro [Internet]. 2025 [cited 9 Sep 2025]. Available from: https://www.chemspectro.com/heavy-metals-toxicity-in-agricultural-soil-due-to-fertilizers-pesticides-ecological-risks-human-health-implications/
- 61. Xu B, Xu H, Zhao H, Gao J, Liang D, Li Y, et al. Source apportionment of fine particulate matter at a megacity in China, using an improved regularization supervised PMF model. Sci Total Environ. 2023;879:163198. pmid:37004775