Evaluation of Soil Contamination Indices in a Mining Area of Jiangxi, China

There is currently a wide variety of methods used to evaluate soil contamination. We present a discussion of the advantages and limitations of different soil contamination assessment methods. In this study, we analyzed seven trace elements (As, Cd, Cr, Cu, Hg, Pb, and Zn) that are indicators of soil contamination in Dexing, a city in China famous for its vast nonferrous mineral resources, using enrichment factor (EF), geoaccumulation index (Igeo), pollution index (PI), and principal component analysis (PCA). The three contamination indices and the PCA results were then mapped to understand the status and trends of soil contamination in this region. The entire study area is strongly enriched in Cd, Cu, Pb, and Zn, especially in areas near mine sites. As and Hg were also present in high concentrations in urban areas. Results indicated that Cr in this area originated from both anthropogenic and natural sources. PCA combined with Geographic Information System (GIS) was successfully used to discriminate between natural and anthropogenic trace metals.


Introduction
Environmental issues that pose a threat to soil health include erosion, a decline in organic matter content and biodiversity, contamination, sealing, compaction, salinization, and landslides [1]. In China, contamination is recognized as a major threat to soil. In recent years, there have been numerous review and research articles providing assessments of various kinds of soil contamination, including urban soil contamination, agricultural soil contamination, and soil contamination in mining areas [2]. Several studies have also provided a comparison of the results of different methods for the assessment of soil contamination [3][4][5]. Such studies help to raise public awareness of soil contamination and to facilitate research on contamination and contamination control strategies. However, the status and trends of soil contamination, especially at regional scales, have not been well described. Knowledge of soil geochemistry is fundamental to assessing soil contamination at the regional scale. One of the most efficient tools for studying environmental geochemistry problems is geographical information system (GIS) based on geostatistical analysis. To our knowledge, maps and comparisons of indices derived from different soil contamination methods are not widely available.
The objective of our work was to determine the origin of trace metals in soils using various indices based on geochemistry mapping, including enrichment factor (EF), geoaccumulation index (Igeo), and pollution index (PI), along with principal component analysis (PCA); we also aimed to critically evaluate the advantages and limitations of these methods. The data we used were obtained from a regional geochemical survey carried out in Dexing, a city in China that is famous for its vast nonferrous mineral resources. To better understand the outcome of this work, we first present a brief overview of core issues and problems associated with current soil contamination assessment methods.

Selection of reference values
A major methodological problem in correctly assessing soil contamination is the identification of appropriate reference values for uncontaminated soil conditions, since all quantitative assessment methods rely on reference values of background concentrations [6]. The background, crustal, and regulatory reference values are the reference values commonly used for soil contamination assessment; on theoretical grounds alone, the background value is the most appropriate of these for evaluating soil contamination.
There is some variability in the definition of background. A selection of definitions and relevant terms is presented in Table 1 [7,8,9]. Indiscriminate use of the term "background" to evaluate soil contamination can result in misinterpretations if several flaws are ignored. Reimann and de Caritat critically discuss the definitions and use of background values in environmental geochemistry [10]. Some characteristics are summarized here: (1) No specific global background levels of elements can be defined. Natural element concentrations can be as high as or even higher than any visible anthropogenic contamination; it is therefore difficult to identify anthropogenic additions and contamination in most cases. (2) Background levels depend on location and scale, and should usually be restricted to the local scale. It has been demonstrated that background levels may vary both within and between regions. (3) It is more realistic to view background as a range rather than an absolute value. There is a range of values characterizing any particular area or region that reflects the heterogeneity of the environment. (4) It can be argued that natural background no longer exists on this planet. There is evidence from the world's ice sheets and glaciers that small amounts of elements released into the atmosphere by human activities have been transported on intercontinental scales and deposited in remote regions.
Threshold is usually expressed as a single value marking the upper limit of background, that is, the boundary between anomalous and background concentrations, while the baseline, usually expressed as an observed or 95% expected range, is used mainly in geochemical exploration and is not appropriate for environmental purposes. The background values derived from different percentiles of trace metal soil concentrations for some countries are summarized in Table 2 [11][12][13][14][15][16]. The use of a percentile as an upper background limit (threshold) provides a practical way to continue using the term "background". This implies the availability of reliable procedures to evaluate soil contamination, but raises the question of data comparability.
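As a rough illustration of the percentile approach, the following Python sketch derives an upper background limit from a set of concentrations. The sample values, the function name `percentile`, the choice of the 95th percentile, and the linear interpolation rule are assumptions made for illustration; they are not values or procedures taken from the surveys summarized in Table 2.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


# Hypothetical Cd concentrations (mg/kg); not data from this study.
cd_samples = [0.08, 0.10, 0.11, 0.12, 0.13, 0.15, 0.16, 0.18, 0.21, 0.30]

# Assumed convention: the 95th percentile is taken as the upper background limit.
threshold = percentile(cd_samples, 95)
anomalous = [c for c in cd_samples if c > threshold]
print(f"threshold = {threshold:.4f} mg/kg, samples above threshold: {anomalous}")
```

Different percentile conventions (for example the 90th or 97.5th percentile, or a different interpolation rule) give different thresholds from the same data, which is one source of the comparability problem mentioned above.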
When local information is unavailable, and more cannot be obtained, it is necessary to resort to data generated by surveys from different parts of the world covering spatially significant areas (Table 2). The average concentrations of 90 naturally occurring elements in the Earth's crust have been estimated; these are known as "Clarke values" and can be found in Taylor and Wedepohl [17,18]. These two papers summarize published data on the composition of the upper continental crust; the estimates vary slightly because they are hypothetical concentrations based on assumed proportions of various crustal rock types. The concentrations of elements differ so widely from one geologic unit to another that the use of the Clarke value for an element in a regional or local context does not sufficiently represent variations in element distributions caused by mineralization or contamination in a particular sampling medium [19]. However, such values can give a preliminary indication of whether results from a new investigation are within an expected range and whether they reflect natural variations in concentrations present in different environments [10].
The use of regulatory reference values (RRVs), which are generally based on background values in combination with toxicity levels, is a different approach to evaluating soil contamination. An RRV is set by a state authority, and is not always based solely on scientific evidence, but also on economic or political considerations. The RRVs for trace metals in soil of some countries are provided in Table 3 [20][21][22]. RRVs have been given various names in their original languages that translate in English to maximum admissible concentration values, target values, intervention values, guideline values, cut-off values, and many others. Advantages of using screening values have been pointed out by several authors [23,24] and are confirmed in practice by their long-term and successful use in many countries. Advantages include their speed and ease of application, their clarity for use by regulators and other non-specialist stakeholders, and their comparability and transparency. The major limitation of screening values is that crucial site-specific considerations cannot be included. Screening values may give rise to a misleading feeling of certainty, knowledge, and confidence, which can lead to reluctance on the part of users to apply them to site-specific risk assessments [25]. A combined approach, using guideline values to streamline the preliminary stages of decision making and site-specific risk assessment to achieve fine-tuning in later stages of an investigation, is generally considered the most appropriate [26].

Table 1. A selection of definitions of background and relevant terms.

| Definition | Term | Reference |
| --- | --- | --- |
| The normal abundance of an element in barren earth material; it is more realistic to view background as a range rather than an absolute value | Background | [7] |
| Geogenic or pedogenic average concentration of a substance in an examined soil | Background | [8] |
| If the atmosphere in a particular area is polluted by some substance from a particular local source, then the background level of pollution is that concentration which would exist without the local source being present | Background | [9] |
| Widely used to infer background levels reflecting natural processes uninfluenced by human activities | Natural background | [10] |
| Used to describe the unmeasurably perturbed and no longer pristine natural background | Ambient background | [10] |
| Used when data either come from age-dated materials or are collected from areas believed to represent a survey/study area in its supposed pre-industrialization state | Pre-industrial background | [10] |
| The outer limit of background variation | Threshold | [11] |
| A departure from the geochemical patterns that are normal for a given area or geochemical landscape | Anomaly | [7] |
| Concentrations of substances characterizing variability in the geochemistry of the earth's surface materials; needed for documenting the present state of the surface environment and for providing a datum against which any changes can be measured | Baseline | [12] |

Based on the location of a reference area in relation to a study site, reference areas can be classified into two types: on-site and off-site. All the statistically derived references mentioned above are off-site references and are easy to compute. Desaules argued that off-site reference methods are obviously not appropriate to assess weakly contaminated sites, while the specific and sensitive on-site reference method could be used to accurately identify soil contamination based on the observed values of investigated trace metals [6]. An on-site reference is a value specific to a particular material and to a particular locality.
Deep soil layer values are not affected by contamination and are considered to be the most convenient for use as on-site references of the same soil profile [27]. There is debate about the use of deep soil layer values to evaluate soil contamination. The use of deep soil layers, instead of the continental crust, as a reference value improves the sensitivity of EF to anthropogenic surface enrichments [27,28]. In contrast to other authors who have promoted the use of deep soil layer values, Reimann and de Caritat demonstrate that it does not significantly reduce the shortcomings of the EF approach and may even give spurious results based on results from subcontinental-scale geochemical surveys [10].
Other suggestions for on-site references for identifying contamination are buried fossil topsoils, provided the buried soils have not been contaminated or depleted subsequently by pedogenic processes, and dated peat bog samples, which make it possible to trace the chronology of atmospheric deposition [6,29,30]. However, both these types of samples are difficult to obtain.

Indices and methods for the assessment of soil contamination
Popular soil contamination assessment methods can be classified into two categories: quantitative and qualitative. The qualitative methods, such as PCA, factor analysis, and cluster analysis, are inferential and indicative. These multivariate analyses require that each variable shows a normal distribution and that the whole dataset shows a multivariate normal distribution [31]. Some of the most commonly used quantitative methods are the contamination factor (CF), enrichment factor (EF), and geoaccumulation index (Igeo). The CF, defined by Hakanson, enables an assessment of contamination by comparing concentrations in the surface layer of bottom sediments with preindustrial reference levels [32]. In China, the CF was adopted as a pollution index (PI), which is often evaluated by comparing metal concentrations with related environmental guidelines, or with respect to relevant background values. The CF is sometimes used in equivalency to background. The PI will be used in this paper because it has been widely used in soil contamination assessments. EF was introduced in the 1970s, and was initially developed to obtain information on the origin of elements in the atmosphere [33,34]. Igeo, a method for evaluating the degree of contamination in aquatic sediments, was originally defined by Müller and has been widely used in soil trace metal studies [35]. Numerous studies use the above-mentioned factors to assess soil contamination at different scales [36,37], while several studies use a combination of methods [38][39][40].
Care needs to be taken when using the terms 'contamination' and 'pollution'. Contamination is the presence of a substance where it should not be, or in levels that are above background levels [29]. The term pollution is defined as contamination that results in adverse biological effects [29]. In the context of soil systems, the difference between contamination and pollution is that contamination is the presence of a substance in soil adversely affecting the soil, and pollution is the presence of a substance in the soil adversely affecting the usefulness of the soil [41]. The sources of trace metals in soils are manifold, and include natural parent materials and various exogenous pollution sources [42]. Identifying and quantifying anthropogenic trace metals in soil is crucial for the assessment of soil contamination. However, it is difficult to evaluate the degree of soil contamination correctly, especially in slightly disturbed areas. Generally, local hotspots of soil contamination (such as metal smelters and brownfields) are easier to identify and delimit than regional contamination by agrochemicals and atmospheric deposition close to urban or industrial sources, or global contamination by long-range transboundary air contamination [6]. For a number of reasons, no available soil contamination assessment method provides accurate information on the extent of perturbation.

Table 3. Summary of regulatory reference values of trace metals in soil of some countries (mg/kg).
The formation of soil is a function of climate, soil organisms, landscape, plants, time, and geology. All of these factors can affect the concentration of any one element in a soil system. Because different sample materials will respond differently to the input of an element, it is not appropriate to use a single value (e.g., mean, maximum) to evaluate soil contamination of an entire area. There are two methods to describe characteristics of contamination over an entire area: the calculation of the proportion of contaminated samples in a given area, and geochemical mapping. However, the proportion of contaminated samples does not represent the specific geochemical context of each sample or other relevant information, so that the proportion calculated will not reliably provide a complete picture of soil contamination of a given area. Geochemical mapping, usually performed on GIS, provides a visual representation of the geochemical and contamination processes related to the distribution of trace elements. Additionally, most current soil contamination assessment frameworks are limited to potentially toxic inorganic trace metals (As, Cd, Cr, Cu, Hg, Mn, Pb, and Zn); it is important to also consider other important inorganic (F, P, and Se) and organic (PAHs, PCBs, and PCDD/Fs) substances.

Study site description
The study area is located in the northeast part of Jiangxi province (117°00′-118°00′E, 28°50′-29°20′N), China (Figure 1). No specific permissions were required for these locations. The field studies did not involve endangered or protected species. The altitude ranges from 20 to 1300 m, and the climate zone is subtropical monsoon, with an annual average temperature of 17°C and rainfall of 1900 mm. The soils are mainly classified as paddy soil in the plains, and yellow soil and red soil in the hilly areas. The stratigraphic record is largely complete across the study area, with the exception of Silurian, Devonian, and Tertiary strata [43]. The Lean River is the main water body in the study area and has a number of branches, including the Jishui River, the Dawu River, and the Changle River. The area covered by the sampling sites (Figure 1) was approximately 400 km². One sample per 16 km² was collected at sites far from potential contamination sources, and one sample per 4 km² was collected around potential contamination sources, such as the Dexing copper mine and the Leping coal mine. Each sample represents composite material taken from four points over a 1-km² patch of land; total sample weight was 1-1.5 kg. Samples were air dried at 35-40°C prior to analysis. The soil was passed through a 6-mm sieve to remove stones and plant material, milled with a carnelian mortar, and then passed through a 0.015-mm sieve prior to chemical analysis.

Sampling and analyzing
Each soil sample (10-20 mg) was digested in 1 mL of 60% (w/w) HNO3 and 1 mL of 60% (w/w) HClO4 in a stainless steel high-pressure digestion bomb at 140°C for 6 h. After the system had cooled completely, the open vial was transferred to a hot plate (about 190°C) to evaporate the solution until the volume had decreased to several hundred microliters; then 0.5 mL of 49.5% (w/w) HF was added and the sample was evaporated again. The HF treatment was repeated several times until the silicate minerals had been completely dissolved. Finally, the residual solution was diluted to 6 mL with 1% (w/w) HNO3 and filtered through a syringe filter (0.45 µm). Total concentrations of Cu, Pb, Zn, and Cr were analyzed by inductively coupled plasma atomic emission spectroscopy, As and Hg were analyzed by atomic fluorescence spectroscopy, and Cd was analyzed by atomic absorption spectroscopy. The total concentrations of K, Ca, Na, Mg, Si, Al, Mn, Ti, and Fe were determined by wavelength-dispersive X-ray fluorescence spectroscopy. Quality assurance and quality control procedures were performed along with laboratory analyses through the analysis of standard reference materials GSS-1, GSS-2, GSS-3, and GSS-4 soil (National Research Center for Geoanalysis of China). The results showed that the precision and bias of the analysis were generally below 5%. Recoveries of samples spiked with standards ranged from 95 to 105%.

Soil contamination assessment method
The assessment of soil contamination was carried out using EFs, Igeo, and PIs. To enable a comparison of the three indices, the values of the EFs, Igeo, and PIs were calculated using modified formulas based on the equations suggested by Chester and Stoner [33], Müller [35], and Hakanson [32], respectively.
In these formulas, C_n is the concentration of the element in the soil environment, B_n is the background concentration of the element in Jiangxi soil, X_n is the concentration of the reference element in the soil environment, and X_r is the concentration of the reference element in the reference environment. For this study, we used Al2O3 as the reference element. For comparison of the degree of contamination, soil contamination indices were divided into five grades according to their classification criteria (Table 4). The classifications for Igeo and PI were adjusted based on the definitions given by Müller and Hakanson, and the classification of EF was done according to Sutherland [32,35,44].
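The three indices can be sketched in Python as follows. The modified formulas themselves are not reproduced in the text, so this sketch assumes the standard forms usually associated with the cited authors: EF = (C_n / X_n) / (B_n / X_r), Igeo = log2(C_n / (1.5 B_n)), and PI = C_n / B_n. The function names, the factor 1.5, and all numeric inputs are illustrative assumptions and are not values from this study.

```python
import math


def enrichment_factor(c_n, x_n, b_n, x_r):
    """EF: element-to-reference ratio in the sample divided by the same
    ratio in the reference environment (assumed standard form)."""
    return (c_n / x_n) / (b_n / x_r)


def geoaccumulation_index(c_n, b_n, k=1.5):
    """Igeo = log2(C_n / (k * B_n)); k = 1.5 is the customary allowance
    for natural fluctuation of the background (assumed here)."""
    return math.log2(c_n / (k * b_n))


def pollution_index(c_n, b_n):
    """PI: measured concentration divided by the background concentration."""
    return c_n / b_n


# Hypothetical inputs: Cu in mg/kg, Al2O3 (reference element) in percent.
c_n, b_n = 60.0, 20.0   # Cu in the sample and in the regional background
x_n, x_r = 12.0, 15.0   # Al2O3 in the sample and in the reference environment

ef = enrichment_factor(c_n, x_n, b_n, x_r)
igeo = geoaccumulation_index(c_n, b_n)
pi = pollution_index(c_n, b_n)
print(f"EF = {ef:.2f}, Igeo = {igeo:.2f}, PI = {pi:.2f}")
```

Under these assumed forms, Igeo and PI are both functions of the single ratio C_n / B_n, whereas EF also depends on the reference element, so EF can rank samples differently from the other two indices.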

Multivariate statistics analysis
In this study, principal component analysis was conducted to identify the relationship between heavy metals in soil and their potential sources. The two common potential sources were natural (the biogeochemical and physicochemical processes of parent material) and anthropogenic (industrial activity, vehicle-related activity, and fossil energy activity). PCA is designed to reduce a dataset containing a large number of variables to a smaller size by finding a new set of variables called components. In this study, there are 10 element measurements constituting the variables, and hence 10 components. PCA was conducted using the commercial statistics software package SPSS (version 17) for Windows. The assumption of normality for all variables was checked before multivariate statistical and spatial analyses; when necessary, data transformation was done via a Box-Cox transformation.
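The component-extraction step of PCA can be sketched with the Python standard library alone. This is not the SPSS procedure used in the study: it assumes that PCA is run on the correlation matrix, substitutes a simple cyclic Jacobi rotation routine for the eigenvalue computation, and uses a small hypothetical dataset. Components with eigenvalues greater than 1 are retained, and each component's share of total variance is its eigenvalue divided by the number of variables.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long variable columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]


def jacobi_eigenvalues(matrix, sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def explained_variance(eigenvalues):
    """Percentage of total variance carried by each component."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


# Hypothetical data: three variables measured on six samples.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 2.3, 2.9, 4.2, 4.8, 6.1],
    [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
]
eigenvalues = jacobi_eigenvalues(correlation_matrix(data))
retained = [ev for ev in eigenvalues if ev > 1.0]  # eigenvalue > 1 criterion
print(eigenvalues, explained_variance(eigenvalues), len(retained))
```

With 10 standardized variables the eigenvalues sum to 10, so a component explaining 35.71% of total variance corresponds to an eigenvalue of about 3.57 under the correlation-matrix assumption.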

Geostatistical Analysis
The Kolmogorov-Smirnov test (p < 0.05) indicated that the various metals had skewed concentration distributions. Only As and Zn fitted a normal distribution after being logarithmically transformed. A log transformation was conducted prior to the analysis because of the skewed distributions of the heavy metal data.
Ordinary kriging is the most commonly used interpolation method to predict the overall trend of soil pollution. However, for the purpose of identifying contaminated areas, inverse distance weighting (IDW) is more appropriate to predict local features of soil pollution, especially local hotspots and cold spots [45]. IDW is a deterministic spatial interpolation model that is directly related to the values being estimated, and is suited to small datasets for which modeled semi-variograms are very difficult to fit. In the interpolating function, Z(x) is the predicted value at an interpolated point, W_i is the weight assigned to point i, Z_i is the value at a known point, d_i is the distance between point i and the prediction point, and n is the number of known points used in the interpolation. Interpolation mapping was conducted using IDW within ArcGIS 9.30 software.
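A minimal sketch of IDW, written with the symbols defined above, is given here in Python. It assumes the common form Z(x) = sum(W_i * Z_i) / sum(W_i) with W_i = d_i^(-p); the power p = 2 is a widely used default and is an assumption, since the exponent applied in this study is not stated. The function name `idw_predict` and the sample points are hypothetical, and the sketch does not reproduce the ArcGIS implementation (for example, it uses all known points rather than a search neighbourhood).

```python
import math


def idw_predict(x, y, known_points, power=2.0):
    """Predict Z at (x, y) from known (x_i, y_i, z_i) points by inverse
    distance weighting with weights W_i = d_i ** -power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in known_points:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            # The prediction point coincides with a sample: return its value.
            return zi
        w = d ** -power
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical sample locations (km) and Cu concentrations (mg/kg).
samples = [(0.0, 0.0, 10.0), (3.0, 0.0, 40.0)]

z_mid = idw_predict(1.0, 0.0, samples)     # closer to the first sample
z_exact = idw_predict(3.0, 0.0, samples)   # coincides with the second sample
print(z_mid, z_exact)
```

Because nearby samples dominate the weighted average, the interpolated surface honours the measured values at sample locations, which is why IDW preserves local hotspots that a smoother trend estimate might flatten.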

Descriptive statistical analysis
Descriptive statistics of heavy metal concentrations of topsoil are presented in Table 5. The arithmetic mean concentrations of As, Cd, Cr, Cu, Hg, Pb, and Zn were 11.63, 0.24, 72.09, 53.48, 0.10, 47.02, and 87.98 mg/kg, respectively. Wide concentration ranges coupled with the relatively high coefficient of variation (CV) values for metal elements point to an anthropogenic contribution in the study area. In this study, the CV was higher for Cu than for the other metals, and its concentrations varied widely. This suggests that Cu inputs to the soil in the study area may be attributable to anthropogenic sources.
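The coefficient of variation used in this comparison can be computed as in the following Python sketch. It assumes the usual definition, sample standard deviation divided by the arithmetic mean and expressed in percent; the concentrations are hypothetical and are not the values in Table 5.

```python
import statistics


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / arithmetic mean * 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical concentrations (mg/kg); one Cu sample lies near a point source.
cu = [20.0, 30.0, 40.0, 50.0, 160.0]
cr = [60.0, 70.0, 72.0, 75.0, 83.0]

cv_cu = coefficient_of_variation(cu)
cv_cr = coefficient_of_variation(cr)
print(f"CV(Cu) = {cv_cu:.1f}%, CV(Cr) = {cv_cr:.1f}%")
```

A single strongly enriched sample is enough to inflate the CV, so a high CV is consistent with localized inputs but does not by itself identify their source.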
The mean concentrations of all metals, especially Cd and Cu, exceeded the environmental background values for Jiangxi and China [14]. This was probably because of the influence of mining activities in the study area. The maximum concentrations of all heavy metals except Pb exceeded their corresponding guidelines in the Chinese Environmental Quality Standard for Soils [20]. However, the mean concentrations of the metals were lower than the guidelines. The mean concentrations of most metals, except Cd, Cu, and Pb, were lower than the local background values.

Soil contamination assessment based on EF
The descriptive statistics of EF corresponding to the seven trace elements measured in the study area are given in Table 6. Mean values of EF were less than 2 for As and Zn, indicating no contamination by those metals in the soil. The mean EF values of Cd, Cr, Cu, and Pb ranged from 2.09 to 3.58; with respect to those metals, the soil was classified as moderately contaminated. The mean EF value of Hg was approximately 20, which was the highest of all the metals and which indicates considerable soil contamination. Estimated maps of EF of seven heavy metals in soil are presented in Fig. 2. The EF map of Cu shows higher values in areas surrounding the Dexing and Leping mining areas, which contain many Cu and Mo mining sites. The highest levels of Cd and Pb occurred at the centre of Dexing. Urban vehicular emissions and industrial activity, including incinerator operation and metallurgic activities, have continuously contributed to Cd and Pb contamination of topsoil in this area. The spatial distribution of As was highly heterogeneous in contrast to the other metals, suggesting that As in these samples may originate from point source pollution. In contrast to other heavy metals, the spatial distribution of Cr shows no clear hotspots, suggesting the study area is weakly polluted by Cr.

Soil contamination assessment based on Igeo
The mean Igeo values for all trace elements were lower than 0 (ranging from -0.07 to -0.64), suggesting a lack of soil contamination, except for Cd (Table 6). The spatial distributions of Cd, Cu, and Pb exhibited similar patterns (Fig. 3); however, the Igeo values indicated that the area polluted by Cd and Cu was more extensive than the area polluted by Pb. The spatial distribution of Igeo for Cr was similar to that of EF for Cr, confirming the lack of Cr contamination. Most soils in the world do not contain elevated concentrations of Hg, which is leached and evaporates after being reduced to the metallic form, although a portion is absorbed by organic matter and clay minerals. The urban areas, including Jingdezhen, Leping, Dexing, and Wuyuan, had the highest Hg Igeo values; the remaining area is weakly enriched in Hg.

Soil contamination assessment based on PI
The mean PIs for all trace elements ranged from 1.12 to 2.57, which indicates that the soils were moderately contaminated (Table 6). The assessment of the overall contamination of soil was based on IPIave. The IPIave, calculated as the mean of the PIs of the seven trace elements, was 1.66, which indicates moderate contamination. Estimated PI maps of seven heavy metals in soil are presented in Fig. 4. Among these soil contamination indices, the spatial distributions of Igeo and PI are remarkably similar across the study area.

Source identification based on PCA
PCA has been extensively used to identify contamination sources. The results of the PCA conducted in this study are shown in Table 7. In this study, three principal components explained 64.36% of total variance, according to the initial eigenvalues (eigenvalues > 1). As, Zn, Cd, Cu, Pb, and Hg were closely associated with the first principal component (PC1), explaining 35.71% of total variance; Cr and Ti were associated with the second principal component (PC2), which explained 16.78% of total variance; and Fe2O3 and Al2O3 were associated with the third principal component (PC3), explaining 11.87% of total variance. The other seven components (eigenvalues < 1) explain little of the variability in the dataset and will not be discussed further.
As shown in Fig. 5, high score areas were distributed in and around some of the Cu-Mo mining sites and along major roads. The areas with high PC1 scores, which had high levels of Cd, Cu, Pb, and Zn, were located around the Fujiawu Cu-Mo deposit (the largest open-pit Cu deposit in Asia). The mining activities represented by PC1 may have been the primary contributors of Cd, Cu, Pb, and Zn contamination in soil. Thus, PC1 was mainly controlled by anthropogenic sources.
Interpolated scores associated with PC2 are displayed in Fig. 5; the scores exhibit a different spatial distribution from the PC1 scores. Two high score areas were located in the city of Leping and at the Fujiawu Cu-Mo deposit. The high score areas located in Leping are associated more strongly with natural sources. The reasons for the high scores observed at the Fujiawu Cu-Mo deposit are not clear; in Cu deposits, naturally occurring Cu is often present in higher concentrations than in other environments. There are a number of potential causes of high PC2 scores, including the influence of anthropogenic activities. The findings suggest that Ti and Cr in soil originated from both natural and anthropogenic sources.
The spatial distribution of PC3 is presented in Fig. 5. The spatial variability of the score associated with PC3 is different than that of the scores associated with PC1 and PC2.

Comparative method evaluation
Most of the four methods used in this study have been employed previously in soil contamination assessments. However, our assessments based on GIS have some distinct advantages over those done in previous studies: (1) using these maps, soil researchers and managers can visually identify the degree of anthropogenic influence on the environment at a regional scale; (2) all mapped indices incorporate other relevant information, such as land-use type, soil type, and human activities, which leads to increased confidence in the results; and (3) mapped indices can serve as a platform for planning other soil research.
Though similar integrated soil quality evaluation results were obtained from the four methods, PCA is better than EFs, Igeo, and PIs for integrated soil contamination assessment in the study area. Using PCA, integrated soil contamination was assessed by differentiating the importance of various indicators. The 10 element measurements constituting the dataset were included in the statistical analyses to find the influence of anthropogenic components by multivariate analysis. The drawback is that this is a qualitative method, which cannot evaluate the degree of contamination. By contrast, IPIave treats each trace element as an independent entity and does not consider the specific geochemical context of each element.
The three soil contamination indices we used were dependent on the use of regional background values. Based on the indices, which are calculated according to mean values, Cd was classified as having caused moderate contamination, while the degree of contamination of other heavy metals varied. Using ratios of mean trace metal values to regional background values at a regional scale is an oversimplified approach and may result in erroneous estimates of soil contamination. Thus, the use of mean values is not a reliable way to evaluate contamination of an entire region, because different sample materials will respond differently to the presence of elements in the soil.
The mapping of the contamination indices we used, which takes into account spatial information and human activities, provides an effective way to evaluate the spatial distribution of anthropogenic impact on soil composition. Using EFs, Igeo, and PI calculated relative to off-site reference values of an entire region does not improve the sensitivity of the methods to anthropogenic enrichment and may even give spurious results. This study demonstrates that values of contamination indices can be high relative to off-site values for a number of reasons, and contamination is just one potential cause. The three off-site reference methods employed in this study are easy to conduct, and may be used for quantitative analyses where consistent effects of geologic and pedogenic processes can be assumed at the regional scale.

Conclusions
The findings of this study suggest that EFs, Igeo, and PI calculated from trace metal mean values relative to off-site reference values provide different interpretations of the same data when used to assess soil contamination. The assessment results are inconsistent, and no reliable conclusions can be drawn from them. However, the mapping of EFs, Igeo, PI, and PCA, combined with contamination source analysis, has the potential to differentiate between anthropogenic and natural element sources.
The most plausible results are likely to be obtained from multivariate statistical analysis methods. In this study, the use of PCA allowed us to discriminate between natural and anthropogenic trace metals in soils of the study area. The results are supported by the resulting EF, Igeo, and PI maps. According to the analysis, surface horizons are highly enriched in Cd, Cu, Pb, and Zn. The composition of topsoil is significantly modified by human activity in areas with high population density and areas near mining sites. As and Hg present in the soil were also mainly derived from anthropogenic sources, and occurred in relatively high concentrations in urban areas, in contrast to Cd, Cu, Pb, and Zn. Mapping of the soil contamination assessment indices seems to be an efficient tool for detecting sources of anomalies in the study area.