Source Apportionment and Risk Assessment of Emerging Contaminants: An Approach of Pharmaco-Signature in Water Systems

This paper presents a methodology based on multivariate data analysis for characterizing potential source contributions of emerging contaminants (ECs) detected in 26 river water samples across multi-scape regions during dry and wet seasons. Based on this methodology, we unveil an approach toward potential source contributions of ECs, a concept we refer to as the “Pharmaco-signature.” Exploratory analysis of the data was carried out by unsupervised pattern recognition (hierarchical cluster analysis, HCA) and a receptor model (principal component analysis-multiple linear regression, PCA-MLR) to demonstrate significant source contributions of ECs in different land-use zones. Robust cluster solutions grouped the database according to different EC profiles. PCA-MLR identified that 58.9% of the mean summed ECs were contributed by domestic impact, 9.7% by antibiotics application, and 31.4% by drug abuse. Diclofenac, ibuprofen, codeine, ampicillin, tetracycline, and erythromycin-H2O have significant pollution risk quotients (RQ>1), indicating potentially high risk to aquatic organisms in Taiwan.


Introduction
Emerging contaminants (ECs) are substances, many of them unregulated or inadequately regulated, whose presence in the environment has raised public concern; they originate from a range of uses, for instance industrial and domestic [1,2]. The occurrence and fate of ECs in aquatic environments have been widely studied. Increasing contamination of aquatic systems by ECs is a major problem for aquatic life, as well as for human health, as they are highly mobile and often of toxicological concern [3][4][5][6]. Pharmaceuticals and personal care products (PPCPs), as well as illicit drugs, are increasingly discharged with wastewater to surface water environments [7][8][9]. Several direct and indirect pathways are available for introduction of ECs into an aqueous environment. One primary route is via effluent from municipal wastewater treatment plants (WWTPs) [10][11][12]. Since wastewater treatment processes are designed primarily to remove pathogens, suspended particles, and nutrients from sewage, removal of ECs is purely incidental and their elimination varies [10,13]. Several authors have documented that conventional wastewater treatment is inadequate for EC removal [11,14,15]. Several ECs may be susceptible to degradation or transformation, but their continuous introduction into the aquatic environment confers some degree of pseudo-persistence [10,16]. Although these compounds occur at relatively low concentrations, their continual long-term release may nevertheless result in significant environmental impacts.
According to statistical data from the Taiwan Food and Drug Administration, drug disposal in Taiwan amounts to 36 tons per year, and total medical expenses in 2011 reached 48 billion US dollars [17]. Therefore, large amounts of unconsumed drugs may be present in the water systems. Available information concerning ECs in Taiwan is still limited. A few recent studies have focused on selected sampling locations (industrial and hospital) for certain pharmaceuticals in northern Taiwan [18,19], while the occurrence of ECs in the water systems of southern Taiwan, particularly any effect on water quality in adjacent areas, remains unknown.
Multivariate statistical techniques, such as receptor models and cluster analysis, have been widely used to apportion the contributions of contaminants derived from different sources and to investigate the distribution patterns and associations of contaminants in the environment [20,21]. In addition, given the ubiquity of the selected ECs, the relative abundance of contaminants, as opposed to absolute concentrations, can be considered a chemical signature specific to a source contribution or contaminant plume. This chemical signature can help to better understand the fate and contribution of ECs in aquatic environments.
This study develops a methodology for a concept we refer to as the "Pharmaco-signature" for source assessment of ECs, in which a particular land-use zone is characterized by a particular contribution of a mixture of ECs. The methodology is built upon a comprehensive and exploratory multivariate data analysis including the principal component analysis-multiple linear regression model (PCA-MLR) and hierarchical cluster analysis (HCA). This methodology makes it possible to (a) obtain more information about the structure of the data; and (b) separate and discern the source contributions of ECs. Results of this study could provide information on the levels, sources, and potential risks of ECs, and could support water resource protection and environmental management in Taiwan.

Ethics Statement
For sampling in the four rivers of Kaohsiung, no specific permit was required for the described field study. The study location is not privately owned or protected in any way and we confirm that the field study did not involve endangered or protected species.

Materials
The chemicals and standards used (including suppliers, purities, and detailed physicochemical properties of the 28 selected ECs) are described in S1 Text and S1

Study area and sample collection
The study area covers the urban, suburban, animal husbandry, and rural districts of Kaohsiung (22°18' N, 120°38' E), which has a population of 3 million and is also the largest industrial city in Taiwan. A map of the four selected rivers and our sampling locations is shown in Fig 1. Detailed descriptions and coordinates of the sampling sites are included in Table 1. Like many other rivers in Taiwan, these four rivers receive a variety of wastewaters, including untreated domestic wastewater and/or animal husbandry discharge [22]. Gaoping River has the largest drainage basin, including rural, suburban, animal husbandry, and industrial regions of Kaohsiung, with an area of 3,256 km². Gaoping River is also the longest river in Taiwan, with a length of approximately 140 km. Love River flows through the most urbanized and densely populated area of Kaohsiung City, with a length of 16.4 km and a drainage area of 56 km². Houjin River and Dianbao River have drainage basins of 70.4 and 107.1 km² and lengths of 21 and 25 km, respectively. Both rivers drain a partially rural region; one tributary of the Houjin River (located near H2) flows through a suburban area, and the downstream Dianbao River flows through an animal husbandry area. Two sampling campaigns were conducted in April 2010 (dry season) and July 2013 (wet season), with sampling sites denoted as follows: Gaoping River (sites G1-G8), Love River (L1-L10), Houjin River (H1-H4), and Dianbao River (D1-D4). Surface water samples (1 L) were collected in duplicate in precleaned amber glass bottles at each sampling site. All samples were stored in a cooler during the sampling campaigns and were immediately transported to the laboratory.

Sample preparation and analysis
Chemical analysis of ECs followed the methods employed in our previous study [23]. Water samples were filtered through 0.7 μm glass fiber filters, then acidified to pH = 6 by adding 0.1 M HCl, followed by addition of 0.2 g/L Na2EDTA as the chelating agent. For solid-phase extraction (SPE), 300 mL water samples were spiked with acetaminophen-d4, amphetamine-d11, methamphetamine-d14, MDMA-d5, 13C6-ibuprofen, and 13C3-caffeine as isotopically labelled surrogates for quantifying procedural recovery. An Oasis HLB cartridge (500 mg, 6 mL, Waters, Milford, USA) was conditioned with 6 mL methanol and 6 mL deionized water. The water sample was then passed through the pre-conditioned SPE cartridge at a flow rate of approximately 20 mL/min. The cartridge was then rinsed with 6 mL deionized (DI) water and dried for 30 min using the vacuum of the SPE manifold. The analytes were eluted with 6 mL of methanol, and the extract was evaporated to dryness under a gentle nitrogen stream. Afterwards, the residue was re-dissolved to a final volume of 1 mL in a 50:50 (v/v) solution of methanol in DI water, filtered through a 0.22 μm filter, and analyzed by liquid chromatography-tandem mass spectrometry coupled with electrospray ionization (LC-ESI-MS/MS).
Chromatography was performed using an Agilent 1200 module (Agilent Technologies, Palo Alto, CA, USA). The injection volume for PPCPs and illicit drugs was 50 and 10 μL, respectively, and the auto-sampler was operated at room temperature. Separation of PPCPs was performed on a 150 × 4.6 mm ZORBAX Eclipse XDB-C18 column with a 5 μm particle size (Agilent, Palo Alto, CA, USA). Illicit drugs were separated on a Kinetex PFP column (Phenomenex, Torrance, CA, USA, 100 × 2.1 mm, 2.6 μm). The gradients and mass spectrometer conditions used are described in the S1 Text.

Method validation and quality control
For all the compounds, wide linearity ranges were obtained for the quantification. Calibration curves with seven to ten points were constructed by least-squares linear regression from river water samples spiked with the analytes (typically from 0.5 to 2000 ng/L) and subjected to the same SPE procedure as the environmental water samples, with r² > 0.9991 for all compounds. Recovery experiments were performed on DI water and river water samples spiked with 500 ng/L target analytes and isotopically labelled surrogates to estimate the precision, recovery, and accuracy of the analytical method. Table 2 presents the recoveries for the target analytes in DI water and river water. Mean recoveries in DI water ranged from 74 to 110%, and in river water they ranged from 76 to 115%. Mean recoveries of the isotopically labelled surrogate standards (acetaminophen-d4, amphetamine-d11, methamphetamine-d14, MDMA-d5, 13C6-ibuprofen, and 13C3-caffeine) were 87 ± 11%, 74 ± 13%, 82 ± 15%, 84 ± 9%, 89 ± 8%, and 93 ± 12%, respectively. Blank samples and duplicate samples were analyzed in each batch to assure the quality of the analysis. Analysis of these blanks demonstrated that the extraction and sampling procedures were free of contamination. The relative percentage difference for individual target compounds identified in paired duplicates was less than 10%. The limits of detection (LODs) are defined as three times the standard deviation of the blank samples, and the limits of quantification (LOQs) for the analytes are defined as three times the LODs (International Organization for Standardization, ISO/TS 13530, 2009). The target compound LODs ranged from 0.15 to 1.79 ng/L, and the LOQs ranged from 0.45 to 5.36 ng/L (Table 2). Overall, the validation data (repeatability, recoveries, and limits of detection) are satisfactory, and therefore a reliable determination of the target compounds is feasible.
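The LOD and LOQ definitions above, together with the duplicate check, can be sketched in a few lines. This is a minimal illustration only: the blank readings and duplicate pair are hypothetical, the use of the sample standard deviation is an assumption, and the relative percentage difference is computed with the common definition (absolute difference divided by the pair mean), which the text does not spell out.

```python
from statistics import stdev

def lod_loq(blank_values_ng_per_l):
    """LOD = 3 x SD of blanks; LOQ = 3 x LOD (definitions given in the text)."""
    sd = stdev(blank_values_ng_per_l)  # sample standard deviation (assumption)
    lod = 3.0 * sd
    loq = 3.0 * lod
    return lod, loq

def relative_percent_difference(a, b):
    """RPD of a duplicate pair, as a percentage of the pair mean (assumed definition)."""
    return abs(a - b) / ((a + b) / 2.0) * 100.0

# Hypothetical blank readings and duplicate pair, in ng/L.
blanks = [0.10, 0.20, 0.15, 0.05, 0.25]
lod, loq = lod_loq(blanks)
rpd = relative_percent_difference(102.0, 98.0)

print(f"LOD = {lod:.2f} ng/L, LOQ = {loq:.2f} ng/L, RPD = {rpd:.1f}%")
```

With these definitions the LOQ is always nine times the blank standard deviation, which matches the factor of three between the reported LOD and LOQ ranges.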

Environmental risk assessment
Levels of environmental risk from these ECs are evaluated based on methods described by several authors [24][25][26][27]. Risk quotients (RQs) for aquatic organisms were calculated from the measured environmental concentration (MEC), and the predicted no effect concentration (PNEC) of the EC compounds. In this study, the highest concentration measured in the river waters was used for maximum MEC to calculate the maximum RQs. PNEC is calculated by dividing the lowest chronic no observed effect concentration (NOEC) by the assessment factor according to the European Technical Guidance Document [28]. A commonly used risk ranking

Multivariate statistical analysis
Hierarchical cluster analysis (HCA) is a statistical method that classifies samples into clusters according to their similarity and a chosen clustering rule. In this work, HCA was implemented in SPSS 16.0, using Ward's hierarchical agglomerative clustering method with the Euclidean distance measure, to analyze the relationships among the chemical compounds. Source contribution analysis was conducted using the principal component analysis-multiple linear regression (PCA-MLR) model. The purpose of PCA is to represent the total variability of the original EC data in a minimum number of factors. Each factor is orthogonal to all others, which results in the smallest possible covariance. The first factor represents the weighted (factor loadings) linear combination of the original variables (i.e., individual ECs) that accounts for the greatest variability. Each subsequent factor accounts for less variability than the previous factor. By critically evaluating the factor loadings, an estimate of the chemical source responsible for each factor can be made. The concentrations were Kaiser normalized and Varimax rotation was used as the preferred transformation. Multiple linear regression was then performed on the significant factors to determine the mass apportionment of each source to total concentrations.
Stepwise modeling was used to allow each independent factor to enter the regression equation if it significantly increased the correlation; the default significance level of 0.05 was used. After normalization, the MLR equation can be expressed as Eq 1:
$$\hat{Z}_{sum} = \sum_{k} B_k \, FS_k \qquad (1)$$
where $\hat{Z}_{sum}$ is the standard normalized deviate of the sum of the chemical concentrations, $B_k$ represents the regression coefficients, and $FS_k$ are the factor scores calculated by the PCA. The mean percentage contribution can be calculated as $B_k / \sum B_k$, and the contribution of each source $k$ was estimated as Eq 2:
$$\text{Contribution of source } k \ (\text{ng/L}) = \text{mean}[Z_{sum}] \times \frac{B_k}{\sum B_k} + B_k \, \sigma \, FS_k \qquad (2)$$
where $\text{mean}[Z_{sum}]$ and $\sigma$ are the mean and standard deviation of the summed concentrations.
More information on PCA-MLR in environmental studies can be found in the literature [30,31].
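The PCA-MLR workflow described above can be illustrated with a small NumPy sketch. It is not the SPSS procedure used in this study: the data are synthetic (two hypothetical sources with made-up profiles), stepwise selection and Kaiser normalization are omitted, and the `varimax` helper is a generic textbook implementation. The sketch standardizes the concentrations, retains components with eigenvalue > 1, applies a varimax rotation, regresses the standardized summed concentration on the rotated factor scores, and reports the mean percentage contributions B_k / ΣB_k.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Generic orthogonal varimax rotation; returns rotated loadings and rotation."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        grad = loadings.T @ (rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        if s.sum() < criterion * (1.0 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation, rotation

# Synthetic data (hypothetical): 26 samples x 6 compounds from two sources.
n = np.arange(26)
strength_a = 1.0 + 9.0 * n / 25.0
strength_b = 5.0 + 4.0 * np.sin(n)
profiles = np.array([[5.0, 4.0, 3.0, 0.1, 0.1, 0.1],
                     [0.1, 0.1, 0.1, 4.0, 3.0, 5.0]])
conc = np.column_stack([strength_a, strength_b]) @ profiles

# PCA on standardized concentrations; keep components with eigenvalue > 1.
z = (conc - conc.mean(axis=0)) / conc.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
keep = eigvals > 1.0
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
scores = z @ eigvecs[:, keep] / np.sqrt(eigvals[keep])  # unit-variance scores

# Rotate, then orient each factor so that its loadings are mostly positive.
rot_loadings, rotation = varimax(loadings)
rot_scores = scores @ rotation
signs = np.sign(rot_loadings.sum(axis=0))
rot_loadings, rot_scores = rot_loadings * signs, rot_scores * signs

# Regress the standardized summed concentration on the factor scores.
total = conc.sum(axis=1)
z_sum = (total - total.mean()) / total.std(ddof=1)
B, *_ = np.linalg.lstsq(rot_scores, z_sum, rcond=None)
share = 100.0 * B / B.sum()  # mean percentage contribution per factor

print("retained factors:", int(keep.sum()))
print("mean percentage contributions:", np.round(share, 1))
```

The sign flip after rotation is a convenience of this sketch: eigenvector signs are arbitrary, and orienting each factor positively keeps the regression coefficients interpretable as source strengths.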

Occurrence of ECs
The results can be illustrated better by dividing the 28 ECs into 6 groups based on their general uses and/or origins: non-steroidal anti-inflammatory drugs (NSAIDs), illicit drugs, personal care products, antibiotics, caffeine, and other pharmaceuticals (clofibric acid, gemfibrozil, and carbamazepine). The high overall frequency of detection for ECs is likely influenced by the study design, which focuses on sampling sites generally considered susceptible to contamination (i.e., downstream of densely populated, urbanized, and livestock-producing areas). A large proportion of the ECs (22 out of 28) were detected at least once (Fig 2). Among the 22 detected ECs, ibuprofen and pseudoephedrine were detected in 100% of samples (S2 Table). Measured concentrations are generally low (median detectable concentrations generally < 1000 ng/L); the exception is caffeine (2792 ng/L), with a maximum concentration of 41,200 ng/L. Caffeine shows the highest concentration, with a high frequency of detection, which is not surprising given its prevalence in beverages, foods, and pharmaceuticals [32]. Ibuprofen was detected in all surface water samples at concentrations ranging from 1.9 to 4000 ng/L. This observation is similar to findings reported in previous research [33,34] and might be explained by the fact that ibuprofen is a commonly used antiphlogistic drug, with widespread use in the treatment of symptoms of colds, aches, and pains, and for treatment of arthritic conditions [25].

Patterns and signatures
Gaoping River is a characteristic mountain river, with a slender and sharp upstream basin. Most inhabitants (97.4%) are located in downstream areas [55]. Therefore, only low EC concentrations were found at stations G1-G4, reflecting background levels in the rural area (Fig 3). Ampicillin shows the highest concentration among the antibiotics (1920 ng/L) in Gaoping River. Animal husbandry, such as pig farming, and inappropriate disposal of manure into watercourses might explain these high antibiotic concentrations. It is estimated that there are approximately 1.9 million pigs in the drainage area of Gaoping River, approximately 30% of the entire pig production of Taiwan [56]. Thus, a pronounced signal from animal husbandry is expected. Consistent with this, Ning et al. [57] found that livestock operations such as pig farming can be a potential threat to water resources due to inappropriate disposal of manure into watercourses in the catchment of Gaoping River. This may represent a critical issue, as downstream waters are an important drinking water source for Kaohsiung city.
Relatively high EC concentrations were observed in the upstream area of Love River. This may be because two of the largest hospitals in Kaohsiung are located along Love River (Fig 1). Caffeine, NSAIDs, and illicit drugs have relatively high concentrations and frequencies of detection in Love River, which may reflect cumulative contributions from domestic impact. As part of water quality management of Love River, two river interception stations were installed to collect and redirect river water for ocean outfall disposal. Hence, the downstream river waters are mainly composed of rainwater and tidal water from estuarine regions, where EC concentrations are relatively low. The higher concentrations found in Houjin River than in Dianbao River may be explained by the fact that the population in the Houjin River catchment is four times greater than that in the Dianbao River catchment [55]. In addition, to a certain extent, Dianbao River demonstrates a compositional pattern similar to that of Gaoping River. The elevated concentration of antibiotics in Dianbao River may also be attributed to antibiotics use in the nearby animal husbandry area.
The signatures of the various rivers can be demonstrated by plotting the concentrations of Human-ECs (human-use drugs, including NSAIDs, clofibric acid, carbamazepine, gemfibrozil, personal care products, and illicit drugs) against antibiotic concentrations (Fig 4). A distinct skewness between Human-ECs and antibiotics is found in Gaoping River and Love River. Stations in Love River and Houjin River both contained much higher concentrations of Human-ECs than antibiotics, suggesting a dominant domestic impact. In contrast, stations in Gaoping River only have elevated levels of antibiotics, indicating an observable impact from antibiotics application in animal husbandry. In addition, much lower concentrations are observed at stations in the upstream Gaoping River (G1-G4), reflecting the signature of a rural area. These results agree with the discussion above.
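The contrast between Human-ECs and antibiotics can be expressed as a pair of relative abundances per station. The following sketch is illustrative only: the compound groupings are an abbreviated subset of those named in the text, and all concentrations are hypothetical rather than measured values.

```python
# Abbreviated groupings based on the text; concentrations below are hypothetical.
HUMAN_ECS = {"ibuprofen", "diclofenac", "carbamazepine", "ketamine"}
ANTIBIOTICS = {"ampicillin", "tetracycline", "sulfamethoxazole", "erythromycin-H2O"}

def signature(sample_ng_per_l):
    """Return the Human-EC and antibiotic fractions of their combined concentration."""
    human = sum(c for name, c in sample_ng_per_l.items() if name in HUMAN_ECS)
    antibiotic = sum(c for name, c in sample_ng_per_l.items() if name in ANTIBIOTICS)
    total = human + antibiotic
    if total == 0:
        return None  # nothing detected in either group
    return {"human": human / total, "antibiotic": antibiotic / total}

# Hypothetical stations: one urban, one near animal husbandry.
urban_site = {"ibuprofen": 900.0, "diclofenac": 150.0, "ketamine": 300.0, "ampicillin": 50.0}
husbandry_site = {"ibuprofen": 40.0, "ampicillin": 800.0, "tetracycline": 160.0}

for label, sample in [("urban", urban_site), ("husbandry", husbandry_site)]:
    print(label, signature(sample))
```

Working with fractions rather than absolute concentrations follows the idea, stated in the Introduction, that relative abundance can act as a chemical signature of a source.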

Source contribution
To further identify the source contributions based on the profiles of ECs, we performed principal component analysis followed by multiple linear regression (PCA-MLR) and hierarchical cluster analysis (HCA) on all samples. Concentrations below the LOQs were recorded as half of the LOQ values in the datasheet. The compounds used for multivariate analysis are shown in Table 4; chemicals without detection or with low detection frequency were not included. PCA of the data sets in this study yielded three principal components (PCs) with eigenvalues >1. After varimax rotation, these three PCs accounted for 30%, 18%, and 17% of the total variance, respectively. The relatively low variance captured by some PCs may be due to values below the LOQ being replaced by half of the LOQ, which gives low variation in the data [58,59]. The first component (PC1) is highly associated with diclofenac, ibuprofen, naproxen, ketoprofen, erythromycin-H2O, gemfibrozil, carbamazepine, caffeine, benzophenone-3, benzophenone-4, and pseudoephedrine, which are important chemicals in the human profile. Thus, PC1 is likely indicative of domestic sewage discharged into the environment. The second component (PC2) is characterized by high loadings of sulfamethoxazole, ampicillin, tetracycline, and erythromycin-H2O. Chang et al. [60] investigated overall antibiotic consumption in both humans and animals in Taiwan. Annual consumption of human-use antibiotics is estimated at 329-378 tons, while 869-1,040 tons is estimated for animal-use antibiotics. This indicates that animal-use antibiotics account for 70%-76% of the total quantity of antibiotics consumed, suggesting that antibiotics in Taiwan are consumed mainly for animal use. Based on this profile, antibiotics application in the animal husbandry areas near those sites is inferred to be the potential source.
The third component (PC3) has high loadings of amphetamine, methamphetamine, ketamine, and codeine and moderate loadings of ibuprofen and pseudoephedrine. These chemicals originate mainly from drug abuse, although some of them are also used for medication in hospitals. Therefore, the high proportions of these drugs in PC3 could be attributed to drug abuse.
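The data preparation described for the multivariate analysis (values below the LOQ recorded as half of the LOQ, and rarely detected compounds excluded) can be sketched as follows. The cut-off `MIN_FREQUENCY`, the example readings, and the LOQ values are hypothetical; the text does not state the detection-frequency threshold that was applied, and counting only values at or above the LOQ as detections is an assumption of this sketch.

```python
MIN_FREQUENCY = 0.3  # hypothetical cut-off; not stated in the text

def substitute_below_loq(values, loq):
    """Replace missing values and values below the LOQ by LOQ / 2."""
    return [v if v is not None and v >= loq else loq / 2.0 for v in values]

def detection_frequency(values, loq):
    """Fraction of samples quantified at or above the LOQ (assumed definition)."""
    hits = sum(1 for v in values if v is not None and v >= loq)
    return hits / len(values)

# Hypothetical readings in ng/L; None marks a non-detect.
raw = {
    "ibuprofen": [12.0, 3.1, None, 40.0],
    "clofibric acid": [None, None, 0.2, None],
}
loq = {"ibuprofen": 1.5, "clofibric acid": 0.9}

prepared = {
    name: substitute_below_loq(values, loq[name])
    for name, values in raw.items()
    if detection_frequency(values, loq[name]) >= MIN_FREQUENCY
}
print(prepared)
```

Because every censored value of a compound is replaced by the same constant, heavily censored compounds contribute little variance, which is in line with the explanation given above for the modest variance captured by some PCs.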
Multiple linear regression of the standard normalized deviate of the summed concentrations of the 22 chemicals ($\hat{Z}_{sum}$) against the factor scores ($FS_k$) was performed to determine the mass apportionment of the three components in all samples. The resulting equation has the form $\hat{Z}_{sum} = B_1 FS_1 + B_2 FS_2 + B_3 FS_3$. By expanding $\hat{Z}_{sum}$ and rearranging terms, the MLR equation becomes $Z_{sum} = \sigma \, (B_1 FS_1 + B_2 FS_2 + B_3 FS_3) + \text{mean}[Z_{sum}]$, where σ was 7389 ng/L and mean[Z_sum] was 5926 ng/L. Thus the mean percentage contribution (B_k/∑B_k) was 58.9% for domestic impact (FS1), 9.7% for antibiotics application (FS2), and 31.4% for drug abuse (FS3). Fig 5 shows the estimated contributions of each source in all samples in the two sampling campaigns. The positive contributions explain the variations of the source contributions in all rivers, and the negative contributions are an outcome of the improper variable scaling inherent in PCA methods, as described previously [31]. The PCA-MLR analysis showed that contributions due to antibiotics application (FS2) were relatively low except for samples collected near animal husbandry areas (Stations G6, G7, D2, D3, and D4); these data point to antibiotics application in animal husbandry as a significant source of antibiotics contamination. The contribution levels in Love River and Houjin River were high and showed substantial domestic impact (FS1). The source tentatively attributed to drug abuse (FS3) was a contributor to most Love River samples, particularly those at upstream sites. S1 Fig shows the relative percentage of source contribution at each sampling site. Relatively high percentages of FS2 were observed in Gaoping River (1.3-65% in dry season and 2.0-94% in wet season) and Dianbao River (40-63% in dry season and 37-74% in wet season), while high percentages of FS1 and FS3 were found in Love River (12-94% and 3.4-82% in dry season; 42-78% and 20-55% in wet season).
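As a worked illustration of the apportionment step, the sketch below applies the commonly used PCA-MLR expression, contribution of source k = mean[Z_sum] × (B_k/∑B_k) + B_k σ FS_k, to a single sample. The values of σ and mean[Z_sum] are those reported above. The regression coefficients are hypothetical (only their ratios are chosen to match the reported 58.9%, 9.7%, and 31.4%), and the factor scores are invented for the example.

```python
SIGMA = 7389.0     # ng/L, reported in the text
MEAN_SUM = 5926.0  # ng/L, reported in the text

# Hypothetical coefficients: only their ratios match the reported shares.
B = {"domestic": 0.589, "antibiotics": 0.097, "drug_abuse": 0.314}

def mean_shares(coefficients):
    """Mean percentage contribution of each source, B_k / sum(B_k)."""
    total = sum(coefficients.values())
    return {k: 100.0 * b / total for k, b in coefficients.items()}

def contributions(factor_scores):
    """Per-sample contribution of each source in ng/L."""
    total_b = sum(B.values())
    return {
        k: MEAN_SUM * B[k] / total_b + B[k] * SIGMA * factor_scores[k]
        for k in B
    }

# Hypothetical factor scores for one sample.
fs = {"domestic": 1.2, "antibiotics": -1.5, "drug_abuse": 0.3}

shares = mean_shares(B)
contrib = contributions(fs)
for source in B:
    print(f"{source}: share {shares[source]:.1f}%, contribution {contrib[source]:.0f} ng/L")
```

A sufficiently negative factor score drives the estimated contribution below zero, as happens for the antibiotics source in this example; this is the kind of negative contribution discussed above.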
These results appear consistent with the land-use structure: Love River and Houjin River mainly flow through the residential areas of Kaohsiung City. Therefore, significant source contributions from domestic impact and drug abuse were found in both sampling campaigns for Love River and Houjin River. The dendrogram of sampling points in the two sampling campaigns obtained by HCA is shown in Fig 6. Two well-differentiated clusters were observed: (I) a cluster characterized by high compositional fractions of caffeine; and (II) a cluster characterized by high compositional fractions of ampicillin. Cluster I is the largest, formed by all stations in Love River and Houjin River and stations D1 and D2. These results indicate that the signature in this cluster bears mainly domestic impacts. Cluster II comprises stations in Gaoping River and Dianbao River (G1-G8, D3, and D4). This cluster contains the stations with the lowest concentrations of ECs (G1-G4) and stations characterized by high levels of antibiotics. These results indicate that the signatures of cluster II were mainly derived from rural and animal husbandry contributions. The HCA and PCA-MLR findings are similar and provide further evidence for the source contributions.
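The clustering of stations by compositional fractions can be illustrated with a naive implementation of Ward's agglomerative criterion, which repeatedly merges the pair of clusters whose union gives the smallest increase in within-cluster variance. This sketch is not the SPSS routine used in the study: the station labels are borrowed from the text for readability, but the three-component fractions (caffeine, ampicillin, other) are hypothetical.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_clusters(profiles, n_clusters):
    """Merge the cheapest pair of clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [profiles[n] for n in clusters[ij[0]]],
                [profiles[n] for n in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical fractions of (caffeine, ampicillin, other) per station.
profiles = {
    "L1": (0.70, 0.05, 0.25), "L2": (0.65, 0.05, 0.30), "H1": (0.60, 0.10, 0.30),
    "G6": (0.10, 0.70, 0.20), "G7": (0.15, 0.65, 0.20), "D3": (0.20, 0.55, 0.25),
}

clusters = ward_clusters(profiles, n_clusters=2)
print(clusters)
```

In this toy example the caffeine-dominated and ampicillin-dominated stations separate into two groups, mirroring the kind of split described for clusters I and II.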

Environmental risk characterization
Environmental risks to aquatic organisms are assessed for a worst-case scenario in southern Taiwan based on the RQs calculated using maximum MECs and PNECs (Table 5). Overall, ampicillin has the highest RQ, with values in Gaoping River, Love River, Houjin River, and Dianbao River of 22.45, 5.71, 8.13, and 4.48, respectively. The RQ values for both ampicillin and codeine exceed 1.0 in all four rivers, indicating their potential risk to aquatic organisms. Ibuprofen and diclofenac may pose a high risk to aquatic organisms in Love River and Houjin River. Similar high-risk results for these ECs have also been found in surface waters worldwide. Hernando et al. [29] predict high risk levels based on the RQ values of ibuprofen, diclofenac, ketoprofen, gemfibrozil, erythromycin-H2O, clofibric acid, and carbamazepine in surface water and STP effluent in Europe. RQ values greater than 1.0 have been reported for ibuprofen in the Danish aquatic environment and in Spanish sewage effluent [61,62], as well as for diclofenac in a Norwegian river [63], Australian sewage effluent [64], and in the Pearl River, China [27]. In summary, risk assessment in the present study shows that the NSAIDs ibuprofen and diclofenac and the opioid codeine are the three human-use drugs with high ecological risk, whereas ampicillin and erythromycin-H2O are the two antibiotics with high ecological risk. Although direct acute ecological effects have not been reported in the aquatic environment, and the PNEC values were not derived for the most sensitive species in this study area, precautionary measures should be taken to reduce risks to aquatic organisms due to potential subtle chronic changes caused by ECs in southern Taiwan.
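The risk quotient calculation can be sketched as follows, assuming the commonly used ratio RQ = MEC / PNEC with PNEC obtained by dividing the lowest chronic NOEC by an assessment factor, as described in the Methods. The NOEC, assessment factor, and MEC in the example are hypothetical and are not taken from Table 5; the threshold RQ > 1 is the one used in this paper to indicate potentially high risk.

```python
def pnec(lowest_noec_ng_per_l, assessment_factor):
    """Predicted no effect concentration: lowest chronic NOEC / assessment factor."""
    return lowest_noec_ng_per_l / assessment_factor

def risk_quotient(mec_ng_per_l, pnec_ng_per_l):
    """RQ = measured environmental concentration / PNEC."""
    return mec_ng_per_l / pnec_ng_per_l

# Hypothetical inputs for a single compound.
noec = 10_000.0   # ng/L, lowest chronic NOEC (hypothetical)
factor = 100.0    # assessment factor (hypothetical)
max_mec = 250.0   # ng/L, maximum measured concentration (hypothetical)

compound_pnec = pnec(noec, factor)
rq = risk_quotient(max_mec, compound_pnec)
high_risk = rq > 1.0

print(f"PNEC = {compound_pnec:.0f} ng/L, RQ = {rq:.2f}, high risk: {high_risk}")
```

Using the maximum MEC, as in this study, gives a worst-case RQ; substituting a median MEC in the same function would give a less conservative estimate.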

Limitation, advantage and application
One important limitation of developing this pharmaco-signature is the selection of the most representative and indicative target compounds. For example, several EC compounds have different applications and may be used for both human and veterinary treatment, and therefore no distinct pattern could be observed. The use of ECs may also vary among countries. Thus, the greater difficulty lies in proper source identification. It is important that researchers strive to include key EC source markers that will improve the ability to identify the pharmaco-signature.
Despite the limitations, this methodology has several advantages. In the step-by-step approach, the first step is determining the concentration distribution in terms of individual ECs, species groups, and percentages of summed EC concentrations, which can then be used to identify abundant chemicals and to clarify patterns and signatures. The second step is implementing the PCA-MLR method to resolve predominant factors and source contributions. The third step is using the HCA method to obtain differentiated clusters. Neither PCA-MLR nor HCA alone can clearly characterize EC sources; performing both allows the methods to confirm and support each other and clarifies the potential source contributions. In this study, PCA-MLR and HCA were used to identify source contributions and to clarify patterns and signatures by comparing two sampling seasons, even though the sampling campaigns were 3 years apart. Both PCA-MLR and HCA gave similar pharmaco-signature results in those sampling campaigns, indicating that land use in the multi-scape water systems was consistent between campaigns. Therefore, these results strengthen confidence in the validity of these multivariate statistical approaches for clarifying potential source contributions in our study area.
The results of this concept have much broader implications for discerning source contributions. Where appropriate contaminant data are available, the developed methodology, complemented by additional information such as geographical/hydrological characteristics of the study area, water quality parameters (e.g. BOD, TOC, E. coli), and chemical markers (e.g. pesticides, VOCs), becomes more applicable for environmental studies to further resolve and identify potential source contributions.

Supporting Information
Table. CAS number, formula, molecular weight, log Kow, log Koc, melting point, vapor pressure, and solubility of the selected ECs. (DOCX)
S2 Table. The rank of ECs according to the frequency of detection in the study area. (DOCX)
S1 Text. Materials and Methods. Detailed descriptions of chemicals and standards, LC-MS/MS analysis, and environmental risk assessment in this study are provided in the S1 Text. (DOCX)