Combining physicochemical properties and microbiome data to evaluate the water quality of South African drinking water production plants

Anthropogenic activities in catchments used for drinking water production contaminate source waters, and this may affect the quality of the final drinking water product. These contaminants may also affect the taxonomic and functional profiles of the bacterial communities in the drinking water. Here, we report an integrated insight into the microbiome and water quality of four water treatment plants (NWC, NWE, WCA and NWG) that supply potable water to communities in South Africa. A new scoring system based on combined significant changes of physicochemical parameters and microbial abundance from raw to treated water was used to evaluate the effectiveness of the treatment plants at water purification. Physicochemical parameters, including total dissolved solids, turbidity, pH, nitrites and phosphorus, among others, were measured in source, treated and distributed water. There were generally statistically significant (P ≤ 0.05) differences between raw and treated water, demonstrating the effectiveness of the purification process. Illumina sequencing of the 16S rRNA gene was used for taxonomic profiling of the microbial communities, and these data were used to infer functional attributes of the communities. Structure and composition of the bacterial communities differed significantly (P < 0.05) among the treatment plants; only NWE and NWG showed no significant differences (P > 0.05). This correlated with the predicted functional profile of the microbial communities obtained from Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt), as well as with the likely pollutants of the source water. Bacteroidetes, Chlorobi and Fibrobacteres differed significantly (P < 0.05) between raw and distributed water. PICRUSt inferred a number of pathways involved in the degradation of xenobiotics such as dichlorodiphenyltrichloroethane (DDT), atrazine and polycyclic aromatic hydrocarbons.
More worrying was the presence of pathways involved in beta-lactam resistance, potential pathogenic Escherichia coli infection, Vibrio cholerae infection and shigellosis. OTUs associated with a number of opportunistic pathogens were also present in drinking and treated water.


Introduction
Water quality in several parts of South Africa is threatened by urbanization (poorly treated or untreated sewage and polluted storm water), mines (effluents containing metals and acid), agriculture (return flow that contains excessive amounts of pesticides, herbicides and fertilizers) and various industries [1]. This is particularly the case for the Vaal River system, which has become a receptacle of pollutants through runoff and infiltration [2]. These anthropogenic activities result in poor quality of most source waters, which in turn requires sophisticated systems and additional purification steps for the delivery of good potable water [3]. These sophisticated systems are costly to operate and are not always available [4]. Poor management and maintenance of existing drinking water infrastructure may also lead to the degradation of drinking water quality even if the raw water is of reasonable quality [5]. To encourage municipalities and water utilities in South Africa to manage and maintain infrastructure, the Drinking Water Quality Framework for South Africa was introduced [6]. This is an incentive-based water regulation and monitoring framework that is defined by the Blue Drop Certification Programme for Drinking Water Quality Management Regulation [1]. The regulation and monitoring of drinking water quality is based on legislated norms and standards such as the South African National Standards [6][7][8].
The physical and chemical properties of water intended for drinking and other domestic purposes must not exceed specified limits [9]. These properties may affect the appearance, colour and odour of the water to levels that are unacceptable to consumers even when they pose no danger. Consumers have the right to evaluate the quality and acceptability of the water [10]. The potential for drinking water to transport and disseminate microbiological pathogens to consumers is a reality; an example is the 2015 cholera outbreak caused by drinking contaminated water in Kasese, Uganda [11]. For this reason, physical and chemical processes are used to remove these pathogens from drinking water, and surrogate organisms such as Escherichia coli, a bacterium of faecal origin, are used to evaluate the efficacy of these processes [1,10,12]. Water intended for human use and consumption must be free of any faecal or E. coli indicator organisms [13]. Drinking water distribution systems are, however, not isolated, sterile environments and may contain heterotrophic bacteria. These include all bacteria that use organic nutrients for growth, and they are universally present in all types of water systems. Heterotrophic plate count bacteria, a subset of heterotrophic bacteria, are thus found in drinking water, and up to 10³ cfu/ml is allowed in a small number of samples [14][15][16].
Next generation sequencing (NGS) of the 16S rRNA gene in environmental DNA from microbial communities has demonstrated that safe, high-quality drinking water contains a unique biodiversity [17,18]. These communities are impacted by the quality of the source water, the purification process, the materials used in the distribution system and the physical forces in the system [17,18]. Elevated water temperatures, low residual chlorine and nutrients (carbon, phosphorus, nitrogen and iron) are important factors for maintaining microbial communities in drinking water distribution systems [17]. The 16S rRNA gene profile data are informative at the population and community level, and they can be processed into various ecological diversity indices [17,18]. In addition, phylogenetic datasets generated through NGS of the 16S rRNA gene can be used for extrapolating metabolic and ecosystem functions [19].
The aim of the present study was thus to provide insights into the microbiome and water quality of selected drinking water production plants in South Africa and to discuss the potential application of such data in interpreting the impact of anthropogenic activities on the process and cost of water purification. At the same time, we evaluated the efficacy of the treatment process used by the drinking water treatment plants. More importantly, we introduce a new method that allows for combined evaluation of physicochemical parameters and microbiome data to assess how effectively different drinking water treatment plants remove contaminants.
Table 1 shows a summary of information on the various drinking water treatment plants (DWTPs) used in this study. Samples were collected in June 2017 from raw, treated and distributed water of each DWTP following the Department of Water Affairs and Forestry (DWAF) sampling guidelines [20]. Sampling was done in triplicate. Briefly, the samples were collected in sterile 1 litre Schott bottles and were stored and transported on ice. All samples were subjected to laboratory analyses within 8 h of sample collection. Table 1 also shows the source of raw water and the anthropogenic activities likely to influence the quality of the raw/source water of each treatment plant. The exact locations and names of the treatment plants were anonymised, as the results might influence consumers' opinions. The four treatment plants were designated as WCA, NWC, NWE and NWG. For sampling the WCA, NWC and NWE treatment plants, written permission was obtained from the local municipalities, which serve as both water service provider and water service authority. For sampling the NWG treatment plant, written permission was obtained from both the water service provider (a private company) and the water service authority (the local municipality).
For distributed samples, non-written permission for sampling was obtained from the households where the samples were collected. In South Africa, the water service authorities are under the jurisdiction of the Department of Water Affairs and Sanitation, formerly DWAF.

Analysis of physicochemical parameters
Water quality parameters (pH, temperature and total dissolved solids) were measured in situ using a multi-350 probe analyser (Merck, Germany). Turbidity was measured using a HACH 21000P Turbidity meter (HACH, USA). A HACH DR 2800 spectrophotometer (HACH, USA) was used to measure phosphates, nitrate, nitrite and free chlorine. Microsoft Excel (2016; version 16.0.6868.2067) was used to determine the averages and standard deviations. Correlations were made between the physicochemical parameters of raw, treated and distributed water by Principal Component Analysis (PCA) using Canoco software version 4.5.

Bioinformatics analysis and data visualisation
Overlapping paired-end Illumina fastq files were merged using the PANDAseq assembler [22], and reads were quality checked using FastQC (Babraham Bioinformatics, UK; https://www.bioinformatics.babraham.ac.uk/). Where necessary, trimming was done using ea-utils. Downstream analysis was done using Quantitative Insights into Microbial Ecology (QIIME 1.9.1) [23]. Merged quality-filtered reads were clustered into operational taxonomic units (OTUs) at 97% 16S rRNA gene similarity using the UCLUST algorithm [24] against the Greengenes database. Version gg_13_5 was used for closed-reference OTU picking, while version gg_13_8 was used for open-reference OTU picking. The closed-reference OTUs were used only for analysis with Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt). The taxonomy of each phylotype was classified based on the Greengenes database using the Ribosomal Database Project. For visualisation and statistical analysis, the OTUs were subjected to Microbiome Analyst [25], METAGENassist [26], PICRUSt [27] and Statistical Analysis of Metagenomic Profiles (STAMP) [28].

Comparative statistical analysis of selected physicochemical properties and microbiome data
The reduction of total dissolved solids (TDS), turbidity, phosphates, nitrites, nitrates and OTUs from raw water to final drinking water was evaluated for each water purification plant. This was achieved using a scoring system in which a score of 0 was assigned if no significant reduction or increase of a parameter was observed, a score of -1 was assigned for a significant increase of a parameter from raw to treated water, and a score of 1 was assigned for a significant decrease of a parameter from raw to treated water. A total score was calculated for each purification plant, with higher scores indicating better overall functionality of the plant. The significance of the changes between treated and raw water was calculated using the Student's t-test for each parameter, individually for each purification plant. The t-test was conducted using a one-tailed distribution assuming unequal variances, and statistical significance was recognized for P-values < 0.05.
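The scoring rule described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names, the numerical integration used to obtain the P-value and the example measurements are all assumptions. The one-tailed test is applied in the direction of the observed change, which is one possible reading of the description.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= |t|), from Simpson integration of the density on [0, |t|] (steps must be even)."""
    t = abs(t)
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)

def welch_one_tailed(raw, treated):
    """Return (t, df, p) for a one-tailed, unequal-variance t-test in the observed direction."""
    v_raw = variance(raw) / len(raw)
    v_trt = variance(treated) / len(treated)
    if v_raw + v_trt == 0:
        raise ValueError("both groups have zero variance; the t statistic is undefined")
    t = (mean(treated) - mean(raw)) / math.sqrt(v_raw + v_trt)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v_raw + v_trt) ** 2 / (
        v_raw ** 2 / (len(raw) - 1) + v_trt ** 2 / (len(treated) - 1)
    )
    return t, df, upper_tail(t, df)

def score(raw, treated, alpha=0.05):
    """+1 for a significant decrease, -1 for a significant increase, 0 otherwise."""
    t, _, p = welch_one_tailed(raw, treated)
    if p >= alpha:
        return 0
    return 1 if t < 0 else -1

# Hypothetical triplicate turbidity readings in NTU (not data from this study)
raw_turbidity = [5.0, 5.2, 4.8]
treated_turbidity = [1.0, 1.1, 0.9]
print(score(raw_turbidity, treated_turbidity))
```

The total score for a plant is then the sum of `score(...)` over the six evaluated parameters.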

Physicochemical analysis
Turbidity in the raw water (0.59 ± 0.6 NTU) of the NWC treatment plant was significantly lower (P < 0.05) than in the treated water (1.21 ± 0.8 NTU) and in the water in the distribution system (1.57 ± 1.06 NTU) (Table 2). In contrast, turbidity decreased significantly (P ≤ 0.05) from raw to treated water at the NWE and NWG treatment plants. At the WCA treatment plant, treatment resulted in a decrease in turbidity, but the difference between raw and treated water was not significant (P > 0.05) (Table 2). In the distributed water, turbidity at the NWC and NWG treatment plants (1.57 ± 1.06 and 1.02 ± 1.91 NTU, respectively) was higher than at WCA and NWE (0.51 ± 0.08 and 0.46 ± 0.19 NTU, respectively) (Table 2). There were no significant differences in total dissolved solids (TDS) between raw and treated water at any of the treatment plants. The WCA treatment plant had relatively higher TDS (above 600 mg/L) in all the compartments in comparison to the other treatment plants, where TDS concentrations were below 600 mg/L (Table 2). The maximum concentration of TDS recorded at the WCA plant was above 900 mg/L (Table 2). Phosphate levels in raw and treated water were not significantly different at three of the four treatment plants. The only exception was the NWG treatment plant, where the levels of phosphates were significantly lower (P < 0.05) in the treated water than in the raw water (Tables 3 and 4). Overall, the WCA treatment plant had higher concentrations of phosphorus in all compartments (Table 3). The concentrations of nitrates in treated water at the WCA and NWC treatment plants did not differ significantly from the concentrations recorded in source water. However, at NWE the treatment of source water resulted in a significant decrease (P < 0.05) of nitrates.
In contrast, at the NWG treatment plant, the treated water had a significantly (P < 0.05) higher concentration of nitrates than the raw water (Tables 3 and 4). There were no significant differences in the concentration of nitrites between raw and treated water at the WCA, NWE and NWC treatment plants. On the other hand, at the NWG treatment plant there was a significantly (P < 0.05) higher concentration of nitrites in the treated water than in the raw water (Table 4). Raw, treated and distributed water was alkaline at all the treatment plants (Table 2). There were no significant differences in pH among the treatment plants or among the different compartments of the treatment plants. Free chlorine concentrations in the distributed water were generally low. However, at NWE and NWG, maximum concentrations exceeded 1 mg/L.
Principal component analysis (PCA) of physicochemical properties showed that, at the WCA treatment plant, turbidity and nitrates correlated strongly with raw water, whereas the drinking water was associated with temperature and pH (Fig 1). At NWC, raw water correlated with nitrites and pH, whereas temperature, turbidity and TDS correlated strongly with drinking water (Fig 1). Raw water at NWE correlated mostly with pH, whereas drinking water correlated mostly with turbidity and temperature. Raw water at NWG did not have a specific positive correlation with any parameter, whereas the drinking water correlated with nitrates, nitrites, temperature and TDS (Fig 1).
Treated and distributed water samples were screened for potentially pathogenic bacteria. A number of pathogenic signatures, which included the genera Acinetobacter, Clostridium, Legionella, Pseudomonas, Serratia and Tatlockia, were identified. The NWC and NWG treated water contained all of the above-mentioned genera. The NWE treated water had at least one OTU belonging to each of the genera except Serratia. OTUs in distributed water of NWE were positive for Acinetobacter and Pseudomonas, while distributed water at NWG had Pseudomonas, Tatlockia and, surprisingly, a higher number of OTUs belonging to Legionella than the treated water. S1 Table shows the distribution of OTUs within the various potentially pathogenic bacterial genera. The species from the various genera included: Acinetobacter (A. spp., A. johnsonii and A. rhizosphaerae), Clostridium (Clostridium spp., C. intestinale, C. piliforme and C. bowmanii), Legionella (L. spp. and L. pneumophila), Pseudomonas (P. spp., P. pseudoalcaligenes, P. nitroreducens, P. veronii and P. fragi) and Serratia (S. spp. and S. marcescens).

OTU diversity and similarity analysis
Community OTU comparisons were visualised by principal coordinates analysis (PCoA; OTUs at ≥97% similarity) using the Bray-Curtis index (P < 0.05; PERMANOVA; Fig 4). The Bray-Curtis index showed distinct clustering based on location rather than treatment. A dendrogram generated using the Bray-Curtis distance measure and the Ward clustering algorithm likewise showed that OTUs clustered mainly by location rather than by treatment, which is consistent with the PCoA plots (Fig 4). WCA raw and treated water formed their own cluster, while NWC raw and treated water formed their own sub-cluster. However, NWG and NWE raw water formed their own sub-cluster, with the NWE treated water being slightly distinct. Distributed and treated water at NWG was closely related to NWE treated water (Fig 4).
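For readers unfamiliar with the distance measure, the Bray-Curtis dissimilarity between two samples can be sketched as follows. This is a minimal illustration of the standard definition and not the implementation in the software packages used in this study; the function name and the OTU counts are hypothetical.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two non-negative OTU count vectors.

    Returns 0 when the samples are identical and 1 when they share no OTUs.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same OTUs in the same order")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / total

# Hypothetical OTU counts for two samples (not data from this study)
raw_water = [120, 30, 0, 50]
treated_water = [60, 10, 5, 25]
print(round(bray_curtis(raw_water, treated_water), 3))
```

A matrix of such pairwise values is the usual input to both PCoA and hierarchical clustering.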

Taxonomic-to-phenotype mapping of the OTUs
Data were normalised by log transformation, and METAGENassist was used for taxonomic-to-phenotype mapping of the OTUs. The abundance of inferred metabolic pathways is shown in Fig 5. Predicted metabolic pathways for dehalogenation ( 5).

PICRUSt predicted metabolic functions and capacities of the bacterial communities
PICRUSt prediction of metabolic functions was used to gain insight into the role of the different microbial communities in source, treated and drinking water from the four treatment plants. A number of housekeeping pathways were predicted, including carbohydrate metabolism, pyruvate metabolism, sulfur metabolism, purine metabolism, lipid metabolism, pyrimidine metabolism, cysteine and methionine metabolism, energy metabolism, arginine and proline metabolism, metabolism of cofactors and vitamins, amino acid metabolism, carbon fixation pathways in prokaryotes, glycine, serine and threonine metabolism, and amino sugar and nucleotide sugar metabolism. Their distribution is shown in S1 Fig.

Comparative statistical analysis of selected physicochemical properties and microbiome data
The results of the evaluation of water treatment effectiveness among the various water purification plants are summarized in Table 4. There was no significant reduction or increase of TDS after treatment at any of the purification facilities. A significant decrease in turbidity was observed at the NWG and NWE treatment plants, whereas a significant increase in turbidity was observed at the NWC treatment plant. There were no significant changes in turbidity between raw and treated water at the WCA treatment plant. No significant changes were observed for phosphates, apart from the NWG plant, at which a significant reduction was observed after treatment. There were no significant changes in nitrites at any of the treatment plants except for the NWG treatment plant, which showed a significant increase of nitrites after treatment. For nitrates, WCA and NWC showed no significant reduction or increase; however, NWG showed a significant increase and NWE a significant reduction. There was a significant reduction in the number of OTUs from raw water to treated water at all the treatment plants except NWC, at which no significant changes were observed. The total scores indicated which purification facilities were overall more effective at water purification, with a higher score indicating more effective purification. NWE had a total score of 3, whereas WCA and NWG each had a total score of 1. NWC had a total score of -1, showing that it was the least effective.
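The totals reported above can be reproduced with a short tally. This is an illustrative sketch: the per-parameter scores are transcribed from the prose of this section (taking the significant increase at NWG to apply to both nitrites and nitrates), Table 4 is not reproduced here and remains the authoritative source, and the names `PARAMETERS`, `SCORES` and `total_scores` are hypothetical.

```python
PARAMETERS = ("TDS", "turbidity", "phosphates", "nitrites", "nitrates", "OTUs")

# +1 = significant decrease, -1 = significant increase, 0 = no significant change
# (values read from the text of this section; verify against Table 4)
SCORES = {
    "WCA": (0, 0, 0, 0, 0, 1),
    "NWC": (0, -1, 0, 0, 0, 0),
    "NWE": (0, 1, 0, 0, 1, 1),
    "NWG": (0, 1, 1, -1, -1, 1),
}

def total_scores(scores, exclude=()):
    """Sum the per-parameter scores for each plant, optionally skipping some parameters."""
    keep = [i for i, name in enumerate(PARAMETERS) if name not in exclude]
    return {plant: sum(row[i] for i in keep) for plant, row in scores.items()}

totals = total_scores(SCORES)
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)
print(ranking)
```

Calling `total_scores(SCORES, exclude=("OTUs",))` gives the totals without the microbiome term, which is a quick way to check how much the ranking depends on the sequencing data.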

Physicochemical properties of water
In the current study, we combined physicochemical properties with microbiome data to evaluate the water quality of different drinking water production plants. Treatment plants using filtration as part of the treatment process should be able to limit turbidity levels to below 0.5 NTU [16]. Turbidity in water can affect disinfection with chlorine-based chemicals, as microorganisms and pathogens can be shielded from such disinfectants if the turbidity exceeds this limit [9,10]. In the present study, turbidity at NWE and WCA was within the limit of 0.5 NTU. NWC and NWG had turbidity levels that were slightly above 1 NTU; however, based on the South African water quality guidelines, water with such levels of turbidity is still safe to drink, although there is a moderate chance of adverse aesthetic effects and of infectious disease transmission. The NWC treatment plant showed a significant increase in turbidity from raw to treated water (Table 2, Fig 1). This could probably be ascribed to the treatment process at this plant. According to the manager of the system, the rapid sand filtration system operates in such a way that treated water is collected in a sump, where it is not left long enough for the suspended particles to settle. The suspended material probably accounts for the significant increase of turbidity from raw to treated water. TDS concentrations below 600 mg/L in drinking water are considered to be good [29]. WCA was the only DWTP that exceeded this recommended concentration. When TDS concentrations exceed 1000 mg/L, water becomes aesthetically compromised [29]. However, according to the South African National Standards [7], TDS concentrations of 1000 mg/L in drinking water have no likely health effects, even taking into account higher water consumption during very warm climatic conditions.
Thus, TDS at the WCA plant (664-833 mg/L) was in line with the South African National Standards and falls into the category of fair TDS concentrations based on WHO guidelines. Nutrients (nitrates and phosphates) were detected in the water after treatment and in the distribution systems. These compounds are associated with microbial growth in treated water as well as in the distribution systems [30], and they may favour biofilm formation [31]. The NWC treatment plant showed a slightly higher phosphorus concentration in treated water than in raw water. Regardless of this increase, drinking water from all the treatment plants did not exceed the WHO recommended maximum level of 5 mg/L. Nitrogen compounds such as nitrates and nitrites are interchangeable components within water environments [32]. Their levels are often associated with anthropogenic activities, and their origins could be agricultural runoff (fertilizers, pesticides) and urbanization (effluents from municipal and industrial wastewaters) [33]. The source water for all the plants was likely to be impacted by such anthropogenic activities (Table 1). Overall, nitrate concentrations were low at all the DWTPs. In a study by Almdar et al. (2009), nitrate concentrations in the drinking water were low (2.40-2.80 mg/L) while the nitrite concentrations were high [32]. Similar results were observed in the current study. Nitrates and nitrites play an essential role in the maintenance and development of microbial communities [34]. pH and temperature are intrinsically linked to the physicochemical and biological reactions in water. A rise in temperature would generally increase the rates of chemical reactions and the metabolic and growth rates of microorganisms, which can also increase turbidity. However, it does not have direct adverse effects on human health [35].
The normal range of pH for surface waters is 6.5-8.5 [36], which is also the pH range that the Environmental Protection Agency (EPA) recommends municipal water suppliers maintain. In this study, the treated water was within these guidelines.
Low free chlorine concentrations were observed in the distributed water of all the DWTPs. This is similar to the findings of a number of studies, which also suggested that low free chlorine concentrations can allow pathogens to survive through the distribution system [37,38]. Thus, it is important to control and monitor free chlorine concentrations regularly within DWTPs. The physicochemical conditions (nutrients, suspended solids, pH and temperature) were such that an active microbial population could be sustained. Variations in the levels of these and other parameters associated with anthropogenic activities could impact the community composition of the aquatic systems.

16S rRNA gene profiling
Taxonomic profile analysis indicated that treatment of source water significantly influences the microbial structure of treated water. This was mainly indicated by a marked decrease in the number of bacterial phyla in treated water in comparison to raw water (Fig 2). During drinking water treatment, a number of disinfectants such as chlorine, monochloramine and ozone are used to eliminate pathogenic microorganisms. Although these treatments are largely effective, some microbes can survive and proliferate in the drinking water system. A number of studies have indicated the presence of diverse microbes in drinking water distribution systems [39][40][41]. In our study, all treatment plants used chlorination during the disinfection step; in addition, the NWG treatment plant applies both chlorination and ozonation (Table 1). In this case, ozonation is part of the treatment options, particularly to oxidize manganese [42], and is not a disinfection step. Fig 2 indicates that some of the microbes survived the treatment process and could be found in the treated water and the distribution system, which is consistent with previous studies [39,41]. In the present study, treated water was also dominated by Proteobacteria and Planctomycetes. At the NWG and NWE treatment plants, end user water was also sampled, and Proteobacteria and Firmicutes, respectively, dominated at these plants.
Drinking water sources play an important role in the overall composition of final drinking water [43]. This was demonstrated for the WCA and NWC treatment plants, where both the PCoA and the dendrogram showed that the microbial community in treated water was more similar to that of the source water (Fig 4). NWE and NWG raw water clustered together, showing similarities in the microbial communities of their source water, which was also supported by the absence of significant differences between the phyla from these treatment plants (Fig 3). Their treated water also clustered together, in agreement with the view that source water shapes the microbial community of the treated water regardless of the treatment process. Variation of bacterial communities in source water has been shown to be a function of land use and water quality [44]. This was true for all the treatment plants, particularly NWE and NWG, which clustered together (Fig 3). Similar anthropogenic activities (urbanisation, mining, agriculture and informal sectors; Table 1) are likely to impact the source waters of the two treatment plants. Moreover, the physicochemical properties of their raw water were not very distinct (Tables 2 and 3). Although WCA and NWC are both likely to be impacted by agriculture, the physicochemical parameters of their raw water were significantly different, accounting for the variation in the microbial communities (Tables 1-3).
From the OTUs in treated and distributed water, we detected signatures of potentially pathogenic bacteria, which included Acinetobacter, Clostridium, Legionella, Serratia, Pseudomonas and Tatlockia. Some of the signatures identified up to species level are shown in S1 Table. L. pneumophila was the only species that is included in the US EPA list of bacteria of concern in water [45], and it is a leading cause of pneumonia worldwide [46]. However, only the NWC treated water had L. pneumophila OTUs (four in total). This number may be too low to cause illness, as ingestion of about 6.9 × 10¹ to 3.8 × 10² per single event of 1 litre consumption is associated with a 1 in 10,000 risk [47]. The genus Clostridium includes several significant pathogens. In Finland, a gastroenteritis outbreak resulting from distributed water contaminated with Clostridium difficile has been reported [48]. The genus Serratia was present in NWC and NWG treated water, each having one OTU identified as S. spp. In addition, NWG had ten OTUs classified as S. marcescens, which is a well-known opportunistic pathogen. S. marcescens has been associated with urinary tract infections and catheter-associated bacteraemia [49]. Pseudomonas spp. and the other potential pathogens were reported in waterborne outbreaks in the United States between 2007 and 2008 [50]. However, our results should be interpreted with caution, as pathogens are known to harbour strain-specific virulence factors; thus, quantification of pathogenic taxa based on the occurrence of a biomarker such as the 16S rRNA gene may not correlate with public health risk [51].
A number of sequences retrieved from the predicted metagenomes were associated with bacterial groups or genes that are of concern when it comes to public health. The NWC treatment plant had a significantly (P < 0.05) higher proportion of predicted pathways associated with shigellosis, pathogenic Escherichia coli infection and Vibrio cholerae infection (S2 Fig). These were also present at all the other treatment plants, particularly in raw water. Although treated water had a significantly lower proportion of these functional categories, their presence in source water should serve as a warning of the potential hazards. A study by Probert et al. (2017) provided evidence of contaminated stream water as a source of Escherichia coli O157-related illness in children [52]. Predicted metagenome analyses also indicated the presence of beta-lactam resistance at all the treatment plants and in all water compartments (S2 Fig). The World Health Organization (WHO) has listed antibiotic resistance as a great threat to human health and recently launched an action plan on antimicrobial resistance (AMR). Knowledge of the spread and distribution of AMR through research is one of the main objectives of this action plan. Antibiotic-resistant bacteria make the treatment of community-acquired infections very challenging, and their presence in drinking water is a cause for concern. Other studies have also detected beta-lactam-resistant bacteria in drinking water, among other forms of resistance [53,54].
Cyanobacterial species produce cyanotoxins, which include microcystin, anatoxin and cylindrospermopsin [55]; thus, their presence, especially in drinking water, is highly undesirable. Treated water at the NWC, NWE and NWG treatment plants had relative abundances of Cyanobacteria (Cyanobacteria-like sequences) of 3.83, 9.83 and 28.32%, respectively (Fig 2). Due to their similarity to chloroplast rRNA gene sequences, it is difficult to correctly classify cyanobacteria using 16S rRNA sequencing [56]. However, other studies [40,43] also detected cyanobacteria in drinking water using 16S rRNA gene clone libraries. Thus, the presence of Cyanobacteria-like sequences in drinking water in this study also echoes the presence of cyanobacteria in the drinking water distribution system. Urbanisation, agriculture, mining and other anthropogenic activities have been reported to contaminate source water with a number of xenobiotics [57]. In the current study, one or more of these activities were likely to impact the source water (Table 1). Predicted metagenomes obtained using PICRUSt revealed pathways for the metabolism of atrazine and DDT and for polycyclic aromatic hydrocarbon degradation (S2 Fig). Taxonomic-to-phenotype mapping of the OTUs using METAGENassist also predicted the presence of atrazine metabolism, degradation of aromatic hydrocarbons, naphthalene degradation, methane oxidation and chlorophenol degradation (Fig 5). Atrazine has recently been linked to pre-term birth effects [58], endocrine disruption, cancer and reproductive complications [59]. The predicted presence of atrazine metabolism in treated water in this study (Fig 5) should serve as a warning sign of the potential hazards posed to drinking water by agricultural activities near source water. Furthermore, functional analysis indicated that basic microbial metabolism did not vary considerably between treated and source water (Fig 6 and S1 Fig).
The different treatment plants showed varying trends in the abundance of basic metabolism between treated and raw water. This suggests that the treatment process might not greatly affect some of the basic cellular processes essential to bacteria, though some stress-related genes might be upregulated [53]. Taxonomic-to-phenotype mapping revealed complex metabolic pathways (Fig 5), which include carbon fixation, chitin degradation, chlorophenol degradation and atrazine metabolism, amongst others. These pathways indicate the key biogeochemical processes in source and treated water and could perhaps serve as an indicator of in situ biodegradation potential in source and treated water. The question arises whether the presence of these pathways can be exploited to accelerate pollutant clean-up [60].
A scoring system based on significant changes from raw to treated water was used to establish the capabilities of the various water purification facilities to reduce physicochemical parameters and microbial abundance (Table 4). We propose that this approach could be used in future studies investigating the effectiveness of drinking water treatment plants in reducing substances in their raw water. Such an approach was lacking in some previous studies. One previous study [61] evaluated the removal of natural organic matter (NOM) by South African water treatment plants. Although the authors could compare the treatment plants for their ability to remove NOM, no tests of statistical significance were used in the comparisons. Another study [62] also compared the reduction of NOM during water purification at various plants, but a similar lack of statistics was evident. Water purification plants are not always efficient at removing all dissolved water constituents. The statistical method used in this study makes it possible not only to establish whether reductions or increases were significant for a specific DWTP, but also to compare various plants. More importantly, the system allows for combined evaluation of physicochemical parameters and microbiome data.
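The logic of such a scoring system can be illustrated with a minimal Python sketch. It is not the exact procedure behind Table 4. The rule shown here (+1 for a significant reduction, -1 for a significant increase, 0 otherwise) is an assumption made for illustration, as are the names `Change`, `score_change` and `plant_score`. The plant labels and all numeric values are invented placeholders, not data from this study. Treating every significant reduction as an improvement is also a simplification, since for some parameters (pH, for example) a lower value is not necessarily better.

```python
from dataclasses import dataclass

ALPHA = 0.05  # significance threshold (P <= 0.05), as used in the text


@dataclass(frozen=True)
class Change:
    """One raw-versus-treated comparison for a single variable at one plant."""
    variable: str        # a physicochemical parameter or a bacterial taxon
    kind: str            # "physicochemical" or "microbial"
    raw_mean: float      # mean value in raw (source) water
    treated_mean: float  # mean value in treated water
    p_value: float       # from the test comparing raw with treated water


def score_change(change: Change, alpha: float = ALPHA) -> int:
    """Assumed rule: +1 significant reduction, -1 significant increase, else 0."""
    if change.p_value > alpha or change.treated_mean == change.raw_mean:
        return 0
    return 1 if change.treated_mean < change.raw_mean else -1


def plant_score(changes, kinds=("physicochemical", "microbial")) -> int:
    """Sum the scores of all comparisons whose kind is listed in `kinds`."""
    return sum(score_change(c) for c in changes if c.kind in kinds)


# Invented placeholder values, for illustration only.
plants = {
    "plant_A": [
        Change("turbidity", "physicochemical", 12.0, 0.8, 0.01),
        Change("nitrite", "physicochemical", 0.30, 0.28, 0.40),
        Change("taxon_X", "microbial", 15.0, 2.0, 0.03),
    ],
    "plant_B": [
        Change("turbidity", "physicochemical", 10.0, 1.0, 0.02),
        Change("phosphorus", "physicochemical", 0.1, 0.3, 0.04),
        Change("taxon_X", "microbial", 15.0, 14.0, 0.60),
    ],
}

combined = {name: plant_score(ch) for name, ch in plants.items()}
physchem_only = {
    name: plant_score(ch, kinds=("physicochemical",))
    for name, ch in plants.items()
}

print("combined scores:", combined)
print("physicochemical only:", physchem_only)
```

Because each comparison contributes a single signed point, the physicochemical and microbial components can be summed together or separately, so the plants can be ranked with and without the microbiome data under the same rule.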
The application of Next Generation Sequencing (NGS) for microbial detection has a number of limitations. NGS-based methods cannot differentiate between viable and dead bacterial cells; thus, in disinfected water, they may have poor comparability to culture-based methods [63,64]. To overcome such limitations, it is crucial that NGS methods are combined with culture-based methods, which can provide an extra dimension of cell viability. Kishor et al. (2019) showed that combining NGS methods with conventional methods is an effective way to evaluate water quality, because each method compensates for the limitations of the other [65]. The presence of highly conserved 16S rRNA genes in some families and genera could lead to limited taxonomic resolution [66]. To circumvent such limitations, NGS methods can be complemented with species-specific methods such as qPCR. Despite these limitations, NGS studies provide insights into microbial community structure that no other methods can provide. In the current study, removing the OTUs from the scoring system (Table 4) does not change the results for the effectiveness of the water treatment plants. This suggests that even if the NGS data contain false positives, they did not significantly influence the results of this study.

Conclusions
This study gives integrated insights into the microbiome and quality of source, treated and distributed water, allowing observations of microbially mediated processes. At the same time, it evaluates the efficacy of the water treatment processes used and provides warning of potentially looming hazards. It also adds to the baseline for monitoring perturbations in source and drinking water microbiomes, which will be essential for establishing effective water treatment methods in the future. However, it is important to consider that dead but intact cells, as well as free environmental DNA, may have affected the microbiome results, especially after water treatment. Even so, the data demonstrate that raw water quality is intertwined with the quality of the final produced water and, further, that it also affects the microbiome of the drinking water. We devised a method that combines physicochemical properties and microbiome data to evaluate the efficacy of various water treatment plants. This method could be applied in future studies, and it will be important to also include outgroups, such as highly contaminated or pure water, so as to evaluate the method.
Supporting information S1