Metagenomic Identification of Bacterioplankton Taxa and Pathways Involved in Microcystin Degradation in Lake Erie

Cyanobacterial harmful blooms (CyanoHABs) that produce microcystins are appearing in an increasing number of freshwater ecosystems worldwide, damaging the quality of water for use by humans and aquatic life. Heterotrophic bacterial assemblages are thought to be important in transforming and detoxifying microcystins in natural environments. However, little is known about their taxonomic composition or the pathways involved in the process. To address this knowledge gap, we compared the metagenomes of Lake Erie free-living bacterioplankton assemblages in laboratory microcosms amended with microcystins relative to unamended controls. A diverse array of bacterial phyla were responsive to an elevated supply of microcystins, including Acidobacteria, Actinobacteria, Bacteroidetes, Planctomycetes, Proteobacteria of the alpha, beta, gamma, delta and epsilon subdivisions and Verrucomicrobia. At more detailed taxonomic levels, Methylophilales (mainly in genus Methylotenera) and Burkholderiales (mainly in genera Bordetella, Burkholderia, Cupriavidus, Polaromonas, Ralstonia, Polynucleobacter and Variovorax) of Betaproteobacteria were suggested to be more important in microcystin degradation than Sphingomonadales of Alphaproteobacteria. The latter taxa were previously thought to be major microcystin degraders. Homologs to known microcystin-degrading genes (mlr) were not overrepresented in microcystin-amended metagenomes, indicating that Lake Erie bacterioplankton might employ alternative genes and/or pathways in microcystin degradation. Genes for xenobiotic metabolism were overrepresented in microcystin-amended microcosms, suggesting they are important in bacterial degradation of microcystin, a phenomenon that has been identified previously only in eukaryotic systems.


Introduction
Freshwater lakes are ecologically important and a major source of drinking water; thus maintaining and improving water quality in lakes is critical. Cyanobacterial (blue-green algal) harmful blooms (CyanoHABs) threaten water quality, and the frequency and extent of these blooms are increasing worldwide [1]. One important harmful effect of CyanoHABs is the production of cyanotoxins, such as microcystins, which are strongly hepatotoxic and can severely damage mammalian liver cells. Microcystins (MCs) are produced by several bloom-forming cyanobacteria that are common in freshwater lakes, including Microcystis, Anabaena, Planktothrix and Nostoc [2].
There are over 80 chemical variants of MCs, which all share a cyclic structure consisting of five constant non-protein amino acids and two variable protein amino acids [3]. Microcystin-LR (MC-LR) is the most abundant and well-studied form of MCs and contains leucine (Leu or L) and arginine (Arg or R) in the two variable positions (Figure 1). Due to its cyclic structure, MC-LR is chemically stable under the environmental range of pH, light radiation and temperature [4]. Heterotrophic bacterial assemblages are thought to be the major agents that regulate MC degradation in lakes [5], [6], estuaries [7] and water treatment units [8].
Most previous studies on MC-degrading bacteria are culture-based, and many of them were conducted in artificial environments, such as water treatment units [8], [9]. These studies have suggested that MC-degrading assemblages consist mainly of a narrow group of Alphaproteobacteria in the order Sphingomonadales. However, indirect evidence from studies on bacteria associated with MC-producing CyanoHABs has suggested that a much broader range of bacterial taxa may be involved in MC degradation [10], [11]. Direct studies on the in situ taxonomic composition of MC-degrading assemblages are scarce.
To date, a single pathway has been identified in bacterial systems for MC-LR degradation. This cleavage pathway is encoded by a cluster of genes (mlrABCD) and has been identified in all MC-degrading Sphingomonas species and several other strains of Gammaproteobacteria and non-Sphingomonas Alphaproteobacteria [3], [9], [12]. However, the specificity and ubiquity of mlr among environmental MC-degrading bacteria remain unclear [13]. This study aims to identify taxa, genes and pathways involved in microbially mediated MC transformation, using a comparative metagenomic approach on free-living bacterial assemblages from Lake Erie. Our results suggest that diverse taxa of free-living bacterioplankton, especially members of Methylophilales and Burkholderiales, might be important in MC degradation and that they likely employ different pathways from the mlr-based cleavage.

Sample Collection and Nutrient Analysis
Surface water samples were collected in carboys from the Western Basin of Lake Erie (Latitude 41.7423, Longitude −83.4019; Station MB18) on Aug. 27th, 2010, where CyanoHABs were reported throughout the summer, including at the time of this sampling trip (NOAA, Harmful Algal Bloom Events Response; http://www.glerl.noaa.gov). Before use, the carboys were acid washed in the lab and rinsed with ambient lake water three times. Standard limnological data were collected using a YSI 6600 Water Quality Sonde and included temperature, dissolved oxygen concentration, pH and turbidity (Table S1). The Secchi depth was also measured at the time of sampling. Water samples for nutrient analyses were filtered through 0.2 µm-pore-size membrane filters (Pall Life Sciences, Port Washington, NY) and stored on ice or at 4°C. Concentrations of nutrients, including dissolved organic carbon, total dissolved nitrogen, soluble reactive phosphorus, nitrate/nitrite and ammonium, were measured using standard methods for water quality analyses [14] and are reported in Table S1.

Microcosm Setup and Incubation
Lake water was filtered through 1.0 µm-pore-size membrane filters (Pall Life Sciences, Port Washington, NY) immediately after sampling to obtain the free-living bacterioplankton fraction and to exclude bacterivores and other large particles. Filtrate was collected in carboys, amended with a mixture of inorganic nitrogen and phosphorus compounds (5 µM NH₄Cl, 5 µM NaNO₃ and 1 µM NaH₂PO₄, final concentrations) and incubated in the dark at room temperature (22±1°C) for 7 days. The water was agitated every 4-12 hours by shaking the carboys by hand. This pre-incubation was done to allow the bacteria to consume labile dissolved organic carbon compounds and to become growth limited by carbon availability.
At the end of the pre-incubation, microcosms were set up in six 20 L carboys. Two microcosms, designated as MC-1 and MC-2, were constructed of pre-incubated lake water and amended with MC-LR (~15 µg L⁻¹, final concentration; Axxora LLC, Farmingdale, NY). Two microcosms, designated as CT-1 and CT-2, served as controls and were constructed of pre-incubated lake water without further amendments. The remaining two microcosms were designated as FW-MC-1 and FW-MC-2; these received pre-incubated lake water that was in turn filtered by passage through 0.2 µm-pore-size membrane filters to remove most of the bacterial cells and then amended with MC-LR (~15 µg L⁻¹, final concentration). The final volume of each microcosm was 18 L.
Microcosms were incubated in the dark at room temperature for a total of 48 hours and agitated every 4-12 hours by shaking the carboys by hand. Samples (10 ml) were taken in triplicate from each microcosm after 0, 12, 24 and 48 hours of incubation for subsequent MC-LR concentration measurement and flow cytometric analysis.
All plasticware was acid washed and then rinsed with sample water three times before use. All glassware was ashed at 500°C for 5 hours and then rinsed with sample water three times before use.

Microcystin Concentration Measurement
Samples collected as above were filtered through 0.2 µm-pore-size filters. MC-LR concentration in the filtrate was measured using the Microcystins/Nodularins (ADDA) ELISA Kit (Abraxis BioScience, Warminster, PA) following the manufacturer's instructions. Technical duplicates were measured for each sample.

Flow Cytometric Analysis
Flow-cytometric analysis (FCM) was performed with a FACSAria (BD, Franklin Lakes, NJ) to measure the abundance, size and metabolic activity of bacterioplankton in the microcosms. Before FCM analysis, samples were preserved with 1% (final concentration) freshly made paraformaldehyde at room temperature for 2 hours. Preserved cells were stained with Sybr Green II (1:5,000 dilution of the commercial stock; Molecular Probes Inc.) in the dark at room temperature for 20 min and mixed with an internal standard of beads of known density (1 µm-diameter Fluoresbrite YG Microspheres; Polysciences, Warrington, PA). FCM data acquisition was triggered by the green fluorescence intensity (GFI) of Sybr Green II staining. All FCM signals were collected on a logarithmic scale. Bacterial cell numbers were calculated from the ratio between the counts of bacterial cells and the counts of the internal bead standard.
FCM populations were defined based on GFI and side scatter (SSC) using a procedure described previously [15]. FCM population notation was based on the value of GFI from Sybr Green II staining, a proxy for intracellular nucleic acid content (largely RNA), which was taken as a surrogate indicator of cell activity [16]. Two FCM populations were gated for each sample: one was designated as "high intensity cells" (HI) and the other as "low intensity cells" (LI). HI and LI populations thus corresponded to cells with higher and lower activity, respectively. Technical duplicates were analyzed for each sample.
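
The bead-ratio calculation described above can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the function names, the event counts and the bead concentration are hypothetical, and the optional dilution factor is an assumption, since the text does not state one.

```python
def cells_per_ml(cell_events, bead_events, beads_per_ml, dilution_factor=1.0):
    """Estimate cell density from the ratio of cell events to bead events.

    beads_per_ml is the known bead density in the analyzed sample.
    dilution_factor (an assumption, default 1) corrects for any dilution
    by fixative, stain or bead stock.
    """
    if bead_events <= 0:
        raise ValueError("bead_events must be positive")
    return cell_events / bead_events * beads_per_ml * dilution_factor


def population_fractions(hi_events, li_events):
    """Return the relative abundance of the HI and LI gates."""
    total = hi_events + li_events
    if total == 0:
        raise ValueError("no gated events")
    return hi_events / total, li_events / total


# Hypothetical counts, for illustration only.
density = cells_per_ml(cell_events=13000, bead_events=2000, beads_per_ml=1.0e5)
hi_frac, li_frac = population_fractions(hi_events=1859, li_events=11141)
```

Keeping the gate fractions separate from the density estimate makes it easy to report HI and LI either as percentages of total cells or as absolute densities.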

Bacterial Growth Rate Estimation
Bacterial growth rate ($\mu$) during the incubation experiment was calculated using a linear regression formula: $\mu = (\ln N_t - \ln N_0)/t$, where $t$ is the incubation time at the time of sampling, and $N_0$ and $N_t$ are the bacterial abundances at the start of the incubation (0 hour) and at the time of sampling ($t$), respectively.
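
As a worked illustration of the growth rate formula, the short sketch below computes the specific growth rate and the corresponding doubling time. The function names and the example abundances are hypothetical and were chosen only to show the arithmetic; they are not measurements from this study.

```python
import math


def growth_rate(n0, nt, t_days):
    """Specific growth rate (per day) from abundances at time 0 and time t."""
    if n0 <= 0 or nt <= 0:
        raise ValueError("abundances must be positive")
    if t_days <= 0:
        raise ValueError("t_days must be positive")
    return (math.log(nt) - math.log(n0)) / t_days


def doubling_time(mu):
    """Doubling time (days) implied by a positive growth rate."""
    if mu <= 0:
        raise ValueError("doubling time is only defined for mu > 0")
    return math.log(2) / mu


# Hypothetical abundances (cells per ml) after 12 hours (0.5 day).
mu = growth_rate(n0=6.5e5, nt=1.18e6, t_days=0.5)
```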

DNA Extraction
For PCR amplification, bacterial cells in 1 L water samples were collected from each microcosm and filtered onto 47 mm-diameter, 0.2 µm-pore-size membrane filters (Pall Life Sciences, Port Washington, NY). Filters were changed after approximately every 500 ml of water filtered.
For metagenomic analysis, bacterial cells in ~17 L water samples were collected from each microcosm and filtered onto 142 mm-diameter, 0.2 µm-pore-size membrane filters (Pall Life Sciences). Filters were changed after approximately every 9 L of water filtered. DNA was extracted from the filters using a PowerMax Soil DNA Isolation Kit (Mobio Inc, Carlsbad, CA) and served as template for PCR amplification and metagenomic sequencing.

16S rRNA Gene Amplification and T-RFLP Analysis
16S rRNA gene amplification and terminal restriction fragment length polymorphism (T-RFLP) analysis were performed following a protocol described previously, with minor modifications [15]. Briefly, PCR was carried out with Illustra PuRe Taq Ready-to-go PCR beads (GE Healthcare, Piscataway, NJ) using 0.4 µM of 6-carboxyfluorescein (FAM) labeled 8F (5′-FAM-AGAGT TTGAT CCTGG CTCAG-3′) and unlabeled 1492R (5′-TACGG YTACC TTGTT ACGAC TT-3′) primers. A touchdown PCR program was used with the annealing temperature sequentially decreasing from 62°C to 52°C by 1°C per cycle, followed by 15 cycles at 52°C. Each PCR cycle included denaturing (at 95°C for 50 s), annealing (at 62 to 52°C for 50 s) and extension (at 72°C for 50 s) steps. An initial 3-min denaturation and a final 7-min extension step were also included. For each sample, triplicate PCR amplifications were performed and the resulting amplicons were pooled before being examined on ethidium bromide-stained 1% agarose gels.
FAM-labeled PCR amplicons were purified with QIAquick gel extraction kits (QIAGEN, Valencia, CA) and then digested with the CfoI restriction enzyme (Roche Applied Science, Indianapolis, IN) at 37°C for 3 hours. Afterwards, the digestion products were purified using ethanol precipitation. The length and relative abundance of each terminal restriction fragment (T-RF) were determined using a 3730 DNA Analyzer (Applied Biosystems) at the Plant-Microbe Genomic Facility, Ohio State University, Columbus, OH.
T-RFLP profiles among bacterial FCM populations were quantitatively compared by hierarchical cluster analysis in the Primer v5 program (Primer-E Ltd, Plymouth, United Kingdom). The relative peak area of each T-RF in the T-RFLP output was used as a proxy for the relative abundance of the bacterial taxa associated with that T-RF peak. The relative peak areas were square-root transformed before analysis. T-RFs with <2% relative peak area were excluded from the analysis.
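
The profile preprocessing described above (relative peak areas, exclusion of T-RFs below 2%, square-root transformation) can be sketched as below. This is an illustration under stated assumptions, not the Primer v5 workflow itself: the function names and input layout are hypothetical, peaks are not renormalized after exclusion, and Bray-Curtis similarity is assumed as the resemblance measure because the text does not name one.

```python
import math


def preprocess_trflp(peak_areas, min_fraction=0.02):
    """Convert raw peak areas to square-root transformed relative areas.

    peak_areas maps T-RF length (bp) to raw peak area. T-RFs whose
    relative area is below min_fraction are dropped before the transform.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("profile has no signal")
    relative = {trf: area / total for trf, area in peak_areas.items()}
    return {trf: math.sqrt(frac)
            for trf, frac in relative.items() if frac >= min_fraction}


def bray_curtis_similarity(profile_a, profile_b):
    """Bray-Curtis similarity (0 to 1) between two preprocessed profiles."""
    trfs = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in trfs)
    both = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in trfs)
    if both == 0:
        raise ValueError("both profiles are empty")
    return 1.0 - diff / both


# Hypothetical raw peak areas keyed by fragment length (bp).
profile = preprocess_trflp({100: 50.0, 200: 49.0, 300: 1.0})
```

The resulting pairwise similarities would then feed a hierarchical clustering routine of the reader's choice.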

Metagenomic Sequencing and Sequence Annotation
Genomic DNA of metagenomes collected at the end of microcosm incubations, i.e., 48 hours after the MC-LR was added, was sequenced together by one full plate run of 454 multiplex pyrosequencing with titanium chemistry at the Georgia Genomics Facility, University of Georgia, Athens, GA. The metagenomic sequences were deposited in the CAMERA database under the project CAM_P_0000956.
Low quality reads (<200 bp or Phred quality scores <20) were removed from the metagenomic library. Identical reads that were generated as artifacts during pyrosequencing [17] were also removed using CD-HIT-454 [18]. Remaining sequences were analyzed by BLASTn against the RDPII database to identify putative rRNA gene sequences (cutoff value of E < 10⁻⁵). The taxonomic affiliation of each putative rRNA gene sequence was assigned based on the best BLASTn hit against the Greengenes database [19], using an E value < 10⁻¹⁰ and identity > 85%. The taxonomic annotation was further confirmed with the RDP taxonomy classifier (>80% confidence).
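
The read-filtering step can be illustrated with the sketch below. It is a simplified stand-in, not the pipeline used in the study: the function names and the tuple layout are hypothetical, the quality criterion is assumed to apply to the mean Phred score of a read, and the exact-match duplicate check is a simplification that does not reproduce the CD-HIT-454 algorithm.

```python
def mean_phred(qualities):
    """Mean Phred score of one read."""
    return sum(qualities) / len(qualities)


def filter_reads(reads, min_length=200, min_mean_quality=20):
    """Drop short, low-quality and exactly duplicated reads.

    reads is an iterable of (read_id, sequence, qualities) tuples.
    The first copy of each distinct sequence is kept.
    """
    seen = set()
    kept = []
    for read_id, sequence, qualities in reads:
        if len(sequence) < min_length:
            continue  # shorter than the length cutoff
        if not qualities or mean_phred(qualities) < min_mean_quality:
            continue  # below the (assumed mean) quality cutoff
        key = sequence.upper()
        if key in seen:
            continue  # exact duplicate of an earlier read
        seen.add(key)
        kept.append((read_id, sequence, qualities))
    return kept


# Hypothetical reads, for illustration only.
example_reads = [
    ("a", "A" * 200, [30] * 200),
    ("b", "A" * 200, [30] * 200),  # duplicate of "a"
    ("c", "C" * 199, [30] * 199),  # too short
    ("d", "G" * 250, [10] * 250),  # low quality
    ("e", "T" * 300, [25] * 300),
]
kept_ids = [read_id for read_id, _, _ in filter_reads(example_reads)]
```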
Putative protein-coding sequences were identified from non-rRNA sequences by BLASTx against the NCBI RefSeq protein database (E ≤ 0.01, identity ≥ 40% and overlapping length ≥ 65 nt) [20]. The protein-encoding sequences were further categorized into Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways by BLASTx against the NCBI COG database and the KEGG databases (E ≤ 0.1, similarity ≥ 40% and overlapping length ≥ 65 nt). The taxonomic affiliations were obtained by BLASTx against the NCBI RefSeq database using the MEGAN program [21]. Sequences that did not meet any of the criteria for rRNA or functional genes were excluded from further analysis.
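
The hit-selection criteria above can be expressed as a small filter. The sketch below assumes that BLAST results have already been parsed into dictionaries; the field names (`query`, `subject`, `identity`, `length`, `evalue`, `bitscore`) and the function name are hypothetical. Choosing the highest bit score as the best hit is also an assumption, and the length threshold is applied in whatever unit the parsed `length` field uses, so that unit should be checked against the criterion in the text.

```python
def best_hits(hits, max_evalue=0.01, min_identity=40.0, min_length=65):
    """Keep the highest-scoring hit per query among hits passing all cutoffs."""
    best = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue
        if hit["identity"] < min_identity:
            continue
        if hit["length"] < min_length:
            continue
        current = best.get(hit["query"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["query"]] = hit
    return best


# Hypothetical parsed hits, for illustration only.
example_hits = [
    {"query": "q1", "subject": "s1", "identity": 55.0, "length": 90,
     "evalue": 1e-8, "bitscore": 80.0},
    {"query": "q1", "subject": "s2", "identity": 70.0, "length": 110,
     "evalue": 1e-20, "bitscore": 120.0},
    {"query": "q2", "subject": "s3", "identity": 60.0, "length": 100,
     "evalue": 0.5, "bitscore": 30.0},
    {"query": "q3", "subject": "s4", "identity": 35.0, "length": 100,
     "evalue": 1e-6, "bitscore": 60.0},
    {"query": "q4", "subject": "s5", "identity": 80.0, "length": 60,
     "evalue": 1e-6, "bitscore": 65.0},
]
selected = best_hits(example_hits)
```

Passing `max_evalue=0.1` would apply the looser E value cutoff given for the COG and KEGG searches.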

Functional Gene Identification
Putative mlrABCD, a cluster of genes that encode MC degradation in bacterial systems, were identified in the metagenomic libraries using tBLASTx with a bit score cutoff of 50. Putative glutathione S-transferase (GST) genes, key genes in xenobiotic metabolism, were identified using BLASTx with a bit score cutoff of 50. Accession numbers for reference gene sequences are provided in Table S2.

Shannon-Wiener Index
The taxonomic diversity of microbes was estimated at the order level using the formula $H' = -\sum_{i=1}^{R} P_i \ln P_i$, where $P_i$ is the relative abundance of the sequences belonging to the $i$th microbial order and $R$ is the total number of unique orders.
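
A minimal sketch of the Shannon-Wiener calculation from per-order sequence counts is given below. The function names and example counts are hypothetical. The evenness helper uses Pielou's index ($J' = H'/\ln R$), which is an assumption: the text refers to evenness but does not say how it was quantified.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H' from per-order sequence counts."""
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    if total <= 0:
        raise ValueError("no sequences")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(R); defined as 0 for a single order."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)


# Hypothetical counts: one dominant order versus an even assemblage.
h_dominated = shannon_index([530, 180, 90, 50, 50, 50, 50])
h_even = shannon_index([143, 143, 143, 143, 143, 143, 142])
```

With equal richness, the dominated assemblage yields the lower index, which is the pattern the index is used to detect here.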

Statistical Analyses
A Student's t test for two samples of unequal variance was performed to compare total bacterial abundance, relative abundance of each FCM population and MC-LR loss between the MC and CT microcosms.
A t test with Bonferroni correction for two samples of unequal variance [22] was used to compare the relative abundance of bacterial taxa at two levels, i.e., between the within-treatment metagenome replicates (MC1 vs. MC2 and CT1 vs. CT2) and between the pooled metagenomes of different treatments (MCs vs. CTs). Significant differences between MC and CT microcosms were reported at P < 0.05 with Bonferroni correction. Taxa with significant within-treatment differences were removed from the final list of responsive taxa.
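
For readers unfamiliar with the unequal-variance (Welch) t test, the sketch below computes the test statistic, the Welch-Satterthwaite degrees of freedom and a Bonferroni-adjusted significance threshold. It is an illustration only: the function names, the example values and the number of comparisons are hypothetical, and converting the statistic to a P value is left to a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's t test."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    se2 = va + vb
    if se2 == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2)
    df = se2 ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical relative abundances (%) of one taxon in replicate libraries.
t_stat, dof = welch_t([52.1, 54.3], [9.2, 10.0])
threshold = bonferroni_alpha(0.05, n_tests=22)  # 22 comparisons is assumed
```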
The Xipe-TOTEC program, a statistical method that has been specifically developed to compare metagenomes [23], was used to identify overrepresented COGs and KEGGs in the MC metagenomes. Pair-wise comparisons were performed between within-treatment metagenome replicates and between the pooled metagenomes of different treatments, based on the occurrence of gene categories. In each comparison, a total of 20,000 resamplings were made, with the sample size equal to the average number of sequences in the two metagenome sequence libraries being compared. Significant differences between the metagenome datasets were reported at the level of P < 0.02, after removing those gene categories that had significant within-treatment differences.
The results of Xipe-TOTEC are affected by sample size, i.e., the number of sequences in randomly formed pools [24], and by the copy number of target genes [20]. Therefore, changes in the relative abundance of gene categories were also assessed by a statistical analysis that is free of these concerns, based on calculating the odds ratio (OR) and binomial distribution probabilities [25]. OR was calculated using the equation $\mathrm{OR} = [n_{mc}/(N_{mc} - n_{mc})]/[n_{ct}/(N_{ct} - n_{ct})]$, where $n_{mc}$ and $n_{ct}$ are the numbers of targeted gene sequences in the pooled MC and CT metagenomes, respectively, and $N_{mc}$ and $N_{ct}$ are the total numbers of sequences in the pooled MC and CT metagenomes, respectively. A binomial distribution of genes was assumed in each metagenomic sequence library. The binomial distribution probability (P) was calculated in Microsoft Excel, using $n_{mc}/(N_{mc} - n_{mc})$ as the observed gene sequence frequency and $n_{ct}/(N_{ct} - n_{ct})$ as the expected gene sequence frequency. Genes or gene groups were reported as significantly overrepresented in MC metagenomes when the corresponding OR > 1 and P < 0.02. Genes or gene groups that had significant within-treatment differences were removed from the final list of overrepresented gene categories.
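
The odds ratio and binomial probability can be sketched as below. This is not the Excel calculation used in the study: the function names and the example counts are hypothetical, and the binomial test is parameterized under an assumption, namely that P is the upper-tail probability of observing at least $n_{mc}$ target sequences in $N_{mc}$ reads when the per-read probability equals the CT proportion $n_{ct}/N_{ct}$.

```python
import math


def odds_ratio(n_mc, total_mc, n_ct, total_ct):
    """Odds of a target gene in the MC library relative to the CT library."""
    if n_ct <= 0 or total_ct <= n_ct or total_mc <= n_mc:
        raise ValueError("counts must give finite, non-zero odds")
    odds_mc = n_mc / (total_mc - n_mc)
    odds_ct = n_ct / (total_ct - n_ct)
    return odds_mc / odds_ct


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Hypothetical counts: 54 target reads of 10,000 (MC) vs 24 of 10,000 (CT).
ratio = odds_ratio(54, 10000, 24, 10000)
p_value = binomial_upper_tail(54, 10000, 24 / 10000)
overrepresented = ratio > 1 and p_value < 0.02
```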

Response of Bacterial Assemblages to Microcystin
Added MC-LR was consumed rapidly in microcosms with a pre-established carbon-limited condition (Figure 2A). Within 12 hours of incubation, over 75% of MC-LR was lost, and MC-LR became nearly undetectable after 24 hours in the MC microcosms. In contrast, MC-LR that was added to microcosms with filter-sterilized lake water (FW-MC) remained untransformed throughout the incubation (t test, P < 0.05).
Concomitant with MC-LR consumption, the number of bacteria in the MC microcosms increased significantly. Bacterioplankton in the MC microcosms nearly doubled within 12 hours following addition of MC-LR (growth rate µ = 1.2 day⁻¹) and the cell density reached 2.2 × 10⁶ cells ml⁻¹ after 48 hours of incubation (~3.4-fold increase from the initial cell density). Meanwhile, cell density in the control microcosms (CTs) was unchanged at 6.1 × 10⁵ cells ml⁻¹ (t test, P < 0.05; Figure 2B).
MC-LR addition also led to compositional differentiation between the MC and CT metagenomes (Figure 3). In the MC microcosms, the relative abundance of HIs increased from 14.3% of total cells (1.0 × 10⁵ cells ml⁻¹) at 0 hour to 45.9% (3.1 × 10⁵ cells ml⁻¹) at 48 hours of incubation. Meanwhile, no significant change was found for the relative abundance of HIs in the CT microcosms (t test, P < 0.05; Figures 3B and 3C), indicating that the observed increase in HI cells in the MC microcosms was due to growth of bacterioplankton on added MC-LR. At the end of the 48-hour incubation, LIs in the MC microcosms accounted for a smaller percentage (54.1%) than those in the CTs (85.7%); however, they comprised nearly 3 times more cells than LIs in the CT microcosms (Figure 3C). This suggests that a considerable fraction of LI cells in the MC microcosms were also microcystin-responsive. Thus, rather than analyzing only the HI populations, the total bacterial communities (HI plus LI) in the MC and CT microcosms were examined to elucidate the taxa and genes involved in microcystin degradation.
T-RFLP analysis, based on 16S rRNA genes, was performed to examine potential shifts in total bacterial community structure during the inorganic nutrient pre-incubation and the MC-LR incubation experiment. Cluster analysis of T-RFLP data closely grouped the duplicates of each sample, indicating good within-treatment reproducibility (Figure 4). The original water (Ori) and pre-incubated water samples (Pre-incub) had highly similar T-RFLP data. These two samples were moderately similar to the samples from the CT microcosms at the end of the incubation experiment (CT-48 h), but were distant from samples of the MC microcosms (MC-48 h) (Figure 4). These findings indicated that pre-incubation had less effect on bacterial community structure than MC-LR amendment.

General Structure of Metagenomes
A total of 815,435 metagenomic sequences with an average length of 386 bp were recovered after removing low quality and artifactual reads (Table 1). More sequences were recovered for the MC libraries than for the CTs, although the starting amounts of genomic DNA were similar (~1 µg). About 0.4% and 0.2% of the sequences of the MC and CT metagenomes, respectively, were affiliated with 16S rRNA genes, in accordance with their expected frequency in prokaryotes (Mou et al., 2008). Most of the non-16S rRNA gene sequences in the MC (~70%) and CT (~50%) metagenomes were identified as putative protein-coding sequences.
Out of 498,519 putative protein-coding gene sequences, 90% were further assigned to a total of 3,259 unique COG groups and 182 unique KEGG pathways (Table 1). These were further classified into 23 COG (A-V, Z) and 21 KEGG classes within the networks of metabolism, genetic information processing, environmental information processing and cellular processes. COG and KEGG assignments consistently revealed that functional categories including cell motility, signal transduction, and metabolism of organic and inorganic molecules were significantly overrepresented in the MC relative to the CT metagenomes (Figure 5).
Xenobiotic metabolism-related COGs and KEGGs were overrepresented in the MC metagenomes (Table 2). KEGG0980 and COG0625 are associated with cytochrome P450 oxidase and glutathione S-transferase (GST), respectively. These two enzymes have been found to catalyze the synthetic conversion of MC-LR into glutathione (GSH) and cysteine (Cys) conjugates in animal cells [26]. COG0841 and COG1566 are both affiliated with multidrug efflux pumps, which have been found to regulate the excretion of final degradation products of GSH and Cys conjugates from animal cells (Figure 1).
The known pathway of MC-LR cleavage in bacterial systems involves expression of a cluster of genes, i.e., mlrABCD [3], [9]. Putative mlr genes had similar relative abundance in the MC (0.22% of protein-coding sequences) and CT (0.19%) metagenomes (OR > 1, P < 0.02). On the other hand, putative GST genes, which are involved in MC-LR degradation in animal cells but have not yet been reported in bacteria [26], were overrepresented in the MC (0.54% of protein-coding sequences) relative to the CT (0.24%) metagenomes (OR > 1, P < 0.02).

Microcystin Responsive Bacterial Taxa
Most of the putative protein sequences in the MC (90%) and CT (80%) metagenomes received taxonomic assignments at least to the phylum level, and ~64% of these sequences had a COG assignment. Patterns of taxonomic affiliation of metagenomic sequences were conserved, regardless of whether all protein-coding sequences or just those subsets assigned to significantly overrepresented COG categories were considered (Figure S1). In addition, even though 16S rRNA gene sequences were a much smaller fraction of total sequences than the protein-coding gene sequences (Table S4), they revealed a similar bacterial taxonomic structure (Figure S1 and Table S6).
COG sequences were affiliated with 89 unique bacterial orders, but about 65% of them were from only 22 orders of the phyla Acidobacteria, Actinobacteria, Bacteroidetes, Planctomycetes, Proteobacteria (in the alpha, beta, gamma and delta/epsilon subdivisions) and Verrucomicrobia (Figure 6A). Archaeal sequences occurred in low abundance (0.08% of COGs in the MCs; 0.4% in the CTs) and 95% of them were affiliated with Euryarchaeota. The richness of the MC and CT metagenomes was similar at the order level. However, the COG sequences of the MC metagenomes were taxonomically less diverse (Shannon-Wiener Index, H' = 2.0) than those of the CT metagenomes (H' = 3.3), because the evenness of the MC metagenomes was lower.
Over half (53.2%) of COG sequences in the MC metagenomes were affiliated with Methylophilales (Betaproteobacteria), a taxon that was significantly less abundant in the CT metagenomes (9.6% of COG sequences; t test with Bonferroni correction, P < 0.05). Burkholderiales (Betaproteobacteria; 18.1% of COG sequences) and Xanthomonadales (Gammaproteobacteria; 9.0%) were the second and third most abundant taxa affiliated with COG sequences in the MC metagenomes. Their relative abundances were similar to those in the CT metagenomes (15.9% and 2.2%, respectively) (Figure 6). Although representing fewer sequences (Table S4), similar distribution patterns of bacterial taxa were observed at the family and genus levels (Figures 6B and 6C). Like their parent order Methylophilales, the family Methylophilaceae and genus Methylotenera were the most abundant members in the MC metagenomes and were significantly more abundant than in the CT metagenomes (t test with Bonferroni correction, P < 0.05). On the other hand, underrepresentation of Actinobacteria in the MC metagenomes at the order level was not observed at the family or species level (Figure 6). This may be partly due to the fact that only a limited number of environmental Actinobacteria species have been isolated and sequenced [27].
Putative genes of the MC-LR cleavage pathway (mlr) and of xenobiotic metabolism (GST genes) were affiliated with 13 and 16 bacterial orders, respectively (Figure 7). About 80% of the putative mlr sequences were affiliated with only 5 orders: Burkholderiales (in genera Burkholderia, Cupriavidus and Variovorax), Caulobacterales (Phenylobacterium), Rhizobiales (Mesorhizobium, Methylobacterium and Rhodopseudomonas), Sphingomonadales (Sphingopyxis) and Xanthomonadales (Stenotrophomonas). These orders and genera also represented major taxa for putative GST gene sequences. Taxonomic affiliations of mlr and GST genes were statistically similar between the MC and CT metagenomes (OR > 1, P < 0.02). However, a significant difference was found for Methylophilales-affiliated sequences. They accounted for over 13% of putative GST gene sequences but were not identified among the putative mlr sequences (Figure 7).

Discussion
Bacterially mediated microcystin degradation has been studied primarily on bacterial cultures or in artificial environments. Related studies in natural environments have generally assumed that bacteria associated with CyanoHABs are predominant microcystin degraders [25], [26], [27], [28]. Using microcosm incubations, our study provides empirical data to identify bacterial genes and taxa that are involved in microcystin degradation in nature.
Microcosms are widely used in ecological research because they can be readily replicated and examined under controlled laboratory conditions, permitting experimental manipulations as in this study. However, the reliability of conclusions drawn from microcosms can be compromised by artifacts of confinement ("bottle effects"), which are exacerbated as the ratio of bottle surface to microcosm volume increases [29]. For this reason, we constructed microcosms as large as could be manipulated in the laboratory: 18 L microcosms in 20 L carboys. Because of the uniformly large size of our microcosms, we assumed that "bottle effects" would be consistent among the treatments and have little impact on our overall conclusions. Other manipulations, i.e., prefiltration and pre-incubation, were found necessary to establish contrasting results of cell abundance, size and nucleic acid content distributions, and MC-LR degradation activities between the MC and CT microcosms. However, these processing steps also made the experimental systems less in situ-like. Nonetheless, our approach allowed culture-independent identification of MC-degrading bacterial taxa and genes without constraints from prior knowledge.
In this study, free-living bacterioplankton grew substantially at the expense of added MC-LR, indicating that they were actively using microcystin as a carbon and/or energy source (Figure 2). Cell distribution observed from FACS analysis also indicated that MC-LR affected the bacterial taxa differently and stimulated the growth of a subset of the bacterial taxa present in the samples (Figure 3). The average growth rate of bacterioplankton in MC microcosms was 0.94 day⁻¹, which is comparable to rates that have been reported for Lake Erie bacteria [30], [31], [32]. The consumption rate of MC-LR (~15 µg L⁻¹ in 48 hours) in microcosms was similar to previous observations in pure cultures growing on MC-LR at similar concentrations [8], [33]. If MC-LR were supplied at higher concentrations (>25 µg L⁻¹) and/or to un-manipulated ambient lake water, which contains diverse labile dissolved organic compounds, the MC-LR degradation rate would likely be slower and preceded by a lengthy initial lag phase [8], [33].
Based on the amount of MC-LR added and the growth of bacterial cells, we estimated that the average carbon content per bacterial cell (C_c) in the MC microcosms was 6 fg/cell. The widely used average C_c for bacteria is 20 fg/cell [34], but many studies have proposed lower values (7-13 fg/cell) [35], [36], [37]. Studies have shown that C_c values of bacterial cells can range across three orders of magnitude, from 1.5 fg/cell to 1.9 pg/cell [36]. Lower C_c values are typically associated with cells that are small in size [34], [36] and/or growing under nutrient-limited conditions [38], similar to those in the MC microcosms. This calculation also indicated that bacterial growth in the MC microcosms can be largely explained by active incorporation of added MC-LR by bacterial cells.
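
The carbon budget behind this estimate can be reproduced approximately with the sketch below. It is an illustrative reconstruction, not the authors' calculation: it assumes that all carbon in the added MC-LR (~15 µg L⁻¹) ended up in new biomass, takes the molecular formula of MC-LR as C49H74N10O12, and derives the initial cell density from the reported final density (2.2 × 10⁶ cells ml⁻¹) and the ~3.4-fold increase. The function and variable names are hypothetical.

```python
# Standard atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
# Molecular formula of MC-LR (assumed input to this sketch).
MCLR_FORMULA = {"C": 49, "H": 74, "N": 10, "O": 12}


def carbon_mass_fraction(formula):
    """Fraction of a compound's mass that is carbon."""
    total = sum(ATOMIC_MASS[element] * n for element, n in formula.items())
    return ATOMIC_MASS["C"] * formula.get("C", 0) / total


def carbon_per_cell_fg(substrate_ug_per_l, carbon_fraction,
                       cells_start_per_ml, cells_end_per_ml):
    """Carbon per newly formed cell (fg), assuming full incorporation."""
    carbon_fg_per_l = substrate_ug_per_l * carbon_fraction * 1e9  # 1 ug = 1e9 fg
    new_cells_per_l = (cells_end_per_ml - cells_start_per_ml) * 1e3
    if new_cells_per_l <= 0:
        raise ValueError("no net growth")
    return carbon_fg_per_l / new_cells_per_l


fraction = carbon_mass_fraction(MCLR_FORMULA)
final_density = 2.2e6                    # cells per ml after 48 hours
initial_density = final_density / 3.4    # back-calculated from the fold increase
estimate = carbon_per_cell_fg(15.0, fraction, initial_density, final_density)
```

Under these assumptions the estimate falls between 5 and 6 fg/cell, in line with the rounded value given in the text; any carbon lost to respiration would lower it further.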
Over twenty strains of MC-degrading bacterial isolates are currently available and they are affiliated with a narrow group of bacterial orders, including Actinomycetales, Bacillales, Sphingomonadales, Burkholderiales and Methylophilales [13], [39], [40]. Our culture-independent study suggests a highly heterogeneous composition of MC-responsive bacteria, including members from over 89 orders within the phyla of Actinobacteria, Bacteroidetes, Firmicutes, Planctomycetes, Proteobacteria of the alpha, beta, gamma and delta/epsilon subdivisions, and Verrucomicrobia. Recent studies on CyanoHAB-associated bacteria have similarly indicated a high taxonomic diversity of MC degraders in a number of freshwater lakes [10], [41], [42]. Previous culturing studies and surveys of CyanoHAB-associated bacteria have suggested a dominant role of Sphingomonadales (mainly within the genus Sphingomonas) in MC degradation. However, a recent survey by 16S rRNA gene pyrotag sequencing has indicated a low relative abundance of Sphingomonadales (<1% of the total bacterial community) during a CyanoHAB event in Lake Erie [43]. Our metagenomic data also indicate that Sphingomonadales may be less important than Methylophilales and Burkholderiales in bacterioplankton-mediated MC degradation in Lake Erie (Figures 6 and S1; Table S6). The latter two betaproteobacterial orders are common in freshwater environments [44], and each has cultured MC-degrading representatives [7], [40]. Moreover, although often at low abundance, Methylophilales have frequently been found to be associated with freshwater CyanoHABs [25], [26], [27], [28], [41]. Differences between our study and those of others are most likely due to site-to-site variation in physical, chemical and biotic conditions and to the targeted fraction of the bacterioplankton (free-living bacterioplankton vs. total community).
Notwithstanding those differences, our findings emphasize that MC-degrading bacteria and pathways likely are broader than earlier studies indicated.
Archaea have been identified as important in situ microbial taxa during an MC-producing CyanoHAB, but their abundance declined from high to undetectable levels after incubation in MC-amended microcosms [10]. The authors attributed this to the high sensitivity of archaeal cells to "bottle effects". In our study, MC and CT metagenomes were subjected to the same container incubation conditions. Archaea-affiliated sequences occurred in low quantities in all microcosms, and their relative abundance was significantly lower in the MC (0.08% of total protein-coding sequences) than in the CT microcosms (0.5%) (t test with Bonferroni correction, P < 0.05). This suggests that archaea play only a minor role in microcystin degradation.
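The Bonferroni correction used with the t tests can be illustrated with a minimal Python sketch: each raw p-value is multiplied by the number of comparisons and capped at 1 before being compared with the significance threshold. The function names and the p-values below are hypothetical placeholders, not results from this study.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant(p_values, alpha=0.05):
    """Flag tests whose Bonferroni-adjusted p-value is below alpha."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]


if __name__ == "__main__":
    # Illustrative raw p-values for three taxa; placeholders only.
    raw = [0.01, 0.04, 0.5]
    print(bonferroni_adjust(raw))
    print(significant(raw))
```

The correction is conservative: a taxon that passes P < 0.05 before adjustment may no longer pass once many taxa are tested simultaneously.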
Actinobacteria include several MC-degrading species and are common taxa associated with MC-producing CyanoHABs [10], [26]. They have been found to be more important during CyanoHABs in lakes with water temperatures below 20°C than in warmer lakes. In our study, water temperatures at the time of sampling and during incubation were 22°C or above. Significantly fewer Actinobacteria were found in the MC metagenomes, indicating that this taxon was not important in MC degradation in the samples examined.
The genes (mlrABCD), intermediates and products of an enzymatic pathway for bacterial MC degradation have been identified based on work with Sphingomonadales strains [12], [45]. mlrA genes are considered the most important within the mlr cluster because they encode the ring-cleavage step that opens the microcystin ring structure (Figure 1). Probes and primers targeting mlrA have been developed and used to study the in situ activity of MC-degrading bacteria in various environments [46], [47], [48]. However, PCR amplification of mlrA genes from microcystin-degrading Actinobacteria isolates has failed [13]. Two factors may contribute to this failure: first, existing primers may be inefficient for broad identification of mlrA in non-Sphingomonas taxa [40]; second, microcystin degradation genes and/or pathways may vary among bacterial taxa. Our results support the latter hypothesis. In this study, putative mlrA genes were identified based on full-length amino acid sequence homology, which should have largely bypassed the primer-specificity bias inherent in the PCR method. Accordingly, the mlrA genes recovered in our metagenomes were broadly affiliated with Proteobacteria (in the alpha, beta, gamma and delta subdivisions) and Bacteroidetes. In addition, none of the mlrA or other mlr sequences was affiliated with Methylophilales, even though Methylophilales represented the most abundant taxon in the MC metagenomes. These observations suggest that bacteria, especially members of Methylophilales, may employ an alternative microcystin degradation pathway.
Our results suggest that this alternative pathway may involve xenobiotic metabolism (Figure 1). Xenobiotic metabolism-related genes and gene categories, e.g., the GST gene and COG0625, COG0841, COG1566 and KEGG0980, were significantly overrepresented in the MC metagenomes. Moreover, Methylophilales were affiliated with a large proportion of xenobiotic metabolism-related sequences, but with none of the putative mlr sequences. Xenobiotic metabolism is widely distributed among organisms of all three domains of life and refers to intracellular processes that neutralize and eliminate the toxic effects of foreign compounds by altering their chemical structures. A long list of substrates has been identified for xenobiotic-metabolizing systems in bacteria, including halogenated compounds [49], drugs [50] and numerous environmental pollutants [51]. Our metagenomic study suggests adding MC-LR to this list. Although novel for bacterial systems, GST-mediated xenobiotic metabolism is known for its critical role in MC detoxification by various aquatic eukaryotes, including higher plants, invertebrates and vertebrates [26]. The wide distribution of GST genes and related xenobiotic metabolism has been largely attributed to horizontal gene transfer and to the parallel and independent evolution of these essential genes among different phylogenetic groups [52].
We note that our results do not rule out the involvement of the mlr gene-based pathway in MC-LR degradation, but suggest that alternative pathways, such as xenobiotic metabolism, may also be important in the process. Further studies, especially those that identify degradation intermediates and measure gene expression, are required to confirm the occurrence of MC degradation by xenobiotic metabolism in bacteria and to examine its importance relative to the mlr gene-based cleavage pathway.

Figure S1 Relative abundance of major bacterial taxa at the order level in the MC and CT metagenomes. Taxonomic affiliations are based on (A) total protein sequences, (B) total protein sequences with COG group assignment, (C) COG sequences that were overrepresented in the MC metagenomes relative to the CT metagenomes and (D) putative 16S rRNA genes. (EPS)