Structuring of Bacterioplankton Diversity in a Large Tropical Bay

Structuring of bacterioplanktonic populations and factors that determine the structuring of specific niche partitions have been demonstrated only for a limited number of colder water environments. To better understand the physical-chemical and biological parameters that may influence bacterioplankton diversity and abundance, we examined bacterioplankton productivity, abundance and diversity in the second largest Brazilian tropical bay (Guanabara Bay, GB), together with the physical-chemical and biological parameters of GB seawater. The inner bay location with higher nutrient input favored higher microbial (including vibrio) growth. Metagenomic analysis revealed a predominance of Gammaproteobacteria in this location, while GB locations with lower nutrient concentration favored Alphaproteobacteria and Flavobacteria. According to the subsystems (SEED) functional analysis, GB has a distinctive metabolic signature, comprising a higher number of sequences in the metabolism of phosphorus and aromatic compounds and a lower number of sequences in the photosynthesis subsystem. The apparent phosphorus limitation appears to influence the GB metagenomic signature of the three locations. Phosphorus is also one of the main factors determining changes in the abundance of planktonic vibrios, suggesting that nutrient limitation can be observed at community (metagenomic) and population levels (total prokaryote and vibrio counts).


Introduction
Microbes are recognized as major drivers of nutrient cycling in the oceans and in coastal waters [1]. They present structured populations in both large and small scale patterns [2,3,4] and may even share some biogeographical and macroecological features of macroorganisms, according to their geographic distribution [5]. Microbial community diversity analysis in the North Pacific Subtropical Gyre indicated the contribution of different microbial taxa to nutrient cycling according to depth [2]. Microbial diversity may also be structured according to latitude [3] and even within the same location [4,6]. A global survey of the bacterioplankton community diversity indicated a latitudinal gradient in species richness, with an increasing number of taxa (species) at lower latitudes, and only 2 cosmopolitan taxa out of 562, suggesting a pattern of geographic-dependent microbial diversity structuring [7]. Coastal bacterioplankton and vibrioplankton communities appear to be finely structured in discrete phylogenetic clusters, revealing the co-occurrence of several hundred closely related microbial populations [6]. Sympatric differentiation may be due to niche partitioning and specialization, with different groups of bacterioplanktonic species associated with different habitats (zooplankton, particles, or water) in the same geographic location [4,6]. Hunt et al. [4] showed that some vibrio species appeared to occur only in association with plankton, whereas other species appeared to be exclusively free-living. A few studies have addressed the main water quality parameters that determine structuring shifts in bacterioplanktonic and vibrioplanktonic diversity and abundance [8,9,10,11,12]. Salinity, phosphorus, and nitrogen concentration appeared to influence the abundance of vibrios in cold water environments [8,9,10,11,12]. Vibrio splendidus and V. anguillarum species groups were among the most abundant taxa in cold water environments [9].
Similar studies have not been carried out yet in tropical environments. Thus it is not known if these previous findings can be generalized for other (tropical) environments.
The environmental parameters and ecological factors involved in habitat and/or niche partitioning may comprise biotic and abiotic factors. Three levels of control over microbial abundance and diversity are recognized [13,14,15]: i. bottom-up control, or nutrient supply/limitation; ii. top-down control via viral lysis and/or protist grazing, and iii. sideways control, or interactions among co-occurring microbial populations. The interplay within and among these three levels explains community structure and composition, according to niche partitioning theories [1,16]. On the other hand, experimental studies recently demonstrated that microbial diversity resulting from niche partitioning leads to more efficient nutrient uptake [15,17], also implying that biodiversity may influence water quality and be advantageous as a buffer against pollution impacts [17]. Since vibrios comprise several human (e.g. V. cholerae, V. vulnificus, and V. parahaemolyticus) and marine animal (e.g. V. harveyi species group) pathogenic species [18,19], it is very important to determine specific environmental triggers for their growth and diversity structuring in tropical environments.
We studied the bacterioplankton abundance and diversity of Guanabara Bay (GB) as a model to better understand the effects of water quality parameters on free-living bacterioplankton, in three representative locations of the GB, according to previous studies [20]. GB is the second largest bay of Brazil and is located near Rio de Janeiro city, one of the largest metropolitan areas of South America, with nearly 9 million people living in close vicinity [21]. Monthly counts of vibrio CFUs and total prokaryotic (bacteria + archaea) counts, prokaryotic production measurements, and measurements of water quality parameters (nutrient concentration, chlorophyll a, salinity, temperature, and pH) allowed us to establish correlations between different parameters. The 24-month time series study (between February 2009 and March 2011) allowed us to investigate the role of nutrient input on the abundance and diversity of total bacterioplankton and vibrioplankton, determining which nutrients limit their growth. Vibrio counts and vibrio diversity were used as proxies for the determination of correlations with water quality parameters because these Gammaproteobacteria are known to respond swiftly to nutrient inputs [22]. Additionally, metagenomic analysis of the different sites corroborates some of the inferences from the chemical analysis and resulted in the determination of a metagenomic signature for the GB.

Study area
The study area is located in the Guanabara Bay (centered on lat. 22°50′S and long. 43°10′W), Rio de Janeiro State, a eutrophic estuarine system near one of the largest metropolitan areas in Brazil. No specific permits were required for the described field studies. Subsurface waters (1-2 m depth) were sampled from three different sites, representing different trophic levels (Figure 1), according to the literature [20]. Location 1 is at the most external point of the GB (22°55′43″S; 43°08′51″W), therefore receiving high influence of the marine environment and lesser influx of terrigenous material. The second sampling site (location 7) is located at an intermediate point in the middle of the GB (22°52′12″S; 43°09′46″W), subjected to strong mixing of inner and coastal seawater, and the third sampling location (34)

Physical-chemical and biological analyses
All environmental parameters were analyzed by standard oceanographic methods [23]. At least three replicates were analyzed for each parameter monthly. Temperature and salinity were evaluated by using CTD or salinity meters from YSI. Chlorophyll a analyses were performed after vacuum filtration (<25 cm of Hg) of 2 L of water. The filters (Glass fiber Millipore AP15) were extracted overnight in 90% acetone at 4°C, and analyzed by spectrophotometry or fluorimetry. Inorganic nutrients were also analyzed: 1) ammonia by indophenol, 2) nitrite by diazotization, 3) nitrate by reduction in Cd-Cu column followed by diazotization, 4) total nitrogen by digestion with potassium persulfate following nitrate determination, 5) orthophosphate by reaction with ascorbic acid, 6) total phosphorus by acid digestion to phosphate, and 7) silicate by reaction with molybdate. Dissolved (DOC) and particulate (POC) organic carbon were analyzed as described previously [24].
Prokaryotic abundance and productivity. Prokaryotic production and total prokaryotic counts were performed only in the first year of study. Abundance was determined from two replicates of seawater by flow cytometry with Syto-13 (Life Technologies, Carlsbad, CA), with minor modifications [25]. Microbial cells with high (HNA) and low (LNA) nucleic acid content were quantified in the same runs through measurement of fluorescence intensity. Prokaryotic production was determined according to Smith and Azam [26], using ³H-leucine incorporation into proteins as a proxy of production. Carbon production was calculated using the conversion factor of 0.86 [27].
Vibrios CFU counts and diversity. Monthly seawater samples were serially diluted and plated in triplicate (100 µl) on Thiosulphate Citrate Bile Salts Sucrose (TCBS) agar. Colony counts were performed after 48 h of incubation at 28°C. All vibrio colonies obtained in February 2010 from the three locations were stored in Tryptone Soy Broth with 20% glycerol at −80°C. These colonies were purified, and isolates were subjected to DNA extraction following [28]; their uridylate kinase (pyrH) gene sequences were obtained in order to determine vibrio diversity. PCR amplification of the pyrH gene was performed as described previously [29], using primers 80F and 530R. PCR products of the expected size (approx. 500 bp) were purified with the Illustra GFX™ PCR DNA and Gel Band Purification kit, following the manufacturer's protocol. Purified PCR products were analyzed in an automatic Applied Biosystems Genetic Analyzer 3500. pyrH sequences were analyzed using the software Kodon package 2.03, and compared to reference sequences (type strains) available from GenBank (NCBI). A phylogenetic tree was constructed based on pyrH gene sequences applying the Neighbor-joining method using the software MEGA [30]. Distance estimation was obtained by the Kimura 2-parameter model. Bootstrap analysis was performed with 2,000 replications.
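As an illustration of the distance model named above, the following is a minimal sketch of the Kimura 2-parameter distance between two aligned sequences. It is not the MEGA implementation; the function name and the handling of gaps (sites with a gap or ambiguous base in either sequence are skipped) are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; gap/ambiguity handling is an
    assumption (such sites are skipped pairwise).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip sites where either sequence has a gap or an ambiguous base.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Separating transitions from transversions is what distinguishes this model from a simple proportion of differing sites; for closely related alleles the two measures give nearly the same value.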
Statistical analysis. Simple correlations were initially used as exploratory analysis, through fit of data points to linear models. Total prokaryotic abundance and Vibrio counts were plotted against all other variables, and the fit to the linear model (r²) was calculated in each case in Microsoft Excel (Microsoft®). Variable transformation was used to ensure normality wherever needed (usually counts vs. abiotic variables). Two additional multivariate analyses were performed using the STATISTICA (StatSoft®) software: a) an ordination analysis, the principal component analysis (PCA), performed on a correlation matrix between total prokaryotic abundance, HNA, LNA, Vibrios and abiotic variables, and b) a multiple regression analysis with total prokaryotic abundance, HNA, LNA and vibrio counts as dependent variables and abiotic parameters as independent variables.
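The exploratory step can be sketched as follows: a count variable is log-transformed and fitted against an abiotic variable by ordinary least squares, and r² is reported. This is only an illustrative sketch; the function name is hypothetical and the numbers are invented placeholders, not data from this study.

```python
import math


def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x; returns (intercept, slope, r2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with zero variance cannot be fitted")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return intercept, slope, r2


# Placeholder values for illustration only (NOT measurements from the study).
orthophosphate = [0.5, 1.0, 2.0, 4.0, 8.0]
vibrio_cfu = [40, 90, 210, 380, 900]

# Log-transform the counts before fitting, as is common for count data.
log_cfu = [math.log10(c) for c in vibrio_cfu]
intercept, slope, r2 = linear_fit_r2(orthophosphate, log_cfu)
print(f"slope={slope:.3f}, r2={r2:.3f}")
```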
For the multiple regression analysis, assumptions were respected through residual analysis. To avoid distortions in the statistical tests, outlier data (>2σ) were excluded and independence between explicative variables was assumed for a tolerance >0.01. The tolerance of a variable is defined as 1 minus the squared multiple correlation of this variable with all other independent variables in the regression equation. Therefore, the smaller the tolerance of a variable, the more redundant is its contribution to the regression (i.e., it is redundant with the contribution of other independent variables).
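The tolerance definition above can be made concrete with a short sketch: each predictor is regressed on all the others (plus an intercept), and the tolerance is 1 minus the resulting R². This is a plain-Python illustration, not the STATISTICA routine; all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an OLS regression of y on the predictor columns plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    if ss_tot == 0:
        raise ValueError("target variable has zero variance")
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1 - ss_res / ss_tot


def tolerance(variables, index):
    """Tolerance of variables[index]: 1 - R^2 against all other variables."""
    target = variables[index]
    others = [v for i, v in enumerate(variables) if i != index]
    return 1 - r_squared(target, others)
```

A tolerance near 1 means the variable carries information not shared with the other predictors; a value below the 0.01 threshold quoted in the text would mark it as almost entirely redundant.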
Metagenomic analysis. Metagenomic analysis was used to gain a deeper understanding of the structuring and metabolic potential of the sampled sites. Two replicate seawater samples (ca. 2 L each) were collected in February 2010 from each of the three study locations. In total, six samples were pre-filtered through a 20 µm nylon filter, followed by a Sterivex 0.2 µm filter. Microbial cells retained in the Sterivex filter received SET buffer and were stored at −80°C until DNA extraction, which was performed as described previously [31]. Metagenomic DNA samples were sequenced by 454 pyrosequencing [32]. Metagenomic sequences were annotated using MG-RAST 2.0, utilizing the subsystems technology for metabolic analyses and the SEED database for phylogenetic analyses, with an e-value cutoff of 10⁻⁵ [33]. The main phylogenetic groups were further extracted after SEED analysis, and resubmitted to evaluate their relative contribution to the functional subsystems. The Statistical Analysis of Metagenomic Profiles (STAMP) bioinformatics software was used for statistical analysis [34]. Statistical significance was calculated using two-sided Fisher's exact test, and the differences between proportions were analyzed using the Newcombe-Wilson method with a 99% confidence interval. Data were further subjected to filtering, and only data with p-values lower than 0.01 were analyzed. Functional data were additionally compared to a pool of 19 public metagenomes from different bays around the world in order to determine possible unique features of GB in comparison to other bays. The list of these bays is available in Table S1. The analysis was restricted to the sequences only, given that metadata of these bays were not available with the metagenomic sets.
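To make the confidence-interval step more tangible, here is a minimal sketch of the Newcombe-Wilson (hybrid score) interval for the difference between two proportions, such as the share of reads assigned to one subsystem in two metagenomes. It is not STAMP's code: the function names are hypothetical, no continuity correction is applied (the text does not say whether one was used), and Fisher's exact test is not reproduced here.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, total, z):
    """Wilson score interval for a single proportion (no continuity correction)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def newcombe_wilson_ci(x1, n1, x2, n2, confidence=0.99):
    """Interval for p1 - p2 built from the two Wilson intervals (Newcombe's hybrid score method)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


# Placeholder read counts for illustration only (NOT values from the study).
low, high = newcombe_wilson_ci(x1=300, n1=10_000, x2=200, n2=10_000)
print(f"99% CI for the difference in proportions: [{low:.4f}, {high:.4f}]")
```

An interval that excludes zero indicates that the two samples differ in the relative abundance of that category at the chosen confidence level.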

Physical chemical analysis
Seawater parameters were measured monthly in three locations of the GB (Table 1). The highest values of inorganic and organic nutrients were observed in location 34, the inner portion of the GB. For instance, total N and total P were 574.36 ± 493.15 µM and 19.43 ± 5.31 µM in location 34, respectively, with ammonia and orthophosphate as main contributors (Table 1). The values of the same parameters in location 1 were 67.73 ± 34.55 µM and 2.69 ± 1.03 µM, respectively. The higher silicate values observed in location 34 (80.79 ± 70.55 µM) in comparison with location 1 (23.63 ± 13.94 µM) and location 7 (33.57 ± 16.10 µM) suggested a higher contribution of land runoff and/or benthic-pelagic effects. In addition, salinity, pH, seawater transparency, and dissolved oxygen values were lower in location 34. There were also higher chlorophyll a levels and lower pheophytin (chlorophyll degradation product) levels in this location, indicating active photosynthesis. Based on the physical-chemical and biological parameters of the seawater, the three locations can be split into two groups: Group 1, corresponding to locations 1 and 7, represented an intermediate area of the GB with higher influence from the coastal waters; Group 2, corresponding to location 34, represented a heavily polluted area of the GB with considerable input of nutrients.

Total prokaryotic counts and vibrio (CFUs) counts
Total prokaryotic counts obtained by flow cytometry and vibrios CFU counts obtained by plating on TCBS were higher in location 34 than in the other two locations (Table 1). Prokaryotic cell counts varied between 9.66 × 10⁵ and 3.68 × 10⁷, with the highest values occurring in April 2009 (Figure 2). Vibrios CFU counts varied between 30 and 20,866 CFU/ml, with the highest values occurring in January 2011 (Figure 3). A presumptive seasonal pattern for the microbial abundance was observed, with higher counts in the summer months (Dec-Apr) and lower counts in the winter months (Jul-Sept) (Figures 2 and 3). Prokaryotic cells with high nucleic acid content (HNA) were more abundant in location 34 (representing almost 70% of total counts). On the other hand, the contribution of HNA prokaryotic cells was equal to the contribution of LNA prokaryotic cells in locations 1 and 7, which suggests a lower proportion of dividing cells (Figure 2 and Table 1) [1]. Our results indicated a strong correlation between total prokaryotic counts/prokaryotic production and vibrios CFU counts. The great majority of the vibrio isolates belonged to three species (Vibrio communis, V. parahaemolyticus and V. alginolyticus) according to pyrH gene sequence analysis (Figure S1). The pyrH sequence diversity (allelic diversity) was higher in locations 1 and 7 than in location 34 (results not shown).

Relationship between seawater parameters and microbial abundances
The initial exploratory analysis revealed high fits to linear models between total microbial abundance and vibrio counts (r² = 0.829), and between both these counts and microbial production (0.70 < r² < 0.76) (Supplementary Figure S2). Vibrio abundance also fitted the models when plotted against orthophosphate (r² = 0.620) and ammonia (r² = 0.443), showing direct correlation, as opposed to an inverse correlation when plotted against salinity (r² = 0.511) (Figure S2).
The PCA supported the division between Group 1 (locations 1 and 7) and Group 2 (location 34) (Figure 4). All biotic variables clustered with phosphorus (total and orthophosphate) and ammonia, and opposed to salinity, agreeing with the exploratory correlations even though the latter refer only to Vibrio (Figure S2).
The more robust multiple regression analysis also highlighted the importance of ammonia and mainly phosphorus to all biotic variables (Table 2). Total prokaryotic abundance increase was highly dependent (p ≤ 0.011) on increases of orthophosphate, ammonia and temperature, as well as dependent on chlorophyll-a decreases (p = 0.043). HNA cells responded to increases in orthophosphate and ammonia (0.036 < p < 0.038). LNA, on the other hand, was highly statistically dependent only on temperature (p = 0.0006), even though total phosphorus, chlorophyll-a and ammonia were also selected for the model with no statistical support (p > 0.05). Vibrio abundance was directly dependent on orthophosphate (p = 0.012) and inversely dependent on salinity (p = 0.032), despite nitrite being also selected for the model, with no statistically significant contribution (p = 0.161).

Metagenomic analysis description
To better analyze the structure of the bacterioplankton and to further assess the metabolic potential of the microbes, a total of six microbial plankton samples were subjected to metagenomic analysis. Location 1 produced 72,907 and 96,780 sequences, from which 85,723 (50.52%) were classified in the MG-RAST subsystems hierarchy and 112,245 (66.15%) were classified in the SEED database (Table S2)
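The percentages quoted for location 1 follow directly from the read counts given in the text, as this short arithmetic check shows (variable names are illustrative).

```python
# Read counts for the two location 1 replicates, as quoted in the text.
reads_per_replicate = [72_907, 96_780]
total_reads = sum(reads_per_replicate)  # 169,687 reads in total

subsystem_hits = 85_723  # reads classified in the MG-RAST subsystems hierarchy
seed_hits = 112_245      # reads classified in the SEED database


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


print(total_reads)                           # 169687
print(percent(subsystem_hits, total_reads))  # 50.52
print(percent(seed_hits, total_reads))       # 66.15
```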

Taxonomic assignment of the GB metagenomes
The relative majority of sequences recovered from GB belonged to the domain Bacteria. Archaeal sequences corresponded to less than 1% of the total, mainly represented by methane metabolizing groups (Figure S3). Around 1% of the sequences belonged to eukaryotes and viruses (Figure S4). Phages and large algal viruses (but not prophages) were more abundant in location 7, suggesting higher ongoing bacterial infection of these groups, since most of them should have been filtered out during sampling instead of remaining in the microbial fraction (however see [35]). There was a high relative abundance of Proteobacteria in the three locations, corresponding to around 60-68% of all metagenomic sequences (Figure 5A). A shift was observed from Alphaproteobacteria and Bacteroidetes/Chlorobi in locations 1 and 7 to Gammaproteobacteria, Betaproteobacteria and Actinobacteria in location 34. Flavobacteria, Rhodobacterales and Pelagibacter ubique were the main contributors to this pattern of relative abundance shift observed in locations 1 and 7 (Figure S5), while Actinobacteridae, Burkholderiales and several groups within Gammaproteobacteria (Alteromonadales, unclassified Gammaproteobacteria, Pseudomonadales, Chromatiales, Vibrionales and a few others) were responsible for the higher relative abundance in the more polluted location 34 (Figure 5B).

Predominant metabolic profiles of the GB
Metagenomic sequences of the three locations were classified in 23 informative subsystems (Figure 6). Locations 1 and 7 had relatively more metagenomic sequences related to carbohydrates and membrane transport subsystems, while location 34 had more metagenomic sequences related to RNA metabolism, virulence, motility and chemotaxis, regulation and cell signaling, cell division and cell cycle subsystems. When compared with other bay systems around the world, GB appeared to have some singular metabolic features (Figure 6). GB had proportionally fewer sequences in the photosynthesis subsystem and more sequences in the phosphorus metabolism and metabolism of aromatic compounds subsystems than the mean profile of 19 pooled bay metagenomes (descriptions provided in Table S1). Location 34 represents an outlier in comparison with the average bay and the other locations, showing a lower proportion of carbohydrates and fatty acids subsystems and higher RNA metabolism, virulence, motility/chemotaxis and

Relationship between metabolic (subsystems) and taxonomic information
Alphaproteobacteria predominated in locations 1 and 7 and were reduced in the polluted location 34. Gammaproteobacteria, on the other hand, predominated in location 34 and were reduced in locations 1 and 7. These relative shifts were reflected in the contribution of these groups to all subsystems in each location (Figure 7). For instance, Gammaproteobacteria had a higher contribution to several subsystems (such as phosphorus metabolism) in location 34. An increased contribution of Betaproteobacteria sequences to several subsystems was also noted in location 34, while the opposite was observed in the Bacteroidetes/Chlorobi group, also reflecting changes in abundance in these groups. Cyanobacteria had a higher proportional contribution to the photosynthesis subsystem in location 34, despite the similar relative abundances observed. Actinobacteria were also proportionally more abundant in location 34 and contributed predominantly to sulfur metabolism in this site (Figure 7).

Discussion
Physical, chemical and biological parameters of GB seawater reflected two groups of environments (Table 1 and Figure 4). Locations 1 and 7 were more strongly influenced by oceanic seawater than location 34. Even though previous spatial studies suggested differences between locations 1 and 7 [20], our time-series sampling revealed highly similar environments, corroborating the notion that location 7 is prone to water mixing, with greater influence of oceanic water. The concentration of some nutrients in the inner location of the GB was well above those reported for other coastal systems. For instance, ammonia and phosphate concentrations in location 34 of GB were, respectively, 6.8 and 6 times the highest average value in 26 years of measurement in the Delaware Bay [36]. GB ammonia and phosphate levels were also about 7 and 20 times, respectively, the values obtained in the Yangtze estuary [37]. Total N and total P were, respectively, 9 and 5 times higher in location 34 of the GB than in the riverine station of the Neuse estuary (US) [38]. The relatively low pH, dissolved organic carbon, and dissolved oxygen in location 34 clearly indicated intense consumption of nutrients and high biochemical oxygen demand. The high loads of nutrients in the inner portion of the GB appear to have a drastic influence on microbial metabolism, abundance and diversity.

Bottom-up effects as main regulators of bacterioplankton and vibrioplankton abundance and diversity in the GB
The multiple regression models covered most, but not all, of the variation (0.666 < model r² < 0.791), suggesting that other controlling factors, such as predation by protists and viruses, may account for prokaryotic abundance variation in GB, particularly in the vibrio model (r² = 0.350). Future studies are needed in order to evaluate their ecological role in (vibrio) abundance and diversity in GB. Nevertheless, linear model fits revealed phosphorus (orthophosphate and total phosphorus) as one of the main limiting factors for vibrio growth in the GB, and the more robust multiple regression models supported this view for total prokaryotic abundance, HNA and vibrios, but not for LNA cells (Table 2). LNA cell abundance increase was highly dependent on temperature increases (p = 0.0006), hinting at the influence of temperature on cell division activation, at least in this tropical bay. Phosphorus appeared to be an important factor in previous studies in the Swedish coastline and the Western English Channel [8,9,10,11]. On the other hand, the effect of phosphorus on the vibrioplankton of the Chesapeake Bay was not evident, since temperature, dissolved O₂ concentration, and tide height yielded the highest correlations with vibrio abundance [12].
The N:P ratios observed in GB (31:1, 26:1 and 38:1 in locations 1, 7 and 34, respectively) and the higher prevalence of the phosphorus metabolism subsystem in metagenomes from all locations support the idea of phosphorus limitation. Tropical estuarine systems are usually phosphorus limited, even though in general, coastal systems are limited by nitrogen [39]. This occurs because land runoff generally supplies phosphorus to coastal systems [39], unlike open ocean waters [2]. In spite of the high loads of phosphorus that GB receives, it appears that the even higher carbon and nitrogen input from anthropogenic sources (particularly ammonia) supplies the demand for carbon and nitrogen, allowing a larger population size which is, in turn, limited by phosphorus. In addition, prokaryotes may need to compete for phosphorus with other fractions of the plankton, such as photosynthesizing eukaryotes, which appeared to be abundant near location 34. This is supported by the dependence of total prokaryotic abundance on the decrease in chlorophyll-a observed in the multiple regression model. Salinity also influenced the abundance of vibrios and may account for a fraction of microbial diversity variation, as suggested previously [8,9,10,11].
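One common way to read such ratios is to compare them with the canonical Redfield N:P molar ratio of 16:1. The text does not name its reference ratio, so the use of Redfield here is an assumption, and the helper below is only a sketch of that comparison using the ratios reported above.

```python
# Canonical Redfield N:P molar ratio; using it as the benchmark is an
# assumption of this sketch (the text does not state its reference value).
REDFIELD_N_TO_P = 16.0

# N:P ratios reported in the text for the three sampling locations.
reported_n_to_p = {"location 1": 31.0, "location 7": 26.0, "location 34": 38.0}


def excess_over_reference(n_to_p, reference=REDFIELD_N_TO_P):
    """How many times the observed N:P exceeds the reference ratio."""
    return n_to_p / reference


for site, ratio in reported_n_to_p.items():
    fold = excess_over_reference(ratio)
    # Ratios above the reference suggest nitrogen in excess relative to
    # phosphorus, which is consistent with potential P limitation.
    status = "P-limited (potentially)" if fold > 1 else "not P-limited by this criterion"
    print(f"{site}: N:P = {ratio:.0f}:1, {fold:.2f}x reference, {status}")
```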

High nutrient loads promote microbial growth and affect microbial composition and diversity in GB
A clear relationship between nutrient concentration and microbial abundance was observed. Total microbial counts were at least one order of magnitude higher in location 34 than in locations 1 and 7. In addition, a higher proportion of actively dividing microbial cells (suggested by higher HNA counts as discussed in Kirchman [1]) was observed in location 34. On the other hand, the excessive input of nutrients may also have reduced the overall diversity of the GB, as observed in other aquatic systems [17]. A reduction in the estimated number of bacterial species was observed in location 34 (407 OTUs) in comparison to locations 1 (455 OTUs) and 7 (488 OTUs). The vibrio pyrH allelic diversity was also lower in location 34 (results not shown). The metagenomic analysis allowed some inferences regarding the structuring of the microbial populations. Gammaproteobacteria, Betaproteobacteria and Actinobacteria were proportionally the most abundant groups in location 34. The contributions of the former two to the metabolism in this location were generally higher in most subsystems, while Actinobacteria seemed to exploit a particular niche, metabolizing sulfur compounds (Figure 7). This finding needs to be explored further in future studies. In contrast, the more oligotrophic locations 1 and 7 presented lower microbial counts and nutrient concentrations and selected for different bacterioplanktonic groups, particularly Alphaproteobacteria and Flavobacteria. These two groups had a proportionally higher contribution to most metabolic subsystems (Figure 7).
The GB has a distinct metagenomic signature
Because most subsystems (e.g. amino acids and derivatives, protein metabolism, cofactors, and phosphorus metabolism) were found to be equally abundant regardless of the sample location (1, 7, or 34) and the taxonomic group, it appears that a selection of microbes adapted to the GB environmental conditions occurred. Overall, GB had a distinct metabolic profile compared to an average bay metagenome (pooled data from 19 public bay metagenomes obtained from MG-RAST). Phosphorus metabolism represented one of the distinctive features in the metagenomes and was relatively more abundant in GB than in other bays. Indeed, phosphorus seemed to be a limiting factor in GB due to excessive C and N input. Metagenomic sequences belonging to the aromatic compounds subsystems were also proportionally more abundant in the GB than in other bays. Aromatic compounds are persistent pollutants that accumulate in the water because they are hard to degrade [40] and may represent a nutrient source, particularly valuable in the more oligotrophic locations 1 and 7. This is compatible with the notion that the GB is chronically polluted with hydrocarbons [41]. The photosynthesis subsystem was poorly represented in the microbial (prokaryotic) metabolic profile of GB. The high concentrations of chlorophyll-a observed in the inner location 34, with proportionally lower pheophytin concentrations, indicate that primary producers were abundant in the GB, but such organisms compose larger fractions of the (eukaryotic) plankton, not analyzed in this study. Because the GB had a distinct metagenomic profile, we suggest that metagenomic signatures may reflect environmental effects on microbial populations of GB, supporting the view that metagenomic signatures are useful to describe environmental features [42].
Although a more detailed analysis accounting for environmental factors of each bay would be preferred, the lack of metadata accompanying these public metagenomes prevented such approach. The different coverage should not prevent our tentative conclusions given that the analysis was restricted to the top-level hierarchy of MG-RAST, and that the trends observed were common to all six GB metagenomes (irrespective of their sizes).

Vibrios can be considered indicators of trophic conditions
Vibrios are indigenous marine bacteria, which grow easily on plates and play important ecological roles in the marine environment [29]. They are also known for their swift responses to nutrient rich environments, with some species having doubling times as low as 10 minutes in suitable conditions [22,43]. Because some vibrio species are human and/or animal pathogens, it may be important to monitor vibrio counts in coastal waters. Promoting the growth of super-heterotrophic potentially pathogenic bacteria may lead to disease in marine life, such as benthic invertebrates and corals [44,45,46]. The abundance of vibrios in GB was one order of magnitude higher in location 34 than in locations 1 and 7, similar to total microbial counts. Significant simple correlations were observed between vibrio counts and both total microbial counts and microbial production. As such, vibrio counts seem to be a good proxy (indicator organism) for water quality, as suggested in Dinsdale et al. [46], and we propose that vibrio counts of >200 CFU/mL might reflect polluted seawaters (inference from Figure 3). A potential health risk is highlighted here, since GB waters are used for recreation (swimming, bathing, fishing), and Vibrio parahaemolyticus was among the most frequently found species in these waters. Additionally, Vibrio plating is a relatively inexpensive technique for monitoring bacterial numbers in the environment, unlike the more expensive and technically demanding flow cytometry count.
This is the first study on the Guanabara Bay aiming at a comprehensive time series analysis of planktonic microbial diversity and abundance. Our approach comprised nutrient concentration measurements, microbial counts, and metagenomics, to unravel microbial population composition and metabolic potential. The integration of data allowed us to shed light on the main bottom-up factors controlling microbial abundance in GB. Our data also show that the GB has typical features (e.g. phosphorus metabolism) that differentiate it from other bay metagenomic signatures. These features may have been acquired in the course of the occupation of the GB area, leading to the observed structuring of the microbial composition. We also show that nutrient (phosphorus) limitation may be inferred from community (metagenome) and population (total prokaryote and vibrio) levels.
Figure S1 Phylogenetic tree based on pyrH gene sequences of vibrio isolates from Guanabara Bay using the Neighbor-joining method. Distance estimation was obtained by the Kimura 2-parameter model. Bootstrap percentages after 2,000 replications are shown. Scale bar, 1% estimated sequence divergence. Evolutionary analyses were conducted in MEGA5. (TIF)
Figure S2 Most meaningful fits to linear models between vibrio/microbial counts and physical-chemical parameters. Data points from each location are marked differently according to the legend. An exponential fit (data log transformed) was deemed best in figures B, C, D, E, F, G and H to ensure normality. (TIF)
Figure S3 Relative percentage of contribution of archaeal sequences to GB metagenomes, separated by locations. Different letters indicate significant difference (p < 0.01) between samples, while repeated letters indicate no statistical difference. In all cases, a > b > c, regarding relative percentage values. (TIF)
Figure S4 Relative percentage of contribution of viral sequences to metagenomes, separated by locations. Different letters indicate significant difference (p < 0.01) between samples, while repeated letters indicate no statistical difference. In all cases, a > b > c, regarding relative percentage values. (TIF)
Figure S5 Relative percentage of contribution of alphaproteobacterial sequences to metagenomes, separated by locations. Different letters indicate significant difference (p < 0.01) between samples, while repeated letters indicate no statistical difference. In all cases, a > b > c, regarding relative percentage values. (TIF)