Diversity and distribution of sediment bacteria across an ecological and trophic gradient

The microbial communities of lake sediments have the potential to serve as valuable bioindicators and integrators of watershed land-use and water quality; however, the relative sensitivity of these communities to physio-chemical and geographical parameters must be demonstrated at taxonomic resolutions that are feasible with current sequencing and bioinformatic approaches. The geologically diverse and lake-rich state of Minnesota (USA) is uniquely situated to address this potential because of its variability in ecological region, lake type, and watershed land-use. In this study, we selected twenty lakes with varying physio-chemical properties across four ecological regions of Minnesota. Our objectives were to (i) evaluate the diversity and composition of the bacterial community at the sediment-water interface and (ii) determine how lake location and watershed land-use impact aqueous chemistry and influence bacterial community structure. Our 16S rRNA amplicon data from lake sediment cores, sampled at two depth intervals, indicate that sediment communities are more likely to cluster by ecological region than by any individual lake property (e.g., trophic status, total phosphorus concentration, lake depth). However, composition is tied to a given lake, wherein samples from the same core were more alike than samples collected at similar depths across lakes. Our results illustrate the diversity within lake sediment microbial communities and provide insight into relationships among taxonomy and the physicochemical and geographic properties of north temperate lakes.


Responses to Reviewer #1
Major: 1. The sequencing depth seems to be very shallow with only 3.3 million raw reads for 40 samples, especially when complex communities are expected in the sediment samples. Maybe the diversity in some lakes sediments is underestimated because rare taxa were missed. Please at least provide the read numbers for each lake and rarefaction curves.
We understand and appreciate this concern. Given our goal of examining broad-scale changes in the distribution of higher taxonomic levels, we felt that we had sufficient read depth and chose not to resequence the samples. We have included more information on individual sample read depth in a supplemental table (S2 Table) and have added a supplemental figure (S2 Fig.) illustrating rarefaction curves.
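For readers unfamiliar with rarefaction, the idea behind such a curve can be sketched as follows. This is a minimal illustration only, assuming the standard analytic (hypergeometric) formula for the expected number of OTUs in a random subsample; the function names and the read counts are hypothetical and are not taken from our pipeline or data.

```python
from math import comb

def rarefied_richness(counts, depth):
    """Expected number of OTUs seen in a random subsample of `depth` reads,
    drawn without replacement from a sample with the given per-OTU counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    denom = comb(total, depth)
    # An OTU with c reads is missed with probability C(total - c, depth) / C(total, depth).
    return sum(1 - comb(total - c, depth) / denom for c in counts)

def rarefaction_curve(counts, step):
    """(depth, expected richness) pairs from 0 reads up to the full sample."""
    total = sum(counts)
    depths = list(range(0, total, step)) + [total]
    return [(d, rarefied_richness(counts, d)) for d in depths]

# Hypothetical per-OTU read counts for a single sample.
example_counts = [50, 30, 15, 4, 1]
for depth, richness in rarefaction_curve(example_counts, step=25):
    print(depth, round(richness, 2))
```

A curve that flattens before the full read depth suggests that additional sequencing would recover few new OTUs at that taxonomic resolution.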
2. As explanatory variables mainly the area type and measures for carbon, phosphorous and nitrogen were used. But some simple to determine variables are missing, such as water temperature, pH, conductivity, (oxygen concentration in the sediment), which are known to strongly effect some of the detected bacterial phyla, such as Proteobacteria. If these variables are neglected, the "true" causal effect might not be detectable, e.g. the effect attributed to the regions could simply be a temperature effect. Is there no additional information on the lakes available that could be included? Such as annual temperature and pH?
We appreciate this review and have since added more water quality parameters to our analysis, including dissolved oxygen, temperature, pH, conductivity, and turbidity. However, as mentioned in our conclusions/caveats, we do not have any sediment-specific parameters. We agree with the reviewer that by neglecting some of these variables we may not be able to detect a "true" causal effect. Nevertheless, adding the previously mentioned variables to our analysis changed our alpha diversity models as well as the vector fitting of the PCA. Specifically, specific conductance and temperature were the two variables that correlated most strongly with the PCA axis scores and were informative to the alpha diversity models.
Minor: 1. Line 76: There are more studies available that should be cited which study communities and distribution patterns related to environmental factors, especially across Europe: e.g. https://sfamjournals.onlinelibrary.wiley.com/doi/full/10.1111/1462-2920.14992
We appreciate the reviewer's suggestion to add further sources to this statement, and greatly appreciate the inclusion of a specific source. We have added an additional four citations to this section. We recognize that there are many more studies than the nine cited here but feel these best highlight the argument that we were making. Lines 81-82
2. The provided images have a low quality, but that might be due to the incorporation into the pdf
We apologize for the low quality of the images provided. We did adhere to the journal requirements for exporting but have re-exported all figures in the hopes of correcting this issue.
3. Line 99: "These regions can be characterized by their underlying geology, soils, vegetation, and land use." Could you provide this information.
We appreciate this suggestion as it provides a greater context to the overall area of the study. We have included a new table for the supplement. Line 107
4. Line 120: How long were the samples stored before processing?
We thank the reviewer for this comment and have since added storage times for both the sediment cores (up to 7 days) and frozen subsamples (up to three months). We have also included this information (the collection and extraction dates) in a supplemental table (S2 Table). Lines 134 & 140-141
5. Nucleic Acid Preparation, Amplification, and Sequencing: Was the quality/integrity of the DNA controlled before library prep? Please provide the measures.
We appreciate the reviewer's concerns about DNA quality/integrity and have added a new supplemental table (S2 Table) which includes Qubit fluorimeter readings for all DNA extractions. We've also added reference to the S2 Table in the section regarding DNA isolation where we previously commented on our final DNA concentrations. Lines 144-150
6. Nucleic Acid Preparation, Amplification, and Sequencing: Was the PCR performed by the core facility? What kind of PCR protocol was used?
We send genomic DNA to the core facility and they perform all library prep and sequencing. We have added additional reference to the methods used by the University of Minnesota Genomic Center for further clarification. Line 162
7. Line 136: Please provide a reference for the used primers.
This was an oversight on our part, and we have rectified it by including references for both primers. Lines 156-158
8. Methods: Please provide versions for all tools, programs are R packages used.
Thanks for catching this! We've updated the text to include all version notes at the first appearance of the tool/program/package in the text. Lines 166, 168, 169, 174-175
9. Methods: Were any of the environmental/physicochemical variables standardized or log transformed for any of the analyses, e.g. in the PERMANOVA? Do any of these factors covary?
We appreciate this question and apologize for not making this clear in the text. There were two circumstances where we compared our environmental data: first to the alpha diversity scores in the linear model, and second to the axis scores of our PCA. In both circumstances we log transformed our right-skewed data (e.g., TP, TN). However, in the case of the PERMANOVA we were looking at the categorical variable of ecological region, so no transformation was necessary. Physio-chemical variables such as nutrient concentrations and chlorophyll a that describe enrichment and productivity will naturally covary along trophic gradients. This covariation is reflected in the directionality of the vectors in the PCA. Additionally, model selection using AIC penalizes additional complexity and thus selects against explanatory variables whose explained variance is shared with one or more variables already included in the model.
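The AIC penalty described above can be illustrated with a short sketch. It assumes the common least-squares form of AIC (correct up to an additive constant that cancels when models fitted to the same data are compared); the sample size, residual sums of squares, and parameter counts are hypothetical and are not values from our models.

```python
from math import log

def gaussian_aic(n_samples, rss, n_params):
    """AIC for a least-squares fit, up to an additive constant:
    n * ln(RSS / n) + 2k, where k is the number of fitted parameters."""
    return n_samples * log(rss / n_samples) + 2 * n_params

# Hypothetical comparison on 40 samples.
# Model A: three fitted parameters.
# Model B: model A plus one predictor that covaries with those already
# included, so the residual sum of squares barely improves.
aic_a = gaussian_aic(n_samples=40, rss=12.0, n_params=3)
aic_b = gaussian_aic(n_samples=40, rss=11.9, n_params=4)

# The model with the lower AIC is preferred.
print("prefer A" if aic_a < aic_b else "prefer B")
```

In this hypothetical case the small gain in fit does not offset the penalty of two AIC units for the extra parameter, so the simpler model is retained.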

10. Line 155: There is no Figure S1 in the supplement.
We apologize for this and have made sure to include all supplemental and in-text figures in the resubmission.
11. Line 158 onwards. It is not clear from the methods how the samples were grouped for the statistical tests, how many groups there were and which environmental parameters were used.
We apologize for the lack of clarity and have added details on the number of samples per group to the methods section of the text. Lines 187-190
12. Line 163: Multiple (linear?) regression was used for prediction. Please have a look at https://onlinelibrary.wiley.com/doi/full/10.1111/mec.15872 were it is shown that linear predictors do not perform well on such data. Since I did not find a prediction in the results, maybe just the fitting of environmental data to the PCA axes is meant here. Please clarify.
First, we'd like to sincerely thank the reviewer for including this source as it was incredibly informative. However, we believe there may be confusion about its relationship to this work. We used multiple regression to model Shannon diversity and the observed number of OTUs using our environmental data. In this case, we still had more samples than regressors and found this approach to fit the data. The results of these models were included with their R2 values in the alpha diversity subsection of the results. We additionally fit environmental data to the PCA axes, and those results are in the beta diversity subsection of the results. We regret the lack of clarity in this approach and have refined the text to make this clearer. Lines 298-305
13. Line 211: "> 2250 OTUs" is stated as diverse, but there are no references provided that compare it to other studies.
We thank the reviewer for this comment and have since clarified the language of this statement. We were indicating that, relative to other locations in the lake microbiome (e.g., the water column) and to other sediment organisms (e.g., diatoms), these levels of diversity are great. However, our levels of diversity are similar to those in other studies of the bacterial microbiome of sediments. Lines 239-242
14. Fig. 2 and Fig. 3 are more or less redundant. Fig. 2 could be put into the Supplement. Further, it would be good to split the boxplots for deep and shallow sediment samples which would better show that there is not difference between these. Again, it would be good to know the sequencing depth per sample and saturation before drawing conclusion about the richness/evenness.
We have taken the reviewer's advice and moved Fig. 2 to the supplemental information. We have left Fig. 3 in the text as it reflects the samples per group for the statistical testing presented in the results; however, we made an additional figure illustrating the separation of the metrics by depth (S5 Fig.). We also provide details regarding the sequencing depth per sample, as mentioned in a previous comment/response.
15. Line 255: It would be good to have some examples about functions from analysed species. Most result/discussion points provided later are on the level of phyla, classed etc. Did you not determine genera or species that perform specific function which could be attributed to ecological functions linked the different trophies?
We appreciate this comment but hesitate to attempt to assign function from a small portion of the 16S rRNA gene. We did initially perform a Tax4Fun analysis, which was able to assign functions for around 1% of the data; however, the top hits were all related to housekeeping functions. In addition, at increasing taxonomic resolution, the confidence in OTU assignment decreased. A number of our OTUs were only classified (at high confidence) to the Order level. Due to the limitations of connecting function to 16S rRNA genes (e.g. https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00815-y reports limited success of functional assignments to 16S rRNA amplicons outside the human microbiome), the inability of Tax4Fun to assign distinct function, and the lack of higher resolution taxonomy, we cannot determine specific functions or genera or species, let alone their ecological functions across trophic levels.
16. Figure S3 does not indicate significant differences.
We have added significant differences to what was figure S3 and is now figure S6.
17. Line 275: What could be the reason for the more diverse community here? Are different metabolic processes involved?
See response to item 15 for context on differing metabolic processes/functions with regard to the dataset.
18. Line 279: Rare taxa are mentioned here as important. Were these captured at this sequencing depth?
When we mention rare taxa, we're using our definition of rare in which "... we deemed rare taxa at the phylum level as those not comprising more than 1% relative abundance of the sample" (Lines 312-315). While there's always the possibility that greater sequencing depth may have captured more taxa, when we discuss the importance of rare taxa we are doing so with this definition in mind.
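The definition quoted above can be expressed as a short sketch. The function name, the phylum labels, and the read counts below are hypothetical and serve only to illustrate the 1% relative abundance threshold applied within a single sample.

```python
def rare_phyla(phylum_counts, threshold=0.01):
    """Return the phyla whose relative abundance in one sample does not
    exceed `threshold` (1% by default), sorted by name."""
    total = sum(phylum_counts.values())
    if total == 0:
        return []
    return sorted(
        name for name, count in phylum_counts.items()
        if count / total <= threshold
    )

# Hypothetical phylum-level read counts for a single sample.
sample = {
    "Proteobacteria": 600,
    "Chloroflexi": 300,
    "Nitrospirae": 95,
    "Phylum_X": 5,  # placeholder label, 0.5% of reads
}
print(rare_phyla(sample))
```

Because the threshold is relative to each sample's own read total, a phylum can be rare in one lake and not in another.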
19. Line 353: You state that nitrogen is a selective variable also in your data, did you find more/less taxa for nitrogen cycling in these samples?
We appreciate this question and have elaborated in the text to the best of our ability about nitrogen cycling taxa. We discussed some nitrogen cycling taxa as members of the order MBNT15, and in our supplemental table 3 we do note that the phylum Nitrospirae varies statistically across the ecological regions. However, we did not discuss any specific Nitrospirae as there are only three classes, each of which contains only one order and one family. Below the family level, our taxonomic resolution was uncultured, unclassified class members, or Nitrospira. Given the lack of taxonomic resolution, we felt there was no substantial discussion to have surrounding the potential roles of nitrifying bacteria across these systems.

We have refined this terminology and the sentence now reads: "In this study, we selected twenty lakes with varying physio-chemical properties across four ecological regions of Minnesota." Lines 33-34 & 83-84
2. Line 35, add 'bacterial' community structure

We have clarified this and the statement now reads: "(ii) determine how lake location and watershed land-use impact aqueous chemistry and influence bacterial community structure." Lines 36-37 & 86-87
3. I suggest to mention brief methods used in the study to reach the conclusion in the abstract as well.
We appreciate this suggestion and have added a statement regarding the use of 16S rRNA amplicon data to our abstract. Line 37

4. Line 37, What is TP?
We apologize for the addition of an undefined acronym here. We've corrected this to read total phosphorus instead of TP. Line 40
5. I could not find the knowledge gap or significance of the study in the whole manuscript. I suggest adding a significance statement both in the abstract and the introduction.

We appreciate this suggestion and have added a statement of significance to the manuscript in the abstract/introduction, and we revisit this significance in our conclusion. Lines 27-31 & 477-480
6. Line 61-62, needs restructuring
We were unsure of what exactly to restructure in these lines; however, we do acknowledge that the statements were written passively in an otherwise active narrative. We've revised them to keep the same voice throughout the paragraph. Lines 65-67
7. Line 76, add 'microbial/bacterial' community assembly.
We have clarified this sentence to ensure the community assembly in question is specific to bacterial community assembly. Line 81
8. Line 81-84, the sentence is too long and have a lot of jargon. please re-structure
We thank the reviewer for this suggestion; however, we feel that we have provided the necessary details in the preceding introduction and that this wording accurately and best describes the specific aims of our study. We are also careful throughout to define terms, avoid jargon, and translate our results for both specialists and a broader audience.

9. Line 103, what year was the water sample collected.

We apologize for not including this information in our tables. We've since updated supplemental table 2 and have included a statement about dates in the text.
10. How was the water sample collected and stored? Was there any treatment done to the water prior to any tests conducted? were these samples collected every-time the sediment samples were collected? why was the water analysis done only once? This is not clear.
We apologize for a lack of clarity in the water sampling methodology. We have since rewritten this section to incorporate sample storage as well as sample hold times/temperatures for all measured parameters. All samples, including sediment core samples, were taken at a single timepoint only. We have also included sampling dates in our supplemental table, which were accidentally left out of the first version. Lines 110-126
11. How often were the samples collected. Details of the time the sediment samples were collected?
We thank the reviewer for this comment and have since added a supplemental table (S3 Table) with the exact dates of sediment sampling, moving beyond the existing text, which only gives the range of sampling dates.
12. Line 126, how many samples were collected in total? You have said there were total 40 samples. How are they distributed?
We apologize for the confusion surrounding the total number of samples and their distribution. To clarify, we selected twenty lakes; each was cored and subsampled at two depth intervals, resulting in the forty total samples. Following a previous comment from this reviewer, we have added a statement regarding methodology to the abstract as well as throughout the text. Lines 37-38, 137-140 & 177-178
13. Was there any data collected for sediment samples? e.g, pH, temperature, salinity etc?
We appreciate this review and have acknowledged in our Caveats and Conclusions that our dataset is limited to water quality data only. Nevertheless, we have added water measures of pH, temperature, specific conductance, turbidity, and dissolved oxygen to our analysis. These measures, like our nutrient measures, are from a location of depth minus one meter. When these were added to our analysis, there were changes in our alpha diversity models as well as the vector fitting of the PCA. Specifically, specific conductance and temperature were the two variables that correlated most strongly with the PCA axis scores and were informative to the alpha diversity models.
We have changed this text to read: "We extracted DNA from 0.25g of wet sediment from each subsample using a PowerSoil DNA Isolation Kit (Qiagen, Inc.) following the manufacturer's protocols." Lines 141-143
15. Line 128, How were the extraction carried out? Were they extracted in duplicates? was a certain number of samples repeated if not done in repeats? What were your controls? Both positive and negative. How can you determine the efficiency of your extraction? and how did you control for contaminations?
We thank the reviewer for these comments and have addressed our use of negative controls in the text. Additionally, we provide reference to methodologies used by the University of Minnesota Genomic Center, who performed negative controls for sequencing. Given the complexity of the samples we did not perform any positive control, as no mock community would adequately ensure that we sufficiently extracted the contained organisms of our samples.
18. What were your controls for PCR? Did you sequence your controls as well? Having negative controls is extremely important.
We send genomic DNA to the core facility and they perform all library prep and sequencing, including controls. We have added additional reference to the methods used by the University of
We clarify the distribution of the 40 samples in a previous comment, and in turn we've added (n=#) for all group statistical testing to clarify the distribution of samples across ecological regions.

Lines 187-190
20. Line 146, what version of SILVA database did you use?
We thank the reviewer for catching this and have since included all version notes for tools, programs, and packages at the first appearance in the text. Lines 166, 168, 169, 174-175
21. How did you deal with controls? Did you remove any contaminant taxa?
We clarified our use of controls in the initial comment (#15); however, negative controls were sent for sequencing, where they failed quality control by the core facility and were not sequenced. This information has been added to the methods section of the text. Lines 143-150 & 159-162
22. Line 158, what do you mean by all available measures for alpha diversity? please mention names? Why did you choose to analyse all of them? I suggest you choose only one for each measure.
We thank the reviewer for this question. We have included multiple measures of alpha diversity in the supplemental information because we are aware that researchers may prefer a given metric. We chose to include them because the patterns in diversity are the same under all of them. However, in the main text we exclusively discuss the observed number of OTUs as a measure of species richness and Shannon index scores as a measure of species evenness.
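For reference, the two metrics discussed in the main text can be computed as in the sketch below. It uses the standard definitions (a count of non-zero OTUs, and the Shannon index with natural logarithms); the function names and the read counts are hypothetical and are not drawn from our dataset or from the software used in the manuscript.

```python
from math import log

def observed_otus(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs present in the sample."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * log(p) for p in proportions)

# Hypothetical per-OTU read counts: both samples have four observed OTUs,
# but reads are spread evenly in one and dominated by a single OTU in the other.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

print(observed_otus(even_sample), observed_otus(skewed_sample))
print(shannon_index(even_sample) > shannon_index(skewed_sample))
```

The two hypothetical samples share the same observed richness, while the Shannon score is lower for the sample dominated by one OTU, which is why the two metrics are reported side by side.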
23. What metric did you choose to measure sample richness?
As stated in the text Line 186, we use the observed number of OTUs as a measure of sample richness.

We have clarified this language in two locations in the text. Lines 200 & 207
24. Line 211, richness is not equal to diversity. What did you mean here?
We appreciate the question surrounding our language in this statement and have since clarified this comment to reflect a comparison in richness and diversity, especially with regard to the water column microbiome and the diversity of diatoms in the sediments. Lines 239-242
25. Line 214, add name of the test used to all p values.

We appreciate the reviewer's suggestion and have added test names to all p-values throughout the text.
26. Line 283. These taxa are extremely tricky to analyse if you do not have controls. I would like you to mention the contaminants found in the sequences and then analyse this. Otherwise rare taxa data cannot be trusted.
We understand the reviewer's concerns and we feel we have adequately addressed the issue of controls in previous comments (15, 18, & 21).
27. Line 373, add 'relative' abundance.
We have made this change.
28. I would suggest to draw conclusions from previous data. Did you get the same findings as from other studies?
We have included several discussions linking the findings in our data to those found in previous data. Lines 306-310 compared trends in alpha diversity, 385-387 contextualized our beta diversity analysis to several other studies, and 408-412, 445-449, and several others discussed specific taxonomic relationships with trophic and/or ecological status. These discussions and comparisons to existing studies served as the basis for our conclusions.
29. Please revisit your hypothesis in the conclusion and explain in relevance to your findings.
While we don't explicitly restate our initial aims, we do paraphrase the objectives of the study: "we examined the lake sediment bacterial communities of 20 lakes to determine the influence of land-use and large-scale land classifications on community structure and diversity." We then proceed to discuss the patterns we found in alpha diversity and with regard to specific taxa. However, we did not include a statement regarding our findings about the drivers of community composition, so we have added that to this section. We have also added language about the findings in relation to our study's significance to the conclusion section. Lines 473-480
30. Figure 3, add the name of metric to y-axis labels.
We have added the metrics to the y-axis in addition to the facets. Please note, however, that this figure is no longer figure 3 and is now figure 2.