Reader Comments

Referee comments: Referee 1

Posted by PLOS_ONE_Group on 25 Jan 2008 at 13:24 GMT

Referee 1's review:

In this study the authors filtered water from aquatic communities and characterized the DNA viral fraction collected between 0.1 and 0.8 micron filters. This fraction includes community DNA from various microbes, not solely from viral particles. The experimental work was based on methods this group has developed for their global survey project, and on viral work pioneered by DeLong and by Breitbart/Rohwer. This work also provides a very nice complement to previously published studies from the same group and will be of interest to marine virologists and evolutionary biologists.
The claims made in the paper appear appropriate from the genomic data and analyses performed.

I do have a few comments to help clarify the text. Although I consider myself well versed in genomics, I did find it at times a bit arduous to follow:

The authors tagged a metagenomic sequence as being of viral origin only if the top homology match was to a viral protein. The authors acknowledge that this approach will most likely underestimate the number of true viral sequences present. They identified in this manner more than 154,662 viral sequences and functionally characterized them through sequence similarity clustering.

1. I assume this number represents the number of viral READS?
2. It is not clear to me, from the text, what percentage of the total sequences (or reads) this number (154,662) represents.
3. The authors indicate in the 'Results' section that 79.3% of these viral sequences "belonged to a sequence assembly scaffold that was taxonomically assigned as viral". Does this mean that >79% of the viral sequences ASSEMBLED into this scaffold? If so, would there not be considerable overlap between these sequences? I assume not, because in the protein clustering section the authors indicate that these viral sequences then cluster into many different protein groups. In the same paragraph they also refer to 154,662 proteins. I think it would be more appropriate to refer to them as peptides, since they are most likely not complete proteins (or are they?). It is just not clear from the text.

The interest of this study lies primarily in the diversity of aquatic environments that the authors were able to sample (37 in all) over a 10-month period during a 6-leg oceanographic expedition. They added another 4 sites that had been previously analyzed in the course of the Sargasso Sea pilot study. This provides a unique snapshot of the viral ecology of these locales, ranging from Nova Scotia to French Polynesia.

4. Were the sequence data from the 4 Sargasso stations already available or did the authors have to take samples out of freezers and sequence them? If they were available, I assume the same fraction (0.1-0.8 micron) had already been analyzed for the Sargasso study but that viral sequences had not been analyzed? Please clarify in the text.
5. What is considered as "the first part of the GOS expedition" (first paragraph, page 5, Results)? Is that the first leg of the expedition or the Sargasso Sea data?

Other edits/comments

6. The term 'host-derived viral sequences' is a bit confusing. There is a nice description and literature review in the introduction on lateral gene transfer between viruses and their hosts; it wouldn't hurt to hint back at that in the results section when discussing host-derived viral sequences for the first time.
7. Figure 2. I do not see any yellow or orange lines, as mentioned in the legend to this figure. Figure 3. I'm not sure I see orange lines in Figure 3. These figures are really not very clear, at least not in my printed version.
8. Figures 4, 5 and S1-S9. Does the colored gradient for the histogram bars mean anything or is that to help in visualizing which bar corresponds to each individual site?
9. Table 1: indicates a value of 4.0x10^7 copies per L of surface water for gene petE. The text on page 7 says 2.1x10^7 copies/L.
10. It would facilitate things for the reader if you indicated in Table 1 which fractions correspond to ocean samples and which correspond to freshwater.
11. Table S1: This table is marked "Table 1."

N.B. These are the comments made by the referee when reviewing an earlier version of this paper. Prior to publication the manuscript has been revised in light of these comments and to address other editorial requirements.