Peer Review History

Original Submission: August 18, 2020
Decision Letter - Francisco Rodriguez-Valera, Editor

PONE-D-20-25895

metaVaR: introducing metavariant species models for reference-free metagenomic-based population genomics

PLOS ONE

Dear Dr. Madoui,

First of all, apologies for the long delay in assessing your manuscript. It is hard to find reviewers for bioinformatics manuscripts. Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 19 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Francisco Rodriguez-Valera

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please amend your list of authors on the manuscript to ensure that each author is linked to an affiliation. Authors’ affiliations should reflect the institution where the work was done (if authors moved subsequently, you can also list the new affiliation stating “current affiliation:….” as necessary).

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In the article "metaVaR: introducing metavariant species models for reference-free metagenomic-based population genomics", the authors present a method to retrieve polymorphism data from metagenome samples without the need for a reference genome. The method performs variant calling directly on the raw reads, then groups the variants into "Metavariant Species" (MVS) based on variant coverage.

The method is mathematically sound and it certainly has a place in current metagenomic analyses. Although the production of Metagenome Assembled Genomes (MAGs) is widespread in the field, there are still many samples for which the assembly of contigs to build MAGs remains difficult (one such example is any sample where one species of microorganism is especially abundant, such as salterns). However, I see a few issues that need to be addressed before this article can be published.

One of them is the overall readability of the article, especially the introduction. Although there are no grammatical or spelling errors, some phrases should be rewritten to make them easier to understand. I would draw special attention to lines 11-16, 50-53 and 256-275, but the entire article could do with an additional punctuation pass. The figures could do with a bit of retouching: Figure 1 in particular has too much information, including pictures referring to different sections of the article. Offloading some of the subfigures to Supplementary Data would make the figure easier to parse. I am also missing a figure explaining graphically the concepts of variable loci and metavariants. Understanding these concepts is vital to understanding the method, but the definition in the text, while correct, is a tad too formal for someone without a strong mathematical background, and a good figure could help biologists understand the core concepts of the article. Algorithm 1 could also be moved to Supplementary Data, as it is not required to understand the method.

My main gripe with the article, however, is the application of this method to real metagenomic analyses. I'm particularly worried about the following issues:

* The scoring of each metavariant cluster depends on the number of samples, but metagenomic studies do not usually include replicates and the number of samples sequenced is usually very low. How does sample size affect metavariant clustering? How many samples are needed to obtain a good separation of clusters? Does the homogeneity of the samples (how similar they are to one another in terms of population composition) affect the scoring somehow?

* I am particularly troubled by the number of metavariants the method is able to recover: the simulated metagenomic dataset test uses only 6 genomes, and the real metagenomic test focuses on a single genome. A real metagenomic sample is going to have many more genomes. How many is the tool able to recover? If the tool is only able to recover a limited number of genomes per run, is it possible to direct the tool to recover a specific genome?

* Although I understand the motivation for building MVS, a biologist using this tool still needs a way to connect an MVS to a genome, which the article does not include.

If these issues are addressed, then I have no doubt metaVaR has a bright future as a tool in population genomics.

Reviewer #2: In this manuscript the authors introduce a new method for detecting micro-diversity in metagenomic datasets. They also introduce the concept of metavariants and metavariant species. Their tool, metaVaR, was tested on both real and simulated datasets and showed superior performance. The manuscript is relevant and well written, the experiments are well designed and the results support the conclusions. Nevertheless, some minor changes and clarifications are necessary to make this manuscript suitable for publication.

Ln 71: It seems to me that considering only a single metavariant might be a major limitation of the proposed method, albeit necessary to make the computations feasible. This should be mentioned in the discussion.

Ln 73: Would it not be more logical to establish the reference as the variant that has higher coverage?

Ln 113: Which distance metric was used with the DBSCAN algorithm? Did you test different metrics?

Ln 189-191: This setup is an overly simplistic representation of metagenomes. In most real-world datasets, metagenomes are made up of many more species and often multiple strains of the same species. Using only six species and no multiple strains from the same species is likely to inflate the precision and skew the other evaluation metrics.

Ln 195: Perhaps the term “communities” is more adequate than “populations” considering you are dealing with different species.

Figure 1C: Is there evidence to support that the DBSCAN noise points are all the result of inter-species variation or is this one of your assumptions? In case of the latter, this should be explicitly stated along with the rationale behind it.

Figure 2A: Is it correct to assume that coverage values refer to specific positions within genomes rather than coverage of the whole genome? If so, please clarify.

Ln 253: word missing after evolutionary?

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Dear editor and reviewers,

We really appreciate the time spent on reviewing our study. We took all your remarks into consideration and wrote a complete point-by-point response to address your legitimate concerns. We hope the effort we provided will fulfill your expectations. The manuscript contains major changes and new analyses, and we hope readers will find great interest in our work. A detailed answer is provided in a separate file.

Best regards,

The authors

Attachments
Submitted filename: Rebuttal_letter.pdf
Decision Letter - Francisco Rodriguez-Valera, Editor

metaVaR: introducing metavariant species models for reference-free metagenomic-based population genomics

PONE-D-20-25895R1

Dear Dr. Madoui,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Francisco Rodriguez-Valera

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Sorry for the delays

Reviewers' comments:
