Peer Review History

Original Submission - October 7, 2020
Decision Letter - Jason A. Papin, Editor, Niranjan Nagarajan, Editor

Dear Dr Rogers,

Thank you very much for submitting your manuscript "Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

All three reviewers found the manuscript to be of interest. Reviewers 1 and 2 have several concerns particularly related to validation of the methods described here which need to be addressed in a revised manuscript.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Niranjan Nagarajan

Associate Editor

PLOS Computational Biology

Jason Papin

Editor-in-Chief

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: see attached review document

Reviewer #2: The manuscript titled “Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions” by Eldjárn et al. demonstrates a ranking scheme to score links between genomic and metabolomic data. Specifically, the authors use information about strains, biosynthetic gene clusters (BGCs) and the metabolic fingerprints (MFs) obtained from mass spectra. The authors raise concerns over the existing scoring scheme and propose a modification of this score to circumvent the challenges. In addition, the authors propose the use of Input-Output Kernel Regression (IOKR) to predict the ranking of BGCs by combining the information from strains and MFs. Further, they integrate the scores obtained from the two proposed methods and suggest that the combined score is better. Overall, I find the method reasonable; however, the proposed method requires further validation. Currently, the validation relies heavily on the predictions from antiSMASH, although other BGC tools are available. Further, in some places in the manuscript, I observe that the scores obtained using IOKR are lower, raising critical questions about the method proposed. I have listed my suggestions below.

Major

1. Lines 158-159 - What do the correlation scores indicate? What is the range of the correlation scores?

2. Lines 176-177 - For compounds having similar structures, how is the mapping done from MS2 spectra to the space of metabolites? In such a case of very similar compounds, how would the scores vary? How do the authors handle the mass spectra of cyclic compounds and stereoisomers?

3. Lines 212-213 – Since the input spectra are filtered to include only the peaks in training data, wouldn’t there be a bias?

4. It is quite unclear from the methods how the authors link the mass spectrum with molecular fingerprints. I suggest that the authors explain this further in the manuscript.

5. It is also not clear how the problems reported in lines 223-235 are alleviated by combining the two techniques, since Table 1 does not report the scores obtained from IOKR + standardized correlation.

6. Table 1 – The scores of the links reported by IOKR offer very little improvement over the standardized score. In that case, what is the novelty of using IOKR? I believe it is important for the authors to discuss this in the main manuscript.

7. Lines 372-376 are quite unclear – what is a correct metabolite? I suggest rephrasing for clarity.

8. Line 374 – How are the rank ties resolved?

9. Figure 7 has missing axis labels and hence cannot be interpreted.

10. How do the links given by standardized correlation score and IOKR vary? Is there a degree of overlap?

11. Table 4 - What does it mean that so many links rank above the validated links? Based on the claims in Results section 1, I would expect the validated link from the combination of the two scores (l_1/2) to have the highest rank, but I do not observe this in Table 4. Shouldn’t the validated links have a higher rank than the others? This should be discussed.

12. Also, since BGC0000137 has multiple entries, which rank should be prioritised and used for further analyses? This is an important point to be discussed in the manuscript.

13. Since the authors are proposing a new method, I suggest that the comparison also be done with BGCs from other tools such as DeepBGC.

14. The link to the tool NPLinker is not provided in the main manuscript. Hence, this couldn't be used/tested.
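
Comment 8 concerns a point that is easy to illustrate. The sketch below is not taken from the manuscript or from NPLinker; it is a minimal, hypothetical example (the function `rank_of`, the candidate names and the scores are all assumptions made for illustration) showing how the reported rank of a single link can change depending on which tie-breaking rule is applied.

```python
from typing import Dict


def rank_of(target: str, scores: Dict[str, float], ties: str = "average") -> float:
    """Rank (1 = best) of `target` when candidates are sorted by descending score.

    ties="min":     optimistic, tied items all receive the best rank in the group
    ties="max":     pessimistic, tied items all receive the worst rank in the group
    ties="average": tied items share the mean of the ranks they span
    """
    s = scores[target]
    higher = sum(1 for v in scores.values() if v > s)  # strictly better candidates
    tied = sum(1 for v in scores.values() if v == s)   # includes the target itself
    best, worst = higher + 1, higher + tied
    if ties == "min":
        return float(best)
    if ties == "max":
        return float(worst)
    if ties == "average":
        return (best + worst) / 2
    raise ValueError(f"unknown tie rule: {ties}")


# Hypothetical scores for five candidate links; the names are illustrative only.
example_scores = {"bgc_a": 3.2, "bgc_b": 2.5, "bgc_c": 2.5, "bgc_d": 2.5, "bgc_e": 0.1}

for rule in ("min", "average", "max"):
    print(rule, rank_of("bgc_c", example_scores, rule))
```

With three candidates sharing the same score, the same link is reported at rank 2, 3 or 4 depending on the rule, which is why a ranking-based evaluation should state the rule it uses.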

Minor

1. In a few places, such as line 452, the notation is not consistent.

2. The terms “established links”, “validated links” and “verified links” are used in several places, which makes the text difficult to follow. I suggest the authors use consistent terms or explain what each term means in the supplement.

3. Small typos need to be fixed (e.g. Summary prodcut -> product)

Reviewer #3: Abstract: Single paragraph is recommended.

There are several typos in the manuscript: “prodcut”, “unknnown”, “concenrated”, “compuational”, “peptidogenomcs”, “severly”, “to produces”, etc.

Some acronyms, such as RiPP, are used without their expanded form being given in the manuscript.

Table 2 is first mentioned in line 357, but without any details on its rows (what the total score is, etc.). This information is provided much later, when Table 2 is mentioned again. Please either remove the first reference or move the related explanations to the place where Table 2 is first mentioned.

Crüsemann figure (Figure 6) is repeated in the supplementary file section 10. In the related supplementary figures for other datasets, please use red, not green, for verified points. Green is difficult to see.

Table 4: What do the bold numbers show? This should be added to table caption.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Tunahan Cakir

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

Attachment
Submitted filename: NPLinker-review-2.docx

Revision 1

Attachment
Submitted filename: nplinker_reply_to_reviewers_20210305.pdf

Decision Letter - Jason A. Papin, Editor, Niranjan Nagarajan, Editor

Dear Dr Rogers,

We are pleased to inform you that your manuscript 'Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Niranjan Nagarajan

Associate Editor

PLOS Computational Biology

Jason Papin

Editor-in-Chief

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have carefully considered the comments and concerns of this reviewer, and responded in a very scholarly and appropriate fashion. The revised paper is significantly improved and will make a fine contribution to the field.

Previous Comment Number (also used in the Author response):

1. It appears that the authors have solved the issues.

2. The additions by the authors respond to and answer our suggestion that they try matching substructures to fragmentation spectra.

3. The addition of the clarification in section 2.6 is very good!

4. Based on their arguments, this reviewer agrees that the quality of the MAG-BGCs would not add much to the main conclusions of the paper.

5. The authors have provided a very good response to this issue.

6. The explanation provided by the authors appears valid.

7. This reviewer agrees that there are adequate comparisons against the current state-of-the-art, which is the strain correlation score defined by Doroghazi et al.

8. The added text helps to explain the accuracy.

Items 9-19 – All of these issues have been properly dealt with by the authors, and good explanations provided.

Reviewer #2: Thank you for carefully considering all the comments and making appropriate modifications. The revised manuscript looks substantially stronger than the earlier submission.

Reviewer #3: My comments were properly addressed by the authors.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Formally Accepted
Acceptance Letter - Jason A. Papin, Editor, Niranjan Nagarajan, Editor

PCOMPBIOL-D-20-01813R1

Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions

Dear Dr Rogers,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Andrea Szabo

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
