Peer Review History
| Original Submission (September 15, 2022) |
|
PONE-D-22-25734
Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
PLOS ONE

Dear Dr. Wasmuth,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Reviewer 1 raised several important concerns. We think they are essential issues that need to be addressed before considering the publication of the manuscript.

Please submit your revised manuscript by Nov 20 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Zechen Chong
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

2. Thank you for stating the following in the Acknowledgments Section of your manuscript:

"This work was supported by Results Driven Agricultural Research (RDAR #2016F013R) to JDW, the Natural Sciences and Engineering Research Council of Canada (NSERC) through Discovery Grants (#06239-2015 and 04589-2020) to JDW, and an NSERC Collaborative Research and Training Experience Program (CREATE) program in Host-Parasite Interactions (#413888-2012) to JDW and others. ECA was supported by a National Science Foundation CAREER award."

We note that you have provided funding information that is not currently declared in your Funding Statement.
However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

"JDW; 04589-2020; The Natural Sciences and Engineering Research Council of Canada; https://www.nserc-crsng.gc.ca
JDW; 06239-2015; The Natural Sciences and Engineering Research Council of Canada; https://www.nserc-crsng.gc.ca
JDW; 413888-2012; The Natural Sciences and Engineering Research Council of Canada; https://www.nserc-crsng.gc.ca
JDW; 2016F013R; Results Driven Agricultural Research (Alberta); https://rdar.ca/
ECA; no number; National Science Foundation (USA); https://www.nsf.gov/"

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter.
For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access. We will update your Data Availability statement to reflect the information you provide in your cover letter.

4. Please clarify the two tables that are both numbered "Table 3": "Table 3 – Performance of long-read structural variant callers on simulated data" on page 9 and "Table 3 – Predicted deletions, duplications, and inversions in C. elegans" on page 10.

5. Please upload a copy of Supporting Information Tables S1 to S13, which you refer to in your text on pages 27 and 28.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A
Reviewer #2: Yes
Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper, the authors benchmarked several short-read and long-read SV callers on simulated C. elegans datasets and tested these SV callers on several real datasets. In general, I do not think the conclusions of this paper would provide significant guidance for SV caller selection in future research. As no benchmark was available for the real data, the authors showed low consistency between SV callers, from which we could not draw any useful conclusions about which SV caller works best for the C. elegans genome.
Benchmarking SV callers only on simulated data is usually not sufficient, as some tools may perform very well in simulations but poorly on real data, depending on how the simulation (mock genome) is generated. Although I acknowledge the authors’ efforts in benchmarking all the SV callers at various sequencing depths, I would suggest rejecting the current version of the manuscript and considering it for publication if substantially revised.

Major concerns:

1. The interpretation of results in real data needs to be substantially revised. As no benchmark is available for real data, we cannot use the regular recall/precision/F1 score to assess SV discovery accuracy. Some alternative approaches should be used to compare the performance of SV callers, instead of simply showing low consistency between SV callers. There is much more work needed after this point. The significance of this manuscript is largely determined by how SV callers are evaluated in real data without benchmark datasets. What the authors have presented is far from sufficient to answer the questions raised by the authors at the beginning.

2. In the abstract, the authors state that “multiple short-read and long-read tools were benchmarked using real and simulated data”. I would suggest revising such statements in the manuscript, as it is usually not considered “benchmarking” in real data if no reference SV callset is available.

3. In the Introduction section, the authors claim that there are few public long-read datasets for non-human organisms. In fact, although benchmark datasets are relatively rare for non-human species, samples sequenced on both short-read and long-read platforms are pretty common in the SRA. There are also projects like the Vertebrate Genomes Project that focus on non-human organisms. I understand that C. elegans is the focus of this paper, but it may limit the audience to C. elegans researchers, as benchmark results in C. elegans cannot represent other species if the authors claim that benchmarks in human cannot represent C. elegans.

4. In Table 3, it is quite surprising that long-read SV callers showed such low recall for duplications, except for Sniffles. The majority of the simulated SVs are shorter than 10 kbp and should be detectable from long reads. Duplications are sometimes reported as insertions by some SV callers. I know Sniffles, pbsv, and SVIM do report insertions in the VCF file. How were the insertion calls treated? If an insertion event is reported at a true duplication site with the same SV size, should this duplication still be counted as a FN?

5. In the section ‘Prediction of known structural variants’, only three previously validated SVs were used to benchmark the short-read SV callers. Could the authors clarify whether there are only three SVs in the benchmark dataset, or did they select just one SV of each SV type? Based on these three SVs in one sample, if they are not clinically relevant or extremely hard to identify, the results in this section are not significant enough to demonstrate which SV callers are more accurate, considering the fact that there are hundreds of SVs per C. elegans genome in the last table.

6. The use of the Jaccard index to represent SV calling accuracy could be a novel approach. According to its definition and its applications in simulation benchmarks, the Jaccard index could be biased towards large SVs, as they contain more base pairs. One option is to set a maximum allowed SV size when calculating the Jaccard index. For example, with a cutoff of 2 kbp, only 2000 bp would be counted for any SV longer than the cutoff, so that a single 4 Mbp FP would not greatly reduce the Jaccard index.

7. For the simulation benchmark in the Results section, I would suggest adding some more details about how the mock genomes were generated, especially the number of SVs per genome. It would help readers who are not very familiar with the C. elegans genome.

Minor concerns:

1. Why was Manta not included in the short-read SV caller comparison? It is also a widely used SV caller. In our previous experience of SV discovery in human genomes, Manta often performs better than other SV callers.

2. In the Methods section, parentheses should be used in the equations for the Precision and Recall calculations: Precision = TP / (TP + FP) and Recall = TP / (TP + FN).

3. There are two ‘Table 3’ tables in the manuscript.

Reviewer #2: I appreciate the authors’ attention to the revisions and think this will be cited as a methodological caution to the field. I am glad to see the paper come out to the field. I have nothing more to add.

Reviewer #3: Lesack and colleagues describe an analysis of the ability of different bioinformatic tools to detect structural variation in simulated and real data from the model nematode Caenorhabditis elegans. The rationale is that there are several different tools and approaches to detect structural variants, but existing benchmarks are based on human data, which may not apply to non-human species, and there is substantial variation in the precision and recall of different tools. The authors make use of C. elegans as it is one species for which a large amount of data is available (including that curated by one of the authors) and some structural variants have been previously characterised; while this is not necessarily a “truth” dataset, it is perhaps as close as one can get for any non-human species. The manuscript is well written and has been clearly improved and refined after having already undergone a round of peer review. Scientifically, it is my opinion that this study provides an important reflection on a decision-making process that is not well defined, even when a “gold standard” or “truth” dataset is used.
The novelty that was perhaps underappreciated by the previous reviewers is that non-human organisms with different genome compositions and biases will behave quite differently from the data on which the tools were designed and tested; this study is clear in pointing this out. While the study does not point to a clear winner, which is perfectly unsurprising, it does test several relevant parameters and does emphasise that more than one tool is needed. Perhaps a missing part of the discussion is the fact that some recent approaches for structural variant calling rely on the combination and consensus of multiple tools (e.g. Parliament2; Zarate et al. 2020, GigaScience), which is briefly mentioned in the introduction, but the overall concept is not revisited. Nonetheless, the point is not to develop yet another tool but to better understand how existing tools behave in different biological and experimental contexts. I think the authors have done a thorough job at this. In my opinion, the authors have also done a good job at addressing the previous reviewers' comments, made a number of sensible changes, and pushed back on some unreasonable suggestions. I am happy to support the publication of this manuscript as it is in PLOS ONE.

Kind regards,
Stephen Doyle
Wellcome Sanger Institute

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous, but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: Yes: Stephen Doyle

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site.
Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
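Two of Reviewer #1's points above are concrete enough to sketch in code: the corrected precision/recall formulas (minor concern 2) and the size-capped Jaccard index (major concern 6). The sketch below is illustrative only, under assumed conventions (SVs as half-open `(start, end)` intervals on one chromosome, greedy best-overlap matching); the function names and the matching strategy are assumptions, not the authors' actual implementation.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP), with explicit parentheses."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN), with explicit parentheses."""
    return tp / (tp + fn)

def capped_jaccard(truth, calls, cap=2000):
    """Base-pair Jaccard index with each SV's contribution capped at `cap` bp.

    truth, calls: lists of half-open (start, end) intervals on one chromosome.
    Capping means a single 4 Mbp false positive adds at most `cap` bp to the
    union, so one huge call cannot dominate the score.
    """
    intersection = 0
    for t_start, t_end in truth:
        # Best capped overlap of this truth SV with any call (greedy matching;
        # a 1:1 assignment is not enforced in this sketch).
        best = 0
        for c_start, c_end in calls:
            overlap = max(0, min(t_end, c_end) - max(t_start, c_start))
            best = max(best, min(overlap, cap))
        intersection += best
    union = (sum(min(e - s, cap) for s, e in truth)
             + sum(min(e - s, cap) for s, e in calls)
             - intersection)
    return intersection / union if union else 0.0
```

With `cap=2000`, a single 1 kbp true deletion matched by a 4 Mbp call scores 0.5 rather than roughly 0.00025, which is the effect the reviewer describes.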
| Revision 1 |
|
Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
PONE-D-22-25734R1

Dear Dr. Wasmuth,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.

To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Zechen Chong
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed
Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: (No Response)

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)
Reviewer #2: Yes
Reviewer #3: (No Response)

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: (No Response)

**********

6.
Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: I was happy with the previous revisions and thought the paper could have been accepted in the previous round. I do want to comment on some of the requests from the other reviewer and to validate some of the authors' responses. First, PLOS ONE is supposed to accept papers that are technically sound, regardless of impact. I support the authors' choice not to expand the manuscript with analyses outside the scope of their work. Second, in our own work on SVs we have found discordance between SV callers and have found that those tuned to human variation perform poorly in model organisms. We have less experience in C. elegans but can state that we find similar results using these same methods in Drosophila, also with moderate-sized CNVs. pbsv performed especially badly (10% identified) and the documentation was not sufficient for us to discern why. The results here echo our own frustration that greater method development is needed, and possibly organism-specific bioinformatic pipelines. It is nice to see a group run through the pitfalls, and while the manuscript makes modest advances in solving the problem, this paper will offer information that is useful to the field.

Reviewer #3: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: Yes: Stephen R Doyle

********** |
| Formally Accepted |
|
PONE-D-22-25734R1
Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans

Dear Dr. Wasmuth:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Zechen Chong
Academic Editor
PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.