Peer Review History
| Original Submission November 13, 2024 |
|---|
|
PONE-D-24-51517
Understanding the epidemiology and pathogenesis of Mycobacterium tuberculosis with non-redundant pangenome
PLOS ONE
Dear Dr. Zhou,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. As can be seen in the reviews, the reviewers felt that this manuscript is a well-done study overall and contains valuable insights about the genomic structure and evolution of Mtb. The reviewers noted some weaknesses that need to be addressed, including clarifying some of the terminology and methodology used, reconciling some of the numbers reported, and citing some additional relevant literature. Please submit your revised manuscript by Feb 27 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Thomas R. Ioerger Academic Editor PLOS ONE Journal requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 2. 
Thank you for stating the following financial disclosure: “Authors received the funding Ou Xichao & Zhao Yanlin Grant numbers Zhao Yanlin : 2022YRC2305203 Ou Xichao : 2022YRC2305204 Project Title & Grand Numbers: Establishment and application of China's AIDS and tuberculosis pathogen gene database and intelligent precision prevention and control platform (2022YRC2305200) Sub-project Title & Grand Numbers : Creation of a national representative tuberculosis pathogen genetic sequence database and its analysis tools (2022YRC2305203) Sub-project Title & Grand Numbers : Research and application of tuberculosis transmission network and its molecular analysis tools in China (2022YRC2305204) Full name of funder 2022YRC2305203 & 2022YRC2305204 : Ministry of Science and Technology of the People's Republic of China Url of funder Ministry of Science and Technology of the People's Republic of China : https://www.most.gov.cn/index.html.” Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed. Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf. 3. Please note that your Data Availability Statement is currently missing a direct link to access each database. If your manuscript is accepted for publication, you will be asked to provide these details on a very short timeline. We therefore suggest that you provide this information now, though we will not hold up the peer review process if you are unable. 4. We note that you have included the phrase “data not shown” in your manuscript. Unfortunately, this does not meet our data sharing requirements. PLOS does not permit references to inaccessible data. 
We require that authors provide all relevant data within the paper, Supporting Information files, or in an acceptable, public repository. Please add a citation to support this phrase or upload the data that corresponds with these findings to a stable repository (such as Figshare or Dryad) and provide any URLs, DOIs, or accession numbers that may be used to access these data. Or, if the data are not a core part of the research being presented in your study, we ask that you remove the phrase that refers to these data. Additional Editor Comments: As can be seen in the reviews, the reviewers felt that this manuscript is a well-done study overall and contains valuable insights about the genomic structure and evolution of Mtb. The reviewers noted some weaknesses that need to be addressed, including clarifying some of the terminology and methodology used, reconciling some of the numbers reported, and citing some additional relevant literature. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: No Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: I Don't Know Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). 
The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g., participant privacy or use of data from a third party), those must be specified. Reviewer #1: Yes Reviewer #2: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: No Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The study by Zhou et al. seeks to gain new insight into the epidemiology and pathogenesis of Mtb strains by analyzing the pangenome, i.e. the entire gene content of this species, which is usually neglected in this kind of study (due to the clonal population structure). They address an important point that has been of major interest in the last few years. Overall, the authors did a thorough analysis combining various tools to find a good consensus for annotation problems that can emerge in such a workflow. I find the findings about possibly disrupted ORFs and hyperconserved genes very interesting. However, several parts are currently difficult to follow because the authors report various numbers that address different subsets and analysis steps, and the rationale and interpretation of those numbers are poorly presented. 
Below are some comments that should be clarified/improved:
Major comments:
l.224 The pangenome is based on the population structure of Mtb strains in China. For instance, lineage 3 (gene content) is completely missing. This limitation should be addressed in the title and the abstract.
l.40 and l.427 The statement that only about 50% of the core genome is intact and translated to functional proteins is difficult to follow from the results section. Usually, one allows some alleles to be missing or failing the thresholds in core genome schemes, so the number above seems to be just summing up disrupted ORFs in individual samples. If that is really an unusual finding, this needs to be presented better.
l.227 This paragraph is difficult to understand. Where is the number of 1 million predicted CDSs coming from when on average 4 thousand genes are predicted per genome? And what does it mean that they were identical to the blast results? The whole selection procedure of genes included for the pangenome may benefit from a flowchart.
l.285 I re-read this passage a couple of times, and could not grasp what the authors want to say here.
l.433 What exactly is the extent of structural variation as compared to SNPs in Mtb? You highlight RvD2; is that the known RD2 deletion in lineage 2? Any insights/observations on new (maybe phylogenetic) larger deletions?
l.457 Also, about 60% of your core genome is under (positive?) selective pressure? Mtb is so far thought to be shaped by purifying selection.
Minor comments:
l.63 The term “ancient” vaccine is maybe a bit degrading.
l.65 Pretomanid might be mentioned as well then with regard to new drugs.
l.70 Introduce the MTB abbreviation at first mention.
l.83 Bottai et al. https://www.nature.com/articles/s41467-020-14508-5 demonstrated the increased virulence of strains with a deleted TbD1 region.
l.106 I would not say that nonsense or frameshift mutations are ignored in presence/absence gene content analysis. Usually that involves a de novo annotation, and current algorithms can still link genes that are truncated or have frameshift mutations with their homologue/functional counterpart.
l.123 Explain the selection criteria; what does it mean, based on drug resistance patterns?
l.144 Explain a bit why CDSs needed to be clustered, and what is meant by overlapping with Rv. That also affects the results part. How do you come to 10 thousand coding gene clusters?
l.198 Briefly explain the Tajima's D and pi measurements, and homoplasy.
l.214 What are “different” evolutionary patterns?
l.263 What is a valid ORF, and what is not valid?
l.266 What exactly do you mean by “how genetic variations affected the presence of absence at different levels in the pangenome”? Do you mean by which genomic change certain genes were likely disrupted?
l.270 I assume the 76k sequence variants are detected in the whole dataset from a reference mapping approach. That does not include SNPs here, right? Because in the sentence above you are mentioning SNPs as well? And then you mention again “high-impact” point mutations. I would suggest being very strict and consistent with the naming of mutations and describing which numbers come from which analysis.
l.357 rpsL high nucleotide diversity as compared to what?
Fig. 3: What is allele frequency here? And segregating site count? Allele = gene position? Or allele = gene sequence? Segregating site = SNPs?
Fig. 4 and 5 are not readable in the online version (low resolution).
Is Fig. 4 a phylogeny based on 127 shell genes? Which method? It seems to be a cladogram. In the caption it is stated that it is based on 33k SNPs. In that regard one would rather use an ML or at least a neighbor-joining tree to have the genetic distances represented. Also, the term shell gene is used for the first time here. Is that equivalent to the core genome here?
Reviewer #2: Comparative genomics is arguably the best tool for querying Mtb’s evolutionary trajectory and the functional consequences thereof. 
In this manuscript, the authors presented a comprehensive computational pipeline to infer Mtb’s pan-genome architecture using 420 public short-read sequencing data of Mtb strains isolated through a single national survey conducted in China. The pipeline appears reasonable based on the methods described, though the code used for the analyses was not provided. The analysis revealed a closed pangenome comprising 4,278 genes. The authors reported an interesting finding that among the 4,098 annotated protein-coding genes, only 1,651 encode full-length proteins across all genomes, while the remaining genes were disrupted in at least a subset of strains. Furthermore, the authors reported known and potentially new phylogenetically structured structural variations (SVs) and large structural variations (LSVs) and estimated gene-wise selective pressures using a single metric. Overall, the manuscript is well-structured and could be a valuable asset to the TB research community. I’d be happy to read this manuscript again in its published format if the authors could address the following minor concerns:
1. Which H37Rv genome assembly was used as the reference? Several H37Rv assemblies now exist, including curated assemblies generated using long-read sequencing (e.g., DOI: 10.7554/eLife.97870.1). Since the analyses were based on a single annotated H37Rv genome, the authors should confirm that they used a complete, high-quality genome assembly with no blind spots. They should also provide the specific assembly ID of the H37Rv reference genome used in this study.
2. I commend the authors for the incredibly large amount of work they did to improve the overall power and robustness in calling SVs and other genetic variations by integrating a handful of existing bioinformatic tools. That being said, this is not the first Mtb pan-genome paper and there were several recent studies that used long-read sequencing to generate complete genome assemblies and conduct SV analysis, which were not cited in the present manuscript. Please cite these relevant studies, such as the recent eLife paper by Behruznia et al. (DOI: 10.7554/eLife.97870.1) and the preprint by Marin et al. (DOI: 10.1101/2024.03.21.586149). These studies demonstrated that incomplete contig assemblies from short-read sequencing can lead to biases in accessory gene calling. Comparing findings with these datasets could help avoid spurious variant calls, particularly those discussed in lines 295-297.
3. The number of coding genes with interruptions varies between different parts of the manuscript. For example, line 40 states 2,447 interrupted genes, while line 298 reports 2,194. It could be me who mis-interpreted the words but I do think that these numbers would benefit from explicitly stating the criteria used for counting interrupted genes.
4. Terms such as "unclassified interruptions" (lines 285-295) and "complex" (Table 1) are unclear. Additionally, frameshift mutations are referenced in lines 180-181 but seem omitted from Table 1. Please provide precise definitions for terms like "unclassified interruptions" and "complex" mutations, and, if applicable, expand the legend for Table 1 to explain each mutation category.
5. The rationale for using Tajima’s D to infer selective pressure is unclear, especially when extensive SVs are expected to skew the metric toward extreme values. Please justify why Tajima’s D was chosen over alternative methods that may be more robust to structural variation, such as the McDonald-Kreitman test or dN/dS ratios.
6. Lines 315-316 report 40 genes absent in lineage 2 strains, many of which are phage-related or CRISPR-associated. A recent study (DOI: 10.1128/spectrum.00527-24) identified an association between drug resistance and the absence of CRISPR genes. The authors may consider comparing the results to this published study and testing for convergence in findings.
7. The homoplastic SVs near the plcC locus are intriguing and have also been identified in the Marin et al. preprint. Please cite this relevant preprint and perhaps also discuss potential biological implications of homoplastic SVs in this genomic region.
8. Fig. 5 lacks a legend explaining the color notations, and Figure 4 was placed before Figure 1 in the manuscript. The authors should also consider including a supplementary data sheet corresponding to Figure 4, which would greatly benefit readers who want to perform an in-depth re-analysis of this information-rich dataset.
********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: Yes: Junhao Zhu ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. 
If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
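Both reviewers in this round ask about the diversity statistics (Reviewer #1 at l.198, Reviewer #2 in point 5). As a minimal sketch of what nucleotide diversity (pi) and Tajima's D measure, assuming a gap-free alignment of equal-length sequences for one gene, the standard Tajima (1989) formulas can be written as below. The function names and the toy alignment are hypothetical illustrations and are not taken from the manuscript or its pipeline.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity (pi): pairwise differences / length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D; None when undefined (fewer than 4 sequences or S == 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(
        e1 * s + e2 * s * (s - 1)
    )


if __name__ == "__main__":
    # Hypothetical toy alignment: three singleton variants among four sequences.
    toy = ["AAAA", "AAAT", "AATA", "ATAA"]
    print(segregating_sites(toy), nucleotide_diversity(toy), tajimas_d(toy))
```

In the toy alignment every variant is a singleton, so the pairwise estimate falls below the estimate based on segregating sites and D is negative. A negative value of this kind is commonly read as an excess of rare variants, which can arise from purifying selection or from population expansion; the statistic alone does not separate the two.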
|
| Revision 1 |
|
PONE-D-24-51517R1
Understanding the epidemiology and pathogenesis of Mycobacterium tuberculosis with non-redundant pangenome of epidemic strains in China
PLOS ONE
Dear Dr. Zhou,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. You have appropriately addressed all of the reviewers' original concerns. However, there is one additional minor suggestion from one of the reviewers, for which I would like to give you the opportunity to consider making a final revision. Please submit your revised manuscript by May 27 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Thomas R. Ioerger Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: (No Response) Reviewer #2: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Thanks for the clarification and improvements. I found this statement in the abstract still a bit misleading: "However, due to 99,694 interruptions in 2,447 coding genes, only 1,651 may be translated in all samples, which dramatically reduces the number of active core genes." (1) How many of those interruptions actually occur only in single isolates? Your core genome definition of 100% for >400 isolates is quite strict already. If you now count every isolate with a gene interruption, there are probably many singletons and possibly also artefacts included. When you use your softcore genome definition instead (or vice versa genes interrupted in >5% of all isolates), it would be more relevant and may hint to genes or sub-lineages which really make a phenotypic difference. Also we have no indication if those interruptions really abrogate the gene function (for this statement one should maybe consider only large InDels) (2) In that regard we also saw that different tools miss or overinterpret structural variants. Visual inspection of a reference mapping helped to gain confidence for certain applications, and also BWA/GATK pipelines would identify at least small indels and confirm the results from the assembly used here. Reviewer #2: The authors have addressed all of my concerns and kindly clarified the points I had previously misunderstood. I commend them for their excellent work and look forward to reading and sharing the manuscript in its published form. ********** 7. 
PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
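Reviewer #1's remaining suggestion in this round is to separate interruptions seen in a single isolate from those found in more than 5% of isolates. A minimal sketch of that tally follows, assuming the interruption calls are available as a mapping from gene to the set of isolates in which the gene is interrupted. The function name, the input layout, the gene labels, and the isolate IDs are all hypothetical; the 420-isolate total follows the dataset size quoted by Reviewer #2.

```python
def classify_interruptions(interrupted_in, n_isolates, min_fraction=0.05):
    """Split genes into singletons and recurrently interrupted genes.

    interrupted_in: dict mapping a gene label to the set of isolate IDs
        in which that gene carries an interruption (assumed input layout).
    n_isolates: total number of isolates in the dataset.
    min_fraction: a gene is 'recurrent' if interrupted in more than this
        fraction of isolates (5% here, following the reviewer's suggestion).
    """
    singletons = []
    recurrent = []
    for gene, isolates in interrupted_in.items():
        count = len(isolates)
        if count == 1:
            singletons.append(gene)
        if count / n_isolates > min_fraction:
            recurrent.append(gene)
    return sorted(singletons), sorted(recurrent)


if __name__ == "__main__":
    # Hypothetical calls: placeholder gene labels and isolate IDs.
    calls = {
        "geneA": {"iso0"},                           # one isolate only
        "geneB": {f"iso{i}" for i in range(30)},     # 30 of 420 isolates
        "geneC": {f"iso{i}" for i in range(5)},      # 5 of 420 isolates
    }
    singletons, recurrent = classify_interruptions(calls, n_isolates=420)
    print("singletons:", singletons)
    print("interrupted in >5% of isolates:", recurrent)
```

Reporting both lists side by side would show how much of the headline interruption count is driven by events seen only once, which are the ones most likely to include artefacts.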
| Revision 2 |
|
Understanding the epidemiology and pathogenesis of Mycobacterium tuberculosis with non-redundant pangenome of epidemic strains in China PONE-D-24-51517R2 Dear Dr. Zhou, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Thomas R. Ioerger Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: |
| Formally Accepted |
|
PONE-D-24-51517R2
PLOS ONE
Dear Dr. Zhou,
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team. At this stage, our production department will prepare your paper for publication. This includes ensuring the following:
* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset
You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps. Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. If we can help with anything else, please email us at customercare@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Thomas R. Ioerger Academic Editor PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.