Peer Review History
| Original Submission August 17, 2022 |
|---|
|
Dear Dr. Gao, Thank you very much for submitting your manuscript "HiSV: a control-free method for structural variation detection from Hi-C data" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. Especially, some of the reviewers raised serious concerns about this work. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. 
We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.
Sincerely,
Jie Liu, Academic Editor, PLOS Computational Biology
Jian Ma, Section Editor, PLOS Computational Biology
***********************
Reviewer's Responses to Questions
Comments to the Authors: Please note here if the review is uploaded as an attachment.
Reviewer #1: The authors present HiSV (Hi-C for Structural Variation), an algorithm to find structural variants from Hi-C contact maps. The advantages of HiSV include:
- it is control-free and works for more tissues or primary samples,
- it detects CNV, intra/inter-chrom translocations at the same time with a decision tree.
However, the evidence and experiments in this paper are not sufficient to prove HiSV is reliable. My detailed comments are as follows:
1. There is not enough evidence to demonstrate that intra-chromosomal SVs can be separated from normal Hi-C contacts. Fig 1C shows that SV-related interactions are generally stronger than TADs. However, to make sure SVs can be separated from other structures: 1) SVs should be compared with the strongest TADs (Does “top 20 q_values” mean this in the figure caption? The authors should clarify.), 2) contact hot spots are not necessarily TADs; loops/stripes should also be considered.
2. A predefined threshold is used to select SVs from other segments. How is the threshold chosen? Should it be dependent on map resolution, sequencing depth, or assay (Hi-C/capture Hi-C/Micro-C/…)? The authors should provide more information.
3. Another way to evaluate the performance of the “predefined threshold” is to change it and calculate the AUROC. If this is also possible for other baseline methods, AUROC provides a more reliable comparison.
4. In Fig 3 (a and c), the precision and recall are calculated with all SVs. Is it possible to evaluate the performance in different SV types? If HiSV performs well in CNV, it also helps answer my questions in Comment 1.
5. From Fig 3c, I can see many false positive SVs called by HiSV, which might also be related to the limited separation I mentioned in Comment 1. What will we get if applying HiSV to a “normal cell”?
6. The authors should proofread the paper to make sure concepts are well explained in the text and figure captions. For example, 1) the “top 20 q_values” in Comment 1, 2) “DEG” or “DGE”? The full name should be provided when the abbreviation is mentioned for the first time, 3) I think the “dist()” part in formula 3 is wrong, 4) the colors of the two types of translocations in Fig 6b are too similar to distinguish.
7. Is it true that only “above threshold” pixels are chosen? Is it possible that some depleted contacts also correspond to SVs?
8. In Line 88, “the interaction frequencies between two loci decrease logarithmically with genome distance”. I think people usually use “power law” to describe the distance decay of contacts.
9. Can HiSV support other common Hi-C formats like .hic or .mcool?
Reviewer #2: In this manuscript, Li et al. present HiSV, a saliency-based method, which aims to identify structural variants from Hi-C data. Compared to other SV detection tools based on Hi-C, HiSV is control-free and achieved better performance on simulated data and cancer cell lines. In addition, HiSV can complement the SV identification from WGS methods. The manuscript is clearly written and well organized. I have four major concerns.
(1) Calling the exact starting and ending position of structural variants can be challenging. It seems that HiSV could only identify SVs at the bin-pair level (for example 5kb, 10kb). What if one wants to find the exact location of a ~3kb SV?
(2) HiC_breakfinder and EagleC use normal cell lines to construct background models. When normal cell lines are available, can HiSV incorporate this information to provide better identification of SVs?
(3) The authors exclude the comparison with EagleC since it is a supervised model. However, EagleC can be pre-trained and applied to 91 Hi-C datasets and 25 HiChIP/ChIA-PET datasets from 105 cancer cell lines or primary tumors. It should be a reasonable comparison to see how the pre-trained EagleC performs on the data used in this manuscript.
(4) The ground truth is defined as SVs with lengths greater than 1Mb. Can HiSV predict short-range (<1Mb) SVs? How does it compare with other SV detection tools?
I also have one minor comment:
(1) Line 316 dist[(i,j),(p,q)] seems to be a typo. I think you are referring to dist(z_i,j - z_p,q).
Reviewer #3: Li et al. present an interesting computational method named HiSV to identify structural variations (SVs) from the Hi-C contact map. HiSV outperforms existing unsupervised methods with the ability to detect both intra-chromosomal and inter-chromosomal SVs. Overall, I found the manuscript reasonably clear, and the use of the saliency map is the main novelty of this study. Despite the merit of this method, I have significant concerns about the robustness of the method towards known experimental biases and the choice of free parameters. My detailed comments are listed below.
Major comments:
1. In the introduction section, the authors claim that one advantage of HiSV over HiC_breakfinder and EagleC is that HiSV does not need a background model such that HiSV can be applied to tissue samples. First, this statement is not accurate. EagleC only requires a handful of negative samples from the normal cell lines for training. The authors of EagleC actually apply their method to over 100 Hi-C datasets from either cancer cell lines or primary tumors. HiNT-TL requires a background model; however, a generic background model averaged from multiple cell types also works well. On the other hand, the usage of the background model is helpful to distinguish SVs from 3D genome features such as interactions between A/B compartments, especially SVs within a short distance (e.g., less than 1Mb). I wonder what is the lower bound of the range of SVs that HiSV can identify, because long-range intra-chromosomal SVs are similar to inter-chromosomal SVs while short-range intra-chromosomal SVs are much harder to identify. The authors should provide a comprehensive assessment of the capability of HiSV in detecting various size ranges of intra-chromosomal SVs. For example, the authors should add more groups (e.g., < 1Mb) in Figure 3D and add a category of SVs that were missed by HiSV but appeared in the ground truth. Simulation results would also be quite helpful to demonstrate the capability of HiSV.
2. The authors should provide more explanation on how to choose parameter t. For example, at what sequencing depth should we choose 0.5 rather than 0.6? The authors may consider down-sampling a Hi-C dataset, running HiSV under different values of t, and assessing what would be the best choice of t given different sequencing depths. In addition, it is unclear if parameter t is sensitive to the choice of bin size.
3. The authors should take Hi-C biases such as GC content, the number of restriction sites, and mappability into account when calculating the average profiles, which are used to annotate SVs into different types.
Minor comments:
1. The authors should add a description of the output file in the GitHub repo.
2. The authors should explain the definition of q_values in the caption of Figure 1C.
3. Line 73, typo "specie samples"
**********
Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. 
Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5. Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols |
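
Reviewer #1's comment 3 above proposes varying the predefined threshold and summarizing the outcome as an AUROC. The sketch below shows one way such an evaluation could be set up, assuming each candidate region carries a score and a binary ground-truth label. The helper names `roc_points` and `auroc` and the toy scores and labels are illustrative assumptions; they are not part of HiSV, its baselines, or the authors' evaluation.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Sweep the decision threshold over every observed score.

    A candidate is called positive when its score is >= the threshold.
    Returns (false positive rate, true positive rate) pairs running from
    (0, 0) to (1, 1). Hypothetical helper, for illustration only.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    points = [(0.0, 0.0)]
    # Go from the strictest threshold to the most lenient one; tied scores
    # share a threshold, so they are added to the calls together.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auroc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Toy data (assumed): six candidate regions, three of them true SVs.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    toy_labels = [1, 1, 0, 1, 0, 0]
    print(f"{auroc(roc_points(toy_scores, toy_labels)):.3f}")  # prints 0.889
```

Because the curve is traced over all thresholds, the resulting number does not depend on any single cutoff, which is what makes it usable for comparing methods whose scores are on different scales.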
| Revision 1 |
|
Dear Dr. Gao, We are pleased to inform you that your manuscript 'HiSV: a control-free method for structural variation detection from Hi-C data' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to address Reviewer #3's additional suggestion on the text and also complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. Best regards, Jie Liu Academic Editor PLOS Computational Biology Jian Ma Section Editor PLOS Computational Biology *********************************************************** If possible, please address the point from Reviewer 3. Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: In this revised version, the authors add more experiments and made efforts in addressing my concerns. I think the work is improved and I am satisfied with the current version. Reviewer #2: The authors have addressed all my concerns. 
Reviewer #3: The authors have addressed most of my concerns in the revised manuscript. There is only one place where I think the revised text is not accurate. In the revised introduction, the authors stated that “These methods also cannot accurately predict the SVs of other species because of the variation in 3D genome organization features between species [14].” This statement is not true. The authors of EagleC tested their method in mouse samples. Additionally, the authors of HiSV did not show any results on applying HiSV in other species. I would suggest that the authors remove this sentence or revise it.
Regarding the comparison with EagleC, as also pointed out by reviewer 1, it is reasonable to assess the performance of EagleC in datasets tested by HiSV. The authors indeed compare these two methods in HCC1954 (the only sample not used as a training dataset in EagleC). The results in the response show that EagleC is better than HiSV in detecting intra-chromosomal SVs, especially short ones (< 1Mb), while HiSV has better performance than EagleC in detecting inter-chromosomal SVs. I wonder if the trend is consistent when the authors apply HiSV in other samples tested by EagleC (e.g., MCF-7). This may be beyond the scope of this paper, but it could significantly strengthen the statement of HiSV in accurately detecting large-scale/inter-chromosomal SVs if similar results can be observed in other cancer cell lines or patient samples.
**********
Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository.
For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No |
| Formally Accepted |
|
PCOMPBIOL-D-22-01245R1 HiSV: a control-free method for structural variation detection from Hi-C data Dear Dr Gao, I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work! With kind regards, Bernadett Koltai PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.