Peer Review History

Original Submission: May 10, 2024
Decision Letter - Louxin Zhang, Editor

PCSY-D-24-00073

Identifying stable communities in Hi-C data using a multifractal null model

PLOS Complex Systems

Dear Dr. Hedström,

Thank you for submitting your manuscript to PLOS Complex Systems. After careful consideration, we feel that it has merit but does not fully meet PLOS Complex Systems's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by Oct 01 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at complexsystems@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcsy/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Y-h. Taguchi, Dr. Sci.

Academic Editor

PLOS Complex Systems

Journal Requirements:

Additional Editor Comments (if provided):

The problem of stability and robustness pointed out by Reviewer 3 must be addressed.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Complex Systems’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

--------------------

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I don't know

Reviewer #3: Yes

--------------------

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Yes

--------------------

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Complex Systems does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

--------------------

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this work, the authors presented a computational pipeline to identify stable 3D communities in Hi-C data. The methodology first bootstrapped an ensemble of Hi-C datasets to simulate experimental noise. Then, they employed the Generalized Louvain method, a community detection algorithm, to identify communities within these datasets. The authors focused on communities that remained consistent across the bootstrapped datasets, indicating their stability under noise. They found that these stable communities tend to have higher internal contact frequencies, are enriched in active chromatin marks, and exhibit more nested cross-scale hierarchies than less stable communities. I have a few questions that would help to clarify the technique and results:

It would be interesting to see how the model performs on different genome architectures, such as those described in “Hoencamp, Claire, et al. 3D Genomics across the Tree of Life Reveals Condensin II as a Determinant of Architecture Type. Science, vol. 372, no. 6545, May 2021, pp. 984–89.”

In addition, it would also be interesting to see how the methodology performs in Hi-C maps obtained in mitotic chromosomes or across the cell cycle, such as presented in “Gibcus, Johan H., et al. A Pathway for Mitotic Chromosome Formation. Science, vol. 359, no. 6376, Feb. 2018”

I also suggest expanding the comparison of the chromatin states of HMM with subcompartments annotation with other cell lines as available at the ENCODE portal (https://www.encodeproject.org/search/?type=Annotation&searchTerm=physical%20modeling) with the methodology described in Dodero-Rojas, Esteban, et al. “PyMEGABASE: Predicting Cell-Type-Specific Structural Annotations of Chromosomes Using the Epigenome.” Journal of Molecular Biology, vol. 435, no. 15, Aug. 2023, p. 168180.

How would the community detection be applied to intermaps? Is it possible to identify NADs and LADs?

What are the computational resource requirements for running the pipeline, and how scalable is it for larger datasets? I would recommend adding a tutorial on GitHub; a Jupyter Notebook or Google Colab, or a deposited model, would help potential users.

Reviewer #2: This article studies the stability of communities detected in HiC data. The main contributions are a method to estimate the stability of a community and observations about the stability of communities after applying the method to a data set.

The main steps of the suggested method are:

1. Noisy HiC samples are generated based on an observed HiC data set, where the noise depends on the variance of the number of contacts at a given distance.

2. Communities are detected in the original HiC and noisy HiC data using the Generalised Louvain method with the null model being the hierarchical domain model.

3. The stability of communities in the original HiC data is estimated using the Jaccard index between the original data and generated data. The stability of a community in the original data set is the proportion of noisy HiC samples that have precisely the same community.
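As a rough illustration of step 3 only, the following Python sketch computes the stability of each community in a reference partition as the fraction of noisy partitions that contain a community whose Jaccard index with it reaches a cutoff. The function names, the toy partitions, and the `cutoff` parameter are assumptions made for this example and are not taken from the manuscript's code; with `cutoff=1.0` only exact matches count, which corresponds to the definition summarised above.

```python
from typing import FrozenSet, List

Community = FrozenSet[int]    # a community is a set of node (bin) indices
Partition = List[Community]   # a partition is a list of communities


def jaccard(a: Community, b: Community) -> float:
    """Jaccard index |a & b| / |a | b| of two node sets."""
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(union)


def community_stability(
    reference: Partition,
    noisy_partitions: List[Partition],
    cutoff: float = 1.0,
) -> List[float]:
    """For each reference community, return the fraction of noisy
    partitions containing a community with Jaccard index >= cutoff."""
    if not noisy_partitions:
        raise ValueError("need at least one noisy partition")
    stabilities = []
    for community in reference:
        hits = sum(
            1
            for partition in noisy_partitions
            if any(jaccard(community, other) >= cutoff for other in partition)
        )
        stabilities.append(hits / len(noisy_partitions))
    return stabilities


# Hypothetical toy data: one reference partition and four noisy ones.
reference = [frozenset({0, 1, 2}), frozenset({3, 4, 5, 6})]
noisy = [
    [frozenset({0, 1, 2}), frozenset({3, 4, 5, 6})],
    [frozenset({0, 1, 2}), frozenset({3, 4, 5}), frozenset({6})],
    [frozenset({0, 1}), frozenset({2, 3, 4, 5, 6})],
    [frozenset({0, 1, 2}), frozenset({3, 4, 5, 6})],
]

strict = community_stability(reference, noisy, cutoff=1.0)    # [0.75, 0.5]
relaxed = community_stability(reference, noisy, cutoff=0.75)  # [0.75, 1.0]
```

In this toy case the second community looks only half as stable under the exact-match cutoff as under a relaxed one, which is the kind of sensitivity a cutoff sweep would expose.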

The main observations about stable communities are that they have high internal connectedness, they contain a high proportion of active chromatin marks and nodes in stable communities are preserved at different scales.

The two motivations for studying community stability are that different clustering methods can give different results and that community detection is influenced by experimental noise. The focus of this paper is on the noisy case. I find that the topic of the article is well-motivated and important to study since the goal is to find biologically meaningful communities independent of noise. The proposed method looks reasonable to me, although I am not able to judge all the details. In particular, the definition of the hierarchical domain model is complex, although the figures illustrating it look convincing.

Major comments/remarks:

1. The cutoff in the stability estimation was chosen to be 1.0, which means that two communities have to be exactly equal. Is the cutoff 1.0 realistic from a biological perspective? For example, it was pointed out that the overlap between TAD boundaries in replicate experiments was only 62%. How do the results change when one changes the cutoff, for example to 0.95?

2. What are the specific recommendations for someone who is looking for communities based on HiC data? Which communities are meaningful and which ones should be discarded? Is using cutoff 1.0 reasonable in this context?

3. It would be interesting to see an analysis of the stability of some communities that have been previously found in the literature.

In summary, I think the paper is interesting and well written. The figures are illustrative and have descriptive captions. The article has some small mistakes and typos that should be corrected in a revision. There was a link to access data, but it did not work for me. I recommend accepting the paper for publication after a revision that addresses the remarks and comments in this report and makes data publicly available.

Typos and minor remarks:

Page 2: It is written that there are 13 different chromatin types. Later in the text and in the appendix, the number is 15.

Page 3, line 3: A period is missing at the end of the sentence.

Page 3, second column: “P^{(0)}_{ij} is the null model” - please specify for what it is the null model (this is some value substituted to the formula)

Page 3, second column: “Within” should start with a small letter.

Page 4: Can you clarify what is meant by TAD boundaries here. Is it the same as TADs and if not, please define boundaries.

Page 4, second column: “the last term … approaches 0” - there is “1-” missing; what is written now approaches one

Page 4, second column: I am confused why higher epsilon indicates stronger resilience to noise. Shouldn’t it be true for smaller epsilons?

Page 5, first column: Should P_0 be P_o?

Page 5, second column: It is not clear to me what the last paragraph is referring to (as the subsection ends after that).

Page 6, section D: The effective community size used here is 0.38Mb. Earlier in the paper it is 0.88Mb. Please justify why you change it here. 0.38Mb is also used in the next subsection, where several sizes are used, but the question nevertheless arises why the 0.88Mb used so far is no longer used.

Page 7, column 1: “go towards between nodes with similar stabilities of just above or below” - I did not understand this sentence.

Page 7, column 1: “We note that FE levels for active chromatin (three leftmost bars) remain conserved across both scales …” - This is not completely convincing from the figures (the inactive ones look more conserved).

Page 7, column 1: “… but this like stem from …” -> “likely stems”?

Page 7, column 2: “distant-dependent” -> “distance-dependent”?

Table 1 caption: “The value of gamma that reproduce” -> check singular/plural

Table 1: Why are there two rows for TADs?

Page 10, second column, last line of the first paragraph: remove period after the word “Table”.

Reviewer #3: The authors analyze Hi-C data by treating them as DNA-contact networks and extracting 3D communities resilient to experimental noise. They do this by bootstrapping an ensemble of noisy Hi-C datasets and comparing them with unperturbed data. The pipeline reveals that stable communities under noise exhibit higher internal contact frequencies, are enriched in active chromatin marks, and form more nested hierarchies. This proposed pipeline, which utilizes an ensemble of bootstrapped noisy data to detect stable community structures, could attract the attention of general readers. However, the methodology itself requires further validation.

1. How robust is the method? How robust are the detected stable communities for different metrics other than the Jaccard index? Also, what about different threshold cutoffs?

2. The Generalized Louvain algorithm also generates an ensemble of communities. What distinguishes the stable communities predicted from the ensemble of detected communities using the original data from those detected by comparing the bootstrapped ensemble with the original data?

3. Equations (4)-(6) are introduced, but their meanings are not well explained. Readers would appreciate a clear explanation of these equations. Additionally, the contribution of the current work is unclear. What new elements does this study present compared to [24]?

4. The node stability s_i is not defined in the main text.

5. Does S_C indicate the average community stability?

6. In Appendix A, MD(d) is not explained.

7. In Fig. 8, it would be helpful to show results from the data as well.

9. Minor typos:

- Below Fig. 2, a period is missing after the sentence "We note that the distributions at all these distances are log-normal, albeit with their unique mean and standard deviation."

- Fig. 4 caption: "maximum of around 0.5" should be "0.05."

- "Hi-maps" should be "Hi-C maps."

- Below Eq. (11), "int" has only closing double quotation marks.

--------------------

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Vinicius Contessoto

Reviewer #2: No

Reviewer #3: No

--------------------

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1
Decision Letter - Louxin Zhang, Editor

PCSY-D-24-00073R1

Identifying stable communities in Hi-C data using a multifractal null model

PLOS Complex Systems

Dear Dr. Hedström,

Thank you for submitting your manuscript to PLOS Complex Systems. After careful consideration, we feel that it has merit but does not fully meet PLOS Complex Systems's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by May 02 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at complexsystems@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcsy/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to any formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Y-h. Taguchi, Dr. Sci.

Academic Editor

PLOS Complex Systems

Hocine Cherifi

Editor-in-Chief

PLOS Complex Systems

Additional Editor Comments (if provided):

Since Reviewer 3 is very negative about your manuscript, please address Reviewer 3's concerns if possible. If that is not possible, please at least address all the concerns raised by Reviewer 2.

Reviewers' Comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Does this manuscript meet PLOS Complex Systems's publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I don't know

Reviewer #2: I don't know

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Complex Systems does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: (No Response)

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors asked: "what is the rationale for selecting this one among all the others explored in 3?"

Of the approaches mentioned in 3 (ref in the response), the ones performed for humans (refs 32 and 33) also used HMM.

It would be interesting to benchmark the communities on different strategies. The rationale for selecting the mentioned ML model predictions is that they are publicly available on the ENCODE portal, and that would require less effort from the authors to compare the manuscript results with different approaches besides HMM. The predictions are trained on epigenetic data only based on the Rao 2014 subcompartment annotations (ref 21). Then, the authors defined/used 5 states, which could be compared (not only with the ML model for other cell lines as suggested) but with the subcompartment annotation from Rao 2014 (A1, A2, B1, B2, B3, and B4).

Given the dissatisfaction with the suggestion, which does not use HMM and is easily accessible, many other methods based on Hi-C and the epigenome are worth investigating, such as SNIPER, Calder, TECSAS, SLICE, and more.

In the discussion:

What do the authors mean by "...biologically meaningful 3D structures from Hi-C"? Are they actually talking about XYZ coordinates, or what does "3D" mean here?

What do the authors mean by "noise-driving structural conformations"?

Reviewer #2: The authors have addressed all my questions, but I have to admit that I don't really understand all parts of the answer about the cutoff value and the corresponding section in the appendix, although I have tried to read it several times. I fully agree that a cutoff that is too low would not be expected to produce interesting results about the stability and the observed relationship between stability in Figure 9. First, I would like to emphasise that I didn't suggest a cutoff of 0.5, which I agree is too low, but something close to 1 but not 1, which I would find more realistic. Therefore the left column on page 12 does not really answer my question. I feel that it explains some things that are obvious but does not really address my question/confusion. Moreover, if the cutoff value is 1, then how changing one community member changes the Jaccard index does not matter, since to my understanding no changes are allowed. Second, I think I don't understand the connection between the cutoff value for the Jaccard index and the 62% overlap of TAD boundaries in experiments.

The text on page 12 that is most confusing to me starts with the sentence "In this case, we look at how borders between the communities overlap ..." and continues until the end of that paragraph and the first sentence of the next paragraph. Additionally, it is not clear what the word "this" refers to in its two occurrences in this part of the text (but I think that's not the main problem).

Additionally there are some typos:

* p4: expressour

* The sentence after formula (6) starts with a small letter.

* Use of wrong " " in the new text.

* p8: period at the beginning of a line

* p9 first sentence has no verb

* p11: The x-axis show

* p12: more nodes needs

Reviewer #3: All my comments have been addressed, and I am satisfied with the authors' response letter. Therefore, I recommend publishing the manuscript in PLOS Complex Systems.

*note: It seems that the previous version of the manuscript was uploaded. I have reviewed the changes using the diff file. Please verify the manuscript version.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachments
Attachment
Submitted filename: referee-response.pdf
Revision 2
Decision Letter - Louxin Zhang, Editor

Identifying stable communities in Hi-C data using a multifractal null model

PCSY-D-24-00073R2

Dear Dr Hedström,

We are pleased to inform you that your manuscript 'Identifying stable communities in Hi-C data using a multifractal null model' has been provisionally accepted for publication in PLOS Complex Systems.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email from a member of our team. 

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact complexsystems@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Complex Systems.

Best regards,

Y-h. Taguchi, Dr. Sci.

Academic Editor

PLOS Complex Systems

Hocine Cherifi

Editor-in-Chief

PLOS Complex Systems

***********************************************************

Reviewer Comments (if any, and for reference):

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Does this manuscript meet PLOS Complex Systems's publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: No

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Complex Systems does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

Attachments
Attachment
Submitted filename: response_rev_2.pdf

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.