Peer Review History

Original Submission December 29, 2020
Decision Letter - Eduardo Andrés-León, Editor

PONE-D-20-40860

miRGTF-net: integrative miRNA-gene-TF network analysis reveals key drivers of breast cancer recurrence

PLOS ONE

Dear Dr. Nersisyan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please check and answer the reviewers' questions and suggestions and provide a revised version of the article and a response to each of the reviewers' points.

Please submit your revised manuscript by Apr 02 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Eduardo Andrés-León

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Comments to the author:

The authors report miRGTF-net, a tool for constructing regulatory networks among miRNAs, genes, and TFs. This tool combines a database-level approach as well as expression profiles given from miRNAs and genes to reduce potential false positives for edge construction. They have applied miRGTF-net on the TCGA-BRCA ER-positive dataset and proposed a classification pipeline to predict recurrence from several independent patient cohorts. The presentation is in a light quality level and in the scope of PLOS ONE. The rationale of the work is interesting, although it’s not completely novel. The implementation can be considered of some significance for the research community. Several concerns should be addressed as stated below that I believe would increase the quality and clarity of the manuscript.

Major comments:

1. P2 L28-33: The two major approaches to network construction are described along with their respective disadvantages, but no advantage is mentioned here. I suggest using “limitations” in the sentence rather than advantages and disadvantages.

2. P2 L29: “disadvantagesr” should be “disadvantages”.

3. P2 L29-30: Why does the database-level analysis usually lack tissue specificity? Please describe it in detail.

4. P3 L82-83: “some other node” should be “some other nodes”.

5. P4 L118: How many genes were used as the input for the enrichment analysis? What threshold was used to identify the significantly enriched terms? Please add these to the article. Also, please check that the number of significantly enriched terms (218) is correct.

6. P5 L156: Add a comma between “potential” and “we”.

7. P8 L276: “intersect” should be “intersects”.

8. P9 L341-342: Are miRNA IDs unified to the latest version of miRBase when constructing the miRGTF-net databases? Among the databases, miRTarBase 7.0 and TransmiR v2.0 utilize miRBase v21 and v22, respectively. According to the description in the miRNA.diff file from miRBase, the IDs of 12 human mature miRNAs changed between the two versions. Although the impact is slight, this might still cause relationships in a network to go unmatched. Additionally, the miRNA IDs in the miRGTF-net input data can follow the annotation of any of the various miRBase versions. If these miRNA IDs come from older versions, the number of miRNAs that cannot be mapped to the databases correctly will increase. This case is more common in early miRNA-seq data. The authors should address this issue and propose a feasible approach in miRGTF-net.

9. Figs. 1 and 7: Please use standard flowchart symbols instead of colors to represent elements in the workflows.

Comments for the miRGTF-net improvement:

1. Besides the human databases, the authors can consider establishing the databases of other model organisms for miRGTF-net.

2. It would be good to show the real-time progress/status as well as brief summary while running the miRGTF-net script.

Reviewer #2: In this manuscript, Nersisyan et al. introduced a novel method to analyze miRNA-gene-TF networks on breast cancer datasets. This method highlights the use of integrative information (microarray data, TCGA database, miRNA information) to build a classifier for five-year breast cancer recurrence prediction. As a result, they have identified a gene signature that can be used to predict the outcome of cancer patients. Finally, the authors discussed the mechanistic roles of ESR1 and E2F1 underlying breast cancer recurrence.

However, I think the following points need to be clarified in the manuscript:

1. Since the results are based on the input databases, the choice of databases may bias the output. I would like to see a comparison with other miRNA target prediction tools, such as TargetScan, to see how robust the results are.

2. I can see the potential of applying this software to other species. I suggest that the authors include a note on the GitHub page showing how to build the input data for other model organisms, such as mouse and Drosophila. If the preparation of those data needs pre-processing, they should also include scripts to perform it.

3. I suggest that the authors compare one or two more machine learning methods in the classification step. Although SVM is considered a good classifier (AUC value), other algorithms may outperform it because they use different strategies, such as tree models (decision tree, random forest) and probability-based methods, for example, the Naive Bayes classifier.

4. I suggest that the authors provide more analysis/visualization of the gene signatures. For example, how stable are the signatures across TCGA samples? Do they show individual variation? A heatmap with clustering analysis would be a useful way to address these questions.

Also, I have tested the scripts on their GitHub webpage and the software is user friendly. I would suggest that the authors change the default output format to PDF, which is easier to open and process than the current GraphML format. Overall, this work provides a new angle for integrative TF and miRNA analysis. I suggest a moderate revision of the manuscript before publication.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Yu H. Sun

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Reviewer 1

Point 1. P2 L28-33: The two major approaches to network construction are described along with their respective disadvantages, but no advantage is mentioned here. I suggest using “limitations” in the sentence rather than advantages and disadvantages.

Response 1. We thank the Reviewer for pointing this out. We revised the text accordingly.

Point 2. P2 L29: “disadvantagesr” should be “disadvantages”.

Response 2. This word was removed from the text (see Response 1).

Point 3. P2 L29-30: Why does the database-level analysis usually lack tissue specificity? Please describe it in detail.

Response 3. The detailed explanation was added to the paragraph.

Point 4. P3 L82-83: “some other node” should be “some other nodes”.

Response 4. The typo was corrected.

Point 5. P4 L118: How many genes were used as the input for the enrichment analysis? What threshold was used to identify the significantly enriched terms? Please add these to the article. Also, please check that the number of significantly enriched terms (218) is correct.

Response 5. The required information was added to the manuscript text (lines 122, 479-480). We also verified that there are 218 significantly enriched terms (the last column of S2 Table was used with a 0.05 threshold).

Point 6. P5 L156: Add a comma between “potential” and “we”.

Response 6. The comma was added.

Point 7. P8 L276: “intersect” should be “intersects”.

Response 7. The typo was corrected.

Point 8. P9 L341-342: Are miRNA IDs unified to the latest version of miRBase when constructing the miRGTF-net databases? Among the databases, miRTarBase 7.0 and TransmiR v2.0 utilize miRBase v21 and v22, respectively. According to the description in the miRNA.diff file from miRBase, the IDs of 12 human mature miRNAs changed between the two versions. Although the impact is slight, this might still cause relationships in a network to go unmatched. Additionally, the miRNA IDs in the miRGTF-net input data can follow the annotation of any of the various miRBase versions. If these miRNA IDs come from older versions, the number of miRNAs that cannot be mapped to the databases correctly will increase. This case is more common in early miRNA-seq data. The authors should address this issue and propose a feasible approach in miRGTF-net.

Response 8. We corrected errors related to the miRBase v21 to v22 migration, which resulted in only one new interaction between a miRNA and its target gene, and one additional two-node strongly connected component. This interaction, however, did not modify the major weakly connected component; therefore, the constructed classifiers remained unchanged. We updated the corresponding part of the manuscript (lines 104, 132-133). Additionally, we uploaded miRTarBase version 8.0 to the miRGTF-net repository as the default database for miRNA-gene interactions and provided a link to a miRBase name conversion tool (miRBaseConverter).
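
As a minimal sketch of the ID harmonization issue raised in Point 8, the Python snippet below renames miRNA identifiers in an edge list with a lookup table and reports identifiers that cannot be matched to a reference set. The names `harmonize_ids`, `unmapped_mirnas`, `OLD_TO_NEW`, and all example identifiers are hypothetical placeholders; they are not entries from the miRNA.diff file and this is not the code used in miRGTF-net. In practice, the mapping would be obtained from a conversion tool such as miRBaseConverter.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]

# Hypothetical old -> new name table (assumption for illustration only);
# real entries would come from a miRBase name conversion tool.
OLD_TO_NEW: Dict[str, str] = {
    "hsa-miR-old-1": "hsa-miR-new-1",
}


def harmonize_ids(edges: Iterable[Edge], mapping: Dict[str, str]) -> List[Edge]:
    """Rename every node found in `mapping`; leave all other nodes unchanged."""
    return [(mapping.get(src, src), mapping.get(dst, dst)) for src, dst in edges]


def unmapped_mirnas(edges: Iterable[Edge], known: Set[str]) -> Set[str]:
    """Return miRNA-like nodes (prefix 'hsa-') that are absent from `known`."""
    nodes = {node for edge in edges for node in edge}
    return {n for n in nodes if n.startswith("hsa-") and n not in known}


# Toy edge list mixing an outdated miRNA name with gene and TF nodes.
edges = [
    ("hsa-miR-old-1", "GENE_A"),
    ("TF_B", "hsa-miR-old-1"),
    ("TF_B", "GENE_A"),
]
fixed = harmonize_ids(edges, OLD_TO_NEW)
print(fixed)
```

Running `unmapped_mirnas` before and after the renaming is one way to check how many identifiers still fail to match the database annotation.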

Point 9. Figs. 1 and 7: Please use standard flowchart symbols instead of colors to represent elements in the workflows.

Response 9. Figures 1 and 7 were updated accordingly.

Point 10. Besides the human databases, the authors can consider establishing the databases of other model organisms for miRGTF-net.

Response 10. We added three mouse databases to the miRGTF-net GitHub repository (TRRUST, TransmiR, miRTarBase), so one can run the tool if miRNA/mRNA expression data in a set of samples are available. This information was also added to the manuscript text (lines 97-98).

Point 11. It would be good to show the real-time progress/status as well as brief summary while running the miRGTF-net script.

Response 11. Progress messages and a brief summary were added to the run.py script.

Reviewer 2

Point 1. Since the results are based on the input databases, the choice of databases may bias the output. I would like to see a comparison with other miRNA target prediction tools, such as TargetScan, to see how robust the results are.

Response 1. We thank the Reviewer for pointing this out. Since TargetScan is a tool for in silico sequence-based target prediction, it contains about 25 times more interactions than the miRTarBase subset of interactions with strong experimental support. Thus, a direct comparison of the two networks is not appropriate. However, the use of TargetScan in miRGTF-net can allow one to formulate hypotheses on new miRNA-mRNA interactions in a considered tissue type. We expanded the Discussion section accordingly (lines 269-279).

Point 2. I can see the potential of applying this software to other species. I suggest that the authors include a note on the GitHub page showing how to build the input data for other model organisms, such as mouse and Drosophila. If the preparation of those data needs pre-processing, they should also include scripts to perform it.

Response 2. We added three mouse databases to the miRGTF-net GitHub repository (TRRUST, TransmiR, miRTarBase), so one can run the tool if miRNA/mRNA expression data in a set of samples are available. This information was also added to the manuscript text (lines 97-98).

Point 3. I suggest that the authors compare one or two more machine learning methods in the classification step. Although SVM is considered a good classifier (AUC value), other algorithms may outperform it because they use different strategies, such as tree models (decision tree, random forest) and probability-based methods, for example, the Naive Bayes classifier.

Response 3. Aside from the SVM classifier, we also tried several alternatives, including the random forest and Naive Bayes methods mentioned above as well as others (k-nearest neighbors, gradient boosting). Interestingly, none of these methods allowed us to construct classifiers of reliable quality, so we decided not to include this information in the text, since it could shift the focus of the manuscript away from its main line (network construction and analysis).

Point 4. I suggest that the authors provide more analysis/visualization of the gene signatures. For example, how stable are the signatures across TCGA samples? Do they show individual variation? A heatmap with clustering analysis would be a useful way to address these questions.

Response 4. Heatmaps with clustering for each dataset were added (S1 Figure, lines 209-210).

Point 5. I would suggest that the authors change the default output format to PDF, which is easier to open and process than the current GraphML format.

Response 5. We thank the Reviewer for the suggestion for improving the tool. Unfortunately, we have not yet been able to adequately address the problem of universal programmatic graph visualization, but we plan to work on this feature for future releases of miRGTF-net. However, we added links to Gephi and yEd Graph Editor to the GitHub README file, so one can simply use these tools for fast and beautiful visualization of networks in the GraphML format.
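
For readers who want a quick textual view of a GraphML file without opening a graph editor, the sketch below lists the edges of a GraphML document using only the Python standard library. The sample document, the node names, and the function `read_edges` are illustrative assumptions and are not taken from actual miRGTF-net output; only the GraphML namespace URI is standard.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Standard GraphML XML namespace, bound to the prefix "g" for path queries.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical two-node network used only for demonstration.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="TF_B"/>
    <node id="GENE_A"/>
    <edge source="TF_B" target="GENE_A"/>
  </graph>
</graphml>"""


def read_edges(graphml_text: str) -> List[Tuple[str, str]]:
    """Return (source, target) pairs for every edge element in the document."""
    root = ET.fromstring(graphml_text)
    return [
        (edge.get("source"), edge.get("target"))
        for edge in root.iterfind(".//g:edge", GRAPHML_NS)
    ]


# Print one tab-separated "source<TAB>target" line per edge.
for source, target in read_edges(SAMPLE):
    print(f"{source}\t{target}")
```

A file on disk could be handled the same way by reading its contents into a string first; rendering the network as an image would still require a tool such as Gephi or yEd.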

Decision Letter - Eduardo Andrés-León, Editor

miRGTF-net: integrative miRNA-gene-TF network analysis reveals key drivers of breast cancer recurrence

PONE-D-20-40860R1

Dear Dr. Nersisyan,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Eduardo Andrés-León

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this revised version, the authors exhaustively addressed all the raised concerns. I consider these changes made by the authors to be appropriate. The manuscript is overall more curated, therefore, I suggest the publication of the paper in its revised form.

Reviewer #2: In this revised article, the authors have addressed all the questions I raised during the first round of review. In addition, they added more discussion to the main text (lines 269-279), which expands the application of the new method they developed. The updated GitHub files (such as the TRRUST, TransmiR, and miRTarBase databases, and the visualization instructions) will also help users apply the software in their own research. I hope that the authors maintain the GitHub repository well and regularly check the bugs reported by users. I suggest accepting this manuscript.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Yu H. Sun

Formally Accepted
Acceptance Letter - Eduardo Andrés-León, Editor

PONE-D-20-40860R1

miRGTF-net: integrative miRNA-gene-TF network analysis reveals key drivers of breast cancer recurrence

Dear Dr. Nersisyan:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Eduardo Andrés-León

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.