Peer Review History

Original Submission: October 18, 2023
Decision Letter - M. Sohel Rahman, Editor

PONE-D-23-34195
RPIPLM: prediction of ncRNA-protein interaction by post-training a dual-tower pretrained biological model with Supervised Contrastive Learning
PLOS ONE

Dear Dr. Liu,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit and may be considered for publication after a thorough revision. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please attend to the comments and questions of the reviewers, particularly those related to the experimental setup and results. More specifically, I would expect your response and, if appropriate, revisions addressing Points #4-#7 of Reviewer 2 and the points listed under the "Weak Points/Possible Improvements" of Reviewer 1.

Please submit your revised manuscript by Feb 01 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

M. Sohel Rahman, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Please update your submission to use the PLOS LaTeX template. The template and more information on our requirements for LaTeX submissions can be found at http://journals.plos.org/plosone/s/latex.

4. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

5. Thank you for stating the following in the Acknowledgments Section of your manuscript: 

"We gratefully acknowledge the financial support received from the following funding sources to conduct this research. The research was partially supported by the Defense Industrial Technology Development Program (Grant JCKY2021906B002 and Grant JCKY2021602B002)."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

"The authors received no specific funding for this work."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

6. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Summary: The paper introduces an innovative method called RPIPLM, designed to predict interactions between non-coding RNA (ncRNA) and proteins.

The method applies a pair of parallel BERT-based models to encode the ncRNA and protein sequences simultaneously. These two encodings are then concatenated to form a unified feature representation, which is refined by an attention-based method, after which a contrastive-learning-based model predicts whether the ncRNA-protein pair interacts. The BERT-based models were pre-trained on unlabelled ncRNA and protein sequences (so the authors did not need to train them from scratch), and labelled data was used to train the remaining parameters of the pipeline. The authors' main claim is that using the BERT models harnesses the power of huge amounts of unlabelled data to improve ncRNA-protein interaction prediction, which is shown in the results section.

Strong Points:

The world-renowned BERT model was used in an intricate way.

Vast quantities of unlabelled data are available for protein and RNA sequences; the RPIPLM model’s attempt to utilise this unlabelled data is praiseworthy.

A case study was conducted to demonstrate the model's performance in real-world scenarios.

For protein encoding, multiple BERT models were tested, and the employment of OntoProtein was very thoughtful.

Instead of using hard-coded thresholds, a ROC curve was shown for comparison, which is a plus point.

Results generated by using various concatenation methods were shown (Table 3).

Weak Points/possible Improvements:

In Figure 1, the protein sequence only shows A, C, G, U, but proteins generally comprise 20 different amino acid residues. This may confuse some readers. The protein sequence also looks identical to the RNA sequence, which may falsely suggest to some readers that the two sequences can be identical. (Yes, the ‘...’ means anything can be there, but many readers may skip over this.) Also, in the figure where the MLM is shown, the sequences do not match the input.

In section 2.2, several concatenation methods are stated, along with what each method captures, e.g., "This method provides a straightforward approach to concatenate RNA and protein features, but may not capture more complex interactions between RNA and protein." What is the basis of such statements? Similar claims are made for the other concatenation methods; references would strengthen them.

In section 2.3.2, the kernel size, filter count, and depth of the convolutional layers are given, and these were fixed during the tests. It would be better if multiple tests were run with different values for these hyperparameters.

In section 3, it is assumed that if ncRNAs "Rs and Ri share more than 80% sequence identity" or proteins "Ps and Pi share more than 40% sequence identity", then if the pair Rs and Ps interact, Ri and Pi also interact. The dataset was filtered this way. What is the basis for choosing these percentages (80% and 40%)?

In section 4.7, we see there were tests with no pre-training and no CL. It is unclear how this model variant was trained and tested.

RNABERT used RNA structural information in its pipeline. Adding such an option to the protein encoding pipeline could be valuable, as huge protein structural databases are available.

There are great tools such as HMMER or HH-suite that can create MSAs or perform profiling for protein sequences. What happens when the results of these tools are used to further tune the MLM-encoded representations? Such additions may improve performance.

I think once the weaknesses/questions are addressed this could be a decent contribution.

Reviewer #2: 1. It would be great if the authors included line numbers.

2. In the second paragraph, some of the examples were repeated in the sentence "These interactions are involved in a range of biological processes, including disease development, cellular signaling, and gene expression regulation, among others.". Either provide new examples or remove these lines.

3. The introduction section is oversized, with too much focus on previous work. It is better to reduce the number of examples and their contributions: include one or two from machine learning and two or three from deep learning with the most groundbreaking contributions at their time.

4. There are other models for protein feature extraction, like ESM and ProteinBERT. Why have those not been tested?

5. To calculate |r-p| and r*p, both r and p need to be of the same size. However, the pretrained models do not produce outputs of the same size. How did the authors convert them to the same size? A dense layer or a 1D convolution? This needs to be mentioned.
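As an aside for readers, the dimensionality mismatch the reviewer raises here can be reconciled with a per-tower projection before element-wise fusion. The following is a minimal NumPy sketch; the dimensions and the choice of a random linear projection are purely illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pooled embeddings from two different pretrained towers.
# Sizes are hypothetical, chosen only to show the mismatch.
r_raw = rng.standard_normal(120)   # e.g. an RNA LM output
p_raw = rng.standard_normal(1024)  # e.g. a protein LM output

d = 256  # hypothetical shared dimension

# One common reconciliation: a learned dense (linear) layer per tower,
# here stood in for by fixed random matrices.
W_r = rng.standard_normal((d, r_raw.size)) / np.sqrt(r_raw.size)
W_p = rng.standard_normal((d, p_raw.size)) / np.sqrt(p_raw.size)

r = W_r @ r_raw   # shape (d,)
p = W_p @ p_raw   # shape (d,)

# Element-wise fusion now works because both vectors have length d.
fused = np.concatenate([r, p, np.abs(r - p), r * p])
assert fused.shape == (4 * d,)
```

In a real model the projection weights would be trained end to end; a 1D convolution over token embeddings would serve the same size-matching purpose.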

6. The global and local attention methods were not described. From the figure, it seems global attention is the scaled dot-product attention of Vaswani et al. and local attention is regular dot-product attention. These should be detailed, with references to the authors of the original work.

7. Please highlight the best scores in Table 3.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Gourab Saha

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

We express our gratitude to the esteemed reviewers for their constructive feedback, valuable suggestions, and positive evaluation of our work. Their insightful recommendations have significantly contributed to improving the quality and rigor of our manuscript. Please refer to our submitted PDF document "Response to Reviewers" for specific responses.

Attachments
Attachment
Submitted filename: Response to Reviewers.pdf
Decision Letter - M. Sohel Rahman, Editor

PONE-D-23-34195R1
RPIPLM: prediction of ncRNA-protein interaction by post-training a dual-tower pretrained biological model with Supervised Contrastive Learning
PLOS ONE

Dear Dr. Liu,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================
Please note that while Reviewer 1 seems to be happy with your revision effort, Reviewer 2 is still unhappy and suggested a Major revision. While I felt the recommendation of Reviewer 2 is a bit harsh, I do agree with his points, particularly Points 3 and 6. Therefore, a further revision is in order, or you may provide a clear rebuttal if you believe that the comments are unjustified.
==============================

Please submit your revised manuscript by Aug 02 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.


We look forward to receiving your revised manuscript.

Kind regards,

M. Sohel Rahman, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: All the comments have been addressed properly. Relevant references were added and the manuscript was properly updated. I am happy with the revised manuscript.

Reviewer #2: 1. Table-3 needs to show the best score per column in bold.

2. The authors need to show the applicability of the method on SOTA PLMs like ESM (https://github.com/facebookresearch/esm) and RNA LMs like RiNALMo (https://arxiv.org/pdf/2403.00043) to prove robustness of the idea.

3. One major issue with the method is the application of dot-product attention and scaled dot-product attention on the concatenated features. Attention is usually applied along the length of a sequence; however, this method appears to apply it along the feature dimension. The article never explained the rationale behind this. If that is not the case, the description is not well written to reflect it.

4. Table-6 in the ablation studies should indicate the best scores in bold. Additionally, the authors might reconsider a different title for the section. Finally, in the description, instead of rehashing what the table shows, the authors may instead focus on the improvement (in percentage or other measures).

5. Additionally, while the difference in performance may not be big, as seen in the ablation studies, it would be interesting to see the effects of the various attention modules on the learned representations. The authors may perform t-SNE plot experiments to show that. Also, if there are significant differences, an interesting study would be exploring how the attention modules impact the feature space.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Sabab Aosaf

Reviewer #2: No

**********



Revision 2

We express our gratitude to the esteemed reviewers for their constructive feedback, valuable suggestions, and positive evaluation of our work. Their insightful recommendations have significantly contributed to improving the quality and rigor of our manuscript.

1. Reviewer #1

1.1. Comment 1

RC: All the comments have been addressed properly. Relevant references were added and the manuscript was properly updated. I am happy with the revised manuscript.

AR: Thank you for your positive feedback and for your time and effort in reviewing our manuscript.

2. Reviewer #2

2.1. Comment 1

RC: Table-3 needs to show the best score per column in bold.

AR: Thank you for your suggestion. We have made the necessary revisions to Table 3 to show the best score per column in bold in the revised manuscript.

2.2. Comment 2

RC: The authors need to show the applicability of the method on SOTA PLMs like ESM (https://github.com/facebookresearch/esm) and RNA LMs like RiNALMo (https://arxiv.org/pdf/2403.00043) to prove robustness of the idea.

AR: Thank you for your valuable suggestion. We agree that testing on state-of-the-art PLMs such as ESM and RiNALMo would further demonstrate the robustness and generalizability of our method. As these models have different input formats and computational demands, we plan to include them in our future work to extend RPIPLM’s applicability. In the current version, we have focused on integrating and evaluating a range of competitive PLMs (e.g., ProtBert, OntoProtein, RNABERT) to establish the foundational performance of our dual-tower framework.

2.3. Comment 3

RC: One major issue with the method is the application of dot-product attention and scaled dot-product attention on the concatenated features. Attention is usually applied along the length of a sequence; however, this method appears to apply it along the feature dimension. The article never explained the rationale behind this. If that is not the case, the description is not well written to reflect it.

AR: We appreciate the reviewer’s insightful comment. However, we believe there may have been a misunderstanding. In our method, the dot-product attention is applied along the sequence (token) dimension, not along the feature (embedding) dimension.

As described in Section Global and Local Attention Modules (page 7), we define our input $X = [x_1, x_2, \dots, x_n]$ as a sequence of embeddings, where $n$ denotes the sequence length (i.e., the number of tokens after concatenation of RNA and protein CLS or pooled embeddings). The scaled dot-product attention then computes attention weights between all pairs of tokens in this sequence. To avoid further confusion, we have revised the manuscript to explicitly state the dimension along which attention is applied, and we have clarified that the attention is not across feature channels.
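To make the distinction concrete, scaled dot-product self-attention over the token axis can be sketched as follows. This is a minimal NumPy illustration with hypothetical shapes, not the paper's actual implementation: the attention weight matrix is token-by-token (n x n), never feature-by-feature.

```python
import numpy as np

def scaled_dot_product_attention(X):
    """Self-attention over the token axis of X, where X has shape
    (n, d): n tokens, each a d-dimensional embedding."""
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)                  # (n, n) token-to-token scores
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax along the token axis
    return weights, weights @ X                    # weights: (n, n), output: (n, d)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 32))    # 6 tokens, 32-dim embeddings (illustrative)
W, out = scaled_dot_product_attention(X)
assert W.shape == (6, 6)            # attends across tokens, not feature channels
assert out.shape == X.shape
```

Applying attention along the feature dimension instead would transpose X and yield a d x d weight matrix, which is the reading the reviewer objected to.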

2.4. Comment 4

RC: Table-6 in Ablation studies should indicate the best scores in bold. Additionally, authors might reconsider a different title for the section. Additionally, in the description, instead of rehashing what the table shows, authors may instead focus on the improvement ( in percentage or other measures).

AR: Thank you for the constructive suggestions regarding the ablation study section. We have updated Table 6 by highlighting the best-performing scores in bold for clearer comparison. Additionally, we have renamed the section to “Ablation Study and Performance Impact” to better reflect its content. In the revised description, we now focus on the relative performance changes, reporting the percentage improvements and degradations caused by the removal of key modules, rather than restating the table values.

2.5. Comment 5

RC: Additionally, while the difference in performance may not be big, as seen in the ablation studies, it would be interesting to see the effects of the various attention modules on the learned representations. The authors may perform t-SNE plot experiments to show that. Also, if there are significant differences, an interesting study would be exploring how the attention modules impact the feature space.

AR: We thank the reviewer for the valuable suggestion to explore the impact of attention modules on the learned representations. While we initially planned to use t-SNE visualizations, we agree that a quantitative summary better complements our existing evaluation style.

To this end, we conducted embedding-level ablation experiments under the same concatenation scheme ($(r, p, |r - p|)$), where we removed either global attention, local attention, or both. The classification results using the derived embeddings are reported in the following table. These results show that removing either attention module consistently degrades performance, with the most notable drop occurring when both are removed. This supports the view that the attention mechanisms help shape a more discriminative and informative feature space, even when the top-line metrics are not drastically different.

Configuration                  Accuracy (NPInter2.0)   MCC (NPInter2.0)   Accuracy (RPI2241)   MCC (RPI2241)
No Global Attention            0.931                   0.875              0.812                0.667
No Local Attention             0.922                   0.868              0.801                0.643
No Global & Local Attention    0.903                   0.847              0.785                0.620
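For readers unfamiliar with the MCC metric reported in the table above, it is the Matthews correlation coefficient, computed from confusion-matrix counts. The sketch below is generic; the counts are illustrative placeholders and are not derived from the paper's experiments.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) to +1 (perfect prediction)."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # convention: 0 when any margin is empty

# Illustrative counts only (hypothetical, unrelated to the tables above).
score = mcc(tp=90, tn=85, fp=15, fn=10)
assert 0.0 < score < 1.0
```

Unlike accuracy, MCC balances all four confusion-matrix cells, which is why both metrics are reported side by side.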

Attachments
Attachment
Submitted filename: Response to Reviewers.docx
Decision Letter - M. Sohel Rahman, Editor

PONE-D-23-34195R2
RPIPLM: prediction of ncRNA-protein interaction by post-training a dual-tower pretrained biological model with Supervised Contrastive Learning
PLOS ONE

Dear Dr. Liu,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jun 20 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.


We look forward to receiving your revised manuscript.

Kind regards,

M. Sohel Rahman, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

I have been in touch with the reviewer who had trouble submitting the report. I am therefore communicating his comments to you (the authors) for (minor) revision.

1. Because the model is trained in a fully supervised manner, its final embedding layer is inherently optimized for class separation, so any t-SNE projection will directly mirror its classification accuracy. As a result, the only distinction between the RPIPLM and RPITER t-SNE visualizations arises from their differing accuracy (and other metrics) scores, and the “new” plots offered are in fact identical to those already shown in Figure 3. Additionally, the 3rd and 4th plots in the t-SNE plots are reflective of the final plot in Figure 5.

2. As for what the authors can do to show the interpretability of the model

They can use the attention weights to see which tokens the model pays most attention to while classifying, and whether those tokens show patterns that might represent something biologically meaningful.

Biologically interpret whether the model’s attention weights are indicative of binding sites, and whether the weights can shed some light on that.

Note that authors are not restricted to these ideas and can explore any other ideas they might find a good fit.

3. All other comments have been addressed


[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 3

We express our gratitude to the esteemed reviewer for the constructive feedback, valuable suggestions, and positive evaluation of our work. His/her insightful recommendations have significantly contributed to improving the quality and rigor of our manuscript.

1. Comment 1

RC: Because the model is trained in a fully supervised manner, its final embedding layer is inherently optimized for class separation, so any t-SNE projection will directly mirror its classification accuracy. As a result, the only distinction between the RPIPLM and RPITER t-SNE visualizations arises from their differing accuracy (and other metrics) scores, and the “new” plots offered are in fact identical to those already shown in Figure 3. Additionally, the 3rd and 4th plots in the t-SNE plots are reflective of the final plot in Figure 5.

AR:

We thank the reviewer for the thoughtful comment and agree with the observation that “any t-SNE projection will directly mirror its classification accuracy.” Indeed, as our model is trained in a fully supervised manner, the embedding space is naturally optimized for class separation, and the resulting t-SNE plots primarily reflect this objective.

However, we would like to clarify two points:

1. We are not entirely certain about the reference to “the RPITER t-SNE visualizations”, as RPITER was not included in any t-SNE comparison in our manuscript. Furthermore, we did not introduce any new t-SNE plots in the revised version of the manuscript beyond what was already present in the original submission (Figure 6). If there has been a misunderstanding on our part, we would be grateful for clarification.

2. Regarding the comment that “the 3rd and 4th plots in the t-SNE plots are reflective of the final plot in Figure 5,” we believe this refers to a perceived overlap in visual information. We note that the last plot in Figure 5 shows the performance of RPIPLM on the RPI369 dataset, and the t-SNE example we included in Figure 6 is likewise drawn from RPI369. Therefore, the similarity is expected and intentional: the t-SNE plot serves as a geometric visualization corresponding to the same experimental condition.

2. Comment 2

RC: As for what the authors can do to show the interpretability of the model: they can use the attention weights to see which tokens the model pays most attention to while classifying, and whether those tokens show patterns that might represent something biologically meaningful. Biologically interpret whether the model’s attention weights are indicative of binding sites, and whether the weights can shed some light on that.

Note that authors are not restricted to these ideas and can explore any other ideas they might find a good fit.

AR:

We appreciate the reviewer’s insightful suggestion regarding the use of attention weights for biological interpretability. Indeed, analyzing which tokens receive high attention may help identify sequence regions associated with binding activity. While preliminary inspection of attention maps shows that the model occasionally highlights regions near known binding sites, a more rigorous biological validation would require curated site-level annotations, which we plan to incorporate in follow-up studies.

Attachments
Attachment
Submitted filename: Response_to_Reviewers_auresp_3.docx
Decision Letter - M. Sohel Rahman, Editor

RPIPLM: prediction of ncRNA-protein interaction by post-training a dual-tower pretrained biological model with Supervised Contrastive Learning

PONE-D-23-34195R3

Dear Dr. Liu,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

M. Sohel Rahman, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

The reviewer (who had trouble accessing the system and hence the delay), has made the following comments to me:

1. As the authors mentioned that the overlap between the plots in Figures 5 and 6 is intentional, it would be best if the t-SNE plots were moved from the model interpretability section to the results discussion section.

2. The rest of the comments have been addressed.

I would request the authors to consider #1 above as a discretionary comment. 

Reviewers' comments:

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.