A general hypergraph learning algorithm for drug multi-task predictions in micro-to-macro biomedical networks

Shuting Jin; Yue Hong; Li Zeng; Yinghui Jiang; Yuan Lin; Leyi Wei; Zhuohang Yu; Xiangxiang Zeng; Xiangrong Liu

doi:10.1371/journal.pcbi.1011597

Peer Review History

Original Submission: July 6, 2023
25 Aug 2023 Decision Letter - Jason A. Papin, Editor, Alexander MacKerell, Editor

Dear Miss Jin,

Thank you very much for submitting your manuscript "A general hypergraph learning algorithm for drug multi-task predictions in micro-to-macro biomedical networks" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission.
We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Alexander MacKerell
Academic Editor
PLOS Computational Biology

Jason Papin
Editor-in-Chief
PLOS Computational Biology

*********************

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: Jin et al. reported in this manuscript a hypergraph-based deep learning framework tailored for drug-related predictions using fragments of drug molecules. In particular, BRICS decomposition is used to partition each small-molecule drug into functional group fragments at a variety of levels, and the inference is performed based on fragments instead of whole molecules. Most importantly, the deployment of hypergraphs by the authors is suitable and imperative for accommodating the higher-order relationships between fragments/motifs. The authors demonstrated that such an approach can be successfully used to perform inference for drug-drug interactions, drug-target interactions, drug-disease relationships, and drug side-effect interactions. The manuscript is in general well-written, and should be suitable for PLOS Comp. Biol. It can be improved by addressing the following comments.

1) The section discussing implementation specifics should elucidate the number of learnable parameters in the neural network (NN) models. This should be in tandem with the information about training parameters and the utilized platform.

2) To ensure a fair comparison, the inclusion of other evaluation metrics, such as MCC, Precision, Recall, and F1-score, is advised for all methods in comparison across all four drug-related tasks.

3) The manuscript would benefit from a more comprehensive description of the baseline models, especially concerning the three GNN-based models.
Given that the efficacy of NNs is closely tied to their configurations, the authors, at the very least, should specify the number of learnable weights in the baseline models in relation to their HGDrug models.

4) I checked the corresponding GitHub repository and found the introduction unclear. It would be beneficial to incorporate a comprehensive README section, detailing download instructions, software requirements, training protocols, and illustrative examples of outputs.

5) Table 1 requires a more detailed explanation and description.

Besides, I'd like to draw attention to a potential typo concerning the eighth affiliation of the corresponding author (X.L.). Please check whether it is "Zhejiang Lab" as opposed to "Zhijiang Lab".

Reviewer #2: Summary: The authors present a hypergraph learning algorithm named HGDrug for drug multi-task predictions. The algorithm incorporates drug-substructure relationships into molecular interaction networks to construct a drug-centric heterogeneous network. HGDrug captures high-order drug relations and fetches effective drug features using motif-driven hypergraphs and a self-supervised auxiliary task. This study demonstrates that HGDrug achieves state-of-the-art performance and can capture relations between drugs with the same functional groups. The proposed drug-substructure interaction networks can also improve the performance of existing network models for drug-related prediction tasks. The paper is well-written, and the method is clearly described. I only have a few comments and suggestions for the authors.

Major: The introduction does provide a clear objective of the study, which is to introduce a new hypergraph learning algorithm for drug multi-task predictions. However, the significance of this objective in the broader context of drug discovery could be elaborated upon more. I suggest they give a deeper explanation of why their objective is challenging or crucial.
While the work does touch upon the importance of drug multi-task predictions, a more detailed motivation explaining the challenges in the current landscape would provide a stronger foundation for the study.

The authors briefly mention terms like "motif-driven hypergraphs" and "self-supervised auxiliary task". It would be better to give a brief explanation or reference for these terms in the introduction.

While the paper claims to introduce a novel approach by incorporating drug substructure information, it would be beneficial to provide a clearer description of the distinction between the proposed method and existing graph-based methods. For example, give more information on the difference between DSMN and the heterogeneous-network-based methods.

I suggest the authors underscore the significance and benefits of their approach compared to traditional methods. Many researchers tend to favor established techniques unless the new method addresses specific challenges that traditional methods cannot overcome. Why is there a pressing need for another drug prediction algorithm? What specific challenges does HGDrug address that others don't?

The work could benefit from better structuring, possibly with more defined subheadings, to guide the reader through different sections. I suggest the authors summarize each section's conclusion in its subheading to aid readers' understanding.

It would be beneficial to provide a detailed explanation for the operator “δ‖θ‖₂²” in Formula 8.

In the discussion, it would be better to emphasize the potential applications or implications of the HGDrug algorithm in real-world scenarios.

I suggest the authors discuss the computational complexity of HGDrug, its scalability, and its performance with respect to dataset size, which would provide insights into its feasibility for large-scale applications.

Minor: Refining and condensing your writing will improve the paper's readability.
For example: In the Definition 2 section, "The motif is a small pattern of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks," consider revising to "A motif is a recurring pattern of interconnections in complex networks, occurring more frequently than in randomized networks."

For references, ensure all references are up-to-date and relevant. Cross-check that all cited works are appropriately referenced within the text.

This article contains many abbreviations; it would be better to provide a list or table of abbreviations for quick reference.

The manuscript delves into technical details regarding hypergraphs, motifs, and heterogeneous networks. It would be helpful to provide more intuitive explanations or examples for readers less familiar with these concepts.

Reviewer #3: The manuscript describes a hypergraph learning model, namely HGDrug, which introduces drug substructure (functional group fragments) into biomedical networks for the first time. Specifically, the work constructs a drug-centric micro-to-macro heterogeneous network (termed DSMN) and presents a motif-driven hypergraph learning framework (termed HGDrug) with a self-supervised auxiliary task for drug multi-task interaction predictions. The method is evaluated on 4 benchmark tasks and demonstrates that it achieves highly accurate and robust predictions, outperforming 8 state-of-the-art task-specific models and 6 general-purpose conventional models. A specific case study also shows that HGDrug can learn the substructure information to improve the performance of drug repurposing. However, I have the following concerns.

Major comments:

1. Fragments increase the number of nodes and edges of these networks. For example, for the Drug-drug, Drug-target, Drug-disease, and Drug-side-effect datasets, the number of fragments is 4.42 times the number of nodes.
Please discuss the balance between time and performance before and after adding fragments.

2. In the “Ablation studies” section, the authors explored the influence of 4 hypergraph-based branches on the prediction performance of the model, but I noticed that each hypergraph branch also contains multiple network motifs to derive hypergraphs. I think the authors should study the contribution of the four kinds of motifs (M2, M3, M5, M6) related to fragments to the prediction results, and further verify the importance of constructing hypergraphs based on fragments for drug feature learning.

3. The work uses BRICS to decompose the SMILES sequences of drugs when acquiring the DFI and FFI networks. As far as I know, there are many ways to decompose SMILES, such as RDKit's Recap. Why did the authors choose BRICS? BRICSDecompose should return a list of non-repeated fragments after decomposition. How did the authors obtain the FFIs?

4. At the beginning of the section “Network visualization of the DDiI predictions”, it is not clear what the authors mean by "We remove the known DDiIs used in the prediction model…". Do they run the same pipeline again without considering the known drug-disease associations? Or do they delete the known associations from the outcome they already had in the section ‘Performance comparison’?

Minor comments:

1. In the results shown in Figure 7, there seems to be no “Disease/Side effect” node type, and the color marked in the figure seems to be the same as the “Target” node.

2. Some important works have not been reviewed or mentioned, such as doi: 10.1093/bioinformatics/btac579, doi: 10.1093/bioinformatics/btab651, doi: 10.1093/bioinformatics/btz718, doi: 10.1371/journal.pcbi.1011382. A sufficient literature review is important for potential readers.

******

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

******

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Jing Huang
Reviewer #2: No
Reviewer #3: No

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript.
Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

https://doi.org/10.1371/journal.pcbi.1011597.r001
Revision 1
22 Sep 2023 Author Response

Attachment
Submitted filename: Response to Reviewers.docx

https://doi.org/10.1371/journal.pcbi.1011597.r002
13 Oct 2023 Decision Letter - Jason A. Papin, Editor, Alexander MacKerell, Editor

Dear Miss Jin,

We are pleased to inform you that your manuscript 'A general hypergraph learning algorithm for drug multi-task predictions in micro-to-macro biomedical networks' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Alexander MacKerell
Academic Editor
PLOS Computational Biology

Jason Papin
Editor-in-Chief
PLOS Computational Biology

*********************************************************

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have adequately addressed my comments and the manuscript is now ready for publication.

Reviewer #3: The authors have addressed my concerns.
******

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #3: None

******

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Jing Huang
Reviewer #3: No

https://doi.org/10.1371/journal.pcbi.1011597.r003
Formally Accepted
2 Nov 2023 Acceptance Letter - Jason A. Papin, Editor, Alexander MacKerell, Editor

PCOMPBIOL-D-23-01067R1
A general hypergraph learning algorithm for drug multi-task predictions in micro-to-macro biomedical networks

Dear Dr Jin,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

https://doi.org/10.1371/journal.pcbi.1011597.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.