Peer Review History
| Original Submission: April 10, 2025 |
|---|
|
PCOMPBIOL-D-25-00691 InteracTor: Feature Engineering and Explainable AI for Profiling Protein Structure-Interaction-Function Relationships PLOS Computational Biology Dear Dr. Dias, Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript within 30 days, by Sep 12 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: * A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below. * A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. * An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. 
We look forward to receiving your revised manuscript. Kind regards, Fei Guo Academic Editor PLOS Computational Biology Shihua Zhang Section Editor PLOS Computational Biology Additional Editor Comments: Please revise the paper based on the reviewers' comments. Journal Requirements: 1) Please ensure that the CRediT author contributions listed for every co-author are completed accurately and in full. At this stage, the following authors require contributions: Jose Cleydson F. Silva, Layla Schuster, Nick Sexson, Melissa Erdem, Ryan Hulk, Matias Kirst, Marcio F. R. Resende, and Raquel Dias. Please ensure that the full contributions of each author are acknowledged in the "Add/Edit/Remove Authors" section of our submission form. The list of CRediT author contributions may be found here: https://journals.plos.org/ploscompbiol/s/authorship#loc-author-contributions 2) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: https://journals.plos.org/ploscompbiol/s/figures 3) Some material included in your submission may be copyrighted. According to PLOS's copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS's copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form. Please respond directly to this email and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. 
Once you have responded and addressed all other outstanding technical requirements, you may resubmit your manuscript within Editorial Manager. Potential Copyright Issues: i) Figures 1A and 1D. Please confirm whether you drew the images / clip-art within the figure panels by hand. If you did not draw the images, please provide (a) a link to the source of the images or icons and their license / terms of use; or (b) written permission from the copyright holder to publish the images or icons under our CC BY 4.0 license. Alternatively, you may replace the images with open source alternatives. See these open source resources you may use to replace images / clip-art: - https://commons.wikimedia.org 4) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published. 1) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)." 2) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." 3) If any authors received a salary from any of your funders, please state which authors and which funders. Reviewers' comments: Reviewer's Responses to Questions Reviewer #1: The work presented here describes a predictive ML method for the structural classification of protein families and their function. Several aspects, namely the use of 3D structures and interpretable models, are two of the strengths of this work. Overall, the manuscript covers all the methodological details of the work with sufficient clarity, and approaches appear to be rigorous and robust. 
I do have some minor reservations, mostly related to the results analysis and the implications of the features deemed responsible for the model performance, and minor wording clarifications throughout the manuscript. In particular, the correlation between feature roles and protein classes feels rather anecdotal and weak. This part could be strengthened very easily by finding more examples and consistent outcomes across protein families (i.e., all or the majority of transporters show features X and Y). Below are a few more detailed comments. - vdW vs London forces: interesting take, but how did they calculate them with sufficient accuracy to differentiate them and prevent double-counting? - Out of the features extracted, how did the Authors address the dominance of features describing the frequency of a tripeptide? While not being at risk of providing unwanted memorization of the dataset, it still represents a dominant set of features that capture the sequence more than the structural context of the amino acids, arguably one of the strengths of the work. In that regard, I assume that in the Mutual Information analysis, the TG, VN and GG descriptors refer to the frequency of dipeptides of such amino acids. If that's correct, it's interesting that dipeptide descriptors were found more important than tripeptides. > Additionally, certain peptide composition patterns, such as those involving amino acids P, CLG, PP, H, and W, were also among the most important features. This is somewhat of a recurrent pattern in the manuscript, where amino acid one-letter references are mixed with other acronyms in a not so clear manner. Also, according to the nomenclature introduced earlier, those should be referred to as n-peptide descriptors, not "amino acids". 
> For instance, the importance of London dispersion forces(45) and internal hydrophobicity(46) in Cytochrome P450 classification emphasizes the crucial role these interactions play in maintaining its tertiary structure This statement and its inverse in the following paragraph are a bit controversial. Hydrophobicity is considered among the most important forces driving protein folding, which would be expected to hold true for virtually every protein in the dataset, and not in specific classes. > Bacterial solute-binding protein 2 family displayed a balance of repulsive interactions and internal tension, likely linked to their dynamic conformational states required for substrate transport(54) This is an interesting finding, but unless confirmed across multiple dynamic proteins, it really falls a bit short of supporting the post-hoc analysis discussed. Also, as a suggestion for future work, it would be very interesting to see the performance of a model extended to include secondary structure descriptors. Reviewer #2: This paper by Silva et al explores the classification of protein family and protein function using machine learning methods. The specific contribution of this work is the use of features based on 3D protein structures. These features are based on, for example, patterns of hydrogen bonding, hydrophobic contacts, and van der Waals interactions. Overall, this work is scientifically interesting and rigorously conducted. I have only three comments for the authors' and editors' consideration: First, this work is based on the 20,877 protein structures in the PDB REDO data set. The manuscript provides good justification for using this subset of the entire PDB based on data quality and data leakage. But the thresholds used only leave 8 protein families and 3 GO terms, which limit the generalizability of the conclusions. 
Given that the GO analysis uses a higher threshold for the sample size (90 vs 30), the authors should consider relaxing that threshold to increase the number of GO terms in their analysis. Second, Figure 5 shows the F1-score on subsets of features as they compare to the full set of features. It appears this evaluation was done based on an 80-20 split, where 80% of the samples are used to train the classifier and 20% are used to calculate the F1-score. I infer that the feature selection was also done on the 80% training set. I believe that the more rigorous approach would be a three-way split of samples, with one set used strictly for the feature selection. This design would better support the generalized assertion that the feature selection method produces comparable classification results. Third, in Figure 6, consider highlighting the 3D interaction features (by bolding or coloring) that are the focus of this study. Especially since the feature lists are heavily dominated by the compositional features, visually highlighting those interaction features will reinforce their importance. To assign a statistical significance, consider also using a Kolmogorov-Smirnov statistic to show that these interaction features have larger SHAP values (similar to how the GSEA tool performs gene set enrichment). ********** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. 
If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] Figure resubmission: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions. 
Reproducibility: To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols |
| Revision 1 |
|
Dear Prof. Dias, We are pleased to inform you that your manuscript 'InteracTor: Feature Engineering and Explainable AI for Profiling Protein Structure-Interaction-Function Relationships' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. Best regards, Fei Guo Academic Editor PLOS Computational Biology Shihua Zhang Section Editor PLOS Computational Biology *********************************************************** |
| Formally Accepted |
|
PCOMPBIOL-D-25-00691R1 InteracTor: Feature Engineering and Explainable AI for Profiling Protein Structure-Interaction-Function Relationships Dear Dr Dias, I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. For Research, Software, and Methods articles, you will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing. Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work! With kind regards, Judit Kozma PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.