Peer Review History
| Original Submission May 24, 2024 |
|---|
|
PONE-D-24-21031
Calculated hydration free energies become less accurate with increases in molecular weight
PLOS ONE
Dear Dr. Ivanov,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Your manuscript has been reviewed by three independent reviewers, who raised valuable questions that require further clarification and should be reflected in the revised manuscript. I will be delighted to consider the revised version of the manuscript.
Please submit your revised manuscript by Aug 02 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Soumendranath Bhakat
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.
3. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.
4. Thank you for stating the following in the Acknowledgments Section of your manuscript: "Acknowledgments. This study is financed by the European Union-NextGenerationEU through the National Recovery and Resilience Plan of the Republic of Bulgaria, project № BG-RRP-2.004-0004-C01. The in silico calculations were performed at the Centre of Excellence for Informatics and ICT, supported by the Science and Education for Smart Growth Operational Program and co-financed by the European Union through the European Structural and Investment Funds (Grant No. BG05M2OP001-1.001-0003)." Please note that funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript.
5. In this instance it seems there may be acceptable restrictions in place that prevent the public sharing of your minimal data. However, in line with our goal of ensuring long-term data availability to all interested researchers, PLOS’ Data Policy states that authors cannot be the sole named individuals responsible for ensuring data access (http://journals.plos.org/plosone/s/data-availability#loc-acceptable-data-sharing-methods). Data requests to a non-author institutional point of contact, such as a data access or ethics committee, help guarantee long-term stability and availability of data. Providing interested researchers with a durable point of contact ensures data will be accessible even if an author changes email addresses, institutions, or becomes unavailable to answer requests. Before we proceed with your manuscript, please also provide non-author contact information (phone/email/hyperlink) for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If no institutional body is available to respond to requests for your minimal data, please consider whether there are any institutional representatives who did not collaborate in the study, and are not listed as authors on the manuscript, who would be able to hold the data and respond to external requests for data access. If so, please provide their contact information (i.e., email address). Please also provide details on how you will ensure persistent or long-term data storage and availability.
6. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.
Additional Editor Comments:
Please provide a GitHub link to the scripts and data used in this study.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Partly
Reviewer #3: Partly
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: No
Reviewer #3: No
**********
3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: No
Reviewer #3: No
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: No
Reviewer #3: Yes
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: This manuscript focuses on the impact of molecular properties and the choice of water model on hydration free energy, a key component in determining ligand binding free energy. The findings and discussion presented in this study add value to computational drug discovery efforts. I recommend this manuscript for publication after the following comments have been addressed by the author.
1. The author has measured molecular descriptors using RDKit. However, the measured descriptors are not addressed in the discussion. I would recommend adding the list of the descriptors. Did the author find any correlation between one or more of those descriptors and the RMSE in HFE? Did any of those descriptors show correlation with the errors in the sign of the HFE?
2. The results presented in this manuscript show that the 3-point water models outperform the 4-point water model. Could this finding be extended to charged molecules? Should the choice of water model be determined by the physicochemical properties of the molecules? A comment on this aspect could be useful if the author has any insights obtained from their analysis.
3. While the author has presented an interesting observation regarding the impact of molecular weight on the accuracy of hydration free energy prediction, it will be useful to add a section on why the errors get larger with increasing molecular weight. Specifically, do they correlate with the accessible surface area and total volume of the molecule? Do linear molecules show a different trend than cyclic or branched molecules?
Reviewer #2: See Attached Document for formatted review. Plaintext version pasted below:
Review for: “Calculated hydration free energies become less accurate with increases in molecular weight”
Summary: This article seeks to address and benchmark the long-standing problem of how accurately computation is able to predict the hydration free energy (HFE) of drug-like molecules. By using an established dataset of experimentally measured HFE values, the FreeSolv database, the author uses the AMBER MD engine to carry out free energy perturbation methods to compute HFE estimates. The author then generates a machine-learned (ML) model trained on this data and assesses the performance of said model relative to HFE predictions. The analysis of accuracy focuses on some select molecular properties, such as molecular weight, but does not do a complete benchmarking of other potentially relevant properties. As such, the article could benefit from a more in-depth discussion of the results, of why certain features were chosen, and of HFE’s relevance to drug discovery. As discussed below, other parts require further contextualization, and the discussion section requires a major rewrite as it is a single large paragraph.
Major (required) revisions:
1. The introduction would benefit from a more direct contextualization of HFE calculations and their relevance in drug discovery. While Free Energy Calculations (FECs) are known to be incredibly relevant to drug discovery and inhibitor improvement, the introduction does not make it clear that HFEs also have their own utility. It would be beneficial if the introduction provided additional citations and additional direct contextualization about where HFEs are valuable in the drug discovery process and at what point they are relevant. Obviously, solubility is an important component of the ligand design process, but there is no discussion about additional methods for measuring solubility like logS or how they are utilized.
2. The discussion requires a major rewrite. It is currently one singular large paragraph that spans multiple pages, making it incredibly difficult to read and parse. It is also not clear which discussion points are connected to the data and which are further speculation without the data. It will be important to separate the discussion to clearly demarcate both of those types of paragraphs.
3. Additional discussion of why error is compared with molecular weight and no other properties would benefit the manuscript. In general, I think it is known that HFEs will scale with respect to molecular weight, and with increased ligand size there will be increased parameters to consider and increased time to convergence with simulation-based methods. Thus, it seems consistent with expectation that HFE variance will increase per ligand, which would be remedied by increased replicates.
4. Consistent with the above discussion that convergence would require additional sampling with larger molecular weight due to a variety of chemical factors, it would be useful to compare how different numbers of replicates impact this error while still achieving convergence. It would be incredibly useful to identify whether there is a scaling between molecular weight, number of atoms/bonds, and the amount of sampling needed.
5. Lastly, it would be useful to provide additional contextualization, testing, and out-of-data comparisons for the Machine Learning model that was constructed. The model evaluated using these train-test splits was never tested against another orthogonal dataset that might be out of distribution. Testing only on data within the dataset allows the model to learn similar chemistries across the train-test split without any guarantee that the model would indeed be able to extrapolate to new chemical topologies.
6. Additionally, a more thorough characterization of the train-test splits would be useful. It is currently not clear whether the train-test splits contain similar chemical identities in both the training and test sides of the split, which would make it difficult to test how well the model is able to extrapolate to new chemistries. Given the importance of the FreeSolv dataset to the ML model, but its small size, it would be useful to do more thorough chemically driven splitting.
7. Given that this was an effort done using established open-source libraries on openly available databases, it would be good for there to be an associated GitHub repository for sharing the results of this data and the model weights.
8. Given the interesting nature of the data and the results, a conclusion section would benefit the manuscript and greatly improve readability.
Minor revisions:
1. Citations are needed at the following points:
a. Page 2, line 31, “an HIV integrase inhibitor”
b. Page 2, line 32-33, “modeling and simulation have been instrumental in bringing aobut new therapeutic agents”
c. Page 4, line 86 and 87, “one-step approach” and “two-step approach”
d. Page 5, line 97, “despite its many known shortcomings”
e. Page 5, line 96, “if not the most widely used”
2. The following phrases are not clear and could benefit from rewording:
a. Page 3, line 43-44
3. Start a new paragraph at Page 5 line 108 for clarity.
4. The end of the introduction is riddled with sharp transitions between sentences; some transition words and rephrasing can improve the flow for the reader here.
5. Given the historically outdated nature of the Berendsen barostat (see papers such as: https://doi.org/10.1016/j.molliq.2022.120116 and https://doi.org/10.1016/j.bbamem.2016.02.004), it would be useful to provide some context for why the Berendsen barostat was used in this simulation over other barostat methods. Alternatively, a characterization across different barostats would be useful to see.
6. It would be useful to provide increased contextualization in the text for building the 2D QSAR models.
7. Page 8, line 166-167: It would be useful to provide a description of what the zero-variance parameters that were removed from consideration were, and to list the other 284 descriptors in the SI.
8. Page 8, line 169: Please clarify why a 70:30 ratio was chosen for train-test splitting (or provide a citation).
9. Figure would benefit from larger fonts and heading text to improve readability.
10. Page 11, line 232: Perhaps a more quantitative description of what it means that the three modes are lying close to each other?
11. Page 14, line 301: Provide a rationale/citation for why 5.25 ns of production dynamics was used, or a more robust sampling was done.
12. Page 15, line 314 appears to have a typo; it should be referring to S5 Fig. if I’m reading correctly?
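Reviewer #2's major point 6 (chemically driven splitting) amounts to keeping molecules from the same chemical family on one side of the train-test split. The following is a minimal sketch of that idea, assuming each molecule already carries a group label such as a scaffold identifier computed elsewhere. The function name `group_split`, the labels, and the greedy fill strategy are illustrative assumptions and are not the author's procedure; the 70:30 target mirrors the ratio mentioned in minor point 8.

```python
from collections import defaultdict


def group_split(keys, test_fraction=0.3):
    """Split indices so that items sharing a key never straddle train and test.

    keys: one group label per molecule (hypothetical, e.g. a scaffold
    identifier; how it is computed is left to the reader).
    """
    groups = defaultdict(list)
    for idx, key in enumerate(keys):
        groups[key].append(idx)

    # Largest groups first, so common chemotypes land in training and the
    # test set is filled with the rarer ones.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n_train_target = round(len(keys) * (1.0 - test_fraction))
    train, test = [], []
    for members in ordered:
        # A whole group goes to training only if it fits under the target.
        if len(train) + len(members) <= n_train_target:
            train.extend(members)
        else:
            test.extend(members)
    return sorted(train), sorted(test)


# Made-up labels for illustration only.
labels = ["a", "a", "a", "b", "b", "c", "d", "e", "f", "g"]
train_idx, test_idx = group_split(labels, test_fraction=0.3)
```

Because whole groups are assigned at once, the realised split can deviate from the requested fraction when groups are large; the test error then reflects extrapolation to unseen groups rather than interpolation within them.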
Reviewer #3: This work compares the experimental Hydration Free Energy (HFE) in the FreeSolv dataset to a) values calculated using alchemical free energy methods with the thermodynamic integration (TI) estimator and b) an ML model trained on what I assume is the experimental data. The main conclusion is that the error in the prediction from the alchemical estimation increases with increasing molecular weight.
Overall, there has been considerable effort in producing a useful set of calculations on an important area of computational chemistry, and I thank the authors for their efforts. The main points of criticism are that:
1. The implications and reasons for the lack of convergence of the HFE estimates are not adequately explored. This affects the validity of the conclusions drawn.
2. The inclusion of the ML analysis isn’t fully justified.
3. The discussion areas should be a lot more focused on the topic of the paper.
4. There should be more references to relevant work.
There is much here that is interesting and a refocusing of the analysis would be welcome to explore the convergence properties of these calculations.
Introduction
- The introduction is well written and gives an overview of computer aided drug discovery.
- I believe it is too long and not focused enough on the specific area covered by the work and fails to make the case for why this work is necessary.
- There have been numerous works looking at computational predictions of HFE which should be mentioned, e.g. (non-exhaustive list) the SAMPL challenges, https://doi.org/10.1021/acs.jcim.0c00600, and https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c00285.
- There have also been many papers on ML methods for QSAR, see https://paperswithcode.com/sota/molecular-property-prediction-on-freesolv for a ‘leaderboard’ of methods on the FreeSolv database.
- Other forcefields, e.g., the CHARMM small molecule forcefield, the recent Open Force Field Sage 2.0, and Machine Learned forcefields, were also not mentioned.
Methods
- The methods were mostly clearly explained with the following exceptions: The use of the TI estimator was not fully justified given the success of other estimators (e.g., MBAR) and the potential drawbacks of TI. See https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c00285, https://doi.org/10.1063/1.5041835, and https://doi.org/10.1021/acs.jced.7b00104.
- The QSAR variables would be better listed in the SI, rather than given as a reference. While implied, the training data was not explicitly identified as the experimental values. Given there is value and precedent in fitting ML models to ABFE data, this should be explicitly mentioned.
- The normalization by the molecular weight, while not wrong, was not justified, as inclusion of the molecular weight itself should account for the effect of molecular weight on the predictions and regression coefficients.
- The fitting procedure was rigorous but the hyperparameter tuning curve should be given in the SI.
- The metric denoted as ‘relative error’ is not consistent with general use of that term (which would include subtraction by 1); please either subtract 1 from the values or use a different term.
Results
The results are generally well reported and clear.
- The vertical lines in figure 1 were not explained in the caption.
- The discussion of the distribution of prediction errors would benefit from being quantitative (using terms like bias, standard deviation, kurtosis, etc.) rather than qualitative (e.g. ‘TIP3P has the tallest and narrowest error distribution’).
- In line 236 please convert these values into percentages.
- Please clarify what you mean by (line 242) ‘not only do the errors become larger with increasing MW but…’ as the error distribution looks to have an approximate mean of 0 for larger weight. Your note of increasing range and kurtosis looks accurate though.
- It is hard to draw meaningful conclusions from figure 2 due to its format (scatter plot with different colours). Plotting an estimate of the mean and range / standard deviation etc. of the errors vs MW would be more informative. One could use a Gaussian process, a LOWESS smoother, or even categorise the molecular weight into ranges and plot box plots. LOWESS smoothers are available in Seaborn via the regplot function.
- Figure 3 is quite confusing. It would be ideal if you could keep the format of the comparison the same as Figure 2.
Discussion
- I do not believe that the conclusions you draw here are adequately supported by the data. This is because the convergence of the free energy estimates using the FE method drops significantly with increasing molecular weight.
- It’s not clear why MW has been singled out as the factor influencing accuracy given that the number of rotatable bonds must also be very influential.
- I would like to see an analysis of convergence with respect to MW stratified by number of rotatable bonds.
- I would also like some investigation into the reasons for the lack of convergence. It’s not clear what you mean by ‘independent’ replicas in line 324. If they are not different configurations, perhaps perform replicas on some of the least converged molecules with different starting configurations.
- Comparisons between forcefields and water models are valid but only with converged estimates.
- Line 329: sampling multiple short trajectories is only useful if the starting configurations are drawn from the equilibrium configurational distribution (see comment earlier as well).
- In line 362 you say it is not the purpose of the paper to draw definitive conclusions, yet the title of the paper is very definitive.
- The discussion is very wide ranging and could do with being shortened and restricted to the main points of the paper.
- The inclusion of the SVM models in the study was not justified.
**********
6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
Reviewer #3: No
**********
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
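Reviewer #3's suggestion in the Results comments, to categorise molecular weight into ranges and describe the errors quantitatively (bias, standard deviation) in each range, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `summarize_errors_by_mw`, the bin edges, and the sample numbers are assumptions made up for the example and are not data from the manuscript.

```python
from statistics import mean, stdev


def summarize_errors_by_mw(mw, errors, edges):
    """Group signed errors into molecular-weight bins and summarize each bin.

    edges: ascending bin boundaries; a molecule with
    edges[i] <= MW < edges[i + 1] falls in bin i. Molecules outside the
    outer edges are ignored.
    """
    bins = [[] for _ in range(len(edges) - 1)]
    for weight, err in zip(mw, errors):
        for i in range(len(edges) - 1):
            if edges[i] <= weight < edges[i + 1]:
                bins[i].append(err)
                break

    summary = []
    for i, values in enumerate(bins):
        summary.append({
            "range": (edges[i], edges[i + 1]),
            "n": len(values),
            # Bias is the mean signed error (calculated minus experimental).
            "bias": mean(values) if values else None,
            # Sample standard deviation needs at least two points.
            "sd": stdev(values) if len(values) > 1 else None,
        })
    return summary


# Made-up numbers for illustration only (not data from the manuscript).
mw = [58.1, 72.1, 120.2, 150.2, 210.3, 260.4]
errors = [0.2, -0.2, 0.5, -0.7, 1.5, -1.9]
for row in summarize_errors_by_mw(mw, errors, edges=[0, 100, 200, 300]):
    print(row)
```

A per-bin table of this kind separates a shift in the mean error from a widening of its spread, which is the distinction the reviewer raises about line 242.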
|
| Revision 1 |
|
Calculated hydration free energies become less accurate with increases in molecular weight
PONE-D-24-21031R1
Dear Dr. Ivanov,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
Soumendranath Bhakat
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Dear Dr. Ivanov,
After carefully reviewing the reviewers' comments, I am glad to accept your paper PONE-D-24-21031R1 for publication in PLOS ONE. Thanks for your patience during this process.
Best regards,
Soumendranath Bhakat
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed
**********
2. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
**********
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: (No Response)
**********
5. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: The author has satisfactorily addressed all the questions. Therefore, I recommend the manuscript for publication.
Reviewer #2: The author has addressed all comments in my previous review and revised the manuscript appropriately.
**********
7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
********** |
| Formally Accepted |
|
PONE-D-24-21031R1
PLOS ONE
Dear Dr. Ivanov,
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.
At this stage, our production department will prepare your paper for publication. This includes ensuring the following:
* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset
If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.
Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
If we can help with anything else, please email us at customercare@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Soumendranath Bhakat
Academic Editor
PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.