Predicting Meridian in Chinese traditional medicine using machine learning approaches

Yinyin Wang; Mohieddin Jafari; Yun Tang; Jing Tang

doi:10.1371/journal.pcbi.1007249

Peer Review History

Original Submission July 5, 2019
12 Aug 2019 Decision Letter - Weixiong Zhang, Editor, Alexander MacKerell, Editor

Dear Dr Tang,

Thank you very much for submitting your manuscript 'Predicting Meridian in Chinese Traditional Medicine Using Machine Learning Approaches' for review by PLOS Computational Biology. Your manuscript has been fully evaluated by the PLOS Computational Biology editorial team and in this case also by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the manuscript as it currently stands. While your manuscript cannot be accepted in its present form, we are willing to consider a revised version in which the issues raised by the reviewers have been adequately addressed. We cannot, of course, promise publication at that time.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

Your revisions should address the specific points made by each reviewer. Please return the revised version within the next 60 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at ploscompbiol@plos.org. Revised manuscripts received beyond 60 days may require evaluation and peer review similar to that applied to newly submitted manuscripts.

In addition, when you are ready to resubmit, please be prepared to provide the following:

(1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors.
(2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible, to show clearly where changes have been made to their manuscript, e.g. by highlighting text.
(3) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution.

Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology: http://www.ploscompbiol.org/static/checklist.action. Some key points to remember are:

- Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).
- Supporting Information uploaded as separate files, titled Dataset, Figure, Table, Text, Protocol, Audio, or Video.
- Funding information in the 'Financial Disclosure' box in the online system.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see here.

We are sorry that we cannot be more positive about your manuscript at this stage, but if you have any concerns or questions, please do not hesitate to contact us.

Sincerely,

Alexander MacKerell
Associate Editor
PLOS Computational Biology

Weixiong Zhang
Deputy Editor
PLOS Computational Biology

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK]

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: This research presents a very interesting idea: using traditional machine learning algorithms and fingerprints of chemical compounds in herbs to predict herbs' Meridians. The manuscript is well organized and clearly written, and a large amount of computation is performed. However, I do see several major statistical issues.

First, since an herb could belong to multiple Meridians, as clearly stated by the authors, this is a typical multiclass classification problem. However, this is totally ignored by the authors. From the method descriptions, they seem to use a one-vs-rest approach. The authors need to explain why one-vs-rest is a good approach for this multiclass classification problem.

Second, related to the first issue, neural network models could be used for this multiclass classification problem without such a one-vs-rest approach. Have the authors tried any NN models? Did NN models perform poorly in comparison with these traditional algorithms?

Third, the data are apparently quite unbalanced. Machine learning models are generally very sensitive to "unbalancedness" of the training data. The authors did not discuss this at all. Correspondingly, the numbers of positive and negative samples should be added in "Supplementary Table 3.xlsx".

Fourth, for unbalanced data, AUC-ROC (the area under the receiver operating characteristic curve) is a widely used metric; I highly recommend that AUC-ROC numbers be calculated.

Reviewer #2: Wang and colleagues proposed a comprehensive machine learning-based study to predict Meridian in Chinese Traditional Medicine. They integrated multiple types of molecular fingerprints and ADME properties of active compounds in herbs. They then evaluated four different machine learning algorithms by combining different types of fingerprints and ADME properties, which is quite a novel and comprehensive insight. Some machine learning models reveal good accuracy in predicting herb-Meridian associations in cross validation. This is an impressive study which offers powerful machine learning-based approaches for evaluations of Meridian by Chinese Traditional Medicine. The main findings are well presented and the manuscript is well written. Several specific comments may help improve the manuscript further.

1. The reviewer appreciated that the authors collected large-scale herbs with specific ingredients from a database. However, each active ingredient has a different concentration across herbs. The authors use equal weight (concentration) for each ingredient for the calculation of molecular fingerprints and ADME properties to build machine learning models. This limitation has to be well explained or discussed in the revised manuscript.
2. The authors evaluated only accuracy for the machine learning models. Several comprehensive indexes, such as AUC (area under the ROC curve) and precision-recall curves, should be added.
3. The authors integrated both molecular fingerprints and ADME properties for building models. However, the reviewer cannot find how they integrated ADME properties in Figure 1.
4. The authors systematically evaluated four different machine learning algorithms in this study. More details on the parameters of the machine learning models should be provided. For example, which k was used for kNN, which function (kernel or linear) was used for SVM, how many trees were used for random forest, etc. The authors may get better performance if they optimize the tree number in the random forest models.
5. The authors calculated ADME properties using public tools. One popular tool, admetSAR, should be discussed.
6. It is impressive that the authors found that the RF model shows the best performance for large intestine, as several key ADME properties are highly correlated with large intestine. Could the authors evaluate the performance of RF models on large intestine using ADME properties only?
7. Several key refs related to polypharmacy (10.1038/s41467-019-09186-x) and polypharmacology of natural products (doi: 10.1093/bib/bbx045) should be discussed.

Overall, this is an interesting study, which offers powerful computational tools and models for systematic evaluation of Meridian-herb associations, an important, complex biomedical research question in traditional medicine.

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: Yes

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: yuhong wang
Reviewer #2: No

https://doi.org/10.1371/journal.pcbi.1007249.r001
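
Reviewer #1's first, third, and fourth points can be illustrated with a small sketch. It assumes the one-vs-rest setup the reviewer infers from the methods: each meridian becomes its own binary task, the positive and negative counts per task expose any imbalance, and AUC-ROC is computed per task as the probability that a positive sample outranks a negative one. The herb names, meridian assignments, and classifier scores below are invented for illustration and are not taken from the manuscript; the helper names (`one_vs_rest_labels`, `class_counts`, `roc_auc`) are likewise hypothetical.

```python
# Toy, made-up herb -> meridian annotations (NOT data from the manuscript).
herb_meridians = {
    "herb_a": {"Lung", "Liver"},
    "herb_b": {"Liver"},
    "herb_c": {"Lung", "Large intestine"},
    "herb_d": {"Liver", "Large intestine"},
    "herb_e": {"Liver"},
}


def one_vs_rest_labels(annotations):
    """Turn multi-meridian annotations into one binary label vector per meridian."""
    herbs = sorted(annotations)
    meridians = sorted(set().union(*annotations.values()))
    labels = {
        m: [1 if m in annotations[h] else 0 for h in herbs] for m in meridians
    }
    return herbs, labels


def class_counts(labels):
    """Return (positives, negatives) for one binary label vector."""
    positives = sum(labels)
    return positives, len(labels) - positives


def roc_auc(labels, scores):
    """AUC-ROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


herbs, labels = one_vs_rest_labels(herb_meridians)
counts = {m: class_counts(y) for m, y in labels.items()}

# Hypothetical classifier scores for the "Liver" task, in sorted-herb order.
liver_scores = [0.9, 0.4, 0.5, 0.6, 0.7]
liver_auc = roc_auc(labels["Liver"], liver_scores)

for meridian, (n_pos, n_neg) in counts.items():
    print(f"{meridian}: {n_pos} positive, {n_neg} negative")
print(f"Liver AUC-ROC: {liver_auc:.2f}")
```

In this toy set the Liver task has four positives and one negative, so a classifier that always predicts "Liver" would reach 80% accuracy; this is the kind of imbalance that makes a per-task AUC-ROC more informative than accuracy alone.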
Revision 1
26 Sep 2019 Author Response

Attachment. Submitted filename: PLoS_R1_response_letter_final_v2.docx

https://doi.org/10.1371/journal.pcbi.1007249.r002
20 Oct 2019 Decision Letter - Weixiong Zhang, Editor, Alexander MacKerell, Editor

Dear Dr Tang,

We are pleased to inform you that your manuscript 'Predicting Meridian in Chinese Traditional Medicine Using Machine Learning Approaches' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Once you have received these formatting requests, please note that your manuscript will not be scheduled for publication until you have made the required changes.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pcompbiol/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process.

One of the goals of PLOS is to make science accessible to educators and the public. PLOS staff issue occasional press releases and make early versions of PLOS Computational Biology articles available to science writers and journalists. PLOS staff also collaborate with Communication and Public Information Offices and would be happy to work with the relevant people at your institution or funding agency. If your institution or funding agency is interested in promoting your findings, please ask them to coordinate their releases with PLOS (contact ploscompbiol@plos.org).

Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Computational Biology.

Sincerely,

Alexander MacKerell
Associate Editor
PLOS Computational Biology

Weixiong Zhang
Deputy Editor
PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The revision addressed my concerns and suggestions. The accuracy and ROC numbers remain low by common machine learning standards. However, the relationship between the meridians and compounds is expected to be very complex, and these numbers may still be meaningful. Similar problems will likely occur more frequently as people apply machine learning models to more complex biological phenomena. The thinking behind traditional Chinese medicine is distinct from that in Western medicine, but in my opinion they complement each other well. Machine learning, in particular deep learning, which could deal with more complex relationships, could be a powerful method for studying complex biological systems such as herbal formulas.

I have two suggestions which may be helpful for the authors' future research. First, to further test the significance of the model, you could use bootstrap. Basically, randomly assign meridians for the used samples, perform the same procedure, calculate the same accuracy numbers, and then compute confidence intervals. Such confidence numbers could provide more convincing support for these models. Second, as I said above, the relationship between meridians and compound structures is obviously very complex. In my experience, deep learning models, if well constructed, could help even for the sample size of this study. Congratulations on this interesting work.

Reviewer #2: The authors have addressed my concerns.

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: Yes

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

https://doi.org/10.1371/journal.pcbi.1007249.r003
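
Reviewer #1's first suggestion in this round (randomly reassign meridians, rerun the same procedure, and compare accuracies) is sketched below. The procedure as described is usually called a label permutation test rather than a bootstrap, and the sketch follows that reading. Everything here is an illustrative assumption: the one-dimensional features, the leave-one-out 1-nearest-neighbour evaluator, and the helper names (`loo_1nn_accuracy`, `permutation_test`, `percentile_interval`) stand in for whatever model and evaluation protocol one actually uses and are not the authors' method.

```python
import random


def loo_1nn_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on 1-D features."""
    correct = 0
    for i, x in enumerate(features):
        # Index of the nearest *other* sample by absolute distance.
        j = min(
            (k for k in range(len(features)) if k != i),
            key=lambda k: abs(features[k] - x),
        )
        correct += labels[j] == labels[i]
    return correct / len(features)


def permutation_test(features, labels, evaluate, n_permutations=1000, seed=0):
    """Compare the observed score with scores obtained under shuffled labels."""
    rng = random.Random(seed)
    observed = evaluate(features, labels)
    null_scores = []
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # breaks any real feature-label relationship
        null_scores.append(evaluate(features, shuffled))
    # Add-one correction so the estimated p-value is never exactly zero.
    exceed = sum(score >= observed for score in null_scores)
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, null_scores, p_value


def percentile_interval(values, lower=0.025, upper=0.975):
    """Simple percentile interval of the null distribution."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# Toy, well-separated data (hypothetical; one binary meridian task).
features = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

observed, null_scores, p_value = permutation_test(
    features, labels, loo_1nn_accuracy, n_permutations=200, seed=0
)
null_low, null_high = percentile_interval(null_scores)

print(f"observed accuracy: {observed:.2f}")
print(f"95% interval of shuffled-label accuracy: [{null_low:.2f}, {null_high:.2f}]")
print(f"permutation p-value: {p_value:.3f}")
```

An observed accuracy that sits well above the interval of shuffled-label accuracies suggests the model has learned something beyond chance, which is the kind of supporting evidence the reviewer asks for.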
Formally Accepted
6 Nov 2019 Acceptance Letter - Weixiong Zhang, Editor, Alexander MacKerell, Editor

PCOMPBIOL-D-19-01126R1
Predicting Meridian in Chinese Traditional Medicine Using Machine Learning Approaches

Dear Dr Tang,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Matt Lyles
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

https://doi.org/10.1371/journal.pcbi.1007249.r004
