Peer Review History
| Original Submission November 9, 2021 |
|---|
|
PONE-D-21-35704 Multi-label classification of symptom terms from free-text bilingual drug allergy records using natural language processing PLOS ONE Dear Dr. Chaichulee, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by Mar 06 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Junaid Rashid, Ph.D Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 2. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well. 3. Please ensure that you refer to Figure 1 in your text as, if accepted, production will need this reference to link the reader to the figure. 4. We note you have included a table to which you do not refer in the text of your manuscript. 
Please ensure that you refer to Table 3 in your text; if accepted, production will need this reference to link the reader to the Table. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Partly ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: No Reviewer #2: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. 
Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: In this paper, the authors evaluated and compared three types of NLP methods and their variations, Naive Bayes - Support Vector Machine (NB-SVM), Universal Language Model Fine-tuning (ULMFiT), and Bidirectional Encoder Representations from Transformers (BERT), for classification of symptom terms in English and Thai. The paper is well written, the proposed methods are briefly reviewed, and the results are clearly presented. As expected, the comparison showed that the BERT-based methods perform better in general and that the ensemble method performs best among all methods across the evaluation criteria. The main concern is a lack of novelty:
• More justification is needed for methodological novelty, if any. In particular, the two different languages (Thai and English) need to be dealt with. What is new in using the proposed NLP methods to deal with two languages simultaneously?
• More detail is needed on the novel clinical findings, if any.
Reviewer #2: Thank you so much for sharing your manuscript. The authors of "Multi-label classification of symptom terms from free-text bilingual drug allergy records using natural language processing" evaluate the ability of several natural language models to predict symptoms from unstructured texts from electronic health records. I commend the authors on their creation of an allergy-specific BERT-based model and focus on an important clinical construct. However, there are key portions of the manuscript that are not clear, which make it difficult to understand why certain decisions were made. More details are included below, along with a few additional questions. 
I hope that these comments can help to strengthen the manuscript.
General Comments:
- I appreciate the inclusion of Figures 1 and 3 to try and explain the experimental pipeline. However, while I can generally understand what was done for the “Data Preparation” and “Algorithm Development,” the “Evaluation” process is not clear. For example, the data is split into three parts for training, validation, and testing, but I am unclear on what kind of model was used for “training” or what hyperparameters were “validated.” Furthermore, you discuss precision, recall, and accuracy, but I am unclear on how gold-standard labels were obtained. Please clarify this information within the text.
- Right now, you compare the natural language models to each other to identify the one with the best performance. While the metrics obtained appear impressive, it would be helpful if you compared your results to a simple text-matching model. If the simple method performs poorly compared to the models presented already, it supports the need for more complex models.
- You note that the data is imbalanced, yet no correction is implemented. As a result, I am skeptical of the high performance reported in Table 3 and concerned that not correcting for this imbalance may limit the generalizability of the results. Please consider running a sensitivity analysis where you over- or under-sample the training data and report those results alongside the non-adjusted data. In addition, please consider including a confusion matrix so that readers can see which symptoms are typically mistaken for each other (this may also help with some interpretation of what your model is “learning”).
Specific Comments:
Abstract
- The abstract first mentions three different “NLP techniques” (i.e., NB-SVM, ULMFiT, BERT) but then names different models in the following sentence (XLM-RoBERTa, AllergyRoBERTa). While this makes sense after reading the article, it does not make sense without that additional information. 
Please clarify in the abstract that you tested different BERT-like models that include XLM-RoBERTa and AllergyRoBERTa.
Introduction
- Lines 10 – 11: There are several phrases within the Introduction (e.g., “cutaneous manifestations”) that could be considered clinical jargon. Since the audience of this journal is broad, please consider rephrasing some of this language or providing a description that is understandable to non-health audiences.
Methods
- “Data Collection”, Lines 186 – 187: You mention that you obtain data for 18 years, yet only 79,912 records are included. This value feels very small relative to the time window of data collection. It may be helpful to provide additional context to assure the reader that this value is “okay.” Were these records a sample of a broader pool of labels, is the hospital small (and thus does not have that many patients per year), are drug allergy records uncommon, or is there another reason for this?
- “Data Preparation” section: You mention that the free text contains both English and Thai terms, but I am still unclear on how this was accounted for in the model. Were there separate models for each language, or were they combined? Were they evaluated together or separately? Please elaborate.
- “Model development”: While I appreciate the detail in this section, I think that there is actually too much information in the subsections for each model. In an already-complex paper, adding the extreme specifics of how each model works may be lost on readers from general audiences. Please consider moving portions of this section to a Methods supplement so that readers who are interested can read more, but make sure to keep enough in the main paper that it can be understood.
- “Evaluation – Performance metrics”: Consider moving the mathematical descriptions of each of the different metrics to a Methods supplement, as a general audience likely will not understand this and get lost in the details. 
Results
- Lines 430 – 432: This information could go in the “Evaluation” section of the methods instead of the Results.
- Lines 441 – 443: I am confused about why Krippendorff’s alpha was used in this study if you are already assuming that the labels provided by the pharmacists are the gold standard.
Discussion
- “Bilingual representation”: Because there were no results presented specifically for a bilingual model (or, if they were, they were done in a way that was not clear), this section seems more speculative than substantive. Please clarify in the methods how the bilingual component of the models was assessed, then describe this outcome in the Results. That way, this portion of the discussion will make more sense.
**********
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. 
If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 1 |
|
PONE-D-21-35704R1 Multi-label classification of symptom terms from free-text bilingual adverse drug reaction reports using natural language processing PLOS ONE Dear Dr. Chaichulee, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by May 19 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Junaid Rashid, Ph.D Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #2: All comments have been addressed Reviewer #3: (No Response) ********** 2. 
Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #2: Yes Reviewer #3: Partly ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #2: Yes Reviewer #3: N/A ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #2: No Reviewer #3: No ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #2: Yes Reviewer #3: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #2: (No Response)
Reviewer #3: In this work, the authors evaluated different NLP algorithms that can encode unstructured ADRs stored in EHRs into institutional symptom terms. The authors made efforts to revise the first draft of the manuscript by addressing the previous reviewers' comments. However, I have a few concerns about this paper and, in my opinion, the manuscript is not in a state to be published. See my comments below.
1) The approach combines existing, well-known techniques to solve an existing problem. The key technical contribution of the proposed study is not clear; there is no novelty.
2) Many recent studies have already been published that use the same idea, some with more sophisticated ways of learning text representations. These recent studies are missing from the literature review.
3) The authors should refer to the state-of-the-art methods in biomedical NLP (bioNLP), e.g. BioBERT and the current SOTA BioALBERT. Many studies show that biomedical (domain-specific) language models work better than language models trained on a general corpus (such as Wikipedia). The authors should compare their results with these domain-specific methods and discuss them:
- BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
- Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT
- BioBERT: a pre-trained biomedical language representation model for biomedical text mining
4) The case for the paper is weak. The authors do provide a review of the relevant works; however, these works are discussed flatly, without properly highlighting their weaknesses and establishing the research gaps.
5) Some experimental methods need more explanation.
6) Finally, various typographical and grammatical errors must be rectified.
I would recommend that the authors look through more recent publications on this problem. 
Establishing novelty of approach over other published work would benefit their work, as well as the manuscript. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #2: No Reviewer #3: No |
| Revision 2 |
|
Multi-label classification of symptom terms from free-text bilingual adverse drug reaction reports using natural language processing PONE-D-21-35704R2 Dear Dr. Chaichulee, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Junaid Rashid, Ph.D Academic Editor PLOS ONE Additional Editor Comments : I believe the authors have addressed all reviewer comments, and the manuscript can be accepted. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #3: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #3: Partly ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #3: N/A ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #3: (No Response) ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #3: (No Response) ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #3: The authors have addressed most of my comments. I have no further suggestions; therefore, I recommend accepting this paper for publication. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #3: No ********** |
| Formally Accepted |
|
PONE-D-21-35704R2 Multi-label classification of symptom terms from free-text bilingual adverse drug reaction reports using natural language processing Dear Dr. Chaichulee: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Junaid Rashid Academic Editor PLOS ONE |