Peer Review History
| Original Submission August 3, 2024 |
|---|
|
PONE-D-24-32612
Larger models mean yield results? Streamlined Severity Classification of ADHD-Related Concerns Using BERT-Based Knowledge Distillation
PLOS ONE
Dear Dr. Karim,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Nov 23 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Weiqiang (Albert) Jin, Ph.D.
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.
3. Please update your submission to use the PLOS LaTeX template. The template and more information on our requirements for LaTeX submissions can be found at http://journals.plos.org/plosone/s/latex.
4. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance.
We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process.
5. Please ensure that you refer to Figures 1 and 2 in your text as, if accepted, production will need this reference to link the reader to the figure.
6. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Tables 1 and 7 in your text; if accepted, production will need this reference to link the reader to the Table.
7. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.
8. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
Additional Editor Comments: minor revision
Based on the reviewers' feedback, we recommend you need minor revisions. The authors should focus on the following key areas: clearly justify the choice of the BERT-based model over others like GPT or Llama, and provide a comparative analysis of fine-tuning versus knowledge distillation. Address the need for recent references, improve formula numbering, clarify LastBERT's innovation, and enhance data interpretability in tables. Additionally, simplify the introduction and conclusion, improve image clarity, and consider a detailed discussion on limitations.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Partly
Reviewer #2: Partly
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes
**********
3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository.
For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: This study investigates the creation of a lightweight yet powerful BERT-based model through knowledge distillation techniques for NLP tasks, specifically classifying the severity of ADHD (Attention Deficit Hyperactivity Disorder) related problems from social media text data. This research work has certain value and significance, and there are related questions in this article as follows:
1. There are no recent studies and comparative models from 2022 to 2024 in the references. Why?
2. The formulas in the text are not numbered, such as (1), (2), etc.
3. There are two periods at the end of the “Introduction” section.
4. Are there any relevant references on the hyperparameter settings of LastBERT?
5. In Table 7, the number of parameters of LastBERT is larger than that of MobileBERT. Why is the Matthews correlation coefficient on the CoLA dataset and the Spearman coefficient on the STS-B dataset so different from those of MobileBERT?
In addition, the comparison model in Table 7 lacks some data support, and the comparison model needs to be expanded to increase the data interpretability of LastBERT.
6. In Table 6, Study 1-6 lacks annotations and the method names should be indicated. In addition, there are inconsistencies in the Dataset part of Table 6. Is the Accuracy indicator still meaningful for reference?
7. Can you clarify the innovation of the model? It is not easy to understand from the text and model framework.
There are some related suggestions:
1. It is recommended to check the clarity of all images in the text, especially the small images such as Figures 7, 8, and 9. It is also recommended to use a magnifying glass frame to display the important areas of some images, such as Figure 4.
2. The conclusion section is redundant. It is recommended to simplify the content and add a discussion section to fully analyze the shortcomings of the model and areas for improvement.
3. It is recommended that the URLs appearing in the references be placed in the footnotes of the corresponding pages of the text.
4. The introduction of the dataset in Section 3 is too much, so it is recommended to simplify it.
5. For the contribution part in the “Introduction” section, it is recommended not to emphasize the use of free computing resources, because many related researchers also complete their experiments and research work based on the free computing resource platform. It is recommended to mention it in the experimental environment setup part in Section 3.
6. The last part of the Introduction section should summarize the main contents of the remaining sections.
7. Are the data in Table 5 redundant? The values of the macro average and weighted average are the same, so it is recommended to keep only one of them.
Reviewer #2: This paper makes several contributions to the NLP and mental health diagnostics field.
First, it demonstrates the effectiveness of knowledge distillation in creating a significantly smaller BERT-based model, LastBERT, which reduces parameters without compromising performance. Second, the model shows strong generalization, achieving high performance on the GLUE benchmark across various tasks. Third, it offers practical utility by applying the model to ADHD-related social media data, where it gained a commendable 85% across multiple evaluation metrics. This paper provides a valuable tool for mental health professionals, highlighting its potential in resource-constrained environments.
The following are two suggestions to improve the robustness of this paper.
First, this paper should provide a more precise justification for choosing the BERT-based model, particularly concerning the study's specific objectives. It is essential to articulate the strengths of BERT compared to other large language models, such as GPT and Llama. This comparison could enhance the reader’s understanding of why BERT is more suitable for the tasks addressed in the research. Including empirical evidence or relevant literature highlighting these advantages would strengthen the argument. Overall, a more detailed discussion of these aspects is essential for a comprehensive evaluation of the model selection in this paper.
Second, this paper should discuss the merits and limitations of employing a student model for knowledge distillation, particularly in light of existing research demonstrating promising performance from fine-tuning current BERT-based models with small datasets in various NLP tasks, as evidenced by the articles listed below.
- Lin, J., Nogueira, R., & Yates, A. (2020). Pretrained Transformers for Text Ranking: BERT and Beyond. arXiv preprint arXiv:2010.06467.
- Kim, J., Kim, J., Lee, A., & Kim, J. (2023). Bat4RCT: A suite of benchmark data and baseline methods for text classification of randomized controlled trials. PLOS ONE, 18(3), e0283342.
- Kim, J., Kim, J., Lee, A., Kim, J., & Diesner, J. (2024). LERCause: Deep learning approaches for causal sentence identification from nuclear safety reports. PLOS ONE, 19(8), e0308155.
This context is crucial, as it highlights that substantial and efficient modeling can be achieved by fine-tuning the small dataset without the complexity of creating a student model. A comparative analysis between the efficiency of fine-tuning existing models with small datasets and the potential benefits of using a distilled model would provide valuable insights. Additionally, the discussion should address scenarios where the student model might offer advantages and any trade-offs involved in this approach. In conclusion, more thoroughly exploring these aspects will enhance the paper's contribution to the field.
**********
6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
**********
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free.
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 1 |
|
Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation
PONE-D-24-32612R1
Dear Dr. Ahmed Akib and Jawad Karim,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
Weiqiang (Albert) Jin, Ph.D.
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Based on the reviewers' comments, I am pleased that this article version can be accepted as it is now. Because the professional reviewers say that you answered all my questions in detail and thoroughly, and they have no further questions. So, Congratulations!
Reviewers' comments:
Reviewer #1: The author answered all my questions in detail and thoroughly, and I have no further questions. This article can be considered for acceptance.
Reviewer #2: The authors have satisfactorily addressed the comments I raised during the previous round of review. In response to my comment, the authors have added a new subsection titled “Rationale for Model Selection” within the Methodology section and the Related Works section. This addition includes a comparison with other LLMs, such as GPT and LLaMA, supported by relevant literature. Additionally, the authors have incorporated a comparative analysis in the Discussion section, which highlights the resource trade-offs and specific use-case scenarios where fine-tuning and distillation-based approaches excel. This addition effectively clarifies the merits and limitations of both methods, enhancing the overall clarity and depth of the study. |
| Formally Accepted |
|
PONE-D-24-32612R1
PLOS ONE
Dear Dr. Karim,
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.
At this stage, our production department will prepare your paper for publication. This includes ensuring the following:
* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission,
* There are no issues that prevent the paper from being properly typeset
If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.
Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
If we can help with anything else, please email us at customercare@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Weiqiang (Albert) Jin
Academic Editor
PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.