Peer Review History
| Original Submission January 5, 2025 |
|---|
|
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Apr 12 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Lucija Gosak
Academic Editor
PLOS ONE
Journal Requirements:
1. When submitting your revision, we need you to address these additional requirements. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
2. Thank you for uploading your study's underlying data set. Unfortunately, the repository you have noted in your Data Availability statement does not qualify as an acceptable data repository according to PLOS's standards. At this time, please upload the minimal data set necessary to replicate your study's findings to a stable, public repository (such as figshare or Dryad) and provide us with the relevant URLs, DOIs, or accession numbers that may be used to access these data.
For a list of recommended repositories and additional information on PLOS standards for data deposition, please see https://journals.plos.org/plosone/s/recommended-repositories.
3. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: No
Reviewer #2: Yes
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: No
Reviewer #2: I Don't Know
**********
3. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: No
Reviewer #2: Yes
**********
Reviewer #1: The study provides an assessment of the utility of chatbots in psychiatric clinical practice. The research evaluates 27 chatbots using 160 multiple-choice questions derived from the Taiwan Psychiatry Licensing Examination. Employing the Rasch model for statistical analysis, the study offers quantitative insights into the ability of chatbots to handle psychiatric clinical knowledge and reasoning. The standout model, ChatGPT-o1-preview, displayed exceptional capabilities in diagnostic and treatment reasoning and psychopharmacology, with a Joint Maximum Likelihood Estimation (JMLE) ability score of 2.23. Overall, the paper needs substantial improvements and changes, as explained below.
1. The study's findings are narrowly applicable, focusing solely on chatbots tested in traditional Mandarin. This language limitation reduces the relevance and applicability of the results for global audiences, particularly in non-Mandarin-speaking regions.
2. The study evaluates chatbot performance using standardized multiple-choice questions (MCQs), which do not adequately capture the complexities and nuances of real-world psychiatric clinical practice. There is no evidence of chatbot evaluation in live clinical settings or through practical scenarios.
3. While the study evaluates 27 chatbots, this is a relatively small sample size considering the rapidly evolving landscape of AI-driven language models. The lack of diversity among evaluated chatbots raises questions about the robustness and generalizability of the findings.
4. The study highlights significant limitations in factual recall and reasoning biases among chatbots but does not propose robust methods for mitigating these issues. This undermines the reliability of the conclusions and fails to address critical concerns about safety in clinical settings.
5. While the Rasch model is a well-established psychometric tool, its application here simplifies the complexity of clinical knowledge and reasoning. It may not be the most suitable approach to fully assess chatbots' ability to handle nuanced psychiatric scenarios.
6. The exclusion of questions related to Taiwan-specific laws and policies limits the comprehensive evaluation of the chatbots' knowledge. This omission raises concerns about the completeness and depth of the study.
7. The study primarily reaffirms well-known strengths and limitations of chatbots (e.g., strong reasoning, poor factual recall, and biases). It does not provide any groundbreaking insights or propose innovative solutions to address the identified challenges.
8. The study lacks a detailed discussion of ethical implications and safety concerns regarding the deployment of chatbots in psychiatric settings. Furthermore, the manuscript does not explore the impact of potential errors on patient care or how clinicians can effectively mitigate these risks.
9. While the study identifies the potential utility of chatbots in psychiatry, it does not validate these findings through experimental or observational studies in clinical environments. This limits the practical significance of the conclusions.
10. The article's findings largely replicate conclusions from prior studies on chatbot performance in medical fields, offering little novelty or value to the existing body of knowledge.
Reviewer #2: The article is well written. How was the level of difficulty of the chosen questions from the Taiwan Psychiatry Licensing Examinations assessed to test the range of analytic ability of the chatbots? What are the limitations of this study?
**********
If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
**********
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 1 |
|
Dear Dr. Liu,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
The manuscript has been evaluated by two reviewers, and their comments are available below. Reviewer 3 has requested minor modifications in various sections of the manuscript. Could you please carefully revise the manuscript to address the comments raised?
Please submit your revised manuscript by Sep 07 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Zahra Al-Khateeb, Ph.D
Staff Editor
PLOS ONE
Journal Requirements:
If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.
Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
Reviewer #3: All comments have been addressed
Reviewer #4: All comments have been addressed
**********
2. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #3: Yes
Reviewer #4: Yes
**********
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #3: Yes
Reviewer #4: Yes
**********
4. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #3: Yes
Reviewer #4: Yes
**********
5. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #3: Yes
Reviewer #4: Yes
**********
Reviewer #3: This is a timely and methodologically solid study that evaluates the performance of 27 state-of-the-art chatbots using a large, standardized set of psychiatry board exam questions. The use of Rasch analysis to assess chatbot performance represents a methodological advancement over prior surface-level comparisons, and the dual quantitative-qualitative approach offers valuable insights into chatbot reasoning. The manuscript is clearly written, well-organized, and contributes meaningfully to the literature on AI in clinical education and decision support.
Minor Comments:
- Zero-shot prompting is mentioned, but the paper would benefit from a more detailed discussion of prompt design and its implications (e.g., potential effects of different phrasings or batching).
- The qualitative evaluation (Tables 4 and 5) is excellent but could benefit from a more structured rubric or scoring system to increase reproducibility of the “reasoning strength” claims.
- While the authors briefly mention psychiatrist performance in prior studies, adding a comparator (even if drawn from past literature) would help contextualize chatbot scores.
- The term "hallucination" vs. "confabulation" is correctly problematized, but the terminology could be clarified further for a general audience unfamiliar with LLM behavior nomenclature.
- The manuscript acknowledges that no human data were used. However, a deeper reflection on the clinical risks of AI hallucinations in psychiatry, especially in vulnerable populations, would enhance the impact.
- Consider tightening the abstract to make the results and implications more immediately clear.
- Ensure consistent formatting of acronyms (e.g., DSM-5-TR is occasionally formatted inconsistently).
Reviewer #4: A well-written article that discusses the main strengths and limitations of current chatbot use in psychiatry. The future directions are well identified and articulated. Thank you for outlining these thoughtful mitigation strategies. I particularly agree with the importance of prompt engineering: designing structured, context-rich queries significantly improves chatbot performance. This approach helps guide the model’s reasoning process, minimizes ambiguity, and leads to more clinically relevant responses. It's a practical and effective way to align chatbot outputs with expert expectations, especially in sensitive domains like healthcare. Thank you for this important work.
**********
If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #3: Yes: Luca Cima
Reviewer #4: Yes: Bernice G. Gulek, PhD, ACNP
**********
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org |
| Revision 2 |
|
Evaluating Chatbots in Psychiatry: Rasch-Based Insights into Clinical Knowledge and Reasoning
PONE-D-25-00276R2
Dear Dr. Liu,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
George Vousden
Staff Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments: |
| Formally Accepted |
|
PONE-D-25-00276R2
PLOS ONE
Dear Dr. Liu,
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.
At this stage, our production department will prepare your paper for publication. This includes ensuring the following:
* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset
You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.
Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.
If we can help with anything else, please email us at customercare@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. George Vousden
Staff Editor
PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.