Peer Review History
Original Submission: June 5, 2023
Transfer Alert
This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.
PONE-D-23-14620
ChatGPT versus human in generating medical graduate exam questions – An international prospective study
PLOS ONE

Dear Dr. CO,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 31 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Jie Wang, Ph.D.
Academic Editor
PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

2. Thank you for including your ethics statement: "N/A".

a. For studies reporting research involving human participants, PLOS ONE requires authors to confirm that this specific study was reviewed and approved by an institutional review board (ethics committee) before the study began. Please provide the specific name of the ethics committee/IRB that approved your study, or explain why you did not seek approval in this case. Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).
For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.

b. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information. If you are reporting a retrospective study of medical records or archived samples, please ensure that you have discussed whether all data were fully anonymized before you accessed them and/or whether the IRB or ethics committee waived the requirement for informed consent. If patients provided informed written consent to have data from their medical records used in research, please include this information.

Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”). For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.

3. Thank you for stating the following financial disclosure: “No”

At this time, please address the following queries:

a) Please clarify the sources of funding (financial or material support) for your study. List the grants or organizations that supported your study, including funding received from your institution.

b) State what role the funders took in the study.
If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

c) If any authors received a salary from any of your funders, please state which authors and which funders.

d) If you did not receive any funding for this study, please state: “The authors received no specific funding for this work.”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

4. Thank you for stating the following in your Competing Interests section: “No”

Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now. This information should be included in your cover letter; we will change the online submission form on your behalf.

5. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

6. We note that Figure 1 in your submission contains copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution.
For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright. We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure 1 to publish the content specifically under the CC BY 4.0 license. We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission. In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information.
If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

7. Please include your tables as part of your main manuscript and remove the individual files. Please note that supplementary tables should be uploaded as separate "supporting information" files.

Additional Editor Comments:

The authors should pay careful attention to each of the comments below and address the issues raised by the reviewers. Since the scale of the current study is not sufficient to support a generic conclusion, the authors need to be more cautious when interpreting the results.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes
Reviewer #4: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: I Don't Know
Reviewer #4: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository.
For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: No
Reviewer #3: Yes
Reviewer #4: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes
Reviewer #4: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This is an interesting study. The actual quality of an MCQ can be assessed by item analysis after testing it on students; the authors instead used assessment by experts. The study is interesting but needs some changes. In the title, specify the exam of which country, since different countries have different levels of difficulty; "An international prospective study" is not needed. The background in the abstract could give more detail about why this study was conducted. Avoid old references. Figure 3 is hazy.

Reviewer #2: My main criticism concerns the method used to assess the quality of the questions. For me, the only scientifically valid way to evaluate the quality of the questions is to use them in real life, in official exams for example. Only the statistical analysis of students' answers makes it possible to evaluate the quality of a question. This is especially true for distractors.
Making this criticism in the discussion is necessary in my opinion.

Reviewer #3: Many thanks to the authors for tackling this hot topic in medical education. I enjoyed reading your manuscript; however, the following points should be addressed to improve it:

1. The title “ChatGPT versus human in generating medical graduate exam questions” indicates a more general scope, while the study covered only the MCQ type of question, so either include other types of questions or change the title to indicate MCQs.

2. Methods: how did you make sure that the two faculty staff who generated the 50 questions were not using AI to generate questions? Clarify this in the Methods section.

3. Some results are mentioned in the Discussion section: “When the questions generated by ChatGPT were reviewed, we could also observe that they were compatible with the guidance from the Division of Education, American College of Surgeons, with minimal negative features, including a minimal use of negative stems (only 14% (7/50), compared to 12% (6/50) by human examiners) with a lack of ‘except’, ‘All/none of the above’”. This should be mentioned in the Results section and discussed in the Discussion section.

4. Regarding assessor evaluation of question quality, it would be better to assess the Bloom levels of the questions and add this as another parameter, to ensure that the questions are not only assessing the lower levels of Bloom (remembering and understanding), especially for questions generated by AI.

5. Table 3 needs more clarification (how did you calculate the percentage in each column?).

6. Table 4: the percentage should be calculated as the number of correctly guessed AI-written questions out of the 50 questions written by AI, and the same for correct guesses of human-written questions (e.g. for assessor B it should be 14/50, not 14/23).

7. Attach a STROBE checklist for the article and address any missing items in the manuscript.

8. Check the instructions for authors for both in-text citations and the reference section.
Reviewer #4: The authors made a comparative study between ChatGPT and human experts in generating medical-domain MCQs from textbooks. The study offers some insight into the performance of ChatGPT in generating medical MCQs. However, I don't find any technical contribution in this article. What is the contribution of the authors? They only made some comparison between the MCQs generated by ChatGPT and human experts using certain evaluation criteria.

Additionally, the article needs some improvement for common readers. There is no explanation of how the questions are generated by ChatGPT. I feel a large number of readers do not have a specific idea of how to generate MCQs from a textbook (as input) using ChatGPT, so there should be some discussion of that. What is the list of human inputs (specific information, selection of specific portions, any tuning, parameter settings) or the amount of experience required to generate meaningful questions using ChatGPT? For instance, the authors mentioned that the length of the text was limited by ChatGPT at 510 tokens. What are the other such inputs?

The authors claimed that "There was no significant difference in question quality between questions drafted by AI versus human". This is actually the key highlight of the paper. However, it is not completely supported by evidence. When I study the values in Table 3, I find that the human experts are much better than ChatGPT. For instance, appropriateness of the question: 18 (AI) vs 27 (human); clarity and specificity: 18 (AI) vs 26 (human); distractor quality: 21 (AI) vs 26 (human). The gap is significant. How, then, is the claim justified? Also, the number of questions is too small to make such a generic comment on ChatGPT vs human. The study should consider a much larger number of questions, more subjects as input, and more human experts. More MCQ evaluation metrics need to be considered for a complete evaluation of the MCQs from all aspects.
The evaluation of the quality of MCQs is a tricky task, and various metrics have been used in the literature over the past two decades. For example, well-formedness, sentence length, sentence simplicity, difficulty level, answerability, sufficient context, relevance to the domain, over-informativeness, under-informativeness, grammaticality, item analysis, etc. are some metrics I find in the literature.

Finally, the human evaluators give a score against each evaluation metric on a scale of 1-10. What were the specific guidelines given to the evaluators for the scoring? The name of an evaluation metric alone may carry certain ambiguity and different meanings for different experts, so there should be proper guidelines for the evaluators. Please mention those.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: Yes: Pr. Olivier Palombi, Grenoble Université Alpes
Reviewer #3: Yes: Nazdar Ezzaddin Alkhateeb
Reviewer #4: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free.
Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
Revision 1
ChatGPT versus human in generating medical graduate exam multiple choice questions – A multinational prospective study (Hong Kong SAR, Singapore, Ireland, and the United Kingdom)

PONE-D-23-14620R1

Dear Dr. CO,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.

To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Jie Wang, Ph.D.
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1.
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed
Reviewer #4: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #4: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #4: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: No
Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The study addresses an important and relevant topic concerning the potential use of AI language models for generating MCQs in medical education. The research purpose is well-defined, and the study's objectives are clear. The study adopts a prospective design and employs a systematic approach by comparing ChatGPT-generated MCQs with those developed by human examiners. The use of standardized assessment scores and independent international assessors enhances the reliability and validity of the findings. The article highlights the significant time advantage of ChatGPT in generating MCQs, with a total time of 20 minutes and 25 seconds to create 50 questions, compared to 211 minutes and 33 seconds taken by human examiners for the same task. This efficiency demonstrates the potential practicality of using AI in question generation. The study reveals that ChatGPT-generated MCQs show comparable quality to those created by human examiners across most domains, including appropriateness, clarity, discriminative power, and suitability for medical graduate exams. This suggests that ChatGPT is capable of producing high-quality MCQs in these areas.

Reviewer #2: (No Response)

Reviewer #4: Actually, I feel some more experiments are required to validate the claim. But it seems the authors feel their work is sufficient. Anyway...

**********

7.
PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr. Himel Mondal
Reviewer #2: Yes: Full Professor Olivier Palombi, Université Grenoble Alpes
Reviewer #4: No

**********
Formally Accepted
PONE-D-23-14620R1

ChatGPT versus human in generating medical graduate exam multiple choice questions – A multinational prospective study (Hong Kong S.A.R., Singapore, Ireland, and the United Kingdom)

Dear Dr. CO:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of Dr. Jie Wang
Academic Editor
PLOS ONE
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.