Peer Review History

Original Submission: September 28, 2025
Decision Letter - Ramandeep Kaur, Editor

Dear Dr. Yan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

The manuscript presents an innovative attempt to integrate AI-generated content (AIGC) into multimodal instructional design, focusing on noun retention. The topic aligns with PLOS ONE’s criteria for methodological soundness and interdisciplinary relevance. However, the current version requires substantial revision before it can be considered for acceptance.

The required changes include major clarification of the study design, particularly in distinguishing between AIGC-generated visual material and general multimodal learning effects. As currently structured, the experiment tests the difference between unimodal and multimodal conditions rather than isolating the contribution of AI-generated imagery. The conclusions must therefore be rewritten to accurately represent the study’s scope and findings. Additionally, participant demographic details, adherence to the journal’s data availability policy, and correction of minor typographical and referencing issues are mandatory for acceptance.

Recommended changes include enhancing the theoretical framework with a balanced discussion of both the potential and limitations of AIGC, refining the introduction to better justify the research rationale, and improving readability through concise, focused language.

The reviewer’s observations are consistent and valid, with no major conflicts in interpretation. Addressing these core methodological and reporting concerns will ensure the paper meets PLOS ONE’s standards for technical rigor, transparency, and interpretive accuracy. Therefore, the decision at this stage is Major Revision.

==============================

Please submit your revised manuscript by Dec 14 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ramandeep Kaur

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

If you are reporting a retrospective study of medical records or archived samples, please ensure that you have discussed whether all data were fully anonymized before you accessed them and/or whether the IRB or ethics committee waived the requirement for informed consent. If patients provided informed written consent to have data from their medical records used in research, please include this information.

3. Please note that PLOS One has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

4. We note that your Data Availability Statement is currently as follows: All relevant data are within the manuscript and its Supporting Information files.

Please confirm at this time whether or not your submission contains all raw data required to replicate the results of your study. Authors must share the “minimal data set” for their submission. PLOS defines the minimal data set to consist of the data required to replicate all study findings reported in the article, as well as related metadata and methods (https://journals.plos.org/plosone/s/data-availability#loc-minimal-data-set-definition).

For example, authors should submit the following data:

- The values behind the means, standard deviations and other measures reported;

- The values used to build graphs;

- The points extracted from images for analysis.

Authors do not need to submit their entire data set if only a portion of the data was used in the reported study.

If your submission does not contain these data, please either upload them as Supporting Information files or deposit them to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories.

If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If data are owned by a third party, please indicate how others may request data access.

5. We note that Figure 1 includes an image of a participant in the study.

As per the PLOS ONE policy (http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research) on papers that include identifying, or potentially identifying, information, the individual(s) or parent(s)/guardian(s) must be informed of the terms of the PLOS open-access (CC-BY) license and provide specific permission for publication of these details under the terms of this license. Please download the Consent Form for Publication in a PLOS Journal (http://journals.plos.org/plosone/s/file?id=8ce6/plos-consent-form-english.pdf). The signed consent form should not be submitted with the manuscript, but should be securely filed in the individual's case notes. Please amend the methods section and ethics statement of the manuscript to explicitly state that the patient/participant has provided consent for publication: “The individual in this manuscript has given written informed consent (as outlined in PLOS consent form) to publish these case details”.

If you are unable to obtain consent from the subject of the photograph, you will need to remove the figure and any other textual identifying information or case descriptions for this individual.

6. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Additional Editor Comments:

The study offers an engaging exploration of how AI-generated visual materials may influence noun retention in multimodal learning contexts. The topic is timely and relevant, especially with the increasing integration of artificial intelligence in education. However, several substantive and structural concerns need to be addressed before the paper can progress further.

Conceptual Clarity and Design – The comparison between text-only and multimodal (text + image) conditions does not specifically evaluate the impact of AIGC. To support claims about AIGC-enhanced learning, one of the conditions should employ traditional (non-AI) imagery. Otherwise, the conclusions should be reframed to emphasize multimodal rather than AIGC-based effects.

Introduction and Theoretical Framing – Expand the introduction to define AIGC at first mention and include both supportive and critical perspectives. Discuss known limitations of AI-generated images, such as possible inaccuracy or bias, to present a balanced view.

Methodology Details –

Provide clear participant demographics (age, gender, background, proficiency level).

Justify the chosen tasks and materials, particularly the “image-to-word matching” task, which may not be suitable for a text-only control group.

Ensure all experimental comparisons are methodologically equivalent.

Data Availability – Ensure full compliance with PLOS ONE’s data policy by sharing anonymized datasets or including a valid reason for restrictions.

Writing and Structure –

Revise the abstract to expand “AIGC” and highlight the study’s key findings succinctly.

Correct minor typographical errors and reference inconsistencies (e.g., “approximately” on p.2; missing reference on p.5).

The final paragraph of the introduction should emphasize study rationale and objectives, not manuscript structure.

Conclusion Revision – Rephrase the conclusion to accurately reflect that multimodal learning was superior to unimodal learning in this study, rather than attributing the improvement solely to AIGC.

With these major revisions, the manuscript could make a meaningful contribution to the field of AI-supported education and multimodal learning research.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

**********

Reviewer #1: Reviewer Comments

1. The abstract introduces “AIGC” but does not expand the abbreviation at its first mention.

2. There is a typographical error for “approximately” on page 2.

3. The last paragraph on page 4 should be revised: Instead of outlining the structure of the paper, focus on providing engaging background, rationale, and objectives. A section-by-section preview is less appropriate for the journal's format.

4. On page 5, the statement regarding “traditional multimodal materials” requires a proper reference.

5. The introduction lacks a non-biased review. It should address concerns that AIGC can produce incorrect or distorted images, not just cite studies that support AIGC. A balanced review including potential limitations is needed.

6. There are methodological flaws: The comparison is between unimodal (text-only) and multimodal (text + picture) learning rather than AIGC-based versus traditional learning. Traditional learning typically uses both text and images. For a valid comparison, one group should learn with traditional images and text, and another with AI-generated images and text—then compare outcomes.

7. Please provide demographic details of all study participants.

8. The conclusion that “AIGC-enhanced instruction led to more effective noun learning across multiple aspects” is misleading. The study compares unimodal and multimodal learning, not specifically the impact of AI-generated images. The conclusion should reflect that multimodal learning outperformed text-based learning, regardless of image source.

9. The “image-to-word matching” task (gap of 0.85 vs. 0.45) appears inappropriate because Group 1 only received text (without images), thus undermining the comparison.

**********

If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Jithin Balan

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

To ensure your figures meet our technical requirements, please review our figure guidelines: https://journals.plos.org/plosone/s/figures

You may also use PLOS’s free figure tool, NAAS, to help you prepare publication quality figures: https://journals.plos.org/plosone/s/figures#loc-tools-for-figure-preparation.

NAAS will assess whether your figures meet our technical requirements by comparing each figure against our figure specifications.

Revision 1

PONE-D-25-52191

Response to Reviewers

Multimodal Instruction with AI-Generated Images for Noun Retention: Exploring Semantic Scene and Materiality Effects

We thank the Academic Editor, Dr. Ramandeep Kaur, and Reviewer Mr. Jithin Balan for their constructive and detailed comments. We have addressed every point below; all changes are highlighted in the revised manuscript with Track Changes.

1. Abstract: “AIGC” not expanded at first mention

RESPONSE: Expanded at first occurrence as “artificial intelligence (AI)-generated visual content” (line 15).

2. Typographical error “approximately” on p. 2

RESPONSE: Corrected to “approximately” (line 40).

3. Last paragraph on p. 4 (old ms) outlines paper structure

RESPONSE: Rewritten to provide rationale and objectives only; section preview removed (lines 106–113).

4. Missing reference for “traditional multimodal materials” (p. 5)

RESPONSE: Added Mayer, R. E. (2009) as a foundational citation (lines 133, 518-519).

5. Introduction lacks balanced review of AI-image limitations

RESPONSE: Added new subsection “Potential Limitations of AI-Generated Visuals” (lines 75–92) citing:

Indeed, the educational promise of these multimodal tools is accompanied by well-documented technical limitations.

Perceptual inconsistency: AI models frequently produce distorted objects or illogical spatial relations that may increase extraneous cognitive load [11].

Socio-cultural bias: generated images tend to over-represent specific cultural archetypes, reinforcing stereotypes rather than fostering cross-cultural understanding [12].

Output variability: fluctuations in colour fidelity and detail introduce unintended perceptual noise, while the detectability of synthetic images changes over their online lifespan, complicating responsible use [13].

All references are already present in the provided reference list.

6. Methodological flaw: only unimodal vs multimodal

RESPONSE:

Design clarified: two-group (text-only vs. text + AI-images) (lines 184–185).

Conclusions rewritten throughout to attribute effect to “multimodal presentation” rather than “AI-generated nature” (see Abstract, Discussion, Conclusion).

7. Participant demographics missing

RESPONSE: The Participants subsection (lines 166–179) reports:

N = 40 university students; native Chinese speakers; ≥ 6 years formal English.

All participants majored in either big data & accounting (n = 20) or engineering cost management (n = 20).

No colour-vision or visual impairments.

Ethics approval No. CJ-202412003 from the College of Urban Construction, Anhui Vocational College of Defense Technology; written informed consent obtained.

8. Misleading conclusion about “AIGC-enhanced” superiority

RESPONSE: We fully accept the reviewer's critique. The original manuscript did indeed obscure the independent variable under investigation. Our experimental design was specifically a comparison between two modal conditions: "text-only" and "text-plus-image," not a comparison of image sources.

The major revisions we have implemented include:

Conceptual Clarification: Throughout the manuscript, particularly in the Introduction and Discussion, we have clarified that this study aims to investigate the effectiveness of "a multimodal pedagogical approach implemented using AIGC tools," rather than testing AIGC-generated imagery in isolation.

Terminology Correction: We have systematically revised terms such as "AIGC-enhanced instruction/approach/group" to more accurate descriptors like "the multimodal condition/instruction/group (employing AI-generated images)" or "the text-with-image condition."

Conclusion Rewriting: We have thoroughly rewritten the Conclusion. The key finding is now stated as: "Compared with traditional text-based instruction, the multimodal presentation led to better recall..." This explicitly attributes the advantage to the multimodal presentation itself, not to AIGC per se. (Please see the revised conclusion section.)

Title Adjustment: To maximize precision, we propose a slight adjustment to the main title: "Multimodal Instruction with AI-Generated Images for Noun Retention: Exploring Semantic Scene and Materiality Effects."

9. Image-to-word matching task inappropriate for text-only group

RESPONSE: We agree with the reviewer's concern and have revised the manuscript to address it. The “image-to-word matching” task is now explicitly defined as a cross-modal transfer task designed to assess the quality of learners' mental representations, rather than a direct test of trained skills. This clarification appears in the Methods (lines 262-268). Accordingly, the large performance difference (0.85 vs. 0.45) is interpreted not as a simple advantage, but as evidence that the multimodal condition supported the formation of more robust and flexibly accessible semantic representations (see Results, lines 370-378).

--------------------------------------------------------------------

ADDITIONAL EDITOR / JOURNAL REQUIREMENTS

• File naming & style – checked with the PLOS template; all filenames now conform.

• Participant consent – added explicit statement: “Written informed consent was obtained; consent form filed in participant folder” (lines 174–176).

• Code availability – Analysis scripts (R) uploaded to OSF DOI:10.17504/protocols.io.j8nlk1bx6g5r/v1 and cited in the Data Availability Statement.

• Data availability – This study shares research materials to the extent permitted:

Full Methodology: The complete, step-by-step experimental protocol is available on Protocols.io

(DOI: dx.doi.org/10.17504/protocols.io.j8nlk1bx6g5r/v2).

Analysis Code: All R scripts used for data processing and statistical analysis are available on OSF

(DOI: dx.doi.org/10.17504/protocols.io.j8nlk1bx6g5r/v2).

Research Data: The original, de-identified trial-level data are stored on a secure institutional platform at [the College of Urban Construction, Anhui Vocational College of Defense Technology]. Due to current platform policies and data governance agreements, these raw data cannot be exported or deposited in a public repository. As a complete alternative, all data necessary to replicate the reported statistical findings—including the values behind all means, standard deviations, and test statistics—are provided in the supporting information files (wh2wcvue7.xlsx). For further inquiries, qualified researchers may contact the corresponding author or the institutional ethics committee at [35311296@acdt.edu.cn] to discuss possible access under a formal data agreement.

• Figure 1 participant image – In full compliance with PLOS policies on publication consent, the original Figure 1 has been replaced with a different image, as obtaining informed consent for the publication of an AI-generated likeness was not feasible.

• Typographical & reference errors – Full spell-check performed; “approximately” and the missing reference fixed (see #2, #4 above).

--------------------------------------------------------------------

We appreciate the opportunity to improve our manuscript and believe the revised version fully addresses the methodological, reporting, and ethical concerns raised.

Sincerely,

Corresponding Author: Yan Shibo

On behalf of all co-authors

Attachments
Attachment
Submitted filename: Response to Reviewers.docx
Decision Letter - Ramandeep Kaur, Editor

Multimodal Instruction with AI-Generated Images for Noun Retention: Exploring Semantic Scene and Materiality Effects

PONE-D-25-52191R1

Dear Dr. Yan,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ramandeep Kaur

Academic Editor

PLOS One

Additional Editor Comments (optional):

Reviewers' comments:

Formally Accepted
Acceptance Letter - Ramandeep Kaur, Editor

PONE-D-25-52191R1

PLOS One

Dear Dr. Yan,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS One. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission,

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ramandeep Kaur

Academic Editor

PLOS One

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.