
Scale for students’ attitude towards AIGC feedback in English pronunciation learning: Development, validation and application

  • Weihe Zhong,

    Roles Conceptualization, Writing – review & editing

    Affiliation Rector’s Office, Macau Millennium College, Macao, China

  • Yanchao Yang ,

    Roles Conceptualization, Software, Validation, Writing – original draft, Writing – review & editing

    yangyanchao@mmc.edu.mo

    Affiliations Institute of International Language Services Studies, Macau Millennium College, Macao, China, Qinggong College, North China University of Science and Technology, Tangshan, Hebei Province, China

  • Bosheng Jing,

    Roles Methodology, Validation

    Affiliation School of Humanities and Languages, The University of New South Wales, Sydney, Australia

  • Xinxin Yang,

    Roles Investigation, Supervision

    Affiliation Faculty of Humanities and Social Sciences, Macau Millennium College, Macao, China

  • Zehan Tan,

    Roles Methodology, Writing – review & editing

    Affiliation Faculty of Digital Science and Technology, Macau Millennium College, Macao, China

  • Qiu Wei

    Roles Writing – review & editing

    Affiliations Rector’s Office, Macau Millennium College, Macao, China, Faculty of Health and Wellness, City University of Macau, Macao, China

Abstract

This study develops and validates the Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning. The research was conducted at a university in northern China using a convenience sampling method. The exploratory factor analysis (EFA) involved 207 participants, while the confirmatory factor analysis (CFA) included 229 participants. Based on interviews with 10 students who had used AIGC tools for English pronunciation learning, 16 representative items were identified. Expert validation was performed through interviews with 8 experts: four English pronunciation teachers with extensive experience using AIGC in teaching, and four AIGC specialists. Content validity was confirmed, and all items were retained. The EFA results revealed four dimensions: Accuracy, Strictness, Clarity, and Personalisation. The CFA results demonstrated good structural and convergent validity; discriminant validity, however, was slightly problematic. Concurrent validity was confirmed by the high correlation between the scale and perceived English Pronunciation Self-Efficacy. The study has several limitations, including its cross-sectional design, limited sample diversity, and reliance on traditional validation methods (EFA and CFA), suggesting the need for test-retest reliability checks, a more diverse sample, and alternative methods such as Item Response Theory (IRT) or Network Analysis in future research. The validated scale offers valuable insights into how students perceive and interact with generative AI tools, and it can serve as a useful instrument for educators and researchers interested in exploring the impact of AI feedback systems on language learning.

Introduction

The emergence of generative artificial intelligence (AIGC) has reshaped the landscape of technological advancements, bringing about a significant transformation across various industries, such as healthcare [1,2], hospitality and tourism [3,4], commerce [5,6], and, more notably, education [7–9]. In the context of education, AIGC presents an opportunity to reimagine traditional learning processes. Its application goes beyond automating tasks to providing personalized and adaptive learning experiences that respond to individual student needs. As such, AIGC is poised to become a foundational tool for enhancing teaching effectiveness [10,11], improving student engagement [12,13], and fostering a more dynamic and efficient educational environment.

Among the various domains in which AIGC is being deployed, the field of education is one of the most promising, particularly in the context of language learning (e.g., [14–16]). The integration of AIGC technologies in language education offers a variety of specific functions that address the challenges faced by language learners. These technologies provide personalized learning paths, adaptive content creation, and real-time feedback. For instance, intelligent tutoring systems can analyze learners’ performance and tailor lessons to their proficiency level, ensuring targeted skill development [17,18]. Additionally, AIGC-powered platforms can generate customized exercises and quizzes, enabling learners to practice specific language areas where they need improvement, such as vocabulary building or grammar usage [19]. Furthermore, AIGC tools can offer instant translations, helping learners bridge language gaps [20], while automated writing assistants can help refine written expression through grammar correction and style suggestions [21,22]. These capabilities have significantly transformed the traditional language learning model by offering more interactive, individualized, and efficient learning experiences.

Specifically, pronunciation stands as a fundamental component that significantly influences students’ communicative competence [23–25]. Effective pronunciation not only enhances the clarity of speech but also contributes to the overall success of language learners in real-life interactions. Despite its importance, pronunciation instruction in traditional classroom settings often faces challenges, such as inadequate exposure to the language as it is spoken in the real world [26], limited resources [27], and the influence of the mother tongue [28]. This is where AIGC tools can offer substantial value. By leveraging speech recognition [29,30], speech synthesis [31], Natural Language Processing [32] and machine learning algorithms [33], AIGC technologies are capable of providing precise and immediate feedback on pronunciation, offering students the opportunity to refine their speech patterns autonomously. Such tools allow for a more interactive and efficient learning process, empowering students to engage with the material more effectively and enabling educators to focus on other aspects of language instruction [34–37].

The effectiveness of AIGC tools in English pronunciation learning is intrinsically linked to students’ attitudes toward the feedback they receive. Students’ attitudes toward the accuracy, clarity, and relevance of the feedback provided by AIGC tools play a critical role in determining how they engage with these technologies and how effectively they benefit from them. If students view the feedback as accurate, personalized, and conducive to their learning, they are more likely to integrate these tools into their study routines, leading to improved pronunciation skills. On the other hand, if the feedback is perceived as unclear, overly strict, or not personalized to individual needs, students may disengage from the learning process, diminishing the potential benefits of AIGC tools. However, despite the significant amount of research on attitudes toward generative artificial intelligence technology and products [38–44], there is limited research on attitudes toward generative artificial intelligence’s feedback on English pronunciation. Therefore, the primary objective of this study is to investigate university students’ attitudes toward AIGC feedback in the context of English pronunciation learning by developing and validating a scale for measuring these attitudes, with the aim of facilitating more effective feedback mechanisms.

Literature review

Feedback

Feedback plays a vital role in the teaching-learning process, acting as a bridge between students’ current understanding and their desired learning outcomes. It not only helps students identify gaps in their knowledge but also provides them with clear guidance on how to improve [45]. Through feedback, students are given the chance to address and improve upon areas where their knowledge or skills may be lacking. This allows them to focus on specific weaknesses, whether they involve gaps in understanding, incorrect concepts, or underdeveloped abilities [46].

Feedback can come from both human and technological sources, each offering unique advantages. Human sources, such as teachers [47–50] and peers [51–53], provide personalized, context-specific feedback that can address individual needs, clarify misunderstandings, and offer emotional support. Teachers, in particular, offer expert insights that guide students through the learning process. Peer feedback, on the other hand, encourages collaborative learning, promotes critical thinking, and provides students with different perspectives on their work. In contrast, technological tools bring a new dimension to feedback [54,55]. These systems can provide immediate, data-driven feedback, offering insights on language use, content accuracy, and even structure. With advancements in technology, generative artificial intelligence (AI) feedback has now emerged as a powerful tool in the learning process. Unlike traditional systems, generative AI can not only analyze and assess content but also generate customized feedback that adapts to individual students’ needs [56–58]. This feedback is often immediate and highly personalized, allowing students to receive tailored suggestions and corrections in real time, promoting more efficient and effective learning. As AI continues to evolve, its ability to provide nuanced, context-specific feedback is expected to further enhance the learning experience across various disciplines.

Studies have consistently shown that feedback has a beneficial effect across a wide range of disciplines. For instance, in the language learning context [59–62], feedback helps improve vocabulary acquisition [63–65] as well as speaking proficiency and skills [66–68].

Attitude

The attitude construct has been recognized as one of the most indispensable concepts in psychology. Over time, it has been defined in various ways, reflecting its complexity and relevance across different contexts. For instance, it has been viewed as a psychological, evaluative response toward a specific person, place, thing, event, or other object, characterized by positive and/or negative feelings, and shaped by affective, behavioral, and cognitive information [69–71]. It has also been characterized as a stable and broad assessment of an object, person, group, issue, or concept, evaluated on a spectrum from negative to positive; such assessments offer overall evaluations of target objects and are frequently thought to stem from particular beliefs, emotions, and previous behaviors linked to those objects [72]. Furthermore, it can be described as an individual’s perspective and evaluation of something or someone, reflecting a tendency or inclination to respond either positively or negatively to a particular idea, object, person, or situation [73]. Finally, an attitude is seen as a stable and lasting evaluation or emotional response to a stimulus, object, or situation, which can be either positive or negative; this evaluation plays a key role in shaping the behaviors directed toward the attitude object [74]. In conclusion, the central idea shared across these definitions is that an attitude involves an individual’s evaluation or assessment of something, whether a person, object, event, or situation.

Traditionally, attitude is organized into three dimensions, often called the ABC model: Affective (which includes likes, dislikes, feelings, or emotions), Behavioral (which refers to actions or intentions toward the object, influenced by the cognitive and affective responses), and Cognitive (which involves perceptions and beliefs) [75]. Empirical research, however, does not provide clear evidence distinguishing the thoughts, emotions, and behavioral intentions linked to a specific attitude [69,76]. The model does not fully explain the interplay between the affective, cognitive, and behavioral components, which often intertwine in real-world situations. In practice, these dimensions are not isolated from one another; instead, they continuously influence and shape each other in dynamic ways. The linear and distinct framework of the ABC model, therefore, may not fully account for the complexities of how attitudes manifest in behavior, particularly when emotional and cognitive responses are deeply interlinked and context-dependent. Consequently, a more nuanced approach may be needed to capture the fluid nature of these dimensions and their influence on each other.

Methods

Ethical considerations

The study was approved by the Ethics Committee (Approval No: MMCIRB-2024-002), ensuring compliance with ethical standards. All collected data were anonymized, with no identifiable information retained, and strict measures were implemented to protect participants’ privacy. A convenience sampling method was used to collect data at an independent college in northern China, where students from various provinces in China, representing diverse cultural, educational, and socio-economic backgrounds, participated, making the sample reasonably representative. The survey link, created with Wenjuanxing, was distributed via WeChat. Consent was obtained electronically: participants were required to read the informed consent form embedded in the online questionnaire, and by clicking the “Agree to participate” option, they indicated their consent. Only after providing consent in this way could they proceed to complete the survey. Furthermore, all data were securely stored and used solely for this research, with strict confidentiality measures in place to safeguard participants’ privacy.

Participants

A convenience sampling method was employed to collect data from students at an independent college in northern China. Participants completed an online questionnaire via Wenjuanxing on 20th December 2024, with an average completion time of around 3 minutes. The effective sample size was 436, based on a criterion of 2 seconds per item for valid responses [77,78]. The participants had a mean age of 18.88 years. The sample was then randomly divided into two groups for further analysis: 207 participants were assigned to the exploratory factor analysis (EFA) group (see S1 Table in Supporting Information for reference), and 229 participants were allocated to the confirmatory factor analysis (CFA) group (see S2 Table in Supporting Information for reference).
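The 2-seconds-per-item screening rule can be expressed as a simple completion-time filter. The sketch below is purely illustrative (the data frame, column names, and timings are invented); it assumes the final 16-item scale, so responses faster than 32 seconds are treated as invalid:

```python
import pandas as pd

N_ITEMS = 16
MIN_SECONDS = 2 * N_ITEMS  # 2 seconds per item, as in the validity criterion

# Hypothetical raw responses with recorded completion times (in seconds).
responses = pd.DataFrame({
    "respondent": [1, 2, 3, 4],
    "completion_seconds": [25, 180, 31, 95],
})

# Keep only responses that meet the minimum-time criterion.
valid = responses[responses["completion_seconds"] >= MIN_SECONDS].reset_index(drop=True)
```

Respondents 1 and 3 fall below the 32-second floor and would be discarded, mirroring how the effective sample of 436 was obtained from the raw submissions.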

The demographic characteristics of the sample in both the EFA and CFA groups are displayed in Table 1: In terms of sex, the EFA group consisted of 36.7% male participants (n = 76) and 63.3% female participants (n = 131). In the CFA group, 41.5% of participants were male (n = 95), and 58.5% were female (n = 134). Regarding birthplace, 23.2% of participants in the EFA group were from urban areas (n = 48), while 76.8% were from rural areas (n = 159). In the CFA group, 19.7% of participants were from urban areas (n = 45), and 80.3% were from rural areas (n = 184). These demographic distributions provide a broad representation across both sex and geographical background in the sample.

Scale for students’ attitude towards AIGC feedback in English pronunciation learning

The Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning was developed through a rigorous and systematic process to ensure the scale’s accuracy, relevance, and clarity.

The items for the scale were generated through a focus group interview with 10 students who had experience using generative artificial intelligence to assist in their English pronunciation learning. All ten students were recommended by their English teachers on the basis of their prior experience with generative artificial intelligence tools during their regular study sessions.

The interviews explored students’ experiences and perceptions regarding the feedback they received from AIGC systems. Sample questions included: Do you think the generative AI software can accurately identify your pronunciation errors? If you repeatedly make the same pronunciation mistakes, how does the generative AI software handle these errors? Is the feedback provided by the generative AI software easy to understand? Does the generative AI software remember your previous pronunciation issues and provide more targeted suggestions in subsequent feedback? The focus group interview lasted approximately 40 minutes, a format that allows for rich, interactive discussions in which participants can build upon each other’s responses, creating a dynamic exchange of ideas and helping the researchers identify key themes and areas of interest in the feedback process.

The interview data were analyzed using a Grounded Theory methodology, which allowed for a detailed and systematic exploration of the students’ responses. In the first phase of the analysis, open coding was conducted, where the researchers carefully examined each interview transcript line by line, identifying discrete themes, phrases, and concepts that emerged from the data. These initial codes were assigned to specific segments of the data to capture the essence of the responses.

For instance, some responses were coded with labels such as “phoneme errors,” “vowel mispronunciation,” and “stress errors.” Others included terms like “repeated corrections,” “zero tolerance for errors,” and “attention to minor mistakes.” Further responses were coded with labels such as “clear instructions,” “simple language,” and “avoid technical jargon,” while still others received labels such as “customized exercises,” “tracking progress,” and “targeted feedback.” These labels helped capture the specific aspects of the AI feedback that students commented on.

During the axial coding phase, the researchers reviewed the open codes and began to group related themes and concepts into broader categories, focusing on the relationships between them. This step helped to refine the themes and identify patterns, such as how feedback on pronunciation accuracy, clarity, and personalization were perceived by students.

The final categories were structured into coherent groupings that captured the key aspects of students’ experiences and perceptions, providing a foundation for the theory-building process. This coding process led to the identification of four primary dimensions that reflected the students’ attitudes towards the AIGC feedback: Accuracy, Strictness, Clarity, and Personalization.

Once the items were developed, they were reviewed by 8 English language teachers who were experienced in AI-assisted language learning. These experts were asked to evaluate the items and proposed categories based on their clarity, representativeness, and relevance. To assess the content validity of the scale, the experts rated each item on a 4-point scale, allowing for the calculation of the Item-Content Validity Index (I-CVI) and Scale-Content Validity Index (S-CVI). Both indices were calculated to determine the extent to which the items represented the core concepts of the scale. The results from the expert review indicated that the I-CVI and S-CVI scores met the acceptable threshold, confirming the content validity of the scale. No items required revision based on the feedback from the experts. This step ensured that the scale accurately reflected the key dimensions of students’ attitudes towards AI-generated feedback. To measure students’ attitudes, a 5-point Likert scale was used, with higher scores indicating stronger agreement with the statements provided.
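The I-CVI and S-CVI described above follow the standard content-validity procedure: I-CVI is the proportion of experts rating an item as relevant (3 or 4 on the 4-point scale), and S-CVI/Ave is the mean of the item-level indices. The ratings matrix below is invented for illustration and is not the study's expert data; the thresholds noted in the comments are the commonly cited ones, not values reported by the authors:

```python
import numpy as np

# Hypothetical ratings: 8 experts (rows) x 4 items (columns), 4-point relevance scale.
ratings = np.array([
    [4, 3, 4, 4],
    [4, 4, 3, 4],
    [2, 4, 4, 3],
    [4, 4, 4, 4],
    [4, 3, 4, 4],
    [3, 4, 4, 4],
    [4, 4, 3, 4],
    [4, 4, 4, 3],
])

# I-CVI: proportion of experts rating each item 3 or 4 ("relevant").
i_cvi = (ratings >= 3).mean(axis=0)

# S-CVI/Ave: mean of the item-level content validity indices.
s_cvi_ave = i_cvi.mean()

# Commonly used cut-offs: I-CVI >= 0.78 with 8 raters, S-CVI/Ave >= 0.90.
```

With these toy ratings, the first item's I-CVI is 7/8 = 0.875 and S-CVI/Ave ≈ 0.97, both above the conventional cut-offs.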

English pronunciation self-efficacy

To validate the concurrent validity of the Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning, the English Pronunciation Self-Efficacy questionnaire was also collected. This additional data helps assess the relationship between students’ attitudes towards AIGC feedback and their self-perceived efficacy in English pronunciation, ensuring that the scale aligns with relevant established measures in the context of English pronunciation learning. The English Pronunciation Self-Efficacy scale [79] is designed to measure students’ self-efficacy in English pronunciation, consisting of two dimensions: Segmental Features (4 items) and Suprasegmental Features (5 items). The scale has been validated using both Rasch and Classical Test Theory (CTT) methods, with excellent reliability and validity results. Furthermore, the scale demonstrates strong generalizability across variables such as gender, major, student domicile, and time. The scale uses a five-point Likert scale, with higher scores indicating greater self-efficacy in English pronunciation.

Analytical procedure

The scale development process involved several steps to ensure its reliability and validity. First, an item analysis was conducted, including independent samples t-tests, item-total correlation analysis, and Cronbach’s α if item deleted. Next, factor analysis (EFA and CFA) was performed to identify and confirm the scale’s underlying factor structure. During Confirmatory Factor Analysis (CFA), model fit was assessed using fit indices such as RMSEA, SRMR, CFI, TLI, and chi-square statistics, with values of RMSEA and SRMR below 0.08, and CFI and TLI above 0.90 considered indicative of a well-fitting model [80,81]. Convergent validity was assessed using factor loadings, composite reliability, and average variance extracted (AVE), while discriminant validity was evaluated using the Fornell-Larcker criterion. Finally, the concurrent validity was conducted by calculating the Pearson correlation with English Pronunciation Self-Efficacy scale. This rigorous approach ensured that the scale accurately measured the intended construct and demonstrated both reliability and validity.

Results

Item analysis

The item discrimination analysis, based on a t-test between the top 27% (total ≥ 58) and bottom 27% (total ≤ 48) groups, revealed significant differences for all items (p < 0.01), confirming their ability to differentiate effectively [82]. Item-total correlations ranged from 0.811 to 0.867 (p < 0.01), demonstrating strong relationships with the overall construct and supporting internal consistency. The overall Cronbach’s α was 0.972, indicating excellent reliability. Furthermore, the analysis showed that removing any item did not improve Cronbach’s α, affirming that all items contribute meaningfully to the scale’s consistency.
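Cronbach's α and the α-if-item-deleted check can be computed directly from a raw score matrix. The sketch below uses simulated one-factor data rather than the study's responses, so the numerical results are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated data: 200 respondents x 6 items driven by one common factor,
# so the items are internally consistent by construction.
common = rng.normal(size=(200, 1))
items = common + 0.5 * rng.normal(size=(200, 6))

def cronbach_alpha(x: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

alpha = cronbach_alpha(items)

# Alpha if item deleted: recompute alpha with each item removed in turn;
# a value above `alpha` would suggest the item weakens consistency.
alpha_if_deleted = [
    cronbach_alpha(np.delete(items, j, axis=1)) for j in range(items.shape[1])
]
```

For a well-behaved scale like the one reported here (α = 0.972), every α-if-deleted value sits at or below the overall α, which is exactly the pattern the authors describe.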

Exploratory factor analysis

The results of the exploratory factor analysis (EFA) conducted using the first dataset demonstrated strong evidence for the appropriateness of the data for factor extraction. Specifically, the Kaiser-Meyer-Olkin (KMO) measure was 0.954, indicating excellent sampling adequacy, while Bartlett’s Test of Sphericity yielded a significant chi-square value (χ² = 3518.736, df = 120, p < 0.001), confirming the suitability of the data for factor analysis. Based on the hypothesized factor structure, with the number of factors fixed at four, promax oblique rotation was employed, given the potential correlations among the factors. Items were flagged for removal if they exhibited any of the following: (a) factor loadings below 0.40, (b) communalities below 0.30, (c) cross-loadings (loadings above 0.30 on two or more factors), or (d) membership in a factor with fewer than three items. No items were removed, and the final factor structure was in alignment with the qualitative analysis results, as shown in Table 2.
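The KMO measure and Bartlett's test reported above are both functions of the item correlation matrix: KMO compares squared correlations to squared partial correlations, and Bartlett's χ² tests whether the correlation matrix departs from identity. This is a plain-NumPy sketch on simulated factor-structured data, not the study's dataset:

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated correlated survey data: 200 respondents x 8 items, two common factors.
common = rng.normal(size=(200, 2))
loadings = rng.uniform(0.5, 0.9, size=(2, 8))
data = common @ loadings + 0.6 * rng.normal(size=(200, 8))

def kmo(x: np.ndarray) -> float:
    """Kaiser-Meyer-Olkin measure of sampling adequacy (overall)."""
    r = np.corrcoef(x, rowvar=False)
    u = np.linalg.inv(r)
    d = np.sqrt(np.outer(np.diag(u), np.diag(u)))
    partial = -u / d                          # partial correlation matrix
    mask = ~np.eye(r.shape[0], dtype=bool)    # off-diagonal entries only
    r2, p2 = (r[mask] ** 2).sum(), (partial[mask] ** 2).sum()
    return r2 / (r2 + p2)

def bartlett_sphericity(x: np.ndarray):
    """Bartlett's test of sphericity: chi-square statistic and degrees of freedom."""
    n, p = x.shape
    r = np.corrcoef(x, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(r))
    return chi2, p * (p - 1) // 2

kmo_value = kmo(data)
chi2, df = bartlett_sphericity(data)
```

With 16 items, as in the study, Bartlett's df is 16 × 15 / 2 = 120, matching the reported df; the toy 8-item data above gives df = 28.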

The four factors identified in the scale were named Personalisation, Accuracy, Strictness, and Clarity, each representing a key dimension for assessing students’ attitudes towards the feedback provided by generative AI tools in English pronunciation learning. Personalisation reflects how well the GenAI tailors feedback based on the learner’s individual history and pronunciation issues. Accuracy represents the GenAI tool’s precision in identifying and correcting specific pronunciation errors, such as phonemes, stress, and intonation. Strictness captures the GenAI’s approach in addressing all errors, ensuring they are corrected until fully resolved. Lastly, Clarity pertains to how easily students can comprehend and act upon the feedback given. As shown in Tables 2 and 3, Personalisation (Cronbach’s α = 0.927) explained 21.4% of the variance; Accuracy (Cronbach’s α = 0.940) explained 20.4% of the variance; Strictness (Cronbach’s α = 0.926) explained 18.4% of the variance; and Clarity (Cronbach’s α = 0.925) explained 17.6% of the variance. These factors demonstrated strong internal consistency and accounted for a substantial portion of the total variance.

Confirmatory factor analysis

CFA was subsequently performed by using the CB-SEM module in SmartPLS to validate the four-dimensional factorial structure of the scale, as illustrated in Fig 1. The model fit was evaluated based on several commonly used fit indices, including Root Mean Square Error of Approximation (RMSEA), Standardized Root Mean Square Residual (SRMR), Comparative Fit Index (CFI), and Tucker-Lewis Index (TLI). According to established criteria, RMSEA values below 0.08, SRMR values below 0.08, and CFI and TLI values above 0.90 indicate an adequate model fit [80,81]. The CFA results presented in Table 4 demonstrated that all fit indices met these thresholds, confirming that the proposed model exhibited an acceptable fit with the data, thus affirming the validity of the underlying factor structure.

Fig 1. Four-factor model of scale for students’ attitude towards AIGC feedback in English pronunciation learning.

https://doi.org/10.1371/journal.pone.0335210.g001

Convergent validity

For convergent validity, both Composite Reliability (CR) and Average Variance Extracted (AVE) were evaluated. As shown in Table 5, the CR values ranged from 0.897 to 0.944, exceeding the recommended threshold of 0.7 [83], indicating that the constructs demonstrate good internal consistency. Similarly, the AVE values ranged from 0.686 to 0.807, surpassing the threshold of 0.5 [83], which confirms that a sufficient proportion of the variance in the indicators is captured by their respective latent variables. These results collectively support the convergent validity of the measurement model.
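CR and AVE are both computed from the standardized factor loadings: CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = mean(λ²). The loadings below are illustrative, not the study's estimates:

```python
import numpy as np

# Hypothetical standardized loadings for one four-indicator construct.
loadings = np.array([0.82, 0.85, 0.88, 0.80])

# Composite reliability: (sum of loadings)^2 over itself plus total error variance.
errors = 1 - loadings ** 2
cr = loadings.sum() ** 2 / (loadings.sum() ** 2 + errors.sum())

# Average variance extracted: mean squared loading.
ave = (loadings ** 2).mean()

# Conventional thresholds: CR > 0.7 and AVE > 0.5.
```

These toy loadings give CR ≈ 0.90 and AVE ≈ 0.70, values in the same region as the ranges reported in Table 5 (CR 0.897 to 0.944, AVE 0.686 to 0.807).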

Discriminant validity

For discriminant validity, the Heterotrait-Monotrait ratio (HTMT) was used to assess the distinctiveness of the constructs. According to the stricter standard, HTMT values should be below 0.85, while a more lenient threshold allows values up to 0.90 [84]. As displayed in Table 6, the HTMT value for the Personalisation construct was 0.943, which exceeds the 0.90 threshold. However, the HTMT values for all other construct pairs remained within the acceptable range, confirming that the constructs are sufficiently distinct for the majority of the model. Thus, while the value for Personalisation is slightly above the threshold, the results partially support the discriminant validity of the measurement model.
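The HTMT ratio is the mean of the between-construct item correlations divided by the geometric mean of the average within-construct item correlations. The sketch below simulates two deliberately overlapping constructs (loosely mirroring the Personalisation situation) to show how values near 0.9 arise; it is not the study's data:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300
# Two latent traits correlated at 0.9, each measured by four noisy items.
f1 = rng.normal(size=n)
f2 = 0.9 * f1 + np.sqrt(1 - 0.81) * rng.normal(size=n)
items_a = np.column_stack([f1 + 0.5 * rng.normal(size=n) for _ in range(4)])
items_b = np.column_stack([f2 + 0.5 * rng.normal(size=n) for _ in range(4)])

def htmt(a: np.ndarray, b: np.ndarray) -> float:
    """Heterotrait-monotrait ratio of correlations between two item blocks."""
    ka, kb = a.shape[1], b.shape[1]
    r = np.corrcoef(np.hstack([a, b]), rowvar=False)
    hetero = r[:ka, ka:].mean()                    # between-block correlations
    mono_a = r[:ka, :ka][np.triu_indices(ka, k=1)].mean()
    mono_b = r[ka:, ka:][np.triu_indices(kb, k=1)].mean()
    return hetero / np.sqrt(mono_a * mono_b)

value = htmt(items_a, items_b)   # expected to land near 0.9 by construction
```

When the underlying traits correlate at 0.9, HTMT hovers around the 0.90 lenient threshold, which is why a value of 0.943 signals that two constructs may not be empirically distinct.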

Concurrent validity

For concurrent validity, the English Pronunciation Self-Efficacy scale [79] was used as the benchmark instrument by calculating the Pearson correlation coefficients between English Pronunciation Self-Efficacy and the dimensions as well as the total score of Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning. The correlations between the model’s dimensions and the English Pronunciation Self-Efficacy were as follows: 0.495 (Accuracy), 0.526 (Strictness), 0.565 (Clarity), and 0.546 (Personalisation) with p < 0.001. The correlation between Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning and the English Pronunciation Self-Efficacy scale was 0.581 (p < 0.001). These significant positive correlations suggest that the proposed model is meaningfully related to the English Pronunciation Self-Efficacy scale, supporting its concurrent validity.
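The concurrent-validity check reduces to a Pearson correlation between the two instruments' scores. The sketch below simulates positively associated totals (the sample size matches the CFA group, but the scores themselves are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 229  # size of the CFA group in the study

# Hypothetical standardized total scores with a built-in positive association.
attitude = rng.normal(size=n)
self_efficacy = 0.6 * attitude + 0.8 * rng.normal(size=n)

# Pearson correlation between the attitude-scale total and self-efficacy total.
r = float(np.corrcoef(attitude, self_efficacy)[0, 1])
```

By construction the true correlation is about 0.6, in the same region as the reported total-score correlation of 0.581.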

Discussion

The primary aim of this study was to develop and validate the Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning. The scale was rigorously developed through a systematic process, including item generation and expert validity checks, to ensure its content relevance and clarity. The validation process involved several robust methods, including item analysis, exploratory factor analysis (EFA), and confirmatory factor analysis (CFA), which together provided a comprehensive assessment of the scale’s structure and reliability. Additionally, concurrent validity was assessed through correlation with the English Pronunciation Self-Efficacy scale, further strengthening the scale’s validity.

The results of the EFA and CFA both supported a four-factor structure, with the identified dimensions being Accuracy, Clarity, Personalisation, and Strictness. These factors reflect distinct aspects of the students’ attitudes toward AIGC feedback in English pronunciation learning. Specifically, the Accuracy dimension evaluates the ability of generative AI tools to precisely identify and correct specific pronunciation errors. It is supported by speech deep learning classification algorithms, which enable the AI to process and recognize phonetic variations, identify incorrect pronunciations, and provide accurate corrections.

The Strictness dimension assesses how rigorously these tools handle errors, ensuring consistent and uncompromising correction until the desired pronunciation standard is achieved. This is achieved through reinforcement learning, where the GenAI system is trained to reinforce correct pronunciation behaviors and penalize errors, adapting its feedback progressively to maintain a high standard of pronunciation correction.

The Clarity dimension addresses the students’ ability to understand and act upon the feedback, highlighting the ease with which students can follow the AI’s suggestions. It relies on large language models, which are designed to generate clear, comprehensible, and contextually relevant feedback. These models enhance communication by tailoring responses to the user’s level of understanding and language proficiency.

Finally, the Personalisation dimension evaluates the AI’s capacity to utilize learning records to provide personalized, continuously improving feedback tailored to the individual student. This is supported by database technology, which stores and analyzes students’ learning histories and performance data, allowing the GenAI to adjust feedback to meet each student’s unique needs and track their progress over time.

The scale demonstrated high reliability across its dimensions, further confirming its suitability for assessing students’ attitudes toward GenAI-generated feedback in pronunciation learning.

The scale demonstrates strong convergent validity and concurrent validity, which support its overall reliability and relevance in measuring students’ perceptions of AIGC feedback for English pronunciation learning. Discriminant validity, however, presents a slight concern: the HTMT value involving the Personalisation construct (0.943) exceeds even the more lenient threshold of 0.90, indicating substantial overlap with other constructs, particularly Clarity. This overlap is plausible on conceptual grounds. Given that Personalisation refers to the degree to which feedback is tailored to the individual learner’s needs, students’ perceptions of feedback clarity likely also shape their perceptions of its personal relevance, blurring the distinction between the two constructs. Moreover, since all participants were drawn from the same school, similarities in prior instruction, learning environment, and experience may further inflate the perceived overlap between these dimensions. We therefore caution that the elevated HTMT value likely reflects the homogeneity of the sample rather than a flaw in the scale design; future studies with more diverse samples are recommended to further validate the distinctiveness of these constructs.

Implications

By developing a scale to measure students’ attitudes toward feedback from generative artificial intelligence (AI), the current study contributes to optimizing English pronunciation learning in several ways. First, generative AI can provide personalized, real-time feedback, and students’ attitudes toward and receptiveness to this feedback directly affect their learning outcomes; the research therefore offers insights into improving the design of AI tools to better meet students’ needs and enhance learning effectiveness. Second, the study can help increase the acceptance of AI in language learning by documenting how students perceive generative AI feedback. Understanding students’ expectations and attitudes allows generative AI developers to refine their tools’ design and functionality to better align with learners’ preferences and habits.

Limitations and suggestions

This study has several limitations that should be addressed in future research. First, it employs a cross-sectional design, which captures data at a single point in time. This approach limits the ability to assess the stability or consistency of the scale across multiple time points. While the scale demonstrates good internal consistency and construct validity within the current dataset, future research would benefit from incorporating a test-retest reliability measure. By collecting data at multiple intervals, researchers can assess the scale’s stability and ensure that the attitudes captured are consistent over time, thus enhancing the scale’s reliability.

Second, while the sample size in this study meets the minimum requirements for conducting factor analyses, there is room for improvement. A larger and more diverse sample would increase the robustness of the results and improve the generalizability of the findings. Moreover, the sample was drawn from a single institution, which limits the extent to which the results can be extrapolated to a broader population. Although various demographic variables were considered, future research should include students from multiple institutions and geographical locations to ensure that the findings are more representative and applicable across different cultural contexts. This will help strengthen the generalizability of the scale and its applicability to a wider range of learners.

Third, the validation methods employed in this study, namely Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA), are traditional and well-established approaches for scale validation. However, these methods have certain limitations, such as their reliance on linearity and fixed factor structures. Future studies could explore alternative, more flexible approaches to validation, such as Item Response Theory (IRT) and Network Analysis. IRT can offer more nuanced insights into how individual items function across different levels of the latent trait (e.g., students’ attitudes towards AIGC feedback), while Network Analysis can provide a deeper understanding of the interrelationships between the different dimensions of the scale, highlighting potential dependencies or complexities not captured by traditional factor models.
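To illustrate the kind of item-level insight IRT can add, a two-parameter logistic (2PL) model expresses the probability of endorsing an item as a function of the latent trait θ (here, attitude toward AIGC feedback), an item discrimination a, and an item location b. The sketch below uses hypothetical parameter values purely for illustration, not the study’s data:

```python
import math

def p_2pl(theta, a, b):
    """2PL item response function: probability of endorsing an item
    given latent trait theta, discrimination a, and location b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Two hypothetical items located at b = 0: a highly discriminating item
# (a = 2.0) separates students near its location much more sharply than
# a weakly discriminating one (a = 0.5).
for theta in (-1.0, 0.0, 1.0):
    sharp = p_2pl(theta, a=2.0, b=0.0)
    flat = p_2pl(theta, a=0.5, b=0.0)
    print(f"theta={theta:+.1f}  sharp={sharp:.3f}  flat={flat:.3f}")
```

Unlike a factor loading, which summarizes an item with a single number, the discrimination and location parameters show how informative each item is at different levels of the latent attitude, which is precisely the nuance traditional EFA/CFA cannot provide.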

By addressing these limitations and incorporating these suggestions, future research can further refine the Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning and increase its applicability and precision across a broader spectrum of learners and contexts.

Conclusion

The validated scale offers valuable insights into how students perceive and interact with generative AI tools, and it can serve as a useful instrument for educators and researchers interested in exploring the impact of AI feedback systems on language learning.

Supporting information

S1 Fig. Four-factor model of scale for students’ attitude towards AIGC feedback in English pronunciation learning.

https://doi.org/10.1371/journal.pone.0335210.s003

(PNG)

S2 File. Scale for Students’ Attitude towards AIGC Feedback in English Pronunciation Learning.

https://doi.org/10.1371/journal.pone.0335210.s004

(DOCX)
