Development and validation of a deep learning–based assessment tool for teacher leadership: A case study from Xinjiang, China

  • Jianwei Dong,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Resources, Writing – original draft, Writing – review & editing

    Affiliations College of Education Science, Xinjiang Normal University, Urumqi, China, College of Education Science, Xinjiang Teacher’s College, Urumqi, China

  • Xinya Chen,

    Roles Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation College of Information Science and Engineering, Xinjiang University, Urumqi, China

  • Chen Chen,

    Roles Data curation, Formal analysis, Funding acquisition, Validation

    Affiliation College of Software, Xinjiang University, Urumqi, China

  • Cheng Chen

    Roles Conceptualization, Methodology, Project administration, Supervision, Visualization

    chenchengoptics@gmail.com

    Affiliations College of Software, Xinjiang University, Urumqi, China, Department of Cardiology, People’s Hospital of Xinjiang Uyghur Autonomous Region, Urumqi, China, Xinjiang Key Laboratory of Cardiovascular Homeostasis and Regeneration Research, Urumqi, China

Abstract

Teacher leadership is widely regarded as a critical driver of school reform and educational quality improvement. Although the field has been extensively studied, empirical research remains limited in Xinjiang, China—a region characterized by its multiethnic and multilingual context. To address this gap, the present study developed and validated a culturally sensitive assessment tool based on a sample of 371 primary and secondary school teachers from Xinjiang. A structured questionnaire was designed encompassing four dimensions: professional guidance, educational collaboration, cross-cultural ICT-based teaching competence, and leadership cognition. In addition, we introduced an interpretable deep learning model—ITL-LSTM (Interpretable Teacher Leadership LSTM)—which employs a Diagonal BiLSTM structure for dynamic classification of teacher leadership profiles, achieving a prediction accuracy of 90.10%. The findings indicate that the proposed tool demonstrates strong applicability and scalability within the Xinjiang context, providing effective support for dynamic evaluation, personalized development, and evidence-based decision-making in multicultural educational settings.

1. Introduction

Teacher leadership has a significant impact on school development and on students’ growth [1–4]. As educational research has deepened in recent years, it has become a shared concern of both scholars and practitioners [5,6]. Many studies have examined how it promotes teaching quality, teachers’ professional growth, and students’ academic achievement, providing a rich theoretical and practical basis for educational development [7–9].

However, compared with the developed eastern regions, research on teacher leadership in border areas, especially Xinjiang, remains weak [10]. Xinjiang’s social background of multi-ethnic settlement and multicultural integration gives its educational environment distinct characteristics: the cultures of multiple ethnic groups meet and blend on campus, which places higher demands on teachers’ leadership abilities [11–13]. Existing studies have not comprehensively revealed the actual state of teacher leadership in Xinjiang’s primary and secondary schools. They lack systematic analysis of the factors that affect how teacher leadership is exercised, such as ethnic and cultural differences, regional education conditions, and unbalanced educational resources. Nor have they explored a development path for teacher leadership that fits local characteristics, or discussed in depth how Xinjiang’s rich cultural characteristics could be used to strengthen teacher leadership and improve the quality of education and teaching. This falls well short of the region’s urgent need for high-quality education.

At the same time, with the development of artificial intelligence, deep learning is increasingly applied in educational assessment and decision-making support. Introducing deep learning models into teacher leadership research can improve the accuracy and rigor of analysis and offers a new path for the intelligent assessment of teacher capabilities [14–16].

Although various influential definitions, models, and assessment frameworks for teacher leadership have been proposed, most existing studies are concentrated in economically developed regions and lack empirical applicability to areas with distinct social structures and educational ecologies such as Xinjiang. This study aims to address the contextual complexity and cultural diversity of the education system in Xinjiang, China, by developing an adaptive and interpretable assessment tool for teacher leadership. By constructing a four-dimensional scale—comprising professional guidance, educational collaboration, cross-cultural ICT-based teaching competence, and leadership cognition—and integrating it with a Diagonal BiLSTM deep learning model, this study enables dynamic identification and predictive analysis of teacher leadership. The goal is to establish a scalable assessment framework applicable to multiethnic, resource-constrained educational settings, thereby supporting teacher development and informing education policy and practice.

2. Review

The theory of teacher leadership originated in the education reforms of the 1980s in the United States, which emphasized the crucial role of teachers in teaching improvement and school development. As research deepened, the theory gradually shifted from an understanding based on “position power” to one based on “professional influence” and “collective participation” [3,17]. Currently, teacher leadership is widely defined as the comprehensive ability of teachers to have a positive impact on the educational environment through knowledge, skills, emotions, and collaborative behaviors [18,19]. It emphasizes the leadership role of teachers in promoting teaching quality, driving school change, and constructing a professional culture, especially in cross-hierarchical interactions in formal and informal settings [20].

In recent years, international research on teacher leadership models has become more systematic. The “Teacher Leader Model Standards” and the “Teacher Leadership Framework” released by the Teacher Leadership Exploratory Consortium in the United States have further clarified its core dimensions, including self-development, innovation leadership, coaching ability, and teamwork [2]. These studies have promoted the transformation of teacher leadership from theory to practice and provided a clear path for education reform.

In China, especially in multi-ethnic regions such as Xinjiang, research on teacher leadership has gradually attracted attention. Existing studies have shown that teacher leadership styles directly affect students’ academic achievements and learning environments, and play an important role in constructing educational equity, cultural integration, and teacher professional growth [21,22]. For example, research on Shanghai’s PISA scores has emphasized the role of teacher leadership in curriculum implementation and teacher team construction, and empirical analysis in Beijing also supports its positive effects on academic performance and campus culture construction.

Overall, the core attributes of teacher leadership include ethicality, professionalism, and collectivism [23]. In essence, it is an internally driven force for educational change, and it has particular research and practical value in regions with diverse social cultures and complex educational challenges. On this basis, this article explores the performance characteristics and development mechanisms of teacher leadership in Xinjiang, providing theoretical support and practical paths for the high-quality development of education in border areas.

2.1. Existing teacher leadership assessment tools and their limitations

A comparative analysis of five recent studies (Chen, 2022; Eckert, 2022; Pazur, 2022; Darwis et al., 2025; Zhang et al., 2025) reveals several common limitations among existing teacher leadership assessment tools:

Static measurement models lacking dynamic behavior capture: Instruments such as Chen’s (2022) Teacher Leadership Inventory (TLI) and Eckert’s (2022) Collective Leadership Tool primarily rely on structural equation modeling and multi-group confirmatory factor analysis (MGCFA). These approaches are inherently static and fail to reflect the evolving nature of teacher leadership in real-world educational settings.

Limited applicability across regions and populations: Pazur’s (2022) Democratic Leadership Scale is confined to the Croatian context, while Darwis et al.’s (2025) tool for Indonesian principals is similarly restricted to a single national setting. Both fail to account for the dynamic adaptability required in multicultural and diverse educational environments. Although Zhang et al. (2025) employed a large university sample in China, their focus on higher education excludes applicability to K–12 settings and cross-cultural factors.

Lack of integration with advanced technologies: Most existing tools remain grounded in traditional quantitative analysis, with minimal incorporation of artificial intelligence or deep learning methods. As such, they fall short in accurately predicting or explaining the dynamic trajectories of teacher leadership development.

3. Materials and methods

3.1. Research method

Therefore, this study aims to develop and validate a teacher leadership assessment tool tailored to the educational complexities of China’s Xinjiang region, characterized by its multi-ethnic and multilingual context. Based on construct validity principles, we designed the “Teacher Leadership Survey for Primary and Secondary Schools,” incorporating four key dimensions: Professional Guidance, Educational Collaboration, Cross-cultural Digital Instructional Competence, and Leadership Cognition. A reliability and validity analysis was conducted on a sample of 371 responses to ensure the scientific soundness and contextual applicability of the scale structure. Building upon this, we further introduced an interpretable deep learning model—ITL-LSTM—based on a Diagonal BiLSTM architecture, enabling dynamic classification and prediction of teacher leadership levels. The model provides an intelligent, interpretable, and generalizable evaluation tool for educational systems operating under complex cultural conditions [29].

3.2. Data collection

The research tool used in this study is the “Questionnaire on the Current Situation of Leadership among Primary and Secondary School Teachers in Xinjiang Uygur Autonomous Region”, which was compiled by experts in the field of education and has high content validity. This questionnaire consists of two parts. The first part is a survey of basic information about teachers, including variables such as gender, age, ethnicity, years of teaching experience, subjects taught, educational levels, professional titles, and school regions.

The second part is the Scale of Leadership for Primary and Secondary School Teachers, which comprises four dimensions: professional guidance, educational collaboration, leadership cognition, and cross-cultural information technology teaching ability. To avoid potential expectancy biases among respondents, the dimension divisions were not stated explicitly during the formal survey. After the questionnaires were collected, the scores of the four dimensions were calculated by grouping the items by dimension [30]. The scale uses a five-point Likert format, with 5 indicating “completely applies,” 4 “mostly applies,” 3 “basically applies,” 2 “basically does not apply,” and 1 “does not apply at all.”
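As a rough illustration of how dimension scores can be derived by grouping items, the sketch below averages 1 to 5 Likert ratings per dimension. The item-to-dimension key `DIMENSION_ITEMS` and the ten-item toy response are hypothetical; the actual item assignment belongs to the authors' scale and is not reproduced here.

```python
from statistics import mean

# Hypothetical item-to-dimension key (0-based item positions); the real
# assignment is defined by the authors' scale, not by this sketch.
DIMENSION_ITEMS = {
    "professional_guidance": [0, 1, 2],
    "educational_collaboration": [3, 4, 5],
    "leadership_cognition": [6, 7],
    "cross_cultural_ict": [8, 9],
}

def dimension_scores(responses):
    """Return the mean 1-5 Likert rating per dimension for one respondent."""
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("Likert responses must be integers from 1 to 5")
    return {dim: mean(responses[i] for i in items)
            for dim, items in DIMENSION_ITEMS.items()}

example = [5, 4, 4, 3, 4, 5, 2, 3, 5, 5]  # toy answers, not survey data
scores = dimension_scores(example)
```

Averaging rather than summing keeps dimensions with different numbers of items on the same 1 to 5 scale.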

3.3. Implementation of the survey

This study distributed the “Survey Questionnaire on the Current Situation of Leadership among Primary and Secondary School Teachers in Xinjiang Uygur Autonomous Region” to primary and secondary schools in Xinjiang. A total of 400 questionnaires were collected (the dataset is provided in S1 Data). The survey data were cleaned and analyzed using SPSS 26.0. In 29 cases, respondents chose the same answer throughout or gave uniformly agreeing responses. These cases were removed, leaving a final sample of 371 valid questionnaires. The specific statistical details are presented in Table 1.
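A minimal sketch of the cleaning rule described above, assuming each questionnaire is a list of Likert ratings. It only covers the "same answer throughout" criterion; how uniformly agreeing responses were identified is not specified in the text, so that check is left out. The function names and toy rows are illustrative.

```python
def is_straight_lined(responses):
    """True when every item received the same rating."""
    return len(set(responses)) == 1

def clean(questionnaires):
    """Keep only questionnaires that show some variation across items."""
    return [q for q in questionnaires if not is_straight_lined(q)]

raw = [
    [4, 4, 4, 4, 4],   # identical answers throughout: dropped
    [5, 4, 3, 4, 5],   # kept
    [1, 1, 1, 1, 1],   # dropped
    [3, 4, 4, 2, 5],   # kept
]
valid = clean(raw)

# The reported counts are consistent: 400 collected, 29 removed, 371 retained.
remaining = 400 - 29
```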

The completed questionnaires cover a wide range of populations and regions and reflect characteristics specific to Xinjiang. Han teachers account for 74.9% of the sample, Uyghur teachers for 11.4%, and Kazakh and other ethnic minority teachers for 7.8% and 6.2%, respectively, in line with the distribution proportion of ethnic minorities in Xinjiang. By region, Northern Xinjiang accounts for 47.6% of the sample, while Urumqi and Southern Xinjiang account for 31.1% and 19.2%, respectively. Eastern Xinjiang and the Xinjiang Production and Construction Corps are less represented, at 1.4% and 1.1%, respectively. In terms of educational stage, junior high schools account for the largest proportion at 55.8%, and primary schools account for 39.3%.

3.4. Item analysis

The item analysis method is widely used in assessing the discriminative power of scale items. In this study, using SPSS 26.0 software, we calculated the total scores of the respondents on the teacher leadership scale to measure their leadership capabilities. Based on this analysis, we categorized the participants into two groups: a high-scoring group and a low-scoring group. Specifically, participants scoring in the top 27% were classified as the high-scoring group, while those scoring in the bottom 27% were classified as the low-scoring group.

To examine the discriminative power of each item between the two groups, independent samples t-tests were conducted. Table 2 shows that all sixteen items discriminate significantly between the high-scoring and low-scoring groups (p < 0.05). The scores of the high-scoring group were significantly higher than those of the low-scoring group on every item, indicating that the items discriminate well between levels of teacher leadership and pass the item analysis test.
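The extreme-groups procedure above can be sketched as follows. This is an illustration on toy data, not the SPSS computation: it assumes Welch's form of the t statistic (SPSS reports both the pooled and the unequal-variance versions) and stops at the statistic, since a p-value needs the t distribution. The names `extreme_groups` and `welch_t` are made up for the example.

```python
from math import sqrt
from statistics import mean, variance

def extreme_groups(totals, fraction=0.27):
    """Return index lists for the lowest and highest `fraction` of total scores."""
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    k = max(2, round(len(totals) * fraction))
    return order[:k], order[-k:]

def welch_t(high, low):
    """Welch's t statistic for two independent samples."""
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return (mean(high) - mean(low)) / se

# Toy data: rows are respondents, columns are items (1-5 Likert).
data = [
    [5, 5, 4], [4, 5, 5], [5, 4, 5], [4, 4, 4], [3, 4, 3],
    [3, 3, 4], [3, 3, 3], [2, 3, 2], [2, 2, 3], [1, 2, 2],
]
totals = [sum(row) for row in data]
low_idx, high_idx = extreme_groups(totals)

# One t statistic per item, comparing the high and low groups.
t_per_item = [
    welch_t([data[i][j] for i in high_idx], [data[i][j] for i in low_idx])
    for j in range(len(data[0]))
]
```

An item whose t statistic is small or negative would fail to separate the two groups and would be a candidate for revision or removal.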

3.5. Reliability analysis

Reliability refers to the consistency and stability of the measurements obtained from the questionnaire. In this study, reliability was examined by calculating Cronbach’s alpha coefficient and the split-half reliability coefficient to assess the internal consistency of the Teacher Leadership Scale.

3.5.1. Cronbach’s alpha coefficient analysis.

Cronbach’s alpha coefficient is the most widely used index for evaluating reliability in current research. According to the American scholar Kline, an alpha coefficient below 0.5 is unacceptable, whereas a value above 0.5 is acceptable. Acceptable values are further graded: moderate reliability when α > 0.7, good reliability when α > 0.8, and excellent reliability when α > 0.9. In other words, the higher the alpha coefficient, the better the reliability and the more stable the internal structure of the scale.

In this study, Cronbach’s alpha coefficient was calculated for the Teacher Leadership Scale developed for primary and secondary school teachers. As shown in Table 3, the alpha coefficient for the overall scale is 0.899, indicating that the scale as a whole has good reliability and internal consistency.
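For readers who want to see what the coefficient measures, here is a small sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), applied to toy data. The rows are invented for illustration and do not reproduce the study's value of 0.899.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same length each)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

toy = [[5, 4, 5], [4, 4, 4], [3, 3, 4], [2, 3, 2], [1, 2, 2]]
alpha = cronbach_alpha(toy)  # roughly 0.94 for this toy matrix
```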

3.5.2. Split-half reliability coefficient analysis.

The split-half reliability coefficient, also known as the Spearman-Brown coefficient, is another important tool for assessing reliability. In this study, the Spearman-Brown split-half coefficient for the Teacher Leadership Scale was calculated. The specific results are shown in Table 4, indicating a split-half coefficient of 0.834, which is greater than 0.8. This suggests that the structure of the scale has good reliability.

Table 4. Spearman-Brown Split-Half Reliability Coefficient Values.

https://doi.org/10.1371/journal.pone.0331560.t004
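The split-half coefficient can be sketched in the same spirit: correlate the total scores on two halves of the scale, then apply the Spearman-Brown correction 2r / (1 + r). The split used here (first half of the items versus second half) is an assumption for illustration, and the toy rows are not survey data.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_brown(r_half):
    """Step the half-test correlation up to full-length reliability."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(rows):
    """Assumed split: first half of the items versus the second half."""
    half = len(rows[0]) // 2
    first = [sum(r[:half]) for r in rows]
    second = [sum(r[half:]) for r in rows]
    return spearman_brown(pearson(first, second))

toy = [[5, 4, 5, 4], [4, 4, 4, 3], [3, 3, 4, 3], [2, 3, 2, 2], [1, 2, 2, 1]]
reliability = split_half_reliability(toy)
```

The correction is needed because each half is only half as long as the full scale, and shorter tests are less reliable.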

In summary, through the dual examination of Cronbach’s alpha coefficient and split-half reliability coefficient values, it was found that the Teacher Leadership Scale for primary and secondary school teachers has good reliability. This indicates that the research data obtained are highly credible, and the structure exhibits internal consistency.

3.6. Validity analysis

Two statistics are commonly used to check whether data are suitable for factor analysis: the Kaiser-Meyer-Olkin (KMO) measure and Bartlett’s test of sphericity. The KMO measure compares the correlations between variables with their partial correlations. A value closer to 1 indicates that the variables share more common variance, which favors subsequent factor analysis. Bartlett’s test of sphericity tests whether the correlation matrix of the observed data is an identity matrix, that is, whether the variables are uncorrelated and therefore unsuitable for factor analysis.

According to commonly accepted standards, the KMO measure should exceed 0.7 and the significance of Bartlett’s test of sphericity should be below 0.05. Only when both conditions are met can the data be considered suitable for factor analysis and the validity of the measured dimensions be examined further.

As shown in Table 5, the KMO statistic is 0.914, well above the threshold of 0.7, indicating strong correlations between variables. Bartlett’s test of sphericity yields a chi-square value of 2892.572 with a reported significance of 0.000 (p < 0.001), well below the 0.05 level. The correlation matrix is therefore not an identity matrix, and the data are highly suitable for factor analysis.
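For reference, the two statistics are conventionally defined as below. The notation is an assumption introduced here, not taken from the article: r_ij are the correlations between items, a_ij the corresponding partial correlations, R the correlation matrix, n the sample size, and p the number of items.

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}

\chi^{2} = -\left( n - 1 - \frac{2p + 5}{6} \right) \ln \lvert R \rvert ,
\qquad
df = \frac{p(p-1)}{2}
```

When the partial correlations are small relative to the raw correlations, KMO approaches 1. When the items are nearly uncorrelated, |R| approaches 1 and the chi-square statistic approaches 0.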

Because the Teacher Leadership Scale used in this study was developed jointly by an expert team, exploratory factor analysis was conducted in SPSS to examine its structural validity. Principal component analysis extracted four common factors, as shown in Table 6. These factors explain 43.313%, 8.520%, 7.938%, and 6.262% of the variance, respectively, for a cumulative 66.033% of the total variance, which exceeds the commonly used threshold of 60%.

After varimax (maximum variance) rotation, the proportions of variance explained by the four common factors changed to 28.280%, 13.279%, 12.549%, and 11.925%, respectively. Rotation redistributes the explained variance among the factors, but the total remains 66.033%. This result supports the stability and rationality of the scale structure.
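A quick arithmetic check of the reported percentages, using only the figures quoted in the text, shows that the unrotated and rotated contributions sum to the same total:

```python
# Percentages of variance explained, as reported in the text.
unrotated = [43.313, 8.520, 7.938, 6.262]
rotated = [28.280, 13.279, 12.549, 11.925]

# Round to three decimals to absorb floating-point noise.
total_unrotated = round(sum(unrotated), 3)
total_rotated = round(sum(rotated), 3)
```

Both totals come to 66.033, as expected for an orthogonal rotation, which changes how variance is shared among factors but not how much is explained overall.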

The first factor obtained after varimax rotation includes the following items: “I foster a culture of trust, inclusiveness, and reflection, encouraging the exchange of different viewpoints,” “I have clear goals and strategies for cooperation between home and school, facilitating effective interaction,” “I understand students from diverse backgrounds and strive to ensure their equitable development,” “I frequently discuss various educational issues with colleagues and communicate with parents,” “I make efforts to garner more assistance and resources from the parent community to achieve school development goals,” and “My words and actions can influence changes in parents.” The factor loading of each item exceeds 0.5. We therefore label this factor “Educational Synergy,” which corresponds to the educational collaboration dimension. It mainly reflects teachers’ leadership in home-school communication and social influence: fostering a culture of trust, setting clear cooperation goals, addressing individual student differences, communicating effectively with colleagues and parents, and using personal influence to support students’ comprehensive development and the sustainable development of schools.

The second factor obtained after varimax rotation comprises three items: “I emphasize leading students in information utilization, encouraging appropriate online resource utilization,” “I attempt to guide students from different backgrounds and cultures to interact effectively,” and “I attach importance to cultivating students’ understanding of Chinese traditional culture and actively foster their multicultural capabilities.”

The factor loadings for these three items all exceed 0.5, indicating their significant contribution to the second factor. Given the characteristics of these items, we name this factor “Cross-cultural Information Teaching Capability.” This factor primarily reflects teachers’ leadership in using informational tools for cross-cultural teaching. It demonstrates that teachers, in the era of information technology, need not only traditional teaching skills but also leadership in fostering students’ cross-cultural awareness and adaptability to address the challenges of globalization in education.

The third factor obtained after varimax rotation comprises three items: “I have a very good understanding of teacher leadership,” “I possess teacher leadership,” and “Teacher leadership behaviors are present in my school.” The factor loadings for all three items exceed 0.5. Based on the core attributes of these items, we label this factor “Leadership Cognition.” It reveals teachers’ self-perception of their leadership abilities and reflects their general awareness of teacher leadership behavior within the school environment.

Finally, the fourth factor obtained after varimax rotation consists of three items: “I have a clear plan for my future professional development,” “I need to enhance myself through training and peer exchange in my professional development,” and “I believe that research projects are essential and can change educational teaching methods through research.” The factor loadings for these three items all exceed 0.5, indicating their substantial contribution to this factor. Given the shared characteristics of these items, we label this factor “Teacher Guiding Capacity,” which corresponds to the professional guidance dimension. The rotated component matrix is presented in Table 7.

3.7. Analysis of scores on the four-dimensional scale

As indicated in Table 8, the overall average leadership score among primary and secondary school teachers in Xinjiang is 4.099 points. Among the four dimensions, Cross-cultural Information Teaching Capability has the highest average score (4.238 points), while Leadership Cognition has the lowest (3.830 points). Overall, the total leadership score is above 4 points, although Leadership Cognition falls below that level, indicating a generally favorable state of leadership among primary and secondary school teachers in Xinjiang.

Table 8. Detailed Scores of Leadership Levels among Primary and Secondary School Teachers.

https://doi.org/10.1371/journal.pone.0331560.t008

Table 9 provides a detailed display of the scores for each item in the four-dimensional scale, revealing the cognition and emphasis of primary and secondary school teachers on different aspects. Among them, the item “In my own professional development, I need to improve myself through training and peer exchange” in the dimension of “Professional Guidance” scored the highest at 4.23 points. This result indicates that primary and secondary school teachers generally place great importance on opportunities for their professional development and are eager to enhance their professional competence through training and exchange.

Table 9. Detailed Scores of Leadership Levels among Primary and Secondary School Teachers.

https://doi.org/10.1371/journal.pone.0331560.t009

At the same time, we also noticed that in the dimension of “Leadership Cognition,” the scores for each item are not high, with average scores of only 3.84 points for “Understanding Teacher Leadership,” 3.82 points for “Having Leadership,” and 3.83 points for “Teacher Leadership Behavior Existence” items. Further analysis suggests that this might be due to the lack of a deep understanding of leadership concepts and practices among most teachers. Often, leadership is equated with managerial positions or administrative roles, without realizing that every teacher actually possesses potential leadership that can play a significant role in teaching, professional development, and school improvement.

In addition, influenced by traditional educational beliefs, some teachers may tend to position themselves as executors rather than decision-makers or leaders. They may be accustomed to receiving guidance from management rather than actively participating in school management and decision-making, thus limiting their own leadership development.

Regarding cross-cultural information technology teaching ability, the item “Valuing the cultivation of traditional Chinese culture and focusing on developing students’ multicultural abilities” scored 4.36. This demonstrates the commitment of primary and secondary school teachers to inheriting and promoting traditional culture. It also reflects their understanding that, in the context of global informatization, education must both pass on local culture and cultivate students’ global perspective and cross-cultural communication skills. Cultivating such abilities helps students adapt to the challenges and opportunities brought about by globalization and enhances international understanding and cooperation.

3.8. Deep learning

We conducted various data analyses, including reliability analysis and descriptive statistical analysis, on the collected data of leadership among primary and secondary school teachers in Xinjiang. These analyses provided insights into the current status and influencing factors of teacher leadership in the region from multiple perspectives. However, traditional data analysis methods only offer surface-level, linear relationship descriptions and cannot uncover deeper features and patterns behind leadership.

To delve into the characteristics of teacher leadership in Xinjiang, considering leadership as a complex, multi-level, and multi-dimensional concept, we decided to employ deep learning for further research.

Deep learning possesses powerful representation learning capabilities, automatically extracting useful features from raw data and revealing the complex structures behind the data through layered learning. Through the application of deep learning, we can explore and understand teacher leadership in primary and secondary schools in Xinjiang more thoroughly, providing more scientific and precise guidance for teacher leadership development. Deep learning can also assist in predicting teachers’ leadership levels. By training and optimizing deep learning models for teacher leadership, we may predict teachers’ leadership performance in different contexts, offering personalized training and development recommendations.

3.8.1. Data source.

We processed and analyzed the questionnaire data using SPSS 26.0. To ensure the objectivity and accuracy of the data, questionnaires in which the respondent selected the same response throughout or gave uniformly agreeing answers were excluded during data cleaning. This left 371 valid questionnaires.

For an in-depth study of teacher leadership in primary and secondary schools, we collaborated with education experts who have more than 30 years of experience and hold senior professional titles. They labeled the questionnaire data based on teachers’ performance in the five dimensions of professional guidance, home-school cooperation, leadership cognitive ability, social appeal, and cross-cultural informationized teaching ability. Leadership levels were categorized as “strong,” “average,” and “weak.”

After expert annotation, there were 82 teachers (22.1%) classified as “strong leadership,” 98 teachers (26.4%) as “average leadership,” and 192 teachers (51.7%) as “weak leadership” (the dataset is provided in S2 Data). The labeled data were then divided into training and testing sets in a 7:3 ratio, and all models were evaluated with ten-fold cross-validation to measure the classification accuracy of the different network models.
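A minimal sketch of a 7:3 split follows. The text does not say whether the split was stratified by class; stratification is assumed here because the three classes are imbalanced. The function name, the seed, and the toy label counts are all illustrative and are not the study's data.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class keeps roughly the same train share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)

# Toy labels (assumed counts, chosen only to make the arithmetic easy).
labels = ["strong"] * 20 + ["average"] * 30 + ["weak"] * 50
train_idx, test_idx = stratified_split(labels)
```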

3.8.2. Interpretable teacher leadership LSTM.

In this context, we propose the ITL-LSTM (Interpretable Teacher Leadership LSTM) model, designed for analyzing and classifying the level of teacher leadership. The model uses multiple Diagonal BiLSTM units, a structure capable of capturing long-range dependencies in sequential data, which makes it suitable for the ordered items of a questionnaire. With the deep bidirectional LSTM structure, the model reads the sequence in both the forward and backward directions and can therefore relate the various dimensions of leadership to one another. After dimensionality reduction through a global average pooling layer, the model uses fully connected layers for classification output. The network structure is depicted in Fig 1.

Diagonal BiLSTM is a custom LSTM unit using a diagonal weight matrix structure, connecting each time step’s input and hidden state directly. In the leadership questionnaire data, specific questions have a significant impact on leadership levels, and the flexibility of the diagonal weight matrix better captures these features while reducing parameters. By stacking multiple Diagonal BiLSTM units, ITL-LSTM can learn more complex features and abstract representations, adapting to the multi-layer relationships and influencing factors that may exist in the data.

ITL-LSTM demonstrates excellent performance in handling leadership data. Its Diagonal BiLSTM structure is suitable for sequential data, effectively capturing the dynamic changes in leadership dimensions. The use of diagonal weight matrices reduces the number of parameters, enhancing generalization performance. The deep bidirectional LSTM structure accommodates the multi-level characteristics of the data. The application of global average pooling layers strengthens interpretability for leadership features. Overall, ITL-LSTM, with its robust temporal processing, interpretability, and efficiency, proves to have stronger classification capabilities for leadership level data.

3.8.3. Model metrics.

In this study, the performance of the classification model is evaluated using metrics such as accuracy, sensitivity, specificity, and precision. The specific calculation formulas are as follows (1–4) and Table 10:

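\begin{align}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Sensitivity} &= \frac{TP}{TP + FN} \tag{2}\\
\text{Specificity} &= \frac{TN}{TN + FP} \tag{3}\\
\text{Precision} &= \frac{TP}{TP + FP} \tag{4}
\end{align}

Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.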

3.8.4. Deep learning result.

To enhance the robustness of the model and determine the optimal classifier, we adopted ten-fold cross-validation, taking the mean ITL-LSTM performance across the folds as the final criterion. The results, shown in Fig 2 and Table 11, reveal excellent performance by ITL-LSTM in the three-class leadership-level task, with an accuracy of 90.10%, precision of 92.85%, sensitivity of 88.11%, and specificity of 92.90%. Compared to traditional network models (MLP, ANN, CNN, AlexNet, VGG, LSTM), ITL-LSTM achieved higher accuracy by 16.23%, 9.01%, 2.71%, 19.74%, 2.71%, and 4.6%, respectively.
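The reported gaps can be turned into implied baseline accuracies with simple arithmetic. The sketch below assumes the gaps are absolute percentage-point differences rather than relative improvements, which the text does not state; Table 11 remains the authoritative source for the baseline values, and the function name `implied_baselines` is hypothetical.

```python
# Figures quoted in the text; interpreting the gaps as percentage points is an assumption.
ITL_LSTM_ACCURACY = 90.10
REPORTED_GAPS = {
    "MLP": 16.23,
    "ANN": 9.01,
    "CNN": 2.71,
    "AlexNet": 19.74,
    "VGG": 2.71,
    "LSTM": 4.6,
}


def implied_baselines(reference, gaps):
    """Subtract each reported gap from the reference accuracy (in percent)."""
    return {name: round(reference - gap, 2) for name, gap in gaps.items()}


baselines = implied_baselines(ITL_LSTM_ACCURACY, REPORTED_GAPS)
for name, accuracy in baselines.items():
    print(f"{name}: {accuracy:.2f}%")
```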

4. Discussion

This study focuses on the assessment and analysis of teacher leadership among primary and secondary school teachers in Xinjiang, aiming to address the gap in international research concerning leadership practices in multicultural and multi-ethnic educational contexts. While numerous studies have emphasized the significance of teacher leadership in advancing school development and improving student academic outcomes [25], most have been conducted in culturally homogeneous and resource-balanced settings, overlooking the unique dynamics of teacher leadership in pluralistic environments. This research attempts to bridge both theoretical and methodological gaps by integrating empirical investigation with a deep learning–based analytical model.

Regarding the influence of teacher leadership on student development, our findings align with those of Wang et al., affirming that strong teacher leadership contributes positively to students’ academic performance. In the context of Xinjiang, teachers exhibit notable strengths in guidance, organization, and cross-cultural communication, which not only enhance classroom efficiency but also expand students’ global perspectives and cultural tolerance through various intercultural activities. Compared with Chen’s (2022) findings from central China, our study reveals higher scores in the dimensions of “Cross-cultural Instructional Competence” and “Educational Collaboration,” suggesting greater adaptability among Xinjiang teachers to multilingual and multicultural environments.

From a methodological perspective, this study introduces the ITL-LSTM model, designed for dynamic modeling and prediction of teacher leadership. By incorporating a Diagonal BiLSTM structure, diagonal weight matrices, and global average pooling layers, the model effectively captures the temporal features and latent inter-dimensional relationships in leadership data. Compared with traditional static scale-based methods, the ITL-LSTM model not only achieves a classification accuracy of 90.10% but also offers superior interpretability and generalization, providing a more intelligent and precise tool for educational assessment.

A comparative review of existing teacher leadership assessment tools reveals three main limitations. First, current tools primarily rely on static measurement and lack the capacity to capture behavioral dynamics, as seen in the TLI scale [25] and the Collective Leadership tool [26]. Second, most instruments are designed for specific regions or populations and thus lack adaptability to multi-ethnic and multilingual contexts: some are region-specific [24,27], while another targets only higher education [28], neglecting foundational education. Third, most tools are not integrated with advanced technologies such as artificial intelligence or deep learning, limiting their ability to predict and interpret leadership traits dynamically. The ITL-LSTM–based leadership assessment framework developed in this study systematically addresses these gaps by introducing a four-dimensional model encompassing Professional Guidance, Educational Collaboration, Cross-cultural Digital Instructional Competence, and Leadership Cognition. By combining this with a deep learning approach, the tool demonstrates strong scientific validity, contextual relevance, and predictive capacity.

To further enhance the contextual depth of our study, we supplemented our quantitative analysis with qualitative data from teacher interviews in Xinjiang. For instance, an English teacher from Changji noted: “Our students come from diverse backgrounds, and some struggle with Mandarin. I prepare bilingual vocabulary cards in advance and often ask bilingual classmates to help translate, so no one falls behind.” A rural teacher in Kashgar stated: “When resources are tight, whoever can lead will step up, even without an official title. We organize lessons, mentor newcomers—whatever is needed.” These frontline experiences illustrate how teachers interpret the dimensions of “Cross-cultural Instructional Competence” and “Educational Collaboration” in real practice, reinforcing the scale’s structural validity and the model’s predictive reliability. Additionally, a math department head from a key school in Ürümqi remarked: “Many young teachers are capable, but they don’t recognize themselves as leaders. I often remind them in meetings: if you’re guiding a lesson study group, you’re already leading.” This insight highlights the practical relevance of the “Leadership Cognition” dimension, reflecting how teachers gradually construct leadership awareness through practice.

5. Conclusions

This study developed and preliminarily validated a teacher leadership assessment tool tailored for multi-ethnic and multilingual educational environments. Based on survey data from 371 primary and secondary school teachers in Xinjiang, we proposed a four-dimensional framework encompassing Professional Guidance, Educational Collaboration, Cross-cultural Digital Instructional Competence, and Leadership Cognition. We further employed a Diagonal BiLSTM–based deep learning model—ITL-LSTM—to dynamically predict teacher leadership levels, achieving a classification accuracy of 90.10%.

The findings demonstrate that teacher leadership in multicultural educational contexts positively contributes to student motivation, cultural understanding, and classroom engagement. Teachers in Xinjiang, in particular, exhibited strong performance in cross-cultural pedagogy and collaborative practices, highlighting their adaptive capacity and leadership potential in complex educational environments.

From a practical perspective, this study provides education administrators and policymakers with an operational tool for identifying leadership development pathways and optimizing professional growth strategies. At the international level, the proposed assessment framework and modeling approach offer transferable value for teacher leadership research in other multicultural and resource-diverse settings.

Nevertheless, several limitations should be acknowledged. Although Xinjiang serves as a highly representative case, the generalizability of the findings to the national or global level requires further validation. While the study emphasizes the practical dimensions of teacher leadership, data collection relied primarily on quantitative measures, and qualitative insights remain relatively limited. Additionally, the model development focused on tool construction, without fully exploring the dynamic interplay between leadership growth trajectories and institutional contexts. Future research may extend the sample scope to include broader geographic and cultural backgrounds, refine model parameters, and explore its integration into dynamic feedback and intervention systems—thereby supporting equitable education and sustained professional development in a broader context.

Acknowledgments

The authors acknowledge all those who have contributed to this work.

References

  1. Bagwell JL. Exploring the leadership practices of elementary school principals through a distributed leadership framework: A case study. Educational Leadership and Administration: Teaching and Program Development. 2019;30:83–103.
  2. Wenner JA, Campbell T. The theoretical and empirical basis of teacher leadership: A review of the literature. Review of Educational Research. 2017;87(1):134–71.
  3. Gumus S, Bellibas MS, Esen M. A systematic review of studies on leadership models in educational research from 1980 to 2014. Educational Management Administration & Leadership. 2018;46(1):25–48.
  4. Nguyen D, Harris A, Ng D. A review of the empirical research on teacher leadership (2003–2017) Evidence, patterns and implications. Journal of Educational Administration. 2020;58(1):60–80.
  5. Wang M, Ho D. A quest for teacher leadership in the twenty-first century–emerging themes for future research. International Journal of Educational Management. 2020;34(2):354–72.
  6. Printy S, Liu Y. Distributed leadership globally: The interactive nature of principal and teacher leadership in 32 countries. Educational Administration Quarterly. 2021;57(2):290–325.
  7. Liu Y, Bellibaş MŞ, Gümüş S. The effect of instructional leadership and distributed leadership on teacher self-efficacy and job satisfaction: Mediating roles of supportive school culture and teacher collaboration. Educational Management Administration & Leadership. 2021;49(3):430–53.
  8. Collie RJ. COVID-19 and Teachers’ Somatic Burden, Stress, and Emotional Exhaustion: Examining the Role of Principal Leadership and Workplace Buoyancy. AERA Open. 2021;7.
  9. Qian H, Walker A, Bryant DA. Global trends and issues in the development of educational leaders. Handbook of research on the education of school leaders. Routledge; 2016. p. 67–88.
  10. Wang X, Chen J, Yue W, Zhang Y, Xu F. Curriculum Leadership of Rural Teachers: Status Quo, Influencing Factors and Improvement Mechanism-Based on a Large-Scale Survey of Rural Teachers in China. Front Psychol. 2022;13:813782. pmid:35360591
  11. Bray M, Adamson B, Mason M. Comparative education research: Approaches and methods. Springer; 2014.
  12. Miled N. Educational leaders’ perceptions of multicultural education in teachers’ professional development: A case study from a Canadian school district. Multicultural Education Review. 2019;11(2):79–95.
  13. Gümüş S, Beycioglu K. The intersection of social justice and leadership in education: what matters in multicultural contexts? Taylor & Francis; 2020. p. 233–4.
  14. Gao Y. Deep learning-based strategies for evaluating and enhancing university teaching quality. Computers and Education: Artificial Intelligence. 2025;8:100362.
  15. Liu Y, Wang F. Educational quality evaluation model based on deep learning theory. International Journal of High Speed Electronics and Systems. 2025;34(01):2540165.
  16. Shitaya AM, Wahed MES, Ismail A. Predicting student behavior using a neutrosophic deep learning model. Neutrosophic Sets and Systems. 2025;76:288–310.
  17. York-Barr J, Duke K. What do we know about teacher leadership? Findings from two decades of scholarship. Review of Educational Research. 2004;74(3):255–316.
  18. Yuet FKC, Yusof H, Mohamad SIS. Development and validation of the teacher leadership competency scale. In: 2016.
  19. Campbell C, Lieberman A, Yashkina A. Teachers leading educational improvements: Developing teachers’ leadership, improving practices, and collaborating to share knowledge. Leading and Managing. 2015;21(2):90–105.
  20. Heikka J, Halttunen L, Waniganayake M. Perceptions of early childhood education professionals on teacher leadership in Finland. Early Child Development and Care. 2016;188(2):143–56.
  21. Xianguo W, Amirrudin S. Teacher Leadership Research in China (2020–2024): Trends, Gaps, and Future Directions in Educational Development. International Journal of Education and Humanities. 2025;5(2):375–86.
  22. Liu P, Xiu Q, Tang L. Understanding teacher leadership identity: The perspectives of Chinese high school teachers. International Journal of Leadership in Education. 2025;28(1):149–66.
  23. Wang Z, Han L, Zhang L. Exploring educational leadership and teacher ethics: who can lead and what are the key qualities? Ethics & Behavior. 2025;:1–15.
  24. Pazur M. Development and validation of a research instrument for measuring the presence of democratic school leadership characteristics. Educational Management Administration & Leadership. 2022;50(4):613–29.
  25. Chen JJ. Understanding teacher leaders’ behaviours: Development and validation of the Teacher Leadership Inventory. Educational Management Administration & Leadership. 2022;50(4):630–48.
  26. Eckert J, Morgan GB, Padgett RN. Collective Leadership: Developing a Tool to Assess Educator Readiness and Efficacy. Journal of Psychoeducational Assessment. 2022;40(4):533–48.
  27. Darwis A, Bafadal I, Wiyono BB, Sultoni, Malik AR. Designing an assessment tool for teacher leadership competencies in aspiring elementary school principals in Indonesia. Sustainable Futures. 2025;9:100803.
  28. Zhang G, Chen P, Xu S. Developing and validating a scale for measuring sustainable leadership development among teachers in Chinese higher education institutions. Journal of Cleaner Production. 2025;486:144403.
  29. Paramole OC. The impact of artificial intelligence on educational leadership: theoretical frameworks for measurement and evaluation. Jurnal Saintifik (Multi Science Journal). 2025;23(1):47–72.
  30. Clark LA, Watson D. Constructing validity: Basic issues in objective scale development. Washington, DC, US: American Psychological Association; 2016.