
A knowledge tracing approach with dual graph convolutional networks and positive/negative feature enhancement network

  • Jianjun Wang ,

    Contributed equally to this work with: Jianjun Wang, Zongliang Zheng

    Roles Formal analysis, Writing – original draft, Writing – review & editing

    Affiliation School of Fine Arts and Design, Leshan Normal University, Leshan, Sichuan, China

  • Qianjun Tang ,

    Roles Project administration, Writing – original draft, Writing – review & editing

    tangqianjun@lsnu.edu.cn

    Affiliation School of Education Science, Leshan Normal University, Leshan, Sichuan, China

  • Zongliang Zheng

    Contributed equally to this work with: Jianjun Wang, Zongliang Zheng

    Roles Data curation, Software, Writing – review & editing

    Affiliation School of Computer Science and Engineering, Sichuan University of Science & Engineering, Zigong, Sichuan, China

Abstract

Knowledge tracing models predict students’ mastery of specific knowledge points by analyzing their historical learning performance. However, existing methods struggle with handling a large number of skills, data sparsity, learning differences, and complex skill correlations. To address these issues, we propose a knowledge tracing method based on dual graph convolutional networks and positive/negative feature enhancement. We construct dual graph structures with students and skills as nodes, respectively. The dual graph convolutional networks independently process the student and skill graphs, effectively resolving data sparsity and skill correlation challenges. By integrating positive/negative feature enhancement and spectral embedding clustering optimization modules, the model efficiently combines student and skill features, overcoming variations in learning performance. Experimental results on public datasets demonstrate that our proposed method outperforms existing approaches, showcasing significant advantages in handling complex learning data. This method provides new directions for educational data mining and personalized learning through innovative graph learning models and feature enhancement techniques.

Introduction

Knowledge Tracing (KT) [1–4] is a key research area in educational data mining and learning analytics. KT techniques have developed through several stages [5–7], with significant advances from the early Bayesian Knowledge Tracing (BKT) to the more recent Deep Knowledge Tracing (DKT) [8]. BKT has notable limitations, such as assuming the independence of knowledge points and insufficiently accounting for student-specific characteristics. To address these limitations, researchers proposed DKT, which can capture more complex learning patterns and dynamic changes in knowledge states. Although DKT performs well in many respects, it faces its own challenges, such as sensitivity to sequence length and limited model interpretability. More broadly, existing methods still struggle with large-scale skill sets, data sparsity, heterogeneity in learning performance, and complex inter-skill associations.

Knowledge Tracing (KT) faces several critical challenges in modern educational settings. One primary challenge is data sparsity, where students typically attempt only a small fraction of available exercises (often below 1% interaction density), making it difficult to accurately model knowledge states. Additionally, knowledge concepts are inherently interconnected through complex prerequisite and hierarchical relationships, yet traditional KT models often fail to capture these dependencies effectively. The heterogeneous nature of student learning, where individuals demonstrate varying patterns and rates of knowledge acquisition, further complicates the modeling process. Moreover, modern educational platforms must handle thousands of skills and millions of student interactions, necessitating efficient large-scale data processing while maintaining prediction accuracy. These challenges significantly impact the effectiveness of personalized learning systems and necessitate more sophisticated approaches to knowledge tracing.

This study aims to address these challenges by proposing a novel knowledge tracing prediction method based on dual graph convolutional networks and a positive/negative feature enhancement network. In recent years, Graph Neural Networks (GNNs) have excelled in processing graph-structured data, especially Graph Convolutional Networks (GCNs) [9,10]. GCNs not only excel in performance, but their intrinsic interpretability also allows researchers and engineers to better understand and optimize models. The application of GCN to KT [11,12] not only captures students’ knowledge mastery status more accurately but also flexibly handles complex learning relationships and multi-source data, significantly improving the performance and applicability of KT models. The rise of GCNs has provided a new approach, enabling us to further enhance the effectiveness of KT by modeling the relationships among students’ knowledge points. Therefore, the research motivation of this paper is to explore the application of GCNs to KT, with the expectation of improving the prediction of students’ knowledge status by capturing the complex relationships between knowledge points.

This study aims to address the challenges faced by existing KT models in dealing with large-scale skill sets, data sparsity, heterogeneity in learning performance, and complex inter-skill associations. To this end, we propose a knowledge tracing prediction method based on dual graph convolutional networks and a positive/negative feature enhancement network. First, to address the diversity of student behaviors and the complexity of skill relationships, we design a novel dual graph convolutional network (dual-GCN) structure that simultaneously constructs graph networks from both the student and skill dimensions. Additionally, to portray complex inter-skill correlations more accurately and enhance the model's predictive ability, we innovatively utilize both students' correct and incorrect response data to achieve fused enhancement of positive and negative features. This approach effectively improves the model's comprehensive understanding of the learning process.

The contributions of this paper include the following:

  1. We propose a knowledge tracing model based on the dual graph convolutional network and positive/negative feature enhancement network for predicting students’ knowledge mastery. Its superior performance on multiple educational datasets is experimentally verified, demonstrating improved prediction accuracy and robustness to support personalized teaching and resource optimization.
  2. We propose a dual graph convolutional network that simultaneously constructs graph networks from both student and skill dimensions, enhancing the correlation between similar nodes through graph convolution operations. This structure effectively alleviates the data sparsity problem, improves the processing capability for large-scale skill sets, and better accommodates differences in student learning.
  3. We propose a positive/negative feature enhancement network to bootstrap data from both positive and negative response perspectives. This network innovatively combines students’ correct and incorrect response data to comprehensively capture their knowledge status.

Related work

Knowledge tracing based on traditional learning

Traditional knowledge tracing methods rely on statistical and probabilistic models, typically including Bayesian Knowledge Tracing (BKT) [1,13], mixed-effects models, Learning Factor Analysis (LFA) [14], and Dynamic Bayesian Networks (DBN) [15]. Bayesian models combine Bayes' theorem with prior knowledge and observational data to make inferences and predictions. The traditional BKT model assumes that all students share the same parameters when learning a knowledge point, ignoring individual differences; in practice, students' learning abilities and rates can vary greatly. Building on BKT, Performance Factor Analysis (PFA) [16] takes into account the effect of topic difficulty and the correlations between knowledge points, and offers good interpretability and extensibility. Similarly, Pardos et al. [17] improved the KT model by introducing Bayesian networks and personalization parameters, enhancing its applicability in education. However, these approaches have limitations in handling complex associations among skills and dynamic learning environments.

Knowledge tracing based on deep learning

To overcome the shortcomings of traditional KT methods, more and more researchers have explored deep learning-based KT models from different perspectives in recent years. KT methods based on deep learning [18,19] take more factors into account, such as the relationships between knowledge points and the consistency of the learning process, by introducing techniques such as the attention mechanism [20–22] and graph neural networks, thereby achieving more accurate student modeling and performance prediction. These methods embody some of the recent advances and ideas in knowledge tracing research.

The most typical deep knowledge tracing model applies a recurrent neural network (RNN) to capture the temporal dynamics of a series of interactions between students’ questions and answers. For example, DKT-DSC [23] demonstrated how deep learning techniques and dynamic classification methods can be used to improve knowledge tracing models to more accurately predict student performance and provide strong support for personalized education. DKVMN [24] proposed a knowledge tracing model based on a memory-enhanced neural network to store and retrieve knowledge states through a key-value memory mechanism. SAKT [20] applied the self-attention mechanism to KT, capturing key information in students’ answer history through attention weights. These deep models have superior performance but neglect explicit modeling of learning curve theory. Similarly, some researchers have explored feature engineering approaches. Xu et al. [25] introduced feature crosses to capture interactions between educational elements (like student-problem pairs), providing more comprehensive modeling of student learning patterns. CAKT [26] combines learning curve theory with deep learning to enhance the performance of KT models and proposes a new approach with practical applications.

Graph-based knowledge tracing

Inspired by the representational capabilities of graph learning techniques, such as Graph Neural Networks (GNNs) [27–29], KT models based on deep learning have begun to leverage graph learning techniques to fully exploit the rich structural information in graphs and flexibly model the relationships between problems and skills [30,31]. GNNs represent questions or concepts as nodes and their relationships as edges, demonstrating advantages in modeling student learning paths and knowledge mastery [32].

Early graph-based models, such as GIKT [33] and JKT [34], demonstrated how GNNs could effectively utilize knowledge structure information [35] and model relationships between questions and knowledge points. The CMKT [36] model advanced this approach by using educational concept graphs, enhancing learner modeling and addressing data sparsity issues.

Recent innovations have focused on several key directions: (1) Interaction patterns and contrastive learning, exemplified by Bi-CLKT [37], which uses dual graph structures to simultaneously represent knowledge concepts and student learning patterns. (2) Attention mechanisms introduced through knowledge structure-aware graph attention networks [38], better capturing hierarchical relationships between concepts. (3) Heterogeneous graph networks, such as Sun et al. [39]’s tri-view contrastive learning approach based on weighted heterogeneous graphs for knowledge tracing in personalized e-learning systems, which improved tracking accuracy and robustness by integrating multi-view information of students and knowledge points through contrastive learning strategies.

The field has also made significant progress in incorporating educational theories and prior knowledge into graph-based models. Recent research has focused on combining exercise and prior knowledge differences, while dynamic cognitive diagnosis methods enhance deep knowledge tracing through educational priors. These developments demonstrate the field’s evolution toward more sophisticated and practical models that strike a balance between data-driven approaches and established educational theories.

The proposed method

Extracting effective information from KT-related data is a challenging task. To cope with these difficulties, as shown in Fig 1, this study proposes a knowledge tracing method based on a dual graph convolutional network that combines skill information and student response information. The proposed model is divided into three parts: the Dual-GCN module, the clustering-optimized module, and the P/N-FEN module. A and B denote the adjacency matrices constructed for the input data S from the skill and student perspectives, respectively. The skill-GCN input features are obtained by transposing the student-GCN input features. The transposed features generated by the skill-GCN are weighted and fused with the features generated by the student-GCN to obtain embedded features. The P/N-FEN, combined with spectral embedding clustering optimization, is then applied to guide the model to learn features more efficiently.

Fig 1. The proposed model, which is divided into three parts.

The Dual Graph Convolutional Network (Dual-GCN) module, the Clustering-Optimized module, and the Positive/Negative Feature Enhancement Network (P/N-FEN) module. A and B represent the two adjacency matrices, respectively.

https://doi.org/10.1371/journal.pone.0317992.g001

The method takes advantage of the graph structure to fully explore the correlation between students and skills, while combining students’ correct response and incorrect response data to improve the predictive ability and interpretability of the model, and thus provide personalized learning support.

Preprocessing

Feature processing.

Based on the Assistments dataset, to process student skills and response data, we designed and generated three matrices (Skill Matrix: S, Response Matrix: R, and Error Response Matrix: E) to provide input and labels for subsequent knowledge tracing models. The specific steps are as follows:

  1. Skill Matrix S: $S \in \mathbb{R}^{n \times s}$, where each row corresponds to a student sample $x_i$, $i \in \{1, \dots, n\}$. Here, n is the number of student samples, and s is the number of skills. For each sample $x_i$, traverse its skill sequence and construct the skill matrix S, where each row represents a sample and each column represents a specific skill. If student i has used skill j p times, then $S_{ij} = p$. The skill matrix S can accurately reflect each student's mastery and application of the various skills.
  2. Extracting Response Matrix R: Construct a response matrix R with the same structure as the skill matrix S. For each sample $x_i$, traverse its response sequence. If the corresponding response value is correct (1), set the value at the corresponding position of matrix R to 1.
  3. Extracting Error Response Matrix E: Construct an error response matrix E with the same structure as the skill matrix S. For each sample $x_i$, traverse its response sequence. If the corresponding response value is incorrect (0), set the value at the corresponding position of matrix E to 1.
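The three matrices above can be sketched in a few lines. The toy log format and its values here are hypothetical and do not reflect the actual Assistments CSV schema:

```python
import numpy as np

# Hypothetical toy log: each student is a list of (skill_id, correct) pairs.
logs = [
    [(0, 1), (1, 0), (0, 1)],   # student 0
    [(1, 1), (2, 0)],           # student 1
]
n, s = len(logs), 3             # n students, s skills

S = np.zeros((n, s))            # S[i, j] = number of times student i used skill j
R = np.zeros((n, s))            # R[i, j] = 1 if student i answered skill j correctly
E = np.zeros((n, s))            # E[i, j] = 1 if student i answered skill j incorrectly

for i, seq in enumerate(logs):
    for skill, correct in seq:
        S[i, skill] += 1
        if correct == 1:
            R[i, skill] = 1
        else:
            E[i, skill] = 1
```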

Graph structure processing.

Adjacency matrices are used to represent the structure of graphs. In this study, we construct two adjacency matrices: an adjacency matrix A built from the relationships between skills, and an adjacency matrix B based on student feature similarity, constructed by applying the K-Nearest Neighbors (KNN) method to the skill matrix S. Through these two adjacency matrices, we can more comprehensively capture the interrelationships between skills in the learning process and the similarities between students. The specific steps are as follows:

1. Adjacency Matrix Based on Skill Similarity

First, create an adjacency matrix A of size (s, s), with all initial values set to zero. If two skills i and j appear simultaneously in the same student's skill sequence, set the corresponding positions $A_{ij}$ and $A_{ji}$ in the adjacency matrix to 1, indicating a connection between these two skills. To further enhance the expressiveness of the adjacency matrix, we count the co-occurrence frequency of skills, calculate weights based on this frequency, and record them in A.

$A_{ij} = \mathrm{freq}(i, j)$ (1)

Here, $A_{ij}$ represents the strength of the association between skill i and skill j, and $\mathrm{freq}(i, j)$ is their co-occurrence frequency across students' skill sequences.
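A minimal sketch of the co-occurrence construction described above; the skill sequences are illustrative, and counting one co-occurrence per student who uses both skills is an assumed reading of the weighting:

```python
import numpy as np

# Hypothetical skill sequences per student; skills co-occurring in the same
# student's sequence are linked, weighted by how many students share them.
sequences = [[0, 1, 0], [1, 2], [0, 1]]
s = 3
A = np.zeros((s, s))

for seq in sequences:
    skills = set(seq)           # distinct skills used by this student
    for i in skills:
        for j in skills:
            if i != j:
                A[i, j] += 1    # symmetric co-occurrence count
```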

2. Adjacency Matrix Based on Student Similarity

First, create a matrix B of size (n, n), with all initial values set to zero. Use the row vectors of the skill matrix S to represent each student's skill usage, treating each row as that student's feature vector. Calculate the similarity between students and, based on these similarities, identify the k most similar students for each student. In the adjacency matrix B, set the corresponding positions $B_{ij}$ and $B_{ji}$ to 1, indicating a connection between student i and student j.

$B = \mathrm{KNN}(S, k)$ (2)

Here, $B_{ij}$ represents the strength of the association between student i and student j, and $\mathrm{KNN}(\cdot)$ denotes the K-nearest-neighbor method used to construct the topology graph.
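The KNN-based student graph can be sketched as follows. The cosine-similarity choice and the toy matrix are assumptions, since the paper does not fix the similarity measure:

```python
import numpy as np

# Hypothetical skill-usage matrix S (rows = students); B links each student to
# its k most similar peers by cosine similarity, then is symmetrized.
S = np.array([[2., 1., 0.],
              [2., 1., 0.],
              [0., 0., 3.]])
n, k = S.shape[0], 1

unit = S / np.linalg.norm(S, axis=1, keepdims=True)
sim = unit @ unit.T                    # cosine similarity between students
np.fill_diagonal(sim, -np.inf)         # exclude self from neighbor search

B = np.zeros((n, n))
for i in range(n):
    for j in np.argsort(sim[i])[-k:]:  # indices of the k nearest neighbors
        B[i, j] = 1
        B[j, i] = 1                    # keep B symmetric
```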

Proposed model

Graph Convolutional Networks (GCNs) serve as a powerful tool with significant advantages in processing graph-structured data. Particularly in the field of knowledge tracing, GCNs can effectively capture and utilize potential similarities and commonalities among students when answering questions or using skills.

As shown in Fig 1, we propose an innovative dual graph convolutional network module to capture the relationships between students and skills more comprehensively. Our dual-GCN approach simultaneously models both student-centered and skill-centered graph structures, revealing complex interactions that single-graph methods may overlook. Specifically, we construct two graph convolutional networks: skill-GCN and student-GCN.

The dual-GCN can be intuitively understood as two complementary perspectives of the learning process. In skill-GCN, skills serve as nodes with students as node features, where frequently co-mastered skills are positioned closer together in the space. When a student masters a skill, the model can utilize this spatial relationship to better predict their performance on related skills. In student-GCN, students serve as nodes with skills as node features, where students who master similar skills are placed closer together. When one student improves in certain skills, this improvement can influence the predictions for other students with similar skill sets. Through GCN encoding of these two types of nodes, we can simultaneously capture the associations between skills and the similarities between students, thereby improving overall prediction accuracy.

To reduce the impact of different node degrees on feature aggregation, we apply symmetric normalization to the adjacency matrices A and B of skill-GCN and student-GCN during the training process.

$H_A = \sigma\!\left(\tilde{D}_A^{-1/2} \tilde{A} \tilde{D}_A^{-1/2} S^{\top} W_A + b_A\right)$ (3)

$H_B = \sigma\!\left(\tilde{D}_B^{-1/2} \tilde{B} \tilde{D}_B^{-1/2} S W_B + b_B\right)$ (4)

where $H_A$ and $H_B$ represent the features generated by the dual-GCN from A and B, respectively. $\tilde{A} = A + I$ is the adjacency matrix with self-loops, and $\tilde{D}_A$ denotes the degree matrix of $\tilde{A}$; the same applies to B. $S^{\top}$ is the transpose of S. $W_A$, $W_B$, $b_A$, and $b_B$ represent weights and bias terms, respectively, and $\sigma(\cdot)$ denotes the activation function.
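The symmetric normalization applied to both adjacency matrices can be sketched as a small helper (the 2-node toy graph is illustrative):

```python
import numpy as np

def sym_normalize(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, the renormalized adjacency used by GCNs."""
    A_tilde = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))  # degree^{-1/2} of A_tilde
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

A = np.array([[0., 1.],
              [1., 0.]])
A_hat = sym_normalize(A)   # every entry becomes 0.5 for this toy graph
```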

To obtain enriched node feature representations, we use a feature-weighted fusion method to combine the features from the two GCNs, enhancing the interpretability of the model. Weighted summation balances the contributions of the two feature matrices in the final representation: by adjusting the weight $\alpha$, we control the relative importance of the skill-GCN and student-GCN output features in the fused features.

$Z = \alpha H_A^{\top} + (1 - \alpha) H_B$ (5)

where $\alpha \in [0, 1]$, $H_A$ is the output of the skill-GCN, and $H_B$ is the output of the student-GCN.
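A minimal sketch of the weighted fusion; the shapes are hypothetical (skill-GCN output is transposed so that both terms align along the student dimension), and the random values stand in for real GCN outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 5, 3
H_skill = rng.random((s, n))     # skill-GCN output: one row per skill
H_student = rng.random((n, s))   # student-GCN output: one row per student
alpha = 0.5                      # fusion weight (hyperparameter)

# Fused embedding: transpose the skill features to (n, s) before mixing.
Z = alpha * H_skill.T + (1 - alpha) * H_student
```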

In the clustering optimization network, we use spectral embedding techniques to generate low-dimensional node embeddings from the eigenvectors of the graph’s Laplacian matrix, preserving the structural information between nodes.

Spectral embedding clustering can be understood as a process of dimensionality reduction that preserves the essential relationships between nodes. Like arranging students in a classroom based on their learning patterns, this technique organizes nodes in a lower-dimensional space where similar nodes are grouped together. This grouping helps identify natural learning patterns and skill relationships that might not be immediately apparent in the original data.

This method helps us identify natural groups among students and skills, gaining deeper insights into the differences in skill mastery and learning behavior among different student groups. Spectral embedding optimization generates compact and highly distinguishable feature vectors, effectively clustering similar nodes together and improving the model’s accuracy in handling diverse student data.

$L_{clu} = \mathrm{Tr}\!\left(Z^{\top} L Z\right)$ (6)

where Z is the matrix composed of the embedding vectors of all nodes, and $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix.
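The spectral embedding objective can be sketched directly with Laplacian eigenvectors: the eigenvectors for the smallest eigenvalues minimize the trace while satisfying the orthogonality constraint. The 3-node similarity graph here is illustrative:

```python
import numpy as np

# Toy similarity graph: nodes 0 and 1 connected, node 2 isolated.
W = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
D = np.diag(W.sum(axis=1))
L = D - W                              # unnormalized graph Laplacian

eigvals, eigvecs = np.linalg.eigh(L)   # ascending eigenvalues, orthonormal vectors
k = 2
Z = eigvecs[:, :k]                     # embedding minimizing Tr(Z^T L Z) with Z^T Z = I
```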

To avoid the rotational-symmetry problem of the solution and ensure orthogonality among the vectors, we impose an orthogonal constraint on the embedding nodes:

$Z^{\top} Z = I$ (7)

where, for a graph G, the Laplacian matrix L is defined as L = D − W, with D being the degree matrix and W the adjacency matrix constructed based on the similarity of the embedded nodes Z.

To further encourage the model to fully explore potential association patterns between the two nodes, we designed a positive and negative feedback module to provide supervisory signals. Unlike traditional methods that focus solely on correct answers, our P/N-FEN innovatively leverages both correct and incorrect responses. This approach creates a more nuanced representation of student knowledge states, capturing subtle aspects of the learning process that are often overlooked.

Specifically, for the feature embeddings extracted earlier, we designed a positive feedback classifier for the response matrix R and a negative feedback classifier for the incorrect response matrix E. Through the joint application of positive and negative feedback classifiers, the model is effectively encouraged to explore more potential associations. The positive feedback classifier uses the response matrix R as positive labels, while the negative feedback classifier uses the incorrect response matrix E as negative labels.

$L_{pos} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{c=1}^{s} y_{ic} \log(p_{ic})$ (8)

$L_{neg} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{c=1}^{s} \hat{y}_{ic} \log(\hat{p}_{ic})$ (9)

where s is the number of skills. In the positive feedback classifier, $y_{ic}$ is the label of the i-th sample for the c-th skill category (taken from R), and $p_{ic}$ is the probability predicted by the model that the i-th sample belongs to the c-th skill label. In the negative feedback classifier, $\hat{y}_{ic}$ is the label of the i-th sample for the c-th skill category (taken from E), and $\hat{p}_{ic}$ is the corresponding predicted probability.
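The two classifier losses can be sketched as averaged cross-entropy terms over the skill categories. The label and probability values are made up for illustration, and reading both losses as the same cross-entropy form with different label matrices is an assumption based on the description:

```python
import numpy as np

def feedback_ce(labels, probs, eps=1e-9):
    """-(1/n) * sum_i sum_c y_ic * log(p_ic): cross-entropy over skill categories."""
    n = labels.shape[0]
    return -np.sum(labels * np.log(probs + eps)) / n

# Hypothetical labels and classifier outputs for n = 2 samples, s = 2 skills.
R = np.array([[1., 0.], [0., 1.]])        # correct-response labels
E = np.array([[0., 1.], [1., 0.]])        # incorrect-response labels
p_pos = np.array([[0.9, 0.1], [0.2, 0.8]])
p_neg = np.array([[0.1, 0.9], [0.8, 0.2]])

loss_pos = feedback_ce(R, p_pos)          # positive feedback classifier loss
loss_neg = feedback_ce(E, p_neg)          # negative feedback classifier loss
```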

Model training

The proposed model framework mainly consists of three modules: the dual-GCN (including skill-GCN and student-GCN), the cluster-optimized module, and the P/N-FEN module. Therefore, the total loss function L of the model is:

$L = \varepsilon L_{pos} + \rho L_{neg} + \lambda L_{clu}$ (10)

where λ, ε, and ρ are hyperparameters, $L_{clu}$ is the clustering optimization loss function, $L_{pos}$ is the positive feedback classifier loss function, and $L_{neg}$ is the negative feedback classifier loss function.
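The total loss can be sketched as a simple weighted sum of the three module losses; treating each hyperparameter as a plain multiplicative weight is an assumption based on the stated roles of λ, ε, and ρ:

```python
def total_loss(l_clu, l_pos, l_neg, lam=1.0, eps_w=1.0, rho=1.0):
    """Weighted sum of clustering, positive, and negative feedback losses."""
    return lam * l_clu + eps_w * l_pos + rho * l_neg

# Example: equal weights simply add the three losses together.
l = total_loss(1.0, 2.0, 3.0)
```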

Experiments

Dataset

To validate the proposed model, we conducted experiments on three public datasets [40–42] (Assistments 2009, Assistments 2012, Assistments 2017).

We processed the raw data by first reading the CSV files and removing samples containing NaN values. Then, we reordered the student IDs based on the ‘studentID’ column and reordered the skills or mapped the skill names to new skill IDs based on the ‘skill’ column. Finally, we grouped the data by student ID. As shown in Table 1, the number of student samples and the number of skill types are displayed.

Experimental setup

In this experiment, we used three different educational datasets: Assistments 2009, Assistments 2012, and Assistments 2017. These datasets represent student learning behaviors and knowledge mastery in different years, providing high representativeness and diversity. To scientifically evaluate the model’s performance, we split each dataset into 80% training and 20% testing data to ensure consistency in the distribution of the training and testing data.
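The 80/20 split can be sketched as follows; splitting at the student level and the fixed seed are assumptions about the exact protocol:

```python
import random

def split_students(student_ids, train_ratio=0.8, seed=0):
    """Shuffle students and return (train, test) ID lists in an 80/20 split."""
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

train_ids, test_ids = split_students(range(10))
```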

To ensure the reliability and reproducibility of the experimental results, we ran each experiment five times independently and reported the average performance metrics.

Performance comparison with different methods

To verify the effectiveness of the proposed model, we compared it with BKT [1], IRT [43], DKT [8], DKT-DSC [23], SKVMN [44], DTransformer [45], and KT-Bi-GRU [46].

As shown in Table 2, our model demonstrated excellent performance across all three datasets. Particularly on the largest dataset, Assistments 2012 (28,834 students, 198 skills), the model achieved significant advantages in both AUC (0.733) and RMSE (0.354), indicating its effective utilization of large-scale data for learning. On the smaller Assistments 2017 dataset (1,709 students, 102 skills), the model achieved the highest AUC (0.790) but showed relatively higher RMSE (0.450), suggesting strong predictive ranking ability but room for improvement in absolute value prediction on smaller datasets.

Table 2. Performance comparison of the proposed method with existing methods on different datasets.

https://doi.org/10.1371/journal.pone.0313772.t002

Analyzing the model structure, our proposed dual graph convolutional model captures skill relationships through skill-GCN and models student similarities through student-GCN, with this multidimensional feature extraction method showing good adaptability across datasets of different scales. Compared to traditional methods (such as BKT and IRT), our model can handle more complex nonlinear relationships; compared to other deep learning methods (such as DKT and SKVMN), our approach is more comprehensive in modeling student and skill relationships. The model also incorporates spectral embedding techniques for clustering optimization and employs positive and negative feedback classifiers, with these innovative designs collectively enhancing prediction accuracy.

Compared to recent advanced methods (such as DTransformer and KT-Bi-GRU), our model based on dual-GCN and P/N-FEN has demonstrated strong competitiveness. The model performs optimally with sufficient data (as in Assistments 2012) while maintaining stable performance with less data (as in Assistments 2017). Particularly in terms of AUC metrics, the model maintains high performance levels (0.732–0.790) across datasets of different scales, fully reflecting its adaptability and stability in complex learning environments. Although other models may perform better in specific cases (such as RMSE on Assistments 2017), our model maintains overall competitiveness and shows clear advantages on large-scale datasets.

Ablation study of loss function

To verify the effectiveness of the extraction modules and loss functions in our proposed model, we conducted ablation experiments on three Assistments datasets of different scales. Table 3 shows the impact of different loss function combinations on the model’s performance.

The experimental results indicate that using the positive feedback classifier loss alone yields the worst performance, suggesting that considering only the students’ correct responses is insufficient. The inclusion of the clustering loss significantly improves the model’s performance, particularly in the AUC metric. This demonstrates the importance of spectral embedding techniques in capturing the structural relationships between students and skills. The combination of the positive feedback and negative feedback classifier losses further enhances the model’s performance, especially in the RMSE metric. This indicates that considering both the students’ correct and incorrect responses provides a more comprehensive learning signal, helping the model to predict students’ knowledge states more accurately. Finally, the complete model achieves the best or nearly best performance across all datasets. This confirms the effectiveness of our proposed multi-objective learning framework, which effectively leverages clustering information, positive feedback, and negative feedback signals, thus performing well on datasets of varying scales and characteristics.

These results not only validate the necessity of each component in our proposed model but also showcase the synergistic effects between them. By combining GCN, spectral embedding clustering, and dual feedback classifiers, our model can more comprehensively capture student-skill relationships, thereby achieving superior performance in KT tasks.

Parameter analysis

Sensitivity analysis of learning rate.

To comprehensively evaluate the performance characteristics and stability of our proposed model based on dual-GCN and P/N-FEN, we conducted a detailed analysis of key hyperparameters. This section focuses on two core hyperparameters: the learning rate and the weighting of the P/N-FEN’s loss function.

The learning rate is one of the key hyperparameters affecting the model's convergence speed and final performance. We systematically experimented with values from the set {0.001, 0.005, 0.01, 0.05, 0.1}, as shown in Fig 2. We observed a balance point between the learning rate and model performance: a lower learning rate may slow convergence, while a higher learning rate may destabilize the training process and even degrade model performance.

Fig 2. The influence of the learning rate (lr) on AUC across the three datasets.

https://doi.org/10.1371/journal.pone.0317992.g002

Sensitivity analysis of the weights in the P/N enhancement loss function.

In the enhanced positive/negative loss function, the weight parameters ε and ρ are crucial for balancing the model's attention between correct and incorrect responses. To thoroughly investigate the impact of these two hyperparameters on model performance, we fixed the remaining loss weight and searched for the most suitable combination of ε and ρ. We systematically conducted experiments over a range of ε/ρ ratios, with the experimental setup and results shown in Fig 3.

Fig 3. Impact of the ε and ρ hyperparameters on model performance.

(a) shows the impact on the model's AUC performance, (b) shows the impact on the model's RMSE performance.

https://doi.org/10.1371/journal.pone.0317992.g003

In this experiment, the hyperparameters ε and ρ control the weights of the positive feedback loss function and the negative feedback loss function in the total loss, respectively, aiming to balance the model's attention to correct and incorrect responses. The results show that different ε/ρ ratios significantly affect AUC and RMSE across datasets. For the Assistments 2009 and 2012 datasets, AUC reached 0.732 and 0.733 respectively at the best ratio, with the lowest RMSE at 0.415 and 0.354. For the Assistments 2017 dataset, AUC peaked at 0.790 with an RMSE of 0.445 when the negative feedback weight was larger, indicating that a higher weight on negative feedback is more suitable for this dataset. Most datasets achieved optimal performance with moderate ratios (between 0.5 and 1.5), while extreme values of ε/ρ generally led to degraded model performance. This suggests the importance of maintaining balanced attention between correct and incorrect responses; properly adjusting the ε/ρ ratio is therefore crucial for achieving optimal performance across different datasets.

Robustness test.

To validate the robustness of the model under different data distributions and noise conditions, we introduced varying levels of label noise (i.e., random perturbations of correct and incorrect responses) across three datasets: Assistments 2009, Assistments 2012, and Assistments 2017. Specifically, we randomly altered 5%, 10%, and 15% of student response records in the test set. The results are shown in Fig 4.
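The noise-injection procedure above amounts to flipping a fixed fraction of binary response labels; the sketch below illustrates it with `inject_label_noise`, an illustrative helper name rather than a function from our released code.

```python
import numpy as np

def inject_label_noise(responses, noise_ratio, seed=0):
    """Flip a random noise_ratio fraction of 0/1 response labels in the test set."""
    rng = np.random.default_rng(seed)
    responses = np.asarray(responses).copy()
    n_flip = int(round(noise_ratio * len(responses)))
    idx = rng.choice(len(responses), size=n_flip, replace=False)
    responses[idx] = 1 - responses[idx]                # correct <-> incorrect
    return responses

clean = np.ones(100, dtype=int)
noisy_levels = {r: inject_label_noise(clean, r) for r in (0.05, 0.10, 0.15)}
```

Evaluating the trained model on each `noisy_levels[r]` variant of the test set yields the AUC-versus-noise curves reported in Fig 4.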

As shown in Fig 4, the AUC values for all datasets exhibit a declining trend as the proportion of label noise increases. However, even with 15% noise, the model’s AUC remains above 0.690 across all three datasets. This demonstrates the strong robustness of our model, which can effectively resist noise interference in the data and maintain high predictive performance.

Conclusion

With the popularization of online learning, research on knowledge tracing has gained increasing attention. Existing methods have limitations in handling large numbers of skills, sparse interaction data, diverse learning performance, and complex associations between skills. To address these problems, we propose a knowledge tracing prediction method based on dual graph convolutional networks and a positive/negative feature enhancement network. We construct two graph structures, with students and skills as nodes respectively; by building relationship graphs over students and skills, the model can more accurately capture and predict students’ learning behaviors and knowledge mastery levels. Meanwhile, we use positive and negative labels to train the model. Experimental results show that, compared with existing methods, our approach is superior and promising for handling complex learning data. However, the method presented in this paper still has certain limitations: it relies primarily on students’ skill-usage frequency and response data, and does not fully consider students’ personalized learning paths and background information. Future research can introduce more diverse features and optimize the graph convolutional network structure to further improve the model’s performance and adaptability.
