Fig 1.
The workflow of the proposed approach.
Fig 2.
The overall framework of the proposed approach.
Embedding layers: Initially, each word in the text is converted into a numerical vector using a pre-trained word embedding model, such as GloVe [45]. Input to LSTM: The numerical vectors from the word embeddings serve as the input to the LSTM model. Each vector represents a single time step in the sequence, with the sequence length being equal to the number of words in the input text.
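The embedding-to-LSTM flow described in this caption can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the toy vocabulary, the random vectors standing in for pre-trained GloVe embeddings, the dimensions `EMBED_DIM` and `HIDDEN_DIM`, and the helper names (`embed`, `lstm_step`, `run_lstm`) are all assumptions made for illustration. The sketch only shows that each word becomes one vector, and each vector is consumed as one time step.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4   # toy size (assumption); pre-trained GloVe vectors are much larger
HIDDEN_DIM = 3  # hypothetical LSTM hidden size

# Toy embedding table standing in for pre-trained GloVe vectors.
# The values are random placeholders, not real GloVe weights.
VOCAB = ["the", "student", "answered", "correctly", "<unk>"]
EMBEDDINGS = {w: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in VOCAB}


def embed(text):
    """Map each whitespace-separated word to its vector; unknown words use <unk>."""
    return [EMBEDDINGS.get(w, EMBEDDINGS["<unk>"]) for w in text.lower().split()]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_gate():
    """Random weights for one gate: HIDDEN_DIM rows over [input; hidden], plus a zero bias."""
    rows = [[random.uniform(-0.5, 0.5) for _ in range(EMBED_DIM + HIDDEN_DIM)]
            for _ in range(HIDDEN_DIM)]
    return rows, [0.0] * HIDDEN_DIM


GATES = {name: init_gate() for name in ("input", "forget", "output", "cell")}


def affine(gate, xh):
    """Compute W @ xh + b for the named gate."""
    rows, bias = GATES[gate]
    return [sum(w * v for w, v in zip(row, xh)) + b for row, b in zip(rows, bias)]


def lstm_step(x, h, c):
    """One standard LSTM cell update for a single time step."""
    xh = x + h  # concatenate the input vector with the previous hidden state
    i = [sigmoid(v) for v in affine("input", xh)]
    f = [sigmoid(v) for v in affine("forget", xh)]
    o = [sigmoid(v) for v in affine("output", xh)]
    g = [math.tanh(v) for v in affine("cell", xh)]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def run_lstm(text):
    """Embed the text, then feed one word vector per time step into the LSTM cell."""
    sequence = embed(text)
    h = [0.0] * HIDDEN_DIM
    c = [0.0] * HIDDEN_DIM
    for x in sequence:
        h, c = lstm_step(x, h, c)
    return sequence, h


sequence, final_hidden = run_lstm("The student answered correctly")
print(len(sequence), "time steps; final hidden state has", len(final_hidden), "units")
```

The sequence length equals the number of words in the input, matching the caption. In practice a deep learning framework's embedding and LSTM layers would replace these hand-written functions.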
Fig 3.
The overall structure of the proposed Informer model.
Fig 4.
The suggested pipeline utilizes a probability attention method.
L denotes the length of the Conv1d operation, and k is the number of feature maps produced in each attention module.
Table 1.
The details of the publicly available dataset.
Table 2.
The hyper-parameter settings used in this study.
Fig 5.
The accuracy and loss curves of the proposed approach in the training process.
Table 3.
Comparison between state-of-the-art methods and the proposed models on the datasets.
Fig 6.
The ROC curves of the proposed approach on the Assistments2009, Assistments2017, and EdNet datasets.
Fig 7.
The superiority of the proposed approach compared to the current state-of-the-art algorithms.
The proposed strategy has demonstrated higher performance compared to other methods in terms of accuracy and AUC on three distinct datasets.
Table 4.
Comparison between state-of-the-art methods and the proposed models on the datasets.
Fig 8.
The proposed approach includes two alternative architectures for the Bi-LSTM models: the Bi-LSTM model (top) and the Bi-LSTM with a single Informer network (bottom).
Fig 9.
The comparison results between the LSTKT models with two different architectures.
Fig 10.
The relationship between different early-stopping settings and the corresponding accuracy of the LSTKT model on the EdNet dataset.
Table 5.
Dropout rate ablation study results on the EdNet dataset.