
Optimizing multi-label student performance prediction with GNN-TINet: A contextual multidimensional deep learning framework

  • Xiaoyi Zhang,

    Roles Conceptualization, Data curation, Methodology, Validation, Visualization, Writing – original draft

    Affiliation College of Liberal Arts and Science, University of Illinois Urbana-Champaign, Urbana, IL, United States of America

  • Yakang Zhang,

    Roles Conceptualization, Funding acquisition, Investigation, Resources, Writing – original draft

    Affiliation Industrial Engineering and Operations Research Department, Columbia University, New York, NY, United States of America

  • Angelina Lilac Chen,

    Roles Data curation, Formal analysis, Funding acquisition, Project administration, Resources, Writing – review & editing

    Affiliation Le Regent School, Crans-Montana, Switzerland

  • Manning Yu,

    Roles Conceptualization, Methodology, Project administration, Resources, Validation, Visualization, Writing – original draft

    Affiliation Department of Statistics, Columbia University, New York, NY, United States of America

  • Lihao Zhang

    Roles Conceptualization, Data curation, Methodology, Software, Writing – original draft

    lhzhangcuhk@ieee.org

    Affiliation Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, China

Abstract

As education increasingly relies on data-driven methodologies, accurately predicting student performance is essential for implementing timely and effective interventions. The California Student Performance Dataset offers a distinctive basis for analyzing complex elements that affect educational results, such as student demographics, academic behaviours, and emotional health. This study presents the GNN-Transformer-InceptionNet (GNN-TINet) model to overcome the constraints of prior models that fail to effectively capture intricate interactions in multi-label contexts, where students may display numerous performance categories concurrently. The GNN-TINet utilizes InceptionNet, transformer architectures, and graph neural networks (GNN) to improve precision in multi-label student performance forecasting. Advanced preprocessing approaches, such as Contextual Frequency Encoding (CFE) and Contextual Adaptive Imputation (CAI), were used on a dataset of 97,000 instances. The model achieved exceptional outcomes, exceeding current standards with a Predictive Consistency Score (PCS) of 0.92 and an accuracy of 98.5%. Exploratory data analysis revealed significant relationships between GPA, homework completion, and parental involvement, emphasizing the complex nature of academic achievement. The results illustrate the GNN-TINet’s potential to identify at-risk pupils, providing a robust resource for educators and policymakers to improve learning outcomes. This study enhances educational data mining by enabling focused interventions that promote educational equality, tackling significant challenges in the domain.

Introduction

Online courses are becoming increasingly common due to the proliferation of digital platforms. This gives students more freedom and flexibility in their studies. Measuring students’ progress and comprehension is challenging when teachers and students do not have regular one-on-one interactions. Lower test scores and reduced student involvement may be caused by a lack of contact, which can impair academic progress [1]. Therefore, gathering information from various digital sources and evaluating it is essential for resolving these issues. Using performance data, teachers may identify students at risk of falling behind or being left out, enabling them to intervene quickly and boost their performance [2].

Educational Data Mining (EDM) is crucial for identifying patterns in large educational datasets, analyzing student success factors, and boosting learning outcomes [3]. Assessing student attendance and engagement on MOOC platforms using data mining technologies shows a positive correlation between academic success and active participation [4]. Learning analytics is a specialized field that uses data mining to predict educational achievement by analyzing student behaviour [5]. A rising corpus of research emphasizes clickstream data, which tracks student activities in online learning settings, as a measure of academic achievement. Research suggests that neuro-fuzzy algorithms may accurately predict academic success using behavioural data [6, 7]. Most techniques concentrate on linear correlations, omitting contextual and behavioural data nuances. Nuanced student engagement and performance data are essential for complete academic assessments, requiring advanced modelling approaches [8, 9].

Current prediction methods cannot identify at-risk pupils early, limiting proactive measures. Many traditional methodologies only predict results after a course and restrict the use of interventions that might enhance student learning [10]. Previous research has used a binary framework to predict outcomes (e.g., pass or fail), which might ignore students’ different performance levels. Advanced learning possibilities may help excelling students, but binary systems generally overlook them beyond simple pass/fail criteria [11, 12]. This constraint highlights the need for models that account for the multi-label nature of student performance, acknowledging that one student may exhibit various academic, behavioural, and social performance markers.

Several machine learning and deep learning algorithms, including decision trees, random forests, and logistic regression, are used in EDM to predict student demographics, academic scores, and engagement metrics [13]. Traditional models struggle with multidimensional, contextual, and sequential educational data, which are necessary for a comprehensive perspective of student performance. Deep learning models like Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks have improved the modelling of temporal patterns in educational data. Still, they face difficulties representing relationships across complicated data types, particularly in multi-label contexts. CNNs and LSTMs may struggle to manage the complex interplay between demographic and behavioural data [14].

This study presents GNN-Transformer-InceptionNet (GNN-TINet) to predict multi-label student performance. GNN-TINet uses GNN, Transformer, and InceptionNet architectures to find complex contextual links and interdependencies in multivariate, time-series student data. This method improves forecast accuracy and supports multi-label student outcomes, revealing each student’s needs. The GNN-TINet system handles educational datasets using contextual adaptive imputation and frequency encoding to preserve data quality and relevance. Models may adjust to missing values and categorical variables by reacting to contextual correlations in the data during preprocessing, boosting prediction robustness and generalizability. GNN-TINet may tailor educational interventions for at-risk students and strong achievers who may benefit from advanced challenges. This research’s multi-label prediction system advances educational data mining (EDM) by integrating cutting-edge deep learning architectures with innovative preprocessing. This technique improves EDM by offering educators accurate information to boost student progress via tailored learning and educational equity.

  1. Developed the GNN-Transformer-InceptionNet (GNN-TINet) model to increase student performance prediction accuracy and multi-label classification by combining the GNN, Transformer, and InceptionNet architectures.
  2. Pioneered new feature selection and data-balancing methods, such as CBCE for synthetic data production and HCFS for prioritized feature ranking.
  3. To evaluate the dependability of the model and its effect on student outcomes, two new performance measures are created: the Learning Impact Factor (LIF) and the Predictive Consistency Score (PCS).
  4. Contributed to the EDM field by providing a comprehensive, high-performing pipeline for early student performance prediction, supporting targeted interventions for improved educational results.

The related Work section includes essential research on student performance from other researchers in the field. The Conceptual Framework section describes the mechanics of our approach for student performance, which uses a deep ensemble. The Experimental Data and Results section thoroughly discusses the experimental data obtained. The Conclusions and Future Directions section ends with a review of the results and suggestions for further study.

Related work

Student performance prediction research is prevalent in EDM. Improved prediction accuracy has been achieved using classic machine learning and sophisticated deep learning methods. Decision trees (DT) and random forests (RF) have been used in various studies to predict student outcomes based on demographic and academic factors. These approaches, however, often fall short in situations involving multi-label predictions and contextual multidimensional data [13, 14].

A researcher in [15] used a decision tree technique to examine how demographic factors, notably age, affect academic achievement. That research, however, did not include behavioural data, which is essential for understanding performance dynamics. A study [16] predicted student performance using demographic data but not time-series data. Another study [17] found that extracurricular activities improve academic performance. Although accurate, their random forest model did not fully use multidimensional data sources. Many logistic regression studies examined family factors, including wealth and size [18]. Demographic factors affected performance, but the absence of behavioural data limited these analyses.

Deep learning has expanded EDM research. Using a CNN network, the author in [19] extracted temporal characteristics from clickstream and evaluation score data. This methodology enhanced prediction accuracy but concentrated on sequential data without addressing performance prediction’s multi-label nature. Using time-behavioural data, [20] created a hybrid deep-learning model, GritNet, to detect high-risk students. Their forecasts were encouraging, but the model did not account for contextual interactions between characteristics, which may significantly alter predictions. One study [21] used an LSTM neural network and attention mechanism to predict student performance. This approach increased accuracy by concentrating on vital information, but multi-class categorization, essential for thorough performance evaluations, remained difficult.

Another study [22] used a transformer to convert learning behaviour data into sequential feature vectors for student performance prediction. This novel strategy increased prediction granularity, but it left open how demographic data might improve model accuracy. Researchers [23] suggested a multi-head attention model and SVM to pick relevant behavioural variables, which improved temporal accuracy but not multidimensional data integration. The author in [24] advanced the use of multi-topological graph neural networks (MTGNN) to represent student interactions. This technique increased knowledge of relational dynamics but lacked time-series characteristics and contextual information, limiting its predictive power. Researchers in [25] used a time-series neural network to capture distinctive learning patterns in clickstream and evaluation data. However, binary categorization limited student performance analysis.

Existing models frequently struggle to incorporate multidimensional and contextual data. The author in [26] used a K-NN regression model with decision trees but restricted feature sets, limiting the possibility of adding more dynamic data sources. The study [27] employed standard approaches to predict early dropout, which frequently failed to exploit all available data. The author in [5] predicted student performance using machine learning techniques, including DenseNet. Their method enhanced accuracy. However, the research did not examine merging DenseNet with other models for a more robust prediction framework.

The author [28] used a ResNet to predict students’ degree completion with high recall and accuracy. However, single-dimensional data hampered the model’s capture of complicated feature relationships. In contrast, the author [29] suggested a bidirectional LSTM-attention mechanism hybrid deep neural network. This model excelled in contextual feature extraction but struggled with multi-label classification. According to the existing literature, advanced models that combine contextual, multidimensional data for student performance prediction are needed. Many studies have concentrated on individual performance variables but neglected demographic, behavioural, and academic interdependencies. The GNN-TINet system synthesizes these varied data sources to fill these gaps and maximize multi-label student performance prediction, improving educational interventions and results. The literature summary is described in Table 1.

Table 1. Literature review on student performance prediction.

https://doi.org/10.1371/journal.pone.0314823.t001

Proposed methodology

The proposed method utilizes the GNN-Transformer-InceptionNet (GNN-TINet) model to improve multi-label student performance prediction by integrating sophisticated machine learning methodologies. The California Student Performance Dataset, a publicly accessible resource with many academic, behavioural, and emotional characteristics, is first preprocessed using advanced techniques like Contextual Adaptive Imputation and Contextual Frequency Encoding. The data is then organized into a graph representation, allowing the Graph Neural Network (GNN) to identify related patterns among students and their performance indicators. The GNN processes input via a message-passing approach to model global dependencies and then applies a Transformer network that uses self-attention methods. InceptionNet’s multi-scale architecture produces a complete data representation, which extracts many properties. Accuracy, predictive consistency score, learning impact factor, and many other metrics show that the model effectively identifies at-risk students and enables tailored educational interventions. Fig 1 shows the visual view of the proposed framework. The modules of the proposed framework are described in detail in subsequent sections.

Dataset description

The dataset utilized in this study is the California Student Performance Dataset (CSPD), which contains 97,000 data points from several California schools. It supports accurate student performance prediction through 36 characteristics spanning a broad range of student data. The publicly available collection contains quantitative and qualitative data on students’ academic, social, emotional, and behavioural traits. All names, personal information, and sensitive data were anonymized or removed for privacy. [30] provides public access to the dataset. This resource helps educators, data scientists, and policymakers improve educational outcomes by providing insights into numerous facets of student performance. CSPD was selected because it integrates technical and educational environment data, emotional and mental well-being indicators, social and behavioural data, real-time learning information, academic performance assessments, and student demographic and background facts. To examine student achievement comprehensively, the CSPD draws on various data sources:

  • Student Demographics and Background: Age, gender, socioeconomic status, parental education, and learning disabilities.
  • Academic Data: Details on homework completion, project performance, GPA, and examination scores.
  • Real-Time Learning Data: Class attendance, daily quiz outcomes, and study durations.
  • Social and Behavioural Data: Teacher feedback, motivation levels, and peer interactions.
  • Emotional and Psychological Well-Being: Mental health conditions, emotional awareness via wearables, and stress levels.
  • Technology and Learning Environment: Internet accessibility, use of instructional materials, and assessments of the educational environment.

This dataset was selected as it combines academic accomplishment, emotional well-being, social behaviour, and learning environments, which are crucial to understanding student outcomes. Combining academic, technological, and well-being data yields more exact and tailored student performance projections, enabling targeted intervention programs and instructional upgrades. Table 2 shows the dataset description and characteristics.

Data preprocessing

Adequate dataset preparation is essential for early student performance prediction. This step handles missing values, encodes categorical variables, and normalizes numerical characteristics to prepare data for modelling [31]. The new preprocessing method emphasizes methodical and context-aware techniques. The novel Contextual Adaptive Imputation (CAI) approach addresses missing values. This approach improves feature-relationship-based imputation by combining statistical methods with contextual knowledge of the data. We first check for missing entries, designated as D_miss, the subset of the dataset D containing missing values.

Start the CAI approach by identifying features that have a substantial link to the feature with missing values. For missing values in feature F, we calculate a correlation score C(F, G) with every other feature G:

C(F, G) = Cov(F, G) / (σ_F · σ_G)    (1)

Cov signifies covariance, and σ represents the feature standard deviation. For imputation, we keep features with a correlation score above 0.3.

Weighted Mean Imputation is then applied. This technique weights each related feature by its association with the missing feature. For missing entries in feature F, the imputed value is determined as:

F_imputed = Σ_G w(G) · F_G / Σ_G w(G)    (2)

w(G) is the weight derived from the correlation score C(F, G), and F_G represents the values of the associated feature G used for imputation. Each missing item is thus imputed from the most relevant attributes given the data context.
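The two-step CAI procedure above (correlation screening, then weighted-mean imputation) can be sketched in a few lines of numpy. The 0.3 threshold follows the text; the fallback to the column mean when no correlated donor value is available is an assumption:

```python
import numpy as np

def cai_impute(X, target_col, corr_threshold=0.3):
    """Contextual Adaptive Imputation sketch: fill missing values in one
    column using a correlation-weighted mean of related columns (Eqs. 1-2).
    `X` is a 2-D float array with np.nan marking missing entries."""
    col = X[:, target_col]
    missing = np.isnan(col)
    observed = ~missing
    weights, donors = [], []
    for g in range(X.shape[1]):
        if g == target_col:
            continue
        other = X[:, g]
        ok = observed & ~np.isnan(other)
        if ok.sum() < 2:
            continue
        c = abs(np.corrcoef(col[ok], other[ok])[0, 1])   # Eq. (1)
        if c > corr_threshold:
            weights.append(c)
            donors.append(g)
    X = X.copy()
    for i in np.where(missing)[0]:
        num = den = 0.0
        for w, g in zip(weights, donors):
            if not np.isnan(X[i, g]):
                num += w * X[i, g]                        # Eq. (2) numerator
                den += w                                  # Eq. (2) denominator
        # fallback to the plain column mean (assumption) if no donor exists
        X[i, target_col] = num / den if den else np.nanmean(col)
    return X
```

Note that Eq. (2) averages the correlated features' raw values, so in practice the donor columns should share the target feature's scale, e.g. after the DRS normalization described below.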

The novel Contextual Frequency Encoding (CFE) approach converts categorical information into numerical representations. Using this technique, we compute category frequency within groups defined by a relevant feature. For a categorical characteristic C:

Freq(C_i | F_j) = count(C_i within F_j) / count(F_j)    (3)

F_j is the relevant contextual feature and C_i represents a specific category. The model gauges category relevance by comparing category frequencies across the levels of other characteristics.
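A minimal pure-Python sketch of contextual frequency encoding, assuming the context is supplied as a parallel list of values from the related feature F_j:

```python
from collections import Counter

def contextual_frequency_encode(categories, context):
    """Contextual Frequency Encoding sketch: encode each category by its
    relative frequency within its context group (values of a related
    feature), rather than a single global frequency (Eq. 3)."""
    pair = Counter(zip(context, categories))   # counts of (context, category)
    ctx = Counter(context)                     # context group sizes
    return [pair[(c, v)] / ctx[c] for c, v in zip(context, categories)]
```

For example, a category that is common within one school but rare in another receives a different encoding in each, which is the contextual element the plain global frequency encoding lacks.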

For numerical normalization, Dynamic Range Scaling (DRS) is proposed. DRS scales values according to the local data distribution rather than a global min-max range. The DRS formula is:

x′ = (x − μ) / σ    (4)

The local mean is denoted by μ and the local standard deviation by σ; both are calculated over a sliding window of values around each occurrence. Normalizing against local context enhances feature robustness to outliers. Afterwards, a novel method for detecting outliers is proposed: Relative Performance Thresholding. This method flags a value x as an outlier when it deviates from the feature mean by more than k standard deviations:

|x − μ| > k · σ    (5)

By controlling or eliminating values according to the feature distribution, k acts as a user-defined sensitivity parameter (for example, k = 2), allowing personalized outlier detection.
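Both steps can be sketched as follows; the sliding-window size is an illustrative assumption, since the text does not fix one:

```python
import numpy as np

def dynamic_range_scale(x, window=3):
    """Dynamic Range Scaling sketch (Eq. 4): standardize each value with the
    mean and std of a sliding window centred on it."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    h = window // 2
    for i in range(len(x)):
        w = x[max(0, i - h): i + h + 1]        # local context window
        s = w.std()
        out[i] = (x[i] - w.mean()) / s if s > 0 else 0.0
    return out

def relative_performance_outliers(x, k=2.0):
    """Relative Performance Thresholding (Eq. 5): flag values that deviate
    from the feature mean by more than k standard deviations."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) > k * x.std()
```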

This preprocessing module uses novel techniques, including Relative Performance Thresholding, Dynamic Range Scaling, Contextual Frequency Encoding, and Contextual Adaptive Imputation. By accounting for missing values, normalizing numerical features, encoding categorical variables, and handling outliers, these steps enhance both the dataset and the early student performance prediction model.

Hierarchical contextual feature scoring (HCFS based feature selection)

HCFS selects features in a novel way that considers context, predictive power, and the elimination of duplicate features. It ranks attributes based on their independent forecasting power and their relationships with other significant traits, and ratings can be improved by including hierarchical or grouped links among attributes. Determining the Mutual Information (MI) for each feature is the first step in the HCFS method. The information a feature x_j carries about the target Y is evaluated using MI:

MI(x_j; Y) = Σ_{x_j} Σ_y p(x_j, y) · log [ p(x_j, y) / (p(x_j) · p(y)) ]    (6)

p(x_j, y) represents the joint probability distribution of the feature x_j and the target y, whereas p(x_j) and p(y) represent their marginal probability distributions. Features with higher MI scores are preferred since they align more closely with the target. Unlike previous MI-based feature selection methods, HCFS then assesses each feature’s contextual influence:

CI(x_j) = (1 / |C_j|) · Σ_{x_k ∈ C_j} MI(x_j, x_k)    (7)

In the context C_j, the number of features is represented by |C_j|, and the mutual information between features x_j and x_k is quantified by MI(x_j, x_k). Because of this, HCFS can discover both strong individual features and traits that benefit their group.

HCFS includes the Redundancy Penalisation (RP) phase to avoid selecting repetitive traits. Features that provide identical information are penalized to ensure that the final set contains diverse, non-redundant features. The redundancy penalty for feature x_j is calculated as follows:

RP(x_j) = (1 / |S|) · Σ_{x_k ∈ S} MI(x_j, x_k)    (8)

S is the collection of previously selected features, and MI(x_j, x_k) measures the mutual information between x_j and each x_k in the set. This redundancy-reduction phase enhances the model’s performance by preventing overfitting caused by an excessive number of similar features.

The final stage of HCFS integrates these calculations into a single score for each feature, combining feature-target mutual information, contextual interaction, and the redundancy penalty:

HCFS(x_j) = MI(x_j; Y) + λ1 · CI(x_j) − λ2 · RP(x_j)    (9)

The relative importance of the contextual interaction score and the redundancy penalty is decided by the weighting factors λ1 and λ2. The model employs the characteristics that rank highest according to their HCFS scores. HCFS emphasizes contextual interactions and hierarchical feature relationships, unlike conventional feature selection methods. It better captures the intricacy of multidimensional data because it groups attributes and analyzes their interconnections, while the redundancy penalty keeps the model efficient and prevents overfitting. Because it considers students’ academic, behavioural, emotional, and contextual aspects, HCFS is well suited to complicated educational data mining datasets: it places academic, emotional, and behavioural signals in context, adjusts for overlapping characteristics, and ranks the top predictors to reduce repetition. The resulting relevant and diverse feature sets improve model performance.
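A compact sketch of HCFS scoring for discrete features. The greedy treatment of "previously selected" features and the default λ values are assumptions, since the paper does not fix a selection order:

```python
import math
from collections import Counter

def mutual_info(a, b):
    """Discrete mutual information MI(a; b) in nats (Eq. 6)."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log((c / n) / (pa[x] / n * pb[y] / n))
               for (x, y), c in pab.items())

def hcfs_scores(features, target, groups, lam1=0.5, lam2=0.5):
    """HCFS sketch: relevance + contextual interaction - redundancy (Eq. 9).
    `features` maps name -> list of discrete values; `groups` maps name ->
    names of features in the same context group C_j. Redundancy (Eq. 8) is
    computed greedily against features scored so far (assumption)."""
    selected, scores = [], {}
    for name in features:
        rel = mutual_info(features[name], target)            # MI(x_j; Y)
        ctx = groups.get(name, [])
        ci = (sum(mutual_info(features[name], features[k]) for k in ctx)
              / len(ctx)) if ctx else 0.0                    # Eq. (7)
        rp = (sum(mutual_info(features[name], features[k]) for k in selected)
              / len(selected)) if selected else 0.0          # Eq. (8)
        scores[name] = rel + lam1 * ci - lam2 * rp           # Eq. (9)
        selected.append(name)
    return scores
```

A feature that mirrors the target scores near log 2 on a balanced binary target, while an independent feature scores near zero, so the ranking behaves as Eq. (9) intends.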

Feature engineering

Feature engineering seeks to enhance a system’s predictive capabilities by combining or transforming current features and adding newly created ones. Three more features are added to the current ones to improve the ability of the Hierarchical Contextual Feature Scoring (HCFS) method to capture relationships between data points.

Academic Consistency Score (ACS).

ACS is derived from Grade Point Average (GPA), Internal Exam Scores, and Final Exam Scores. This feature evaluates the consistency of a student’s results across several assessment methods. The equation for it is:

ACS = (GPA + Internal Exam Score + Final Exam Score) / 3    (10)

The ACS score captures academic stability by averaging a student’s performance across these key academic metrics.

Study Efficiency Ratio (SER).

The SER is determined by dividing the student’s study time by their homework completion rate. This feature compares the time a student invests with the work actually completed to evaluate the effectiveness of their study habits.

SER = Study Time / Homework Completion Rate    (11)

A lower SER indicates that students complete homework with less studying time. Conversely, an elevated SER can suggest that the study methods used were inefficient.

Peer Influence Index (PII).

The PII combines two scores, one for peer interaction and one for peer influence. This feature quantifies how much a student’s classmates influence their conduct and academic achievement.

PII = (Peer Interaction Score + Peer Influence Score) / 2    (12)

This derived characteristic lets one measure how much peer networks influence student achievement. Greater PII values point to more intense peer impact.
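The three derived features reduce to simple arithmetic; the argument names and scales below are illustrative, and ACS assumes the three academic scores have already been brought to a common scale:

```python
def academic_consistency_score(gpa, internal, final):
    """ACS (Eq. 10): mean of the three academic scores, assuming they
    are already on the same scale."""
    return (gpa + internal + final) / 3

def study_efficiency_ratio(study_hours, completion_rate):
    """SER (Eq. 11): study time per unit of homework completion;
    lower values indicate more efficient study habits."""
    return study_hours / completion_rate

def peer_influence_index(interaction, influence):
    """PII (Eq. 12): average of peer interaction and peer influence scores."""
    return (interaction + influence) / 2
```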

Data balancing using Cluster-Based Class Expansion (CBCE)

The CBCE method generates synthetic instances in underrepresented classes obtained from tiny data point clusters, reducing class imbalance. This method guarantees that the newly generated synthetic data adheres to each class’s inherent feature distribution and variation while mitigating duplication in overrepresented classes by eliminating similar cases. The objective is to provide a balanced and diversified dataset to enhance model generalization for predicting student performance.

Step 1: Clustering of data points within each class. The first phase of CBCE entails grouping instances within each class according to their feature similarity. Let D denote the dataset, wherein the target variable comprises m classes: {c_1, c_2, …, c_m}. For each underrepresented class c_i, we implement a clustering technique (e.g., K-Means or DBSCAN) [32] to partition its instances into k_i clusters, referred to as Cl_1, Cl_2, …, Cl_{k_i}. Each cluster consists of feature vectors:

Cl_j = {z_1, z_2, …, z_{n_j}},  j = 1, …, k_i    (13)

This clustering guarantees that data points are clustered according to natural patterns in the feature space, allowing for meaningful synthetic instance production.

Step 2: Cluster boundary expansion for synthetic instance generation. Once the clusters within the minority class are identified, the next stage is the generation of synthetic instances. This is accomplished by expanding the boundaries of the clusters. The convex hull, which denotes the minimal convex boundary encompassing all cluster elements, is computed for each cluster Cl_j. The set of points that define the cluster’s boundary is referred to as the convex hull:

H(Cl_j) = ConvexHull(Cl_j) = {y_1, y_2, …, y_p}    (14)

To generate additional synthetic instances, we marginally extend the limits of the convex hull. Let y denote a boundary point, and let x represent a randomly selected point from inside the cluster. The synthetic instance z_new is produced by extending the line between the interior point x and the boundary point y:

z_new = y + β · (y − x)    (15)

The interpolation factor β determines the extent of new instances beyond the current cluster bounds. We choose β depending on cluster variance to provide synthetic examples that match the natural distribution of data inside the class. Lower-variance clusters create examples closer to the centre, whereas higher-variance clusters yield more scattered synthetic points.

Step 3: Conditional suppression of over-represented classes. In over-represented classes, basic downsampling may result in significant data loss. CBCE employs a conditional suppression method to prevent this. For each instance z_j in the over-represented class, we calculate its variance contribution v(z_j), which quantifies the deviation of the instance from the mean of its cluster:

v(z_j) = ||z_j − μ_c||²    (16)

Here μ_c represents the cluster mean, and v(z_j) represents the contribution of each instance to the cluster’s spread. Instances with low variance contributions, which lie close to the cluster mean and are therefore redundant, are removed. A suppression threshold θ is set to achieve the desired diversity in the over-represented class:

S(z_j) = 1 if v(z_j) ≥ θ, and 0 otherwise    (17)

Only instances with S(zj) = 1 are maintained, ensuring that the retained data points preserve the class’s variety while avoiding excessive repetition.

Step 4: Iterative balance adjustment. Class distributions are re-evaluated after each cycle of synthetic instance production and conditional suppression. Clusters are repeatedly expanded or shrunk until the class ratios fall within an acceptable range. The proportion of class c_i after each iteration is given by:

p(c_i) = n_i / N    (18)

where n_i is the number of instances of class c_i and N is the total number of occurrences in the dataset.

This cycle continues until all classes are adequately represented. The final dataset that CBCE creates preserves diverse occurrences within the majority classes and successfully augments the data for the minority classes. For early student performance forecasting models to provide accurate and generalizable predictions, the dataset must be balanced while retaining each class’s inherent diversity and structure.
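Steps 2 and 3 of CBCE can be sketched as follows. To keep the example dependency-free, the boundary set is approximated by the points farthest from the centroid rather than an exact convex hull, which is a simplification:

```python
import numpy as np

def expand_cluster(cluster, n_new, beta=0.3, rng=None):
    """CBCE boundary-expansion sketch (Eq. 15): generate synthetic points
    slightly beyond a cluster's extremes. Boundary points are approximated
    as those farthest from the centroid (assumption, in place of a full
    convex hull computation)."""
    rng = rng or np.random.default_rng(0)
    cluster = np.asarray(cluster, dtype=float)
    centroid = cluster.mean(axis=0)
    dists = np.linalg.norm(cluster - centroid, axis=1)
    boundary = cluster[np.argsort(dists)[-max(1, len(cluster) // 3):]]
    new = []
    for _ in range(n_new):
        y = boundary[rng.integers(len(boundary))]   # boundary point y
        x = cluster[rng.integers(len(cluster))]     # interior point x
        new.append(y + beta * (y - x))              # extend past the boundary
    return np.array(new)

def suppress_redundant(cluster, theta):
    """Eqs. (16)-(17): keep only instances whose squared distance from the
    cluster mean meets the suppression threshold theta."""
    cluster = np.asarray(cluster, dtype=float)
    v = ((cluster - cluster.mean(axis=0)) ** 2).sum(axis=1)
    return cluster[v >= theta]
```

In practice β would be tuned per cluster from its variance, as Step 2 describes, so tighter clusters spawn points closer to their boundary.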

Classification with GNN-Transformer-InceptionNet Network (GNN-TINet)

Graph Neural Networks (GNNs) [33, 34], Transformer Networks [35], and InceptionNet [36] are combined in the GNN-TINet network to efficiently categorize complex information, including student performance prediction. The Transformer uses attention techniques to capture global relationships, InceptionNet enhances multi-scale feature extraction to diversify learned features, and the GNN models the relational, graph-based data structure. The dataset is first represented in GNN-TINet as a graph G = (V, E), with V standing for nodes (such as students or devices) and E for edges (relationships between nodes). Each node v_j has a feature vector f_j. A message-passing mechanism updates each node’s representation in the GNN. The message-passing update rule for node v_j at layer l + 1 is:

h_j^(l+1) = σ( W^(l) · h_j^(l) + Σ_{k ∈ N(j)} W_n^(l) · h_k^(l) )    (19)

h_j^(l) represents the feature vector of node v_j at layer l, N(j) represents its neighbors, and W^(l) and W_n^(l) are learnable weight matrices. This approach lets the GNN aggregate local neighbourhood information and encode relational patterns.
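A single message-passing layer of this form can be written in plain numpy, with ReLU standing in for the unspecified nonlinearity σ (an assumption):

```python
import numpy as np

def gnn_layer(H, A, W_self, W_nbr):
    """One message-passing update (Eq. 19): each node combines its own
    transformed features with the sum of its neighbors' transformed
    features, then applies a ReLU nonlinearity (assumption).
    H: (n, d) node features; A: (n, n) adjacency matrix."""
    msg = A @ (H @ W_nbr)                 # sum over neighbors k in N(j)
    return np.maximum(0.0, H @ W_self + msg)
```

Stacking several such layers lets information propagate across multi-hop neighbourhoods of the student graph.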

After the GNN stage, its output is fed into a Transformer network. Self-attention helps the Transformer represent global dependencies and interactions, capturing complicated linkages across the dataset. The self-attention mechanism is:

Attention(Q, K, V) = softmax( Q · Kᵀ / √d_k ) · V    (20)

The query, key, and value matrices Q, K, and V are projections of the GNN output, and d_k is the dimensionality of the key vectors. This approach improves classification accuracy by guiding the model to focus on crucial interactions between nodes or between temporal sequences.
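Scaled dot-product self-attention as in Eq. (20), sketched in numpy with a numerically stable row-wise softmax:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention (Eq. 20) over node embeddings X
    produced by the GNN stage. Wq, Wk, Wv are the learned projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # numerically stable softmax over each row of the score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

Each output row is a convex combination of the value vectors, so a node's new representation emphasizes whichever other nodes its query most strongly matches.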

Inception processes the Transformer output to boost the model’s capacity to learn varied characteristics. Each Inception block uses convolutional filters with varying kernel sizes (1×1, 3×3, 5×5) to capture characteristics at different scales. The Inception module output is calculated as:

f_out = Concat( f_{1×1}, f_{3×3}, f_{5×5}, f_pool )    (21)

f1×1, f3×3, f5×5 represent the outputs from convolutional layers with varying kernel dimensions, whereas fpool is derived from a max-pooling layer. This multi-scale methodology allows the model to acquire features attuned to intricate and overarching patterns within the data.

The characteristics derived from the Inception module are flattened and then processed via fully connected layers for classification. The predicted output is expressed as:

ŷ = softmax( W_f · x + b_f )    (22)

W_f and b_f represent the weights and bias of the fully connected layer, ensuring a probability distribution across classes. The GNN-TINet is trained to minimize the cross-entropy loss:

L = − Σ_i t_i · log(ŷ_i)    (23)

where t_i represents the actual label for instance i and ŷ_i represents the predicted probability. This complete design lets the GNN-TINet classify data using relational, temporal, and multi-scale aspects.

GNN-TINet is ideal for student performance prediction, where connections between students, instructors, and peers are critical, and network traffic analysis, where multi-scale feature extraction and relationship modelling are needed for anomaly identification. Integrating GNN, Transformer, and InceptionNet creates a strong and versatile architecture that can handle numerous categorization tasks.

Performance evaluation metrics

The developed model for predicting early student performance is assessed using classic and innovative indicators customized to the educational setting [37–39]. Accuracy, precision, recall, and F1-score reveal the model’s performance in classifying student performance categories. Accuracy is a simple measure of correctness, although class imbalance might underrepresent some performance areas. Precision and recall sharpen the assessment by emphasizing the relevance of the predicted classifications: precision measures how many positive predictions are correct, whereas recall measures the model’s ability to find all relevant occurrences. These measures are crucial in education because incorrectly labelling a struggling student might have far-reaching effects. The Learning Impact Factor (LIF) and the Predictive Consistency Score (PCS) are two new performance measures that complement the existing ones. The PCS measures the consistency of model predictions over time, highlighting the importance of stable performance across evaluations. It is organized as follows:

PCS = (1/N) · Σ_{i=1}^{N} (1/T_i) · Σ_{t=1}^{T_i} 1( ŷ_{i,t} = y_{i,t} )    (24)

The equation includes N students, T_i predictions per student, the anticipated performance ŷ_{i,t}, the actual performance y_{i,t}, and the indicator function 1(·), which returns one when the prediction matches and zero otherwise. Effective intervention techniques rely on the model’s ability to predict reliably across assessments; a higher PCS demonstrates this. To determine how well the model foretells students’ long-term performance, especially in response to interventions or changes to their learning environment, the LIF uses the metric shown in the equation below.

While the change in actual performance for student i is recorded by Δyi = yi,Tyi,0, the expected performance change is shown by . One avoids zero division using the tiny constant ϵ. The approach is valuable in classrooms if the LIF is sufficient, as it can identify significant variations in student performance.
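Both metrics are straightforward to implement. The sketch below assumes per-student sequences of labels; the exact LIF aggregation (a simple per-student ratio of changes, averaged) is an assumption based on the definitions above:

```python
import numpy as np

def pcs(y_true, y_pred):
    """Predictive Consistency Score (Eq. 24): per-student fraction of
    assessments where the prediction matches the outcome, averaged
    over students. y_true/y_pred: lists of label sequences."""
    per_student = [np.mean(np.asarray(p) == np.asarray(t))
                   for p, t in zip(y_pred, y_true)]
    return float(np.mean(per_student))

def lif(y_true_first, y_true_last, y_pred_first, y_pred_last, eps=1e-8):
    """Learning Impact Factor (Eq. 25, sketch): ratio of predicted to
    actual performance change per student, averaged; eps avoids
    division by zero. Aggregation details are an assumption."""
    d_true = np.asarray(y_true_last) - np.asarray(y_true_first)
    d_pred = np.asarray(y_pred_last) - np.asarray(y_pred_first)
    return float(np.mean(d_pred / (d_true + eps)))
```

For example, a student predicted correctly on two of three assessments contributes 2/3 to the PCS average, and a model whose predicted gains exactly match the actual gains yields a LIF of approximately one.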

In the performance assessment framework, the combination of PCS and LIF offers a comprehensive understanding of the model's ability to predict student outcomes accurately. Integrating these supplementary criteria with established evaluation methodologies makes the model's long-term projections more reliable and valid.

Summary of method contributions

To address the challenges of forecasting student performance, the GNN-Transformer-InceptionNet (GNN-TINet) model introduces the following contributions:

  1. A unified architecture that captures multi-scale features, temporal patterns, and other sophisticated data interactions using GNN, Transformer, and InceptionNet layers.
  2. Advanced preprocessing that improves data quality through contextual frequency encoding and contextual adaptive imputation.
  3. Hierarchical Contextual Feature Scoring (HCFS) and Cluster-Based Class Expansion (CBCE) to select relevant features and address class imbalance.
  4. Novel evaluation criteria, PCS and LIF, to assess a model's long-term effectiveness.

Together, these elements enable the model to better forecast how well students will perform across different classroom settings.

Simulation results and discussion

A machine with 32 GB of RAM and an Intel Core i7 9th Generation quad-core CPU was used to conduct extensive simulations; this configuration provides sufficient processing power for deep learning techniques and large datasets. The proposed method used a learning rate of 0.001, a batch size of 64, and up to 100 training epochs. Training was stopped early if the validation loss did not improve for five consecutive epochs, and the dropout rate was set to 0.5 to prevent overfitting. These settings improved the model's ability to predict student success while maintaining computational efficiency.
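The early-stopping rule and hyperparameters described above can be sketched in framework-agnostic Python; the class and config names are illustrative, not the authors' code:

```python
class EarlyStopping:
    """Stop training when validation loss fails to improve for
    `patience` consecutive epochs (patience=5 as reported)."""
    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hyperparameters reported for GNN-TINet training
CONFIG = {"learning_rate": 1e-3, "batch_size": 64,
          "max_epochs": 100, "dropout": 0.5}
```

In a training loop, `step` is called once per epoch with the current validation loss, and the loop breaks as soon as it returns True.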

The data undergoes basic preprocessing to address missing values, convert categorical variables, normalize numerical features, and detect outliers. Following this preparation, the dataset is analyzed and exploratory data analysis is conducted.
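A generic version of these preprocessing steps can be sketched with pandas. The paper's contextual adaptive imputation and contextual frequency encoding are more sophisticated, so this is only a baseline stand-in with illustrative column names:

```python
import pandas as pd

def basic_preprocess(df, cat_cols, num_cols):
    """Baseline preprocessing sketch: median-impute and z-score
    numeric columns, flag IQR outliers, mode-impute and one-hot
    encode categoricals. Not the paper's CAI/CFE methods."""
    df = df.copy()
    for c in num_cols:
        df[c] = df[c].fillna(df[c].median())
        q1, q3 = df[c].quantile([0.25, 0.75])
        iqr = q3 - q1
        # Flag values beyond 1.5 * IQR from the quartiles as outliers
        df[c + "_outlier"] = (df[c] < q1 - 1.5 * iqr) | (df[c] > q3 + 1.5 * iqr)
        # Z-score normalization
        df[c] = (df[c] - df[c].mean()) / df[c].std()
    for c in cat_cols:
        df[c] = df[c].fillna(df[c].mode()[0])
    return pd.get_dummies(df, columns=cat_cols)
```

Applied to a frame with missing GPA and gender values, this returns a fully numeric, NaN-free table ready for model input.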

Fig 2 illustrates the distribution of performance categories across students in the dataset. The x-axis represents the performance categories, while the y-axis denotes the number of students within each category: Very High, High, Medium, and Low. By comparing bar heights, readers can assess how balanced student performance is and pinpoint areas for pedagogical improvement. The clear labelling of the axes and the overall structure enhance the chart's readability and clarify the dataset's performance landscape.

Fig 2. Distribution of performance categories among students.

https://doi.org/10.1371/journal.pone.0314823.g002

Fig 3 demonstrates how student performance differs by performance category and GPA distribution. The box plot shows the median GPA, IQR, and outliers for each performance category, which may include “Very High,” “High,” “Medium,” and “Low.” The median line in the box shows the central tendency of GPA within each category. In contrast, the box shows the distribution of GPA values, showing the middle 50% of scores. Outliers—students with substantially different GPAs from the trend in their categories—may be found outside the whiskers. This chart shows students’ academic achievement by overall performance level and how GPA may relate to different categories. This data may help identify student strengths and weaknesses and provide educational initiatives and assistance.

Fig 3. Distribution of student performance categories based on GPA.

https://doi.org/10.1371/journal.pone.0314823.g003

Fig 4 shows the distribution of student achievement categories by level of parental engagement. The count plot shows the frequency of students in each performance category, from "Very High" to "Low," with distinct colours denoting parental participation levels ("High," "Medium," and "Low"). The figure illustrates how parental participation relates to student achievement. A greater frequency of "Very High" performers with "High" parental participation would suggest that parental engagement may improve academic achievement; conversely, a concentration of "Low" performers with "Low" parental participation may imply that a lack of parental support contributes to low performance.

Fig 4. Distribution of student performance categories based on parental involvement levels.

https://doi.org/10.1371/journal.pone.0314823.g004

Fig 5 shows students’ average GPA by mental health condition. The bar plot shows how mental health conditions like “Good,” “Average,” and “Poor” affect students’ average GPA. Each bar shows the mean GPA for students in each mental health condition group, comparing performance. This statistic suggests that students with “Good” mental health have higher average GPAs than those with “Average” or “Poor” mental health. This link may reveal the relevance of mental well-being in education and its impact on student performance.

Fig 6 shows student GPA versus homework completion rate. The boxplot shows how homework completion relates to GPA. The line within each box shows the median GPA for a given range of homework completion rates, the whiskers reflect the remaining data range, and points beyond them may be outliers. The data shows that students who regularly complete their assignments have higher GPAs. Whether their completion rates are high or low, outliers may have GPAs considerably different from their peers. This chart shows how homework completion affects student performance, helping educators and policymakers improve academic success.

Fig 6. Boxplot of GPA by homework completion rate, showing that higher completion rates correlate with higher GPAs.

Outliers indicate variability in performance.

https://doi.org/10.1371/journal.pone.0314823.g006

Fig 7 shows the correlation matrix for early student performance prediction features. Each heatmap cell indicates the correlation coefficient between two attributes, ranging from -1 to 1. A value of 1 represents a perfect positive correlation, where one characteristic rises as the other does; -1 indicates a perfect negative correlation, where one trait increases as the other decreases. The matrix shows positive correlations between GPA, Final Exam Scores, and Internal Exam Scores, implying that students who do well on examinations also have better GPAs. Time spent studying also correlates with assignment completion rate, showing that students who study are more likely to finish their homework regularly. However, parental education and motivation levels have only modest correlations with academic achievement measures, indicating that they may affect student outcomes but are less directly linked than the other factors. This correlation matrix supports feature analysis and selection for predictive modelling by showing how different variables relate to student performance.
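A correlation analysis of this kind can be computed directly with pandas; the column names below are illustrative, not the dataset's actual schema:

```python
import pandas as pd

def feature_correlations(df, target="GPA"):
    """Pearson correlation of each numeric feature with the target,
    sorted by absolute strength (strongest first)."""
    corr = df.select_dtypes("number").corr()
    return corr[target].drop(target).sort_values(key=abs, ascending=False)
```

The resulting ranking makes it easy to spot which features, such as exam scores, track GPA most closely, and which contribute little signal.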

Fig 8 shows how Hierarchical Contextual Feature Scoring (HCFS) measures feature significance in predicting early student success. The bar plot shows feature significance ratings in increasing order. The figure shows that “Peer Influence Score,” “Learning Disabilities,” and “Parental Education Level” have lower significance ratings, indicating they may predict student performance less. In contrast, “GPA,” “Average Subject Score,” and “Time Spent Studying” have greater significance values, reflecting their relevance in student outcomes. This graphic shows which prediction model characteristics are more important and the underlying variables that affect student success. Understanding the proportional significance of these qualities helps educators and policymakers choose interventions and support systems to improve student outcomes.

Table 3 compares the proposed GNN-TINet approach against GritNet, SVM, DenseNet, ResNet, and CNN. The table reports each model's early student performance prediction accuracy, precision, recall, F1-score, log loss, area under the curve (AUC), Matthews correlation coefficient (MCC), specificity, balanced accuracy, predictive consistency score, learning impact factor, and Hamming loss. GNN-TINet's 98.5% accuracy is higher than that of all other approaches, demonstrating its superior predictive power. Its precision and recall scores indicate that it generates accurate predictions and detects relevant instances, decreasing false negatives. The log loss and AUC measures confirm the model's reliability, with GNN-TINet achieving lower log loss and higher AUC than the other approaches.

Table 3. Performance evaluation of proposed method GNN-TINet and existing methods.

https://doi.org/10.1371/journal.pone.0314823.t003
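The standard metrics in Table 3 can be computed with scikit-learn, as sketched below; PCS and LIF require per-student sequences and follow their definitions in the metrics section, so they are omitted here:

```python
from sklearn.metrics import (accuracy_score, f1_score, log_loss,
                             matthews_corrcoef, hamming_loss)

def evaluate(y_true, y_pred, y_prob):
    """Compute a subset of the Table 3 metrics for one model.
    y_true/y_pred: class labels; y_prob: per-class probabilities."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred, average="weighted"),
        "mcc": matthews_corrcoef(y_true, y_pred),
        "hamming_loss": hamming_loss(y_true, y_pred),
        "log_loss": log_loss(y_true, y_prob),
    }
```

Running `evaluate` on each baseline's predictions yields directly comparable rows for a table of this kind.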

Fig 9 displays the GNN-TINet model's training and testing accuracy and loss over 30 epochs. The left subplot shows training and testing accuracy. The training accuracy gradually improves to 98.8%, indicating that the model learns from the training data; testing accuracy also rises to 98.8% in the final epoch, showing that the model generalizes well to unseen data. The right subplot shows training and testing loss. The training loss decreases from 0.72 to 0.10 as the model becomes more accurate, and the testing loss drops to 0.09, confirming that the model fits the training data and performs well on the test set. Overall, the figure shows that the GNN-TINet model predicts early student performance with high accuracy and low loss for both training and testing, making it well suited to real-world scenarios.

Fig 9. Training and testing accuracy and loss of the GNN-TINet model over 30 epochs.

https://doi.org/10.1371/journal.pone.0314823.g009

Table 4 presents the GNN-TINet model's parameter sensitivity analysis, which shows how hyperparameters affect performance metrics. Learning rate, batch size, dropout rate, and number of training epochs are each examined over a range of values. The data indicates that the GNN-TINet model's accuracy reaches 98.5% with the chosen parameters, highlighting the need for hyperparameter tuning. A learning rate of 0.001 and a batch size of 64 yield the highest accuracy and F1-score, making them suitable for model training. The table also shows how the dropout rate and number of epochs influence model performance, emphasizing that tuning can significantly affect log loss and area under the curve. Careful evaluation of these hyperparameters thus enhances the GNN-TINet model's early student performance predictions, giving valuable guidance for practitioners looking to replicate or improve these results.

Table 4. Parameter sensitivity analysis for GNN-TINet model.

https://doi.org/10.1371/journal.pone.0314823.t004
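A sensitivity analysis of this kind amounts to an exhaustive sweep over candidate hyperparameter values. The sketch below illustrates the idea; `train_and_eval` is a user-supplied callable, and the grid values are illustrative choices around the reported best settings:

```python
from itertools import product

# Illustrative sensitivity grid around the reported best settings
GRID = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
    "dropout": [0.3, 0.5, 0.7],
}

def sensitivity_sweep(train_and_eval, grid=GRID):
    """Evaluate every hyperparameter combination and return
    (config, metrics) pairs sorted by accuracy, best first.
    `train_and_eval` trains one model and returns a metrics dict."""
    results = []
    for combo in product(*grid.values()):
        cfg = dict(zip(grid.keys(), combo))
        results.append((cfg, train_and_eval(cfg)))
    return sorted(results, key=lambda r: r[1]["accuracy"], reverse=True)
```

Each row of a table like Table 4 then corresponds to one `(cfg, metrics)` pair, making the impact of each hyperparameter easy to read off.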

Fig 10 shows the complexity analysis of several machine learning approaches as data volume increases. The graph compares the training times (in seconds) of GNN-TINet, GritNet, SVM, DenseNet, ResNet, and CNN from 20,000 to 150,000 instances. According to the chart, GNN-TINet has the lowest training times across all data sizes, implying that it can handle larger datasets with greater efficiency and scalability. Conversely, the training times for SVM and DenseNet increase sharply as the dataset grows, particularly for larger datasets. The trends also expose the computational burden and practical limitations of the other methods. This figure emphasizes the importance of choosing efficient methods when processing large amounts of data.

Fig 10. Complexity analysis of the proposed and existing method.

https://doi.org/10.1371/journal.pone.0314823.g010

Conclusion and future work

This study advances educational data mining by developing an effective strategy for forecasting students' early performance. The GNN-Transformer-InceptionNet (GNN-TINet) model utilizes GNNs, transformers, and the Inception architecture to interpret intricate student performance data. Thanks to these components, the GNN-TINet model can process relational and contextual data, which improves its ability to generalize to different learning contexts and make accurate predictions. Across a battery of simulations, the GNN-TINet model outperformed conventional machine learning models in terms of accuracy (98.5%), recall (99%), and F1-score (98%). Combining contextual awareness and multi-scale feature engineering enables the model to comprehend intricate patterns and relationships within educational data. By analyzing demographics, behavioural traits, academic performance, and other multi-dimensional attributes, the GNN-TINet model helps educators anticipate and enhance student results. Beyond the accuracy of its forecasts, the method gives teachers and policymakers useful insight into students' development. The GNN-TINet model accurately predicts performance patterns, enabling personalized interventions that create responsive and supportive learning environments. These interventions promote a holistic view of student development by addressing emotional, cognitive, and social barriers to achievement.

Despite the positive results, this study has limitations. Data derived from public sources may not fully reflect the unique features of different classroom settings, and results obtained in contexts different from the one studied here might differ. Addressing these limitations could lead to improved models for behavioural analysis and dropout prediction, expanded data sources, and real-time learning analytics.

References

  1. Hidayat N, Afdholy N, Arifani Y. The effectiveness and challenges of online teaching of EFL teachers in the COVID-19 crisis. Int J Humanit Educ. 2024;22(1).
  2. Liu Y, Cao S, Chen G. Research on the long-term mechanism of using public service platforms in national smart education—based on the double reduction policy. Sage Open. 2024.
  3. Shu X, Ye Y. Knowledge discovery: Methods from data mining and machine learning. Soc Sci Res. 2023;110:102817. pmid:36796993
  4. Xu T, Gao Q, Ge X, Lu J. The relationship between social media and professional learning from the perspective of pre-service teachers: A survey. Educ Inf Technol. 2024;29(2):2067–2092.
  5. Alhazmi E, Sheneamer A. Early predicting of students performance in higher education. IEEE Access. 2023;11:27579–27589.
  6. Chen G, Jin Y, Chen P. Development of a platform for state online education services: Design concept based on meta-universe. Educ Inf Technol. 2024;1–25.
  7. Abou Naaj M, Mehdi R, Mohamed EA, Nachouki M. Analysis of the factors affecting student performance using a neuro-fuzzy approach. Educ Sci. 2023;13(3):313.
  8. Liu Z, Tang Q, Ouyang F, Long T, Liu S. Profiling students’ learning engagement in MOOC discussions to identify learning achievement: An automated configurational approach. Comput Educ. 2024;219:105109.
  9. Wibisono DL, Abidin Z. Prediction of student graduation using hybrid 2D convolutional neural network and synthetic minority over-sampling technique. Recursive J Informatics. 2023;1(1):27–34.
  10. Kustitskaya TA, Esin RV, Vainshtein YV, Noskov MV. Hybrid approach to predicting learning success based on digital educational history for timely identification of at-risk students. Educ Sci. 2024;14(6):657.
  11. Clarke T, McLellan R. Associations between children’s school wellbeing, mindset and academic attainment in standardised tests of achievement. School Psychol Int. 2024;45(4):409–446.
  12. Boztaş GD, Berigel M, Altinay Z, Altinay F, Shadiev R, Dagli G. Readiness for inclusion: Analysis of information society indicator with educational attainment of people with disabilities in European Union countries. J Chin Hum Resour Manag. 2023;14(3):47–58.
  13. Batool S, Rashid J, Nisar MW, Kim J, Kwon HY, Hussain A. Educational data mining to predict students’ academic performance: A survey study. Educ Inform Technol. 2023;28(1):905–971.
  14. Kancan OE, Altinay F, Altinay Z, Dagli G, Bastas M. The role of supervisor to develop strategic planning for the future of education. J Chin Hum Resour Manag. 2023;14(3):70–83.
  15. Sarker S, Paul MK, Thasin TH, Hasan MAM. Analyzing students’ academic performance using educational data mining. Comput Educ Artif Intell. 2024;7:100263.
  16. Al-Azazi FA, Ghurab M. ANN-LSTM: A deep learning model for early student performance prediction in MOOC. Heliyon. 2024;9(4).
  17. Nachouki M, Mohamed EA, Mehdi R, Abou Naaj M. Student course grade prediction using the random forest algorithm: Analysis of predictors’ importance. Trends Neurosci Educ. 2023;100214. pmid:38049293
  18. Munir J, Faiza M, Jamal B, Daud S, Iqbal K. The impact of socio-economic status on academic achievement. J Soc Sci Rev. 2023;3(2):695–705.
  19. Chhabra GS, William P, Lanke GR, Jain K, Lakshmi TV, Varshney N. Comparative analysis of data mining based performance evaluation using hybrid deep learning approach. In: International Conference on Mobile Radio Communications & 5G Networks; 2023; Singapore: Springer Nature Singapore. p. 607-621.
  20. Abid K, Aslam N, Fuzail M, Maqbool MS, Sajid K. An efficient deep learning approach for prediction of student performance using neural network. VFAST Trans Softw Eng. 2023;11(4):67–79.
  21. Chen Y, Wei G, Liu J, Chen Y, Zheng Q, Tian F, et al. A prediction model of student performance based on self-attention mechanism. Knowl Inf Syst. 2023;65(2):733–758.
  22. Kusumawardani SS, Alfarozi SAI. Transformer encoder model for sequential prediction of student performance based on their log activities. IEEE Access. 2023;11:18960–18971.
  23. Sridharan TB, Akilashri PSS. Hybrid attention network-based students behavior data analytics framework with enhanced capuchin search algorithm using multimodal data. Soc Netw Anal Min. 2023;13(1):145.
  24. Huang Q, Chen J. Enhancing academic performance prediction with temporal graph networks for massive open online courses. J Big Data. 2024;11(1):52.
  25. Ben Said A, Abdel-Salam ASG, Hazaa KA. Performance prediction in online academic course: A deep learning approach with time series imaging. Multimedia Tools Appl. 2024;83(18):55427–55445.
  26. Tariq R, Mohammed A, Alshibani A, Ramírez-Montoya MS. Complex artificial intelligence models for energy sustainability in educational buildings. Sci Rep. 2024;14(1):15020. pmid:38951562
  27. Quispe JOQ, Toledo OC, Toledo MC, Llatasi EEC, Saira EMR. Early prediction of university student dropout using machine learning models. Nanotechnology Perceptions. 2024;659–669.
  28. Jiang L, Lv M, Cheng M, Chen X, Peng C. Factors affecting deep learning of EFL students in higher vocational colleges under small private online courses‐based settings: A grounded theory approach. J Comput Assist Learn.
  29. Zyout I, Zyout MA. Sentiment analysis of student feedback using attention-based RNN and transformer embedding. Int J Artif Intell. 2024;13(2):2173–2184.
  30. Laoshi. SPD24—Student performance data revised features [Dataset]. Kaggle. 2024. Available from: https://doi.org/10.34740/KAGGLE/DSV/9083250
  31. Palanivinayagam A, Damaševičius R. Effective handling of missing values in datasets for classification using machine learning methods. Inform. 2023;14(2):92.
  32. Gholizadeh N, Saadatfar H, Hanafi N. K-DBSCAN: An improved DBSCAN algorithm for big data. J Supercomput. 2021;77(6):6214–6235.
  33. Veličković P. Everything is connected: Graph neural networks. Curr Opin Struct Biol. 2023;79:102538. pmid:36764042
  34. Qiao M, Xu M, Jiang L, Lei P, Wen S, Chen Y, et al. HyperSOR: Context-aware graph hypernetwork for salient object ranking. IEEE Trans Pattern Anal Mach Intell. 2024. pmid:38381637
  35. Neimark D, Bar O, Zohar M, Asselmann D. Video transformer network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021. p. 3163-3172.
  36. Aboussaleh I, Riffi J, Mahraz AM, Tairi H. Inception-UDet: an improved U-Net architecture for brain tumor segmentation. Ann Data Sci. 2024;11(3):831–853.
  37. Gao C, Zheng Y, Li N, Li Y, Qin Y, Piao J, et al. A survey of graph neural networks for recommender systems: Challenges, methods, and directions. ACM Trans Recommender Syst. 2023;1(1):1–51.
  38. Xiong Y, Xinya XG, Xu J. CNN-Transformer: A deep learning method for automatically identifying learning engagement. Educ Inform Technol. 2024;29(8):9989–10008.
  39. Shiri FM, Ahmadi E, Rezaee M, Perumal T. Detection of student engagement in e-learning environments using EfficientNetV2-L together with RNN-based models. J Artif Intell. 2024;6:2579–0021.
  40. Ahmed A, Nipa FA, Bhuyian WU, Mushfique KM, Shahin KI, Nguyen H-H, et al. Students’ performance prediction employing Decision Tree. CTU J Innov Sustain Dev. 2024;16(Special issue: ISDS):42–51.
  41. Alshamaila Y, Alsawalqah H, Aljarah I, Habib M, Faris H, Alshraideh M, et al. An automatic prediction of students’ performance to support the university education system: a deep learning approach. Multimedia Tools Appl. 2024;83(15):46369–96.
  42. Silva MP, Rupasingha RAHM, Kumara BTGS. Identifying complex causal patterns in students’ performance using machine learning. Technol Pedagog Educ. 2024;33(1):103–19.
  43. Cheng B, Liu Y, Jia Y. Evaluation of students’ performance during the academic period using the XG-Boost Classifier-Enhanced AEO hybrid model. Expert Syst Appl. 2024;238:122136.