Enhancing reliability in electrical grids: A hybrid machine learning approach for electrical faults classification

Momotaz Begum; Ariful Islam Shiplu; Mehedi Hasan Shuvo; Fahmid Al Farid; Sumaiya Ismat Jerin; Jia Uddin; Hezerul bin Abdul Karim

doi:10.1371/journal.pone.0341238

Abstract

Transmission lines are vital components of electrical grids, ensuring the efficient transfer of electricity from power plants to consumers over extensive geographical areas. These lines are constructed with careful consideration of factors such as conductor materials, insulation levels, current ratings, and voltage ratings to maintain reliable and safe electricity delivery. However, various types of faults can occur in transmission lines, posing significant challenges, often leading to outages, equipment damage, and reduced system reliability. Accurate and fast fault classification is therefore a pressing requirement in modern smart grids, where proactive maintenance and resilience are critical. This research addresses the critical need for an efficient electrical fault classification model. A comprehensive investigation is conducted, employing a variety of machine learning (ML) algorithms, including Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and AdaBoost, for fault classification. Additionally, fundamental ensemble techniques such as Hard-Voting, Soft-Voting, Stacking, and Blending are incorporated with five hybrid ML models (each constructed by combining various ML algorithms) to enhance fault classification performance and the reliability of transmission lines. This research also proposes a hybrid ML model, specifically RF + DT + Stacking, to classify transmission line data. The main contribution of this work is an application-oriented evaluation of classical and ensemble machine learning models for electrical fault classification, with an emphasis on benchmarking performance, model interpretability, and computational efficiency. This study demonstrates that a carefully configured hybrid ensemble (RF + DT + Stacking) can provide a practical and lightweight alternative to deep learning-based methods in grid fault monitoring scenarios.
The dataset used encompasses various attributes affecting line performance, making accurate classification critical for proactive issue detection, optimized maintenance scheduling, and uninterrupted energy supply. Our hybrid model achieves high-performance metrics, including an accuracy of 93.64%, precision of 93.65%, recall of 93.64%, and F1 score of 93.64%, underscoring its effectiveness in enhancing decision-making processes and operational efficiency within electrical transmission networks.

Citation: Begum M, Shiplu AI, Shuvo MH, Al Farid F, Jerin SI, Uddin J, et al. (2026) Enhancing reliability in electrical grids: A hybrid machine learning approach for electrical faults classification. PLoS One 21(2): e0341238. https://doi.org/10.1371/journal.pone.0341238

Editor: Haris Calgan, Balikesir Universitesi, TÜRKIYE

Received: June 11, 2025; Accepted: January 5, 2026; Published: February 13, 2026

Copyright: © 2026 Begum et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data that supported the findings of this study are available at the following link: https://www.kaggle.com/datasets/esathyaprakash/electrical-fault-detection-and-classification.

Funding: This research was funded by Multimedia University, Cyberjaya, Selangor, Malaysia (Grant Number: PostDoc (MMUI/240029)). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: No authors have competing interests.

1 Introduction

Transmission lines are high-voltage power lines designed to carry electricity from power plants to substations over long distances with minimal losses. They facilitate the delivery of power generated at remote plants to end-users, ensuring reliable electricity supply in various sectors of society. The design and operation of transmission lines encompass several key considerations, including conductor material, insulation levels, and voltage ratings, which optimize energy transmission while prioritizing reliability and safety. Their versatility allows them to carry both alternating current (AC) and direct current (DC), further enhancing their importance in maintaining a stable power supply [1].

Key factors in the design of transmission lines include length, voltage, and current capacity. A careful balance among these parameters is essential to meet demand while minimizing energy losses [2]. Additionally, the condition of transmission lines, particularly insulation levels, is crucial for preventing outages and ensuring public safety [3]. Various types of transmission lines, such as overhead lines, underground lines, and submarine cables, each possess unique advantages and disadvantages suited to specific applications, highlighting the necessity of tailoring solutions to operational environments [4].

Transmission lines operate under diverse environmental and physical conditions that can significantly impact their performance and lifespan. Factors such as temperature fluctuations, weather conditions, and physical wear necessitate proper maintenance and regular inspections to identify and mitigate potential issues that could lead to failures [5]. Advanced monitoring systems and diagnostic tools are increasingly employed to assess the condition of transmission lines, facilitating the early detection of anomalies that may indicate underlying problems [6].

The advent of smart grid technologies offers promising opportunities for enhancing the management and operation of transmission lines. Smart grids leverage advanced sensors, communication networks, and data analytics to enable real-time monitoring and control of electrical infrastructure, allowing for improved load balancing, fault detection, and automatic grid reconfiguration in response to changing conditions [7]. Integrating smart grid technologies with transmission line infrastructure can significantly enhance reliability, resilience, and efficiency in electricity delivery [8].

ML has emerged as a transformative technology across various fields, enabling the development of models that can learn from data and make predictions without explicit programming. In the realm of electrical engineering, machine learning is increasingly utilized to analyze and predict the performance and condition of transmission lines, thus improving their reliability and efficiency [9]. These algorithms can process vast amounts of data to identify patterns, detect anomalies, and make accurate predictions, enabling proactive issue resolution and optimized maintenance schedules [10]. Ensemble methods were later explored to improve robustness and class-wise reliability without incurring the heavy training cost of deep neural networks [11,12]. In parallel, the power-systems community investigated feature engineering from phase voltages/currents, sequence components, symmetrical components, and time–frequency transforms to increase separability across single-line-to-ground, line-to-line, and three-phase faults [13]. Robust control and estimation techniques have been widely applied in motor drives and power systems, such as H∞ controllers with MRAS-based estimators [14,15] and adaptive model predictive control (AMPC) with online parameter estimation [16], to enhance dynamic performance and reliability. In the energy sector, machine learning and deep learning models are increasingly explored for tasks like supercapacitor performance prediction [17] and solar power forecasting [18]. These studies show the potential of data-driven methods in renewable energy applications. Recent studies emphasize MPPT algorithms for PV systems [19], power quality in wind energy with DFIGs [20], optimization of PV and DSTATCOM placement [21], fault detection in PV modules via imaging [22], and hybrid power systems with demand response strategies [23], showcasing the breadth of advanced control and optimization in renewable energy.
In addition, research has highlighted the role of ML/DL techniques in wind power prediction [24], integration challenges of renewable energy sources in modern grids [25], LVRT enhancement for wind farms using DFIG protection schemes [26], hybrid energy system optimization with novel metaheuristics [27], and fault detection in solar panels using ML classifiers [28], further reinforcing the transformative impact of intelligent methods in renewable energy systems.

Unlike general ML classification problems, electrical fault classification involves high-frequency transient behavior in voltages and currents. Fault events cause abrupt changes in current magnitudes (Ia, Ib, Ic) and subtle variations in voltage signals. These patterns are nonlinear and system-dependent, governed by power-system physics, line impedance, fault resistance, and sequence component behavior. The selected ML models are therefore applied to system-generated signals representing real electrical phenomena rather than generic tabular data.

Scope: This research focuses on developing a hybrid ML approach to improve the reliability and performance of electrical transmission lines by accurately classifying electrical faults. Utilizing a comprehensive dataset with attributes related to line performance, the study explores and evaluates several ML algorithms, including DT, RF, NB, KNN, SVM, and AdaBoost. It further integrates ensemble techniques like Hard-Voting, Soft-Voting, Stacking, and Blending, emphasizing the hybrid model (RF + DT + Stacking). The scope includes enhancing fault classification accuracy, optimizing maintenance processes, and minimizing energy supply disruptions, ultimately contributing to more reliable and efficient grid operations.

The following are some of our study’s contributions:

Demonstrated that a hybrid ensemble configuration (RF + DT + Stacking) provides a competitive balance between classification performance, interpretability, and computational efficiency for transmission line fault classification. Using RF and DT as base learners within a stacking framework, the model achieves strong performance, with an accuracy of 93.64%, precision of 93.65%, recall of 93.64%, and an F1 score of 93.64%.
Demonstrated the effectiveness of the hybrid model in accurately predicting the performance and condition of transmission lines, providing a robust and reliable framework for classification that can aid in early issue detection and optimized maintenance scheduling.
Enhanced decision-making processes in electrical energy transmission by offering a data-driven approach that ensures reliability and safety, contributing to the field of electrical engineering by integrating multiple machine learning techniques to improve classification accuracy and robustness.
Showcased the practical applications and benefits of ML in optimizing power distribution systems, thereby enhancing their efficiency and reliability.

This paper investigates the following key research questions:

RQ1: What machine learning algorithms have been utilized for electrical fault classification, and what advantages do they offer in terms of accuracy, efficiency, and real-time fault classification?
RQ2: How does an application-oriented hybrid ensemble configuration improve the balance between accuracy, interpretability, and computational efficiency in electrical fault classification?
RQ3: Which hybrid machine learning model can outperform previous research and improve the accuracy of electrical fault classification?

The organization of this paper is as follows: Sect 2 presents the Related Work, discussing previous research and methodologies relevant to electrical fault classification and machine learning. Sect 3 covers the Background Theory, providing an overview of this study’s key concepts, algorithms, and techniques. Sect 4 describes the Proposed Approach, detailing the hybrid machine learning model developed for fault classification. Experiments and Results are presented in Sect 5, where we analyze the performance metrics of our model. Sect 6 engages in a thorough discussion of the results. Sect 7 concludes the paper with a summary of key findings in the Conclusion, while Sect 8 outlines potential Future Work, highlighting areas for further research and improvement.

2 Related work

ML techniques have gained significant attention in electrical engineering, particularly for the classification and monitoring of transmission lines. Traditional methods, such as supervised learning, have been effective in various applications [29–31]; however, they often face challenges when handling large-scale datasets and achieving high classification accuracy. Ensemble learning techniques have emerged as a promising solution to these challenges by combining multiple ML models to enhance predictive performance [32]. Among these, the Stacking Ensemble method has shown superior accuracy and robustness in power system fault diagnosis [11,33]. For instance, researchers have applied Stacking approaches to fault detection in transmission lines, demonstrating improved results compared to single-model methods.

Recent advancements in transmission line fault diagnosis have highlighted the integration of ML with real-time data analytics. Sun et al. [34] introduced an improved multiple SVM model optimized by a genetic algorithm, achieving an accuracy improvement of up to 11%. Their method was validated on an IEEE-30 node test system and real-world data, effectively addressing issues related to small sample sizes and generalization accuracy.

Yin et al. [35] developed a predictive decision support system using data mining techniques, correlating multi-source dynamic datasets with meteorological data to model transmission line disasters. This approach provided high-accuracy early warnings for potential failures, underscoring its relevance in regions like Bangladesh, where extreme weather poses significant risks to the power grid. Additionally, Tong et al. [36] proposed a novel transient fault detection and classification approach utilizing graph convolutional neural networks (GCN). By incorporating spatial information from sampling sequences and topology data, their method has shown exceptional performance in real-time fault detection, offering advantages in online transmission line protection.

Furthermore, Yu et al. [37] developed a diagnostic approach utilizing the Elgamal encryption algorithm, which not only improved data security but also achieved diagnostic accuracy exceeding 90%. Ma et al. [38] established a simulation model to identify both lightning and non-lightning strike faults, proposing criteria based on transient traveling wave current characteristics for intelligent fault diagnosis. Lahiri et al. [39] introduced a fault diagnosis method employing Decision Tree and Random Forest techniques, achieving 95% to 100% accuracy in identifying the location, type, and faulty phase of transmission lines across various scenarios.

Agarwal et al. [40] proposed a method for rapid fault identification in line commutated converter-based high voltage DC transmission lines, utilizing discrete Fourier transform analysis of DC current to enhance accuracy and reliability for timely trip commands to DC breakers. Additionally, the work presented by Hao et al. [41] introduced a faulted phase selection scheme that utilizes Multiscale Principal Entropy (MPE) values from fault transient voltage signals, combined with CS-SVM for high-accuracy fault detection, resilient to variations in fault location and other parameters. Lastly, the 10kV railway power transmission line simulation model developed by Yu et al. [42] employed a BP neural network to classify faults based on phase current differences, accurately determining fault types and locations.

Despite these developments, there remains a need for more comprehensive models that integrate various ML techniques to maximize their strengths and mitigate weaknesses. Our proposed hybrid model, which combines (RF + DT + Stacking) Ensemble techniques, seeks to address these gaps by providing a more robust and accurate classification framework for transmission line data. In Bangladesh, where the electrical grid faces challenges from extreme weather conditions, hybrid models can enhance transmission line monitoring significantly [43]. By improving the reliability and efficiency of these systems, our research aims to contribute to stable power system operations, reducing economic losses, and ensuring consistent electricity supply.

This work builds upon existing research while introducing a hybrid approach that integrates multiple ML techniques, thereby contributing to advancements in electrical grid management and enhancing the reliability and efficiency of transmission line monitoring. Table 1 summarizes the contributions and limitations of the reviewed studies.

Table 1. Contributions and limitations of different studies in the literature.

https://doi.org/10.1371/journal.pone.0341238.t001

3 Background theory

In this section, we outline the foundational concepts and theoretical underpinnings relevant to this research. We will focus on the mathematical formulations of the supervised ML algorithms and ensemble methods utilized in this study.

3.1 Decision Tree (DT)

The Decision Tree (DT) algorithm is renowned for its ease of use and clarity, which makes it a popular choice for data analysis, mainly when dealing with large datasets [47]. Its ability to perform automatic feature selection is a crucial advantage, as it highlights the most influential features for accurate predictions. The DT represents data through a tree-like model, making decisions hierarchically that reveal valuable patterns and decision processes. Understanding the DT algorithm requires familiarity with concepts like Entropy (E), Gini Index (GI), and Information Gain (IG). These concepts are defined as follows:

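$$H(D) = -\sum_{i=1}^{c} p_i \log_2 p_i \tag{1}$$

$$G(D) = 1 - \sum_{i=1}^{c} p_i^2 \tag{2}$$

$$IG(D, A) = H(D) - \sum_{v \in \mathrm{Values}(A)} \frac{|D_v|}{|D|}\, H(D_v) \tag{3}$$

In these equations, H(D) represents the entropy of the dataset D, G(D) denotes the Gini impurity at a node, and IG(D, A) indicates the information gain from an attribute A with respect to dataset D. Here, $c$ is the number of classes, $p_i$ is the proportion of samples in D that belong to class $i$, and $D_v$ is the subset of D for which attribute A takes the value $v$.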
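As a minimal illustration of these three quantities, the sketch below computes entropy, Gini impurity, and information gain for a small categorical split. The function names and the toy labels are hypothetical and are not taken from the study's implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(D) of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity G(D) of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    """IG(D, A): entropy of D minus the size-weighted entropy of each subset D_v."""
    n = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Toy data (illustrative only): a fault label and a binarized current feature.
labels = ["fault", "fault", "no_fault", "no_fault"]
high_current = [True, True, False, False]

print(entropy(labels))                         # 1.0 for a balanced two-class set
print(gini(labels))                            # 0.5 for a balanced two-class set
print(information_gain(labels, high_current))  # 1.0, the split separates the classes
```

A tree learner would evaluate such a score for every candidate split and keep the one with the highest gain (or lowest impurity).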

3.2 Random Forest (RF)

The Random Forest (RF) classifier is an ensemble technique that aggregates multiple decision trees, each trained on different subsets of the dataset. By combining the predictions of these trees (majority voting over class labels, or averaging predicted class probabilities), it enhances overall predictive accuracy. Renowned for its resilience, the RF classifier effectively manages diverse data challenges, including categorical variables, imbalanced datasets, and missing values. This approach leverages the combined strength of various decision trees, making it a robust and reliable method for delivering accurate predictions, even in complex data situations.

3.3 K-Nearest Neighbors (KNN)

The K-nearest neighbor (KNN) algorithm is well-regarded for its simplicity, ease of interpretation, and effectiveness in classification tasks. It operates by storing all available instances and classifying new instances based on their similarity to these stored cases, typically using distance metrics. By examining the k closest neighbors, it assigns a class label based on the most common label among these nearby points. This straightforward method makes it easy to understand and applies effectively across various domains, including pattern recognition, recommendation systems, and data mining. The similarity between instances can be measured using the following distance metrics.

The general distance metric is given by:

(4)

where a and b are two vectors. The Manhattan distance (also known as Taxicab distance) is expressed as:

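$$d_{\text{Manhattan}} = |x_1 - x_2| + |y_1 - y_2| \tag{5}$$

where $(x_1, y_1)$ and $(x_2, y_2)$ are two points in a plane. The Euclidean distance is given by:

$$d_{\text{Euclidean}} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{6}$$

where $(x_1, y_1)$ and $(x_2, y_2)$ are points in a two-dimensional space. The formula for making predictions with the KNN algorithm is:

$$\hat{y}(x) = \operatorname{mode}\{\, y_j : j \in N_K(x) \,\} \tag{7}$$

In this equation, $\hat{y}(x)$ represents the predicted class for a new input x, $N_K(x)$ denotes the set of K nearest neighbors to x, and $y_j$ is the class label associated with the j-th neighbor in the set. The mode function selects the most frequently occurring class label among the neighbors.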
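The distance measures and the majority-vote prediction rule described above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names, the toy points, and the choice of k = 3 are assumptions, not the configuration used in the study.

```python
from collections import Counter
from math import sqrt


def manhattan(a, b):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Return the most common label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy two-dimensional data (illustrative only).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
labels = ["no_fault", "no_fault", "no_fault", "fault", "fault"]

print(manhattan((0, 0), (3, 4)))                     # 7
print(euclidean((0, 0), (3, 4)))                     # 5.0
print(knn_predict(points, labels, (4.8, 5.2), k=3))  # fault
```

Because KNN relies on raw distances, features on very different scales can dominate the neighbor search, which is one reason feature scaling is usually applied first.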

3.4 Naive Bayes (NB)

Bayes’ theorem provides a straightforward way to quantify the probability that a hypothesis is true given specific evidence. It is a core concept in probability theory that facilitates the calculation of conditional probabilities. The NB algorithm, a widely used method in ML, is built upon Bayes’ theorem. The theorem can be derived as follows.

The product rule can be expressed as:

$$P(A \cap B) = P(A \mid B)\, P(B) \tag{8}$$

Since:

$$P(A \cap B) = P(B \cap A) = P(B \mid A)\, P(A) \tag{9}$$

It follows that:

$$P(A \mid B)\, P(B) = P(B \mid A)\, P(A) \tag{10}$$

Bayes’ theorem itself is given by:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{11}$$

In this context, A represents the hypothesis and B denotes the evidence. $P(A \mid B)$ denotes the posterior probability of the hypothesis A given the evidence B. $P(B \mid A)$ is the probability of observing the evidence B given that the hypothesis A is true. P(A) and P(B) are the prior probabilities of the hypothesis and evidence, respectively.
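A small numeric example may make the theorem concrete. The sketch below evaluates the posterior for a single binary hypothesis, expanding P(B) with the law of total probability. The function name and all probabilities are hypothetical illustration values, not figures from the dataset, and the sketch shows Bayes' theorem itself rather than a full NB classifier.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """P(A|B) via Bayes' theorem, with P(B) expanded by total probability."""
    evidence = (
        likelihood_b_given_a * prior_a
        + likelihood_b_given_not_a * (1.0 - prior_a)
    )
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers: 10% of windows contain a fault (A); a current spike (B)
# appears in 90% of fault windows and in 5% of healthy windows.
p_fault_given_spike = posterior(0.10, 0.90, 0.05)
print(round(p_fault_given_spike, 4))  # 0.6667
```

An NB classifier applies the same rule per class, multiplying the per-feature likelihoods under the assumption that features are conditionally independent given the class.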

3.5 Support Vector Machine (SVM)

Support Vector Machine (SVM) is known for its simplicity and effectiveness as an ML algorithm, offering precise results with relatively low computational demands. It excels at linear and nonlinear classification tasks by maximizing the margin between data points and a decision boundary, typically represented as a hyperplane or line. SVM is also effective at preventing overfitting. The kernel trick also allows SVM to handle more complex, non-linear problems.

3.6 AdaBoost and gradient boosting

AdaBoost is an effective ML algorithm that combines multiple weak learners to improve overall prediction performance. AdaBoost enhances accuracy by integrating these relatively simple models, resulting in a more robust and reliable predictive system. Gradient Boosting (GB), in contrast, is another boosting technique used for regression and classification tasks. When applied to regression trees, GB produces strong, resilient models capable of handling noisy or incomplete data. The final prediction in GB is determined using the following formula:

(12)

The loss function can be minimized with respect to h using:

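$$\hat{h} = \arg\min_{h} \sum_{i=1}^{n} L\big(t_i, h(z_i)\big) \tag{13}$$

In these equations, $t_i$ represents the true value, $h(z_i)$ is the predicted value, $L$ denotes the loss function, and the sum runs over the $n$ training samples.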

3.7 Hard-voting or majority-voting

Hard-voting ensemble methods combine predictions from multiple ML models to produce a final prediction [48]. This process can be expressed using the following equation:

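$$\hat{y} = \arg\max_{k} \sum_{j=1}^{N} D_{j,k} \tag{14}$$

Here, $k \in \{1, \dots, C\}$ represents the classes, and $j \in \{1, \dots, N\}$ indicates the different ML models. The term $D_{j,k}$ refers to the decision output from the j-th model for class k (1 if model j votes for class k, and 0 otherwise). The expression $\arg\max_{k} \sum_{j} D_{j,k}$ identifies the class k that receives the highest total number of votes from all models.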
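The voting rule in Eq. (14) can be sketched directly from a matrix of one-hot decisions. The function name, the class names, and the three example votes below are hypothetical; ties are broken here in favor of the lowest class index, which is an assumption rather than something the text specifies.

```python
def hard_vote(decisions):
    """Return the index of the class with the most votes.

    decisions[j][k] is 1 if model j votes for class k, and 0 otherwise.
    """
    n_classes = len(decisions[0])
    totals = [sum(row[k] for row in decisions) for k in range(n_classes)]
    # max() returns the first maximal index, so ties go to the lowest class index.
    return max(range(n_classes), key=totals.__getitem__)


class_names = ["No-Fault", "LG", "LL"]
decisions = [
    [0, 1, 0],  # model 1 votes LG
    [0, 0, 1],  # model 2 votes LL
    [0, 1, 0],  # model 3 votes LG
]
winner = hard_vote(decisions)
print(class_names[winner])  # LG
```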

3.8 Soft-voting or weighted average

Soft-voting is a method where the predicted probabilities of each class from individual ML models are averaged to decide the final class label [49]. Mathematically, this can be described by the following equation:

$$\hat{y} = \frac{\sum_{j=1}^{N} w_j\, g_j(z)}{\sum_{j=1}^{N} w_j} \tag{15}$$

In this equation, $\hat{y}$ denotes the predicted output or label. N represents the total number of independent ML models or base learners in the ensemble. The variable $w_j$ indicates the weight assigned to the prediction of the j-th model. In this framework, the prediction from each base learner is scaled by its respective weight. The term $\sum_{j=1}^{N} w_j\, g_j(z)$ aggregates the weighted predictions from each model $g_j(z)$ for a given input z. The denominator, $\sum_{j=1}^{N} w_j$, represents the total sum of the weights, which is used to normalize the weighted sum of predictions.
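A short sketch of the weighted average in Eq. (15) follows, applied to per-class probability vectors and followed by an argmax to obtain the label. The function name, the probabilities, and the weights are hypothetical illustration values.

```python
def soft_vote(probabilities, weights):
    """Weighted average of per-model class-probability vectors, then argmax.

    probabilities[j][k] is model j's predicted probability for class k.
    """
    total_weight = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[k] for w, p in zip(weights, probabilities)) / total_weight
        for k in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
weights = [2.0, 1.0, 1.0]
best, averaged = soft_vote(probs, weights)
print(best, [round(a, 3) for a in averaged])  # 1 [0.425, 0.575]
```

With equal weights this reduces to a plain average of the predicted probabilities.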

3.9 Stacking

Stacking, also known as stacked generalization, is an ensemble technique that combines multiple ML models, or base learners, to improve predictive accuracy [50]. This method involves training a meta-learner or a higher-level model, which learns the best way to aggregate the predictions from different base models. Meta-learning is a specialized form of learning where algorithms are trained using the outputs of other ML algorithms to produce more accurate predictions.

3.10 Blending

Blending is a meta-ensemble learning approach in which a separate, secondary ML model is trained to optimally combine the predictions of multiple base models and produce the final predictions [51]. Whereas stacking typically trains the meta-learner on cross-validated base-model predictions, blending typically trains it on base-model predictions for a held-out validation set.

4 Proposed approach

This paper proposes a hybrid ML model designed for efficient electrical fault analysis and classification. The framework of the proposed hybrid model is illustrated in Fig 1. It encompasses several key phases: data collection, data preprocessing, data splitting, hybrid model integration, and classification. The process begins with collecting and preprocessing data, which includes label encoding and standard scaling. The dataset is then split into 85% for training and 15% for testing. Hybrid Model-1 (RF + DT + Stacking) combines Random Forest (RF) and Decision Tree (DT) as base learners, whose predictions are stacked and fed into a Gradient Boosting (GB) meta-learner to produce the final classification. This ensemble strategy leverages the complementary strengths of RF and DT while using GB to refine the final prediction and improve performance. The final classification phase assesses the model’s effectiveness using evaluation metrics such as accuracy, precision, recall, and F1 score. Algorithm 1 outlines the steps of the proposed hybrid approach.

Fig 1. Proposed methodology.

https://doi.org/10.1371/journal.pone.0341238.g001

Algorithm 1. Hybrid Stacking Ensemble Model for Electrical Fault Detection (Base Learners: Random Forest & Decision Tree; Meta-Learner: Gradient Boosting).
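One way to approximate the stacking configuration named in Algorithm 1 is scikit-learn's `StackingClassifier`. The sketch below is a hedged illustration under several assumptions: the text does not state which library the authors used, the data here is synthetic (generated with `make_classification`) rather than the fault dataset, and all hyperparameters are placeholders rather than the study's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for six numeric input features (assumption, not the real data).
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)

# 85% / 15% train/test split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)

# Base learners: RF and DT. Meta-learner: GB. Hyperparameters are placeholders.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=GradientBoostingClassifier(n_estimators=50, random_state=0),
    cv=5,  # the meta-learner is trained on cross-validated base predictions
)
stack.fit(X_train, y_train)

accuracy = accuracy_score(y_test, stack.predict(X_test))
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the results reported in this paper.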

4.1 Dataset

The dataset used for this research is the Electrical Faults Analysis & Classification dataset, which comprises a total of 7861 samples with ten distinct features [52]. This dataset was generated through simulation experiments in MATLAB/Simulink, modeling electrical transmission lines under diverse operating and fault conditions. The data was recorded at a sampling rate of 10 kHz, ensuring accurate capture of transient fault events.

Fig 2 illustrates the relative importance of each feature used in the classification task. The ordinate axis (“Importance”) represents the normalized feature importance scores, computed using the Gini impurity criterion from the Random Forest (RF) algorithm. These scores quantify the contribution of each feature to reducing classification error across all decision trees in the ensemble. The results indicate that current-based features (Ia, Ib, Ic) exhibit higher importance compared to voltage-based features (Va, Vb, Vc). This is expected because fault events typically cause sharper deviations in current signals, making them more discriminative for classification, whereas voltage signals tend to vary more gradually and thus provide relatively lower predictive power. Fig 3 displays the proportion of samples corresponding to each fault category within the dataset.

Before model training, data preprocessing was performed to ensure consistency and reliability. This included consolidating the output labels into a single column representing the fault type, removing missing or inconsistent values, and normalizing all features to a uniform scale. Since the dataset already contained diagnostic attributes directly related to fault classification, no additional handcrafted feature engineering was required. Table 2 summarizes the fault types and their corresponding output representations.
It should be noted that the “No-Fault” regime is included as a reference category to enable the model to distinguish between healthy and faulty system states. The dataset covers six classes (No-Fault, LG, LL, LLG, LLL, LLLG), representing the most frequent and severe transmission line faults. However, other fault scenarios such as Phase B–Ground, Phase C–Ground, or inter-phase faults (A–C, B–C) are not explicitly included in this dataset. This limitation arises from the original dataset design and is acknowledged as a potential extension for future work to enhance the generalizability of the proposed model.

Fig 2. Importance of features for classifying the fault.

https://doi.org/10.1371/journal.pone.0341238.g002

Fig 3. Percentage distribution of fault types in the dataset.

https://doi.org/10.1371/journal.pone.0341238.g003

Table 2. The experimental results of the Hybrid models (Base ML Models + Ensemble Techniques) for fault classification.

https://doi.org/10.1371/journal.pone.0341238.t002

This classification framework allows for accurate and efficient identification of six distinct types of faults, essential for effective fault detection and maintenance planning in electrical power systems. The simulated nature and diversity of the dataset provide a comprehensive basis for training and evaluating the proposed hybrid model.

4.2 Data preprocessing

Data preprocessing is a fundamental step in preparing the dataset for ML models. In our dataset, some features are categorical and others are numeric, and using the data unprocessed can lead to overfitting problems. In this research, we therefore used Label Encoding and the Standard Scaler to prepare the data [53], ultimately enhancing the performance and reliability of the model. Label encoding is a technique for converting categorical data into numerical format. It assigns a unique integer to each category, allowing ML algorithms to process the data. The Standard Scaler, in turn, is a data preprocessing technique that transforms each numerical feature to have a mean of 0 and a standard deviation of 1. This standardization process is crucial for many ML algorithms, ensuring that features with different scales contribute equally to the model’s learning process.
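To show what these two transformations do, the sketch below re-implements them in plain Python on toy values. It is an illustration under assumptions, not the study's code: the function names are hypothetical, categories are numbered in sorted order, and the population standard deviation is used for scaling.

```python
from statistics import mean, pstdev


def label_encode(values):
    """Map each distinct category to an integer, numbering categories in sorted order."""
    mapping = {category: index for index, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def standard_scale(column):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


# Toy values (illustrative only).
fault_types = ["LG", "No-Fault", "LL", "LG"]
encoded, mapping = label_encode(fault_types)
print(mapping)  # {'LG': 0, 'LL': 1, 'No-Fault': 2}
print(encoded)  # [0, 2, 1, 0]

ia = [10.0, 20.0, 30.0, 40.0]
scaled = standard_scale(ia)
print([round(v, 4) for v in scaled])  # [-1.3416, -0.4472, 0.4472, 1.3416]
```

A constant column has zero standard deviation and would need special handling before scaling. To avoid information leakage, the mean and standard deviation are normally estimated on the training split only and then reused on the test split.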

4.3 Hybrid model

This study developed hybrid models by combining various supervised ML algorithms to leverage their strengths and improve overall predictive performance. Each hybrid model integrates two different algorithms, providing a diverse approach to classification. Five unique hybrid models were created by pairing various ML algorithms to identify the most effective combinations. These combinations are detailed in Table 3.

Table 3. List of Hybrid models using various combinations of ML algorithms.

https://doi.org/10.1371/journal.pone.0341238.t003

Subsequently, these hybrid models were integrated with four ensemble techniques: Soft-Voting, Hard-Voting, Stacking, and Blending. Using these combinations, we constructed various ensemble ML models. Comprehensive experiments were conducted on these ensemble models (hybrid models + ensemble techniques) to identify the optimal ensemble model for fault classification tasks. Based on the experimental results, the most effective ensemble model for electrical fault analysis was the combination of RF, DT, and Stacking, denoted as RF + DT + Stacking, which achieved an accuracy of 93.64%.

In Model-1, the RF and DT algorithms were combined to balance complexity and interpretability. RF, an ensemble method, excels at capturing intricate patterns and reducing overfitting by aggregating multiple decision trees, while DT offers transparency through its hierarchical decision-making. The other models likewise integrate complementary algorithms to enhance classification performance. Model-2 (RF + KNN) combines RF’s robustness with KNN’s instance-based learning: RF constructs multiple decision trees, while KNN classifies instances based on their similarity to neighboring samples. This combination enhances adaptability in complex classification tasks by leveraging RF’s ability to handle high-dimensional data and KNN’s effectiveness in local pattern recognition. Model-3 (NB + DT) pairs NB’s probabilistic approach with DT’s interpretability. NB efficiently handles categorical data and assumes feature independence, which makes it computationally efficient, while DT enables structured decision-making. The fusion of these classifiers is intended to enhance predictive accuracy by incorporating both probabilistic reasoning and rule-based learning. Model-4 (KNN + NB) merges KNN’s local generalization with NB’s simplicity. KNN determines class membership by evaluating the proximity of neighboring samples, while NB estimates class probabilities based on feature distributions. The combination is intended to enhance classification robustness, particularly in scenarios with varying data distributions and noise. Finally, Model-5 (RF + NB) fuses RF’s ensemble strength with NB’s probabilistic nature. RF contributes its diverse decision trees, while NB provides a lightweight and efficient classification mechanism. This integration aims to improve generalization by combining RF’s feature importance ranking with NB’s fast probabilistic inference. Together, these combinations are intended to provide diverse and improved fault classification outcomes.
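As a rough illustration of how the outputs of two base learners can be fused, the sketch below contrasts hard voting (majority of predicted labels) with soft voting (average of class probabilities). The class names, probability values, function names, and the alphabetical tie-break rule are hypothetical and are not taken from the study's implementation.

```python
from collections import Counter

def hard_vote(labels):
    """Majority vote over predicted labels; ties go to the alphabetically first label."""
    counts = Counter(labels)
    best = max(counts.values())
    return min(label for label, c in counts.items() if c == best)

def soft_vote(probas):
    """Average per-class probabilities across models and return (top class, averages)."""
    classes = probas[0].keys()
    avg = {c: sum(p[c] for p in probas) / len(probas) for c in classes}
    return max(avg, key=avg.get), avg

# Hypothetical probability outputs of two base learners (e.g. RF and DT) for one sample.
rf_proba = {"fault_A": 0.55, "fault_B": 0.40, "no_fault": 0.05}
dt_proba = {"fault_A": 0.10, "fault_B": 0.90, "no_fault": 0.00}

# Each model's hard prediction is its most probable class.
rf_label = max(rf_proba, key=rf_proba.get)
dt_label = max(dt_proba, key=dt_proba.get)

hard = hard_vote([rf_label, dt_label])       # the two models disagree, so the tie-break decides
soft, avg = soft_vote([rf_proba, dt_proba])  # the confident DT probability dominates the average
print(hard, soft, avg)
```

With only two base learners, hard voting depends on a tie-break whenever they disagree, whereas soft voting can still use the confidence of each model.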

4.4 Meta learner

The meta-learner serves as an integrative layer, trained on the predictions of base learners rather than directly on the original data. Acting as a higher-level model, the meta-learner learns the relationships between base model predictions and the actual target values, refining the final predictions by effectively combining base learners’ outputs. Common choices for meta-learners include linear regression (LR), logistic regression, and gradient boosting (GB). By leveraging the strengths of diverse base models and mitigating their weaknesses, the meta-learner aims to improve the final predictive model’s accuracy, robustness, and generalization. This study used LR and GB as meta-learners to aggregate predictions from various base models, and we selected GB as the preferred meta-learner because it achieved the highest accuracy in Model-1. By iteratively refining predictions and capturing complex patterns, GB enhances the stacking ensemble’s accuracy and robustness, improving on the LR meta-learner (93.64% versus 91.44% accuracy for Model-1).
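The idea that a meta-learner is fitted on base-model predictions rather than on raw features can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions: the two rule-based base models, the toy hold-out data, and the lookup-table meta-learner are all hypothetical stand-ins, whereas the study itself uses LR and GB as meta-learners.

```python
from collections import Counter, defaultdict

def fit_meta(base_models, X_holdout, y_holdout):
    """Fit a lookup-table meta-learner on base-model predictions, not on raw features."""
    table = defaultdict(Counter)
    for x, y in zip(X_holdout, y_holdout):
        key = tuple(model(x) for model in base_models)   # meta-features
        table[key][y] += 1
    # For each combination of base predictions, keep the most frequent true label.
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}

def predict_stacked(base_models, meta, x, fallback):
    """Predict via the meta-learner; use fallback for unseen prediction combinations."""
    key = tuple(model(x) for model in base_models)
    return meta.get(key, fallback)

# Hypothetical, already-trained base learners (stand-ins for e.g. RF and DT).
def base_a(x):
    return 1 if x[0] > 0.5 else 0

def base_b(x):
    return 1 if x[1] < 0.3 else 0

# Toy hold-out set: a fault (label 1) is present only when both rules fire.
X_holdout = [(0.9, 0.1), (0.8, 0.2), (0.9, 0.8), (0.2, 0.1), (0.1, 0.9), (0.7, 0.9)]
y_holdout = [1, 1, 0, 0, 0, 0]

models = [base_a, base_b]
meta = fit_meta(models, X_holdout, y_holdout)
pred_fault = predict_stacked(models, meta, (0.95, 0.05), fallback=0)
pred_clear = predict_stacked(models, meta, (0.95, 0.95), fallback=0)
print(meta, pred_fault, pred_clear)
```

Fitting the meta-learner on hold-out predictions, rather than on the data the base models were trained on, is the usual way to keep it from simply memorizing overfitted base outputs.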

4.5 Evaluation metrics

To evaluate the proposed hybrid ML model’s performance, critical metrics such as True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) are derived from the confusion matrix. These metrics are then utilized to compute Precision, Recall, F1-score, and Accuracy. This comprehensive assessment framework allows for a thorough evaluation of the model’s overall effectiveness in classification tasks [54,55].

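Let M be the number of classes and q_i be the support (number of true samples) for class i. Define true positives TP_i, false positives FP_i, and false negatives FN_i for class i. Total samples: $N = \sum_{i=1}^{M} q_i$.

Accuracy

\[ \mathrm{Accuracy} = \frac{1}{N} \sum_{i=1}^{M} TP_i \qquad (16) \]

Per-class Precision, Recall, F1

\[ P_i = \frac{TP_i}{TP_i + FP_i}, \qquad R_i = \frac{TP_i}{TP_i + FN_i}, \qquad F1_i = \frac{2 P_i R_i}{P_i + R_i} \qquad (17) \]

Weighted averages

\[ P_w = \sum_{i=1}^{M} \frac{q_i}{N} P_i, \qquad R_w = \sum_{i=1}^{M} \frac{q_i}{N} R_i, \qquad F1_w = \sum_{i=1}^{M} \frac{q_i}{N} F1_i \qquad (18) \]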
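The metrics described in this section (accuracy, per-class precision, recall and F1, and their support-weighted averages) can be computed directly from a multi-class confusion matrix. The sketch below is a minimal illustration; the function name, the zero-division convention, and the 3-class example matrix are assumptions and do not come from the study.

```python
def classification_report(conf):
    """conf[i][j] = number of samples of true class i predicted as class j."""
    m = len(conf)
    n = sum(sum(row) for row in conf)                 # total samples N
    support = [sum(row) for row in conf]              # q_i
    tp = [conf[i][i] for i in range(m)]
    fp = [sum(conf[r][i] for r in range(m)) - tp[i] for i in range(m)]
    fn = [support[i] - tp[i] for i in range(m)]

    def safe_div(a, b):
        # Assumed convention: a metric with an empty denominator counts as 0.
        return a / b if b else 0.0

    precision = [safe_div(tp[i], tp[i] + fp[i]) for i in range(m)]
    recall = [safe_div(tp[i], tp[i] + fn[i]) for i in range(m)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]

    def weighted(values):
        # Support-weighted average over classes.
        return sum(q * v for q, v in zip(support, values)) / n

    return {
        "accuracy": sum(tp) / n,
        "precision_w": weighted(precision),
        "recall_w": weighted(recall),
        "f1_w": weighted(f1),
    }

# Hypothetical confusion matrix for three classes with 10 samples each.
conf = [[8, 2, 0],
        [1, 9, 0],
        [0, 1, 9]]
report = classification_report(conf)
print(report)
```

Because each class's recall is weighted by its own support, the weighted recall reduces algebraically to the overall accuracy, which is consistent with the identical A and R values reported in Section 5.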

5 Experiment results

In this section, we present the results of our experiments designed to evaluate the proposed hybrid model’s performance. We assess the model’s effectiveness in classifying transmission line faults using various performance metrics, including accuracy, precision, recall, and F1-score.

5.1 Result

In this research, we conducted a comprehensive experiment. First, we assessed the performance of various ML algorithms for electrical fault classification using the Electrical Faults Analysis & Classification dataset. Subsequently, we conducted experiments with the five hybrid ML models listed in Table 3 and the ensemble techniques (Hard Voting, Soft Voting, Blending, and Stacking with two meta-learners) to determine the optimal ensemble model for electrical fault classification. Table 4 presents the performance metrics of the ML algorithms evaluated in our study. The table includes Accuracy (A), Precision (P), Recall (R), and F1–Score for each algorithm, allowing a comprehensive comparison of their effectiveness in classifying electrical faults. The DT and RF algorithms both achieved the highest A of 88.73%, demonstrating their robustness in fault classification tasks. Because their accuracies are equal, we consider memory usage and computational cost as additional criteria: DT requires less execution time and memory than RF and is therefore preferable on these criteria. The KNN algorithm achieved an A of 78.31%, with P and F1–Score slightly lower, reflecting its moderate performance in this context. The SVM algorithm had the lowest A (74.32%) and F1–Score (72.25%), indicating limited effectiveness for this specific dataset. NB performed better than KNN and SVM, with an A of 79.75% and a balanced F1–Score of 76.66%. AdaBoost reached an A of 78.81%, slightly below NB, with a P of 77.29%. These results suggest that tree-based algorithms (DT, RF) are more effective for this type of fault data because they capture non-linear relationships and handle categorical features efficiently, whereas distance-based (KNN) and margin-based (SVM) classifiers struggle due to overlapping fault patterns in the dataset.
From a real-time applicability perspective, NB and DT exhibit the lowest execution times (62 ms and 205 ms, respectively) and minimal memory requirements, making them highly suitable for online grid fault detection systems. RF, despite its competitive accuracy, requires significantly higher execution time (2524 ms) and memory (13.67 MB), which may limit its deployment in latency-sensitive environments. Similarly, algorithms like SVM and AdaBoost demonstrate higher computational overheads, raising concerns for real-time grid monitoring. Therefore, while accuracy remains an important criterion, execution time and memory efficiency suggest that DT and NB can serve as practical candidates for real-time deployment, whereas RF and SVM may be more appropriate in offline or batch-processing scenarios.
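Execution time and memory figures of the kind quoted above can be collected with standard-library tooling. The sketch below is one possible way to do so and is not the measurement procedure used in the study; `profile` and `dummy_predict` are hypothetical names, and `tracemalloc` only tracks allocations made through Python's memory allocator, so it may under-report memory used by native extensions.

```python
import time
import tracemalloc

def profile(fn, *args, **kwargs):
    """Return (result, elapsed_ms, peak_mb) for a single call of fn."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed_ms, peak_bytes / (1024 * 1024)

def dummy_predict(n):
    # Hypothetical stand-in for a classifier's predict call.
    return [i % 6 for i in range(n)]

labels, ms, mb = profile(dummy_predict, 10_000)
print(f"{len(labels)} predictions in {ms:.2f} ms, peak {mb:.3f} MB")
```

A single call is noisy; repeating the measurement and reporting the median would give more stable numbers for comparing classifiers.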

Table 4. Performance metrics of various ML algorithms.

https://doi.org/10.1371/journal.pone.0341238.t004

Table 5 presents the classification results of the hybrid models, which combine base ML models with Hard Voting, Soft Voting, Stacking, and Blending techniques for fault classification tasks. Model-1 exhibited the best overall performance, achieving the highest A, P, R, and F1–Score values of 93.64%, 93.65%, 93.64%, and 93.64%, respectively, with the Stacking (Gradient Boosting) technique. Relative to the best single models (DT and RF, 88.73%), this is an accuracy gain of about 4.9 percentage points, which is consistent with the expectation that ensemble learning reduces the variance and bias of individual classifiers. Model-1 also outperformed the other models with the Stacking (Linear Regression) technique, achieving A, P, R, and F1–Score values of 91.44%, 91.46%, 91.44%, and 91.43%, respectively, a gain of about 2.7 percentage points. Among all configurations, Stacking with Gradient Boosting provided the best balance between predictive accuracy and stability. The voting and blending results stayed close to or slightly below the single-model baseline. With the Hard Voting technique, Model-1 outperformed the other models, obtaining A, P, R, and F1–Score values of 88.56%, 88.58%, 88.56%, and 88.55%. With the Soft Voting technique, Model-1 again achieved the best results, with A, P, R, and F1–Score values of 89.49%, 89.50%, 89.49%, and 89.48%, respectively. With the Blending technique, Model-5 performed best, achieving A, P, R, and F1–Score values of 87.54%, 87.39%, 87.54%, and 87.46%, respectively.
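The size of the improvement over the single models can be checked with simple arithmetic on the accuracies quoted in the text. The short sketch below uses only those reported figures; the dictionary keys and variable names are illustrative.

```python
best_single = 88.73  # DT and RF accuracy (%), as reported for Table 4

# Best accuracy (%) per ensemble technique, as quoted in the text for Table 5.
ensemble_best = {
    "Hard Voting (Model-1)": 88.56,
    "Soft Voting (Model-1)": 89.49,
    "Stacking, Linear Regression (Model-1)": 91.44,
    "Stacking, Gradient Boosting (Model-1)": 93.64,
    "Blending (Model-5)": 87.54,
}

# Gain over the best single model, in percentage points.
gains = {name: round(acc - best_single, 2) for name, acc in ensemble_best.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} percentage points")
```

The differences are expressed in percentage points rather than relative percent, since they are differences between two accuracy values.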

Table 5. The experimental results of the Hybrid models (Base ML Models + Ensemble Techniques) for fault classification.

https://doi.org/10.1371/journal.pone.0341238.t005

Other hybrid models also demonstrated notable results, with Model-2 and Model-3 showing competitive outcomes under the Soft Voting technique, while Model-4 and Model-5 achieved moderate performances across both techniques.

Fig 4 illustrates the accuracy of the hybrid ML models. Model-1 achieved the highest accuracy, reaching 93.64% with the Stacking (Gradient Boosting) technique, which reinforces the earlier observation. This consistent outperformance across ensemble methods indicates that carefully combining decision-tree-based learners can yield more reliable classification than relying on individual algorithms.

Fig 4. Electrical fault classification accuracy of the Hybrid ML models.

https://doi.org/10.1371/journal.pone.0341238.g004

6 Discussion

In the field of electrical fault classification, researchers have proposed numerous ML and hybrid models to enhance classification accuracy, yet ample room remains for improvement in model performance. We address the research question "Which hybrid machine learning model can outperform previous research and improve the accuracy of electrical fault classification?" by developing various hybrid ML models and conducting extensive experiments. We utilized six ML algorithms (KNN, DT, RF, AdaBoost, NB, and SVM), both independently and in various combinations, to construct hybrid ML models. Additionally, we employed four ensemble techniques: Stacking, Blending, Hard-Voting, and Soft-Voting. Prior to experimentation with individual ML algorithms, we executed data preprocessing tasks and observed a substantial impact on classification performance. Table 4 displays the classification performance of the ML algorithms, with DT and RF both achieving the highest A of approximately 89% (88.73%). Table 5 showcases the performance of the ensemble models. The ensemble ML model (Model-1 + Stacking) achieved the highest A, P, and F1–Score of approximately 94%, and Model-1 exhibited commendable performance with other ensemble techniques. Although execution time and memory consumption were analyzed to assess computational efficiency, the proposed framework has not been validated on IEEE benchmark systems or real-world operational data. Therefore, claims related to real-time deployment should be interpreted as prospective rather than experimentally demonstrated. Nevertheless, the lightweight nature of the proposed ensemble suggests its suitability for future real-time implementation, subject to further validation on standardized test systems and field data.

Table 6 presents the performance comparison between previous research and our proposed hybrid model for the electrical fault classification task. Our proposed work achieves approximately 94% accuracy, which is lower than most of the other methods presented in the literature. The work in [56] achieves 99% accuracy with a Kalman filter and a second-order low-pass filter, specifically designed for Active Distribution Networks (ADNs) with bidirectional power flows. The study in [57], with 90% accuracy, focuses on fault isolation in smart grids using complex current and voltage criteria, tested on the IEEE 13-node test feeder. The work in [58] achieves 99.93% accuracy using a hybrid deep learning model that combines wavelet packet transform and LSTM, specifically designed for photovoltaic systems. The paper [59], with 98.85% accuracy, utilizes a CNN optimized by Gorilla Troops Optimization (GTO) for microgrids, detecting, classifying, and locating faults. While deep learning methods outperform our model in raw accuracy, they typically require significantly larger datasets, high-end computational resources, and domain-specific feature engineering. In contrast, our proposed hybrid model achieves competitive performance on a modest dataset while remaining lightweight, interpretable, and computationally efficient, which makes the proposed approach particularly suitable for resource-constrained environments and rapid deployment in practical power system monitoring. Although ML techniques can in principle be applied to any tabular dataset, electrical fault classification is distinctive because the features arise directly from power-system transient dynamics: the model must learn system-specific nonlinearities resulting from impedance, line length, and fault resistance, which are not present in generic classification tasks. On the binary-class version of the dataset [52], our model achieved an accuracy of 99.58%. Taken together, these results suggest that the proposed hybrid model is effective for electrical fault classification tasks.

Table 6. Performance comparison between previous research and the proposed hybrid model.

https://doi.org/10.1371/journal.pone.0341238.t006

7 Conclusion

Fault classification in electrical systems has become a crucial area of research, particularly with the growing complexity of modern power grids and the need for reliable system protection. In this study, we developed five hybrid models using advanced ensemble techniques such as Hard-Voting, Soft-Voting, Stacking, and Blending to determine the most effective approach for fault classification. Extensive experiments were conducted using electrical fault datasets to evaluate the performance of these models. Based on the experimental findings, the hybrid model denoted as (RF+DT+Stacking) achieved the highest classification accuracy of 93.64%, outperforming other ensemble models. Compared with prior works, our model offers competitive accuracy while maintaining computational efficiency. For example, deep learning-based methods in the literature achieve very high accuracy (98–99.9%), but they typically require extensive datasets and higher computational resources. By contrast, our model demonstrates a more practical trade-off between performance and efficiency, making it more feasible for real-time grid monitoring. However, this study also has several limitations. The dataset used is relatively small and primarily simulated, which may not fully capture the diversity of real-world grid conditions. Additionally, the evaluation does not include large-scale IEEE benchmark systems or extensive statistical robustness tests (e.g., cross-validation). For future work, we plan to: (i) validate the model on larger and more diverse real-world datasets; (ii) apply the method to IEEE benchmark test systems; (iii) incorporate explainable AI methods to improve interpretability for operators; and (iv) explore additional hybrid and deep learning architectures to further enhance classification accuracy and robustness.

8 Strengths, limitations, and future perspectives

This research offers a robust hybrid ML model combining RF, DT, and Stacking techniques, significantly improving fault classification accuracy. The inclusion of ensemble techniques enhances decision-making, contributing to more reliable and proactive grid maintenance. The model’s high-performance metrics demonstrate its effectiveness in handling complex transmission line faults.

This study evaluates model performance using a single train–test split (85%–15%) due to computational and time constraints. Consequently, cross-validation (e.g., k-fold validation), robustness analysis under varying noise levels, and statistical significance testing were not performed. While the reported results provide useful comparative insights, they should not be interpreted as definitive evidence of strong generalization across unseen operating conditions. Future work will incorporate cross-validation, sensitivity analysis, and statistical tests to strengthen the reliability of the conclusions.
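The k-fold validation mentioned here as future work could be set up along the following lines. This is a dependency-free sketch of stratified fold assignment only; the study did not perform cross-validation, and the function name, the choice of k = 5, the seed, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions kept roughly equal per fold."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)   # deal each class round-robin into the folds

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train_idx, test_idx

# Toy label vector: 20 fault samples and 10 no-fault samples.
labels = ["fault"] * 20 + ["no_fault"] * 10
splits = list(stratified_kfold(labels, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each sample appears in exactly one test fold, so averaging a metric over the k folds uses every sample once for evaluation, unlike a single 85%–15% split.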

The model’s performance heavily depends on the dataset’s quality and diversity. Additionally, the complexity of the hybrid approach increases computational requirements, potentially limiting real-time applications in resource-constrained environments.

Future research could focus on improving computational efficiency and exploring deep learning methods for more nuanced fault classification. Expanding the dataset to cover various environmental and operational conditions will enhance the model’s adaptability and robustness in diverse grid scenarios.

References

1. Bhandari V, Konidena R. Modern Electricity Systems.
2. Paul CR. Analysis of multiconductor transmission lines. John Wiley & Sons; 2007.
3. Kalaga S, Yenumula P. Design of electrical transmission lines: structures and foundations. CRC Press; 2016.
4. Kiessling F, Nefzger P, Nolasco JF, Kaintzyk U. Overhead power lines: planning, design, construction. Springer; 2014.
5. Dehghanian P, Aslan S, Dehghanian P. Maintaining electric system safety through an enhanced network resilience. IEEE Trans on Ind Applicat. 2018;54(5):4927–37.
6. Tsimberg Y, Lotho K, Dimnik C, Wrathall N, Mogilevsky A. Determining transmission line conductor condition and remaining life. In: 2014 IEEE PES T&D Conference and Exposition. 2014. p. 1–5.
7. Rahman H, Khan BH. Power upgrading of transmission line by combining AC–DC transmission. IEEE Trans Power Syst. 2007;22(1):459–66.
8. Bayliss C, Hardy B. Transmission and distribution electrical engineering. Elsevier; 2011.
9. Nayyar A, Gadhavi L, Zaman N. Machine learning in healthcare: review, opportunities and challenges. Machine learning and the internet of medical things in healthcare. Elsevier; 2021. p. 23–45. https://doi.org/10.1016/b978-0-12-821229-5.00011-2
10. Hasan MR. Revitalizing the electric grid: a machine learning paradigm for ensuring stability in the U.S.A. JCSTS. 2024;6(1):141–54.
11. Shiplu AI, Rahman MM, Watanobe Y. A robust ensemble machine learning model with advanced voting techniques for comment classification. In: International Conference on Big Data Analytics. 2023. p. 141–59.
12. Nasution M, Munthe IR, Nasution FA, Defit S. Optimizing text classification using techniques adaboost ensemble with decision tree algorithm. CogITo Smart Journal. 2025;11(1):39–51.
13. Bahgat BH, Elhay EA, Elkholy MM. Advanced fault detection technique of three phase induction motor: comprehensive review. Discov Electron. 2024;1(1):9.
14. Diab AAZ, El-Sayed A-HM, Abbas HH, Sattar MAE. Robust speed controller design using H_infinity theory for high-performance sensorless induction motor drives. Energies. 2019;12(5):961.
15. Aziz AGMA, Diab AAZ, Sattar MAE. Speed sensorless vector controlled induction motor drive based stator and rotor resistances estimation taking core losses into account. In: 2017 Nineteenth International Middle East Power Systems Conference (MEPCON). 2017. p. 1059–68. https://doi.org/10.1109/mepcon.2017.8301313
16. Diab AAZ, El-Sattar MA. Adaptive model predictive based load frequency control in an interconnected power system. In: 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). 2018. p. 604–10. https://doi.org/10.1109/eiconrus.2018.8317170
17. Emad-Eldeen A, Azim MA, Abdelsattar M, AbdelMoety A. Utilizing machine learning and deep learning for enhanced supercapacitor performance prediction. Journal of Energy Storage. 2024;100:113556.
18. Abdelsattar M, Ismeil MA, Zayed MMAA, Abdelmoety A, Emad-Eldeen A. Assessing machine learning approaches for photovoltaic energy prediction in sustainable energy systems. IEEE Access. 2024;12:107599–615.
19. Fawzy IY, Mohamad YS, Shehata EG, Abd El Sattar M. A modified perturb and observe technique for MPPT of intgrated PV system using DC-DC boost converter. Journal of Advanced Engineering Trends. 2021;40(1):63–77.
20. Hafez WA, Sattar MAE, Alaboudy AHK, Elbaset AA. Power quality issues of grid connected wind energy system focus on DFIG and various control techniques of active harmonic filter: a review. In: 2019 21st International Middle East Power Systems Conference (MEPCON). 2019. p. 1006–14. https://doi.org/10.1109/mepcon47431.2019.9008171
21. Abd El Hamed A, Ebeed M, Refai A, Abd El Sattar M, A. Elbaset A, Ahmed T. Application of slime mould algorithm for optimal allocation of datacom and PV system in real egyptian radial network. Sohag Engineering Journal. 2021;1(1):16–24.
22. Abdelsattar M, AbdelMoety A, Emad-Eldeen A. A review on detection of solar pv panels failures using image processing techniques. In: 2023 24th International Middle East Power System Conference (MEPCON); 2023. p. 1–6.
23. Abdelsattar M, Mesalam A, Fawzi A, Hamdan I. Mountain gazelle optimizer for standalone hybrid power system design incorporating a type of incentive-based strategies. Neural Comput & Applic. 2024;36(12):6839–53.
24. Abdelsattar M, A Ismeil M, Menoufi K, AbdelMoety A, Emad-Eldeen A. Evaluating machine learning and deep learning models for predicting wind turbine power output from environmental factors. PLoS One. 2025;20(1):e0317619. pmid:39847588
25. Abulkhair AF, Abdelsattar M, Mohamed HA. Negative effects and processing methods review of renewable energy sources on modern power system: a review. International Journal of Renewable Energy Research (IJRER). 2024;14(2):385–94.
26. Alaboudy AHK, Elbaset AA, Abdelsattar M. A case study on the LVRT capability of an Egyptian electrical grid linked to the Al-Zafarana wind park using series resistor. International Journal of Renewable Energy Research. 2023;13(1):36–48.
27. Abdelsattar M, Mesalam A, Diab AAZ, Fawzi A, Hamdan I. Optimal sizing of a proposed stand-alone hybrid energy system in a remote region of southwest Egypt applying different meta-heuristic algorithms. Neural Comput & Applic. 2024;36(26):16251–69.
28. Abdelsattar M, AbdelMoety A, Emad-Eldeen A. Comparative analysis of machine learning techniques for fault detection in solar panel systems. SVU-International Journal of Engineering Sciences and Applications. 2024;5(2):140–52.
29. Awasthi S, Singh G, Ahamad N. Classifying electrical faults in a distribution system using K-Nearest Neighbor (KNN) model in presence of multiple distributed generators. J Inst Eng India Ser B. 2024;105(3):621–34.
30. Ding S, Hao M, Cui Z, Wang Y, Hang J, Li X. Application of multi-SVM classifier and hybrid GSAPSO algorithm for fault diagnosis of electrical machine drive system. ISA Trans. 2023;133:529–38. pmid:35868910
31. R. A, Nair DS, T. R, V. V. A novel SVM based adaptive scheme for accurate fault identification in microgrid. Electric Power Systems Research. 2023;221:109439.
32. Rahman MdM, Shiplu AI, Watanobe Y. CommentClass: a robust ensemble machine learning model for comment classification. Int J Comput Intell Syst. 2024;17(1):184.
33. Wang X, Han T. Transformer fault diagnosis based on stacking ensemble learning. IEEJ Transactions Elec Engng. 2020;15(12):1734–9.
34. Sun P, Liu X, Lin M, Wang J, Jiang T, Wang Y. Transmission line fault diagnosis method based on improved multiple SVM model. IEEE Access. 2023;11:133825–34.
35. Yin W, Gumabay MVN, Lin H, Tu C, Ao C. Overhead transmission lines early warning and decision support system with predictive analytics. In: 2022 5th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE). 2022. p. 310–4. https://doi.org/10.1109/aemcse55572.2022.00070
36. Tong H, Qiu RC, Zhang D, Yang H, Ding Q, Shi X. Detection and classification of transmission line transient faults based on graph convolutional neural network. CSEE Journal of Power and Energy Systems. 2021;7(3):456–71.
37. Yu H, Zhang L, Zhao P, Liu Z, Yang Z, Jin M, et al. Fault diagnosis of power transmission line based on elgamal encryption algorithm. In: 2022 IEEE 2nd International Conference on Electronic Technology, Communication and Information (ICETCI). 2022. p. 953–7. https://doi.org/10.1109/icetci55101.2022.9832408
38. Ma J, Zhang X, Peng X. Simulation evaluation of lightning and non lightning fault identification of transmission line. In: 2022 5th International Conference on Energy, Electrical and Power Engineering (CEEPE). 2022. p. 452–8. https://doi.org/10.1109/ceepe55110.2022.9783404
39. Lahiri S, Abhijnan C, De MA. Fault diagnosis in power transmission line using decision tree and random forest classifier. In: 2022 IEEE 6th International Conference on Condition Assessment Techniques in Electrical Systems (CATCON). 2022. p. 57–61. https://doi.org/10.1109/catcon56237.2022.10077633
40. Agarwal S, Swetapadma A, Panigrahi C, Dasgupta A. Fault detection in direct current transmission lines using discrete fourier transform from single terminal current signals. In: 2017 1st International Conference on Electronics, Materials Engineering and Nano-Technology (IEMENTech). 2017. p. 1–5. https://doi.org/10.1109/iementech.2017.8076975
41. Hao F, Yang X, Wang G, Feng Y. Transmission line fault diagnosis based on machine learning. In: 2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE); 2023. p. 847–50.
42. Yu H, Liu M, Wang S. Research on fault diagnosis in the railway power transmission line based on the modern mathematical methods. In: 2018 2nd IEEE Conference on Energy Internet and Energy System Integration (EI2). 2018. p. 1–5. https://doi.org/10.1109/ei2.2018.8582049
43. Siddique AH, Tasnim S, Shahriyar F, Hasan M, Rashid K. Renewable energy sector in Bangladesh: the current scenario, challenges and the role of iot in building a smart distribution grid. Energies. 2021;14(16):5083.
44. Liu Z, Han Z, Zhang Y, Zhang Q. Multiwavelet packet entropy and its application in transmission line fault recognition and classification. IEEE Trans Neural Netw Learn Syst. 2014;25(11):2043–52. pmid:25330427
45. Jamehbozorg A, Shahrtash SM. A decision-tree-based method for fault classification in single-circuit transmission lines. IEEE Trans Power Delivery. 2010;25(4):2190–6.
46. Recioui A, Benseghier B, Khalfallah H. Power system fault detection, classification and location using the K-Nearest Neighbors. In: 2015 4th International Conference on Electrical Engineering (ICEE). 2015. p. 1–6. https://doi.org/10.1109/intee.2015.7416832
47. Begum M, Shuvo MH, Ashraf I, Mamun AA, Uddin J, Samad MA. Software defects identification: results using machine learning and explainable artificial intelligence techniques. IEEE Access. 2023;11:132750–65.
48. Gandhi I, Pandey M. Hybrid ensemble of classifiers using voting. In: 2015 International Conference on Green Computing and Internet of Things (ICGCIoT). 2015. p. 399–404. https://doi.org/10.1109/icgciot.2015.7380496
49. Kumari S, Kumar D, Mittal M. An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier. International Journal of Cognitive Computing in Engineering. 2021;2:40–6.
50. Liu N, Gao H, Zhao Z, Hu Y, Duan L. A stacked generalization ensemble model for optimization and prediction of the gas well rate of penetration: a case study in Xinjiang. J Petrol Explor Prod Technol. 2021;12(6):1595–608.
51. Coulson S, Oakley T. Blending basics. 2001. https://doi.org/10.1515/cogl.2001.014
52. Prakash ES. Electrical fault detection and classification. 2021. https://www.kaggle.com/datasets/esathyaprakash/electrical-fault-detection-and-classification
53. Momotaz B, Dohi T. Prediction interval of cumulative number of software faults using multilayer perceptron. In: Lee R, editor. Applied Computing & Information Technology. Cham: Springer International Publishing; 2016. p. 43–58.
54. Rahman MM, Shiplu AI, Watanobe Y, Alam MA. RoBERTa-BiLSTM: a context-aware hybrid model for sentiment analysis. arXiv preprint 2024.
55. Begum M, Hasan Shuvo M, Kamal Nasir M, Hossain A, Jakir Hossain M, Ashraf I, et al. LCNN: lightweight CNN architecture for software defect feature identification using explainable AI. IEEE Access. 2024;12:55744–56.
56. Stefanidou-Voziki P, Corchero C, Dominguez-Garcia JL. A practical algorithm for fault classification in distribution grids. In: 2020 IEEE International Conference on Environment and Electrical Engineering and 2020 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe). 2020. p. 1–6.
57. Mumtaz F, Khan HH, Haider MU, Younas MB, Mohsin MY, Zeeshan M. Two-stage hybrid-filtering based fault detection & classification method for active distribution networks. In: 2022 International Conference on Emerging Trends in Electrical, Control, and Telecommunication Engineering (ETECTE); 2022. p. 1–6.
58. Alrifaey M, Lim WH, Ang CK, Natarajan E, Solihin MI, Juhari MRM, et al. Hybrid deep learning model for fault detection and classification of grid-connected photovoltaic system. IEEE Access. 2022;10:13852–69.
59. Hatata AY, Essa MA, Sedhom BE. Adaptive protection scheme for FREEDM microgrid based on convolutional neural network and gorilla troops optimization technique. IEEE Access. 2022;10:55583–601.

[ref1] 1. Bhandari V, Konidena R. Modern Electricity Systems.

[ref2] 2. Paul CR. Analysis of multiconductor transmission lines. John Wiley & Sons; 2007.

[ref3] 3. Kalaga S, Yenumula P. Design of electrical transmission lines: structures and foundations. CRC Press; 2016.

[ref4] 4. Kiessling F, Nefzger P, Nolasco JF, Kaintzyk U. Overhead power lines: planning, design, construction. Springer; 2014.

[ref5] 5. Dehghanian P, Aslan S, Dehghanian P. Maintaining electric system safety through an enhanced network resilience. IEEE Trans on Ind Applicat. 2018;54(5):4927–37.
View Article
Google Scholar

[6] View Article

[7] Google Scholar

[ref6] 6. Tsimberg Y, Lotho K, Dimnik C, Wrathall N, Mogilevsky A. Determining transmission line conductor condition and remaining life. In: 2014 IEEE PES T&D Conference and Exposition. 2014. p. 1–5.

[ref7] 7. Rahman H, Khan BH. Power upgrading of transmission line by combining AC–DC transmission. IEEE Trans Power Syst. 2007;22(1):459–66.
View Article
Google Scholar

[10] View Article

[11] Google Scholar

[ref8] 8. Bayliss C, Hardy B. Transmission and distribution electrical engineering. Elsevier; 2011.

[ref9] 9. Nayyar A, Gadhavi L, Zaman N. Machine learning in healthcare: review, opportunities and challenges. Machine learning and the internet of medical things in healthcare. Elsevier; 2021. p. 23–45. https://doi.org/10.1016/b978-0-12-821229-5.00011-2

[ref10] 10. Hasan MR. Revitalizing the electric grid: a machine learning paradigm for ensuring stability in the U.S.A. JCSTS. 2024;6(1):141–54.
View Article
Google Scholar

[15] View Article

[16] Google Scholar

[ref11] 11. Shiplu AI, Rahman MM, Watanobe Y. A robust ensemble machine learning model with advanced voting techniques for comment classification. In: International Conference on Big Data Analytics. 2023. p. 141–59.

[ref12] 12. Nasution M, Munthe IR, Nasution FA, Defit S. Optimizing text classification using techniques adaboost ensemble with decision tree algorithm. CogITo Smart Journal. 2025;11(1):39–51.
View Article
Google Scholar

[19] View Article

[20] Google Scholar

[ref13] 13. Bahgat BH, Elhay EA, Elkholy MM. Advanced fault detection technique of three phase induction motor: comprehensive review. Discov Electron. 2024;1(1):9.
View Article
Google Scholar

[22] View Article

[23] Google Scholar

[ref14] 14. Diab AAZ, El-Sayed A-HM, Abbas HH, Sattar MAE. Robust speed controller design using H_infinity theory for high-performance sensorless induction motor drives. Energies. 2019;12(5):961.
View Article
Google Scholar

[25] View Article

[26] Google Scholar

[ref15] 15. Aziz AGMA, Diab AAZ, Sattar MAE. Speed sensorless vector controlled induction motor drive based stator and rotor resistances estimation taking core losses into account. In: 2017 Nineteenth International Middle East Power Systems Conference (MEPCON). 2017. p. 1059–68. https://doi.org/10.1109/mepcon.2017.8301313

[ref16] 16. Diab AAZ, El-Sattar MA. Adaptive model predictive based load frequency control in an interconnected power system. In: 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). 2018. p. 604–10. https://doi.org/10.1109/eiconrus.2018.8317170

[ref17] 17. Emad-Eldeen A, Azim MA, Abdelsattar M, AbdelMoety A. Utilizing machine learning and deep learning for enhanced supercapacitor performance prediction. Journal of Energy Storage. 2024;100:113556.
View Article
Google Scholar

[30] View Article

[31] Google Scholar

[ref18] 18. Abdelsattar M, Ismeil MA, Zayed MMAA, Abdelmoety A, Emad-Eldeen A. Assessing machine learning approaches for photovoltaic energy prediction in sustainable energy systems. IEEE Access. 2024;12:107599–615.

[ref19] 19. Fawzy IY, Mohamad YS, Shehata EG, Abd El Sattar M. A modified perturb and observe technique for MPPT of integrated PV system using DC-DC boost converter. Journal of Advanced Engineering Trends. 2021;40(1):63–77.

[ref20] 20. Hafez WA, Sattar MAE, Alaboudy AHK, Elbaset AA. Power quality issues of grid connected wind energy system focus on DFIG and various control techniques of active harmonic filter: a review. In: 2019 21st International Middle East Power Systems Conference (MEPCON). 2019. p. 1006–14. https://doi.org/10.1109/mepcon47431.2019.9008171

[ref21] 21. Abd El Hamed A, Ebeed M, Refai A, Abd El Sattar M, Elbaset AA, Ahmed T. Application of slime mould algorithm for optimal allocation of datacom and PV system in real Egyptian radial network. Sohag Engineering Journal. 2021;1(1):16–24.

[ref22] 22. Abdelsattar M, AbdelMoety A, Emad-Eldeen A. A review on detection of solar PV panels failures using image processing techniques. In: 2023 24th International Middle East Power System Conference (MEPCON). 2023. p. 1–6.

[ref23] 23. Abdelsattar M, Mesalam A, Fawzi A, Hamdan I. Mountain gazelle optimizer for standalone hybrid power system design incorporating a type of incentive-based strategies. Neural Comput & Applic. 2024;36(12):6839–53.

[ref24] 24. Abdelsattar M, A Ismeil M, Menoufi K, AbdelMoety A, Emad-Eldeen A. Evaluating machine learning and deep learning models for predicting wind turbine power output from environmental factors. PLoS One. 2025;20(1):e0317619. pmid:39847588

[ref25] 25. Abulkhair AF, Abdelsattar M, Mohamed HA. Negative effects and processing methods review of renewable energy sources on modern power system: a review. International Journal of Renewable Energy Research (IJRER). 2024;14(2):385–94.

[ref26] 26. Alaboudy AHK, Elbaset AA, Abdelsattar M. A case study on the LVRT capability of an Egyptian electrical grid linked to the Al-Zafarana wind park using series resistor. International Journal of Renewable Energy Research. 2023;13(1):36–48.

[ref27] 27. Abdelsattar M, Mesalam A, Diab AAZ, Fawzi A, Hamdan I. Optimal sizing of a proposed stand-alone hybrid energy system in a remote region of southwest Egypt applying different meta-heuristic algorithms. Neural Comput & Applic. 2024;36(26):16251–69.

[ref28] 28. Abdelsattar M, AbdelMoety A, Emad-Eldeen A. Comparative analysis of machine learning techniques for fault detection in solar panel systems. SVU-International Journal of Engineering Sciences and Applications. 2024;5(2):140–52.

[ref29] 29. Awasthi S, Singh G, Ahamad N. Classifying electrical faults in a distribution system using K-Nearest Neighbor (KNN) model in presence of multiple distributed generators. J Inst Eng India Ser B. 2024;105(3):621–34.

[ref30] 30. Ding S, Hao M, Cui Z, Wang Y, Hang J, Li X. Application of multi-SVM classifier and hybrid GSAPSO algorithm for fault diagnosis of electrical machine drive system. ISA Trans. 2023;133:529–38. pmid:35868910

[ref31] 31. R. A, Nair DS, T. R, V. V. A novel SVM based adaptive scheme for accurate fault identification in microgrid. Electric Power Systems Research. 2023;221:109439.

[ref32] 32. Rahman MdM, Shiplu AI, Watanobe Y. CommentClass: a robust ensemble machine learning model for comment classification. Int J Comput Intell Syst. 2024;17(1):184.

[ref33] 33. Wang X, Han T. Transformer fault diagnosis based on stacking ensemble learning. IEEJ Transactions Elec Engng. 2020;15(12):1734–9.

[ref34] 34. Sun P, Liu X, Lin M, Wang J, Jiang T, Wang Y. Transmission line fault diagnosis method based on improved multiple SVM model. IEEE Access. 2023;11:133825–34.

[ref35] 35. Yin W, Gumabay MVN, Lin H, Tu C, Ao C. Overhead transmission lines early warning and decision support system with predictive analytics. In: 2022 5th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE). 2022. p. 310–4. https://doi.org/10.1109/aemcse55572.2022.00070

[ref36] 36. Tong H, Qiu RC, Zhang D, Yang H, Ding Q, Shi X. Detection and classification of transmission line transient faults based on graph convolutional neural network. CSEE Journal of Power and Energy Systems. 2021;7(3):456–71.

[ref37] 37. Yu H, Zhang L, Zhao P, Liu Z, Yang Z, Jin M, et al. Fault diagnosis of power transmission line based on elgamal encryption algorithm. In: 2022 IEEE 2nd International Conference on Electronic Technology, Communication and Information (ICETCI). 2022. p. 953–7. https://doi.org/10.1109/icetci55101.2022.9832408

[ref38] 38. Ma J, Zhang X, Peng X. Simulation evaluation of lightning and non lightning fault identification of transmission line. In: 2022 5th International Conference on Energy, Electrical and Power Engineering (CEEPE). 2022. p. 452–8. https://doi.org/10.1109/ceepe55110.2022.9783404

[ref39] 39. Lahiri S, Abhijnan C, De MA. Fault diagnosis in power transmission line using decision tree and random forest classifier. In: 2022 IEEE 6th International Conference on Condition Assessment Techniques in Electrical Systems (CATCON). 2022. p. 57–61. https://doi.org/10.1109/catcon56237.2022.10077633

[ref40] 40. Agarwal S, Swetapadma A, Panigrahi C, Dasgupta A. Fault detection in direct current transmission lines using discrete Fourier transform from single terminal current signals. In: 2017 1st International Conference on Electronics, Materials Engineering and Nano-Technology (IEMENTech). 2017. p. 1–5. https://doi.org/10.1109/iementech.2017.8076975

[ref41] 41. Hao F, Yang X, Wang G, Feng Y. Transmission line fault diagnosis based on machine learning. In: 2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE). 2023. p. 847–50.

[ref42] 42. Yu H, Liu M, Wang S. Research on fault diagnosis in the railway power transmission line based on the modern mathematical methods. In: 2018 2nd IEEE Conference on Energy Internet and Energy System Integration (EI2). 2018. p. 1–5. https://doi.org/10.1109/ei2.2018.8582049

[ref43] 43. Siddique AH, Tasnim S, Shahriyar F, Hasan M, Rashid K. Renewable energy sector in Bangladesh: the current scenario, challenges and the role of IoT in building a smart distribution grid. Energies. 2021;14(16):5083.

[ref44] 44. Liu Z, Han Z, Zhang Y, Zhang Q. Multiwavelet packet entropy and its application in transmission line fault recognition and classification. IEEE Trans Neural Netw Learn Syst. 2014;25(11):2043–52. pmid:25330427

[ref45] 45. Jamehbozorg A, Shahrtash SM. A decision-tree-based method for fault classification in single-circuit transmission lines. IEEE Trans Power Delivery. 2010;25(4):2190–6.

[ref46] 46. Recioui A, Benseghier B, Khalfallah H. Power system fault detection, classification and location using the K-Nearest Neighbors. In: 2015 4th International Conference on Electrical Engineering (ICEE). 2015. p. 1–6. https://doi.org/10.1109/intee.2015.7416832

[ref47] 47. Begum M, Shuvo MH, Ashraf I, Mamun AA, Uddin J, Samad MA. Software defects identification: results using machine learning and explainable artificial intelligence techniques. IEEE Access. 2023;11:132750–65.

[ref48] 48. Gandhi I, Pandey M. Hybrid ensemble of classifiers using voting. In: 2015 International Conference on Green Computing and Internet of Things (ICGCIoT). 2015. p. 399–404. https://doi.org/10.1109/icgciot.2015.7380496

[ref49] 49. Kumari S, Kumar D, Mittal M. An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier. International Journal of Cognitive Computing in Engineering. 2021;2:40–6.

[ref50] 50. Liu N, Gao H, Zhao Z, Hu Y, Duan L. A stacked generalization ensemble model for optimization and prediction of the gas well rate of penetration: a case study in Xinjiang. J Petrol Explor Prod Technol. 2021;12(6):1595–608.

[ref51] 51. Coulson S, Oakley T. Blending basics. 2001. https://doi.org/10.1515/cogl.2001.014

[ref52] 52. Prakash ES. Electrical fault detection and classification. 2021. https://www.kaggle.com/datasets/esathyaprakash/electrical-fault-detection-and-classification

[ref53] 53. Momotaz B, Dohi T. Prediction interval of cumulative number of software faults using multilayer perceptron. In: Lee R, editor. Applied Computing & Information Technology. Cham: Springer International Publishing; 2016. p. 43–58.

[ref54] 54. Rahman MM, Shiplu AI, Watanobe Y, Alam MA. RoBERTa-BiLSTM: a context-aware hybrid model for sentiment analysis. arXiv preprint 2024.

[ref55] 55. Begum M, Hasan Shuvo M, Kamal Nasir M, Hossain A, Jakir Hossain M, Ashraf I, et al. LCNN: lightweight CNN architecture for software defect feature identification using explainable AI. IEEE Access. 2024;12:55744–56.

[ref56] 56. Stefanidou-Voziki P, Corchero C, Dominguez-Garcia JL. A practical algorithm for fault classification in distribution grids. In: 2020 IEEE International Conference on Environment and Electrical Engineering and 2020 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe). 2020. p. 1–6.

[ref57] 57. Mumtaz F, Khan HH, Haider MU, Younas MB, Mohsin MY, Zeeshan M. Two-stage hybrid-filtering based fault detection & classification method for active distribution networks. In: 2022 International Conference on Emerging Trends in Electrical, Control, and Telecommunication Engineering (ETECTE). 2022. p. 1–6.

[ref58] 58. Alrifaey M, Lim WH, Ang CK, Natarajan E, Solihin MI, Juhari MRM, et al. Hybrid deep learning model for fault detection and classification of grid-connected photovoltaic system. IEEE Access. 2022;10:13852–69.

[ref59] 59. Hatata AY, Essa MA, Sedhom BE. Adaptive protection scheme for FREEDM microgrid based on convolutional neural network and gorilla troops optimization technique. IEEE Access. 2022;10:55583–601.
