
Transformer fault identification based on GWO-optimized Dual-channel M-A method

  • Ning Ji,

    Roles Conceptualization

    Affiliation Skill Training Center of State Grid Jiangsu Electric Power Co., Ltd., Suzhou, China

  • Xi Chen,

    Roles Conceptualization

    Affiliation Skill Training Center of State Grid Jiangsu Electric Power Co., Ltd., Suzhou, China

  • Xue Qin,

    Roles Methodology

    Affiliation Skill Training Center of State Grid Jiangsu Electric Power Co., Ltd., Suzhou, China

  • Wei Wei,

    Roles Investigation

    Affiliation Skill Training Center of State Grid Jiangsu Electric Power Co., Ltd., Suzhou, China

  • Chenlu Jiang,

    Roles Methodology

    Affiliation Skill Training Center of State Grid Jiangsu Electric Power Co., Ltd., Suzhou, China

  • Yifan Bo,

    Roles Validation

    Affiliation Skill Training Center of State Grid Jiangsu Electric Power Co., Ltd., Suzhou, China

  • Kai Tao

    Roles Supervision

    kai.tao@njupt.edu.cn

    Affiliation College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, China

Abstract

In order to improve the accuracy of transformer fault identification using nature-inspired algorithms, an identification method based on a GWO (Grey Wolf Optimizer)-optimized Dual-channel MLP (Multilayer Perceptron)-Attention model is proposed. First, a Dual-channel model is constructed by combining the AM (Attention Mechanism) and the MLP. Subsequently, the GWO algorithm is used to optimize the number of hidden layers and the number of nodes per hidden layer in the Dual-channel MLP-Attention model. Typical transformer faults are simulated using the DDRTS (Digital Dynamic Real-Time Simulator) system. Experiments show that the GWO-optimized method achieves an accuracy of 95.3%–96.7% in identifying transformer faults. Compared with the BP, SVM, MLP, and single-channel M-A models, the proposed method improves the accuracy by 14.1%, 9.6%, 9.3%, and 3.3% respectively. These results indicate the rationality and effectiveness of the proposed method in transformer fault identification.

1 Introduction

The transformer is a vital component in power systems [1]. Transformers often operate in remote and harsh environments and are therefore prone to faults such as insulation aging and short circuits [2–4]. Transformer faults not only affect the operation of the power system but can also lead to serious accidents [5–9]. Therefore, it is of great significance to identify transformer faults.

Nature-inspired algorithms and artificial intelligence techniques have been widely used in fault identification [10–14], for example, the Support Vector Machine [15], Random Forest [16], Multilayer Perceptron [17], and Bayesian methods [18]. Paul et al. [19] developed a gradient boosting (GB) model whose parameters are tuned by Bayesian optimization. Liao et al. [20] proposed a transformer fault diagnosis model that combines high accuracy with interpretability. Wang et al. [21] presented a TPE-XGBoost model with an identification accuracy of 89.5% under the condition of 20% missing data.

Nature-inspired algorithms are applicable in the field of power systems. However, transformer faults come in many types, such as abnormal temperature and partial discharge, and they can be coupled: a local fault may cause fluctuations in other parts, leading to the expansion of an accident. This characteristic makes traditional identification models insufficient for capturing random features and latent fault modes.

The substation recording signal contains key fault information, which can be used for fault identification. A novel transformer fault identification method based on the GWO (Grey Wolf Optimizer) and a Dual-channel MLP (Multilayer Perceptron)-Attention model is proposed in this paper. Traditional identification methods show poor diagnostic performance for complex faults. The number of hidden layers and nodes in the MLP-Attention model is optimized by the GWO algorithm. In this way, transformer faults can be quickly identified, so that equipment damage accidents can be prevented. Moreover, this method can assist in analyzing the cause of a fault, which contributes to the stable operation of the system.

2 Methodology

2.1 GWO

GWO is a nature-inspired optimization algorithm that simulates the hunting behavior of grey wolves [22]. In a grey wolf pack, there is a leading wolf (α) and several secondary leading wolves (β); the rest are ordinary wolves (δ) and bottom-rank wolves (ω). The alpha wolf represents the current best solution [23]. The process of searching for prey can be described as

D = |C · z_p(t) − z(t)|,  z(t + 1) = z_p(t) − A · D,  A = 2a · r₁ − a,  C = 2r₂ (1)

where D is the distance between the individual and the prey, A is the convergence factor, C is the oscillation factor, t is the iteration count, and z and z_p are the positions of the grey wolf and the prey respectively. a linearly decreases from 2 to 0, and r₁, r₂ ∈ (0, 1). During the search and capture of prey, instructions are given by the alpha, beta, and delta wolves. The positions of the top three wolves in terms of fitness are preserved during the iterations, and the positions of the other wolves are updated as

D_α = |C₁ · z_α − z|,  D_β = |C₂ · z_β − z|,  D_δ = |C₃ · z_δ − z| (2)

z₁ = z_α − A₁ · D_α,  z₂ = z_β − A₂ · D_β,  z₃ = z_δ − A₃ · D_δ (3)

z(t + 1) = (z₁ + z₂ + z₃)/3 (4)

where D_α, D_β, D_δ represent the distances between the wolves α, β, δ and the prey; A₁, A₂, A₃ (with C₁, C₂, C₃) are the coefficient vectors of wolves α, β, δ; and z is the position of an individual wolf. The population size and positions are initialized, and t_max denotes the maximum number of iterations. The fitness is computed from the initialized positions, so that the positions of wolves α, β, δ, and ω can be determined. Whether the maximum number of iterations has been reached is then evaluated, so that the optimal number of hidden layers and hidden-layer nodes can be obtained.
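As an illustration of the update rules in Eqs. (1)–(4) (a minimal sketch, not the authors' implementation), the following NumPy code runs GWO on a toy sphere function standing in for the network-training fitness; the population size, iteration count, and bounds are arbitrary choices:

```python
import numpy as np

def gwo_minimize(fitness, dim, bounds, n_wolves=10, t_max=50, seed=0):
    """Minimal Grey Wolf Optimizer sketch following Eqs. (1)-(4)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))      # initial positions
    for t in range(t_max):
        a = 2 - 2 * t / t_max                               # a decays linearly 2 -> 0
        order = np.argsort([fitness(w) for w in wolves])
        z_a, z_b, z_d = wolves[order[:3]]                   # alpha, beta, delta leaders
        for i in range(n_wolves):
            new_pos = np.zeros(dim)
            for leader in (z_a, z_b, z_d):
                r1, r2 = rng.random(dim), rng.random(dim)
                A = 2 * a * r1 - a                          # convergence factor
                C = 2 * r2                                  # oscillation factor
                D = np.abs(C * leader - wolves[i])          # distance terms, Eqs. (1)-(2)
                new_pos += (leader - A * D) / 3             # Eq. (4): average of z1, z2, z3
            wolves[i] = np.clip(new_pos, lo, hi)
    best = min(wolves, key=fitness)
    return best, fitness(best)

# toy usage: minimize the 2-D sphere function
best, val = gwo_minimize(lambda z: float(np.sum(z**2)), dim=2, bounds=(-5, 5))
```

Each wolf moves toward the average of the positions suggested by the α, β, and δ leaders, with the search radius shrinking as a decays.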

2.2 MLP

MLP is a widely used neural network model. Each MLP has three types of layers: ① the input layer, ② the hidden layers, and ③ the output layer [24]. Adjacent layers are fully connected, with no connections among neurons within the same layer [25]. The structure is shown in Fig 1.

The data vectors are fed into the input layer and passed to the first hidden layer. The output of hidden unit j in the first hidden layer is

a_j = f(Σ_i W_ij · x_i + b_j) (5)

where a_j is the output of hidden unit j, W_ij is the weight from the input layer to the hidden layer, x_i is the input of the input layer, and b_j is the bias of the hidden layer. For the L-th hidden layer, the output of hidden unit p is

a_p = f(Σ_{j=1}^{n} W_jp · a_j + b_p) (6)

where a_p is the output of unit p in the L-th hidden layer, n is the number of neurons in the (L−1)-th hidden layer, W_jp is the weight from the (L−1)-th to the L-th hidden layer, a_j is the output of the (L−1)-th hidden layer, and b_p is the bias of the L-th hidden layer. The output of the last hidden layer is then passed to the output layer. The output of unit k in the output layer is

y_k = f(Σ_p W_pk · a_p + b_k) (7)

where y_k is the output of output unit k, W_pk is the weight from the last hidden layer to the output layer, a_p is the output of the last hidden layer, and b_k is the bias of the output layer.
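A compact sketch of the forward pass in Eqs. (5)–(7), assuming ReLU hidden activations and hypothetical layer sizes (in the paper the layer count and sizes are chosen by GWO):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def mlp_forward(x, weights, biases):
    """Forward pass through an MLP, Eqs. (5)-(7).
    weights[l], biases[l] map layer l to layer l+1."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(W @ a + b)                    # hidden-layer activations, Eqs. (5)-(6)
    return weights[-1] @ a + biases[-1]        # output layer, Eq. (7)

rng = np.random.default_rng(0)
# 6 input features -> two hidden layers of 16 units -> 10 fault classes
sizes = [6, 16, 16, 10]
weights = [rng.standard_normal((sizes[i + 1], sizes[i])) * 0.1 for i in range(3)]
biases = [np.zeros(sizes[i + 1]) for i in range(3)]
logits = mlp_forward(rng.standard_normal(6), weights, biases)
```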

The cross-entropy loss function can be used in the MLP model to measure the difference between the output and the labels; the weights W and biases b are updated using the gradient descent algorithm. The cross-entropy loss function is defined as

L = −(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_{i,c} · log(ŷ_{i,c}) (8)

where y is the true probability distribution, y_{i,c} is an indicator variable (0 or 1) that indicates whether the i-th sample belongs to the c-th category, ŷ is the output probability distribution of the model, ŷ_{i,c} is the predicted probability that the i-th sample belongs to the c-th category, N is the number of samples, and M is the number of categories.
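Eq. (8) can be computed directly; the two samples and predicted distributions below are illustrative values only:

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Eq. (8): L = -(1/N) * sum_i sum_c y_{i,c} * log(yhat_{i,c})."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# two samples, three classes; one-hot true labels
y_true = np.array([[1, 0, 0], [0, 1, 0]])
y_pred = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]])
loss = cross_entropy(y_true, y_pred)   # -(log 0.8 + log 0.7) / 2
```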

2.3 Attention mechanism

The AM focuses on the significant components of the input data by assigning weights, so that robustness and accuracy can be improved [26,27]. The diagram of the AM is shown in Fig 2.

The weight coefficients in the AM are defined as

e_n = u · tanh(w · x_n + b),  α_n = exp(e_n) / Σ_m exp(e_m) (9)

where u and w are different weight matrices, b is the bias vector, x_n is the n-th component of the AM input X, and α_n is the corresponding weight coefficient.
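Since the exact form of Eq. (9) is not reproduced in this extraction, the sketch below assumes the common additive-attention form (tanh scoring followed by softmax normalization); the feature count and vector dimensions are hypothetical:

```python
import numpy as np

def attention_weights(X, w, b, u):
    """Eq. (9) sketch: e_n = u . tanh(w x_n + b); alpha = softmax(e)."""
    scores = np.array([u @ np.tanh(w @ x + b) for x in X])
    exp = np.exp(scores - scores.max())        # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))                # 6 features, each a 4-dim vector
w = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
u = rng.standard_normal(4)
alpha = attention_weights(X, w, b, u)
weighted = alpha[:, None] * X                  # attention-enhanced input for the MLP
```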

The M-A model is a neural network model that combines the multilayer perceptron (MLP) with the attention mechanism. In a typical MLP model, the input layer connects directly to the hidden layers, and the output is produced after processing by multiple hidden layers. In the M-A model, the input is first processed by an attention mechanism module, and the output of the attention mechanism is connected to the hidden layer. In this way, dominant features are enhanced and non-dominant features are weakened.

2.4 Dual-channel MLP attention model

The structure of the dual-channel MLP-Attention model proposed in this paper is shown in Fig 3. There are two channels: one combines the MLP and the AM, with the input layer of the MLP refined by the AM; the other is a plain MLP. The final output is the weighted result of the two channels. The output of the single-channel M-A model is

X′ = α ⊙ X (10)

Y₁ = f(w_h1 · X′ + b_h1) (11)

The output of the single-channel MLP model is

Y₂ = f(w_h2 · X + b_h2) (12)

The weighted output Y is calculated as

Y = β₁ · Y₁ + β₂ · Y₂ (13)

where α is the attention weight vector, ⊙ denotes element-wise multiplication, w_h1 and w_h2 are weight matrices, b_h1 and b_h2 are bias vectors, and β₁ and β₂ are weight coefficients.
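A minimal sketch of the two-channel combination, using tanh as a stand-in activation, placeholder attention coefficients, and channel weights β₁ = β₂ = 0.5 (all illustrative assumptions, not the authors' trained values):

```python
import numpy as np

def dual_channel_output(X, alpha, wh1, bh1, wh2, bh2, beta1=0.5, beta2=0.5):
    """Eqs. (10)-(13): attention channel + plain MLP channel, weighted sum."""
    y1 = np.tanh(wh1 @ (alpha * X) + bh1)   # channel with attention-weighted input
    y2 = np.tanh(wh2 @ X + bh2)             # plain MLP channel
    return beta1 * y1 + beta2 * y2          # weighted output Y, Eq. (13)

rng = np.random.default_rng(2)
X = rng.standard_normal(6)                  # six input features
alpha = np.full(6, 1 / 6)                   # placeholder attention coefficients
wh1, wh2 = rng.standard_normal((10, 6)), rng.standard_normal((10, 6))
bh1, bh2 = np.zeros(10), np.zeros(10)
Y = dual_channel_output(X, alpha, wh1, bh1, wh2, bh2)
```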

2.5 GWO optimization

The GWO-based optimization of the number of hidden layers and hidden-layer nodes of the dual-channel M-A model is shown in Fig 4.

  1. Construct the dual-channel M-A model. Channel 1 is the MLP; channel 2 is the MLP-Attention.
  2. Update the parameters of the dual-channel M-A model using GWO. Construct a new dual-channel M-A model and train the network.
  3. Determine whether the iteration stop condition is satisfied. If it is, output the optimal parameters.

2.6 Faults identification

The diagram of fault identification using the GWO-optimized Dual-channel M-A model is shown in Fig 5. First, multiple features were extracted. The three-phase A, B, and C voltage signals were transformed using the Fourier method, and the DC components were used as features 1–3. Then, the energies of the three-phase A, B, and C voltage signals were taken as features 4–6. All data were randomly divided into a training set and a test set at a ratio of 8:2. Finally, the parameters of the Dual-channel M-A model were optimized by the GWO algorithm to identify the faults.
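The six-feature extraction step can be sketched as follows, assuming the DC component is the zero-frequency DFT bin of the rectified signal and the energy is the sum of squared samples; the synthetic three-phase voltages (with a depressed C phase mimicking a fault) are illustrative only:

```python
import numpy as np

def extract_features(va, vb, vc):
    """Six features per sample: DC components of |V| via DFT (features 1-3)
    and signal energies (features 4-6)."""
    feats = []
    for v in (va, vb, vc):
        dc = np.abs(np.fft.fft(np.abs(v)))[0] / len(v)   # zero-frequency DFT bin
        feats.append(dc)
    for v in (va, vb, vc):
        feats.append(float(np.sum(v**2)))                # signal energy
    return np.array(feats)

t = np.linspace(0, 0.1, 100, endpoint=False)             # 100 sampling points
va = np.sin(2 * np.pi * 50 * t)
vb = np.sin(2 * np.pi * 50 * t - 2 * np.pi / 3)
vc = 0.2 * np.sin(2 * np.pi * 50 * t + 2 * np.pi / 3)    # depressed (fault-like) phase
features = extract_features(va, vb, vc)
```

The depressed C phase shows up in both its DC component (feature 3) and its energy (feature 6), which is the kind of contrast the classifier learns from.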

3 Experiment

3.1 Simulated system

The data used in the experiment were obtained from simulations on a Digital Dynamic Real-Time Simulator (DDRTS). This system can simulate the operation of substations, including bus faults, line faults, transformer faults, etc.

The virtual secondary system runs on a graphical simulation platform, and the calculated data are obtained from the DDRTS interface. The responses and actions of protective devices in actual substation operation can be simulated, and the simulation results of the virtual protection devices are displayed through a visualization interface. The data flow of the virtual digital protection simulation is shown in Fig 6.

Fig 6. Data flow of virtual digital protection simulation.

https://doi.org/10.1371/journal.pone.0312474.g006

3.2 Data set

There are a total of 1500 transformer fault data samples, divided into 10 classes: phase A ground fault, phase B ground fault, phase C ground fault, AB phase-to-phase fault, BC phase-to-phase fault, CA phase-to-phase fault, AB ground fault, BC ground fault, CA ground fault, and ABC ground fault. 1200 samples were used for training, and the remaining 300 samples were used for testing. The sample grouping is shown in Table 1.

4 Results

4.1 Fault coordinate recording

Taking the phase B ground fault as an example, the waveform records (phase A, B, and C protection voltages, the protection zero-sequence voltage, and phase A, B, and C protection currents) were extracted from the fault recording signal, as shown in Fig 7.

4.2 Feature extraction

For the A, B, and C phase voltage faults, the coordinates of 100 sampling points were extracted, giving six features in total. The absolute values of the three-phase voltage signals were processed with the Discrete Fourier Transform (DFT), and the DC components were taken as features 1–3. The energies of the three-phase voltage signals were taken as features 4–6. The fault types and partial feature data are shown in Table 2.

4.3 Optimization results

The GWO-based optimization algorithm has good convergence performance, so a good solution can be obtained in a short time. In addition, the GWO algorithm requires few parameters to be tuned, making it suitable for fault-signal processing.

Due to population-diversity issues, the GWO method carries a risk of converging to local optima. The WOA (Whale Optimization Algorithm) and PSO (Particle Swarm Optimization) methods were therefore compared with the proposed GWO-based method. The results are shown in Fig 8.

Fig 8 shows that the convergence performance of the GWO-based method is significantly better than that of the other two algorithms, demonstrating its advantage. The fitness curve is shown in Fig 9. After the 12th iteration, the fitness value reaches its minimum. The dual-channel M-A model was then designed based on the optimized number of hidden layers and nodes.

4.4 Identification performance

After optimization and training, the accuracy is shown in Fig 10. Thirty test experiments were conducted, and the accuracy ranged from 95.3% to 96.7%. The experiments show that the proposed model has good identification performance for transformer faults.

4.5 Ablation study

To validate the performance of the proposed method, an ablation study was conducted. The attention mechanisms of the two channels were removed in turn, and the identification performance is shown in Table 3. Accuracy rate (Ar), Precision (P), Recall (R), and F-Measure were used as the metrics to assess the performance of the algorithms. Accuracy rate is the ratio of the number of correctly predicted samples to the total number of samples; this indicator reflects the proportion of successful predictions made by the model. F-Measure balances Precision and Recall, compensating for the shortcomings of either indicator alone, so the performance of the model on imbalanced datasets can be evaluated. The definitions of Ar, P, R, and F-Measure are

Ar = (TP + TN) / (TP + TN + FP + FN) (14)

P = TP / (TP + FP) (15)

R = TP / (TP + FN) (16)

F = 2 · P · R / (P + R) (17)

where TP is the number of samples that are actually positive and identified as positive, FN is the number of samples that are actually positive but identified as negative, FP is the number of samples that are actually negative but identified as positive, and TN is the number of samples that are actually negative and identified as negative.
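The four metrics computed on a small illustrative label vector (binary case; the per-class multi-class version follows the same counts):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Eqs. (14)-(17): accuracy rate, precision, recall, F-measure."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    ar = (tp + tn) / (tp + tn + fp + fn)   # Eq. (14)
    p = tp / (tp + fp)                     # Eq. (15)
    r = tp / (tp + fn)                     # Eq. (16)
    f = 2 * p * r / (p + r)                # Eq. (17)
    return ar, p, r, f

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])   # one miss, one false alarm
ar, p, r, f = binary_metrics(y_true, y_pred)
```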

Table 3 shows that after removing one channel and the attention mechanism, the performance deteriorates significantly. This result proves the superiority of the proposed method in the fault diagnosis of transformer.

4.6 Algorithm comparison

To validate its superiority, the BP (Backpropagation) and SVM (Support Vector Machine) algorithms were compared with the proposed method. The confusion matrix of the Dual-channel M-A model is shown in Fig 11, and the comparison results are shown in Table 4.

Compared with the BP and SVM algorithms, the Dual-channel M-A model performs better in terms of Accuracy rate, Precision (P), Recall (R), and F-Measure. This result shows that the proposed method has superior performance in transformer fault identification.

Jin et al. proposed a BP-based transformer fault detection method with an accuracy of 92% [28]. Shan et al. presented an SSA-AdaBoost-SVM method for transformer fault detection with an identification accuracy of 91.58% [29]. Andrade Lopes et al. developed an artificial neural network-based transformer fault classification with an accuracy of 85% [30]. Compared with these reported methods, the proposed method achieves higher identification accuracy.

5 Discussion

The proposed method optimizes the number of hidden layers and hidden nodes with the GWO algorithm, so that the generalization ability of the dual-channel M-A model is enhanced. The model can adaptively adjust its parameters according to the training scenario. In addition, this research demonstrates the feasibility of optimizing the parameters of an identification model with an optimization algorithm.

Compared with traditional algorithms, the dual-channel M-A model improves identification accuracy through its two channels, but the network structure leads to high computational complexity and long running time. In future work, more efficient model structures and training algorithms will be explored to reduce the number of parameters and the runtime. At the same time, advanced feature-fusion strategies will be studied to improve generalization ability and robustness: convolutional neural networks (CNNs) with compressed sensing could reduce the computational complexity, and techniques such as Attention Feature Fusion (AFF) could improve the performance of feature fusion. Through these measures, the performance of transformer fault identification can be further improved.

6 Conclusion

Transformer faults may affect the stability of the substation and lead to safety accidents. In this research, a transformer fault identification method based on a Dual-channel MLP and an attention mechanism was proposed. The main conclusions are:

  1. With its dual channels, the proposed method can learn different features from the dataset simultaneously, which reduces the risk of overfitting. If the performance of one channel degrades, the other channel can still provide effective information; therefore, the model has good robustness.
  2. The proposed method automatically focuses on key features through the attention mechanism, thereby improving accuracy. The accuracy of the proposed method is higher than that of the traditional MLP method, making it suitable for real-time monitoring and fault diagnosis. The experiments show that the proposed method performs well in identifying transformer faults.

Supporting information

References

  1. Ali MN, Amer M, Elsisi M. Reliable IoT Paradigm With Ensemble Machine Learning for Faults Diagnosis of Power Transformers Considering Adversarial Attacks. IEEE Trans. Instrum. Meas. 2023, 72.
  2. Rashad BAE, Ibrahim DK, Gilany MI, Abdelhamid AS, Abdelfattah W. Identification of broken conductor faults in interconnected transmission systems based on discrete wavelet transform. Plos One 2024, 19(1). pmid:38215163
  3. Bjelic M, Brkovic B, Zarkovic M, Miljkovic T. Machine learning for power transformer SFRA based fault detection. Int. J. Electr. Power Energy Syst. 2024, 156.
  4. Camarena-Martinez D, Huerta-Rosales JR, Amezquita-Sanchez JP, Granados-Lieberman D, Olivares-Galvan JC, Valtierra-Rodriguez M. Variational Mode Decomposition-Based Processing for Detection of Short-Circuited Turns in Transformers Using Vibration Signals and Machine Learning. Electronics 2024, 13, 1215.
  5. Ding C, Ding Q, Wang Z, Zhou Y. Fault diagnosis of oil-immersed transformers based on the improved sparrow search algorithm optimised support vector machine. IET Electr. Power Appl. 2022, 16, 985–95.
  6. Doolgindachbaporn A, Callender G, Lewin P, Simonson E, Wilson G. Data Driven Transformer Thermal Model for Condition Monitoring. IEEE Trans. Power Delivery 2022, 37, 3133–41.
  7. Kumar A, Bhalja BR, Kumbhar GB. Approach for Identification of Inter-Turn Fault Location in Transformer Windings Using Sweep Frequency Response Analysis. IEEE Trans. Power Delivery 2022, 37, 1539–48.
  8. Tao K, Wang Q, Wang H, Liu T, Yue D, Wang L. An automatic fuzzy monitoring method of underground rock moisture permeation damage based on MAE fractal. Measurement 2022, 205, 112181.
  9. Ren G, Zha X, Jiang B, Hu X, Xu J, Tao K. Location of Multiple Types of Faults in Active Distribution Networks Considering Synchronization of Power Supply Area Data. Appl. Sci.-Basel 2022, 12, 10024.
  10. Li C, Chen J, Yang C, Yang J, Liu Z, Davari P. Convolutional Neural Network-Based Transformer Fault Diagnosis Using Vibration Signals. Sensors 2023, 23.
  11. Rajesh KNVPS, Rao UM, Fofana I, Rozga P, Paramane A. Influence of Data Balancing on Transformer DGA Fault Classification With Machine Learning Algorithms. IEEE Trans. Dielectr. Electr. Insul. 2023, 30, 385–92.
  12. Tao K, Wang Q, Yao Z, Jiang B, Yue D. Underground Sedimentary Rock Moisture Permeation Damage Assessment Based on AE Mutual Information. IEEE Trans. Instrum. Meas. 2022, 72, 1–11.
  13. Xing Z, He Y. Multi-modal information analysis for fault diagnosis with time-series data from power transformer. Int. J. Electr. Power Energy Syst. 2023, 144.
  14. Chen H, Han C, Zhang Y, Ma Z, Zhang H, Yuan Z. Investigation on the fault monitoring of high-voltage circuit breaker using improved deep learning. Plos One 2023, 18(12). pmid:38039313
  15. Wang J, Zhao Z, Zhu J, Li X, Dong F, Wan S. Improved Support Vector Machine for Voiceprint Diagnosis of Typical Faults in Power Transformers. Machines 2023, 11.
  16. Prasojo RA, Putra MAA, Ekojono, Apriyani ME, Rahmanto AN, Ghoneim SSM, et al. Precise transformer fault diagnosis via random forest model enhanced by synthetic minority over-sampling technique. Electr. Power Syst. Res. 2023, 220.
  17. Souahlia S, Bacha K, Chaari A. MLP neural network-based decision for power transformers fault diagnosis using an improved combination of Rogers and Doernenburg ratios DGA. Int. J. Electr. Power Energy Syst. 2012, 43, 1346–53.
  18. Tao K, Chen G, Wang Q, Yue D. Ultrasonic Curved Coordinate Transform-RAPID With Bayesian Method for the Damage Localization of Pipeline. IEEE Trans. Ind. Electron. 2024.
  19. Paul D, Goswarmi AK, Chetri RL, Roy R, Sen P. Bayesian Optimization-Based Gradient Boosting Method of Fault Detection in Oil-Immersed Transformer and Reactors. IEEE Trans. Ind. Appl. 2022, 58, 1910–9.
  20. Liao W, Zhang Y, Cao D, Ishizaki T, Yang Z, Yang D. Explainable Fault Diagnosis of Oil-Immersed Transformers: A Glass-Box Model. IEEE Trans. Instrum. Meas. 2024, 73.
  21. Wang T, Li Q, Yang J, Xie T, Wu P, Liang J. Transformer Fault Diagnosis Method Based on Incomplete Data and TPE-XGBoost. Appl. Sci.-Basel 2023, 13, 7539.
  22. Mirjalili S, Mirjalili SM, Lewis A. Grey Wolf Optimizer. Adv. Eng. Software 2014, 69, 46–61.
  23. Zheng W, Jiang J, Tao K. A method based on musical-staff-inspired signal processing model for measuring rock moisture content. Measurement 2018, 125, 577–85.
  24. Liu J, Yao C, Yu L, Dong S, Liu Y. Using MLP to locate transformer winding fault based on digital twin. Front. Energy Res. 2023, 11.
  25. Tao K, Zheng W. Real-time damage assessment of hydrous sandstone based on synergism of AE-CT techniques. Eng. Fail. Anal. 2018, 91, 465–80.
  26. Brauwers G, Frasincar F. A General Survey on Attention Mechanisms in Deep Learning. IEEE Trans. Knowl. Data Eng. 2023, 35, 3279–98.
  27. Tao K, Wang Q, Yue D. Data compression and damage evaluation of underground pipeline with musicalized sonar GMM. IEEE Trans. Ind. Electron. 2023, 71, 3093–102.
  28. Jin Y, Wu H, Zheng J, Zhang J, Liu Z. Power Transformer Fault Diagnosis Based on Improved BP Neural Network. Electronics 2023, 12(16), 3526.
  29. Shan Y, Duan J, Fu H, Zhao J. Transformer Fault Diagnosis Based on SSA-AdaBoost-SVM. Control Engineering of China 2022, 29(2), 280–6.
  30. Andrade Lopes SM, Flauzino RA. A Novel Approach for Incipient Fault Diagnosis in Power Transformers by Artificial Neural Networks. 11th IEEE-PES Innovative Smart Grid Technologies Europe 2021, 1–5.