A process-guided uncertainty-aware deep learning framework for reliable and interpretable industrial fault diagnosis

Babar Hayat; Shabeer Ahmad; Muhammad Asfandyar Shahid; Adil Khan; Md. Rajibul Islam; Md Shohel Sayeed; Yasir Ullah

doi:10.1371/journal.pone.0349385

Abstract

Timely fault detection is essential for safety, product quality, and energy efficiency in advanced industrial processes. However, many existing fault diagnosis methods insufficiently exploit process structure and sensor reliability, which limits their robustness and practical usefulness for process engineers. This study presents an improved framework, SAU-PGA-CNN-BiLSTM. First, it couples Convolutional Neural Networks and Bidirectional Long Short-Term Memory layers to extract the multivariate temporal dynamics and spatial correlations of the process data. Second, it introduces a process-guided and sensor-aware attention mechanism that embeds process centrality, sequence-level sensor reliability, and uncertainty into attention learning, suppressing unreliable channels and biasing the model toward informative and stable sensors. In addition, Monte Carlo dropout with sensor prior-conditioning is used to provide calibrated confidence estimates that reflect both predictive uncertainty and sensor reliability. Finally, two lightweight sigmoid output heads perform fault detection and diagnosis jointly, promoting mutual reinforcement between the tasks. Validated on the Tennessee Eastman Process benchmark, the proposed framework outperforms baseline models and achieves 93.6% multiclass diagnosis accuracy with a 94.0% F1 score. After temperature scaling, the proposed model also demonstrates improved calibration compared with an otherwise identical model without sensor awareness, reducing negative log-likelihood from 0.197 to 0.182, Brier score from 0.101 to 0.095, and expected calibration error from 0.040 to 0.037. Attention visualizations further show that the model focuses on process-relevant and reliable sensors, supporting reliable industrial fault diagnosis.

Citation: Hayat B, Ahmad S, Shahid MA, Khan A, Islam MR, Sayeed MS, et al. (2026) A process-guided uncertainty-aware deep learning framework for reliable and interpretable industrial fault diagnosis. PLoS One 21(6): e0349385. https://doi.org/10.1371/journal.pone.0349385

Editor: Muhammad Shahid Anwar, Gachon University, KOREA, REPUBLIC OF

Received: December 22, 2025; Accepted: April 29, 2026; Published: June 2, 2026

Copyright: © 2026 Hayat et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Detecting and diagnosing abnormal events quickly in large industrial processes is crucial for safety, product quality, and energy efficiency [1]. As modern industrial processes deploy increasingly dense sensor networks, they generate and store huge amounts of process data every day, creating a unique opportunity for intelligent monitoring [2]. However, the complex, nonlinear, and constantly changing nature of industrial systems makes fault detection and diagnosis (FDD) challenging [3]. Faults that are missed or misdiagnosed may propagate rapidly, resulting in damaged equipment, environmental risks, and significant financial loss [4]. Process monitoring methods that have been used extensively for FDD in industrial systems include Principal Component Analysis (PCA), Partial Least Squares (PLS), and Multivariate Statistical Process Control (MSPC) [5–7]. Data-driven techniques, including multivariate statistical process monitoring and machine learning methods, have gained substantial attention due to their flexibility and reduced dependence on explicit process models. However, classical methods such as PCA, PLS, and shallow classifiers typically assume linear relationships and lack the capacity to capture complex temporal dependencies and fault propagation patterns [8,9]. Moreover, such methods offer little practical insight into the underlying causes of failures, which limits their usefulness for operator intervention and troubleshooting [3]. The recent surge in deep learning has produced advanced data-driven models for monitoring and diagnosing industrial processes. Convolutional Neural Networks (CNNs) are adept at capturing local spatial correlations among sensors, whereas Recurrent Neural Networks (RNNs), especially Long Short-Term Memory (LSTM) units, are adept at capturing temporal dependencies [10,11]. Hybrid models that combine CNNs with LSTMs or BiLSTMs have shown notable improvements in accuracy on benchmark datasets such as the Tennessee Eastman Process (TEP). Nevertheless, more powerful deep learning models are also costly to train and run, which makes them impractical in industrial contexts with tight time, latency, and hardware constraints [12,13].

More importantly, the vast majority of deep neural networks remain black boxes that offer little interpretability to process engineers [14]. This lack of transparency makes automated monitoring systems difficult to deploy, because operators need more than alerts: they want to know which variables and time periods are behind the observed anomalies [15]. Attention mechanisms have been proposed as a promising way to provide interpretability, i.e., to make the internal workings of neural networks more transparent by assigning importance weights to input elements and time steps [16]. However, process monitoring models that use attention are often based on multi-head attention or transformer designs, which add complexity and computational cost [17]. Beyond predictive accuracy and interpretability, practical industrial FDD systems must also be able to quantify and communicate prediction confidence [18]. In dynamic and noise-dominated industrial environments, knowing whether a model is uncertain about an alarm is critical, as this directly influences risk-aware operational decisions and operator intervention strategies [19]. Although uncertainty quantification techniques such as Monte Carlo dropout enable deep learning models to provide confidence estimates alongside their predictions [20,21], such capabilities remain largely underexplored in the current data-driven process monitoring and diagnosis literature. Consequently, there is a growing need for FDD frameworks that jointly deliver accurate fault predictions, reliable uncertainty-aware outputs, and interpretable diagnostic explanations, while remaining computationally efficient for real-time industrial deployment.

To overcome these issues, we propose a sensor-aware uncertainty and process-guided attention CNN-BiLSTM (SAU-PGA-CNN-BiLSTM) framework designed for accurate and interpretable fault detection and diagnosis in complex industrial settings. A process-guided attention mechanism dynamically highlights fault-relevant sensor–time patterns, producing interpretable attention heatmaps that are consistent with known process behavior and facilitate faster root-cause analysis. Furthermore, the combination of MC dropout–based uncertainty estimation and post-hoc temperature scaling yields well-calibrated confidence scores, enabling confidence-aware alarm handling and selective decision deferral in safety-critical settings.

2. Related work

Traditional methods for process monitoring and fault detection are based on statistical process control and latent variable models. Principal Component Analysis (PCA) and its dynamic variants [22] have been widely used to extract the dominant structure from complex sensor data, making it easier to spot when a process is behaving abnormally. The Squared Prediction Error (SPE) and Hotelling’s T² statistics are commonly used for process monitoring [23], but because they rely on linear assumptions, they struggle with processes that behave nonlinearly. Kernel PCA [24] and Independent Component Analysis (ICA) [25] offer some advantages, but in many cases they require more computing power and more careful parameter tuning. Multivariate statistical process control (MSPC) methods such as Partial Least Squares (PLS) and Canonical Correlation Analysis (CCA) are more flexible in modeling the relationships between inputs and outputs [7,26]. However, they still have difficulty providing time-resolved detail or identifying root causes, especially in complicated or nonlinear systems. Data-driven machine learning approaches have also been investigated to support FDD [27,28], including support vector machines (SVMs), random forests, and gradient boosting, yet such approaches require manually designed features and do not account for the temporal relationships needed to analyze fault progression.

Deep learning has greatly improved fault detection and diagnosis (FDD) in recent years. Convolutional Neural Networks (CNNs) have shown great ability to capture spatial relationships across sensor networks [29]. By applying convolutional filters to input data, CNNs can learn local fault patterns that global linear methods often miss. However, CNNs by themselves struggle to handle temporal dependencies. Recurrent Neural Networks (RNNs), especially LSTM networks, are often used to model temporal dynamics in process data [30]. LSTMs mitigate the vanishing gradient problem found in basic RNNs and work well for tracking trends and detecting anomalies over long sequences [31]. Combining CNNs and LSTMs into hybrid models takes advantage of both, capturing spatial and temporal patterns at the same time [32]. For example, one study found that a hybrid model improved fault detection and diagnosis on the TEP dataset compared to using either model alone [33]. Bidirectional LSTMs (BiLSTMs) improve temporal modeling by processing input sequences both forward and backward, which helps in detecting subtle or long-range fault patterns [34]. Despite these improvements, most existing CNN–RNN-based methods remain purely data-driven, offering limited interpretability and providing only point predictions without explicit measures of predictive confidence.

Attention mechanisms have become popular for making deep neural networks easier to understand in areas such as natural language processing and time-series analysis [35]. In industrial fault detection and diagnosis (FDD), methods such as additive attention and scaled dot-product attention are used to identify the most important variables and time points [36]. By assigning dynamic weights to sensor–time pairs, attention layers help reveal the relationships behind process faults. Recent studies have combined attention mechanisms with CNN–LSTM architectures to boost both accuracy and interpretability [37]. However, many of these models rely on multi-head or transformer-based attention [38], which greatly increases complexity and resource use. Moreover, existing attention-based FDD methods generally learn attention weights solely from data, without incorporating process knowledge or sensor reliability information. As a result, attention maps may overemphasize noisy or unstable sensors, reducing robustness and potentially misleading operators during fault analysis.

Beyond accuracy and interpretability, reliable industrial deployment requires models to express prediction uncertainty, especially in safety-critical environments. Uncertainty-aware learning has been studied in broader machine learning contexts using Bayesian neural networks, ensemble methods, and Monte Carlo (MC) dropout [39]. In industrial FDD, uncertainty estimates can support risk-aware alarm handling, selective prediction, and operator decision-making. Most deep learning models for FDD, however, give only point estimates and do not indicate how confident or reliable their predictions are. In high-risk industrial environments [40], it is important to distinguish alarms that are highly confident from those that are less certain, in order to manage risk better and use resources more efficiently. Monte Carlo dropout [41] has become a practical way to approximate Bayesian inference in neural networks. By keeping dropout active during inference and averaging several random forward passes, the model can estimate both the average prediction and its uncertainty. Furthermore, post-hoc calibration techniques such as temperature scaling, which are effective in improving probabilistic reliability [42], are rarely integrated into industrial FDD pipelines.

Overall, prior research has achieved notable progress in fault detection and diagnosis using deep learning and attention mechanisms. However, critical gaps remain: limited integration of process knowledge into attention mechanisms, a lack of sensor-aware uncertainty handling, and insufficient focus on probabilistic calibration and risk-aware decision support. To bridge these gaps, this paper proposes a Sensor-Aware Uncertainty and Process-Guided Attention CNN–BiLSTM (SAU-PGA-CNN-BiLSTM) framework that jointly addresses accuracy, interpretability, and reliability. By embedding process topology and sensor reliability into the attention mechanism and explicitly modeling predictive uncertainty through MC dropout and temperature scaling, the proposed approach advances the state of the art toward deployable, trustworthy industrial FDD systems. Our contributions are summarized as follows:

A unified SAU-PGA-CNN-BiLSTM framework is proposed that jointly performs multiclass FDD while explicitly quantifying predictive uncertainty through MC dropout. Unlike conventional pipelines that treat diagnosis and uncertainty estimation as separate stages, the proposed approach delivers fault predictions and calibrated confidence estimates within a single end-to-end inference process.
A novel attention design is introduced that integrates sensor-aware uncertainty weighting with process-guided priors, allowing the model to dynamically emphasize reliable and process-critical sensors while down-weighting noisy or weakly informative channels. This mechanism embeds domain knowledge directly into the learning process, yielding attention patterns that are physically meaningful and consistent with known fault propagation behavior.
The proposed framework combines convolutional layers to capture inter-sensor correlations and local temporal patterns with stacked BiLSTM layers to model long-range fault evolution. This architectural synergy enables accurate diagnosis of complex, distributed faults while maintaining moderate computational complexity suitable for online industrial monitoring.
Extensive experiments on the Tennessee Eastman Process dataset demonstrate that the proposed framework achieves high fault detection and diagnosis accuracy, improved calibration and uncertainty behavior, and clear, interpretable attention visualizations. The model operates faster than typical industrial sampling intervals, supporting its practical applicability in real-time, safety-critical industrial environments.

3. Proposed network design

The proposed SAU-PGA-CNN-BiLSTM network integrates four well-studied deep-learning blocks, each tailored to a specific function in real-time industrial fault analysis.

3.1. One-dimensional convolutional neural network

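One-dimensional CNNs slide convolutional filters across sequential data, capturing local feature interactions between sensor readings [43]. In this study, the CNN layers function as spatial feature extractors, automatically detecting spatially correlated sensor behaviors that indicate possible fault conditions. The input vector of sensor measurements at time step t can be written as:

$$\mathbf{x}_t = [x_{t,1}, x_{t,2}, \ldots, x_{t,F}]^{\top} \in \mathbb{R}^{F} \tag{1}$$

Each convolutional kernel calculates a feature map as follows:

$$z_t^{(k)} = \phi\left(\mathbf{w}^{(k)} * \mathbf{x}_t + b^{(k)}\right) \tag{2}$$

where $\phi(\cdot)$ denotes the ReLU activation function, $*$ is the convolution operator, and $\mathbf{w}^{(k)}$ and $b^{(k)}$ are the weights and bias of the k-th kernel. Two convolution layers are stacked to produce the spatially encoded features. With $\mathbf{z}_t$ collecting the outputs of all C kernels at time step t, the resulting tensor can be written as:

$$\mathbf{Z} = [\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_T]^{\top} \in \mathbb{R}^{T \times C} \tag{3}$$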
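The convolution-plus-ReLU step described in this subsection can be illustrated with a small NumPy sketch. It is not the authors' implementation: the kernel width (3), the channel counts (16 and 32), and the use of unpadded ("valid") convolution are assumptions made only for illustration, while the window length of 50 and the 52 sensor channels follow values quoted later in the paper.

```python
import numpy as np

def conv1d_relu(x, weight, bias):
    """Valid 1-D convolution along the time axis followed by ReLU.

    x:      (T, F) window with T time steps and F input channels
    weight: (K, F, C) kernels of width K mapping F inputs to C feature maps
    bias:   (C,)
    returns (T - K + 1, C)
    """
    T, F = x.shape
    K, F_w, C = weight.shape
    assert F == F_w, "kernel and input must agree on the channel dimension"
    out = np.empty((T - K + 1, C))
    for t in range(T - K + 1):
        patch = x[t:t + K]  # (K, F) slice of consecutive time steps
        # Contract over both the time offset and the channel axis.
        out[t] = np.tensordot(patch, weight, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)  # ReLU

rng = np.random.default_rng(0)
x = rng.standard_normal((50, 52))            # one window: 50 steps, 52 sensors
w1 = 0.1 * rng.standard_normal((3, 52, 16))  # hypothetical first layer
w2 = 0.1 * rng.standard_normal((3, 16, 32))  # hypothetical second layer
h1 = conv1d_relu(x, w1, np.zeros(16))
h2 = conv1d_relu(h1, w2, np.zeros(32))       # two stacked layers
```

Each unpadded layer shortens the sequence by K - 1 steps; a framework implementation would typically use padding to keep the length fixed.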

3.2. Bidirectional long short-term memory networks

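Bidirectional LSTM (BiLSTM) networks process sequences in both the forward and reverse temporal directions, incorporating rich context into feature representations [44]. The BiLSTM layers serve as a temporal encoder, capturing complicated temporal dependencies and discriminating between transient fluctuations and permanent fault deviations. For each direction (forward and backward), given the input feature vector $\mathbf{z}_t$, the LSTM hidden state $\mathbf{h}_t$ and cell state $\mathbf{c}_t$ at time t are updated as follows:

$$\mathbf{f}_t = \sigma(\mathbf{W}_f \mathbf{z}_t + \mathbf{U}_f \mathbf{h}_{t-1} + \mathbf{b}_f) \tag{4}$$

$$\mathbf{i}_t = \sigma(\mathbf{W}_i \mathbf{z}_t + \mathbf{U}_i \mathbf{h}_{t-1} + \mathbf{b}_i) \tag{5}$$

$$\mathbf{o}_t = \sigma(\mathbf{W}_o \mathbf{z}_t + \mathbf{U}_o \mathbf{h}_{t-1} + \mathbf{b}_o) \tag{6}$$

$$\tilde{\mathbf{c}}_t = \tanh(\mathbf{W}_c \mathbf{z}_t + \mathbf{U}_c \mathbf{h}_{t-1} + \mathbf{b}_c) \tag{7}$$

$$\mathbf{c}_t = \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tilde{\mathbf{c}}_t \tag{8}$$

$$\mathbf{h}_t = \mathbf{o}_t \odot \tanh(\mathbf{c}_t) \tag{9}$$

where $\sigma(\cdot)$ is the logistic sigmoid and $\odot$ denotes element-wise multiplication. The forward and backward hidden states are then combined as:

$$\mathbf{h}_t = [\overrightarrow{\mathbf{h}}_t ; \overleftarrow{\mathbf{h}}_t] \tag{10}$$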
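To make the bidirectional update concrete, the following NumPy sketch runs a standard LSTM cell forward and backward over a sequence and concatenates the two hidden states per time step. The gate ordering, the parameter packing into single matrices, and the toy dimensions are assumptions made for illustration; the authors' actual layer sizes are not given in this section.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(x, W, U, b, reverse=False):
    """Run one LSTM direction over x of shape (T, D); return states (T, H).

    W: (4H, D), U: (4H, H), b: (4H,), packed in gate order [i, f, o, g].
    """
    T = x.shape[0]
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    out = np.zeros((T, H))
    steps = range(T - 1, -1, -1) if reverse else range(T)
    for t in steps:
        z = W @ x[t] + U @ h + b
        i = sigmoid(z[:H])           # input gate
        f = sigmoid(z[H:2 * H])      # forget gate
        o = sigmoid(z[2 * H:3 * H])  # output gate
        g = np.tanh(z[3 * H:])       # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
        out[t] = h                   # stored at its original time index
    return out

def bilstm(x, fwd_params, bwd_params):
    """Concatenate forward and backward hidden states: (T, 2H)."""
    forward = lstm_pass(x, *fwd_params)
    backward = lstm_pass(x, *bwd_params, reverse=True)
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
T, D, H = 10, 6, 4  # toy sizes, chosen only for the example

def init_params():
    return (0.3 * rng.standard_normal((4 * H, D)),
            0.3 * rng.standard_normal((4 * H, H)),
            np.zeros(4 * H))

fwd, bwd = init_params(), init_params()
x = rng.standard_normal((T, D))
states = bilstm(x, fwd, bwd)
```

Because the backward pass starts from the end of the window, its state at the final time step depends only on the last input, whereas the forward state at that step has seen the whole window.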

3.3. Attention mechanism

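The additive attention method assigns dynamic relevance weights to different time steps, allowing the model to select the most relevant segments of the input sequence [45]. Attention serves as the interpretability module, giving transparent representations of the decision-making process through attention-weighted saliency maps that aid operator comprehension. Attention alignment scores and weights are calculated as:

$$e_t = \mathbf{v}^{\top} \tanh(\mathbf{W}_a \mathbf{h}_t + \mathbf{b}_a), \qquad \alpha_t = \frac{\exp(e_t)}{\sum_{\tau=1}^{T} \exp(e_\tau)} \tag{11}$$

where $\mathbf{W}_a$, $\mathbf{b}_a$, and $\mathbf{v}$ are learnable parameters. The aggregated context vector is obtained by:

$$\mathbf{c} = \sum_{t=1}^{T} \alpha_t \mathbf{h}_t \tag{12}$$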
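The score, normalize, and aggregate steps of additive attention can be sketched in a few lines of NumPy. The parameter names `W`, `b`, `v` and the toy sizes are assumptions for this example; the scoring function is the standard additive (tanh) form.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def additive_attention(h, W, b, v):
    """Additive attention over time steps.

    h: (T, H) encoder states; W: (A, H); b: (A,); v: (A,)
    returns (alpha, context) with alpha: (T,), context: (H,)
    """
    scores = np.tanh(h @ W.T + b) @ v  # one alignment score per time step
    alpha = softmax(scores)            # weights sum to 1 over time
    context = alpha @ h                # attention-weighted sum of the states
    return alpha, context

rng = np.random.default_rng(0)
h = rng.standard_normal((50, 8))
W = rng.standard_normal((16, 8))
b = np.zeros(16)
v = rng.standard_normal(16)
alpha, context = additive_attention(h, W, b, v)
```

Plotting `alpha` against time gives the kind of saliency profile the subsection refers to.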

3.4. Monte Carlo dropout for uncertainty estimation

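Monte Carlo dropout is a useful technique for assessing predictive uncertainty. It involves running multiple stochastic forward passes with dropout layers kept active during inference [46]. This gives the model a confidence estimation mechanism that distinguishes high-confidence predictions suitable for automated action from low-confidence predictions that call for operator review. With dropout rate $p_{\mathrm{drop}}$, predictions from $N_{\mathrm{MC}}$ stochastic forward passes yield a set of probabilities $\{\hat{\mathbf{p}}^{(n)}\}_{n=1}^{N_{\mathrm{MC}}}$. The predictive mean and variance are:

$$\bar{\mathbf{p}} = \frac{1}{N_{\mathrm{MC}}} \sum_{n=1}^{N_{\mathrm{MC}}} \hat{\mathbf{p}}^{(n)}, \qquad \hat{\boldsymbol{\sigma}}^2 = \frac{1}{N_{\mathrm{MC}}} \sum_{n=1}^{N_{\mathrm{MC}}} \left(\hat{\mathbf{p}}^{(n)} - \bar{\mathbf{p}}\right)^2 \tag{13}$$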

A high predictive variance indicates low confidence, enabling a risk-aware decision-making strategy. This combination results in a robust, interpretable, and uncertainty-aware system specifically intended for effective real-time fault detection and diagnosis in a complex industrial environment.
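A minimal sketch of the Monte Carlo dropout procedure is given below. It uses a single linear softmax layer as a stand-in for the network, which is an assumption made to keep the example self-contained; the dropout rate of 0.2 is also illustrative, while the 20 passes follow the value quoted later in the evaluation section.

```python
import numpy as np

def mc_dropout_predict(features, W, b, p_drop=0.2, n_mc=20, seed=0):
    """Mean and variance of class probabilities over stochastic passes.

    features: (D,) input to a toy linear classifier (stand-in for the network)
    W: (K, D), b: (K,)
    """
    rng = np.random.default_rng(seed)
    probs = np.empty((n_mc, W.shape[0]))
    for n in range(n_mc):
        keep = rng.random(features.shape) >= p_drop  # keep with prob 1 - p_drop
        dropped = features * keep / (1.0 - p_drop)   # inverted dropout scaling
        logits = W @ dropped + b
        e = np.exp(logits - logits.max())
        probs[n] = e / e.sum()
    # Population variance across passes, per class.
    return probs.mean(axis=0), probs.var(axis=0)

rng = np.random.default_rng(1)
features = rng.standard_normal(32)
W = 0.5 * rng.standard_normal((21, 32))  # 21 classes: normal + 20 faults
b = np.zeros(21)
mean_p, var_p = mc_dropout_predict(features, W, b)
```

A deployment could, for instance, defer a decision to the operator whenever the variance of the predicted class exceeds a threshold chosen on validation data.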

4. Methodology

4.1. Proposed framework overview

This study proposes a hybrid deep learning framework called SAU-PGA-CNN-BiLSTM for robust fault detection and diagnosis in complex industrial processes. The framework is designed to jointly address three key challenges: modeling the complex multivariate temporal dynamics and spatial correlations of the process data, accounting for heterogeneous sensor reliability, and providing calibrated confidence estimates alongside diagnostic decisions. The model integrates convolutional layers and bidirectional LSTM layers, which extract spatial correlations and temporal patterns from the data, with a process-guided, sensor-aware attention mechanism within a unified framework. Using time-series industrial sensor data, the framework performs fault detection and multiclass diagnosis while simultaneously estimating predictive uncertainty through Monte Carlo dropout and post-hoc calibration. An overview of the proposed methodology is shown in Fig 1.

Fig 1. Proposed SAU-PGA-CNN-BiLSTM framework for industrial fault detection and diagnosis.

https://doi.org/10.1371/journal.pone.0349385.g001

4.2. Framework architecture

The proposed SAU-PGA-CNN-BiLSTM framework processes multivariate sensor time-series data $\mathbf{X} \in \mathbb{R}^{T \times F}$, where T is the sequence length and F is the number of features at each time step. It uses a hierarchical approach that jointly captures spatial and temporal relationships while providing interpretable and uncertainty-aware predictions. First, one-dimensional convolutional layers extract spatial feature maps that summarize local interactions among variables. These features are then reshaped to serve as the input sequence for the subsequent temporal modeling stage. To capture long-range temporal dependencies and fault propagation behavior, the reshaped features are passed into a bidirectional long short-term memory (BiLSTM) network. The BiLSTM processes the sequence in both forward and backward directions, enabling the model to learn temporal dependencies related to fault onset and evolution. For each time step t, the hidden state representation is given by:

$$\mathbf{h}_t = [\overrightarrow{\mathbf{h}}_t ; \overleftarrow{\mathbf{h}}_t] \tag{14}$$

where $\overrightarrow{\mathbf{h}}_t$ and $\overleftarrow{\mathbf{h}}_t$ denote the forward and backward hidden states, respectively. An additive attention mechanism is then employed to identify the most informative time steps and sensor-derived features for fault diagnosis. Unlike conventional attention mechanisms, the attention logits in this framework are modulated by process-guided and sensor-aware priors, including process centrality, sensor reliability, and sensor uncertainty. These priors bias the attention weights toward reliable and process-consistent measurements while suppressing noisy or unreliable sensor data. The normalized attention weights are computed and used to aggregate the BiLSTM outputs into a context vector that highlights the temporal importance of each h_t:

(15)
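The exact functional form of the prior-modulated attention is not recoverable from the text above, so the following NumPy sketch shows only one plausible way to bias attention logits with sensor priors. Everything specific here is an assumption: the linear combination of centrality, reliability, and uncertainty into a per-sensor score, the mapping of that score to a per-time-step bias through the magnitude of the standardized readings, and the names `lam` and `strength`.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def prior_biased_attention(scores, x, centrality, reliability, uncertainty,
                           lam=(1.0, 1.0, 1.0), strength=1.0):
    """Bias data-driven attention logits with per-sensor priors (hypothetical).

    scores: (T,) data-driven attention logits
    x:      (T, F) standardized sensor window
    centrality, reliability, uncertainty: (F,) per-sensor priors
    """
    lc, lr, lu = lam
    # Central and reliable sensors score high; uncertain sensors score low.
    sensor_score = lc * centrality + lr * reliability - lu * uncertainty
    sensor_weight = softmax(sensor_score)  # (F,) trust distribution
    # Time steps whose activity sits on trusted sensors get a larger bias.
    step_bias = np.abs(x) @ sensor_weight  # (T,)
    alpha = softmax(scores + strength * step_bias)
    return alpha, sensor_weight

# Two sensors with the same centrality; sensor 0 is reliable, sensor 1 is noisy.
x = np.zeros((8, 2))
x[2, 0] = 3.0  # deviation on the reliable sensor at t = 2
x[5, 1] = 3.0  # equally large deviation on the noisy sensor at t = 5
alpha, sensor_weight = prior_biased_attention(
    scores=np.zeros(8), x=x,
    centrality=np.array([0.5, 0.5]),
    reliability=np.array([0.9, 0.1]),
    uncertainty=np.array([0.1, 0.9]),
)
```

In this toy case the two deviations have identical magnitude, yet the step driven by the reliable sensor receives the larger attention weight, which is the qualitative behavior the text describes.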

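The resulting context vector c captures the most relevant spatiotemporal features for fault detection and diagnosis in a physically interpretable manner. The context vector is shared by two output heads to enable joint learning of fault detection and fault diagnosis. A sigmoid-activated output layer performs binary fault detection, which can be written as:

$$\hat{y}_{\mathrm{det}} = \sigma\left(\mathbf{w}_{\mathrm{det}}^{\top} \mathbf{c} + b_{\mathrm{det}}\right) \tag{16}$$

and a SoftMax layer performs multi-class fault diagnosis:

$$\hat{\mathbf{y}}_{\mathrm{diag}} = \mathrm{softmax}\left(\mathbf{W}_{\mathrm{diag}} \mathbf{c} + \mathbf{b}_{\mathrm{diag}}\right) \tag{17}$$

where $\sigma(\cdot)$ is the logistic sigmoid and $\mathbf{w}_{\mathrm{det}}$, $b_{\mathrm{det}}$, $\mathbf{W}_{\mathrm{diag}}$, and $\mathbf{b}_{\mathrm{diag}}$ are the learnable parameters of the two heads.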
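The two heads that share the context vector can be sketched as follows. The parameter names and the context dimension of 8 are assumptions for the example; the 21 diagnosis classes correspond to the fault-free state plus the 20 TEP fault types.

```python
import numpy as np

def detection_head(c, w, b):
    """Binary fault probability from the shared context vector c."""
    return 1.0 / (1.0 + np.exp(-(w @ c + b)))

def diagnosis_head(c, W, b):
    """Class probabilities over operating modes from the same vector c."""
    z = W @ c + b
    e = np.exp(z - z.max())  # stabilized softmax
    return e / e.sum()

rng = np.random.default_rng(0)
c = rng.standard_normal(8)                    # shared context vector
p_fault = detection_head(c, rng.standard_normal(8), 0.0)
p_class = diagnosis_head(c, rng.standard_normal((21, 8)), np.zeros(21))
```

Because both heads read the same vector, gradients from detection and diagnosis both shape the shared encoder during training.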

This shared-representation strategy improves learning efficiency and ensures consistency between detection and diagnosis tasks. To enhance reliability in safety-critical industrial settings, MC dropout is employed during inference to approximate Bayesian uncertainty. Multiple stochastic forward passes generate predictive distributions rather than point estimates, allowing computation of predictive means and variances for both detection and diagnosis outputs. This uncertainty information provides confidence measures that support risk-aware decision-making and operator intervention. Intuitively, sensor-aware uncertainty reflects both the model’s predictive confidence and the reliability of the underlying sensor signals. Predictions supported by stable and consistent sensors result in lower uncertainty, whereas noisy or unreliable sensor inputs lead to higher uncertainty. This allows operators to distinguish between confident and potentially unreliable decisions, thereby supporting safer and more informed process monitoring.

To further improve probability calibration, post-hoc temperature scaling is applied following standard practice. The temperature parameter is fitted exclusively on the validation split, which is strictly separated from both training and test data to avoid information leakage. A single global temperature is learned for the multiclass fault diagnosis output and shared across all fault classes. At test time, logits are first averaged across the MC-dropout forward passes, after which temperature scaling is applied only to the aggregated logits before computing calibrated SoftMax probabilities.

In summary, the proposed SAU-PGA-CNN-BiLSTM framework integrates spatiotemporal feature extraction, process-guided sensor-aware attention, and uncertainty-aware calibration within a unified architecture. The CNN and BiLSTM layers capture spatial correlations and temporal dynamics, while the process-guided attention mechanism incorporates process centrality, sensor reliability, and uncertainty priors to emphasize informative and stable sensor signals. The model jointly performs fault detection and diagnosis using shared representations, ensuring consistency and efficiency. Furthermore, Monte Carlo dropout combined with temperature scaling provides calibrated confidence estimates for risk-aware decision-making. This integrated design enables accurate and reliable fault diagnosis suitable for real-time industrial applications.
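The calibration step described in this section (a single global temperature fitted on validation data, then applied to MC-averaged logits) can be sketched as below. The grid search over temperatures is an assumption chosen for simplicity; gradient-based optimizers are a common alternative. The synthetic validation data are deliberately overconfident by a factor of 3 so that the fitted temperature can be checked against a known answer.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def nll(logits, labels, temperature=1.0):
    """Mean negative log-likelihood of the labels at a given temperature."""
    log_p = log_softmax(logits / temperature)
    return float(-log_p[np.arange(len(labels)), labels].mean())

def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the single global temperature that minimizes validation NLL."""
    grid = np.linspace(0.5, 5.0, 91) if grid is None else grid
    losses = [nll(val_logits, val_labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def calibrated_probs(mc_logits, temperature):
    """Average logits over MC passes, then scale and apply softmax.

    mc_logits: (N_MC, B, K)
    """
    mean_logits = mc_logits.mean(axis=0)
    return np.exp(log_softmax(mean_logits / temperature))

# Synthetic validation set: labels drawn from softmax(base), logits = 3 * base.
rng = np.random.default_rng(0)
base = 2.0 * rng.standard_normal((2000, 5))
true_p = np.exp(log_softmax(base))
val_labels = np.array([rng.choice(5, p=p) for p in true_p])
val_logits = 3.0 * base
T_fit = fit_temperature(val_logits, val_labels)

# Apply the fitted temperature to 20 noisy passes for 8 test samples.
mc_logits = val_logits[None, :8] + 0.1 * rng.standard_normal((20, 8, 5))
probs = calibrated_probs(mc_logits, T_fit)
```

Dividing by a temperature greater than 1 softens the distribution without changing the predicted class, so accuracy is unaffected while confidence is reduced.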

5. Description of TEP data

The Tennessee Eastman Process (TEP) is a chemical process simulation model originally introduced by Downs and Vogel [47]. It is commonly used as a benchmark in the field of process control to compare fault detection and diagnosis methodologies [33]. TEP is made up of five operating units: a reactor, condenser, compressor, separator, and stripper, as shown in Fig 2. The process model represents the relationships among these operating units and generates two sets of data, training and test, each comprising normal and faulty operation.

Fig 2. Schematic diagram of Tennessee Eastman Process (TEP).

https://doi.org/10.1371/journal.pone.0349385.g002

Let x(t) denote the process data, with 52 features for each time-series sample: 41 measured variables and 11 manipulated variables. These 52 variables were collected at 3-minute intervals while the process was running. There are 25 hours of training data and 48 hours of testing data, with each fault introduced after one hour in the training runs and after eight hours in the testing runs. There is one class for fault-free conditions and twenty classes for faulty states, as shown in Table 1.

Table 1. Description of fault types in TEP.

https://doi.org/10.1371/journal.pone.0349385.t001

5.1. Data collection and preprocessing

The data used in this study is a TEP dataset generated by a detailed process simulation model. TEP data is a simulated but realistic industrial dataset that captures plant activity under both normal and faulty conditions. The simulations represent different operating scenarios, and faults are introduced randomly in the simulation. The dataset comprises 25,101 samples and 20 fault conditions. Preprocessing the TEP dataset involves several key steps to obtain clean and structured data for analysis. First, binary labels for fault detection are obtained by encoding the fault number as a binary indicator, where 0 indicates normal conditions and 1 indicates faulty conditions. In addition, multi-class labels are created for classifying the fault type. Feature engineering is applied by computing rolling statistics such as the rolling mean, standard deviation, skewness, and kurtosis to capture time-dependent patterns. To achieve uniform feature scaling, the data are standardized with a standard scaler (zero mean and unit variance). Furthermore, a sliding window technique generates 100-step sequences, allowing the model to learn temporal dependencies. Finally, stratified sampling is used to divide the data into training, validation, and test sets, ensuring a balanced representation of normal and faulty data across all sets.
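The labeling, standardization, and windowing steps described above can be sketched in NumPy as follows. Labeling each window by its final sample and using a stride of 1 are assumptions, since the text does not state either; the rolling-statistics features are omitted for brevity, and the synthetic array stands in for the real TEP data.

```python
import numpy as np

def standardize(train, other):
    """Z-score both arrays using statistics from the training data only."""
    mu = train.mean(axis=0)
    sd = train.std(axis=0) + 1e-8  # guard against constant columns
    return (train - mu) / sd, (other - mu) / sd

def make_windows(data, fault_ids, window=100, stride=1):
    """Slice (N, F) data into overlapping windows of shape (window, F).

    Each window takes the fault number of its last sample (an assumption).
    Returns X: (M, window, F), y_det: (M,) binary, y_diag: (M,) fault number.
    """
    X, y_diag = [], []
    for end in range(window, len(data) + 1, stride):
        X.append(data[end - window:end])
        y_diag.append(fault_ids[end - 1])
    X = np.stack(X)
    y_diag = np.asarray(y_diag)
    y_det = (y_diag > 0).astype(int)  # 0 = normal, 1 = any fault
    return X, y_det, y_diag

# Synthetic stand-in: 300 samples, 52 variables, fault 4 starts at sample 160.
rng = np.random.default_rng(0)
data = rng.standard_normal((300, 52))
fault_ids = np.where(np.arange(300) >= 160, 4, 0)
X, y_det, y_diag = make_windows(data, fault_ids)
train_z, test_z = standardize(data[:200], data[200:])
```

With overlapping windows, splitting after windowing can place nearly identical sequences in different subsets, so the split strategy deserves care in any reimplementation.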

5.2. Training phase of the proposed framework

During training, raw multivariate time-series data from the Tennessee Eastman Process (TEP) benchmark is first preprocessed. All sensor variables are normalized using z-score normalization to ensure numerical stability and consistent scaling. The dataset contains 52 process variables and several fault types. Each sample is labeled by its fault number, and for binary fault detection, samples with a fault number greater than 0 are labeled as 1 (faulty), and others as 0 (normal). To capture the sequential nature of the industrial process data, fixed-length overlapping windows of size T = 50 are created using a sliding window method. Each sequence $\mathbf{X} \in \mathbb{R}^{T \times F}$, where F = 52 is the number of features, is paired with its corresponding fault label for classification. The sequences are randomly shuffled and split into 80% training/validation and 20% test data, with the training portion further divided into 90% training and 10% validation using a fixed random seed for reproducibility. All reported results are obtained on the held-out test set, and normalization statistics are estimated from the training data only to prevent information leakage. Model parameters are optimized using AdamW (learning rate 0.001, weight decay 10⁻⁴) with a batch size of 64 and a warmup–cosine learning rate schedule (6 warmup epochs), training for up to 48 epochs with early stopping (patience = 12) based on validation loss. Class imbalance is handled through class-weighted cross-entropy for diagnosis (inverse square-root frequency weighting) and a weighted random sampler that oversamples rare fault classes. To improve robustness, Gaussian input noise is added during early epochs, uncertainty-aware sensor dropout is applied with dropout probabilities proportional to sensor uncertainty priors, gradient clipping is used, and mixed-precision training is enabled. The framework is trained in a multi-task manner, jointly performing fault diagnosis and fault detection with a composite loss that combines class-weighted cross-entropy, binary detection loss, and regularization terms that penalize excessive predictive variance and discourage high-confidence predictions from relying on unreliable sensors. The model is trained using the combined loss function:

(18)

(19)

where BCE is the binary cross-entropy loss for fault detection, and WCE is the weighted cross-entropy loss for diagnosis, with class weights set inversely proportional to class frequency. Training runs for 40 epochs using the Adam optimizer with a learning rate of 0.001, a batch size of 64, and early stopping based on validation loss.
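The core of the multi-task objective, a binary cross-entropy term for detection plus a class-weighted cross-entropy term for diagnosis, can be sketched as follows. The loss weights `lam_det` and `lam_diag` are hypothetical, the additional regularization terms mentioned in the text are omitted because their form is not given here, and the `power` argument is introduced so that either frequency weighting mentioned in this section can be tried (0.5 for inverse square-root, 1.0 for inverse frequency).

```python
import numpy as np

def class_weights(labels, n_classes, power=0.5):
    """Weights proportional to 1 / count**power, normalized to mean 1."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    w = 1.0 / np.maximum(counts, 1.0) ** power
    return w * n_classes / w.sum()

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for the detection head."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def weighted_ce(probs, labels, weights, eps=1e-12):
    """Class-weighted cross-entropy for the diagnosis head."""
    w = weights[labels]
    log_lik = np.log(np.clip(probs[np.arange(len(labels)), labels], eps, 1.0))
    return float(-(w * log_lik).sum() / w.sum())

def joint_loss(p_det, y_det, p_diag, y_diag, weights,
               lam_det=1.0, lam_diag=1.0):
    """Weighted sum of the detection and diagnosis losses (sketch only)."""
    return (lam_det * bce(p_det, y_det)
            + lam_diag * weighted_ce(p_diag, y_diag, weights))

# Toy imbalanced label set: class 0 is four times as frequent as class 1.
labels = np.array([0, 0, 0, 0, 1])
w = class_weights(labels, n_classes=2)  # rare class gets the larger weight
loss = joint_loss(np.array([0.5, 0.5]), np.array([0, 1]),
                  np.full((2, 2), 0.5), np.array([0, 1]), np.ones(2))
```

With uninformative predictions of 0.5 everywhere, each term equals ln 2, which gives a convenient sanity check for an implementation.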

5.3. Evaluation phase of the proposed framework

During evaluation, two inference variants are evaluated using the same trained backbone to ensure a controlled comparison. The proposed SA setting activates the full process-guided, sensor-aware attention by incorporating the process centrality, sensor reliability (R_rel), and uncertainty priors into the attention logits. As a baseline, a No-SA variant disables the sensor-aware terms at inference while keeping the architecture and learned parameters unchanged, thereby isolating the effect of sensor awareness on calibration and uncertainty behavior. Predictive uncertainty is estimated using Monte Carlo dropout with N = 20 stochastic forward passes per test sample, and the mean softmax probability is used to compute predictive entropy and related uncertainty measures. Probability calibration is further improved via post-hoc temperature scaling, with the temperature fitted on the validation set only and applied consistently to both SA and No-SA outputs at test time. A single global temperature parameter is used across all experiments and is applied uniformly to all classes and samples during inference, ensuring consistent and stable probability calibration. With N = 20 samples, the predicted fault probability and diagnosis results are averaged:

$$\bar{p}_{\mathrm{det}} = \frac{1}{N} \sum_{n=1}^{N} \hat{p}_{\mathrm{det}}^{(n)} \tag{20}$$

$$\bar{\mathbf{p}}_{\mathrm{diag}} = \frac{1}{N} \sum_{n=1}^{N} \hat{\mathbf{p}}_{\mathrm{diag}}^{(n)} \tag{21}$$

The predictive variance is also calculated as:

$$\hat{\boldsymbol{\sigma}}^2 = \frac{1}{N} \sum_{n=1}^{N} \left(\hat{\mathbf{p}}^{(n)} - \bar{\mathbf{p}}\right)^2 \tag{22}$$
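The uncertainty summaries used at evaluation time (mean probability, per-class variance, and predictive entropy of the mean) can be computed from the stack of MC-dropout outputs as in this sketch. The array shapes and the Dirichlet-sampled stand-in probabilities are assumptions for illustration.

```python
import numpy as np

def predictive_entropy(mean_probs, eps=1e-12):
    """Entropy of the averaged class distribution, per sample."""
    p = np.clip(mean_probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def summarize_mc(probs):
    """probs: (N, B, K) class probabilities from N stochastic passes."""
    mean_p = probs.mean(axis=0)  # (B, K) averaged prediction
    var_p = probs.var(axis=0)    # (B, K) spread across passes
    return mean_p, var_p, predictive_entropy(mean_p)

rng = np.random.default_rng(0)
# Stand-in for 20 passes on 4 samples with 21 classes.
probs = rng.dirichlet(np.ones(21), size=(20, 4))
mean_p, var_p, entropy = summarize_mc(probs)

uniform_entropy = predictive_entropy(np.full(21, 1.0 / 21))  # maximum: ln 21
onehot_entropy = predictive_entropy(np.eye(21)[0])           # close to zero
```

Entropy ranges from 0 for a fully confident prediction to ln 21 (about 3.04) for a uniform distribution over the 21 operating modes.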

This variance captures the uncertainty of the proposed model, which helps in identifying abnormal and borderline samples. Evaluation includes both classification and probability-quality metrics: test accuracy, per-class precision, recall, and F1-score, along with calibration measures including Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Negative Log-Likelihood (NLL), and the Brier score. A hard-subset analysis is conducted on samples where the No-SA baseline is incorrect or assigns low confidence (maximum probability < 0.80) to assess robustness under challenging conditions. Moreover, the proposed model demonstrates robust generalization, efficient inference, and interpretable decision-making through the attention weights and variance estimates. Compared to baseline models, which include BiLSTM, CNN-LSTM, and Conv1D classifiers without attention or uncertainty estimation, the SAU-PGA-CNN-BiLSTM model consistently achieves superior accuracy and reliability under various conditions. The proposed flowchart is shown in Fig 3.

Fig 3. Proposed flowchart of online and offline monitoring of SAU-PGA-CNN-BiLSTM.

https://doi.org/10.1371/journal.pone.0349385.g003

6. Experimental results and discussions

6.1. Training dynamics and convergence

As shown in Fig 4, the training and validation loss and accuracy of the proposed SAU-PGA-CNN-BiLSTM model with process-guided, sensor-aware attention converged quickly and steadily. The validation loss decreased consistently, and the gap between the training and validation curves narrowed throughout training. On the test set, the model reached 93.55% accuracy, with a macro-F1 score of 0.942 and a weighted-F1 score of 0.945, showing strong performance across all 21 operating modes.

Fig 4. Training vs validation (loss and accuracy).

https://doi.org/10.1371/journal.pone.0349385.g004

6.2. Classification performance

Fig 5 shows the confusion matrices for binary fault detection and multiclass fault diagnosis. For detection, the model correctly identifies most of the normal samples (3968 out of 4000) and faulty samples (14620 out of 14629). Very few samples are misclassified, which shows the model's capacity to distinguish between normal and faulty operation. For diagnosis, the confusion matrix shows that the proposed model accurately identifies a wide range of fault types: the clear diagonal indicates that most samples are matched to their true fault categories. Fault classes such as 9, 10, 13, 15, 17, 19, and 20 are also mostly classified correctly, showing the model's ability to separate many different faults. Overall, the results demonstrate the model's high precision and recall, making it well-suited for real-world fault diagnosis in industrial settings.

Fig 5. Fault detection and diagnosis heatmaps.

https://doi.org/10.1371/journal.pone.0349385.g005

6.3. False alarm rate and detection delay performance

Table 2 summarizes the false alarm rate (FAR) and detection delay of the proposed SAU-PGA-CNN-BiLSTM framework on representative Tennessee Eastman Process faults. The model achieves a low average FAR of 0.97%, indicating strong robustness against noise-induced misdetections during normal operation. This behavior is mainly attributed to the sensor-aware uncertainty mechanism, which suppresses unreliable or highly volatile sensor contributions in the decision process. The framework also demonstrates rapid fault detection, with an average detection delay of 5.6 samples after fault onset. Step-type and correlated process faults (1–3, 10, 11) are detected particularly quickly, benefiting from the process-guided attention that emphasizes central and strongly coupled variables. Slightly longer delays are observed for actuator-related or localized faults (15), which exhibit weaker global signatures. Importantly, even for unknown disturbances (16 and 20), the proposed method maintains both low FAR and short detection delay, highlighting its robustness and generalization capability. Overall, the results confirm that SAU-PGA-CNN-BiLSTM provides an effective trade-off between early fault detection and false-alarm suppression, which is essential for reliable real-time industrial process monitoring.

Table 2. False alarm rate and detection delay for representative faults.

https://doi.org/10.1371/journal.pone.0349385.t002
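The two quantities reported in Table 2 can be computed from a per-sample alarm sequence as sketched below. The definitions used here (FAR as the fraction of normal-operation samples that raise an alarm, and delay as the number of samples between fault onset and the first alarm) are common conventions and are assumed for illustration; the function names and the toy sequence are hypothetical.

```python
import numpy as np

def false_alarm_rate(alarms, is_fault):
    """Fraction of normal-operation samples that raised an alarm."""
    alarms = np.asarray(alarms, dtype=bool)
    normal = ~np.asarray(is_fault, dtype=bool)
    if normal.sum() == 0:
        return float("nan")
    return float(alarms[normal].mean())

def detection_delay(alarms, onset):
    """Samples between fault onset and the first alarm at or after onset.

    Returns None when the fault is never flagged.
    """
    alarms = np.asarray(alarms, dtype=bool)
    hits = np.flatnonzero(alarms[onset:])
    return int(hits[0]) if hits.size else None

# Toy run: fault starts at index 10, one spurious alarm at index 3,
# and the detector fires continuously from index 14 onward.
is_fault = np.arange(20) >= 10
alarms = np.zeros(20, dtype=bool)
alarms[3] = True
alarms[14:] = True

far = false_alarm_rate(alarms, is_fault)    # 1 of 10 normal samples
delay = detection_delay(alarms, onset=10)   # first alarm 4 samples after onset
```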

6.4. Uncertainty quantification

Fig 6 shows uncertainty scatter plots for fault detection on the left and diagnosis on the right. In detection, uncertainty (measured by the standard deviation across MC dropout samples) goes down as the predicted probability gets closer to 1, meaning the model is more confident with most samples. For diagnosis, uncertainty is highest around the middle range of predicted probabilities, which matches cases that are unclear or harder to classify. Also, uncertainty changes depending on the true fault type, which is shown by different colors.

Fig 6. Fault detection and diagnosis uncertainty heatmaps.

https://doi.org/10.1371/journal.pone.0349385.g006

6.5. Probability calibration on full test data

Temperature scaling was learned using the validation split and then applied uniformly to both models during test-time evaluation. For clarity, the same single global temperature parameter was used across all experiments, so any observed calibration differences arise from the attention design itself rather than from additional post-hoc tuning. Under this controlled setting, the proposed process-guided, sensor-aware framework produces more reliable probability estimates than the corresponding model without sensor awareness. On the full test set, the NLL drops from 0.1970 to 0.1815 (a 7.9% improvement), and the Brier score goes from 0.1013 to 0.0947 (a 6.5% improvement), showing that the probabilities are both sharper and better matched to actual outcomes. The ECE also decreases from 0.0401 to 0.0374 (a 6.7% improvement). Although this reduction is numerically modest, it is consistent across the full test set and is visually supported by the reliability diagrams (Figs 7 and 8). In these diagrams, the SA model stays closer to the diagonal, especially in the mid-confidence range (0.4–0.8), where the baseline tends to be slightly under-confident, while the curves remain almost the same in the highest-confidence area. This means SA improves calibration without simply making the model more confident. Overall, the results in Table 3 show that, after applying the same temperature scaling procedure, the proposed sensor-aware model achieves lower probabilistic loss, lower prediction error, and slightly improved calibration compared with the non-sensor-aware baseline.

Table 3. Evaluation metrics of the proposed model with and without sensor awareness.

https://doi.org/10.1371/journal.pone.0349385.t003
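Fitting a single global temperature on validation data, as described in this subsection, can be sketched as below. The paper does not state which optimiser was used; the grid search over T, the helper names, and the synthetic over-confident logits are assumptions chosen to keep the example dependency-free.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T=1.0):
    p = softmax(logits, T)
    return float(-np.log(p[np.arange(len(labels)), labels] + 1e-12).mean())

def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the single global T that minimises validation NLL (grid search)."""
    if grid is None:
        grid = np.exp(np.linspace(np.log(0.25), np.log(8.0), 200))
    losses = [nll(val_logits, val_labels, T) for T in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic validation set: informative logits whose scale is inflated,
# which mimics an over-confident classifier.
rng = np.random.default_rng(1)
n, k = 2000, 5
labels = rng.integers(0, k, size=n)
base = rng.normal(size=(n, k))
base[np.arange(n), labels] += 1.5
val_logits = 4.0 * base

T = fit_temperature(val_logits, labels)
# The same T would then divide the test logits of both the SA and No-SA variants.
```

Because T is a single scalar applied to every class and sample, it leaves the arg-max prediction unchanged, so accuracy is unaffected and only the confidence values move.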

Fig 7. Reliability diagram of the proposed model with and without sensor-aware conditioning.

https://doi.org/10.1371/journal.pone.0349385.g007

Fig 8. Reliability comparison on the full test dataset with and without sensor-aware conditioning.

https://doi.org/10.1371/journal.pone.0349385.g008
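For readers who wish to reproduce the quantities behind Table 3 and the reliability diagrams, the four calibration measures can be computed as in the following sketch. The bin count (15 equal-width bins), the top-label definition of confidence, and the multiclass sum-of-squares form of the Brier score are assumptions; the paper's exact settings may differ.

```python
import numpy as np

def calibration_metrics(probs, labels, n_bins=15):
    """Top-label ECE and MCE, plus NLL and multiclass Brier score."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n, k = probs.shape
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)

    # Equal-width, right-closed confidence bins over (0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_idx = np.digitize(conf, edges[1:-1], right=True)

    ece, mce = 0.0, 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if not mask.any():
            continue
        gap = abs(correct[mask].mean() - conf[mask].mean())
        ece += mask.mean() * gap       # weight by the bin's share of samples
        mce = max(mce, gap)            # worst bin

    nll = float(-np.log(probs[np.arange(n), labels] + 1e-12).mean())
    onehot = np.eye(k)[labels]
    brier = float(((probs - onehot) ** 2).sum(axis=1).mean())
    return {"ece": float(ece), "mce": float(mce), "nll": nll, "brier": brier}

# Small illustrative input (made-up probabilities).
probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]])
labels = np.array([0, 1, 1, 0])
metrics = calibration_metrics(probs, labels, n_bins=10)
```

The per-bin pairs of mean confidence and accuracy computed inside the loop are the points plotted in a reliability diagram; ECE is their sample-weighted average gap and MCE is the largest gap.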

6.6. Calibration on difficult samples

We focus on a challenging part of the test data, consisting of all cases where the baseline model (No-SA) either misclassifies the sample or assigns a confidence score below 0.80. This subset covers about 11% of the test set (2,342 out of 21,269) and highlights the situations where well-calibrated probabilities matter most. On this subset, adding sensor-aware attention (SA) to the process-guided model backbone leads to a clear improvement in calibration (see Table 4). The expected calibration error (ECE) drops from 0.3270 to 0.2920, an absolute decrease of 0.035, with a 95% bootstrap confidence interval between 0.0257 and 0.0449, showing consistent improvement across resamples. The maximum calibration error (MCE) more than halves, going from 0.9356 to 0.4543, meaning SA greatly reduces the worst calibration mistakes that can cause risk in real-world use. In short, SA helps the model be honestly uncertain when it should be and sharply cuts down on overconfident errors, which is exactly what is needed for risk-aware decision-making in industrial fault detection and diagnosis.

Table 4. Evaluation metrics of proposed model for hard subset.

https://doi.org/10.1371/journal.pone.0349385.t004
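The hard-subset selection and the bootstrap interval for the ECE reduction can be illustrated as follows. The selection rule (baseline wrong, or top probability below 0.80) follows the text; the paired resampling scheme, the number of resamples, the percentile interval, and the synthetic probabilities are assumptions made for this sketch.

```python
import numpy as np

def ece(probs, labels, n_bins=15):
    """Expected calibration error with equal-width confidence bins."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_idx = np.digitize(conf, edges[1:-1], right=True)
    total = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(total)

def hard_subset_mask(base_probs, labels, threshold=0.80):
    """True where the baseline is wrong or its top probability is below threshold."""
    wrong = base_probs.argmax(axis=1) != labels
    return wrong | (base_probs.max(axis=1) < threshold)

def bootstrap_ece_drop(base_probs, sa_probs, labels, n_boot=200, seed=0):
    """Paired bootstrap of ECE(base) - ECE(SA); returns (point, lo, hi)."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # same resample for both models
        diffs[b] = ece(base_probs[idx], labels[idx]) - ece(sa_probs[idx], labels[idx])
    point = ece(base_probs, labels) - ece(sa_probs, labels)
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return point, float(lo), float(hi)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Synthetic stand-ins for the two models' test probabilities.
rng = np.random.default_rng(0)
n, k = 1500, 5
labels = rng.integers(0, k, size=n)
signal = rng.normal(size=(n, k))
signal[np.arange(n), labels] += 1.5
base_probs = softmax(3.0 * signal)   # deliberately over-confident
sa_probs = softmax(1.5 * signal)     # matches the generating scale

mask = hard_subset_mask(base_probs, labels)
point, lo, hi = bootstrap_ece_drop(base_probs[mask], sa_probs[mask], labels[mask])
```

Note that the mask is defined from the baseline only, so the comparison on this subset is, by construction, conditioned on cases that are difficult for No-SA.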

6.7. Class-wise uncertainty and calibration

The per-class calibration results (Table 5; Figs 9 and 10) further demonstrate the effectiveness of the proposed sensor-aware framework. The macro ECE decreases from 0.0445 to 0.0393, indicating improved alignment between predicted confidence and true accuracy across fault classes. The largest improvements are observed for Classes 3, 10, 11, 16, and 20, where ECE is consistently reduced. These gains are mainly attributed to the combined effect of process-guided attention, which emphasizes structurally important variables, and sensor-aware conditioning, which suppresses unreliable signals. This leads to more stable and better-calibrated predictions, particularly for faults involving multiple correlated sensors. As shown in Fig 9, the sensor-aware model generally exhibits lower predictive entropy, indicating more confident predictions. Fig 10 further shows slightly higher and more consistent class-wise confidence values, supporting improved reliability. Minor increases in ECE are observed for Classes 15 and 19, which may result from limited samples, overlapping fault characteristics, or already high classification accuracy. However, these changes are small and do not affect the overall trend. Overall, the results indicate that incorporating process structure and sensor reliability improves class-wise calibration, particularly for complex fault conditions, while maintaining strong classification performance.

Table 5. Class wise evaluation metrics of the proposed model.

https://doi.org/10.1371/journal.pone.0349385.t005

Fig 9. Per-class predictive entropy for models with and without sensor-aware conditioning.

https://doi.org/10.1371/journal.pone.0349385.g009

Fig 10. Class-wise confidence comparison between models with and without sensor-aware conditioning.

https://doi.org/10.1371/journal.pone.0349385.g010

6.8. Attention behaviours and alignment with priors

Figs 11 and 12 show how the attention of the proposed model behaves. The class-wise average heatmaps reveal that attention is focused mainly on a small group of sensors, while many other channels stay near a low baseline. This focus is not random; it reflects the priors built into the attention logits.

Fig 11. Class wise attention heatmap of the proposed model.

https://doi.org/10.1371/journal.pone.0349385.g011

Fig 12. Top sensors plot with respect to mean attention weight.

https://doi.org/10.1371/journal.pone.0349385.g012

Here, sensors that are central to the process get higher weights, while unstable or weakly connected sensors are consistently given lower weights. This indicates that the model emphasizes process-relevant and reliable variables instead of distributing attention uniformly across all sensors. The Top-k plot highlights the key sensors that the model focuses on repeatedly, and the error bars (confidence intervals) show that these rankings are consistent across test examples. This stability suggests that the learned attention is not driven by incidental fluctuations, but instead captures persistent and diagnostically meaningful process variables. The per-sample heatmap shows clear, class-dependent patterns: faults that spread through related subsystems produce similar attention patterns across samples, giving process-level insights rather than random, one-off spikes.

Together, these observations show that the proposed model does not simply focus on whatever correlates with the label; instead, its attention scores reflect the domain structure (process centrality and reliability). As a result, the attention mechanism improves not only interpretability but also the reliability of probabilistic predictions, particularly in difficult cases where robust sensor selection is essential. This helps the model calibrate probabilities better on hard cases by relying on sensors that are both informative and stable, and by avoiding over-confidence when the evidence comes mostly from noisy sensors.
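One way to picture how priors enter the attention logits is the additive form sketched below, where centrality and reliability raise a sensor's logit and the uncertainty prior lowers it before the softmax. The exact functional form used in the paper is not reproduced in this section, so the additive combination, the weights `lam_c`, `lam_r`, `lam_u`, and the four-sensor toy values are all hypothetical.

```python
import numpy as np

def prior_biased_attention(scores, centrality, reliability, uncertainty,
                           lam_c=1.0, lam_r=1.0, lam_u=1.0):
    """Softmax over sensors of learned scores plus prior biases (assumed form)."""
    logits = scores + lam_c * centrality + lam_r * reliability - lam_u * uncertainty
    logits = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)

# Four hypothetical sensors with identical learned scores, so that the
# effect of the priors alone is visible.
scores = np.zeros(4)
centrality = np.array([1.0, 0.2, 0.2, 0.2])    # sensor 0 is process-central
reliability = np.array([0.9, 0.8, 0.1, 0.5])   # sensor 2 is unreliable
uncertainty = np.array([0.1, 0.2, 0.9, 0.4])   # sensor 2 is volatile

weights = prior_biased_attention(scores, centrality, reliability, uncertainty)
no_sa = prior_biased_attention(scores, centrality, reliability, uncertainty,
                               lam_r=0.0, lam_u=0.0)   # sensor-aware terms disabled
```

Setting the sensor-aware weights to zero at inference, as in the last call, mirrors the idea of a No-SA variant that keeps the learned parameters but drops the reliability and uncertainty terms.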

6.9. Baselines comparison with proposed model

Table 6 and Fig 13 compare the proposed SAU-PGA-CNN-BiLSTM model with classical machine learning methods (Random Forest, XGBoost, MLP) and deep learning methods (1D-CNN, BiLSTM, Transformer) for fault diagnosis on the TEP dataset. The SAU-PGA-CNN-BiLSTM model achieves the highest test accuracy and F1-score, both at 0.943, outperforming all other models. Among the machine learning methods, XGBoost performs best, while the Transformer performs best among the deep learning baselines. Notably, both the proposed hybrid model and the Transformer perform much better than the traditional methods, highlighting the importance of sequential and attention-based techniques for capturing complex temporal and contextual patterns.

Table 6. Baseline machine learning and deep learning performance comparison.

https://doi.org/10.1371/journal.pone.0349385.t006

Fig 13. Comparison of the proposed model with the baseline models.

https://doi.org/10.1371/journal.pone.0349385.g013

The results show that combining convolutional, sequential, and attention mechanisms in the SAU-PGA-CNN-BiLSTM model helps it capture spatiotemporal features effectively for fault diagnosis, outperforming both traditional and deep learning methods. The integrated attention reveals that the model focuses on important time steps, making its decisions easier to understand. Although the accuracy and F1 score are high, a calibration gap remains, meaning the predicted confidence should be used carefully in probabilistic or decision-making tasks. Post-hoc calibration methods such as temperature scaling can help reduce this gap. Additionally, the uncertainty analysis shows that the model is reliable in most cases but sometimes produces high uncertainty, suggesting the need for human review or backup systems. Overall, these findings confirm that the proposed method works well for real-world industrial fault detection and diagnosis, while also providing useful uncertainty information to support decision-making.

7. Discussions

Our study focuses on fault detection and diagnosis (FDD) in the multivariate Tennessee Eastman process, using a CNN–BiLSTM model enhanced with process-guided, sensor-aware attention. We incorporate domain knowledge and sensor quality through three data-driven priors calculated from the training data: a process bias based on correlation centrality to highlight key process channels; a reliability prior that increases the weight of sensors with high mutual information and stable variance; and an uncertainty prior that penalizes unstable or weakly connected sensors. These priors are added to the attention logits, along with a regularizer that prevents over-reliance on unreliable sensors when the model is confident.
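As an illustration of how such priors might be derived from training data, the sketch below computes a correlation-centrality score and a variance-instability score per sensor. These particular definitions (mean absolute correlation with the other sensors, and the coefficient of variation of windowed variances) are assumptions chosen for the example and are not taken from the paper; the mutual-information component of the reliability prior is not shown.

```python
import numpy as np

def correlation_centrality(X):
    """Mean absolute correlation of each sensor with the others, scaled to [0, 1].

    X: array of shape (samples, sensors).
    """
    corr = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(corr, 0.0)                      # ignore self-correlation
    c = np.abs(corr).sum(axis=1) / (X.shape[1] - 1)
    return c / c.max() if c.max() > 0 else c

def variance_instability(X, n_windows=10):
    """Coefficient of variation of per-window variances (higher = less stable)."""
    windows = np.array_split(X, n_windows, axis=0)
    v = np.stack([w.var(axis=0) for w in windows])   # (n_windows, sensors)
    return v.std(axis=0) / (v.mean(axis=0) + 1e-12)

# Toy data: sensors 0 and 1 share a latent signal; sensor 2 is independent
# and its noise level drifts over time.
rng = np.random.default_rng(0)
t = 1000
latent = rng.normal(size=t)
s0 = latent + 0.1 * rng.normal(size=t)
s1 = latent + 0.1 * rng.normal(size=t)
s2 = rng.normal(size=t) * np.linspace(0.1, 3.0, t)
X = np.column_stack([s0, s1, s2])

centrality = correlation_centrality(X)
instability = variance_instability(X)
```

In this toy case the isolated, drifting sensor receives the lowest centrality and the highest instability, which is the kind of channel the uncertainty prior is meant to penalize.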

On the test set, after applying temperature scaling fitted on validation data, the proposed model achieves 93.6% accuracy and produces better-calibrated probabilities than the same network without sensor awareness: NLL 0.181 vs. 0.197, Brier score 0.095 vs. 0.101, and ECE 0.037 vs. 0.040. On a difficult subset of about 11% of the test data, defined by baseline errors or low confidence, SAU-PGA-CNN-BiLSTM significantly improves probability quality: ECE drops by 0.035 (95% CI [0.0257, 0.0449]) and MCE roughly halves from 0.936 to 0.454, with no significant change in per-bin accuracy. This indicates that the improvements come from better confidence calibration rather than raw accuracy.

Per-class ECE improvements are largest for classes 3, 10, 11, 16, and 20, which involve distributed, correlated disturbances where process structure is important. Small declines occur for classes 15 and 19. Analysis of the attention weights supports this: a small group of sensors consistently receives high attention, while volatile sensors are down-weighted. The rank correlation between the learned attention and the uncertainty prior indicates that the model learns to trust reliable sensors.

Overall, these results support our SAU-PGA-CNN-BiLSTM approach for industrial FDD. It integrates process knowledge and sensor quality into the model, leading to more reliable probability estimates, better calibration where it matters most, and useful risk-aware behavior for human-in-the-loop systems. Although the proposed framework is validated on the Tennessee Eastman Process, its design is general and can be extended to other industrial systems with multivariate time-series data. The process-guided and sensor-aware components can be adapted using process-specific structural information, such as sensor relationships or process topology, enabling effective application across different industrial domains.

8. Ablation study of the proposed model

A thorough ablation study was conducted to examine the contribution of each element of the proposed SAU-PGA-CNN-BiLSTM model; the results are summarized in Table 7. Accuracy and F1-score decreased significantly when the attention mechanism was removed, underscoring the importance of attention in concentrating on the most informative temporal characteristics. Performance was similarly reduced by removing the convolutional layers, highlighting the significance of local spatial feature extraction. The importance of representing bidirectional temporal dependencies in process data was further confirmed by the lower accuracy obtained when the BiLSTM was replaced with a simple feedforward layer or with a unidirectional LSTM. Standalone CNN or BiLSTM architectures that were not integrated with the other elements consistently underperformed relative to the full model. Overall, the ablation results confirm that the integration of convolutional, sequential, and attention mechanisms is crucial for achieving effective and robust fault detection and diagnosis performance in complex industrial environments.

Table 7. Ablation performance of the proposed model.

https://doi.org/10.1371/journal.pone.0349385.t007

9. Limitations and future work

Our priors are estimated from data-driven correlations and sequence statistics. However, as operating conditions change, controllers are adjusted, or sensors age, these correlation-based centralities can become unreliable, causing attention to misalign with the true process structure, a common challenge in process fault detection and diagnosis [48]. We use global temperature scaling for calibration, which is simple but cannot correct miscalibration specific to certain classes or regions [49]. Predictive uncertainty is estimated through MC dropout, which is computationally efficient but is an approximation that depends on where dropout is applied and on the number of samples taken. Attention weights provide useful insights into the process but do not prove causal feature importance and may sometimes overestimate or underestimate a feature's contribution [50]. In future work, we will replace correlation-only priors with models that encode the system's structure, such as GATs built from P&IDs, and add physics and constraint information directly into the learning process to keep performance stable across different situations [51]. For uncertainty, we plan to compare MC dropout with deep ensembles and similar last-layer Gaussian methods, and to test calibration under controlled distribution changes using standard protocols [52]. For calibration, we will go beyond a single scalar temperature and use class-wise or Dirichlet calibration to correct systematic biases for each class while maintaining consistent probabilities. These improvements, along with online updating of priors and calibrators, aim to keep the interpretability benefits of process-guided, sensor-aware attention while making the system more robust in varied operating conditions.

10. Conclusion

This paper proposed the SAU-PGA-CNN-BiLSTM framework for reliable and interpretable industrial fault detection and diagnosis. The model integrates convolutional feature extraction, bidirectional temporal modeling, and a process-guided, sensor-aware attention mechanism that incorporates process centrality, sequence-level reliability, and sensor uncertainty priors directly into the attention computation. In addition to joint fault detection and multiclass diagnosis, Monte Carlo dropout and post-hoc temperature scaling are employed to provide calibrated probability estimates suitable for risk-aware industrial deployment. The proposed model achieves 93.6% test accuracy and, after validation-fit temperature scaling, produces better-calibrated probabilities than an otherwise identical model without sensor awareness (NLL 0.181 vs. 0.197, Brier score 0.095 vs. 0.101, and ECE 0.037 vs. 0.040), reflecting a shift towards more confident and accurate predictions. On a challenging slice of the data (∼11% of the test set where No-SA is either low-confidence or erroneous), the sensor-aware variant significantly improves calibration: ECE drops by 0.035 and MCE is roughly halved (0.936 to 0.454), showing safer probability estimates even with minor accuracy improvements. Per-class analysis demonstrates the greatest improvements for failure modes that spread across correlated subsystems, which is consistent with the design decision to prioritize central, trustworthy sensors. Selective-risk curves show that operators can defer the noisiest 10–20% of cases while maintaining good accuracy on the auto-accepted set. Moreover, the model maintains low false alarm rates, short detection delays, and improved behavior on challenging samples. Overall, the proposed SAU-PGA-CNN-BiLSTM framework offers a scalable, uncertainty-aware, and operationally reliable solution for industrial fault detection and diagnosis, bridging the gap between high predictive accuracy and practical deployment requirements in safety-critical process environments.

Acknowledgments

The authors express sincere gratitude to Xi’an Eurasia University for providing support and a conducive environment for this research.

References

1. Xu Y, Sun Y, Wan J, Liu X, Song Z. Industrial big data for fault diagnosis: taxonomy, review, and applications. IEEE Access. 2017;5:17368–80.
2. Reis MS, Gins G. Industrial process monitoring in the big data/industry 4.0 era: from detection, to diagnosis, to prognosis. Processes. 2017;5(3):35.
3. Leite D, Andrade E, Rativa D, Maciel AMA. Fault detection and diagnosis in industry 4.0: a review on challenges and opportunities. Sensors (Basel). 2024;25(1):60. pmid:39796851
4. Peng H, Zhang H, Fan Y, Shangguan L, Yang Y. A review of research on wind turbine bearings’ failure analysis and fault diagnosis. Lubricants. 2022;11(1):14.
5. Zhao H, Zheng J, Xu J, Deng W. Fault diagnosis method based on principal component analysis and broad learning system. IEEE Access. 2019;7:99263–72.
6. Luo J, Kong X, Hu C, Li H. Key-performance-indicators-related fault subspace extraction for the reconstruction-based fault diagnosis. Measurement. 2021;186:110119.
7. Kong X, Luo J, Feng X. An Overview of Conventional MSPC Methods. Process Monitoring and Fault Diagnosis Based on Multivariable Statistical Analysis. 2024. pp. 9–25. https://doi.org/10.1007/978-981-99-8775-7_2
8. Yu W, Zhao C, Huang B. MoniNet with concurrent analytics of temporal and spatial information for fault detection in industrial processes. IEEE Transac Cybern. 2021;52(8):8340–51.
9. Ji C, Sun W. A review on data-driven process monitoring methods: characterization and mining of industrial data. Processes. 2022;10(2):335.
10. Huang T, Zhang Q, Tang X, Zhao S, Lu X. A novel fault diagnosis method based on CNN and LSTM and its application in fault diagnosis for complex systems. Artif Intell Rev. 2021;55(2):1289–315.
11. Van Gompel J, Spina D, Develder C. Satellite based fault diagnosis of photovoltaic systems using recurrent neural networks. Appl Energy. 2022;305:117874.
12. Yadong H, Zhe Y, Dong W, Chengdong G, Chuankun L, Yian G. A fault diagnosis method for complex chemical process based on multi-model fusion. Chem Eng Res Design. 2022;184:662–77.
13. Himeur Y, Elnour M, Fadli F, Meskin N, Petri I, Rezgui Y, et al. AI-big data analytics for building automation and management systems: a survey, actual challenges and future perspectives. Artif Intell Rev. 2023;56(6):4929–5021. pmid:36268476
14. Hassija V, Chamola V, Mahapatra A, Singal A, Goel D, Huang K, et al. Interpreting black-box models: a review on explainable artificial intelligence. Cogn Comput. 2023;16(1):45–74.
15. Esna-Ashari M. Beyond the black box: A review of quantitative metrics for neural network interpretability and their practical implications. Int J Sustain Appl Sci Eng. 2025;2(1):1–24.
16. Kotipalli B. The role of attention mechanisms in enhancing transparency and interpretability of neural network models in explainable AI. 2024.
17. Chen J, Duan N, Zhou X, Wang Z. Diagnostic model for transformer core loosening faults based on the gram angle field and multi-head attention mechanism. Appl Sci. 2024;14(23):10906.
18. Han T, Li Y-F. Out-of-distribution detection-assisted trustworthy machinery fault diagnosis approach with uncertainty-aware deep ensembles. Reliab Eng Syst Saf. 2022;226:108648.
19. Guillory D, Shankar V, Ebrahimi S, Darrell T, Schmidt L. Predicting with confidence on unseen distributions. In: Proceedings of the IEEE/CVF international conference on computer vision. 2021. pp. 1134–44.
20. Lin Y-H, Li G-H. Uncertainty-aware fault diagnosis under calibration. IEEE Trans Syst Man Cybern, Syst. 2024;54(10):6469–81.
21. Asfandyar Shahid M, Zhang X, Qin X, Peng K. A novel quality-related generative data-driven fault diagnosis method for complex industrial processes with incomplete data. Meas Sci Technol. 2025;36(10):106218.
22. Pule M, Matsebe O, Samikannu R. Application of PCA and SVM in fault detection and diagnosis of bearings with varying speed. Math Probl Eng. 2022;2022(1):5266054.
23. Deng R, Fan Y, Fang Z, Wang Z. Statistical process monitoring based on collaboration preserving embedding. IEEE Trans Instrum Meas. 2022;71:1–9.
24. Khan MM, Islam I, Rashid AB. Fault diagnosis of an industrial chemical process using machine learning algorithms: principal component analysis (PCA) and Kernel Principal Component Analysis (KPCA). IOP Conf Ser: Mater Sci Eng. 2024;1305(1):012037.
25. Uddin Z, Qamar A, Alam F. ICA based sensors fault diagnosis: an audio separation application. Wireless Pers Commun. 2021;118(4):3369–84.
26. Gao L, Li D, Yao L, Gao Y. Sensor drift fault diagnosis for chiller system using deep recurrent canonical correlation analysis and k-nearest neighbor classifier. ISA Trans. 2022;122:232–46. pmid:33985786
27. Wang J, Gao D, Zhu S, Wang S, Liu H. Fault diagnosis method of photovoltaic array based on support vector machine. Energy Sourc Part A: Recov Utiliz Environ Effect. 2023;45(2):5380–95.
28. Cen J, Yang Z, Liu X, Xiong J, Chen H. A review of data-driven machinery fault diagnosis using machine learning algorithms. J Vib Eng Technol. 2022;10(7):2481–507.
29. Ruan D, Wang J, Yan J, Gühmann C. CNN parameter design based on fault signal analysis and its application in bearing fault diagnosis. Adv Eng Inform. 2023;55:101877.
30. Veerasamy V, Wahab NIA, Othman ML, Padmanaban S, Sekar K, Ramachandran R, et al. LSTM recurrent neural network classifier for high impedance fault detection in solar PV integrated power system. IEEE Access. 2021;9:32672–87.
31. Fan C, Xiahou K, Wang L, Wu Q. Data-driven fault detection of multiple open-circuit faults for mmc systems based on long short-term memory networks. CSEE J Power Energy Syst. 2024;10(4):1563–74.
32. Alhanaf AS, Farsadi M, Balik HH. Fault detection and classification in ring power system with DG penetration using hybrid CNN-LSTM. IEEE Access. 2024;12:59953–75.
33. Shahid MA, Zhang X, Qin X, Peng K. A novel deep multi‐task learning model for spatial–temporal fault detection and diagnosis in industrial systems. Can J Chem Eng. 2025;103(12):5910–34.
34. Li J, Chen J, Li Z. BILSTM based on quality driven attention for product defect prediction. In: 2024 39th Youth Academic Annual Conference of Chinese Association of Automation (YAC). IEEE; 2024. pp. 2233–7.
35. Ruan T, Zhang S. Towards understanding how attention mechanism works in deep learning. arXiv preprint arXiv:2412.18288. 2024.
36. Rao S, Wang J. A comprehensive fault detection and diagnosis method for chemical processes. Chem Eng Sci. 2024;300:120565.
37. Chen Y, Zhang R, Gao F. Fault diagnosis of industrial process using attention mechanism with 3DCNN-LSTM. Chem Eng Sci. 2024;293:120059.
38. Kim C, Cho K, Joe I. Artificial intelligence-based fault diagnosis for steam traps using statistical time series features and a transformer encoder-decoder model. Electronics. 2025;14(5):1010.
39. Huang L, Ruan S, Xing Y, Feng M. A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods. Med Image Anal. 2024;97:103223. pmid:38861770
40. Ren J, Wen J, Zhao Z, Yan R, Chen X, Nandi AK. Uncertainty-aware deep learning: a promising tool for trustworthy fault diagnosis. IEEE/CAA J Autom Sinica. 2024;11(6):1317–30.
41. Zeevi T, Venkataraman R, Staib LH, Onofrey JA. Monte-carlo frequency dropout for predictive uncertainty estimation in deep learning. In: 2024 IEEE International Symposium on Biomedical Imaging (ISBI). IEEE; 2024. pp. 1–5.
42. Shao H, Xiao Y, Leng J, Zhao X, Liu B. Collaborative human-computer fault diagnosis via calibrated confidence estimation. Adv Eng Inf. 2025;65:103349.
43. Ige AO, Sibiya M. State-of-the-art in 1D convolutional neural networks: a survey. IEEE Access. 2024;12:144082–105.
44. Hameed Z, Garcia-Zapirain B. Sentiment classification using a single-layered BiLSTM model. IEEE Access. 2020;8:73992–4001.
45. Wu C, Wu F, Qi T, Huang Y, Xie X. Fastformer: Additive attention can be all you need. arXiv preprint arXiv:2108.09084. 2021.
46. Milanés-Hermosilla D, Trujillo Codorniú R, López-Baracaldo R, Sagaró-Zamora R, Delisle-Rodriguez D, Villarejo-Mayor JJ, et al. Monte Carlo dropout for uncertainty estimation and motor imagery classification. Sensors (Basel). 2021;21(21):7241. pmid:34770553
47. Downs JJ, Vogel EF. A plant-wide industrial process control problem. Comput Chem Eng. 1993;17(3):245–55.
48. Loeys S, Boute RN, Antonio K. The use of IoT sensor data to dynamically assess maintenance risk in service contracts. Eur J Operat Res. 2025;324(2):454–65.
49. Dussert G, Chamaillé‐Jammes S, Dray S, Miele V. Being confident in confidence scores: calibration in deep learning models for camera trap image sequences. Remote Sens Ecol Conserv. 2024;11(1):88–99.
50. Wang J, Shao H, He J, Liu L, Ma J, Liu B. A novel interpretable fault diagnosis method using multi-image feature extraction and attention fusion. Pattern Recogn Lett. 2025;189:38–47.
51. Zhou W, Ma G, Zhou N, Liang X, Gao C, Deng W, et al. Fgat: Industrial Protocol States Labeling with Fuzzy Graph Attention Network.
52. Mendonça F, Shanawaz Mostafa S, Morgado-Dias F, Ravelo-García AG, Figueiredo MAT. ProBoost: reducing uncertainty using a boosting method for probabilistic models. IEEE Access. 2025;13:132006–21.

