Figures
Abstract
Industrial sensor networks exhibit heterogeneous, federated, large-scale, and intelligent characteristics due to the increasing number of Internet of Things (IoT) devices and different types of sensors. Efficient and accurate anomaly detection of sensor data is essential for guaranteeing the system’s operational reliability and security. However, existing research on sensor data anomaly detection for industrial sensor networks still has several inherent limitations. First, most detection models usually consider centralized detection. Thus, all sensor data have to be uploaded to the control center for analysis, leading to a heavy traffic load. However, industrial sensor networks have high requirements for reliable and real-time communication. The heavy traffic load may cause communication delays or packets lost by corruption. Second, there are complex spatial and temporal features in industrial sensor data. The full extraction of such features plays a key role in improving detection performance. Nevertheless, the majority of existing methodologies face challenges in simultaneously and comprehensively analyzing both features. To solve the limitations above, this paper develops a cloud-edge collaborative data anomaly detection approach for industrial sensor networks that mainly consists of a sensor data detection model deployed at individual edges and a sensor data analysis model deployed in the cloud. The former is implemented using Gaussian and Bayesian algorithms, which effectively filter the substantial volume of sensor data generated during the normal operation of the industrial sensor network, thereby reducing traffic load. It only uploads all the sensor data to the sensor data analysis model for further analysis when the network is in an anomalous state. The latter based on GCRL is developed by inserting Long Short-Term Memory network (LSTM) into Graph Convolutional Network (GCN), which can effectively extract the spatial and temporal features of the sensor data for anomaly detection. The proposed approach is extensively assessed through experiments using two public industrial sensor network datasets compared with the baseline anomaly detection models. The numerical results demonstrate that the proposed approach outperforms the existing state-of-the-art models.
Citation: Yang T, Jiang X, Li W, Liu P, Wang J, Hao W, et al. (2025) Cloud-edge collaborative data anomaly detection in industrial sensor networks. PLoS One 20(6): e0324543. https://doi.org/10.1371/journal.pone.0324543
Editor: Haleem Farman, Prince Sultan University, SAUDI ARABIA
Received: October 29, 2024; Accepted: April 26, 2025; Published: June 11, 2025
Copyright: © 2025 Yang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data for this study are publicly available from the GitHub repository (https://github.com/yechenn/sensor-data-anomaly-detection).
Funding: The National Key Research and Development Program of China (2023YFB3107603) the “Pioneer” and “Leading Goose” R&D Program of Zhejiang (2022C01239).
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Industrial sensor networks have been developed quickly in the paradigm of the IoT and adopted in many industrial application domains ([1–3]) In industrial sensor networks, a large amount of sensor data, i.e., the measured value of sensors, is generally collected by different types of distributed industrial sensors and made available to a remote control center for centralized processing and analysis. The sensor data is valuable for condition monitoring and management functionalities in industrial sensor networks. However, anomalous sensor data cannot be completely avoided in practice due to many factors, e.g., unreliable sensing devices [4], computation, physical attacks, and transmission errors [5], imposing obvious risks to industrial sensor network applications. The failure to detect the anomalous speed of centrifuges led to the collapse of the nuclear plant, as documented in reference [6]. In the Venezuelan power grid event of 2019, anomalous voltage measurements were not accurately and timely detected, which resulted in a nationwide blackout, as described in reference [7]. Therefore, accurate anomalous sensor data detection is necessary to ensure the system’s operational reliability ([8–15]).
At present, there are many industrial sensor data anomaly detection approaches, e.g., statistical algorithms and machine learning algorithms. However, most of them still exhibit two limitations, the first being issues with the detection architecture, and the second being challenges related to the extraction of features from industrial sensor data. Specifically, these limitations are as follows:
(1) Traditional sensor data anomaly detection models are based on statistical algorithms (e.g., [16–18]), which cannot provide high detection accuracy. Thus, machine learning algorithms are popular in sensor data anomaly detection due to their better detection performance (e.g., [19–22]). However, it should be noted that the machine learning-based model can be computationally complex for field sensors with limited computational capability. Therefore, many machine learning-based detection models are supposed to be deployed in the control center in practice. In Fig 1, the arrows indicate the transmission of sensor data. As Fig 1a shows, the anomaly detection model using machine learning algorithms is deployed in the control center, so all sensor data have to be uploaded to the control center for further analysis. With the advancement of informatization and intelligence in Industrial Control Systems (ICS), the number of industrial sensors has drastically increased, and their distribution has become more widespread. Therefore, the centralized detection mechanism imposes a significant communication burden, leading to a surge in traffic load. However, reliable and real-time communication is essential to the stable operation of the industrial sensor network [23]. The increase in traffic load may disrupt normal communication behaviors ( [24,25]), leading to serious problems, e.g., communication delays and packets lost by corruption. Consequently, the traditional centralized detection architecture struggles to adapt to the evolving landscape of industrial control systems. Therefore, it is imperative to propose a new detection architecture to address this challenge.
(2) Different industrial processes are generally coupled. Hence, the sensor data exhibit spatial and temporal features [26,27]. In industrial sensor networks, the measurements of sensors at different positions influence each other. For example, a rise in water level may cause a drop in PH in a water treatment system. This is called a spatial feature. In addition, sensor data is time sequence data. Its value at time T+1 is affected by the value at T. This is called a temporal feature. It is crucial to consider spatial and temporal features simultaneously for sensor data characterization and anomaly detection [28]. In our investigation of existing literature on anomaly detection in sensor data within industrial sensor networks, we observed that the majority of studies focus on handling either temporal [29] or spatial [30] features. While there are a few articles capable of extracting both types of features simultaneously, their feature extraction capabilities remain insufficiently comprehensive.
To address limitation (1), we draw inspiration from the federated learning architecture ( [31–33]), as well as our previous work on cloud–edge collaborative detection for network traffic anomaly detection [34], and further propose a cloud–edge collaborative detection framework tailored for sensor data anomaly detection. This approach takes into consideration that industrial sensor networks spend significantly more time in normal operation than in an anomalous state in real-world scenarios. This approach entails deploying the sensor data detection model at the edge and the sensor data analysis model in the cloud, as illustrated in Fig 1b. The sensor data detection model deployed at the edge efficiently filters the substantial volume of sensor data generated during normal operation of the industrial sensor network. It only uploads all the sensor data when the network is in an anomalous state to the cloud-based sensor data analysis model for further analysis. Therefore, this approach can lead to a significant reduction in traffic load. In this paper, we assume computational complexity as either high or low. High complexity is attributed to machine learning algorithms that require GPU involvement for computation, whereas low complexity is associated with statistical algorithms that do not require GPU support. Based on prior knowledge of ICS, it is understood that most edge areas in industrial sensor networks are low-computing environments, lacking GPU support, and therefore unable to perform deep learning operations. Therefore, the sensor data detection model is expected to be of low complexity. The Gaussian and Bayesian algorithms [35] based on statistical methods have low complexity and are widely used in anomaly detection applications. Thus, we select Gaussian and Bayesian algorithms as the sensor data anomaly detection model.
To address limitation (2), we introduce a sensor data analysis model called GCRL. This model combines the functionalities of GCN, Residual networks, and LSTM to analyze spatial and temporal features simultaneously. GCN is popularized and widely used for extracting spatial features [36]. LSTM, with its well-established track record of effectively capturing temporal dependencies and patterns in sequential sensor data, has been widely adopted and extensively studied in the field of anomaly detection [37]. Therefore, we consider combining the two algorithms to mine temporal and spatial features of sensor data. The input of GCRL consists of a sensor data matrix and an adjacency matrix indicating the correlations between any two sensors. To construct the adjacency matrix, we use Spearman correlation analysis [38] to calculate the degrees of correlation between sensors. Moreover, to avoid the gradient disappearance or explosion during the training process, a Residual algorithm [39] is adopted in GCRL. The Residual algorithm is a technique aimed at mitigating the issue of gradient vanishing or steep gradient descent during the training of deep learning models. In essence, it involves conceptualizing a set of consecutive neural network layers as a module, where the input to the next module is equal to the sum of the current module’s output and its original input. Finally, we can use the sensor data analysis model to further analyze the sensor data uploaded by the detection model at the edge and then accurately identify anomalies.
To evaluate the proposed detection approach, two public industrial sensor network datasets are used for evaluation. One is Water Distribution (WADI) dataset [19] and another is the Condition Monitoring of Hydraulic Systems (CMHS) dataset [40]. The two datasets are popular and widely used in industrial sensor data anomaly detection evaluation. The numerical results show that our approach performs better than other state-of-the-art anomaly detection approaches in detection performance. Compared with other anomaly detection approaches, our approach achieves an overall improvement of 4.35% for the F1-score. The main technical contributions made in this work are summarized as follows:
- (1) A cloud-edge collaborative anomaly detection approach for industrial sensor networks is proposed, filtering the substantial volume of sensor data generated during normal operation of the industrial sensor network.
- (2) A novel machine learning classification model GCRL is proposed to effectively mine the spatial and temporal features of industrial sensor data, which can improve detection accuracy.
- (3) The proposed approach is extensively assessed through experiments based on two benchmark industrial sensor datasets. The results demonstrate that the proposed approach can achieve an impressive 4.35% improvement in F1-score, compared with the existing models.
The rest of the paper is organized as follows: Section 2 discusses the existing anomaly detection solutions in industrial sensor networks. The proposed cloud-edge collaborative anomaly detection approach for industrial sensor networks is presented in Section 3. The experimental results of the proposed approach are compared with existing baseline approaches in Section 4. The conclusions and future research directions are given in Section 5.
2. Related work
In the domain of statistical-based industrial sensor network anomaly detection, extensive research has been undertaken. The author in [17] proposed a mixture of probabilistic PCA models for fault detection, which can separate the input space into some local regions and deploy the linear sensor anomaly diagnosis model in each region. A novel sparse PCA model can complete the task of localizing anomalies by analyzing a sparse low-dimensional space of anomalous data [41]. The author in [42] proposed a distributed anomaly detection technique based on the Seasonal Autoregressive Integrated Moving Average (SARIMA), considering the limited computing resources. However, it excels primarily in analyzing data exhibiting pronounced seasonality and periodicity. In [18], a data-driven detection approach was developed based on hidden Markov models for industrial sensor networks. Industrial sensor data contains complex features and correlations. However, statistical algorithms cannot effectively analyze these features and correlations due to their simple computing structure, leading to poor detection performance.
To improve detection accuracy, anomaly detection models based on machine learning algorithms are popular. In [43], the author introduced a framework for anomaly identification comprising a Convolutional Neural Network (CNN) and a two-stage LSTM-based Autoencoder. In [44], the author proposed a machine-learning approach to model normal sensor data to detect anomalous behaviors. In [45], the author proposed a sensor data anomaly classification algorithm leveraging Convolutional Neural Networks (CNN) with a Partially Observable Markov Decision Process (POMDP) model. The work in [29] proposed an LSTM-Gauss-NBayes approach that combined the LSTM with the Gaussian Bayes model for anomaly detection in the sensor network. However, this approach has a high false-positive rate in the presence of non-Gaussian distribution data. The author proposed a low-complexity model to detect anomalous sensor data in an industrial sensor network [46]. In [47], the author proposed an Extreme Learning Machine and Mutual Information (ELM-MI) based anomaly detection model. This model utilizes sensor data to compute temporal correlations by using the AutoCorrelation Function (ACF). Additionally, the Genetic Algorithm (GA) is used to solve multi-objective optimization problems. In [48], the authors developed a deep autoencoding Gaussian model to identify anomalous data. This model consists of a Gaussian mixture model and deep autoencoders. It should be noted that these anomaly detection models using machine learning algorithms can be computationally complex for the field sensors with limited computational capability and hence can be hardly deployed in the field sensors in practice. Therefore, most of the existing anomaly detection models using machine learning algorithms adopt centralized detection deployed in the control center, leading to a heavy traffic load in industrial sensor networks.
The industrial sensor data contains complex temporal and spatial features. Therefore, efficiently extracting these features is key to improving the accuracy and efficiency of anomaly detection. In the realm of detecting temporal features, many scholars have conducted extensive research. Many researchers have proposed methods to analyze the features. In [19], the normal sensor data was adopted to train a Generative Adversarial Network (GAN), and the discriminator based on the LSTM-Recurrent Neural Network (RNN) was used to extract temporal features and compute anomaly scores. However, the detection accuracy of this model is low in a benchmark dataset. The author used a GAN based on RNN to identify anomalous data [49]. The author [50] developed an anomaly detection framework called Multi-Time Scale Deep Convolutional Generative Adversarial Network (MTS-DCGAN). The goal of this framework is to efficiently extract temporal features of time-series data in Industrial Control Systems (ICS). In [51], the author developed an anomaly detection framework based on federated learning, combining CNN with LSTM. The model in [20] based on multi-head CNN-LSTM was used for detecting anomalies in sensor data. The anomaly detection algorithm in [52] also consists of a multi-head mechanism, CNN, and LSTM. In [53], the authors proposed a simple yet efficient method, i.e., Multiresolution Self-supervised Discriminative Network (MS2D-Net), for anomalous sensor data detection. Many scholars have proposed cloud-edge-based anomaly detection methods [54]. The model [55] employs a cloud-edge computing architecture and uses a lightweight LSTM to extract features from sensor data for anomaly detection tasks. However, the LSTM does not adequately capture the spatial features of sensor data, particularly neglecting spatial characteristics. The authors [56] reduced the dimensionality of the sensor data and identified the anomalous data based on autoencoders. However, these models only focus on the temporal features without considering the spatial features of industrial sensor data.
In the extraction of spatial features in the industrial sensor network space, some scholars have conducted relevant research based on Graph Neural Network (GNN). The Graph Deviation Network (GDN) [30] was used to predict future data by graph attention-based forecasting and the absolute error was computed for evaluating the graph deviation score. In [57], a framework for sensor data anomaly detection was proposed, consisting of automatically learning a graph structure, graph convolutional, and transformer. However, these two models cannot efficiently extract temporal features as their calculations are conducted in the spatial domain, resulting in poor detection performance.
Some scholars have conducted relevant research on how to simultaneously extract spatial and temporal features. In [58], this paper presented a methodology for anomaly detection and classification of spatially correlated sensor data. The proposed approach integrated three techniques, namely Detrended Cross-Correlation Analysis (DCCA), Autoregressive Integrated Moving Average (ARIMA), and Forgetting Factor Iterative Data Capture Anomaly Detection (FFIDCAD). However, the utilized DCCA only considers the pairwise correlations between sensors and fails to fully capture the spatial features of the entire sensor network. Additionally, the implementation of ARIMA, being a statistical algorithm, may not effectively extract temporal features from long-term sequences. In [59], the author combined 1D-CNN and GCN to analyze the temporal and spatial features. The 1D-CNN utilizes convolutional operations primarily for extracting local features, where each convolutional kernel is crafted to focus on a specific subset of the data. However, this attribute might hinder its effectiveness in capturing long-range dependencies and global structural nuances. Therefore, existing studies have not fully succeeded in simultaneously extracting the spatial and temporal features of industrial sensor data.
3. Framework of proposed cloud-edge collaborative data anomaly detection approach
This paper proposes a cloud-edge collaborative data anomaly detection approach for industrial sensor networks. For this cloud-edge collaborative detection framework, we initially addressed three issues:
- i) How does the edge determine the abnormal state of the industrial sensor network?
Considering the limited computational resources at the edges, the sensor data detection model is expected to be of low complexity. The Gaussian and Bayesian algorithms [35] based on statistical methods have low complexity and are widely used in anomaly detection applications. Thus, we select Gaussian and Bayesian algorithms as the sensor data anomaly detection model. In general, the distribution of anomalous sensor data differs from that of normal sensor data. Therefore, one approach is to separately compute the distributions of normal sensor data and anomalous sensor data. Then, by comparing how closely the sensor data to be detected aligns with these two distributions, we can determine whether the current sensor data is normal or anomalous. However, the sensor data detection model deployed at the edge can only identify if a particular sensor’s value is abnormal. Our objective is to detect whether the entire industrial sensor network is in an abnormal state. It is unreasonable to identify the abnormal state of industrial sensor networks only by detecting the value of one sensor, which will lead to a high False Positive Rate (FPR). In practice, most sensors will be anomalous when the industrial sensor network is in an abnormal state. Therefore, we consider that the industrial sensor network is anomalous when the number of abnormal sensors exceeds the pre-defined sensitivity coefficient e, i.e., a threshold for the tolerable maximum number of anomalous sensors.
- ii) If the edge can identify abnormal data, why is it necessary to upload the sensor data to the cloud for further analysis when the industrial sensor network is in an abnormal state?
Industrial sensor networks typically have the following data structure: with a sampling period of a, at each sampling time point, values from each sensor can be collected at the current moment. It’s worth noting that currently available industrial sensor network datasets usually only label the operational state of the network as normal or anomalous. For example, if a physical attack occurs in the industrial sensor network from 9:10–9:20, then every sampling time point during these 10 minutes is marked as an anomalous state. The Gaussian and Bayesian algorithms require the prior calculation of probability density functions for normal and anomalous data for each sensor to perform the classification task. To compute the probability density function for anomalous data of a particular sensor, we need to utilize that sensor’s anomalous data. However, in these publicly available industrial sensor network datasets, we only know whether the industrial sensor network is in an anomalous state at a given time but do not have information about specific sensors that are anomalous. Therefore, to calculate the probability density functions for anomalous data of each sensor, we can only consider every sensor’s data in the training set as anomalous when the industrial sensor network is in an anomalous state. In practice, when an industrial sensor network experiences an anomaly, the majority of sensor data will exhibit anomalies, while a minority of sensor data remains unaffected and remains normal. Therefore, if we consider every sensor’s data in the training set as anomalous when the industrial sensor network is in an anomalous state, some of these presumed anomalous sensor data may actually be normal. Consequently, the probability density functions calculated for the sensor’s anomalous data may not be as accurate, potentially causing the sensor data detection model to misclassify some normal sensor data as anomalous. This, in turn, results in misclassifying an industrial sensor network in a normal state as anomalous, leading to a high FPR. Additionally, due to the low computational complexity of the sensor data detection model, the sensor data detection model can only perform binary classification and cannot identify the specific type of anomaly. Therefore, when the industrial sensor network is marked as anomalous, it is necessary to upload all sensor data from the current time to the cloud-based sensor data analysis model for further analysis. This process helps improve detection accuracy, and identify the specific type of anomaly.
- iii) Why is it essential to upload all sensor data from the industrial sensor network during an abnormal state rather than just the data from the sensors flagged as abnormal?
In general, when an industrial sensor network is in an anomalous state, the majority of sensors within the network tend to be in an anomalous state, while a minority of sensors remain in a normal state. If we upload only the data from the sensors marked as anomalous by the sensor data detection model when the industrial sensor network is in an anomalous state, the structure of the sensor data uploaded to the cloud would not be consistent. For instance, there may be 60 anomalous sensors at one moment and 80 at another. Therefore, the sensor data received by the cloud may consist of 60 data points at one time and 80 data points at another time, resulting in inconsistent data structures. However, a non-uniform data structure hinders our ability to extract complex features, posing significant challenges for subsequent analysis. To address this issue, we have to upload the values of all sensors at the current moment. This ensures a consistent data structure at the cloud at all times, enabling us to comprehensively extract features for further analysis.
The proposed cloud-edge collaborative data anomaly detection approach consists of three models: a data preprocess model, a sensor data detection model, and a sensor data analysis model in the cloud. These models are illustrated in Fig 2, where the blue arrows represent the transmission of sensor data between different physical devices and various detection models; the red arrows denote trigger signals; the purple arrows represent the transmission of the number of anomalous sensors from the edge areas to the cloud; and the green arrows indicate the flow of sensor data processing within the detection models. The data preprocess model and sensor data detection model are deployed at individual edges, covering a collection of sensors. The data preprocess model can aggregate sensor data and transform non-Gaussian distributed data into Gaussian distributed data. The sensor data detection model can identify the sensors with anomalous data. The count of anomalous sensors in each edge is subsequently transmitted to the cloud, where the total number of anomalous sensors can be calculated by aggregating the information received from all edges. If the total number of anomalous sensors exceeds the predefined sensitivity coefficient e (i.e., a threshold for the tolerable maximum number of anomalous sensors), the industrial sensor network is considered to be in an anomalous operational state. At this point, the data uploading mechanism in each edge will be triggered, prompting the transmission of sensor data from all edges to the cloud for further analysis. Thus, massive sensor data during normal operation of the industrial sensor network can be filtered, which makes the traffic load smaller. In the cloud, the sensor data analysis model based on GCRL comprises Spearman correlation analysis, GCN, LSTM, and a Residual algorithm. The core goal of GCRL is to efficiently extract and analyze the spatial and temporal features of industrial sensor data, ensuring accurate anomaly detection.
Without the data preprocessing model, the sensor data detection model would not function effectively. Sole reliance on the sensor data detection model for anomaly detection tasks could lead to low detection accuracy and challenges in identifying anomaly types. Furthermore, if the anomaly detection tasks are solely conducted by the sensor data analysis model deployed in the cloud, it would require continuous uploading of all sensor data generated by the industrial sensor network. Although this approach effectively extracts temporal-spatial features of the sensor data and yields higher detection accuracy, it significantly increases the traffic load, disrupts the normal communication behaviors of the ICS, and impedes their stable operation. Consequently, based on this analysis, the approach proposed in this paper is an integrated approach composed of three interdependent models. These models cannot function independently for anomaly detection; their roles are complementary and essential to one another.
In summary, the cloud-edge collaborative data anomaly detection approach we propose addresses two challenges present in current industrial sensor network anomaly detection. The first challenge is that existing detection frameworks primarily employ centralized detection, which can lead to increased traffic load. The second challenge is that current methods struggle to simultaneously extract the temporal and spatial features of industrial sensor networks comprehensively. The details of our proposed approach are presented in the following sections.
3.1. Data preprocess model
The data preprocess model is designed to aggregate sensor data and convert non-Gaussian distributed data into Gaussian distributed data. In practice, the data sampling rate is generally high, e.g., in seconds or milliseconds. Thus, the data aggregation algorithm needs to be adopted for data analysis in multiple time scales. Moreover, The aggregated sensor data may not follow the Gaussian distribution, so the Box-Cox transformation [48] is used to transform the non-Gaussian distribution data into a Gaussian distribution. The details of the data preprocess model are given in equations (1)-(2).
where represents the collected original data of sensor i in the r-th edge at time t; a is the sampling period and B is the scale of aggregation;
is the aggregated data of sensor i in the r-th edge at time t.
is used to maximize the log-likelihood function of sensor i in the r-th edge.
is the preprocessed data of sensor i in the r-th edge at time t.
3.2. Sensor data detection model at the edge
To mitigate traffic load, we have embraced a cloud-edge collaborative detection architecture that employs a sensor data detection model deployed at the edge to pre-filter substantial amounts of sensor data during the regular operation of the industrial sensor network. However, given the limited computing resources at the edge, the computational structure of the sensor data detection model cannot be overly complex. Consequently, we opted for a straightforward statistical algorithm for this model, namely the Gaussian and Bayesian algorithms. The detailed computational process of the sensor data detection model is outlined as follows:
First, we need to calculate the Probability Density Function (PDF) of normal and abnormal data. The dataset is split into two sets: a training set and a testing set, in which the label value 0 represents abnormal and 1 represents normal. We use the training set to calculate the PDF of normal and abnormal data. The details are shown in equations (3)-(4).
where and
represents the normal and abnormal PDF of sensor i in the r-th edge, respectively;
and
are the means and variance of the i-th sensor in the r-th edge when its label is 1. Otherwise, it is
and
, respectively.
The parameters of normal and abnormal PDFs can be obtained through the maximum likelihood estimate. The parameters of PDF are shown in equation (4):
where and
represents the means in normal and abnormal training sets of i-th sensor data in the r-th edge, respectively.
and
represents the preprocessed j-th abnormal and normal data of sensor i in the r-th edge, respectively.
and
are the number of preprocessed abnormal and normal training data of sensor i in the r-th edge, respectively.
Then, the normal and abnormal probability of the sensor data can be calculated based on equation (5).
As , i.e., the denominator is the same when
and
are compared. Thus, equation (5) can be further simplified as equation (6).
where and
are constants, which represent the proportion of normal and abnormal samples in the training set, respectively.
The algorithm of the sensor data detection model is given in Algorithm 1.
Algorithm 1: Sensor data detection model at the edge
Input: the training set, real-time original sensor data ,
,
Output: all sensor data at the current moment
1: Aggregate the original sensor data using the training set
2: Transform the non-Gaussian distribution into Gaussian distribution using the equation (2)
3: Obtain the normal and abnormal PDFs for each sensor using the equation (4)
4: Collect sensor data
5: s = 0
6:
7: Transform the data into Gaussian distribution
8: Compute the and
9: Compare the and
10: If
11: | s = s + 1
12: If s > e
13: | upload all the sensor data at the current moment to the cloud
14: Jump step 4
where k is the number of edge areas in the industrial sensor network; mr is the number of sensors in the r-th edge; s is the number of anomalous sensors in the whole sensor network.
Not all sensor data can be transformed into Gaussian distribution using the Box-Cox algorithm. Therefore, the and
may be equal when the transformed data do not follow Gaussian distribution. The sensor data can’t be identified as normal or abnormal in this case, so we consider the sensor data as potentially abnormal and in need of further analysis. Therefore, we set this condition
. It is unreasonable to identify the abnormal state of industrial sensor networks only by detecting the value of one sensor, which will lead to a high FPR. In practice, most sensors will be anomalous when the industrial sensor network is in an abnormal state. Therefore, we consider that the industrial sensor network is anomalous when the number of abnormal sensors exceeds the pre-defined sensitivity coefficient e.
3.3. GCRL-based sensor data analysis model in the cloud
Given the inherent correlation among data from various sensors in coupled industrial processes, the sensor data inherently manifests both spatial and temporal features. To comprehensively extract these dual features, we introduce GCRL, a framework comprising GCN, LSTM, and a Residual algorithm. GCN is used to extract the spatial features of sensor data. The input of GCN consists of a sensor data matrix and an adjacency matrix indicating the correlations between any two sensors. In fact, any two sensors have different degrees of correlation in the industrial sensor network. To construct the adjacency matrix, we use the Spearman correlation analysis algorithm to calculate different degrees of correlation between any two sensors. Moreover, to extract temporal features, we insert LSTM into GCN to process sensor data. In addition, a Residual algorithm is also adopted in the proposed GCRL to avoid gradient disappearance or explosion. The implementation of the proposed GCRL in the cloud is described as follows:
Firstly, the adjacency matrix, representing the correlations between any two sensors in the sensor network is built using the Spearman correlation analysis algorithm. The details of the calculation are shown in equations (7) and (8).
where is the correlation coefficient between sensor z and sensor c.
is the difference between the u-th pair values in the dataset that represents the sorted data of the value of sensor z and sensor c.
is the number of the value of sensor c or sensor z.
is the P-value [60] for the
, which is used to determine whether there exists a correlation between the value of sensor z and sensor c.
The adjacency matrix can be calculated using equation (9).
where z or c represents the sensor number. p is the threshold of the correlation coefficient and is the significance level. A is the adjacency matrix of the industrial sensor network.
Each layer in the GCRL model consists of a graph convolutional part, LSTM, and residual part.
where is the activation function;
represents the adjacency matrix with self-loops;
is the corresponding degree matrix.
is the weight matrix of the graph convolutional part in l + 1-th layer. yG,l+1 is the output of the graph convolutional part in l + 1-th layer. LSTMl+1() represents the LSTM function in the l + 1-th layer. yL,l+1 is the result of the LSTM part in the l + 1-th layer. yL+G,l+1 is the final output of l + 1-th layer. Moreover,
and
is the input of the GCRL that is provided by the sensor data detection model at the edges.
The structure of the proposed sensor data analysis model is shown in Fig 3, where the blue arrows represent the transmission of sensor data between different physical devices and various detection models; and the green arrows indicate the flow of sensor data processing within the detection model.
3.4. The cloud-edge collaborative data anomaly detection approach
The proposed cloud-edge collaborative data anomaly detection approach can identify the abnormal state of an industrial sensor network and upload the values of all sensors at the current moment. Consequently, this approach helps reduce the transmission of massive sensor data during the normal operation of the industrial sensor network to the cloud, thereby reducing the traffic load. The sensor data analysis model based on GCRL deployed in the cloud can sufficiently extract the spatial and temporal features, improving detection performance. The algorithm of the proposed approach is given in Algorithm 2.
Algorithm 2: Cloud-edge collaborative data anomaly detection approach
In: the training set, real-time original sensor data ,
,
Out: classification result
1: Obtain the normal and abnormal PDFs for each sensor using the equations (1)-(4)
2: Calculate the adjacency matrix A using the training set. using equation (7)-(9)
3: Train the GCRL model using the training set.
4: Obtain original sensor data at the edges
5: If the state of the sensor network is identified as normal using the sensor detection model at the edges
6: | Jump Step 4
7: Else
8: | Upload all sensor data at the current moment to the cloud
9: | A and are the input of the GCRL model
10: | for l ← 1 to g do
11: | | Extract the spatial features using equation (10)
12: | | Process the temporal features using equation (11)
13: | | Obtain the output of layer l using the equation (12)
14: | Obtain the classification result
15: Jump Step 4
where g is the number of layers of the sensor data analysis model based on GCRL.
4. Testbed experiment and numerical results
The proposed cloud-edge collaborative data anomaly detection approach is assessed using the Water Distribution (WADI) dataset [19] and Condition Monitoring of Hydraulic Systems (CMHS) dataset [40], compared with a range of existing detection models. At each edge area, a host machine equipped with an Intel(R) Core(TM) i5-9400 CPU and 4GB of memory is deployed for computation, without a GPU graphics card. The CPU operates on Ubuntu 20.04.6 LTS. In the cloud, a deep learning server is configured, with details of its hardware environment provided in Table 1. The sensor data analysis model is trained using a significance level of and ReLu is adopted as the activation function. The specific structure and parameters of the deep learning algorithm can be explored in our publicly available source code.
4.1. Dataset
4.1.1. WADI dataset.
The WADI testbed [19] is a water distribution system consisting of many pipelines and 127 sensors spanning across a large area, forming a storage and distribution network, and realistic and complete water treatment. The WADI dataset consists of 14 types of attack data, e.g., turn on or turn off maliciously some actuators, stop chemical dosing to the raw water, and 14-day continuous data in normal operational conditions. During the normal and attack data collection, all sensor data were collected. The sampling period is 1000ms, representing 127 sensor data that can be collected with the sampling rate of 1000ms. In this work, we define the values collected from all sensors per second as one sample. If a sample is labeled as anomalous, it indicates that at the corresponding time, the industrial sensor network is in an abnormal operational state. The details of 15-classification WADI dataset is shown in Table 2. To evaluate the binary classification performance of the proposed approach, we consider the 14 types of attack data as abnormal data and then obtain a 2-classification WADI dataset, as shown in Table 3. The values in these two tables represent the number of samples collected for each item. For example, 1188 in Table 2 indicates that the number of samples of attack #2 is 1188 for training.
4.1.2. CMHS dataset.
The CMHS dataset [40] was obtained from an industrial hydraulic test rig that is composed of a primary working, a secondary cooling-filtration circuit, and an industrial sensor network. There are 17 sensors in the industrial sensor network (e.g., temperature, pressure, motor power), and each sensor obtains a measurement per second. The 17 sensor data collected per second is defined as one sample. The dataset has two labels: cooler condition and valve condition. The cooler condition has three states: 3 is “close to total failure”, 20 is “reduced efficiency”, and 100 is “full efficiency”. The valve condition has four states: 73 is “close to total failure”, 80 is “severe lag”, 90 is “small lag”, and 100 is “optimal switching behavior”. The details of 6-classfication CMHS dataset is shown in Table 4. To evaluate the binary classification performance of the proposed approach, we consider the condition where a cooler condition value of 3 is treated as abnormal, and a value of 100 is treated as normal. Based on this criterion, we construct a binary-class WADI dataset, as summarized in Table 5. The values in the two tables represent the number of samples.
4.2. Detection performance at the edge
To evaluate the detection performance of the proposed approach at the edge, we use the 2-classification WADI dataset and 2-classification CMHS dataset to conduct this experiment. B is selected as 10, representing that the sensor data is detected every 10 seconds. The evaluation metrics are shown as follows:
where True Negative (TN) represents the number of abnormal samples classified as abnormal correctly. False Positive (FP) is the number of abnormal samples classified as normal incorrectly. False Positive (FN) represents the number of normal samples classified as abnormal incorrectly. FR represents the proportion of abnormal samples that are classified as normal samples. is the times of uploading the anomalous samples to the cloud. RTL represents the number of filtered sensor data by the sensor data detection model, which is corresponding to the traffic load.
is the number of samples in the testing set.
is the number of sensors in the industrial sensor network.
This work also considers the influence of sensitivity coefficient e on the detection results. The numerical results are presented in Tables 6 and 7.
Table 6 shows that as e increases, TN and FN, as well as k_times, will decrease, while FR and RTL will increase. On the one hand, we want to reduce the traffic load and minimize k_times while maximizing RTL. On the other hand, the sensor data detection model cannot identify the specific type of anomaly. Thus, all anomalous samples must be uploaded to the cloud for further analysis, i.e., FR must be zero. Therefore, it seems that e = 27 is the best choice. However, the proposed approach consists of the sensor data detection model and the sensor data analysis model. Different values of e can affect the classification results of the sensor data analysis model in the cloud. Thus, the detection performance of the sensor data analysis model may not be the best when e = 27. We need to consider e with different values for further analysis. It can be observed that RTL when e = 22 is about twice as much as RTL when e = 21, so we consider e = 22 as a turning point. Moreover, e = 27 is the performance turning point as FR is not zero when e is more than 27. Therefore, we conduct the following experiments using e with the different values of 22, 23, 24, 25, 26, and 27.
In Table 7, we can find that all normal samples will be uploaded to the cloud for further analysis when e = 2. When e = 3, most of the normal samples will be filtered and the sensor data detection model can reduce 120020 sensor data delivered to the cloud. Thus, we consider e = 2 is a turning point. In addition, We observed a sharp increase in FR when transitioning from e = 7 to e = 8. Thus, we consider e = 7 is a turning point. In the following experiment, we consider e with different values of 2, 3, 4, 5, 6, and 7 for further analysis.
The results in Tables 6 and 7 demonstrate that if we select an appropriate value for e, the sensor data detection model can accurately identify all anomalous states of the industrial sensor network and upload the data from all sensors at the current anomalous time. Additionally, it can filter out a significant amount of sensor data during the normal operation of the industrial sensor network, reducing the traffic load.
4.3. Detection performance in the cloud
In this section, we conducted four experiments to evaluate the proposed anomaly detection approach in the following four aspects.
Experiment A: What is the influence of different parameter settings on detection results? How to select the parameters with the best detection performance?
Experiment B: Whether the proposed approach can learn spatial and temporal features of industrial sensor data, by inserting LSTM into GCN?
Experiment C: How is the performance of our proposed detection approach compared with other sensor data anomaly detection approaches for binary classification?
Experiment D: How is the performance of our proposed detection approach compared with other detection approaches for the classification of multiple anomalies?
4.3.1. Experiment A: The detection results of different parameter settings.
Experiment A primarily addresses the following questions: What is the influence of different parameter settings on the detection results? How can the parameters that yield the best detection performance be selected?
In Experiment A, the datasets in Tables 3 and 5 are used for evaluation. First, we construct the adjacency matrix A using the training set and the Spearman correlation analysis algorithm. The threshold of correlation coefficient p is 0.45, 0.55 or 0.65. The corresponding sensor data correlation graph is used to visualize adjacency matrix A, as shown in Figs 4 and 5. Each point represents a sensor, and the coordinates are calculated using the T-distributed Stochastic Neighbor Embedding (TSNE) [61]. If one sensor is associated with other sensors, the sensor is a red point, while the blue points are the opposite. The edge represents that there exists a correlation between the two sensors. The edges in Fig 4a are denser than those in Fig 4b and 4c, indicating that the sensor data of WADI is sensitive to the value of p. In Fig 5, we find that the structures of the three correlation graphs are different but the density is similar, demonstrating that the sensor data of CMHS is not sensitive to the value of p. The adjacency matrix A is the input of the proposed approach, and the threshold of correlation coefficient p has an influence on adjacency matrix A. Therefore, the parameter p will affect the final classification results.
Then, the proposed approach is evaluated with different parameter settings. Here, the parameter e is 22, 23, 24, 25, 26, 27 for the WADI dataset, and e is 2, 3, 4, 5, 6, 7 for the CMHS dataset. The parameter p is 0.45, 0.55, or 0.65.
Three commonly used classification metrics, namely Precision, Recall, and F1-score, are employed for performance assessment.
The comparison results using the 2-classification WADI dataset are shown in Fig 6a–6f. F1-score is the parameter that can balance Precision and Recall. Thus, we consider selecting the parameters that maximize the F1-score in this paper. The Precision is always higher than 98% with different parameter settings. However, the Recall fluctuates from 71.24% to 93%. F1-score of 95.96% is the highest when p = 0.65 if e = 26 or e = 27. But the traffic load is smaller when e = 27, compared with e = 26. Therefore, we select p = 0.65 and e = 27 as the final parameters of our proposed detection approach. In the following content, we refer to our proposed cloud-edge collaborative data anomaly detection approach as G-GCRL.
The comparison results using the 2-classification CMHS dataset are shown in Fig 7a–7f. When e = 6 or e = 7, Recall is 100% but F1-score is not the highest. The reason of Recall is 100% is that all normal samples are identified as normal by the sensor data detection model in the edge. However, many abnormal samples are incorrectly classified as normal by this model at the same time, which decreases the F1-score. In each subfigure, we find that F1-score increases with the increase of p, representing that a larger p can make it easier for the extraction of correlations between sensors in CMHS. It is observed that F1-score of 97.17% is the highest when p = 0.65 and e = 5. Thus, p = 0.65 and e = 5 are selected as the parameters of G-GCRL.
4.3.2. Experiment B: Validation of complex feature learning.
Experiment B primarily addresses the following questions: Whether the proposed approach can learn spatial and temporal features of industrial sensor data, by inserting LSTM into GCN?
In Experiment B, the datasets in Tables 3 and 5 are used for evaluation. The proposed approach is compared with GCN and LSTM to validate the complex feature learning. In Fig 8a, the Recall and F1-score of LSTM are 45.32% and 47.81%, respectively. The Recall and F1-score of GCN are 75.73% and 85.62%, respectively. However, the Recall and F1-score of G-GCRL are 92.97% and 95.96% respectively, which are much higher than those of GCN or LSTM. In addition, the Recall and F1-score of GCN and LSTM are much lower than those of GCRL in Fig 8b. This is because LSTM can only extract temporal features and GCN can only extract spatial features. GCRL that is composed of GCN and LSTM can effectively extract the spatial and temporal features of industrial sensor data.
4.3.3. Experiment C: Compared with other sensor data anomaly detection models.
Experiment C primarily addresses the following questions: How is the performance of our proposed detection approach compared with other sensor data anomaly detection approaches for binary classification?
In Experiment C, the datasets in Tables 3 and 5 are used for evaluation. In addition, the performance of the proposed G-GCRL is evaluated against 11 different existing models, including AE [29], MAD-GAN [19], LSTM-VAE [62], DAGMM [48], ECAD [55], AQADF [63], GDN [30], GTA [57], B-GAN [64], TCN-GCN [59] and MDAM [65]. These are commonly used baseline models in the field of anomaly detection for industrial sensor data.
AQADF [63]: The Automatic Quasi-periodic time series Anomaly Detection Framework (AQADF) consists of a clustering algorithm and a hybrid LSTM-CNN model with an attention mechanism, which can identify anomalous data.
ECAD [55]: This article proposed an anomaly detection method for industrial sensor data based on cloud-edge collaboration, utilizing neural networks to extract data features for the identification of anomalies.
GTA [57]: Graph learning with Transformer for Anomaly detection (GTA) involves automatically learning a graph structure and graph convolution, and extracts temporal features using a transformer.
B-GAN [64]: Utilizing a GAN structure, the authors leveraged LSTM to augment the number of abnormal samples, effectively resolving the problem of imbalanced distribution between normal and abnormal samples.
TCN-GCN [59]: The author combined 1D-CNN and GCN to analyze the temporal and spatial features. While originally utilized for prediction in the original paper, we can adapt the model to accomplish classification tasks through adjustments.
MDAM [65]: The authors successfully utilize a Multidimensional Deconvolutional network and Attention Mechanism (MDAM) to effectively identify anomalies.
In summary, the models including AE [19,29], LSTM-VAE [62], DAGMM [48], AQADF [63], B-GAN [64], ECAD [55] and MDAM [65] all employ deep learning algorithms with a focus on extracting temporal features from industrial sensor data. GDN [30] utilizes graph neural networks to focus on extracting spatial features of industrial sensor data. GTA [57] and TCN-GCN [59] are specifically designed to analyze and process both temporal and spatial features of industrial sensor data simultaneously.
The comparison results of different models are shown in Table 8. The best performance is highlighted in bold and the second-best with underlines. Our proposed data anomaly detection approach based on G-GCRL significantly outperforms all other models for the 2-classification WADI dataset with the performance of Precision, F1-score up to 99.16%, and 0.96, respectively. In addition, compared to the second-best model, G-GCRL can achieve an overall 4.35% improvement for F1-score. For the 2-classification CMHS dataset, our proposed approach can achieve an overall 4.3% improvement for F1-score.
The GDN [30] model solely considers spatial features, leading to a relatively low F1-score. GTA [57] incorporates GCN to extract spatial features and utilizes transformers for capturing temporal characteristics. However, due to the relatively small scale of the dataset we employed and the complexity of the GTA model structure, overfitting may occur as a consequence of the model’s complexity and insufficient training samples, resulting in suboptimal evaluation outcomes. B-GAN [64] addresses the imbalance between positive and negative samples without considering the extraction of temporal-spatial features. TCN-GCN [59] employs 1D-CNN for extracting temporal features and GCN for spatial feature extraction. Nevertheless, the use of convolutional operations in 1D-CNN primarily focuses on capturing local features, where each convolutional kernel is designed to emphasize a specific subset of the data. This characteristic might impede its effectiveness in capturing long-range dependencies, global structural nuances, and temporal relationships intrinsic to sequences. The model [55] employs a cloud-edge computing architecture and uses a lightweight LSTM to extract features from sensor data for anomaly detection tasks. However, the LSTM does not adequately capture the temporal-spatial features of sensor data, particularly neglecting spatial characteristics, resulting in a detection performance that is inferior to the method we propose. The author in [65] employs a MDAM to effectively identify anomalies. However, its computational structure faces challenges in fully extracting spatial features from sensors.
For evaluating the computational complexity and detection latency of the models, this paper adopts two key metrics: FLoating-point OPerations (FLOPs) and the Maximum detection time per sample (Mtime). FLOPs measure the theoretical computational cost of the model during the inference phase and reflect the complexity of the model’s computational structure. Mtime represents the maximum time required to process a single sample in practical deployment, encompassing data preprocessing, edge-side model detection, data transmission to the cloud, and final analysis by the cloud-side model. As shown in Table 8, AE [29] exhibits the lowest FLOPs and the shortest Mtime on both the WADI and CMHS datasets, followed by DAGMM [48]. These differences primarily stem from variations in model architecture: AE [29] employs a relatively simple autoencoder for anomaly detection, resulting in lower computational cost and shorter inference time. DAGMM [48] incorporates a Gaussian Mixture Model at the output of the autoencoder. Although this increases the architectural complexity slightly, the overall computational burden remains relatively low. However, despite their advantages in computational cost and inference speed, AE [29] and DAGMM [48] underperform in terms of detection performance, as evidenced by their significantly lower F1-scores. This indicates that low computational complexity alone is insufficient to guarantee the practical utility of anomaly detection models in real-world applications. In practice, while FLOPs serve as a useful reference for evaluating model efficiency, greater emphasis is placed on the model’s responsiveness in actual operating environments—specifically, whether it can meet real-time detection requirements. Generally, if the Mtime is shorter than the data sampling interval, the model is considered capable of real-time detection [66]. According to the experimental results, although the proposed G-GCRL incurs relatively higher FLOPs and slightly greater computational complexity, its Mtime consistently remains below 200 ms, which is well within the 1000 ms sampling interval. Therefore, G-GCRL fully satisfies the real-time detection requirements for practical deployment and demonstrates strong performance in both detection accuracy and system responsiveness.
4.3.4. Experiment D: Validation of multi-classification ability.
Experiment D primarily addresses the following questions: How is the performance of our proposed detection approach compared with other detection approaches for the classification of multiple anomalies?
In this experiment, the proposed G-GCRL for the classification of multiple anomalies is assessed against AQADF and GCN based on the dataset in Tables 3 and 5, respectively. The performance metric of detection accuracy for different types of attacks is defined in (16).
where is the number of correctly classified data in attack j dataset.
is the number of data in attack j dataset. The comparison results are shown in Fig 9.
In Fig 9a, #0 represents normal and #1 to #14 represent attack #1 to attack #14, respectively. In Fig 9b, @5 represents normal and @0 to @4 represents attack @0 to attack@4, respectively. The detection accuracy of each attack of the G-GCRL model is always higher than that of the other models in Fig 9a and 9b. This is because the AQADF and GCN model cannot sufficiently extract the spatial and temporal features of industrial sensor data. If the classifier is the G-GCRL model, the accuracy of attack #4, #9, #10, #11, #13, and #14 is almost 100%, respectively. However, the accuracy of attack #1, #2 and #6 are lower than 0.6 in these three models. After our analysis, we find that these three types of attacks are easily classified as normal. This suggests that the impact of these three attacks on industrial sensor data is relatively small, making them difficult to distinguish. Experiment D validates that our proposed approach outperforms other detection models in multi-class detection performance.
5. Conclusions and future work
This paper proposes a cloud-edge collaborative data anomaly detection approach for industrial sensor networks. The approach consists of a data preprocess model, a sensor data detection model deployed at individual edges, and a sensor data analysis model deployed in the cloud. The data preprocess model can aggregate collected sensor data and transform it into a Gaussian distribution. The sensor data detection model efficiently filters the substantial volume of sensor data generated during normal operation of the industrial sensor network, and only uploads all the sensor data when the network is in an anomalous state to the cloud. Therefore, this approach can lead to a significant reduction in traffic load. The sensor data analysis model can efficiently extract temporal and spatial features and identify anomalies precisely. Two benchmark datasets WADI and CMHS are used for evaluation compared with other baseline models. The numerical results show that our proposed approach can achieve an impressive 4.35% improvement in F1-score, compared with the existing baseline models.
In future work, we will explore adaptive and automated parameter update methods to enhance the model’s adaptability to dynamic environments. Furthermore, we will investigate sensor data detection models with higher detection accuracy that can be deployed at the edge to filter out a large volume of normal samples. Moreover, we will explore methodologies to quantify both energy consumption and storage complexity of the algorithms, enabling a more comprehensive evaluation of the performance of our detection models.
References
- 1. Rodriguez PM, Lizeaga A, Mendicute M, Val I. Spectrum handoff strategy for cognitive radio-based MAC for real-time industrial wireless sensor and actuator networks. Computer Networks. 2019;152:186–98.
- 2. Zhang X, Ming X. Implementation path and reference framework for Industrial Internet Platform (IIP) in product service system using industrial practice investigation method. Advanced Engineering Informatics. 2022;51:101481.
- 3. Chen Z, Di W, Chen R, Deng T, Wang Y, You H, et al. Modeling and experimental investigation of magnetic anomaly detection using advanced triaxial magnetoelectric sensors. Sensors and Actuators A: Physical. 2022;346:113806.
- 4. Jeong S, Ferguson M, Hou R, Lynch JP, Sohn H, Law KH. Sensor data reconstruction using bidirectional recurrent neural network with application to bridge monitoring. Advanced Engineering Informatics. 2019;42:100991.
- 5. Alwan AA, Brimicombe AJ, Ciupala MA, Ghorashi SA, Baravalle A, Falcarin P. Time-series clustering for sensor fault detection in large-scale Cyber–Physical Systems. Computer Networks. 2022;218:109384.
- 6. Edwards C, Press I. An analysis of a cyberattack on a nuclear plant: The stuxnet worm. Crit Infrastruct Protect. 2014;116:59.
- 7.
Vaz R. Venezuela’s power grid disabled by cyber attack. Green Left Weekly; 2019. 15.
- 8. Titouna C, Naït-Abdesselam F, Khokhar A. DODS: A Distributed Outlier Detection Scheme for Wireless Sensor Networks. Computer Networks. 2019;161:93–101.
- 9. Truong HT, Ta BP, Le QA, Nguyen DM, Le CT, Nguyen HX, et al. Light-weight federated learning-based anomaly detection for time-series data in industrial control systems. Computers in Industry. 2022;140:103692.
- 10. Catalano C, Paiano L, Calabrese F, Cataldo M, Mancarella L, Tommasi F. Anomaly detection in smart agriculture systems. Computers in Industry. 2022;143:103750.
- 11. Fang W, Shao Y, Love PED, Hartmann T, Liu W. Detecting anomalies and de-noising monitoring data from sensors: A smart data approach. Advanced Engineering Informatics. 2023;55:101870.
- 12. Kea K, Han Y, Kim T-K. Enhancing anomaly detection in distributed power systems using autoencoder-based federated learning. PLoS One. 2023;18(8):e0290337. pmid:37594957
- 13. Zhang Z, Chen Y, Wang H, Fu Q, Chen J, Lu Y. Anomaly detection method for building energy consumption in multivariate time series based on graph attention mechanism. PLoS One. 2023;18(6):e0286770. pmid:37289704
- 14. Sun X, Yan B, Zhang X, Rong C. An Integrated Intrusion Detection Model of Cluster-Based Wireless Sensor Network. PLoS One. 2015;10(10):e0139513. pmid:26447696
- 15. Iqbal A, Amin R, Alsubaei FS, Alzahrani A. Anomaly detection in multivariate time series data using deep ensemble models. PLoS One. 2024;19(6):e0303890. pmid:38843255
- 16. Ghalem SK, Kechar B, Bounceur A, Euler R. A probabilistic multivariate copula-based technique for faulty node diagnosis in wireless sensor networks. Journal of Network and Computer Applications. 2019;127:9–25.
- 17. Sharifi R, Langari R. Nonlinear sensor fault diagnosis using mixture of probabilistic PCA models. Mech Syst Signal Process. 2017;85:638–50.
- 18. Huang D, Shi X, Zhang W-A. False Data Injection Attack Detection for Industrial Control Systems Based on Both Time- and Frequency-Domain Analysis of Sensor Data. IEEE Internet Things J. 2020;8(1):585–95.
- 19.
Li D, Chen D, Jin B, Shi L, Goh J, Ng S-K. MAD-GAN: multivariate anomaly detection for time series data with generative adversarial networks. In: International conference on artificial neural networks. Springer; 2019.
- 20. Canizo M, Triguero I, Conde A, Onieva E. Multi-head CNN–RNN for multi-time series anomaly detection: An industrial case study. Neurocomputing. 2019;363:246–60.
- 21. Ullah S, Pirahandeh M, Kim D-H. Self-attention deep ConvLSTM with sparse-learned channel dependencies for wearable sensor-based human activity recognition. Neurocomputing. 2024;571:127157.
- 22. Ullah S, Kim D. Lightweight driver behavior identification model with sparse learning on in-vehicle CAN-BUS sensor data. Sensors. 2020;20(18).
- 23. Flammini A, Ferrari P, Marioli D, Sisinni E, Taroni A. Wired and wireless sensor networks for industrial applications. Microelectronics Journal. 2009;40(9):1322–36.
- 24. Kumar S AA, Ovsthus K, Kristensen. LM. An Industrial Perspective on Wireless Sensor Networks — A Survey of Requirements, Protocols, and Challenges. IEEE Commun Surv Tutorials. 2014;16(3):1391–412.
- 25. Han G, Liu L, Jiang J, Shu L, Hancke G. Analysis of Energy-Efficient Connected Target Coverage Algorithms for Industrial Wireless Sensor Networks. IEEE Trans Ind Inf. 2017;13(1):135–43.
- 26. Chen S, Zhang S, Zheng X, Ruan X. Layered adaptive compression design for efficient data collection in industrial wireless sensor networks. Journal of Network and Computer Applications. 2019;129:37–45.
- 27. Liu Y, Yang D, Wang Y, Liu J, Liu J, Boukerche A, et al. Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models. ACM Comput Surv. 2024;56(7):1–38.
- 28. Marjani M, Nasaruddin F, Gani A, Karim A, Hashem IAT, Siddiqa A, et al. Big IoT Data Analytics: Architecture, Opportunities, and Open Research Challenges. IEEE Access. 2017;5:5247–61.
- 29. Wu D, Jiang Z, Xie X, Wei X, Yu W, Li R. LSTM Learning With Bayesian and Gaussian Processing for Anomaly Detection in Industrial IoT. IEEE Trans Ind Inf. 2019;16(8):5244–53.
- 30.
Deng A, Hooi B. Graph neural network-based anomaly detection in multivariate time series. In: Proc AAAI Conf Artif Intell. 2021.
- 31. Ullah S, Kim D-H. Federated Learning Using Sparse-Adaptive Model Selection for Embedded Edge Computing. IEEE Access. 2021;9:167868–79.
- 32.
Ullah S, Kim D. Federated learning convergence on IID features via optimized local model parameters. In: 2022 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE; 2022.
- 33. Kim J, Ullah S, Kim D-H. GPU-based embedded edge server configuration and offloading for a neural network service. J Supercomput. 2021;77(8):8593–621.
- 34. Yang T, Hao W, Yang Q, Wang W. Cloud-edge coordinated traffic anomaly detection for industrial cyber-physical systems. Expert Systems with Applications. 2023;230:120668.
- 35.
Jensen F. An introduction to Bayesian networks. UCL Press: London; 1996.
- 36.
Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. 2016.
- 37. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. pmid:9377276
- 38.
Corder GW, Foreman DI. Nonparametric statistics: a step-by-step approach. John Wiley & Sons; 2014.
- 39.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conf Comput Vis Pattern Recognit. 2016.
- 40.
Helwig N, Pignanelli E, Schütze A. Condition monitoring of a complex hydraulic system using multivariate statistics. In: 2015 IEEE International Instrumentation and Measurement Technology Conference (I2MTC) Proceedings. 2015.
- 41. Jiang R, Fei H, Huan J. A Family of Joint Sparse PCA Algorithms for Anomaly Localization in Network Data Streams. IEEE Trans Knowl Data Eng. 2012;25(11):2421–33.
- 42. Hao W, Yao P, Yang T, Yang Q. Industrial cyber-physical system defense resource allocation using distributed anomaly detection. IEEE Internet Things J. 2021.
- 43. Nizam H, Zafar S, Lv Z, Wang F, Hu X. Real-Time Deep Anomaly Detection Framework for Multivariate Time-Series Data in Industrial IoT. IEEE Sensors J. 2022;22(23):22836–49.
- 44. Givnan S, Chalmers C, Fergus P, Ortega-Martorell S, Whalley T. Anomaly Detection Using Autoencoder Reconstruction upon Industrial Motors. Sensors (Basel). 2022;22(9):3166. pmid:35590855
- 45. Watts J, Van Wyk F, Rezaei S, Wang Y, Masoud N, Khojandi A. A Dynamic Deep Reinforcement Learning-Bayesian Framework for Anomaly Detection. IEEE Trans Intell Transport Syst. 2022;23(12):22884–94.
- 46. Saci A, Al-Dweik A, Shami A. Autocorrelation Integrated Gaussian Based Anomaly Detection using Sensory Data in Industrial Manufacturing. IEEE Sensors J. 2021;21(7):9231–41.
- 47. Peng X, Li H, Yuan F, Razul SG, Chen Z, Lin Z. An extreme learning machine for unsupervised online anomaly detection in multivariate time series. Neurocomputing. 2022;501:596–608.
- 48.
Zong B, Song Q, Min MR, Cheng W, Lumezanu C, Cho D, et al., editors. Deep autoencoding gaussian mixture model for unsupervised anomaly detection. International conference on learning representations; 2018.
- 49. Xia X, Pan X, Li N, He X, Ma L, Zhang X, et al. GAN-based anomaly detection: A review. Neurocomputing. 2022;493:497–535.
- 50. Liang H, Song L, Wang J, Guo L, Li X, Liang J. Robust unsupervised anomaly detection via multi-time scale DCGANs with forgetting mechanism for industrial multivariate time series. Neurocomputing. 2021;423:444–62.
- 51. Liu Y, Garg S, Nie J, Zhang Y, Xiong Z, Kang J, et al. Deep Anomaly Detection for Time-Series Data in Industrial IoT: A Communication-Efficient On-Device Federated Learning Approach. IEEE Internet Things J. 2020;8(8):6348–58.
- 52. Liu J, Song X, Zhou Y, Peng X, Zhang Y, Liu P, et al. Deep anomaly detection in packet payload. Neurocomputing. 2022;485:205–18.
- 53. Huang D, Shen L, Yu Z, Zheng Z, Huang M, Ma Q. Efficient time series anomaly detection by multiresolution self-supervised discriminative network. Neurocomputing. 2022;491:261–72.
- 54.
Liu J, Liu Y, Lin J, Li J, Sun P, Hu B. Networking systems for video anomaly detection: A tutorial and survey. 2024.
- 55. Li J, Deng Y, Sun W, Li W, Li R, Li Q, et al. Resource Orchestration of Cloud-Edge–based Smart Grid Fault Detection. ACM Trans Sen Netw. 2022;18(3):1–26.
- 56.
Sakurada M, Yairi T. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In: Proceedings of the MLSDA 2014 2nd workshop on machine learning for sensory data analysis. 2014.
- 57. Chen Z, Chen D, Zhang X, Yuan Z, Cheng X. Learning graph structures with transformer for multivariate time series anomaly detection in iot. IEEE Internet Things J. 2021.
- 58.
Rollo F, Bachechi C, Po L. Semi real-time data cleaning of spatially correlated data in traffic sensor networks. In: Proceedings of the 18th International Conference on Web Information Systems and Technologies. 2022.
- 59. Zhang Y, An H, Xing Y, Liu Y, Zhang T. Learning Temporal and Spatial Features Jointly: A Unified Framework for Space-Time Data Prediction in Industrial IoT Networks. IEEE Sensors J. 2023;23(16):18752–64.
- 60.
Fisher RA. Statistical methods for research workers. Springer; 1992. 66–70.
- 61. Vandermaaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008;9(11).
- 62. Park D, Hoshi Y, Kemp CC. A Multimodal Anomaly Detector for Robot-Assisted Feeding Using an LSTM-Based Variational Autoencoder. IEEE Robot Autom Lett. 2018;3(3):1544–51.
- 63. Liu F, Zhou X, Cao J, Wang Z, Wang T, Wang H. Anomaly detection in quasi-periodic time series based on automatic data segmentation and attentional LSTM-CNN. IEEE Trans Knowl Data Eng. 2020.
- 64. Yuan L, Yu S, Yang Z, Duan M, Li K. A data balancing approach based on generative adversarial network. Future Generation Computer Systems. 2023;141:768–76.
- 65. Li Z, Duan M, Xiao B, Yang S. A Novel Anomaly Detection Method for Digital Twin Data Using Deconvolution Operation With Attention Mechanism. IEEE Trans Ind Inf. 2023;19(5):7278–86.
- 66. Hao W, Yang T, Yang Q. Hybrid Statistical-Machine Learning for Real-Time Anomaly Detection in Industrial Cyber–Physical Systems. IEEE Trans Automat Sci Eng. 2023;20(1):32–46.