A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data | PLOS One

Fig 1.

Different anomaly detection modes depending on the availability of labels in the dataset.
(a) Supervised anomaly detection uses a fully labeled dataset for training. (b) Semi-supervised anomaly detection uses an anomaly-free training dataset. Afterwards, deviations in the test data from that normal model are used to detect anomalies. (c) Unsupervised anomaly detection algorithms use only intrinsic information of the data in order to detect instances deviating from the majority of the data.

Fig 2.

A simple two-dimensional example.
It illustrates global anomalies (x₁, x₂), a local anomaly x₃ and a micro-cluster c₃.

Fig 3.

A taxonomy of unsupervised anomaly detection algorithms comprising four main groups.
Note that CMGOS can be categorized in two groups: it is a clustering-based algorithm that also estimates a subspace for each cluster.

Fig 4.

A visualization of the results of the k-NN global anomaly detection algorithm.
The anomaly score is represented by the bubble size whereas the color shows the labels of the artificially generated dataset.
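
To make the idea of a global k-NN anomaly score concrete, here is a minimal sketch. It assumes the score of an instance is the mean Euclidean distance to its k nearest neighbors, which is one common variant and is not stated in the caption. The function name `knn_scores` and the toy data are hypothetical.

```python
import math

def knn_scores(points, k):
    """Score each point by the mean Euclidean distance to its k nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point (the point itself is excluded).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical toy data: a tight cluster plus one far-away point (index 4).
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = knn_scores(data, k=2)
most_anomalous = scores.index(max(scores))
```

In a plot like the one described, each score would determine the bubble size of its instance, so the isolated point would get the largest bubble.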

Fig 5.

Comparing COF (top) with LOF (bottom) using a simple dataset with a linear correlation of two attributes.
It can be seen that the spherical density estimation of LOF fails to recognize the anomaly, whereas COF detects the non-linear anomaly (k = 4).

Fig 6.

Comparing INFLO with LOF shows the usefulness of the reverse neighborhood set.
For the red instance, LOF takes only the neighbors in the gray area into account, resulting in a high anomaly score. INFLO additionally takes the blue instances (its reverse neighbors) into account and thus scores the red instance as more normal.
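
The difference between the two neighborhood sets can be sketched as follows. This is only an illustration of the k-nearest-neighbor set and the reverse neighbor set, not of the full INFLO score. The helper names and the one-dimensional toy layout are hypothetical.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (excluding i itself)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])

def reverse_knn_indices(points, i, k):
    """Indices of all points that count points[i] among their own k nearest neighbors."""
    return {j for j in range(len(points))
            if j != i and i in knn_indices(points, j, k)}

# Hypothetical layout: a dense cluster (indices 0-2), a sparse cluster (indices 4-6),
# and one instance in between (index 3).
data = [(0.0,), (0.1,), (0.2,), (1.0,), (2.4,), (4.0,), (5.5,)]
neighbors = knn_indices(data, 3, k=2)            # only members of the dense cluster
reverse_neighbors = reverse_knn_indices(data, 3, k=2)  # a member of the sparse cluster
influence_space = neighbors | reverse_neighbors
```

Here the plain neighborhood of index 3 contains only dense-cluster instances, while the reverse neighbors add an instance from the sparse side, which is the extra context the caption attributes to INFLO.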

Fig 7.

A visualization of the results for the uCBLOF algorithm.
The anomaly score is represented by the bubble size, whereas the color corresponds to the clustering result of the preceding k-means clustering algorithm. Local anomalies are clearly not detected by uCBLOF.

Table 1.

The 10 datasets from different application domains used for the comparative evaluation of the unsupervised anomaly detection algorithms.
A broad spectrum of size, dimensionality and anomaly percentage is covered. They also differ in difficulty and cover local and global anomaly detection tasks.

Fig 8.

The AUC values for the nearest-neighbor based algorithms on the breast-cancer dataset.
It can be seen that k values smaller than 10 tend to result in poor estimates, especially when considering local anomaly detection algorithms. Please note that the AUC axis is cut off at 0.5.

Table 2.

The results of the nearest-neighbor based algorithms showing the AUC and the standard deviation for 10 ≤ k ≤ 50 for all of the 10 datasets.
Due to the computational complexity, LOCI could not be computed for larger datasets.
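
A summary of this kind (AUC plus its spread over a range of k) could be computed roughly as sketched below. Several details are assumptions, not taken from the table: the scoring function (distance to the k-th nearest neighbor), the use of the population standard deviation, and the toy data, which uses a much smaller k range than 10 ≤ k ≤ 50. All function names are hypothetical.

```python
import math
import statistics

def auc(scores, labels):
    """ROC AUC: fraction of (anomaly, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def kth_nn_score(points, k):
    """Distance of each point to its k-th nearest neighbor (one possible k-NN score)."""
    return [sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[k - 1]
            for i, p in enumerate(points)]

def auc_over_k(points, labels, ks):
    """Mean and population standard deviation of the AUC across the given k values."""
    aucs = [auc(kth_nn_score(points, k), labels) for k in ks]
    return statistics.mean(aucs), statistics.pstdev(aucs)

# Hypothetical toy data; label 1 marks the single anomaly.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (9, 9)]
labels = [0, 0, 0, 0, 0, 1]
mean_auc, std_auc = auc_over_k(points, labels, ks=range(1, 4))
```

A small standard deviation indicates that the result is insensitive to the choice of k within the tested range, which is the property such a table lets a reader compare across algorithms.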

Fig 9.

The AUC values for the large kdd99 dataset for 0 < k < 100.
It can be easily seen that the performance of local anomaly detection algorithms is poor for this global anomaly detection challenge.

Table 3.

The results of the clustering-based algorithms showing the AUC and the standard deviation for different initial k (10 ≤ k ≤ 50).
The last row shows a comparison with the best nearest-neighbor method for the dataset.

Table 4.

The AUC results of the remaining unsupervised anomaly detection algorithms.
Four different strategies for keeping the components have been used for rPCA, while for HBOS the number of different bins was altered.
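
To illustrate why the number of bins matters for HBOS, here is a minimal sketch of a histogram-based score. It assumes fixed-width bins per feature, heights normalized so the tallest bin equals 1, and a score that sums log(1 / height) over all features. The function name `hbos_scores` and the toy data are hypothetical, and this is not the implementation evaluated in the table.

```python
import math

def hbos_scores(points, n_bins):
    """Histogram-based score: sum over features of log(1 / normalized bin height)."""
    n_features = len(points[0])
    scores = [0.0] * len(points)
    for f in range(n_features):
        values = [p[f] for p in points]
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
        # Assign each value to a fixed-width bin; the maximum falls into the last bin.
        bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
        counts = [0] * n_bins
        for b in bins:
            counts[b] += 1
        tallest = max(counts)
        for i, b in enumerate(bins):
            # The tallest bin has normalized height 1 and therefore contributes 0.
            scores[i] += math.log(tallest / counts[b])
    return scores

# Hypothetical toy data: the last instance is extreme in the first feature.
data = [(1.0, 5.0), (1.1, 5.1), (1.2, 5.2), (1.3, 5.0), (10.0, 5.1)]
scores = hbos_scores(data, n_bins=3)
most_anomalous = scores.index(max(scores))
```

Because each feature is scored independently, changing `n_bins` changes which instances share a bin and therefore how rare each one appears.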

Table 5.

Comparing the computation time of the different algorithms shows huge differences, especially for the larger datasets.
The unit of the table is seconds for the first nine columns and minutes for the last dataset (kdd99).

Table 6.

Recommendations for algorithm selection.
Qualitative judgments are given, ranging from very bad (− −) through average (o) to very good (++).
