Anomaly detection in virtual machine logs against irrelevant attribute interference

doi:10.1371/journal.pone.0315897

Table 1.

Comparison of methods for log anomaly detection.


Fig 1.

LSTM auto encoder algorithm illustration.


Fig 2.

A concrete example showing a few log lines from a VMware log file.


Fig 3.

A schematic diagram of a single log file.

The diagram illustrates the generation of logs. Each virtual machine generates log files in chronological order over time. The intervals between log file generations are often inconsistent, resulting in some virtual machines generating a large number of log files within a given time frame ‘K’, while others generate fewer log files. The number of log lines in each log file also tends to vary.


Fig 4.

One case of log file anomaly detection is shown.


Fig 5.

Another case of log file anomaly detection is shown.

The latest log file on each virtual machine at time T is the object to be detected. The Discriminator is the detection system. Normal and Anormal represent the two categories into which the log files are classified. In one case (Fig 4), T₃ is a noisy normal log file that is flagged as an anomaly. In another case (Fig 5), T₃ is a noisy normal log file that is classified as normal.


Fig 6.

A brief overview of virtual machine log anomaly detection.

In the training phase, the training log set undergoes log parsing to obtain log templates. The log templates are then sorted based on their length to create a mapping dictionary between the log templates and numerical values. This dictionary converts the log data into numerical data. The feature vector data, obtained through feature extraction, serves as input for training the SVM discriminator. In the testing phase, the log set is mapped into numerical data using the dictionary obtained during the training phase. The feature vector data, obtained through feature extraction, is then used as input for the SVM discriminator to detect anomalies.
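The dictionary and mapping steps described in this caption can be illustrated with a minimal Python sketch. It assumes log parsing has already produced a template for each log line. The function names, the example templates, the alphabetical tie-break, the reserved ID 0 for templates not seen in training, and the use of template counts as the feature vector are all assumptions made for illustration, not details taken from the paper. The SVM training step is not shown; the resulting vectors would be passed to whichever SVM implementation is used.

```python
from collections import Counter


def build_template_dictionary(templates):
    """Map each distinct log template to an integer ID, ordered by template length."""
    # Sort by length; ties are broken alphabetically so the result is
    # deterministic (the tie-break rule is an assumption).
    ordered = sorted(set(templates), key=lambda t: (len(t), t))
    # IDs start at 1; 0 is reserved for templates unseen in training (assumption).
    return {template: idx for idx, template in enumerate(ordered, start=1)}


def encode_log_file(template_sequence, dictionary):
    """Convert a sequence of templates into numerical data using the dictionary."""
    return [dictionary.get(t, 0) for t in template_sequence]


def count_features(encoded, vocab_size):
    """Build a fixed-length vector of per-ID counts (hypothetical feature extraction)."""
    counts = Counter(encoded)
    return [counts.get(i, 0) for i in range(vocab_size + 1)]


# Training phase: made-up placeholder templates, not real VMware log output.
train_file = ["VM powered on", "Disk <*> attached", "VM powered on"]
dictionary = build_template_dictionary(train_file)

# Testing phase: reuse the dictionary obtained during training.
test_file = ["Disk <*> attached", "Unknown event <*>", "VM powered on"]
encoded = encode_log_file(test_file, dictionary)
features = count_features(encoded, len(dictionary))
```

Reusing the training-phase dictionary at test time keeps the numerical encoding consistent between the two phases, so a feature vector built from a test log file has the same layout as the vectors the discriminator was trained on.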
