Attentive transformer deep learning algorithm for intrusion detection on IoT systems using automatic Xplainable feature selection

doi:10.1371/journal.pone.0286652

Table 1.

Summary of related literature and their features.

Fig 1.

Flowchart for the design of the proposed TabNet-IDS model in this research.

Table 2.

Distribution of stream records in CICIDS2017 dataset.

Table 3.

Distribution of stream records in CSE-CIC-IDS2018 dataset.

Fig 2.

Distribution of the 10% of the CIC-DDoS2019 dataset used for training and testing the model.

Fig 3.

The TabNet encoder is made up of three parts: a feature transformer, an attentive transformer, and feature masking.

A split block separates the information passed to the attentive transformer from the information passed to the next feature transformer. At each decision step, the feature selection mask gives interpretable information about the model’s behavior, and the masks can be aggregated to obtain global feature importance.
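
The masking and aggregation idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: TabNet's attentive transformer uses sparsemax to produce sparse masks, but the sketch omits the prior scale and the learned layers, and the helper names `sparsemax` and `aggregate_masks` as well as the `step_weights` argument are hypothetical. Equal step weights are an assumption made for simplicity.

```python
from typing import List, Optional, Sequence


def sparsemax(scores: Sequence[float]) -> List[float]:
    """Project scores onto the probability simplex; many outputs become exactly 0."""
    z_sorted = sorted(scores, reverse=True)
    cumulative = 0.0
    support_sum = 0.0
    k_support = 0
    for k, z in enumerate(z_sorted, start=1):
        cumulative += z
        # Keep the largest k for which the k-th sorted score stays in the support.
        if 1.0 + k * z > cumulative:
            k_support = k
            support_sum = cumulative
    tau = (support_sum - 1.0) / k_support  # threshold subtracted from every score
    return [max(s - tau, 0.0) for s in scores]


def aggregate_masks(
    step_masks: Sequence[Sequence[float]],
    step_weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-step masks for one sample into normalized feature importances."""
    if step_weights is None:
        # Assumption: every decision step contributes equally.
        step_weights = [1.0] * len(step_masks)
    totals = [0.0] * len(step_masks[0])
    for mask, weight in zip(step_masks, step_weights):
        for j, m in enumerate(mask):
            totals[j] += weight * m
    norm = sum(totals)
    return [t / norm for t in totals] if norm > 0 else totals


# Two illustrative decision steps over three features (made-up scores).
masks = [sparsemax([1.0, 2.0, 3.0]), sparsemax([0.5, 0.5, -1.0])]
importance = aggregate_masks(masks)
```

Each mask sums to 1 and sets low-scoring features to exactly zero, which is what makes the per-step masks readable as feature selections. Averaging over many samples would give a dataset-level view.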

Fig 4.

Model training, optimization, and evaluation.

Fig 5.

Optuna’s general system design.

In the search study, each worker is responsible for executing one instance of the objective function. The objective function uses Optuna APIs to run its trial. The objective function accesses the shared storage to obtain information about past studies when necessary. Each worker operates independently, running the objective function and sharing the progress of the current study through the shared storage.
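
The worker and shared-storage pattern described above can be illustrated with a toy sketch that uses only the Python standard library. It deliberately does not call Optuna's API: `SharedStorage`, `worker`, `run_study`, and the quadratic `objective` are hypothetical stand-ins, threads stand in for worker processes, and the sampling rule is a simple random search rather than any sampler Optuna provides.

```python
import random
import threading
from typing import Dict, List, Optional, Tuple

Trial = Tuple[Dict[str, float], float]


class SharedStorage:
    """In-memory stand-in for the storage that all workers read and write."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._trials: List[Trial] = []

    def record(self, params: Dict[str, float], value: float) -> None:
        with self._lock:
            self._trials.append((params, value))

    def best(self) -> Optional[Trial]:
        with self._lock:
            if not self._trials:
                return None
            return min(self._trials, key=lambda t: t[1])

    def __len__(self) -> int:
        with self._lock:
            return len(self._trials)


def objective(params: Dict[str, float]) -> float:
    """Hypothetical objective to minimize; its optimum is at x = 2."""
    return (params["x"] - 2.0) ** 2


def worker(storage: SharedStorage, n_trials: int, seed: int) -> None:
    """Run trials independently, consulting the shared history before each one."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        best = storage.best()
        if best is not None and rng.random() < 0.5:
            # Exploit: perturb the best parameters any worker has found so far.
            x = best[0]["x"] + rng.gauss(0.0, 0.5)
        else:
            # Explore: sample uniformly from the search range.
            x = rng.uniform(-10.0, 10.0)
        params = {"x": x}
        storage.record(params, objective(params))


def run_study(n_workers: int = 4, n_trials: int = 25) -> SharedStorage:
    """Start the workers, wait for them to finish, and return the shared storage."""
    storage = SharedStorage()
    threads = [
        threading.Thread(target=worker, args=(storage, n_trials, seed))
        for seed in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return storage
```

Calling `run_study()` and then `storage.best()` returns the best parameter set found by any worker. The workers never communicate directly; they coordinate only through the storage, which is the property the figure emphasizes.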

Table 4.

Description of TabNet hyper-parameters and optimized space.

Fig 6.

Hyperparameter importance of the model parameters obtained during the optimization process using Optuna.

n_da represents the shared value of n_d and n_a, since setting n_d = n_a is ideal for better performance.
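
As a small illustration of tying the two widths together, the sketch below expands a single tuned value into both settings. The function name `tabnet_params_from_n_da` is hypothetical, and `n_steps` and `gamma` are included only as examples of other TabNet hyperparameters; their default values here are placeholders, not the tuned values reported in the paper.

```python
from typing import Dict, Union


def tabnet_params_from_n_da(
    n_da: int, n_steps: int = 3, gamma: float = 1.3
) -> Dict[str, Union[int, float]]:
    """Expand one searched width into the paired n_d and n_a settings."""
    if n_da <= 0:
        raise ValueError("n_da must be a positive integer")
    # Searching over one value instead of two shrinks the search space
    # and guarantees n_d == n_a in every trial.
    return {"n_d": n_da, "n_a": n_da, "n_steps": n_steps, "gamma": gamma}


params = tabnet_params_from_n_da(16)
```

The resulting dictionary could then be passed as keyword arguments to whichever TabNet implementation is in use, assuming it accepts parameters with these names.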

Table 5.

Performance of the TabNet model on the selected datasets according to the number of folds.

Fig 7.

TabNet-IDS performance on the three datasets.

Fig 8.

Feature masking output for the three decision steps of the TabNet model on the CIC-IDS2017 dataset.

Fig 9.

Global explanation for the selected features used in the CSE-CIC-IDS2018 dataset.

Fig 10.

Global explanation for the selected features used in the CIC-DDoS2019 dataset.

The masks for each step show that Fwd Packet Length Min, Fwd Packet Length Mean, Total Fwd Packets, Min Packets Length, Bwd Packet Length Mean, Fwd Packet Length Max, and Fwd Packet Length Std are the features most often used for the decision.

Fig 11.

Confusion matrix of the model performance on the CIC-IDS2017 dataset.

The labels are encoded as 0: Benign, 1: Botnet, 2: Brute force, 3: DDoS, 4: DoS, 5: Heartbleed, 6: Infiltration, 7: PortScan, 8: Web attacks.
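
To make the encoding concrete, the sketch below maps the class names listed above to their integer codes and builds a confusion matrix from them. It is an illustration only: the true and predicted labels are made-up examples, not results from the paper, and the orientation (rows are true classes, columns are predicted classes) is an assumption, since the caption does not state it.

```python
from typing import List, Optional, Sequence

# Class names in the order of their integer codes, as given in the caption.
LABELS = [
    "Benign", "Botnet", "Brute force", "DDoS", "DoS",
    "Heartbleed", "Infiltration", "PortScan", "Web attacks",
]
LABEL_TO_ID = {name: index for index, name in enumerate(LABELS)}


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> List[List[int]]:
    """Count samples per (true class, predicted class) pair."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_id, pred_id in zip(y_true, y_pred):
        matrix[true_id][pred_id] += 1
    return matrix


def per_class_recall(matrix: Sequence[Sequence[int]]) -> List[Optional[float]]:
    """Diagonal count divided by row total; None for classes with no samples."""
    recalls: List[Optional[float]] = []
    for class_id, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[class_id] / total if total else None)
    return recalls


# Made-up labels for illustration only.
true_names = ["Benign", "Benign", "DDoS", "DoS", "PortScan"]
pred_names = ["Benign", "DDoS", "DDoS", "DoS", "Benign"]
y_true = [LABEL_TO_ID[name] for name in true_names]
y_pred = [LABEL_TO_ID[name] for name in pred_names]

matrix = confusion_matrix(y_true, y_pred, len(LABELS))
recalls = per_class_recall(matrix)
```

Off-diagonal cells show which attack classes are confused with each other, which is the information the figure conveys for the full test set.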

Fig 12.

Confusion matrix of the model performance on the CSE-CIC-IDS2018 dataset, with the labels encoded as 0: Benign, 1: Bot, 2: DoS, 3: DDoS, 4: Brute force, 5: Infiltration, 6: Web Attacks.

Fig 13.

Confusion matrix of the model performance on the CIC-DDoS2019 dataset, with the labels encoded as 0: BENIGN, 1: DNS, 2: LDAP, 3: MSSQL, 4: NTP, 5: NetBIOS, 6: Portmap, 7: SNMP, 8: SSDP, 9: Syn, 10: TFTP, 11: UDP, 12: UDPLag, 13: WebDDoS.

Table 6.

Comparison of the performance of our proposed model with other models that implemented similar algorithms.
