Table 1.
Summary of related studies and their features.
Fig 1.
Flowchart for the design of the proposed TabNet-IDS model in this research.
Table 2.
Distribution of stream records in CICIDS2017 dataset.
Table 3.
Distribution of stream records in CSE-CIC-IDS2018 dataset.
Fig 2.
Distribution of the 10% sample of the CIC-DDoS2019 dataset used for training and testing the model.
Fig 3.
The TabNet encoder is made up of three parts: a feature transformer, an attentive transformer, and feature masking.
A split block separates the information passed to the attentive transformer from the information passed to the next feature transformer. At each decision step, the feature selection mask gives interpretable information about the model's functionality, and the masks can be combined to obtain global feature importance attribution.
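The masking and aggregation described in this caption can be illustrated with a minimal sketch. It assumes the sparsemax normalization used by the original TabNet attentive transformer, and a simplified aggregation in which each step's mask is weighted by a scalar step contribution. The function names, scores, and weights are hypothetical and are not taken from this work.

```python
def sparsemax(scores):
    """Project raw scores onto the probability simplex, yielding a sparse mask."""
    ordered = sorted(scores, reverse=True)
    running_sum = 0.0
    tau = 0.0
    for k, value in enumerate(ordered, start=1):
        running_sum += value
        # The support condition holds for a prefix of the sorted scores;
        # the last k that satisfies it fixes the threshold tau.
        if 1 + k * value > running_sum:
            tau = (running_sum - 1) / k
    return [max(s - tau, 0.0) for s in scores]


def aggregate_masks(masks, step_weights):
    """Combine per-step masks into one normalized global importance vector."""
    n_features = len(masks[0])
    totals = [
        sum(w * mask[j] for mask, w in zip(masks, step_weights))
        for j in range(n_features)
    ]
    norm = sum(totals)
    return [t / norm for t in totals]


# Hypothetical attention scores for 4 features over 3 decision steps.
step_scores = [
    [2.0, 1.0, 0.1, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.2, 1.5, 1.4],
]
masks = [sparsemax(s) for s in step_scores]

# Hypothetical per-step contributions (assumed here, not learned).
step_weights = [1.0, 0.5, 2.0]
global_importance = aggregate_masks(masks, step_weights)
```

Each mask sums to one and sets low-scoring features exactly to zero, which is what makes the per-step masks readable as feature selections.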
Fig 4.
Model training, optimization, and evaluation.
Fig 5.
Optuna’s general system design.
In the search study, each worker is responsible for executing one instance of the objective function. The objective function uses Optuna APIs to run its trial. The objective function accesses the shared storage to obtain information about past studies when necessary. Each worker operates independently, running the objective function and sharing the progress of the current study through the shared storage.
Table 4.
Description of TabNet hyper-parameters and optimized space.
Fig 6.
Hyperparameter importance of the model parameters obtained during the optimization process using Optuna.
nda denotes the shared value of nd and na, since setting nd = na is generally a good choice for performance.
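One way to tie the two widths together during a search is to sample a single value and expand it afterwards. The sketch below assumes the keyword names `n_d`, `n_a`, `n_steps`, and `gamma` used by the pytorch-tabnet `TabNetClassifier`; whether this work used that library is an assumption, and the helper name and the sampled values are hypothetical.

```python
def expand_tabnet_params(sampled):
    """Replace the shared width 'n_da' with equal 'n_d' and 'n_a' entries."""
    params = dict(sampled)  # copy so the sampled dict is left unchanged
    n_da = params.pop("n_da")
    params["n_d"] = n_da
    params["n_a"] = n_da
    return params


# Hypothetical values, as one trial of a search might produce them.
sampled = {"n_da": 32, "n_steps": 3, "gamma": 1.3}
tabnet_kwargs = expand_tabnet_params(sampled)
```

Searching over one parameter instead of two shrinks the search space and lets an importance plot report a single entry for both widths.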
Table 5.
Performance of the TabNet model on the selected datasets according to the number of folds.
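Evaluating by number of folds implies some k-fold splitting scheme. The sketch below shows a stratified variant, which keeps class proportions similar across folds. Whether the splits in this work were stratified is not stated, so that choice, the function names, and the toy labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds, seed=0):
    """Assign sample indices to folds so each class is spread evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Toy, imbalanced label list (made up for illustration).
labels = ["Benign"] * 12 + ["DDoS"] * 6 + ["PortScan"] * 3
folds = stratified_folds(labels, n_folds=3)
splits = list(train_test_splits(folds))
```

With imbalanced intrusion data, stratification keeps rare attack classes from vanishing in some folds, which would otherwise distort the per-fold scores.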
Fig 7.
TabNet-IDS performance on the three datasets.
Fig 8.
Feature masking output for the three decision steps of the TabNet model on the CIC-IDS2017 dataset.
Fig 9.
Global explanation for the selected features used in the CSE-CIC-IDS2018 dataset.
Fig 10.
Global explanation for the selected features used in the CIC-DDoS2019 dataset.
The masks for each step show that Fwd Packet Length Min, Fwd Packet Length Mean, Total Fwd Packets, Min Packets Length, Bwd Packet Length Mean, Fwd Packet Length Max, and Fwd Packet Length Std are the features most used for the decisions.
Fig 11.
Confusion matrix of the model performance on the CIC-IDS2017 dataset.
The labels are encoded as 0: Benign, 1: Botnet, 2: Brute force, 3: DDoS, 4: DoS, 5: Heartbleed, 6: Infiltration, 7: PortScan, 8: Web attacks.
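To show how the integer encoding maps onto the rows and columns of a confusion matrix, here is a minimal sketch. The label mapping is the one given in the caption; the function name and the true and predicted labels are made up for illustration and are not results from this work.

```python
LABELS_2017 = {
    0: "Benign", 1: "Botnet", 2: "Brute force", 3: "DDoS", 4: "DoS",
    5: "Heartbleed", 6: "Infiltration", 7: "PortScan", 8: "Web attacks",
}


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Made-up encoded labels: one DDoS flow is predicted as DoS and one
# PortScan flow is predicted as Benign.
y_true = [0, 0, 3, 3, 4, 7, 7, 8]
y_pred = [0, 0, 3, 4, 4, 7, 0, 8]

cm = confusion_matrix(y_true, y_pred, len(LABELS_2017))

# Report every off-diagonal (misclassified) cell by class name.
errors = [
    (LABELS_2017[t], LABELS_2017[p], cm[t][p])
    for t in range(len(cm))
    for p in range(len(cm))
    if t != p and cm[t][p] > 0
]
```

Diagonal cells count correct predictions, so any nonzero off-diagonal cell points directly at a pair of classes the model confuses.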
Fig 12.
Confusion matrix of the model performance on the CSE-CIC-IDS2018 dataset, with the labels encoded as 0: Benign, 1: Bot, 2: DoS, 3: DDoS, 4: Brute force, 5: Infiltration, 6: Web Attacks.
Fig 13.
Confusion matrix of the model performance on the CIC-DDoS2019 dataset, with the labels encoded as 0: BENIGN, 1: DNS, 2: LDAP, 3: MSSQL, 4: NTP, 5: NetBIOS, 6: Portmap, 7: SNMP, 8: SSDP, 9: Syn, 10: TFTP, 11: UDP, 12: UDPLag, 13: WebDDoS.
Table 6.
Comparison of the performance of our proposed model with other models that implement similar algorithms.