Abstract
The Internet has continued to provision its infrastructure as a platform for competitive marketing, enhanced productivity, and efficient monetization. However, it has also become a means for adversaries to exploit unsuspecting users and, in turn, compromise network resources. Filters, gateways, firewalls, and intrusion detection systems have only minimized the effects of adversaries. Thus, with the constant evolution of exploitation and penetration techniques in network security, security experts must likewise evolve their mitigation and defensive measures, using advanced tools such as machine learning (ML) to detect and stop any attack or threat as close to its source as possible. This helps to quickly identify malicious packets and prevent resource exploits and service disruption. Traditional ML performance is often degraded because: (a) its simplistic design is unsuited to handling categorical datasets effectively, and (b) its hill-climbing search yields solutions that become stuck at local optima. Deep learning (DL) schemes based on recurrent networks avoid such pitfalls but present the demerits of the vanishing gradient problem and require longer training time. To curb the challenges of both ML and DL, we propose a transfer learning scheme with three base classifiers (BiGRU, BiLSTM, and Random Forest) and an XGBoost meta-learner to aid effective identification of DDoS. The ensemble yields Accuracy and F1 of 1.000, effectively classifying 314,102 DDoS cases during its evaluation. The proposed ensemble demonstrates that it can efficiently identify malicious packets from DDoS attacks in network transactions.
Citation: Yoro RE, Okpor MD, Akazue MI, Okpako EA, Eboka AO, Ejeh PO, et al. (2025) Adaptive DDoS detection mode in software-defined SIP-VoIP using transfer learning with boosted meta-learner. PLoS One 20(6): e0326571. https://doi.org/10.1371/journal.pone.0326571
Editor: Taimur Bakhshi, OU: The Open University, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Received: January 23, 2025; Accepted: June 2, 2025; Published: June 26, 2025
Copyright: © 2025 Yoro et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Dataset was retrieved from https://data.mendeley.com/datasets/b7vw628825/1 – array of DDoS traffic attack in a software-defined network.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Businesses and corporate organizations have become more vigilant in their commitment to mitigating cybersecurity threats as global losses run into billions of dollars [1]. Despite this, attackers have continued to evolve techniques aimed at circumventing secure protocols [2,3]. With efforts to use secure gateways and firewalls to enhance user trust and access to network resources yielding less than the expected results [4], stakeholders need to reposition with approaches that will accurately identify adversarial exploits in this multi-billion-dollar global battle [5]. All the same, as businesses can initiate, establish, and terminate an online call via the Session Initiation Protocol (SIP) on Voice over Internet Protocol (VoIP) telephony [6] to facilitate efficient peer-to-peer (P2P) communication, IP spoofing has become a common security flaw in SIP-based VoIP networks [7], as an adversary can assume the identity of a legitimate user to create access points targeted at exploiting unsuspecting users. For instance, using a man-in-the-middle attack, false messages can be triggered, which impacts the integrity of a SIP/VoIP call. By also listening to a legitimate interaction, an adversary can pose as a legitimate user and attack the interaction via a subterfuge attack, either by flooding or probing [8,9].
The proliferation of social media platforms has spurred new frontiers for attacks [10], even as users’ suspicion levels rise with greater online presence. Sentiments have proven to be an integral facet of a user’s personality traits that drives desires [11]. Attack designs can utilize: (a) believability, which increases the possibility that a device accepts malicious data as originating from a genuine user [12], and (b) insidiousness, which measures how long malicious content remains potent and undetectable to the user device [13]. Threats by design are poised to weaken networks by obscuring data privacy and evading security while presenting themselves as legitimate users [14]. This is achieved via intrusive acts, service outages, and denial of service to a user [15]. The exponential rise of these attacks has been attributed to the broad availability of the constitutive technology itself and calls for effort by security experts and stakeholders to mitigate these threats by exploring various protocols [16,17]. Socially engineered threats against network resources continue to witness a high success rate as users are repeatedly compromised due to personality traits [18,19]. The ease with which these attacks, such as Distributed Denial of Service (DDoS), are propagated has become of great concern to businesses [20], even with the rise in the number of tools and techniques to mitigate them [21,22].
1.1 DDoS attacks on SIP VoIP-based infrastructure
These types of attacks target user devices to derail networked resources [23] from their original purposes and can sometimes utilize social-engineering tools to harvest users’ credentials. These credentials are then used to compromise a network infrastructure [24] and resources such as memory, Central Processing Unit (CPU) time, and network bandwidth [25,26]. In many cases, an adversary achieves this feat via carefully coordinated and crafted exploits that insert obfuscated malware as requests poised to overwhelm a network. The magnitude of the exploit depends on the size of the deployed botnet, which determines the severity of the threat [27]. DDoS are carefully crafted threats that flood a network server with user requests and exploit the targeted network resources by denying legitimate users access to services [28,29]. As a first-aid measure, manually disconnecting a (detected) compromised device is a common approach to fix the challenge; once compromised, the device becomes an adversary’s entry point to proceed to other targets within the network infrastructure for further exploits [30]. Detection schemes for DDoS attacks can be grouped into: (a) victim-end detection [31,32], (b) core-end detection [33–35], and (c) source-end detection [36–38].
DDoS attacks are grouped into: (1) spoofing, where an adversary sends a large volume of malicious packets to a server while spoofing/masking its source-IP address, making it tedious to differentiate between genuine and malicious packets [39], and (2) flooding, where an adversary floods a network with user requests that exhaust network resources, making it difficult for legitimate users to access them [40]. Flood-based attacks swamp available (network) resources with massive amounts of user requests that eventually create spoofed packet traffic [41]. This inadvertently blocks legitimate user devices from being serviced with the available infrastructure resources. An exploitative DoS variant is the distributed DoS, which originates from multiple sources, making the flood-based DoS a menace to VoIP infrastructure with loss of monetization [42]. The last (application) layer of the network architecture aids data transfer with programs (i.e., browsers, email, etc.) that provide services such as FTP, IMAP, Telnet, SMTP [43,44], and so on. Susceptible targets of DDoS attacks, especially on SIP-based networks, are SIP proxy services and their network users [45,46].
1.2. Learning schemes for DDoS detection
Identification tasks are grouped into: machine learning (ML), deep learning (DL), and ensemble learning (EL) [47]. MLs, as used in high-dimension tasks, are trained to identify hidden relations of interest in (un)structured datasets to support decisions in the quest for truth [48]. Their robustness, reusability, and flexibility help them learn such relations quickly as changes occur, via feature engineering that eases outlier identification in the functioning of a system [49]. Thus, an ML model determines the crucial predictors selected as input for model construction and, in turn, recognizes those to aggregate as output. For classification and regression tasks, ML schemes are poised to identify and recognize hidden relations between the underlying predictor features of interest, as the model/heuristic seeks to learn changes within the dataset as a means to support decisions in its quest for ground-truth [50]. For these, researchers often explore common traditional MLs such as Random Forest [51], SVM [52], Naïve Bayes [53,54], etc. However, most traditional MLs: (a) explore hill-climbing techniques that often trap their solutions at local optima, making them not optimally fit, (b) have a simplistic design that often yields degraded performance, as they may not handle categorical datasets effectively, and (c) are not robust enough to handle large datasets [55]. For these and other reasons, researchers then explore deep learning (DL) schemes. DL networks are tailored to capture underlying relations of interest in a dataset [56]. The vanishing gradient challenge impedes performance and hinders the widespread use of the Recurrent Neural Network (RNN) [57]. Its variant, the Long Short-Term Memory (LSTM), resolves this challenge by exploring input gates that effectively manage how quickly and easily the model adapts to changes observed in the dataset [58,59]. A major drawback of the LSTM, however, is its need for longer training time and its inability to handle large datasets [60,61].
To combat the challenges in both ML and DL, EL fuses the two, using ML to overcome the issues in DL and vice versa. Thus, the EL yields a single, stronger, optimally fit classifier. This feat is achieved via: (a) vote, (b) bagging, (c) boost, and (d) stacked schemes. In vote mode, classifiers are independently aggregated to yield a final output with enhanced performance, relying on their fused predictive relations; this unexplored fusion degrades performance if much diversity and many outliers exist in the dataset [62]. Bagging trains similar decision trees with equal vote weights. It minimizes the variance and bias in a dataset by randomly training its trees with k-fold train-data, so that the model aggregates all tree predictions to yield greater accuracy with reduced errors [63,64]. Boost sequentially trains independent decision trees so that each iteration yields a classifier that corrects the mistakes of its base (previous) learners in the output [65]. Thus, with each iteration, the ensemble learns and amends its predecessors’ errors to yield enhanced performance, with AdaBoost as an example. Lastly, the stacked mode explores transfer learning, training its (meta-)learner to efficiently fuse the predictive outcomes of its base classifiers and so improve the generalization performance of the (meta-)classifier. This flexibility yields enhanced outcomes with less convergence time and fewer iterations.
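The four ensemble modes described above can be sketched with scikit-learn's stock implementations. This is a minimal illustration on synthetic data, not the study's own configuration; the specific estimators and hyperparameters chosen here are assumptions for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# A toy binary dataset standing in for the network-traffic records
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

modes = {
    # (a) vote: independent classifiers aggregated by majority vote
    "vote": VotingClassifier([("lr", LogisticRegression(max_iter=1000)),
                              ("dt", DecisionTreeClassifier(max_depth=5))]),
    # (b) bagging: similar trees trained on bootstrap samples, equal vote weights
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=20,
                                 random_state=42),
    # (c) boost: each iteration corrects its predecessors' errors (AdaBoost)
    "boost": AdaBoostClassifier(n_estimators=20, random_state=42),
    # (d) stacked: a meta-learner fuses the base classifiers' predictions
    "stacked": StackingClassifier([("dt", DecisionTreeClassifier(max_depth=5)),
                                   ("lr", LogisticRegression(max_iter=1000))],
                                  final_estimator=LogisticRegression(max_iter=1000)),
}
scores = {name: m.fit(X, y).score(X, y) for name, m in modes.items()}
print(scores)
```

All four fit the same data; the stacked mode mirrors the transfer-learning design the study adopts, with the final estimator trained on the base classifiers' outputs.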
This fusion benefits us as follows: (a) the stacked learning exploits the benefits of the various base schemes explored, keeping the model devoid of overfit while resolving gaps in the categorical/complex dataset [66], and (b) the XGB-regressor leverages and boosts the predictive capabilities inherent in the stacked learning to enhance itself more profitably. Boosting effectively improves the performance of its learners by targeting the difficulties witnessed in previous iterations, which in turn improves the outcome and reduces both variance and bias in the dataset. Thus, the ensemble leans on the comprehensive knowledge of the various approaches to exploit the learning depths of its base models with a boosted (meta-learner) mode that yields improved error reduction. Performance, however, is often degraded due to the imbalanced nature of the dataset [67]. To yield a balanced distribution, studies have explored oversampling schemes (as opposed to undersampling, which discards increasingly meaningful portions of the chosen dataset) [68]. A common oversampling mode is SMOTE (synthetic minority oversampling technique) [69]; its variant SMOTE-Tomek [70] fuses SMOTE (oversampler) with Tomek links (undersampler) [71] as a means to reduce class overlap in the data distribution, improve the dataset’s quality, and yield faster learning.
1.3. Study motivation
The study is motivated as follows [72–74]: (a) limited availability of/access to right-quality datasets to aid model construction, training, and evaluation [75], (b) the imbalanced nature of datasets, where attack (minor-class) transactions trail behind genuine (major-class) transactions [76], (c) the rise in multi-cross-channel transactions, which newer schemes must account for in their data acquisition mode to enhance model performance and keep up with emergent tactics [77,78], and (d) the quest for an adaptive detection scheme against VoIP-based DDoS flood attacks in software-defined networks, which must be poised to identify low data-traffic rates [79–81], stealthy traffic, abrupt high surges, and flash-crowd flood attacks [82]. To minimize these, a model is embedded with targeted delivery on a gateway to ensure the network application layer is protected from all incoming and outgoing packets [83].
Thus, we capture dynamic predictor feats via a trial-and-error mode in pursuit of a model that yields an optimal solution and satisfies the target class with improved generalization, devoid of model overfit. This study proposes to fuse ML and DL with SMOTE-Tomek sampling via a transfer (boosted) learning ensemble for robust DDoS attack identification and classification. The study contributes as follows: (a) utilizes SMOTE-Tomek to improve data quality and distribution, (b) develops three (i.e., BiLSTM, BiGRU, and Random Forest) base learners, (c) fuses the base learners via transfer learning with XGBoost as a meta-learner, and (d) evaluates the proposed method with popular SDN-DDoS test datasets to prove the method’s robustness.
2. Materials and methods
Our proposed method as shown in Fig 1 adopts the stacked learning mode with the following steps:
- Step 1 – Data Collection: Dataset was retrieved from https://data.mendeley.com/datasets/b7vw628825/1 – an array of DDoS traffic attacks in a software-defined network. With 26 features and 1,048,575 records as in Table 1, it yields minor-class 328,765 (attacks) and major-class 719,810 (genuine cases). Fig 2 is a plot of the original data distribution. The input data are read and stored in the data frame.
- Step 2 – Preprocessing: Here, we perform data cleaning as follows: (a) remove duplicate records to ensure the dataset is devoid of record redundancies, (b) remove missing values to ensure data quality, and (c) yield an optimized, restructured dataset [84] distributed into a variety of labeled classes. We then encode the records using the one-hot encoding technique, which transforms all categorical data into binary equivalents for effective utilization by the ML heuristic. A detailed code listing is given in Algorithm Listing 2.1.
Algorithm Listing 2.1
import pandas as pd  # import Python libraries
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('files/data/sdn.csv')  # load the software-defined network dataset
df.head(10)  # preview the first 10 samples of the chosen dataset

# visualize missing data
plt.figure(figsize=(10, 6))
sns.heatmap(df.isna().transpose(), cmap='YlGnBu', cbar_kws={'label': 'Missing Data'})

# handle missing values: replace each group's gaps with the group median
def data_imputation(data, column_group, column_selected):
    for value in data[column_group].unique():
        median = data.loc[(data[column_group] == value)
                          & ~(data[column_selected].isna()), column_selected].median()  # get median value
        data.loc[(data[column_group] == value)
                 & (data[column_selected].isna()), column_selected] = median  # fill missing values
    return data  # end for and return the dataframe

sns.boxplot(df['source_IP'])  # inspect the distribution after imputation

# remove duplicate data values
df = df.drop_duplicates().reset_index(drop=True)
This yields an optimized dataset and allows the proposed system to initiate the feature selection process.
- Step 3 – Feature Selection extracts data labels as predictor variables, determining which labels serve as input (x) and which label the proposed model will forecast as its output (y). Feature selection evaluates which predictors yield important relations in the quest for ground-truth (target class) and discards predictors that contribute nothing of significance (irrelevant or docile). The reduced dimensionality of the chosen predictors both quickens heuristic construction and improves training, yielding enhanced generalization. Studies have continued to posit that this is especially useful for model construction where cost is a crucial characteristic in the quest for the target class [85,86]. Lastly, the model is evaluated on how well the selected features correlate to the target class. We utilize the relief ranking function as the feature selection approach, as in Equation 1 and detailed in Algorithm Listing 2.2. It computes the resulting threshold alongside each predictor’s feature ranking value (by importance) with respect to the target class, as seen in Table 1. With the threshold set at 8.321, a total of 7 predictors (i.e., features) were selected as requisite in the quest for anomaly detection (i.e., target-class 1).
Algorithm listing 2.2 is a step-by-step, relief ranking feature selection mode [87].
Algorithm Listing 2.2
1. Given the dataset: n ← number of training samples, a ← number of features, m ← number of random training samples used to update W
2. initialize all feature weights W[A] = 0.0
3. for i = 1 to m do:
4. randomly select a target instance R
5. find nearest hit ‘H’ and nearest miss ‘M’ (instances)
6. for A = 1 to a do:
7. W[A] = W[A] – diff(A,R,H)/m + diff(A,R,M)/m
8. end for
9. end for
10. return vector W of feature scores that estimate the quality of features
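Listing 2.2 can be sketched as a runnable NumPy function. This is a minimal illustration under the stated assumptions (L1 distance for neighbour search, feature-range normalization inside diff); the study's threshold of 8.321 and the Table 1 rankings are not reproduced here.

```python
import numpy as np

def relief(X, y, m=100, rng=None):
    """Relief feature scores: W[A] = W[A] - diff(A,R,H)/m + diff(A,R,M)/m."""
    rng = np.random.default_rng(rng)
    n, a = X.shape
    span = X.max(axis=0) - X.min(axis=0)   # feature ranges, used to normalize diff
    span[span == 0] = 1.0
    W = np.zeros(a)
    for _ in range(m):
        i = rng.integers(n)                # randomly select a target instance R
        R = X[i]
        d = np.abs(X - R).sum(axis=1)      # distance of R to every instance
        d[i] = np.inf                      # exclude R itself
        same = y == y[i]
        H = X[np.where(same, d, np.inf).argmin()]   # nearest hit (same class)
        M = X[np.where(~same, d, np.inf).argmin()]  # nearest miss (other class)
        W += (-np.abs(R - H) + np.abs(R - M)) / (span * m)
    return W
```

Informative features separate the classes, so their nearest misses differ more than their nearest hits, pushing their weights up; irrelevant features score near zero.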
The 4th and 5th columns of Table 1 rank all data labels (and attributes) contained in the dataset, scored using the Chi-Square feature ranking for the various predictor features.
- Step 4 – Dataset Split and Balancing: First, the dataset is split into train (70%) and test (30%) subsets, so that balancing is actioned only on the train dataset. With the dataset chosen, data points are grouped into minor and major class distributions. Balancing then resamples these data labels via nearest-neighbour interpolation, either removing labels or creating synthetic instances that repopulate the pool to yield a redistributed, more balanced distribution across the major and minor classes. Here, we adopt the SMOTE-Tomek links mode, combining SMOTE (oversampler) and Tomek links (undersampler), achieved as follows: (a) it samples the original dataset pool, identifying both the major and minor classes, (b) it creates synthetic labels for the minor class, while removing (under-sampling) labels from the majority class closest to the minority class [70], and (c) the newly created synthetic labels are added to the original pool to yield a more balanced class distribution, as in Figs 2 and 3, respectively. Algorithm listing 2.3 details the step-by-step SMOTE-Tomek links approach to data balancing for the task.
Algorithm Listing 2.3
Input: M(minor_class_sample); N(synthetic_sample); number_k_nearest_neighbor for i in range(N);
1. from minor_class, choose random data-point//start SMOTE_mode
2. compute: relative_distance from randomly_selected_data and k_nearest_neighbor
3. choose rnd_val = random_value(0,1): rnd_val * relative_distance;
4. if simulated_samples = obtained then minor_class_new = minor_class + simulated_samples
5. repeat steps 2-to-4 until threshold_minor_class_new = reached;
6. select rnd_minor_class(data)//start Tomek_Links (under-sampler) approach
7. find k_nearest_neighbor(randomized_data)
8. if k_nearest_neighbor.selected = minor_class_new then TomekLink created
9. stop TomekLink procedure: end
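The SMOTE interpolation of listing 2.3 (steps 1–5) can be sketched in NumPy. This is an illustrative sketch only; in practice a library such as imbalanced-learn's SMOTETomek combines the oversampling and Tomek-link removal in one resampler, and the function name and parameters below are assumptions.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Create n_new synthetic minority samples by interpolating between a
    random minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        x = X_min[i]                            # step 1: random minority data-point
        d = np.linalg.norm(X_min - x, axis=1)   # step 2: distances to neighbours
        nbrs = d.argsort()[1:k + 1]             # skip the point itself
        z = X_min[nbrs[rng.integers(len(nbrs))]]
        out.append(x + rng.random() * (z - x))  # step 3: x + rnd_val * relative_distance
    return np.vstack(out)                       # steps 4-5: add to the minority pool
```

Each synthetic point is a convex combination of two real minority points, so the oversampled pool stays inside the minority class's original feature range.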
- Step-5 – Normalization via feature transformation rescales the features of the resampled dataset. We use the standard-normalizer function as in Equation 2, z = (x − μ)/σ, which transforms the selected data-labels of the resampled dataset to a distribution with mean 0 and standard deviation 1, with μ as the mean, z as our normalizer, x as the original data, and σ as the standard deviation. Fig 4 shows the normalized plot. Afterwards, for the study, the dataset is split into a 70% training dataset and a 30% test dataset.
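The standard-normalizer of Equation 2 corresponds to scikit-learn's StandardScaler; a minimal sketch on toy values:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy single-feature column standing in for one resampled predictor
X = np.array([[1.0], [2.0], [3.0], [4.0]])
z = StandardScaler().fit_transform(X)  # Equation 2: z = (x - mu) / sigma
print(z.ravel())  # mean ~0, standard deviation ~1
```

Fitting the scaler on the training split only (then applying it to the test split) avoids leaking test statistics into training.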
- Step-6 – Transfer (Stacked) Learning Ensemble Design – leans on 3-base models with XGBoost meta-classifier as in the proposed methodology, as thus:
- The Bidirectional Long Short-Term Memory (BiLSTM) is based on the RNN model, useful for handling time-series datasets. The RNN suffers a gradient vanishing problem, in which the gradient driving the learning process becomes quite small, slowing down or eventually stopping all learning within the model. LSTM overcomes this challenge via gates (i.e., input, forget, and output), which effectively allow the network to learn when to ‘recall’ and when to ‘forget’ irrelevant knowledge. In addition, its cell-state update function (Ct) maintains all important knowledge over time and is not impaired or degraded by the vanishing gradient problem. The forget, input, and output gates are constructed using Equations (3)–(5), respectively:
ft = σ(Wf·[ht−1, xt] + bf) (3)
it = σ(Wi·[ht−1, xt] + bi) (4)
ot = σ(Wo·[ht−1, xt] + bo) (5)
with it as activation of the input gate, ot as activation of the output gate, σ as the sigmoid function, Wf as the weight of the forget gate, ht−1 as the hidden state from the previous timestamp, xt as input at time t, bf as the bias for the forget gate, C̃t as the candidate value for the memory cell, and ht as the hidden state at time t. Additionally, BiLSTM, as a variant of the LSTM, processes data in both forward and backward formations: its first layer allows the flow of data in one direction (i.e., source to destination), while the second layer reverses the flow (destination to source), so that the network possesses past and future context of the data-series [70]. This paradigm is useful and expressive in natural language processing. The BiLSTM proffers greater flexibility via fusion of knowledge from both directions. It carefully utilizes hyperparameters that tune the model to avoid slow convergence and model overfit while balancing memory efficiency and task distribution, as seen in Table 2.
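The gate and cell-state updates above can be sketched as one LSTM cell step in NumPy. This is an illustrative, untrained single cell under stated assumptions (each weight matrix acts on the concatenated [ht−1, xt]), not the study's tuned BiLSTM.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W/b hold the forget ('f'), input ('i'), output ('o')
    and candidate ('c') parameters, each applied to [h_prev; x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate, Eq. (3)
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate, Eq. (4)
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate, Eq. (5)
    c_hat = np.tanh(W["c"] @ z + b["c"])    # candidate memory-cell value
    c_t = f_t * c_prev + i_t * c_hat        # cell-state update Ct
    h_t = o_t * np.tanh(c_t)                # hidden state ht
    return h_t, c_t
```

A BiLSTM runs two such chains, one over the sequence forward and one backward, and concatenates their hidden states.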
- The Bidirectional Gated Recurrent Units (BiGRU) – The LSTM, without proper settings, can still be caught up in the vanishing gradient challenge. BiGRU yields a simpler structure (as an RNN variant) [88] and also overcomes the vanishing gradient problem in LSTM by fusing both the input and forget gates into a single update gate, which in turn reduces the number of parameters to be trained. This speeds up the construction of the model and its training without trading off much of its memory capability [89]. Similar to BiLSTM, the BiGRU yields a 2-way data-processing capability to capture the before/after context in each data sequence. It achieves this via the update and reset gates as in Equations (6)–(7), respectively:
zt = σ(Wz·[ht−1, xt]) (6)
rt = σ(Wr·[ht−1, xt]) (7)
with zt as the update gate, σ as the sigmoid function, W as the weight matrix, Wz as the weight of the update gate, ht−1 as the hidden state at the previous time, xt as input at time t, rt as the reset gate, h̃t as the new hidden-state candidate value for the memory cell, and ht as the updated hidden state at time t. Thus, the model captures bidirectional data context to yield an improved understanding of all intricate data dependencies, with carefully tuned hyperparameters that balance training speed, result convergence, memory requirements, accuracy, and task distribution. Model design and configuration are seen in Table 3.
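As with the LSTM, the GRU gates can be sketched as a single NumPy cell step. An illustrative, untrained sketch under the same assumptions (weights act on the concatenated [ht−1, xt]); the candidate and hidden-state updates follow the standard GRU formulation.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x_t, h_prev, W, b):
    """One GRU step; W/b hold the update ('z'), reset ('r') and
    candidate ('h') parameters."""
    s = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W["z"] @ s + b["z"])      # update gate, Eq. (6)
    r_t = sigmoid(W["r"] @ s + b["r"])      # reset gate, Eq. (7)
    h_hat = np.tanh(W["h"] @ np.concatenate([r_t * h_prev, x_t]) + b["h"])
    return (1 - z_t) * h_prev + z_t * h_hat  # updated hidden state ht
```

The single update gate zt interpolates directly between the old state and the candidate, which is what removes the separate input/forget gates of the LSTM.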
- The Random Forest (RF), as a supervised, tree-based learning model, utilizes the bagging approach to grow its decision trees independently. Each tree is constructed using bootstrap aggregation, taking a majority vote over trees trained on samples of the training data during prediction [90]. In addition, it provides an extra layer that extends/changes how the decision trees are constructed, reintroducing randomness with a binary-tree split at each node. Thus, its best predictor nodes are randomly selected via its recursive structure to capture intrinsic interactions between parameters (data-labels). Its major demerit, however, is that its flexibility and robustness degrade with the complexity and mutation contained in the dataset [91,92], causing degraded performance [74,93]. To curb this, we tune the RF hyperparameters to help address dataset imbalance and diversity, reduce overfitting of the ensemble, and yield improved performance accuracy. Table 4 shows the RF ensemble design and configuration.
- The XGBoost meta-regressor, like the Random Forest, is a tree-based ensemble that exploits the gradient boosting approach to identify labels in a dataset. It aggregates the sum of its weaker base learners via a series of iterations to yield a fit solution [94]. Thus, with each new iteration and corresponding outcome, XGBoost corrects the weaknesses of its base learners to yield an improved ensemble. This is achieved via its goal function, which minimizes its loss function as in [95,96]. The ensemble can also be tuned to address dataset imbalance and diversity, reduce overfitting, and yield improved performance accuracy [97,98]. Table 5 shows the XGBoost design and configuration.
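The overall stacked design, base learners feeding a boosted meta-learner, can be sketched with scikit-learn's StackingClassifier. A minimal sketch under stated assumptions: only a Random Forest base learner is shown (the study's BiLSTM/BiGRU base learners would be wrapped as additional estimators), and GradientBoostingClassifier stands in for xgboost's XGBClassifier as the boosted meta-learner.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.model_selection import train_test_split

# Toy data standing in for the preprocessed SDN-DDoS records
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0))],
    # boosted meta-learner; stand-in for xgboost.XGBClassifier
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,  # base-learner predictions for the meta-learner come from CV folds
)
score = stack.fit(X_tr, y_tr).score(X_te, y_te)
print(round(score, 3))
```

The cv parameter matters: the meta-learner is trained on out-of-fold base predictions, which is what keeps the stacked ensemble from simply memorizing its base learners' training fit.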
- (Re)Training aims to fit the proposed ensemble with the SMOTE-Tomek, normalized training dataset, with a 10% training subset applied for retraining (or cross-validation). With the stratified k-fold dataset poised to yield (with each fold) a good representation of the dataset, the stacked learning ensemble ensures the heuristic proffers enhanced generalization that is devoid of overfit. A sample rule generated during (re)training is shown below, with Table 6 showing the top-18 rules:
if (duration = “-1:0:23”, protocol = “telnet”, src-port = -1, dest-port = 23, src-IP = “192.168.1.30”, dest-IP = “192.168.0.20”) then {log network connection as an Intrusion}.
3. Results and analysis
3.1. Results and findings
Table 7 shows performance evaluation metrics for all base-learners (Random Forest, BiGRU, and BiLSTM) with the XGB-regressor (meta-learner). The XGB meta-learner is used to readily resolve the conflict generated by the diversity of the fused heuristics and inherent encoding complexities in the dataset. Thus, the model is devoid of overfit as the transfer learning approach combines the predictive capability of all 3-base classifiers.
Both BiLSTM and BiGRU outperformed the Random Forest, with Accuracy of (RF, BiLSTM, BiGRU) 0.9815, 0.9968, and 0.9981; Recall of 0.9745, 0.9848, and 0.9881; Precision of 0.9805, 0.9318, and 0.9541; and F1 of 0.9805, 0.9881, and 0.9925, respectively. Our stacked meta-learner yielded 1.000 for Accuracy, Recall, Precision, and F1. Our simplistic stacked (transfer) learning design yields reduced computational complexity, reduced overhead, and enhanced performance. The near-perfect F1 and Accuracy allow for more integration from the base classifiers, as XGB’s regularization term efficiently moderates ensemble overfit for more accurate, generalizable predictions in practical implementation. The proposed ensemble efficiently reduced the skewed variance and bias inherent in the dataset, yielding a more robust and stable ensemble for new data and hidden variables in the training dataset.
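The reported metrics can be reproduced from any model's predictions with scikit-learn. A toy illustration (the labels below are invented for demonstration, not the study's results):

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

# Toy predictions: 4 true attacks (1), 3 predicted correctly, 1 missed.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

print(accuracy_score(y_true, y_pred))   # 5 of 6 correct -> 0.833...
print(precision_score(y_true, y_pred))  # no false alarms -> 1.0
print(recall_score(y_true, y_pred))     # 3 of 4 attacks caught -> 0.75
print(f1_score(y_true, y_pred))         # harmonic mean -> 0.857...
```

Note how precision can be perfect while recall is not; this is why the paper reports all four metrics rather than accuracy alone.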
Fig 5 yields the training loss for the proposed scheme. It yields a consistent, significant decrease from 0.69 in the first epoch to 0.31 by the third epoch. This trend signifies that the proposed scheme successfully minimizes errors on the training dataset. A smooth, monotonic decrease in loss without sudden fluctuations or plateaus indicates that the model’s learning rate is well-tuned and that it is not encountering instability or vanishing gradients. Conversely, Fig 6 shows validation accuracy per epoch with a continuous rise from 0.71 to over 0.90 across the three epochs. It implies how well the scheme performs with the unseen data not used during its training. This improvement suggests that the model is not merely memorizing the training data (overfit); Rather, it learns the intricate features as captured in the test dataset that generalize effectively to new SIP traffic samples. The upward trajectory implies a healthy learning process and demonstrates that the model can reliably detect SIP-VoIP DDoS flood-based attacks with increasing accuracy, harmonic mean, precision, and recall [99].
Our proposed ensemble identifies DDoS attack data accurately and has proven to efficiently reduce bias and variance, as indicated by the confusion matrix in Fig 7, yielding a more stable, robust model for new data and/or hidden underlying parameters of interest within the domain’s training dataset. Our study supports that SMOTE-Tomek exerted greater influence in the quest for ground-truth and impacted overall performance by identifying features of importance that influence prediction. The model effectively separated genuine from malicious packets with perfect accuracy and perfect F1-score, accurately classifying all 314,573 instances, which agrees with [100,101].
3.2. Comparative analysis
As we explored the high performance of our proposed ensemble across the dataset to demonstrate its flexibility, adaptability, robustness, and prediction ability, we also benchmarked it against previous methods that have utilized the same dataset as seen in Table 8 [102,103].
Whilst some task datasets have proven much easier to recognize/classify, others have conversely proven more painstaking. Some domain tasks, such as medical and image records, require that the chosen ensemble’s design metric be strongly impacted by the consequence of diagnostic errors within the captured dataset. Thus, both specificity and sensitivity become two critical feats to evaluate, since they are directly related to the inherent outcomes [94].
3.3. Practical Implementation and Implications
For our target system delivery, we tested the SMOTE-Tomek fused stacked-learning ensemble as an embedded application programming interface (API) on a standalone web program via Flask. Flask is a lightweight Python framework that easily integrates as an embedded app and, incorporated with Streamlit, provides the requisite platform to transform this DDoS detection ensemble into an accessible API. This, in turn, yielded a FastAPI deployment with 3 components: (a) an initialize function that specifies the opened communication ports, (b) an integrate function that connects the API framework to the server system, thus allowing it to process and filter all incoming packets, and (c) an interoperability function that processes all messaging data from/to the interconnected devices.
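A Flask endpoint of this kind can be sketched as follows. This is a minimal illustration only: the route name, payload shape, and threshold classifier are assumptions, with the placeholder standing in for the trained stacked ensemble's predict().

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify(features):
    # Placeholder for the trained stacked ensemble's predict();
    # here a packet summary is flagged when its mean feature value is high.
    return "malicious" if sum(features) / len(features) > 0.5 else "genuine"

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]
    return jsonify({"label": classify(features)})

# app.run(port=5000)  # start the embedded API server when deployed
```

In deployment, each incoming packet summary is POSTed to /predict and only traffic labelled genuine is forwarded, which is the gateway-filter role described above.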
4. Conclusion
Our proposed model yields a total of 60 rules, with the top 18 rules found to have a classification accuracy range [0.89, 0.98]. This stresses and advances the evidence that over 89% of the generated rules can adequately identify cum classify the DDoS dataset. This ideology and paradigm of many good-fit rules is far-reaching and better than achieving a single elitist rule. This, in turn, increases the chances of recognizing malicious packets. Also, the nature of attacks in network transactions requires a constant concerted effort of all to detect intrusion. ML-based schemes simply sniff across network requests, analyze their traffic pattern (anomaly detection), and decide which transactions are compromised.
References
- 1. Akazue MI, Okofu SN, Ojugo AA, Ejeh PO, Odiakaose CC, Emordi FU, et al. Handling Transactional Data Features via Associative Rule Mining for Mobile Online Shopping Platforms. IJACSA. 2024;15(3).
- 2. Zuama LR, Setiadi DRIM, Susanto A, Santosa S, Gan H-S, Ojugo AA. High-Performance Face Spoofing Detection using Feature Fusion of FaceNet and Tuned DenseNet201. J Fut Artif Intell Tech. 2025;1(4):385–400.
- 3. Okofu SN, Anazia KE, Akazue MI, Okpor MD, Oweimieto AE, Asuai CE, et al. Pilot Study on Consumer Preference, Intentions and Trust on Purchasing-Pattern for Online Virtual Shops. IJACSA. 2024;15(7).
- 4. Eboka AO, Aghware FO, Okpor MD, Odiakaose CC, Okpako EA, Ojugo AA, et al. Pilot study on deploying a wireless sensor-based virtual-key access and lock system for home and industrial frontiers. IJ-ICT. 2025;14(1):287.
- 5. Aktayeva A, Makatov Y, Tulegenovna AK, Dautov A, Niyazova R, Zhamankarin M, et al. Cybersecurity Risk Assessments within Critical Infrastructure Social Networks. Data. 2023;8(10):156.
- 6. Ojugo AA, Eboka AO. Mitigating Technical Challenges via Redesigning Campus Network for Greater Efficiency, Scalability and Robustness: A Logical View. IJMECS. 2020;12(6):29–45.
- 7. Ang L-M, Seng KP, Ijemaru GK, Zungeru AM. Deployment of IoV for Smart Cities: Applications, Architecture, and Challenges. IEEE Access. 2019;7:6473–92.
- 8. Ojugo AA, Akazue MI, Ejeh PO, Ashioba NC, Odiakaose CC, Ako RE, et al. Forging a User-Trust Memetic Modular Neural Network Card Fraud Detection Ensemble: A Pilot Study. J Comput Theor Appl. 2023;1(2):50–60.
- 9. Otorokpo EA, Okpor MD, Yoro RE, Brizimor SE, Ifioko AM, Obasuyi DA, et al. DaBO-BoostE: Enhanced Data Balancing via Oversampling Technique for a Boosting Ensemble in Card-Fraud Detection. AIMS Research Journal. 2024;12:45–66.
- 10. Aghware FO, Yoro RE, Ejeh PO, Odiakaose CC, Emordi FU, Ojugo AA. DeLClustE: Protecting Users from Credit-Card Fraud Transaction via the Deep-Learning Cluster Ensemble. IJACSA. 2023;14(6).
- 11. Cooper PW. Managing Insider Threat. Special Issue; 2015. p. 1–14.
- 12. Ojugo AA, Oyemade DA. Boyer Moore string-match framework for a hybrid short message service spam filtering technique. IJ-AI. 2021;10(3):519.
- 13. Benchaji I, Douzi S, El Ouahidi B, Jaafari J. Enhanced credit card fraud detection based on attention mechanism and LSTM deep model. J Big Data. 2021;8(1).
- 14. Dawodu SO, Omotosho A, Odunayo JA, Abimbola OA, Ewuga SK. Cybersecurity risk assessment in banking: methodologies and best practices. Comput Sci It Res J. 2023;4(3):220–43.
- 15. Setiadi DRIM, Widiono S, Safriandono AN, Budi S. Phishing Website Detection Using Bidirectional Gated Recurrent Unit Model and Feature Selection. J Fut Artif Intell Tech. 2024;1(2):75–83.
- 16. Okpor MD, Aghware FO, Akazue MI, Eboka AO, Ako RE, Ojugo AA, et al. Pilot Study on Enhanced Detection of Cues over Malicious Sites Using Data Balancing on the Random Forest Ensemble. J Fut Artif Intell Tech. 2024;1(2):109–23.
- 17. Malasowe BO, Akazue MI, Okpako EA, Aghware FO, Ojie DV, Ojugo AA. Adaptive Learner-CBT with Secured Fault-Tolerant and Resumption Capability for Nigerian Universities. IJACSA. 2023;14(8).
- 18. Haque MA, Ahmad S, John A, Mishra K, Mishra BK, Kumar K, et al. Cybersecurity in Universities: An Evaluation Model. SN Comput Sci. 2023;4(5).
- 19. Yoro RE, Aghware FO, Akazue MI, Ibor AE, Ojugo AA. Evidence of personality traits on phishing attack menace among selected university undergraduates in Nigeria. IJECE. 2023;13(2):1943.
- 20. Ojugo AA, Otakore DO. Intelligent cluster connectionist recommender system using implicit graph friendship algorithm for social networks. IJ-AI. 2020;9(3):497.
- 21. Setiadi DRIM, Muslikh AR, Iriananda SW, Warto W, Gondohanindijo J, Ojugo AA. Outlier Detection Using Gaussian Mixture Model Clustering to Optimize XGBoost for Credit Approval Prediction. J Comput Theor Appl. 2024;2(2):244–55.
- 22. Okpor MD, Anazia KE, Adigwe W, Okpako EA, Setiadi DRIM, Ojugo AA, et al. Unmasking effects of feature selection and SMOTE-Tomek in tree-based random forest for scorch occurrence detection. Bulletin EEI. 2025;14(3):2393–403.
- 23. Jáñez-Martino F, Alaiz-Rodríguez R, González-Castro V, Fidalgo E, Alegre E. A review of spam email detection: analysis of spammer strategies and the dataset shift problem. Artif Intell Rev. 2022;56(2):1145–73.
- 24. Sheng WJ, Kasmin IF, Amin S, Zainal NK. Addressing user perception and implementing Hedera Hashgraph and voice recognition into Multi-Factor Authentication (MFA) system. ijdsaa. 2023;4:194–201.
- 25. Hambali MA, Agwu PA. Adversarial Convolutional Neural Network for Predicting Blood Clot Ischemic Stroke. J Comput Theor Appl. 2024;2(1):51–64.
- 26. Akazue MI, Ojugo AA, Yoro RE, Malasowe BO, Nwankwo O. Empirical evidence of phishing menace among undergraduate smartphone users in selected universities in Nigeria. IJEECS. 2022;28(3):1756.
- 27. Arachchige KG, Branch P, But J. An Analysis of Blockchain-Based IoT Sensor Network Distributed Denial of Service Attacks. Sensors (Basel). 2024;24(10):3083. pmid:38793937
- 28. Ojugo AA, Ejeh PO, Odiakaose CC, Eboka AO, Emordi FU. Improved distribution and food safety for beef processing and management using a blockchain-tracer support framework. IJ-ICT. 2023;12(3):205.
- 29. Alvares C, Dinesh D, Alvi S, Gautam T, Hasib M, Raza A. Dataset of attacks on a live enterprise VoIP network for machine learning based intrusion detection and prevention systems. Computer Networks. 2021;197:108283.
- 30. Ojugo AA, Yoro RE. Extending the three-tier constructivist learning model for alternative delivery: ahead the COVID-19 pandemic in Nigeria. IJEECS. 2021;21(3):1673.
- 31. Albladi SM, Weir GRS. User characteristics that influence judgment of social engineering attacks in social networks. Hum Cent Comput Inf Sci. 2018;8(1).
- 32. Zhang G, Fischer-Hübner S, Ehlert S. Blocking attacks on SIP VoIP proxies caused by external processing. Telecommun Syst. 2009;45(1):61–76.
- 33. Sidley A. Insider Threat Best Practices Guide. SIFMA; 2018. p. 31. Available from: https://www.sifma.org/wp-content/uploads/2018/02/insider-threat-best-practices-guide.pdf
- 34. Andel L, Kuthan J, Sisalem D. Distributed media server architecture for SIP using IP anycast. In: IPTComm 2009: Services and Security for Next Generation Networks, Proceedings of the 3rd International Conference on Principles, Systems and Applications of IP Telecommunications, 2009. Available from: https://doi.org/10.1145/1595637.1595644
- 35. Amalou W, Mehdi M. An Approach to Mitigate DDoS Attacks on SIP Based VoIP. In: The 1st International Conference on Computational Engineering and Intelligent Systems, 2022. 6. https://doi.org/10.3390/engproc2022014006
- 36. Callen M, Gibson CC, Jung DF, Long JD. Improving Electoral Integrity with Information and Communications Technology. J Exp Polit Sci. 2015;3(1):4–17.
- 37. Yoro RE, Aghware FO, Malasowe BO, Nwankwo O, Ojugo AA. Assessing contributor features to phishing susceptibility amongst students of petroleum resources varsity in Nigeria. IJECE. 2023;13(2):1922.
- 38. Sisalem D, Kuthan J, Ehlert S. Denial of service attacks targeting a SIP VoIP infrastructure: attack scenarios and prevention mechanisms. IEEE Network. 2006;20(5):26–31.
- 39. Ojugo AA, Yoro RE. Forging a deep learning neural network intrusion detection framework to curb the distributed denial of service attack. IJECE. 2021;11(2):1498.
- 40. Yeboah-Boateng EO, Amanor PM. Phishing, SMiShing & Vishing: An Assessment of Threats against Mobile Devices. J Emerg Trends Comput Inf Sci. 2014;5(4):297–307.
- 41. Odiakaose CC, Aghware FO, Okpor MD, Eboka AO, Binitie AP, Ojugo AA, et al. Hypertension Detection via Tree-Based Stack Ensemble with SMOTE-Tomek Data Balance and XGBoost Meta-Learner. J Fut Artif Intell Tech. 2024;1(3):269–83.
- 42. Thorat O, Parekh N, Mangrulkar R. TaxoDaCML: Taxonomy based Divide and Conquer using machine learning approach for DDoS attack classification. International Journal of Information Management Data Insights. 2021;1(2):100048.
- 43. Chuadhry MA, Bhatti MG, Shah RA. Impact of Blockchain Technology in Modern Banking Sector to Exterminate the Financial Scams. SJCMS. 2023;6(2):27–38.
- 44. Fiedler J, Kupka T, Ehlert S, Magedanz T, Sisalem D. VoIP defender. In: Proceedings of the 1st international conference on Principles, systems and applications of IP telecommunications, 2007. 11–7. https://doi.org/10.1145/1326304.1326307
- 45. Nazih W, Hifny Y, Elkilani WS, Mostafa T. Fast Detection of Distributed Denial of Service Attacks in VoIP Networks Using Convolutional Neural Networks. IJICIS. 2020;20(2):125–38.
- 46. Ehlert S, Zhang G, Geneiatakis D, Kambourakis G, Dagiuklas T, Markl J, et al. Two layer Denial of Service prevention on SIP VoIP infrastructures. Computer Communications. 2008;31(10):2443–56.
- 47. Binitie AP, Odiakaose CC, Okpor MD, Ejeh PO, Eboka AO, Ojugo AA, et al. Stacked Learning Anomaly Detection Scheme with Data Augmentation for Spatiotemporal Traffic Flow. J Fuzzy Syst Control. 2024;2(3):203–14.
- 48. Xuan S, Liu G, Li Z, Zheng L, Wang S, Jiang C. Random forest for credit card fraud detection. In: 2018 IEEE 15th International Conference on Networking, Sensing and Control (ICNSC), 2018. 1–6. https://doi.org/10.1109/icnsc.2018.8361343
- 49. Barlaud M, Chambolle A, Caillau JB. Robust supervised classification and feature selection using a primal-dual method. 2019.
- 50. Aghware FO, Akazue MI, Okpor MD, Malasowe BO, Aghaunor TC, Ugbotu EV, et al. Effects of Data Balancing in Diabetes Mellitus Detection: A Comparative XGBoost and Random Forest Learning Approach. NIPES. 2025;7(1):1–11.
- 51. Ako RE, Aghware FO, Okpor MD, Akazue MI, Yoro RE, Ojugo AA, et al. Effects of Data Resampling on Predicting Customer Churn via a Comparative Tree-based Random Forest and XGBoost. J Comput Theor Appl. 2024;2(1):86–101.
- 52. Li C, Ding N, Dong H, Zhai Y. Application of Credit Card Fraud Detection Based on CS-SVM. IJMLC. 2021;11(1):34–9.
- 53. Angdresey A, Sitanayah L, Tangka ILH. Sentiment Analysis for Political Debates on YouTube Comments using BERT Labeling, Random Oversampling, and Multinomial Naïve Bayes. J Comput Theor Appl. 2025;2(3):342–54.
- 54. Kolawole AO, Irhebhude ME, Odion PO. Human Action Recognition in Military Obstacle Crossing Using HOG and Region-Based Descriptors. J Comput Theor Appl. 2025;2(3):410–26.
- 55. Ako RE, Okpor MD, Aghware FO, Malasowe BO, Nwozor BU, Ojugo AA, et al. Pilot Study on Fibromyalgia Disorder Detection via XGBoosted Stacked-Learning with SMOTE-Tomek Data Balancing Approach. NIPES. 2025;7(1).
- 56. Setiadi DRIM, Susanto A, Nugroho K, Muslikh AR, Ojugo AA, Gan H-S. Rice Yield Forecasting Using Hybrid Quantum Deep Learning Model. Computers. 2024;13(8):191.
- 57. Abakarim Y, Lahby M, Attioui A. An Efficient Real Time Model For Credit Card Fraud Detection Based On Deep Learning. In: Proceedings of the 12th International Conference on Intelligent Systems: Theories and Applications, 2018. 1–7. https://doi.org/10.1145/3289402.3289530
- 58. Alqatawna A, Abu-Salih B, Obeid N, Almiani M. Incorporating Time-Series Forecasting Techniques to Predict Logistics Companies’ Staffing Needs and Order Volume. Computation. 2023;11(7):141.
- 59. Ojugo AA, Ejeh PO, Odiakaose CC, Eboka AO, Emordi FU. Predicting rainfall runoff in Southern Nigeria using a fused hybrid deep learning ensemble. IJ-ICT. 2024;13(1):108.
- 60. Ren C, Chai C, Yin C, Ji H, Cheng X, Gao G, et al. Short-Term Traffic Flow Prediction: A Method of Combined Deep Learnings. Journal of Advanced Transportation. 2021;2021:1–15.
- 61. Muhamada K, Setiadi DRIM, Sudibyo U, Widjajanto B, Ojugo AA. Exploring Machine Learning and Deep Learning Techniques for Occluded Face Recognition: A Comprehensive Survey and Comparative Analysis. J Fut Artif Intell Tech. 2024;1(2):160–73.
- 62. Deepika K, Nagenddra MPS, Ganesh MV, Naresh N. Implementation of Credit Card Fraud Detection Using Random Forest Algorithm. IJRASET. 2022;10(3):797–804.
- 63. Onoma PA, Ugbotu EV, Aghaunor TC, Agboi J, Ojugo AA, Odiakaose CC, et al. Voice-based Dynamic Time Warping Recognition Scheme for Enhanced Database Access Security. J Fuzzy Syst Control. 2025;3(1):81–9.
- 64. Onoma PA, Agboi J, Geteloma VO, Max-Egba AT, Eboka AO, Ojugo AA, et al. Investigating an Anomaly-based Intrusion Detection via Tree-based Adaptive Boosting Ensemble. J Fuzzy Syst Control. 2025;3(1):90–7.
- 65. Omoruwou F, Ojugo AA, Ilodigwe SE. Strategic Feature Selection for Enhanced Scorch Prediction in Flexible Polyurethane Form Manufacturing. J Comput Theor Appl. 2024;1(3):346–57.
- 66. Ben Yahia N, Dhiaeddine Kandara M, Bellamine BenSaoud N. Integrating Models and Fusing Data in a Deep Ensemble Learning Method for Predicting Epidemic Diseases Outbreak. Big Data Research. 2022;27:100286.
- 67. Islam N, Farhin F, Sultana I, Shamim Kaiser M, Sazzadur Rahman Md, Mahmud M, et al. Towards Machine Learning Based Intrusion Detection in IoT Networks. Computers, Materials & Continua. 2021;69(2):1801–21.
- 68. Sinayobye O, Musabe R, Uwitonze A, Ngenzi A. A Credit Card Fraud Detection Model Using Machine Learning Methods with a Hybrid of Undersampling and Oversampling for Handling Imbalanced Datasets for High Scores. In: Communications in Computer and Information Science. Springer Nature Switzerland; 2023. p. 142–55. https://doi.org/10.1007/978-3-031-34222-6_12
- 69. Aghware FO, Ojugo AA, Adigwe W, Odiakaose CC, Ojei EO, Ashioba NC, et al. Enhancing the Random Forest Model via Synthetic Minority Oversampling Technique for Credit-Card Fraud Detection. J Comput Theor Appl. 2024;1(4):407–20.
- 70. Setiadi DRIM, Nugroho K, Muslikh AR, Iriananda SW, Ojugo AA. Integrating SMOTE-Tomek and Fusion Learning with XGBoost Meta-Learner for Robust Diabetes Recognition. J Fut Artif Intell Tech. 2024;1(1):23–38.
- 71. Pratama NR, Setiadi DRIM, Harkespan I, Ojugo AA. Feature Fusion with Albumentation for Enhancing Monkeypox Detection Using Deep Learning Models. J Comput Theor Appl. 2025;2(3):427–40.
- 72. N SB, Akki CB. Sentiment Prediction using Enhanced XGBoost and Tailored Random Forest. IJCDS. 2021;10(1):191–9.
- 73. Ojugo AA, Eboka AO. Empirical Bayesian network to improve service delivery and performance dependability on a campus network. IJ-AI. 2021;10(3):623.
- 74. Adityawan HT, Farroq O, Santosa S, Islam HMM, Sarker MK, Setiadi DRIM. Butterflies Recognition using Enhanced Transfer Learning and Data Augmentation. J Comput Theor Appl. 2023;1(2):115–28.
- 75. Akazue MI, Yoro RE, Malasowe BO, Nwankwo O, Ojugo AA. Improved services traceability and management of a food value chain using block-chain network: a case of Nigeria. IJEECS. 2023;29(3):1623.
- 76. Akazue MI, Edje AE, Okpor MD, Adigwe W, Ejeh PO, Odiakaose CC, et al. FiMoDeAL: pilot study on shortest path heuristics in wireless sensor network for fire detection and alert ensemble. Bulletin EEI. 2024;13(5):3534–43.
- 77. De Kimpe L, Walrave M, Hardyns W, Pauwels L, Ponnet K. You’ve got mail! Explaining individual differences in becoming a phishing target. Telematics and Informatics. 2018;35(5):1277–87.
- 78. Amalraj JR, Lourdusamy R. A Novel Distributed Token-Based Access Control Algorithm Using A Secret Sharing Scheme for Secure Data Access Control. IJCNA. 2022;9(4):374.
- 79. Setiadi DRIM, Ojugo AA, Pribadi O, Kartikadarma E, Setyoko BH, Widiono S, et al. Integrating Hybrid Statistical and Unsupervised LSTM-Guided Feature Extraction for Breast Cancer Detection. J Comput Theor Appl. 2025;2(4):536–52.
- 80. Eboka AO, Odiakaose CC, Agboi J, Okpor MD, Onoma PA, Aghaunor TC, et al. Resolving Data Imbalance Using a Bi-Directional Long-Short Term Memory for Enhanced Diabetes Mellitus Detection. J Fut Artif Intell Tech. 2025;2(1):95–109.
- 81. Al-Qudah DA, Al-Zoubi AM, Castillo-Valdivieso PA, Faris H. Sentiment Analysis for e-Payment Service Providers Using Evolutionary eXtreme Gradient Boosting. IEEE Access. 2020;8:189930–44.
- 82. Roshan MKG. Multiclass Medical X-ray Image Classification using Deep Learning with Explainable AI. IJRASET. 2022;10(6):4518–26.
- 83. Ibor A, Edim E, Ojugo A. Secure Health Information System with Blockchain Technology. J Nig Soc Phys Sci. 2023;992.
- 84. Bhati RG. A Survey on Sentiment Analysis Algorithms and Datasets. Review of Computer Engineering Research. 2019;6(2):84–91.
- 85. Lee OV, Heryanto A, Ab Razak MF, Raffei AFM, Eh Phon DN, Kasim S, et al. A malicious URLs detection system using optimization and machine learning classifiers. Indones J Electr Eng Comput Sci. 2020;17(3):1210.
- 86. Malasowe BO, Aghware FO, Okpor MD, Edim BE, Ako RE, Ojugo AA. Techniques and Best Practices for Handling Cybersecurity Risks in Educational Technology Environment (EdTech). J Sci Technol Res. 2024;6(2):293–311.
- 87. Urbanowicz RJ, Meeker M, La Cava W, Olson RS, Moore JH. Relief-based feature selection: Introduction and review. J Biomed Inform. 2018;85:189–203. pmid:30031057
- 88. Yao J, Wang C, Hu C, Huang X. Chinese Spam Detection Using a Hybrid BiGRU-CNN Network with Joint Textual and Phonetic Embedding. Electronics. 2022;11(15):2418.
- 89. Safriandono AN, Setiadi DRIM, Dahlan A, Rahmanti FZ, Wibisono IS, Ojugo AA. Analyzing Quantum Feature Engineering and Balancing Strategies Effect on Liver Disease Classification. J Fut Artif Intell Tech. 2024;1(1):51–63.
- 90. Aghware FO, Adigwe W, Okpor MD, Odiakaose CC, Ojugo AA, Eboka AO, et al. BloFoPASS: A blockchain food palliatives tracer support system for resolving welfare distribution crisis in Nigeria. IJ-ICT. 2024;13(2):178.
- 91. Setiadi DRIM, Sutojo T, Rustad S, Akrom M, Ghosal SK, Nguyen MT, et al. Single Qubit Quantum Logistic-Sine XYZ-Rotation Maps: An Ultra-Wide Range Dynamics for Image Encryption. CMC. 2025;83(2):2161–88.
- 92. Okolo O, Baha BY, Philemon MD. Using Causal Graph Model variable selection for BERT models Prediction of Patient Survival in a Clinical Text Discharge Dataset. J Fut Artif Intell Tech. 2025;1(4):455–73.
- 93. Adebayo PO, Basaky F, Osaghae E. Leveraging Variational Quantum-Classical Algorithms for Enhanced Lung Cancer Prediction. J Comput Theor Appl. 2024;2(3):307–23.
- 94. Odiakaose CC, Anazia KE, Okpor MD, Ako RE, Aghaunor TC, Ugbotu EV, et al. Enhanced behavioural risk detection in cervical cancer using bi-directional gated recurrent unit: A pilot study. NIPES - J Sci Technol Res. 2025;7(1):192–203.
- 95. Çetin A, Öztürk S. Comprehensive Exploration of Ensemble Machine Learning Techniques for IoT Cybersecurity Across Multi-Class and Binary Classification Tasks. J Fut Artif Intell Tech. 2025;1(4):371–84.
- 96. San KK, Win HH, Chaw KEE. Enhancing Hybrid Course Recommendation with Weighted Voting Ensemble Learning. J Fut Artif Intell Tech. 2025;1(4):337–47.
- 97. Jiang H, Liu A, Ying Z. Identification of texture MRI brain abnormalities on Fibromyalgia syndrome using interpretable machine learning models. Sci Rep. 2024;14(1):23525. pmid:39384824
- 98. Okpor MD, Aghware FO, Akazue MI, Ojugo AA, Emordi FU, Odiakaose CC, et al. Comparative Data Resample to Predict Subscription Services Attrition Using Tree-based Ensembles. J Fuzzy Syst Control. 2024;2(2):117–28.
- 99. Ojugo AA, Eboka AO, Yoro RE, Yerokun MO, Efozia FN. Hybrid Model for Early Diabetes Diagnosis. In: 2015 Second International Conference on Mathematics and Computers in Sciences and in Industry (MCSI), 2015. 55–65. https://doi.org/10.1109/mcsi.2015.35
- 100. Ma T, Wang F, Cheng J, Yu Y, Chen X. A Hybrid Spectral Clustering and Deep Neural Network Ensemble Algorithm for Intrusion Detection in Sensor Networks. Sensors (Basel). 2016;16(10):1701. pmid:27754380
- 101. Ma C, Wang H, Hoi SCH. Multi-label thoracic disease image classification with cross-attention networks. Singaporean J Radiol. 2020;21:1–9.
- 102. Jovic A, Brkic K, Bogunovic N. A review of feature selection methods with applications. In: 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015. 1200–5. https://doi.org/10.1109/mipro.2015.7160458
- 103. Aghaunor TC, Omede EU, Ugbotu EV, Agboi J, Onochie CC, Max-Egba AT, et al. Enhanced scorch occurrence prediction in foam production via a fusion SMOTE-Tomek balanced deep learning scheme. NIPES - J Sci Technol Res. 2025;7(2):330–9.
- 104. Salau AO, Beyene MM. Software defined networking based network traffic classification using machine learning techniques. Sci Rep. 2024;14(1):20060. pmid:39209938
- 105. Wang K, Fu Y, Duan X, Liu T. Detection and mitigation of DDoS attacks based on multi-dimensional characteristics in SDN. Sci Rep. 2024;14(1):16421. pmid:39014041
- 106. Ataa MS, Sanad EE, El-Khoribi RA. Intrusion detection in software defined network using deep learning approaches. Sci Rep. 2024;14(1):29159. pmid:39587182
- 107. Kumar CL, Betam S, Pustokhin D, Laxmi Lydia E, Bala K, Aluvalu R, et al. Metaparameter optimized hybrid deep learning model for next generation cybersecurity in software defined networking environment. Scientific Reports. 2025;15(1).