Abstract
The rapid growth of digital payments exacerbates the challenges in Financial Transaction Fraud Detection (FTFD). These challenges stem primarily from an extreme class imbalance, where legitimate transactions greatly outnumber fraudulent ones. This imbalance significantly hampers the ability of FTFD models to accurately learn fraud patterns. Although existing data augmentation techniques have shown effectiveness in alleviating this problem, they are often negatively influenced by anomalous samples that diverge from the true fraud distribution due to fraudsters’ concealment strategies and the inherent complexity of fraudulent patterns. This divergence makes it challenging to accurately model the distribution of fraudulent activities. In this work, we propose a Boundary-Aware Dual-discriminator Generative Adversarial Network (BADGAN) to address the class imbalance issue in FTFD. BADGAN integrates a boundary sample classifier with a dual-constraint mechanism based on distance adversarial learning, allowing the generator to produce synthetic samples that both adhere to the distribution of real fraud data and maintain a distance from the decision boundary. This boundary-aware design emphasizes the optimization of sample quality near classification boundaries, thereby improving the downstream classifier’s ability to distinguish fraudulent behavior. Extensive experiments on both real-world and public datasets demonstrate that BADGAN outperforms its competitive peers in addressing the class imbalance issue, thereby enhancing the detection performance of FTFD models.
Citation: Zhu H, Wang Z, Xie Y, Yao J (2026) Boundary-aware dual-discriminator generative adversarial network for data augmentation in financial transaction fraud detection. PLoS One 21(2): e0342095. https://doi.org/10.1371/journal.pone.0342095
Editor: Sameena Naaz, University of Roehampton - Digby Stuart College, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Received: September 28, 2025; Accepted: January 16, 2026; Published: February 20, 2026
Copyright: © 2026 Zhu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript.
Funding: This work is supported in part by the Major Project of Natural Sciences of the Scientific Research Program of Anhui Provincial Department of Education under Grant 2025AHGXZK20090; in part by the Peak Disciplines at Bengbu University under Grant 2025GFXK01; in part by the Bengbu University AI Technology Application Center under Grant 2025BBXYkypt01 (received by Honghao Zhu); in part by the National Natural Science Foundation of China under Grant 62502299; and in part by the Natural Science Foundation of Shanghai under Grant 24ZR1427500 (received by Yu Xie).
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
The rapid proliferation of e-commerce and digital payment systems significantly transforms the landscape of global financial transactions. This transformation is further accelerated by the COVID-19 pandemic, solidifying online shopping as the dominant retail model. While these advancements substantially improve economic accessibility and consumer convenience, they also expand the attack surface, thereby enabling increasingly sophisticated financial fraud schemes [1,2]. The evolution of fraud tactics, combined with the increasing volume of digital financial activities, heightens the demand for robust Financial Transaction Fraud Detection (FTFD) systems. These systems must accurately identify fraudulent transactions while ensuring that legitimate financial flows remain uninterrupted and the user experience is not compromised. This presents a fundamental challenge, as traditional fraud detection methods, which are often reactive, struggle to keep pace with the evolving landscape of fraud.
Recent advancements in FTFD employ machine learning-based classifiers to differentiate between legitimate and fraudulent transactions [3]. While they demonstrate notable performance in controlled environments, their real-world deployment is limited by the inherent class imbalance in financial transaction datasets. Specifically, legitimate transactions overwhelmingly dominate the data, often accounting for more than 95% of all records, while fraudulent transactions represent only a small fraction [4]. This extreme class imbalance creates a significant bias in model training, causing classifiers to predominantly learn features associated with the majority class (legitimate transactions). As a result, the ability to detect the minority class (fraudulent transactions) is severely impaired, leading to poor generalization and an increased rate of false negatives [5]. Addressing this class imbalance constitutes a central focus of research in FTFD system development, with important implications for enhancing fraud detection accuracy and reliability in practical settings.
To mitigate the detrimental effects of class imbalance, various data augmentation techniques are proposed, including oversampling, undersampling, and synthetic data generation approaches [6,7]. Among these, Generative Adversarial Networks (GANs) [8] attract significant attention for their capacity to generate synthetic samples that replicate the distribution of fraudulent transactions and preserve data authenticity. GANs demonstrate substantial potential in capturing complex patterns within imbalanced datasets by modeling the underlying distribution of fraudulent activities. However, as fraud detection systems advance, fraudsters continuously adapt and refine their tactics. Modern fraudsters frequently exploit synthetic identities and adversarial machine learning techniques to obscure the decision boundaries between legitimate and fraudulent transactions [9,10]. This ongoing technological arms race exposes a fundamental vulnerability in conventional GAN-based augmentation methods. The same adversarial mechanisms that enable GANs to generate realistic samples also predispose them to incorporate deceptive patterns designed to mislead classifiers [11]. Consequently, GAN-based data augmentation techniques may fail to capture emerging fraud patterns or, more critically, inadvertently generate adversarial noise that compromises the performance of fraud detection models. The core challenge lies in GANs’ inability to reliably distinguish between authentic fraud patterns and adversarial perturbations, leading to suboptimal performance in detecting novel and sophisticated fraudulent activities.
To overcome these limitations and advance the state of the art in FTFD, we propose a novel Boundary-Aware Dual-discriminator Generative Adversarial Network (BADGAN) to address the class imbalance issue. Unlike traditional GANs, which may generate samples near decision boundaries, BADGAN alleviates this issue by incorporating a boundary sample discriminator and boundary-distance constraints. These mechanisms ensure that the synthetic fraud samples not only replicate the distribution of authentic fraud patterns but also remain sufficiently distant from the decision boundaries, thereby reducing the risk of misclassification. By embedding these boundary-aware mechanisms, BADGAN mitigates the feature overlap commonly observed in GAN-based augmentation, enabling the generation of more representative synthetic data. This, in turn, enhances the ability of FTFD classifiers to accurately distinguish fraudulent transactions from legitimate ones, thereby improving detection performance in real-world scenarios.
The main contributions of this work to FTFD are summarized as follows:
- It proposes a novel boundary-aware dual-discriminator generative adversarial network that generates synthetic fraud samples by authentically replicating fraud patterns while explicitly avoiding decision boundary regions. This approach addresses class imbalance and enhances the quality of training data for fraud detection classifiers.
- It introduces a two-stage optimization strategy, where the real sample discriminator ensures conformity to fraud patterns, and the boundary sample discriminator enforces spatial separation from decision boundaries. This allows the generator to simultaneously learn fraud features and boundary integrity, producing synthetic samples that are both representative and discriminative.
- It demonstrates the effectiveness of BADGAN through extensive experiments on real-world and public datasets. The results show that BADGAN enhances fraud detection performance by generating high-quality synthetic samples that effectively address class imbalance, thereby improving both detection accuracy and reliability in FTFD.
The remainder of this paper is structured as follows. Sect 2 reviews the related work. Sect 3 details the proposed methods. Sects 4 and 5 present the experimental setup and results, respectively. Sect 6 concludes this paper.
2 Related work
Class imbalance represents a pervasive challenge in machine learning, particularly in FTFD, where fraudulent transactions constitute only a small fraction of the overall dataset. This pronounced class imbalance induces systematic bias in learning algorithms, favoring the majority class (legitimate transactions) and impairing the model’s capacity to effectively detect fraudulent activities [12]. To mitigate this issue, existing methods are generally classified into two main strategies: data-level and model-level approaches [13].
2.1 Data-level methods
Data-level methods aim to address class imbalance by modifying the training dataset to improve its representation of the minority class [14]. These techniques generally include oversampling, undersampling, hybrid sampling, and generative modeling approaches [15].
Oversampling methods focus on increasing the representation of the minority class by generating synthetic samples. A widely adopted approach is the Synthetic Minority Oversampling Technique (SMOTE) [16], which generates synthetic samples by interpolating between existing minority instances and their nearest neighbors. Extensions of SMOTE, such as Borderline-SMOTE [17] and ADASYN [18], enhance sample diversity by concentrating on the ambiguous boundary regions or sparsely populated areas of the minority class. Despite their widespread adoption, these interpolation-based methods face notable limitations in the context of FTFD. Due to their reliance on interpolation, they often fail to capture the intricate and sometimes unique patterns inherent in fraudulent transactions. Additionally, these methods may generate synthetic samples in regions where fraudsters intentionally obfuscate or manipulate transaction features, thereby failing to reflect the true complexity of fraudulent behaviors [19]. Furthermore, these techniques are vulnerable to adversarial patterns crafted to exploit their interpolation logic, resulting in synthetic samples that fail to faithfully represent real-world fraud and consequently degrade the detection performance of FTFD systems [20].
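The interpolation at the heart of SMOTE can be illustrated in a few lines. The sketch below is ours, not the reference implementation: for each synthetic point it picks a random minority instance, one of its k nearest minority neighbors, and interpolates between them.

```python
import numpy as np

def smote_sample(X_min, k=5, n_new=10, rng=None):
    """SMOTE-style oversampling sketch: for each synthetic sample, pick a
    random minority instance x_i and one of its k nearest minority
    neighbours x_nn, then interpolate x_new = x_i + lam * (x_nn - x_i)
    with lam drawn uniformly from [0, 1)."""
    rng = np.random.default_rng(rng)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude each point itself
    nn_idx = np.argsort(d, axis=1)[:, :k]  # indices of k nearest neighbours
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = nn_idx[i, rng.integers(k)]
        lam = rng.random()
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                  [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]])
X_new = smote_sample(X_min, k=3, n_new=5, rng=0)
```

Because every synthetic point lies on a segment between two existing minority points, the method cannot generate samples outside the minority class's convex hull, which is one source of the limitations discussed above.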
Undersampling methods are designed to address class imbalance by reducing the number of majority class samples. Techniques such as Tomek Links (TL) [21] and Cluster Centroids (CC) [22] are commonly employed. TL removes overlapping instances near the decision boundary, while CC selects representative majority class samples based on clustering. While undersampling can effectively alleviate class imbalance, it risks discarding informative majority class instances that may contain critical patterns relevant to modeling fraudulent behaviors. Such a reduction in majority class samples may impair the model’s generalization ability and ultimately degrade fraud detection performance in real-world financial transaction settings [23].
Hybrid sampling methods integrate oversampling and undersampling techniques to construct more balanced and representative datasets. Representative approaches include SMOTE with Tomek Links (SMTL) [24] and SMOTE with Edited Nearest Neighbors (SMEN) [25], which enhance the representation of the minority class while preserving the overall data distribution. However, these approaches may fail to capture the complex, domain-specific characteristics of fraudulent transactions. Moreover, the excessive generation of synthetic samples can introduce noise into the dataset, further diminishing model generalizability and compromising the performance of fraud detection in practical FTFD applications [26].
2.2 Model-level methods
Model-level methods focus on enhancing a classifier’s ability to learn from imbalanced datasets without altering the underlying data distribution [27,28].
Cost-sensitive learning methods aim to address class imbalance by assigning higher penalties to the misclassification of fraudulent transactions, thus encouraging models to prioritize the detection of minority-class instances [29]. However, due to the dynamic nature of fraud, where fraudsters often mimic normal transaction patterns or adapt their strategies in response to evolving detection mechanisms [30], the static penalty structures inherent in cost-sensitive approaches are frequently inadequate. This limitation may lead to performance degradation over time, as these models struggle to accommodate the continuously evolving tactics employed by fraudsters [31].
Ensemble learning methods, such as Boosting (e.g., AdaBoost [32]) and Bagging (e.g., Random Forests [33]), improve model robustness by combining multiple classifiers. The former enhances sensitivity to rare fraud patterns by iteratively reweighting misclassified instances, while the latter reduces variance through the aggregation of predictions. Despite these advantages, ensemble methods remain vulnerable to manipulation. Fraudsters may exploit subtle behavioral variations or introduce noise to deceive base classifiers, thereby undermining the ensemble’s overall effectiveness [34,35]. Additionally, the substantial computational cost of ensemble methods limits their real-time applicability, a crucial consideration for practical fraud detection systems [36].
GANs gain increasing attention for augmenting minority class instances in fraud detection tasks. However, they often struggle with boundary discrimination [37]. The adversarial training process can lead to the generation of synthetic samples that cluster near decision boundaries or, more problematically, within dense regions of legitimate transactions. This limitation arises because the generator’s optimization objective emphasizes sample realism, often at the expense of preserving subtle yet critical discriminative features that differentiate fraudulent transactions from legitimate ones [38]. Enhanced architectures, such as Wasserstein GAN (WGAN) [39] and Roulette-wheel selection GAN (RGAN) [40], are proposed to address training instability. However, these advancements do not fundamentally resolve the issue of boundary confusion. Consequently, the synthetic samples generated by these models often exhibit feature distributions that overlap significantly with legitimate transactions, thereby misleading classifiers rather than improving their discriminative ability. This challenge persists because conventional GAN frameworks lack explicit mechanisms to enforce separation between class manifolds in the generated feature space.
Two categories of adversarial attacks are particularly prominent in recent AI security research: evasion attacks and model-poisoning attacks. Evasion attacks involve manipulating inputs during the inference stage to mislead the model’s predictions, whereas model-poisoning attacks intervene during training by injecting adversarial samples that alter the learned model parameters. Both attack types pose serious risks to the reliability and security of machine learning models, especially in sensitive domains such as fraud detection.
Therefore, the current data augmentation and model-level methods in FTFD exhibit inherent limitations in addressing the non-stationary nature of fraud distributions and the deliberate obfuscation of fraudulent patterns. A significant weakness lies in the inability of these methods to distinguish between genuine boundary ambiguity and fraudsters’ intentional attempts to obscure their activities, resulting in synthetic samples that oversimplify or distort the true characteristics of fraudulent transactions. Most traditional techniques assume a static distribution of fraud, overlooking the adaptive strategies employed by fraudsters to blur class boundaries and mimic legitimate behavior [41,42]. As a result, models trained on such synthetic data may perform well on historical fraud patterns but struggle to detect more sophisticated fraud schemes that exploit these weaknesses. To overcome these challenges, future approaches must not only preserve the underlying structure of fraudulent transactions but also explicitly counteract adversarial obfuscation during sample generation.
3 Proposed method
The increasing sophistication of fraudulent tactics highlights fundamental limitations in existing GAN-based augmentation methods. Their unimodal discrimination is insufficient to address the dual challenges of FTFD: preserving fidelity to authentic fraud distributions while effectively navigating regions of boundary ambiguity. To overcome these challenges, we propose BADGAN, an enhanced GAN framework that integrates a boundary sample discriminator and distance-adversarial learning, enabling the generator to produce high-quality samples that not only align with genuine fraud distributions but also maintain a significant distance from decision boundaries.
To enhance boundary discrimination, we incorporate Borderline-SMOTE [17] to identify and process boundary samples from the minority class, as illustrated in Fig 1. The choice of Borderline-SMOTE is motivated by its ability to focus specifically on instances that lie near the decision boundary, where fraudulent patterns are most prone to ambiguity. This targeted focus on boundary samples allows Borderline-SMOTE to generate more discriminative synthetic data, which is essential for capturing the subtle nuances of fraud. The minority class samples are categorized into three types:
- A (“safe”): Instances for which more than half of the nearest neighbors belong to the minority class, typically corresponding to genuine fraudulent transaction patterns.
- B (“dangerous”): Instances that are predominantly surrounded by majority-class neighbors, often reflecting fraudsters’ attempts to obscure fraudulent behavior and thereby increasing the likelihood of classifier confusion.
- C (“noise”): Instances that are entirely surrounded by majority-class samples, generally indicating anomalies arising from data collection or labeling processes.
In our preprocessing pipeline, Borderline-SMOTE is employed to strategically generate synthetic boundary samples by interpolating around high-risk instances. It focuses on the critical boundary region, where fraudulent patterns are most susceptible to ambiguity. By generating synthetic samples in these regions, Borderline-SMOTE effectively expands the coverage of boundary characteristics while preserving their essential features. These synthesized samples are subsequently used as specialized training data for the boundary sample classifier and are labeled as boundary fraudulent samples, thereby explicitly reflecting their role in enhancing boundary discrimination.
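The three-way categorization above can be sketched in NumPy. This is a simplified illustration; the function name and threshold handling are ours, following the neighbor counts described for the "safe", "dangerous", and "noise" types:

```python
import numpy as np

def categorize_minority(X, y, k=5):
    """Label each minority sample (y == 1) as 'safe', 'dangerous', or
    'noise' by counting majority-class instances among its k nearest
    neighbours, as in Borderline-SMOTE's categorization."""
    labels = {}
    for i in np.where(y == 1)[0]:
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the point itself
        nn = np.argsort(d)[:k]
        n_maj = int(np.sum(y[nn] == 0))    # majority neighbours
        if n_maj == k:
            labels[i] = "noise"            # entirely surrounded by majority
        elif n_maj > k / 2:
            labels[i] = "dangerous"        # predominantly majority neighbours
        else:
            labels[i] = "safe"             # mostly minority neighbours
    return labels

# Tiny synthetic example: a minority cluster, a minority point inside the
# majority cluster (noise), and one on the cluster edge (dangerous).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [5.0, 5.05],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
              [4.8, 5.0]])
y = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 1])
labels = categorize_minority(X, y, k=3)
```

Borderline-SMOTE then interpolates only around the "dangerous" instances, which is what concentrates the synthetic boundary samples in the ambiguous region.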
As shown in Fig 2, BADGAN employs a dual-discriminator architecture, with each discriminator performing a distinct yet complementary function. The real sample discriminator Dr is trained on both genuine fraudulent samples and generator outputs, guiding the generator to produce synthetic samples that not only resemble authentic fraudulent behavior but also capture the intricate patterns characteristic of fraud. This ensures that the generated samples align closely with the true nature of fraudulent transactions. Simultaneously, the boundary sample discriminator Db operates on boundary fraudulent samples—instances that are particularly prone to misclassification due to their proximity to the decision boundary—along with generator outputs. Db plays a critical role in penalizing the generator when synthetic samples overly align with boundary features, thereby preventing the generation of ambiguous samples. Instead, it encourages the production of samples with clear and identifiable fraudulent traits, which improves both the authenticity of the synthetic data and the clarity of classification boundaries.
This dual-discriminator framework ensures that BADGAN generates synthetic samples that both improve the overall quality of fraud detection and reduce the impact of boundary ambiguity. By focusing on boundary regions, we mitigate the risk of generating misleading samples that could hinder classifier performance. Furthermore, this approach enhances the robustness of the generated fraud patterns, making them more suitable for downstream fraud detection tasks. In doing so, BADGAN effectively balances fidelity to real-world fraud distributions with the avoidance of regions where fraud patterns overlap with legitimate data, thus optimizing the model’s overall FTFD performance.
The adversarial optimization objectives for Dr and Db are defined as follows:

$$\mathcal{L}_{D_r} = \mathbb{E}_{x_f \sim p_f}\left[\log D_r(x_f)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D_r(G(z))\right)\right]$$

$$\mathcal{L}_{D_b} = \mathbb{E}_{x_b \sim p_b}\left[\log D_b(x_b)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D_b(G(z))\right)\right]$$

where $x_f$, $x_b$, and $z$ denote real fraudulent samples, boundary fraudulent samples, and random noise, respectively. $D_r(G(z))$ and $D_r(x_f)$ respectively denote the probabilities assigned to synthetic and real fraudulent samples, ranging from 0 to 1. $D_b$ similarly aims to distinguish synthetic samples $G(z)$ from boundary samples $x_b$. During training, both discriminators strive to maximize their respective objectives by increasing $D_r(x_f)$ and $D_b(x_b)$ while reducing $D_r(G(z))$ and $D_b(G(z))$.
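Each discriminator objective is the standard binary cross-entropy value of a GAN critic. A minimal NumPy sketch (the function name is ours) makes the maximization direction concrete:

```python
import numpy as np

def adv_objective(d_real, d_fake):
    """Batch value of L_D = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator probabilities on real (or boundary) samples.
    d_fake: discriminator probabilities on generated samples.
    The discriminator ascends this objective; it grows as
    d_real -> 1 and d_fake -> 0.
    """
    eps = 1e-12  # numerical guard against log(0)
    return float(np.mean(np.log(d_real + eps))
                 + np.mean(np.log(1.0 - d_fake + eps)))

# A confident discriminator scores higher than an undecided one.
good = adv_objective(np.array([0.9, 0.95]), np.array([0.05, 0.1]))
poor = adv_objective(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The same function covers both discriminators: Dr is evaluated on real fraud versus generated batches, Db on boundary versus generated batches.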
Inspired by the adversarial training paradigm of GANs, the generator’s optimization objective is designed to synthesize samples that successfully deceive the fraud discriminator Dr while preserving the critical fraudulent characteristics. Additionally, the generator is explicitly guided to avoid producing samples that resemble boundary patterns, further ensuring that synthetic transactions are not misclassified as legitimate. This comprehensive adversarial optimization objective ensures that the generator produces high-quality, realistic fraudulent data that is useful for training robust fraud detection models.
To further enhance the discriminative capacity of the dual-discriminator framework, we introduce a distance-adversarial learning mechanism that explicitly improves the separation between fraudulent transactions and boundary fraudulent samples in the feature space. This is achieved by designing specific loss functions that guide the discriminators in distinguishing between clearly identifiable fraudulent transactions and those at risk of misclassification owing to their proximity to the decision boundary. In addition to outputting adversarial probabilities $D_r(x)$ and $D_b(x)$, the discriminators Dr and Db also provide boundary-proximity scores, denoted as $s_r(x)$ and $s_b(x)$. A value closer to 1 indicates that the input sample is considered farther from the decision boundary, while a value approaching 0 suggests that the sample is near the boundary. The distance optimization objectives for Dr and Db are formulated as follows:

$$\mathcal{L}_{dist}^{b} = \mathbb{E}_{x_b \sim p_b}\left[s_b(x_b)^2\right]$$

$$\mathcal{L}_{dist}^{r} = \mathbb{E}_{x_f \sim p_f}\left[\left(1 - s_r(x_f)\right)^2\right] + \mathbb{E}_{z \sim p_z}\left[\left(1 - s_r(G(z))\right)^2\right]$$

For boundary samples, the objective is $s_b(x_b) \to 0$. To achieve this, we employ the loss function $\mathcal{L}_{dist}^{b}$, which drives the distance score of boundary samples toward zero, thereby encouraging them to remain close to the decision boundary. Conversely, for non-boundary samples, we aim to enforce $s_r(x_f) \to 1$ and $s_r(G(z)) \to 1$, using the loss function $\mathcal{L}_{dist}^{r}$ to encourage these samples to be positioned farther from the decision boundary. This design encourages the generator to synthesize samples with well-separated class characteristics, thereby minimizing the influence of ambiguous boundary samples on the training process.
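Assuming mean-squared penalties for the two score targets, the distance losses can be sketched as follows (helper names are ours):

```python
import numpy as np

def boundary_distance_loss(s_boundary):
    """Drive boundary-proximity scores of boundary samples toward 0,
    i.e. L_dist^b = mean(s_b(x_b)^2)."""
    return float(np.mean(s_boundary ** 2))

def separation_distance_loss(s_real, s_fake):
    """Drive scores of real fraud and generated samples toward 1
    (far from the boundary), i.e. mean((1 - s)^2) over both batches."""
    return float(np.mean((1.0 - s_real) ** 2)
                 + np.mean((1.0 - s_fake) ** 2))

# Boundary samples scored near 0 incur little loss...
low = boundary_distance_loss(np.array([0.05, 0.1]))
# ...while boundary samples scored as "far" are penalized.
high = boundary_distance_loss(np.array([0.9, 0.95]))
```

Minimizing these terms alongside the adversarial objectives is what pushes generated samples away from the ambiguous boundary region while keeping boundary training samples anchored to it.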
By incorporating the distance loss, the generated samples are explicitly encouraged to move away from the decision boundary, reducing the likelihood of misclassification in ambiguous regions. The overall optimization objectives for Dr, Db, and G are defined as follows:

$$\mathcal{L}_{D_r}^{total} = \mathcal{L}_{D_r} - \lambda \mathcal{L}_{dist}^{r}$$

$$\mathcal{L}_{D_b}^{total} = \mathcal{L}_{D_b} - \lambda \mathcal{L}_{dist}^{b}$$

$$\mathcal{L}_{G} = \mathbb{E}_{z \sim p_z}\left[\log D_r(G(z)) + \log\left(1 - D_b(G(z))\right)\right] - \lambda \mathbb{E}_{z \sim p_z}\left[\left(1 - s_r(G(z))\right)^2\right]$$

where $\lambda > 0$ balances the adversarial and distance terms; Dr and Db ascend their respective objectives, while G ascends $\mathcal{L}_G$.
As shown in Fig 2, both Dr and Db output probabilities in the range of 0 to 1 and are jointly optimized together with the generator. BADGAN improves the quality of generated samples and their compatibility with downstream classifiers by introducing a boundary-aware mechanism. Specifically, Db models the decision boundary region explicitly, compelling the generator to produce fraudulent samples that avoid ambiguous boundary regions. The distance loss further ensures that the generated samples align with the underlying distribution of real fraudulent transactions in the feature space.
This dual optimization framework ensures that the generated samples preserve the dynamic characteristics of genuine fraud—such as sudden high-frequency transactions and abnormal fluctuations in transaction amounts—while simultaneously enabling higher classification confidence. As a result, the proposed augmentation method effectively mitigates class imbalance in FTFD, enabling downstream models to achieve superior detection performance even in low-supervision settings. The training process for BADGAN is outlined in Algorithm 1. The time complexity of the algorithm is $O(n_e \cdot k \cdot n_b \cdot d)$, where $n_e$, $k$, $n_b$, and $d$ denote the number of training epochs, iterations per epoch, batch size, and features per sample, respectively.
Algorithm 1 Training algorithm of BADGAN.
Input: Fraudulent sample distribution $p_f$, boundary sample distribution $p_b$, noise distribution $p_z$, generator G with parameters $\theta_g$, discriminators Dr, Db with parameters $\theta_{d_r}$, $\theta_{d_b}$, training epochs $n_e$, batch size $n_b$
Output: Trained BADGAN model
1: Initialize G, Dr, Db
2: for epoch = 1 to $n_e$ do
3:  for k steps do
4:   Sample batch $\{x_f^{(1)}, \ldots, x_f^{(n_b)}\} \sim p_f$ {Fraudulent samples}
5:   Sample batch $\{x_b^{(1)}, \ldots, x_b^{(n_b)}\} \sim p_b$ {Boundary samples}
6:   Sample batch $\{z^{(1)}, \ldots, z^{(n_b)}\} \sim p_z$ {Noise vectors}
7:   Update Dr by ascending its stochastic gradient:
8:    $\nabla_{\theta_{d_r}} \frac{1}{n_b} \sum_{i=1}^{n_b} \Big[\log D_r(x_f^{(i)}) + \log\big(1 - D_r(G(z^{(i)}))\big)$
9:     $- \lambda\big(\big(1 - s_r(x_f^{(i)})\big)^2 + \big(1 - s_r(G(z^{(i)}))\big)^2\big)\Big]$
10:  Update Db by ascending its stochastic gradient:
11:   $\nabla_{\theta_{d_b}} \frac{1}{n_b} \sum_{i=1}^{n_b} \Big[\log D_b(x_b^{(i)}) + \log\big(1 - D_b(G(z^{(i)}))\big)$
12:    $- \lambda s_b(x_b^{(i)})^2\Big]$
13:  end for
14:  Sample batch $\{z^{(1)}, \ldots, z^{(n_b)}\} \sim p_z$
15:  Update G by ascending its stochastic gradient:
16:   $\nabla_{\theta_g} \frac{1}{n_b} \sum_{i=1}^{n_b} \Big[\log D_r(G(z^{(i)})) + \log\big(1 - D_b(G(z^{(i)}))\big)$
17:    $- \lambda\big(1 - s_r(G(z^{(i)}))\big)^2\Big]$
18: end for
19: Return trained G, Dr, Db
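The alternating schedule of Algorithm 1 maps onto the following training skeleton. This is a structural sketch, not the authors' implementation; the sampler and update callables (and all names) are ours, standing in for the actual networks and gradient steps:

```python
import numpy as np

def train_badgan(sample_fraud, sample_boundary, sample_noise,
                 update_dr, update_db, update_g,
                 n_epochs, k_steps, batch_size):
    """Outer loop of Algorithm 1: k discriminator steps per epoch,
    followed by one generator step, mirroring the alternating schedule
    that keeps the generator and discriminators in equilibrium."""
    for _ in range(n_epochs):
        for _ in range(k_steps):
            x_f = sample_fraud(batch_size)     # real fraudulent batch
            x_b = sample_boundary(batch_size)  # boundary fraudulent batch
            z = sample_noise(batch_size)       # noise batch
            update_dr(x_f, z)                  # ascend D_r's objective
            update_db(x_b, z)                  # ascend D_b's objective
        z = sample_noise(batch_size)
        update_g(z)                            # ascend the generator's objective

# Stub updates that just count calls, to verify the schedule.
calls = {"dr": 0, "db": 0, "g": 0}
rng = np.random.default_rng(0)
train_badgan(
    sample_fraud=lambda n: rng.normal(size=(n, 4)),
    sample_boundary=lambda n: rng.normal(size=(n, 4)),
    sample_noise=lambda n: rng.normal(size=(n, 8)),
    update_dr=lambda x, z: calls.__setitem__("dr", calls["dr"] + 1),
    update_db=lambda x, z: calls.__setitem__("db", calls["db"] + 1),
    update_g=lambda z: calls.__setitem__("g", calls["g"] + 1),
    n_epochs=3, k_steps=2, batch_size=16,
)
```

Running k discriminator updates per generator update is the standard GAN schedule; the paper additionally follows each discriminator round with an extra generator round to keep the generator from being suppressed (Sect 4.4).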
4 Experimental setup
4.1 Datasets
We obtain a comprehensive transaction dataset from a leading Chinese financial institution, comprising 5.12 million records involving 107,192 distinct clients. The dataset exhibits a significant class imbalance, with fraudulent transactions (labeled as “1”) representing only a small fraction relative to legitimate transactions (labeled as “0”). To ensure a robust evaluation of model performance, we employ a time-based partitioning strategy that preserves chronological order. Specifically, transactions from January are used for training, while transactions from February serve as the first test set. This partitioning scheme is extended to create six consecutive monthly evaluation phases, concluding in June. As summarized in Table 1, this temporal partitioning preserves the evolution of transaction patterns and prevents data leakage between training and testing phases, thereby ensuring the integrity of the evaluation process.
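The time-based partitioning can be sketched as follows. This is a simplified illustration with hypothetical records; the record layout and function name are ours:

```python
from datetime import date

def monthly_splits(transactions, train_month, test_months):
    """Split (date, features, label) records chronologically: one training
    month followed by consecutive monthly test sets, so no future
    information leaks into training."""
    train = [t for t in transactions if t[0].month == train_month]
    tests = [[t for t in transactions if t[0].month == m]
             for m in test_months]
    return train, tests

# Hypothetical transactions: label 1 marks fraud, 0 marks legitimate.
txns = [
    (date(2024, 1, 5), {"amount": 120.0}, 0),
    (date(2024, 1, 19), {"amount": 999.0}, 1),
    (date(2024, 2, 2), {"amount": 75.0}, 0),
    (date(2024, 3, 8), {"amount": 430.0}, 0),
]
train, tests = monthly_splits(txns, train_month=1, test_months=[2, 3])
```

Filtering by calendar month rather than random shuffling is what preserves the chronological ordering the evaluation protocol depends on.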
4.2 Benchmarks
To comprehensively evaluate the performance of BADGAN, we benchmark it against ten competitive peers for addressing class imbalance. This allows us to rigorously assess the effectiveness, strengths, and limitations of BADGAN across various datasets and evaluation metrics. Our comparison not only validates the model’s capability in handling class imbalance but also highlights its relative advantages over existing solutions in fraud detection.
- SMOTE [16] synthesizes new minority class samples by interpolating between existing instances and their nearest neighbors. This enhances sample diversity and supports more robust classifier decision boundaries.
- SMOTE with Edited Nearest Neighbors (SMEN) [25] combines SMOTE’s oversampling technique with Edited Nearest Neighbors (ENN) data refinement. This approach generates synthetic samples while removing mislabeled or ambiguous majority instances to improve data quality.
- SMOTE with Tomek Links (SMTL) [24] first applies SMOTE to enrich the minority class representation, followed by Tomek Links to eliminate closely spaced majority class instances. This combination sharpens class separation and enhances dataset structure.
- ADASYN [18] generates synthetic samples adaptively, focusing on harder-to-learn minority instances. By leveraging local data density, ADASYN tailors sample creation to areas where classifiers need the most support.
- Cluster Centroids (CC) [22] clusters majority class samples and retains the centroids of these clusters as representative samples. This reduces the number of majority class samples while preserving the overall data distribution and minimizing loss of minority information.
- Tomek Links (TL) [21] identifies and removes pairs of samples that are close neighbors but belong to different classes. This process reduces class overlap and enhances boundary clarity, aiding in the construction of more discriminative models.
- VGAN [8] introduces a generator-discriminator framework for data synthesis. By iteratively refining the generator’s output through adversarial feedback, the model generates realistic synthetic samples that reflect the true data distribution.
- Semi-supervised GAN (SGAN) [43] enhances fraud detection by combining semi-supervised adversarial training with an anomaly density-guided selection process. SGAN targets high-risk fraudulent patterns and incorporates a behavioral deviation penalty to minimize overlap with legitimate transactions, generating more discriminative synthetic fraud samples for model training.
- Roulette-Wheel-Selection-based GAN (RGAN) [40], published in 2023, introduces a targeted sampling strategy to emphasize regions of inter-class overlap, guiding the generator to capture finer fraud characteristics and improve the quality of synthetic minority data near decision boundaries.
- Density-based Wasserstein Generative Adversarial Network (DWGAN) [44], published in 2025, enhances fraud sample generation by combining Wasserstein adversarial training with density-guided sample selection. This method emphasizes representative fraudulent regions and incorporates penalties to reduce overlap with legitimate behaviors, producing more distinctive and high-quality synthetic samples.
To streamline subsequent comparisons, we adopt the following notations: SMEN, SMTL, ADASYN, SMOTE, TL, CC, SGAN, RGAN, DWGAN, VGAN, and BADGAN are denoted as M1, M2, M3, M4, M5, M6, M7, M8, M9, M10, and M11, respectively. These abbreviations are applied consistently across all tables, figures, and discussions to enhance readability while ensuring methodological clarity.
4.3 Evaluation criteria
To evaluate the performance of BADGAN in FTFD, we utilize two widely adopted metrics: F1-Score (F1) and Geometric Mean (Gm). These metrics are particularly suitable for binary classification problems, such as fraud detection, where the minority class—fraudulent transactions—requires special attention due to its significant underrepresentation [28,45]. Both metrics are derived from the confusion matrix, as shown in Table 2, which quantifies a classifier’s predictions against the ground truth. In FTFD, the confusion matrix consists of four components:
- True Positive (TP): The number of fraudulent transactions correctly identified as fraud.
- False Positive (FP): The number of legitimate transactions incorrectly classified as fraud.
- False Negative (FN): The number of fraudulent transactions incorrectly classified as legitimate.
- True Negative (TN): The number of legitimate transactions correctly identified as non-fraudulent.
F1 balances precision and recall, serving as a critical metric in FTFD. It ensures that the model effectively identifies fraudulent transactions (high recall) while maintaining a low rate of false positives (high precision). This balance is crucial for reliable fraud detection systems in financial contexts [46].
Gm evaluates classification balance by combining the true positive rate (sensitivity) and the true negative rate (specificity). This metric is particularly valuable in the context of imbalanced datasets, as it ensures robust performance on both the minority (fraudulent) and majority (legitimate) classes, thereby mitigating bias towards either class [47].
By jointly considering F1 and Gm, we obtain a more comprehensive assessment of the model’s discriminative capability. This joint evaluation emphasizes both sensitivity to fraudulent cases and robustness in identifying legitimate transactions. Together, these metrics provide a robust and informative evaluation of synthetic data quality and its impact on downstream FTFD models.
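Both metrics follow directly from the four confusion-matrix entries, with F1 = 2·P·R/(P+R) and Gm = √(sensitivity × specificity). As a quick illustration (the counts below are hypothetical and not drawn from our experiments):

```python
import math

def f1_and_gm(tp: int, fp: int, fn: int, tn: int) -> tuple:
    """Compute F1 and geometric mean from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity (true positive rate)
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    gm = math.sqrt(recall * specificity)
    return f1, gm

# Hypothetical counts: 80 frauds caught, 20 missed, 20 false alarms
f1, gm = f1_and_gm(tp=80, fp=20, fn=20, tn=880)
print(round(f1, 3), round(gm, 3))  # → 0.8 0.884
```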
4.4 Parameter settings
For all experiments, the nearest-neighbor parameter in Borderline-SMOTE is set to k = 5, with its effectiveness verified through sensitivity analysis in Sect 5.3. The BADGAN model is trained using a batch size of 64 for 1000 epochs, striking a balance between capturing underlying patterns and mitigating the risk of overfitting. To maintain equilibrium between the generator and the two discriminators, each round of discriminator training is followed by an additional round of generator training. This approach prevents the generator from being overly suppressed by dominant discriminators. Both the generator and discriminators are optimized using the Adam optimizer [48] with a learning rate of 0.0001, which enhances training stability and reduces the risk of mode collapse [49].
For simplicity, and to minimize the influence of network complexity on performance evaluation, the generator is implemented as a two-layer fully connected neural network. Adversarial training employs Binary Cross-Entropy (BCE) loss, which provides well-defined optimization objectives, stabilizes gradients, and maintains stable adversarial dynamics between the generator and the discriminators, thereby improving both training efficiency and overall model performance [50,51].
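For reference, the BCE objective driving both adversarial games reduces to the standard negative log-likelihood over labels and predicted probabilities. A minimal plain-Python sketch (the function name `bce` and the example probabilities are ours, not from the paper):

```python
import math

def bce(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over (label, probability) pairs."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_pred)) / len(y_true)

# A confident discriminator (0.9 on real, 0.1 on fake) incurs a small
# loss, while a maximally confused one (0.5 everywhere) incurs ln 2.
loss_confident = bce([1, 0], [0.9, 0.1])
loss_confused = bce([1, 0], [0.5, 0.5])
print(round(loss_confident, 4), round(loss_confused, 4))  # → 0.1054 0.6931
```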
For the remaining sampling methods, key hyperparameters (e.g., neighborhood size in SMOTE-based approaches) are configured according to their original implementations or standard practices in imbalanced learning. GAN-based models adopt network architectures and optimization settings consistent with those of BADGAN where applicable, to ensure a fair comparison while preserving their respective architectural design principles and maintaining optimal performance [52]. All methods are fine-tuned following established benchmarks or default recommendations in the class imbalance literature.
5 Experimental results and analysis
5.1 Experiments on FTFD
To assess BADGAN’s ability to address class imbalance in FTFD, we conduct extensive testing on real-world financial transaction datasets spanning multiple time periods. The model’s performance is benchmarked against ten state-of-the-art approaches, using three standard classifiers: Support Vector Classifier (SVC), Logistic Regression (LR), and Multilayer Perceptron (MLP). For a fair comparison, all GAN-based competitors utilize consistent generator and discriminator architectures. Experimental results are reported as the mean of ten repeated trials to ensure statistical reliability. For brevity, the average value and the average ranking (denoted Ar) are reported in the subsequent results.
Training efficiency and inference speed (the average time required to train the model and to detect transactions) are reported at the bottom of Tables 3, 4, and 5. Despite the increase in model complexity introduced by the dual-discriminator structure, both training and inference times remain at an acceptable level, ensuring that the model’s high performance does not come at the cost of reduced efficiency.
When using SVC as the classifier, BADGAN demonstrates superior performance across multiple evaluation periods. As shown in Table 3 and Fig 3, it achieves the highest F1 in four out of five test sets and leads in Gm in three of the test periods. Overall, BADGAN achieves the highest average F1 and Gm, as well as the best Ar across all competitors. It shows an average improvement of 6.76% in F1 and 3.23% in Gm over the strongest competitors, confirming its ability to maintain a balanced trade-off between sensitivity and specificity in imbalanced FTFD.
With LR as the classifier, BADGAN continues to exhibit strong performance. As shown in Table 4 and Fig 4, it achieves the highest F1 and Gm in three out of five evaluation periods. It also records the best average F1 and Gm, securing the top overall ranking among all methods. It outperforms the second-best method by 3.90% in F1 and 0.30% in Gm, demonstrating its robust generalization ability and effectiveness in maintaining a balance between precision and recall across different classification models.
When using MLP as the classifier, BADGAN demonstrates strong performance in most evaluation periods. As shown in Table 5 and Fig 5, over the five-month evaluation period, it consistently achieves the highest F1, outperforming the second-best method by an average of 4.46%. In terms of Gm, BADGAN leads in three months and remains competitive in the remaining two months, with a performance gap of less than 0.35%. Notably, BADGAN ranks first in both the average F1 and Gm, as well as in Ar, confirming its superior capability in capturing both minority and majority class patterns.
In summary, BADGAN consistently outperforms all competing methods across multiple classifiers and evaluation periods. Its superior performance in both F1 and Gm demonstrates its effectiveness in capturing evolving temporal fraud patterns while maintaining an optimal balance between detecting minority-class fraud and preserving majority-class accuracy. By attaining the highest average performance and top rankings across all evaluated classifiers, BADGAN demonstrates strong generalizability, establishing itself as a reliable and scalable solution for FTFD.
5.2 Visualization of varied representations for FTFD
To evaluate the fidelity of synthetic samples and enhance interpretability, we employ t-distributed Stochastic Neighbor Embedding (t-SNE) [53] to project high-dimensional transaction data into a two-dimensional space. Fig 6 illustrates the distribution of legitimate, fraudulent, and synthetic samples for all methods, where green, purple, and orange dots represent legitimate, fraudulent, and synthetic samples, respectively.
(A) SMEN (B) SMTL (C) ADASYN (D) SMOTE (E) TL (F) CC (G) SGAN (H) RGAN (I) DWGAN (J) VGAN (K) BADGAN.
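Readers wishing to reproduce this style of plot can obtain the 2-D embedding with scikit-learn’s t-SNE; the snippet below uses synthetic stand-in clusters (the data, cluster placement, and perplexity choice are illustrative assumptions, not our experimental setup):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in 10-D data: a legitimate cloud plus fraud and synthetic clusters
legit = rng.normal(0.0, 1.0, size=(100, 10))
fraud = rng.normal(3.0, 1.0, size=(20, 10))
synth = rng.normal(3.0, 1.2, size=(20, 10))
X = np.vstack([legit, fraud, synth])

# Project to 2-D; each row of `emb` is then plotted with its class color
emb = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)
print(emb.shape)  # (140, 2)
```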
Hybrid resampling approaches, such as M1 and M2 (Fig 6(A)–6(B)), use intelligent synthetic generation techniques to construct locally dense data distributions that preserve the core characteristics of original fraud clusters. By capturing the intrinsic cluster patterns of the minority class, these methods generate synthetic samples that form cohesive structures. However, M1’s dynamic noise-filtering mechanism and M2’s adaptive pruning strategy may push certain synthetic instances toward decision-boundary regions. This dispersion effect introduces partial overlap between synthetic and legitimate samples, increasing classifier ambiguity and ultimately degrading precision performance.
Synthetic oversampling techniques, such as M3 and M4 (Fig 6(C)–6(D)), generate interpolated minority-class instances to enhance data diversity and distributional continuity. Unlike naive replication methods, these techniques create new samples along feature-space trajectories between existing minority points, thereby improving local density representation. However, their reliance on linear interpolation confines synthetic instances to convex regions within observed minority clusters. This limited extrapolation capability fails to capture the full spectrum of fraudulent patterns, particularly in sparse or non-convex regions, ultimately constraining classifier generalization.
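The convexity restriction noted above follows directly from the interpolation rule: every synthetic point lies on the segment between a minority sample and one of its minority-class neighbors. A minimal sketch (illustrative only, not the exact implementation used in our experiments):

```python
import numpy as np

def smote_interpolate(x_i, x_nn, rng):
    """SMOTE-style synthesis: a random point on the segment between a
    minority sample and one of its minority-class nearest neighbors."""
    lam = rng.uniform(0.0, 1.0)
    return x_i + lam * (x_nn - x_i)

rng = np.random.default_rng(0)
a = np.array([0.0, 0.0])
b = np.array([1.0, 2.0])
s = smote_interpolate(a, b, rng)
# s always lies between a and b, so sparse or non-convex fraud regions
# outside the observed minority clusters can never be reached
print(s)
```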
Undersampling methods, such as M5 and M6 (Fig 6(E)–6(F)), rebalance class distributions by strategically reducing majority-class samples. M5 selectively removes borderline instances to sharpen decision boundaries, while M6 condenses majority clusters by retaining only their centroids. Although both methods effectively mitigate class imbalance, they inherently discard informative majority-class samples, resulting in sparser representations of legitimate transactions. M5’s boundary-focused pruning may eliminate ambiguous yet potentially valuable instances, while M6’s centroid-based compression risks losing intra-cluster diversity. Consequently, these methods can weaken the model’s ability to learn robust majority-class patterns, impairing generalization performance.
Generative models, such as M7–M10 (Fig 6(G)–6(J)), exhibit distinct patterns in synthetic sample generation relative to traditional resampling methods. M8 and M10 generate samples that cluster closely around genuine fraud instances but frequently intrude into legitimate regions, exposing their vulnerability to adversarial imitation and complicating discriminator training. In contrast, M7 avoids overlap with legitimate areas but suffers from excessive dispersion, failing to align with authentic fraud patterns and inadequately capturing complex fraudulent behaviors. M9 mitigates these limitations by concentrating synthetic data within compact, well-defined regions, enhancing class separability and reducing boundary intrusion. However, its restrictive generation may lead to high-density clusters that represent only a subset of fraud patterns, increasing the risk of overfitting.
In contrast, M11 (Fig 6(K)) synthesizes minority-class samples that faithfully adhere to authentic fraud distributions while maintaining robust separation from legitimate instances near decision boundaries. Unlike methods that merely cluster around fraud points, M11 explicitly optimizes the boundary region by generating high-impact samples in areas where the risk of misclassification is greatest. The resulting synthetic data preserves critical fraud characteristics and reinforces the integrity of the decision boundary, directly enhancing the discriminative model’s training. Compared to conventional approaches, M11 delivers superior precision in boundary sample generation, which is particularly crucial in severely imbalanced FTFD scenarios.
5.3 Parameter sensitivity
To assess the robustness of our approach, we systematically investigate the effect of the nearest-neighbor parameter k in Borderline-SMOTE. We test a range of values of k and summarize the results in Table 6. These results demonstrate that, as long as k remains within an appropriate range, its variation exerts minimal influence on the experimental results.
Based on these observations, k = 5 is selected as the optimal parameter for this study, as it achieves optimal performance in terms of F1 and Gm, while ensuring computational efficiency. This choice optimally balances model accuracy with practical deployment considerations, avoiding noise amplification at lower k values and excessive smoothing effects at higher k values.
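The role of k can be made concrete through the “danger” test at the heart of Borderline-SMOTE [17]: a minority sample is treated as borderline when at least half, but not all, of its k nearest neighbors belong to the majority class. Below is a minimal NumPy sketch on hand-crafted 2-D data (the data and the function name are illustrative assumptions):

```python
import numpy as np

def is_borderline(x, X, y, k=5):
    """Borderline-SMOTE 'danger' test: x is borderline when m of its k
    nearest neighbors are majority-class samples with k/2 <= m < k."""
    d = np.linalg.norm(X - x, axis=1)
    nn = np.argsort(d)[1:k + 1]      # skip x itself (distance 0)
    m = int(np.sum(y[nn] == 0))      # label 0 = majority (legitimate)
    return k / 2 <= m < k

# Hand-crafted data: a small majority cloud, a safe minority cluster,
# and one minority point sitting between them.
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.1, 0.1], [0.05, 0.05],     # majority
    [1.0, 0.0], [1.05, 0.0], [0.95, 0.0], [1.0, 0.05],
    [1.0, -0.05],                                          # safe minority
    [0.45, 0.0],                                           # near boundary
])
y = np.array([0] * 4 + [1] * 6)

print(is_borderline(X[9], X, y), is_borderline(X[4], X, y))  # → True False
```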
5.4 Ablation study
To assess the individual contributions of each component within BADGAN, we conduct an ablation study using a real-world financial transaction dataset. The comparative results are summarized in Table 7. We evaluate three model variants: A0 represents the baseline GAN, A1 represents the addition of only the boundary sample discriminator on top of A0, and A2 corresponds to the full implementation of our proposed model, BADGAN, which incorporates distance-adversarial learning. SVC is employed as the downstream classifier for all evaluations to ensure consistency.
As shown in Table 7, both A1 and A2 substantially outperform A0 in terms of F1 and Gm. This indicates that enhancing the discriminative power of boundary samples is a key factor in improving the quality of the generated data. The improved performance of A2 over A1 further confirms the effectiveness of distance-adversarial learning. BADGAN enhances the discriminative power of generated samples through the boundary sample discriminator and constrains their distribution authenticity through distance-adversarial learning, significantly improving the recall and accuracy of downstream fraud detection.
These findings underscore the pivotal role of boundary-aware mechanisms in FTFD generation modeling. The proposed architectural innovation not only significantly enhances the model’s capacity to represent complex fraud patterns but also improves its adaptability and generalization in adversarial environments, where fraudsters deliberately disguise their behavior to evade detection. This advancement offers a novel technological pathway for applying generative adversarial networks in financial risk control, demonstrating particularly strong effectiveness in addressing highly imbalanced fraud detection tasks.
5.5 Experiments on imbalanced classification datasets
To further evaluate the generalizability and effectiveness of BADGAN beyond our proprietary transaction data, we conduct a comprehensive benchmark study on three widely used imbalanced classification datasets [54]: Default of credit card clients from the UCI repository (https://archive.ics.uci.edu/ml/index.php), PaySim from Kaggle’s Synthetic Financial Datasets For Fraud Detection (https://www.kaggle.com/datasets/ealaxi/paysim1), and Creditcard from Kaggle’s Credit Card Fraud Detection dataset (https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud). For brevity, we refer to these datasets as D1, D2, and D3, respectively. Table 8 describes each dataset, where #S and #A denote the number of samples and attributes, respectively, and #IR denotes the imbalance ratio. This evaluation aims to assess the model’s adaptability across diverse data distributions. To ensure fair and consistent comparisons across all methods, we use SVC as the base classifier.
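For clarity, #IR is taken here as the majority-to-minority sample ratio, a common convention in imbalanced learning (the exact definition used by a given benchmark may differ):

```python
def imbalance_ratio(y):
    """Imbalance ratio as majority count divided by minority count."""
    pos = sum(1 for v in y if v == 1)   # minority (fraud) labeled 1
    neg = len(y) - pos
    return max(pos, neg) / min(pos, neg)

# A dataset with 950 legitimate and 50 fraudulent samples has #IR = 19
print(imbalance_ratio([0] * 950 + [1] * 50))  # → 19.0
```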
As shown in Table 9, BADGAN consistently outperforms all ten baseline methods. It achieves the highest F1 across all datasets and delivers the best overall performance in terms of Gm, with the minor exception of scoring 0.71% lower than ADASYN on one dataset. Importantly, BADGAN ranks first in both the average metric values and the average rankings for F1 and Gm. These results highlight the model’s effectiveness in addressing class imbalance while maintaining strong generalization across diverse scenarios, and its consistent behavior across datasets with markedly different imbalance ratios suggests its suitability for real-world applications with dynamic data distributions.
5.6 Discussion
The proposed BADGAN introduces a novel approach to financial fraud detection by integrating a dual-discriminator architecture with Borderline-SMOTE to enhance boundary-sample learning. While effective in generating high-quality synthetic samples and addressing class imbalance, several limitations require consideration. The computational overhead associated with the dual-discriminator design may impede real-time deployment, highlighting the need for more efficient architectures, such as parameter-sharing discriminators or lightweight temporal encoders. Additionally, the current model does not explicitly capture temporal dependencies in transaction sequences, which limits its ability to model evolving fraud patterns. To address this issue, future research will integrate Recurrent Neural Networks (RNNs) [55] and attention-based modules to explicitly capture these temporal dependencies. Such methods can better track the evolution of fraud patterns, thereby improving the model’s adaptability and accuracy in dynamic financial environments.
A key limitation of the current framework is its inability to capture the relational and spatial dimensions of fraud, such as coordinated attacks across multiple accounts or geographically clustered activities. Future work could investigate graph-based approaches to model inter-account relationships and spatial attention mechanisms for location-aware detection. Furthermore, the framework’s performance may deteriorate under extreme class imbalance, where the scarcity of genuine fraud samples limits the quality of generated synthetic samples. Incorporating semi-supervised learning techniques or adaptive resampling strategies offers promising directions to address this challenge.
Future research will focus on improving the interpretability of BADGAN and enhancing its transparency in practical applications. We plan to introduce attention-based visualization techniques to help analysts intuitively understand the key factors in the model’s decision-making process. Meanwhile, counterfactual explanation methods will be applied to reveal the minimal input changes that affect the model outputs, providing actionable insights for fraud analysts. In addition, we will evaluate the model’s performance in a streaming data environment to further verify its robustness in addressing evolving patterns and methods of financial fraud.
Recent studies show that adversarial attacks pose significant risks to machine learning models, particularly in sensitive domains such as fraud detection [56]. Although BADGAN demonstrates strong performance across varying fraud patterns, the present work does not examine its behavior under adversarial manipulation. We recognize this as an important direction for future research and consider adversarial robustness evaluation, such as evasion and data-poisoning scenarios, a valuable extension of our framework.
Overall, BADGAN shows strong potential in tackling critical challenges in fraud detection, particularly in handling class imbalance and generating high-quality synthetic samples. However, to enhance its real-world applicability, future research should focus on improving its computational efficiency, capturing temporal dependencies in transaction sequences, and incorporating relational awareness for detecting coordinated fraud activities. Additionally, while the model performs well under typical conditions, its robustness to adversarial attacks has not been explicitly tested. Addressing this gap will be crucial, as adversarial threats pose a growing concern in machine learning, particularly in sensitive applications like fraud detection. Future work should explore strategies to enhance the model’s resilience against evasion and poisoning attacks. Finally, integrating BADGAN’s boundary-sensitive design with temporal modeling techniques could offer more adaptive and comprehensive fraud detection solutions in dynamic financial environments.
6 Conclusion
This study presents BADGAN, a novel boundary-aware dual-discriminator GAN framework designed to address class imbalance in financial fraud detection. Unlike conventional oversampling methods, BADGAN employs a boundary-aware generation strategy to synthesize high-quality minority-class samples that reinforce decision boundaries while preserving distributional authenticity. By integrating adversarial distance constraints with a dual-discriminator architecture, the model achieves a balance between sample diversity and discriminative strength, enabling fraud detectors to capture subtle anomalous patterns more effectively. Extensive evaluations on real-world transaction data and public benchmarks demonstrate BADGAN’s consistent superiority over state-of-the-art methods, particularly under highly imbalanced and evolving fraud scenarios. The boundary-aware design not only improves classifier robustness but also mitigates synthetic sample overfitting—a common drawback of GAN-based approaches. Future work will focus on dynamic boundary adaptation to accommodate non-stationary fraud behaviors and extensions to cross-modal financial data, such as graph-structured transactions. Additionally, incorporating self-supervised pre-training may further reduce reliance on labeled data. Beyond fraud detection, BADGAN’s adaptability highlights its potential in broader domains, such as cybersecurity and medical anomaly detection, enabling more resilient AI-driven risk management.
Acknowledgments
This paper used ChatGPT-4 to assist in polishing the language expression, including grammatical correction, logical coherence adjustment and standardized expression of professional terms. The AI tool was only used for text refinement and did not participate in any creation of research content, experimental design, data analysis or conclusion derivation of this paper. All academic content of the paper is independently completed by the authors, who bear full academic responsibility for the authenticity and originality of the paper.
References
- 1. Jin C, Zhou J, Xie C, Yu S, Xuan Q, Yang X. Enhancing ethereum fraud detection via generative and contrastive self-supervision. IEEE Trans Inf Forensics Secur. 2025;20:839–53.
- 2. Qiao S, Huang M, Li H, Wang L, Yin W, Sun Y, et al. FedSSH: a consumer-oriented federated semi-supervised heterogeneous IoMT framework. IEEE Trans Consumer Electron. 2025;71(3):8465–76.
- 3. Qiao J, Lin Y, Bi J, Yuan H, Wang G, Zhou M. Attention-based spatiotemporal graph fusion convolution networks for water quality prediction. IEEE Trans Automat Sci Eng. 2025;22:1–10.
- 4. Xie Y, Liu G, Yan C, Jiang C, Zhou M, Li M. Learning transactional behavioral representations for credit card fraud detection. IEEE Trans Neural Netw Learn Syst. 2024;35(4):5735–48. pmid:36197863
- 5. Wang K, An J, Zhou M, Shi Z, Shi X, Kang Q. Minority-weighted graph neural network for imbalanced node classification in social networks of internet of people. IEEE Internet Things J. 2023;10(1):330–40.
- 6. Ni L, Li J, Xu H, Wang X, Zhang J. Fraud feature boosting mechanism and spiral oversampling balancing technique for credit card fraud detection. IEEE Trans Comput Soc Syst. 2024;11(2):1615–30.
- 7. Qiu J, Chen B, Song D, Wang W. Semisupervised specific emitter identification based on contrastive learning and data augmentation. IEEE Trans Aerosp Electron Syst. 2025;61(4):8449–66.
- 8. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S. Generative adversarial nets. Advances in Neural Information Processing Systems. 2014;27.
- 9. Jiang N, Gu W, Li L, Zhou F, Qiu S, Zhou T, et al. TFD: trust-based fraud detection in SIoT with graph convolutional networks. IEEE Trans Consumer Electron. 2025;71(1):1897–908.
- 10. Qiao S, Guo Q, Wang M, Zhu H, Rodrigues JJPC, Lyu Z. FRW-TRACE: forensic-ready watermarking framework for tamper-resistant biometric data and attack traceability in consumer electronics. IEEE Trans Consumer Electron. 2025;71(3):8234–45.
- 11. Shi X, Kang Q, Zhou M, Bao H, An J, Abusorrah A, et al. Dual attention-aided cooperative deep-spatiotemporal-feature-extraction network for semi-supervised soft sensing. IEEE Robot Autom Lett. 2025;10(3):2184–90.
- 12. Siam AM, Bhowmik P, Uddin MP. Hybrid feature selection framework for enhanced credit card fraud detection using machine learning models. PLoS One. 2025;20(7):e0326975. pmid:40668849
- 13. Yu J, Wang H, Wang X, Li Z, Qin L, Zhang W, et al. Temporal insights for group-based fraud detection on e-commerce platforms. IEEE Trans Knowl Data Eng. 2025;37(2):951–65.
- 14. Wang X, Yu H, Guo J, Li P, Luo X. Towards fraud detection via fine-grained classification of user behavior. IEEE Trans Big Data. 2025;11(4):1994–2007.
- 15. Tian Y, Liu G, Wang J, Zhou M. ASA-GNN: adaptive sampling and aggregation-based graph neural network for transaction fraud detection. IEEE Trans Comput Soc Syst. 2024;11(3):3536–49.
- 16. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: Synthetic Minority Over-sampling Technique. JAIR. 2002;16:321–57.
- 17. Han H, Wang WY, Mao BH. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: International conference on intelligent computing. Springer; 2005. p. 878–87.
- 18. He H, Bai Y, Garcia EA, Li S. ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence). 2008. p. 1322–8. https://doi.org/10.1109/ijcnn.2008.4633969
- 19. Ortigoso-Narro J, Diaz-de-Maria F, Dehshibi MM, Tajadura-Jiménez A. L-SFAN: lightweight spatially focused attention network for pain behavior detection. IEEE Sensors J. 2025;25(10):18409–18.
- 20. Ni M, Sun Z, Liu W. Fraud’s Bargain attack: generating adversarial text samples via word manipulation process. IEEE Trans Knowl Data Eng. 2024;36(7):3062–75.
- 21. Devi D, Biswas SK, Purkayastha B. Redundancy-driven modified Tomek-link based undersampling: a solution to class imbalance. Pattern Recognition Letters. 2017;93:3–12.
- 22. Lemaître G, Nogueira F, Aridas CK. Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. Journal of Machine Learning Research. 2017;18(17):1–5.
- 23. Wang C, Tang H, Zhu H, Jiang C. Collaborative prediction in anti-fraud system over multiple credit loan platforms. IEEE Trans Dependable and Secure Comput. 2024;21(4):3580–96.
- 24. Ning Q, Zhao X, Ma Z. A novel method for identification of glutarylation sites combining Borderline-SMOTE with Tomek links technique in imbalanced data. IEEE/ACM Trans Comput Biol Bioinform. 2022;19(5):2632–41. pmid:34236968
- 25. Zhu Y, Jia C, Li F, Song J. Inspector: a lysine succinylation predictor based on edited nearest-neighbor undersampling and adaptive synthetic oversampling. Anal Biochem. 2020;593:113592. pmid:31968210
- 26. Li Y, Yang X, Gao Q, Wang H, Zhang J, Li T. Cross-regional fraud detection via continual learning with knowledge transfer. IEEE Trans Knowl Data Eng. 2024;36(12):7865–77.
- 27. Xu Y, Yu Z, Chen CLP. Improved contraction-expansion subspace ensemble for high-dimensional imbalanced data classification. IEEE Trans Knowl Data Eng. 2024;36(10):5194–205.
- 28. He H, Garcia EA. Learning from imbalanced data. IEEE Trans Knowl Data Eng. 2009;21(9):1263–84.
- 29. Hong B, Lu P, Chen R, Lin K, Yang F. Health insurance fraud detection via multiview heterogeneous information networks with augmented graph structure learning. IEEE Trans Comput Soc Syst. 2025;12(5):2297–317.
- 30. Kang Q, Chen X, Li S, Zhou M. A noise-filtered under-sampling scheme for imbalanced classification. IEEE Trans Cybern. 2017;47(12):4263–74. pmid:28113413
- 31. Kang Q, Shi L, Zhou M, Wang X, Wu Q, Wei Z. A distance-based weighted undersampling scheme for support vector machines and its application to imbalanced classification. IEEE Trans Neural Netw Learn Syst. 2018;29(9):4152–65. pmid:29990027
- 32. Yun F, Yu Z, Yang K, Chen CLP. AdaBoost-stacking based on incremental broad learning system. IEEE Trans Knowl Data Eng. 2024;36(12):7585–99.
- 33. Shen C, Pei Z, Chen W, Wang J, Wu X, Chen J. Lower limb activity recognition based on sEMG using stacked weighted random forest. IEEE Trans Neural Syst Rehabil Eng. 2024;32:166–77. pmid:38145527
- 34. Zhu H, Zhou M, Liu G, Xie Y, Liu S, Guo C. NUS: noisy-sample-removed undersampling scheme for imbalanced classification and application to credit card fraud detection. IEEE Trans Comput Soc Syst. 2024;11(2):1793–804.
- 35. Qiao S, Zhu H, Sha L, Wang M, Guo Q. DynMark: a dynamic packet counting watermarking scheme for robust traffic tracing in network flows. Computers & Security. 2025;157:104571.
- 36. Xie Y, Shan J, Wei L, Yao J, Zhou M. GAN-based hybrid sampling method for transaction fraud detection. IEEE Trans Knowl Data Eng. 2025;37(10):5905–18.
- 37. Jia W, Lu M, Shen Q, Tian C, Zheng X. Dual generative adversarial networks based on regression and neighbor characteristics. PLoS One. 2024;19(1):e0291656. pmid:38236899
- 38. Xiang S, Zhang G, Cheng D, Zhang Y. Enhancing attribute-driven fraud detection with risk-aware graph representation. IEEE Trans Knowl Data Eng. 2025;37(5):2501–12.
- 39. Adler J, Lunz S. Banach Wasserstein GAN. In: Advances in Neural Information Processing Systems. 2018.
- 40. Ding H, Sun Y, Wang Z, Huang N, Shen Z, Cui X. RGAN-EL: a GAN and ensemble learning-based hybrid approach for imbalanced data classification. Information Processing & Management. 2023;60(2):103235.
- 41. Shi X, Wang X, Zhang Y, Zhang X, Yu M, Zhang L. Innovative novel regularized memory graph attention capsule network for financial fraud detection. PLoS One. 2025;20(5):e0317893. pmid:40435125
- 42. Qiao S, Guo Q, Shi F, Wang M, Zhu H, Khan F, et al. SIBW: a swarm intelligence-based network flow watermarking approach for privacy leakage detection in digital healthcare systems. IEEE J Biomed Health Inform. 2025; early access. https://doi.org/10.1109/JBHI.2025.3542561. pmid:40036416
- 43. Li D, Chen T, Liu C, Liao L, Chen S, Cui Y, et al. Semi-supervised GAN for enhancing electrocardiogram time series diagnostics. Biomedical Signal Processing and Control. 2025;110:108058.
- 44. Xie Y, Hong Y, Qiao S, Yao J, Liu G, Pang S. A time-aware generative network for enhancing transaction security in consumer electronics. IEEE Trans Consumer Electron. 2025;71(2):6818–28.
- 45. Qu Z, Xi Z, Lu W, Luo X, Wang Q, Li B. DF-RAP: a robust adversarial perturbation for defending against deepfakes in real-world social network scenarios. IEEE Trans Inf Forensics Secur. 2024;19:3943–57.
- 46. Teng H, Wang C, Yang Q, Chen X, Li R. Leveraging adversarial augmentation on imbalance data for online trading fraud detection. IEEE Trans Comput Soc Syst. 2024;11(2):1602–14.
- 47. Bai S, Zheng L, Bai J, Ma X. DLS-HCAN: duplex label smoothing based hierarchical context-aware network for fine-grained 3D shape classification. IEEE Trans Multimedia. 2025;27:5815–30.
- 48. Luo B, Wu H, Wang M, Wang F, Bai L, Jiang C, et al. Front-end parameter identification method based on Adam-W optimization algorithm for underwater wireless power transfer system. IEEE Trans Power Electron. 2025;40(4):6307–18.
- 49. Liu Z, Gao J, Yu H, Luo X. A robust graph fraud detection model based on adversarial reweighting. IEEE Trans Comput Soc Syst. 2025;12(6):5213–24.
- 50. Hu X, Chen H, Chen H, Liu S, Li X, Zhang S, et al. Cost-sensitive GNN-based imbalanced learning for mobile social network fraud detection. IEEE Trans Comput Soc Syst. 2024;11(2):2675–90.
- 51. Qiao S, Guo Q, Wang M, Zhu H, Rodrigues JJPC, Lyu Z. Advances in network flow watermarking: a survey. Computers & Security. 2025;159:104653.
- 52. Zhang Y, Chakrabarty S, Liu R, Pugliese A, Subrahmanian VS. SockDef: a dynamically adaptive defense to a novel attack on review fraud detection engines. IEEE Trans Comput Soc Syst. 2024;11(4):5253–65.
- 53. Van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research. 2008;9(11).
- 54. Zhu H, Zhou M, Xie Y, Albeshri A. A self-adapting and efficient dandelion algorithm and its application to feature selection for credit card fraud detection. IEEE/CAA J Autom Sinica. 2024;11(2):377–90.
- 55. Gouhara K, Watanabe T, Uchikawa Y. Learning process of recurrent neural networks. In: Proceedings of the 1991 IEEE International Joint Conference on Neural Networks. 1991. p. 746–51, vol. 1. https://doi.org/10.1109/ijcnn.1991.170489
- 56. Bello OA, Ogundipe A, Mohammed D, Adebola F, Alonge OA. AI-driven approaches for real-time fraud detection in US financial transactions: challenges and opportunities. European Journal of Computer Science and Information Technology. 2023;11(6):84–102.