
GDFGAT: Graph attention network based on feature difference weight assignment for telecom fraud detection

Abstract

In recent years, the number of telecom frauds has increased significantly, causing substantial losses to people’s daily lives. With technological advancements, telecom fraud methods have also become more sophisticated, making fraudsters harder to detect as they often imitate normal users and exhibit highly similar features. Traditional graph neural network (GNN) methods aggregate the features of neighboring nodes, which makes it difficult to distinguish between fraudsters and normal users when their features are highly similar. To address this issue, we propose a spatio-temporal graph attention network (GDFGAT) with feature-difference-based weight updates. We conducted comprehensive experiments with our method on a real telecom fraud dataset. Our method achieved an accuracy of 93.28%, F1 score of 92.08%, precision of 93.51%, recall of 90.97%, and AUC of 94.53%. The results show that our method (GDFGAT) outperforms classical methods, the latest methods, and the baseline model on many metrics, with each metric improved by nearly 2%. In addition, we also conducted experiments on the imbalanced Amazon and YelpChi datasets, where GDFGAT likewise outperformed the baseline model on several metrics.

Introduction

With the rapid development of the telecommunications industry, telecom fraud has become more and more severe in recent years, and more and more users have suffered property losses. Telecom fraud refers to fraudsters defrauding others of large amounts of property through phone calls, text messages, etc. [1]. In telecom network fraud, fraudsters contact many users in order to obtain the personal privacy information of potential victims for further fraudulent behavior. Step by step, they induce the victim to provide personal information and to transfer money to a designated bank account [2]. In other cases, the fraudsters lure victims into opening links that leak their personal information; the fraudsters then obtain this private information and steal the victims’ money.

In 2017, a survey reported by telecom service providers revealed that losses from telecom fraud amounted to $29.2 billion, accounting for 1.69% of estimated global revenue [3]. In 2023, telecom fraud caused 328.8 billion in losses in China. Telecom fraud damages people’s lives both financially and mentally, even leading to death. Therefore, the fight against telecom fraud has become an urgent global issue [4]. Although governments have taken a number of measures to combat telecom fraud, such as publishing the latest typical fraud cases on social media and using predictive interception mechanisms and systems like China’s National Anti-Fraud Application, cases still occur. Fraudsters continue to improve their strategies, for example by disguising themselves as normal users to commit fraud, which makes telecom fraud very difficult to detect, especially among a large number of call records [4].

In early research on telecom fraud detection, researchers regarded it simply as an abnormal sequence detection task, for example using long short-term memory networks (LSTM) [5, 6]. In other studies, Zhen [4] used a convolutional neural network (CNN) method that converts call detail record (CDR) data into a matrix. The methods proposed in these studies achieved good results, but fraudsters are becoming increasingly intelligent and their methods more cunning: they pretend to be normal users to commit fraud and have features highly similar to those of normal users. Therefore, previous methods are no longer sufficient to deal with today’s fraudsters or to explore the complex relationships between them.

Current research proposes graph neural network (GNN) methods to address the above challenges. Research based on GNNs has made significant progress in the field of fraud detection, such as credit card fraud detection [7–9] and e-commerce fraud detection [10, 11]. These successful studies provide new possibilities for telecom fraud detection. One point that cannot be ignored is that telecom fraud users and normal users interact heavily and can easily form a graph network structure, as shown in Fig 1. Graph neural network methods can aggregate the features of neighbor nodes well and discover potential relationships between nodes, and GNNs have now been applied to telecom fraud detection [3, 12–15]. The methods proposed in these studies have achieved good results. However, based on our study of these methods and analysis of existing telecom fraud datasets, the application of graph neural networks to telecom fraud detection still faces the following difficulties: (1) Telecom fraud users have obvious collaborative relationships, and their features are irregular; traditional methods cannot effectively analyze these features. The collaborative relationship between telecom fraudsters usually involves multiple fraudsters working together to defraud a user [12], and the Sichuan telecom fraud dataset we use has been shown to contain collaborative relationships in studies such as [12, 16]. (2) In existing research, GNN-based methods update features by aggregating neighbor node features. This makes it impossible to distinguish fraudsters who are highly similar to normal users, so these fraudsters go undetected.

Fig 1. This is an example of a telecom graph network structure.

Black users are fraud users, white users are normal users, and the line represents the calling relationship between users.

https://doi.org/10.1371/journal.pone.0322004.g001

To solve the above problems, we propose a novel telecom fraud detection model (GDFGAT) in this paper. For the call detail record (CDR) dataset, we introduce the gated recurrent unit (GRU) model for feature processing, dividing the CDR data into different time windows at multiple time frequencies. Applying a GRU to these time windows better extracts the time series features in each period and captures trends or periodic features that may change over time. In addition, we propose a GAT network that updates weights based on feature differences, calculated between the central node’s features and its neighbors’ features. When updating features through GAT message passing, we assign features according to these difference weights, enabling more effective discrimination between fraudsters and normal users. Considering the node imbalance in fraud detection, we also design a node degree frequency controller that computes a frequency weight for each node from its degree and uses this weight to alleviate node imbalance during message passing.

Our contributions are summarized below:

  • We used the GRU model to process CDR data from different time dimensions. This model can better extract temporal features within each time period and capture trends or periodic features that may change over time.
  • We proposed a new model (GDFGAT) for telecom fraud detection. Based on the GAT model, we innovatively proposed a graph attention network based on feature difference weight update, which can distinguish nodes with highly similar features to better distinguish between fraudsters and normal users. In addition, we also designed a frequency controller based on node degree to alleviate the imbalance of nodes by calculating the frequency weight of each node.
  • We conduct extensive experiments on real-world telecom fraud datasets and other public fraud datasets, and the results show that our method outperforms other methods on many metrics.

Related work

This paper studies telecom fraud detection based on graph neural networks. In this section, we introduce the work related to this research.

Graph neural network

In recent years, GNN has been widely used in many fields, such as natural language processing [17, 18], chemical biology [19–21], recommendation systems [22, 23], fraud detection, and causal inference [24, 25]. GNN can process graph-structured data and data with complex relationships; it can effectively mine features in the data and thereby better capture the complex relationships in the graph structure. GNN updates features through the message passing process, where each node receives information from its neighbor nodes and updates its features through an aggregation function. Different graph neural networks employ different aggregation methods [26].

Telecom fraud detection

Most early work on telecom fraud detection was based on traditional machine learning methods. For example, Dominik Olszewski [27] designed a threshold-based classification algorithm that detects fraudulent users by analyzing user features. Amuji [28] proposed grouping features (number of calls and call duration) to calculate probabilities, then used classifiers and probability models to detect fraudulent users. Lin [29] proposed a new model (COSIN), a Markov model combined with a probability distribution, to detect fraudulent users. Subudhi [30] improved a mean-clustering algorithm for telecom fraud detection. Wang [31] extracted call detail record (CDR) data and then used the support vector machine (SVM) algorithm to predict fraudulent users. Ji [32] used support vector machines with linear, polynomial, and radial basis function kernels to predict fraudulent calls and identify telecom fraud users after modeling the time dimension. Li [33] introduced a state machine into the support vector machine to classify fraudulent and normal users. However, as data size grows, the performance of SVM declines. Other machine learning methods such as decision trees [34], naive Bayes models [35], and ensemble learning [36] have also been applied to telecom fraud detection with good results. However, as fraudsters’ methods become increasingly subtle, traditional machine learning methods can no longer detect fraudsters effectively.

Graph-based telecom fraud detection

In recent years, graph neural networks have been widely used in the field of telecom fraud detection. GNN can learn from data with complex relationships and fully mine the relationships between the data. Ji [37] proposed a multi-range gated graph neural network (MRG-GNN) model, which converted some social relationships of users into latent features and then used graph neural networks to learn these latent features to detect telecom fraudsters, achieving good results. Chu [38] used a GraphSAGE model combined with attention mechanisms to detect fraudsters. Wu [12] proposed a telecom fraud detector based on latent collaborative graph learning, which used long short-term memory (LSTM) to encode the original features of user nodes in sequential call behavior learning and constructed a latent collaborative graph by recreating the connections between nodes that share the same call recipient.

Problem definition

Definition 1 [Call Detail Record]. A Call Detail Record (CDR) is a log file generated by a telecommunications company that records information about each call or session. CDR data typically contains information such as call type, call duration, calling and called phone numbers, etc. Each user’s CDR records can be regarded as a call record sequence; we define the call record sequence of user u as $S_u = (s_1, s_2, \ldots, s_n)$, where each $s_t$ is a single call record.

Definition 2 [Graph]. We usually define a simple graph as $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ represents the nodes in the graph and E represents the edges, with $e_{ij} = (v_i, v_j)$ denoting an edge from node $v_i$ to node $v_j$. In this paper we use $G = (V, E, X, Y)$ to define the graph structure for telecom fraud detection. Each node $v_i$ corresponds to a user, and an edge $e_{ij}$ indicates that there is a call relationship between user i and user j. X represents the set of node feature vectors, defined as $X = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ is the feature vector of node $v_i$. Y represents the set of node labels, $Y = \{y_1, y_2, \ldots, y_n\}$, where $y_i \in \{0, 1\}$ is the label of node $v_i$: $y_i = 1$ means the node is a fraudulent node, and $y_i = 0$ means it is a non-fraudulent node.
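As a toy illustration of Definition 2, the graph G = (V, E, X, Y) can be represented with an adjacency matrix, a feature matrix, and a label vector (all sizes and values below are hypothetical, not taken from the Sichuan dataset):

```python
import numpy as np

# Toy instance of G = (V, E, X, Y): nodes are users, an edge (i, j) means a call
# relationship between user i and user j.
num_nodes = 5
edges = [(0, 1), (0, 2), (1, 3), (2, 4)]          # E: call relationships
X = np.random.rand(num_nodes, 8)                   # X: 8-dim feature per user
Y = np.array([0, 1, 0, 1, 0])                      # Y: 1 = fraud, 0 = normal

A = np.zeros((num_nodes, num_nodes))               # symmetric adjacency matrix
for i, j in edges:
    A[i, j] = A[j, i] = 1
```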

Definition 3 [Telecom fraud detection based on graph neural network]. The fraud detection problem can be viewed as a classification problem. Using graph methods, this task becomes a graph node classification problem: identifying fraud and non-fraud nodes. Graph nodes represent the entities to be classified, each node carries corresponding features, and the edges connecting nodes represent the relationships between them. GNN is a neural network that learns from graph-structured data and has solid expressive power; it aggregates the feature information of neighboring nodes during training and can thus capture feature information well. Telecom fraud detection based on GNN trains the model on labeled nodes so that it can classify unlabeled nodes.

The proposed method

In this section, we will introduce our model and some key designs.

Overview

Our model GDFGAT consists of four modules; the structure is shown in Fig 2. The first module is the metadata input part. The dataset we use comes from the Sichuan Telecom Fraud Competition dataset, which consists of four parts; the Sichuan dataset is introduced in detail in the next section. The second module is the feature extraction module, which mainly processes the data, extracts features, and constructs graph structure relationships. The third module is the core part of our model: a GRU model and a graph neural network based on feature difference weights. The last module is a classification module that classifies the nodes.

Fig 2. An illustration is provided here to show the details of the proposed GDFGAT model. The model is divided into four modules.

(1) Metadata module. (2) Feature extraction module, which extracts features from two data dimensions. (3) The main model module, which consists of the GRU model and the GAT model. (4) Classification module, which classifies each node.

https://doi.org/10.1371/journal.pone.0322004.g002

GRU-based call detail record feature processing

We use the GRU model to process the call detail records (CDR). It is common to use recurrent neural networks (RNN) for time series data such as call detail records. Considering the memory requirements of recurrent neural networks, we use the GRU model, which has fewer parameters, to improve efficiency. We process the CDR data through four different time windows and then input them into the GRU model, aiming to extract as many features of fraudsters as possible in the call time dimension. The calculations for the entire process are summarized as follows:

$ts_i = \mathrm{GRU}\left(x_{T_i}\right), \quad i = 0, 1, 2, 3 \quad (1)$

Where $x_{T_i}$ are the call time series features of each user extracted from the call detail records through feature engineering, and $T_0, T_1, T_2, T_3$ represent four different time windows. We use the features corresponding to each time window as the input of the GRU model and finally obtain the corresponding time series feature information $ts_i$.
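As a minimal sketch of this step (not the authors' implementation), a single hand-rolled GRU cell can be run over each of the four time windows to produce one time-series embedding per window. The 32-dim input matches the CDR feature size reported in the dataset section; the hidden size and window length here are arbitrary:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(x @ Wz + h @ Uz)
    r = sigmoid(x @ Wr + h @ Ur)
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
feat, hid = 32, 16                     # 32-dim CDR features, toy hidden size
params = [rng.normal(scale=0.1, size=s)
          for s in [(feat, hid), (hid, hid)] * 3]   # Wz,Uz,Wr,Ur,Wh,Uh

# Four time windows T0..T3, each a short sequence of per-step feature vectors
windows = [rng.normal(size=(6, feat)) for _ in range(4)]
ts = []
for window in windows:
    h = np.zeros(hid)
    for x in window:                   # run the GRU over the window's time steps
        h = gru_step(x, h, *params)
    ts.append(h)                       # ts_i: time-series embedding of window i
```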

Feature fusion

In this section, we briefly introduce the feature fusion method. After obtaining the time series features $ts_i$, we concatenate them with the user behavior feature information c and finally pass the result through an MLP to obtain the final feature h. We can express it as follows:

$h = \mathrm{MLP}\left(\left[ts_0 \,\|\, ts_1 \,\|\, ts_2 \,\|\, ts_3 \,\|\, c\right]\right) \quad (2)$

Where h is the fused feature, $ts_i$ (i = 0, 1, 2, 3) are the four time series features, and c is the user behavior feature information. We concatenate the features of these two dimensions and obtain the final fused feature of each node through an MLP.
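A sketch of this fusion step, with illustrative dimensions for the ts_i embeddings, the behavior features c, and a one-hidden-layer MLP (none of these sizes come from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
ts = [rng.normal(size=16) for _ in range(4)]   # four time-series embeddings ts_0..ts_3
c = rng.normal(size=12)                        # user behavior features

fused = np.concatenate(ts + [c])               # concatenation along feature dimension

# One-hidden-layer MLP with toy weights
W1 = rng.normal(scale=0.1, size=(fused.size, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 32))
b2 = np.zeros(32)
h = np.maximum(fused @ W1 + b1, 0) @ W2 + b2   # final fused node feature h
```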

Feature difference weights for graph attention module

In the traditional graph attention network, the feature of the central node is updated by aggregating the features of neighboring nodes. We express the original GAT model feature update as follows:

$h_i' = \sigma\left(\sum_{j \in \mathcal{N}_i} a_{ij} W h_j\right) \quad (3)$

Where $a_{ij}$ is the attention coefficient between node i and node j, $h_j \in \mathbb{R}^f$ is the feature of node j, f is the node feature dimension, and $W$ is a learnable weight matrix. $\mathcal{N}_i$ is the set of neighbors of node i, and $\sigma$ is a nonlinear activation function. Through this formula, the GAT model obtains the updated feature $h_i'$.
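For concreteness, the classical GAT update for one central node with three neighbors can be sketched in NumPy. The concatenation-based scoring follows the standard GAT formulation; all dimensions are toy values:

```python
import numpy as np

rng = np.random.default_rng(2)
f_in, f_out = 8, 8
H = rng.normal(size=(4, f_in))                   # node i (row 0) and its 3 neighbors
W = rng.normal(scale=0.1, size=(f_in, f_out))    # learnable weight matrix W
a = rng.normal(scale=0.1, size=2 * f_out)        # attention vector

Wh = H @ W                                       # transformed features W h
# score e_ij = LeakyReLU(a^T [W h_i || W h_j]) for each neighbor j in N_i
pairs = np.concatenate([np.tile(Wh[0], (3, 1)), Wh[1:]], axis=1)
e = pairs @ a
e = np.where(e > 0, e, 0.2 * e)                  # LeakyReLU, slope 0.2
alpha = np.exp(e - e.max()); alpha /= alpha.sum()  # softmax -> a_ij
h_new = np.tanh(alpha @ Wh[1:])                  # h_i' = sigma(sum_j a_ij W h_j)
```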

Our Approach. Existing GNN models are not explicitly designed for fraud detection and cannot effectively identify abnormal states of nodes. Today’s telecom fraudsters are constantly improving their methods and are very similar to normal users in many respects. If the traditional GNN method is used to aggregate the features between them, the features of fraud users and normal users will be mixed, and fraud nodes cannot be effectively distinguished from normal nodes, resulting in detection failure [26]. However, even if the features of fraud users and normal users are highly similar, they cannot be exactly the same; there must be differences between them. Consider identical twins: even though they look very similar and have similar body shapes, characteristics, and behaviors, there are differences in specific characteristics, such as their fingerprints, by which they can be told apart. Based on this idea, we calculate the difference weight value between nodes and, during GAT feature updating, assign features to each node according to the feature difference weight coefficient, so that the model can identify fraudulent nodes more effectively. The whole calculation process of our method is as follows:

$d_{ij}^{(l)} = h_i^{(l)} - h_j^{(l)}, \qquad \tilde{d}_{ij}^{(l)} = W^{(l)} d_{ij}^{(l)}, \qquad e_{ij}^{(l)} = \mathrm{LeakyReLU}\left(a^{\top}\left[W^{(l)} h_i^{(l)} \,\|\, \tilde{d}_{ij}^{(l)}\right]\right), \qquad \alpha_{ij}^{(l)} = \frac{\exp\left(e_{ij}^{(l)}\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(e_{ik}^{(l)}\right)} \quad (4)$

$h_i^{(l+1)} = \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{(l)} W^{(l)} h_j^{(l)}\right) \quad (5)$

Where $d_{ij}^{(l)}$ represents the feature difference between node i and node j at layer l, and $\tilde{d}_{ij}^{(l)}$ represents the feature difference after linear transformation. $e_{ij}^{(l)}$ represents the original attention score of nodes i and j at layer l, and $a$ represents a learnable weight vector. First, we concatenate the transformed feature of node i and the feature difference and adjust the dimension with a learnable weight matrix; then the LeakyReLU activation function yields the original attention score of node i and node j. $\alpha_{ij}^{(l)}$ represents the final difference attention weight of node i and node j at layer l, and $h_i^{(l+1)}$ represents the updated feature of node i.
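A NumPy sketch of this difference-weighted attention for one central node, following the four described steps (feature difference, linear transform, LeakyReLU scoring, softmax) and the weighted aggregation; dimensions and weights are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(3)
f = 8
H = rng.normal(size=(4, f))                 # node i (row 0) and its 3 neighbors
W = rng.normal(scale=0.1, size=(f, f))
a = rng.normal(scale=0.1, size=2 * f)

d = H[0] - H[1:]                            # step 1: differences d_ij = h_i - h_j
d_t = d @ W                                 # step 2: linear transform of differences
pair = np.concatenate([np.tile(H[0] @ W, (3, 1)), d_t], axis=1)
e = pair @ a
e = np.where(e > 0, e, 0.2 * e)             # step 3: LeakyReLU -> raw scores
alpha = np.exp(e - e.max()); alpha /= alpha.sum()  # step 4: softmax -> diff weights

h_new = np.tanh(alpha @ (H[1:] @ W))        # aggregate neighbors by difference weight
# Neighbors nearly identical to node i produce small differences, so attention is
# driven by those differences rather than by raw feature similarity alone.
```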

Determination process of feature difference weight. The node features h of each graph network are obtained through the preceding process (refer to the model structure above). Our process for calculating feature difference weights is shown in Eq 4. In the first step, we calculate the feature difference of each node pair (the first formula in Eq 4). In the second step, a linear transformation is performed with a trainable matrix W (the second formula in Eq 4). In the third step, we use the LeakyReLU activation function to obtain the original attention score of node i and node j (the third formula in Eq 4). Finally, the softmax function maps the scores to the difference weights between nodes (the last formula in Eq 4). The process is shown in Fig 3.

Fig 3. Determination process of feature difference weight.

https://doi.org/10.1371/journal.pone.0322004.g003

Frequency controller based on node degree

Fraud detection tasks often face data imbalance. Data imbalance frequently significantly affects the model’s performance, causing minority classes (fraud samples) to be ignored by the model, resulting in detection failure. Graph neural networks can capture complex relationships between nodes. Still, when faced with imbalanced data, the model tends to learn majority class features first, causing it to ignore minority class samples, thus affecting overall performance. The GCN [39] model allocates the feature weights of neighbor nodes through the node degree matrix. Based on this idea, we proposed a node frequency controller method to alleviate the data imbalance. Briefly summarized as follows:

$\hat{d}_v = d_v^{-\frac{1}{2}} \quad (6)$

$b_{ij} = \sigma\left(W \alpha_{ij}\right) \quad (7)$

$K_{ij} = b_{ij}\left(\hat{d}_i + \hat{d}_j\right) \quad (8)$

Where Eq 6 scales the degree of node v; here we follow the scaling of the degree matrix in GCN. In Eq 7, W is a matrix for a linear transformation and $\alpha_{ij}$ is the difference attention weight between node i and node j; we adjust the dimension of the difference attention weight and then map it with an activation function. Eq 8 sums and normalizes the degrees of node i and node j, then scales again by the mapped difference attention weight. The resulting value K is our balance control coefficient. The final node feature update formula is as follows:

$h_i^{(l+1)} = \sigma\left(\sum_{j \in \mathcal{N}_i} K_{ij}\, \alpha_{ij}^{(l)} W^{(l)} h_j^{(l)}\right) \quad (9)$

The purpose of this design is to alleviate data imbalance and thus reduce the risk of features being biased towards the majority class. First, we invert the degree of each node, so that a high-degree node assigns only a small weight to each of its edges and therefore passes on very few features from its neighbors. Suppose the central node has a large degree and most of its neighbors are normal users: this scaling keeps each edge weight small, so only a small fraction of the neighbors’ features is aggregated, avoiding a situation where the aggregated features are dominated by the majority of normal nodes. Second, we normalize the degrees of the central node and the neighboring node and add them together. This balances, to a certain extent, the influence of the nodes at both ends of an edge. During message passing, this method better captures the relationship between the two nodes connected by an edge: when determining the propagation weight between nodes, it does not focus on the degree of a single node but adjusts the weight based on the degrees of both nodes as a whole. This avoids bias toward only the starting or ending node and makes information propagation more reasonable. Finally, the balance control coefficient is obtained by scaling with the difference attention weight. Allocating features according to this coefficient during feature updates not only alleviates, to a certain extent, the impact of dataset imbalance, but also ensures that the most effective features are transmitted.
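A sketch of one possible reading of the frequency controller for a single central node. The exact form of Eqs 6-8 is paraphrased here under our own assumptions: GCN-style inverse-square-root degree scaling, a sigmoid as the mapping activation, and a scalar attention weight per edge:

```python
import numpy as np

rng = np.random.default_rng(4)
deg = np.array([5.0, 2.0, 1.0, 8.0])     # degrees: center node 0 and neighbors 1..3
alpha = rng.dirichlet(np.ones(3))        # difference attention weights alpha_0j

d_hat = deg ** -0.5                       # Eq 6 (sketch): GCN-style degree scaling
b = 1.0 / (1.0 + np.exp(-alpha))          # Eq 7 (sketch): activation-mapped weights
K = b * (d_hat[0] + d_hat[1:])            # Eq 8 (sketch): balance control coefficient

# High-degree neighbors receive a smaller scaled degree and hence a smaller K,
# damping majority-class influence in the weighted aggregation of Eq 9.
```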

Algorithm 1. Training algorithm.

Experiments

Dataset

In this paper, we use a real-world telecom fraud dataset provided by the Sichuan Digital Innovation Competition 2020, organized by the Sichuan Big Data Center of China. Because telecom fraud datasets involve a large amount of personal privacy, few open telecom fraud datasets exist; this section therefore introduces our dataset in detail. The dataset surveys 6106 users in 23 cities in Sichuan Province, covering a period of 8 months from August 2019 to March 2020. Our analysis of this dataset is summarized in Table 1: it contains 1962 fraudulent users and a total of 5,015,430 call records. The entire dataset is divided into four parts: the user basic information table USER, the call behavior table VOC, the text message table SMS, and the APP table. The USER table describes basic information about each user, such as phone number, phone bill record, number of phone cards, and user label (whether it is a fraudulent user). The VOC table records user call details, including calling number, called number, call type, call date, and call duration. The SMS table contains user SMS information, SMS type, and SMS date. The APP table records the name of each application and the amount of traffic the user uses. The CDR dataset’s features are shown in detail in Table 2.

CDR dataset features. The CDR dataset features used in this work are obtained by feature extraction from the VOC table, as shown in Table 2. The feature dimension is 32. Note that the calls-per-hour feature has a dimension of 24, counted per hour of the day. Therefore, the input feature size of the GRU module in our model is also 32, consistent with the CDR dataset features.
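The 24-dim calls-per-hour feature can be derived from per-call timestamps roughly as follows (toy data; the VOC fields are paraphrased, not the dataset's actual schema):

```python
from collections import Counter

# Toy call records for one user: each entry is the hour-of-day (0-23) of a call,
# as could be derived from the VOC table's call date field.
call_hours = [9, 9, 10, 14, 14, 14, 22, 23]

counts = Counter(call_hours)
hourly_feature = [counts.get(h, 0) for h in range(24)]  # 24-dim calls-per-hour vector
```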

Dataset preprocessing. We apply a unified preprocessing pipeline to the dataset. First, we filter and clean the original dataset, directly removing records with missing features to ensure that the dataset is reliable and does not affect the experimental results. The second step is feature extraction and calculation: this process extracts the original features and further derives additional features from them to expand the feature dimension, as described in detail under feature extraction below. The last step is to normalize the features.

Feature extraction. We perform feature extraction on the above dataset from two dimensions: user call features from the VOC table of call detail records, and user behavior features from the remaining three tables. From the VOC table, we extract features such as average call duration, variance of call duration, number of calls per hour, user out-degree (total number of calls made), in-degree (total number of calls received), callback rate, and repetition rate. From the SMS table, we analyze features such as the frequency and types of text messages sent. From the USER table, we extract features such as the number of user phone cards, average consumption, and variance of consumption. From the APP table, we calculate the total number of apps each user uses, the total traffic, and the average traffic. The feature extraction for each table is summarized in Table 3.

Table 3. Features of the Sichuan dataset after extraction.

https://doi.org/10.1371/journal.pone.0322004.t003
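A few of the VOC-derived statistics above (average duration, variance, in/out-degree, repetition rate) can be computed from simplified call records like this; the row format and values are hypothetical:

```python
import statistics

# Hypothetical simplified VOC rows for one user: (other_party, direction, duration_s)
voc = [("B", "out", 120), ("C", "out", 30), ("B", "in", 45),
       ("D", "out", 300), ("B", "out", 60)]

durations = [d for _, _, d in voc]
avg_duration = statistics.mean(durations)        # average call duration
var_duration = statistics.pvariance(durations)   # variance of call duration
out_degree = sum(1 for _, dirn, _ in voc if dirn == "out")  # calls made
in_degree = len(voc) - out_degree                            # calls received

# Repetition rate: share of outgoing calls to numbers contacted before
outgoing = [p for p, dirn, _ in voc if dirn == "out"]
repetition_rate = 1 - len(set(outgoing)) / len(outgoing)
```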

Dataset feature analysis. The most noticeable characteristic of this telecom fraud dataset is that the fraudsters have collaborative relationships. In telecom fraud activities, we define a collaborative relationship as two or more fraudsters working together to commit fraud; in other words, if two or more fraudsters have called the same defrauded person, they may have a collaborative relationship. We sampled and analyzed the fraudsters in the data. As the sampling proportion of fraudsters increased, the proportion of them calling the same normal user also increased; when the sampling ratio reached 90%, about 30% of the fraudsters had called the same user, as shown in Fig 4 (left). In addition, we analyzed the second-order neighbors of normal users and fraud users, respectively. The results show that 58.3% of the second-order neighbors of fraud users are fraudsters, whereas only 8% of the second-order neighbors of normal users are fraudsters, as shown in Fig 4 (right). We can interpret this as two fraudsters jointly defrauding one user (their shared second-order neighbor), so the collaborative relationship between fraudsters is very obvious. Secondly, we analyzed some features of the fraudsters, as shown in Fig 5, and found that many of them are collaborative. The fraudsters’ features tend to concentrate at particular times, which makes them show an irregular trend; for example, fraudsters have the highest call charges in December but the lowest in August. This cooperation on certain features leads to a concentration of fraudsters’ features: the similarity of fraudsters’ features increases significantly at specific points in time. If there were no behavioral synergy, fraudsters’ features should be more random and scattered, not concentrated at particular points in time. This concentration indicates that many fraudsters may be engaged in joint fraudulent operations at these particular times, resulting in a concentrated representation of the features.
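The second-order-neighbor analysis can be reproduced on a toy call graph. Here two fraudsters calling the same victim see each other as second-order neighbors (the graph and labels are illustrative, not drawn from the dataset):

```python
# Toy call graph as adjacency lists; labels: 1 = fraud, 0 = normal.
adj = {0: [3], 1: [3], 2: [4], 3: [0, 1], 4: [2]}
labels = {0: 1, 1: 1, 2: 0, 3: 0, 4: 0}

def second_order_fraud_ratio(node):
    """Fraction of a node's second-order neighbors that are fraudsters."""
    second = {n2 for n1 in adj[node] for n2 in adj[n1] if n2 != node}
    return sum(labels[n] for n in second) / len(second) if second else 0.0

# Fraudsters 0 and 1 both call victim 3, so each sees the other at two hops.
```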

Fig 4. In the left panel, the x-axis is the sampling proportion and the y-axis is the proportion of joint calls made by fraudulent and non-fraudulent users. In the right panel, the x-axis is the sampling proportion and the y-axis is the proportion of second-order neighbors that are fraudulent users, shown for fraudulent and non-fraudulent users respectively.

https://doi.org/10.1371/journal.pone.0322004.g004

Fig 5. The x-axis represents time, from August 2019 to March 2020, and the y-axis represents the total size of feature statistics of fraudsters or non-fraudsters each month.

https://doi.org/10.1371/journal.pone.0322004.g005

Experimental setup

In the experimental section, we conduct extensive experiments on a real-world telecommunications fraud dataset.

Baseline Methods. To verify the effectiveness of our proposed method, we compared the GDFGAT model with three types of methods. The first category is traditional machine learning algorithms; we selected three classic algorithms: Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF). For the second category, we selected three basic graph neural network models: GCN, GAT, and GraphSAGE. For the third category, we selected graph neural network models that have been used for telecom fraud detection and fraud detection in recent years.

  • SVM: Support Vector Machine algorithm is a machine learning algorithm commonly used for classification tasks. It classifies samples by finding a hyperplane.
  • LR: Logistic Regression algorithm is a machine learning algorithm used for classification tasks. It predicts the probability of an event by establishing a logical function and determines the classification result based on the probability value.
  • RF: Random Forest algorithm is a machine learning algorithm that can be used for classification tasks, classifying samples through decision trees.
  • GCN [39]: The graph convolutional network in GNN updates the features of a node by aggregating the feature information of its neighbors.
  • GAT [40]: Graph Attention Network, which calculates the attention coefficient and performs weighted summation on the features of neighboring nodes to obtain new features of the node.
  • GraphSAGE [41]: Updates the feature representation of the target node by sampling the neighbors of the node and aggregating the feature information of the neighboring nodes.
  • CARE-GNN [42]: A graph neural network model for finding informative neighboring node aggregation features based on label-aware similarity metrics.
  • GEM [43]: A GNN model that uses the attention mechanism to learn the importance of different types of nodes and uses the summation and aggregation features for each node.
  • FRAUDRE [44]: A GNN model with fraud-aware graph convolutional modules to learn discriminative embeddings of normal users and fraudsters.
  • RCGN [45]: A telecommunication fraud detection method that uses adaptive cost-sensitive learning (AdaCost) to prioritize fraudulent nodes and then uses a deep deterministic policy gradient algorithm to dynamically optimize the weight coefficients.
  • BWGNN [46]: A GNN fraud detection model with spectrally and spatially localized band-pass filters to address the imbalanced-dataset problem.
  • F2GNN [47]: A GNN model with an adaptive filter and feature segmentation, proposed to address data imbalance in fraud detection.
  • CDR2IMG [4]: A computer vision method for detecting telecommunication fraudsters. It converts CDR data into an image feature matrix and extracts features with a convolutional neural network.
  • FDAGNN [26]: A graph neural network method for telecommunication fraud detection with aggregated feature differences.
  • LSG-FD [12]: A graph neural network telecommunication fraud detector based on latent synergy graph (LSG) learning.
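The attention mechanism shared by the GAT-style baselines above (and extended by GDFGAT) scores each neighbor against the central node and normalizes the scores with a softmax. The following is an illustrative pure-Python sketch of that computation, not any model's actual implementation; the feature vectors and attention parameters `a_left`/`a_right` are toy values chosen for demonstration.

```python
import math

def leaky_relu(x, slope=0.2):
    # LeakyReLU activation used when scoring neighbor pairs
    return x if x > 0 else slope * x

def attention_weights(h_center, h_neighbors, a_left, a_right):
    """GAT-style attention coefficients for one node:
    e_ij = LeakyReLU(a_left . h_i + a_right . h_j),
    followed by a softmax over the neighborhood."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    scores = [leaky_relu(dot(a_left, h_center) + dot(a_right, h_j))
              for h_j in h_neighbors]
    m = max(scores)                       # numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: one center node and two neighbors in a 2-d feature space.
alphas = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                           a_left=[0.5, 0.5], a_right=[0.5, -0.5])
assert abs(sum(alphas) - 1.0) < 1e-9  # attention weights sum to 1
```

The neighbor whose features resemble the center's receives the larger weight, which is exactly the behavior GDFGAT modifies by weighting on feature differences instead.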

Experimental Metric. Fraud detection can be framed as a classification task that separates fraudsters from non-fraudsters. We therefore selected five metrics: Accuracy, Precision, Recall, F1 score, and AUC. Accuracy measures the proportion of correctly predicted samples among all samples and is a fundamental evaluation indicator for classification. Precision is the fraction of predicted fraudsters that are ground-truth fraudsters. Recall is the proportion of real fraudsters that are detected. F1 combines precision and recall, reflecting the trade-off between the two. AUC evaluates the overall performance and reliability of the classification model. This paper evaluates our model mainly on these five metrics.
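The first four metrics follow directly from the confusion matrix. As a minimal pure-Python sketch (illustrative only; our experiments use standard library implementations), with fraudsters labeled 1 and normal users 0:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary fraud labels
    (1 = fraudster, 0 = normal user)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy  = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall    = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# Toy labels: tp=2, fp=1, fn=1, tn=4
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
# -> accuracy 0.75, precision 2/3, recall 2/3, F1 2/3
```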

Implementation. During training, we used the Adam optimizer, set the feature embedding dimension to 32, and set the learning rate to 0.0005. We trained on the Sichuan dataset with a T4 GPU on Google Colab. All experiments use torch-geometric 2.5.3, PyTorch 2.3.0, and Python 3.9. All baselines were implemented from their open-source code, and our model code is available on GitHub.

Performance comparison

As shown in Table 4, our model outperforms all baseline models on every metric. Among them, LSG-FD is the latest and strongest telecommunication fraud detection model, and our model surpasses it on all metrics as well. We further analyze the experimental results. First, the GNN methods generally outperform the traditional machine learning classifiers, which confirms that GNNs are well suited to data with complex relational structure. Second, our model reaches an accuracy of 93.28%, about 2% higher than LSG-FD. Unlike that model, ours explicitly considers the trend of a fraudster's feature changes and adopts a GRU to extract features across different time windows; this suggests that the proposed method effectively handles the feature concentration caused by the synergy among fraudsters. FDAGNN is a GNN model that aggregates feature differences; our method also outperforms it on the key metrics. Our strategy of assigning weights according to feature differences is effective because GNNs are highly sensitive to scaling factors during feature updates. The Sichuan telecommunication fraud dataset contains 1,962 fraudulent users, giving an imbalance rate of 32.13%, so the dataset is relatively balanced. To verify performance on imbalanced datasets, we conducted separate experiments on the Yelp and Amazon datasets.

Table 4. Performance comparison of GDFGAT and baseline on SiChuan dataset.

https://doi.org/10.1371/journal.pone.0322004.t004

Model comparison on imbalanced dataset

The Sichuan dataset above is relatively balanced between fraud and non-fraud users. In this section, we therefore test the model on imbalanced data to verify that it also performs well in that setting. We use Amazon and Yelp, two publicly available fraud detection datasets. The Amazon dataset collects user reviews in the musical instruments category and defines three relations between users: U-P-U links users who have reviewed at least one common product; U-S-U links users who gave at least one identical star rating within one week; U-V-U links users whose review texts are among the top 5% most similar. The Yelp dataset consists of spam or deceptive reviews of restaurants and hotels and also defines three relations: R-U-R links reviews posted by the same user, R-S-R links reviews of the same product with the same star rating, and R-T-R links reviews of the same product posted in the same month. Statistics of the Amazon and Yelp datasets are shown in Table 5.
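Each of these relations connects nodes that share a key (the same product, the same star-and-week bucket, the same user, etc.). The helper below is a simplified, hypothetical sketch of how one such relation's edge set could be built; for brevity it assumes a single key per node, unlike the real datasets where a user may review many products.

```python
from collections import defaultdict
from itertools import combinations

def build_relation_edges(node_keys):
    """Connect every pair of nodes that share a key value, in the
    style of the U-P-U / R-U-R relations.

    node_keys: {node_id: key value (e.g. product id reviewed)}
    Returns an undirected edge set for this single relation.
    """
    groups = defaultdict(list)
    for node, key in node_keys.items():
        groups[key].append(node)
    edges = set()
    for members in groups.values():
        # Link all pairs within a group; sort for a canonical order.
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges

# Toy U-P-U relation: users 0 and 1 reviewed the same product "p1".
edges = build_relation_edges({0: "p1", 1: "p1", 2: "p2"})
assert edges == {(0, 1)}
```

A multi-relation graph is then simply the union of such per-relation edge sets, one per relation type.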

Baseline Methods. To validate our proposed method, we conducted extensive experiments on the two imbalanced datasets, Amazon and Yelp. We selected recently proposed fraud detection models (including some from the previous experiment) as well as models designed for imbalanced fraud detection as our baselines.

  • GHRN [48]: A GNN fraud detection model based on spectral analysis.
  • PC-GNN [49]: A GNN model based on resampling method to solve the problem of data imbalance.
  • GDN [50]: A graph decomposition network that infers and updates the distribution of abnormal features through prototype vectors for the field of graph anomaly detection.

Experimental Metric. For the two imbalanced datasets, Amazon and Yelp, we used F1, AUC, Recall, and G-Mean to measure performance. G-Mean is the geometric mean of the true-positive rate and the true-negative rate; because it weighs both classes equally regardless of their sizes, it is well suited to comparing models on imbalanced datasets.
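The definition above can be sketched in a few lines of pure Python (illustrative only). Note how a classifier that simply ignores the rare fraud class scores zero on G-Mean even though its accuracy is high, which is why this metric is informative under imbalance:

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of the true-positive rate (sensitivity) and
    the true-negative rate (specificity)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(tpr * tnr)

# Imbalanced toy labels: one fraudster among ten users.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Predicting "normal" for everyone is 90% accurate but G-Mean is 0.
assert g_mean(y_true, [0] * 10) == 0.0
```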

Performance comparison on imbalanced dataset

Table 6 shows the results on the imbalanced datasets, where our model also performs well. Compared with the Sichuan dataset, all models generally perform worse; for example, the AUC and F1 of CARE-GNN drop substantially, making the impact of data imbalance evident. On the Amazon dataset, our model has a lower AUC than LSG-FD but beats it on the other metrics. On the Yelp dataset, our G-Mean is lower than that of GDN, but our other metrics surpass all competing models.

Table 6. Performance comparison of GDFGAT and baseline on Amazon and Yelp dataset.

https://doi.org/10.1371/journal.pone.0322004.t006

Quantitative description of experimental results

In the experiments above, we evaluated the model on both a relatively balanced dataset and imbalanced datasets. On the Sichuan dataset, our model GDFGAT reaches an accuracy of 93.28%, while the traditional machine learning algorithms SVM, LR, and RF reach 86.27%, 88.49%, and 88.31%. Among the classic GNN models, GCN reaches 89.47%, GAT 89.82%, and GraphSAGE 84.65%. Among models in the same field, the latest model, LSG-FD, reaches 91.89%. Our model is thus about 5% higher than the traditional machine learning algorithms, about 4% higher than the classic GNN methods, and about 2% higher than the latest method, LSG-FD. In F1 score, GDFGAT reaches 92.08%, versus 77.64% for SVM, 80.08% for LR, and 77.87% for RF, an improvement of roughly 12% over the machine learning methods. The best classic GNN method is GAT at 89.55%, which our method exceeds by about 3%; the best related method is LSG-FD at 90.43%, which ours exceeds by about 1.65%. In Precision, GDFGAT reaches 93.51%, versus 89.97% for RF and 92.06% for LSG-FD, improvements of about 4% and 1.45%. In Recall, GDFGAT reaches 90.97%, versus 77.10% for SVM and 89.45% for LSG-FD, improvements of about 13.87% and 1.52%. Similarly, in AUC, GDFGAT reaches 94.53%, versus 86.66% for RF and 93.18% for LSG-FD, improvements of about 8% and 1.35%. These comparisons clearly demonstrate the advantage of the GDFGAT model across all metrics. For the imbalance experiments, we used the Amazon and Yelp datasets. Since imbalance strongly affects accuracy, we used F1 score, AUC, Recall, and G-Mean as test metrics. On the Amazon dataset, whose imbalance rate is 14.5%, GDFGAT reaches an F1 score of 92.30%, while the best competing method, LSG-FD, reaches 92.16%, an improvement of about 0.2%; in Recall, ours is 1% higher than LSG-FD.

On the Yelp dataset, whose imbalance rate is 9.5%, GDFGAT reaches an F1 score of 79.96%, about 2% higher than the best competing method, LSG-FD. In AUC, GDFGAT reaches 93.14%, about 1% higher than the best of the other methods, and in Recall it reaches 85.71%, about 2% higher than the other methods.

The results show that on a relatively balanced dataset, our model GDFGAT performs best on all metrics. Even under data imbalance, our model outperforms the other methods on several metrics and remains very close to existing advanced methods on others, such as G-Mean. The imbalance test thus shows that GDFGAT also performs well on imbalanced datasets.

Ablation experiments

To illustrate the effectiveness of each module of GDFGAT, we conducted several ablation experiments with the following variant models: (1) the original GAT model; (2) GRU-GAT, which adds the GRU module, to verify its effectiveness; (3) GRU-GATDF, which further adds the feature difference weight algorithm; and (4) GRU-GATcontrol, which adds the frequency controller based on node degree. We also ran ablations of the feature difference weight algorithm and the degree-based frequency controller on the Amazon and Yelp datasets, with two variants: (1) GRU-GATcontrol, which omits the degree-based frequency controller module, and (2) GRU-GATDF, which omits the feature difference weight algorithm. We conducted these ablation experiments on the Sichuan, Amazon, and Yelp datasets; the results are shown in Fig 6.

Fig 6. Ablation experiment diagram on SiChuan, Amazon and Yelp data sets.

The x-axis represents each metric and the y-axis represents the corresponding value. The SiChuan dataset has five metrics, and the Yelp and Amazon data sets have four metrics.

https://doi.org/10.1371/journal.pone.0322004.g006

As seen from the results in Fig 6, our full model performs best on the Sichuan, Amazon, and Yelp datasets, indicating that it is more effective than the variants. On the Sichuan dataset, the variant GRU-GATDF performs well and comes very close to our final model, which suggests that the proposed feature update algorithm based on feature difference weights is effective. The performance of GRU-GATcontrol is close to that of GRU-GAT, probably because the frequency controller based on node degree contributes little there; the likely reason is that the Sichuan dataset is relatively balanced. This assumption is verified on the Amazon and Yelp datasets. The variant GRU-GAT outperforms the original GAT model, so the GRU module we introduce effectively processes call behavior features across different time windows; the original GAT model performs worse on the Sichuan dataset than all other variants. On the Yelp dataset, after removing the degree-based frequency controller, GRU-GATcontrol performs worse than the other variants; given the imbalance of the Yelp dataset, this indicates that our method for alleviating data imbalance is effective on imbalanced data. Similarly, on the Amazon dataset, which has a lower imbalance rate, GRU-GATcontrol still performs poorly, with every metric below the other variants.

Parameter sensitivity

In this section, we study the sensitivity of the GDFGAT model to its key parameters: the embedding size, the number of graph attention layers, and the number of attention heads; the results are shown in Fig 7. The embedding size ranges from 16 to 128, as shown in Fig 7(a). When it increases from 32 to 64, every metric improves significantly, but beyond 64 performance begins to decline. Note also that a larger embedding size raises the computational cost and slows down training, so, balancing performance and efficiency, we set the embedding size to 64. As shown in Fig 7(b), the number of graph attention layers ranges from 1 to 5. Performance improves markedly when the depth increases from 3 to 4 and reaches a steady state from 4 to 5; since depth also affects runtime, we set the number of layers to 4. Fig 7(c) shows the number of attention heads, ranging from 1 to 6. The model performs best with 4 heads and declines beyond that, so we set the number of heads to 4.

Fig 7. Performance of GDFGAT with different embedding sizes, layers, and multi-head on the Sichuan dataset.

In Fig(a), the x-axis represents the embedding size, and the y-axis represents the corresponding value. In Fig(b), the x-axis represents the number of graph neural layers, ranging from 1 to 5, and the y-axis represents the corresponding value. In Fig(c), the x-axis represents the number of heads of the graph neural network, ranging from 1 to 6, and the y-axis represents the corresponding value.

https://doi.org/10.1371/journal.pone.0322004.g007

Conclusion and future work

In this paper, we proposed a novel model, GDFGAT, for telecom fraud detection. GDFGAT introduces a GRU module to process call features in different time windows. In the graph neural network module, we proposed DIFFGAT, a new graph attention network that assigns weights according to the feature differences between the central node and its neighbors and updates node features with these difference-based weights. Finally, we designed a degree-based frequency controller that uses a node's degree information to alleviate the problem of data imbalance. We conducted extensive experiments on a real telecommunication fraud dataset and two public fraud datasets, and the results show that GDFGAT performs well. GDFGAT advances telecom fraud detection, but, limited by data availability, our model can currently be validated only on the Sichuan dataset. In future work, we will therefore seek more complex telecom fraud datasets to further improve the model, conduct deeper research on the data imbalance problem and propose more effective solutions, and study applying the model to other fraud detection domains.

References

  1. Lu G, Duan C, Zhou G, Ding X, Liu Y. Privacy-preserving outlier detection with high efficiency over distributed datasets. IEEE INFOCOM 2021 – IEEE conference on computer communications; 2021. p. 1–10. https://doi.org/10.1109/infocom42981.2021.9488710
  2. Du S, Zhao M, Hua J, Zhang H, Chen X, Qian Z, et al. Who moves my app promotion investment? A systematic study about app distribution fraud. IEEE Trans Depend Secure Comput. 2021;19(4):2648–64.
  3. Liu M, Liao J, Wang J, Qi Q. AGRM: Attention-based graph representation model for telecom fraud detection. ICC 2019 – 2019 IEEE international conference on communications (ICC). IEEE; 2019. p. 1–6.
  4. Zhen Z, Gao J. CDR2IMG: A bridge from text to image in telecommunication fraud detection. Comput Syst Sci Eng. 2023;47(1):955–73.
  5. Guo J, Liu G, Zuo Y, Wu J. Learning sequential behavior representations for fraud detection. 2018 IEEE international conference on data mining (ICDM). IEEE; 2018. p. 127–36.
  6. Jiang Y, Liu G, Wu J, Lin H. Telecom fraud detection via hawkes-enhanced sequence model. IEEE Transactions on Knowledge and Data Engineering. 2022;35(5):5311–24.
  7. Xiang S, Zhu M, Cheng D, Li E, Zhao R, Ouyang Y, et al. Semi-supervised credit card fraud detection via attribute-driven graph representation. Proceedings of the AAAI conference on artificial intelligence. 2023;37:14557–65.
  8. Duan Y, Zhang G, Wang S, Peng X, Ziqi W, Mao J, et al. CaT-GNN: Enhancing credit card fraud detection via causal temporal graph neural networks. arXiv preprint arXiv:2402.14708. 2024.
  9. Tang Y, Liang Y. Credit card fraud detection based on federated graph learning. Expert Syst Appl. 2024;256:124979.
  10. Zhang G, Li Z, Huang J, Wu J, Zhou C, Yang J, et al. eFraudCom: An e-commerce fraud detection system via competitive graph neural networks. ACM Trans Inform Syst (TOIS). 2022;40(3):1–29.
  11. Zhao W, Liu X. Detection of E-commerce fraud review via self-paced graph contrast learning. Comput J. 2024;67(6):2054–65.
  12. Wu J, Hu R, Li D, Ren L, Huang Z, Zang Y. Beyond the individual: An improved telecom fraud detection approach based on latent synergy graph learning. Neural Networks. 2024;169:20–31. pmid:37857170
  13. Yan Q, Sun Y, Cao Y, Yang J, Zhang A, Ju J, et al. An adaptive graph neural networks based on cost-sensitive learning for fraud detection. 2024 7th international symposium on autonomous systems (ISAS); 2024. p. 1–6. https://doi.org/10.1109/isas61044.2024.10552392
  14. Gao P, Li Z, Zhou D, Zhang L. Reinforced cost-sensitive graph network for detecting fraud leaders in telecom fraud. IEEE Access. 2024.
  15. Cao J, Cui X, Zheng C. TFD-GCL: Telecommunications fraud detection based on graph contrastive learning with adaptive augmentation. 2024 international joint conference on neural networks (IJCNN). IEEE; 2024. p. 1–7.
  16. Hu X, Chen H, Liu S, Jiang H, Chu G, Li R. BTG: A bridge to graph machine learning in telecommunications fraud detection. Future Gener Comput Syst. 2022;137:274–87.
  17. Wu L, Chen Y, Ji H, Liu B. Deep learning on graphs for natural language processing. Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval; 2021. p. 2651–3. https://doi.org/10.1145/3404835.3462809
  18. Liu B, Wu L. Graph neural networks in natural language processing. Graph neural networks: Foundations, frontiers, and applications; 2022. p. 463–81.
  19. Reiser P, Neubert M, Eberhard A, Torresi L, Zhou C, Shao C, et al. Graph neural networks for materials science and chemistry. Commun Mater. 2022;3(1):93. pmid:36468086
  20. Fung V, Zhang J, Juarez E, Sumpter BG. Benchmarking graph neural networks for materials chemistry. npj Comput Mater. 2021;7(1):84.
  21. Zhang S, Jin Y, Liu T, Wang Q, Zhang Z, Zhao S, et al. SS-GNN: A simple-structured graph neural network for affinity prediction. ACS Omega. 2023;8(25):22496–507. pmid:37396234
  22. Qian T, Liang Y, Li Q. Solving cold start problem in recommendation with attribute graph neural networks. arXiv preprint arXiv:1912.12398. 2019.
  23. Wu Q, Zhang H, Gao X, He P, Weng P, Gao H, et al. Dual graph attention networks for deep latent representation of multifaceted social effects in recommender systems. The world wide web conference; 2019. p. 2091–102.
  24. Wein S, Malloni WM, Tomé AM, Frank SM, Henze G-I, Wüst S, et al. A graph neural network framework for causal inference in brain networks. Scientific Reports. 2021;11(1):8061. pmid:33850173
  25. Xu H, Huang Y, Duan Z, Feng J, Song P. Multivariate time series forecasting based on causal inference with transfer entropy and graph neural network. arXiv preprint arXiv:2005.01185. 2020. p. 1–9.
  26. Wang Y, Chen H, Liu S, Li X, Hu Y. Feature difference-aware graph neural network for telecommunication fraud detection. J Intell Fuzzy Syst. 2023;45(5):8973–88.
  27. Olszewski D. A probabilistic approach to fraud detection in telecommunications. Knowl-Based Syst. 2012;26:246–58.
  28. Amuji HO, Chukwuemeka E, Ogbuagu EM, et al. Optimal classifier for fraud detection in telecommunication industry. Open J Optim. 2019;8(01):15.
  29. Lin H, Liu G, Wu J, Zuo Y, Wan X, Li H. Fraud detection in dynamic interaction network. IEEE Trans Knowl Data Eng. 2019;32(10):1936–50.
  30. Subudhi S, Panigrahi S. Use of possibilistic fuzzy C-means clustering for telecom fraud detection. Computational intelligence in data mining: Proceedings of the international conference on CIDM, 10–11 December 2016. Springer; 2017. p. 633–41.
  31. Dong W, Quan-yu W, Shou-yi Z, Feng-xia L, Da-zhen W. A feature extraction method for fraud detection in mobile communication networks. Fifth world congress on intelligent control and automation (IEEE Cat. No. 04EX788). vol. 2. IEEE; 2004. p. 1853–6. https://doi.org/10.1109/wcica.2004.1340996
  32. Ji Z, Ma YC, Li S, Li JL. SVM based telecom fraud behavior identification method. Comput Eng Softw. 2017;38(12):104–9.
  33. Li R, Zhang Y, Tuo Y, Chang P. A novel method for detecting telecom fraud user. 2018 3rd international conference on information systems engineering (ICISE); 2018. p. 46–50. https://doi.org/10.1109/icise.2018.00016
  34. Xu T. The design and implementation of visualization character relationship analysis system based on mining of call records. Harbin Institute of Technology; 2014.
  35. Kabari LG, Nanwin DN, Nquoh EU. Telecommunications subscription fraud detection using Naïve Bayesian network. Int J Comput Sci Math Theory. 2016;2(2).
  36. Arafat M, Qusef A, Sammour G. Detection of Wangiri telecommunication fraud using ensemble learning. 2019 IEEE Jordan international joint conference on electrical engineering and information technology (JEEIT). IEEE; 2019. p. 330–5. https://doi.org/10.1109/jeeit.2019.8717528
  37. Ji S, Li J, Yuan Q, Lu J. Multi-range gated graph neural network for telecommunication fraud detection. 2020 international joint conference on neural networks (IJCNN). IEEE; 2020. p. 1–6.
  38. Chu G, Wang J, Qi Q, Sun H, Tao S, Yang H, et al. Exploiting spatial-temporal behavior patterns for fraud detection in telecom networks. IEEE Trans Depend Secure Comput. 2022;20(6):4564–77.
  39. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. 2016.
  40. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. arXiv preprint arXiv:1710.10903. 2017.
  41. Hamilton W, Ying Z, Leskovec J. Inductive representation learning on large graphs. Adv Neural Inform Process Syst. 2017;30.
  42. Dou Y, Liu Z, Sun L, Deng Y, Peng H, Yu PS. Enhancing graph neural network-based fraud detectors against camouflaged fraudsters. Proceedings of the 29th ACM international conference on information & knowledge management; 2020. p. 315–24. https://doi.org/10.1145/3340531.3411903
  43. Liu Z, Chen C, Yang X, Zhou J, Li X, Song L. Heterogeneous graph neural networks for malicious account detection. Proceedings of the 27th ACM international conference on information and knowledge management; 2018. p. 2077–85.
  44. Zhang G, Wu J, Yang J, Beheshti A, Xue S, Zhou C, et al. Fraudre: Fraud detection dual-resistant to graph inconsistency and imbalance. 2021 IEEE international conference on data mining (ICDM). IEEE; 2021. p. 867–76.
  45. Gao P, Li Z, Zhou D, Zhang L. Reinforced cost-sensitive graph network for detecting fraud leaders in telecom fraud. IEEE Access. 2024.
  46. Tang J, Li J, Gao Z, Li J. Rethinking graph neural networks for anomaly detection. International conference on machine learning. PMLR; 2022. p. 21076–89.
  47. Hu G, Liu Y, He Q, Ao X. F2GNN: An adaptive filter with feature segmentation for graph-based fraud detection. ICASSP 2024 – 2024 IEEE international conference on acoustics, speech and signal processing (ICASSP); 2024. p. 6335–9. https://doi.org/10.1109/icassp48485.2024.10446523
  48. Gao Y, Wang X, He X, Liu Z, Feng H, Zhang Y. Addressing heterophily in graph anomaly detection: A perspective of graph spectrum. Proceedings of the ACM Web conference 2023; 2023. p. 1528–38.
  49. Liu Y, Ao X, Qin Z, Chi J, Feng J, Yang H, et al. Pick and choose: A GNN-based imbalanced learning approach for fraud detection. Proceedings of the web conference 2021; 2021. p. 3168–77.
  50. Gao Y, Wang X, He X, Liu Z, Feng H, Zhang Y. Alleviating structural distribution shift in graph anomaly detection. Proceedings of the sixteenth ACM international conference on web search and data mining; 2023. p. 357–65. https://doi.org/10.1145/3539597.3570377