
Diffusion characteristics classification framework for identification of diffusion source in complex networks

  • Fan Yang,

    Roles Conceptualization, Methodology, Software, Validation, Writing – original draft

    Affiliations Key Laboratory of Intelligent Information Processing and Graph Processing, Guangxi University of Science and Technology, Liuzhou, Guangxi, China, School of Computer Science and Technology, Guangxi University of Science and Technology, Liuzhou, Guangxi, China

  • Jingxian Liu,

    Roles Methodology, Software, Supervision

    Affiliations Key Laboratory of Intelligent Information Processing and Graph Processing, Guangxi University of Science and Technology, Liuzhou, Guangxi, China, School of Computer Science and Technology, Guangxi University of Science and Technology, Liuzhou, Guangxi, China

  • Ruisheng Zhang,

    Roles Methodology, Supervision, Writing – original draft

    Affiliation School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu, China

  • Yabing Yao

    Roles Conceptualization, Supervision, Validation, Writing – original draft

    yaoyabing@lut.edu.cn

    Affiliation School of Computer and Communication, Lanzhou University of Technology, Lanzhou, Gansu, China

Abstract

The diffusion phenomena taking place in complex networks, such as the spread of diseases, rumors and viruses, are usually modelled as diffusion processes. Identifying the diffusion source is crucial for developing strategies to control these harmful processes, yet accurate source identification remains an open challenge. In this paper, we define a set of diffusion characteristics composed of the diffusion direction and time information recorded by observers, and propose a neural networks based diffusion characteristics classification framework (NN-DCCF) to identify the source. NN-DCCF contains three stages. First, the diffusion characteristics are utilized to construct a network snapshot feature. Then, a graph LSTM auto-encoder is proposed to convert the network snapshot feature into low-dimensional representation vectors. Finally, a source classification neural network is proposed to identify the diffusion source by classifying the representation vectors. With NN-DCCF, source identification is converted into a classification problem. Experiments on a series of synthetic and real networks show that NN-DCCF is feasible and effective in accurately identifying the diffusion source.

Introduction

Most complex systems in the real world take the form of networks [1], in which nodes and edges denote units and the interactions between them, respectively. Various diffusion phenomena taking place in networks are usually modelled as diffusion processes [1], such as disease spreading [2], rumor diffusion [3] and computer virus propagation [4]. The ubiquity of these harmful diffusion processes has inflicted huge losses on human society. Therefore, it is of great theoretical and practical significance to develop effective strategies to control harmful diffusion processes. One important measure is identifying the diffusion source that initiates the diffusion process on a network, which has attracted widespread attention in recent years [5]. Many existing source identification methods have provided effective solutions to important real-world problems, such as identifying the source of SARS [6], COVID-19 [7] and Cholera [8], and finding the source of foodborne disease [9]. However, accurately identifying the diffusion source is still an open challenge.

The success of artificial neural networks has boosted research in many scientific fields [10–12]. In particular, the emergence of graph neural networks (GNNs) [13, 14] and network embedding [15, 16] has facilitated the application of artificial neural networks to the irregular structure of networks. GNNs are neural network models that address different graph tasks in an end-to-end way [13]. The most common GNNs include recurrent graph neural networks [17], convolutional graph neural networks [18], graph auto-encoders [13], etc. Network embedding comprises various methods designed for the same task, i.e., network representation learning [13]. Recently, GNNs and network embedding have been successfully applied to important problems in complex networks [13, 16], such as link prediction and node classification. However, only a few artificial neural network based methods have addressed the diffusion source identification problem [19, 20]. Li et al. [19] proposed a label propagation framework to locate the diffusion source. Owing to the common characteristics shared by the label propagation framework and graph convolutional networks (GCNs), source identification is converted into a multi-classification problem. Dong et al. [20] detected multiple sources by utilizing wavefront information. Since existing GNNs are not a suitable solution for the wavefront based method, they developed a novel multi-task learning model based on an encoder-decoder structure. Different from the methods in [19] and [20], this paper utilizes the diffusion time and direction information recorded by limited observers to identify the diffusion source. These two types of information have been proved to be helpful in accurately identifying the source [21–26]. We define them as diffusion characteristics, and identify the diffusion source by classifying the diffusion characteristics.
Although existing GNNs and network embedding are powerful models for processing graph data, neither is well suited to processing the diffusion characteristics, which are dynamically generated during a diffusion process. Therefore, we develop a novel neural networks based diffusion characteristics classification framework, which contains the following three stages: (i) the diffusion characteristics are utilized to construct a network snapshot feature; (ii) a graph LSTM auto-encoder is proposed, by which the network snapshot feature is represented as low-dimensional vectors; (iii) a source classification neural network is proposed to identify the diffusion source by classifying the representation vectors of the network snapshot feature. With the proposed framework, source identification is converted into a classification problem. Further, the feasibility and effectiveness of the framework are validated by the experimental results.

The rest of this paper is organized as follows. Existing related works are briefly reviewed in Section Related work. The neural networks based diffusion characteristics classification framework is proposed in Section Materials and methods. The experimental results are discussed in Section Results. We conclude this work in Section Conclusion.

Related work

The early diffusion source identification methods were developed for unweighted networks. A systematic method was pioneered by Shah et al. [27], who constructed a source estimator based on a topological quantity termed Rumor Centrality (RC). RC has been extended to identify the diffusion source in more complex environments [28–30]. Zhu et al. [31] proposed a sample path based method termed Jordan Center (JC), which has been improved to identify the diffusion source with limited observations [32–34]. Meanwhile, many methods based on various ideas were proposed for unweighted networks, including the Dynamic Message Passing based method [35], the Belief Propagation based method [36], the Monte Carlo based method [37], the Rationality Observation based method [38], the Label Ranking framework based method [39], the Time Aggregated Graph based method [40], etc. The above methods are effective in unweighted networks. In reality, however, we have to consider the significant weights associated with the edges of networks, such as traffic, time delay and so on.

For weighted networks, Brockmann et al. [6] modeled the Global Mobility Network as a weighted graph, and identified the epidemic source based on a novel effective distance. This method has been extended to multi-source identification by Jiang et al. [41]. Meanwhile, several methods based on various ideas were proposed to identify the diffusion source in weighted networks [42–44]. However, these methods require knowledge of the states of all nodes. In reality, it is often the case that only a limited number of node states can be observed [45]. For this problem, many methods were proposed that utilize limited observers, including the Time-Reversal Backward Spreading algorithm [24], the Backward Diffusion-based method [46], the improved Gaussian estimator [47], the Gromov matrix based method [25], the Greedy Optimization based algorithm [26], the Sequential Neighbour Filtering algorithm [48], the Estimated Propagation Delay based algorithms [49], etc. These methods [24–26, 46–49] mainly utilized the diffusion time information of observers to identify the source. Pinto et al. [50] proposed a Gaussian estimator, the first method to identify the source by utilizing the diffusion direction information of observers; however, that direction information is only exploited on tree graphs. Yang et al. [21] improved the accuracy of the Gaussian estimator on general graphs by utilizing the diffusion direction information of observers. Zhu et al. [22, 23] also proposed a path-based source identification method utilizing the diffusion direction information of observers. Clearly, the diffusion time and direction information of observers play important roles in accurately identifying the diffusion source.

Different from the traditional source identification methods mentioned above, in recent years a few artificial neural network based methods have been developed to identify the source. Li et al. [19] proposed a Source Identification Graph Convolutional Network (SIGN) framework; this method requires knowledge of the complete observation. Dong et al. [20] proposed a graph constraint based sequential source identification model. To obtain the wavefront information, this method [20] also requires knowledge of the complete observation. However, in reality, it is often the case that only a limited number of node states can be observed [45]. In this paper, we identify the diffusion source by utilizing limited observers. We define the diffusion time and direction information of observers as diffusion characteristics, and propose an artificial neural networks based framework to identify the source by classifying the diffusion characteristics. The feasibility and effectiveness of the proposed framework are validated on a series of synthetic and real networks.

Materials and methods

Problem description and overview

The network in which the diffusion process takes place is modelled as a finite and undirected graph G = (V, E), where V and E represent the node set and edge set, respectively. θ = {θvu}, where θvu is the random propagation delay associated with an edge vu, vu ∈ E. Generally, the topology of G is assumed to be known. We consider the {θvu} associated with E to be independent and identically distributed (i.i.d.) random variables.

Diffusion model.

Assume that the diffusion process taking place in the network follows a simple diffusion model similar to that of reference [50]. At time t, each node v ∈ V is in exactly one of two states: (i) informed, if it has received the information from any neighbour, or (ii) ignorant, if it has not been informed so far. Any node v is equally likely to be the source. The diffusion process is initiated by a single source s* at an unknown start time; all nodes are ignorant except s*, which is informed. Let N(v) denote the neighbours of v. Suppose an ignorant node v receives the information for the first time from an informed neighbour w, thus becoming informed at time tv. Then, v will attempt to retransmit the information to all its other neighbours along the edges, so that each neighbour u receives the information with success probability β at time tv + θvu. If two or more informed neighbours have the same propagation delay to u, u is informed by only one of them. Once the diffusion process terminates, a network snapshot is generated.
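As an illustration, the diffusion model described above can be sketched as an event-driven simulation in which each transmission attempt succeeds with probability β and arrives after the edge's propagation delay, and a node keeps only the earliest arrival. This is a minimal sketch; all function and variable names are illustrative choices, not part of the original model specification.

```python
import heapq
import random

def simulate_diffusion(adj, delays, source, beta=1.0, seed=None):
    """Simulate the diffusion model: each newly informed node attempts to pass
    the information to its other neighbours, each attempt succeeding with
    probability beta and arriving after the edge's propagation delay.

    adj    -- dict mapping node -> list of neighbour nodes
    delays -- dict mapping (v, u) -> propagation delay theta_vu
    Returns (informed_time, informed_from): per-node arrival time and the
    neighbour the information arrived from (the diffusion direction info).
    """
    rng = random.Random(seed)
    informed_time = {source: 0.0}
    informed_from = {source: None}
    heap = [(0.0, source, None)]            # (arrival time, node, sender)
    while heap:
        t, v, w = heapq.heappop(heap)
        if t > informed_time.get(v, float("inf")):
            continue                         # stale entry; v was informed earlier
        for u in adj[v]:
            if u == w:
                continue                     # do not retransmit to the sender
            if rng.random() <= beta:         # transmission attempt succeeds
                tu = t + delays[(v, u)]
                if tu < informed_time.get(u, float("inf")):
                    informed_time[u] = tu    # u is informed by exactly one neighbour
                    informed_from[u] = v
                    heapq.heappush(heap, (tu, u, v))
    return informed_time, informed_from
```

The returned `informed_time` and `informed_from` correspond to the time and direction information an observer could record in a network snapshot.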

For an arbitrary network, a network snapshot is generated by the diffusion model introduced above. Generally, only the states of a subset of nodes can be observed; we call these nodes observers. The observations made by the observers provide two types of information [21, 50]: (i) the direction from which the information arrives at each observer and (ii) the time at which the information arrives at each observer. These two types of information reveal the true details of the diffusion process and have been proved to be helpful in accurately identifying the diffusion source [21–26]. In this paper, they are defined as diffusion characteristics. The goal is to find the diffusion source s* by utilizing the diffusion characteristics recorded by the observers. We propose a neural networks based diffusion characteristics classification framework (NN-DCCF) to identify the diffusion source, by which source identification is converted into a classification problem. NN-DCCF is composed of the following three stages.

  1. By selecting vital nodes in a given network and extending their neighbours, we build the observation areas. Then, for a network snapshot, we construct the network snapshot feature by utilizing the recorded diffusion characteristics.
  2. We propose a graph LSTM auto-encoder (GLSTM-AE), by which the network snapshot feature is represented as low-dimensional vectors.
  3. We propose a source classification neural network (SCNN) to estimate the diffusion source by classifying the representation vectors.

The overview of NN-DCCF is shown in Fig 1. Frequently used notations are summarized in Table 1.

Fig 1. Overview of NN-DCCF.

(a) The vital nodes selected by degree centrality [51] include node 8 and node 44. Observation areas set . . consists of node 8 and its neighbours within 1 hop distance. . consists of node 44 and its neighbours within 1 hop distance. . (b) . is composed of the sequence features constructed with the diffusion characteristics recorded in the observers of . is composed of the sequence features constructed with the diffusion characteristics recorded in the observers of . (c) By using GLSTM-AE, is converted into low-dimension representation vectors, i.e. . (d) With as the input of SCNN, we can estimate the diffusion source.

https://doi.org/10.1371/journal.pone.0285563.g001

Stage 1: Constructing network snapshot feature

To utilize the diffusion characteristics of observers to construct the network snapshot feature, the observer set is built with the following strategy. Given a network, we first rank the importance of nodes with a vital node identification method [51]. Next, with the ranking results, we select the K most important nodes as vital nodes. Then, for each vital node, we extend its neighbours within h hops. Further, each vital node and its extended neighbours are combined to form an observation area, giving K observation areas in total, where o denotes a unique observer. When the diffusion process occurs and generates a snapshot, we construct the network snapshot feature by utilizing the diffusion characteristics, i.e., the diffusion direction and time information, recorded in each observation area. The procedure is summarised in Algorithm 1.

Algorithm 1 Network snapshot feature constructing algorithm

Input: , and

Output:

1: initialize an empty

2: sort all in according to the average informed time of

3: for each in do

4:  for each do

5:   initialize an empty seq to record a single sequence feature

6:   set current node c = o

7:   while do

8:    if c is in the informed state then

9:     add c into seq

10:     get next node n according to the diffusion direction information recorded in c and set c = n

11:    end if

12:   end while

13:   reverse the nodes order in seq

14:   if 1 < |seq| ≤ lmax then

15:    add seq into

16:   else if |seq| > lmax then

17:    remove the last |seq| − lmax nodes from seq

18:    add seq into

19:   end if

20:  end for

21: end for

22: for each in do

23:  remove duplicated sequence features from

24:  sort all sequence features in according to their length

25:  if then

26:   remove the last sequence features from

27:  end if

28: end for

In Algorithm 1, the inputs are the network topology, the observation areas set and the network snapshot. The average informed time in step 2 is the average of the diffusion time information recorded in an observation area. Steps 4–20 construct the sequence features by traversing each observation area. Here, steps 7–12 generate a single sequence feature, denoted by seq. A single seq is a basic unit of the network snapshot feature. Obviously, generating a single seq depends on the diffusion direction information of observers. Step 13 reverses the order of the current seq. Steps 14–19 add the seq into the feature, where 2 ≤ |seq| ≤ lmax. Further, from step 3 to step 21, the sequence features of each observation area are constructed. Steps 22–28 remove redundant sequence features and limit the size of the feature. A schematic of obtaining the network snapshot feature with Algorithm 1 is shown in Fig 1(a) and 1(b).
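The core of steps 6–13, backtracking from an observer along the recorded diffusion directions and then reversing the result, can be sketched as follows. The `informed_from` mapping (the neighbour each node was informed by) is a hypothetical encoding of the diffusion direction information; the paper itself does not prescribe a data structure.

```python
def build_sequence_feature(observer, informed_from, informed, l_max):
    """Trace back from an observer through informed nodes along the recorded
    diffusion directions, then reverse so the sequence runs in diffusion order
    (a sketch of steps 6-19 of Algorithm 1).

    informed_from -- dict: informed node -> the neighbour it was informed by
                     (None for the source); the diffusion direction information
    informed      -- set of informed nodes in the snapshot
    """
    seq = []
    c = observer
    while c is not None and c in informed:
        seq.append(c)
        c = informed_from.get(c)     # follow the diffusion direction backwards
    seq.reverse()                     # diffusion order: source side first
    if len(seq) > l_max:
        seq = seq[:l_max]            # step 17: drop the last |seq| - lmax nodes
    return seq if len(seq) > 1 else None   # sequences with |seq| <= 1 are discarded
```

For a chain diffusion 1 → 2 → 3 observed at node 3, this yields the sequence [1, 2, 3], truncated to [1, 2] when lmax = 2.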

Stage 2: GLSTM-AE based network snapshot feature representation

From Algorithm 1, we know that each sequence feature seq consists of several ordered informed nodes; seq is therefore a type of sequential data. Inspired by the fact that long short-term memory networks (S1 File) are a powerful tool for modelling sequential data [52–54], we use LSTM networks to learn the representation of seq. However, seq differs from traditional sequential data since it is composed of ordered informed nodes. We therefore propose a graph LSTM auto-encoder (GLSTM-AE) to learn the low-dimensional representation of seq. A GLSTM-AE consists of two LSTMs, the encoder LSTM and the decoder LSTM, as shown in Fig 2. GLSTM-AE works as follows. For an arbitrary seq, each node in seq is represented as a one-hot vector of dimension |V|. The input to GLSTM-AE is the one-hot representation of seq. The output of the encoder LSTM after the last input has been read is a low-dimensional representation of the one-hot vectors of seq, denoted by r, where dr denotes the representation dimension. r is the representation result obtained from the GLSTM-AE. The decoder LSTM reconstructs the input from r. The target of GLSTM-AE is the same as its input.

Fig 2. The structure of GLSTM-AE, where, d denotes the dimension of a vector.

https://doi.org/10.1371/journal.pone.0285563.g002

Obviously, GLSTM-AE must be trained before it is applied to learn the representations of the sequence features. A simple way to obtain the training data of GLSTM-AE is to generate fixed-length sequence features from the network.

Because the mean squared error (MSE) loss is commonly used for regression tasks, it is suitable for the reconstruction task of GLSTM-AE. Therefore, we adopt the MSE loss as the loss function of GLSTM-AE:

L_MSE = (1/n) Σ_{i=1}^{n} (Y_i − Y*_i)²   (1)

where Y denotes the output of the decoder LSTM in GLSTM-AE and Y* denotes the one-hot representation of seq.
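The encoder-decoder structure above can be sketched in PyTorch as a minimal LSTM auto-encoder trained with the MSE reconstruction loss. The layer sizes and the strategy of feeding r at every decoding step are assumptions made for this sketch, not details specified by the paper.

```python
import torch
import torch.nn as nn

class GLSTMAE(nn.Module):
    """Minimal LSTM auto-encoder sketch: the encoder's final hidden state is
    the low-dimensional representation r of a sequence of one-hot node
    vectors; the decoder tries to reconstruct the one-hot inputs from r."""
    def __init__(self, n_nodes, d_r):
        super().__init__()
        self.encoder = nn.LSTM(n_nodes, d_r, batch_first=True)
        self.decoder = nn.LSTM(d_r, n_nodes, batch_first=True)

    def forward(self, x):                       # x: (batch, seq_len, |V|)
        _, (h, _) = self.encoder(x)             # h: (1, batch, d_r)
        r = h.squeeze(0)                        # representation vectors r
        # feed r at every step so the decoder reconstructs the whole sequence
        rep = r.unsqueeze(1).expand(-1, x.size(1), -1)
        y, _ = self.decoder(rep)
        return y, r

n_nodes, d_r, seq_len = 20, 8, 4                # hypothetical sizes
model = GLSTMAE(n_nodes, d_r)
x = torch.eye(n_nodes)[torch.randint(0, n_nodes, (3, seq_len))]  # one-hot batch
y, r = model(x)
loss = nn.functional.mse_loss(y, x)             # Eq (1): MSE reconstruction loss
```

After training, only r is kept as the low-dimensional representation of each sequence feature.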

Then, with the trained GLSTM-AE, we get the low-dimension representation of , denoted by . This process is summarised in Algorithm 2.

Algorithm 2 Network snapshot feature representation algorithm

Input:

Output:

1: initialize an empty ,

2: set

3: set

4: for each in do

5:  for each seq in do

6:   input=one-hot(seq), inputR|seq|×|V|

7:   r = GLSTM-AE (input),

8:   if |seq| < lmax then

9:    k = lmax − |seq|

10:    pad r with pl for k times

11:   end if

12:   add r into

13:  end for

14:  if then

15:   

16:   pad with pη for k times

17:  end if

18: end for

In Algorithm 2, the input is the network snapshot feature. The one-hot(⋅) function in step 6 obtains the one-hot representation of the current seq. In step 7, the representation result r of seq is obtained with the trained GLSTM-AE. Further, steps 8–11 pad r to a fixed length, and steps 14–17 pad the output to a fixed size.

Stage 3: Identify the diffusion source with SCNN

With Algorithm 2, we obtain the representation of the network snapshot feature. In this section, taking this representation as input, we propose a source classification neural network (SCNN) to identify the diffusion source by classifying it. SCNN is mainly composed of two fully connected layers. To speed up convergence, we add a normalization layer. The structure of SCNN is shown in Fig 3, where LogSoftmax is used for multi-class classification.

SCNN must also be trained before it is applied to identify the diffusion source. The training data of SCNN can be generated by Algorithm 3.

Algorithm 3 SCNN training data generating algorithm

Input: and

Output: training data collector C

1: specify the number of loops N

2: initialize an empty training data collector C

3: set , βi ∈ (0, 1), ∀i, j ∈ [1, M], βiβj

4: while N > 0 do

5:  for βiβ do

6:   for each node vV do

7:    generate by running the diffusion model (see Diffusion model) on with v as diffusion source and βi as propagation rate

8:    generate by Algorithm 1

9:    construct corresponding to by Algorithm 2

10:    add a training data (, one-hot(v)) into C

11:   end for

12:  end for

13:  N = N − 1

14: end while

In Algorithm 3, the inputs are the network topology and the observation areas set. From step 7 to step 10, given a node v and a propagation rate βi, a single training sample is generated, composed of the representation and the one-hot encoding of v. Obviously, the SCNN training dataset size is N ⋅ |β| ⋅ |V|.

Because the cross entropy loss is mainly used for classification, we adopt it as the loss function of SCNN:

L_CE = −Σ_{i=1}^{|V|} Z*_i log(Z_i)   (2)

where Z denotes the estimated source distribution output by SCNN and Z* denotes the one-hot representation of the true diffusion source.
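Under the structure described above (two fully connected layers, a normalization layer, and a LogSoftmax output), a minimal PyTorch sketch of SCNN and its cross-entropy training loss might look as follows; the hidden width and the ReLU activation are hypothetical choices not stated in the paper.

```python
import torch
import torch.nn as nn

class SCNN(nn.Module):
    """Sketch of the source classification neural network: two fully connected
    layers with a batch-normalization layer and a LogSoftmax output over the
    |V| candidate source classes."""
    def __init__(self, d_in, n_nodes, d_hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden),
            nn.BatchNorm1d(d_hidden),   # normalization layer for faster convergence
            nn.ReLU(),
            nn.Linear(d_hidden, n_nodes),
            nn.LogSoftmax(dim=1),       # log-probabilities for multi-class output
        )

    def forward(self, x):
        return self.net(x)

d_in, n_nodes = 32, 20                  # hypothetical sizes
model = SCNN(d_in, n_nodes)
x = torch.randn(5, d_in)                # 5 flattened snapshot representations
log_probs = model(x)
target = torch.randint(0, n_nodes, (5,))
loss = nn.functional.nll_loss(log_probs, target)  # cross entropy on log-probs, Eq (2)
source = log_probs.argmax(dim=1)        # estimated diffusion sources
```

Since the network emits log-probabilities, the negative log-likelihood loss on them is equivalent to the cross entropy of Eq (2).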

Finally, by combining Algorithms 1 and 2 with the trained SCNN, the overall NN-DCCF procedure is summarised as Algorithm 4.

Algorithm 4 Diffusion source identification algorithm

Input: , and

Output:

 1: generate according to Algorithm 1

 2: construct corresponding to according to Algorithm 2

 3: output = SCNN , outputR|V|

 4:

Results

Main experimental environment

Hardware: Dell R740 with 2 Intel(R) Xeon(R) Gold 6254 CPUs, 1 TB RAM, and 1 NVIDIA Tesla V100S GPU with 32 GB of GPU memory. Software: Python 3.8.10 + PyTorch 1.10.2 + CUDA 10.2.

Methods for comparison

Essentially, the proposed NN-DCCF is an observer based method. To validate its feasibility and effectiveness, three existing state-of-the-art observer based methods are selected for comparison: the time-reversal backward spreading (TRBS) algorithm [24], the sequential neighbour filtering (SNF) algorithm [48] and the estimated propagation delay (EPD) algorithm [49].

Datasets

We compare the four diffusion source identification methods on a series of synthetic and real networks. The synthetic networks are generated with the scale-free (BA) [55] and small-world (WS) [56] models. The parameters for generating the synthetic networks are summarised in Tables 2 and 3. The real networks are of different types, including NetworkScience (https://networkrepository.com/ca-netscience.php) [57], Euroroads (https://networkrepository.com/subelj-euroroad.php) [57], Email (https://networkrepository.com/email-univ.php) [57] and Blogs (https://doi.org/10.1007/978-3-642-01206-8_5) [58]. The topological properties of all networks are summarised in Table 4.

Evaluation metrics

The performance of diffusion source identification methods is commonly evaluated with two metrics [5, 21, 25, 27, 34]: precision and average error distance. Precision evaluates the capability of a method for exact identification (i.e., the proportion of identifications with 0 error hops). For each network, we randomly select 100 nodes as test seeds. For precision, the higher the value, the better the algorithm; for average error distance, the smaller the value, the better the algorithm.
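The two metrics can be computed from the hop distances between true and estimated sources; a minimal sketch, assuming an unweighted adjacency-list representation of the network:

```python
from collections import deque

def hop_distance(adj, a, b):
    """BFS hop distance between nodes a and b in an unweighted graph."""
    if a == b:
        return 0
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        v, d = frontier.popleft()
        for u in adj[v]:
            if u == b:
                return d + 1
            if u not in seen:
                seen.add(u)
                frontier.append((u, d + 1))
    return float("inf")             # b unreachable from a

def evaluate(adj, true_sources, estimated_sources):
    """Precision (proportion of 0-error-hop identifications) and average
    error distance over a set of test seeds."""
    errors = [hop_distance(adj, s, e)
              for s, e in zip(true_sources, estimated_sources)]
    precision = sum(1 for e in errors if e == 0) / len(errors)
    return precision, sum(errors) / len(errors)
```

For example, on a path 1-2-3, identifying sources [1, 2] as [1, 3] gives a precision of 0.5 and an average error distance of 0.5 hops.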

Parameters setting

For an arbitrary network, we assume the delays θ follow a Gaussian distribution [24, 49] whose μ and σ2 are known [50]; here, we set μ/σ = 4 [21, 50]. We assume that the diffusion process on the networks follows the model introduced in Diffusion model. To investigate the performance of NN-DCCF under different propagation rates, we set a relatively large range for β, β ∈ [0.1, 0.9]. The diffusion process terminates when there is no ignorant node.
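For instance, i.i.d. Gaussian delays with μ/σ = 4 can be drawn as below; the truncation at a small positive value is an assumption made here to keep delays physically meaningful, not a detail stated in the paper.

```python
import random

def sample_delays(edges, mu=4.0, sigma=1.0, seed=0):
    """Draw i.i.d. Gaussian propagation delays with mu/sigma = 4, truncated
    at a small positive value so that delays stay positive."""
    rng = random.Random(seed)
    return {e: max(rng.gauss(mu, sigma), 1e-6) for e in edges}
```

With μ = 4σ, the probability of drawing a non-positive delay is negligible, so the truncation rarely triggers.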

The choice of a suitable observer placement strategy may depend on the topology of the network [61]. Although many methods [51] can be used to select the observers, sometimes there may be no significant difference between placement strategies in terms of source identification performance [62]. In this paper, the observers are selected by the strategy introduced in Section Stage 1. Here, the K vital nodes are selected by degree centrality (DC) [51], owing to its simplicity and efficiency. For each network, we select 1% of the nodes as vital nodes. Then, by extending the neighbours within 1 hop of the vital nodes, we obtain the observation areas; the details are shown in Table 5. Other general parameters are also summarised in Table 5. All four compared methods adopt the same observers to identify the diffusion source.

The parameter settings of GLSTM-AE are summarised in Table 6. Meanwhile, we generate the training dataset of GLSTM-AE for each network with the simple method introduced in Section Stage 2. To emphasize local structure, we set l ∈ [2, 4]. The training dataset sizes of GLSTM-AE on the different networks are shown in Table 7, and the training parameter settings are summarised in Table 8. Because the purpose is to identify the diffusion source, we show the accuracy of GLSTM-AE through the source identification results, which can be found in Figs 4–11 and Table 10.

Fig 4. The error distance of TRBS, SNF, EPD and NN-DCCF methods on BA model (1).

https://doi.org/10.1371/journal.pone.0285563.g004

Fig 5. The error distance of TRBS, SNF, EPD and NN-DCCF methods on BA model (2).

https://doi.org/10.1371/journal.pone.0285563.g005

Fig 6. The error distance of TRBS, SNF, EPD and NN-DCCF methods on WS model (1).

https://doi.org/10.1371/journal.pone.0285563.g006

Fig 7. The error distance of TRBS, SNF, EPD and NN-DCCF methods on WS model (2).

https://doi.org/10.1371/journal.pone.0285563.g007

Fig 8. The error distance of TRBS, SNF, EPD and NN-DCCF methods on NetworkScience network.

https://doi.org/10.1371/journal.pone.0285563.g008

Fig 9. The error distance of TRBS, SNF, EPD and NN-DCCF methods on Euroroads network.

https://doi.org/10.1371/journal.pone.0285563.g009

Fig 10. The error distance of TRBS, SNF, EPD and NN-DCCF methods on Email network.

https://doi.org/10.1371/journal.pone.0285563.g010

Fig 11. The error distance of TRBS, SNF, EPD and NN-DCCF methods on Blogs network.

https://doi.org/10.1371/journal.pone.0285563.g011

Table 7. The training dataset size of GLSTM-AE and SCNN on different networks.

https://doi.org/10.1371/journal.pone.0285563.t007

Table 8. The training parameters set of GLSTM-AE and SCNN on different networks.

https://doi.org/10.1371/journal.pone.0285563.t008

The parameter settings of SCNN are summarised in Table 9. Further, for each network, we generate the training dataset of SCNN by Algorithm 3. The training dataset sizes of SCNN on the different networks are shown in Table 7, and the training parameter settings are summarised in Table 8. In SCNN, we adopt batch normalization as the normalization layer [19, 20]. Since the purpose is to identify the diffusion source, we validate the performance of SCNN through the source identification results, which can be found in Figs 4–11 and Table 10.

Table 10. The average error distance of TRBS, SNF, EPD and NN-DCCF on different networks.

https://doi.org/10.1371/journal.pone.0285563.t010

Experimental results and discussion

Figs 4–11 show the error distance of the four methods on the different networks, and Table 10 shows their average error distance. From Figs 4 to 11, we can see that the precisions (i.e., the proportion of 0 error hops) achieved by NN-DCCF on the eight networks are 74%, 83%, 58%, 66%, 22%, 19%, 83% and 49%, respectively. Except on WS model (2), NN-DCCF achieves the best precision. On WS model (2), the precision of NN-DCCF is inferior only to TRBS, and superior to the other two methods. From Table 10, we see that NN-DCCF is superior to the other three methods in average error distance on all networks. Therefore, NN-DCCF is a feasible and effective method for accurately identifying the diffusion source. Additionally, from Table 4, we know that the eight networks differ in their topological properties, which indicates that NN-DCCF can effectively identify the source on different types of networks by simply adjusting the training parameters. NN-DCCF is therefore a general source identification framework.

Conclusion

This paper defines the diffusion direction and time information of observers as diffusion characteristics, and develops NN-DCCF to identify the diffusion source by classifying these characteristics. First, we utilize the diffusion characteristics to construct a network snapshot feature. Then, we propose a GLSTM-AE, by which the network snapshot feature is represented as low-dimensional vectors. Further, we propose an SCNN to identify the diffusion source. With NN-DCCF, source identification is converted into a classification problem. The feasibility and effectiveness of NN-DCCF are validated by experimental results on a series of synthetic and real networks. In future work, we will generalize NN-DCCF to the multi-source case.

Supporting information

References

  1. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU. Complex networks: Structure and dynamics. Physics Reports. 2006; 424(4): 175–308.
  2. Chang S, Pierson E, Koh PW, Gerardin J, Redbird B, Grusky D, et al. Mobility network models of COVID-19 explain inequities and inform reopening. Nature. 2021; 589(7840): 82–87. pmid:33171481
  3. Zhu L, Yang F, Guan G, Zhang Z. Modeling the dynamics of rumor diffusion over complex networks. Information Sciences. 2021; 562: 240–258.
  4. Wang Y, Wen S, Xiang Y, Zhou W. Modeling the Propagation of Worms in Networks: A Survey. IEEE Communications Surveys & Tutorials. 2014; 16(2): 942–960.
  5. Jiang J, Sheng W, Shui Y, Yang X, Zhou W. Identifying Propagation Sources in Networks: State-of-the-Art and Comparative Studies. IEEE Communications Surveys & Tutorials. 2017; 19(1): 465–481.
  6. Brockmann D, Helbing D. The hidden geometry of complex, network-driven contagion phenomena. Science. 2013; 342(6164): 1337–1342. pmid:24337289
  7. Wang Y, Zhong L, Du J, Gao J, Wang Q. Identifying the shifting sources to predict the dynamics of COVID-19 in the US. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2022; 32(3): 033104.
  8. Li J, Manitz J, Bertuzzo E, Kolaczyk ED. Sensor-based localization of epidemic sources on human mobility networks. PLoS Computational Biology. 2021; 17(1): e1008545. pmid:33503024
  9. Horn AL, Friedrich H. Locating the source of large-scale outbreaks of foodborne disease. Journal of the Royal Society Interface. 2019; 16(151): 20180624. pmid:30958197
  10. Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE. A survey of deep neural network architectures and their applications. Neurocomputing. 2017; 234: 11–26.
  11. Chamberlain B, Rowbottom J, Gorinova MI, Bronstein M, Webb S, Rossi E. GRAND: Graph Neural Diffusion. Proceedings of the 38th International Conference on Machine Learning. 2021; 139: 1407–1418. Available: http://proceedings.mlr.press/v139/chamberlain21a/chamberlain21a.pdf
  12. Zhang C, Zhao S, Yang Z, Chen Y. A reliable data-driven state-of-health estimation model for lithium-ion batteries in electric vehicles. Frontiers in Energy Research. 2022; 10.
  13. Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems. 2021; 32(1): 4–24. pmid:32217482
  14. Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. Computational Capabilities of Graph Neural Networks. IEEE Transactions on Neural Networks. 2009; 20(1): 81–102. pmid:19129034
  15. Cui P, Wang X, Pei J, Zhu W. A Survey on Network Embedding. IEEE Transactions on Knowledge and Data Engineering. 2019; 31(5): 833–852.
  16. Zhang D, Yin J, Zhu X, Zhang C. Network Representation Learning: A Survey. IEEE Transactions on Big Data. 2020; 6: 3–28.
  17. Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. The Graph Neural Network Model. IEEE Transactions on Neural Networks. 2009; 20(1): 61–80. pmid:19068426
  18. Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional Networks. International Conference on Learning Representations. 2017.
  19. Li L, Zhou J, Jiang Y, Huang B. Propagation source identification of infectious diseases with graph convolutional networks. Journal of Biomedical Informatics. 2021; 116: 103720. pmid:33640536
  20. Dong M, Zheng B, Li G, Li C, Zheng K, Zhou X. Wavefront-Based Multiple Rumor Sources Identification by Multi-Task Learning. IEEE Transactions on Emerging Topics in Computational Intelligence. 2022; 6(5): 1068–1078.
  21. Yang F, Yang S, Peng Y, Yao Y, Wang Z, Li H, et al. Locating the propagation source in complex networks with a direction-induced search based Gaussian estimator. Knowledge-Based Systems. 2020; 195: 105674.
  22. Zhu P, Cheng L, Gao C, Wang Z, Li X. Locating Multi-Sources in Social Networks With a Low Infection Rate. IEEE Transactions on Network Science and Engineering. 2022; 9(3): 1853–1865.
  23. 23. Cheng L, Li X, Han Z, Luo T, Ma L, Zhu P. Path-based multi-sources localization in multiplex networks. Chaos, Solitons & Fractals. 2022; 159: 112139.
  24. 24. Shen Z, Cao S, Wang W, Di Z, Stanley HE. Locating the source of diffusion in complex networks by time-reversal backward spreading. Physical Review. E. 2016; 93(3): 032301. pmid:27078360
  25. 25. Tang W, Ji F, Tay WP. Estimating Infection Sources in Networks Using Partial Timestamps. IEEE Transactions on Information Forensics and Security. 2018; 13(12): 3035–3049.
  26. 26. Hu Z. Wang L, Tang C. Locating the source node of diffusion process in cyber-physical networks via minimum observers. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2019; 29(6): 063117. pmid:31266325
  27. 27. Shah D, Zaman T. Rumors in a Network: Who’s the Culprit? IEEE Transactions on Information Theory. 2011; 57(8): 5163–5181.
  28. 28. Luo W, Tay WP, Leng M. Identifying Infection Sources and Regions in Large Networks. IEEE Transactions on Signal Processing. 2013; 61(11): 2850–2865.
  29. 29. Wang Z, Dong W, Zhang W, Tan CW. Rumor Source Detection with Multiple Observations: Fundamental Limits and Algorithms. SIGMETRICS Perform. Eval. Rev. 2014; 42(1): 1–13.
  30. 30. Wang Z, Dong W, Zhang W, Tan CW. Rooting our Rumor Sources in Online Social Networks: The Value of Diversity From Multiple Observations. IEEE Journal of Selected Topics in Signal Processing. 2015; 9(4): 663–677.
  31. 31. Zhu K, Ying L. Information Source Detection in the SIR Model: A Sample-Path-Based Approach. IEEE/ACM Transactions on Networking. 2016; 24(1): 408–421.
  32. 32. Zhu K, Ying L. A Robust Information Source Estimator with Sparse Observations. Computational Social Networks. 2014; 1(1): 1–21.
  33. 33. Luo W, Tay WP, Leng M. How to Identify an Infection Source With Limited Observations. IEEE Journal of Selected Topics in Signal Processing. 2014; 8(4): 586–597.
  34. 34. Jiang J, Wen S, Yu S, Xiang Y, Zhou W. Rumor Source Identification in Social Networks with Time-Varying Topology. IEEE Transactions on Dependable and Secure Computing. 2018; 15(1): 166–179.
  35. 35. Lokhov AY, Mézard M, Ohta H, Zdeborová L. Inferring the origin of an epidemic with a dynamic message-passing algorithm. Physical Review E. 2014; 90(1): 012801. pmid:25122336
  36. 36. Altarelli F, Braunstein A, Dall’Asta L, Lage-Castellanos A, Zecchina R. Bayesian inference of epidemics on networks via belief propagation. Physical Review Letters. 2014; 112(11): 118701. pmid:24702425
  37. 37. Antulov-Fantulin N, Lančić A, Šmuc T, Štefančić H, Šikić M. Identification of Patient Zero in Static and Temporal Networks: Robustness and Limitations. Physical Review Letters. 2015; 114(24): 248701. pmid:26197016
  38. 38. Yang F, Zhang R, Yao Y, Yuan Y. Locating the propagation source on complex networks with Propagation Centrality algorithm. Knowledge-Based Systems. 2016; 100: 112–123.
  39. 39. Zhou J, Jiang Y, Huang B. Source identification of infectious diseases in networks via label ranking. PLoS ONE. 2021; 16(1): e0245344. pmid:33444390
  40. 40. Chai Y, Wang Y, Zhu L. Information Sources Estimation in Time-Varying Networks. IEEE Transactions on Information Forensics and Security. 2021; PP(99): 2621–2636.
  41. 41. Jiang J, Wen S, Yu S, Xiang Y, Zhou W. K-Center: An Approach on the Multi-Source Identification of Information Diffusion. IEEE Transactions on Information Forensics and Security. 2015; 10(12): 2616–2626.
  42. 42. Cai K, Hong X, Lui JCS. Information Spreading Forensics via Sequential Dependent Snapshots. IEEE/ACM Transactions on Networking. 2018; 26(1): 478–491.
  43. 43. Feizi S, Médard M, Quon G, Kellis M, Duffy K. Network Infusion to Infer Information Sources in Networks. IEEE Transactions on Network Science and Engineering. 2019; 6(3): 402–417.
  44. 44. Chang B, Chen E, Zhu F, Liu Q, Xu T, Wang Z. Maximum a Posteriori Estimation for Information Source Detection. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 2020; 50(6): 2242–2256.
  45. 45. Caputo JG, Hamdi A, Knippel A. Inverse source problem in a forced network. Inverse Problems. 2019; 35(5): 055006.
  46. 46. Fu L, Shen Z, Wang W, Fan Y, Di Z. Multi-source localization on complex networks with limited observers. EPL. 2016; 113(1): 18006.
  47. 47. Paluch R, Lu X, Suchecki K, Szymański BK, Holyst JA. Fast and accurate detection of spread source in large complex networks, Scientific Reports. 2018; 8(1): 2508. pmid:29410504
  48. 48. Wang H, Sun K. Locating source of heterogeneous propagation model by universal algorithm. Europhysics Letters. 2020; 131(4): 48001. https://dx.doi.org/10.1209/0295-5075/131/48001
  49. 49. Wang H, Zhang F, Sun K. An algorithm for locating propagation source in complex networks. Physics Letters A. 2021; 393: 127184.
  50. 50. Pinto PC, T Patrick, V Martin. Locating the source of diffusion in large-scale networks. Physical Review Letters. 2012; 109(6): 068702. pmid:23006310
  51. 51. Lü L, Chen D, Ren X, Zhang Q, Zhang Y, Zhou T. Vital nodes identification in complex networks. Physics Reports. 2016; 650: 1–63.
  52. 52. Sutskever I, Vinyals O, Le QV. Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems. 2014; 27. Available: https://proceedings.neurips.cc/paper/2014/file/a14ac55a4f27472c5d894ec1c3c743d2-Paper.pdf
  53. 53. Srivastava N, Mansimov E, Salakhudinov R. Unsupervised learning of video representations using lstms. International conference on machine learning. 2015. pp. 843–852. Available: http://proceedings.mlr.press/v37/srivastava15.pdf
  54. 54. Dai AM, Le QV. Semi-supervised Sequence Learning. Advances in Neural Information Processing Systems. 2015; 28. Available: https://proceedings.neurips.cc/paper/2015/file/7137debd45ae4d0ab9aa953017286b20-Paper.pdf
  55. 55. Barabasi AL, Albert R. Emergence of Scaling in Random Networks. Science. 1999; 286(5439): 509–512. pmid:10521342
  56. 56. Watts DJ, Strogatz SH. Collective dynamics of’small-world’ networks. Nature. 1998; 393(6684): 440–442. pmid:9623998
  57. 57. Rossi RA, Ahmed NK. The Network Data Repository with Interactive Graph Analytics and Visualization. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015; 29(1): 4292–4293. Available: https://ojs.aaai.org/index.php/AAAI/article/view/9277
  58. 58. Gregory S. Finding overlapping communities using disjoint community detection algorithms. Complex networks. 2009; 207: 47–61.
  59. 59. Newman MEJ. Assortative Mixing in Networks. Physical Review Letters. 2002; 89(20): 208701. pmid:12443515
  60. 60. Yang F, Li X, Xu Y, Liu X, Wang J, Zhang Y, et al. Ranking the spreading influence of nodes in complex networks: An extended weighted degree centrality based on a remaining minimum degree decomposition. Physics Letters A. 2018; 382(34): 2361–2371.
  61. 61. Gajewski Ł, Paluch R, Suchecki K, Sulik A, Szymanski B, Hołyst J. Comparison of observer based methods for source localisation in complex networks. Scientific Reports. 2022; 12: 5079. pmid:35332184
  62. 62. Zhang X, Zhang Y, Lv T, Yin Y. Identification of efficient observers for locating spreading source in complex networks. Physica, A. Statistical mechanics and its applications. 2016; 442: 100–109.