Abstract
Diffusion phenomena taking place in complex networks, such as the spread of diseases, rumors and viruses, are usually modelled as diffusion processes. Identifying the diffusion source is crucial for developing strategies to control these harmful diffusion processes. At present, accurately identifying the diffusion source is still an open challenge. In this paper, we define a kind of diffusion characteristics composed of the diffusion direction and time information of observers, and propose a neural networks based diffusion characteristics classification framework (NN-DCCF) to identify the source. The NN-DCCF contains three stages. First, the diffusion characteristics are utilized to construct the network snapshot feature. Then, a graph LSTM auto-encoder is proposed to convert the network snapshot feature into low-dimension representation vectors. Further, a source classification neural network is proposed to identify the diffusion source by classifying the representation vectors. With NN-DCCF, the identification of the diffusion source is converted into a classification problem. Experiments are performed on a series of synthetic and real networks. The results show that the NN-DCCF is feasible and effective in accurately identifying the diffusion source.
Citation: Yang F, Liu J, Zhang R, Yao Y (2023) Diffusion characteristics classification framework for identification of diffusion source in complex networks. PLoS ONE 18(5): e0285563. https://doi.org/10.1371/journal.pone.0285563
Editor: Vincenzo Bonnici, University of Parma, ITALY
Received: November 20, 2022; Accepted: April 26, 2023; Published: May 15, 2023
Copyright: © 2023 Yang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All real datasets in the experiments are available from the networkrepository database. 1. https://networkrepository.com/ca-netscience.php 2. https://networkrepository.com/subelj-euroroad.php 3. https://networkrepository.com/email-univ.php 4. https://doi.org/10.1007/978-3-642-01206-8_5 Source: 1. Rossi RA, Ahmed NK. The Network Data Repository with Interactive Graph Analytics and Visualization. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. Austin, Texas; 2015. p. 4292-4293. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/9277. 2. Gregory S. Finding overlapping communities using disjoint community detection algorithms. In: Complex networks; 2009. p. 47-61. The synthetic datasets in the experiments can be generated by igraph: https://igraph.org/ The related theory can be found in: 1. Barabasi AL, Albert R. Emergence of Scaling in Random Networks. Science. 1999;286(5439):509-512. 2. Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998;393(6684):440. doi:https://doi.org/10.1038/30918.
Funding: This work was supported by: 1. National Natural Science Foundation of China (Grant No. 62062010) (Fan Yang). https://www.nsfc.gov.cn/ 2. Science and Technology Planning Project of Guangxi (Grant No. AD19245101) (Fan Yang). http://kjt.gxzf.gov.cn/ 3. Science and Technology Planning Project of Liuzhou City (Grant No. 2020PAAA0606) (Fan Yang). http://kjj.liuzhou.gov.cn/ 4. Higher Education Innovation Fund project of Gansu (No. 2022A-022) (Yabing Yao). http://jyt.gansu.gov.cn/ 5. National Natural Science Foundation of China (Grant No. 62061003) (Jingxian Liu). https://www.nsfc.gov.cn/ 6. Doctoral Foundation of Guangxi University of Science and Technology (Grant No. 19Z06) (Fan Yang). https://www.gxust.edu.cn/ 7. Longyuan Youth Innovation and Entrepreneurship Talents Team Project of Gansu (No. 2021LQTD24) (Yabing Yao). http://www.gszg.gov.cn/ The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Most complex systems in the real world take the form of networks [1], in which the nodes and edges denote the units and the interactions between units, respectively. Various diffusion phenomena taking place in networks, such as disease spreading [2], rumor diffusion [3] and computer virus propagation [4], are usually modelled as diffusion processes [1]. The ubiquity of these harmful diffusion processes has caused huge losses to human society. Therefore, it is of great theoretical and practical significance to develop effective strategies to control them. One important measure is identifying the diffusion source that initiates the diffusion process on a network, which has attracted widespread attention in recent years [5]. Many existing source identification methods have provided effective solutions for important real-world problems, such as identifying the source of SARS [6], COVID-19 [7] and cholera [8], and finding the source of foodborne disease [9]. However, accurately identifying the diffusion source is still an open challenge.
The success of artificial neural networks has boosted research in many scientific fields [10–12]. In particular, the emergence of graph neural networks (GNNs) [13, 14] and network embedding [15, 16] facilitates the application of artificial neural networks to the irregular structures of networks. GNNs are neural network models that address different graph tasks in an end-to-end way [13]. The most common GNNs include recurrent graph neural networks [17], convolutional graph neural networks [18], graph autoencoders [13], etc. Network embedding comprises various kinds of methods designed for the same task, i.e., network representation learning [13]. Recently, GNNs and network embedding have been successfully applied to some important problems in complex networks [13, 16], such as link prediction and node classification. However, only a few artificial neural network based methods have focused on the diffusion source identification problem [19, 20]. Li et al. [19] proposed a label propagation framework to locate the diffusion source. Owing to the common characteristics between the label propagation framework and graph convolutional networks (GCNs), the source identification is converted into a multi-classification problem. Dong et al. [20] detected multiple sources by utilizing the wavefront information. Since existing GNNs are not a suitable solution for the wavefront based method, they developed a novel multi-task learning model based on an encoder-decoder structure. Different from the two methods in [19] and [20], this paper utilizes the diffusion time and direction information recorded by limited observers to identify the diffusion source. The two types of information have been shown to be helpful in accurately identifying the source [21–26]. We define the two types of information as diffusion characteristics, and identify the diffusion source by classifying the diffusion characteristics.
Although existing GNNs and network embedding are powerful models for processing graph data, neither is suitable for processing the diffusion characteristics, which are dynamically generated in a diffusion process. Therefore, we develop a novel neural networks based diffusion characteristics classification framework, which contains the following three stages: (i) the diffusion characteristics are utilized to construct the network snapshot feature, (ii) a graph LSTM auto-encoder is proposed, by which the network snapshot feature is represented as low-dimension vectors, and (iii) a source classification neural network is proposed to identify the diffusion source by classifying the representation vectors of the network snapshot feature. With the proposed framework, the identification of the diffusion source is converted into a classification problem. Further, the feasibility and effectiveness of this framework are validated by the experimental results.
The rest of this paper is organized as follows. Existing related works are briefly reviewed in Section Related work. The neural networks based diffusion characteristics classification framework is proposed in Section Materials and methods. The experimental results are discussed in Section Results. We conclude this work in Section Conclusion.
Related work
The early diffusion source identification methods were developed for unweighted networks. A systematic method was pioneered by Shah et al. [27], who constructed a source estimator based on a topological quantity termed Rumor Centrality (RC). RC has been extended to identify the diffusion source in more complex environments [28–30]. Zhu et al. [31] proposed a sample path based method termed Jordan Center (JC), which has been improved to identify the diffusion source with limited observations [32–34]. Meanwhile, many methods based on various ideas were proposed for unweighted networks, including the Dynamic Message Passing based method [35], the Belief Propagation based method [36], the Monte-Carlo based method [37], the Rationality Observation based method [38], the Label Ranking framework based method [39], the Time Aggregated Graph based method [40], etc. The above methods are effective in unweighted networks. However, in reality, we have to consider various significant weights associated with the edges in networks, such as the traffic and the time delay.
For weighted networks, Brockmann et al. [6] modeled the Global Mobility Network as a weighted graph, and identified the epidemic source based on a novel effective distance. This method has been extended to identify multiple sources by Jiang et al. [41]. Meanwhile, several methods based on various ideas were proposed to identify the diffusion source in weighted networks [42–44]. However, these methods require knowledge of the states of all nodes. In reality, it is often the case that the states of only a limited number of nodes can be observed [45]. For this problem, many methods were proposed that utilize limited observers, including the Time-Reversal Backward Spreading algorithm [24], the Backward Diffusion-based method [46], the improved Gaussian estimator [47], the Gromov matrix based method [25], the Greedy Optimization based algorithm [26], the Sequential Neighbour Filtering algorithm [48], the Estimated Propagation Delay based algorithms [49], etc. These methods [24–26, 46–49] mainly utilized the diffusion time information of observers to identify the source. Pinto et al. [50] proposed a Gaussian estimator, which is the first method to identify the source by utilizing the diffusion direction information of observers. However, the diffusion direction information is only used in tree graphs. Yang et al. [21] improved the accuracy of the Gaussian estimator on general graphs by utilizing the diffusion direction information of observers. Zhu et al. [22, 23] also proposed a path-based source identification method by utilizing the diffusion direction information of observers. These results indicate that the diffusion time and direction information of observers play important roles in accurately identifying the diffusion source.
Different from all the traditional source identification methods mentioned above, a few artificial neural network based methods have been developed in recent years to identify the source. Li et al. [19] proposed a Source Identification Graph Convolutional Network (SIGN) framework, which requires knowledge of the complete observation. Dong et al. [20] proposed a graph constraint based sequential source identification model. To obtain the wavefront information, this method [20] also requires knowledge of the complete observation. However, in reality, it is often the case that the states of only a limited number of nodes can be observed [45]. In this paper, we identify the diffusion source by utilizing limited observers. We define the diffusion time and direction information of observers as diffusion characteristics, and propose an artificial neural networks based framework to identify the source by classifying the diffusion characteristics. The feasibility and effectiveness of the proposed framework are validated on a series of synthetic and real networks.
Materials and methods
Problem description and overview
The network on which the diffusion process takes place is modelled as a finite, undirected graph, where V and E represent the node set and the edge set, respectively, and θ = {θvu}, where θvu is the random propagation delay associated with an edge vu ∈ E. Generally, the graph is assumed to be known. We assume that the delays {θvu} associated with E are independent and identically distributed (i.i.d.) random variables.
Diffusion model.
We assume that the diffusion process on the network follows a simple diffusion model similar to that of reference [50]. At time t, each node v ∈ V is in exactly one of two states: (i) informed, if it has received the information from any one neighbour, or (ii) ignorant, if it has not been informed so far. Any node v is equally likely to be the source. The diffusion process is initiated by a single source s* at an unknown start time, at which all nodes are ignorant except s*, which is informed. Suppose v is in the ignorant state and receives the information for the first time from one informed neighbour w, thus becoming informed at time tv. Then, v attempts to retransmit the information to all its other neighbours along the edges, so that each such neighbour u receives the information with success probability β at time tv + θvu. If two or more informed neighbours have the same propagation delay to u, u can be informed by only one of them. Once the diffusion process terminates, a network snapshot is generated.
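The diffusion model above can be illustrated with a small event-driven simulation. This is only a minimal sketch under stated assumptions: the function name `simulate_diffusion`, the adjacency-dictionary input, the start time of 0, the clipping of Gaussian delays to positive values, and the single transmission attempt per edge are all choices made for illustration, not details confirmed by the text.

```python
import heapq
import random


def simulate_diffusion(adj, source, beta, mu=4.0, sigma=1.0, seed=None):
    """Simulate the two-state (ignorant/informed) diffusion model.

    adj    : dict mapping node -> list of neighbours (undirected graph)
    source : the single source node s*
    beta   : success probability of each retransmission
    Returns (informed_time, parent): when each node became informed and the
    neighbour it first heard from (None for the source).
    """
    rng = random.Random(seed)
    informed_time = {source: 0.0}  # the real start time is unknown; 0 is assumed
    parent = {source: None}
    queue = []  # pending transmissions: (arrival time, receiver, sender)

    def retransmit(v, t_v, heard_from):
        for u in adj[v]:
            if u == heard_from:
                continue  # v forwards only to its other neighbours
            if rng.random() < beta:  # succeeds with probability beta
                # Gaussian delay, clipped to stay positive (assumption).
                delay = max(1e-9, rng.gauss(mu, sigma))
                heapq.heappush(queue, (t_v + delay, u, v))

    retransmit(source, 0.0, None)
    while queue:
        t, u, v = heapq.heappop(queue)
        if u in informed_time:
            continue  # only the first arrival informs u
        informed_time[u] = t
        parent[u] = v
        retransmit(u, t, v)
    return informed_time, parent
```

The returned `informed_time` and `parent` dictionaries correspond to the time and direction information that an observer would record. Because each edge is tried once in this sketch, some nodes may remain ignorant when β < 1.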
For an arbitrary network, the diffusion model introduced above generates a network snapshot. Generally, the states of only a part of the nodes in the snapshot can be observed; we call these nodes observers. The observations made by the observers provide two types of information [21, 50]: (i) the direction from which the information arrives at an observer, and (ii) the time at which the information arrives at an observer. The two types of information recorded by the observers reflect the true details of the diffusion process, and have been shown to be helpful in accurately identifying the diffusion source [21–26]. In this paper, the two types of information are defined as diffusion characteristics. The purpose is to find the diffusion source s* in the network by utilizing the diffusion characteristics recorded by the observers. We propose a neural networks based diffusion characteristics classification framework (NN-DCCF) to identify the diffusion source, by which the identification of the source is converted into a classification problem. NN-DCCF is composed of the following three stages.
- By selecting vital nodes and extending their neighbours in a given network, we build the observation areas set. Then, for a network snapshot, by utilizing the diffusion characteristics recorded in the observation areas, we construct the network snapshot feature.
- We propose a graph LSTM auto-encoder (GLSTM-AE). By using GLSTM-AE, the network snapshot feature is represented as low-dimension vectors.
- We propose a source classification neural network (SCNN) to estimate the diffusion source by classifying these low-dimension representation vectors.
The overview of NN-DCCF is shown in Fig 1. Frequently used notations are summarized in Table 1.
(a) The vital nodes selected by degree centrality [51] are node 8 and node 44. The observation areas set contains two observation areas: one consists of node 8 and its neighbours within 1 hop distance, and the other consists of node 44 and its neighbours within 1 hop distance. (b) The network snapshot feature has one part per observation area, each composed of the sequence features constructed with the diffusion characteristics recorded in the observers of that area. (c) By using GLSTM-AE, the network snapshot feature is converted into low-dimension representation vectors. (d) With these representation vectors as the input of SCNN, we can estimate the diffusion source.
Stage 1: Constructing network snapshot feature
To utilize the diffusion characteristics of observers to construct the network snapshot feature, the observers set is built with the following strategy. Given a network, we first rank the importance of nodes by a vital nodes identification method [51]. Next, with the ranking results, we select the K most important nodes as vital nodes. Then, for each vital node, we extend its neighbours within h hops distance. Further, each vital node and its extended neighbours are combined to form an observation area, so the observation areas set contains K observation areas, and o denotes a unique observer in an observation area. When the diffusion process occurs on the network and generates a network snapshot, by utilizing the diffusion characteristics, i.e., the diffusion direction and time information, recorded in each observation area, we construct the network snapshot feature
. Here, we set
. The procedure for constructing
is summarised in Algorithm 1.
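The observer selection strategy described in this section can be sketched as follows. This is an illustrative sketch only: it assumes degree centrality as the vital nodes identification method (the choice reported later in the parameter settings), and the function name `build_observation_areas` and the adjacency-dictionary input are hypothetical.

```python
from collections import deque


def build_observation_areas(adj, K, h=1):
    """Return K observation areas, one per vital node.

    Vital nodes are the K highest-degree nodes; each area is the vital node
    plus every node within h hops of it.
    """
    # Rank by degree; ties are broken by node id so the result is deterministic.
    ranked = sorted(adj, key=lambda v: (-len(adj[v]), v))
    areas = []
    for vital in ranked[:K]:
        dist = {vital: 0}
        frontier = deque([vital])
        while frontier:  # breadth-first search, stopped at h hops
            v = frontier.popleft()
            if dist[v] == h:
                continue
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    frontier.append(u)
        areas.append(sorted(dist))
    return areas
```

In this sketch a node may appear in more than one area when the neighbourhoods of two vital nodes overlap; whether the original method removes such overlaps is not stated here.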
Algorithm 1 Network snapshot feature constructing algorithm
Input:
,
and
Output:
1: initialize an empty
2: sort all in
according to the average informed time of
3: for each in
do
4: for each do
5: initialize an empty seq to record a single sequence feature
6: set current node c = o
7: while do
8: if c is in the informed state then
9: add c into seq
10: get next node n according to the diffusion direction information recorded in c and set c = n
11: end if
12: end while
13: reverse the nodes order in seq
14: if 1 < |seq| ≤ lmax then
15: add seq into
16: else if |seq| > lmax then
17: remove the last |seq| − lmax nodes from seq
18: add seq into
19: end if
20: end for
21: end for
22: for each in
do
23: remove duplicated sequence features from
24: sort all sequence features in according to their length
25: if then
26: remove the last sequence features from
27: end if
28: end for
In Algorithm 1, the inputs are the topology of the network, the observation areas set and the network snapshot. The average informed time of an observation area in step 2 is the average of the diffusion time information recorded in that area. Steps 4–20 construct the sequence features of an observation area by traversing each of its observers. Here, steps 7–12 generate a single sequence feature, denoted by seq. A single seq is a basic unit for constructing the network snapshot feature. Generating a single seq depends on the diffusion direction information of observers. Step 13 reverses the order of the current seq. Steps 14–19 add the seq into the sequence features of the current observation area, so that 2 ≤ |seq| ≤ lmax. Further, from step 3 to step 21, the sequence features in each observation area are constructed, which yields the network snapshot feature. Steps 22–28 remove the redundant sequence features and limit the number of sequence features kept for each observation area. A schematic of obtaining the network snapshot feature by using Algorithm 1 is shown in Fig 1(a) and 1(b).
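The steps of Algorithm 1 can be illustrated with the following Python sketch. Several details are assumptions rather than facts from the text: the loop in steps 7–12 is assumed to stop when no arrival direction is recorded for the current node or the node is ignorant, the sort in step 24 is assumed to be by descending length, and `n_max` is a hypothetical name for the size limit applied in steps 25–27.

```python
def build_snapshot_feature(areas, informed_time, parent, l_max=4, n_max=3):
    """Build one list of sequence features per observation area.

    areas         : list of observation areas (lists of observer nodes)
    informed_time : dict node -> time the node became informed
    parent        : dict node -> neighbour the information arrived from
                    (in practice only observers have such records)
    l_max, n_max  : caps on sequence length and on sequences per area
    """
    def avg_time(area):
        times = [informed_time[o] for o in area if o in informed_time]
        return sum(times) / len(times) if times else float("inf")

    feature = []
    for area in sorted(areas, key=avg_time):  # step 2
        seqs = []
        for o in area:  # steps 4-20
            seq, c = [], o
            # Assumed loop condition: stop when there is no record to follow.
            while c is not None and c in informed_time:
                seq.append(c)
                c = parent.get(c)  # follow the recorded arrival direction
            seq.reverse()  # step 13: earliest node first
            if len(seq) > l_max:
                seq = seq[:l_max]  # step 17: drop the trailing nodes
            if len(seq) > 1:
                seqs.append(tuple(seq))
        unique = list(dict.fromkeys(seqs))  # step 23: drop duplicates
        unique.sort(key=len, reverse=True)  # step 24 (descending is assumed)
        feature.append(unique[:n_max])  # steps 25-27
    return feature
```

On a four-node path 0-1-2-3 with source 0, the observer at node 3 yields the sequence (0, 1, 2, 3), which is the diffusion path reconstructed from the recorded directions.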
Stage 2: GLSTM-AE based network snapshot feature representation
From Algorithm 1, we know that each sequence feature, termed seq, in the network snapshot feature consists of several ordered informed nodes. Therefore, seq is a type of sequential data. Since long short-term memory (LSTM; see S1 File) is a powerful tool for modelling sequential data [52–54], we use LSTM networks to learn the representation of seq. However, seq differs from traditional sequential data since it is composed of ordered informed nodes. We therefore propose a graph LSTM auto-encoder (GLSTM-AE) to learn the low-dimension representation of seq. A GLSTM-AE consists of two LSTMs, the encoder LSTM and the decoder LSTM, as shown in Fig 2. GLSTM-AE works as follows. For an arbitrary seq, each node in seq is represented as a one-hot vector with dimension |V|. The input to GLSTM-AE is the one-hot representation of seq. The output of the encoder LSTM after the last input has been read is the low-dimension representation of the one-hot vectors of seq, denoted by r, whose representation dimension is dr. r is the representation result we obtain from the GLSTM-AE. The decoder LSTM reconstructs the input from r. The target of GLSTM-AE is the same as the input.
It is necessary to train GLSTM-AE before it is applied to learn the representations of the sequence features in the network snapshot feature. A simple way to obtain the training data of GLSTM-AE is to generate sequence features with fixed length from the network.
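One possible way to generate fixed-length training sequences and their one-hot encodings is sketched below. The text does not specify how the sequences are drawn from the network, so enumerating simple paths is an assumption made here for illustration, and the names `fixed_length_sequences` and `one_hot` are hypothetical.

```python
def fixed_length_sequences(adj, length):
    """Enumerate every simple path with exactly `length` nodes."""
    paths = []

    def extend(path):
        if len(path) == length:
            paths.append(tuple(path))
            return
        for u in adj[path[-1]]:
            if u not in path:  # keep the path simple (no repeated node)
                extend(path + [u])

    for v in adj:
        extend([v])
    return paths


def one_hot(seq, nodes):
    """Encode a node sequence as a |seq| x |V| list of one-hot rows."""
    index = {v: i for i, v in enumerate(nodes)}
    return [[1.0 if index[v] == j else 0.0 for j in range(len(nodes))]
            for v in seq]
```

Exhaustive enumeration grows quickly with the path length and node degree, so on large networks a random sample of such paths would likely be used instead.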
Because the mean squared error (MSE) loss is commonly used for the regression task, it is suitable for the task of GLSTM-AE. Therefore, we adopt the MSE loss as the loss function of GLSTM-AE, which is described as follows.
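$$\mathcal{L}_{\mathrm{MSE}}(Y, Y^{*}) = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - Y_i^{*}\right)^{2} \tag{1}$$
where Y denotes the output of the decoder LSTM in GLSTM-AE, Y* denotes the one-hot representation of seq, and n is the number of entries in Y.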
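As a small worked example of the reconstruction loss, the sketch below computes a mean squared error between a decoder output and its one-hot target. It assumes the loss is averaged over all entries, which is the usual default for MSE; the function name `mse_loss` is hypothetical.

```python
def mse_loss(output, target):
    """Mean squared error averaged over every entry of two equal-shaped matrices."""
    total, count = 0.0, 0
    for out_row, tgt_row in zip(output, target):
        for y, y_star in zip(out_row, tgt_row):
            total += (y - y_star) ** 2
            count += 1
    return total / count


# One-hot target for a two-node sequence over two nodes, and an imperfect output.
target = [[1.0, 0.0], [0.0, 1.0]]
output = [[0.5, 0.5], [0.0, 1.0]]
loss = mse_loss(output, target)  # (0.25 + 0.25 + 0 + 0) / 4
```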
Then, with the trained GLSTM-AE, we get the low-dimension representation of the network snapshot feature. This process is summarised in Algorithm 2.
Algorithm 2 Network snapshot feature representation algorithm
Input:
Output:
1: initialize an empty ,
2: set
3: set
4: for each in
do
5: for each seq in do
6: input=one-hot(seq), input ∈ R|seq|×|V|
7: r = GLSTM-AE (input),
8: if |seq| < lmax then
9: k = lmax − |seq|
10: pad r with pl for k times
11: end if
12: add r into
13: end for
14: if then
15:
16: pad with pη for k times
17: end if
18: end for
In Algorithm 2, the input is the network snapshot feature. The one-hot(⋅) function in step 6 returns the one-hot representation of the current seq. In step 7, the representation result r of seq is obtained by using the trained GLSTM-AE. Further, steps 8–11 pad each r shorter than lmax with pl, and steps 14–17 pad the representation of each observation area with pη, so that all representations have the same size.
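The two padding steps of Algorithm 2 can be sketched as follows. The values of the padding elements are not recoverable from the text, so zero vectors are assumed here; `n_max` (the number of sequence representations kept per observation area) and the function name `pad_representation` are likewise hypothetical.

```python
def pad_representation(area_reprs, l_max, n_max, d_r, pad_value=0.0):
    """Pad encoded sequences to l_max rows and each area to n_max sequences.

    area_reprs : list (one item per observation area) of encoded sequences;
                 each encoded sequence is a list of d_r-dimensional rows.
    """
    pad_row = [pad_value] * d_r  # plays the role of p_l (assumed zeros)
    pad_block = [list(pad_row) for _ in range(l_max)]  # plays the role of p_eta
    padded = []
    for reprs in area_reprs:
        area = []
        for r in reprs:
            k = l_max - len(r)  # steps 8-11: pad short sequences
            area.append([list(row) for row in r]
                        + [list(pad_row) for _ in range(k)])
        k = n_max - len(area)  # steps 14-17: pad areas with few sequences
        area.extend([list(row) for row in pad_block] for _ in range(k))
        padded.append(area)
    return padded
```

After padding, every observation area contributes a block of the same shape, which is what allows the result to be fed to a fixed-size classifier.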
Stage 3: Identify the diffusion source with SCNN
With Algorithm 2, we get the representation of the network snapshot feature. In this section, with this representation as input, we propose a source classification neural network (SCNN) to identify the diffusion source by classifying it. SCNN is mainly composed of two fully connected layers. To converge faster, we add a normalization layer. The structure of SCNN is shown in Fig 3, where LogSoftmax is used for multi-class classification.
SCNN also needs to be trained before it is applied to identify the diffusion source. The training data of SCNN can be generated by Algorithm 3.
Algorithm 3 SCNN training data generating algorithm
Input:
and
Output: training data collector C
1: specify the number of loops N
2: initialize an empty training data collector C
3: set , βi ∈ (0, 1), ∀i, j ∈ [1, M], βi ≠ βj
4: while N > 0 do
5: for βi ∈ β do
6: for each node v ∈ V do
7: generate by running the diffusion model (see Diffusion model) on
with v as diffusion source and βi as propagation rate
8: generate by Algorithm 1
9: construct corresponding to
by Algorithm 2
10: add a training data (, one-hot(v)) into C
11: end for
12: end for
13: N = N − 1
14: end while
In Algorithm 3, the inputs are the topology of the network and the observation areas set. From step 7 to step 10, given a node v and a propagation rate β, a single training sample is generated, which is composed of the representation of the network snapshot feature and the one-hot representation of v. The SCNN training dataset size is therefore N ⋅ |β| ⋅ |V|.
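The loop structure of Algorithm 3 and the resulting dataset size can be checked with a short sketch. Here `make_feature` is a hypothetical caller-supplied function standing in for steps 7–9 (running one diffusion and applying Algorithms 1 and 2); it is not part of the original method description.

```python
def generate_training_data(nodes, betas, n_loops, make_feature):
    """Collect (feature, one-hot label) pairs following Algorithm 3's loops.

    make_feature(v, beta) is assumed to run one diffusion from source v with
    propagation rate beta and return the padded representation.
    """
    collector = []
    for _ in range(n_loops):
        for beta in betas:
            for i, v in enumerate(nodes):
                label = [1.0 if j == i else 0.0 for j in range(len(nodes))]
                collector.append((make_feature(v, beta), label))
    return collector


# With N = 4 loops, 2 propagation rates and 3 nodes, 4 * 2 * 3 = 24 samples result.
samples = generate_training_data([0, 1, 2], [0.1, 0.5], 4, lambda v, b: (v, b))
```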
Because the cross entropy loss is mainly used for classification, we adopt the cross entropy loss as the loss function of SCNN, which is described as follows.
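$$\mathcal{L}_{\mathrm{CE}}(Z, Z^{*}) = -\sum_{i=1}^{|V|} Z_i^{*}\,\log Z_i \tag{2}$$
where Z denotes the estimated diffusion source obtained by SCNN, i.e., the predicted probability of each node being the source, and Z* denotes the one-hot representation of the true diffusion source.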
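The classification loss can be illustrated with a plain-Python sketch of LogSoftmax followed by cross entropy against a one-hot target. Taking the node with the highest predicted log-probability as the estimated source is an assumption about the final decision step, and all function names below are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy(log_probs, one_hot_target):
    """Cross entropy between predicted log-probabilities and a one-hot target."""
    return -sum(t * lp for t, lp in zip(one_hot_target, log_probs))


def estimate_source(log_probs, nodes):
    """Pick the node with the highest predicted log-probability (assumed rule)."""
    best = max(range(len(nodes)), key=lambda i: log_probs[i])
    return nodes[best]
```

With a one-hot target, the loss reduces to the negative log-probability assigned to the true source, so it is zero only when the network puts all probability mass on the correct node.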
Finally, by combining with the trained SCNN, the algorithm corresponding to NN-DCCF is summarised as Algorithm 4.
Algorithm 4 Diffusion source identification algorithm
Input:
,
and
Output:
1: generate according Algorithm 1
2: construct corresponding to
according to Algorithm 2
3: output = SCNN , output ∈ R|V|
4:
Results
Main experimental environment
Hardware: Dell R740 with 2 Intel(R) Xeon(R) gold 6254 CPU, 1 TB RAM, 1 NVIDIA Tesla V100S GPU with 32 GB GPU memory. Software: Python 3.8.10 + PyTorch 1.10.2 + CUDA 10.2.
Methods for comparison
Essentially, the proposed NN-DCCF is an observers based method. To validate its feasibility and effectiveness, three existing state-of-the-art observers based methods are selected for comparison, including time-reversal backward spreading (TRBS) algorithm [24], sequential neighbour filtering (SNF) algorithm [48] and estimated propagation delay (EPD) algorithm [49].
Datasets
We compare the four diffusion source identification methods on a series of synthetic and real networks. The synthetic networks include the scale-free (BA) model [55] and the small-world (WS) model [56]. The parameters for generating synthetic networks are summarised in Tables 2 and 3. The real networks are of different types, including NetworkScience (https://networkrepository.com/ca-netscience.php) [57], Euroroads (https://networkrepository.com/subelj-euroroad.php) [57], Email (https://networkrepository.com/email-univ.php) [57] and Blogs (https://doi.org/10.1007/978-3-642-01206-8_5) [58]. The topological properties of all networks are summarised in Table 4.
Evaluation metrics
The performance of diffusion source identification methods is commonly evaluated with two metrics [5, 21, 25, 27, 34]: the precision and the average error distance. The precision evaluates the capability of a method in precise identification (i.e. the proportion of tests with 0 error hops). The average error distance is the mean number of hops between the estimated source and the true source. For each network, we randomly select 100 nodes as test seeds. For the precision, the higher the value, the better the algorithm. For the average error distance, the smaller the value, the better the algorithm.
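The two metrics can be computed as in the sketch below, which measures the error of each test as the shortest-path hop count between the true and estimated sources. The function names are hypothetical and the sketch assumes a connected network, so that every pair of nodes has a finite hop distance.

```python
from collections import deque


def hop_distance(adj, a, b):
    """Shortest-path length in hops between a and b (None if unreachable)."""
    dist = {a: 0}
    frontier = deque([a])
    while frontier:
        v = frontier.popleft()
        if v == b:
            return dist[v]
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                frontier.append(u)
    return None


def evaluate(adj, true_sources, estimated_sources):
    """Return (precision, average error distance) over paired test cases."""
    errors = [hop_distance(adj, s, e)
              for s, e in zip(true_sources, estimated_sources)]
    precision = sum(1 for d in errors if d == 0) / len(errors)
    average_error = sum(errors) / len(errors)
    return precision, average_error
```

For example, on a path 0-1-2-3 with true sources [0, 1, 2, 3] and estimates [0, 1, 3, 0], the error hops are 0, 0, 1 and 3, which gives a precision of 0.5 and an average error distance of 1.0.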
Parameters setting
For an arbitrary network, we assume that the propagation delays θ follow a Gaussian distribution N(μ, σ2) [24, 49], where μ and σ2 are known [50]; here, we set μ/σ = 4 [21, 50]. We assume that the diffusion process on networks follows the diffusion model introduced in Diffusion model. To investigate the performance of NN-DCCF under different propagation rates, we set a relatively large range for β, β ∈ [0.1, 0.9]. The diffusion process is terminated when there is no ignorant node.
How to select a suitable observer placement strategy may depend on the topology of a network [61]. Although many methods [51] can be used to select the observers, sometimes there may be no significant difference between placement strategies in terms of source identification performance [62]. In this paper, the observers are selected by the strategy introduced in Section Stage 1. Here, the K vital nodes are selected by degree centrality (DC) [51], owing to the simplicity and efficiency of DC. For each network, we select 1% of the nodes as vital nodes. Then, by extending the neighbours within 1 hop distance of the vital nodes, we obtain the observation areas; the details are shown in Table 5. Other general parameters are also summarised in Table 5. All four compared methods adopt the same observers to identify the diffusion source.
The parameter settings of GLSTM-AE are summarised in Table 6. Meanwhile, we generate the training dataset of GLSTM-AE for each network with the simple method introduced in Section Stage 2. To emphasize the local structure, we set l ∈ [2, 4]. The training dataset sizes of GLSTM-AE on different networks are shown in Table 7. The training parameters of GLSTM-AE on different networks are summarised in Table 8. Because the purpose is to identify the diffusion source, we show the accuracy of GLSTM-AE through the results of source identification, which can be found in Figs 4–11 and Table 10.
The parameter settings of SCNN are summarised in Table 9. Further, for each network, we generate the training dataset of SCNN by Algorithm 3. The training dataset sizes of SCNN on different networks are shown in Table 7. The training parameters of SCNN are summarised in Table 8. In SCNN, we adopt batch normalization as the normalization layer [19, 20]. Since the purpose is to identify the diffusion source, we validate the performance of SCNN through the results of source identification, which can be found in Figs 4–11 and Table 10.
Experimental results and discussion.
Figs 4–11 show the error distance of the four methods on different networks. Table 10 shows the average error distance of the four methods. From Figs 4 to 11, we can see that the precisions (i.e. the proportion of 0 error hop) achieved by NN-DCCF on the eight networks are 74%, 83%, 58%, 66%, 22%, 19%, 83% and 49%, respectively. Except on WS model (2), NN-DCCF achieves the best precision. On WS model (2), the precision of NN-DCCF is inferior only to that of TRBS, and superior to those of the other two methods. From Table 10, NN-DCCF is superior to the other three methods in the average error distance on all networks. Therefore, NN-DCCF is a feasible and effective method for accurately identifying the diffusion source. Additionally, Table 4 shows that the eight networks differ in their topological properties, which indicates that NN-DCCF can effectively identify the source on different types of networks by simply modifying the training parameters. Therefore, NN-DCCF is a general source identification framework.
Conclusion
This paper defines the diffusion direction and time information of observers as diffusion characteristics, and develops NN-DCCF to identify the diffusion source by classifying the diffusion characteristics. First, we utilize the diffusion characteristics to construct the network snapshot feature. Then, we propose a GLSTM-AE by which the network snapshot feature is represented as low-dimension vectors. Further, we propose an SCNN to identify the diffusion source. By using NN-DCCF, the identification of the diffusion source is converted into a classification problem. The feasibility and effectiveness of NN-DCCF are validated by the experimental results on a series of synthetic and real networks. In future work, we will generalize NN-DCCF to the multi-source case.
References
- 1. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU. Complex networks: Structure and dynamics. Physics Reports. 2006; 424(4): 175–308.
- 2. Chang S, Pierson E, Koh PW, Gerardin J, Redbird B, Grusky D, et al. Mobility network models of COVID-19 explain inequities and inform reopening. Nature. 2021; 589(7840): 82–87. pmid:33171481
- 3. Zhu L, Yang F, Guan G, Zhang Z. Modeling the dynamics of rumor diffusion over complex networks. Information Sciences. 2021; 562: 240–258.
- 4. Wang Y, Wen S, Xiang Y, Zhou W. Modeling the Propagation of Worms in Networks: A Survey. IEEE Communications Surveys & Tutorials. 2014; 16(2): 942–960.
- 5. Jiang J, Sheng W, Shui Y, Yang X, Zhou W. Identifying Propagation Sources in Networks: State-of-the-Art and Comparative Studies. IEEE Communications Surveys & Tutorials. 2017; 19(1): 465–481.
- 6. Brockmann D, Helbing D. The hidden geometry of complex, network-driven contagion phenomena. Science. 2013; 342(6164): 1337–1342. pmid:24337289
- 7. Wang Y, Zhong L, Du J, Gao J, Wang Q. Identifying the shifting sources to predict the dynamics of COVID-19 in the US. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2022; 32(3): 033104.
- 8. Li J, Manitz J, Bertuzzo E, Kolaczyk ED. Sensor-based localization of epidemic sources on human mobility networks. PLoS Computational Biology. 2021; 17(1): e1008545. pmid:33503024
- 9. Horn AL, Friedrich H. Locating the source of large-scale outbreaks of foodborne disease. Journal of the Royal Society Interface. 2019; 16(151): 20180624. pmid:30958197
- 10. Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE. A survey of deep neural network architectures and their applications. Neurocomputing. 2017; 234: 11–26.
- 11. Chamberlain B, Rowbottom J, Gorinova MI, Bronstein M, Webb S, Rossi E. GRAND: Graph Neural Diffusion. Proceedings of the 38th International Conference on Machine Learning. 2021; 139: 1407–1418. Available: http://proceedings.mlr.press/v139/chamberlain21a/chamberlain21a.pdf
- 12. Zhang C, Zhao S, Yang Z, Chen Y. A reliable data-driven state-of-health estimation model for lithium-ion batteries in electric vehicles. Frontiers in Energy Research. 2022; 10.
- 13. Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems. 2021; 32(1): 4–24. pmid:32217482
- 14. Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. Computational Capabilities of Graph Neural Networks. IEEE Transactions on Neural Networks. 2009; 20(1): 81–102. pmid:19129034
- 15. Cui P, Wang X, Pei J, Zhu W. A Survey on Network Embedding. IEEE Transactions on Knowledge and Data Engineering. 2019; 31(5): 833–852.
- 16. Zhang D, Yin J, Zhu X, Zhang C. Network Representation Learning: A Survey. IEEE Transactions on Big Data. 2020; 6: 3–28.
- 17. Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. The Graph Neural Network Model. IEEE Transactions on Neural Networks. 2009; 20(1): 61–80. pmid:19068426
- 18. Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional Networks. International Conference on Learning Representations. 2017.
- 19. Li L, Zhou J, Jiang Y, Huang B. Propagation source identification of infectious diseases with graph convolutional networks. Journal of Biomedical Informatics. 2021; 116: 103720. pmid:33640536
- 20. Dong M, Zheng B, Li G, Li C, Zheng K, Zhou X. Wavefront-Based Multiple Rumor Sources Identification by Multi-Task Learning. IEEE Transactions on Emerging Topics in Computational Intelligence. 2022; 6(5): 1068–1078.
- 21. Yang F, Yang S, Peng Y, Yao Y, Wang Z, Li H, et al. Locating the propagation source in complex networks with a direction-induced search based Gaussian estimator. Knowledge-Based Systems. 2020; 195: 105674.
- 22. Zhu P, Cheng L, Gao C, Wang Z, Li X. Locating Multi-Sources in Social Networks With a Low Infection Rate. IEEE Transactions on Network Science and Engineering. 2022; 9(3): 1853–1865.
- 23. Cheng L, Li X, Han Z, Luo T, Ma L, Zhu P. Path-based multi-sources localization in multiplex networks. Chaos, Solitons & Fractals. 2022; 159: 112139.
- 24. Shen Z, Cao S, Wang W, Di Z, Stanley HE. Locating the source of diffusion in complex networks by time-reversal backward spreading. Physical Review E. 2016; 93(3): 032301. pmid:27078360
- 25. Tang W, Ji F, Tay WP. Estimating Infection Sources in Networks Using Partial Timestamps. IEEE Transactions on Information Forensics and Security. 2018; 13(12): 3035–3049.
- 26. Hu Z, Wang L, Tang C. Locating the source node of diffusion process in cyber-physical networks via minimum observers. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2019; 29(6): 063117. pmid:31266325
- 27. Shah D, Zaman T. Rumors in a Network: Who’s the Culprit? IEEE Transactions on Information Theory. 2011; 57(8): 5163–5181.
- 28. Luo W, Tay WP, Leng M. Identifying Infection Sources and Regions in Large Networks. IEEE Transactions on Signal Processing. 2013; 61(11): 2850–2865.
- 29. Wang Z, Dong W, Zhang W, Tan CW. Rumor Source Detection with Multiple Observations: Fundamental Limits and Algorithms. SIGMETRICS Perform. Eval. Rev. 2014; 42(1): 1–13.
- 30. Wang Z, Dong W, Zhang W, Tan CW. Rooting out Rumor Sources in Online Social Networks: The Value of Diversity From Multiple Observations. IEEE Journal of Selected Topics in Signal Processing. 2015; 9(4): 663–677.
- 31. Zhu K, Ying L. Information Source Detection in the SIR Model: A Sample-Path-Based Approach. IEEE/ACM Transactions on Networking. 2016; 24(1): 408–421.
- 32. Zhu K, Ying L. A Robust Information Source Estimator with Sparse Observations. Computational Social Networks. 2014; 1(1): 1–21.
- 33. Luo W, Tay WP, Leng M. How to Identify an Infection Source With Limited Observations. IEEE Journal of Selected Topics in Signal Processing. 2014; 8(4): 586–597.
- 34. Jiang J, Wen S, Yu S, Xiang Y, Zhou W. Rumor Source Identification in Social Networks with Time-Varying Topology. IEEE Transactions on Dependable and Secure Computing. 2018; 15(1): 166–179.
- 35. Lokhov AY, Mézard M, Ohta H, Zdeborová L. Inferring the origin of an epidemic with a dynamic message-passing algorithm. Physical Review E. 2014; 90(1): 012801. pmid:25122336
- 36. Altarelli F, Braunstein A, Dall’Asta L, Lage-Castellanos A, Zecchina R. Bayesian inference of epidemics on networks via belief propagation. Physical Review Letters. 2014; 112(11): 118701. pmid:24702425
- 37. Antulov-Fantulin N, Lančić A, Šmuc T, Štefančić H, Šikić M. Identification of Patient Zero in Static and Temporal Networks: Robustness and Limitations. Physical Review Letters. 2015; 114(24): 248701. pmid:26197016
- 38. Yang F, Zhang R, Yao Y, Yuan Y. Locating the propagation source on complex networks with Propagation Centrality algorithm. Knowledge-Based Systems. 2016; 100: 112–123.
- 39. Zhou J, Jiang Y, Huang B. Source identification of infectious diseases in networks via label ranking. PLoS ONE. 2021; 16(1): e0245344. pmid:33444390
- 40. Chai Y, Wang Y, Zhu L. Information Sources Estimation in Time-Varying Networks. IEEE Transactions on Information Forensics and Security. 2021; PP(99): 2621–2636.
- 41. Jiang J, Wen S, Yu S, Xiang Y, Zhou W. K-Center: An Approach on the Multi-Source Identification of Information Diffusion. IEEE Transactions on Information Forensics and Security. 2015; 10(12): 2616–2626.
- 42. Cai K, Hong X, Lui JCS. Information Spreading Forensics via Sequential Dependent Snapshots. IEEE/ACM Transactions on Networking. 2018; 26(1): 478–491.
- 43. Feizi S, Médard M, Quon G, Kellis M, Duffy K. Network Infusion to Infer Information Sources in Networks. IEEE Transactions on Network Science and Engineering. 2019; 6(3): 402–417.
- 44. Chang B, Chen E, Zhu F, Liu Q, Xu T, Wang Z. Maximum a Posteriori Estimation for Information Source Detection. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 2020; 50(6): 2242–2256.
- 45. Caputo JG, Hamdi A, Knippel A. Inverse source problem in a forced network. Inverse Problems. 2019; 35(5): 055006.
- 46. Fu L, Shen Z, Wang W, Fan Y, Di Z. Multi-source localization on complex networks with limited observers. EPL. 2016; 113(1): 18006.
- 47. Paluch R, Lu X, Suchecki K, Szymański BK, Holyst JA. Fast and accurate detection of spread source in large complex networks. Scientific Reports. 2018; 8(1): 2508. pmid:29410504
- 48. Wang H, Sun K. Locating source of heterogeneous propagation model by universal algorithm. Europhysics Letters. 2020; 131(4): 48001. https://dx.doi.org/10.1209/0295-5075/131/48001
- 49. Wang H, Zhang F, Sun K. An algorithm for locating propagation source in complex networks. Physics Letters A. 2021; 393: 127184.
- 50. Pinto PC, Thiran P, Vetterli M. Locating the source of diffusion in large-scale networks. Physical Review Letters. 2012; 109(6): 068702. pmid:23006310
- 51. Lü L, Chen D, Ren X, Zhang Q, Zhang Y, Zhou T. Vital nodes identification in complex networks. Physics Reports. 2016; 650: 1–63.
- 52. Sutskever I, Vinyals O, Le QV. Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems. 2014; 27. Available: https://proceedings.neurips.cc/paper/2014/file/a14ac55a4f27472c5d894ec1c3c743d2-Paper.pdf
- 53. Srivastava N, Mansimov E, Salakhudinov R. Unsupervised learning of video representations using LSTMs. International conference on machine learning. 2015. pp. 843–852. Available: http://proceedings.mlr.press/v37/srivastava15.pdf
- 54. Dai AM, Le QV. Semi-supervised Sequence Learning. Advances in Neural Information Processing Systems. 2015; 28. Available: https://proceedings.neurips.cc/paper/2015/file/7137debd45ae4d0ab9aa953017286b20-Paper.pdf
- 55. Barabasi AL, Albert R. Emergence of Scaling in Random Networks. Science. 1999; 286(5439): 509–512. pmid:10521342
- 56. Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998; 393(6684): 440–442. pmid:9623998
- 57. Rossi RA, Ahmed NK. The Network Data Repository with Interactive Graph Analytics and Visualization. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015; 29(1): 4292–4293. Available: https://ojs.aaai.org/index.php/AAAI/article/view/9277
- 58. Gregory S. Finding overlapping communities using disjoint community detection algorithms. Complex networks. 2009; 207: 47–61.
- 59. Newman MEJ. Assortative Mixing in Networks. Physical Review Letters. 2002; 89(20): 208701. pmid:12443515
- 60. Yang F, Li X, Xu Y, Liu X, Wang J, Zhang Y, et al. Ranking the spreading influence of nodes in complex networks: An extended weighted degree centrality based on a remaining minimum degree decomposition. Physics Letters A. 2018; 382(34): 2361–2371.
- 61. Gajewski Ł, Paluch R, Suchecki K, Sulik A, Szymanski B, Hołyst J. Comparison of observer based methods for source localisation in complex networks. Scientific Reports. 2022; 12: 5079. pmid:35332184
- 62. Zhang X, Zhang Y, Lv T, Yin Y. Identification of efficient observers for locating spreading source in complex networks. Physica A: Statistical Mechanics and its Applications. 2016; 442: 100–109.