Evolving graph attention networks for dynamic link prediction

Yucai Jiang; Rongying Shan; Gang Fu; Zhuolin Li; Jingjing Sun; Zhongyun Bao; Feifei Wei

doi:10.1371/journal.pone.0349685

Abstract

Graph neural networks (GNNs), which learn node representations via aggregating their neighbors, have shown superior performance and become the de facto efficient toolkit for analyzing and learning from data with structured properties. However, most existing GNNs are designed for static graphs and assume fixed graph structures and node sets. In many real-world applications, graphs evolve continuously over time—with nodes and edges appearing or disappearing—rendering static models insufficient for capturing these temporal dynamics. In this paper, we propose Evolving Graph Attention Networks (EGAT), a novel framework for dynamic graph representation learning. Specifically, EGAT leverages the anisotropic attention mechanism of Graph Attention Networks (GATs) to capture complex inter-node relationships. Crucially, the multi-head attention weights of the GAT are evolved over time via a recurrent neural network (RNN), enabling the model to adaptively adjust the importance of different neighbors as the graph topology and relational dynamics change. This weight-evolving paradigm couples the anisotropic attention mechanism of GATs with a recurrent subnetwork, enabling the joint modeling of topological evolution and temporal relational dynamics. Extensive experiments on benchmark datasets demonstrate that the proposed model consistently outperforms state-of-the-art baselines.

Citation: Jiang Y, Shan R, Fu G, Li Z, Sun J, Bao Z, et al. (2026) Evolving graph attention networks for dynamic link prediction. PLoS One 21(6): e0349685. https://doi.org/10.1371/journal.pone.0349685

Editor: Guangyin Jin, National University of Defense Technology, CHINA

Received: January 31, 2026; Accepted: May 4, 2026; Published: June 1, 2026

Copyright: © 2026 Jiang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data involved in this study are all publicly available and accessible. The BC-Alpha Dataset can be obtained at http://snap.stanford.edu/data/soc-sign-bitcoin-alpha.html. The UCI Dataset can be obtained at http://Konect.cc/Networks/opsahl-ucsocial/. The AS Dataset can be obtained at http://snap.stanford.edu/data/as-733.html.

Funding: This research was supported by the Outstanding Talents Training Program of Anhui Higher Education Institutions in 2021 (Grant No. gxbjZD2021112), funded by the Department of Education of Anhui Province, China.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Graph Neural Networks (GNNs) have attracted great attention for their excellent ability to model the structured data that is ubiquitous in real-world domains such as natural language processing [1,2], computer vision [3,4], chemistry [5] and point clouds [6,7]. A graph comprises nodes and edges, in which the nodes represent entities and the edges represent relationships between the entities. Most GNNs rely on a message passing mechanism [5], in which messages flow along the edges between nodes to update each node in the next layer. To capture long-range node dependencies, multiple layers can be stacked to propagate information across multiple hops [8]. At the final layer, the node and edge representations are used for the downstream task, for instance node classification or link prediction.

These GNNs are usually built for static scenarios, where the graph structure and the total number of nodes do not change. In the real world, however, graphs evolve with their surrounding environment, which makes it necessary to develop dynamic models that adapt to real-life scenarios. For instance, users in a social network gain or lose friends over time, so the node representations should be updated accordingly. Additionally, in the financial domain, transactions carry time stamps that help characterize the nature of a user account, for example whether it is involved in money laundering or fraud. In such realistic scenarios, it is crucial to construct dynamically evolving models that adapt to these sophisticated applications.

Although traditional graph convolutional networks (GCNs) [9] have achieved tremendous success owing to their simplicity and effectiveness, and graph attention networks (GATs) [10] can model multiple relations between nodes, both were developed for static scenarios and are therefore insufficient to model a constantly changing world. In this work, building on GATs, we construct a model that evolves along the temporal dimension: an RNN updates the parameters to capture the dynamism of the nodes and edges, consistent with the evolving sequences of the real world. As a result, our attention mechanism on evolving graphs offers a significant advance over existing GATs when modelling a dynamic world.

Similar works have proposed methods that combine GNNs and RNNs [11–13], where the GNN extracts node features and the RNN performs temporal learning on the learned node features. However, along the temporal axis they use a single GNN to model all graph snapshots, which requires node information over the whole time span and therefore makes good performance hard to guarantee when the graph changes. The subsequent work [14] incorporates the temporal dimension into the GNN to model the dynamic graph structure. However, owing to its simple vanilla GNN framework, it cannot fully capture sophisticated real-world scenarios.

To handle constantly changing graphs, in which new nodes may be added or new connections formed between nodes, we propose evolving graph attention networks (EGAT), which capture the dynamism of the graph over time. Specifically, we evolve the multi-head weights of the GATs along the temporal dimension using RNNs at every time step, which efficiently adapts the model to real scenarios. In addition, our model can handle new nodes that have no historical information.

The contributions of the paper can be summarized as follows:

We introduce a novel weight-evolving paradigm coupling a recurrent subnetwork with the anisotropic attention mechanism of Graph Attention Networks. This design enables the model to jointly capture topological evolution and the fine-grained temporal dynamics of relational strengths between nodes.
We propose a novel model, EGAT, that dynamically adapts to a constantly changing real world in which nodes and edges keep changing, capturing both the evolving graph structure and the interplay between nodes.
Extensive experiments have been conducted on a variety of real-life benchmark datasets, and the results show that our model outperforms state-of-the-art work in almost every aspect.

Related work

In this section, graph neural networks and closely related work on dynamic graphs are introduced.

Graph Neural Networks. The best-known work is the GCN model [9], which linearly approximates localized spectral convolution and iteratively updates the node embeddings through isotropic averaging over the embeddings of neighbouring nodes. Many follow-up works [10,15–18] in the spatial domain were inspired by it, with improvements in different aspects. In particular, GATs [10] propose an anisotropic scheme for aggregating neighbouring nodes, which learns the importance of each neighbour's features and, together with the multi-head architecture, significantly improves the learning capacity.

Dynamic Graphs. Dynamic graph methods are developed to handle constantly changing scenarios and are often derived from static graph methods, with a specific focus on the temporal dimension and the corresponding update methods. Dynamic graphs have recently been broadly investigated in academia and industry [19–21]. For instance, some models apply regularization to static graph embeddings [22,23]. The work [22] gradually refreshes the embedding using an incremental Singular Value Decomposition (SVD) and recomputes the SVD when the error exceeds a threshold. Another line of work [24,25] is based on random walks: it obtains transition probabilities by computing normalized inner products of past node embeddings and then maximizes the probabilities of the sampled random walks.

With deep learning achieving remarkable success in a wide range of applications, such as physics [26] and knowledge graphs [27], DynGEM [28] leverages an auto-encoding framework to minimize the reconstruction loss and the distance between similar nodes, which results in clustering of nodes in the embedding space. The salient features of DynGEM are that it can adapt to a constantly changing graph size and that the learned history can be used to initialize the next training step.

It is worth mentioning that another research direction for dynamic graphs is point processes [29–31]. Specifically, Know-Evolve [29] and DyRep [30] use a point process to model edges and parameterize the intensity function through a neural network that takes the input into account. For more sophisticated scenarios, such as triadic closure, in which a triad comprises three nodes, the work [31] also uses a point process to study the transition from the open form (only two pairs connected) to the closed form (all three pairs connected to each other).

A recent research direction explicitly models the temporal evolution of GNN parameters. A representative work, EvolveGCN [14], employs an RNN to update the weight matrices of a GCN at each time step, decoupling model capacity from changing graph size. However, EvolveGCN [14] inherits the isotropic nature of GCNs—neighbor information is aggregated uniformly after linear transformation—limiting its ability to capture the temporally-varying importance of different edges. In contrast, our EGAT evolves the anisotropic multi-head attention weights of GATs via an RNN, enabling the model to adaptively re-weight neighbor contributions as the graph evolves. This captures richer dynamic patterns in both structural topology and inter-node relational dynamics.

Methods

In this section, we briefly introduce Graph Attention Networks (GATs) and weight evolution. With this preliminary knowledge, our model EGAT is presented in the following subsection.

Graph attention networks

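For GATs, let G = (V, E) be a graph, where V and E denote the sets of nodes and edges, respectively. The node embeddings are denoted as $\mathbf{h} = \{\mathbf{h}_1, \ldots, \mathbf{h}_N\}$ with $\mathbf{h}_i \in \mathbb{R}^{F}$, stacked into $H_t \in \mathbb{R}^{N \times F}$ at time t. Here, N is the total number of nodes of the graph, which may change as the graph evolves along the temporal dimension, and F is the feature dimension of a node. Let $A_t \in \mathbb{R}^{N \times N}$ be the adjacency matrix of the graph at time t, where $A_{i,j} = 1$ if $(i, j) \in E$ and $A_{i,j} = 0$ otherwise. As the real-world scenario changes, $A_t$ also evolves when nodes are removed or added. $\mathcal{N}_i$ denotes the neighbors of node i. To increase the expressive power of the model, GATs apply a shared linear transformation $\mathbf{W} \in \mathbb{R}^{F' \times F}$ to every node, followed by a self-attention mechanism $a: \mathbb{R}^{F'} \times \mathbb{R}^{F'} \to \mathbb{R}$ on the nodes. Thus, the attention coefficients can be obtained:

$$e_{ij} = a\left(\mathbf{W}\mathbf{h}_i, \mathbf{W}\mathbf{h}_j\right) \tag{1}$$

which shows the importance of the feature representation of node j to node i. Then, the softmax function is applied to normalize the coefficients so that they are comparable across different nodes:

$$\alpha_{ij} = \mathrm{softmax}_j\left(e_{ij}\right) = \frac{\exp\left(e_{ij}\right)}{\sum_{l \in \mathcal{N}_i} \exp\left(e_{il}\right)} \tag{2}$$

The aggregated features from the neighbours can be formulated as:

$$\mathbf{h}'_i = \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij} \mathbf{W} \mathbf{h}_j\right) \tag{3}$$

For multi-head attention, Eq (3) is computed independently K times and the resulting features are concatenated:

$$\mathbf{h}'_i = \Big\Vert_{k=1}^{K} \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} \mathbf{W}^{k} \mathbf{h}_j\right) \tag{4}$$

where $\Vert$ denotes concatenation, $\alpha_{ij}^{k}$ represents the normalized attention coefficients obtained from the k-th attention mechanism, and $\mathbf{W}^{k}$ is the linear transformation weight matrix of the k-th head.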

From the above, it can be seen that GATs [10] consist of multi-head attention, which is leveraged to model the sophisticated relationships between nodes.
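As an illustration of Eqs (1) to (4), the following is a minimal NumPy sketch of a multi-head attention layer in the style of GAT [10]. It is not the authors' implementation. It assumes a row-vector convention (features are multiplied as `H @ W`, so `W` has shape F by F'), a single-layer attention function with a LeakyReLU as in the original GAT, an ELU output nonlinearity, and an adjacency matrix that includes self-loops so that every node has at least one neighbour. All function and variable names are hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def elu(x):
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))

def gat_head(H, A, W, a):
    """One attention head: Eqs (1) to (3) without the output nonlinearity.

    H: (N, F) node features; A: (N, N) 0/1 adjacency with self-loops;
    W: (F, F_out) shared linear transformation; a: (2 * F_out,) attention vector.
    """
    Z = H @ W                                    # transformed features, one row per node
    f_out = Z.shape[1]
    # Eq (1): e_ij = LeakyReLU(a^T [z_i || z_j]), split into source and target terms.
    e = leaky_relu((Z @ a[:f_out])[:, None] + (Z @ a[f_out:])[None, :])
    e = np.where(A > 0, e, -np.inf)              # attend only to neighbours
    e = np.exp(e - e.max(axis=1, keepdims=True)) # shift for numerical stability
    alpha = e / e.sum(axis=1, keepdims=True)     # Eq (2): softmax over neighbours j
    return alpha @ Z, alpha                      # Eq (3): attention-weighted sum

def multi_head_gat(H, A, Ws, attn_vecs):
    """Eq (4): run the K heads independently and concatenate their outputs."""
    outs = [elu(gat_head(H, A, W, a)[0]) for W, a in zip(Ws, attn_vecs)]
    return np.concatenate(outs, axis=1)

# Toy example: a 4-node path graph with self-loops, K = 2 heads.
rng = np.random.default_rng(0)
N, F, F_out, K = 4, 3, 2, 2
H = rng.normal(size=(N, F))
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]])
Ws = [rng.normal(size=(F, F_out)) for _ in range(K)]
attn_vecs = [rng.normal(size=2 * F_out) for _ in range(K)]
out = multi_head_gat(H, A, Ws, attn_vecs)        # shape (N, K * F_out)
_, alpha = gat_head(H, A, Ws[0], attn_vecs[0])
```

Each row of `alpha` sums to one and is zero outside the neighbourhood, which is the anisotropic weighting that distinguishes GAT aggregation from the uniform averaging of a GCN.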

Weight evolution

For the GATs with K heads, we employ K RNNs, here long short-term memory networks (LSTM [32]), to evolve the weights $\mathbf{W}^{k}$. Note that the LSTM is not the only choice; other RNNs, such as the gated recurrent unit (GRU), are alternatives. In this setting, the input and output of the LSTM are the head weights $\mathbf{W}^{k}$ of the GATs at consecutive time steps, without involving the node embeddings, so that the system information is carried by the LSTM cell. The update process can be written as:

$$\mathbf{W}_{t}^{k} = \mathrm{LSTM}\left(\mathbf{W}_{t-1}^{k}\right) \tag{5}$$

where $\mathbf{W}_{t}^{k}$ denotes the weight of the k-th head at time t. The detailed process is given in the following subsection.

Evolving graph attention networks

Based on the weight evolution scheme in Eq 5, the evolving graph attention networks (EGAT) are obtained by evolving the weight of each head. As shown in Fig 1(b), the graph structure evolves from time 0 to time 1 through dynamic multi-relations between the neighboring nodes and the central node, driven by the temporal dynamics of the head weights illustrated in Fig 2. In contrast, the vanilla GCN shown in Fig 1(a) has limited expressive power and can hardly capture such dynamism.

Fig 1. (a) The vanilla GCN-based dynamic graph, which has limited expressive power for sophisticated structures.

(b) The overall structure of our proposed method EGAT. Through the RNN bus, the k-th RNN is leveraged to evolve the k-th head weight of the GATs at times 0 and 1, so that the multi-relation between the neighbour nodes and the central node evolves (the detailed process is given in Tables 1 and 2). The scheme can then be extended to more time steps to adapt to more sophisticated scenarios.

https://doi.org/10.1371/journal.pone.0349685.g001

Fig 2. The illustration of EGAT.

v₀ denotes the central node, while v₁, v₂, and v₃ are its neighbour nodes. The weights W of the K heads are evolved from the previous time step to time t, which changes the attention scores accordingly. These weights are shared across the neighbour nodes to capture the dynamic node embeddings and to reveal the interplay between the central and neighbour nodes.

https://doi.org/10.1371/journal.pone.0349685.g002

This distinction stems from a fundamental difference in aggregation paradigms: prior dynamic models such as EvolveGCN [14] rely on isotropic aggregation, where the temporal evolution remains confined to a global feature extractor and is thus agnostic to fine-grained relational dynamics between individual node pairs. In EGAT, however, the evolved weights are coupled with the spatial domain through the attention mechanism: even when the local topology and node features remain unchanged, the drift in weights induces shifts in pairwise attention scores, enabling the model to capture relation decay or interest shift purely through evolving aggregation logic. Because the same recurrent update is applied at every time step, the method extends to time spans of arbitrary length.
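The claim that weight drift alone changes pairwise attention can be checked numerically. The sketch below is a toy illustration, not the authors' code: it keeps the node features, the topology and the attention vector fixed, perturbs only the head weight, and compares the resulting attention scores. The attention function follows the single-layer LeakyReLU form of GAT [10]; the drift magnitude of 0.5 and all names are assumptions made for the example.

```python
import numpy as np

def attention_scores(H, A, W, a):
    """Normalized attention of one head; row i holds the weights node i puts on each j."""
    Z = H @ W
    f_out = Z.shape[1]
    s = (Z @ a[:f_out])[:, None] + (Z @ a[f_out:])[None, :]
    e = np.where(s > 0, s, 0.2 * s)               # LeakyReLU
    e = np.where(A > 0, e, -np.inf)               # mask non-neighbours
    e = np.exp(e - e.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))                       # node features, identical at both times
A = np.ones((3, 3))                               # topology, identical at both times
a = rng.normal(size=4)                            # attention vector for F_out = 2
W_prev = rng.normal(size=(4, 2))                  # head weight at the previous time step
W_next = W_prev + 0.5 * rng.normal(size=(4, 2))   # hypothetical drifted weight at time t

alpha_prev = attention_scores(H, A, W_prev, a)
alpha_next = attention_scores(H, A, W_next, a)
shift = np.abs(alpha_next - alpha_prev).max()     # largest change in any pairwise score
```

A nonzero `shift` with unchanged `H` and `A` is the effect described above: the relative importance of neighbours moves only because the transformation weight moved.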

Table 2 details the weight evolution process for each attention head, where the standard LSTM formulation is extended from vector-valued to matrix-valued states, allowing the recurrent unit to directly update the GAT weight matrices over time.
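Since the tables are not reproduced here, the following NumPy sketch shows one plausible form of a matrix-valued LSTM update for the head weights. It is an assumption-laden illustration rather than the procedure of Table 2: it treats the previous weight as both the input and the hidden state, uses gate matrices that act on the row dimension, broadcasts a column bias, and allocates one cell per head. The class and function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MatrixLSTMCell:
    """LSTM cell whose input, hidden state and cell state are F x F_out matrices."""

    def __init__(self, f_in, rng):
        # One (input, recurrent, bias) triple per gate: forget, input, output, candidate.
        self.params = {
            g: (0.1 * rng.normal(size=(f_in, f_in)),
                0.1 * rng.normal(size=(f_in, f_in)),
                np.zeros((f_in, 1)))
            for g in ("f", "i", "o", "c")
        }

    def _gate(self, name, X, H_prev, act):
        Wx, Uh, b = self.params[name]
        return act(Wx @ X + Uh @ H_prev + b)

    def step(self, X, H_prev, C_prev):
        f = self._gate("f", X, H_prev, sigmoid)
        i = self._gate("i", X, H_prev, sigmoid)
        o = self._gate("o", X, H_prev, sigmoid)
        c_tilde = self._gate("c", X, H_prev, np.tanh)
        C = f * C_prev + i * c_tilde              # element-wise cell update
        return o * np.tanh(C), C                  # new hidden state (the weight), new cell

def evolve_head_weights(W0_list, cells, num_steps):
    """Return weights[t][k], the weight of head k at time t, evolved step by step."""
    weights = [list(W0_list)]
    cell_states = [np.zeros_like(W) for W in W0_list]
    for _ in range(num_steps):
        new_weights = []
        for k, (cell, W_prev) in enumerate(zip(cells, weights[-1])):
            # The previous weight serves as both input and hidden state.
            W_new, cell_states[k] = cell.step(W_prev, W_prev, cell_states[k])
            new_weights.append(W_new)
        weights.append(new_weights)
    return weights

rng = np.random.default_rng(0)
K, F, F_out, T = 2, 4, 3, 5
cells = [MatrixLSTMCell(F, rng) for _ in range(K)]
W0 = [rng.normal(size=(F, F_out)) for _ in range(K)]
weights = evolve_head_weights(W0, cells, T)
```

Note that the update never touches node embeddings, so its cost depends only on the weight dimensions and not on the number of nodes or edges. In this particular sketch the evolved weights are bounded in (-1, 1) because they are produced by an output gate times a tanh.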

Table 1. EGAT: Evolve the weight of the k heads at each time step.

https://doi.org/10.1371/journal.pone.0349685.t001

Table 2. Evolving the weights of each head at all time steps.

https://doi.org/10.1371/journal.pone.0349685.t002

Experiments

In this section, we provide detailed experiments to demonstrate the effectiveness and efficiency of our model EGAT. Specifically, we conduct experiments across multiple datasets, tasks, and compared methods. Moreover, to give a fuller picture, different evaluation metrics are compared. Finally, the results at the best validation epoch are reported for each experiment.

Datasets description

We conduct the experiments on widely used, publicly available benchmark datasets.

SBM Dataset. This synthetic dataset is generated by the stochastic block model (SBM), a widely used random graph model. With the in-block and cross-block probabilities set to 0.2 and 0.1, respectively, we generate the initial snapshot of the dynamic graph. Next, 10–20 nodes are randomly generated at each time step and added to another community. The final synthetic SBM dataset comprises 1,000 nodes, 4,870,863 edges, and 50 timestamps.
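To make the generation of the initial snapshot concrete, here is a small standard-library sketch of sampling an undirected stochastic block model graph with the in-block and cross-block probabilities quoted above. The number of nodes and communities in the example (100 and 4) is an assumption chosen for illustration, since the text does not state the number of blocks, and the function name is hypothetical.

```python
import random

def sbm_snapshot(blocks, p_in=0.2, p_out=0.1, seed=0):
    """Sample an undirected SBM graph.

    blocks[i] is the community index of node i.
    Returns a set of edges (i, j) with i < j and no self-loops.
    """
    rng = random.Random(seed)
    n = len(blocks)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # Same community: in-block probability; otherwise cross-block probability.
            p = p_in if blocks[i] == blocks[j] else p_out
            if rng.random() < p:
                edges.add((i, j))
    return edges

# Hypothetical setup: 100 nodes split evenly into 4 communities.
blocks = [i % 4 for i in range(100)]
edges = sbm_snapshot(blocks)
```

Later snapshots could be produced by changing entries of `blocks` for a few nodes and resampling, which is one simple way to imitate community changes over time.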

BC-Alpha Dataset. It is a who-trusts-whom network among users who trade using Bitcoin. The BC-Alpha dataset consists of 3,777 nodes and 24,173 edges across 136 timestamps.

Autonomous System (AS) Dataset. AS is the second real-world dataset, a communication network in which autonomous systems exchange traffic with their peers. The dataset consists of 6,474 nodes and spans November 8, 1997, to January 2, 2000.

UCI Dataset. UCI is an online community at the University of California, Irvine, in which users send messages to each other; each message defines an edge between the two users. It comprises 1,899 nodes and 59,835 edges spanning 88 timestamps (directed graph) and shows high dynamism in terms of state transitions.

The statistics of the datasets and their train, validation and test splits are given in Table 3.

Table 3. Statistics of the datasets.

https://doi.org/10.1371/journal.pone.0349685.t003

Compared methods

We make a thorough comparison with the following baselines.

GCN [9]. This is a static model built without temporal information, with a single GCN applied to all time steps.

GCN-GRU [33] and GCN-LSTM [11]. These two models are deep learning frameworks that integrate Graph Convolutional Networks (GCNs) with recurrent units (GRU/LSTM) to model spatiotemporal dependencies in dynamic graph data. They are highly effective for tasks requiring the prediction of dynamic node-level opinions or states in large-scale, time-varying networks.

DynGEM [28]. It is a dynamic graph embedding model that uses deep autoencoders to generate stable, low-dimensional node representations for evolving graphs. Its key strengths include temporal stability, scalability for growing graphs, and computational efficiency compared to static embedding methods applied per snapshot.

dyngraph2vec and dyngraph2vecAERNN [34]. This method is highly similar to DynGEM but additionally incorporates past node information. It has several variants: dyngraph2vecAE, dyngraph2vecRNN, and dyngraph2vecAERNN.

EvolveGCN [14]. It is a dynamic graph neural network model that captures temporal evolution in graph-structured data. It adapts Graph Convolutional Network (GCN) parameters directly over time using a recurrent mechanism, which enables it to handle dynamic graphs with changing node sets and to maintain strong performance in tasks such as link prediction, edge classification, and node classification.

COMP-GCN [35]. It proposes the graph Fourier transformation in simplex space, based on which a compositional graph convolutional network layer is introduced.

EGAI [36]. Its core idea is to dynamically filter out harmful or redundant information from neighbor nodes during training by removing specific edges, thereby improving model performance and mitigating over-smoothing.

ADMP-GNN [37]. This framework addresses the limitation of GNNs that use a fixed number of message-passing layers for all nodes. It allows each node to dynamically determine its optimal number of propagation steps based on its local characteristics, leading to improved performance on node classification tasks.

-GNN [38]. This model employs a learned, dynamic weighting mechanism, which forms a weighted ensemble between any base GNN and an MLP. The weight adaptively modulates the GNN’s influence, helping to preserve performance on clean data while improving resilience to attacks.

HGCN [39]. It proposes a self-tuning toolkit using GCN models, which integrates multiple graph representations of event sequences with different choices of node- and graph-level attributes and with temporal dependencies encoded via edge weights.

Results for link prediction and discussion

As shown in Tables 4 and 5, our proposed method demonstrates comprehensive performance advantages in the dynamic graph link prediction task. Across the four datasets (SBM, BC-Alpha, UCI, and AS), our method achieves the best performance on the core metric MAP. In terms of MRR, our method also outperforms the others on the SBM and UCI datasets. All experiments were repeated ten times with different random seeds, reporting the mean and standard deviation of MAP and MRR. Paired t-tests conducted between EGAT and the best-performing baseline on each dataset confirm that the improvements achieved by EGAT are statistically significant (p < 0.05) for the MAP metric on all four datasets, and for the MRR metric on the SBM and UCI datasets. These results collectively demonstrate both the effectiveness and the stability of our proposed method.
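For readers less familiar with the two ranking metrics, the sketch below computes mean average precision (MAP) and mean reciprocal rank (MRR) from scored candidate links. It uses the standard textbook definitions; how candidates are grouped into queries in the reported experiments is not specified in the text, so the per-query grouping, the toy scores and the function names here are assumptions.

```python
def average_precision(scores, labels):
    """AP of one ranked list; labels are 1 for true links and 0 otherwise."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)   # precision at each true link
    return sum(precisions) / len(precisions) if precisions else 0.0

def reciprocal_rank(scores, labels):
    """1 / rank of the highest-scored true link, or 0 if there is none."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            return 1.0 / rank
    return 0.0

def map_and_mrr(queries):
    """queries is a list of (scores, labels) pairs, one per query."""
    aps = [average_precision(s, l) for s, l in queries]
    rrs = [reciprocal_rank(s, l) for s, l in queries]
    return sum(aps) / len(aps), sum(rrs) / len(rrs)

# Two toy queries with made-up scores.
queries = [
    ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]),   # AP = (1/1 + 2/3) / 2, RR = 1
    ([0.2, 0.7, 0.6], [0, 0, 1]),           # AP = 1/2, RR = 1/2
]
map_score, mrr_score = map_and_mrr(queries)
```

MAP rewards placing all true links near the top of the ranking, whereas MRR only looks at the first true link, which is one reason a method can lead on one metric and not the other.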

Table 4. Results for link prediction (MAP). Each column represents one dataset. Results are reported as mean ± standard deviation over five independent runs. The best mean results are shown in bold. * indicates statistically significant improvement over the best baseline (paired t-test, p < 0.05).

https://doi.org/10.1371/journal.pone.0349685.t004

Table 5. Results for link prediction (MRR). Each column represents one dataset. Results are reported as mean ± standard deviation over five independent runs. The best mean results are shown in bold. * indicates statistically significant improvement over the best baseline (paired t-test, p < 0.05).

https://doi.org/10.1371/journal.pone.0349685.t005

While existing methods focus on different aspects of dynamic modeling, they exhibit notable limitations. For instance, EvolveGCN [14] updates GCN parameters solely through recurrent units and therefore struggles to capture the complex, global dynamic interactions within the evolving graph structure. ADMP-GNN [37], although it implements node-level adaptability in the depth of message passing, remains constrained by fixed, non-evolving transformation weights during each aggregation step. This limits its ability to adapt the nature of information fusion across time and topology. Methods like EGAI [36] aim to enhance GNNs by dynamically filtering out harmful neighbor information during training, which primarily addresses static graph quality issues rather than temporal evolution. Their edge-dropping strategy does not inherently model the temporal dynamics or the evolution of transformation functions crucial for dynamic link prediction.

In contrast, our method fundamentally advances dynamic graph modeling by evolving the anisotropic attention weights of Graph Attention Networks via a recurrent subnetwork. This design enables two key advantages: first, by evolving multi-head attention mechanisms rather than homogeneous convolution filters, our model captures the temporal dynamics of relational strengths between central nodes and their neighbors—an aspect that the isotropic weight evolution of EvolveGCN [14] cannot express. Second, by jointly evolving the transformation weights with the message-passing topology, our approach not only considers where information flows (as in ADMP-GNN [37]) but also adaptively modulates how information is transformed and integrated at each step, offering a more nuanced mechanism for capturing dynamic dependencies than the static or filtering-based approaches of models like EGAI [36].

Computational efficiency analysis

We further analyze the computational efficiency of EGAT compared to representative dynamic graph models. Table 6 reports the average training time per epoch and parameter counts on the SBM dataset.

Theoretical Complexity. For a graph with |V| nodes and |E| edges, let $F$ and $F'$ denote the input and output feature dimensions, and K the number of attention heads. Following GAT [10], a single head requires $O(|V| F F' + |E| F')$ operations. EGAT extends this with LSTM-based weight evolution, which incurs an additional per-layer cost for evolving the K independent transformation matrices. The overall per-time-step complexity is:

(6)

For comparison, we examine three representative baselines with distinct architectural choices:

Static GAT [10]: As established in the original work, the per-time-step complexity for K attention heads is , since the individual heads’ computations are fully independent and can be parallelized. No weight evolution overhead is incurred, but this parameter sharing fundamentally limits the model’s capacity to adapt to temporal distribution shifts.

EvolveGCN [14]: This model evolves the weight matrix of a standard GCN layer through an LSTM. The per-time-step cost consists of the GCN convolution operation and the LSTM-based weight update. The GCN convolution exhibits complexity similar to a single attention head without the K factor (i.e., isotropic aggregation). The LSTM update cost scales with the square of the weight matrix size, as the recurrent cell operates on the flattened parameter vector. Critically, EvolveGCN does not employ multi-head attention, resulting in a constant-factor reduction compared to EGAT.

EGAT (Ours): Our method integrates multi-head attention with temporally evolving weights. Relative to EvolveGCN, EGAT introduces an additional factor of K in both the graph convolution and weight evolution terms. Relative to static GAT, EGAT adds the weight evolution overhead. Importantly, this additional term is independent of the graph scale (|V| and |E|) and thus constitutes a negligible fraction of total runtime on large graphs. Furthermore, as noted in [10], the K attention heads are independent and can be computed in parallel.

Efficiency Summary. As reported in Table 6, EGAT requires 1.25× the training time of EvolveGCN and 1.11× that of static GAT, while using only 48,156 parameters—a modest 6% increase over EvolveGCN (45,320) and 7% over static GAT (44,892). The observed runtime ratio is substantially below the theoretical K = 8 factor because the weight evolution component accounts for <5% of total computation and multi-head operations are efficiently GPU-parallelized [10]. Parameter efficiency stems from the K heads sharing the same LSTM evolution mechanism, with additional parameters limited to per-head attention coefficients that scale as rather than . Since dynamic graph models are typically trained offline, this modest overhead is well within acceptable limits for practical deployment.
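The relative parameter overheads quoted above follow directly from the reported parameter counts. The short check below recomputes them; only the three counts come from the text, and the helper name is hypothetical.

```python
# Parameter counts as reported in the efficiency summary.
params = {"EGAT": 48156, "EvolveGCN": 45320, "Static GAT": 44892}

def relative_increase(new, base):
    """Percentage by which `new` exceeds `base`."""
    return 100.0 * (new - base) / base

over_evolvegcn = relative_increase(params["EGAT"], params["EvolveGCN"])   # about 6.3
over_gat = relative_increase(params["EGAT"], params["Static GAT"])        # about 7.3
```

Both values round to the 6% and 7% stated in the summary.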

Table 6. Efficiency comparison on SBM.

https://doi.org/10.1371/journal.pone.0349685.t006

Interpretability analysis of evolving transformation weights

As shown in Fig 3, each curve depicts the Frobenius norm of the linear transformation matrix W_k for the k-th attention head in EGAT. The distinct temporal patterns across heads demonstrate that our model dynamically updates transformation weights. Head 1 shows a strong rising trend, capturing persistent structural evolution, while Heads 5 and 8 exhibit steady declines. The diverse evolving patterns of weights validate that EGAT adaptively modulates feature transformations to match varying topological dynamics, significantly improving the interpretability of dynamic graph representation learning.
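The quantity plotted per head can be computed with a few lines of standard-library Python. The sketch below assumes the evolved weights are available as a nested list indexed by time step and head; the toy matrices are made up so that one head grows and the other shrinks, and they are not values from the experiments.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a matrix given as nested lists."""
    return math.sqrt(sum(x * x for row in matrix for x in row))

def norm_trajectories(weights):
    """weights[t][k] is the weight matrix of head k at time t; returns norms[k][t]."""
    num_heads = len(weights[0])
    return [[frobenius_norm(step[k]) for step in weights] for k in range(num_heads)]

# Toy, made-up weights for two heads over three time steps.
weights = [
    [[[3.0, 4.0]], [[1.0, 0.0]]],
    [[[6.0, 8.0]], [[0.0, 0.5]]],
    [[[9.0, 12.0]], [[0.0, 0.25]]],
]
norms = norm_trajectories(weights)
# norms[0] rises (5, 10, 15) while norms[1] declines (1, 0.5, 0.25).
```

Plotting each list in `norms` against the time index gives one curve per head, in the manner of the figure described above.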

Fig 3. Evolution of multi-head transformation weights over time.

https://doi.org/10.1371/journal.pone.0349685.g003

Conclusions and future works

In this paper, we proposed EGAT, a dynamic graph neural network that evolves the multi-head attention weights of Graph Attention Networks via a recurrent neural network. By coupling RNN-based weight evolution with anisotropic neighborhood aggregation, EGAT effectively captures both the structural evolution of the graph and the temporal variation in relational strengths between nodes. Extensive experiments on four benchmark datasets demonstrate that EGAT consistently outperforms state-of-the-art dynamic graph models on link prediction tasks, establishing a strong new baseline for dynamic graph representation learning.

Future Works. The EGAT framework opens several promising avenues for further investigation. First, the weight evolution paradigm presented in this work is naturally extensible to more advanced sequence modeling backbones. In particular, replacing the recurrent unit with a Transformer-based temporal encoder [40] could allow the model to explicitly attend to long-range historical weight configurations via self-attention, offering an alternative mechanism for capturing cross-time dependencies. Preliminary exploration of this Transformer-enhanced variant constitutes an interesting direction for future research. Second, investigating the spectral properties of evolving attention weights—specifically how the graph Fourier basis transforms [41] over time —may yield deeper theoretical insights into the behavior of dynamic attention mechanisms. Finally, extending EGAT to continuous-time dynamic graphs and evaluating its scalability on industrial-scale datasets represent important practical next steps.

References

1. Yang T, Hu L, Shi C, Ji H, Li X, Nie L. HGAT: Heterogeneous graph attention networks for semi-supervised short text classification. ACM Trans Inf Syst. 2021;39(3):1–29.
2. Zhang M, Qian T. Convolution over hierarchical syntactic and lexical graphs for aspect level sentiment analysis. In: Webber B, Cohn T, He Y, Liu Y, editors. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Online: Association for Computational Linguistics; 2020. p. 3540–9. Available from: https://aclanthology.org/2020.emnlp-main.286/ https://doi.org/10.18653/v1/2020.emnlp-main.286
3. Zhu Y, Xu X, Shen F, Ji Y, Gao L, Shen HT. PoseGTAC: graph transformer encoder-decoder with atrous convolution for 3D human pose estimation. International Joint Conference on Artificial Intelligence; 2021. Available from: https://api.semanticscholar.org/CorpusID:237100475
4. Liu S, Lv P, Zhang Y, Fu J, Cheng J, Li W, et al. Semi-dynamic hypergraph neural network for 3D pose estimation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. IJCAI’20; 2021.
5. Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE. Neural message passing for quantum chemistry. International Conference on Machine Learning; 2017. Available from: https://api.semanticscholar.org/CorpusID:9665943
6. Liu Y, Fan B, Xiang S, Pan C. Relation-shape convolutional neural network for point cloud analysis. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2019. p. 8887–96. https://doi.org/10.1109/CVPR.2019.00910
7. Wei X, Yu R, Sun J. View-GCN: view-based graph convolutional network for 3D shape analysis. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020. p. 1847–56. https://doi.org/10.1109/CVPR42600.2020.00192
8. Xu K, Li C, Tian Y, Sonobe T, Kawarabayashi K, Jegelka S. Representation learning on graphs with jumping knowledge networks. CoRR. 2018:abs/1806.03536. Available from: http://arxiv.org/abs/1806.03536 arXiv:1806.03536.
9. Kipf T, Welling M. Semi-supervised classification with graph convolutional networks. ArXiv. 2016:abs/1609.02907. Available from: https://api.semanticscholar.org/CorpusID:3144218
10. Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph attention networks. arXiv e-prints. 2017:arXiv:1710.10903. Available from: http://arxiv.org/abs/1710.10903
11. Seo Y, Defferrard M, Vandergheynst P, Bresson X. Structured sequence modeling with graph convolutional recurrent networks. International Conference on Neural Information Processing; 2016. Available from: https://api.semanticscholar.org/CorpusID:2687749
12. Trivedi R, Farajtabar M, Biswal P, Zha H. DyRep: learning representations over dynamic graphs. 7th International Conference on Learning Representations, ICLR 2019; 2019 May 6–9; New Orleans, LA, USA. OpenReview.net; 2019. Available from: https://openreview.net/forum?id=HyePrhR5KX
13. Narayan A, O’N Roe PH. Learning graph dynamics using deep neural networks. IFAC-PapersOnLine. 2018;51(2):433–8. 9th Vienna International Conference on Mathematical Modelling. Available from: https://www.sciencedirect.com/science/article/pii/S2405896318300788 https://doi.org/10.1016/j.ifacol.2018.03.074
14. Pareja A, Domeniconi G, Chen J, Ma T, Suzumura T, Kanezashi H, et al. EvolveGCN: evolving graph convolutional networks for dynamic graphs. CoRR. 2019:abs/1902.10191. Available from: http://arxiv.org/abs/1902.10191 arXiv:1902.10191.
15. Dong Y, Liu N, Jalaian B, Li J. EDITS: modeling and mitigating data bias for graph neural networks. CoRR. 2021:abs/2108.05233. Available from: https://arxiv.org/abs/2108.05233 arXiv:2108.05233.
16. Zhuang C, Ma Q. Dual graph convolutional networks for graph-based semi-supervised classification. Proceedings of the 2018 World Wide Web Conference. WWW ’18. Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee; 2018. p. 499–508. Available from: https://doi.org/10.1145/3178876.3186116
17. Aldhahri EA, Almazroi AA, Alkinani MH, Alqarni M, Alghamdi EA, Ayub N. GNN-RMNet: Leveraging graph neural networks and GPS analytics for driver behavior and route optimization in logistics. PLoS One. 2025;20(8):e0328899. pmid:40773479
18. Li J, Yang B, Liu J, Wang X, Wu Z, Huang Q, et al. GNN-FTuckER: a novel link prediction model for identifying suitable populations for tea varieties. PLoS One. 2025;20(5):e0323315. pmid:40424391
19. Lee M, Woo J. TempODEGraphNet: predicting user churn using dynamic social graphs and neural ODEs. PLoS One. 2025;20(6):e0321560. pmid:40489450
20. Parmelee C, Moore S, Morrison K, Curto C. Core motifs predict dynamic attractors in combinatorial threshold-linear networks. PLoS One. 2022;17(3):e0264456. pmid:35245322
21. Li G, Jung JJ. Dynamic graph embedding for outlier detection on multiple meteorological time series. PLoS One. 2021;16(2):e0247119. pmid:33600442
22. Zhang Z, Cui P, Pei J, Wang X, Zhu W. TIMERS: error-bounded SVD restart on dynamic networks. CoRR. 2017:abs/1711.09541. Available from: http://arxiv.org/abs/1711.09541 arXiv:1711.09541.
23. Zhu L, Steeg GV, Galstyan A. Scalable link prediction in dynamic networks via non-negative matrix factorization. CoRR. 2014:abs/1411.3675. Available from: http://arxiv.org/abs/1411.3675 arXiv:1411.3675.
24. Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. CoRR. 2014:abs/1403.6652. Available from: http://arxiv.org/abs/1403.6652 arXiv:1403.6652.
25. Grover A, Leskovec J. node2vec: Scalable feature learning for networks. CoRR. 2016:abs/1607.00653. Available from: http://arxiv.org/abs/1607.00653 arXiv:1607.00653.
26. Cranmer MD, Xu R, Battaglia PW, Ho S. Learning symbolic physics with graph networks. CoRR. 2019:abs/1909.05862. Available from: http://arxiv.org/abs/1909.05862 arXiv:1909.05862.
27. Schlichtkrull M, Kipf TN, Bloem P, van den Berg R, Titov I, Welling M. Modeling relational data with graph convolutional networks. arXiv:1703.06103; 2017. Available from: https://arxiv.org/abs/1703.06103
28. Goyal P, Kamra N, He X, Liu Y. DynGEM: deep embedding method for dynamic graphs. arXiv:1805.11273; 2018. Available from: https://arxiv.org/abs/1805.11273
29. Trivedi R, Dai H, Wang Y, Song L. Know-evolve: deep temporal reasoning for dynamic knowledge graphs. arXiv:1705.05742; 2017. Available from: https://arxiv.org/abs/1705.05742
30. Trivedi R, Farajtabar M, Biswal P, Zha H. Representation learning over dynamic graphs. arXiv:1803.04051; 2018. Available from: https://arxiv.org/abs/1803.04051
31. Zhou L, Yang Y, Ren X, Wu F, Zhuang Y. Dynamic network embedding by modeling triadic closure process. Proc AAAI Conf Artif Intell. 2018;32.
32. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.
33. Zhao X, Chen F, Cho JH. Deep learning for predicting dynamic uncertain opinions in network data. arXiv:1910.05640; 2019. Available from: https://arxiv.org/abs/1910.05640
34. Goyal P, Chhetri SR, Canedo A. dyngraph2vec: Capturing network dynamics using dynamic graph representation learning. Knowl Based Syst. 2020;187:104816. Available from: https://www.sciencedirect.com/science/article/pii/S0950705119302916 https://doi.org/10.1016/j.knosys.2019.06.024
35. Lu S, Wang H, Zhao J. Graph convolutional network for compositional data. Inf Fusion. 2025;117:102798.
36. Liu C, Wu J, Liu W, Hu W. Enhancing graph neural networks by a high-quality aggregation of beneficial information. Neural Netw. 2021;142:20–33. Available from: https://www.sciencedirect.com/science/article/pii/S0893608021001623 https://doi.org/10.1016/j.neunet.2021.04.025
37. Abbahaddou Y, Malliaros FD, Lutzeyer JF, Vazirgiannis M. ADMP-GNN: adaptive depth message passing GNN. arXiv:2509.01170; 2025. Available from: https://arxiv.org/abs/2509.01170
38. Aslan HI, Wiesner P, Xiong P, Kao O. β-GNN: a robust ensemble approach against graph structure perturbation. Proceedings of the 5th Workshop on Machine Learning and Systems. EuroMLSys ’25. New York, NY, USA: Association for Computing Machinery; 2025. p. 168–75. Available from: https://doi.org/10.1145/3721146.3721949
39. Wang F, Ceravolo P, Damiani E. HGCN(O): a self-tuning GCN hypermodel toolkit for outcome prediction in event-sequence data. arXiv:2507.22524; 2025. Available from: https://arxiv.org/abs/2507.22524
40. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. arXiv:1706.03762; 2023. Available from: https://arxiv.org/abs/1706.03762
41. Wei F, Mei K. Frequency inception based graph neural network for relation prediction in knowledge graphs. Knowl Based Syst. 2023;278:110908. Available from: https://www.sciencedirect.com/science/article/pii/S0950705123006585 https://doi.org/10.1016/j.knosys.2023.110908

