Abstract
Nodes that play strategic roles in networks are called critical or influential nodes. For example, in an epidemic, we can control the infection spread by isolating critical nodes; in marketing, we can use certain nodes as the initial spreaders aiming to reach the largest part of the network, or they can be selected for removal in targeted attacks to maximise the fragmentation of the network. In this study, we focus on critical node detection in temporal networks. We propose three new measures to identify the critical nodes in temporal networks: the temporal supracycle ratio, temporal semi-local integration, and temporal semi-local centrality. We analyse the performance of these measures based on their effect on the SIR epidemic model in three scenarios: isolating the influential nodes when an epidemic happens, using the influential nodes as seeds of the epidemic, or removing them to analyse the robustness of the network. We compare the results with existing centrality measures, particularly temporal betweenness, temporal centrality, and temporal degree deviation. The results show that the introduced measures help identify influential nodes more accurately. The proposed methods can be used to detect nodes that need to be isolated to reduce the spread of an epidemic or to serve as initial nodes to speed up the dissemination of information.
Citation: Farahi Z, Abedian R, Rocha LEC, Kamandi A (2025) Critical node detection in temporal social networks, based on global and semi-local centrality measures. PLoS One 20(8): e0327699. https://doi.org/10.1371/journal.pone.0327699
Editor: Giridhar Maji, Asansol Polytechnic, INDIA
Received: September 26, 2024; Accepted: June 19, 2025; Published: August 26, 2025
Copyright: © 2025 Farahi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used in this study were obtained from the SocioPatterns website (http://www.sociopatterns.org). The data are publicly available and can be accessed freely at the provided URL.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Network science is a useful and appropriate mathematical framework to represent social contacts and relations between objects in the real world. It is used in various disciplines, from biological to social sciences. Network models have been applied to study protein-protein interaction networks [1,2], brain networks [3] and neurosciences [4], social networks [5–8], transport networks and routing problems [9], the dynamics of epidemics [10–13], non-fungible tokens (NFTs) [14], the spread of violence within the social network of youngsters [15] and, in combination with maximum entropy network models, misinformation campaigns on Twitter [16]. The goal is to unveil how specific patterns of connections regulate or influence spreading processes.
The complexity of real-world networks has prompted scientists to study different features of networks to unveil what topology they have and what behaviours they may show. Therefore, they proposed different measures to help analyse complex networks, such as the shortest path [17], betweenness [18], node or edge centrality [19], cycles and circuits, and robustness. Novel metrics based on existing fundamental metrics have also been introduced. For example, the Locality-based Structure System (LSS) is based on three different parameters: degree, the node’s k-shell, and the number of triangles in which a node is involved [20]. The cluster coefficient ranking measure (ECRM) is based on the common hierarchy of a node and its neighbours [21]. Each node is labelled based on the k-shell algorithm, and the method uses the common labels in the hierarchy of a node and its neighbours. Another study proposed a generalised degree decomposition (GDD) algorithm to address the drawbacks of the k-shell algorithm for critical node detection [22].
Observations of real-world networks indicate that connections are not static; they change over time. In temporal networks, the nodes’ connections may change at each time step. This temporal feature affects network behavior in response to events occurring within it.
Detecting critical nodes helps in analyzing the network, understanding the impact of these nodes on the flow of information and on epidemics, and ensuring network resilience. For example, when critical nodes act as seed nodes, they can accelerate the spread of an epidemic. Therefore, detecting and isolating these nodes becomes crucial for public health strategies. Additionally, removing these critical nodes can lead to greater disconnection within the network, which is important for assessing network robustness. These nodes facilitate the flow of information. Several methods have been introduced to detect these critical nodes. Most research has focused on static networks, overlooking the dynamic and evolving nature of interactions. In this paper, we introduce methods to detect critical nodes in temporal networks. Since a node’s importance depends on both its local and its global position in the network, we incorporate both aspects into our methods.
This study focuses on local, semi-local and global node centrality measures and introduces three novel metrics to measure the importance of nodes in temporal networks. We compare the results with popular centrality measures such as betweenness centrality, degree deviation, and closeness. In Sect 1, we review the previous works relevant to our research; then, in Sect 2, we define temporal networks and centrality measures and describe the data sets and the epidemic model used for performance testing. In Sect 3, we introduce the proposed new metrics. In Sect 4, we analyse and compare the proposed and existing metrics. Finally, Sect 5 briefly discusses the proposed metrics and their accuracy.
1 Related literature
Considerable research has been devoted to developing methods for detecting critical or influential nodes, as they play a significant role in shaping the behavior of the entire network. Different algorithms have been proposed to detect influential nodes: statistical-based, neural network-based, and diffusion-based, among others [23]. Depending on the network, the nodes represent different objects, such as humans. For example, in ref. [24], the author analyses the influence of nodes representing researchers using various metrics and presents a comprehensive study of metrics to help researchers in different fields. Laplacian distance is another method for analysing node importance in complex networks, and the distance Laplacian centrality (DLC) can be used for critical node detection [25]. This centrality focuses on the node’s role on a global scale using the graph energy. TempoRank, based on a random walk, was introduced for detecting the critical nodes in temporal networks [26].
Nosirov et al. [27] compiled diverse algorithms for determining the shortest path in networks and proposed a comprehensive classification for them. The significance of this measure becomes evident when researchers across different disciplines utilise it; for example, a model is proposed for message passing in neural networks, wherein each node propagates information to all its neighbours via the shortest path [28]. Node degree, inverse local clustering coefficient (ILCC), and neighbours’ degree have been used to identify the most influential nodes [29]. A new degree centrality measure based on the spanning tree, called Multi-Spanning Tree-based Degree Centrality (MSTDC), was introduced for detecting the most influential nodes [30].
Influential nodes affect the spread of rumours, information, and epidemics. Researchers develop methods to detect the most influential or critical nodes, such as information diffusion in complex networks based on the SIS (Susceptible-Infected-Susceptible) epidemic model and information competition and cooperation [10]. The robustness of the network is also an important object of study, especially under network attacks or failures. Robustness represents network strength against loss of nodes. Different measures for analysing robustness have been proposed, including the size of the largest connected component, entropy, strength, and skewness [31].
The Game of Thieves (GoT) is a novel approach which models network centrality using a decentralized process where wandering agents collect virtual resources. Unlike traditional methods, GoT estimates centrality in polylogarithmic time [32].
VoteRank-based methods have also been used to detect influential nodes [33]. In this method, in each turn, all nodes vote for their neighbours, and at the end of the turn, the node with the highest score is selected as one of the most influential nodes [34]. Since the voting ability of nodes may be different and based on the coreness of the neighbours, an alternative is to use a coreness-based VoteRank called NCVoteRank [35]. The Recent and Weight strategies have also been proposed to identify critical nodes in temporal networks for effective epidemic control by leveraging information about past temporal contact patterns [36]. The IM-ELPR algorithm for critical node detection uses the h-index to find the seed nodes [37]. After detecting the network communities, it consolidates the small communities to achieve the larger ones and finds the k most influential nodes.
Several real-world networks such as the brain functional connectivity [38], fraud detection in banking [39], epidemics like Covid-19 [40,41] or sexual infections [42], and the behaviour of mobile phone users [43] can be described by temporal networks. The ubiquity of temporal networks in representing the real world motivates researchers to focus on analysing their features and the impact of changing structures on dynamic processes, such as epidemic, information, and flow.
Researchers have analysed the role of important nodes in information flow using centrality metrics such as betweenness and closeness [44,45]. A temporal walk centrality was proposed to analyse information flow [46]. This algorithm is based on a temporal random walk to capture that diffusion spreads not only through the shortest path but can also be distributed to adjacency paths.
Sampling-based algorithms for temporal betweenness [47] and a method to find temporal paths in temporal networks considering waiting time [48] have been proposed. The fact that people may forget about news and not continue to propagate information to others (memory or expiration time) has been used in an algorithm for the temporal reachable set [49]. Apart from node centrality, the researchers also proposed models to measure edge centrality to identify the most important connections in the network [49]. Other researchers have expanded the temporal network to a spatial-temporal network, a layered network made of networks, for example, ESTNet, for analysing and controlling traffic [50].
Table 1 shows a summary of studies on centrality measures and the detection of critical nodes. Some studies have explored critical nodes in temporal networks, but most considered only the local or global positions of nodes within the network.
2 Materials and methods
2.1 Temporal networks
A static network is defined as G(N, E), where N is a set of nodes and E is a set of edges. In temporal networks, edges may be active or inactive at different times, unlike in a static network, where they are always active. For a temporal network in the time interval $[0, T]$, we have $G_{[0,T]} = (N, E_{[0,T]})$ [51,52], where:
$N$ is the set of nodes,
$E_t \subseteq E_{[0,T]}$ is the set of edges active at time $t$, and
$G_{[0,T]}$ comprises a collection of snapshots $G_t = (N, E_t)$ of a graph, one per time step. The edges of the temporal network evolve, and one snapshot of the network can differ from another at different times. The corresponding static network is the union of all snapshots $G_t$.
In the time interval $[0, T]$, the temporal degree $k_i$ of a node $i \in N$ is the number of nodes $j \in N$ that are connected to $i$ in the time interval $[0, T]$ [53]:
$$k_i = \left|\{\, j \in N : \exists\, t \in [0, T],\ (i, j) \in E_t \,\}\right|$$
The connection between i and j may disconnect several times in the interval $[0, T]$, but if i and j are connected in at least one of the snapshots, we consider j a neighbour of i.
Node j is a neighbour of node i in $[0, T]$ if and only if $(i, j) \in E_t$ for some $t \in [0, T]$. Therefore, the set of all neighbours of i in $[0, T]$ is:
$$\Gamma_i = \{\, j \in N : \exists\, t \in [0, T],\ (i, j) \in E_t \,\}$$
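As a minimal sketch of the snapshot representation and the temporal degree described above, one could store the network as a mapping from time step to the set of edges active at that step. The toy data, the function names, and the choice to treat edges as undirected are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical toy data: time step -> set of undirected edges active at that step.
snapshots = {
    1: {("a", "b"), ("b", "c")},
    2: {("a", "b")},
    3: {("c", "d"), ("a", "c")},
}

def temporal_neighbours(snapshots, t_start, t_end):
    """Map each node to the nodes it contacts in at least one snapshot of [t_start, t_end]."""
    neighbours = defaultdict(set)
    for t, edges in snapshots.items():
        if t_start <= t <= t_end:
            for u, v in edges:
                neighbours[u].add(v)
                neighbours[v].add(u)
    return dict(neighbours)

def temporal_degree(snapshots, t_start, t_end):
    """Temporal degree: the number of distinct neighbours within the interval."""
    neigh = temporal_neighbours(snapshots, t_start, t_end)
    return {node: len(nbrs) for node, nbrs in neigh.items()}

neigh = temporal_neighbours(snapshots, 1, 3)
deg = temporal_degree(snapshots, 1, 3)
```

An edge that appears in several snapshots still contributes a single neighbour, which matches the statement that one contact in any snapshot is enough for j to count as a neighbour of i.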
A path in a static network is a sequence of edges connecting two nodes. The distance between two nodes is the number of edges in the path from the source node to the destination node. In temporal networks, a temporal path is a sequence of edges in appearing in a sequence of time snapshots where every two consecutive edges have a common node. A sequence of
represents the path between nodes i and j. In other words, a temporal path is a sequence of edges
which
are in contact in
and edge $e_{n+1}$ appears in the path after edge $e_n$ if and only if
or
.
The distance in the temporal network is the total number of time steps a node needs to reach the destination node from the source node. Consider a path that starts from node j at time step t, after visiting a sequence of edges, reaches node i at time step [54,55], then:
A cycle is a path where the first and last nodes are the same. A temporal cycle is a temporal path in which the first and last nodes are the same. Base cycles are the set of smallest cycles that make up the network. Therefore, they do not have any other node-induced sub-graph that makes a cycle. If is the temporal spanning tree of
and
does not exist in
then
makes a base cycle of
. Therefore, if
then:
2.2 Centrality measures
There are three types of centrality measures: one considers the centrality of nodes locally in their neighbourhood, the second considers the node’s importance on a global scale in the whole network, and the third is between the local and global indexes (hereafter called semi-local).
Local temporal degree centrality: It only considers a node and its direct neighbourhood. It shows the number of nodes a node i can affect directly [56]. In temporal networks, the degree centrality of a node can be different at each time. The temporal degree centrality of node j at time t is the number of nodes connected to j at time t [53].
Local temporal degree deviation (TDD): It quantifies the difference between the temporal and static degrees. A higher value means that the contacts of a node are not always active, whereas a small TDD means that the degree of the node is more or less constant over time. A higher temporal degree means that the node has active connections in more time steps; therefore, it is more important in transferring a flow [57].
Semi-local centrality: It considers not only the direct neighbours of nodes but also the neighbours of neighbours. For node i, and , which is the set of node i’s neighbors, we have [56]:
Semi-local integration centrality: It considers more features related to a node, including features of its sub-network and the weight of the degree. For each node i, an edge e(i, j) connects node i to node j. Then, for each edge e, we count all the base cycles in which edge e is involved. This measure considers the weight of edges and the degree of nodes [49].
(Global) Temporal Betweenness (TB): It is a representative measure of node importance that considers the number of times a node is located on the shortest path between two nodes. A high betweenness for a node means that more information passes through this node to reach other nodes; an attack on this node can disrupt information diffusion. The temporal betweenness finds nodes that are in the temporal shortest path between two nodes. Different algorithms are proposed for betweenness, for example, a polynomial time algorithm [58] and an algorithm for link streams [59].
(Global) Temporal Closeness (TC): It is the sum of the inverses of the shortest-path distances from i to all the other nodes [60,61]. Harmonic closeness has also been used in an algorithm for computing the top-k temporal closeness [62]. Crescenzi et al. proposed an approximation for temporal closeness based on sampling and backward BFS [63].
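To make the notion of temporal distance and closeness concrete, the following is a rough sketch that measures distance as the number of time steps needed to reach a node and sums the inverse distances. It assumes at most one hop per time step and undirected contacts; the function names and toy data are hypothetical, and the cited algorithms [60–63] are more elaborate.

```python
def temporal_distances(snapshots, source, t_start):
    """Earliest-arrival distance, in time steps, from source to each reachable node."""
    dist = {source: 0}
    for t in sorted(snapshots):
        if t < t_start:
            continue
        reached = set(dist)  # nodes reached strictly before this step
        for u, v in snapshots[t]:
            for a, b in ((u, v), (v, u)):
                # Assumption: only one hop can be made within a single time step.
                if a in reached and b not in dist:
                    dist[b] = t - t_start + 1
    return dist

def temporal_closeness(snapshots, source, t_start):
    """Sum of inverse temporal distances to all nodes reachable from source."""
    dist = temporal_distances(snapshots, source, t_start)
    return sum(1.0 / d for node, d in dist.items() if node != source)

# Hypothetical chain whose edges become active one after another.
chain = {1: {("a", "b")}, 2: {("b", "c")}, 3: {("c", "d")}}
closeness_a = temporal_closeness(chain, "a", 1)
closeness_d = temporal_closeness(chain, "d", 1)
```

In this toy chain, node a reaches every other node while node d reaches only c, which illustrates that temporal paths are not symmetric even when the contacts themselves are undirected.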
2.3 Empirical networks
We applied our methods to face-to-face interaction networks using four distinct datasets from the SocioPatterns website (http://sociopatterns.org). These social interactions were captured using wearable RFID sensors such that if two people face each other, an interaction event is recorded. Interaction events are recorded every 20 seconds. The first network was collected during a scientific conference in 2009 [7]. The second network corresponds to interactions between high school students [8]. The third network corresponds to a workplace [64]. The fourth network corresponds to interactions of health-care workers within a hospital ward [65] (Table 2).
2.4 Epidemic models
Epidemic models aim to reproduce an epidemic dynamic process. In the fundamental SIR epidemic network model, nodes can be in one of three states: S for Susceptible, I for Infected, and R for Recovered. When an epidemic starts, all nodes are in the S state except one that is infected (I); this is the seed of the epidemic. When susceptible nodes contact infected nodes, their state changes to I with probability β, and infected nodes recover with probability γ; once they recover, they cannot get infected again [66]. Mathematically, being in the recovered state is thus equivalent to an effective vaccination, since a node cannot get infected any longer. To estimate the diffusion speed of an epidemic outbreak, we report di/dt, which represents the number of newly infected nodes in each time step. We can then study the evolution of the epidemic in the network, such as the peak time and when the epidemic vanishes.
3 Temporal centrality measures
We introduce three novel temporal centrality measures for temporal networks. Each measure will capture different temporal-structural properties of the nodes.
Fig 1 shows a temporal network with nodes indexed from . The edge labels on the network represent the time step in which the edges are connected; thus, the edges are inactive for the rest of the time. Each node in this network has a unique feature. Node V6 has the highest degree, node V7 has edges with the longest active time and more second-order neighbours (neighbours of neighbours) and nodes V2 and V6 are involved in more base cycles.
3.1 Temporal supra cycle ratio
The temporal supracycle ratio (TSCR) is a measure for detecting the most important and influential nodes based on the number of cycles in which they are involved. This measure is based on the static cycle ratio [56]. The basis of TSCR is the number of cycles in which a node and its neighbours are involved.
For a node i:
where is the number of temporal cycles in which two nodes i and j are involved, and
is the total number of temporal cycles of node j. These two parameters are the elements of the temporal cycle matrix(TCM) related to network
.
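The following sketch illustrates how a cycle matrix and a cycle ratio could be computed once the cycles are known. It assumes the form of the static cycle ratio, in which a node's score sums, over every node j sharing a cycle with it, the number of shared cycles divided by the total number of cycles of j, and it assumes that the temporal value is the sum of the per-step values. Including the j = i term, the function names, and the toy cycles are all assumptions; cycle enumeration itself is not shown.

```python
from collections import defaultdict
from itertools import combinations

def cycle_matrix(cycles):
    """Count shared cycles: c[(i, j)] for pairs, c[(i, i)] for the total of node i."""
    c = defaultdict(int)
    for cycle in cycles:
        members = sorted(set(cycle))
        for n in members:
            c[(n, n)] += 1
        for u, v in combinations(members, 2):
            c[(u, v)] += 1
            c[(v, u)] += 1
    return c

def cycle_ratio(cycles):
    """Cycle ratio of each node that takes part in at least one cycle."""
    c = cycle_matrix(cycles)
    nodes = {n for (n, m) in c if n == m}
    return {
        i: sum(c[(i, j)] / c[(j, j)] for j in nodes if c.get((i, j), 0) > 0)
        for i in nodes
    }

def temporal_cycle_ratio(cycles_by_step):
    """Assumed temporal version: sum the per-step cycle ratios over time."""
    total = defaultdict(float)
    for cycles in cycles_by_step.values():
        for node, r in cycle_ratio(cycles).items():
            total[node] += r
    return dict(total)

# Hypothetical cycles observed at two time steps.
cycles_by_step = {
    1: [["V0", "V1", "V2"]],
    2: [["V0", "V1", "V2"], ["V2", "V3", "V4"]],
}
scores = temporal_cycle_ratio(cycles_by_step)
```

In this toy input, V2 lies on both cycles at step 2 and therefore obtains the highest score, in line with the idea that nodes involved in more cycles are more important.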
In Fig 1, if the flow starts from the node V0 in the network, then at the end, the set of cycles is:
These cycles are completed over time and are based on the sequence of nodes activated in a sequence of time steps. However, in the temporal version, the cycle set is different for each time step. For example, for time step 5, the set of cycles is:
The following matrix shows the number of cycles that every two nodes involved (), and the main diagonal of the matrix is the total number of cycles that a node is involved in (
).
Based on Eq 8, we get , and the final result is obtained by summing over time. Table 3 shows the values of TSCR for the sample network nodes. In TSCR, node V6 is the most important node because it is involved in more cycles, followed by node V2, and the rest of the nodes.
Fig 2 shows the TSCR for the simple network. The size and colour of the nodes indicate the node importance based on the TSCR index. By tracing the network cycles, we identify that nodes V6 and V2 are involved in more cycles; thus, nodes V6 and V2 are the most important nodes in the network.
The most important nodes have lighter colour.
3.2 Temporal semi-local integration
Not only are the global features of nodes important, but so are their local features. In addition, SLI states that a node connected to an important edge is itself important, and that the weight of an edge indicates its importance. In temporal semi-local integration (TSLI), we adapt the local and global features used in SLI to temporal networks. TSLI therefore differs from SLI in three basic ways:
- All nodes have the same weight.
- The edge degree is defined based on the total time that the connection is active.
- The cycle is defined based on the active connections in each time step.
The edge cycle factor is defined as follows:
where P(e) is the number of base cycles in which edge e is included. For each node i, TSLI is as follows:
is a set of i’s neighbors and
is the temporal degree of node i:
where w(i, j) is the weight of edge (i, j), which is equal to the total active time of that edge. We adopt this definition because a more active edge indicates higher importance, which in turn makes its end nodes important. The following equation shows the weight of the edge (i, j):
where T is the total time window over which we trace the network behaviour.
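The edge weight as total active time can be sketched by counting the snapshots in which each edge appears. The helper name and toy data are hypothetical, and whether the count is further normalised by the window length T is not assumed here; the raw count is returned.

```python
from collections import Counter

def edge_active_time(snapshots):
    """Count, for each undirected edge, the number of time steps in which it is active."""
    counts = Counter()
    for edges in snapshots.values():
        # Normalise direction so (u, v) and (v, u) are the same edge within a step.
        for edge in {frozenset(e) for e in edges}:
            counts[edge] += 1
    return counts

# Hypothetical contact data over three time steps.
contacts = {
    1: {("a", "b"), ("b", "c")},
    2: {("b", "a")},
    3: {("a", "b")},
}
weights = edge_active_time(contacts)
```

Here the edge between a and b is active in all three steps and so receives three times the weight of the edge between b and c.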
This measure represents the integrity of each node in the neighbourhood. The more cycles a node is involved in, the denser its neighbourhood.
Since we consider the weight of all nodes equal to one, it is possible that +
−
in Eq 12 gives a negative result, and as much it is involved in different cycles, it becomes more negative and less important. Therefore, we use the absolute value of
in Eq 12.
Fig 3 shows the TSLI value for the sample network. Here, node V7 is the most important, followed by node V6. Node V7 is connected to two edges with a high active time; therefore, based on the idea that the node connected to the important edge is also important, the TSLI value of V7 is higher than that of the other nodes. Node V6 is also important because it is involved in more cycles.
The most important nodes have lighter colour.
Table 3 shows the TSLI (Eq 11) for all nodes. Compared with Fig 3, the nodes connected to the more important nodes also have higher TSLI values. A high TSLI indicates that a node is critical in the network.
3.3 Temporal semi-local centrality
Semi-local centrality is based on the neighbours and neighbours of neighbours for node j. Therefore, in static networks, the semi-local centrality counts the second-order neighbours of a node. The TSLC is the SLC measure in the temporal network. Because the connections in the temporal networks are temporal, the TSLC for node j is all the second-order neighbours that are reachable according to consecutive time steps starting at time t. For node j, we have:
The total TSLC(j) for all snapshots is:
We call this measure the semi-local centrality measure because it considers the importance of a node in a wider area than local. The degree centrality index is strictly based on neighbours, but the semi-local centrality extends to neighbours of the neighbours.
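One possible reading of "second-order neighbours reachable in consecutive time steps" is sketched below: the first hop uses contacts active at step t and the second hop uses contacts active at step t + 1, and the total is the sum over all starting steps. This interpretation, the choice to keep first-hop neighbours in the count while excluding the node itself, and all names and data are assumptions for illustration only.

```python
def neighbours_at(snapshots, node, step):
    """Nodes in contact with `node` at the given time step."""
    out = set()
    for u, v in snapshots.get(step, ()):
        if u == node:
            out.add(v)
        elif v == node:
            out.add(u)
    return out

def second_order_at(snapshots, node, t):
    """Nodes reached by one hop at step t followed by one hop at step t + 1."""
    second = set()
    for m in neighbours_at(snapshots, node, t):
        second |= neighbours_at(snapshots, m, t + 1)
    second.discard(node)
    return second

def tslc_total(snapshots, node):
    """Assumed total: sum of second-order neighbour counts over all starting steps."""
    return sum(len(second_order_at(snapshots, node, t)) for t in snapshots)

# Hypothetical contact sequence.
contacts = {
    1: {("a", "b")},
    2: {("b", "c"), ("b", "d")},
    3: {("c", "e")},
}
score_a = tslc_total(contacts, "a")
score_b = tslc_total(contacts, "b")
```

Because the second hop must occur after the first, a contact that happens too early cannot be used, which is what distinguishes this count from the static second-order neighbourhood.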
Table 3 shows the TSLC for the nodes in the sample network. Fig 4 shows nodes V7 and V6 as the most important. Because this measure is based on the second-order neighbours of the nodes, we expect that these nodes will have more second-order neighbours.
The most important nodes have lighter colour.
Fig 4 represents the node importance based on TSLC. Similar to the TSLI, nodes V7 and V6 are the most important nodes since they have more second-order neighbours, node V7 with nine second-order neighbours including {V0,V1,V2,V5,V6,V8} and node V6 with six second-order neighbours {V0,V2,V4,V7,V8,V9}. Both nodes have higher degrees, and their connections are more active than others, which are also in contact with nodes with high degrees.
The centralities TSLC, TSLI, and TSCR each take a distinct approach to evaluating a node’s influence. TSLC considers a node’s direct neighbors and the neighbors of those neighbors. By expanding the analysis to a second layer of connectivity, TSLC provides a broader perspective on how embedded a node is within its local structure. TSLI measures how often a node’s direct neighbors participate in different temporal cycles within the network. Additionally, it adjusts this measure by multiplying it by the ratio of active time steps to the node’s degree. “Active time” represents the weight of the edge between a node and its neighbor, incorporating temporal dynamics into the measure. TSCR focuses on cycles by calculating the proportion of cycles in which each neighbor is involved relative to the total number of cycles of the node.
4 Results
We evaluate the performance of the proposed measures by assessing the impact of node removal on epidemic spread and on network robustness. We perform a sensitivity analysis and compare their performance with that of the most important centrality measures, i.e. betweenness, closeness, and degree.
We use four empirical networks representing real-world social interactions to study epidemic spread and the role of the centrality measures. The conference dataset uses RFID devices to track proximity among individuals [7]. The highschool dataset combines wearable sensors, contact diaries, and online links to map interactions in a French high school [8]. The workplace dataset records interactions in a French office in 2015 [64]. Lastly, the hospital dataset logs over 14,000 contacts in four days between healthcare workers and elderly patients [65].
Each time step in these datasets corresponds to 20 seconds, and for computational convenience we use this 20-second interval as the time step throughout. Then, based on the nodes and the edges between them, a network was created using the “networkx” library in Python, and active times were assigned to each edge.
We first check and compare the detected nodes by all six measures. The comparison includes three cases: the SIR propagation speed when the critical nodes are spreaders, the SIR propagation speed when the critical nodes are removed, and the largest connected component.
4.1 Epidemic spread
There are two important tasks when analysing epidemic spread: predicting and controlling the epidemic outbreak. When an epidemic outbreak occurs, we must isolate critical nodes, i.e. the nodes that regulate the epidemic spread, to contain the epidemic. Both tasks prompted us to analyse the effect of critical nodes on the speed of epidemic spread.
To analyse the effect of the critical nodes on epidemic spreading, we can either isolate them or use them as seed nodes. We use both approaches to evaluate the performance of the proposed measures and compare them with known measures. First, we remove the critical nodes detected by each measure and then run the SIR epidemic model on the network. For this simulation, the initially infected nodes are chosen randomly, and to mitigate the effect of the seed choice, we repeated the simulation 100 times and reported the average $\Omega = \frac{1}{M}\sum_{r=1}^{M}\left(\frac{di}{dt}\right)_r$, where M is the number of runs and $\left(\frac{di}{dt}\right)_r$ represents the rate of change of the number of infected individuals (i) over time (t) in the r-th run. Fig 5(a)–5(d) shows the epidemic spread for the four datasets.
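The averaging over repeated runs described above can be sketched as a step-by-step mean of the new-infection counts. The function name and the toy numbers are hypothetical, and the sketch assumes that all runs cover the same number of time steps.

```python
def mean_incidence(runs):
    """Average, step by step, the new-infection counts of M equally long runs."""
    m = len(runs)
    return [sum(step_values) / m for step_values in zip(*runs)]

# Two hypothetical runs, each listing new infections per time step.
runs = [[1, 2, 0], [3, 0, 0]]
omega = mean_incidence(runs)
peak_value = max(omega)
peak_step = omega.index(peak_value)
```

The position and height of the maximum of the averaged curve correspond to the peak time and peak value discussed for Fig 5.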
(e–h) Epidemic spread in the network when the top 1% nodes are selected as initial spreaders or seeds in the different networks. The horizontal axis represents time, and the vertical axis indicates the extent of disease spread.
The first peak is critical to help control the epidemic spread. Fig 5(a)–5(d) shows that the cycle ratio has the best performance since it shows the lowest value for Ω. Regarding the hospital dataset, the first peak occurs at the beginning, and all the measures behave similarly, but the semi-local centrality decreases faster than other measures. Based on the semi-local centrality and cycle ratio, the epidemic ends sooner. In the conference dataset, the three proposed measures have the best functionality for the first peak. For the highest peak, the value of Ω for the cycle ratio is the lowest, indicating that it has the best recognition for the critical nodes. For the high school dataset, the cycle ratio and degree deviation have the lowest value for the first peak. The local integration has the lowest value in all phases (first peak, highest peak, and ending of the epidemic). Finally, in the workplace dataset, similar to the hospital dataset, all the measures for the first and highest peak behave similarly. Still, the cycle ratio decreases faster and ends the epidemic sooner than the other measures.
The other viewpoint is using critical nodes to increase the epidemic speed. In this case, we chose the most critical nodes as the initial spreaders instead of choosing randomly and then ran the SIR epidemic model. In Fig 5(e)–5(h), the initial spreaders are set to be the top 1% of critical nodes detected by each of the six measures. At each time step, the epidemic spreads from infected nodes to healthy nodes that are in contact with them during that time, with a probability of 1. The epidemic continues until all individuals in the community are infected and no new nodes are left to infect, at which point Ω becomes zero.
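Selecting the top fraction of nodes under a given measure, for example the top 1% used as seeds, can be sketched as a simple ranking step. The rounding rule (round up, with at least one node), the tie-breaking by node label, and the toy scores are assumptions, since the text does not specify them.

```python
import math

def top_fraction(scores, fraction):
    """Return the highest-scoring nodes that make up the given fraction of all nodes."""
    # Assumption: round up and always keep at least one node.
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=lambda n: (-scores[n], str(n)))
    return ranked[:k]

# Hypothetical centrality scores.
scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.9}
seeds = top_fraction(scores, 0.01)
half = top_fraction(scores, 0.5)
```

The same helper could be used to choose the nodes to remove in the isolation experiments by passing the fraction of nodes to be removed.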
The role of initial spreaders is seen in the timing of the epidemic peak and when the epidemic becomes widespread. The sooner and higher the peak occurs, the more critical the initial spreaders were in propagating the epidemic, leading to the population getting infected sooner. In the high school dataset, closeness shows the lowest value for Ω. In the conference dataset, while betweenness has a low value at the beginning, it has a high value at the highest peak, indicating that the nodes infected in the second stage are more critical. In data sets related to the hospital and workplace, the epidemic had the highest value initially, but with TSCR and TSLI, the epidemic reached the whole network sooner. In the workplace data set, TDD and TSCR have the best performance, reaching the entire network. In all cases, TSCR, TSLI, and TSLC achieved the best performance in reaching the peak and infecting the whole network.
In the next experiment, different fractions of critical nodes, ranging from 0.1 to 0.9, are removed. In this simulation, the seed nodes are chosen randomly, and the reported value is the average of M = 100 runs. Table 4 reports the cumulative peak value of Ω and 95% confidence interval (95% C.I.) over all time steps where the total corresponds to E in Table 2.
Removing critical nodes decreases the epidemic speed since they are essential in regulating the epidemic. A lower Ω indicates that the removed nodes were more critical. In most cases, the minimum values are observed for TSCR, TSLC, and TSLI, while the maximum values are mostly for TB and TDD. For the workplace dataset, the reported peak values for the different measures are close to one another. Nonetheless, TSLI performs better, as it has the minimum value in five cases.
Some of the measures have more accurate detection depending on the network. However, well-known measures like betweenness, closeness, and degree deviation focus on only one of the node’s features. In contrast, TSLC, TSLI, and TSCR consider a combination of node features. In the workplace and high school networks, TSLC and TSCR detect the most influential nodes because, in these two networks, the nodes have the most influence on their semi-local neighbours within their communities. In these networks, it is rare for a node to have a global effect; usually, it affects a group of friends. Therefore, we expect the detected nodes to have less influence on betweenness and closeness, which consider the global features of nodes. On the other hand, at conferences where people try to connect with others and form new relationships, measures like betweenness and closeness, which consider the global features of nodes, have the best functionality. In contrast, TSCR, which focuses on semi-local features, is less valuable. Finally, in the hospital dataset, where connections are more uniform, all the measures exhibit similar performance.
4.2 Network robustness
We study the impact of node removal on network fragmentation and evaluate network robustness through percolation theory. As the network becomes denser, the removal of nodes tends to have less effect on the size of the largest connected component (Smax), indicating higher network robustness [56]. Thus, assessing critical node detection through percolation theory provides valuable insights into network resilience [56,67,68].
To compare the accuracy of critical node detection, we order nodes based on six measures of importance. Subsequently, we iteratively remove nodes according to their importance, reporting the largest connected component size for all four networks (Fig 6). Across all datasets, temporal closeness centrality and temporal supracycle ratio exhibit similar behaviour, showing superior performance in the high school and conference datasets by inducing more disconnections, leading to smaller sizes of the largest connected component. In the workplace and hospital datasets, degree deviation demonstrates the best performance, while it performs moderately in the other two datasets. Temporal supracycle ratio and temporal semi-local centrality exhibit the best performance overall, while temporal semi-local integration and temporal betweenness centrality show similar behaviour. The impact of removing critical nodes differs significantly from that of marginal nodes. Removing several nodes with less importance yields a different effect than removing a critical node; even removing a small fraction of essential nodes can result in network disconnection. Therefore, the effectiveness of a measure lies in its ability to identify critical nodes accurately. In our experiment, the supracycle ratio demonstrates the most reliable performance across all four networks, while other measures show promising results in different networks.
The horizontal axis represents the fraction of removed nodes, ranging from 0.1 to 0.9, and the vertical axis indicates the size of the largest connected component after node removal.
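The removal procedure can be sketched as follows. This is a simplified illustration on a static, aggregated edge list rather than the temporal representation used in the study; the function names and the `score` dictionary (any node ranking, for example one of the six measures) are hypothetical.

```python
from collections import deque


def largest_component_size(nodes, edges):
    """Size of the largest connected component among `nodes`, using only edges between them."""
    alive = set(nodes)
    adj = {u: set() for u in alive}
    for u, v in edges:
        if u in alive and v in alive:
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search from every node not yet assigned to a component.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def percolation_curve(nodes, edges, score, fractions):
    """Remove the top-ranked fraction of nodes and record the largest component size."""
    ranked = sorted(nodes, key=lambda u: score[u], reverse=True)
    curve = []
    for f in fractions:
        k = round(f * len(ranked))  # number of highest-scoring nodes removed
        curve.append((f, largest_component_size(ranked[k:], edges)))
    return curve
```

For a given network, a measure whose curve drops faster as the removed fraction grows is identifying nodes whose removal fragments the network more.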
4.3 Correlation analysis
We analyze the similarity between all measures discussed in this study via a Pearson correlation analysis to check whether they capture different information for the same nodes. Highly correlated measures are strongly similar, suggesting that one can serve as a good proxy for the other. In critical situations such as disaster handling, using only one of these correlated measures costs little accuracy. Conversely, when analyzing a network, we can select measures with low correlation since they represent different features of the network or provide different analytical perspectives. Fig 7 shows that for all studied networks, betweenness, closeness, and semi-local centrality have the highest correlations, implying that there is no gain in using them together since they rank the nodes similarly. On the other hand, degree deviation shows a mostly negative correlation with all other measures. Additionally, the cycle ratio has a low or no correlation with other measures, as seen in the hospital dataset. Therefore, in any case, the cycle ratio can be one of the selected measures.
‘U’ indicates undefined correlation.
Since there is no temporal cycle in the dataset related to the hospital, the cycle ratio for the hospital network is a constant value. As Pearson correlation relies on the variability of data points, the correlation for the cycle ratio is undefined (NaN) in the hospital dataset. This is represented by the white color in Fig 7c.
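The undefined case follows directly from the Pearson formula, whose denominator is zero when one variable has no variance. A minimal sketch (the function name `pearson` and the explicit NaN return are choices made here for illustration, not taken from the authors' code):

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    # A constant sequence has zero variance, so the coefficient is undefined.
    if min(x) == max(x) or min(y) == max(y):
        return math.nan
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Passing a constant cycle-ratio vector as `x` returns NaN regardless of `y`, which corresponds to the ‘U’ entries for the hospital network.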
Table 5 shows that TB and TC have a strong correlation across all networks (p-value < .001), especially in the high school network. Similarly, TC and TDD show significant correlations in the workplace and hospital networks, and TSLC and TDD also have a strong relationship. On the other hand, there is no strong correlation between TSCR and TB, TSLC, TC, or TDD (p-value > .05). The same applies to TSLI, which does not show a significant relationship with these measures.
5 Conclusions
Since social interactions change over time, studying and designing algorithms to characterise temporal networks is helpful. Detecting these networks’ critical or more influential nodes is essential because they may be used to control epidemic outbreaks, opinions, and marketing campaigns. We introduced three novel measures for identifying the most critical nodes in temporal networks, considering both the local and global features of the nodes.
We applied these measures to four real-world contact networks in different contexts: a conference, a high school, a workplace, and a hospital. The first measure, temporal supracycle ratio (TSCR), is based on the total number of cycles in which a node is involved; a node involved in more cycles is deemed more important. The second measure, temporal semi-local integration (TSLI), indicates that nodes connected to important edges, defined as edges with more active time, are also important. The third measure, temporal semi-local centrality (TSLC), is based on the second-order neighbours of a node, reflecting its semi-local centrality.
First, we ranked the nodes using these measures and compared the results with known measures, including temporal betweenness, temporal closeness, and temporal degree deviation. We analysed the accuracy of these measures by examining their effect on controlling an epidemic by removing the most critical nodes. The proposed measures demonstrated superior performance in terms of epidemic spread. When the critical nodes identified by a measure are removed, the lower the resulting peak value of Ω, the better the measure performs. In the high school network, TSLC showed the best performance; in the conference dataset, TSCR was most effective; in the hospital dataset, both TSCR and TSLI were optimal; and in the workplace dataset, TSCR was the best for controlling the epidemic through node removal. We also removed different fractions of critical nodes to analyse their role in epidemic spread, and the proposed measures consistently performed well compared to other known measures.
In this paper, we used a variety of empirical networks to examine the impact of removing critical nodes on epidemic spread and on the largest connected component of the network, identifying those critical nodes using new centrality measures. These metrics are based on neighbors of neighbors, the number of cycles, and the duration of active connections between nodes. In the hospital network, there are no cycles, so the TSCR metric, which is based on cycles, has lower accuracy. However, the TSLI metric, which considers not only cycles but also node neighbors and the active time of edges between them, provides higher accuracy. In denser networks, the TSCR accuracy is higher because it considers repeated cycles and does not treat them as important nodes. The other two metrics do not have this feature. On the other hand, between two dense networks, such as the high school and conference networks, if the ratio of active times to total time is high, TSLI is more useful than TSLC, as it treats active edge times as weights, unlike TSLC, which only considers neighbors.
In temporal networks, centrality measures are influenced not only by network properties such as degree and density but also by the active time of links, which is just as important as other parameters. When a network has a high temporal degree, TSCR achieves the highest accuracy. However, if the overall degree is high but the temporal degree is not, TSLI, which weights links based on active time, becomes more relevant. In sparse networks where link active times are similar, TSLC is more applicable: because of the low degree, the cycle ratio is negligible, and since active times are close, TSLI is not suitable.
Selecting the most influential nodes as the initial spreaders is crucial for epidemic spread, as they can accelerate the spread and affect more individuals. In our experiments, the proposed measures exhibited the highest peak values of Ω and reached the state in which the entire population is infected sooner than the other measures across all four networks. The robustness of networks is also significantly affected by the removal of critical nodes. The more critical the nodes, the more their removal leads to network disconnection, resulting in a smaller size of the largest connected component. The proposed measures, particularly TSCR and TSLC, generally performed better than known measures for different networks. TSLI performs similarly to betweenness but is less effective.
Different measures can be helpful depending on network features, such as density and degree deviation, as each focuses on specific features. Therefore, in situations where accuracy is crucial, it is advisable to use multiple metrics to ensure the best selection. Measures like betweenness, degree deviation, and closeness consider nodes from only one aspect, focusing on global or local features. These measures do not account for the semi-local features of nodes, such as groups of friends or coworkers. Therefore, semi-local measures show better performance in networks where people usually belong to communities.
TSLC and TC were the most effective measures for detecting critical nodes, especially in the high school and conference networks. TSLI and TDD are slightly less effective. In sparse networks like the hospital and workplace datasets, all centrality measures were effective at detecting critical nodes, with a small advantage for TB. TSCR tends to have the lowest SE values and does not significantly impact network stability, suggesting it is less effective in selecting critical nodes.
Studying the features of temporal networks aids policymakers in controlling epidemics, hindering or accelerating the spread of information, such as news or ads. The robustness of networks is necessary because removing critical nodes leads to network fragmentation, creating smaller connected components. This disconnection hinders information propagation, demonstrating the importance of maintaining network integrity. In future work, we can study networks from the perspective of network communities and compare the results of community detection algorithms with the proposed measures. We can analyse the relationship between populations of different community sizes and the proposed measures.
Code availability statement
The code supporting the findings of this study is openly available at https://github.com/zhrfarahi/Temporal_Influential_Nodes.git. The repository contains the full Python code used in this project.
References
- 1. Burke DF, Bryant P, Barrio-Hernandez I, Memon D, Pozzati G, Shenoy A, et al. Towards a structurally resolved human protein interaction network. Nat Struct Mol Biol. 2023;30(2):216–25. pmid:36690744
- 2. Göös H, Kinnunen M, Salokas K, Tan Z, Liu X, Yadav L, et al. Human transcription factor protein interaction networks. Nat Commun. 2022;13(1):766. pmid:35140242
- 3. Luo C, Li F, Li P, Yi C, Li C, Tao Q, et al. A survey of brain network analysis by electroencephalographic signals. Cognit Neurodyn. 2022:1–25.
- 4. Váša F, Mišić B. Null models in network neuroscience. Nat Rev Neurosci. 2022;23(8):493–504. pmid:35641793
- 5. Wang Y, Qing F, Wang L. Rumor dynamic model considering intentional spreaders in social network. Discrete Dyn Nat Soc. 2022;2022:1–10.
- 6. Zarei F, Gandica Y, Rocha LEC. Bursts of communication increase opinion diversity in the temporal Deffuant model. Sci Rep. 2024;14(1):2222. pmid:38278824
- 7. Cattuto C, Van den Broeck W, Barrat A, Colizza V, Pinton J-F, Vespignani A. Dynamics of person-to-person interactions from distributed RFID sensor networks. PLoS One. 2010;5(7):e11596. pmid:20657651
- 8. Mastrandrea R, Fournet J, Barrat A. Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS One. 2015;10(9):e0136497. pmid:26325289
- 9. Ma J, Wei J, Ma J, Lu Z. An improved local efficient routing strategy on scale-free networks. Int. J. Mod. Phys. C. 2023:2350123.
- 10. Zhang H, Chen X, Peng Y, Kou G, Wang R. The interaction of multiple information on multiplex social networks. Inf Sci. 2022;605:366–80.
- 11. Demongeot J, Griette Q, Magal P. SI epidemic model applied to COVID-19 data in mainland China. R Soc Open Sci. 2020;7(12):201878. pmid:33489297
- 12. Han D, Wei J, Xu H, Li D. Dynamical analysis of the SIS epidemic model in cluster events. Appl Math Model. 2021;99:147–54.
- 13. Rocha LEC, Singh V, Esch M, Lenaerts T, Liljeros F, Thorson A. Dynamic contact networks of patients and MRSA spread in hospitals. Sci Rep. 2020;10(1):9336. pmid:32518310
- 14. Vasan K, Janosov M, Barabási A-L. Quantifying NFT-driven networks in crypto art. Sci Rep. 2022;12(1):2769. pmid:35177628
- 15. Geeraert J, Rocha LEC, Vandeviver C. The impact of violent behavior on co-offender selection: evidence of behavioral homophily. J Crim Justice. 2024;94:102259.
- 16. De Clerck B, Rocha LEC, Van Utterbeeck F. Maximum entropy networks for large scale social network node analysis. Appl Netw Sci. 2022;7(1):68. pmid:36193095
- 17. Cherkassky BV, Goldberg AV, Radzik T. Shortest paths algorithms: theory and experimental evaluation. Math Program. 1996;73(2):129–74.
- 18. Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977:35–41.
- 19. Song JH. Important edge identification in complex networks based on local and global features. Chin. Phys. B. 2022.
- 20. Ullah A, Shao J, Yang Q, Khan N, Bernard CM, Kumar R. LSS: a locality-based structure system to evaluate the spreader’s importance in social complex networks. Exp. Syst. Appl. 2023;228:120326.
- 21. Zareie A, Sheikhahmadi A, Jalili M, Fasaei MSK. Finding influential nodes in social networks based on neighborhood correlation coefficient. Knowl-Based Syst. 2020;194:105580.
- 22. Zheng J, Liu J. A new scheme for identifying important nodes in complex networks based on generalized degree. J Comput Sci. 2023;67:101964.
- 23. Ou Y, Guo Q, Liu J. Identifying spreading influence nodes for social networks. Front Eng Manag. 2022;9(4):520–49.
- 24. Ahmed B, Li W, Mustafa G, Afzal MT, Alharthi SZ, Akhunzada A. Evaluating the effectiveness of author-count based metrics in measuring scientific contributions. IEEE Access. 2023.
- 25. Yin R, Li L, Wang Y, Lang C, Hao Z, Zhang L. Identifying critical nodes in complex networks based on distance Laplacian energy. Chaos Solitons Fractals. 2024;180:114487.
- 26. Rocha LEC, Masuda N. Random walk centrality for temporal networks. New J Phys. 2014;16(6):063023.
- 27. Nosirov K, Norov E, Tashmetov S. A review of shortest path problem in graph theory. Eur J Eng Technol. 2022;13:1–11.
- 28. Abboud R, Dimitrov R, Ceylan II. Shortest path networks for graph property prediction. In: Learning on Graphs Conference. PMLR; 2022. p. 5.
- 29. Berahmand K, Bouyer A, Samadi N. A new local and multidimensional ranking measure to detect spreaders in social networks. Computing. 2018;101(11):1711–33.
- 30. Dai B, Qin S, Tan S, Liu C, Mou J, Deng H. Identifying influential nodes by leveraging redundant ties. J Comput Sci. 2023;69:102030.
- 31. Oehlers M, Fabian B. Graph metrics for network robustness—a survey. Mathematics. 2021;9(8):895.
- 32. Ficara A, Fiumara G, De Meo P, Liotta A. Correlations among Game of Thieves and other centrality measures in complex networks. Data science and Internet of Things: research and applications at the intersection of DS and IoT. 2021. p. 43–62.
- 33. Liu P, Li L, Fang S, Yao Y. Identifying influential nodes in social networks: a voting approach. Chaos Solitons Fract. 2021;152:111309.
- 34. Zhang J-X, Chen D-B, Dong Q, Zhao Z-D. Identifying a set of influential spreaders in complex networks. Sci Rep. 2016;6:27823. pmid:27296252
- 35. Kumar S, Panda B. Identifying influential nodes in social networks: neighborhood coreness based voting approach. Phys A: Statist Mech Appl. 2020;553:124215.
- 36. Lee S, Rocha LEC, Liljeros F, Holme P. Exploiting temporal network structures of human interaction to effectively immunize populations. PLoS One. 2012;7(5):e36439. pmid:22586472
- 37. Kumar S, Singhla L, Jindal K, Grover K, Panda B. IM-ELPR: Influence maximization in social networks using label propagation based community structure. Appl Intell. 2021:1–19.
- 38. Ding K, Dragomir A, Bose R, Osborn LE, Seet MS, Bezerianos A, et al. Towards machine to brain interfaces: sensory stimulation enhances sensorimotor dynamic functional connectivity in upper limb amputees. J Neural Eng. 2020;17(3):035002. pmid:32272463
- 39. Hajdu L, Krész M. Temporal network analytics for fraud detection in the banking sector. In: ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium: International Workshops: DOING, MADEISD, SKG, BBIGAP, SIMPDA, AIMinScience 2020 and Doctoral Consortium, Lyon, France, August 25–27, 2020, Proceedings. 2020. p. 145–57.
- 40. Yang Z, Zhang J, Gao S, Wang H. Complex contact network of patients at the beginning of an epidemic outbreak: an analysis based on 1218 COVID-19 cases in China. Int J Environ Res Public Health. 2022;19(2):689. pmid:35055511
- 41. Myall A, Price JR, Peach RL, Abbas M, Mookerjee S, Zhu N, et al. Prediction of hospital-onset COVID-19 infections using dynamic networks of patient contact: an international retrospective cohort study. Lancet Digit Health. 2022;4(8):e573–83. pmid:35868812
- 42. Frieswijk K, Zino L, Cao M. A time-varying network model for sexually transmitted infections accounting for behavior and control actions. Int J Robust Nonl Control. 2023;33(9):4784–807.
- 43. Ventura PC, Aleta A, Rodrigues FA, Moreno Y. Epidemic spreading in populations of mobile agents with adaptive behavioral response. Chaos Solitons Fract. 2022;156:111849.
- 44. Tang J, Musolesi M, Mascolo C, Latora V, Nicosia V. Analysing information flows and key mediators through temporal centrality metrics. In: Proceedings of the 3rd Workshop on Social Network Systems. 2010. p. 1–6. https://doi.org/10.1145/1852658.1852661
- 45. Salama M, Ezzeldin M, El-Dakhakhni W, Tait M. Temporal networks: a review and opportunities for infrastructure simulation. Sustain. Resilient Infrastruct. 2022;7(1):40–55.
- 46. Oettershagen L, Mutzel P, Kriege NM. Temporal walk centrality: ranking nodes in evolving networks. In: Proceedings of the ACM Web Conference 2022. 2022. p. 1640–50.
- 47. Cruciani A. On approximating the temporal betweenness centrality through sampling. arXiv preprint 2023. https://arxiv.org/abs/2304.08356
- 48. Casteigts A, Himmel AS, Molter H, Zschoche P. Finding temporal paths under waiting time constraints. Algorithmica. 2021;83(9):2754–802.
- 49. Kirigin T, Bujačić Babić B, Perak B. Semi-local integration measure of node importance. Mathematics. 2022;10(3):405.
- 50. Luo G, Zhang H, Yuan Q, Li J, Wang FY. ESTNet: embedded spatial-temporal network for modeling traffic flow dynamics. IEEE Trans Intell Transp Syst. 2022;23(10):19201–12.
- 51. Zhan X-X, Li Z, Masuda N, Holme P, Wang H. Susceptible-infected-spreading-based network embedding in static and temporal networks. EPJ Data Sci. 2020;9(1):30.
- 52. Yang J, Zhong M, Zhu Y, Qian T, Liu M, Yu JX. Scalable time-range k-core query on temporal graphs. arXiv preprint 2023. https://arxiv.org/abs/2301.03770
- 53. Zhang Y, Lin L, Yuan P, Jin H. Significant engagement community search on temporal networks: concepts and algorithms. arXiv preprint 2022. https://arxiv.org/abs/2206.06350
- 54. Pan RK, Saramäki J. Path lengths, correlations, and centrality in temporal networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2011;84(1 Pt 2):016105. pmid:21867255
- 55. Holme P, Saramäki J. Temporal networks. Phys Rep. 2012;519(3):97–125.
- 56. Ye Q, Yan G, Chang W, Luo H. Vital node identification based on cycle structure in a multiplex network. Eur Phys J B. 2023;96(2):15. pmid:36776156
- 57. Yu E-Y, Fu Y, Chen X, Xie M, Chen D-B. Identifying critical nodes in temporal networks by network embedding. Sci Rep. 2020;10(1):12494. pmid:32719327
- 58. Buß S, Molter H, Niedermeier R, Rymar M. Algorithmic aspects of temporal betweenness. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2020. p. 2084–92.
- 59. Simard F, Magnien C, Latapy M. Computing betweenness centrality in link streams. arXiv preprint 2021. https://arxiv.org/abs/2102.06543
- 60. Zhang J, Luo Y. Degree centrality, betweenness centrality, and closeness centrality in social network. In: 2017 2nd international conference on modelling, simulation and applied mathematics (MSAM2017). 2017. p. 300–3.
- 61. Eballe R, Cabahug I. Closeness centrality of some graph families. Int J Contemp Math Sci. 2021;16(4):127–34.
- 62. Oettershagen L, Mutzel P. Computing top-k temporal closeness in temporal networks. Knowl Inf Syst. 2022:1–29.
- 63. Crescenzi P, Magnien C, Marino A. Finding top-k nodes for temporal closeness in large temporal graphs. Algorithms. 2020;13(9):211.
- 64. Génois M, Barrat A. Can co-location be used as a proxy for face-to-face contacts?. EPJ Data Sci. 2018;7(1):1–18.
- 65. Vanhems P, Barrat A, Cattuto C, Pinton J-F, Khanafer N, Régis C, et al. Estimating potential infection transmission routes in hospital wards using wearable proximity sensors. PLoS One. 2013;8(9):e73970. pmid:24040129
- 66. Garibaldi P, Moen ER, Pissarides CA. Modelling contacts and transitions in the SIR epidemic model. Covid Econ. 2020;5:1–21.
- 67. Li M, Liu R-R, Lü L, Hu M-B, Xu S, Zhang Y-C. Percolation on complex networks: theory and application. Phys Rep. 2021;907:1–68.
- 68. Badie-Modiri A, Rizi AK, Karsai M, Kivelä M. Directed percolation in temporal networks. Phys Rev Research. 2022;4(2).