Dynamics of temporal influence in polarised networks

  • Caroline B. Pena ,

    Roles Conceptualization, Data curation, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing

    caroline.pena@ul.ie

    Affiliation Mathematics Applications Consortium for Science and Industry (MACSI), Department of Mathematics and Statistics, University of Limerick, Limerick, Ireland

  • David J. P. O’Sullivan,

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Mathematics Applications Consortium for Science and Industry (MACSI), Department of Mathematics and Statistics, University of Limerick, Limerick, Ireland

  • Pádraig MacCarron,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation Mathematics Applications Consortium for Science and Industry (MACSI), Department of Mathematics and Statistics, University of Limerick, Limerick, Ireland

  • Akrati Saxena

    Roles Conceptualization, Formal analysis, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, The Netherlands

Abstract

In social networks, it is often of interest to identify the most influential users who can successfully spread information to others. This is particularly important for marketing (e.g., targeting influencers for a marketing campaign) and to understand the dynamics of information diffusion (e.g., who is the most central user in the spreading of a certain type of information). However, different opinions often split the audience and make the network polarised, with fragmented structure. In polarised networks, information becomes siloed within communities in the network, and the most influential user within a network might not be the most influential across all communities. Additionally, influential users and their influence may change over time as users may change their opinion or choose to decrease or halt their engagement on the subject. In this work, we aim to study the temporal dynamics of users’ influence in fragmented social networks. We compare the stability of influence ranking using temporal centrality measures, while extending them to account for community structure across a number of network evolution behaviours. We show that we can successfully aggregate nodes into influence bands, and how to aggregate centrality scores to analyse the influence of communities over time. A modified version of the temporal independent cascade model and the temporal degree centrality perform the best in this setting, as they are able to reliably isolate nodes into their bands.

Introduction

Information spread plays an important role in shaping people’s opinions and behaviour in social networks [1,2]. Information now spreads faster and more easily than in the past, since on online social media sharing information with connections is just one click away. Online social platforms, such as Facebook, Instagram, TikTok, and Twitter (currently known as X), serve as a venue for information spread among their users, where users both create and share content with each other [3,4]. Understanding how information spreads on social networks is of paramount importance for society [3,5], with applications in public health [6–8], politics [3,9], and business [10]. Information spread in social networks is commonly influenced by homophily, i.e., people’s tendency to associate preferentially with other people who are similar to themselves in some way [11,12]. Users of social media platforms tend to group with others who share similar opinions and interests, and tend to share information from those who are similar [13–16]. The division of society into groups that believe in different, often opposing, ideas is commonly referred to as polarisation [17], which is also observed in social media discussions, especially where the topic is controversial.

Online social platforms, such as Twitter, Facebook, and others, further amplify polarisation among users by using a self-reinforcing system where users are more likely to see posts from others with whom they share opinions [16–19]. O’Sullivan et al. [14] and Pena et al. [13] explored the polarisation structure of conversation networks on Twitter about two recent referendums in Ireland: (i) the same-sex marriage referendum of 2015, and (ii) the abortion referendum of 2018, and showed that users involved in the online conversation around these referendums exhibited strongly homophilic behaviour, leading to the observed polarisation. Kearney [15] studied network polarisation on Twitter during the 2016 general election in the USA, and also observed that partisan users form highly polarised networks, while moderates and less engaged users largely avoid political discussions. Researchers have also studied the evolution of polarisation and its impact on opinion formation. De Arruda et al. [20] modelled opinion dynamics in online social networks, showing that friendship rewiring and network algorithms influence polarisation and echo chamber formation, and that the temporal dynamics can lead to scenarios ranging from consensus to extreme polarisation. Soares et al. [21] analysed influencers’ roles in political conversations on Twitter during the impeachment process of the ex-president of Brazil, Dilma Rousseff. The authors observed that the network is highly modularised and contains three types of influencers shaping influence and polarisation: opinion leaders, informational influencers, and activists. Loy et al. [22] proposed a Boltzmann-type kinetic model for opinion formation in social networks, considering connectivity-based opinion influence. Many other works [23–25] have studied the evolution of echo chambers in polarised networks and observed that information tends to flow within its own group.

In such highly opinionated environments, community structure serves as a good indicator of polarised groups [13,14], with only a few inter-community links. Consequently, when analysing such complex and fragmented social networks, it is of interest to identify the most influential nodes within each community over time. These central “players” drive information spread by convincing others to share their content or news within their connections. We can measure the influential power of a user using different centrality measures [26]. In the literature, several centrality measures have been defined, which are used extensively to identify influential nodes that maximise the influence spread, i.e., if they start sharing the content on the network they would be expected to have a larger outreach than other nodes [21,22,27]. Centrality measures have several other applications, including finding the source of rumours [28–30], identifying weak points in the network (nodes whose removal would deteriorate the structural properties of the network) or nodes whose addition would improve infrastructure [31–35], and organizational design [36,37]. In this paper, we use centrality measures to identify the most influential users in fragmented temporal social networks [27,38–41]. Some of the well-known centrality measures in social network analysis are [41]: degree centrality, closeness centrality [42,43], betweenness centrality [44], eigenvector centrality [45,46], Katz centrality [47], and PageRank centrality [48]. Methods have been proposed to adapt these centrality measures to networks with communities [49,50], as well as to extend them to temporal networks [51–56]. However, to the best of the authors’ knowledge, the literature is scarce when it comes to the study of centrality on temporal networks with fragmented community structure.

Ghalmane et al. [49] and Rajeh et al. [50] conducted extensive analyses into how centrality can be calculated on modular networks. However, they focused on static networks with no temporal component. In the real world, diffusion mechanisms commonly unfold in a given time frame, where information takes time to spread in the network. For example, Holme [57] investigated disease spreading over time on empirical datasets of human contacts; Goel et al. [58] analysed the virality of information in social media through a mechanistic model that infers the paths of diffusion by bringing time information into play; and Kim and Anderson [51] analysed the temporal dynamics of contact traces of mobile devices owned by students and staff in two universities. Therefore, it is essential to investigate how information spreads over time and identify the most influential nodes at each defined time slice (e.g., hours, days, or intervals between significant events). Additionally, understanding the impact of community structures on information diffusion within temporal networks remains a key area of interest.

In addition to the natural temporal aspect of social networks, Soares et al. [21] showed that users tend to cluster based on their level of influence within the network. Similarly, O’Brien et al. [59] ranked players of the online fantasy football game Fantasy Premier League by their fantasy team performance, and analysed the evolution of their rank and team selections across multiple time points throughout a season. Following this approach, we categorize nodes in our networks according to their level of influence, which we refer to as influence bands, and analyse the flow of users between these bands over time to gain insights into the temporal evolution of influence. This method is particularly useful because (1) a user’s influence naturally fluctuates over time, and while minor position shifts may be unimportant, substantial changes — such as moving between influence bands — can be more meaningful, and (2) simultaneously analysing both temporal and community-based influence can be complex, whereas grouping users into influence bands provides a more structured and interpretable framework for analysis.

In this paper we aim to investigate the temporal dynamics of nodes’ influence in fragmented social networks by addressing the following key questions:

  1. Can influential nodes be effectively grouped into influence bands?
  2. Does the overall influence of a specific community within a fragmented network change over time?
  3. Can we determine which fragmented community the most influential nodes belong to, and how do influential nodes differ across communities?

In the following section, we explain the methods used to compute temporal centrality measures, as well as the generative models to build synthetic networks for our analysis.

Methodology

In this section, we explain three different methodologies to compute temporal centralities, and our method to generate synthetic temporal networks with bands and communities. Building synthetic networks is crucial to understanding how centrality methods perform in simple and controlled temporal fragmented networks. In this section we also summarise the networks studied and explain our method to aggregate nodes into influence bands.

Temporal centrality methods

In order to calculate temporal centrality scores, we use different techniques to represent temporal networks, which makes it more convenient to compute different centralities. Please note that these techniques are applied to the same set of networks, but are stored in different forms for faster computation of centralities during the analysis.

First, we use a method proposed by Kim and Anderson [51], which allows the calculation of temporal degree, closeness and betweenness centrality scores. This method involves creating a layer for each time slice, starting at t = 0, and every link is drawn between time slices; refer to Fig 1. For eigenvector-based centralities, such as eigenvector centrality and PageRank, we use a second technique proposed by Taylor et al. [52], which creates a multilayer network where each layer contains a time slice of the temporal network, and each node is connected to itself in the subsequent and the preceding time slices (Fig 2(a)). We build what the authors refer to as a supra-centrality matrix, which contains the centrality values for each time slice in block-matrix form (Eq 2). Finally, for the temporal Katz centrality, we use the method proposed by Grindrod and Parsons [53].

Fig 1. Schematic of the network built for temporal degree and closeness centralities.

Example of a simple network analysed over three time slices. On the left, a representation of the time slices, and on the right, the representation of the temporal network according to the method described in Ref [51].

https://doi.org/10.1371/journal.pone.0337753.g001

Fig 2. Schematic of the multilayer network built for temporal eigenvector-based centralities for networks with communities.

Example of a network with two communities analysed over three time slices. Each node is linked to itself on preceding and subsequent time slices, where the weight of each inter-layer link is .

https://doi.org/10.1371/journal.pone.0337753.g002

Temporal degree and closeness centrality.

Kim and Anderson [51] developed a method to calculate temporal degree, temporal betweenness and temporal closeness centrality scores by using a common temporal network representation (Fig 1). The method involves creating a layer for each time slice that contains a set of dummy nodes, starting at t = 0. Each dummy node is then connected to itself in the subsequent time slice (i.e., the dummy node a0 is connected to the dummy node a1, and so on), as well as to the dummy nodes of the nodes it has an original connection with (e.g., if a link from node a to node b exists in time slice 1, the dummy connection runs from a0 to b1). The temporal centrality matrix has the following form:

(1)

where $\mathbf{A}_t$ is the adjacency matrix for time slice t and I is the identity matrix, which adds self-links between time slices. The matrix is of dimension $Nt \times Nt$, where N is the number of nodes in the network and t is the number of time slices.

For this method, a node v’s temporal degree is the normalised total number of inbound edges to and outbound edges from v on the time interval [i,j], disregarding the self-edges that connect each dummy node of v to its copy in the next time slice.
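
The layered representation and the temporal degree can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of Ref [51]: the function names are hypothetical, edges are given as directed pairs, and the normalising constant 2(N−1)m (with m the number of time slices) is an assumption made for the example.

```python
from collections import defaultdict

def time_ordered_edges(slices):
    """Expand per-slice edge lists into edges between consecutive layers.

    slices[t - 1] holds the directed edges (u, v) observed in time slice t.
    An edge (u, v) in slice t becomes ((u, t - 1), (v, t)), and every node
    gets a self-link ((u, t - 1), (u, t)) between consecutive layers.
    """
    nodes = sorted({n for edges in slices for e in edges for n in e})
    expanded = []
    for t, edges in enumerate(slices, start=1):
        expanded += [((u, t - 1), (u, t)) for u in nodes]     # self-links
        expanded += [((u, t - 1), (v, t)) for u, v in edges]  # original links
    return nodes, expanded

def temporal_degree(slices):
    """In- plus out-edges per node over all slices, ignoring self-links."""
    nodes, expanded = time_ordered_edges(slices)
    counts = defaultdict(int)
    for (u, _), (v, _) in expanded:
        if u != v:            # disregard the self-edges between layers
            counts[u] += 1    # outbound edge of u
            counts[v] += 1    # inbound edge of v
    norm = 2 * (len(nodes) - 1) * len(slices)  # assumed normalisation
    return {n: counts[n] / norm for n in nodes}

# Node a links to b in slice 1 and to c in slice 2.
slices = [[("a", "b")], [("a", "c")]]
degree = temporal_degree(slices)
```

In this toy example node a takes part in two links and nodes b and c in one each, so a receives twice the score of the other two nodes.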

Temporal closeness requires a more complex setup. The authors define temporal closeness by considering m time intervals, where m = j − i, obtained by varying the starting time t of each time interval from i to j−1. The temporal shortest paths from node u to node v are then calculated. These are the paths from the dummy node ui to the first dummy node of v encountered along a path from ui to any dummy node of v in a later time slice. However, the temporal shortest paths from u to v will change as time increases. Therefore, in addition to the case with the starting time i, we also need to consider the temporal shortest paths from node u to node v on the m−1 time intervals obtained by varying t from i + 1 to j−1. A node v in the time interval [i,j] has temporal closeness centrality calculated by

$$C_{i,j}(v) = \sum_{i \le t < j} \; \sum_{u \in V \setminus \{v\}} \frac{1}{\Delta_{t,j}(v,u)},$$

where V is the set of nodes and $\Delta_{t,j}(v,u)$ is the temporal shortest path distance from v to u on a time interval [t,j].

This way we are able to calculate temporal degree and temporal closeness for each dummy node in each time slice. Unlike temporal degree and closeness, temporal betweenness as defined by the authors [51] will not be considered here as it does not allow the computation of a score for each temporal dummy node due to its calculation process. Next we explore a temporal method developed to compute eigenvector-based centrality scores.

Temporal eigenvector-based centrality.

Taylor et al. [52] proposed the eigenvector-based centrality for temporal networks, which extends the static eigenvector centrality by creating a multilayer network where each layer contains a time slice of the temporal network, and each node is connected to itself in the subsequent and the preceding time slices. Fig 2(a) illustrates the multilayer network generated. The eigenvectors are calculated from the supra-centrality matrix, which is defined as:

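$$\mathbb{C}(\varepsilon) = \begin{bmatrix} \varepsilon\mathbf{C}^{(1)} & \mathbf{I} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{I} & \varepsilon\mathbf{C}^{(2)} & \mathbf{I} & \ddots & \vdots \\ \mathbf{0} & \mathbf{I} & \varepsilon\mathbf{C}^{(3)} & \ddots & \mathbf{0} \\ \vdots & \ddots & \ddots & \ddots & \mathbf{I} \\ \mathbf{0} & \cdots & \mathbf{0} & \mathbf{I} & \varepsilon\mathbf{C}^{(T)} \end{bmatrix} \tag{2}$$

where $\mathbf{C}^{(t)}$ is the centrality matrix for each time slice t (for the temporal eigenvector centrality, $\mathbf{C}^{(t)}$ is the adjacency matrix for time slice t including all N nodes in the static network), T is the number of time slices, I is the identity matrix of dimension $N \times N$, and $\varepsilon$ dictates how each time slice is connected to its subsequent one according to the correlation between time slices. The limit $\varepsilon \to 0^{+}$ leads to strongly interconnected time slices, while $\varepsilon \to \infty$ leads to independent time slices.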

Since choosing the best value of ε is non-trivial and case-dependent, in our work we set ε = 1, which essentially sets the weight of self-links between time slices to 1. Note that a node in time slice t can either propagate the information forward to t + 1 (I in the superdiagonal of the supra-centrality matrix) or borrow information from itself in the previous time slice t–1 (I in the subdiagonal of the supra-centrality matrix). This is important for the correct functioning of eigenvector-based centrality algorithms, as causal coupling (allowing nodes to only connect with themselves either forward or backward in time) can yield reducible supra-centrality matrices, which are problematic for the calculation of eigenvectors [52].

The leading eigenvector of the supra-centrality matrix gives the joint centrality score of each node in each time slice. This allows us to compute from this matrix, two different types of centrality measures: (1) the marginal node centrality (MNC) — the summary of nodes’ scores across all time slices; and (2) the marginal layer centrality (MLC) — the summary of centrality scores for a time slice across all nodes.
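
As a rough illustration of this construction, the sketch below assembles a block-tridiagonal supra-centrality matrix for undirected time slices, finds its leading eigenvector by power iteration, and reads off the MNC and MLC. It is a minimal sketch under stated assumptions, not the implementation of Ref [52]: the function names are hypothetical, the marginals are taken to be plain sums of the joint scores, and the iteration is run on the matrix shifted by the identity (which leaves the eigenvectors unchanged) so that it converges for periodic structures.

```python
def supra_centrality_matrix(adjacency, eps=1.0):
    """eps * A_t on the block diagonal, identity blocks next to the diagonal."""
    T, N = len(adjacency), len(adjacency[0])
    M = [[0.0] * (N * T) for _ in range(N * T)]
    for t, A in enumerate(adjacency):
        for i in range(N):
            for j in range(N):
                M[t * N + i][t * N + j] = eps * A[i][j]
            if t + 1 < T:  # couple node i to itself in the adjacent slice
                M[t * N + i][(t + 1) * N + i] = 1.0
                M[(t + 1) * N + i][t * N + i] = 1.0
    return M

def joint_centrality(adjacency, eps=1.0, iters=500):
    """Return W with W[i][t] = joint centrality of node i in time slice t."""
    T, N = len(adjacency), len(adjacency[0])
    M = supra_centrality_matrix(adjacency, eps)
    x = [1.0] * (N * T)
    for _ in range(iters):
        # power iteration on M + I (same eigenvectors, avoids oscillation)
        y = [xi + sum(m * xj for m, xj in zip(row, x)) for xi, row in zip(x, M)]
        total = sum(y)
        x = [v / total for v in y]
    return [[x[t * N + i] for t in range(T)] for i in range(N)]

# Three nodes over three slices: node 0 is the hub twice, node 1 once.
star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
W = joint_centrality([star, star, path])
mnc = [sum(row) for row in W]                                          # per node
mlc = [sum(W[i][t] for i in range(len(W))) for t in range(len(W[0]))]  # per slice
```

Because the joint scores are normalised to sum to one, the MNC and the MLC each sum to one as well, which makes them convenient for comparing nodes or time slices on a common scale.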

As we mentioned earlier, this supra-centrality matrix is applicable to any eigenvector-based centrality; therefore, we also calculate the temporal PageRank by using the same method. A temporal adaptation of PageRank is computed by the same eigenvector-based centrality method by setting

where is the out-degree of node v, the quantity is the damping coefficient, 1 is a vector of ones, and v is the personalized PageRank vector (which is set to be ). The parameter p is set to 0.85 as in the original PageRank paper [48]. Nodes with out-degree 0 are handled by adding a single self-link for each of these nodes. Another well-known influence measure is the Katz centrality [47], which we will discuss next.

Temporal Katz centrality.

The original paper on the Katz centrality [47] calculates people’s influence by taking into account not only the number of direct links to each individual but also the influence of each individual’s neighbours. The method consists of considering all paths of two steps, three steps, and so on, and weighing them to allow for the lower effectiveness of longer chains. Therefore, the impact of a k-step chain is computed by weighing it with $\alpha^k$. In this sense, a k-step chain has probability $\alpha^k$ of being effective, where $\alpha = 0$ corresponds to complete attenuation while $\alpha = 1$ corresponds to absence of any attenuation. The influence of nodes in a k-chain network is therefore given by

$$\sum_{j=0}^{k} \alpha^{j} \mathbf{A}^{j} = \mathbf{I} + \alpha\mathbf{A} + \alpha^{2}\mathbf{A}^{2} + \cdots + \alpha^{k}\mathbf{A}^{k},$$

which, in the limit $k \to \infty$, converges to the resolvent matrix $(\mathbf{I} - \alpha\mathbf{A})^{-1}$ when $\alpha < 1/\lambda_{\max}$ [53]. Here $\lambda_{\max}$ denotes the largest eigenvalue in modulus of the adjacency matrix A, and $1/\lambda_{\max}$ represents the limiting α value for which Katz centrality is reduced to the eigenvector centrality [60].

Therefore, for simplicity, when calculating Katz centrality we set

Grindrod and Parsons [53] extend the Katz centrality to temporal networks with t time slices, which is defined as:

$$\mathbf{Q} = (\mathbf{I} - \alpha\mathbf{A}_1)^{-1}(\mathbf{I} - \alpha\mathbf{A}_2)^{-1}\cdots(\mathbf{I} - \alpha\mathbf{A}_t)^{-1} \tag{3}$$

where $(\mathbf{I} - \alpha\mathbf{A}_s)^{-1}$ is the inverse of the matrix $\mathbf{I} - \alpha\mathbf{A}_s$.

This method deals with large, sparse networks, and allows a message to “wait” at a node until a suitable connection appears at a later time [53].

The centrality measure that quantifies how effectively a temporal node n can spread information is given by the row sums of the matrix $\mathbf{Q}$. Thus, the temporal Katz centrality of the temporal node n is:

$$C_n = \sum_{k} Q_{nk} \tag{4}$$
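
A small sketch can make the role of time ordering concrete. It assumes that the temporal Katz matrix is the time-ordered product of the per-slice resolvents, with the earliest slice first, and that the score of a node is the corresponding row sum. Rather than inverting matrices, it applies each resolvent to a vector through its power series, which converges when α is below the reciprocal of the largest eigenvalue of that slice. The function names are hypothetical.

```python
def resolvent_apply(A, x, alpha, iters=200):
    """Approximate (I - alpha*A)^(-1) x by summing the series alpha^k A^k x."""
    y = list(x)
    for _ in range(iters):
        # fixed-point update y <- x + alpha * A y
        y = [xi + alpha * sum(a * yj for a, yj in zip(row, y))
             for xi, row in zip(x, A)]
    return y

def temporal_katz(adjacency, alpha):
    """Row sums of the ordered product of per-slice resolvents."""
    x = [1.0] * len(adjacency[0])
    for A in reversed(adjacency):  # the rightmost (latest) factor acts first
        x = resolvent_apply(A, x, alpha)
    return x

A1 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]  # edge 0 -> 1 in the first slice
A2 = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]  # edge 1 -> 2 in the second slice
forward = temporal_katz([A1, A2], alpha=0.5)
backward = temporal_katz([A2, A1], alpha=0.5)
```

With the slices in the order A1, A2, node 0 can reach node 2 through the time-respecting path 0 → 1 → 2 and is credited for it. With the order reversed, that path no longer respects time and node 0 only gets credit for reaching node 1.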

Marginal community centrality.

The centrality measures discussed above deal with temporal data. To compare influence in fragmented communities, we therefore extend the idea of marginal node centrality (MNC) to a community-level centrality, the marginal community centrality (MCC). The marginal centrality for community C1 is computed by aggregating the MNC of the nodes in that community, as follows:

$$\mathrm{MCC}(C_1) = \sum_{v \in C_1} \mathrm{MNC}(v) \tag{5}$$

Fig 2(b) shows the structure of the table that contains the joint community-time centrality, MLC and MCC measures. As a benchmark to the centrality methods studied, we use the temporal independent cascade model, as described next.
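
The joint community-time table and its marginals can be sketched as follows. The example assumes that "aggregating" means summing the joint node-time scores over the nodes of each community; the scores, node names and community labels are hypothetical values chosen only for illustration.

```python
def community_time_table(joint, community):
    """Sum joint node-time scores over the nodes of each community."""
    table = {}
    for node, scores in joint.items():
        row = table.setdefault(community[node], [0.0] * len(scores))
        for t, score in enumerate(scores):
            row[t] += score
    return table

# Hypothetical joint centralities for three nodes over two time slices.
joint = {"a": [0.25, 0.125], "b": [0.125, 0.125], "c": [0.125, 0.25]}
community = {"a": "C1", "b": "C1", "c": "C2"}

table = community_time_table(joint, community)
mcc = {c: sum(row) for c, row in table.items()}                   # row sums
mlc = [sum(row[t] for row in table.values()) for t in range(2)]   # column sums
```

Under this assumption the MCC depends on community size as well as on the scores of individual nodes, so a per-node average may be the more suitable summary when communities differ greatly in size.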

Temporal independent cascade model

The independent cascade model (ICM) [61] is commonly used as a benchmark to assess the accuracy of centrality methods [62–64]. In a single Monte Carlo simulation of the ICM, nodes can exist in one of three states: Susceptible, Infected, or Removed. Every infected node in a discrete time-step has one chance to infect each of its susceptible network neighbours, with an independent probability ρ, before being removed to the recovered state, i.e., a node is in the infected state for only one discrete time-step. If a susceptible node has multiple network neighbours trying to activate it, these attempts occur in a random order. The process terminates once there are no more infected nodes active in a time-step to further propagate the influence. Each node is initially in the susceptible/inactive state. To initialise the process, the seed-node state is changed to infected/active. Each node is selected as the seed for a large number of Monte Carlo simulations, and the average cascade size is calculated for that seed node. The average cascade size calculated across all nodes is used as a benchmark for the centrality scores, where nodes producing larger cascades, on average, are assumed to be more influential, and as such should have higher centrality scores [65–67]. As a benchmark for temporal centrality measures, we use the Temporal Independent Cascade Model (T-ICM) developed by Haldar et al. [68]. This is particularly useful as a benchmark for our empirical network, as it allows us to easily calculate a benchmark for centralities when the network structure evolves over time.

The T-ICM introduced by Haldar et al. is a straightforward temporal extension of the classic ICM. To model the temporal dynamics, the authors run the ICM on each temporal network, $\mathbf{A}_t$, for one discrete time slice. Any newly infected nodes become the seed infections on the next temporal network, $\mathbf{A}_{t+1}$, and the process continues until there are no more infected nodes, or the maximum number of time slices has been reached (i.e., one time slice for each temporal adjacency matrix). This is analogous to the matrix defined in Eq 1. Specifically, the T-ICM can be constructed by creating a weighted matrix in which each edge represents the probability of a currently infected node infecting its neighbour. The matrix is a multilayer network, where we have created a separate layer for each time slice. In this construction, each node at time t is linked to its corresponding node at time t + 1 (e.g., node v in layer t connects to node v in layer t + 1 via an inter-layer link). Additionally, if a node v has an edge to node u in the original graph at time t, this is translated as an edge from node v in time slice t to node u in time slice t + 1, resulting in an off-diagonal block structure in the temporal adjacency matrix. Thus, the temporal matrix of infection probabilities effectively encodes the ICM dynamics across time slices as:

(6)

where $\mathbf{A}_t$ is the adjacency matrix of each time slice t, I is the identity matrix of dimension $N \times N$, and ρ is the probability of an infected node passing the information forward to a neighbour.

Applying the ICM on this multilayer network with weighted edges (weights representing infection probabilities) is equivalent to running the ICM separately on each time slice for one iteration, and using the new infected/activated nodes as the seed nodes in the next time slice. It should be noted that it is possible to use different values of ρ for different time slices, as well as different values of ρ for each node in the network by modifying the constant ρ to vectors of ρ values. In this paper, for simplicity, we will set ρ to be constant for all nodes across all time slices.

It is important to note here that, for the purpose of analysing online social networks, we create a slightly modified version of the method developed in Ref [68]. In the original method, there are no self-edges of a node between time slices, i.e., an infected node in time slice t returns to the susceptible state in t + 1, and can be reinfected. The matrix in Eq 6 ensures that a node infected in time slice t will still be infected and will attempt to pass the information forward in the subsequent time slices. Hence, the identity matrix I is added to each weighted block of the matrix in Eq 6. This is aligned with the online information spread process, where content is still available to be seen and spread forward in the future, and cannot spread back to a node that previously shared the content. Infected nodes attempt to infect each neighbour in their own time slice with probability ρ. Note that this modification implies that the infection process does not allow recovery, and is therefore comparable to the Susceptible-Infected (SI) model. In the limit of a large number of time slices, all the nodes should be infected. However, as we are only considering a small number of time slices, infection may not reach every node in the network given that ρ is set to be small.
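
The modified process described above can be simulated directly, slice by slice, without building the multilayer matrix. The sketch below is an illustration under stated assumptions rather than the code used for the analysis: function and parameter names are hypothetical, each time slice is given as a dictionary from a node to the neighbours it can reach in that slice, and ρ is constant across nodes and slices.

```python
import random

def modified_ticm(slices, seed, rho, rng):
    """One Monte Carlo run; returns the set of nodes that hold the content."""
    infected = {seed}
    for neighbours in slices:            # one pass per time slice
        newly = set()
        for u in sorted(infected):       # infected nodes never recover
            for v in neighbours.get(u, ()):
                if v not in infected and rng.random() < rho:
                    newly.add(v)
        infected |= newly                # new nodes start spreading next slice
    return infected

def average_cascade_size(slices, seed, rho, runs=1000, rng_seed=0):
    """Mean number of infected nodes over repeated runs from one seed node."""
    rng = random.Random(rng_seed)
    sizes = (len(modified_ticm(slices, seed, rho, rng)) for _ in range(runs))
    return sum(sizes) / runs

# A chain 0 -> 1 -> 2 -> 3 whose links appear one per time slice.
chain = [{0: [1]}, {1: [2]}, {2: [3]}]
benchmark = {s: average_cascade_size(chain, s, rho=0.5) for s in range(4)}
```

Ranking the nodes by their entry in `benchmark` then gives the reference ordering against which the centrality scores are compared. Reversing the order of the slices changes the result, since node 0 can then reach only node 1.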

Simulation and empirical analysis of fragmented temporal networks

Our goal in this paper is to study the dynamics of community influence in temporal fragmented networks by applying a range of centrality measures and using the centrality scores (and the average cascade size via ICM) to aggregate users into bands of influence via clustering techniques. We start our analysis by applying our methods to synthetic networks where we know the true community and influence band structure. This way, we can assess our methods’ performance in a controlled setting before applying our techniques to analyse real-world Twitter/X networks, originally studied in Ref [13].

BandNet: Synthetic fragmented network with bands of influence.

To apply a T-ICM and temporal centrality measures to temporal fragmented networks with bands of influence, we first create a synthetic network that: (1) Has two communities, where the network contains a small number of cross-community links compared to the number of in-community links, creating a fragmented (or modular) environment, as seen in our previous work [13]; (2) has nodes that can be clearly classified into bands of influence, allowing for the comparison of results obtained using various centrality measures in a controlled and simulated setting. Note that other temporal network generation processes like the dynamic stochastic block model (DSBM) [69], could have been used, but we wish to have control over the dynamic changes in the network and require knowledge of the ground-truth labels for the nodes, which our method gives. This approach enables us to gauge the general behaviour expected when applying different centrality measures to real-world networks.

Additionally, in real-world networks, a node’s influence changes over time as edges are created, deleted, or reallocated in time. To simulate this temporal evolution in a network’s structure, we follow two steps. In the first step, node influence is changed by swapping nodes between influence bands, effectively swapping the number of connections a node has. In the second step, to capture the creation or deletion of edges, a fraction of the intra-community edges are selected and rewired, and similarly the same fraction of the inter-community edges are rewired. This new configuration of the network represents a new time slice. The detailed generation process is explained below:

  1. Create the network with communities and bands
    1. Create two networks that have a small number of nodes with high degree (band 1), a moderate number of nodes with moderate degree (band 2), and a large number of nodes with low degree (band 3). Note that here we use degree classes as the true bands; however, other measures of interest may be used to define the true bands, such as the length of the shortest path the node is part of. Each one of these networks will be a community in the synthetic network to be studied.
    2. Connect these two networks (communities) together by adding a small number of edges between randomly selected nodes in different communities. The number of edges between communities is required to be much smaller than the number of edges inside each community, as we are building fragmented networks.
  2. Create the temporal evolution
    1. To create the temporal evolution of the network, select x% of nodes uniformly at random from each band and swap their original bands by changing the nodes’ labels, i.e., if node a swaps position with node v, in practice, all that changes is their labelling. To better model the behaviour of nodes in a time-evolving network, nodes can only change from one band to its neighbouring band(s), that is, a node originally in band 1 can only change to band 2, a node in band 2 can either change to band 1 or band 3, and a node in band 3 can only change to band 2. In this step, in order to change a node a from band 1 to band 2, for example, we require a node v originally in band 2 to swap places with a and become a band 1 node. It is important to note that the number of nodes in each band is maintained.
    2. To make the temporal evolution of the network closer to reality, rewire a percentage of the intra-community edges in the new network time slice. Do the same for a percentage of the inter-community edges. It is important to note that if this percentage is set to 0%, the network structure from the previous time slice will be maintained (only nodes’ labels will be swapped), whereas if this percentage is set to 100%, the new time slice will be a complete randomisation of the previous time slice. Furthermore, in the limit of an infinite number of time slices, the last time slices will be a complete randomisation of the first time slice. If by deleting an edge a node becomes disconnected, it is then reconnected to a randomly selected node of its own community.
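
The generation steps above can be sketched as follows. This is an illustrative reading of the procedure, not the authors’ code, and it makes several assumptions that are not stated in the text: communities are built by random stub matching (self-loops and duplicate edges are dropped, so degrees are only approximately as requested), swaps happen within a community, the number of swaps is a fraction of the higher band rounded up, and rewiring redirects one endpoint of an edge to a random node of the same community as the old endpoint. The reconnection of isolated nodes is omitted. All function names are hypothetical.

```python
import math
import random

def build_community(sizes, degrees, offset, rng):
    """Step 1(a): bands 1..3 with a target degree per node (stub matching)."""
    band, stubs, node = {}, [], offset
    for b, (size, deg) in enumerate(zip(sizes, degrees), start=1):
        for _ in range(size):
            band[node] = b
            stubs += [node] * deg
            node += 1
    rng.shuffle(stubs)
    pairs = zip(stubs[::2], stubs[1::2])
    return band, {(min(u, v), max(u, v)) for u, v in pairs if u != v}

def swap_positions(band, community, edges, fraction, rng):
    """Step 2(a): nodes of neighbouring bands exchange places in the network."""
    perm, moved = {n: n for n in band}, set()

    def members(com, b):
        return [n for n in sorted(band)
                if community[n] == com and band[n] == b and n not in moved]

    for com in sorted(set(community.values())):
        for hi in (1, 2):                          # band pairs (1,2) and (2,3)
            upper, lower = members(com, hi), members(com, hi + 1)
            k = min(math.ceil(fraction * len(upper)), len(lower))
            for a, c in zip(rng.sample(upper, k), rng.sample(lower, k)):
                perm[a], perm[c] = c, a            # a takes c's place and vice versa
                moved |= {a, c}                    # each node moves at most once
    new_band = {n: band[perm[n]] for n in band}
    new_edges = {(min(perm[u], perm[v]), max(perm[u], perm[v])) for u, v in edges}
    return new_band, new_edges

def rewire(edges, community, fraction, rng):
    """Step 2(b): redirect one endpoint of a fraction of each edge type."""
    nodes_of = {}
    for n in sorted(community):
        nodes_of.setdefault(community[n], []).append(n)
    edges = set(edges)
    for intra in (True, False):
        pool = sorted(e for e in edges
                      if (community[e[0]] == community[e[1]]) == intra)
        for u, v in rng.sample(pool, int(fraction * len(pool))):
            w = rng.choice(nodes_of[community[v]])  # keeps the edge type
            new = (min(u, w), max(u, w))
            if w != u and new not in edges:
                edges.remove((u, v))
                edges.add(new)
    return edges

rng = random.Random(1)
band1, edges1 = build_community((5, 50, 500), (30, 10, 2), 0, rng)
band2, edges2 = build_community((5, 50, 500), (30, 10, 2), len(band1), rng)
band = {**band1, **band2}
community = {**{n: 1 for n in band1}, **{n: 2 for n in band2}}
inter = set(zip(rng.sample(sorted(band1), 100), rng.sample(sorted(band2), 100)))
slice0 = edges1 | edges2 | inter
next_band, swapped = swap_positions(band, community, slice0, 0.10, rng)
slice1 = rewire(swapped, community, 0.10, rng)
```

Repeating the last two calls on `next_band` and `slice1` yields further time slices. Because nodes exchange places rather than edges, the number of nodes in each band and the number of edges of each type stay fixed from one slice to the next.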

Fig 3 shows an example figure of this process. In this figure, nodes of darker colour and larger size are nodes of greater degree, inter-community edges are blue-coloured, and red edges represent the changes in the network structure.

Fig 3. Schematic of the process for building a temporal BandNet.

Here, nodes of darker colour and larger size are nodes of greater degree, inter-community edges are blue-coloured, and red edges represent the changes in the network structure.

https://doi.org/10.1371/journal.pone.0337753.g003

After studying examples of the synthetic BandNet networks using the temporal centralities previously discussed, we will analyse a Twitter conversation network, and a randomised version of it, created as explained in S1 Appendix. Random-graph models constructed from real networks perform well in estimating the quantities investigated, and in some cases give results of high accuracy [70]. We will therefore check how centrality measures in a randomised network behave compared to their performance in the original network.

Classification of nodes into bands of influence

After applying the centrality measures to the previously described networks, we need to classify the nodes into bands of influence according to their centrality score, for each centrality measure, in order to compare them, with T-ICM being used as the ground truth in empirical networks. We use a clustering technique to identify groups of nodes that are closely related according to their centrality scores. We apply hierarchical clustering with complete linkage as we are seeking maximal intercluster dissimilarity [71], i.e., groups that are as far apart from each other as possible to avoid overlaps. Here clusters correspond to the bands of influence. We then check the optimum number of bands by using the elbow method. As we divide the nodes into three bands throughout our analysis (band 1 consisting of highly influential nodes, band 2 consisting of mid-influential nodes, and band 3 consisting of low-influence nodes), if the optimum number of bands found through hierarchical clustering is greater than 3, we merge bands together according to their average centrality score until we get 3 bands. In the rare case where the optimum number of clusters is less than 3, we cut the dendrogram so that it yields 3 clusters.
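
A minimal sketch of the banding step is given below. It implements complete-linkage agglomerative clustering on one-dimensional centrality scores and simply stops at three clusters; the elbow-method check and the subsequent merging of bands are not reproduced. The function name and the example scores are hypothetical, and the quadratic search over cluster pairs is only suitable for small examples.

```python
def influence_bands(scores, n_bands=3):
    """Cluster nodes by score with complete linkage and label bands 1..n."""
    clusters = [[node] for node in scores]

    def distance(c1, c2):
        # complete linkage: largest pairwise gap between the two clusters
        return max(abs(scores[a] - scores[b]) for a in c1 for b in c2)

    while len(clusters) > n_bands:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]   # merge the closest pair
        del clusters[j]
    # band 1 is the cluster with the highest mean score
    clusters.sort(key=lambda c: sum(scores[n] for n in c) / len(c), reverse=True)
    return {node: band for band, c in enumerate(clusters, start=1) for node in c}

# Hypothetical centrality scores for seven nodes.
scores = {"a": 0.90, "b": 0.85, "c": 0.50, "d": 0.45,
          "e": 0.10, "f": 0.12, "g": 0.08}
bands = influence_bands(scores)
```

For larger networks a library routine for hierarchical clustering would be the practical choice; the loop above is meant only to show what complete linkage does to the scores.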

With nodes clustered into bands, we assess the performance of each influence measure against 1) the true bands for synthetic networks, or 2) the band classification given by every other influence method for the RT8 network (note that when we lack ground truth we rely on the T-ICM as the benchmark for the other methods). To do so, we use balanced accuracy (BA), a metric used to evaluate the performance of a classification model. It is calculated as the average, over all classes (or bands, in our study), of the fraction of correctly classified nodes, i.e.,

$$\mathrm{BA} = \frac{1}{3}\sum_{n=1}^{3}\frac{b_n}{B_n},$$

where $b_n$ is the number of correctly classified nodes into band $n$ and $B_n$ is the number of nodes that belong to band $n$ in the reference classification.
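As an illustration of the metric, the following sketch computes balanced accuracy as the average, over bands, of the fraction of reference nodes in each band that a method places in that same band. It assumes the standard definition of balanced accuracy (mean per-class recall); the function name and the toy labels are hypothetical.

```python
def balanced_accuracy(true_bands, predicted_bands, bands=(1, 2, 3)):
    """Mean per-band recall of predicted_bands against true_bands."""
    recalls = []
    for n in bands:
        members = [i for i, t in enumerate(true_bands) if t == n]
        if not members:
            continue  # a band with no reference nodes contributes nothing
        correct = sum(1 for i in members if predicted_bands[i] == n)
        recalls.append(correct / len(members))
    return sum(recalls) / len(recalls)


# Toy example: 2 nodes in band 1, 4 in band 2 and 8 in band 3.
true_toy = [1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]
pred_toy = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]
ba_toy = balanced_accuracy(true_toy, pred_toy)  # (1/2 + 3/4 + 8/8) / 3
```

Because each band contributes equally regardless of its size, a method cannot score well simply by assigning every node to the very large band 3.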

In the next section, we outline the properties of the networks that will be analysed. Following this, we show how our proposed method works for the synthetic networks and examine the Twitter conversation network RT8 to identify possible bands of influence.

Properties of the networks analysed

We start our analysis with the previously outlined synthetic BandNet network, which contains two communities and well-defined bands of influence for nodes. We will use different degree structures with BandNet, which become progressively more complex to make the simulated network more realistic, allowing us to see how the centrality methods perform in increasingly complex settings. We will start with a fixed set of possible degrees that each node can take: low degree (for most nodes), moderate degree (for some), and high degree (for a few). We will then move to a less homogeneous network structure, where each node degree is sampled from a Poisson distribution, and explore the effect of relative community size on the centrality measures. Finally, we will study the RT8 network originally studied in Ref [13] and a random version of this network created using the configuration model, as described in S1 Appendix. All the data used in this paper are publicly available at [72]. Table 1 shows the properties of the studied networks.

We start with the simplest network, BandNet1, which has nodes with fixed degree values spread over two communities of the same size and evolves over four time slices. Band 1 contains 10 nodes (5 from each community) of degree 30, band 2 contains 100 nodes (50 from each community) of degree 10, and band 3 contains 1 000 nodes (500 from each community) of degree 2. We link the two communities by drawing edges between 100 random nodes in community 1 and 100 random nodes in community 2, sampled without replacement. The subsequent time slices are created by applying step 2(a) of the network creation process, where 10% of the nodes in each band may move to a neighbouring band (for example, a node originally in band 3 may move to band 2, provided band 2 can still take swaps), and by rewiring 10% of intra-community edges and 10% of inter-community edges following step 2(b). Again, this process ensures that the number of nodes in each band does not change over time.
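The initial BandNet1 setup can be sketched as follows. The band sizes, degrees and the number of inter-community links are those stated above; the node numbering, function name and random seed are assumptions made for illustration, and the sketch does not wire the intra-community edges or apply the temporal steps 2(a) and 2(b).

```python
import random

# Per-community degree sequence: 5 nodes of degree 30, 50 of degree 10
# and 500 of degree 2 (bands 1, 2 and 3, respectively).
community_degrees = [30] * 5 + [10] * 50 + [2] * 500


def link_communities(nodes_c1, nodes_c2, n_links, seed=0):
    """Pair n_links distinct nodes of each community, sampled without replacement."""
    rng = random.Random(seed)
    left = rng.sample(nodes_c1, n_links)
    right = rng.sample(nodes_c2, n_links)
    return list(zip(left, right))


# Hypothetical labelling: nodes 0..554 form community 1, 555..1109 community 2.
community_1 = list(range(0, 555))
community_2 = list(range(555, 1110))
inter_edges = link_communities(community_1, community_2, n_links=100)
```

Sampling without replacement means that each selected node receives exactly one inter-community link, so no single node becomes a dominant bridge between the two communities.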

To increase the complexity of our synthetic network, while still keeping it reasonably simple, we create BandNet2, a network with two communities of the same size in which the degree of each node is sampled from one of three Poisson distributions: band 1 contains 5 nodes in each community whose degree is sampled from a Poisson distribution with mean 40, band 2 contains 50 nodes in each community whose degree is sampled from a Poisson distribution with mean 20, and band 3 contains 500 nodes in each community whose degree is sampled from a Poisson distribution with mean 5. There are 100 inter-community edges. The time evolution is created in the same way as before.

The BandNet3 network structure aims to understand how influence measures behave when communities are of different sizes. It contains 555 nodes in community 1 and 1 110 nodes in community 2, that is, community 2 is twice the size of community 1. Its structure is as follows: Band 1 contains 5 nodes of community 1 and 10 nodes of community 2 with degree sampled from a Poisson distribution with average 40, band 2 contains 50 nodes of community 1 and 100 nodes of community 2 with degree sampled from a Poisson distribution with average 20, and band 3 contains 500 nodes of community 1 and 1 000 nodes of community 2 with degree sampled from a Poisson distribution with average 5. There are 100 inter-community edges. The time evolution is created the same way as before.
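A minimal sketch of the degree sampling for BandNet3 is given below. The band sizes and Poisson means are those stated above; the sampler (Knuth's multiplication method), the function names and the seed are illustrative assumptions rather than the authors' code.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def bandnet3_degrees(seed=0):
    """Return (community, band, degree) triples for the initial time slice."""
    rng = random.Random(seed)
    band_means = {1: 40, 2: 20, 3: 5}
    band_sizes = {
        "C1": {1: 5, 2: 50, 3: 500},
        "C2": {1: 10, 2: 100, 3: 1000},
    }
    degrees = []
    for community, sizes in band_sizes.items():
        for band, size in sizes.items():
            for _ in range(size):
                degrees.append((community, band, sample_poisson(band_means[band], rng)))
    return degrees


degrees = bandnet3_degrees()
band3 = [k for _, band, k in degrees if band == 3]
band3_mean = sum(band3) / len(band3)
```

Because the three Poisson distributions overlap, a sampled degree does not determine the band of a node uniquely, which is one reason the recovered bands can deviate from the true ones.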

The real-world RT8 network was constructed from Twitter/X mentions around the Irish Abortion Referendum of 2018, using mentions among the most active users that tweeted using at least one of the tracked hashtags #repealthe8th, #savethe8th, #loveboth, #together4yes, and #retainthe8th during a period in May 2018 that ended two days after the referendum. A previous analysis [13] found polarised communities that represent the Yes- and No-vote supporters. The network contains nodes in community 1 and N2 = 463 nodes in community 2. There are 7 353 inter-community edges, against 127 242 edges within community 1 and 21 197 edges within community 2. The same applies to the configuration model network, which has the same degree distribution as the RT8 network. We will now present and discuss the results for the three synthetic networks before discussing the results of the RT8 network.

Results and discussion

We now present and discuss the results for three synthetic networks generated by using the method previously explained, and for the RT8 Twitter network as previously summarised.

BandNet: Synthetic networks

Our synthetic networks allow us to compare the results of different centrality methods in a setting where we know the band and community structure (our ground truth). Thanks to the way the synthetic networks are constructed, we are able to assess the methods’ accuracy against the true bands, that is, how effective each centrality method, together with the clustering technique, is in capturing which nodes fall into each band over time.

BandNet1 with communities of the same size and fixed degree values

As mentioned above, we start with BandNet1, a simple example network with two communities of the same size (555 nodes each) that evolves over four time slices. Fig 4 shows the evolution of the network structure.

Fig 4. Time slices of BandNet1.

Nodes are coloured according to the community they belong to, and the size of the node reflects its degree. Layout produced using the Force Atlas algorithm.

https://doi.org/10.1371/journal.pone.0337753.g004

We assess how each centrality measure captures the temporal dynamics of the network by looking at the band flow dynamics over time (Fig 5(a)–5(f)), the joint community-time centrality (Fig 5(g)–5(l)), the nodes in band 1 over time (Fig 5(m)–5(r)), and the summary table containing the joint community-time centrality scores, the MLC and the MCC for each community (Fig 5(s)–5(x)).

Fig 5. Results for BandNet1.

(a)–(f) show how many nodes are in each band in each time slice, and how nodes move between bands in subsequent time slices; (g)–(l) show the normalized influence score for each community over time; (m)–(r) show how many nodes of each community are classified in band 1 over time; (s)–(x) show the summary tables containing the joint community-time scores over time, the marginal layer centrality (MLC) over time and the marginal community centrality (MCC) for each community. Here, the infection probability for T-ICM is set to .

https://doi.org/10.1371/journal.pone.0337753.g005

Since nodes in BandNet1 can only assume one of three possible degree values, we expect the centrality methods combined with our band clustering method to capture the true bands, as nodes swap places with one another without changing the network structure. The only change in the network structure comes from the rewiring of 10% of intra-community edges and 10% of inter-community edges between time slices. Comparing results for the band flow over time (Fig 5(a)–5(f)), we see that T-ICM, degree centrality and PageRank capture this behaviour well, keeping bands of similar sizes over time. Closeness, eigenvector and Katz centralities also capture these temporal dynamics, but with less accuracy. We therefore show that we can successfully aggregate nodes into bands of influence (research question 1) in this simple setting.

To answer research question 2 (“Does the overall influence of a specific community within a fragmented network change over time?”), we use the joint community-time scores (Fig 5(g)–5(l)) and the tables in the fourth column of Fig 5, and conclude the following. (1) T-ICM, degree, closeness and Katz present similar behaviour, with scores decaying over time. This is partially explained by the fact that, in these methods, a piece of information starting in time slice 1 has the chance to spread until time slice 4, whereas a piece of information that starts in time slice 4 can only spread through its own time slice, as it is the last one. The eigenvector-based centralities (eigenvector and PageRank), on the other hand, consider not only the subsequent time slices but also the previous ones, and tend to assign higher scores to the central time slices [52]. (2) Eigenvector centrality consistently gives significantly higher scores to nodes in community 1, which is an indication that it does not perform well in fragmented networks. This is supported by the fact that eigenvector centrality can be used for community detection in networks with high enough modularity [73,74].

The third column of Fig 5 helps us answer research question 3 (“How do influential nodes differ across communities?”). T-ICM, degree centrality and PageRank show the same number of nodes (5 nodes) from each community in band 1, which remains the same over time. This is the expected result, as the communities are of the same size. The summary tables in the fourth column of Fig 5 show that the marginal community centrality (MCC) is similar (close to 50%) for both communities in all centrality methods except eigenvector. This is expected, as communities are of the same size and bands should remain the same size throughout the temporal dynamics. Eigenvector centrality returns different MCC values for each community as it is not the most appropriate method for fragmented networks, as previously pointed out.
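To make the summary tables more concrete, the sketch below builds a joint community-time table and its two marginals from node-level scores. It rests on assumptions that are not confirmed by the text here: the joint score of a community in a time slice is taken as the mean node score, the table is normalised to sum to one, and the MLC and MCC are obtained by summing the table over communities and over time slices, respectively. The function name and toy data are hypothetical.

```python
def community_time_table(scores, community):
    """scores[t][i]: score of node i in slice t; community[i]: label of node i.

    Returns (joint, mlc, mcc) under the assumptions stated in the lead-in.
    """
    labels = sorted(set(community))
    joint = {c: [] for c in labels}
    for slice_scores in scores:
        for c in labels:
            members = [s for s, lab in zip(slice_scores, community) if lab == c]
            joint[c].append(sum(members) / len(members))
    total = sum(sum(row) for row in joint.values())
    joint = {c: [v / total for v in row] for c, row in joint.items()}
    mcc = {c: sum(row) for c, row in joint.items()}  # marginal over time
    mlc = [sum(joint[c][t] for c in labels) for t in range(len(scores))]
    return joint, mlc, mcc


# Toy example: two slices, two nodes per community, scores halving over time.
toy_node_scores = [
    [4.0, 2.0, 4.0, 2.0],
    [2.0, 1.0, 2.0, 1.0],
]
toy_community = ["C1", "C1", "C2", "C2"]
joint, mlc, mcc = community_time_table(toy_node_scores, toy_community)
```

In this toy case both communities obtain an MCC of 0.5, mirroring the balanced outcome expected when communities are of the same size, while the MLC decreases from the first to the second slice.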

Table 2 shows the balanced accuracy for the investigated methods against the true bands in BandNet1. Here the true bands are tracked over time from the initial setup in time slice t1, i.e., nodes that swap bands are tracked over time. T-ICM, degree and PageRank score the highest against the true bands, with an overall balanced accuracy of 0.9. Closeness and Katz follow closely, and eigenvector centrality scores much lower (overall 0.67). Time slice t1 has the highest balanced accuracy for every method except closeness. This is expected, as no rewiring has occurred at the initial setup of t1 and the bands are more clearly laid out.

BandNet2 with communities of the same size and Poisson degree distribution

As a natural and simple extension to the synthetic network we have generated, BandNet2 has two communities of the same size, where the node degrees in each band are drawn from a Poisson distribution, as explained previously. Fig 6 illustrates the network time slices.

Fig 6. Time slices of BandNet2.

Nodes are coloured according to the community they belong to, and the size of the node reflects its degree. Layout produced using the Force Atlas algorithm.

https://doi.org/10.1371/journal.pone.0337753.g006

As the degree distributions of the bands overlap (Fig 7), we expect more variability in the results compared to BandNet1. In fact, although T-ICM and degree centrality (Fig 8(a) and 8(b)) still show bands of consistent sizes over time, the initial network configuration of 10 nodes in band 1, 100 nodes in band 2 and 1 000 nodes in band 3 is not perfectly captured. PageRank, which was very successful in capturing the initial setup and keeping bands of the same size over time in BandNet1, now fails to capture the initial network configuration well and shows bands that fluctuate more in size over time (Fig 8(e)). Katz centrality (Fig 8(f)) is successful in maintaining bands of similar sizes throughout the temporal network; however, it does not capture the initial setup, and its band 1 consists of only one node. Closeness centrality, on the other hand, shows bands that vary greatly in size over time (Fig 8(c)). This is a reminder that closeness centrality, being based on shortest paths, is not directly related to the other centrality measures studied here, which are related to the degree of a node.

Fig 7. Degree distribution of bands in the initial setup of BandNet2.

Here, band 1 has average degree of 40, band 2 has average degree of 20, and band 3 has average degree of 5 (the Poisson means used to generate the network).

https://doi.org/10.1371/journal.pone.0337753.g007

Fig 8. Results for BandNet2.

(a)–(f) show how many nodes are in each band in each time slice, and how nodes move between bands in subsequent time slices; (g)–(l) show the normalized influence score for each community over time; (m)–(r) show how many nodes of each community are classified in band 1 over time; (s)–(x) show the summary tables containing the joint community-time scores over time, the marginal layer centrality (MLC) over time and the marginal community centrality (MCC) for each community. Here, the infection probability for T-ICM is set to .

https://doi.org/10.1371/journal.pone.0337753.g008

The behaviour of the joint community-time centrality scores (Fig 8(g)–8(l) and 8(s)–8(x)) is similar to that observed in BandNet1, with T-ICM, degree, closeness and Katz centralities decreasing over time, PageRank giving higher scores to the mid-time slices, and eigenvector centrality consistently attributing higher scores to nodes in community 1. As for the nodes in band 1 (Fig 8(m)–8(r)), T-ICM is successful in capturing a consistent number of nodes in each community over time. Degree centrality and PageRank also capture this dynamic well, with small deviations in t1 and t4. Closeness centrality, however, shows a downward trend in the number of nodes in band 1 overall, in both communities. Eigenvector centrality, similarly to what was observed in BandNet1, attributes the highest scores to nodes in community 1; therefore, only nodes in C1 are present in band 1. Katz centrality also shows only community 1 in band 1; however, this is because its band 1 contains a single node. MCC scores (Fig 8(s)–8(x)) tell us that communities have exactly the same average influence over time according to PageRank, and very similar influence according to T-ICM, degree and closeness centralities. Eigenvector and Katz, on the other hand, attribute higher influence to community 1, i.e., MCC is higher for C1 than for C2, which shows that these methods favour one of the communities to the detriment of the other. Again, this behaviour is expected for the eigenvector centrality given the high modularity of the network, but further investigation is needed to understand why this is the case for Katz. For example, does α play a role?

According to Table 3, PageRank, T-ICM and degree centrality score the highest overall balanced accuracy (), when compared to the tracked true bands. Eigenvector, Katz and closeness centralities score lower, in this order. Eigenvector centrality scores higher in this network when compared to BandNet1, while all other methods score slightly lower, which is due to the overlapping of degree distributions (Fig 7), resulting in the higher variability of the degree structure for this network, as previously pointed out.

We will now analyse the results for a network with Poisson distributions and communities of different sizes to understand the impact of community size on the influence of nodes in the network as a whole.

BandNet3 with communities of different sizes and Poisson degree distribution

BandNet3 consists of a network with two communities, where C1 is half the size of C2, i.e., N1 = 555 nodes and N2 = 1 110 nodes. The degree of the nodes is Poisson distributed as described in Table 1. In the initial configuration t1, band 1 has 5 nodes in C1 and 10 nodes in C2, band 2 has 50 nodes in C1 and 100 nodes in C2, and band 3 has 500 nodes in C1 and 1 000 nodes in C2. Fig 9 illustrates the network over 4 time slices.

Fig 9. Time slices of BandNet3.

Nodes are coloured according to the community they belong to, and the size of the node reflects its degree. Layout produced using the Force Atlas algorithm.

https://doi.org/10.1371/journal.pone.0337753.g009

Fig 10 shows the results of the T-ICM and temporal centrality methods on BandNet3. BandNet3 behaves similarly to BandNet2, that is, T-ICM and degree centrality show bands of reasonably consistent sizes over time (Fig 10(a) and 10(b)); however, they present larger deviations from the true bands than in BandNet2. The initial network configuration at t1 is not well captured by any of the methods, with T-ICM coming closest, which is expected as this is the benchmark method. PageRank on BandNet3 shows bands that fluctuate more in size over time than in the previous BandNet examples (Fig 10(e)). Katz centrality on BandNet3 (Fig 10(f)) captures the initial 15-150-1 500 band-size setup better than it does in BandNet2; however, band 2 in t3 is considerably smaller than in the other time slices. Closeness and eigenvector centralities show bands that vary greatly in size over time and are not very successful in capturing the t1 configuration (Fig 10(c) and 10(d)), which is again likely due to the low capacity of eigenvector centrality to measure centrality in networks with high modularity and to the non-straightforward relationship between closeness and degree distribution.

Fig 10. Results for BandNet3.

(a)–(f) show how many nodes are in each band in each time slice, and how nodes move between bands in subsequent time slices; (g)–(l) show the normalized influence score for each community over time; (m)–(r) show how many nodes of each community are classified in band 1 over time; (s)–(x) show the summary tables containing the joint community-time scores over time, the marginal layer centrality (MLC) over time and the marginal community centrality (MCC) for each community. Here, the infection probability for T-ICM is set to .

https://doi.org/10.1371/journal.pone.0337753.g010

The behaviour of the joint community-time centrality scores (Fig 10(g)–10(l) and 10(s)–10(x)) is similar to the ones observed in BandNet1 and BandNet2, with T-ICM, degree, closeness and Katz centralities showing a decreasing behaviour over time, PageRank giving higher scores to the central time slices, and eigenvector centrality consistently attributing higher scores to nodes in one of the communities (again C1).

The results are quantitatively different for the nodes in band 1 (Fig 10(m)–10(r)), where every method except eigenvector centrality places a larger number of nodes from the larger community C2 in band 1. Under eigenvector centrality, band 1 is entirely composed of nodes in the smaller community C1. This is due to the information getting confined through random walks in C1, given the network’s high modularity [73]. Closeness centrality, however, attributes only nodes in C2 to band 1 in the first time slices t1 and t2. MCC scores (Fig 10(s)–10(x)) show that communities have exactly the same average influence over time according to T-ICM, or very similar influence according to degree and PageRank centralities. This may be explained by the fact that bands are of proportionate sizes in both communities, i.e., each band in C1 is initially set up to be half the size of the corresponding band in C2, the same proportion as the number of nodes between communities. Closeness and Katz centralities attribute slightly higher scores to the larger community C2, while eigenvector attributes a much higher influence score to C2, despite band 1 being composed of nodes in C1 only, i.e., the larger number of nodes in C2 biases the scores overall.

When compared to the true bands, Table 4 shows that PageRank scores the highest balanced accuracy (0.88) among the methods, followed by T-ICM and Katz centrality, which score 0.81, degree centrality with 0.78, eigenvector centrality with 0.77 and lastly closeness centrality with 0.73.

From the analysis of our synthetic networks we conclude that the eigenvector centrality is not appropriate to compute the influence of nodes in a fragmented temporal network, and PageRank consistently performs well in this type of network. Furthermore, the results obtained with closeness centrality cannot be straightforwardly compared to the results obtained by the degree-based centrality methods, as it is based on shortest paths. Next we will analyse the results of T-ICM and the centrality methods here studied in the real RT8 network composed of Twitter mentions on the Irish Abortion Referendum of 2018.

Empirical network analysis: The Irish abortion referendum Twitter network

The Twitter mentions network on the Irish Abortion Referendum of 2018 was studied in Ref [13], where the authors showed a clearly polarised (and, therefore, fragmented) environment. In the context of a referendum, there is a clear community of users that supports the Yes vote, and another clear community that supports the No vote. The network has nodes— in the Yes community and 463 in the No community, connected by edges. Holistically, we can consider four time-slices according to important events that affect the network (Fig 11). There were three televised debates, two of them occurred on the same day, therefore the time steps are t1) before debates, t2) after debate 1 and before debates 2 and 3, t3) after debates 2 and 3 and before the referendum day, and t4) on the referendum day.

Fig 11. Time slices on the Irish RT8 network.

Time slices of the network showing the number of users, the number of new users coming into the conversation, and the number of links among users in each time slice.

https://doi.org/10.1371/journal.pone.0337753.g011

The original Irish abortion referendum network

Real-world networks often display complex structures. Online social networks, in particular, often present heavy-tailed degree distributions [14,58,75], where there are many nodes with only a few edges and a few nodes (hubs) with a large number of edges [76]. In this setting, since only a few nodes have a much higher degree than the vast majority of the nodes in the network, we expect the highest bands, bands 1 and 2, to be significantly narrower than band 3, which should encompass the vast majority of nodes. Analysing Fig 12(a)–12(f), we see that every method except closeness centrality is able to capture this behaviour.

Fig 12. Results for the original RT8 network.

(a)–(f) show how many nodes are in each band in each time slice, and how nodes move between bands in subsequent time slices; (g)–(l) show the normalized influence score for each community over time; (m)–(r) show how many nodes of each community are classified in band 1 over time; (s)–(x) show the summary tables containing the joint community-time scores over time, the marginal layer centrality (MLC) over time and the marginal community centrality (MCC) for each community. Here, the infection probability for T-ICM is set to .

https://doi.org/10.1371/journal.pone.0337753.g012

The joint community-time scores shown in Fig 12(g)–12(l) and 12(s)–12(x) behave similarly to those of the synthetic networks previously studied. Degree centrality attributes slightly higher scores to nodes in C2, which is the opposite of the behaviour shown by closeness, eigenvector, PageRank and Katz. This is due to C2 having more tightly connected nodes than C1; therefore, the average degree by community gives higher scores to C2. The attribution of higher scores to nodes in C1 by closeness, eigenvector, PageRank and Katz centralities suggests that these methods are more sensitive to the size of communities and tend to give higher scores to the largest community. This translates into the MCC scores, where these methods attribute higher scores to C1 (the largest difference being for eigenvector centrality), while degree attributes a higher MCC score to C2. T-ICM, however, attributes similar influence over time to both communities, i.e., the MCC scores of the communities are similar to each other and close to 0.5.

Every method captured only one or two nodes in band 1 in each time slice, and these nodes are from C1, i.e., the Yes community in the polarised network. Eigenvector captures the same node in band 1 throughout the time slices, as does PageRank. However, the nodes captured by each method differ. The node captured by eigenvector is also captured by T-ICM, degree, closeness and Katz up to time slice t3, and this node presents high out-degree but low in-degree, that is, this user mentions many users in the network but is rarely mentioned by other users. The node captured by PageRank is a highly active user in canvassing for the Yes vote. This user mentions and is mentioned by many users in the network and effectively acts as a hub of information. This node is not captured in band 1 by any method other than PageRank. Other users captured in band 1 by the range of methods are (1) an influential Irish novelist, (2) an active user canvassing for the Yes vote, and (3) a user who is now suspended on X and about whom we have no information.

Fig 13 shows the balanced accuracy between pairs of influence methods. PageRank diverges greatly from other methods, while Katz, closeness and degree show good agreement with the benchmark T-ICM.

Fig 13. Balanced accuracy between pairs of influence methods in the original RT8 network.

Darker colours represent higher balanced accuracy.

https://doi.org/10.1371/journal.pone.0337753.g013

Configuration model on the Irish abortion referendum network

The results of the analysis on the original real-world RT8 network raise the question: to what extent are the degree distributions inside and between communities important for measuring nodes’ influence? The configuration model of a network is a way to simplify its structure while maintaining the nodes’ degrees, as previously outlined (see S1 Appendix for an explanation of how the configuration model can be used on networks with communities). Therefore, if we get the same results as for the original network, we know that, for the purposes of the analysis, the degree distributions are the most important factors for centrality scores. Fig 14 shows the results for T-ICM and the centrality measures on the configuration-model version of the RT8 network.
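The idea of randomising a network while keeping every node's degree can be illustrated with the basic stub-matching form of the configuration model. This is a sketch only: it ignores community labels and edge direction, it allows self-loops and multi-edges, and it is not the community-aware procedure of S1 Appendix. The function name and the toy degree sequence are hypothetical.

```python
import random
from collections import Counter


def configuration_model(degrees, seed=0):
    """Randomly pair edge stubs so that every node keeps its degree.

    degrees: dict mapping node -> degree. Returns a list of edges.
    Self-loops and repeated edges may occur in this basic variant.
    """
    stubs = [node for node, k in degrees.items() for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("the sum of degrees must be even")
    rng = random.Random(seed)
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))


toy_degrees = {"a": 3, "b": 2, "c": 2, "d": 1}
toy_edges = configuration_model(toy_degrees)

# Each appearance of a node as an edge endpoint counts once towards its degree.
realised = Counter(node for edge in toy_edges for node in edge)
```

Whatever the random pairing, the realised degree of every node equals its prescribed degree, so any difference in centrality scores relative to the original network reflects structure beyond the degree sequence.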

Fig 14. Results for the configuration model of the RT8 network.

(a)–(f) show how many nodes are in each band in each time slice, and how nodes move between bands in subsequent time slices; (g)–(l) show the normalized influence score for each community over time; (m)–(r) show how many nodes of each community are classified in band 1 over time; (s)–(x) show the summary tables containing the joint community-time scores over time, the marginal layer centrality (MLC) over time and the marginal community centrality (MCC) for each community. Here, the infection probability for T-ICM is set to .

https://doi.org/10.1371/journal.pone.0337753.g014

As for the original network, bands 1 and 2 are very narrow when compared to band 3, as the overall degree distribution of the network is maintained and is still heavy-tailed. Closeness centrality gives a narrower band 2 in time slice t1 when compared to the original network, which is due to the rewiring process, as this centrality method is based on paths. PageRank, which attributed higher joint community-time scores to nodes in C1 in the original network, now attributes higher scores to nodes in C2 (Fig 14(k)), and Katz, eigenvector and closeness, which attributed slightly higher scores to C1 in the original network, now attribute slightly higher scores to C2 (Fig 14(i), 14(j) and 14(l)). All methods now capture only one node in band 1 (Fig 14(m)–14(r)), which is the same over time. This node is the same for T-ICM, degree, eigenvector, closeness and Katz centralities, and is one of the top-ranked nodes in the original network. PageRank, however, attributes the highest score to a different node, the same node to which it attributed the highest score in the original network.

MCC scores are now slightly higher and close to 0.5 for C2 according to every method, as opposed to slightly higher for C1 as before (except degree centrality, which was higher for C2 in the original network). Eigenvector centrality now also attributes similar MCC scores to both communities, which may be due to the randomisation process decreasing the modularity of the network (modularity is now against 0.22 in the original network).
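As a rough consistency check on the modularity of the original network, the sketch below evaluates the standard Newman modularity for a two-community partition from the edge counts reported in the network properties section (127 242 edges within community 1, 21 197 within community 2 and 7 353 between them). It assumes that edges can be treated as undirected and unweighted, which may differ from the definition the authors used; the function name is hypothetical.

```python
def two_community_modularity(edges_within_1, edges_within_2, edges_between):
    """Newman modularity of a two-block partition of an undirected graph.

    Q = sum over communities of [ L_c / m - (d_c / (2 m))^2 ], where L_c is
    the number of edges inside community c, d_c the sum of degrees in c,
    and m the total number of edges.
    """
    m = edges_within_1 + edges_within_2 + edges_between
    q = 0.0
    for within in (edges_within_1, edges_within_2):
        # Internal edges contribute two endpoints, bridging edges one.
        degree_sum = 2 * within + edges_between
        q += within / m - (degree_sum / (2 * m)) ** 2
    return q


q_rt8 = two_community_modularity(127_242, 21_197, 7_353)
```

Under these assumptions the value comes out at roughly 0.22, in line with the modularity quoted for the original network.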

For the same reason, the balanced accuracy (Fig 15) of eigenvector centrality increased compared to the values for the original network. The balanced accuracy for Katz, closeness and degree centralities against T-ICM decreased slightly when compared to the values for the original network, however.

Fig 15. Balanced accuracy between pairs of influence methods in the configuration model of the RT8 network.

Darker colours represent higher balanced accuracy.

https://doi.org/10.1371/journal.pone.0337753.g015

With this analysis, we conclude that by randomising the network structure using the configuration model, the methods are still able to capture the same highest ranked nodes as for the original network, and eigenvector centrality can now be applied as modularity decreases with the rewiring process.

Conclusions and limitations

In this paper, we discussed in detail centrality measures in the context of fragmented networks that evolve over time. We started by building synthetic networks with community structure and bands that allowed us to evaluate the performance of centrality methods in a controlled environment where the influence band of each node in each time slice was known. Note that we used degree classes to define bands; however, other measures of interest may be used, such as the length of the shortest path the node is part of. We showed that we can successfully aggregate nodes into influence bands (a low-score, a mid-score and a high-score band), and showed how to aggregate centrality scores to analyse the influence of communities over time. Additionally, we derived matrices of temporal spread of information that are potentially useful in more theoretical frameworks to compute influence spread in complex networks.

We then studied the influence of communities over time in fragmented temporal networks, according to different methods of centrality and influence diffusion. We showed that our modified version of the T-ICM is a good benchmark for centrality methods in this type of network, especially for online social networks, in which information is available for other users to see and potentially spread further for a long time. Using our version of T-ICM we assessed the performance of the centrality methods in a real-world polarised (and fragmented) network.

From our analysis, T-ICM and degree centrality perform the best in this setting, as they are able to reliably isolate nodes into their bands. The eigenvector centrality, however, does not perform well in fragmented networks due to their high modularity, but it does generate the expected results for a randomised version of the fragmented network. Nonetheless, the rank of nodes computed for the randomised network, where we have isolated certain network properties, is not the same rank for the original network. Closeness centrality, due to its dependence on paths, performs poorly on the synthetic networks as the influence bands were set according to degree distributions. However, it performed well on the empirical network, where the mechanisms that lead to nodes’ influence are hard to disentangle, but are most likely a combination of degree and paths. PageRank performs well in the controlled synthetic networks, although it does not match the behaviour of our T-ICM benchmark in the more complex setting of our real network. Katz centrality seems to perform better in networks with a more complex degree distribution and communities of different sizes (i.e., BandNet3 and RT8 original) than in simpler networks. Therefore, the best centrality method depends on the structure of the network being analysed and the mechanism of influence of interest (if more degree-based or more path-based).

Furthermore, in the networks we have studied, we observe that the size of a community does not necessarily dictate how influential that community is in the whole network. This requires further investigation. In addition, another limitation of our work is that the classification of nodes into influence bands is dictated by the clustering method chosen. Hierarchical clustering was chosen over k-means, as k-means tends to select clusters of similar sizes and therefore performs poorly for our purposes. Alternatively, other clustering methods such as model-based clustering [77] and other classes of hierarchical clustering may be tested, and a further investigation into whether the clustering technique chosen affects the results would be an important study. Moreover, the adoption of a fixed number of influence bands (set to three in our work) may look simplistic; however, it is essential for the purpose of comparison over time and between centrality methods. In the future, these assumptions could be removed via model-based clustering, and a statistical method for determining the optimal number of clusters could be adopted.
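The banding step discussed above can be sketched with agglomerative hierarchical clustering on the one-dimensional centrality scores, cut into three clusters that are then ordered from low to high. The linkage criterion (`"average"`), the function name `influence_bands`, and the toy scores are assumptions made for this example; the clustering settings used in our analysis may differ.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def influence_bands(scores, n_bands=3, method="average"):
    """Assign each node to an influence band (0 = low, ..., n_bands - 1 = high).

    scores  : 1-D array of centrality scores, one per node
    n_bands : maximum number of bands requested from the dendrogram
    method  : linkage criterion passed to scipy (an assumption here)
    """
    scores = np.asarray(scores, dtype=float)
    # Hierarchical clustering on the scores, treated as points on a line
    tree = linkage(scores.reshape(-1, 1), method=method)
    labels = fcluster(tree, t=n_bands, criterion="maxclust")
    # Relabel clusters so that band indices increase with the mean score
    ordered = sorted(np.unique(labels), key=lambda c: scores[labels == c].mean())
    rank = {cluster: band for band, cluster in enumerate(ordered)}
    return np.array([rank[c] for c in labels])

# Hypothetical skewed scores: many low, a few mid, very few high
toy_scores = [0.01, 0.02, 0.015, 0.03, 0.02, 0.5, 0.55, 0.52, 2.0, 2.1]
toy_bands = influence_bands(toy_scores)
```

Because the dendrogram is cut by distance rather than by balancing cluster sizes, the bands are free to have very different sizes, which suits the skewed score distributions for which k-means performs poorly.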

It is also important to note that the temporal centrality methods and the modified T-ICM described here are computationally intensive, as the supra-matrices increase rapidly in size with the network size (number of nodes) and the number of time slices. To minimise this issue, we suggest representing the supra-matrices as sparse matrices, so that one can take advantage of algorithms and software that optimise the calculation of eigenvectors of large sparse matrices, such as ARPACK and C++-based code. Additionally, we use the independent cascade model, which is a simple contagion model, as the ground truth for information spread in real networks. An extension of our work to include complex-contagion models would be highly beneficial and worthwhile. In addition, the optimisation of the parameters α for Katz centrality, ε for the temporal eigenvector-based centrality, and ρ for the modified T-ICM method remains an open research opportunity. This would be easy to explore, as we have already created the simulation scheme required to do so. Finally, it should be noted that the rewiring process used to model temporal evolution in synthetic networks may oversimplify real-world behaviour shifts. In future work, one could extend this analysis to include other rewiring processes for temporal networks with communities, such as dynamic stochastic block models (DSBMs). It is important to highlight, however, that the simplistic rewiring process used in this work was chosen over more complex processes because we required a well-controlled process with clear ground truth and tractable labels for influence bands.
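The sparse-matrix suggestion above can be sketched as follows. The example builds a supra-matrix in the style of the eigenvector-based temporal centrality of [52], with ε-weighted layer matrices on the diagonal blocks and identity coupling between consecutive time slices, and extracts its leading eigenvector with SciPy's ARPACK wrapper `scipy.sparse.linalg.eigs`. The exact block structure, the helper names, and the toy ring networks are assumptions for illustration and may differ from the construction used in our methods.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

def supra_matrix(layers, eps):
    """Sparse supra-matrix: eps * layer matrices on the diagonal blocks,
    identity coupling between consecutive time slices (assumed structure)."""
    T, n = len(layers), layers[0].shape[0]
    blocks = sp.block_diag([eps * sp.csr_matrix(A) for A in layers], format="csr")
    chain = sp.diags([np.ones(T - 1), np.ones(T - 1)], offsets=[-1, 1], shape=(T, T))
    coupling = sp.kron(chain, sp.identity(n), format="csr")
    return (blocks + coupling).tocsr()

def joint_centrality(layers, eps):
    """Leading eigenvector of the supra-matrix, reshaped to (T, n)."""
    T, n = len(layers), layers[0].shape[0]
    M = supra_matrix(layers, eps).astype(float)
    # ARPACK computes only the requested eigenpair and never densifies M
    _, vecs = eigs(M, k=1, which="LR", v0=np.ones(M.shape[0]))
    v = np.real(vecs[:, 0])
    v = v / v.sum()  # fixes the sign and normalises the entries to sum to 1
    return v.reshape(T, n)

def example_layers(n=10, T=3):
    """Hypothetical toy data: a ring in every slice plus one moving chord."""
    layers = []
    for t in range(T):
        A = np.zeros((n, n))
        for i in range(n):
            A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
        A[0, t + 2] = A[t + 2, 0] = 1.0
        layers.append(A)
    return layers

toy_centrality = joint_centrality(example_layers(), eps=0.5)
```

For a network with n nodes and T time slices, the supra-matrix has (nT)² entries, but only the edges and the 2n(T − 1) coupling terms are stored in the sparse format, which is what keeps the eigenvector computation tractable as T grows.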

Supporting information

S1 Appendix. Randomisation of networks with communities.

https://doi.org/10.1371/journal.pone.0337753.s001

(PDF)

References

  1. Ausat AMA. The role of social media in shaping public opinion and its influence on economic decisions. JTechnology;JSociety Perspective. 2023;1(1):35–44.
  2. Swastiningsih S, Aziz A, Dharta Y. The role of social media in shaping public opinion: a comparative analysis of traditional vs. digital media platforms. The Journal of Academic Science. 2024;1(6):620–6.
  3. Amedie J. The impact of social media on society. Pop Culture Intersections. 2015.
  4. Allcott H, Gentzkow M. Social Media and Fake News in the 2016 Election. Journal of Economic Perspectives. 2017;31(2):211–36.
  5. Siddiqui S, Singh T. Social media its impact with positive and negative aspects. IJCATR. 2016;5(2):71–5.
  6. Clark SE, Bledsoe MC, Harrison CJ. The role of social media in promoting vaccine hesitancy. Curr Opin Pediatr. 2022;34(2):156–62. pmid:35232950
  7. Johnson NF, Velásquez N, Restrepo NJ, Leahy R, Gabriel N, El Oud S, et al. The online competition between pro- and anti-vaccination views. Nature. 2020;582(7811):230–3. pmid:32499650
  8. Beguerisse-Díaz M, McLennan AK, Garduño-Hernández G, Barahona M, Ulijaszek SJ. The “who” and “what” of #diabetes on Twitter. Digit Health. 2017;3:2055207616688841. pmid:29942579
  9. Persily N, Tucker JA. Social Media and Democracy: The State of the Field, Prospects for Reform. Cambridge University Press; 2020.
  10. Aral S, Muchnik L, Sundararajan A. Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proc Natl Acad Sci U S A. 2009;106(51):21544–9. pmid:20007780
  11. Newman MEJ. The structure and function of complex networks. SIAM Rev. 2003;45(2):167–256.
  12. Rizi AK, Michielan R, Stegehuis C, Kivela M. Homophily Within and Across Groups. In: 2024. https://doi.org/arXiv:241207901
  13. Pena CB, MacCarron P, O’Sullivan DJP. Finding polarized communities and tracking information diffusion on Twitter: a network approach on the Irish Abortion Referendum. R Soc Open Sci. 2025;12(1):240454. pmid:39816737
  14. O’Sullivan DJP, Garduño-Hernández G, Gleeson JP, Beguerisse-Díaz M. Integrating sentiment and social structure to determine preference alignments: the Irish Marriage Referendum. R Soc Open Sci. 2017;4(7):170154. pmid:28791141
  15. Kearney MW. Analyzing change in network polarization. New Media & Society. 2019;21(6):1380–402.
  16. Chen THY, Salloum A, Gronow A, Ylä-Anttila T, Kivelä M. Polarization of climate politics results from partisan sorting: evidence from Finnish Twittersphere. Global Environmental Change. 2021;71:102348.
  17. Smith LGE, Thomas EF, Bliuc A-M, McGarty C. Polarization is the psychological foundation of collective engagement. Commun Psychol. 2024;2(1):41. pmid:39242857
  18. Flamino J, Galeazzi A, Feldman S, Macy MW, Cross B, Zhou Z, et al. Political polarization of news media and influencers on Twitter in the 2016 and 2020 US presidential elections. Nat Hum Behav. 2023;7(6):904–16. pmid:36914806
  19. Darwish K. Quantifying polarization on twitter: the kavanaugh nomination. In: Social Informatics: 11th International Conference, SocInfo 2019, Doha, Qatar, November 18–21, 2019, Proceedings 11. 2019. p. 188–201.
  20. Ferraz de Arruda H, Maciel Cardoso F, Ferraz de Arruda G, R. Hernández A, da Fontoura Costa L, Moreno Y. Modelling how social network algorithms can influence opinion polarization. Information Sciences. 2022;588:265–78.
  21. Soares FB, Recuero R, Zago G. Influencers in polarized political networks on Twitter. In: Proceedings of the 9th International Conference on Social Media and Society. 2018. p. 168–77. https://doi.org/10.1145/3217804.3217909
  22. Loy N, Raviola M, Tosin A. Opinion polarization in social networks. Philos Trans A Math Phys Eng Sci. 2022;380(2224):20210158. pmid:35400191
  23. Baumann F, Lorenz-Spreen P, Sokolov IM, Starnini M. Modeling echo chambers and polarization dynamics in social networks. Phys Rev Lett. 2020;124(4):048301. pmid:32058741
  24. Cinelli M, De Francisci Morales G, Galeazzi A, Quattrociocchi W, Starnini M. The echo chamber effect on social media. Proc Natl Acad Sci U S A. 2021;118(9):e2023301118. pmid:33622786
  25. Jiang J, Ren X, Ferrara E. Social media polarization and echo chambers in the context of COVID-19: case study. JMIRx Med. 2021;2(3):e29570. pmid:34459833
  26. Newman M. Networks. Oxford University Press; 2018.
  27. Saxena A, Iyengar S. Centrality measures in complex networks: a survey. arXiv preprint 2020. https://arxiv.org/abs/2011.07190
  28. Shelke S, Attar V. Source detection of rumor in social network – a review. Online Social Networks and Media. 2019;9:30–42.
  29. Dang A, Smit M, Moh’d A, Minghim R, Milios E. Toward understanding how users respond to rumours in social media. In: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2016. p. 777–84. https://doi.org/10.1109/asonam.2016.7752326
  30. Azzimonti M, Fernandes M. Social media networks, fake news, and polarization. European Journal of Political Economy. 2023;76:102256.
  31. Pitts FR. A graph theoretic approach to historical geography. The Professional Geographer. 1965;17(5):15–20.
  32. Cadini F, Zio E, Petrescu CA. Using centrality measures to rank the importance of the components of a complex network infrastructure. In: International Workshop on Critical Information Infrastructures Security. 2008. p. 155–67.
  33. Oliva G, Esposito Amideo A, Starita S, Setola R, Scaparra MP. Aggregating centrality rankings: a novel approach to detect critical infrastructure vulnerabilities. In: International Conference on Critical Information Infrastructures Security. 2019. p. 57–68.
  34. Kaur M, Singh S. Analyzing negative ties in social networks: a survey. Egyptian Informatics Journal. 2016;17(1):21–43.
  35. Latora V, Marchiori M. Vulnerability and protection of infrastructure networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2005;71(1 Pt 2):015103. pmid:15697641
  36. Bizzi L. At the origin of network centrality: how to design jobs to make employees central. Group & Organization Management. 2022;49(1):40–73.
  37. Beauchamp MA. An improved index of centrality. Behav Sci. 1965;10:161–3. pmid:14284290
  38. Peng S, Zhou Y, Cao L, Yu S, Niu J, Jia W. Influence analysis in social networks: a survey. Journal of Network and Computer Applications. 2018;106:17–32.
  39. Zhang J, Luo Y. Degree centrality, betweenness centrality and closeness centrality in social network. In: Proceedings of the 2017 2nd International Conference on Modelling, Simulation and Applied Mathematics (MSAM2017). 2017. https://doi.org/10.2991/msam-17.2017.68
  40. Sun PG, Miao Q, Staab S. Community-based k-shell decomposition for identifying influential spreaders. Pattern Recognition. 2021;120:108130.
  41. Landherr A, Friedl B, Heidemann J. A critical review of centrality measures in social networks. Wirtschaftsinformatik. 2010;52:367–82.
  42. Bavelas A. Communication patterns in task-oriented groups. The Journal of the Acoustical Society of America. 1950;22(6):725–30.
  43. Koschutzki D, Lehmann KA, Peeters L, Richter S, Tenfelde-Podehl D, Zlotowski O. Centrality indices. Network analysis: methodological foundations. 2005. p. 16–61.
  44. Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977;40(1):35.
  45. Seeley JR. The net of reciprocal influence; a problem in treating sociometric data. Canadian Journal of Psychology/Revue canadienne de psychologie. 1949;3(4):234–40.
  46. Bonacich P. Factoring and weighting approaches to status scores and clique identification. The Journal of Mathematical Sociology. 1972;2(1):113–20.
  47. Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18(1):39–43.
  48. Brin S, Page L. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems. 1998;30(1–7):107–17.
  49. Ghalmane Z, Cherifi C, Cherifi H, Hassouni ME. Centrality in complex networks with overlapping community structure. Sci Rep. 2019;9(1):10133. pmid:31300702
  50. Rajeh S, Savonnet M, Leclercq E, Cherifi H. Characterizing the interactions between classical and community-aware centrality measures in complex networks. Sci Rep. 2021;11(1):10088. pmid:33980922
  51. Kim H, Anderson R. Temporal node centrality in complex networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2012;85(2 Pt 2):026107. pmid:22463279
  52. Taylor D, Myers SA, Clauset A, Porter MA, Mucha PJ. Eigenvector-based centrality measures for temporal networks. Multiscale Model Simul. 2017;15(1):537–74. pmid:29046619
  53. Grindrod P, Parsons MC, Higham DJ, Estrada E. Communicability across evolving networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2011;83(4 Pt 2):046120. pmid:21599253
  54. Tsalouchidou I, Baeza-Yates R, Bonchi F, Liao K, Sellis T. Temporal betweenness centrality in dynamic graphs. Int J Data Sci Anal. 2019;9(3):257–72.
  55. Tang J, Musolesi M, Mascolo C, Latora V, Nicosia V. Analysing information flows and key mediators through temporal centrality metrics. In: Proceedings of the 3rd Workshop on Social Network Systems. 2010. p. 1–6. https://doi.org/10.1145/1852658.1852661
  56. Alsayed A, Higham DJ. Betweenness in time dependent networks. Chaos, Solitons & Fractals. 2015;72:35–48.
  57. Holme P. Temporal network structures controlling disease spreading. Phys Rev E. 2016;94(2–1):022305. pmid:27627315
  58. Goel S, Anderson A, Hofman J, Watts DJ. The structural virality of online diffusion. Management Science. 2016;62(1):180–96.
  59. O’Brien JD, Gleeson JP, O’Sullivan DJP. Identification of skill in an online game: the case of fantasy premier league. PLoS One. 2021;16(3):e0246698. pmid:33657110
  60. Zhan J, Gurung S, Parsa SPK. Identification of top-K nodes in large networks using Katz centrality. J Big Data. 2017;4(1):1–19.
  61. Kempe D, Kleinberg J, Tardos É. Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. 2003. p. 137–46. https://doi.org/10.1145/956750.956769
  62. Chen D, Lü L, Shang M-S, Zhang Y-C, Zhou T. Identifying influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications. 2012;391(4):1777–87.
  63. Zhao X, Liu F, Xing S, Wang Q. Identifying influential spreaders in social networks via normalized local structure attributes. IEEE Access. 2018;6:66095–104.
  64. Hajarathaiah K, Enduri MK, Anamalamudi S, Sangi AR. Algorithms for finding influential people with mixed centrality in social networks. Arab J Sci Eng. 2023;48(8):10417–28.
  65. Fink CG, Fullin K, Gutierrez G, Omodt N, Zinnecker S, Sprint G, et al. A centrality measure for quantifying spread on weighted, directed networks. Physica A: Statistical Mechanics and its Applications. 2023;626:129083.
  66. Nandi S, Malta MC, Maji G, Dutta A. IS-PEW: identifying influential spreaders using potential edge weight in complex networks. In: International Conference on Complex Networks and Their Applications. 2023.
  67. Lü L, Chen D, Ren X-L, Zhang Q-M, Zhang Y-C, Zhou T. Vital nodes identification in complex networks. Physics Reports. 2016;650:1–63.
  68. Haldar A, Wang S, Demirci GV, Oakley J, Ferhatosmanoglu H. Temporal cascade model for analyzing spread in evolving networks. ACM Trans Spatial Algorithms Syst. 2023;9(2):1–30.
  69. Xu KS, Hero III AO. Dynamic stochastic blockmodels: Statistical models for time-evolving networks. In: International conference on social computing, behavioral-cultural modeling, and prediction. 2013. p. 201–10.
  70. Newman ME, Strogatz SH, Watts DJ. Random graphs with arbitrary degree distributions and their applications. Phys Rev E Stat Nonlin Soft Matter Phys. 2001;64(2 Pt 2):026118. pmid:11497662
  71. James G, Witten D, Hastie T, Tibshirani R, Taylor J. An introduction to statistical learning: with applications in python. Springer Nature. 2023.
  72. Pena BC, O’Sullivan DJB, MacCarron P, Saxena A. BandNet and RT8 data; 2025. https://github.com/caroline-pena/BandNet_data
  73. Sharkey KJ. Localization of eigenvector centrality in networks with a cut vertex. Phys Rev E. 2019;99(1–1):012315. pmid:30780242
  74. Ditsworth M, Ruths J. Community detection via Katz and eigenvector centrality. arXiv preprint 2019.
  75. Wu J. Power laws in social networks. Social Network Computing. Springer; 2024. p. 257–84.
  76. Barabási AL, Pósfai M. Network science. Cambridge: Cambridge University Press; 2016.
  77. Gormley IC, Murphy TB, Raftery AE. Model-based clustering. Annu Rev Stat Appl. 2023;10(1):573–95.