Abstract
Local attacks in networked systems can often propagate and trigger cascading failures. Designing effective healing mechanisms to counter cascading failures is critical to enhance system resiliency. This work proposes a self-healing algorithm for networks undergoing load-based cascading failure. To advance understanding of the dynamics of networks with concurrent cascading failure and self-healing, a general discrete-time simulation framework is developed, and the resiliency is evaluated using two metrics, i.e., the system impact and the recovery time. This work further explores the effects of the multiple model parameters on the resiliency metrics. It is found that two parameters (reactivated node load parameter and node healing certainty level) span a phase plane for network dynamics where three regimes exist. To ensure full network recovery, the two parameters need to be moderate. This work lays the foundation for subsequent studies on optimization of model parameters to maximize resiliency, which will have implications for many real-world scenarios.
Citation: Al-Aqqad W, Hayajneh HS, Zhang X (2022) Dynamics and resiliency of networks with concurrent cascading failure and self-healing. PLoS ONE 17(11): e0277490. https://doi.org/10.1371/journal.pone.0277490
Editor: Sathishkumar V E, Hanyang University, KOREA, REPUBLIC OF
Received: July 15, 2022; Accepted: October 27, 2022; Published: November 15, 2022
Copyright: © 2022 Al-Aqqad et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data underlying the results presented in the study are available from: https://figshare.com/articles/software/Final_m/21453858.
Funding: This work received support from the U.S. Department of Commerce, Economic Development Administration (https://eda.gov/) under Award #08-69-05349 of which X.Z. is the principal investigator. There was no additional funding received for this study.
Competing interests: The authors have declared that no competing interests exist.
Introduction
A wide range of real-world systems such as power grids [1], financial transaction networks [2], communication networks (e.g., the Internet) [3], and command and control systems [4] have been modeled as complex networks. Among other characteristics, the resiliency of networked systems has received growing research attention from diverse application areas including economic systems [5], organizational management [6], and multiple engineering systems [7, 8]. Generally speaking, resiliency can be viewed as the ability of a system to bounce back from high-impact disruptions to achieve partial or full recovery [9]. To improve the resiliency of a system, it is necessary to advance the knowledge of the effects of system properties, external disruptions, and recovery mechanisms on resiliency, which calls for extensive modeling and simulation studies and would be of fundamental interests to the planning, design, operation, and control of systems from critical infrastructure and supply chains to disaster recovery and humanitarian aids [10].
Review of related works
To characterize system resiliency, the modeling and simulation efforts need to consider two aspects: (1) following an initial (usually local) attack, the failure of nodes or links propagates over the network, which is called cascading failure; (2) at certain point the system’s self-healing mechanism is activated, which would counter the impacts of cascading failure. In addition, it is also necessary to develop consistent, quantitative resiliency metrics to facilitate the comparison of system performances.
Previous research on cascading failure in networks (here we focus on single-layer networks) fall into two main categories: connectivity-based [11, 12] and load-based [13, 14]. The first cascading failure model was developed in [11] to describe the propagation of binary decisions in a population of interacting decision-makers (nodes). Each node observes the states (0 or 1) of the nodes connected to it (neighbors). Its state to be in state 1 (active) or state 0 (inactive) is determined from whether the fraction of its neighbors being in state 1 is higher or lower than a pre-specified threshold. It was found that large cascades can be triggered due to the inactivation of highly connected nodes. In [12], it was shown that community structures are crucial in connectivity-based cascading failures such as information diffusion and virus spreading. In [13], the nodes in a network were associated with a physical quantity called “load” that can be transferred between neighbors, and the effects of inactivating some nodes with their loads transferred to the neighboring nodes were investigated. It was assumed that the initial load of a node is the total number of shortest paths passing through the node. If the load of a node exceeds its capacity (which is, by definition, proportional to its initial load), the node will become inactive, and its load will be transferred to its neighbors. It was shown in [14] that the networks with more heterogeneous distribution of loads are likely to be more vulnerable to cascades of overload failures.
These two types of cascading failure (connectivity-based and load-based) can be used to model many systems under disruptions. However, the majority of prior studies have focused on network robustness (i.e., how to mitigate the risk of global failure), without considering active defense or self-healing (recovery) processes that could be initiated after some damages have been made to original system. Self-healing in complex networks has raised substantial research interests in the past decade. Representative studies on network self-healing can be found in [15–18]. For instance, a defense strategy against cascading failure due to overload was proposed in [15], which was based on selective removal strategy of nodes/links immediately after the initial attack. It was shown that the removal of nodes with low loads can result in reduced size of cascades. Two self-healing models were introduced in [16, 17], where the former decides for each node, after damage, whether to create a new link depending on the fraction of neighbors it has lost, while the implementation of the latter relies on the presence of dormant backup links that can be switched back on. However, these studies developed solutions to repair or restore system “instantaneously” and did not treat network recovery as a dynamic process [18]. As such, self-healing in load-based failure scenarios has not been modeled [19]. Further, there have been few studies on a networked system with concurrent cascading failure and healing [20].
There have been some studies evaluating the resiliency of complex networks. In [7], some metrics were designed to quantify the resiliency of networked infrastructure systems during earthquakes and hurricanes. An agent-based modeling approach was demonstrated in [9] to assess the performance of a complex system after disruptions using metrics such as systemic impact and the time to reach a full restoration. A method called resilience triangle was introduced in [21] to quantitatively assess three aspects of supply chain network resiliency: complexity, density, and node-criticality. Another concept called expected disruption cost was proposed in [22] to quantify resiliency and enable its inclusion in optimization models.
Objective and overview of this work
This work aims to consider networked systems with concurrent load-based cascading failure and self-healing and investigate the dependence of resiliency on various system parameters. The dynamic healing model for overload failures is newly developed. The effects of important healing parameters such as the triggering level and the budget parameter are explored. Two metrics, i.e., the 95% recovery time (T95) and the number of active nodes at time step 20 (A20), are used to measure resiliency. The networks under study in this paper include an Erdös-Rényi (ER) random network [23] and a scale-free (SF) network [24].
The major contribution of this work is to develop a dynamic modeling and simulation framework to quantitatively assess the resiliency of networked systems. To the best of our knowledge, this is the first time when cascading failure, self-healing, and resiliency are considered together as integral parts of a dynamic networked system. This dynamic system modeling framework also enables the examination of the effects of the model parameters on resiliency. This work lays the foundation for subsequent studies on more complex mechanisms and processes on the networks, optimization of resiliency, as well as applications to more real-world scenarios.
The organization of the remainder of this paper is as follows. Sec. 2 describes the load-based cascading failure and self-healing models implemented in this work, as well as the resiliency metrics. Sec. 3 presents the ER and SF networks under study, the system dynamics (i.e., recovery trajectories) under various combinations of parameters, and the corresponding resiliency metrics. The conclusions of the paper are drawn in Sec. 4.
Model and methods
Dynamic processes on networks
In this work, the systems under study are networks (or graphs). The initial network is denoted as G, while the network at each time step following the initial attack is denoted as G_dmg. Table 1 lists the most important notations used in the system model. The system model consists of four main modules, each of which is described below.
(i) Initialization (t = 0). This module has two parts. Firstly, since the processes on the network are load based, it is necessary to specify the initial load Li,0 and capacity (maximum load) Ci of each node i = 1, 2, …, N. In this work, the Li,0's are sampled from a uniform distribution over [Lmin, Lmax]. The capacity of each node is assumed to be proportional to its initial load, i.e., Ci = (1 + a)Li,0, where a is called the tolerance factor (fixed at 0.1 throughout this work). The Ci's, once set, remain constant in the simulation. Secondly, in the original network G, a set (denoted R) of nodes is selected as the targets of the initial attack (each with an additional load shock D). In this work, the nodes in R are randomly selected from all nodes in G.
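The initialization module can be sketched as follows (a minimal Python sketch; all function and variable names are illustrative, not taken from the authors' published code):

```python
import random

def initialize(N, L_min, L_max, a=0.1, n_attacked=8, shock=1.0, seed=0):
    """Module (i) sketch: sample initial loads, set fixed capacities
    C_i = (1 + a) * L_{i,0}, and apply the load shock D to randomly
    chosen attack targets R."""
    rng = random.Random(seed)
    loads = [rng.uniform(L_min, L_max) for _ in range(N)]
    caps = [(1 + a) * L for L in loads]          # capacities stay constant afterward
    targets = rng.sample(range(N), n_attacked)   # attack set R, chosen at random
    for i in targets:
        loads[i] += shock                        # additional load shock D
    return loads, caps, targets
```

With these parameter values, every attacked node ends up above its capacity, seeding the cascade at t = 1.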
(ii) Cascading (t = 1, 2, …, tmax). At each time step t, consider the network G_dmg from the previous time step t − 1. For i = 1, …, N, if Li,t−1 > Ci, then node i fails and becomes inactive (in our work a load of negative infinity is assigned to inactive nodes), and the load Li,t−1 is transferred uniformly to the active neighbors of node i. The transfer process starts with the identification of the set of neighboring nodes of node i in G, denoted by bi, and then filters out the elements in bi with a negative-infinity load to obtain the set of active neighboring nodes of node i in G_dmg, denoted as Active_bi. The load of each node in Active_bi is increased by Li,t−1/length(Active_bi). The resulting network is returned as the updated G_dmg.
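A single cascading update might be sketched as below (assuming an adjacency-list representation and the paper's convention of load −∞ for inactive nodes; treating simultaneously failing nodes as inactive receivers is our assumption):

```python
from math import inf

def cascade_step(loads, caps, adj):
    """Module (ii) sketch: nodes whose previous-step load exceeds capacity
    fail; their load is split uniformly among their currently active
    neighbors. `adj` is the adjacency list of the original network G."""
    failed = [i for i, (L, C) in enumerate(zip(loads, caps))
              if L != -inf and L > C]
    new_loads = loads[:]
    for i in failed:
        new_loads[i] = -inf                # inactive nodes carry load -infinity
    for i in failed:
        active_nb = [j for j in adj[i] if new_loads[j] != -inf]  # Active_b_i
        if active_nb:
            share = loads[i] / len(active_nb)
            for j in active_nb:
                new_loads[j] += share      # uniform load transfer
    return new_loads
```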
(iii) Healing (t = ttrig, …, tmax). If the number of inactive (failed) nodes has not reached a pre-specified threshold T, self-healing is not initiated. Once the number of inactive nodes exceeds T at time step ttrig, the self-healing module runs at every subsequent time step. In the model, it is possible to implement x repetitions of module (ii) and y repetitions of module (iii) at each time step to model the different "speeds" of the two processes; however, in this work, we set both x and y to 1. The healing process has two steps, as follows.
Step 1—Decision. By selecting some inactive nodes to recover (reactivate) at each time step, we aim to maximally mitigate the cascading failure and restore the original network as quickly as possible. The highest number of nodes that can be recovered at each time step is called the budget parameter of healing, B. Our model always attempts to recover the highest possible number of inactive nodes; when the number of inactive nodes is greater than B, we need to rank these inactive nodes by their "importance". In this work, the importance is approximately evaluated as the average capacity usage of the inactive node's active neighbors. For each inactive node j in G_dmg, one can identify Active_bj and calculate the mean of the ratios of current load to capacity over all nodes in Active_bj. The result is denoted by LC_Ratioj; the inactive nodes with higher LC_Ratioj values have higher priorities to be chosen. It makes sense to immediately restore the inactive nodes whose active neighbors have the highest average capacity usage and are thus the most vulnerable to failure in the next time step. These highest-impact inactive nodes (limited by the set size and B) will be the input of Step 2 below.
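The ranking in Step 1 can be sketched as follows (a minimal sketch under the same −∞ convention; names are illustrative):

```python
from math import inf

def rank_inactive(loads, caps, adj, budget_count):
    """Step 1 sketch: score each inactive node j by LC_Ratio_j, the mean
    load/capacity ratio of its active neighbors, and keep at most
    `budget_count` of the highest-scoring nodes for reactivation."""
    scores = {}
    for j, L in enumerate(loads):
        if L != -inf:
            continue                      # only inactive nodes are candidates
        active_nb = [k for k in adj[j] if loads[k] != -inf]
        if active_nb:
            scores[j] = sum(loads[k] / caps[k] for k in active_nb) / len(active_nb)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:budget_count]
```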
Step 2—Implementation. This step is the reactivation of the inactive nodes identified above. For each such inactive node j, we transfer some of the load of each of its active neighbors to node j (providing relief to the active neighbors). In our model, node j receives a total load equal to a portion P of the mean of the loads of its active neighbors; the load of node j is updated from negative infinity to this transferred amount, and the load of each active neighbor is reduced by P/length(Active_bj) of its old value. Furthermore, another parameter is introduced, namely, the certainty level of node healing α ∈ [0, 1]. This parameter captures the success probability of the implementation of healing for any node selected in Step 1. The output of Step 2 is the updated G_dmg.
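Step 2 might then look as follows (a sketch under our assumptions: each neighbor gives up a fraction P/|Active_bj| of its load, so node j receives P times the mean neighbor load in total, and each attempt succeeds with probability α):

```python
import random
from math import inf

def reactivate(loads, adj, chosen, P, alpha, rng=random.Random(0)):
    """Step 2 sketch: each selected inactive node j is reactivated with
    probability alpha; its load is set to the total amount drawn from its
    active neighbors, each of which keeps a fraction 1 - P/n of its load."""
    new_loads = loads[:]
    for j in chosen:
        active_nb = [k for k in adj[j] if new_loads[k] != -inf]
        if not active_nb or rng.random() > alpha:
            continue                       # healing attempt fails or no donors
        n = len(active_nb)
        transferred = sum(new_loads[k] * (P / n) for k in active_nb)
        for k in active_nb:
            new_loads[k] *= 1 - P / n      # relief of the active neighbors
        new_loads[j] = transferred         # node j returns with this load
    return new_loads
```

Note that the total transferred load, sum of (P/n)·L_k, equals P times the mean of the neighbor loads, matching the description above.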
(iv) Resiliency evaluation. The algorithm outlined above can generate system trajectories (number of inactive nodes vs time), based on which one can evaluate resiliency. In this work, we examine two metrics: At, the number of active nodes (normalized by N) at time step t, and Tβ, the number of time steps needed to reach β% recovery (i.e., the network reaches a steady state with less than (100 − β)% inactive nodes). Higher values of At (less severe disruption) and lower values of Tβ (faster recovery) correspond to more resilient systems.
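The two metrics can be computed from a trajectory of inactive-node counts, for example as below (a sketch; for simplicity it takes the first step at which the count drops below the threshold, rather than checking the steady-state condition used in the paper):

```python
def resiliency_metrics(inactive_counts, N, t=20, beta=95):
    """Sketch of module (iv): A_t is the normalized number of active nodes
    at step t; T_beta is the first step at which fewer than (100 - beta)%
    of nodes are inactive (None if never reached)."""
    A_t = (N - inactive_counts[t]) / N
    threshold = (100 - beta) / 100 * N
    T_beta = None
    for step, n_inactive in enumerate(inactive_counts):
        if n_inactive < threshold:
            T_beta = step
            break
    return A_t, T_beta
```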
Research design
In this work, we consider two different setups of the initial network G: (1) a computer generated ER network and (2) a computer generated SF network. Both networks have 5000 nodes (N = 5000) connected by 10000 links. For the ER network, we randomly generate the adjacency matrix and pick one with all nodes connected. For the SF network, we start from 5 interconnected nodes (seed) and add new nodes. The number of links a new node can make to the existing nodes is 2. This repeats until the total node number reaches N.
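The SF construction described above follows the growth model of [24]; a minimal sketch is given below (the preferential-attachment rule and all names are our assumptions, and a library generator such as NetworkX's could be used instead):

```python
import random

def scale_free_network(N, seed_size=5, m=2, rng=random.Random(1)):
    """Barabási-Albert-style growth sketch: start from `seed_size` fully
    interconnected seed nodes and attach each new node to m distinct
    existing nodes, chosen with probability proportional to degree."""
    adj = [set(range(seed_size)) - {i} for i in range(seed_size)]
    # degree-weighted pool: each node appears once per incident link end
    stubs = [i for i in range(seed_size) for _ in adj[i]]
    for new in range(seed_size, N):
        adj.append(set())
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(stubs))  # preferential attachment
        for tgt in chosen:
            adj[new].add(tgt)
            adj[tgt].add(new)
            stubs.extend([new, tgt])
    return adj
```

With seed_size = 5 and m = 2, growing to N = 5000 yields C(5,2) + 2(N − 5) = 10000 links, matching the networks studied here.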
The four model parameters to be investigated are: T, B, P, and α, all between 0 and 1. For various combinations of these parameters, we will compare the resulting system trajectories and resiliency metrics to explore the effects of each parameter.
Results and discussion
System trajectories
Fig 1 presents the system trajectories for two different network topologies (ER and SF). The initial attacks are targeted at 8 nodes that are randomly chosen in the simulation. As expected, the self-healing mechanism works, since the number of inactive nodes in all cases eventually falls to zero (or very close to zero). In both networks, under the current parameter settings, a higher certainty level α leads to faster recovery and smaller damage, which agrees with the intuition that higher certainty of node healing means higher recovery efficiency. Additionally, the SF network appears to be more resistant to random attacks, as the peak number of inactive nodes is lower than that of the ER network. However, the recovery in the SF network takes longer than that in the ER network. This is a consequence of our healing mechanism. When the damage is more severe at the time of healing initiation (and the budget parameter is high enough), at subsequent time steps there will be more inactive nodes of high importance to be selected for reactivation, which in general results in a faster system recovery.
Both networks have 5000 nodes and 10000 links. T = B = P = 0.2, and α takes three values: 0.2 (black curve), 0.5 (red curve), and 0.8 (green curve). Each data point is the average of results of 20 iterations of simulation (each with the same initial attack targets). The error bars indicate standard errors.
Higher certainty levels α could produce counter-intuitive outcomes, especially as P is increased. Fig 2 shows trajectories of the two networks in Fig 1 with α = 0.8, P = 0.5, B = 0.2, and T = 0.8, as well as snapshots of the damaged networks at different time steps. In both cases, the system cannot reach a full recovery. The self-healing process works well initially, bringing the system beyond 90% recovery (i.e., fewer than 10% of nodes inactive). After that, the cascading failure regains momentum, and the number of inactive nodes grows more rapidly in the ER network than in the SF network, which is consistent with the observed trend of cascading failure in Fig 1 (note that it is out of the scope of this work to examine whether this holds for all possible pairs of ER and SF networks). The "re-ignition" of cascading failure is seeded by reactivated nodes whose transferred loads exceed their capacities (defined at t = 0 as (1 + a) times the initial loads and kept constant) when P is high. If the capacity were updated as (1 + a) times the new load, the phenomenon seen in Fig 2 would disappear. We retain the original model in this work, given that in many real-world cases, restored system components often follow their previous specifications.
Both networks have 5000 nodes and 10000 edges. The snapshots indicate the distributions of the inactive nodes (red) and the active nodes (blue) at respective time steps.
The results in Figs 1 and 2 together imply that lowering the certainty level α could offset some negative impacts of increased P, by reducing the occurrences of the reactivated, overloaded nodes. This can be seen more clearly in Fig 3, where different combinations of P and α correspond to different behaviors of the ER network. In S1, both P and α are too small to generate healing strong enough to counter the cascading failure. S2 is a region with either or both of P and α being high, where the long-term behavior of the system is either partially recovered at a stable level or oscillating around a certain level. The zigzagged boundary between S2 and the fully recovered regime does exhibit a trend of decreasing α with increasing P.
There are three regimes: S1 (red)—healing unable to stop cascading failure; Fully Recovered (green); and S2 (brown)—healing unable to achieve 100% (or very close to 100%) recovery.
Resiliency metrics
We now move on to measure the resiliency of the networks and investigate the effects of the four model parameters. Here we consider two metrics: (a) A20, the number of active nodes (normalized by N) at time step t = 20; and (b) T95, the time needed to reach 95% recovery (i.e., <5% inactive nodes). In Figs 4 and 5, we show the results of A20 for the ER and SF networks with different B, T, P, and α values. As shown in Fig 4, when P = 0.2, for both networks, a higher T (the same B) always corresponds to a lower A20, which means that the later the self-healing mechanism kicks in, the more severe the damage observed at t = 20, and the less resilient the system. The difference between the two networks is the effect of α. In the ER network, when α is greater than 0.6, the A20 values of all cases approach 1, while in the SF network, the convergence is not as obvious. Comparing the results with different B values under the same T, one can see that as B increases beyond certain level, no significant increase in A20 will be made. For instance, in Fig 4(a), the results with B = 0.5 and B = 0.8 (the same T) are very close to each other. The apparently larger deviations seen in Fig 4(b) are most likely a result of the randomness in the simulations.
The possible values of B and T are 0.2, 0.5, and 0.8. The values of α are from 0 to 0.9 at step size 0.1.
In Fig 5(a), when P = 0.8 and α < 0.4, the dependence of A20 of the ER network on T and B is the same as in Fig 4(a). Under higher α values, because of the re-ignition of cascading failure shown in Fig 2(a), the expected effect of T on A20 (the same B) is no longer seen. However, a common feature of all curves is that A20 reaches its maximum at α ≈ 0.4. The results in Fig 5(b) for the SF network are more chaotic. The dependence of A20 on B (the same T) remains the same as in Fig 4(a). The budget "saturation" effect comes from the healing algorithm and is universal in all simulation cases.
The possible values of B and T are 0.2, 0.5, and 0.8. The values of α are from 0 to 0.9 at step size 0.1.
Fig 6 demonstrates the effects of P and α on the resiliency metric A20. In both networks, with P increasing from 0.2 to 0.8, at all α values, A20 will decrease, which means that the systems become less resilient. For smaller P (e.g., 0.2), A20 tends to increase with increasing α, while for larger P, A20 is more likely to peak at an intermediate α.
The possible values of P are 0.2, 0.5, and 0.8. The values of α are from 0 to 0.9 at step size 0.1.
The use of At as resiliency metric has its limitation because it contains only the information of a snapshot of system dynamics. Moreover, the assessment of system resiliency based on At might not yield consistent conclusions simply because of the selection of different observation times. In view of this, we measure the resiliency using another metric called recovery time Tβ (β = 95 in this work). This metric, compared to the prior one, can fully capture the overall system dynamics (it is also worth noting that while Tβ is a more comprehensive and consistent metric for resiliency planning, At is mostly used for real-time decision-making).
In Fig 7, the T95 results of the two networks under various combinations of parameters are presented. The intuition is that, as the node healing certainty level α increases, T95 decreases (i.e., faster recovery and better resiliency). However, this is only true when the system is in the Fully Recovered regime (Fig 3), i.e., neither P nor α can be very high, which is supported by the results in Fig 7. One can observe a converging trend in the T95 values with increasing budget parameter B (other parameters the same), which is the same budget saturation effect noted above. The effects of the triggering level T appear to be entangled with the other parameters and cannot be easily separated.
The possible values of B and T are 0.2, 0.5, and 0.8. The values of α are from 0.2 to 0.9 at step size 0.1. In some cases with higher α, the system can never reach 95% recovery and therefore no data points are shown in the figures.
Fig 8 shows the dependence of T95 on P and α in the two networks with B = T = 0.2. In both systems, the common counterintuitive result is that increasing P from 0.2 to 0.5 and then to 0.8 does not reduce the recovery time; in fact, it does the opposite. This trend of system resiliency in Fig 8 is consistent with that in Fig 6, showing the potential compatibility of the two metrics.
Conclusions
In this work, we propose a self-healing mechanism for networks undergoing load-based cascading failure. We develop a simulation framework to study the resiliency of networked systems with concurrent cascading failure and self-healing. The two resiliency metrics used are the time-t active node portion (At) and the time for β% recovery (Tβ). The network resiliency has the following dependencies on the model parameters.
- Budget parameter B. If it is too small, the healing process is unable to counter cascading failure. When B is high enough, further increasing it brings no additional improvement in resiliency (budget saturation effect).
- Reactivated node load parameter P and node healing certainty level α. As illustrated in Fig 3, if either is very small, the network cannot achieve effective healing; when either is high, the system might enter a regime where no full recovery can be made. To ensure that full recovery occurs, our model requires the specification of moderate P and α. It is also possible to find the combinations of P and α that maximize resiliency.
- Triggering level T. In cases with moderate P, α, and sufficiently high B, lowering T (sooner healing kick-in) could increase A20 (but not necessarily reduce T95). In general, the effect of this parameter is highly entangled with other parameters.
In addition, this work provides preliminary results showing the difference between the ER and SF networks in terms of system trajectories and resiliency metrics. We also see the promise of making the two resiliency metrics consistent by specifying appropriate t and β.
This work lays the foundation for subsequent studies on more complex mechanisms and processes on the networks, optimization of parameters to maximize resiliency, and applications to more real-world scenarios.
References
- 1. Schäfer B, Witthaut D, Timme M, Latora V. Dynamically Induced Cascading Failures in Power Grids. Nat Commun. 2018 May;9:1975. pmid:29773793
- 2. Li Y, Duan D, Hu G, Lu Z. Discovering Hidden Group in Financial Transaction Network Using Hidden Markov Model and Genetic Algorithm. Proc. 6th Int. Conf. FSKD. 2009 Aug.
- 3. Sergiou C, Lestas M, Antoniou P, Liaskos C, Pitsillides A. Complex Systems: A Communication Networks Perspective Towards 6G. IEEE Access. 2020 May;8:89007-89030.
- 4. Wang Y, Chen B, Chen X, Gao X. Cascading Failure Model for Command and Control Networks Based on an m-Order Adjacency Matrix. Mob Inf Syst. 2018 Dec;e6404136.
- 5. Hynes W, Trump BD, Kirman A, Haldane A, Linkov I. Systemic Resilience in Economics. Nat Phys. 2022 Apr;18(4):Art. no. 4.
- 6. Yuan R, Luo J, Liu MJ, Yu J. Understanding Organizational Resilience in a Platform-Based Sharing Business: The Role of Absorptive Capacity. J Bus Res. 2022 Mar;141:85–99.
- 7. Reed DA, Kapur KC, Christie RD. Methodology for Assessing the Resilience of Networked Infrastructure. IEEE Syst J. 2009 Jun;3(2):174–180.
- 8. Das L, Munikoti S, Natarajan B, Srinivasan B. Measuring Smart Grid Resilience: Methods, Challenges and Opportunities. Renew Sust Energ Rev. 2020 Sep;130:109918.
- 9. Pumpuni-Lenss G, Blackburn T, Garstenauer A. Resilience in Complex Systems: An Agent-Based Approach. Syst Eng. 2017 Mar;20(2):158–172.
- 10. Omer M. The Resilience of Networked Infrastructure Systems: Analysis and Measurement. Singapore: World Scientific; 2013.
- 11. Watts DJ. A Simple Model of Global Cascades on Random Networks. PNAS. 2002 Apr;99(9):5766–5771. pmid:16578874
- 12. Stegehuis C, van der Hofstad R, van Leeuwaarden JSH. Epidemic Spreading on Complex Networks with Community Structures. Sci Rep. 2016 Jul;6(1):29748. pmid:27440176
- 13. Motter AE, Lai YC. Cascade-Based Attacks on Complex Networks. Phys Rev E. 2002 Dec;66:065102. pmid:12513335
- 14. Lv D, Eslami A, Cui S. Load-Dependent Cascading Failures in Finite-Size Erdös-Rényi Random Networks. IEEE Trans Netw Sci Eng. 2017 Apr;4(2):129–139.
- 15. Motter AE. Cascade Control and Defense in Complex Networks. Phys Rev Lett. 2004 Aug;93:098701. pmid:15447153
- 16. Gallos LK, Fefferman NH. Simple and Efficient Self-Healing Strategy for Damaged Complex Networks. Phys Rev E. 2015 Nov;92:052806. pmid:26651743
- 17. Quattrociocchi W, Caldarelli G, Scala A. Self-Healing Networks: Redundancy and Structure. PLoS ONE. 2014 Feb;9(2):e87986. pmid:24533065
- 18. Wang T, Zhang J, Sun X, Wandelt S. Network Repair Based on Community Structure. EPL. 2017 Jun;118(6):68005.
- 19. Al-Aqqad W, Zhang X. Modeling Command and Control Systems in Wildfire Management: Characterization of and Design for Resiliency. Proc 2021 IEEE Int Conf HST. 2021 Nov.
- 20. Liu C, Li D, Fu B, Yang S, Wang Y, Lu G. Modeling of Self-Healing against Cascading Overload Failures in Complex Networks. EPL. 2014 Sep;107(6):68003.
- 21. Mari SI, Lee YH, Memon MS. Sustainable and Resilient Supply Chain Network Design under Disruption Risks. Sustainability. 2014 Oct;6(10):Art. no. 10.
- 22. Tierney K, Bruneau M. Conceptualizing and Measuring resilience: A Key to Disaster Loss Reduction. TR News. 2007 May;250:14–17.
- 23. Erdös P, Rényi A. On Random Graphs I. Math. Debrecen. 1959;6:290–297.
- 24. Barabási AL, Albert R. Emergence of Scaling in Random Networks. Science. 1999 Oct;286(5439):509–512. pmid:10521342