Connecting mass-action models and network models for infectious diseases

Thien-Minh Le; Jukka-Pekka Onnela

doi:10.1371/journal.pcbi.1013373

Abstract

Infectious disease modeling is used to forecast epidemics and assess the effectiveness of intervention strategies. Although the core assumption of mass-action models of homogeneously mixed population is often implausible, they are nevertheless routinely used in studying epidemics and provide useful insights. Network models can account for the heterogeneous mixing of populations, which is especially important for studying sexually transmitted diseases. Despite the abundance of research on mass-action and network models, the relationship between them is not well understood. Here, we attempt to bridge the gap by first identifying a spreading rule that results in an exact match between disease spreading on a fully connected network and the classic mass-action models. We then propose a method for mapping epidemic spread on arbitrary networks to a form similar to that of mass-action models. We also provide a theoretical justification for the procedure. Finally, we demonstrate the application of the proposed method in the theoretical analysis of reproduction numbers and the estimation of model parameters using synthetic data based on an empirical network. The method proves advantageous in explicitly handling both finite and infinite networks, significantly reducing the computation time required to estimate model parameters for spreading processes on networks. These findings help us understand when mass-action models and network models are expected to provide similar results and identify reasons when they do not.

Author summary

The study of infectious diseases employs two primary modeling approaches: mass-action models and network models. Mass-action models assume a fully connected graph for interactions, while network models typically feature more complex, less connected graphs. The relationship between these two modeling approaches is unclear. To address this knowledge gap, we propose a method that allows us to match the mass-action and network models when the graph is fully connected. We also introduce a mapping, which makes use of the structure of the network, to transform epidemic spread on arbitrary networks into models that are analogous to mass-action models. Simulation results indicate that using the proposed mapping method reduces computational time by a factor of approximately 30 with little loss in accuracy. The proposed approach has been validated theoretically and through simulations, showing its potential to enhance approximate Bayesian parameter estimation. Overall, these findings offer valuable insights into disease dynamics and suggest new ways to integrate traditional mass-action results into network analyses.

Citation: Le T-M, Onnela J-P (2025) Connecting mass-action models and network models for infectious diseases. PLoS Comput Biol 21(8): e1013373. https://doi.org/10.1371/journal.pcbi.1013373

Editor: Claudio José Struchiner, Fundação Getúlio Vargas, BRAZIL

Received: October 26, 2024; Accepted: July 27, 2025; Published: August 18, 2025

Copyright: © 2025 Le, Onnela. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The empirical network data are publicly available at https://figshare.com/articles/dataset/The_Copenhagen_Networks_Study_interaction_data/7267433/1 and the Python code used in this study is publicly available at https://github.com/onnela-lab/connecting-ma-network. All other relevant data are in the manuscript and its supporting information files.

Funding: NIH award AI138901 provided salary support for both authors (TML and JPO). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Understanding disease spread is crucial for providing accurate predictions of disease outbreaks and for gaining greater insight into prevention strategies. Infectious disease modeling has served as a potent tool for this endeavor for centuries, with the earliest work dating back to the work of Bernoulli in 1760 [1]. Compartmental mass-action models are the most common type of modeling approach, and they are frequently used to study different influenza strains. This modeling assumes that all individuals are well-mixed [2]. Its advantage is that it is simple to use and has a well-established theoretical foundation for different disease properties, while still producing accurate predictions for multiple types of diseases, particularly influenza [3,4]. Sexually transmitted diseases such as monkeypox, human immunodeficiency virus (HIV), and human papillomavirus (HPV) are challenging for mass-action models as infected individuals transmit diseases to their neighbors only through a sexual network. Since population structure is naturally represented as a network, there have been numerous investigations into the spread of disease on networks in the last two decades [5–10].

Although network epidemiology has received considerable attention from the research community, most of the work has focused on deriving solutions for spreading processes on static or dynamic networks or on understanding the effects of network topology characteristics on spreading process outcomes [5–7,11]. Surprisingly, there are few studies on the connection between network models and mass-action models. Understanding the connection between the two model families is critical because it will allow researchers to see the effect of network topology on the spreading process and could open up new avenues for making use of well-established results from classic models. Levin and Durrett (1996) highlighted the similarity between the SIR network-based model and the SIR mass-action model, with the primary distinction lying in the interaction between the susceptible and infectious compartments [12]. Other researchers extended the network-based model by developing approaches to study spreading processes on networks within a more generalized framework, including directed, semi-directed, and message-passing dynamics. These studies also confirmed that the results of the classic mass-action model emerge as a special case of the network-based model in the limit of large networks [13–17]. The first attempt to use properties of a network-based model to fit a classic model was the work of Keeling [18]. Keeling (2005) proposed a modified mass-action model to fit the network model’s predictions. The transmission rate of the modified mass-action model was defined as a function of certain network characteristics (the average degree and the ratio of triangles to triples). Kenah (2010) introduced a contact interval approach and used simulations to illustrate the limitations of naively applying the mass-action model when estimating the reproduction number generated from a network-based model [19]. Malloy et al. (2021) used different simulation settings to investigate the influence of the mass-action model and the network model on the effectiveness of prevention strategies [20]. Recently, Rempala demonstrated that the Poisson SIR network model approximates the classic SIR model as the mean degree of the network increases [21]. Despite these efforts, the existing studies mostly focus on the large-sample behavior of networks, and the explicit relation between the two models has not been spelled out clearly.

The purpose of this work is to bridge the gap in the literature by relating the two models. The primary distinction between mass-action models and network-based models lies in their graph structures, as the implicit contact graphs of classic mass-action models are always fully connected, whereas the graphs of network-based models are usually not. In order to connect these two models, we will first study the behavior of epidemic spread on fully connected graphs. There are several ways to define the spreading process on networks, including the Gillespie, degree infectivity, and unit infectivity methods [7,22,23]. The classical Gillespie method is a continuous-time, event-driven stochastic simulation in which infections occur one at a time. The newly infected node is chosen from the susceptible nodes that have infected neighbors using weighted sampling, and nodes with more infected neighbors have higher infection rates [7]. A discrete-time version can be adapted from this formulation by applying the same rate-based infection logic (see Algorithm 1 in the S1 File). In the unit infectivity method, at each time step, each infected node randomly chooses a neighbor and then transmits the disease to the chosen neighbor with a fixed probability [23]. In the degree infectivity method, by contrast, at each time step each infected node transmits the disease to each of its neighbors with a fixed probability [22]. Degree infectivity is the most commonly used spreading method in networks and is similar to bond percolation on networks. More precisely, when infectiousness does not vary and the infectious period is fixed, the degree infectivity process reduces exactly to classical bond percolation on the network [24]. Under these conditions, a mass-action model on a fully mixed (Erdős–Rényi) network can also be formulated as bond percolation. The pseudocode in Algorithms 1, 2, and 3 in S1 File describes how these spreading methods work. 
In this work, we adopt a stochastic discrete-time modeling framework, in which infection counts remain integer-valued. While classical mass-action models are often formulated in continuous time and can produce non-integer expected infection counts, the discrete-time formulation serves as a natural foundation for bridging mass-action and network-based models. This choice is motivated by both practical and theoretical considerations: when the time step is sufficiently small, the discrete-time model closely approximates continuous-time dynamics, and most empirical epidemic data—such as case counts—are recorded at regular intervals (e.g., daily or weekly). These properties make the discrete-time framework particularly convenient for inference and model comparison. To connect the two models, the spreading process on fully connected graphs should yield identical results, regardless of how the spreading rule is defined. In the discrete-time framework, the unit infectivity method, which restricts each infected node to a single attempted transmission event per time step, and the discrete-time adaptation of the Gillespie method, which allows only one infection event per time step, will not yield the same result as the mass-action model on fully connected graphs. Therefore, we will only consider degree infectivity and refer to it as the conventional network spreading method. Under the conventional network spreading method, however, the number of infections on fully connected graphs is always less than the number of infections under the mass-action model [7]. We first propose a rule for network propagation that eliminates this discrepancy. Then, based on the proposed spreading rule, we present approaches to employ network topology to adapt the mass-action model to capture the spread on networks. We also provide theoretical justifications to support our method. 
Finally, using simulation and synthetic data, we show the merits of the proposed method in studying epidemics on networks.

The structure of the paper is as follows. In Section 2.1, we discuss the classic mass-action models and our proposed spreading process on networks. Sections 2.2 and 2.3 provide the approximation procedure for the proposed spreading process and offer theoretical justifications for it. Section 3 provides results demonstrating the merits of the proposed method in studying epidemics on networks. In particular, Section 3.1 demonstrates that the proposed method enables the calculation of reproduction numbers on networks in a manner similar to the mass-action model. It also provides theoretical results comparing the early dynamics of the epidemic between the network and mass-action models. Section 3.2 shows the merits of employing the proposed methods for analyzing epidemics on networks. We highlight the advantages of the proposed approach to significantly reduce computational time with minimal loss of accuracy when studying epidemics on networks. Additionally, Section 3.3 provides a quantitative answer to a fundamental question regarding the necessity of network-based models in epidemic studies: If network information were readily available, how valuable would it be compared to simply using the mass-action model? Finally, Section 4 discusses our contribution and possible directions for future research.

2. Materials and methods

2.1 Mass-action models and network models

2.1.1 Mass-action models.

Mass-action models are the most common model type used in infectious disease epidemiology due to their simplicity. They make the fundamental assumption of homogeneous mixing of all individuals, implying that the contact network is fully connected: any infectious individual can potentially transmit infection to any susceptible individual. This section examines the Susceptible-Infected (SI), Susceptible-Infected-Recovered (SIR), and Susceptible-Infected-Treated-AIDS-Death (SITAD) processes. The first two processes, SI and SIR, are frequently used for influenza. The SITAD model here is a simplified version of the model used in Hove-Musekwa et al. [25] to study human immunodeficiency virus infection and acquired immune deficiency syndrome (HIV/AIDS).

SI process: In the SI process, at a given time, the population is divided into two mutually exclusive compartments: susceptible and infected. Suppose $N$ is the size of the population. Let $S$ and $I$ denote the number of susceptible and infected individuals, respectively ($S + I = N$). Suppose that the model parameter is $\beta$, the transmission rate. Its dynamic states evolve as in Fig 1a.

Fig 1. Three different spreading processes: (a) the SI spreading process, (b) the SIR spreading process, and (c) the SITAD spreading process.

https://doi.org/10.1371/journal.pcbi.1013373.g001

Denote the state of the population at time $t$ by $X(t) = (S(t), I(t))$. Using the tau-leaping method of Gillespie [26], the state of the population at time $t + \tau$ evolves as $X(t + \tau) = X(t) + vP$. Here, $v = (-1, 1)$ is the transition vector, $h = \beta S(t) I(t)/N$ is the population-level hazard, and $P$ is Poisson distributed with mean $h\tau$. Choosing $\tau = 1$, which represents the change in population status after each time unit, the epidemic in the population evolves from $X(t)$ to $X(t+1)$ by the transformation $X(t+1) = X(t) + vP$. In particular, the evolving process can be described by the system of equations (1) below:

$$\begin{aligned} S(t+1) &= S(t) - P, \\ I(t+1) &= I(t) + P, \end{aligned} \tag{1}$$

where $P$ is Poisson distributed with mean $\beta S(t) I(t)/N$.
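The update in (1) can be illustrated with a minimal Python sketch. It assumes a unit time step and a hazard of $\beta S I / N$; the function names `poisson` and `simulate_si` are hypothetical, and the truncation of new infections at the number of remaining susceptibles is an added safeguard, not part of the model description.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method.

    Adequate for the moderate means used in this sketch.
    """
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_si(n, beta, i0=1, steps=50, seed=0):
    """Simulate the discrete-time SI mass-action process with unit time steps."""
    rng = random.Random(seed)
    s, i = n - i0, i0
    path = [i]
    for _ in range(steps):
        new = poisson(beta * s * i / n, rng)
        new = min(new, s)  # assumption: truncate so that S never goes negative
        s -= new
        i += new
        path.append(i)
    return path


if __name__ == "__main__":
    print(simulate_si(n=100, beta=0.7, steps=30, seed=1))
```

Averaging `simulate_si` over many seeds gives a curve that can be compared with network simulations on a fully connected graph.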

SIR process: The population for the SIR process at a given time is divided into three mutually exclusive compartments: susceptible, infected, and recovered. Suppose $N$ is the size of the population. Let $S$, $I$, and $R$ denote the number of susceptible, infected, and recovered individuals, respectively ($S + I + R = N$). Suppose that the model parameters are $\beta$, the transmission rate, and $\gamma$, the recovery rate. Its dynamic states evolve as in Fig 1b.

Denote the state of the population at time $t$ by $X(t) = (S(t), I(t), R(t))$. The tau-leaping method gives the state of the population at time $t + \tau$ as $X(t + \tau) = X(t) + v_1 P_1 + v_2 P_2$, where $v_1 = (-1, 1, 0)$ and $v_2 = (0, -1, 1)$ are the transition vectors, and $P_j$, for $j = 1, 2$, are random variables. Here the $P_j$ are Poisson distributed with means $h_j \tau$, where $h_1 = \beta S(t) I(t)/N$ and $h_2 = \gamma I(t)$ are population-level hazards. Choosing $\tau = 1$, which represents the change in population status after each time unit, the epidemic in the population evolves from $X(t)$ to $X(t+1)$ by the transformation $X(t+1) = X(t) + v_1 P_1 + v_2 P_2$. More specifically, the shift from time $t$ to time $t + 1$ of the SIR process can be described by the system of equations (2) below.

$$\begin{aligned} S(t+1) &= S(t) - P_1, \\ I(t+1) &= I(t) + P_1 - P_2, \\ R(t+1) &= R(t) + P_2, \end{aligned} \tag{2}$$

where $P_1$ and $P_2$ are Poisson distributed with means $\beta S(t) I(t)/N$ and $\gamma I(t)$, respectively.
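The SIR update can be sketched in the same way. This is an illustrative implementation only: it assumes unit time steps, an infection hazard of $\beta S I / N$, and a recovery hazard of $\gamma I$. The names `poisson` and `simulate_sir` are hypothetical, and the truncation of each draw at the size of its source compartment is an added safeguard.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's multiplication sampler for a Poisson variate (moderate means)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_sir(n, beta, gamma, i0=1, steps=50, seed=0):
    """Return the list of (S, I, R) states of a discrete-time SIR process."""
    rng = random.Random(seed)
    s, i, r = n - i0, i0, 0
    states = [(s, i, r)]
    for _ in range(steps):
        # Both draws use the state at the start of the step.
        infections = min(poisson(beta * s * i / n, rng), s)  # assumed truncation
        recoveries = min(poisson(gamma * i, rng), i)         # assumed truncation
        s -= infections
        i += infections - recoveries
        r += recoveries
        states.append((s, i, r))
    return states


if __name__ == "__main__":
    for state in simulate_sir(n=200, beta=0.5, gamma=0.1, steps=20, seed=3):
        print(state)
```

Because infections and recoveries are both drawn from the state at the start of the step, the population total is conserved at every step.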

SITAD process: We consider a simplified version of the HIV/AIDS model of Hove-Musekwa et al. [25]. In this model, at a given time, the population is divided into five mutually exclusive compartments: susceptible ($S$), HIV positive ($I$), AIDS ($A$), treated ($T$), and deceased ($D$), with $S + I + A + T + D = N$. Its dynamic state evolves as in Fig 1c, where the model parameters are the transmission rate of HIV, the transmission rate of AIDS, the treatment rate of HIV, the AIDS progression rate of HIV, the treatment rate of AIDS, and the death rate of AIDS.

Let the state of the population at time $t$ be $X(t) = (S(t), I(t), T(t), A(t), D(t))$. Using the tau-leaping method, the state of the population at time $t + \tau$ evolves as $X(t + \tau) = X(t) + \sum_{j=1}^{5} v_j P_j$, where $v_1, \dots, v_5$ are the transition vectors and the $P_j$ are Poisson distributed with means $h_j \tau$, for $j = 1, \dots, 5$. Here, the $h_j$ are population-level hazards. Choosing $\tau = 1$, which represents the change in population status after each time unit, the epidemic in the population evolves from $X(t)$ to $X(t+1)$ by the transformation $X(t+1) = X(t) + \sum_{j=1}^{5} v_j P_j$. In particular, the shift from time $t$ to $t + 1$ of the SITAD process is described by the system of equations (3) below.

(3)

where $P_j$, for $j = 1, \dots, 5$, are Poisson distributed with means $h_j$.

2.1.2 Network models and the proposed spreading process.

The graphs of the network-based model are usually not fully connected, so the spreading process depends on network topology. For simplicity, we consider a given fixed network with a known initial single infected node.

The spreading process on networks occurs between infected and susceptible individuals, where at each time step each infected node has a fixed probability of transmitting the disease to each of its susceptible neighbors. Note that the meaning of the transmission parameter of the spreading process on networks differs from that of the transmission parameter in the mass-action model [22].

In a discrete-time network spreading process, the transmission parameter represents the per-time-step probability that an infected node transmits infection to one of its susceptible neighbors (in continuous-time formulations, it is typically treated as a transmission rate, with the per-time-unit transmission probability approximated by the rate itself when the rate is small). In mass-action models, the transmission parameter $\beta$ represents the average number of effective contacts per unit time between infectious and susceptible individuals that result in new infections. Since there are $S$ susceptible individuals out of a total of $N$ individuals in the population, the probability of randomly selecting a susceptible individual from the entire population is $S/N$. Therefore, the expected number of new infections caused by each infected individual per unit time is $\beta S/N$. Consequently, on a graph with $N$ fully connected nodes, the transmission parameter $\beta$ of the mass-action model corresponds to the transmission rate $\beta/N$ in the network model. The same distinction was also pointed out between the contact interval distribution and the scaled contact interval distribution in [19].

To evaluate the connection between the network model and the traditional mass-action model, we first examine whether there is any discrepancy in the number of infections each model generates over time on a fully connected graph. Because the topologies of the populations under the two models are identical, we expect that the number of infections generated from each model will be the same. However, the spreading process on a fully connected network results in fewer infections than the mass-action model for finite networks, with the gap between the two models closing only as the network size $N \to \infty$ [7]. The gap is intuitive, as new infections are formed differently under each model. Under the mass-action model, at each time step, the model generates the total number of new infections directly. Due to the homogeneous mixing assumption, any susceptible individual can be infected, regardless of their location. In mass-action models, each time step can be understood as a two-stage process: first, the total number of new infections is determined at the global (network) level, and then these infections are randomly assigned to susceptible nodes at the local level.

On the other hand, under the conventional spreading rule of the network-based model, at each time step, each infected node will spread the disease to its susceptible neighbors with a given probability. Since multiple infected neighbors may attempt to infect the same susceptible node during each time step, the total number of new infections at each time step can only be determined after the entire spreading process from all infected nodes is completed.

So, under the network-based model, at each time step, the transmission flow goes from the generation of infections from each infected individual (“local level”) to the total new infections (“global level”). Therefore, the network model may produce a lower number of infections when, at the local level, one susceptible node may be infected by two or more of its infected neighbors.

This discrepancy becomes more problematic as the probability that at least one susceptible node will be infected by more than one infectious neighbor at time $t$ increases, causing infected nodes in a sense to “compete” for susceptible nodes. This phenomenon was also discussed in the work of Kenah and co-authors [27]. As long as this competition persists, the gap between the two models persists. Thus, to bridge the mass-action models and network models, we need a new spreading rule that closes this gap when the graph is fully connected.
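The size of this gap on a fully connected graph can be illustrated with a short calculation. The sketch below assumes that the conventional rule uses a per-edge transmission probability of $\beta/N$, so that a susceptible node with $I$ infected neighbors escapes infection with probability $(1 - \beta/N)^I$. The function name `expected_new_infections` is hypothetical.

```python
def expected_new_infections(n, i, beta):
    """Expected new infections in one step on a complete graph of n nodes.

    Returns (conventional, mass_action), assuming a per-edge transmission
    probability of beta / n under the conventional network rule.
    """
    s = n - i
    p_edge = beta / n
    # A susceptible node is infected unless all i infected neighbours fail.
    conventional = s * (1.0 - (1.0 - p_edge) ** i)
    # Mass-action expectation: beta * S * I / N.
    mass_action = beta * s * i / n
    return conventional, mass_action


if __name__ == "__main__":
    for n in (100, 1000):
        for i in (1, n // 10, n // 2):
            conv, ma = expected_new_infections(n, i, beta=0.7)
            print(f"N={n:5d} I={i:4d} conventional={conv:8.3f} mass-action={ma:8.3f}")
```

With a single infected node the two expectations coincide. Once several infected nodes share the same susceptible neighbors, the conventional expectation falls below the mass-action value, which follows from the inequality $1 - (1 - p)^I \le pI$.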

The proposed spreading rule adopts the transmission parameter $\beta$ as defined in the mass-action model. Specifically, for each infected node, the transmission rate $\beta$ represents the average number of contacts with randomly selected individuals that would result in new infections per unit time. For an infected node $i$ with degree $d_i$, its local “bubble” consists of $d_i + 1$ individuals, including node $i$ itself and its $d_i$ neighbors. The probability that a randomly selected individual from this bubble is susceptible is $s_i/(d_i + 1)$, where $s_i$ denotes the number of susceptible individuals in the bubble. The use of $d_i + 1$ in the denominator ensures a natural connection between mass-action and network-based models, particularly in finite-size networks. To illustrate, one may imagine that each infected node produces a fixed number of pathogens per unit time, which is then shared uniformly among the members of its local bubble, including the node itself. Consequently, the expected number of new transmissions caused by an infected node $i$ is given by $\beta s_i/(d_i + 1)$.

Under the proposed spreading rule, at each time step, the total transmission rate is computed first, and the number of new infections is then generated based on the total transmission rate. New infections are assigned to at-risk nodes (susceptible neighbors of infected nodes) using a weighted random sample, where the weight of each at-risk node is proportional to its number of infected neighbors. More specifically, let $\mathcal{I}(t)$ be the set of infected nodes at time $t$, $s_i(t)$ be the number of susceptible neighbors of node $i$ at time $t$, and $n_r(t)$ be the number of at-risk nodes at time $t$. The total transmission rate at time $t$ is calculated as $\lambda(t) = \beta \sum_{i \in \mathcal{I}(t)} s_i(t)/(d_i + 1)$, where $d_i$ is the degree of node $i$. The number of new infections is generated from $\text{Binomial}(n_r(t), \lambda(t)/n_r(t))$. This Binomial distribution approximates a Poisson distribution with mean $\lambda(t)$ when $n_r(t)$ is large. The new infections are then randomly allocated among the at-risk nodes based on their weights (see Algorithm 4 in S1 File for more details on the proposed SI spreading rule).

Under the proposed spreading process, when the network is fully connected, susceptible nodes and at-risk nodes are the same, i.e., the number of at-risk nodes equals $S(t)$. Therefore, at time $t$, the total transmission rate is $\beta S(t) I(t)/N$, and new infections are generated from $\text{Binomial}(S(t), \beta I(t)/N)$, given that $\beta I(t)/N \le 1$. Our proposed procedure allows for a good match with the mass-action model as long as $\text{Binomial}(S(t), \beta I(t)/N)$ is a good approximation of $\text{Poisson}(\beta S(t) I(t)/N)$. It should be noted that the proposed SI spreading rule will yield an exact match with the SI mass-action model in (1) for any network size if one chooses to model the number of new infections using a Binomial distribution instead of the Poisson distribution.
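One time step of the proposed rule can be sketched as follows. This is not the authors' Algorithm 4 but a minimal illustration under stated assumptions: the network is given as an adjacency dictionary, each infected node contributes $\beta$ times its number of susceptible neighbors divided by its degree plus one, the Binomial success probability is capped at 1, and the names `proposed_si_step` and `complete_graph` are hypothetical.

```python
import random


def proposed_si_step(adj, infected, beta, rng):
    """Return the set of nodes newly infected in one step of the proposed rule."""
    weights = {}      # at-risk node -> number of infected neighbours
    total_rate = 0.0  # global transmission rate for this step
    for i in infected:
        susceptible = [j for j in adj[i] if j not in infected]
        total_rate += beta * len(susceptible) / (len(adj[i]) + 1)
        for j in susceptible:
            weights[j] = weights.get(j, 0) + 1
    n_risk = len(weights)
    if n_risk == 0:
        return set()
    # Global stage: the number of new infections is Binomial(n_risk, rate / n_risk).
    p = min(1.0, total_rate / n_risk)  # assumption: cap the probability at 1
    n_new = sum(rng.random() < p for _ in range(n_risk))
    # Local stage: weighted sampling of at-risk nodes without replacement.
    newly_infected = set()
    for _ in range(n_new):
        nodes = list(weights)
        pick = rng.choices(nodes, weights=[weights[v] for v in nodes])[0]
        newly_infected.add(pick)
        del weights[pick]
    return newly_infected


def complete_graph(n):
    """Adjacency dictionary of a fully connected graph on nodes 0..n-1."""
    return {v: [u for u in range(n) if u != v] for v in range(n)}


if __name__ == "__main__":
    rng = random.Random(1)
    adj = complete_graph(100)
    infected = {0}
    counts = [len(infected)]
    for _ in range(30):
        infected |= proposed_si_step(adj, infected, beta=0.7, rng=rng)
        counts.append(len(infected))
    print(counts)
```

On a complete graph the total rate computed inside the function reduces to $\beta S I / N$, so the counts can be compared directly with a mass-action simulation run with the same $\beta$.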

Fig 2 displays the average proportion of infections over time for an SI process under the proposed spreading rule, the conventional network spreading rule, and the mass-action model on fully connected graphs. Here we consider four cases: graphs of 100 nodes and 1000 nodes with the transmission parameter $\beta = 0.12$ (top left and top right), and graphs of 100 nodes and 1000 nodes with the transmission parameter $\beta = 0.7$ (bottom left and bottom right). The average is taken over 200 stochastic realizations. As expected, when the transmission parameter is small, the different spreading rules are hard to distinguish. When the transmission parameter is large, the proposed spreading rule still closely matches the mass-action model, whereas the conventional spreading rule on the network underestimates the number of infections relative to the mass-action model. Although the discrepancy between spreading rules vanishes for large networks as $N \to \infty$, the proposed spreading rule alleviates it in all cases, especially for finite networks. Therefore, the proposed spreading rule is well suited for bridging the mass-action models and network-based models.

Fig 2. Comparison of the SI process under the proposed network spreading rule (Proposed), the conventional network spreading rule (Conventional), and the mass-action model (Mass-action) on fully connected graphs, with the transmission parameter $\beta = 0.12$ with 100 nodes (top left) and 1000 nodes (top right), and $\beta = 0.7$ with 100 nodes (bottom left) and 1000 nodes (bottom right).

https://doi.org/10.1371/journal.pcbi.1013373.g002

2.2 Approximations to the proposed spreading process

The most straightforward strategy for studying epidemic spread on networks is to use simulation. Although this method provides many insights into various disease-spreading processes in a variety of settings, it does not give a thorough theoretical understanding of the spreading process. Much effort has been dedicated to investigating the solution of the spreading process under the conventional network spreading rule. However, most approaches provide approximate solutions that hold only for large networks [15,28–31]. Recently, efforts applying dynamical survival analysis to network epidemics have helped to encompass both mass-action models and network-based models for the configuration model [32–35]. Although these approaches provide great insights into the spreading process on networks, they still provide limited insight into how network models and mass-action models are related, especially for finite-size networks.

In the following, we present another approach for a better understanding of the relationship between network models and mass-action models. The main idea is to set up a system of equations that are analogous to those of the mass-action model while also taking into account the topology of the network. For example, for the SI process, we aim to replace with , where is a function that contains information about network topology.

2.2.1 The modified SI process.

Consider an SI process with transmission rate $\beta$, starting with one infected node and spreading the disease over the entire network. If the order of infections is known, we introduce a transmission matrix $T$ based on network topology. Element $(k, i)$ of the transmission matrix, $t_{ki}$, represents the transmission rate caused by network topology of node $i$ when the network has $k$ infected nodes. Here $t_{ki} = s_i^{(k)}/(d_i + 1)$ for each infected node $i$ (and $t_{ki} = 0$ for nodes not yet infected), where $s_i^{(k)}$ is the number of susceptible neighbors of node $i$ when there are $k$ infected nodes in the network, and $d_i$ is the degree of node $i$. The summation of row $k$ in the transmission matrix, $\sum_i t_{ki}$, gives the overall spreading rate caused by network topology when there are $k$ infected nodes in the network. Thus, the overall transmission rate when the network has $k$ infected nodes is $\beta \sum_i t_{ki}$. Since reordering the columns of the transmission matrix will not change the overall transmission rate (the row sum), for simplicity, we rearrange the columns of the transmission matrix in the order of infections and use this rearranged matrix as our transmission matrix $T$. Fig 3a shows the transmission matrix corresponding to the infection order (1,3,2,4) for the incomplete network with 4 nodes. Here the transmission matrix results from reordering the columns of the original transmission matrix from (1,2,3,4) to (1,3,2,4).

Fig 3. The connection in transmission rate between network model and mass-action model.

(a) Transmission order on a network of four nodes and its corresponding transmission matrix. (b) Transmission order on a complete network of four nodes and its corresponding transmission matrix. (c) Connection of the network transmission matrix and the transmission matrix corresponding to the mass-action model.

https://doi.org/10.1371/journal.pcbi.1013373.g003

Similarly, Fig 3b shows the transmission matrix corresponding to the infection order (1,3,2,4) for the complete network with 4 nodes, again obtained by reordering the columns of the original transmission matrix from (1,2,3,4) to (1,3,2,4).

Let us denote . The behavior of the spreading process on the network using the transmission matrix after each time unit then can be described by

(4)

where , and is Poisson distributed with mean = .

Compared to the SI mass-action model, it is apparent that the SI process on networks is controlled by network topology through the transmission matrix , where the mass-action transmission rate is replaced by . Therefore, the transmission matrix encompasses all the network information related to the spreading process. Instead of directly simulating the spread of the disease on the network, using the transmission matrix allows us to study the spreading process more quantitatively. The relationship between a network model and the mass-action model is depicted in Fig 3c. In contrast to the transmission matrix of the conventional mass-action model, which is always , the transmission matrix of the network model will vary according to the network topology and takes the form . Under our procedure, the actual transmission matrix is approximated by , where non-zero elements in each row represent the average infection rate at that time.
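The construction of a transmission matrix from a known infection order can be sketched in Python. The sketch rests on assumptions that are inferred from the text rather than confirmed by it: the entry for an infected node is its number of susceptible neighbors divided by its degree plus one, entries for uninfected nodes are zero, columns follow the infection order, and the matrix has one row per number of infected nodes. The function name `transmission_matrix` is hypothetical.

```python
from fractions import Fraction


def transmission_matrix(adj, order):
    """Build the transmission matrix for a given infection order.

    Row k (1-based) describes the network when the first k nodes of `order`
    are infected; columns are arranged in the order of infection.
    """
    rows = []
    for k in range(1, len(order) + 1):
        infected = set(order[:k])
        row = []
        for node in order:
            if node in infected:
                s = sum(1 for nb in adj[node] if nb not in infected)
                row.append(Fraction(s, len(adj[node]) + 1))
            else:
                row.append(Fraction(0))
        rows.append(row)
    return rows


if __name__ == "__main__":
    nodes = (1, 2, 3, 4)
    complete4 = {v: [u for u in nodes if u != v] for v in nodes}
    T = transmission_matrix(complete4, [1, 3, 2, 4])
    for k, row in enumerate(T, start=1):
        print(k, [str(x) for x in row], "row sum =", sum(row))
```

For the complete network on four nodes the row sums are 3/4, 1, 3/4, and 0, which equal $k(N - k)/N$ for $k = 1, \dots, 4$, that is, the mass-action factor $S I / N$ evaluated at $I = k$.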

Let denote the sequence of the number of infected nodes based on the transmission matrix at time . Denote , . Therefore, the average number of infected nodes from time 1 to in the network is . Similarly, the average number of susceptible nodes from time 1 to in the network is .

Lemma 2.1 below tells us that the proposed network spreading process and the modified network spreading process have the same average realization.

Lemma 2.1 When the order of infections is known, the spreading process based on equation (4) has the same average realization as the proposed SI process on the network.

If the order of infections is unknown, we can obtain the infection order by random sampling. In this approach, the infection order is built incrementally: each newly infected node is sampled from the at-risk nodes according to their risk weights (see Algorithm 7 in S1 File for details). Then, using the same procedure as before, we can construct the corresponding transmission matrix for each sample and calculate the number of infections. Finally, the average number of infections is obtained by averaging the number of infections corresponding to the sampled infection order sequences. This sampling scheme has the following rationale. Consider a network with $N$ nodes and a known first infected node; there are at most $(N-1)!$ possible infection order sequences. Assuming that we obtained $m$ infection order sequences using the sampling approach, let $T_j$ denote the transmission matrix corresponding to infection order sequence $j$. The average realization of the number of infections can then be approximated by the average realization of the number of infections from the transmission matrices $T_1, \dots, T_m$. Therefore, as the sample size $m$ grows, the average number of infections based on the transmission matrices will converge to the average realization of the spreading process on the network.
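The sampling of an infection order can be sketched as below. This is an illustration of the weighted sampling described in the text, not a reproduction of Algorithm 7 in S1 File: it assumes an adjacency-dictionary network and selects each next node from the at-risk nodes with probability proportional to its number of infected neighbors. The function name `sample_infection_order` is hypothetical.

```python
import random


def sample_infection_order(adj, first, rng):
    """Sample one infection order sequence starting from node `first`."""
    order = [first]
    infected = {first}
    while len(order) < len(adj):
        # Weight of each at-risk node = number of its infected neighbours.
        weights = {}
        for i in order:
            for j in adj[i]:
                if j not in infected:
                    weights[j] = weights.get(j, 0) + 1
        if not weights:
            break  # remaining nodes are unreachable from the infected set
        nodes = list(weights)
        nxt = rng.choices(nodes, weights=[weights[v] for v in nodes])[0]
        order.append(nxt)
        infected.add(nxt)
    return order


if __name__ == "__main__":
    # Cycle network on six nodes; draw a few infection order sequences.
    cycle6 = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
    rng = random.Random(0)
    for _ in range(3):
        print(sample_infection_order(cycle6, first=0, rng=rng))
```

Each sampled order can then be turned into a transmission matrix, and quantities of interest averaged over the sampled sequences.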

Remark 1: Although the sampling approach used to obtain the m-infection order sequence is similar to directly simulating epidemics on a network, the transmission matrix derived from this sampling procedure provides deeper insights into how the network topology influences the transmission rate. For certain networks, such as complete networks, k-star networks, and cycle networks, the transmission matrix remains unchanged. This allows us to examine the impact of network topology on the spreading process through the transmission matrix, an insight that would be difficult to uncover through direct simulation alone. This represents a key advantage of the proposed method.

Remark 2: There is a close similarity between the method of obtaining the infection order sequence in the proposed approach (Algorithm 7 in S1 File) and the use of epidemic percolation networks (EPNs) to simulate the spreading process in networks, as described in Kenah and Miller (2011) [36]. EPNs provide a flexible and general framework for modeling stochastic epidemic processes, including settings with heterogeneous infectiousness and susceptibility, as well as hybrid models that combine network-based and mass-action transmission mechanisms. While the transmission sequence in EPNs is typically generated through a simple random procedure, the proposed approach uses a weighted sampling method where each at-risk node’s probability of infection is proportional to its number of infectious neighbors. Although the two approaches may be equivalent under certain assumptions, they differ in formulation and intended use: EPNs are often employed to study final epidemic size, whereas the proposed method is intended to establish connections between network-based and mass-action models. Exploring formal connections between these frameworks is a promising direction for future work.

2.2.2 The modified SIR process.

Similarly to the SI process, we first consider the infection order of all nodes when it is known, and when it is unknown, we use the same sampling procedure as above. For the SIR process, due to the effect of network topology, the recovery of an infected node in one location may result in a different number of at-risk nodes (susceptible neighbors of infected nodes) compared to when an infected node recovers in a different location. Therefore, in the SIR process, the same number of recovered nodes may result in a different number of at-risk nodes. If recovery occurs at random, the Binomial approximation cannot be used. Therefore, the exact match in terms of the number of infections will only be feasible if the recovery order is determined by the length of time a node was infected. As random recovery is a common assumption, we will focus on this case. As the number of at-risk nodes is unattainable, we can generate the number of newly infected nodes at each time step using the Poisson distribution. For a given infection order sequence, we can extract the transmission matrix corresponding to the case in which there is no recovery during transmission. Denote . The modified SIR spreading process based on the transmission matrix can now be updated as

(5)

where ; and are Poisson distributed with means and , respectively. Note that the Poisson approximation is accurate if the network is dense or if the transmission rate is small compared to the network density.

Let and denote the sequence of the number of infected nodes and recovered nodes based on the transmission matrix at time , respectively. Denote , , , . Then the average realization of the number of nodes in each state from time 1 to on the network is where is one of the states .

The following lemma shows that the two spreading processes have the same average realization.

Lemma 2.2 If the infection order sequence is known, the spreading process based on equation (5) has the same average realization as the SIR process on the network.

2.2.3 The modified SITAD process.

The SITAD process on networks starts from an initially infected node and then spreads to cause new HIV infections with rate . Among those infected with HIV, some progress to AIDS and some get treated. Individuals with AIDS spread the disease and cause new HIV infections with rate . Among individuals with AIDS, some will get treated and some will die. The risk weight of each at-risk node in the SITAD spreading process is determined by . Similarly to the SIR process, we consider the case where AIDS progression, treatment, and death happen at random, and the infection order is known. Let be the transmission matrix corresponding to the infection order sequence.

Denote . Here is the total number of people with HIV. The modified SITAD spreading process based on the transmission matrix can now be updated as

(6)

where , , . , for , are Poisson distributed with means . Here, and .

As for the SIR process, we define the average realization for the SITAD process based on the transmission matrix at time as , , , . The average realization of the number of nodes in each state from time 1 to K in the network is where is one of the states .

The following lemma tells us that the average realizations based on the two approaches are the same.

Lemma 2.3 If the order of infections is known, the spreading process based on equation (6) has the same average realization as the SITAD process on the network.

2.3 Approximations of the spreading processes using the average transmission matrix

As shown in the preceding sections, when the infection sequence is known, the modified SI, SIR, and SITAD processes generate the same average number of infections on networks. Since the order of infections is often unknown, we can determine the order of infections using random sampling. We proved that when the infection order is known, the average realization of the number of infections based on the corresponding transmission matrix equals the average realization based on the network. As a result, the average number of infections based on transmission matrices for will converge to the average realization of the network spreading process as the sample size grows.

For a given infection sequence with the transmission matrix , if the transmission rate is , represents the matrix of average spreading rates corresponding to the infection sequence. Therefore, the average spreading rate corresponding to different realizations for can be approximated by , where . In other words, the average number of infections based on the modified process utilizing the average transmission matrix can be used to approximate the average number of infections generated by the network spreading process. We refer to this approximation approach as the average transmission matrix model, or ATMM.

3. Results

In this section, we present theoretical results, simulation results, and a quantitative assessment of how useful network information is for studying epidemics when such information is accessible. Section 3.1 provides theoretical results on using the proposed method to study the reproduction number of the spreading process on networks. Section 3.2.1 shows that the modified spreading process agrees with the proposed network spreading process in terms of the average number of infections over time. In Section 3.2.2, we use synthetic network data to show how the modified spreading process outperforms the proposed spreading process in terms of computation. Finally, Section 3.3 demonstrates that, once network information is accessible, using the network model to analyze network epidemic data surpasses the mass-action model in terms of both computational efficiency and goodness of fit.

The synthetic data used in Sections 3.2.2 and 3.3 below were generated using the proposed discrete-time SIR process of Section 2.2.2 on an empirical network dataset. The empirical network data is an aggregate of network data obtained from the Copenhagen Network Study (CNS), which was made publicly accessible in 2019 [37]. The network data comprises the connectivity patterns of 706 students at the Technical University of Denmark during a 28-day period in February 2014. The connectivity patterns were identified through the use of Bluetooth, as participants consented to use loaner phones provided by the study as their main phone throughout the study. The received signal strength indicator (RSSI), which can serve as an approximation of physical distance, was collected every five minutes. Following Hambridge et al. [38], we assigned a connection between two persons if there was at least one RSSI signal large enough during the period, i.e., RSSI . For analysis purposes, we simply kept the largest component, which contained 673 nodes and 57,712 edges, as a fixed network. Based on this fixed empirical network, we synthesized epidemic data as follows. We first generated model parameters from prior distributions and then simulated network epidemic data using the generated parameters. If the synthesized data realization was good enough, meaning there was enough data to estimate model parameters, we kept it and retained the model parameters. Since there were only 673 nodes in the observed network, we specified that a good realization had to have cumulatively at least 50% of nodes infected and 10% of nodes recovered. Once these constraints were met, the synthesized data was treated as observed network epidemic data, with the corresponding parameters serving as the underlying truth to evaluate the accuracy of estimation.

3.1 Theoretical results on the early behavior of the proposed SIR spreading process on networks

In this section, we present an advantage of using the proposed method for studying epidemics on networks related to reproduction numbers. For simplicity, we consider the proposed SIR process on networks. Under this process, we will show that the proposed method has the merit of providing a straightforward derivation of the basic reproduction number and the reproduction number at the early stage . Finally, we will provide a result on the epidemic behavior during the early stage for any network size.

From Lemma 2.2, we know that once the order of infections is known, the proposed SIR spreading process on a network has the same average realization as the modified spreading process based on the transmission matrix corresponding to the infection order as described in the system of equations (5). Therefore, we can study the basic reproduction number of the SIR process on a network by using the analog system of equations (5) in a similar manner as in the mass-action model.

We have . So the basic reproduction number of the spreading process on the network is . Note that, when , large outbreaks that are driven to extinction by depletion of susceptibles occur with positive probability. Let us denote the sequence of distinct node degrees of the given network as ; the probability a given node has degree is , where and . If the initial infected node is node , then the basic reproduction number is . If the initially infected node is unknown, the basic reproduction number now follows a distribution induced by the node degree distribution where with probability , where is again the probability a node has degree . The average basic reproduction number is .

Denote . We have the following bounds: for all . Since the network structure underlying the mass-action model is a fully connected network, its basic reproduction number is . We see that the quantity is an upper bound on the basic reproduction number of a spreading process on a network. This tells us that if the process starts with one initially infected node, the spreading process on a network is less likely to lead to an epidemic than the mass-action model. The lower bound of the above inequality tells us that an epidemic is least likely to occur if the initially infected node has the smallest number of neighbors (smallest degree).

Next, we consider the early stage behavior of the effective reproduction number , for small. For simplicity, suppose that at time , the network has infected nodes and no recovered nodes. Let denote the set of infected nodes at time , we have . First, we look at the upper bound and lower bound of the quantity at each infected node . Since in a network with infected nodes, node is one of them, and there are other infected nodes. The number of susceptible neighbors of node , , depends on the positions of the other infected nodes. In one extreme case, if only one of node ’s neighbors is infected and the remaining infected nodes are not neighbors of node , then reaches its maximum value, which is . On the other hand, if all infected nodes are neighbors of node , then the number of susceptible neighbors of node is reduced to . Therefore, we can establish the following inequality: Dividing through by , we obtain Therefore, the lower bound and upper bound of are given by

. We observe that the effective reproduction number of the spreading process on networks attains its upper bound if the spreading path of infected nodes forms a line, and it attains its lower bound if the spreading path of infected nodes forms a complete graph of nodes.
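The counting argument behind these bounds can be checked numerically. The sketch below uses labels introduced here for illustration: d_i for the degree of an infected node i, k for the number of infected nodes, and s_i for the number of susceptible neighbors of node i. It assumes no recovered nodes and that every infected node has at least one infected neighbor, so that d_i - (k - 1) <= s_i <= d_i - 1. The function names and the toy network are hypothetical.

```python
def susceptible_neighbor_counts(adj, infected):
    """Return {i: number of susceptible neighbors of i} for infected nodes i.

    Assumes an undirected simple graph (adjacency sets) and no recovered nodes.
    """
    infected = set(infected)
    return {i: len(adj[i] - infected) for i in infected}


def within_bounds(adj, infected):
    """Check d_i - (k - 1) <= s_i <= d_i - 1 for every infected node i."""
    k = len(set(infected))
    counts = susceptible_neighbor_counts(adj, infected)
    return all(
        len(adj[i]) - (k - 1) <= s <= len(adj[i]) - 1
        for i, s in counts.items()
    )


# Hypothetical 7-node network; the infected nodes 1-2-3 lie on a path.
adj = {
    1: {2, 4, 5},
    2: {1, 3, 6},
    3: {2, 7},
    4: {1},
    5: {1},
    6: {2},
    7: {3},
}
chain = [1, 2, 3]
counts = susceptible_neighbor_counts(adj, chain)
print(counts, within_bounds(adj, chain))
```

In this toy network the two end nodes of the path each have exactly one infected neighbor and therefore attain the upper limit d_i - 1, while the middle node has two infected neighbors and sits at d_i - 2.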

Finally, we consider the important question of whether there are any scenarios where the spreading process on a network is more aggressive than the mass-action model (where its corresponding network structure is fully connected). The following Proposition gives us the answer to this important question.

Proposition 3.1 Consider the SIR proposed spreading process on networks at the early stage with infected nodes.

a. For large networks as size , the asymptotic behavior of the effective reproduction number at the early stage of the epidemic on networks is always asymptotically bounded above by the effective reproduction number of the mass-action model.
b. For finite-size networks, the effective reproduction number at the early stage of the epidemic on networks is greater than that of the mass-action model if all infected nodes form a chain graph and each infected node has more than susceptible neighbors.

Proposition 3.1 shows that for a large network, its effective reproduction number at the early stage ( small) is always asymptotically bounded from above by its counterpart mass-action model. However, given a finite-size network, the network spreading process can be more aggressive than the mass-action model depending on the spreading pattern and network topology. This highlights the importance of network topology in understanding disease dynamics.

Fig 4 demonstrates the case where the spreading process on a partially connected network is more aggressive than the spreading process on a fully connected network. In this case, the infected nodes on the partially connected network as in Fig 4a form a chain graph. The total transmission rate on the partially connected network is 2.1, while the total transmission rate on the fully connected network is 1.5.

Fig 4. Comparing transmission rates of two networks with three infected nodes (1,2,3): (a) partially connected network with the total transmission rate of 2.1, and (b) fully connected network with the total transmission rate of 1.5.

https://doi.org/10.1371/journal.pcbi.1013373.g004

3.2 Simulation results on the proposed spreading process and its modifications

3.2.1 Modified spreading processes.

We conducted simulation studies for the three processes discussed in the paper: SI, SIR, and SITAD. For each process, the fixed network structure was generated from a network model. We considered networks of nodes generated by the Erdős–Rényi (ER) model with the probability parameter , and the Barabási–Albert (BA) model with the parameter . Note that the parameters and in the ER and BA models influence the network’s density: the larger these values, the denser the network. While is the probability that two random nodes among nodes are connected, refers to the number of edges that a newly added node creates to connect to existing nodes in the network. Since the ER network can have multiple components, we forced it into a single component by adding the set of edges . Without loss of generality, we assumed that node 1 is the initially infected node. Based on the given network, the initially infected node, and the model parameter , the average number of infections was determined by averaging the number of infections resulting from 1000 iterations of disease transmission using the proposed spreading rule on the network. On the other hand, using the sampling procedure, we generated 30 infection order sequences and their corresponding transmission matrices , for . Then, we used the modified process that utilized random transmission matrices to produce 1000 realizations of the number of infections. We obtained the average realization of the number of infections of the modified process by taking the average of these infection sequences. We also considered the ATMM by applying the modified process to the average transmission matrix . In particular, we simulated the modified process utilizing the average transmission matrix 1000 times. We then calculated the average number of infections by averaging those 1000 simulated realizations. Fig 5 shows good agreement across the different approaches.

Fig 5. Approximation of different approaches for the SI, SIR, and SITAD spreading processes on the BA network (left) and modified ER network with a single connected component (right).

The first row corresponds to the SI process with the model parameter , the second row to the SIR process with , and the last row to the SITAD process with .

https://doi.org/10.1371/journal.pcbi.1013373.g005

3.2.2 Proposed SIR model and the SIR ATMM.

In this section, we use synthetic network data to illustrate the benefits of the modified spreading process in estimating model parameters. In particular, we compare the performance of the proposed SIR process and the SIR ATMM in estimating model parameters based on observed network data. To estimate the model parameters, we employ approximate Bayesian computation (ABC), a method that bypasses the need for a direct likelihood calculation. There are many variants of ABC, but they are all based on a comparison of observed and simulated data. The key idea of ABC is to sample parameter values from prior distributions and then use the model and these parameter values to generate a data realization. A sampled parameter value is retained if the data simulated from it are close enough to the observed data. The collection of accepted parameter values constitutes a sample from an approximation of the posterior distribution and is used to estimate the model parameters. The ABC variant used in this paper is replenishment ABC (RABC) [39]. Criteria for comparison include computational time, coverage probability of intervals derived from the posteriors, and interquartile range. To obtain these metrics, we set up the code as follows.

Step 1. Generating data and parameters: For , we generate the parameter from uniform priors and . Based on the parameters, empirical network, and the proposed SIR model, we generate a data set corresponding to . If the generated data set constitutes a good realization as defined above, we keep as a true parameter value to be estimated and treat the generated data as observed data. We repeat the process until we obtain 100 underlying true parameter values and the corresponding 100 datasets . For simplicity, we fix the initial infected node at node 1 and set the simulation time period for all .

Step 2. Estimating parameters: For each iteration , , based on the sequence of , we use RABC to estimate the underlying true parameter value . In this estimation step, we chose priors for as , as , the final threshold as 40, and sampled 100 particles to form the posterior. We also used the simple Euclidean distance, , where are the days during the study period, is the number of infected nodes and is the number of recovered nodes at time ; and are the corresponding numbers from simulated data.

Step 3. Evaluating parameter estimates: For each synthesized data set , , we evaluated the accuracy of our parameter estimates for each method based on the coverage probability of the interquartile interval (IQ Cover), the coverage probability of the 95% interval (95% Cover), and the average interquartile range (IQR), for each parameter . We then calculated the average of these 100 IQ Cover, 95% Cover, and IQR values. We also compared the average time required to obtain the estimates corresponding to each realization using each method. The computation time is based on the results after submitting the parallel Python code to the University of Tennessee at Chattanooga cluster with 3GB of memory and one CPU per task.
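Steps 1 and 2 can be illustrated with a small, self-contained Python sketch. Several things here are assumptions made for illustration only. The sketch uses plain rejection ABC rather than RABC [39]. The simulator is a stand-in chain-binomial SIR with homogeneous mixing rather than the proposed network process or the ATMM. The distance is taken to be the square root of the summed squared differences in infected and recovered counts. The priors, threshold, population size, and all function names are hypothetical.

```python
import math
import random


def euclidean_distance(obs, sim):
    """Distance between two epidemics given as lists of (infected, recovered)."""
    return math.sqrt(sum((io - js) ** 2 + (ro - rs) ** 2
                         for (io, ro), (js, rs) in zip(obs, sim)))


def simulate_sir(beta, gamma, n, steps, rng):
    """Stand-in simulator: discrete-time chain-binomial SIR, homogeneous mixing."""
    s, i, r = n - 1, 1, 0
    out = []
    for _ in range(steps):
        p_inf = 1.0 - (1.0 - beta / n) ** i  # per-susceptible infection probability
        new_inf = sum(rng.random() < p_inf for _ in range(s))
        new_rec = sum(rng.random() < gamma for _ in range(i))
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        out.append((i, r))
    return out


def abc_rejection(observed, n, n_particles, threshold, rng, max_tries=5000):
    """Keep parameter draws whose simulated epidemic lies within `threshold`."""
    accepted = []
    for _ in range(max_tries):
        # Assumed uniform priors, chosen only for this illustration.
        beta, gamma = rng.uniform(0.0, 1.0), rng.uniform(0.0, 0.5)
        sim = simulate_sir(beta, gamma, n, len(observed), rng)
        if euclidean_distance(observed, sim) <= threshold:
            accepted.append((beta, gamma))
            if len(accepted) == n_particles:
                break
    return accepted


rng = random.Random(0)
# Synthetic "observed" data generated from known parameter values.
observed = simulate_sir(0.6, 0.1, n=50, steps=15, rng=rng)
posterior = abc_rejection(observed, n=50, n_particles=20, threshold=40.0, rng=rng)
```

The cost of each ABC iteration is dominated by the call to the simulator, which is why replacing a network simulation with a cheaper surrogate can shorten the overall run time.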

Table 1 presents the averages of running time, IQ Cover, 95% Cover, and IQR when using ABC to estimate model parameters with the proposed SIR model and the SIR ATMM. The reported credible intervals show slight overcoverage, which is a known and generally acceptable feature of ABC methods, especially under low tolerance thresholds, and reflects the method’s conservative approach to uncertainty [40]. Beyond this, the table shows that the average transmission matrix model estimated parameters with comparable accuracy to the proposed model, while significantly reducing computation time (from 13.7 hours to 0.4 hours on average).
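The accuracy metrics of Step 3 can be computed from posterior samples along the following lines. This is a sketch under stated assumptions: the interquartile interval is taken as the range between the first and third quartiles, the 95% interval as the range between the 2.5th and 97.5th percentiles, and the function names and sample values are hypothetical.

```python
import statistics


def interval_metrics(posterior_samples, true_value):
    """Return (in_iq_interval, in_95_interval, iqr) for one parameter."""
    quartiles = statistics.quantiles(posterior_samples, n=4, method="inclusive")
    cuts = statistics.quantiles(posterior_samples, n=40, method="inclusive")
    q1, q3 = quartiles[0], quartiles[2]
    lo, hi = cuts[0], cuts[-1]  # 2.5th and 97.5th percentiles
    return (q1 <= true_value <= q3, lo <= true_value <= hi, q3 - q1)


def summarize(results):
    """Average per-data-set metrics into (IQ cover, 95% cover, mean IQR)."""
    n = len(results)
    return (sum(r[0] for r in results) / n,
            sum(r[1] for r in results) / n,
            sum(r[2] for r in results) / n)


# Hypothetical posterior samples reused for two data sets (illustration only).
samples = [float(x) for x in range(1, 101)]
results = [interval_metrics(samples, 50.0), interval_metrics(samples, 99.5)]
iq_cover, cover_95, mean_iqr = summarize(results)
```

In the study design described above, `results` would hold one entry per synthesized data set and per parameter, so the averages run over the 100 data sets.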

Table 1. Comparison of parameter estimation between the proposed SIR model and the SIR ATMM using RABC.

https://doi.org/10.1371/journal.pcbi.1013373.t001

This improvement is expected: the average transmission matrix model requires network information only to obtain the transmission matrix, and once the transmission matrix is available, the spreading process can be simulated as quickly as the mass-action model. This addresses a significant challenge associated with the use of ABC in network infectious disease epidemiology research: the lengthy computational time required to directly simulate epidemic data on a network for calibration purposes. The average transmission matrix approach is thus an excellent candidate for implementing ABC in network epidemiology.

3.3 Real data analysis on the necessity of network information in studying epidemics

Here, we address a fundamental question: If network information were readily accessible, how useful would it be compared with merely using the mass-action model? We provide a quantitative answer to this question by comparing the fit of the average transmission matrix model and the mass-action model to observed data, as well as the computation time required for each method. Specifically, we initially utilized ABC to estimate the model parameters for each model based on the observed data. We then used the ABC posteriors of model parameters to find the average realization and the 95% confidence band for the number of current and cumulative infected cases (infected and recovered) for every approach.

We calculated the 95% confidence interval for each approach as follows. We simulated three distinct SIR data sets using each model parameter sample of the ABC posteriors, and we retained only the 30 simulated data sets that were closest to the observed data. The point of simulating three data sets for each model parameter is to avoid losing the particle (posterior sample) by chance, as the spreading process on a network might cause the realization to stop abruptly if the recovered nodes are in bottleneck positions at the early stages of the spreading process. From these 30 realizations, we constructed a 95% confidence interval for each method. In addition, for each of the 30 realizations, we calculated the Euclidean distance as defined in Section 3.2.2. Based on these distances, we calculated the mean distance and its standard deviation.
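The selection of the closest realizations and the construction of a band can be sketched as follows. The text does not specify how the interval is formed from the retained realizations, so the pointwise 2.5th and 97.5th percentiles used here are an assumption. For brevity, each trajectory is a single series of counts, and the function names and toy trajectories are hypothetical.

```python
import math
import statistics


def distance(obs, sim):
    """Euclidean distance between two equally long series of counts."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)))


def closest_runs(observed, simulated_runs, keep):
    """Return the `keep` simulated trajectories closest to the observed one."""
    return sorted(simulated_runs, key=lambda run: distance(observed, run))[:keep]


def pointwise_band(runs):
    """Pointwise (lower, mean, upper) using 2.5th and 97.5th percentiles."""
    band = []
    for values in zip(*runs):
        cuts = statistics.quantiles(values, n=40, method="inclusive")
        band.append((cuts[0], statistics.fmean(values), cuts[-1]))
    return band


# Toy data: one observed trajectory and five simulated ones (hypothetical).
observed = [1, 3, 6, 10]
runs = [[1, 3, 6, 10], [1, 2, 5, 9], [2, 4, 8, 12], [0, 0, 0, 0], [1, 3, 7, 11]]
best = closest_runs(observed, runs, keep=3)
band = pointwise_band(best)
mean_dist = statistics.fmean(distance(observed, r) for r in best)
sd_dist = statistics.stdev(distance(observed, r) for r in best)
```

With the settings described in the text, `keep` would be 30 and `runs` would contain three simulated data sets per posterior sample.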

Fig 6 demonstrates that network information provides a far better fit to the observed data. The 95% confidence band derived from the average transmission matrix method effectively captures the observed data. However, when the mass-action model is naively applied to fit the spreading process on the network, the results deviate significantly from the observed data. Table 2 provides more information on the distance and time for each model. Here, we take the final threshold as 40 for the ABC procedure. The table shows that naively adopting the mass-action model to match the epidemic data not only results in a worse fit than the average transmission matrix model but also requires a threefold increase in processing time. This arises because the ABC procedure has difficulty finding a suitable fit between data generated by the mass-action model and the observed network data. If the ABC acceptance threshold is lower than 40, the mass-action model will eventually fail to converge because the wrong model is being used to fit the network epidemic. Therefore, network knowledge is extremely valuable and can provide insights into the nature of epidemics.

Table 2. Comparison of the mass-action and the naive method.

https://doi.org/10.1371/journal.pcbi.1013373.t002

Fig 6. Comparison of the 95% confidence band for the mass-action model and the ATMM.

https://doi.org/10.1371/journal.pcbi.1013373.g006

4. Discussion

In this study, we examined the connection between network models and mass-action models. We proposed a spreading rule on networks that allows for an exact match between the epidemic spread on the network and the classic mass-action models when the graph is fully connected. We then developed modified spreading processes on networks that are similar to the classic mass-action models. We also proved that the modified processes and the proposed spreading rule on networks have the same average number of infections. Our results reveal how the network model differs from the traditional mass-action model. More specifically, we noted that the variety of spreading rules in the network-based model causes it to differ from its mass-action counterpart. The proposed spreading rule allows us to bridge the two models as it shares a similar underlying spreading mechanism with the mass-action model. When the network is fully connected, the proposed rule and the mass-action model align regardless of population size. However, when the network is partially connected, for a given infection order, the spreading process on the network and the mass-action model diverge because different transmission matrices drive each model.

Besides considering the popular SI and SIR spreading processes on networks, we extended the SITAD model to networks. We also analyzed and compared outbreaks during the early stage of the SIR spreading process for network and mass-action models. By utilizing synthesized data from an empirical network, we highlighted the benefits of network information in studying epidemics and the advantages of the proposed method. In particular, we demonstrate that the network structure is crucial for improving the fit. Additionally, we show that the modified version of our approach, ATMM, is computationally efficient with little loss of accuracy.

Furthermore, our approach allows us to transform epidemic spread on arbitrary networks into equivalent mass-action models. When employing simulation-based inference, which typically requires extensive calibration datasets, this transformation enables us to estimate model parameters for network epidemics more efficiently (especially for large networks): instead of repeatedly simulating the spreading process on the network, we can generate calibration data faster from the corresponding mass-action model.

The main limitation of the modified process is that the network is fixed. In practice, the network may evolve over time. The proposed spreading process also admits many extensions, such as investigating epidemiological quantities like the basic reproduction number, exact or approximate solutions of the spreading process, and prevention strategies on the network.

Finally, over the past couple of decades, research in network science has provided valuable insights into epidemics. Centrality measures and clustering have proven highly effective in identifying high-risk groups and optimizing intervention strategies [9]. As global connectivity and emerging diseases continue to challenge public health systems, the ability to model and analyze disease transmission through network-based approaches will be increasingly crucial in addressing evolving infectious disease threats.

Supporting information

S1 File. Proofs for Lemmas 2.1, 2.2, and 2.3; Proposition 3.1; Pseudocode for commonly used network spreading algorithms: SI Gillespie, SI unit infectivity, SI degree infectivity, the proposed SI, SIR, SITAD spreading rules, and the sampling process for determining infection order sequence.

Algorithm 1: The Gillespie SI spreading rule. Algorithm 2: The unit infectivity SI spreading rule. Algorithm 3: The degree infectivity SI spreading rule. Algorithm 4: The proposed SI spreading rule on networks. Algorithm 5: The proposed SIR spreading rule on networks. Algorithm 6: The proposed SITAD spreading rule on networks. Algorithm 7: The sampling mechanism to obtain the infection order sequence.

https://doi.org/10.1371/journal.pcbi.1013373.s001

(PDF)

Acknowledgments

The authors would like to thank Dr. Louis Raynal for his valuable insights and for sharing the RABC Python code.

References

1. Bernoulli D. Essai d’une nouvelle analyse de la mortalité causée par la petite vérole et des avantages de l’inoculation pour la prévenir. Histoire de l’Acad Roy Sci avec Mém des Math et Phys. 1760.
2. Kermack WO, McKendrick AG. A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London, Series A. 1927.
3. Nsoesie EO, Brownstein JS, Ramakrishnan N, Marathe MV. A systematic review of studies on forecasting the dynamics of influenza outbreaks. Influenza Other Respir Viruses. 2014;8(3):309–16. pmid:24373466
4. Afzal A, et al. Merits and limitations of mathematical modeling and computational simulations in mitigation of COVID-19 pandemic: a comprehensive review. Archives of Computational Methods in Engineering. 2022.
5. Keeling MJ, Rand DA, Morris AJ. Correlation models for childhood epidemics. The Royal Society Interface. 1997.
6. Newman M. Exact solutions of epidemic models on networks. arXiv. 2002.
7. Kiss IZ, Miller JC, Simon PL. Mathematics of Epidemics on Networks: From Exact to Approximate Models. Springer, 2017.
8. Craig B, Phelan T, Siedlarek JP, Steinberg J. Improving epidemic modeling with networks. Economic Commentary. 2020.
9. Wang X, An Q, He Z, Fang W. A literature review of social network analysis in epidemic prevention and control. Complexity. 2021.
10. Kuga K, Tanimoto J. Effects of void nodes on epidemic spreads in networks. Sci Rep. 2022;12(1):3957. pmid:35273312
11. Holme P. Fast and principled simulations of the SIR model on temporal networks. PLoS One. 2021;16(2):e0246961. pmid:33577564
12. Levin SA, Durrett R. From Individuals to Epidemics. Royal Society, 1996.
13. Miller JC. Epidemic Size and Probability in Populations with Heterogeneous Infectivity and Susceptibility. Physical Review. 2007.
14. Kenah E, Robins JM. Second look at the spread of epidemics on networks. Physical Review. 2007.
15. Kenah E, Robins JM. Network-based analysis of stochastic SIR epidemic models with random and proportionate mixing. J Theor Biol. 2007;249(4):706–22. pmid:17950362
16. Wilkinson RR, Ball FG, Sharkey KJ. The relationships between message passing, pairwise, Kermack-McKendrick and stochastic SIR epidemic models. J Math Biol. 2017;75(6–7):1563–90. pmid:28409223
17. Allard A, Moore C, Scarpino SV, Althouse BM, Hébert-Dufresne L. The role of directionality, heterogeneity, and correlations in epidemic risk and spread. SIAM Rev. 2023;65(2):471–92.
18. Keeling M. The implications of network structure for epidemic dynamics. Theor Popul Biol. 2005;67(1):1–8. pmid:15649519
19. Kenah E. Contact intervals, survival analysis of epidemic data, and estimation of R0. Biostatistics. 2011.
20. Malloy GSP, Fiebert JDG, Enns EA, Brandeau ML. Predicting the effectiveness of endemic infectious disease control interventions: the impact of mass action versus network model structure. Medical Decision Making. 2021.
21. Rempala GA. Equivalence of mass action and Poisson network SIR epidemic models. Biomath. 2023.
22. Newman M. Networks: An Introduction. Oxford University Press; 2010.
23. Dutta R, Mira A, Onnela JP. Bayesian inference of spreading processes on networks. Proceedings of the Royal Society of London, Series A. 2018.
24. Kuulasmaa K. The spatial general epidemic and locally dependent random graphs. Journal of Applied Probability. 1982;19(4):745–58.
25. Hove-Musekwa SD, Runyowa V, Mukandavire Z. Modelling the epidemiological and economic impact of HIV/AIDS with particular reference to Zimbabwe. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 2010.
26. Gillespie DT. Approximate accelerated stochastic simulation of chemically reacting systems. The Journal of Chemical Physics. 2001;115(4):1716–33.
27. Kenah E, Lipsitch M, Robins JM. Generation interval contraction and epidemic data analysis. Math Biosci. 2008;213(1):71–9. pmid:18394654
28. Keeling MJ. The effects of local spatial structure on epidemiological invasions. Proceedings of the Royal Society of London. 1999.
29. Volz E. SIR dynamics in random networks with heterogeneous connectivity. J Math Biol. 2008;56(3):293–310. pmid:17668212
30. Miller JC. A note on a paper by Erik Volz: SIR dynamics in random networks. J Math Biol. 2011;62(3):349–58. pmid:20309549
31. Miller JC, Slim AC, Volz EM. Edge-based compartmental modelling for infectious disease spread. J R Soc Interface. 2012;9(70):890–906. pmid:21976638
32. Jacobsen KA, Burch MG, Tien JH, Rempala GA. The large graph limit of a stochastic epidemic model on a dynamic multilayer network. Journal of Biological Dynamics. 2018.
33. KhudaBukhsh WR, Choi B, Kenah E, Rempala GA. Survival dynamical systems: individual-level survival analysis from population-level epidemic models. Interface Focus. 2020.
34. KhudaBukhsh WR, Bastian CD, Wascher M, Klaus C, Sahai SY, Weir MH, et al. Projecting COVID-19 cases and hospital burden in Ohio. J Theor Biol. 2023;561:111404. pmid:36627078
35. Kiss IZ, Kenah E, Rempała GA. Necessary and sufficient conditions for exact closures of epidemic equations on configuration model networks. J Math Biol. 2023;87(2):36. pmid:37532967
36. Kenah E, Miller JC. Epidemic percolation networks, epidemic outcomes, and interventions. Interdisciplinary Perspectives on Infectious Diseases. 2011.
37. Sapiezynski P, Stopczynski A, Lassen DD, Lehmann S. Interaction data from the Copenhagen Networks Study. Sci Data. 2019;6(1):315. pmid:31827097
38. Hambridge H, Kahn R, Onnela JP. Examining SARS-CoV-2 interventions in residential colleges using an empirical network. International Journal of Infectious Diseases. 2021.
39. Drovandi CC, Pettitt AN. Estimation of parameters for macroparasite population evolution using approximate Bayesian computation. Biometrics. 2011;67(1):225–33. pmid:20345496
40. Prangle D, Blum MGB, Popovic G, Sisson SA. Diagnostic tools for approximate Bayesian computation using the coverage property. Aust N Z J Stat. 2014;56(4):309–29.


Supporting information

S1 File. Proofs for Lemmas 2.1, 2.2, and 2.3; Proposition 3.1; Pseudocode for commonly used network spreading algorithms: SI Gillespie, SI unit infectivity, SI degree infectivity, the proposed SI, SIR, SITAD spreading rules, and the sampling process for determining infection order sequence.
