Pyfectious: An individual-level simulator to discover optimal containment policies for epidemic diseases

doi:10.1371/journal.pcbi.1010799

Table 1.

Comparison of the population models used in various epidemic simulation software packages.

Table 2.

Comparison of the methods used to simulate the dynamics of epidemic diseases in various epidemic simulation software packages.

Table 3.

Comparison of simulation software packages in terms of their policy enforcement methods. Entries marked “-” for EpiModel [9] indicate that policies are not implemented yet.

Table 4.

A comparison of the simulators considering different technical aspects.

Fig 1.

The pipeline of Pyfectious.

The population generator creates the individuals and assigns them their roles to establish the connectivity graph. The connectivity graph, the disease properties, the clock, and the simulation settings are then fed to the simulator to produce the time evolution of the disease. In addition, the observers and the policies are provided to the simulator in order to log data and to make specific alterations to the simulation that emulate real-world epidemic control measures.

Fig 2.

a) Two people talking to each other are modeled as bidirectional connectivity: each person can infect the other. b) A person touching an item that was previously touched by another person is modeled as single-directional connectivity: only the first person can infect the second.
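
The two kinds of connectivity can be illustrated with a small directed graph. This is a minimal sketch, not Pyfectious code: the class `ConnectivityGraph`, its methods, and the person names are hypothetical and chosen only for illustration. An edge from u to v means that u can infect v.

```python
from collections import defaultdict


class ConnectivityGraph:
    """Hypothetical directed graph: an edge u -> v means u can infect v."""

    def __init__(self):
        self.out_edges = defaultdict(set)

    def add_directed(self, source, target):
        # Single-directional connectivity (panel b).
        self.out_edges[source].add(target)

    def add_bidirectional(self, a, b):
        # Bidirectional connectivity (panel a) is a pair of directed edges.
        self.add_directed(a, b)
        self.add_directed(b, a)

    def can_infect(self, source, target):
        # .get avoids creating empty entries for unknown nodes.
        return target in self.out_edges.get(source, set())


graph = ConnectivityGraph()
graph.add_bidirectional("alice", "bob")  # a conversation
graph.add_directed("carol", "dave")      # carol touched the item first
```

Storing a bidirectional contact as two directed edges keeps a single code path for the infection check.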

Fig 3.

(a) In the first phase, all individuals of the society are created based on the population size and the set of provided family patterns, as explained in Section 1.2.1. (b) The abstract hierarchical structure of the communities and subcommunities of the society is created. Communities may intersect, since an individual may be a member of several communities such as a family, a school, or a restaurant. Within a community, there are two types of interactions, both shown as red arrows. One type is inter-subcommunity, such as the interactions between teachers and students in a school. Note that inter-subcommunity edges are directional; they are drawn this way only for ease of depiction. (c) The other type of interaction, also shown with red arrows, is within a subcommunity, for example, the way students interact with each other in a school.

Fig 4.

a) A node is added to the graph for every individual to generate the population. b) Due to the tight interactions within a family, a family’s subgraph is a complete directed graph (red edges). c) The interactions across communities and subcommunities are represented by the blue edges, which are created according to Eq (2). Blue edges are directional.
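
As a minimal sketch of step (b), the complete directed subgraph of a family can be built by enumerating every ordered pair of distinct members. The function name `family_edges` and the member labels are hypothetical; the blue edges are not sketched here because they depend on Eq (2), which is not reproduced in this caption.

```python
from itertools import permutations


def family_edges(members):
    """Return every ordered pair of distinct members (complete directed graph)."""
    return set(permutations(members, 2))


edges = family_edges(["mother", "father", "child"])
```

A family of n members yields n(n - 1) directed edges under this construction.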

Fig 5.

Yellow circles on the timeline axis indicate events.

The crossed circles represent the events that are already executed. The filled triangle represents the current simulation time. After a task is executed, the simulation time jumps to the next event in the queue. An exemplary transition is shown by moving from (a) to (b).

Fig 6.

In this figure, a toy example of the execution process for two days is depicted.

Events are indicated by yellow circles on the horizontal axis, which represents the timeline. The current time of the simulation is shown by the filled triangle. The crossed circles represent events that have already been executed. After the execution of each event, the simulator time jumps forward to the nearest event in the queue. Each row of this figure shows one step of the simulation time. a) The simulator is initialized. The clock period is set to 5 hours, and the virus spread events are placed accordingly. For clearer illustration, the virus spread events in this plot are placed exactly at the ends of the intervals; in simulations, they are uniformly distributed within each interval. b) The planned day events are placed at the beginning of each day. These events schedule the individuals’ daily lives based on their roles in society. c) The timer is set to the start of the simulation time. d) During the execution of the planned day events, transition events are added to the event queue. These events will change the locations of the individuals. e) The timer takes a step forward. f, g) The transition events, which change the locations of the individuals, are executed. h) The infection is spread by interactions among individuals. New infections create new incubation events, which are added to the queue. i) The simulation continues according to the same rules.
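
The jump-to-next-event behavior described above can be sketched with a priority queue. This is an illustration under stated assumptions, not the Pyfectious implementation: the names `EventQueue`, `build_two_day_queue`, `plan_day`, and `virus_spread` are hypothetical, event handlers (such as planned day events adding transition events) are omitted, and only the 5-hour clock period, the two-day horizon, and the uniform placement of spread events within each interval are taken from the caption.

```python
import heapq
import itertools
import random


class EventQueue:
    """Hypothetical discrete-event queue ordered by event time."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times
        self.now = 0.0

    def schedule(self, time, name):
        heapq.heappush(self._heap, (time, next(self._counter), name))

    def run(self, until):
        """Execute events in time order, jumping the clock to each event."""
        executed = []
        while self._heap and self._heap[0][0] <= until:
            time, _, name = heapq.heappop(self._heap)
            self.now = time  # simulation time jumps to the next event
            executed.append((time, name))
        return executed


def build_two_day_queue(clock_period_hours=5, days=2, seed=0):
    rng = random.Random(seed)
    queue = EventQueue()
    horizon = 24 * days
    # Planned day events sit at the beginning of each day.
    for day in range(days):
        queue.schedule(24 * day, "plan_day")
    # One virus spread event per clock interval, placed uniformly inside it.
    start = 0
    while start < horizon:
        end = min(start + clock_period_hours, horizon)
        queue.schedule(rng.uniform(start, end), "virus_spread")
        start = end
    return queue, horizon


queue, horizon = build_two_day_queue()
log = queue.run(until=horizon)
```

Because the clock only advances when an event is popped, idle stretches of the timeline cost nothing to simulate.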

Fig 7.

Simulator’s clock is translated to Virus Spread Events indicated by yellow circles on the timeline axis.

In (a) and (b), the timelines of two otherwise identical simulators are displayed that differ only in their clock periods (T₁ for (a) and T₂ for (b)). If the probability of transmission in a single clock trigger is p, the number of triggers until transmission follows a geometric distribution with parameter p, so the probability that the disease is not transmitted by time t is (1 − p)^⌊t/T⌋, where T is the clock period. This probability is shown above the axes for a simple interaction between two individuals. For two such simulators with different resolutions to give consistent results, these probabilities must be equal, that is, (1 − p₁)^⌊t/T₁⌋ = (1 − p₂)^⌊t/T₂⌋. For clearer illustration, the virus spread events in this plot are placed exactly at the ends of the intervals; in simulations, they are uniformly distributed within each interval.
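
The consistency condition can be turned into a rule for rescaling the per-trigger probability when the clock period changes. The sketch below assumes the floor operations can be ignored (t is a multiple of both periods), which gives p₂ = 1 − (1 − p₁)^(T₂/T₁); this closed form is derived here from the equality above and is not stated in the caption. The function names and the numeric values are hypothetical.

```python
import math


def rescale_probability(p1, t1, t2):
    """Per-trigger probability for period t2 that matches p1 at period t1.

    Assumes the floors in the consistency condition can be ignored.
    """
    return 1.0 - (1.0 - p1) ** (t2 / t1)


def no_transmission_probability(p, period, duration):
    """Probability of no transmission over `duration`: (1 - p)^floor(duration/period)."""
    return (1.0 - p) ** math.floor(duration / period)


# Hypothetical values: halve the temporal resolution from 60 to 120 minutes.
p1, t1, t2 = 0.05, 60, 120
p2 = rescale_probability(p1, t1, t2)
```

With these values, both simulators give the same probability of no transmission over any duration that is a multiple of 120 minutes.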

Table 5.

Set of family patterns and the probability of their occurrence (M: male, F: female, {}: A family gender pattern).

Any other arbitrary family pattern can be easily defined in Pyfectious. For brevity, the age and the health condition of the family members are excluded from this table. They are sampled from a truncated normal distribution for every family member.
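
A minimal sketch of this sampling scheme follows. The patterns, their probabilities, and the age distribution parameters are made up for illustration and are not the values of Table 5; the function names are hypothetical, and the truncated normal is drawn by simple rejection sampling rather than by whatever method Pyfectious uses.

```python
import random

# Hypothetical gender patterns and probabilities, NOT the values of Table 5.
FAMILY_PATTERNS = [
    (("M", "F"), 0.4),
    (("M", "F", "M"), 0.3),
    (("M", "F", "F"), 0.3),
]


def truncated_normal(rng, mean, std, low, high):
    """Draw from a normal distribution restricted to [low, high] by rejection."""
    while True:
        value = rng.gauss(mean, std)
        if low <= value <= high:
            return value


def sample_family(rng):
    """Pick a gender pattern by its probability, then sample an age per member."""
    patterns, weights = zip(*FAMILY_PATTERNS)
    genders = rng.choices(patterns, weights=weights, k=1)[0]
    # Assumed age parameters (mean 35, std 15, range 0 to 100).
    return [
        {"gender": g, "age": truncated_normal(rng, 35, 15, 0, 100)}
        for g in genders
    ]


rng = random.Random(42)
family = sample_family(rng)
```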

Table 6.

The design of the communities and subcommunities for an exemplary city (society).

Fig 8.

a) The time it takes to generate the population and to simulate the propagation of the disease for 48 hours is plotted for six cities with population sizes from 6k to 30k. The clock (virus spread period) is 60 minutes in this experiment. The horizontal axis represents the population size, and the vertical axis represents the total process time. The experiment for every size of the population is repeated multiple times (each vertically aligned dot corresponds to an experiment) to achieve confidence, and the straight line indicates the trend. b) The number of active cases versus time is shown for a sample city with a population size of 20k. To emphasize the probabilistic nature of Pyfectious, the same experiment is repeated multiple times, and the effect of this randomness is seen by observing slightly different trajectories. The blue curve is the moving average (window size of 2) of all executions to show the trend.
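
One plausible reading of the trend curve in (b) is a pointwise mean over the repeated runs, smoothed with a moving average of window size 2. The sketch below assumes that reading; the function names and the trajectories are made up for illustration.

```python
def mean_trajectory(runs):
    """Average several equally long trajectories time step by time step."""
    return [sum(step) / len(step) for step in zip(*runs)]


def moving_average(values, window=2):
    """Simple moving average; the output is shorter by window - 1 points."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up active-case counts from two repeated runs.
runs = [[0, 2, 6, 4], [0, 4, 6, 8]]
trend = moving_average(mean_trajectory(runs), window=2)
```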

Fig 9.

The experiments shown here focus on the almost invariant behavior of the simulator under a reasonable change of the temporal resolution, either when applying various control policies or when changing the spread period. The population structure and other details of these experiments are the same as in Section 2.1, with a population size of 20k people. (a) shows this consistency for clock periods in the interval [80, 320], whereas (b) suggests that the reducibility of the system is a barrier that breaks this invariance when the temporal resolution is reduced radically.

Table 7.

Performance measure of commands.

The runtime of each command is averaged over 100 executions, showing that command runtime is insignificant compared to the other components of the simulator.

Fig 10.

a) The simulation is executed without any control measure, and the number of infected individuals is plotted versus time over a period of 10 months. The halo around the solid curve is the confidence interval obtained from multiple runs. In each run, the parameters of the population and the disease are re-sampled from the specified distributions. b) The number of active cases versus time is plotted for different immunity rates. The immunity rate is sampled from three uniform distributions with different mean values. As can be seen, higher immunity rates give rise to flatter curves. Note that AI in the legend stands for Average Immunity, which is the mean of the uniform distribution from which each curve’s immunity rate is sampled.

Fig 11.

a) The number of currently infected individuals is plotted versus time for different infection rates. The infection rates are sampled from uniform distributions with different mean values. A larger infection rate increases the slope of the curve, which means a faster spread of the disease soon after the start of the outbreak. As a result, the number of active cases takes less time to reach its peak. Note that AIR stands for Average Infection Rate, which is the mean of the distribution from which the infection rate is sampled. b) The spread of the infection is shown versus time for different numbers of initial spreaders, where the smaller set is chosen from communities such as large workspaces and schools, which are suitable places for infecting many people. As expected, the confidence intervals are wider for a smaller initially infected set because it leaves some communities without an initial spreader and consequently leads to a less homogeneous spread of the disease. The observation that the peak of the graph with a smaller initial set is higher than the one with a larger initial set supports the hypothesis that some roles and places need special treatment early in an epidemic, even though only a few of their individuals may be initially infected.

Fig 12.

a) The effect of the length of the incubation period (the period in which the infection is not detectable) and the disease period (the period in which the individual is infectious) is shown by changing these parameters of the disease. The curves correspond to a normal incubation and disease period, an increased incubation period by 3.5 days, a decreased incubation period by 3.5 days, and a decreased disease period by seven days. b) The outcome of two quarantine strategies. Strategy A: Enforce a quarantine 40 days after the outbreak, lift it after 20 days and enforce it again after 40 days. Strategy B: Enforce a quarantine 20 days after the outbreak and lift it 20 days later. The oscillatory curve is expected as the remaining active cases after the initial quarantine will be the initial spreaders for the next wave of the epidemic.

Fig 13.

a) This experiment focuses on enforcing universal quarantines (isolating every infected individual after detection) on a specific day after the outbreak. Here, the quarantines are applied both before and after the day on which the curve of active cases reaches its peak. Strategies A, B, C, and D enforce a quarantine at 200, 100, 40, and 60 days after the outbreak, respectively. b) This experiment studies the effect of partial quarantine, where a specified ratio of the currently infected individuals is isolated on a specified date (i.e., the control measure is triggered by a time point condition). The partial quarantine represents a real-world scenario in which there is uncertainty in detecting the infected individuals, for reasons such as inaccurate test kits or individuals with mild symptoms who do not visit hospitals or test facilities.
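
The partial quarantine in (b) can be sketched as a time point condition combined with a random selection of a fixed ratio of the infected individuals. The function names, the identifiers, and the numeric values below are hypothetical; Pyfectious expresses this through its own condition and command objects, which are not shown in this caption.

```python
import random


def time_point_reached(current_day, trigger_day):
    """Time point condition: true once the trigger day has been reached."""
    return current_day >= trigger_day


def select_for_quarantine(infected_ids, ratio, rng):
    """Pick round(ratio * N) of the N infected individuals uniformly at random."""
    count = round(ratio * len(infected_ids))
    # Sorting first makes the draw reproducible for a given seed.
    return set(rng.sample(sorted(infected_ids), count))


rng = random.Random(7)
infected = set(range(10))  # made-up identifiers of infected individuals
quarantined = set()
if time_point_reached(current_day=40, trigger_day=40):
    quarantined = select_for_quarantine(infected, ratio=0.5, rng=rng)
```

A ratio of 1.0 recovers the universal quarantine of panel (a).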

Fig 14.

a) This graph shows the effect of control policies that target specific sectors of society. Each curve corresponds to shutting down a different place. Group A includes all workplaces of any size. Group B consists of gyms, restaurants, and cinemas. Group C includes more public places such as malls and public transportation. b) This graph shows the effectiveness of quarantining specific roles (subcommunities) in society. The curves show the spread of the infection when different ratios of workers of any kind are quarantined. The effect is expectedly significant because the workers spend so much time in their workplaces every day, and many individuals often visit a workplace during working hours, which makes it a suitable place for spreading the infection.

Fig 15.

a) In this experiment, a command is set to quarantine the infected individuals in the population when 10% of the whole population is infected (a ratio condition triggers the control command). b) In this experiment, infected individuals are quarantined when more than 15% of the population are infected, and the quarantine is lifted when the ratio of infected individuals drops below 10%. In the terminology of control theory, this strategy is known as a bang-bang controller.
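
The controller in (b) switches on above one threshold and off below a lower one. A minimal sketch, assuming the 15% and 10% thresholds from the caption; the function name and the sequence of infected ratios are hypothetical.

```python
def bang_bang_quarantine(infected_ratio, quarantine_on, upper=0.15, lower=0.10):
    """Return the new quarantine state for a two-threshold on/off controller."""
    if not quarantine_on and infected_ratio > upper:
        return True   # enforce quarantine above the upper threshold
    if quarantine_on and infected_ratio < lower:
        return False  # lift quarantine below the lower threshold
    return quarantine_on  # between thresholds: keep the current state


# Made-up sequence of infected ratios over successive checks.
ratios = [0.05, 0.12, 0.16, 0.12, 0.09, 0.12]
state = False
states = []
for ratio in ratios:
    state = bang_bang_quarantine(ratio, state)
    states.append(state)
```

The gap between the two thresholds keeps the controller from toggling on every small fluctuation around a single cutoff.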

Table 8.

A summary of the discovered control strategies during the rounds of the optimization process.

The first column is the index of the discovered quarantine strategy. Each strategy consists of three ratios that give the portion of each group in {students, workers, customers} to put in quarantine. The second column indicates the iteration at which the associated strategy was found, and the rightmost column shows the value of the cost function for that strategy. The table includes only the iterations at which the agent improves the strategy. It can be seen that, in earlier rounds, the agent picks a large ratio of students, and only in later rounds does it recognize the critical effect of quarantining workers. Each iteration is a single run with a population size of 2k and takes (13.8 ± 5.7) minutes with average computational power.

Fig 16.

a) The agent aims to minimize the loss function, defined as the peak of the active cases. The optimization variables are the ratios of three roles that must be quarantined; the ratios are bounded from above and constrained to sum to a constant value. The upper bounds account for the cost of shutting down the economy and rule out the trivial solution of quarantining all individuals. The graph shows the result for 200 trials. The blue dashed line is the lower envelope of the cost produced by the discovered solution at every trial. Each point from A to H corresponds to the minimum cost up to that trial. The discovered policy associated with each of these points can be seen in Table 8. b) The curves of the number of active cases versus time are plotted for each round of the optimization. These are the curves that need to be flattened to protect the healthcare system against overloading. It can be seen that the discovered strategy with the least cost corresponds to the flattest curve. (The population size is reduced by a factor of 10 to reduce the computation time.)
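
The constrained search and its lower envelope can be sketched as follows. Plain random search is used here as a stand-in, since the caption does not specify the agent's optimizer; the cost function is a made-up placeholder for a full simulation run, and the total of 1.0 and the upper bound of 0.6 are assumed values. All names are hypothetical.

```python
import random


def sample_ratios(rng, total=1.0, upper=0.6, n=3):
    """Rejection-sample n ratios in [0, upper] that sum to `total`."""
    while True:
        cuts = sorted(rng.uniform(0, total) for _ in range(n - 1))
        ratios = [b - a for a, b in zip([0.0] + cuts, cuts + [total])]
        if all(r <= upper for r in ratios):
            return ratios


def toy_peak_cost(ratios):
    """Made-up placeholder for the simulated peak of active cases."""
    students, workers, customers = ratios
    return 1000 - 300 * students - 600 * workers - 200 * customers


def random_search(trials=200, seed=0):
    rng = random.Random(seed)
    best_cost = float("inf")
    envelope = []      # minimum cost found up to each trial
    improvements = []  # trials at which the best strategy improved
    for trial in range(trials):
        ratios = sample_ratios(rng)
        cost = toy_peak_cost(ratios)
        if cost < best_cost:
            best_cost = cost
            improvements.append((trial, ratios, cost))
        envelope.append(best_cost)
    return envelope, improvements


envelope, improvements = random_search()
```

Here `envelope` plays the role of the blue dashed line, and `improvements` corresponds to the kind of rows listed in Table 8, which records only the iterations that improve the strategy.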
