Revealing mechanisms of infectious disease spread through empirical contact networks

doi:10.1371/journal.pcbi.1009604

Fig 1.

A schematic of our algorithm.

Observed data (left panel): INoDS uses observed infection time-series data to estimate the statistical evidence for a static or dynamic contact network hypothesis (or hypotheses) using a three-step procedure. Shown here is an example of two competing network hypotheses based on behaviors A and B that potentially cause infection transfer. Inferential steps (right panel): In the first step, the tool estimates the per-contact transmission rate parameter β and the background transmission rate parameter ϵ, which captures the components of infection propagation unexplained by the edge connections of the network hypothesis. Here, the total number of infected connections of the focal node i (k_i) is 2. Second, to estimate the epidemiological relevance of the network hypothesis, Bayesian hypothesis testing is performed. The prior distribution shows that the null hypothesis (M = 1) assumes a uniform distribution over randomized networks generated by permuting 10% to 100% of edge connections in the contact network (H_A), whereas the alternate hypothesis (M = 2) is a spike-shaped distribution such that only the contact network (H_A, 0% permutation) has non-zero probability. The distribution on the model index shifts to M = 2 if the alternate hypothesis has higher posterior probability than the null. Third, model selection of competing network hypotheses is performed using the Bayes factor (BF). A log Bayes factor above 2.44 is considered decisive support for one hypothesis over the other.
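
To make the roles of β, ϵ and k_i in Step 1 concrete, the following is a minimal sketch of a per-time-step infection probability for a focal node. The functional form 1 - exp(-(β·k_i + ϵ)) is an assumption chosen for illustration, not a form stated in this caption, and the function name and example values are hypothetical.

```python
import math


def infection_probability(beta: float, epsilon: float, k_i: int) -> float:
    """Probability that a susceptible focal node is infected in one time step.

    ASSUMPTION: the force of infection is beta * k_i + epsilon and the
    probability is 1 - exp(-force). This form is illustrative only.
    beta    -- per-contact transmission rate
    epsilon -- background transmission rate (not explained by network edges)
    k_i     -- number of infected connections of the focal node
    """
    if beta < 0 or epsilon < 0 or k_i < 0:
        raise ValueError("beta, epsilon and k_i must be non-negative")
    return 1.0 - math.exp(-(beta * k_i + epsilon))


# Focal node with k_i = 2 infected connections, as in the schematic.
p_network = infection_probability(beta=0.05, epsilon=0.0, k_i=2)

# With no infected connections, only the background rate contributes.
p_background = infection_probability(beta=0.05, epsilon=0.01, k_i=0)
```

Under this assumed form, a network hypothesis that explains the data well pushes weight onto β, while a poor one leaves most infections to be absorbed by ϵ.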

Fig 2.

Validation of the three steps of INoDS.

(a) Step 1: Absolute error in estimates of the per-contact transmission rate parameter β (orange circles) and the background transmission rate ϵ (purple circles) for the simulated dataset with disease transmission rate (β*) ranging from 0.01 to 0.1. The true value of the background transmission rate (ϵ*) is zero. The filled black circle indicates the average absolute error and the error bars indicate the standard deviation around the mean value. (b) Step 2: Establishing the epidemiological relevance of the observed contact network. Each box summarizes the log Bayes factor of the observed network compared to the null hypothesis (viz. a prior of networks with 10% to 100% permuted edges). (c) Step 3: Model selection between the observed contact networks (0% randomization level) and networks with increasing edge randomization (25%, 50%, 75% and 100%). The log Bayes factor was calculated by subtracting the log marginal evidence of the randomized networks from the log marginal evidence of the true (0% randomized) synthetic network. A log Bayes factor of more than 2.44 (dashed line) is considered decisive evidence in favor of the observed contact network. The middle black line in each box plot is the median, the boxed area extends from the 25th to the 75th percentile, and the whiskers extend from the hinge to the largest/smallest value no further than 1.5 times the inter-quartile range.
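
The Step 3 calculation described in panel (c) is a subtraction of log marginal evidences followed by a comparison with the 2.44 threshold. A minimal sketch follows; the function names and the two log-evidence values are hypothetical, and the base of the logarithm is not stated in the caption, so both evidences are simply assumed to be on the same log scale.

```python
DECISIVE_THRESHOLD = 2.44  # threshold quoted in the caption (dashed line)


def log_bayes_factor(log_evidence_true: float, log_evidence_randomized: float) -> float:
    """Log Bayes factor of the true network over a randomized network.

    Both arguments are log marginal evidences on the same log scale.
    """
    return log_evidence_true - log_evidence_randomized


def is_decisive(log_bf: float, threshold: float = DECISIVE_THRESHOLD) -> bool:
    """True when the log Bayes factor exceeds the decisive-evidence threshold."""
    return log_bf > threshold


# Hypothetical log marginal evidences, for illustration only.
log_bf = log_bayes_factor(log_evidence_true=-120.3, log_evidence_randomized=-125.1)
decisive = is_decisive(log_bf)
```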

Fig 3.

Robustness of INoDS to missing network data.

Robustness of INoDS to missing nodes and missing edges in the network hypothesis. Networks with missing nodes/edges were created by randomly removing 25–75% of the nodes/edges not involved in the infection spread path at each time-step from the dynamic synthetic network. (a) Step 1: Δβ is the relative deviation of the estimated transmission parameter β from the true transmission rate β*. (b) Step 2: Epidemiological relevance of the observed network with missing data. Each box summarizes the log Bayes factor of the observed network with missing data compared to the null hypothesis (viz. a prior of networks with 10% to 100% permuted edges). (c) Evidence for the true synthetic network over datasets with missing data. A log Bayes factor of more than 2.44 (dashed line) is considered strong support in favor of the observed contact network. The middle black line in each box plot is the median, the boxed area extends from the 25th to the 75th percentile, and the whiskers extend from the hinge to the largest/smallest value no further than 1.5 times the inter-quartile range.
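
The missing-edge perturbation described above can be sketched for a single time-step as follows. This is an illustrative reading of the caption rather than the authors' code: the function name, the edge-tuple representation, and the choice to apply the removal fraction to the removable (non-transmission) edges are all assumptions.

```python
import random
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]


def drop_unprotected_edges(
    edges: List[Edge],
    protected: Set[Edge],
    fraction: float,
    seed: Optional[int] = None,
) -> List[Edge]:
    """Randomly remove a fraction of the edges that are not on the infection path.

    edges     -- unique edges of one network snapshot
    protected -- edges involved in the infection spread path (never removed)
    fraction  -- share of the removable edges to drop, e.g. 0.25 to 0.75
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    rng = random.Random(seed)
    removable = [e for e in edges if e not in protected]
    n_drop = int(fraction * len(removable))
    dropped = set(rng.sample(removable, n_drop))
    # Preserve the original edge order among the surviving edges.
    return [e for e in edges if e not in dropped]


# Hypothetical snapshot: edge (0, 1) carried a transmission event.
snapshot = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
transmission_edges = {(0, 1)}
thinned = drop_unprotected_edges(snapshot, transmission_edges, fraction=0.5, seed=1)
```

For a dynamic network, the same function would be applied independently to each time-step's snapshot.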

Fig 4.

Comparison of INoDS performance with previous approaches.

Statistical power of INoDS, the k-test and the network position test in establishing the epidemiological relevance of the “true” contact network against three common forms of missing data: missing nodes, missing edges and missing infected cases. For each method, statistical power was calculated as the proportion of disease simulations in which the observed contact network was detected as epidemiologically relevant (INoDS: log(B₁₀) > 2.44; k-test and network position test: p < 0.05).
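
The power calculation above reduces to counting detections across simulations. The sketch below uses the two detection criteria quoted in the caption; the function names and the example values are hypothetical and do not come from the study.

```python
from typing import Sequence


def power_from_log_bayes_factors(log_bfs: Sequence[float], threshold: float = 2.44) -> float:
    """Proportion of simulations with log(B10) above the threshold (INoDS criterion)."""
    if not log_bfs:
        raise ValueError("at least one simulation is required")
    return sum(b > threshold for b in log_bfs) / len(log_bfs)


def power_from_p_values(p_values: Sequence[float], alpha: float = 0.05) -> float:
    """Proportion of simulations with p below alpha (k-test / network position test)."""
    if not p_values:
        raise ValueError("at least one simulation is required")
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical results from four simulations, for illustration only.
inods_power = power_from_log_bayes_factors([3.1, 0.4, 5.0, 2.44])
ktest_power = power_from_p_values([0.01, 0.20, 0.04, 0.05])
```

Both comparisons are strict, so a value exactly at the threshold does not count as a detection.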

Fig 5.

Identifying the contact network model of Crithidia spread in two bumble bee colonies (QC6 and UN1) described in [36].

Edges in the contact network models represent physical interactions between the bees. Since the networks were fully connected, a series of filtered contact networks was constructed by removing weakly weighted edges from the network. The x-axis represents the edge weight threshold used to remove weak edges. Two types of edge weights were tested: duration and frequency of contacts. In addition, both types of weighted edges were converted to binary to create binary networks. The results shown are the estimated values of the per-contact transmission rate, β, for the two colonies. Asterisks above bars indicate that the networks were epidemiologically relevant in explaining the spread of Crithidia (single asterisk: Log(B₁₀) = 0.5–1, substantial evidence; double asterisks: Log(B₁₀) = 1–2, strong evidence). We note that model convergence was not achieved for several network hypotheses, which were therefore removed from our final analysis.
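
The filtering and binarization described above can be sketched as two small operations on a weighted edge list. The dictionary representation, the function names, and the interpretation of the threshold as an absolute weight with an inclusive cutoff are assumptions made for illustration; the caption does not specify how the threshold is defined.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def filter_weak_edges(weights: Dict[Edge, float], threshold: float) -> Dict[Edge, float]:
    """Keep only edges whose weight is at least the threshold (assumed inclusive)."""
    return {pair: w for pair, w in weights.items() if w >= threshold}


def binarize(weights: Dict[Edge, float]) -> Dict[Edge, int]:
    """Convert a weighted network to a binary one: every remaining edge gets weight 1."""
    return {pair: 1 for pair in weights}


# Hypothetical contact durations (arbitrary units) in a fully connected trio.
durations = {("bee1", "bee2"): 12.0, ("bee1", "bee3"): 0.5, ("bee2", "bee3"): 4.0}

filtered = filter_weak_edges(durations, threshold=4.0)
binary = binarize(filtered)
```

Sweeping `threshold` over a range of values yields the series of filtered network hypotheses, each of which can then be evaluated separately.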

Fig 6.

Identifying transmission mechanisms of Salmonella spread in Australian sleepy lizards.

Dynamic network of proximity interactions for a total duration of 70 days between (A) 43 lizards at site 1, and (B) 44 lizards at site 2. Each temporal slice summarizes interactions within a day (24 hours). Edges indicate that the pair of individuals were within 14 m of each other, and the edge weights are proportional to the frequency of physical interactions between the node pair. For ease of visualization, four networks summarizing interactions at days 15, 30, 57 and 70 are shown out of a total of 70 static network snapshots. Green nodes are the animals that were diagnosed as uninfected at that time-point, red nodes are the animals that were diagnosed as infected, and grey nodes are the individuals with unknown infection status at that time-point. We hypothesized that the spatial proximity networks could explain the observed spread of Salmonella in the population. The results are summarized as a table. Bold numbers indicate that the network hypothesis was found to be epidemiologically relevant compared to an ensemble of randomized networks. The network hypothesis with the highest log Bayesian (marginal) evidence at each site is marked with an asterisk (*).
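
As a rough illustration of how daily snapshots with frequency-based edge weights could be assembled, the sketch below aggregates a list of proximity records into one weighted edge list per day. The record format `(day, animal_a, animal_b)`, the function name, and the example records are hypothetical; each record is assumed to already represent one observation of a pair within the proximity distance.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def daily_snapshots(records: Iterable[Tuple[int, str, str]]) -> Dict[int, Dict[Edge, int]]:
    """Group proximity records by day; edge weight = number of records for the pair."""
    snapshots: Dict[int, Counter] = defaultdict(Counter)
    for day, a, b in records:
        if a == b:
            continue  # ignore self-contacts
        pair = (a, b) if a <= b else (b, a)  # undirected: store each pair once
        snapshots[day][pair] += 1
    return {day: dict(counts) for day, counts in snapshots.items()}


# Hypothetical proximity records, for illustration only.
records = [(1, "L1", "L2"), (1, "L2", "L1"), (1, "L1", "L3"), (2, "L2", "L3")]
network = daily_snapshots(records)
```

The result is one static weighted network per day, which together form the dynamic network hypothesis.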
