Fig 1.
Distributions of simulated transmission divergence values for different pathogens using the outbreaker and phybreak models.
A) Transmission divergence is defined as the number of mutations separating pathogen WGS sampled from transmission pairs. Horizontal bars indicate the proportion of transmission pairs separated by that number of mutations, across 100 outbreak simulations per pathogen. Outbreaks were simulated using both the outbreaker and phybreak models. B) For each simulated outbreak, we calculated the proportion of sequences that were unique. Black circles represent empirical observations of the proportion of unique sequences for a given outbreak (S6 Table), scaled by the size of the outbreak. The grey circle in the EBOV column represents the weighted mean across the four outbreaks. The violin plots with the dotted outlines in the K. pneumoniae column were generated using the empirical serial interval of 25.8 days observed over the course of the outbreak described by Snitkin et al. [106], which differs significantly from the value of 62.7 days in our literature review.
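The two summary statistics in this caption can be illustrated with a minimal sketch. It assumes aligned sequences of equal length, counts mutations as Hamming distance, and takes "proportion of unique sequences" to mean the number of distinct sequences divided by the number of sampled sequences; these readings, the function names, and the toy data are illustrative assumptions, not the simulation code behind the figure.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def transmission_divergence(seqs, pairs):
    """Mutations separating each (infector, infectee) pair."""
    return [hamming(seqs[infector], seqs[infectee]) for infector, infectee in pairs]


def divergence_proportions(divergences):
    """Proportion of transmission pairs at each divergence value."""
    counts = Counter(divergences)
    total = len(divergences)
    return {k: counts[k] / total for k in sorted(counts)}


def proportion_unique(seqs):
    """Distinct sequences divided by sampled sequences (assumed definition)."""
    return len(set(seqs.values())) / len(seqs)


# Hypothetical toy outbreak: four cases, three transmission pairs.
seqs = {"A": "ACGT", "B": "ACGT", "C": "ACGA", "D": "TCGA"}
pairs = [("A", "B"), ("A", "C"), ("C", "D")]

divergences = transmission_divergence(seqs, pairs)
proportions = divergence_proportions(divergences)
unique = proportion_unique(seqs)
```

In this toy outbreak, one pair shares an identical sequence and two pairs differ by a single mutation, while three of the four sampled sequences are distinct.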
Table 1.
Epidemiological and genomic parameters for ten major outbreak-causing pathogens.
Fig 2.
Impact of transmission divergence on outbreak reconstruction.
Transmission divergence is defined as the number of mutations separating pathogen WGS sampled from transmission pairs. A) Change in accuracy of outbreak reconstruction. Accuracy of outbreak reconstruction is defined as the proportion of correctly assigned ancestries in the consensus transmission tree, itself defined as the tree with the most frequent posterior infector for each infectee. Coloured points represent individual simulated outbreaks. The solid black line represents the fitted relationship of the form i - i*exp(-a*K), where K is the transmission divergence and a and i are the fitted parameters. Dotted black lines represent the corresponding 95% prediction interval. B) Change in posterior entropy. Posterior entropy is related to the number of plausible posterior infectors for a given case. Lower average entropy indicates greater statistical confidence in the proposed transmission tree. The solid black line represents the fitted relationship of the form i*exp(-a*K) - i, where K is the transmission divergence and a and i are the fitted parameters.
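The definitions of the consensus transmission tree, accuracy, and posterior entropy can be sketched as follows. The sketch assumes the posterior is available as a list of sampled trees, each a mapping from infectee to infector, and it uses the Shannon entropy of the posterior infector frequencies as the entropy measure; the caption only says that entropy is related to the number of plausible infectors, so that choice, along with all names and the toy data, is an assumption.

```python
import math
from collections import Counter


def infector_counts(samples, case):
    """How often each candidate infector is proposed for a case."""
    return Counter(sample[case] for sample in samples)


def consensus_tree(samples):
    """Most frequent posterior infector for each infectee."""
    return {
        case: infector_counts(samples, case).most_common(1)[0][0]
        for case in samples[0]
    }


def accuracy(consensus, truth):
    """Proportion of cases whose consensus infector is the true infector."""
    return sum(consensus[case] == truth[case] for case in truth) / len(truth)


def case_entropy(samples, case):
    """Shannon entropy (natural log) of the posterior infector frequencies."""
    counts = infector_counts(samples, case)
    total = sum(counts.values())
    return sum(-(n / total) * math.log(n / total) for n in counts.values())


def mean_entropy(samples):
    """Average per-case entropy across all infectees."""
    cases = list(samples[0])
    return sum(case_entropy(samples, case) for case in cases) / len(cases)


# Hypothetical posterior: four sampled trees for infectees B and C.
samples = [
    {"B": "A", "C": "B"},
    {"B": "A", "C": "B"},
    {"B": "A", "C": "B"},
    {"B": "A", "C": "A"},
]
truth = {"B": "A", "C": "A"}

consensus = consensus_tree(samples)
acc = accuracy(consensus, truth)
entropy = mean_entropy(samples)
```

Here case B has a single posterior infector and contributes zero entropy, whereas case C is split between two candidates and raises the average. The consensus tree assigns C to the wrong infector, so only half of the ancestries are correct.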
Fig 3.
Impact of the proportion of unique sequences on outbreak reconstruction.
A) Change in accuracy of outbreak reconstruction. Accuracy of outbreak reconstruction is defined as the proportion of correctly assigned ancestries in the consensus transmission tree, itself defined as the tree with the most frequent posterior infector for each infectee. Coloured points represent individual simulated outbreaks. The solid black line represents the fitted linear model, the dotted black lines the 95% prediction interval. B) Change in posterior entropy. Posterior entropy is related to the number of plausible posterior infectors for a given case. Lower average entropy indicates greater statistical confidence in the proposed transmission tree. The solid black line represents the fitted linear model, the dotted black lines the 95% prediction interval.