Fig 1.
Schematic representation of the study method.
For each empirical network we fit a number of network models. For each network model we simulate 100 networks and then simulate 1000 Susceptible-Infected-Recovered (SIR) epidemics per simulated network. Simulated networks are compared with the empirical network in terms of network features and simulated epidemic features.
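The epidemic step of this pipeline can be sketched as follows. This is a minimal illustration only, assuming a discrete-time SIR process in which each infected node is infectious for exactly one time step and transmits independently along each tie with a fixed probability; the captions do not state the actual time model or recovery mechanism. The function name `simulate_sir` and the adjacency-dictionary format are hypothetical.

```python
import random

def simulate_sir(adjacency, seed_node, p_transmit, rng):
    """Run one discrete-time SIR outbreak and return (final_size, duration).

    Assumption (not from the source): a node is infectious for exactly one
    time step and then recovers; transmission along each tie is independent.
    """
    susceptible = set(adjacency) - {seed_node}
    infected = {seed_node}
    recovered = set()
    steps = 0
    while infected:
        newly_infected = set()
        for node in infected:
            for neighbour in adjacency[node]:
                if neighbour in susceptible and rng.random() < p_transmit:
                    newly_infected.add(neighbour)
        susceptible -= newly_infected
        recovered |= infected      # everyone infectious this step recovers
        infected = newly_infected
        steps += 1
    return len(recovered), steps

# Hypothetical usage: 1000 outbreaks on one small path network.
rng = random.Random(42)
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
results = [simulate_sir(path, 0, 0.5, rng) for _ in range(1000)]
```

In the full study design this inner loop would be repeated for each of the 100 networks simulated from each fitted model, as well as for the empirical network itself.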
Table 1.
Summary of High School Contact Networks.
Fig 2.
High school relationships network based on the relationships network of Bearman et al. [30].
A network tie indicates a romantic or non-romantic sexual relationship was reported by one of the incident nodes. Gender is denoted by node colour (blue: male; pink: female). This version of the network was manually re-coded starting from Fig 2 of Bearman et al. [30].
Fig 3.
Snowball sample of PWID contact network used for model estimation.
Waves are indicated by shade, from wave 0 (black) to wave 2 (light gray).
Fig 4.
Subgraphs for edge-triangle models.
For each subgraph, node roles are distinguished by node colour. The number of node roles for each subgraph is shown in column 3. Inclusion of a subgraph is shown (X) for each of the model variations: standard configuration model (CM), edge-triangle model (ET), variation one with subgraphs of four nodes (+sub4), variation two with 3-triangles and 4-triangles (+34tri), variation three with a “truss” (+truss), and variation four with maximal cliques of five or more nodes (+clqs5+).
Table 2.
Parameters for SIR Simulations.
Table 3.
HS75 Network ERGM Specification.
Table 4.
HS75 Network ERGM Goodness-of-fit.
Fig 5.
Graph statistics for HS75 network models.
Various network statistics, shown as boxplots from 100 simulated networks. AKS is alternating k-star, GCC is global clustering coefficient, AKT is alternating k-triangle, and “node max. geo. dist.” is the mean over all nodes in the largest component of the nodewise maximum geodesic distance. Results are reported for an Erdős-Rényi model (“ER”), the configuration model (“CM”), the edge-triangle model (“ET”), its four variations, which include size-four subgraphs (“+sub4”), 3- and 4-triangles (“+34 tri”), trusses (“+truss”), and cliques of size 5 and above (“+clqs5+”), respectively, and an ERGM (“ERGM”). Values from the observed network are shown by horizontal dotted lines.
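Two of the statistics named in this caption can be computed as in the sketch below. It assumes that GCC means transitivity (closed connected triples divided by all connected triples), which the caption does not spell out, and it takes “node max. geo. dist.” literally as the mean eccentricity within the largest component. The function names and the dictionary-of-sets graph format are hypothetical, and the graph is assumed to be non-empty.

```python
from collections import deque
from itertools import combinations

def global_clustering(adj):
    """Transitivity: fraction of connected triples that are closed."""
    closed = 0
    triples = 0
    for nbrs in adj.values():
        for u, v in combinations(nbrs, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0

def bfs_distances(adj, source):
    """Geodesic distances from source to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def mean_max_geodesic(adj):
    """Mean, over nodes of the largest component, of each node's
    maximum geodesic distance to any other node in that component."""
    seen = set()
    largest = {}
    for node in adj:
        if node not in seen:
            comp = bfs_distances(adj, node)
            seen.update(comp)
            if len(comp) > len(largest):
                largest = comp
    ecc = [max(bfs_distances(adj, n).values()) for n in largest]
    return sum(ecc) / len(ecc)

# Hypothetical example: a triangle with one pendant node and one isolate.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
gcc = global_clustering(g)
mean_ecc = mean_max_geodesic(g)
```

For networks of the size studied here, the all-pairs breadth-first search is affordable; a graph library would be the more practical choice for larger networks.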
Fig 6.
Final size of SIR epidemics using HS75 network models.
For each network model, the plotted value is the mean (with 95% confidence interval) over 100 simulated networks of the per-network mean over all outbreaks from 1000 SIR simulations. For the observed network, it is the mean (with 95% confidence interval) over all outbreaks from 1000 SIR simulations.
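The two-level averaging described in this caption can be sketched as below. The caption does not say how the 95% confidence intervals were computed, so the normal approximation (mean ± 1.96 standard errors) used here is an assumption, as are the function names. Networks that produced no outbreaks contribute no per-network mean, and no interval is returned when fewer than two values are available.

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI (assumed method, see lead-in)."""
    if not values:
        return None, None
    m = statistics.fmean(values)
    if len(values) < 2:
        return m, None            # an interval needs at least two values
    half = z * statistics.stdev(values) / math.sqrt(len(values))
    return m, (m - half, m + half)

def summarise_model(outbreak_sizes_per_network):
    """Mean over simulated networks of the per-network mean outbreak size."""
    per_network_means = [
        statistics.fmean(sizes)
        for sizes in outbreak_sizes_per_network
        if sizes                  # skip networks with no recorded outbreaks
    ]
    return mean_with_ci(per_network_means)

# Hypothetical data: three simulated networks, one with no outbreaks.
model_mean, model_ci = summarise_model([[1, 3], [2, 4], []])
```

For the observed network there is only one network, so `mean_with_ci` would be applied directly to the outbreak values from its 1000 simulations.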
Fig 7.
Epidemic duration of SIR epidemics using HS75 network models.
For each network model, the plotted value is the mean (with 95% confidence interval) over 100 simulated networks of the per-network mean over all outbreaks from 1000 SIR simulations. For the observed network, it is the mean (with 95% confidence interval) over all outbreaks from 1000 SIR simulations. Large (or missing) confidence intervals for small probabilities of transmission arise from a small number of observed outbreaks (typically fewer than five). The large difference between the observed network and the network models for probabilities of transmission larger than 0.7 reflects differences in the diameters of the large components.
Table 5.
HS60 Network ERGM Specification.
Fig 8.
Graph statistics for HS60 network models.
Various network statistics, shown as boxplots from 100 simulated networks. AKS is alternating k-star, GCC is global clustering coefficient, AKT is alternating k-triangle, and “node max. geo. dist.” is the mean over all nodes in the largest component of the nodewise maximum geodesic distance. Results are reported for an Erdős-Rényi model (“ER”), the configuration model (“CM”), the edge-triangle model (“ET”), its four variations, which include size-four subgraphs (“+sub4”), 3- and 4-triangles (“+34 tri”), trusses (“+truss”), and cliques of size 5 and above (“+clqs5+”), respectively, and an ERGM (“ERGM”). Values from the observed network are shown by horizontal dotted lines.
Fig 9.
Final size of SIR epidemics using HS60 network models.
For each network model, the plotted value is the mean (with 95% confidence interval) over 100 simulated networks of the per-network mean over all outbreaks from 1000 SIR simulations. For the observed network, it is the mean (with 95% confidence interval) over all outbreaks from 1000 SIR simulations.
Fig 10.
Graph statistics for HS6 network models.
Various network statistics, shown as boxplots from 100 simulated networks. GCC is global clustering coefficient, and “node max. geo. dist.” is the mean over all nodes in the largest component of the nodewise maximum geodesic distance. Results are reported for an Erdős-Rényi model (“ER”), the configuration model (“CM”), the edge-triangle model (“ET”), and its four variations, which include size-four subgraphs (“+sub4”), 3- and 4-triangles (“+34 tri”), trusses (“+truss”), and cliques of size 5 and above (“+clqs5+”), respectively. Values from the observed network are shown by horizontal dotted lines.
Table 6.
Relationships Network ERGM Specification.
Table 7.
Relationships Network Multilevel ERGM Specification.
Fig 11.
Graph statistics for the high school relationships network models.
Various network statistics. AKS is alternating k-star and “node max. geo. dist.” is the mean over all nodes in the largest component of the nodewise maximum geodesic distance. Results are reported for a random network model with a fixed number of edges (“ER fix”), the edge-triangle model (“ET”) without respecting gender, a bipartite version of the configuration model (“bip CM”), an ERGM (“ERGM”) and a multilevel ERGM (“mERGM”). The last three models include the gender of both nodes of each edge, so they have only two same-sex edges. For the ERGM, the 9-star and the same-sex network tie structure are fixed, exogenous effects. For the multilevel ERGM, the 9-star, the sole triangle, the three overlapping 4-cycles and the same-sex network structure are all fixed. All models use 573 nodes.
Fig 12.
Final size and duration of SIR epidemics using the high school relationships network models.
For each network model, the plotted value is the mean (with 95% confidence interval) over 100 simulated networks of the per-network mean over all outbreaks from 1000 SIR simulations. For the observed network, it is the mean (with 95% confidence interval) over all outbreaks from 1000 SIR simulations. The dotted horizontal line in the left panel shows the fraction of nodes in the large component of the empirical network. The large number of smaller components effectively limits the final size of an SIR epidemic starting from a limited number of seed nodes.
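The limiting effect of component structure noted in this caption can be illustrated with a short sketch: an SIR epidemic can never leave the connected components that contain its seed nodes, so the combined size of those components is an upper bound on the final size. The function names and the dictionary-of-sets graph format are hypothetical, and the example graph is made up for illustration.

```python
def connected_components(adj):
    """Return the connected components of an undirected graph as sets."""
    seen = set()
    comps = []
    for start in adj:
        if start in seen:
            continue
        comp = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in adj[node]:
                if nbr not in comp:
                    comp.add(nbr)
                    stack.append(nbr)
        seen |= comp
        comps.append(comp)
    return comps

def max_final_size_fraction(adj, seeds):
    """Upper bound on the fraction of nodes an epidemic can ever reach."""
    seed_set = set(seeds)
    reachable = set()
    for comp in connected_components(adj):
        if comp & seed_set:
            reachable |= comp
    return len(reachable) / len(adj)

# Hypothetical graph with components of sizes 3, 2 and 1.
g = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}, 5: set()}
bound_one_seed = max_final_size_fraction(g, [0])
bound_two_seeds = max_final_size_fraction(g, [0, 3])
```

With a single seed in the largest component, the bound equals the fraction of nodes in that component, which corresponds to the dotted line described above.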
Fig 13.
Graph statistics for the PWID contact network.
Various network statistics. AKS is alternating k-star, AKT is alternating k-triangle, GCC is global clustering coefficient, and “node max. geo. dist.” is the mean over all nodes in the largest component of the nodewise maximum geodesic distance. Results are reported for an Erdős-Rényi model (“ER”), the edge-triangle model (“ET”) and an ERGM (“ERGM”). All models use 524 nodes. For snowball-sampled network data there is no complete network with which to compare.
Fig 14.
Final size and epidemic duration of SIR epidemics using PWID network models.
For each network model, the plotted value is the mean (with 95% confidence interval) over 100 simulated networks of the per-network mean over all outbreaks from 1000 SIR simulations. For the observed network, it is the mean (with 95% confidence interval) over all outbreaks from 1000 SIR simulations. For snowball-sampled network data there is no complete network with which to compare.