Table 1.
Initial set of covariates for logistic regression.
Table 2.
Logistic regression models for HRV acquisition and loss, with covariates defined in Table 1.
Fig 1.
Univariate effects of covariates in the final generalized linear models.
Probability of acquisition (a-e) or loss (f) of HRV as a function of a) week, b) age, c) age of youngest household member, d) concurrent positive test for coronavirus, e) fraction of other household members testing positive (pooled into six categories to smooth results), and f) number of other household members testing positive. Red lines show a smoothed fit (supsmu function in R).
Fig 2.
Univariate effects of significant specific infection classes.
Average probability of infection as a function of specific infection classes. The number of data points is indicated by numbers in (a-c) and by dot size in (d), ranging from the smallest value of 23 (with 13 individuals infected) to the largest value of 662 (with 3 individuals infected). Error bars show one standard error.
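As an illustration of the error bars, the following Python sketch computes a proportion and its standard error for the two extreme cases quoted in the caption. It assumes the usual binomial standard error, sqrt(p(1 - p)/n), which the caption does not state explicitly; the function name is hypothetical.

```python
import math


def proportion_with_se(infected, total):
    """Return (p, se) for a binomial proportion.

    Assumption: the standard error is sqrt(p * (1 - p) / n); the caption
    does not say how the error bars were actually computed.
    """
    if total <= 0 or not 0 <= infected <= total:
        raise ValueError("need 0 <= infected <= total and total > 0")
    p = infected / total
    se = math.sqrt(p * (1.0 - p) / total)
    return p, se


# The two extreme cases quoted in the caption.
p_small, se_small = proportion_with_se(13, 23)
p_large, se_large = proportion_with_se(3, 662)
print(f"n=23:  p={p_small:.3f}, se={se_small:.3f}")
print(f"n=662: p={p_large:.4f}, se={se_large:.4f}")
```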
Table 3.
Logistic regression models for HRV acquisition with specific infection classes.
Within the household, all frequencies are the number of other individuals testing positive in the given class divided by household size minus 1. The population number is the number testing positive in the given class outside the household.
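The within-household frequency defined above can be sketched in Python as follows. The function name is hypothetical, and returning 0 for a single-person household (where the denominator would be zero) is an assumption; the caption does not say how such households were handled.

```python
def household_frequency(other_positive, household_size):
    """Frequency of positive tests among the OTHER household members.

    other_positive: number of other members testing positive in the class.
    household_size: total household size, including the focal individual.
    """
    others = household_size - 1
    if others <= 0:
        # Assumption: a person living alone has no household exposure.
        return 0.0
    if not 0 <= other_positive <= others:
        raise ValueError("other_positive must lie between 0 and household_size - 1")
    return other_positive / others


# A household of five in which two of the four other members test positive.
print(household_frequency(2, 5))
```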
Table 4.
Significant effects on acquisition and loss of other infections pooled into categories defined in the introduction.
Results are from logistic regression (glm with the binomial family in R) after removing covariates that were not significant with a generalized linear mixed model using household number as a random effect. In each case, Frequency refers to the frequency of positive tests of the focal virus within the household. Models include quadratic terms to capture the non-linear effect of week, particularly for viruses with a stronger winter peak. The covariates chosen after model selection are the same when analyzed with generalized additive models (gam in R). No covariates significantly predict loss of paramyxovirus or influenza.
Fig 3.
We compare data (in black) with simulation (in red) for a) the trajectory over the full year of data and the first five years of the simulation, b) mean prevalence as a function of household size, c) mean prevalence as a function of age group, and d) the index of dispersion θ for each household, with smaller values indicating a greater deviation from the binomial distribution (arrows connecting data to simulation added for clarity). The model includes age-dependent susceptibility, age-specific infection classes both within and between households, and intrahousehold reinfection. Households match those at the beginning of the BIG-LoVE study. Coefficients of the logistic regression model are Intercept = -0.632, Week = -0.00943, Age group = -0.798, Household HRV frequency in age group 1 = 2.482, Household HRV frequency in age group 2 = 1.573, Population HRV number in age group 2 = 0.0495.
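To show how the listed coefficients translate into a weekly acquisition probability, here is a minimal Python sketch of the logistic link. The coefficient values are taken from the caption; the function and argument names are hypothetical, and the numeric coding of week and age group is an assumption, since the caption does not define it.

```python
import math

# Coefficients as listed in the caption.
COEF = {
    "intercept": -0.632,
    "week": -0.00943,
    "age_group": -0.798,
    "hh_freq_age1": 2.482,   # household HRV frequency in age group 1
    "hh_freq_age2": 1.573,   # household HRV frequency in age group 2
    "pop_num_age2": 0.0495,  # population HRV number in age group 2
}


def acquisition_probability(week, age_group, hh_freq_age1, hh_freq_age2, pop_num_age2):
    """Inverse-logit of the linear predictor built from the caption's coefficients.

    Assumption: each covariate enters linearly with the coding passed in here;
    the caption does not specify how week or age group are coded.
    """
    eta = (
        COEF["intercept"]
        + COEF["week"] * week
        + COEF["age_group"] * age_group
        + COEF["hh_freq_age1"] * hh_freq_age1
        + COEF["hh_freq_age2"] * hh_freq_age2
        + COEF["pop_num_age2"] * pop_num_age2
    )
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical covariate values, chosen only for illustration.
p = acquisition_probability(week=10, age_group=1, hh_freq_age1=0.5,
                            hh_freq_age2=0.0, pop_num_age2=5)
print(f"probability of acquisition: {p:.3f}")
```

With all covariates set to zero the function returns the inverse-logit of the intercept alone, which is a quick way to check the sign conventions.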
Table 5.
Simulated effects of model components on overall mean prevalence and on the slope of prevalence within households as a function of household size.
The shorthand “intra” and “inter” refer to intrahousehold transmission and interhousehold transmission respectively.
Fig 4.
Effects of population mean household size and model structure on a) mean HRV prevalence in the entire population and b) the slope of HRV prevalence against household size within a population.
Household sizes in each population follow a Poisson distribution with the given mean.