Fine-scale family structure shapes influenza transmission risk in households: insights from a study of primary school students in Matsumoto city, 2014/15

Background Households are important settings for the transmission of seasonal influenza. Previous studies found that the per-person risk of within-household transmission decreases with household size. However, more detailed heterogeneities driven by household composition and contact patterns have not been studied. Methods We employed a mathematical model which accounts for infections both from outside and within the household. The model was applied to citywide primary school surveillance data of seasonal influenza in the 2014/15 season in Matsumoto city, Japan. We compared a range of models to estimate the structure of household transmission. Results Familial relationship and household composition strongly influenced the transmission patterns of seasonal influenza in households. Children had a substantially higher risk of infection from outside the household (up to 20%) than adults (1-3%). Intense transmission was observed within generations (between children, between parents and between grandparents) and also between mother and child, with transmission risks typically ranging around 5-20% depending on the pair and household composition. Conclusions We characterised heterogeneity in household transmission patterns of influenza. Children were identified as the largest source of secondary transmission, with family structure influencing infection risk. This suggests that vaccinating children would have stronger secondary effects on transmission than would be assumed without taking into account transmission patterns within the household.



Introduction

[...] due to the limited sample size of households in these studies, a rationale on the quantitative effect of household size in transmission has not been established. Familial roles/relationships have received far less attention in household studies; we found only one field study on influenza that included familial roles as a covariate, a descriptive study that did not quantify the risk by familial roles (22).
The probability that a certain combination of individuals (represented by a vector $n$) in the household are infected by the end of the season is given by the following recursive equations:

$$\pi(n; N, \varepsilon, H) = \left[\prod_{k=1}^{K} \binom{N_k}{n_k}\right] \pi(n; n, \varepsilon, H) \prod_{k=1}^{K} Q_k(n, \varepsilon, H)^{N_k - n_k}, \qquad \pi(n; n, \varepsilon, H) = 1 - \sum_{\nu < n} \pi(\nu; n, \varepsilon, H), \tag{1}$$

where $N_k$ and $n_k$ are the $k$-th components of $N$ and $n$, respectively ($1 \le k \le K$). The sum $\sum_{\nu < n}$ is taken over all vectors $\nu$ satisfying $0 \le \nu_k \le n_k$ ($\forall k$) and $\nu \ne n$. $\varepsilon$ is the risk of external infection for each type of individual (a heterogeneous version of CPI; we avoid the term CPI as our model assumes that household members experience infection from different sources outside the household and not from a single "general community"). The susceptible-infectious transmission probability (SITP) $\rho_{kl}$ is the probability of within-household transmission for a specific infectious-susceptible pair (17) and has been used to quantify within-household transmission. However, it is more convenient to use the effective household contact matrix $H = (\eta_{kl})$ in the model; $\eta_{kl}$ is defined to satisfy $\rho_{kl} = 1 - \exp(-\eta_{kl})$, and is interpreted as the amount of contact that leads to within-household transmission (effective contact) from type $l$ to $k$. That is, $\eta_{kl}$ denotes the amount of exposure that an individual of type $k$ experiences when another individual of type $l$ is infectious. $Q_k$, the probability that an individual of type $k$ escapes infection from both outside and inside the household, is given as

$$Q_k(n, \varepsilon, H) = (1 - \varepsilon_k) \exp\left(-\sum_{l} \eta_{kl} n_l\right). \tag{2}$$

$(1 - \varepsilon_k)$ is the probability that the individual is not infected outside the household, and $\exp(-\sum_{l} \eta_{kl} n_l)$ is the probability that the individual is not infected by any of the infected household members. The likelihood $\pi(n; N, \varepsilon, H)$ is computed by recursively applying Equation (1) starting with $\pi(0; 0, \varepsilon, H) = 1$.

In the present study, we classified each individual in households as one of the following types: "father", "mother", "student", "sibling", or "other". "Students" are participants of the survey (i.e., students of primary schools in Matsumoto city), and "siblings" are their elder/younger siblings, who may have also been recruited in the survey if they are primary school students (however, they are not linked in the data and thus unidentifiable as participants). The parameters for "students" and "siblings" were differentiated because "siblings" are not necessarily primary school students, and therefore their characteristics may differ from those of "students". "Father" and "mother" were [...] considered in model selection where their parameter values were differentiated from cohabiting parents (details described in "model selection"). Most individuals classified as "other" were grandparents (90.1%). Uncles/aunts accounted for 6.7%, and the remaining 3.2% were "none of the above categories".

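As an illustration of how Equations (1) and (2) can be evaluated, the following Python sketch enumerates every outcome of a single household and applies the recursion from the empty household upward. It is a minimal sketch, not the code used in the study. It assumes the standard final-size form of the recursion: the number of ways to choose the infected members, times the probability that every member of a sub-household of composition n is infected, times the probability that the remaining members escape infection. The function names and the parameter values in the example are hypothetical.

```python
from itertools import product
from math import comb, exp, log, prod


def escape_prob(k, n, eps, eta):
    """Q_k in Equation (2): a type-k individual escapes infection from outside
    the household and from the n[l] infected members of each type l."""
    return (1.0 - eps[k]) * exp(-sum(eta[k][l] * n[l] for l in range(len(n))))


def final_size_distribution(N, eps, eta):
    """Return {n: pi(n; N, eps, eta)} for every outcome vector 0 <= n <= N.

    N[k]      number of household members of type k
    eps[k]    risk of external infection for type k
    eta[k][l] effective household contact from type l to type k
    """
    K = len(N)
    # Lexicographic order guarantees every nu < n is processed before n.
    outcomes = list(product(*(range(Nk + 1) for Nk in N)))
    all_infected = {}  # pi(n; n): all members of a household of composition n infected

    def pi(nu, size):
        """Probability that exactly nu members of a household `size` are infected."""
        ways = prod(comb(size[k], nu[k]) for k in range(K))
        escape = prod(escape_prob(k, nu, eps, eta) ** (size[k] - nu[k]) for k in range(K))
        return ways * all_infected[nu] * escape

    for n in outcomes:
        smaller = [nu for nu in product(*(range(nk + 1) for nk in n)) if nu != n]
        all_infected[n] = 1.0 - sum(pi(nu, n) for nu in smaller)  # equals 1 for n = 0
    return {n: pi(n, N) for n in outcomes}


# Hypothetical example: one child (external risk 0.2) and one adult (no external
# risk); child-to-adult effective contact log(2), i.e. an SITP of 0.5.
example = final_size_distribution((1, 1), [0.2, 0.0], [[0.0, 0.0], [log(2.0), 0.0]])
for outcome, probability in example.items():
    print(outcome, round(probability, 3))
```

In this toy household the adult can only be infected through the child, so the outcome probabilities are approximately 0.8 (nobody infected), 0.1 (child only), 0.1 (both) and 0 (adult only). The number of outcomes grows with the product of (N_k + 1), which stays small for realistic household sizes.
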
Transmission risk in households
We modelled the possible heterogeneity in household transmission by parameterising the effective household contact matrix $H = (\eta_{kl})$. Our basic assumptions are: (i) each pair of individuals has a specific "intensity of contact"; (ii) the relative importance of each household contact may be reduced if an individual experiences a large amount of household contacts in total; (iii) the contact intensity adjusted by the total amount of contact is proportional to the force of infection. That is, we modelled $\eta_{kl}$ as

$$\eta_{kl} = \frac{\beta c_{kl}}{C_k^{\gamma}}, \tag{4}$$

where $\beta$ is the transmissibility and $c_{kl}$ is the contact intensity. $C_k$ represents the total number of household contacts experienced by an individual of type $k$, which we introduced to investigate how $\eta_{kl}$ differs in households of different sizes and compositions. Noting that the number of individuals in contact is $N_l - \delta_{kl}$, $C_k$ is given as

$$C_k = \sum_{l} c_{kl} (N_l - \delta_{kl}),$$

where $\delta_{kl}$ is the Kronecker delta. The value of the exponent parameter $\gamma$ determines how strongly $\eta_{kl}$ is scaled by $C_k$, which associates our model with density-dependent vs. frequency-dependent mixing assumptions (34). $\gamma = 0$ corresponds to the density-dependent mixing assumption, where the force of infection is proportional to the total number of contacts (weighted by intensity) with infectives, whereas $\gamma = 1$ corresponds to the frequency-dependent mixing assumption, where it is the proportion of infectious contacts among total contacts that matters. In addition to $\gamma = 0$ and $\gamma = 1$, $\gamma$ was also allowed to be estimated as a free parameter in the model selection, representing a mixture of density-dependent and frequency-dependent mixing.

The contact intensity matrix $(c_{kl})$ is interpreted as the per-individual version of the contact matrix (each entry of the contact matrix divided by the number of individuals of the corresponding type). $(c_{kl})$ is generally a $K \times K$ matrix and contains too many parameters to estimate. We therefore reduced the number of parameters by categorising contacts into the following 5 pairs first: [...]

We then used the models selected by WBIC to estimate the parameters.
As final samples, 10,000 thinned samples were recorded from 40,000 pre-thinned MCMC samples. It was ensured that the effective sample size (ESS) was at least 500 for each parameter.
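
The thinning step and the ESS check can be sketched as follows. This is a hedged illustration only: the text does not state which ESS estimator was used, so the code assumes one common choice (the autocorrelation sum truncated at the first non-positive lag), and the chain is a synthetic AR(1) series standing in for one parameter's MCMC trace.

```python
import random


def thin(chain, n_keep):
    """Keep n_keep evenly spaced draws, e.g. every 4th draw for 40,000 -> 10,000."""
    step = len(chain) // n_keep
    return chain[::step][:n_keep]


def effective_sample_size(x):
    """ESS = n / (1 + 2 * sum of autocorrelations), truncating the sum at the
    first non-positive autocorrelation (an assumed, commonly used estimator)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        raise ValueError("constant chain: autocorrelation is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        rho = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Synthetic, autocorrelated stand-in for a 40,000-draw MCMC trace (hypothetical).
rng = random.Random(1)
chain = [0.0]
for _ in range(39_999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

samples = thin(chain, 10_000)
print(len(samples), effective_sample_size(samples) >= 500)
```

Thinning reduces storage and autocorrelation between retained draws but does not add information, which is why the ESS is checked on the retained samples for every parameter.
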
Using the estimated parameters, we computed the source-stratified risk of infection and the risk attributable to the introduction into the household (see the supplementary materials Section 2 for further details).

[...] estimates for FC and OC were very similar, which suggested that we might be able to equate these two parameters and further stratify the contacts between adults (AA) with the degree of freedom gained. We tested some other contact intensity matrices, [...] which gave the best performance in the end. The candidate models explored and the selection results are detailed in the supplementary materials Section 2.
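
The parameterisation of the effective household contact matrix described under "Transmission risk in households" can be illustrated with a short Python sketch. It assumes that the effective contact is the transmissibility times the contact intensity divided by the total contact raised to the power gamma, with the total contact counting every other household member weighted by contact intensity. The contact intensities, the transmissibility and the household compositions below are hypothetical values chosen for illustration, not estimates from the study.

```python
from math import exp


def effective_contact_matrix(c, N, beta, gamma):
    """eta[k][l] = beta * c[k][l] / C_k**gamma, where
    C_k = sum_l c[k][l] * (N[l] - delta_kl) is the total household contact
    experienced by an individual of type k (self excluded)."""
    K = len(N)
    eta = [[0.0] * K for _ in range(K)]
    for k in range(K):
        if N[k] == 0:
            continue  # type k is absent from this household
        C_k = sum(c[k][l] * (N[l] - (1 if l == k else 0)) for l in range(K))
        if C_k <= 0.0:
            continue  # no household contacts (e.g. living alone)
        for l in range(K):
            eta[k][l] = beta * c[k][l] / C_k ** gamma
    return eta


def sitp(eta_kl):
    """Susceptible-infectious transmission probability: rho = 1 - exp(-eta)."""
    return 1.0 - exp(-eta_kl)


# Hypothetical two-type example: type 0 = child, type 1 = adult.
c = [[1.0, 0.5], [0.5, 1.0]]   # illustrative contact intensities
beta, gamma = 0.2, 0.5         # illustrative transmissibility; gamma = 0.5
for N in [(1, 2), (2, 2), (4, 2)]:
    eta = effective_contact_matrix(c, N, beta, gamma)
    # Total exposure of one child if every other household member were infectious.
    total = eta[0][0] * (N[0] - 1) + eta[0][1] * N[1]
    print(N, round(sitp(eta[0][0]), 3), round(total, 3))
```

With gamma = 0.5 the child-to-child SITP falls as children are added while the total exposure rises. With gamma = 0 each pairwise contact would keep the same weight regardless of household size, and with gamma = 1 the total exposure would stay constant.
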

Sensitivity analysis
We performed sensitivity analysis to address potential biases in our dataset. We considered in our sensitivity analysis (i) ascertainment bias, (ii) different susceptibility in children, (iii) multiple counting of households and (iv) censoring of sibling cases.

The first two points are related to the assumptions in our models. Influenza can have a low reporting rate due to mild clinical presentation (including asymptomatic infections), and therefore some infectious individuals may not have been included in our dataset. The reporting rate of influenza is considered to be very high in primary school students [...]. On the other hand, the reporting rate of adults can be lower, as they may be less likely to seek medical treatment than children. A serosurvey conducted in Japan after the 2009/10 H1N1 influenza pandemic suggested that while influenza in children was almost fully reported, the reporting rate in adults was relatively low (30-50%) (36).

Another possible difference between adults and children is susceptibility: adults may be less likely than children to be infected by the same amount of exposure due to their history of previous infections or stronger immune systems. Conversely, children may exhibit lower susceptibility if their vaccine uptake is higher than that of adults. The majority of household transmission studies from a systematic review (8) reported a significant association between susceptibility and age (although this becomes the minority when limited to the studies with PCR-confirmed cases). Our baseline model assumes that transmissibility $\beta$ is identical between individuals, but in reality transmissibility might depend on the age of the susceptibles.

The remaining points explored in sensitivity analysis are inherent limitations in our dataset.
One of the limitations is that, because students in the same household responded to the questionnaire separately, households with multiple siblings may have been counted more than once. As this was an anonymous questionnaire, data obtained [...] household. If there was more than one child in a household who was eligible for the study, the same household transmissions can appear multiple times in the dataset, which could modify the results. Lastly, because of the design of the questionnaire, the number of influenza cases in siblings may have been underreported. The questionnaire asked whether each type of individual in the same household had influenza during the season, and, since it was a yes-no question, the respondents ticked the box if at least one individual of that type was infected. Therefore, even if there was more than one case among individuals of the same type, the number was not reported and they were treated as a single case; that is, if a respondent has two older brothers, he/she only reports that "older brother had influenza", and there is no distinction in the dataset as to whether it was only one or both of them.

Each potential source of bias was addressed by incorporating the data-generating process causing the bias into the model. Technical details of the sensitivity analysis can be found in the supplementary materials Section 3.

[...] infection and the risk of within-household transmission (Table 3 and Figure 1). The best performing mathematical model suggested that children had a comparatively high risk of infection outside the household: 20% in the primary school students and 16% in their siblings, compared to only 1-3% in adults. Within-household contact patterns showed strong generational clustering.
High contact intensities were observed within the same generation (between siblings, parents and grandparents), and the intensity of cross-generational contacts was less than half the intensity within the same generation. Contact between mothers and children was an exception to this, showing a higher intensity than between parents. The estimated contact intensity relative to that between parents (father-mother) was highest between other-other (1.97; CrI: 1.10-3.24), most of whom were grandparents in our data, followed by mother-child (1.16; CrI: 1.00-1.32) and child-child (1.04; CrI: 0.88-1.23). The model did not support a significant difference between parameter estimates for single and cohabiting parents.

The inferred networks of household transmission suggest that various contact patterns between household members exist in different household compositions. The contact intensities between individuals are shown in network graphs (Figures 3A-3C) for three selected characteristic household compositions: (a) "nuclear family": FM-2 (see Table 2 for the notation), (b) "many-siblings family": FM-4, and (c) "three-generation family": FM-2-2. Mothers served as a bridge between the generations of children and parents; clusters of grandparents were relatively independent of other household members.

The overall risk of infection and the breakdown of infection source presented in Figures 3D-3F suggest that the risk of infection in children was mostly from outside the household, whereas a larger proportion of the risk in adults was attributed to within-household transmission. The risk of within-household infection increased when more children were in the household (Figure 3E); however, the influence of additional members categorised as "others" (grandparents in most cases) was minimal, probably due to their low risk of external infection and contact intensity (Figure 3F). On the other hand, for grandparents in a typical three-generation household, the risk of infection from inside the household was twice the risk from outside.

Once influenza was brought into a household by a student, the conditional risk of infection in other members of the household became substantially higher; the implication of disease introduction into households can be seen in the simulated risk of infection after introduction (Figures 3G-3I). In "nuclear family" and "three-generation [...] student in the family was infected.

The effective household contacts that each type of individual experiences are displayed in Figure 4, indicating the substantial variation in household contact patterns between individuals and between households. SITP typically ranged around 5-20%, depending on the contact pair and household composition. Reflecting the estimated value of $\gamma = 0.5$ (CrI: 0.3-0.7), the total amount of effective household contacts was greater in larger households, but the weight of each single contact (the effective contact corresponding to a contact with one individual in the household) decreased with household size.
This is because the effective household contact $\eta_{kl}$ that one experiences followed an "inverse square root law", i.e., $\eta_{kl}$ is inversely proportional to the square root of the total amount of contact $C_k$ [...]

The sensitivity analysis suggested that the effective household contacts between children may have been lower than the baseline estimates under some assumptions (Figure S1). However, the overall trend did not change substantially. The importance of children introducing influenza into households remained unchanged throughout the sensitivity analysis.

The predicted and observed frequencies of data compared in Figure S2 illustrate [...]

We applied a household-based mathematical model to large-scale influenza survey data including 10,000 primary school students and their families in Matsumoto city, Japan, 2014-15. With the dataset of an extensive sample size on morbidity and [...] influenza outside the household, they were responsible for most secondary transmissions within households. Once they brought the virus from outside the household, their mother and other siblings were exposed to a higher risk of within-household secondary transmission. The estimated breakdown of infection source showed that within-household transmission accounted for a large proportion of the overall risk in adults. The relative importance of within-household transmission was especially highlighted in grandparents in "three-generation" households. In a typical three-generation family composed of two children, two parents and two grandparents, the risk of infection in grandparents was tripled by within-household transmission. Besides, it must be noted that an infection of a grandparent is likely to be followed by that of another due to a high transmission risk between grandparents.
These emphasise the importance of controlling school epidemics and household contagion, as the symptoms of influenza tend to be more severe in the elderly (37-39).

[...] suggest that vaccinating children is an effective strategy not only because their risk of infection is high but also because they are responsible for a substantial fraction of within-household secondary infections. Especially for adults living with many children, protecting children from infection is as important as (or even more important in some cases than) protecting themselves. If one of the household members contracts influenza despite the pre-introduction control effort, the primary target shifts to preventing further transmissions within the household. Household members are now exposed to an infectious person within the same household, which substantially elevates their risk. At [...] every additional infection further increases the exposure. Our findings about household transmission patterns can be used to identify key individuals in the household network. For example, if the primary case is a child, the most probable secondary case is either the mother or another sibling. If the mother gets infected, that may be followed by a transmission to either the father or another child. Direct transmissions between children and father/grandparent may be relatively rare. Grandparents are suggested to be at comparatively low risk from other household members. However, their contacts with each other are closer than any other pair of household members, which warrants attention given the high disease burden of influenza in the elderly.

To the best of our knowledge, the present study is the first to report a parametric relationship between within-household influenza transmission and household composition with high precision.
With a detailed dataset consisting of up to 10,000 households, the present study was able to employ a highly flexible modelling framework to explore previously used modelling assumptions in great detail. A decrease of the per-person risk of within-household infections with household size has been observed in previous studies (8) [...] increased transmission rate in larger households (SITP proportional to $N^{0.7}$; although not conclusive due to the limited sample size). One of the strengths of our results is that not only did we propose a better alternative measure to scale SITP than household size, we also differentiated the model from both density- and frequency-dependent models with sufficient support. The best model suggested that within-household transmission patterns lie half-way between the two extremes of density- and frequency-dependent models (we call this the semi-density-dependent model as the total effective contact experienced by an individual is proportional to the total contact intensity to the power of 0.5). Although a similar approach (without incorporating heterogeneous contact patterns) was employed in (18), where the authors estimated the SITP to be proportional to $N^{1.2}$, their CrI was too wide (0.13-2.3) to be conclusive. The large-scale dataset enabled [...] from the density- and frequency-dependent models. In the semi-density-dependent model, the total amount of effective contact increases in larger households despite the reduced importance of each contact (Figure 4). Therefore, if the risk of external infection is similar between household members, having many household members is a risk factor (which is not usually the case in the frequency-dependent model) because the effect of reduced SITP is outweighed by the increased number of household members who potentially bring infection into the household.
Although such an effect was not clearly visible in the present study due to the almost exclusive primary infections in children (Figure 5), more distinct characteristics may be seen in other epidemic settings with the semi-density-dependent model.

Multiple limitations in the present study must be acknowledged. Firstly, the case definition in the dataset was not very strict. The data were collected by self-written questionnaires and it was impossible to validate the responses. In the dataset, all student cases were reported to have a clinical diagnosis, and more than 95% of diagnoses were based on RDKs (42). Considering that primary school students in Japan are highly motivated to visit medical institutions to obtain a leave of absence from school, we believe that our data were able to capture influenza incidence in primary schools at high [...] diagnosis was not explicitly required for household members on the question sheet, although the term "influenza" rather than "influenza-like illness" was used. Moreover, subclinical infections may have been present both in children and adults. Because of this, we considered underreporting in the sensitivity analysis, which left the main conclusions unaltered. Secondly, our model formulation is only one possible candidate for parameterising within-household transmission patterns. "Contact" in our model was merely a hypothetical quantity and may not be directly related to actual physical or social contacts. We also had to use a relatively simple contact pattern matrix for successful parameter estimation. Although our model successfully explained the current data in an interpretable manner, further development may be sought in the future, including empirical characterisations of household contact patterns, which are currently lacking.
A recent study has suggested possible age-dependency in the contact frequency between siblings (6), but the ages of household members were not available in the current dataset. More informative datasets and a better understanding of age-dependent household contact patterns will yield further clarification on this point. Furthermore, one must be aware that our analysis based on a unique study population, i.e., households with at least one primary school student in Matsumoto city, may not be [...] considered. The risk of external infection in children was estimated as a single value, although it may vary between classes, grades and schools. Overdispersion in infectiousness as addressed in (13,43,44) was also assumed to be negligible. Nonetheless, it is of note that the model had a fairly good performance despite considerable simplification.

Although more follow-up studies that supplement our findings are awaited, we believe that the present study has presented useful insights on the household-level dynamics of influenza. Understanding of the household-specific contact [...]

[...] another. PTR from type $l$ to $k$ is given as $\rho_{kl}$, which refers to the risk of transmission given that the individual of type $l$ is infectious. Households have different compositions and $\rho_{kl}$ may also vary according to the composition. On the other hand, $\varepsilon$ is the risk from outside the household and thus assumed to be identical across households.