Determinants of Influenza Transmission in South East Asia: Insights from a Household Cohort Study in Vietnam

To guide control policies, it is important that the determinants of influenza transmission are fully characterized. Such assessment is complex because the risk of influenza infection is multifaceted and depends both on immunity acquired naturally or via vaccination and on the individual level of exposure to influenza in the community or in the household. Here, we analyse a large household cohort study conducted in 2007–2010 in Vietnam using innovative statistical methods to ascertain in an integrative framework the relative contribution of variables that influence the transmission of seasonal (H1N1, H3N2, B) and pandemic H1N1pdm09 influenza. Influenza infection was diagnosed by haemagglutination-inhibition (HI) antibody assay of paired serum samples. We used a Bayesian data augmentation Markov chain Monte Carlo strategy based on digraphs to reconstruct unobserved chains of transmission in households and estimate transmission parameters. The probability of transmission from an infected individual to another household member was 8% (95% CI, 6%, 10%) on average, and varied with pre-season titers, age and household size. Within households of size 3, the probability of transmission from an infected member to a child with low pre-season HI antibody titers was 27% (95% CI 21%–35%). High pre-season HI titers were protective against infection, with a reduction in the hazard of infection of 59% (95% CI, 44%–71%) and 87% (95% CI, 70%–96%) for intermediate (1∶20–1∶40) and high (≥1∶80) HI titers, respectively. Even after correcting for pre-season HI titers, adults had half the infection risk of children. Twenty six percent (95% CI: 21%, 30%) of infections may be attributed to household transmission. Our results highlight the importance of integrated analysis by influenza sub-type, age and pre-season HI titers in order to infer influenza transmission risks in and outside of the household.


Random digraphs
To model transmission in a household of size N, we consider a random directed graph (digraph) on N vertices labelled 1, …, N (one for each individual in the household), plus an extra vertex labelled C that represents the community. The probability of adding an edge from household member j to household member i is given in equation (2) of the main manuscript. The probability of adding an edge from the community C to household member j is given in equation (1) of the main manuscript. The probability of adding an edge from any household member to the community is zero.
Presence of an edge from the community C to subject i means that subject i is infected. Presence of an edge from subject j to subject i means that subject i is infected if subject j is infected.
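The construction above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a single constant probability for community edges and a single constant probability for household edges, whereas equations (1) and (2) of the main manuscript let these probabilities vary between individuals. The function names and the list-based representation are hypothetical.

```python
import random

def sample_digraph(n, p_community, p_household, rng):
    """Sample the edges of a random digraph for a household of size n.

    community_edges[i] is True if there is an edge C -> i.
    household_edges[j][i] is True if there is an edge j -> i.
    Constant probabilities are an illustrative assumption.
    """
    community_edges = [rng.random() < p_community for _ in range(n)]
    household_edges = [
        [j != i and rng.random() < p_household for i in range(n)]
        for j in range(n)
    ]
    return community_edges, household_edges

def infected_members(community_edges, household_edges):
    """Return the infection status implied by the digraph.

    A member is infected if there is an edge from C, or an edge from
    a member who is infected (reachability from the community).
    """
    n = len(community_edges)
    infected = [bool(e) for e in community_edges]
    stack = [i for i in range(n) if infected[i]]
    while stack:
        j = stack.pop()
        for i in range(n):
            if household_edges[j][i] and not infected[i]:
                infected[i] = True
                stack.append(i)
    return infected
```

For instance, calling `sample_digraph(3, 0.1, 0.08, random.Random(1))` and passing the result to `infected_members` yields one simulated final outcome for a household of size 3; the values 0.1 and 0.08 are arbitrary.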

Final outcome data and augmented digraph
Final outcome data for the household consist of a vector {y_1, …, y_N}, where y_i = 1 if subject i was infected with influenza during the season, y_i = 0 if he/she was not, and y_i = NA if the infection status is unknown.
We augment the data with a random digraph that is consistent with the final outcome data. This augmented digraph is represented by a matrix G with N rows and N+1 columns, whose entries are 0s and 1s. An example of such a matrix for a household of size 3 is as follows:
The matrix G is interpreted as follows.
With these rules, it is straightforward to derive, for a given matrix G, the associated vector of final outcomes for household members x(G). For example, for the digraph presented in the example above, all household members were infected.
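As a hedged illustration of how x(G) could be derived, the sketch below assumes a particular layout that is not confirmed by the text: rows index the recipient i, the first N columns index household sources j (so G[i][j] = 1 means an edge from j to i), and the last column holds edges from the community. The function name and the example matrix are hypothetical.

```python
def final_outcomes(G):
    """Derive the final outcome vector x(G) from an N x (N+1) 0/1 matrix.

    Assumed layout (illustrative only): G[i][j] == 1 for j < N is an
    edge from member j to member i; G[i][N] == 1 is an edge from the
    community to member i.
    """
    n = len(G)
    # Members with an edge from the community are infected.
    x = [1 if G[i][n] else 0 for i in range(n)]
    # Propagate infection along household edges until nothing changes.
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if x[i] == 0 and any(G[i][j] and x[j] for j in range(n)):
                x[i] = 1
                changed = True
    return x

# Hypothetical household of size 3: C -> member 0 -> member 1 -> member 2.
G_example = [
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
x_example = final_outcomes(G_example)
```

Under this assumed layout, a cycle of household edges with no edge from the community leaves everyone uninfected, because infection can only enter the household through C.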

Hierarchical structure of the model
Let θ denote the parameters of the model. The joint distribution of the data, the parameters and the augmented digraph is as follows:
\[ P(y, G, \theta) = P(y \mid G)\, P(G \mid \theta)\, P(\theta) \]
where the first, second and third terms correspond to the observation model, the transmission model and the prior model, respectively.
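The observation model ensures that the augmented digraph G is consistent with the data y:
\[ P(y \mid G) = \begin{cases} 1 & \text{if } x_i(G) = y_i \text{ for every } i \text{ with } y_i \neq \mathrm{NA}, \\ 0 & \text{otherwise.} \end{cases} \]
The transmission and the prior models are described in the methods section of the main text.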
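The consistency requirement of the observation model can be sketched in Python as an indicator function. This is a minimal illustration, assuming the outcome vector x(G) has already been derived from the digraph and that unknown infection status (NA) is encoded as `None`; the function name is hypothetical.

```python
def observation_likelihood(x, y):
    """Return 1.0 if x agrees with y wherever y is observed, else 0.0.

    x: outcomes implied by the augmented digraph (0 or 1 per member).
    y: observed final outcome data (0, 1, or None for NA).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for x_i, y_i in zip(x, y):
        # Members with unknown status place no constraint on the digraph.
        if y_i is not None and x_i != y_i:
            return 0.0
    return 1.0
```

A digraph that implies infection of a member observed as uninfected, or no infection of a member observed as a case, receives likelihood zero and is therefore never retained.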

MCMC algorithm
We developed an MCMC algorithm to explore the joint posterior distribution of parameters and the augmented digraph.
Parameters were updated independently on the log scale with a standard Metropolis-Hastings algorithm. The variance of the proposal was tuned so that the acceptance rate was around 20%.
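A log-scale Metropolis-Hastings update of a single positive parameter can be sketched as follows. This is a generic illustration rather than the code used in the study: the Gaussian random walk on log θ, the function names, and the Exponential(1) target in the usage note are assumptions. The term log(θ'/θ) in the acceptance ratio is the standard correction for proposing on the log scale.

```python
import math
import random

def mh_log_scale_update(theta, log_post, sigma, rng):
    """One random-walk Metropolis-Hastings update on the log scale.

    theta must be positive; log_post returns the log posterior density
    (up to an additive constant); sigma is the proposal standard
    deviation on the log scale.
    """
    proposal = theta * math.exp(sigma * rng.gauss(0.0, 1.0))
    # Target ratio plus the correction for the log-scale random walk.
    log_ratio = (log_post(proposal) - log_post(theta)
                 + math.log(proposal) - math.log(theta))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return theta, False

def run_chain(log_post, theta0, sigma, n_iter, seed=0):
    """Run the sampler and report the empirical acceptance rate."""
    rng = random.Random(seed)
    theta, samples, accepted = theta0, [], 0
    for _ in range(n_iter):
        theta, was_accepted = mh_log_scale_update(theta, log_post, sigma, rng)
        accepted += was_accepted
        samples.append(theta)
    return samples, accepted / n_iter
```

For example, `run_chain(lambda t: -t, 1.0, 1.0, 20000)` samples from an Exponential(1) target. In practice `sigma` would be adjusted over pilot runs until the returned acceptance rate is close to the desired value.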
In practice, as explained in Demiris and O'Neill, exploration of the augmented digraph G can be restricted to the subset of potential cases, i.e. individuals who are or might be cases (y_i = 1 or y_i = NA). This is because: (i) if there is an edge from a case to a non-case, the augmented digraph is inconsistent with the data (and is therefore rejected); (ii) modelling edges from non-cases does not provide any information. This substantially reduces the dimension of the augmented digraphs that need to be explored.
Assume that the household contains n potential cases, made up of n_1 cases (i.e. y_i = 1) and n_NA individuals without a diagnosis (i.e. y_i = NA). We use the following independence sampler to update the digraph:
- For an individual i who was diagnosed as a case (y_i = 1):
  o Draw the number x of edges leading to subject i uniformly in 1, …, n. Note that there are n possible edges leading to subject i: n-1 from other potential cases and 1 from the community.
  o Draw the x edges uniformly among the n possible edges.
- For an individual i who did not have a diagnosis (y_i = NA):
  o Same as for those with a positive diagnosis, except that the number of edges is drawn uniformly in 0, …, n.
The acceptance rate for this step is 26%.
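The proposal step of the independence sampler described above can be sketched in Python as follows. This is an illustrative reading of the description, not the authors' implementation: the function names, the `"C"` label for the community and the dictionary encoding of diagnoses (1 for a case, `None` for NA) are assumptions. The sketch also returns the log proposal probability, which an independence sampler needs for its acceptance ratio.

```python
import math
import random

COMMUNITY = "C"  # hypothetical label for the community vertex

def propose_incoming_edges(i, potential_cases, is_case, rng):
    """Propose the sources of the edges leading to potential case i."""
    # n candidate edges: n-1 from other potential cases, 1 from the community.
    candidates = [j for j in potential_cases if j != i] + [COMMUNITY]
    n = len(candidates)
    # Cases need at least one incoming edge; undiagnosed members may have none.
    low = 1 if is_case else 0
    x = rng.randint(low, n)
    sources = set(rng.sample(candidates, x))
    # Probability of x times probability of the chosen subset of size x.
    log_q = -math.log(n - low + 1) - math.log(math.comb(n, x))
    return sources, log_q

def propose_digraph(status, rng):
    """Propose incoming edges for every potential case in a household.

    status maps each potential case to 1 (diagnosed case) or None (NA).
    """
    potential_cases = sorted(status)
    incoming, total_log_q = {}, 0.0
    for i in potential_cases:
        sources, log_q = propose_incoming_edges(
            i, potential_cases, status[i] == 1, rng)
        incoming[i] = sources
        total_log_q += log_q
    return incoming, total_log_q
```

For example, `propose_digraph({0: 1, 1: 1, 2: None}, random.Random(1))` proposes edges for a household with two diagnosed cases and one undiagnosed member. A proposal drawn this way can still be inconsistent with the data, for instance when no case is linked to the community, in which case the observation model rejects it.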