Neutrality, Cross-Immunity and Subtype Dominance in Avian Influenza Viruses

Avian influenza viruses (AIVs) are considered a threat for their potential to seed human influenza pandemics. Despite their acknowledged importance, there are significant unknowns regarding AIV transmission dynamics in their natural hosts, wild birds. Of particular interest is the difference in subtype dynamics between human and bird populations–in human populations, typically only two or three subtypes cocirculate, while avian populations are capable of simultaneously hosting a multitude of subtypes. One species in particular–ruddy turnstones (Arenaria interpres)–has been found to harbour a very wide range of AIV subtypes, which could make them a key player in the spread of new subtypes in wild bird populations. Very little is known about the mechanisms that drive subtype dynamics in this species, and here we address this gap in our knowledge. Taking advantage of two independent sources of data collected from ruddy turnstones in Delaware Bay, USA, we examine patterns of subtype diversity and dominance at this site. We compare these patterns to those produced by a stochastic, multi-strain transmission model to investigate possible mechanisms that are parsimonious with the observed subtype dynamics. We find, in agreement with earlier experimental work, that subtype differences are unnecessary to replicate the observed dynamics, and that neutrality alone is sufficient. We also evaluate the role of subtype cross-immunity and find that it is not necessary to generate patterns consistent with observations. This work offers new insights into the mechanisms behind subtype diversity and dominance in a species that has the potential to be a key player in AIV dynamics in wild bird populations.


S.1 Data Analysis
To determine the randomness of the datasets in relation to various metrics, we constructed a sample of 1000 random time series for each dataset, with the random time series the same length as the dataset in question. Values for the random time series were drawn from a uniform distribution and could take values on the interval [0, 0.25]. The upper limit was chosen to maintain similarity in scale between the observed prevalence levels in Datasets 1 and 2.

S.2 Model Details
As described in the main text, this model is a multi-site, multi-species, multi-subtype model that details the interactions of three species that use Delaware Bay for some or all of the year. The basic structure for a single subtype version of this model is given in [1]; here we extend that to four subtypes. The assumptions surrounding the model are presented in the main text. S-1

S.2.1 Model Equations
dIR dt = γ 2 II + (1 − ψ 1 )(β 1 I 1 + δρ 1 Here I 1 = IS + II + IR, I 2 = SI + II + RI and δ i = The parameters are described in the following sections. The class names follow the standard SIR format, with S representing susceptibles, I representing infecteds and R representing immune classes. The first letter of each variable represents the infection status for subtype 1, and the second letter represents the infection status for subtype 2 (so, for example, class SI tells us that individuals in this class are susceptible to subtype 1 and infected with subtype 2). Classes V i , i = 1, 2 represent the environmental reservoir for each subtype. For a more thorough description of the single subtype system, see [1].
The parameters described in §S.2.2- §S.2.5 apply to all subtypes, as we assume all subtype parameters are identical in this study. These sections, and table S-1, have previously been published in [1]; the table appeared in the main text and the sections appeared in the Electronic Supporting Information.

S.2.2 Transmission parameters
Transmission within species is denoted by β ii , i = u, r, m and between species as β ij for i = j, i, j = u, r, m. We assume that the transmission between species (β mr , β rm , β ur and β ru ) is the mean of the average direct transmission rates (e.g. β mr = β rm = 1 2 (β m 0 + β r 0 )) within both species. Between-species transmission rates are non-zero only when the migrating species are present in Delaware Bay and interacting with the resident ducks. These periods of time are given by the migration parameters in Section S.2.3. The resulting transmission matrix is We assume perfect mixing between resident ducks and either of the species they interact with; for migratory ducks this is clear from the similarity of their habitats and behaviours. Between resident ducks and ruddy turnstones, we base this on research that has shown a variety of habitats are important to shorebird species [2], offering them the opportunity to interact with other species.
The transmission parameters for within-species transmission are given by with D + denoting the number of days of high transmission and D − denoting the number of days of low transmission, i = u, r, m. The denominator of (S3) ensures that the average of β ii ,β ii , is equal to β i 0 . The 'Breeding season' parameters for the migrating and resident ducks are presented below.
We use the 'M ixing u ' function (given in Section S.2.3) in place of a 'Breeding u ' function in the ruddy turnstone's transmission term -this is the period of time ruddy turnstones are present in Delaware Bay and is given in (S7).

S.2.3 Migration parameters
The parameters determining the location of each species at each point in time (i.e. relating to migration patterns) are given below; the 'M ixing' and 'M ixing u ' parameters give the times the migrating ducks and ruddy turnstones respectively are in Delaware Bay. The 'M ixing u ' parameter is also used to calculate the within-species direct transmission rate for ruddy turnstones (as described in Section S.2.2). The 'Summer' and 'W intering' parameters give the periods when the ruddy turnstones are on their summer and winter grounds.
The shedding and consumption rates for ruddy turnstones and migrating ducks for each location they visit are also defined using the above parameters (as the above parameters determine which location each species in for every point in time), so for example shedding by the migrating ducks in Delaware Bay, ω mdb , uses the M ixing(t) parameter -ω mdb = ω m M ixing(t) (see Table 1 for ω m ; other shedding and consumption rate parameters are constructed similarly).

S.2.4 Hatching and mortality parameters
The hatching parameter for each species is given below. The superscripts u, r, m denote ruddy turnstones, resident ducks and migrating ducks respectively. These parameters (in the case of hatching) define when hatching occurs in the model -i.e. the periods of time when it is non-zero.
In the case of the mortality parameters, the 'Death i (t)' (i = m, r) give the step functions used for the mortality rates in both duck species (so µ m = Death m (t); µ r = Death r (t)).
The seasonal mortality rates for the duck species (as a result of hunting [3,4,5]) are given below. Subscripts r, m denote resident and migratory ducks respectively. (S14)

S.2.5 Virus durability parameters
The persistence terms for virus in the environment (these are defined as η k , k = b, db, a) are time-dependent (as they rely on temperature [6]), taking a 'winter' value and a 'summer' value in Delaware Bay and the breeding grounds for both migrating ducks and ruddy turnstones. They are calculated from weather data in the different locations [7,8,9] and using [6] to convert this to virus persistence in the environment. The virus persistence at the ruddy turnstones' wintering ground (assuming to be coastal Brazil [10]) is taken to constant (η w = 167.9, superscript w represents wintering ground) throughout the year as the region has very little variation in temperature [11]. The functions for persistence rate of virus at the duck breeding grounds, Delaware Bay and the ruddy turnstone breeding grounds (denoted by superscripts b, db and t respectively) are and η t (t) = 10 if 0.33 < t < 0.83, 1.6 otherwise (S17) respectively.

S.3 Barycentric coordinate system
The Barycentric coordinate system was invented by Möbius in 1827 [20]. It provides an alternative coordinate system, based on a simplex (of any dimension), that allows us to consider data in a different way. In this coordinate system each vertex represents a particular quantity. Residing at any of the vertices shows absolute domination by that quantity, with the other quantities equal to zero. Through normalisation, each vertex has coordinate value one for that quantity and zero for the other quantities represented. The Barycentre is the centre of the simplex and represents equality between all quantities, thereby having coordinate value of 1/3 for each quantity. The easiest simplex to consider (for explanation purposes) is the 2-D simplex, or triangle, which is a representation of three quantities. For quantities q 1 , q 2 , q 3 , the Barycentric coordinates are thus ( q 1 q 1 +q 2 +q 3 , q 2 q 1 +q 2 +q 3 , q 3 q 1 +q 2 +q 3 ). Figure S1 shows the idea of Barycentric coordinates, in 2-D, with the Barycentre marked.

S.4 Further explanatory figures
We provide a number of explanatory figures to expand on the model results presented in the main text. Initially we consider the average number of strains present (over 10 simulations) for each parameter set tested (c.f. figure 4(a) and (b)). This results offers greater insight into why not all parameter sets provide estimates in figure 4 -it is that the number of strains present (on average) is simply too small to allow the metric in question to be computed.
While we present the results of the best fit parameter set for the metrics change in rank and rank abundance in the main text (figure 4(c) and (d)), in figure S5 we show some sample prevalence curves, taken from one simulation using the best fit set of parameters. In particular, the bottom panel shows the prevalence curves for ruddy turnstones, and illustrates the different trajectories of each subtype modelled. Figure 4 in the main text shows the combined results for calculating the sum of squared errors for both change in rank and rank abundance. Figures S6 and S7 show these results separately, with S6 showing the results from the change in rank metric, and S7 showing the results from the rank abundance metric. We find that the best fit parameter set for change in rank when varying the ruddy turnstone direct transmission rate and cross-immunity gives a very dissimilar result to that in the data, but that the results are much better when using the best fit parameter set from varying consumption rate and cross-immunity. Where the rank-abundance curves are concerned, both scenarios are capable of creating a curve similar to that seen in the data. When considering these individual fits, we find that in three cases, some cross-immunity is indicated (in two of the three cases, this is for a value of ψ = 0.2). However, when fitting both metrics together (as we do in the main text), we find that the set of parameters that give the best fit to both metrics simultaneously include no cross-immunity. As an extension to the autocorrelation plot (figure 5), in figure S8 we present the lagged scatterplots for both datasets. We calculate the correlation coefficient for each scatterplot and find that none are significant (thereby supporting the result in figure 5, where none of the lags cross the 95% CI).

S.5 Sensitivity analysis
To conduct a sensitivity analysis on the model, we created two scenarios -one of either 'high transmission potential' or 'low transmission potential', using the best fit parameter set predicted in the main text. In the high transmission potential scenario we increased direct transmission and consumption rates (in all species) and viral persistence in the environment by 10% and decreased the duration of immunity and recovery rates (in all species) by 10%. In the low transmission potential scenario, the increased rates were instead decreased by 10%, and the decreased rates were instead increased. The results are shown in figure S9, and we find that, in both transmission potentials, the change in rank curves follow the same basic shape. In the case of low transmission potential, the change in rank takes a somewhat higher value for ranks 1 and 4 than observed in the best fit model results, albeit close to the value given by dataset 2 for rank 1. The mean rank abundance curves are both shallower than that from the best fit model results, although (in the case of low transmission potential) there is a lot of variation between individual curves.
Together, this sensitivity analysis shows that the model has some sensitivity to the parameter set used, but is still able to recreate the change in rank patterns shown in the main text. The rank abundance curve is more variable, but (in the case of low transmission potential) maintains  Figure S3. Sample prevalence curves from the model for the best fit parameter set. The panels (from top to bottom) show prevalence curves for all four subtypes in migrating ducks, resident ducks and ruddy turnstones (RUTUs).  Figure S4. Panels (a) and (b) show the normalised SSEs for varying cross-immunity and either the ruddy turnstone transmission rate (a) or the consumption rate (b) against the metric absolute change in rank vs. rank. Panels (c) and (d) give the absolute change in rank vs rank curves for the best fit SSE parameters for either (c) ruddy turnstone direct transmission rate β u or (d) consumption rate ρ. These occur at either a medium level of cross-immunity (ψ = 0.5) and consumption rate (ρ = 13804 * 10 −8 ), or with low values for cross-immunity (ψ = 0) and direct transmission rate (β u = 0.005).   Figure S7. Change in rank ((a) and (b)) and rank abundance ((c) and (d)) plots for both high transmission potential ((a) and (c)) and low transmission potential ((b) and (d)).
a shape very similar to that seen in figure 4. Furthermore, the parameter sets modelled here represent a real challenge to the model -we varied multiple parameters at once, in all subtypes, that overall create a very different situation to that in the main text. That the model is capable of providing broadly similar results under these conditions supports its use in this work.