Stochastic colonization of hosts with a finite lifespan can drive individual host microbes out of equilibrium

Macroorganisms are inhabited by microbial communities that often change through the lifespan of an individual. One of the factors contributing to this change is colonization from the environment. The colonization of initially microbe-free hosts is particularly interesting, as their microbiome depends entirely on microbes of external origin. We present a mathematical model of this process with a particular emphasis on the effect of ecological drift and a finite host lifespan. Our results indicate the host lifespan becomes especially relevant for short-living organisms (e.g. Caenorhabditis elegans, Drosophila melanogaster, and Danio rerio). In this case, alternative microbiome states (often called enterotypes), the coexistence of microbe-free and colonized hosts, and a reduced probability of colonization can be observed in our model. These results unify multiple reported observations around colonization and suggest that no selective or deterministic drivers are necessary to explain them.


Introduction
Microbial communities inhabit every available habitat on this planet, including the tissues of macroorganisms. For such host-associated communities every host animal constitutes a We consider a population of hosts that is sufficiently large to draw statistical conclusions. The microbial community in each host grows dynamically, but with a fixed maximum capacity N. To make this more precise, let n i be the number of individuals of the i-th microbial taxon within a host (i � 1) and M be the number of taxa. At any time we have P M i¼1 n i � N. We reserve the index i = 0 for the unoccupied space, namely n 0 ¼ N À P M i¼1 n i . We define x i = n i / N as the frequency of the i-th taxon within a host and assume N � 1, such that x i is continuous and N − 1 � N. Note, that x 0 then denotes the fraction of available space within a host. We assume the death of hosts can be approximated as an event occurring each time step with probability τ, given by the probability of host death-birth events per microbial death-birth event. The limiting case τ = 0 corresponds to infinitely living hosts (as in [3,7]), while τ = 1 corresponds to hosts whose lifespan is as short as the average lifespan of a microbe, leading to almost entirely empty hosts.
Let us focus on the events within a single host. In each time step, a randomly selected site is changed. This site is either unoccupied space or a microbe. Death is followed by replacement via immigration or birth of a new type. With probability m, its content is replaced by a random microbe from the environment, selected proportionally to its frequency in the pool of colonizers, p i (note that p 0 = 0). With probability 1 − m, it is replaced by a microbe from the same host, selected proportionally to the fitness (1 + α i )x i of the reproducing microbe-or it is replaced by unoccupied space with probability proportional to (1 + α 0 )x 0 . The fitness parameter α i describes deviations from strict neutrality, where proliferation of microbe i is promoted (α i > 0) or impeded (α i < 0). The parameter α 0 controls how rapidly unoccupied space within a host is filled with microbes. This determines the level of resistance a host poses to be occupied by microbes, or in other words, how favourable the host environment is for microbial reproduction and persistence. For α 0 > 0, hosts pose an increased resistance to the internal microbes, while α 0 < 0 decreases such resistance. The acceptable range of α i and α 0 ranges

PLOS COMPUTATIONAL BIOLOGY
from −1 to infinity. The resulting stochastic process for a given host can be described by the probabilities of events after one time step, where Eq (1a) describes the probability of a host death event: All microbial frequencies are set to zero, i.e. x i ! 0 for i � 1. At the same time, a new empty host arises, corresponding to x 0 ! 1. This is captured by δ i,0 , the Kronecker delta (1 for i = 0 and 0 otherwise). The three other probabilities require that the host survives, which occurs with probability 1−τ. For a microbial taxon i, Eq (1b) describes the probability of increase by immigration or reproduction within the host, and Eq (1c) describes the probability of decrease derived from other taxa immigration, reproduction within the host, or their inability to reproduce. For i = 0, Eqs (1b) and (1c) describe the probability of increasing and decreasing the unoccupied space, respectively. Finally, Eq (1d) indicates the probability of no change. Focusing on the effect of ecological drift we fix the microbial fitness α i = 0 (for i � 1) for the remainder of the manuscript. Probabilities in Eq (1) change considerably through time. For example, because hosts are largely empty at birth, unoccupied space decreases rapidly as For τ = 0 the probabilities are as in Sloan et al.'s [3], which becomes a good approximation when the time scale of reproduction on the microbial level is much faster than the time scale of reproduction on the host level. We focus on the dynamics of the probability density of can be different for all microbial taxa i. This can be approximated in the large N limit by a Fokker-Planck equation (see S1 Appendix), with t being measured in the number of microbial death-birth events. Writing down the equations for unoccupied space x 0 and microbes separately we have where a i [x i ] describes the deterministic part of the change and b 2 i ½x i � describes changes due to randomness [16]. The term a i [x i ] is calculated as the first moment of Δx i , the expectation hΔx i i, The term b 2 i ½x i � is calculated as the second moment of Δx i , the expectation h(Δx i ) 2 i, For τ ! 0, the last terms in Eq (2) vanish, recovering the usual Fokker-Planck equation of the neutral model without host death [3], while for τ > 0 these additional terms describe the change due to host death, where a new, microbe-free host appears.
Although individual hosts constantly change their microbiome through the process of microbial death birth-immigration and host death, the collection of transient host states becomes stationary at the population level. This stationary distribution is found setting the time derivative of Eq (2) equal to zero, The Fokker-Planck approximation has several benefits: It provides an intuition of the stochastic process at the population level and the effect of host death (τ), a direct connection to models not considering finite host lifespans [3], and the possibility to frame the process in the broader stochastic processes literature [16].
An alternative interpretation of the stochastic process is provided by [17] where F i [x i ] results from considering all the possible distributions of the time-dependent death-birth process of microbes without host dynamics, F i [x i , t r ]| τ = 0 , influenced by the distribution of death-birth time of hosts, C[t r ]. The distribution of these resetting events is given by This equation will help us to compare our model and individual-based simulations. Now we aim to solve Eq (5), where a major challenge arises from the additional terms capturing the host death-birth events, which correspond to a resetting of the local microbial community. Such resetting events are often referred to as "catastrophes" in the Mathematics literature and research has focused on finding closed form solutions of the corresponding discrete problem derived from the master equation using first order transition probabilities [18][19][20]. In physics, this is called diffusion-drift with resetting and its Fokker-Planck approximation and zero order transition probabilities have been used to find closed form solutions and compute quantities of interest [17,21]. Our model considers density-dependent transition probabilities, i.e. second order effects. Although these provide a well defined system at the boundaries x i = {0, 1}, they complicate finding a closed form solution of F i [x i ] tremendously. Approximating the solutions numerically using the finite differences and finite element methods [22] is possible.
We solved this equation numerically to query the parameter space [22]. However, we found our implementation could lead to numerical errors that were large and inconsistent in some cases, especially as τ ! 0. As it proved more robust numerically (S1, S2 and S3 Figs), we used the master equation (see S1 Appendix) to produce our figures instead, where DF i is the change of the distribution during one time step. In this case the distribution at a given time is represented by the vectorF i ½x i ; t�, whose entries correspond to the probability densities of x i 2 {0, 1/N, 2/N, . . ., 1}. Upon multiplying by the matrix of transition probabilities, T i , the time change of the distribution is obtained. Because only transitions are considered, the main diagonal of T i equals zero, while the upper and lower diagonals equal Eqs (1b) and (1c), respectively. Host death is reflected in additional non-zero probabilities, τ, at the first column for microbial taxa (i � 1) or last column for unoccupied space (i = 0). The nontrivial stationary distributionF i ½x i � occurs for DF i ½x i ; t þ 1� ¼0, corresponding to the eigenvector of T i with eigenvalue zero. We used this method to compute the stationary distribution in Python 3.6.
If numerical problems emerged solving Eq (7), we focused on Here R i , the probability matrix, is identical to T i , except at the main diagonal where it equals Eq (1d). The stationary distribution corresponds to the eigenvector of R i with eigenvalue one.

Stochastic simulations
To study the transient dynamics of colonization and test our stationary estimation, we performed individual-based simulations. These were performed for 500 hosts, N = 10 4 , two equally abundant microbial taxa in the pool of colonizers, p 1 = p 2 = 0.5, and initially sterile hosts (x 0 = 1 and x 1 = x 2 = 0 as initial condition). We varied the values of migration (m) and rate of occupation of empty space (α 0 ).

Difference between models
To compare models considering finite (τ > 0) and infinite host lifespan (τ = 0), we calculated the total difference between their stationary distributions, This difference, ranging from 0 to 1, will equal zero only if for all x i , the two distributions are identical,

Probability of microbe-free, colonized and fully-colonized hosts
To analyse when a particular microbial taxon will not be observed in a host, i.e. its probability of non-colonization, we calculated where 1/N is the minimum observation limit and P[ On the other hand, to analyse when a particular microbial taxon will fully occupy a host, we calculated where NÀ 1 N is the maximum observation limit, and P[ is the combined probability of partial and non-colonization.
Finally, the quantities P[x 0 < 1/N] and P[x 0 > (N − 1)/N], indicate the probability of hosts full of microbes and the probability of hosts free of microbes, respectively.

Alternative microbiome states
To assess the modality of the distribution F i [x i ], i.e. alternative microbiome states, we identified the maxima of the distribution of its numerical solution for varying parameters. The distribution can be unimodal, with the maximum located at one of the boundaries or between them, x i ¼ f0; x � i ; 1g, or bimodal, by a combination of the former. We classified these states and calculated the magnitude of their maxima.

Comparison between the model and simulated data
In order to evaluate our model, we compared it to stochastic simulations (S1, S2 and S3 Figs). As mentioned above, we simulated hosts individually. However, our model provides a population description for overlapping generations. Therefore, we sampled single time steps of the colonization trajectories according to Eq (6), which indicates the probability of a host deathbirth event through time. The distribution of the simulated sampled set was then compared to our theoretical model predictions.

Code availibility
The Python code for simulations, numerical solution of the model and figures is available at https://github.com/romanzapien/microbiome-hostspan.

The dynamics of colonization affects the microbiome of finite-living hosts, but not of infinite-living habitats
The formation of a microbiome goes through several stages. Analytically, much of the focus has been on its long-term equilibrium, assuming hosts with infinite lifespan. Much less is known about the transient stage. Fig 2 shows two illustrative individual-based simulations, where hosts are colonized by two neutral microbial taxa, going from a microbe-free to a microbe-occupied state. The dynamics is qualitatively different depending on α 0 : For α 0 = 0, the host is colonized by the two microbes at the same time, leading to a unimodal distribution that is similar to the long term equilibrium even during the transient. For α 0 < 0 empty space is occupied more rapidly compared to the dynamics between microbes. This leads to a situation where one microbial strain dominates the host until the host is fully colonized, leading to a bimodal distribution in the colonization of hosts. Only on a much longer timescale, this distribution is replaced by the unimodal distribution characteristic for the long term equilibrium.
Given a low rate of external colonization (m ! 0), the time required for full colonization will be shorter than that to reach the long-term equilibrium. Such difference will increase even further for rapid colonization, α 0 < 0. When considering a finite host lifespan (τ > 0), this difference in time-scales will influence the expected microbiome composition. Interestingly, for shorter lifespans, the host population might be multimodal and only partially colonized ( Fig 2B). Moreover, for sufficiently small external colonization and short host lifespan, coexistence of colonized and microbe-free individuals is expected (S4 Fig).
From a microbial point of view, the results shown here occur in a completely neutral context. They can also be generalized to cases with many microbial taxa. A non-neutral dynamics of the microbes (α i 6 ¼ 0) will modify the stationary distribution, i.e. they will not only depend on the frequency in the pool of colonizers (p i ) and host lifespan (via τ). Instead, asymmetries of the multimodality and differential colonization are expected once α i 6 ¼ 0 is assumed.

A short host lifespan influences the microbiome
We quantified the change of the stationary distribution caused by a finite host lifespan systematically. Using the stationary distribution of the frequency, F i [x i ], we compared the predictions assuming hosts with infinite lifespan (τ = 0) against those with hosts with finite lifespan (τ > 0). Such comparisons were done for multiple migration probabilities (m), frequencies in the pool of colonizers (p i ), and rates of empty space occupation (α 0 ). As explained in the Methods, Eq (8), we express the results as the difference between the stationary distributions.
Figs 3 and 4 show the results of the microbial load (total microbial frequency) and frequency of a particular microbe, respectively. Within the range of m and τ analysed, the difference is always greater than zero, indicating the importance of τ in our model and the predictions arising from it. Only for τ ! 0, full agreement is expected.
Regarding the microbial load, infinitely living hosts (τ = 0) provide enough time for them to be fully colonized and for the distribution of microbes to reach an equilibrium. In contrast, a finite lifespan (τ > 0) might not allow full colonization before host death. For a slow occupation of empty space (α 0 = 0) the difference increases with shorter lifespan (large τ) and reduced migration (small m), Fig 3A. In this case, the model with τ = 0 predicts a distribution centered at frequency 1 decaying towards 0, while the model with τ > 0 predicts a sharp maximum centered at frequency 0 decaying towards 1. In contrast, rapid occupation of empty space (α 0 < 0) causes the difference to decrease and to become increasingly independent of m ( Fig 3B). This occurs because the time for colonization, i.e. host lifespan, becomes more relevant than migration, as successful migrants are increasingly likely to proliferate within hosts.
For a specific microbial taxon, infinitely living hosts (τ = 0) allow the frequency in the hosts to reach that in the pool of colonizing microbes (p i ). However, a restricted, finite lifespan (τ > 0) might not allow to reach this value. In our model, the relevance of τ increases with its magnitude, but not independently of m. The maximum difference between the two distributions occurs for short lifespan (large τ) and large migration (larger m) as p i ! 0 (Fig 4B and 4C). In this region, the model with τ = 0 predicts a distribution centered at x i � p i , while the model with τ > 0 predicts a distribution centered at x i = 0 decaying towards 1. Finally, for a single colonizing taxon (p i = 1, Fig 4A) the difference increases analogously to Fig 3A, i.e. the difference increases for smaller migration and shorter lifespan.

Microbe-free, colonized hosts, and their coexistence are expected
A major consequence of a host finite lifespan is the coexistence of hosts with various degrees of colonization, including microbe-free hosts. We calculated the probability of full colonization in the stationary distribution, i.e. P[x 0 < 1/N] (Eq (9)), for different parameters given a certain capacity for microbes (N). Fig 5 shows the effect of m, τ, and α 0 on the probability of full colonization. Different parameter combinations can result in the same probability of full colonization. Partial colonization is the most likely state for short host lifespans (large τ). Only for long living hosts (small τ), both death probability τ and migration m are important, with m having a larger impact on the distribution when it is larger (Fig 5A). Finally, a faster occupation of empty space (α 0 < 0) makes the probability of full colonization less dependent on m and increases it for shorter living hosts (larger τ), i.e. the coexistence with partially colonized hosts becomes less likely ( Fig  5B).  A-B) The difference between models with finite (τ > 0) and infinite (τ = 0) host lifespan is shown, Eq (8). (A) For a slow occupation of empty space, the difference is maximal for small migration (m) and large τ as the model with τ = 0 predicts a distribution centred at frequency 1 decaying towards 0, whereas the model with τ > 0 predicts a distribution centred at frequency 0 decaying towards 1. For a fixed τ the difference is always greater for smaller m. Only for t≳10 À 4 the difference is maximal and independent of m. Finally, a smaller τ always approximates the models; nonetheless within the range analysed the difference is always greater than zero. (B) A faster occupation of empty space decreases the difference and makes it increasingly independent of m, as τ dominates the predictions of the model. (C-D) The distributions are classified according to their number of maxima (unimodal or bimodal) and location (0 and 1). (C) A slow occupation of empty space results in microbe-free hosts being the maximum for short host lifespans (large τ), fully colonized hosts for large migration (m) and small τ, or microbe-free and microbe-occupied hosts simultaneously for small m and τ. The bimodality results from a limited migration preventing all the hosts from being colonized but over a host lifespan sufficient for successful colonizers to occupy host fully. (D) A faster occupation of empty space increases the bimodality region at the expense of the unimodal cases. In this case, α 0 ! −1 favours the microbe-occupied maximum. When classifying the distributions, any probability smaller than 10 −9 was considered as zero. Other parameters: N = 10 4 . We use Eq (5a) where no definition of p i and α i is required.
https://doi.org/10.1371/journal.pcbi.1008392.g003 As shown by our calculations, S6 Fig, we argue that even microbe-free hosts might not be an experimental artefact, but an inherent outcome of the host colonization process in some host-microbiome systems [23,24], even under neutral (i.e. non-selective) conditions [8]. This might be evident for short living hosts, but less so for longer lifespans. In such case, its experimental observation might be possible only for large samples of hosts.

Rapid proliferation of the first colonizer can result in alternative microbiome states
We have noted previously the existence of multimodal distributions in the transient colonization, and how these prevail in the stationary distribution due to the finite lifespan of hosts (Fig  2). A particular microbial taxon might either succeed or fail to colonize a host, leading to the coexistence of hosts with alternative microbiome states. Moreover, in specific cases all possible microbes could succeed or fail to colonize a host, allowing the coexistence of microbe-free and microbe-occupied hosts. These extremes can have similar or different magnitudes, as shown in Fig 5 and S6 Fig.  Fig 3C and 3D shows the stationary distribution of microbial load for different rates of empty space occupation, α 0 . Firstly, a large host death-birth probability (τ) causes hosts to be rarely colonized; hence most remain microbe-free, so x 0 = 1 is the only maximum. Secondly, a large migration (m) and small τ provides enough time for hosts to be fully colonized, so x 0 = 0 is the only maximum. Finally, the processes of limited migration and long host lifespan combine to define a region where bimodality is expected (Fig 3C). The magnitude of the maxima and region of bimodality are influenced by α 0 (Fig 3D), with α 0 ! −1 favouring the microbeoccupied over the microbe-free state (Fig 5 and S6 Fig).
Similarly, Fig 4D-4F shows the stationary distribution for various frequencies of a microbial taxon in the pool of colonizers (p i ) and α 0 = 0. A qualitative description of the complete distributions (see S7 Fig) is shown. Again, bimodality is expected for small m and large τ. Many microbes do not colonize, but successful colonizers proliferate to occupy hosts entirely. The bimodality region is shaped by p i . A single colonizer (p i = 1, Fig 4D) mirrors Fig 3C. In contrast, p i < 1 has the effect of vanishing the bimodality if m or τ are larger (Fig 4E and 4F). Outside this region, a large τ causes most hosts to be microbe-free, so x i = 0 is the only maximum. However, a larger m and smaller τ make x i = 1 the single maximum if p i = 1, or an internal maximum if p i < 1. Finally, the split into alternative states might be reinforced if empty space is occupied more rapidly, α 0 < 0 (Fig 2 and S4 Fig). This results from a limited migration and rapid proliferation of the first colonizer. Although the alternative states could be transient for long-living hosts, they might persist for short-living ones.

By reducing the colonization probability, the finite host lifespan makes the core microbiome context-dependent
Previous research has focused on defining the set of microbial taxa consistently observed in a given host species. This is often called the core microbiome. In our model, stochastic colonization reduces the probability of observing a taxon in all hosts (Fig 6). Importantly, this is not caused by any kind of selection or competition, but by migration (m), time for colonization (via τ), capacity for microbes (N), and the frequency of a colonizing taxon (p i ) alone. Fig 6  shows the probability of observing a microbial taxon within a host, P[x i � 1/N], for different values of m, τ, and a fixed N. For the values of p i shown, the contour lines increasingly depend on τ for larger τ. Successful colonization is more prevalent whenever m is larger and τ smaller, for microbes down to a frequency of p i = 0.1. Nonetheless, even a single colonizing taxon could not consistently be observed for some m and τ (Fig 6A and S8 Fig). Finally, a smaller microbial frequency in the pool of colonizers (p i ) reduces the overall colonization probability (Fig 6B and 6C, smaller values are shown in S8 Fig).
These results suggest that under neutral dynamics, the observed frequency of microbes within hosts, i.e. the colonization probability, cannot be universally used to define a core microbiome, as the frequency of readily colonizing taxa depends on host and microbial features.

Discussion
Although microbes are ubiquitous in nature [25], including the human body [26], it remains to be answered which microbes not only transit from the environment to the hosts but also persist in or on them. Our understanding of these processes relies on identifying the factors underlying host colonization.
We have introduced a stochastic model along the lines suggested by [27], where migration and death-birth processes of microbes within hosts with finite lifespans can produce a range of colonization dynamics and distinctly different microbiomes-even when there is no selection at all (Fig 1). A key assumption in our model is the absence of inheritance of microbes [28], as hosts are colonized after birth from the environment only. In this context, the microbiome is driven by the frequency in the pool of colonizers. This frequency (which is constant in our model) does not need to be the frequency of an environmental microbe, but can more generally be a function of it. Several organisms including D. rerio [29], C. elegans [8], and D. melanogaster [24] might be colonized from the environment only. Others have weak inheritance [30], or might be microbe-free prior to birth, like humans [15]. Many host species will also inherit their microbes from their parents.
Critical to colonization in our model is the magnitude of the microbial migration from the environment to the hosts (m) [27]. As observed in the gut of D. rerio [31], microbial migration could overwhelm other host selective and non-selective processes. In addition, we have combined the host lifespan with a constant microbial cell doubling time [32] to define τ as the parameter of timescale separation between hosts and microbes. This serves as an indicator of the relevance of a host population dynamics for the microbiome dynamics. In agreement with [29], we observe that a limited migration imposes a bottleneck on the colonizers, which combined with a finite host lifespan might produce complicated colonization patterns (Fig 2 and  S4 Fig). The parameters m and τ have allowed us not only to classify the stationary colonization distributions (Figs 3 and 4), but also to quantify the relevance of the finite host lifespan in our model (Figs 3 and 4).
The parameters m and τ can be inferred from data. Alternatively, prior knowledge of the host lifestyle can give us intuition. For example, given the short lifespan of C. elegans a large τ is expected; while its feeding mechanism might pose a bottleneck, suggesting a small m. In principle, m can range from 0 (no environmental microbes going in) to 1 (only external migration and no internal reproduction). This range is spanned by previous studies that estimated this parameter for multiple species [5][6][7].
Sloan et al. [3] developed a neutral model to estimate the equilibrium distribution of a microbiome in an infinite-living habitat. Several studies have fit this model to data of different host species [5][6][7]. However, based on our results for hosts with varying lifespans, we predict that Sloan et al.'s model will perform poorly for hosts with short lifespans, e.g. D. rerio, D. melanogaster, and C. elegans, impeding comparisons of neutrality between host species (Figs 3  and 4). On top of that, the average microbiome of all sampled hosts might be a transient state, not the long-term equilibrium that is assumed when fitting the model. These problems are expected to be even more pronounced for low frequency microbial taxa (Fig 6 and S8 Fig), and small host populations samples.
As going from a microbe-free to a colonized state might affect the expected stationary distribution in hosts with finite lifespans, we included the occupation of empty space by microbes in our model explicitly. Then, we computed the probabilities of observing microbe-free (S6 Fig), fully colonized hosts (Fig 5), and their coexistence (Fig 3). Interestingly, there is building evidence of individuals with microbe-free guts coexisting in D. melanogaster [24], C. elegans [8], and caterpillars where a microbe-free state might be prevalent [23]-supporting our results. We argue that in such host species, both a low microbial migration and short host lifespan might be causative [33].
We have also observed alternative microbiome states. In other words, subsets of hosts whose microbiome is dominated by different microbial taxa (Fig 2). Our results suggest this might occur for low microbial migration and short host lifespan (Fig 4). Recently, [8] have observed alternative microbiome states occurring in C. elegans when this is colonized by two neutral Escherichia coli strains. The implications of our results go beyond colonization, as they predict priority effects [34], life history [35], and timing to be important conditions for any host control mechanism. Furthermore, we provide a generative process for the emergence of different microbiome states in the gut [36], that does not rely on selection, interaction networks or environmental change [37,38]. Our results support the current view that the enterotypes often discussed are indeed states contained in a continuum of colonization [39].
Finally, we have addressed the issue of identifying a core microbiome. In contrast to the present interest on identifying this subset of microbes [40], we argue that intrinsic features of the colonization process might impede finding a consistent subset. Specifically if the observed frequency within hosts is the criterion (Fig 6). More informative, however, would be distinguishing potential from factual colonizers, with members of the latter depending on the context where the colonization happens. We stress the relevance of regarding the colonization and coexistence ahead of the coevolution of hosts and microbes. Let alone, their organismic nature and implications [41,42].
As a consequence of the neutral assumption (fitness α i = 0 in Eq (1) for i � 1), our results extend to microbiomes with an arbitrary number of taxa. Although we first illustrate the process with two of them (Fig 2), analogously to [8], we move on to focus on the perspective of a single taxon (x i in Eqs (3b), (4b) and (5b)). In this view, the collection of other taxa can be arbitrarily complicated. This is particularly important in conditions leading to alternative microbiome states, where the frequency in the pool of colonizers, p i , becomes extremely relevant. While symmetric p i across taxa will result in as many alternative states as taxa, asymmetries will make those with larger p i appear more prominent, giving the impression of a reduced number of alternative states [39].
Future empirical work could focus on characterizing the prevalence of effects associated with the short lifespan-slow immigration regime (Fig 2). Although this depends on the timescale of the microbial dynamics also (resulting from the quality of the host as a habitat), host life-history might provide direction (Fig 1B). For example, a short lifespan together with a reduced amount of microbes reaching the gut, indicate the potential of observing such regime in nematodes [8] and some insects [23,24,33]. Moreover, different tissues within a host might provide different conditions. Other hosts might be subtler. As our model indicates, different life-histories might lead to similar results (contours in . We have presented a minimal neutral model. More complex processes could build upon it. Among others, the influence of the prenatal microbiome on the dynamics and stationary distribution in a neutral context is largely unknown [9][10][11]. Additionally, after an initial stochastic assembly, hosts might actively influence their microbiome via immunity and development [14]. This might have general or taxa specific effects. Particularly relevant as well, is the role that first colonizers (Fig 2) might have in modifying the internal host, influencing the arrival of upcoming microbes [43]. This could reinforce the difference between alternative microbiome states, at taxonomic and functional levels. Finally, as reported in some hosts [44], non-smooth changes of the microbiome could occur. These changes, of intrinsic (e.g. microbial succession [43], host and metabolic rhythms [45]) or extrinsic (e.g. diet change [43], disease, and antibiotics [46]) origin might be more akin to a Lévy walk [16].
Although previous models have studied signatures of ecological neutrality and selection in microbiome data [47,48], as well as its evolution [9,49], they have not described the ecological effects that we have described here. We share Roughgarden et al.'s [50] view that an eco-evolutionary approach is needed, but our results emphasize that colonization in a neutral context might already be sufficient to unify important and disconnected experimental observations, often implictly attributed to selection. Non-neutral processes might then build on top of such patterns.

Conclusion
We have introduced a stochastic model of the colonization of microbe-free hosts. After considering the environmental colonization and finite lifespan of hosts, our model recapitulates patterns reported experimentally. Namely, the coexistence of microbe-free and partially colonized hosts, as well as alternative microbiome states; both depending especially on the host lifespan. Crucially, our observations occur under non-selective conditions at the level of microbes or hosts. The model and results presented here aim to provide a null model for studying the hostmicrobiome formation by assuming the neutrality of microbial taxa-without ruling out that also selection will be important for these processes in nature. But even in the absence of any selective differences, our model explains a wide range of recent observations in microbiomes, from the observation of non-colonized hosts to alternative microbiome states.  (10). Lines show the model prediction, while triangles show the average over the steady state of 500 host samples according to Eq (6). The match spans several magnitude orders of migration (m) and probability of host death-birth events (τ). The probability increases for shorter host lifespans (larger τ) and less migration to the hosts (smaller m). The rate of occupation of empty space (α 0 ) has a larger effect on cases where migration is limited and the host lifespan is long (small τ). Simulations were computed as explained in the Methods. Other parameters: N = 10 4 . We use Eq (5a) where no definition of p i and α i is required.  (9). Lines show the model prediction, while triangles show the average over the steady state of 500 host samples according to Eq (6). The match spans several magnitude orders of migration (m) and probability of host deathbirth events (τ). The probability increases for longer host lifespans (smaller τ) and larger migration to the hosts (larger m). The rate of occupation of empty space (α 0 ) has a larger effect on cases where migration is large and the host lifespan is long (small τ). Simulations were computed as explained in the Methods. Other parameters: N = 10 4 . We use Eq (5a) where no definition of p i and α i is required. If there are only microbes of type 1 in the pool of colonizers (p 1 = 1), small τ implies that there is a single maximum at x 1 = 1-the hosts tend to be fully occupied. Bimodality is observed for 7 � 10 À 8 ≲ t ≲ 10 À 6 -some hosts are occupied, but some remain empty. For large τ, hosts tend to remain empty and the distribution has a single maximum at x 1 = 0. Black lines indicate the boundaries separating them (see Fig 4). (B) If the microbe is present in the pool of colonizers at p 1 = 0.5, no bimodality is observed. For small τ the frequencies are representative of the pool of colonizers and for large τ most hosts do not contain microbe 1. (C) If the microbe is rare in the pool of colonizers, p 1 = 0.1, the distribution has a single peak at x 1 = 0. This occurs for all values of τ shown here, because there is not enough time in the host to reflect the small number of microbe 1 individuals in the pool of colonizers (any probability smaller than 10 −9 was considered as zero, N = 10 4 and α 0 = α 1 = 0). (TIF) S8 Fig. Probability of colonization of microbial taxon 1 as a function of its frequency in the pool of colonizers. The results of multiple probabilities of host death-birth events (τ) are shown. Overall, the probability of colonization increases with the frequency in the pool of colonizers (p 1 ), but decreases as the host lifespan shortens (larger τ). A smaller migration (m) decreases the probability. Other parameters: N = 10 4 and α 0 = α 1 = 0. (TIF)