The Neutral Metaorganism

aMax Planck Institute for Evolutionary Biology, Plön, Germany bGEOMAR Helmholtz Centre for Ocean Research Kiel, Kiel, Germany cInstitute for General Microbiology, Christian-Albrechts-University Kiel, Kiel, Germany dZoological Institute, Christian-Albrechts-University Kiel, Kiel, Germany eInstitute of Microbiology, Chinese Academy of Science, Beijing, China fInstitute for Clinical Molecular Biology, Christian-Albrechts-University Kiel, Kiel, Germany gInstitute for Experimental Medicine, Christian-Albrechts-University Kiel, Kiel, Germany

The microbial communities living in and on animals can affect many important host functions, including metabolism [1,2,3], the immune system [4,5,6], and even behaviour [7,8]. The extent and direction of this microbe-mediated influence is often linked to the presence or absence of species and their relative abundances. It is thus paramount to understand how host-associated microbial communities are assembled.
A classical explanation for the emergence of a particular ecological community structure posits that every species is defined by distinct traits and occupies a specific ecological niche. An implicit hypothesis underlying much of microbiome research is that hosts have the potential to actively shape their associated microbial communities by providing niches for useful microbes [9]. This implies that individual hosts could select for a potentially very specific community structure.
While the assumption that the metaorganism -the host together with its associated microbes [10,11] -is an actively shaped symbiotic unit is appealing, it may bias interpretation of microbiota-host analyses. An example is the widely reported connection between the structure of the human gut microbiota and obesity, which turns out to be very hard to distinguish from random noise [12]. More recently it has also been demonstrated that the genetic background of the host does not significantly shape human microbiome composition [13]. Indeed, it has been argued that selective processes should not form the null hypothesis for explaining the allegedly cooperative host-microbe symbiosis [14].
Instead, neutral models have been proposed as a valuable tool for finding patterns in the tremendous complexity of ecological communities that may have a deeper mechanistic cause. Neutral models assume ecological equivalence between species. Thus, the community structure within a single host is the outcome of purely stochastic population dynamics, immigration and local extinctions [15]. The Unified Neutral Theory of Biodiversity [16,17] extends the neutral framework from a single site to multiple sites, each harbouring its own local community (Fig. 1). Diversity within the local communities is maintained by immigration from a common source community, so that the whole setup resembles a mainland-island structure [18]. Neutral theory has been applied to numerous ecological systems and sparked a lot of controversy along the way [19]. Despite these controversies and with the growing availability of microbial community data, the neutral theory has been adapted and applied to the microbial world [20,21,22,23,24]. In particular, the local community vs. metacommunity structure of neutral theory naturally extends to hostassociated microbial communities. Here, the hosts are viewed as ecosystems and their microbiota are treated as local communities [25,11]. This has led to several recent studies addressing the question whether the microbiota conform to the patterns predicted by the neutral theory. The rank-abundance patterns of the microbiota from three domesticated vertebrates for example were found to be largely consistent with the neutral expectation [26,27], despite a significant non-neutral signal in the genomic data [26]. A good fit of the neutral model was also found for the microbiota of young individuals of the zebrafish Danio rerio [28]. Interestingly, in this study neutrality decreased as hosts aged, indicating an increasing influence of selective processes over developmental time, potentially linked to the activation of a fully functioning adaptive immune response. The human microbiota were found to be predominantly non-neutral across various body sites [29], and in this study the few neutral communities were mostly associated with the skin and the urological tract. Another study, on the other hand, found the composition of the skin microbiota of healthy human subjects in large Chinese cities to be better explained by non-neutral processes [30]. Contrasting results can also be obtained depending on the state of the host, with the healthy human lung microbiota being largely consistent with a neutral model, while microbes recovered from diseased lungs diverged from neutrality [31]. Neutral assembly processes in invertebrate hosts are understudied, but a recent study shows the microbial communities associated with the fruit fly Drosophila melanogaster to be consistent with the predictions of a neutral model [32].
While these studies have drawn awareness to the potentially important contribution of neutral processes in shaping the microbiota, the use of several different neutral models and focus on one or are a few, mostly vertebrate host organisms, makes it hard to draw more general conclusions. Here, we aim to overcome these limitations by consistently applying a neutral model to a variety of distinct host systems, including, but not limited to, a wide range of invertebrate hosts. We included host species from a range of eukaryotic multicellular organisms with different life styles, ranging from early branching groups such as sponges, to the house mouse with its fully developed adaptive immune system. The use of a consistent modelling framework and the diversity of hosts in our study allows us to identify general characteristics of hosts that are more likely to look neutral. In particular, we hypothesize that for short-lived host species we may underestimate the level of neutrality, since microbiota analyses commonly take snapshots of community structure early in the neutralization process. Based on the neutral null hypothesis, we further identified members of the microbiota which consistently deviate from the neutral expectation, possibly indicating the action of selection on these particular microbes.

Results
We employ the neutral model presented by Sloan et al. [20], which is particularly suited for large-sized microbial populations. It describes the stochastic population dynamics within a local community, corresponding to the microbiota of a specific host. To maintain local diversity, which would otherwise reduce to a single species through ecological drift, immigration from a fixed source community occurs with rate m. This source community is not equivalent to the environmental pool of microorganisms, but rather it can be interpreted as the collection of all microbial species that can pass through the environmental filter of the host (Fig. 1). Thus, the neutral model does not make any assumptions about the selective processes that occur before and during colonization of the host and in particular, it is not concerned with any host traits that may restrict the range of colonizing microbes [33].
This model allows to derive a relatively simple expression for the expected long-term stationary community composition. Specifically, it predicts the relationship between the mean relative abundance of a taxon across all local communities, i.e. the metacommunity, and the probability of actually observing this taxon in any single community. Using well-established methods from population genetics, it has been shown that this relationship is determined by a beta distribution [20]. The only free parameter of this model is the immigration rate m, which can be calibrated by a nonlinear fit. The goodness of the fit then indicates how well the prediction of the neutral model compares to the empirical data. Concretely, we use the coefficient of determination R 2 as a quantitative measure of how consistent the data is with the neutral model. See the Material and Methods for details of the neutral model and the fitting procedure.

Neutrality of the microbiota
We fitted the theoretical neutral expectation to the published microbiota composition of eight different host species across four phyla and, for comparison, three environmental microbial communities (see Table S1 for an overview and references).
With the species Sarcotragus fasciculatus, Ircinia oros and Carteriospongia foliascens, our study includes examples from the oldest extant sister group to all other animals, the sponges (Porifera) [34]. Despite their filter-feeding life-style, sponges harbor a very diverse and highly specific microbiota [35], which mediates the functional role of the sponges in the ecosystem [36]. This is complemented by the jellyfish Aurelia aurita, the starlet sea anemone Nematostella vectensis, and the fresh-water polyp Hydra vulgaris. These hosts are examples from another early branching phylum, the Cnidaria, which comprise a basic innate immune system thought to play an important role in controlling one of the oldest known symbiotic host-microbe relationships [37]. We also included the nematode Caenorhabditis elegans as one of the best-studied multicellular organisms, whose microbiota has only more recently come into the focus [38,39,40,41]. Finally, the house mouse Mus musculus with its potent adaptive immune system [42] offers a well-studied intestinal microbiota [43,44].
In all cases, community composition was determined by the relative abundances of operational taxonomic units (OTUs), obtained by standard high-throughput 16S rRNA sequencing techniques [45]. To ensure equal sample sizes, all OTU abundance tables were rarified to the same read depth (1000 reads per sample). We analyzed the effect of the rarefaction on the neutral fit for datasets where more reads were available and found it to be of little relevance for most communities (Fig. S5). The estimated dispersal  Figure 2 The relationship between mean relative abundance of taxa across all samples and the frequency with which they are detected in individual samples for three representative datasets. Each dot represents a taxon and the solid line is the best-fitting neutral community expectation. The dashed lines and grey area depict the 95% confidence bands. The left panel shows a microbial community sampled from a compost as an example of an abiotic environment. The centre panel shows data for the natural microbiota of C. elegans nematode populations sampled from the same compost. The right panel shows the microbiota of a wild M. musculus mouse population. Here, the red dots indicate members of the microbial family Ruminococcaceae, which are predominantly overrepresented compared to the neutral expectation. A genus that is found in much fewer mice than expected is Shigella, being present in less than a quarter of the hosts despite a neutral expectation of nearly 100% prevalence.
parameter m from the neutral best-fits for all host populations showed considerable variation, but there is no correlation between m and the level of neutrality ( Fig. S4A and Tab. S1). Additionally, we found no correlation between sample size and neutrality, or between microbiota diversity and neutrality (Fig. S4). A comparison of the theoretical and observed relationship between mean relative abundances and occurrence frequencies of OTUs for three illustrative examples is shown in Fig. 2 (see Fig. S1 for an overview of all datasets). Taxa that lie above the neutral expectation are found more often than would be expected from their relative abundance in the microbiota metacommunity, implying a possible role of positive selection. In contrast, taxa below the neutral prediction are found in fewer hosts than expected by their relative abundance across all hosts, which may be due to negative host-mediated selection. The composition of microbial communities derived from compost and seawater are very consistent with the neutral expectation ( Fig. 2A and S1), which aligns with previous studies showing that random population dynamics and immigration play a major role in shaping microbial communities in abiotic environments [21,23,24]. Interestingly, the sediment community showed a relatively low consistency with the neutral model. This dataset is also the only one that showed a substantial effect of rarefaction, with neutrality increasing to the same level as the seawater community, when more reads are included (Fig. S5). This may indicate a greater influence of rare taxa, potentially leading us to underestimate the level of neutrality in the rarefied sediment dataset.
Biological hosts with their potential for active selection of specific taxa, could on the other hand be expected to have their resident microbial communities under tighter control, reducing the relative importance of stochastic processes after colonization, which would be reflected in a worse fit of the neutral expectation. The nematode C. elegans for example naturally consumes bacteria as its food source and possesses an innate immune system, potentially to defend against the threat of ingested pathogenic microbes [46]. This suggests that these worms have some control over their microbiota and indeed, for worms that were sampled at the same locations where the compost samples were taken, the best fit of the neutral expectation is considerably worse (Fig.  2B). We found a similar level of neutrality for the microbiota of laboratory worm populations grown exclusively on E. coli (Fig. 3), indicating a similar influence of neutral processes under the more controlled laboratory environments.
Vertebrate hosts with their more sophisticated and complex immune system are expected to diverge even further from the neutral expectation. However, intriguingly, the neutral model showed a very good fit to the microbiota of a wild M. musculus mouse population (Fig. 2C). In fact, the best fit was just as good as for the abiotic compost environment. A notable non-neutral microbe is Shigella, which is found in significantly fewer mice than expected from the neutral model (Fig. 2C). Invasive Shigella causes intestinal inflammations in humans, but mice mount an effective defense against it, thereby preventing acute infections [47]. This illustrates how potentially pathogenic microbes, which are actively selected against by the host, are typically found below the neutral expectation. The high level of neutrality in the natural mouse population is corroborated by a laboratory population of M. musculus (Fig. 3).
We found similar high levels of neutrality for sponges which as aquatic filter feeders are exposed to the full range of marine microbes. Like nematodes, they comprise a basic immune system [48], and especially for S. fasciculatus and some populations of C. foliascens, the data is in very good agreement with the neutral expectation (Fig. S1). In contrast, the results for the three aquatic polyps are less consistent. The two laboratory populations of A. aurita were in very good agreement with the neutral prediction, while the neutrality of samples taken at different developmental stages of H. vulgaris and N. vectensis varied from poor to good (Fig. 3).
Comparing the neutral expectation with data from a wide range of host species showed that a good fit of the neutral model is in fact the norm rather than the exception (Fig. 3). In particular, there was no substantial difference in neutrality between microbial communities living in abiotic environments, such as compost and seawater, and the microbiota from a diverse array of host species. We thus conclude that neutral processes may play a substantial role in the assembly of microbial communities associated with hosts as different as mice and marine sponges.
Why then, do in particular the microbiota of C. elegans diverge from the neutral expectation, suggesting a greater influence of selective processes? The most commonly assumed cause of such deviations is selection by the host, for example through its immune system. However, it seems intriguing that nematodes with a comparatively simple immune system would be much better at selecting microbes than mice with their sophisticated immune systems. Therefore, it is important to assess other possible violations of the underlying model assumptions.

Change of neutrality over time
A crucial aspect of the neutral prediction is that it corresponds to the expected long-term stationary distribution of the microbial community. If the community is in a transient state, however, it can appear non-neutral even if it is the product of a purely neutral process. This effect is illustrated in Figure 4, showing a simulation of the population dynamics of 50 neutral communities (Material and Methods). For every community, only a few random colonisers were picked from the source community, which yields random, but nonneutral, initial communities. We then compared the neutral expectation to the in-silico communities many times during the simulation, showing that they become "neutralized" through random population dynamics and immigration until the community composition reaches its long-term equilibrium, or the host dies. Goodness of fit of the neutral long-term expectation to neutrally simulated communities over time, where each dot represents the best fit at that timepoint. The two insets show the best fits of the neutral model to snapshots of the communities at an early timepoint (indicated by the red dot) and a late timepoint (indicated by the blue dot). Scale and meaning of the axes for the insets are the same as in Fig. 2. Each community was initialised with a few random colonizers, which gives a non-neutral initial community (R 2 ≈ 0.4).
Thus, in principle, a bad fit of the neutral expectation may be due to a transient nonequilibrium state of the community, rather than actual non-neutral processes. This effect would be especially pronounced for short-lived host species such as C. elegans, where initial colonization can lead to communities that appear non-neutral even for ecologically equivalent (i.e. neutral) colonizers [49]. The microbiota of such hosts may never effectively converge to the neutral long-term expectation, even if they are dominated by stochastic dynamics and immigration. For longer-lived species with a substantial amount of vertical transmission of the microbiota, such as sponges and mice, we expect these transient effects to play a smaller role.
If community assembly is governed by purely neutral processes, we would expect the fit of the neutral model to become better over time. Decreasing neutrality, which has been reported for the zebrafish Danio rerio [28], would on the other hand be a good indicator of selective processes. A similar increase in the relative importance of selective processes over stochastic effects has been described for the ecological succession of microbial communities in salt marshes [50]. However, while the microbiota from the Hydra and Nematostella populations show clear developmental patterns [51], we found consistency with the neutral model to remain relatively constant over time (Fig. S3).

Identifying non-neutral taxa
To identify those members from the bulk of the microbiota that diverge from the neutral expectation is of great practical importance, as this may reveal potential selective processes. In the following, we classified all taxa outside of the 95% confidence bands around the neutral prediction as not consistent with the neutral expectation.
We found that the distribution of non-neutral taxa around the neutral prediction is roughly symmetrical in all cases, and no microbial order deviates systematically above or below the neutral expectation (Fig. S2). In particular, within microbial orders that contain a substantial amount of non-neutral taxa, such as the Clostridiales and Bacteroidales in the mouse microbiota, genera were roughly evenly over-and underrepresented compared to the neutral null hypothesis. This is consistent with the hypothesis that the observed deviations from neutrality are at least in part due to transient neutral processes rather than selection.
To further distinguish random deviations from neutrality and actual non-neutral candidates, we also applied a more conservative definition of non-neutrality. For this, a particular taxon was classified as nonneutral only when it consistently diverged from the neutral expectation in the same direction in independent host populations. Applying this definition to the natural and laboratory populations of C. elegans and M. musculus yields a reduced subset of microbial taxa, which lie above or below the neutral prediction in both populations.
This reveals that of the C. elegans microbiota, only the genus Ochrobactrum is consistently underrepresented (Tab. S2). It is found in only ca. 40% of the natural isolates and 70% of the laboratory worms, despite its relatively high mean abundance across all worms and a neutral expectation of 100% prevalence. This underrepresentation of Ochrobactrum is intriguing, as it had previously been identified as enriched in worms and able to persist in the nematode's gut even under starvation conditions [39]. However, this observation is consistent with a transient community state, where Ochrobactrum has not yet colonized all worms, despite being very successful once it has entered the worm.
In the M. musculus microbiota of natural and laboratory populations, over a third of the over-represented taxa were from the bacterial family Ruminococcaceae. Members of this family were never consistently found below the neutral expectation ( Fig. 2C and Tab. S3). This family includes several physiologically relevant genera involved in the utilization of plant polysaccharides such as starch and cellulose [52], and high-fat diets low in plant-derived materials have been found to decrease the proportion of the Ruminococcaceae in mouse guts [53]. That we were able to consistently identify a bacterial family that had previously been linked to important metabolic functions as over-represented in the overall neutral mice microbiota, shows the utility of the neutral model as a null hypothesis and a tool of finding meaningful patterns in the vast complexity of the microbiota. An objective definition of a core microbiome is not straightforward. Taxa such as the Ruminococcaceae that are present significantly more often than predicted by chance are good candidates fulfilling clear objective criteria to be defined as a core taxon.

Comparison with a random community assembly model
The observed high levels of neutrality do not necessarily imply that the microbiota are simply a random collection of microbes. A good fit of the expected distribution indicates that the simple neutral model is sufficient to describe the relationship between abundance of a species and its occurrence frequency. But this can not rule out other processes, niche-related or stochastic, that lead to the same community structure [54].
A different model is obtained by assembling a local community by simply drawing randomly from the fixed source community. In this case, the probability of observing a specific species in a local community is determined by a binomial distribution. Comparing the best fits of the neutral expectation and of a binomial distribution using the Aikake information criterion (AIC) indicates that the neutral model fits the data better in all cases (Tab S1). This indicates that the microbiota are not merely random samples of the microbes present in the metacommunity, and that stochastic population dynamics and dispersal play important roles in shaping communities.

Discussion
We set out to quantify how consistent the microbiota of a range of different host organisms are with the expectation from a neutral model. We found that the structures of the host-associated microbial communities are often in surprisingly good agreement with the neutral expectation. This leads us to conclude that upon colonization, neutral processes could play a significant role in the assembly of the microbiota. A comparison with environmental microbial communities suggests that biological hosts are not better at structuring the microbial community than abiotic environments. Taking into account a more complete simulation analysis of the neutral model further indicates that the divergence from the neutral expectation observed for some host species may ultimately be due to transient stages of the communities, rather than being the result of actual selective processes.
This finding has major implications for practical problems, such as our ability to manipulate the microbiota to restore or improve a community structure that is associated with a healthy host. A dominance of deterministic, niche-related factors would suggest that undesirable community structures may be amenable to intervention by manipulating certain host traits or the community itself. But if microbiota dynamics are mostly neutral, intervention strategies to modify microbiota structure might be in vain as there are no distinguishable states of the microbiota.
However, even a high consistency with the neutral expectation does not rule out the presence of selective abiotic or biotic processes, as non-neutral models have been shown to yield predictions consistent with neutral theory under certain conditions [54]. In particular, biological hosts, like any other abiotic environment, will select for a subset of the environmentally "available" microbes, and yet the resulting microbiota may still look mostly neutral. In fact, comparing the microbes present in seawater to the species present in individual sponges reveals that there is almost no overlap (Fig. S6A). This indicates that sponges are a highly selective environment, yet their microbiota are very consistent with the neutral expectation. A similar selective filter is found for C. elegans, where only a fraction of the microbes present in the environment successfully colonize the worms (Fig. S6B). This suggests that there are host traits that act as environmental filters [33] and they often will do so in a predictable, i.e. deterministic way. In this sense the microbiological tenet that "everything is everywhere, but the environment selects" also applies when the environment happens to be another biological organism. However, our results highlight that the processes that are at play after the environmental filter has been passed can still be highly consistent with the neutral model. It has also been suggested that this interplay between selective environments and neutral processes acting on the allowed species may have contributed to some of the contrasting results found in previous studies [55].
Another factor to consider is that the data analyzed in this study is based on 16S rRNA gene based taxonomic profiling. Given the known examples of functional redundancy present e.g. in the human gut microbiota [56], viewing communities from the perspective of functional (e.g. metabolic) categories may provide an alternative picture. This ties in with recent studies demonstrating that taxonomic and functional microbial community composition are shaped by largely independent processes [57,58]. Thus, the dominance of neutral factors which we observe on the taxonomic level may in principle be complemented by non-neutral selection of functional groups mediated by the host.

Conclusion
The neutral assumption that species do not interact is an anathema to many ecologists, and for the microbiota in particular it is assumed that the functioning of the community is enhanced by cooperative interactions. Recent results however suggest that interactions within the mouse gut microbiota are indeed predominantly competitive and very weak [59]. Together with our finding of the microbiota being highly consistent with neutral expectations for several different host species, this lends support to taking the neutral null hypothesis as a key component of our understanding of host-microbe interactions. Awareness of the potentially significant role of neutrality in shaping the microbiota is crucial to avoid being lead astray by randomness in view of the incredible complexity of host-microbiota symbioses and to be able to uncover general principles of microbiome assembly.

Methods The Neutral Model
Here, we summarize the model presented in [20], considering only the purely neutral special case with equivalent growth rates for all species. A local community contains a fixed number of N individuals and changes to the relative abundance of a particular species requires individuals to die. The death rate δ is the same for all species and a dead individual is replaced by reproduction of a local individual (with probability 1 − m), or by immigration from a fixed source community (with probability m). The relative abundance of a species in this source community is denoted by p i . With this the probabilities that the number of individuals of species i increases by one, decreases by one or stays the same are given by (1) These transition probabilities correspond to Hubbell's original neutral model, but the following continuous approximation derived by Sloan et al. [20] can be efficiently applied to very large population sizes and allows for a relatively simple analytical solution. In particular, this allows for a calibration with the highthroughput 16S rRNA sequencing data obtained from microbial communities. If N is large enough, such as in microbial communities, we can assume the relative abundance x i = N i /N of species i to be continuous. Applying methods from population genetics [60] this leads to a Fokker-Planck equation for the probability density function where M δ x and V δ x are the expected rates of change in frequency and variability, respectively. Assuming the time intervals between individual death-birth events are short, they can be approximated by and Equation (2) together with (3) and (4) describes the neutral population dynamics of a local microbial community. This is not in general amenable to analytical treatment, but one can approximate the long-term equilibrium solution of this equation. Namely, the potential solution to with the solution approximately given by the beta distribution This can be connected to empirical observations by realizing that, for a given detection threshold d, the probability of actually observing a species in a local community is given by the truncated cumulative probability density function Here, N is given by the number of reads per sample and the relative abundance p i of the focal species in the source community can be approximated by the mean relative abundance of the species across all samples. The probability of immigration m is thus the only free parameter and used to fit the predicted long-term distribution to the observed occurence frequencies of the microbiota.

Fitting procedure
The fitting process applied to all datasets works as follows. First, the OTU abundance table was rarefied to the same read depth (1000 reads per sample), which was determined by the lowest available read depth from all datasets. Rarefaction was performed using the PySurvey package (pypi.org/project/PySurvey). The expected observation probability (6) and a binomial model were then fitted to the observed mean relative OTU abundances p i and occurence frequencies f i obtained from this rarefied table using non-linear least squares minimization with the lmfit package (pypi.org/project/lmfit) [61]. As a measure of the goodness of fit we then calculated the standard coefficient of determination using the ratio of the sum of squared residuals and the total sum of squares: where Φ i is the expected ocurrence frequency obtained from the best-fit neutral prediction. Rarefaction and fitting was repeated 1000 times and the R 2 reported throughout is the mean across individual best-fits for each dataset. Additionally, for comparison of the neutral and binomial model, the mean Aikake information criterion (AIC) was calculated from all rarefactions.

Neutral model simulations
We simulated the neutral process for 50 local communities, each containing N = 10000 individuals, with a source community containing 200 species and an immigration probability of m = 0.05.

Code availibility
The Python code for fitting and simulating the neutral model will be made available on github and in the meantime is available from the authors on request.

Data availibility
All data has been published previously, see Table S1 for the corresponding references.   Figure S1 The relationship between mean relative abundance of taxa across all samples and the frequency with which they are detected in individual samples. Each dot represents a taxon and the solid line is the best-fitting neutral community expectation. The dashed lines and grey area depict the 95% confidence bands.  Figure S1 he relationship between mean relative abundance of taxa across all samples and the frequency with which they are detected in individual samples. Each dot represents a taxon and the solid line is the best-fitting neutral community expectation. The dashed lines and grey area depict the 95% confidence bands.  Figure S2 Bars show the mean relative abundance of the ten most abundant microbial orders in a specific host population. The colored sections indicate the fractions of OTUs within that order which were found above the neutral expectation (red), within the neutral expectation (grey) and below the neutral expectiation (blue).  Figure S2 Bars show the mean relative abundance of the ten most abundant microbial orders in a specific host population. The colored sections indicate the fractions of OTUs within that order which were found above the neutral expectation (red), within the neutral expectation (grey) and below the neutral expectiation (blue).  Figure S2 Bars show the mean relative abundance of the ten most abundant microbial orders in a specific host population. The colored sections indicate the fractions of OTUs within that order which were found above the neutral expectation (red), within the neutral expectation (grey) and below the neutral expectiation (blue).

Figure S5
Effect of rarefaction on neutrality. Generally, consistency with the neutral model initially decreased with increasing read depth, until it leveled off as read depth increased further. For communities that showed a high consistency with the neutral model, varying read depths did not affect the results much, ranging from almost no effect at all for A. aurita to a slight drop in neutrality for the seawater samples from R 2 = 0.7 at 1000 reads/sample to R 2 ≈ 0.6 at 50000 reads/sample. Only for the communities associated with the sponge I. oros and two of the C. foliascens populations did read depth show a more pronounced effect, where in both cases neutrality dropped from R 2 ≈ 0.6 at 1000 reads/sample to R 2 ≈ 0.45 when read depth was exceeding 10000 reads/sample. Interestingly, an opposite trend was observed for the sediment samples, where neutrality increased from R 2 ≈ 0.5 at 1000 reads/sample to R 2 ≈ 0.7 at 10000 reads/sample. The minimal to moderate changes in neutrality with increasing read depth for some datasets potentially reflects the influence of rare, non-neutral taxa, which are only detected with higher read depths.

Figure S6
Overlap of taxa found in hosts and the environment. Left: For two sponge species, there is only a very small overlap between the sponge microbiota and the taxa found in seawater. Right: For C. elegans, only a subset of the environmentally available microbes is found in the worms.