Food Web Assembly Rules for Generalized Lotka-Volterra Equations

In food webs, many interacting species coexist despite the restrictions imposed by the competitive exclusion principle and apparent competition. For the generalized Lotka-Volterra equations, sustainable coexistence necessitates nonzero determinant of the interaction matrix. Here we show that this requirement is equivalent to demanding that each species be part of a non-overlapping pairing, which substantially constrains the food web structure. We demonstrate that a stable food web can always be obtained if a non-overlapping pairing exists. If it does not, the matrix rank can be used to quantify the lack of niches, corresponding to unpaired species. For the species richness at each trophic level, we derive the food web assembly rules, which specify sustainable combinations. In neighboring levels, these rules allow the higher level to avert competitive exclusion at the lower, thereby incorporating apparent competition. In agreement with data, the assembly rules predict high species numbers at intermediate levels and thinning at the top and bottom. Using comprehensive food web data, we demonstrate how omnivores or parasites with hosts at multiple trophic levels can loosen the constraints and help obtain coexistence in food webs. Hence, omnivory may be the glue that keeps communities intact even under extinction or ecological release of species.



S1 Food web ecology
To model food web ecology, we employ the widely-used, generalized Lotka-Volterra equations (MacArthur, 1967) which assume a well-mixed system, i.e. a system where spatial fluctuations are negligible. The links in a food web are descriptions of predominantly predator-prey relationships between species on neighboring trophic levels (Montoya et al., 2006) but could also constitute mutualist interactions (Bascompte et al., 2005).
The lowest trophic level of the food chain consists of primary producers, such as plants or microbes, which draw energy from the sun and from elementary chemical compounds. We define a distinct species as an organism of distinguishable phenotype. We first separate species by trophic level, and direct interactions occur only between species on neighboring trophic levels. To model the food chain, we require two different types of equations. For $n_1$ primary producers with (population or mass) densities $S_i^{(1)}$, $i = 1, \ldots, n_1$, residing at the first trophic level, the general form of the time-evolution equations is given by Eq. S1 (MacArthur, 1967; Gross et al., 2009). In Eq. S1, competition between the primary producers for basic resources is described by a normalized overall carrying capacity in the first term on the RHS (Smith and Slatkin, 1973). Producers compete for resources when $p_{ji} > 0$ for $i \neq j$, but the equation also allows for the description of direct competition, cooperation, and growth on non-overlapping resources ($p_{ji} = \delta_{ji}$).
For species $S_k^{(l)}$ at any trophic level $l > 1$ we write Eq. S2, where $\beta_{km}^{(l,l-1)}$ are the efficiencies of reproduction of species $S_k^{(l)}$ when consuming species $S_m^{(l-1)}$, and the other coefficients are defined analogously to those in Eq. S1. For top predators, residing on the $L$th trophic level, there are no enemies, hence the coefficients $\eta_{pk}^{(L+1,L)}$ vanish. Recycling of the top predators' biomass then occurs only through their natural death rates $\alpha_k^{(L)}$. For a species' phenotype to be distinguishable, it must be characterized by a unique combination of the parameters. In the above equations, all parameters and observables have been made dimensionless by normalizing population densities to the primary producer carrying capacity, growth and decay rates to the maximal primary producer growth rate, and interaction rates to the ratio of maximal growth rate and carrying capacity.
Note that Eqs S1 and S2 differ by the respective consumption of resources: Primary producers consume supplies of physical or chemical energy which are replenished at a constant flux. Species at any higher trophic level depend on the availability of prey at the neighboring trophic level below.
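As a minimal sketch of the two equation types, the following Python function evaluates the growth rates of a two-level web. The functional form used here (a logistic crowding term weighted by $p_{ji}$, losses to consumers through $\eta$, and consumer gains through $\beta\eta$) is an assumption pieced together from the coefficient descriptions above, not a transcription of Eqs S1 and S2. The function name and argument layout are hypothetical.

```python
def glv_rates(S1, S2, k, alpha1, alpha2, eta, beta, p):
    """Time derivatives for basal densities S1 and consumer densities S2.

    ASSUMED form (an illustration, not copied from Eqs S1 and S2):
      dS1_i/dt = S1_i * (k_i * (1 - sum_j p[j][i] * S1_j) - alpha1_i
                         - sum_c eta[c][i] * S2_c)
      dS2_c/dt = S2_c * (sum_i beta[c][i] * eta[c][i] * S1_i - alpha2_c)
    eta[c][i] and beta[c][i] refer to consumer c feeding on producer i.
    """
    n1, n2 = len(S1), len(S2)
    dS1 = []
    for i in range(n1):
        crowding = sum(p[j][i] * S1[j] for j in range(n1))
        predation = sum(eta[c][i] * S2[c] for c in range(n2))
        dS1.append(S1[i] * (k[i] * (1.0 - crowding) - alpha1[i] - predation))
    dS2 = []
    for c in range(n2):
        intake = sum(beta[c][i] * eta[c][i] * S1[i] for i in range(n1))
        dS2.append(S2[c] * (intake - alpha2[c]))
    return dS1, dS2


# One producer without consumers: under the assumed form the growth rate
# vanishes at S = 1 - alpha/k = 0.75 for k = 2.0 and alpha = 0.5.
free_rates = glv_rates([0.75], [], [2.0], [0.5], [], [], [], [[1.0]])

# One producer and one consumer at the densities S1 = 0.4, S2 = 0.7.
pair_rates = glv_rates([0.4], [0.7], [2.0], [0.5], [0.2], [[1.0]], [[0.5]], [[1.0]])
```

Both calls return rates that are zero up to rounding, i.e. the chosen densities are steady states of the assumed equations.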

S2 Steady state and its stability
The ecological dynamics are taken to occur without rapid evolution, i.e. we are considering systems where species richness is not a consequence of transient rise and decay of species. In the following, we describe how to obtain the interaction matrix R in the steady-state equations (Sec. S2.1). We then discuss stable networks for any combination of species richnesses that is in accordance with Eq. 6, main text (Sec. S2.2).

S2.1 Steady state
To describe sustainable food webs, we demand that the time derivatives on the LHS of Eqs S1 and S2 vanish and arrive at a set of equations (Eqs S3 and S4) that are linear in the species densities. Collecting all constant coefficients (LHS of Eqs S3 and S4) in the vector $k$ and all interaction coefficients on the RHS in the interaction matrix $R$, we have the linear matrix equation $R \cdot S = k$, where $S$ is the vector of all species densities. For completely shared nutrients, the competition factors are $p_{ji} = 1$. In the opposite extreme, where each basal species draws energy from a distinct nutrient source, $p_{ji} = \delta_{ji}$, with $\delta_{ji}$ the Kronecker delta. When re-organizing the terms in the sub-matrix of interactions between species on the basal trophic level, the square matrix $A_1$ is obtained by subtracting rows 2 to $n_1$ from row 1. After this linear transformation we obtain the modified matrix $R$, shown in Fig. 1a (main text).
Apart from the first row, $R$ has a block structure with nonzero entries only for interactions between neighboring trophic levels (Fig. 1a, main text). Steady-state solutions are then given by $S^* \equiv R^{-1} k$; a unique solution $S^*$ thus requires that $R$ be invertible, hence $\det(R) \neq 0$ (Otto and Day, 2007). In the following we first consider the case of shared nutrients ($p_{ji} = 1$); separate resources are discussed in Sec. S7.
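To illustrate the role of invertibility, the sketch below solves $R \cdot S = k$ exactly with rational arithmetic and reports a singular $R$. The example matrices are hypothetical and follow an assumed form of the steady-state equations (basal row: $k_1 - \alpha_1 = k_1 S^{(1)} + \eta S^{(2)}$; consumer row: $\alpha_2 = \beta \eta S^{(1)}$); the helper name `solve_steady_state` is not from the source.

```python
from fractions import Fraction


def solve_steady_state(R, k):
    """Solve R * S = k by Gauss-Jordan elimination; return None if det(R) = 0."""
    n = len(R)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(R, k)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # no pivot in this column: R is singular
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


half, fifth = Fraction(1, 2), Fraction(1, 5)

# One producer and one consumer (k1 = 2, alpha1 = 1/2, eta = 1, beta = 1/2, alpha2 = 1/5).
chain = solve_steady_state([[2, 1], [half, 0]], [Fraction(3, 2), fifth])

# Two identical consumers on a single producer: rows 2 and 3 are proportional.
exclusion = solve_steady_state(
    [[2, 1, 1], [half, 0, 0], [half, 0, 0]], [Fraction(3, 2), fifth, fifth]
)
```

In this sketch the chain has the unique solution $(2/5, 7/10)$, while the second web has no unique steady state because two consumers constrain the same single producer density.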

S2.2 Stability
To demonstrate conditions for stability, we show in the following how a Lyapunov function can be constructed, in analogy to Goh (1977) and Hofbauer and Sigmund (1988). Let us consider the generalized Lotka-Volterra equations (Eqs S1 and S2) for primary producers and for higher levels $l > 1$. For the top level $L$, the last term on the right-hand side is zero. Let us now assume that a feasible solution exists. For clarity and ease of notation, the following derivation uses a separate symbol for the given feasible solution. Now, let us define a function $V(S)$ with constants $c_i^{(l)}$, which we will determine later. Taking its time derivative $\dot{V}$ and using the expressions for $w$, we arrive at Eq. S17. If we can choose the values of the $c$'s such that the multiplication rule (Eq. S18) is fulfilled, the system is globally stable. Notably, this multiplication rule does not involve any of the coefficients $\eta_{ki}^{(l,l')}$; the link strengths can hence still be modified, or links removed, by adjustments to these coefficients.
If the multiplication rule (Eq. S18) is satisfied, the expression Eq. S17 becomes extremely simple. In particular, satisfying Eq. S18 is possible at least in the following cases.
1. When the network structure is a tree, there is only one path between any pair of nodes. In this case, there is a unique link from a node at level $l$ to a node at level $l - 1$. Therefore it is possible to find values of the $c$'s that satisfy Eq. S18.
2. If the network structure is not a tree, Eq. S18 can be satisfied only if the $\beta_{ki}^{(l,l-1)}$ fulfill a consistency condition along loops (Fig. A). Such a choice of the $\beta$'s is possible for any network structure.

Fig. A: Loops and the multiplication rule. a, Trophic example: with $\beta^{(l,l-1)} = c^{(l)}/c^{(l-1)}$ and $\beta^{(l-1,l-2)} = c^{(l-1)}/c^{(l-2)}$ on every link, two paths between the same pair of species require $\beta^{(l,l-1)} \beta^{(l-1,l-2)} = \tilde{\beta}^{(l,l-1)} \tilde{\beta}^{(l-1,l-2)}$. b, Omnivore example: the additional link $\beta^{(l,l-2)} = c^{(l)}/c^{(l-2)}$ requires $\beta^{(l,l-1)} \beta^{(l-1,l-2)} = \beta^{(l,l-2)}$.

For case 2, we comment on the existence of loops, i.e. structures where two different paths connect two species (Fig. A). The rule (Eq. S18) can be fulfilled if and only if the products of $\beta$'s along any two different paths between the two given species are equal (Fig. A). Notably, this encompasses the loops involved in food webs with strict trophic levels (Fig. Aa) and those involving omnivores (Fig. Ab). Although our formalism above only considered interactions between trophic levels $l$ and $l - 1$, nothing prevents us from extending it to interactions with levels further away, thereby including omnivory. Hence, given that a steady state exists, stability is possible by proper parameter choice for systems involving loops.
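The path-product condition for loops can be checked mechanically. The sketch below assumes, following the labels of Fig. A, that the multiplication rule amounts to $\beta = c_{\mathrm{consumer}}/c_{\mathrm{resource}}$ on every link. It propagates the constants $c$ through the network and reports a conflict when two paths give different products. The function name and the example webs are hypothetical.

```python
from collections import deque
from fractions import Fraction


def assign_c(links):
    """links: list of (consumer, resource, beta) with beta > 0.

    Returns {species: c} such that beta == c[consumer] / c[resource] on every
    link, or None if the products of beta along two paths disagree.
    """
    neighbours = {}
    for consumer, resource, beta in links:
        beta = Fraction(beta)
        # Moving up a link multiplies c by beta, moving down divides by it.
        neighbours.setdefault(resource, []).append((consumer, beta))
        neighbours.setdefault(consumer, []).append((resource, 1 / beta))
    c = {}
    for start in neighbours:
        if start in c:
            continue
        c[start] = Fraction(1)  # free overall scale per connected component
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other, factor in neighbours[node]:
                value = c[node] * factor
                if other not in c:
                    c[other] = value
                    queue.append(other)
                elif c[other] != value:
                    return None  # loop with unequal path products
    return c


# Omnivore loop: top eats mid and base, mid eats base.
consistent = assign_c([("top", "mid", Fraction(1, 2)),
                       ("mid", "base", Fraction(1, 4)),
                       ("top", "base", Fraction(1, 8))])
inconsistent = assign_c([("top", "mid", Fraction(1, 2)),
                         ("mid", "base", Fraction(1, 4)),
                         ("top", "base", Fraction(1, 5))])
```

For a tree the traversal never revisits a node through a second path, so an assignment is always found, in line with case 1.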
In either of the cases 1 and 2, the value of $c_i^{(1)}$ is not determined. We then obtain Eq. S19 for $\dot{V}$. We can define a real symmetric matrix $Q$. If the $n_1 \times n_1$ real symmetric matrix $Q$ is semipositive definite (footnote 1), we have a Lyapunov function $V(S)$ that satisfies $\dot{V}(S) \le 0$.
Footnote 1: A real symmetric matrix is semipositive definite if and only if all its eigenvalues are positive or zero. In this case, there is a real matrix $P$ such that $P Q P^{-1}$ is a diagonal matrix containing the eigenvalues of $Q$, with $P^{-1} = P^{t}$.

S2.2.1 When is Q semipositive definite?
The following two cases are easy to consider.
When $p_{ij} = \delta_{ij}$. In this case, $Q$ is semipositive definite as long as all $c_i$ are positive.
When $p_{ij} = 1$, with the $c_i$ chosen such that $Q_{ij} = 1$. In this case, the eigenvalues of $Q$ are $n_1$ and zero. The eigenvector for the eigenvalue $n_1$ is the vector filled with ones. In this case, Eq. S21 takes a particularly simple form for $\dot{V}$. It is straightforward to extend this to the case where $p_{ij}$ is block diagonal, with blocks of ones and blocks of delta functions.
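The statement about the all-ones matrix can be verified directly. The short check below, written for a hypothetical size $n_1 = 4$, confirms that the vector of ones is an eigenvector with eigenvalue $n_1$, that a difference vector is mapped to zero, and that the quadratic form $x^{t} Q x$ equals $(\sum_i x_i)^2$ and is therefore never negative.

```python
def matvec(Q, x):
    """Matrix-vector product Q x for a list-of-lists matrix."""
    return [sum(q * xj for q, xj in zip(row, x)) for row in Q]


def quadratic_form(Q, x):
    """Evaluate x^t Q x."""
    return sum(xi * yi for xi, yi in zip(x, matvec(Q, x)))


n1 = 4
Q = [[1] * n1 for _ in range(n1)]

top = matvec(Q, [1] * n1)          # n1 times the vector of ones
null = matvec(Q, [1, -1, 0, 0])    # any vector summing to zero is annihilated
x = [3, -2, 5, 1]
form = quadratic_form(Q, x)        # (3 - 2 + 5 + 1)^2
```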

S3 Resource versus consumer limitation
Trophic cascades are changes in the population densities of consumers and resources in food chains when a top predator is added or removed, but they can also be triggered by changes in nutrient availability (Heath et al., 2014). Using the example of a growing simple food chain, starting with a single species, we discuss the alternating concentration of biomass on odd and even trophic levels.
A freely growing basal species $S_1^{(1)}$ will, according to Eq. S3, reach a steady-state population density that approaches the system carrying capacity (here set to unity) as its growth rate $k_1^{(1)}$ increases relative to its decay rate $\alpha_1$. Consider now a simple ecosystem consisting of only two trophic levels with a single species on each level, i.e. a basal and a consumer species (Fig. B). Eqs S3 and S4 then reduce to two equations for the two densities. Hence, the basal species population density is entirely determined by the trophic interaction and the decay rate of its consumer, and $S_1^{(1)}$ will not primarily be limited by the carrying capacity of the system. In its simplest form this exemplifies what is meant by the transition from the resource-limited to the consumer-limited state.
Fig. B: panel labels (1,1), (2,0), (3,1), (4,0).

Continuing further, we consider a single species $S_1^{(3)}$ on the third trophic level and obtain the corresponding solution. Hence, $S_1^{(2)}$ is now entirely specified by the interaction with $S_1^{(3)}$ and the decay rate $\alpha_1^{(3)}$. In turn, the basal species density can again approach the carrying capacity by increasing its growth rate relative to the decay rate. Note that Eq. S29 describes a reduction of the producer species both due to its own death rate and the death of its indirect support on level 3.
A similar analysis of the 4-level system reveals that in that case the population of the producer increases from its previous level (Eq. S25) to a larger level because its predator (the species at level 2) has to "carry" the death of its support (level 4).

It is easy to continue the sequence by adding trophic levels with single species, and the alternation between the resource and the consumer limit will be obeyed. More generally, with species diversities $\{n_1, \ldots, n_L\}$ on the $L$ trophic levels, in the consumer-limited case the decay rates of consumers and their respective interactions with producer species set the population densities of the $N_o = N_e$ resource species. In the resource-limited state, one degree of freedom for basal species is preserved. In principle, sustainable networks could then be formed where one basal species again approaches the system carrying capacity by optimizing its own growth and decay rates, i.e. this basal species could be freed from exploitation by consumers. Conversely, in a sustainable consumer-limited state, all species belonging to the resource support must be exploited by consumers. None of the resource species can, even in principle, be freed.
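The alternation between resource and consumer limitation can be reproduced numerically. The sketch below builds the steady-state system for a chain with one species per level and solves it exactly. It assumes the form $k - \alpha = k S^{(1)} + \eta S^{(2)}$ for the basal level and $\alpha = \beta \eta S^{(l-1)} - \eta S^{(l+1)}$ for higher levels, with identical $\alpha$, $\eta$ and $\beta$ on all levels; these choices, the parameter values and the function name are illustrative assumptions rather than the equations of the text.

```python
from fractions import Fraction


def chain_steady_state(L, k, alpha, eta, beta):
    """Steady-state densities [S1, ..., SL] of an L-level chain (assumed form)."""
    k, alpha, eta, beta = map(Fraction, (k, alpha, eta, beta))
    R = [[Fraction(0)] * L for _ in range(L)]
    rhs = [alpha] * L
    R[0][0] = k
    rhs[0] = k - alpha
    for l in range(L - 1):
        R[l][l + 1] = eta if l == 0 else -eta  # loss of level l to level l+1
        R[l + 1][l] = beta * eta               # gain of level l+1 from level l
    aug = [row + [b] for row, b in zip(R, rhs)]
    for col in range(L):  # Gauss-Jordan elimination with row swaps
        pivot = next(r for r in range(col, L) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(L):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][L] / aug[i][i] for i in range(L)]


tenth, half = Fraction(1, 10), Fraction(1, 2)
basal_density = [chain_steady_state(L, 2, tenth, 1, half)[0] for L in (1, 2, 3, 4)]
# Doubling the producer growth rate leaves the 2-level producer density unchanged.
fast_growth = chain_steady_state(2, 4, tenth, 1, half)[0]
```

With these hypothetical parameters the producer density is 19/20, 1/5, 17/20 and 3/5 for one to four levels: close to the carrying capacity for odd chain lengths and set by consumer parameters for even ones.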
This can be made more explicit: Consider a system of only two trophic levels with diversities $n_1$ and $n_2$. If $n_2 = n_1$, the $n_2$ equations (S4) define $n_2$ weighted sums of the $n_1 = n_2$ species at trophic level 1. It is then always possible to obtain an expression for the sum $S^{(1)} \equiv \sum_{m=1}^{n_1} S_m^{(1)}$ of total population density on the basal trophic level which depends only on the decay rates $\alpha_k^{(l)}$ of, and the interaction coefficients with, species on the second trophic level. Hence, the population density $S^{(1)}$ will not depend on the growth or decay rates of species on the basal level. In other words, a species' fitness does not improve when its growth rate increases, but remains solely governed by the species' predators (Thingstad, 2000).
Conversely, if $n_2 = n_1 - 1$, only the sum of $n_1 - 1$ species on level 1 can be expressed in terms of the respective decay and interaction coefficients. The remaining degree of freedom will then be fixed by Eq. S3, where the carrying capacity will be felt by the basal species. Generalizing to $L$ trophic levels, $n_2 - n_1$ is the number of degrees of freedom on level 1 controllable by level 2. However, as $n_3$ constraints can be imposed by level 3 onto level 2, the net limitation is then $n_2 - n_1 - n_3$. This sequence continues analogously until $n_L$ is reached, and we are left with the familiar conditions $N_o - N_e \in \{0, 1\}$, a difference that defines the number of degrees of freedom remaining for the basal level.

S4 Species definition and example of two shared niches
A trophic species is defined as all species sharing a set of predators and prey. An example is shown in Fig. C.

Fig. C: a, Basal species share nutrients and consumer species and can be combined to a single trophic species. Consumer species share resources and can also be combined to a single trophic species. b, Food web with species aggregated into trophic species. c, As in (a) but showing differential predation rates.
In our approach, we also distinguish species by the interaction strength with other species. For example, when two species feed on the same two prey species but do so at differential rates, we will still consider them distinguishable species. The reasoning behind this is twofold: First, for the competitive exclusion principle to be relevant, we need to be able to discuss "perfect" competitors, i.e. those that share a resource. Aggregating the two consumers into a single trophic species would immediately remove this situation, and the discussion of competitive exclusion would hence be meaningless from the outset. Second, to discuss how the competitive exclusion principle can be generalized, we need to discuss situations where species, despite sharing prey and predator species, may still coexist. This is at the core of our discussion.
In the specific case of Fig. C.a, by allowing complementary interaction rates, the two species occupy distinct niches, even though they in principle share both prey species. Coexistence between two competing species can be rationalized as a form of trade-off (Tilman, 1990), where one species may be better at preying on one resource than the other, thereby inevitably sacrificing efficiency in preying on the second resource species. To make this more explicit, we now discuss in detail the example shown in Fig. C.c, starting from its steady-state condition. To simplify the discussion, we use a single efficiency $\beta$ and two link strengths, $\eta_{\parallel}$ and $\eta_{\times}$. Note that we have also dropped the superscripts in these coefficients. Using these simplifications, we arrive at solutions of which the first should be compared to Eq. S25. From these solutions we see that the basal species are now completely dependent on the interaction rates with the consumer species. $S_1^{(1)}$ and $S_2^{(1)}$ become inversely proportional to the efficiency of predation, $\beta$, and inversely proportional to the sum of contact rates with the predatory species. From the densities of the consumer species, $S_1^{(2)}$ and $S_2^{(2)}$, we see that they become singular as the difference between (squared) link strengths vanishes, i.e. $\eta_{\parallel} - \eta_{\times} \to 0$. This corresponds to the case of the competitive exclusion principle, when generalizing to a niche consisting of two basal species. Equal link strength is equivalent to perfect competition between the two consumer species. The singularity could only be lifted by choosing $k_1 = k_2$, a combined situation where we would consider the basal species to be identical.
Conversely, solutions do exist when link strengths are not equal, e.g. $\eta_{\parallel} \gg \eta_{\times}$. In that case, feasible solutions imply inequalities between the parameters, e.g. $\alpha^{-1} > k_{1/2}^{-1} + (\eta_{\parallel}\beta/2)^{-1}$. This is to say that the "free survival time" $\alpha^{-1}$ must be sufficiently long to outweigh the production time $k_{1/2}^{-1}$ and the effective "processing time" $(\eta_{\parallel}\beta/2)^{-1}$. Solutions with a component resulting from $\eta_{\times}$ would yield contributions to the consumer densities from both basal species (compare Eq. S36).
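The singular behaviour at equal link strengths can be made concrete with a small calculation. The sketch below assumes steady-state conditions of the form $\alpha = \beta(\eta_{\parallel} S_i^{(1)} + \eta_{\times} S_j^{(1)})$ for the consumers and $k_i (1 - S_1^{(1)} - S_2^{(1)}) - \alpha = \eta_{\parallel} S_i^{(2)} + \eta_{\times} S_j^{(2)}$ for the basal species, with a single decay rate $\alpha$ for all four species. This form, the parameter values and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def two_by_two_web(k1, k2, alpha, beta, eta_par, eta_x):
    """Return (basal density, consumer 1, consumer 2) or None if singular."""
    k1, k2, alpha, beta, eta_par, eta_x = map(
        Fraction, (k1, k2, alpha, beta, eta_par, eta_x))
    det = eta_par ** 2 - eta_x ** 2      # difference of squared link strengths
    if det == 0:
        return None                       # perfect competition: no unique solution
    basal = alpha / (beta * (eta_par + eta_x))   # same for both basal species
    g1 = k1 * (1 - 2 * basal) - alpha
    g2 = k2 * (1 - 2 * basal) - alpha
    consumer1 = (eta_par * g1 - eta_x * g2) / det   # Cramer's rule
    consumer2 = (eta_par * g2 - eta_x * g1) / det
    return basal, consumer1, consumer2


tenth, half = Fraction(1, 10), Fraction(1, 2)
separated = two_by_two_web(2, 3, tenth, half, 1, 0)   # eta_x = 0: two independent chains
equal_links = two_by_two_web(2, 3, tenth, half, 1, 1)
```

In the separated case all densities are positive, consistent with $\alpha^{-1} = 10$ exceeding $k^{-1} + (\eta_{\parallel}\beta/2)^{-1}$ for both producers, whereas equal link strengths leave the consumer densities undetermined.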
The above example, with each consumer species developing a weak and a strong link to a resource species, could in practice be generated by functional relations between interaction parameters, known as trade-offs (Tilman, 1990). While mathematically possible, complete equivalence of parameters is biologically exceedingly unlikely. In natural systems, even a small imbalance in the magnitude of parameters would allow one of the species to become dominant. Further, it has previously been shown that very similar parameter values lead to very small probabilities of coexistence (Haerter and Sneppen, 2012), simply because the population redistribution between the two species would perform a random walk until one of them reaches the absorbing state of zero population.

S5 Assembly rules for a food web with strict trophic levels
The assembly rules state the minimal conditions for a sustainable steady state of a food web, which in turn requires that the determinant of its interaction matrix $R$ be nonzero. Computationally, this can be examined by assigning random values to all nonzero entries and subsequently calculating the rank of $R$. If the rank $r$ is equal to the total number of species $S$, the known interactions are sufficient to explain the sustainability of the food web. If the rank is less than $S$, the rank deficiency $d \equiv S - r$ gives a lower bound on the number of additional links needed to sustain the food web.
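The computational recipe described above can be sketched in a few lines. The function below fills the nonzero pattern of an interaction matrix with random rational values and computes the rank by exact Gaussian elimination. The 0/1 patterns are hypothetical examples, the function name is not from the source, and the use of positive random integers (ignoring interaction signs) is a simplifying assumption.

```python
import random
from fractions import Fraction


def rank_deficiency(pattern, seed=0):
    """d = S - r for a random matrix with the given nonzero pattern."""
    rng = random.Random(seed)
    M = [[Fraction(rng.randint(1, 10**6)) if entry else Fraction(0)
          for entry in row] for row in pattern]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return n_rows - rank


# Producer (row 1, coupled to itself through the shared nutrient) with one consumer.
chain_pattern = [[1, 1],
                 [1, 0]]
# Same producer with two consumers that share it as their only prey.
exclusion_pattern = [[1, 1, 1],
                     [1, 0, 0],
                     [1, 0, 0]]

d_chain = rank_deficiency(chain_pattern)
d_exclusion = rank_deficiency(exclusion_pattern)
```

Exact rational arithmetic avoids the tolerance choice that a floating-point rank computation would require.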
The nonzero-determinant rule can be reformulated in terms of the network structure. We here consider the case of fully bi-directional networks without species with varying food chain lengths, such as omnivores (i.e. all species must have strict trophic levels). The assembly rules can then be summarized as:
1) Food webs without omnivores require perfect matching to be sustainable. Perfect matching is the ability to cover a non-directed network with non-overlapping pairings (Lovász and Plummer, 1986).
2) The assembly rules imply general constraints for sustainability of purely trophic food webs, given by Eq. S38 for the case of one common resource. Here $N_o$ and $N_e$ are again defined as the total species richness at all odd and even trophic levels, respectively. Eq. S38 does justice to the constraints imposed by the species richness at distinct trophic levels, as discussed in the main text. Violation of these constraints can e.g. be quantified by the difference $N_e - N_o$, which expresses to which extent the food web is in the consumer-dominated state. $N_e - N_o$ can be used to estimate the minimal rank deficiency of a given set of data for a food web with sharp trophic levels and an excess number of consumers.
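A counting check in the spirit of these constraints can be sketched as follows. The sketch is one reading of the rules, inferred from the inequalities quoted in Sec. S6.1 ($n_2 \ge n_1 - \Delta$, $n_3 \ge n_2 - n_1 + \Delta$, and so on) together with $N_o - N_e = \Delta \in \{0, 1\}$; it is not a transcription of Eq. S38. It only tests species numbers, so passing it is necessary but not sufficient: the actual links must also permit a pairing. Function names are hypothetical.

```python
def counts_allowed(n, delta):
    """n = [n_1, ..., n_L]; delta = number of basal species paired with the nutrient."""
    paired_down = delta  # species on the current level that must pair downward
    for richness in n:
        if richness < paired_down:
            return False                      # too few species to absorb the level below
        paired_down = richness - paired_down  # remainder must be paired from above
    return paired_down == 0                   # nothing may remain above the top level


def allowed_deltas(n):
    """Values of delta in {0, 1} for which the richness vector passes."""
    return [delta for delta in (0, 1) if counts_allowed(n, delta)]


single_chain = allowed_deltas([1, 1, 1])   # resource-limited, N_o - N_e = 1
two_level = allowed_deltas([1, 1])         # consumer-limited, N_o - N_e = 0
exclusion = allowed_deltas([1, 2])         # two consumers, one producer
wide_middle = allowed_deltas([3, 4, 2])    # high richness at the intermediate level
```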

S6 Omnivory and parasitism
Omnivores (Fagan, 1997; Thompson et al., 2007), some parasites (McCann, 2000; Bascompte et al., 2005; Lafferty et al., 2008; Dunne et al., 2013) and pathogens (Mordecai, 2011) that prey on multiple trophic levels have been suggested to play a moderating role in food webs. When disregarding concomitant links (Secs S9.3.3, S10.4), parasites, despite their complex life cycles (Huxham et al., 1996), can be seen to formally act as omnivores in our steady-state equations when a given parasite species is considered with its population as a whole. In our analysis, even the sign of the interaction would not matter, i.e. mutualistic interactions with species at distinct trophic levels, or a combination of different interaction types (e.g. antagonistic towards one species at one trophic level and mutualistic towards another species at a distinct level), would also be possible.
In terms of the assembly rules, omnivores whose prey includes species on a given trophic level $l$ can be interpreted as trophic species on level $l + 1$. Specifically, consider an omnivore preying on all species at the $L_g$ trophic levels $\{i_1, i_2, \ldots, i_{L_g}\}$. This omnivore can then be interpreted as a trophic consumer residing on any of the trophic levels $\{i_1 + 1, i_2 + 1, \ldots, i_{L_g} + 1\}$. Consider now the introduction of this omnivore into an existing $L$-level food web containing only nearest-level trophic interactions, i.e. omnivores are absent prior to the addition. The $L$ diversities are then given by $\{n_1, \ldots, n_L\}$. The result of introducing the omnivore is that the assembly rules now allow the redefinition of one of the existing diversities at the corresponding trophic levels, i.e. $n_{i^*+1}$ can now be redefined to become $\tilde{n}_{i^*+1} \equiv n_{i^*+1} + 1$, where $i^* \in \{i_1, i_2, \ldots, i_{L_g}\}$. Note that for any single omnivore this replacement can only be made for a single selection among the $L_g$ trophic levels $\{i_1, i_2, \ldots, i_{L_g}\}$. This means that, once the omnivore has been associated with a given trophic level, it can no longer be re-interpreted as adding to species richness at another. That said, the previous assembly rules still hold when the diversities $n_i$ are replaced by the $\tilde{n}_i$ (Eq. S40).

Consider the introduction of an omnivore into an existing food web which previously obeyed the assembly rules. Using the guideline defined by Eq. S40, the omnivore may or may not lead to a violation of the resulting set of assembly rules given by the $\tilde{n}_i$. In the resource-limited case, the requirement then becomes that interpretation of the omnivore as a consumer is possible, i.e. it must prey on one of the resource species levels (odd levels). Further, all of the inequalities in Eq. S38 must still hold when replacing $n_i$ with $\tilde{n}_i$. This would e.g. be impossible if the added species richness were to occur at a level already at the limit of the allowed number, e.g. when adding another top predator to a system where the number of top predators equals that of the bordering trophic level, $n_L = n_{L-1}$.

S6.1 Example of omnivory
Generally, multiple omnivores may be present in a given food web. We exemplify the effect of four omnivores ($N_g = 4$) on a possible configuration of diversities in a food web consisting of seven trophic levels (Fig. D). For simplicity, in the figure we consider all omnivores to prey on the same three trophic levels, 1, 4, and 6. Note also that, in the absence of the omnivores, the food web presented in the figure violates the assembly rules, since $n_2 \ge n_1 - 1$ is not obeyed.

Fig. D: Interaction matrix for seven trophic levels with respective species diversities $n_1, \ldots, n_7$. $N_g$ specifies the number of omnivore species preying on more than one trophic level, here $n_6$, $n_4$, and $n_1$. The schematic shows that $n_2$ and $n_5$ do not fulfill the standard assembly rules but that the species richness at these levels can be supplemented by the omnivore count, i.e. $\tilde{n}_2 \equiv n_2 + 1$ and $\tilde{n}_5 \equiv n_5 + 1$, yielding $\tilde{N}_g \equiv N_g - 2$. The row and column of addition of these two omnivores is indicated by arrows. The two remaining omnivores can further be accommodated by interpreting one as further contributing to $\tilde{n}_2$, the other as contributing to $\tilde{n}_5$.
By interpreting one of the omnivores as a consumer of species at trophic level 1, the replacement $\tilde{n}_2 \equiv n_2 + 1$ can be made. This modification allows the condition to be met, and we obtain $\tilde{n}_2 = n_1 - 1$.
The subsequent condition $4 = n_3 \ge \tilde{n}_2 - n_1 + 1 = 0$ is trivially fulfilled, and $7 = n_4 \ge n_1 - 1 - \tilde{n}_2 + n_3 = 4$ is also fulfilled in this example. However, at the next level, $2 = n_5 \ge n_4 - n_3 + \tilde{n}_2 - n_1 + 1 = 3$ is violated. Again, an omnivore can be interpreted to feed on a convenient trophic level, namely level 4, which allows the interpretation of this omnivore as a consumer at level 5. This leads to the redefinition $\tilde{n}_5 \equiv n_5 + 1$, and the assembly rule is now obeyed. The final two assembly rules, involving $n_6$ and $n_7$, prove to be fulfilled.
At this stage of the analysis, two of the omnivores have been re-interpreted as trophic species, allowing all assembly rules to be obeyed. However, the two remaining omnivores must also be accommodated without violation of the assembly rules. In the given example, this is possible by again interpreting one to reside at level 2 and the other at level 5; all inequalities are then still obeyed. Note that accommodating multiple omnivores in an existing food web that is compatible with the assembly rules is only possible when they allow interpretation as species on both even and odd trophic levels. This is the case for the example in Fig. D, where the omnivores prey on levels one and four.
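The bookkeeping of this example can be automated by trying every way of assigning each omnivore to one of the levels directly above its prey. The sketch below uses a counting check inferred from the inequalities quoted in this section, not Eq. S38 itself. The richness vector combines the values stated in the text ($n_3 = 4$, $n_4 = 7$, $n_5 = 2$, $n_2 = n_1 - 2$) with hypothetical choices $n_1 = 5$ and $n_6 = n_7 = 2$, since the remaining values of Fig. D are not given here.

```python
from itertools import product


def counts_allowed(n, delta):
    """Counting check: delta basal species pair with the nutrient (assumed reading)."""
    paired_down = delta
    for richness in n:
        if richness < paired_down:
            return False
        paired_down = richness - paired_down
    return paired_down == 0


def omnivore_placements(n, prey_levels_per_omnivore):
    """All distinct (modified richness vector, delta) that pass the counting check."""
    options = [[level + 1 for level in prey] for prey in prey_levels_per_omnivore]
    solutions = set()
    for placement in product(*options):
        n_tilde = list(n)
        for level in placement:          # each omnivore counts at exactly one level
            n_tilde[level - 1] += 1
        for delta in (0, 1):
            if counts_allowed(n_tilde, delta):
                solutions.add((tuple(n_tilde), delta))
    return sorted(solutions)


n = [5, 3, 4, 7, 2, 2, 2]                  # partly hypothetical, see lead-in
without_omnivores = omnivore_placements(n, [])
with_omnivores = omnivore_placements(n, [(1, 4, 6)] * 4)
```

For these numbers the only admissible interpretation places two omnivores at level 2 and two at level 5, matching the assignment described in the text.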

S6.2 Generalist omnivory
The previous discussion shows that the presence of an omnivore allows more flexibility in interpreting the diversities $n_i$ in the assembly rules (Eq. S38). The presence of an omnivore can hence allow for fulfillment of the assembly rules when a trophic species is added or removed, which would otherwise lead to a violation of the rules. The situation is most transparent for a generalist omnivore, which we define as a species that preys on all existing trophic levels. This type of omnivore allows the greatest flexibility: (i) Incorporation into an existing food web is always compatible with the assembly rules, as the generalist omnivore can be interpreted as adding to either $N_o$ or $N_e$ (which would then lead to an increase of $N_o$ or $N_e$ by up to the number of omnivores, yielding a modified $\tilde{N}_o$ or $\tilde{N}_e$, respectively). (ii) Upon addition of a new species into a food web encompassing the generalist omnivore, a possible violation of the assembly rules by the new species can often be circumvented by re-interpreting the generalist omnivore as a species residing on a convenient trophic level. Specifically, the assembly rules in the presence of $N_G$ generalist omnivores (subscript capital "G"), in addition to the food web species richness described in the main text, are modified accordingly. This means that any of the $N_G \ge \Omega$ generalist omnivores can, but need not, be interpreted as an additional trophic species residing on any of the existing trophic levels.

S6.3 Pairing requirements for food-webs with omnivores
In terms of the pairing presented in Sec. S5, omnivores can allow additional pairing options as they prey on multiple trophic levels, making it easier to fulfill our basic assembly rule, $\det(R) \neq 0$. Each pairing of connected species may influence the population sizes of both species (so far marked by a single solid arrow, Figs 1, 3 and 4, main text). To highlight more explicitly that the interaction influences both species involved in the pairing, we now use a more specific notation, where the single arrow is replaced by two dashed arrows (shown in Fig. E.a). Thereby, the interaction that allows species $i$ to influence species $j$ is marked with a directed dashed link from $i$ to $j$. This may for example be a predator $i$ that reduces the population of prey $j$, corresponding to a nonzero matrix element $R_{ji}$. If the predator also gains in population by the interaction, there is a link from $j$ to $i$, resulting in a nonzero matrix element $R_{ij}$. A directed pairing is now defined as an interaction involving two species with only one direct link between them, i.e. either, but not both, of the two matrix elements $R_{ji}$ and $R_{ij}$. Pairings hence always involve two directed pairings in opposite directions (both matrix elements $R_{ji}$ and $R_{ij}$).

Complete directed pairing: Consider a sequence of nonzero matrix elements $R_{ij}, R_{jk}, R_{kl}, \ldots, R_{mi}$. We call this type of sequence a closed loop of directed pairings, i.e. a chain of directed pairings where the direction is maintained and the last element connects to the first. When the entire food web is covered by non-overlapping closed loops of directed pairings, we refer to this as complete directed pairing.
Each nutrient source can, but need not, participate as a species (and in this case each nutrient source participates in a bi-directional directed pairing with each basal species). det(R) can only be nonzero if the requirement of complete directed pairing is fulfilled. Note that a food web that can be perfectly matched will automatically allow a complete directed pairing.
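The link between complete directed pairing and the determinant follows from the Leibniz expansion: $\det(R)$ is a signed sum over permutations, and each permutation is a set of non-overlapping closed loops. At least one permutation must therefore select only nonzero entries. Whether such a permutation exists can be tested with a standard augmenting-path bipartite matching between rows and columns, sketched below. The function name and example patterns are hypothetical, and diagonal entries are treated as loops of length one (a species coupled to itself, e.g. through the nutrient).

```python
def has_complete_directed_pairing(R):
    """True if some permutation sigma has R[i][sigma(i)] != 0 for every row i."""
    n = len(R)
    row_of_col = [None] * n  # current assignment: column j -> row i

    def try_row(i, seen):
        # Try to give row i a column, re-routing earlier rows if necessary.
        for j in range(n):
            if R[i][j] != 0 and j not in seen:
                seen.add(j)
                if row_of_col[j] is None or try_row(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    return all(try_row(i, set()) for i in range(n))


# Producer with self-coupling and one consumer: loop producer -> consumer -> producer.
paired_chain = has_complete_directed_pairing([[1, 1],
                                              [1, 0]])
# Two consumers that both depend only on the same producer: no loop cover exists.
unpaired = has_complete_directed_pairing([[1, 1, 1],
                                          [1, 0, 0],
                                          [1, 0, 0]])
```

The test is structural: it establishes that $\det(R)$ can be nonzero for suitable parameter values, not that it is nonzero for a particular parameter set.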
Fig. E.a shows a food web that is not completely paired. The orange species needs to be paired with the resource, making it impossible to pair the remaining three species. Fig. E

S7 Basal species with separate nutrients
So far we have restricted the discussion to the case of a single fundamental energy source that was shared by all primary producers. In practice, there may be several fundamental energy sources, such as different chemical compounds, in addition to sun light. Further, nutrient partitioning can also result from e.g.
As a consequence, a partitioning of the nutrient sources between the primary producers will occur. For simplicity, we take the $n_1$ primary producers to each depend on one of $n_S \le n_1$ nutrients. In Eq. S1 this corresponds to setting only those $p_{ij}$ equal to unity where $i$ and $j$ share a nutrient. In that case, $A_0$ breaks down into a block structure with $n_S$ blocks. Each of these blocks is a square matrix of unit entries and dimension equal to the number of primary producers sharing the corresponding nutrient. Each of these blocks can then be manipulated to yield a single row of ones and otherwise zeros. This means that $n_S$ rows of unit entries persist. In the assembly rules, the availability of the $n_S$ nutrients again allows for more flexibility. The resulting set of conditions is analogous to the one for a single nutrient (Eq. S38), except that now $\Delta$ can take any value between zero and the number of available nutrient types, a choice that then carries over to the additional sets of inequalities.
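The statement that $n_S$ rows of unit entries persist can be checked by row reduction. The sketch below builds a block-diagonal matrix of ones from hypothetical group sizes (producers per nutrient) and computes its rank exactly; the helper names are illustrative.

```python
from fractions import Fraction


def block_ones(group_sizes):
    """Block-diagonal matrix with one all-ones block per nutrient."""
    n = sum(group_sizes)
    A = [[0] * n for _ in range(n)]
    start = 0
    for size in group_sizes:
        for i in range(start, start + size):
            for j in range(start, start + size):
                A[i][j] = 1
        start += size
    return A


def rank(M):
    """Rank by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


# Six producers on three nutrients, shared by 3, 1 and 2 producers respectively.
rank_three_nutrients = rank(block_ones([3, 1, 2]))
# All six producers on one shared nutrient.
rank_one_nutrient = rank(block_ones([6]))
```

Each block contributes exactly one independent row, so the rank equals the number of nutrients.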

S8 Species link requirement to include predator saturation
In consumer-resource relationships, type-II functional responses are characterized by a less-than-linear increase of consumption as resource concentration grows. This distinguishes type-II from type-I responses, where consumption is linear in resource concentration, i.e. $c(R) \sim R$, where $c$ is the consumption rate per consumer and $R$ is the concentration of the resource species. In the following we set the non-linear type-II response to a saturating function of $R$ with a constant $K$ and discuss the resulting modifications of the food web assembly rules.

S8.1 Model equation with type-II response
Predator saturation, i.e. the situation where the predator growth rate saturates at a high prey density, can be taken into account in the assembly rules by replacing the type-I by the type-II functional response.
For simplicity, let us again first consider the case of completely shared nutrients for basal species and absence of omnivores. The population dynamics for the basal and the higher trophic levels now obey modified equations (compare Eqs 3 and 4, main text). Here, the term $\tilde{\eta}$ denotes the predation rate of species $k$ at trophic level $l + 1$ with respect to species $i$ at trophic level $l$, with a half-saturation density $K$. When the prey density is small compared to $K$, the function reduces to the type-I response function. When the prey density is large compared to $K$, the function reduces to a term that no longer grows with the prey density, which implies that the species at level $l$ is not fully exposed to the population of $S_k^{(l+1)}$.
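The two limits of the saturating response can be illustrated numerically. The sketch below assumes the common Holling type-II form $c(R) = R / (1 + R/K)$, which has half-saturation density $K$; the exact expression and normalization used in the text are not reproduced here, so this form and the function names are assumptions.

```python
def type_one(R):
    """Linear (type-I) response: consumption proportional to resource density."""
    return R


def type_two(R, K):
    """Saturating response, ASSUMED Holling form with half-saturation density K."""
    return R / (1.0 + R / K)


K = 2.0
low = type_two(1e-6, K) / type_one(1e-6)   # close to 1: type-I limit for R << K
half = type_two(K, K)                      # half of the saturation value K
high = type_two(1e9, K)                    # close to K: saturated for R >> K
```

At high prey density the per-consumer consumption no longer increases with $R$, which is the saturation regime referred to in the text.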

S8.2 Constraints from steady state condition and extended species pairing
The non-overlapping species pairing of nodes required for coexistence (described in the main text) can be extended to the type-II functional response case. In order to derive it, we first consider top predators, middle species, and basal species separately, and then summarize the extended rules.

S8.2.1 Top predators
Consider the top predators at level $L$. Their population dynamics is given by Eq. S49. In steady state, Eq. S49 amounts to one constraint among the populations of all prey species $S^{(L-1)}_j$ of a given top predator. For $n_L$ top predator species in steady state, it must be ensured that Eq. S49 does not over-constrain the densities of the prey populations. There should hence be $n_L$ prey species whose populations are determined by these equations. In other words, it should be possible to pair each top predator with one of its direct prey species in a non-overlapping way. Hence, for top predators and their prey, the pairing requirement remains unchanged, irrespective of whether the type-I or type-II response is of interest.

S8.2.2 Intermediate species
Now consider a species i at a trophic level l, that is neither a top predator nor a basal species.

Type-I response of all predators with respect to $S^{(l)}_i$
Suppose that $S^{(l)}_i \ll K^{(l+1,l)}_{ki}$ for all of its predators $k$; for convenience, we term such species "type-I response species". The dynamical equation for the population density $S^{(l)}_i$ then reduces to the type-I form. In this case, assuming $S^{(l)}_i \neq 0$, the steady state yields one constraint among the populations of its predators and its prey, so that the species can be paired with one of its predators or one of its prey species, as in the main text.

Figure F (caption): The food web of Fig. 3a, main text, with an additional species included at trophic level 2, i.e. now $n_2 = 4$. This species is taken to be in the saturation phase w.r.t. the response to predation by a predator on trophic level 3 (the type-II response is indicated by a dotted arrow). Under these circumstances, the species can be considered to pair with itself in the species pairing, indicated by a shaded circle. When viewing the adjacency matrix as a table of relations between a consumer and a resource species, the self-pairing can be interpreted as a diagonal matrix element (shown as a pink square), i.e. a dependency of the change in species density on itself.

Type-II response of some predators with respect to $S^{(l)}_i$
Now suppose that $S^{(l)}_i$ is comparable to or larger than at least one of the constants $K^{(l+1,l)}_{ki}$; we term such interactions "type-II response". The equation then takes the form of Eq. S50. In this case, given $S^{(l)}_i \neq 0$, the steady state of Eq. S50 yields one constraint among its own population $S^{(l)}_i$, its predators' populations $S^{(l+1)}_k$, and its prey species' populations $S^{(l-1)}_j$. For this condition not to over-constrain the populations, it should now be possible to link each of the type-II response species to either itself, one of its predators, or one of its prey species in a non-overlapping way.
An example of a food web involving a type-II response species is shown in Fig. F, demonstrating the possible self-pairing of the added species given that the saturation regime is reached. Note that a species' ability to self-pair comes with the condition that the saturation level $K$ in Eq. S43 is reached by the species' population density. In the example given in Fig. F (Mougi and Kondoh, 2012; Brose et al., 2006).

S8.2.3 Basal species
Type-I response of all predators with respect to $S^{(1)}_i$
If $S^{(1)}_i \ll K^{(2,1)}_{ki}$ for all its predators $k$, a basal species obeys the type-I equation. The steady state condition with $S^{(1)}_i \neq 0$ gives a constraint among the sum of basal species $\sum_j S^{(1)}_j$ and its predators' populations $S^{(2)}_k$. Hence, for this condition not to over-constrain the populations, it should be possible to pair each of the type-I response basal species with one of its predators in a non-overlapping way, additionally allowing one of the basal species to pair with the nutrient source.

Type-II response of some predators with respect to $S^{(1)}_i$
The steady state condition with $S^{(1)}_i \neq 0$ gives a constraint among its own population $S^{(1)}_i$, the sum of basal species $\sum_j S^{(1)}_j$, and its predators' populations $S^{(2)}_k$. In order for this condition not to over-constrain the populations, it should be possible to pair each of the type-II response basal species with either itself or one of its predators in a non-overlapping way, again allowing one of the basal species to pair with the nutrient source. The generalization to multiple nutrient sources is analogous to the discussion in Sec. S7, i.e. each nutrient source may, but need not, be used for pairing a basal species.

S8.2.4 Summary of the extended species pairing requirement
Summarizing the conditions above, the species pairing requirements with type-II response read as follows:
• It must be possible to pair each species with one of its "partners" in a non-overlapping way.
• This partner can be any of its predator or prey species if the species' response to all its predators is in the type-I regime (sufficiently low species density).
• If a species' response to one or more of its predators is not in the type-I regime (sufficiently high species density), it can alternatively be paired with itself. This self-pairing corresponds to a self-limit on growth, which prevents it from fully controlling its prey.
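The extended pairing requirement can be tested on small webs with a standard maximum bipartite matching. The sketch below is an illustration under stated assumptions: the web, the function names, and the representation of self-pairing (and of pairing with a nutrient source) as a species being its own admissible partner are hypothetical choices, not the procedure used for the data.

```python
def build_partners(n_species, links, self_paired):
    """Return, for each species, the set of admissible pairing partners."""
    partners = [set() for _ in range(n_species)]
    for consumer, resource in links:
        partners[consumer].add(resource)  # a species may pair with its prey
        partners[resource].add(consumer)  # or with one of its predators
    for i in self_paired:
        partners[i].add(i)  # self-pairing (own nutrient or saturation regime)
    return partners


def max_pairing(partners):
    """Size of the largest non-overlapping pairing (augmenting paths)."""
    owner = {}  # partner index -> species currently paired with it

    def try_pair(i, seen):
        for j in sorted(partners[i]):
            if j in seen:
                continue
            seen.add(j)
            if j not in owner or try_pair(owner[j], seen):
                owner[j] = i
                return True
        return False

    return sum(try_pair(i, set()) for i in range(len(partners)))


# Hypothetical web: 0 = basal species with its own nutrient,
# 1, 2, 4 = consumers of the basal species, 3 = top predator of 1, 2, 4.
links = [(1, 0), (2, 0), (4, 0), (3, 1), (3, 2), (3, 4)]

type_one_only = max_pairing(build_partners(5, links, self_paired={0}))
with_saturation = max_pairing(build_partners(5, links, self_paired={0, 4}))
print(type_one_only, with_saturation)
```

In this example the three consumers compete for only two admissible partners, so one species stays unpaired in the type-I case. Allowing species 4 to pair with itself, as for a species in the saturation regime, completes the pairing.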

S9 Data Analysis
This section describes details of the data analysis performed. In particular, we describe the characteristics of the food web data considered (Sec. S9.1), an analysis of the sub webs formed by free-living species only (Sec. S9.2), as well as the analysis of the sub webs including parasite species (Sec. S9.3). The latter also involves the discussion of concomitant predation (Sec. S9.3.3). Several of these sections involve specific comments on the individual food webs; we have marked these by "details" to facilitate reading. We have also moved more detailed figures (included for completeness) to the end of the text. We further examine the possible existence of secondary extinctions by removal of species in the different empirical food webs (Sec. S9.4).

S9.1.1 Food web data
We use high-resolution data on seven food webs including free-living and parasite species: the North American Pacific Coast webs Carpinteria Salt Marsh, Estero de Punta Banda, and Bahia Falsa (Hechinger et al., 2011; Lafferty et al., 2006); the coastal webs Flensburg Fjord (Zander et al., 2011), Sylt Tidal Basin, and Otago Harbor, New Zealand; as well as the Ythan Estuary, Scotland (Huxham et al., 1996). These food webs describe consumer-resource interactions between basal, predatory and parasite species. A compilation of all seven food webs has recently been provided (Dunne et al., 2013). Specifically, the data distinguish three types of links: (i) links between free-living species only ("Free"), (ii) additional links between parasites and other species ("Par") and (iii) links from free-living consumers to the parasites of their resources ("ParCon"), i.e. so-called concomitant links.

S9.1.2 Technical considerations on parasites
Similar to predators, parasites feed on their hosts. Parasite-host interactions can be distinguished from predator-prey interactions in that parasites undergo complex life cycles (Huxham et al., 1996), where a single parasite may benefit from one host at one stage but from a different host at another stage. In addition, a parasite often lives in or on its hosts, and concomitant links from the hosts' predators to the parasite may result (Sec. S9.3.3 and Sec. S10.4). On the population level, however, the total population of a given parasite species can be regarded as feeding on a number of different hosts simultaneously. We adopt this view in the following as far as "Par" is concerned, as the details do not affect our mathematical description in terms of food web matrices. However, these simplifications should be kept in mind when considering, in particular, removal of species and the question of potential sustainability of the resulting webs (Sec. S9.4). When considering concomitant links, we make additional distinctions (Sec. S9.3.3). For example, one aspect that we consider further is that the host may also be affected by the parasite, e.g. by reduction or modification of activity, thereby affecting metabolism or reproductive success (Lafferty and Morris, 1996; Johnson et al., 2010).

S9.1.3 Matrix rank, basal species and nutrient sources
For our general coexistence condition of $\det(R) \neq 0$ we do not require details of the numerical values of matrix coefficients. Instead, we first symmetrize the interaction matrix to mimic the bi-directional effect in predator-prey and parasite-host pairs. Further, we allow basal species to feed on individual nutrient sources ($p_{ij} = \delta_{ij}$). We then assign random numbers to all nonzero matrix elements to numerically compute the determinant. To allow a more refined view of food webs exhibiting $\det(R) = 0$, we have computed the rank $r$ of $R$ (Otto and Day, 2007), i.e. the number of linearly independent rows (or columns) of the matrix. Ecologically, the rank amounts to the number of uniquely occupied niches. We also repeatedly use the rank deficiency (i.e. nullity) $d \equiv S - r$, with $S$ the total number of species. We distinguish $d_f$, $d_{f+p}$, $d_{con,asym}$, and $d_{con,sym}$, the deficiencies of the sub webs of free-living species, of free-living and parasite species, and of the webs with additional asymmetric and symmetric concomitant links, respectively (Sec. S9.3.3).
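The procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function name, the range of the random entries, and the two toy webs are assumptions made for the example.

```python
import numpy as np


def rank_and_nullity(n_species, links, basal, seed=0):
    """Rank r and nullity d = S - r of a randomly weighted interaction matrix."""
    rng = np.random.default_rng(seed)
    pattern = np.zeros((n_species, n_species), dtype=bool)
    for consumer, resource in links:
        pattern[consumer, resource] = True
        pattern[resource, consumer] = True  # symmetrize: bi-directional effect
    for b in basal:
        pattern[b, b] = True  # basal species feed on individual nutrient sources
    # Assumed choice: random magnitudes in [0.5, 1.5) on all nonzero positions.
    values = rng.uniform(0.5, 1.5, size=pattern.shape)
    r_matrix = np.where(pattern, values, 0.0)
    rank = int(np.linalg.matrix_rank(r_matrix))
    return rank, n_species - rank


# One basal species (index 0) with a single consumer: full rank.
rank_a, d_a = rank_and_nullity(2, [(1, 0)], basal=[0])

# One basal species with two consumers: one consumer lacks a niche.
rank_b, d_b = rank_and_nullity(3, [(1, 0), (2, 0)], basal=[0])
print((rank_a, d_a), (rank_b, d_b))
```

In the second web the two consumer rows are proportional to each other for any choice of nonzero values, so the nullity of one reflects the structure of the web and not the random numbers.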

S9.1.4 Treatment of detritus and particulate organic matter (POM) in the data
Several of the empirical food web data sets contain detritus or POM as basal species (Wilson and Wolkovich, 2011; Gaedke et al., 1996; Pauly, 1996). The equation describing the concentration of such non-living "basal species" $D_i$ should have a source term that is independent of $D_i$ (Eq. S54). This first term on the RHS is not linear in $D_i$; therefore the steady-state equation cannot be expressed in matrix form. However, as summarized in Sec. S8, the species pairing can be extended to nonlinear equations as a condition not to over-constrain the variables. In this regard, the steady state of Eq. S54 constrains its own "population" $D_i$ and its consumers' populations $S_k$. Thus, the $D_i$ can be paired with either themselves or one of their consumers, as in the case of a basal species with its own nutrient. We therefore treat such resources as basal species with their own nutrient in the data analysis.

S9.1.5 Approximate trophic levels
For all empirical food webs, we first organize species according to their approximate trophic level. In accordance with the literature (Williams and Martinez, 2004; Thompson et al., 2007), we here aim to define the trophic level of a given species as its average food chain length. Basal species, which do not consume other species, are thereby assigned trophic level 1. Within the prey-averaged definition (Williams and Martinez, 2004), the trophic level of a given species is then the average trophic level of all its resource species plus one, e.g. a species that only consumes basal species has trophic level 2. In practice, we have determined chain lengths by following paths from a given species to one of the basal species, choosing a random resource with equal probability at each intersection. As the choices are random, we have repeated the procedure to ensure that fluctuations are small and that the general conclusions do not depend on the sampling. For two species with very similar approximate trophic levels, the order could occasionally switch, an effect that we were able to reduce by using a larger sample size. We compared results using between $2 \times 10^3$ and $2 \times 10^4$ food chain samples per species and detected very little variation. Some food webs contain loops, where the same species recurs within a single food chain. These loops were dropped from the analysis of trophic levels. In the following, we use the computed average and standard deviation of a species' food chain lengths as a proxy for its trophic position and degree of omnivory, respectively. A similar consideration has been made in the literature (Lafferty and Morris, 1996). For simplicity, we will in the following refer repeatedly to "trophic level" when discussing species, by which we mean the approximate average food chain length of the given species.
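The sampling of food chain lengths can be sketched as follows. This is a minimal illustration of the random-path procedure described above; the three-species web, the function names and the sample size are hypothetical, and sampled chains that revisit a species are simply discarded, which is one possible reading of "dropping" loops.

```python
import random
import statistics


def sample_chain_length(species, resources, rng):
    """Follow random resources down to a basal species; None if a loop occurs."""
    visited = {species}
    current, length = species, 1  # a basal species has chain length 1
    while resources[current]:
        current = rng.choice(resources[current])
        if current in visited:
            return None  # the same species recurs: drop this chain
        visited.add(current)
        length += 1
    return length


def chain_length_stats(species, resources, n_samples=2000, seed=1):
    """Mean and standard deviation of sampled food chain lengths."""
    rng = random.Random(seed)
    samples = [sample_chain_length(species, resources, rng) for _ in range(n_samples)]
    samples = [s for s in samples if s is not None]
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical web: each species maps to the list of its resources.
resources = {"plant": [], "herbivore": ["plant"], "omnivore": ["plant", "herbivore"]}

herb_mean, herb_sd = chain_length_stats("herbivore", resources)
omni_mean, omni_sd = chain_length_stats("omnivore", resources)
print(herb_mean, herb_sd, omni_mean, omni_sd)
```

The herbivore has a sharp chain length of 2 with zero standard deviation, while the omnivore, feeding equally on two neighboring levels, has a mean near 2.5 and a standard deviation near 0.5.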

S9.2.1 Trophic levels and omnivory
For all food webs, we compute the frequency distributions of average food chain lengths. We also aggregate the data for all food webs and produce the corresponding frequency distribution (Fig. G). Notably, the distribution of aggregated empirical data (Fig. G.a) has pronounced, sharp peaks at chain lengths 1 and 2, corresponding to basal species and their consumers. Approximately 25 percent of species are sharply located at trophic level 2 (dashed red line in the plot). The highest average chain length lies near 4.5 (compare also the literature, e.g. Thompson et al., 2007). Comparing with the individual food webs (Fig. V), this overall pattern is generally preserved. In both figures (Figs G and V), we compare to data produced using the niche and cascade models, with the total species number $S$ and the total number of links $L$ of each food web as input. These models generally produce fewer species at level 2 and considerably broader distribution functions overall, ranging up to chain lengths of 5.5.
To quantify the extent to which species can be associated with sharp trophic levels, we compute also the standard deviation of food chain length for each species, and show the aggregated distribution functions for all resulting values (Fig. N.a-c). We hereby define a species to have a "sharp trophic level", if the species shows no variation, i.e. zero standard deviation, in food chain length. In the plots (Fig. N.a-c), a value of zero correspondingly means that all chain lengths associated with a given species are identical.
The figure shows the corresponding distribution for the empirical data (Fig. N.a); as a reference value, a standard deviation of 0.5 corresponds to a species that consumes at two neighboring trophic levels equally. In comparison, the niche and cascade models (Fig. N.b,c) predict substantially fewer sharp trophic levels. Further, they produce a continuum of values ranging up to 2, which could e.g. be obtained by a species consuming equally on trophic levels 1 and 5. These aggregated results should also be compared to the distributions for the individual webs (Fig. W), where the general pattern is preserved, albeit with some variation (to which we will return in Sec. S10).

S9.2.2 Trophic structure of food web matrices
We now use the obtained mean chain length (CL) of each species to assign an order to species from shortest to longest CL. Using the Bahia Falsa food web as an example, we describe the resulting depiction of food webs (Fig. H, caption). We will subsequently employ this presentation style repeatedly in the remaining text. We also use integer-rounded CL to broadly categorize species (the CL is thereby simply rounded to the nearest integer value). To simplify the discussion, in the following, we refer to interactions between species with integer-rounded chain length l 1 and l 2 as l 1 −l 2 interactions. Consider now the corresponding depiction of all empirical food web matrices of free-living species (Fig. T).

Additional detail.
We here focus first on the sub web of free-living species only. All food webs show a clear structure of blocks of high link density and blocks where links are generally absent. We have noted in Sec. S9.2.1 that species of integer-rounded CL 2 have little variation in chain length. This is also evident from the figure, as the positions of the white blocks in the matrices indicate that species of CL 2 predominantly consume basal species and are in turn mainly consumed by species of CL 3. We further find that the structure of the matrices becomes less defined for species of CL 3 and above. There, different food webs show different behavior. Generically, more links can be observed between species of similar chain lengths, e.g. in the block with chain lengths ∼ 3 considerable interaction is present, while this is absent for species with CL 2 (white spaces). These matrices show additional detail on the trophic organization of species: a clear hierarchical structure is evident, illustrated by blue and red colors (corresponding to the direction of the consumer-resource links). Species located at higher trophic levels tend to act as consumers of species below them.

Figure H (caption, partial): Color coding of species by chain length (red = short, blue = long). This color coding is used in (b) to indicate the new positions of free-living species once parasites are introduced (there is little re-ordering for the lowest 60 species, but more re-ordering above). Gray and black "measuring bars" at the lower and right edges of the food web matrices indicate the number of integer-rounded approximate trophic levels and the corresponding diversities $n_i$. These measuring bars can be used to compare the structure of the matrices in (a) and (b). Basal species have again been assigned individual nutrient sources, yielding finite matrix elements on the diagonal in the lower right block matrix. Figure is a reproduction of main text Fig. 5a,d.

S9.2.3 Discussion of individual matrices (details)
Using Fig. 1, main text, we attempt to construct a path through the matrices. For the different matrices, we make the following observations:
• Bahia Falsa: Constructing a path is not possible, as a large number of well-defined chain length 2 species has formed a white block that is impossible to circumvent using the available 2-3 interactions (interactions between species of integer-rounded CL 2 and 3). This could be remedied by the presence of 4-2 interactions, but these are also mainly absent. In this particular food web, the 4-4 interactions are also generally missing (there is only a thin band of omnivores feeding on levels 3 and 4). Hence, the Bahia Falsa food web is relatively well structured, and the high degree of organization, together with a large number of CL 2 species, leads to failure to comply with the assembly rules (the rank difference, i.e. nullity, is $d_f = 16$; compare the table in Fig. Md).
• There are two other food webs with clear structure: Flensburg Fjord and Otago Harbor. Both have a nearly complete lack of 2-2 interaction and little 2-4 interaction. Both have large 2-2 blocks. In these food webs, too, the assembly rules cannot be fulfilled; the respective rank differences are 15 and 19.
• A further food web with large rank difference is Ythan Estuary (d f = 17). Here, however, the overall link density is very low, compared to the other food webs, leading to an additional difficulty in reaching agreement with the assembly rules.
• The remaining food webs Carpinteria, Punta Banda, and the Sylt Tidal Basin are less clearly structured and the 2-2 block is somewhat smaller. Correspondingly, a path through the matrix can be constructed for Carpinteria and Punta Banda (d f = 0). Only a small rank difference results for Sylt Tidal Basin (d f = 6).
Again, we compare with the simulations obtained from the niche and cascade models.
• Both models describe the hierarchical organization of species (clear organization of blue and red matrix elements), where those with larger CL prey on those with shorter CL. This is clear by the definition of the two models, where an ordering of species is the starting point.
• Both models deviate from the empirical data in that they do not produce any obvious signature of trophic levels, i.e. no block structure is apparent in either of the two models.
• Correspondingly, for all instances of the models considered (with the empirical numbers of species and links), full rank was achieved. Indeed, for random matrices containing exclusively values −1, 0, and 1, with relatively mild conditions, one is very likely to achieve full rank (Tao and Vu, 2006).
• We have further reduced the number of links considerably for the two models: Demanding only that each species be connected to at least one other species, essentially all (> 99 percent) of samples achieved full rank, even when using only 2/3 of the links. In the remaining few cases, a rank difference of 1 resulted.

S9.3.1 Trophic positions with parasites
We now consider the empirical data also for food webs including parasites (Fig. U). We aim to characterize the apparent re-organization that takes place once parasites are included in the food web. We again make use of approximate trophic positions, but it should be kept in mind that trophic positions are a matter of debate for parasites, as they undergo complex life stages, and localizing them at any specific trophic level (or assigning a specific food chain length) is questionable. Here, we use the concept of trophic levels, or food chain length, solely as a means to organize parasite species relative to free-living species and to discuss the respective host range. To this end, we use a color coding (displayed at the edge of the food web matrices in the figure and explained in Fig. H). For the free-living food webs, the color coding simply indicates the sequential organization in terms of the chain length computed for the free-living species alone. When parasites are included, the sequence of the color coding becomes disrupted.
However, the disruption is again far from random: Nearly all species with CL 1 and 2 have unchanged positions in the extended food webs. For species with CL 3, some disruption takes place, i.e. some parasites now have chain lengths ∼ 3. However, most of the parasites enter near the top end of the existing chain lengths and top-predators of the free-living web often no longer have largest chain length in the combined system (Lafferty et al., 2006). This feature of disruption is observed generally for all food webs studied. We also show distribution functions of the respective chain lengths in Fig. AA.
We note more specific features of the matrix structure: Parasites have re-organized parts of the structure at higher trophic positions, leading to substantial randomization there. However, much of the structure in the lower part of the matrix is left intact, in particular, the white block formed by 2-2 interactions is nearly unchanged in most of the food webs. Another noteworthy observation is that the hierarchical structure, highlighted earlier for the free-living webs, is strongly disrupted for the higher trophic levels, when parasites are considered (blue and red matrix elements are now more mixed).

S9.3.2 Discussion of individual food webs (details)
Here it is important to consider the different food webs separately (Fig. U): Carpinteria and Punta Banda show a strong mix-up of hierarchy for the higher levels, while this effect is less clear for some of the other food webs. In particular, Ythan Estuary shows nearly no reduction of the hierarchical structure.
Mixing of hierarchical structure is important as it can lead to an increase in chain length (e.g. free-living species of formerly short chain length can sometimes feed on parasites that then feed on others, thereby increasing chain length). This particular feature is most evident when again considering the distributions of chain length standard deviation (Fig. Z) or the coefficient of variation (Fig. S). Carpinteria and Punta Banda have now acquired much stronger variation of food chain length for some species, with values that now even exceed those produced by the corresponding niche and cascade model simulations. At the other extreme, in the Ythan Estuary food web, the unbroken hierarchical organization leads to a nearly unchanged distribution of chain length variation. For completeness, we also present the aggregated data of chain length standard deviation for the case of free-living and parasite species (Fig. Y).
The above findings on chain length standard deviation and mixing of hierarchical structure are also consistent with the changes in chain lengths (Fig. X, compare Lafferty et al. (2006)). Here, it is found that a continuum of chain lengths beyond 3, ranging up to 5 or 6 in some cases, is now present. However, the pronounced concentration of level 2 species is conserved (most evident in the aggregated data). Returning to Fig. U, we can discuss the relation to the assembly rules. For the Bahia Falsa web, some parasites prey on level 2, granting better chances to find a path through the food web matrix as demanded by the assembly rules (Fig. 1, main text). Indeed, in the Bahia Falsa food web, the presence of parasites reduces the nullity from $d_f = 16$ to $d_{f+p} = 3$. Similar behavior is observed for Otago Harbor, where substantially more species now prey on level 2 species. On the other hand, in the case of the Ythan Estuary, not much structure is added and the overall low link density makes finding a path less likely. This may be one reason why the nullity in fact increases when including parasites (others are discussed in Sec. S10).

S9.3.3 Concomitant predation
We further consider concomitant links. Concomitant links are links from a free-living consumer to the parasites of its resources (Lafferty et al., 2006, 2008): by consuming the resource, the consumer also consumes the parasite, causing an additional death rate for the parasite. Such links have e.g. been suggested to contribute considerably to parasite mortality (Johnson et al., 2010). Specifically, in the following we distinguish asymmetric and symmetric concomitant links.
Asymmetric concomitant effects arise when a predator causes death to its prey's parasite(s), but no effect of the consumption of the parasite is felt by the predator. Generally, this is a type of 3-species interaction, which can be a function of all three species involved (and is thus beyond the simple products of the generalized Lotka-Volterra equations). In principle, one could e.g. include terms of the type given in Eq. S55, where $\eta^*$ is a modified interaction probability. We here make the point that such interactions can lead to additional niches, as they impose new relations between parasites and the hosts' predators. To consider the presence of such interactions, it is therefore sufficient here to consider only the case where the coupling between the parasite and the host's predator does not vary with host density, implying a linear truncation. (Non-linear interactions could also be considered in analogy to Sec. S8.) In terms of the interaction matrix, this is a directed link where the parasite feels the negative effect of the host's predator.
In such cases, the interaction matrix has a nonzero entry at some position $(i, j)$ (Fig. Ec). The entry at the symmetric position $(j, i)$ might be zero if the predator's gain resulting from consumption of its prey does not change when the prey is infected by the parasite. For food webs with some uni-directional links, the basic condition $\det(R) \neq 0$, i.e. "finding a path through the matrix", amounts to the requirement that an out-link and an in-link can be selected for each species in the interaction matrix.

Figure caption (panels: Free-living; Free-living + parasites): Numbers of loops of length three (triangles) of each species for all food webs (as labeled), for the sub-webs of free-living links, of free-living and parasite links, as well as with additional symmetric concomitant links. Loops were computed by first removing diagonal elements from $R$ and collecting the diagonal elements of the cube of the resulting matrix. Note that we count each existing closed loop only once, irrespective of whether the loop may be directed or bi-directional. Panels mark the rank deficiencies for all cases; red labels highlight where changes occur for asymmetric concomitant links. The plot also shows the frequency distribution for numbers of free-living species with parasites, additionally distinguishing the set of high-level (i.e. at least CL 3) free-living species. The table shows the numbers of free-living species that interact with parasites (i.e. either act as parasite hosts or feed on parasites, "para") and that do not interact ("no para"). Also shown: numbers of free-living and parasite species, as well as the numbers for interaction with free-living species at odd and even levels, respectively.
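The loop count described in the figure caption above can be sketched as follows. The snippet is an illustration under assumptions: the function name and the toy matrix are hypothetical, and counting each loop once is implemented here by symmetrizing the link pattern and halving the diagonal of the cubed matrix, which is one way to realize the convention stated in the caption.

```python
import numpy as np


def triangles_per_species(r_matrix):
    """Number of loops of length three passing through each species."""
    a = (np.asarray(r_matrix) != 0).astype(int)
    a = a | a.T  # treat directed and bi-directional links alike
    np.fill_diagonal(a, 0)  # remove diagonal elements first
    closed_walks = np.diag(np.linalg.matrix_power(a, 3))
    # Each triangle is traversed in two directions from every member species.
    return closed_walks // 2


# Hypothetical 4-species pattern: species 0, 1, 2 form a triangle,
# species 3 is attached to species 2 only; species 0 has a diagonal element.
r_example = np.array([
    [1.0, 0.5, 0.3, 0.0],
    [0.4, 0.0, 0.2, 0.0],
    [0.0, 0.6, 0.0, 0.7],
    [0.0, 0.0, 0.9, 0.0],
])
triangle_counts = triangles_per_species(r_example).tolist()
print(triangle_counts)
```

The link between species 0 and 2 is directed in this example, yet the loop through species 0, 1 and 2 is counted once for each of its members.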

Symmetric concomitant effects.
The presence of a parasite may in some cases have effective impact on the consumption rate (Johnson et al., 2010). For example, by altering host behavior, the parasite may facilitate consumption of its host by the consumer, as was shown for killifish that become more susceptible to their natural enemies (birds) when they were infected by trematodes (Lafferty and Morris, 1996).
In such cases, the fraction of hosts infected by a parasite would have a larger probability to be consumed by the predator. This again constitutes a non-linear 3-species interaction as in Eq. S55. Again making a linear approximation, where the predator gain now linearly increases with parasite density, we then consider symmetric concomitant links. These links involve a bi-directional interaction between the host's parasite and predator (see Fig. K for a simple example where coexistence can be obtained).
However, in all empirical cases, the full food webs with parasites and bi-directional concomitant links have lower nullity than the food webs without parasites. We have explored in the empirical data, whether only a small subset of bi-directional concomitant links is sufficient, finding that usually 20 percent yield already substantial improvement of the rank (compare Fig. 5f, main text). Four out of seven full food webs fulfill our assembly rule. Concomitant links are further explored in the simulations in Sec. S10.

S9.4 Removal of species from food web
The lack of niches d can be used to address the potential effect of removal of one species from a food web.
If removal increases the lack of niches further, it can potentially trigger secondary extinctions, especially if d = 0 before the removal of the species. If the removal of a species causes d to decrease (which is possible only if d > 0 before the removal), then the removal of the species could potentially make the food web more compatible with coexistence. It should however be noted that -especially in food webs containing parasites -removal of species, even when decreasing the nullity, can lead to strong disruptions. E.g., when parasites use different hosts at distinct life stages (Lafferty et al., 2008), removal of only one of the hosts could be sufficient to cause extinction of the parasite.
We tested how removal of each species affects $d$ in the seven food web data sets. We again distinguish food webs with free-living species only, with parasites, and with parasites and concomitant links. For simplicity, concomitant links are assumed to enter the matrix symmetrically. For the webs with $d = 0$ before removal (Estero de Punta Banda, Carpinteria Salt Marsh), only an increase of $d$ can occur, and removal of a species that increases $d$ will cause secondary extinction. The Estero de Punta Banda free-living web is quite sensitive to the removal of one species, while the Carpinteria Salt Marsh web is not.
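A removal test of this kind can be sketched in Python as follows. The function names, the numerical entries and the two three-species matrices are illustrative assumptions; the empirical analysis may differ in detail.

```python
import numpy as np


def nullity(r_matrix):
    """Lack of niches d = S - rank(R)."""
    r_matrix = np.asarray(r_matrix, dtype=float)
    return r_matrix.shape[0] - int(np.linalg.matrix_rank(r_matrix))


def removal_effects(r_matrix):
    """Change of d caused by removing each species (row and column) in turn."""
    r_matrix = np.asarray(r_matrix, dtype=float)
    n = r_matrix.shape[0]
    d_before = nullity(r_matrix)
    changes = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        changes.append(nullity(r_matrix[np.ix_(keep, keep)]) - d_before)
    return d_before, changes


# Hypothetical chain: basal (0, own nutrient) - consumer (1) - predator (2).
chain = [[1.0, -0.7, 0.0],
         [0.6, 0.0, -0.9],
         [0.0, 0.8, 0.0]]
d_chain, chain_changes = removal_effects(chain)

# Hypothetical web with two consumers (1, 2) competing for one basal species (0).
competitors = [[1.0, -0.7, -0.5],
               [0.6, 0.0, 0.0],
               [0.4, 0.0, 0.0]]
d_comp, comp_changes = removal_effects(competitors)

# m: number of species whose removal increases d.
m_chain = sum(1 for c in chain_changes if c > 0)
print(d_chain, chain_changes, d_comp, comp_changes, m_chain)
```

In the chain, $d = 0$ and only removal of the intermediate consumer increases $d$. In the second web, $d = 1$ and removing either competitor lowers $d$ to zero, while removing the basal species raises it.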
When considering food webs including parasites, it must again be noted that substantial disruption (and extinction) could be caused when only one of a parasite's hosts is removed. Given this limitation, we state for Bahia Falsa, Flensburg Fjord, and Sylt Tidal Basin that adding parasites could in principle reduce the sensitivity of the food web to removal, as seen by reduced values of the number of species $m$ whose removal increases $d$. Carpinteria Salt Marsh and Punta Banda appear to be less stable when only parasites are added, since $d = 0$ was already realized in the free-living species web, but concomitant links might have the potential to stabilize these webs. Ythan Estuary also shows such a tendency, though it starts with a rather high value of $d$ in the free-living species web.
Otago Harbor remains rather sensitive to the removal of species even with the addition of parasites. This may be due to the rather large excess number of species at trophic level 2 (cf. Fig. M). The addition of parasites decreases $d$, but not enough to completely compensate for the excess species. This makes the web sensitive to removal of species at other levels, while removal of species from level 2 helps to reduce $d$ in all three cases.

S10 Modeling
This section models an idealized food web structure. Sec. S10.1 states the basic assumptions made to obtain an idealized food web with sharp trophic levels (no omnivores). Starting from this "free-living" food web, we discuss several perturbations: modification of the free-living food web to include omnivores (Sec. S10.2), addition of different types of parasites to the free-living food web (Sec. S10.3), and asymmetric and symmetric concomitant links (Sec. S10.4).

S10.1 Idealized food web structure
To obtain additional insight into the generic features of the data, we now produce an idealized free-living food web. To this end, we consider all seven empirical food webs and determine the species' approximate trophic levels (Fig. M.a and b). For simplicity, we assign to each species the closest integer trophic level as its default trophic position, e.g. a species with mean chain length 3.2 would be assigned a value of 3.
Proceeding in this way for all food webs, indeed some degree of generality is found (compare Sec. S5). In terms of resource vs. consumer limitation (Sec. S3), the system would be in a state where far too many consumers were added to be consistent with the assembly rules (we call this consumer-dominated in the following). We note that compensation by nutrient sources is not possible here, since additional nutrient sources could only be used to compensate for an "overweight" of unpaired basal species. In the current situation, however, compensation would be required for species at levels 2 or 4.
By inspecting the table in Fig. Mc, we further find that in each of the empirical food webs the difference N_o − N_e is negative, ranging from −9 for the Ythan Estuary web to −37 for Otago Harbor. All webs are hence in a consumer-dominated state, which results in an inherent nullity, or rank deficiency. This means that if the food webs were organized strictly in terms of these integer trophic positions (no variation of food chain length for a given species), the assembly rules could not be fulfilled for any of them. One can speculate as to the origin of the imbalance in N_o − N_e. One explanation could be that basal species are poorly resolved. Another is that parasites can be seen as adding a counterweight at higher trophic levels and cause additional mixing.
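As a small worked example of the bookkeeping above, the following Python sketch rounds a mean chain length to an integer trophic level and evaluates N_o − N_e for the idealized node richness {9, 42, 35, 24} quoted in the caption of Fig. M. The function names are hypothetical, and rounding halves upward is an assumption that the text does not specify.

```python
import math


def integer_trophic_level(mean_chain_length):
    """Closest integer trophic level (halves rounded up, an assumption)."""
    return math.floor(mean_chain_length + 0.5)


def odd_even_difference(richness):
    """N_o - N_e for species counts listed from trophic level 1 upward."""
    n_odd = sum(richness[0::2])   # levels 1, 3, 5, ...
    n_even = sum(richness[1::2])  # levels 2, 4, 6, ...
    return n_odd - n_even


print(integer_trophic_level(3.2))            # 3
print(odd_even_difference([9, 42, 35, 24]))  # -22
```

The value −22 for the idealized web (44 species at odd levels, 66 at even levels) agrees in magnitude with the theoretical lack of niches of 22 stated for this web.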
Using the idealized structure as a starting point, we initially assume strict trophic positions of all species and assign random interactions within the allowed sub-block of the matrix until the empirical average of links is reached (compare table Fig. Md). The strict trophic positions require that the distribution of chain length variation would be a spike at zero. In the following, we refer to this type of matrix as R^sim_free; a sample is shown in Fig. Me.

Figure M (caption, panels c to e). c, Table of food web data for all empirical and the idealized food web. n_i are the numbers of species at trophic levels i obtained by the rounded mean food chain lengths for all species in the respective webs. N_o − N_e is the difference of the numbers of species at odd and even levels, respectively. S_f and S_p are the numbers of free-living and parasite species, respectively. d, Table containing information on links in the different food webs: L_f are the numbers of links in the free-living web, L_p are the numbers involving at least one parasite. L_p−p are the numbers of links between parasites. d_f and d_{f+p} denote the nullities in the free-living web and in the web of free-living and parasite species, respectively. d_{c,asym} and d_{c,sym} denote the nullities when including asymmetric and symmetric concomitant links, respectively. The table also indicates where values vary in the simulations. e, An example of a sample of the idealized free-living food web matrix R^sim_free, used as a starting point for the simulations: all species are placed at integer-valued trophic positions without food chain variation, {n_1, n_2, n_3, n_4} = {9, 42, 35, 24}. Basal species are all assigned individual nutrient sources (yielding a sequence of diagonal elements in the lower right block matrix). Panels (a), (b) and (e) are a reproduction of main text Fig. 6a,b.
In Secs. S10.2 and S10.3 we use R^sim_free as a starting point for the discussion of two specific types of modifications: addition of omnivores to R^sim_free by re-assignment of links (no link addition), and addition of parasites to R^sim_free (increasing the total number of species to S_f + S_p, see Fig. Mc).
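The construction of a sample of the idealized matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code behind the figures: links are drawn uniformly without replacement from the consumer-resource pairs on neighboring levels, each link contributes a symmetric pair of nonzero entries, each basal species receives a diagonal entry for its nutrient source, and the nullity is taken to be the structural nullity obtained from a maximum row-column matching. The link count of 600 is a placeholder, since the empirical average is not quoted in the text, and all function names are hypothetical.

```python
import random


def structural_rank(n, entries):
    """Size of a maximum matching between rows and columns over nonzero entries."""
    cols_of = [[] for _ in range(n)]
    for i, j in entries:
        cols_of[i].append(j)
    row_of_col = [-1] * n

    def augment(i, seen):
        for j in cols_of[i]:
            if j not in seen:
                seen.add(j)
                if row_of_col[j] == -1 or augment(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def random_layered_web(levels, n_links, seed=0):
    """Random (consumer, resource) links between neighboring trophic levels only."""
    rng = random.Random(seed)
    starts = [sum(levels[:k]) for k in range(len(levels) + 1)]
    allowed = [(c, r)
               for k in range(1, len(levels))
               for c in range(starts[k], starts[k + 1])   # consumers, one level up
               for r in range(starts[k - 1], starts[k])]  # resources, level below
    return rng.sample(allowed, n_links)


def structural_nullity(levels, links):
    """Number of species minus structural rank of the assumed nonzero pattern."""
    n = sum(levels)
    entries = {(b, b) for b in range(levels[0])}  # nutrient source per basal species
    for c, r in links:
        entries.update({(c, r), (r, c)})
    return n - structural_rank(n, entries)


levels = [9, 42, 35, 24]
links = random_layered_web(levels, n_links=600)  # 600 is a placeholder value
print(len(links), structural_nullity(levels, links))
```

Because species at even levels can only be matched to the 44 species at odd levels, any web built this way has a structural nullity of at least 66 − 44 = 22, whatever the number of links.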

S10.2 Simulations of omnivore addition within the free-living web
We apply two types of perturbations to R^sim_free without changing the number of species or links:
• Some of the links described by R^sim_free are reassigned to become omnivorous links (Fig. N.d). Even for a small fraction of omnivorous links, the variation of chain lengths widens and at 65 links already becomes similar to that of the aggregated empirical data (compare Fig. N.a). Addition of omnivorous links is further associated with a rapid reduction of the nullity (Fig. N.f, dashed line).
• We consider specific omnivory, namely only that of level 3 species (which initially feed only on level 2) being able to feed on other level 3 species (3-3 interaction in addition to the existing 3-2 interaction). This type of interaction appears to be quite abundant in the empirical data (Fig. U).
In this case, the distribution of chain length variation changes more slowly: even at 125 omnivorous links the empirical distribution has not yet been attained. Considering again the change of nullity as a function of added links, there is no change. This is clear, as the addition of 3-3 interactions does not help in relaxing the consumer-dominated state of the system towards a more balanced configuration. In terms of the species pairing, 3-3 interactions would allow for pairing of level-3 species with level-3 species. However, all existing level-3 species were previously already paired with either level 2 or 4. Conversely, more 4-2 or 2-2 interactions would be beneficial, as in that case e.g. the abundant level-2 species could be paired with other species at that level or with some of those at level 4.
Real food webs do have high density of 3-3 interactions but some also show substantial 4-2 interaction.
Actual food webs therefore likely lie in between the two extremes (solid and dashed lines in Fig. Nf).
However, nearly no 2-2 interactions are present in empirical webs, which makes pairing between such species rare.
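The contrast between 3-3 and 4-2 omnivorous links can be made concrete on a small example. The sketch below uses a hypothetical eight-species web with node richness {1, 3, 2, 2} and all neighboring-level links present, and reassigns a single 4-3 link either to a 3-3 link or to a 4-2 link. As an assumption, the nullity is computed as the structural nullity from a maximum row-column matching, with symmetric nonzero entries per link and a diagonal entry for the basal species. The web and all names are illustrative only.

```python
def structural_rank(n, entries):
    """Size of a maximum matching between rows and columns over nonzero entries."""
    cols_of = [[] for _ in range(n)]
    for i, j in entries:
        cols_of[i].append(j)
    row_of_col = [-1] * n

    def augment(i, seen):
        for j in cols_of[i]:
            if j not in seen:
                seen.add(j)
                if row_of_col[j] == -1 or augment(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def nullity(n, feeding_links, basal):
    """Structural nullity for (consumer, resource) links among n species."""
    entries = {(b, b) for b in basal}  # assumed nutrient term for basal species
    for c, r in feeding_links:
        entries.update({(c, r), (r, c)})
    return n - structural_rank(n, entries)


def reassign(links, old, new):
    """Replace one link by another, keeping the total number of links fixed."""
    return [new if link == old else link for link in links]


# Level 1: species 0; level 2: 1, 2, 3; level 3: 4, 5; level 4: 6, 7.
base = ([(c, 0) for c in (1, 2, 3)]
        + [(c, r) for c in (4, 5) for r in (1, 2, 3)]
        + [(c, r) for c in (6, 7) for r in (4, 5)])

links_33 = reassign(base, (6, 4), (4, 5))  # the 4-3 link becomes a 3-3 link
links_42 = reassign(base, (6, 4), (6, 3))  # the same link becomes a 4-2 link

print(nullity(8, base, basal={0}))      # 2
print(nullity(8, links_33, basal={0}))  # 2
print(nullity(8, links_42, basal={0}))  # 0
```

In this toy case the even levels hold five species and the odd levels three, so two species are unpaired. The 3-3 link leaves that excess untouched, while the 4-2 link lets two even-level species pair with each other and removes it.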

S10.3 Simulations of parasite addition
Figure N (caption). Normalized histograms of the standard deviation of food chain length for free-living species food webs. a, Aggregated empirical data for all seven empirical food web data sets. b, Niche model results for corresponding values of S and L for the seven food webs. c, Similar to (b) but for cascade model results. d, Simulation data for the idealized free-living food web containing the average numbers of species at approximate trophic levels 1, 2, 3, and 4. The corresponding node richness equals {9, 42, 35, 24}, respectively, leading to a theoretical lack of niches of 22. For L_omni = 0 each species in the constructed food web only consumes species belonging to the trophic level below, i.e. all species have zero food chain length variation. Panels show results of simulations using varying numbers of omnivorous links (L_omni, as labeled) to perturb a system of species with well-defined trophic levels. Omnivorous links produce consumer-resource pairs where the difference in trophic levels between the respective partners is larger than 1. e, Similar to (d) but for omnivorous links only allowed between species formerly located at trophic level three and constraining the omnivorous links to include interaction with another species at level three. f, Dependence of rank difference ("lack of niches") on L_omni for the two cases shown in d and e as dashed and solid curves, respectively. Percentages on the top horizontal axis specify the fraction L_omni/L, i.e. the fraction of omnivorous links.

We now perform a similar analysis for webs including parasites (Fig. O). Starting again from the idealized free-living food web R^sim_free (Fig. O.a), we perform four distinct simulations:
1. Addition of the empirical number of parasite species that interact with any existing levels at random. To ensure that each parasite will be connected, each parasite is initially granted one link to a random existing species. All subsequent links are completely random (Fig. O.b).
2. Addition of parasites that are specialized at consuming species of a given trophic level. Each parasite is first assigned a random host, which defines the trophic level of hosts for that parasite. Subsequent links are then added such that the parasite only consumes hosts with chain lengths identical to that of the first. Link additions are otherwise random (Fig. O.c).
3. Initially, each parasite is given one link to an existing free-living species at level 3 or 4. Subsequent links are assigned randomly between any parasite and any level 3 or 4 species (Fig. O.d).
4. Similar to simulation 3 (Fig. O.d), but parasites consume hosts at levels 3 and 4 and also previously added parasites at random (Fig. O.e). Approximately 5 percent of parasite links are parasite-parasite links.
Simulations show qualitatively different results for the four cases: (1) Random addition and consumption behavior of parasites leads to a rapid reduction of the nullity; adding fewer than 5 links per parasite generally suffices to remove the nullity, i.e. to allow full rank.
(2) Completely level-specific addition of parasites leads to a slight initial reduction of rank. This is due to a compensating effect of adding parasites to random hosts: dominant trophic levels will acquire more parasites. Those parasites are then counted towards the node richness at the neighboring (higher) trophic level. Apart from this [...] species. This case in fact again yields a completely structured food web, where simply a level 5 with n_5 = 47 species is attached to the existing free-living web. The assembly rules then predict that only n_4 = 24 parasites can be paired, leaving n_5 − n_4 = 23 parasite species unpaired. This sets the nullity of the resulting matrix (Fig. O.a,(i)). This also implies that adding parasites can in fact increase the nullity if hosts are dominantly located at a single trophic level.
(4) When parasites can also consume other parasites (in the empirical data typically more than five percent of parasite links involve parasite-parasite interaction), the nullity can again be removed, and approximately 10 links per parasite are sufficient for the moderate concentration of parasite-parasite links used here.
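The count n_5 − n_4 = 23 for a parasite level attached on top of the free-living web can be checked with a short Python sketch. It rests on simplifying assumptions that go beyond the text: all 47 parasites feed only on level-4 hosts, every possible link between neighboring levels is present, each link contributes a symmetric pair of nonzero entries, basal species carry a diagonal entry for their nutrient source, and the nullity is the structural nullity from a maximum row-column matching. The function names are hypothetical.

```python
def structural_rank(n, entries):
    """Size of a maximum matching between rows and columns over nonzero entries."""
    cols_of = [[] for _ in range(n)]
    for i, j in entries:
        cols_of[i].append(j)
    row_of_col = [-1] * n

    def augment(i, seen):
        for j in cols_of[i]:
            if j not in seen:
                seen.add(j)
                if row_of_col[j] == -1 or augment(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n))


def layered_nullity(levels):
    """Structural nullity of a strictly layered web with complete neighboring blocks."""
    starts = [sum(levels[:k]) for k in range(len(levels) + 1)]
    n = starts[-1]
    entries = {(b, b) for b in range(levels[0])}  # nutrient source per basal species
    for k in range(1, len(levels)):
        for c in range(starts[k], starts[k + 1]):       # consumers
            for r in range(starts[k - 1], starts[k]):   # resources one level below
                entries.update({(c, r), (r, c)})
    return n - structural_rank(n, entries)


print(layered_nullity([9, 42, 35, 24]))      # 22
print(layered_nullity([9, 42, 35, 24, 47]))  # 23
```

Under these assumptions the free-living web has a nullity of 22, and attaching the parasite level changes it to 23: the 47 parasites compete for the 24 level-4 hosts, so 23 of them remain unpaired.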
Within the limitations of the idealization made, reality is best described by the curve corresponding to Fig. O.e, when the order of interaction is also random, i.e. in some cases parasites consume free-living species, but in others the order is reversed. This best mimics the overall features seen in the data (Fig. U), also regarding the shuffling of average chain lengths (indicated at the matrix boundaries in color shades). Note that in real food webs there are also interactions between parasites and level-1 and level-2 species (Figs. U and J), which could be included in the model as well.

Specific food webs (details).
We again compare with several empirical cases:
• The Bahia Falsa web (Fig. U.a) might be reasonably well described by Fig. O.e, with substantial interaction between parasites and level-3 and level-4 free-living species but also some interaction within the parasite-parasite sub-web. This may explain the substantial reduction of nullity for that web (d_f = 16 → d_{f+p} = 3, compare Fig. M.c,d).
• The Ythan Estuary food web (Fig. U.g) does not include any parasite-parasite interaction and shows relatively strict hierarchical organization, even for interactions involving parasites. Hence, that particular web might be best described by Fig. O.d, leading to a sizable nullity, even with parasites (d_f = 17 → d_{f+p} = 22). We have tested whether adding several random parasite-parasite interactions to that food web could change the nullity. We find that when adding on average one parasite-parasite link to each existing parasite (the approximate value of the other food webs), the nullity decreases to values as low as 7 or 8.
• Ythan Estuary further has a comparatively low number of links per parasite (L_p/S_p = 4.2 compared to an overall average of more than 15), which makes it even more difficult to achieve full rank.
• Another food web, Flensburg Fjord, maintains relatively high nullity (d_{f+p} = 10). There, adding further random parasite-parasite interactions can even reduce the nullity to zero.

S10.4 Simulations of the effect of concomitant predation
We also perform simulations to describe the effect of concomitant predation. There are two qualitatively different effects present in the empirical data (Fig. Nd):
1. Symmetric concomitant links lead to an overall improvement of rank, i.e. a reduction in rank deficiency.
2. Asymmetric concomitant links (only affecting the parasite) lead to an improvement in rank in only a few cases; most food webs do not show any change in rank.
To start the discussion, it is important to note several empirical observations: Parasites predominantly interact with species at more than one trophic level, e.g. as implied by the variation in their chain lengths (Fig. AA). A substantial fraction of free-living species do not have parasite interactions (table in Fig. J), e.g. approximately 50 percent for the Ythan Estuary web. In many webs, parasites also interact with other parasites (Fig. Mc,d), but there are exceptions (Ythan Estuary).

Parasite-free-living links only.
To allow for a simple model, we again start from the idealized food web structure (Fig. M.b) and the matrix R^sim_free. We then include a typical number of parasites (set to 47) and allow these to prey on a fraction (we choose 50 percent for simplicity) of species at levels 3 and 4, with otherwise random links (Fig. P.a). In practice, parasites do prey on all trophic levels (albeit with some bias towards the higher levels). Our theory can easily be extended to more levels.

[...] previously lacking connections to parasites, participate in such loops of length three (this requires sufficient numbers of links, but as we will see in the simulations, this is usually possible). The loops engage n_4/2 triples in directed pairings. The remaining n_4/2 free-living species are open to bi-directional interactions with parasites and can hence also be paired. We arrive at the situation shown in Fig. Pc, where the types of pairings are indicated. Notably, the nullity has now been reduced to d_{con,asym} = 33.

Symmetric concomitant links.
Consider now symmetric concomitant links (Fig. Pd). In this case, level-4 free-living species are likely to acquire bi-directional links to parasites and can all be paired.
Level-3 species, however, do not acquire additional links to parasites, so only half of these could be paired. In the example at hand, there is already an overweight of even-level species, so level-3 species can equally well be paired with level-2 species. In any case, the nullity can now be decreased to d_{con,sym} = 21.
We have simulated all the above cases using the idealized matrix R^sim_free (Fig. R.a-c). In particular, we have also checked whether more loops of length three are indeed produced when adding concomitant links. [...] concomitant links, as well as the reasonable quantitative agreement. The plots differ in that the actual data do have some triangles for low-level species, which is due to the fact that parasites do interact with some species at those levels as well, and that omnivory does exist for free-living species.
Additional parasite-parasite links.
When a small fraction of links between parasites is included, the system acquires the ability to pair many parasites with other parasites, which can sometimes allow all parasites to be paired with others. However, a subset of parasites might be needed to pair free-living species: in the example, a number of level-4 species should be paired to reduce the overweight of the even levels. As only n_4/2 of these have parasite links, only these can be paired. The nullity can then be reduced to d_{f+p} = 10. Adding asymmetric concomitant links will not help, since those always involve both a level-3 and a level-4 species. In this case, there is no benefit in such 3-species pairs, since they do not reduce the overweight of the even levels. The situation is different for symmetric concomitant links, where additional pairings between level-4 species and parasites become possible, thereby reducing the overweight of the even levels. In simulations where approximately 15 percent of parasite links were parasite-parasite links, an overall reduction of the nullity is achieved (Fig. Rd), even without inclusion of concomitant links. However, when asymmetric concomitant links are included (Fig. Re), the nullity is not decreased further (d_{f+p} = d_{con,asym} = 10). When including symmetric concomitant links, full rank is achieved. This type of simulation (where some parasite-parasite links are present) might be a reasonable abstraction of what is seen in the webs in Fig. U.a,b,c,d,f, where the nullity changes with symmetric, but not with asymmetric, concomitant links.