Conceived and designed the experiments: EBB APD TB SA MP. Performed the experiments: EBB. Analyzed the data: EBB APD TMA. Contributed reagents/materials/analysis tools: EBB APD TB SA. Wrote the paper: EBB APD TB SA MP. Developed the software used in analysis: EBB. Compiled food-web data from published literature: APD. Contributed plant distribution data: TMA. Developed an interactive network visualization tool: TB.
The authors have declared that no competing interests exist.
Food webs, networks of feeding relationships in an ecosystem, provide fundamental insights into mechanisms that determine ecosystem stability and persistence. A standard approach in food-web analysis, and network analysis in general, has been to identify compartments, or modules, defined by many links within compartments and few links between them. This approach can identify large habitat boundaries in the network but may fail to identify other important structures. Empirical analyses of food webs have been further limited by low-resolution data for primary producers. In this paper, we present a Bayesian computational method for identifying group structure using a flexible definition that can describe both functional trophic roles and standard compartments. We apply this method to a newly compiled plant-mammal food web from the Serengeti ecosystem that includes high taxonomic resolution at the plant level, allowing a simultaneous examination of the signature of both habitat and trophic roles in network structure. We find that groups at the plant level reflect habitat structure, coupled at higher trophic levels by groups of herbivores, which are in turn coupled by carnivore groups. Thus the group structure of the Serengeti web represents a mixture of trophic guild structure and spatial pattern, in contrast to the standard compartments typically identified. The network topology supports recent ideas on spatial coupling and energy channels in ecosystems that have been proposed as important for persistence. Furthermore, our Bayesian approach provides a powerful, flexible framework for the study of network structure, and we believe it will prove instrumental in a variety of biological contexts.
The relationships among organisms in an ecosystem can be described by a food web, a network representing who eats whom. Food web organization has important consequences for how populations change over time, how the extinction of one species can cause others, and how robustly ecosystems respond to disturbances. We present a computational method to analyze how species are organized into groups based on their interactions. We apply this method to the plant and mammal food web from the Serengeti savanna ecosystem in Tanzania, a pristine ecosystem increasingly threatened by human impacts. This web is unusually detailed, with plants identified down to individual species and corresponding habitats. Our analysis, which differs from the compartmental studies typically done in food webs, reveals that functionally distinct groups of carnivores, herbivores, and plants make up the Serengeti web, and that plant groups reflect distinct habitat types. Furthermore, since herbivore groups feed across multiple plant groups, and carnivore groups feed across multiple herbivore groups, energy represents a wider range of habitats as it flows up the web. This pattern may partly explain how the ecosystem remains in balance. Additionally, our method can be easily applied to other kinds of networks and modified to find other patterns.
Food webs, networks of feeding relationships in ecosystems, connect the biotic interactions among organisms with energy flows, thus linking together population dynamics, ecosystem function, and network topology. Ecologists have been using this powerful conceptual tool for more than a century
Although compartmental structure may be significant at one scale of analysis, compartments alone do not account for much of the topological structure in food webs. Recent work with a probabilistic model considers a more flexible notion of groups, allowing link density to be high or low within any group or between any pair of groups
Two major challenges limit the application of this model in resolving the group structure of food webs and interpreting its biological basis. First, most food webs have poor resolution of primary producers; plants in terrestrial systems and phytoplankton in aquatic ones are typically represented by a few nodes that are highly aggregated taxonomically. Some are aggregated at multiple trophic levels, e.g., the Coachella Valley web
Second, some technical problems have hindered the use of probabilistic models in analyzing group structure. Early food web models served as null models for food web structure and were tested by generating model webs and comparing summary statistics against data from real webs
The Bayesian approach is gaining popularity in ecological modeling due to the philosophical and conceptual appeal of explicitly considering uncertainty in parameter estimation as well as its methodological flexibility
In this paper, we address the group structure of a newly assembled food web for the large mammals and plants of the Serengeti grassland ecosystem of Tanzania by applying a computational approach to the identification of groups based on Bayesian inference. We specifically ask whether the structure that emerges reflects the underlying spatial dimension, as delineated by the different plant communities that characterize different subhabitats within the ecosystem, or whether it is determined by trophic dimensions in the form of species guilds that share functional roles.
The Serengeti has been studied as an integrated ecosystem for almost five decades
We compiled the Serengeti food web from published accounts of feeding links in the literature
The compiled food web (
We compared marginal likelihood estimates of different model variants to determine which one best describes the Serengeti food web (see
Group model           Partition prior    Link prior  Log MLE     Log Bayes factor  Bayes factor
One group             n/a                Uniform     −2828.60    −1556.82
161 groups            n/a                Beta        −2828.60    −1556.82
161 groups            n/a                Uniform     −17967.07   −16695.28
Compartmental groups  Dirichlet process  Beta        −1978.76    −706.97
Flexible groups       Uniform            Uniform     −1710.83    −439.04
Flexible groups       Uniform            Beta        −1404.32    −132.53
Flexible groups       Dirichlet process  Uniform     −1455.32    −183.54
Flexible groups       Dirichlet process  Beta        −1271.78    0                 1
Additionally, the use of flexible priors vastly improves the fit of the basic model, for both link probability parameters and network partitions. The model variant with beta prior for link probabilities and Dirichlet process prior for partitions performed best. Next, in order, were (1) the model with beta link probability prior and uniform partition prior, (2) the model with uniform link probability prior and Dirichlet process partition prior, and (3) the model with both uniform priors. The strongest variant surpassed its closest competitor by 133 units of (natural) log-likelihood, corresponding to a posterior odds ratio of
We used samples from the posterior distribution to summarize model hyperparameters controlling link probabilities and partitions. The posterior mean number of groups
Mean values for beta distribution parameters are
The posterior output includes 30,000 partitions of the Serengeti food web into groups, nearly all distinct from each other. One partition appears 6 times; two partitions appear 3 times; 14 partitions appear 2 times, and the rest appear only once. For the sake of interpretation, we formed a consensus partition (
Species are identically ordered top to bottom and left to right according to the consensus partition as listed in
The groups identified in the Serengeti food web in the consensus partition contain trophically similar species, with all groups restricted to a single trophic level (plants, herbivores, or carnivores). The consensus partition, with 14 groups, is shown in
The network is shown organized and colored by group according to the consensus partition listed in
Nodes in the network are aggregated and colored by group according to the consensus partition listed in
Group 1  Acinonyx jubatus, Crocuta crocuta, Lycaon pictus, Panthera leo, Panthera pardus 
Group 2  Canis aureus, Canis mesomelas, Caracal caracal, Leptailurus serval 
Group 3  Damaliscus korrigum, Hippopotamus amphibius, Kobus ellipsiprymnus, Ourebia ourebi, Pedetes capensis, Phacochoerus africanus, Redunca redunca, Rhabdomys pumilio, Taurotragus oryx, Tragelaphus scriptus 
Group 4  Aepyceros melampus, Alcelaphus buselaphus, Connochaetes taurinus, Equus quagga, Nanger granti, Eudorcas thomsonii 
Group 5  Heterohyrax brucei, Procavia capensis 
Group 6  Giraffa camelopardalis, Loxodonta africana, Madoqua kirkii, Papio anubis, Syncerus caffer 
Group 7  Digitaria scalarum, Dinebra retroflexa, Hyparrhenia rufa 
Group 8  Chloris gayana, Combretum molle, Digitaria diagonalis, Duosperma kilimandscharica, Eragrostis cilianensis, Microchloa kunthii, Sporobolus festivus, Sporobolus fimbriatus, Sporobolus spicatus 
Group 9  Acacia tortilis, Andropogon greenwayi, Aristida spp., Balanites aegyptiaca, Bothriochloa insculpta, Brachiaria semiundulata, Croton macrostachyus, Cynodon dactylon, Digitaria macroblephara, Eragrostis tenuifolia, Eustachys paspaloides, Grewia bicolor, Harpachne schimperi, Heteropogon contortus, Hibiscus spp., Hyparrhenia filipendula, Indigofera hochstetteri, Panicum coloratum, Panicum maximum, Pennisetum mezianum, Sida spp., Solanum incanum, Sporobolus ioclados, Sporobolus pyramidalis, Themeda triandra 
Group 10  Pennisetum stramineum 
Group 11  Acacia seyal, Acacia xanthophloea, Andropogon schirensis, Chloris pycnothrix, Chloris roxburghiana, Crotalaria spinosa, Cymbopogon excavatus, Digitaria milanjiana, Digitaria ternata, Echinochloa haploclada, Eragrostis exasperata, Euphorbia candelabrum, Hyperthelia dissoluta, Kigelia africana, Lonchocarpus eriocalyx, Olea spp., Panicum deustum, Panicum repens, Phragmites mauritianus, Psilolemma jaegeri, Sarga versicolor, Setaria pallidefusca, Setaria sphacelata, Typha capensis 
Group 12  Acacia senegal 
Group 13  Abutilon spp., Acalypha fruticosa, Acacia robusta, Achyranthes aspera, Albizia harveyi, Albuca spp., Allophylus rubifolius, Aloe macrosiphon, Aloe secundiflora, Blepharis acanthodioides, Capparis tomentosa, Pennisetum ciliare, Cissus quadrangularis, Cissus rotundifolia, Commelina africana, Commiphora merkeri, Commiphora schimperi, Cordia ovalis, Croton dichogamus, Cyperus spp., Cyphostemma spp., Digitaria velutina, Diheteropogon amplectens, Emilia coccinea, Eragrostis aspera, Eriochloa nubica, Ficus glumosa, Ficus ingens, Ficus thonningii, Grewia fallax, Grewia trichocarpa, Heliotropium steudneri, Hibiscus lunariifolius, Hoslundia opposita, Hypoestes forskaolii, Iboza spp., Indigofera basiflora, Ipomoea obscura, Jasminum spp., Kalanchoe spp., Kedrostis foetidissima, Kyllinga nervosa, Lippia ukambensis, Maerua cafra, Ocimum spp., Pappea capensis, Pavetta assimilis, Pavonia patens, Pellaea calomelanos, Phyllanthus sepialis, Pupalia lappacea, Rhoicissus revoilii, Sclerocarya birrea, Senna didymobotrya, Sansevieria ehrenbergii, Sansevieria suffruticosa, Solanum dennekense, Solanum nigrum, Sporobolus pellucidus, Sporobolus stapfianus, Tricholaena teneriffae, Turraea fischeri, Ximenia caffra, Ziziphus spp. 
Group 14  Boscia augustifolia, Commiphora trothae 
Overall, plants of the same habitat type are significantly more clustered in groups than random according to weighted Shannon entropy. (Lower values of weighted entropy indicate higher levels of clustering; see
Furthermore, the four largest plant groups reflect significant overrepresentation of four different habitat types, and either significant underrepresentation or no significant signal for other habitat types. In group 13, kopje plants are significantly overrepresented, comprising 36.7% of the group, compared to a random expectation of 18.1% (
Plant groups are coupled by groups of herbivores, which are in turn coupled by groups of carnivores. Large migratory grazers (group 4, wildebeest, zebra, and gazelles) feed on plant groups that include the dominant grass species in the ecosystem (group 9), predominantly riparian species (group 11), and a mixture of woodland species (
In order to analyze the group structure of the Serengeti food web, we used a flexible Bayesian model of network structure that includes no biological information aside from a set of nodes representing species and links representing their interactions. The groups that emerge from an otherwise blind classification of species make remarkable biological sense, and moreover reveal detailed patterns between habitat structure and network topology that expert intuition alone cannot. Species are divided into trophic guilds that reveal a strong relationship between the habitat structure of plant, herbivore, and carnivore groups and the structure of the network. At the coarsest scale, the groups in the Serengeti food web correspond to carnivores, herbivores, and plants. The further subdivisions that emerge within the trophic levels reveal connections between habitat types and feeding structure. This deeper analysis is made possible by high resolution at the plant level along with information about the habitat occupancy of different plants. Since different habitat types occupy distinct spatial locations in the Serengeti, the group structure thus reflects in part the flow of energy up the food web from different spatial locations, with herbivores integrating spatially separated groups of plants, and carnivores integrating spatially widespread herbivores. A priori, it was not clear precisely what kind of group structure would emerge in the Serengeti web from the use of the group model. In general, the more complex the web, the more useful these methods will be in helping to disentangle the complexity.
The food web presented here included only plants and mammals, but we hypothesize that the general conclusions will be largely robust to the addition of more species. Although the addition of birds, reptiles, invertebrates, and pathogens will likely add a significant number of new groups, they should not significantly modify the derived structure for mammals, since the insect-bird links reflect an almost parallel food web. To the extent that insect herbivores further differentiate plants, plant groupings may be affected, but we expect that the larger tendency for groups to reflect habitat structure will remain.
Recently, interesting theoretical and empirical work has highlighted the relationship between observed patterns of food-web structure and energy flow that seemingly relates to the trophic guild structure in the Serengeti. Rooney and colleagues
These patterns emerge directly from the topology of the food web without being explicitly labeled as different habitats upfront as was done in previous empirical work
In this paper, we used a probabilistic model to analyze the structure of a single food web, an approach we have seen in only one other study based on a probabilistic version of the niche model
In fact, the Bayesian approach described here provides a powerful general framework for encoding hypotheses about the structure of food webs and comparing models against each other, and we see it as a natural next step in the current trend of representing food-web models in a common way. Simple abstract models such as the niche model and the group model used here act as proxies for the high-dimensional trait space that determines feeding relationships in an ecosystem. The identification of actual traits that correspond to groups (or niche dimensions) is another valuable direction, so far followed primarily by finding correlations between compartments/groups
The use of flexible, hierarchical priors for model parameters is another useful innovation of the Bayesian framework. The number of groups identified by the model increases dramatically with the use of a flexible beta prior distribution for link probability parameters. In that model variant, we effectively introduce two degrees of freedom to the model (the beta distribution parameters) but dramatically reduce the effective degrees of freedom of the link probability parameters. Note that we penalize parameters by using the marginal likelihood for model selection, so that the model selection represents a balance between goodness of fit and model complexity. Moreover, this structure makes intuitive sense: since most link probability parameters are simply zero, they should not be penalized. An alternate approach is to remove and add parameters to the model, but this hierarchical technique is much easier to implement in practice.
Advanced Markov-chain Monte Carlo methods make it possible to accurately estimate marginal likelihoods for probabilistic network models. Unlike information criteria such as AIC or BIC, an accurate estimate of the marginal likelihood provides a direct measurement of goodness of fit that takes into account the degrees of freedom in a model without making any asymptotic assumptions about parameter distributions
Additionally, the Bayesian approach serves as a means to avoid fundamental issues inherent in network models with a large parameter space. One recent study has shown that, even in relatively small networks, a large number of good solutions exist for the standard modularity criterion
The group model, based on the simple notion that groups of species may have similar feeding relationships to other groups, reveals that trophic guilds are the topologically dominant type of group in the Serengeti food web. The model also reveals an interesting relationship between habitat structure and network structure that corroborates recent ideas on spatial coupling in food webs. A theoretical study with a dynamical model suggests that this type of structure may contribute to ‘stability’ in the sense of the persistence of species
In this work, we use Bayesian probabilistic models to analyze food webs; for a general introduction to the Bayesian modeling approach and details on the specific models used here, please see supporting
The group model (supporting
In general, priors may incorporate informed knowledge about the system, but in this case we simply use them to encode different variants of the same basic model. We use two distributions for partitions and two distributions for link probabilities, which are combined to form four different model variants. We also consider several null models for comparison.
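The likelihood shared by these variants can be illustrated with a minimal sketch, assuming the usual block-model form in which every ordered pair of groups has its own link probability and all links are independent. The function name, the toy web, and the choice to count self-links are illustrative assumptions and are not taken from the software used in this study.

```python
import math

def group_log_likelihood(adj, groups, link_prob):
    """Log-likelihood of a directed web under a group (block) model.

    adj[i][j] is 1 if species i feeds on species j, else 0.
    groups[i] is the group index of species i.
    link_prob[g][h] is the probability of a link from group g to group h.
    Assumption: all ordered pairs, including i == j, are counted.
    """
    total = 0.0
    n = len(adj)
    for i in range(n):
        for j in range(n):
            p = link_prob[groups[i]][groups[j]]
            # Probability of the observed outcome (link present or absent).
            outcome_prob = p if adj[i][j] else 1.0 - p
            if outcome_prob == 0.0:
                return -math.inf
            total += math.log(outcome_prob)
    return total

# Hypothetical toy web: one carnivore (0), two herbivores (1, 2), one plant (3).
toy_adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

# Three trophic groups, with link probabilities that match the data exactly.
three_groups = [0, 1, 1, 2]
three_probs = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
ll_three = group_log_likelihood(toy_adj, three_groups, three_probs)

# One group with a single link probability (4 links out of 16 ordered pairs).
one_group = [0, 0, 0, 0]
ll_one = group_log_likelihood(toy_adj, one_group, [[0.25]])

print(ll_three, ll_one)
```

In this toy case the three-group partition explains the links far better than the single-group random graph, which is the kind of contrast the null models are meant to expose. The sketch fixes the link probabilities by hand; in the Bayesian setting they are instead given a prior and integrated over.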
For partitions, we employ two prior distributions: (1) a uniform distribution and (2) a distribution generated by the Dirichlet process, sometimes referred to as the “Chinese restaurant process”
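As an illustration of the second prior, the following minimal sketch draws a partition from a Chinese restaurant process. The concentration parameter `alpha`, its value, and the function name are assumptions made for the example; how the concentration parameter is treated in the actual analysis is not shown here.

```python
import random

def sample_crp_partition(n_items, alpha, rng):
    """Draw one partition of n_items from a Chinese restaurant process.

    Each item joins an existing group with probability proportional to the
    group's current size, or starts a new group with probability
    proportional to alpha (assumed > 0).
    """
    assignments = []
    group_sizes = []
    for _ in range(n_items):
        weights = group_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(group_sizes):
            group_sizes.append(1)   # open a new group
        else:
            group_sizes[k] += 1     # join an existing group
        assignments.append(k)
    return assignments

rng = random.Random(1)
# 161 nodes, matching the size of the web analyzed here; alpha is hypothetical.
partition = sample_crp_partition(161, alpha=2.0, rng=rng)
print(len(set(partition)), "groups")
```

Small values of `alpha` favor a few large groups, whereas large values favor many small ones, so the prior can express a preference over the number of groups that a uniform distribution over partitions cannot.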
The two alternative prior distributions used for link probabilities
We also consider two simple models without groups as null comparisons: (1) a directed random graph model (i.e., one group) with a uniform prior on a single link probability parameter
Finally, in order to explicitly restrict the model to detecting compartmental structure, we also consider a modification that requires all between-group link probabilities
We sample from the posterior distribution of model parameters using a Markov-chain Monte Carlo technique known as Metropolis-coupled MCMC, or
In order to select a good model variant, we employ the marginal likelihood, the probability of data given a model integrated over all model parameters (partitions and link probability parameters). This approach extends the use of Bayes' rule to model selection as well as inference of parameter values. The ratio of the marginal likelihoods for two models is often called the Bayes factor
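To make the comparison concrete, the sketch below computes log Bayes factors from the log marginal likelihood estimates reported for the four flexible-group variants. The dictionary keys and the function name are illustrative labels. Because the inputs are the rounded table values, results may differ from the tabulated log Bayes factors in the last digit.

```python
import math

# Log marginal likelihood estimates (natural log) for the flexible-group variants.
log_mle = {
    "DP partition, beta link": -1271.78,
    "uniform partition, beta link": -1404.32,
    "DP partition, uniform link": -1455.32,
    "uniform partition, uniform link": -1710.83,
}

def log_bayes_factor(model, reference):
    """Natural-log Bayes factor of `model` relative to `reference`."""
    return log_mle[model] - log_mle[reference]

best = max(log_mle, key=log_mle.get)
for name in log_mle:
    ln_bf = log_bayes_factor(name, best)
    # Working in logs avoids underflow: exp(-132) is far below ordinary scales.
    print(f"{name}: ln BF = {ln_bf:.2f}, log10 BF = {ln_bf / math.log(10):.1f}")
```

With equal prior probabilities on the models, the Bayes factor equals the posterior odds ratio, so a difference of roughly 133 natural-log units corresponds to overwhelming support for the best variant.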
The output of an MCMC simulation includes a large number of network partitions representing draws from the posterior distribution. As these partitions are potentially all distinct from each other, but represent similar tendencies of species to be grouped together, it is useful to try to summarize the information contained in all the samples in a more compact form. To do this, we construct an affinity matrix with entries equal to the posterior probability that two species are grouped together. We then use the affinity matrix to form a consensus partition, using an average-linkage clustering algorithm (see supporting
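The two steps can be sketched as follows. This is a simplified illustration in plain Python, not the procedure documented in the supporting text: the stopping rule (a fixed affinity `threshold`), the function names, and the toy samples are all assumptions.

```python
def affinity_matrix(partitions):
    """Fraction of sampled partitions in which each pair of species shares a group."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for part in partitions:
        for i in range(n):
            for j in range(n):
                if part[i] == part[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]

def average_linkage(affinity, threshold):
    """Greedily merge the two clusters with the highest mean pairwise affinity,
    stopping once that mean drops below `threshold` (an assumed stopping rule)."""
    clusters = [[i] for i in range(len(affinity))]
    while len(clusters) > 1:
        best_pair, best_score = None, -1.0
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                score = sum(affinity[i][j]
                            for i in clusters[a] for j in clusters[b]) / pairs
                if score > best_score:
                    best_pair, best_score = (a, b), score
        if best_score < threshold:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical posterior samples: group labels for five species.
samples = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 2],
]
affinity = affinity_matrix(samples)
consensus = average_linkage(affinity, threshold=0.5)
print(consensus)
```

Note that group labels are arbitrary within each sample; the affinity matrix depends only on whether two species share a label, which is why it can be averaged across samples when the partitions themselves cannot.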
In order to test the overall presence of habitat signature in plant groups, we assigned plants to habitat types via one of three methods based on data availability. For plants present in 133 plots sampled from around the Serengeti
We used a randomization test to measure the overall clustering of habitat in groups across sampled partitions. The habitat signature of an individual group
To test clustering significance of a specific habitat type in a specific grouping of species, we calculated the p-value as the probability that a randomized group of the same size would have as many or more species assigned to the chosen habitat type.
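If the randomized group is drawn without replacement from the full set of plants, this probability is the upper tail of a hypergeometric distribution. The sketch below computes it exactly; whether the original analysis used an exact calculation or repeated random draws is not stated here, so this should be read as one possible implementation. The function name and the example counts are hypothetical.

```python
from math import comb

def enrichment_p_value(total, total_in_habitat, group_size, group_in_habitat):
    """P(a random group of `group_size` species, drawn without replacement from
    `total` species, contains at least `group_in_habitat` species of a habitat
    type that has `total_in_habitat` members overall)."""
    upper = min(group_size, total_in_habitat)
    favourable = sum(
        comb(total_in_habitat, k) * comb(total - total_in_habitat, group_size - k)
        for k in range(group_in_habitat, upper + 1)
    )
    return favourable / comb(total, group_size)

# Hypothetical counts: 10 plants, 4 of them in the habitat; a group of 5
# contains all 4 of them.
p = enrichment_p_value(total=10, total_in_habitat=4, group_size=5, group_in_habitat=4)
print(p)
```

The same function gives an underrepresentation test if the tail is taken in the other direction, that is, by summing from zero up to the observed count.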
A.D. acknowledges the Frankfurt Zoological Society for logistical support in the Serengeti for his work on food webs, and the members of the Serengeti Biocomplexity Project for many interesting discussions about the Serengeti food web. M.P. is an investigator of the Howard Hughes Medical Institute.