Generalism drives abundance: A computational causal discovery approach

doi:10.1371/journal.pcbi.1010302

Fig 1.

A chicken-and-egg dilemma of generalism and abundance.

Empirical evidence shows that abundant species also tend to be generalists. However, the causal direction is debated. If the community is mainly structured by selection processes, then generalists become more abundant because they have a competitive advantage. In contrast, if the community is mainly structured by drift processes, then abundant species become more generalized because abundant populations have a higher chance of encountering more partners. The clip art of flowers and hummingbirds was made with DALL·E.

Fig 2.

Illustration of the causal discovery methods.

Panel (A) shows a hypothetical dataset where X is the cause and Y is the effect. We generate this dataset as an ensemble y ∼ x^d with white noise, where d ranges from 0.5 to 1. The color of the points represents the density of data: the darker the color, the fewer points it represents. Panels (B)-(D) illustrate the three causal discovery methods we adopted. Panel (B) illustrates the method based on formal logic. The criterion for X being the cause is that the proportion of points with small x and large y values is greater than that of points with large x and small y values. The marginal distributions are both normalized to keep roughly equal proportions of small and large samples, which helps avoid the bias of the method toward skewed marginal distributions. Panel (C) illustrates the method based on the nonlinear additive noise model. This method assumes that the effect (i.e., Y) is some potentially nonlinear transformation of the cause (i.e., f(X)) plus some noise that is independent of the cause (i.e., ϵ(Y)). The criterion for X being the cause is that the residuals from regressing Y on X are independent of X, whereas the residuals from regressing X on Y are not independent of Y. The form of the nonlinear transformation f(X), and consequently the residuals, can be fitted through nonparametric regressions. Panel (D) illustrates the method based on information theory. The criterion for X being the cause is that X has a higher entropy than Y. The marginal distributions are both linearly rescaled to lie between 0 and 1 for a fair comparison of entropy.
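
The criteria of panels (B) and (D) can be illustrated on a synthetic dataset like the one in panel (A). The following is a minimal sketch under stated assumptions, not the implementation used in the study: the function names, the sample size, the noise level, the 0.5 threshold on min-max-scaled data that separates "small" from "large" values, and the 20-bin histogram entropy estimator are all choices made here for illustration. The marginal normalization of panel (B) and the additive noise model of panel (C) are not reproduced.

```python
import math
import random


def min_max_scale(values):
    """Linearly rescale values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def formal_logic_score(x, y, threshold=0.5):
    """Proportion of (small x, large y) points minus that of (large x, small y) points.

    A positive score is read as evidence for X -> Y. The threshold on
    min-max-scaled data is an assumption of this sketch.
    """
    u, v = min_max_scale(x), min_max_scale(y)
    n = len(u)
    small_x_large_y = sum(a < threshold <= b for a, b in zip(u, v)) / n
    large_x_small_y = sum(b < threshold <= a for a, b in zip(u, v)) / n
    return small_x_large_y - large_x_small_y


def histogram_entropy(values, bins=20):
    """Plug-in Shannon entropy (in nats) of a histogram of min-max-scaled values."""
    scaled = min_max_scale(values)
    counts = [0] * bins
    for s in scaled:
        # The maximum value (s == 1.0) is placed in the last bin.
        counts[min(int(s * bins), bins - 1)] += 1
    n = len(scaled)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)


def simulate(n=2000, noise_sd=0.05, seed=0):
    """Draw x uniformly on [0, 1) and set y = x^d + white noise, with d in [0.5, 1]."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        xi = rng.random()
        d = rng.uniform(0.5, 1.0)
        x.append(xi)
        y.append(xi ** d + rng.gauss(0.0, noise_sd))
    return x, y


x, y = simulate()
score = formal_logic_score(x, y)
entropy_gap = histogram_entropy(x) - histogram_entropy(y)
print(f"formal-logic score (positive suggests X -> Y): {score:.3f}")
print(f"entropy gap H(X) - H(Y) (positive suggests X -> Y): {entropy_gap:.3f}")
```

Swapping the arguments of `formal_logic_score` flips the sign of the score, so the criterion is antisymmetric in the two candidate directions.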

Fig 3.

Generalism drives abundance.

We apply the causal discovery method based on formal logic to detect the causal direction in an empirical dataset of 25 hummingbird-plant communities. In each panel, the x axis shows the two categories (rare and generalized species versus abundant and specialized species), and the y axis shows the mean proportion of species in that category. Each point denotes a different empirical community. Each line connects the two points from the same empirical community. If the line goes up (meaning that more species are abundant and specialized than rare and generalized), it indicates that generalism drives abundance, and vice versa. The original method (panels C and D) suggests that abundance drives generalism in most communities (this is expected because the marginal distribution of abundance is more skewed than that of generalism). In contrast, our refined method removes the bias regarding skewed marginal distributions, and it (panels A and B) suggests that generalism drives abundance in most communities. The qualitative patterns are similar among hummingbirds (panel A) and plants (panel B).

Fig 4.

Strength of selection process versus drift process under structural and environmental context.

Panels (A) and (C) show the effects of the structural context. The x axis shows the combined nestedness, which is an unbiased metric to compare the level of nestedness across networks [47, 50]. The y axis shows the difference between the proportion of abundant and specialist species and that of rare and generalist species. A positive difference indicates that generalism is the cause of abundance (i.e., stronger selection and weaker drift), while a negative difference indicates that abundance is the cause of generalism (i.e., weaker selection and stronger drift). Each point denotes a different empirical plant-hummingbird community (i.e., dataset). The orange points represent the hummingbird data while the blue points represent the plant data. The strength of the selection processes increases as the communities have more nested structures for both plants and hummingbirds. Plants exhibit stronger positive patterns than hummingbirds under the structural context. Panels (B) and (D) show the effects of the environmental context. The x axis shows the local mean annual temperature where the community was sampled. The error bar represents 1 standard deviation of the annual mean temperature from 1958 to 2020 [53]. The strength of the selection processes increases as the communities experience higher mean temperature for both plants and hummingbirds. Hummingbirds exhibit stronger positive patterns than plants under the environmental context. The Pearson correlations and their p values are shown in the figure.
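
The quantity on the y axis and its correlation with a covariate can be computed as sketched below. This is only an illustration under stated assumptions: the five communities and all their numbers are made-up toy values rather than data from the study, the function names are hypothetical, and the permutation test is one possible way to obtain a p value, not necessarily the test behind the p values reported in the figure.

```python
import math
import random


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p value for the Pearson correlation (assumed test choice)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(a, b))
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so that ties with the observed value are counted.
        if abs(pearson_r(a, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy values for five hypothetical communities (NOT data from the study).
covariate = [0.1, 0.3, 0.5, 0.7, 0.9]  # e.g., combined nestedness
abundant_specialist = [0.10, 0.14, 0.15, 0.20, 0.22]
rare_generalist = [0.12, 0.11, 0.10, 0.09, 0.08]

# Positive difference: generalism drives abundance (stronger selection).
# Negative difference: abundance drives generalism (stronger drift).
differences = [p - q for p, q in zip(abundant_specialist, rare_generalist)]

r = pearson_r(covariate, differences)
p = permutation_p_value(covariate, differences)
print(f"Pearson r = {r:.3f}, permutation p = {p:.3f}")
```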
