Generalism drives abundance: A computational causal discovery approach
Fig 2
Illustration of the causal discovery methods.
Panel (A) shows a hypothetical dataset where X is the cause and Y is the effect. We generate this dataset as an ensemble y ∼ x^d with white noise, where d ranges from 0.5 to 1. The color of the points represents the density of data: the darker the color, the fewer points it represents. Panels (B)-(D) illustrate the three causal discovery methods we adopted. Panel (B) illustrates the method based on formal logic. The criterion for X being the cause is that the proportion of points with small x and large y values is greater than the proportion of points with large x and small y values. Both marginal distributions are normalized to keep roughly equal proportions of small and large samples; this normalization helps avoid bias in the method arising from skewed marginal distributions. Panel (C) illustrates the method based on the nonlinear additive noise model. This method assumes that the effect (i.e., Y) is some potentially nonlinear transformation of the cause (i.e., f(X)) plus some noise that is independent of the cause (i.e., ϵ(Y)). The criterion for X being the cause is that the residuals from these regressions should be independent of X but not of Y. The form of the nonlinear transformation f(X), and consequently the residuals, can be fitted through nonparametric regression. Panel (D) illustrates the method based on information theory. The criterion for X being the cause is that X has a higher entropy than Y. Both marginal distributions are scaled via a linear transformation to the interval [0, 1] for a fair comparison of entropy.
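The following is a minimal sketch (not the authors' code) of the three decision rules described above, applied to a synthetic pair generated as y ∼ x^d plus white noise. The function names (logic_vote, residual_dependence, scaled_entropy) and the specific estimators used here (rank normalization, random-forest regression, mutual-information and histogram-entropy estimates) are illustrative assumptions, not the implementation used in the paper.

```python
# Illustrative sketch of the three causal-direction criteria from Fig 2.
import numpy as np
from scipy.stats import entropy
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
x = rng.uniform(0.5, 10.0, 3000)
d = rng.uniform(0.5, 1.0, 3000)          # ensemble of exponents in [0.5, 1]
y = x ** d + rng.normal(0.0, 0.2, 3000)  # y ~ x^d with white noise

def rank_normalize(v):
    """Rank-transform to (0, 1) so the 'small' and 'large' halves hold equal mass (panel B)."""
    return (np.argsort(np.argsort(v)) + 0.5) / v.size

def logic_vote(a, b):
    """Panel (B): fraction of (small a, large b) points minus fraction of (large a, small b)."""
    ra, rb = rank_normalize(a), rank_normalize(b)
    return ((ra < 0.5) & (rb >= 0.5)).mean() - ((ra >= 0.5) & (rb < 0.5)).mean()

def residual_dependence(a, b):
    """Panel (C): regress b on a nonparametrically, then estimate how dependent the
    residuals remain on a (lower dependence is more consistent with a being the cause)."""
    f = RandomForestRegressor(n_estimators=200, random_state=0).fit(a[:, None], b)
    resid = b - f.predict(a[:, None])
    return mutual_info_regression(a[:, None], resid, random_state=0)[0]

def scaled_entropy(v, bins=50):
    """Panel (D): histogram entropy after linearly scaling the marginal to [0, 1]."""
    v01 = (v - v.min()) / (v.max() - v.min())
    counts, _ = np.histogram(v01, bins=bins)
    return entropy(counts)

print("logic score (as stated, supports X -> Y if positive):", logic_vote(x, y))
print("residual dependence, X -> Y vs Y -> X:",
      residual_dependence(x, y), residual_dependence(y, x))
print("scaled entropy of X vs Y:", scaled_entropy(x), scaled_entropy(y))
```

Each function returns a score for one direction; comparing the scores for X -> Y and Y -> X gives the inferred direction under the corresponding criterion.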