Pairwise Measures of Causal Direction in the Epidemiology of Sleep Problems and Depression

Depressive mood is often preceded by sleep problems, suggesting that they increase the risk of depression. Sleep problems can, however, also reflect a prodromal symptom of depression, so temporal precedence alone is insufficient to confirm causality. The authors applied recently introduced statistical causal-discovery algorithms that can estimate causality from cross-sectional samples in order to infer the direction of causality between the two sets of symptoms from a novel perspective. Two general-population samples were used: one from the Young Finns study (690 men and 997 women, average age 37.7 years, range 30–45), and another from the Wisconsin Longitudinal study (3101 men and 3539 women, average age 53.1 years, range 52–55). These included three depression questionnaires (two in the Young Finns data) and two sleep-problem questionnaires. Three different causality estimates were constructed for each data set, tested in benchmark data with a (practically) known causality, and tested for assumption violations using simulated data. The causality algorithms performed well in the benchmark data and simulations, and a prediction was drawn for future empirical studies to confirm: for minor depression/dysphoria, sleep problems cause significantly more dysphoria than dysphoria causes sleep problems. The situation may change as depression becomes more severe, or as more severe levels of symptoms are evaluated; in addition, artefacts due to severe depression being less well represented than minor depression in the population data may interfere with the estimation for depression scales that emphasize severe symptoms. The findings are consistent with other emerging epidemiological and biological evidence.


T(x_d, x_s) = M(x_s, x_d) - M(x_d, x_s)
as a causality statistic whose positive values indicate that x_d causes x_s, whereas negative values indicate the opposite causality. Since we use the exact same kernel-based pairwise quantity M(·,·) that the DirectLiNGAM algorithm uses when deriving a causal ordering of variables [2], we call this statistic T the DirectLiNGAM statistic; it aims to use general dependency information in the variables. More restricted deviations from Gaussianity can also be used for causality estimation.
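The kernel-based M(·,·) itself is beyond the present scope, but the logic of the statistic can be sketched with a simpler stand-in: regress one variable on the other and score how dependent the residual remains on the regressor; the residual is independent of the regressor only in the true causal direction. The following Python sketch is not the paper's implementation — the squared-correlation dependence proxy and all variable names are illustrative — but it mimics the sign convention of T.

```python
import numpy as np

def residual_dependence(x, y):
    """Stand-in for a pairwise dependence measure M(x, y): regress y on x
    and score how dependent the residual still is on x.  A simple squared
    nonlinearity replaces the kernel-based measure used by DirectLiNGAM."""
    b = x @ y / (x @ x)                       # OLS slope (zero-mean inputs)
    r = y - b * x                             # residual of y given x
    return abs(np.corrcoef(x**2, r)[0, 1])    # near 0 if x is the cause

def t_statistic(x_d, x_s):
    """Analogue of T(x_d, x_s) = M(x_s, x_d) - M(x_d, x_s):
    positive values suggest x_d -> x_s, negative the reverse."""
    x_d = (x_d - x_d.mean()) / x_d.std()
    x_s = (x_s - x_s.mean()) / x_s.std()
    return residual_dependence(x_s, x_d) - residual_dependence(x_d, x_s)

rng = np.random.default_rng(0)
cause = rng.exponential(size=50_000) - 1.0            # non-Gaussian cause
effect = 0.8 * cause + rng.exponential(size=50_000) - 1.0
print(t_statistic(cause, effect) > 0)                 # expect True: cause -> effect
```

Non-Gaussianity of the cause and disturbance is essential here; with Gaussian data both regression directions leave an independent residual and the statistic carries no causal information.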
The other statistics that function like T with respect to positive and negative values are the skewness- and kurtosis-based statistics. These use only certain deviations from the Gaussian distribution, namely skewness and kurtosis. Let the variables x_d and x_s be standardized (mean zero, variance one) and multiplied by the sign of their skewness (resulting in positive skewness); the desired skewness-based statistic is then

T_skew(x_d, x_s) = ρ̂(x_d, x_s) (E[x_d^2 x_s] - E[x_d x_s^2]),

where E is the sample average or expectation, and ρ̂(·,·) is the correlation of the input variables. The theorem below establishes that under the LiNGAM assumptions, a positive value of T_skew(x_d, x_s) indicates that x_d is the cause, and a negative value indicates the antecedence of the second argument. A kurtosis-, or sparseness-based, statistic can also be derived, but it suffers from a lack of robustness and from sign-indeterminacy [3]. A hyperbolic tangent function (tanh) offers a more useful approximation [3,32].
The explicit rationale is beyond the present scope, but the ensuing statistic is

T_tanh(x_d, x_s) = ρ̂(x_d, x_s) E[x_d tanh(x_s) - tanh(x_d) x_s],

where the input variables must be standardized. We call this the tanh-based causality statistic.
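Both standardized-variable statistics amount to a few lines of code. The sketch below follows the pairwise measures of [3,32] — T_skew multiplies the correlation by a difference of third cross-moments, and T_tanh replaces the square with a tanh nonlinearity — with simulated data that are illustrative only (exponential variables for the skewed case, Laplace variables for the sparse case).

```python
import numpy as np

def standardize(x):
    """Mean zero, variance one; flip sign so that any skewness is positive."""
    z = (x - x.mean()) / x.std()
    s = np.mean(z**3)
    return z * np.sign(s) if s != 0 else z

def t_skew(x_d, x_s):
    """Skewness-based statistic; positive values suggest x_d -> x_s.
    Inputs must be standardized with positive skewness."""
    rho = np.corrcoef(x_d, x_s)[0, 1]
    return rho * (np.mean(x_d**2 * x_s) - np.mean(x_d * x_s**2))

def t_tanh(x_d, x_s):
    """Tanh-based statistic; positive values suggest x_d -> x_s.
    Inputs must be standardized."""
    rho = np.corrcoef(x_d, x_s)[0, 1]
    return rho * np.mean(x_d * np.tanh(x_s) - np.tanh(x_d) * x_s)

rng = np.random.default_rng(0)
n = 100_000
# skewed example for t_skew: exponential cause and disturbance
x = standardize(rng.exponential(size=n))
y = standardize(0.8 * x + standardize(rng.exponential(size=n)))
# sparse (heavy-tailed) example for t_tanh: Laplace cause and disturbance
u = standardize(rng.laplace(size=n))
v = standardize(0.8 * u + standardize(rng.laplace(size=n)))
print(t_skew(x, y) > 0, t_tanh(u, v) > 0)   # both should flag the true direction
```

Note that both statistics are exactly antisymmetric in their arguments, so swapping the inputs only flips the sign.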
T_skew and T_tanh apply only to standardized variables, but the DirectLiNGAM-based statistic T can be applied to standardized and non-standardized variables; when everything goes according to the assumptions, T should be invariant with respect to standardization [2]. Therefore we sometimes also provide results for both standardized and original variables in order to evaluate the sensitivity to scaling directly. For standardized random variables X_1, X_2, and X_3, the third cumulant, cum(X_1, X_2, X_3), is multilinear (i.e., linear in each argument). The skewness of a standardized variable X is skew(X) = cum(X, X, X) [3,32]. The following theorem is re-stated from previous work [3], and it proves that T_skew has the desired properties; that is, its sign implies the correct causality under the LiNGAM assumptions.
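The multilinearity claim is easy to check numerically: for zero-mean variables the third cross-cumulant reduces to E[X_1 X_2 X_3], so scaling one argument scales the cumulant by the same factor, and skew(X) = cum(X, X, X) is just E[X^3] for a standardized X. A minimal sketch with illustrative variables (sample moments stand in for expectations):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
# zero-mean variables: the third cross-cumulant equals E[X1 * X2 * X3]
x1 = rng.exponential(size=n) - 1.0
x2 = rng.exponential(size=n) - 1.0
x3 = x1 + x2                       # deliberately dependent on x1 and x2

def cum3(a, b, c):
    """Sample third cross-cumulant of (population) zero-mean variables."""
    return np.mean(a * b * c)

# multilinearity: scaling one argument scales the cumulant by the same factor
lhs = cum3(2.0 * x1, x2, x3)
rhs = 2.0 * cum3(x1, x2, x3)
print(np.isclose(lhs, rhs))        # holds exactly, up to floating point

# skewness of a standardized variable is cum(X, X, X) = E[X^3]
z = (x1 - x1.mean()) / x1.std()
print(np.isclose(cum3(z, z, z), np.mean(z**3)))
```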

Theorem. Let x and y be two standardized variables with positive skewness. If y = ρx + e, with independent variables x and e and a constant coefficient ρ, then

T_skew(x, y) = skew(x)(ρ^2 - ρ^3). (1)

And if the causal direction is opposite, x = ρy + e, then

T_skew(x, y) = skew(y)(ρ^3 - ρ^2). (2)

Before proving the theorem, notice that the unit variances of x and y force |ρ| < 1, and therefore the theorem implies that T_skew(x, y) is positive when the first argument is the cause and negative when the latter argument is the cause, provided that the skewnesses of the arguments are positive. If a variable x* has a negative skewness, the theorem can nonetheless be applied to x = sign(skew(x*))x*, which has a positive skewness. In practice, ρ is the usual correlation coefficient. Notice that skew(x) = 0 when x is a Gaussian variable.
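The theorem's closed forms also lend themselves to a direct simulation check. The sketch below assumes the skewness statistic in the pairwise form T_skew(x, y) = ρ̂ (E[x^2 y] - E[x y^2]) following [3], and compares its empirical value with the prediction skew(x)(ρ^2 - ρ^3); the chi-square variables and the coefficient are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000
rho = 0.6

def std_chi2(k, size):
    """Standardized chi-square variable: mean 0, variance 1, positive skew."""
    w = rng.chisquare(k, size)
    return (w - w.mean()) / w.std()

x = std_chi2(4, n)                         # standardized, skewed cause
e = std_chi2(4, n)                         # independent, skewed disturbance
y = rho * x + np.sqrt(1 - rho**2) * e      # standardized effect, corr(x, y) = rho

def t_skew(a, b):
    r = np.corrcoef(a, b)[0, 1]
    return r * (np.mean(a**2 * b) - np.mean(a * b**2))

skew_x = np.mean(x**3)
empirical = t_skew(x, y)
predicted = skew_x * (rho**2 - rho**3)     # right-hand side of the theorem
print(empirical, predicted)                # should agree closely; both positive
```

Swapping the arguments, t_skew(y, x), gives the negative of the same value, matching the opposite-direction formula.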
Proof. Given the other assumptions and y = ρx + e, we have