Cross-Correlation Asymmetries and Causal Relationships between Stock and Market Risk

We study historical correlations and lead-lag relationships between individual stock risk (volatility of daily stock returns) and market risk (volatility of daily returns of a market-representative portfolio) in the US stock market. We consider the cross-correlation functions averaged over all stocks, using 71 stock prices from the Standard & Poor's 500 index for 1994–2013. We focus on the behavior of the cross-correlations at the times of financial crises with significant jumps of market volatility. The observed historical dynamics showed that the dependence between the risks was almost linear during the US stock market downturn of 2002 and after the US housing bubble in 2007, remaining at that level until 2013. Moreover, the averaged cross-correlation function often had an asymmetric shape with respect to zero lag in the periods of high correlation. We develop the analysis by the application of the linear response formalism to study underlying causal relations. The calculated response functions suggest the presence of characteristic regimes near financial crashes, when the volatility of an individual stock follows the market volatility and vice versa.


Introduction
A financial market is a complex system demonstrating diverse phenomena and attracting attention from a whole spectrum of disciplines ranging from social to natural science [1]. Better understanding of the behavior of financial markets has become an integral part of the discussion on further sustainable economic development. In this context, proper assessment of financial risks [2] plays a crucial role: Underestimated risks contribute to financial bubbles with eventual crashes, while overestimation of risks might cause inefficiency of financial resource allocations and a slowdown in economic growth, giving rise to periods of stagnation. This multifaceted problem, lying at the core of finance, draws significant interest from the physical and mathematical communities [3,4]. One of the key components of financial risk analysis is volatility assessment, which quantifies the financial stability of an asset in question. To this end, a number of methods have been proposed for risk modeling [5][6][7][8] and forecasting [9], along with numerous studies of various empirical properties of volatility, including such stylized facts as clustering [10][11][12], lead-lag effects [13], asymmetries [14,15] and many others (for a review see Refs. [16,17]). Related phenomena, being a result of collective behavior, also involve such aspects as estimation of correlation [18][19][20] and cross-correlation [21][22][23][24] matrices, study of their dynamics [25,26], asymmetric correlations [27], nonlinear correlations [28][29][30] and detrending [31,32], financial networks and clustering [33][34][35][36][37][38][39][40][41][42], multivariate stochastic models [43,44], critical phenomena [45,46], etc.
In the current paper, we focus on lead-lag effects between individual and collective volatility behavior in the US stock market, which might be further discussed in the context of the systemic regulation problem [47]. Earlier studies reported an increase of correlations across financial markets in recent times [25] along with overall market disposition to systemic collapses [48]. Our investigation thus aims to shed additional light on the dynamics of systemic risk in the last decade. For this purpose, we analyze historical prices of 71 stocks (Table 1) from the Standard & Poor's 500 index [49] (hereafter S&P 500) for 1994-2013. Although we employ one of the simplest volatility estimators, the simple moving average (SMA) standard deviation of daily logarithmic returns, it is conjectured to correctly describe asset risk dynamics on long time scales, on the order of months and years [50]. We harness cross-correlation analysis, which is a basic tool in the analysis of multiple time series. By definition, the absolute value of the normalized cross-correlation function lies between 0 and 1, indicating the strength of a linear relationship between time series, given that one is shifted by a particular lag value. It is crucial to note that our approach is based on a study of cross-correlations between quantities derived from the stock returns (standard deviations) rather than the analysis of cross-correlation matrices of the returns per se, implicitly involving calculation of cross-correlations between correlations: Since portfolio return is the sum of stock returns, its variance is the sum of all elements of the covariance matrix, $C$ [Eq. (3)], which can be factorized into the product of a correlation matrix, $R$, and a diagonal matrix of standard deviations, $C^{\mathrm{diag}}$, with elements $C^{\mathrm{diag}}_{ii} = \sqrt{C_{ii}}$ and $C^{\mathrm{diag}}_{i \neq j} = 0$: $C = C^{\mathrm{diag}} R \, C^{\mathrm{diag}}$.
These more sophisticated quantities will hopefully allow us to capture a more systematic evolution of the market risk as a function of time. Indeed, it was previously shown that market volatility and correlation are tightly related across international financial markets [51]. However, our calculations show that the cross-correlation function averaged over all stocks (see equations below) not only often has a maximum value close to 1 but also possesses an asymmetric shape with respect to zero lag (Fig. 1). These features suggest the presence of long-term trends, when equilibrium on the market is not reached within one trading day and overall market risk tends to follow individual stock risks [Fig. 1(a)] or vice versa [Fig. 1(b)]. Recently, the emergence of intraday trends has been reported for stock returns [52] and correlations [53], while our investigation develops similar ideas for stock volatilities. Generally, it is not possible to determine causality from an arbitrary shape of the cross-correlation function. However, if the cross-correlation function is asymmetric with respect to the time reversal operation (a change of sign of the time lag), it might hint at the presence of causal relationships [54]. Although determining true causality is rather a philosophical matter, we use this term in the predictive sense, i.e., whether the past values of one time series can be used to predict the present or future values of the other. In this regard, one of the most widely used approaches is the Granger causality test [55]. Following this method, one builds autoregressive models for the time series including and excluding the factors in question and checks whether the difference between the models is statistically significant.
However, in the current investigation, we propose to use an alternative approach utilizing a specific class of asymmetric cross-correlation functions studied in linear response theory [56], which provides a framework for describing input-output properties of a physical system. Within this approach, causality implies the absence of any response before an action (as long as there are no long-term memory effects), which results in zero values of the cross-correlation function for a particular lag direction (positive or negative), depending on the input-output roles of the variables. The simplest example is a force acting on a mass. The mass cannot move before the interaction, and thus the correlation between the force and the displacement is zero before the time when the force is applied. Although we do not expect to observe such trivial behavior in real financial markets, asymmetries in the empirical functions (Fig. 1) can be interpreted as an approximation to this ideal model, where the mass and force are represented by individual stock and collective market volatility or vice versa, depending on the observed regime. Making use of this approximation, we restrict ourselves to a qualitative analysis with the aim of revealing historical patterns only.

Estimating the stock and market risks
Let us first introduce the notation used throughout the paper. We consider $N$ discrete time series of daily closing stock prices and the corresponding daily logarithmic returns, $s_i(t)$, $i = 1, \dots, N$. Within the SMA approach, one can calculate a moving average for a particular discrete time moment $t$ using equally weighted values of $T$ previous days including the current one,

$$\bar{s}_i(t) = \frac{1}{T} \sum_{t'=t-T+1}^{t} s_i(t'). \tag{1}$$

In this case, a cross-covariance of two time series might be defined as

$$\sigma[s_i, s_j](t, \tau) = \frac{1}{T} \sum_{t'=t-T+1}^{t} \left[ s_i(t'+\tau) - \bar{s}_i(t+\tau) \right] \left[ s_j(t') - \bar{s}_j(t) \right], \tag{2}$$

where $\tau$ is a time lag. Series variance is a self-covariance at $\tau = 0$, $\sigma^2[s_i](t) \equiv \sigma[s_i, s_i](t, 0)$, where $\sigma$ denotes the standard deviation, or volatility in finance. This quantity can be used as the simplest risk measure: Stocks with higher values of $\sigma$ have less stable returns and, consequently, are less attractive for investment, other things being equal.
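The moving average and the lagged cross-covariance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`log_returns`, `sma`, `cross_cov`, `volatility`) are hypothetical, and the 1/T normalization of the covariance is an assumption (a 1/(T-1) normalization would be equally plausible and does not affect normalized correlations).

```python
import math

def log_returns(prices):
    """Daily logarithmic returns ln(S(t) / S(t-1)) from a list of closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def sma(x, t, T):
    """Equally weighted mean of x over the T days ending at day t (inclusive)."""
    if t - T + 1 < 0 or t >= len(x):
        raise ValueError("window falls outside the series")
    return sum(x[t - T + 1 : t + 1]) / T

def cross_cov(x, y, t, tau, T):
    """Cross-covariance of x shifted by lag tau against y (window of T days ending at t).

    Assumption: 1/T normalization; the original normalization is not known here.
    """
    mx = sma(x, t + tau, T)  # mean of the shifted window of x
    my = sma(y, t, T)        # mean of the unshifted window of y
    return sum((x[k + tau] - mx) * (y[k] - my)
               for k in range(t - T + 1, t + 1)) / T

def volatility(x, t, T):
    """SMA standard deviation: square root of the zero-lag self-covariance."""
    return math.sqrt(cross_cov(x, x, t, 0, T))
```

With this sign convention, a positive lag pairs future values of the first series with past values of the second one, so a series that follows another with a one-day delay has its largest covariance at a lag of +1.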
A stock market comprises all stocks available for trade. Although in the current investigation we consider a limited subset of stocks, it is chosen to represent the top US companies with the largest market capitalization. For such a portfolio, consisting of equal shares of $N$ stocks, the total return, $m(t)$, equals the sum of the separate stock returns, $m(t) = \sum_{i=1}^{N} s_i(t)$. Its variance, in addition to Eq. (2), can also be expressed as the sum of all elements of the covariance matrix $C(t)$, an $N \times N$ matrix with elements $C_{ij}(t) = \sigma[s_i, s_j](t, 0)$,

$$\sigma^2[m](t) = \sum_{i=1}^{N} \sum_{j=1}^{N} C_{ij}(t). \tag{3}$$

The square root of this value, $\sigma_m \equiv \sigma[m]$, can also be used as a portfolio risk measure, which characterizes overall market risk in the case of large $N$ (Fig. 2). In the remainder of the paper, we will focus on finding historical dependences and lead-lag relationships between individual stock risks, $\sigma_i \equiv \sigma[s_i]$, and market risk, $\sigma_m$, using the formalism presented in the following subsection.

Figure 4 (partial caption; susceptibilities corresponding to the cross-correlation functions in Fig. 1). The peaks of the imaginary parts hint at the causal relationships between the individual and collective risks: (a) individual stock risks on average tend to influence overall market risk; (b) market risk tends to influence risks of separate stocks. The susceptibilities are calculated using the discrete Fourier transform for the range of ±30 days around zero lag (61 days in total), which is highlighted with a blue background in Fig. 1.
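The statement that the variance of the equal-share portfolio return equals the sum of all covariance-matrix elements, together with the factorization of the covariance matrix into standard deviations and correlations mentioned in the Introduction, can be checked numerically. The following is a minimal sketch with made-up return values; the function names and the 1/n covariance normalization are illustrative assumptions, not the authors' code.

```python
import math

def cov(x, y):
    """Zero-lag covariance of two equally long series (1/n normalization assumed)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def covariance_matrix(returns):
    """N x N matrix C with C[i][j] = cov(s_i, s_j)."""
    return [[cov(a, b) for b in returns] for a in returns]

def market_volatility(returns):
    """Square root of the sum of all elements of the covariance matrix."""
    C = covariance_matrix(returns)
    return math.sqrt(sum(sum(row) for row in C))

def correlation_matrix(C):
    """Split C into standard deviations d and correlation matrix R, so C = diag(d) R diag(d)."""
    d = [math.sqrt(C[i][i]) for i in range(len(C))]
    R = [[C[i][j] / (d[i] * d[j]) for j in range(len(C))] for i in range(len(C))]
    return R, d

# Hypothetical daily returns of three stocks over four days (illustrative values only).
returns = [
    [0.010, -0.020, 0.015, 0.000],
    [0.005, -0.010, 0.020, -0.005],
    [-0.010, 0.000, 0.010, 0.020],
]
market = [sum(day) for day in zip(*returns)]  # portfolio return m(t) = sum_i s_i(t)
direct = math.sqrt(cov(market, market))       # volatility computed from m(t) directly
```

Here `direct` and `market_volatility(returns)` agree up to floating-point error, which is the content of the identity: market volatility mixes individual volatilities with all pairwise correlations.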

Causality analysis
One of the possible ways to estimate dependence between two time series $x(t)$ and $y(t)$ is to calculate the cross-correlation function

$$\rho[x, y](t, \tau) = \frac{\sigma[x, y](t, \tau)}{\sigma[x](t+\tau)\, \sigma[y](t)}, \tag{4}$$

which is normalized and ranges from −1 to 1. Its peak value shows the strength of a linear relationship between $x$ and $y$ (with zero value corresponding to its absence) when the first series is shifted by the time lag $\tau$. In this section, we assume this peak value to be positive since the opposite case can be easily recovered via multiplication of $x$ or $y$ by −1: Noting that $\sigma[-x, y] = \sigma[x, -y] = -\sigma[x, y]$ and $\sigma[-x] = \sigma[x]$, one immediately gets $\rho[-x, y] = \rho[x, -y] = -\rho[x, y]$. If the dependence between series is nonlinear, more sophisticated statistical concepts should be used instead, for instance, cross-entropy [28], copula [29] or Spearman's rank correlation [30]. However, we employ the linear Pearson's coefficient [Eq. (4)] in the present study. Even when two series are correlated, this fact alone does not establish a causal relationship between the variables. However, the particular shapes of the cross-correlation functions studied within linear response theory can provide an insight into this problem. This theory provides a convenient framework for the study of related dynamical properties of a physical system. Within this approach, the cross-correlation function defines the system's response to an external action, obeying laws of motion. In this context, causality implies the absence of any deterministic response before an action, i.e., the expected value of the cross-correlation function is zero for a particular lag direction ($\tau > 0$ or $\tau < 0$) defined by the input-output roles of $x$ and $y$. For example, the response function of the first-order ordinary differential equation [Eq. (5)], where $a$ and $b$ are some constants and $y$ is the delta function (impulse force), is depicted in Fig. 3(a).
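To make the lag convention of the normalized cross-correlation function concrete, the sketch below computes it over a range of lags for a synthetic pair of series in which x repeats y with a delay of three days. The helper names (`cross_corr`, `correlogram`), the window length, and the test signal are all illustrative assumptions.

```python
import math

def cross_corr(x, y, t, tau, M):
    """Pearson correlation of x(k + tau) with y(k) for k in the M-day window ending at t."""
    first = t - M + 1
    if first < 0 or first + tau < 0 or t + tau >= len(x) or t >= len(y):
        raise ValueError("window falls outside the series")
    xs = [x[k + tau] for k in range(first, t + 1)]
    ys = [y[k] for k in range(first, t + 1)]
    mx, my = sum(xs) / M, sum(ys) / M
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def correlogram(x, y, t, tau_max, M):
    """Cross-correlation for all integer lags in [-tau_max, tau_max]."""
    return {tau: cross_corr(x, y, t, tau, M) for tau in range(-tau_max, tau_max + 1)}

# Synthetic example: x follows y with a delay of 3 days, i.e. x(k) = y(k - 3).
delay = 3
y = [math.sin(0.7 * k) + 0.5 * math.sin(1.9 * k + 1.0) for k in range(200)]
x = [0.0] * delay + y[:-delay]

rho = correlogram(x, y, t=150, tau_max=10, M=60)
best_lag = max(rho, key=rho.get)  # lag with the highest correlation
```

In this toy case the peak sits at a positive lag equal to the delay, matching the convention that positive lags pair future values of x with past values of y, so that y plays the role of the leading series.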
Here, $y$ can be uniquely identified as an external action because $\rho[x, y]$ is non-zero only for $\tau > 0$, the time direction corresponding to the future values of $x$ and the past values of $y$ [see Eq. (2)]. This asymmetry of the response is also graphically reflected in its Fourier transform (we use its discrete analogue with a unitary norm for the analysis), known as the susceptibility, $\chi(\omega)$ [Eq. (6)], which is a complex-valued function of angular frequency $\omega$. Its real (reactive) part, $\mathrm{Re}\,\chi$, being an even function of $\omega$, is defined by the response strength, while the imaginary (dissipative) part, $\mathrm{Im}\,\chi$, is an odd function of $\omega$ defined by the asymmetric part of $\rho$. It is worth noting that any function $\rho(\tau)$ can be written as the sum of an even function, $\rho_{\mathrm{even}}(-\tau) = \rho_{\mathrm{even}}(\tau)$, and an odd function, $\rho_{\mathrm{odd}}(-\tau) = -\rho_{\mathrm{odd}}(\tau)$. In this case, $\mathrm{Re}\,\chi$ is the Fourier transform of $\rho_{\mathrm{even}}$ while $\mathrm{Im}\,\chi$ is the Fourier transform of $\rho_{\mathrm{odd}}$. Regarding the action-reaction roles of $x$ and $y$ in Eq. (5), $\mathrm{Im}\,\chi$ has a negative peak for $\omega > 0$ [Fig. 3(a)] and a positive peak if the variables are interchanged [Fig. 3(b)]. Additionally, $\mathrm{Re}\,\chi$ and $\mathrm{Im}\,\chi$ should satisfy the Kramers-Kronig relations, which are a mathematical condition for a complex function to be analytic and hence for the underlying physical system to be stable [57]. The empirical cross-correlation functions (Fig. 1), whose characteristic shapes are schematically depicted in Figs. [...] Despite this fact, the corresponding susceptibilities display similar features of the real and imaginary parts (Fig. 4). Thus, we consider them as a coarse approximation to the theoretical linear response functions and utilize the peak of $\mathrm{Im}\,\chi(\omega > 0)$ as an indicator of possible causal dependence.

If the cross-correlation function is completely symmetric with respect to the time reversal operation, $\tau \to -\tau$ [Fig. 3(c)], no causal relation between $x$ and $y$ can be established within the linear response formalism given the cross-correlation function alone: This fact implies that the interchange of the input-output roles of the underlying variables produces exactly the same observable behavior of the system as a whole. However, when the maximum value of $\rho$ is slightly shifted [Fig. 3(d)] or the function decays faster for one lag direction than for the other [Fig. 3(e)], one might expect that a change of $y$ tends to cause a reaction of $x$ because of the enhanced response for the future values of $x$. In doing so, reversal of the observed input-output roles corresponds to a change of the sign of the imaginary part while the real part remains unaffected. Finally, fitting a particular susceptibility model to the empirical data allows one to determine the differential equation which governs the observed behavior of the system. However, the behavior of a real financial market is usually highly nonlinear, possessing long-term memory effects [58,59] and fractal structure [60,61], which is beyond the scope of the discussed method. One of the possible ways to extend the presented approach might be the application of nonlinear response theory [62], although this case is not considered in our paper. We restrict ourselves to the basic linear qualitative analysis, which only hints at the direction of influence between the variables in question.

Figure 7 (partial caption). Filled areas under the $\bar{\rho}_{\max}$ panel mark the periods where it is not significantly bigger than 0.5. The distance between two labeled dates is 500 trading days and the highlighted periods correspond to the major financial crises described in Fig. 2.
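The link between the asymmetry of a response function and the sign of the imaginary part of its Fourier transform can be illustrated numerically. The sketch below is an assumption-laden toy, not the authors' procedure: it uses a one-sided exponential decay as a stand-in for a causal response, a discrete Fourier transform with unitary normalization over lags from -30 to 30 days, and the kernel exp(-i·omega·tau), whose sign is an assumed convention chosen so that a response at positive lags gives a negative imaginary part at positive frequencies.

```python
import cmath
import math

def susceptibility(rho, tau_max):
    """Unitary DFT of rho(tau), tau = -tau_max..tau_max, at positive frequencies.

    rho is a list of length 2*tau_max + 1 with rho[tau + tau_max] = rho(tau).
    Assumption: kernel exp(-i * omega * tau).
    """
    n = 2 * tau_max + 1
    result = []
    for k in range(1, tau_max + 1):
        omega = 2.0 * math.pi * k / n
        chi = sum(rho[tau + tau_max] * cmath.exp(-1j * omega * tau)
                  for tau in range(-tau_max, tau_max + 1)) / math.sqrt(n)
        result.append((omega, chi))
    return result

def even_odd(rho):
    """Split rho(tau) into parts that are even and odd under tau -> -tau."""
    rev = rho[::-1]
    even = [(a + b) / 2 for a, b in zip(rho, rev)]
    odd = [(a - b) / 2 for a, b in zip(rho, rev)]
    return even, odd

tau_max = 30
# Toy causal response: non-zero only for tau >= 0 (y acts, x reacts).
response = [math.exp(-tau / 5.0) if tau >= 0 else 0.0
            for tau in range(-tau_max, tau_max + 1)]

chi_xy = susceptibility(response, tau_max)        # y drives x
chi_yx = susceptibility(response[::-1], tau_max)  # roles interchanged
even, odd = even_odd(response)
```

For this toy response the imaginary part of `chi_xy` is negative at all positive frequencies, interchanging the roles flips its sign while leaving the real part unchanged, and the even part alone yields a purely real transform.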

Results
We are now in a position to determine causal relations between the individual stock and total market risk, applying the formalism from the previous section. With this aim, we analyze the historical prices of $N = 71$ stocks [63] of the largest US companies in terms of market capitalization, members of the S&P 500 (Table 1). The historical period considered is between 1994 and 2013, roughly corresponding to 4600 trading days. Being interested in the average market dynamics, we consider a mean value of $\rho[\sigma_i, \sigma_m]$. However, averaging correlation coefficients is problematic since their distribution is highly skewed when the value of $\bar{\rho}$ is close to 1 [top panel in Fig. 5(a)], which makes them nonadditive quantities. In this regard, a number of methods have been proposed to tackle this issue [64,65]. The simplest one is the Fisher transform [66],

$$z\{\rho\} = \frac{1}{2} \ln \frac{1+\rho}{1-\rho},$$

which makes the distribution of correlation coefficients approximately normal [bottom panel in Fig. 5(a)]. In this case, the average correlation might be estimated by averaging the transformed values and applying the inverse transform,

$$\bar{\rho} = \tanh\left( \frac{1}{N} \sum_{i=1}^{N} z\{\rho[\sigma_i, \sigma_m]\} \right),$$

with a confidence interval (CI) for which the critical value $z_{\mathrm{table}} = 1.96$, corresponding to the 95% confidence level, is further used. When $\bar{\rho}$ is small, the distribution is not skewed and the Fisher transform does not affect it ($z\{\rho\} \approx \rho$ for small $\rho$) [Fig. 5(b),(c)]. This average function is subsequently Fourier transformed to obtain the average susceptibility, $\bar{\chi}$, using the discrete analogue of Eq. (6) with a unitary norm for the interval $\tau \in [-\tau_{\max}, \tau_{\max}]$. It is also worth noting that the use of an SMA for the calculation of volatilities ($\sigma_i$ and $\sigma_m$) imposes smoothing on the corresponding time series. Thus, a bigger window of size $M > T$ should be used for the calculation of $\rho$ in Eq. (4) to avoid spurious correlations [Fig. 6(c),(f)]. Additionally, Fig. 5(c) suggests that averaging over a large number of stocks effectively reduces related undesirable effects.
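Averaging correlation coefficients through the Fisher transform can be sketched as follows. The transform and its inverse (the hyperbolic arctangent and tangent) are standard; the confidence interval, however, is an assumption made for illustration only: it uses the standard error of the mean of the transformed values across stocks, which may differ from the interval used in the paper. The function name `average_correlation` is hypothetical.

```python
import math

def fisher_z(r):
    """Fisher transform z = 0.5 * ln((1 + r) / (1 - r)), i.e. atanh(r)."""
    return math.atanh(r)

def average_correlation(rs, z_table=1.96):
    """Average correlation via the Fisher transform, with an approximate CI.

    Assumption: the CI half-width is z_table times the standard error of the
    mean of the transformed values (needs at least two coefficients).
    """
    zs = [fisher_z(r) for r in rs]
    n = len(zs)
    z_mean = sum(zs) / n
    s = math.sqrt(sum((z - z_mean) ** 2 for z in zs) / (n - 1))
    half = z_table * s / math.sqrt(n)
    # Transform the mean and the interval limits back to the correlation scale.
    return math.tanh(z_mean), (math.tanh(z_mean - half), math.tanh(z_mean + half))

# Illustrative coefficients close to 1, where the distribution is strongly skewed.
rs = [0.90, 0.95, 0.99]
rho_bar, (ci_low, ci_high) = average_correlation(rs)
```

For coefficients close to 1 the Fisher average lies above the plain arithmetic mean, and the interval limits always stay inside (-1, 1), which is the practical reason for averaging in the transformed space.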
The task at hand requires the series in question to be correlated. For this purpose, we calculate the maximum value of the correlation between the market risk and individual stock risk, $\bar{\rho}_{\max}$, within the considered range of lag $\pm\tau_{\max}$. The historical dynamics of this maximum value (second panel in Fig. 7) suggests that it becomes significantly bigger than 0.5 near major financial crashes, while at other times the series seem to be weakly correlated. In this respect, one can highlight the US market downturn of 2002 and approximately the 5-year period from the US housing bubble in 2007 until 2013, when an almost linear relationship was observed. For such highly correlated risks, it is feasible to perform causal analysis within the linear response approximation.
As was mentioned before, typical shapes of $\bar{\rho}$ and $\bar{\chi}$ are depicted in Fig. 1 and Fig. 4, respectively. For instance, causality analysis of the two dates shown there, both near the European sovereign debt crisis, reveals that on Jun 15, 2011 [Fig. 1(a)] the maximum value of the cross-correlation function is shifted left with respect to zero lag, which is reflected as a negative peak of the imaginary part of the susceptibility for positive frequencies [Fig. 4(a)]. Following the discussion from the previous section, this feature corresponds to the leading influence of individual stock risks on the total market risk, while the opposite situation is observed on Sep 9, 2011 [Fig. 1(b) and Fig. 4(b)]. The historical analysis of the average susceptibility dynamics (two bottom panels in Fig. 7) for the periods with a high value of $\bar{\rho}_{\max}$ reveals two peculiarities. The first one is that individual stock risks follow market risk after big crashes. This feature can be viewed as a consequence of herding behavior, when stock risks are trying to reach a new equilibrium with overall market risk as a benchmark. This fact is also in agreement with the studies on asymmetric phenomena [14,15,27], which have shown an increase of volatility and correlations in a bear market. The second peculiarity can be observed, for example, before the Lehman Brothers collapse in 2008 and the European sovereign debt crisis in 2012, when individual stock risks on average start to influence market risk shortly before a crash, while at the crash the direction of influence is reversed. Finally, Fig. 8 shows that this behavior is observed for different window sizes $T$ and $M$; however, the use of bigger values of $M$ smooths the described effects.

Discussion
We have studied average lead-lag relationships between individual stock and collective market risk in the US stock market using cross-correlation analysis. Our calculations have shown that stock and market volatility are tightly correlated during periods of financial instability. Furthermore, the correlation functions often possess asymmetries with respect to zero lag, which is a potential sign of a causal dependence between the risks within the linear response approximation. Having analyzed historical data for 1994-2013, we have found similar patterns near the last major crashes. Firstly, after a financial crash, individual stock risks tend to follow collective market behavior. Secondly, the opposite influence, when stock risks on average start to influence market risk, is observed before particular crashes, for instance, the Lehman Brothers collapse in 2008 or the European sovereign debt crisis in 2012. Eventual market adjustment after the crash leads to the restoration of a symmetric shape of the average cross-correlation function and a decrease of its maximum value. This is also reflected in the Fourier transform of the cross-correlation, known as the susceptibility. For this complex function, reversal of the causal dependence corresponds to a change of the sign of its imaginary part, while the real part remains unaffected, and the absence of the dependence results in a zero value of the imaginary part. We suggest that the observed patterns might be interpreted as a manifestation of herding behavior, when the economic performance of separate companies systematically fails to meet the expectations of investors, creating panic across the market, whereas after the crash the financial risks of separate companies adapt to a new reality with overall market performance as a psychological benchmark.