Intraday seasonalities and nonstationarity of trading volume in financial markets: Collective features

Employing Random Matrix Theory and Principal Component Analysis techniques, we extend our work on the individual and cross-sectional intraday statistical properties of trading volume in financial markets to the study of the collective intraday features of that financial observable. Our data consist of the trading volume of the Dow Jones Industrial Average Index components spanning the years between 2003 and 2014. Computing the intraday time-dependent correlation matrices and their spectrum of eigenvalues, we show there is a mode ruling the collective behaviour of the trading volume of these stocks, whereas the remaining eigenvalues are within the bounds established by random matrix theory, except the second largest eigenvalue, which is robustly above the upper bound at the opening and slightly above it during the morning-afternoon transition. For price fluctuations, the existence of at least seven significant eigenvalues has been reported, and their autocorrelation function is close to white noise for highly liquid stocks, whereas for the trading volume the autocorrelation lasts significantly for more than 2 hours; our finding therefore goes against any expectation based on those features, even when we take into account the Epps effect. In addition, the weight of the trading volume collective mode is intraday dependent: its value increases as the trading session advances, with its eigenversor approaching the uniform vector as well, which corresponds to a rise in behavioural homogeneity. With respect to the nonstationarity of the collective features of the trading volume, we observe that after the financial crisis of 2008 the coherence function shows the emergence of an unsettled profile with large fluctuations from that year on, a property that concurs with the modification of the average trading volume profile we noted in our previous individual analysis.


Introduction
The relevance of the trading volume in stock trading is well established: the decision to buy or sell is mainly prompted by factors that bidders and askers believe to affect the price S and by their reckoning that the stock is underpriced (overpriced). In other words, setting particular episodes apart (e.g., dividend or interest payments, (reverse) stock splits), the price changes when a buyer and a seller agree on making a transaction at some new price S. In this context, the transfer of equity between agents, the trading volume, can act as a proxy for the flow of information among them. Large trading volumes are immediately associated with a lot of 'buzz', which tends to be reflected in the price [1].
In the scenario where the trading volume is a proxy for market information, we can study the relations between trading volume, price fluctuations and information. The first approach blending these three quantities is the Mixture of Distributions Hypothesis (MDH) introduced by Clark [2], which conjectures that, since the dynamics of the volatility and the trading volume are dependent on latent events (disclosure of information), there is a joint distribution for these two quantities, with both quantities marginally following a log-Normal distribution. Additionally, Clark defined a difference between physical (clock) time and proper (event) time at which information is input, a difference of concepts that also turns out to be fundamental in the description of complex systems and critical phenomena [3]. As examples of quantitative approaches assuming the MDH, we point to studies on the stochastic nature of the volatility [4][5][6][7] as well as ARCH-like heteroskedastic models [8], which consider that the price fluctuations and the trading volume share the same underlying [9].
Contrasting with the MDH synchronous nature of the trading volume and volatility dynamics, Copeland introduced the Sequential Arrival of Information Hypothesis (SAIH) [10] which asserts information reaches the trading agents at different times so that the final state of the market is attained by a sequence of local stationary states. In that case, there is a delay in the maximal correlation between volatility and the trading volume.
In a previous paper of ours, hereinafter referred to as Paper I [11], we introduced a broad study of the individual and cross-sectional intraday statistical properties of the trading volume of the blue chip equities that compose the Dow Jones Industrial Average, with the aim of going beyond the well-known intraday U-shape of the average trading volume, a profile that is also shared by different definitions of the volatility, including the absolute value of the price fluctuations [1][12][13][14][15][16]. Paper I showed important features of the trading volume, such as the fact that the morning (am) and the afternoon (pm) parts of the business day clearly have different dynamical mechanisms of trading. Therein, we related that property to an important change in the way the market participants transfer information between them during the two parts of the session; furthermore, we concluded that such difference is now significantly larger than before the 2008 crisis. That evolution has been accompanied by an overall reduction of the concavity of the U-shape. In addition, we computed a ∽-shape for the kurtosis in both the individual and the cross-sectional analyses, with the former being always greater than 3 (leptokurtic) whereas the latter approaches 3 (Gaussianity) by the middle of the morning and by the end of the session. The message we extracted from combining all those statistical properties is that the higher level of trading in the morning is related to equities whose overnight news must be transferred to their prices, whereas the increase in the trading activity that is observed in the last part of the session corresponds to an overall increase in the level of activity, which is largely fuelled by the need for closing positions. These results add to other established properties of the trading volume [16] such as the asymptotic power-law distribution [17][18][19][20], long-lasting correlations [20,21] and (multi-)scaling [22,23].
Besides the statistical description of the trading volume, related quantities such as the trading value [24][25][26][27][28] (the product of the price by the trading volume) and the trading volume fluctuations [29] have been studied and have provided important insights into financial dynamics.
Although the cross-sectional analysis already furnished some information on the mutual behaviour of the trading volume, we can obtain a further and more detailed insight into its collective features utilising Principal Component Analysis (PCA) and Random Matrix Theory (RMT). The latter was first applied by E.P. Wigner to explain the energy levels of atomic nuclei [30][31][32]. The idea that the complex nature of their constituents and the interactions between them would be best described by stochastic elements has found resemblance in a wide range of physical [30][33][34][35][36] and non-physical phenomena [37][38][39][40], including Quantitative Finance [41][42][43][44][45][46][47][48][49][50], namely in the collective analysis of price fluctuations for the purposes of portfolio management and risk assessment.
With this paper, we seek to understand how the trading volume collectively behaves across the trading session from a random matrix perspective. To that end, we follow the lines presented in the section 'Materials and Methods'; in the section 'Results' we first analyse the general collective properties of the trading volume in a financial market and, in a second stage, study how these properties have evolved in recent years. Finally, in 'Conclusion' we address the impact of our results on trading modelling and set forth prospective issues worth studying.

Materials and methods
Materials
Our results are obtained from the same data as Paper I: 1-minute trading volume spanning the period between the 4th January 2004 and the 30th December 2013 of the 30 companies composing the Dow Jones Industrial Average. These data have been provided by Olsen Financial Data and the Chair of Quantitative Finance of the École CentraleSupélec. (The former has furnished the data for the second semester of 2004, which we used for a first analysis, and the latter has supplied data corresponding to the entire set from which the full results are obtained.) The companies and their corresponding tickers are listed in Table 1, where the suffix '.N' signals the equity is traded at NYSE and '.OQ' at NASDAQ. Both markets, disregarding pre- and post-market periods, open at 9:30 and close at 16:00. We employ the notation v_i(t, d; s) to represent the trading volume of company i at time t and day d. The time t is expressed as an integer so that 9:30 → t = 1, ..., 16:00 → t = 391. On top of that, we also take into account the semester, s, that day d belongs to. We do so because we divide our data into semesters in order to assess the nonstationarity of the intraday properties. Partitioning our data that way yields a good balance between quasi-stationarity and the number of days within each interval so that a significant analysis can be carried out. Such segmenting is performed in two ways: with overlapping and with nonoverlapping semesters. In the overlapping semester scheme, the 6-month period comprising the months from January to June of 2004 corresponds to s = 1, from February to July of 2004 is s = 2, from March to August of 2004 is denoted by s = 3 and so forth, whereas in the nonoverlapping contiguous approach we make use of the notation xSyy, where x ∈ {1, 2} represents the first (second) semester, i.e., from January (July) to June (December), of the year 20yy.
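The time indexing and the overlapping-semester labelling described above can be sketched in a few lines of Python. The function names are hypothetical; the sketch only assumes what the text states, namely that 9:30 maps to t = 1 and that s = 1 covers January to June of 2004, with each subsequent semester shifted by one calendar month.

```python
from datetime import datetime


def minute_index(clock: str) -> int:
    """Map an 'HH:MM' clock time to the integer t, with 09:30 -> 1 and 16:00 -> 391."""
    opening = datetime.strptime("09:30", "%H:%M")
    now = datetime.strptime(clock, "%H:%M")
    return int((now - opening).total_seconds() // 60) + 1


def overlapping_semester_months(s: int, first_year: int = 2004):
    """Return the six (year, month) pairs covered by overlapping semester s."""
    start = s - 1  # months elapsed since January of first_year
    return [(first_year + (start + k) // 12, (start + k) % 12 + 1) for k in range(6)]
```

For instance, `overlapping_semester_months(2)` runs from February to July of 2004, matching the definition of s = 2 in the text.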
Methods
F-test on the equality of the variances. Sets of random variables can differ in a multitude of ways, e.g., in respect of the average, the variance or, generically, the distribution itself. When we want to compare the variances of (sub)sets of random variables, the F-test on the equality of the variances is the standard test. In general, and for two populations, that test corresponds to the computation of the ratio

$$F = \frac{s_x^2}{s_y^2}, \tag{1}$$

where

$$s_x^2 = \frac{1}{n_x - 1} \sum_{l=1}^{n_x} (x_l - m_x)^2, \qquad s_y^2 = \frac{1}{n_y - 1} \sum_{l=1}^{n_y} (y_l - m_y)^2$$

correspond to the variances of two independent and identically distributed sets of random variables {x} and {y} with n_x and n_y elements, following distributions whose averages are equal to $m_x = \frac{1}{n_x} \sum_{l=1}^{n_x} x_l$ and $m_y = \frac{1}{n_y} \sum_{l=1}^{n_y} y_l$, respectively. The variable F is associated with an F-distribution with n_x − 1 and n_y − 1 degrees of freedom. Testing the equality of the variances, the null hypothesis assumes that the ratio Eq (1) must be equal to 1. If, at a significance level α, $F^{stat}_{\alpha/2;\, n_x - 1,\, n_y - 1}$ is smaller than the F value, or else $F^{stat}_{1 - \alpha/2;\, n_x - 1,\, n_y - 1}$ is bigger than the F value, where $F^{stat}_{\alpha;\, n_x - 1,\, n_y - 1}$ is the critical value of the F-distribution with n_x − 1 and n_y − 1 degrees of freedom that is exceeded with probability α, then we can reject the null hypothesis.
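As an illustration of the two-sided F-test just described, the following minimal sketch uses SciPy's F-distribution. The function name and the default α = 0.05 are assumptions made for the example, and the critical values are written as lower and upper quantiles of the F-distribution, which is equivalent to the rejection rule stated above.

```python
import numpy as np
from scipy import stats


def f_test_equal_variances(x, y, alpha=0.05):
    """Two-sided F-test on the equality of the variances of samples x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dfx, dfy = x.size - 1, y.size - 1
    # Ratio of the unbiased sample variances, Eq (1).
    f_value = np.var(x, ddof=1) / np.var(y, ddof=1)
    # Quantiles that leave alpha/2 of probability in each tail.
    lower = stats.f.ppf(alpha / 2, dfx, dfy)
    upper = stats.f.ppf(1 - alpha / 2, dfx, dfy)
    reject = bool(f_value < lower or f_value > upper)
    return f_value, (lower, upper), reject
```

The null hypothesis of equal variances is rejected whenever the returned ratio falls outside the interval bounded by the two critical values.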
Fundamentals of random matrix theory in a nutshell. For every company i, we can group the trading volume v_i(t, d; s) into a set constrained to the parameter d (and thus to s as well), {v_i(t, d; s)|_{d, s}}, with N_D elements corresponding to the number of days in semester s with active trading at time t. The combination of these N × N_D random variables is used to define the N × N correlation matrix, C(t; s), whose entries correspond to Pearson's correlation coefficient,

$$C(t; s)_{ij} = \frac{\overline{[v_i(t, d; s) - \mu_i(t; s)]\,[v_j(t, d; s) - \mu_j(t; s)]}}{\sigma_i(t; s)\,\sigma_j(t; s)}, \tag{7}$$

(−1 ≤ C(t; s)_ij ≤ 1), where the overline represents that we have averaged over the number N_D of valid days in that semester. The average, μ_i(t; s), and the standard deviation, σ_i(t; s), are obtained by carrying out the statistics over N_D as well. According to Eq (7), the elements of the diagonal of the correlation matrix are equal to one, and therefore the trace of C(t; s) is always equal to N = 30. Taking into consideration the RMT, we can identify important properties of the correlation matrices Eq (7); one of them corresponds to the probability distribution of the spectrum of eigenvalues {λ}. In the asymptotic limit N, N_D → ∞ with 1 < N_D/N < ∞, that spectrum follows the Marchenko-Pastur distribution:

$$\rho_{t;s}(\lambda) = \frac{N_D}{N}\, \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2 \pi \lambda}, \tag{8}$$

for λ_− ≤ λ ≤ λ_+ and ρ_{t;s}(λ) = 0, otherwise. The symbols λ_± represent the maximal (minimal) eigenvalue and they read

$$\lambda_\pm = \left( 1 \pm \sqrt{\frac{N}{N_D}} \right)^2. \tag{9}$$

The eigenvalues larger than the maximal eigenvalue λ_+ establish the meaningful structure of the correlations of the multivariate stochastic variable, whereas the remaining eigenvalues lying within the interval [λ_−, λ_+], the Marchenko-Pastur 'sea', are equivalent to random noise. Accordingly, the correlation matrix corresponds to the superposition of two other correlation matrices,

$$C(t; s) = C^{rnd}(t; s) + C^{str}(t; s),$$

where the random part, C^rnd(t; s), is composed of the eigenvalues within the 'sea' whereas the structural counterpart, C^str(t; s), consists of the eigenvalues larger than λ_+.
If the largest observed eigenvalue λ_max is larger than the maximum eigenvalue λ_+ and (much) larger than the other eigenvalues, and its corresponding eigenvector, ṽ_max, has all of its components greater than zero, we can associate these eigencomponents with a market mode.
Spectral Decomposition of a correlation matrix. In practical terms, the finiteness of the data affects the determination of the correlation matrix entries, Eq (7), which might lead to unrealistic results such as the computation of negative eigenvalues. In order to filter specious contributions to the correlation matrix, we apply the method known as Spectral Decomposition within Principal Component Analysis [50].
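The ingredients of this subsection (Pearson correlation matrix at a fixed intraday time, Marchenko-Pastur bounds and the market-mode criterion) can be sketched with NumPy as follows. This is only an illustrative sketch: the function names are hypothetical, the input layout (one row per day, one column per company) is an assumption, and the criterion that the largest eigenvalue be "much" larger than the others is not checked.

```python
import numpy as np


def correlation_matrix(volumes):
    """Pearson correlation matrix of an (N_D, N) array: rows are days, columns are companies."""
    return np.corrcoef(volumes, rowvar=False)


def marchenko_pastur_bounds(n_companies, n_days):
    """Lower and upper Marchenko-Pastur bounds for unit-variance variables."""
    q = n_companies / n_days
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2


def market_mode(corr, lam_plus):
    """Return the largest eigenvalue, its eigenvector and whether it qualifies as a market mode."""
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    lam_max, v_max = eigvals[-1], eigvecs[:, -1]
    if v_max.sum() < 0:  # the overall sign of an eigenvector is arbitrary
        v_max = -v_max
    is_mode = bool(lam_max > lam_plus and np.all(v_max > 0))
    return lam_max, v_max, is_mode
```

As a usage note, `marchenko_pastur_bounds(30, 128)` gives bounds of roughly 0.27 and 2.2; the value of 128 trading days is an assumed, illustrative figure for one semester and is not taken from the data description.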
If S is the eigensystem of a real and symmetric matrix U and {λ_i} the set of the eigenvalues of U, then the following identity holds:

$$U S = S \Lambda, \qquad \Lambda = \mathrm{diag}(\lambda_i).$$

Moreover, we can define the nonzero elements of the diagonal matrix Λ′ as

$$\lambda'_i = \begin{cases} \lambda_i, & \lambda_i \ge 0, \\ 0, & \lambda_i < 0. \end{cases}$$

We can also define the non-zero elements of the diagonal scaling matrix T from the eigensystem S by

$$t_i = \left[ \sum_m s_{im}^2\, \lambda'_m \right]^{-1}.$$

Taking

$$B' = S \sqrt{\Lambda'}, \qquad B = \sqrt{T}\, B' = \sqrt{T}\, S \sqrt{\Lambda'},$$

we have

$$\hat{U} = B B^{\mathsf{T}},$$

where Û is positive semi-definite and has all the elements of the main diagonal equal to 1.
That being said, we can summarise the method as follows:
1. Compute the eigenvalues λ_i of U and their eigenvectors s_i;
2. Set all the negative eigenvalues λ_i to zero;
3. Multiply each eigenvector s_i by the square root of its adjusted associated eigenvalue λ′_i to get the columns of B′;
4. Obtain B from the normalisation of the row vectors of B′.
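The four steps can be sketched in NumPy as below. This is a minimal sketch rather than the authors' implementation: the function name is hypothetical, and it assumes that each eigenvector is scaled by the square root of its clipped eigenvalue, so that the product of the row-normalised matrix with its transpose has a unit diagonal.

```python
import numpy as np


def spectral_decomposition_fix(u):
    """Return a positive semi-definite, unit-diagonal matrix built from symmetric u."""
    eigvals, s = np.linalg.eigh(u)            # step 1: eigenvalues and eigenvectors (columns of s)
    adjusted = np.clip(eigvals, 0.0, None)    # step 2: negative eigenvalues set to zero
    b_prime = s * np.sqrt(adjusted)           # step 3: scale column i by sqrt(adjusted[i])
    row_norms = np.linalg.norm(b_prime, axis=1, keepdims=True)
    b = b_prime / row_norms                   # step 4: normalise the rows of B'
    return b @ b.T
```

For example, the symmetric matrix with unit diagonal and off-diagonal entries 0.9, −0.9 and 0.9 has eigenvalues −0.8, 1.9 and 1.9, so it is not a valid correlation matrix; the function maps it to the matrix with off-diagonal entries 0.5, −0.5 and 0.5.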

Results and discussion
Intraday dynamics
Eigenvalue analysis. We start our analysis on the collective behaviour of the trading volume by computing the eigenvalues and the eigenvectors of the correlation matrix C(t; s) whose entries are defined by Eq (7). The typical evolution of that matrix is shown in Fig 1 for some illustrative timestamps and for the entire trading session in the video available in S1 Video for 2S04. The analysis of those plots allows us to grasp that there are specific times, namely the opening of the market (t ≈ 1) and the (effective) transition between the morning and the afternoon (timestamp t ≈ 190), where the correlations are stronger on average. Specifically, we observe that browner (greener) regions emerge at those times.
That perception is confirmed when we define a second matrix from C(t; s) by taking the absolute values of its entries, which are then sorted, e.g., in descending order, along each row (see Fig 2 and S2 Video for s = 2S04). In that case, the lighter the region in the plot, the stronger the correlations.
From a purely quantitative perspective, the main analysis that can be carried out on a (random) matrix is the determination of its spectrum of eigenvalues, {λ_α}, and the set of their associated eigenvectors, {ṽ_α}. In Fig 3, we present the intraday profile of the three largest eigenvalues for 2S04. For the largest eigenvalue, λ_1(t; 2S04), a step-like shape is perceivable. Applying Student's t-test on the equality of the means (the details of this test can be found, e.g., in the section 'Materials and Methods' of Paper I), we obtain a t-value equal to 12.09, which is much larger than the critical t-value of 1.654. That implies the two parts of the session are robustly different from one another at the 95% level. For the case of λ_2(t; s) and λ_3(t; s), we are unable to identify any relevant difference between the morning and the afternoon parts of the session.
Following the theory of random matrices, we can determine the upper and lower bounds, λ_±, of the Marchenko-Pastur distribution (MPD), Eq (8). For the illustrative semester, 2S04, using Eq (9) we have λ_+ = 2.2 and λ_− = 0.27. Across the trading session, the average value of the largest eigenvalue is equal to 4.2, which is consistently above λ_+. From these figures, we confirm that λ_1 corresponds to a mode of the trading volume, a feature that extends to all the semesters composing our dataset. The identification of that market mode can be further tightened by applying the rank-one perturbation approach to the correlation matrix; in other words, we assume that C(t; s) equals the superposition of the identity matrix, I, and a zero-diagonal matrix C′(t; s) whose (non-diagonal) elements have an average value equal to c(t; s). The evolution of c(t; 2S04) is depicted in Fig 4. Computing Λ = N × c(t; s), we verify that such value is not clearly larger than 1, except at the opening and during the morning-afternoon transition. Because the inequality $\Lambda > 1 + \sqrt{N/T}$ is verified, we are within the same conditions as described in [43,51], where the minimal value for considering an eigenvalue outside the Marchenko-Pastur domain is derived; that expression yields λ*_am = 3.03 and λ*_pm = 4.0, which are smaller than the average values λ_1,am = 3.8 and λ_1,pm = 4.6 obtained directly from the data. These values correspond to 12.6% and 16% of Tr C(t; s) = 30, respectively.
Still, we can check the likelihood of the trading volume market mode by establishing a custom-made statistical test which heeds that the trading volume is non-negative by definition. The test runs as follows: owing to the previous results on the local statistics of these data, namely that the local distribution of trading volume is well described by a Gamma distribution [52], for an intraday time t we generate N independent Gamma-distributed series, with the same number of elements as the number of business days in semester s, whose mean and variance match the values obtained from the data at t. From these series, we compute the independent correlation matrix, C_ind(t; s), whence a spectrum of eigenvalues is obtained. For each time step, the process we have just described is carried out a large number of times. At each iteration, the value of λ_1 is stored. Having collected a large set of λ_1 values, we sort that set and pick the smallest value within the top 5%, i.e., the 95th percentile. Accordingly, that figure establishes the critical value of the statistical significance of obtaining a largest eigenvalue equal to λ_1(t; s) associated with a set of independently Gamma-distributed variables. The results of that process are depicted by the pink line in Fig 3, from which it is visible that the dots representing λ_1(t; s) are steadily and in large measure above that line. This upholds the existence of a trading volume collective mode. On the other hand, looking at the second and third largest eigenvalues, we verify they are definitely within the Marchenko-Pastur range, except for the very first minutes of the trading session, where the values of the second largest eigenvalue are robustly above the critical value, and during the morning-afternoon transition, in which λ_2 is a little above it. For the rest of the time, that eigenvalue is approximately 10% below the computed mode limit.
We shall now compare our results with those obtained for the price fluctuations [15]. Explicitly, it was found that the collective dynamics presents a strong market mode, with λ_1/N larger than the value we have obtained for the trading volume. The intraday profile of the largest eigenvalue is also marked by an upward shift at the beginning of the afternoon part of the session; however, in the case of the price fluctuations, the spikes at the opening and during the morning-afternoon transition cannot be perceived. Nonetheless, a major difference emerges in respect of the limits of the MPD; we have seen that for the trading volume, the second largest eigenvalue is already close to the upper bound of the MPD for all the semesters we analyse, a feature that is at odds with our expectations. The surprise at this result stems from the fact that for 5-minute price fluctuations, which are quite close to white noise for blue chip equities, the authors of [15] have found that the seven largest eigenvalues surpass λ_+. We recall that the autocorrelation of the trading volume lasts much longer than that of the price fluctuations, typically two hours for 1-minute trading volume [53,54]. Besides the fact that we are coping with a different financial quantity, we must take into account that our sampling rate is higher than the 5-minute frequency used in [15]. In a previous work [55], it was verified that the lag, Δ, assumed in the computation of the price fluctuations, r_Δ(t) ≡ ln S(t) − ln S(t − Δ), is pivotal in the results of the analysis of the correlation matrix, namely pointing out that up to Δ = 5 minutes the correlation structure is in a transient state that arises as a manifestation of the Epps effect caused by the asynchronous trading of the companies [56].
In order to clarify whether the collective dynamics of the trading volume of the DJIA companies is affected by asynchrony in trading as well, we analyse the correlation spectrum of the cumulative trading volume.
Eigenversor analysis. The increase of λ_1(t; s) across the business day indicates the equities tend to get more concerted in the afternoon than in the morning. How does this property influence the possible intraday dynamics of the eigenversors, ṽ_α(t)? Focussing on α = 1, the structure of each eigenversor, i.e., the magnitude of its components, establishes the weight of each equity in the trading volume mode. Moreover, each company can be understood as describing a direction ẑ_i in the DJIA N-dimensional trading volume space. That said, the most straightforward way to assess the evolution of the relative weight of each company in the trading volume collective behaviour is to compare the eigenversor related to the largest eigenvalue with the uniform vector, ũ, computing the scalar product,

$$D(t; s) \equiv \tilde{v}_1(t; s) \cdot \tilde{u}. \tag{20}$$

We note that, although ṽ_1(t; s) is close to uniformity, the scalar product tends to 1 as the trading session elapses (see Fig 6; the corresponding largest eigenvalues are presented in Fig 7). Fitting the results of D(t; s) with a straight line, we find a slope around 10^−4, which is nonetheless significant given its error. In Fig 6, the smaller (than 0.8) values of D(t; 2S04) are computed for the very first minutes of the session as well as for the morning-afternoon transition. It is worth bridging such results, which indicate a preponderance of some companies with respect to the others, with the increase of the cross-sectional kurtosis of the trading volume reported in Paper I for the same intraday times (a detailed analysis on this matter is addressed in the Conclusion).
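The surrogate test described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the function name, the number of samples and the seed are hypothetical, the Gamma parameters are obtained by moment matching (shape = mean²/variance, scale = variance/mean), and the critical value is taken as the 95% quantile of the simulated largest eigenvalues.

```python
import numpy as np


def gamma_null_critical_lambda1(means, variances, n_days, n_samples=1000, level=0.95, seed=0):
    """Critical value of the largest eigenvalue for independent Gamma-distributed series."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    shape = means ** 2 / variances   # moment matching: mean = shape * scale
    scale = variances / means        # and variance = shape * scale**2
    largest = np.empty(n_samples)
    for k in range(n_samples):
        # One surrogate semester: n_days independent draws for each company.
        surrogate = rng.gamma(shape, scale, size=(n_days, means.size))
        corr = np.corrcoef(surrogate, rowvar=False)
        largest[k] = np.linalg.eigvalsh(corr)[-1]
    return float(np.quantile(largest, level))
```

An observed λ_1(t; s) above the returned value would then be regarded as incompatible, at that level, with independent Gamma-distributed volumes having the same means and variances.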
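The overlap of Eq (20) between the leading eigenversor and the uniform vector can be computed with a few lines of NumPy. In this sketch the function name is hypothetical, the uniform vector is assumed to be normalised to unit length, and the absolute value removes the arbitrary sign of the numerically computed eigenvector.

```python
import numpy as np


def uniform_overlap(corr):
    """Scalar product between the leading eigenvector of corr and the uniform unit vector."""
    n = corr.shape[0]
    _, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    v1 = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
    u = np.ones(n) / np.sqrt(n)         # uniform vector with unit norm
    return float(abs(v1 @ u))
```

For a matrix in which all pairs share the same positive correlation the overlap equals 1, whereas a company uncorrelated with the remaining ones lowers it.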

Constrained correlation analysis.
Up to now, we have provided a general account of the trading volume correlations of DJIA stocks. Similarly to the case of the price fluctuations [57], relevant information can be extracted from the statistics of the large values of the trading volume. Remembering the assertion that this quantity can act as a proxy for the flow of information, we can look at an extreme-value analysis of the trading volume as a way of understanding to what extent the emergence of large values of that financial quantity (large information flow) for a given company affects the volume of trading of the remaining equities. In the present case, we define correlation matrices constrained to the percentile p, C_p(t; s), where we consider that a contribution to the calculation of the entry C_p(t; s)_ij only counts when the trading volume of one of the companies, either i or j, is in the top p% of trading volume values for that time t and semester s.
Because we are analysing the data on a 6-month basis, which implies a not so large number of values for statistical purposes, we define p = 40. This means we effectively carry out a correlation analysis on the values above the median. In S3 Video, we show the evolution of C_40(t; 2S04) across the trading session. The visual inspection of S3 Video hints at a rather modest level of similarity between the full and the constrained matrix for most of the time; nevertheless, the periods in which we have bursts in λ_1 are accompanied by large values of the constrained eigenvalue λ_1,40. With the goal of quantitatively probing the differences and similarities between both matrices, we centre our attention on the modes of both matrices and compute the scalar product between the eigenversors of the largest eigenvalue of each case, as presented in Fig 8. In that figure, the overlap between the two versors has a median across the trading session equal to 0.25. That overlap is quite strong, with values around 0.8, in the first minutes of the session and afterwards in the (effective) transition between the morning and the afternoon parts of the business day. Note that these are the periods where the kurtosis (individual and cross-sectional) reaches its largest values. Nevertheless, in this case, we cannot identify a statistically significant difference between the morning and the afternoon. Combining our observations, we confirm that the emergence of large trading volumes lies at the basis of the collective trading dynamical behaviour, particularly in those two key periods of the business day.
Nonstationary collective behaviour
Eigenversor analysis. As our previous study on the individual statistical properties showed, intraday features of the trading volume evolve in the long term, namely the silhouette of the famous U-shape, which has been losing concavity.
In the case of the eigenversors, we can also try to understand how the weight in the eigenversors is reassigned as months go by. In order to do so, we analyse the evolution of the overlap

$$O_\alpha(t; s) \equiv \tilde{v}_\alpha(t; 1) \cdot \tilde{v}_\alpha(t; s) \tag{21}$$

for different timestamps, as depicted in Fig 9. From the panels therein, we observe that the first eigenversor was pretty coherent through the years 2003-2014, both in respect of the value in itself, with an average value of 0.84 ± 0.09, and of the (small) fluctuations from one semester to the next. Yet, as Fig 9 indicates, for the timestamps at which we have peaks in λ_1(t; s), there was a decrease in the value of the scalar product during the trading halt of General Motors stocks due to the bailout by the US Government. Interestingly, for a fixed intraday time t, if we compute the ratio between the average of O_α(t; s) over the semesters wherein GM.NY was traded and over those wherein it was not traded, we obtain a value circa 1.15, which is appreciably larger than 30/29 ≈ 1.03. The difference between the two ratios suggests General Motors plays a particular role in the collective dynamics of the trading volume of this group of stocks and ultimately might help understand the decision to bail out the company.
For other eigenorders, e.g., α = {2, 3}, the computation of O_α(t; s) can be challenged because we have verified they are effectively within the Marchenko-Pastur limits and hence do not correspond to structured market modes, except for λ_2 when the market opens and during the morning-afternoon transition. For that reason, we understand that: i) O_α(t; s) fluctuates significantly and within noise level during morning/afternoon periods for which λ_1 is close to its morning/afternoon mean value (within error) and ii) for the timestamps where the bursts in the value of the trading volume mode emerge (see Fig 3), we glimpse a slowly decaying trend for O_2(t; s), whereas for O_3(t; s) we reach noise level on the scale of a few months. This suggests that at these specific times the second largest eigenvalue can be relevant. Curiously, the last minute of the trading session is characterised by noise for all the eigenorders we have analysed.
Eigenvalue analysis. As pointed out in the previous subsection, the intraday analysis of the correlation matrix of the trading volume has allowed us to identify a clear mode in the collective behaviour of the trading volume, which follows a step-like profile across the day. With the goal of understanding how the intraday profile of the largest eigenvalue changed, we assume that each minute corresponds to a given dimension and transform the intraday profile into a versor,

$$L_\alpha(s) = \frac{1}{\sqrt{\sum_{t=1}^{390} \lambda_\alpha(t; s)^2}} \left[ \lambda_\alpha(1; s)\,\hat{z}_1 + \lambda_\alpha(2; s)\,\hat{z}_2 + \ldots + \lambda_\alpha(390; s)\,\hat{z}_{390} \right]. \tag{22}$$

We then compute the overlap between each L_α(s) and L_α(1). The outcome of these calculations, averaged over companies, is presented in Fig 10. The overlap between the second and third largest eigenvalue profile-versors of semester s and semester 1 is more or less constant until semester 30 (2S06), whence it exhibits larger fluctuations. These overlap values are never less than 0.93 for α = 2 and 0.96 for α = 3. For the trading volume mode, we have a clear increase of the fluctuations from semester 58 onwards.
That is clear when we look into the fluctuations and apply an F-test on the equality of the variances in two sub-groups. Accordingly, by separating those fluctuations into 'before semester 58' (the climax semester of the subprime crisis) and 'after semester 58', the statistical test allows us to assert that we have different behaviour for both groups with a significance of 95%. In order to appraise the robustness of this finding, we can also compute the fluctuations of the overlap with respect to other semesters, including the last one (shown on the right panel in Fig 10), obtaining equivalent results.
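One possible reading of the constrained correlation matrix is sketched below. Several choices here are assumptions and not taken from the text: the function name, the use of a per-company percentile threshold at the fixed intraday time, and the recomputation of means and standard deviations over the retained days only.

```python
import numpy as np


def constrained_correlation(volumes, p=40.0):
    """Pairwise correlations using only days on which company i or j is in its top p%.

    volumes: (N_D, N) array with one row per day and one column per company.
    """
    n = volumes.shape[1]
    thresholds = np.percentile(volumes, 100.0 - p, axis=0)  # one threshold per company
    in_top = volumes >= thresholds
    corr = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            keep = in_top[:, i] | in_top[:, j]   # either company is in its top p%
            if keep.sum() > 2:                   # otherwise the entry is left at zero
                c = np.corrcoef(volumes[keep, i], volumes[keep, j])[0, 1]
                corr[i, j] = corr[j, i] = c
    return corr
```

Because each entry is estimated on a different subset of days, a matrix built this way is not guaranteed to be positive semi-definite.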
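The profile versor of Eq (22) and its overlap with a reference semester amount to normalising each intraday eigenvalue profile and taking a scalar product. A minimal NumPy sketch, with hypothetical function names and each profile passed as a one-dimensional array of eigenvalues ordered by intraday time, reads:

```python
import numpy as np


def profile_versor(profile):
    """Normalise an intraday eigenvalue profile to unit Euclidean length, as in Eq (22)."""
    profile = np.asarray(profile, dtype=float)
    return profile / np.linalg.norm(profile)


def profile_overlap(profile_s, profile_ref):
    """Scalar product between the versors of two intraday profiles."""
    return float(profile_versor(profile_s) @ profile_versor(profile_ref))
```

Since both versors have unit length, the overlap depends only on the shape of the intraday profile and not on the overall magnitude of the eigenvalues.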

Conclusion
With this work, we have expanded our previous study on individual and cross-sectional intraday features of trading volume. We have done so by analysing quantities derived from timedependent correlation matrices of the trading volume of the DJIA components, namely its eigenvalues and eigenversors. By dividing the data into 6-month (overlapping) spells our study has assessed their nonstationary features in the last 10 years as well.
Employing Random Matrix Theory and Principal Component Analysis, we have computed at each trading minute, t, the spectrum of eigenvalues and verified the largest of them, λ 1 (t; s), is consistently off the upper bound imposed by the Marchenko-Pastur distribution of a random matrix under the same conditions. That proves the collective behaviour of the trading volume of the stocks composing the DJIA is ruled by a mode. Complementary, the remaining eigenvalues are within the limits established by random matrix theory for most of the time and hence we can associate them with noise. The exception is the second largest eigenvalue at the beginning of the trading session-wherein it can be viewed as a market mode for sure as well-and in the transition between the morning and afternoon part of the session for which its values are slightly above the mode limit. This finding establishes a clear and surprising difference between the collective behaviour of the trading volume and that of the price fluctuations: in spite of basically showing a white-noise autocorrelation function, the latter has a larger set of relevant eigenvalues of the correlation matrix [15]. Such properties contrast to those of the trading volume which exhibits a slowly decaying autocorrelation function, but only one statistically significant collective mode is systematically found. Furthermore, although λ 1 /N is larger for price fluctuations, the difference in the number of significant eigenvalues is even more surprising when we understand that, eg, λ 2 /N is equal to 0.07 for the trading volume and 0.02 for price fluctuations. 
At first, we assumed that the different sampling rates were behind the somewhat conflicting results; however, a subsequent analysis has shown that, although the weight of the trading volume mode in the eigenspectrum grows with the lag, the other largest eigenvalues remain around or below the limits imposed by random matrix theory (ignoring the market-mode character of the second largest eigenvalue in the very first minutes of the session). In other words, when we bring the effects of asynchrony in trading into the analysis of the collective behaviour of the trading volume, we fundamentally strengthen the collective dynamics in a linear first-order way. The absence of significant changes in the second and third largest eigenvalues might be related to the 'blue chip' (high-liquidity) character of the DJIA stocks, with second-order effects becoming clearer as we consider less capitalised (ergo less liquid) companies. We expect to investigate these observations in future work.
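The dependence of the mode weight on the lag can be illustrated with a toy model. The sketch below assumes a common factor that reaches each series with a random delay of a few minutes (a crude stand-in for asynchronous trading), sums the series over non-overlapping windows of increasing length and recomputes λ_1/N. All parameters, delays and function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def mode_weight(x):
    """Return lambda_1 / N for the correlation matrix of x (rows: time)."""
    corr = np.corrcoef(x, rowvar=False)
    return np.linalg.eigvalsh(corr)[-1] / x.shape[1]

def aggregate(x, window):
    """Sum consecutive non-overlapping blocks of `window` rows."""
    n_blocks = x.shape[0] // window
    return x[: n_blocks * window].reshape(n_blocks, window, -1).sum(axis=1)

rng = np.random.default_rng(2)
n_obs, n_series, max_delay = 20000, 30, 4  # assumed toy dimensions
factor = rng.standard_normal(n_obs + max_delay)
delays = rng.integers(0, max_delay + 1, size=n_series)

# Each series receives the common factor `d` steps late, plus its own noise.
x = np.column_stack(
    [0.6 * factor[max_delay - d : max_delay - d + n_obs] for d in delays]
) + 0.8 * rng.standard_normal((n_obs, n_series))

weights = {w: mode_weight(aggregate(x, w)) for w in (1, 5, 20)}
for w, value in weights.items():
    print(f"window = {w:2d} steps: lambda_1 / N = {value:.3f}")
```

In this toy setting the weight of the largest eigenvalue grows with the aggregation window because the delays become negligible relative to the window length, which is the mechanism usually associated with the Epps effect.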
Across the session, λ_1(t; s) roughly defines a step-like shape punctuated by bursts in the very first minutes of the session and at the beginning of the afternoon part of the session, for which no relaxation process is distinguishable. Applying standard statistical testing, we have confirmed that the trading volume is also collectively defined by two different regimes: before and after lunch. This feature ties in with the claims we made in our previous cross-sectional analysis: in the morning, the trading volumes of the DJIA stocks are more loosely related and large trading volumes mainly stem from the impact of news disclosed overnight, information that is passed on to the price as soon as the market opens; in the afternoon, the trading volume collective mode increases in the wake of a more concerted behaviour of the DJIA stocks, so that the tails of the (cross-sectional) trading volume distribution are the outcome of large trading volumes among the companies composing the index.
The conclusions we have just conveyed are further supported by the analysis of the evolution of the scalar product between the first eigenversor and the uniform versor. The projection of the former on the latter increases with intraday time t, meaning that the companies tend to assume equivalent weights in the collective dynamics as time elapses; moreover, the peaks in the kurtosis of the trading volume coincide with plunges in that scalar product, i.e., we depart from quasi-homogeneity, with one or more companies assuming some sort of leading role.
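The scalar product with the uniform versor can be sketched as follows. The example compares two synthetic correlation matrices, both assumptions made for illustration: a homogeneous one, in which every pair of series has the same correlation, and a heterogeneous one built from a one-factor model with unequal loadings. The function name is hypothetical.

```python
import numpy as np

def overlap_with_uniform(corr):
    """|v_1 . u| with v_1 the leading eigenvector and u = (1, ..., 1) / sqrt(N)."""
    n = corr.shape[0]
    _, vecs = np.linalg.eigh(corr)  # eigenvalues are returned in ascending order
    v1 = vecs[:, -1]                # eigenvector of the largest eigenvalue
    u = np.ones(n) / np.sqrt(n)
    return abs(float(v1 @ u))       # the sign of an eigenvector is arbitrary

n = 30

# Homogeneous case: all off-diagonal correlations equal to 0.4.
homogeneous = np.full((n, n), 0.4)
np.fill_diagonal(homogeneous, 1.0)

# Heterogeneous case: one-factor model with loadings between 0.1 and 0.9.
loadings = np.linspace(0.1, 0.9, n)
heterogeneous = np.outer(loadings, loadings)
np.fill_diagonal(heterogeneous, 1.0)

overlap_hom = overlap_with_uniform(homogeneous)
overlap_het = overlap_with_uniform(heterogeneous)
print(f"homogeneous: {overlap_hom:.4f}, heterogeneous: {overlap_het:.4f}")
```

An overlap close to 1 corresponds to all companies carrying equivalent weights in the collective mode, whereas lower values signal that some of them dominate it.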
A subsequent analysis of the correlation matrix constrained to the larger values of the trading volume has shown that at the opening and after lunch, which are the periods for which the largest eigenvalue lies clearly beyond the Marchenko-Pastur limits, the overlap (scalar product) between the first eigenversors of the full correlation matrix and of the constrained correlation matrix reaches a value around 0.8, which indicates that the key element in the collective behaviour of the trading volume is the emergence of large values of v. Once again, we have been able to quantify the heuristic 'lunch effect', showing that agents do still take into account the start of the second half of the session in their trading strategies.
In a second stage, we have analysed how the quantities related to the correlation matrix of the trading volume have evolved across the last decade. Taking the first semester of our data as the reference, we have tested the stationarity of both the first eigenversor and the intraday profile of the largest eigenvalue. Considering the results of the overlap with ṽ_1(t; 1), we have found that ṽ_1(t; s) is quite robust, with typical values larger than 0.8. Importantly, for the timestamps corresponding to the peaks in λ_1 there is a visible downward shift of the overlap in the semesters in which General Motors was not traded due to the Chapter 11 reorganisation process filed in June 2009. The ratio between the typical value of the overlap and that computed during the GM bailout trading halt is larger than 30/29, which points to the relevance of that company in the market dynamics, especially in the intraday spike periods we have mentioned. It is worth recalling that GM was the vehicle sales leader and that the automotive sector is intimately related to several other economic sectors. For these timestamps, we have also noticed that the scalar product of ṽ_2(t; s) with ṽ_2(t; 1) suggests that the overlap with the first semester endures for several months before reaching the noise level. On the other hand, with the goal of understanding the degree of robustness of the intraday profile of the eigenvalues, we have defined eigenvalue versors by treating each minute of the trading session as an orthogonal dimension. The overlap of such versors, namely that composed of λ_1(t; s), shows there is a definite change, supported by an F-test of statistical significance, in the behaviour of the overlap fluctuations before and after the second semester of 2008, the climax of the sub-prime crisis (the fluctuations become more erratic afterwards). This finding has been verified using other semesters as the reference as well.
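The construction of the eigenvalue versors and the comparison of the overlap fluctuations can be sketched as follows. Each semester's intraday profile is treated as a vector with one component per minute, normalised, and projected on the profile of the reference semester; the ratio of the sample variances of the overlaps after and before a break is the statistic underlying an F-test. The profiles are synthetic, and the session length, number of semesters, break position and function names are assumptions made for illustration.

```python
import numpy as np

def profile_overlap(profile, reference):
    """Scalar product between two intraday profiles after normalisation."""
    a = profile / np.linalg.norm(profile)
    b = reference / np.linalg.norm(reference)
    return float(a @ b)

def variance_ratio(sample_a, sample_b):
    """F statistic: unbiased variance of sample_a over that of sample_b."""
    return float(np.var(sample_a, ddof=1) / np.var(sample_b, ddof=1))

rng = np.random.default_rng(1)
n_minutes, n_semesters, break_at = 390, 20, 10  # assumed toy dimensions

# Step-like profile (morning vs afternoon level) with semester-specific noise
# that becomes larger from semester `break_at` onwards.
base = np.where(np.arange(n_minutes) < n_minutes // 2, 8.0, 10.0)
noise_scale = np.where(np.arange(n_semesters) < break_at, 0.2, 1.0)
profiles = base + noise_scale[:, None] * rng.standard_normal((n_semesters, n_minutes))

# Overlap of every later semester with the first one, used as the reference.
overlaps = np.array([profile_overlap(p, profiles[0]) for p in profiles[1:]])
before, after = overlaps[: break_at - 1], overlaps[break_at - 1 :]

f_stat = variance_ratio(after, before)
print(f"F statistic (after / before) = {f_stat:.1f}")
```

A p-value would follow from the F distribution with (len(after) - 1, len(before) - 1) degrees of freedom, for instance through `scipy.stats.f.sf`, which is left out here to keep the sketch dependent on NumPy only.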
The present results, and those of the preceding Paper I, have shed light on the intraday dynamics of trading volume (of liquid stocks) in financial markets. Building our analysis on the close relation between trading volume and information flow, we have been able to explain the reasons for the different statistical behaviour of the morning and afternoon parts of the trading session. Nonetheless, besides liquidity matters, several questions still lack a quantitative approach; among them we can mention the intraday profile of the volatility and its relation to the trading volume from a correlation matrix point of view. Alternatively, bearing in mind the analysis carried out for the price fluctuations [58], it is also worthwhile to consider the cross behaviour of trading volume fluctuations [29] in order to get a better collective understanding of the changes in market activity, which have helped explain the power-law distribution of trading volume [20,59,60], and of the limits of acceptance of standard theories of diffusion of information in the market, namely the Mixture of Distributions Hypothesis and the alternative Sequential Arrival of Information Hypothesis.