Abstract
Complex financial networks, with their nonlinear nature, often exhibit considerable noise, which hinders the analysis of market dynamics and portfolio optimization. Existing studies mainly apply global motion filtering to linear matrices to reduce noise interference. To reduce the noise in complex financial networks and enhance timing strategies, we introduce a methodology that applies global motion filtering to nonlinear dynamic networks derived from mutual information. Subsequently, we construct investment portfolios focusing on peripheral stocks in both the Chinese and American markets. We use the growth and decline patterns of the eigenvalue associated with the global motion to identify trends in collective market movement, revealing distinctive portfolio performance during periods of reinforced and weakened collective movements and further enhancing the strategy performance. Notably, this is the first instance of applying global motion filtering to mutual information networks to construct an investment portfolio focused on peripheral stocks. The comparative analysis demonstrates that portfolios comprising peripheral stocks within global-motion-filtered mutual information networks exhibit higher Sharpe and Sortino ratios than those derived from global-motion-filtered Pearson correlation networks, as well as from full mutual information and Pearson correlation matrices. Moreover, the performance of our strategies proves robust across bearish, bullish, and turbulent market conditions. Beyond enhancing portfolio optimization, our results have potential implications for diverse research fields such as biological, atmospheric, and neural sciences.
Citation: Peng W, Wen M, Jiang X, Li Y, Chen T, Zheng B (2024) Global motion filtered nonlinear mutual information analysis: Enhancing dynamic portfolio strategies. PLoS ONE 19(7): e0303707. https://doi.org/10.1371/journal.pone.0303707
Editor: Xiyu Liu, Shandong Normal University, CHINA
Received: December 14, 2023; Accepted: April 30, 2024; Published: July 11, 2024
Copyright: © 2024 Peng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data are available from the Investing (https://cn.investing.com/). Raw data have also been included within the Supporting information.
Funding: National Natural Science Foundation of China under Grant Nos. 12175193, 11905183 and 11775186 and Key Program in Humanity and Social Sciences of Zhejiang Provincial Universities under Grant No. 2021QN016.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: MI, Mutual information; PC, Pearson correlation; PMFG, Planar maximally filtered graph; RMT, Random matrix theory
Introduction
The complex financial networks, integrating finance, economics, network science, and systems theory, are essential for quantifying the interactions and complexities within financial systems [1–5]. Analyzing interactions and correlations between financial entities enhances our understanding of the market dynamics, identifies systemic risks, and provides crucial insights into financial stability [6–10].
Recent studies have shown that topological structures in financial networks evolve over time [11–16]. Dynamic networks, which are more complex and noisy, are of greater importance than static ones [17, 18]. The dynamic networks obtained from the matrix of Pearson correlation (PC) coefficients are pivotal in understanding the complex interactions within brain networks, particularly during cognitive tasks [19–22]. Networks obtained from the matrix of nonlinear mutual information have been observed to be more robust than networks obtained from the matrix of PC coefficients [23–25]. Literature on dynamic nonlinear networks primarily focuses on the basic properties of nonlinear networks [26, 27], with relatively fewer applications in portfolio optimization compared to static networks, mainly due to their susceptibility to noise.
Portfolio optimization in finance is the optimal allocation of financial assets in different stocks [28, 29], mutual funds, bonds, digital currencies [16, 30, 31], etc. to maximize the returns with risk tolerance [32]. Recently, financial scholars have been delving into the intricate relationships and distinct differences between various financial markets [16, 33, 34]. A sophisticated portfolio optimization strategy includes various elements, but stock selection, asset allocation, and market timing are fundamental components [35, 36]. The fundamental theory of portfolio optimization, originating from the Markowitz framework [37, 38], bases investment allocation on the mean-variance analysis [31]. This theory necessitates selecting a specific set of stocks [39, 40], as it focuses on allocating proportions within a selected stock group. A diverse range of alternative methodologies, such as neural networks [41, 42], genetic algorithms [43], and hierarchical clustering [10, 44, 45], have been introduced to the portfolio optimization. Recent studies indicate that green cryptocurrencies offer diversification benefits [30, 31], with a growing body of research incorporating cryptocurrencies into investment portfolios [46, 47].
Among these methods, using the hierarchical clustering in the network topology has emerged as an efficient approach for selecting stocks to generate optimal portfolios [10, 28, 45, 48]. Researchers utilize diverse linear correlation methodologies across various financial markets to evaluate the investment potential of assets possessing unique topological features [10, 28, 49–52], and implement minimum variance correlation strategies to optimize portfolio weights [31, 46]. Nevertheless, the exploration of portfolio optimization through nonlinear network topologies remains limited.
The nodes of the minimum risk portfolio are always located on the outer leaves of the minimum spanning tree (MST) generated by hierarchical clustering [48, 53]. Traditional hierarchical clustering portfolios predominantly concentrate on stocks at the extremities or at the core of the market spectrum [10, 45, 52]. Our methodology, however, adopts a holistic approach by incorporating an analysis of the full sample of stocks. A common approach for constructing dynamic financial networks involves generating network graphs over moving windows for the portfolio selection in each period [10, 45]. However, if the length of the moving window approaches the number of nodes, statistical uncertainty increases, and the dynamic network becomes dominated by noise [12, 54, 55].
The global motion, extracted with random matrix theory (RMT), drives price movements in complex financial systems, underscores the interconnectedness of global financial markets [56–58], and helps reduce noise [52]. RMT, which is highly effective in large-dimensional systems such as stock markets, filters noise from financial time series [59–62]. Notably, several large eigenvalues in both the mutual information (MI) and PC matrices significantly deviate from the upper boundary of the eigenvalue distribution predicted for the Wishart matrix [63]. RMT is also effective for nonlinear matrices such as MI, characterized by independent random elements drawn from a probability distribution [59, 63]. Previous findings indicate that only the large eigenvalues, which significantly deviate from random ones, contain substantial information about the network structure [64, 65] and contribute to the variability of the dynamic system [66, 67]. Studies on nonlinear RMT are limited, and its application within financial markets is almost entirely absent. Because the largest eigenvalue carries the most information, we can extract the global motion matrix determined by it [52]. However, the structure and function of networks generated by the global motion of nonlinear matrices have not been sufficiently investigated.
The MI metric, rooted in Shannon entropy theory, excels in evaluating nonlinear relationships, outperforming the PC, which is limited to linear associations [23]. MI-based methodologies have been instrumental in constructing biological networks [68–71] and have recently gained prominence in complex systems and stock network studies [23, 72]. Research focusing on the dynamic structural characteristics of core and periphery nodes in stock networks has shown that peripherality serves as a reliable indicator for identifying optimal assets [45, 52]. It has been observed that peripheral stocks, when selected based on the MI, particularly with high-frequency data, significantly outperform those chosen via PC [73]. However, the complex networks generated by the MI are still subject to noise interference, which can weaken the performance of investment portfolios constructed from them.
In this study, our primary focus is on the global motion filtered MI, which could reduce noise and enhance timing strategies in dynamic stock networks. Initially, we construct dynamic stock networks with four matrices: MI, PC, and their corresponding global motion filtered matrices, based on daily returns of CSI 300 and S&P 500 component stocks. Then, we recognize peripheral stocks in these networks and build corresponding investment portfolios. Besides, we utilize eigenvalue growth and decline patterns associated with global motion to identify trends in collective market movement, exposing distinctive portfolio performance during periods of enhanced and weakened collective movements. In addition, we analyze Sharpe ratios across portfolios varying in stock numbers and holding duration, and unveil contrasting investment tendencies in the Chinese and American markets.
Materials and methods
Data description
Our study involves data on the constituent stocks of two major indices: the CSI 300 from the Chinese stock market and the S&P 500 from the American stock market. The dataset, spanning a decade (2009 to 2019), is freely accessible from the financial historical archives on https://cn.investing.com/. The collection and analysis methods adhered to the terms and conditions specified by the data source. In this analysis, the CSI 300 dataset encompasses 2431 days, while the S&P 500 dataset covers 2517 days. To ensure data continuity and integrity, stocks suspended for over 120 consecutive days are excluded. Consequently, the final dataset includes 188 stocks from the CSI 300 and 420 from the S&P 500. The chosen dataset concludes in 2019, representing the most recent and comprehensive data unaffected by significant trade disruptions, including the COVID-19 pandemic.
Construction of the mutual information and Pearson correlation networks
We first compute correlations among the component stocks of the CSI 300 and S&P 500 markets with moving windows. The closing price of the x-th stock on day t is denoted by Px(t), and the logarithmic return is calculated as Rx(t) = ln(Px(t)) − ln(Px(t − 1)). For each day t, the normalized return is computed within a moving time window of ΔT = 125 days,
$$r_x(t) = \frac{R_x(t) - \langle R_x(t') \rangle}{\delta(t)} \qquad (1)$$
where 〈…〉 denotes the time-averaged value over t′ in the past ΔT days, with t′ ranging from t − ΔT + 1 to t, and δ(t) is the standard deviation of the returns over the same window.
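The windowed normalization described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function names `log_returns` and `normalized_return`, the synthetic price data, and the use of the population standard deviation are all assumptions.

```python
import numpy as np

def log_returns(prices):
    """R_x(t) = ln P_x(t) - ln P_x(t-1) for a (T, N) array of closing prices."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

def normalized_return(returns, t, window=125):
    """Normalize R_x(t) with the mean and std of the `window` days ending at t."""
    if t < window - 1:
        raise ValueError("not enough history for the requested window")
    block = returns[t - window + 1 : t + 1]   # t' runs from t - window + 1 to t
    mean = block.mean(axis=0)
    std = block.std(axis=0)                   # population std (an assumption)
    return (returns[t] - mean) / std

# Synthetic prices for 4 hypothetical stocks (illustrative data only).
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=(300, 4)), axis=0))
R = log_returns(prices)
r_t = normalized_return(R, t=200)
```

Each entry of `r_t` is the return of one stock on day 200 expressed in units of its own standard deviation over the preceding 125 days.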
In information theory, Shannon entropy quantifies the uncertainty or unpredictability of a random variable or vector [73]. The range of the normalized return rx(t) is divided uniformly into Nx sub-intervals, each with a width Δx = (xmax − xmin)/Nx. The probability of stock x falling into the i-th sub-interval is estimated by computing its occurrence frequency f(xi) within that interval. The density function is subsequently approximated by: (2)
The entropy of a discrete random variable rx(t) is defined as:
$$H(x) = -\sum_{i=1}^{N_x} p(x_i)\,\log p(x_i) \qquad (3)$$
For discrete random variables rx(t) and ry(t), their joint entropy is defined as:
$$H(x, y) = -\sum_{i=1}^{N_x}\sum_{j=1}^{N_y} p(x_i, y_j)\,\log p(x_i, y_j) \qquad (4)$$
where p(xi, yj) represents the joint density function of the two variables. We compute the mutual information with Nx = 10 and Nx = 15 and present results only for Nx = Ny = 10, since the two sets of results are highly similar.
The MI, which originates from entropy-based information theory, measures a generalized, nonlinear relationship between two variables rx(t) and ry(t). It is defined as:
$$I(x, y) = \sum_{i=1}^{N_x}\sum_{j=1}^{N_y} p(x_i, y_j)\,\log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)} \qquad (5)$$
When rx(t) and ry(t) are independent, their joint density function satisfies p(xi, yj) = p(xi) ⋅ p(yj), giving a mutual information I of zero. From Eq 5, it follows that I(x, y) = H(x) + H(y) − H(x, y). Mutual information can be normalized to the interval [0, 1] [23], and is defined as: (6)
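The histogram-based estimate of I(x, y) = H(x) + H(y) − H(x, y) can be illustrated as follows. This is a hedged sketch: it uses natural logarithms and `numpy.histogram2d` with equal-width bins, the function names are hypothetical, and it returns the unnormalized mutual information rather than the normalized form of Eq 6.

```python
import numpy as np

def entropy(p):
    """Shannon entropy -sum p log p over the non-empty cells (natural log)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(x, y, bins=10):
    """I(x, y) = H(x) + H(y) - H(x, y) from an equal-width 2D histogram."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()       # joint cell probabilities
    px = pxy.sum(axis=1)              # marginal distribution of x
    py = pxy.sum(axis=0)              # marginal distribution of y
    return entropy(px) + entropy(py) - entropy(pxy.ravel())

# Synthetic series (illustrative only): one independent pair, one nonlinear pair.
rng = np.random.default_rng(1)
x = rng.normal(size=5000)
y_indep = rng.normal(size=5000)
y_nonlin = x**2 + 0.1 * rng.normal(size=5000)

mi_indep = mutual_information(x, y_indep)
mi_nonlin = mutual_information(x, y_nonlin)
pc_nonlin = np.corrcoef(x, y_nonlin)[0, 1]
```

In this toy setting the quadratic dependence yields a Pearson coefficient near zero while the mutual information is clearly larger than for the independent pair, which is the property that motivates using MI for nonlinear relationships.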
The PC between the normalized return series of stocks x and y is expressed as: (7)
Within a network, the metric chosen defines the distance between nodes. The normalized distance based on MI between two stocks x and y is defined as: (8)
Similarly, based on PC, the distance between stocks x and y is calculated using the following equation: (9)
Extraction of the global motion in the MI and PC networks
Existing studies mainly apply global motion filtering to linear matrices to reduce noise interference. From the calculations above, we obtain the full correlation matrices (networks) Cij and Nij. Next, we apply global motion filtering to Cij and Nij to obtain $C_{ij}^{m}$ and $N_{ij}^{m}$. In statistical physics, RMT gives the statistical properties of the eigenvalues of a random matrix formed from uncorrelated time series of finite length [57]. Here, the total number of stocks is denoted as N, and the length of the time series as T. In the limit N → ∞ and T → ∞, with Q ≡ T/N ≥ 1 held fixed, the eigenvalue probability distribution Prm(λ) is described by the following expression:
$$P_{rm}(\lambda) = \frac{Q}{2\pi}\,\frac{\sqrt{(\lambda_{+} - \lambda)(\lambda - \lambda_{-})}}{\lambda} \qquad (10)$$
The eigenvalues are confined between the lower and upper bounds $\lambda_{\pm} = 1 + 1/Q \pm 2\sqrt{1/Q}$.
Expanding on the results of previous research [57], the correlation matrix Mxy is decomposed as:
$$M_{xy} = \sum_{\alpha=1}^{N} \lambda_{\alpha}\, u_{x}^{\alpha}\, u_{y}^{\alpha} \qquad (11)$$
The correlation matrix M stands for either the PC matrix C or the MI matrix N. Here, λα represents the α-th eigenvalue of Mxy, while $u_{x}^{\alpha}$ denotes the x-th component of the α-th eigenvector. As a scalar, N signifies the total number of stocks under consideration.
In this manuscript, we focus on the global motion associated with the largest eigenvalue. The global correlation matrix is defined as follows:
$$M_{xy}^{m} = \lambda_{m}\, u_{x}^{m}\, u_{y}^{m} \qquad (12)$$
where λm denotes the largest eigenvalue of the matrix M, and $u_{x}^{m}$ is the x-th component of the corresponding eigenvector. The matrix Mm is the result of noise reduction using the global motion approach and demonstrates notable stability. Within this theoretical model, Nm corresponds to the global motion matrix constructed from MI, and Cm represents the global motion matrix derived from PC.
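Keeping only the largest eigen-mode of a symmetric matrix takes one eigendecomposition. The sketch below assumes a symmetric input and uses `numpy.linalg.eigh`; the function name `global_motion_matrix` and the one-factor synthetic returns are illustrative assumptions, not part of the original study.

```python
import numpy as np

def global_motion_matrix(M):
    """Return (lambda_m, M_m), where M_m[x, y] = lambda_m * u[x] * u[y]."""
    M = np.asarray(M, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues come back in ascending order
    lam_m = eigvals[-1]                    # largest eigenvalue
    u = eigvecs[:, -1]                     # its unit-norm eigenvector
    return lam_m, lam_m * np.outer(u, u)

# Synthetic returns sharing one common "market" factor (illustrative data only).
rng = np.random.default_rng(2)
T, N = 500, 20
market = rng.normal(size=(T, 1))
returns = 0.6 * market + rng.normal(size=(T, N))
C = np.corrcoef(returns, rowvar=False)

lam_m, C_m = global_motion_matrix(C)
```

The filtered matrix `C_m` has rank one, and because the eigenvector has unit norm its trace equals the largest eigenvalue.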
Calculation of the node peripherality in the MI and PC networks
We use the PMFG technique to construct sparse networks from both correlation networks Nij and Cij and their corresponding global motion networks $N_{ij}^{m}$ and $C_{ij}^{m}$ for each day t. The PMFG approach, based on the iterative construction of a constrained planar graph, retains the most significant correlations among connected nodes, as elaborated in [74]. Following the methodologies in [45, 52], we employ a composite peripherality measure to assess the peripherality of individual nodes within these networks. This peripherality Cp is computed as follows: (13) where DC, BC, E, C and EC represent degree centrality, betweenness centrality, eccentricity, closeness, and eigenvector centrality, respectively. The superscripts w and u correspond to the weighted and unweighted sparse networks filtered with PMFG. Central nodes in the network typically have lower Cp values, while peripheral nodes tend to have higher Cp values. After computing Cp for each stock, we apply the hierarchical clustering method to divide all the stocks into 10 groups based on their Cp values, where group ‘1’ signifies central stocks and group ‘10’ represents peripheral stocks. In our analysis, the portfolio is selected from the network structure in the previous time window and then held over the following investment horizon.
Construction of portfolios based on MI and PC networks
The Markowitz portfolio optimization theory is a fundamental concept in modern finance, guiding optimal portfolio construction, weight allocation, and asset diversification [75]. It provides a structured method to determine the optimal asset weights within a portfolio, either by maximizing the portfolio’s return for a given risk level or by minimizing the risk for a specific return. The Markowitz approach is formulated as follows: (14) We analyze portfolios comprising the k most peripheral stocks, identified by the highest Cp values, and contrast them with portfolios of central stocks, characterized by the lowest Cp values.
In our model, Ω represents the covariance matrix of the assets within the portfolio. The methodology adopted in this study precludes the practice of short selling, mandating that the weights ω assigned to the assets must be strictly positive. The target return μ is predetermined; however, in order to mitigate the risk of overfitting, the specification of μ is deliberately omitted, with the emphasis being exclusively placed on the minimization of portfolio risk. For a parameter value of λ = 1, the portfolio strategy exhibits a propensity towards diversification, emphasizing the minimization of risk as opposed to the maximization of returns. In this context, the portfolio assessments are conducted utilizing both uniform and Markowitz optimization approaches for asset weighting.
For each portfolio under consideration on day t, the return accrued over the holding period of τ days is defined as follows:
$$r(t,\tau) = \sum_{x=1}^{k} \omega_x\, r_x(t,\tau) \qquad (15)$$
where $r_x(t,\tau) = [P_x(t+\tau) - P_x(t)]/P_x(t)$, Px(t) is the price of the x-th stock among the selected most peripheral (or central) stocks on day t, and Px(t + τ) is the corresponding price on day t + τ. The holding period τ lies in the range τ ∈ [1, 125]. We assign uniform weights as $\omega_x = 1/k$, and Markowitz weights are deduced from Eq (14). The annualized cumulative return, denoted as $r_x^{a}(t,\tau)$, is defined by the formula $(1 + r_x(t,\tau))^{250/\tau} - 1$, where 250 represents the typical number of trading days within a year. Subsequently, we define the annualized return of the portfolio:
$$r^{a}(t,\tau) = \sum_{x=1}^{k} \omega_x\, r_x^{a}(t,\tau) \qquad (16)$$
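As a small worked example of the holding-period and annualized returns, the sketch below assumes simple (not logarithmic) returns, consistent with the annualization formula, and equal weights when none are supplied. The function names and the three-day price table are hypothetical.

```python
import numpy as np

def holding_returns(prices, t, tau):
    """Simple return of each stock from day t to day t + tau; prices is (T, k)."""
    return prices[t + tau] / prices[t] - 1.0

def annualize(r, tau, days_per_year=250):
    """(1 + r)^(250 / tau) - 1, assuming 250 trading days per year."""
    return (1.0 + r) ** (days_per_year / tau) - 1.0

def portfolio_return(stock_returns, weights=None):
    """Weighted sum of stock returns; uniform weights 1/k when none are given."""
    k = len(stock_returns)
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    return float(w @ stock_returns)

# Hypothetical prices of two stocks over three days.
prices = np.array([[10.0, 20.0],
                   [10.5, 19.0],
                   [11.0, 21.0]])
r = holding_returns(prices, t=0, tau=2)       # per-stock returns over two days
r_p = portfolio_return(r)                     # equal-weight portfolio return
r_a = portfolio_return(annualize(r, tau=2))   # equal-weight annualized return
```

With these prices the two stocks gain 10% and 5% over the two days, so the equal-weight portfolio return is 7.5%. Annualizing over such a short holding period compounds these returns 125 times, which shows why short-horizon annualized figures can be very large.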
The Sharpe ratio is chosen as a metric to evaluate the performance of the portfolio, defined as:
$$S_p(k,\tau) = \frac{\langle r(t,\tau) \rangle}{\sigma_p(k,\tau)} \qquad (17)$$
where $\langle r(t,\tau) \rangle$ signifies the mean return computed over all times t within the full span of the time series, and σp(k, τ) is the corresponding standard deviation. When the portfolio return r(t, τ) is replaced by ra(t, τ), Sp is the annualized Sharpe ratio.
The Sortino ratio, a variation of the Sharpe ratio, takes into account only downside (negative) volatility. Upside volatility is regarded as a bonus for the investment and is not considered risky. Therefore, the total standard deviation in the Sharpe ratio is replaced by the downside deviation in the Sortino ratio [32, 76]:
$$S_t(k,\tau) = \frac{\langle r(t,\tau) \rangle}{\sigma_d(k,\tau)} \qquad (18)$$
where σd(k, τ) is the target downside deviation.
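The difference between the two ratios can be shown in a short numerical example. This sketch assumes no risk-free rate is subtracted (the text mentions only the mean return and a deviation), a target return of zero for the downside deviation, and population-style averaging. The function names and return values are made up for illustration.

```python
import numpy as np

def sharpe_ratio(returns):
    """Mean return divided by the standard deviation of returns."""
    r = np.asarray(returns, dtype=float)
    return float(r.mean() / r.std())

def sortino_ratio(returns, target=0.0):
    """Mean return divided by the downside deviation below `target`."""
    r = np.asarray(returns, dtype=float)
    downside = np.minimum(r - target, 0.0)        # only shortfalls contribute
    sigma_d = np.sqrt(np.mean(downside ** 2))
    return float(r.mean() / sigma_d)

r = np.array([0.02, -0.01, 0.03, -0.02, 0.04])    # hypothetical portfolio returns
sp = sharpe_ratio(r)
st = sortino_ratio(r)
```

For this series the mean is 0.012 and the downside deviation is 0.01, giving a Sortino ratio of 1.2. The Sharpe ratio is lower because the positive returns also enlarge the total standard deviation.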
Identification of collective movement trend with the global motion
The global motion is governed by the largest eigenvalue, and drives the collective price movements in the complex financial systems [56, 57]. Using a moving time window, we compute the correlation matrix and its largest eigenvalue, generating a time series of the largest eigenvalues as λm(t).
The Granger causality test is applied to assess the causal relationship between this eigenvalue series and market indices in both Chinese and American markets. While the largest eigenvalue series of Nij does not exhibit causality, the largest eigenvalue series of Cij passes the causality test. Therefore, we use the largest eigenvalue series of Cij as an indicator for market timing, defining market conditions accordingly:
$$\text{market condition on day } t = \begin{cases} E, & \lambda_m(t) > \lambda_m(t-125) \\ W, & \lambda_m(t) < \lambda_m(t-125) \end{cases} \qquad (19)$$
On day t, a value of λm(t) exceeding λm(t − 125) is interpreted as an indication of enhanced collective movements, marked by an ‘E’ superscript. In contrast, a value of λm(t) below λm(t − 125) signifies weakening collective movements and is denoted with a ‘W’ superscript. This notation is applied consistently in all figures throughout the article that include the ‘E’ and ‘W’ superscripts.
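The labeling rule can be expressed directly in code. This is a minimal sketch: the function name is hypothetical, and days without enough history or with exactly equal eigenvalues are left unlabeled, a case the text does not specify.

```python
import numpy as np

def collective_movement_labels(lam, lag=125):
    """Label day t as 'E' if lam[t] > lam[t - lag], 'W' if smaller, '' otherwise."""
    lam = np.asarray(lam, dtype=float)
    labels = [""] * len(lam)
    for t in range(lag, len(lam)):
        if lam[t] > lam[t - lag]:
            labels[t] = "E"        # collective movement is strengthening
        elif lam[t] < lam[t - lag]:
            labels[t] = "W"        # collective movement is weakening
    return labels

# Toy eigenvalue series with a short lag so the result is easy to read.
labels = collective_movement_labels([1.0, 2.0, 3.0, 2.0, 1.0, 4.0], lag=2)
```

With a lag of 2, the toy series gives the labels `["", "", "E", "", "W", "E"]`. The study itself uses a lag of 125 trading days.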
The mean return during the ‘E’ period is characterized as follows: (20) where 〈…〉 refers to averaging over the k stocks and over time t, with t restricted to days that satisfy the enhancement condition in Eq 19. The average rate of return during the ‘W’ period is calculated analogously, with t restricted to days that satisfy the weakening condition.
The concept of conditional probability, denoted as $P(r^{+} \mid E)$, is widely recognized. Here, P(r+, E) is the probability that the rate of return exceeds zero on a day in the ‘E’ period, and P(E) is the probability of the ‘E’ period. The win rate is expressed as:
$$P(r^{+} \mid E) = \frac{P(r^{+}, E)}{P(E)} = \frac{n_{+}^{E}}{n^{E}} \qquad (21)$$
where $n_{+}^{E}$ denotes the number of days with a positive return rate during the ‘E’ period and $n^{E}$ the total number of days in the ‘E’ period. As established in Eq 16, the return rate r depends on three variables: k, t, and τ. The number of stocks in each portfolio is k = N/10, corresponding to either the peripheral or the central tier.
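The win rate reduces to a count of positive-return days within a regime divided by the number of days in that regime. The sketch below makes that explicit. The function name, the toy data, and the choice to treat a zero return as not positive are assumptions.

```python
import numpy as np

def win_rate(returns, labels, regime="E"):
    """Fraction of days in `regime` whose return is strictly positive."""
    returns = np.asarray(returns, dtype=float)
    mask = np.asarray(labels) == regime
    n_regime = int(mask.sum())
    if n_regime == 0:
        return float("nan")            # regime never occurs in the sample
    n_positive = int((returns[mask] > 0).sum())
    return n_positive / n_regime

# Hypothetical daily portfolio returns and their regime labels.
returns = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01]
labels = ["E", "E", "W", "E", "W", "E"]
win_e = win_rate(returns, labels, "E")   # 1 positive day out of 4
win_w = win_rate(returns, labels, "W")   # 2 positive days out of 2
```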
Summary of methods
Our methodology is structured into four main parts to ensure logical coherence:
- Construction of Stock Correlation Matrix Using Moving Windows: Initially, we compute correlations for the component stocks of the CSI 300 and S&P 500 markets in different time periods (moving windows from t − 125 to t), using either the PC or the MI matrix. This gives the full correlation matrices (networks) Cij(t) and Nij(t). Through global motion filtering, we obtain the global motion filtered matrices $C_{ij}^{m}(t)$ and $N_{ij}^{m}(t)$. The global motion is determined by the largest eigenvalue, which diverges the farthest from the random ones and captures a substantial share of the variability of the dynamic system [52, 66, 67]. At this stage, the matrices are N × N correlation matrices (networks).
- Sparse Matrix Generation via PMFG: Utilizing the Planar Maximally Filtered Graph (PMFG) technique, we retain significant connections to obtain sparse matrices (networks). This process involves filtering to preserve only the most crucial edges within the networks.
- The Node Peripherality: Applying Eq 13 to the sparse matrices gives the peripherality of each stock (node) within the network for each day. There are eight matrices (networks) across the two markets: Cij, Nij, $C_{ij}^{m}$, and $N_{ij}^{m}$ for each market. Peripheral nodes are less closely connected to other nodes in the network and are less exposed to risk [45, 52]. From these matrices, we derive the metrics used to identify optimal assets.
- Clustering and Hierarchical portfolios: Based on the peripherality values of all nodes on the same day, we categorize all stocks into ten tiers. Subsequently, we calculate the Sharpe and Sortino ratios for different holding periods (τ) and various market states to evaluate the investment portfolio performance of the networks.
Results
Comparison of MI and PC networks
RMT is employed using Eq 12 to discern the global motion in the Nij and Cij matrices for both the CSI 300 and S&P 500 markets, resulting in $N_{ij}^{m}$ and $C_{ij}^{m}$, respectively. Subsequent references to Nij pertain to the MI, as defined in Eq 6.
Fig 1 presents the probability distributions of matrix elements for four matrices: Nij, $N_{ij}^{m}$, Cij and $C_{ij}^{m}$. It is evident that Nij, similar to Cij, demonstrates normal distribution characteristics, aligning with typical Wishart matrices. The global motion matrices retain their original distributional characteristics. Both Nij and Cij preserve their symmetry axes, which is indicative of their respective market modes. Moreover, the figure shows that in the CSI 300 market the average values are around 0.08 for Nij and 0.4 for Cij, whereas in the S&P 500 they are approximately 0.05 for Nij and 0.32 for Cij, indicating higher mean values in the CSI 300 than in the S&P 500.
For the CSI 300 (with series length L = 2430 and N = 188), the theoretical eigenvalue bounds λ+ and λ− are 1.63 and 0.52, respectively. For the S&P 500 (with L = 2516 and N = 420), the bounds are λ+ = 1.98 and λ− = 0.35. Several significant eigenvalues in the matrix notably deviate from the theoretical upper limit of the eigenvalue distribution characteristic of the Wishart matrix, which typically represents the correlation matrix of uncorrelated time series [59, 63]. Eigenvalues confined within the interval [λ−, λ+] are classified as the random component. We quantify the degree of non-randomness in the matrix by calculating the ratio of the number of eigenvalues outside this range to the total eigenvalue count, applied to both Nij and Cij matrices. The ratio is defined as $n_{\lambda}/N$ (22), where nλ denotes the count of eigenvalues λ that are either less than λ− or exceed λ+. As presented in Table 1, the ratio for the S&P 500 index is substantially higher than that of the CSI 300 index. Moreover, the ratio for Nij is significantly greater than that for Cij. These observations imply that MI serves as a more robust measure of non-randomness than the PC coefficient when analyzing stock price data, a conclusion consistent with the findings reported in Ref. [23].
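The quoted bounds can be reproduced from the standard random-matrix edges λ± = 1 + 1/Q ± 2√(1/Q) with Q = L/N. The sketch below does that arithmetic and counts eigenvalues outside the interval. The function names `mp_bounds` and `nonrandom_ratio` are hypothetical.

```python
import numpy as np

def mp_bounds(T, N):
    """Lower and upper eigenvalue edges for uncorrelated series, with Q = T / N."""
    inv_q = N / T
    lam_minus = 1.0 + inv_q - 2.0 * np.sqrt(inv_q)
    lam_plus = 1.0 + inv_q + 2.0 * np.sqrt(inv_q)
    return lam_minus, lam_plus

def nonrandom_ratio(M, T):
    """Share of eigenvalues of the symmetric matrix M outside [lam_minus, lam_plus]."""
    n = M.shape[0]
    lo, hi = mp_bounds(T, n)
    eig = np.linalg.eigvalsh(M)
    n_lambda = int(np.sum((eig < lo) | (eig > hi)))
    return n_lambda / n

csi_lo, csi_hi = mp_bounds(2430, 188)   # CSI 300 sample sizes from the text
spx_lo, spx_hi = mp_bounds(2516, 420)   # S&P 500 sample sizes from the text
```

Rounded to two decimals, these evaluate to (0.52, 1.63) for the CSI 300 and (0.35, 1.98) for the S&P 500, matching the values quoted above.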
Performance of the MI and PC network-based portfolios
Using the computed network peripherality Cp value for each stock based on Eq 13, we stratify Cp values in descending order and calculate the portfolio returns for each tier.
Fig 2 shows the variation of the Sharpe ratios for and in the S&P 500 market over holding time τ, indicating that the Sharpe ratios for peripheral stocks are significantly higher than those of central stocks. The stratification and marginal Sharpe ratios of are notably higher than those of .
Sub-figure (a) shows the equal-weighted Sharpe ratio calculated through the network, whereas sub-figure (b) is similar to (a) but displays the Sharpe ratio calculated through the network.
In Table 2, we display the average Sharpe ratio for each tier. The stocks within the first tier are identified as the most central in terms of their influence or connectivity, whereas those categorized in the tenth tier are recognized as the most peripheral, indicating their relatively lower significance or connectivity within the network.
The Sharpe ratios in the S&P 500 market substantially exceed those in the CSI 300 market, likely due to its higher maturity level and a larger number of stocks per tier. In the CSI 300 market, the Sharpe ratios for the tenth tier concerning and do not significantly surpass those of the ninth tier. However, in the S&P 500 market, while the Sharpe ratios for Nij and Cij in the tenth tier are lower than those in the ninth, the global motion matrices show the tenth tier Sharpe ratios notably outperforming the ninth tier.
Additionally, we calculate averages for the top five and bottom five tiers. It is evident that in both the CSI 300 and S&P 500 markets, the Sharpe ratios of peripheral stocks are larger than those of central stocks. Moreover, peripheral stocks in global motion networks exceed their full correlation counterparts, and peripheral stocks within Nij outperform those within Cij. The mean Sharpe ratio of the top half peripheral stocks in the $N_{ij}^{m}$ network is 5% and 1% higher than in the Nij network in the CSI 300 and S&P 500 markets, respectively. In particular, in the S&P 500 market, the mean Sharpe ratio of peripheral stocks in the tenth tier in the $N_{ij}^{m}$ network is 21% higher than in the Nij network. The opposite trend is observed for central stocks.
In Fig 3, we apply the Markowitz method to compare the Sharpe ratios of peripheral portfolios within four networks for the CSI 300, across holding periods τ = 1, 25, 50, 75, 100, 125. At τ = 1, differences in the Sharpe ratios of the peripheral portfolios across the four networks are minimal. This trend is more apparent for holding times from 25 to 125 days.
The results indicate that the Sharpe ratio is highest for $N_{ij}^{m}$, followed by $C_{ij}^{m}$, then Nij, and Cij. Notably, the peripheral portfolio in global motion filtered networks outperforms both Nij and Cij, with Nij exhibiting superior performance compared to Cij. The performance in the S&P 500 market mirrors the CSI 300 market findings, so further elaboration is omitted. In the following sections, unless explicitly stated otherwise, the portfolio weights are uniform.
Portfolio performance under varying collective movement trends
We employ the growth and decline patterns of eigenvalues corresponding to global motion to characterize trends in the collective movement of financial markets, and assess the performance of portfolios under the different market conditions. We then calculate the Sharpe ratios of peripheral versus central nodes in the network during these distinct conditions, as depicted in Fig 4. For both CSI 300 and S&P 500 markets, peripheral nodes consistently exhibit the highest Sharpe ratios during ‘W’ market conditions, while central nodes have the lowest Sharpe ratios during ‘E’ market conditions. Importantly, regardless of whether the collective movements are strengthening or weakening, portfolios comprising peripheral nodes outperform those with central nodes in both $N_{ij}^{m}$ and $C_{ij}^{m}$ networks.
Sub-figures (a) and (b) respectively represent the global motion networks of the CSI 300 market. Here, ‘C’ denotes the central node portfolio in the network, while ‘P’ represents peripheral nodes portfolio. The superscripts ‘E’ and ‘W’ indicate timing portfolios during collective movement strengthening and weakening, respectively, while those without superscripts refer to the entire time series. Sub-figures (c) and (d) are similar to (a) and (b), but they represent the S&P 500 market.
To understand the variables influencing the Sharpe ratio under different market conditions, we analyze the average return and win rate under conditions of ‘E’ and ‘W’. Fig 5 visualizes these conditions, with the pink area representing the ‘E’ period and the white area representing the ‘W’ period. In the CSI 300 market, ‘E’ comprises 1138 days, while ‘W’ spans 1167 days. For the S&P 500 market, ‘E’ covers 995 days, while ‘W’ extends over 1396 days. The orange line represents the corresponding index price. We modify Pagan and Sossounov’s (2003) method of dividing the market states into bullish, bearish, and range-bound markets [77, 78]. A peak is the highest price within an eight-month window before and after, and a trough is the lowest price in the same timeframe. A market trend is confirmed only when an uptrend or downtrend, starting from the identified peak or trough, exceeds a 20 percent change in value over a period longer than four months. The green lines on the price chart signify bearish market states, the red lines delineate bullish market periods, and the blue lines indicate the range-bound market states. The blue lines in sub-figure (a) appear at the beginning and the end of the CSI 300 index price chart. The S&P 500 index does not feature any green lines, indicating the absence of bearish periods.
The largest eigenvalue corresponds to the price series of the index for sub-figure (a) CSI 300 and (b) S&P 500 market. The pink area represents the region of collective movement enhancement, while the remaining white area indicates the region of collective movement weakening. The green lines on the price chart mark the periods of bearish trends, the red lines highlight the bullish phases, blue lines are the periods of turbulence.
Fig 6 reveals that in the CSI 300 market (a and c), the peripheral portfolios have the largest average return and win rate during the ‘W’ period. This finding elucidates why the Sharpe ratio for the PW line in Fig 4 is significantly larger than the others. In the S&P 500 market, although peripheral portfolios exhibit the largest average return during the ‘E’ period, their largest win rate is observed during the ‘W’ period. Consequently, the Sharpe ratio for PW line is the largest in the S&P 500, but the difference between it and other Sharpe ratio curves is less marked compared to the CSI 300 market.
Sub-figures (a) and (b) respectively show the average return for the CSI 300 and S&P 500, while sub-figures (c) and (d) represent the win rates. In these figures, blue pentagrams represent peripheral portfolios during the ‘W’ period, green circles represent central portfolios during the ‘W’ period, purple squares indicate peripheral portfolios during the ‘E’ period, and yellow snowflake shapes represent central portfolios during the ‘E’ period.
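As an illustration of how the two quantities in Fig 6 could be computed per regime, the sketch below groups portfolio returns by an ‘E’/‘W’ label. The function name `regime_stats` and the definition of the win rate as the fraction of periods with a strictly positive return are assumptions for this example; the definition used in the study may differ.

```python
import numpy as np

def regime_stats(returns, regime):
    """Average return and win rate of a portfolio within each regime.

    `returns` holds one portfolio return per period and `regime` holds the
    matching label (for example 'E' or 'W') for that period.
    Win rate is taken here as the share of periods with a positive return
    (an assumption made for illustration).
    """
    returns = np.asarray(returns, dtype=float)
    regime = np.asarray(regime)
    stats = {}
    for label in np.unique(regime):
        r = returns[regime == label]
        stats[str(label)] = {
            "avg_return": float(r.mean()),
            "win_rate": float((r > 0).mean()),
        }
    return stats

# Made-up numbers, purely to show the call.
toy_stats = regime_stats([0.01, -0.02, 0.03, 0.01], ["E", "E", "W", "W"])
```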
To examine the disparity in returns between peripheral portfolios and central portfolios across various time periods, we define Pc, in Eq (23), as the difference between the ratios (Sharpe ratios and Sortino ratios) of peripheral and central portfolios. Table 3 reports these differentials of Sharpe and Sortino ratios between peripheral and central portfolios. ‘A’ denotes the entire data series, ‘E’ corresponds to periods of enhanced collective movement, and ‘W’ signifies periods of weakened collective movement. Additionally, ‘U’ and ‘D’ indicate the timing of portfolios during bullish and bearish market states, respectively, while ‘T’ refers to periods of market turbulence. First, examining the columns, we observe that the values in the global-motion-filtered columns are greater than those in the non-global-motion columns Nij and Cij for both the CSI 300 and S&P 500 markets. Furthermore, the values in columns consistently exceed those in columns . Second, examining the ‘Sp’ and ‘St’ rows for the CSI 300 market, we note significant discrepancies between peripheral and central portfolios during both the ‘W’ and ‘D’ periods. In contrast, the S&P 500 behaves differently, with smaller differentials in the ‘E’ and ‘T’ periods than the more pronounced disparities observed in the ‘W’ and ‘U’ periods.
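The differential Pc can be illustrated with a short sketch. The textbook forms of the Sharpe ratio (mean excess return over its standard deviation) and the Sortino ratio (mean excess return over the downside deviation) are used here as stand-ins; they are assumptions and may differ in detail from the definitions given earlier in the paper. All function names are hypothetical.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return excess.mean() / excess.std(ddof=1)

def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by the downside deviation below `target`."""
    excess = np.asarray(returns, dtype=float) - target
    downside = np.minimum(excess, 0.0)          # keep only shortfalls
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return excess.mean() / downside_dev

def pc_difference(peripheral, central, ratio=sharpe_ratio):
    """Pc: ratio of the peripheral portfolio minus that of the central one."""
    return ratio(peripheral) - ratio(central)

# Made-up return series, only to show the calls.
toy_peripheral = [0.02, 0.01, -0.01, 0.02]
toy_central = [0.01, -0.02, 0.01, 0.00]
pc_sharpe = pc_difference(toy_peripheral, toy_central)
pc_sortino = pc_difference(toy_peripheral, toy_central, ratio=sortino_ratio)
```

A positive Pc means that the peripheral portfolio has the higher risk-adjusted return over the chosen period; computing it separately on the ‘A’, ‘E’, ‘W’, ‘U’, ‘D’ and ‘T’ subsets gives one row entry per period.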
For the CSI 300, the most notable differential, 0.366, is observed under the column and PcW row. In the S&P 500, the maximum differential of 0.273 appears under the column and PcE row. In the bearish period ‘D’ within the CSI 300 market, the highest value recorded is 0.326. However, for the S&P 500 market during its bullish phase ‘U’, the largest value observed is 0.393. These variations are largely influenced by the trading habits of Chinese and American investors. In China, during periods of strengthening collective movement, the distinction between investing in peripheral or central stocks is minimal due to its strong synchronicity. However, during market differentiation, peripheral stocks in the Chinese market are more advantageous than central stocks. In the American market, during times of enhanced collective movement, peripheral stocks, unrestricted by price limits, tend to yield better returns than central stocks. Yet, when collective movement weakens, the returns on peripheral portfolios are comparable to those of central portfolios. Overall, the contrast between ‘E’ and ‘W’ periods is more pronounced in the Chinese market, while the difference in the American market is relatively subtle.
Sharpe ratios heatmaps with different values of k and τ
The Sharpe ratio, as defined in Eq 18, is influenced by the number of stocks k and the holding period τ. In this study, Sharpe ratios are computed for a grid of portfolio strategies covering 1 to 50 stocks and holding periods of 1 to 100 days. Each integer point (k, τ) on this grid represents a distinct portfolio strategy. As illustrated in the supporting information, both the Markowitz optimization technique and the equal-weighted portfolio strategy are used to generate Sharpe ratio heatmaps. Our analysis indicates that, in both the CSI 300 and S&P 500 markets, the Sharpe ratios derived from the equal-weighted and Markowitz methods exhibit minimal differences.
To determine the optimal number of stocks (k) and holding periods (τ) in portfolio strategies, we create annualized Sharpe ratio heatmaps for both the CSI 300 and S&P 500 markets.
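The grid search behind such a heatmap can be sketched as follows. The callable `strategy_returns(k, tau)` is a hypothetical placeholder for whatever back-test produces the per-holding-period returns of a portfolio with k stocks and holding period τ; the annualization by the square root of 252/τ assumes 252 trading days per year and one return per holding period. Neither is taken from the paper.

```python
import numpy as np

def annualized_sharpe(period_returns, tau, trading_days=252):
    """Sharpe ratio of per-holding-period returns, scaled to one year."""
    r = np.asarray(period_returns, dtype=float)
    return r.mean() / r.std(ddof=1) * np.sqrt(trading_days / tau)

def sharpe_grid(strategy_returns, ks, taus):
    """Fill a len(ks) x len(taus) array with annualized Sharpe ratios.

    `strategy_returns(k, tau)` is a user-supplied (hypothetical) back-test
    returning the sequence of returns for that parameter pair.
    """
    grid = np.empty((len(ks), len(taus)))
    for i, k in enumerate(ks):
        for j, tau in enumerate(taus):
            grid[i, j] = annualized_sharpe(strategy_returns(k, tau), tau)
    return grid

def best_cell(grid, ks, taus):
    """Return the (k, tau) pair with the highest Sharpe ratio in the grid."""
    i, j = np.unravel_index(np.nanargmax(grid), grid.shape)
    return ks[i], taus[j]

# Toy back-test: the mean return grows with k, the volatility stays fixed.
def toy_strategy(k, tau):
    return np.array([-0.01, 0.0, 0.01]) + 0.001 * k

toy_ks, toy_taus = [1, 2], [1, 4]
toy_grid = sharpe_grid(toy_strategy, toy_ks, toy_taus)
toy_best = best_cell(toy_grid, toy_ks, toy_taus)
```

For the ranges described in the text, `ks` would run from 1 to 50 and `taus` from 1 to 100, and the resulting array can be passed directly to a plotting routine to draw the heatmap.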
Fig 7 highlights the segments with the highest annualized Sharpe ratios. Specifically, sub-figures (a) and (b) delve into the dynamics within the CSI 300 market. In sub-figure (a), regions displaying the highest Sharpe ratios are mainly associated with portfolios comprising 40 to 50 stocks, and the holding period spans from 20 to 40 days. Conversely, sub-figure (b) emphasizes the highest Sharpe ratios in portfolios ranging from 20 to 50 stocks, with holding intervals spanning 10 to 30 days.
Sub-figures (a) and (c) represent , while sub-figures (b) and (d) represent .
In contrast, the S&P 500 market demonstrates significantly higher annualized Sharpe ratios, especially apparent in the top-right quadrant of the figure. Although our computational limitations restrict our exploration into a broader range of stock counts and longer holding periods, the data available indicate that for the American market, the most effective investment strategy involves portfolios comprising more than 40 stocks and a holding period extending beyond 80 days, suggesting a long-term investment outlook. The CSI 300 market, however, seems to favor a shorter holding window of approximately 10 to 30 days, indicating a medium to short-term investment strategy. This finding is consistent with our expectations, as the Chinese stock market is relatively nascent, characterized by a significant proportion of retail investors who tend to pursue trend-following strategies. In contrast, the US stock market, with its more established history and a higher concentration of institutional investors, demonstrates a pronounced preference for risk diversification.
Discussion and conclusion
Our paper introduces the global motion approach to filter nonlinear dynamic networks derived from mutual information and, for the first time, applies it to the timing analysis of dynamic investment portfolios. We use the daily price returns of the constituents of the Chinese and American stock market indices, CSI 300 and S&P 500, to construct dynamic stock networks based on the nonlinear MI matrices and their corresponding global-motion-filtered matrices. We also construct the linear PC matrices for comparison. The investment portfolios are constructed based on these networks. Our findings indicate that applying global motion filtering to both MI and PC matrices effectively reduces noise in the dynamic networks, thereby enhancing portfolio performance. Specifically, the portfolios at the periphery of the global-motion-filtered networks outperform those built from the full MI and PC matrices. Notably, the Sharpe ratio of the peripheral portfolio is the highest for the global-motion-filtered MI networks. These findings imply that global motion filtering can effectively remove noise from nonlinear networks, using the network’s topological structure to identify optimal assets.
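For readers unfamiliar with this kind of filtering, the sketch below shows one common spectral reading of it: the mode associated with the largest eigenvalue of a symmetric dependence matrix is treated as the global motion and subtracted. This is an illustrative assumption and not necessarily the exact filter defined in the methods section; the function name `remove_global_mode` is hypothetical.

```python
import numpy as np

def remove_global_mode(matrix):
    """Subtract the largest-eigenvalue mode from a symmetric matrix.

    Returns the filtered matrix and the largest eigenvalue. Treating this
    single mode as the 'global motion' is an assumption for illustration.
    """
    matrix = np.asarray(matrix, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(matrix)   # eigenvalues in ascending order
    lam, u = eigvals[-1], eigvecs[:, -1]
    global_part = lam * np.outer(u, u)          # rank-one global component
    return matrix - global_part, lam

# Two assets with correlation 0.5: eigenvalues are 0.5 and 1.5.
toy_matrix = np.array([[1.0, 0.5],
                       [0.5, 1.0]])
toy_filtered, toy_lambda = remove_global_mode(toy_matrix)
```

Tracking the returned largest eigenvalue over rolling windows gives a time series whose growth and decline can then be compared with the index price.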
Numerous studies have constructed timing indicators by analyzing the price and volatility of indices. Using the strength of correlations among constituent stocks for timing, however, is a novel approach. We employ the growth and decline patterns of the eigenvalues of the global motion to characterize trends in the collective movement of financial markets. Our analysis reveals distinct portfolio performance during periods of enhanced and weakened collective movements. The optimal assets are frequently located at the peripheries of the global-motion-filtered MI network, especially during periods of weakened collective movement. We also divide the price series into three periods: bullish, bearish, and turbulent market conditions. Subsequently, we calculate the Sharpe and Sortino ratios for the respective periods. Our findings indicate that, in the Chinese market, the difference between peripheral and central stocks is most pronounced in the bearish market. Conversely, in the American market, this distinction is most significant during bullish market periods. The performance of our strategies is robust across various market conditions. Moreover, the comparative analysis of Sharpe ratios among portfolios with different stock numbers and holding durations suggests a tendency toward short-term investments in the Chinese market and a preference for long-term investments in the American market. These observations underscore the significant advantages of integrating global motion in portfolio optimization strategies.
In conclusion, the Sharpe ratio of peripheral portfolios is the highest for global-motion-filtered MI networks, and the performance of the strategies is robust across various market conditions, indicating that applying global motion filtering can effectively remove noise from nonlinear networks and enhance portfolio performance. Given the limitations of linear correlation under extreme fluctuations, nonlinear correlation demonstrates broader applicability. Because nonlinear mutual information networks are widely applied in fields such as biology, atmospheric sciences, and neural networks, global-motion-filtered mutual information likewise exhibits significant potential for application.
Supporting information
S1 File. Participant data.
The data set supporting the findings of this study is available on Baidu Pan: https://pan.baidu.com/s/18yEPjhjvKN3hr9B8o6Ox9w?pwd=e56h (Access code: e56h).
https://doi.org/10.1371/journal.pone.0303707.s001
(PDF)
S2 File. Comparison of Sharpe ratios for different periods and Heatmaps of Sharpe ratios with equal-weighted and Markowitz optimization methods.
The Supporting Information, S2 File, contains further analysis, some extra details about the methodology and some extra results.
https://doi.org/10.1371/journal.pone.0303707.s002
(PDF)
References
- 1. Allen F, Babus A. Networks in finance. The network challenge: strategy, profit, and risk in an interlinked world. 2009;367.
- 2. Bougheas S, Kirman A. Complex financial networks and systemic risk: A review. Springer; 2015.
- 3. Latora V, Nicosia V, Russo G. Complex networks: principles, methods and applications. Cambridge University Press; 2017.
- 4. Granha MF, Vilela AL, Wang C, Nelson KP, Stanley HE. Opinion dynamics in financial markets via random networks. Proceedings of the National Academy of Sciences. 2022;119(49):e2201573119. pmid:36445969
- 5. Bardoscia M, Barucca P, Battiston S, Caccioli F, Cimini G, Garlaschelli D, et al. The physics of financial networks. Nature Reviews Physics. 2021;3(7):490–507.
- 6. Mastromatteo I, Zarinelli E, Marsili M. Reconstruction of financial networks for robust estimation of systemic risk. Journal of Statistical Mechanics: Theory and Experiment. 2012;2012(03):P03011.
- 7. Wang GJ, Xie C, He K, Stanley HE. Extreme risk spillover network: application to financial institutions. Quantitative Finance. 2017;17(9):1417–1433.
- 8. Wang GJ, Jiang ZQ, Lin M, Xie C, Stanley HE. Interconnectedness and systemic risk of China’s financial institutions. Emerging Markets Review. 2018;35:1–18.
- 9. Onnela JP, Chakraborti A, Kaski K, Kertesz J, Kanto A. Dynamics of market correlations: Taxonomy and portfolio analysis. Physical Review E. 2003;68(5):056110. pmid:14682849
- 10. Ren F, Lu YN, Li SP, Jiang XF, Zhong LX, Qiu T. Dynamic portfolio strategy using clustering approach. PloS one. 2017;12(1):e0169299. pmid:28129333
- 11. Aste T, Shaw W, Di Matteo T. Correlation structure and dynamics in volatile markets. New Journal of Physics. 2010;12(8):085009.
- 12. Song DM, Tumminello M, Zhou WX, Mantegna RN. Evolution of worldwide stock markets, correlation structure, and correlation-based graphs. Physical Review E. 2011;84(2):026108. pmid:21929065
- 13. Drożdż S, Grümmer F, Górski A, Ruf F, Speth J. Dynamics of competition between collectivity and noise in the stock market. Physica A: Statistical Mechanics and its Applications. 2000;287(3-4):440–449.
- 14. Podobnik B, Wang D, Horvatic D, Grosse I, Stanley HE. Time-lag cross-correlations in collective phenomena. Europhysics Letters. 2010;90(6):68001.
- 15. Fenn DJ, Porter MA, Williams S, McDonald M, Johnson NF, Jones NS. Temporal evolution of financial-market correlations. Physical review E. 2011;84(2):026109. pmid:21929066
- 16. Ali S, Naveed M, Hanif H, Gubareva M. The resilience of Shariah-compliant investments: Probing the static and dynamic connectedness between gold-backed cryptocurrencies and GCC equity markets. International Review of Financial Analysis. 2024;91:103045.
- 17. Wang GJ, Xie C, Stanley HE. Correlation structure and evolution of world stock markets: Evidence from Pearson and partial correlation-based networks. Computational Economics. 2018;51:607–635.
- 18. Silva TC, de Souza SRS, Tabak BM. Structure and dynamics of the global financial network. Chaos, Solitons & Fractals. 2016;88:218–234.
- 19. Chinichian N, Kruschwitz JD, Reinhardt P, Palm M, Wellan SA, Erk S, et al. A fast and intuitive method for calculating dynamic network reconfiguration and node flexibility. Frontiers in Neuroscience. 2023;17:1025428. pmid:36845440
- 20. Bassett DS, Sporns O. Network neuroscience. Nature neuroscience. 2017;20(3):353–364. pmid:28230844
- 21. Bullmore E, Sporns O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews neuroscience. 2009;10(3):186–198. pmid:19190637
- 22. Calhoun VD, Miller R, Pearlson G, Adalı T. The chronnectome: time-varying connectivity networks as the next frontier in fMRI data discovery. Neuron. 2014;84(2):262–274. pmid:25374354
- 23. Guo X, Zhang H, Tian T. Development of stock correlation networks using mutual information and financial big data. PloS one. 2018;13(4):e0195941. pmid:29668715
- 24. Jiang YH, Long J, Zhao ZB, Li L, Lian ZX, Liang Z, et al. Gene co-expression network based on part mutual information for gene-to-gene relationship and gene-cancer correlation analysis. BMC bioinformatics. 2022;23(1):194. pmid:35610556
- 25. Yan Y, Wu B, Tian T, Zhang H. Development of stock networks using part mutual information and australian stock market data. Entropy. 2020;22(7):773. pmid:33286545
- 26. Chua L, Green D. A qualitative analysis of the behavior of dynamic nonlinear networks: Steady-state solutions of nonautonomous networks. IEEE Transactions on Circuits and Systems. 1976;23(9):530–550.
- 27. Sahraee-Ardakan M, Fletcher AK. Estimation and learning of dynamic nonlinear networks (DyNNets). In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE; 2017. p. 2856–2860.
- 28. Siudak D. The effect of self-organizing map architecture based on the value migration network centrality measures on stock return. Evidence from the US market. Plos one. 2022;17(11):e0276567. pmid:36318540
- 29. Ali S, Naveed M, Saleem A, Nasir MW. Time-frequency co-movement between COVID-19 and Pakistan Stock Market: Empirical evidence from wavelet coherence analysis. Annals of Financial Economics. 2022;17(04):2250026.
- 30. Ali S, Naveed M, Youssef M, Yousaf I. FinTech-powered integration: Navigating the static and dynamic connectedness between GCC equity markets and renewable energy cryptocurrencies. Resources Policy. 2024;89:104591.
- 31. Ali F, Khurram MU, Sensoy A, Vo XV. Green cryptocurrencies and portfolio diversification in the era of greener paths. Renewable and Sustainable Energy Reviews. 2024;191:114137.
- 32. Mohan V, Singh JG, Ongsakul W. Sortino ratio based portfolio optimization considering EVs and renewable energy in microgrid power market. IEEE Transactions on Sustainable Energy. 2016;8(1):219–229.
- 33. Abbas F, Ali S, Moudud-Ul-Huq S, Naveed M. Nexus between bank capital and risk-taking behaviour: Empirical evidence from US commercial banks. Cogent Business & Management. 2021;8(1):1947557.
- 34. Naveed M, Ali S, Iqbal K, Sohail MK. Role of financial and non-financial information in determining individual investor investment decision: a signaling perspective. South Asian Journal of Business Studies. 2020;9(2):261–278.
- 35. Devi F, Sudirman I. The effect of stock selection ability, market timing ability, fund size and portfolio turnover on equity fund performance in Indonesia. American Journal of Humanities and Social Sciences Research (AJHSSR). 2021;5(3):58–64.
- 36. Xu Q, Li M, Jiang C. Network-augmented time-varying parametric portfolio selection: Evidence from the Chinese stock market. The North American Journal of Economics and Finance. 2021;58:101503.
- 37. Rubinstein M. Markowitz’s “portfolio selection”: A fifty-year retrospective. The Journal of finance. 2002;57(3):1041–1045.
- 38. Andonov A, Bauer R, Cremers M. Can large pension funds beat the market? Asset allocation, market timing, security selection, and the limits of liquidity. 2012.
- 39. Black F, Litterman R. Global portfolio optimization. Financial analysts journal. 1992;48(5):28–43.
- 40. Nanda SJ, Panda G. A survey on nature inspired metaheuristic algorithms for partitional clustering. Swarm and Evolutionary computation. 2014;16:1–18.
- 41. Fernández A, Gómez S. Portfolio selection using neural networks. Computers & operations research. 2007;34(4):1177–1191.
- 42. Ko PC, Lin PC. Resource allocation neural network in portfolio selection. Expert Systems with Applications. 2008;35(1-2):330–337.
- 43. Chen Y, Hirasawa K. A portfolio selection model using genetic relation algorithm and genetic network programming. IEEJ transactions on electrical and electronic engineering. 2011;6(5):403–413.
- 44. Nanda S, Mahanty B, Tiwari M. Clustering Indian stock market data for portfolio management. Expert Systems with Applications. 2010;37(12):8793–8798.
- 45. Pozzi F, Di Matteo T, Aste T. Spread of risk across financial markets: better to invest in the peripheries. Scientific reports. 2013;3(1):1665. pmid:23588852
- 46. Ali F, Bouri E, Naifar N, Shahzad SJH, AlAhmad M. An examination of whether gold-backed Islamic cryptocurrencies are safe havens for international Islamic equity markets. Research in International Business and Finance. 2022;63:101768.
- 47. Ali S, Naveed M, Yousaf I, Khattak MS. From cryptos to consciousness: Dynamics of return and volatility spillover between green cryptocurrencies and G7 markets. Finance Research Letters. 2024;60:104899.
- 48. Mantegna RN. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems. 1999;11:193–197.
- 49. Freitas WB, Junior JRB. Random walk through a stock network and predictive analysis for portfolio optimization. Expert Systems with Applications. 2023; p. 119597.
- 50. Saha S, Gao J, Gerlach R. A survey of the application of graph-based approaches in stock market analysis and prediction. International Journal of Data Science and Analytics. 2022;14(1):1–15.
- 51. Zhang J, Jin LF, Zheng B, Li Y, Jiang XF. Simplified calculations of time correlation functions in non-stationary complex financial systems. Physica A: Statistical Mechanics and Its Applications. 2022;589:126615.
- 52. Li Y, Jiang XF, Tian Y, Li SP, Zheng B. Portfolio optimization based on network topology. Physica A. 2019;515:671–681.
- 53. West DB, et al. Introduction to graph theory. vol. 2. Prentice hall Upper Saddle River; 2001.
- 54. Livan G, Inoue Ji, Scalas E. On the non-stationarity of financial time series: impact on optimal portfolio selection. Journal of Statistical Mechanics: Theory and Experiment. 2012;2012(07):P07025.
- 55. Pafka S, Kondor I. Estimated correlation matrices and portfolio optimization. Physica A: Statistical Mechanics and its Applications. 2004;343:623–634.
- 56. Borghesi C, Marsili M, Micciche S. Emergence of time-horizon invariant correlation structure in financial returns by subtraction of the market mode. Physical Review E. 2007;76(2):026104. pmid:17930101
- 57. Jiang X, Chen T, Zheng B. Structure of local interactions in complex financial dynamics. Scientific reports. 2014;4(1):5321. pmid:24936906
- 58. Zhang L, Guo C, Feng M. Effect of local and global information on the dynamical interplay between awareness and epidemic transmission in multiplex networks. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2022;32(8). pmid:36049937
- 59. Plerou V, Gopikrishnan P, Rosenow B, Amaral LAN, Guhr T, Stanley HE. Random matrix approach to cross correlations in financial data. Physical Review E. 2002;65(6):066126. pmid:12188802
- 60. Plerou V, Gopikrishnan P, Rosenow B, Amaral LAN, Stanley HE. Universal and nonuniversal properties of cross correlations in financial time series. Physical review letters. 1999;83(7):1471.
- 61. Plerou V, Gopikrishnan P, Rosenow B, Amaral LA, Stanley HE. Econophysics: financial time series from a statistical physics point of view. Physica A: Statistical Mechanics and its Applications. 2000;279(1-4):443–456.
- 62. Plerou V, Gopikrishnan P, Rosenow B, Amaral LN, Stanley HE. A random matrix theory approach to financial cross-correlations. Physica A: Statistical Mechanics and its Applications. 2000;287(3-4):374–382.
- 63. Zhao X, Shang P, Huang J. Mutual-information matrix analysis for nonlinear interactions of multivariate time series. Nonlinear Dynamics. 2017;88:477–487.
- 64. Marchenko VA, Pastur LA. Distribution of eigenvalues for some sets of random matrices. Matematicheskii Sbornik. 1967;114(4):507–536.
- 65. Han RQ, Xie WJ, Xiong X, Zhang W, Zhou WX. Market correlation structure changes around the great crash: A random matrix theory analysis of the Chinese stock market. Fluctuation and Noise Letters. 2017;16(02):1750018.
- 66. Pan RK, Sinha S. Collective behavior of stock price movements in an emerging market. Physical Review E. 2007;76(4):046116.
- 67. Daly J, Crane M, Ruskin HJ. Random matrix theory filters in portfolio optimisation: A stability and risk assessment. Physica A: Statistical Mechanics and its Applications. 2008;387(16-17):4248–4260.
- 68. Yanhui W, Chenxin L, Meng D. The Application of Pearson and Mutual Information Correlation Network Structure Analysis in Mining Pathogenic Mechanism. In: 2021 IEEE 9th International Conference on Bioinformatics and Computational Biology (ICBCB). IEEE; 2021. p. 53–57.
- 69. Guan S, Zhao K, Yang S, et al. Motor imagery EEG classification based on decision tree framework and Riemannian geometry. Computational intelligence and neuroscience. 2019;2019. pmid:30804988
- 70. Vakorin VA, Mišić B, Krakovska O, McIntosh AR. Empirical and theoretical aspects of generation and transfer of information in a neuromagnetic source network. Frontiers in systems neuroscience. 2011;5:96. pmid:22131968
- 71. Meier TB, Wildenberg JC, Liu J, Chen J, Calhoun VD, Biswal BB, et al. Parallel ICA identifies sub-components of resting state networks that covary with behavioral indices. Frontiers in human neuroscience. 2012;6:281. pmid:23087635
- 72. Corso G, Ferreira GM, Lewinsohn TM. Mutual information as a general measure of structure in interaction networks. Entropy. 2020;22(5):528. pmid:33286300
- 73. Sharma C, Habib A. Mutual information based stock networks and portfolio selection for intraday traders using high frequency data: An Indian market case study. PloS one. 2019;14(8):e0221910. pmid:31465507
- 74. Pozzi F, Di Matteo T, Aste T. Centrality and peripherality in filtered graphs from dynamical financial correlations. Advances in Complex Systems. 2008;11(06):927–950.
- 75. Markowitz H. PORTFOLIO SELECTION*. The Journal of Finance. 1952;7(1):77–91.
- 76. Sortino FA. The Sortino Framework for Constructing Portfolios: Focusing on Desired Target ReturnTM to Optimize Upside Potential Relative to Downside Risk. Elsevier; 2009.
- 77. Pagan AR, Sossounov KA. A simple framework for analysing bull and bear markets. Journal of applied econometrics. 2003;18(1):23–46.
- 78. Lee JS, Kuo CT, Yen PH. Market states and initial returns: Evidence from Taiwanese IPOs. Emerging Markets Finance and Trade. 2011;47(2):6–20.