Predictability of machine learning techniques to forecast the trends of market index prices: Hypothesis testing for the Korean stock markets

Predicting the trends of stock and index prices is an important issue for market participants. Investors set trading or fiscal strategies based on these trends, and considerable research in various academic fields has sought to forecast financial markets. This study predicts the trends of the Korea Composite Stock Price Index 200 (KOSPI 200) prices using nonparametric machine learning models: an artificial neural network and support vector machines with polynomial and radial basis function kernels. In addition, this study raises controversial issues and tests hypotheses about them. Our results are inconsistent with those of the preceding research, which is generally considered to achieve high prediction performance. Moreover, search frequencies from Google Trends did not prove to be an effective factor in predicting the KOSPI 200 index prices in our framework. Furthermore, the ensemble methods did not improve the prediction accuracy.


Introduction
Predicting the trends of financial markets is one of the most important tasks for investors, who have tried to predict these trends with various methods and to trade on them. Technical analysis and fundamental analysis are both employed in analyzing the trends of stock prices. Technical analysis is a traditional method that uses historical stock prices and trading volumes to determine the trend of future stock prices. It is based on supply and demand in financial markets and can even be applied to firms in poor financial condition because it considers only historical prices and volumes. Fundamental analysis predicts stock prices from intrinsic values, which are determined by the financial statements of firms and by economic factors. Investors estimate the profits of firms and evaluate whether the stock values are appropriate. However, this approach cannot reflect other factors that affect stock prices, such as the emotions of market participants. Recently, several studies have emerged that analyze financial markets using investor sentiment drawn from blogs, news, and social network services.
Two opportunities to improve the prediction accuracy of market index prices stand out. First, although some studies have confirmed that predictability can be improved with search frequencies from Google Trends, only a few studies use Google Trends as an input variable. Google Trends, developed by Google, shows global interest in a subject through the frequency of keywords searched on the Internet. [12] have affirmed that the "early warning signs" of stock market moves can be detected through Google Trends because its search frequencies result from the interaction of humans with the Internet. [13] have also proposed a new measure of investor attention based on search frequencies from Google Trends. They showed that an increase in search frequency predicts substantially higher stock prices in the next 2 weeks and a reversal within the year, and they compared Google Trends with the existing proxies of investor attention. Second, emerging financial markets may be biased toward several large firms and show co-movement of individual stocks. [14] have confirmed that the cumulative contribution rate of the first two principal components of the Korea Composite Stock Price Index (KOSPI) was approximately 90%. Therefore, co-movements of individual stocks in the KOSPI may occur, and most of the variance can be explained by only two principal components. Given that the Korean financial markets are also weighted toward several large firms, the relatively small firms can act as noise when predicting market index prices. Most market indices are weighted by the market capitalization of firms and are inevitably affected by large firms. Taking these two points into account may improve predictability.
This study employs an artificial neural network (ANN) and support vector machines (SVMs) with polynomial and radial basis function (RBF) kernels to predict the trend of the Korea Composite Stock Price Index 200 (KOSPI 200) prices. We use the same input variables as the empirical study of [2]; however, our prediction framework differs substantially from theirs. We introduce a framework that is more realistic and practical for investors. Moreover, three controversial hypotheses are presented. The first hypothesis compares our general framework with the method of [2] and highlights the unrealistic parts of the latter. The second hypothesis examines the effectiveness of Google Trends in prediction. Finally, the third hypothesis examines whether ensemble approaches improve the performance in predicting index prices. We then test the hypotheses in our framework and present the results.

Nonparametric machine learning models
This study employs three nonparametric machine-learning models to analyze the three hypotheses: an ANN and SVMs with polynomial and RBF kernels. These models have been widely used in classification problems and have shown good performance.

Artificial neural networks
Neural networks, developed by [15], are nonparametric nonlinear models that overcome the limitations of linearity. In the network in Fig 1, the output of a node is generally calculated as

$$y = f\left(\sum_{i} w_i x_i\right), \quad (1)$$

where $x_i$ is the input value from node $i$ in the previous layer, $y$ is the output value, $f(\cdot)$ is an activation function, and $w_i$ is the weight for input $x_i$. A neural network is a nonlinear regression model because of its activation functions. It can represent nonlinear relationships by employing activation functions such as sigmoid functions, logistic functions, and hyperbolic tangent functions. The main task in training a neural network is to find the optimal weights $w_i$ in Eq (1). The most widely used method is the back-propagation algorithm [16]. In this algorithm, the weights are adjusted to minimize the loss function between the predicted values and the true values, and the gradients are calculated backward from the output layer to the input layer. The sum of squared errors is usually used as the loss function. Many researchers agree that, in general, the more hidden layers a network has, the better its performance after the weights are optimized. However, additional layers can cause over-fitting, so the number of layers should be determined experimentally.
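The weighted-sum-and-activation calculation and the weight update described above can be illustrated with a minimal Python sketch for a single node. It is not the Matlab implementation used in this study: the logistic activation, the learning rate of 0.5, the absence of a bias term, and all function names are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    """Logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, inputs):
    """y = f(sum_i w_i * x_i) for one node (no bias term, an assumption)."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

def gradient_step(weights, inputs, target, learning_rate=0.5):
    """One gradient-descent update for the squared-error loss (y - target)^2 / 2."""
    y = neuron_output(weights, inputs)
    # Chain rule: dL/dw_i = (y - target) * f'(z) * x_i, with f'(z) = y * (1 - y).
    delta = (y - target) * y * (1.0 - y)
    return [w - learning_rate * delta * x for w, x in zip(weights, inputs)]

# Hypothetical single training example with target output 1.0.
weights = [0.0, 0.0]
inputs = [1.0, 2.0]
before = neuron_output(weights, inputs)   # zero weights give sigmoid(0)
for _ in range(100):
    weights = gradient_step(weights, inputs, target=1.0)
after = neuron_output(weights, inputs)    # output has moved toward the target
```

In a multi-layer network the same chain rule is applied layer by layer, from the output layer back to the input layer, which is what the back-propagation algorithm automates.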

Support vector machine
A support vector machine (SVM), proposed in [17], is a supervised learning model for classification and regression. Given data in which each point belongs to one of two groups, the SVM constructs a non-stochastic binary linear classification model and predicts the group of new data. The classification model can be expressed as a hyperplane that divides the data by a clear gap that is as wide as possible. When new data are mapped into the same space as the training data, the model classifies them according to the side of the hyperplane on which they fall. For training points $x_i$ with labels $y_i \in \{-1, 1\}$, a normal vector $w$, and an offset $b$, the two hyperplanes that bound the gap can be described by Eq (2):

$$w \cdot x - b = 1 \quad \text{and} \quad w \cdot x - b = -1. \quad (2)$$

Because the distance between the two hyperplanes is $2/\lVert w \rVert$, we have to minimize $\lVert w \rVert$ to maximize the distance. The hyperplane halfway between the two dashed planes in Fig 2 is called the maximum-margin hyperplane, and the points on the two planes are the support vectors.
Geometrically, the points should not lie in the region between the planes, so we add the constraint in Eq (3):

$$y_i (w \cdot x_i - b) \geq 1 \quad \text{for all } i. \quad (3)$$

Putting these together, the optimization problem is to minimize $\lVert w \rVert$ subject to Eq (3). By adding a regularization term, we can relax the classifier to handle the linearly inseparable case. The objective function is then expressed as Eq (4),

$$\frac{1}{n} \sum_{i=1}^{n} \max\left(0,\ 1 - y_i (w \cdot x_i - b)\right) + \lambda \lVert w \rVert^2, \quad (4)$$

where the parameter $\lambda$ controls the trade-off between maximizing the margin size and separating the data strictly. SVMs can be applied to nonlinear classification problems by introducing kernel functions $k(x, x')$. Polynomial, radial basis function, and hyperbolic tangent kernels are commonly used.
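As a rough illustration of the ingredients above, the following Python sketch evaluates a soft-margin objective and the two kernels used in this study. It assumes the common textbook form in which the mean hinge loss is added to $\lambda \lVert w \rVert^2$ and the decision function is written as $w \cdot x - b$. The parameter names `gamma` and `coef0`, the function names, and the toy data are hypothetical and are not taken from the study.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree."""
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def soft_margin_objective(w, b, points, labels, lam):
    """Mean hinge loss plus lam * ||w||^2 for the linear classifier w.x - b."""
    hinge = [max(0.0, 1.0 - y * (dot(w, x) - b)) for x, y in zip(points, labels)]
    return sum(hinge) / len(hinge) + lam * dot(w, w)

# Two toy points that lie outside the margin of the classifier w = (1, 0), b = 0,
# so the hinge loss vanishes and only the regularization term remains.
points = [[2.0, 0.0], [-2.0, 0.0]]
labels = [1, -1]
objective = soft_margin_objective([1.0, 0.0], 0.0, points, labels, lam=0.1)
```

A kernel replaces the inner product in the dual form of the problem, which lets the same margin argument operate in an implicit higher-dimensional feature space.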

Data description
This research aims to predict market index prices using the three nonparametric machine-learning models. The inputs are based on the 10 technical indicators in [2]. Table 1 presents the inputs.
In the present study, the data set consists of the daily closing prices of the KOSPI 200 from 2004 to 2016, freely available from the Korea Exchange website: https://global.krx.co.kr/contents/GLB. After normalizing the 10 indicators, we use them as inputs to predict the movement of the index and stock prices. We set the training and prediction periods to approximately 6 months and 1 month, respectively, and the test data are taken from the period immediately after the training period. That is, we train the machine-learning models on approximately 6 months of past data and predict the KOSPI 200 index for roughly 1 month after the training period. Because we aim to predict the market index from the technical indicators, the training and test data must not overlap. When analyzing financial time series, the time of occurrence must be treated as a variable, so the data should not be selected randomly. We use this methodology to predict the KOSPI 200 index prices from several perspectives and compare our results with existing or well-known results through hypothesis tests.

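Table 1. Technical indicators used as inputs.
Name of indicator | Formula
Simple 10-day moving average | $(C_t + C_{t-1} + \cdots + C_{t-9}) / 10$
Weighted 10-day moving average | $(n \times C_t + (n-1) \times C_{t-1} + \cdots + C_{t-9}) / (n + (n-1) + \cdots + 1)$
$C_t$, $L_t$, and $H_t$ are the closing, low, and high prices at time $t$, respectively; $LL_t$ and $HH_t$ are the lowest and highest prices in the last $t$ days, respectively; $n$ is the window length (10 here). DIFF: EMA(12)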
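The two moving-average indicators named in Table 1 can be sketched in Python as follows. The sketch assumes a window of the 10 most recent closing prices and linear weights 10, 9, ..., 1 from the newest to the oldest price. The function names and the toy price series are hypothetical.

```python
def simple_moving_average(closes, window=10):
    """Arithmetic mean of the last `window` closing prices."""
    recent = closes[-window:]
    if len(recent) < window:
        raise ValueError("not enough prices for the requested window")
    return sum(recent) / window

def weighted_moving_average(closes, window=10):
    """Linearly weighted mean: the newest close gets weight `window`, the oldest weight 1."""
    recent = closes[-window:]
    if len(recent) < window:
        raise ValueError("not enough prices for the requested window")
    weights = range(1, window + 1)  # ordered from oldest to newest price
    return sum(w * c for w, c in zip(weights, recent)) / sum(weights)

# Hypothetical closing prices 1, 2, ..., 10 (oldest first).
closes = [float(p) for p in range(1, 11)]
sma = simple_moving_average(closes)    # 55 / 10 = 5.5
wma = weighted_moving_average(closes)  # 385 / 55 = 7.0
```

Because the weighted average emphasizes recent prices, it lies above the simple average in a rising series such as this one.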

General procedures
Although we use the same inputs, our framework is entirely different from that of [2]. We introduce a roll-over strategy, which has been broadly used to analyze time series data in economics and management science. This strategy can also be employed in real markets and is practical for investors: because it relies only on past data, investors can use it to predict the index price from past prices. Fig 3 presents the strategy.
In each model, three parameter sets are selected as those that give the best performance on the entire training data. Using the estimated parameters, we predict the test data and compute the prediction accuracy on them.
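The roll-over procedure can be sketched as a generator of chronological, non-overlapping training and test windows. The window sizes below (about 120 trading days for training and 20 for testing) and the choice to slide the window forward by one test period are assumptions for illustration, as are the function and variable names.

```python
def rolling_windows(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each test block starts right after its training block, and the whole
    window then slides forward by `test_size` observations.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

# Hypothetical sizes: roughly 6 months of trading days for training
# and roughly 1 month for prediction.
windows = list(rolling_windows(n_samples=200, train_size=120, test_size=20))
```

Every test index is strictly later than every training index in the same window, so no information from the prediction period leaks into parameter estimation.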

Hypothesis tests and results
In this research, three hypotheses are constructed to analyze the prediction results of the three machine-learning models. The significance level for the three hypotheses is 0.05, and the data from July 2004 to December 2016 are used as the test data through the roll-over strategy. We assume that the prediction performance of each trained model is a random variable that follows a normal distribution, in line with the central limit theorem. Subsequently, we perform the appropriate t-test after testing the distributional assumptions (normality and equal variances) for each variable.

Regarding the prediction accuracy of machine learning methods
The first hypothesis is based on the results of [2]. In [2], the prediction performances of the ANN and the SVMs with polynomial and RBF kernels are 75.74%, 71.52%, and 62.23%, respectively.
Table 2. (a, b, c) for ANNs denotes a parameter set (epochs, momentum constant, number of neurons), and (a, b, c) for SVMs with a polynomial kernel denotes a parameter set (degree of kernel function, gamma, regularization parameter). (a, b) for SVMs with an RBF kernel denotes a parameter set (gamma, regularization parameter).
To check robustness, we repeat the experiments with other data, namely the 20-day and 30-day moving averages, as shown in Table 3. We replace the simple 10-day moving average (MA) with each of these two moving averages in turn. Both show better performance than the 10-day MA, and the ANN gives the best predictability in both experiments, which is consistent with the 10-day MA result.
The procedures in [2] are inappropriate for predicting the market index. The data set for parameter estimation consists of 20% of the data in each of the increasing and decreasing directions, and the test data set for prediction consists of 50% selected on the same principle. In this procedure, data used for parameter estimation can also be selected as test data. This method is not realistic from the investor's point of view because, in a real application, the data used for parameter setting or training cannot also be the data whose market direction investors want to predict. Selecting training and prediction data sets with similar numbers of increasing and decreasing directions is likewise unrealistic and of no use in actual trading strategies. Therefore, we set the training data to approximately six months and the prediction data to roughly the following month, according to our general procedure. Based on these two methods, we test the hypothesis that the prediction performances of the two methods are similar. This hypothesis is intended to determine whether the high accuracy previously reported for machine-learning methods is independent of the procedure used to handle the data.
Prior to the hypothesis testing, the Anderson-Darling test was applied to the samples from the framework of [2], and the two-sample F-test for equal variances was performed. Because the Anderson-Darling test did not reject the null hypothesis, the assumption that the samples follow a normal distribution is acceptable. Independent two-sample t-tests that do not assume equal variances were performed because the equal-variance test rejected its null hypothesis. Let $\mu_{kara}$ and $\mu_{nonoverlap}$ denote the mean prediction accuracy under the framework of [2] and under our framework, respectively. The null hypothesis and the alternative hypothesis are as follows:

$$H_0: \mu_{kara} = \mu_{nonoverlap}, \qquad H_1: \mu_{kara} \neq \mu_{nonoverlap}.$$
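The test statistics involved in this step can be sketched with the Python standard library. The sketch assumes that the unequal-variance t-test is Welch's test, and it stops at the statistics: p-values would additionally require the F and t distribution functions from a statistics package. The accuracy samples are invented for illustration and are not the data of this study, and the function names are hypothetical.

```python
import math
from statistics import mean, variance

def variance_ratio(sample_a, sample_b):
    """F statistic for equal variances: ratio of the unbiased sample variances."""
    return variance(sample_a) / variance(sample_b)

def welch_t(sample_a, sample_b):
    """Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical accuracy samples from two frameworks (not the paper's data).
overlap_acc = [0.74, 0.76, 0.75, 0.77, 0.73]
nonoverlap_acc = [0.48, 0.55, 0.50, 0.45, 0.52, 0.51, 0.49]
f_stat = variance_ratio(overlap_acc, nonoverlap_acc)
t_stat, dof = welch_t(overlap_acc, nonoverlap_acc)
```

A large absolute value of the t statistic relative to the t distribution with `dof` degrees of freedom leads to rejecting the hypothesis of equal means at the chosen significance level.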
Given the constraints of the model frameworks, we have 13 data sets for the framework of [2] and 146 data sets for the non-overlapping framework. We generate a power curve to visualize how the sample size affects the test power. Table 4 shows that the p-value for the null hypothesis is very low, which means that, even with very few samples, the mean values of the two frameworks are different. In the first hypothesis test, the result in [2] therefore differs from the result of our general procedure, which is realistic and practical for actual investors. All models reject the null hypothesis that the mean prediction accuracy based on [2] is equal to the mean prediction accuracy based on our general framework.
We check the robustness of the first hypothesis test with the 20-day and 30-day MA in Table 5, replacing the 10-day MA with the 20-day and 30-day MA, respectively. All null hypotheses are again rejected, which shows the consistency of the results. Fig 4 depicts the prediction accuracy of the KOSPI 200 index prices for the three machine-learning models. The SVM with the RBF kernel gives the best performance in predicting the KOSPI 200 index prices. However, the accuracy of all three models is close to 50%, which is what random guessing would achieve when only two directions are predicted. Machine-learning methods under the general procedure therefore do not perform well in predicting the trends of market index prices. In other words, the models with the 10 indicators do not have high predictability based on past training data.

Regarding the prediction of market index trends with Google Trends
Various analytical methods have recently been developed using large data sets, which have attracted considerable attention. Google Trends shows the search frequencies of keywords on Google. In this test, search frequencies from Google Trends are added as an input variable. As in the first hypothesis test, the t-test was performed under the equal-variance assumption after first testing the distributional assumptions. Fig 5 and Table 6 show the results of the prediction and the t-test. While the ANN and the SVM with polynomial kernels might perform better with Google Trends than without it, the SVM with RBF kernels gives similar performance in both cases. However, it is difficult to determine the effectiveness of Google Trends from Fig 5, and all the models fail to reject the null hypothesis. The second hypothesis test therefore indicates that Google Trends is an ineffective factor in predicting the KOSPI 200 index prices in the current framework. Unlike previous studies on the impact of Google Trends on investment decisions, this study indicates that using Google Trends information as an input may not always be effective for all machine-learning models. This may be because Google Trends is not suitable for the KOSPI 200 index data or is not appropriate for the ANN or SVMs, which suggests further research in this area. The second hypothesis test with the 20-day and 30-day MA also shows consistent results in Table 7: all models fail to reject the null hypothesis, which means that Google Trends can be an ineffective factor in predicting the index prices.

Regarding Korean financial market prediction through ensemble techniques
A market index consists of many companies selected according to its characteristics. The KOSPI 200 index is a market-capitalization-weighted index that consists of 200 representative companies in South Korea. The Korean market is biased toward a few large companies, and the remaining companies are relatively less likely to explain the trend of market index prices. Generally, machine-learning methods are expected to improve their performance through ensemble techniques. Therefore, we compare the predictive performance on the whole market index with that of ensemble approaches built on major companies in the index. We select 10 representative large-capitalization companies among the 200 companies in the KOSPI 200 index. The same framework is applied to predict the trend of each company with the 10 technical indicators.
The third hypothesis concerns predicting the direction of market index prices by applying ensemble methods to the results for each major company in the index. If the ensemble prediction based on the individual companies performs better than the prediction based on the whole index, then the ensemble methods are effective in forecasting the trend of market index prices.
In this empirical study, we predict the stock price direction of each company with the 10 indicators for that company. If the market-capitalization-weighted average of the predicted directions is larger than 0.5, then the market index is expected to increase. Fig 6 shows the result. While the SVM with RBF kernels performs noticeably better when the index is predicted from the 10 representative companies, the ANN and the SVM with polynomial kernels show similar accuracies for the two predictions. Table 8 shows the result of the third hypothesis test. All the models fail to reject the null hypothesis, so there is no significant evidence that the prediction employing the 10 representative companies performs better than the prediction employing the whole index. Therefore, the ensemble methodology has no visible effect on predicting the direction of the market index. The hypothesis that investors can predict the Korean financial markets by looking solely at large companies is, hence, not supported. Accordingly, the performance of ensemble techniques depends on the specific machine-learning methodology and the data.
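The capitalization-weighted voting rule described above can be sketched in a few lines of Python. The function name, the encoding of an upward prediction as 1 and a downward prediction as 0, and the example capitalizations are assumptions for illustration only.

```python
def ensemble_direction(predicted_up, market_caps, threshold=0.5):
    """Combine per-company direction predictions into an index prediction.

    `predicted_up` holds 1 for a predicted rise and 0 for a predicted fall.
    Returns (direction, score): direction is 1 if the capitalization-weighted
    average of the predictions exceeds `threshold`, and 0 otherwise.
    """
    total_cap = sum(market_caps)
    score = sum(p * c for p, c in zip(predicted_up, market_caps)) / total_cap
    return (1 if score > threshold else 0), score

# Hypothetical example: three companies, only the largest is predicted to rise.
direction, score = ensemble_direction([1, 0, 0], [60.0, 25.0, 15.0])
```

In this toy case the single largest company outweighs the other two, which mirrors the concern that a capitalization-weighted index is dominated by a few large firms.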
We also check robustness in this experiment. In Table 9, the models with the 20-day and 30-day MA give results consistent with those of the experiment with the 10-day MA, which supports the robustness of the results.

Conclusion
Predicting stock prices has been a major issue for investors and researchers. Many methodologies from various academic fields have been introduced for prediction, and some have been used in real financial markets as trading strategies. Recently, researchers have analyzed the trends of stock prices using machine-learning models and have reported considerable performance with meaningful input factors.
In this research, we identify the points at which the analysis of time series data departs from real trading environments, and we adopt methods that are realistic and practical for actual investors. To predict the movement of the KOSPI 200 index prices, we use three nonparametric machine-learning models: an ANN and SVMs with polynomial and RBF kernels. Applying the methods to various circumstances, we address controversial issues by testing three hypotheses. In the first hypothesis test, we show that predicting the KOSPI 200 with the 10 technical indicators introduced by [2] does not perform well in a plausible framework for market investment. In addition, the result of [2] is not consistent with the result of the plausible framework, which predicts future data based only on past training data. In the second hypothesis test, we find that Google Trends may be an inadequate input factor for predicting the KOSPI 200 index prices. Accordingly, the second hypothesis test suggests that the suitability of Google Trends for the analysis should be evaluated before it is used in applications. In the third hypothesis test, we find that the ensemble results based on the 10 selected large companies in the KOSPI 200 index are similar to the whole-index prediction results, unlike other prediction studies based on machine-learning techniques.
This study confirmed, through hypothesis testing on the KOSPI 200 index market, the instability and large variability of machine-learning methods in market forecasting, which is mentioned in [18] and other research. The study can be extended in several directions. It points out several limitations of machine-learning methods for the KOSPI index market; each limitation can be analyzed in detail to improve the prediction accuracy for the KOSPI 200 index market. [19] have found Google Trends may

Supporting information
The following code includes the experiments of [2] in Matlab. It includes the parameter setting, the training and test procedures of the ANN and the SVMs with two kernels, and the robustness test. (M)
S4 File. Matlab code of experiments with Google Trends. The following code includes the experiments in hypotheses 1 and 2. The experiment without Google Trends is used in hypothesis 1 as the non-overlap method and in hypothesis 2 as the prediction without Google Trends. The experiment with Google Trends is used in hypothesis 2, for comparison with the experiment without Google Trends, and in the robustness test. (M)