A comparative study on effect of news sentiment on stock price prediction with deep learning architecture

  • Keshab Raj Dahal,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Statistics, Truman State University, Kirksville, MO, United States of America

  • Nawa Raj Pokhrel,

    Roles Methodology, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics and Computer Science, Xavier University of Louisiana, New Orleans, LA, United States of America

  • Santosh Gaire,

    Roles Conceptualization, Formal analysis, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    gaire1s@cmich.edu

    Affiliation Department of Physics, The Catholic University of America, Washington, DC, United States of America

  • Sharad Mahatara,

    Roles Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics, New Mexico State University, Las Cruces, NM, United States of America

  • Rajendra P. Joshi,

    Roles Conceptualization, Software, Validation, Writing – original draft, Writing – review & editing

    Affiliation TQuT Inc., Rockford, MI, United States of America

  • Ankrit Gupta,

    Roles Writing – review & editing

    Affiliation Department of Computer Science, Central Michigan University, Mount Pleasant, MI, United States of America

  • Huta R. Banjade,

    Roles Conceptualization, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics, Virginia Commonwealth University, Richmond, VA, United States of America

  • Jeorge Joshi

    Roles Software

    Affiliation Kathmandu Engineering College, Tribhuvan University, Kathmandu, Nepal

Abstract

The accelerated progress in artificial intelligence encourages the use of sophisticated deep learning methods in predicting stock prices. Meanwhile, easy access to the stock market in the palm of one’s hand has made its behavior more fuzzy, volatile, and complex than ever. There is demand for an accurate and reliable model that uses both text and numerical data to better represent the market’s highly volatile and non-linear behavior. A research gap exists in accurately predicting a target stock’s closing price using combined numerical and text data. This study uses long short-term memory (LSTM) and gated recurrent unit (GRU) models to predict the stock price, first using stock features alone and then incorporating financial news data in conjunction with stock features. The comparative study, carried out under identical conditions, dispassionately evaluates the importance of incorporating financial news in stock price prediction. Our experiment concludes that incorporating financial news data produces better prediction accuracy than using the stock fundamental features alone. The performance of the model architectures is compared using standard assessment metrics: Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Correlation Coefficient (R). Furthermore, statistical tests are conducted to verify the models’ robustness and reliability.

1 Introduction

The stock market allows us to buy and sell units of ownership in a company, called stocks; shareholders gain when the company’s profit goes up and lose when it goes down. The evolution of the stock market is intriguing, from handwritten stock trades in coffee shops to today’s digital platforms, where one can access the entire world’s stock markets in the palm of one’s hand. The stock market thus boosts economic growth by encouraging competition and innovation in different sectors, including business, education, and the labor market [1, 2]. However, purchasing the right stock at the right time is one of the most challenging jobs. Many factors behind the scenes affect the stock price, which is inherently nonlinear, volatile, fuzzy, noisy, non-parametric, and deterministically chaotic [3–5]. A stock price prediction model helps to reduce uncertainty and support better investment and trading decisions only if a well-balanced combination of the features that affect the stock price is properly utilized.

There have been various schools of thought in stock price prediction. Traditionally, the first wave believes in the efficient market hypothesis [6]. It argues that historical stock data strongly influence the prediction of the future price. Conversely, the concept of the random walk [7] holds that the information about a particular stock is already reflected in its current price, so any change in the stock price reflects the release of new information or random noise. The second wave focuses on statistical modeling, where the central focus is predicting future prices based on the relationship between past and present data [8–13]. Almost all of these models assume a linear relationship between the given variables, but stock market data are often nonlinear. The third and most potent wave came into existence due to technological advancement, high computational power, and the growth of rule-based artificial intelligence algorithms. These models primarily help to capture the nonlinear behavior of stock market data.

With very few exceptions, most models developed under any wave use structured numerical data to predict stock prices. These data alone are not sufficient to examine stock returns or to forecast daily and weekly market patterns [14, 15]. In the recent decade, stock market movement has been influenced by public or private information shared via different digital media platforms, such as Facebook posts, tweets on Twitter, or financial gossip on multiple platforms. All these activities further increase stock market volatility, as the information leans toward the psychological thought processes of human beings. When the discussion revolves around human beings and human sentiment, the situation is intricate and complex. A case in point: on January 5, 2017, President-elect Donald Trump tweeted a threat to impose a hefty tax on Toyota Motor if it built its Corolla cars for the U.S. market at a plant in Mexico. This tweet had a substantial impact on Toyota stock, as its price dipped and the volume spiked [16, 17]. Similarly, stocks fell on August 1, 2019, right after President Donald Trump posted a series of tweets about the 10% tariff that would be imposed on $300 billion worth of Chinese goods. The Dow Jones Industrial Average closed 98.41 points lower at 26,485.01 after plunging 334.20 points earlier in the day. The S&P 500 lost 0.7% to end the day at 2,932.05 (https://www.cnbc.com). Furthermore, on March 24, 2021, Elon Musk posted a widely noticed tweet about accepting Bitcoin as a payment method for Tesla cars; on the same day, the price of Bitcoin increased by 5.2%. The stated evidence speaks volumes about the importance and influence of market sentiment in stock price prediction.

Many research articles have been published so far using variations of deep learning models, with varying claims about the models’ accuracy and robustness [18–23]. The most popular deep learning architectures are Long Short Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and their respective hybridization techniques [18, 24–33]. Each new publication emphasizes its model’s accuracy and robustness. However, their implementation strategies, working frameworks, assumptions, numbers of features, and data sources differ from one to another. Thus, it is impractical to make an unbiased comparison between previously published articles, even those that use the same model to predict the stock price. Furthermore, the literature lacks a comparative analysis of stock price prediction with and without incorporating unstructured data alongside regular stock market data.

This study compares LSTM and GRU models for stock price prediction under the standard framework using identical conditions. It helps to objectively assess the statistical significance of including or excluding financial news sentiment in stock price prediction. Financial news of the developed countries is captured in multiple media outlets. The same news is published on various platforms with a micro analysis of the subject area from multiple avenues. Due to the heterogeneity and diversity of assessing the news sentiment for stock market data over a certain period, it is rational to select the data from developed countries to support the purpose of the study.

Our complete vision to achieve the stated goal can be conceptualized from the broader spectrum via the schematic diagram in Fig 1. The proposed study uses fundamental and financial news data to build the model. The concatenated data is normalized using the min-max technique. LSTM and GRU models are developed with or without incorporating financial news data. Once the hyperparameters are tuned, the final model predicts the closing price of the stock market index. The final model helps to determine whether or not the financial news influences the stock price prediction. At the end, the quality and robustness of the proposed model are assessed through RMSE, MAPE, and R scores.

Fig 1. Overall schematic diagram of the proposed research framework.

https://doi.org/10.1371/journal.pone.0284695.g001

The main contributions of this study include (a) answering the central question: under identical conditions, which model, LSTM or GRU, is the better choice for stock price prediction? (b) Confirming the statistical causal inference of news sentiment on stock price prediction. (c) Determining the significance of incorporating news sentiment in stock price prediction compared with using only fundamental stock attributes. (d) Conducting a series of statistical hypothesis tests to validate the experiment’s robustness and reliability.

The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 explains how the data are preprocessed before being used in the ML models. Section 4 explains the modeling approach, and Section 4.3 briefly discusses the model performance metrics. Section 5 explains the experimental settings and results. Finally, Section 6 presents the conclusion, followed by a list of references.

2 Related work

The study conducted by Adebiyi et al. [34] evaluated the performance of an Artificial Neural Network (ANN) against an Autoregressive Integrated Moving Average (ARIMA) model using historical stock data of Dell Inc. on the New York Stock Exchange (NYSE). The research concluded that the ANN model performed slightly better than the ARIMA model and noted that incorporating macroeconomic and technical indicators could further improve the results.

In a 2019 study, Karmiani and colleagues [35] compared the performance of LSTM, Backpropagation, SVM, and Kalman filter for stock price prediction. They used historic data from Yahoo Finance for nine companies (Apple, Acer, Amazon, Google, HP, IBM, Intel, Microsoft, and Sony) and found that LSTM had the highest prediction accuracy and lowest variance among the models tested.

Chen et al. (2015) [36] used LSTM to predict stock prices in the China stock market, using historical data from the Shanghai and Shenzhen stock markets obtained from Yahoo finance as input features. They reported an accuracy of 27.2% and suggested that incorporating other features such as macroeconomic data and technical indicators would improve the model’s performance.

In the study by Roondiwala et al. [37], LSTM was utilized to predict future stock prices of NIFTY50. Historical data, including high, low, open, and close prices, was obtained from the National Stock Exchange and used as input features. The RMSprop optimizer was employed with 500 epochs, resulting in a testing RMSE score of 0.00859. However, it is possible that normalized data were used to calculate the RMSE, rather than actual data. Additionally, the model’s performance could have been improved by incorporating other factors, such as financial sentiments, that have a direct impact on stock prices.

In the study by Yu and Yan [38], data for six stock indices from various market environments were used, including the S&P 500, DJIA, N 225, HSI, CSI 300, and ChiNext index. In the first stage, the authors applied phase-space reconstruction (PSR), de-noising, and normalization to the data to improve the performance of the model. Four standard machine learning algorithms were compared: LSTM, MLP, SVR, and ARIMA. The results showed that the LSTM had the highest prediction accuracy among the algorithms compared.

In the work of Gao et al. [39], the group applied four neural networks, namely Multilayer Perceptron (MLP), Long Short Term Memory (LSTM), Convolutional Neural Network (CNN), and an attention-based network, Uncertainty-aware Attention (UA), to predict three stock market indices: the SP500 index (most developed market), CSI300 index (less developed market), and Nikkei225 index (developing market). The results show that UA has the best performance among the alternative models. Furthermore, all models have better accuracy in the developed financial market than in developing ones.

In their study, Shahi et al. (2020) [40] investigated if incorporating financial news sentiments could improve the performance of stock price prediction using LSTM and GRU models. They used historical data from the Agricultural Development Bank Limited (ADBL) of Nepal and financial news headlines from ShareSansar Nepal, from 20 March 2011 to 14 November 2019. The results showed that the performance of both LSTM and GRU models was significantly improved by including financial news sentiments as input features.

Kara et al. (2011) [41] compared two classifiers, artificial neural networks (ANN) and support vector machines (SVM), to predict the direction of movement in the daily Istanbul Stock Exchange (ISE) National 100 Index. The data from January 2, 1997 to December 3, 2007 were used. Ten technical indicators were selected as input variables: simple 10-day moving average, weighted 10-day moving average, momentum, stochastic K%, stochastic D%, Relative Strength Index (RSI), moving average convergence divergence (MACD), Larry William’s R%, Accumulation/Distribution Oscillator, and Commodity Channel Index. Experimental results showed that the average performance of the ANN model (75.74%) was significantly better than that of the SVM model (71.52%).

In the work of Schoneburg, E. [42], the author analyzed the possibility of predicting stock prices on a short-term, day-to-day basis with the help of neural networks (Perceptron, Adaline, Madaline, and Backpropagation) by studying three important German stocks: BASF, COMMERZBANK, and MERCEDES. The author achieved an accuracy of up to 90% with a backpropagation network. Moreover, he notes that the selection of more suitable inputs for the network could improve the performance of the model.

K. Kohara et al. [43] used neural networks (NNs) to predict the daily closing price of the Tokyo stock price index (TOPIX) and to test whether incorporating event knowledge (the daily headlines of a Japanese newspaper) could produce better performance. Five inputs were fed to the NNs, with and without event knowledge: Close, the closing price of TOPIX; Exchange, the dollar-to-yen exchange rate (yen/dollar); Interest, an interest rate; Oil, the price of crude oil; and NY, the New York Dow-Jones average of the closing prices of 30 industrial stocks. The results show that the performance of the NNs is improved significantly by incorporating event knowledge.

Adebiyi A. A. et al. [44] used a feed-forward multilayer perceptron neural network with backpropagation to test whether incorporating fundamental analysis variables could produce better performance than technical analysis variables only. Published stock data obtained from the Internet were used. The empirical results show that the performance of the model improved significantly by incorporating fundamental analysis variables for daily stock price prediction.

Selvin S. et al. [45] used Recurrent Neural Networks (RNN), Long Short Term Memory (LSTM), and Convolutional Neural Network (CNN) for short-term stock price prediction using a sliding window approach. The window size was 100 minutes: 90 minutes of information were used as input, and the prediction was made for the remaining 10 minutes. Two companies from the IT sector and one company from the pharma sector of NIFTY were taken for the study. For their proposed methodology, CNN was identified as the best model.

During the COVID-19 pandemic in 2021, Binrong Wu et al. [20] used the following machine learning models to forecast oil price, oil production, oil consumption, and oil inventory: CNN (convolutional neural network), BPNN (backpropagation neural network), SVM (support vector machine), LSTM (long short-term memory), and RNN (recurrent neural network). Their empirical findings suggest that information gleaned from social media platforms makes a major contribution to forecasting oil prices, production levels, and consumption rates.

3 Data preparation and alignment

Broadly speaking, this study uses two types of datasets: the S&P 500 stock market index and financial news data. The overall description of the datasets is presented in Table 1. The stock market data cover a popular US stock market index, the S&P 500, from June 9, 2008, to November 5, 2021, and were accessed from Yahoo Finance [46]. They consist of the open, close, maximum, and minimum price as well as the shares traded (volume) on a particular day.

Table 1. Overall description of the datasets.

https://doi.org/10.1371/journal.pone.0284695.t001

Financial news data are scraped from the Reddit World News Channel [47] and Finviz [48]. Reddit World News Channel data from June 9, 2008, to July 1, 2017, are obtained from Kaggle [49]. Similarly, data from January 1, 2020, to November 5, 2021, are collected from Finviz. The news scraped from Finviz has many missing values from January 1, 2020, to August 2, 2020, so we retained the news from August 3, 2020, to November 5, 2021, and discarded the rest. The Reddit and Finviz news data are converted into numerical scores using the VADER (Valence Aware Dictionary for Sentiment Reasoning) package.

VADER [50] is a pre-trained model that analyzes people’s opinions, sentiments, evaluations, attitudes, and emotions via computational treatment regarding polarity (positive/negative) and intensity (strength) in text. It relies on an English dictionary that maps lexical features to their semantic orientation as positive or negative [51].

Before the financial news obtained through web scraping is passed to VADER for sentiment analysis, the data are preprocessed. This involves removing unnecessary text found in HTML tags, single or multiple blank spaces, and escape sequences. However, the exclamation marks and question marks found in the news headlines are not removed, as they may add intensity and strength to the news. After preprocessing, the data are fed into VADER to determine the corresponding sentiment scores.
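The cleaning step described above can be sketched with the Python standard library. This is a minimal illustration, not the authors' code: the function name `clean_headline` and the regular expressions are assumptions, and the VADER scoring call itself is left out so the block runs without third-party packages.

```python
import html
import re

# Hypothetical patterns chosen for illustration; the paper does not list its exact rules.
TAG_RE = re.compile(r"<[^>]+>")      # anything that looks like an HTML tag
ESCAPE_RE = re.compile(r"\\[nrt]")   # literal escape sequences such as "\n" left in scraped text
SPACE_RE = re.compile(r"\s+")        # runs of whitespace, including real newlines and tabs


def clean_headline(raw: str) -> str:
    """Strip HTML tags, escape sequences, and extra spaces; keep '!' and '?'."""
    text = html.unescape(raw)        # turn entities such as &amp; into plain characters
    text = TAG_RE.sub(" ", text)
    text = ESCAPE_RE.sub(" ", text)
    text = SPACE_RE.sub(" ", text)   # collapse to single spaces
    return text.strip()


print(clean_headline("<b>Stocks   rally!</b>\\n  Is it over?"))
```

Punctuation is deliberately untouched, since VADER uses exclamation and question marks when estimating intensity.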

Both news datasets contain at least 25 news items published on a single day, so we computed the average sentiment score for each day. Finally, the sentiment scores for the financial news data are aligned with the stock market data, as shown in Fig 2.

Fig 2. Concatenation of fundamental and financial news data.

https://doi.org/10.1371/journal.pone.0284695.g002
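The daily averaging and date alignment summarized in Fig 2 could look like the sketch below. The row layout (dicts with a `"date"` key), the function names, and the choice to drop trading days without a sentiment score are assumptions made for illustration; the paper does not specify these details.

```python
from collections import defaultdict
from statistics import mean


def daily_mean_scores(scored_headlines):
    """Average headline-level sentiment scores per calendar day.

    scored_headlines: iterable of (date_string, score) pairs.
    """
    by_day = defaultdict(list)
    for day, score in scored_headlines:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in by_day.items()}


def align_with_prices(price_rows, daily_scores):
    """Join stock rows with daily sentiment on the date key.

    price_rows: list of dicts with a "date" key (hypothetical layout).
    Rows whose date has no sentiment score are dropped (an assumption).
    """
    return [
        {**row, "sentiment": daily_scores[row["date"]]}
        for row in price_rows
        if row["date"] in daily_scores
    ]


scores = daily_mean_scores([("2020-08-03", 0.5), ("2020-08-03", -0.1), ("2020-08-04", 0.2)])
rows = align_with_prices(
    [{"date": "2020-08-03", "close": 1.0}, {"date": "2020-08-05", "close": 2.0}],
    scores,
)
print(rows)
```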

The data are concatenated into a single frame based on the date column that exists in both datasets. The descriptive statistics of the combined data are given in Table 2 to provide initial insight. As seen in Table 2, some features have a large magnitude compared to others. To prevent features with a large magnitude from dominating features with a small magnitude, as partially shown in Table 3, min-max scaling is performed for each feature as defined by Eq 1:

$$z = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where $z$ and $x$ are the scaled and the original input, respectively, and $x_{\min}$ and $x_{\max}$ are the minimum and the maximum values of the input. Prior to feeding the data into the ML models, the scaled two-dimensional data were converted into three dimensions (number of samples, time step, and number of features) by incorporating the time step. A partial snapshot of the normalized dataset used in this study is presented in Table 4.
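As a small illustration of the min-max scaling in Eq 1, the sketch below scales one feature column and maps scaled values back to the original units. The function names and the handling of a constant column are choices made here, not details taken from the paper.

```python
def min_max_scale(column):
    """Scale a list of numbers to [0, 1] with z = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(column), max(column)
    span = x_max - x_min
    if span == 0:
        # A constant column has no spread; mapping it to zeros is a convention chosen here.
        return [0.0 for _ in column]
    return [(x - x_min) / span for x in column]


def inverse_min_max(scaled, x_min, x_max):
    """Map scaled values back to the original units, e.g. to report errors in index points."""
    return [z * (x_max - x_min) + x_min for z in scaled]


closes = [10.0, 20.0, 30.0]
scaled = min_max_scale(closes)
print(scaled)                                # [0.0, 0.5, 1.0]
print(inverse_min_max(scaled, 10.0, 30.0))   # [10.0, 20.0, 30.0]
```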

Table 2. The descriptive statistics of the features.

https://doi.org/10.1371/journal.pone.0284695.t002

Table 3. The snapshot of the actual features used in the model.

https://doi.org/10.1371/journal.pone.0284695.t003

Table 4. The partial snapshot of the normalized dataset.

https://doi.org/10.1371/journal.pone.0284695.t004
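The conversion of the scaled two-dimensional table into three dimensions (samples, time step, features) can be sketched as a sliding window. The function name, the `target_index` argument, and the choice of the next day's value as the target are assumptions for illustration.

```python
def make_windows(rows, time_step, target_index):
    """Turn a 2-D table into (samples, time_step, features) windows plus next-day targets.

    rows: list of equal-length feature lists, ordered by date.
    target_index: column holding the value to predict (e.g. the scaled close).
    """
    windows, targets = [], []
    for start in range(len(rows) - time_step):
        # Each sample holds `time_step` consecutive days of all features.
        windows.append([list(r) for r in rows[start:start + time_step]])
        # The target is the chosen column on the day right after the window.
        targets.append(rows[start + time_step][target_index])
    return windows, targets


table = [[day, 10 * day] for day in range(8)]   # 8 days, 2 toy features
X, y = make_windows(table, time_step=5, target_index=0)
print(len(X), len(X[0]), len(X[0][0]))          # 3 5 2
print(y)                                        # [5, 6, 7]
```

With 8 rows and a time step of 5, three samples result, each of shape 5 × 2.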

4 Modeling approach

4.1 Long short term memory (LSTM)

Long short term memory (LSTM) is an advanced form of recurrent neural network (RNN) known for its robust performance on time series data. It overcomes the main drawback of RNNs, the difficulty of preserving information over long-term dependencies due to vanishing and exploding gradients [52]. LSTM uses memory cells to solve this problem and consists of an input layer, a hidden layer, an output layer, and a cell state [53–55].

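Fig 3 illustrates the LSTM architecture. For input $x_t$, the memory cell $c_t$ updates the information using three gates: input gate $i_t$, change gate $\tilde{c}_t$, and forget gate $f_t$. The hidden state $h_t$ is updated using the output gate $o_t$ and the memory cell $c_t$. These operations of LSTM are governed by the following functions:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + W_{ho} h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + W_{hc} h_{t-1} + b_c) \\
c_t &= f_t \otimes c_{t-1} + i_t \otimes \tilde{c}_t \\
h_t &= o_t \otimes \tanh(c_t)
\end{aligned}
$$

where σ and tanh represent the sigmoid and hyperbolic tangent activation functions, respectively. The operator ⊗ is the element-wise product, $W$ and $W_h$ are the weight matrices, and $b$ are bias vectors [56–59].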

Fig 3. Long short-term memory(LSTM) architecture [56].

https://doi.org/10.1371/journal.pone.0284695.g003
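To make the gate mechanics of Section 4.1 concrete, here is a minimal single-time-step LSTM in plain Python. It follows the widely used textbook formulation with input, forget, and output gates and a candidate ("change gate") vector; the parameter layout (`params[name] = (W, U, b)`) and all function names are assumptions for illustration, not the authors' implementation.

```python
import math


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-a)) for a in values]


def affine(W, x, U, h, b):
    """Compute W x + U h + b for small dense matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + bias
        for W_row, U_row, bias in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps 'i', 'f', 'o', 'c' to (W, U, b)."""
    def pre(name):
        W, U, b = params[name]
        return affine(W, x, U, h_prev, b)

    i = sigmoid(pre("i"))                        # input gate
    f = sigmoid(pre("f"))                        # forget gate
    o = sigmoid(pre("o"))                        # output gate
    c_tilde = [math.tanh(a) for a in pre("c")]   # candidate ("change gate")
    # New cell state: keep part of the old state, add part of the candidate.
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    # New hidden state: output gate applied to the squashed cell state.
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Toy check with all-zero weights: every gate equals 0.5 and the candidate is 0.
zero = ([[0.0] * 3 for _ in range(2)], [[0.0] * 2 for _ in range(2)], [0.0, 0.0])
params = {name: zero for name in "ifoc"}
h, c = lstm_step([1.0, 2.0, 3.0], [0.1, 0.2], [1.0, -1.0], params)
print(c)   # [0.5, -0.5]
```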

4.2 Gated recurrent unit (GRU)

GRU is a simplified version of the LSTM, first developed by Chung et al. in 2014 [60]. Unlike in LSTM, where the short-term state ($h_t$) and long-term state ($c_t$) are separate, GRU merges them into a single vector $h_t$. As opposed to the three gates in LSTM, GRU is equipped with only two gates: the update gate and the reset gate. The update gate of GRU plays the role of both the forget gate and the input gate of LSTM [61]. This gate is responsible for long-term memory: it helps the model determine how much of the past information needs to be passed along to the future. The reset gate is responsible for short-term memory: it helps the model decide how much of the past information to forget. Based on empirical evidence, both LSTM and GRU have proven effective on many machine learning tasks [62–66].

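Fig 4 illustrates the GRU architecture. At time step $t$, the unit takes the input $x_t$ and the hidden state $h_{t-1}$ from the previous time step $t-1$. It computes a new hidden state $h_t$, which is passed on to the next time step. These operations of GRU are governed by the following functions:

$$
\begin{aligned}
z_t &= \sigma(W_z x_t + W_{hz} h_{t-1} + b_z) \\
r_t &= \sigma(W_r x_t + W_{hr} h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh(W_{\tilde{h}} x_t + W_{h\tilde{h}} (r_t \otimes h_{t-1}) + b_{\tilde{h}}) \\
h_t &= z_t \otimes h_{t-1} + (1 - z_t) \otimes \tilde{h}_t
\end{aligned}
$$

where $z_t$ and $r_t$ are the update and reset gates, $\tilde{h}_t$ is the candidate hidden state, and σ and tanh represent the sigmoid and hyperbolic tangent activation functions, respectively. The operator ⊗ is the element-wise product, $W$ and $W_h$ are the weight matrices, and $b$ are bias vectors.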

Fig 4. Gated Recurrent Unit (GRU) architecture [66].

https://doi.org/10.1371/journal.pone.0284695.g004
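A single GRU time step can be sketched in the same spirit. The version below assumes the convention in which the update gate weights the previous hidden state (matching the description that it controls how much past information is passed along); some references swap the roles of z and 1 − z. The parameter layout and function names are hypothetical.

```python
import math


def _sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def _affine(W, x, U, h, b):
    """Compute W x + U h + b for small dense matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + bias
        for W_row, U_row, bias in zip(W, U, b)
    ]


def gru_step(x, h_prev, params):
    """One GRU time step; params maps 'z', 'r', 'h' to (W, U, b)."""
    Wz, Uz, bz = params["z"]
    Wr, Ur, br = params["r"]
    Wh, Uh, bh = params["h"]
    z = [_sigmoid(a) for a in _affine(Wz, x, Uz, h_prev, bz)]   # update gate
    r = [_sigmoid(a) for a in _affine(Wr, x, Ur, h_prev, br)]   # reset gate
    gated_h = [rt * hp for rt, hp in zip(r, h_prev)]            # reset applied to the past state
    h_tilde = [math.tanh(a) for a in _affine(Wh, x, Uh, gated_h, bh)]
    # Assumed convention: z keeps the old state, (1 - z) admits the candidate.
    return [zt * hp + (1.0 - zt) * ht for zt, hp, ht in zip(z, h_prev, h_tilde)]


# Toy check with all-zero weights: z = 0.5 and the candidate is 0, so the state halves.
zero = ([[0.0] for _ in range(2)], [[0.0, 0.0] for _ in range(2)], [0.0, 0.0])
params = {name: zero for name in "zrh"}
print(gru_step([1.0], [0.4, -0.8], params))   # [0.2, -0.4]
```

Compared with the LSTM step, there is no separate cell state and one gate fewer, which is why GRU has fewer parameters for the same hidden size.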

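4.3 Model assessment metrics

The prediction accuracy and reliability of the models are assessed using root mean squared error (RMSE), mean absolute percentage error (MAPE), and the linear correlation coefficient (R). The ML model with the smallest RMSE and MAPE along with the greatest R is considered the best model. The analytical forms of these metrics are given below:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{2}$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{3}$$

$$R = \frac{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2 \sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}} \tag{4}$$

where,

  1. $y_i$: Original closing prices,
  2. $\bar{y}$: The mean of the original closing prices,
  3. $\hat{y}_i$: Predicted closing prices,
  4. $\bar{\hat{y}}$: The mean of the predicted closing prices,
  5. $N$: Number of observations.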
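The three assessment metrics can be computed with a few lines of standard-library Python. This is a sketch under one stated assumption: MAPE is returned as a fraction (multiply by 100 for a percentage), since the convention used for the reported values is not recoverable here. Function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (assumed convention)."""
    n = len(y_true)
    return sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / n


def pearson_r(y_true, y_pred):
    """Linear correlation coefficient between actual and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(y_true, y_pred))
    var_y = sum((y - mean_y) ** 2 for y in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_y * var_p)


actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(rmse(actual, predicted), mape(actual, predicted), pearson_r(actual, predicted))
```

RMSE depends on the units of the closing price, so it should be computed on values mapped back from the min-max scale if it is to be read in index points.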

5 Experiment setting and results

5.1 Experimental setup

The primary objective of this research is to see the effect of financial news on stock prediction. To examine the effect of financial sentiments on stock price predictions, we used two datasets—(I) Fundamental data and (II) Combined data. Fundamental data consist of only stock market data for the S&P 500 index, whereas combined data is fundamental data with a corresponding financial news sentiment score. Each of these datasets is divided into three subsets: training, validation, and testing. The overall distribution for each subset’s time range and corresponding samples are listed in Table 5. The training set contains the data from June 9, 2008, to July 1, 2017, while the test set ranges from August 3, 2020, to November 5, 2021. From the training set, 25% of the data is separated for validation.

Table 5. Overall distribution of training, validation, and test data.

https://doi.org/10.1371/journal.pone.0284695.t005
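The 25% validation hold-out from the training period might be implemented as below. Whether the split was chronological or random is not stated; this sketch assumes a chronological split that keeps the most recent quarter of the training samples for validation, which avoids leaking future information into training. The function name is hypothetical.

```python
def split_train_validation(samples, validation_fraction=0.25):
    """Hold out the most recent fraction of the training period for validation."""
    cut = int(round(len(samples) * (1.0 - validation_fraction)))
    return samples[:cut], samples[cut:]


train, validation = split_train_validation(list(range(8)))
print(train)        # [0, 1, 2, 3, 4, 5]
print(validation)   # [6, 7]
```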

Our goal is to perform a comparative analysis of the LSTM and GRU model architectures using two different datasets under identical conditions. We therefore develop four models: two using fundamental data (LSTM, GRU) and two using combined data (LSTM-News, GRU-News). The normalized three-dimensional training and validation datasets are fed to the LSTM and GRU models. The models are trained in a supervised learning environment with the mean square error as the loss function. During the training process, we fixed the time step to 5. Each model consists of an input layer, followed by two hidden LSTM layers (two hidden GRU layers for GRU) and a dense layer with a linear activation function. The hyperparameters (number of neurons in each layer, batch size, optimizer, number of epochs, and learning rate) were optimized. The optimal set of hyperparameters for each model architecture is presented in Table 6.

Table 6. List of best hyperparameters for the models.

https://doi.org/10.1371/journal.pone.0284695.t006

5.2 Experimental results

We finally compare the performance of the ML models with news scores (LSTM-News and GRU-News) against those without news scores (LSTM and GRU) on the test data, using the performance metrics presented in Table 7. The correlation coefficient (R) does not vary significantly between the models and shows a virtually perfect correlation. However, based on the lowest RMSE and MAPE, GRU-News performs best among the four models. We can conjecture a ranking of our models as GRU-News, LSTM-News, LSTM, and GRU, in order of decreasing performance.

Table 7. Model performance metrics of four models obtained using test datasets.

https://doi.org/10.1371/journal.pone.0284695.t007

The scatter plots of the true values versus the predicted values of the closing price for the test data are shown in Fig 5(a)–5(d). These plots provide useful information to gauge the goodness of fit of each model. If the predicted values are close to the actual values, the plot resembles a straight line at a 45-degree angle to the horizontal axis, resulting in R close to 1.

Fig 5. Scatter plots between the actual closing price and predicted closing price of test data corresponding to the models: (a) LSTM-News, (b) GRU-News, (c) LSTM, and (d) GRU.

The best-fit linear equation (y = x) is represented by the red dotted line.

https://doi.org/10.1371/journal.pone.0284695.g005

The time series plots presented in Fig 6(a)–6(d) compare the actual closing price with the closing price predicted by each model architecture. The predicted closing price nearly overlaps with the actual closing price in all cases. This further verifies that all four models sufficiently capture the trend of the closing price despite various irregularities. In a nutshell, Figs 5 and 6 are informative about the influence of financial news on stock price prediction in terms of model accuracy.

Fig 6. Time series plots between the actual closing price and predicted closing price of test data corresponding to the models: (a) LSTM-News, (b) GRU-News, (c) LSTM, and (d) GRU.

https://doi.org/10.1371/journal.pone.0284695.g006

5.3 Statistical analysis

The model assessment metrics and visualization techniques discussed above indicate that the models incorporating financial news data better represent the behavior of the stock market. We would further like to verify, using statistical analysis, whether the performance of the models differs. This can be done with both parametric and non-parametric tests.

One-way analysis of variance (ANOVA) is considered as the first method under the umbrella of the parametric test. Even though it is easy to implement and interpret, it may not predict the p-value accurately if the data are not normally distributed [67]. Therefore, before implementing the one-way ANOVA, the condition of normality must be satisfied.

A quantile-quantile (QQ) plot [68] has been employed to check the normality of the data. The QQ plots of the errors (true closing price − predicted closing price) for the test data (Fig 7(a)–7(d)) provide initial insight into normality. The quantiles of the normal distribution (straight line in Fig 7) and the quantiles of the errors (blue dots in Fig 7) should overlap if the data are normally distributed. Fig 7 suggests that the errors are not normally distributed. We therefore carry out hypothesis tests to draw conclusions about the non-normality of the errors.

Fig 7. Normal QQ-plots of errors (actual closing price − predicted closing price) of test data corresponding to the models: (a) LSTM-News, (b) GRU-News, (c) LSTM, and (d) GRU.

The quantile of normal distribution and errors are represented by a solid red line and blue dots, respectively.

https://doi.org/10.1371/journal.pone.0284695.g007
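The points of a normal QQ plot such as those in Fig 7 can be computed with the standard library alone. The sketch below standardizes the errors and pairs their sorted values with standard normal quantiles; the plotting positions (k − 0.5)/n are one common choice and an assumption here, as is the function name.

```python
from statistics import NormalDist, mean, stdev


def qq_points(errors):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(errors)
    mu, sigma = mean(errors), stdev(errors)
    # Observed quantiles: standardized errors in ascending order.
    observed = sorted((e - mu) / sigma for e in errors)
    # Theoretical quantiles of the standard normal at plotting positions (k - 0.5) / n.
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, observed))


for t, o in qq_points([1.0, 2.0, 3.0, 4.0, 5.0]):
    print(round(t, 3), round(o, 3))
```

If the errors were normally distributed, the pairs would fall close to the line y = x; systematic curvature in the tails indicates departure from normality.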

The hypotheses for the normality test of each error series and the corresponding p-values, obtained using the method described by D’Agostino [69], are listed in Table 8. Since the p-values are very small (Table 8), we reject the null hypothesis that the errors are normally distributed. There is therefore sufficient evidence to conclude that the errors obtained from the four models (LSTM, GRU, LSTM-News, and GRU-News) are far from normally distributed; that is, the populations corresponding to these errors are not normally distributed. Consequently, the assumptions of a parametric test are not satisfied in the given scenario.

Table 8. Hypothesis together with p-values of the normality test.

https://doi.org/10.1371/journal.pone.0284695.t008
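D’Agostino-type normality tests are built from the sample skewness and kurtosis of the data. The sketch below computes only these two ingredients, not the full test statistic or its p-value; a complete omnibus test is available as `scipy.stats.normaltest`, though which software was used for Table 8 is not stated. The function name is illustrative.

```python
def sample_skewness_kurtosis(values):
    """Return (skewness g1, excess kurtosis g2) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


g1, g2 = sample_skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
print(g1, g2)
```

For a normal population both quantities are close to zero; large absolute values in either one are what drive small p-values in the normality test.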

As non-parametric tests do not require the normality assumption, the Kruskal-Wallis test [70] is implemented first under this category. The Kruskal-Wallis test is used to assess the following hypotheses:

H0: The prediction accuracy of the models are not significantly different.

H1: At least one model has significantly different prediction accuracy than others.

The Kruskal-Wallis test statistic and the p-value of the test are 123.1135 and $1.6475 \times 10^{-26}$, respectively. Since the p-value is practically zero, we reject the null hypothesis. In other words, there is sufficient evidence to conclude that at least one model has significantly different prediction accuracy than the others.
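For readers unfamiliar with the Kruskal-Wallis test, the H statistic can be computed from pooled ranks as sketched below. This version omits the tie correction and does not compute the p-value; which quantity per model was passed to the test (for example raw or absolute errors) is not specified, so the inputs here are toy numbers and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0   # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Under the null hypothesis, H is approximately chi-square distributed with k − 1 degrees of freedom, i.e. 3 degrees of freedom for four models.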

However, the Kruskal-Wallis test does not identify which models have different prediction accuracy. To identify them, we perform a pairwise comparison of the models using the Mann-Whitney test [71]. The p-values of the Mann-Whitney test are presented in Table 9.

Table 9. P-values of the Mann-Whitney test for pairwise comparisons within the models.

https://doi.org/10.1371/journal.pone.0284695.t009
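The pairwise comparisons behind a table like Table 9 can be sketched as follows, using SciPy's `mannwhitneyu` with a two-sided alternative. As before, the use of absolute errors, the helper `pairwise_mann_whitney`, and the synthetic data are assumptions made for illustration. The source does not say whether a multiple-comparison correction was applied, so none is applied here.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def pairwise_mann_whitney(errors_by_model):
    """Two-sided Mann-Whitney U test for every pair of models; returns {(a, b): p}."""
    abs_errors = {k: np.abs(np.asarray(v, dtype=float)) for k, v in errors_by_model.items()}
    p_values = {}
    for a, b in combinations(abs_errors, 2):
        _, p = stats.mannwhitneyu(abs_errors[a], abs_errors[b], alternative="two-sided")
        p_values[(a, b)] = float(p)
    return p_values


# Synthetic errors (illustrative only, NOT the paper's data): the "-News" groups
# are generated with a smaller spread so that cross-group pairs differ.
rng = np.random.default_rng(7)
synthetic_errors = {
    "LSTM-News": rng.normal(0.0, 1.0, 300),
    "GRU-News": rng.normal(0.0, 1.0, 300),
    "LSTM": rng.normal(0.0, 2.0, 300),
    "GRU": rng.normal(0.0, 2.0, 300),
}

p_values = pairwise_mann_whitney(synthetic_errors)
for (a, b), p in p_values.items():
    print(f"{a} vs {b}: p = {p:.4g}")
```

Four models give six pairs; a small p-value for a pair indicates that the two error distributions differ, and the performance metrics then indicate which model of the pair is better.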

From Table 9, we conclude that the performance of LSTM-News and LSTM is significantly different, and that LSTM-News has better prediction accuracy than LSTM. Similarly, GRU-News shows better prediction accuracy than GRU. In both cases, the p-values in Table 9 together with the performance metrics (Table 7) serve as the primary evidence for the conclusion. We can further draw the following conclusions:

  • The prediction accuracy of LSTM-News is significantly better than that of GRU.
  • The prediction accuracy of GRU-News is significantly better than that of LSTM.
  • There is not sufficient evidence to conclude that the performance of LSTM differs significantly from that of GRU, though LSTM has slightly better performance metrics than GRU.
  • There is not sufficient evidence to conclude that the performance of LSTM-News differs significantly from that of GRU-News, though GRU-News has slightly better performance metrics than LSTM-News.

5.4 Ethics and implications

The study uses publicly available fundamental stock market data and web-scraped financial news data without manipulation. A series of statistical tests further supports the reported performance of the models, lending credibility to their reliability and robustness. The results can serve as additional information to boost confidence in stock investment decisions, but an investment decision should not rely solely on this research outcome. Investors are advised to draw on their own experience and risk tolerance and to consider other lurking variables in light of the prevailing market situation. One can thus benefit when current market conditions are appropriately analyzed and combined with the model’s outcome. This research shows the promising possibility of using LSTM and GRU architectures that combine financial news data with fundamental stock data to delineate the cone of uncertainty in stock price prediction.

6 Conclusions

Stock price prediction is gaining popularity among all stakeholders involved, directly or indirectly, in making responsible financial decisions. Precise and consistent prediction is challenging because stock prices are volatile, nonlinear, noisy, chaotic, and fuzzy. It is therefore reasonable to ask whether structured numerical data, unstructured text data, or a combination of both influences stock price prediction. The current literature lacks a comparative analysis of stock price prediction with and without unstructured data that captures human sentiment. The primary objective of this article is to conduct a comparative study of LSTM and GRU under identical conditions using multifaceted information and to identify the influence of text data on stock price prediction. A comprehensive data-driven approach with hyperparameter tuning reveals that including financial news data improves the performance of both the LSTM and GRU architectures. Furthermore, stock price prediction can improve significantly when the stock market’s fundamental data is combined with financial news data. Thus, the outcome of the employed models quantitatively supports the longstanding belief that sentiment and social media influence stock prices. Our results are further supported by standard assessment metrics (RMSE, MAPE, and R) and by a series of statistical tests that validate their reliability and robustness.

References

  1. Bosworth B, Hymans S, Modigliani F. The stock market and the economy. Brookings Papers on Economic Activity. 1975;1975(2):257–300.
  2. Jones CM. A century of stock market liquidity and trading costs. Available at SSRN 313681. 2002.
  3. Ahangar RG, Yahyazadehfar M, Pournaghshband H. The comparison of methods artificial neural network with linear regression using specific variables for prediction stock price in Tehran stock exchange. arXiv preprint arXiv:1003.1457. 2010.
  4. Dattatray P, Kumar K. Systematic analysis and review of stock market prediction techniques. Computer Science Review. 2019;34.
  5. Weng B, Ahmed MA, Megahed FM. Stock market one-day ahead movement prediction using disparate data sources. Expert Systems with Applications. 2017;79:153–163.
  6. Hansen JV, McDonald JB, Nelson RD. Time series prediction with Genetic-Algorithm designed neural networks: An empirical comparison with modern statistical models. Computational Intelligence. 1999;15(3):171–184.
  7. Fama EF. Random walks in stock market prices. Financial Analysts Journal. 1995;51(1):75–80.
  8. Ariyo AA, Adewumi AO, Ayo CK. Stock price prediction using the ARIMA model. In: 2014 UKSim-AMSS 16th international conference on modelling and simulation. IEEE; 2014. p. 106–112.
  9. Devi BU, Sundar D, Alli P. An effective time series analysis for stock prediction using ARIMA model for nifty midcap-50. International Journal of Data Mining & Knowledge Management Process. 2013;3(1):65.
  10. Junior PR, Salomon FLR, de Oliveira Pamplona E, et al. ARIMA: An applied time series forecasting model for the Bovespa stock index. Applied Mathematics. 2014;5(21):3383.
  11. Rounaghi MM, Zadeh FN. Investigation of market efficiency and financial stability between S&P 500 and London stock exchange: monthly and yearly forecasting of time series stock returns using ARMA model. Physica A: Statistical Mechanics and its Applications. 2016;456:10–21.
  12. Banerjee D. Forecasting of Indian stock market using time-series ARIMA model. In: 2014 2nd international conference on business and information management (ICBIM). IEEE; 2014. p. 131–135.
  13. Efendi R, Arbaiy N, Deris MM. A new procedure in stock market forecasting based on fuzzy random auto-regression time series model. Information Sciences. 2018;441:113–132.
  14. Ticknor JL. A Bayesian regularized artificial neural network for stock market forecasting. Expert Systems with Applications. 2013;40(14):5501–5506.
  15. Yeh CY, Huang CW, Lee SJ. A multiple-kernel support vector regression approach for stock market price forecasting. Expert Systems with Applications. 2011;38(3):2177–2186.
  16. Ge Q, Kurov A, Wolfe MH. Stock market reactions to presidential social media usage: Evidence from company-specific tweets. SSRN Electronic Journal. 2017.
  17. Ge Q, Kurov A, Wolfe MH. Stock market reactions to presidential statements: Evidence from company-specific tweets. 2018.
  18. Althelaya KA, El-Alfy ESM, Mohammed S. Evaluation of bidirectional LSTM for short-and long-term stock market prediction. In: 2018 9th international conference on information and communication systems (ICICS). IEEE; 2018. p. 151–156.
  19. Rahman MO, Hossain MS, Junaid TS, Forhad MSA, Hossen MK. Predicting prices of stock market using gated recurrent units (GRUs) neural networks. Int J Comput Sci Netw Secur. 2019;19(1):213–222.
  20. Wu B, Wang L, Wang S, Zeng YR. Forecasting the US oil markets based on social media information during the COVID-19 pandemic. Energy. 2021;226:120403. pmid:34629690
  21. Wu B, Wang L, Zeng YR. Interpretable wind speed prediction with multivariate time series and temporal fusion transformers. Energy. 2022;252:123990.
  22. Bhandari HN, Rimal B, Pokhrel NR, Rimal R, Dahal KR. LSTM-SDM: An integrated framework of LSTM implementation for sequential data modeling. Software Impacts. 2022;14:100396.
  23. Dahal K, Dahal J, Banjade H, Gaire S. Prediction of Wine Quality Using Machine Learning Algorithms. Open Journal of Statistics. 2021;11(2):278–289.
  24. Jiawei X, Murata T. Stock market trend prediction with sentiment analysis based on LSTM neural network. In: International multiconference of engineers and computer scientists; 2019. p. 475–9.
  25. Samarawickrama A, Fernando T. A recurrent neural network approach in predicting daily stock prices an application to the Sri Lankan stock market. In: 2017 IEEE International Conference on Industrial and Information Systems (ICIIS). IEEE; 2017. p. 1–6.
  26. Li J, Bu H, Wu J. Sentiment-aware stock market prediction: A deep learning method. In: 2017 international conference on service systems and service management. IEEE; 2017. p. 1–6.
  27. Shen G, Tan Q, Zhang H, Zeng P, Xu J. Deep learning with gated recurrent unit networks for financial sequence predictions. Procedia Computer Science. 2018;131:895–903.
  28. Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078. 2014.
  29. Ju X, Chen VC, Rosenberger JM, Liu F. Fast knot optimization for multivariate adaptive regression splines using hill climbing methods. Expert Systems with Applications. 2021;171:114565.
  30. Ju X, Rosenberger JM, Chen VC, Liu F. Global optimization on non-convex two-way interaction truncated linear multivariate adaptive regression splines using mixed integer quadratic programming. Information Sciences. 2022;597:38–52.
  31. Lv SX, Wang L. Multivariate wind speed forecasting based on multi-objective feature selection approach and hybrid deep learning model. Energy. 2023;263:126100.
  32. Peng L, Wang L, Xia D, Gao Q. Effective energy consumption forecasting using empirical wavelet transform and long short-term memory. Energy. 2022;238:121756.
  33. Zhu Q, Che J, Li Y, Zuo R. A new prediction NN framework design for individual stock based on the industry environment. Data Science and Management. 2022;5(4):199–211.
  34. Adebiyi AA, Adewumi AO, Ayo CK. Comparison of ARIMA and artificial neural networks models for stock price prediction. Journal of Applied Mathematics. 2014;2014.
  35. Karmiani D, Kazi R, Nambisan A, Shah A, Kamble V. Comparison of predictive algorithms: backpropagation, SVM, LSTM and Kalman Filter for stock market. In: 2019 Amity International Conference on Artificial Intelligence (AICAI). IEEE; 2019. p. 228–234.
  36. Chen K, Zhou Y, Dai F. A LSTM-based method for stock returns prediction: A case study of China stock market. In: 2015 IEEE international conference on big data (big data). IEEE; 2015. p. 2823–2824.
  37. Roondiwala M, Patel H, Varma S. Predicting stock prices using LSTM. International Journal of Science and Research (IJSR). 2017;6(4):1754–1756.
  38. Yu P, Yan X. Stock price prediction based on deep neural networks. Neural Computing and Applications. 2020;32(6):1609–1628.
  39. Gao P, Zhang R, Yang X. The application of stock index price prediction with neural network. Mathematical and Computational Applications. 2020;25(3):53.
  40. Shahi TB, Shrestha A, Neupane A, Guo W. Stock price forecasting with deep learning: A comparative study. Mathematics. 2020;8(9):1441.
  41. Kara Y, Boyacioglu MA, Baykan ÖK. Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Systems with Applications. 2011;38(5):5311–5319.
  42. Schöneburg E. Stock price prediction using neural networks: A project report. Neurocomputing. 1990;2(1):17–27.
  43. Kohara K, Ishikawa T, Fukuhara Y, Nakamura Y. Stock price prediction using prior knowledge and neural networks. Intelligent Systems in Accounting, Finance & Management. 1997;6(1):11–22.
  44. Adebiyi AA, Ayo CK, Adebiyi M, Otokiti SO. Stock price prediction using neural network with hybridized market indicators. Journal of Emerging Trends in Computing and Information Sciences. 2012;3(1).
  45. Selvin S, Vinayakumar R, Gopalakrishnan E, Menon VK, Soman K. Stock price prediction using LSTM, RNN and CNN-sliding window model. In: 2017 international conference on advances in computing, communications and informatics (ICACCI). IEEE; 2017. p. 1643–1647.
  46. Yahoo! YF. https://finance.yahoo.com/. 2022.
  47. News RW. https://www.reddit.com/r/worldnews/. 2022.
  48. Finviz FV. https://finviz.com/news.ashx. 2022.
  49. Kaggle. kaggle.com. 2022.
  50. Hutto C, Gilbert E. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the International AAAI Conference on Web and Social Media. vol. 8; 2014.
  51. Liu B, et al. Sentiment analysis and subjectivity. Handbook of natural language processing. 2010;2(2010):627–666.
  52. Hochreiter S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 1998;6(02):107–116.
  53. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;9(8):1735–1780. pmid:9377276
  54. Gers FA, Schmidhuber J, Cummins F. Continual prediction using LSTM with forget gates. In: Neural Nets WIRN Vietri-99. Springer; 1999. p. 133–138.
  55. Gers FA, Schraudolph NN, Schmidhuber J. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research. 2002;3(Aug):115–143.
  56. Bhandari HN, Rimal B, Pokhrel NR, Rimal R, Dahal KR, Khatri RK. Predicting stock market index using LSTM. Machine Learning with Applications. 2022; p. 100320.
  57. Qiu J, Wang B, Zhou C. Forecasting stock prices with long-short term memory neural network based on attention mechanism. PloS one. 2020;15(1):e0227222. pmid:31899770
  58. Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J. LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems. 2016;28(10):2222–2232. pmid:27411231
  59. Lei J, Liu C, Jiang D. Fault diagnosis of wind turbine based on Long Short-term memory networks. Renewable Energy. 2019;133:422–432.
  60. Chollet F. Deep learning with Python. Simon and Schuster; 2017.
  61. Géron A. Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. O’Reilly Media; 2019.
  62. Chorowski J, Bahdanau D, Serdyuk D, Cho K, Bengio Y. Attention-based models for speech recognition. arXiv preprint arXiv:1506.07503. 2015.
  63. Wen TH, Gasic M, Mrksic N, Su PH, Vandyke D, Young S. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. arXiv preprint arXiv:1508.01745. 2015.
  64. Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies; 2016. p. 1480–1489.
  65. Agarap AFM. A neural network architecture combining (GRU) and support vector machine (SVM) for intrusion traffic data. In: Proceedings of the 2018 10th international conference on machine learning and computing; 2018. p. 26–30.
  66. Pokhrel NR, Dahal KR, Rimal R, Bhandari HN, Khatri Predicting NEPSE index price using deep learning models. Machine Learning with Applications. 2022; p. 100385.
  67. Hecke TV. Power study of anova versus Kruskal-Wallis test. Journal of Statistics and Management Systems. 2012;15(2-3):241–247.
  68. Marden JI. Positions and QQ plots. Statistical Science. 2004; p. 606–614.
  69. D’Agostino RB. Tests for the normal distribution. In: Goodness-of-fit techniques. Routledge; 2017. p. 367–420.
  70. Corder GW, Foreman DI. Nonparametric statistics: A step-by-step approach. John Wiley & Sons; 2014.
  71. Conover WJ. Practical nonparametric statistics. vol. 350. John Wiley & Sons; 1999.