A reservoir computing approach for forecasting and regenerating both dynamical and time-delay controlled financial system behavior.

Significant research in reservoir computing over the past two decades has revived interest in recurrent neural networks. Owing to its ingrained capability of performing high-speed, low-cost computations, the approach has become a popular choice for multi-variate complex systems with non-linear internal relationships. Modelling economic and financial trends has always been a challenging task owing to their volatile nature and non-linear dependence on associated influencers. Prior studies have aimed at effectively forecasting such financial systems, but have always left visible room for optimization in terms of cost, speed and modelling complexity. Our work employs a reservoir computing approach complying with echo-state network principles, along with varying strengths of time-delayed feedback, to model a complex financial system. The derived model is demonstrated to be robust to the influence of trends and other fluctuating parameters by effectively forecasting long-term system behavior. Moreover, it also re-generates the financial system unknowns with a high degree of accuracy when only limited future data is available, thereby becoming a reliable feeder for long-term decision making and policy formulation.


Introduction
Ever since their beginning, stock markets and financial systems have played a defining role in charting new growth trajectories and shaping economies worldwide. Being such critical influencers, gaining deep insights by dissecting historical data and international market behavior has long been a topic of interest. Once understood, the next step has always been to judiciously forecast the underlying economic data features, which are attributed with highly volatile characteristics. Achieving this gives a head-start and a unique competitive edge over other market participants. Moreover, these systems display erratic, non-linear behavior, for which recurrent neural networks (RNNs) are natural modelling candidates. RNNs have recently been successfully applied across differing applications in sentiment analysis [18,19], health care [20], adaptive control [21], robotics [22,23], speech recognition [24,25], financial forecasting [26,27], and numerous other time-series prediction [28,29] scenarios. Fig 1 shows the basic anatomy of a recurrent neural network. A high-level view shows an RNN to comprise three distinct layers of connected neurons. The first serves as a linear input layer for time-dependent multi-dimensional inputs, denoted by m(t). Through weighted synaptic connection paths, the information is then passed to the second layer, a pool, or reservoir, of cyclically connected neurons. This hidden reservoir, denoted by r(t), is dynamic in nature and is expected to absorb the temporal characteristics of the arriving inputs and project them into a high-dimensional reservoir state space, which enhances separability. This internal pool is in turn connected to the multi-dimensional output (readout) layer, denoted by q(t), which completes a basic recurrent neural network. Since the network follows the echo state property [10,14], the influence of past inputs on the internal reservoir states is guaranteed to fade away.
Contrary to other neural networks, which involve bi-directional weight and error updates across all layers in succession, reservoir computing adopts a very light-weight strategy that fixes the input and hidden (reservoir) layer weights. This leaves only the readout layer weights to be computed and trained, with the reservoir now behaving as a non-autonomous dynamical system. Such reservoir network characterization plays a defining role in the generated model behavior, along with desired optimizations on non-functional aspects like speed and computational cost. Owing to its simplicity and low computational cost, reservoir computing ranks very high in the list of preferred approaches for RNN training.

Financial system
Financial modelling and economic dynamics have gained acceleration over the last decade. Gaining complete insight and control over the core features of economic data, which exhibit unpredictable micro- and macroeconomic fluctuations, structural and statistical changes, erratic growth and coinciding streams of development, is a strenuous ask. Such irregular behavior is often attributed to external agents influencing the system, but researchers have nevertheless invested considerable effort and come up with stochastic analyses and non-linear characterizations for such systems, which are very essential from an economic perspective. Following suit, we consider an extremely challenging, non-linear chaotic finance model which involves interest rate, investment demand and price index as the underlying moving parameters. As presented by Ma and Chen in [30,31], the model can be described as a dimensionless system of first-order coupled non-linear differential equations, shown in Eq 1 below.

dX/dt = Z + (Y − s)X
dY/dt = 1 − cY − X²            (1)
dZ/dt = −X − eZ

where X is the interest rate, Y the investment demand, Z the price index, s the saving amount, c the cost per investment, and e the elasticity of demand in commercial markets.
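The paper's own simulations were implemented in MATLAB; purely as an illustrative sketch, a trajectory of the Eq 1 model can be generated with a fixed-step Euler integration in Python, assuming the standard Ma-Chen form of the system. The parameter values s, c, e and the step size below are placeholders, not the Table 1 values:

```python
def finance_rhs(X, Y, Z, s=3.0, c=0.1, e=1.0):
    """Right-hand side of the Ma-Chen finance model (Eq 1)."""
    dX = Z + (Y - s) * X        # interest rate dynamics
    dY = 1.0 - c * Y - X * X    # investment demand dynamics
    dZ = -X - e * Z             # price index dynamics
    return dX, dY, dZ

def simulate(steps=10000, dt=0.01, X=2.0, Y=3.0, Z=2.0):
    """Forward-Euler integration from the initial conditions used in the paper."""
    traj = []
    for _ in range(steps):
        dX, dY, dZ = finance_rhs(X, Y, Z)
        X, Y, Z = X + dt * dX, Y + dt * dY, Z + dt * dZ
        traj.append((X, Y, Z))
    return traj

traj = simulate()  # a "ChaoticInput1"-style series would be built from such a run
```

A higher-order integrator (e.g. RK4) would be preferable for production use; Euler is kept here only for brevity.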
In general, such a system displays complex chaotic behavior, as demonstrated by Chen [32] and Jain & Das [33]. The system is shown to have three equilibrium points, EP1, EP2 and EP3, which are either asymptotically stable or saddle points depending on the initial conditions mentioned in Eq 2. Given the dependency on forces external to the system under consideration, it becomes extremely important to have a methodology in place through which such chaotic dynamics can be controlled. By introducing a time-delayed feedback, as showcased by Chen [32] and Pyragas [34], such complex dynamical systems can be successfully transitioned to exhibit regular dynamics, thereby making them tractable, with widespread applicability in policy decision making and economic forecasting. The updated model, having time-delay parameters [32-34], can then be represented as in Eq 3, with each equation of Eq 1 augmented by a Pyragas-style delayed-feedback term:

dX/dt = Z + (Y − s)X + f1[X(t − τ1) − X(t)]
dY/dt = 1 − cY − X² + f2[Y(t − τ2) − Y(t)]            (3)
dZ/dt = −X − eZ + f3[Z(t − τ3) − Z(t)]

where fi is the feedback strength and τi the time delay for i = 1, 2, 3. Proper tuning of the feedback strength and time-delay parameters, easily ratified using bifurcation diagrams as in [33], is shown to have a stabilizing influence, and hence induces a regular, highly controlled system behavior. Based on the above, it is evident that the selection of an appropriate modelling technique is highly dependent on the data characteristics and patterns originating from the observed system. Theoretical modelling techniques can be easily applied to scenarios displaying regular and periodically aligned data. On the other hand, machine learning techniques gain much better ground and are better suited to handle non-linear, complex data owing to their inherent capability to learn, self-update and re-learn cyclically. The next section illustrates the specific setup, design, implementation and procedures for modelling the afore-mentioned financial system.
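The delayed-feedback variant can be simulated with the same fixed-step scheme by keeping a history buffer for the delayed state. The sketch below is our own illustration (not the paper's MATLAB code); it applies a Pyragas-type delayed-feedback term on the X equation only, mirroring the later experiments where only f1 is tuned while f2 = f3 = 0. The values of f1, τ1 and the model parameters are placeholders:

```python
from collections import deque

def simulate_delayed(steps=10000, dt=0.01, s=3.0, c=0.1, e=1.0,
                     f1=1.5, tau1=1.0, X=2.0, Y=3.0, Z=2.0):
    """Euler integration of the time-delayed model with feedback on X only
    (f2 = f3 = 0); f1, tau1 and s, c, e are illustrative placeholders."""
    lag = int(round(tau1 / dt))          # delay expressed in integration steps
    hist = deque([X] * lag, maxlen=lag)  # hist[0] approximates X(t - tau1)
    traj = []
    for _ in range(steps):
        X_delayed = hist[0]
        dX = Z + (Y - s) * X + f1 * (X_delayed - X)
        dY = 1.0 - c * Y - X * X
        dZ = -X - e * Z
        hist.append(X)                   # record X(t) before stepping forward
        X, Y, Z = X + dt * dX, Y + dt * dY, Z + dt * dZ
        traj.append((X, Y, Z))
    return traj

traj_controlled = simulate_delayed()
```

Note that the fixed-length deque makes the history initialization explicit: before t = τ1, the delayed value simply equals the initial condition.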

Data and procedure
In our present work, we consider the financial system detailed above. Since it exhibits both chaotic and regular behavior, we sought a common technique capable of addressing both behavior scenarios simultaneously. From the plethora of available approaches, we have restricted ourselves to the reservoir computing technique, since it is known to perform exceedingly well on non-linear time-series data irrespective of the application domain. Though it appears straightforward and conceptually simple, it still needs to be carefully implemented to achieve good performance figures. Lukoševičius [35] has presented a comprehensive view on efficiently configuring the reservoir. The intent of this work is not to validate the authenticity of the discussed financial model, but instead to predict its behavior correctly using the reservoir computing approach. This section thoroughly details the internals of configuring the light-weight reservoir computing setup, along with its stepwise processing and updates with respect to the complex input data.

Data-sets
As previously discussed, we use the financial system to derive the targeted model. Two input time series are generated corresponding to Eqs 1 and 3 respectively, using the initial conditions X0 = 2.0, Y0 = 3.0 and Z0 = 2.0, along with the control parameters deduced by [33] and captured in Table 1. As part of the current work, only the feedback strength f1 is tuned, while f2 and f3 are retained as 0. The first time series corresponds to complex chaotic data, with the system displaying non-linear behavior, while the second reflects a controlled, regular behavior of the same financial system acting under the influence of a properly configured time-delayed feedback. Henceforth, these two time-series inputs will be referred to as "ChaoticInput1" and "ControlledInput2".

Dimensioning the reservoir
To be truly efficient, the derived model needs to be agnostic to input values and, at the same time, able to service multi-dimensional temporal inputs and outputs. Our ESN-based reservoir computing setup therefore comprises an M-dimensional input, m(t), forming the input layer, which is internally fed to an R-dimensional cyclic reservoir network, r(t) (also called the hidden network layer). This hidden layer, in turn, is connected to a Q-dimensional output, q(t), which forms the readout/output layer. As with any neural network, the layer dimensionality maps directly to the number of neurons, implying M neurons in the initial input layer, R neurons forming the reservoir and Q neurons in the final readout layer. The efficiency of any neural-network-driven model is in principle governed by the number density of neurons and the synaptic connection weights configured between them. For the above-mentioned topology, based on the networks defined by Jaeger [10] and Lu [16], we have an input connection matrix W_input of size R × M, an internal reservoir connection matrix W_res of size R × R, and an output readout connection matrix W_readout of size Q × R. Certain scenarios may also require a feedback connection matrix W_fb of size R × Q, since it helps provide the (error) differential back from the outputs to the readout layer while training the reservoir.
The setup presented in this paper was designed and implemented in MATLAB. Our implementation considers a step-wise update s(t) for the reservoir states, of size R × Q, where each step maps to the pre-defined time instants of the training data-set. Mathematically, this step-wise update for the next RNN state is a function of the existing state, existing output and available inputs, represented by the function G in Eq 4:

s(t + 1) = G(s(t), q(t), m(t))   (4)
A more explicit form of this RNN step-update equation, derived from [16,36,37], can be written as Eq 5:

s(t + 1) = (1 − α) s(t) + α H(W_res s(t) + W_input m(t + 1) + W_fb q(t) + ηI + ϕI)   (5)

where α is the leaking rate of the reservoir, which helps regulate the speed of the reservoir update dynamics: a value of α → 1 speeds up the reservoir evolution, whereas α → 0 renders its updates slow. H is an activation function, preferably non-linear, since this helps attain saturation for large values; ηI is a noise term which helps immunize the reservoir against overfitting on the input data; and ϕI is a constant bias term, with ϕ a scalar constant and I the identity matrix. After the reservoir state update, the temporal output q(t + 1) is evaluated using Eq 6:

q(t + 1) = W_readout s(t + 1)   (6)
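One leaky state update followed by a readout can be sketched in a few lines of Python (the paper's implementation is in MATLAB; the shapes, seed, parameter values and the simplified bias/noise handling below are illustrative assumptions, and the feedback path is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
M, R, Q = 3, 400, 3                      # input, reservoir, output sizes
alpha, eta, phi = 0.3, 1e-4, 0.1         # leaking rate, noise scale, bias (illustrative)

W_in = rng.uniform(-0.5, 0.5, (R, M))
W_res = rng.uniform(-0.5, 0.5, (R, R))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))   # scale spectral radius to 0.9
W_out = np.zeros((Q, R))                 # to be trained later via ridge regression

def step(s, m):
    """One leaky-integrator reservoir update followed by the linear readout."""
    pre = W_res @ s + W_in @ m + eta * rng.standard_normal(R) + phi
    s_next = (1 - alpha) * s + alpha * np.tanh(pre)
    return s_next, W_out @ s_next

s = np.zeros(R)
s, q = step(s, np.array([2.0, 3.0, 2.0]))
```

Because tanh is bounded by 1, a single update from a zero state can move each reservoir unit by at most α, which is how the leaking rate throttles the update dynamics.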

Connection weight matrices
As for an ESN, the connection weight matrices W_input, W_res and W_fb are randomly defined and kept fixed from the start, based on which W_readout is derived during the training process. A procedure similar to that of Noah [37], with different tunings, is used for generating these matrices. Iterative simulations were performed during matrix generation, resulting in varying RMSE values, and the finalized weight matrix was chosen as the one yielding the most optimal RMSE value. W_input and W_fb generation involved a similar procedure, detailed below:

1. Initial parameters comprise the density (d_input, d_fb), d ∈ (0, 1), and the radius σ (σ_input, σ_fb)
2. Create an all-zero matrix of the correct dimensions, (R × M) or (R × Q)
3. For each matrix entry (x, y), draw a random number c ∈ (0, 1)
   a. If c < d, then matrix entry (x, y) = random number l ∈ (−σ, σ)
   b. If c ≥ d, then matrix entry (x, y) = 0

A different procedure was used for W_res generation, which included the following steps:

1. Initial parameters comprise the reservoir radius (σ_res)
2. Create an adjacency matrix of the correct size (R × R)
3. For each non-zero matrix entry (x, y), draw a random number c ∈ (0, 1)
   a. If c < 0.5, then matrix entry (x, y) = −σ_res
   b. If c ≥ 0.5, then matrix entry (x, y) = σ_res
4. With the above steps, we obtain the initial un-scaled matrix W_0
5. The final reservoir matrix W_res is obtained by re-scaling each entry as

   W_res = (ρ / |λ_max|) W_0

where ρ is the desired spectral radius and λ_max is the greatest-magnitude eigenvalue of W_0. The MATLAB functions rand() and sprand() were used for generating the random numbers and the adjacency matrix, respectively, in the above procedures.
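The two procedures above translate almost directly into code. The following is our own Python sketch (the paper used MATLAB's rand()/sprand()): W_input is built by the density/radius rule, and W_res by the sign-assignment-then-rescale rule; all parameter values passed at the bottom are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_input_matrix(R, M, density, sigma):
    """Entries are uniform in (-sigma, sigma) with probability `density`, else 0."""
    c = rng.random((R, M))
    l = rng.uniform(-sigma, sigma, (R, M))
    return np.where(c < density, l, 0.0)

def make_reservoir_matrix(R, density, sigma_res, rho):
    """Sparse +/- sigma_res entries, rescaled to spectral radius rho."""
    mask = rng.random((R, R)) < density            # adjacency structure (like sprand)
    signs = np.where(rng.random((R, R)) < 0.5, -sigma_res, sigma_res)
    W0 = mask * signs                              # initial un-scaled matrix
    lam_max = max(abs(np.linalg.eigvals(W0)))      # greatest-magnitude eigenvalue
    return (rho / lam_max) * W0

W_in = make_input_matrix(R=400, M=3, density=0.5, sigma=0.5)
W_res = make_reservoir_matrix(R=400, density=0.05, sigma_res=0.5, rho=0.9)
```

Since rescaling every entry by the same factor scales every eigenvalue identically, the resulting W_res has spectral radius exactly ρ.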

Teacher training and testing
Like any model-building procedure, this involves teacher-training the model for t1 reservoir update steps in the first leg, and testing the trained model during the remaining part (t2). If the model is properly trained, the predicted values must match the expected outputs. In other words, the inputs hold significance during model training, and gradually pass their influence on to the feedback path during the prediction process. This forms the basis for using a similar mechanism for generating both the W_input and W_fb weight matrices.
During the training procedure there is a brief period of transients (also referred to as dropouts), beyond which the readouts start displaying the systematic variations corresponding to the training sequence. The present work uses the ridge regression technique, described by Eq 7, for evaluating the weights of the readout connection matrix W_readout:

W_readout = V Sᵀ (S Sᵀ + βI)⁻¹   (7)

Here, β is the regularization constant, which helps make the network less susceptible to noise and overfitting [13], and S and V correspond to the reservoir state update vector and the teacher-training output vector, respectively (S and V exclude the dropout samples).
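A minimal ridge-regression readout fit can be written as below. This is a sketch under our own conventions (states stored column-wise in S, targets in V; β and the toy data are made up for the check, not taken from the paper):

```python
import numpy as np

def train_readout(S, V, beta=1e-6):
    """Ridge regression: W = V S^T (S S^T + beta I)^(-1),
    for states S of shape (R, T) and targets V of shape (Q, T)."""
    R = S.shape[0]
    return V @ S.T @ np.linalg.inv(S @ S.T + beta * np.eye(R))

# Toy check: with ample state samples and a tiny beta,
# the fit recovers a known linear map almost exactly.
rng = np.random.default_rng(1)
S = rng.standard_normal((50, 2000))
W_true = rng.standard_normal((3, 50))
V = W_true @ S
W_fit = train_readout(S, V)
err = float(np.max(np.abs(W_fit - W_true)))
```

In practice `np.linalg.solve` (or a Cholesky factorization) is preferred over an explicit inverse for numerical stability; the inverse is shown only to match Eq 7 term by term.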

Experiment and results
Our experiment considers three inputs, namely the interest rate (IR), investment demand (ID) and price index (PI), which results in three-dimensional input and output weight matrices. The number of reservoir nodes is one of the most critical hyperparameters, since this number is directly proportional to the overall computation spent on the experiment. There is no fixed value for it, and the optimal size is arrived at by running multiple iterations of the experiment with different values. A large reservoir size (more than 1000 nodes) is expected to result in better performance, but on the contrary it risks overfitting and requires appropriate regularization. Moreover, such higher values require large computational resources and are generally recommended only to make up for scenarios having very limited input training data. Simulations were performed with the reservoir size (R) ranging from 50 to 700 nodes, in incremental steps of 50 nodes. The choice of the other parameters, like the input radius (σ_input), reservoir density (d_res) and reservoir radius (σ_res), having similar values for both experiments, is also driven by thorough simulation and analysis based on the recommendations [35] for successfully applying ESNs. Finally, based on the entire hyper-parameter set detailed in Table 2, a suitable reservoir size of 400 neurons was determined. Similarly, the leaking rate (α) was also optimized for each of the differing activation functions.
Along with the above-defined parameters, the activation function plays a major role during the state-vector update stage. Different activation functions, like triangular, tanh, Mexican hat and sinc, have been examined in [38] for their effects on predictability in echo state networks. It is evident that activation functions with appropriate non-linearity show better performance than linear ones, and tanh is one of the most extensively used among the non-linear functions. As in [37], the present work also weighs the effect of four different activation functions on the financial system data. The four functions can be broadly categorized as linear and non-linear, with the non-linear category comprising the tanh, quadratic and purely non-linear functions, apart from the strictly linear function. The linear activation function, resulting from the Taylor expansion of tanh(x) at x = 1, is f(x) ≈ 0.76159 + 0.41997x. Similarly, the quadratic and strictly non-linear activation functions are represented by f(x) ≈ 0.76159 + 0.41997x − 0.31958x² and f(x) ≈ 0.76159 − 0.31958x² respectively. Configuration related to the teacher training, test data and step update size is given in Table 3.
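The four candidate activations can be written down directly, and the quoted coefficients can be checked: 0.76159 and 0.41997 are tanh(1) and 1 − tanh²(1), the value and first derivative of tanh at the expansion point x = 1 (a sketch in Python, matching the forms quoted above):

```python
import math

activations = {
    "tanh":      math.tanh,
    "linear":    lambda x: 0.76159 + 0.41997 * x,
    "quadratic": lambda x: 0.76159 + 0.41997 * x - 0.31958 * x * x,
    "nonlinear": lambda x: 0.76159 - 0.31958 * x * x,
}

# The linear/quadratic coefficients come from tanh's derivatives at x = 1:
t1 = math.tanh(1.0)     # value of tanh at 1
d1 = 1.0 - t1 * t1      # derivative of tanh at 1, since tanh'(x) = 1 - tanh(x)^2
```

Note that only the tanh entry saturates for large inputs; the polynomial surrogates diverge away from the expansion point, which is consistent with their weaker performance reported later.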
Utilizing the above hyper-parameter set and dataset distributions, the present study effectively demonstrates and consolidates two different sets of experiments. The first set involves prediction/forecasting of the entire financial time series based on a feedback-only driven reservoir (zero input during the test leg), and the second involves a partial re-generation of the financial time series based on the generic cause-effect model. These experiments are performed for both the "ChaoticInput1" and "ControlledInput2" datasets.

Experiment set 1
This comprises the most sought-after use case, which involves forecasting the irregular capital market, or financial system, behavior. Here, the entire financial system behavior needs to be predicted based on the trained model, and no external input is available after the teacher-training period. The model's response in forecasting the financial system is completely dependent on how well the reservoir learnt during training and on the current output (now passed back as feedback, or input). Hence, this is referred to as a feedback-only driven system. Figs 2a-2d and 3a-3d showcase the predicted system behavior using the "ChaoticInput1" and "ControlledInput2" datasets respectively. As evident from Fig 2, the reservoir can predict with higher accuracy when the activation function has a non-linear component, compared to the strictly linear activation function. This is determined from the trend and overlap between the predicted (red) and actual target (blue) time series in the figures. The reservoir retains the system behavior for a considerable time, beyond which its memory starts to fade and deviation is observed. In our case, memory fades much faster when using a purely linear component, as in Fig 2a. On the other hand, the models using non-linear components during activation continue to exhibit the expected behavior for a much longer duration. Though the predicted values are not exact (reflected as non-overlap in the figures), they mirror the system's behavioral pattern for a prolonged duration, as witnessed in Figs 2b-2d. The near conformance in output pattern even extends over the entire 2500-step test cycle for the X and Z components, as shown in Figs 2b-2d. As observed from the experimental results, among the non-linear activation functions tanh performs marginally better than the quadratic and strictly non-linear components.
This behavior is further ratified by Table 4, which contains the mean square error (MSE) values evaluated using Eq 8:

E(q, q̄) = (1/T) Σ_{t=1..T} (q(t) − q̄(t))²   (8)

where E(q, q̄) refers to the error E between the actual (q̄) and inferred (q) readout values, and T is the total number of steps over which the error is evaluated. 'tanh' reports the lowest and 'linear' the highest MSE values.
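The per-component error of Eq 8 is a plain mean square error over the T evaluated steps; as a trivial sketch (the sample values here are made up, not Table 4 data):

```python
def mse(inferred, actual):
    """Mean square error over T aligned readout samples (Eq 8)."""
    T = len(actual)
    return sum((q - qbar) ** 2 for q, qbar in zip(inferred, actual)) / T

error = mse([1.0, 2.0, 3.0], [1.0, 2.5, 3.0])
```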
A time-delayed financial system is seen to provide better results, both in terms of prediction accuracy and reservoir memory, as depicted in Figs 3a-3d. There exists only a minute difference in overlap between the predicted and actual outputs, and additionally this system is seen to predict precisely for a much longer term compared to the system devoid of time-delayed feedback. Beyond the behavioral pattern, the predicted values are a near-perfect match to the actual targets when non-linear components are used during activation. This is further corroborated by the very low MSE values in Table 4, where 'tanh' is again seen to perform best among all activations, thereby making it the preferred choice when designing and driving such cyclic reservoirs. These results clearly show the importance of non-linearity when modelling network reservoirs for predicting complex chaotic system behavior.

Experiment set 2
In addition to direct system forecasting, there may be practical situations where it is not required to fully predict such non-linear systems, i.e. limited future data for some components, or system variables, is already available. This necessitates effectively re-generating only the required information from the available data, and aligns well with the generic cause-effect model, wherein one dependent quantity (say, d1) can be derived from the available temporal values of another (say, d2) when d1 and d2 have an inter-dependency relationship. Mapping this onto our financial model, the reservoir computing approach provides a huge value-add by being able to behave as an input-driven reservoir for each of the three components IR (or X), ID (or Y) and PI (or Z) in turn, with the other two components being simultaneously read out from the system. Even a high-level analysis of Figs 4-7 is enough to conclude that activation functions having a non-linear component clearly out-perform the strictly linear activation function. The linear activation function is found to behave exceedingly well when correlating the X and Z components, as shown in Fig 4b and 4e. This is deduced from the very marginal difference between the actual (blue line) and inferred (red line) outputs, i.e. the Z component is precisely predicted from the X-driven reservoir, and the X component is precisely predicted from the Z-driven reservoir. This strong correlation can be attributed to their one-to-one dependence, as evident from Eq 1. The introduction of the Y component, either for driving the reservoir or during inference, does not generate the expected results, thereby indicating the requirement for a non-linear component, which to an extent helps drive the system behavior towards saturation.
From Figs 5-7 it is observed that the X-driven and Z-driven reservoir systems could predict the remaining components to a high degree of accuracy. This is verified by the overlap between the actual and inferred components. On the other hand, it is evident that Y-driven reservoirs fail to precisely predict the other components, and hence observability does not apply here. Though not applicable here, in certain systems an ambiguity in states may result from changes in the signs of the X and Z components, as shown by Lu [16] while applying observability to chaotic dynamical systems.
Another set of inference results, depicted in Figs 8-11 are generated using the same control parameters defined in Tables 1-3 and "ControlledInput2" data which corresponds to the controlled, regularized financial system owing to the calculated induction of time-delayed feedback.

PLOS ONE
Reservoir computing and chaotic financial system behavior

These figures comprise three sets of plots, with each set corresponding to one component driving the reservoir and the remaining two components being inferred from the model. As opposed to the original, chaotically behaved financial system, the strictly linear activation function is found to perform much better now, but it is still unable to match those having a non-linear component. As before, the linear activation function is found to behave exceedingly well when correlating the X and Z components, as shown in Fig 8b and 8e. Owing to the newly regularized nature of the system, the introduction of the Y component, for driving the reservoir or during inference, is also able to predict much better, thereby maintaining a high degree of congruency with the actual data. It is evident from Figs 9-11 that the X-driven, Y-driven or Z-driven reservoir systems could predict the remaining components with a high degree of accuracy. It is easily discernible that the overlap between the actual and inferred components is far better here when compared to Figs 4-7. This confirms the theoretical expectation that prediction quality must be far higher for systems displaying regular, controlled behavior than for chaotic dynamical systems. This is further substantiated by the relatively low mean-square error (MSE) values in Tables 4 and 5.
Comparing Tables 5 and 6, it is observed that the MSE values are relatively lower for the controlled system, implicitly indicating a better approximation than for the chaotic, dynamical system. Based on the above demonstrated results and their interpretation, it is justified to describe reservoir computing as a highly versatile, low-cost and scalable machine-learning technique which can be trained to infer the control states of systems requiring complex temporal processing capabilities. This should help it gain universal acceptance for efficiently reconstructing many rapidly evolving and complex system behaviors.

Conclusions
In the present work, we provide insights on the effectiveness of applying reservoir computing (RC) to complex, dynamical financial systems. Figs 2 and 3, generated using Experiment set 1, readily demonstrate the RC capability to forecast the entire system behavior by executing in a feedback-only mode. The predicted values in Fig 2a-2d are well aligned to the chaotic financial system's behavioral patterns, but become a near replica in Fig 3a-3d when the system is under the influence of a suitable time-delayed feedback. More importantly, the controlled system demonstrates a very high degree of memory retention by mirroring the target outputs over the entire 2500-step test cycle, beyond which deviation may be observed. The mean square error (MSE) values reported in Table 4 further confirm that a chaotic system under the influence of suitable time-delayed feedback is predicted much better, and for a longer period, than the same chaotic system devoid of time-delayed feedback. Additionally, RC proves to be a robust and self-sufficient technique which can effectively re-generate parts of such non-linear systems when only limited future data is available for some system components. This is evident from Figs 4-11, generated as part of Experiment set 2, wherein the economic indices are inferred from a limited set of continuously available system information. The efficiency of this approach is further corroborated by the low MSE values depicted in Tables 5 and 6. Moreover, the results show that activation functions having a non-linear component have better inference capability than strictly linear functions for such dynamical systems. In our case, tanh is seen to outperform the purely non-linear, quadratic and purely linear activation functions.
Attaining such controlled accuracy enables governments and other policy-making bodies to adopt measures even when the scope of intervention is limited, in addition to providing the capability to take timely, well-informed and globally beneficial decisions.
As a next step, it will be beneficial to observe the same financial system's behavior while tuning all the feedback strength coefficients (f1, f2, f3), along with applying multi-layered, or deep, reservoir networks. This should help characterize the complexity and computation trade-off existing between single and multi-layered reservoirs for such non-linear, complex dynamical systems.