Compound distributions for financial returns

  • Emmanuel Afuecheta,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources

    Affiliation Department of Mathematics and Statistics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia

  • Artur Semeyutin,

    Roles Data curation, Formal analysis, Methodology, Project administration

    Affiliation School of Economics, Finance and Accounting, Coventry University, Coventry, United Kingdom

  • Stephen Chan,

    Roles Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Visualization, Writing – original draft, Writing – review & editing

    schan@aus.edu

    Affiliation Department of Mathematics and Statistics, American University of Sharjah, Sharjah, UAE

  • Saralees Nadarajah,

    Roles Validation

    Affiliation Department of Mathematics, University of Manchester, Manchester, United Kingdom

  • Diego Andrés Pérez Ruiz

    Roles Conceptualization, Data curation, Formal analysis

    Affiliation Department of Mathematics, University of Manchester, Manchester, United Kingdom

Abstract

In this paper, we propose six Student’s t based compound distributions where the scale parameter is randomized using functional forms of the half normal, Fréchet, Lomax, Burr III, inverse gamma and generalized gamma distributions. For each of the proposed distributions, we give expressions for the probability density function, cumulative distribution function, moments and characteristic function. GARCH models with innovations taken to follow the compound distributions are fitted to the data using the method of maximum likelihood. For the sample data considered, we see that all but two of the proposed distributions perform better than two popular distributions. Finally, we perform a simulation study to examine the accuracy of the best performing model.

1 Introduction

The Student’s t distribution due to Gosset [1] is the most common and parsimonious model for economic and financial data [2, 3]. It not only offers the potential to fit the leptokurtic properties of financial data but can also serve as a foundation for building complex statistical models that can describe more subtle features of financial data such as volatility clustering. In recent times, many notable modifications to its functional form have been proposed; for example, see Hansen [4], Fernández and Steel [5], Theodossiou [6], Jones and Faddy [7], Sahu et al. [8], Bauwens and Laurent [9], Aas and Haff [10], Zhu and Galbraith [11, 12] and Papastathopoulos and Tawn [13]. They have been applied beyond Bayesian finite and infinite variance models [14], Markov regime switching models [15] as well as multivariate stochastic volatility models [16]. A detailed review of various modifications of the Student’s t distribution is provided by Li and Nadarajah [17], but the list is still by no means complete.

One popular generalization of the Student’s t distribution, often recommended for risk quantification in finance as noted by McNeil et al. [18], is the generalized hyperbolic distribution (GHYP) due to Barndorff-Nielsen [19]. The GHYP distribution offers a flexible functional form and possesses a number of attractive properties. For instance, the GHYP distribution can be both symmetric and skewed, is classified as a normal mean-variance mixture distribution and has the Student’s t as one of its special cases. Normal mean-variance mixture distributions are also widespread. For example, mixing of this type can be traced back to Press [20] and Praetz [21], followed by Andrews and Mallows [22], Barndorff-Nielsen [19], Barndorff-Nielsen et al. [23], Kon [24], West [25], Madan and Seneta [26], Madan et al. [27], Tjetjep and Seneta [28], Luciano and Semeraro [29], Geweke and Amisano [30], Nadarajah [31], among others. Typically, this class of models (compound distributions) captures heterogeneous characteristics of financial data by randomizing one of the parameters (often the scale parameter) of the parent distribution with appropriate mixing distributions; for example, see McDonald and Butler [32], Hoogerheide et al. [33] and Ardia et al. [34, 35].

Recently, Afuecheta et al. [36], unlike the previous compositions which are based on the normal distribution, introduced mixture models based on scale mixing of the Student’s t distribution by specifically focusing on the leptokurtic properties of financial data. In particular, they provided flexible compositions of the Student’s t with three mixing distributions: exponential, Weibull and gamma. Their models were shown to provide better fits than some of the popular and more complicated generalizations of the Student’s t distribution, including the GHYP distribution. Hence, given good empirical performance of these models and because of the increasing interest in terms of methodology and applications, we extend this work by considering six mixing distributions. We proceed with the assumption that the conditional distribution for financial returns follows the Student’s t distribution. The variance (volatility) of returns is assumed to follow any of the six mixing distributions: one parameter half normal, two parameter Fréchet, two parameter Lomax, two parameter Burr III, two parameter inverse gamma and three parameter generalized gamma distributions. With these distributions, our research offers six new compound distributions.

The primary objectives of this paper are: (i) to propose six new compound distributions based on the Student’s t distribution; (ii) to illustrate applications of these distributions using real financial data sets; (iii) to compare the proposed distributions with two of the most popular parametric distributions used in finance: the GHYP distribution and the asymmetric Student’s t (AST) distribution due to Zhu and Galbraith [11, 12]. For each of the proposed compound distributions, we provide its probability density function (PDF), cumulative distribution function (CDF), moments and characteristic function. We perform our estimations using the method of maximum likelihood (ML). For the samples considered, empirical comparisons are made using a common set of log-likelihood based criteria. We show that all but one of the proposed distributions perform better than the GHYP distribution under the selection criteria. We also show that all but two of the proposed distributions perform better than the AST distribution under the selection criteria.

The rest of this paper is organized as follows. In Section 2 and corresponding subsections, we present the general form of the proposed distributions; Section 3 describes the data, conducts some exploratory analysis linked to the proposed distributions and outlines evaluation criteria; the results and their discussion are given in Section 4. In Section 5, we conduct a simulation study to assess the performance of the ML estimators with respect to sample size n and to demonstrate the ability of the best performing model. The simulation study also helps to evaluate the uncertainty surrounding the parameters of the best performing model, which ensures that the results obtained are reproducible if the same model is applied to the same data sets, but at a different time interval; finally, Section 6 concludes and summarises our work.

Two of the data sets used are data on cryptocurrencies. There are many papers on risk estimation for cryptocurrency data. Most notable papers include Acereda et al. [37], Trucios et al. [38] and Jimenez et al. [39].

2 Compound distributions

In this section, we begin by writing down the general form of the proposed distributions. Let X denote a continuous random variable representing the observed financial data series; in our case, log-returns of two financial stock indices, two fuel commodities and two cryptocurrency exchange rates. We assume that the conditional asset return distribution is Student’s t with the PDF given by (1) for − ∞ < x < ∞, where σ² > 0.

Now, assuming that the variance σ² itself is a random variable with PDF given by g(σ²), the unconditional/actual stock return distribution will be given by the PDF (2). For convenience, we shall let σ² = τ and rewrite Eq (2) as (3).
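For orientation, the set-up just described can be written out under the standard scaled Student’s t parametrization with ν degrees of freedom. This is a sketch under that assumption and not a reproduction of the article’s Eqs (1) to (3), whose exact notation and normalization may differ.

```latex
% Assumed conditional density: Student's t with \nu degrees of freedom and scale \sigma
f\left(x \mid \sigma^2\right)
  = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\sigma\,\Gamma\left(\frac{\nu}{2}\right)}
    \left(1 + \frac{x^2}{\nu\sigma^2}\right)^{-\frac{\nu+1}{2}},
  \qquad -\infty < x < \infty,

% Unconditional density: average the conditional density over the random variance
f(x) = \int_0^\infty f\left(x \mid \sigma^2\right) g\left(\sigma^2\right) \, d\sigma^2,

% The same integral after the substitution \sigma^2 = \tau
f(x) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
       \int_0^\infty \tau^{-\frac{1}{2}}
       \left(1 + \frac{x^2}{\nu\tau}\right)^{-\frac{\nu+1}{2}} g(\tau) \, d\tau .
```

Written this way, each choice of mixing density g only changes the final integral, which is why a single general form can serve all six compound distributions.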

By making use of the series expansion we can further simplify (3) as (4), where (a)_k = a(a + 1) ⋯ (a + k − 1) denotes the ascending factorial. Eq (4) is in its general form and shall be used to provide distributions for log-returns of our financial series. The general form of the CDF of X corresponding to (4) can be derived as (5) for − ∞ < x < ∞. By making use of the series expansion we can further simplify (5) as (6)

The general form of the kth moment of X can be expressed as (7) provided that 0 < k < ν. The general form of the characteristic function of X can be expressed as (8) where and Kν(⋅) denotes the modified Bessel function of the third kind defined by

By making use of the series expansion we can further simplify (8) as (9)

Having obtained the general expressions for the PDF given by (4), the CDF given by (6), the kth moment given by (7), and the characteristic function given by (9), we shall proceed to obtain expressions for any given mixing distribution, g(⋅). The choice of the mixing distributions (two parameter inverse gamma, two parameter Lomax, generalized gamma, two parameter Burr III, two parameter Fréchet and one parameter half normal) is motivated by Fig 1, which shows the histograms of the volatility for the financial series considered in Section 3. The volatility is measured by the standard deviation taken over non-overlapping windows of length 50 days. From Fig 1, we see that g(⋅) corresponds to an exponential-type family of distributions with unimodal PDF, suggesting overall appropriateness of the choices. Notably, the following procedure was used for the choice of g(⋅): (i) fit each considered form of g(⋅) to the standard deviation series by maximum likelihood; (ii) select the best performing g(⋅) based on the lowest negative log-likelihood and provide the best fitting parametric outcome for each histogram shown in Fig 1. With this, we observe that the volatility for stock indices and cryptocurrencies is best described by the generalized gamma PDF.
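The two-step selection procedure above can be sketched in a few lines. The following is a minimal illustration, not the authors’ code (their computations were done in R): it computes standard deviations over non-overlapping 50-day windows and then picks the candidate with the lowest negative log-likelihood. Only two candidates with closed-form ML estimates (exponential and a standard half normal) are included, the function names are hypothetical, and the input series is simulated stand-in data.

```python
import math
import random


def window_sds(returns, window=50):
    """Sample standard deviation over consecutive non-overlapping windows."""
    sds = []
    for start in range(0, len(returns) - window + 1, window):
        chunk = returns[start:start + window]
        mean = sum(chunk) / window
        var = sum((r - mean) ** 2 for r in chunk) / (window - 1)
        sds.append(math.sqrt(var))
    return sds


def nll_exponential(xs):
    """Negative log-likelihood at the closed-form MLE (rate = 1 / sample mean)."""
    n = len(xs)
    total = sum(xs)
    rate = n / total
    return -(n * math.log(rate) - rate * total)


def nll_half_normal(xs):
    """Negative log-likelihood at the closed-form MLE (s^2 = mean of x^2)."""
    n = len(xs)
    sum_sq = sum(x * x for x in xs)
    s2 = sum_sq / n
    return -(0.5 * n * math.log(2.0 / (math.pi * s2)) - sum_sq / (2.0 * s2))


def best_mixing_form(sds, candidates):
    """Return (name, nll) of the candidate with the lowest negative log-likelihood."""
    scores = {name: nll(sds) for name, nll in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical stand-in data: 1000 simulated daily log-returns.
random.seed(1)
fake_returns = [random.gauss(0.0, 0.01) for _ in range(1000)]
sds = window_sds(fake_returns, window=50)
candidates = {"exponential": nll_exponential, "half normal": nll_half_normal}
best_name, best_nll = best_mixing_form(sds, candidates)
```

Candidates without closed-form estimates, such as the generalized gamma, would need a numerical optimizer in place of the two helper functions; the selection step itself stays the same.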

Fig 1. Histogram of standard deviations computed over non-overlapping windows of length 50 days for the specified daily log-returns (S&P500, DJI, Diesel, Propane, BTC and LTC).

https://doi.org/10.1371/journal.pone.0239652.g001

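The calculations in the following sections make use of two special functions: the generalized hypergeometric function defined by
$$ {}_pF_q\left(a_1, \ldots, a_p; b_1, \ldots, b_q; x\right) = \sum_{k=0}^{\infty} \frac{(a_1)_k \cdots (a_p)_k}{(b_1)_k \cdots (b_q)_k} \frac{x^k}{k!} $$
and the Wright [40] generalized hypergeometric function defined by
$$ {}_p\Psi_q\left[(a_1, A_1), \ldots, (a_p, A_p); (b_1, B_1), \ldots, (b_q, B_q); z\right] = \sum_{n=0}^{\infty} \frac{\prod_{j=1}^{p} \Gamma\left(a_j + A_j n\right)}{\prod_{j=1}^{q} \Gamma\left(b_j + B_j n\right)} \frac{z^n}{n!}. $$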

The properties of these special functions can be found in Prudnikov et al. [41], Gradshteyn and Ryzhik [42], Mathai and Saxena [43] and Srivastava et al. [44].

2.1 Two parameter inverse gamma: With g taking the form

for τ > 0, α > 0 and β > 0. Note that β and α are the scale and shape parameters, respectively. This PDF has a unique mode (at τ = β/(α + 1)) and is moderately skewed to the right. It can be used to describe a wide range of physical phenomena in diverse disciplines, including climatology, reliability, option pricing, economics, finance and survival analysis. See Bouchaud and Potters [45] for some applications of the inverse gamma distribution to stock returns. For the two parameter inverse gamma distribution,

Hence, from (4), (6), (7), and (9) we obtain the closed form expressions for the PDF, CDF, moments and characteristic function as (10) (11) and respectively.

2.2 Two parameter Lomax: With g taking the form

for τ > 0, β > 0 and α > 0. The scale and shape parameters are respectively governed by β and α. This PDF has a unique mode at zero. It is notable for characterizing business failure. As a distribution within the Pareto family, it has often been used in modelling tail losses of returns. In fact, this distribution is also known as the type II Pareto distribution and is a special case of the generalized Pareto distribution. It has also been used extensively in analyzing lifetime data. See Benckert and Jung [46], Revankar et al. [47], Arnold [48], Hogg and Klugman [49] and Nair and Hitha [50] for some applications of the Lomax distribution. For the two parameter Lomax distribution,

Hence, from (4), (6), (7), and (9) we obtain the closed form expressions for the PDF, CDF, moments and characteristic function as (12) (13) and respectively.

2.3 Generalized gamma: With g taking the form

for τ > 0, β > 0, λ > 0 and α > 0. The scale, first shape and second shape parameters are respectively given by β, λ and α. This PDF has a unique mode and is skewed to the right. The generalized gamma distribution has extensive applications in different areas, including hydrology, water resources, biology, and economics. It encompasses a number of other distributions often used in survival analysis. For example, if λ = α = 1 then the generalized gamma distribution becomes the exponential distribution; if λ = 1 it reduces to the gamma distribution; and if α = 1 it becomes the Weibull distribution. For applications of this family of distributions to stock returns, see Madan and Seneta [26] and Tjetjep and Seneta [28]. For the generalized gamma distribution,

Hence, from (4), (6), (7), and (9) we obtain the closed form expressions for the PDF, CDF, moments and characteristic function as (14) (15) and respectively.

2.4 Two parameter Burr III distribution: With g taking the form

for τ > 0, c > 0 and λ > 0. Both parameters, c and λ, are commonly referred to as shape parameters. This distribution has a unique mode and is moderately skewed to the right. The Burr distribution is a popular distribution in statistics. It is often used in reliability analysis as a more flexible alternative to competing distributions such as the lognormal. It also has a wide range of applications in other areas such as forestry and meteorology. For the two parameter Burr III distribution,

Hence, from (4), (6), (7), and (9) we obtain the closed form expressions for the PDF, CDF, moments and characteristic function as (16) (17) and respectively.

2.5 Two parameter Fréchet: With g taking the form

for τ > 0, α > 0 and β > 0. This distribution has a unique mode and is skewed to the right. The shape and scale parameters are, respectively, governed by α and β. The distribution is also known as the inverse Weibull distribution because if 1/Ω has the Weibull distribution then Ω will have the Fréchet distribution. It is a special case of the generalized extreme value distribution, which is widely used in the characterization of “tail risks” in fields ranging from insurance to finance. Some other application areas of the Fréchet distribution include business and operations research, economics, hydrology, materials and product technology. For the two parameter Fréchet distribution,

Hence, from (4), (6), (7), and (9) we obtain the closed form expressions for the PDF, CDF, moments and characteristic function as (18) (19) and respectively.

2.6 One parameter half normal: With g taking the form

for τ > 0 and θ > 0. The half normal distribution is a normal distribution with scale parameter θ bounded from below at zero. Its applications cut across many areas. For instance, see Meeusen and van Den Broeck [51] and Chou and Liu [52] for applications of the half normal distribution in production processes; Lawless [53] and Cooray and Ananda [54] for applications in life data analysis; Dobzhansky and Wright [55] for applications in genetics; and Bland and Altman [56] for applications in biological sciences. For the one parameter half normal distribution,

Hence, from (4), (6), (7), and (9) we obtain the closed form expressions for the PDF, CDF, moments and characteristic function as (20) (21) and respectively.

2.7 Skewness and kurtosis

By definition, each of the six compound distributions has zero skewness. The kurtosis values can be computed using (11), (13), (15), (17), (19) and (21). These values versus the degrees of freedom parameter, ν, are shown in Fig 2.

Fig 2. Kurtosis values corresponding to (11) (top left), (13) (top right), (15) (middle left), (17) (middle right), (19) (bottom left) and (21) (bottom right) versus ν and selected values of other parameters.

https://doi.org/10.1371/journal.pone.0239652.g002

We see that kurtosis is a decreasing function of ν for each compound distribution. The kurtosis for each distribution takes larger values compared to the Student’s t distribution; hence, they are more flexible with respect to heavy tailed data. Over the plotted range, the compound Lomax distribution takes the largest kurtosis values. We note further that the kurtosis is a decreasing function of: α for the compound inverse gamma distribution; α for the compound Lomax distribution; λ for the compound generalized gamma distribution; c for the compound Burr distribution; α for the compound Fréchet distribution.
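The observation that each compound distribution has larger kurtosis than the Student’s t distribution can be checked with a short calculation. The sketch below assumes the representation X = √τ T, where T is a standard Student’s t variable with ν > 4 degrees of freedom, independent of the mixing variable τ, and E[τ²] is finite. These assumptions are made here for illustration and are not quoted from the article.

```latex
% Conditional moments of X = \sqrt{\tau}\, T given \tau
E\left[X^2 \mid \tau\right] = \tau \, \frac{\nu}{\nu - 2},
\qquad
E\left[X^4 \mid \tau\right] = \tau^2 \, \frac{3\nu^2}{(\nu - 2)(\nu - 4)},

% Kurtosis of the compound distribution after averaging over \tau
\frac{E\left[X^4\right]}{\left(E\left[X^2\right]\right)^2}
  = \frac{3(\nu - 2)}{\nu - 4} \cdot \frac{E\left[\tau^2\right]}{\left(E[\tau]\right)^2}
  \;\geq\; \frac{3(\nu - 2)}{\nu - 4},

% because E[\tau^2] \geq (E[\tau])^2 by Jensen's inequality
```

The factor 3(ν − 2)/(ν − 4) is the kurtosis of the Student’s t distribution itself, so under these assumptions the mixing can only increase it, with equality only when τ is degenerate.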

For details about how skewness and kurtosis can be used to improve model fitting and forecasting performance, see Feunou et al. [57] and Lalancette and Simonato [58].

3 Data

To investigate the empirical performance of the proposed distributions, we consider six popular financial series. These include two financial stock indices, two fuel commodity prices and two cryptocurrency exchange rates. The stock indices are Standard & Poor’s 500 (S&P500) and Dow Jones Industrial Average (DJI) for the period starting from the 28th of April 2003 to the 15th of June 2018 as provided by Bloomberg. The fuel commodities are spot prices for the Los Angeles Ultra-Low-Sulfur Diesel (Diesel) and Mont Belvieu, Texas Propane (Propane) in USDs per gallon for the period starting from the 2nd of January 1997 to the 15th of June 2018 as provided by the United States Energy Information Administration. The cryptocurrencies are Bitcoin (BTC) for the period starting from the 18th of July 2010 to the 16th of June 2018 and Litecoin (LTC) for the period starting from the 24th of October 2013 to the 16th of June 2018. Both cryptocurrencies are denominated in USD, with their sample sizes representing their entire life cycle at the time the data were downloaded from Quandl, BNC2 database. For more extensive discussion on cryptocurrencies see Chan et al. [59] and the references therein. In general, there is no specific rationale for composing our data set, though alongside some very common stock indices (S&P500 and DJI) we aim to have some financial series with notable tail (excess kurtosis) characteristics (Propane and LTC), since our work is partially motivated by the heavy tail potential of the parent distribution of the compound distributions. For the above discussed financial series, we computed log-returns as $R_{i,t} = \ln\left(P_{i,t} / P_{i,t-1}\right)$, where $R_{i,t}$ is the return on the index i for the period t, $P_{i,t}$ is the closing rate/price of the index at the end of period t and $P_{i,t-1}$ is the price of the index at the end of the period t − 1. The histograms of the transformed data and their kernel density estimates are shown in Fig 3. Their characteristics described in Table 1 are: minimum, first quartile (Q1), median, mean, third quartile (Q3), maximum, skewness, kurtosis, standard deviation (SD), variance, range and inter quartile range (IQR).
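As a small illustration of the data preparation step, the sketch below computes log-returns and a few of the summary statistics listed above. It assumes the usual definition of a log-return as the logarithm of the ratio of consecutive prices, uses moment-based skewness and kurtosis (the article does not state which estimators were used), and the function names are hypothetical.

```python
import math


def log_returns(prices):
    """Log-returns ln(P_t / P_{t-1}) for a sequence of positive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def summary(xs):
    """Selected summary statistics; skewness and kurtosis are moment-based."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return {
        "min": min(xs),
        "max": max(xs),
        "range": max(xs) - min(xs),
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),  # sample SD with n - 1 divisor
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,           # equals 3 for a normal distribution
    }


# Hypothetical closing prices, for illustration only.
example_prices = [100.0, 101.5, 99.8, 100.2, 102.0]
example_stats = summary(log_returns(example_prices))
```

The kurtosis returned here is not excess kurtosis, so heavy-tailed series are those with values above 3.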

Fig 3. Time series plots of the daily log-returns of S&P500, DJI, Diesel, Propane, BTC and LTC with their histograms and kernel based density estimates.

https://doi.org/10.1371/journal.pone.0239652.g003

Table 1. Summary statistics of daily log-returns of S&P500, DJI, Diesel, Propane, BTC and LTC.

https://doi.org/10.1371/journal.pone.0239652.t001

From Table 1, we observe that the range is highest for the cryptocurrency returns, followed by the commodities, and smallest for the stock indices. Fig 3 shows the time series plots of returns, which appear to oscillate around zero. The oscillations vary a great deal in magnitude, but are almost constant on average over the period of the study. Also, from the plot we observe that for each return, periods of high volatility are followed by periods of low volatility and vice versa. This is not surprising as it is typical of financial indices [60–62]. Notably, from Fig 3, there is evidence of sharp market corrections for Diesel and Propane in the early 2000s. This could be explained by the changes in the fundamentals of hydrocarbons, while the lack of clearly defined fundamentals best explains the highest range for the cryptocurrencies. For the stock indices, the pronounced spikes around 2008 could be attributed to the events of the financial crisis, while their lowest range may be explained by their composite nature. The highest kurtosis values are shown by Propane and LTC. For S&P500, DJI, Diesel and BTC, the kurtosis values are similar and are greater than that of the normal distribution. All the returns under investigation are clearly heavy tailed. Diesel and LTC are the only two positively skewed series. Overall, inspecting the histograms in Fig 3, we note that each return series appears more or less symmetrically distributed around zero, with the exception of Propane.

Finally, we proceed to fit GARCH versions of the proposed distributions in Section 2 to the six data sets using the method of ML. Formally, suppose x1, x2, ⋯, xn are independent observations with PDF f(⋅; Θ). Then the optimal parameters are the values maximizing the likelihood or, in most cases due to computational convenience, the log-likelihood $\ln L(\Theta) = \sum_{i=1}^{n} \ln f(x_i; \Theta)$, where Θ = (θ1, …, θk)′ is the parameter vector. Consequently, the optimal estimates for Θ are $\hat{\Theta} = \arg\max_{\Theta} \ln L(\Theta)$. All our computations were performed using the standard Nelder-Mead optimization routine with the optim command in R as provided by R Core Team [63].
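To make the objective function concrete, here is a minimal sketch of a GARCH(1, 1) negative log-likelihood of the kind a Nelder-Mead routine would minimize. It is not the authors’ implementation: unit-variance Student’s t innovations stand in for the compound densities (which are not reproduced here), returns are assumed to have zero conditional mean, the recursion is started at the sample mean of squared returns, and all names are hypothetical.

```python
import math


def student_t_logpdf(z, nu):
    """Log-density of a Student's t variable rescaled to unit variance (nu > 2)."""
    scale = math.sqrt((nu - 2.0) / nu)  # Z = scale * T has variance 1
    t = z / scale
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi)
            - (nu + 1.0) / 2.0 * math.log1p(t * t / nu)
            - math.log(scale))


def garch11_neg_loglik(params, returns):
    """Negative log-likelihood of a zero-mean GARCH(1, 1) model.

    params = (omega, alpha, beta, nu), with conditional variance
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}.
    """
    omega, alpha, beta, nu = params
    # Reject parameter values outside the admissible region.
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1 or nu <= 2:
        return math.inf
    # Assumed starting value: sample mean of squared returns.
    sigma2 = sum(r * r for r in returns) / len(returns)
    total = 0.0
    for r in returns:
        # Density of r_t = sigma_t * z_t is f(r_t / sigma_t) / sigma_t.
        total += student_t_logpdf(r / math.sqrt(sigma2), nu) - 0.5 * math.log(sigma2)
        sigma2 = omega + alpha * r * r + beta * sigma2
    return -total


# Hypothetical returns and parameter values, for illustration only.
example_returns = [0.010, -0.020, 0.015, -0.005, 0.002]
example_nll = garch11_neg_loglik((1e-5, 0.05, 0.90, 6.0), example_returns)
```

Returning infinity for inadmissible parameters is one simple way to keep a derivative-free optimizer inside the admissible region; swapping in a different innovation density only requires replacing `student_t_logpdf`.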

Since the considered distributions are not nested, discrimination among them is performed using the Akaike information criterion (AIC) due to Akaike [64], the Bayesian information criterion (BIC) due to Schwarz [65], the corrected Akaike information criterion (AICc) due to Hurvich and Tsai [66], the Hannan-Quinn criterion (HQC) due to Hannan and Quinn [67], and the consistent Akaike information criterion (CAIC) due to Bozdogan [68]. Extensive discussion on these commonly used criteria is provided by Burnham and Anderson [69] and Fang [70]. Roughly speaking, the smaller the values of these criteria the better the fit.
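For reference, the five criteria can be computed from a maximized log-likelihood, the number of estimated parameters k and the sample size n. The sketch below uses the standard textbook forms of these criteria; the function name and the example numbers are hypothetical and not taken from the article’s tables.

```python
import math


def information_criteria(loglik, k, n):
    """Standard forms of AIC, BIC, AICc, HQC and CAIC (smaller is better)."""
    aic = 2.0 * k - 2.0 * loglik
    return {
        "AIC": aic,
        "BIC": k * math.log(n) - 2.0 * loglik,
        "AICc": aic + 2.0 * k * (k + 1.0) / (n - k - 1.0),
        "HQC": 2.0 * k * math.log(math.log(n)) - 2.0 * loglik,
        "CAIC": k * (math.log(n) + 1.0) - 2.0 * loglik,
    }


# Hypothetical inputs: log-likelihood -100 with 2 parameters and 100 observations.
example_criteria = information_criteria(loglik=-100.0, k=2, n=100)
```

All five criteria share the term −2 ln L and differ only in how strongly they penalize the number of parameters, which is why they often rank models in the same order.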

4 Estimation results and discussion

The GARCH (1, 1) model with the six innovation distributions proposed in Section 2 was fitted to the data described in Section 3. The six innovation distributions do not allow for asymmetry. Also fitted is the GARCH (1, 1) model with the AST and GHYP distributions chosen as the innovation distributions. These two distributions allow for asymmetry of the volatility, which has been noted in the literature for cryptocurrency and energy data sets [37, 71, 72]. We have chosen GARCH (1, 1) as a baseline model because it is the simplest and most accessible model available in the R packages fGarch and rugarch for fitting GARCH type models. We also fitted GARCH models of higher orders, but they did not provide significantly better fits. The method of ML was used for fitting all of the models. For fitting the GARCH (1, 1) model with GHYP innovations, we used the rugarch package. For fitting the GARCH (1, 1) model with AST innovations, we used the VaRES package. The log-likelihood values and the values of two of the five selection criteria (AIC and BIC) for all the proposed distributions are provided in Table 2. The values of the three remaining selection criteria can be obtained from the authors. They led to the same conclusions. Table 2 also gives the differences between empirical and fitted estimates of kurtosis.

Table 2. Log-likelihood values, AIC values, BIC values and differences between empirical and fitted estimates of kurtosis for the GARCH(1, 1) model with the eight innovation distributions fitted to the specified daily log-returns (S&P500, DJI, Diesel, Propane, BTC and LTC).

https://doi.org/10.1371/journal.pone.0239652.t002

According to the selection criteria and the kurtosis values in Table 2, the GARCH (1, 1) with compound generalized gamma innovations gives the best fit, the compound Burr innovations give the second best fit, the compound Fréchet innovations give the third best fit, the compound inverse gamma innovations give the fourth best fit, the AST innovations give the fifth best fit, the compound Lomax innovations give the sixth best fit and the GHYP innovations give the seventh best fit. The worst fit is given by the GARCH (1, 1) model with compound half normal innovations. These conclusions are the same for all the returns.

The probability plots of the standardized residuals for the best fitting GARCH (1, 1) model with compound generalized gamma innovations are shown in Fig 4. The corresponding quantile plots are shown in Fig 5. Both figures suggest that the fit of the model is adequate.

Fig 4. P-P plots of the standardized residuals of the GARCH(1, 1) model with innovations given by (14).

https://doi.org/10.1371/journal.pone.0239652.g004

Fig 5. Q-Q plots of the standardized residuals of the GARCH(1, 1) model with innovations given by (14).

https://doi.org/10.1371/journal.pone.0239652.g005

The p-values of Vuong [73]’s likelihood ratio test to see if the best fitting model is significantly better than the other seven models are given in Table 3. The p-values of Amisano and Giacomini [74]’s likelihood ratio test to see if the best fitting model is significantly better than the other seven models in the left and right tails are given in Table 4. The p-values in all these tables show that the GARCH (1, 1) model with compound generalized gamma innovations provides significantly better fits than all other models. Vuong [73]’s test was performed using the command vuongtest in the nonnest2 package. Amisano and Giacomini [74]’s test was performed using the code available in https://sites.google.com/site/gianniamisanowebsite/.

Table 3. p-values of Vuong [73]’s likelihood ratio test comparing the GARCH(1, 1) with (14) versus the seven models.

https://doi.org/10.1371/journal.pone.0239652.t003

Table 4. p-values of Amisano and Giacomini [74]’s likelihood ratio test for the left (right in brackets) tails comparing the GARCH(1, 1) with (14) versus the seven models.

https://doi.org/10.1371/journal.pone.0239652.t004

Table 5 reports tests of the significance of the differences between mean squared errors when the GARCH (1, 1) models were fitted to rolling windows of length 100 days and used to predict the 101st data value [75]. The GARCH (1, 1) with compound generalized gamma innovations is used as the baseline model. The R package fDMA was used to perform the tests. The p-values show that the GARCH (1, 1) model with compound generalized gamma innovations provides significantly better mean squared errors than all other models. These conclusions were the same when the window length was taken to be 200, 300, …, 1000 days.
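The comparison of forecast errors can be illustrated with a basic version of the Diebold and Mariano statistic for one-step-ahead forecasts under squared-error loss. This is a simplified sketch with a two-sided normal p-value and no small-sample or autocorrelation correction; it is not the implementation in the fDMA package, and the function name and inputs are hypothetical.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided normal p-value for squared-error loss.

    errors_a and errors_b are one-step-ahead forecast errors of two models
    on the same targets. Only the lag-0 variance of the loss differential
    is used, which is appropriate for one-step-ahead forecasts.
    """
    d = [ea * ea - eb * eb for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical forecast errors from two competing models.
dm_stat, dm_p = diebold_mariano([1.0, 2.0, 1.0, 2.0], [1.0, 1.0, 1.0, 1.0])
```

A positive statistic indicates that the first model has the larger average squared error; in a rolling-window exercise the two error series would come from refitting each model on every window and recording the error of the next-day forecast.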

Table 5. p-values of Diebold and Mariano [75]’s test comparing mean squared errors of the 101st day forecast for rolling windows of length 100 days for the GARCH(1, 1) with (14) versus the same for the seven models.

https://doi.org/10.1371/journal.pone.0239652.t005

Finally, Table 6 gives the p-values of three backtesting methods at 99 percent value-at-risk. In each triplet, the first is the p-value of Kupiec’s proportion of failures [76] test, the second is the p-value of Escanciano and Olmo [77]’s test, and the third is the p-value of peak over threshold’s method. For the last method, we used the evd package. The threshold was chosen by the mean residual plot which was drawn using the command mrlplot. As expected, the peak over threshold’s method gives the largest p-values. For the first two methods, the GARCH (1, 1) with compound generalized gamma innovations gives the largest p-values, the compound Burr innovations give the second largest p-values, the compound Fréchet innovations give the third largest p-values, the compound inverse gamma innovations give the fourth largest p-values, the AST innovations give the fifth largest p-values, the compound Lomax innovations give the sixth largest p-values, and the GHYP innovations give the seventh largest p-values. The smallest p-values with all of them below the 5 percent significance level are given by the GARCH (1, 1) model with compound half normal innovations. Some of the p-values for the GARCH (1, 1) model with GHYP innovations are also below the 5 percent level of significance. The remaining p-values are all above the 5 percent level of significance.
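As an illustration of the first backtesting method, the proportion of failures test compares the observed number of value-at-risk violations with the number expected at the chosen level, using a likelihood ratio statistic that is asymptotically chi-squared with one degree of freedom. The sketch below follows that standard construction; the function name and the violation counts are hypothetical and do not come from Table 6.

```python
import math


def kupiec_pof(num_violations, num_obs, coverage=0.99):
    """Kupiec proportion of failures test: (LR statistic, p-value)."""
    p = 1.0 - coverage          # expected violation probability
    x, n = num_violations, num_obs

    def loglik(q):
        # Binomial log-likelihood kernel; terms with zero count are skipped,
        # which treats 0 * log(0) as 0.
        out = 0.0
        if x > 0:
            out += x * math.log(q)
        if n - x > 0:
            out += (n - x) * math.log(1.0 - q)
        return out

    lr = max(-2.0 * (loglik(p) - loglik(x / n)), 0.0)
    # Survival function of chi-squared(1): P(Z^2 > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Hypothetical backtest: 30 violations in 1000 days at 99 percent value-at-risk.
lr_stat, pof_p = kupiec_pof(30, 1000, coverage=0.99)
```

A small p-value means the observed violation rate is inconsistent with the nominal level, so larger p-values in this test favour the model.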

Table 6. p-values of Kupiec’s proportion of failures [76] test, Escanciano and Olmo [77]’s test and peak over threshold’s method for the eight models.

https://doi.org/10.1371/journal.pone.0239652.t006

5 Simulation study

In this section, we conduct a simulation study to assess the performance and accuracy of the ML estimators of the best fitting GARCH(1, 1) model with compound generalized gamma innovations. The following scheme was used:

  1. simulate a sample of size n from the GARCH(1, 1) model with generalized gamma innovations;
  2. estimate (ν, β, λ, α) and the three GARCH parameters;
  3. repeat steps 1 and 2 ten thousand times;
  4. hence, estimate the biases and the mean squared errors for the seven parameters;
  5. repeat steps 1 to 4 for n = 20, 21, …, 500.
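The scheme above can be sketched generically as follows. To keep the example short and self-contained, a toy model (an exponential distribution whose mean is estimated by the sample mean) stands in for the GARCH(1, 1) model with compound generalized gamma innovations, and fewer replications and sample sizes are used than in the study; the function names are hypothetical.

```python
import random


def bias_and_mse(simulate, estimate, true_value, n, replications, seed=0):
    """Monte Carlo bias and mean squared error of an estimator at sample size n."""
    rng = random.Random(seed)
    # Steps 1 to 3: simulate a sample, estimate, and repeat.
    estimates = [estimate(simulate(n, rng)) for _ in range(replications)]
    # Step 4: summarize the estimates against the true value.
    bias = sum(estimates) / replications - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / replications
    return bias, mse


# Toy stand-in for the fitted model: exponential data with mean 2.
TRUE_MEAN = 2.0


def simulate_exponential(n, rng):
    return [rng.expovariate(1.0 / TRUE_MEAN) for _ in range(n)]


def sample_mean(xs):
    return sum(xs) / len(xs)


# Step 5: repeat over a grid of sample sizes.
results = {
    n: bias_and_mse(simulate_exponential, sample_mean, TRUE_MEAN, n, replications=2000)
    for n in (20, 100, 500)
}
```

For the model in the study, `simulate` would generate a GARCH(1, 1) path and `estimate` would return one component of the ML estimate, with the calculation repeated for each of the seven parameters.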

The plots of the biases versus n are shown in Fig 6. The plots of the mean squared errors versus n are shown in Fig 7.

Fig 6. Biases of the parameter estimates of the GARCH(1, 1) model with innovations given by (14) based on the simulation study of Section 5.

https://doi.org/10.1371/journal.pone.0239652.g006

Fig 7. Mean squared errors of the parameter estimates of the GARCH(1, 1) model with innovations given by (14) based on the simulation study of Section 5.

https://doi.org/10.1371/journal.pone.0239652.g007

We can observe the following from the figures: the biases can be positive or negative but approach zero as n approaches 500; the biases appear largest for and smallest for ; the biases appear reasonably small at around n = 500; the mean squared errors gradually decrease with increasing n; the mean squared errors appear largest for and smallest for ; the mean squared errors appear reasonably small at around n = 500.

In the simulation scheme, we have taken the initial parameter values as the estimated values for S&P500 returns. The results were similar for a wide range of other initial values including the estimated values for the other five returns.

6 Conclusions

In this paper, based on the scale mixing of the Student’s t distribution, we have developed six new compound distributions. We have also derived their basic properties such as the PDF, CDF, moments and characteristic functions. With these distributions taken as innovations for the GARCH(1, 1) model, we have shown that all but one (respectively, two) of the six distributions perform better than the GARCH(1, 1) model with generalized hyperbolic (respectively, asymmetric Student’s t) innovations. The comparison was made in terms of Akaike information criterion values, Bayesian information criterion values, consistent Akaike information criterion values, corrected Akaike information criterion values, Hannan-Quinn criterion values, p-values of Vuong [73]’s likelihood ratio test, p-values of Amisano and Giacomini [74]’s likelihood ratio test for the left tails, p-values of Amisano and Giacomini [74]’s likelihood ratio test for the right tails, mean squared errors of one-day ahead forecasts, and three backtesting methods.

In addition, we have performed a simulation study to examine the accuracy of the best fitting model, the GARCH(1, 1) model with compound generalized gamma innovations. The accuracy was assessed in terms of biases and mean squared errors. Both decreased in magnitude as the sample size increased, and both appeared reasonably small once the sample size reached about 500. The sample sizes of all six data sets considered are well above 500. The results showed that the GARCH(1, 1) model with compound generalized gamma innovations is valid and worth considering in the general financial context of risk exposure modelling.

Nearly all of the data sets we have considered have skewness close to zero. Hence, there is no need for the compound distributions in Section 2 to incorporate a skewness parameter. However, there are several ways that these distributions can be extended to incorporate skewness. A prominent approach is described in Theodossiou and Savva [78] and Savva and Theodossiou [79]. Another prominent approach is described in Fernández and Steel [5].
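As an illustration of the second approach, the sketch below applies the inverse scale factor construction of Fernández and Steel [5] to a symmetric density f, giving the skewed density 2 / (γ + 1/γ) · f(x/γ) for x ≥ 0 and 2 / (γ + 1/γ) · f(γx) for x < 0. It is an assumption-laden example rather than part of the paper: the plain Student’s t density stands in for the compound densities of Section 2, and the function names are hypothetical.

```python
import math


def student_t_pdf(x, nu):
    """Density of the standard Student's t distribution with nu degrees of freedom."""
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - 0.5 * (nu + 1.0) * math.log1p(x * x / nu))


def fernandez_steel_pdf(x, gamma, base_pdf):
    """Skewed version of a symmetric density base_pdf with skewness parameter gamma > 0."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    scale = 2.0 / (gamma + 1.0 / gamma)
    argument = x / gamma if x >= 0 else x * gamma
    return scale * base_pdf(argument)


def mass_right_of_zero(gamma):
    """Probability of a positive value, gamma^2 / (1 + gamma^2), when base_pdf is symmetric."""
    return gamma * gamma / (1.0 + gamma * gamma)


if __name__ == "__main__":
    for gamma in (0.8, 1.0, 1.25):
        density_at_one = fernandez_steel_pdf(1.0, gamma, lambda z: student_t_pdf(z, 5.0))
        print(gamma, round(density_at_one, 4), round(mass_right_of_zero(gamma), 4))
```

Setting γ = 1 recovers the symmetric density, γ > 1 shifts mass to the right tail and γ < 1 to the left, so a single extra parameter is enough to test whether skewness matters for a given data set.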

One extension of the paper is to analyze the finiteness of the unconditional moments of the return distribution through the tail index, following the “power law” literature; see Gabaix et al. [80], Ibragimov et al. [81] and references therein. This analysis could lead to a better understanding of the empirical results on the existence of the unconditional higher-order moments under the proposed distributions.
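One common way to estimate a tail index in practice is the Hill estimator, sketched below in Python. The paper does not name an estimator, so this choice, the function name `hill_tail_index` and the simulated Pareto data are all assumptions made for illustration. Under a power law tail with index α, unconditional moments of order p are finite only for p < α.

```python
import math
import random


def hill_tail_index(values, k):
    """Hill estimate of the tail index alpha from the k largest positive values."""
    ordered = sorted((v for v in values if v > 0), reverse=True)
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = ordered[k]  # the (k + 1)-th largest observation
    mean_log_excess = sum(math.log(v / threshold) for v in ordered[:k]) / k
    if mean_log_excess == 0.0:
        raise ValueError("the k largest values are all equal to the threshold")
    return 1.0 / mean_log_excess


if __name__ == "__main__":
    # Simulated Pareto data with true tail index 3 (illustrative, not real returns).
    rng = random.Random(42)
    sample = [(1.0 - rng.random()) ** (-1.0 / 3.0) for _ in range(20000)]
    print(round(hill_tail_index(sample, 500), 2))
```

For return data the estimator would typically be applied separately to losses and to gains (or to absolute returns), and the estimate should be inspected over a range of k, since it can be sensitive to the number of tail observations used.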

Further extensions of the GARCH time series framework could also be considered. However, the framework of the Generalized Autoregressive Score (GAS) models of Creal et al. [82] and Harvey [83] is more intriguing. The GAS framework allows relatively straightforward introduction of time-varying dynamics for any desired parameter and could further enhance the empirical performance of the suggested models.
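To make the link between the two frameworks concrete, the sketch below implements a GAS(1, 1) recursion f_{t+1} = ω + A s_t + B f_t for the variance of a zero-mean Gaussian density, with the score scaled by the inverse Fisher information. In this special case the scaled score is s_t = y_t² − f_t, and the recursion coincides with GARCH(1, 1) with α = A and β = B − A. The Gaussian density is chosen only to keep the example short. Using one of the compound distributions would require its own score, which is not derived here, and the function names are hypothetical.

```python
def gas_gaussian_variance(returns, omega, a, b, initial_variance):
    """GAS(1, 1) variance filter: Gaussian density, inverse-information scaling."""
    variances = [initial_variance]
    for y in returns:
        f = variances[-1]
        score = (y * y - f) / (2.0 * f * f)   # d log p(y | f) / d f
        information = 1.0 / (2.0 * f * f)     # Fisher information for f
        scaled_score = score / information    # simplifies to y^2 - f
        variances.append(omega + a * scaled_score + b * f)
    return variances


def garch_variance(returns, omega, alpha, beta, initial_variance):
    """Standard GARCH(1, 1) variance recursion, for comparison."""
    variances = [initial_variance]
    for y in returns:
        variances.append(omega + alpha * y * y + beta * variances[-1])
    return variances


if __name__ == "__main__":
    toy_returns = [0.5, -1.2, 0.3, 2.0, -0.7]  # illustrative values only
    gas = gas_gaussian_variance(toy_returns, 0.05, 0.08, 0.98, 1.0)
    garch = garch_variance(toy_returns, 0.05, 0.08, 0.90, 1.0)
    print(all(abs(u - v) < 1e-12 for u, v in zip(gas, garch)))
```

With a heavy-tailed density the scaled score is no longer y_t² − f_t: large observations are downweighted, which is the main practical difference between a GAS filter and the GARCH recursion.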

Acknowledgments

The authors would like to thank the Editor and the two referees for careful reading and comments which greatly improved the paper. The authors gratefully acknowledge the Department of Mathematics and Statistics, KFUPM for providing the facilities for the work.

References

  1. Gosset W. S. (1908). The probable error of a mean. Biometrika, 6: 1–25.
  2. Abad P., Benito S., and López C. (2014). A comprehensive review of value at risk methodologies. The Spanish Review of Financial Economics, 12: 15–32.
  3. Nieto M. R. and Ruiz E. (2016). Frontiers in VaR forecasting and backtesting. International Journal of Forecasting, 32: 475–501.
  4. Hansen B. E. (1994). Autoregressive conditional density estimation. International Economic Review, pages 705–730. pmid:28905403
  5. Fernández C. and Steel M. F. (1998). On Bayesian modeling of fat tails and skewness. Journal of the American Statistical Association, 93: 359–371.
  6. Theodossiou P. (1998). Financial data and the skewed generalized t distribution. Management Science, 44: 1650–1661.
  7. Jones M. and Faddy M. (2003). A skew extension of the t distribution, with applications. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65: 159–174.
  8. Sahu S. K., Dey D. K., and Branco M. D. (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. Canadian Journal of Statistics, 31: 129–150.
  9. Bauwens L. and Laurent S. (2005). A new class of multivariate skew densities, with application to generalized autoregressive conditional heteroscedasticity models. Journal of Business & Economic Statistics, 23: 346–354.
  10. Aas K. and Haff I. H. (2006). The generalized hyperbolic skew Student’s t distribution. Journal of Financial Econometrics, 4: 275–309. pmid:29333210
  11. Zhu D. and Galbraith J. W. (2010). A generalized asymmetric Student’s t distribution with application to financial econometrics. Journal of Econometrics, 157: 297–305.
  12. Zhu D. and Galbraith J. W. (2011). Modeling and forecasting expected shortfall with the generalized asymmetric Student-t and asymmetric exponential power distributions. Journal of Empirical Finance, 18: 765–778.
  13. Papastathopoulos I. and Tawn J. A. (2013). Extended generalised Pareto models for tail estimation. Journal of Statistical Planning and Inference, 143: 131–143.
  14. Tucker A. L. (1992). A re-examination of finite- and infinite-variance distributions as models of daily stock returns. Journal of Business & Economic Statistics, 10: 73–81.
  15. Perez-Quiros G. and Timmermann A. (2001). Business cycle asymmetries in stock returns: Evidence from higher order moments and conditional densities. Journal of Econometrics, 103: 259–306.
  16. Wang J. J., Chan J. S., and Choy S. B. (2011). Stochastic volatility models with leverage and heavy-tailed distributions: A Bayesian approach using scale mixtures. Computational Statistics & Data Analysis, 55: 85–862.
  17. Li R. and Nadarajah S. (2020). A review of Student’s t distribution and its generalizations. Empirical Economics, 58: 1461–1490.
  18. McNeil A. J., Frey R., and Embrechts P. (2005). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press.
  19. Barndorff-Nielsen O. (1977). Exponentially decreasing distributions for the logarithm of particle size. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, 353: 401–419.
  20. Press S. J. (1967). A compound events model for security prices. Journal of Business, pages 317–335.
  21. Praetz P. D. (1972). The distribution of share price changes. Journal of Business, pages 49–55.
  22. Andrews D. F. and Mallows C. L. (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society. Series B (Methodological), pages 99–102.
  23. Barndorff-Nielsen O., Kent J., and Sorensen M. (1982). Normal variance-mean mixtures and z distributions. International Statistical Review/Revue Internationale de Statistique, pages 145–159.
  24. Kon S. J. (1984). Models of stock returns–a comparison. The Journal of Finance, 39: 147–165.
  25. West M. (1987). On scale mixtures of normal distributions. Biometrika, 74: 646–648.
  26. Madan D. B. and Seneta E. (1990). The variance gamma (VG) model for share market returns. Journal of Business, pages 511–524.
  27. Madan D. B., Carr P. P., and Chang E. C. (1998). The variance gamma process and option pricing. Review of Finance, 2: 79–105.
  28. Tjetjep A. and Seneta E. (2006). Skewed normal variance-mean models for asset pricing and the method of moments. International Statistical Review, 74: 109–126.
  29. Luciano E. and Semeraro P. (2010). A generalized normal mean-variance mixture for return processes in finance. International Journal of Theoretical and Applied Finance, 13: 415–440.
  30. Geweke J. and Amisano G. (2010). Comparing and evaluating Bayesian predictive distributions of asset returns. International Journal of Forecasting, 26: 216–230.
  31. Nadarajah S. (2012). Models for stock returns. Quantitative Finance, 12: 411–424.
  32. McDonald J. B. and Butler R. J. (1987). Some generalized mixture distributions with an application to unemployment duration. The Review of Economics and Statistics, pages 232–240.
  33. Hoogerheide L. F., Kaashoek J. F., and Van Dijk H. K. (2007). On the shape of posterior densities and credible sets in instrumental variable regression models with reduced rank: An application of flexible sampling methods using neural networks. Journal of Econometrics, 139: 154–180.
  34. Ardia D., Hoogerheide L. F., and van Dijk H. K. (2009a). Adaptive mixture of Student’s t distributions as a flexible candidate distribution for efficient simulation: The R package AdMit. Journal of Statistical Software, 29: 1–32.
  35. Ardia D., Hoogerheide L. F., and van Dijk H. K. (2009b). AdMit: Adaptive mixtures of Student’s t distributions. The R Journal, 1: 25–30.
  36. Afuecheta E., Chan S., and Nadarajah S. (2019). Flexible models for stock returns based on Student’s t distribution. The Manchester School, 87: 403–427.
  37. Acereda B., Leon A. and Mora J. (2019). Estimating the expected shortfall of cryptocurrencies: An evaluation based on backtesting. Finance Research Letters, 33: 101181.
  38. Trucios C., Tiwari A. K. and Alqahtani F. (2019). Value-at-risk and expected shortfall in cryptocurrencies’ portfolio: A vine copula–based approach. Applied Economics, 52: 2580–2593.
  39. Jimenez I., Mora–Valencia A. and Perote J. (2020). Risk quantification and validation for Bitcoin. Operations Research Letters, 48: 534–541.
  40. Wright E. M. (1935). The asymptotic expansion of the generalized hypergeometric function. Journal of the London Mathematical Society, 10: 286–293.
  41. Prudnikov A. P., Brychkov Y. A. and Marichev O. I. (1986). Integrals and Series, volumes 1, 2 and 3. Gordon and Breach Science Publishers, Amsterdam.
  42. Gradshteyn I. S. and Ryzhik I. M. (2000). Table of Integrals, Series, and Products, sixth edition. Academic Press, San Diego, CA.
  43. Mathai A. M. and Saxena R. K. (1978). The H-Function with Applications in Statistics and Other Disciplines. John Wiley and Sons, New York.
  44. Srivastava H. M., Gupta K. C. and Goyal S. P. (1982). The H-Functions of One and Two Variables with Applications. South Asian Publishers, New Delhi.
  45. Bouchaud J.-P. and Potters M. (2003). Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management. Cambridge University Press.
  46. Benckert L.-G. and Jung J. (1974). Statistical models of claim distributions in reinsurance. ASTIN Bulletin, 8: 1–25.
  47. Revankar N. S., Hartley M. J., and Pagano M. (1974). A characterization of the Pareto distribution. The Annals of Statistics, 2: 599–601.
  48. Arnold B. C. (1983). Pareto Distributions. International Cooperative Publishing House, Fairland, MD.
  49. Hogg R. V. and Klugman S. A. (1983). On the estimation of long tailed skewed distributions with actuarial applications. Journal of Econometrics, 23: 91–102.
  50. Nair N. and Hitha N. (1990). Characterizations of Pareto and related distributions. Journal of the Indian Statistical Association, 28: 75–79.
  51. Meeusen W. and van Den Broeck J. (1977). Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, pages 435–444.
  52. Chan S., Chu J., Nadarajah S., and Osterrieder J. (2017). A statistical analysis of cryptocurrencies. Journal of Risk and Financial Management, 10.
  53. Lawless J. F. (2003). Statistical Models and Methods for Lifetime Data. John Wiley and Sons, New York.
  54. Cooray K. and Ananda M. M. (2008). A generalization of the half-normal distribution with applications to lifetime data. Communications in Statistics - Theory and Methods, 37: 1323–1337.
  55. Dobzhansky T. and Wright S. (1947). Genetics of natural populations. XV. Rate of diffusion of a mutant gene through a population of Drosophila pseudoobscura. Genetics, 32.
  56. Bland J. M. and Altman D. G. (1999). Measuring agreement in method comparison studies. Statistical Methods in Medical Research, 8: 135–160. pmid:10501650
  57. Feunou B., Jahan-Parvar M. R. and Tedongap R. (2016). Which parametric model for conditional skewness? European Journal of Finance, 22: 1237–1271.
  58. Lalancette S. and Simonato J. G. (2017). The role of the conditional skewness and kurtosis in VIX index valuation. European Financial Management, 23: 325–354.
  59. Chou C.-Y. and Liu H.-R. (1998). Properties of the half-normal distribution and its application to quality control. Journal of Industrial Technology, 14: 4–7.
  60. Mandelbrot B. (1963). The variation of certain speculative prices. The Journal of Business, 36: 394–419.
  61. Pagan A. (1996). The econometrics of financial markets. Journal of Empirical Finance, 3: 15–102.
  62. Cont R. (2001). Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance, 1: 223–226.
  63. R Core Team (2020). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
  64. Akaike H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19: 716–723.
  65. Schwarz G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6: 461–464.
  66. Hurvich C. M. and Tsai C.-L. (1989). Regression and time series model selection in small samples. Biometrika, 76: 297–307.
  67. Hannan E. J. and Quinn B. G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society. Series B (Methodological), pages 190–195.
  68. Bozdogan H. (1987). Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52: 345–370.
  69. Burnham K. P. and Anderson D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33: 261–304.
  70. Fang Y. (2011). Asymptotic equivalence between cross-validations and Akaike information criteria in mixed-effects models. Journal of Data Science, 9: 15–21.
  71. Lyu Y., Wang P., Wei Y. and Ke R. (2017). Forecasting the VaR of crude oil market: Do alternative distributions help? Energy Economics, 66: 523–534.
  72. Laporta A. G., Merlo L. and Petrella L. (2018). Selection of value at risk models for energy commodities. Energy Economics, 74: 628–643.
  73. Vuong Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57: 307–333.
  74. Amisano G. and Giacomini R. (2007). Comparing density forecasts via weighted likelihood ratio tests. Journal of Business and Economic Statistics, 25: 177–190.
  75. Diebold F. X. and Mariano R. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13: 253–265.
  76. Kupiec P. (1995). Techniques for verifying the accuracy of risk measurement models. Journal of Derivatives, 2: 73–84.
  77. Escanciano J. C. and Olmo J. (2011). Robust backtesting tests for value-at-risk models. Journal of Financial Econometrics, 9: 132–161.
  78. Theodossiou P. and Savva C. (2016). Skewness and the relation between risk and return. Management Science, 62: 1598–1609.
  79. Savva C. and Theodossiou P. (2018). The risk and return conundrum explained: International evidence. Journal of Financial Econometrics, 16: 486–521.
  80. Gabaix X., Gopikrishnan P., Plerou V. and Stanley H. E. (2006). Institutional investors and stock market volatility. The Quarterly Journal of Economics, 121: 461–504.
  81. Ibragimov M., Ibragimov R. and Walden J. (2015). Heavy-Tailed Distributions and Robustness in Economics and Finance. Springer Verlag, New York.
  82. Creal D., Koopman S. J., and Lucas A. (2013). Generalized autoregressive score models with applications. Journal of Applied Econometrics, 28: 777–795.
  83. Harvey A. (2013). Dynamic Models for Volatility and Heavy Tails: With Applications to Financial and Economic Time Series, volume 52. Cambridge University Press.