
Bayesian composite quantile regression for the single-index model

Abstract

By using a Gaussian process prior and a location-scale mixture representation of the asymmetric Laplace distribution, we develop a Bayesian analysis for the composite quantile single-index regression model. The posterior distributions for the unknown parameters are derived, and the Markov chain Monte Carlo sampling algorithms are also given. The proposed method is illustrated by three simulation examples and a real dataset.

1 Introduction

The single-index model (SIM) is one of the most popular semiparametric models in statistics, econometrics and psychometrics. In recent years, there has been extensive research on fitting SIMs using kernel, local-linear and average-derivative methods. Among these, the two most popular are the average derivative estimation (ADE) method (Powell et al. [1], Härdle and Stoker [2], Härdle et al. [3]) and the minimum average variance estimation (MAVE) method (Xia and Härdle [4], Chen et al. [5], Zhao and Feng [6]).

Most of the above methods are based on the conditional mean model. As a useful supplement to mean regression, quantile regression (Koenker and Bassett [7]) provides a more complete description of the conditional response distribution. Wu et al. [8] proposed a practical algorithm based on the local linear approach to estimate the nonparametric link function and the quantile regression coefficients. Lv et al. [9] proposed a quantile regression estimation for the partially linear single-index model by minimizing the average quantile loss and using a multidimensional kernel estimation method. Jiang and Qian [10] used the kernel method to estimate the unknown function and developed a back-fitting algorithm for the single-index model based on quantile regression. Xu et al. [11] investigated quantile regression (QR) estimation for single-index QR models when the response is subject to random left truncation, and obtained the asymptotic properties of the proposed QR estimates of both the index parameter and the unknown link function. Although quantile regression is more robust than mean regression, it may still lose some efficiency. To safeguard quantile regression against potential efficiency loss, composite quantile regression (CQR), which combines information over different quantiles, has become increasingly popular. Intuitively, CQR can provide an effective estimation for the SIM. Jiang et al. [12] suggested a back-fitting CQR algorithm for the SIM. Later, a two-step CQR estimation procedure for SIMs was proposed by Jiang et al. [13]. Jiang et al. [14] proposed a weighted CQR estimation for SIMs. Liu et al. [15] considered weighted composite quantile estimation of the single-index model with missing covariates at random. Jiang and Yu [16] extended the non-iterative composite quantile regression methods for single-index models to the analysis of massive datasets via a divide-and-conquer strategy; their approach significantly reduces the computing time and the required primary memory. Song et al. [17] focused on estimating the parameters and the unknown link function of the single-index model in a high-dimensional setting, proposing SCAD- and Laplace error penalty (LEP)-based penalized composite quantile regression estimators that achieve variable selection and estimation simultaneously.

In the past few years, there has been extensive research on frequentist estimation of single-index regression models, but the Bayesian approach is also a useful statistical analysis tool. For Bayesian treatment of the nonparametric link function, Antoniadis et al. [18] approximated the link function by B-splines and adopted regularization with generalized cross-validation to avoid over-fitting. Wang [19] estimated the index vector and the link function by free-knot splines. Choi et al. [20] considered a Gaussian process regression (GPR) approach to analysing a SIM from the Bayesian perspective; the proposed approach broadened the scope of applicability of both the SIM and the GPR. In addition, they discussed the theoretical aspects of the proposed method in terms of posterior consistency. Gramacy and Lian [21] developed a SIM for parsimonious multidimensional nonlinear regression by combining parametric projection with univariate nonparametric regression models. They showed that a particular GP formulation is simple to work with and ideal as an emulator for some types of computer experiment. Hu et al. [22] used a Gaussian process prior for the unknown nonparametric link function and a Laplace distribution for the index vector, and demonstrated the advantages of the Bayesian method compared with the frequentist approach. Liu and Liang [23] investigated single-index quantile regression with missing observations via a Bayesian method, using a spline approximation for the link function. They constructed a quasi-posterior distribution of the index vector based on the asymmetric Laplace likelihood with missing observations, and established the asymptotic normality of the posterior estimator of the index parameters.

As far as we know, little work has considered the composite quantile regression single-index model from a Bayesian perspective. Therefore, this paper uses Bayesian techniques to study composite quantile regression for single-index models. By using a Gaussian process prior and a location-scale mixture representation of the asymmetric Laplace distribution, we develop a Bayesian analysis for the composite quantile single-index regression model. The posterior distributions for the unknown parameters are derived, and Markov chain Monte Carlo sampling algorithms are given. Favorable performance is illustrated on three simulation examples and a real dataset.

This article is organized as follows. In Section 2, we first introduce composite quantile regression for the single-index model and then provide the details of our hierarchical Bayesian composite quantile single-index regression model. We carry out posterior sampling with a more efficient partially collapsed sampler; note that the link function is integrated out when drawing samples of the index vector. Detailed derivations are given in the S1 Appendix. Numerical illustrations, including simulation studies and a real-data analysis, are presented in Sections 3 and 4. We conclude the paper with a discussion in Section 5.

2 Bayesian hierarchical model

2.1 Model structure

In the composite quantile single-index regression model, for independent and identically distributed pairs (xi, yi), i = 1, 2, …, n, we assume

yi = η(xiTβ) + εi, (1)

where yi is the response, xi = (xi1, xi2, …, xip)T is the p-dimensional covariate, η(⋅) is an unknown link function, β = (β1, β2, …, βp)T is an unknown parameter vector, and εi is the error. The aim of this paper is to estimate β and η(⋅) simultaneously.

According to Zou and Yuan [24], we denote 0 < τ1 < τ2 < ⋯ < τM < 1; the parameter estimators of composite quantile single-index regression are defined as

(α̂1, …, α̂M, β̂, η̂) = argmin Σm=1M Σi=1n ρτm(yi − αm − η(xiTβ)), (2)

where m = 1, 2, ⋯, M, αm is the intercept at the quantile level τm, ρτm(u) = u(τm − I(u < 0)) is the check loss function at τm, and I(⋅) denotes the indicator function.
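As a minimal numerical sketch (our illustration, not the authors' code), the check loss and the composite objective in (2) can be evaluated as follows, where `eta` stands in for any candidate link function:

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - I(u < 0))."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def cqr_objective(alpha, beta, eta, x, y, taus):
    """Composite quantile objective of (2), summed over quantile levels.

    `eta` is a callable standing in for the unknown link function."""
    index = x @ beta  # single-index projection x_i^T beta
    return sum(check_loss(y - a - eta(index), tau).sum()
               for a, tau in zip(alpha, taus))
```

In a full fit, η would itself be estimated; here the callable simply makes the structure of the objective explicit.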

2.2 Hierarchical Bayesian modelling

At the τ-th quantile, we model the residual errors by the asymmetric Laplace distribution (ALD; Yu and Moyeed [25], Geraci and Bottai [26], Luo et al. [27]). More specifically, the probability density of y given μ = η(xTβ) is

f(y | μ, σ, τ) = (τ(1 − τ)/σ) exp{−ρτ((y − μ)/σ)},

where the quantile level τ is the skewness parameter of the distribution, μ is the location parameter, and σ is the scale parameter. Writing y = (y1, y2, …, yn)T, the conditional distribution of the observations is the product of these densities over i = 1, …, n. Thus, the minimization objective in (2) corresponds to the likelihood function of a composite quantile single-index regression:

L(y | α, β, σ) ∝ ∏m=1M ∏i=1n (τm(1 − τm)/σ) exp{−ρτm((yi − αm − η(xiTβ))/σ)}. (3)
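For concreteness, the ALD density above can be evaluated directly. This small sketch (ours, not part of the paper's code) uses the check-loss form of the exponent:

```python
import numpy as np

def ald_pdf(y, mu, sigma, tau):
    """Asymmetric Laplace density: tau(1-tau)/sigma * exp(-rho_tau((y-mu)/sigma))."""
    u = (np.asarray(y, dtype=float) - mu) / sigma
    rho = u * (tau - (u < 0))  # check loss applied to the standardized residual
    return tau * (1.0 - tau) / sigma * np.exp(-rho)
```

At y = μ the density equals τ(1 − τ)/σ, and it integrates to one for any τ in (0, 1).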

However, the above likelihood is difficult to handle directly for Bayesian inference. A location-scale mixture representation of the ALD (Kozumi and Kobayashi [28]) helps to deal with this difficulty. We can write the observations satisfying model (1) as

yi = αm + η(xiTβ) + (1 − 2τm)eim + √(2σeim) εi, (4)

where eim ∼ Exp(τm(1 − τm)σ−1) and εi is a standard normal random variable independent of eim. Thus, the complete likelihood function of model (3) based on the observations (x, y) and the latent variables eim can be written as

L(y | α, β, σ, e) ∝ ∏m=1M ∏i=1n (4πσeim)−1/2 exp{−(yi − αm − η(xiTβ) − (1 − 2τm)eim)2/(4σeim)}.

Following Choi et al. [20] and Gramacy and Lian [21], we model the unknown link function η(⋅) by a Gaussian process prior. More specifically, η has a Gaussian process prior with zero mean and a squared-exponential covariance function,

C(u, u′) = γ exp{−(u − u′)2/d},

where γ and d are hyperparameters. The covariates then enter the composite quantile single-index regression model through ηn ∼ N(0, Cn), where ηn = (η1, η2, …, ηn)T and Cn is an n × n matrix with entries Cn(i, j) = γ exp{−(xiTβ − xjTβ)2/d}. As noted in Gramacy and Lian [21] and Hu et al. [22], it is unnecessary to keep ‖β‖ = 1, so the covariance function is reformulated as Cn(i, j) = γ exp{−(xiTβ − xjTβ)2}, with the lengthscale d absorbed into the scale of β. Since β is not constrained to have unit norm, β can follow any prior; we choose an independent Gaussian prior on each component.
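To illustrate the mixture representation (4), the following sketch (ours, using the rate parameterization eim ∼ Exp(τ(1 − τ)/σ)) draws ALD variates from the exponential-normal mixture; the τ-th quantile of the draws recovers the location μ:

```python
import numpy as np

def sample_ald_mixture(mu, sigma, tau, size, rng):
    """ALD(mu, sigma, tau) draws via the location-scale mixture:
    y = mu + (1 - 2*tau)*e + sqrt(2*sigma*e)*z, e ~ Exp(rate tau(1-tau)/sigma)."""
    e = rng.exponential(scale=sigma / (tau * (1.0 - tau)), size=size)
    z = rng.standard_normal(size)
    return mu + (1.0 - 2.0 * tau) * e + np.sqrt(2.0 * sigma * e) * z
```

Conditional on e, the observation is Gaussian, which is exactly what makes the Gibbs updates below tractable.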

To summarize, the Bayesian hierarchical formulation is provided below.

yi | αm, η, eim, σ ∼ N(αm + η(xiTβ) + (1 − 2τm)eim, 2σeim);

ηn | β, γ ∼ N(0, Cn);

π(βj) ∼ N(0, σβ2), σβ > 0, j = 1, 2, …, p;

π(eim) ∼ Exp(τm(1 − τm)σ−1);

π(σ) ∼ IG(aσ, bσ);

π(γ) ∼ IG(aγ, bγ);

π(αm) ∝ 1, m = 1, 2, …, M, i = 1, 2, …, n.

The hyperpriors for σ and γ are set to be IG(aσ, bσ) and IG(aγ, bγ), where IG denotes the inverse gamma distribution. All of the hyperparameters aσ, bσ, aγ, bγ are set to 0.5 in the numerical experiments. A sensitivity analysis reveals that the results are insensitive to the choice of hyperparameters.

2.3 MCMC sampling

In this section, we provide the MCMC sampling procedure for model (3). The posterior distributions of all the unknown parameters and latent variables are proportional to the joint distribution, i.e., the product of the complete likelihood and the priors specified above:

The posterior distribution for ηn is: (5) with

Here, , , and F = (F1, F2, ⋯, Fn)T, , zim = yi − αm − (1 − 2τm)eim.

Similarly, the posterior distributions of β, γ and αm are (6) where

Next, we continue to derive the full conditional distributions of σ, λ, eim and αm. For fixed i and m, we have (7) where the probability density function of GIG(ρ, m, n) is

Here, x > 0, −∞ < ρ < ∞, m ≥ 0, n ≥ 0, and Kρ is the modified Bessel function of the third kind (Barndorff-Nielsen and Shephard [29]).

Finally, the posterior distribution for σ is (8) with .

The details of the derivations are given in the S1 Appendix. The Metropolis-within-Gibbs algorithm can be used to sample from the posterior distribution. The variables ηn(t), eim(t), σ(t) can be drawn directly from their respective full conditional distributions, where (⋅)(t) denotes the sampled value at iteration t. For β(t), a Metropolis step with a random-walk proposal is used; the same holds for α(t) and γ(t). In practice, the proposal scales σβ, σα and σγ are tuned so that the acceptance rate lies within 10%∼30%. In the Metropolis steps and Gibbs sampling, we generate 20,000 samples and discard the first 10,000 as burn-in. The detailed sampling proceeds as follows.

  • The index vector β, α and parameter γ are sampled from their posterior distributions (6) based on Metropolis steps;
  • Sample the nonparametric link function η(⋅) from its posterior distribution (5);
  • The remaining parameters (eim, σ) are sampled from their full conditional distributions (7) and (8) respectively.
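The Metropolis updates for β, α and γ in the steps above can be sketched generically. This is our illustration with a Gaussian random-walk proposal, where `log_post` stands for the corresponding log full conditional:

```python
import numpy as np

def metropolis_step(current, log_post, scale, rng):
    """One Gaussian random-walk Metropolis update; returns (state, accepted)."""
    proposal = current + scale * rng.standard_normal(np.shape(current))
    # Accept with probability min(1, pi(proposal)/pi(current))
    if np.log(rng.uniform()) < log_post(proposal) - log_post(current):
        return proposal, True
    return current, False
```

Increasing `scale` lowers the acceptance rate; in the paper, the proposal scales σβ, σα and σγ are tuned so the rate stays within 10%∼30%.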

3 Numerical illustrations

We use Monte Carlo simulations to study the performance of the proposed method (BCQR) in comparison with Bayesian quantile regression (BQR) based on the single quantile τ = 0.5, and Bayesian linear regression (BLR) for the single-index model.

Example 1 Data is generated from the following composite quantile regression for single-index model: where , , . The predictors x = (x1, x2, x3, x4)T are uniform in [0, 1]4.

Example 2 Data is generated from the following model: where . The predictors x = (x1, x2)T are uniform in [−1, 1]2.

Example 3 Data is generated from the following model: where . The predictors x = (x1, x2)T are uniform in [0, 1]2.

For all models, we consider the following four different distributions for the random error ε:

  • (1) standard normal distribution, N(0, 1);
  • (2) t-distribution with 3 degrees of freedom, t(3);
  • (3) exponential distribution with the rate 0.5, Exp(0.5);
  • (4) mixture normal distribution (MN), 0.5N(−2, 1) + 0.5N(2, 1).

The sample sizes n = 50 and n = 100 are considered. Since the frequentist approach requires the identifiability constraint ‖β‖ = 1, we normalize the Bayesian estimates of the index vector to have unit norm, and the first component of the index vector is required to be positive to resolve the sign indeterminacy.
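This post-processing step amounts to the following (a small sketch of ours):

```python
import numpy as np

def normalize_index(beta):
    """Rescale an index-vector draw to unit norm with positive first entry."""
    b = np.asarray(beta, dtype=float)
    b = b / np.linalg.norm(b)
    return -b if b[0] < 0 else b
```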

The biases and standard deviations (SDs × 100) calculated by BCQR9, BCQR5, BQR and BLR for different error distributions in Examples 1–3 are summarized in Tables 1–3. Here, BCQR5 and BCQR9 denote the BCQR method with M = 5 and M = 9, respectively. From Table 1, it can be seen that when the error follows the mixture normal distribution, the performances of the four methods are similar. When the error follows N(0, 1), BLR performs best; for example, when n = 100 in Example 1, the biases of all parameters are similar across the four methods, but the SDs of BLR are the smallest. For the t(3) error distribution, the biases of BCQR9, BCQR5 and BQR are better than those of BLR in most cases, and their SDs are smaller than those of BLR for all four parameters at both sample sizes; thus, BCQR9, BCQR5 and BQR outperform BLR under t(3) errors. BCQR9 and BCQR5 outperform BQR for the exponential error distribution, as they again have the smallest SDs. From Tables 2 and 3, we obtain conclusions similar to those for Example 1, and the advantages of BCQR9 and BCQR5 are even more pronounced: their biases and SDs are both significantly smaller than those of BLR and BQR in all settings. To sum up, the performances of BCQR with M = 5 and M = 9 are similar, and both are better than the BQR and BLR methods in most cases.

Table 1. Comparison of Bias (×100) and SDs (×100) of for BCQR, BQR and BLR based on 100 replications in each case for simulation Example 1.

https://doi.org/10.1371/journal.pone.0285277.t001

Table 2. Comparison of Bias (×100) and SD (×100) of for BCQR, BQR and BLR based on 100 replications in each case for simulation Example 2.

https://doi.org/10.1371/journal.pone.0285277.t002

Table 3. Comparison of Bias (×100) and SD (×100) of for BCQR, BQR and BLR based on 100 replications in each case for simulation Example 3.

https://doi.org/10.1371/journal.pone.0285277.t003

In addition, we investigate the estimation accuracy for Y in terms of bias (×100) and SD for BLR, BQR, BCQR5 and BCQR9 under the different error settings. It can be seen from Table 4 that the performances of the four methods are similar, with their advantages reflected in different aspects: the proposed BCQR5 and BCQR9 tend to have smaller biases, while BLR and BQR may have smaller SDs.

Table 4. Comparison of Bias (×100) and SD of Y for BCQR, BQR and BLR based on 100 replications in each case for simulation Examples 1–3.

https://doi.org/10.1371/journal.pone.0285277.t004

Next, we display the trace plots of some parameters to check the convergence of our approach. As the results for the four error distributions are very similar, we only show the results for BCQR9 with the N(0, 1) error term in Figs 1–3. These figures show that the chains of posterior samples converge quickly, which means the samples derived by our approach can be considered to achieve satisfactory convergence. The true function η(⋅) and the link function estimated by BCQR9 (based on the posterior mean) are displayed in Figs 4–6. The 95% credible intervals (CIs) with sample size n = 100 are also plotted.

Fig 1. Trace plots of β1, β2, β3, β4 for BCQR9 in simulation Example 1 with four different error distributions, n = 100.

https://doi.org/10.1371/journal.pone.0285277.g001

Fig 2. Trace plots of β1, β2, β3, β4 for BCQR9 in simulation Example 2 with four different error distributions, n = 100.

https://doi.org/10.1371/journal.pone.0285277.g002

Fig 3. Trace plots of β1, β2, β3, β4 for BCQR9 in simulation Example 3 with four different error distributions, n = 100.

https://doi.org/10.1371/journal.pone.0285277.g003

Fig 4. The true function (black), estimated link function (red) and CI (blue) by BCQR9 in simulation Example 1 with four different error distributions, n = 100.

https://doi.org/10.1371/journal.pone.0285277.g004

Fig 5. The true function (black), estimated link function (red) and CI (blue) by BCQR9 in simulation Example 2 with four different error distributions, n = 100.

https://doi.org/10.1371/journal.pone.0285277.g005

Fig 6. The true function (black), estimated link function (red) and CI (blue) by BCQR9 in simulation Example 3 with four different error distributions, n = 100.

https://doi.org/10.1371/journal.pone.0285277.g006

4 Real data analysis

Here, we apply the proposed approach to a body fat dataset (Penrose et al. [30]), which can be found in the R package "mfp". There are 252 observations with the response variable (percent body fat) and 13 covariates (abdomen, weight, height, neck, chest, age, hip, thigh, knee, ankle, biceps, forearm, wrist). Liu and Yang [31] and Li et al. [32] considered single-index model variable selection procedures for this dataset; both methods select abdomen, neck and wrist as the significant variables. Liu et al. [15] considered weighted composite quantile estimation of the single-index model for this data. Similar to Liu et al. [15], we take the response y = log(percent body fat) and select the covariates x1 = age, x2 = abdomen, x3 = wrist. One aim is to estimate the relationship between body fat and the covariates; the other is to predict body fat from x1 = age, x2 = abdomen and x3 = wrist. For prediction, we regard the first 150 observations as the training dataset Dtrain and the rest as the testing dataset Dtest. We then use the coefficients obtained from the training dataset to calculate the mean absolute prediction error (MAPE) of BQR, BCQR5 and BCQR9 on the testing dataset Dtest, where the MAPE is the mean of .
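The MAPE formula itself did not survive extraction; as the name "mean of the absolute prediction error" suggests, a plausible computation is the average of |yi − ŷi| over Dtest, sketched below (our assumption, not the authors' code):

```python
import numpy as np

def mape(y_test, y_pred):
    """Mean absolute prediction error over the held-out test set."""
    return float(np.mean(np.abs(np.asarray(y_test) - np.asarray(y_pred))))
```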

The following model is considered:

Before applying our method, we exclude observations with percentage body fat estimated to be 0 or density less than 1. Here, all covariates are standardized. The estimated results are summarized in Table 5.

Table 5. Results of Bayesian single-index model for body fat data.

https://doi.org/10.1371/journal.pone.0285277.t005

From Table 5, we can see that age and abdomen both have positive relationships with the response, while wrist has a negative influence on body fat. Among them, abdomen has the greatest impact, while age has the smallest effect on body fat. In addition, the estimates of β are very close to each other for BCQR5 and BCQR9. The lengths of the credible intervals for BCQR9 and BCQR5 are both smaller than those of BQR, and BCQR9 has the shortest credible interval for β. The prediction performances of the three methods are also similar. Correspondingly, we fit the link function η(⋅) in Fig 7. Thus, we conclude that there is a nonlinear relationship between the covariates and the response variable, which shows that the proposed method can identify the link function η(⋅) well.

Fig 7. The estimated link function η(⋅) (black) and CI (blue) by BQR, BCQR5, BCQR9 for body fat data.

https://doi.org/10.1371/journal.pone.0285277.g007

5 Discussion

In this article, we propose a Bayesian composite quantile regression method for single-index models based on a Gaussian process prior for the nonparametric link function, and present an efficient MCMC algorithm for posterior inference. We then use three synthetic examples and one real dataset to illustrate and examine the performance of BCQR in comparison with BQR and BLR. The results show that the proposed method performs well and yields shorter credible intervals. The proposed method is very effective for small or moderate samples, while it may be less favored as the sample size grows large, because the positive definiteness of the posterior covariance of the link function may be violated numerically for large samples. We will study this problem in future work. In addition, the proposed approach may be extended to incomplete data such as censored or missing data. Moreover, as high-dimensional data become increasingly common, penalties such as the Lasso, adaptive Lasso and SCAD may be added to our method to perform variable selection.

References

  1. Powell JL., Stock JH., Stoker TM. Semiparametric estimation of index coefficients. Econometrica. 1989; 57(6): 1403–1430.
  2. Härdle W., Stoker TM. Investigating smooth multiple regression by the method of average derivatives. Journal of the American Statistical Association. 1989; 84(408): 986–995.
  3. Härdle W., Hall P., Ichimura H. Optimal smoothing in single-index models. The Annals of Statistics. 1993; 21(1): 157–178.
  4. Xia Y., Härdle W. Semi-parametric estimation of partially linear single-index models. Journal of Multivariate Analysis. 2006; 97: 1162–1184.
  5. Chen J., Gao J., Li D. Estimation in single-index panel data models with heterogeneous link functions. Econometric Reviews. 2013; 32(8): 928–955.
  6. Zhao Y., Feng S. Robust estimation for partial linear single-index models. Journal of Nonparametric Statistics. 2022; 34(1): 228–249.
  7. Koenker R., Bassett G. Jr. Regression quantiles. Econometrica. 1978; 46: 33–50.
  8. Wu TZ., Yu K., Yu Y. Single-index quantile regression. Journal of Multivariate Analysis. 2010; 101(7): 1607–1621.
  9. Lv Y., Zhang R., Zhao W., et al. Quantile regression and variable selection of partial linear single-index model. Annals of the Institute of Statistical Mathematics. 2015; 67(2): 375–409.
  10. Jiang R., Qian WM. Quantile regression for single-index coefficient regression models. Statistics & Probability Letters. 2016; 110: 305–317.
  11. Xu H., Fan G., Li J. Single-index quantile regression with left truncated data. Journal of Systems Science and Complexity. 2022; 35(5): 1963–1987.
  12. Jiang R., Zhou ZG., Qian WM., et al. Single-index composite quantile regression. Journal of the Korean Statistical Society. 2012; 41(3): 323–332.
  13. Jiang R., Zhou ZG., Qian WM., et al. Two-step composite quantile regression for single-index models. Computational Statistics & Data Analysis. 2013; 64: 180–191.
  14. Jiang R., Qian WM., Zhou ZG. Weighted composite quantile regression for single-index models. Journal of Multivariate Analysis. 2016; 148: 34–48.
  15. Liu H., Yang H., Peng C. Weighted composite quantile regression for single index model with missing covariates at random. Computational Statistics. 2019; 34(4): 1711–1740.
  16. Jiang R., Yu K. Single-index composite quantile regression for massive data. Journal of Multivariate Analysis. 2020; 180: 104669.
  17. Song Y., Li Z., Fang M. Robust variable selection based on penalized composite quantile regression for high-dimensional single-index models. Mathematics. 2022; 10(12): 2000.
  18. Antoniadis A., Grégoire G., McKeague IW. Bayesian estimation in single-index models. Statistica Sinica. 2004; 14(4): 1147–1164.
  19. Wang HB. Bayesian estimation and variable selection for single index models. Computational Statistics & Data Analysis. 2009; 53(7): 2617–2627.
  20. Choi T., Shi JQ., Wang B. A Gaussian process regression approach to a single-index model. Journal of Nonparametric Statistics. 2011; 23(1): 21–36.
  21. Gramacy RB., Lian H. Gaussian process single-index models as emulators for computer experiments. Technometrics. 2012; 54(1): 30–41.
  22. Hu Y., Gramacy RB., Lian H. Bayesian quantile regression for single-index models. Statistics and Computing. 2013; 23(4): 437–454.
  23. Liu CS., Liang HY. Bayesian analysis in single-index quantile regression with missing observation. Communications in Statistics - Theory and Methods. 2022; 1–29.
  24. Zou H., Yuan M. Composite quantile regression and the oracle model selection theory. The Annals of Statistics. 2008; 36: 1108–1126.
  25. Yu K., Moyeed RA. Bayesian quantile regression. Statistics & Probability Letters. 2001; 54(4): 437–447.
  26. Geraci M., Bottai M. Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics. 2007; 8(1): 140–154. pmid:16636139
  27. Luo Y., Lian H., Tian M. Bayesian quantile regression for longitudinal data models. Journal of Statistical Computation and Simulation. 2012; 82(10-12): 1635–1649.
  28. Kozumi H., Kobayashi G. Gibbs sampling methods for Bayesian quantile regression. Journal of Statistical Computation and Simulation. 2011; 81(11-12): 1565–1578.
  29. Barndorff-Nielsen OE., Shephard N. Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2001; 63(2): 167–241.
  30. Penrose K., Nelson A., Fisher A. Generalized body composition prediction equation for men using simple measurement techniques. Medicine & Science in Sports & Exercise. 1985; 17: 189.
  31. Liu H., Yang H., Xia X. Robust estimation and variable selection in censored partially linear additive models. Journal of the Korean Statistical Society. 2017; 46: 88–103.
  32. Li J., Li Y., Zhang R. B spline variable selection for the single index models. Statistical Papers. 2017; 58: 691–706.