Quantile regression for static panel data models with time-invariant regressors

  • Li Tao,

    Roles Data curation, Formal analysis, Methodology, Writing – original draft

    Affiliation School of Information, Beijing Wuzi University, Beijing, China

  • Lingnan Tai,

    Roles Data curation, Writing – review & editing

    Affiliation Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, China

  • Maozai Tian

    Roles Methodology, Supervision, Writing – review & editing

    mztian@ruc.edu.cn

    Affiliations Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, China, School of Statistics and Information, Xinjiang University of Finance and Economics, Xinjiang, China

Abstract

This paper proposes two new weighted quantile regression estimators for static panel data models with time-invariant regressors. The two estimators improve the estimation of the coefficients of time-invariant regressors and are computationally convenient and simple to implement. The paper also establishes consistency and asymptotic normality of the two proposed estimators under sequential and simultaneous N, T asymptotics. Monte Carlo simulations under various parameter settings support the validity of the proposed approach. An empirical application studies the factors influencing China’s exports using the trade gravity model.

Introduction

Panel data have distinct characteristics compared with simple cross-sectional data and time series data. Panel data contain more observations, which can improve the validity of econometric model estimation. Besides, panel data can control for the effect of omitted variables in modeling and thereby improve estimation accuracy. Traditional panel data analysis is mainly based on conditional mean regression methods, which cannot fully describe the data. Quantile regression for panel data can describe the effect of the independent variables across the whole range of the dependent variable, and capture systematic influences of covariates on the location, scale and shape of the conditional distribution of the response.

Recently, there has been a growing literature on quantile regression for static panel data. Koenker [1] proposed quantile regression with fixed effects (FEQR) and penalized quantile regression with fixed effects (PQR) employing l1 regularization, and pointed out that shrinking a large number of individual fixed effects toward a common value in the panel model can help to reduce the variability caused by these individual effects. Lamarche [2] proved that there exists an optimal penalty parameter for penalized quantile regression for panel data with fixed effects. However, when the sample size is large, computing the penalized estimator is rather complicated. Moreover, when the panel data model contains time-invariant variables, the penalized estimator is less effective in estimating the coefficients of the time-invariant variables.

Canay [3] introduced a two-step estimator for panel data quantile regression models. The two-step estimation method eliminates the fixed effects in the first step, which greatly reduces the number of parameters to be estimated in the quantile regression and avoids the choice of a penalty parameter. However, when there are time-invariant variables in the panel model, the two-step estimation method cannot estimate the coefficients of the time-invariant covariates.

Galvao and Wang [4] developed a new minimum distance quantile regression (MD-QR) estimator for panel data models with fixed effects, which is computationally fast, especially for large cross-sections. Galvao et al. [5] proved the unbiased asymptotic normality of both the FEQR and MD-QR estimators, showing that quantile regression is applicable to the same type of panel data (in terms of N, T) as other nonlinear models. However, the MD-QR estimator, which is defined as the weighted average of the individual quantile regression slope estimators, is not applicable when the panel model contains time-invariant independent variables. There are other studies on quantile regression for panel data. See, for example, Harding and Lamarche [6, 7], Galvao et al. [8], Tao et al. [9], Dai et al. [10], Dai and Jin [11]. The existing research on panel data models mainly focuses on obtaining estimates of the coefficients of time-varying covariates. However, when the panel model contains time-invariant covariates, most of these methods are ineffective, or the estimation results are poor.

Some researchers have studied parameter estimation for panel models with time-invariant covariates based on mean regression methods. Plümper and Troeger [12] suggested a three-stage procedure for the estimation of time-invariant variables in panel data models with unit effects. Pesaran and Zhou [13] proposed the fixed effects filtered (FEF) and fixed effects filtered instrumental variable (FEF-IV) estimators for estimation and inference in the case of time-invariant effects in static panel data models when N is large and T is fixed. Kripfganz and Schwarz [14] proposed a two-stage estimation procedure to identify the effects of time-invariant regressors in a dynamic version of the Hausman-Taylor model. Zhang and Zhou [15] used the generalized method of moments (GMM) to estimate the time-varying effects in the first step, and ran a cross-sectional OLS regression of the time series average of the residuals to estimate the time-invariant effects in the second step.

To the best of our knowledge, our paper is the first to study quantile regression for panel data models with time-invariant regressors. The minimum distance estimation and the two-step estimation of quantile regression fail when the panel data model contains time-invariant regressors. Therefore, we propose two new weighted quantile regression estimators to improve the estimation of the coefficients of time-invariant variables. First, for the case where the time-invariant regressor is exogenous, we propose a weighted estimator of quantile regression (W-QR). Second, for the case where the time-invariant regressor is endogenous, we propose a weighted estimator of instrumental variable quantile regression (W-IVQR). The two proposed methods need only two steps, which makes them computationally convenient and simple to implement. In the first step, we regress the dependent variable on the time-varying variables with an intercept using conventional quantile regression to obtain the slope and intercept estimators for each individual. In the second step, we apply different weighting schemes to the obtained slope and intercept estimators to get the estimators of β and γ, respectively. Besides, we study the asymptotic properties of the W-QR and W-IVQR estimators under both sequential and simultaneous limits. Monte Carlo simulations under various parameter settings support the validity of the proposed approach. Finally, we illustrate the proposed W-QR estimation with an application that analyzes the factors influencing China’s exports using the trade gravity model.

The rest of the paper is organized as follows. Section 2 gives the static panel data model with time-invariant regressors and proposes the W-QR estimator and the W-IVQR estimator. Section 3 is devoted to the asymptotic behavior of the proposed estimators. Section 4 describes the Monte Carlo simulation. In Section 5, we illustrate the new approaches with an application to analyze the effects of the influence factors of China’s exports using the trade gravity model. In the end, Section 6 summarizes the paper.

The model and estimators

Static panel data model with time-invariant regressors

Consider the panel data model that contains time-varying as well as time-invariant regressors: $y_{it} = \eta_i + z_i'\gamma + x_{it}'\beta + \varepsilon_{it}$, $i = 1, \dots, N$, $t = 1, \dots, T$, (1) where xit is a q × 1 vector of time-varying variables, and zi is a p × 1 vector of observed individual-specific variables that only vary over the cross-sectional unit i. In addition to zi, the outcomes, yit, are also governed by unobserved individual specific effects, ηi. The focus of the analysis is on estimation and inference involving the elements of β and γ. It is clear that without further restrictions on ηi, γ cannot be identified even if β is known to the researchers. For example, consider the simple case where β = 0, and assume that T is small. Then averaging across t, we obtain $\bar{y}_i = z_i'\gamma + v_i$, where $\bar{y}_i = T^{-1}\sum_{t=1}^{T} y_{it}$, $\bar{\varepsilon}_i = T^{-1}\sum_{t=1}^{T} \varepsilon_{it}$, and $v_i = \eta_i + \bar{\varepsilon}_i$. It is clear that without specifying how vi and zi are related it will not be possible to identify γ. To deal with this problem, it is often assumed that there exist instruments that are uncorrelated with vi, but at the same time are sufficiently correlated with zi. Even if such instruments exist, a number of further complications arise if β ≠ 0. In such a case, the instrumental variable approach must be extended also to deal with the possible dependencies between ηi and xit. In what follows we allow for xit and ηi to have any degree of dependence, but initially assume that zi and vi are uncorrelated for identification of γ.

It is convenient to write model (1) in matrix form as $y = X\beta + Z\gamma + D\eta + \varepsilon$, (2) where y = (yit) is an NT × 1 matrix, $z = (z_1, \dots, z_N)'$ is an N × p matrix, $Z = z \otimes l_T$ is an NT × p matrix, ⊗ is the Kronecker product, X = (xit) is an NT × q matrix, $D = I_N \otimes l_T$, lT is a T × 1 vector of ones, η = (η1, ⋯, ηN)′ is the N × 1 vector of individual specific effects or intercepts, and ε = (εit) is an NT × 1 matrix. Note that D is an incidence matrix that identifies the N distinct individuals in the sample.
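The stacking in the matrix form can be illustrated with a short NumPy sketch. It assumes observations are ordered by individual first and then by time; the function name `build_design` and the toy numbers are hypothetical and only serve the illustration.

```python
import numpy as np

def build_design(z, T):
    """Build D = I_N (x) l_T and Z = z (x) l_T for a balanced panel.

    Assumes rows are ordered unit by unit (all T periods of unit 1 first).
    z is an N x p array of time-invariant regressors.
    """
    z = np.asarray(z, dtype=float)
    N = z.shape[0]
    l_T = np.ones((T, 1))            # T x 1 vector of ones
    D = np.kron(np.eye(N), l_T)      # NT x N incidence matrix
    Z = np.kron(z, l_T)              # NT x p, each row of z repeated T times
    return D, Z

# Toy example: N = 3 units, T = 2 periods, p = 1 time-invariant regressor.
z = np.array([[10.0], [20.0], [30.0]])
eta = np.array([1.0, 2.0, 3.0])
D, Z = build_design(z, T=2)
stacked_effects = D @ eta            # each individual effect repeated T times
```

Multiplying D by η simply repeats each individual effect over that individual's T rows, which is why D identifies the N distinct individuals.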

We assume that the τth quantile of the error εit(τ) is equal to zero. We consider the following model for the τth conditional quantile function of the response of the tth observation on the ith individual yit: $Q_{y_{it}}(\tau \mid x_{it}, z_i, \eta_i) = x_{it}'\beta(\tau) + z_i'\gamma(\tau) + \eta_i(\tau)$. (3) Galvao and Wang [4] propose a simple-to-implement and efficient minimum distance quantile regression (MD-QR) estimator for panels with fixed effects, defined as $\hat{\beta}_{MD}(\tau) = \left(\sum_{i=1}^{N} V_i^{-1}\right)^{-1} \sum_{i=1}^{N} V_i^{-1} \hat{\beta}_i(\tau)$, where $\hat{\beta}_i(\tau)$ is the slope coefficient estimator from each individual quantile regression problem using the time-series data, and Vi denotes the associated variance-covariance matrix of $\hat{\beta}_i(\tau)$ for each individual. As we can see, the MD-QR estimator is defined as the weighted average of the conventional QR slope estimators, with weights given by the inverses of the corresponding individual variance-covariance matrices. Thus, when the model contains time-invariant regressors, the MD-QR estimator cannot identify γ(τ) and ηi(τ).

Chetverikov et al. [16] are primarily interested in estimating γ(τ). They point out that (3) can also be considered as a special case in form of (4) (5) Indeed, setting , and assuming that . To estimate γ(τ), they develop the grouped IV quantile regression estimator which consists of the following two stages. At the first stage, for each τ and i, estimate τth quantile regression of yit on by the classical quantile regression estimator of Koenker and Bassett [17]. Estimate a 2SLS regression of on zi using the instrument to get an estimator of γ(τ) at the second stage. The differences between our study and Chetverikov et al. [16] lie in three aspects: (1) Chetverikov et al. [16] consider a more general model that allows for interaction via , the panel data model (3) studied in this paper is a special case of (4)-(5) in form and has no interaction effect; (2) our study is interested in estimating β(τ) and γ(τ), Chetverikov et al. [16] are primarily interested in estimating γ(τ) without considering the estimation of β(τ); (3) there are different requirements for N and T tending to infinity in asymptotic theory, Chetverikov et al. [16] assume that as N → ∞, our study assumes that as N → ∞.

The W-QR estimator

In this section, we consider the case when the time-invariant regressor zi is exogenous, that is, E(ziηi(τ)) = 0. The weighted quantile regression (W-QR) estimator can be computed as follows.

Step 1: For each individual i and each quantile index τ from the set U of indices of interest, estimate the τth quantile regression of yit on the time-varying regressors xit by the classical quantile regression estimator of Koenker and Bassett [17]: $(\hat{\alpha}_i(\tau), \hat{\beta}_i(\tau)) = \arg\min_{(\alpha, \beta)} \sum_{t=1}^{T} \rho_\tau(y_{it} - \alpha - x_{it}'\beta)$, (6) where ρτ(u) = u(τ − I(u < 0)) is the check function. Here $\hat{\beta}_i(\tau)$ and $\hat{\alpha}_i(\tau)$ are the slope and intercept estimators of each individual quantile regression problem using the time-series data.
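Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a single scalar time-varying regressor and finds the minimiser by brute force, using the linear-programming fact that some optimal line passes through two sample points. In practice one would call a dedicated quantile regression solver. The function names are hypothetical.

```python
import itertools
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * np.where(u < 0, tau - 1.0, tau)

def fit_unit_qr(x, y, tau):
    """Quantile regression of y on (1, x) for one individual's time series.

    Enumerates every line through two sample points and keeps the one with
    the smallest total check loss. Only sensible for small T and scalar x.
    Returns (intercept, slope).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best, best_loss = None, np.inf
    for s, t in itertools.combinations(range(len(y)), 2):
        if x[s] == x[t]:
            continue                      # no finite slope through these points
        slope = (y[t] - y[s]) / (x[t] - x[s])
        intercept = y[s] - slope * x[s]
        loss = check_loss(y - intercept - slope * x, tau).sum()
        if loss < best_loss:
            best, best_loss = (intercept, slope), loss
    return best

def step_one(X, Y, tau):
    """Apply fit_unit_qr to each row of the N x T arrays X and Y."""
    fits = [fit_unit_qr(x_i, y_i, tau) for x_i, y_i in zip(X, Y)]
    alphas = np.array([f[0] for f in fits])
    betas = np.array([f[1] for f in fits])
    return alphas, betas
```

The arrays returned by `step_one` are the per-individual intercepts and slopes that the second step combines.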

Step 2: Follow Galvao and Wang [4], and like Chetverikov et al. [16] and Pesaran and Zhou [13], the W-QR estimator is computed as (7) (8) (9) where , , and Vi denotes the associated variance-covariance matrix of for each individual. As we can see that the estimators and gained in the first step are weighted in different forms separately to get the W-QR estimator of and in the second step. According to the above formulas (7) and (8), the is a weighted combination of , and is a weighted combination of centralized . In addition, the definition of makes sense because is the intercept for cross-sectional unit i.
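The second step can be sketched as follows. Because the displayed formulas did not survive in this copy, the sketch rests on stated assumptions: the slope estimator is taken to be the inverse-variance weighted average of the individual slopes (as in the MD-QR estimator of Galvao and Wang [4]), and the estimator of γ is taken to be a least-squares fit of the centred individual intercepts on the centred zi. The function names are hypothetical.

```python
import numpy as np

def wqr_beta(betas, Vs):
    """Inverse-variance weighted average of individual slope estimates.

    betas: (N, q) array of unit-level slopes.
    Vs:    (N, q, q) array of their variance-covariance matrices.
    """
    betas = np.asarray(betas, dtype=float)
    W = np.linalg.inv(np.asarray(Vs, dtype=float))   # V_i^{-1} for every unit
    lhs = W.sum(axis=0)                              # sum_i V_i^{-1}
    rhs = np.einsum("ijk,ik->j", W, betas)           # sum_i V_i^{-1} beta_i
    return np.linalg.solve(lhs, rhs)

def wqr_gamma(alphas, Z):
    """Least squares of centred unit intercepts on centred z_i (assumed form).

    alphas: (N,) array of unit-level intercepts.
    Z:      (N, p) array of time-invariant regressors.
    """
    alphas = np.asarray(alphas, dtype=float)
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)
    ac = alphas - alphas.mean()
    return np.linalg.solve(Zc.T @ Zc, Zc.T @ ac)
```

Centring removes any common constant in the intercepts, so only cross-sectional variation in zi is used to recover γ.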

Remark 1: In mean regression, one needs to compute the FE estimator of β and then use the FE residuals to obtain γ; see Plümper and Troeger [12] and Pesaran and Zhou [13]. That is to say, the mean regression methods for estimating model (1) need at least two stages: the FE estimator of β is computed at the first stage, and the estimator of the time-invariant effects is obtained at the second stage. Compared with the mean regression methods of parameter estimation in model (1), the proposed method needs only two steps, which makes it computationally convenient and simple to implement. In the first step, regress yit on xit with an intercept using conventional quantile regression to obtain the slope and intercept estimators for each i; in the second step, apply different weighting schemes to the obtained slope and intercept estimators to get the W-QR estimators of β and γ, respectively.

Remark 2: However, the estimator defined in (8) is infeasible in applications unless Vi is known for every individual. Thus, it is suggested to replace the unknown Vi with a corresponding consistent estimator; the resulting feasible estimator is given by (10). The specific form of the variance estimator depends on the assumption on the dependence across individuals; an example of such an estimator is provided in formula (12).

The W-IVQR estimator

In this section, we discuss the case when the time-invariant regressor zi is endogenous. That is, we consider that zi is correlated with ηj(τ) or εj(τ). We propose an IV version of the W-QR estimator (denoted as W-IVQR) that allows for endogeneity of the time-invariant regressors.

We now provide the W-IVQR estimator that allows for possible endogeneity of the time-invariant regressors, assuming the existence of an s × 1 vector of instruments ri for the time-invariant zi, where ri is independent of ηj(τ) and εj(τ) for all i and j, and s ≥ p. The W-IVQR estimator can be obtained as follows.

Step 1: For each individual i and each quantile index τ from the set U of indices of interest, estimate the τth quantile regression of yit on the time-varying regressors xit by the classical quantile regression estimator of Koenker and Bassett [17]: $(\hat{\alpha}_i(\tau), \hat{\beta}_i(\tau)) = \arg\min_{(\alpha, \beta)} \sum_{t=1}^{T} \rho_\tau(y_{it} - \alpha - x_{it}'\beta)$. (11) Here $\hat{\beta}_i(\tau)$ and $\hat{\alpha}_i(\tau)$ are the slope and intercept estimators of each individual quantile regression problem using the time-series data.

Step 2: Follow Galvao and Wang [4], and like Chetverikov et al. [16] and Pesaran and Zhou [13], the W-IVQR estimator is computed as and where Qzr,N, Qrr,N and are , and , , .

Remark 3: In Step 2, we estimate a centralized 2SLS regression of the individual intercept estimators on zi using ri as an instrument to get an estimator of γ(τ). The instrument ri needs to satisfy the following two conditions: (i) the instruments ri affect zi, and dim(ri) ≥ dim(zi); (ii) ri is independent of ηj(τ) and εj(τ) for all i and j. In practice, for panel data model (1), we can choose the Hausman and Taylor instrumental variables of [18]. Hausman and Taylor [18] proposed the Hausman and Taylor (HT) estimator. One advantage of the HT estimator is that it does not need instrumental variables from outside the model, but it requires the number of variables in xit that are uncorrelated with the individual effects to be greater than the number of variables in zi that are correlated with the individual effects. That is, the HT instruments can serve as ri, provided that s ≥ p in practice. HT instrumental variables have been widely used in quantile regression for panel data; see, e.g., [6, 9].
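A centralized 2SLS step of the kind described in Remark 3 can be sketched as below. The exact expressions of the paper are not reproduced in this copy, so the sketch uses the textbook 2SLS formula applied to centred data; the function name and the moment-matrix names are assumptions made for illustration.

```python
import numpy as np

def centred_2sls(alphas, Z, R):
    """Textbook 2SLS of centred unit intercepts on centred z_i with instruments r_i.

    alphas: (N,) unit-level intercept estimates from step 1.
    Z:      (N, p) time-invariant regressors (possibly endogenous).
    R:      (N, s) instruments with s >= p.
    """
    alphas, Z, R = (np.asarray(a, dtype=float) for a in (alphas, Z, R))
    N = len(alphas)
    Zc = Z - Z.mean(axis=0)
    Rc = R - R.mean(axis=0)
    ac = alphas - alphas.mean()
    Q_zr = Zc.T @ Rc / N                        # p x s cross moment
    Q_rr = Rc.T @ Rc / N                        # s x s instrument moment
    Q_ra = Rc.T @ ac / N                        # s-vector
    A = Q_zr @ np.linalg.solve(Q_rr, Q_zr.T)    # p x p
    b = Q_zr @ np.linalg.solve(Q_rr, Q_ra)      # p-vector
    return np.linalg.solve(A, b)
```

With s = p this reduces to the simple IV estimator; with s > p the instruments are combined through the weighting by the inverse instrument moment matrix.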

Asymptotic theory

Now we briefly discuss the asymptotic properties of the W-QR and W-IVQR estimators. We study the asymptotic properties of the two proposed estimators when both T and N go to infinity, both sequentially and simultaneously. The sequential asymptotics, denoted by (T, N)seq → ∞, is defined as T diverging to infinity first, and then N → ∞. The simultaneous asymptotics, denoted by (T, N) → ∞, means T and N tend to infinity at the same time.

Basic assumptions

To establish the asymptotic properties, we impose the following regularity conditions.

Assumption 1: (i) Observations are independent across individuals. (ii) For all i = 1, ⋯, N, (xit, yit) are i.i.d. within individuals.

Assumption 2: (i) For all i = 1, ⋯, N and t = 1, ⋯, T, the regressors, xit satisfy the moment conditions ∥ xit ∥ ≤ CM. (ii) For all i = 1, ⋯, N, all eigenvalues of are bounded from below by cM.

Assumption 3: For all τ ∈ U, (β(τ), γ(τ), η(τ)) is in the interior of the set and is compact. Put , for each δ > 0, where Fi(⋅) is the distribution function of uit conditional on xit and zi.

Assumption 4: (i) Let fi(⋅) denote the conditional density function of uit given xit and zi; fi(⋅) is continuously differentiable. For all τ ∈ U, i = 1, ⋯, N, fi(⋅) ≤ Cf and fi(0) ≥ cf. (ii) The derivative satisfying .

Although in practice the observations are dependent across time for panel data, Assumption 1 is usual in the literature (see e.g. [4, 5, 11, 19]), partly because the measurements are sparse and the dependence between them is negligible. Assumption 1 excludes such temporal dependence to simplify the results. Assumption 2(i) imposes a boundedness condition on xit. Assumption 2(ii) assures that are bounded uniformly across i. Assumption 3 imposes compactness of the parameter space, and the inequality is important for the identification of the parameters. It corresponds to Condition 3 of [4, 20]. Assumption 4 is a mild regularity condition that is typically imposed in quantile regression analysis. It restricts the smoothness and the boundedness of the density and its derivatives.

In applications, the variance-covariance matrices are unknown and need to be estimated. When T and N tend to infinity, we impose the following assumption.

Assumption 5: as T → ∞. Assume that , where exists and is nonsingular.

Assumption 5’: for some hN ↓ 0 uniformly across i and as N → ∞, hN is a bandwidth. Assume that , where exists and is nonsingular.

An example that satisfies Assumption 5 is given in Equation 9 in [5]. Besides, an example satisfying Assumption 5’ is (12) where and is defined in [19].

Asymptotic properties of the W-QR estimator

Assumption 6: (i) The time-invariant regressors, zi are independent of for all τ ∈ U and i, j = 1, ⋯, N, and ηi and are independent. (ii) For all i = 1, ⋯, N, zi satisfy the moment conditions . (iii) As N → ∞, .

Assumption 7: (i) For all i = 1, ⋯, N, . (ii) As N → ∞, .

Assumption 6(i)-(ii) require that zi is exogenous and bounded. Assumption 6(iii) and Assumption 7 are familiar identification conditions in regression analysis.

Under both sequential and simultaneous limits, we show the consistency of and .

Theorem 1. 1. Under Assumptions 1–3 and Assumptions 5–7, as (T, N)seq → ∞.

2. Under Assumptions 1–4, 5’and Assumptions 6–7, as (T, N) → ∞ and .

Next, under the simultaneous limits, we show the asymptotic normality of and .

Theorem 2. Under Assumptions 1–4, 5’and Assumptions 6–7, as (T, N)→∞ and , and where , Qzz is defined in Assumption 6(iii) and J(τ) is defined in Assumption 7(ii), and V is defined in Assumption 5’.

Asymptotic properties of the W-IVQR estimator

In this section, we study the asymptotic properties of the W-IVQR estimator. The W-IVQR estimator is proposed to deal with the case where zi is an endogenous variable. Assume there exists an instrument ri for the time-invariant zi, where ri is independent of ηj(τ) and εj(τ) for all i and j, and s ≥ p. Throughout this section we will make the following assumptions.

Assumption 6’: As N → ∞,

Assumption 7’: (i) For all τ ∈ U and i, j = 1, ⋯, N, E[riηj(τ)] = 0. (ii) As N → ∞, . (iii) For all i = 1, ⋯, N and t = 1, ⋯, T, yit is independent of ri conditional on (xit, zi, ηi). (iv) For all i = 1, ⋯, N, .

Assumptions 6’-7’ are identification conditions.

Theorem 3. 1. Under Assumptions 1–3, 5 and 6’-7’, as (T, N)seq → ∞.

2. Under Assumptions 1–4, 5’-7’, as (T, N) → ∞ and .

Theorem 4. Under Assumptions 1–4, 5’-7’, as (T, N) → ∞ and , and where , , Qzr and Qrr are defined in Assumption 6’, is defined in Assumption 7’(ii), and V defined in Assumption 5’.

Remark 4: In our treatment of panel data models we have assumed that a balanced panel is available, that is, each cross-sectional unit has the same time periods available. Often, some periods are missing for some units, and we are left with an unbalanced panel. The two proposed estimators for unbalanced panel data are consistent and asymptotically normal under specific assumptions. Following [21], let κi = (κi1, ⋯, κiT)′ denote the T × 1 vector of selection indicators: κit ≡ 1 if (xit, yit) is observed, and zero otherwise. Consider the case where κi is independent of (εi(τ), xi, zi, ηi(τ)) for τ ∈ U, where xi = (xi1, ⋯, xiT)′ and εi = (εi1, ⋯, εiT)′. Then the proposed estimators are consistent and asymptotically normal after strengthening some assumptions, such as satisfies the assumptions of T, where . A more complicated problem arises when κi depends on (εi(τ), xi, zi, ηi(τ)); for example, when Ti is treated as nonrandom, more assumptions need to be modified.

Monte Carlo simulation

We conduct simulations to assess the finite sample performance of the proposed W-QR and W-IVQR estimators. We employ two variants of the data generating process (DGP). Under DGP A, the time-invariant variable is exogenous, while under DGP B, the time-invariant variable is correlated with the fixed effects. Specifically, yit, fixed effect αi and time-varying regressors are generated from the following model: where β1 = β2 = γ1 = γ2 = 1, g1t ∼ U(1, 2), g2t ∼ U(1, 2), ηi ∼ 0.5(χ²(2) − 2) and .

As regards the time-invariant variables zi,j for j = 1, 2, we consider two different ways of generating them. The two time-invariant regressors zi1 and zi2 are generated as where , , and . In DGP A, we set ϕ = (ϕ1, ϕ2)′ = 0, so that the time-invariant regressors are exogenous. In DGP B, we set ϕ = (ϕ1, ϕ2)′ = (1, 1)′, meaning that the time-invariant regressors are correlated with the fixed effects. The instrumental variables ri used in DGP B are generated as with and .

For DGP A, we consider two different processes for εit:

Case 1:

Case 2:

For DGP B, we also consider two different processes for εit:

Case 3:

Case 4:

Here we set the number of replications to 1000. To compare the performance and efficiency of different methods, we compare the Bias and RMSE of the following estimators: fixed effects quantile regression (FEQR) and penalized quantile regression (PQR) as in Koenker [1] and Lamarche [2]; the grouped IVQR estimator (G. IV for short) of Chetverikov et al. [16]; the proposed W-QR estimator and the proposed W-IVQR estimator. In the simulations, we report results for (N, T) ∈ {(50, 50), (50, 100), (100, 50), (100, 100)} and τ ∈ {0.25, 0.5, 0.75}. Tables 1–4 report the estimation results for DGP A. Tables 5–8 report the estimation results for DGP B. The minimum values in each case are marked in bold in the tables.
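The two summary measures can be computed from the replicated estimates as in the following sketch. It assumes the standard definitions (Bias as the average estimation error and RMSE as the square root of the average squared error over replications); the function name and the toy numbers are hypothetical.

```python
import numpy as np

def bias_and_rmse(estimates, true_value):
    """Bias and RMSE of a scalar estimator over Monte Carlo replications."""
    err = np.asarray(estimates, dtype=float) - true_value
    bias = err.mean()                      # average estimation error
    rmse = np.sqrt((err ** 2).mean())      # root mean squared error
    return bias, rmse

# Toy example with three replications and true coefficient 1.
bias, rmse = bias_and_rmse([0.9, 1.1, 1.0], true_value=1.0)
```

In a simulation with 1000 replications, `estimates` would hold the 1000 values of one coefficient estimate for a given (N, T) and τ.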

Table 1. Bias and RMSE of 3 estimators for γ1 and γ2 when in the DGP A.

https://doi.org/10.1371/journal.pone.0289474.t001

Table 2. Bias and RMSE of 3 estimators for β1 and β2 when in the DGP A.

https://doi.org/10.1371/journal.pone.0289474.t002

Table 3. Bias and RMSE of 3 estimators for γ1 and γ2 when in the DGP A.

https://doi.org/10.1371/journal.pone.0289474.t003

Table 4. Bias and RMSE of 3 estimators for β1 and β2 when in the DGP A.

https://doi.org/10.1371/journal.pone.0289474.t004

Table 5. Bias and RMSE of 4 estimators for γ1 and γ2 when in the DGP B.

https://doi.org/10.1371/journal.pone.0289474.t005

Table 6. Bias and RMSE of 3 estimators for β1 and β2 when in the DGP B.

https://doi.org/10.1371/journal.pone.0289474.t006

Table 7. Bias and RMSE of 4 estimators for γ1 and γ2 when in the DGP B.

https://doi.org/10.1371/journal.pone.0289474.t007

Table 8. Bias and RMSE of 3 estimators for β1 and β2 when in the DGP B.

https://doi.org/10.1371/journal.pone.0289474.t008

Table 1 provides the Bias and RMSE of the three estimators for γ1 and γ2 when in the Case 1 of DGP A. It is clear that the W-QR estimator is considerably better than the other two estimators. The W-QR estimator shows obvious advantages in Bias: it is approximately unbiased, while the other two estimators are seriously biased. The RMSE of the W-QR estimator is consistently smaller than those of the other two estimators at each quantile with the same sample size. It is noted that the RMSE of the FEQR estimator is very large. The simulation results indicate that, for the coefficients of time-invariant variables, the W-QR method can effectively increase the precision of estimation.

Table 2 shows the Bias and RMSE of the three estimators for β1 and β2 when in the Case 1 of DGP A. We can see that the FEQR and W-QR estimators are approximately unbiased. The performance of the W-QR estimator is slightly worse than that of FEQR in terms of RMSE, but the RMSE of the W-QR estimator decreases as N and T increase. On the other hand, because the Bias and RMSE of the W-QR estimator for γ1 and γ2 are the best, and the W-QR estimator for β1 and β2 is approximately unbiased, the W-QR estimator has the best overall performance. In addition, the FEQR estimator performs worst for γ1 and γ2, and the PQR estimator performs worst for β1 and β2.

The Bias and RMSE of the three estimators when in the Case 2 of DGP A are presented separately in Tables 3 and 4. Overall, they are similar to those when . Considering γ1 and γ2, the Bias and RMSE of W-QR estimator are smaller than those of the other two estimators in each quantile with the same sample size. The W-QR estimator performs best. Considering β1 and β2, the W-QR estimators are approximately unbiased as N and T increase. Generally the W-QR estimator has the best overall performance.

Besides, we also find that the Bias and RMSE of the W-QR estimator decrease as the sample size increases. The RMSE of the W-QR estimator for γ1 and γ2 decreases clearly as N increases but not as T increases, since γ1 and γ2 are the coefficients of time-invariant variables. Meanwhile, the Bias and RMSE of the W-QR estimator for β1 and β2 decrease clearly as T increases but not as N increases, because of the incidental parameter problem.

Table 5 gives the Bias and RMSE of the four estimators for γ1 and γ2 when in the Case 3 of DGP B. It can be seen that the W-IVQR estimator and the G. IV estimator are considerably better than the other two estimators in terms of Bias and RMSE. The Bias of the W-IVQR and G. IV estimators appears only around the third decimal place, while the Bias of the other two estimators lies in the units and first decimal places. The RMSE of the W-IVQR estimator and the G. IV estimator is also consistently smaller than that of the other two estimators at each quantile with the same sample size. The results of the G. IV estimator and the W-IVQR estimator are similar. The reason is that W-IVQR is computed by the centralized 2SLS, and the G. IV estimator is computed by 2SLS. Compared with the G. IV estimator, the Bias of the W-IVQR estimator is generally smaller, while its RMSE is slightly larger.

Table 6 gives the Bias and RMSE of the three estimation methods for β1 and β2 when in the Case 3 of DGP B. Notice that the G. IV estimator only gives estimates of γ1 and γ2, and does not give estimates of β1 and β2. The estimation results shown in Table 6 are similar to those in Table 2. The W-IVQR and FEQR estimators are approximately unbiased. The performance of the W-IVQR estimator is slightly worse than that of FEQR in terms of RMSE, but the RMSE of the W-IVQR estimator decreases as the sample size increases. In general, the W-IVQR estimator has the best overall performance, as the G. IV method cannot estimate β1 and β2, and the W-IVQR estimator for β1 and β2 is approximately unbiased.

Tables 7 and 8 separately give the estimation results when in the Case 4 of DGP B, which are similar to those of Tables 5 and 6. The W-IVQR estimator has the best overall performance. Regarding γ1 and γ2, in terms of Bias, the FEQR estimator and the PQR estimator perform poorly, and the W-IVQR and G. IV estimators are approximately unbiased as N and T increase. The RMSE of the W-IVQR estimator and that of the G. IV estimator are close to each other, and both are much better than those of the FEQR and PQR estimators. Considering β1 and β2, the FEQR estimator and the W-IVQR estimator are approximately unbiased. In addition, it can be seen that the Bias and RMSE of the W-IVQR estimator decrease as T and N increase. As both N and T become larger, the Bias and RMSE of the W-IVQR estimator become considerably smaller.

Furthermore, comparing the results of the W-QR estimator in Tables 1–4 and the W-IVQR estimator in Tables 5–8, it can be found that the estimates of β1 and β2 are not sensitive to whether the time-invariant covariates are endogenous or not. In other words, the estimation accuracy for exogenous time-varying variables is not sensitive to instrumental variables.

From the above analysis, we find that the W-QR estimator has clear advantages over the FEQR and PQR estimators in estimating the coefficients of exogenous time-invariant covariates, and the W-IVQR and G. IV estimators perform much better than the other two estimators in estimating the coefficients of endogenous time-invariant covariates. Moreover, the RMSE of the G. IV estimator is less than that of W-IVQR in most cases. Meanwhile, the FEQR and the weighted estimators for the coefficients of time-varying covariates are asymptotically unbiased, and the RMSE of the FEQR estimator is uniformly smaller than that of the proposed estimators. A natural idea is that, for model (1), we could use the FEQR method to estimate β and the G. IV method to estimate γ, so it seems that we could get a more robust estimator. However, the disadvantage of the FEQR estimator is that solving a large optimization problem is time-consuming, and the asymptotic properties of the G. IV estimator would need to be modified. In sum, the W-QR and W-IVQR methods can better estimate the static panel data model with time-invariant regressors.

Application

Model construction

The gravity model has been the cornerstone of empirical trade analysis since its inception. It is used to estimate the marginal effects of various trade determinants and to test hypothesized relationships. Traditional estimation methods derive the marginal effects of covariates at the mean; that is, they estimate only the mean effects of explanatory variables on trade flows. This study applies the new quantile approach to analyze the factors influencing China’s exports using the trade gravity model. Pöyhönen [22] and Tinbergen [23] first used the idea of the law of universal gravitation to explain international trade flows. The idea received good empirical support in applied work, and a large number of trade studies have since used the trade gravity model; see, e.g., Eaton and Kortum [24], Freeman and Lewis [25], and Greaney and Kiyota [26]. The basic form of the trade gravity model can be expressed as where Yij is the total value of trade between country i and country j, Xi and Xj are the GDPs of the two countries, Dij is the geographical distance between them, and K is a constant. The value of trade between the two countries is thus positively correlated with the gross national product of each country and negatively correlated with the geographical distance between them. The general trade gravity model can be written in the following form: (13) where α, β and γ are the elasticity coefficients of the gross national products Xi and Xj and of the geographical distance Dij, respectively. That is, the coefficients α, β and γ capture the extent to which GDP and geographical distance affect the total trade between the two countries. 
Taking logarithms of (13), we obtain (14) To analyze China’s exports as comprehensively as possible and to capture the heterogeneity of the various influencing factors at different levels of trade volume, in this section we fit the trade gravity model (14) to cross-country panel data on the 98 countries or regions with which China traded over the 29 years from 1990 to 2018 (the export data are provided in S1 Dataset). An F test and a Hausman test are carried out to determine the type of panel model for (14). As shown in Table 9, the p-value of the F test is 0.000, which rejects the null hypothesis; that is, the fixed effects model is preferred to the pooled regression model. The Hausman test is then used to choose between the fixed effects model and the random effects model. The p-value of the Hausman test in Table 10 is less than 0.05, which means that a fixed effects model should be used.
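The step from the multiplicative gravity model to its log-linear form can be checked numerically. The sketch below is only an illustration: it assumes that distance enters multiplicatively with exponent γ (so a negative γ means that distance reduces trade), and the function names and all parameter values are hypothetical rather than taken from the paper.

```python
import math

def gravity_trade(k, x_i, x_j, d_ij, alpha, beta, gamma):
    """Multiplicative form (assumed): Y = K * X_i^alpha * X_j^beta * D_ij^gamma."""
    return k * x_i**alpha * x_j**beta * d_ij**gamma

def log_gravity_trade(k, x_i, x_j, d_ij, alpha, beta, gamma):
    """Log-linear form: ln Y = ln K + alpha ln X_i + beta ln X_j + gamma ln D_ij."""
    return (math.log(k) + alpha * math.log(x_i)
            + beta * math.log(x_j) + gamma * math.log(d_ij))

# Hypothetical parameter values, chosen only for illustration.
params = dict(k=2.0, alpha=0.8, beta=0.6, gamma=-1.1)
y = gravity_trade(x_i=500.0, x_j=1200.0, d_ij=8000.0, **params)
log_y = log_gravity_trade(x_i=500.0, x_j=1200.0, d_ij=8000.0, **params)

# Doubling the distance shifts ln Y by gamma * ln 2, whatever the GDP levels,
# which is why gamma is read as the distance elasticity of trade.
y_far = gravity_trade(x_i=500.0, x_j=1200.0, d_ij=16000.0, **params)
distance_elasticity = (math.log(y_far) - math.log(y)) / math.log(2.0)

print(math.isclose(math.log(y), log_y))   # the two forms agree
print(round(distance_elasticity, 6))      # recovers the assumed gamma
```

Because the log form is linear in the elasticities, it can be estimated with linear panel methods, including the quantile methods used in this section.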

Thus, (14) can be written as (15) where ln Exportit is the logarithm of China’s total exports to country i in year t, ln gdpit and ln Chngdpt denote the logarithms of the GDP of country i and of China in year t, respectively, ln Di is the logarithm of the distance between the Chinese capital and the capital of country i, and αi is the unobserved individual fixed effect.

Data sources

Data on exports and GDP are obtained from the International Monetary Fund (IMF) and the World Bank’s World Development Indicators (WDI), and the distances between capitals are obtained from the Centre d’études prospectives et d’informations internationales (CEPII) database. The APEC membership and developed-country (DC) indicators used in the improved model are taken from Baidu Encyclopedia, and the area data come from the Ministry of Foreign Affairs of the People’s Republic of China.

Model estimation

The FEF method of [13] and the W-QR method are used here to estimate (15), because ln Di is a time-invariant variable. The standard errors of the W-QR estimators are computed by block bootstrap. The estimation results are presented in Table 11 and Fig 1. As seen in Table 11, the W-QR coefficient estimates for ln gdp and ln Chngdp are positive and those for ln D are negative at the given quantiles, which is consistent with the FEF estimates. Compared with the FEF method, the W-QR method yields richer results, which capture more fully the influence of the covariates on the response.
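The paper does not spell out its block bootstrap, so the following is only a sketch of one common variant for panel data, in which each individual's full time series is treated as one block and individuals are resampled with replacement. The function names and the toy estimator (a pooled median) are hypothetical stand-ins; in the application the estimator would be the W-QR coefficient of interest.

```python
import random
import statistics

def block_bootstrap_se(panel, estimator, n_boot=200, seed=0):
    """Standard error of estimator(panel) from resampling whole individuals.

    panel: list of blocks, one block per individual (its full time series).
    estimator: function mapping a list of blocks to a single number.
    """
    rng = random.Random(seed)
    n = len(panel)
    draws = []
    for _ in range(n_boot):
        # Draw N individuals with replacement; each keeps all of its T
        # observations, so within-individual dependence is preserved.
        resampled = [panel[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(resampled))
    return statistics.stdev(draws)

def pooled_median(blocks):
    """Toy estimator (hypothetical): median of all pooled observations."""
    return statistics.median(v for block in blocks for v in block)

# Simulated panel: 30 individuals, 10 periods, with an individual effect
# shared by all observations in a block.
data_rng = random.Random(1)
panel = []
for _ in range(30):
    effect = data_rng.gauss(0.0, 1.0)
    panel.append([effect + data_rng.gauss(0.0, 0.5) for _ in range(10)])

se = block_bootstrap_se(panel, pooled_median)
print(se)
```

Resampling single observations instead of whole blocks would ignore the dependence created by the individual effects and would typically understate the standard error.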

Table 11. The estimation results of the W-QR estimator and the FEF estimator.

https://doi.org/10.1371/journal.pone.0289474.t011

Moreover, Fig 1 shows that the coefficient estimates for ln gdp and ln Chngdp are positive at all quantiles, indicating that both the GDP of the partner country and China’s GDP boost China’s total exports: the larger the GDP, the greater the demand of producers, which in turn promotes China’s export trade. The coefficient estimates for ln D are negative at all quantiles, indicating that distance acts as a disincentive to China’s export trade; that is, the trade risks and costs that come with greater geographical distance between the trading partners inhibit China’s exports.

Improved model with more time-invariant regressors and estimation

Control variables are introduced to extend model (15): (16) where APEC indicates whether the country/area is an APEC member, DC indicates whether the country/area is a developed country, and Area denotes the area of the country/area. The three newly added control variables are all time-invariant covariates. The F test and Hausman test results for (16) indicate that a fixed effects panel model should be used. The estimation results, obtained with the W-QR estimation method proposed in this paper, are plotted in Fig 2.

Fig 2. The W-QR estimators for the time-invariant covariates at different quantiles in model (16).

https://doi.org/10.1371/journal.pone.0289474.g002

From Fig 2, we can see that the coefficient estimates of the distance variable are negative at the low and middle quantiles and become positive at the high quantiles. The trend of the coefficient estimates of the distance variable is similar to that in Fig 1, which indicates that the inhibitory effect of geographical distance weakens for countries/regions with larger export trade. The estimated coefficient of APEC membership is positive at all quantiles; that is, APEC membership boosts trade, which suggests that the more developed the economy, the stronger its domestic consumption demand. The coefficient estimates of the developed-country indicator change from positive to negative as the quantile increases. The coefficient estimates of the area variable are negative at all quantiles, indicating that area has a depressing effect on China’s export trade, possibly because the larger a country is, the more abundant its own production of materials and energy.

Conclusion

To address the bias in estimating the coefficients of time-invariant variables, this paper proposes two new panel quantile estimation methods, the W-QR and W-IVQR estimation methods. The W-QR method is applicable when the time-invariant variables are independent of the fixed effects, and the W-IVQR method is applicable when the time-invariant variables are endogenous. The two methods combine the advantages of the MD-QR estimation method and the instrumental variable method: they yield effective estimates of the coefficients of time-varying covariates and are computationally convenient and simple to implement. In the first step, the dependent variable is regressed on the time-varying variables and an intercept by conventional quantile regression to obtain the slope and intercept estimators for each individual. In the second step, different weighting schemes are applied to these slope and intercept estimators to obtain the estimators of β and γ, respectively. Regarding large-sample properties, we establish the consistency of the W-QR and W-IVQR estimators under sequential and simultaneous limits and study the asymptotic distributions of the two estimators under the simultaneous limit. Monte Carlo simulation shows that the W-QR and W-IVQR estimators perform well in estimating the coefficients of time-invariant variables, and that the W-QR and W-IVQR estimators of β and γ are asymptotically unbiased. Finally, we illustrate the proposed W-QR estimation by analyzing the factors influencing China’s exports using the trade gravity model. We find that for countries/regions with a large export trade volume, the inhibitory effect of geographical distance weakens: the coefficient estimates of the distance variable are negative at the low and middle quantiles and become positive at the high quantiles.
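The two-step structure summarized above can be sketched in a few lines. This is not the paper's estimator: the second step below uses equal weights for the slopes and a simple least-squares regression of the fitted intercepts on a single exogenous time-invariant regressor, both of which are assumptions made for illustration, and all names and simulated values are hypothetical. The first step uses the fact that a quantile regression line with one regressor always has an optimal solution passing through two observations, so short series can be fitted exactly by trying every pair.

```python
import random

def check_loss(u, tau):
    """Koenker-Bassett check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def simple_quantile_fit(x, y, tau):
    """Exact quantile regression of y on (1, x) for one individual.

    Tries every pair of points with distinct x and keeps the line with the
    smallest total check loss. Assumes at least two distinct x values.
    """
    best = None
    for s in range(len(x)):
        for t in range(s + 1, len(x)):
            if x[s] == x[t]:
                continue
            slope = (y[t] - y[s]) / (x[t] - x[s])
            intercept = y[s] - slope * x[s]
            loss = sum(check_loss(yi - intercept - slope * xi, tau)
                       for xi, yi in zip(x, y))
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]

def two_step(panel_x, panel_y, z, tau):
    """Step 1: individual fits. Step 2: combine them (equal weights assumed)."""
    fits = [simple_quantile_fit(x, y, tau) for x, y in zip(panel_x, panel_y)]
    intercepts = [a for a, _ in fits]
    slopes = [b for _, b in fits]
    n = len(fits)
    beta_hat = sum(slopes) / n  # equally weighted average of the slopes
    z_bar = sum(z) / n
    a_bar = sum(intercepts) / n
    # Least-squares slope of the fitted intercepts on the time-invariant z_i.
    gamma_hat = (sum((zi - z_bar) * (ai - a_bar) for zi, ai in zip(z, intercepts))
                 / sum((zi - z_bar) ** 2 for zi in z))
    return beta_hat, gamma_hat

# Simulated panel with beta = 2 and gamma = -1; individual effects are set
# to zero here purely to keep the illustration short.
rng = random.Random(0)
N, T, beta, gamma, tau = 40, 15, 2.0, -1.0, 0.5
z = [rng.gauss(0, 1) for _ in range(N)]
panel_x = [[rng.gauss(0, 1) for _ in range(T)] for _ in range(N)]
panel_y = [[beta * x + gamma * zi + rng.gauss(0, 0.3) for x in xs]
           for xs, zi in zip(panel_x, z)]
beta_hat, gamma_hat = two_step(panel_x, panel_y, z, tau)
print(beta_hat, gamma_hat)
```

The pairwise search costs on the order of T³ operations per individual, which is fine for short panels; for longer series a linear-programming quantile regression routine would be the usual choice.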

Appendix: Proofs

For convenience, we collect important definitions below. Let

A.1. Consistency of and under sequential asymptotics

Lemma 1: As N → ∞, (17)

Proof: We observe that by Assumption 6(iii). Therefore, it suffices to prove that (18) In turn, (18) follows from Assumptions 6(ii) and 6(iii) and Chebyshev’s inequality. This completes the proof of the lemma.

Similarly, the matrices Qzr,N and Qrr,N defined by (19) (20) have finite probability limits as N → ∞ given by Qzr and Qrr, that is , where Qzr and Qrr appear in Assumption 6’.

Lemma 2: As N → ∞,

Proof: Since by Assumption 7(ii), it suffices to prove that (21) for k, l = 1, ⋯, p and given τ ∈ U.

As is not necessarily finite, let δ = cM/4. Then by Hölder’s inequality,

In turn, by Assumption 7(i) and 2(iv). Hence, and so denoting , we obtain (22) With this notation, (21) is equivalent to .

As in Theorem 2.1.7 of Tao [27], for G > 0 to be chosen later, denote Zi,≤G = Zi ⋅ 1{|Zi| ≤ G} and Zi,>G = Zi ⋅ 1{|Zi| > G}. Then by Fubini’s theorem and Markov’s inequality, where in the last inequality we used (22). Hence, by Markov’s inequality, for ε > 0, and since |E[Zi,≤G]| = |E[Zi,>G]| ≤ CGδ, Thus, setting G = N^{1/3}, we obtain , which is equivalent to (21).

Proof of Theorem 1.1. We first prove the consistency of . The last equation holds because . Then, we show that the first term is op(1). By the consistency of quantile regression estimators, as T → ∞. By Assumption 6(ii), for fixed N as T → ∞, we have Letting N → ∞, we have by Lemma 1. Thus, .

Next, we show . Using Assumption 7(ii), by Lindeberg’s Central Limit Theorem and the Cramér-Wold device, we have Letting N → ∞, , and Thus, as (T, N)seq → ∞, we obtain It follows that as (T, N)seq → ∞.

On the one hand, for fixed N, by Assumption 5, we have . On the other hand, by the consistency of quantile regression estimators, we obtain as T → ∞. As (T, N)seq → ∞,

This completes the proof.

Remark Strictly speaking, Assumption 5 is not really necessary; can converge to anything because the rightmost equality would hold as long as is consistent as T → ∞.

A.2. Consistency of and under joint asymptotics

Lemma 3 Under Assumptions 1–4 and Assumption 5’, we have .

Proof Lemma 3 follows from Lemma 5 of Galvao and Wang [4]. We verify the conditions. Conditions A1-A5 of [4] are implied by Assumptions 1–4; Condition A6’ of [4] is implied by Assumption 5’. Therefore, the lemma follows.

Proof of Theorem 1.2. By the proof of Theorem 1.1, we have By Assumption 6(ii), is bounded by CM. Using Lemma 3, which implies that , we have Because , The last equation holds by the assumption on the relative rates of N and T in the theorem. Therefore, .

Besides, as N → ∞, , and Therefore, we obtain

Moreover, by Lemma 3, we have The last equation holds by the assumption of the relative rate of N and T in the theorem. This completes the proof.

A.3. Asymptotic normality of and under joint asymptotics

Proof of Theorem 2. We only prove the asymptotic normality of ; the asymptotic normality of has been proved in Theorem 3.2 of [4]. By the above, we have

First, we show . Because and is bounded by CM, by Lemma 3, The last equation holds by the assumption of the relative rate of N and T in the theorem.

Next, using Assumption 7(ii), by Lindeberg’s Central Limit Theorem and the Cramér-Wold device, we have

Besides, as N → ∞. Thus, by Slutsky’s theorem, we obtain (23) where , Qzz is defined in Assumption 6(iii) and J(τ) is defined in Assumption 7(ii).

For the asymptotic normality of , one can refer to Theorem 3.2 of Galvao and Wang [4]. We verify the conditions. Conditions A1-A5 of [4] are implied by Assumptions 1–4; Condition A6’ of [4] is implied by Assumption 5’. Therefore, Theorem 2 follows.

A.4. Consistency of and under sequential asymptotics

Proof of Theorem 3.1. The proof consists of two parts. First, we show that . Second, we show that .

Step 1: The last equation holds because . By the consistency of quantile regression estimators, as T → ∞ and , for fixed N as T → ∞ we have .

We note that by Lemma 1 as N → ∞, (24)

Thus,

Step 2: The last equation holds because . Using Assumption 6’(ii), by Lindeberg’s Central Limit Theorem and the Cramér-Wold device, we have where is defined in Assumption 6’(ii). Thus, we have as N → ∞. Combining Step 1 and Step 2, we get .

For fixed N, by Assumption 5, we have . By the consistency of quantile regression estimators, we obtain as T → ∞. As (T, N)seq → ∞, This completes the proof.

A.5. Consistency of and under joint asymptotics

Proof of Theorem 3.2 By the proof of Theorem 3.1, we have . Then, by Lemma 1, Lemma 3 and the fact that is bounded, we have using the condition in the theorem that T grows faster than N^2 log N. Thus, .

What’s more, by Lemma 3, we have The last equation holds by the assumption of the relative rate of N and T in the theorem. This gives the asserted claim.

A.6. Asymptotic normality of and under joint asymptotics

Proof of Theorem 4. The proof of the asymptotic normality of is the same as that of ; see Theorem 3.2 of Galvao and Wang [4].

Finally, by Slutsky’s theorem, for any τ ∈ U, we obtain where , is defined in Assumption 7’(ii) and .

Supporting information

Acknowledgments

We thank the editor, the associate editor and two referees for their helpful comments which led to a considerable improvement of the original article.

References

  1. Koenker R. Quantile regression for longitudinal data. Journal of Multivariate Analysis. 2004;91(1):74–89.
  2. Lamarche C. Robust penalized quantile regression estimation for panel data. Journal of Econometrics. 2010;157(2):396–408.
  3. Canay IA. A simple approach to quantile regression for panel data. The Econometrics Journal. 2011;14(3):368–386.
  4. Galvao AF, Wang L. Efficient minimum distance estimator for quantile regression fixed effects panel data. Journal of Multivariate Analysis. 2015;133:1–26.
  5. Galvao AF, Gu JY, Volgushev S. On the unbiased asymptotic normality of quantile regression with fixed effects. Journal of Econometrics. 2020;218(1):178–215.
  6. Harding M, Lamarche C. A quantile regression approach for estimating panel data models using instrumental variables. Economics Letters. 2009;104(3):133–135.
  7. Harding M, Lamarche C. A Hausman Taylor instrumental variable approach to the penalized estimation of quantile panel models. Economics Letters. 2014;124(2):176–179.
  8. Galvao AF, Lamarche C, Lima LZ. Estimation of censored quantile regression for panel data with fixed effects. Journal of the American Statistical Association. 2013;108:1075–1089.
  9. Tao L, Zhang YJ, Tian MZ. Quantile regression for dynamic panel data using Hausman Taylor instrumental variables. Computational Economics. 2019;53(3):1033–1069.
  10. Dai XW, Yan Z, Mao ZT. Quantile regression for general spatial panel data models with fixed effects. Journal of Applied Statistics. 2020;47(1):45–60. pmid:35707608
  11. Dai XW, Jin LB. Minimum distance quantile regression for spatial autoregressive panel data models with fixed effects. PLoS ONE. 2021;16(12):e0261144. pmid:34905573
  12. Plümper T, Troeger VE. Efficient estimation of time-invariant and rarely changing variables in finite sample panel analyses with unit fixed effects. Political Analysis. 2007;15(2):124–139.
  13. Pesaran MH, Zhou QK. Estimation of time-invariant effects in static panel data models. Econometric Reviews. 2018;37(10):1137–1171.
  14. Kripfganz S, Schwarz C. Estimation of linear dynamic panel data models with time-invariant regressors. Journal of Applied Econometrics. 2019;34(4):526–546.
  15. Zhang YH, Zhou QK. Estimation for time-invariant effects in dynamic panel data models with application to income dynamics. Econometrics and Statistics. 2019;9:62–77.
  16. Chetverikov D, Larsen B, Palmer C. IV quantile regression for group-level treatments with an application to the distributional effects of trade. Econometrica. 2016;84(2):809–833.
  17. Koenker R, Bassett G. Regression quantiles. Econometrica. 1978;46(1):33–50.
  18. Hausman JA, Taylor WE. Panel data and unobservable individual effects. Econometrica. 1981;49(6):1377–1398.
  19. Kato K, Galvao AF, Montes-Rojas GV. Asymptotics for panel quantile regression models with individual effects. Journal of Econometrics. 2012;170(1):76–91.
  20. Hahn J, Newey W. Jackknife and analytical bias reduction for nonlinear panel models. Econometrica. 2004;72(4):1295–1319.
  21. Wooldridge JM. Econometric Analysis of Cross Section and Panel Data (2nd edition). The MIT Press. 2010.
  22. Pöyhönen P. A tentative model for the volume of trade between countries. Weltwirtschaftliches Archiv. 1963;90:93–100.
  23. Tinbergen J. Shaping the world economy: suggestions for an international economic policy. New York (N.Y.): Twentieth Century Fund. 1962.
  24. Eaton J, Kortum S. Technology, geography, and trade. Econometrica. 2002;70(5):1741–1779.
  25. Freeman R, Lewis J. Gravity model estimates of the spatial determinants of trade, migration, and trade-and-migration policies. Economics Letters. 2021;204:109873.
  26. Greaney TM, Kiyota K. The gravity model and trade in intermediate inputs. The World Economy. 2020;43(8):2034–2049.
  27. Tao T. Topics in Random Matrix Theory. Providence, RI: American Mathematical Society. 2012.