
Liu-type pretest and shrinkage estimation for the conditional autoregressive model

Abstract

Spatial regression models have recently received considerable attention in a variety of fields as a way to address the spatial autocorrelation effect. One important class of spatial models is the Conditional Autoregressive (CA) model. These models have been widely used to analyze spatial data in areas such as geography, epidemiology, disease surveillance, urban planning, poverty mapping, and others. In this article, we propose Liu-type pretest, shrinkage, and positive shrinkage estimators for the large-scale effect parameter vector of the CA regression model. The proposed estimators are evaluated analytically via their asymptotic bias, quadratic bias, and asymptotic quadratic risk, and numerically via their relative mean squared errors. Our results demonstrate that the proposed estimators are more efficient than the Liu-type estimator. To conclude the paper, we apply the proposed estimators to the Boston housing prices data and use a bootstrapping technique to evaluate the estimators based on their mean squared prediction error.

1 Introduction

Data collected across geographical areas may show dependencies in which closer observations are more similar than those farther apart. This behaviour can be modeled by incorporating a covariance structure into traditional statistical models. One such model is the spatial regression model, which accommodates different types of spatial dependencies. Applications of spatial regression models have been growing in fields such as ecology, epidemiology, disease mapping, public health, psychology, and others.

In the context of time series, autoregressive models represent the error term at time (t) as a linear function of recent past errors. Similarly, autoregressive models in the spatial framework model the data at a specific location, known as a site, as a function of data from nearby locations; a site is a physical location where the data are collected, and the notion of neighborhood between two sites is defined based on a specific distance or closeness metric. One important class of spatial regression models is the Conditional Autoregressive (CA) model, so named because its mean and variance can be written in conditional expectation form. The CA model has recently been applied in a vast range of areas. For example, Shen X. et al. [1] proposed a CA model to analyze heterogeneous genetic effects among individuals, which are treated as random effects in their model. Pérez-Molina [2] modeled hierarchical relationships using multilevel models with random intercepts and a CA component to account for spatial effects, and demonstrated that such models significantly improve housing price modeling. Tharmin S.A. et al. [3] used a Bayesian CA model to map the relative risk of the spread of dengue fever in Makassar, Indonesia, and demonstrated that Makassar is still vulnerable to dengue fever. Qiang Z. et al. [4] proposed a Bayesian bivariate CA model to establish links between crash frequencies and traffic attributes. Dibakar S. et al. [5] investigated the relationship between bicycle crash frequency and its contributing factors at the census block group level in the state of Florida, USA, using the class of CA models within the hierarchical Bayesian framework. Ver Hoef J.M. et al. [6] discussed six different types of practical ecological inferences that can be made using the CA and simultaneous autoregressive (SA) models; they compared the two model classes and demonstrated their evolution as well as their connection to partial correlations. Wang C. et al. [7] used a spatial Poisson-lognormal model with CA priors to investigate the impact of traffic congestion on road accidents. Kleinschmidt I. et al. [8] explored the spatial and temporal variation in small-area malaria incidence rates using CA models. Gelfand, A. E. and Vounatsou, P. [9] used multivariate CA models for the analysis of spatial data and applied these models to study child growth and the spatial variation in HLA-B allele frequencies.

In classical statistical inference, we use the sample data to make inferences about the unknown parameter(s). In the Bayesian framework, we combine non-sample information, known as Uncertain Prior Information (UPI), with the sample information to make the inference. The UPI can come from different sources, for example, historical information about the parameter(s), or the output of a selection method used in regression analysis. In many cases, researchers have prior knowledge about some of the variables that will be used in their regression model, or may formulate a linear hypothesis of the form H0 : Hβ = h, where β is a (p × 1) vector of regression coefficients, H is a (p2 × p) known matrix of rank p2 (p2 ≤ p), and h is a (p2 × 1) fixed vector of constants. Such a restriction is commonly used in regression, experimental design, machine learning, and other fields to produce a restricted model that can perform at least as well as the full model with all available predictors. It can also be viewed as a variable selection device: the reduced model is tested to investigate the importance of some variables in explaining the variation in the response, and to decide how useful the model really is for prediction. In our case, we use this hypothesis to produce a submodel with fewer predictors. Theoretically, we assume such a restriction to study the performance of the reduced model relative to the full one; numerically, we force some of the coefficients to be zero (not significant) to confirm our analytical results. In real-life problems we can learn which variables are important, eliminate redundant variables, and diagnose multicollinearity using techniques such as the AIC, BIC, best subset selection, and penalization algorithms. The correlation matrix of all variables, including the response, is also a helpful tool for justifying the restriction.

One of the oldest methods for combining the sample information and the UPI is pretest estimation. The pretest estimator combines the sample data model, known as the full model, and the UPI model, known as the submodel, using binary weights: it chooses the full model estimator if the test statistic rejects the null hypothesis (H0) at a given significance level α, and the submodel estimator otherwise. Later, the shrinkage estimator was introduced, which replaces the binary weight with a smooth function of the test statistic. An improved version, known as the positive shrinkage estimator, was proposed subsequently. These three estimators have been discussed extensively in the literature under different settings. Al-Momani M. et al. [10] proposed the pretest, shrinkage, and positive shrinkage estimators for the vector of regression coefficients of the marginal model with multinomial response, and showed the superiority of the positive shrinkage estimator over the classical generalized estimating equations (GEE) estimator. Al-Momani, M. and Dawod, B.A. [11] used the pretest and shrinkage estimation ideas for the Autoregressive Conditionally Heteroscedastic (ARCH) model; they found that the positive shrinkage estimator outperformed the restricted, pretest, and shrinkage estimators regardless of whether the linear restriction (that some coefficients of the ARCH model's parameter vector are not significant) holds. Li, Y. and Jin, B. [12] investigated the sparsity and homogeneity of regression coefficients using prior constraint information, and showed that combining prior knowledge can increase the effectiveness of both sparsity and homogeneity identification. Arumairajan, S. [13] proposed an almost unbiased stochastic restricted Liu estimator by combining the modified almost unbiased Liu estimator and the mixed estimator when multicollinearity is present and stochastic restrictions are available; he showed that it outperformed the ordinary least squares, mixed, ridge, and other estimators considered in his study in the mean squared error sense. Ridge regression theory and important shrinkage and model selection techniques, with applications to machine learning, have been studied extensively for different models and settings by Saleh, A. K. et al. [14, 15]. For more details about shrinkage estimators, the reader is referred to S.E. Ahmed [16], Nkurunziza, S. et al. [17], Peng, L. et al. [18], and Saleh, A. K. [19], among others.

One common problem researchers face when fitting a multiple regression model by ordinary least squares (OLS) is multicollinearity, which occurs when some of the explanatory variables are correlated. This problem may produce insignificant regression coefficients or coefficients with unexpected signs. Many estimation methods have been proposed to improve on the OLS estimator. For instance, Hoerl and Kennard [20] proposed the ridge estimator. Liu K. [21] introduced a biased estimator for linear regression, and a modified version of the Liu estimator was proposed by Li and Yang [22]. Yüzbaşı, B. et al. [23] proposed pretest and shrinkage-type ridge regression estimators for linear models. Recently, Yüzbaşı, B. et al. [24] proposed pretest, shrinkage, and pretest-shrinkage Liu-type estimation in linear models. Babar, Iqra et al. [25] proposed new estimators for the shrinkage parameter of the Liu estimator based on quantiles of the regression coefficients, and showed that the new estimators outperformed existing ones in terms of mean squared error and absolute error. Arashi M. et al. [26] proposed improved Liu-type unrestricted, restricted, pretest, shrinkage, and positive shrinkage estimators for the regression parameter vector, and showed the superiority of the proposed methods analytically and numerically. In robust regression, Arashi, M. et al. [27] defined Liu-type rank-based estimators, examined their asymptotic behavior, provided superiority conditions for the biasing parameters, and supported their findings with numerical calculations. Arashi, M. et al. [28] proposed a ridge estimator for high-dimensional multicollinear data, proved its consistency, derived some asymptotic properties, and applied it to simulation experiments and a real data set. Arashi, M. et al. [29] proposed a re-scaled LASSO for multicollinear situations; their numerical analysis demonstrated that the scaled LASSO frequently performs better than the LASSO and elastic net while being comparable to other sparse modeling techniques. Arashi, M. et al. [30] developed an improved ridge approach for genome regression modeling and used a rank ridge estimator for parameter estimation and prediction when multicollinearity is present along with outliers in the data set.

In this manuscript, we aim to propose efficient estimators for the large-scale effect parameter vector (β) of the CA model when it is suspected that some of the coefficients are not significant. We partition the (p × 1) parameter vector β as (β1, β2), where β1 is a (p1 × 1) vector of main-effect coefficients, β2 is a (p2 × 1) vector of unimportant or nuisance parameters, and p1 + p2 = p. We are primarily interested in estimating β1 when β2 is suspected to be zero or close to zero. The full model estimator may be highly variable and difficult to interpret, while the submodel estimator may be heavily biased and under-fitted. To balance these issues, we consider the Liu-type pretest, shrinkage, and positive shrinkage estimators.

The rest of the paper is organized as follows. Section 2 provides a brief overview of the CA model. The maximum likelihood estimators of the CA model parameters are given in Section 3. In Section 4, we propose the Liu-type estimators, and in Section 5 we discuss their asymptotic properties in terms of bias, quadratic bias, and quadratic risk. We compare the estimators using a Monte Carlo simulation and a real data example in Section 6. Conclusions are given in Section 7.

2 Conditional autoregressive model

Assume, in accordance with Cressie and Wikle [31], that there are (n) spatial sites (usually referred to as locations, geographical areas, etc.). The collection of these sites is known as a lattice, denoted S = {s1, s2, …, sn}. For the ith site si, a set of neighboring sites, denoted N(si), is defined as N(si) = {sj : j is a neighbor of i}, j = 1, 2, …, n, where the neighborhood structure is defined based on a certain metric. For example, two sites on a regular lattice are rook-based neighbors if they share a common boundary, and queen-based neighbors if they share a common boundary or corner. Let Yn(s) = (Y(s1), Y(s2), …, Y(sn))′ be the vector of observations collected at sites {s1, s2, …, sn}, let X(si) = Xi = (X1i, X2i, …, Xpi)′ be the set of covariates, and let β = (β1, β2, …, βp)′ be a p × 1 vector of parameters, known as the large-scale effect on Yn(s).

We will assume that Yn(s) is continuous and follows a Gaussian process with mean μ(s) = E(Yn(s)) = X′(s)β and covariance matrix Var(Y) = σ2(In − ρW*)−1D, where σ2 > 0, ρ is the spatial dependence parameter, wij = 1 if sites i, j (i ≠ j) are neighbors of each other, wij = 0 otherwise, wii = 0, W* is the standardized (row-normalized) proximity matrix, and D is a diagonal matrix. For simplicity, the covariate vectors for all sites are consolidated into a design matrix X(s), and all subscripts (n, s) are dropped unless needed explicitly; that is, the data on the lattice s are denoted by (Y, X). Following Besag et al [32], the Conditional Autoregressive (CA) model follows a multivariate Gaussian (Normal) distribution, YNn(Xβ, Vn), where Vn = σ2(In − ρW*)−1D. In regression form, the CA model is given by:

(1)

where ϵNn(0, Vn). The model is known as a conditional autoregressive regression model because the mean and variance of Y(si) can be written in conditional form, as follows: E(Y(si) | Y(sj), ji) = Xiβ + ρ Σj w*ij(Y(sj) − Xjβ), and Var(Y(si) | Y(sj), ji) = σ2dii, where dii is the ith diagonal element of D.
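As a concrete illustration of the covariance structure above, the matrix V = σ2(I − ρW*)−1D can be assembled for a small regular lattice. The following Python sketch is only illustrative (the paper's own computations use R's spdep, and the function names here are our own); with W* row-standardized and D = diag(1/wi+), the resulting V is symmetric, as a covariance matrix must be:

```python
import numpy as np

def rook_adjacency(N):
    """Binary rook-contiguity matrix W for an N x N regular lattice:
    two cells are neighbors if they share an edge."""
    n = N * N
    W = np.zeros((n, n))
    for i in range(N):
        for j in range(N):
            k = i * N + j
            for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                ii, jj = i + di, j + dj
                if 0 <= ii < N and 0 <= jj < N:
                    W[k, ii * N + jj] = 1
    return W

def car_covariance(W, rho, sigma2=1.0):
    """V = sigma^2 (I - rho W*)^{-1} D with W* row-standardized and
    D = diag(1 / w_{i+}); this choice of D makes V symmetric."""
    n = W.shape[0]
    w_plus = W.sum(axis=1)            # neighbor counts w_{i+}
    Wstar = W / w_plus[:, None]       # standardized proximity matrix
    D = np.diag(1.0 / w_plus)
    return sigma2 * np.linalg.solve(np.eye(n) - rho * Wstar, D)

# 3 x 3 lattice, moderate spatial dependence
V = car_covariance(rook_adjacency(3), rho=0.5)
```

Symmetry follows because D−1(I − ρW*) = diag(wi+) − ρW is symmetric whenever W is.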

3 The maximum likelihood estimation

The maximum likelihood estimators (MLEs) of the parameter vector β, σ2, and the spatial dependence parameter ρ are derived by a two-step profile-likelihood procedure; see Cressie [33]. We first fix the parameter ρ, solve the log-likelihood equations for β and σ2, and plug the resulting estimators back into the log-likelihood to find the MLE of ρ, denoted . The MLEs of β and σ2 for fixed ρ are given by: (2) (3) The MLE of ρ is then the value that maximizes the profile log-likelihood L*(ρ); see Ord [34], where (4) Finally, we obtain the MLEs of β and σ2 by substituting . We denote the MLEs of (β, σ2, ρ) by . Mardia and Marshall [35] proved the consistency and asymptotic normality of , which implies the asymptotic normality of the large-scale parameter vector .
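The two-step procedure above can be sketched as a grid search: for each candidate ρ we profile out β and σ2 by generalized least squares, then keep the ρ with the largest profile log-likelihood. This is a minimal Python illustration under our own naming and a coarse grid, not the spautolm implementation used later in the paper:

```python
import numpy as np

def profile_mle(Y, X, Wstar, D, rho_grid):
    """Profile-likelihood sketch for the CA model Y ~ N(X beta, sigma^2 Sigma(rho)),
    with Sigma(rho) = (I - rho W*)^{-1} D. For each rho on the grid, beta and
    sigma^2 are profiled out by GLS; the rho maximizing the profile
    log-likelihood is returned together with the plugged-in beta, sigma^2."""
    n = len(Y)
    best = None
    for rho in rho_grid:
        Sigma = np.linalg.solve(np.eye(n) - rho * Wstar, D)
        Si = np.linalg.inv(Sigma)
        beta = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ Y)   # GLS estimate
        r = Y - X @ beta
        sigma2 = (r @ Si @ r) / n                            # profiled sigma^2
        _, logdet = np.linalg.slogdet(Sigma)
        ll = -0.5 * (n * np.log(sigma2) + logdet)            # profile log-likelihood
        if best is None or ll > best[0]:
            best = (ll, rho, beta, sigma2)
    return best[1], best[2], best[3]
```

In practice the grid search would be replaced by a one-dimensional optimizer over ρ, which is what packaged fitters do internally.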

4 Efficient estimation strategies

Consider the following multiple linear regression model (5) where ϵNn(0, σ2In). The ordinary least squares estimator of β, given by = (XX)−1XY, enjoys some good properties. However, when multicollinearity exists, the entries of (XX)−1 become large, which causes a large variance of . To overcome the problem of multicollinearity, Hoerl and Kennard [20] proposed the ridge estimator (XX + kIp)−1XY: (6) where k > 0. Note that when k = 0 the ridge estimator reduces to the OLS estimator, and as k → ∞ it shrinks toward 0. Later, Liu [21] proposed a biased estimator to deal with multicollinearity, which benefits from both the ridge and the shrinkage estimators; it is denoted by , and given by (XX + Ip)−1(XY + d): (7) (8) where 0 < d < 1 is known as the biasing parameter. Obviously, when d = 1, the Liu estimator reduces to the OLS estimator. In the next subsection we introduce the Liu estimator for the CA model.
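For reference, the three classical estimators discussed above can be written in a few lines. The Python sketch below is purely illustrative of the standard formulas (function names are our own): OLS, the Hoerl-Kennard ridge estimator, and the Liu estimator, which plugs the OLS estimate back in with weight d:

```python
import numpy as np

def ols(X, Y):
    """Ordinary least squares: (X'X)^{-1} X'Y."""
    return np.linalg.solve(X.T @ X, X.T @ Y)

def ridge(X, Y, k):
    """Hoerl-Kennard ridge: (X'X + k I)^{-1} X'Y; k = 0 recovers OLS,
    and the estimate shrinks toward 0 as k grows."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ Y)

def liu(X, Y, d):
    """Liu estimator: (X'X + I)^{-1} (X'Y + d * beta_ols); d = 1 recovers OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + np.eye(p), X.T @ Y + d * ols(X, Y))
```

Setting d = 1 in `liu` returns the OLS estimate exactly, because X′Y + β̂ = (X′X + I)β̂.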

4.1 Liu estimators for the CA model

Generally speaking, subjective information about the importance of certain regression coefficients may be available. Such information partitions the p × 1 regression parameter vector as β = (β1, β2), where β1 and β2 are of dimensions p1 × 1 and p2 × 1, respectively, with p = p1 + p2. Similarly, the n × p design matrix is partitioned as X = (X1, X2), where X1 is n × p1 and X2 is n × p2. The model in (1) can then be rewritten as: (9) We are initially interested in estimating β1 by removing β2 when X2 is insignificant for explaining the variation in the response variable. Such information can be obtained either from a variable selection approach or from uncertain prior information. In other words, we may consider testing the restriction: (10) Assuming we obtained such information about X2, the candidate submodel is given by: (11) The MLE of β1 for the model in (11) can be obtained in a similar manner to (2), and is given by: (12) For the model in (9), the MLE of β1 is obtained by maximizing the log-likelihood given by By setting and , and solving the two equations, we get: (13) where , and has the same formula with the indices 1 and 2 interchanged. Note that can also be written in terms of as follows: (14)

We define the Liu estimator of β1 as follows: (15) where 0 < d < 1. We refer to the estimator in (15) as the full model estimator of β1. The Liu estimator for the submodel in (11) is defined as follows: (16) where 0 < ds < 1. Under the null hypothesis in (10), the submodel estimator performs better than the full model estimator when β2 is zero or close to 0; but as β2 moves away from the null space, the submodel estimator becomes inefficient, while the full model estimator remains consistent.

4.2 The pretest and shrinkage Liu-type estimators

The pretest Liu-type estimator of β1 depends on testing the null hypothesis in (10). It chooses the full model estimator if the hypothesis is rejected at the α level of significance, and the submodel estimator otherwise. It is denoted by and given by: (17) where I(.) is the indicator function, Ln is a suitable test statistic for testing H0 in (10) given by: , , ln, α is the α-level critical value of the distribution of the test statistic Ln, A1 is defined similarly to A2, and S2 is an estimator of σ2. Under the null hypothesis, the test statistic Ln follows a chi-square distribution with p2 degrees of freedom. The pretest estimator is thus a binary choice between the full model and submodel estimators. The Liu-type shrinkage estimator provides a smoother weighting than the pretest estimator. It is denoted by , and given by: (18) However, may suffer from over-shrinkage and produce unexpected signs for some coefficients when Ln < p2 − 2. This issue is handled by the Liu-type positive shrinkage estimator of β1, defined as: (19) where u+ = max(0, u).
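Given the full model estimate, the submodel estimate, and the test statistic Ln, the three combination rules above reduce to simple formulas. A hypothetical Python sketch (the chi-square critical value is passed in rather than computed, and the function names are our own):

```python
import numpy as np

def pretest(beta_full, beta_sub, Ln, crit):
    """Pretest rule: the full model estimate if Ln rejects H0, else the submodel."""
    return beta_full if Ln > crit else beta_sub

def shrinkage(beta_full, beta_sub, Ln, p2):
    """Stein-type shrinkage: a smooth compromise with weight 1 - (p2 - 2)/Ln
    on the full-model direction; can over-shrink when Ln < p2 - 2."""
    return beta_sub + (1.0 - (p2 - 2) / Ln) * (beta_full - beta_sub)

def positive_shrinkage(beta_full, beta_sub, Ln, p2):
    """Positive-part rule: truncate the weight at zero, which removes the
    over-shrinkage (sign-flip) problem of the plain shrinkage estimator."""
    w = max(0.0, 1.0 - (p2 - 2) / Ln)
    return beta_sub + w * (beta_full - beta_sub)
```

When Ln is very large the shrinkage estimators approach the full model estimate, and when Ln ≤ p2 − 2 the positive-part rule collapses to the submodel estimate.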

5 Asymptotic results

In this section, we study the asymptotic behaviour of the proposed estimators under a sequence of local alternatives {H(n)} given by: (20) where ξ is a p2 × 1 fixed and known vector. Clearly, if ξ = 0, the local alternatives in (20) reduce to (10).

Let be any of the proposed estimators of β1, and let M be a p1 × p1 positive definite weight matrix. Define the cumulative distribution function of by , and the quadratic loss function of as , where tr(A) is the trace of the matrix A. If , where denotes convergence in distribution, then the asymptotic quadratic risk (AR) of is defined as: (21) The asymptotic joint normality of the submodel and full model Liu estimators is the main tool for deriving the AR expressions; the two theorems below provide these results. We assume the conditions of Theorem 2 of Mardia and Marshall [35] together with the following:

  1. as n → ∞, where xi is the ith row of Xn
  2. , where C is a finite positive definite matrix.
  3. , where Gn(d) = (Cn + Ip)−1(Cn + dIp), and Gd = (C + Ip)−1(C + dIp)

Theorem 1 If 0 < d < 1 and |C| ≠ 0, then , where , and denotes convergence in distribution.

Proof: Note that , so , which is a linear combination of . Hence, by the Mardia and Marshall theorem [35], as n → ∞, converges in distribution to a multivariate Gaussian distribution with: Mean = −(1 − d)(C + Ip)−1β, and

Theorem 2 Let , , . Under the previous assumptions, the sequence of local alternatives in (20), and as n → ∞, we have:

  1. ,

where , ,

γ = − (λ11.2π), , , , , , and is the conditional distribution mean of β1 given .

The proof of Theorem (2) is similar to that of Theorem (1) with minor modifications; we also refer to Bahadır Y. et al [36] for a similar proof.

5.1 Asymptotic distributional and quadratic bias of the estimators

The asymptotic distributional bias expressions, denoted by , where is any of the proposed estimators, are given in the following theorem.

Theorem 3 The AB expressions are:

  1. ,
  2. ,
  3. ,
  4. ,
  5. ,

where , Hn(x; Δ) is the cumulative distribution function of the non-central chi-square distribution with (n) degrees of freedom and non-centrality parameter Δ, and . To prove Theorem 3 we use the following theorem, whose proof can be found in [37].

Theorem 4 Let y = (y1, y2, …, yq)′ be Nq(μ, Σ), and let ϕ be any measurable function, then , where is the chi-square random variable with (n) degrees of freedom and is the non-centrality parameter.

Proof of Theorem (3):

  1. by Theorem (2)-part (2).
  2. Note that: .
    Therefore,
  3. Note that can be written as
    Therefore, Using Theorem (2) and Theorem (4), we get:
  4. Note that can be rewritten as:
    Therefore,

The asymptotic bias expressions are vectors, which cannot be used directly to compare the set of estimators. However, the asymptotic quadratic bias, a real number, can serve as a scalar summary. Following Bahadir [36], the asymptotic quadratic bias of any estimator, denoted by , where is any of the previous estimators, is defined as: (22)

Consequently, the AQ expressions of the proposed estimators are given below:

  1. ,
  2. ,

5.2 Asymptotic quadratic risk

The asymptotic quadratic risk (QR) can be used as a measure of relative performance with respect to the classical MLE of the full model. To obtain the QR expressions of the proposed estimators, we define the quadratic loss function as: (23) where is any of the proposed estimators, and is the MLE of the model in (11). Also, the asymptotic covariance matrix (AC) of is defined as: (24) Finally, for any p1 × p1 positive definite matrix M, the is defined as: (25) where tr(W) is the trace of the matrix W. To derive the QR expressions we use the following theorem.

Theorem 5 Let y = (y1, y2, …, yq)′ be Nq(μ, Σ), and let ϕ be any measurable function, then ,

The proof can be found in [37].

Theorem 6 Under the assumptions of Theorem (2), the QR expressions are as follows:

  1. ,
  2. ,

Proof:

  1. where

Since , i = 1, 2, 3, the conditional mean of θ(1)|θ(3) is given by: − λ11.2 + B*(B*)−1(θ(3)π) = −λ11.2 + (θ(3)π). Therefore,

By combining E1, E2, and E3, the result holds, and the is obtained.

Similarly, we can prove parts (4) and (5); we omit the details to save space.

Analytical risk comparisons of the proposed estimators can be carried out based on the QR expressions. However, since our results are similar to those discussed by Al-Momani, M. et al [38] and Bahadir Y. et al. [36], we rely on numerical comparisons to assess the estimators' performance.

6 Numerical study

In this section, we examine the performance of the proposed estimators numerically via Monte Carlo simulation experiments and a real data example.

6.1 Monte Carlo Simulation

We conduct a Monte Carlo simulation using N × N square lattices with N = 7, 10, corresponding to sample sizes n = N2 = 49, 100. The design matrix X is generated from a multivariate Gaussian distribution with mean 0 and a covariance matrix with first-order autoregressive structure, which controls the degree of multicollinearity. That is, , with ρx ∈ {0.3, 0.6, 0.9}. The error term ϵ is generated from a multivariate Gaussian distribution with CA covariance structure, ϵN(0, σ2(In − ρW*)−1D). We set σ2 = 1 and employ a queen-based contiguity neighborhood for the matrix W*. The spatial dependence parameter ρ varies over the set {−0.9, −0.5, 0, 0.5, 0.90}. The p × 1 parameter vector β is partitioned as , where is a p1 × 1 vector of ones, is a p2 × 1 vector of zeros, and Δ is the non-centrality parameter defined as Δ = ‖ββ0‖, where ‖.‖ is the Euclidean norm. The values of Δ vary from 0 to 2. When Δ = 0, the null hypothesis in (10) is true; it becomes false as Δ moves away from the null space. The numbers of regression coefficients forming the vector β are (p1, p2) ∈ {(5, 10), (5, 20), (5, 30)}, and we use α = 0.05. To fit the full and sub CA models, we use the spdep R-package [39] and apply the function spautolm to the generated data. Each configuration is repeated for 2000 Monte Carlo runs. In each run, the full model, submodel, pretest, shrinkage, and positive shrinkage Liu-type estimators of β1 are computed, their mean squared errors (MSE) obtained, and the simulated relative efficiencies (SRE) with respect to the full model estimator () of β1 calculated for all values of Δ using the following formula: (26) where for any estimator of β1, say , the . An SRE greater than one indicates superiority over the full model Liu estimator, and vice versa. 
We noticed no significant differences when changing the spatial dependence parameter ρ, so we present only the graphs for ρ = 0.90 for different values of Δ, which appear below.
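Three ingredients of this design (queen contiguity for W*, the AR(1) design covariance, and the SRE ratio) can be sketched compactly. The following Python version is illustrative only; in the paper these steps are handled in R via spdep, and the function names below are our own:

```python
import numpy as np

def queen_adjacency(N):
    """Queen contiguity on an N x N lattice: cells sharing an edge *or* a
    corner are neighbors, so interior cells have 8 neighbors."""
    n = N * N
    W = np.zeros((n, n))
    for i in range(N):
        for j in range(N):
            k = i * N + j
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    ii, jj = i + di, j + dj
                    if 0 <= ii < N and 0 <= jj < N:
                        W[k, ii * N + jj] = 1
    return W

def ar1_cov(p, rho_x):
    """First-order autoregressive covariance (rho_x^{|i-j|}) used to draw a
    design matrix with controllable multicollinearity."""
    idx = np.arange(p)
    return rho_x ** np.abs(idx[:, None] - idx[None, :])

def sre(mse_full, mse_est):
    """Simulated relative efficiency: values > 1 favor the candidate estimator."""
    return mse_full / mse_est
```

Here a corner cell has 3 queen neighbors and an interior cell has 8, and increasing rho_x toward 0.9 strengthens the collinearity among the generated columns.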

Figs (1)–(3) lead to the following conclusions:

  1. dominates the classical MLE estimator uniformly over all Δ values. Further, as p2 increases, its efficiency increases for fixed values of ρ and ρx. Its efficiency also increases as the multicollinearity among the explanatory variables in the design matrix becomes stronger.
  2. When Δ = 0, the Liu-type submodel estimator dominates all other estimators, as expected since the null hypothesis is true. However, as Δ moves away from the null space, the SRE of the submodel estimator decreases sharply, and it becomes inefficient compared with the rest of the estimators.
  3. As the correlation coefficient ρx among the explanatory variables increases, the SRE values also increase, holding the other parameters fixed.
  4. The SREs of all estimators increase as the number of zero coefficients (p2) increases.
  5. The Liu-type positive shrinkage estimator uniformly dominates the other estimators.
Fig 1. SRE of the proposed estimators with respect to () when n = 49, 100, ρx ∈ {0.3, 0.6, 0.9}, ρ = 0.90, and (p1, p2) = (5, 10).

https://doi.org/10.1371/journal.pone.0283339.g001

Fig 2. SRE of the proposed estimators with respect to () when n = 49, 100, ρx ∈ {0.3, 0.6, 0.9}, ρ = 0.90, and (p1, p2) = (5, 20).

https://doi.org/10.1371/journal.pone.0283339.g002

Fig 3. SRE of the proposed estimators with respect to () when n = 49, 100, ρx ∈ {0.3, 0.6, 0.9}, ρ = 0.90, and (p1, p2) = (5, 30).

https://doi.org/10.1371/journal.pone.0283339.g003

6.2 Boston housing data

Harrison and Rubinfeld [40] examined a number of practical issues concerning the use of housing market information for census tracts in the Boston Standard Metropolitan Statistical Area in 1970. Their main goal was to determine the relationship between a group of (15) variables and the median price of owner-occupied homes in Boston. A corrected version of the data set with additional spatial information was provided by Gilley and Pace [41]. The data set is available in the R packages MASS and spdep; the variables, as given in the package, are as follows:

  • TRACT: Census tract id number.
  • MEDV: Median value of owner-occupied homes in (1000’s USD).
  • CMEDV: Corrected median values of owner-occupied housing in (1000’s USD).
  • ZN: Proportions of residential land zoned for lots over 25,000 square feet per town (constant for all Boston tracts).
  • INDUS: Proportions of non-retail business areas per town.
  • RM: Average numbers of rooms per dwelling.
  • AGE: Proportions of owner-occupied units built prior to 1940.
  • CHAS: A dummy variable with two levels, 1 if the tract borders the Charles River; 0 otherwise.
  • NOX: Levels of nitrogen oxides concentration (parts per 10 million) per town.
  • CRIM: Crime rate per capita.
  • DIS: Weighted distance to five employment centers.
  • RAD: An index of accessibility to radial highway per town (constant for all Boston tracts).
  • LSTAT: Percentage of lower status population.
  • TAX: Property tax rate per (USD 10,000) per town (constant for all Boston tracts).
  • PTRATIO: Pupil-teacher ratios per town (constant for all Boston tracts).
  • B: The variable B = 1000(b − 0.63)2, where b is the proportion of blacks.

Fig (4) shows a color plot of the correlation coefficients among all variables, in which a strong linear relationship appears in dark colors; as the relationship weakens, the color lightens, and it disappears when no linear relation exists. The figure shows strong linear relationships between CMEDV and some of the other variables. As we do not have any prior information about the available covariates, we may apply any variable selection method. In our scenario, we employ the AIC/BIC selection criterion to produce a submodel.

The full model, which contains all available covariates, and the submodel obtained by the AIC/BIC selection are given in Table (1). To evaluate the performance of the proposed estimators, we use a bootstrapping method suggested by Solow [42] and compute the mean squared prediction error (MSPE) for each estimator as follows:

  1. We use the spautolm function to fit the full CA model using all available variables as reported in Table (1), and obtain the maximum likelihood estimates of β, σ2, the spatial dependence parameter ρ, the matrix Vn, and the biasing parameter d using the formula suggested by Alheety et al. [43], which is given by: (27) and we estimate d and Vn by replacing σ2, ρ, and β with their corresponding MLE estimates, where
  2. Employ the Cholesky decomposition for the matrix to write it as , where is an (n × n) lower triangular matrix.
  3. Define the residual as , where and define as the centered residual.
  4. Obtain a sample with replacement of size (n) from to get .
  5. Compute the bootstrapping response value as .
  6. Use the bootstrapped value Y* to fit both the full and sub models and obtain the values of the proposed estimators.
  7. Compute the predicted value of the response variable using each estimator as , where is any of the estimators in the set .
  8. Compute the square root of the MSPE for the kth bootstrapping sample as (28) where K is the number of bootstrapping samples.
  9. Compute the relative efficiency of the square root of the MSPE (REMSPE) as follows: (29) where is any of the proposed estimators, and we use K = 2000 bootstrapping samples.
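The bootstrap steps above can be sketched in Python as follows. This is a simplified stand-in, not the paper's R pipeline: the CA model refit is abstracted into a user-supplied `fit` callable (plain OLS in the test), and the prediction in step 7 is taken as Xβ* against the bootstrap response:

```python
import numpy as np

def bootstrap_rmspe(Y, X, beta_hat, V_hat, fit, K=200, seed=1):
    """Residual bootstrap sketch in the spirit of Solow's method:
    whiten the residuals with the Cholesky factor of the estimated
    covariance V_hat, center and resample them, rebuild Y*, refit, and
    return the root MSPE for each of the K bootstrap samples.
    `fit` is any callable (Y*, X) -> beta* standing in for the CA model fit."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    Lc = np.linalg.cholesky(V_hat)                # V_hat = Lc Lc'
    e = np.linalg.solve(Lc, Y - X @ beta_hat)     # whitened residuals
    e = e - e.mean()                              # centered residuals
    out = np.empty(K)
    for k in range(K):
        e_star = rng.choice(e, size=n, replace=True)
        Y_star = X @ beta_hat + Lc @ e_star       # bootstrap response
        beta_star = fit(Y_star, X)
        out[k] = np.sqrt(np.mean((Y_star - X @ beta_star) ** 2))
    return out
```

The REMSPE of step 9 is then the ratio of the averaged root MSPEs of two competing estimators over the K samples.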
Table 1. Full and submodel for the Boston housing data.

https://doi.org/10.1371/journal.pone.0283339.t001

A value of the REMSPE greater than one indicates the superiority of the estimator in the denominator.

Table (2) summarizes the relative efficiencies. The table indicates that the submodel estimator dominates all other estimators, followed by the pretest estimator . This is expected when the model chosen by AIC/BIC is accurate or nearly accurate. In addition, all proposed estimators of β1 dominate the estimator .

7 Conclusion

In this paper, we proposed the pretest, shrinkage, and positive shrinkage estimators for the CA model's large-scale effects vector of parameters. We formulated a hypothesis of the form H0 : β2 = 0 and obtained the full model and submodel Liu estimators of the main effect β1 under this UPI, then combined these two estimators to construct the Liu-type pretest, shrinkage, and positive shrinkage estimators.

Further, the set of estimators was compared analytically based on their asymptotic bias, quadratic bias, and risks, and the related expressions were provided. These estimators were also evaluated numerically via their relative performance using extensive simulation experiments based on different values of the spatial dependence parameter (ρ) and different lattice sizes (N), and the proposed estimators were applied to a real data example. Our analytical and numerical results showed that the submodel estimator is superior whenever the restriction given by H0 : β2 = 0 is correct or nearly correct, that is, when the UPI is true. However, when the restriction is false and the test statistic rejects the null hypothesis, the submodel estimator becomes inefficient and has the highest MSE, while the Liu-type positive shrinkage estimator shows the best performance among all estimators regardless of the accuracy of the UPI.

For future research, the proposed estimation approach could be applied to other spatial regression models, and the performance of the resulting estimators investigated analytically and numerically. Another attractive direction is the extension of the proposed estimation strategies to the high-dimensional data (HDD) case of the large-scale effect regression parameter vector of the CA model, when (p ≫ n), and the study of the behavior of the Liu-type estimators in that setting. In addition, the Liu-type estimation technique could be studied under a prior distribution for the CA model, yielding updated Liu, pretest, shrinkage, and positive shrinkage Liu-type estimators of β1.

References

  1. Shen X., Wen Y., Cui Y., and Lu Q. A conditional autoregressive model for genetic association analysis accounting for genetic heterogeneity. (2022) Statistics in Medicine, 41(3), 517–542. pmid:34811777
  2. Pérez-Molina E. Exploring a multilevel approach with spatial effects to model housing price in San José, Costa Rica. (2022) Environment and Planning B: Urban Analytics and City Science, 49(3), 987–1004.
  3. Thamrin S. A., Khaerati R., Jaya A. K., and Ansariadi. Estimation of relative risk of dengue fever in Makassar using localized Bayesian autoregressive conditional model. (2021) Journal of Physics: Conference Series, 1752(1).
  4. Zeng Q., Wen H., Wong S. C., Huang H., Guo Q., and Pei X. Spatial joint analysis for zonal daytime and nighttime crash frequencies using a Bayesian bivariate conditional autoregressive model. (2020) Journal of Transportation Safety and Security, 12(4), 566–585.
  5. Saha D., Alluri P., Gan A., and Wu W. Spatial analysis of macro-level bicycle crashes using the class of conditional autoregressive models. (2018) Accident Analysis and Prevention, 118, 166–177. pmid:29477462
  6. Ver Hoef J. M., Peterson E. E., Hooten M. B., Hanks E. M., and Fortin M.-J. Spatial autoregressive models for statistical inference from ecological data. (2018) Ecological Monographs, 88(1), 36–59.
  7. Wang C., Quddus M. A., and Ison S. G. Impact of traffic congestion on road accidents: a spatial analysis of the M25 motorway in England. (2009) Accident Analysis and Prevention, 41(4), 798–808. pmid:19540969
  8. Kleinschmidt I., Sharp B., Mueller I., and Vounatsou P. Rise in malaria incidence rates in South Africa: a small-area spatial analysis of variation in time trends. (2002) American Journal of Epidemiology, 155(3), 257–264. pmid:11821251
  9. Gelfand A. E., and Vounatsou P. Proper multivariate conditional autoregressive models for spatial data analysis. (2003) Biostatistics, 4(1), 11–25. pmid:12925327
  10. Al-Momani M., Riaz M., and Saleh M. F. Pretest and shrinkage estimation of the regression parameter vector of the marginal model with multinomial responses. (2022) Statistical Papers.
  11. Al-Momani M., and Dawod A. B. A. Model Selection and Post Selection to Improve the Estimation of the ARCH Model. (2022) Journal of Risk and Financial Management, 15(4), 174.
  12. Li Y., and Jin B. Pairwise Fusion Approach Incorporating Prior Constraint Information. (2020) Communications in Mathematics and Statistics, 8, 47–62. https://doi.org/10.1007/s40304-018-0168-3
  13. Arumairajan S. On the Stochastic Restricted Modified Almost Unbiased Liu Estimator in Linear Regression Model. (2018) Communications in Mathematics and Statistics, 6, 185–206. https://doi.org/10.1007/s40304-018-0131-3
  14. Saleh A. K. Md. E., Arashi M., and Kibria B. M. G. Theory of Ridge Regression Estimation with Applications. (2019) John Wiley & Sons.
  15. Saleh A. K. Md. E., Arashi M., Saleh R. A., and Norouzirad M. Rank-Based Methods for Shrinkage and Selection: With Application to Machine Learning. (2022) John Wiley & Sons.
  16. Ahmed S. E., Hussein A., and Al-Momani M. Efficient estimation for the conditional autoregressive model. (2015) Journal of Statistical Computation and Simulation, 85(13), 2569–2581.
  17. Nkurunziza S., Al-Momani M., and Lin Y. Shrinkage and LASSO strategies in high-dimensional heteroscedastic models. (2015) Communications in Statistics—Theory and Methods, 45(15), 4454–4470.
  18. Peng L., Xu J., and Kutner N. Shrinkage estimation of varying covariate effects based on quantile regression. (2014) Statistics and Computing, 24, 853–869. https://doi.org/10.1007/s11222-013-9406-4
  19. Saleh A. K. Md. E. Theory of Preliminary Test and Stein-Type Estimation with Applications. (2006) John Wiley & Sons.
  20. Hoerl A. E., and Kennard R. W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. (1970) Technometrics, 12(1), 55–67.
  21. Liu K. A new class of biased estimate in linear regression. (1993) Communications in Statistics—Theory and Methods, 22(2), 393–402.
  22. Li Y., and Yang H. A new Liu-type estimator in linear regression model. (2012) Statistical Papers, 53, 427–437.
  23. Yüzbaşı B., Ahmed S. E., and Güngör M. Improved Penalty Strategies in Linear Regression Models. (2017) REVSTAT–Statistical Journal, 15(2), 251–276. https://revstat.ine.pt/index.php/REVSTAT/article/view/212
  24. Yüzbaşı B., Asar Y., and Ahmed S. E. Liu-type shrinkage estimations in linear models. (2022) Statistics, 56(2), 396–420.
  25. Babar I., Ayed H., Chand S., Suhail M., Khan Y. A., and Marzouki R. Modified Liu estimators in the linear regression model: An application to Tobacco data. (2021) PLOS ONE, 16(11), 1–13.
  26. Arashi M., Kibria B. M. G., Norouzirad M., and Nadarajah S. Improved preliminary test and Stein-rule Liu estimators for the ill-conditioned elliptical linear regression model. (2014) Journal of Multivariate Analysis, 126, 53–74. https://doi.org/10.1016/j.jmva.2014.01.002
  27. Arashi M., Norouzirad M., Ahmed S. E., and Yüzbaşı B. Rank-Based Liu Regression. (2018) Computational Statistics, 33(3), 1525–1561.
  28. Arashi M., Norouzirad M., Roozbeh M., and Mamode Khan N. A High-Dimensional Counterpart for the Ridge Estimator in Multicollinear Situations. (2021) Mathematics, 9(23). https://www.mdpi.com/2227-7390/9/23/3057
  29. Arashi M., Asar Y., and Yüzbaşı B. SLASSO: A scaled LASSO for multicollinear situations. (2021) Journal of Statistical Computation and Simulation, 91(15), 3170–3183.
  30. Arashi M., Roozbeh M., Hamzah N. A., and Gasparini M. Ridge regression and its applications in genetic studies. (2021) PLOS ONE, 16, 1–17. https://doi.org/10.1371/journal.pone.0245376
  31. Cressie N., and Wikle C. K. Statistics for Spatio-Temporal Data. (2011) Wiley, New Jersey.
  32. Besag J., York J., and Mollié A. Bayesian image restoration, with two applications in spatial statistics. (1991) Annals of the Institute of Statistical Mathematics, 43(1), 1–20.
  33. Cressie N. Statistics for Spatial Data. (1993) John Wiley & Sons.
  34. Ord K. Estimation methods for models of spatial interaction. (1975) Journal of the American Statistical Association, 70, 120–126.
  35. Mardia K. V., and Marshall R. J. Maximum likelihood estimation of models for residual covariance in spatial regression. (1984) Biometrika, 71(1), 135–146.
  36. Yüzbaşı B., Asar Y., and Ahmed S. E. Liu-type shrinkage estimations in linear models. (2022) Statistics, 56(2), 396–420.
  37. Judge G. G., and Bock M. E. The Statistical Implications of Pre-Test and Stein-Rule Estimators in Econometrics. (1978) North-Holland Pub. Co.
  38. Al-Momani M., Hussein A. A., and Ahmed S. E. Penalty and Related Estimation Strategies in The Spatial Error Model. (2017) Statistica Neerlandica, 71(1), 4–30.
  39. Bivand R., and Wong D. W. S. Comparing implementations of global and local indicators of spatial association. (2018) TEST, 27(3), 716–748.
  40. Harrison D., and Rubinfeld D. L. Hedonic housing prices and the demand for clean air. (1978) Journal of Environmental Economics and Management, 5(1), 81–102.
  41. Gilley O. W., and Pace R. K. On the Harrison and Rubinfeld Data. (1996) Journal of Environmental Economics and Management, 31(3), 403–405.
  42. Solow A. R. Bootstrapping correlated data. (1985) Journal of the International Association for Mathematical Geology, 17, 769–775.
  43. Alheety M. I., Ramanathan T. V., and Gore S. D. On the distribution of shrinkage parameters of Liu-type estimators. (2009) Brazilian Journal of Probability and Statistics, 23(1), 57–67.