Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

MCMCINLA estimation of varying coefficient spatial lag model—A study of China’s economic development in the context of population aging

  • Jiaqi Teng,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation College of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang, China

  • Shuzhen Ding,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation College of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang, China

  • Huiguo Zhang,

    Roles Formal analysis, Funding acquisition, Project administration, Supervision, Writing – review & editing

    Affiliation College of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang, China

  • Xijian Hu

    Roles Conceptualization, Formal analysis, Funding acquisition, Project administration, Supervision, Writing – review & editing

    xijianhu@xju.edu.cn

    Affiliation College of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang, China

Abstract

The dominant spatial econometric model in spatial econometrics is the parametric form, while in the realistic context, the variables often do not satisfy the assumption of linearity and have nonlinear relationships with each other. In this paper, we introduce nonparametric terms into spatial econometric models and propose the MCMCINLA estimation method for varying coefficient spatial lag models. The empirical analysis is conducted with the socioeconomic data of mainland China from 2015 to 2020 to discuss the influencing factors and spatial and temporal distribution characteristics of China’s economic development under the classical spatial lag model and the varying coefficient spatial lag model with population aging as a special covariate, respectively. The results show that with the gradual aging of the population, foreign trade will inhibit the development of regional economy to a certain extent, while urbanization process, resident income, real estate development and high-tech development will have a driving effect on economic growth, and high-tech development has the strongest mobilization on regional economic development. Compared with the classical spatial lag model, the varying coefficient spatial lag model can more fully exploit the information of variables in a more realistic context and derive the variable evolution process.

1. Introduction

In many fields such as ecology, epidemiology and economics, the observations obtained often have certain geospatial characteristics, and the models established often have both spatial correlation and spatial heterogeneity, when the traditional econometric methods are no longer applicable because they cannot distinguish between spatial correlation and spatial heterogeneity, and spatial econometrics comes into being. In recent years, the spatial element has received increasing attention in the empirical evidence of economics, and spatial econometric models have become a standard analytical tool for spatial data and have been widely used in regional economics [1], ecology [2], and sociology [3].

The classical spatial econometric models mainly make linear assumptions and model spatial correlation or spatial spillover effects, while in the real context there is not necessarily a linear relationship between variables, but more often nonlinear characteristics [4]. Therefore, it is necessary to introduce nonparametric terms into the spatial econometric models.

The spatial varying coefficient model is a generalization of the regression model, which was first given by Brunsdon et al. (1996) [5], and proposed a nonparametric smooth estimation method with the distance function between the observation points as the weight—Geographically weighted regression (GWR) method. He introduced the spatial properties of the data into the regression coefficients, assuming that the regression coefficients are a function of the spatial coordinates of each observation point, and explored and analyzed the spatial non-stationarity of the regression relationship by estimating the value of the coefficient function at each position. Fotheringham et al. (2002) [6] systematically summarized the principle and statistical inference of GWR method for the first time, and discussed the application of GWR in generalized linear models. In addition, the development of GWR further expands the related research on nonparametric spatial econometric models. For instance, Cho et al. (2010) [7] proposed a GWR-SLM varying coefficient model that combines GWR with spatial lag model; Basile et al. (2014) [8] considered the existence of spatial correlation, spatial heterogeneity and nonlinearity simultaneously, and obtained a new penalized spline spatial lag geo-additive model (PS-SLM); Geniaux et al. (2018) [9] presented a new hybrid spatial geo-weighted lag model (M-GWR-SLM) which allows regression parameters and spatial dependencies to vary over space. With the gradual development of spatial statistics and Bayesian statistics, the spatially varying coefficient Bayesian hierarchical model has become an effective alternative to GWR and has been favored. Gelfand et al. (2003) [10] focused around spatially varying coefficient Bayesian hierarchical models and provided extensions to generalized linear models and spatial-temporal settings; Bakar et al. (2016) [11] developed an R package, spTDyn, for modeling spatially varying coefficients and temporal dynamic processes in Bayesian hierarchical models; Kroc (2017) [12] explored the habitat behavior characteristics of moose with the help of a spatially varying coefficient Bayesian hierarchical model.

Due to the existence of spatial coordinates, the models discussed above can easily obtain the variation characteristics of the response variable with the spatial location, but it is not easy to obtain the dynamic variation law of the response variable with a covariate. In order to deeply explore the regression relationship between variables, this paper focuses on such models in which the coefficient function is an unknown function of a covariate, that is, by constructing a varying coefficient spatial lag model for research and analysis. The model has the advantages of high flexibility and strong adaptability, it can effectively avoid the problem of the curse of dimensionality, and can obtain the dynamic change characteristics of the response variable and other covariates under the time lapse of a covariate, which is especially suitable for use in longitudinal data analysis.

Spatial lag model is the most widely used type of spatial econometric model. It is used to embody spatial autocorrelation and can reflect the impact of neighboring regions on the region. Currently, estimation on nonparametric spatial lag models is flourishing, and various estimation methods have been proposed successively. Su and Jin (2010) [13] estimated the parametric and nonparametric components of a semiparametric spatial lag model using Local Linear and Cross-Sectional Likelihood methods, respectively. Li and Chen (2013) [14] proposed a new class of semiparametric varying coefficient spatial lag models by setting the coefficients of the explanatory variables in the form of unknown functions of certain variables and estimated the model parameters with the help of Contour Maximum Likelihood. Fang and Qian (2013) [15] estimated the nonparametric spatial lag model by using inverse jump MCMC. Chen et al. (2015, 2016) [16, 17] successively proposed Two-Stage Least Squares estimation and General Moment estimation for semiparametric panel spatial lag models. Krisztin (2017) [18] combined penalized splines with Bayesian methods to propose a new Bayesian semiparametric estimation method for semiparametric spatial lag models. Li and Chen (2018) [19] constructed a nonparametric spatial lag model with random independent variables and gave a General Moment estimation for this model. Zhao and Pu (2021) [20] proposed a new spatial lag single indicator varying coefficient model and completed the nonparametric estimation process by combining spline and Maximum Likelihood.

Bayesian estimation plays an increasingly important role in the statistical inference of spatial econometric models, as the next major parameter estimation method after Maximum Likelihood estimation and Moment estimation. Compared with Maximum Likelihood estimation and Moment estimation, Bayesian estimation is more adaptable to data. Maximum Likelihood estimation finds parameters by maximizing the likelihood function, which is more dependent on the data. When the amount of data is large, the Maximum Likelihood can estimate the parameters well; while when the amount of data is small, the Maximum Likelihood estimation result may not be ideal. Bayesian estimation uses probability to describe the change of unknown parameters. It regards the parameter as a random variable. By introducing a priori to estimate the parameter, a good estimation effect can still be obtained in the case of a small amount of data [21].

For Bayesian statistics, although it can better complete the parameter estimation in the case of small amount of data, it also has the problem of complex calculation process, especially the inference and calculation of the marginal posterior distribution of unknown parameters has always been a major difficulty in Bayesian statistics. In the early 1990s, Robert et al. [22] made a major breakthrough in Bayesian inference by applying the Markov Chain Monte Carlo (MCMC) method through inference simulation. MCMC can solve sampling difficulties such as high-dimensional integration, but may be limited by convergence speed and numerical stability when facing large models or large amounts of data. To address this problem, Rue et al. (2009) [23] proposed Integrated Nested Laplace Approximation (INLA), an algorithm that combines Laplace approximation with modern numerical integration in a Bayesian framework, which can greatly shorten the computation time and improve the computation efficiency and reduce the computation cost while guaranteeing the estimation accuracy of MCMC algorithm. Bivand et al. (2014) [24] extended INLA by Bayesian Model Averaging (BMA) for spatial econometric models and developed an INLABMA package. Gómez-Rubio et al. (2017) [25] described the implementation of a new class of latent models in INLA that can be used to fit spatial econometric models directly to INLA. Gómez-Rubio and Rue (2018) [26] gave a new approach combining INLA and MCMC, namely MCMCINLA, making it possible to fit spatial econometric models. Gómez-Rubio et al. (2020) [27] further explored BMA by using INLABMA for multivariate posterior inference of spatial econometric models with good results.

Due to the limitation of random effects of spatial lag models, INLA currently only supports estimating spatial lag models in parametric form, and no scholars have yet used the method to estimate nonparametric spatial lag models. In this paper, we combine MCMC with INLA and propose MCMCINLA estimation of nonparametric spatial lag models with the help of penalized spline technique to explore the nonparametric relationship between covariates and response variables in varying coefficient spatial lag models. This estimation method can overcome the problem of long computation time of single MCMC and fill the gap that single INLA cannot estimate nonparametric spatial lag models, which can provide new ideas for statistical inference of nonparametric spatial econometric models.

According to statistics, the proportion of China’s population aged 65 and above has reached 13.5% by 2021, indicating that China has entered a deeply aging society [28]. In the context of an increasingly aging population, how to adjust the industrial structure to continue to effectively promote sustainable and healthy economic development has become a key issue for current research. In the past two years, a large number of scholars have explored the relationship between population aging, technological innovation, industrial structure and other factors and economic development using different methods, and some useful conclusions have been obtained. Song and Gao (2022) [29] explored the effects of population aging and private sector market entry on economic growth by constructing the benchmark regression econometric model and the synergistic effects econometric model. Li and Gao (2022) [30] studied the correlation between technological innovation and economic development in the perspective of population aging using the mediating effect and panel threshold effect model. Su (2021) [31] conducted an in-depth analysis of how population aging affects economic growth from the perspective of aggregate supply and aggregate demand based on the fixed effects econometric model. Li and Qin (2020) [32] explored the relationship between population aging, social security expenditures and economic development with the help of the dynamic panel data model.

However, in practical problems, the linear assumptions between economic growth and factors such as technological innovation, income expenditure, and industrial structure are often not satisfied, and simply constructing a spatial econometric model may not be sufficient to fully characterize the regression relationship among the variables and may easily ignore the key information. Based on the above considerations, this paper explores the economic development of 31 regions in mainland China from 2015 to 2020 using MCMCINLA, and constructs spatial lag models and varying coefficient spatial lag models successively for comparative analysis of influencing factors and spatial and temporal characteristic distribution analysis with population aging as a special variable, in order to further explore the hidden information and come up with economic development strategies and suggestions that are more in line with the realistic context of China.

2. Materials and methods

2.1 Data source and preprocessing

Through the China Statistical Yearbook (http://www.stats.gov.cn), we collect data on 15 socioeconomic indicators that can be used to reflect economic development in 31 regions of mainland China from 2015 to 2020 as the object of our study, and the specific indicators and corresponding variables for each indicator are shown in Table 1.

thumbnail
Table 1. Selected indicators and corresponding variables of China’s economic development research.

https://doi.org/10.1371/journal.pone.0279504.t001

In view of the fact that the three variables of real estate development, foreign economy and high-tech development contain a large number of indicators, and there are certain correlations among the indicators with large differences in magnitudes, therefore, before the formal analysis, the data of the indicators involved in these three variables are firstly de-quantified and standardized, and multicollinearity diagnostics.

Abbreviate "real estate development" as A, the corresponding three indicators "sales of complete sets of residential units by real estate development enterprises", "main business income of real estate development enterprises" and "investment completed by real estate development enterprises in the current year" are a1, a2 and a3, respectively. Abbreviation "foreign economy" as B, the corresponding three indicators "total import and export of goods by region", "total import and export of goods by foreign-invested enterprises by region" and "registration of foreign-invested enterprises by region at the end of the year" are b1, b2 and b3, respectively. Abbreviation "high-tech development" as C, the corresponding four indicators "invention patents of industrial enterprises above the scale by region", "number of new product development projects of industrial enterprises above the scale by region", "technology market turnover by region" and "number of major enterprises in national high-tech zones" are c1, c2, c3 and c4, respectively. The results of the multicollinearity diagnostics of real estate development, foreign economy and high-tech development variables for 2015–2020 are shown in Tables 24.

thumbnail
Table 2. The results of the multicollinearity diagnostics of real estate development for 2015–2020.

https://doi.org/10.1371/journal.pone.0279504.t002

thumbnail
Table 3. The results of the multicollinearity diagnostics of foreign economy for 2015–2020.

https://doi.org/10.1371/journal.pone.0279504.t003

thumbnail
Table 4. The results of the multicollinearity diagnostics of high-tech development for 2015–2020.

https://doi.org/10.1371/journal.pone.0279504.t004

Tolerance (TOL) or Variance Inflation Factor (VIF) are generally used as the goodness index to test the existence of collinearity. If Tol<0.1 or VIF>10, there is collinearity. It can be found from the results in Tables 24 that there are different degrees of Tol<0.1 or VIF>10 among the indicators in each year, indicating that the indicators of real estate development, foreign economy and high-tech development have collinearity, and further processing is required to make the research results more reliable.

In order to eliminate the multicollinearity among the indicators, we use principal component analysis to deal with it. The results of the principal component analysis of real estate development, foreign economy and high-tech development variables for 2015–2020 are shown in Tables 57.

thumbnail
Table 5. Principal component extraction of real estate development for 2015–2020.

https://doi.org/10.1371/journal.pone.0279504.t005

thumbnail
Table 6. Principal component extraction of foreign economy for 2015–2020.

https://doi.org/10.1371/journal.pone.0279504.t006

thumbnail
Table 7. Principal component extraction of high-tech development for 2015–2020.

https://doi.org/10.1371/journal.pone.0279504.t007

According to the analysis results, the cumulative contribution rate of the first principal component of real estate development and foreign economy reached more than 85%, and the cumulative contribution rate of the first two principal components of high-tech development reached more than 95% in each year, so the first principal component of real estate development and foreign economy and the first two principal components of high-tech development are selected as the final variable data respectively. And then, the remaining five variables are also subjected to the de-quantile standardization operation to be used in the formal study.

2.2 Model building

Using the level of economic development as the response variable y and the seven variables of urbanization rate, resident income, resident consumption, real estate development, foreign economy, high-tech development and population aging as covariates xj, j = 1,2,…,7, constructing spatial lag model, namely, unvarying coefficient spatial lag model (denoted as "Model I"): (1) where n = 186; ρ is the spatial autocorrelation parameter related to the spatial lag term, which can reflect the existence of spatial correlation between regions; W is the 186 × 186-dimensional spatial weight matrix between regions at different times; βj is the regression coefficient of the covariate xj; the error term ε obeys a Gaussian distribution with mean 0 and diagonal variance-covariance matrix σ2In.

The assumption condition of linear relationship between covariates and response variables is relaxed on the basis of Model I. The variable coefficient term βj (U) is introduced to consider spatial heterogeneity. In the context of China’s increasingly aging population, we pay more attention to the impact of population aging on economic development, and focus our research on the development and changes of the national economy with the deepening of aging. Taking population aging as a covariate U in the varying coefficient function, still using the level of economic development as the response variable y, and the remaining six variables as covariates xj, j = 1,2,…,6 to construct a varying coefficient spatial lag model to further explore the relationship between each influencing factor and economic development in the case of population aging. The varying coefficient spatial lag model (denoted as "Model II") is established as follows: (2) where βj (U) = (βj (U1), βj (U2),…,βj (Un)) ′ denotes the unknown nonparametric function on the covariate U. The remaining variables are defined in the same way as Model I.

2.3 MCMCINLA algorithm

The MCMCINLA algorithm was proposed by Gómez-Rubio and Rue in 2018 and can be used to fit spatial econometric models, linear regression models with covariates having missing data, Bayesian Lasso models and mixed models. The core idea of the algorithm is to divide the parameters to be estimated into two groups, one using the Metropolis-Hastings (MH) algorithm in MCMC and the other using the Bayesian Model Averaging (BMA) algorithm [33] in INLA, from which the estimates of all parameters in the model are obtained. Here, we use MH for the parameter ρ for estimation and BMA for the rest of the parameters.

Since the INLA algorithm mainly targets the model with a structured additive regression model with a latent random field of Gaussian Markov Random Fields (GMRF), it is first necessary to prove that the model conforms to the INLA framework before using MCMCINLA for estimating the varying coefficient spatial lag model. Using a Penalized spline method based on a cubic B spline basis with equidistant nodes to fit the varying coefficient term βj (U) [34], letting zj = xj B(j), so that the varying coefficient spatial lag model can be rewritten as (3) Achieving punishment by setting a priori, the prior for α(j) is defined by the difference penalty, with first-order differences corresponding to first-order Random Walk and second-order differences corresponding to second-order Random Walk [35]. Here we use a second-order Random Walk with backward differences, so that α(j) has a Gaussian prior with mean 0 and accuracy matrix Q, and its accuracy matrix Q is semipositive definite, such that α(j) is an IGMRF [36]. Letting (4) where ξ is the perturbation term added to the model. According to the previous analysis, it is known that α has a Gaussian prior with mean 0 and accuracy matrix Q, and ε obeys a Gaussian distribution with mean 0 and accuracy matrix τIn. Therefore, (x, α) is a GMRF with mean 0 and precision matrix P (structure as follows), so the model conforms to the INLA framework and can be solved using MCMCINLA. Only the main results are shown here, and the detailed proof procedure is shown in S1 Appendix.

It is worth noting that in the expansion of the varying coefficient term βj (U) using the B spline basis function, its choice of both the number and location of nodes is very sensitive [37]. To solve this selection problem, Eilers and Marx (1996) [38] suggested using a moderate number of nodes and defining a roughness penalty based on the difference of adjacent B spline coefficients to ensure that the fitted curve is sufficiently smooth, which here corresponds to using a second-order Random Walk (RW2) prior for the penalty. The number of nodes is generally selected using goodness-of-fit metrics such as AIC, BIC, and DIC, and here we use the DIC to select the number of nodes. Another noteworthy issue is that the mean of the varying coefficient term βj (U) is generally unidentifiable [39]. To ensure identifiability, it is required that βj (U) is constrained to have zero mean, i.e., it satisfies , and thus βj (U) can be included in the estimation at each iteration by MCMC centered on its mean.

The estimation process of fitting the varying coefficient spatial lag model using the MCMCINLA algorithm is performed in four main steps as follows. All calculations are done in the R-INLA (http://www.r-inla.org) environment in R.

  1. 1. P spline transformation. The curve is fitted using a cubic B spline basis function and the RW2 prior f(U, model = "rw2") is set as a penalty for the model, at which time the parameters to be estimated in the model are the parameter ρ, the hyperparameter τ and the parameter α.
  2. 2. Estimation of ρ using the MH algorithm. The process is carried out in three main steps as follows.

Step1: Assume that starting from the initial point ρ(1) = 0, the model is fitted conditionally with ρ(1) to obtain π(y|ρ(1)), π(τ|y,ρ(1)) and π(α|y, ρ(1));

Step2: Use the MH algorithm to sample from the posterior of ρ, propose a new point ρ* for ρ by proposing the distribution q(∙│ρ(j−1)), fit the model conditionally on ρ* to obtain π(y|ρ*), π(τ|y,ρ*) and π(αy,ρ*), and calculate π(ρ*), q(ρ*|ρ(j)) and q(ρ(j))|ρ*);

Step3: Calculate the acceptance probability P of ρ* and determine whether the proposal is acceptable or not, where (5) if the proposal is accepted, then ρ(j+1) = ρ* with π(τy,ρ(j+1)) = π(τy,ρ*), π(τy,ρ(j+1)) = π(τy,ρ*), π(αy,ρ(j+1)) = π(αy,ρ*); otherwise, ρ(j+1) = ρ(j), and π(τy,ρ(j+1)) = π(τy,ρ(j)), π(αy,ρ(j+1)) = π(αy,ρ(j)). This iterative process is executed until the end of the estimation.

  1. 3. Estimation of τ and α using the BMA algorithm. For the hyperparameter τ and the parameter α, the conditional margins π(τy,ρ(j)) and π(αy,ρ(j)) generated by the MH algorithm in each iteration can be obtained using BMA, and further the posterior margins of0 τ and α are derived by integrating over ρ, namely
(6)(7)
  1. 4. Estimation of βj (U). The model fitted using MCMCINLA is the model fitted by P spline, which is directly obtained as an estimate about α(j). Therefore, it is finally necessary to multiply it with the design matrix B(j) to obtain an estimate of the varying coefficient term βj (U) using βj (U) = B(j)α(j).

3. Results analysis

3.1 Model selection

Deviance information criterion (DIC) and Widely applicable information criterion (WAIC) are two commonly used metrics to evaluate the fitting effectiveness of Bayesian models, and in general, the smaller the value of DIC and WAIC, the better the fit of the model. From the fitting results in Table 8, it can be found that the DIC and WAIC value of the two models are very small, and the value of Model II is smaller than that of Model I, indicating that the fitting effect of Model II is better, which is more suitable for studying the dynamic changes of economic development.

thumbnail
Table 8. Fitting evaluation results of Model I and Model II.

https://doi.org/10.1371/journal.pone.0279504.t008

3.2 Analysis of factors influencing with spatial lag model

The Model I is fitted using MCMCINLA, and the estimated values of the spatial autocorrelation parameters and the posterior estimates of the regression coefficients of each influencing factor are obtained as shown in Table 9, and the regression coefficients of each influencing factor are plotted as shown in Fig 1. As estimated by MCMCINLA, the spatial autocorrelation parameter is ρ = 0.5946 (95%CI:0.3522~0.8450), indicating that there is a significant spatial correlation between the economic development of provinces. To a certain extent, the economic development of a region also stimulates the economic growth of the surrounding regions. From the results in Table 9, it can be found that the 95% credible intervals for urbanization rate, resident consumption and foreign economy contain 0, indicating that these three factors have no effect on the economic development of the region. The posterior mean of resident income, real estate development, high-tech development and population aging are 0.334, 0.460, 0.124 and 0.084, respectively, indicating that these four factors have significant positive effects on the economic development of the region, and real estate development has the strongest contribution to the economic development of the whole region.

thumbnail
Fig 1. Regression coefficients of each influence factor of the spatial lag model.

https://doi.org/10.1371/journal.pone.0279504.g001

thumbnail
Table 9. Posterior estimates of regression coefficients of each influence factor in spatial lag model.

https://doi.org/10.1371/journal.pone.0279504.t009

3.3 Analysis of factors influencing with varying coefficient spatial lag model

The Model II is fitted using MCMCINLA, and the estimated values of the spatial autocorrelation parameters and the posterior estimates of the regression coefficients of each influencing factor are obtained as shown in Table 10, and the regression coefficients of each influencing factor are plotted as shown in Fig 2. At this point, the spatial autocorrelation parameter is ρ = 0.2745 (95%CI:0.0816~0.4312). The 95% credible interval for resident consumption contains 0, indicating that resident consumption has no impact on economic development. The factor that has a significant negative effect on economic development is the foreign economy, with a posteriori mean of -0.263. The factors that contribute significantly to economic development are urbanization rate, resident income, real estate development and high-tech development, with posteriori means of 0.143, 0.127, 0.256 and 0.357, respectively. At the same time, it can be found that in the context of aging, high-tech development has replaced real estate development as the most mobilizing factor for regional economic development.

thumbnail
Fig 2. Regression coefficients of each influence factor of the varying coefficient spatial lag model.

https://doi.org/10.1371/journal.pone.0279504.g002

thumbnail
Table 10. Posterior estimates of regression coefficients of each influence factor in the varying coefficient spatial lag model.

https://doi.org/10.1371/journal.pone.0279504.t010

According to Fig 2, it is possible to further explore the changes of economic development and each factor in the scenario of increasing population aging in Chinese society. It is clear from the figure that there is not a direct linear relationship between the influencing factors and economic development. The urbanization rate curve is basically located in the positive half-axis of the y-axis, indicating that the urbanization process plays a certain role in promoting the development of regional economy, and with the gradual increase of aging, the promotion role begins to gradually weaken after reaching a certain peak. The main reason for this change is that the urbanization process can promote the influx of rural population to the cities to provide more labor for the industrial development of the cities [40], while the increasing aging problem makes the industrial development generate the problem of insufficient labor supply, which leads to some impact on the economic development rate of the region. The vast majority of the income curve lies on the positive half-axis of the y-axis, indicating that the income level of residents also plays a certain role in promoting the development of the regional economy, and with the gradual aggravation of aging, the role of income in promoting economic development is first decreasing and then increasing. The real estate development curves are basically located in the positive half-axis of the y-axis, indicating that the development of real estate has a certain role in promoting the development of the regional economy, and it is most fluctuated by the influence of aging. The main reason is that the construction of real estate requires a large amount of labor force, and the increasing proportion of the aging population directly reduces the main source of engineering construction, which has a certain impact on the development of the real estate industry and the economic development of the region. The foreign economy curve basically lies in the negative half-axis of the y-axis, indicating that the development of external economic trade does not have a catalytic effect on regional economic growth, and with the gradual increase of aging, the hindering effect of foreign economy on economic development becomes more and more obvious. The main reason for this phenomenon is the decline in the supply of young adult labor after reaching its peak in the context of an aging population, which has led some foreign-invested companies to relocate their factories to regions with a large pool of low-cost labor, such as India [41], thus hampering foreign trade and economic development. The curve of high-tech development basically lies in the positive half-axis of the y-axis, indicating that the development of high-tech plays a certain role in promoting the development of regional economy, and the promotion role gradually increases with the gradual aggravation of aging. The aging population has made people realize that it is difficult to continue to promote rapid economic growth by relying on labor-intensive industries, so they have started to adjust the industrial structure and use high-tech intelligent equipment to replace labor, thus alleviating the economic impact caused by the lack of labor supply [42] and enabling the sustainable and healthy development of the regional economy.

3.4 Analysis of spatial and temporal characteristics with varying coefficient spatial lag model

The posterior mean of population aging variables are obtained using MCMCINLA and the spatial and temporal distribution of China’s economic development from 2015 to 2020 is plotted as shown in Fig 3. It can be found that from a temporal perspective, the economic development of each region during 2015–2019 has continued to improve, and by 2019 there have been a number of regions such as Hebei and Sichuan where the posterior mean of economic development has reached more than 1.4, while thereafter, the COVID-19 began to appear in Wuhan City, Hubei Province at the end of 2019, and under the impact of the epidemic, the economic level of each region in the country in 2020 has seen a significant decline, reaching a maximum of only 1.27. From a spatial perspective, the development of China’s economy has some spatial variability, with the economic development level of the eastern provinces being significantly higher than that of the western provinces. And with the deepening of aging in recent years, Hebei, Sichuan, Hubei and Jiangsu still lead the country in terms of economic level. The "Beijing-Tianjin-Hebei" city circle with Beijing and Tianjin as the center and Hebei as the joint venture, the "Sichuan-Chongqing" city cluster with Sichuan and Chongqing as the center, and the "Yangtze River Delta" economic zone built around the Yangtze River basin have attracted many young people from other provinces to move to them, enabling the regional economy to continue to grow despite the increasing aging population. In addition, according to the spatial distribution map of China’s economic development in 2020, it can be found that Xinjiang may become a potential region for future economic growth. It can draw on the development experience of urban agglomerations to build a network of cities in the northwest, shrink the economic differences between the east and west, and promote further economic development in the northwest. From the perspective of space-time, the gap between the rich and the poor has gradually narrowed, the level of economic development has become more balanced, and the national economic status has tended to be stable. The deepening of population aging has prompted many enterprises to start the transformation from traditional industries to high-tech industries, and through the construction of urban agglomerations, "from point to area", drive the economic development of many regions, and effectively achieve common prosperity.

thumbnail
Fig 3. Spatial and temporal distribution of China’s economic development from 2015 to 2020.

https://doi.org/10.1371/journal.pone.0279504.g003

4. Discussion

By using MCMCINLA to fit the spatial lag model with unvarying coefficients and the spatial lag model with varying coefficients respectively, it can be found that both models conclude that resident income, real estate development and high-tech development will positively contribute to regional economic growth.

However, the comparison also reveals the differences between the two models. After considering the study object in a nonparametric framework, the spatial autocorrelation parameter of the model is significantly lower. The assumption of linear relationship between variables in the classical spatial lag model affects the spatial relationship to a certain extent, making the estimation of spatial correlation high; however, in the context of varying coefficients, this problem can be corrected, making the estimation of spatial autocorrelation parameters in the model more accurate.

After considering the study object in a nonparametric framework, the effect of some of the influencing factors on the response variable has changed, making the results more relevant to reality. For example, the effects of urbanization rate and foreign economy on economic growth change from no effect to positive and negative effects, respectively. To a certain extent, promoting urbanization can promote the development of industry, which in turn promotes regional economic growth; the aging population makes foreign companies shift more to countries with more cheap labor, leading to some restrictions on the development of foreign economy, which is consistent with reality.

After considering the study object in a nonparametric framework, it is possible to dig deeper into the changing relationships and evolutionary trends among variables. Under the condition of constant coefficients, we can only derive the one-to-one correspondence between each influencing factor and the response variable; while in the context of varying coefficients, we can further obtain the trend of the remaining variables and the response variable as one variable changes, so as to make an accurate study of future development and realize one-to-many.

After considering the study object in a nonparametric framework, important variables can be more precisely located to provide strategic references for future development. For example, after setting the factor of population aging as a covariate of the varying coefficient function, the most influential factor on economic development changes from real estate development to high-tech development, which is more suitable for the current economic and social reality in China, so the adjustment of industrial structure can be focused on high-tech, automatic and intelligent system construction in the future.

5. Conclusion

Classical spatial lag models often require linear assumption modeling, while they can also explore the correlation characteristics of spatial data themselves, they are not sufficient to fully characterize the regression relationships between variables in a linear framework alone. This paper explores the nonparametric relationships between covariates and response variables in the varying coefficient spatial lag model based on the P spline, so as to make up for the shortcomings of the single linear regression of the classical spatial lag model and make it more practical. In terms of parameter estimation, we propose the MCMCINLA estimation method for nonparametric spatial lag models, which overcomes the problem of long computation time of single MCMC algorithm and fills the gap that single INLA algorithm cannot estimate nonparametric spatial lag models, and can provide new ideas for statistical inference. In addition, the economic development data of 31 regions in China over 6 years are used as examples to explore the issue of the influencing factors and spatial and temporal distribution of economic development under the classical spatial lag model and under the varying coefficient spatial lag model, respectively. In contrast, it is found that the varying coefficient spatial lag model can more fully explore the data information more in line with the actual background, which can provide judgment and reference for the development trend and strategy of the economic market.

In view of the high complexity of nonparametric spatial lag model itself, compared with MCMC algorithm, MCMCINLA has greatly reduced the operation time, but due to the existence of MH sampling in the algorithm, the algorithm still has the problem of long calculation time in the face of high-dimensional data, which is more suitable for small sample analysis. In addition, this paper mainly analyzes longitudinal data by setting the coefficient function as an unknown function of a covariate. In the future, it can be extended to spatial panel data, and the coefficient function can be set as a function form of spatial position coordinates to conduct more in-depth research on spatial distribution and spatiotemporal distribution.

Supporting information

S1 Data. Socioeconomic data files of various regions in mainland China from 2015 to 2020.

https://doi.org/10.1371/journal.pone.0279504.s001

(ZIP)

S1 Appendix. Proof of GMRF structure for the varying coefficient spatial lag model.

https://doi.org/10.1371/journal.pone.0279504.s002

(PDF)

S2 Appendix. R code for analysis of China’s economic development under the classical spatial lag model and varying coefficient spatial lag model.

https://doi.org/10.1371/journal.pone.0279504.s003

(PDF)

Acknowledgments

The authors appreciate the anonymous reviewers and editors for their constructive comments and suggestions that have significantly improved the quality of this manuscript. In addition, thanks for the support of the National Natural Science Foundation of China (Nos. 11961065), the Natural Science Foundation of Xinjiang (2019D01C045), and the Ministry of education of Humanities and Social Science project (19YJA910007).

References

  1. 1. Hujer R.; Rodrigues P.; Wolf K. Estimating the macroeconomic effects of active labour market policies using spatial econometric methods[J]. International Journal of Manpower, 2009, 30(7):648–671.
  2. 2. Zhang G.; Jia Y.; Su B.; Xiu J. Environmental regulation, economic development and air pollution in the cities of China: Spatial econometric analysis based on policy scoring and satellite data[J]. Journal of Cleaner Production,2021,328.
  3. 3. Albers H.J.; Ando A.W.; Chen X. Spatial-Econometric Analysis of Attraction and Repulsion of Private Conservation by Public Reserves[J]. SSRN Electronic Journal, 2006, 56(1):33–49.
  4. 4. Hou R.; Wang Q.; Li Y. Minimum nonparametric likelihood ratio estimation and test with auxiliary information[J] (in Chinese). Journal of Sichuan Normal University (Natural Science Edition),2016,39(01):59–64.
  5. 5. Brunsdon C.; Fotheringham A.; Charlton M. Geographically weighted regression: a method for exploring spatial nonstationarity[J]. Geographical Analysis, 1996,28:281–298.
  6. 6. Fotheringham A.; Brunsdon C.; Charlton M. Geographically weighted regression: the analysis of spatially varying relationships[M]. Chichester: Wiley, 2002.
  7. 7. Cho S.; Lambert D.; Roberts R.; Kim S. Moderating urban sprawl: is there a balance between shared open space and housing parcel size?[J] Journal of Economic Geography,2010,10(5):763–783.
  8. 8. Basile R.; Durbán M.; Mínguez R.; Montero J.; Mur J. Modeling regional economic dynamics: Spatial dependence, spatial heterogeneity and nonlinearities[J]. Journal of Economic Dynamics and Control,2014, 48: 229–245.
  9. 9. Geniaux G.; Martinetti D. A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models[J]. Regional Science and Urban Economics,2018,72:74–85.
  10. 10. Gelfand A.; Kim H.; Sirmans C. and Banerjee S. Spatial Modeling With Spatially Varying Coefficient Processes[J]. Journal of the American Statistical Association,2003, 98:462, 387–396.
  11. 11. Bakar K.; Kokic P.; Jin H. Hierarchical spatially varying coefficient and temporal dynamic process models using spTDyn[J]. Journal of Statistical Computation & Simulation, 2016, 86(4–6):820–840.
  12. 12. Kroc M. Statistical Analysis of Moose Habitat Behaviors Using Bayesian Hierarchical Model with Spatially Varying Coefficients[J]. 2017, University of Minnesota Digital Conservancy, https://hdl.handle.net/11299/189080.
  13. 13. Su L.J. and Jin S.N. Profile Quasi-Maximum Likelihood Estimation of Partially Linear Spatial Autoregressive Models[J]. Journal of Econometrics,2010,157:18–33.
  14. 14. Li K.; Chen J. Cross-sectional maximum likelihood estimation of a semiparametric varying coefficient spatial lag model[J] (in Chinese). Quantitative Economic and Technical Economics Research, 2013, 030(004):85–98.
  15. 15. Fang L.; Qian Q. Bayesian estimation of nonparametric spatial lag models[J] (in Chinese). Quantitative Economic and Technical Economics Research, 2013, 030(004):72–84.
  16. 16. Chen H.; Ye A.; Ye J. Two-stage least squares estimation of semiparametric panel spatial lag models and applications[J] (in Chinese). Practice and understanding of mathematics, 2015, 45(24):222–228.
  17. 17. Chen H.; Ye A.; Zhang Z. General moment estimation for semiparametric panel spatial lag models[J] (in Chinese). Systems Engineering, 2016(10):158–162.
  18. 18. Krisztin T. The Determinants of Regional Freight Transport: A Spatial, Semiparametric Approach[J]. Geographical Analysis,2017,49(3).
  19. 19. Li K.; Chen J. GMM estimation of nonparametric spatial lag model[J] (in Chinese). Applied Mathematics A Journal of Chinese Universities (Ser.A), 2018,33(02):140–156.
  20. 20. Zhao J.; Pu Y. Estimation of spatial lag single indicator varying coefficient model and its application[J] (in Chinese). Quantitative Economic and Technical Economics Research,2021,38(11):163–181.
  21. 21. Grzegorczyk M. An introduction to Gaussian Bayesian networks[J]. Methods Mol Biol. 2010,662:121–147. pmid:20824469
  22. 22. Robert C.; George C. Monte Carlo Statistical Methods[M].Springer Science and Business Media,2013.
  23. 23. Rue H.; Martino S.; Chopin N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations[J]. Journal of the Royal Statistical Society Series B (Statistical Methodology), 2009, 71(2):319–392.
  24. 24. Bivand R.S.; Gómez-Rubio V.; Rue H. Approximate Bayesian inference for spatial econometrics models[J]. Spatial Statistics, 2014, 9:146–165.
  25. 25. Gómez-Rubio V.; Bivand R. S.; Rue H. Estimating spatial econometrics models with integrated nested Laplace approximation[J]. Mathematics, 2017,17: 2044.
  26. 26. Gómez-Rubio V.; Rue H. Markov Chain Monte Carlo with the Integrated Nested Laplace Approximation[J]. Statistics and Computing,2018,28(5).
  27. 27. Gómez-Rubio V.; Bivand R.S.; Rue H. Bayesian Model Averaging with the Integrated Nested Laplace Approximation[J]. Econometrics,2020,8(2).
  28. 28. Pan H.; Tang J. Research on the impact of population aging on the consumption rate of residents[J] (in Chinese). Jiangxi Social Science,2021,41(01):51–60.
  29. 29. Song J.; Gao C. Population Aging, Private Sector Market Entry and Economic Growth—A Threshold Effect Analysis Based on Provincial Panel Data[J] (in Chinese). Contemporary Economic Management:1–22.
  30. 30. Li J.; Gao Y. Technological innovation and high quality economic development in the perspective of population aging[J] (in Chinese). Population Studies,2022,46(02):102–116.
  31. 31. Su J. How Population Aging Affects Economic Growth-Based on the Analytical Perspective of Aggregate Supply and Aggregate Demand[J] (in Chinese). Journal of Beijing University of Technology and Business (Social Science Edition),2021,36(05):14–23.
  32. 32. Li L.; Qin Q. Population aging, social security spending and economic development[J] (in Chinese). Exploration of Economic Issues,2020(05):40–52.
  33. 33. Hoeting J.; David Madigan A.R.; Volinsky C. Bayesian model averaging: a tutorial[J]. Stat. Sci. 14, 382–401.
  34. 34. Wang X.; Yu Y.; Faraway J.J. Bayesian regression modeling with inla[M]. 2018.
  35. 35. Ding S.; Zhang H.; Hu X.; Li Z. Influence of macroscopic factors on HIV/AIDC epidemic based on R-INLA method[J] (in Chinese). China AIDS STD,2018,24(12):1192–1196.
  36. 36. Rue H.; Held L. Gaussian Markov Random Fields: Theory and Applications[M]. 2005.
  37. 37. Zhao W.; Zheng L.; Zhang H.; Hu X. Spatio-temporal analysis of macro factors against Hand, Foot and Mouth Disease: a Bayesian-INLA approach[J] (in Chinese). China Health Statistics,2020,37(01):6–9.
  38. 38. Eilers P.; Rijnmond D.M.; Marx B.D. Flexible Smoothing with B-Splines and Penalties[J]. Statistical Science, 1996, 11(2).
  39. 39. Lang S.; Brezger A. Bayesian P-Splines[J]. Journal of Computational & Graph-ical Statistics, 2004, 13(1):183–212.
  40. 40. Ye L. Analysis of the contribution of urbanization process to domestic economic development and the channels of its realization[J] (in Chinese). Modern Economic Information,2018(14):21.
  41. 41. Chen X.; Zheng Y.; Yao D. Population aging, industrial intelligence and high-quality economic development[J] (in Chinese). Statistics and Decision Making,2022,38(06):129–132.
  42. 42. Chen Q.; Xu D.; Zhou Y. Labor substitution effect of artificial intelligence in the context of population aging—analysis based on cross-country panel data and provincial panel data in China[J] (in Chinese). China Population Science,2018(06):30–42+126–127.