Using a spatial autoregressive model with spatial autoregressive disturbances to investigate origin-destination trip flows

Linglin Ni; Dapeng Zhang

doi:10.1371/journal.pone.0305932

Abstract

Spatial interaction models with spatial origin-destination (OD) filters are powerful tools to characterize trip flows in space, which is a classic and important problem in regional science. To the authors’ knowledge, existing studies adopting OD filters mostly specify the spatial dependence as an autoregressive process, which may not be the full picture of spatial effects. To examine the problem, this paper proposes the hypotheses that 1) spatial OD dependences can take place in both the spatial autoregressive term and the spatial error term in a spatial interaction model; 2) estimating a spatial autoregressive model with spatial autoregressive disturbances (SARAR) with OD filters would disentangle where the spatial dependence exists and by how much; and 3) the marginal effects obtained from SARAR models would be preferred by analysts when SARAR models outperform spatial autoregressive (SAR) models and spatial error models (SEM) from the statistical point of view. To assess these hypotheses, this paper specifies, estimates, and applies SARAR models with OD filters to investigate trip distributions. By comparing against alternative models, this paper investigates the estimation results of SAR, SEM, and SARAR models using empirical data collected from Hangzhou, China. The contribution of this paper is to be the first to develop a SARAR model with OD filters for trip distribution analyses and to examine its performance.

Citation: Ni L, Zhang D (2024) Using a spatial autoregressive model with spatial autoregressive disturbances to investigate origin-destination trip flows. PLoS ONE 19(6): e0305932. https://doi.org/10.1371/journal.pone.0305932

Editor: Matteo Lippi Bruni, University of Bologna, ITALY

Received: July 9, 2023; Accepted: June 4, 2024; Published: June 26, 2024

Copyright: © 2024 Ni, Zhang. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information file.

Funding: This paper is financially supported by the National Natural Science Foundation of China (72101255, 71801188). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Spatial interaction models with spatial filters on both the origin and the destination (OD) have become a popular way to study OD trip flows, because spatial dependence at both the starting points and the ending points of flows is believed to affect flow patterns. The spatial OD filters can characterize the influence of spatially proximate zones on a traffic analysis zone that serves as both an origin and a destination. In a spatial interaction model, OD filters can be specified as spatial autoregressive terms in the dependent variable (SAR) or in the error term (SEM). Specifically, the SAR specification can have three spatially lagged terms to capture spatial dependences at trips’ origins, trips’ destinations, and the interactions of trips’ origins and destinations [1]. In the SEM specification, the spatial effect can be captured by three similar spatially lagged terms, but they are specified in the error terms of the econometric equation.

Selection of the best spatial econometric model specification usually starts with investigating one SAR model and one SEM [1]. According to the literature [2], both SAR and SEM specifications can capture spatial dependences effectively. However, the underlying philosophies differ: SAR models assume that spatial dependence occurs within the dependent variable, without directly specifying spatial effects in the attributes or the unexplained portion of the dependent variable. SEM specifications, on the other hand, assume that spatial dependence takes place in the unexplained part of the dependent variable, rather than in the dependent variable itself or the attributes. Both philosophies can partly explain the spatial effects in traffic flow. Specifying spatial dependences in the dependent variable, i.e., SAR specifications, implies that traffic flows behave similarly when their origins and/or destinations are close to each other. Specifying spatial dependences in the error term, i.e., SEM specifications, implies that a shock to one traffic flow affects not only that flow but also the error terms of neighboring flows, and thereby propagates indirectly to other flows.

Not only do the underlying philosophies differ between SAR and SEM specifications, but the estimated marginal effects differ as well. When the absolute values of spatial coefficients are close to 1, the difference in marginal effects of explanatory variables becomes pronounced. That is because derivatives of the traffic flow with respect to explanatory variables involve the spatial lag terms in the SAR specification, which creates a spillover effect across traffic flows. In contrast, marginal effects in the SEM are simply the coefficients of explanatory variables. The difference in marginal effects would drive different policy implications in transportation planning and management. For example, the accessibility of transportation services usually has a positive effect on the traffic flow. Public agencies can adjust accessibility policies to intervene in the traffic flow of certain areas. The amount or intensity of policy intervention relies on the degree to which marginal effects correspond to varied social and economic investments.

Given the differences in model specifications, philosophies, and interpretations, the following hypothesis is proposed: spatial dependences in the OD flow can take place in both the spatial autoregressive term and the spatial error term, depending on the real-world scenario. The hypothesis can be tested by specifying and estimating a spatial autoregressive model with spatial autoregressive disturbances (SARAR) with OD filters. The SARAR with OD filters, which has been theoretically specified [1] but not estimated or applied in empirical studies, can estimate the spatial dependences in the dependent variable and the error term simultaneously. The estimated spatial coefficients can reveal where the spatial dependence exists and by how much. While the marginal effects derived from the SARAR specification are theoretically similar to those of the SAR model, analysts may prefer the SARAR approach when it demonstrates superior statistical performance over the SAR and SEM specifications.

Hence, this paper estimates a series of spatial econometric models with OD filters, including the SAR, the SEM, and the SARAR specifications, to disentangle spatial dependences in the OD travel flow. The detailed estimation methods are proposed and applied to an empirical trip distribution dataset from Hangzhou, China, which contains the OD travel flow matrices, socio-economic factors, and transportation infrastructure measures. Moreover, OD flow matrices at four separate times have been collected: Thursday morning, Sunday morning, Thursday afternoon, and Sunday afternoon. This paper compares the performance of all specifications with OD filters to gain additional insights into marginal effects at separate times.

This paper aims to enrich the existing literature by estimating and applying the SARAR specification with OD filters. Meanwhile, this paper is the pioneer study to examine the philosophies of the SAR, the SEM, and the SARAR specifications and how the difference in spatial dependences drives the variation in marginal effects of explanatory variables. Finally, this paper deepens the understanding of spatial dependences in trip distribution at different times of day and days of week.

The next section will review the literature of the SARAR specification in OD flow analyses. Then, the model specification and estimation methodology are introduced in Section 3. Section 4 conducts an empirical analysis using the SARAR specification, compares its performance with alternative models, and discusses policy implications. Section 5 concludes the paper with discussions and the next steps.

2. Literature review

Spatial econometric models with OD filtering, proposed in LeSage and Pace [1], have become a popular method in regional science, as they can capture spatial dependences at both origins and destinations. By calculating explanatory variables’ marginal effects, analysts can reveal the origin effects, destination effects, and total effects that a change in an explanatory variable of a certain region has on the flow pattern through the spatial effects of origins and destinations [3]. Several empirical studies have been conducted borrowing the OD filtering specifications, but all of them, to the best of the authors’ knowledge, adopted SAR specifications, except that LeSage and Fischer explained OD filtering on explanatory variables [1]. For example, Margaretic et al. [4] analyzed the spatial dependence in air passenger flows with SAR specifications. LeSage and Llano [5] used the Bayesian Markov Chain Monte Carlo method to analyze commodity flows of 18 Spanish regions with the SAR specification. Ni et al. [6] used the SAR specification to investigate less-than-truckload flow across all provinces in China. Ni et al. [7] also used the SAR specification to analyze the weekly average travel flow among the 51 zones in Hangzhou, China. Note that the last-mentioned study used weekly data. This paper uses recently obtained daily travel data, which means it can explore the difference in trip distribution across weekdays and weekends, as well as mornings and afternoons. The main reason for the existing literature to prefer SAR specifications may be that spillover effects in SAR can be incorporated in the interpretation of marginal effects, which enriches the characterization of explanatory variables’ effects on travel flow.

Nonetheless, the SAR specification can only capture part of the spatial effect, and a natural extension of SAR is to add spatial terms in the disturbance terms, resulting in the so-called SARAR models. SARAR can address the two most important spatial effects simultaneously: spatial dependence and spatial heterogeneity, the latter specifically referring to heteroskedasticity in the context of spatial dependence, as elucidated by Anselin [8]. Empirical studies have usually found that SARAR performs best among SAR, SEM, and SARAR. For example, Li et al. [9] adopted a 3-D SARAR hedonic model to unveil the impact of a polluted river on the high-rise apartment market of Guangzhou. They claimed that the spatial dependence in the error term after fitting a SAR cannot be ignored. In the COVID prevalence analysis conducted by Sun et al. [10], SARAR outperformed the linear model, the SAR model, and the SEM in explaining the spread over U.S. counties. Hence, in OD filtering analysis, analysts should leverage SARAR models so that spatial effects can be examined in a more comprehensive way, leading to a more trustworthy interpretation of marginal effects with respect to spatial processes, which is one of the key objectives of this paper.

Trip distribution is one of the most important and conventional problems in transportation research [11]. It explains where a trip starts and ends. Understanding trip distribution is a prerequisite for solving various transportation problems. OD matrix balancing techniques, gravity models, and spatial interaction models are usually used in practice to calibrate trip distribution. The challenge of trip distribution analysis before the big data era was data collection. Analysts had to rely on largely manual traffic counts or travel surveys to extract a sample OD matrix, and the accuracy was often in doubt [12]. With the development of intelligent transportation technologies, analysts can access real-time trip distribution data covering the full population. The research focus has therefore extended from OD matrix calibration to trip distribution explanation, i.e., establishing the relationship between explanatory variables and trip distribution. With these relationships, urban administration officers can intervene in traffic flow by adjusting explanatory variables to address traffic problems. However, trip distributions are more sophisticated than what explanatory variables can explain. Spatial dependence, spatial heterogeneity, and other error reduction schemes are needed to improve the performance of trip distribution models, and such studies are far from complete. Hence, this paper, in the context of leveraging big data to model trip distributions, bridges the gap between the needs of trip distribution modeling and the current state of the art in spatial interaction models by fitting OD filtering SARAR models to the trip distribution data collected in Hangzhou, China.

3. Methodology

SARAR models with OD filters can jointly examine spatial dependences in the dependent variable and the error term. One of the objectives of this paper is to specify, estimate, and apply SARAR models. It is worth noting that spatial filters can also be specified on the explanatory variable terms, which yields the so-called SLX (spatially lagged X) model [13]. The SLX specification can be combined with SAR and SEM specifications to establish a series of alternative specifications. Based on the relevant literature, the SLX specification may present advantages in model selection from the mathematical point of view. However, in the trip distribution context, the main focus is often on the spatial dependence of the dependent variable and the error structure. Specifically, OD flow itself is an important variable in transportation analyses, and exploring its spatial dependence is of the greatest interest. In addition, analysts often observe a high variation in the OD matrix, meaning that looking into the disturbance term can help improve the overall model performance. This paper investigates SAR, SEM, and SARAR specifications, which is also consistent with Anselin’s view that spatial dependence and spatial heterogeneity are the two most important specifications in spatial data analysis [2].

Assume y denotes the traffic flow between a set of origins k(1⋯K) and a set of destinations l(1⋯L). Let N = K×L, the size of y is (N×1).

(1)

The SARAR model specification can take the form

y = ρ_oW_oy + ρ_dW_dy + ρ_wW_wy + X_oβ_o + X_dβ_d + X_wβ_w + β_cι_N + ε,
ε = λ_oW_oε + λ_dW_dε + λ_wW_wε + μ (2)

where ι_N is an (N×1) vector of ones and W_o = W⨂I_n is the spatial matrix capturing the spatial lag arising from the origin of traffic flows. Here W is the (n×n) spatial weight matrix over the n zones (so that N = n² when K = L = n), the operator ⨂ denotes the Kronecker product, and I_n denotes the identity matrix of size (n×n), which makes W_o an (N×N) matrix. The same idea applies to W_d = I_n⨂W, which captures the spatial dependence at the destination of traffic flows. The spatial matrix W_w = W⨂W is used to indicate the interaction of the origin and destination dependence. The spatial matrices W_o, W_d, and W_w are employed to define spatial proximity in autoregressive terms. The parameters ρ{ρ_o,ρ_d,ρ_w} capture the spatial dependencies at origin, destination, and their interaction on the dependent variable, while λ{λ_o,λ_d,λ_w} measure the intensity of spatial dependence in the disturbance term.
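The construction of the three OD weight matrices can be illustrated with a minimal NumPy sketch. The three-zone layout, the row normalization, and the origin-major stacking of flows are assumptions made for illustration only; the identity factor is taken to be n×n so that each Kronecker product is N×N.

```python
import numpy as np

# Hypothetical example: n = 3 zones on a line, so N = n * n = 9 OD pairs.
n = 3
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)

# Row-normalize so that each row of W sums to 1 (an assumed convention).
W = adjacency / adjacency.sum(axis=1, keepdims=True)
I_n = np.eye(n)

# Flows are assumed to be stacked origin-major: index = origin * n + destination.
W_o = np.kron(W, I_n)   # neighbors of the origin, same destination
W_d = np.kron(I_n, W)   # same origin, neighbors of the destination
W_w = np.kron(W, W)     # neighbors of both the origin and the destination

print(W_o.shape, W_d.shape, W_w.shape)
```

Under this stacking, W_o y averages the flows that leave the neighbors of an origin toward the same destination, which is the origin-based dependence described above.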

As pointed out by LeSage and Pace [1], the constraints ρ_w = −ρ_oρ_d and λ_w = −λ_oλ_d of the model can be regarded as reflecting the successive filter (I_N−ρ_oW_o)(I_N−ρ_dW_d) = I_N−ρ_oW_o−ρ_dW_d+ρ_oρ_dW_w, and analogously for λ.

This paper consistently applies these constraints, so that ρ_w and λ_w are determined by the origin and destination coefficients rather than estimated freely, which makes the estimation more tractable.
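The effect of the constraint can be checked numerically. The sketch below, which assumes a hypothetical ring of four zones and illustrative coefficient values, verifies that applying an origin filter and a destination filter in succession gives the same matrix as the three-term form with ρ_w = −ρ_oρ_d.

```python
import numpy as np

# Hypothetical ring of 4 zones: each zone neighbors the two adjacent zones.
n = 4
adjacency = np.array([[0, 1, 0, 1],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [1, 0, 1, 0]], dtype=float)
W = adjacency / adjacency.sum(axis=1, keepdims=True)
I_n, I_N = np.eye(n), np.eye(n * n)
W_o, W_d, W_w = np.kron(W, I_n), np.kron(I_n, W), np.kron(W, W)

rho_o, rho_d = 0.4, 0.3     # illustrative values, not estimates from the paper
rho_w = -rho_o * rho_d      # the constraint

# Successive filtering: origin filter followed by destination filter.
successive = (I_N - rho_o * W_o) @ (I_N - rho_d * W_d)
# Three-term form used in the model.
A = I_N - rho_o * W_o - rho_d * W_d - rho_w * W_w

filters_match = np.allclose(successive, A)
print(filters_match)
```

The equality relies on W_oW_d = W_w, which follows from the mixed-product property of the Kronecker product.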

The explanatory variables X{X_o,X_d,X_w} capture the characteristics of origins, of destinations, and of origin-destination pairs, such as the distance between origins and destinations. The corresponding coefficients, together with the coefficient of the constant, are β{β_o,β_d,β_w,β_c}, which can be used for deriving the marginal effects of explanatory variables.

The disturbance term ε includes three spatially lagged terms. The term μ is the idiosyncratic term with mean 0 and variance σ².

When ρ{ρ_o,ρ_d,ρ_w} are not zero while λ{λ_o,λ_d,λ_w} are zeros, the SARAR model specification is reduced to a SAR model, which takes the form

y = ρ_oW_oy + ρ_dW_dy + ρ_wW_wy + Xβ + μ (3)

where Xβ collects the constant and the X_o, X_d, and X_w terms with their coefficients.

When ρ{ρ_o,ρ_d,ρ_w} are zeros while λ{λ_o,λ_d,λ_w} are not zero, the SARAR model specification becomes a SEM, which takes the form

y = Xβ + ε, ε = λ_oW_oε + λ_dW_dε + λ_wW_wε + μ (4)

While these two reduced models can both capture the spatial effects rooted in the trip distribution data, the estimated marginal effects may vary dramatically. Let A = (I_N−ρ_oW_o−ρ_dW_d−ρ_wW_w); the SAR model can then be written as

y = A⁻¹(Xβ + μ) (5)

In other words, the derivative of the traffic flow with respect to explanatory variables involves a spatial filtering process, which captures the spatial spillover effects. For example, X_o’s marginal effect is

∂y/∂X_o = A⁻¹β_o (6)

Let B = (I_N−λ_oW_o−λ_dW_d−λ_wW_w); the SEM can then be written as

y = Xβ + B⁻¹μ (7)

where β{β_o,β_d,β_w} are the marginal effects, which do not involve any spatial process. For example, X_o’s marginal effect is

∂y/∂X_o = β_o (8)

When the two specifications yield similar estimated values of ρ{ρ_o,ρ_d,ρ_w} and λ{λ_o,λ_d,λ_w}, the intensity of spatial dependence in both specifications is similar, but the corresponding marginal effects may be vastly different. To obtain more reliable marginal effects in empirical studies, one way is to estimate all spatial coefficients in the SARAR model, which can locate spatial effects and derive the values of marginal effects.
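The contrast between the two marginal effects can be made concrete with a small numerical sketch. It assumes a hypothetical ring of five zones with row-normalized weights and illustrative values of ρ_o, ρ_d, and β_o; none of these are estimates from this paper. With row-normalized weights and ρ_w = −ρ_oρ_d, each row of A⁻¹ sums to 1/((1−ρ_o)(1−ρ_d)), so the average total effect in the SAR form is the coefficient scaled by that multiplier, whereas in the SEM form it is the coefficient itself.

```python
import numpy as np

def od_weights(n):
    """Row-normalized ring contiguity (hypothetical) and its OD Kronecker forms."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    I_n = np.eye(n)
    return np.kron(W, I_n), np.kron(I_n, W), np.kron(W, W)

n = 5
N = n * n
W_o, W_d, W_w = od_weights(n)

rho_o, rho_d = 0.4, 0.3          # illustrative spatial coefficients
rho_w = -rho_o * rho_d
beta_o = 0.5                     # illustrative coefficient of an origin variable

A = np.eye(N) - rho_o * W_o - rho_d * W_d - rho_w * W_w

sar_effects = np.linalg.inv(A) * beta_o   # N x N matrix of effects with spillovers
sem_effects = np.eye(N) * beta_o          # no spillovers through the error term

sar_total = sar_effects.sum(axis=1).mean()   # average row sum
sem_total = sem_effects.sum(axis=1).mean()
multiplier = 1.0 / ((1.0 - rho_o) * (1.0 - rho_d))

print(sar_total, sem_total, multiplier)
```

The multiplier grows quickly as the spatial coefficients approach 1, which is why the gap between SAR-type and SEM-type marginal effects widens under strong spatial dependence.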

This paper adopts a maximum likelihood estimation method to derive the coefficients in the SARAR model. The log-likelihood function takes the form (9)
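As an illustration of what such a likelihood involves, the sketch below evaluates a SARAR log-likelihood under the assumption that μ is normally distributed: ln L = −(N/2)ln(2πσ²) + ln|A| + ln|B| − μ′μ/(2σ²), with μ = B(Ay − Xβ). The function name, the toy data, and the ring-shaped weight matrix are hypothetical, and the paper's own implementation is in MATLAB and may use a different (for example, concentrated) form.

```python
import numpy as np

def sarar_loglik(y, X, beta, rho, lam, sigma2, W_o, W_d, W_w):
    """Gaussian SARAR log-likelihood with OD filters (illustrative sketch).

    rho and lam are (origin, destination, interaction) triples.
    """
    N = y.shape[0]
    I_N = np.eye(N)
    A = I_N - rho[0] * W_o - rho[1] * W_d - rho[2] * W_w
    B = I_N - lam[0] * W_o - lam[1] * W_d - lam[2] * W_w
    mu = B @ (A @ y - X @ beta)             # implied idiosyncratic residuals
    _, logdet_A = np.linalg.slogdet(A)      # log|det A|, numerically stable
    _, logdet_B = np.linalg.slogdet(B)
    return (-0.5 * N * np.log(2.0 * np.pi * sigma2)
            + logdet_A + logdet_B
            - (mu @ mu) / (2.0 * sigma2))

# Hypothetical toy data: 4 zones on a ring, so N = 16 OD pairs.
n = 4
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5
W_o, W_d, W_w = np.kron(W, np.eye(n)), np.kron(np.eye(n), W), np.kron(W, W)

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(n * n), rng.normal(size=n * n)])
beta = np.array([1.0, 0.5])
y = X @ beta + rng.normal(size=n * n)

ll_nonspatial = sarar_loglik(y, X, beta, (0, 0, 0), (0, 0, 0), 1.0, W_o, W_d, W_w)
ll_spatial = sarar_loglik(y, X, beta, (0.4, 0.3, -0.12), (0.2, 0.1, -0.02),
                          1.0, W_o, W_d, W_w)
print(ll_nonspatial, ll_spatial)
```

With all spatial coefficients set to zero, the function reduces to the log-likelihood of an ordinary linear regression, which gives a simple sanity check. In an estimation routine this function would be maximized over β, ρ_o, ρ_d, λ_o, λ_d, and σ², with ρ_w and λ_w fixed by the constraints.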

In this paper, the maximum likelihood estimation is coded and processed in MATLAB, and two sets of estimation experiments have been conducted to show that the code can recover the values of coefficients. Each experiment assumes 50 traffic analysis zones, implying 50×50 = 2,500 observations in the simulated dataset. The primary difference between the two experiments lies in the true values of parameters, which allows the model’s performance to be assessed under varying conditions. Table 1 reports the true values, estimated values, t-statistics, and root mean squared errors (RMSE) of both experiments. Results indicate that the MATLAB code achieves a good estimation of all coefficients in the SARAR model. In addition, the experiments demonstrate that the code achieves accurate estimation when the spatial coefficients of origin and destination have different signs, and when the spatial coefficients of the dependent variable and the error terms have different signs.
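A data-generating process of this kind can be sketched in Python as follows. This is not the MATLAB code used in this paper: the function name, the ring-shaped weight matrix, the regressors, and all parameter values are hypothetical, chosen only to show how y is obtained from μ through the two filters. The example uses 10 zones to stay light; setting n=50 reproduces the 2,500-observation size.

```python
import numpy as np

def simulate_sarar(n, beta, rho_o, rho_d, lam_o, lam_d, sigma=1.0, seed=0):
    """Simulate OD flows from a SARAR process with OD filters (sketch)."""
    rng = np.random.default_rng(seed)
    N = n * n
    idx = np.arange(n)
    W = np.zeros((n, n))                 # hypothetical ring contiguity, n >= 3
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    I_n, I_N = np.eye(n), np.eye(N)
    W_o, W_d, W_w = np.kron(W, I_n), np.kron(I_n, W), np.kron(W, W)

    # Interaction coefficients follow the constraints rho_w = -rho_o * rho_d
    # and lam_w = -lam_o * lam_d.
    A = I_N - rho_o * W_o - rho_d * W_d + rho_o * rho_d * W_w
    B = I_N - lam_o * W_o - lam_d * W_d + lam_o * lam_d * W_w

    X = np.column_stack([np.ones(N), rng.normal(size=(N, len(beta) - 1))])
    mu = rng.normal(scale=sigma, size=N)
    eps = np.linalg.solve(B, mu)               # spatially dependent disturbance
    y = np.linalg.solve(A, X @ beta + eps)     # spatially filtered outcome
    return y, X, mu, A, B

true_beta = np.array([1.0, 0.5, -0.3])
# Origin and destination coefficients of opposite sign in the lag term.
y, X, mu, A, B = simulate_sarar(n=10, beta=true_beta,
                                rho_o=0.4, rho_d=-0.3, lam_o=0.3, lam_d=0.2)
print(y.shape)
```

Repeating the simulation over many seeds and re-estimating each time would give the kind of RMSE summary reported for the experiments.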

Table 1. Estimation results based on simulation data.

https://doi.org/10.1371/journal.pone.0305932.t001

The collection process and the analysis method complied with the terms and conditions for the source of the data.

4. Empirical study

Data description

This paper uses a trip distribution dataset collected in Hangzhou, China, which comprises the traffic flow between 51 traffic analysis zones in the urban area and a set of potential explanatory variables including population, key facilities, accessibility, and travel time (S1 Dataset). Specifically, this trip distribution dataset collects information from three sources:

Cellular signaling data. The cellular signaling data are provided by the China Mobile Communication Corporation, which is one of the three main cellular service providers in China and currently serves 69.56% of residents in Hangzhou. Though the data do not cover the entire population, they can still reasonably represent the trip distribution pattern of the analyzed region. The data contain more than 30 million records collected from 56 thousand base stations. Based on the times at which a user sends or receives a cellular signal, travelers’ physical movements are derived at the zonal level. Aggregating all users’ movement information yields the final trip distribution data. Details of deriving trip distribution data from raw data can be found in Ni et al. [6].
Transportation data. The transportation data are provided by local agencies of transportation construction and management. The data contain population, commercial area, and medical area for each traffic analysis zone. This paper uses the transportation data to derive a number of explanatory variables to explain trip distribution.
Online map data. The online map data are provided by Baidu, one of China’s largest technology companies. Its map service is similar to Google Maps but focuses on Chinese cities. The online map offers commercial buildings’ locations, road lane length, and travel time information.

Note that the trip distribution data derived from the cellular signaling data are collected at different times: a Thursday morning (7am-10am with a total of 1,461,856 trips), a Thursday afternoon (4pm-7pm with a total of 1,436,328 trips), a Sunday morning (1,327,469 trips), and a Sunday afternoon (1,384,214 trips). This paper investigates trip distribution at all these times and compares the results. Table 2 describes all variables used in the SARAR model.

Table 2. Descriptive statistics of variables.

https://doi.org/10.1371/journal.pone.0305932.t002

5. Results analysis

The SARAR model is used to investigate the trip distribution at different times. SAR and SEM models are also estimated for comparison purposes. The estimation results of the trip distribution on Thursday morning are reported in Table 3.

Table 3. Estimation results of Thursday morning.

https://doi.org/10.1371/journal.pone.0305932.t003

Spatial coefficients

The spatial coefficients in SAR, ρ{ρ_o,ρ_d,ρ_w}, are 0.86, 0.88, and -0.75, respectively. They capture spatial autocorrelations at origin and destination, as well as the effect of their interaction term. The values of spatial coefficients are close to 1, indicating strong spatial dependences in the trip distribution data, and capturing spatial effects is necessary in establishing trip distribution models.

The spatial coefficients in SEM, λ{λ_o,λ_d,λ_w}, are 0.87, 0.88, and -0.76, respectively. From the model fitting perspective, both SAR and SEM have achieved satisfactory performance. The only difference is that the spatial dependence is assumed to reside in different mathematical terms: one assumes that the spatial dependence takes place in the traffic flow, and the other assumes that it takes place in the disturbance term.

Given that the spatial coefficients of both the spatial autoregressive and spatial error terms are statistically significant, estimating the SARAR model is the natural next step. Being able to specify spatial dependences in the traffic flow and its disturbance term simultaneously, SARAR can disentangle and distribute spatial dependences across SAR and SEM. The log-likelihood value of SARAR is greater than those of SAR and SEM. In addition, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) show that SARAR outperforms the alternative models. This is expected because all spatial coefficients are statistically significant in SARAR, adding explanatory power to the model. The values of spatial coefficients in SARAR are smaller than those in SAR and SEM. This is an interesting and important observation because:

Looking at the spatial coefficients in the autoregressive term alone, their magnitude in the SARAR model is smaller than in the SAR model. Such a difference indicates that SAR tends to over-estimate the spatial effects, leading to higher marginal effects given that both origin and destination spatial coefficients are positive.
Similarly, the spatial coefficients in the error term alone are smaller in the SARAR model than in the SEM. Although the marginal effects do not involve the spatial error term, the higher spatial coefficient values in SEM would distort the estimated effect of explanatory variables.
That both ρ{ρ_o,ρ_d,ρ_w} and λ{λ_o,λ_d,λ_w} are smaller in SARAR implies that the spatial effects captured by SAR and SEM are distributed across the spatial autoregressive term and the spatial error term.
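The information criteria used in this comparison follow the standard definitions AIC = 2k − 2 ln L and BIC = k ln N − 2 ln L, where k is the number of estimated parameters. The sketch below applies them to hypothetical log-likelihood values and parameter counts, which are placeholders and not the values in Table 3; the sample size of 51×51 flows assumes that all OD pairs, including intrazonal ones, enter the estimation.

```python
import math

def aic(loglik, k):
    """Akaike Information Criterion: lower is better."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n_obs):
    """Bayesian Information Criterion: lower is better."""
    return k * math.log(n_obs) - 2 * loglik

# Hypothetical (log-likelihood, number of parameters) pairs, NOT the paper's values.
candidates = {"SAR": (-1250.0, 9), "SEM": (-1248.0, 9), "SARAR": (-1230.0, 11)}
n_obs = 51 * 51   # assumed number of OD pairs

scores = {name: (aic(ll, k), bic(ll, k, n_obs))
          for name, (ll, k) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(scores, best_by_aic, best_by_bic)
```

Because SARAR carries two more free spatial coefficients than SAR or SEM, its gain in log-likelihood has to exceed the penalty on those extra parameters for it to be preferred, which BIC enforces more strictly than AIC at this sample size.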

Marginal effects

Table 3 reports that most explanatory variables are statistically significant, though the significance of some of them varies across models. In general, the SARAR model reports lower statistical significance, because the effect of explanatory variables is diluted by the presence of more spatial coefficients. In other words, the true effect of explanatory variables should be understood together with the spatial effects. Specifically, a change in an explanatory variable would have an impact on flows out of the region (origin effect) and into the region (destination effect). When spatial spillover effects are added to the spatial interaction model, the marginal effects become even more complicated: both the origin effect and the destination effect will affect neighboring regions. To obtain the marginal effects, the following calculation has been conducted.

For models with spatial autoregressive terms, i.e., SARAR and SAR, the total marginal effects can be written in a matrix TE, which takes the form (10) where Jd₁ and Jo₁ are n×n matrices with the entries of the affected regions equal to 1. Multiplying them by the corresponding destination and origin coefficients shows the total effects of the r-th explanatory variable on the travel flow. Then, a scalar summary measure of the total effect of changes in the r-th explanatory variable can take the form (11)

Additionally, the derivation of the origin effect (OE), destination effect (DE), intraregional effect (IE) and network effect follows a similar rationale. For detailed explanation, please refer to LeSage and Thomas-Agnan [3].

This paper has calculated the marginal effects (total effect, origin effect, destination effect, intraregional effect, and network effect) for the trip distribution on Thursday morning, Sunday morning, Thursday afternoon, and Sunday afternoon. The estimation results are reported in Table 4. The interpretation becomes straightforward. For example, the results of population on Thursday morning by the SARAR model imply that for every 1% increase in population, the total number of trips would increase by 1.77%. This 1.77% is constituted by the origin effect (0.73%), destination effect (0.94%), intraregional effect (0.04%), and network effect (0.07%). In other words, the decomposition of marginal effects reveals how the effect of explanatory variables takes place through the spatial processes.

Table 4. Estimated spatial coefficients and marginal effects for all models.

https://doi.org/10.1371/journal.pone.0305932.t004

Comparing the marginal effects among SAR, SEM, and SARAR models, SAR models turn out to have the largest values. The dominant reason is that the spatial coefficients are close to 1, so the spatial filtering enlarges the effect of explanatory variables. On the other hand, SEM reports the smallest marginal effects. The values of SARAR models are in between, which directly supports the paper’s hypothesis that SARAR specifications can balance the marginal effects obtained from SAR and SEM because the spatial effects are distributed between the dependent variable and the error term.

Effects of day-in-week and time-of-day

Comparing Thursday morning with Sunday morning, the marginal effects of commerce and road lane length are vastly different. This comparison already accounts for the difference in the total number of trips: the total on Thursday morning is about 10% higher than on Sunday morning. The total effect of commerce on Thursday is 0.98, indicating that for every one-unit increase in the number of commercial facilities, the total number of trips would increase by 0.98%. The total marginal effect of commerce on Thursday is about 35% higher than on Sunday, indicating that commercial facilities produce 35% more trips on weekdays than on weekends. Regarding road lane length, the difference is even greater. The total effect on Thursday morning is about 78% greater than on the weekend. This finding is quite intuitive, as traffic congestion is often much worse on weekdays and willingness to travel is sensitive to additional road capacity.
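The roughly 10% gap in total trips can be reproduced from the trip counts given in the data description. The short sketch below does this arithmetic; the helper name is hypothetical.

```python
# Trip totals as reported in the data description.
trips = {
    ("Thursday", "morning"): 1_461_856,
    ("Thursday", "afternoon"): 1_436_328,
    ("Sunday", "morning"): 1_327_469,
    ("Sunday", "afternoon"): 1_384_214,
}

def pct_higher(a, b):
    """Percentage by which a exceeds b."""
    return 100.0 * (a - b) / b

morning_gap = pct_higher(trips[("Thursday", "morning")], trips[("Sunday", "morning")])
afternoon_gap = pct_higher(trips[("Thursday", "afternoon")], trips[("Sunday", "afternoon")])
print(round(morning_gap, 1), round(afternoon_gap, 1))
```

The morning gap is just over 10%, while the afternoon gap is noticeably smaller, so differences in marginal effects between Thursday and Sunday are not simply a restatement of the difference in trip totals.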

A similar Thursday-Sunday difference is found during afternoon hours: the total effect of commerce is 24% higher and that of road lane length is 50% higher on Thursday. The difference in marginal effects implies that trip distribution modeling should be conducted separately for different days of the week. This paper provides empirical estimates of these marginal effects using the SARAR model.

Comparing the afternoon results against the morning results, all differences are less than 10% and most are less than 5%. This finding shows that the relationship between explanatory variables and trip distribution is consistent over morning hours and afternoon hours. If transport modeling resources are limited, analysts need not run spatial interaction models separately for mornings and afternoons.

Implications for Hangzhou 2035

Given that the SARAR model has been specified and estimated, the next step is to investigate how it may be applied to analyze empirical trip distribution data. Based on the estimation results for Thursday morning and a special development plan proposed by the Hangzhou municipal government, a prediction of trip distribution with respect to longer road lanes can be obtained. In other words, this empirical analysis serves as an example for practitioners of using the SARAR model to make reasonable trip distribution predictions. Generally speaking, the application of SARAR models is no different from traditional trip distribution analyses in terms of determining zones, selecting explanatory variables, and making predictions. The only difference is in specifying the trip distribution models and deriving the marginal effects.

In September 2020, the municipal government of Hangzhou announced the Special Plan for Comprehensive Transportation of Hangzhou (2021–2035). One of the key objectives is to promote a green and intelligent transportation system. The whole city is divided into three types of zones: the demonstration development zones, the priority development zones, and the guidance development zones. The demonstration zones and the priority zones aim to have an increased road lane density of more than 10 km/km² and 8 km/km², respectively. Traffic analysis zones 26, 38, 43, 45, 48, 49, and 50 represent development zones; the corresponding increases in road lane length are shown in Table 5.

Table 5. Current and projected road lane length.

https://doi.org/10.1371/journal.pone.0305932.t005

As the length of road lanes is an important influential factor of trip distribution, holding other explanatory variables constant, the projected increase in length will lead to higher traffic volume from and to these zones. Under the SARAR specification, the mechanism of traffic volume increase consists of origin effects, destination effects, intraregional effects, and network effects. Table 6 reports the scalar summary of these effects for Thursday morning’s travel patterns. Traffic volume increases at the other times can be derived following the same procedure.

Table 6. Projected average traffic volume increase due to longer road lanes at certain zones with the SARAR specification.

https://doi.org/10.1371/journal.pone.0305932.t006

Due to the increased lane length at zone 26, the traffic volume from and to zone 26 will increase. Moreover, zones around zone 26 will also observe an increase because of spatial dependences. In terms of the magnitude of effects, based on the calculation of the TE matrix, zone 26 itself increases by the greatest percentage, and the effects diminish as the distance to zone 26 increases.
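The decay of effects with distance can be illustrated on a stylized layout. The sketch below assumes six hypothetical zones on a line, illustrative spatial coefficients, and a unit change in an origin attribute of the zone at one end; it considers the origin side only and is not the TE calculation used for Table 6.

```python
import numpy as np

n = 6                                   # hypothetical: 6 zones on a line
N = n * n
adjacency = np.zeros((n, n))
for i in range(n - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
W = adjacency / adjacency.sum(axis=1, keepdims=True)
I_n = np.eye(n)
W_o, W_d, W_w = np.kron(W, I_n), np.kron(I_n, W), np.kron(W, W)

rho_o, rho_d = 0.4, 0.3                 # illustrative values only
A = np.eye(N) - rho_o * W_o - rho_d * W_d + rho_o * rho_d * W_w

beta_o = 0.5                            # illustrative origin coefficient
shocked_zone = 0
# A unit change in an origin attribute of the shocked zone enters every flow
# leaving that zone (flows stacked origin-major: index = origin * n + destination).
delta_x = np.kron(np.eye(n)[shocked_zone], np.ones(n))

effect = np.linalg.solve(A, beta_o * delta_x).reshape(n, n)  # rows: origin
effect_by_origin = effect.mean(axis=1)  # average over destinations
print(np.round(effect_by_origin, 4))
```

The averaged effect is largest for flows leaving the shocked zone and shrinks for each step away from it, mirroring the pattern described for zone 26.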

Note that, because of the differences in model specification, the SARAR, SAR and SEM models yield different projections of the traffic volume increase under the same assumed lane length increase. In this particular empirical study, SAR gives the greatest traffic volume increase, followed by SARAR, and SEM gives the lowest. In selecting the best projection, this paper recommends SARAR, as it has the best goodness-of-fit.
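One reason the three specifications give different projections can be seen in a small sketch. Under SEM the spatial dependence sits only in the disturbance, so a change in lane length moves expected flows by ΔXβ alone, whereas under SAR and SARAR the same change is passed through the spatial multiplier. The code below is a hedged illustration, not the paper's computation: the zone layout, the shared β, the helper `mean_flow_change` and the ρ values assigned to each model are hypothetical, and in practice each model also yields its own β estimates.

```python
import numpy as np

def mean_flow_change(W, rho, beta, zone, delta=1.0):
    """Average change in expected flows over all OD pairs.

    rho = (rho_o, rho_d, rho_w); rho = (0, 0, 0) reproduces the SEM case,
    where no spatial multiplier applies to the expected flows.
    """
    n = W.shape[0]
    I = np.eye(n)
    rho_o, rho_d, rho_w = rho
    A = (np.eye(n * n) - rho_o * np.kron(W, I)
         - rho_d * np.kron(I, W) - rho_w * np.kron(W, W))
    dxb = np.zeros((n, n))        # rows = origins, columns = destinations
    dxb[zone, :] += beta * delta  # origin-side change
    dxb[:, zone] += beta * delta  # destination-side change
    return float(np.linalg.solve(A, dxb.ravel()).mean())

# Hypothetical 7-zone line with row-standardised contiguity weights.
n = 7
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W /= W.sum(axis=1, keepdims=True)

# Made-up spatial coefficients; only their ordering matters for the illustration.
scenarios = {
    "SAR": (0.40, 0.40, -0.16),
    "SARAR": (0.25, 0.25, -0.0625),
    "SEM": (0.0, 0.0, 0.0),
}
results = {name: mean_flow_change(W, rho, beta=0.5, zone=3)
           for name, rho in scenarios.items()}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

With a common β, a larger positive ρ gives a larger multiplier and hence a larger projected increase, while the SEM case reduces to the direct change alone. The ordering in the empirical study additionally reflects the different β estimates of each model.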

6. Conclusions and discussions

This paper specifies, estimates, and applies a spatial autoregressive model with spatial autoregressive disturbances (SARAR) and OD filters to investigate trip distributions. Apart from being an early work that adopts new models to study a traditional problem, this paper contributes to the existing literature by (1) filling a gap, as no existing study has examined how the specification of spatial OD dependence differs across models and how different specifications drive the variation in the marginal effects of explanatory variables, and (2) comparing the empirical marginal effects across mornings and afternoons on a weekday and a weekend day.

In general, spatial interaction models with spatial OD filters are still far from widely used in empirical trip distribution modeling. In the era of big data in the transportation industry, the need to adopt OD filter models to improve the understanding of travel patterns is increasing. This paper shows that trip distribution analysis is a suitable use case for the proposed OD filter models. Continuing this line of research has both academic and practical value for transportation planning and management.

From the theoretical and empirical analyses, this paper concludes that 1) the spatial process in the trip distribution data takes place in both the spatial autoregressive (SAR) term and the spatial error (SEM) term; 2) SARAR models can capture the SAR and SEM processes simultaneously, and the estimated spatial coefficients reveal the intensity of each spatial effect; 3) SARAR models outperform the SAR and SEM models in all investigated scenarios, so the marginal effects obtained from SARAR are, from a statistical point of view, the preferred ones to use in practice; and 4) Thursday and Sunday present different marginal effects for the commerce and road lane length variables, while morning and afternoon do not exhibit major differences.

This study has two limitations. 1) Investigating trip distribution at the same time on two different days may call for a panel data specification. Researchers have introduced a fixed effects model [14], but it is not yet suited to this paper, because explanatory variables that are constant over time would be canceled out in estimation. 2) Examining policy interventions in trip distribution requires explanatory variables collected at separate times. Hence, a foreseeable next step is to develop time-varying spatial OD dependence models and to collect more comprehensive trip distribution data.

Supporting information

S1 Dataset. Analysis data submitted.

https://doi.org/10.1371/journal.pone.0305932.s001

(XLSX)

References

1. LeSage JP, Pace RK. Spatial Econometric Modeling of Origin-Destination Flows. J Reg Sci 2008;48:941–67.
2. Anselin L. Spatial Econometrics: Methods and Models. Dordrecht, Netherlands: Springer Science and Business Media LLC; 1988. https://doi.org/10.2307/143780.
3. LeSage JP, Thomas-Agnan C. Interpreting spatial econometric origin-destination flow models. J Reg Sci 2015;55:188–208. https://doi.org/10.1111/jors.12114.
4. Margaretic P, Thomas-Agnan C, Doucet R. Spatial dependence in (origin-destination) air passenger flows. Papers in Regional Science 2017;96:357–80. https://doi.org/10.1111/pirs.12189.
5. LeSage JP, Llano C. A spatial interaction model with spatially structured origin and destination effects. Advances in Spatial Science, Springer International Publishing; 2016, p. 171–97. https://doi.org/10.1007/978-3-319-30196-9_9.
6. Ni L, Wang X, Zhang D. Impacts of information technology and urbanization on less-than-truckload freight flows in China: An analysis considering spatial effects. Transp Res Part A Policy Pract 2016;92:12–25. https://doi.org/10.1016/j.tra.2016.06.030.
7. Ni L, Wang X, Chen X. A spatial econometric model for travel flow analysis and real-world applications with massive mobile phone data. Transp Res Part C Emerg Technol 2018;86:510–26. https://doi.org/10.1016/j.trc.2017.12.002.
8. Anselin L. Spatial Econometrics: Methods and Models. Dordrecht, Netherlands: Springer Science and Business Media LLC; 1988. https://doi.org/10.2307/143780.
9. Li X, Chen WY, Hin Ting Cho F. 3-D spatial hedonic modelling: Environmental impacts of polluted urban river in a high-rise apartment market. Landsc Urban Plan 2020;203. https://doi.org/10.1016/j.landurbplan.2020.103883.
10. Sun F, Matthews SA, Yang TC, Hu MH. A spatial analysis of the COVID-19 period prevalence in U.S. counties through June 28, 2020: where geography matters? Ann Epidemiol 2020;52:54–59.e1. pmid:32736059
11. Schneider M. Gravity models and trip distribution theory. Papers in Regional Science 1959;5:51–6. https://doi.org/10.1111/j.1435-5597.1959.tb01665.x.
12. Zhang D, Luchian S, Raycroft J, Ulama D. Induced Travel Demand Modeling for High-Speed Intercity Transportation. Transp Res Rec 2019;2673:189–98. https://doi.org/10.1177/0361198119837189.
13. Halleck Vega S, Elhorst JP. The SLX model. J Reg Sci 2015;55:339–63. https://doi.org/10.1111/jors.12188.
14. Fischer MM, LeSage JP. Network dependence in multi-indexed data on international trade flows. Journal of Spatial Econometrics 2020;1. https://doi.org/10.1007/s43071-020-00005-w.
