Abstract
Mountainous urban rail transit stations exhibit distinct characteristics. To investigate how these features affect passenger flow at rail stations, we analyze geographic-environmental data surrounding the stations and integrate road network topology, automatic fare collection (AFC) data, and point-of-interest (POI) data. We propose a method for classifying rail transit stations that accounts for mountainous features and establish a multiscale geographically weighted regression (MGWR) model to assess the classification results. The study covers 189 rail stations in Chongqing and identifies six station categories: comprehensive mountainous, comprehensive non-mountainous, employment mountainous, employment non-mountainous, residential mountainous, and residential non-mountainous. The MGWR results show that road growth coefficients, average longitudinal slopes, and road lengths significantly influence station passenger flow. In particular, the average longitudinal slope has a strong negative effect on morning peak inbound passenger flow at employment mountainous stations (-0.949), indicating that commuters are more sensitive to travel time during the morning peak. The evening peak inbound passenger flow is less affected (-0.409), suggesting that evening commuters face fewer time constraints. These findings offer strategic insights for zoning transit stations to support transit-oriented development (TOD).
Citation: Zou Q, Xia Y, Ran X, Guo X, Feng J (2025) Classification of mountain-based rail transit stations and analysis of passenger flow influencing mechanisms. PLoS One 20(5): e0323937. https://doi.org/10.1371/journal.pone.0323937
Editor: Qing-Chang Lu, Chang'an University, CHINA
Received: October 26, 2024; Accepted: April 16, 2025; Published: May 27, 2025
Copyright: © 2025 Zou et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting information files.
Funding: This study is jointly supported by the National Natural Science of China [grant number: 52302386], and the China Postdoctoral Science Foundation [grant number: 2023M730430]. All funds were received by Qingru Zou. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
As economies develop, traffic congestion in major cities is becoming increasingly severe. To address this issue, priority has been given to developing urban public transportation systems, particularly by promoting urban rail transit as the backbone of the city’s public transportation, supplemented by regular buses. For example, rail transit accounts for as much as 77.7% of total public transportation usage in Tokyo, Japan [1], 70% in Paris, France [2], and 77.5% in Shanghai, China [3]. In mountainous cities, however, complex spatial geography and constraints on urban road layout make traffic congestion even more pronounced, and the role of rail transit as a backbone is less apparent. In Seoul, South Korea, rail transit accounts for 36.2% of total public transportation usage, and in Chongqing, China, it stands at 40.1% [4].
Rail transit plays a weaker backbone role in mountainous cities for three main reasons. First, the steep gradients of mountainous roads and the limited capacity of rail systems constrain morning peak capacity; Chongqing’s monorail Line 3, for example, has low capacity, fewer doors, and long stop times [5]. Second, transfers between other public transport and rail around stations are inefficient. Mountainous cities tend to have multicentric clusters, and rail stations are often located on the periphery of districts, making passenger flow distribution less efficient and necessitating ground transport for the “last mile” [6]. Lastly, the walking paths around mountainous rail stations are complex, with high road growth coefficients, resulting in poor pedestrian accessibility [7].
Therefore, to mitigate the influence of mountainous spatial characteristics on passenger flow and strengthen the role of rail transit in mountainous cities, it is imperative to study the factors affecting passenger flow in mountain rail transit and to increase the utilization rate of urban rail transit in such regions. At present, when analyzing the factors that influence passenger flow at rail transit stations [8], scholars rarely consider mountainous characteristics, even though spatial differences in influencing factors exist. This paper focuses on mountainous features such as the average longitudinal slope of roads, road network length, and road growth coefficient. It mainly examines walking accessibility around rail stations, transfer convenience, surrounding development intensity, and station passenger flow data. The K-means clustering method is applied to classify Chongqing’s rail transit stations. Subsequently, OLS, GWR, and MGWR models are established to analyze the stations in each class.
1.1 Literature review
Since the characteristics of each station and its surrounding land use affect stations differently, leading to variations in passenger flow patterns, it is essential to investigate the mechanisms influencing passenger flow across different station types based on a reasonable classification of stations.
1.1.1 Classification of urban rail rapid transit stations.
Selecting appropriate station classification indicators is key to the classification of rail transit stations. Existing research can be divided into single-indicator and multi-indicator classification approaches. The single-indicator classification includes:
- (1). Classification of station characteristic indicators: Wu Jiaorong (2007) categorized subway stations into distinct functional types based on their varying transfer facilities [9].
- (2). Land use index classification: In the United States, the Center for Transit-Oriented Development (2010) classified stations into 15 types by considering factors such as the number of jobs around the stations, the annual total mileage of households in the traffic area, and whether the surrounding areas were residential, employment-focused, or mixed-use [10].
- (3). Passenger flow index classification: Meekyung (2018) grouped 233 subway stations in Seoul into eight categories using data on subway card swipes, passenger flow at different times, and fluctuations in passenger flow curves [11]. Similarly, Li et al. classified stations into six categories based on passenger traffic characteristics such as the number of peaks and troughs and the skewness in flow fluctuations [12].
In recent years, more researchers have combined multiple indicators to classify stations. For instance, Higgins et al. used data from 372 rail transit stations in Toronto, Canada, to extract land use and passenger flow characteristics, classifying them into ten categories [13]. Kim et al. combined passenger flow and land use data to classify Seoul’s stations [14]. Similarly, Pang Lei et al. classified Tianjin’s rail transit stations into three categories (residential, employment, and mixed-use) by combining built environment, station attributes, and network characteristics [15]. Chen L. et al. used station characteristics and nearby building functions to categorize 30 stations in Xi’an into six types [16].
1.1.2 Study on the influence mechanism of passenger flow.
Regression analysis is a common method to determine how different indicators affect passenger flow at rail stations. Traditional models such as Ordinary Least Squares (OLS) have been widely used. For example, Loo et al. used the OLS model to analyze rail transit stations in Hong Kong and New York, concluding that various indicators within the service areas significantly impacted station passenger flow [17]. An et al. employed Ordinary Least Squares (OLS) regression analysis and found that commercial land use, bus stations, and tourist attractions have significant positive impacts on Shanghai’s rail transit passenger flow, independent of weekdays [18]. However, with the deepening of research, some scholars argue that the OLS method lacks consideration of spatial spillover effects. The Geographically Weighted Regression (GWR) model addresses this by visualizing how independent variables affect dependent variables across different spatial locations. For instance, Qian et al. found that the GWR model offered better explanatory power when studying the dynamic demand for cabs in New York City [19].
The GWR model, however, assumes that all independent variables affect the dependent variable at the same spatial scale, ignoring differences in the scale at which each variable operates. To address this limitation, researchers introduced the MGWR model. Jun et al. applied the MGWR model to Seoul’s rail stations, finding that it effectively accounted for both global and local variables [20]. Zahratu et al. highlighted that the MGWR model allows each independent variable to have a specific bandwidth, reflecting the differentiated scales of influence [21]. Tak et al. used the MGWR model to analyze passenger flow in Beijing’s rail stations, concluding that it provided more reliable estimates compared to the classical OLS and GWR models [22].
In summary, while considerable progress has been made in classifying rail transit stations and understanding passenger flow mechanisms, further improvements are necessary. First, much of the existing research focuses on plain areas, overlooking the unique geographic conditions of mountainous cities. Additionally, many studies fail to consider differences in the factors affecting passenger flow across various types of stations. Therefore, it is critical to develop a reasonable classification system for rail stations and examine the passenger flow mechanisms of different station types, considering the spatial and geographic characteristics of mountainous cities, to enhance the rail transit share in such areas.
1.2 Study area
Chongqing has a complex geographic environment with significant elevation changes, making it a typical mountainous city in China. Except for certain sections of monorail Line 3, which reach full capacity during peak hours, the lines maintain ample capacity [23]. This paper selects nine operational rail transit lines in Chongqing: Lines 1, 4, 5, 6, and 10, the Circular Line, and the Guobo Line, which are subway systems (high-capacity rail transit), and Lines 2 and 3, which use straddle-type monorail technology (medium-capacity light rail transit). The study investigates a total of 189 stations across these lines, counting each interchange station once. It integrates Chongqing’s road network structure and pedestrian system characteristics, constructing a “key impact area” with a 500-meter radius centered on each rail transit station, known as the station area [24].
2 Data and methodology
2.1 Cluster-based identification of station types
Analyzing passenger flow characteristics at urban rail transit stations is a relatively complex endeavor [25]. Classifying the stations can simplify the research. The clustering methods currently employed by scholars include K-means, DBSCAN, HDBSCAN, OPTICS, and Self-Organizing Maps (SOM). DBSCAN performs poorly on high-dimensional data and is ineffective for clusters with varying densities: when the data set exhibits significant spatial density variations, it struggles to cluster both high-density and low-density regions reasonably. HDBSCAN is highly sensitive to parameter settings and has a high computational complexity. The OPTICS algorithm also suffers from high computational complexity, especially on large-scale data sets, and faces difficulties in extracting meaningful clusters. SOM requires the selection of an appropriate network structure, and improper choices can negatively impact model performance.
Although the K-means algorithm is sensitive to initial cluster centers and faces challenges in determining the optimal number of clusters (K), its simplicity, efficiency, and suitability for large-scale data sets make it a rational choice for our study. We determine the optimal number of clusters using the elbow method and improve the stability of clustering results by enhancing the initialization method. Moreover, K-means has fewer parameters, which are easy to adjust and interpret, an important advantage in practical applications. We believe that, with appropriate data preprocessing and parameter tuning, K-means can effectively classify mountainous rail transit stations and support the analysis of passenger flow impact mechanisms. The unsupervised classification of station types using the K-means clustering algorithm involves four key steps: selecting indicators, determining the number of clusters, clustering the stations, and identifying station categories.
2.1.1 Selection of station classification indicators.
Based on station functionality, service capacity, and the environmental impact of surrounding areas, and considering data accessibility, quantifiability, and comprehensiveness, we propose four categories of indicators as classification criteria: development intensity, which reflects urbanization levels and the density of economic activity in station-adjacent areas; transfer convenience, which indicates accessibility to other zones and the station’s functional role within the transportation network; walking accessibility, which measures how conveniently passengers can reach stations; and passenger flow, which captures the distinct temporal characteristics of ridership during the morning and evening peak periods.
- (1). Development intensity indicator. POI data contain spatial information of the urban fabric, including attributes such as names, category classifications, and geographical coordinates (latitude and longitude). These data can effectively reveal the density and distribution of developments around these stations. The types of facilities and service functions within a 500-meter radius around the stations significantly influence the attractiveness and vitality of the transit stations [26]. This study employs Gaode Map to obtain POI data within a 500-meter radius around rail transit stations in nine districts of Chongqing, namely Yuzhong, Shapingba, Jiangbei, Yubei, Nan’an, Jiulongpo, Dadukou, Banan, and Beibei. Based on the “Urban Land Use Classification and Planning and Construction Land Use Standards,” a total of 298,035 POIs were extracted from the core station area [27]. This includes 138,887 POIs for commercial land use, 72,878 for residential land use, 40,215 for public administration and services, and 23,766 for transportation purposes.
- (2). Transfer convenience indicator. The number of bus lines within 500 meters of the station is calculated based on Gaode POI data to measure the ease of transferring between buses and rail transit [28]. Additionally, the number of parking lots is used to reflect the convenience of Park + Ride (P + R) modes. A total of 3,585 bus lines and 3,316 parking lots around the stations are extracted.
- (3). Walking accessibility indicators [29]. The 500-meter radius around rail transit stations is generally regarded as the core area for pedestrian accessibility. Within this range, residents and passengers can conveniently reach the stations on foot without relying on other transportation modes. The characteristics of roads and buildings within this range directly influence the walking experience and travel efficiency of passengers, making it a focal area for research. Using OpenStreetMap data, walking accessibility is measured by the length of the road network within a 500-meter radius of the station. The degree of zigzagging and the elevation differences in the terrain are reflected by the road growth coefficient and the average longitudinal slope, as shown in Equations (1) and (2). A total of 3,842 road segments were extracted, allowing the calculation of road length, growth coefficient, and average slope.
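Consistent with the symbol definitions that follow, Equation (1) can be written as a rise-over-run ratio. This form is reconstructed from those definitions; whether the original takes an absolute value or reports the slope as a percentage is an assumption left open here.

```latex
I = \frac{H - H_o}{L_t} \tag{1}
```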
Where:
I is the average longitudinal slope, H is the station elevation, Ho is the passenger departure point’s elevation within 500m, and Lt is the length of the passenger trip within 500m.
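Consistent with the symbol definitions that follow, Equation (2) can be written as the ratio of the walked length to the straight-line distance, so that a perfectly straight path gives a value of 1. This form is reconstructed from those definitions rather than copied from the original.

```latex
C_r = \frac{L_r}{L_s} \tag{2}
```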
Where:
Cr is the road growth factor, Lr is the length of the passenger walk within 500m, and Ls is the straight-line distance from the start to the end of the passenger walk.
- (4). Station passenger flow indicator. Station passenger flow data are derived from the urban rail transit automatic fare collection (AFC) system for 2021. The morning and evening peak inbound and outbound passenger flows of each station are selected to reflect how passenger flow changes across periods [30].
Thirteen indicators across these four categories were selected to classify the stations (Table 1). To remove scale differences between indicators, the data were standardized using the Z-score method, which subtracts the mean of each variable and divides by its standard deviation, so that every variable has a mean of 0 and a standard deviation of 1.
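The indicator calculations and the standardization step can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and sample values are hypothetical, the slope uses a plain rise-over-run form, and the standard deviation is the population form (dividing by n), which is an assumption since the text does not say which form was used.

```python
import math


def average_longitudinal_slope(station_elev, origin_elev, trip_length):
    """Rise over run between a departure point and the station (assumed form)."""
    return (station_elev - origin_elev) / trip_length


def road_growth_coefficient(walk_length, straight_distance):
    """Ratio of walked length to straight-line distance (1.0 = straight path)."""
    return walk_length / straight_distance


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("cannot standardize a constant indicator")
    return [(v - mean) / std for v in values]


# Hypothetical values: a 20 m climb over a 400 m walk, and a 650 m walk
# covering a 500 m straight-line distance.
slope = average_longitudinal_slope(250.0, 230.0, 400.0)
growth = road_growth_coefficient(650.0, 500.0)
standardized = z_scores([1200.0, 3400.0, 560.0, 2100.0])
```

Standardizing each indicator separately keeps large-valued indicators such as POI counts from dominating the Euclidean distances used in clustering.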
2.1.2 Determination of the number of clusters.
Before conducting cluster analysis, it is necessary to determine the appropriate number of clusters. This study applies the “elbow method”, which computes the sum of squared errors (SSE) for a range of cluster numbers and identifies the point at which the marginal improvement in SSE diminishes (the inflection point). The principle behind the elbow method is that as the number of clusters (k) increases, the clustering becomes more granular and the cohesion within each cluster improves, causing the SSE to decrease. Beyond a certain point, however, further increases in k yield diminishing improvements in cohesion, and the reduction in SSE slows. This produces an “elbow” shape on a graph plotting SSE against k, and the k-value at this elbow is taken as the optimal number of clusters. The SSE is calculated using the following formula:
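Using the symbols defined below, the SSE takes the standard within-cluster form. It is reconstructed here from those definitions, with the distance taken as Euclidean.

```latex
\mathrm{SSE} = \sum_{i=1}^{K} \sum_{p \in C_i} \lVert p - m_i \rVert^{2}
```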
where:
K is the current number of clusters; Ci is the ith subset in the clustering; p is the sample points within the subset; and mi is the average of all sample points within the subset.
2.2 Influence mechanism of passenger flow in different station types
2.2.1 Selection of indicators for passenger flow impact mechanisms.
- (1). Station functional attributes
The operational and geographical attributes of urban rail transit stations influence passenger flow [31]. Based on passenger flow statistics, this study designates transfer stations with particularly high passenger volumes as large transfer stations. The geographical location of the station is used to determine whether it serves as an external transportation hub and whether it is close to a large commercial area. These three indicators are collectively referred to as the station functional attributes.
- (2). Built environment
In addition to the previously mentioned indicators, the land use mix index was also considered, reflecting the diversity of land use types in the station area [32].
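With the symbols defined below, the land use mix index takes the form of an entropy over POI shares. The natural logarithm and the absence of a normalizing factor are assumptions, since the original expression is not shown here; n denotes the number of POI types.

```latex
H(X) = -\sum_{i=1}^{n} P_i \ln P_i
```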
Where:
H(X) is the land use mix index at station X; Pi is the percentage of the number of types i POIs at station X to the total number of POIs at that station.
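As a rough illustration of how such an index behaves, the sketch below computes an entropy-style mix from POI counts. The function name, the category labels, and the counts are hypothetical, and the natural logarithm is an assumption.

```python
import math


def land_use_mix(poi_counts):
    """Entropy of POI type shares at one station (natural log, assumed form)."""
    total = sum(poi_counts.values())
    shares = [count / total for count in poi_counts.values() if count > 0]
    return -sum(p * math.log(p) for p in shares)


# Hypothetical POI counts within 500 m of two stations.
mixed_station = {"commercial": 120, "residential": 110, "public": 95, "transport": 100}
single_use_station = {"commercial": 0, "residential": 400, "public": 0, "transport": 0}

mixed_index = land_use_mix(mixed_station)
single_index = land_use_mix(single_use_station)
```

The index is zero when all POIs belong to one type and reaches its maximum, ln(n), when the n types are equally represented.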
This paper analyzes two dimensions: the built environment and the station’s attributes. Three indicators are discrete: whether the station is an interchange station, whether it is an external transportation hub, and whether it is adjacent to a large business district. The rest are continuous variables. Table 2 presents an overview of these potential influences. The characteristics of the station itself were obtained through observation and the compilation of web resources, while the built environment indicators are the classification indicators described above with the addition of the land use mix index.
2.2.2 Selection of key variables.
Regression analysis is valid only if there is no severe collinearity among the independent variables. Collinearity is tested with the variance inflation factor (VIF): the greater the VIF, the stronger the collinearity between an independent variable and the others. A VIF below 10 indicates that there is no serious collinearity [33].
The VIF of the jth independent variable is:
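In its standard form, which matches the symbol definitions below, the VIF is the reciprocal of the unexplained share of variance. For example, a value of 0.9 for the coefficient of determination gives a VIF of 10, the threshold cited above.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```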
Where:
VIFj is the variance inflation factor of the jth independent variable; Rj² is the coefficient of determination obtained by regressing the jth independent variable on all the other independent variables.
There may be spatial dependence between neighboring stations that OLS regression cannot explain, so it is necessary to test the spatial correlation between stations. In this study, Moran’s I was selected as the test index, and the projected spatial coordinates and influencing factors of each station were imported into ArcGIS for the spatial autocorrelation test [34]. Calculating Moran’s I requires a spatial weights matrix, which reflects the spatial proximity between locations by representing the relationships among spatial units. For Moran’s I, the spatial weights matrix is often constructed using Euclidean distance as the basis for assigning weights. The product of the weight matrix γ and the explanatory variable indicators, as shown in Equation (10), reflects the degree of similarity between spatial units. Euclidean distance measures the straight-line distance between two points in geographical space and thus serves as a valid basis for assigning weights.
Moran’s I is calculated as follows:
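Using the symbols defined below, the global Moran’s I in its standard form is given here. The distance-decay function that maps the distance and the bandwidth b to each weight is not reproduced, because its form cannot be recovered from the surrounding text.

```latex
I = \frac{\sum_{i=1}^{m'} \sum_{j=1}^{m'} \gamma_{ij}\,(e_i - \bar{e})(e_j - \bar{e})}
         {S^{2} \sum_{i=1}^{m'} \sum_{j=1}^{m'} \gamma_{ij}},
\qquad
S^{2} = \frac{1}{m'} \sum_{i=1}^{m'} (e_i - \bar{e})^{2}
```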
Where:
m’ is the total number of rail transit stations; γij is the weight between stations i and j; ei and ej are the values of the independent variable indicator e at stations i and j; ē is the mean value of indicator e over all stations; S² is the variance of indicator e; dij is the Euclidean distance between station i and station j; b is the bandwidth, the non-negative decay parameter of the distance-weighting function.
When the samples are distributed relatively uniformly in space, a fixed (equal) bandwidth is often used; otherwise, a variable bandwidth is used. The value of Moran’s I usually lies in [-1, 1]. At a given significance level, a value of 0 indicates no spatial autocorrelation, a value greater than 0 indicates that the variable is spatially clustered, and a value less than 0 indicates that the variable is dispersed (negative spatial correlation) [35]. A standardized Z statistic is generally used for the significance test, and its expression is as follows:
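In its usual form, which matches the definitions below, the standardized statistic subtracts the theoretical expectation and divides by the theoretical standard deviation.

```latex
Z = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}
```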
Where:
E(Moran’s I) is the theoretical mathematical expectation of Moran’s I; Var(Moran’s I) is its theoretical variance.
The reliability of Moran’s I can be evaluated using the Z-value and the P-value. If the P-value is less than 0.05 (that is, the test passes at the 95% confidence level), the absolute value of the Z-score exceeds the critical value of 1.96, which indicates that the Moran’s I result is credible and that the data are spatially correlated with more than 95% certainty. When Moran’s I is greater than 0, the spatial correlation is positive; when it is less than 0, the correlation is negative [36].
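The spatial autocorrelation test described above was run in ArcGIS. For readers who want to see the arithmetic, the following sketch computes the global Moran’s I directly. The Gaussian distance-decay kernel, the function name, and the sample data are assumptions made for illustration and may differ from the weighting the authors used.

```python
import math


def morans_i(values, coords, bandwidth):
    """Global Moran's I with Gaussian distance-decay weights (assumed kernel)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    weighted_cross = 0.0
    weight_total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a station is not its own neighbor
            d = math.dist(coords[i], coords[j])
            w = math.exp(-((d / bandwidth) ** 2))
            weighted_cross += w * (values[i] - mean) * (values[j] - mean)
            weight_total += w
    return weighted_cross / (variance * weight_total)


# Hypothetical stations along a line, 1 km apart.
coords = [(float(k), 0.0) for k in range(6)]
clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]    # similar values sit together
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # neighbors differ

i_clustered = morans_i(clustered, coords, bandwidth=1.0)
i_alternating = morans_i(alternating, coords, bandwidth=1.0)
```

The clustered pattern yields a positive index and the alternating pattern a negative one, matching the interpretation given in the text.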
2.2.3 Passenger flow impact regression modeling.
The OLS, GWR, and MGWR models are commonly used to handle continuous dependent variables. When analyzing the factors influencing station passenger flow, passenger flow data can be treated as a continuous variable. In the regression models, the dependent variables are the total daily passenger flow, morning peak inbound passenger flow, morning peak outbound passenger flow, evening peak inbound passenger flow, and evening peak outbound passenger flow. The independent variables are the station functional attributes and built environment indicators introduced in the previous section.
- (1). OLS Regression Model [37]
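With the symbols defined below, the OLS model is the standard multiple linear regression:

```latex
Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \cdots + \alpha_n X_n + \varepsilon
```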
Where:
X1, X2,..., Xn are the independent variables; α0 is the intercept term of the OLS model; α1, α2,..., αn are the regression coefficients of the n independent variables; Y is the dependent variable; and ε is the residual, which follows a normal distribution with a mean of zero.
- (2). GWR regression model [38]
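With the symbols defined below, the GWR model lets every coefficient vary with station location. The symbol X_ij, denoting the value of the jth independent variable at station i, is introduced here as an assumption because it does not appear in the definitions.

```latex
Y_i = \beta_0(U_i, V_i) + \sum_{j=1}^{n} \beta_j(U_i, V_i)\, X_{ij} + \varepsilon_i
```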
Where:
(Ui, Vi) is the coordinates of station i; Yi is the dependent variable at location i; β0 (Ui, Vi) is the intercept of station i; βj (Ui,Vi) is the regression coefficient of the jth independent variable at (Ui,Vi); and εi is the residuals of station model i.
- (3). MGWR regression model [39]
Where:
β0 (Ui,Vi) is the intercept for the station i; εl is the lth global indicator for the station; βl is the regression coefficient for the global indicators, which are the first k indicators; and βl (Ui,Vi) is the regression coefficient for the local indicators, i.e., the k + 1 to n’ indicators, which vary with each station.
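To make the idea of location-specific coefficients concrete, the sketch below fits a weighted least-squares line at each station, with weights that decay with distance. It is a simplified illustration of the GWR principle only: it assumes one independent variable, a Gaussian kernel, and a single fixed bandwidth, and the function name and data are hypothetical. An MGWR fit would additionally allow a different bandwidth for each variable, which this sketch does not attempt.

```python
import math


def gwr_local_fit(coords, x, y, bandwidth):
    """Return (intercept, slope) at each location from a distance-weighted fit."""
    results = []
    for focal in coords:
        # Gaussian kernel: nearby stations get weights close to 1.
        w = [math.exp(-0.5 * (math.dist(focal, c) / bandwidth) ** 2) for c in coords]
        w_sum = sum(w)
        x_mean = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
        y_mean = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
        s_xx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
        s_xy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
        slope = s_xy / s_xx
        results.append((y_mean - slope * x_mean, slope))
    return results


# Hypothetical data: two groups of stations 10 km apart, where the effect
# of x on y is 2 in the western group and 5 in the eastern group.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0, 5.0, 10.0, 15.0]

local_coefficients = gwr_local_fit(coords, x, y, bandwidth=1.0)
```

A single global OLS line would average the two regimes, whereas the local fits recover a slope near 2 in the west and near 5 in the east.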
3 Analysis of results
3.1 Station classification results
K-means clustering is an iterative method that achieves optimal classification by continuously adjusting the positions of cluster centers [40]. To divide the station data into K groups, K objects are first selected at random as the initial cluster centers. The distance between each object and each cluster center is then calculated, and each object is assigned to the nearest center; a cluster center together with the objects assigned to it represents a cluster. After assignment, the center of each cluster is recalculated from the objects currently in the cluster. This process is repeated until no objects are reassigned to different clusters.
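The iteration described above, together with the SSE used by the elbow method, can be sketched as follows. This is an illustrative implementation rather than the authors' code. To keep the example reproducible, it replaces random initialization with a deterministic farthest-point seeding, which is an assumption and not necessarily the enhanced initialization the study used. All function names and sample points are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from its nearest chosen center (assumed scheme)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Assign points to the nearest center and update centers until stable."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # no object changed cluster
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


def elbow_curve(points, k_values):
    """SSE for each candidate number of clusters."""
    return {k: kmeans(points, k)[2] for k in k_values}


# Hypothetical standardized indicators forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
curve = elbow_curve(points, [1, 2, 3])
```

In this toy data the SSE drops sharply from k = 1 to k = 2 and only marginally afterwards, which is the elbow pattern the method looks for.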
The number of clusters was determined using the elbow method. Python code was used to analyze the data samples and identify the number of clusters (K) [41], as shown in Fig 1. The analysis reveals that as K increases from 1 to 6, the distortion (SSE) changes markedly. When K exceeds 6, the change in distortion decreases significantly. Thus, 6 was selected as the optimal number of clusters.
For the six station categories in the clustering results, Figs 2 and 3 show the weekday inbound and outbound passenger flows of each station type and the characteristics of each type.
- (1). Comprehensive-type stations exhibit the highest passenger volumes, with weekday traffic showing distinct bimodal patterns. Comprehensive mountainous stations have higher longitudinal slopes and road growth coefficients. In contrast, Comprehensive non-mountainous stations have higher passenger flows and lower slopes, with more convenient land use and transport conditions.
- (2). Employment-type stations have relatively high passenger flows. Their weekday flow curves show a single-peak pattern, with significantly higher outbound flows in the morning peak hour and inbound flows in the evening peak hour. Employment mountainous stations have a higher percentage of commercial and service POIs and a larger number of bus stops and parking lots in the vicinity. Employment non-mountainous stations have higher passenger flows than employment mountainous stations. Their gentler topography is more conducive to urban planning and to land development and utilization around the station, and they have a high road network density and a higher density of bus routes within the station coverage area.
- (3). Residential-type stations exhibit the smallest passenger flow. Their weekday passenger flow follows a single-peak pattern, with large inbound flows in the morning peak and large outbound flows in the evening peak, reflecting residential riders who leave early and return late. The area surrounding residential mountainous stations is dominated by residential land, and the surrounding land is not highly developed. Residential non-mountainous stations have larger passenger flows than residential mountainous stations, a high road network density, a low average longitudinal slope, and a high number of residential POIs.
After classifying the 189 stations in Chongqing, an analysis of the inbound and outbound passenger flows and the surrounding characteristics of each station type shows significant differences among the types. To further explore how the influencing factors affect station passenger flow, a passenger flow regression model can be established for each type, and improvement measures can then be proposed for the different types of stations to increase passenger flow.
3.2 Passenger flow impact regression model results
3.2.1 Results of key variables selection.
The collinearity test results and spatial autocorrelation of the factors influencing the passenger flow characteristics of Chongqing rail transit stations are shown in Table 3.
The data show that the VIF values of the 13 indicators studied are all below 10, which means that there is no serious multicollinearity among these indicators and they have good validity. The Moran’s I values of the 13 indicators are all positive. The Z-values of 12 of the indicators are above 1.96, reaching the level of significance, which meets the conditions for establishing a regression model.
3.2.2 Comparison of regression models.
The independent variables are treated as global indicators in OLS and as local indicators in GWR, while MGWR is a comprehensive model that integrates the global and local characteristics of the two [42,43]. In this paper, we take Chongqing metro stations as the research object, use the same independent variables, and fit OLS, GWR, and MGWR models to the passenger flow characteristics.
The smaller the values of RSS, AIC, and AICc and the larger the values of R² and adjusted R², the better the model fit [44]. Table 4 shows that the RSS, AIC, and AICc values rank as MGWR < GWR < OLS and that the R² and adjusted R² values rank as MGWR > GWR > OLS, indicating that the MGWR and GWR models, which consider spatial correlation, fit better. With AICc as the bandwidth optimization criterion, the AICc value of the MGWR model is significantly smaller than that of the GWR model in every period, which indicates that the MGWR model further improves on the GWR model.
3.2.3 Influence mechanisms of passenger flow characteristics at stations based on MGWR.
To investigate the heterogeneous effects of mountain track characteristics, interchange convenience, and pedestrian accessibility on the passenger flow of different types of stations, based on the results of station classification, the MGWR model is used to study the influence mechanisms of all-day weekday passenger flow, morning peak inbound passenger flow, morning peak inbound and outbound passenger flow, evening peak inbound passenger flow and evening peak outbound passenger flow of different types of stations, respectively [45]. Take the employment class mountain type as an example, as shown in Table 5:
Table 5 shows the following. (1) The factors with a positive effect on all-day passenger flow at employment mountainous stations are, in descending order of coefficient, neighboring business districts (0.857) > commercial POIs (0.685) > parking lots (0.362) > public administration POIs (0.269) > traffic and transportation POIs (0.24) > external transportation hubs (0.121). This indicates that the commercial land use around this type of station and the neighboring business districts attract a large number of passengers and jobs. (2) The factors with a negative effect on all-day passenger flow are, in descending order of magnitude, road growth coefficients (-0.613) > residential POIs (-0.429) > transit routes (-0.33) > mixed rate of land use (-0.281) > average longitudinal slopes of roadways (-0.181) > roadway network length (-0.1) > interchange (-0.033). The road growth coefficient has the largest negative effect, indicating that passengers at these stations are very sensitive to walking distance near the station. Residential POIs have the second largest negative effect: because residential and employment stations show opposite travel behavior, residential land use has a large negative impact on passenger flow at employment mountainous stations.
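The two orderings above can be reproduced with a short Python sketch that splits the coefficients quoted in the text by sign and sorts each group by magnitude. Only the values come from the paragraph above; the dictionary layout and variable names are illustrative assumptions.

```python
# All-day passenger flow coefficients for employment mountainous stations,
# as quoted in the text from Table 5.
coefficients = {
    "neighboring business districts": 0.857,
    "commercial POIs": 0.685,
    "parking lots": 0.362,
    "public administration POIs": 0.269,
    "traffic and transportation POIs": 0.24,
    "external transportation hubs": 0.121,
    "road growth coefficient": -0.613,
    "residential POIs": -0.429,
    "transit routes": -0.33,
    "mixed rate of land use": -0.281,
    "average longitudinal slope of roadways": -0.181,
    "roadway network length": -0.1,
    "interchange": -0.033,
}

# Positive effects, largest coefficient first.
positive = sorted(
    (item for item in coefficients.items() if item[1] > 0),
    key=lambda item: item[1],
    reverse=True,
)
# Negative effects, largest magnitude first.
negative = sorted(
    (item for item in coefficients.items() if item[1] < 0),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for label, group in (("positive", positive), ("negative", negative)):
    print(label + ": " + " > ".join(f"{name} ({value})" for name, value in group))
```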
4 Conclusion
Based on passenger flow data and built environment data from mountainous urban rail transit stations, this study employs the K-means clustering algorithm to classify the stations and establishes OLS, GWR, and MGWR models for regression analysis of passenger flow characteristics. The findings indicate that stations can be categorized into six types: comprehensive mountainous, comprehensive non-mountainous, employment mountainous, employment non-mountainous, residential mountainous, and residential non-mountainous. A comparison of model performance reveals that the MGWR model achieves better regression fitting than the GWR and OLS models. The MGWR analysis highlights that the influence of various indicators on passenger flow is spatially non-stationary and varies with geographical location. For instance, a higher density of bus lines and parking lots around a station significantly increases morning and evening peak passenger flow. This effect is most pronounced at employment non-mountainous stations and residential non-mountainous stations. To enhance passenger flow, it is recommended to increase the number of bus lines and parking facilities near these types of stations. Additionally, lower average road slopes and road growth coefficients are associated with higher peak passenger flow, particularly at residential non-mountainous stations. Thus, improving the pedestrian environment around these stations could attract more passengers.
This study provides a theoretical foundation for increasing the modal share of rail transit, though certain limitations remain. The passenger flow data used in this study were collected at hourly intervals. To better understand the influence of various indicators on passenger flow, future research could refine the time intervals to 15 minutes or less. Furthermore, while this study classifies POI data based on standard urban planning land use categories, subsequent research could further differentiate POI types to analyze the temporal and spatial distribution of passenger flow at urban rail transit stations with greater precision.
References
- 1. Sohu.com. A textbook example! Characteristics of urban (suburban) railroads in the Tokyo metropolitan area of Japan and its inspiration. Available from: https://www.sohu.com/a/321902473_281835
- 2. Baidu.com. Sihan Industrial Research Institute: Analysis of the development status and future development trend of the rail transportation industry. Available from: https://baijiahao.baidu.com/s?id=1775712501205105029
- 3. Caifuhao.eastmoney.com. June 2024 Passenger Volume Ranking of China’s Mainland Provinces and Cities: Shanghai’s Rail Transit Accounts for More Than 60% (with Monthly Top 31 Detailed List). Available from: https://caifuhao.eastmoney.com/news/20240805112110623416980
- 4. Huxiu.com. Traffic congestion worldwide: the evolution of traffic in Seoul. Available from: https://www.huxiu.com/article/536104.html
- 5. Baidu.com. Chongqing’s “busiest” rail line: has nearly a million passengers a day, but misnamed for 10 years. Available from: https://baijiahao.baidu.com/s?id=1784954343860259481&wfr=spider&for=pc
- 6. Zhu Z, Guo X, Zeng J. Route design model of feeder bus service for urban rail transit stations. Math Probl Eng. 2017;20(17):1–6.
- 7. Zambrano Nájera J, Luna CC, Vélez Upegui JJ. Performance assessment of indicators of a multi-hazards early warning system in an urban mountain region. International Journal of Disaster Risk Reduction. 2024;112:104767.
- 8. Zhao J, Jiang J, Liu W. A novel short-time passenger flow prediction method for urban rail transit: CEEMDAN-CSSA-LSTM model based on station classification. Eng Lett. 31(4).
- 9. Wu J, Bi Y, Fu B. Analysis of passenger flow characteristics and transfer system priority based on suburban rail transit station classification. Urban Rail Transit Res. 2007:23–8.
- 10. Austin M, Belzer D, Benedict A. Performance-based transit-oriented development typology guidebook.
- 11. Kyung M. Classification of Seoul metro stations based on boarding/ alighting patterns using machine learning clustering. Internet Broadcasting Commun. 2018;18(4):13–8.
- 12. Li W, Zhou M, Dong H. Classifications of stations in urban rail transit based on the two-step cluster. Intell Autom Soft Comput. 2020;26(3):531–8.
- 13. Higgins CD, Kanaroglou PS. A latent class method for classifying and evaluating the performance of station area transit-oriented development in the Toronto region. Journal of Transport Geography. 2016;52:61–72.
- 14. Kim K. Identifying the Structure of Cities by Clustering Using a New Similarity Measure Based on Smart Card Data. IEEE Trans Intell Transport Syst. 2020;21(5):2002–11.
- 15. Pang L, Ren L, Zhang Z. Classification of rail transit stations and analysis of passenger flow influencing factors based on passenger flow characteristics. Transp Syst Eng Inform. 2023;23(04):184–93.
- 16. Chen L, Chen Y, Wang Y. Research on the classification of rail transit stations and passenger flow patterns: a case from Xi'an, China. Buildings. 2024;14(4).
- 17. Loo BPY, Chen C, Chan ETH. Rail-based transit-oriented development: Lessons from New York City and Hong Kong. Landscape and Urban Planning. 2010;97(3):202–12.
- 18. An D, Tong X, Liu K, Chan EHW. Understanding the impact of built environment on metro ridership using open source in Shanghai. Cities. 2019;93:177–87.
- 19. Qian X, Ukkusuri SV. Spatial variation of the urban taxi ridership using GPS data. Applied Geography. 2015;59:31–42.
- 20. Jun M-J, Choi K, Jeong J-E, Kwon K-H, Kim H-J. Land use characteristics of subway catchment areas and their influence on subway ridership in Seoul. Journal of Transport Geography. 2015;48:30–40.
- 21. Shabrina Z, Buyuklieva B, Ng MKM. Short‐Term Rental Platform in the Urban Tourism Context: A Geographically Weighted Regression (GWR) and a Multiscale GWR (MGWR) Approaches. Geographical Analysis. 2020;53(4):686–707.
- 22. Gao D, Xu Q, Chen P, et al. Spatial characterization of urban rail transit passenger flow and fine-scale built environment. Transp Syst Eng Inform. 2021;21(06):25–32.
- 23. Wang C, Wen W. Spatial layout method and practice of mountainous urban transportation stations in Chongqing. Planner. 2024;40(07):121–6.
- 24. Yang J, Xing S, Rao O. Influence of station land use on occupational and residential functions of subway stations. Sci Technol Eng. 2024;24(23):10050–6.
- 25. Li M, Wang Y, Jia L. The modeling of attraction characteristics regarding passenger flow in urban rail transit network based on field theory. PLoS One. 2017;12(9):e0184131. pmid:28863175
- 26. Zhu T, Sun X, Li Y. Urban public bicycle transportation demand prediction under station classification. Jilin Univ Eng Ed. 2021;51(02):531–40.
- 27. GB 50137-2011, Urban land use classification and planning construction land use standards. Beijing: China Construction Industry Press; 2010.
- 28. Shi Y, Guan X, Yan J. Evaluation of public transport interchange efficiency of urban rail transit considering built environment characteristics. J Rail Sci Eng. 2023;20(04):1242–9.
- 29. Li J, Lin Y, Zhang L, et al. Research on spatial accessibility of subway stations based on Gaode map API - taking Fuzhou as an example. Logistics Science Technol. 2024;47(14):10–5.
- 30. Li Y. Research on the spatial and temporal distribution characteristics of urban rail transit passenger flow. China Rail. 2021:65–75.
- 31. Wang Y, Yuan R, Tong X, Bai Z, Hou Y. Towards simulation optimization of subway station considering refined passenger behaviors. PLoS One. 2024;19(6):e0304081. pmid:38843188
- 32. Li Q, Peng J, Yang H. Analysis of the relationship between the built environment and passenger flow characteristics of rail transit stations in different station influence zones in Wuhan. J Geo-Inf Sci. 2021;23(7):1246–58.
- 33. Xiong Y, Li L, Tan H. Spatial and temporal pattern and characteristics of Shanghai rail transit passenger flow from the perspective of “dual carbon”. J Hunan Univ Technol. 2022;36(05):1–10.
- 34. Peng T, Zhou T, Cai X. Research on passenger flow prediction model of group urban rail transit based on attribute weighted regression. Transp Syst Eng Inform. 2019;23(01):176–86.
- 35. Guo G. Cluster analysis and application of urban rail transit stations in Beijing based on passenger flow characteristics and POI data. Beijing Jiaotong University; 2020.
- 36. Lu B, Ge Y, Qin K, et al. Geography weighted regression analysis technology review. J Wuhan Univ (Inform Sci Ed). 2019;45(09):1356–66.
- 37. Carvalho C, Nechio F, Tristão T. Taylor rule estimation by OLS. Journal of Monetary Economics. 2021;124:140–54.
- 38. Alexis C, Christopher B, Martin C. A route map for successful applications of geographically weighted regression. J Geogr Syst. 2022;55(1):155–78.
- 39. Li D, Zang H, Yu D, He Q, Huang X. Study on the Influence Mechanism and Space Distribution Characteristics of Rail Transit Station Area Accessibility Based on MGWR. Int J Environ Res Public Health. 2023;20(2):1535. pmid:36674291
- 40. Yang X. Research on the type identification and influence mechanism of urban rail transit stations from the perspective of passenger flow characteristics. Changan University; 2022:69–73.
- 41. Zhou Y, Xia H, Yue X. Local outlier detection method based on improved K-means. Eng Sci Technol. 2024;56(04):66–77.
- 42. Yang F, He X. Study on the influence of regional characteristics on the spatial distribution of green office buildings. Sichuan Build Sci Res. 2024;50(06):106–14.
- 43. Xiao H, Li X, Xu H, et al. A study on the spatial and temporal heterogeneity of the impact of logistics industry agglomeration on carbon emission in the logistics industry in the Chengdu-Chongqing city cluster. Railway Transp Econ. 2024;10:1–9.
- 44. Ahmed M, Seraj R, Islam SMS. The k-means algorithm: A comprehensive survey and performance evaluation.
- 45. Ma Z, Yang X, Hu D. Analysis of the degree of influence of passenger flow characteristics in urban rail transit stations. J Tsinghua Univ (Nat Sci Ed). 2023;63(09):1428–39.