Abstract
Disruptions in urban rail transit (URT) systems can significantly impact operational efficiency, while a well-designed bus bridging service (BBS) can effectively mitigate such effects. To address the surge in travel demand caused by disruptions, this study comprehensively considers alternative transportation modes that affected passengers may adopt (including taxis, shared bicycles, bridging buses, and walking), aiming to minimize both the operational costs of bridging buses and the total travel time of passengers. A travel choice model based on the random regret minimization (RRM) theory is developed to characterize passengers’ decision-making behavior following station disruptions. Demand uncertainty is represented using trapezoidal fuzzy variables, and a distributionally robust credibility optimization model is established. An innovative reinforcement learning-based parallel genetic algorithm (RPGA) is proposed for solving the model. A case study of a bidirectional disruption during 08:00–10:00 on a section of Xi’an Metro Line 2 demonstrates that: (1) The proposed model exhibits stronger robustness under demand uncertainty, achieving a reduction of 3 dispatched vehicles and a cost saving of 9,439 RMB by moderately increasing passenger costs by 850 RMB and extending bridging time; (2) The RPGA algorithm outperforms Non-dominated Sorting Genetic Algorithm II (NSGA-II), Reinforcement Learning-based NSGA-II (RLNSGA-II), and Multi-objective Particle Swarm Optimization Algorithm (MOPSO) in hypervolume (HV), generational distance (GD), and non-dominated ratio (NDR); (3) Increasing the rated passenger capacity within a certain range can reduce average passenger delays but correspondingly raises transportation costs. This method effectively enhances the system’s ability to cope with demand fluctuations and provides decision-making support for emergency scheduling in urban rail transit.
Citation: Liu Y, Yang T, Su J (2025) Optimal bus bridging service for urban rail transit disruptions with stochastic passenger demand. PLoS One 20(10): e0333686. https://doi.org/10.1371/journal.pone.0333686
Editor: Yongxiang Zhang, Southwest Jiaotong University, CHINA
Received: May 24, 2025; Accepted: September 17, 2025; Published: October 7, 2025
Copyright: © 2025 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: We acknowledge funding support from the Sichuan Higher Education Vocational Research Center (Grant No. GZY24B15) and the Key Project of Sichuan Vocational and Technical College of Communications (Grant No. 2024-KJ-ZD-002). The funders participated in study design and provided logistical support for data collection.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
As urbanization accelerates, urban transportation systems are encountering unprecedented challenges. High-capacity and low-emission Urban Rail Transit (URT) has emerged as the backbone of urban public transportation networks. However, URT operations are frequently disrupted by equipment failures and other factors, particularly prolonged stoppages, which can result in mass passenger stranding and significantly reduce transportation efficiency. During such disruptions or emergencies, the capacity of rail systems often falls short of meeting passenger demand, leading not only to train delays but also to passengers being stranded on disrupted lines or at stations, further extending their travel times. For instance, statistics reveal that between 2018 and 2019, URT passengers in London experienced over 30 million hours of delays [1]. In 2022, mainland China’s urban rail transit system recorded more than 717 delays [2]. Similarly, Hong Kong’s MTR system faces an average of about 250 disruptions annually, with disruptions caused by power failures, signaling issues, and door malfunctions lasting over 60 minutes on average [3].
In the face of unplanned disruptions, URT operators typically adopt two strategies. Firstly, they make adjustments within the rail system by redirecting passengers to alternative URT lines for transfers to neighboring stations. Secondly, they invoke the bus system to evacuate stranded passengers. Research indicates that internal adjustments can effectively transfer passengers in short-term emergencies. However, when disruptions exceed 30 minutes, solely relying on internal adjustments becomes challenging to meet passenger demand. Consequently, activating bus bridging services becomes necessary [4]. According to Pender et al., 85% of agencies rely on bus bridging services to manage disruptions. Therefore, exploring innovative connections to ensure seamless and passenger-friendly transportation during unforeseen disruptions is crucial for achieving the city’s long-term sustainability goals [5].
A significant amount of academic attention has been devoted to the Bus Bridging Service Design Problem (BBSDP), which primarily aims to ensure the efficient and orderly evacuation of stranded passengers resulting from URT service disruptions. The BBSDP was first formulated by Kepaptsoglou and Karlaftis using a three-stage genetic algorithm approach for route generation, selection, and bus allocation [6]. Subsequent studies developed alternative optimization frameworks: Jin et al. proposed a column-generation and network-flow based method [7]; Wang et al. created a collaborative URT-bus dispatch model [8]; Gu et al. introduced a two-stage WSPT heuristic for time-delay tradeoffs [9]; and Dou et al. minimized combined passenger/bus operation costs [10]. These models can be broadly categorized into deterministic and uncertain models, based on whether or not uncertainty is factored into the optimization process. Deterministic models rely on fixed parameters for planning bus schedules and predominantly adopt a passenger-centric perspective. This includes considerations such as passenger transfer and violation behaviors [11], spatial and temporal passenger demands [12], dynamic passenger flows during disruptions [13], and individual passenger path selection behaviors [14]. These studies collectively contribute a robust theoretical foundation and practical insights for addressing the BBSDP.
However, the effectiveness of the BBSDP in addressing disruptions is significantly impacted by various parameters, including minor fluctuations in bus operating times, which can render a deterministic BBSDP approach suboptimal or even impractical. Consequently, it is imperative to incorporate the impact of uncertainty. Uncertainty modeling represents an extension of deterministic modeling, focusing on the inclusion of uncertainties [15]. Recent studies have identified several uncertainties, such as disruption duration [16], passenger demand [17], and bus running time [18]. For instance, Wang et al. incorporated stochastic commuter demand and modeled it as a batch queuing process with hesitation and repetition characteristics using compound Poisson process theory [19]. Zhang and Lo formulated an optimization problem to determine the optimal initiation time for Shuttle Bus (SB) services under uncertain recovery duration, with the objective of minimizing expected total system costs [3]. Liang et al. developed a bus bridging optimization model addressing bus travel time uncertainty to minimize combined passenger and operational costs [18]. Xu et al. proposed a distributionally robust optimization model for rail transit tactical planning strategy design and disruption tolerance enhancement considering downtime uncertainty [20]. Luo et al. established a stochastic programming model accounting for uncertainties in both commuter demand and spare capacity of existing rail/bus lines, aiming to minimize unmet commuter demand [21]. Chen et al. introduced a robust optimization model based on Passenger Guidance and Extended Bus Bridging Service (E-BBS) to ensure reliable travel guidance and E-BBS solutions, specifically addressing the operational challenges posed by high uncertainty in bus running times [22].
The BBSDP is inherently NP-hard, posing significant challenges in solving large-scale instances. To enhance computational efficiency and identify near-optimal solutions, researchers have devised a range of meta-heuristic algorithms, including genetic algorithms [7], Tabu Search [23], and the Non-Dominated Sorting Genetic Algorithm II (NSGA-II) [9]. These algorithms excel at thoroughly exploring the solution space and maintaining search effectiveness, even when encountering potentially suboptimal or infeasible solutions. For a thorough grasp of research pertaining to bus bridging services in the context of URT disruption management, we recommend consulting the literature review [24]. Furthermore, for the reader’s convenience, representative research outcomes mentioned in this section have been neatly organized and presented in Table 1.
The extant scholarship on bus bridging services predominantly addresses scheduling paradigms within deterministic frameworks, with limited exploration into optimizing bridging modalities under stochastic demand conditions. Although advancements in multi-objective optimization algorithms have been achieved, their computational efficacy necessitates further refinement through integration with emerging technologies such as parallel computing. In response, this study makes three pivotal contributions:
- (1). Development of a Multi-Objective Emergency Bus Scheduling Model: Formulated to concurrently minimize bus transportation costs and passenger travel costs, this model establishes a theoretical foundation for resolving real-world scheduling challenges.
- (2). Innovative Quantification of Demand Uncertainty: By synthesizing interval-valued parameter possibility distributions with robust uncertainty sets, the original optimization problem is reformulated as a computationally tractable robust counterpart model, effectively mitigating the complexities of demand stochasticity.
- (3). Reinforcement-Learning-Based Parallel Genetic Algorithm (RPGA): A hybrid algorithm designed to efficiently solve the bi-objective model, RPGA introduces a novel methodological framework for emergency bus bridging optimization.
The paper is organized with Section 2 detailing the model formulation, including problem description and content of model construction; Section 3 presenting the designed algorithms for solving multi-objective models; Section 4 analyzing the model through a case study and conducting sensitivity analysis among the major factors; and Section 5 summarizing the conclusions of the paper and presenting the future outlook.
2. Mathematical formulation
2.1. Problem description
Fig 1 presents a schematic diagram of a sudden service disruption on a single urban rail transit line. When a service disruption occurs between stations S1 and S3 in S1 File, the affected segment is defined as the disruption segment. The nearest turnback stations flanking this segment, S0 and S4, are dynamically designated for short-turning operations, with trains executing turnbacks at these stations and their adjacent counterparts. The bus bridging service is activated immediately after the urban rail transit disruption and ends once the disruption is resolved. Building on the demonstrated advantages of new energy buses in environmental benefits (zero emissions, low noise), operational efficiency (low energy consumption, high performance), and sustainable development (reduced fossil fuel dependence), this study proposes a bus bridging service route system using new energy buses for adjacent disruption stations, which aligns with urban bus electrification development strategies and low-carbon transportation policy directives.
As illustrated in Fig 2, the temporary closure of Station B has necessitated alternative travel arrangements for affected metro passengers. Commuters can no longer board or disembark at this station, requiring those who intended to depart from Station B to either select a neighboring station as their new starting point, await the station’s reopening, or utilize other transit options. Similarly, passengers traveling to Station B must now choose between alternative destination stations or different modes of public transportation. These decisions are informed by both personal travel experience and official guidance provided by transit authorities regarding available alternatives such as taxis, bike-sharing systems, shuttle services, or walking routes. The disruption has particularly impacted travel patterns, with all affected passengers needing to adjust their journeys based on these modified transportation options.
Metro operators must implement emergency response measures to maintain connectivity when stations are temporarily closed, typically by deploying shuttle bus services that integrate with alternative transport modes including new energy buses, taxis, bike-sharing systems, and pedestrian routes. This disruption scenario presents two key optimization challenges: (1) dynamic scheduling of new energy shuttle buses requiring optimal dispatch frequency and charging plans that minimize operational costs while accounting for variable energy consumption during traffic congestion, and (2) robust passenger flow distribution across available transport options considering uncertain demand patterns and travelers’ mode choice preferences based on time, cost and comfort factors. The problem is formulated as a dual-objective optimization model simultaneously minimizing bus transportation costs (energy consumption and fleet scheduling expenses) and passenger travel costs (waiting time and travel expenses), requiring solutions that maintain system resilience during peak demand fluctuations while ensuring sustainable operation of the temporary transport network.
2.2. Assumptions
The modeling framework is predicated on the following assumptions:
- (1). Bus Stop Location Setting: Candidate bus stop locations are assumed to be set parallel to existing URT stations to facilitate quick transfers for passengers between the two traffic modes.
- (2). Mode Choice: When selecting transfer modes (bus, taxi, bike-sharing, or walking), passengers comprehensively compare time, cost, and comfort rather than making random choices.
- (3). Emergency Bus Type and Capacity: All emergency buses are assumed to be of the same type, with charging costs and maximum passenger capacity treated as fixed values, ignoring fluctuations in electricity prices across different time periods.
- (4). Road Disruption Neglect: Potential disruptions to emergency bus operations—such as traffic accidents or road closures—are not considered.
- (5). Emergency Bus Service Scope: Each emergency bus is restricted to serving only one specific origin-destination (OD) pair and is not allowed to stop at other stations to pick up passengers during its service period, ensuring operational efficiency and accuracy.
2.3. Symbol definitions
The relevant symbols in this paper are defined as shown in Table 2:
2.4. Deterministic modeling
The objective function of this research comprises two components: the first part accounts for the transportation costs of bus bridging services between disrupted stations and other stations, including both energy consumption costs and temporal costs:
where the symbols denote, in order, the link speed, the energy consumption function, and the time cost.
The second component encompasses the total travel costs for all affected passengers using various transportation modes, including temporal costs, economic costs, and comfort penalties:
The mode choice component employs a Random Regret Minimization (RRM) model to capture passengers’ behavioral responses when selecting a travel mode:
where the model combines a deterministic utility term for each mode at each station with a sensitivity parameter that reflects regret-aversion intensity.
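To make the regret-based choice mechanism concrete, the following is a minimal sketch of the classical RRM form, in which the regret of a mode accumulates over every competing mode and attribute, and choice probabilities follow a logit over negative regret. It is not the exact specification used in this paper: the attribute values, the coefficients in `betas`, and the function name are hypothetical, and the paper's sensitivity parameter is not reproduced.

```python
import math


def rrm_probabilities(attributes, betas):
    """Classical RRM choice probabilities (illustrative assumption).

    attributes: {mode: [x_1, ..., x_M]} attribute values per mode
    betas:      [beta_1, ..., beta_M] attribute coefficients
                (negative for "less is better" attributes such as time or cost)
    """
    regret = {}
    for i, x_i in attributes.items():
        total = 0.0
        for j, x_j in attributes.items():
            if j == i:
                continue
            # Regret grows when competitor j beats mode i on an attribute.
            for beta, a_i, a_j in zip(betas, x_i, x_j):
                total += math.log1p(math.exp(beta * (a_j - a_i)))
        regret[i] = total

    # Logit over negative regret; shift by the minimum for numerical stability.
    lowest = min(regret.values())
    weights = {i: math.exp(-(r - lowest)) for i, r in regret.items()}
    norm = sum(weights.values())
    return {i: w / norm for i, w in weights.items()}


# Hypothetical attributes: [travel time in minutes, fare in RMB]
modes = {
    "bridging_bus": [25.0, 2.0],
    "taxi": [15.0, 20.0],
    "shared_bike": [30.0, 1.5],
    "walking": [50.0, 0.0],
}
betas = [-0.08, -0.05]  # hypothetical coefficients
probs = rrm_probabilities(modes, betas)
```

A mode that is no worse than a competitor on every attribute always receives the higher probability under this form, which is the behavior the regret formulation is meant to capture.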
Constraints:
(1) Demand Coverage Constraint: Ensures all passenger demand is assigned to at least one travel mode:
(2) Bus Capacity Constraint: The passenger load on buses cannot exceed the total vehicle supply capacity:
(3) Charging Facility Constraint: The number of charging vehicles cannot exceed the maximum available charging points:
(4) Battery Dynamics Constraint: Tracks real-time battery level changes for electric buses:
(5) Non-negative Frequency Constraint: Bus departure frequencies must be positive or zero:
(6) Passenger Proportion Constraint: The proportion of passengers choosing each mode must be between 0 and 1:
3. Distributionally robust credibility optimization model
When dealing with the challenges of bus bridging services, especially in the face of unexpected event-induced ridership surges, the number of passengers stranded at a station due to differences in travel preferences (e.g., continuing with rail or switching to other modes of transportation) presents great uncertainty. Due to the lack of historical data, it is particularly difficult to accurately predict the passenger demand at each disruption station. For this kind of optimization problem with uncertainty, various solutions have been explored, among which stochastic and robust optimization methods are most commonly used [27]. Compared to stochastic optimization methods, robust optimization methods do not need to rely on specific probability distributions of uncertain data, but rather work on finding solutions that remain optimal in the worst-case scenario. Inspired by the study of Bertsimas and Sim [28], a parametric credibility optimization model is constructed by considering the uncertain passenger demand as an interval-valued fuzzy variable with a possibility distribution. Meanwhile, the decision maker usually requires the credibility of this constraint to satisfy a certain confidence level. The credibility measure Cr is essentially a trade-off operator between the possibility measure Pos and necessity measure Nec. This construction overcomes the excessive optimism of pure possibility constraints while avoiding the over-conservatism of necessity constraints. In the demand model, the equivalent conversion between the confidence level and the credibility constraint is given by:
The credibility constraint implies that, at least at the given confidence level, the number of passengers allowed on board at each station is greater than the passenger demand. Following the definition in the literature [29], the uncertain demand is treated as a parametric interval-valued fuzzy variable whose secondary possibility distribution is bounded by a lower selection variable and an upper selection variable. The two fuzzy parameters governing these bounds are determined by the decision-maker based on expert experience or subjective judgment.
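For reference, the trade-off between possibility and necessity described above can be written out using the standard definitions from credibility theory. The symbols below ($A$ for a fuzzy event, $\xi$ for the fuzzy demand, $C$ for the capacity offered, and $\alpha$ for the confidence level) are assumed notation for illustration and may differ from the notation of the original equations.

```latex
% Standard definitions from credibility theory (notation assumed for illustration)
\mathrm{Nec}\{A\} = 1 - \mathrm{Pos}\{A^{c}\}, \qquad
\mathrm{Cr}\{A\} = \tfrac{1}{2}\bigl(\mathrm{Pos}\{A\} + \mathrm{Nec}\{A\}\bigr)

% Generic credibility (chance) constraint: capacity C covers the fuzzy demand \xi
% with credibility at least \alpha
\mathrm{Cr}\{\xi \le C\} \ge \alpha
```

Because $\mathrm{Cr}$ averages the optimistic measure $\mathrm{Pos}$ and the pessimistic measure $\mathrm{Nec}$, it is self-dual, that is, $\mathrm{Cr}\{A\} + \mathrm{Cr}\{A^{c}\} = 1$.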
In order to characterize the fluctuation of the possibility distribution, a parameter selection variable is introduced to pick one fuzzy variable out of the parametric interval-valued fuzzy variable; its value reflects the decision maker’s optimistic or pessimistic decision-making attitude. The parametric possibility distribution can then be expressed as follows:
where the values of the two bounding parameters can be determined by the decision maker based on expert experience or subjective judgment.
According to the above description, the variation of the selection parameter determines the location of the possibility distribution. That is, a credibility constraint is a set of constraints with a common structure, in which the bounding parameters are fixed and the selection parameter varies within a given set of uncertainty distributions, defined as follows:
Based on this uncertainty set, the distributionally robust bus bridging service model can be expressed as:
The credibility robust optimization model must be transformed into a computationally tractable form. It is assumed that uncertain demands are represented as parametric interval-valued trapezoidal fuzzy variables. The transformation of the model is then carried out in two steps: the analysis of the credibility constraints and the analysis of the robust counterpart.
3.1. Analysis of the credibility constraint
When passenger demand obeys a trapezoidal distribution, the analytical expression for the credibility constraint (12) can be expressed by the following theorem:
Theorem 1: Let passenger demand be expressed as a parametric interval-valued trapezoidal fuzzy variable. If the demand variables are mutually independent, then the credibility constraint (12) can be transformed into one of the following four forms:
① In the first case, the credibility constraint is equivalent to:
② In the second case, the credibility constraint is equivalent to:
③ In the third case, the credibility constraint is equivalent to:
④ In the fourth case, the credibility constraint is equivalent to:
Proof: Since the uncertain demand is a parametric interval-valued trapezoidal fuzzy variable, consider its selection variable. The parametric possibility distribution of this selection variable is:
According to the definition of the credibility measure [30], the credibility constraint can be handled as follows. In the first case:
Thus, the credibility constraint can be rewritten in terms of an auxiliary quantity. Under the first sub-condition, this yields Equation (24):
Solving Equation (24) gives:
Under the second sub-condition, it yields Equation (26):
Solving Equation (26) yields Equation (27).
The proofs of ① and ② are completed, and ③ and ④ in Theorem 1 can be proved in a similar way.
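As a simplified illustration of how a credibility constraint becomes a crisp inequality, the sketch below handles an ordinary trapezoidal fuzzy variable `(r1, r2, r3, r4)` using the standard credibility distribution from credibility theory. It deliberately omits the interval-valued perturbation and the selection parameter treated in Theorem 1, so it should be read as the baseline special case rather than the theorem itself. The function names and the example demand are hypothetical.

```python
def credibility_leq(x, r1, r2, r3, r4):
    """Cr{xi <= x} for a trapezoidal fuzzy variable with r1 <= r2 <= r3 <= r4."""
    if x <= r1:
        return 0.0
    if x <= r2:
        return (x - r1) / (2.0 * (r2 - r1))
    if x <= r3:
        return 0.5
    if x <= r4:
        return (x - 2.0 * r3 + r4) / (2.0 * (r4 - r3))
    return 1.0


def min_capacity(alpha, r1, r2, r3, r4):
    """Smallest capacity x such that Cr{xi <= x} >= alpha (crisp equivalent)."""
    if alpha <= 0.5:
        return (1.0 - 2.0 * alpha) * r1 + 2.0 * alpha * r2
    return (2.0 - 2.0 * alpha) * r3 + (2.0 * alpha - 1.0) * r4


# Hypothetical fuzzy demand for one OD pair (passengers)
demand = (80.0, 100.0, 120.0, 150.0)
required = min_capacity(0.9, *demand)  # about 144 passengers
```

For confidence levels above 0.5 the required capacity depends only on the right shoulder `r3` and the upper bound `r4`, which is why a high confidence level pushes the plan toward the pessimistic end of the demand range.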
To simplify the expression, denote the piecewise functions in Theorem 1 by a common symbol, together with the domain of definition of each piece; the credibility constraint can then be expressed as:
3.2. Analysis of robust equivalence
Since the given set of uncertainty distributions is a continuous set, Equation (29) actually represents a semi-infinite programming model containing infinitely many constraints, which must be transformed into a computationally tractable form. Following the core idea of robust optimization, namely finding a solution that remains feasible for all possible realizations of the uncertain parameters, Equation (29) can be rewritten as:
To solve the above problem, the maximization on the left-hand side of the inequality is converted into its dual problem by means of a duality transformation [31]. The original inequality (24) is then equivalently expressed in piecewise form:
The first two inequalities and the last two inequalities are obtained from the two respective parameter ranges. Based on the above analysis, an equivalent deterministic model can be obtained:
4. Reinforcement learning-based parallel genetic algorithm
Multi-objective optimization problems are highly challenging due to their NP-hard nature, making it difficult for exact algorithms to find feasible solutions within reasonable timeframes. To address this challenge, researchers have developed a series of heuristic algorithms—such as iterative greedy algorithms, evolutionary algorithms, and particle swarm optimization—which demonstrate excellent performance when dealing with problems featuring relatively simple constraints. However, their effectiveness often becomes limited when confronted with highly complex problems. To break through this bottleneck, many scholars have leveraged advanced technologies like parallel computing to further refine these algorithms.
In pursuit of high-quality solutions, this study designs a Reinforcement-learning-based Parallel Genetic Algorithm (RPGA), which differs from existing methods in two key aspects:
- (1). Dynamic Parameter Adaptation via Q-Learning
Traditional genetic algorithms typically rely on fixed rules or empirical formulas to adjust parameters (e.g., linearly decreasing crossover and mutation probabilities), which struggle to adapt to the dynamic changes in complex solution spaces inherent to multi-objective optimization. RPGA innovatively incorporates Q-Learning to dynamically adjust crossover probability, mutation probability, and the methods for crossover and mutation operations [15].
- (2). Heterogeneous Population Architecture
Conventional genetic algorithms employ a single homogeneous population, which is prone to premature convergence to local optima. RPGA adopts a master-slave island model, designing a parallel computing framework that accelerates the evolutionary process through genetic operations and population communication strategies [19].
The fundamental differences between RPGA and NSGA-II are summarized in Table 3:
Fig 3 presents the detailed flowchart of the RPGA algorithm, which starts with population initialization and progresses through multiple stages including genetic operations, population communication, and reinforcement learning-based parameter adaptation, ultimately outputting high-quality solution sets.
4.1. Parallel computing based on master-slave island model
The master-slave island model represents an efficient parallel computing architecture that divides the population into master and slave populations for parallel evolution. During the evolutionary process, the master populations regularly exchange elite individuals with one another, while the slave populations contribute high-quality individuals to the master populations. This setup fosters knowledge sharing and evolutionary dynamics among the populations. Given the complexity of the constraints in the BBSDP (a complex multi-objective optimization problem), it is challenging for a single population to effectively approximate the true Pareto frontier. To address this, we propose the Population Parallel Computing Strategy (PPCS) based on the master-slave island model, as illustrated in Fig 4. This strategy establishes two master populations for multi-objective search and two slave populations for single-objective search, each slave focusing on one of the two optimization objectives of the BBSDP. All populations simultaneously perform genetic operations, including selection, crossover, and mutation, to drive the evolutionary process. During iteration, bidirectional communication is maintained between the master populations, while the slave populations communicate unidirectionally to the master populations. This arrangement enhances the probability of the master populations approaching the true Pareto frontier.
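The communication pattern described above can be sketched as a single migration round. This is an illustrative reading rather than the authors' implementation: individuals are reduced to objective tuples `(f1, f2)` to be minimized, master elites are ranked by the plain sum of objectives (an assumption made only to keep the example short), and the names `migrate`, `best_by`, and `n_elite` are hypothetical.

```python
def best_by(population, key, n):
    """Return the n best individuals under the given key (smaller is better)."""
    return sorted(population, key=key)[:n]


def migrate(masters, slaves, n_elite=2):
    """One communication round in a master-slave island layout.

    masters: list of two populations doing multi-objective search
    slaves:  list of two populations; slave k optimizes objective k only
    Each individual is a tuple (f1, f2) of objective values to minimize.
    """
    # Slaves -> masters (unidirectional): slave k sends its best on objective k.
    for k, slave in enumerate(slaves):
        emigrants = best_by(slave, key=lambda ind: ind[k], n=n_elite)
        for master in masters:
            master.extend(emigrants)

    # Masters <-> masters (bidirectional): each receives copies of the other's elites.
    # Ranking by sum of objectives is a simplification; a real implementation
    # would more likely use non-dominated rank and crowding distance.
    first, second = masters
    elite_first = best_by(first, key=sum, n=n_elite)
    elite_second = best_by(second, key=sum, n=n_elite)
    first.extend(elite_second)
    second.extend(elite_first)
```

In a full algorithm each master would be truncated back to its nominal size after receiving migrants, for example by environmental selection; that step is omitted here.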
4.2. Reinforcement learning to optimize genetic operations
The fixed genetic operation strategy in genetic algorithms often results in the algorithm becoming trapped in local optimal solutions, thereby limiting its global search capability. To address this issue, this study introduces the Q-Learning algorithm from the reinforcement learning mechanism to dynamically adjust the parameters of genetic operations, thereby enhancing the algorithm’s global optimization performance. The state and reward mechanisms of Q-Learning are intimately tied to the evolutionary quality of the population. Utilizing the Q-Table, the algorithm selects the optimal action (i.e., the adjustment strategy for the genetic operation parameters) based on the current state. After each iteration, the evolution of the population is fed back to the Q-Learning algorithm to update the Q values in the Q-Table. Fig 5 illustrates the specific flow of the Q-Learning optimized genetic operation, where the design of the state set, action set, and reward process is crucial for the implementation of the algorithm.
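The Q-Table mechanism can be illustrated with a small tabular Q-Learning sketch. The state labels, the candidate `(crossover probability, mutation probability)` pairs in `ACTIONS`, and the function names are hypothetical placeholders; the paper's actual state set, action set, and reward design are those shown in Fig 5.

```python
import random

# Hypothetical action set: each action is a (crossover prob., mutation prob.) pair.
ACTIONS = [(0.9, 0.05), (0.8, 0.10), (0.7, 0.20)]
# Hypothetical state set describing how the population evolved in the last generation.
STATES = ["improving", "stagnant", "degrading"]


def make_q_table():
    """Q-Table with one row per state and one column per action, initialized to 0."""
    return {state: [0.0] * len(ACTIONS) for state in STATES}


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q_table[state]
    return row.index(max(row))


def update_q(q_table, state, action, reward, next_state, lr, gamma):
    """Standard Q-Learning update toward reward + gamma * max_a Q(next_state, a)."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += lr * (target - q_table[state][action])


q = make_q_table()
update_q(q, "stagnant", 1, reward=1.0, next_state="improving", lr=0.5, gamma=0.85)
best = choose_action(q, "stagnant", epsilon=0.0, rng=random.Random(0))
pc, pm = ACTIONS[best]  # parameters to use in the next generation
```

After each generation the genetic algorithm would compute the reward from the change in population quality, call `update_q`, and then pick the next parameter pair with `choose_action`.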
Based on the algorithmic strategy introduced earlier, the pseudocode is designed as follows:
Algorithm 1 Reinforcement-learning-based Parallel Genetic Algorithm (RPGA)
Input: Population size N; maximum generations G_max; migration rate r_m; migration interval T_m; number of islands K; crossover probability p_c; mutation probability p_m; genetic operators
Output: Approximated Pareto front
Steps:
Initialization
  P ← RandomPopulation(N)
  Partition P into K subpopulations P_1, ..., P_K
  Initialize RL agent with state space S and action space A
  t ← 1, converged ← False
end Initialization
while t ≤ G_max and not converged do
  Parallel Evolution on Islands:
  for each subpopulation P_k in parallel do
    Perform selection: P_k ← Selection(P_k)  ▷ Tournament selection
    Perform crossover: P_k ← Crossover(P_k, p_c)  ▷ SBX crossover
    Perform mutation: P_k ← Mutation(P_k, p_m)  ▷ Polynomial mutation
    Evaluate fitness: EvaluateFitness(P_k)
  end for
  Migration & RL Adaptation:
  if t mod T_m == 0 then  ▷ Ring-topology migration
    for k = 1 to K do
      Migrate individuals from P_k to P_((k mod K) + 1)
    end for
  end if
  ▷ Reinforcement learning adaptation
  for each P_k do
    Compute state: s_k ← ComputeState(P_k)  ▷ State: Population metrics
    if ConvergenceCheck(s_k, threshold = 5%) then
      Set converged ← True
    else
      Compute reward: r_k ← RewardFunction(P_k)  ▷ Based on HV/IGD improvement
      Update policy: a_k ← RLPolicyUpdate(s_k, r_k)  ▷ Adjust p_c, p_m
      Update P_k with action a_k
    end if
  end for
  Increment generation: t ← t + 1
end while
return NonDominatedSort(P_1 ∪ ... ∪ P_K)  ▷ Final Pareto front
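The final step of Algorithm 1 extracts the non-dominated individuals from the merged populations. A minimal sketch of that filtering step for minimization objectives is given below; it returns only the first front rather than a full ranked sort, and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (bus cost, passenger cost) pairs, both to be minimized
candidates = [(3, 9), (5, 5), (9, 2), (6, 6), (9, 9)]
front = non_dominated(candidates)
```

This pairwise check costs quadratic time in the number of points, which is acceptable for the population sizes typical of genetic algorithms.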
5. Numerical experiments
5.1. Network description
The case study considers an operational disruption on Xi’an Metro Line 2 during 08:00–10:00, which lasted 2 hours and affected 6 stations (including two terminal stations, S1 and S6 in S1 File, and four disrupted stations, S2, S3, S4, and S5 in S1 File), for which an efficient bus bridging service must be devised. The bus bridging service, with a rated passenger capacity of 50 people per bus and a departure interval of 5 minutes, aims to replace the disrupted section of the metro line and facilitate passenger travel between the affected stations (S2, S3, S4, and S5 in S1 File). There are five dispatch points in the vicinity ready to provide the necessary transportation. The topology of the metro line is depicted in Fig 6, while Table 4 outlines the number of vehicles available for dispatch from each point and the distances to the respective emergency bus stations.
This study utilizes the Automatic Fare Collection (AFC) system swipe card data from Xi’an Metro Line 2 during May 1–7, 2021, employing systematic data cleaning and preprocessing methods. First, quality control is performed on the raw data to remove incomplete records with missing key fields (e.g., exit station and timestamp information). Second, temporal logic anomalies (such as exit times preceding entry times and other chronological errors) are corrected. Concurrently, abnormal trips are identified and excluded based on trip duration characteristics (outliers with dwell times <1 minute or >4 hours). On this basis, an origin-destination (OD) passenger flow matrix with a 15-minute temporal resolution is constructed to effectively extract spatiotemporal distribution characteristics of historical passenger flows, thereby providing high-quality data foundation for subsequent modeling research.
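The preprocessing pipeline described above can be sketched with the standard library alone. The record layout (keys `origin`, `destination`, `entry`, `exit`) is a hypothetical stand-in for the actual AFC schema, and the sketch simply discards records whose exit precedes entry, whereas the study reports correcting such anomalies.

```python
from collections import Counter
from datetime import datetime, timedelta

MIN_TRIP = timedelta(minutes=1)  # shorter trips are treated as outliers
MAX_TRIP = timedelta(hours=4)    # longer trips are treated as outliers


def build_od_counts(records, bin_minutes=15):
    """Clean trip records and count trips per (origin, destination, time slot).

    records: iterable of dicts with keys origin, destination, entry, exit,
             where entry/exit are datetime objects or None if missing.
    bin_minutes: slot width; assumed to divide 60 evenly.
    """
    counts = Counter()
    for rec in records:
        origin, dest = rec.get("origin"), rec.get("destination")
        t_in, t_out = rec.get("entry"), rec.get("exit")
        if None in (origin, dest, t_in, t_out):
            continue  # incomplete record: a key field is missing
        duration = t_out - t_in
        if duration < MIN_TRIP or duration > MAX_TRIP:
            continue  # outlier; also drops exit-before-entry (negative duration)
        slot_minute = (t_in.minute // bin_minutes) * bin_minutes
        slot = t_in.replace(minute=slot_minute, second=0, microsecond=0)
        counts[(origin, dest, slot)] += 1
    return counts
```

Each key of the returned counter corresponds to one cell of the time-sliced OD matrix, so the matrix for a given slot can be assembled by filtering on the third element of the key.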
According to the construction method of the fuzzy membership function, the trapezoidal parameters estimated from historical passenger flow data are obtained, as shown in Table 5. The uncertain demand is represented as a parametric interval-valued trapezoidal fuzzy variable whose two bounding parameters are determined by the decision maker based on expert experience or subjective judgment. In this study, the values of these two parameters are 0.24 and 0.15, respectively. Meanwhile, the confidence level is set to a prescribed value. To guarantee a high degree of confidence in the evacuation process, we compare the numerical results obtained from the nominal model with those derived from the proposed distributionally robust credibility model. In this comparison, the nominal demand for each origin-destination (OD) pair is set equal to the average value of the trapezoidal parameters.
5.2. Parameter settings
The values of the relevant model parameters are shown in Table 6:
In order to determine the optimal values of the key parameters of RPGA, experiments are conducted on the main population evolution process based on reinforcement learning to determine the optimal parameters. 600 samples are randomly selected for each experiment, and the evaluation index is taken as the mean value of the results of 10 experiments [32]. The optimal parameter configuration for the Q-Learning algorithm is determined as follows: the discount factor is set to 0.85 through grid search (test range [0.7,0.99]); an exponential decay strategy from 1.0 to 0.1 is adopted for the exploration rate; and an adaptive adjustment mechanism from 0.6 to 0.1 is implemented for the learning rate, significantly improving training stability. The optimal parameter values of the algorithm are finally derived as follows: ,
,
,
,
,
,
.
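The Q-Learning settings described above can be sketched as follows. The discount factor and the start and end values of the schedules come from the text. The exact form of the adaptive learning-rate mechanism is not specified, so the sketch assumes the same exponential interpolation for both rates, and the function names and the horizon `total_steps` are hypothetical.

```python
GAMMA = 0.85  # discount factor reported in the text


def exp_decay(start, end, step, total_steps):
    """Exponentially interpolate from `start` (step 0) to `end` (step total_steps)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start * (end / start) ** frac


def q_update(q, state, action, reward, next_state, alpha, gamma=GAMMA):
    """One tabular Q-learning update; `q` maps each state to a list of action values."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def schedules(step, total_steps):
    """Exploration rate (1.0 -> 0.1) and learning rate (0.6 -> 0.1) at a given step."""
    epsilon = exp_decay(1.0, 0.1, step, total_steps)
    alpha = exp_decay(0.6, 0.1, step, total_steps)
    return epsilon, alpha
```

With an assumed horizon of 200 generations, `schedules(0, 200)` returns the starting values and `schedules(200, 200)` returns values close to 0.1 for both rates.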
5.3. Results and discussion
The comparative results between the nominal model and distributionally robust credibility model are presented in Tables 7 and 8. The nominal model dispatched 28 buses to handle station disruptions, generating 58,765 RMB in vehicle transportation costs and 27,580 RMB in passenger travel costs. In comparison, the distributionally robust model only required 23 buses, reducing emergency vehicle costs by 16.1% to 49,326 RMB while incurring a modest 3.1% increase in passenger costs to 28,430 RMB. This represents a reduction of 5 vehicles deployed and significant savings of 9,439 RMB in transportation costs, offset by a slight 850 RMB rise in passenger expenses. By explicitly accounting for demand uncertainty through marginally increased transfer times, the distributionally robust model demonstrates superior robustness in shuttle solutions under uncertain conditions. The findings validate its enhanced adaptability and cost-effectiveness in developing emergency bus scheduling strategies when facing demand fluctuations, making it particularly suitable for practical transit disruption management applications.
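The cost percentages quoted above follow directly from the reported totals. The short check below recomputes them; the helper name `pct_change` is introduced here for illustration only.

```python
def pct_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100


# Cost totals reported in the text (RMB).
nominal = {"bus_cost": 58765, "passenger_cost": 27580}
robust = {"bus_cost": 49326, "passenger_cost": 28430}

bus_saving = nominal["bus_cost"] - robust["bus_cost"]                      # 9439 RMB
passenger_increase = robust["passenger_cost"] - nominal["passenger_cost"]  # 850 RMB
net_saving = bus_saving - passenger_increase                               # 8589 RMB

bus_pct = pct_change(nominal["bus_cost"], robust["bus_cost"])
passenger_pct = pct_change(nominal["passenger_cost"], robust["passenger_cost"])
```

Rounded to one decimal place, `bus_pct` is -16.1 and `passenger_pct` is 3.1, so the combined cost of the robust plan is 8,589 RMB lower than that of the nominal plan.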
5.3.1. Comparative analysis of algorithms.
- 1). Comparative Analysis of RPGA Algorithm versus exact optimization solver CPLEX
This study compares the performance of RPGA with the exact solver CPLEX (results shown in Fig 7). For the bus bridging optimization problem, RPGA shows advantages in three respects: (1) economically, it achieves a 20.6% reduction in bus transportation costs and a 5.7% decrease in passenger travel costs; (2) in terms of resource allocation, it reduces the required number of dispatched buses by 11.5%; (3) in terms of computational efficiency, it shortens the computation time by 37.5% compared to CPLEX. These improvements are primarily attributed to the heuristic strategy adopted by RPGA, which circumvents the combinatorial explosion problem. For large-scale real-time scheduling problems, RPGA significantly improves computational speed while ensuring solution quality.
- 2). Comparison between RPGA Algorithm and Multi-objective Optimization Algorithms
This study compares RPGA with three multi-objective optimization algorithms: NSGA-II [33], Reinforcement Learning-based NSGA-II (RLNSGA-II) [34], and Multi-objective Particle Swarm Optimization Algorithm (MOPSO) [35]. Notably, RPGA’s encoding scheme was also implemented in NSGA-II and MOPSO for fair comparison. To comprehensively evaluate algorithm performance, four established metrics were employed: Hypervolume (HV) measuring solution diversity and convergence, Inverted Generational Distance (IGD) assessing convergence quality, Non-Dominated Ratio (NDR) quantifying solution dominance, and computational TIME recording algorithm efficiency.
① The HV metric provides a comprehensive evaluation of algorithm performance by simultaneously considering three key aspects: cardinality, accuracy, and diversity. A larger HV value indicates superior population quality. The metric is formally calculated as follows:
$$HV = \delta\left(\bigcup_{i=1}^{|S|} v_i\right)$$
where $\delta$ represents the Lebesgue measure, $S$ denotes the non-dominated solution set, $|S|$ is the size of the non-dominated set, and $v_i$ denotes the hypervolume formed between the reference point and the $i$-th solution in the solution set.
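For the two-objective minimization case considered here (bus transportation cost and passenger travel cost), the hypervolume reduces to the area dominated by the front and bounded by the reference point. A minimal sketch is given below; the function name and the example points are hypothetical, and this is not the implementation used in the experiments.

```python
def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded by `ref`, with both objectives minimized."""
    # Keep only points that are strictly better than the reference point in both
    # objectives, sorted by the first objective (ties broken by the second).
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:
            # Add the horizontal strip that this point contributes beyond the
            # region already covered by points with a smaller first objective.
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
        # Otherwise the point is dominated by an earlier one and adds nothing.
    return area
```

For the front `[(1, 3), (2, 2), (3, 1)]` and reference point `(4, 4)`, the function returns 6.0; adding a dominated point such as `(3, 3)` leaves the value unchanged.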
② The IGD serves as a superior metric for evaluating population convergence, where a smaller IGD value indicates better population quality. The metric is computed as follows:
$$IGD = \frac{1}{|P^{*}|}\sum_{x \in P^{*}} d(x, S), \qquad d(x, S) = \min_{y \in S} \lVert x - y \rVert_2$$
where $P^{*}$ denotes the reference set, $|P^{*}|$ represents the cardinality of the reference set, and $d(x, S)$ is the minimum Euclidean distance between a reference point $x$ and all solutions $y$ in the non-dominated solution set $S$.
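The IGD definition above translates into a few lines of code. The sketch below assumes that objective vectors are given as tuples of equal length; the function name and example values are hypothetical.

```python
import math


def igd(reference_set, solutions):
    """Mean distance from each reference point to its nearest obtained solution."""
    total = sum(
        min(math.dist(ref_point, sol) for sol in solutions)
        for ref_point in reference_set
    )
    return total / len(reference_set)
```

For the reference set `[(0.0, 0.0), (3.0, 4.0)]` and the single solution `(0.0, 0.0)`, the distances are 0 and 5, giving an IGD of 2.5. In practice the objectives are usually normalized first so that costs on different scales contribute comparably.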
③ The NDR is employed to compare the quality of non-dominated solutions across different algorithms. A higher NDR value indicates superior algorithm performance. Using RPGA as an example, the calculation proceeds through the following steps:
Step 1: Merge the non-dominated solution sets obtained from running all four algorithms, denoted as $S_{all}$.
Step 2: Perform non-dominated sorting on $S_{all}$ to identify the global non-dominated set $S^{*}$. Let $S^{*}_{RPGA}$ denote the subset of $S^{*}$ originating from RPGA.
Step 3: Let $|\cdot|$ denote the cardinality (size) of a set; then $NDR_{RPGA} = |S^{*}_{RPGA}| / |S^{*}|$.
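The three steps above can be sketched as follows for minimization objectives. The function names, the dictionary layout, and the example fronts are hypothetical; a simple pairwise dominance check stands in for a full non-dominated sorting routine, which is sufficient because only the first front is needed.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_ratio(fronts, name):
    """Share of the global non-dominated set contributed by algorithm `name`."""
    # Step 1: merge all fronts, remembering which algorithm produced each point.
    merged = [(algo, tuple(p)) for algo, pts in fronts.items() for p in pts]
    # Step 2: keep the points that no other merged point dominates.
    global_front = [
        (algo, p) for algo, p in merged
        if not any(dominates(q, p) for _, q in merged)
    ]
    # Step 3: ratio of the algorithm's surviving points to the global front size.
    own = sum(1 for algo, _ in global_front if algo == name)
    return own / len(global_front)
```

With the illustrative fronts `{"RPGA": [(1, 4), (2, 2)], "NSGA-II": [(3, 3), (4, 1)], "MOPSO": [(5, 5)]}`, the global non-dominated set contains three points, two of which come from RPGA, so its NDR is 2/3.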
For the benchmark algorithms, certain parameters were set identical to those of RPGA while others followed values from the literature. Specifically, in NSGA-II the crossover and mutation probabilities matched RPGA’s settings, with other parameters set according to published studies. Similarly, for RLNSGA-II, five key parameters including learning rates and operator probabilities were aligned with RPGA, while the remaining parameters kept literature-based values [28]. MOPSO’s parameters were configured strictly following the implementation in reference [29].
The evaluation protocol consisted of 20 independent experimental runs for each algorithm. Performance metrics including HV, IGD and NDR were averaged across runs and visualized through trend line plots, with computational time (TIME/min) recorded for each method. As shown in Fig 8, our analysis leads to the following key conclusions:
- (1). Overall, RPGA demonstrates superior performance in HV, IGD, and NDR metrics compared to other algorithms while maintaining TIME within an acceptable range. In the early iterations, RPGA’s metrics are comparable to NSGA-II and RLNSGA-II but significantly outperform MOPSO, indicating that genetic evolution-based algorithms are more suitable for MAVRP. As iterations progress, RPGA’s metrics optimize rapidly and show clear advantages over other algorithms, with RLNSGA-II ranking second, demonstrating that reinforcement learning can significantly enhance algorithmic search capability. The consistent superiority of RPGA over RLNSGA-II proves the effectiveness of the master-slave island model’s parallel computing in further improving the quality of non-dominated solutions.
- (2). After approximately 110 iterations, RPGA’s HV, IGD, and NDR metrics become markedly superior to other algorithms. By 160 iterations, RPGA’s metric curves begin to stabilize, whereas other algorithms only start stabilizing after 180 iterations – and even then at inferior metric levels compared to RPGA. This clearly indicates RPGA’s higher search efficiency.
- (3). The computation time (TIME) of RPGA is comparable to RLNSGA-II but slightly longer than NSGA-II and MOPSO, primarily due to the additional computational overhead introduced by reinforcement learning. However, considering the overall performance, RPGA demonstrates significantly superior results in HV, IGD, and NDR metrics compared to other algorithms. Essentially, RPGA achieves higher-quality solutions at the cost of marginally increased computation time.
To compare the performance of RPGA against NSGA-II, RLNSGA-II, and MOPSO in solving the multi-objective scheduling model, a series of experiments was conducted; the results are shown in Table 9. Regarding bus transportation costs, RPGA achieved the best result at 49,326 RMB, a 16.1% reduction compared to the second-best performer NSGA-II (58,765 RMB). For passenger travel costs, RPGA attained 28,430 RMB, a 3.9% improvement over NSGA-II (29,580 RMB), demonstrating simultaneous optimization of both cost dimensions. Furthermore, RPGA required only 23 buses for scheduling, 4.2% fewer than MOPSO (24 buses) and the lowest among all algorithms, confirming its capability for resource-efficient scheduling.
5.3.2. Impact of confidence level on optimal shuttle decision-making.
The above results were obtained under a confidence level . To further investigate the impact of
on the optimal shuttle decision, we fixed parameters
and
and varied
within the interval [0.5,0.9] with a step size of 0.1. To evaluate the utility of the distributionally robust credibility model under different uncertain demand scenarios, we considered a surge in passenger flow at disrupted stations by selecting nominal demand deviation values of 0.20, 0.25, 0.30, 0.35, and 0.40 to represent varying degrees of demand fluctuations. The computational results are presented in Table 10, where the expected unmet rate and expected occupancy rate are weighted averages calculated under both nominal and maximum deviation demand values.
From Table 10, it can be observed that for any fixed deviation value, the expected unmet rate of the distributionally robust credibility model increases as the confidence level decreases, while the expected occupancy rate decreases. This implies that higher confidence levels require additional costs from bus operators, such as providing more shuttle vehicles or increasing service frequency, to meet a larger proportion of passenger demand. Conversely, for any fixed confidence level
, both the expected unmet rate and expected occupancy rate increase with higher deviation values. This indicates that greater demand variability leads to higher vehicle occupancy, but when vehicles reach capacity, the number of unmet passenger demands also rises. To mitigate potential risks, the maximum confidence level
across different paths was used to calculate the expected unmet rate and expected occupancy rate, ensuring robustness under extreme demand fluctuations.
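To make the two indicators concrete, the sketch below computes an expected unmet rate and an expected occupancy rate as weighted averages over the nominal demand and the maximum-deviation demand. The definitions of the two rates, the equal weights, the form of the maximum-deviation demand (nominal demand scaled by one plus the deviation), and all numbers are assumptions made for illustration; the paper's exact formulas are not reproduced here.

```python
def unmet_rate(demand, capacity):
    """Share of demand that cannot be carried by the available capacity."""
    return max(demand - capacity, 0.0) / demand


def occupancy_rate(demand, capacity):
    """Share of the available capacity that is actually used."""
    return min(demand, capacity) / capacity


def expected_rates(nominal, deviation, capacity, w_nominal=0.5):
    """Weighted averages of both rates under nominal and maximum-deviation demand."""
    worst = nominal * (1 + deviation)  # assumed form of the maximum-deviation demand
    w_worst = 1 - w_nominal
    unmet = (w_nominal * unmet_rate(nominal, capacity)
             + w_worst * unmet_rate(worst, capacity))
    occupancy = (w_nominal * occupancy_rate(nominal, capacity)
                 + w_worst * occupancy_rate(worst, capacity))
    return unmet, occupancy
```

For a hypothetical nominal demand of 1,000 passengers, a capacity of 1,100, and a deviation of 0.20, the expected unmet rate is about 4.2% and the expected occupancy rate is about 95.5%. Raising the deviation while holding capacity fixed increases both values, which matches the qualitative pattern described for Table 10.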
Fig 9 illustrates the trends of expected occupancy rate and expected unmet rate under different demand deviation values across five confidence levels. As shown in Fig 9a, the growth rate of the expected occupancy rate decreases as the confidence level increases, indicating that shuttle vehicles tend to reach full capacity as demand rises. This suggests that higher confidence levels require more resources to accommodate increasing passenger demand, but the marginal gain in occupancy diminishes. In contrast, Fig 9b reveals that the growth rate of the expected unmet rate increases with higher confidence levels. This implies that lower confidence levels result in a higher proportion of unmet demand due to insufficient capacity, highlighting the instability of solutions under such conditions.
5.3.3. Rated passenger capacity analysis.
The relationship between the rated passenger capacity and bus transportation costs and average passenger delays is explored, as shown in Fig 10. In a certain interval, as the rated passenger capacity increases, the bus transportation cost tends to increase, while the average passenger delay decreases accordingly. This indicates that the adjustment of rated passenger capacity has a significant effect on the implementation of the rescue program.
6. Conclusion
This paper investigates the BBS in response to URT disruptions and introduces a bus bridging model that incorporates the uncertainty of passenger demand. A multi-objective emergency bus scheduling model is developed, aiming to minimize both bus transportation costs and average passenger delays. Given the significant uncertainty in the number of passengers stranded at stations, the uncertain demand is modeled as a trapezoidal fuzzy variable, leading to the establishment of a distributionally robust credibility optimization model. To solve this model, a parallel genetic algorithm enhanced by reinforcement learning is proposed. The study is validated using a case study of the Xi’an Metro Line 2 section between Cao Tan Station and Chang Ning Gong Station during the morning peak hours of 07:00–09:00. The key findings are as follows:
- (1). Comparative analysis between the nominal model and the distributionally robust credibility model reveals that the latter achieves superior operational efficiency: it reduces the number of dispatched vehicles by 3 (12.5% decrease), lowers emergency transportation costs by 9,439 RMB (16.1% reduction), while accepting a marginal 3.1% increase (850 RMB) in passenger costs. This trade-off stems from the model’s explicit consideration of demand uncertainty, which yields more robust shuttle solutions through strategically extended transfer times.
- (2). Comprehensive evaluation demonstrates RPGA’s dominance across all performance metrics: it achieves 4% higher hypervolume (HV), 20% improved inverted generational distance (IGD), and 2.5 × greater non-dominated ratio (NDR) compared to NSGA-II, RLNSGA-II and MOPSO algorithms, with only a 10–15% computational time overhead. These advantages translate to concrete operational benefits: 49,326 RMB in transportation costs (16.1% lower than NSGA-II’s 58,765 RMB) and 28,430 RMB in passenger costs (3.9% reduction versus NSGA-II’s 29,580 RMB), conclusively validating its superiority in multi-objective optimization.
- (3). The trends of the expected occupancy rate and the expected unmet rate under different demand deviation values at five confidence levels are analyzed. As the confidence level increases, the growth rate of the expected occupancy rate decreases, indicating that shuttle vehicles tend to reach full capacity as demand rises. As the confidence level increases, the growth rate of the expected unmet rate increases, indicating that when the confidence level is low, larger passenger demand leads to a higher proportion of unserved demand and poorer solution stability.
- (4). Analysis of the rated passenger capacity of emergency vehicles reveals that within a certain range, an increase in rated capacity leads to a corresponding rise in bus transportation costs and a decrease in passenger travel costs, demonstrating that rated passenger capacity significantly impacts rescue plans.
The proposed model has several areas that warrant further improvement. Firstly, this study focuses solely on disruptions occurring on a single URT route. Future research could expand this scope by developing a model that addresses disruptions across an entire URT network. Secondly, the current approach designs the bridging service in a static environment, which overlooks the dynamic nature of passenger flow propagation and public transport interactions within the spatiotemporal network. To address this limitation, future work could explore the design of bus bridging services within a spatiotemporal network framework, enabling a more accurate representation of real-world conditions.
Supporting information
S1 File. May 1st–7th Metro Line 2 OD Statistics Table.
https://doi.org/10.1371/journal.pone.0333686.s001
(XLSX)
References
- 1. Liu T, Koutsopoulos HN, Ma Z. Modeling the duration of the impact of unplanned disruptions on passenger trips using smartcard data in urban rail systems. Urban Rail Transit. 2023;9(3):266–79.
- 2. Sui X, Yan H, Pan S, Li X, Gu X. Bus system optimization for timetables, routes, charging, and facilities: a summary. Dig Transport Saf. 2025;4(1):1–9.
- 3. Zhang S, Lo HK. Metro disruption management: optimal initiation time of substitute bus services under uncertain system recovery time. Transport Res Part C. 2018;97:409–27.
- 4. Zheng S, Liu Y, Lin Y, Wang Q, Yang H, Chen B. Bridging strategy for the disruption of metro considering the reliability of transportation system: Metro and conventional bus network. Reliabil Eng Syst Saf. 2022;225:108585.
- 5. Pender B, Currie G, Delbosc A, Shiwakoti N. Disruption recovery in passenger railways. Transport Res Record. 2013;2353(1):22–32.
- 6. Kepaptsoglou K, Karlaftis MG. The bus bridging problem in metro operations: conceptual framework, models and algorithms. Public Transp. 2009;1(4):275–97.
- 7. Jin JG, Teo KM, Odoni AR. Optimizing bus bridging services in response to disruptions of urban transit rail networks. Transport Sci. 2016;50(3):790–804.
- 8. Wang Y, Yan X, Zhou Y, Zhang W. A feeder-bus dispatch planning model for emergency evacuation in urban rail transit corridors. PLoS One. 2016;11(9):e0161644. pmid:27676179
- 9. Dou X, Wang H, Meng Q. Parallel shuttle bus service design for planned mass rapid transit shutdown: the Singapore experience. Transport Res Part C. 2019;108:340–56.
- 10. Gu W, Yu J, Ji Y, Zheng Y, Zhang HM. Plan-based flexible bus bridging operation strategy. Transport Res Part C. 2018;91:209–29.
- 11. Zhang Z, Li X, Zhang J, Shi Y. Optimizing bus bridging service considering passenger transfer and reneging behavior. Sustainability. 2024;16(23):10710.
- 12. Shao L, Song L. An optimization approach considering passengers’ space-time requirements for bus bridging service under URT disruption. J Adv Transport. 2022;2022:1–12.
- 13. Wang J, Yuan Z, Yin Y. Optimization of bus bridging service under unexpected metro disruptions with dynamic passenger flows. J Adv Transport. 2019;2019:1–13.
- 14. Zhu Y, Jin JG, Wang H. Path-choice-constrained bus bridging design under urban rail transit disruptions. Transport Res Part E. 2024;188:103637.
- 15. Wang X, Jin JG, Sun L. Real-time dispatching of operating buses during unplanned disruptions to urban rail transit system. Transport Res Part C. 2022;139:103696.
- 16. Tan Z, Xu M, Meng Q, Li Z-C. Evacuating metro passengers via the urban bus system under uncertain disruption recovery time and heterogeneous risk-taking behaviour. Transport Res Part C. 2020;119:102761.
- 17. Kuo Y-H, Leung JMY, Yan Y. Public transport for smart cities: Recent innovations and future challenges. Eur J Operation Res. 2023;306(3):1001–26.
- 18. Liang J, Wu J, Qu Y, Yin H, Qu X, Gao Z. Robust bus bridging service design under rail transit system disruptions. Transport Res Part E. 2019;132:97–116.
- 19. Wang Y, Guo J, Currie G, Ceder A, Dong W, Pender B. Bus bridging disruption in rail services with frustrated and impatient passengers. IEEE Trans Intell Transport Syst. 2014;15(5):2014–23.
- 20. Xu L, Ng TS, Costa A. Optimizing disruption tolerance for rail transit networks under uncertainty. Transport Sci. 2021;55(5):1206–25.
- 21. Luo C, Xu L. Railway disruption management: Designing bus bridging services under uncertainty. Comput Operations Res. 2021;131:105284.
- 22. Chen J, Du B, Hu H, Yin Y, Peng Q. Urban rail transit disruption management based on passenger guidance and extended bus bridging service considering uncertain bus running time. Expert Syst Applications. 2024;249:123659.
- 23. Wang Y, Zhou Y, Yang H, Yu B, Liu X. Optimizing bus bridging services with mode choice in response to urban rail transit emergencies. Eur J Operational Res. 2025;323(1):108–24.
- 24. Zhang S, Lo HK, Ng KF, Chen G. Metro system disruption management and substitute bus service: a systematic review and future directions. Transport Rev. 2021;41(2):230–51.
- 25. Wang Y, Zhou Y, Yang H, Yan X. Integrated optimization of bus bridging service design and passenger assignment in response to urban rail transit disruptions. Transport Res Part C. 2023;150:104098.
- 26. Feng Y, Ceder A, Zhang S, Cao Z. Bus routing fine-tuning for integrated network-based demand and bus bridging for a disrupted railway system. Expert Syst Applications. 2024;242:122825.
- 27. Wang Z, Chen B, Wang J, Kim J, Begovic MM. Robust optimization based optimal DG placement in microgrids. IEEE Trans Smart Grid. 2014;5(5):2173–82.
- 28. Bertsimas D, Sim M. The price of robustness. Operations Res. 2004;52(1):35–53.
- 29. Liu Y, Liu YK. The lambda selections of parametric interval-valued fuzzy variables and their numerical characteristics. Fuzzy Optimization Dec Making. 2016;15(3):255–79.
- 30. Liu B, Liu Y-K. Expected value of fuzzy variable and fuzzy expected value models. IEEE Trans Fuzzy Syst. 2002;10(4):445–50.
- 31. Yang M, Liu Y, Yang G. Multi-period dynamic distributionally robust pre-positioning of emergency supplies under demand uncertainty. Appl Math Modell. 2021;89:1433–58.
- 32. Fang W, Li X, Lin Z, Zhou J, Zhou T. Mixture correntropy with variable center LSTM network for traffic flow forecasting. Dig Transport Saf. 2024;3(4):264–70.
- 33. Han J, Xue Q, Yuan Y, Yang X, Wu J. Integrated optimization of train stopping plans and timetabling for urban rail transit based on dynamic demand. Dig Transport Saf. 2025;4(1):42–9.
- 34. Bora TC, Mariani VC, Coelho L dos S. Multi-objective optimization of the environmental-economic dispatch with reinforcement learning based on non-dominated sorting genetic algorithm. Appl Thermal Eng. 2019;146:688–700.
- 35. Liu L-L, Wan X, Gao Z, Li X, Feng B. Research on modelling and optimization of hot rolling scheduling. J Ambient Intell Human Comput. 2018;10(3):1201–16.