StationRank: Aggregate dynamics of the Swiss railway

Increasing availability and quality of actual, as opposed to scheduled, open transport data offers new possibilities for capturing the spatiotemporal dynamics of railway and other networks of social infrastructure. One way to describe such complex phenomena is in terms of stochastic processes. At its core, a stochastic model is domain-agnostic and algorithms discussed here have been successfully used in other applications, including Google’s PageRank citation ranking. Our key assumption is that train routes constitute meaningful sequences analogous to sentences of literary text. A corpus of routes is thus susceptible to the same analytic tool-set as a corpus of sentences. With our experiment in Switzerland, we introduce a method for building Markov Chains from aggregated daily streams of railway traffic data. The stationary distributions under normal and perturbed conditions are used to define systemic risk measures with non-evident, valuable information about railway infrastructure.

In particular the theoretical motivation is thoroughly explained and alternative literature is provided in more detail. We also placed emphasis in refactoring our source code towards an improved statistical analysis and visualization of the concepts discussed.
To facilitate your review of our revisions, we provide a point-by-point response to the questions and comments delivered in your decision letter and its accompanying attachment. The responses to the concerns raised are below and are color coded as follows: a) Comments from editor or reviewers are shown as red b) Our responses are shown as italic.
We hope that our edits and the responses we provide satisfactorily address all the issues and points you and the reviewers have raised. The reviewers' comments were very helpful overall, and we are appreciative of such constructive feedback on our original submission. After addressing the issues raised, we feel the quality of the paper is much improved.
PONE-D-  Responses to reviewer #1: The manuscript "StationRank: Aggregate dynamics of the Swiss railway" by Georg Anagnostopoulos and Vahid Moosavi presents a Markov Chain (MC) framework to analyze daily aggregated itineraries of the Swiss railway systems. They use their MC framework to asses the congestion, resilience and fragility of the railway network.
I think that this framework is interesting. It is simple and does not require much computation power and allows one to draw interesting insights from the data. However, this manuscript is not fit for publication in its current state. My main issues are the following: -The authors do not place this study in context. There are many existing tools for the analysis of public transportation that can be much more detailed (using agent-based model for example, as in Manser et al. 2020). In the introduction, the authors should explain what it is they want to do, summarize previous works, explain why their approach is relevant and what is their contribution. In the conclusion, they could elaborate on the limitations of their approach and on possible improvements.

Response:
The present study provides a stochastic framework for studying interactions between localities (stations) that are connected via railway infrastructure. At the center of our analysis is the station, not the train. We construct rankings of stations by means of aggregating daily train flows. In terms of scope, we cover the whole Swiss railway network over a period of one month. Our hypothesis is that we can obtain a quite detailed idea of the spatiotemporal dynamics involved just by utilizing a single openly accessible data set, at low computational cost and without assumptions involving substantial domain-specific knowledge.
Existing parametric tools, such as agent-based models, have been applied to many engineering problems related to complex systems, including analysis of public transportation. Nevertheless, their success relies on the acceptance of a predefined set of properties (Moosavi 2017) whose gradual increase in pursuit of realism is bound by the curse of dimensionality (Bellman 2015 given .
There are also many smaller issues that I list below: -eq. 1, define S and x Response: denotes the position at time step of a particle following a random walk. According to Crisostomi et al. (2011) "vehicles do not actually follow a random walk; however, by analogy with similar approaches like the PageRank model … , the underlying stochastic process is viewed as a mechanism for obtaining useful information, rather than a literal representation of the behaviour of a single vehicle".
-p. 2: How is P_ij computed? How is the daily data aggregated?

Response:
Regarding the aggregation, we group stops by trip id and day of operation, a property that resides in the actual data. A day of operation is not the same as a calendar day. Consider the following: a single trip around midnight might span two calendar days, but occurs in the same day of operation.

such as rail tracks) with no other station in between, then -space is actually a space-of-stations (Kurant and Thiran 2006). Space-of-stops coincide with space-of-stations when vehicles stop at every station. An extended -space allows multiple weighted links between nodes.
In -space any two stations, consecutive or not, that share at least one common route are linked. This representation is also called space-of-changes ( PONE-D-20-17563 -p. 4: You should also discuss the fact that all the quantities you use are evaluated at stationarity, but the system is actually never stationary. What does the stationary distribution represents in this case?

Response:
The term stationary in the context of Markov Chains is not referring to a stationary system. For an example of a stationary model, see Dorbritz (2012). Here the stationary distribution represents occupancy of stations at a theoretical state of equilibrium, it shows a trend. Generally, is just one property of the underlying Markovian system as described by the transition probability matrix .
The matrix in turn can support much more demanding analysis such as spectral clustering, calculation of mean first passage times, network resistance estimation, link recommendation, etc.
-p. 4: About the Kemeny constant, the authors write that it can be described as the "average expected time (steps) from any given state (origin) to any other random state (destination)". This is interesting, but surely, the "expected time from any given state to any other random state" has a very heterogeneous distribution. It would be interesting to also have a measure of the standard deviation of this distribution. PONE-D-20-17563 -p. 4: "For the Swiss railway network, we calculated that K ≈ 30.5, ±1 minute." How is the error (and average) calculated?

Response:
We calculate time series of K, with the two methods previously described, and we normalize them by the respective aggregation times. The reported result was true for a smaller network with fixed aggregation times of 1440min (24 hours). Now we consider a total of 1660 stations and aggregate by day of operation rather than calendar day, so times also vary. The new result is quite interesting in that it shows a hard upper boundary of 41min. In a way, bigger networks are "slower".
-p. 5: "Linear reduction of a station's activity by multiple perturbations of the transition probability matrix is more suitable for a real-world network than simple node removal." Could you explain why?

Response:
Perturbation of the transition probability matrix enables link capacity reduction (Dobritz 2012). In many real-world scenarios only certain connections of a station fail or switch to reduced operation. Response: In Fig 4 and Fig 5 we show the dates. Regarding the colors, we have the following mapping: • remoteness, inflow, outflow, stationary distribution, fragility, influence→ CET-L19 1 • standard deviation → CET-L2 • spectral clustering → turbo • perturbations → seismic PONE-D-20-17563 -p. 5: "In the first weekend, high probabilities were redistributed in the west part of the country. Furthermore, some of the lowest probabilities were reduced near the end of the month." This is not clear at all on Fig. 3. You could show of the variation of pi in the East and West.

Response:
We replaced the time series in Fig 3 with a single day view. The variation of is shown in Fig 7. -p. 5: "namely the fact that the first weekend of October developed a completely distinct dynamic which might be related to the beginning of the Swiss autumn holidays." We see a clear shift, and this is quite nice, but I would not say that it is "a completely distinct dynamic". It would be nice if you could assess if it is related to the beginning of the Swiss autumn holidays by looking at some other statistics in the data that does not rely on the MC framework.

Response:
The distinct dynamic characterization is omitted in the revised manuscript. The patterns initially observed, persist also in the current version that analyzes considerably more data. The observation is consistent among multiple measures, including non Markovian ones such as inflow and outflow. This is clear in Fig 7 and can be easily verified with our app: https://stationrank.herokuapp.com/.
-p. 6: "Known network measures, such as various centrality measures [8, 13, 21], are purely based on network structure and often fail to address dynamic aspects of flow and time which are crucial when evaluating the real behavior of the system and the related risks." This is not true, most centrality measures are based on an underlying diffusion process. E.g. degree centrality can be seen as the ranking given by the stationary distribution of a random walk. Same of page rank. Moreover, here you assess systems at stationarity, so you are not really considering the dynamic of the system apart from daily evolution of the stationarity.

Response:
Certainly the stationary distribution can be viewed as a kind of centrality measure. Here the terminology might seem counter-intuitive, but stationarity is actually just a property of a compact dynamical system representation, in this case of a Markov Chain (Froyland 2001). We do not actually consider the evolution of stationarity, but rather approximate a non-linear dynamical system with a series of linear operators.

Response:
In the current version we place more emphasis on better explaining the inner workings of the Kemeny constant itself, thus omitting any references to ΔΚ or the Braess Paradox. PONE-D-20-17563 -p. 7: "This is a striking finding that suggests a link between systemic fragility and network growth, a claim which needs to be verified also for other networks in further research." Interesting, but there is no need to use bold.

Response:
Indeed, bold is not needed and has been removed.
- Fig. 6: Give the names of a few stations.

Response:
We added the names and also some provision to avoid overlaps: https://stationrank.herokuapp.com/.
- Fig. 7. Do not use divergent colourmaps for influence and fragility.

Response:
See previous response with regards to Fig. 3 - While the article introduces virtually no new theoretical concepts to the problem, it does present an interesting adaptation of the analytical approach to actual historical data for the Swiss railroad network. The article is well written and easy to follow, however the theoretical treatment of some of the concepts utilized in the study are glossed over rather quickly. I suggest the authors take the following comments/suggestions into consideration to address these shortcomings: General comments: 1. While the study focuses on railway traffic, the analysis, as I understand it, was based on trains data. As such the data does not actually represent passengers or goods flow on the network, only trains. While the authors make no claims that their analysis considers actual passengers' or goods traffic, this should be noted in the data section to help ground the reader on what the analysis actually considers. To understand why this clarification is important, consider the following: a perturbation of say +10% passenger demand homogeneously added to the network can be addressed by many means, one of which could be to increasing trains capacity homogeneously via additional cars; and thus the network analysis, as I believe, will show no impact from the additional flow as the number of trains and their schedules remains constant.

Response:
We absolutely agree. This important clarification is now part of our data section (line 104-105).
In future research, it would be interesting to quantify the flows for example in financial terms by combining different data. Still with the detail level available in the data, we show that it is possible to gain interesting insights about the system. 3. The provided resolutions for Figures 3, 4, 5, 7, and 8 do not allow the reader to properly analyze the presented data. In the case of this reviewer, I was able to use the corresponding figures in the online notebook, however better resolutions or different visualization approaches should be used.

Response:
For figures 3, 4 and 5 following rule applies: When temporal evolution is less pronounced as in Fig 3, we simply show a single day view and move the respective time series plot in the supporting information section. When there is interesting temporal trends as in Fig 4 and Fig 5, we show the time series plot (in better resolution) and include a selected single day view in the supporting information section. In Fig 7 we have changed colors and we have optimized the layout in order to maximize space for the plots, same in Fig 8 with bigger plots. 4. While I applaud the authors for making their analysis and results available online for the reader, and encouraging the reader to work with the open source code in the cited repository, I believe that the majority of the readership may not have the skillset or the time needed to get familiar with and run through such a lengthy analysis code. May I suggest that for the perturbation analysis, the authors provide interactive plots (using bokeh library or Jupyter widgets, as appropriate) in a separate notebook to help the readers experiment with the model results. This would significantly increase the readership's access to your analysis and results, beyond your article.

Response:
Thank you for this inspiring comment. Based on your insightful suggestions, we have refactored our source code 2 extensively in order to strengthen readability, make it more reproducible and also support appropriate visualization frameworks. We provide interactive plots of our results, including risk plots and perturbation analysis, in a separate notebook: https://bit.ly/2I4uAs3. This notebook can be automatically converted to an app: https://stationrank.herokuapp.com/.
Specific comments: 5. Line 26: Why is there a need to ensure comparability? Shouldn't the analysis include the entire network and if a route/station is not used on a given day it will show accordingly in the results? What happens to the discarded stations in the analysis? How does this impact the accuracy of the analysis? What does the 75% coverage figure actually mean? Response: The non-k th entries are equal to zero.
10. Line 108: if the teleportation concept is applied, all P ij will be < 1. Hence the statement "if P ip = 1…" is not needed in this analysis.

Response:
The teleportation concept has been abandoned as it resulted to dense matrices. There are quite a few cases where in the sparse scenario, so the statement is indispensable.
11. The generalized equation (7) is not clearly discussed. It appears S is used here to indicate a subset of inbound (direct and indirect) states from the total set of states, which is again indicated by S. If so, different symbols must be used to establish the differentiation. If not, why would equation (7) consider all states? The generalization from equation (6) from Crisostomi et at. to (7) should be discussed in clearer terms as it is a new assertion made by the authors.

Response:
Your first interpretation of the equation is correct. In this context indicates a subset of inbound states. We establish the differentiation by replacing with . We now discuss our generalized equation more thoroughly in lines 216-224.
12. Line 118: is the 95% homogeneous reduction applied to all incoming links to a particular station, a subset of stations, or all stations in the model? All at once or in a sequence of separate scenarios?

Response:
Here we discuss different scenarios. The actual perturbation is performed on all incoming links.
13. Perpetuation of 95% is a relative measure; applied to a busy hub will emphasize the impact and applied to a remote station will deemphasize the impact. This fact should be mentioned in the article.

Response:
Yes, absolutely. We integrated this remark in lines 225-227.
14. Figure 3 contains a large amount of information such that it is difficult to discern its implications to the network -especially that the provided resolution is not suitable for such a task. I suggest the authors should limit this figure to a single day and show the network congestion and details. A second alternative could be to add the single day figure prior to this figure, and then discuss the findings from Figure 3 in the text in more detail than what is provide in the paragraph between lines 137-141.

Response:
We have replaced Fig 3 with  15. A similar comment to that of Figure 3 can be made on Figure 4. Consider a single day view.
Response: In Fig 4 the temporal evolution is much more interesting than in Fig 3, so we keep the overview in the main article and include a single day view in the appendix.
16. The perturbation analysis methodology (steps to generate the results) should be made clearer to the reader. It is not clear if Figures 6 and 7 flow from the analysis done in Figure 5 with a disruption at the Bern station, or another approach was utilized. It is important to highlight the scenario(s) being shown to enable the reader to understand the data being presented.

Response:
Yes we agree. 17. The authors should briefly describe the terms systematic influence and systematic fragility and how it applies to the problem at hand. What do they mean in this problem? It is unclear how Wij can be greater than 1, if it is defined as a measure of difference between stationary distribution values that each can only take values between 0 and 1. This part should be edited to clarify the presented concepts.

Response:
The central hypothesis behind systemic influence is that certain stations with relatively low might disproportionately affect the network when disrupted. Descriptive statistics also show that this measure, in contrast to centrality measures, can reveal influential groups of stations along certain paths, see Fig 7. Systemic fragility measures the vulnerability or exposure of a station to disruptions that occur elsewhere in the network. Stations in densely connected areas can better absorb shocks than stations that are highly dependent on a few other nodes. Regarding the mathematical formulation, we do not actually claim that .