Circuity analyses of HSR network and high-speed train paths in China

Circuity, defined as the ratio of the shortest network distance to the Euclidean distance between one origin–destination (O-D) pair, can be adopted as a helpful evaluation method of indirect degrees of train paths. In this paper, the maximum circuity of the paths of operated trains is set to be the threshold value of the circuity of high-speed train paths. For the shortest paths of any node pairs, if their circuity is not higher than the threshold value, the paths can be regarded as the reasonable paths. With the consideration of a certain relative or absolute error, we cluster the reasonable paths on the basis of their inclusion relationship and the center path of each class represents a passenger transit corridor. We take the high-speed rail (HSR) network in China at the end of 2014 as an example, and obtain 51 passenger transit corridors, which are alternative sets of train paths. Furthermore, we analyze the circuity distribution of paths of all node pairs in the network. We find that the high circuity of train paths can be decreased with the construction of a high-speed railway line, which indicates that the structure of the HSR network in China tends to be more complete and the HSR network can make the Chinese railway network more efficient.


Introduction
In recent years, high-speed rail (HSR) in China has developed rapidly into a large-scale network. An HSR network consists of a number of railway lines, and each high-speed railway line is a reasonable train path. Among these lines, a passenger transit corridor can be identified when a line not only serves a large passenger demand but also has a low indirect degree of train path. On the high-speed railway network, in addition to trains running along a single railway line, a large number of cross-line trains are operated. The operation of cross-line trains can decrease the transfer ratio and transfer times, and improve the service level considerably.
The construction of a railway line is constrained not only by geographical conditions along the line, but also by passenger demand. Although many conventional railway lines have been constructed along passenger transit corridors, high-speed railway lines may not be designed along all these corridors. Therefore, the indirect degree of a minority of O-D pairs on the HSR network would be much higher than that on the conventional rail network. Generally, we consider that high-speed trains only run on the HSR network in China [1]. As circuity on the HSR network increases, people would choose conventional railways, highways, or air travel instead of high-speed railway. On the HSR network in mainland China completed at the end of 2014 (see Fig 1), the length of the shortest path between Chengdu and Xi'an is 2244 km, passing through Chongqing North, Yichang East, Wuhan, and Zhengzhou Railway Stations, whereas the paths between this O-D pair are about 714 km and 866 km on the highway network and conventional rail network, respectively, and the spatial distance is only 510 km. It is unreasonable to operate high-speed trains between Chengdu and Xi'an because of the high path circuity.
For a given O-D, the shortest path is normally utilized as the high-speed train path. However, it is not reasonable to operate trains between any O-D pairs because some O-D pairs may have high indirect degree. Le Corbusier [2] concluded that a man selected the path with a lower indirect degree when he had a definite destination.
Indirect degree can be an evaluation criterion for train paths, but it has not been used in the research on train-path selection. Circuity, the ratio of path length to Euclidean distance, is used to measure the indirect degree, and has been widely applied in the research on urban roadways and public transit networks. On the basis of circuity, weighted circuity, which considers the number of passengers, is used to measure how efficiently the transportation network is designed to meet the passenger demands (Lee [3], Giacomin and Levinson [4]). Levinson and El-Geneidy [5] randomly selected a number of O-D pairs in 20 metropolitan areas in the United States and calculated the circuity of the shortest paths between the O-D pairs. Meanwhile, they investigated some workers and calculated the circuity between their residences and work locations. They found that the circuity of home-work pairs is lower than that of the random O-D pairs. After calculating the circuity of random O-D pairs in 51 well-known metropolitan areas in the United States, Giacomin and Levinson [4] indicated that circuity in an urban road network gradually increased over time, and they concluded that short trips are more circuitous than long trips by proposing a function describing the relationship between circuity and distance. Huang and Levinson [6] discovered that the circuity of the transit network was higher than that of the road network and indicated that circuity can explain the mode choice share of commuters. Lee et al. [7] utilized circuity to analyze the transit network indirect degrees of five metropolises in Korea and found that metropolises possessed lower circuity than smaller cities. They also considered that population density significantly impacted the urban transit network circuity because of the strong relationship between passenger flow and circuity.
Based on a wide range of GPS travel survey and road network data, Dhakar and Srinivasan [8] developed a travel-path-selection model considering circuity and intersection. Poken [9] added the circuity to constraint conditions for the distribution path algorithm and found that circuity controls were more effective than standard least-cost insertion heuristics.
Circuity has also been used in the research of aviation networks. Russon and Hollingshead [10] demonstrated that reasonable hub planning could decrease the degree of passenger travel circuitousness and enhance the advantage of regional air transportation so as to avoid the loss of travelers to other modes of travel. Meilus [11] analyzed the impact of airline mergers and network adjustment on passenger travel efficiency by measuring the circuity of air routes. In the study of train-path-selection problems, Lv et al. [12] solved the problem of the inconformity between the pass ticket routes and real traveling routes by devising a new method based on train-route restriction to design a pass route algorithm.
In related research on transportation network analysis, efficiency, structural properties, robustness and redundancy of networks are areas of significant concern to researchers. Lee [3] introduced two comparative measures: Total Travel Time Degree of Circuity (or Competitiveness) and In-vehicle Travel Time Degree of Circuity (or Competitiveness) to examine the efficiency of transit network configurations. On this basis, simple average and weighted average are also used to measure how efficiently the transit network is designed to meet the passenger demands. Du et al. [13] studied the robustness of the Chinese Airline Network via the removal of nodes and connections and found that the Chinese Airline Network is not as redundant and robust as the Worldwide Airline Network in the high-degree targeted attack strategy. Zhang et al. [14] investigated the evolution of the Chinese airport network, including the topology, the traffic and the interplay between them and found that the main topological properties of the network were steady although there exists a dynamic switching process inside the network. Sen et al. [15] indicated that the Indian Railway network displays small-world properties based on complex network theory.
However, circuity has not yet been used to analyze HSR networks and high-speed train paths. Following the studies of air transport, we suppose that the directness of high-speed train paths has a great influence on attracting passengers along the route. More specifically, this paper will contribute to the study of HSR networks by answering the following questions: (1) How can we determine the reasonable threshold of circuity for train paths? (2) For a large HSR network, what methods can help the planning of train paths? (3) How can we identify passenger corridors in a large railway network? (4) What is the influence of HSR construction on railway transportation?

Indicators and data
Circuity in a high-speed rail network can be obtained by the following formula:

$$C_{rs} = \frac{P_{rs}}{D_{rs}} \qquad (1)$$

where $C_{rs}$ is the circuity of the path between origin $r$ and destination $s$, $P_{rs}$ is the length of the shortest path between origin and destination, and $D_{rs}$ is the great-circle distance. The great-circle distance is the shortest distance between origin and destination on the surface of a sphere. Fig 2 shows the great-circle distance from origin $r$ to destination $s$. Since we need to analyze the circuity of a large number of O-D pairs, we adopt the Floyd-Warshall shortest path algorithm [16] to solve $P_{rs}$ in formula (1).

Circuity was generally defined as the ratio of the shortest network distance over the Euclidean distance between origin and destination (Levinson and El-Geneidy 2009 [5], Barthélemy 2011 [17], Giacomin and Levinson 2014 [4], Huang 2015 [6]). For HSR networks, the path distance is the length of the shortest path along the railway. Here the Euclidean distance is replaced by the great-circle distance because the Euclidean distance underestimates the separation of points that are far apart, and the distance between an O-D pair on the HSR network is generally much longer than on an urban transport network, so using Euclidean distance may reduce the precision of the results. In addition, the longitudes and latitudes of origin and destination are easier to obtain than their coordinates in Euclidean space. The great-circle distance is calculated as follows [18]:

$$D_{rs} = R \arccos\left[\sin y_r \sin y_s + \cos y_r \cos y_s \cos(x_r - x_s)\right] \qquad (2)$$

where $x_r, x_s \in [-\pi, \pi]$ are the longitudes of points $r, s$ and $y_r, y_s \in [0, \pi/2]$ are the latitudes of points $r, s$. Here $y_r, y_s \in [0, \pi/2]$ rather than $y_r, y_s \in [-\pi/2, \pi/2]$, which means that points $r, s$ must both lie within the same hemisphere. $R$ is the Earth's radius, which is set as 6371.393 km (Ballou et al. 2002 [18]).
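As a minimal sketch of the two quantities defined above, the following Python snippet computes a great-circle distance and a circuity value. It assumes the spherical-law-of-cosines form of the great-circle distance and takes coordinates in degrees; the function names `great_circle_km` and `circuity` are illustrative and do not come from the paper. The worked value uses the Chengdu to Xi'an figures quoted in the Introduction (a 2244 km shortest HSR path against a 510 km spatial distance).

```python
import math

EARTH_RADIUS_KM = 6371.393  # Earth radius value stated in the text


def great_circle_km(lon_r, lat_r, lon_s, lat_s, radius=EARTH_RADIUS_KM):
    """Great-circle distance (spherical law of cosines); inputs in degrees.

    Assumed form: D = R * arccos(sin y_r sin y_s + cos y_r cos y_s cos(x_r - x_s)).
    """
    x_r, y_r, x_s, y_s = map(math.radians, (lon_r, lat_r, lon_s, lat_s))
    cos_angle = (math.sin(y_r) * math.sin(y_s)
                 + math.cos(y_r) * math.cos(y_s) * math.cos(x_r - x_s))
    # Clamp to [-1, 1] so floating-point drift cannot break acos.
    return radius * math.acos(max(-1.0, min(1.0, cos_angle)))


def circuity(path_km, direct_km):
    """Ratio of network path length to direct (great-circle) distance."""
    return path_km / direct_km


# Chengdu to Xi'an on the 2014 HSR network, using the distances from the text.
print(round(circuity(2244, 510), 2))  # 4.4
```

A circuity of about 4.4 is far above the values reported later for operated train paths, which is consistent with the remark that running high-speed trains between these two cities is unreasonable.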
Meanwhile, we also utilize the concepts of circuity of railway lines and circuity of train paths. The circuity of a high-speed railway line is the ratio of rail-line length to the spherical distance between the origin and destination of the line, and the circuity of a train path is the ratio of path length to the spherical distance between the origin and destination of the train.
The study area is the HSR network in mainland China; see Table 1 [19]. We extracted the paths of operated trains from the high-speed train schedule, which was sourced from China Railways Corp. in 2014.
In addition, there may be more than one high-speed railway station in a city, and they have clear work assignments to access trains operated on different railway lines [20]. For example, there are three railway stations in Wuhan, i.e., Wuhan, Wuchang, and Hankou Railway Stations, which link the Beijing-Guangzhou, Nanjing-Hefei-Wuhan, and Wuhan-Chongqing high-speed railway lines, respectively. Wuhan Station can access trains operated on all three high-speed railway lines. Wuchang Station can access only trains operated on the Nanjing-Hefei-Wuhan and Wuhan-Chongqing high-speed railway lines. Hankou Station cannot access trains from the Guangzhou direction along the Beijing-Guangzhou high-speed railway line. The work assignments of Wuchang and Hankou Railway Stations are shown by dotted lines in Figs 3 and 4, respectively.
With respect to reasonable train-path-selection problems, since each high-speed railway line is a reasonable passenger transit corridor, we mainly discuss the situation in which both
In order to analyze the circuity of any O-D pairs in an HSR network, we mainly research the distribution of circuity and the distribution characteristics of some O-D pairs with high circuity. Furthermore, combining high-speed railway line planning and construction, we analyze the variation trend of circuity distribution in HSR networks.
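The all-pairs analysis described here can be sketched as follows: Floyd-Warshall gives the shortest network distance for every O-D pair, and dividing by the direct distance gives the circuity of each pair. The four-node network, its edge lengths, and the direct distances below are hypothetical values chosen only for illustration; in practice the direct distances would be great-circle distances computed from station coordinates.

```python
import math


def floyd_warshall(nodes, edges):
    """All-pairs shortest path lengths on an undirected weighted graph."""
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in nodes:          # allow k as an intermediate node
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical toy network: rail segment lengths in km.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 100.0), ("B", "C", 120.0), ("C", "D", 90.0), ("A", "C", 200.0)]
# Hypothetical direct distances in km for each O-D pair.
direct = {("A", "B"): 95.0, ("A", "C"): 150.0, ("A", "D"): 160.0,
          ("B", "C"): 110.0, ("B", "D"): 180.0, ("C", "D"): 85.0}

dist = floyd_warshall(nodes, edges)
circuity_table = {od: dist[od[0]][od[1]] / d for od, d in direct.items()}
worst = max(circuity_table, key=circuity_table.get)
print(worst, circuity_table[worst])
```

Sorting or plotting `circuity_table` against path length is one way to inspect which O-D pairs combine long paths with high circuity.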

Circuity of high-speed train paths
High-speed trains can be classified into two types based on their paths: within-line trains (whose origin and destination are on the same line) and cross-line trains (whose origin and destination are on different lines). We can solve the circuity of the paths of these two types of high-speed trains separately. For the first type, we only have to calculate the circuity of each high-speed railway line. For the second, we need to calculate the circuity of train paths one by one. The HSR network depicted in Fig 1 consists of 28 railway lines (operated by China Railways Corp.); we can obtain the circuity of these lines (Table 2) using formula (1).
According to Table 2, the maximum circuity of the railway lines investigated is 1.29, the minimum is 1.08, and the average is 1.16. In general, high-speed railway lines are not circuitous. The circuity of Jiaoji HSR is the highest because of the terrain of the Shandong peninsula. Similarly, the circuity of any part of the railway lines is also low. The relationship between the path length and circuity of each O-D pair on each high-speed railway line is illustrated in Fig 5. From Fig 5, we observe that the circuity converges in the interval [1, 1.5].
In 2014, except for interurban railway lines, the high-speed trains in operation in China run on 292 train paths [21]. The circuity of train trip G7376/G7377 from Jiangshan Railway Station to Hefei South Railway Station, which passes through Hangzhou East, Shanghai Hongqiao, and Nanjing South Railway Stations, is the highest at 2.48. On the other hand, the lowest is 1.03, corresponding to train trip G7282 from Hefei Railway Station to Huainan East Railway Station, which passes through Shuijiahu Railway Station. The average circuity of all train paths on this HSR network is 1.37.
For all high-speed train paths, the relationship between path length and circuity of each O-D pair is shown in Fig 6. The figure indicates that the maximum circuity of subpaths is clearly higher than that of complete train paths (2.48). The main reason is that a train path with low overall circuity can still contain a break angle, and a short subpath spanning that break angle has a high circuity.
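The effect of a break angle can be illustrated with a small sketch. The station coordinates below are hypothetical, and planar coordinates in km are used instead of great-circle distances purely to keep the arithmetic simple. The path runs along a straight line for 300 km and then turns sharply back at a junction.

```python
import math


def path_length(points):
    """Total length of a polyline given as a list of (x, y) points in km."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def circuity(points):
    """Path length divided by the straight-line distance between the endpoints."""
    return path_length(points) / math.dist(points[0], points[-1])


# Hypothetical train path: origin, intermediate station, junction, terminus.
train_path = [(0, 0), (280, 0), (300, 0), (290, 10)]

whole = circuity(train_path)     # complete train path
sub = circuity(train_path[1:])   # subpath that spans the break angle
print(f"whole path: {whole:.2f}, subpath: {sub:.2f}")  # whole path: 1.08, subpath: 2.41
```

The complete path is nearly direct, yet the short subpath around the junction is more than twice as long as the straight-line distance between its endpoints.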

Searching and clustering of train paths
Determine the circuity threshold of train-paths
According to the analysis of circuity of high-speed railway lines and the paths of trains in Section 3, the average circuity of train paths is higher than that of railway lines. We set the maximum circuity of train paths, i.e., 2.48, as the threshold value, and under this constraint we search the reasonable train paths to form the train-path sets.
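The threshold rule can be sketched as a simple filter over O-D pairs. The function name and data layout are assumptions made for illustration; the Chengdu to Xi'an distances are those quoted in the Introduction, and the pair ("X", "Y") is hypothetical.

```python
THRESHOLD = 2.48  # maximum circuity of operated train paths reported in the text


def reasonable_pairs(path_km, direct_km, threshold=THRESHOLD):
    """Return the O-D pairs whose shortest-path circuity does not exceed the threshold."""
    return sorted(od for od, p in path_km.items() if p / direct_km[od] <= threshold)


path_km = {("Chengdu", "Xi'an"): 2244.0, ("X", "Y"): 600.0}    # shortest path lengths
direct_km = {("Chengdu", "Xi'an"): 510.0, ("X", "Y"): 500.0}   # direct distances

print(reasonable_pairs(path_km, direct_km))  # [('X', 'Y')]
```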
In contrast, the circuity of many urban road networks and public transit networks is lower than 2.48. For example, Newell [22] indicated that the circuity of the path between a random

Clustering of train paths
Since each subpath of a train path is reasonable, a new path can also be formed by an existing path and an extra branch path. Under the constraint of the given threshold value of path circuity, the number of train paths is too large, and their description would be complicated. In order to simplify the expression of cross-line train path sets, we should cluster all of the reasonable train paths.
We cluster the paths mainly on the basis of overlap ratio of train paths. In this paper, the train path between two nodes is the shortest path and these shortest paths are often exclusive, so the overlap ratio of two train paths can be described by the relationship between their node sets.
We denote $N_{rs}$ as the node set of a path $P_{rs}$ between the O-D pair $(r,s)$, and its node number is $|N_{rs}|$. For path node sets $N_{rs}$ and $N_{ij}$, accurate clustering can be expressed by the inclusion relation between the two path node sets. If $N_{ij} \subseteq N_{rs}$, then the train path $P_{ij}$ belongs to the class represented by train path $P_{rs}$. Besides accurate clustering, some error can be allowed. Error can be described in terms of two conditions, i.e., absolute error and relative error. Absolute error is the number of nodes in $N_{ij}$ that do not belong to $N_{rs}$, and it cannot exceed the threshold value $\Delta n$:

$$|N_{ij} \setminus N_{rs}| \le \Delta n \qquad (3)$$

Relative error is the ratio of the number of nodes in $N_{ij}$ that do not belong to $N_{rs}$ to the number of nodes in $N_{ij}$, and it cannot exceed the threshold value $\varepsilon$:

$$\frac{|N_{ij} \setminus N_{rs}|}{|N_{ij}|} \le \varepsilon \qquad (4)$$

When sets $N_{rs}$ and $N_{ij}$ satisfy either formula (3) or (4), path $P_{ij}$ can be merged into the class represented by train path $P_{rs}$.
We designate the representative in one path class as a center path, which has the most nodes in this class. Clustering paths can be carried out in descending order of the number of path nodes; the detailed clustering procedure is as follows:
Step 1: Add all paths to the path set O to be clustered, and arrange the paths in the descending order of the number of path nodes.
Step 2: Choose the path with the largest number of nodes from O to be the center path of one path class, determine whether other paths belong to this class by formula (3) or (4), and delete the paths in the new path class from O.
Step 3: Repeat Step 2 until O is empty.
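The three steps above can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the merge test assumes that the absolute error is the count of nodes of the candidate path missing from the center path, and that the relative error divides that count by the number of nodes in the candidate path. The function names and the station labels are hypothetical.

```python
def can_merge(n_ij, n_rs, delta_n=0, eps=0.0):
    """Assumed merge test: absolute error <= delta_n or relative error <= eps.

    The error counts the nodes of n_ij that are not in n_rs; the relative
    error divides that count by the size of n_ij (an assumption).
    """
    missing = len(n_ij - n_rs)
    return missing <= delta_n or missing / len(n_ij) <= eps


def cluster_paths(paths, delta_n=0, eps=0.0):
    """Greedy clustering; returns a list of (center_path, member_paths)."""
    # Step 1: order the paths by number of nodes, largest first.
    remaining = sorted(paths, key=len, reverse=True)
    classes = []
    # Step 3: repeat until no path is left to cluster.
    while remaining:
        # Step 2: the longest remaining path becomes the center path.
        center = remaining[0]
        center_nodes = set(center)
        members, rest = [], []
        for p in remaining:
            target = members if can_merge(set(p), center_nodes, delta_n, eps) else rest
            target.append(p)
        classes.append((center, members))
        remaining = rest
    return classes


# Hypothetical paths given as tuples of station labels.
paths = [("A", "B", "C", "D", "E"), ("B", "C", "D"), ("A", "B"),
         ("C", "D", "F"), ("F", "G", "H"), ("G", "H")]

for center, members in cluster_paths(paths, delta_n=1):
    print(center, len(members))
```

With `delta_n=0` and `eps=0.0` the test reduces to strict inclusion; allowing one missing node merges the path ("C", "D", "F") into the class of the longest path, which reduces the number of classes from three to two in this toy example.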

Searching and clustering train paths in Chinese HSR network
In this section, we take the HSR network in China (except for the Hainan East Ring Line and some intercity railway lines) depicted in
In Fig 7, there are already through trains running on the passenger transit corridors labeled 1 to 19, while few through trains are operated on the passenger transit corridors labeled 20 to 51. As long as there is adequate passenger demand, we can operate trains on these alternative passenger transport corridors (labeled 20 to 51). The longest center path measures 3705 km, which is constrained by the standard that electric multiple units must stop to be inspected and repaired at Level One after running 4000 km, and the two terminal stations of this path must have EMUDs. The shortest center path measures 819 km, exceeding the full mileage of some trains. If we change the circuity threshold value to 2.06 (the average of the circuity of the cross-line trains operated), many paths to some major cities, such as Qingdao and Dalian, would be excluded.
However, in practical operation, some trains do not run along the shortest path but pass through some significant cities. These paths are as follows:
1. The Ning Hang high-speed railway line connects Hangzhou East and Nanjing South Railway Stations, and the shortest path between these two cities on the HSR network is 256 km, but the length of some train paths between these two cities reaches 470 km. For example, some paths start from Hangzhou East Railway Station, pass through Shanghai Railway Station, and then arrive at Nanjing South Railway Station. These particular cases result from the significant position of Shanghai in the entire network.
The specific paths in the three above situations are shown in Table 3.
In conclusion, during the process of HSR train-path searching, the reasonable shortest paths between O-D pairs should mainly be selected. However, in several cases the shortest path may not be the practical option, because a longer path lets the train pass through an important metropolis or use superior railway lines.
the lower the circuity. This result is consistent with the conclusion of Giacomin and Levinson [4] for urban road networks. Owing to the existence of break angle segments, there are some O-D pairs with short paths but high circuity in part 1. For example, the path from Yuncheng North Railway Station to Sanmenxia South Railway Station [Fig 10(A)] goes through the intersection of two railway lines, so the path between any two nodes on this path has short length but high circuity [Fig 10(B)]. Parts 2-4 of Fig 9 are relatively special because of the long path length and high circuity. We find that the node pairs in these three parts are mainly distributed on paths from Wuhan Railway Station to Jiujiang Railway Station, from Wuzhou South Railway Station to Shenzhen North Railway Station, and from Dazhou Railway Station to Baoji South Railway Station [Figs 11(A)-13(A)].
Moreover, the relationships between the circuity and length of the path between any two points on these three paths are shown in Figs 11(B)-13(B), respectively. The long distances and high indirect degrees of these paths result from the lack of direct railway lines.
In terms of the short-range design of the HSR network in China (adjusted in 2008, National Development and Reform Commission) [24], Zheng Xu (1 to 2), Wu Jiu (3 to 4), Xi Cheng (5
dramatically due to these planned lines, and parts 2-4 disappear. It is notable that there is still a convex part (in the black circle) that represents the O-D pairs on the path from Qingdao Railway Station to Dalian Railway Station (12 to 13). These two cities, located on two peninsulas of China, i.e., the Shandong and Liaodong peninsulas, are close in space but separated by the sea. There is no sea-crossing bridge or subsea tunnel between these two cities, so it is difficult to operate high-speed trains between them. With the increase of the density of an HSR network, the circuity between any two nodes increases at first and then decreases with path length. From the above discussion, we conclude that the structure of China's HSR network tends to be complete.

Conclusions
This paper adopted circuity, the ratio of network to Euclidean distance, to analyze the HSR network and high-speed train paths. After comparing the circuity of train paths with that of road networks and bus routes, we set the maximum path circuity of operated high-speed trains in China as the threshold value. Under the constraint of this circuity threshold, we can search all of the reasonable train paths on the network. We then propose a clustering method based on the inclusion relation between each two paths, which is allowed a range of absolute or relative error. With this method, we simplify the expression of a large number of reasonable train paths. More importantly, we also obtain a series of passenger transit corridors, which are the center paths of the train path classes. High-speed trains can be operated along these corridors when there is adequate passenger demand.
Circuity of high-speed railway lines is quite low (from 1.08 to 1.29) in the Chinese HSR network, which means that each HSR line can provide a reasonable train path. For both the constructed HSR network and the planned HSR network in China, the circuity of all O-D pairs is examined. The results show that the longer the paths, the lower the circuity, which is consistent with the results for urban road networks and transit networks found in preceding research. The circuity of the planned HSR network with more HSR lines is lower than that of the constructed HSR network, which demonstrates that the structure of China's HSR network tends to be complete. Meanwhile, we also analyze the reason that a few O-D pairs have a relatively high circuity value. These O-D pairs show that if we build HSR lines between these zones, the structure of the network can be improved significantly.