Graph drawing using tabu search coupled with path relinking

Graph drawing, or the automatic layout of graphs, is a challenging problem. There are several search based methods for graph drawing which optimize an objective function formed from a weighted sum of multiple criteria. In this paper, we propose a new neighbourhood search method which uses tabu search coupled with path relinking to optimize such objective functions for general graph layouts with undirected straight lines. To our knowledge, neither of these methods has previously been used in general multi-criteria graph drawing. Tabu search uses a memory list to speed up searching by avoiding previously tested solutions, while path relinking generates new solutions by exploring paths that connect high quality solutions. We use path relinking periodically within the tabu search procedure to speed up the identification of good solutions. We have evaluated our new method against the commonly used neighbourhood search optimization techniques: hill climbing and simulated annealing. Our evaluation examines the quality of the graph layout (the objective function's value) and the speed of layout in terms of the number of evaluated solutions required to draw a graph. We also examine the relative scalability of each method. Our experiments were carried out on both random graphs and a real-world dataset. We show that our method outperforms both hill climbing and simulated annealing, producing a better layout with fewer evaluated solutions. In addition, we demonstrate that our method has greater scalability, as it can lay out larger graphs than the state-of-the-art neighbourhood search methods. Finally, we show that similar results can be produced in a real world setting by testing our method against a standard public graph dataset.


Introduction
Graph drawing is the process of transforming a graph into a visual representation that is called a graph layout [1]. The quality of a graph layout depends on different aesthetic measures that can give a better understanding of graphs. Such measures include minimizing edge crossings, uniform edge length, minimizing node-to-node and node-to-edge occlusions, maximizing graph symmetry, maximizing angular resolution, and others [1,2,3,4]. These measures can be combined to form a multi-criteria weighted sum objective function that measures the quality of a graph layout and which can then be optimized by search based methods (optimization methods). Search based methods can be placed into two categories according to the number of solutions examined at the same time: neighbourhood search methods and population based methods. While neighbourhood search methods work on a single solution at a time, population based methods evolve a set of points in the search space [5]. These methods can produce good solutions, but they have great potential for improvement. For example, among neighbourhood search methods, simulated annealing adds an element of non-determinism in order to escape from local optima in the search space. This slows down the performance of the algorithm since this stochastic behaviour means that a large number of iterations can be required to reach a good solution [2]. Hill climbing is generally faster in reaching a final layout, but the final result is not always the best as it is more likely to get trapped in a local optimum [6]. Population based methods such as genetic algorithms typically have an even slower rate of convergence compared to simulated annealing and hill climbing as they make a wider search of the problem space. In addition, they often require large amounts of memory to maintain the population and can require additional algorithms to spread the solutions [7].
Neighbourhood search based layout is frequently used in graph drawing. It is used to optimize different quality measures that are combined as a single weighted sum objective function. However, such methods work slowly as the process of recalculating the metrics is repeated a large number of times during the search process. Here we propose a novel neighbourhood search based graph drawing algorithm and compare it with other such approaches. We show that our approach improves on the current state-of-the-art in neighbourhood search for graph drawing.
Another popular type of layout is the class of force-directed approaches. These differ considerably from search based methods. Here, forces between nodes are applied, such as attraction between connected nodes and repulsion between disconnected nodes, and the method attempts to find an equilibrium layout [18,19,20,21]. In addition, systems such as Pajek draw large networks using spring embedders and eigenvectors [22]. However, aesthetics can only be indirectly coded in force-directed approaches, whereas search based methods have the advantage of allowing tuneable combinations of metrics to meet user preferences.
The scope of the research described in this paper is to improve the efficiency and effectiveness of drawing general graph layouts with undirected straight lines based on a weighted sum multi-criteria optimization. The main goal of our work is concerned with developing a new graph drawing search method based on tabu search and path relinking. To our knowledge, these methods have not been used before to lay out general graphs with multi-criteria optimization.
Since our method belongs to the category of neighbourhood search methods, we compare it against hill climbing and simulated annealing as these are well-known neighbourhood search methods that have been frequently applied to graph drawing.
In this paper we show that tabu search alone outperforms hill climbing but not simulated annealing; we then show that when tabu search is combined with path relinking it outperforms simulated annealing. The tabu search algorithm outperforms hill climbing both in minimizing the value of the objective function and in the number of evaluated solutions used to draw a graph layout. Applying path relinking within the tabu search procedure speeds up the identification of good solutions and outperforms simulated annealing by producing graph layouts with better objective function values. We also demonstrate that, when targeting a particular value of the objective function, the combination of tabu search and path relinking achieves the goal in a smaller number of evaluated solutions.
Path relinking integrates intensification and diversification strategies [23]. This approach generates new solutions by exploring paths that connect high quality solutions (elite solutions from the reference set) by starting from one of these solutions, called the initiating solution, and generating a path in the neighbourhood space that leads toward another solution, called the guiding solution. Note that the initiating and the guiding solutions represent the starting and the ending points of the path. This is accomplished by selecting moves that introduce attributes contained in the guiding solutions [13]. A crucial difference between evolutionary algorithms, such as genetic algorithms, and path relinking is that the former uses a factor of randomness to create offspring from parent solutions, whereas the latter uses systematic and deterministic rules to combine elite solutions. The main principle of its deterministic behaviour is the gradual introduction of attributes from the guiding solution to intermediate solutions. These intermediate solutions should have fewer characteristics of the initiating solution and more characteristics of the guiding solution as the search moves along the path [24]. Path relinking has been considered to be particularly appropriate to tabu search as it allows for "tunnelling" through infeasible regions formed from the tabu list [25].
In a previous paper [26], we described an initial attempt to use a tabu search based approach for graph drawing and we compared it with hill climbing only. The method searches for the best positions for the nodes that minimize the value of the objective function, and draws a nice graph layout accordingly. Tabu search forbids moves that have been previously examined and are considered poor potential solutions, making it a more effective layout method than hill climbing. In this paper, we widen the comparison to include simulated annealing. We also extend and improve our preliminary work [26] by coupling tabu search with path relinking. This combined approach considerably improves the effectiveness of the search. We also conduct a more thorough comparison. Here we use four different neighbourhood search based methods: hill climbing, simulated annealing, tabu search and tabu search with path relinking. In addition, we change the criterion of comparison between the methods to the number of evaluated solutions calculated by the drawing algorithm (as this is a machine independent criterion) instead of using the algorithm's execution time, which was used in the preliminary work. We only present an execution time comparison when we test the scalability of the methods, which was not tested in our previous work. In this case, we use execution time to give a realistic idea of run time for applying the methods. Statistical significance tests that confirm the results of our experiments are also included in this paper, unlike our work in [26]. The code and data related to this research can be accessed at the Dryad digital repository: doi:10.5061/dryad.k082rv8.
The rest of this paper is organized as follows: Section 2 describes some background in search based techniques; Section 3 describes our method that couples tabu search with path relinking; Section 4 describes the parameter tuning process along with experimental results for applying hill climbing, simulated annealing, tabu search, and tabu search with path relinking on random graphs, and for testing the scalability of our method; Section 5 describes experimental results of applying the same approaches on real world public graph datasets; Section 6 discusses and analyses the results; finally, in Section 7 we give our conclusions in addition to directions for future work.

Background
Graph drawing is the process of placing nodes and edges in order to form clear and understandable layouts. However, this process is a challenge as it depends on what we consider to be a nice graph (graph aesthetics) and the efficiency of its automated implementation. Graph drawing aesthetics are quality measures which determine the readability and usability of graphs. A good layout can deliver information clearly whereas a poor layout can mislead [27]. An objective function comprising metrics in a weighted sum can be used to quantify the quality of the graph layout and so be used within search based layout mechanisms. Typical criteria include: minimizing edge crossings, minimizing edge bends, uniform edge length, maximizing graph symmetry, minimizing node-to-node and node-to-edge occlusions and maximizing angular resolution of incident edges [2,3,4]. However, it is computationally expensive to minimize the objective function as the measurements can be time consuming to calculate, and the objective function must be evaluated for every layout examined. Since the overall objective function might include both continuous and discrete functions, some general search based approaches, such as neighbourhood search methods like simulated annealing and hill climbing, and population based methods such as genetic algorithms, have been used in order to find a good value of the objective function.
Simulated annealing was an early search based method to be applied to the graph layout problem [2]. It was used to draw undirected graphs with straight line edges. The original algorithm produces nice graph layouts for small sized graphs. However, it does not perform well for larger graphs. This approach models the physical process of heating a material and then slowly cooling the temperature to decrease defects, so minimizing the system energy. The method tries to escape from local minima toward the global minimum by applying uphill moves (moves that worsen, rather than improve, the temporary solution) with decreasing probability as the search progresses. A variation of the approach uses gradient descent [8]. The gradient vector of the objective function represents the direction of steepest increase of the objective function, so nodes are moved against it to decrease the value. However, this method is slow when applied to large graphs and it has some challenges. For example, the objective function needs to be expressed explicitly in terms of coordinates as its derivative must be found. Some important criteria, such as minimizing edge crossings, are not continuous and so cannot be modelled with gradient descent.
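The uphill-move mechanism can be illustrated with a minimal Metropolis-style acceptance rule (a generic sketch; the original algorithm in [2] defines its own cooling schedule and move set, and the function name here is ours):

```python
import math
import random

def accept(delta, temperature, rng=random.random):
    """Metropolis-style acceptance: improving moves (delta < 0) are always
    taken; uphill moves are taken with probability exp(-delta/T), which
    falls as the temperature cools."""
    return delta < 0 or rng() < math.exp(-delta / temperature)
```

As the temperature approaches zero, the rule degenerates into pure hill climbing, which is why the cooling schedule matters so much in practice.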
Hill climbing is another search based approach that has been used in the field of graph drawing to minimize edge crossings [11]. It is a simpler and faster search approach than simulated annealing as no uphill moves are made but, as a result, it tends to get trapped in poor local optima.
A genetic algorithm approach for drawing graphs under a number of visual constraints was proposed in [14,15]. The proposed algorithm produces graph layouts with good quality. It can be easily adapted to take new layout aesthetics into account. However, the major problem is its slow rate of convergence.
Tabu search is a neighbourhood search based method proposed by Glover [28] for finding good solutions to combinatorial optimization problems. It uses a memory structure while carefully exploring the neighbourhood of each solution as the search progresses, to avoid getting trapped in a local optimum. It proceeds on the assumption that there is no value in choosing an inferior solution unless it is necessary, as in the case of escaping from a local optimum [29]. It improves the efficiency of the searching process by storing a tabu list of local solutions. This is used to restrict the search by forbidding moves to poor neighbour solutions that have already been visited [30]. An additional feature of tabu search is intensification and diversification. It might be useful to intensify the exploration in some region because it may contain a high incidence of acceptable solutions. This can be achieved by introducing a new term into the objective function that assigns a high priority to solutions in the relevant region. Diversification is responsible for moving the exploration process over different regions of the search space. For example, the objective function can be adjusted so that it penalizes solutions which are close to the current one. Thus, tabu search can induce an interplay between intensification and diversification that is intended to form a dynamic and aggressive search strategy [12].
Tabu search has shown good results for generating approximate solutions to NP-hard problems in a reasonable amount of time [31,32]. It has also produced comparably fast solutions in some graph theory applications such as graph partitioning [29,33], as well as for special graph layout problems such as straight line crossing minimization [13,34] and the bipartite graph drawing problem [12]. Tabu search has also outperformed many existing heuristics for solving the vehicle routing problem [35,36,37]. Tabu search was introduced for general graph drawing in our previous paper [26]. The experimental results on random graphs showed that our tabu search based approach was faster than hill climbing while producing good quality graph layouts. However, the performance comparison was not conclusive as it was based on the algorithm's execution time, rather than the more accurate measure of the number of evaluated solutions.
Path relinking is another neighbourhood search based method that was proposed as an approach to integrate intensification and diversification strategies [23,38]. It is a relatively new approach that has been applied to several computational problems with great success. The aim of path relinking is to introduce attributes of the guiding solution into solutions obtained by moving away from the initial solution in a systematic manner, where similarities and differences in the structure of the initial and guiding solutions are properly identified.
Any path relinking implementation must include the following three components [24]:
• Building the reference set,
• Choosing the initial and the guiding solutions,
• Constructing a neighbourhood structure for moving along paths between initial and guiding solutions.
Using path relinking periodically in a search procedure is intended to speed up the identification of good solutions. Combining tabu search with path relinking is motivated by the desire to tunnel through blocked off areas created by the tabu solutions [25]. Solving a vehicle routing problem using tabu search with path relinking was able to generate better solutions compared to the traditional tabu search approach alone [24]. Tabu search and path relinking were also used to address the job shop scheduling problem and produced competitive results compared to alternative state of the art algorithms [39].

Our method
This section describes our graph drawing method based on tabu search coupled with path relinking. First we describe the tabu search procedure, then we detail how the path relinking is integrated into it.
In this work, we use a systematic exploration of the local search space. For each node, we search the points (candidate solutions) on a square centred on the node at a given distance, as shown in Fig 1. Eight points around the square are checked (up, down, left, right, and the four corners). We compute the objective function's value at each candidate solution, and we select the candidate solution that gives the lowest objective function's value. In the case that there are multiple candidate solutions that share the lowest value, we select the first encountered candidate solution, starting from the point to the right of the node and moving along the points of the square in a clockwise direction. This is also how the objective function tie-breaks are resolved in the other methods.
Note that, using a geometric shape for defining a search space in the field of graph drawing was used earlier in [2,3] where a circle and a rectangle had been respectively used. However, since evaluating the value of a multi-criteria objective function is a lengthy process, we restricted the movements to eight points only to avoid the long execution time for re-evaluating the value of the objective function with a large number of evaluated solutions. We used the same neighbourhood searching strategy with all the methods included in this research in order to make a fair comparison. This searching strategy can be easily adjusted with our implementation by increasing the number of repetitions from eight points to any larger number, but the execution time would be significantly longer.
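As an illustration, the eight-point square neighbourhood and its clockwise tie-break can be sketched as follows (a simplified Python sketch; `layout` maps nodes to coordinates and `objective` is the weighted-sum function, and all names here are ours, not the paper's code):

```python
import math

def square_neighbourhood(x, y, size):
    """The eight candidate positions on the border of a square centred on
    (x, y), listed starting from the right-hand point and proceeding
    clockwise (assuming screen coordinates with y growing downwards)."""
    return [(x + size, y), (x + size, y + size), (x, y + size),
            (x - size, y + size), (x - size, y), (x - size, y - size),
            (x, y - size), (x + size, y - size)]

def best_candidate(node, layout, size, objective):
    """Evaluate the objective at each candidate and keep the best; the
    strict '<' resolves ties in favour of the first candidate in the
    clockwise order, as described in the text."""
    best_pos, best_val = None, math.inf
    for pos in square_neighbourhood(*layout[node], size):
        trial = dict(layout)
        trial[node] = pos
        val = objective(trial)
        if val < best_val:
            best_pos, best_val = pos, val
    return best_pos, best_val
```

Increasing the number of candidate points only requires generating more positions on the square's border, at the cost of proportionally more objective function evaluations.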
In the tabu search procedure, we first compute the value of the objective function for the initial graph layout. Then, the following steps are performed for a set number of iterations (maxIterations): for each node, we search the points around a square as described above. The ratio of the objective function values for the candidate solution and the current solution is computed at each point around the square. Solutions with ratios above or equal to a predefined threshold value (tabuCutOff) are considered to be tabu moves and are stored in a tabu list. We then move the node to a neighbouring point which is not in the tabu list and whose objective function value is minimal. Then the current location is added to the tabu list. Note that the new solution might not be better than the current solution. In case all eight candidate solutions surrounding the current solution are in the tabu list, the intensification and diversification processes are used. The search intensification process is as follows: after a chosen number of iterations (intensifyIterations), the square size centred on the node is reduced and the tabuCutOff value is decreased by a set value (intensifyCutOff) by calling the functions SmallerSquareSize() and SmallerTabuCutOff() respectively, as shown in Algorithm 1. The diversification process consists of updating the tabu list by removing old solutions from the list after a number of iterations (tabuDuration).
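A single-node step of this procedure might look like the following sketch (function and parameter names are illustrative; the full Algorithm 1 also performs intensification and diversification, which are omitted here):

```python
import math

def tabu_move(node, layout, size, objective, tabu_list, tabu_cutoff):
    """One tabu search step for a single node: candidates whose objective
    ratio to the current solution is >= tabu_cutoff become tabu; the node
    moves to the best remaining non-tabu candidate and its old position
    is added to the tabu list."""
    current_val = objective(layout)
    x, y = layout[node]
    offsets = [(1, 0), (1, 1), (0, 1), (-1, 1),
               (-1, 0), (-1, -1), (0, -1), (1, -1)]
    best_pos, best_val = None, math.inf
    for dx, dy in offsets:
        pos = (x + dx * size, y + dy * size)
        if pos in tabu_list:
            continue
        trial = dict(layout)
        trial[node] = pos
        val = objective(trial)
        if val / current_val >= tabu_cutoff:
            tabu_list.add(pos)              # poor move: mark it tabu
            continue
        if val < best_val:
            best_pos, best_val = pos, val
    if best_pos is not None:
        tabu_list.add((x, y))               # old position becomes tabu
        layout[node] = best_pos
    return layout
```

Note that the chosen move may be worse than the current solution, as the text describes; only moves whose ratio exceeds the cut-off are forbidden.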
Our objective function follows a standard approach for search based graph drawing methods. We implemented four common metrics for measuring the quality of the graph, similar to those used in [2,3]. These represented the aesthetics of: spreading the nodes out evenly on the drawing space (node-node occlusion), making uniform edge lengths, minimizing edge crossings, and improving angular resolution (maximizing the distance between incident edges). Spreading the nodes out evenly on the drawing space means that the distances between nodes should be maximized. This criterion was measured using the following formula:

$m_1 = \sum_{i \neq j} \frac{1}{d_{ij}^2}$

where $d_{ij}$ represents the Euclidean distance between two nodes i and j, and $i \neq j$.
Unifying the length of the edges was computed by defining a specific length (len); all the edges are then adjusted toward the required length using the following formula:

$m_2 = \sum_{e \in E} (\mathrm{length}(e) - len)^2$

where E is the set of edges.
In order to minimize the number of crossing edges, we only needed to find the number of edge intersections and we tried to minimize that number in each iteration of the optimization process.
The last metric, the angular resolution, was computed by maximizing the distance between incident edges using the following formula:

$m_4 = \sum_{v \in V} \sum_{(e_1, e_2)} \left| \frac{2\pi}{\deg(v)} - \theta(e_1, e_2) \right|$

where deg(v) denotes the degree of a node v, θ(e1,e2) is the angle in radians between two adjacent edges e1 and e2 incident to node v, and the inner sum is taken over pairs of adjacent edges incident to v. All these metrics contribute to the graph quality objective function, which is computed as follows:

$f = \sum_{i} w_i m_i$

where $w_i$ and $m_i$ are the weight and the measure for criterion i respectively. Similar multi-criteria objective functions have been previously used in [2,3]. The metrics used are well-known and have been shown to represent the quality of a graph layout [27]. The problem with a multiple objective optimization function is that the value of a specific measure may dominate the others. Therefore, we applied a normalization process to ensure that the value of each measure is between 0 and 1.
We cannot determine unified weights that work well for all graphs with any size, and indeed weights can vary according to application area and user preference. The goal of our research is to develop an improved optimizer for general undirected graphs, rather than concentrate on generating the best possible layout. Hence, we take the approach of assuming all criteria are equally important and assign the value 1 to all weights.
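With equal weights, the weighted-sum objective can be sketched as follows (normalization is omitted for brevity, and only two of the four criteria are shown; all names and the `target_len` default are our own illustration):

```python
import itertools
import math

def euclid(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def node_occlusion(layout):
    # Sum of 1/d_ij^2 over node pairs: small when nodes are spread out.
    return sum(1.0 / euclid(p, q) ** 2
               for p, q in itertools.combinations(layout.values(), 2))

def edge_length_dev(layout, edges, target_len):
    # Squared deviation of each edge length from the desired length.
    return sum((euclid(layout[u], layout[v]) - target_len) ** 2
               for u, v in edges)

def objective(layout, edges, weights=(1.0, 1.0), target_len=100.0):
    # Weighted sum of the criteria; the paper normalises each measure
    # to [0, 1] before summing, which is skipped here.
    measures = (node_occlusion(layout),
                edge_length_dev(layout, edges, target_len))
    return sum(w * m for w, m in zip(weights, measures))
```

Adding the crossing and angular resolution measures only requires appending two more entries to `measures` and `weights`.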
Re-computing the objective function for each node move is a time consuming process. Therefore, we implemented a system that caches the results for each node and edge. When calculating the objective function, we only recompute the values that might change when a node is moved.
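The caching idea can be illustrated for the node-node occlusion measure: only the pairwise terms involving the moved node are re-evaluated (a sketch of the general technique, not the paper's implementation):

```python
import math

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

class CachedOcclusion:
    """Maintains the running total of pairwise 1/d^2 contributions so
    that moving one node only re-evaluates the pairs involving it."""
    def __init__(self, layout):
        self.layout = dict(layout)
        self.total = sum(self._pair(i, j)
                         for i in layout for j in layout if i < j)

    def _pair(self, i, j):
        return 1.0 / euclid(self.layout[i], self.layout[j]) ** 2

    def move(self, node, new_pos):
        others = [n for n in self.layout if n != node]
        self.total -= sum(self._pair(*sorted((node, n))) for n in others)
        self.layout[node] = new_pos
        self.total += sum(self._pair(*sorted((node, n))) for n in others)
        return self.total
```

For a graph with n nodes, this turns each evaluation from O(n²) pair terms into O(n), which matters because the objective is evaluated at every candidate position.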
We couple our tabu search procedure with path relinking to intensify the search within a specific space of elite solutions, as described in Algorithm 1. The path relinking procedure is called within the tabu search procedure every fixed number of iterations (intensifyIterations). Building a reference set of elite solutions is the first step in path relinking. This set has a maximum size (refSize), contains no redundant solutions, and is recommended to be relatively small [24]. Initially, the solutions which are produced by the tabu search procedure are added to the reference set. A solution is directly added to the reference set as long as the set is not full. However, once the reference set becomes full, a solution will replace the worst solution in the set when either of the following criteria is satisfied: (a) Quality: the objective function's value of the added solution is better (smaller) than the objective function's value of the best solution in the reference set. This is performed by the Quality() function in Algorithm 1.
(b) Diversity: the objective function's value of the added solution is better (smaller) than the objective function's value of the worst solution in the set, and it is dissimilar to the solutions in the set. The dissimilarity measure is computed as follows: we define $D_s^b$, the level of dissimilarity between a solution s and the best solution b, as the sum of distances between the corresponding nodes in the two graph layouts. This is performed by the Diversity() function in Algorithm 1.
We also define the median dissimilarity of all solutions x ∈ refSet relative to the best solution b as:

$D^b_{med} = \underset{x \in refSet}{\mathrm{median}}\; D^b_x$

where the median is taken over the |refSet| dissimilarity values and |refSet| denotes the number of solutions in the reference set. A solution s is included in refSet if its objective function's value is better than that of the worst solution in refSet and its level of dissimilarity exceeds the median. When the path relinking procedure is called, the following steps are performed for a set number of iterations (PRmaxIterations), as long as the reference set has more than one solution (see Algorithm 2). Firstly, we select two solutions from the reference set (the initial and guiding solutions). There are different ways of selecting these two solutions [24,39,40], as we show later in this section. In this paper, the guiding solution always has a better (smaller) objective function value than the initial solution. Secondly, we remove the initial solution from the reference set as its path to the guiding solution will be explored. Thirdly, we call the function MoveAlongPath(), which moves on a path from the initial solution toward the guiding solution and vice versa to generate intermediate solutions (as this has produced better results in other applications [24] compared to moving in one direction only). These intermediate solutions should become progressively closer to the guiding solution (i.e. contain more attributes from the guiding solution and fewer attributes from the initial solution). In our algorithm, for each node in the initial solution, we visit the 8 positions around a square (the same positions described previously) of a predefined size (pathSqrSize) and compute the Euclidean distance from each position to its corresponding node in the guiding solution, as shown in Fig 2. We select the position with the shortest Euclidean distance.
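The quality and diversity rules for updating the reference set can be sketched as follows (entries are (cost, layout) pairs with lower cost being better; this is our reading of the rules described above, not the paper's code):

```python
import statistics

def update_reference_set(ref_set, candidate, ref_size, dissimilarity):
    """Add 'candidate' to the reference set under the quality and
    diversity criteria; 'dissimilarity(s, b)' is the node-wise distance
    sum between two layouts."""
    if len(ref_set) < ref_size:
        if candidate not in ref_set:
            ref_set.append(candidate)
        return ref_set
    ref_set.sort(key=lambda s: s[0])            # best solution first
    best = ref_set[0]
    cost, layout = candidate
    if cost < best[0]:
        # (a) Quality: better than the best -> replace the worst.
        ref_set[-1] = candidate
    elif cost < ref_set[-1][0]:
        # (b) Diversity: better than the worst and dissimilar enough.
        median = statistics.median(dissimilarity(s[1], best[1])
                                   for s in ref_set)
        if dissimilarity(layout, best[1]) > median:
            ref_set[-1] = candidate
    return ref_set
```

Keeping the set sorted makes identifying the best and worst elite solutions trivial at the small sizes recommended for refSize.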
Its objective function's value is computed along with its dissimilarity level, and we update the reference set accordingly.

The parameters of the combined procedure (Algorithm 1) are:
• initialSquareSize: predefined square size; tabu search candidate solutions are located on its border.
• squareReduction: predefined rate of reduction for the size of the square.
• maxIterations: predefined maximum number of iterations of the tabu search drawing algorithm.
• initialCutOff: predefined minimum value that determines whether a move is tabu or not.
• intensifyCutOff: predefined rate of reduction for the current cutOff value.
• intensifyIterations: predefined number of iterations after which the tabu search process starts to intensify.
• duration: predefined number of iterations for which a move should remain in the tabu list.
• refSize: predefined maximum number of solutions that can be added to the reference set of path relinking.

Different selections for the initial (source) and guiding (destination) solutions affect the quality of graph layouts drawn by the path relinking procedure. There are five different variations for selecting the source solution and the destination solution from the reference set [24]: (a) The worst and the best elite solutions.
(b) The best and the second best elite solutions.
(c) Random selection of elite solutions.
(d) The best elite solution and the elite solution most distant from the best. In our application, the distance between two layouts is computed as the sum of Euclidean distances between the corresponding nodes in the two layouts, as described in the Diversity() function used in Algorithm 1. The most distant solution is the one with the maximum sum of distances to the best elite solution, i.e. the most distant solution is the s ∈ refSet that satisfies:

$s = \arg\max_{x \in refSet} D^b_x$

where b is the best solution in refSet and $D^b_x$ is the level of dissimilarity between solutions x and b.
(e) The two most distant elite solutions.
We tested these five different variations on random connected graph datasets. The results showed that, overall, variation (d) resulted in a better final objective function value and a lower number of evaluated solutions than the others.
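Variation (d) can be sketched as follows (a hypothetical helper; entries are (cost, layout) pairs and `dissimilarity` is the node-wise distance sum described above):

```python
def select_solutions(ref_set, dissimilarity):
    """Variation (d): the guiding solution is the best elite solution
    (lowest cost) and the initiating solution is the elite solution
    most distant from it."""
    guiding = min(ref_set, key=lambda s: s[0])
    initial = max((s for s in ref_set if s is not guiding),
                  key=lambda s: dissimilarity(s[1], guiding[1]))
    return initial, guiding
```

The other four variations differ only in the two `min`/`max` criteria (e.g. variation (b) takes the best and second best by cost).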
We then examined the way the path is formed from the initial solution to the guiding solution. In the basic algorithm, the step-size we use to move from an initial solution to intermediate solutions is fixed (pathSqrSize). We examined whether using a variable step-size would improve performance: we move along the path such that the movement starts faster near the initial solution and becomes slower as it gets closer to the guiding solution, which intensifies the search in the area of the guiding solution. This strategy is applied in both directions: from the initial solution to the guiding solution and vice versa. This variation introduces two new parameters to our path relinking procedure: the number of iterations required to update the step-size (accelerationPeriod), and the rate of decrease of the step-size (accelerationRate). The net effect is to search more closely around the two known solutions than in the space between them.
We applied both variations to randomly generated connected graph layouts with different numbers of nodes and edges. The two variations used the same values for all path relinking parameters except for the newly introduced parameters, as they are only related to the variable step-size strategy. We ran both of them until they reached the stopping criteria. The results showed that using a variable step-size to move along a path can produce graph layouts with better objective function values compared to a fixed step-size strategy. Hence we apply this variable step-size in our algorithm.
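A possible sketch of the variable step-size traversal in one direction is shown below (the exact step-size update rule is our assumption; the paper applies the strategy in both directions and evaluates each intermediate layout):

```python
import math

def move_along_path(initial, guiding, step, accel_period, accel_rate,
                    max_steps):
    """Generate intermediate layouts by moving every node toward its
    counterpart in the guiding layout; the step-size shrinks by
    'accel_rate' every 'accel_period' iterations, so sampling becomes
    denser as the path approaches the guiding solution."""
    current = {n: list(p) for n, p in initial.items()}
    path = []
    for it in range(max_steps):
        if it and it % accel_period == 0:
            step *= accel_rate                 # slow the movement down
        reached = True
        for n, (gx, gy) in guiding.items():
            x, y = current[n]
            dx, dy = gx - x, gy - y
            dist = math.hypot(dx, dy)
            if dist <= step:
                current[n] = [gx, gy]          # snap onto the guide
            else:
                current[n] = [x + step * dx / dist, y + step * dy / dist]
                reached = False
        path.append({n: tuple(p) for n, p in current.items()})
        if reached:
            break
    return path
```

With a fixed step-size, intermediate layouts are evenly spaced along the path; shrinking the step concentrates them near the endpoint solutions, matching the intensification rationale above.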

Experimental results
Our research questions were: "Does our tabu search drawing method perform better than hill climbing and simulated annealing approaches?" and "Does coupling the method with path relinking improve the performance of the tabu search graph drawing method?" To answer these questions, we implemented and evaluated the methods. We also implemented hill climbing and simulated annealing, the two commonly used alternative neighbourhood search based approaches for graph drawing.
Three types of evaluations were carried out: 1. finding the best layout that can be achieved (i.e. minimizing the value of the objective function); 2. measuring how long it takes to draw a graph to a particular level of quality; 3. measuring how good the quality of the graph is after a fixed optimization budget (number of evaluated solutions).
These allow us to examine different possible use cases for graph layout: firstly, generating the best possible layout; secondly, speed to draw an acceptable layout; and thirdly, how good the graph layout can be if there is a fixed time available to produce it.
We first compared the performance of our tabu search method to hill climbing and simulated annealing. As both tabu search and simulated annealing showed a better performance than hill climbing, we then discarded hill climbing for future tests and moved on to comparing our improved tabu search method that has the addition of path relinking against just simulated annealing. Note that, in these experiments, we used the same search space structure, as described previously in section 3 (Fig 1), for all the graph drawing methods.

Parameter tuning
Each method has a number of parameters, and changing their values can affect both the performance of the method and the quality of the layouts it generates. Several experiments were conducted to calibrate the parameters of each method. The experiments show the effect of increasing and decreasing the parameter values on the performance of the method and the quality of the layout.
Parameter tuning is a common practice in search based methods. Typically, one parameter is tuned at a time, which might cause suboptimal choices, as parameters can interact in complex ways; however, tuning several parameters simultaneously requires a very large number of experiments. The drawbacks of experimental parameter tuning can be summarized as follows [41]: • Parameters are not independent, but testing all combinations systematically is practically impossible.
• The process of parameter tuning is time consuming, even if parameters are optimized one by one, regardless of their interactions.
• The selected parameter values are not necessarily optimal, even if the effort made for setting them was significant.
However, in this research, we applied a tuning process similar to those conducted in [2,11,42,43]. We performed a systematic incremental exploratory test over a wide range of values for each individual parameter in order to select a robust set of initial values. Then, for each parameter in turn, we tested a set of values while fixing the rest of the parameters, selected the value that gave the minimum objective function's value, and moved on to the next parameter to apply the same procedure.
The following are the selected values of the parameters in our tabu search method:
• maxIterations = 40
• tabuCutOff = 4
• intensifyIterations = 5
• intensifyCutOff = 0.005
• tabuDuration = 5
We also performed similar parameter tuning on the hill climbing and simulated annealing approaches. The hill climbing algorithm is affected by two parameters: the initial size of the square used to determine the neighbourhood solutions (initialSquareSize) and the value used to reduce the size of the square (squareReduction). The performance of the simulated annealing drawing algorithm is influenced by four parameters: the number of iterations for running the algorithm (maxIterations), the number of iterations at each temperature (iterPerTemp), the initial temperature used in the annealing process (initialTemp), and the temperature cooling down factor (coolDown). The selected values of these parameters for hill climbing and simulated annealing are given in the corresponding table. Six parameters affect our improved path relinking procedure: the number of iterations of the path relinking procedure (PRmaxIterations), the maximum number of solutions that can be added to the reference set of path relinking (refSize), the maximum length of the path (pathLength), the size of the square on whose border path relinking candidate solutions are located (pathSqrSize), the number of iterations required to update the size of the square (accelerationPeriod), and the rate of decreasing the searching step-size (accelerationRate).
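To make the roles of the tabu search parameters concrete, the following is a minimal, illustrative sketch of a tabu-search layout loop. The helper names, the stand-in objective, and the square-shrinking rate are our own assumptions for illustration; the actual method uses the multi-criteria objective and neighbourhood structure described in section 3.

```python
# Hypothetical sketch of a tabu search layout loop (not the paper's code).

def objective(layout):
    # Stand-in objective: pull every node toward the window centre (0.5, 0.5).
    return sum((x - 0.5) ** 2 + (y - 0.5) ** 2 for x, y in layout.values())

def neighbours(layout, node, square_size):
    """Candidate positions on a square around the node's current position."""
    x, y = layout[node]
    for dx in (-square_size, 0, square_size):
        for dy in (-square_size, 0, square_size):
            if dx or dy:
                yield (x + dx, y + dy)

def tabu_search(layout, max_iterations=40, tabu_duration=5):
    tabu = {}                      # (node, position) -> iteration it expires
    best, best_cost = dict(layout), objective(layout)
    square = 0.1                   # neighbourhood square size
    for it in range(max_iterations):
        for node in layout:
            moves = [p for p in neighbours(layout, node, square)
                     if tabu.get((node, p), -1) <= it]   # skip tabu moves
            if not moves:
                continue
            pos = min(moves, key=lambda p: objective({**layout, node: p}))
            cost = objective({**layout, node: pos})
            if cost < best_cost:   # improving move: accept and record
                layout[node] = pos
                best, best_cost = dict(layout), cost
            else:                  # non-improving move: make it tabu
                tabu[(node, pos)] = it + tabu_duration
        square *= 0.9              # gradually intensify the search
    return best, best_cost
```

The tabu list forbids recently rejected moves for tabuDuration iterations, which is the memory mechanism that distinguishes tabu search from plain hill climbing.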
For tuning the values of these parameters, we applied our improved graph drawing algorithm on 100 random connected graphs which were divided into five sets such that each set had a different number of nodes and edges as described in Table 1.
Since the improved procedure is called within our tabu search drawing algorithm, we used the same tabu search parameter values obtained in the earlier tuning experiment. To calibrate the parameters of the improved path relinking, we followed the same incremental testing process used for all the other methods. In the first stage of tuning, we selected arbitrary initial values for the parameters, chosen based on observations from a quick preliminary experiment: PRmaxIterations = 4, refSize = 20, pathLength = 10, pathSqrSize = 18, accelerationPeriod = 9, accelerationRate = 0.01. We started with one parameter, tested it thoroughly with different values, and selected the value that produced layouts with the minimum objective function's value. If the objective function's values were too close to each other, we selected the value that required the fewest evaluated solutions. We then fixed that value, moved to the next parameter, and repeated the procedure.
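The one-parameter-at-a-time procedure described above is a form of coordinate descent over a parameter grid. The sketch below uses a hypothetical evaluation function (distance from a made-up "best" setting) in place of actually running the drawing algorithm, which is what the real tuning does.

```python
# Sketch of one-at-a-time parameter tuning (hypothetical evaluation).

def tune_one_at_a_time(params, grids, evaluate):
    """Tune each parameter in turn, holding the others fixed, and keep the
    value that minimises the evaluation (the objective function's value)."""
    params = dict(params)
    for name, values in grids.items():
        scores = {v: evaluate({**params, name: v}) for v in values}
        params[name] = min(scores, key=scores.get)
    return params

# Stand-in evaluation: pretend the best settings are known and score by
# squared distance from them; the real evaluation runs the drawing method.
TARGET = {"PRmaxIterations": 10, "refSize": 20, "pathLength": 20}

def evaluate(p):
    return sum((p[k] - TARGET[k]) ** 2 for k in TARGET)

grids = {"PRmaxIterations": [1, 4, 7, 10],
         "refSize": [10, 20, 30, 40],
         "pathLength": [10, 20, 30, 40]}
initial = {"PRmaxIterations": 4, "refSize": 20, "pathLength": 10}
tuned = tune_one_at_a_time(initial, grids, evaluate)
```

As the paper notes, this procedure cannot account for parameter interactions: it finds the best value of each parameter only with the others fixed.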
We started the tuning process with the PRmaxIterations parameter, testing the values {1, 4, 7, 10}. Fig 3 shows that increasing the value of this parameter decreased the value of the objective function of the generated layout. Among the tested values, the best choice was 10.
For the next parameter (refSize), we used the set {10, 20, 30, 40}. With reference to Fig 4, the value of refSize that gave the best objective function's value was 20. All the tested values produced very similar objective function's values, but increasing this parameter slightly increases the number of evaluated solutions, as shown in Fig 5. We selected 20 because it gave a slightly better objective function's value (on the graphs labelled N250E2490) than the other values, and it required fewer evaluated solutions than the values 30 and 40. The length of the path from the initial solution to the target solution (pathLength) was tested with the values {10, 20, 30, 40}, and we selected 20. Although 20 did not outperform 10 on small graphs, the two behaved similarly on larger graphs, as shown in Fig 6. We first needed to test the effect of the initial square size on longer paths; if the effect was not significant, we could use the value 10 in further parameter testing.
The pathSqrSize parameter was tested with the values {5, 10, 15, 20}. According to Fig 7, the best choice was 20, since its objective function's value became slightly smaller as the graph size grew. The value 15 also produced good results, but 20 was better on larger graphs.
To test the effect of the accelerationPeriod parameter, we used the values {1, 5, 9, 13}. Fig 8 shows that changing this parameter did not greatly affect the value of the objective function, but Fig 9 shows that increasing it slightly increased the number of evaluated solutions. We therefore chose the value 5: although its objective function's values differed little from those for 9 or 13, it was better on larger graphs and required fewer evaluated solutions. The last parameter, accelerationRate, was tested with the values {0, 0.05, 0.1, 0.15}. Increasing this parameter beyond 0.05 increased the value of the objective function, as shown in Fig 10, while setting it to 0 produced larger objective function's values than 0.05. We therefore chose 0.05 at this stage, but in further experiments we tested values in the range between 0 and 0.05 to examine the behaviour of the objective function in that specific range.
We continued by performing two further rounds of tuning similar to the first stage. The first round examined the effect of the parameter values on the value of the objective function when the method performs a fixed number of evaluated solutions, whereas the second round examined the effect of the parameter values on the number of evaluated solutions required to reach a fixed objective function's value. In each round, all the parameters started with the values chosen in the preceding round, and each parameter was tested with a set of values close to the one previously chosen. The final chosen values of the parameters in our path relinking procedure are:
• PRmaxIterations = 4
• refSize = 20
• pathSqrSize = 20
• pathLength = 15
• accelerationPeriod = 7
• accelerationRate = 0.002
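The core idea of path relinking, as tuned above, can be sketched as follows: walk from an initial layout toward a guiding (elite) layout in pathLength steps, evaluating the intermediate layouts and keeping the best one. This sketch uses straight-line interpolation of node positions as a simplifying assumption; the paper's procedure additionally samples candidate positions on the border of a square of size pathSqrSize.

```python
# Simplified path relinking sketch (interpolation stands in for the
# paper's square-border candidate sampling).

def relink(initial, guiding, objective, path_length=15):
    """Walk from the initial layout toward the guiding (elite) layout in
    path_length steps, keeping the best intermediate layout found."""
    best, best_cost = dict(initial), objective(initial)
    for step in range(1, path_length + 1):
        t = step / path_length
        # Move every node a fraction t of the way toward its guiding position.
        current = {n: (x0 + t * (guiding[n][0] - x0),
                       y0 + t * (guiding[n][1] - y0))
                   for n, (x0, y0) in initial.items()}
        cost = objective(current)
        if cost < best_cost:
            best, best_cost = current, cost
    return best, best_cost
```

The value of the method comes from the fact that layouts between two good solutions are often better than either endpoint.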

Tabu search versus hill climbing and simulated annealing
First, in order to compare tabu search with hill climbing and simulated annealing, we generated random graph datasets in two categories. The graphs of the first category have the same number of nodes but different densities (i.e. different numbers of edges), whereas the graphs of the second category have different numbers of nodes and edges. The random graph generator is based on the Erdős-Rényi model [44]. Its parameters were the number of nodes and the density of the graph. Random locations for the nodes were generated based on the size of the window in which the graph is displayed. The generator then chose random nodes as end points of edges; a similar process was performed in [26]. Self-loops and multiple edges between the same pair of nodes were not allowed. Finally, the generator tested the connectivity of the generated graphs, and only connected graphs were accepted. There were 80 random graphs in the first category, split into 4 groups of 20 test cases. All the graphs in this category had 150 nodes. Each group had a differing number of edges so that the density varied, and the graphs within a group had the same number of nodes and edges. See Table 2 for the characteristics of the graphs in the first category.
The second category also had 80 random graphs, again split into 4 groups. The number of nodes varied between groups, increasing in steps of 50, and the density for each group was chosen to avoid overly dense graphs. The same random process used to generate the graphs of the first category was applied here. See Table 3 for the characteristics of the graphs in the second category.
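The generation procedure described above (random distinct end points, no self-loops or duplicate edges, retry until connected) can be sketched as follows. The function names and the rejection-until-connected strategy are our own illustrative choices.

```python
# Sketch of an Erdős-Rényi-style connected graph generator.
import random

def random_connected_graph(n_nodes, n_edges, seed=None):
    """Draw random node pairs (no self-loops, no duplicate edges) and
    regenerate until the resulting graph is connected."""
    rng = random.Random(seed)
    while True:
        edges = set()
        while len(edges) < n_edges:
            u, v = rng.sample(range(n_nodes), 2)  # distinct ends: no self-loops
            edges.add((min(u, v), max(u, v)))     # canonical form: no duplicates
        if is_connected(n_nodes, edges):
            return sorted(edges)

def is_connected(n_nodes, edges):
    """Depth-first search from node 0; connected iff every node is reached."""
    adj = {i: set() for i in range(n_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n_nodes
```

Note that n_edges must be at least n_nodes - 1, otherwise the connectivity test can never succeed and the loop will not terminate.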
The initial layout of nodes for each graph was random. We first applied our tabu search based approach, along with the hill climbing and simulated annealing approaches, to the graphs. Tabu search, path relinking, and hill climbing are deterministic methods that are not influenced by chance. Simulated annealing, on the other hand, is a stochastic method that includes an element of randomness in the neighbourhood searching process; therefore, it was run 30 times on each individual graph, and we took the median of the 30 runs to compare with the results of the tabu search and hill climbing approaches. Note that we modelled the neighbourhood transition probability of simulated annealing as described in [2]. To make a comprehensive comparison between the methods, we divided our experiment into three phases. In phase I, we applied the methods to the graphs of the two categories. The methods executed on the 20 test cases in each group, and the average objective function's value and the average number of evaluated solutions were computed for each group and each method. In this phase, the hill climbing approach executed until it found the best solution it could reach (i.e. a solution that cannot be further improved), whereas simulated annealing and tabu search, which are more flexible in how they reach a good solution, were run using the parameter values discussed in the previous subsection.
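The stochastic element of simulated annealing referred to above is its acceptance rule. A standard Metropolis-style sketch of the acceptance probability and a simple cooling schedule using the iterPerTemp and coolDown parameters is shown below; this is a generic formulation, not necessarily the exact transition model of [2].

```python
# Generic Metropolis-style acceptance and a geometric cooling schedule.
import math
import random

def accept(delta, temperature, rng=random.random):
    """Always accept improving moves (delta <= 0); accept a worsening move
    with probability exp(-delta / temperature)."""
    if delta <= 0:
        return True
    if temperature <= 0:
        return False
    return rng() < math.exp(-delta / temperature)

def anneal_schedule(initial_temp, cool_down, max_iterations, iter_per_temp):
    """Yield the temperature used at each iteration: iter_per_temp
    iterations per temperature level, then multiply by cool_down."""
    t = initial_temp
    for i in range(max_iterations):
        if i and i % iter_per_temp == 0:
            t *= cool_down
        yield t
```

At high temperatures almost any move is accepted (wide exploration); as the temperature falls the search behaves increasingly like hill climbing.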
The following figures show bar charts of the results obtained from phase I. Here we are looking for graph layouts with the minimum objective function's values that can be achieved. Fig 11 clearly shows the difference between the three methods in terms of the lowest objective function's value that can be obtained. Fig 12 shows the number of evaluated solutions required to achieve this objective function's value.
In phase II, we investigated the performance of the approaches rather than the quality of the produced layouts. The following process was performed to determine which method requires the fewest evaluated solutions to reach similar objective function's values: a. We ran the hill climbing method on the graphs until no improvement could be made to the value of the objective function. We started with hill climbing because, in phase I, it produced the worst-quality layouts of the three methods; simulated annealing and tabu search could therefore be expected to produce graph layouts of at least that quality.
b. We ran simulated annealing and tabu search methods until they reached an equal or better objective function's value compared to the one found by the hill climbing drawing algorithm.
c. We measured the number of evaluated solutions for each method. Table 4 gives the results obtained from phase II on the two categories of graphs (refer to the end of subsection 4.3 and section 6 for a complete description and interpretation of the p-value column).
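The phase II measurement, running a method until it matches a reference objective value and counting evaluated solutions, can be sketched with a simple harness. The generator-based "methods" here are stand-ins producing decreasing objective values; the real methods are the drawing algorithms themselves.

```python
# Sketch of the phase II measurement harness (stand-in methods).

def evaluations_to_match(method, target_cost, budget=100000):
    """Consume `method` (an iterator of successive objective values) until a
    value matches `target_cost`; return the number of evaluated solutions,
    or None if the budget is exhausted first."""
    for count, cost in enumerate(method, start=1):
        if cost <= target_cost:
            return count
        if count >= budget:
            return None
    return None

def fast_method():
    cost = 100.0
    while True:
        cost *= 0.8   # improves quickly per evaluated solution
        yield cost

def slow_method():
    cost = 100.0
    while True:
        cost *= 0.95  # improves slowly per evaluated solution
        yield cost
```

Counting evaluated solutions rather than wall-clock time makes the comparison independent of implementation efficiency, which matters given that the methods were coded by the same author.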
In phase III, we investigated the quality of the layout produced by the drawing algorithms in a fixed amount of time. The following process was performed to test which method produces graph layouts with the lowest values of objective function (best quality) when the three methods apply the same number of evaluated solutions: a. We ran the tabu search method on the graphs for a predefined number of iterations (maxIterations = 40). The number of evaluated solutions was computed and saved. We started with tabu search because, in phase I, it generated the smallest number of evaluated solutions.
b. We ran hill climbing and simulated annealing methods until they perform the same number of evaluated solutions performed by the tabu search method.
c. We measured the value of the objective function produced by the drawing algorithms in each of the above steps. Table 5 gives the results obtained from phase III when applied to the two categories of graphs.

Tabu search with path relinking versus simulated annealing
Here, we test the effect of adding path relinking to our tabu search algorithm. In this experiment, we exclude hill climbing, as the results of the previous experiment (subsection 4.2) showed that it performed considerably worse than both tabu search and simulated annealing in all phases. In order to avoid overfitting, where the drawing algorithm could be tailored to the dataset used in the first experiment, we generated new random graph datasets, again divided into two categories, using the same procedure as in our previous comparison.
In the first category, we had 80 random graphs split into 4 groups of 20 test cases. All the graphs in this category had 160 nodes, randomly positioned. Each group had a different number of edges so that the density varied; the graphs in each group had the same number of nodes and edges but different random layouts. See Table 6 for the characteristics of the graphs in the first category. The graphs of the second category were generated in the same way as the category II graphs described in the previous experiment. See Table 7 for the characteristics of the graphs in the second category.
We divided our experiment into three phases similar to those in the previous experiment. In the first phase, all the methods ran until they finished execution. Fig 15 shows the difference between the three methods (the combination of path relinking and tabu search, pure tabu search, and simulated annealing) in terms of the lowest objective function's value that can be obtained. Fig 16 shows the number of evaluated solutions required to reach this objective function's value.
In phase II, where we investigated the performance of the methods, we tested the number of evaluated solutions performed by each method to reach similar objective function's values. We ran tabu search first (as it produced the graph layouts with the largest objective function's values in phase I), so that the other methods could readily produce layouts of better quality than those produced by tabu search. We then ran the other methods until they reached an equal or better objective function's value compared to the one found by tabu search. Finally, we measured the number of evaluated solutions of each method. Fig 17 and Table 8 show the results obtained from phase II when applied to the two categories of graphs.
In phase III, we investigated the quality of the layouts produced by the drawing algorithms, testing which method produced graph layouts with the smallest values of objective function when they perform the same number of evaluated solutions. We ran the tabu search method on the graphs for a predefined number of iterations (maxIterations = 40), as described earlier in this subsection; we started with tabu search because, in phase I, it generated the smallest number of evaluated solutions. We then ran the other methods until they performed the same number of evaluated solutions as tabu search. Finally, we measured the value of the objective function produced by each drawing algorithm. Fig 18 and Table 9 show the results obtained from phase III on the two categories of graphs.
To test the effect of randomness in generating the initial graph layouts used in comparing the methods, we performed a statistical significance test on the results generated from the three phases. To demonstrate that there is a statistically significant difference between the methods, we applied the Friedman test [45], a non-parametric test for differences between several related samples that requires no prior knowledge of the distribution of the data.
We ran the methods on the 20 randomly generated test cases in each group of graphs in the first and second categories. Note that, for simulated annealing, we used the median of the 30 runs for each test case rather than the mean (the median is more robust to outliers), giving 20 medians, one per test case. We then compared these with the results computed by the other search based methods using the Friedman test with a significance level α = 0.05. See the p-values in the last column of Table 4, Table 5, Table 8, and Table 9.
The Friedman test allowed us to conclude that there is a significant difference between the methods, but it does not show how the methods differ from each other. Therefore, a post-hoc test for multiple pairwise comparisons had to be conducted; we used the Bonferroni method [46,47], a simple method for pairwise comparisons (see Tables 10 and 11 for the p-values). Note that all the statistical tests were conducted using the R statistical package i386 (version 3.1.1).
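The statistics above were computed in R, but the mechanics are simple enough to sketch. The following pure-Python version computes the Friedman chi-square statistic (without tie correction, a simplification) and the Bonferroni adjustment of pairwise p-values.

```python
# Simplified Friedman statistic and Bonferroni adjustment (no tie
# correction; the paper's analysis used R's implementations).

def friedman_statistic(samples):
    """Friedman chi-square statistic for k related samples of equal length n
    (one measurement per test case per method). Each test case's results
    are ranked across the methods (rank 1 = smallest value)."""
    k, n = len(samples), len(samples[0])
    rank_sums = [0.0] * k
    for case in range(n):
        order = sorted(range(k), key=lambda j: samples[j][case])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)

def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each pairwise p-value by the number
    of comparisons, capping at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

For large n the statistic is compared against a chi-square distribution with k - 1 degrees of freedom to obtain the p-value.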
In terms of threats to validity, three deterministic algorithms and one stochastic algorithm were examined. The deterministic methods were applied once to the same initial graph layout, whereas the stochastic method was applied 30 times to the same graph. The main internal threat lies in the implementation of the algorithms. The methods were implemented by the same coder and run on the same machine, so there is the possibility that one of the methods was implemented more efficiently. However, the methods share substantial code, which increases confidence that none was particularly disadvantaged. In addition, a systematic parameter tuning method was applied. In terms of external threats, a threat to the generalizability of the results is possible. Selection bias was avoided by using randomly generated graphs (except in the parameters of the generation algorithm, such as the number of nodes and edges). However, randomly generated graphs generally do not have the same characteristics as real world graphs. In section 5, we explore the methods applied to standard public real world datasets sourced from the Internet.

Scalability and performance analysis
In order to test the scalability of our method and its ability to work effectively on large graph datasets, we ran our method against simulated annealing on randomly generated large graphs according to phase I as described in subsection 4.2. Note that we excluded hill climbing from this comparison, as the statistical tests in subsection 4.2 showed that it is considerably worse than the other methods. We ran simulated annealing 30 times on each dataset and recorded the median value for each set. The graphs were generated using the same generator discussed in subsection 4.2 and in [44]. We started with a graph dataset of 1000 nodes and 3003 edges and kept increasing the number of nodes and edges from one dataset to the next, as shown in Table 12. We stopped increasing the size of the datasets when one of the tested methods took a very long execution time (almost half a day). Figs 23-25 show the overall performance of our method when applied to this set of graphs with increasing numbers of nodes and edges, as described in Table 12.

Real world graph datasets
After performing several experiments on random graphs, we tested our methods on real world graph datasets to demonstrate that we can reproduce similar results in a real world setting. We selected 10 datasets from different sources, as shown in Table 13, which also gives the number of nodes, the number of edges, and the density of each graph. The graphs have different sizes and densities. The initial layout of the nodes in each graph was generated randomly. The hill climbing, tabu search, and path relinking algorithms were run once on the same initial layout, whereas simulated annealing was run 30 times, as before, and we calculated the median; the objective function's values produced by each method on the graphs of Table 13 under phase I were recorded. Fig 29 shows the number of evaluated solutions performed by each method when tested on the real world graphs described in Table 13 according to phase II, while Fig 30 demonstrates the values of the objective function produced by each method when following the phase III experiment on the same set of data.
Figs 31 and 32 show two examples of the layouts produced by all the methods when applied to graph 3 and graph 5, respectively, from the list of real world datasets in Table 13. We also report the normalized value of each aesthetic used in our objective function independently when the methods were applied to both graphs, as shown in Table 14 and Table 15.
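The per-aesthetic breakdown in Tables 14 and 15 reflects how the weighted-sum objective is assembled. A minimal sketch is below; the aesthetic names are taken from the criteria discussed in the introduction, but the particular values and weights are purely illustrative, not the paper's actual settings.

```python
# Sketch of a weighted-sum multi-criteria layout objective
# (illustrative aesthetics and weights, not the paper's values).

def weighted_sum(normalized, weights):
    """Combine normalized aesthetic values (assumed in [0, 1], 0 = best)
    into a single objective using per-criterion weights."""
    assert set(normalized) == set(weights)
    return sum(weights[name] * normalized[name] for name in normalized)

aesthetics = {"edge_crossings": 0.10, "edge_length_deviation": 0.25,
              "node_occlusion": 0.05, "angular_resolution": 0.40}
weights = {"edge_crossings": 3.0, "edge_length_deviation": 1.0,
           "node_occlusion": 2.0, "angular_resolution": 1.0}
score = weighted_sum(aesthetics, weights)
```

Normalizing each aesthetic before weighting is what makes the per-criterion values in Tables 14 and 15 comparable across methods.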

Analysis of results
Our graph drawing method of coupling tabu search with path relinking outperforms the other tested methods in terms of the quality of the produced graph layouts and the number of evaluated solutions needed to reach a particular objective function's value.
In experiment 1 we tested simulated annealing, hill climbing and tabu search. In experiment 2 we tested simulated annealing, tabu search and tabu search with path relinking. We covered three comparisons: phase I, diagram layouts with the best objective function's value that can be achieved; phase II, number of evaluated solutions performed by each drawing algorithm to reach a particular level of layout quality; and phase III, quality of layout drawn by drawing algorithms after a fixed number of evaluated solutions.
When we were looking for the layout with the best objective function's value that can be achieved (experiment 1, phase I), Fig 11 shows that simulated annealing produces graph layouts with the best objective function's value compared to hill climbing and tabu search, with hill climbing the worst. On the other hand, simulated annealing evaluates a large number of solutions to obtain these layouts compared to tabu search, and Fig 12 shows that tabu search clearly outperforms the other two methods in terms of efficiency. In experiment 2, phase I, Fig 15 shows that tabu search with path relinking outperforms the other methods in the quality of the layouts, but with the highest number of evaluated solutions, as shown in Fig 16. In experiment 1, phase II, where we tested the number of evaluated solutions performed by the drawing algorithms to reach a particular level of quality (see Fig 13 and Table 4), our tabu search method generates graph layouts of good quality with fewer evaluated solutions than hill climbing and simulated annealing. Calling the path relinking procedure within our tabu search procedure improves the performance of the drawing algorithm and reduces the number of evaluated solutions in experiment 2, phase II (see Fig 17 and Table 8). In experiment 2, phase II (see Table 10), path relinking clearly outperforms simulated annealing in drawing graph layouts with similar objective function's values using fewer evaluated solutions. It also outperforms the pure tabu search procedure on large graphs (as the number of nodes increases), whereas on smaller graphs there is no significant difference. Finally, for phase III, we ran the drawing algorithms so that they evaluated a specific number of solutions, to test the quality of the layouts that would be generated in a set time.
For experiment 1, phase III, we conclude from Fig 14 and Table 5 that our tabu search approach draws graph layouts with better quality (or similar quality in the worst case) compared to hill climbing and simulated annealing when they evaluate the same number of solutions. Adding path relinking to tabu search in experiment 2, phase III, improved the quality of the layouts compared to those produced by the pure tabu search procedure using the same number of evaluated solutions, as shown in Fig 18 and Table 9. Table 11 shows that our tabu search/path relinking method draws graph layouts with better quality than simulated annealing. It also outperforms pure tabu search as the size of the graph increases, but there is no significant difference on smaller graphs. The results of our experiments give us strong evidence that our tabu search with path relinking outperforms the hill climbing, simulated annealing, and pure tabu search procedures, and has better scalability.

Conclusions
We have described a novel automated neighbourhood search method for drawing general graph layouts with undirected straight lines, based on a weighted sum multi-criteria optimization. Our new method is based on tabu search with path relinking. The method searches for the best positions of the nodes, minimizing the value of the objective function and producing a pleasing graph layout. The integration of features of tabu search and path relinking in one implementation makes our method a more effective graph layout method than other well-known neighbourhood search methods such as hill climbing and simulated annealing. The key feature of tabu search is the combination of forbidding reverse moves, using a memory-based tabu list, with allowing escapes from local optima; building a reference set of elite solutions generated by tabu search and moving efficiently along the path between two solutions are the main aspects of our path relinking procedure. We also developed a systematic way of choosing the values of the parameters used by the method. Our experimental results on random graphs and real world graphs show that our tabu search/path relinking approach draws graph layouts of good quality in a relatively low number of evaluated solutions. Coupling tabu search with path relinking outperforms simulated annealing and hill climbing in terms of both quality of layout and speed of layout.
In terms of future work, the performance of our method could be further improved by implementing a hybrid of path relinking and a Greedy Randomized Adaptive Search Procedure (GRASP); this combination has previously been applied efficiently in many applications with promising results [13]. In addition, further investigation can be performed on the effectiveness of our approach in comparison with force-directed approaches and other population based approaches that have previously been used in graph drawing, such as ant colony optimization [58]. We could also use a double-blind test with real human users to evaluate the layouts generated by different graph drawing algorithms, since visualization is also concerned with how significant the differences are to the human eye and the human sense of aesthetics. Finally, experiments can be conducted to study the efficiency of this method when applied to different types of graphs such as trees, hierarchical graphs, and circular graphs. Our method can easily be adjusted to work with directed edges, but each of these graph types has its own aesthetic measures, such as subtree separation and closest and farthest leaves for trees; uniform edge direction and cycle removal for hierarchical graphs; and partitioning the graph into clusters and placing the nodes of each cluster on the perimeter of an embedding circle for circular graphs [59]. These aesthetics, in addition to the ones discussed in this paper, which usually exist in any graph, must be formulated into a weighted sum multi-criteria objective function to be optimized by our proposed method.