Optimal Routing in General Finite Multi-Server Queueing Networks

The design of general finite multi-server queueing networks is a challenging problem that arises in many real-life situations, including computer networks, manufacturing systems, and telecommunication networks. In this paper, we examine the optimal routing problem in arbitrary configured acyclic queueing networks. The performance of the finite queueing network is evaluated with a known approximate performance evaluation method and the optimization is done by means of a heuristics based on the Powell algorithm. The proposed methodology is then applied to determine the optimal routing probability vector that maximizes the throughput of the queueing network. We show numerical results for some networks to quantify the quality of the routing vector approximations obtained.


Introduction
The design of networks with random arrivals, random service times, multiple servers, and a finite number of buffer spaces is a challenging problem that arises in many real-life situations, e.g. in manufacturing systems [1,2], computer networks [3,4], public services [5,6], call centers [7,8], pedestrian and vehicular traffic [9][10][11][12], among other situations. This problem is challenging because finite queueing networks are notoriously difficult to analyze analytically, and closed form expressions are not easily constructed for the performance measures of such systems. Also note that the underlying network design problems involved are usually very hard to solve.
In fact, there are several distinct network design optimization problems associated with finite queueing networks. According to Daskalaki & Smith [13] the optimal network design problem can be split up into three interrelated optimization problems: 1. The optimal topology problem (OTOP), which deals with decisions of the design of the network itself, that is, the number of nodes (e.g. workstations, warehouses, delivery points, etc.) and arcs (e.g. corridors, conveyors, escalators, etc.) and the general configuration of these two components; 2. The optimal routing problem (OROP), which deals with determining the routing probabilities in the network defined by the first problem; 3. The optimal resource allocation problem (ORAP), which deals with the optimal allocation of the scarce resources in the network, e.g. the number of buffers (i.e., the buffer allocation problem, BAP) and the number of servers (i.e., the server allocation problem, CAP).
These three problems are challenging and difficult optimization problems. For an arbitrary topology, the OTOP is shown to be N P{hard [14], and the same is conjectured for the general ORAP [15].
Previous work focused mainly on the ORAP in open finite acyclic queueing network settings. Both BAP and CAP are probably among the most well-known optimal resource allocation problems [16]. For instance, Cruz et al. [17] and Smith et al. [18] looked into the BAP, both in a single and in a multi-server setting, and Smith et al. [19] proposed algorithms to solve the CAP. However, the routing probabilities are usually assumed to be known beforehand for BAP and CAP [20].
The overall research objective of this paper is to build a relevant model and solution methodology for the system's throughput maximization problem. In this paper, we focus on optimizing the routing probabilities through the queueing network, i.e. the OROP. A similar research question is tackled by Daskalaki & Smith [13] in which they evaluated the joint effect of buffer allocation and routing on the throughput. Earlier, Gosavi & Smith [21] focused on the routing optimization problem related to the overall objective of throughput maximization. The common ground of both papers is that they used queueing networks with single servers and exponential service times [13,21]. Kerbache & Smith [22] considered, for different product classes, the optimal routes conjoint with a variant of the optimal topology problem to determine the connected arcs in a single server queueing network setting. Distinct from Daskalaki & Smith [13] and Gosavi & Smith [21] is that Kerbache & Smith [22] considered general arrival times, general service times, and single server queues. Secondly, Gosavi & Smith [21] did not consider the general expansion method (GEM) in their analysis as the evaluation tool [23].
Specifically, we examine the OROP, by means of a combination of the GEM and a heuristic based on the Powell algorithm [24], specifically for acyclic networks of M=G=c=K queues, which in Kendall notation means a queueing system with Markovian arrival rates, General service times, c parallel servers, and a total capacity of K items, including those items in service (practical applications to M=G=c=K queueing networks include manufacturing and service systems [25] and transportation and material handling systems [26]). The results are compared to simulations to attest for the quality of the routing vectors obtained. Besides, another important contribution of this paper is to present helpful approximations to swift managerial decisions regarding the optimal routing probability vectors to maximize the overall throughput in a network of finite general-service queues. We also present important empirical arguments to show that these approximations are effective. This paper is organized as follows. The next section describes in detail the mathematical model formulation considered and elaborates on both the performance evaluation tool and on the optimization procedure. In the following section detailed results are given for the problem on hand. Finally, the last section concludes this paper with final remarks and topics for future work in the area.

Mathematical Programing Formulation
The problem is defined on a digraph D~(V ,A), in which V is the set of vertexes (finite queues) and A is the set of arc (connections between the queues). Each vertex (queue) is characterized by Poisson arrivals, general service, and multiple servers. Mathematically, the optimal routing problem can be formulated as follows. (OROP): subject to: in which H(a ) is the throughput, which is the objective that must be maximized, a the optimal routing probability matrix, a:½a i, j , i.e. the matrix that maximizes the objective function of this predefined network, and d(i) is the set of succeeding vertexes of vertex i, that is, d(i):fjj (i, j) [ Ag: The throughput will thus be affected by the effective routing of jobs through the network, the variability of the service distribution, the number of servers, and the number of buffers. It should be clear that the above model is a highly difficult stochastic programming problem to handle due to the inherent non-linearity of the objective function combined with the absence of any closedform expressions for the throughput H( a): Proposed Algorithm Figure 1 presents the proposed algorithm to solve the OROP in order to provide more insights into the interaction between the performance evaluation tool and the optimization method.
The initial routing probability vector is conveniently set to the inverse of the number of nodes after a split, in which n i is the number of succeeding nodes to node i, that is, the cardinality of set D(i): The optimization step itself is an iteration in which new solutions are generated following the Powell logic until convergence, that is, until the difference in H, DH:(H (k) {H (k{1) ), is less than a predefined tolerance e: The Powell algorithm can be described as an unconstrained optimization procedure that does not require the calculation of first derivatives of the function. Numerical examples have shown that the method is capable of minimizing a function with up to twenty variables [24,27]. The Powell method locates the minimum of a non-linear function f (x) by successive uni-dimensional searches from an initial starting point x (k) along a set of conjugate directions. These conjugate directions are generated within the procedure itself. The Powell method is based on the idea that if a minimum of a non-linear function f (x) is found along p conjugate directions in a stage of the search, and an appropriate step is made in each direction, the overall step from the beginning to the p-th step is conjugate to all of the p sub-directions of the search. We have seen remarkable success in the past with coupling Powell algorithm and the GEM [19]. We discuss the GEM in detail now, which is also the method we used to obtain the performance measures for the problem studied in this paper.

Performance Evaluation
In previous papers (see e.g. Kerbache & Smith [23,28]) the GEM has been successfully used to evaluate the performance measures of acyclic networks of finite queues. The GEM is a robust and effective approximation technique that is basically a combination of repeated trials and node-by-node decomposition in which each queue is analyzed separately and then corrections are made in order to take into account the interrelation between the queues in the network. The GEM has three stages, Network Reconfiguration, Parameter Estimation, and Feedback Elimination, to be described as follows.
Stage I: Network Reconfiguration. The first step in the GEM involves reconfiguring the network. An artificial vertex h j is added preceding each finite vertex j in the network. The artificial vertex is added to register the blocked customers at node j and is modeled as an M=G=? queue, as shown in Figure 2.
When an entity leaves vertex i, vertex j may be blocked with probability p Kj , or unblocked, with probability (1{p Kj ): Under blocking, the entities are rerouted to vertex h j for a delay while node j is busy. Vertex h j helps to accumulate the time an entity has to wait before entering vertex j and to compute the effective arrival rate to vertex j: In other words, the GEM ultimate goal is to provide an approximation scheme to update the service rates at the upstream vertex i to take into account all blocking after service caused by the downstream vertex j : in which m i is the exponential service rate at vertex i, p K j is the blocking probability of finite queue j of size K j , m' h j is the corrected exponential service rate at the artificial vertex h j , andm m i is the updated service rate at vertex i: As a final note, an important point about this process is that we do not physically modify the networks, only represent the expansion process through the software. Stage II: Parameter Estimation. In the second stage, the parameters p K , p' K , and m h must be estimated, which is done essentially by utilizing known results for queueing theory. Index j is omitted for simplicity. p K : Analytical results from the M=M=c=K queue provide the following expression for the blocking probability p K : in which for =(cm)=1: However, the interest is on M=G=c=K models, for which there is not exact closed form expression for p K : Then approximations must be used and Kimura's [29] two moment approximation has proven to be very effective in similar cases [25,30]. For example, let us fix c~2 and the following closed form expression can be developed from Equation (6), for the optimal buffer size B M~K {2 for Markovian M=M=2=K queues, as a function of the blocking probability: The following Kimura's two moment approximation can be used to approximate the optimal buffer size B (s 2 ) of a general service M=G=2=K queue: in which s 2 is the squared coefficient of variation of the service time distribution at the queue, r:=(cm) is the traffic intensity, B M is the exact buffer size for a respective Markovian system, and NINT ( x) is the nearest integer to x: Now, if we invert Equation (9) to  This is a process that can be extended for cw2: In fact, expressions for p K of up to c~500 are available [30]. Another expressions, for c~3, . . . ,10, are included in the software so that we have a complete set of blocking probabilities for c [ f1, . . . ,10g: p' K : Since there is no closed form solution for this quantity the following approximation is used given by Labetoulle & Pujolle [31] obtained using diffusion techniques.   Table 1. Results for two-branch split networks.
in which r 1 and r 2 are the roots to the polynomial , in which j and h are the actual arrival rates to the finite and artificial holding nodes respectively. Furthermore, the arrival rate to the finite node j is given by: The delay distribution of a blocked customer at the holding node has the same distribution as the remaining service time of the customer being serviced at the node doing the blocking. Using renewal theory, one can show that the remaining service time distribution has the following rate m h : in which s 2 is the service time variance given by Kleinrock [32]. Notice that if the service time distribution at the finite queue doing the blocking is exponential with rate m j , then: that is, the service time at the artificial node is also exponentially distributed with rate m j : When the service time of the blocking node is not exponential, then m h will be affected by s 2 : Stage III: Feedback Elimination. This stage is simply to eliminate the feedback loop, by recomputing the service time at vertex h k : The updated service rate is given by: Summary. Similar equations can be established with respect to each of the finite vertexes (queues). Ultimately, we have simultaneous non-linear equations in variables p K , p' K , and m h , along with auxiliary variables such as m j and i : Solving these equations simultaneously, we can compute all performance measures of the network.

Numerical Results and Discussion
The software implementation is currently in Fortran 77. The compiler used was the DIGITAL Visual Fortran, Version 6, with the IMSL Fortran 90 MP Library version 3.0 for Microsoft Windows, to solve the nonlinear equations from the GEM. In our implementation, we set e~10 {1,000 , which proved to be effective based on the experiments. We first discuss the shape of the objective function. Secondly, we will give more insights for a number of split structures. We end the numerical results with some Table 2. Perturbations around the optimal solution of two-branch split networks.  Table 3. Results for three-branch split networks.

Shape of the Objective Function
It is interesting to analyze the shape of the objective function for the optimization problem described earlier. The case discussed here is defined as follows. We have a three node network with a split into two branches, as seen in Figure 3. The general parameters for the base case are c 1~4 , K 1~2 0, and c i~2 and K i~2 , for i~2,3: The criteria to select those parameters is such that the number of servers c 1 and the total capacity K 1 of node 1 is larger than the others as to prevent it from becoming a bottleneck.
We are particularly interested in the relationship between the overall throughput H~h 1 zh 2 , the routing probability a 1, 2 , the arrival rate 1 , and the squared coefficient of variation of node 2, s 2 2 : Consequently, we set m i~2 , for all nodes, and s 2 1~s 2 3~1 : The sensitivity of these settings on the throughput is not analyzed now, but is amongst others the subject of study in the next sections. Next, we enumerate all possible combinations for 1 , a 1, 2 , and s 2 2 , and then analytically obtain the corresponding throughput H, which is shown in Figure 4 (always on the vertical axis), as a function of 1 , a 1, 2 , and s 2 2 : Figure 4-(a) clearly shows that the arrival rate is interacting with the routing probability. For low arrival rates, the network has low utilization. Consequently, different routing probabilities do not result in large changes in throughput H: On the other hand, for large arrival rates, 1 w5, one clearly sees an optimal point in regard to the routing probability. Due to the symmetrical structure considered, a 50% split seems to be optimal here. Figure 4-(b) looks into the joint effect of changing the squared coefficient of variation, s 2 2 , together with the routing probability a 1, 2 : Again the inverted U-shape effect with a maximum around the 50% routing probability is visible. Next to this, it is clear that increasing the squared coefficient of variation from 0 to 2 reduces the overall throughput H but has a smaller impact on throughput than the routing probability. Based on this simple network structure, it is evident that the routing probabilities and the squared coefficient of variation affect the throughput to a large extent. Consequently, correctly setting the routing matrix a leads to significant gains in terms of throughput.

Basic Split Networks
In this section, we analyze further the two-branch network from Figure 3 and include in our analysis the three-branch network seen in Figure 5. We are interested in assessing the influence of the number of servers c i , total capacities K i , service rates m i , and squared coefficient of variation of the service times s 2 i , Vi [ V , in the model OROP, Equations (1) -(3). We choose to start with these two variants of a basic split structure as, from a routing allocation point of view, splits are the key building blocks in a generally configured network. The nodes after the splits are the ones of interest here. The first buffer K 1~2 0 is larger than the others (K i~2 , i~2,3,4) as this will help to prevent the first queue of becoming a bottleneck node. The arrival rate 1 is set equal to values f3,5,7g: Split with Two Branches. The top part of Table 1 gives the results for a two-branch split networks for unbalanced service rates m and different squared coefficients of variation s 2 2 and s 2 3 : In these cases the service rate of node 2 is made either relatively lower (m 2~1 versus m 3~2 ), or equal (m 2~2 versus m 3~2 ), or higher (m 2~3 versus m 3~2 ) than the service rate of node 3. The base cases (sets B1d to B1f) show that the routing probability is roughly equal to 0.5 when the nodes after the split are identical (that is, same number of servers, capacities, service rates, and squared coefficient of variation). Moreover, this results appears to be insensitive to changes in the squared coefficient of variation of both nodes after the split. Of course, the throughput H is affected (reduced) due to the changes (increase) in the variability. Secondly, changing the service rate of node 2, m 2 (and keeping all other parameters equal to the base case settings), clearly shows that the fast nodes receive preference over the slow nodes. For example, in sets B1a to B1c (i.e., when node 2 is slower than node 3) a lower routing probability is set to node 2 (0.3334) than the one to node 3 (0.6666).
Rather than changing the squared coefficient of variation of both nodes after the split, we evaluated some unbalanced cases where only node 2 is affected by a different squared coefficient of variation, s 2 2~f 0:0,0:5,1:0,1:5,2:0g (sets B1j to B1x, Table 1), combined with (m 1 ,m 2 ,m 3 )~f(2,1,1),(2,2,2),(2,3,3)g: For these cases, we observe that the unbalance caused by the squared coefficient of variation only slightly changes the routing probability compared to sets with equal squared coefficients of variation (sets B1l, B1q, and B1v, Table 1). This is a confirmation of what we observed when evaluating the objective function earlier in the previous section. As we are now focusing on the small scale networks, this conclusion does not mean that the squared coefficient of variation has little effect in general. It is interesting to see that the throughput value is reducing as the squared coefficient of variation goes up although the routing probability is changing to protect more against the uncertainty in the second node. This is more prevalent in highly loaded systems.
For the two-branch split networks, we evaluated a number of routing vectors around the optimal routing obtained. Table 2 shows that the algorithm seems to have reached the optimal allocation for the routing probabilities into nodes 2 and 3 (set B1e, Table 1). Of course, one might argue that the optimization is rather easy due to the symmetric setting of the parameters. Therefore, we did the same analysis for the same parameter settings but with a network with unbalance in the service rates (set B1b, Table 1), also seen in Table 2.
In conclusion, we observed that in previous results the optimization algorithm tries to balance out the flow taking into                               account the differences (in service rates and squared coefficient of variation) among the two nodes after the split, which is intuitively logical as this strategy leads to the highest throughput. Moreover, the methodology seems to always find the optimal routing vector. Split with Three Branches. Let us now turn to the threebranch split network, Figure 5. It would be interesting to see to what extent the optimization algorithm balances the flow over the three nodes after the split and to what extent this is affected by the characteristics of the different nodes after the split. Table 3 shows a selected set of experiments done for this specific case. Table 3 shows that for the complete symmetric case, that is, set B2c, Table 3, again the routing probabilities are symmetric, i.e. a i, j~0 :333,V (i,j) [ A: For the unbalanced cases in the squared coefficient of variation (sets B2a, B2b, B2d, and B2e, Table 3), it can be observed that the routing probability into the two identical nodes (a 1, 3 and a 1, 4 ) are close to each other. For the remaining asymmetrical cases (sets B2f to B2o, Table 3), again the same conclusion holds. The faster (either in high number of servers or service rates) or more reliable (in terms of low squared coefficient of variation) are the nodes, more favored they are, resulting in high routing probabilities into these nodes.

Complex Networks
The simple networks discussed so far are interesting as they make it possible to show the behavior and logic of the optimization model in the presence of one split. In this section, we will evaluate some different complex topologies with regard to their routing probabilities. The first complex network considered is an extension of the two-and three-branch split networks, as depicted in Figure 6. Table 4 gives an overview of a selected set of experiments for the structure C1. The initial setting is again a balanced case, that is, c 1~5 , K 1~2 0, m 1~2 , c i~2 , K i~2 , m i~2 , s 2 i~1 , for i~2,3, . . . ,7 (set C1a, Table 4). Additional set of experiments involves unbalancing the service rates m i , the squared coefficients of variation s 2 i , and the number of servers c i , for nodes 6 and 7. With these experiments, we evaluate whether and how the methodology takes the characteristics of the complete sub-network after the split into account in determining the optimal routing vector.
We set up the experiments in such a way that either there are slow nodes (experiments C1c, C1e, C1g and C1h) or slow subsystems consisting of three connected nodes (experiments C1b, C1d, C1f, and C1i). Based on Table 4, we observe that in general the slower part of the network tends to receive less flow due to a lower routing probability into the relevant part. When after the first split in node 1 there is the choice to go to either the fast or slow subsystem, the faster subsystem is preferred. This is very clear in experiments C1b, C1d, C1f, and C1i, when the routing probability always favors the fastest downstream subsystem. However, if the last nodes are different (experiments C1c, C1e, C1g, and C1h), the conclusion is different. In all these experiments, the first split is just exactly half. The imbalance in the last nodes (i.e. nodes 4, 5, 6, and 7 are different), is completely absorbed in the routing probability at the immediately preceding nodes (i.e. nodes 2 and 3). Interestingly, this effect did not propagate upstream and did not affect the routing at the first split. Again, we see that the effect of the squared coefficient of variation on the routing probability is smaller compared to the number of servers or the service rates.
The second network structure C2 has a more general structure than the other networks, as seen in Figure 7. Nodes 10 and 12 can act as a bottleneck node which might become overloaded depending upon the specific parameters. It is then interesting to Table 7. Evaluating the approximations for some C-sets.  Table 5 gives the results for a selection of parameter settings for network structure C2. The table shows that when node 10 becomes the bottleneck, routing the jobs into the direction of node 10 is avoided by reducing the routing probability at node 3 (always around 0.4) and node 5 (ranging between 0.0409 to 0.4751). On the other hand, when the characteristics of node 10 are such that it is not a bottleneck, then the routing to nodes 5 and 6 is almost 50/ 50. Secondly, it is clear that if the last node (node 12) becomes the bottleneck, only the throughput will be reduced.

Approximations for the Routing Probabilities
From a managerial point of view, it is interesting to have some good easy approximations that can be used to quickly set the routing probabilities. A number of possible approximations for the routing probabilities in the arc (i,j) [ A, after a split of vertex i into n i vertexes, can be considered.
a (4) i, j~c in which d(i) is the set of succeeding vertexes of vertex i, that is, d(i):fjj (i,j) [ Ag: Notice that n i is the cardinality of set d(i): The first approximation, Equation (13), is simple but does not use any information from the n i vertexes after the split. This approximation only provides an equal spread of the throughput over the succeeding vertexes. It is expected that this approximation works well when the nodes after the split are very similar in terms of service rate, number of servers, and so on. The other approximations, Equations (14), (15), and (16), do take more information into account. Equation (16) is believed to be the most general as it combines information in regard to the speed and the number of servers. On the other hand, no information about the squared coefficient of variation is included in none of the approximations.
Tables 6 and 7 show that the performance of approximation a (4) i improves as the network becomes more unbalanced (for instance, cases B1a-B1c are unbalanced, as defined in Table 1, and for these cases the smallest D% found is for a (4) i , D%~0:00%, on the other hand cases B1d-B1f are balanced and for them all D% is equal to 0.00%). This approximation of course takes into account the most information from the nodes after the split (not taking into account the squared coefficient of variation). If the nodes after the split are more alike (balanced) then the second approximation becomes favorable. On the other hand the first approximation a (1) i is performing acceptable as well and could be preferred due to the easy implementation.

Conclusions and Final Remarks
In this paper, we examined the optimal routing problem in open finite acyclic queueing networks with a given general topology and multiple generally distributed servers. We determined the optimal routing probability vector that maximizes the throughput of an arbitrary configured network via a combination of the Generalized Expansion Method and Powell optimization tool. We presented numerical results showing the merits of the approach. Approximations for the routing probability vector are also presented and evaluated.
We have considered here only the throughput as the main performance measure. It would also be interesting to evaluate the behavior of the routing algorithm to minimize the cycle time, the work-in-process (WIP) or other performance measures. Topics for future research on the area include the routing in queueing networks with cycles, e.g., to model many important industrial systems that have reverse streams of products due to re-work, or even the extension to GI=G=c=K queueing networks.