Queues with Dropping Functions and General Arrival Processes

In a queueing system with the dropping function the arriving customer can be denied service (dropped) with the probability that is a function of the queue length at the time of arrival of this customer. The potential applicability of such mechanism is very wide due to the fact that by choosing the shape of this function one can easily manipulate several performance characteristics of the queueing system. In this paper we carry out analysis of the queueing system with the dropping function and a very general model of arrival process—the model which includes batch arrivals and the interarrival time autocorrelation, and allows for fitting the actual shape of the interarrival time distribution and its moments. For such a system we obtain formulas for the distribution of the queue length and the overall customer loss ratio. The analytical results are accompanied with numerical examples computed for several dropping functions.


Introduction
Consider a simple queueing system with a stream of arriving customers, the queue of customers waiting for service and one service station (server) performing the service that takes random time. In order to control the performance of such system (e.g. to keep the mean queue length below 10) we have three options: (a) manipulate dynamically the service rate; (b) manipulate dynamically the customer arrival rate; (c) deny the service to some customers and not allow them to the queue.
All the listed approaches are equally good for control purposes in the sense that we can achieve the same results using each of them. The difference between them is in the practical applicability, which depends on the character and purpose of the queueing system of interest.
In many everyday-life queues, approach (a) can be used. At a bank for instance, additional cashiers can be assigned for serving an exceptionally long queue, thus multiplying the service rate. For serving exceptional passenger traffic, a bus (train, ferry) of larger capacity can be used occasionally, etc.
Usually, the application of (a) is connected with some additional costs (salary, equipment, energy consumption etc.). Of course, it can be applied only from time to time, using some sophisticated policy based on current queue size or predictions on its future size.
In some cases however, it is not possible to use (a). In networking for instance, when a queue of packets is being transmitted at a router's output interface, it is not possible to enlarge the throughput of the physical output link on demand.
Approach (b) cannot be used in most everyday-life applications of queueing systems. This is due to the fact, that the operator of the queue has usually no means to reduce quickly the customer arrival rate (think of a bank, for instance). On the other hand, method (b) works very well in networking. In particular, the Internet hosts, which use the TCP protocol, are forced to reduce their packet sending rates immediately, when the network congestion is expected. This proved to be an efficient way for preventing congestion collapses of the Internet, which were observed in its early years of operation.
As for (c), it is certainly the simplest method. Rejecting a customer is not technically difficult and does not require any spare resources, as method (a). Its disadvantage is that a fraction of customers leaves the system unserved. Therefore, it can be used in these applications only, in which such losses are tolerated. This is the case of the Internet, where packet losses are unavoidable anyway due to buffer overflows and traffic burstiness, so they may be as well caused rather in a planned, controlled manner. There are also many everyday-life examples, in which approach (c) can be used. For instance, when a waiting line at a call center is long, it might be better to reject a customer at once, instead of allowing him to the queue and keeping online for a long time before serving. Of course, there are also every-day life examples in which customers cannot be rejected-in those cases we are practically limited to method (a).
In this paper we deal with type (c) of controlling the performance of a queueing system. The literature on this method will be discussed later.
In order to decide in method (c), whether a customer has to be accepted to the queue or rejected, a large number of different disciplines can be proposed. One of the most natural ones is that each customer can be rejected randomly, with the probability that depends on the length of the queue upon the arrival of this customer (see Fig 1). In this paper we deal with such discipline. The function mapping the queue lengths into probabilities is called the dropping function and will be denoted by d(n), where n is the queue length. In most practical situations the dropping function would be non-decreasing, i.e. the longer the queue, the more likely the customer is rejected. (However, this assumption is not necessary in our analysis).
The idea of a queueing system with the dropping function was for the first time proposed in networking, in the RED algorithm [8], which uses linear dropping function to drop packets incoming to an Internet router. Following the first, linear dropping function, some other types of functions were studied: doubly linear [9], exponential [10] and polynomial [11]. In addition, a large number of studies were carried out on systems, in which the packet dropping probability is not a simple function of the queue length, but depends on several other factors, often in a very complex way (see e.g. [12][13][14][15][16][17][18][19][20]). In networking, this is called the active queue management (AQM).
However, besides networking, the dropping functions have great potential for applications in many other systems involving queueing of customers (or jobs, tasks etc.). This is connected with their powerful control capabilities. Namely, by shaping the dropping function according to our needs, we can keep the mean queue length, its variance, the loss ratio and several other performance characteristics at some required levels.
The literature on analytical solutions of queueing systems with dropping functions is not extensive. What is important, all the previous works were devoted to systems with very simple arrival processes, namely Poisson or renewal processes without interarrival times correlation. In particular, in [21], an approximate analysis of the queue with batch Poisson arrivals, linear dropping function and exponential service times was carried out. In [22], an approximate analysis of the system with batch renewal arrivals and exponential service times was performed. In [23] an exact analysis of the queue with Poisson arrivals, arbitrary dropping function and arbitrary service times was presented for systems in the equilibrium. In [24], an exact analysis of the queue with renewal arrivals, arbitrary dropping function and exponential service times was shown. Then, in [25], an analysis of the system with Poisson arrivals and arbitrary dropping function was carried out in the transient case. Finally, in [26] a solution of the system with Poisson arrivals and general distribution of the job size was shown.
Due to the simplicity of the arrival processes, all the mentioned models are not adequate for modeling of many real systems. For instance, it is well known that streams of packets in the Internet have strongly autocorrelated structure (see e.g. [27][28][29])-not only are the interarrival times correlated, but also the packet sizes may be correlated with the local intensity of the arrival stream. On the other hand, it is well known that not taking these properties into account may lead to an optimistic underestimation of the queueing characteristics by several orders of magnitude (see e.g. [30]). For these reasons, the previous analytical results on queues with dropping functions are of little use in networking and in other applications with sophisticated, autocorrelated traffic of customers.
The novelty of this paper is that we carry out analysis of the system with: • a very general model of the arrival process, taking into account the correlation of the interarrival times, correlation between local rate and job (customer) size, batch arrivals, the actual shape of the interarrival time distribution and its moments; • arbitrary distribution of the customer service time; • arbitrary dropping function.
To the best of the authors' knowledge, there are no results of this generality in the literature. Given the presented requirements, a rather obvious choice of the model of the arrival process is the batch Markovian arrival process (BMAP, see [31]). It not only fulfills all the presented requirements, but also is analytically tractable and several efficient procedures for fitting precisely the BMAP parameters to the real, observed arrival processes have been proposed, [32][33][34][35].
As for the queueing characteristics of interest, we present analytical formulas for the two most important ones: the queue length distribution and the loss ratio (the fraction of rejected customers). The results on the queue length are presented in the form which allows obtaining both transient and stationary distributions.
In addition to analytical solutions, we present numerical examples for several dropping functions.
The remaining part of the paper is structured as follows. In Section 2, a formal description of the queueing system is presented, including the definition and basic characteristics of the batch Markovian arrival process. Then, in Section 3, the main results on the distribution of the queue length and loss ratio are proven. In Section 4, calculations of two auxiliary functions, H and q, needed to obtain numerical values of queue lengths and loss ratios, are presented. In Section 5, numerical examples are shown. Finally, remarks concluding the paper are gathered in Section 6.

Model of the queue
We deal with a queueing system with single service station, whose arrival process is the batch Markovian arrival process (defined below). The customers are served in the arrival order; those who cannot be served immediately form a queue. The service time is random and can have an arbitrary distribution, with distribution function denoted by F(t). It is assumed, that the capacity of the waiting room (buffer) is limited and equal to b customers. This means that the number of customers present in the system, in the queue and the service position, must not exceed b. A customer who arrives when the waiting room is full is rejected and never returns.
Moreover, a customer who arrives when the waiting room is not full can be rejected with probability d(n), where n is the queue length at the time of arrival of this customer (including the service position, if occupied). Function d(n) will be called the dropping function. The dropping function assumes values in [0, 1] for n = 0, . . ., b − 1. As the waiting room is finite, the dropping function must fulfill the condition: d(n) = 1 for n ! b. The queue length (including the service position, if occupied) at time t will be denoted by X(t). If X(0) > 0, then it is assumed that the time origin corresponds to a service completion. The load offered to the queueing system is defined in a natural way as: where Λ is the average rate of the arrival process, given by Eq (5).
The batch Markovian arrival process was proposed (using different parameterization) in [36] and initially called N-process. An easier-to-use parameterization of it was introduced in [31]-since then the acronym BMAP has been used. A rich description of the BMAP process, its characteristics and the bibliography can be found in [37].
Let I be the unit matrix of size m × m, 0 be the square matrix of zeroes, while 1 be the column vector of size m with all entries equal to 1.
BMAP is defined as the two-dimensional Markov chain (N(t), J(t)) on state space {(i, j):i ! 0, 1 j m}, with the intensity matrix Q in the following form: where D k , k ! 0, are m × m matrices. Moreover, D k for k ! 1 are non-negative, D 0 has negative entries on its diagonal and non-negative elsewhere, matrix D defined as The components of the Markov chain (N(t), J(t)) are to be interpreted as follows: N(t) is the total number of customers arriving in interval (0, t], while J(t) is the state at time t of the onedimensional Markov chain modulating the customer arrival process, whose intensity matrix is D. The stationary distribution of J(t) will be denoted by π, and, as always, we have: The evolution of the BMAP process can be presented in the following manner. Say, at t = 0 the modulating chain J is in some state i. The modulating chain remains in this state for some random time, which is exponentially distributed with mean 1/μ i where After this random time, the state of the modulating chain changes into k, possibly with an arrival of a batch of j customers. Precisely, this happens with probability p i (j, k), where: (As it is possible that p i (0, k) > 0, a change of the modulating state without arrival of any customers can happen as well). After the change, the modulating chain remains in state k for some random time, which is exponentially distributed with mean 1/μ k , and so on.
The most important characteristics of the BMAP process can be computed as follows. The average rate of the process, Λ, is while the average rate of arrivals of batches is Therefore, the average size of a batch is equal to The variance of the time between arrivals of consecutive batches equals The k-lag autocorrelation of batch interarrival times is where p is the stationary vector for intensity matrix C−I, i.e. it fulfills pðC À IÞ ¼ ½0; . . . ; 0; p 1 ¼ 1: The counting function of the BMAP process is defined as where P denotes probability. Using the matrix notation, we can denote the counting functions as The generating function of the counting function is then where BMAP processes have been successfully used in the modeling of traffic in telecommunication networks, e.g. [38][39][40], as well as of vehicular traffic, [41].
As regards previous research on queueing models with BMAP arrivals, but without the dropping function, in [31,42] the steady-state analysis of models with infinite buffers was carried out. In [43][44][45], the transient states of queues with infinite buffers were studied. The systems with the finite waiting room were studied in steady state in [46][47][48], while in transient case in [49,50], using the potential method (see also [51]).

Queue length distribution and losses
We will denote by F n, i (t, l) the probability that the queue length at time t is l, under assumption that at the beginning (t = 0) the queue length was n, and the state of the modulating chain was i: A very import role in the computation of F n, i (t, l) will be played by two functions: H n,k,i,j (u) and q n (v,k).
Firstly, H n,k,i,j (u) is the counting function of the arrival process filtered by the dropping mechanism. It is defined as the probability that in a system with suspended service, exactly k customers would be accepted into the waiting room in time interval (0, u] and there would be J(u) = j, assuming that initially there was X(0) = n and J(0) = i. In this definition, the suspended service means that the number of customers present in the queue follows from the BMAP arrivals and the dropping mechanism only; the service is turned off and no served customers leave the system in interval (0, u]. Secondly, q n (v, k) denotes the probability, that when a batch of v customers arrives to the system in which n customers are already present, exactly k customers from this batch are accepted. The fact that sometimes only a part of the batch is accepted is, of course, a consequence of the assumed dropping policy and finite waiting room.
Both H n,k,i,j (u) and q n (v, k) will be computed in Section 4.
Having defined H n,k,i,j (u) and q n (v, k), we can build a system of integral equations for function F n, i (t, l). If at t = 0 there is at least one customer in the system, the total probability formula used with respect to the first arrival time, u, allows us to write: In the first sum on the right-hand side of Eq (6), all the events in which the first service completion time, u, happens before t, are taken into account. On the other hand, in function ρ n, i (t, l) all the events in which there is no service completion by the time t are taken into account.
If at t = 0 there are no customers in the system, the total probability formula used with respect to the first event in the arrival process, gives: where 1 i m and δ ij equals 1 if i = j and 0 otherwise. As presented in the definition of BMAP, the first event in BMAP happens after an exponentially distributed time with density μ i e −μ i u . The first summand of Eq (7) corresponds to the case u < t. The first event in BMAP can be an arrival of a batch of size v or a change of the modulating state into j, or both. If a batch of v customers arrives indeed, k of these customers are accepted with probability q 0 (v, k). Therefore, at time u we have k customers in the system and the state of the modulating chain is j. The second summand of Eq (7) corresponds to the case, where the first event in BMAP is after t. In this case the queue length at time t is equal to 0.
Applying the following notation a n;k;i;j ðsÞ ¼ respectively. Now we will rewrite Eqs (8) and (9) in the matrix form. The following square matrices will be used for this purpose: as well as column vectors: n ðs; lÞ ¼ ½ n;1 ðs; lÞ; . . . ; n;m ðs; lÞ T ; n;i ðs; lÞ ¼ 8 < : Using the introduced notation we can rewrite Eqs (8) and (9) respectively. Introducing the following (b + 1)m × (b + 1)m matrix: > > > > > > > > > > > > > > > > > > < > > > > > > > > > > > > > > > > > > : system Eqs (17) and (18)  where M(s) and R(s, l) are given in formulas (19) and (22), respectively. Using Theorem 1 we can obtain both the transient and stationary distribution of the queue length. For calculations of the transient distribution, we have to exploit one of the several available methods for numerical inversion of the Laplace transform (see e.g [52][53][54][55][56]). The stationary distribution does not need that. It can be obtained directly from Eq (23) by using the basic properties of the Laplace transform. Namely, we have where [Á] 1 is the first entry of a vector (but it can be any other entry as well-they are all equal in the stationary case).
Having computed the stationary distribution of the queue length, we can compute the loss ratio of the system. The loss ratio is the fraction of customers rejected upon arrival, measured in a long time interval. It will be denoted by L. Its relation with the empty system probability is not difficult to derive. Namely, in a long time interval of length T, the service station is busy for approximately (1−P 0 )T time. As R 1 0 xdFðxÞ is the average service time, the number of customers served in a long interval of length T is approximately ð1 À P 0 ÞT= R 1 0 xdFðxÞ: On the other hand, there are approximately ΛT new customers arriving to the system in a long interval of length T. Therefore, the loss ratio must be:

Auxiliary results
In this section, functions H n,k,i,j (t) and q n (v,k) for the BMAP process and the dropping mechanism will be computed. This is the last step necessary to use we will prove first the following theorem.
where h n,k (s), Y n (s) and Z(s) are the following m × m matrices: ZðsÞ ¼ diagðzðsÞÞ; Y n ðsÞ ¼ ðI À K n;0 ðsÞÞ À1 : proof. Consider first, that in time interval (0, u] one or more customers have been accepted to the waiting room. In this case, using the total probability formula with respect to the first event in BMAP, we can write for any k ! 1, 0 n b, 1 i m, 1 j m the following equation: Z t 0 q n ðv; wÞp i ðv; aÞm i e Àm i u H nþw;kÀw;a;j ðt À uÞdu: The reasoning is somewhat similar to the one presented for Eq (7). The first event in BMAP happened after an exponentially distributed time with density μ i e −μ i u . The first event could have been an arrival of a batch of size v, or a change of the modulating state into a, or both. What is important, the first event must have happened before t. If a batch of v customers arrived, w of these customers were accepted with probability q n (w, k). Therefore, at time u we had n+w customers in the system and the state of the modulating chain was a. Now, consider that no customers have been accepted in (0, u]. In this case, the first event in (0, u] could have been either a change of the modulating state (without arrival), or an arrival of a batch of customers, of which all customers were rejected. Moreover, it could have happened that the first event was after t. Therefore, we have: Application of the Laplace transform to Eqs (28) and (29) respectively. Applying the matrix notation to Eq (30) we get: while Eq (31) yields: h n;0 ðsÞ ¼ K n;0 ðsÞh n;0 ðsÞ þ ZðsÞ: This finishes the proof, as Eq (32) is equivalent to Eq (27), while Eq (33) is equivalent to Eq (26).
As for the numerical calculations of H n,k,i,j (u) values from h n,k,i,j (s), one can use the aforementioned methods of transform inversion [52][53][54][55][56]. We use the Spinelli method of [52].

Examples
For numerical purposes, we use the following BMAP parameterization, [50]: As can be easily computed, the average rate of this BMAP is 1, while the average size of an arriving batch is 8. Moreover, the process is strongly autocorrelated (see [50] for the graph of its autocorrelation function). It is assumed that the capacity of the waiting room is 200, i.e. b = 200. If not stated otherwise, the service time is 1.1, which gives the load of the system equal to 110%-the system is overloaded.
To demonstrate capabilities of dropping functions, we will present several functions providing arbitrary values of the average queue length, as well as arbitrary values of the loss ratio. Then a function that keeps two different values of the average queue length, depending on the load of the system, will be presented.
Firstly, how can we obtain an arbitrary value of the average queue length? To show this, let us start with the class of simple linear dropping functions in the form: a n; if 0 n < 200 and a n < 1; 1; if n ! 200 or a n ! 1; where a is a parameter. Let us assume also, that the required average queue length is x. If E d a ðFÞ denotes the average queue length for dropping function d a (which can be computed using Theorem 1, then we have to solve numerically the following equation: with respect to a. This equation can be solved, for instance, using the bisection method.
In Table 1 Table 1 are depicted. The spikes, which can be observed in low queue ranges, are typical when  Arbitrary values of the loss ratio can be achieved in the same way as the queue lengths. To show this, let us consider the following class of quadratic dropping functions: where b is a parameter. Let L d b denote the loss ratio for dropping function d b (which can be computed using Eq (25) and Theorem 1) and y be the required loss ratio. Thus we have to solve numerically the following equation: with respect to b. This again can be done using bisections.
For instance, the obtained five values of b which provide the loss ratio of y = 20.0%, 25.0%, 30.0%, 35.0% and 40.0%, are presented in in Table 2 Firstly, what are the minimum and maximum values of the average queue length and the loss ratio that can be obtained using the dropping function? To answer this, we have to compute the performance characteristics of the system without the dropping function, or, in other words, with the trivial dropping function in the form:  It is a simple matter to check, that in the second example we could have also used linear dropping functions to obtain loss ratios of 20%, 25%, 30%, etc. (This can be verified using bisections for functions d a ).
This observation leads to the second question. If the simple, linear dropping functions seem to have full control capabilities regarding the average queue length and the loss ratio, then what is the use of more complicated shapes of dropping functions?  To answer that, we will show first that there are many different dropping functions providing the same average queue length, but different other performance characteristics (e.g. the loss ratio).
For instance, the following three dropping functions were found, each of them providing the average queue length of 75: (In the case of dropping functions with two parameters we can set one parameter manually and find the other using bisesctions). The shapes of these dropping functions are presented in Fig 6. As we can see, they are quite different-one function is non-monotonic, two functions are monotonic, one function is convex, two are concave. In Table 3, the performance characteristics of the systems with dropping functions d 1 -d 3 are presented. As we can see, the average queue length is common, but other characteristics are different.
In Fig 7, distributions of the queue length in systems with dropping functions d 1 -d 3 are presented.
Similarly, there are many different dropping functions providing the same value of the loss ratio, but different other performance characteristics (e.g. the queue length). For instance, in addition to the first dropping function in Table 2, the following three different dropping functions provide the customer loss ratio of 20%: The shapes of these functions are depicted in Fig 8, while in Table 4 the performance characteristics of the systems exploiting them are shown. In Fig 9, distributions of the queue length in systems with dropping functions d 4 -d 6 are depicted.
As we can see, the queue length and its deviation can vary, even if the loss ratio is common in all three systems.
We can conclude that using different shapes of dropping function opens the possibility to control more than one performance characteristics at the same time. (However, every next characteristic in a smaller interval).
In the final example, the dropping function was designed is such a way, that it provides the average queue length of 30 customers when the load of the system is 90% (underloaded system), and 60 customers-when the load is 110% (overloaded system).   To find the parameters of the dropping function in this case, the system of two equations had to be solved numerically. The obtained function has the following form: and is depicted in Fig 10. The performance characteristics for d 7 and different loads are given in Table 5, while the distributions of the queue length are shown in Fig 11. This example demonstrates the possibility of designing the dropping function in such a way, that for different loads it keeps performance characteristics at some arbitrary levels.

Conclusions
We presented an analysis of the queueing system with the dropping function of arbitrary type, arbitrary distribution of the service time, and a very general customer arrival process, which allows for modeling of autocorrelation, batch arrivals and arbitrary shape of the interarrival time distribution. We obtained the distribution of the queue length both in transient and stationary regime, computed the loss ratio and presented several numerical examples.
As for the future work and open questions, probably the most interesting is the question on the stability condition for the systems with BMAP arrivals and infinite waiting room. Naturally, if the dropping function is applied in a system with infinite waiting room, the stability condition is not ρ < 1 any more. For instance, if ρ = 2 and the dropping function is d(n) = 0.51 for every n, then the system is obviously stable. But would it be stable for a more sophisticated dropping function, say dðnÞ ¼ 0:51 À 1 n , n ! 2? Does the stability depend only on ρ and the dropping function or on the BMAP structure as well?

Author Contributions
Conceived and designed the experiments: AC PM. Performed the experiments: AC PM. Analyzed the data: AC PM. Contributed reagents/materials/analysis tools: AC PM. Wrote the paper: AC PM.