Multiple probabilistic traveling salesman problem in the coordination of drug transportation—In the context of sustainability goals and Industry 4.0

Improving the effectiveness of route planning, especially in road transport deliveries is a challenge we need to face in the context of advancing climate change and the sustainable development goals. The main aim of the paper is to demonstrate the above average and utilitarian significance of the multiple probabilistic traveling salesman problem (MPTSP) in the coordination and modeling of sustainable product transportation, which is a novelty at the theoretical, conceptual, methodological and empirical level. We propose a new, hybrid algorithm of solving MPTSP instances (it connects harmony search, k-means and 2-opt), which can be successfully used in economic practice for coordination and modeling of Industry 4.0. The effectiveness of proposed approach is tested using a case study of drugs distribution services and datasets obtained from the transportation enterprise located in Poland. The study focuses on the issue of planning routes, with particular emphasis on the changing demand of customers. It should be stressed that this work may be of interest to researchers but also to management practitioners. The value added of this research lies in the innovative modeling the coordination of sustainable drug transportation as an instance of MPTSP and proposing an effective method to solve it. The main research results confirm that proposed method contributes to overall sustainability of studied supply chain.


Introduction
In the 18th century, the first industrial revolution brought significant changes in industry thanks to steam engines. The second industrial revolution was based on electric power and mass production, while the attributes of the third included the integration of information technology and computers in manufacturing. As the need to gain and sustain competitive The structure of the paper is as follows. After an introduction, we present a discussion of the underlying theoretical assumptions of the MPTSP. Second, the harmony search (HS) algorithm for the MPTSP is addressed. The research design and results are presented in the subsequent sections, while the last part of the paper offers conclusions and suggests future research directions.

Sustainable transportation
Dynamic changes in the macro-and microeconomic environments, as well as unexpected revolutionary changes in the functioning of markets and societies (for example, the pandemic period [15]), pose a number of qualitatively new challenges in the supply chain field. Today, a major challenge is the integration of sustainability principles into the supply chain, taking a multidimensional (economic, environmental, and social impact) and multiscale (institutional, geographical, and temporal) approach [16]. Another challenge is the need to improve logistic processes, taking into account the modification of logistics management objectives [17], while a third is the increasingly important integration of environmental issues into logistics activities [18]. These conditions imply that traditional logistics activities (with a particular emphasis on transport) have begun to shape the reliability of logistics services differently than before.
Currently, in many organizations, a large portion of the logistics costs in the supply chain are determined by transportation [19]. The transportation sector has proven to be particularly difficult territory for the advancement of sustainability, understood as limiting depletion of resources to the rate at which they can be replenished or alternatives can be identified [20]. Current forms of transportation are not sustainable [21]: The transportation sector consumes non-renewable resources, such as energy, human and ecological habitats, atmospheric carbon-loading capacity, and individuals' available time [20]. In particular, the dynamic global development of road transport is one of the main sources of environmental and noise pollution and, as a result, of deteriorating quality of life. Taking this into account, there is a clear need to improve the efficiency and effectiveness of product delivery and overall sustainability. Various strategies have been proposed to achieve more sustainable transportation, defined as transportation that satisfies current transport and mobility needs without compromising the ability of future generations to meet their own [22]. Other authors have pointed to the emergence of new goals for transportation decision-making and for mainstreaming alternative practices by reconfiguring existing networks [23] through optimizing transportation in more sustainable ways [24,25]. Sustainable transportation can be considered by examining the sustainability of the transport system itself, focusing on the positive and negative aspects and externalities of traffic and transport as they exist now or in the near future [26]. Sustainable transportation implies finding a proper balance between current and future environmental, social, and economic concerns [27]. As pointed out by Steg and Gifford [26], a distinction can be made here between behavioral and technological strategies. Behavioral strategies focus on reducing the level of road transport use through psychological interventions-for example, by shifting to less polluting modes of transport, changing destination choices, altering habits and attitudes, combining trips, or traveling less [26,28]. Technological strategies are instead focused on reducing the negative impact per vehicle and per kilometer [26], mainly through [29]: • increasing the efficiency of the transport system via digitization, automation, deployment of new data analysis solutions, and optimization methods (also key elements of Industry 4.0); and • the technological transition toward low-and zero-emissions vehicles, mostly in the field of vehicle design, and the use of alternative forms of energy for transportation.
Digitization in transportation and logistics is crucial in terms of sustainability, as it allows for more resource-efficient traffic management and optimization of existing transport networks.
Depending on the restrictions imposed by business practices, the traveling salesman problem (TSP) or vehicle routing problem (VRP; derived from the TSP) are often used to plan routes as part of a technological strategy. The VRP assumes that reducing the number of vehicles used and total distance traveled will reduce greenhouse gas (GHG) emissions [30], emissions of toxic and harmful substances, and noise pollution [26] and, as a result, will affect sustainable transportation. Due to various restrictions, it is necessary to develop more detailed variants of these optimization problems, which, when used in business practice, may affect the provision of sustainable transport according to the VRP. Looking for countermeasures for this type of phenomenon (increasingly common in operational practice), the authors draw attention to the potential of the MPTSP, which can be used to coordinate and model sustainable product distribution in various contexts, where the capacity of vehicles is not a real problem and the collection of recipients is constant, but it is not always necessary to visit each of them (modeling the drug transportation process). Significantly, in recent years, there has been a growing interest in questions about the critical role of methods for network and organizational coordination [31] in organizational performance [32]. Coordination is considered the act of managing interdependencies between activities [31] performed to achieve a goal [33].
Nowadays, although computers have enormous computing power, it is still not possible to accurately solve larger instances of some optimization problems that occur in business practice (e.g., job-shop scheduling, VRP, TSP). To overcome these difficulties, scientists have developed a number of approximate approaches, including metaheuristics that often draw inspiration from the real world. Such approaches usually arrive at good solutions within a short period of time, which enables the solving of even complex problems (e.g., the flexible containership loading problem for seaport container terminals [34] or the problem of planning milk-run tours for just-in-time production subject to hard time windows in congested road networks [35]).
Among the various approximate approaches used to solve TSP instances, we can distinguish between the use of population-based (e.g., ant colony system [36], the novel memetic genetic algorithm [37], and HS [38]) and single-solution-based metaheuristics (e.g., simulated annealing (SA) [39] and tabu search (TS) [39]). They are usually more effective than classic heuristics such as 2-opt or 3-opt algorithms or nearest-neighbor algorithms (NNAs), but at the cost of greater time complexity. In addition, metaheuristics often require the determination of control parameter values that affect the exploration and exploitation capabilities of the algorithm. Their proliferation can be explained by the "no free lunch" theorem [40], according to which there is no one metaheuristic that is best for all optimization problems. Therefore, it is necessary to determine the effectiveness of a technique for a specific issue.
Metaheuristics such as SA [24,41], TS [42], and HS [43] are also used to solve VRP variants. Popular approaches to the design of algorithms adapted for the VRP include the hybridization of metaheuristics with other methods (e.g., Barma et al. [44] combined a discrete antlion optimization algorithm with 2-opt, while Zhou et al. [45] proposed a hybrid bat algorithm with path relinking). The use of mathematical programming methods (linear and dynamic programming) mentioned in [46] should also be considered. However, due to their significant computational complexity, they are not used for larger VRP instances.
An example of a metaheuristic that has achieved good results in probabilistic TSP (PTSP) instances is HS [47], which draws inspiration from the process of seeking harmony in music and has been used in the process of solving many optimization problems [48]. This, indicates that it may have potential for solving MPTSP instances, especially in Industry 4.0. In [47], HS was combined with 2-opt to improve the algorithm's exploitation capability. Hybridization is often used when solving VRP instances (e.g., [44,45]), demonstrating its possibly beneficial effects on the results for the MPTSP.

Multiple probabilistic traveling salesman problem
The PTSP is an NP-hard problem [49] proposed in Jaillet's doctoral dissertation [50] as a development of the classic TSP. In the TSP, for the directed graph G = (V, A) (V = {1,. . .,n} and A = {(i, j):i, j 2 V}), with edge weights designated as c ij (representing the distance between nodes i and j), it is necessary to designate a tour (consisting of all n nodes) of the minimum length [51]. Following [51], we denote the basic model of the TSP with the Miller, Tucker, and Zemlin formulation as follows (with binary x ij variables equal to 1 if arc (i, j) belongs to the optimal solution and u i variables are used to define the order in which vertex i is visited by the salesman): The PTSP additionally assumes that each i node is assigned a certain probability p i of being visited by the traveling salesman. The aim of the optimization, therefore, is to designate such a tour a priori λ (consisting of all n nodes) to make its average length as short as possible. According to [52], the value of the objective function in PTSP may be written as: where S is a subset of V;L λ (S) is the length of the route, assuming that clients visited belong to S (in the order they appear in the itinerary); and p(S) is the probability that all clients in S are to be visited, amounting to: The expected length of the a priori route λ = (1, 2, . . ., n) amounts to: where c ij represents the weight of the edge connecting nodes at positions i and j on the route. When considering the PTSP, two particular variants need to be addressed: 1. Where each of the nodes i is described with the probability of visiting p i = 1, we have an instance of the classic TSP [49].
2. If p i for each node i is the same (and amounts to p), then we have a homogeneous PTSP, which was used to model a problem of drug distribution in [47] (from the utilitarian point of view, pharmacies are characterized practically with the same probability of supply necessity).
This paper is an extension of research presented in [47]. Therefore, we will further analyze only the homogeneous PTSP, in which the value from the formula (10) can be written as follows [52]: where L ðrÞ l � P n j¼1 c ðj;1þðjþrÞ mod nÞ [47]. The MPTSP is an innovative combination of the PTSP and multiple TSP. The second problem is the generalization of the classic TSP, in which it is necessary to determine the routes for z salesmen who start and end their journey at the base [53]. It has found practical application in, for example, the minimization of off-grade production at multi-site multi-product plants [54]. Various metaheuristics have been used to solve it, such as the improved genetic algorithm and particle swarm optimization [55].
In case it is necessary to plan routes for companies distributing drugs to a greater number of pharmacies, there is a need to separate delivery points into fewer drivers in order to execute the task efficiently. For this reason, the PTSP was broadened to include the possibility of k existing delivery zones (which might be interpreted as areas subjected to a given driver or a company branch), hence obtaining the MPTSP. For each delivery zone m (m = 1, . . ., k) a disjoint set of V m nodes is assigned, which are to be handled within one route λ m (V m � V and The task amounts to assigning of the minimum sum of the average lengths of routes λ m consisting of particular nodes belonging to the given set V m .

HS for PTSP
HS is a popular metaheuristic first proposed in Geem's doctoral dissertation [56]. It assumes the existence of the structure HM (harmony memory), which includes HMS harmonies (solutions of the problem) composed of various pitches (components of the solution). Based on two parameters, PAR (pitch adjustment rate) and HMCR (HM consideration rate), there is a iterative generation of new harmonies, whose subsequent pitches are selected according to the following rules: 1. For a probability equal to HMCR � (1 − PAR) for the pitch in position i, the value is chosen from the pitches in position i in HM.
2. For a probability equal to HMCR � PAR, the modified pitch value is chosen in position i (using the parameter bw (Bandwidth), described in more detail in [57]).
3. For a probability equal to 1 − HMCR, a random acceptable value is chosen.
The created harmony is compared with the worst harmony located in HM. If it is characterized with the most favorable value of the objective function, it takes its place, after which HM elements are sorted (assuming that the best harmony is in the first spot and the worst in the last). The entire procedure is repeated through IT iterations.
In [47], an attempt was made to adjust HS for the efficient solving of PTSP instances (based on an approach intended for the asymmetric TSP proposed in [38]). The authors assumed that the pitches in the harmony correspond to the number of nodes to be visited by the traveling salesman and that the order of their occurrence indicates the sequence of travel. They also pointed out the lack of efficiency in the choice of pitch value based exclusively on its absolute position in the harmony (due to the nature of the problem) and replaced this operation by choosing the value of a new pitch from among the available nodes (i.e., those still not added to the new route), which occurred in harmonies already located in HM after the most recently added node to the currently created route. The choice is made based on the roulette wheel method in such a way as to facilitate choosing more closely located nodes; if all the analyzed nodes have already been added to the route, a pseudorandom choice is made from among the available nodes. Additionally, the possibility of modifying the pitch with HMCR � PAR probability was replaced with making a greedy move intended to choose the node located nearest to the most recently visited node (thus eliminating the need to use bw). A new parameter R was also introduced to avoid premature convergence by means of resetting HM elements (leaving only the best harmony) after a defined number of iterations, without replacing the worst solution located in HM (an analysis of this approach was presented in [58]). The efficiency of HS for PTSP was additionally increased by applying 2-opt in the solution provided by the algorithm (while originally the solution was referred to as hybrid HS, in this work it is identified as HS). The pseudocode of the algorithm is shown in Algorithm 1.

k-means clustering algorithm
k-means clustering is a popular non-hierarchical clustering algorithm [59] and an unsupervised learning algorithm [60]. It assumes the allocation of objects to k classes defined in advance, through the following steps: 1. Initial division of objects into k clusters;

Hybrid of HS and k-means clustering algorithm for MPTSP
In our research, we took a two-stage approach to solving MPTSP instances: 1. In the first step, the pharmacies are divided into k areas, where k is determined on the basis of the abilities of the given enterprise. Handled points are allotted to the given cluster using k-means clustering, where the initial assignment is determined on the basis of pseudo-random choice of k pharmacies, creation of centroids with their coordinates, and subsequent assignment of the nearest nodes to them (to determine the distance between nodes, we used the haversine formula). The coordinates of the centroids in the subsequent iterations are determined based on the arithmetic mean of the coordinates of the nodes that make up the cluster. The algorithm stops at the moment that the next iteration does not change the assignment of nodes to any cluster.
2. Each cluster created in step 1 is treated as a separate PTSP instance being solved by a new HS instance, according to the approach proposed in [47] (described in Subsection HS for PTSP). This approach allows for the use of parallel computing, facilitating shorter computation time. The routes obtained in particular PTSP instances are a solution of an MPTSP instance (the sum of their objective function is a value of the objective function of the MPTSP).

Research methodology
Given the explanatory nature of the research, the usefulness and effectiveness of the proposed approach was checked using an instrumental case study [61], focusing on an enterprise delivering drugs within Silesian District, Poland. The problem instance consists of 90 nodes (i.e., 90 venues where drugs may be delivered), and the length between them was assigned using the haversine formula. In order to determine the usefulness of the method in economic practice, the study was performed with different parameter values: k (k 2 {1, 2, 3, 4}) and p (p 2 {0.75, 0.8, 0.9, 0.95, 1}). In addition, for the case where our problem can be reduced to a TSP instance

PLOS ONE
(p = 1 and k = 1), we obtained an optimal result using the NEOS Server for Concorde [62]; the optimal length is 70.46 km (the result was obtained on the basis of a matrix containing distances rounded to six decimal places). According to [52], it is possible to calculate the theoretical lower bound to the homogeneous PTSP optimum using the following formula: where L TSP is the optimal length of the corresponding TSP. Based on this, we determined the following LB values (for k = 1): 52.85, 56.37, 63.41 and 66.94 (for p = 0.75, p = 0.8, p = 0.9 and p = 0.95, respectively). Due to the nondeterministic nature of the algorithm, the computations were repeated 30 times, using a different seed each time (but maintaining the same allocation of points to clusters for each k, in order to determine the impact of HS nondeterminism on results). The obtained value of the objective function was designated as f (with its average value designated as � f , sample standard deviation as s f , minimum value as min f , and maximum value as max f ) and the actual time of execution of both steps of the method as t [s] (with its average value designated as � t, sample standard deviation as s t , minimum value as min t , and maximum value as max t ).
To verify the effectiveness of our approach, we compared the value of the objective function of the solutions constructed by HS with the results obtained using the 2-opt algorithm and NNA. Due to the significant influence of the initial solution on the results generated by 2-opt, the calculations were repeated 30 times, each time using a different pseudo-randomly generated initial solution. In a deterministic NNA, the generation of the solution started from the first node assigned to a given cluster. Due to the relatively low computational complexity of both methods, the execution time was not measured.

Results
The obtained results are presented in Tables 1-4 (for k = 1, k = 2, k = 3 and k = 4, respectively) and in Figs 2 and 3 (for � f and � t, respectively). We found that-regardless of the value of the parameter p-the most beneficial variant for the analyzed case study was to make a division into two zones and prepare a separate route for each of them. The results were worse for k = 4 than for k = 2 and k = 3. For k = 3, the routes were less profitable than for k = 2, probably due to unfavorable assignment of points to zones by the k-means algorithm, which proves it is necessary to assign the appropriate value k to the given instance of the problem. A particularly noteworthy observation is that the greatest variability of the value f (s f ) was characteristic of the   results for k = 1, in which the non-deterministic nature of the method was emphasized, due to a relatively large number of nodes that make up the route constructed by HS (for instances where k = 4, assuming the creation of four relatively short routes, the impact of variability caused by the non-determinism of HS was insignificant, at the cost of a significant influence of the k-means algorithm). When analyzing the actual time needed to execute the method, particular attention should be paid to the influence of external factors on the executed measurements. The decreased running time of the algorithm related to increases in the value of the parameter k was likely caused by the acceleration of the most time-consuming part of the two-stage procedure (i.e., the execution of HS). The limitation of the number of pitches included in particular harmonies (decreasing it to the cluster size) accelerated the execution of the main loop of HS. Consequently, the total time of sequential execution of HS instances for particular subsets of nodes was shorter than the time of execution of the method for k = 1. The increase of the execution time of the method for p = 0.75 with reference to the remaining values of p was likely caused exclusively by external factors affecting the measurement. However, it is particularly noteworthy that the running time of the entire algorithm was insignificant (and can be additionally shortened using parallel computations for k > 1), enabling designation of the solution within 4-7 minutes on average. This makes the use of this technique possible in many cases occurring in contemporary economic practice. . We found that our approach was more effective than the classic heuristics adapted to the TSP and enables the construction of shorter routes.

Discussion, conclusions, and future work
In this work, we proposed a hybrid algorithm for solving MPTSP instances (combining HS, kmeans clustering, and 2-opt). It can be successfully used in economic practice for coordination and modeling in Industry 4.0 and in applications with special restrictions that prevent the use of popular VRP and TSP variants (e.g., to optimize itineraries in the process of distribution of .85 for p = 0.95, p = 0.9, p = 0.8 and p = 0.75, respectively). We showed that our proposed method can contribute to the overall sustainability of the supply chain. In light of our close-tooptimal results, we can state that the application of the algorithm in business practice will reduce the average length of travel routes. When comparing the results with the solutions obtained by NNA and 2-opt, we found that our approach is more effective than the classic heuristics adapted to the TSP and enables the construction of shorter travel routes. The reduction of distance travelled by vehicles means a reduction in GHG emissions; emissions of toxic and harmful substances polluting air, water, and soil; noise pollution; and damage to natural habitats. It is of especially great importance for drug deliveries, which are mostly based on road transport-the mode of transportation with the greatest negative impact on the environment. Moreover, a decrease in the distance traveled by vehicles means more efficient traffic flows; as such, a reduction in the number of vehicles used results in decreased congestion and resource consumption across the overall transportation system.
Shifting towards Industry 4.0 is a response to the need to establish a sustainable economy, especially in the context of the growing problems of climate change and natural resources depletion. In this context, optimization and digitization are key elements of energy-and resource-efficient logistics. Studies conducted to date show that in many sectors-particularly the transport sector-technology itself is not an issue anymore, in light of the ability to access enormous computing power and data sets. Despite this, however, it is still not possible to solve some optimization problems in business practice. Accordingly, new approaches like the MPTSP are needed.
Conscious implementation of behavioral and technological strategies may improve environmental quality, urban quality of life, and destination accessibility [26]. In terms of technological strategies, the rapid growth of Logistics 4.0 as an element of Industry 4.0 will lead to a wider variety of available tools. However, it should be kept in mind that sustainable transportation (planning) may require a change in people's mindsets at the same time.
Based on our results, the choice of the number of clusters that must be determined using the proposed method should be designated not only on the basis of the resources available to the enterprise in question but also in light of the nature of the given task. In addition, when considering only the lengths of routes existing in particular clusters, the distance from the base of the enterprise delivering drugs is not analyzed. This must be taken into account in practical applications, as it requires that travel commence at a given location (a factor that could affect the optimal number of clusters). When solving the instance of the problem, the actual distances between nodes should be applied, taking into account the nature of the existing line infrastructure.
As indicated, an important new theoretical implication of our work is the study of the MPTSP in the coordination and modeling of sustainable product transportation. Moreover, practitioners can use these research results to improve and optimize the flow of materials in a more sustainable way. The proposed approach can also be modified and adapted to optimize other kinds of supply chains as well as developed for the purposes of further research work. Further work on this study's subject matter might include extensions of the analyzed "test bed" (in order to analyze the behavior of the algorithm on diversified data) and increasing the efficiency of the algorithm itself. The first step of the procedure should include modifying the kmeans algorithm, according to other studies, such as [59,63] in which the quality of the obtained results was improved by applying the appropriate procedure for selecting initial solutions. [64] proposed using a modification that enabled the time of k-means execution to be shortened. The hybrid HS algorithm used to execute the second step of the procedure in this work can be modified with a more efficient version of 2-opt (also referenced in [47], proposing the use of 2-p-opt, used in the paper [49]) that can be combined with other metaphor-based metaheuristics or swarm intelligence algorithms (e.g., it was combined with Artificial Bee Colony [65] and with Cuckoo Search in [66]) and may apply dynamically changing parameter values (such as PAR, used in [67]). Additionally, it would be worthwhile to analyze the impact of the values of particular parameters on the method's efficiency. It may also prove useful to formulate a compact mathematical model with objective functions and constraints for the MPTSP in order to use appropriate solvers. Finally recoverable robustness concepts [68,69] can be integrated into the PTSP.
The limitations of this study result from the necessity to limit the volume of the paper and discussed issues. The presented empirical analysis does not include the actual calculation of the environmental impact resulting from the application of the developed hybrid algorithm. The reduction of negative impact on the environment resulting from the reduction of time and fuel consumption, noise and air pollutant emission etc. can be dimensioned in appropriate values and monetised, which in fact would allow us to measure the effectiveness of the proposed approach and the real contribution to sustainability. In this regard, the conducted study may be the basis for designing future research.