Abstract
While the classical knapsack problem has been a target for optimization algorithms for many years, another version of this problem, the discounted {0-1} knapsack problem, has been gaining considerable attention recently. The original knapsack problem requires selecting specific items from an item set to maximize the total benefit while ensuring that the total weight does not exceed the knapsack capacity. Meanwhile, the discounted {0-1} knapsack problem imposes a more stringent requirement: items are divided into groups, and at most one item from each group can be selected. This constraint, which does not exist in the original knapsack problem, makes the discounted {0-1} knapsack problem even more challenging. In this paper, we propose a new algorithm based on the salp swarm algorithm, in the form of four different variants, to solve the discounted {0-1} knapsack problem. In addition, we make use of an effective data modeling mechanism and a greedy repair operator that helps overcome local optima when searching for the global optimal solution. Experimental and statistical results show that our algorithm is superior to currently available algorithms in terms of solution quality, convergence, and other statistical criteria.
Citation: Dang BT, Truong TK (2022) Binary salp swarm algorithm for discounted {0-1} knapsack problem. PLoS ONE 17(4): e0266537. https://doi.org/10.1371/journal.pone.0266537
Editor: Seyedali Mirjalili, Torrens University Australia, AUSTRALIA
Received: October 21, 2021; Accepted: March 22, 2022; Published: April 7, 2022
Copyright: © 2022 Dang, Truong. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript, as well as publicly available to the research community at: https://doi.org/10.6084/m9.figshare.19416857.v2.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Mathematical problems are not merely theoretical abstractions. Many of them are mappings of real-life problems, and some have broad practical applications. Hence, solving such problems is a pressing task that researchers have pursued for decades. The knapsack problem (KP) [1] is a typical example. In particular, KP is a classical combinatorial optimization problem in which we have a given set of items, and each item is coupled with a weight and a profit value.
The knapsack problem has vast practical applications in many areas, including but not limited to computer memory management, facilities management, energy consumption optimization, adaptive multimedia systems, resource allocation, logistics, encryption, and cryptography. All these seemingly unrelated problems share a common point: there is a limited resource whose utilization needs to be optimized.
To resolve the KP, we have to determine an item subset with the maximum sum profit while the total weight of the selected items is still less than or equal to a predetermined knapsack capacity. In Guldan’s thesis [2], new mathematical models of the discounted {0-1} knapsack problem (DKP01) are introduced. This problem integrates the discount concept used in the real-world sale business, in which we can get a discounted price for purchasing multiple items together.
In DKP01, assume that we have n groups with three items each. Each item has two features, often referred to as profit and weight in previous works. Although these names are consistent with the idea of knapsack capacity in the original KP, they do not really fit the cost reduction concept of DKP01. Therefore, we suggest calling this pair of features value and cost, abbreviated as v and c. Items in group i, i ∈ {0, 1, …, n − 1}, can be modeled as xi,1(ci,1, vi,1), xi,2(ci,2, vi,2), and xi,3(ci,3, vi,3). Note that the third item, xi,3, represents the case where the discount is applied in real life, with ci,1 + ci,2 > ci,3 and vi,3 = vi,1 + vi,2. In other words, choosing item xi,3 is equivalent to selecting both the first and the second item in the group, xi,1 and xi,2, and receiving a discounted cost for buying them together. As a result, for each group, one of four options can occur: (i) taking the first item only, (ii) taking the second item only, (iii) taking the third item (which, in the real world, equals taking both the first and second items) and getting the discounted cost ci,3, or (iv) taking none of these items at all. Put another way, in this presentation model of the problem, where there are three items in a group, we can select at most one item from each group. This is what separates DKP01 from the traditional KP, which does not divide items into groups and has no limitation on the number of items that can be selected from a group. This particular characteristic makes DKP01 more strenuous than the original KP.
Generally, the goal in solving DKP01 is selecting a subset of items whose total value is maximum while the sum cost of the chosen ones is not greater than a predefined threshold C and no more than one item can be picked from each group. This problem is mathematically modeled as follows.
$$\max f(S) = \sum_{i=0}^{n-1} \sum_{j=1}^{3} v_{i,j}\, s_{i,j} \tag{1}$$
$$\text{s.t.} \quad s_{i,1} + s_{i,2} + s_{i,3} \le 1, \quad i \in \{0, 1, \ldots, n-1\} \tag{2}$$
$$\sum_{i=0}^{n-1} \sum_{j=1}^{3} c_{i,j}\, s_{i,j} \le C \tag{3}$$
$$s_{i,j} \in \{0, 1\}, \quad i \in \{0, 1, \ldots, n-1\},\ j \in \{1, 2, 3\} \tag{4}$$
where, si,j = 0 shows that the item xi,j is not chosen, and si,j = 1 means the item xi,j is selected, with i ∈ {0, 1, …, n − 1}, j ∈ {1, 2, 3}. A binary vector A = (a0, a1, …, a3n−1) ∈ {0, 1}3n is a candidate solution of DKP01 if A complies with the conditions given by Eqs 2 and 3.
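To make the model concrete, the following Python sketch (illustrative only; the paper's experiments use MATLAB, and the data here are hypothetical) checks a candidate 3n-bit solution against the constraints of Eqs 2 and 3 and evaluates the objective of Eq 1.

```python
def is_feasible(s, costs, C, n):
    """Check the group constraint (Eq 2) and the capacity constraint (Eq 3)."""
    total_cost = 0
    for i in range(n):
        group = s[3 * i:3 * i + 3]
        if sum(group) > 1:            # at most one item per group (Eq 2)
            return False
        for j in range(3):
            if group[j]:
                total_cost += costs[3 * i + j]
    return total_cost <= C            # capacity constraint (Eq 3)

def total_value(s, values):
    """Objective of Eq 1: total value of the selected items."""
    return sum(v for bit, v in zip(s, values) if bit)

# Two hypothetical groups; the third item of each group is the discounted
# combination, e.g. cost 7 < 4 + 5 while value 22 = 10 + 12.
costs = [4, 5, 7, 3, 6, 8]
values = [10, 12, 22, 5, 9, 14]
s = [0, 0, 1, 1, 0, 0]   # combined item of group 0, first item of group 1
print(is_feasible(s, costs, 10, 2), total_value(s, values))  # True 27
```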
Being an NP-hard problem, DKP01 cannot be solved by a polynomial-time algorithm unless P = NP. In her master's thesis, Guldan proposes the DKP01 and uses dynamic programming to solve it [2]. In fact, different instances of the KP have very different properties. Instances where the relationship between value and cost is weak, or even absent, are considered relatively easy to solve to optimality even when the number of items is large. Meanwhile, in instances where values and costs are closely related, choosing which items to put in the knapsack is harder. Based on this characteristic, one way to solve the traditional KP is to use the idea of the core, which is defined in [3, 4]. The authors of [5] define an alternative core for the DKP01 by mimicking the idea of the core in KP. Using this alternative core, they propose a partitioning scheme that divides the original DKP01 into sub-problems to reduce calculation complexity, and utilize dynamic programming to solve them. Additionally, [6] proposes an exact algorithm that minimizes the total cost for a predetermined total value to solve DKP01. Then, based on this algorithm, three approximate algorithms are introduced.
Evolutionary and swarm intelligence-based computation are also applied in solving DKP01. The authors of [7] propose two mathematical models for DKP01 and two genetic algorithm-based algorithms, FirEGA and SecEGA, to resolve the problem. In [8], the authors propose two evolutionary operators, called global exploration operator (R-GEO) and local development operator (R-LDO), to design a ring theory-based evolutionary algorithm for the DKP01. While the two operators rely on ring theory, the evolutionary algorithm is based on the flower pollination algorithm [9]. The authors of [10] also propose a multi-strategy algorithm for DKP01 on the basis of monarch butterfly optimization (MBO) [11]. In this study, the monarch butterfly population is separated into two sub-populations. The positions of monarch butterfly individuals in the first sub-population are handled by a neighborhood mutation-based crowding operator, which replaces the original MBO migration operator. Moreover, in [12], the application of moth search (MS) to DKP01 is investigated. First, the impacts of the Lévy flights operator and the fly straightly operator on basic MS are evaluated. Then, nine MS-based algorithms are developed using a global-best harmony search (GHS)-based mutation operator. Another contribution in nature-inspired optimization algorithm application, [13], introduces a discrete hybrid teaching-learning-based optimization algorithm (HTLBO) to resolve DKP01. A quaternary code is introduced to represent a DKP01 solution, and the individuals are modeled by double coding. The learner's learning strategy is improved to expand the discovery capabilities of HTLBO, while self-learning is implemented to balance exploration and exploitation. Two sorts of crossover are also designed to strengthen the effectiveness of global search in this algorithm.
In a recently published work [14], Truong has developed a binary version of the famous particle swarm optimization algorithm [15] to solve the DKP01 problem. In another publication [16], moth-flame optimization [17] is used to solve this problem. Most recently, an improved version of the Harris hawks optimization (HHO) algorithm [18] is proposed in [19]. Although the HHO algorithm has a fairly good balance between exploration and exploitation, the authors of that paper suggest tweaks that target the attack phase of the Harris hawks using an opposition-based learning (OBL) strategy to increase the diversity of the search process. The main idea of OBL is to compare the fitness values of the current solution and its opposite and then choose the better one to include in the next generation. Additionally, the prey escape energy value, originally designed to decrease linearly, has been redefined to decrease non-linearly (logarithmically), making the transition between exploration and exploitation smoother. The authors also introduce a random unscented sigma point mutation mechanism to help HHO converge more quickly to the best solution it can achieve. Besides solving traditional benchmark functions (CEC2017 and CEC2020) and engineering problems, the resulting algorithm is also used to solve the DKP01 problem on selected data sets. However, the test results show that the existing DKP01 test instances are not simple, and this algorithm has not achieved very good results, which also means that there is still considerable room for other solutions in the future.
Besides taking advantage of classical optimization algorithms, new optimization algorithms are also regularly introduced and open up new directions in solving optimization problems. An example of this is [20], where the authors presented two variants of a widely accepted swarm intelligence-based optimization algorithm: the single-objective salp swarm algorithm (SSA) and the multi-objective salp swarm algorithm (MSSA). The main inspiration of SSA and MSSA is the swarming behavior of salps when exploring and foraging for food in the seas. Test results on various data sets show that SSA is able to improve the initial arbitrary solutions and converge towards the ideal one. More details on SSA are given in the next section of this paper.
Despite being a relatively new algorithm, SSA has been cited in several scientific works across various research fields. In [21], the authors develop a binary version of SSA utilizing eight transformation functions and a crossover operator instead of the basic operator which the original SSA provides. In [22], to study the optimal connections between switches and controllers and the optimal number of deployed controllers in large-scale software-defined networks (SDN) [23], the authors propose an optimization algorithm based on SSA using chaotic maps. In an effort to solve the feature selection problem, [24] introduces another chaotic SSA algorithm and integrates it with a K-nearest neighbor classifier. Their solution is also proved to be efficient in tackling the local optima stagnation issue as well as improving the convergence behavior of the original SSA algorithm. [25] implements opposition-based learning in the initialization phase of SSA to enhance its population diversity. Moreover, a local search algorithm (LSA) is also used in this work to improve exploitation performance. The authors of [26] propose a binary SSA using a modified arctan transformation. In [27], SSA is enhanced by balancing the exploration and exploitation process. [28] extends the original SSA by implementing multiple independent salp chains and applies them to maximum power point tracking (MPPT) of photovoltaic systems under partial shading conditions. [29] uses space transformation search (STS) [30] to improve the performance of SSA, and the resulting algorithm is deployed to train a multi-layer feed-forward network. A recent publication, [31], proposes new mutation operators to balance the exploration and exploitation phases of SSA. The authors of [32] present the solitary and colonial reproduction phases of salps in the emended salp swarm algorithm (ESSA), which is used to resolve the economic load dispatch problem in a multi-objective framework.
In [33], a composite mutation strategy (CMS) and a restart strategy (RS) are integrated into SSA to boost its exploitation and exploration trends as well as to aid salps in avoiding local optima.
Though numerous studies have referred to SSA, to the best of our knowledge, this paper is the first to utilize SSA for DKP01. Although the algorithms for DKP01 mentioned above have achieved encouraging results, some of their design choices are not entirely reasonable and can be improved; for example, the classical solution representation is not an ideal choice and will be replaced by the scheme in this paper. Besides, we intend to combine the power of SSA with a greedy repair operator for local optimization, and to address the weakness of SSA that its solutions easily get stuck at a local optimal point and cannot escape to try other candidate solutions during the global optimization process. In detail, the contributions of this work include:
- A novel binary salp swarm algorithm (BSSA) with four binary transformation functions and a new solution presentation scheme to solve the discounted {0-1} knapsack problem.
- A combination with a minimal encoding scheme whose binary solution vector length is 2n (in comparison to the length of 3n of the original DKP01 that is used in many previous papers). While providing enhancements in calculation speed and reducing the complexity, this scheme automatically satisfies the constraint of the DKP01 stated in Eq 2.
- The use of a repair operator on the positions of the salps during salp chain movement towards the food source to avoid local optima and enhance calculation effectiveness.
The rest of this paper is as follows. The next section gives an introduction to the salp swarm algorithm (SSA), which is the basis for our algorithm. The section after that details our proposed binary SSA for DKP01. Then come the simulation results and discussion of our algorithm’s performance in comparison to those of other existing algorithms. Finally, the Conclusions section will conclude the paper.
Salp swarm algorithm
Introduced by [20], SSA has received much attention recently due to its simplicity, effectiveness, and adaptability to various optimization problems. This section gives details on this algorithm.
A salp, which is generally found in deep seas but sometimes near the surface, is a barrel-shaped, planktonic tunicate that moves by contracting, thereby pushing water through its jelly-like body. One of the most interesting activities of a salp population is forming a salp chain, which may increase the swarm's effectiveness in traveling and foraging. SSA is an effort to mimic the swarming behavior of salps in oceans.
In SSA, individuals in a salp population are classified into two categories: the leader, which is the salp at the head of the chain, and the followers. The position of a given salp is modeled as a point in an n-dimensional search space, where n is the number of variables of the problem to be solved. As a result, all the position vectors of the salp population form a two-dimensional matrix named pos. The food source of the swarm is modeled as the target F in the search space.
The position of the leader is updated using the following rule:
$$x_j^1 = \begin{cases} F_j + c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 \ge 0.5 \\ F_j - c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 < 0.5 \end{cases} \tag{5}$$
where x_j^1 represents the coordinate of the leader in the jth dimension, Fj depicts the position of the target F in the jth dimension, ubj is the upper bound of the jth dimension, and lbj is the lower bound of the jth dimension. Additionally, c1 is a number generated using the following rule:
$$c_1 = 2e^{-\left(\frac{4k}{K}\right)^2} \tag{6}$$
where k is the current iteration and K is the maximum iteration. Meanwhile, c2 and c3 are randomized in the range [0, 1].
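As a quick illustration, the decay schedule of c1 (Eq 6, whose standard SSA form 2e^{−(4k/K)^2} is consistent with the [0, 2] range discussed below) can be computed directly; a minimal Python sketch:

```python
import math

def c1_coeff(k, K):
    """Eq 6: c1 = 2 * exp(-(4k/K)^2), decaying from 2 towards 0."""
    return 2 * math.exp(-(4 * k / K) ** 2)

# With K = 100 as in Fig 1, the coefficient starts at 2 and decays quickly.
print(round(c1_coeff(0, 100), 4), round(c1_coeff(50, 100), 4))  # 2.0 0.0366
```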
The positions of the followers are manipulated using the below equation:
$$x_j^i = \frac{1}{2}\left(x_j^i + x_j^{i-1}\right), \quad i \ge 2 \tag{7}$$
The general idea of SSA is simple: the leader moves towards the target (food source), and the followers trail the leader. In optimization problems, the global optimum would ideally be the target, but it is not known in advance. To resolve this, the best solution obtained so far is treated as the target, and the salps head towards it. The pseudo-code of SSA is shown in Algorithm 1.
Algorithm 1: The pseudo-code of salp swarm algorithm
Inputs: Initial parameters
Output: Optimal solution
Initialize salp population considering upper bound and lower bound
while end condition is not met do
F ← the best search agent (salp)
Generate c1 using Eq 6
for each salp do
if the salp is the leader then
Update the position of the leading salp by Eq 5
else
Update the position of the current salp by Eq 7
Amend the salps based on the upper and lower bounds of variables
return F
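Algorithm 1 can be sketched compactly in Python; this is an illustrative continuous-space implementation under the standard SSA update rules (with the usual c3 threshold of 0.5), not the MATLAB code used in our experiments, and the sphere function serves as a toy objective.

```python
import math
import random

def ssa(fitness, dim, lb, ub, pop_size=30, max_iter=200, seed=0):
    """Minimal continuous SSA sketch following Algorithm 1."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(pop_size)]
    food, food_fit = None, float("inf")
    for k in range(1, max_iter + 1):
        for x in pos:                      # F <- best search agent so far
            f = fitness(x)
            if f < food_fit:
                food_fit, food = f, x[:]
        c1 = 2 * math.exp(-(4 * k / max_iter) ** 2)   # Eq 6
        for i in range(pop_size):
            for j in range(dim):
                if i == 0:                 # leader (Eq 5)
                    c2, c3 = rng.random(), rng.random()
                    step = c1 * ((ub - lb) * c2 + lb)
                    pos[i][j] = food[j] + step if c3 >= 0.5 else food[j] - step
                else:                      # follower (Eq 7)
                    pos[i][j] = (pos[i][j] + pos[i - 1][j]) / 2
                pos[i][j] = min(ub, max(lb, pos[i][j]))  # amend to bounds
    return food, food_fit

# Minimize the sphere function; the optimum lies at the origin.
best, best_fit = ssa(lambda x: sum(v * v for v in x), dim=5, lb=-10.0, ub=10.0)
```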
Next, we analyze how SSA works. From the given information, the interesting part is how the position elements of the salps are manipulated. Firstly, it is easy to notice that the lower bound lb and upper bound ub vectors are critical in keeping the position elements of the leading salp within the valid range, which will, in turn, lead the followers on the right path. For simplicity, assume that all items in lb have the same value of 1, and all items in ub have the same value of 10. Thus, from Eq 5 and since c2 is randomly generated in [0, 1], the position of the leader in the jth dimension is specified by:
$$x_j^1 = F_j \pm c_1\left(9c_2 + 1\right) \tag{8}$$
The values of c1 are in the range [0, 2]. Assuming a maximum iteration count K = 100, the curve formed by the values of c1 is illustrated in Fig 1.
It can be seen that the closer k gets to K, the smaller c1 is, and the less chance that the corresponding position element of the leading salp can change significantly. Although this is consistent with the nature of global search in evolutionary computation, where searches in the early stage should cover a broader scope than those in the later stage, this also represents the risk that the leading salp can be easily stuck at a local optimal point. This becomes even more serious when a position element of a following salp is simply the average between the value of the new position element of the preceding salp and the value of its own position element in the previous iteration. This means that, when the leading salp gets stuck, it is unlikely that the salps that follow it have a way to assist it in coming up with solutions to get out of the local optimum.
To resolve this problem and due to the fact that SSA has no mechanism to deal with DKP01 constraints, we decide to implement a repair operator which will help the solution given by SSA avoid local optima, and improve its fitness. Details of this operator and other proposed algorithms are given in the next section. Note that although there exists a multi-objective version of SSA, discussion of it is beyond the scope of this paper.
Proposed binary salp swarm algorithm for DKP01
The original SSA needs many amendments to solve the DKP01. This section will provide details on the solutions that we propose.
Binary transformation functions
To operate in a binary search space, binary transformation functions are necessary so that the related parameters take only the values 0 or 1. The sigmoid function [34, 35] is widely accepted as a means of transforming real values into probabilities. In this paper, we utilize four S-shaped sigmoid transformation functions as follows:
$$Sig_1(x) = \frac{1}{1 + e^{-2x}} \tag{9}$$
$$Sig_2(x) = \frac{1}{1 + e^{-x}} \tag{10}$$
$$Sig_3(x) = \frac{1}{1 + e^{-x/2}} \tag{11}$$
$$Sig_4(x) = \frac{1}{1 + e^{-x/3}} \tag{12}$$
Plots of the functions detailed in Eqs 9–12 are shown in Fig 2. Using these four functions, we propose a novel binary SSA (BSSA) optimizer for DKP01 with four variants: BSSA1, BSSA2, BSSA3, and BSSA4. In particular, BSSA1 takes advantage of Sig1(⋅), BSSA2 utilizes Sig2(⋅), BSSA3 implements Sig3(⋅), and BSSA4 makes use of Sig4(⋅). Our new algorithm also uses modified versions of Eqs 5 and 7 that incorporate the transformation functions given in Eqs 9–12.
Firstly, we define z1 and z2 as:
$$z_1 = \begin{cases} F_j + c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 \ge 0.5 \\ F_j - c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 < 0.5 \end{cases} \tag{13}$$
$$z_2 = \frac{1}{2}\left(x_j^i + x_j^{i-1}\right) \tag{14}$$
As a result, we have new expressions for salp position manipulation as follows:
$$x_j^1 = \begin{cases} 1, & Sig_p(z_1) \ge r \\ 0, & Sig_p(z_1) < r \end{cases} \tag{15}$$
$$x_j^i = \begin{cases} 1, & Sig_p(z_2) \ge r \\ 0, & Sig_p(z_2) < r \end{cases} \tag{16}$$
where p ∈ {1, 2, 3, 4}, and r is a randomized value, r ∈ [0, 1].
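For illustration, the binarization rule can be sketched as follows; the slope constants in the four S-shaped functions (2, 1, 1/2, and 1/3) are the usual choices in the binary optimization literature and are assumed here rather than taken from the paper's figures.

```python
import math
import random

# Four S-shaped sigmoid transfer functions (assumed slopes; see lead-in).
SIG = {
    1: lambda x: 1 / (1 + math.exp(-2 * x)),
    2: lambda x: 1 / (1 + math.exp(-x)),
    3: lambda x: 1 / (1 + math.exp(-x / 2)),
    4: lambda x: 1 / (1 + math.exp(-x / 3)),
}

def to_bit(z, p, rng):
    """Eqs 15-16 style rule: emit 1 when Sig_p(z) >= r for a random r in [0, 1]."""
    return 1 if SIG[p](z) >= rng.random() else 0

# A large positive real-valued position is mapped to 1 with high probability.
rng = random.Random(42)
bits = [to_bit(2.0, 1, rng) for _ in range(1000)]
```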
Solution presentation
The traditional approach to encode a solution of a {0, 1} optimization problem is using a binary vector whose length is the number of dimensions of the search space:
$$S = (s_1, s_2, \ldots, s_{3n}) \in \{0, 1\}^{3n} \tag{17}$$
Each three-bit binary number represents three items in a group. If a bit is set to 1, the item at the related spot is selected. Otherwise, value 0 at a given bit means that the item is not chosen to be in the knapsack.
In this paper, we use a binary encoding scheme as shown in Table 1, with two-bit binary numbers used for solution presentation, as described in Eq 18.
$$Y = (y_1, y_2, \ldots, y_{2n}) \in \{0, 1\}^{2n} \tag{18}$$
For binary solutions of length 3n, the search space has 2^{3n} possible cases, while a binary solution of length 2n spans a much smaller space of 2^{2n} potential cases. Besides narrowing the search space, the 2n representation automatically satisfies the constraint specified in Eq 2, which is not guaranteed by the 3n representation: each 3n solution must be checked to ensure that no three-bit group violates Eq 2 and, in the worst case, when a violation occurs, a new value must be assigned to that group. These actions are unnecessary with binary solutions of length 2n.
With respect to the constraint of Eq 2, every random solution in the 2n search space is feasible, while the 3n search space contains infeasible solutions. Therefore, representing a solution with length 2n reduces computation time.
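A small decoder makes the scheme concrete; the pair-to-item mapping used here is one consistent reading of Table 1 (not reproduced in the text), chosen to match the bit patterns tested in Algorithm 2.

```python
def decode(y):
    """Map each bit pair of the 2n-bit solution Y (Eq 18) to the chosen
    item index of its group: 0 means no item is taken from the group."""
    choice = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}
    return [choice[(y[2 * i], y[2 * i + 1])] for i in range(len(y) // 2)]

# Group 0 takes item 1, group 1 takes the combined item 3, group 2 takes none.
print(decode([0, 1, 1, 1, 0, 0]))  # [1, 3, 0]
```

Because every one of the four pair values is a legal group state, Eq 2 holds for any random bit string, as noted above.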
Repair operator
To deal with the restriction in Eq 3 and to enhance the solution, we use a repair operator based on the functions used in [6, 14]. With n groups, we have a total of 3n candidate items to be put into the knapsack, including the combined items. Note that while the mentioned functions only support the 3n solution, we design our operator so that the 2n solution is supported while all 3n items are still considered.
In short, the repair operator manipulates the selected set based on the value-to-cost ratios vi,j/ci,j (i ∈ {0, 1, …, n − 1}, j ∈ {1, 2, 3}) to reduce CPU usage and improve its capability of avoiding local optima.
Since choosing which items to remove from or add to the knapsack is not simply a matter of prioritizing combined items (a particular combined item is not necessarily better than a single item), we sort all items and put them through a deterministic process. Thus, before the repair operator executes, all the items, including the combined ones, are sorted in decreasing order of value-to-cost ratio. The indexes of the items in this order are kept in the ID vector of length 3n. Using the ID vector, the items with higher priority are processed first. The operator then proceeds as follows.
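Constructing the ID vector amounts to a single sort by decreasing value-to-cost ratio; a short sketch with hypothetical data:

```python
values = [10, 12, 22, 5, 9, 14]
costs = [4, 5, 7, 3, 6, 8]

# 1-based item indexes ordered by decreasing value-to-cost ratio.
ID = sorted(range(1, len(values) + 1),
            key=lambda idx: values[idx - 1] / costs[idx - 1],
            reverse=True)
print(ID)  # [3, 1, 2, 6, 4, 5]
```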
The repair operator has two phases: the repair phase and the optimization phase. The repair phase fixes an infeasible solution to make it feasible, while the optimization phase enhances the fitness of a feasible solution. If the current total cost is greater than C, the repair phase removes items from the knapsack until the condition given by Eq 3 is met. After that, the optimization phase adds items to the knapsack as long as the total cost does not exceed C.
The inputs of the operator include the solution Y of length 2n, the cost vector of length 3n, the index vector ID, and the knapsack capacity C.
Algorithm 2 shows the pseudo-code of the related repair operator. Note that the operator’s computational complexity is O(n).
Algorithm 2: Repair operator
Input: Solution Y = (y1, y2, …, y2n) ∈ {0, 1}2n, cost vector c, index vector
ID = (id1, id2, …, id3n), and knapsack capacity C.
Output: Solution after repair Y
% Repair phase
fc ← 0
for k ← 1 to 3 * n do
    i ← floor((idk − 1)/3)
    j ← mod(idk − 1, 3) + 1
    if y2i+1 = 0 ∧ y2i+2 = 1 ∧ j = 1 then
        if fc + ci,1 ≤ C then
            fc ← fc + ci,1
        else
            y2i+1 ← 0; y2i+2 ← 0
    if y2i+1 = 1 ∧ y2i+2 = 0 ∧ j = 2 then
        if fc + ci,2 ≤ C then
            fc ← fc + ci,2
        else
            y2i+1 ← 0; y2i+2 ← 0
    if y2i+1 = 1 ∧ y2i+2 = 1 ∧ j = 3 then
        if fc + ci,3 ≤ C then
            fc ← fc + ci,3
        else
            y2i+1 ← 0; y2i+2 ← 0
% Optimization phase
for k ← 1 to 3 * n do
    i ← floor((idk − 1)/3)
    j ← mod(idk − 1, 3) + 1
    if y2i+1 = 0 ∧ y2i+2 = 0 ∧ fc + ci,j ≤ C then
        fc ← fc + ci,j
        if j = 1 then
            y2i+1 ← 0; y2i+2 ← 1
        if j = 2 then
            y2i+1 ← 1; y2i+2 ← 0
        if j = 3 then
            y2i+1 ← 1; y2i+2 ← 1
return Y
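A Python sketch of the repair operator is shown below; the bit-pair encoding and data are illustrative assumptions, and the code is not the MATLAB implementation used in the experiments.

```python
def repair(y, cost, ID, C):
    """Greedy repair in the spirit of Algorithm 2: drop unaffordable selected
    items in priority order, then fill the remaining capacity greedily."""
    pair_of = {1: (0, 1), 2: (1, 0), 3: (1, 1)}  # assumed item encoding
    fc = 0
    # Repair phase: keep a selected item only while the capacity C allows it.
    for idk in ID:
        i, j = (idk - 1) // 3, (idk - 1) % 3 + 1
        if (y[2 * i], y[2 * i + 1]) == pair_of[j]:
            if fc + cost[idk - 1] <= C:
                fc += cost[idk - 1]
            else:
                y[2 * i], y[2 * i + 1] = 0, 0
    # Optimization phase: add the best-ratio affordable item to empty groups.
    for idk in ID:
        i, j = (idk - 1) // 3, (idk - 1) % 3 + 1
        if y[2 * i] == 0 and y[2 * i + 1] == 0 and fc + cost[idk - 1] <= C:
            y[2 * i], y[2 * i + 1] = pair_of[j]
            fc += cost[idk - 1]
    return y, fc

# Both groups initially pick their combined item (total cost 15 > C = 10):
# the lower-ratio one is dropped, then a cheaper single item is added back.
y, fc = repair([1, 1, 1, 1], [4, 5, 7, 3, 6, 8], [3, 1, 2, 6, 4, 5], 10)
print(y, fc)  # [1, 1, 0, 1] 10
```

Each of the 3n entries of ID is visited once per phase, which matches the O(n) complexity noted above (the sort that builds ID is performed once, beforehand).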
To sum up, the pseudo-code of our proposed BSSA algorithms for DKP01 is detailed in Algorithm 3.
Algorithm 3: Pseudo-code for BSSA algorithms for DKP01
Input: Initial parameters
Output: Optimal solution
Initialize salp population considering upper bound and lower bound
while end condition is not met do
for each salp do
Convert the real position values of the current salp into binary numbers using Eq 15 or 16
Calculate the fitness of the current salp using the repair operator
F ← the best salp
Generate c1 using Eq 6
for each salp do
if the salp is the leader then
Update the position of the leading salp by Eq 5
else
Update the position of the current salp by Eq 7
Amend the salps based on the upper and lower bounds of variables
return F
Results and discussion
The simulations in this paper serve the following goals:
- Compare four variants of our proposed BSSA to determine the best ones for DKP01. This is an internal test only. Thus, only the algorithms proposed by this paper are included in related tests, diagrams, and tables.
- Then, our best BSSA variants for DKP01 will be compared to selected algorithms proposed by other scientific works to see which one performs best in various aspects through statistical calculations. The chosen algorithms are the best we could find in recent publications.
Firstly, we choose two revised versions of the genetic algorithm (GA) [36] and particle swarm optimization (PSO) [15] to include in the comparison. For the GA variant for DKP01, we choose FirEGA, introduced in [7]. The PSO version to be tested is the best one from [14], BPSO8. We also include the results of MS1, which is designed based on the moth search algorithm and is the best algorithm for DKP01 proposed in [12], and MMBO, a multi-strategy monarch butterfly optimization algorithm for DKP01 introduced by [10]; the authors of [10] propose several variants of their algorithm, and we choose the best of them. In [19], the authors tested their algorithm on selected instances of the DKP01 problem. Since this is a promising algorithm and a recently published work, we also include its experimental results for comparison.
The parameters used for testing are shown in Table 2. For a fair comparison, we set the population sizes (the number of particles in case of the PSO variant) at the same value, 50. Furthermore, the maximum iterations of all algorithms are set to the number of dimensions of DKP01, 2n, for the same reason.
We use 40 DKP01 instances proposed by [7] and available at https://www.doi.org/10.6084/m9.figshare.19416857.v2 to test all algorithms. They include 10 strongly correlated instances (SDKP1-SDKP10), 10 inverse strongly correlated instances (IDKP1-IDKP10), 10 uncorrelated instances (UDKP1-UDKP10), and 10 weakly correlated instances (WDKP1-WDKP10). The correlation is considered strong when cost and value are closely related and highly dependent on each other. Conversely, the correlation is considered weak when cost and value are loosely related. The number of items in each instance is 3n, n ∈ {100, 200, …, 1000}. The mentioned instances are also used in [37].
All related algorithms, coded in MATLAB R2018a, were run on an ASUS laptop equipped with an Intel Core i5-8250U 1.6 GHz CPU and 8 GB DDR3 SDRAM, running Microsoft Windows 10.
Convergence behaviour
Our first concern is the convergence speed towards the optimal solution of the algorithms. We recorded the degree of convergence by running four versions of our proposed algorithm on different data set files, each algorithm being run once. After each iteration, the resulting best-so-far total value is saved. This set of values is fed into a graph showing the convergence behaviour.
In fact, the four data set types of the DKP01 problem have quite different characteristics. However, we found that the convergence behavior of these algorithms on problems of different sizes on the same data set type is not significantly different. Therefore, we decided to choose two typical cases to describe the convergence of the algorithms for each type of data set. Fig 3 summarizes the converging curves for these types of data sets.
In the tests on all data sets, BSSA1 and BSSA2 proved their superiority over the other two versions of the algorithm. They achieve better solution quality and higher fitness values from the first iterations. The early convergence behaviour also suggests that BSSA1 and BSSA2 can be further improved to take advantage of later iterations. Fig 3 also shows that while BSSA3 and BSSA4 can perform close to BSSA1 and BSSA2 on smaller problems, they are not appropriate for larger problems.
Stability and solution quality
This subsection focuses on examining the stability and quality of the solutions returned by our proposed algorithms. For demonstration, we use box plots whose data are the best values achieved after each algorithm run. To obtain a series of best values that will be used to create the box plot, we run each algorithm 30 times and get 30 best results.
In descriptive statistics, a box plot [38] is a graphical tool that demonstrates the distribution of a data set using its five-number summary: the minimum, the first quartile, the median, the third quartile, and the maximum. The box spans from the first quartile to the third quartile and therefore covers the middle 50 percent of the data; the lowest 25 percent and the highest 25 percent lie outside the box. The horizontal line in the box marks the median: the higher this line, the better the quality. Moreover, the flatter the box, the more consistent the values.
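For instance, the five-number summary behind one box can be computed with the Python standard library (the run values here are hypothetical, not taken from Fig 4):

```python
import statistics

# Ten hypothetical best values collected over repeated runs.
runs = [612, 640, 655, 659, 660, 661, 663, 663, 664, 670]
q1, median, q3 = statistics.quantiles(runs, n=4, method="inclusive")
summary = (min(runs), q1, median, q3, max(runs))
print(summary)  # (612, 656.0, 660.5, 663.0, 670)
```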
We use the same approach as in the analysis of the convergence curves, which means that we choose two typical cases for each data set type. Fig 4 shows these charts. It is easy to see that BSSA3 and BSSA4 cannot compete with BSSA1 and BSSA2. Their boxes are thicker, which means the outputs are not stable. In other words, the differences among best total values obtained after 30 runs of these algorithms are significant. In most cases, even the maximum best value after 30 runs achieved by BSSA3 and BSSA4 is not close to the minimum best value obtained by BSSA1 and BSSA2. This magnifies the preeminence of BSSA1 and BSSA2. The same goes for other tests. If we have a closer look at the boxes provided by BSSA1 and BSSA2, it is fair to conclude that BSSA2 is slightly better than BSSA1 in terms of stability and solution quality.
Wilcoxon rank sum test
The Wilcoxon rank-sum test [39] is a non-parametric hypothesis test used to evaluate whether two populations drawn from separate sources have the same median. In this subsection, Wilcoxon rank-sum tests are applied to assess the differences among the solutions returned by our proposed algorithms. Specifically, Table 3 displays the p-values obtained when testing the solution sets of BSSA1 against those of BSSA2, BSSA3, and BSSA4, respectively. We use the standard significance level α = 0.05. When p ≥ α, there is not enough statistical evidence to confirm that the difference between the compared populations is significant. Otherwise, it can be concluded that the dissimilarity between the two sets of values is notable.
Based on the statistical results in Table 3, we can conclude that the solutions given by BSSA1 are significantly different from the solutions returned by BSSA3 and BSSA4. The situation between BSSA1 and BSSA2 is more complicated. There are 21 tests in which p exceeds the 0.05 threshold, while in the remaining 19 tests, p is less than 0.05. In other words, in 47.5 percent of the tests the difference between the solutions given by BSSA1 and BSSA2 is clear, while in the remaining 52.5 percent we cannot statistically differentiate them. In short, the Wilcoxon rank-sum tests reaffirm what we observed in previous subsections: BSSA1 and BSSA2 are at the same level, and both of them are superior to BSSA3 and BSSA4.
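One such pairwise comparison can be sketched with SciPy's rank-sum implementation (the samples below are synthetic stand-ins for two algorithms' 30-run results, not the paper's data):

```python
import numpy as np
from scipy.stats import ranksums

# Hypothetical best-value samples from 30 runs of two algorithm variants.
rng = np.random.default_rng(1)
strong_variant = rng.normal(loc=1000.0, scale=2.0, size=30)
weak_variant = rng.normal(loc=990.0, scale=6.0, size=30)

# Two-sided Wilcoxon rank-sum test on the two result populations.
stat, p = ranksums(strong_variant, weak_variant)

alpha = 0.05
significant = p < alpha  # True -> the two populations differ notably
```

With p below α the null hypothesis of equal medians is rejected; otherwise the two result sets cannot be statistically distinguished, which is the reading applied to Table 3.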
Friedman test and Nemenyi post-hoc test
This subsection presents the results of the Friedman test [40–42] to provide an additional statistical perspective. The Friedman test is a non-parametric alternative to the repeated-measures ANOVA test [43]. It takes three or more related populations as input and returns a single conclusion after comparing them. The null and alternative hypotheses of this test are:
- H0: The mean values of the populations are similar.
- Ha: At least one population mean is different from the mean values of the rest.
Again, the significance level α = 0.05 is applied. If the p-value returned by the Friedman test is less than or equal to α, the null hypothesis is rejected and the alternative hypothesis is accepted. Otherwise, the null hypothesis is retained.
We perform the Friedman test based on the mean fitness values obtained from 30 runs on each instance. Thus, for each of the algorithms BSSA1, BSSA2, BSSA3, and BSSA4, we have a population of 40 mean values. These four populations are the input for the Friedman test, whose results are as follows:
- Test statistic: 108.66
- p-value: 2.1281E-023
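The test setup above can be sketched with SciPy (the four 40-element samples here are synthetic, constructed only to mimic two strong and two weak variants; they are not the paper's measurements):

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical mean fitness over 40 instances for four variants.
rng = np.random.default_rng(2)
base = rng.normal(1000.0, 50.0, 40)        # instance-specific difficulty
bssa1 = base + rng.normal(5.0, 1.0, 40)    # consistently strong
bssa2 = base + rng.normal(5.0, 1.0, 40)    # consistently strong
bssa3 = base + rng.normal(-5.0, 1.0, 40)   # consistently weaker
bssa4 = base + rng.normal(-5.0, 1.0, 40)   # consistently weaker

# Friedman test: are the four related populations drawn from the
# same distribution, instance by instance?
stat, p = friedmanchisquare(bssa1, bssa2, bssa3, bssa4)
reject_h0 = p <= 0.05  # True -> at least one variant's results differ
```

A rejected null hypothesis, as in the paper's result (p ≈ 2.13E−23), only says that some variant differs; pinpointing which pairs differ requires a post-hoc test such as Nemenyi's, discussed next.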
The results show that at least one population mean is significantly different from the rest. To determine which populations actually differ, we perform the Nemenyi post-hoc test [44], which returns a table of pairwise comparison results. Table 4 shows the p-values of this test.
The results in Table 4 show that, when comparing BSSA1 with BSSA2, the p-value is 0.9. When comparing BSSA1 with BSSA3, the p-value is 0.001. The result is the same when comparing BSSA1 with BSSA4. When comparing BSSA2 with BSSA3 and BSSA4, the p-values are the same and equal to 0.001. The p-value when comparing BSSA3 with BSSA4 is 0.00299. When assessing these values with a significance level of 0.05, it can be seen that BSSA1 and BSSA2 have similar populations of mean total values, and they are significantly different from those of BSSA3 and BSSA4. If we consider the case of BSSA3 and BSSA4, they are also considerably different, although this difference is not as significant as the difference when compared with BSSA1 and BSSA2.
In summary, the Friedman and Nemenyi tests show that the results of BSSA1 and BSSA2 are not significantly different, while they are substantially different from those of BSSA3 and BSSA4.
Comparison to other algorithms
In this subsection, we compare BSSA1 and BSSA2 with five other algorithms for DKP01. The first is an evolutionary algorithm, FirEGA [8], and the second is a swarm-intelligence-based one, BPSO8 [14]. Additionally, the best algorithms proposed in [12] and [10], MS1 and MMBO, are also included in the comparison. Note that these two latter algorithms were not tested on inverse strongly correlated data sets, and the related papers did not provide data for some criteria; nevertheless, the most important results are available and support this comparison. The results of the Improved Harris hawks optimizer (IHHO) [19], a recently published work, are also included in the statistical tables. Although the authors of this algorithm only tested on some representative data sets, we believe that their results help further clarify where our algorithm stands.
Tables 5–8 show the test results. These tables contain statistics of the total values of the solutions returned after 30 runs of each algorithm. Specifically, column Instance gives the name of the instance tested, column OPT stores the optimum value, and column Algorithm specifies the algorithm name. Best, Average, and Worst present the best, average, and worst values, Stdev indicates the standard deviation, and Gap reveals the gap between the average and optimum values. Specifically, the Gap value is calculated as in the expression below:
Gap = (OPT − AVE) / OPT × 100% (19)
where AVE stands for the average result.
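Assuming the conventional relative-gap definition of Eq (19) (the gap between the average result and the optimum, expressed as a percentage of the optimum), the computation is a one-liner:

```python
def gap_percent(opt, ave):
    """Gap between the average result AVE and the optimum OPT,
    expressed as a percentage of the optimum (Eq 19)."""
    return 100.0 * (opt - ave) / opt

# An algorithm averaging 990 on an instance whose optimum is 1000
# has a gap of 1 percent; a gap of 0 means the optimum is reached
# on average over all 30 runs.
g = gap_percent(1000.0, 990.0)
```

A smaller Gap therefore means the 30-run average sits closer to the known optimum.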
Table 5 shows the superiority of our proposed algorithms over the other candidates when tested on inverse strongly correlated instances. They lead the ranking table in almost all of the cases. The only exceptions are IDKP1, where FirEGA matches the best value of our algorithms, and IDKP3, where IHHO achieves the best results in the Best and Average categories. In general, BSSA1 takes the top place 27 times, while BSSA2 does so 36 times; there are also 14 cases in which BSSA1 and BSSA2 share the top rank. Overall, our proposed algorithms lead comfortably in the tests using this instance type.
In the case of strongly correlated instances, the situation changes. Table 6 stores the data related to these tests. BPSO8 adapts very well to this test, leading the ranking table 19 times in total. The results also show that BSSA1 leads 14 times and BSSA2 7 times, while IHHO, MS1, and FirEGA trail in every aspect. It is worth noting that 10 of BPSO8's 19 wins come from the Gap column, where it leads on all 10 strongly correlated instances: its average fitness value is closer to the optimum than that of its opponents. Excluding the Gap column, however, BPSO8 claims the top place only 9 times, which does not put it ahead of BSSA1 (14 wins) and is comparable to MMBO (8 wins). Another interesting fact is that no top spot is shared among the tested algorithms on this instance type.
Moreover, BPSO8 seems genuinely better at attaining the best solution, taking the top of the table 5 times, while the remaining 5 top spots go to MMBO (3) and BSSA1 (2). The race is tighter for the best average result, where BPSO8, MMBO, and BSSA1 each reach the first position 3 times. When comparing the worst outputs over 30 runs, our proposed BSSA1 finishes top 4 times, while its closest rivals, MMBO and BSSA2, another variant of our algorithm, reach the top spot 2 times each. Since MS1 and MMBO provide no data on Stdev and Gap, if we exclude rankings in these columns, BSSA1 and BPSO8 have similar overall performance on strongly correlated instances, with MMBO third and not far behind. In terms of standard deviation, our proposed algorithms lead in all 10 instances, which shows that their returned solutions are more consistent than those of the others. Viewed through average rankings, BPSO8 is best in the Best and Gap categories, while BSSA1 comes out on top in the Average, Worst, and Stdev categories.
Uncorrelated data sets are those in which item values and weights are generated largely independently, and it is interesting to see how the tested algorithms handle this type of distribution. Table 7 gives data on the performance of the algorithms on these instances. The contest is rather even, with BPSO8 and BSSA1 taking the top rank 23 times each. Interestingly, while BPSO8 is unbeatable on all 10 instances in terms of gap values, its standard deviation performance is not as strong: the Stdev values of our proposed BSSA1 and BSSA2 are much lower than those of BPSO8. Hence, since their returned solutions are concentrated close to the expected value, we can conclude that BSSA1 and BSSA2 are more stable than BPSO8, whose solutions are more dispersed. Finally, IHHO, FirEGA, MS1, and MMBO record no wins on this type of DKP01 instance. In terms of average ranking, it is notable that BPSO8 and BSSA1 lead in the same categories as in the case of strongly correlated instances.
Weakly correlated instances are where MMBO performs best. Table 8 shows that MMBO takes the first rank in terms of best solutions 5 times. These tests also confirm the solid performance of our proposed algorithms. BPSO8 performs best on gap values with 10 wins, while IHHO and FirEGA struggle in every aspect with zero wins. Counting only the wins in the Best, Average, and Worst categories, MMBO has 9, BPSO8 has 4, BSSA2 has 6, and BSSA1 leads with 10 first places. Our BSSA1 variant dominates most of the categories in terms of average rankings.
Table 9 summarizes the performance of the tested algorithms through their average rankings. Note that because the results of IHHO are available for only 3 instances per data set type, its average ranking values are calculated by dividing the total ranking value by 3 for each data set type. Additionally, because the data sets used in the experiments differ substantially, we report the statistical results per data set type to see how the algorithms perform on each of them. The lower the average rank, the better the algorithm operates. Our proposed algorithms, BSSA1 and BSSA2, outperform the other algorithms on inverse strongly correlated instances. In the case of strongly correlated and uncorrelated instances, although BPSO8 has the best average Best ranks, it cannot repeat that performance level on the other factors; after analyzing all the factors, BSSA1 achieves the best average rankings on these two data set types. The weakly correlated data sets repeat what can be seen with the inverse strongly correlated instances, where our algorithms hold a clear lead over the others in terms of average rankings. In summary, even though the data sets differ significantly, our proposed algorithms adapt well and deliver higher-quality solutions than the other algorithms.
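The per-type average rankings can be computed by ranking the algorithms within each instance and averaging down the columns. A minimal sketch with invented numbers (three hypothetical algorithms on four instances, higher total value = better = rank 1; ties receive averaged ranks):

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical "Best" values for 3 algorithms (columns) on 4 instances (rows).
best = np.array([
    [100.0, 99.0, 98.0],
    [ 80.0, 81.0, 79.0],
    [ 60.0, 60.0, 55.0],   # tie between the first two algorithms
    [ 40.0, 39.0, 41.0],
])

# Rank within each instance; negate so the largest value gets rank 1.
ranks = np.apply_along_axis(lambda row: rankdata(-row), 1, best)

# Average rank per algorithm: the lower, the better.
avg_rank = ranks.mean(axis=0)
```

This is the mechanism behind Table 9; for an algorithm such as IHHO, tested on only 3 instances per type, the mean is simply taken over those 3 rows instead.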
Computational cost
In this subsection, we provide an overview of our proposed algorithm's running time compared to BPSO8. Since PSO is very popular and widely used, we chose it as the benchmark algorithm for this test. This comparison shows the computational time of our BSSA2 algorithm relative to a widely recognized optimization algorithm. To perform this assessment, we run the algorithms on the uncorrelated instances (UDKP1–UDKP10). Each algorithm runs 30 times on each instance, and the average running time over the 30 runs is calculated. The test results show that our algorithm needs more time to finish a run, which could be due to the differences in the operations of the global search algorithm. Nevertheless, the time required is still acceptable.
Table 10 provides the results of this test.
Conclusions and future work
The discounted {0-1} knapsack problem (DKP01) is not just a theoretical problem but also a model widely applied in real life. Therefore, an effective way to solve this problem will benefit real-world business and real-time decision-making systems. Following the metaheuristic approach to NP-hard problems, this paper proposes and evaluates a new optimization algorithm with four variants based on the salp swarm algorithm that integrates several new techniques and yields better solution quality. This quality is worth the additional computational cost. The new algorithm is also more stable in producing good solutions than existing ones.
Although the performance of our algorithm is promising, some aspects can be further studied in the future. Firstly, in the current approach, SSA is responsible for covering the search space, while the repair operator corrects errors in the solutions provided by SSA and optimizes them according to a predetermined strategy. The problem is that SSA's exploration capability is somewhat limited, and tweaks are needed to strengthen the global search capacity. Simply put, the current mechanism makes this algorithm very powerful at exploiting a certain direction and at searching the neighborhoods of the salps in the chain; however, with reasonable modifications, later salps could contribute significantly to the exploration process. Improving this property would make the algorithm more powerful and flexible. Next, the repair operator can also be improved; we will be testing various options to make it work even better, such as defining a new item-partitioning scheme. It is also important to note that the sizes of the problem instances in the test data sets range only from 100 to 1,000 dimensions. On more complex data sets, such as 10,000 or 100,000 dimensions or even more, current algorithms for DKP01 will reveal their weakness in terms of computational cost. That is also a direction we plan to focus on, specifically developing parallel versions of the algorithm that exploit the computing power of next-generation CPUs and GPUs to reduce the computational cost.
References
- 1. Mathews GB. On the Partition of Numbers. Proceedings of the London Mathematical Society. 1896;s1-28(1):486–490.
- 2. Guldan B. Heuristic and exact algorithms for discounted knapsack problems. University of Erlangen-Nürnberg, Germany; 2006.
- 3. Balas E, Zemel E. An Algorithm for Large Zero-One Knapsack Problems. Operations Research. 1980;28(5):1130–1154.
- 4. Martello S, Pisinger D, Toth P. Dynamic Programming and Strong Bounds for the 0-1 Knapsack Problem. Management Science. 1999;45(3):414–424.
- 5. Rong A, Figueira JR, Klamroth K. Dynamic programming based algorithms for the discounted 0–1 knapsack problem. Applied Mathematics and Computation. 2012;218(12):6921–6933.
- 6. He YC, Wang XZ, He YL, Zhao SL, Li WB. Exact and approximate algorithms for discounted 0-1 knapsack problem. Information Sciences. 2016;369:634–647.
- 7. He Y, Wang X, Li W, Zhang X, Chen Y. Research on genetic algorithms for discounted 0–1 knapsack problem. Chinese J Comput. 2016;39(12):2614–2630.
- 8. He Y, Wang X, Gao S. Ring Theory-Based Evolutionary Algorithm and its application to D{0-1} KP. Applied Soft Computing. 2019;77:714–722.
- 9. Draa A. On the performances of the flower pollination algorithm – Qualitative and quantitative analyses. Applied Soft Computing. 2015;34:349–371.
- 10. Feng YH, Wang GG, Li W, Li N. Multi-strategy monarch butterfly optimization algorithm for discounted 0-1 knapsack problem. Neural Computing and Applications. 2018;30:3019–3036.
- 11. Wang GG, Deb S, Cui Z. Monarch butterfly optimization. Neural Computing and Applications. 2019;31(7):1995–2014.
- 12. Feng YH, Wang GG. Binary Moth Search Algorithm for Discounted 0-1 Knapsack Problem. IEEE Access. 2018;6:10708–10719.
- 13. Wu C, Zhao J, Feng Y, Lee M. Solving discounted {0-1} knapsack problems by a discrete hybrid teaching-learning-based optimization algorithm. Applied Intelligence. 2020;50:1872–1888.
- 14. Truong TK. Different Transfer Functions for Binary Particle Swarm Optimization with a New Encoding Scheme for Discounted 0-1 Knapsack Problem. Mathematical Problems in Engineering. 2021;2021.
- 15. Poli R, Kennedy J, Blackwell T. Particle swarm optimization. Swarm intelligence. 2007;1(1):33–57.
- 16. Truong TK. A New Moth-Flame Optimization Algorithm for Discounted {0-1} Knapsack Problem. Mathematical Problems in Engineering. 2021;2021.
- 17. Mirjalili S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowledge-Based Systems. 2015;89:228–249.
- 18. Heidari AA, Mirjalili S, Faris H, Aljarah I, Mafarja M, Chen H. Harris hawks optimization: Algorithm and applications. Future Generation Computer Systems. 2019;97:849–872.
- 19. Guo W, Xu P, Dai F, Zhao F, Wu M. Improved Harris hawks optimization algorithm based on random unscented sigma point mutation strategy. Applied Soft Computing. 2021;113:108012.
- 20. Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM. Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems. Advances in Engineering Software. 2017;114:163–191.
- 21. Faris H, Mafarja MM, Heidari AA, Aljarah I, Ala’M AZ, Mirjalili S, et al. An efficient binary salp swarm algorithm with crossover scheme for feature selection problems. Knowledge-Based Systems. 2018;154:43–67.
- 22. Ateya AA, Muthanna A, Vybornova A, Algarni AD, Abuarqoub A, Koucheryavy Y, et al. Chaotic salp swarm algorithm for SDN multi-controller networks. Engineering Science and Technology, an International Journal. 2019;22(4):1001–1012.
- 23. Nunes BAA, Mendonca M, Nguyen XN, Obraczka K, Turletti T. A survey of software-defined networking: Past, present, and future of programmable networks. IEEE Communications surveys & tutorials. 2014;16(3):1617–1634.
- 24. Hegazy AE, Makhlouf M, El-Tawel GS. Feature selection using chaotic salp swarm algorithm for data classification. Arabian Journal for Science and Engineering. 2019;44(4):3801–3816.
- 25. Tubishat M, Idris N, Shuib L, Abushariah MA, Mirjalili S. Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection. Expert Systems with Applications. 2020;145:113122.
- 26. Rizk-Allah RM, Hassanien AE, Elhoseny M, Gunasekaran M. A new binary salp swarm algorithm: development and application for optimization tasks. Neural Computing and Applications. 2019;31(5):1641–1663.
- 27. Qais MH, Hasanien HM, Alghuwainem S. Enhanced salp swarm algorithm: Application to variable speed wind generators. Engineering Applications of Artificial Intelligence. 2019;80:82–96.
- 28. Yang B, Zhong L, Zhang X, Shu H, Yu T, Li H, et al. Novel bio-inspired memetic salp swarm algorithm and application to MPPT for PV systems considering partial shading condition. Journal of cleaner production. 2019;215:1203–1222.
- 29. Panda N, Majhi SK. Improved salp swarm algorithm with space transformation search for training neural network. Arabian Journal for Science and Engineering. 2020;45(4):2743–2761.
- 30. Wang H, Wu Z, Liu Y, Wang J, Jiang D, Chen L. Space Transformation Search: A New Evolutionary Technique. In: Proceedings of the First ACM/SIGEVO Summit on Genetic and Evolutionary Computation. GEC'09. New York, NY, USA: Association for Computing Machinery; 2009. p. 537–544.
- 31. Salgotra R, Singh U, Singh G, Singh S, Gandomi AH. Application of mutation operators to salp swarm algorithm. Expert Systems with Applications. 2021;169:114368.
- 32. Kansal V, Dhillon JS. Emended salp swarm algorithm for multiobjective electric power dispatch problem. Applied Soft Computing. 2020;90:106172.
- 33. Zhang H, Wang Z, Chen W, Heidari AA, Wang M, Zhao X, et al. Ensemble mutation-driven salp swarm algorithm with restart mechanism: Framework and fundamental analysis. Expert Systems with Applications. 2021;165:113897.
- 34. Minai AA, Williams RD. Original Contribution: On the Derivatives of the Sigmoid. Neural Netw. 1993;6(6):845–853.
- 35. von Seggern DH. CRC Standard Curves and Surfaces with Mathematica, Second Edition (Advances in Applied Mathematics). Massachusetts, USA: Chapman & Hall/CRC; 2006.
- 36. Holland JH. Genetic Algorithms. Scientific American. 1992;267(1):66–73.
- 37. Zhu H, He Y, Wang X, Tsang ECC. Discrete differential evolutions for the discounted 0-1 knapsack problem. International Journal of Bio-Inspired Computation. 2017;10(4):219.
- 38. McGill R, Tukey JW, Larsen WA. Variations of box plots. The American Statistician. 1978;32(1):12–16.
- 39. Haynes W. Wilcoxon Rank Sum Test. In: Dubitzky W, Wolkenhauer O, Cho KH, Yokota H, editors. Encyclopedia of Systems Biology. New York, NY: Springer New York; 2013. p. 2354–2355.
- 40. Friedman M. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. Journal of the American Statistical Association. 1937;32(200):675–701.
- 41. Friedman M. A Correction: The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. Journal of the American Statistical Association. 1939;34(205):109.
- 42. Friedman M. A comparison of alternative tests of significance for the problem of m rankings. The Annals of Mathematical Statistics. 1940;11(1):86–92.
- 43. Gueorguieva R, Krystal JH. Move Over ANOVA: Progress in analyzing repeated-measures data and its reflection in papers published in the archives of General Psychiatry. Archives of General Psychiatry. 2004;61(3):310–317. pmid:14993119
- 44. Nemenyi PB. Distribution-Free Multiple Comparisons. PhD thesis, Princeton University; 1963.