Optimal shortening of uniform covering arrays

Jose Torres-Jimenez; Nelson Rangel-Valdez; Himer Avila-George; Oscar Carrizalez-Turrubiates

doi:10.1371/journal.pone.0189283

Abstract

Software test suites based on the concept of interaction testing are very useful for testing software components in an economical way. Test suites of this kind may be created using mathematical objects called covering arrays. A covering array, denoted by CA(N; t, k, v), is an N × k array over $\mathbb{Z}_v = \{0, 1, \ldots, v-1\}$ with the property that every N × t sub-array covers all t-tuples of $\mathbb{Z}_v^t$ at least once. Covering arrays can be used to test systems in which failures occur as a result of interactions among components or subsystems. They are often used in areas such as hardware Trojan detection, software testing, and network design. Because system testing is expensive, it is critical to reduce the amount of testing required. This paper addresses the Optimal Shortening of Covering ARrays (OSCAR) problem, an optimization problem whose objective is to construct, from an existing covering array matrix of uniform level, an array with dimensions of (N − δ) × (k − Δ) such that the number of missing t-tuples is minimized. Two applications of the OSCAR problem are (a) to produce smaller covering arrays from larger ones and (b) to obtain quasi-covering arrays (covering arrays in which the number of missing t-tuples is small) to be used as input to a meta-heuristic algorithm that produces covering arrays. In addition, it is proven that the OSCAR problem is NP-complete, and twelve different algorithms are proposed to solve it. An experiment was performed on 62 problem instances, and the results demonstrate the effectiveness of solving the OSCAR problem to facilitate the construction of new covering arrays.

Citation: Torres-Jimenez J, Rangel-Valdez N, Avila-George H, Carrizalez-Turrubiates O (2017) Optimal shortening of uniform covering arrays. PLoS ONE 12(12): e0189283. https://doi.org/10.1371/journal.pone.0189283

Editor: M. Sohel Rahman, Bangladesh University of Engineering and Technology, BANGLADESH

Received: June 21, 2017; Accepted: November 23, 2017; Published: December 21, 2017

Copyright: © 2017 Torres-Jimenez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files. The supporting data is additionally available at: http://www.tamps.cinvestav.mx/~oc/OSCAR.

Funding: The authors acknowledge the GENERAL COORDINATION OF INFORMATION AND COMMUNICATIONS TECHNOLOGIES (CGSTIC) at CINVESTAV for providing HPC resources on the Hybrid Cluster Supercomputer “Xiuhcoatl,” that have contributed to the research results reported. The following projects have funded the research reported in this paper: 238469 - CONACYT Métodos Exactos para Construir Covering Arrays Optimos to JT-J; 2143 - Cátedras CONACYT - Fortalecimiento de las capacidades de TICs en Nayarit to HA-G; and 148784 - Fondo Mixto CONACYT y Gobierno del Estado de Nayarit, Unidad de Transferencia Tecnologica CICESE – Nayarit to HA-G.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Functionality tests during software development demand special attention, and they are generally important for preventing malfunctions in software components. During the testing phase, it is desirable to find all errors that could arise in a software component before it is delivered to the user. If a software component has a large number of parameters, then testing it exhaustively might be expensive because of the large number of configurations that can arise from the different parameters’ values; e.g., a software component with just 20 parameters of 2 different values each would require 2²⁰ = 1,048,576 tests. An alternative is to test the system using a small, randomly generated test suite, but in this case, there is no guarantee of the testing coverage; instead, a better choice is to use a combinatorial testing approach that provides a coverage guarantee for small test suites. This combinatorial testing approach (also called interaction testing) guarantees the coverage of all interactions of a certain size among different values of the input parameters of a software component. This approach is based on evidence presented by [1] that many errors are produced by the interactions of only a few parameter values. Specifically, the cited authors showed evidence that test suites with an interaction size of 6 are sufficient to detect all known errors in a collection of different software components.

A uniform covering array (CA), denoted by CA(N; t, k, v), is a commonly used structure in interaction testing. It is an array $\mathcal{A}$ with dimensions of N × k constructed over $\mathbb{Z}_v = \{0, 1, \ldots, v-1\}$ with the property that every N × t sub-array covers all members of $\mathbb{Z}_v^t$ at least once. The value of N is the number of rows of $\mathcal{A}$, i.e., the number of test cases; k is the number of columns or parameters; v is the number of values that each parameter can take; and t is the degree of interaction among the parameters (the strength). Because there are $\binom{k}{t}$ sets of t columns {c₁, …, c_t}, the number of different t-tuples that must be covered at least once in $\mathcal{A}$ is $v^t \binom{k}{t}$. When a specific t-tuple is missing in a set of t columns (c₁, …, c_t), we refer to it as a missing t-wise combination (or a missing combination, for short). For example, a CA(6; 2, 5, 2) covers all $2^2 \binom{5}{2} = 40$ pairwise combinations of five binary parameters at least once using only six test cases.
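To make the definition concrete, the following minimal sketch counts the missing t-wise combinations of an array. The function name `count_missing` and the sample array are illustrative assumptions and are not taken from the paper; the sample is a binary array formed by a zero row plus five distinct columns of weight three, a construction that yields a CA(6; 2, 5, 2).

```python
from itertools import combinations

def count_missing(array, t, v):
    """Count t-wise combinations not covered by `array` (rows over 0..v-1)."""
    k = len(array[0])
    missing = 0
    for cols in combinations(range(k), t):
        # Distinct t-tuples that appear in this set of t columns.
        seen = {tuple(row[c] for c in cols) for row in array}
        missing += v ** t - len(seen)
    return missing

# Illustrative array (an assumption, not the array printed in the paper).
sample = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1],
]

print(count_missing(sample, t=2, v=2))      # 0 means every pair is covered
print(count_missing(sample[:4], t=2, v=2))  # fewer rows may leave pairs uncovered
```

A result of zero identifies a CA; any positive value is the number of missing combinations of a quasi-CA of the same size.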

The covering array construction (CAC) problem is the search for the covering array number (CAN), i.e., the minimum value N for which an array CA(N;t, k, v) still exists. Formally, the CAN can be defined as CAN(t, k, v) = min{N|∃ CA(N; t, k, v)}.

In some theoretical studies, the following asymptotic bound is adopted: CAN(t, k, v) = O(v^t log k) [2]. This bound is interesting because, for fixed t and v, as the number of columns grows linearly, the number of rows grows only logarithmically. This is an advantage of such combinatorial structures because of the possibility of deriving small test suites. For instance, for a software component with 126 binary parameters, exhaustive testing would require 2¹²⁶ tests, whereas interaction testing with strength 2 would require only 10 tests.
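The figure of 10 tests for 126 binary parameters can be checked against the known closed form for binary arrays of strength 2: the largest k that admits a CA(N; 2, k, 2) is $\binom{N-1}{\lceil N/2 \rceil}$ (a classical result due to Katona and to Kleitman and Spencer). The sketch below is only illustrative, and the function name `can_binary_pairwise` is an assumption.

```python
from math import comb

def can_binary_pairwise(k):
    """Smallest N with comb(N - 1, ceil(N / 2)) >= k, i.e. CAN(2, k, 2) for k >= 2."""
    n = 4  # at least v**t = 4 rows are needed to cover the four binary pairs
    while comb(n - 1, (n + 1) // 2) < k:
        n += 1
    return n

print(can_binary_pairwise(126))  # rows of a pairwise test suite for 126 binary parameters
print(2 ** 126)                  # rows of the exhaustive test suite
```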

A complementary problem to the CAC problem is known as the test suite reduction problem (TSRP), which consists of finding, for a given array, the smallest subset of rows that covers all t-wise combinations [3]. The CAC problem is a special case of the TSRP in which the input is an array that contains all v^k distinct test cases.

For some special cases, there are algorithms that can solve the CAC problem in polynomial time:

when v = t = 2 [4],
when v is a prime power and k ≤ v + 1 [5], and
when k = t + 1 [6].

However, the CAC problem remains highly combinatorial in most cases. Moreover, some variants have been proven to be NP-complete; e.g., the work presented in [2, 7] shows the NP-completeness of the problem of extending a matrix by one row with no fewer than m missing t-wise combinations. The problem defined in the current work is also NP-complete, as proven in this paper.

Various methods have been developed to address the CAC problem. Exact methods solve it to optimality; however, they usually require exponential time to achieve their goal [8–11]. As a result of this complexity, various approximate methods have been proposed as alternatives, including recursive [4, 12, 13], algebraic [14, 15], greedy [16–20], and meta-heuristic approaches. This last category includes methods based on strategies such as genetic algorithms [21], simulated annealing [22], and tabu search [23].

These approximate algorithms can be used to build non-optimal CAs in a reasonable time; some of these algorithms depend on the quality of their inputs to produce small CAs. Most of the time, these inputs are based on matrices that are nearly CAs. The objective of the present work is to construct matrices with sufficiently few missing combinations to still be considered quasi-CAs. Such arrays are created by solving the problem known as the Optimal Shortening of Covering ARrays (OSCAR); related results were published in [24]. The OSCAR problem is relevant to the construction of CAs because it can produce smaller CAs or excellent initialization matrices for meta-heuristic algorithms for constructing CAs. The main contributions of this work are as follows. It formalizes three of the five algorithms presented in [24]. It also presents seven new approximate strategies for solving the OSCAR problem. In addition, the present work offers a complete analysis of the performance of all of the new and old algorithms, something that has not been done before. Furthermore, it proposes three new benchmarks with more than 800 OSCAR instances, which extend the range of study to matrices with strengths of t = {2, 3, 4, 5}, whereas previous works have studied only t = 2; these benchmarks are used as part of the experiments conducted to analyze the strategies. These experiments not only evaluate how effectively the algorithms solve the OSCAR problem but also compare the best of them against state-of-the-art strategies. 
These experiments provide evidence that solving the OSCAR problem using the proposed approaches enables the creation of quasi-CAs that are better than other reported initialization functions and even than the fast and versatile IPOG-F, a state-of-the-art algorithm for constructing CAs; the main result is that the arrays produced using the proposed algorithms have 90% fewer missing t-wise combinations than those generated using the other approaches considered for comparison.

This paper is organized as follows. In the problem definition section, the OSCAR problem is formally defined; its NP-completeness is proven, and some of its applications are described. In the related work section, some of the work related to initialization functions for meta-heuristics for CA construction is presented. Subsequently, the algorithms proposed in this work for solving the OSCAR problem are presented. In the experimentation section, an experiment performed to test the proposed algorithms for the construction of matrices with few missing combinations is presented. Finally, in the conclusions section, final comments regarding this work are provided.

Problem definition

Let $\mathcal{A}$ denote a CA(N; t, k, v) or a quasi-CA(N; t, k, v) (a quasi-CA is a matrix with a relatively small number of missing t-wise combinations). Then, the OSCAR problem can be defined as $\min \tau(\mathcal{B})$ over all submatrices $\mathcal{B}$ of $\mathcal{A}$ with dimensions N′ × k′, where $\tau$ is a function that counts the number of missing t-wise combinations in the given array and N′ = N − δ and k′ = k − Δ are defined in terms of two predefined integer values, 0 ≤ δ ≤ N − v^t and 0 ≤ Δ ≤ k − t, which satisfy δ > 0 ∨ Δ > 0. Hence, an OSCAR instance is specified by the elements $(\mathcal{A}, \delta, \Delta)$.

The search space for an OSCAR instance consists of all (N − δ) × (k − Δ) submatrices $\mathcal{B}$ of the given matrix $\mathcal{A}$. Accordingly, the number of feasible solutions that form such a space can be estimated to be $\binom{N}{N-\delta}\binom{k}{k-\Delta}$, where $\binom{N}{N-\delta}$ and $\binom{k}{k-\Delta}$ represent the numbers of different ways to choose subsets of rows and columns, respectively, from the original matrix $\mathcal{A}$. Throughout the remainder of this document, for a given submatrix $\mathcal{B}$, we use $\mathcal{R}$ to denote the subset of rows chosen from $\mathcal{A}$ and $\mathcal{K}$ to denote the subset of columns.

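We present an example of a solution to the OSCAR instance specified by a 6 × 5 binary input array $\mathcal{A}$ with t = 2 (see Table 1), δ = 2, and Δ = 2, for which it is feasible to construct a solution $\mathcal{B}$ (see Table 2) with no missing combinations. The solution for this instance is obtained by eliminating two rows and two columns from $\mathcal{A}$. Because the number of missing combinations in $\mathcal{B}$ is zero, the solution is a CA(4; 2, 3, 2).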

Table 1. OSCAR example, the input array $\mathcal{A}$.

https://doi.org/10.1371/journal.pone.0189283.t001

Table 2. Solution to the OSCAR problem, $\mathcal{B}$, when δ = 2 and Δ = 2.

https://doi.org/10.1371/journal.pone.0189283.t002
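For very small instances, the search space of row and column subsets can be enumerated directly. The sketch below is a brute-force illustration only and is not one of the algorithms proposed in this paper; the helper names and the 6 × 5 input array are assumptions, and the array is not the content of Table 1.

```python
from itertools import combinations
from math import comb

def count_missing(array, t, v):
    """Number of t-wise combinations that `array` fails to cover."""
    k = len(array[0])
    return sum(
        v ** t - len({tuple(row[c] for c in cols) for row in array})
        for cols in combinations(range(k), t)
    )

def oscar_brute_force(array, t, v, delta, big_delta):
    """Return (missing, kept_rows, kept_cols) with the fewest missing combinations."""
    n, k = len(array), len(array[0])
    best = None
    for rows in combinations(range(n), n - delta):
        for cols in combinations(range(k), k - big_delta):
            sub = [[array[r][c] for c in cols] for r in rows]
            candidate = (count_missing(sub, t, v), rows, cols)
            if best is None or candidate[0] < best[0]:
                best = candidate
    return best

# Hypothetical 6 x 5 binary input with strength t = 2.
a = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1],
]

print(comb(6, 2) * comb(5, 2))  # number of candidate submatrices: 150
missing, rows, cols = oscar_brute_force(a, t=2, v=2, delta=2, big_delta=2)
print(missing, rows, cols)
```

Exhaustive enumeration is only practical for toy sizes, since the number of candidates grows as a product of two binomial coefficients.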

Alternatively, the matrix $\mathcal{A}$ can be represented by another matrix $\mathcal{A}'$ with dimensions of $N \times \binom{k}{t}$. This matrix has the same number of rows as $\mathcal{A}$ and contains one column for each subset of t columns derived from $\mathcal{A}$. Each cell $a'_{ij}$ contains a value from the set {0, 1, …, v^t − 1}; this value represents the t-tuple covered by row i in the subset of t columns associated with column j.

The OSCAR instance is shown in Tables 3, 4 and 5. The initial matrix is shown in Table 3, the t-tuples and sets of columns are shown in Table 4, and the new matrix representation is presented in Table 5 (t-wise combinations covered).

Table 3. An instance of the OSCAR problem, initial matrix.

https://doi.org/10.1371/journal.pone.0189283.t003

Table 4. An instance of the OSCAR problem, the t-tuples and sets of columns.

https://doi.org/10.1371/journal.pone.0189283.t004

Table 5. An OSCAR instance, t-wise combinations covered.

https://doi.org/10.1371/journal.pone.0189283.t005
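The alternative representation can be sketched as follows. The encoding of each t-tuple as a base-v number is an assumption made here for illustration (the mapping used in Table 4 may differ), and the function names are hypothetical.

```python
from itertools import combinations

def tuple_representation(array, t, v):
    """Map an N x k array to an N x C(k, t) array of t-tuple codes in 0..v**t - 1."""
    k = len(array[0])
    col_sets = list(combinations(range(k), t))
    rep = []
    for row in array:
        codes = []
        for cols in col_sets:
            code = 0
            for c in cols:
                code = code * v + row[c]  # read the t symbols as a base-v number
            codes.append(code)
        rep.append(codes)
    return col_sets, rep

def missing_from_representation(rep, t, v):
    """Missing combinations are the codes absent from each column of the representation."""
    return sum(v ** t - len(set(column)) for column in zip(*rep))

ca = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]  # a CA(4; 2, 3, 2)
col_sets, rep = tuple_representation(ca, t=2, v=2)
print(col_sets)
print(rep)
print(missing_from_representation(rep, t=2, v=2))
```

In this form, counting missing combinations reduces to counting which codes never appear in each column, which is the property the NP-completeness argument relies on.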

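Finally, the tuple $(\mathcal{A}', \delta, \Delta)$, in which $\mathcal{A}'$ is the alternative representation of the input matrix, is used to define an instance of the OSCAR problem, and the NP-completeness of the problem can be proven based on this new representation. The remainder of this section is devoted to this proof.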

The proof that the OSCAR problem is NP-complete

To demonstrate the NP-completeness of the OSCAR problem, it is necessary to show that the problem is in NP and that a problem already known to be NP-complete can be transformed into it in polynomial time. For this purpose, this work presents the transformation of the maximum cover (or MAXCOVER) problem (cf. [25] for a review of this problem) into the OSCAR problem. For the proof, the previously defined notation for an OSCAR instance is extended to $(\mathcal{A}', \delta, \Delta, h)$, where the value h denotes an integer that supports the following question: is there a sub-array $\mathcal{B}$ of the input matrix with dimensions of (N − δ) × (k − Δ) such that $\tau(\mathcal{B}) \le h$, where $\tau$ counts the missing t-wise combinations? This question transforms the OSCAR problem into its decision form, which is required for this demonstration.

First, it is proven that the OSCAR problem is in NP. Let us begin with the case in which Δ = 0, meaning that the matrix $\mathcal{B}$ is formed from a subset of only the rows of the input matrix. Clearly, the size of the search space is reduced to $\binom{N}{N-\delta}$. The claim that the problem is in NP holds because computing the number of missing combinations in $\mathcal{B}$ only requires examining, for the chosen rows, all possible sets of t columns, which are equal in number to the number of columns of the alternative representation $\mathcal{A}'$. In other words, the question of whether the number of missing combinations is at most h for the OSCAR problem can be answered in polynomial time in the dimensions of $\mathcal{A}'$.

Now that it has been shown that the OSCAR problem is in NP, let us proceed with the transformation of the NP-complete MAXCOVER problem. The objective of the MAXCOVER problem is to cover a given set $Q = \{q_1, \ldots, q_l\}$, regarded as the universe. To achieve this goal, we must use a sub-collection of $\mathcal{Y} = \{Y_1, \ldots, Y_m\}$, where each subset $Y_i \subseteq Q$, for all 1 ≤ i ≤ m, is given in advance, and the sub-collection has a size of at most C. This problem can be characterized by the tuple $(Q, \mathcal{Y}, C)$ and can be transformed into an OSCAR instance as follows: a) The matrix $\mathcal{A}'$ is constructed, with m + 1 rows and l + max{|Y_i|} + 1 columns. b) For 1 ≤ i ≤ m and 1 ≤ j ≤ l, the value of each cell $a'_{ij}$ is 1 if subset Y_i covers element q_j or 0 otherwise. c) For 1 ≤ i ≤ m and j > l, the value of each cell $a'_{ij}$ is 0. d) For i = m + 1, the value of each cell $a'_{ij}$ is 0 if 1 ≤ j ≤ l or 1 otherwise. e) The values of δ, Δ, and h are set to m − C, 0, and 0, respectively. The matrix $\mathcal{A}'$ can be constructed in a time of O(lm), and the derived OSCAR instance is denoted by $(\mathcal{A}', m - C, 0, 0)$.

Table 6 shows an example of the transformation of the MAXCOVER problem into the OSCAR¹ problem. The following elements are used in this case:

Y₂ = {q₂, q₄, q₅}, Y₃ = {q₁, q₄, q₅}, Y₄ = {q₁, q₂, q₃}, Y₅ = {q₂, q₃, q₄}
C = 3

Table 6. The MAXCOVER instance specified by the universe Q, the collection {Y₁, Y₂ = {q₂, q₄, q₅}, Y₃ = {q₁, q₄, q₅}, Y₄ = {q₁, q₂, q₃}, Y₅ = {q₂, q₃, q₄}}, and C = 3, represented as an OSCAR instance.

https://doi.org/10.1371/journal.pone.0189283.t006

The resulting array has dimensions of 6 × 9, and δ is equal to 5 − 3 = 2.
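Steps a) to e) of the transformation can be sketched as follows. The small MAXCOVER instance used here is hypothetical (it is not the instance of Table 6), the function names are assumptions, and the brute-force checks serve only to illustrate that the two decision problems agree on this toy input. Each column of the constructed matrix is treated as a column of the t-tuple representation with two possible values.

```python
from itertools import combinations

def maxcover_to_oscar(universe, subsets, c):
    """Build the (m + 1) x (l + max|Y_i| + 1) matrix and the values (delta, Delta, h)."""
    l, m = len(universe), len(subsets)
    extra = max(len(y) for y in subsets) + 1            # columns with index j > l
    matrix = [[1 if q in y else 0 for q in universe] + [0] * extra for y in subsets]
    matrix.append([0] * l + [1] * extra)                # row m + 1
    return matrix, m - c, 0, 0

def oscar_decision(matrix, delta, h):
    """Brute force for Delta = 0: can N - delta rows leave at most h values uncovered?"""
    n, width = len(matrix), len(matrix[0])
    for rows in combinations(range(n), n - delta):
        missing = sum(2 - len({matrix[r][j] for r in rows}) for j in range(width))
        if missing <= h:
            return True
    return False

def has_cover(universe, subsets, c):
    """Brute force MAXCOVER decision: do some c subsets cover the whole universe?"""
    return any(set().union(*pick) >= set(universe) for pick in combinations(subsets, c))

# Hypothetical instance: l = 4 elements, m = 4 subsets.
universe = ["q1", "q2", "q3", "q4"]
subsets = [{"q1", "q2"}, {"q3"}, {"q3", "q4"}, {"q2", "q4"}]

for c in (1, 2):
    matrix, delta, big_delta, h = maxcover_to_oscar(universe, subsets, c)
    print(c, has_cover(universe, subsets, c), oscar_decision(matrix, delta, h))
```

With the sizes stated in the text (m = 5, l = 5, and subsets of at most three elements), the same dimension formula gives the 6 × 9 array mentioned above.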

Finally, to complete the proof that the OSCAR problem is NP-complete, we demonstrate that the OSCAR instance built from the MAXCOVER instance has a solution if and only if the latter has a solution. For this purpose, we start by showing that an optimal solution for the OSCAR instance must include row Y_{m+1} of the constructed matrix (row Y₆ in the example of Table 6). This fact can be easily proven since all t-tuples must be covered in the solution and those with the value 1 in any column j > l can only be covered by row Y_{m+1}.

The next step is to show that there is a solution with C subsets for the MAXCOVER instance iff there is a matrix with C + 1 rows that solves the derived OSCAR instance. This condition can also be easily proven. We first note that the t-tuple with value 0 is covered for any column j ≤ l by the row Y_{m+1}. The same tuple is also covered for any column j > l by any row from {Y₁, …, Y_m}. With this information, the only t-tuples that remain uncovered are those with value 1 in any column j ≤ l. Given that during the construction of the OSCAR instance, a t-tuple with value 1 is assigned only to those rows in columns j ≤ l that are associated with one of the given subsets Y_i, the following claim is valid: any sub-collection of the given subsets that is formed of C elements and represents a solution for the MAXCOVER instance can also be transformed into a solution for the OSCAR instance. This claim is justified since the rows associated with the chosen C elements cover all t-tuples in every column except those with value 1 in columns j > l (and possibly those with value 0 in columns j ≤ l). Then, it is necessary only to add row m + 1 to cover the missing t-tuples. It is also true that a solution with C + 1 rows for the OSCAR instance is a valid solution for the equivalent MAXCOVER instance, since it is necessary only to choose those subsets Y_i associated with the rows selected in the solution for the OSCAR instance. Finally, if one of these instances has no solution, then neither does the other; this claim holds because of the equivalence between such solutions, which has already been shown. Hence, it is demonstrated that the MAXCOVER instance has a solution if and only if the derived OSCAR instance has a solution.

Finally, any instance of the OSCAR problem for the case of Δ > 0 is equivalent to $\binom{k}{\Delta}$ instances of the problem with Δ = 0, one for each choice of the columns to be removed. Since it has been proven that the special case Δ = 0 is NP-complete, the general case of the OSCAR problem is at least as hard.

Applications of the OSCAR problem

Methods of solving the OSCAR problem have the following applications: a) they can reduce the search space in the CAC problem; b) they can directly construct CAs, when there are no t-wise combinations missing in the matrices they generate; c) they can be used as initializing functions for meta-heuristics for CA construction; d) they can aid in the identification of better upper bounds for CA matrices; and e) they can be used for fine-tuning in experimental design. Each of these applications is detailed in the remainder of this section.

Regarding the first application, solving the OSCAR problem successfully yields a quasi-CA that has zero or a small number of missing t-wise combinations. Such a situation is convenient since, instead of searching for a CA(N + δ; t, k + Δ, v) in a feasible region corresponding to the original domain, it may be possible to construct such a CA from a relaxed region of a smaller size.

The second and third applications of the OSCAR problem are related to the construction of CAs. The OSCAR problem enables the direct construction of CAs when the solution has no missing t-wise combinations, i.e., when the solution is itself a CA. Additionally, whenever the matrix constructed as a solution to an OSCAR instance is not a CA (i.e., the number of missing t-tuples is greater than zero), this solution can still be used indirectly for CA construction because it can serve as the initial solution for meta-heuristic algorithms. Note that the performance of a meta-heuristic for constructing CAs depends on the quality of the initial matrix. Hence, the sub-array obtained as a solution to the OSCAR problem is adequate for this purpose because it has only a few missing t-wise combinations; this is in contrast to arrays of the same size constructed using random initialization functions, which are likely to be missing a large number of the possible t-wise combinations due to their random nature. Some of the existing meta-heuristic algorithms designed for CA construction, which show dependence on the initial matrix, are reported in [21, 23, 26, 27]. It is in algorithms of this type that the OSCAR problem finds its main area of application, namely, the generation of initial matrices with few missing t-wise combinations.

The fourth application of the OSCAR problem is the identification of new upper bounds for CA matrices. Many such upper bounds have been reported in the literature. For example, the best upper bounds for some CAs can be found in the repositories of [28, 29]. In addition, some bounds on CAN(t, k, v) can be found in [29]; however, the corresponding CAs have values of N that are far from optimal.

Because of the hardness of the CAC problem, the value of CAN(t, k, v) for any arbitrary set of values of t, k, and v is generally unknown. However, suitable new upper bounds can be obtained from existing matrices; e.g., between CA(174; 2, 110, 9) and CA(177; 2, 117, 9), the upper bounds on the required numbers of columns for the cases of N = 175 and N = 176 are unknown, but it can be inferred that they should lie between 111 and 116. Because most of these upper bounds have not been shown to be optimal, the question arises as to whether other upper bounds can be found. We conclude that inputs derived by solving the OSCAR problem can be used to test potential upper bounds in order to find new bounds for CA(N; t, k, v); this can be achieved through the proper selection of the values δ and Δ used to reduce the matrix size.

Some specific cases of the values of δ and Δ are as follows:

When δ > 0 and Δ = 0, i.e., only the number of rows is to be reduced, the rows that are selected to be discarded are those whose elimination results in the minimum number of missing combinations in the final array.
When δ = 0 and Δ > 0, i.e., only the number of columns is to be reduced, the columns that are selected to be discarded are similarly those whose elimination results in the minimum number of missing t-wise combinations. However, this case makes sense only when the array is not a CA.

Finally, another application of solutions to the OSCAR problem is their direct use in testing scenarios. Through the careful selection of the OSCAR problem parameters δ and Δ, it is possible to ensure that the resulting sub-array has the desired numbers of rows (i.e., test cases) and columns (i.e., parameters) to produce a quasi-CA (with 90–100% coverage of the t-tuples) that provides the required level of assurance.

The proposed methodology for the construction of CAs consists of generating an initial solution for a meta-heuristic algorithm by solving an instance of the OSCAR problem; i.e., the OSCAR problem is solved to obtain a sub-array with few missing combinations, which is then used as the initial array in a meta-heuristic algorithm.

Related work

The construction of CAs is a highly combinatorial problem that can benefit from the use of approximate algorithms to construct CAs of a desired size within a reasonable amount of time. Many researchers, instead of directing their efforts toward finding CAs with the minimum number of rows using an exact approach, have designed approximate algorithms to improve the best known upper bound for CAs and then reduce the gap between that bound and the CAN. These CA construction algorithms can be classified, in accordance with their characteristics, into the following types: (a) algebraic approaches, (b) exact approaches, (c) greedy approaches, (d) transformations, and (e) meta-heuristic approaches.

Algebraic methods have the characteristic that the CA construction process involves formulas or operations using mathematical objects such as vectors, finite fields, groups, and CAs with small values of t, k, and v. Some algebraic methods yield optimal constructions, including the CA(N; 2, k, 2) methods of [30] and [31]; Bush’s construction method for CA(N; t, q + 1, q), where q is a prime or a prime power and t ≤ q (cf. [5]); and the zero-sum method of [6], which yields an optimal CA(v^t; t, t + 1, v) for any t ≥ 2. The main feature of these approaches is that most of them require small CAs or quasi-CAs from which to construct larger CAs.

Exact methods are exhaustive approaches for the construction of optimal CAs. Although some approaches include techniques for accelerating the search process, they generally require exponential time to complete their task, making them practical only for the construction of small optimal CAs. This category includes branch-and-bound (B&B) strategies, such as the work proposed by [10], which incorporates symmetry-breaking techniques, partial t-wise verification and fixed blocks in the bounding process, and the work of [8], which, for the generation of a non-isomorphic CA(N; 2, k, 2), uses a pruning strategy based on bounds defined by the minimum ranks established in terms of the CA size.

Greedy strategies are commonly used for combinations of the parameters N, t, k, and v for which exact methods are impractical, with the basic purpose of producing a good solution in a short time. The majority of commercial and open-source tools for generating test data (including AETG [32], TCG [17], ACTS [33], IPOG-F [19], and DDA [34]) use greedy algorithms for CA construction.

Transformations generally exploit the structure of existing CAs either to make them smaller or to support other approaches, e.g., algebraic approaches, in creating smaller CAs. This task is usually performed in one of two ways: a) through the identification of redundancy or b) through the construction of submatrices. Redundancy in a CA can be identified through the permutation of rows or columns or through the changing of symbols (cf. [35], [36] and [37]). However, approaches based on the construction of submatrices provide a better basis for new CAs, and the present work can be considered to be of this type.

Finally, similar to greedy methods, meta-heuristic approaches are strategies that are not guaranteed to find a CA with the minimum number of rows. In practice, meta-heuristic methods yield very good results, but they consume more CPU time than greedy algorithms. Some meta-heuristics that have been used to solve the CAC problem include simulated annealing (SA) [22], tabu search (TS) [38], memetic algorithms (MAs) [27], and genetic algorithms (GAs) [21].

For all of the strategies described above, the main goal is the construction of CAs, i.e., matrices with zero missing t-wise combinations. However, the CA construction performance of algebraic and meta-heuristic approaches is improved when the initial matrices are quasi-CAs, i.e., when they are missing only a small number of the possible t-wise combinations. This situation raises the question of how an initial matrix should be constructed for these approaches. The answer is to use initialization functions. Hence, these initialization functions are a key element of the development of meta-heuristics for CA construction.

The main initialization functions used in state-of-the-art methods are as follows: a) random matrix initialization [21, 23, 26, 27], b) initialization with a balanced number of symbols per column [27], c) initialization through row augmentation [39], d) initialization based on submatrices [40], and e) initialization based on greedy strategies [41, 42]. The first four strategies do not consider the number of missing t-wise combinations in the construction of the initial matrix. Strategies of the last type can be used to build CAs, but these are typically larger than the required matrix size; this situation results in random discarding of rows and/or columns that is also not optimized in terms of the number of missing t-wise combinations. Hence, an alternative is to use an existing matrix of greater size and optimize the row/column reduction process until a matrix of the required size is obtained. This optimization is exactly equivalent to solving the OSCAR problem, and this work proposes a wide variety of new meta-heuristic and hybrid strategies for this purpose.

In summary, whereas CA construction approaches (e.g., exact, greedy, algebraic, meta-heuristic and transformation methods) produce matrices with no missing t-wise combinations, the strategies presented in this work solve the OSCAR problem to generate quasi-CAs. Quasi-CAs are important because they can be used as initial matrices for CA construction strategies based on algebraic and meta-heuristic methods and can thus improve the performance of these methods in the construction of new CAs.

The remainder of this section provides a more detailed introduction to four of the relevant initialization functions found in the scientific literature related to this topic. These four initialization functions will be referred to as the random, balanced, Hamming, and t-groups functions in this paper; Fig 1 shows an example of each initialization function.

Fig 1. Example of the Hamming distances between the two rows r₁ and r₂ that are already in the matrix C and the two candidate rows d₁ and d₂.

https://doi.org/10.1371/journal.pone.0189283.g001

Each of the four initialization functions creates an array C with N rows and k columns, in which each cell is initialized with a symbol of the given alphabet {0, 1, …, v − 1} of v symbols. The random function is presented in [21, 23, 26, 27]; this function initializes each cell c_ij of C with a symbol drawn at random from the set {0, 1, …, v − 1}. Fig 1 (random) shows an example of the use of the random function to initialize a matrix C.

The balanced function initializes the matrix C with a balanced number of randomly generated symbols per column. Each column k_i, where 1 ≤ i ≤ k, will contain an almost uniform distribution of the symbols {0, 1, …, v − 1}. To achieve such uniformity, a symbol is generated at random for each of the N rows of column k_i, but during the random generation process, it is ensured that the first $N \bmod v$ symbols appear $\lceil N/v \rceil$ times and that the remaining symbols appear $\lfloor N/v \rfloor$ times. For example, in a 10 × 4 matrix with an alphabet size of v = 3, each of the four columns contains two symbols that appear $\lfloor 10/3 \rfloor = 3$ times and one symbol that appears $\lceil 10/3 \rceil = 4$ times; this situation is exemplified in Fig 1 (balanced). The use of the balanced function guarantees that each column has a balance in the cardinalities of each symbol, something that cannot be guaranteed when using the random function. The balanced function is a generalization of the initialization function presented in [27] for solving the binary CAC problem using an SA approach.
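A minimal sketch of the balanced initialization is given below. It builds the required multiset of symbols for each column and shuffles it, which is an implementation choice made here rather than the exact generation procedure of [27]; the function name and the optional `rng` parameter are assumptions.

```python
import random

def balanced_init(n, k, v, rng=None):
    """Build an n x k array whose columns hold each symbol floor(n/v) or ceil(n/v) times."""
    rng = rng or random.Random()
    q, r = divmod(n, v)
    matrix = [[0] * k for _ in range(n)]
    for j in range(k):
        # The first r symbols appear q + 1 times; the remaining symbols appear q times.
        column = [s for s in range(v) for _ in range(q + (1 if s < r else 0))]
        rng.shuffle(column)
        for i in range(n):
            matrix[i][j] = column[i]
    return matrix

m = balanced_init(10, 4, 3, rng=random.Random(0))
for j in range(4):
    column = [row[j] for row in m]
    print([column.count(s) for s in range(3)])  # per-symbol counts in column j
```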

The Hamming function initializes the matrix C one row at a time. This function generates the first row r₁ at random; i.e., each of its cells will contain a symbol randomly chosen from {0, 1, …, v − 1}. Subsequently, each new row is selected from a set of two random candidate rows d₁ and d₂ and is added to C. The chosen candidate row is the one that maximizes the Hamming distance with respect to all rows r_s that already exist in C. The Hamming distance between two rows is equal to the number of positions at which the corresponding symbols are different; correspondingly, the Hamming distance between a candidate row d_j and all rows already in C is equal to the number of positions l in each row r_s that differ from the corresponding positions in d_j, summed over all existing rows r_s. Formally, this latter definition can be expressed as $H(d_j) = \sum_{s=1}^{i} \sum_{l=1}^{k} h(r_{s,l}, d_{j,l})$, where i is the number of rows already added to C and $h(r_{s,l}, d_{j,l}) = 1$ if $r_{s,l} \neq d_{j,l}$ or 0 otherwise. This process is repeated until all N rows have been created. This initialization function has been used previously in [39].

An example of the selection of a row as defined in the Hamming function is shown in Fig 2; the matrix C already contains 2 rows, and the third row will be the candidate d₁ because it maximizes the value of the accumulated Hamming distance. Fig 1 (Hamming) shows the full initial matrix.

Fig 2. Initialization functions.

(a) results in 20 missing combinations. (b) results in 18 missing combinations. (c) results in 15 missing combinations. (d) results in 7 missing combinations.

https://doi.org/10.1371/journal.pone.0189283.g002
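The row-augmentation rule based on Hamming distances can be sketched as follows. The function names, the optional `rng` parameter, and the tie-breaking rule (the first candidate wins) are assumptions made for this illustration.

```python
import random

def hamming_to_all(rows, candidate):
    """Sum, over all existing rows, of the positions where the candidate differs."""
    return sum(a != b for row in rows for a, b in zip(row, candidate))

def hamming_init(n, k, v, rng=None):
    """Build an n x k array row by row, keeping the farther of two random candidates."""
    rng = rng or random.Random()

    def random_row():
        return [rng.randrange(v) for _ in range(k)]

    matrix = [random_row()]
    while len(matrix) < n:
        d1, d2 = random_row(), random_row()
        # Ties are broken in favour of d1 (an arbitrary choice made here).
        matrix.append(max((d1, d2), key=lambda d: hamming_to_all(matrix, d)))
    return matrix

print(hamming_to_all([[0, 0, 1], [1, 1, 1]], [1, 0, 0]))  # 2 + 2 = 4
print(hamming_init(5, 4, 3, rng=random.Random(0)))
```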

Finally, the t-groups initialization function fills the matrix based on groups of t columns. This function is based on a sub-array that is constructed using the v^t combinations of symbols derived from an alphabet of size v and a strength value of t; e.g., for v = 3 and t = 2, the sub-array is formed of the elements in the set {00, 01, 02, 10, 11, 12, 20, 21, 22}, where each element represents a row of the sub-array. The function is performed in two steps. In the first step, the sub-array is used to define the symbols in the first t columns of the matrix. During this process, the sub-array is juxtaposed with itself to complete the N rows of the matrix; specifically, it is juxtaposed ⌊N/v^t⌋ times, and the remaining rows are filled with its first N mod v^t rows. In the second step, the first t columns of the matrix are copied into the next subset of t columns whose symbols have not yet been defined, and the values are changed in some pairs of rows; these changes are executed by randomly choosing pairs of rows and, for each pair, exchanging the values of those columns between the two rows. This step is repeated until all k columns of the matrix have been defined. If the number of columns in the last subset (t′) is smaller than t, then only the first t′ columns of the sub-array are used. This function is a generalization of the last initialization function presented in [40]. An example of this initialization method is shown in Fig 1 (t-groups).

Algorithms for solving the OSCAR problem

This paper has formally defined the OSCAR problem and has proven that it is NP-complete. Now, various strategies are proposed for solving this problem. This section is devoted to this purpose; throughout the remainder of the section, each proposed approach is described in detail.

Given that a solution to a specific instance of the OSCAR problem is defined by two sets, one of rows and one of columns, and considering that each of these two sets can be selected using one of three different approaches (exact, greedy, or meta-heuristic), it is possible to define 9 basic algorithms, as shown in Table 7. The superindices on the exact and greedy options in the table indicate the number of variants that have been defined. For the exact approach, there are two algorithms: one first reduces the number of columns and then the number of rows, and the other first reduces the number of rows and then the number of columns. Three variants have been defined for the greedy approach: columns first and then rows, rows first and then columns, and columns and rows reduced in an alternating fashion. Thus, we ultimately present a total of 12 possible algorithms for solving the OSCAR problem.

Table 7. Different algorithms for solving the OSCAR problem.

The algorithms are grouped by the exact, greedy, and meta-heuristic approaches.

https://doi.org/10.1371/journal.pone.0189283.t007

We first describe the reduction of the numbers of rows and columns of the initial matrix using the greedy approach. Afterward, the three greedy algorithms are defined. Next, we introduce the two exact algorithms for solving the OSCAR problem by exploring the entire search space, which has a size of $\binom{N}{\delta}\binom{k}{\Delta}$; these algorithms are based on a B&B approach [43]. Next, the meta-heuristic algorithm for solving the OSCAR problem is presented; this algorithm is based on the SA approach. Finally, the six hybrid algorithms for solving the OSCAR problem are defined.

Greedy algorithms for solving the OSCAR problem

The proposed greedy algorithms are based on two functions that reduce the number of rows or columns one element (row or column) at a time, starting from the initial array, while considering the number of missing t-wise combinations after the reduction process. Examples of all of the greedy strategies proposed in this section are based on the OSCAR instance shown in Fig 3, which presents the problem instance specified by the initial matrix and the values Δ = 1 and δ = 2. The figure shows each combination of t columns (each t-tuple) derived from the matrix and all of the possible t-wise combinations of symbols that could be found in each of them; it also shows the sets A_R and A_C of rows and columns, respectively. In addition, it shows the auxiliary structure P, which stores the number of times that each t-wise combination is covered in each t-tuple; P is a matrix of v^t = 2² = 4 rows and $\binom{k}{t} = \binom{5}{2} = 10$ columns, in which each cell p_{i,j} contains the number of times that the i^th t-wise combination of symbols appears in the matrix in the subset of columns defined by the j^th t-tuple.
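The auxiliary structure P can be computed directly from its definition. The sketch below is illustrative only: the names `coverage_table` and `count_missing` are hypothetical, and the matrix is assumed to be a list of rows over the symbols {0, …, v − 1}.

```python
from itertools import combinations, product

def coverage_table(matrix, t, v):
    """Return (P, tuples, combos): P[i][j] counts how often the i-th t-wise
    combination of symbols appears in the columns of the j-th t-tuple."""
    k = len(matrix[0])
    tuples = list(combinations(range(k), t))    # all t-tuples of columns
    combos = list(product(range(v), repeat=t))  # all v^t symbol combinations
    index = {c: i for i, c in enumerate(combos)}
    P = [[0] * len(tuples) for _ in combos]
    for row in matrix:
        for j, cols in enumerate(tuples):
            P[index[tuple(row[c] for c in cols)]][j] += 1
    return P, tuples, combos

def count_missing(P):
    """A zero cell of P is a t-wise combination that no row covers."""
    return sum(cell == 0 for line in P for cell in line)
```

A matrix is a covering array of strength t exactly when `count_missing` returns zero for its table.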

Fig 3. OSCAR instance.

Problem instance specified by the matrix and the values Δ = 1 and δ = 2. Related information, t-wise combinations, t-tuples, and P matrix.

https://doi.org/10.1371/journal.pone.0189283.g003

Greedy approach for reducing the number of rows.

The greedy function that reduces the number of rows is defined as follows. Let A_R be the set of rows of the array, and let (o₁, …, o_N) be a vector in which each element o_i is associated with a row r_i and has a value equal to the number of t-wise combinations that are covered exclusively by that row. The function selects the row r_i with i = argmin_j{o_j} to be discarded; ties are broken randomly.

The function uses this vector, computed for the initial array, to choose a row r_i to be discarded such that the value o_i is minimized. Discarding that row results in an array whose number of missing t-wise combinations is larger by exactly o_i, since once row r_i is discarded, the t-wise combinations that were covered exclusively by row r_i are no longer covered. Therefore, among all arrays obtained by discarding a single row, the resulting array is missing the minimum possible number of t-wise combinations, because o_i has the minimum value among the elements of the vector. When o_i = 0, row r_i is clearly superfluous, since it does not cover any t-wise combinations exclusively.

Every time that a row r_i is discarded in the reduction process, the number of rows that cover each of the t-wise combinations covered by the discarded row r_i must be decreased by one. Whenever a t-wise combination is then covered by only a single remaining row j, the value of o_j must be increased by one. We update the vector in this way.
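The row-selection rule can be sketched as follows. For brevity, this illustration recomputes the exclusive-coverage vector from scratch instead of updating it incrementally as the text describes; the function names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

def exclusive_counts(matrix, t):
    """o[i] = number of t-wise combinations that only row i covers."""
    k = len(matrix[0])
    covering = defaultdict(list)  # (column tuple, symbols) -> covering rows
    for i, row in enumerate(matrix):
        for cols in combinations(range(k), t):
            covering[(cols, tuple(row[c] for c in cols))].append(i)
    o = [0] * len(matrix)
    for rows in covering.values():
        if len(rows) == 1:  # covered by exactly one row
            o[rows[0]] += 1
    return o

def discard_one_row(matrix, t, rng):
    """Drop a row with the minimum exclusive count; ties are broken randomly."""
    o = exclusive_counts(matrix, t)
    best = min(o)
    i = rng.choice([r for r, value in enumerate(o) if value == best])
    return i, [row for r, row in enumerate(matrix) if r != i]
```

Recomputing is simpler to read but slower than the incremental update; it yields the same vector after each removal.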

The time required to initially populate for the function is (in all algorithms with a greedy component, this process is called getN()), since it is necessary to explore all rows per set of t columns, to determine the number of times that each t-tuple is covered, and to confirm the t-wise combinations that are covered by only one row. The time required to discard a row and update is , since it is necessary to first explore the vector and then, for each set of t columns, verify the number of times that each t-tuple is covered in at most N − 1 rows.

Tables 8, 9 and 10 illustrate the application of getN() and the row-reduction function to the matrix defined in Fig 3. The vector shown in Table 8 is the result of the call to getN(). Each element of this vector has a value equal to the number of unique t-wise combinations covered by the corresponding row; e.g., the value o₁ = 2 implies that row r₁ contains two t-wise combinations that are exclusively covered by this row (these are the symbol combinations 00 and 10 corresponding to the t-tuples (c₂, c₄) and (c₃, c₅), respectively). Now, a call to the row-reduction function will result in an arbitrary selection from among the rows {r₁, r₂, r₅, r₆}; let us assume that r₂ is chosen. The elimination of this row will produce the new vector shown in Table 10. To illustrate the update operation, Table 9 shows how the auxiliary structure P is modified in accordance with the t-wise combinations that are eliminated with the deletion of row r₂; note that there are 8 new t-wise combinations that are now uniquely covered in the remaining rows. In the new vector, the value of the element corresponding to each of these rows is incremented by the number of t-wise combinations in that row for which the corresponding value in P has been changed to 1 after the elimination of row r₂. For example, the t-wise combinations 01, 00 and 10 associated with the t-tuples (c₁, c₂), (c₁, c₃), and (c₂, c₃) are newly exclusively covered by row r₅ after the removal of row r₂; consequently, o₅ is increased from 2 to 5 in the new vector.

Table 8. Greedy approach for reducing the number of rows: Examples of the getN() and row-reduction functions.

Results of getN().

https://doi.org/10.1371/journal.pone.0189283.t008

Table 9. Greedy approach for reducing the number of rows: Examples of the getN() and row-reduction functions.

P matrix.

https://doi.org/10.1371/journal.pone.0189283.t009

Table 10. Greedy approach for reducing the number of rows: Examples of the getN() and row-reduction functions.

Results of the row-reduction function.

https://doi.org/10.1371/journal.pone.0189283.t010

Greedy approach for reducing the number of columns.

The function that reduces the number of columns using the greedy approach is defined as follows. Let A_C be the set of columns of the array; let a k × k array store in each element k_{i,j} the number of times that columns i and j together are involved in a missing t-wise combination; and let (u₁, …, u_k) be a vector in which each element u_i is associated with a column and has a value equal to the number of times that the column is involved in a missing t-wise combination. The function selects the column c_i with i = argmax_j{u_j} to be discarded; ties are broken randomly. Whenever a column i is discarded, the vector is updated by subtracting the value k_{i,j} from u_j for all j ≠ i.

In summary, the function chooses a column i associated with the maximum value u_i in the vector. Discarding column i yields an array whose number of missing t-wise combinations is smaller by u_i, since once column i has been discarded, the missing combinations involving column i are deleted. Therefore, among all arrays obtained by discarding a single column, the resulting array has the minimum number of missing t-wise combinations, since u_i has the greatest value among the elements of the vector.

When we discard a column, the values of the elements of must be updated. To do so, a value of −1 is assigned to u_i, and the value of each element u_j such that j ≠ i is updated based on its interaction with the recently discarded column; i.e., u_j = u_j − k_i,j. This process is intuitively illustrated as follows. Suppose that we have a set of n criminals who are accused of having committed m crimes together, and suppose that the authorities have found that a certain criminal s is the only one who committed l of these crimes, where l ≤ m; then, the number of crimes of which each of the remaining criminals is accused must be decreased in accordance with his initially suspected degree of participation in committing crimes with criminal s.
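The update rule u_j = u_j − k_{i,j} can be shown with a small Python sketch. The function name is hypothetical, and the tie-breaking here takes the first maximum for reproducibility, whereas the paper breaks ties randomly.

```python
def discard_column(u, K):
    """Pick the column with the largest u value and update the vector.

    u[j]    : times column j is involved in a missing t-wise combination
    K[i][j] : times columns i and j are jointly involved in one
    """
    i = u.index(max(u))  # assumption: first maximum instead of a random tie-break
    # The discarded column is marked with -1; every other column loses the
    # missing combinations that it shared with column i.
    new_u = [-1 if j == i else u[j] - K[i][j] for j in range(len(u))]
    return i, new_u

# Example with t = 2: the pair (0, 1) misses two combinations and the
# pair (1, 2) misses one, so column 1 is involved in all three.
u = [2, 3, 1]
K = [[0, 2, 0],
     [2, 0, 1],
     [0, 1, 0]]
chosen, updated = discard_column(u, K)
```

After column 1 is discarded, no missing combination involves the two remaining columns.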

The time required to initially populate and for the function is (in all algorithms with a greedy component, this process is called getK()), since for each set of t columns, all N rows must be explored, the vector must then be updated based on the missing t-wise combinations in these columns, and must be updated for all possible pairs in this set of t columns. The time required to discard a column and update and accordingly is O(2k), since the vector must be explored to obtain the column i with the greatest value, and column i of must then be explored to update .

Tables 11 and 12 illustrate the application of getK() and the column-reduction function to the matrix defined in Fig 3. The matrix and the vector shown in Table 11 are the results of the call to getK(); given that the initial matrix is a CA, all of their values are zero because there are no missing t-wise combinations. Now, a call to the column-reduction function will result in the arbitrary selection of a column from among {c₁, c₂, c₃, c₄, c₅}; let us assume that c₂ is chosen. The elimination of this column will produce the new vector shown in Table 12, which also indicates zero missing t-wise combinations because the values u_j, for j ≠ 2, are all zero.

Table 11. Greedy approach for reducing the number of columns: Examples of the getK() and column-reduction functions.

https://doi.org/10.1371/journal.pone.0189283.t011

Table 12. Greedy approach for reducing the number of columns: New vector.

https://doi.org/10.1371/journal.pone.0189283.t012

Now that the two greedy reduction functions have been defined, the three greedy algorithms are introduced below.

Greedy algorithm (columns first, then rows).

The first greedy algorithm reduces the initial matrix to a matrix with k − Δ columns using the column-reduction function. The newly formed array has N rows and k − Δ columns. This array is then further reduced to a matrix with N − δ rows using the row-reduction function, yielding the solution. Algorithm 1 describes the approach.

Algorithm 1

1: function

2:

3:

4: getK

5: for i < Δ do

6:

7:

8: end for

9:

10: getN

11: for i < δ do

12:

13:

14: end for

15:

16:

17: return

18: end function

The time required to execute can be calculated from the times required for populating and updating the necessary structures, as follows: .

During the execution of this algorithm, Δ columns are first discarded from the initial array following the defined reduction process, resulting in an array with k − Δ columns; then, the necessary structures for eliminating rows from this array are populated, and finally, δ rows are discarded following the defined reduction process, yielding the solution.
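The overall flow of the columns-first greedy algorithm can be sketched as below. This is a simplified illustration, not the authors' implementation: instead of maintaining the getK() and getN() structures, it recounts the missing combinations after each tentative removal, which selects the same kind of element at each step but at a higher cost. Ties are resolved by taking the first candidate, and all function names are hypothetical.

```python
from itertools import combinations

def count_missing(matrix, t, v):
    """Number of t-wise combinations not covered by the matrix."""
    k = len(matrix[0])
    missing = 0
    for cols in combinations(range(k), t):
        seen = {tuple(row[c] for c in cols) for row in matrix}
        missing += v ** t - len(seen)
    return missing

def drop_column(matrix, c):
    return [row[:c] + row[c + 1:] for row in matrix]

def drop_row(matrix, r):
    return matrix[:r] + matrix[r + 1:]

def greedy_columns_then_rows(matrix, t, v, n_cols, n_rows):
    """Discard n_cols columns first, then n_rows rows, one element at a time."""
    for _ in range(n_cols):
        k = len(matrix[0])
        # Column whose removal leaves the fewest missing combinations.
        c = min(range(k), key=lambda c: count_missing(drop_column(matrix, c), t, v))
        matrix = drop_column(matrix, c)
    for _ in range(n_rows):
        # Row whose removal leaves the fewest missing combinations.
        r = min(range(len(matrix)), key=lambda r: count_missing(drop_row(matrix, r), t, v))
        matrix = drop_row(matrix, r)
    return matrix
```

Swapping the two loops gives the rows-first variant.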

Fig 4 shows an example of the application of this algorithm to the problem instance presented in Fig 3. The figure illustrates how the initial matrix evolves into the final matrix. First, Fig 4(a) shows the changes made to the initial matrix due to the elimination of columns (see the loop in lines 5 to 8); for each iteration i of this loop, the figure presents the initial vector, the set J_C of columns chosen so far, the vector obtained by updating it after the elimination of the column c selected in that iteration, and the resulting matrix after that iteration. This part of the algorithm is performed only once because Δ = 1. Subsequently, Fig 4(b) presents the changes made to the last matrix obtained in the previous process due to the elimination of rows (see the loop in lines 11 to 14); for each iteration i of this loop, the figure presents the vector derived from the previous matrix, the set J_R of rows chosen so far, the updated vector obtained after the elimination of the row r selected in that iteration, and the resulting matrix after that iteration. This second loop is repeated twice because δ = 2. The last matrix obtained in the second part of the algorithm is returned as the final matrix.

Fig 4. Example of the greedy algorithm that discards columns first.

(a) Discarding columns. (b) Discarding rows.

https://doi.org/10.1371/journal.pone.0189283.g004

Greedy algorithm (rows first, then columns).

The second greedy algorithm first removes δ rows from the initial matrix using the row-reduction function to obtain an array with N − δ rows and k columns. This array is then reduced to a matrix with k − Δ columns using the column-reduction function to obtain the final solution. Algorithm 2 describes the approach.

Algorithm 2

1: function

2:

3:

4: getN

5: for i < δ do

6:

7:

8: end for

9:

10: getK

11: for i < Δ do

12:

13:

14: end for

15:

16:

17: return

18: end function

The time required to execute this algorithm can be calculated from the times required for populating and updating the necessary structures: δ rows are first discarded from the input array, generating an array with N − δ rows and k columns, and this array is then reduced to one with k − Δ columns to obtain the solution.

Fig 5 shows an example of the application of this algorithm to the problem instance presented in Fig 3. The figure illustrates how the initial matrix evolves into the final matrix. First, Fig 5(a) presents the changes made to the initial matrix due to the elimination of rows (see the loop in lines 5 to 8); for each iteration i of this loop, the figure presents the initial vector, the set J_R of rows chosen so far, the vector obtained by updating it after the elimination of the row r selected in that iteration, and the resulting matrix after that iteration. This part of the algorithm is repeated twice because δ = 2. Subsequently, Fig 5(b) presents the changes made to the last matrix obtained in the previous process due to the elimination of columns (see the loop in lines 11 to 14); for each iteration i of this loop, the figure presents the vector derived from the last matrix, the set J_C of columns chosen so far, the updated vector after the elimination of the column c selected in that iteration, and the resulting matrix after that iteration. This second loop is executed only once because Δ = 1. The last matrix obtained in the second part of the algorithm is returned as the final matrix.

Fig 5. Example of the greedy algorithm that discards rows first.

(a) Discarding rows. (b) Discarding columns.

https://doi.org/10.1371/journal.pone.0189283.g005

Greedy algorithm (alternating rows and columns).

The third greedy algorithm distributes the elimination of Δ columns and δ rows in a round-robin fashion. It alternately discards a single row and then some number of columns until the number of rows has been reduced to N − δ. The algorithm uses a vector with δ elements, where each element d_i, corresponding to the i^th discarded row, indicates the number of columns that should be discarded immediately after discarding that row. When Δ > δ, the first δ − 1 elements of the vector are each filled with the same value, and the last one is filled with the number of columns still needed to reach a total of Δ. When δ ≥ Δ, the first Δ elements of the vector are each filled with a value of one, and the remaining elements are zero. Once the Δ columns have been distributed among the δ rows, the algorithm repeatedly discards one row and then d_i columns. Hence, after the exploration of element i of the vector, the numbers of rows and columns of the array will have decreased to N_i = N − i and $k_i = k - \sum_{j=1}^{i} d_j$, respectively. Algorithm 3 describes the approach.
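The distribution of the Δ columns over the δ row removals can be sketched as follows. The exact per-element value used when Δ > δ is not recoverable from the text above, so this sketch assumes an integer (floor) share for the first δ − 1 elements and the remainder for the last; the function and parameter names are hypothetical.

```python
def column_schedule(rows_to_drop, cols_to_drop):
    """Return d, where d[i] columns are discarded right after the i-th row."""
    d = [0] * rows_to_drop
    if rows_to_drop == 0:
        return d
    if cols_to_drop > rows_to_drop:
        share = cols_to_drop // rows_to_drop  # assumption: floor division
        for i in range(rows_to_drop - 1):
            d[i] = share
        # The last element absorbs whatever is left so the total is exact.
        d[-1] = cols_to_drop - share * (rows_to_drop - 1)
    else:
        for i in range(cols_to_drop):
            d[i] = 1
    return d
```

For the instance of Fig 3 (Δ = 1, δ = 2), this gives one column after the first row and none after the second.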

The time required to execute this algorithm can be obtained by considering how the numbers of columns and rows of the array are reduced while exploring the vector. As each element i of the vector is explored, the necessary structures are first populated to discard a row from the current array, which has N − i + 1 rows, and the number of rows is reduced to N − i. Next, the structures needed to discard columns from the new array with the reduced number of rows are populated, and the number of columns is then reduced by d_i.

Fig 6 shows an example of the application of this algorithm to the problem instance presented in Fig 3. Because Δ is not greater than δ in this instance, the number of columns that must be eliminated after the elimination of each row is given by the vector (1, 0); i.e., after the deletion of the first row, one column must be deleted, and then the algorithm proceeds to the deletion of the second row to satisfy the value δ = 2. The first column of Fig 6 lists the main structures that are changed during the execution of the algorithm, and each of the remaining columns represents a different iteration of the main loop of the algorithm.

Fig 6. Example of the greedy algorithm that alternates between rows and columns.

Column 1 shows the main structures used throughout the algorithm, and each of the remaining columns represents a different iteration of the main algorithm.

https://doi.org/10.1371/journal.pone.0189283.g006

Algorithm 3

1: function

2:

3:

4: if Δ > 0 then

5: if Δ > δ then

6: for i ← 0 to i < δ − 1 do

7:

8: end for

9:

10: else

11: for i ← 0 to i < Δ do

12: d_i ← 1

13: end for

14: end if

15: for i ← 0 to i < δ do

16: getN

17:

18:

19: getK

20: for j ← 0 to j < d_i do

21:

22:

23: end for

24: end for

25:

26: else

27:

28: end if

29: return

30: end function

Meta-heuristic algorithm

The approximate algorithm for searching for a solution to the OSCAR problem is based on the SA approach and is described in Algorithm 4. SA is a general-purpose stochastic optimization strategy that has been shown to be an efficient means of approximating globally optimal solutions to many NP-complete combinatorial optimization problems. In this strategy, a solution is first constructed using the Initialize(…) method, and this solution is designated as the first global best solution; the algorithm then enters an iterative improvement process, controlled by the length of the Markov chain, until a termination criterion is met. In each iteration of this improvement process, a new solution is generated using the GenerateNeighbor(…) method, and this new solution replaces the current solution whenever its quality is superior or the probability condition is satisfied. The probability condition is based on the Boltzmann distribution and is defined with respect to the temperature, which decreases from an initial value to a final value, and the values of the quality function τ(…) for the current and new solutions. The global best solution is also updated every time a solution improves upon it. The details of these procedures are presented in the remainder of this subsection.

Algorithm 4

1: function

2:

3:

4: ← Initialize(N, k, δ, Δ)

5:

6:

7: L ← (N + k)v

8: n ← 0

9: while and do

10: for i ← 0 to i < L do

11: ← GenerateNeighbor

12: if then

13:

14: if then

15:

16: else if random(0…1) then

17:

18: end if

19: end if

20: end for

21:

22: n ← n + 1

23: end while

24: ← constructMatrix

25: return

26: end function

A solution is represented by a vector with N + k elements, which identify the subsets of the rows and columns of the initial matrix that are used to construct a new submatrix. All of the elements are binary; each of the first N elements is associated with a particular row of the initial matrix, and each of the last k elements is associated with one of its columns. The rows excluded from the new submatrix are those whose associated elements have a value of 1; similarly, the columns excluded from the new submatrix are those whose associated elements have a value of 1.

The Initialize(N, k, δ, Δ) method is used to construct the first solution in the proposed strategy. This method uses the best greedy algorithm among those proposed in this paper. The greedy strategy is chosen based on preliminary experiments for SA_OSCAR.

Once the initial solution has been created, it is modified using the neighborhood function. This function randomly chooses from among three predefined strategies to create a new solution. In the first strategy, a row is exchanged; one row currently in the solution is randomly removed and replaced with a different row not previously included. The second strategy follows the same approach but for columns. Finally, the third strategy is the combination of the previous two. The use of the neighborhood function is controlled by the parameter L, which is called the Markov chain length.

The quality function, or evaluation function, that is used to measure the fitness of a solution is derived from the definition of the OSCAR problem. This function, denoted by τ, counts the number of missing t-wise combinations. We note that one missing t-wise combination is a combination of t symbols that the matrix does not contain in a particular t-tuple of columns but would be required to cover in order for the matrix to be a CA(N − δ; t, k − Δ, v).
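The solution encoding and the evaluation function can be combined in a short sketch. It assumes the encoding described in this subsection (a binary vector of N + k elements in which a value of 1 marks an excluded row or column); the function names `build_submatrix` and `tau` are hypothetical.

```python
from itertools import combinations

def build_submatrix(matrix, solution):
    """Keep the rows and columns whose flag in the solution vector is 0."""
    n, k = len(matrix), len(matrix[0])
    row_flags, col_flags = solution[:n], solution[n:]
    keep_cols = [c for c in range(k) if col_flags[c] == 0]
    return [[row[c] for c in keep_cols]
            for r, row in enumerate(matrix) if row_flags[r] == 0]

def tau(matrix, solution, t, v):
    """Number of missing t-wise combinations in the encoded submatrix."""
    sub = build_submatrix(matrix, solution)
    k = len(sub[0]) if sub else 0
    missing = 0
    for cols in combinations(range(k), t):
        seen = {tuple(row[c] for c in cols) for row in sub}
        missing += v ** t - len(seen)
    return missing
```

A solution with τ = 0 encodes a submatrix that is itself a covering array.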

Finally, the cooling schedule is controlled by a cooling factor α, which is used to gradually decrease an initial temperature until it reaches a given final temperature, marking the end of the algorithm. Note that the algorithm also includes an alternative termination criterion, which is defined as a maximum number of iterations of the main loop.
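The acceptance test and the cooling schedule can be sketched as follows. This is a generic SA illustration under assumptions: the geometric update (temperature multiplied by α) and the exponential acceptance probability are the standard forms, and the function and parameter names are hypothetical rather than taken from the paper.

```python
import math
import random

def accept(current_cost, neighbor_cost, temperature, rng):
    """Boltzmann acceptance: always take improvements, sometimes take worse moves."""
    if neighbor_cost < current_cost:
        return True
    # Worse (or equal) neighbors are accepted with probability exp(-increase / T).
    return rng.random() < math.exp(-(neighbor_cost - current_cost) / temperature)

def temperatures(t_initial, t_final, alpha):
    """Yield the temperature of each outer iteration (assumed geometric cooling)."""
    t = t_initial
    while t > t_final:
        yield t
        t *= alpha
```

The number of values produced by `temperatures` corresponds to the number of temperature decrements, and each of them drives one Markov chain of length L.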

The time required to execute this algorithm is O(iL), where i is the number of temperature decrements necessary to reach the final temperature.

Exact algorithms

The two exact algorithms were previously reported in [24]. They follow the B&B strategy (cf. [43]) and avoid the need to explore the entire feasible region to find the optimal solution. The general idea behind these algorithms is described in Algorithm 5 and Algorithm 6.

These previously presented algorithms first construct an initial solution using a greedy strategy; then, they remove from this solution all possible combinations of rows and columns such that the resulting matrix has N − δ rows and k − Δ columns. Because the order in which the rows and columns are removed matters, the strategies differ in their selection of which elements are removed first: one algorithm first removes a subset of columns, whereas the other first removes a subset of rows. Both algorithms, after the first selected elements have been removed, perform a B&B search over the columns and/or rows, testing each element one by one, in order to find the submatrix with the minimum number of missing t-wise combinations.

Algorithm 5

1: function

2: ← bestGreedy

3:

4:

5: for each and do

6: ← Eliminate

7: for each do

8: ← Eliminate

9: if then

10:

11:

12: end if

13: end for

14: end for

15: return

16: end function

Algorithm 6

1: function

2: ← bestGreedy

3:

4:

5: for each and do

6: ← Eliminate

7: for each and do

8: ← Eliminate

9: if then

10:

11:

12: end if

13: end for

14: end for

15: return

16: end function

We note that the elimination of the second set of elements obeys a lexicographical order given by the columns or rows that are being deleted. Moreover, during the search process, both algorithms avoid the elimination of certain columns or rows that would exert undesirable effects on the final submatrix (i.e., selections for which the number of missing t-wise combinations would increase over a certain upper bound); for a more detailed description of the algorithms, refer to [24]. In the worst case, no solutions can be discarded, so the time required to execute either algorithm grows with the size of the entire search space.

Hybrid algorithms for solving the OSCAR problem

Hybrid algorithm .

This algorithm combines the greedy and exact approaches to solve the OSCAR problem. It proceeds in two phases. First, it chooses a set of Δ columns and removes them from the initial matrix; the resulting matrix has dimensions of N × (k − Δ). Afterward, the algorithm discards δ rows from this matrix in a greedy manner to construct a candidate solution, i.e., a matrix with dimensions of (N − δ) × (k − Δ), for the OSCAR instance at hand. To obtain the best solution, the algorithm explores all possible combinations of Δ columns and identifies the best matrix from among all matrices constructed during the process described above.

Algorithm 7

1: function

2: for i ← 0 to do

3: ← GREATERTHANPOLYNOMIAL

4:

5: getN

6:

7: for j ← 0 to j < δ do

8:

9:

10: end for

11:

12: if then

13:

14:

15: end if

16: end for

17: return

18: end function

This approach is described in Algorithm 7. Each combination of k − Δ columns is represented by a vector, and each new combination is computed by the function GREATERTHANPOLYNOMIAL() [44]. An array with dimensions of N × (k − Δ) is constructed using the columns indicated by this vector. Then, the algorithm populates the necessary structures to reduce this array to a matrix with N − δ rows, and this reduction process yields a candidate solution. The best solution found so far during the exploration process is stored; whenever a newly constructed matrix has fewer missing t-wise combinations than the stored one, the stored matrix is replaced with the new one.

The time required to execute can be derived from the times required for populating the necessary structures, reducing the number of rows, and updating the necessary values. This time is proportional to .

Hybrid algorithm .

This algorithm also combines the exact and greedy approaches to find a solution to the OSCAR problem; compared with the previous hybrid algorithm, the difference is that it explores all possible combinations of δ rows that can be eliminated from the original matrix. The algorithm proceeds in two phases. First, it chooses a set of δ rows and removes them from the initial matrix; the resulting matrix has dimensions of (N − δ) × k. Subsequently, the algorithm greedily discards Δ columns from this matrix to construct a candidate solution, i.e., a matrix with dimensions of (N − δ) × (k − Δ), for the OSCAR instance at hand. To obtain the best solution, the algorithm explores all possible combinations of δ rows and identifies the best matrix from among all matrices constructed during the process described above.

This approach is described in Algorithm 8. Each combination of N − δ rows is represented by a vector, and each new combination is computed by the function GREATERTHANPOLYNOMIAL(). An array with dimensions of (N − δ) × k is constructed using the rows indicated by this vector. Then, the algorithm populates the necessary structures to greedily reduce this array to a matrix with k − Δ columns, and this reduction process yields a candidate solution. The best solution found so far during the exploration process is stored; whenever a newly constructed matrix has fewer missing t-wise combinations than the stored one, the stored matrix is replaced with the new one.

Algorithm 8

1: function

2: for i ← 0 to do

3: ← GREATERTHANPOLYNOMIAL

4:

5: getK

6:

7: for j ← 0 to j < Δ do

8:

9:

10: end for

11:

12: if then

13:

14:

15: end if

16: end for

17: return

18: end function

The time required to execute can be derived from the times required for populating the necessary structures, reducing the number of columns, and updating the necessary values. This time is proportional to .

Hybrid algorithm .

This algorithm uses a hybrid strategy that combines the SA meta-heuristic [45] with the greedy approach to construct a solution to the OSCAR problem. In each iteration, a local search is performed over the possible set of columns that can be eliminated to obtain a matrix with dimensions of N × (k − Δ). Afterward, this matrix is subjected to a greedy process to reduce its size by δ rows and thus to construct a solution with dimensions of (N − δ) × (k − Δ) for the OSCAR instance at hand. Once the matrix has been built, the Boltzmann criterion is used as usual in SA. The details of the strategy are presented in the remainder of this subsection.

Algorithm 9 describes the proposed approach for solving the OSCAR problem. The algorithm uses a vector W_u of size k to represent the state of each column in the solution . The elements of the vector take values of w_i ∈ {0, 1} for 1 ≤ i ≤ k, where a value of 0 indicates that the corresponding column is not present in the solution and a value of 1 indicates otherwise. In addition, two constraints are imposed to obtain a proper OSCAR solution: there must be k − Δ elements with a value of w_i = 1, and there must be Δ elements with a value of w_i = 0.

Algorithm 9

1: function

2: ← Initialize(k, Δ)

3:

4:

5: L ← (N + k)v

6: n ← 0

7: while do

8: for i ← 0 to i < L do

9: ← generateNeighbor

10: ← ConstructSolution

11: getN

12: for i ← 0 to i < δ do

13:

14: end for

15: if then

16:

17: if then

18:

19: end if

20: else if random(0 … 1) then

21:

22: end if

23: end for

24:

25: end while

26: ← constructMatrix

27: return

28: end function

The algorithm uses a set of perturbations to the vector W_u as its neighborhood function. For this purpose, it chooses two elements w_i and w_j, where w_i ≠ w_j, and interchanges their values. The new solution formed via this perturbation is a neighbor of the current solution.
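The swap perturbation can be sketched in Python as follows. The function name is hypothetical; the sketch assumes a binary vector in which 1 marks an element that is present in the solution, as described for W_u, and that the vector contains at least one 0 and one 1.

```python
import random

def swap_neighbor(w, rng):
    """Return a copy of w in which one 1 and one 0 have exchanged positions."""
    ones = [i for i, x in enumerate(w) if x == 1]
    zeros = [i for i, x in enumerate(w) if x == 0]
    i, j = rng.choice(ones), rng.choice(zeros)  # guarantees w[i] != w[j]
    neighbor = list(w)
    neighbor[i], neighbor[j] = 0, 1
    return neighbor
```

Because one element leaves the solution and another enters it, the numbers of ones and zeros are preserved, so every neighbor still satisfies the constraints on the vector.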

Finally, the evaluation function used in this algorithm is τ, the number of missing combinations in a created matrix. This function is also used to evaluate the matrices created during the local search.

The time required to execute this algorithm is proportional to O(iLF_R), where i is the number of temperature decrements necessary to reach the final temperature and F_R is the time cost of the greedy approach for eliminating rows.

Hybrid algorithm .

This algorithm uses another hybrid strategy that combines the SA meta-heuristic [45] with the greedy approach to construct a solution to the OSCAR problem. In each iteration, a local search is performed over the possible set of rows that can be eliminated to obtain a matrix with dimensions of (N − δ) × k. Afterward, this matrix is subjected to a greedy process to reduce its size by Δ columns and thus to construct a solution with dimensions of (N − δ) × (k − Δ) for the OSCAR instance at hand. Once the matrix has been built, the Boltzmann criterion is used as usual in SA. The details of the strategy are presented in the remainder of this subsection.

Algorithm 10 describes the proposed approach for solving the OSCAR problem. The algorithm uses a vector W_u of size N to represent the state of each row in the solution. The elements of the vector take values w_i ∈ {0, 1} for 1 ≤ i ≤ N, where a value of 0 indicates that the corresponding row is not present in the solution and a value of 1 indicates that it is present. The constraints imposed to ensure a proper OSCAR solution are as follows: exactly N − δ elements must have the value w_i = 1, and the remaining δ elements must have the value w_i = 0.

Algorithm 10

1: function

2: ← Initialize(N, δ)

3:

4:

5: L ← (N + k)v

6: n ← 0

7: while and do

8: for i ← 0 to i < L do

9: ← generateNeighbor

10: ← ConstructSolution

11: getK

12: for i ← 0 to i < Δ do

13:

14: end for

15: if then

16:

17: if then

18:

19: end if

20: else if random(0 … 1) then

21:

22: end if

23: end for

24:

25: end while

26: ← constructMatrix

27: return

28: end function

The algorithm uses a set of perturbations to the vector as its neighborhood function. For this purpose, it chooses two elements w_i and w_j, where w_i ≠ w_j, and interchanges their values. The new solution formed via this perturbation, which is a neighbor of , is denoted by .

Finally, the evaluation function used in is τ, the number of missing combinations in a created matrix. This function is also used to evaluate the matrices created during the local search.

The time required to execute is O(iLF_C), where is the number of temperature decrements necessary to reach and F_C is the time cost of the greedy approach for eliminating columns.
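
The two factors i and L in this bound can be illustrated numerically. The sketch assumes a geometric cooling schedule (the temperature is multiplied by the decrement factor α after each Markov chain), which is the usual choice but is not spelled out in the surviving pseudocode; the chain length L = (N + k)v is taken from line 5 of the algorithm. The function names and the temperature values in the example are hypothetical.

```python
def temperature_decrements(t_initial, t_final, alpha):
    """Number of multiplications by alpha needed to bring t_initial down to t_final or below."""
    n = 0
    temperature = t_initial
    while temperature > t_final:
        temperature *= alpha
        n += 1
    return n

def markov_chain_length(n_rows, n_cols, v):
    """Chain length L = (N + k) * v, as in line 5 of the pseudocode."""
    return (n_rows + n_cols) * v

steps = temperature_decrements(1.0, 0.5, 0.9)     # illustrative temperatures only
chain = markov_chain_length(255, 18, 15)          # input array CA(255; 2, 18, 15)
neighbors_examined = steps * chain                # each one triggers a greedy phase
```

Every neighbor examined invokes the greedy column elimination once, so the total work is the product of the number of decrements, the chain length, and the greedy cost.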

Hybrid algorithm .

The algorithm combines the meta-heuristic and exact approaches to solve the OSCAR problem, using a strategy based on the exploration of all possible combinations of Δ columns that can be eliminated from the original matrix. The algorithm proceeds in two phases. First, it chooses a set of Δ columns and removes them from the initial matrix ; the resulting matrix is denoted by and has dimensions of N × (k − Δ). Then, the algorithm uses the SA approach to discard δ rows from to construct a possible solution , i.e., a matrix with dimensions of (N − δ) × (k − Δ), for the OSCAR instance at hand. To obtain the best solution , the algorithm explores all possible combinations of Δ columns and identifies the best matrix from among all matrices constructed during the process described above.

Algorithm 11 describes our approach. For each possible combination of columns , the algorithm performs a meta-heuristic search to define the set of rows . To determine each element in , the function GREATERTHANPOLYNOMIAL() [44] is used to systematically generate each different combination of columns.
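
The exhaustive part of this hybrid can be sketched with a standard combination generator. Here `itertools.combinations` is used as a stand-in for the function of [44], whose internals are not described in this section; the helper names `column_subsets` and `drop_columns` are hypothetical.

```python
from itertools import combinations
from math import comb

def column_subsets(k, n_removed):
    """Generate every set of n_removed column indices, in lexicographic order."""
    return combinations(range(k), n_removed)

def drop_columns(matrix, removed):
    """Return a copy of the matrix without the columns listed in removed."""
    removed = set(removed)
    return [[x for c, x in enumerate(row) if c not in removed] for row in matrix]

subsets = list(column_subsets(5, 2))
reduced = drop_columns([[1, 2, 3], [4, 5, 6]], (1,))
```

The outer loop of the algorithm visits one such subset per iteration, so its length is the binomial coefficient of k and Δ.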

Algorithm 11

1: function

2: for i ← 0 to do

3: ← GREATERTHANPOLYNOMIAL

4: ← Initialize(N, δ)

5:

6:

7: L ← (N + k)v

8: n ← 0

9: while and do

10: for j ← 0 to j < L do

11: ← generateNeighbor

12: if then

13:

14: if then

15:

16: end if

17: else if random(0 … 1) then

18:

19: end if

20: end for

21:

22: end while

23: if then

24:

25: ← constructMatrix

26: end if

27: end for

28: return

29: end function

The time required to execute is , where is the number of temperature decrements necessary to reach .

Hybrid algorithm .

The algorithm also combines the meta-heuristic and exact approaches to solve the OSCAR problem; compared with , the difference is that it explores all possible combinations of δ rows that can be eliminated from the original matrix. The algorithm proceeds in two phases. First, it chooses a set of δ rows and removes them from the initial matrix ; the resulting matrix is denoted by and has dimensions of (N − δ) × k. Then, the algorithm uses the SA approach to discard Δ columns from to construct a possible solution , i.e., a matrix with dimensions of (N − δ) × (k − Δ), for the OSCAR instance at hand. To obtain the best solution , the algorithm explores all possible combinations of δ rows and identifies the best matrix from among all matrices constructed during the process described above.
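
The two-phase scheme described above can be sketched end to end on a toy instance. This is an illustration under several assumptions, not the authors' implementation: the annealing parameters (`t0`, `tf`, `alpha`, `chain`) are arbitrary small values rather than the tuned ones, the cooling schedule is assumed to be geometric, and all function names are hypothetical. The sketch also assumes that at least one column is removed and that at least t columns remain.

```python
import math
import random
from itertools import combinations

def missing_tuples(matrix, t, v):
    """Count the t-tuples (over all sets of t columns) not covered by any row."""
    k = len(matrix[0])
    missing = 0
    for cols in combinations(range(k), t):
        covered = {tuple(row[c] for c in cols) for row in matrix}
        missing += v ** t - len(covered)
    return missing

def project(matrix, mask):
    """Keep only the columns whose mask entry is 1."""
    return [[x for x, keep in zip(row, mask) if keep] for row in matrix]

def anneal_columns(matrix, n_cols_removed, t, v, rng,
                   t0=1.0, tf=0.01, alpha=0.9, chain=20):
    """Simulated annealing over the 0/1 column vector; returns (best mask, best cost)."""
    k = len(matrix[0])
    mask = [1] * (k - n_cols_removed) + [0] * n_cols_removed
    rng.shuffle(mask)
    cost = missing_tuples(project(matrix, mask), t, v)
    best_mask, best_cost = list(mask), cost
    temp = t0
    while temp > tf and best_cost > 0:
        for _ in range(chain):
            i = rng.choice([p for p, b in enumerate(mask) if b == 1])
            j = rng.choice([p for p, b in enumerate(mask) if b == 0])
            cand = list(mask)
            cand[i], cand[j] = 0, 1          # interchange a kept and a removed column
            cand_cost = missing_tuples(project(matrix, cand), t, v)
            if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / temp):
                mask, cost = cand, cand_cost
                if cost < best_cost:
                    best_mask, best_cost = list(mask), cost
        temp *= alpha
    return best_mask, best_cost

def shorten(matrix, n_rows_removed, n_cols_removed, t, v, seed=0):
    """Try every set of rows to delete; anneal over the columns for each one."""
    rng = random.Random(seed)
    best = None
    for removed in combinations(range(len(matrix)), n_rows_removed):
        rows = [r for idx, r in enumerate(matrix) if idx not in removed]
        mask, cost = anneal_columns(rows, n_cols_removed, t, v, rng)
        if best is None or cost < best[0]:
            best = (cost, project(rows, mask))
    return best

# A CA(5; 2, 4, 2) shortened by one row and one column.
CA_5_2_4_2 = [[0, 0, 0, 0], [0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
cost, result = shorten(CA_5_2_4_2, 1, 1, 2, 2)
```

On this toy input a 4 × 3 covering array exists among the shortened candidates (for example, delete the last row and the last column), so the search can reach zero missing pairs. The outer loop grows combinatorially with δ, which is the reason the exhaustive phase is restricted to small reductions.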

Algorithm 12 describes our approach. For each possible combination of rows , the algorithm performs a meta-heuristic search to define the set of columns . To determine each element in , the function GREATERTHANPOLYNOMIAL() [44] is used to systematically generate each different combination of rows.

Algorithm 12

1: function

2: for i ← 0 to do

3: ← GREATERTHANPOLYNOMIAL

4: ← Initialize(k, Δ)

5:

6:

7: L ← (N + k)v

8: n ← 0

9: while and do

10: for j ← 0 to j < L do

11: ← generateNeighbor

12: if then

13:

14: if then

15:

16: end if

17: else if random(0 … 1) then

18:

19: end if

20: end for

21:

22: end while

23: if then

24:

25: ← constructMatrix

26: end if

27: end for

28: return

29: end function

The time required to execute is , where is the number of temperature decrements necessary to reach .

In the next section, we demonstrate the performance of our 12 algorithms.

Experimentation

This section presents the experimental design used to test the performance of the proposed algorithms for solving the OSCAR problem. The methodology consisted of the following steps: 1) A set of benchmark instances was defined. 2) The parameters of the SA algorithm were subjected to a fine-tuning process. 3) The performances of the algorithms were evaluated by using them to solve the benchmark problem instances. 4) A performance comparison against state-of-the-art initialization functions was conducted. 5) The results derived from the algorithms were used to define new upper bounds for existing CAs.

The proposed algorithms were implemented in the C language and compiled using gcc with the optimization option -O3. The experiments were run on a computer with 72 Intel Xeon CPU cores at 1.6 GHz and 64 GB of RAM. The remainder of this section describes the experimental methodology in detail.

Definition of the benchmarks

This subsection introduces the three benchmarks used to test the proposed set of OSCAR algorithms. The first benchmark (S1 dataset) consists of 12 small CAs, which are described in Table 13, and it is used to analyze the performance of all algorithms presented in this document; the algorithms that achieve the best experimental results on this benchmark in terms of both time and solution quality are then further tested on the second benchmark.

The second benchmark (S2 dataset), presented in Table 14, consists of 62 CAs; it is an extension of the benchmark presented in [24] in which the values of δ and Δ have been adjusted to support the discovery of a greater number of new upper bounds for the related CAs. This benchmark aids in the identification of the OSCAR solver with the best overall experimental performance, and it is also used to compare the results of the proposed OSCAR solvers against other state-of-the-art initialization functions.

Finally, the third benchmark (S3 dataset) consists of 820 instances (see Table 15); it is used to compare the quasi-CA construction performance of IPOG-F, a classical greedy algorithm that is widely used in the literature and is versatile in the sense that it can rapidly construct any type of CA, against the best OSCAR strategies identified in the experiments on the previous benchmarks, in terms of both the time required for matrix construction and the quality of the constructed matrices. Table 15 presents the instances included in this benchmark, organized into 20 sets. In each set, one OSCAR instance is defined per value of k considered (from 10 to 50), as shown in column 1; the remaining columns show the values of v, t, δ, and Δ, which correspond to the alphabet size, the strength, and the numbers of rows and columns to be eliminated, respectively. We note that this benchmark is also characterized by its wide variety of values of the strength t and the alphabet size v.

Table 13. The first benchmark (S1 dataset), composed of 12 small instances of the OSCAR problem.

Column 2 shows the CA used as the initial array, and columns 3 and 4 show the numbers of rows δ and columns Δ to be removed, respectively.

https://doi.org/10.1371/journal.pone.0189283.t013

Table 14. The second benchmark (S2 dataset), composed of 62 instances of the OSCAR problem.

Each instance lists the initial array and the numbers of rows δ and columns Δ to be removed.

https://doi.org/10.1371/journal.pone.0189283.t014

Table 15. Groups of instance sets that form the third benchmark (S3 dataset).

Column 1 is the group identifier. Column 2 shows the range of k, the number of columns. The remaining columns give the alphabet size v, the strength t, and the numbers of rows δ and columns Δ to be removed.

https://doi.org/10.1371/journal.pone.0189283.t015

Fine-tuning of the parameters of

The approach is the basis for several of our other approaches. Because this approach uses the SA algorithm, a fine-tuning process is necessary to adjust the values of its parameters to improve its performance. During the tuning process performed in this study, the Markov chain length L, the final temperature , and the initialization function were fixed; all remaining parameters (i.e., the initial temperature , the decrement factor α, and the maximum number of evaluations ) were subjected to adjustment. Because different neighborhood functions are used in our approach, each with a certain probability of being applied, a fourth parameter was also considered during the tuning process: the application probability of each neighbor function, denoted by . The goal of this fine-tuning process was to test the performance of using different configurations of the parameter values to identify the configuration that yielded the best performance.

The sets of values considered for the initial temperature, the decrement factor α, and the maximum number of evaluations were {1, 4}, {0.90, 0.99}, and {100L, 500L}, respectively. In the fine-tuning approach presented in [46, 47], a CA is used as a means of systematically sampling the entire set of parameter value combinations; the method starts at an initial level of interaction t, which is used to construct a CA(N; t, k, v), and t is then increased until the generated sample is suitable for the purposes of the experiment. The present study required the smallest possible sample in order to reduce the experimental time; this sample was constructed using an interaction level of t = 2. A summary of the final combinations of values tested, derived from the constructed CA(4; 2, 3, 2), is shown in Table 16. Meanwhile, the vector of probabilities used for the neighborhood functions was defined based on the non-negative integer solutions to the Diophantine equation a₁ + a₂ + a₃ = 10, following the approach presented in [44]. During this process, each of the 66 solutions to the Diophantine equation was used to generate a possible probability vector, in which the probability value for each function i was estimated as a_i/10.
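
The count of 66 candidate probability vectors can be reproduced by enumeration. The sketch reads the Diophantine equation as a₁ + a₂ + a₃ = 10 over the non-negative integers and scales each solution by 1/10; both readings are assumptions, chosen because they yield exactly 66 solutions and probabilities that sum to 1. The function name `probability_vectors` is hypothetical.

```python
def probability_vectors(n_functions=3, total=10):
    """Enumerate non-negative integer solutions of a_1 + ... + a_n = total
    and scale each solution to a probability vector (a_i / total)."""
    def solutions(n, remaining):
        if n == 1:
            yield (remaining,)
            return
        for a in range(remaining + 1):
            for rest in solutions(n - 1, remaining - a):
                yield (a,) + rest
    return [tuple(a / total for a in sol) for sol in solutions(n_functions, total)]

vectors = probability_vectors()
```

The solution a₁ = 4, a₂ = 3, a₃ = 3 corresponds to the vector (0.4, 0.3, 0.3) in this enumeration.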

Table 16. Parameter configurations tested for the SA-based meta-heuristic algorithm.

https://doi.org/10.1371/journal.pone.0189283.t016

Because we considered 4 different configurations of the values of the parameters , α, and and 66 configurations of the probability vector , the experiment to fine-tune involved 264 different parameter value configurations. Each configuration was used to solve two instances of the OSCAR problem, specified by and , with a total of 31 runs per instance, where the value of the solution reported was the best among all the runs.
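
The size of the tuning experiment can be checked with a short sketch. The rows of the CA(4; 2, 3, 2) used below are one standard choice and the parameter labels are hypothetical, since the contents of Table 16 are not reproduced here; the level values are those listed for the three tuned parameters, with the evaluation limit expressed as a multiple of L.

```python
from itertools import combinations, product

# Assumed CA(4; 2, 3, 2): each row selects level 0 or 1 for the three parameters.
CA_4_2_3_2 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
LEVELS = {
    "initial_temperature": (1, 4),
    "alpha": (0.90, 0.99),
    "max_evaluations_in_L": (100, 500),
}

def sa_configurations():
    """Map each CA row to a concrete parameter configuration."""
    names = list(LEVELS)
    return [{name: LEVELS[name][level] for name, level in zip(names, row)}
            for row in CA_4_2_3_2]

def covers_all_pairs(array):
    """Check strength 2: every pair of columns shows all four level combinations."""
    wanted = set(product((0, 1), repeat=2))
    return all({(r[i], r[j]) for r in array} == wanted
               for i, j in combinations(range(3), 2))

def count_probability_vectors(total=10):
    """Count non-negative integer solutions of a1 + a2 + a3 = total."""
    return sum(1 for a in range(total + 1) for b in range(total + 1 - a))

n_configurations = len(sa_configurations()) * count_probability_vectors()
n_runs = n_configurations * 2 * 31     # two instances, 31 runs per instance
```

With four sampled configurations instead of the full 2 × 2 × 2 factorial, every pair of parameter levels is still tested together at least once, which is the property that the strength-2 covering array guarantees.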

The results obtained from the fine-tuning process indicated that the optimal parameter values for are , α = 0.99, and and that the selected solution to the Diophantine equation is a₁ = 4, a₂ = 3, and a₃ = 3. This configuration was also used in the algorithms , , , and , which also use the meta-heuristic approach.

Evaluation of the 12 proposed algorithms

This section presents the evaluation of the 12 proposed algorithms for solving the OSCAR problem. All algorithms were tested on the smaller set of 12 instances, , to identify the three best algorithms. Then, the larger set was solved using only those three algorithms to further evaluate the general performance of these approaches.

The algorithms were first tested using the benchmark consisting of 12 OSCAR instances derived from 10 CAs taken from the literature. The values δ and Δ for these instances were fixed such that the size of the resulting array would represent a possible new upper bound. In addition, these instances were created such that the size of the search space would permit us to solve them using all 12 algorithms; this was a concern because exact algorithms must explore all possible combinations of rows and columns, and therefore, if the search space is too large, they may require an excessive amount of time.

Table 17 presents the results obtained using each algorithm based solely on the greedy approach (i.e., the algorithms , , and ) when solving the benchmark . Note that and show better performance than ; when all instances are considered, the former algorithms result in equal or fewer missing t-wise combinations compared with the latter. Therefore, the findings show that it is beneficial to eliminate rows before columns (as in and ) when working with initial matrices that are already CAs. This is because when rows are removed from the matrix, those that contribute the least to the CA are chosen for deletion, and the t-wise combinations that are lost as a result can subsequently be compensated for by eliminating the columns that produce them. Meanwhile, although the time performance of is superior to that of the others for this particular set of instances, it will worsen rapidly with increasing values of δ and Δ. In general, the time performance of will be the worst among the three greedy approaches, as indicated by the theoretical complexities presented alongside the definitions of these algorithms, mainly because of the greater number of calls to the greedy strategies for eliminating rows and columns.

Table 17. Results obtained when solving the first benchmark (S1 dataset) using the three purely greedy algorithms.

https://doi.org/10.1371/journal.pone.0189283.t017

Table 18 shows the results obtained when solving the first benchmark with the hybrid approaches that combine the greedy strategy with either the exact approach or the meta-heuristic approach. An increase in running time is observed for these approaches, mainly due to the more elaborate strategies of the exact and meta-heuristic algorithms. However, the results also improve upon some of those obtained by the purely greedy algorithms. All of these hybrid algorithms achieve the same results; however, the average increase in time is much greater for the hybrid that includes the meta-heuristic strategy than for those that include exact strategies. Note that the short running times observed for the exact approach are consistent with its expected theoretical behavior.

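Table 18. Results obtained when solving the first benchmark (S1 dataset) using the three hybrid algorithms that combine the greedy strategy with the exact or meta-heuristic approach.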

https://doi.org/10.1371/journal.pone.0189283.t018

Table 19 shows the results for another set of hybrid approaches, all involving the meta-heuristic strategy in combination with either the exact approach or the greedy approach (i.e., the algorithms , , and ), when solving the benchmark . From these results and the previous ones shown in Table 18 for , it can be seen that the algorithms , , , and all find solutions with a comparable number of missing t-wise combinations to those in the solutions created by the other (exact or greedy) approaches. However, it should be noted that the initial matrices used in these algorithms were the best solutions obtained by a greedy algorithm, and in most cases, the differences in the number of missing t-wise combinations between these initial matrices and the results reported by the hybrid algorithms are nearly zero. These findings indicate that the contribution of these hybrid approaches is minimal.

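Table 19. Results obtained when solving the first benchmark (S1 dataset) using the three hybrid algorithms that combine the meta-heuristic strategy with the exact or greedy approach.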

https://doi.org/10.1371/journal.pone.0189283.t019

Table 20 shows the results of solving the benchmark using the algorithms , , and . Note that exhibits better performance than because there are more possible ways to select rows than columns. However, exhaustive search algorithms are impractical for finding a solution to an OSCAR instance except when the values of δ and Δ are both quite small. The exact algorithm is more suitable when N − δ > k − Δ since a greater portion of the search space is defined by N − δ; otherwise, behaves better. However, the execution time of the proposed exact algorithms grows with the desired degree of reduction for a given instance, and they can become infeasible.
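
The growth of the exhaustive search space can be made concrete by counting the possible row and column selections. The sketch uses the input array CA(255; 2, 18, 15) and the (δ, Δ) pairs that appear in the discussion of this benchmark; the function name `search_space` is hypothetical.

```python
from math import comb

def search_space(n_rows, n_cols, rows_removed, cols_removed):
    """Return (number of row selections, number of column selections)."""
    return comb(n_rows, rows_removed), comb(n_cols, cols_removed)

row_choices_small, col_choices_small = search_space(255, 18, 3, 1)
row_choices_large, col_choices_large = search_space(255, 18, 9, 11)
```

For δ = 3 and Δ = 1 there are 2,731,135 ways to choose the rows but only 18 ways to choose the column; for δ = 9 and Δ = 11 the column side grows to 31,824 selections while the row side becomes astronomically large.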

Table 20. Results obtained when solving the first benchmark (S1 dataset) using the two exact algorithms and the meta-heuristic algorithm.

https://doi.org/10.1371/journal.pone.0189283.t020

Finally, some additional important observations are noted in the following. First, for every instance, the solutions obtained by the algorithms and have the same number of missing t-wise combinations. When N − δ > k − Δ, requires more time than ; similarly, when N − δ < k − Δ, requires more time than . These findings suggest that is appropriate when N − δ > k − Δ and that is appropriate when N − δ < k − Δ.

Second, as δ and Δ increase for a given array , the number of missing t-wise combinations produced by the pure greedy algorithms increases in comparison with the results of the hybrid algorithms that include exact strategies, i.e., and . For example, for the instances with the input array CA(255; 2, 18, 15), we note that the solution obtained by when δ = 3 and Δ = 1 has 267 missing t-wise combinations, whereas the solution obtained by has 257 missing t-wise combinations; similarly, when δ = 9 and Δ = 11, the solution obtained by has 65 missing t-wise combinations, whereas the solution obtained by has 61 missing t-wise combinations. It can be inferred that the inclusion of an exact strategy contributes to reducing the number of missing t-wise combinations, at the cost of an increase in the time required to build the matrix.

Third, and most importantly, the algorithms that showed the best performance in the experiment were , and ; all of them obtained comparable solutions, with only small differences in both quality and time cost. The algorithms that include meta-heuristic strategies consumed considerably more time but showed little difference in the quality of their solutions, whereas the exact approaches are too expensive for large values of δ and Δ.

To further evaluate the proposed approaches, the algorithms , and were used to solve the benchmark , which includes larger CAs. Table 21 summarizes the results obtained when solving . In addition to this experiment, an instance specified by the array and values of δ = 2 and Δ = 33 was also solved using the meta-heuristic algorithm ; this algorithm produced a solution with zero missing t-wise combinations, meaning that the approach constructed a new CA of the form CA(134; 5, 35, 2). This last result serves as evidence that an approach based on seeking a solution to the OSCAR problem can also be used to construct CAs.

Table 21. Results obtained when solving the second benchmark (S2 dataset) using the three best-performing algorithms.

https://doi.org/10.1371/journal.pone.0189283.t021

Performance comparison with state-of-the-art initialization algorithms

This subsection evaluates the performance of the proposed OSCAR approaches against the performance of several state-of-the-art initialization functions. For this purpose, the initialization functions described in the related work section are considered, and their results are compared with the best solutions obtained using the approaches proposed in this work.

The performance comparison was performed as follows. The benchmark was chosen as the set of instances to be used in this evaluation. First, the OSCAR algorithms proposed in this work were used to solve the benchmark, and the best matrix among all of the results was obtained for each instance. Then, the initialization functions, denoted by , were used to construct arrays of the same dimensions as the matrices derived by solving the OSCAR instances; i.e., for each instance, we constructed an array with N − δ rows and k − Δ columns. Once all of the solutions generated by the OSCAR algorithms and the state-of-the-art initialization functions had been obtained, they were evaluated with regard to the function τ, i.e., the number of missing t-wise combinations in each newly constructed matrix. Table 22 summarizes the results of this experiment. Column one shows the identifier of each instance in , column two shows the number of missing t-wise combinations in the best solution obtained using the OSCAR approaches, and columns three to six present the numbers of missing t-wise combinations in the solutions derived using the state-of-the-art initialization functions . Note that for the last problem instance, one of the proposed OSCAR approaches (the meta-heuristic algorithm ) was able to construct a CA, as seen from the fact that the new matrix has zero missing t-wise combinations.

Table 22. Solution quality, measured as the number of missing t-wise combinations τ, of the best solution obtained by the proposed approaches and of the solutions produced by the initialization functions.

https://doi.org/10.1371/journal.pone.0189283.t022

Performance comparison with IPOG-F, a state-of-the-art CA construction approach

The experiment presented here involves the comparison of the and strategies against the state-of-the-art IPOG-F algorithm for CA construction. The goal in this experiment was to evaluate the performance when constructing CAs and/or quasi-CAs using IPOG-F, a fast greedy algorithm for CA construction that is widely used in the literature and is versatile in the sense that it can rapidly construct any type of CA. The and strategies are among the best of the proposed OSCAR solvers, as indicated by the experiments on the previous benchmarks. Both of these strategies were compared against IPOG-F in terms of the matrix construction time and the matrix quality (i.e., the number of missing t-combinations). Table 23 summarizes the results of this comparison on . Column 1 lists each set of instances in the benchmark. Columns 2 to 4 present the accumulated solution quality (i.e., the accumulated number of missing t-wise combinations) per set for each strategy. Columns 5 to 7 report the accumulated time per set and strategy.

Table 23. Summary of the performance of E₁ = IPOG-F and two of the proposed OSCAR strategies (E₂ and E₃) on the third benchmark (S3 dataset).

Performance is measured as the number of missing t-wise combinations and as the time (in seconds) required to obtain the solution.

https://doi.org/10.1371/journal.pone.0189283.t023

The experiment reported in this section was conducted to test IPOG-F as an approach for constructing CAs and/or quasi-CAs. The results shown in Table 23 reveal that the matrices constructed using have up to 90% fewer missing t-wise combinations than those constructed using IPOG-F. In addition, could generate CAs in 40 of the 820 OSCAR instances by reducing the number of missing t-wise combinations to 0, whereas IPOG-F failed to obtain any CA with the desired numbers of rows and columns. Finally, achieved better running times than IPOG-F for small values of v and t; however, the time performance of rapidly worsened with increasing values of the alphabet size and strength. By contrast, the strategy achieved time consumption results similar to those of IPOG-F while also improving the solution quality, making it a better choice than IPOG-F for the construction of quasi-CAs.

Applications of the proposed approaches for solving the OSCAR problem

In this subsection, we demonstrate that the matrices constructed by solving the OSCAR problem can be used as initial matrices for meta-heuristics for CA construction to assist in the construction of better matrices. For this purpose, the outputs of the initialization functions described in the related work section and the best solutions obtained using the proposed OSCAR approaches were used as the initial matrices for a meta-heuristic reported in [22].

Table 24 shows the new upper bounds for CAN(t, k, v) obtained using our proposed methodology. The second column shows the new CA bounds obtained when using the best matrices generated by the proposed OSCAR algorithms as the initial matrices for the meta-heuristic algorithm, and the third column shows the previous upper bounds for those CAs. The best produced solution for each specific instance of the OSCAR problem was used as the initial matrix for the meta-heuristic CA construction algorithm reported in [22], which is also based on the SA algorithm. Because of the small number of missing t-wise combinations in all of the produced initial matrices, the performance of the meta-heuristic algorithm was improved. The results define new upper bounds on CAN(t, k, v) for several CAs.

Table 24. New upper bounds.

https://doi.org/10.1371/journal.pone.0189283.t024

Conclusions

The present work has indirect implications for the interaction testing of software by aiding in the construction of tests of economical size (a feasible number of test cases). In particular, this paper presents and analyzes strategies for the construction of arrays with sufficiently few missing combinations to be considered quasi-CAs. Such arrays are constructed by solving the problem known as the Optimal Shortening of Covering ARrays (OSCAR) problem. The development of these strategies is motivated by the fact that the arrays thus produced can be used as excellent initialization matrices for algebraic or meta-heuristic approaches for the construction of CAs, which are mathematical objects that have broad applications in the testing of software components.

This work presents an analysis of twelve different strategies for solving the OSCAR problem. Five of them correspond to greedy and exact approaches previously described in the literature, whereas the remaining seven algorithms are newly proposed here. The new approaches involve the use of simulated annealing and hybridization in their design. We note that this work also provides pseudocode for all presented algorithms, including, for the first time, the greedy approaches, which have been only briefly described in previous works. In addition, to test these strategies, three new OSCAR benchmarks with more than 1,000 instances have been designed, representing a considerable improvement over the previously reported 20-instance benchmark in terms of both size and variety in the values of the strength and alphabet size parameters, t and v, respectively.

The experimental design developed for the comparative analysis involved all three proposed benchmarks. The first benchmark, which consists of small instances, was solved using all twelve strategies: three greedy algorithms , two exact algorithms , one meta-heuristic algorithm , and six hybrid approaches . Using this benchmark, the algorithms were compared in terms of running time and solution quality (measured as the number of missing t-wise combinations in each constructed array). As expected, the results showed that the greedy algorithms were the fastest, the exact algorithms yielded the best solutions, and the meta-heuristic provided a balance between quality and time. It was also observed that the solution quality of the pure greedy algorithms worsened with increasing instance size, but this situation could be addressed through the use of hybrid algorithms. The hybrid algorithms involving a mixture of greedy and exact approaches had higher running times but also higher solution quality. The first experiment indicated that hybrid algorithms involving a mixture of the meta-heuristic and greedy strategies are a viable alternative. Such strategies had somewhat higher running times for array construction but resulted in fewer missing t-wise combinations than the hybrid greedy approaches, mainly when the numbers of rows and columns to be deleted were high. In terms of solution quality, the experimental results indicated that the best algorithms were and because they yielded solutions with as few missing t-wise combinations as the exact approaches and but in less time. In terms of running time, the experiment indicated that the best algorithms were and because they were faster than any other algorithms while maintaining an acceptable solution quality; however, we note that for larger instances, the time performance of will be worse than that of because it is more strongly affected by the instance size and the numbers of rows and columns to be removed.

The second benchmark was used to perform an in-depth analysis of some of the best strategies, namely, , and . This experiment tested the performance of these algorithms on larger OSCAR instances to yield a better understanding of their behavior. The best solutions were still produced by the hybrid greedy-exact approaches and , but in the latter approach the time increased exponentially. By contrast, the pure greedy algorithm continued to be fast, and its solutions only slightly deviated from those of the hybrid algorithms. After this analysis, the same benchmark was used to compare the best results from these approaches against the initialization functions generated using state-of-the-art methods. The experimental results showed that in all instances, the number of missing t-wise combinations was reduced by approximately 90% in the matrices constructed using the proposed approach in comparison with those taken from the literature.

Finally, an experiment was conducted using the third benchmark to test IPOG-F as an approach for constructing CAs and/or quasi-CAs. The results revealed that with , the number of missing t-wise combinations was reduced by up to 90% compared with IPOG-F. Moreover, it was found that could obtain CAs in 40 of the 820 OSCAR instances by reducing the number of missing t-wise combinations to 0, whereas IPOG-F failed to obtain any CA with the desired numbers of rows and columns. Finally, it was observed that the running time of was better than that of IPOG-F for small values of v and t but worsened rapidly with increasing values of the alphabet size and strength. By contrast, the strategy achieved running times similar to those of IPOG-F while also improving the solution quality, making it a better choice than IPOG-F for the construction of quasi-CAs.

A major drawback of some of the proposed approaches (with the exception of the greedy ones) is the time consumed to solve the problem, which increases with the numbers of rows and columns to be eliminated. Moreover, the experimental design could be improved by testing a wider range of values when tuning the meta-heuristic and by investigating a larger number of strategies. The ranges of values of the alphabet size and strength parameters should be extended to further probe the resulting changes in performance of the different strategies. Future work should also address the lack of an in-depth analysis of the use of the meta-heuristic approach to properly characterize its region of importance. In general, a more extensive characterization study could provide better insight into the behavior of these strategies, and this remains as future work.

Supporting information

S1 Dataset. Benchmark .

https://doi.org/10.1371/journal.pone.0189283.s001

(ZIP)

S2 Dataset. Benchmark .

https://doi.org/10.1371/journal.pone.0189283.s002

(ZIP)

S3 Dataset. Benchmark .

https://doi.org/10.1371/journal.pone.0189283.s003

(ZIP)

Acknowledgments

The authors acknowledge the General Coordination of Information and Communications Technologies (CGSTIC) at CINVESTAV for providing HPC resources on the Hybrid Cluster Supercomputer “Xiuhcoatl”, which contributed to the research results reported here. The research reported in this paper was funded through the following projects: CONACYT—Métodos Exactos para Construir Covering Arrays Óptimos, project number 238469; and Cátedras CONACYT—Fortalecimiento de las capacidades de TICs en Nayarit, project number 2143.

Compliance with ethical standards

All authors declare that a) we do not have any conflicts of interest, b) this manuscript is the authors’ original work and has been neither published nor simultaneously submitted elsewhere, and c) we have acknowledged all entities that have funded this work in any way.

References

1. Kuhn DR, Wallace DL, Gallo AM. Software fault interaction and implications for software testing. IEEE Transactions on Software Engineering. 2004; 30(6):418–421.
2. Lawrence JF, Kacker RN, Lei Y, Kuhn DR, Forbes M. A survey of binary covering arrays. Journal of Combinatorial Designs. 2011; 18(1):1–30.
3. Jones JA, Harrold MJ. Test-suite reduction and prioritization for modified condition/decision coverage. IEEE Transactions on Software Engineering. 2003; 29(3):195–209.
4. Sloane NJA. Covering arrays and intersecting codes. Journal of Combinatorial Designs. 1993; 1(1):51–63.
5. Bush KA. Orthogonal arrays of index unity. Annals of Mathematical Statistics. 1952; 23(3):426–434.
6. Colbourn CJ, Dinitz JH. The CRC handbook of combinatorial designs. CRC Press; 1999.
7. Colbourn CJ. Combinatorial aspects of covering arrays. Le Matematiche. 2004; 58:121–167.
8. Meagher K. Non-isomorphic generation of covering arrays. University of Regina; 2002.
9. Lopez-Escogido D, Torres-Jimenez J, Rodriguez-Tello E, Rangel-Valdez N. Strength Two Covering Arrays Construction Using a SAT Representation. In: Gelbukh A, Morales EF, editors. MICAI 2008: Advances in Artificial Intelligence. vol. 5317 of Lecture Notes in Computer Science. Springer Berlin Heidelberg; 2008. p. 44–53.
10. Bracho-Rios J, Torres-Jimenez J, Rodriguez-Tello E. A New Backtracking Algorithm for Constructing Binary Covering Arrays of Variable Strength. In: MICAI 2009: Advances in Artificial Intelligence. vol. 5845 of Lecture Notes in Computer Science. Springer; 2009. p. 397–407.
11. Banbara M, Matsunaka H, Tamura N, Inoue K. Generating combinatorial test cases by efficient SAT encodings suitable for CDCL SAT solvers. In: Proceedings of the 17th international conference on Logic for programming, artificial intelligence, and reasoning. Springer-Verlag; 2010. p. 112–126.
12. Hartman A, Raskin L. Problems and algorithms for covering arrays. Discrete Mathematics. 2004; 284(1–3):149–156.
13. Martirosyan S, Trung TV. On t-Covering Arrays. Designs, Codes and Cryptography. 2004; 32(1–3):323–339.
14. Chateauneuf M, Kreher DL. On the state of strength-three covering arrays. Journal of Combinatorial Designs. 2002; 10(4):217–238.
15. Colbourn CJ, Martirosyan SS, Mullen GL, Shasha D, Sherwood GB, Yucas JL. Products of mixed covering arrays of strength two. Journal of Combinatorial Designs. 2006; 14(2):124–138.
16. Cohen DM, Dalal SR, Fredman ML, Patton GC. The AETG system: An approach to testing based on combinatorial design. IEEE Transactions on Software Engineering. 1997; 23(7):437–444.
17. Tung YW, Aldiwan WS. Automating test case generation for the new generation mission software system. In: IEEE Aerospace Conference Proceedings. vol. 1. IEEE Computer Society; 2000. p. 431–437.
18. Bryce RC, Colbourn CJ, Cohen MB. A framework of greedy methods for constructing interaction test suites. In: Proceedings of the 27th International Conference on Software Engineering. ICSE’05; 2005. p. 146–155.
19. Forbes M, Lawrence J, Lei Y, Kacker RN, Kuhn DR. Refining the In-Parameter-Order Strategy for Constructing Covering Arrays. Journal of Research of the National Institute of Standards and Technology. 2008; 113(5):287–297. pmid:27096128
20. Colbourn CJ, Cohen MB, Turban R. A Deterministic Density Algorithm for Pairwise Interaction Coverage. In: Proceedings of the IASTED International Conference on Software Engineering; 2004. p. 242–252.
21. Shiba T, Tsuchiya T, Kikuno T. Using Artificial Life Techniques to Generate Test Cases for Combinatorial Testing. In: Proceedings of the 28th Annual International Computer Software and Applications Conference, 2004. COMPSAC 2004. vol. 1. IEEE Computer Society; 2004. p. 72–77.
22. Avila-George H, Torres-Jimenez J, Gonzalez-Hernandez L, Hernández V. Metaheuristic approach for constructing functional test-suites. IET Software. 2013; 7(2):104–117.
23. Nurmela KJ. Upper bounds for covering arrays by tabu search. Discrete Applied Mathematics. 2004; 138(1–2):143–152.
24. Carrizales-Turrubiates O, Rangel-Valdez N, Torres-Jimenez J. Optimal Shortening of Covering Arrays. In: Batyrshin I, Sidorov G, editors. Advances in Artificial Intelligence. vol. 7094 of Lecture Notes in Computer Science. Springer Berlin Heidelberg; 2011. p. 198–209.
25. Feige U. A threshold of ln n for approximating set cover. Journal of the ACM. 1998; 45(4):634–652.
26. Cohen DM, Colbourn CJ, Ling ACH. Constructing strength three covering arrays with augmented annealing. Discrete Mathematics. 2008; 308(13):2709–2722.
27. Rodriguez-Tello E, Torres-Jimenez J. Memetic Algorithms for Constructing Binary Covering Arrays of Strength Three. In: Collet P, Monmarché N, Legrand P, Schoenauer M, Lutton E, editors. Artificial Evolution. vol. 5975 of Lecture Notes in Computer Science. Springer; 2010. p. 86–97.
28. Colbourn CJ. Tables of Covering Arrays; 2017. OnLine. http://www.public.asu.edu/~ccolbou/src/tabby/catable.html
29. National Institute of Standards and Technology. Tables of Covering Arrays; 2017. OnLine. http://math.nist.gov/coveringarrays/ipof/ipof-results.html
30. Rényi A. Foundations of Probability. Wiley; 1971.
31. Kleitman DJ, Spencer J. Families of k-independent sets. Discrete Mathematics. 1973; 6(3):255–262.
32. Cohen DM, Dalal SR, Parelius J, Patton GC. The combinatorial design approach to automatic test generation. IEEE Software. 1996; 13(5):83–88.
33. Lei Y, Kacker RN, Kuhn DR, Okun V, Lawrence J. IPOG: A General Strategy for T-Way Software Testing. In: ECBS’07: Proceedings of the 14th Annual IEEE International Conference and Workshops on the Engineering of Computer-Based Systems. IEEE Computer Society; 2007. p. 549–556.
34. Bryce RC, Colbourn CJ. A density-based greedy algorithm for higher strength covering arrays. Software Testing, Verification and Reliability. 2009; 19(1):37–53.
35. Quiz-Ramos P, Torres-Jimenez J, Rangel-Valdez N. Constant Row Maximizing Problem for Covering Arrays. In: Artificial Intelligence, 2009. MICAI 2009. Eighth Mexican International Conference on. IEEE Computer Society; 2009. p. 159–164.
36. Nayeri P, Colbourn CJ, Konjevod G. Randomized post-optimization of covering arrays. European Journal of Combinatorics. 2013; 34(1):91–103.
37. Lara-Alvarez C, Avila-George H. New Algorithm for Post-Processing Covering Arrays. International Journal of Advanced Computer Science and Applications. 2015; 6(12):250–254.
38. Walker RA II, Colbourn CJ. Tabu search for covering arrays using permutation vectors. Journal of Statistical Planning and Inference. 2009; 139(1):69–80.
39. Avila-George H, Torres-Jimenez J, Hernández V. New bounds for ternary covering arrays using a parallel simulated annealing. Mathematical Problems in Engineering. 2012; 2012:1–18.
40. Gonzalez-Hernandez L, Rangel-Valdez N, Torres-Jimenez J. Construction of mixed covering arrays of strengths 2 through 6 using a tabu search approach. Discrete Mathematics, Algorithms and Applications. 2012; 04(03):1–20.
41. Bryce RC, Colbourn CJ. One-test-at-a-time Heuristic Search for Interaction Test Suites. In: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation. GECCO’07; 2007. p. 1082–1089.
42. Bao X, Liu S, Zhang N, Dong M. Combinatorial Test Generation Using Improved Harmony Search Algorithm. International Journal of Hybrid Information Technology. 2015; 8(9):121–130.
43. Kreher DL, Stinson DR. Combinatorial algorithms: generation, enumeration, and search. CRC Press; 1999.
44. Torres-Jimenez J, Rangel-Valdez N, Kacker RN, Lawrence JF. Combinatorial Analysis of Diagonal, Box, and Greater-Than Polynomials as Packing Functions. Applied Mathematics & Information Sciences. 2015; 9(6):2757–2766.
45. Van Laarhoven PJM, Aarts EHL. Simulated Annealing: Theory and Applications. Philips Research Laboratories; 1992.
46. Rangel-Valdez N, Torres-Jimenez J, Bracho-Rios J, Quiz-Ramos P. Problem and Algorithm Fine-Tuning—A Case of Study using Bridge Club and Simulated Annealing. In: Correia AD, Rosa AC, Madani K, editors. IJCCI; 2009. p. 302–305.
47. Pérez Espinosa H, Avila-George H, Rodríguez-Jacobo J, Cruz-Mendoza HA, Martínez-Miranda J, Edrein Espinosa-Curiel I. Tuning the Parameters of a Convolutional Artificial Neural Network by Using Covering Arrays. Research in Computing Science. 2016; 121:69–81.

