Solving Hard Computational Problems Efficiently: Asymptotic Parametric Complexity 3-Coloring Algorithm

Many practical problems in almost all scientific and technological disciplines have been classified as computationally hard (NP-hard or even NP-complete). In the life sciences, combinatorial optimization problems frequently arise in molecular biology, e.g., genome sequencing, global alignment of multiple genomes, identifying siblings, or discovery of dysregulated pathways. In almost all of these problems, one needs to prove a hypothesis about a certain property of an object that is present only when the object adopts some particular admissible structure (an NP-certificate) and is absent otherwise (no admissible structure). However, none of the standard approaches can discard the hypothesis when no solution is found, since none can provide a proof that there is no admissible structure. This article presents an algorithm that introduces a novel type of solution method to "efficiently" solve the graph 3-coloring problem, an NP-complete problem. The proposed method provides certificates (proofs) in both cases, present or absent, so it is possible to accept or reject the hypothesis on the basis of a rigorous proof. It provides exact solutions and is polynomial-time (i.e., efficient), although parametric. The only requirement is sufficient computational power, which is controlled by the parameter $\alpha\in\mathbb{N}$. Nevertheless, it is proved here that the probability of requiring a value of $\alpha>k$ to obtain a solution for a random graph decreases exponentially: $P(\alpha>k) \leq 2^{-(k+1)}$, making almost all problem instances tractable. Thorough experimental analyses were performed. The algorithm was tested on random graphs, planar graphs and 4-regular planar graphs. The experimental results agree with the theoretically expected results.


Introduction
Graph Coloring is one of the oldest and among the most popular Constraint Satisfaction Problems (CSPs) [1]. The study of efficient CSP-solving algorithms is a central topic in Computer Science and Artificial Intelligence because of its wide applicability in many engineering projects, e.g., very-large-scale integration (VLSI) testing, planning and scheduling, timetabling, satellite range scheduling, register allocation, printed circuit testing, and frequency assignment [2,3,4], as well as theoretical physical models, e.g., spin-glasses and the anti-ferromagnetic Potts model [5].
Graph coloring has found application in the life sciences as well; for instance, nucleic acid sequence design has been modeled as a graph coloring problem [6]. In general, combinatorial optimization problems frequently arise in molecular biology: genome sequencing; global alignment of multiple genomes; identification of siblings, cousins, or second cousins through comparison of genomes; finding protein modules containing specified types of proteins; and the computational discovery of dysregulated pathways in human diseases are all NP-hard or even NP-complete problems [7,8].
1. The proposed algorithm is exact and runs in polynomial time; however, it is parametric. Its running time can be controlled (bounded) by means of a simple parameter (α, the maximum recursion level) that determines the order of the bounding polynomial. Hence, its complexity is on-demand.
2. If, for a given α, the algorithm is unable to find a certificate, it returns "undetermined" so that it can be re-run with a higher α until a solution is obtained, taking into account the computational resources available. Self-tuning the α parameter (by using the undetermined return value) lets the algorithm use only the computational resources required for a particular problem instance, e.g., for almost all planar graphs, a value of α = 0 is sufficient to obtain the right solution, which yields an $O(n^2)$ algorithm for the 3-coloring problem in almost all planar graphs.
3. It generates certificates for both Yes and No instances, i.e., either a legal 3-coloring or a 3-uncolorability certificate. Hence, it gives stand-alone, indubitable results: it is not necessary to trust either the correctness of the algorithm itself or the particular implementation used for recognizing whether the provided solution is correct, since the result can be efficiently verified using only the provided solution (the certificate or witness).
Moreover, from the theoretical point of view, the most important result is the classification of all graphs by the number α(G): the minimum value of the parameter α required by the algorithm to obtain a certificate given a particular instance G of the 3-coloring problem. This has important consequences and allows the development of a thorough analysis.
Since for each finite graph G there is a corresponding α(G) ∈ N, the algorithm is polynomial-time, and its order depends on α, since P also depends on α. However, for both practical and theoretical results, the most important thing to be determined is the "speed" of this convergence. Here, as the main theoretical result, we prove the non-trivial and highly significant fact that the probability of requiring a value of α > k to obtain a solution for a random graph decreases exponentially as a function of k. This result is formalized as Theorem 1 below.
Theorem 1. Let G be a random graph; if P(α = k) is the probability that α(G) = k and P(α > k) is the probability that α(G) > k, then for all k ∈ N:
$$P(\alpha = k) \geq P(\alpha > k), \quad \text{and hence} \quad P(\alpha > k) \leq 2^{-(k+1)}.$$
In the experimental part of this article, the algorithm was thoroughly evaluated using significant samples pertaining to three different graph distributions:
1. Random planar graphs [41,42].
2. Random 4-regular planar graphs [43,44,45].
3. Erdős-Rényi connected random graphs [20].
An interesting experimental finding is that in all the test cases for planar graphs, it was found that fixing the maximum recursion level to α = 1 was sufficient to obtain a solution, i.e., exact and efficiently verifiable results were obtained using a polynomial algorithm. This was also the case for 4-regular planar graphs. Furthermore, in the general (random graphs) 3-coloring case experiments, it has been observed that the distribution of α(G) conforms to the theoretical decreasing pattern: the majority of graphs are in α(G) = 0 or α(G) = 1, some in α(G) = 2, a few in α(G) = 3, very few in α(G) = 4, and so on. Indeed, it was not possible to obtain a graph with α(G) > 4 in the sampled random graphs.

Practical applicability
The most common methods for dealing with an NP-complete problem when solving a large real-world instance (intractable by brute force) are heuristic algorithms, approximate algorithms, randomized algorithms, and fixed-parameter tractability. The presented algorithm introduces a novel type of solution method and offers several features that are usually absent from these approaches.
Suppose that a very critical hypothesis about a certain phenomenon needs to be proved in a molecular biology study and that such a hypothesis depends on testing a property of an object that can be present only when it adopts some particular admissible structure (an NP-certificate) or be absent (no admissible structure for this property), and that the problem is not fixed-parameter tractable. Then,
1. None of the standard approaches can discard the hypothesis when no solution is found, since none will give a proof that the problem has no solution, i.e., a proof that there is no admissible structure for this property.
2. Even when a solution exists, heuristic as well as randomized algorithms do not guarantee finding a solution even with extraordinary computational power.
3. Approximate algorithms can give, if lucky, an estimated probability of the existence of such a property, possibly including an approximate admissible structure that is only probably correct.

Figure 1 (caption fragment): (C) Tadpole graph $T_{3,1}$, which imposes a binary constraint on 3-coloring: either G/xw or G/yw. (D) Path of length two $P_2$. (E) $C_4$ graph that imposes a binary constraint on 3-coloring: either G/xz or G/yw.
However, the proposed method solves the problem by providing certificates (proofs) in both cases, present or absent; hence, one can accept or reject the hypothesis on the basis of a rigorous proof that is independent of the algorithm itself and of the implementation used. Moreover, the proposed method assures an exact solution. The only requirement is sufficient computational power (as in brute force methods). However, we have proven that the amount of computational resources required, i.e., the value of α that an instance needs under the proposed method, follows an exponentially decreasing distribution; hence, the harder a problem is, the lower its probability of appearance. The exponential reduction in unsolvable instances makes any investment in computing power profitable.

Basic terminology
This article follows the standard graph theory terminology; for general terms and notation, the book of Jensen and Toft [46] and the recent book on chromatic graph theory by Chartrand and Zhang [47] should be consulted. However, some special terms and particular notations are defined below. Unless we state otherwise, all graphs in this work are connected and simple (are finite and have no loops or parallel edges).
The term random graph $G_{n,m}$ refers to a graph chosen at random (with equal probability) from among all possible graphs with n vertices and m edges, as defined by Erdős-Rényi [20].
We refer to u, v as a planar preserving edge if uv is not an edge of G and G + uv remains planar. A vertex contraction, also called vertex identification or vertex merging, is denoted by G/uv. Vertex or edge additions and deletions are denoted as follows: G + u or G + uv and G − u or G − uv, respectively. A vertex ordering of a graph G = (V, E) is a bijection π : V → {1, 2, ..., |V|}, and thus, a set of n vertices can be ordered in n! different ways.
The specially named graphs used in the article are as follows (cf. Figure 1): complete graph $K_4$, diamond graph, i.e., the complete 3-partite graph $K_{1,1,2}$ (a $K_4$ minus one edge), tadpole graph $T_{3,1}$, triangle graph $T_3$, path graph of length two $P_2$, and square graph $C_4$.
A certificate [9] (or witness [10]) is an efficiently verifiable proof of the correctness of an answer for some given decision problem. For instance, given a graph G, a legal 3-coloring of G or a short proof that G is not 3-colorable are certificates for the 3-colorability problem.

Materials and Methods
Definition of the Algorithm
Definition 1. Given a graph G and a 3-colorable subgraph H of G, there is an unavoidable vertex contraction of u, v ∈ V(G), uv ∉ E(G), if the addition of the new edge uv to H makes H not 3-colorable.
Definition 2. Given a non-3-colorable input graph G, a 3-uncolorability certificate W is a description of a (possibly empty) sequence of unavoidable vertex contractions, G/uv, leading to a graph containing $K_4$, such that for each contraction either of the following two cases applies:
1. u, v are the non-complete vertices of a complete 3-partite $K_{1,1,2}$ (diamond) subgraph of G; or
2. a nested 3-uncolorability certificate for the graph G + uv is provided.
Hence, in order to design an algorithm for obtaining a 3-uncolorability certificate, a method for obtaining such nested certificates should be provided. The proposed algorithm is recursive and uses a parameter α to limit the recursion depth. A very simple sketch of this algorithm is as follows:
Algorithm 1: is-3-colorable(G, α)
1 Contract every u, v of a diamond subgraph until no other diamond subgraph exists or until the graph becomes the $K_3$ or it contains a $K_4$ subgraph.
2 If the graph becomes the $K_3$ graph then return the current contraction sequence (i.e., a legal 3-coloring).
3 If $K_4$ is found then return the current contraction sequence (i.e., a 3-uncolorability certificate).
6 Return "undetermined for the current value of α."

END.
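The listing above does not show Steps 4 and 5. The following Python sketch is one possible reading, not the author's implementation: it assumes, based on the later remarks that Step 5 checks every non-edge and that the recursive calls sit inside a loop over every non-edge, that a non-edge uv is contracted whenever G + uv is certified non-3-colorable at depth α − 1. The graph representation (a dict of neighbor sets) and the names `contract`, `add_edge`, `has_k4`, `diamond_pair`, and `UNDETERMINED` are hypothetical, and the sketch returns only the answer (1, 0, or undetermined) rather than the contraction sequence.

```python
from itertools import combinations

UNDETERMINED = None  # stands for the "undetermined" answer


def contract(adj, u, v):
    """Return a new graph in which the non-adjacent vertices u, v are merged into u."""
    new = {x: set(ns) for x, ns in adj.items() if x != v}
    for x in adj[v]:
        new[x].discard(v)
        new[x].add(u)
        new[u].add(x)
    return new


def add_edge(adj, u, v):
    """Return a copy of the graph with the edge uv added."""
    new = {x: set(ns) for x, ns in adj.items()}
    new[u].add(v)
    new[v].add(u)
    return new


def has_k4(adj):
    """True if some edge ab has two adjacent common neighbors (a K4)."""
    for a in adj:
        for b in adj[a]:
            for c, d in combinations(adj[a] & adj[b], 2):
                if d in adj[c]:
                    return True
    return False


def diamond_pair(adj):
    """Return the two non-adjacent vertices of some diamond, or None."""
    for a in sorted(adj):
        for b in sorted(adj[a]):
            for c, d in combinations(sorted(adj[a] & adj[b]), 2):
                if d not in adj[c]:
                    return c, d
    return None


def is_3_colorable(adj, alpha):
    """Return 1 (3-colorable), 0 (not 3-colorable) or UNDETERMINED."""
    while True:
        # Steps 1-3: contract diamonds until K4, a graph on <= 3 vertices, or no diamond.
        while True:
            if has_k4(adj):
                return 0
            if len(adj) <= 3:  # K3 or smaller: trivially 3-colorable
                return 1
            pair = diamond_pair(adj)
            if pair is None:
                break
            adj = contract(adj, *pair)
        if alpha == 0:
            return UNDETERMINED
        # ASSUMED Steps 4-5: look for a non-edge uv with G + uv certified non-3-colorable.
        forced = None
        for u, v in combinations(sorted(adj), 2):
            if v not in adj[u] and is_3_colorable(add_edge(adj, u, v), alpha - 1) == 0:
                forced = (u, v)
                break
        if forced is None:
            return UNDETERMINED  # Step 6
        adj = contract(adj, *forced)  # unavoidable contraction, then repeat Step 1
```

Both kinds of contraction are sound in this sketch: the two non-adjacent vertices of a diamond must share a color in every legal 3-coloring, and so must u and v whenever G + uv is not 3-colorable.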
Now, let us define a greedy 3-coloring algorithm that will serve as the baseline for the derivation of the proposed coloring algorithm.
Definition 3. The $g_3(G)$ algorithm is a "greedy-contraction" 3-coloring algorithm that, at each step, selects two non-adjacent vertices x and y of a graph G and contracts them to obtain the graph G/xy, while maintaining a list S of the vertices that have been contracted thus far. If the resulting graph is a triangle (or even a $K_2$) and S contains at most three independent sets, these are three (or fewer) color classes of G and hence a legal 3-coloring of G.
The justification for using such a simple approach in combination with a more sophisticated way of detecting (and avoiding) vertex contractions that unavoidably lead to a non-3-colorable graph is derived from the following lemma (Lemma 1):
Lemma 1. Given an exact algorithm $W(G)_0$ of complexity $O(n^k)$ to obtain a 3-uncolorability certificate for any non-3-colorable graph, there is an exact algorithm $W(G)_1$ of complexity $O(n^{k+1})$ to obtain a 3-coloring of any 3-colorable graph.
Proof. Assume $W(G)_0$ exists. Then, given a 3-colorable graph G, apply the greedy $g_3(G)$ algorithm while avoiding the contraction of every {x, y} such that G/xy is not 3-colorable, which can be determined in $O(n^k)$ by $W(G/xy)_0$. Since G is 3-colorable, it will converge (at most) to a triangle graph. Since $g_3(G)$ is of complexity O(n) (at every step, at least one vertex gets colored), we obtain a 3-coloring of G in $O(n) \cdot O(n^k) = O(n^{k+1})$.
Corollary 1. Hence, if $W(G, \alpha)_0$ is an exact parametric algorithm, of complexity $O(n^{f(\alpha)})$, to obtain a 3-uncolorability certificate for any non-3-colorable graph, there is an exact parametric algorithm $W(G, \alpha)_1$ of complexity $O(n^{f(\alpha)+1})$ to obtain a 3-coloring of any 3-colorable graph.
2 While G has more than three vertices,
2.1 Select two non-neighboring vertices u, v.

If not is
3 Return 1 and a legal 3-coloring as the list of contracted vertices.

END.
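The construction in the proof of Lemma 1 can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the general-3COL routine itself: the uncolorability oracle is replaced by an exhaustive (exponential-time) stand-in, the graph is a dict of neighbor sets, and the names `brute_force_non_3_colorable`, `contract`, and `greedy_3_coloring` are hypothetical.

```python
from itertools import combinations, product


def brute_force_non_3_colorable(adj):
    """Stand-in oracle (exponential time): True if adj has no legal 3-coloring."""
    vs = sorted(adj)
    for colors in product(range(3), repeat=len(vs)):
        c = dict(zip(vs, colors))
        if all(c[u] != c[v] for u in vs for v in adj[u]):
            return False
    return True


def contract(adj, u, v):
    """Return a new graph in which the non-adjacent vertices u, v are merged into u."""
    new = {x: set(ns) for x, ns in adj.items() if x != v}
    for x in adj[v]:
        new[x].discard(v)
        new[x].add(u)
        new[u].add(x)
    return new


def greedy_3_coloring(adj, non_3_colorable=brute_force_non_3_colorable):
    """Greedy contraction that skips every pair whose contraction is not 3-colorable.

    Returns at most three color classes (sets of original vertices).
    """
    classes = {v: {v} for v in adj}
    while len(adj) > 3:
        for u, v in combinations(sorted(adj), 2):
            if v in adj[u]:
                continue  # only non-adjacent vertices may be contracted
            merged = contract(adj, u, v)
            if not non_3_colorable(merged):
                adj = merged
                classes[u] |= classes.pop(v)
                break
        else:
            raise ValueError("input graph is not 3-colorable")
    return list(classes.values())
```

A 3-colorable graph on more than three vertices always has two vertices that share a color in some legal 3-coloring, so the inner loop finds a safe pair at every step when the input is 3-colorable.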
Finally, an automated algorithm can be developed to eliminate the need for specifying the α parameter.
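One simple way to automate the choice of α is to re-run the solver with increasing values until it stops returning "undetermined". The sketch below assumes a hypothetical `solver(graph, alpha)` callable that returns `None` for "undetermined" and any other value for a certificate; the cap `max_alpha` is likewise an assumption that reflects the available computational resources.

```python
def solve_with_increasing_alpha(solver, graph, max_alpha):
    """Try alpha = 0, 1, ..., max_alpha and stop at the first determined answer.

    Returns (alpha, answer), or None if every level was undetermined.
    """
    for alpha in range(max_alpha + 1):
        answer = solver(graph, alpha)
        if answer is not None:
            return alpha, answer
    return None
```

Because the solver is called with the smallest values first, the work spent on an instance is governed by the smallest α that suffices for it.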

Some advanced improvements and special case handling
The algorithm is divided into two parts: the decision problem (is-3-colorable) and the coloring algorithms (general-3COL). There are two versions of each of these algorithms: one for planar graphs and the other for nonplanar graphs. First, the algorithm for the planar graphs case is described, which is better for understanding the key idea behind the algorithms. Then, this description is generalized for the non-planar graph case.

Specialization for planar graphs
The development of a special algorithm for planar graphs has two main advantages:
1. It takes advantage of some special structural constraints of planar graphs (e.g., Grötzsch-like theorems) that aid the development of more efficient algorithms.
2. It formalizes an algorithm for planar graphs that preserves planarity at each step, allowing the development of theoretical studies on the class of planar graphs, e.g., inductive proofs and structure-based proofs.
Now, let us consider the is-3-colorable routine. According to Grötzsch's 3-color theorem [26] (triangle-free planar graphs are 3-colorable), every non-3-colorable planar graph should have {x,y,z,w}-tadpole $T_{3,1}$ subgraphs (cf. Figure 1B). The key idea is that $T_{3,1}$ subgraphs impose binary constraints, i.e., either {x, w} or {y, w} must be contracted since $T_{3,1}$ + xw + yw is a $K_4$ (the same is true for square graphs). Thus, there is no need for Step 5 of Algorithm 1 to check every non-edge but just every $T_{3,1}$ subgraph. The routine can therefore be performed for each $T_{3,1}$ by contracting G/yw whenever G/xw is not 3-colorable, i.e., when y, w is an unavoidable vertex contraction, as shown in the next algorithm:
1 Contract every u, v of a diamond subgraph until no other diamond subgraph exists or until the graph becomes the $K_3$ or it contains a $K_4$ subgraph.
2 If the graph becomes the $K_3$ graph then return the current contraction sequence (i.e., a legal 3-coloring).
3 If $K_4$ is found then return the current contraction sequence (i.e., a 3-uncolorability certificate).
6 Return "undetermined for the current value of α."

END.
Now, let us show a planarity preserving coloring algorithm for planar graphs. The idea involves reducing G to a planar triangulation by means of the addition/contraction of the planar preserving edges of the planar graph G. At the end, if the triangulation has all degrees even, it is 3-colorable [50,46] and finding a legal 3-coloring is linear-time. Otherwise, the algorithm returns "undetermined," meaning that α was not sufficient for obtaining a certificate for the input graph. The specialized coloring routine is described next.
2 While G is not a planar triangulation,
2.1 Select a planar preserving edge u, v.

If not is
3 If triangulation G has an odd vertex, return ∞, ∅.
4 Return 1 and a legal 3-coloring of G in linear time.

END.
As can be seen, the graph remains planar at each step, making valid any assumption or structural property of planar graphs at each iteration.
A slight improvement of the worst and expected cases in non-planar graphs
For non-planar graphs, a slight modification can be made to improve the worst-case and expected-case running time of the algorithm. The key idea in this case is to build a complete vertex, i.e., a vertex joined to all the remaining vertices of the graph, so that for testing 3-colorability it is sufficient to test 2-colorability of the neighborhood, which can be done in linear time.
2 Let u be the vertex with the highest degree of G.
3 While u is not a complete vertex
5 Return 1 and a legal 3-coloring as the list of contracted vertices.
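The final test that the complete vertex makes possible can be sketched as follows. Once u is joined to every other vertex, u takes one color and the rest of the graph must be 2-colorable, which a breadth-first search decides in linear time. The sketch covers only this last step, not the contraction loop that builds the complete vertex; the dict-of-sets representation and the function names are assumptions.

```python
from collections import deque


def two_coloring(adj):
    """Return a 2-coloring {vertex: 0 or 1} of adj, or None if adj is not bipartite."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in color:
                    color[y] = 1 - color[x]
                    queue.append(y)
                elif color[y] == color[x]:
                    return None  # odd cycle found
    return color


def three_coloring_with_complete_vertex(adj, u):
    """3-color a graph in which u is adjacent to every other vertex, or return None."""
    if adj[u] != set(adj) - {u}:
        raise ValueError("u is not a complete vertex")
    rest = {x: adj[x] - {u} for x in adj if x != u}
    coloring = two_coloring(rest)
    if coloring is None:
        return None
    coloring[u] = 2  # the complete vertex gets the third color
    return coloring
```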

Proof of the Main Theorem
To formalize the analysis of the algorithm, let us define the following two algorithm specifications:
Definition 4. The A(G, α) is a parametric-complexity algorithm that computes a function that assigns to a given input graph G just one of three possible values: 0, 1, or ∞, when G is, respectively, non-3-colorable, 3-colorable, or the algorithm was unable to find a solution for the given value of the α parameter.
Definition 5. The W(G, α) is a parametric-complexity algorithm that computes a function that assigns to a given input graph G just one of three possible values: a 3-uncolorability certificate, a legal 3-coloring, or a null value, when A(G, α) is, respectively, 0, 1, or ∞.
Since the proposed algorithm is a greedy algorithm, it is affected, like other greedy sequential coloring algorithms, by the initial vertex ordering. Hence, it is not possible to define a function α(G) simply as the minimum k ∈ N required to obtain a certificate for a particular graph G without considering the vertex ordering. For instance, for any 3-colorable graph, one can exhibit at least two different vertex orderings $V_1$, $V_2$ such that for $V_1$ a solution can be found with α(G) = 0, while for $V_2$ a value of α(G) > 0 is required. Thus, a solution is to define the function α(G) on the basis of the worst-case vertex ordering, and therefore α(G) will imply a computational complexity measure.
Definition 6. Given a graph G, the integer α(G) is
For G non-3-colorable: the minimum k ∈ N required to obtain a 3-uncolorability certificate, assuming that the ordering of the vertices is the worst case for the is-3-colorable(G, k) algorithm.
For G 3-colorable: the value α(H) of the non-3-colorable graph H = G/uv, where α(H) is the maximum over all H = G/uv required to obtain a solution for G, assuming that the ordering of the vertices is the worst case for the general-3-COL(G, k) algorithm.
Now, let us divide the proof of Theorem 1 into two cases. First, it is proved that, for non-3-colorable graphs, the cardinality of the set A of graphs with α(G) = k is greater than that of the set B of graphs with α(G) > k.
Lemma 2. Let $G^*$ be the set of all graphs and assume (with no loss of generality) that in particular $G^*$ is defined for a maximum number of vertices or edges that exhausts the representation limit of any computational device, i.e., $G^*$ is finite. Let H, A, and B be the sets
$$H = \{G \in G^* : G \text{ is non-3-colorable}\}, \quad A = \{G \in H : \alpha(G) = k\}, \quad B = \{G \in H : \alpha(G) > k\};$$
then |B| < |A|.
Proof. Since H is the set of non-3-colorable graphs, for every graph in B, there is at least a graph in A: simply take any graph in B not in A and join it to the smallest graph in A; the resulting graph is in A since a 3-uncolorability certificate can be found with α(G) = k. Moreover, no graph of A is a subgraph of any graph in B. Hence, the cardinality of B is strictly less than the cardinality of A.
Therefore, case 1 is proved since A and B are finite sets and a uniform probability distribution over C = A ∪ B is well defined. Hence, P(α = k) ≥ P(α > k) holds for non-3-colorable graphs.
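The step from the inequality P(α = k) ≥ P(α > k) to the exponential bound stated in the abstract can be written out as a short derivation. It assumes that the inequality holds for every k ∈ N on the distribution under consideration (the text above establishes it for non-3-colorable graphs) and uses only that α(G) ≥ 0, i.e., P(α > −1) = 1:

$$P(\alpha > k-1) = P(\alpha = k) + P(\alpha > k) \geq 2\,P(\alpha > k) \;\Longrightarrow\; P(\alpha > k) \leq \tfrac{1}{2}\,P(\alpha > k-1).$$

Applying this bound repeatedly, starting from P(α > −1) = 1, gives

$$P(\alpha > k) \leq \left(\tfrac{1}{2}\right)^{k+1} P(\alpha > -1) = 2^{-(k+1)}.$$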

Runtime analysis of the algorithm
The average-case complexity, worst-case complexity, and experimental performance of the algorithm are analyzed. The average-case analysis is informally presented as a means of establishing the theoretically expected behavior over different kinds of instances. The worst-case analysis establishes the order (Big O) of the algorithm. Finally, the experimental analysis confronts the algorithm with samples from a series of graph distributions to study its performance and contrast it with the theoretical results. Table 1 shows the average case (expected) performance of the algorithm with respect to the type of the instance (Yes/No) and the density of the graph, i.e., above/below the phase transition threshold.

Average-case (expected) complexity
In all the cases (except at the phase transition threshold), there is a high probability of a short running time. A priori, it may seem that the worst case should occur on the sparse non-3-colorable graphs. This observation is based on the fact that, for this class of graphs, it is more complex to obtain a $K_4$ by random edge additions and vertex contractions; nevertheless, some restrictions apply. Since the proportion of non-3-colorable graphs decreases fast below the threshold and almost all non-3-colorable graphs contain a $K_4$, the probability of obtaining a $K_4$-free non-3-colorable graph below the threshold is very small. Moreover, it is known that vertex 3-colorability of a graph with maximum vertex degree three can be determined in polynomial time [15]. Further, every vertex of degree at most two can be removed from the graph without affecting the 3-colorability; thus, non-3-colorable sparse graphs are very rare below c 4 (this follows from the sharp-thresholds theory).
Hence, in almost all cases, a short running time is expected. By short, I mean significantly shorter than the worst-case upper bound.

Worst-case complexity
To determine the computational complexity (g) of the entire algorithm, we will start by analyzing the algorithm from the is-3-colorable routine. This routine admits a special parameter α that controls the level of recursive calls. In order to analyze its complexity, the recursion is fixed to α = 0, and once the complexity for α = 0 is obtained, the complexity for α > 0 is established.
The is-3-colorable routine depends on the complexity of the contraction step (Step 1). At first sight, the contraction Step 1 has complexity of order $O(n^4)$ since it explores each $K_{1,1,2}$ subgraph, whose number may grow with the number of combinations of four elements among the vertices of G. However, a more in-depth analysis reveals that the algorithm performs vertex contractions until there is no other $K_{1,1,2}$. This means that this operation is bounded by the number of edges of the complement of G, which is of quadratic order $O(n^2)$ in the number of vertices. Steps 2 and 3 are absorbed into Step 1. Hence, for α = 0, the complexity of the is-3-colorable routine is $O(n^2)$.
For α > 0, the complexity of the is-3-colorable routine also depends on the recursive calls inside a for loop through every non-edge, which has order $O(n^2)$; therefore, for α = 1, we will have $O(n^2) \cdot O(n^2) = O(n^4)$.
Thus, on the basis of lemma 1, it can be shown that the complexity of an algorithm that finds a 3-coloring is just one order higher.
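Extrapolating the α = 1 case to arbitrary α gives a simple recurrence. This is a sketch under the assumption that each additional recursion level contributes exactly one more $O(n^2)$ loop over the non-edges and that repeated passes after a contraction are not counted; T(α) is a symbol introduced here for the running time of is-3-colorable at recursion level α:

$$T(0) = O(n^2), \qquad T(\alpha) = O(n^2) \cdot T(\alpha - 1) \;\Longrightarrow\; T(\alpha) = O\!\left(n^{2(\alpha+1)}\right).$$

Under the same assumption, Lemma 1 (with $k = 2(\alpha+1)$) would give a coloring algorithm of order $O\!\left(n^{2(\alpha+1)+1}\right)$.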

Experimental Results
The problem of evaluating algorithms experimentally could be very tricky if tests are performed on "artificial instances," which may be uncorrelated or isolated from any specific practical application as claimed by Johnson [51] who proposed a methodological approach to the experimental analysis of algorithms. Nevertheless, there are some lines of research suggesting special distributions of graph instances on which purported NP-complete problem solvers should be evaluated in order to appropriately determine their performances (e.g., [52,17,53]).
In the experimental part of this article, the algorithm was thoroughly evaluated over significant samples pertaining to three different graph distributions. Each class, and each distribution, has a good justification:
1. Pseudo-random planar graphs [41,42]. Planarity imposes some interesting structural properties, i.e., the 3-coloring problem on planar graphs is the only unqualified problem that remains open [15], since 1-coloring is trivial, 2-coloring is well characterized, and the maximum chromatic number on the plane is four [54,55]; at the same time, the determination of 3-colorability of planar graphs is NP-complete [48,14].
2. Random 4-regular planar graphs [43,44,45]. The 3-colorability of 4-regular planar graphs remains NP-complete [56], and most importantly, in this class the average degree is fixed; hence, the phase-transition phenomenon as defined for random graphs cannot be applied directly in this case.
3. Erdős-Rényi connected random graphs [20]. Sampling from the Erdős-Rényi (connected) random graph distribution gives the necessary theoretical support for evaluating an algorithm in the general case, validating the theoretical bounds and allowing one to obtain results that can be compared against other algorithms in the literature, e.g., the best-performing 3-coloring algorithms proposed in the literature [57].

Sample generation details
Random planar graphs [41,42] are complex to generate, and their definitions and sampling methods are more difficult to implement. Instead, we opted for a relatively simple approach of generating "pseudo random planar graphs." The procedure involves the generation of a maximal planar graph and the uniform selection of edges from this graph at random (i.e., with equal probability) to create another graph, called a pseudo random graph. For this purpose, we used the "Create Random Planar Graph" algorithm implementation used by [58] (Gato, the Graph Animation Toolbox) for the creation of such random planar graphs. Only one modification was included, to avoid generating a considerably large number of graphs containing a $K_4$ subgraph: the generator makes up to 100 attempts and returns the first $K_4$-free graph encountered; if none is found, it returns the 100th generated graph.
Random 4-regular planar graphs are also very complex to generate. Here, the procedures described in Refs. [43], [44] and [45] to generate all the 4-regular planar graphs have been used. In particular, Theorem 2 of [45] is used for generating 4-regular planar graphs. However, there is no theory defining a random 4-regular planar graph, so an ad hoc distribution has been specified to balance the proportion of Yes/No instances. The distribution has been obtained by assigning a probability to each graph transformation (see [45]): $P(\varphi_A) = 0.80$, $P(\varphi_B) = 0.05$, $P(\varphi_C) = 0.10$, and $P(\varphi_F) = 0.05$.
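Drawing a sequence of transformations with these probabilities is straightforward with the standard library. The sketch below only samples transformation labels; the transformations themselves are defined in [45] and are not implemented here, and the label strings, `TRANSFORM_WEIGHTS`, and `sample_transformations` are hypothetical names.

```python
import random

# Probabilities assigned to each graph transformation in the text.
TRANSFORM_WEIGHTS = {"phi_A": 0.80, "phi_B": 0.05, "phi_C": 0.10, "phi_F": 0.05}


def sample_transformations(steps, seed=None):
    """Draw `steps` transformation labels independently with the weights above."""
    rng = random.Random(seed)
    names = list(TRANSFORM_WEIGHTS)
    weights = [TRANSFORM_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=steps)
```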
The Erdős-Rényi [20] random graph is a very well-known model that allows uniform sampling from graphs at random by specifying either a number of vertices and a number of edges, or a number of vertices and an edge probability. We followed standard methods to sample from this distribution. The only modification is that the generation of a connected graph is assured by first generating a simple path passing through all the vertices and then adding the remaining random edges.
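The connected variant described above can be sketched in a few lines of Python. Two details are assumptions of the sketch rather than statements from the text: the path visits the vertices in a random order, and the remaining edges are drawn by rejection sampling over vertex pairs. The function name and the dict-of-sets output are likewise hypothetical.

```python
import random


def connected_random_graph(n, m, seed=None):
    """Random simple graph with n vertices and m edges that is connected by construction."""
    if not (n - 1 <= m <= n * (n - 1) // 2):
        raise ValueError("m must lie between n - 1 and n(n - 1)/2")
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)  # ASSUMPTION: the spanning path uses a random vertex order
    edges = {frozenset(pair) for pair in zip(order, order[1:])}
    while len(edges) < m:  # add the remaining edges uniformly among unused pairs
        edges.add(frozenset(rng.sample(range(n), 2)))
    adj = {v: set() for v in range(n)}
    for edge in edges:
        u, v = tuple(edge)
        adj[u].add(v)
        adj[v].add(u)
    return adj
```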
The experiments are designed to study the behavior (not just the performance or running time) of the proposed algorithm. For this purpose, there are curves evaluating the algorithm's performance over a particular graph distribution and there is an initial comparative plot against the backtracking algorithm. The use of backtracking is restricted to the study of the algorithm's scalability since it is not possible to use backtracking consistently beyond the 100 vertex barrier because of its exponential growth. For planar graphs, the planar versions were used, while for random graphs, the improved versions were used.
All experiments were developed in the Python programming language using its standard libraries and other libraries developed by the current author. Other software includes the planarity library [59] as well as the above-mentioned Gato (the Graph Animation Toolbox) libraries. The experiments were run on common personal computers, and no parallelism was used. All time measurements are in seconds and were taken using Python's time.clock() function.
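For readers reproducing the measurements today: `time.clock()` was deprecated in Python 3.3 and removed in Python 3.8, and `time.perf_counter()` is the usual replacement for interval timing. A minimal timing helper, with the hypothetical name `timed`, might look like this:

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```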

Experiment 1: scaling factor compared against backtracking
The first part of the experimental analysis compares the scaling factor of the proposed algorithm with that of simple backtracking, in order to determine the differences in the behavior of both algorithms. This experiment involved the generation of uniformly random planar graph instances from 10 to 100 vertices (incremented by 1 and generating 100 graphs for each number) and solving each instance with both algorithms. Table 2 shows the parameters of the sample used in the experiment.
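The backtracking baseline is not listed in the text. For orientation, a generic backtracking 3-coloring routine can be sketched as below; it is not necessarily the implementation used in the experiment, and the highest-degree-first vertex order is an assumption of the sketch.

```python
def backtracking_3_coloring(adj):
    """Return a legal 3-coloring {vertex: color} of adj, or None if none exists."""
    order = sorted(adj, key=lambda v: -len(adj[v]))  # ASSUMPTION: highest degree first
    color = {}

    def extend(i):
        if i == len(order):
            return True
        v = order[i]
        for c in range(3):
            # A color is usable if no already-colored neighbor has it.
            if all(color.get(w) != c for w in adj[v]):
                color[v] = c
                if extend(i + 1):
                    return True
                del color[v]
        return False

    return dict(color) if extend(0) else None
```

In the worst case the search explores a number of partial colorings that is exponential in the number of vertices, which is the growth the comparison is meant to expose.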
For each algorithm, the mean and maximum (max) running times were recorded as well as some other relevant statistics. Times labeled as t 1 correspond to the backtracking algorithm, while the t 2 times correspond to the proposed parametric algorithm. Comparative plots are shown in Figure 2; there are six plots from (a) to (f). Figure 2a shows the running times as a function of the number of vertices for both kinds of instance types and for both algorithms. Figure 2b shows the proportion of 3-colorable and non-3-colorable graphs over the total number of graphs per number of vertices. It can be seen that the distribution tends to be uniform. Figures 2c and 2d also show the running times as a function of the number of vertices but discriminated by the instance types (Yes/No) so that subtle differences can be observed.
The general results indicate that there is a crossing point in which backtracking continues to grow exponentially while the proposed algorithm remains polynomial (cf. Figure 2a at around 50 vertices); hence, a clear difference in the behavior of both algorithms is observed. This difference is clearer in the non-3-colorable instances (cf. Figure 2c) where the maximum running times of the parametric algorithm are relatively low in all the cases. Nevertheless, for the 3-colorable instances (cf. Figure 2d), the difference starts to be clear around graphs on 50 vertices.
Moreover, when running times are compared as a function of the average degree, there is a significant difference in the behavior of both algorithms. For non-3-colorable instances, the parametric algorithm exhibits an almost constant performance (cf. Figure 2e) and a curve that is totally uncorrelated with that of backtracking, which, on the contrary, is very sensitive to the average degree. This difference, although to a slightly lesser degree, can also be observed in the 3-colorable instances, as shown in Figure 2f.

Experiment 2: random planar graphs
This experiment involves the generation of uniformly random planar graph instances from 100 to 1000 vertices (incremented by 100 and generating 1000 graphs for each number) and solving each instance with the parametric algorithm. Table 3 shows the parameters of the sample used in the experiment. It is not possible to compare the results against backtracking because of its exponentially increasing running time.
For each instance type (Yes/No), the mean and maximum (max) running times were recorded, as well as some other relevant statistics. Comparative plots are shown in Figures 3 and 4. Figure 3a shows the running times as a function of the number of vertices for both kinds of instance types. Figure 3b shows the proportion of 3-colorable and non-3-colorable graphs over the total number of graphs per number of vertices. It can be seen that the distribution is far from uniform. Figures 4a and 4b also show the running times as a function of the number of vertices but discriminated by the average degree.
The results indicate that there is a difference in running times depending on the instance type (cf. Figure 3a). This difference is expected since the proposed algorithm returns earlier (without entering the main loop) when a 3-uncolorability certificate is found. Even in the case when the average degree is considered, the difference is high (cf. Figures 4a and 4b).

Figure 4: Results for planar graphs. Runtime analysis over random planar graphs considering instance type and average degree. Plots (a) and (b) also show the running times as a function of the number of vertices but discriminated by the average degree.

Experiment 3: random planar 4-regular graphs
In this experiment, graphs were sampled from an ad-hoc distribution over the 4-regular planar graphs. The samples were used for generating graph instances from 100 to 1000 vertices (incremented by 100 and generating 1000 graphs for each number) and solving each instance with the parametric algorithm. Table 4 shows the parameters of the samples used in the experiment.
For each instance type (Yes/No), the mean and maximum (max) running times were recorded, as well as some other relevant statistics. Comparative plots are shown in Figure 5. Figure 5a shows the running times as a function of the number of vertices for both kinds of instance types. Figure 5b shows the proportion of 3-colorable and non-3-colorable graphs over the total number of graphs per number of vertices; it can be seen that the distribution is far from uniform.
These results indicate that there is a very significant difference in running times depending on the instance type (cf. Figure 5a). This difference was expected since the proposed algorithm returns earlier when a 3-uncolorability certificate is found. Again, the observed difference is high.

Figure 5: Results for 4-regular planar graphs. Runtime analysis over random 4-regular planar graphs between 100 and 1000 vertices. Plot (a) shows the running times as a function of the number of vertices for both kinds of instance types. Plot (b) shows the proportion of 3-colorable and non-3-colorable graphs over the total number of graphs per number of vertices.

Experiment 4: Erdős-Renyi random graphs
In the last experiment, graphs were sampled from the well-known Erdős-Renyi random graphs distribution. The sample consisted of 10000 graph instances on 100 vertices, each of which was solved with the parametric algorithm. Table 5 shows the parameters of the sample used in the experiment. For each instance type (Yes/No), the mean and maximum (max) running times were recorded, as well as some other relevant statistics. Comparative plots are shown in Figure 6.

Figure 6a shows the number of 3-colorable and non-3-colorable graphs as a function of the average degree, i.e., a phase transition plot, with the transition in this case occurring at around d = 4.74. It should be noted that this phase transition is for connected random graphs and not for standard random graphs, which can contain many components, thus affecting the phase transition threshold. Figure 6b shows the number of graphs corresponding to each α(G) value. As predicted by the theory, almost all graphs have α(G) ≤ k for some integer k, and the proportion of graphs decreases exponentially as a function of α, clearly below the line $2^{-(\alpha+1)}$. These results confirm the established theoretical bounds.

Figures 6c and 6d show the running time as a function of the average degree. It can be observed that in the random graphs case, the difference in running time is not as high as the difference observed in the planar graphs case. Further, there is a difference in the location of the harder instances for each kind of instance type: the harder instances for the non-3-colorable case are located around an average degree of d ≈ 5 while, in the 3-colorable case, they are located slightly below an average degree of d ≈ 4.8. Although the numbers seem to be very close, the shapes of the running-time curves are not. The running-time curve in Figure 6d falls sharply after 5, while the running-time curve in Figure 6c does not.
This may indicate that there is a true difference in the location of the harder instances depending on the type (Yes/No) of the instance, and (to the best of my knowledge), there are no other works identifying a separation of a complexity threshold on the basis of the type of instance.
As observed in the experiments, the value of α(G) was directly correlated with the average degree d = 2m/n (m = edges, n = vertices); hence, near the phase transition threshold ($d^*$) [18,19], the probability of a relatively high value of α(G) increases.
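The relation d = 2m/n can be made concrete with a short sketch that samples a standard G(n, p) random graph whose expected average degree is a chosen value, using the fact that the expected degree of a vertex in G(n, p) is p(n - 1). The function names are assumptions made for the example, and the sampler does not reproduce the restriction to connected graphs used in the experiment.

```python
import random
from itertools import combinations
from typing import List, Tuple


def p_for_average_degree(n: int, d: float) -> float:
    """Edge probability giving expected average degree d in G(n, p)."""
    return d / (n - 1)


def gnp_edges(n: int, p: float, rng: random.Random) -> List[Tuple[int, int]]:
    """Sample G(n, p): each of the n(n-1)/2 possible edges is kept with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def average_degree(n: int, edges: List[Tuple[int, int]]) -> float:
    """Average degree d = 2m/n, with m the number of edges and n the number of vertices."""
    return 2 * len(edges) / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that the run is repeatable
    n = 100
    edges = gnp_edges(n, p_for_average_degree(n, 4.74), rng)
    print(average_degree(n, edges))  # fluctuates around 4.74 from sample to sample
```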
However, many interesting questions remain open; e.g.,
• What is the exact distribution of α(G) in random graphs?
• Apart from the average degree, what other parameters are related to α(G)?
• Given an arbitrary input graph G, can the α(G) value be predicted (exactly or approximately) and what is the best possible approximation to α(G)?
• As for the chromatic number χ(G), is there any (efficient) graph construction mechanism that allows the generation of graphs with arbitrarily large α(G)?

Figure 6: Runtime analysis of the algorithm for random graphs. The behavior of the proposed algorithm over the well-known Erdős-Renyi random graphs distribution. Plot (a) shows the quantity of 3-colorable and non-3-colorable graphs as a function of the average degree, i.e., a phase transition plot, in this case, occurring at around d = 4.74. Plot (b) shows the quantity of graphs corresponding to each α(G) value. As predicted by the theory, the proportion of graphs decreases exponentially as a function of α below the line of $2^{-(\alpha+1)}$. Plots (c) and (d) show the running times as a function of the average degree.

Discussion
In this article, an asymptotic parametric exact 3-coloring algorithm has been presented. This is (to the best of my knowledge) the first algorithm of its kind for the 3-coloring problem.
The maximal complexity of the algorithm is controlled by the parameter (α) that bounds the recursion depth and determines its running time. The algorithm relies on the efficient search of 3-uncolorability certificates. Here, a formal definition of the 3-uncolorability certificate has been introduced. This is the central theoretical concept that allowed the development of the proposed algorithm. The definition of the 3-uncolorability certificate presented here is (to the best of my knowledge) the first one that is formally presented and the most naturally related to the 3-coloring problem.
A very significant feature of 3-uncolorability certificates is that they can be obtained from small subgraphs of a particular graph, indeed from subgraphs with as few as four vertices (i.e., by finding a $K_4$ subgraph). Hence, an interesting theoretical analysis that should follow is the study of the behavior of α(G) on 4-critical graphs since, in this class, there is no proper subgraph with chromatic number four, and hence, finding unavoidable vertex contractions may be relatively hard (e.g., see Ref. [53] for a good initial development of this idea). A classification of 4-critical graphs on the basis of α(G) can therefore lead to very significant results.
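The smallest case mentioned above, a $K_4$ subgraph, can be searched for directly. The following sketch is an illustration only and is not the certificate search of the proposed algorithm; the function name `find_k4` and the 0-based vertex numbering are assumptions. It also shows why 4-critical graphs other than $K_4$ are the interesting case: the wheel formed by a hub joined to a 5-cycle is not 3-colorable, yet it contains no $K_4$.

```python
from itertools import combinations
from typing import List, Optional, Tuple


def find_k4(n: int, edges: List[Tuple[int, int]]) -> Optional[Tuple[int, ...]]:
    """Return four mutually adjacent vertices if the graph contains a K4, else None."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:  # self-loops are irrelevant for this check
            adj[u].add(v)
            adj[v].add(u)
    for u, v in edges:
        if u == v:
            continue
        # Any two adjacent common neighbors of u and v complete a K4.
        common = sorted(adj[u] & adj[v])
        for w, x in combinations(common, 2):
            if x in adj[w]:
                return tuple(sorted((u, v, w, x)))
    return None


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(find_k4(4, k4))
    # Hub 5 joined to the 5-cycle 0-1-2-3-4: not 3-colorable, but K4-free.
    wheel = [(i, (i + 1) % 5) for i in range(5)] + [(5, i) for i in range(5)]
    print(find_k4(6, wheel))
```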
There is an interesting symmetry between coloring and uncolorability certificates:
• In order to show that a graph is 3-colorable, it is sufficient to encounter just one legal coloring; nevertheless, any legal coloring must assign a color to all the vertices of the graph without violating any constraint since it remains hard to determine if a partial coloring is extensible to all the vertices of the graph.
• Instead, in order to show that a graph is not 3-colorable, one needs to verify that none of the possible 3-colorings is a legal one; nevertheless, for obtaining a 3-uncolorability certificate, it is sufficient to encounter just one non-3-colorable subgraph (e.g., a 4-critical subgraph), i.e., a small graph.
Thus, while for considerably large graphs, just verifying a legal 3-coloring can be complex in practice, it remains practical (at least in theory) to determine 3-uncolorability even for such graphs.
Hence, in principle, finding uncolorability certificates can be assumed to be a task of at least the same kind as finding colorings. Thus, there should not be any problem in developing 3-coloring algorithms based on a search for 3-uncolorability certificates that eventually reach the same level of sophistication and performance as their coloring-based peers.
Moreover, if the algorithm is used as a heuristic, e.g., to test whether a solution can be found quickly ("just by chance") with a relatively low (efficient) α, the algorithm will search for both 3-colorings and 3-uncolorability certificates at the same time, in clear contrast with backtracking, greedy-based, and randomized 3-coloring algorithms. Further, this feature is particularly important because it ensures that it is not necessary to trust the correctness of the algorithm itself, or of the particular implementation used, in order to recognize that the provided solution is correct: the result can be efficiently verified using just the solution provided, i.e., a legal 3-coloring or a 3-uncolorability certificate.
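The independent verification step can be illustrated with a small sketch. Checking a claimed 3-coloring takes a single pass over the edges. For the negative case, only the simplest certificate named in this section, four mutually adjacent vertices, is checked here, since the general certificate format is defined elsewhere in the article. Both function names are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def is_legal_3coloring(n: int, edges: List[Tuple[int, int]], coloring: Dict[int, int]) -> bool:
    """Check that every vertex 0..n-1 has a color in {0, 1, 2} and no edge is monochromatic."""
    if any(coloring.get(v) not in (0, 1, 2) for v in range(n)):
        return False
    return all(coloring[u] != coloring[v] for u, v in edges)


def is_k4_witness(edges: List[Tuple[int, int]], quad: Sequence[int]) -> bool:
    """Check that the four given vertices are distinct and pairwise adjacent."""
    present = {frozenset(e) for e in edges}
    return len(set(quad)) == 4 and all(
        frozenset(pair) in present for pair in combinations(quad, 2)
    )
```

Either check runs in time polynomial in the size of the graph and does not depend on how the coloring or the witness was produced.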
The developed theoretical analysis guarantees some good features of the proposed algorithm. The most important one, for both practical and theoretical purposes, is that while the algorithm relies on the value of α to be able to find a certificate, the probability that α(G) > k decreases at the rate $P(\alpha(G) > k) \leq 2^{-(k+1)}$; e.g., for k = 19, the bound is $2^{-20} \approx 9.5 \times 10^{-7}$, so there is less than a one-in-a-million chance of not obtaining a solution with the proposed polynomial algorithm (i.e., probability of success > 0.999999), assuming that the input is a random graph.
Thus, while certainly beyond some value of α, the running times would become prohibitive given the current state of the computing machinery, the developed algorithm scales polynomially, and the probability of obtaining a solution (success) grows exponentially with an increment of α. Hence, any step (i.e., any investment) in computing power technology will lead to a huge (exponential) growth of the class of tractable 3-coloring instances, as well as CSPs in general.
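The trade-off between the parameter and the failure probability can be tabulated directly from the bound $P(\alpha(G) > k) \leq 2^{-(k+1)}$. The sketch below is only arithmetic on that bound for random graphs; the function names are assumptions made for the example.

```python
def failure_bound(k: int) -> float:
    """Upper bound 2^-(k+1) on P(alpha(G) > k) for a random graph."""
    return 2.0 ** -(k + 1)


def smallest_k(eps: float) -> int:
    """Smallest k >= 0 whose bound 2^-(k+1) does not exceed eps (0 < eps < 1)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    k = 0
    while failure_bound(k) > eps:
        k += 1
    return k


if __name__ == "__main__":
    for eps in (1e-3, 1e-6, 1e-9):
        k = smallest_k(eps)
        print(eps, k, failure_bound(k))
```

Each unit increase of k halves the bound, which is the sense in which the probability of success grows exponentially with α.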
Could we perhaps achieve at least a "technological tractability", i.e., a guaranteed number of instances such that almost all computational problems of practical interest could be solved with α(G) ∈ [0, k] for some integer k?
It should also be observed that increasing α as the result of technological progress implies that $\alpha \neq f(\text{input})$, i.e., α is not a function of the input. Does technological progress imply a polynomial algorithm for 3-colorability?
In addition, since 3-colorability is NP-complete and to each graph corresponds a unique α(G), a classification based on α(G) of all the NP-complete problem instances can be done by a reduction of each problem instance to a 3-coloring instance G such that α(G) = k for some k ∈ N.
However, can we define NP as follows? Let us define NP(α) as the class of problems in NP that are also in P for some particular value of α(G) ∈ N. Can NP then be defined as the infinite union over α of the classes NP(α), i.e., as an infinite union of problems in P? Finally, even without determining whether α(G) is unbounded, is there, as in the case of maximal degree four (Δ(G) ≤ 4), a k ∈ N such that determining 3-colorability over a class of graphs with α(G) ≤ k is still NP-complete, i.e., P = NP?
In the maximal-degree case, we know that 3-colorability restricted to Δ(G) ≤ 4 is still NP-complete. Nevertheless, the problem is to determine whether a polynomial algorithm exists or not.
On the contrary, in the finite-α(G) case, we know that 3-colorability restricted to α(G) ≤ k is in P. Nevertheless, the problem is to determine whether it is NP-complete for a class of graphs and finite k ∈ N.

Reproducibility note
The working source-code of the algorithm and all the software libraries needed to appropriately use and experiment with the algorithm have been released and are available at the publisher's website.
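Input graphs for the web application described in this note are given as DIMACS edge lists (.col files). In that format, lines beginning with `c` are comments, a line `p edge N M` declares N vertices and M edges, and each line `e u v` gives one edge with vertices numbered from 1. A minimal reader can be sketched as follows; the function name and the choice to renumber vertices from 0 and drop duplicate edges are assumptions made for the example, and the sketch is not taken from the released code.

```python
from typing import List, Tuple


def parse_dimacs_col(text: str) -> Tuple[int, List[Tuple[int, int]]]:
    """Parse a DIMACS .col edge list into (number of vertices, sorted 0-based edges)."""
    n = 0
    edges = set()
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0] == "c":  # blank line or comment
            continue
        if parts[0] == "p":  # problem line: p edge N M
            n = int(parts[2])
        elif parts[0] == "e":  # edge line: e u v (1-based vertices)
            u, v = int(parts[1]) - 1, int(parts[2]) - 1
            if u != v:
                edges.add((min(u, v), max(u, v)))
    return n, sorted(edges)


if __name__ == "__main__":
    sample = """c a triangle
p edge 3 3
e 1 2
e 2 3
e 1 3
"""
    print(parse_dimacs_col(sample))
```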
Furthermore, there is a web application that implements the algorithm inside the Google App Engine cloud computing framework. Users can visit the site and test the algorithm at the following URL:
• http://graph-coloring.appspot.com
The web coloring application just asks for a file where a graph is defined following the plain-text version of the simple edge list according to the DIMACS standard format specification (http://mat.gsia.cmu.edu/COLOR/general/ccformat.ps), such as the .col files in http://mat.gsia.cmu.edu/COLOR/instances.html.

Table 1: A priori expected performance of the algorithm with respect to 3-colorability and graph density parameters.

Tables
3-colorable (Yes), below the threshold: High probability of a short running time due to the existence of many legal colorings.
Non-3-colorable (No), below the threshold: High probability of a short running time since almost all non-3-colorable graphs contain a $K_4$ and hence the probability of obtaining a $K_4$-free non-3-colorable graph decreases rapidly when the average degree falls below the phase transition threshold.
3-colorable (Yes), above the threshold: High probability of a short running time due to the existence of many $K_{1,1,2}$ subgraphs that prune the search, e.g., graphs tend to be uniquely colorable.
Non-3-colorable (No), above the threshold: High probability of a short running time due to the existence of many small 3-uncolorability certificates due to the average degree, e.g., too many $K_4$-subgraphs.
Average case (expected) performance of the algorithm with respect to the density of the graph, i.e., above/below the phase transition threshold ($d^* \approx 4.69$) and the type of instance (Yes/No).

Table 2: Parameters of the scalability test between backtracking and the proposed algorithm.

Sample property Value
Sample type: Random planar graphs.
Average degree: From 2 to 5 edges uniformly distributed.
Group size: 100 graphs per number of vertices.
Random planar graph instances from 10 to 100 vertices incremented by 1 and generating 100 graphs for each number of vertices, i.e., 9000 graphs in total.

Table 3: Parameters of the sample used in the random planar graphs test.

Sample property Value
Sample type: Random planar graphs.
Average degree: From 2 to 5 edges uniformly distributed.
Group size: 1000 graphs per number of vertices.
Uniformly random planar graph instances from 100 to 1000 vertices incremented by 100 and generating 1000 graphs for each number of vertices, i.e., 10000 graphs in total.

Table 4: Parameters of the sample used in the random planar 4-regular graphs test.

Sample property Value
Sample type: Random planar 4-regular graphs.
Group size: 1000 graphs per number of vertices.
Graphs are sampled from an ad-hoc distribution over the 4-regular planar graphs. The sample consists of graph instances from 100 to 1000 vertices incremented by 100 and generating 1000 graphs for each number of vertices, i.e., 10000 graphs in total. For the exact meaning of each graph transformation operation, i.e., $\varphi_A$, $\varphi_B$, $\varphi_C$ and $\varphi_F$, see Ref. [45].

Table 5: Parameters of the sample used in the random graphs test.

Sample property Value
Sample type: Erdős-Renyi random graphs.
Average degree: From 3 to 6 edges uniformly distributed.
Graphs are sampled from the well-known Erdős-Renyi random graphs distribution. The sample consists of graph instances on 100 vertices, generating in total 10000 graphs. For each number of vertices, the average degree is varied from 3 to 6.