An Algorithm for Finding the Singleton Attractors and Pre-Images in Strong-Inhibition Boolean Networks

The detection of the singleton attractors is of great significance for the systematic study of genetic regulatory network. In this paper, we design an algorithm to compute the singleton attractors and pre-images of the strong-inhibition Boolean networks which is a biophysically plausible gene model. Our algorithm can not only identify accurately the singleton attractors, but also find easily the pre-images of the network. Based on extensive computational experiments, we show that the computational time of the algorithm is proportional to the number of the singleton attractors, which indicates the algorithm has much advantage in finding the singleton attractors for the networks with high average degree and less inhibitory interactions. Our algorithm may shed light on understanding the function and structure of the strong-inhibition Boolean networks.


Introduction
Revealing how a genetic regulatory network is organized for its function is one of the main topics in system biology [1][2][3]. With the contribution of experimental physiologists, many precise details of gene interactions as well as their functions have been revealed [3]. Based on the experimental data, different kinds of mathematical models have been developed, such as master equations [4], Monte-Carlo method [5,6], stochastic model [7,8], ordinary differential equations (ODE) [9][10][11][12][13], Boolean network method [14][15][16] and other mathematical models [17]. Among all the models, the Boolean network is a simple but powerful mathematical model. It was originally introduced by Kauffman [14] and has been quickly developed into many different types, including the general Boolean network model [15], the AND/OR Boolean network model [16], and the strong-inhibition Boolean network model [18], and so on. In the general Boolean network model, the most details of the network are taken into account. It can perfectly yield insight to global behavior of the network, however, it is difficult to analysis the general Boolean network due to the complexity of large genetic regulatory network. Further, a much simpler AND/OR Boolean network model where many details of the network are neglected is presented. Recently, a strong-inhibition Boolean network model which is a biophysically plausible gene model has been obtained from the transition of the general Boolean network model, as the inhibitory interactions often dominate the activations in some important genomic regulations [18]. The model has been successfully used to reveal the backbone motif structure of the cell-cycle network and reconstruct the network structure from evolution data [18][19][20], since the model has more details of the network, and a simple rule which is similar to the AND/OR Boolean network model. It is very suitable to be applied to study the genetic regulatory network.
To capture the biological meaning of the genetic regulatory network, it is very necessary to identify the stable states of the dynamic system, such as the cyclic attractor and the singleton attractor. Two different attractors correspond to different functional states of the network. As an example, the cyclic and singleton attractor correspond to cell growth and differentiated states (or apoptosis) in the cell-cycle network, respectively [21]. It is noteworthy that the identification of the attractors is of great importance and is very useful in many applications, such as the treatment of human cancers [22,23], genetic engineering [24] and validate hypotheses on the transcription processes [25]. In order to identify all attractors of the genetic regulatory network, so far several methods have been proposed, including the methods relying on binary decision diagrams [26,27], constraint programming [28], feedback vertex sets [29,30], linear mapping [31]. Moreover, many of these methods have been developed to be more general and effective [32][33][34].
In particular, the identification of the singleton attractor is especially important for revealing the function of the genetic regulatory network [30]. To find the singleton attractors of the genetic regulatory network in the context of a Boolean network, several algorithms have been proposed. For instance, Leone et al. firstly applied the SAT (the satisfiability problem of Boolean formulas in conjunctive normal form) algorithms to identify the singleton attractor in a Boolean network [35]. Based on this observation, Tamura and Akutsu showed that the detection problem of singleton attractor for a Boolean network with maximum in-degree k can be reduced to (k + 1)-SAT, and presented an algorithm for detecting the singleton attractor of an AND/OR Boolean network in O(1.787 N ) time [36]. Subsequently, the authors and coworkers succeeded in improving the above algorithm, including proposed an O(1.587 N ) time algorithm for determining the singleton attractor of an AND/OR Boolean network [37], an O(1.757 N ) time algorithm for determining the singleton attractor of planar and nonplanar Boolean networks [38], the O(1.799 N ) and O(1.619 N ) time algorithm for determining the singleton attractor of Boolean networks with nested canalyzing functions and chain functions [39], respectively. In the O(1.799 N ) time algorithm, it was implicitly assumed that the network does not contain the positive self-loop. While allowing for the presence of positive self-loop, the singleton attractor detection problem was solved in O(1.871 N ) time [40]. Besides, Zhang et al. developed a quite fast algorithm using gene ordering [30]. It is shown that the algorithm can identify all singleton attractors for a random Boolean network with maximum indegree two with an average time O (1.19 N ). However, the average computational time complexity would increase (approximately O(1.5 N ) for maximum indegree ten) with increasing of maximum indegree. Recently, Zou proposed an algorithm by dividing the polynomial equation system into many subsystems [41]. Indeed, the problem of detection for the singleton attractor is still NP-hard [29,30,42]. Thus, it is not plausible to solve this problem efficiently in all cases. However, we can develop a new algorithm that is fast and can be applied in different mathematical models.
In this paper, we focus our attention on the strong-inhibition Boolean network model, and propose an algorithm for detecting the singleton attractors and pre-images of genetic regulatory network. Our algorithm is applied to the cell-cycle network of budding yeast, and can accurately identify all the singleton attractors of the network. Furthermore, we show that the average computational time increases exponentially with the growth of the network size N, and the order is approximately O (1.34 N ). Moreover, we discover that the computational time is proportional to the number of the singleton attractors of network. Based on this observation, we find that our algorithm has much advantage in finding the singleton attractors for the networks with high average degree and less inhibitory interactions. We further extend the algorithm for studying the pre-image problem. The pre-image problem (or predecessor problem), to find the set of all possible inputs that evolve into the given output, has been addressed by Wuensche in [43]. In recent years, several algorithms have been introduced, such as reverse algorithm [44], probabilistic algorithm [45]. But it has also been shown to be NP-hard in general [46]. In this paper, we extend the previous singleton attractor detection algorithm to find the pre-images of any given state. All pre-images and even the basin of one singleton attractor are successfully and accurately found.
The paper will be arranged as follows: in Sec. II, the state transition model is introduced. In Sec. III, the algorithm for finding the singleton attractor is given. In Sec. IV, we show an example for finding the singleton attractor with this algorithm, and the computational time of the algorithm for finding singleton attractors is analyzed. In Sec. V, we present the algorithm of finding the pre-images of any target network state. Finally, we give a summary in Sec VI.

State transition model
Boolean network model is a discrete dynamical model of genetic regulatory networks. In this model, each node has only two states, 1 or 0, representing "on" (active) or "off" (inactive) state, respectively. For a network system with N components, the network state at time t is denoted by S(t) = (S 1 (t), S 2 (t), . . ., S N (t)). The network state in the next step is uniquely determined by the following rule [15]: if P N j¼1 a ij S j ðtÞ ¼ 0; where i, j = 1, 2, . . ., N. A = {a ij , i, j = 1, 2, . . ., N} is the adjacency matrix of the network, which denotes the interactions between all the components. a ij can be positive, negative, or zero, standing for an activating interaction, inhibiting interaction or no interaction, respectively. Usually, a ii take −1, 1, or 0, and a ij take −γ, 1, or 0 for j 6 ¼ i.
In fact, the inhibiting interactions are dominant for most biomolecular interactions, one prefers γ ! 1. Following the "dominant inhibition" assumption γ = 1, which represents some typical biological transcriptional regulatory processes [47][48][49], the Eq (1) can be simplified into a logical equation [18] where r ij and g ij represent the putative inhibitory and putative stimulatory edge from node j to i, respectively. The relation between a ij and g ij , r ij is where i, j = 1, 2, . . ., N. It should be noted that node j can't have activating and inhibiting effect on node i at the same time, namely g ij and r ij can't both take the value 1 for any fixed i and j. The addition, multiplication and bar in the Eq (2) represent the Boolean operator OR, AND, and NOT, respectively. This model is called strong-inhibition Boolean model due to the "dominant inhibition" assumption.

The algorithm for finding the singleton attractor
To identify the singleton attractor which the network system will eventually evolve into a limited set of stable states, we should check all the states in the network state space. However, a great deal of time will be assumed for the large networks under this enumeration-based algorithm since the state space consists of 2 N different states. So it is very necessary to design more efficient method to identify the singleton attractor.
For the strong-inhibition Boolean model, the singleton attractors are solutions of the following equations where i, j = 1, 2, . . ., N. We can obtain a concise equation by setting S i = S i (t), Among all the interactions that regulate node i, we suppose that there are h positive interactions (g ij 1 = g ij 2 = . . . = g ij h = 1) and m negative interactions (r ik 1 = r ik 2 = . . . = r ik m = 1). Then Eq (5) can be written as: where^and _ represent AND and OR, respectively. And According to the definition in Ref. [40], one can see that the right side of Eq (6) is a chain function, which is a special case of nested canalyzing function (nc-function). It was proved by Akutsu et al. that finding a singleton attractor for an nc-network with chain functions remains NP-hard [39]. Then, some rules are gotten by analysing Eq (5), rule 1. if S i = 1 and r ji = 1, then S j = 0; rule 7. if ∑ j6 ¼i (S j r ij ) = 0, and there is a node j 0 such that S j 0 g ij 0 = 1, then S i = 1.
According to these rules and the given network structure, the states of other nodes can be determined if we already know the states of part of nodes. The states of its neighbors may be determined with the rules 1 − 4 if the state of a node is known; its state may be determined with the rules 5 − 7 if the state of a node is unknown. What's more, according to the rules 1 and 2, we can find that if S i = 1 and the node i has more putative inhibitory connections, the states of its neighbors can be easily determined. Based on these rules, we can design the following algorithm to identify the singleton attractors.
Step 1. Input the network matrix A, and a network state S = (S 1 , S 2 , . . ., S N ), where the states of l 0 (0 < l 0 N) nodes are unknown while the states of other N − l 0 nodes are known. Those l 0 nodes are variable nodes as their states can be 1 or 0.
Step 2. Find the node i 0 which has the most putative inhibitory connections among the variable nodes of the network.
Step 3. Let S 1 old ¼ S and assign S i 0 = 1. According to the rules 1 − 7, we can determine the states of more variable nodes, and obtain new network state S new . Throughout this process, a case may occur: according to one rule, the state of a variable node can be determined as 1, but it is determined as 0 with another rules. This contradiction means that the assignment of S i 0 = 1 is not appropriate, and the corresponding network state is not a singleton attractor of the network system, so we should remove it. On the other hand, if this case does not occur, we should remember the state S new and count the number of its variable nodes l new . Moreover, if l new > 0, we should remember this state and wait the next turn to begin Step 1; if l new = 0, which means that the states of all nodes are determined and a new singleton attractor is found, return this singleton attractor.
Step 4. Let S 0 old ¼ S and assign S i 0 = 0. Do the same as those did in Step 3. Next, we will prove that the states found with our algorithm are exactly all the singleton attractors of the network. In our algorithm, we determine the states of variable nodes according to the rules and remove the state if contradiction appears. Actually, the states found with our algorithm are states that do not contradict with the rules, here we use set U to denote them. And we use set V to represent all the singleton attractors of the network. Because a singleton attractor must satisfy Eq (5), so it obeys the rules, thus V is a subset of U, namely V & U. Afterwards, we will prove that U is also a subset of V.
Suppose that U is not a subset of V, there must be a state S 0 such that S 0 2 U and S 0 = 2 V. Accordingly, the state S 0 obeys all the rules but does not satisfy Eq (5). Then, there must be a node's state, assuming S 0 i , such that where S 0 i ¼ 0 or 1. Firstly, assuming S 0 i ¼ 0 and inserting it into Eq (7), we obtain The Eq (9) indicates And from Eq (10), we obtain Obviously, it contradicts with the rule 6 through Eq (11) and S 0 Through Eqs (11) and (13) and S 0 i ¼ 0, one can see that it contradicts with the rules 4 and 7. Therefore, it is impossible that S 0 which obeys all the rules does not satisfy the Eq (5), if Through Eq (16) and S 0 i ¼ 1, we find that it contradicts with the rules 1 and 2. These results show that if S 0 i ¼ 1, it is also impossible that S 0 which obeys all the rules does not satisfy Eq (5). Now we can conclude that the state S 0 , which obeys all the rules but dissatisfies Eq (5), does not exist. It indicates that U is a subset of V, namely U & V. Therefore, U = V is proved. And we get the proof that the states found with our algorithm are exactly all the singleton attractors of the network. The code of this algorithm is provided in Supporting Information (S1 File).

An example for finding the singleton attractor
To verify the validity of the above algorithm, we apply it to detect the singleton attractors of the cell-cycle network for budding yeast which is well studied in the research of cell-cycle process [15]. As shown in Fig 1(a), the network consists of 11 nodes and 34 edges. According to prior works, there are 7 singleton attractors among 2 11 = 2048 states in the state space of the cell-cycle network, as shown in Fig 1(b). Next, we will show how to find these singleton attractors with our algorithm.
A flow chart for detecting all the singleton attractors of the network is shown in Fig 2. We obtain S 1 = 0 according to the known network structure and the rule 5. Next, we begin to detect the states of the variable nodes. We firstly rank all nodes from large to small according to the number of their putative inhibitory connections, the order is node 10, 9, 8, 5, 7, 4, 6, 3, 2, 11, 1. Firstly, the network state (S 1 , S 2 , . . ., S N ) is set to (0, S 2 , . . ., S N ) with l 0 = N − 1. Then, we get S 7 = 1 according to the rule 7 by setting S 10 = 1. Afterwards, S 10 = 0 is obtained according to the rule 1. Now the contradiction appears, so the state S = (0, S 2 , . . ., S 9 , 1, S 11 ) should be removed. On the other hand, assign S 10 = 0. It turns out that no more nodes' state can be determined, so we remember the state S = (0, S 2 , . . ., S 9 , 0, S 11 ) with l 0 = N − 2 and begin the next turn.
Furthermore, the singleton attractors of five classical networks are detected. As shown in Table 1, the number of singleton attractors (NS) and running time of our algorithm for each network are given in columns 4 and 5. We also give the results of the enumeration-based algorithm in columns 6 and 7. Comparing our algorithm with the enumeration-based algorithm, our algorithm can not only find the same number of the singleton attractors with the enumeration-based algorithm (columns 4 and 6), but also decrease extremely the running time (columns 5 and 7). Especially for the largest network (T-cell receptor), it took round 0.0468 seconds for our algorithm, whereas the enumeration-based algorithm spent more than 40 hours.
Next, we will show the computational time of our algorithm by numerical simulations on random gene regulatory networks. Similar to the ER random network [54], there exits interaction (a ij 6 ¼ 0) from node j to i (i = j is allowed) with the probability p. Usually, the parameter p is determined by p = hki/N, where hki is the average degree of the network. Furthermore, a ij 6 ¼ 0 has been assumed to take an independent value (±1) distributed according to a twopoint probability distribution function. More specifically, we adopt the following values at random: the inhibiting interaction with probability r; 1; the activating interaction with probability 1 À r: The average degree is about 2 * 4 for many biological networks, and the amount of inhibiting interactions is less than that of activating interactions [15,[55][56][57], so hki = 3 and r = 0.4 are fixed. We set N = 50 and generate 500 sample networks. The CPU computational time for finding all the singleton attractors of each network with above algorithm is calculated and shown in Fig 3(a). As we can see, the singleton attractors of every network can be detected, and the time is no more than 1 minute. This indicates that the singleton attractors of a network can be found efficiently with our algorithm. The average CPU computational time avT as a function of the network size N is also computed. The result is shown in a semi-logarithmic plot in Fig 3(b), with a straight fit (dashed line) to guide an eye. The value of avT for each N is averaged over 500 samples. And the error bars denote the range of CPU computational time, where the upper and lower ends of bars represent the maximum and minimum values, respectively. In the figure, the average CPU computational time increases as the growth of network size N. This behavior is well characterized by the exponential relationship, avT / 1.34 N . This exponential increase also exists between the maximum CPU computational time and the size of network.
Although the average and maximum CPU computational time increase exponentially with the increase of N, we do not know why the minimum CPU computational time is very short for each N, as shown in Fig 3(b). To understand this question, the maximum, average, and minimum number of the singleton attractors as the function of the size of system N are shown in Fig  4(a), respectively. From this figure, we find that the average and the maximum number of the singleton attractors increase exponentially with N increasing, and there is not exponential relation between the minimum number of the singleton attractors and the size of system. Furthermore, the sample networks with size of N = 50 are generated. We plot the CPU computational time against the number of singleton attractors (NS) of the sample networks, and show the result in Fig 4(b). We find that the computational time increases linearly with the number of the singleton attractors increasing. Based on these observations, the minimum CPU computational time nonexponentially increases with the size of system, as the exponential relation between the minimum number of the singleton attractors and the size of system does not exist. Following the above observation, we may want to know what characteristics of networks could have small number of singleton attractors. So we calculate the average number of Algorithm for Finding the Singleton Attractors and Pre-Images singleton attractors of networks for different average degree hki 0 s and inhibiting interactions probability r 0 s. In Fig 5(a) and 5(b), we show the semi-logarithmic plots of avS (the average number of singleton attractors) as a function of hki and r, respectively. We can find that avS decreases extremely fast with the increase of hki and increases exponentially with the increase of r. According to the above observation, these results indicate that less computational time is assumed to find the singleton attractors of the networks with high average degree and less inhibitory interactions. Algorithm for Finding the Singleton Attractors and Pre-Images
If one observes carefully those rules, we find that the previous algorithm for detecting the singleton attractors can be extended to find the pre-images for any target network state. Here, we just need to input the target state in Step 1 and replace the rules 1 − 7 with rules 1.1 − 1.5 and 2.1 − 2.5 in Steps 3 and 4.
Next, we will prove that the states found with this algorithm are precisely all the pre-images of the target network state. Here, we still use U and V to stand for the set of the states found with this algorithm and the set of all the pre-images of the target network state, respectively. Obviously, any pre-image of S 0 (t + 1) must satisfy Eq (18), and it follows certainly all the rules. So V is a subset of U, namely V & U. Therefore, to prove U = V, we just need to prove U & V.
Suppose that U is not a subset of V, then there must be a state S 0 (t) such that S 0 (t) 2 U and S 0 (t) = 2 V. The state S 0 (t) obeys all the rules but does not satisfy Eq (18), namely, there is a node's state which dissatisfies the equation. We assume that this state is S 0 i ðt þ 1Þ, and we have where S 0 i ðt þ 1Þ ¼ 0 or 1. Firstly, assuming S 0 i ðt þ 1Þ ¼ 1 and inserting it into Eq (19), we obtain Through Eq (21), there must be a node j 0 such that S 0 j 0 ðtÞ ¼ 1 and r ij 0 = 1, we can find that it contradicts with the rules 1.1. So it is necessary that Q j6 ¼i ðS 0 j ðtÞr ij Þ ¼ 1, hence we have ∑ j 6 ¼ i (S j r ij ) = 0. If Eq (22) holds, one case is P j6 ¼i ðS 0 j ðtÞg ij Þ ¼ 0, g ii = r ii = 0 and S 0 i ðtÞ ¼ 0, it contradicts with the rule 1.3 through ∑ j 6 ¼ i (S j r ij ) = 0; the other case is P j6 ¼i ðS 0 j ðtÞg ij Þ ¼ 0 and r ii = 1, we can observe that the contradiction appears according to the rule 1.5. Therefore, it is impossible that S 0 (t) which obeys all the rules does not satisfy Eq (18) From this equation, we can get Y j6 ¼i The Eq (24) indicates X j6 ¼i ðS 0 j ðtÞr ij Þ ¼ 0: And from Eq (25), we obtain X j6 ¼i ðS 0 j ðtÞg ij Þ ¼ 1 ð27Þ If Eq (27) works, then there must be a node j 0 such that S 0 j 0 ðtÞ ¼ 1 and g ij 0 ðtÞ ¼ 1: Through Eqs (26) and (29), we can find that it contradicts with the rule 2.1. On the other hand, if Eq (28) is valid, two cases should be discussed: one case is g ii = r ii = 0 and S 0 i ðtÞ ¼ 1, it contradicts with the rule 2.2 since Eq (26) holds; the other case is g ii = 1 (r ii = 0 at this time), combining with Eq (26), one can see that the contradiction appears according to the rule 1.5. In summary, if S 0 i ðt þ 1Þ ¼ 0, it is also impossible that S 0 (t) obeys all the rules and dissatisfies Eq (18) at the same time.
These results demonstrate that the state S 0 (t), which obeys all the rules but dissatisfies the Eq (18), does not exist. Therefore, U is a subset of V (U & V), and U = V is proved. Now, we can conclude that the states found with our algorithm are precisely all the pre-images of the target network state.

Conclusion
In summary, we have presented a novel algorithm for finding the singleton attractors in strong-inhibition Boolean networks. The average case time complexity of the proposed algorithm is approximately O(1.34 N ) which is much faster than the enumeration-based algorithm. It may not be faster than the out-degree based ordering algorithm in [30] for networks with very low maximum indegree. However, we find that the computational time assumed of the algorithm is proportional to the number of the singleton attractors, it shows that our algorithm will work much better for networks with less singleton attractors, especially for the networks with high average degree and less inhibitory interactions. What's more, the algorithm can be extended to identify the pre-images of any network state and the basin of any singleton attractor. Therefore, the proposed algorithm could be effective and practical. We hope it has good applications in the study of biological networks. On the other hand, we can also know that the algorithm has its own limitations. For example, our algorithm completely relies on the strong-inhibition Boolean model, it may not be workable for other kinds of model, such as the general Boolean model. The computational time of the algorithm increases exponentially with the size of network, so it may not be applied to networks with several hundreds or more nodes. Moreover, we didn't get the theoretical results of the time complexity of the algorithm though we have made great efforts. We hope some of these limitations can be overcome and this work can be further improved in the future.  attractor (0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0)  Algorithm for Finding the Singleton Attractors and Pre-Images Supporting Information S1 File. Data and code of the algorithm. This data set contains the code of the algorithm for finding the singleton attractors and pre-images in strong-inhibition Boolean networks. The adjacency matrices of five classical networks are also given. (ZIP)