Robustness Analysis of Stochastic Biochemical Systems

We propose a new framework for rigorous robustness analysis of stochastic biochemical systems that is based on probabilistic model checking techniques. We adapt the general definition of robustness introduced by Kitano to the class of stochastic systems modelled as continuous time Markov Chains in order to extensively analyse and compare robustness of biological models with uncertain parameters. The framework utilises novel computational methods that enable to effectively evaluate the robustness of models with respect to quantitative temporal properties and parameters such as reaction rate constants and initial conditions. We have applied the framework to gene regulation as an example of a central biological mechanism where intrinsic and extrinsic stochasticity plays crucial role due to low numbers of DNA and RNA molecules. Using our methods we have obtained a comprehensive and precise analysis of stochastic dynamics under parameter uncertainty. Furthermore, we apply our framework to compare several variants of two-component signalling networks from the perspective of robustness with respect to intrinsic noise caused by low populations of signalling components. We have successfully extended previous studies performed on deterministic models (ODE) and showed that stochasticity may significantly affect obtained predictions. Our case studies demonstrate that the framework can provide deeper insight into the role of key parameters in maintaining the system functionality and thus it significantly contributes to formal methods in computational systems biology.

where φ is a path formula given as φ ::= X Φ | Φ U I Φ, a is an atomic proposition, ∼∈ {<, ≤, ≥, >}, p ∈ [0, 1] is a probability, r ∈ R ≥0 is an expected reward and I = [a, b] is a bounded time interval such that a, b ∈ R ≥0 ∧ a ≤ b. Path operators G (always) and F (eventually) are derived in the standard way using the operator U. In order to specify properties containing rewards (R ∼r [C ≤t ] is the cumulative reward acquired up to time t, R ∼r [I =t ] is the instantaneous reward in time t) the CTMC C is enhanced with reward (cost) structures. Two types of reward structures are used. A state reward ρ(s) defines the rate with which a reward is acquired in state s ∈ S. A reward of t · ρ(s) is acquired if a CTMC remains in state s for t time units. A transition reward ι(s i , s j ) defines the reward acquired each time the transition (s i , s j ) occurs.
Since the reward function has to be defined before the actual analysis of the CTMC, the rewards for particular states have to be known prior to the specification of the property. This fact limits the class of properties that can be expressed using such structures. For example, noise expressed by a mean quadratic deviation (mqd ) of the population probability distribution of a species at a given time cannot be specified using CSL with rewards. To compute the mqd we need to know the mean of the distribution to be able to obtain the corresponding coefficients and encode them into state rewards.
To overcome this problem we introduce the abstract state operator E ∼r [I =t ] which evaluates the state distribution π C,s0,t at the given time instant t by a user provided real-valued post-processing function P ost(π C,s0,t ) and compares it to ∼ r.
The formal semantics of the bounded fragment of CSL with rewards and post-processing functions is defined similarly as the semantics of full CSL and thus we refer the readers to original papers. The key part of the semantics is given by the definition of the satisfaction relation . It specifies when a state s satisfies the state formula Φ (denoted as s Φ) and when a path ω satisfies the path formula φ (denoted as ω φ). The informal definition of is as follows: • s E ∼r [I =t ] iff P ost(π C,s,t ) satisfies ∼ r.
• s P ∼p [φ] iff the probability of all paths ω ∈ P ath C (s) that satisfy the path formula φ (denoted as P rob C (s, φ)) satisfies ∼ p, where ω satisfies X Φ iff the second state on ω satisfies Φ ω satisfies Φ U I Ψ iff there exists a time instant t ∈ I such that the state on ω occupied at t satisfies Ψ and all states on ω occupied before t ∈ [0, t) satisfy Φ • s R ∼r [C ≤t ] iff the sum of expected rewards over P ath C (s) cummulated until t time units (denoted as Exp C (s, X C ≤t )) satisfies ∼ r • s R ∼r [I =t ] iff the sum of expected rewards over all paths ω ∈ P ath C (s) at time t (denoted as Exp C (s, X I =t )) satisfies ∼ r.
A set Sat C (Φ) = {s ∈ S | s Φ} denotes the set of states that satisfy Φ. Note that the syntax and semantics can be easily extended with "quantitative" formulae in the form Φ :: , i.e., the topmost operator of the formula Φ returns a quantitative result, as used, e.g., in PRISM [1]. In this case the result of a decision procedure is not in the form of a boolean yes/no answer but is the actual numerical value of the probability P rob C (s, φ), the expected reward Exp C (s, X) for X ∈ {X I =t , X C ≤t } or the value of P ost C (s, t). The computation of a numerical value is of the same complexity class as the computation of a result to be compared leading to a boolean answer, although in some cases the comparison may be carried out on less precise or preliminary results.

II Transient Analysis
The aim of transient analysis is to compute a transient probability distribution. Given an initial distribution π C,s0,0 (i.e., π C,s0,0 (s) = 1 if s 0 = s, and 0, otherwise) at time 0 of a CTMC C = (S, s 0 , R) what will the transient state distribution π C,s0,t look like in some future yet finite time t ∈ R ≥0 . Transient analysis of a CTMC may be efficiently carried out by a standard technique called uniformisation [2]. The transient probability in time t is obtained as a sum of expressions giving the state distributions after i discrete reaction steps of the respective uniformised discrete time Markov chain (DTMC) weighted by the i th probability of the Poisson process. It is the probability of i such steps occurring in time t, assuming the delays between steps of the CTMC C are exponentially distributed with rate q. Formally, for the rate q satisfying q ≥ max{E C (s) | s ∈ S} (E is the exit rate of state s) the uniformised DTMC unif(C) is defined as unif(C) = S, s 0 , Q unif(C) where and the ith Poisson probability in time t is given as γ i,q·t = e −q·t · (q·t) i i! . The transient probability can be computed as follows: Although the sum is in general infinite, for a given precision the lower and upper bounds L , R can be estimated by using techniques such as that of Fox and Glynn [3] which also allow for efficient solutions of the Poisson process. In order to make the computation of uniformisation feasible the matrix-matrix multiplication is reduced to a vector-matrix multiplication, i.e., π C,s0,0 · (Q unif(C) ) i = (π C,s0,0 · (Q unif(C) ) i−1 ) · Q unif(C) .
Standard uniformisation can be intractable when the system under study is too complex, i.e., contains more than an order of 10 7 states and the upper estimate R , denoting the number of vector-matrix multiplications as iterations, is high (more than an order of 10 6 ). Therefore, many approximation techniques have been studied in order to reduce the state space and to lower the number of iterations R . State space reductions are based on the observation that in many cases (especially in biochemical systems) a significant amount of the probability mass in a given time is localised in a manageable set of states. Thus neglecting states with insignificant probability can dramatically reduce the state space while the resulting approximation of the transient probability is still sufficient. Methods allowing efficient state-space reduction are based on finite projection techniques [4,5], dynamic state space truncation [6] and aggregation methods [7].
Since the number of iterations R inherently depends on the uniformisation rate q that has to be greater than the maximal exit rate of all the states of the system, a variant of standard uniformisation, the so-called adaptive uniformisation [8], has been proposed. It uses a uniformisation rate that adapts depending on the set of states the system can occupy at a given time, i.e., after a particular number of reactions. In many cases, a significantly smaller rate q can be used and thus the number of iterations R can be significantly reduced during some parts of the computation. Moreover, adaptive uniformisation can be successfully combined with reduction techniques mentioned above [6]. The downside of adaptive uniformisation is that the Poisson process has to be replaced with a general birth process which is more expensive to solve. More details can be found, e.g., in [8].

III Global CSL Model-Checking
Let us first define an auxiliary function Eval(C, Φ), that returns the numerical value representing the quantitative model-checking result for a given CTMC C = {S, s 0 , R} and an inspected formula Φ, in the following way: where ∈ {=?, ∼ r}.
The goal of the local model checking technique is to compute the value Eval(C, Φ). On the other hand, for the global model checking technique it is to compute the values of Eval(C, Φ) for all C = {S, s, R} where s ∈ S. The crucial advantage of the global approach is the fact that it has the same computational complexity as the local approach. Therefore, the global model checking technique is much more suitable for robustness analysis over perturbations of initial conditions that are encoded as the initial state of the corresponding CTMC.
Global model checking returns the vector of size |S| such that the i th position contains the model checking result provided that s i is the initial state. Let C = (S, R, L) be a labelled CTMC where the initial state is not specified. The crucial part of this method is to compute the vector of probabilities P rob C,φ for any path formula φ and the vector of expected rewards Exp C,X for X ∈ {X I =t , X C ≤t } such that for all s ∈ S the following holds: In local model checking the computation of P rob C (s, φ) and Exp C (s, X) is reduced to the computation of the transient probability distribution π C,s,t , as described, e.g., in [2,9]. Thus, for different initial states s we have to compute the corresponding transient probability distributions separately. The key idea of the global model checking method is to use backward transient analysis. The result of backward transient analysis is the vector τ C,A,t such that for an arbitrary set of states A, the value τ C,A,t (s) is the probability that A is reached from s at the time t. Without going into details the vector τ C,A,t can be computed in a very similar way using the uniformised DTMC unif(C) as in the case of vector π C,s,t . Only vector-matrix multiplications are replaced by matrix-transposed-vector multiplication and τ C,A,0 (s) = 1 if s ∈ A, and 0, otherwise.
The global model checking technique cannot be used if Φ includes the operator E ∼r [I =t ]. In such a case we have to compute the value P ost(π C,s,t ). Hence the local model checking technique has to be employed, i.e., we first compute the vector π C,s,t and then apply the user specified function Post.
Now we briefly show how the vector P rob C,φ is computed using backward transient analysis. Since the definition of next operator X Φ does not rely on any real time aspects of CTMCs, its evaluation stems from the probability of the next reaction that can be easily obtained from the transition matrix R. The evaluation of the until operator Φ 1 U I Φ 2 depends on the form of the interval I and is separately solved for the cases of I = [0, For the formula φ = Φ 1 U [t1,t2] Φ 2 the evaluation is split into two parts: staying in states satisfying Φ 1 until time t 1 and reaching a state satisfying Φ 2 , while remaining in states satisfying Φ 1 , within time t 2 − t 1 . The formula φ can be evaluated using the vector τ C,v,t that takes a vector v instead of a set A (i.e., τ C,v,0 = v) in the following way: The backward transient analysis can be also used in the case of reward computation. Since operator R ∼p [I =t ] expresses the expected reward at time t, the vector Exp C,X I =t can be computed as follows: Exp C,X I =t = τ C,v,t where v = ρ such that ρ is the given state reward structure.
For evaluation of the operator R ∼p [C ≤t ] we have to use mixed Poisson probabilities (see, e.g., [2,10]) in the backward transient analysis. It means that during the uniformisation the Poisson probabilities γ i,q·t are replaced by the mixed Poisson probabilitiesγ i,q·t that can be computed as: Using the given state reward structure ρ we can compute the vector Exp C,X C ≤t in the following way: where v = ρ and the mixed Poisson probabilitiesγ i,q·t are used.
To recap the overall method of stochastic model checking of CTMCs over CSL formulae we present the methods from an abstract perspective. The evaluation of a structured formula Φ proceeds by bottom-up evaluation of a set of atomic propositions, probabilistic or expected reward inequalities and their boolean combinations. This evaluation gives us a discrete set of states that are further used in the following computation. The process continues up the formula until the root is reached. The final verdict is reported either in the form of a boolean yes/no answer or as the actual numerical value of the probability or the expected reward. This process can be easily extended for the operator E ∼r [I =t ], though the local model checking method has to be used.

IV Min-Max Approximation
The key idea of min-max approximation proposed in [11] is to approximate the largest set of states satisfying Φ, and the smallest set of states satisfying Φ with respect to the space of perturbations P.
Let C be a set of parameterised CTMCs induced by the space of perturbations P. We compute the approximation Sat C (Φ) and Sat ⊥ C (Φ) such that To obtain such approximations we extended the satisfaction relation and showed in that it is sufficient for an arbitrary path formula φ, and X ∈ {X C ≤t , X I =t } to compute the vectors P rob C,φ , P rob C,φ ⊥ , Exp C,X and Exp C,X ⊥ such that for each s ∈ S the following holds: The min-max approximation can be easily extended to the operator E ∼r [I =t ]. For the given state s ∈ S and the time t it is sufficient to compute the values P ost C (s, t) and P ost C (s, t) such that the following holds: The approximated sets Sat C (Φ) and

V Parameterised Uniformisation
The parameterised uniformisation introduced in [11] is a modification of the standard uniformisation technique that allows us to compute strict approximations of the minimal and maximal transient probability with respect to the set C, moreover, the modification preserves the asymptotic time complexity of standard uniformisation. For the given state s ∈ S and time t ∈ R ≥0 the parameterised uniformisation returns vectors π C,s,t and π C,s,t ⊥ such that for each state s ∈ S the following holds: The modification is based on the computation of the local maximum (minimum) of π Cp,s,t (s ) over all C p ∈ C for each state s' and in each iteration i of the standard uniformisation. It means that in the i th iteration of the computation for a state s' we consider only the maximal (minimal) values in the relevant states in the iteration i-1, i.e., the states that affect π Cp,s,t (s ).
In [11] we have defined the function σ(s) (formally σ(p, s, π)) which for each state s ∈ S, perturbation point p ∈ P, and probability distribution π (or pseudo-distribution with the sum smaller or larger than 1) returns the difference of probability mass inflow and outflow to/from state s. If all reactions are described by mass action kinetics the resulting σ functions are monotonic with respect to any single perturbed stochastic rate constant k r . This allows us to efficiently compute for each state s the local maximum (minimum) of π Cp,s,t (s ) over all C p ∈ C corresponding to P.
However, in the case of more complex rate functions than those resulting from mass action kinetics, the corresponding σ(s) function does not have to be in general monotonic over k r ∈ [k ⊥ r , k r ] for all states s. This makes the computation of local extremes with respect to k r more complex, though still tractable. In the following let us assume the space of perturbations P = [k ⊥ r , k r ] × P is be decomposed along the k r axis.
The key idea is for each state s to be able to efficiently decompose P into subspaces P = P 1 ∪. . . ∪P n , such that for each P i the function σ(s) over P i is monotonic and then use the original method. The problem is that a computation of such a strict decomposition into monotonic subspaces is computationally demanding. Therefore we use a simplification, by off-line functional analysis we identify properties of σ functions for a given class of reaction kinetics and then obtain a partial decomposition of P based on function derivations into subspaces where monotonicity is guaranteed. For the remaining subspaces P j , where monotonicity of σ is not guaranteed, we employ a less accurate approximation.
We decompose the function σ(s) over P j into functions α This approximation increases the inaccuracy of parameterised uniformisation, however, the subspaces P j where the monotonicity of σ(s) is not guaranteed are usually small and, together with perturbation space decomposition introduced in the following section, they keep on getting smaller. Hence, the additional inaccuracy of the presented extension is manageable. Despite the fact that the time demands of this approximation are orders of magnitudes lower than other numerical methods computing maximum/minimum of σ(s) over P i , they still significantly slow down the computation of parameterised uniformisation.
The aforementioned parameterised uniformisation can be straightforwardly employed also for backward transient analysis. It means that we can efficiently compute the vectors τ C,A,t and τ C,A,t ⊥ such that for the given set of states A and each state s ∈ S the following holds: the vectors τ C,A,t and τ C,A,t . However for a general class of user-defined post-processing functions Post, the vectors π C,s,t and π C,s,t ⊥ cannot be directly used to compute values of P ost C (s, t) = P ost(π C,s,t ) nor P ost C ⊥ (s, t) = P ost(π C,s,t ⊥ ) that would satisfy Equation 3, since there is no guarantee about the projective properties of the function Post.
Now we show the main idea of computing P ost C (s, t) and P ost C ⊥ (s, t) for the post-processing function Post defined as the mean quadratic deviation of a probability distribution. This function allows us to quantify and analyse noise in different variants of signalling pathways that are studied in the second case study. The post-processing function is defined as P ost(π) = s∈S |s(A) − mean(π, A)| 2 · π(s), where s(A) gives the population of A in state s and mean(π, A) is the mean of the distribution π defined as mean(π, A) = s∈S s(A) · π(s).
Let us suppose we have an upper and lower bound on the probability distribution π , π ⊥ obtained by the parameterised uniformisation. It means that ∀C p ∈ C and ∀s ∈ S. π ⊥ (s) ≤ π Cp (s) ≤ π (s). To find the maximal value max P ost(π Cp ) | C p ∈ C means to find the distribution π max such that s∈S π max (s) = 1, ∀s ∈ S. π ⊥ (s) ≤ π max (s) ≤ π (s) and the probability mass in π max is distributed with the farthest distance from the mean. Clearly, such a distribution has a maximal mean quadratic deviation. Note that the number of distributions satisfying the first two conditions is uncountable. Thus we cannot employ a direct search strategy.
Our searching strategy builds on the observation that only the distributions that localise most of the mass as far as possible from the mean (i.e., maximising the mean quadratic deviation and still meeting the bounds π ⊥ , π ), have to be considered. These distributions can be linearly ordered with respect to the sum of mass x localised at the low population end of the state space. It can be shown that the function that evaluates P ost on all these distributions is piece-wise quadratic with respect to x and has O(|S|) segments. Therefore, O(|S|) many steps are sufficient to compute max P ost(π Cp ) | C p ∈ C .
To compute the minimal value min P ost(π Cp ) | C p ∈ C we proceed analogously, i.e., only the distributions that localise most of the mass as close as possible to the mean are considered. This leads again to a piece-wise quadratic function. It is also important to note that the perturbation space decomposition presented in the next section allows us to obtain the values P ost C (s, t) and P ost C ⊥ (s, t) with the desired precision.

VI Perturbation Space Decomposition Strategy
For the sake of simplicity, we present the parametric decomposition strategy only on the computation of π C,s,t and τ C,A,t for ∈ { , ⊥} and P, since it can be easily extended to the computation of D C Φ,P, for any formula Φ. The key part of the parametric decomposition is to decide when the inspected subspace should be further decomposed. The condition for the decomposition is different for π C,s,t and τ C,A,t . Since the vector π C,s,t gives us the transient probability distribution from the state s that is further used to compute D C Φ,P, , we consider the following condition. The space P is decomposed if during the computation of parameterised uniformisation in an iteration i it holds that: |S| k=1 π C,s,i (s k ) − |S| k=1 π C,s,i ⊥ (s k ) > Err where π C,s,i denotes the corresponding approximation of π C,s,0 · (Q unif(C) ) i .
In contrast to π C,s,t , the value τ C,A,t (s) for each state s ∈ S is further used to D C Φ,P, and thus we consider a different condition. The space P is decomposed if during the computation of parameterised uniformisation in an iteration i for any state s it holds that: τ C,A,i (s) − τ C,A,i ⊥ (s) > Err.
If the decomposition takes place we cancel the current computation and decompose the perturbation space P to n subspaces such that P = P 1 ∪ . . . ∪ P n . Each subspace P j defines a new set of CTMCs C j = {C j | j ∈ P j } that is independently processed in a new computation branch. Note that we could reuse the previous computation and continue from the iteration i − 1. However, the most significant part of the error is usually cummulated during the previous iterations and thus the decomposition would have only a negligible impact on error reduction.
A minimal decomposition with respect to the perturbation space P defines a minimal number of subspaces m such that P = P 1 ∪ . . . ∪ P m and for each subspace P j , where 1 ≤ j ≤ m, it holds that Err C,s Φ,Pj ≤ Err, where Err C,s Φ,Pj = D C Φ,Pj , − D C Φ,Pj ,⊥ . Note that the existence of such decomposition is guaranteed only if the evaluation function D C,s Φ over P is continuous. If the evaluation function is continuous there can exist more than one minimal decomposition. However, it cannot be straightforwardly found. To overcome this problem we have considered and implemented several heuristics allowing to iteratively compute a decomposition satisfying the following: (1) it ensures the required error bound whenever D C Φ over P is continuous, (2) it guarantees the refinement termination in the situation where D C,s Φ over P is not continuous and the discontinuity causes that Err cannot be achieved. To ensure the termination an additional parameter has to be introduced as a lower bound on the subspace size. Hence this parameter provides a supplementary termination criterion.