Reduction of multiscale stochastic biochemical reaction networks using exact moment derivation

Biochemical reaction networks (BRNs) in a cell frequently consist of reactions with disparate timescales. The stochastic simulations of such multiscale BRNs are prohibitively slow due to high computational cost for the simulations of fast reactions. One way to resolve this problem uses the fact that fast species regulated by fast reactions quickly equilibrate to their stationary distribution while slow species are unlikely to be changed. Thus, on a slow timescale, fast species can be replaced by their quasi-steady state (QSS): their stationary conditional expectation values for given slow species. As the QSS are determined solely by the state of slow species, such replacement leads to a reduced model, where fast species are eliminated. However, it is challenging to derive the QSS in the presence of nonlinear reactions. While various approximation schemes for the QSS have been developed, they often lead to considerable errors. Here, we propose two classes of multiscale BRNs which can be reduced by deriving an exact QSS rather than approximations. Specifically, if fast species constitute either a feedforward network or a complex balanced network, the reduced model based on the exact QSS can be derived. Such BRNs are frequently observed in a cell as the feedforward network is one of fundamental motifs of gene or protein regulatory networks. Furthermore, complex balanced networks also include various types of fast reversible bindings such as bindings between transcriptional factors and gene regulatory sites. The reduced models based on exact QSS, which can be calculated by the computational packages provided in this work, accurately approximate the slow scale dynamics of the original full model with much lower computational cost.

Since the deficiency of this network is n c − − r = 2 − 1 − 1 = 0 and it is reversible and hence weakly reversible as well, we know that there is a complex-balanced equilibrium and every equilibrium is complex balanced [1,2]. First, we can pick any potential steady state,λ = (1, 1, K), where K := κ f /κ b . Note that the choice of the steady state does not affect the final result. There are two conservations: S 1 + S 3 = p and S 2 + S 3 = n, where p and n are determined by the initial conditions. Thus, we have α 11 = α 13 = 1, α 22 = α 23 = 1, α 12 = α 21 = 0, β 1 = p, β 2 = n in (18). Due to the conservations, all p k should vanish except those corresponding to vectors k = (k 1 , k 2 , k 3 ) such that k 1 + k 3 = p and k 2 + k 3 = n. The set consisting of all such vectors is invariant, so if k 1 + k 3 = p and k 2 + k 3 = n 0 otherwise is a solution of the ssCME. In order to obtain a probability density, we must normalize them by the sum of these p k 's (i.e. the partition function, Z(p, n)). Because of the two conservations, the sum can be expressed in terms of just one of the indices, let us say k 1 .
Since k 1 + k 3 = p and k 3 ≥ 0, necessarily k 1 ≤ p. Since k 2 = n − k 3 = n + k 1 − p must be non-negative, we also have the constraint k 1 ≥ max{0, p − n}. Thus, p k is nonzero only when k 1 ∈ {max{0, p − n}, . . . , p} with k 2 = n + k 1 − p, k 3 = p − k 1 , and we have: PLOS S1/S7 where second equality comes from the convention that In particular, we have; Another way to derive Z(p, n) is using the following recursion formula, which can be obtained by using the computaional package MVPoisson from [3] based on Wilf-Zeilberger theory: Note that by symmetry, a recursion on p can be found by exchanging n and p. Furthermore, in terms of the Gauss's hypergeometric function 2 F 0 , we can also write: Since Z(p, n) is calculated, we can derive the conditional mean of the first species using (19): for p ≥ 1, n ≥ 0, and zero otherwise. For example, ϕ(1, n) = 1 Kn + 1 ϕ(2, n) = 2(Kn + 1) K 2 n 2 + (−K 2 + 2 K) n + 1 .
A Matlab code allowing for the calculation of (S3) is provided in this work.

A complex balanced network with competitive reversible bindings
Here, we apply (19) to calculate stationary conditional moments of another complex balanced network, this one with two competing reversible bindings: Since the deficiency of this network is n c − − r = 4 − 2 − 2 = 0 and it is weakly reversible, there is a complex-balanced equilibrium and every equilibrium is complex balanced [1,2]. The steady states of the associated deterministic system satisfy Note that following quantities are conserved: Subject to these constraints, we can pick the following partition function: Thu sum can be re-written as a double sum because S is equal to the following set due to the three conservations.
The procedure used to find the set S is a special case of an algorithmic approach described in the following section.
Using the equality S = S , we can rewrite the partition function as follows : where we may use either alternative expressions in terms of Q or Q defined as follows: The sum in Q is numerically better performed than that in Q when p is large and n is small. With a change of indices = p − , it can be shown that Q is the partition function Z(p, n) given by formula (S2) for the single binding example Thus, Q can also be written as in terms of 2 F 0 , Gauss's hypergeometric function. When n B = 0, the partition function becomes which is expected as when n B = 0 the species B can only be zero, so the system reduces to the previous example, with S 1 = A, S 2 = C, and S 3 = E. When n B = 1, we get Using this, the conditional mean of species D given the constraints (n A , 1, n C ) is derived by (19): .
Using Q, we may write, alternatively, and thus, cancelling the n A ! terms, and using Z(n A − 1, 0, n C ) = n A n A ! Q(n A − 1, n C ), which is far better behaved numerically when n A is large. A Matlab code allowing for the calculation of (S6) is provided in this work. Rather than the direct summation, partition functions and thus stationary moments can also be derived using the recursion method from [3]. For this example, a third-order recursion for Z can be obtained by the algorithm MVPoisson from [3]. In order to conveniently display the recurrences, let us use the following notations. We will write Z instead of Z(b 1 , b 2 , b 3 ), and a notation like Z +···+ i means a shift of the ith argument by the indicated number of plus signs. For example, Z ++ 3 means Z(b 1 , b 2 , b 3 + 2). There are three recurrences of order three, as follows, for each of the three arguments: The algorithm provides 27 initial conditions, the values of Z for the triples (1, 1, 1), (1,1,2), (1, 1, 3), . . . (3,3,3) listed in that order. We display them as three matrices, respectively shown below. The first matrix lists the elements of the form (1, , ), the next one (2, , ), and the last one (3, , ). In each matrix, elements are listed in the usual matrix order: ( , i, j) is the (i, j)th entry of the matrix.

Computing reduced sums
We remark that the reduced indices for the sums defining the partition function can be obtained in a more systematic form, through the use of Smith canonical forms. Suppose that P is a matrix in Z q×n that represents q conservation laws on n species. For instance, P =   1 0 0 1 1 0 1 0 1 0 0 0 1 0 1   in the competitive binding example. We assume, as in this and other examples, that q ≤ n and that the matrix P has full row rank q. Under this assumption, the integer matrix P can be represented in Smith canonical form (see for example [4]), meaning that there exist two unimodular (that is to say, invertible over the ring of integers) matrices U ∈ Z q×q and V ∈ Z n×n so that where ∆ = diag (δ 1 , . . . , δ q ), 0 is a q × (n − q) matrix of zeroes, and the δ i 's are the elementary divisors of the matrix P . The elementary divisors are unique up to sign change, there are formulas that express then in terms of the minors of P (see [4] for details). For example, for the above example, we have U = I (3 × 3 identity matrix), . In general, if we wish to find non-negative integer solutions of Ak = b, for a given (non-negative) integer vector b, we use that U P V V −1 k = U b, so, using the indices = V −1 k, which means that the last n − q indices are free, and the constraint V ≥ 0 is imposed to insure non-negativity of k. For instance, in the competitive binding example, and recalling that U = I and ∆ = I, the equation [∆ 0] = U b gives that 1 = b 1 , 2 = b 2 , 3 = b 3 , and 4 = i, 5 = j are arbitrary. Thus we can express the sum as a sum over the two indices k 4 = i and k 5 = j, with k 1 = b 1 − (i + j), k 2 = b 2 − i, and k 3 = b 3 − j.
The non-negativity condition V ≥ 0, applied with the above matrix V , says that these expressions must be non-negative: which means that the sum can be re-expressed as a sum over i ≥ 0, j ≥ 0, subject to i ≤ b 2 , j ≤ b 3 , and i + j ≤ b 1 . This is exactly the same as the set S computed by hand.