Adaptive Models for Gene Networks

Biological systems are often treated as time-invariant by computational models that use fixed parameter values. In this study, we demonstrate that the behavior of the p53-MDM2 gene network in individual cells can be tracked using adaptive filtering algorithms and the resulting time-variant models can approximate experimental measurements more accurately than time-invariant models. Adaptive models with time-variant parameters can help reduce modeling complexity and can more realistically represent biological systems.

An ordinary differential equation (ODE) for simple gene regulation (gene u activates gene y) can be expressed as [1]:

$$\frac{dy(t)}{dt} = F_{max}\,\frac{u(t)^n}{K^n + u(t)^n} - p_y\,y(t) \tag{1}$$

where u(t) and y(t) are the concentrations of proteins u and y as functions of time t. F_max is the maximal production rate of the y protein (in units of concentration per unit time), which is reached when u(t) >> K. K is the concentration of u at which production of the y protein is half-maximal, and n is the Hill coefficient. p_y is a degradation/dilution parameter that sets the rate at which y decreases. Eq. 1 can be transformed into a linearized form as [1]:

$$\frac{dy(t)}{dt} = p_{uy}\,u(t) - p_y\,y(t) \tag{2}$$

where p_uy is a parameter that determines the effect of the u protein on the production of the y protein.
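The definitions above can be checked numerically with a minimal sketch. It assumes the production term is the standard Hill function F_max u^n / (K^n + u^n) and that degradation enters as -p_y y(t); the function names and all parameter values are hypothetical and chosen only for illustration.

```python
def hill_production(u, f_max, k, n):
    """Production rate of y for activator concentration u (Hill function)."""
    return f_max * u**n / (k**n + u**n)


def dy_dt_hill(u, y, f_max, k, n, p_y):
    """Right-hand side of the nonlinear ODE: production minus degradation/dilution."""
    return hill_production(u, f_max, k, n) - p_y * y


def dy_dt_linear(u, y, p_uy, p_y):
    """Right-hand side of the linearized ODE."""
    return p_uy * u - p_y * y


# Hypothetical parameter values, not taken from the source.
F_MAX, K, N, P_Y = 2.0, 0.5, 4, 0.1

half_max_rate = hill_production(K, F_MAX, K, N)         # production at u = K
saturated_rate = hill_production(100 * K, F_MAX, K, N)  # production at u >> K
linear_rate = dy_dt_linear(1.0, 2.0, p_uy=0.4, p_y=0.2)

print(half_max_rate, saturated_rate, linear_rate)
```

At u = K the production rate is half of F_max, and for u much larger than K it approaches F_max, which matches the roles of K and F_max described in the text.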
Denoting p53 as u(t) and MDM2 as y(t), the dynamics of the p53-MDM2 negative feedback loop can be expressed as a pair of linear differential equations, Eq. 3 and Eq. 4 [2]. Eq. 3 and Eq. 4 can be discretized using Euler's method (h: discrete-time unit) to give Eq. 5 and Eq. 6, where i is the iteration index. Substituting Eq. 5 into Eq. 6 gives Eq. 7. If we further incorporate an error term e(i) into Eq. 7 to model approximation errors, we find that the discrete-time linear difference equation derived from Geva-Zatorsky's continuous-time linear differential equation [2] is a 3rd-order ARX model.
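The discretize-then-substitute argument can be illustrated with a short numerical check. This is a sketch under stated assumptions, not the source's exact equations: it assumes a generic linear two-variable system, du/dt = a_uu u + a_uy y and dy/dt = p_uy u - p_y y, where a_uu and a_uy are hypothetical coefficient names and all numeric values are made up. The check confirms that Euler discretization followed by substitution yields a difference equation in y(i-1), y(i-2), and u(i-2), which is an ARX structure with n_a = 2, n_b = 1, n_k = 2.

```python
import numpy as np

# Hypothetical coefficients (assumptions for illustration only).
h = 0.1                  # discrete-time unit
a_uu, a_uy = -0.1, -0.5  # assumed p53 dynamics; negative a_uy models repression by MDM2
p_uy, p_y = 0.4, 0.2     # MDM2 production by p53 and MDM2 degradation/dilution
steps = 50

# Euler discretization of the two coupled linear ODEs.
u = np.empty(steps + 1)
y = np.empty(steps + 1)
u[0], y[0] = 1.0, 0.5
for i in range(steps):
    u[i + 1] = u[i] + h * (a_uu * u[i] + a_uy * y[i])
    y[i + 1] = y[i] + h * (p_uy * u[i] - p_y * y[i])

# Substituting the u update into the y update gives
#   y(i) = c1*y(i-1) + c2*y(i-2) + d*u(i-2)
c1 = 1.0 - h * p_y
c2 = h * h * p_uy * a_uy
d = h * p_uy * (1.0 + h * a_uu)

y_from_arx = c1 * y[1:-1] + c2 * y[:-2] + d * u[:-2]
max_residual = np.max(np.abs(y[2:] - y_from_arx))
print(max_residual)
```

In the sign convention used by ARX models, the autoregressive coefficients would be a_1 = -c1 and a_2 = -c2, and the input coefficient b_1 = d. Adding an error term e(i) to the right-hand side turns the deterministic difference equation into an ARX model.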
Using the MATLAB System Identification Toolbox (MathWorks, USA), we tested 1,000 combinations of n_a, n_b, and n_k values, each ranging from 1 to 10 (10 x 10 x 10 = 1,000). The best-performing model order was selected based on the unexplained output variance (the vertical axis in Fig. S1), which is the ratio between the prediction error variance and the output variance, in percent [3]. Unexplained output variance is the portion of the output not explained by the model. Its relationship to the Best Fit score follows from the definition of the latter:

$$\text{Best Fit} = \left(1 - \frac{\lVert y - \hat{y} \rVert}{\lVert y - \bar{y} \rVert}\right) \times 100$$

where y is the measured output (MDM2) vector, ŷ is the estimated model output vector, and ȳ is the vector with all entries equal to the mean of the data vector y. As the unexplained output variance increases, the Best Fit score decreases, and vice versa.
For our case, the model order with the least unexplained output variance was 4 (n_a = 1, n_b = 3) (Fig. S1). The figure also shows that low-order models (orders between 2 and 4) perform similarly, suggesting that the order may not play a critical role in that range. Performance degrades as the order increases further, indicating that unnecessarily complex models can hurt performance. Since a good model is one that is simple and performs well enough, small model orders such as 2 or 3 may be chosen in real applications.
Figure S1. Model order vs. unexplained output variance (estimation error measure).
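The order sweep described above can be sketched in Python with ordinary least squares. This is not the MATLAB toolbox procedure used in the study; it is a simplified stand-in that assumes a plain least-squares ARX fit on synthetic data, with hypothetical function names, a reduced 3 x 3 x 3 grid, and one-step-ahead predictions as the model output. The Best Fit function uses the normalized-error definition given in terms of y, ŷ, and ȳ.

```python
import numpy as np


def arx_regressors(u, y, na, nb, nk):
    """Build the regression matrix for y(i) + a1*y(i-1) + ... = b1*u(i-nk) + ... + e(i)."""
    start = max(na, nk + nb - 1)  # first index with all required lags available
    rows = [
        [-y[i - j] for j in range(1, na + 1)] + [u[i - nk - j] for j in range(nb)]
        for i in range(start, len(y))
    ]
    return np.array(rows), y[start:]


def fit_arx(u, y, na, nb, nk):
    """Least-squares estimate of [a1..a_na, b1..b_nb] and one-step predictions."""
    phi, target = arx_regressors(u, y, na, nb, nk)
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)
    return theta, phi @ theta, target


def best_fit(y, y_hat):
    """Best Fit score in percent: 100 * (1 - ||y - y_hat|| / ||y - mean(y)||)."""
    return 100.0 * (1.0 - np.linalg.norm(y - y_hat) / np.linalg.norm(y - y.mean()))


def unexplained_variance(y, y_hat):
    """Prediction error variance over output variance, in percent."""
    return 100.0 * np.var(y - y_hat) / np.var(y)


# Synthetic data from a known ARX system (n_a = 2, n_b = 1, n_k = 2); values are made up.
rng = np.random.default_rng(0)
u = rng.normal(size=300)
y = np.zeros(300)
for i in range(2, 300):
    y[i] = 1.2 * y[i - 1] - 0.5 * y[i - 2] + 0.8 * u[i - 2] + 0.05 * rng.normal()

# Sweep a small grid of orders and record the unexplained output variance for each.
scores = {}
for na in range(1, 4):
    for nb in range(1, 4):
        for nk in range(1, 4):
            _, y_hat, target = fit_arx(u, y, na, nb, nk)
            scores[(na, nb, nk)] = unexplained_variance(target, y_hat)

best_order = min(scores, key=scores.get)
theta_212, y_hat_212, target_212 = fit_arx(u, y, 2, 1, 2)
print(best_order, scores[best_order], best_fit(target_212, y_hat_212))
```

Because each order uses a slightly different number of usable samples, the scores are only approximately comparable; a more careful comparison would evaluate all candidates on the same sample range or on held-out data.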
The equation and parameter values of the 4th-order model (n_a = 1, n_b = 3, n_k = 2) are shown below.
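To make the structure implied by these orders concrete, the following sketch computes one-step-ahead predictions for a generic ARX model. With n_a = 1, n_b = 3, and n_k = 2, the prediction of y(i) uses y(i-1), u(i-2), u(i-3), and u(i-4). The function name and the coefficient values are hypothetical placeholders, not the fitted values from the study.

```python
import numpy as np


def arx_one_step(u, y, a, b, nk):
    """One-step-ahead ARX prediction.

    Implements y_hat(i) = -sum_j a[j]*y(i-1-j) + sum_j b[j]*u(i-nk-j),
    with len(a) = n_a and len(b) = n_b. Entries without enough history are NaN.
    """
    na, nb = len(a), len(b)
    start = max(na, nk + nb - 1)
    y_hat = np.full(len(y), np.nan)
    for i in range(start, len(y)):
        autoregressive = sum(a[j] * y[i - 1 - j] for j in range(na))
        exogenous = sum(b[j] * u[i - nk - j] for j in range(nb))
        y_hat[i] = -autoregressive + exogenous
    return y_hat


# Placeholder coefficients for the (n_a = 1, n_b = 3, n_k = 2) structure.
a = [-0.5]           # a_1
b = [1.0, 0.0, 0.0]  # b_1, b_2, b_3
u = np.arange(10.0)
y = np.ones(10)

y_hat = arx_one_step(u, y, a, b, nk=2)
print(y_hat)
```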
The equation and parameter values of the 3rd-order model (n_a = 2, n_b = 1, n_k = 2) described in Supplementary Note 1 are shown below.
Figure S2 illustrates that performance does not improve with other commonly used model structures such as ARMAX, Box-Jenkins, output-error, and state-space (the Best Fit scores are shown in parentheses) [3]. This suggests that changing the model structure alone is not sufficient to increase the Best Fit score.

Supplementary Note 4: Instructions for Using AFGN.exe
As supplementary material, we provide a LabVIEW-based Windows application that can be used to run the simulated experiments described in the main text (Fig. S3).
1. Execute AFGN.exe and load the p53 and MDM2 data files (Fig. S3).
2. Select the ARX model order (n_a, n_b, and n_k). The default values are n_a = 1, n_b = 8, and n_k = 1.
5. The measurement noise variance R is required only for the Kalman filter (KF). The default value is 1.
6. Run the application by clicking the arrow at the top (Fig. S5). Parameter tracking can be observed below the control panel.
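For readers without access to the application, the kind of parameter tracking it displays can be approximated with a short Kalman filter sketch. This is not the AFGN.exe implementation. It assumes the ARX parameters follow a random walk with process noise variance q (an assumed tuning parameter not described in the steps above) and that the output is observed with measurement noise variance R. The data are synthetic, and the function name and all values are hypothetical.

```python
import numpy as np


def kalman_track(phi, y, q, r, p0=10.0):
    """Track time-varying parameters theta(i) in y(i) = phi(i) @ theta(i) + v(i).

    Assumes a random-walk parameter model with process noise covariance q*I
    and scalar measurement noise variance r.
    """
    n_samples, n_params = phi.shape
    theta = np.zeros(n_params)
    P = p0 * np.eye(n_params)
    Q = q * np.eye(n_params)
    history = np.empty((n_samples, n_params))
    for i in range(n_samples):
        x = phi[i]
        P = P + Q                          # time update (random walk)
        innovation = y[i] - x @ theta      # one-step prediction error
        gain = P @ x / (x @ P @ x + r)     # Kalman gain
        theta = theta + gain * innovation  # measurement update
        P = P - np.outer(gain, x) @ P
        history[i] = theta
    return history


# Synthetic first-order ARX data whose a_1 parameter jumps halfway through.
rng = np.random.default_rng(1)
n = 400
u = rng.normal(size=n)
a_true = np.where(np.arange(n) < 200, -0.5, -0.8)
b_true = 1.0
y = np.zeros(n)
for i in range(1, n):
    y[i] = -a_true[i] * y[i - 1] + b_true * u[i - 1] + 0.1 * rng.normal()

# Regressors [-y(i-1), u(i-1)] correspond to parameters [a_1, b_1].
phi = np.column_stack([-y[:-1], u[:-1]])
history = kalman_track(phi, y[1:], q=1e-4, r=0.01)

before_jump = history[198]  # estimate just before the parameter change
after_jump = history[-1]    # estimate at the end of the record
print(before_jump, after_jump)
```

A larger R makes the filter trust the measurements less and adapt more slowly, while a larger q lets the estimates follow parameter changes faster at the cost of noisier tracking.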