Abstract
Proper regulation of cell signaling and gene expression is crucial for maintaining cellular function, development, and adaptation to environmental changes. Reaction dynamics in cell populations is often noisy because of (i) inherent stochasticity of intracellular biochemical reactions (“intrinsic noise”) and (ii) heterogeneity of cellular states across different cells that are influenced by external factors (“extrinsic noise”). In this work, we introduce an extrinsic-noise-driven neural stochastic differential equation (END-nSDE) framework that utilizes the Wasserstein distance to accurately reconstruct SDEs from stochastic trajectories measured across a heterogeneous population of cells (extrinsic noise). We demonstrate the effectiveness of our approach using both simulated and experimental data from three different systems in cell biology: (i) circadian rhythms, (ii) RPA-DNA binding dynamics, and (iii) NFκB signaling processes. Our END-nSDE reconstruction method can model how cellular heterogeneity (extrinsic noise) modulates reaction dynamics in the presence of intrinsic noise. It also outperforms existing time-series analysis methods such as recurrent neural networks (RNNs) and long short-term memory networks (LSTMs). By inferring cellular heterogeneities from data, our END-nSDE reconstruction method can reproduce noisy dynamics observed in experiments. In summary, the reconstruction method we propose offers a useful surrogate modeling approach for complex biophysical processes, where high-fidelity mechanistic models may be impractical.
Author summary
In this work, we propose extrinsic-noise-driven neural stochastic differential equations (END-nSDE) to reconstruct noisy regulated gene expression dynamics. One of our main contributions is that we generalize a recent Wasserstein-distance-based SDE reconstruction approach to incorporate extrinsic noise (parameters that vary across different cells). Our approach can thus capture how extrinsic noise (heterogeneity among cells) modulates intrinsic fluctuations in gene regulatory dynamics, offering an advantage over deterministic models and outperforming other benchmarks. By inferring noise intensities from batches of experimental data, our END-nSDE can partially capture noisy experimental signaling dynamics and provides a surrogate model for biomolecular processes that are too complex to model directly.
Citation: Zhang J, Li X, Guo X, You Z, Böttcher L, Mogilner A, et al. (2025) Reconstructing noisy gene regulation dynamics using extrinsic-noise-driven neural stochastic differential equations. PLoS Comput Biol 21(9): e1013462. https://doi.org/10.1371/journal.pcbi.1013462
Editor: Michael A. Beer, Johns Hopkins University School of Medicine, UNITED STATES OF AMERICA
Received: April 2, 2025; Accepted: August 24, 2025; Published: September 17, 2025
Copyright: © 2025 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: No data was created in this research. All data used in this research are publicly available at https://www.nature.com/articles/s41467-023-39579-y and https://www.embopress.org/doi/full/10.1038/s44320-024-00047-4 and have been properly cited. The simulated datasets, neural SDE model code, and analysis scripts to replicate the study findings are available on GitHub at https://github.com/JianchengZ/Neural-SDE-GeneDynamics.
Funding: XG acknowledges financial support from UCLA Collaboratory Fellowship. LB acknowledges financial support from hessian.AI and the ARO through grant W911NF-23-1-0129. TC acknowledges inspiring discussions at the “Statistical Physics and Adaptive Immunity” program at the Aspen Center for Physics, which is supported by the National Science Foundation grant PHY-2210452. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. XG and AH acknowledge support from NIH R01AI173214.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Reactions that control signaling and gene regulation are important for maintaining cellular function, development, and adaptation to environmental changes, which impact all aspects of biological systems, from embryonic development to an organism’s ability to sense and respond to environmental signals. Variations in gene regulation, arising from noisy biochemical processes [1,2], can result in phenotypic heterogeneity even in a population of genetically identical cells [3].
Noise within cell populations can be categorized as (i) “intrinsic noise,” which arises from the inherent stochasticity of biochemical reactions and quantifies, e.g., biological variability across cells in the same state [2,4,5], and (ii) “extrinsic noise,” which encompasses heterogeneities in environmental factors or differences in cell state across a population. A substantial body of literature has focused on quantifying intrinsic and extrinsic noise from experimental and statistical perspectives [1,2,6–13]. Experimental studies have specifically identified relevant sources of noise in various organisms, including E. coli (Escherichia coli), yeast, and mammalian systems [2,14–17].
Extrinsic noise is associated with uncertainties in biological parameters that vary across different cells. The distribution over physical and chemical parameters determines the observed variations in cell states, in the concentrations and locations of regulatory proteins and polymerases [1,2,18], and in transcription and translation rates [19]. For example, extrinsic noise is the main contributor to the variability of concentrations of oscillating p53 protein levels across cell populations [20]. On the other hand, intrinsic noise, i.e., inherent stochasticity of cells in the same state, can limit the accuracy of expression and signal transmission [2,5]. Based on the law of mass action [21,22], ordinary differential equations (ODEs) apply only in some deterministic or averaged limit and do not take into account intrinsic noise. Therefore, stochastic models are necessary to accurately represent biological processes, such as thermodynamic fluctuations inherent to molecular interactions within regulatory networks [1,5,18] or random event times in birth-death processes.
Existing stochastic modeling methods that account for intrinsic noise include Markov jump processes [23,24] and SDEs [25–27]. These approaches are applicable to different system sizes: Markov jump processes provide exact descriptions for discrete molecular systems, while SDEs serve as continuous approximations to Markov processes when molecular abundances are sufficiently high. SDE approaches may not be suitable for gene expression systems with very low copy numbers, where discrete master equation descriptions are more accurate. However, SDE approaches become more appropriate when modeling protein dynamics or when gene regulatory interactions are modeled implicitly through Hill functions. Additionally, a hierarchical Markov model was designed in [28] for parameter inference in dual-reporter experiments to separate the contributions of extrinsic noise, intrinsic noise, and measurement error when both extrinsic and intrinsic noise are present. The described methods have been effective in the reconstruction of low-dimensional noisy biological systems. Discrete master-equation methods to model the evolution of probabilities in systems characterizing, e.g., gene regulatory dynamics [29–31], can be computationally expensive and usually require specific forms of a stochastic model with unknown parameters that need to be inferred. It is unclear whether such methods and their generalizations can be applied to more complex (e.g., higher-dimensional) systems for which a mechanistic description of the underlying biophysical dynamics is not available or impractical.
SDEs can capture both the mean dynamics (as ODEs do) and random fluctuations, offering a practical and scalable alternative to master equations in complex systems. Thus, we introduce an extrinsic-noise-driven neural stochastic differential equation (END-nSDE) reconstruction method that builds upon a recently developed Wasserstein distance (W2 distance) nSDE reconstruction method [32]. Our method is used to identify macromolecular reaction kinetics and cell signaling dynamics from noisy observational data in the presence of both extrinsic and intrinsic noise. A key question we address in this paper is how extrinsic noise that characterizes cellular heterogeneity influences the overall stochastic dynamics of the population.
The major differences between the approach presented here and prior work [32] are: (i) the inclusion of extrinsic noise into the framework, allowing one to model cell-to-cell variability through parameter heterogeneity, and (ii) the ability of our method to learn the dependency of the SDE on those parameters, enabling reconstruction of a family of SDEs rather than a single SDE model. In contrast, the method developed in reference [32] focuses on reconstructing a single SDE without considering parameter variations or extrinsic noise sources. In Fig 1, we provide an overview of the specific applications that we study in this work.
A. Workflow for training and testing of the extrinsic-noise-driven neural SDE (END-nSDE). Predicted trajectories are simulated (see B) using a range of model parameters (see Sect 2.2) before splitting into training and testing sets (see Fig E in S1 Text for details on the splitting strategy). Model parameters and state variables serve as inputs to a neural network that reconstructs drift and diffusion terms (see C). Network weights are optimized by minimizing the Wasserstein distance (Eq 8) between the training set and predicted trajectories. B. Predicted trajectories are generated by the reconstructed SDE. C. The drift and diffusion functions, $\hat{f}$ and $\hat{\sigma}$, are approximated using parameterized neural networks. The parameterized neural-network-based drift function $\hat{f}$ and diffusion function $\hat{\sigma}$ take the system state $\hat{X}(t)$ and the biological parameters ω as inputs. D. Table of three examples illustrating the nSDE input, along with training and testing datasets. For the last (NFκB) example, a more detailed workflow for validation on experimental datasets is illustrated in Fig 8.
Our approach employs neural networks as SDE approximators in conjunction with the torchsde package [33,34] for reconstructing noisy dynamics from data. Previous work showed that for SDE reconstruction tasks, the W2 distance nSDE reconstruction method outperforms other benchmark methods such as generative adversarial networks [32,35]. Compared to other probabilistic metrics such as the KL divergence, the Wasserstein distance better incorporates the metric structure of the underlying space. This geometric property makes the Wasserstein distance particularly suitable for trajectory and image data on high-dimensional manifolds, where the supports of different distributions do not always overlap [36]. Additionally, the W2-distance-based nSDE reconstruction method can directly extract the underlying SDE from temporal trajectories without requiring specific mathematical forms of the terms in the underlying SDE model. We apply our END-nSDE methodology to three biological processes: (i) circadian clocks, (ii) RPA-DNA binding dynamics, and (iii) NFκB signaling to illustrate the effectiveness of the END-nSDE method in predicting how extrinsic noise modulates stochastic dynamics with intrinsic noise. Additionally, our method demonstrates superior performance compared to several time-series modeling methods including recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and Gaussian processes. In summary, the reconstruction method we propose provides a useful surrogate modeling approach for complex biophysical and biochemical processes, especially in scenarios where high-fidelity mechanistic models are impractical.
2. Methods and models
In this work, we extend the temporally decoupled squared W2-distance SDE reconstruction method proposed in Refs. [32,37] to reconstruct noisy dynamics across a heterogeneous cell population (“extrinsic noise”). Our goal is to not only reconstruct SDEs for approximating noisy cellular signaling dynamics from time-series experimental data, but to also quantify how heterogeneous biological parameters, such as enzyme- or kinase-mediated biochemical reaction rates, affect such noisy cellular signaling dynamics.
2.1. SDE reconstruction with heterogeneities in biological parameters
The W2-distance-based neural SDE reconstruction method proposed in [32] aims to approximate the SDE

\[ \mathrm{d}X(t) = f(X(t), t)\,\mathrm{d}t + \sigma(X(t), t)\,\mathrm{d}B(t) \qquad (1) \]

using an approximated SDE

\[ \mathrm{d}\hat{X}(t) = \hat{f}\big(\hat{X}(t), t; \Theta_1\big)\,\mathrm{d}t + \hat{\sigma}\big(\hat{X}(t), t; \Theta_2\big)\,\mathrm{d}\hat{B}(t), \qquad (2) \]

where $\hat{f}(\cdot, \cdot; \Theta_1)$ and $\hat{\sigma}(\cdot, \cdot; \Theta_2)$ are two parameterized neural networks that approximate the drift and diffusion functions $f$ and $\sigma$ in Eq (1), respectively. These two neural networks are trained by minimizing a temporally decoupled squared W2-distance loss function

\[ \widetilde{W}_2^2(\mu, \hat{\mu}) := \int_0^T \inf_{\pi \in \Pi(\mu_t, \hat{\mu}_t)} \mathbb{E}_{(X(t), \hat{X}(t)) \sim \pi} \Big[ \big\| X(t) - \hat{X}(t) \big\|^2 \Big]\,\mathrm{d}t, \qquad (3) \]

where $\Pi(\mu_t, \hat{\mu}_t)$ denotes the set of all coupling distributions π of two distributions $\mu_t$ and $\hat{\mu}_t$ on the probability space $\big(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d)\big)$, and $X(t)$ and $\hat{X}(t)$ are the observed trajectories at time t and trajectories generated by the approximate SDE model Eq (2) at time t, respectively. μ and $\hat{\mu}$ are the probability distributions associated with the stochastic processes $X$ and $\hat{X}$, respectively, while $\mu_t$ and $\hat{\mu}_t$ are the probability distributions of $X(t)$ and $\hat{X}(t)$ at a specific time t. A coupling $\pi \in \Pi(\mu_t, \hat{\mu}_t)$ between $\mu_t$ and $\hat{\mu}_t$ is defined by

\[ \pi(A \times \mathbb{R}^d) = \mu_t(A), \qquad \pi(\mathbb{R}^d \times A) = \hat{\mu}_t(A), \qquad \forall A \in \mathcal{B}(\mathbb{R}^d), \qquad (4) \]

where $\mathcal{B}(\mathbb{R}^d)$ is the Borel σ-algebra on $\mathbb{R}^d$ and $\mathbb{E}_{(X(t), \hat{X}(t)) \sim \pi}$ represents the expectation when $(X(t), \hat{X}(t)) \sim \pi$.

The term in Eq (3) is denoted as the temporally decoupled squared W2 distance loss function. For simplicity, in this paper, we shall also denote Eq (3) as the squared W2 loss. The infimum is taken over all possible coupling distributions $\pi \in \Pi(\mu_t, \hat{\mu}_t)$, and $\|\cdot\|$ denotes the $\ell^2$ norm of a vector. That is,

\[ \|v\|^2 = \sum_{i=1}^{d} v_i^2. \qquad (5) \]
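For a scalar observable sampled on a shared time grid, the loss in Eq (3) can be estimated from finite samples by matching sorted values at each time point, since the optimal coupling of two one-dimensional empirical distributions with equal sample sizes pairs order statistics. The following NumPy sketch illustrates this estimator; the array layout, uniform time grid, and restriction to a scalar observable are our assumptions rather than details of the implementation in [32].

```python
import numpy as np

def decoupled_w2_squared(obs, sim, dt=1.0):
    """Estimate the temporally decoupled squared W2 loss (Eq (3)).

    obs, sim : arrays of shape (n_trajectories, n_timepoints) holding a scalar
               observable on a shared, uniform time grid.
    dt       : spacing of the time grid (the time integral becomes a sum).
    """
    # For 1-D marginals, the squared W2 distance between two empirical
    # distributions with equal sample sizes is the mean squared difference
    # of their order statistics at each time point.
    obs_sorted = np.sort(obs, axis=0)
    sim_sorted = np.sort(sim, axis=0)
    w2_sq_per_time = np.mean((obs_sorted - sim_sorted) ** 2, axis=0)
    return np.sum(w2_sq_per_time) * dt

# Example: two ensembles of Brownian paths with different diffusivities.
rng = np.random.default_rng(0)
n_traj, n_time = 200, 50
obs = np.cumsum(0.1 * rng.standard_normal((n_traj, n_time)), axis=1)
sim = np.cumsum(0.2 * rng.standard_normal((n_traj, n_time)), axis=1)
print(decoupled_w2_squared(obs, sim, dt=0.1))
```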
Across different cells, extrinsic noise or cellular heterogeneities, such as differences in kinase or enzyme abundances resulting from cellular variabilities, can lead to variable, cell-specific gene regulatory dynamics. Such heterogeneous and stochastic gene expression (both intrinsic and extrinsic noise) can be modeled using SDEs with distributions of parameter values reflecting cellular heterogeneity. To address heterogeneities in gene dynamics across different cells, we propose an END-nSDE method that is able to reconstruct a family of SDEs for the same gene expression process under different parameters. Specifically, for a given set of (biological) parameters ω, we are interested in reconstructing

\[ \mathrm{d}X(t; \omega) = f\big(X(t;\omega), t; \omega\big)\,\mathrm{d}t + \sigma\big(X(t;\omega), t; \omega\big)\,\mathrm{d}B(t) \qquad (6) \]

using the approximate SDE

\[ \mathrm{d}\hat{X}(t; \omega) = \hat{f}\big(\hat{X}(t;\omega), t, \omega; \Theta_1\big)\,\mathrm{d}t + \hat{\sigma}\big(\hat{X}(t;\omega), t, \omega; \Theta_2\big)\,\mathrm{d}\hat{B}(t), \qquad (7) \]

in the sense that the errors in the reconstructed drift and diffusion functions, $\|f - \hat{f}\|$ and $\|\sigma - \hat{\sigma}\|$, for all different values of ω will be minimized. In Eq (7), $\hat{f}$ and $\hat{\sigma}$ are represented by two parameterized neural networks that take both the state variable $\hat{X}(t;\omega)$ and the parameters ω as inputs. To train these two neural networks, we propose an extrinsic-noise-driven temporally decoupled squared W2 distance loss function

\[ \sum_{\omega \in \Lambda} \widetilde{W}_2^2\big(\mu^{\omega}, \hat{\mu}^{\omega}\big), \qquad (8) \]

where $\mu^{\omega}$ and $\hat{\mu}^{\omega}$ are the distributions of the trajectories $X(t;\omega)$ and $\hat{X}(t;\omega)$, and $\widetilde{W}_2^2$ is the temporally decoupled squared W2 loss function in Eq (3). Λ denotes the set of parameters ω. Eq (8) is different from the local squared W2 loss in Refs. [38,39] since we do not require a continuous dependence of the dynamics on the parameter ω nor do we require that ω is a continuous variable. The extrinsic-noise-driven temporally decoupled squared W2 loss function Eq (8) takes into account both parameter heterogeneity and intrinsic fluctuations as a result of the Wiener processes $B(t)$ and $\hat{B}(t)$ in Eqs (1) and (2).
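The extrinsic-noise-driven loss in Eq (8) then aggregates this per-parameter loss over the finite set Λ. A short sketch that reuses `decoupled_w2_squared` from the previous snippet; the dictionary-based grouping by ω is an illustrative assumption.

```python
def end_w2_loss(obs_by_omega, sim_by_omega, dt=1.0):
    """Sum the temporally decoupled squared W2 loss over parameter groups (Eq (8)).

    obs_by_omega, sim_by_omega : dicts mapping a parameter tuple omega to an array
                                 of trajectories generated with that omega.
    Requires decoupled_w2_squared from the previous sketch.
    """
    return sum(
        decoupled_w2_squared(obs_by_omega[omega], sim_by_omega[omega], dt)
        for omega in obs_by_omega
    )
```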
Our END-nSDE method is outlined in Fig 1A–1C. With observed noisy single-cell dynamic trajectories as the training data, we train two parameterized neural networks [40] by minimizing Eq (8) to approximate the drift and diffusion terms in the SDE. The reconstructed nSDE is a surrogate model of single-cell dynamics (see Fig 1B and 1C). The hyperparameters and settings for training the neural SDE model are summarized in Table A in S1 Text. Through the examples outlined in Fig 1D, we will show that our W2-distance-based method can yield very small errors in the reconstructed drift and diffusion functions $\hat{f}$ and $\hat{\sigma}$.
Algorithm 1 END-nSDE training and prediction framework.
Obtain training trajectories (simulated or experimental time-series data). Set the maximum number of training epochs $N_{\mathrm{epoch}}$.
Preprocess the relevant training trajectories by grouping them according to different biophysical parameters ω.
Phase 1: Training
for epoch $= 1, \ldots, N_{\mathrm{epoch}}$ do
  Input the initial state and the parameters ω into the END-nSDE to generate new predictions $\hat{X}(t;\omega)$.
  Calculate the loss function in Eq (8) and perform gradient descent to train the END-nSDE model.
end for
return the trained END-nSDE model
Phase 2: Prediction
Input initial condition and corresponding noise parameters ω from testing data into the trained END-nSDE model.
Generate predicted trajectories from the learned model.
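A compact PyTorch/torchsde sketch of the training phase of Algorithm 1. The network architecture, optimizer settings, diagonal-noise assumption, and the per-time sorted-sample W2 estimate applied to a single observed coordinate are our simplifying assumptions and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torchsde  # pip install torchsde

class ENDnSDE(nn.Module):
    """Neural SDE whose drift and diffusion take the state and the parameters omega as inputs."""
    noise_type = "diagonal"   # one independent Wiener process per state dimension
    sde_type = "ito"

    def __init__(self, state_dim, omega_dim, hidden=64):
        super().__init__()
        self.drift_net = nn.Sequential(
            nn.Linear(state_dim + omega_dim, hidden), nn.Tanh(), nn.Linear(hidden, state_dim))
        self.diff_net = nn.Sequential(
            nn.Linear(state_dim + omega_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim), nn.Softplus())
        self.omega = None  # set to a (batch, omega_dim) tensor before each solver call

    def f(self, t, y):  # drift
        return self.drift_net(torch.cat([y, self.omega], dim=-1))

    def g(self, t, y):  # diagonal diffusion
        return self.diff_net(torch.cat([y, self.omega], dim=-1))

def w2_sq(x, y):
    """Differentiable sorted-sample estimate of the decoupled squared W2 loss for
    a scalar observable; x and y have shape (time, batch)."""
    xs, _ = torch.sort(x, dim=1)
    ys, _ = torch.sort(y, dim=1)
    return ((xs - ys) ** 2).mean(dim=1).sum()

# Phase 1 (training) sketch.  train_data is a list of tuples (omega, y0, traj) with
# omega (batch, omega_dim), y0 (batch, state_dim), traj (time, batch) for one observed readout.
def train(model, train_data, ts, n_epochs=100, lr=1e-3, observed_dim=0):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_epochs):
        for omega, y0, traj in train_data:
            model.omega = omega
            pred = torchsde.sdeint(model, y0, ts, method="euler")   # (time, batch, state_dim)
            loss = w2_sq(pred[..., observed_dim], traj)             # per-group term of Eq (8)
            opt.zero_grad(); loss.backward(); opt.step()
    return model
```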
2.2. Biological models
We consider three biological examples where stochastic dynamics play a critical role and use our END-nSDE method to reconstruct noisy single-cell gene expression dynamics under both intrinsic and extrinsic noise (also summarized in Fig 1D). In these applications, we investigate the extent to which the END-nSDE can efficiently capture and infer changes in the dynamics driven by extrinsic noise.
2.2.1. Noisy oscillatory circadian clock model.
Circadian clocks, typically with a period of approximately 24 hours, are ubiquitous, intrinsically noisy biological rhythms generated at the single-cell molecular level [41].
We consider a minimal SDE model of the periodic gene dynamics responsible for per gene expression, which is critical in the circadian cycle. Since per gene expression is subject to intrinsic noise [42], we describe it using a linear damped-oscillator SDE

\[ \mathrm{d}x = \big(-\gamma x - \omega_0 y\big)\,\mathrm{d}t + \sigma_x(x, y)\,\mathrm{d}B_1(t), \qquad \mathrm{d}y = \big(\omega_0 x - \gamma y\big)\,\mathrm{d}t + \sigma_y(x, y)\,\mathrm{d}B_2(t), \qquad (9) \]

where x and y are the dimensionless concentrations of the per mRNA transcript and the corresponding per protein, respectively. $B_1(t)$ and $B_2(t)$ are two independent Wiener processes and the parameters $\gamma$ and $\omega_0$ denote the damping rate and angular frequency, respectively. A stability analysis at the steady state $(x^*, y^*) = (0, 0)$ in the noise-free case ($\sigma_x = \sigma_y = 0$ in Eq (9)) reveals that the real parts of the eigenvalues of the Jacobian matrix at $(0, 0)$ are all negative, indicating that the origin is a stable steady state when the system is noise-free. Noise prevents the state (x(t), y(t)) from remaining at (0,0); thus, fluctuations in the single-cell circadian rhythm are noise-induced [42].

To showcase the effectiveness of our proposed END-nSDE method, we take different forms of the diffusion functions $\sigma_x$ and $\sigma_y$ in Eq (9), accompanied by different values of the noise strength and of the correlation between the diffusion functions in the dynamics of x and y.
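Training trajectories for Eq (9) can be generated with a simple Euler-Maruyama scheme, as in the sketch below. The parameter values, time step, and the particular state-dependent diffusion (in which the correlation parameter c couples the noise amplitude of one variable to the magnitude of the other) are illustrative assumptions and do not reproduce Eqs (11)-(13).

```python
import numpy as np

def simulate_circadian(gamma=0.1, omega0=2 * np.pi / 24.0, sigma=0.1, c=0.5,
                       x0=1.0, y0=0.0, t_max=72.0, dt=0.01, rng=None):
    """Euler-Maruyama simulation of the damped-oscillator SDE (Eq (9)).

    All parameter values and the diffusion form below are illustrative placeholders.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = int(t_max / dt)
    x, y = np.empty(n + 1), np.empty(n + 1)
    x[0], y[0] = x0, y0
    for i in range(n):
        dB1, dB2 = rng.standard_normal(2) * np.sqrt(dt)   # independent Wiener increments
        # drift: damped rotation about the origin; diffusion: state-dependent amplitude,
        # with c coupling the fluctuations of the two variables
        x[i + 1] = x[i] + (-gamma * x[i] - omega0 * y[i]) * dt \
                   + sigma * (abs(x[i]) + c * abs(y[i])) * dB1
        y[i + 1] = y[i] + (omega0 * x[i] - gamma * y[i]) * dt \
                   + sigma * (abs(y[i]) + c * abs(x[i])) * dB2
    return np.linspace(0.0, n * dt, n + 1), x, y

t, x, y = simulate_circadian()
```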
2.2.2. RPA-DNA binding model.
Regulation of gene expression relies on complex interactions between proteins and DNA, often described by the kinetics of binding and dissociation. Replication protein A (RPA) plays a pivotal role in various DNA metabolic pathways, including DNA replication and repair, through its dynamic binding with single-stranded DNA (ssDNA) [43–46]. By modulating the accessibility of ssDNA, RPA regulates multiple biological mechanisms and functions, acting as a critical regulator within the cell [47]. Understanding the dynamics of RPA-ssDNA binding is therefore a research area of considerable biological interest and significance.
Multiple binding modes and volume exclusion effects complicate the modeling of RPA-ssDNA dynamics. The RPA first binds to ssDNA in 20 nucleotide (nt) mode, which occupies 20nt of the ssDNA. When the subsequent 10nt of ssDNA is free, 20nt-mode RPA can transform to 30nt-mode, further stabilizing its binding to ssDNA, as illustrated in Fig 2. Occupied ssDNA is not available for other proteins to bind. Consequently, the gap size between adjacent ssDNA-bound RPAs determines the ssDNA accessibility to other proteins.
The possible steps in the biomolecular kinetics of multiple RPA molecules binding to ssDNA. The RPA in the free solution can bind to ssDNA with rate k1 provided there are at least 20 nucleotides (nt) of consecutive unoccupied sites. This bound “20nt mode” RPA unbinds with rate k−1. When space permits, the 20nt-mode RPA can extend and bind an additional 10nt of DNA at a rate of k2, converting it to a 30nt-mode bound protein. The 30nt-mode RPA transforms back to 20nt-mode spontaneously with the rate k−2. However, when the gap is not large enough to accommodate the RPA, the binding or conversion is prohibited (the corresponding rates k1 and k2 are effectively zero).
Mean-field mass-action type chemical kinetic ODE models cannot describe the process very well because they do not capture the intrinsic stochasticity. A stochastic model that tracks the fraction of two different binding modes of RPA, 20nt-mode (x1) and 30nt-mode (x2), has been developed to capture the dynamics of this process. A brute-force approach using forward stochastic simulation algorithms (SSAs) [48] was then used to fit the model to experimental data [47]. However, a key challenge in this approach is that the model is nondifferentiable with respect to the kinetic parameters, making it difficult to estimate parameters. Yet, simple spatially homogeneous stochastic chemical reaction systems can be well approximated by a corresponding SDE of the form given in Eq (1) when the variables are properly scaled in the large system size limit [49]. While interparticle interactions shown in Fig 2 make it difficult to find a closed-form SDE approximation, results from [49] motivate the possibility of an SDE approximation for the RPA-ssDNA binding model in terms of the variables x1 and x2.
Here, to address the non-differentiability issue associated with the underlying Markov process, we use our END-nSDE model to construct a differentiable surrogate for SSAs, allowing it to be readily trained from data. Further details on the models and data used in this study are provided in Appendix B of S1 Text. Throughout our analysis of RPA-DNA binding dynamics, we benchmark the SDE reconstructed by our extended W2-distance approach against those found using other time series analysis and reconstruction methods such as the Gaussian process, RNN, LSTM, and the neural ODE model. We show that our surrogate SDE model is most suitable for approximating the RPA-DNA binding process because it can capture the intrinsic stochasticity in the dynamics.
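For reference, a brute-force SSA for the process sketched in Fig 2 can be written as a Gillespie simulation over an explicit nucleotide lattice. The following sketch is a simplified, illustrative implementation: the DNA length, rate constants, and the convention that binding occurs at rate k1 per available 20-nt stretch are our assumptions, not the experimentally fitted parameters of [47].

```python
import numpy as np

def rpa_ssa(L=2000, k1=1e-3, km1=0.05, k2=0.5, km2=0.05, t_max=200.0, rng=None):
    """Gillespie simulation of RPA binding to ssDNA of length L nucleotides.

    Bound proteins are stored as [start, mode] with mode 20 or 30; `occ` marks occupied
    nucleotides.  Returns event times and the filling fractions x1 (20nt mode) and x2 (30nt mode).
    """
    rng = np.random.default_rng() if rng is None else rng
    occ = np.zeros(L, dtype=bool)
    proteins = []                      # list of [start, mode]
    t, times, x1, x2 = 0.0, [0.0], [0.0], [0.0]

    def n_binding_sites():
        # number of start positions with 20 consecutive free nucleotides:
        # a free run of length r contributes max(0, r - 19) such positions
        n, run = 0, 0
        for site_occupied in occ:
            run = 0 if site_occupied else run + 1
            if run >= 20:
                n += 1
        return n

    while t < t_max:
        sites = n_binding_sites()
        bind20 = [i for i, (s, m) in enumerate(proteins) if m == 20]
        bind30 = [i for i, (s, m) in enumerate(proteins) if m == 30]
        extendable = [i for i in bind20
                      if proteins[i][0] + 30 <= L and not occ[proteins[i][0] + 20:proteins[i][0] + 30].any()]
        props = np.array([k1 * sites, km1 * len(bind20), k2 * len(extendable), km2 * len(bind30)])
        a0 = props.sum()
        if a0 == 0:
            break
        t += rng.exponential(1.0 / a0)
        r = rng.choice(4, p=props / a0)
        if r == 0:        # bind a new RPA in 20nt mode at a uniformly chosen free stretch
            starts = [s for s in range(L - 19) if not occ[s:s + 20].any()]
            s = starts[rng.integers(len(starts))]
            occ[s:s + 20] = True
            proteins.append([s, 20])
        elif r == 1:      # unbind a 20nt-mode RPA
            i = bind20[rng.integers(len(bind20))]
            s, _ = proteins.pop(i)
            occ[s:s + 20] = False
        elif r == 2:      # extend a 20nt-mode RPA to 30nt mode
            i = extendable[rng.integers(len(extendable))]
            s = proteins[i][0]
            occ[s + 20:s + 30] = True
            proteins[i][1] = 30
        else:             # contract a 30nt-mode RPA back to 20nt mode
            i = bind30[rng.integers(len(bind30))]
            s = proteins[i][0]
            occ[s + 20:s + 30] = False
            proteins[i][1] = 20
        times.append(t)
        x1.append(20.0 * sum(1 for _, m in proteins if m == 20) / L)
        x2.append(30.0 * sum(1 for _, m in proteins if m == 30) / L)
    return np.array(times), np.array(x1), np.array(x2)
```

Such a simulator is straightforward to run forward but, as noted above, is not differentiable with respect to the kinetic rates, which is what motivates the neural SDE surrogate.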
2.2.3. NFκB signaling model.
Macrophages can sense environmental information and respond accordingly, with stimulus-response specificity encoded in signaling pathways and decoded by downstream gene expression profiles [50]. The temporal dynamics of NFκB, a key transcription factor in immune response and inflammation, encode stimulus information [51]. NFκB targets and regulates a vast set of immune-related genes [52–54]. While NFκB signaling dynamics are stimulus-specific, they exhibit significant heterogeneity across individual cells under identical conditions [51]. Understanding how specific cellular heterogeneity (extrinsic noise) contributes to heterogeneity in NFκB signaling dynamics can provide insight into how noise affects the fidelity of signal transduction in immune cells.
A previous modeling approach employs a 52-dimensional ODE system to quantify the NFκB signaling network [51] and recapitulate the signaling dynamics of a representative cell. This ODE model includes 52 molecular entities and 47 reactions across a TNF-receptor module, an adaptor module, and a core module with an NFκB-IKK-IκBα feedback loop (IκBα is an inhibitor of NFκB, while IKK is the IκB kinase complex that regulates IκBα degradation) (see Fig 3) [55]. However, such an ODE model is deterministic and assumes no intrinsic fluctuations in the biomolecular processes. Yet, from experimental data, the NFκB signaling dynamics fluctuate strongly; such fluctuations cannot be quantitatively described by any deterministic ODE model. Due to the system’s high dimensionality and nonlinearity, it is challenging to quantify how intrinsic noise influences temporal coding in NFκB dynamics.
TNF binds its receptor, activating IKK, which degrades IκBα and releases NFκB. The free NFκB translocates to the nucleus and promotes IκBα transcription. Newly synthesized IκBα then binds NFκB and exports it back to the cytoplasm. Red arrows indicate noise that we consider in the corresponding SDE system.
To incorporate the intrinsic noise within the NFκB signaling network, we introduce noise terms into the 52-dimensional ODE system to build an SDE that can account for the observed temporally fluctuating nuclear NFκB trajectories. While NFκB signaling pathways involve many variables, experimental constraints limit the number of measurable components. Among these, nuclear NFκB concentration is the most direct and critical experimental readout. As a minimal stochastic model, we hypothesize that only the biophysical and biochemical processes of NFκB translocation (which directly affects experimental measurements) and IκBα transcription (a key regulator of NFκB translocation) are subject to Brownian-type noise (red arrows in Fig 3), as these processes play crucial roles in the oscillatory dynamics of NFκB [55].

The intensity of Brownian-type noise in the NFκB dynamics may depend on factors such as cell volume (smaller volumes result in higher noise intensity) or copy number (lower copy numbers lead to greater noise intensity), and is therefore considered a form of extrinsic noise. Noise intensity parameters thus capture an aspect of cellular heterogeneity. There are other sources of cellular heterogeneity, such as variations in kinase or enzyme abundances, which are too complicated to model and are thus not included in the current model. For simplicity, all kinetic parameters, except for the noise intensity (σ), are assumed to be consistent with those of a representative cell [55]. The 52-dimensional ODE model describing NFκB dynamics is given in Refs. [51,56]. We extend this model by adding noise to the dynamics of the sixth, ninth, and tenth ODEs of the 52-dimensional ODE model. We retain 49 ODEs but convert the equations for the sixth, ninth, and tenth components to SDEs (Eqs (10)).
In Eqs (10), u2 is the concentration of IκBα in the cytoplasm; u3 is the concentration of IκBα in the nucleus; u4 is the concentration of the IκBα-NFκB complex; u5 is the concentration of the IκBα-NFκB complex in the nucleus; u6 is the mRNA of IκBα; u7 is the IKK-IκBα-NFκB complex; u9 is NFκB; u10 represents the nuclear NFκB concentration; and u52 is the nuclear concentration of NFκB with RNA polymerase II that is ready to initiate mRNA transcription. A description of the parameters and their typical values is given in Table C in S1 Text. The two noise terms in Eqs (10) are associated with IκBα transcription and NFκB translocation, respectively. The remaining variables are latent variables and their dynamics are regulated via the remaining 49-dimensional ODE in Refs. [51,56]. The activation of NFκB is quantified by the total nuclear NFκB concentration (u5(t) + u10(t)), which is also measured in experiments.
Within this example, we wish to determine if our proposed parameter-associated nSDE can accurately reconstruct the dynamics underlying experimentally observed NFκB trajectory data.
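Conceptually, the stochastic model of Eqs (10) wraps the published 52-dimensional ODE right-hand side in an SDE whose diffusion is nonzero only for the components associated with IκBα transcription and NFκB translocation. The torchsde-style sketch below illustrates this construction; `nfkb_ode_rhs` is a placeholder for the ODE system of Refs. [51,56] (not reproduced here), and the multiplicative-noise form and the way the two intensities are shared across components are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torchsde

class NFkBSDE(nn.Module):
    """52-dimensional ODE model with Brownian-type noise on the components associated
    with IkBa transcription and NFkB translocation (cf. Eqs (10)); schematic only."""
    noise_type = "diagonal"
    sde_type = "ito"

    def __init__(self, nfkb_ode_rhs, sigma_txn, sigma_trans, noisy_idx=(5, 8, 9)):
        super().__init__()
        self.rhs = nfkb_ode_rhs                   # callable: (t, u) -> du/dt, u of shape (batch, 52)
        self.sigma = torch.zeros(52)
        self.sigma[noisy_idx[0]] = sigma_txn      # u6: IkBa mRNA (transcription noise)
        self.sigma[noisy_idx[1]] = sigma_trans    # u9: NFkB (translocation noise)
        self.sigma[noisy_idx[2]] = sigma_trans    # u10: nuclear NFkB (translocation noise)

    def f(self, t, u):   # deterministic drift given by the published ODE right-hand side
        return self.rhs(t, u)

    def g(self, t, u):   # multiplicative noise on the selected components only
        return self.sigma * u

# usage sketch: u0 of shape (batch, 52); ts a 1-D tensor of time points
# traj = torchsde.sdeint(NFkBSDE(nfkb_ode_rhs, 10**-0.8, 10**-0.5), u0, ts, method="euler")
```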
3. Results
3.1. Accurate reconstruction of circadian clock dynamics
As an illustrative example, we use the W2-distance nSDE reconstruction method to first reconstruct the minimal model for damped oscillatory circadian dynamics (see Eq (9)) under different forms of the diffusion function. We fix the damping rate and angular frequency in Eq (9) and impose three different forms for the diffusion functions $\sigma_x$ and $\sigma_y$: a constant diffusion function [57], a Langevin-type [58] diffusion function, and a linear diffusion function [59]. These functions, often used to describe fluctuating biophysical processes, are given in Eqs (11), (12), and (13). There are two additional parameters in Eqs (11), (12), and (13): $\sigma$, which determines the intensity of the Brownian-type fluctuations, and c, which controls the correlation of fluctuations between the two dimensions. For each type of diffusion function, we trained a different nSDE model, each of which takes the state variables (x, y) and the two parameters $(\sigma, c)$ as inputs and outputs the values of the reconstructed drift and diffusion functions.
We take 25 combinations of $(\sigma, c)$; for each combination of $(\sigma, c)$, we generate 50 trajectories from the ground truth SDE (9) as the training data. The initial condition is the same for all trajectories. To test the accuracy of the reconstructed diffusion and drift functions, we measure the following relative errors:
\[ \text{Error}_{f} = \frac{\sum_{i=1}^{M} \sum_{j} \big\| f\big(X_i(t_j), t_j\big) - \hat{f}\big(X_i(t_j), t_j\big) \big\|}{\sum_{i=1}^{M} \sum_{j} \big\| f\big(X_i(t_j), t_j\big) \big\|}, \qquad (14) \]

\[ \text{Error}_{\Sigma} = \frac{\sum_{i=1}^{M} \sum_{j} \big\| \Sigma\big(X_i(t_j), t_j\big) - \hat{\Sigma}\big(X_i(t_j), t_j\big) \big\|}{\sum_{i=1}^{M} \sum_{j} \big\| \Sigma\big(X_i(t_j), t_j\big) \big\|}. \qquad (15) \]

Here, $f$ is the vector of ground truth drift functions and $\hat{f}$ is the reconstructed drift function. $\Sigma$ is the matrix of ground truth diffusion functions given in Eqs (11), (12), and (13), and $\hat{\Sigma}$ is its reconstruction. M is the number of training samples, $\|\cdot\|$ denotes the $\ell^2$ norm of a vector, and the matrix norm is $\|A\| = \big(\sum_{i,j} A_{ij}^2\big)^{1/2}$ for a matrix A. The errors are evaluated at the sampled time points $t_j$ along the training trajectories $X_i$ and are measured separately for different parameters $(\sigma, c)$.
The errors in the reconstructed drift and diffusion functions, as well as the temporally decoupled squared W2 loss Eq (3), associated with different forms of the diffusion function and different values of $(\sigma, c)$ are shown in Fig 4. When the diffusion function is a constant Eq (11), the mean reconstruction error of the drift function is 0.15, the mean reconstruction error of the diffusion function is 0.16, and the mean temporally decoupled squared W2 loss between the ground truth trajectories and the predicted trajectories is 0.074 (averaged over all sets of parameters $(\sigma, c)$). When a Langevin-type diffusion function Eq (12) is used as the ground truth, the mean errors for the reconstructed drift and diffusion functions are 0.069 and 0.29, respectively, and the mean temporally decoupled squared W2 loss between the ground truth and predicted trajectories is 0.020. For a linear-type diffusion function as the ground truth, the mean reconstruction errors of the drift and diffusion functions are 0.19 and 0.41, respectively, and the mean temporally decoupled squared W2 distance is 0.013. For all three forms of diffusion, our END-nSDE method can accurately reconstruct the drift function (see Fig 4D–4F). When the diffusion function is a constant, our END-nSDE model can also accurately reconstruct this constant (see Fig 4G). When the diffusion function takes a more complicated form, such as the Langevin-type diffusion function Eq (12) or the linear-type diffusion function Eq (13), the reconstructed nSDE model can still approximate the diffusion function well for most combinations of $(\sigma, c)$, especially when the correlation c > 0.2 (see Fig 4H–4I). Overall, our proposed END-nSDE model can accurately reconstruct the minimal stochastic circadian dynamical model Eq (9) in the presence of extrinsic noise (different values of $(\sigma, c)$); the accuracy of the reconstructed drift and diffusion functions is maintained for most combinations of $(\sigma, c)$. While the drift function is reconstructed with high accuracy, the reconstructed diffusion function exhibits larger relative errors, particularly for models with more complex diffusion forms. How these errors depend on the functional form of the diffusion warrants further investigation.
Temporally decoupled squared W2 losses Eq (3) and errors in the reconstructed drift and diffusion functions for different types of the diffusion function and different values of $(\sigma, c)$. A-C. The temporally decoupled squared W2 loss between the ground truth trajectories and the trajectories generated by the reconstructed nSDEs for the constant-type diffusion function Eq (11), the Langevin-type diffusion function Eq (12), and the linear-type diffusion function Eq (13). D-F. Errors in the reconstructed drift function for the three different types of ground truth diffusion functions. G-I. Errors in the reconstructed diffusion function for the three different types of ground truth diffusion functions.
To investigate how the strengths of the extrinsic and intrinsic noise affect our reconstruction of extrinsic-noise-driven SDEs, we conduct an additional test on the reconstruction of circadian clock dynamics. We generate training trajectories from a revised version of Eq (9), given in Eq (16), in which the noise strength itself varies across groups of trajectories. For each set of intrinsic and extrinsic noise strengths, we generate 25 groups of noise-strength values, with each group containing 50 trajectories as training data. To train the neural SDE model, both the state variables (x, y) and the noise strengths are input into the neural SDE. One parameter characterizes the average level of intrinsic noise, while the other represents the strength of the extrinsic noise (the variation of the noise level across groups); we use several different values of both. As shown in Fig 5B and 5C, errors in the reconstructed drift function and in the reconstructed diffusion function, averaged over all parameter sets, increase with both the intrinsic and the extrinsic noise strength. Specifically, an increase in the intrinsic noise level reduces the reconstruction accuracy more than an increase in the extrinsic noise does. Further analysis of how variations in intrinsic and extrinsic noise affect the accuracy of the drift and diffusion functions reconstructed with our END-nSDE method is a promising direction.
3.2. Accurate approximation of interacting DNA-protein systems with different kinetic parameters
To construct a differentiable surrogate for stochastic simulation algorithms (SSAs), the neural SDE model should be able to take kinetic parameters as additional inputs. Thus, the original W2-distance SDE reconstruction method in [32] can no longer be applied because the trained neural SDE model cannot take into account extrinsic noise [60], i.e., different values of kinetic parameters. To be specific, we vary one parameter (the conversion rate k2 from 20nt-mode RPA to 30nt-mode RPA) in the stochastic model and then apply our END-nSDE method, which takes the state variables and the kinetic parameter k2 as inputs. We vary k2 over a range of values, with the other rate constants k1, k−1, and k−2 taken from experiments [47] (see Fig 2). For each k2, we generate 100 trajectories and use 50 for the training set and the other 50 for the testing set. Each trajectory encodes the dynamics of the fraction of 20nt-mode DNA-bound RPA, x1(t), and the fraction of 30nt-mode DNA-bound RPA, x2(t).
When approximating the dynamics underlying the RPA-DNA binding process, we compare our SDE reconstruction method with other benchmark time-series analysis or reconstruction approaches, including the RNN, LSTM, Gaussian process, and the neural ODE model [61,62]. These benchmarks are described in detail in Appendix C in S1 Text.
The extrinsic-noise-driven temporally decoupled squared W2 distance loss Eq (8) between the distribution of the ground truth trajectories and the distribution of the predicted trajectories generated by our END-nSDE reconstructed SDE model is the smallest among all methods (see Table 1). The underlying reason is that an SDE approximates well the genuine Markov counting process underlying the continuum-limit RPA-DNA binding process [49]. The RNN and LSTM models do not capture the intrinsic fluctuations in the counting process. The neural ODE model is a deterministic model and cannot capture the stochasticity in the RPA-DNA binding dynamics. Additionally, the Gaussian process can only accurately approximate linear SDEs, which is not an appropriate form for an SDE describing the RPA-DNA binding process.
In Fig 6A and 6B, we plot the predicted trajectories obtained by the trained neural SDE model for two different values of k2. For all values of k2, trajectories generated by our END-nSDE method match the ground truth trajectories well on the testing set, as the temporally decoupled squared W2 loss remains small for all k2 (see Fig 6C). This demonstrates the ability of our method to capture the dependence of the stochastic dynamics on biochemical kinetic parameters.
A, B. Sample ground truth and reconstructed trajectories evaluated at two different values of the conversion rate k2. C. Temporally decoupled squared W2 distances (see Eq (8)) between the ground truth and reconstructed trajectories evaluated at different k2 values. In A and B, blue and red trajectories represent the filling fractions of DNA by 20nt-mode and 30nt-mode RPA, respectively. The dashed lines represent the predicted trajectories, and the solid lines represent the ground truth. Throughout the figure, the data are generated by a single neural SDE model that accepts the conversion rate k2 as a parameter and outputs the trajectories.
3.3. Reconstructing high-dimensional NFκB signaling dynamics from simulated and experimental data
Finally, we evaluate the effectiveness of the END-nSDE framework in reconstructing high-dimensional NFκB signaling dynamics under varying noise intensities and investigate the performance of the neural SDE method in reconstructing experimentally measured noisy NFκB dynamics. The procedure is divided into two parts. First, we trained and tested our END-nSDE method on synthetic data generated by the NFκB SDE model Eq (10) under different noise intensities. Second, we test whether the trained END-nSDE can reproduce the experimental dynamic trajectories.
3.3.1. Reconstructing a 52-dimensional stochastic model for NFκB dynamics.
For training END-nSDE models, we first generated synthetic data from the 52-dimensional SDE model of NFκB signaling dynamics, Eqs (10), and established models [51,56]. The synthetic trajectories are generated under 121 combinations of noise intensities in Eqs (10) (see Appendix D of S1 Text). The resulting NFκB trajectories vary depending on noise intensity, with low-intensity noise producing more consistent dynamics across cells (see Fig 7A) and higher-intensity noise yielding more heterogeneous dynamics (see Fig 7B). The simulated ground truth trajectories are split into training and testing datasets (see Appendix E in S1 Text for details). Specifically, we excluded 25 combinations of noise intensities from the training set in order to test the generalizability of the trained neural SDE model to unseen noise intensities.
A. Sample trajectories of nuclear NFκB concentration as a function of time under a low noise intensity. B. Sample trajectories of nuclear NFκB concentration as a function of time under a higher noise intensity. C, D. Reconstructed nuclear NFκB trajectories generated by the trained neural SDE versus the ground truth nuclear NFκB trajectories under two different combinations of noise intensities in Eq (10). E. The squared W2 distance between the distributions of the predicted trajectories and ground truth trajectories on the training set under different noise strengths. For training, we randomly selected 50% of the sample trajectories in 80 combinations of noise strengths as the training dataset. Blank cells indicate that the corresponding parameter set is not included in the training set. F. Validation of the trained model by evaluating the squared W2 distance between the distributions of predicted trajectories and ground truth trajectories on the validation set.
Next, as detailed in Appendix E of S1 Text, we trained a 52-dimensional neural SDE model using our END-nSDE method on synthetic trajectories. The loss function is based on the W2 distance between the distributions of the simulated nuclear IκBα-NFκB complex and nuclear NFκB activities (u5(t) and u10(t) in Eqs (10), respectively) and the corresponding END-nSDE predictions. The remaining 50 variables of the NFκB system were treated as latent variables, as they are not directly included in the loss function calculation.
Although the NFκB dynamics vary under different noise intensities, the trajectories generated by our trained neural SDE closely align with the ground truth synthetic NFκB dynamics across noise intensities (see Fig 7C and 7D). The neural SDE model demonstrates greater accuracy in reconstructing NFκB dynamics when the noise in IκBα transcription is smaller, as evidenced by the reduced squared W2 distance between the predicted and ground-truth trajectories on both the training and validation sets (see Fig 7E and 7F). The temporally decoupled squared W2 loss Eq (8) on the validation set is close to that on the training set for different values of the noise intensities. The mean squared W2 distance across all combinations of noise intensities is 0.0013 for the training set, and the validation set shows a mean squared W2 distance of 0.0017.
Since the loss function for this application involves only two variables out of 52, we also tested whether the “full” 52-dimensional NFκB system can be effectively modeled by a two-dimensional neural SDE. After training, we found that the reduced model was insufficient for reconstructing the full 52-dimensional dynamics, as it disregarded the 50 latent variables not included in the loss function (see Fig D in Appendix F of S1 Text). This result underscores the importance of incorporating latent variables from the system, even when they are not explicitly included in the loss function.
3.3.2. Reproducing NFκB data with a trained END-nSDE.
We assessed whether our proposed END-nSDE can accurately reconstruct experimentally measured NFκB dynamic trajectories. For simplicity and feasibility, we tested the END-nSDE under the assumptions that (1) all cells share the same drift function, and (2) cells whose trajectories deviate similarly from their ODE predictions have the same noise intensities. Based on these assumptions, we developed the following workflow (see Fig 8):
Workflow for reconstructing experimental data using the trained parameterized nSDE and the parameter-inference neural network (NN). The boxes on the left outline the steps of the experimental data reconstruction process, while the boxes on the right illustrate the corresponding results at each step.
- We used experimentally measured single-cell trajectories of NFκB concentration, obtained through live-cell image tracking of macrophages from an mVenus-tagged RelA mouse with a frame frequency of five minutes [63], yielding a total of 31 consecutive time points. These trajectories correspond to the sum of the nuclear IκBα-NFκB and nuclear NFκB concentrations in the 52D SDE model (u5(t) and u10(t) in Eq (10)).
- The experimental dataset was divided into subgroups. Cosine similarity was calculated between the ODE-generated trajectory (representative-cell NFκB dynamics) and each experimental trajectory. The trajectories were then ranked and divided into different groups based on their cosine similarity with the trajectory generated from the ODE model [64] (a minimal implementation sketch is given after this list). Experimental trajectories with higher similarity to the ODE trajectory are expected to exhibit smaller intrinsic fluctuations, corresponding to lower noise intensities (see Appendix G in S1 Text for details).
- Each group of experimental trajectories was input into the trained parameter-inference neural network (see the text following the workflow for more details) to infer the corresponding noise intensities. For simplicity, we assume that trajectories within each group share the same noise intensities.
- The inferred noise intensities are then used as inputs for the trained END-nSDE to simulate NFκB trajectories.
- The simulated trajectories were compared with the corresponding experimental data to evaluate the model’s performance.
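A minimal sketch of the ranking-and-grouping step referenced above; the group size of 32 is taken from the text, while the function name and array layout are our own.

```python
import numpy as np

def group_by_similarity(exp_trajs, ode_traj, group_size=32):
    """Rank experimental trajectories by cosine similarity with the representative
    ODE trajectory and split them into groups of `group_size`.

    exp_trajs : array of shape (n_cells, n_timepoints); ode_traj : array of shape (n_timepoints,).
    Returns a list of index arrays, ordered from most to least ODE-like.
    """
    sims = exp_trajs @ ode_traj / (
        np.linalg.norm(exp_trajs, axis=1) * np.linalg.norm(ode_traj))
    order = np.argsort(-sims)                          # descending cosine similarity
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]
```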
To estimate noise intensities from different groups of experimentally measured single-cell nuclear NFκB trajectories (step (3) in the proposed workflow), we trained another neural network to predict the corresponding IκBα transcription and NFκB translocation noise intensities from the groups of NFκB trajectories in the synthetic training data, similar to the approach taken in [65]. The trained neural network can then be used for predicting noise intensities in the validation set (see Appendix H in S1 Text for technical details).
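The parameter-inference network can be as simple as a feed-forward model mapping a flattened group of trajectories to the two noise intensities; the architecture below and the choice to regress log10 intensities are assumptions consistent with the intensities being reported as powers of ten, not a description of the network used in Appendix H of S1 Text.

```python
import torch
import torch.nn as nn

class NoiseInferenceNN(nn.Module):
    """Map a group of NFkB trajectories to (log10 transcription noise, log10 translocation noise)."""
    def __init__(self, n_timepoints, group_size=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(group_size * n_timepoints, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, group):                 # group: (batch, group_size, n_timepoints)
        return self.net(group.flatten(1))     # (batch, 2) predicted log10 noise intensities

# Training would regress these outputs against the known noise intensities of the
# synthetic groups, e.g., with nn.MSELoss().
```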
Assessing the impact of group size (number of trajectories) on noise intensity prediction performance, we found that taking a group size of at least two leads to a relative error of around 0.1 (see Fig 9A). Given the high heterogeneity present in experimental data, we took a group size of 32 as the input into the neural network. Under this group size, the relative errors in the predicted noise intensities were 0.021 on the training set and 0.062 on the testing set (see Fig 9B and 9C).
A. Plots showing the mean (solid circles) and variance (error bars) of the relative error in the reconstructed noise intensities predicted by the parameter-inference NN for the testing dataset, as a function of the group size of input trajectories. B. Heatmaps showing the relative error in the reconstructed noise intensities for the training dataset. Colored cells represent results from the parameter-inference NN for the training dataset, while blank cells indicate noise strength values not included in the training set. C. Heatmaps showing the relative error in the reconstructed noise intensities for the testing dataset. D. The inferred intensities of IκBα transcription noise and NFκB translocation noise in different groups of experimental trajectories, plotted against the group’s ranking in decreasing similarity with the representative ODE trajectory. E-H. Groups of experimental and nSDE-reconstructed trajectories ranked by decreasing cosine similarity: #1 (E), #4 (F), #16 (G), #29 (H). The squared W2 distances between experimental and SDE-generated trajectories are 0.157 (E), 0.143 (F), 0.212 (G), 0.236 (H). The inferred noise intensities are (10^{-0.49}, 10^{-0.81}) (E), (10^{-0.47}, 10^{-0.78}) (F), (10^{-0.46}, 10^{-0.74}) (G), (10^{-0.44}, 10^{-0.71}) (H). I. The temporally decoupled squared W2 distance between reconstructed trajectories generated by the trained END-nSDE and groups of experimental trajectories, ordered according to decreasing similarity with the representative ODE trajectory.
Using the trained neural network, we inferred noise intensities for the experimental data, which were grouped based on their cosine similarities with the representative-cell trajectory (deterministic ODE) with a group size of 32. The predicted noise intensities on the experimental dataset are larger than the noise intensities on the training set, possibly because unmodeled extrinsic noise complicates the inference of noise intensity. The transcription noise of IκBα is predicted to lie within the range [10^{-0.81}, 10^{-0.71}] (see Fig 9D). In addition, the inferred noise intensity for NFκB translocation falls within [10^{-0.49}, 10^{-0.43}] (see Fig 9D). These inferred noise intensities were then used as inputs to the END-nSDE to simulate NFκB trajectories.
We compare the reconstructed NFκB trajectories generated by the trained neural SDE model with the experimentally measured NFκB trajectories (see Fig 9E–9I). The trajectories generated using our END-nSDE method successfully reproduce the experimental dynamics for the majority of time points for the top 50% of cell subgroups most correlated with the representative-cell ODE model (see Fig 9E–9G and 9I).
For the top-ranked subgroups (#1 to #16), the heterogeneous nSDE-reconstructed dynamics align well with the experimental data for the first 100 minutes. The predicted trajectories deviate more from the experimentally observed trajectories after 100 minutes, possibly due to error accumulation and errors in the predicted noise intensities. For experimental subgroups that deviate significantly from the representative-cell ODE model, the END-nSDE struggles to fully capture the heterogeneous trajectories. This limitation likely arises from the assumption that all cells in a group share the same underlying dynamics, whereas in reality substantial cell-to-cell differences in the underlying dynamics exist due to heterogeneity in the drift term, an aspect not accounted for in the END-nSDE because of the high computational cost involved.
With sufficient data and computational resources, our proposed workflow could incorporate extrinsic noise in the drift terms, allowing for further discrimination of experimental trajectories. Our END-nSDE method can partially reconstruct experimental datasets and has the potential to fully capture experimental dynamics. Furthermore, trajectories generated from the trained END-nSDE model can reproduce the intrinsic fluctuations in the observed NFκB signaling dynamics, which are inaccessible to the representative-cell ODE model.
4. Discussion
In this work, we used the W2-distance to develop an END-nSDE reconstruction method that takes into account extrinsic noise in gene expression dynamics as observed across various biophysical and biochemical processes such as circadian rhythms, RPA-DNA binding, and NFκB translocation. We first demonstrated that our END-nSDE method can successfully reconstruct a minimal noise-driven fluctuating SDE characterizing the circadian rhythm, showcasing its effectiveness in reconstructing SDE models that contain both intrinsic and extrinsic noise. Next, we used our END-nSDE method to learn a surrogate extrinsic-noise-driven neural SDE, which approximates the RPA-DNA binding process. Molecular binding processes are usually modeled by a Markov counting process and simulated using Monte-Carlo-type stochastic simulation algorithms (SSAs) [48]. Our END-nSDE reconstruction approach can effectively reconstruct the stochastic dynamics of the RPA-ssDNA binding process while also taking into account extrinsic noise (heterogeneity in biological parameters among different cells). Our END-nSDE method outperforms several benchmark methods such as LSTMs, RNNs, neural ODEs, and Gaussian processes.
Finally, we applied our methodology to analyze NFκB trajectories collected from over a thousand cells. Not only did the neural SDE model trained on the synthetic dataset perform well on the validation set, but it also partially recapitulated experimental trajectories of NFκB abundances, particularly for subgroups with dynamics similar to those of the representative cell. These results underscore the potential of neural SDEs in modeling and understanding the role of intrinsic noise in complex cellular signaling systems [66–68].
When the experimental trajectories were divided into subgroups, we assumed that all cells across different groups shared the same drift function (as in the representative ODE) and cells within each group shared the same diffusion term. We found that subgroups with dynamics more closely aligned with the deterministic ODE model resulted in better reconstructions. In contrast, for experimental trajectories that deviated significantly from the representative ODE model, their underlying dynamics may differ from those defined by the representative cell’s ODE. Therefore, the assumption that a group shares the same drift function as the representative cell ODE holds only when the trajectories closely resemble the ODE. Incorporating noise into the drift term for training the neural SDE could potentially address this issue. We did not consider this approach due to the high computational cost required for training.
Applying our method to high-dimensional synthetic NFκB datasets, we showed the importance of incorporating latent variables. This necessity arises because the ground-truth dynamics of the measured quantities (nuclear NFκB) are not self-closed and inherently depend on additional variables. Consequently, the 52-dimensional SDE reconstruction requires more variables than just the “observed” dynamics of nuclear NFκB. In this example, the remaining 50 variables in the nSDE were treated as latent variables, even though they were not explicitly included in the loss function.
For high-dimensional systems (e.g., 52 dimensions as in our NFκB example), analyzing stochastic dynamics remains challenging. Even though regulated processes do not follow gradient dynamics in general, imposing a self-consistent energy landscape and adopting lower-dimensional projections can provide a valuable framework for studying the stochastic dynamics of high-dimensional biological systems [69–71]. Once an effective energy landscape is identified, prior knowledge about the system structure can be incorporated into the neural SDE framework through the following formulation:

\[ \mathrm{d}\hat{X}(t) = \Big[ -\nabla U\big(\hat{X}(t)\big) + \hat{f}\big(\hat{X}(t), t; \Theta_1\big) \Big]\,\mathrm{d}t + \hat{\sigma}\big(\hat{X}(t), t; \Theta_2\big)\,\mathrm{d}\hat{B}(t), \]

where U represents the prior knowledge of the energy landscape. The neural networks $\hat{f}$ and $\hat{\sigma}$ then learn deviations from the prior knowledge and the unknown intrinsic noise. Such prior information on the energy landscape could facilitate training and improve the accuracy of the learned model [37]. How the imposition of a high-dimensional landscape as a constraint affects our W2-distance-based inference, and how this potential should be interpreted, should be explored in more depth. If meaningful and informative, prior results on how landscapes can be used to characterize neural networks across various tasks can be leveraged [72,73].
Finally, neural SDEs can serve as surrogate models for complex biomedical dynamics [74,75]. Combining such surrogate models with neural control functions [72,76,77] can be useful for tackling complex biomedical control problems. As shown in preceding work [32,37], a larger number of training trajectories leads to a more accurately reconstructed neural SDE. However, in biological experiments, obtaining more training trajectories can be expensive. Therefore, it is of practical and biological significance to determine how many training trajectories can realistically be obtained in experiments and how many are necessary for an accurate reconstruction of the intrinsic-noise-aware SDE using our END-nSDE approach. It is also worth investigating which biophysical molecular processes require intrinsic fluctuations to be taken into account. In such problems, using our END-nSDE framework to reconstruct the noisy molecular dynamics could yield a more biologically reasonable, noise-aware model than first-principles mass-action ODE models.
While our work focuses on gene regulation dynamics, it is important to emphasize that the END-nSDE reconstruction method is general and can potentially be applied to biological systems beyond gene regulation. The method’s ability to capture both intrinsic and extrinsic noise makes it suitable for modeling various stochastic biological processes, including, but not limited to, signal transduction networks, metabolic pathways, population dynamics, and developmental processes. The examples we chose (circadian rhythms, RPA-DNA binding dynamics, and NFκB signaling) were selected to demonstrate the method’s capabilities and bring the neural SDE approach to the attention of the molecular and cell biology community. Future applications could extend to other domains such as epidemiology, ecology, and systems biology, where stochastic dynamics that can be described by SDEs with heterogeneity among different cells or individuals are prevalent.
Besides better understanding effective energy landscape constraints, there are several promising directions for future research. First, techniques to extract an explicit form of the learned neural network SDEs can be developed. For example, one could employ a polynomial model as the reconstructed drift and diffusion functions in the SDE [78]. Such an explicit functional form of the approximate SDE may facilitate biological interpretation of the underlying model. Recent research has also shed light on directly interpreting trained neural networks using simple functions such as polynomials [79]. Therefore, one can apply such methods to extract the approximate forms from the learned drift and diffusion functions in the neural SDE and interpret their biophysical meaning.
Another promising avenue of investigation is to combine discrete and continuous modeling approaches to account for both mRNA and protein dynamics. Such a hybrid approach would use discrete Markov jump processes for low-abundance species (such as mRNA) while employing SDEs for high-abundance species (such as proteins), thereby addressing the limitations of pure SDE approaches when molecular counts approach zero.
Finally, the presence of unobserved variables in cellular systems poses a significant challenge for accurate SDE modeling. Many cellular processes involve hidden regulatory mechanisms, unmeasured metabolites, or latent cellular states that influence the observed dynamics but are not directly captured in experimental measurements. This limitation can lead to model misspecification, where the inferred drift and diffusion functions compensate for missing variables, potentially resulting in biased parameter estimates and poor predictive performance. A more realistic scenario occurs when we already know which molecules can affect the dynamics, but experiments can only report a few molecular species. In such cases, we can model the system in its full dimension and use a parameterized model to sample the initial values of the unobserved variables; the rest of the training procedure remains the same as described above.
Acknowledgments
We acknowledge Stefanie Luecke for providing the experimental datasets for NFκB dynamics.
References
- 1. Swain PS, Elowitz MB, Siggia ED. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci U S A. 2002;99(20):12795–800. pmid:12237400
- 2. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297(5584):1183–6. pmid:12183631
- 3. Sanchez A, Choubey S, Kondev J. Regulation of noise in gene expression. Annu Rev Biophys. 2013;42:469–91. pmid:23527780
- 4. Foreman R, Wollman R. Mammalian gene expression variability is explained by underlying cell state. Mol Syst Biol. 2020;16(2):e9146. pmid:32043799
- 5. Mitchell S, Hoffmann A. Identifying Noise Sources governing cell-to-cell variability. Curr Opin Syst Biol. 2018;8:39–45. pmid:29623300
- 6. Thattai M, van Oudenaarden A. Intrinsic noise in gene regulatory networks. Proc Natl Acad Sci U S A. 2001;98(15):8614–9. pmid:11438714
- 7. Tsimring LS. Noise in biology. Rep Prog Phys. 2014;77(2):026601. pmid:24444693
- 8. Fu AQ, Pachter L. Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. Stat Appl Genet Mol Biol. 2016;15(6):447–71. pmid:27875323
- 9. Llamosi A, Gonzalez-Vargas AM, Versari C, Cinquemani E, Ferrari-Trecate G, Hersen P, et al. What population reveals about individual cell identity: single-cell parameter estimation of models of gene expression in yeast. PLoS Comput Biol. 2016;12(2):e1004706. pmid:26859137
- 10. Dharmarajan L, Kaltenbach H-M, Rudolf F, Stelling J. A simple and flexible computational framework for inferring sources of heterogeneity from single-cell dynamics. Cell Syst. 2019;8(1):15-26.e11. pmid:30638813
- 11. Finkenstädt B, Woodcock DJ, Komorowski M, Harper CV, Davis JRE, White MRH. Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: an application to single cell data. The Annals of Applied Statistics. 2013; p. 1960–82.
- 12. Dixit PD. Quantifying extrinsic noise in gene expression using the maximum entropy framework. Biophys J. 2013;104(12):2743–50. pmid:23790383
- 13. Fang Z, Gupta A, Kumar S, Khammash M. Advanced methods for gene network identification and noise decomposition from single-cell data. Nat Commun. 2024;15(1):4911. pmid:38851792
- 14. Raj A, Peskin CS, Tranchina D, Vargas DY, Tyagi S. Stochastic mRNA synthesis in mammalian cells. PLoS Biol. 2006;4(10):e309. pmid:17048983
- 15. Raser JM, O’Shea EK. Control of stochasticity in eukaryotic gene expression. Science. 2004;304(5678):1811–4. pmid:15166317
- 16. Sigal A, Milo R, Cohen A, Geva-Zatorsky N, Klein Y, Liron Y, et al. Variability and memory of protein levels in human cells. Nature. 2006;444(7119):643–6. pmid:17122776
- 17. Singh A, Razooky BS, Dar RD, Weinberger LS. Dynamics of protein noise can distinguish between alternate sources of gene-expression variability. Mol Syst Biol. 2012;8:607. pmid:22929617
- 18. Paulsson J. Models of stochastic gene expression. Physics of Life Reviews. 2005;2(2):157–75.
- 19. Singh A, Soltani M. Quantifying intrinsic and extrinsic variability in stochastic gene expression models. PLoS One. 2013;8(12):e84301. pmid:24391934
- 20. Wang D-G, Wang S, Huang B, Liu F. Roles of cellular heterogeneity, intrinsic and extrinsic noise in variability of p53 oscillation. Sci Rep. 2019;9(1):5883. pmid:30971810
- 21. Voit EO, Martens HA, Omholt SW. 150 years of the mass action law. PLoS Comput Biol. 2015;11(1):e1004012. pmid:25569257
- 22. Ferner RE, Aronson JK. Cato Guldberg and Peter Waage, the history of the Law of Mass Action, and its relevance to clinical pharmacology. Br J Clin Pharmacol. 2016;81(1):52–5. pmid:26174880
- 23. Bressloff PC, Newby JM. Metastability in a stochastic neural network modeled as a velocity jump Markov process. SIAM Journal on Applied Dynamical Systems. 2013;12(3):1394–435.
- 24. Kurtz TG. Limit theorems and diffusion approximations for density dependent Markov chains. In: Wets RJB, editor. Berlin, Heidelberg: Springer; 1976. p. 67–78.
- 25. Tian T, Burrage K, Burrage PM, Carletti M. Stochastic delay differential equations for genetic regulatory networks. Journal of Computational and Applied Mathematics. 2007;205(2):696–707.
- 26. Chen K-C, Wang T-Y, Tseng H-H, Huang C-YF, Kao C-Y. A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae. Bioinformatics. 2005;21(12):2883–90. pmid:15802287
- 27. Xia M, Chou T. Kinetic theories of state- and generation-dependent cell populations. Phys Rev E. 2024;110(6–1):064146. pmid:39916132
- 28. Zechner C, Unger M, Pelet S, Peter M, Koeppl H. Scalable inference of heterogeneous reaction kinetics from pooled single-cell recordings. Nat Methods. 2014;11(2):197–202. pmid:24412977
- 29. Sukys A, Öcal K, Grima R. Approximating solutions of the Chemical Master equation using neural networks. iScience. 2022;25(9):105010. pmid:36117994
- 30. Öcal K, Gutmann MU, Sanguinetti G, Grima R. Inference and uncertainty quantification of stochastic gene expression via synthetic models. Journal of The Royal Society Interface. 2022;19(192):20220153.
- 31. Cao Z, Chen R, Xu L, Zhou X, Fu X, Zhong W, et al. Efficient and scalable prediction of stochastic reaction-diffusion processes using graph neural networks. Math Biosci. 2024;375:109248. pmid:38986837
- 32. Xia M, Li X, Shen Q, Chou T. Squared Wasserstein-2 distance for efficient reconstruction of stochastic differential equations. 2024. https://arxiv.org/abs/2401.11354
- 33. Li X, Wong TKL, Chen RTQ, Duvenaud D. Scalable gradients for stochastic differential equations. In: International Conference on Artificial Intelligence and Statistics. 2020.
- 34. Kidger P, Foster J, Li X, Lyons TJ. Neural SDEs as infinite-dimensional GANs. In: Proceedings of the 38th International Conference on Machine Learning. 2021. p. 5453–63.
- 35. Kidger P, Foster J, Li X, Lyons TJ. Neural SDEs as infinite-dimensional GANs. In: Proceedings of the 38th International Conference on Machine Learning. 2021. p. 5453–63.
- 36. Arjovsky M, Chintala S, Bottou L. Wasserstein GAN. 2017. http://arxiv.org/abs/1701.07875v3
- 37. Xia M, Li X, Shen Q, Chou T. An efficient Wasserstein-distance approach for reconstructing jump-diffusion processes using parameterized neural networks. Machine Learning: Science and Technology. 2024;5:045052.
- 38. Xia M, Shen Q. A local squared Wasserstein-2 method for efficient reconstruction of models with uncertainty. 2024. https://arxiv.org/abs/2406.06825
- 39. Xia M, Shen Q, Maini P, Gaffney E, Mogilner A. A new local time-decoupled squared Wasserstein-2 method for training stochastic neural networks to reconstruct uncertain parameters in dynamical systems. Neural Networks. 2025.
- 40. Yu R, Wang R. Learning dynamical systems from data: an introduction to physics-guided deep learning. Proc Natl Acad Sci U S A. 2024;121(27):e2311808121. pmid:38913886
- 41. Gonze D. Modeling circadian clocks: from equations to oscillations. Open Life Sciences. 2011;6(5):699–711.
- 42. Westermark PO, Welsh DK, Okamura H, Herzel H. Quantification of circadian rhythms in single cells. PLoS Comput Biol. 2009;5(11):e1000580. pmid:19956762
- 43. Dueva R, Iliakis G. Replication protein A: a multifunctional protein with roles in DNA replication, repair and beyond. NAR Cancer. 2020;2(3):zcaa022. pmid:34316690
- 44. Caldwell CC, Spies M. Dynamic elements of replication protein A at the crossroads of DNA replication, recombination, and repair. Crit Rev Biochem Mol Biol. 2020;55(5):482–507. pmid:32856505
- 45. Nguyen HD, Yadav T, Giri S, Saez B, Graubert TA, Zou L. Functions of Replication Protein A as a Sensor of R Loops and a Regulator of RNaseH1. Mol Cell. 2017;65(5):832-847.e4. pmid:28257700
- 46. Wold MS. Replication protein A: a heterotrimeric, single-stranded DNA-binding protein required for eukaryotic DNA metabolism. Annu Rev Biochem. 1997;66:61–92. pmid:9242902
- 47. Ding J, Li X, Shen J, Zhao Y, Zhong S, Lai L, et al. ssDNA accessibility of Rad51 is regulated by orchestrating multiple RPA dynamics. Nature Communications. 2023;14:3864.
- 48. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry. 1977;81(25):2340–61.
- 49. Gillespie DT. The chemical Langevin equation. The Journal of Chemical Physics. 2000;113(1):297–306.
- 50. Sheu K, Luecke S, Hoffmann A. Stimulus-specificity in the responses of immune sentinel cells. Curr Opin Syst Biol. 2019;18:53–61. pmid:32864512
- 51. Adelaja A, Taylor B, Sheu KM, Liu Y, Luecke S, Hoffmann A. Six distinct NF-κB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity. 2021;54(5):916-930.e7. pmid:33979588
- 52. Hoffmann A, Levchenko A, Scott ML, Baltimore D. The IkappaB-NF-kappaB signaling module: temporal control and selective gene activation. Science. 2002;298(5596):1241–5. pmid:12424381
- 53. Cheng QJ, Ohta S, Sheu KM, Spreafico R, Adelaja A, Taylor B, et al. NF-κB dynamics determine the stimulus specificity of epigenomic reprogramming in macrophages. Science. 2021;372(6548):1349–53. pmid:34140389
- 54. Sen S, Cheng Z, Sheu KM, Chen YH, Hoffmann A. Gene regulatory strategies that decode the duration of NF-κB dynamics contribute to LPS- versus TNF-specific gene expression. Cell Syst. 2020;10(2):169-182.e5. pmid:31972132
- 55. Adelaja A, Taylor B, Sheu KM, Liu Y, Luecke S, Hoffmann A. Six distinct NF-κB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity. 2021;54(5):916-930.e7. pmid:33979588
- 56. Guo X, Adelaja A, Singh A, Wollman R, Hoffmann A. Modeling heterogeneous signaling dynamics of macrophages reveals principles of information transmission in stimulus responses. Nat Commun. 2025;16(1):5986.
- 57. Bressloff PC. Stochastic processes in cell biology. Springer; 2014.
- 58. Spagnolo B, Spezia S, Curcio L, Pizzolato N, Fiasconaro A, Valenti D. Noise effects in two different biological systems. The European Physical Journal B. 2009;69:133–46.
- 59. Pahle J, Challenger JD, Mendes P, McKane AJ. Biochemical fluctuations, optimisation and the linear noise approximation. BMC Syst Biol. 2012;6:86. pmid:22805626
- 60. Swain PS, Elowitz MB, Siggia ED. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci U S A. 2002;99(20):12795–800. pmid:12237400
- 61. Kelleher JD. Deep learning. MIT Press; 2019.
- 62. Saptadi NTS, Kristiawan H, Nugroho AY, Rahayu N, Waseso B, Intan I. Deep Learning: Teori, Algoritma, dan Aplikasi [Deep Learning: Theory, Algorithms, and Applications]. Sada Kurnia Pustaka; 2025.
- 63. Luecke S, Guo X, Sheu KM, Singh A, Lowe SC, Han M, et al. Dynamical and combinatorial coding by MAPK p38 and NF-κB in the inflammatory response of macrophages. Mol Syst Biol. 2024;20(8):898–932. pmid:38872050
- 64. Nakamura T, Taki K, Nomiya H, Seki K, Uehara K. A shape-based similarity measure for time series data with ensemble learning. Pattern Analysis and Applications. 2013;16(4):535–48.
- 65. Frishman A, Ronceray P. Learning force fields from stochastic trajectories. Physical Review X. 2020;10(2):021009.
- 66. Rao CV, Wolf DM, Arkin AP. Control, exploitation and tolerance of intracellular noise. Nature. 2002;420(6912):231–7. pmid:12432408
- 67. Arias AM, Hayward P. Filtering transcriptional noise during development: concepts and mechanisms. Nat Rev Genet. 2006;7(1):34–44. pmid:16369570
- 68. Eling N, Morgan MD, Marioni JC. Challenges in measuring and understanding biological noise. Nat Rev Genet. 2019;20(9):536–48. pmid:31114032
- 69. Kang X, Li C. A dimension reduction approach for energy landscape: identifying intermediate states in metabolism-EMT network. Adv Sci (Weinh). 2021;8(10):2003133. pmid:34026435
- 70. Lang J, Nie Q, Li C. Landscape and kinetic path quantify critical transitions in epithelial-mesenchymal transition. Biophys J. 2021;120(20):4484–500. pmid:34480928
- 71. Li C, Wang J. Landscape and flux reveal a new global view and physical quantification of mammalian cell cycle. Proc Natl Acad Sci U S A. 2014;111(39):14130–5. pmid:25228772
- 72. Böttcher L, Asikis T. Near-optimal control of dynamical systems with neural ordinary differential equations. Machine Learning: Science and Technology. 2022;3(4):045004.
- 73. Böttcher L, Wheeler G. Visualizing high-dimensional loss landscapes with Hessian directions. Journal of Statistical Mechanics: Theory and Experiment. 2024;2024(2):023401.
- 74. Fonseca LL, Böttcher L, Mehrad B, Laubenbacher RC. Optimal control of agent-based models via surrogate modeling. PLoS Comput Biol. 2025;21(1):e1012138. pmid:39808665
- 75. Böttcher L, Fonseca LL, Laubenbacher RC. Control of medical digital twins with artificial neural networks. Philosophical Transactions of the Royal Society A. 2025;383:20240228.
- 76. Asikis T, Böttcher L, Antulov-Fantulin N. Neural ordinary differential equation control of dynamics on graphs. Physical Review Research. 2022;4(1):013221.
- 77. Böttcher L, Antulov-Fantulin N, Asikis T. AI Pontryagin or how artificial neural networks learn to control dynamical systems. Nat Commun. 2022;13(1):333. pmid:35039488
- 78. Fronk C, Petzold L. Interpretable polynomial neural ordinary differential equations. Chaos. 2023;33(4):043101. pmid:37097945
- 79. Morala P, Cifuentes JA, Lillo RE, Ucar I. NN2Poly: a polynomial representation for deep feed-forward artificial neural networks. IEEE Trans Neural Netw Learn Syst. 2025;36(1):781–95. pmid:37962996