Abstract
Discovering mathematical equations that govern physical and biological systems from observed data is a fundamental challenge in scientific research. We present a new physics-informed framework for parameter estimation and missing physics identification (gray-box) in the field of Systems Biology. The proposed framework—named AI-Aristotle—combines the eXtreme Theory of Functional Connections (X-TFC) domain-decomposition and Physics-Informed Neural Networks (PINNs) with symbolic regression (SR) techniques for parameter discovery and gray-box identification. We test the accuracy, speed, flexibility, and robustness of AI-Aristotle based on two benchmark problems in Systems Biology: a pharmacokinetics drug absorption model and an ultradian endocrine model for glucose-insulin interactions. We compare the two machine learning methods (X-TFC and PINNs), and moreover, we employ two different symbolic regression techniques to cross-verify our results. To test the performance of AI-Aristotle, we use sparse synthetic data perturbed by uniformly distributed noise. More broadly, our work provides insights into the accuracy, cost, scalability, and robustness of integrating neural networks with symbolic regressors, offering a comprehensive guide for researchers tackling gray-box identification challenges in complex dynamical systems in biomedicine and beyond.
Author summary
Our study addresses the fundamental challenge of uncovering mathematical rules governing physical and biological systems from real-world data. We introduce a novel framework, AI-Aristotle, designed for parameter estimation and identifying hidden physics (gray-box) in Systems Biology. AI-Aristotle combines the powerful eXtreme Theory of Functional Connections (X-TFC), Physics-Informed Neural Networks (PINNs), and symbolic regression (SR) techniques to discover parameters and uncover hidden relationships. Our work offers guidance to researchers addressing gray-box identification challenges in complex dynamic systems, including applications in biomedicine and beyond.
Citation: Ahmadi Daryakenari N, De Florio M, Shukla K, Karniadakis GE (2024) AI-Aristotle: A physics-informed framework for systems biology gray-box identification. PLoS Comput Biol 20(3): e1011916. https://doi.org/10.1371/journal.pcbi.1011916
Editor: Piero Fariselli, Universita degli Studi di Torino, ITALY
Received: October 4, 2023; Accepted: February 13, 2024; Published: March 12, 2024
Copyright: © 2024 Ahmadi Daryakenari et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data and codes used in this manuscript are available on GitHub at https://github.com/mariodeflorio/AI-Aristotle.
Funding: NAD and GEK gratefully acknowledge the National Institutes of Health (NIH) Spleen grant R01HL154150. MD, KS, and GEK gratefully acknowledge the Office of Naval Research (ONR) Vannevar Bush grant N00014-22-1-2795. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
One of the most coveted tasks in Machine Learning is the discovery of new physics laws from observed and experimental data. When dealing with dynamical systems, a classic goal for inverse problems is parameter discovery, where experimental data and systems of differential equations are leveraged to estimate the unknown parameters governing the system [1]. In some cases, only partial knowledge of the physics may be available, which means one or several terms of the system of equations are unknown. This is the case with the so-called Gray-Box model, where an inversion can be performed to recover the missing terms [2].
One of the first attempts to extrapolate governing equations from observed data is presented in the well-known work by Brunton et al. [3], in which the authors propose a new school of thought for dynamical system discovery problems from the perspective of sparse regression [4] and compressed sensing [5]. In particular, they take advantage of the fact that most physical systems are described by only a few relevant terms governing the dynamics, making the governing equations sparse in a high-dimensional non-linear function space. This method, named SINDy—Sparse Identification of Nonlinear Dynamics—depends on the choice of the candidate non-linear function library and on the availability and quality of the data. Thus, it is not a fully general method and works better if guided by the available knowledge in the form of constraints on the functional form of the phenomena under study. For example, given the trend of the observed data, one can approximately understand whether it is trigonometric or polynomial and build the library accordingly. SINDy has shown its capability in identifying non-linear dynamical systems from data without prior assumptions about the form of the differential equations governing the phenomena.
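To make the sparse-regression idea concrete, the following minimal Python sketch implements a sequentially thresholded least-squares loop of the kind SINDy popularized; the polynomial library, threshold, and synthetic data are illustrative choices for this sketch, not taken from Ref. [3].

```python
import numpy as np

def stlsq(Theta, dX, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: find sparse Xi with dX ~ Theta @ Xi."""
    Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold      # prune coefficients below the threshold
        Xi[small] = 0.0
        for k in range(dX.shape[1]):        # refit each equation on its active terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dX[:, k], rcond=None)[0]
    return Xi

# Toy example: recover x' = -0.5 x from a cubic polynomial library [1, x, x^2, x^3].
x = np.linspace(0.1, 1.0, 200).reshape(-1, 1)
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
dx = -0.5 * x                               # derivative data (here, noiseless)
print(stlsq(Theta, dx))                     # ~[0, -0.5, 0, 0]
```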
Another method to retrieve governing equations from data has been proposed by Udrescu et al. [6]. In this paper, the authors make use of symbolic regression (SR), which aims to find a symbolic expression that accurately represents an unknown function based on a given dataset. They developed a novel recursive multidimensional symbolic regression algorithm, named AI-Feynman, that combines neural network techniques with physics-inspired strategies. The efficiency of this method has been demonstrated by rediscovering 100 equations from the Feynman Lectures on Physics, outperforming the accuracy of the state-of-the-art publicly available software. However, despite the groundbreaking capability of this work, there are some drawbacks and areas for improvement. The method currently focuses on equations involving elementary functions but does not handle equations involving derivatives and integrals, which are common in physics. Integrating the capability to discover such equations would be valuable. Also, while AI-Feynman shows promise, it could further benefit from combining the strengths of genetic algorithms with its approach to generate a more robust and versatile equation discovery tool. Overall, the development and refinement of symbolic regression algorithms continue to evolve, offering exciting possibilities for future discoveries in the realm of physics and beyond.
In this research direction, a new framework named AI-Descartes has been recently published [7]. In this paper, the authors address the challenge of deriving meaningful mathematical models from both axiomatic knowledge and experimental data by combining logical reasoning with SR. The novelty of this method lies in the attempt to generate models that are consistent with general logical axioms. The authors showcase their method’s effectiveness by applying it to three classic scientific laws: Kepler’s third law of planetary motion, Einstein’s relativistic time-dilation law, and Langmuir’s theory of adsorption. They demonstrate the capability to discover governing laws even with limited data points, emphasizing the importance of logical reasoning in distinguishing between candidate formulas with similar data-fit accuracy. However, this method relies on the correctness and completeness of background theories, which may not always hold, and the development of further techniques such as abductive reasoning [8] for partially addressing incomplete theories would be needed. Scaling behavior remains a challenge, especially regarding the undecidability of certain logical types and variations in run-time performance.
Another recently developed SR package, named Feyn [9] and based on the symbolic regressor QLattice, is showing great performance and capabilities, especially for small data sets, where traditional machine learning techniques such as gradient boosting and random forests tend to overfit [10]. Christensen et al. [11] efficiently used Feyn on clinical omics datasets to generate high-performing models to predict disease outcomes and to reveal putative disease mechanisms.
Other approaches combine a particular type of neural network, called Random Projection Neural Networks (RPNNs) [12–14], with SR. RPNNs demonstrated great efficiency in solving forward problems of stiff ODEs and DAEs, outperforming traditional solvers [15, 16]. In Ref. [17], RPNNs are used for learning PDEs from spatio-temporal data and for the construction of the bifurcation diagram of the learned PDE. In a recent work [18], RPNNs are used to model a representation for SR called Interaction Transformation [19], showing the capability of this framework to drastically reduce the computational effort. In another work [20], a single-layer NN is combined with SR. In this approach, the SR layer, incorporating mathematical operators and basis functions, is constructed randomly instead of using genetic programming, and the output weighting parameters are optimized through least-squares optimization. The use of least-squares optimization significantly reduces computational time, resulting in system models based on simple analytic expressions that accurately represent the input-output relationship of dynamic systems. Recently, RPNNs and SR were combined in AI-Lorenz [21] to discover chaotic dynamical systems in a black-box fashion, when the differential equations of the model are totally unknown.
One of the earliest works addressing “gray-box” identification for nonlinear dynamical systems is that of Ref. [2]. The gray-box in this paper is composed of a known part, represented by a system of Ordinary Differential Equations (ODEs), and unknown parts, which are approximated using neural networks. The paper illustrates this approach by applying it to model a complex reacting system with nonlinear kinetics for parameter discovery. The authors also highlight the challenges of working with discrete-time models and the advantages of using continuous-time approximations for a more nuanced understanding of system behavior. Other gray-box identification and parameter estimation methodologies were applied to a wide range of applications, such as phase field systems, biotechnology, and optogenetics [22–26]. More recently, in [27], NNs and Gaussian Processes were used to perform gray-box identification of PDEs based on stochastic Monte Carlo simulators in biological systems, in particular for chemotaxis motility.
The PINN frameworks [28] are advancing the state-of-the-art methodologies for inverse problems of parameter discovery. Particularly challenging is the scenario in which we have a highly nonlinear dynamical system with many unknown parameters and very few available experimental data to leverage. This challenge has been addressed in a systems-biology-informed deep learning algorithm that incorporates the system of ODEs into the neural networks. In the works [29, 30], the authors demonstrated the efficiency of this new algorithm in inferring the dynamics of unobserved species using only a few scattered and noisy measurements, by testing it on benchmark problems in systems biology.
In this work, we propose a new framework named AI-Aristotle to perform parameter discovery and gray-box identification for problems in Systems Biology. We employ two neural-network-based methods for approximating the unknown terms, namely PINNs and X-TFC [31] with domain decomposition [15], and two symbolic regression algorithms for making the gray-box model mathematically explicit, namely PySR [32, 33] and gplearn [34]. Our framework is tested on two problems. The first is a three-compartment pharmacokinetics model describing single-dose drug absorption. The second, more challenging problem is an ultradian endocrine model describing the glucose-insulin interaction. PINNs and X-TFC have been previously employed for gray-box identification [21, 35, 36]. The novelty of this work lies in its unique integration of these methods and their concatenation with symbolic regression algorithms. This integrated framework allows the user to select the neural-network-based module depending on data availability, using two different symbolic regression algorithms for cross-validation. Unlike the SINDy method, which encounters difficulties with high-dimensional noisy data, the symbolic regression methods in this framework effectively address these challenges.
This paper is organized as follows. In Section 2, we present an introduction to the physics-based models used for our simulations. In Section 3, we describe the two neural network methods for solving the inverse problem with data and physics models, and the two SR algorithms used to explicitly identify the previously retrieved gray-boxes. In Section 4, we report the results obtained by the two NN methods and the two SR algorithms for different test cases involving both parameter discovery and gray-box identification. Finally, we summarize our conclusions in the Summary and discussion section.
2 Models
In this section, the mathematical models describing the phenomena of our simulations are introduced. These models are designed to capture the dynamic interactions within specific biological processes, such as drug absorption and glucose-insulin interaction, offering physics-based knowledge of the behavior and characteristics of the systems under study.
2.1 Pharmacokinetics model
The first model we aim to use for our simulations is a single-dose compartmental Pharmacokinetics (PK) model [37], represented by the following system of ODEs:

dB/dt = kg G − kb B,
dG/dt = −kg G,        (1)
dU/dt = kb B.

This model evaluates the variation of drug concentration in three compartments, in a time range [0, 10] hours. The drug is initially introduced in the GI-tract (first compartment G), where it dissolves and diffuses into the bloodstream (second compartment B). Finally, the drug is eliminated from the bloodstream through the liver, kidneys, and urinary tract (third compartment U). The parameters kg = 0.72 h−1 and kb = 0.15 h−1 represent the rates at which the drug diffuses from the GI-tract into the bloodstream and is then eliminated from the bloodstream through the liver, kidneys, and urinary tract, respectively. The drug intake is considered to be 0.1 μg of the antibiotic tetracycline. In Section 4, we will show our simulations using this model for two test cases: 1) parameter discovery, and 2) gray-box identification. With “Gray-Box”, we indicate the missing terms of a model. For this PK model, the missing term considered is the right-hand side of the first ODE, which we approximate with an unknown function h(t) as follows:

dB/dt = h(t),
dG/dt = −kg G,        (2)
dU/dt = kb B,

which we aim to obtain by using available data for B, G, and U.
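For concreteness, the short script below integrates the PK system as reconstructed in Eq (1) to produce the kind of sparse synthetic observations used in Section 4; the solver settings and observation grid here are illustrative assumptions, not the exact setup used to generate our data (which relied on Runge–Kutta and RPNN solvers).

```python
import numpy as np
from scipy.integrate import solve_ivp

kg, kb = 0.72, 0.15                 # rate constants (1/h) from the text

def pk_rhs(t, y):
    B, G, U = y                     # bloodstream, GI tract, urinary/elimination
    return [kg * G - kb * B, -kg * G, kb * B]

# 0.1 ug dose placed in the GI tract at t = 0
sol = solve_ivp(pk_rhs, (0.0, 10.0), [0.0, 0.1, 0.0],
                dense_output=True, rtol=1e-8)

t_obs = np.linspace(0.0, 10.0, 10)  # a sparse 10-point observation grid
B_obs, G_obs, U_obs = sol.sol(t_obs)
```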
2.2 Ultradian Endocrine model
The second model used in our simulations is an ultradian model for the glucose-insulin interaction [38], which is modeled by 6 state variables and 30 parameters [29]. This model describes the existence of rhythmic oscillations in both glucose and insulin levels within the body that occur on a relatively short timescale, typically less than 24 hours. In particular, in our simulations, we will use a time range of [0, 1800] minutes. It results in the following system of ODEs:

dIp/dt = f1(G) − E (Ip/Vp − Ii/Vi) − Ip/tp,
dIi/dt = E (Ip/Vp − Ii/Vi) − Ii/ti,
dG/dt = f4(h3) + IG(t) − f2(G) − f3(Ii) G,        (3)
dh1/dt = (Ip − h1)/td,
dh2/dt = (h1 − h2)/td,
dh3/dt = (h2 − h3)/td.
The three main variables of this model are the plasma insulin concentration Ip, the interstitial insulin concentration Ii, and the glucose concentration G. The last three variables h1, h2, and h3—a three-stage linear filter—represent the delay process between insulin and glucose production [38]. The functions f1, f2, f3, and f4 represent the insulin secretion, the insulin-independent glucose utilization, the insulin-dependent glucose utilization, and the delayed insulin-dependent glucose production, respectively [39], and they are expressed as follows:

f1(G) = Rm / (1 + exp(−G/(Vg C1) + a1)),
f2(G) = Ub (1 − exp(−G/(Vg C2))),
f3(Ii) = (U0 + (Um − U0)/(1 + (κ Ii)^(−β))) / (Vg C3),
f4(h3) = Rg / (1 + exp(α (h3/(Vp C5) − 1))),

where

κ = (1/C4) (1/Vi − 1/(E ti)),
and IG(t) is the exogenous (externally driven) glucose delivery rate. In our simulations, we define it over N = 3 nutrition events, at times tj (minutes) with carbohydrate quantities mj (grams):

IG(t) = Σj=1..N mj k exp(k (tj − t)) θ(t − tj),        (4)

where θ(·) is the Heaviside step function, k is the decay rate of the glucose intake, (tj, mj) = [(300, 60), (650, 40), (1100, 50)] (min, g), and the parameters governing this system of ODEs are listed in Table 1. Fig 1 shows the flow diagram of the glucose-insulin model, where the circles represent the three main state variables (Ip, Ii, G), the solid arrows represent the input and output flows and rates of exchange, and the dashed arrows represent functional relationships. The delay arrow denotes the delay process of the h1, h2, h3 state variables.
The circles represent the three main state variables (Ip, Ii, G), the solid arrows represent the input and output flows and rate of exchange, and the dashed arrows represent functional relationships.
The search ranges are listed only for the five parameters used for the parameter discovery in our simulations.
Also for this second model, we aim to pursue parameter discovery and gray-box identification. For the latter case, we approximate the missing terms in the first two ODEs with two unknown functions, f(t) and g(t), as follows:

dIp/dt = f1(G) − f(t),
dIi/dt = g(t),        (5)

with the remaining four ODEs unchanged.
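For orientation, a minimal Python sketch of the right-hand side in the standard Sturis-type form assumed above is given below; the parameter dictionary p and the functions f1–f4 and IG must be supplied by the caller (e.g., from Table 1), and the exact form should be checked against Refs. [38, 29].

```python
def ultradian_rhs(t, y, p, f1, f2, f3, f4, IG):
    # Sketch of the six-ODE ultradian model (assumed Sturis-type form).
    Ip, Ii, G, h1, h2, h3 = y
    exch = p["E"] * (Ip / p["Vp"] - Ii / p["Vi"])     # insulin exchange term
    dIp = f1(G) - exch - Ip / p["tp"]                 # plasma insulin
    dIi = exch - Ii / p["ti"]                         # interstitial insulin
    dG  = f4(h3) + IG(t) - f2(G) - f3(Ii) * G         # glucose
    dh1 = (Ip - h1) / p["td"]                         # three-stage linear filter:
    dh2 = (h1 - h2) / p["td"]                         # delay between insulin and
    dh3 = (h2 - h3) / p["td"]                         # glucose production
    return [dIp, dIi, dG, dh1, dh2, dh3]
```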
3 Methodology
As mentioned in the Introduction section, the parameter discovery and approximation of the unknown terms in the systems of ODEs are performed by two NN-based methods, while the symbolic regression is performed by two different algorithms, to cross-verify the mathematical expressions obtained. In this section, we present some details of these methods that are included in the AI-Aristotle framework, whose overall schematic is shown in Fig 2.
AI-Aristotle framework for gray-box identification: 1. The observed data and the partial knowledge of physics are used to train the selected neural-network-based module. 2. The selection of the neural-network-based module needs to be made between (a) X-TFC, recommended for high-resolution data and missing-term discovery, and (b) PINN, recommended for sparse data and parameter estimation. The neural network outputs are the time-dependent representations of the missing terms of the dynamical systems, which are fed into the symbolic regression algorithm. 3. The selected symbolic regression module identifies the mathematical expressions of the missing terms. It is recommended to use both symbolic regressors for cross-validation. 4. The full knowledge of physics is now available, allowing forward modeling.
3.1 X-TFC
The first NN-based method presented uses a single-layer random projection neural network. For the sake of simplicity, we will show its implementation for the gray-box identification in the pharmacokinetics model only, since the implementation for the ultradian endocrine model is similar.
Different techniques are combined to build this algorithm for solving both forward and inverse problems involving differential equations. The first one is a functional interpolation technique named the Theory of Functional Connections (TFC) [40, 41]. According to TFC [42], we can approximate the unknown solutions of our system of ODEs, taking into consideration the initial conditions, with the so-called constrained expressions (CE) as follows:
B(t) = (σ(z) − σ(z0))ᵀ βB + B0,        (6a)
G(t) = (σ(z) − σ(z0))ᵀ βG + G0,        (6b)
U(t) = (σ(z) − σ(z0))ᵀ βU + U0,        (6c)

whose derivatives can be analytically expressed:

B′(t) = c σ′(z)ᵀ βB,        (7a)
G′(t) = c σ′(z)ᵀ βG,        (7b)
U′(t) = c σ′(z)ᵀ βU.        (7c)
The parameter c represents a mapping coefficient that maps the time domain t into the activation function domain. To this system, we need to add the NN approximation of the unknown term h(t), which is

h(t) = σ(z)ᵀ βf.        (8)
Here, σ is the free-chosen function of the CE. No matter which free-chosen function is selected, the CE will always satisfy the initial conditions exactly. According to the X-TFC framework [31], we select a single-layer NN as the free-chosen function, such that

σ(z)ᵀ β = Σj=1..L βj σj(wj z + bj),        (9)

where L is the number of neurons, wj is the jth input weight connecting the input node with the jth neuron, βj with j = 1, …, L is the jth output weight connecting the jth neuron with the output node, bj is the bias of the jth neuron, and σj(⋅) is the NN's activation function, which is selected by the user (for all the simulations in this work, we select a tanh activation function; the motivation for this choice is reported in the first section of S1 Text). In the extreme learning machine algorithm [43], input weights and biases are randomly pre-selected (uniform random distribution), thus the only unknown parameters that need to be computed are the output weights β = [β1, …, βL]ᵀ. Once the CEs are built, they can be substituted into the system of ODEs of Eq (2) to obtain the loss functions
LB = B′(t) − h(t),        (10a)
LG = G′(t) + kg G(t),        (10b)
LU = U′(t) − kb B(t),        (10c)
Ldata,B = B(t̂) − B̂,        (10d)
Ldata,G = G(t̂) − Ĝ,        (10e)
Ldata,U = U(t̂) − Û,        (10f)

where t̂ denotes the time instants of the observations, and B̂, Ĝ, and Û are the available observed data of the three variables. As we can see, we have now reduced the problem to a system of linear equations of the type Ax = b, where the unknown x is the vector of output weights β. However, here we show the procedure to solve it as a system of non-linear equations (as will be the case for the Ultradian Endocrine model). When dealing with a system of non-linear ODEs, the next step is to build the Jacobian matrix, by differentiating the six previous losses with respect to βB, βG, βU, and βf. For the pharmacokinetics model, the Jacobian is
J = ∂(LB, LG, LU, Ldata,B, Ldata,G, Ldata,U) / ∂(βB, βG, βU, βf).        (11)
The unknown vector β is computed by iteratively solving the linearized system J(βk) Δβk = −L(βk). Each kth iteration corresponds to an update of the output weights βk+1 = βk + Δβk, where Δβk is the least-squares solution of the linearized system. If the Jacobian is rank-deficient, it is good practice to minimize the Euclidean norm of the solution to achieve better performance, or to compute the Moore-Penrose pseudoinverse of the Jacobian, as proposed in Refs. [16, 17]. Once all the output weights β are computed, they are substituted into the CEs of Eqs (6a) to (6c) and (8) to find the sought solutions. In this work, X-TFC is used in a domain-decomposition fashion [15, 44], where the time domain is decomposed into several sub-domains with equispaced time steps, and the algorithm is applied to each sub-domain in sequence, such that the solution found at the interface becomes the new initial condition for the subsequent iteration of the algorithm in the next sub-domain. A schematic of the X-TFC algorithm to solve the gray-box inverse problem for the pharmacokinetics model is shown in Fig 3.
Input weights and biases are randomly selected. The last step iteratively solves a least-squares system, so no back-propagation is involved in the training, allowing fast computational times.
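To illustrate the mechanics described above on the PK gray-box problem, the sketch below assembles the constrained expressions with a random tanh feature layer and, since this system is linear, solves for all output weights in a single least-squares step over one domain; the network sizes, the mapping to z ∈ [−1, 1], and the analytically generated data are illustrative assumptions, not the exact setup of our simulations.

```python
import numpy as np

rng = np.random.default_rng(0)
kg, kb = 0.72, 0.15
N, L = 100, 100                              # collocation points, neurons
t = np.linspace(0.0, 10.0, N)
z0, zf = -1.0, 1.0
c = (zf - z0) / (t[-1] - t[0])               # time-to-activation-domain mapping
z = z0 + c * (t - t[0])

w = rng.uniform(-1.0, 1.0, L)                # random input weights (fixed)
b = rng.uniform(-1.0, 1.0, L)                # random biases (fixed)
S   = np.tanh(np.outer(z, w) + b)            # sigma(z), shape (N, L)
S0  = np.tanh(z0 * w + b)                    # sigma(z0)
dS  = c * (1.0 - S**2) * w                   # d sigma / dt
Phi = S - S0                                 # CE basis: x(t) = Phi @ beta + x0

# Synthetic observations from the exact solution (dose G0 = 0.1 ug).
G0 = 0.1
G_d = G0 * np.exp(-kg * t)
B_d = kg * G0 / (kg - kb) * (np.exp(-kb * t) - np.exp(-kg * t))
U_d = G0 - G_d - B_d

Z = np.zeros((N, L))
A = np.block([
    [dS,        Z,             Z,   -S],     # B' - h(t) = 0
    [Z,         dS + kg * Phi, Z,    Z],     # G' + kg G = 0
    [-kb * Phi, Z,             dS,   Z],     # U' - kb B = 0
    [Phi,       Z,             Z,    Z],     # B(t_i) = B_data
    [Z,         Phi,           Z,    Z],     # G(t_i) = G_data
    [Z,         Z,             Phi,  Z],     # U(t_i) = U_data
])
rhs = np.concatenate([np.zeros(N), -kg * G0 * np.ones(N), np.zeros(N),
                      B_d, G_d - G0, U_d])

beta = np.linalg.lstsq(A, rhs, rcond=None)[0]
bB, bG, bU, bh = np.split(beta, 4)
h_learned = S @ bh                           # recovered gray-box term h(t)
```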
3.2 Physics-Informed Neural Networks (PINNs)
The second NN-based approach is known as Physics-Informed Neural Networks (PINNs). This method has the capability to address both forward and inverse problems associated with differential equations by using a deep, fully connected neural network.
3.2.1 PINNs for Pharmacokinetics model.
Building upon the concept of PINNs as originally proposed in Ref. [28], we introduce a deep learning framework that incorporates the differential equations governing the single-dose compartmental Pharmacokinetics model. In this framework, a neural network characterized by parameters θ1 takes time t as input and generates an output vector representing the state variables u(t; θ1) = (uB(t; θ1), uG(t; θ1), uU(t; θ1)), which serves as an approximation of the ODE solution. To solve the gray-box inverse problem, in addition to the unknown parameters, we have an unknown component of the equation. Thus, we introduce another neural network with a different design to approximate the unknown term h(t). The system of ODEs for this model is as follows:

dB/dt = h(t; θ2),
dG/dt = −kg G,        (12)
dU/dt = kb B.
Here, the parameters θ2 characterize the second neural network, which takes t as input and generates an output h(t; θ2).
The next crucial step involves constraining the neural network to satisfy both the scattered observations of the state variables and the system of ODEs (12). This is achieved by constructing a loss function that takes into account terms corresponding to the observations and the ODE system. To be more specific, let us assume that we have measurements of the state variables at time instances t1, t2, …, tM, and that we want to ensure the neural network satisfies the ODE system at collocation points t1, t2, …, tN. It is important to note that the measurement instants and the collocation points may not necessarily be on a uniform grid and can be chosen arbitrarily. Here, N is the number of collocation points, and M is the number of data points.
For computing the total loss, we employ the Self-Adaptive Loss Balanced method [45, 46]. The total loss function is defined as a function of θ1, θ2, p, λode, where p represents the unknown parameters of the ODEs, and λode is a vector representing the individual loss weights for all the state variables, i.e., λode = (λ1, λ2, …, λS), where S is the number of state variables. Note that λdata and λIC are constant values, equal to 1 in this study, and are not trainable variables in our neural network [46]. The total loss is computed as follows:
L(θ1, θ2, p, λode) = λdata Ldata + λIC LIC + Lode,        (13)

where

Ldata(θ1) = (1/M) Σi=1..M ‖u(ti; θ1) − u*(ti)‖²,        (14)
LIC(θ1) = ‖u(t0; θ1) − u0‖²,        (15)
Lode(θ1, θ2, p, λode) = (1/N) Σs=1..S λs Σj=1..N (dus/dt(tj; θ1) − Fs(u(tj; θ1), h(tj; θ2); p))².        (16)
We emphasize that Ldata and LIC represent the discrepancies between the neural network predictions and the measured data, making them supervised losses. Conversely, Lode is derived from the ODE system and, therefore, qualifies as an unsupervised loss. In the final step, we simultaneously determine the parameters θ1* and θ2* of both neural networks and the unknown ODE parameters p* by minimizing the loss function using gradient-based optimization methods, such as the Adam optimizer [47] and the L-BFGS optimizer [48]. Additionally, we determine the λode* vector by updating the adaptive weights at each epoch by solving:

(θ1*, θ2*, p*; λode*) = arg min (over θ1, θ2, p) max (over λode) L(θ1, θ2, p, λode).        (17)
For the training process, where our goal is to predict the unknown term h(t; θ2) and the values of the parameters simultaneously, we employ the Adam optimizer with default hyperparameters and a learning rate of 10−4. Training is performed on the entire dataset. Since our total loss comprises two supervised losses and one unsupervised loss, we adopt a two-stage training strategy as follows:
- Recognizing that supervised training typically yields faster convergence than unsupervised training, we initially train the network using the two supervised losses, Ldata and LIC, for a set number of iterations. This initial training phase enables the network to quickly align with the observed data points.
- Subsequently, we continue the training process, incorporating all three losses.
Empirical observations demonstrate that this two-stage training approach expedites network convergence. The specific number of iterations for each stage and parameters for the implementation are detailed in Section 4.1. A schematic of the PINNs algorithm for solving the gray-box inverse problem in the pharmacokinetics model is shown in Fig 4.
Here, u(t; θ1) is a vector that contains all three output states. Unlike the X-TFC network, PINN requires back-propagation, which is the expensive computational component.
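A compact PyTorch sketch of the training loop just described is given below, combining the two-stage strategy with a gradient-ascent update of the self-adaptive ODE weights; the architectures, iteration counts, and the zeroed placeholder observations are illustrative assumptions, not the settings of Tables 3 and 5, and the initial-condition loss is omitted for brevity.

```python
import torch

torch.manual_seed(0)

net_u = torch.nn.Sequential(                     # theta_1: state variables (B, G, U)
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 3))
net_h = torch.nn.Sequential(                     # theta_2: gray-box term h(t)
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1))

kg = torch.tensor(0.5, requires_grad=True)       # unknown ODE parameters p
kb = torch.tensor(0.1, requires_grad=True)
lam = torch.ones(3, requires_grad=True)          # self-adaptive ODE weights

t_dat = torch.linspace(0, 10, 10).reshape(-1, 1)
u_dat = torch.zeros(10, 3)                       # placeholder observations
t_col = torch.linspace(0, 10, 500).reshape(-1, 1).requires_grad_(True)

opt_min = torch.optim.Adam(
    [*net_u.parameters(), *net_h.parameters(), kg, kb], lr=1e-4)
opt_max = torch.optim.Adam([lam], lr=1e-3, maximize=True)   # ascent on lambda

def ode_loss():
    u = net_u(t_col)
    du = [torch.autograd.grad(u[:, i].sum(), t_col, create_graph=True)[0].squeeze(-1)
          for i in range(3)]
    B, G = u[:, 0], u[:, 1]
    r = torch.stack([du[0] - net_h(t_col).squeeze(-1),      # B' = h(t)
                     du[1] + kg * G,                        # G' = -kg G
                     du[2] - kb * B])                       # U' =  kb B
    return (lam * r.pow(2).mean(dim=1)).sum()

for it in range(20000):
    loss = (net_u(t_dat) - u_dat).pow(2).mean()             # stage 1: supervised
    if it >= 2000:
        loss = loss + ode_loss()                            # stage 2: add physics
    opt_min.zero_grad(); opt_max.zero_grad()
    loss.backward()
    opt_min.step(); opt_max.step()
```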
3.2.2 PINNs for Ultradian Endocrine model.
The system of ODEs for this model is as follows:
dIp/dt = f1(G) − f(t; θ2),
dIi/dt = g(t; θ2),        (18)
dG/dt = f4(h3) + IG(t) − f2(G) − f3(Ii) G,
dh1/dt = (Ip − h1)/td,  dh2/dt = (h1 − h2)/td,  dh3/dt = (h2 − h3)/td.
Here, parameters θ2 characterize the second neural network, which takes t as input and generates two outputs f(t; θ2) and g(t; θ2).
As with the pharmacokinetics model, this study adopts a self-adaptive loss-balanced method and a two-stage training strategy. To expedite the neural network training process, extending the discussion from the previous section on fully connected neural networks, we introduce supplementary layers following the workflow presented in [29]. A minimal code sketch of these layers follows the list below.
- Input Scaling Layer: In cases where the time domain exhibits significant variation spanning multiple orders of magnitude, which can detrimentally affect neural network training, we employ a linear scaling of the time variable t, dividing it by a value T in the time domain to obtain t̂ = t/T, which brings its values closer to O(1). In this study, for the time interval ranging from 0 to 1800, we have adopted a value of T = 100.
- Feature Layer: Frequently, solutions to ordinary differential equations (ODEs) display patterns such as periodicity or exponential decay. To enhance the neural network's ability to learn these patterns, especially in multimodal solutions with multiple frequency levels, we incorporate a dedicated feature layer. This layer is key in capturing the complexity of multimodal solutions. The general framework remains consistent across different problems. We utilize a set of functions e1(t̂), e2(t̂), …, eL(t̂) to construct L features, as illustrated in Fig 5. If discerning a clear pattern proves challenging, it is advisable to omit the feature layer rather than introducing inaccurate information. This feature layer is a training aid and not a mandatory component for the success of PINNs for systems biology identification problems.
- Output Scaling Layer: The predicted outputs may exhibit variations in magnitude. To standardize them, we employ a normalization procedure in which each network output is multiplied by a scaling factor representing the magnitude of the corresponding ODE solution. This normalization ensures that the predicted outputs are scaled consistently with the characteristics of the underlying ODE solutions. Furthermore, we introduce an additional component to this layer to facilitate the alignment of the state variables with a linear trajectory connecting the initial and final data points. This linear transformation facilitates interpreting and visualizing the model's outputs, ensuring their alignment with meaningful data trends. In summary, the Output Scaling Layer standardizes predicted outputs while integrating a linear transformation component, enhancing the interpretability and relevance of the model's results and expediting the neural network's convergence towards an accurate solution. We observed that without the output scaling layer, the model tended to get stuck in local minima.
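The module below sketches how the three layers can be composed in PyTorch; the specific feature functions, output magnitudes, and network size are illustrative guesses rather than the configuration used for our results, and the linear-trajectory alignment component is omitted for brevity.

```python
import torch

class ScaledPINN(torch.nn.Module):
    def __init__(self, T=100.0,
                 magnitudes=(100.0, 100.0, 1.0e4, 100.0, 100.0, 100.0)):
        super().__init__()
        self.T = T                                   # input-scaling constant
        self.register_buffer("mag", torch.tensor(magnitudes))
        self.net = torch.nn.Sequential(
            torch.nn.Linear(4, 128), torch.nn.Tanh(),
            torch.nn.Linear(128, 128), torch.nn.Tanh(),
            torch.nn.Linear(128, 6))

    def forward(self, t):
        t_hat = t / self.T                           # input scaling layer
        feats = torch.cat([t_hat,                    # feature layer: hand-picked
                           torch.sin(t_hat),         # periodic/decaying patterns
                           torch.sin(2.0 * t_hat),
                           torch.exp(-t_hat)], dim=-1)
        return self.mag * self.net(feats)            # output scaling layer

model = ScaledPINN()
y = model(torch.linspace(0.0, 1800.0, 5).reshape(-1, 1))   # six scaled outputs
```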
The list of parameters of this model can be found in Section 4.2. A schematic of the PINNs algorithm for solving the gray-box identification problem in the Ultradian Endocrine model is shown in Fig 5.
3.3 Symbolic regression
Symbolic regression is a powerful method used in machine learning, designed to discover a mathematical expression or equation that provides the optimal fit for a provided dataset. Unlike traditional regression methods (e.g., linear regression, polynomial regression), symbolic regression seeks to discover the underlying mathematical relationship between input variables and the target variable without making assumptions about the form of the equation. Two popular symbolic regression algorithms commonly used in this context are PySR (Python Symbolic Regression) [32] and gplearn (Genetic Programming for Symbolic Regression) [34]. These algorithms employ different techniques to discover symbolic expressions from data, and their processes are very similar to each other.
They are SR libraries that combine genetic programming with machine learning techniques to discover mathematical expressions. The first step of their process is creating an initial population of candidate equations, represented by mathematical expressions composed of simple mathematical operations (+, −, ×, ÷), functions (e.g., sine, cosine, exponential), and variables. Subsequently, each candidate equation is evaluated against the given dataset, and its performance is assessed using a fitness function that measures how well the equation fits the data, typically by calculating the mean squared error (MSE) or a similar metric. A genetic algorithm is used to select the best-performing candidate equations for the next generation. Equations that fit the data well are more likely to be selected, while less fit equations may be removed. Genetic operations like crossover (combining parts of two equations) and mutation (making small changes to an equation) are applied to the selected equations to create a new generation of candidate equations. This process iterates through multiple generations, continually improving the equations' fitness until a termination condition, such as a maximum number of generations or a threshold fitness level, is met.
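As an illustration of this workflow, the snippet below fits a closed-form expression to a gray-box term with PySR, restricted to the binary operators used in this study; the stand-in data (an analytic h = kg G − kb B in place of the neural-network output) and the iteration count are assumptions made for the example.

```python
import numpy as np
from pysr import PySRRegressor

kg, kb = 0.72, 0.15
t = np.linspace(0.0, 10.0, 500)
G = 0.1 * np.exp(-kg * t)                            # analytic stand-ins for the
B = kg * 0.1 / (kg - kb) * (np.exp(-kb * t) - np.exp(-kg * t))  # NN-learned states
h_nn = kg * G - kb * B                               # stand-in for the learned h(t)

model = PySRRegressor(
    binary_operators=["+", "-", "*"],                # operators used in this study
    model_selection="best",   # highest score among expressions within 1.5x best loss
    niterations=40,
)
model.fit(np.column_stack([G, B]), h_nn, variable_names=["G", "B"])
print(model.sympy())                                 # expect ~ 0.72*G - 0.15*B
```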
4 Results
In this section, the results of our simulations are reported and discussed. The first two subsections, 4.1 and 4.2, show the performance of X-TFC and PINNs in parameter discovery and gray-box identification for both the Pharmacokinetics and the Ultradian Endocrine model. The synthetic data are generated by solving the forward problems with a Runge–Kutta method for PINNs and with RPNNs for X-TFC. The outputs of the gray-box identification are used as input to the symbolic regression algorithms for the symbolic distillation of both NN-based methods, whose results and performance are shown in subsections 4.3.1 and 4.3.2.
4.1 Pharmacokinetics
In the parameter discovery test case, we aim to infer the values of the parameters kg = 0.72 h−1 and kb = 0.15 h−1 of the system of ODEs in Eq (1), given a certain number of available data points of B, G, and U. The results and performance for both X-TFC and PINNs are reported in Table 2, simulating the variation of drug concentration in the three compartments over a time domain of 50 hours. The number of data points used varies from 10 to 100, and both methods show great accuracy in retrieving both parameters governing the ODEs. The accuracy of the methods is evaluated with the absolute difference between the nominal value of the parameters and their inferred values. As expected, we see an increase in accuracy with an increasing number of data points, but both methods can give great precision even for a meager dataset (10 data points—one every five hours). To substantiate this claim, particularly for PINNs, we executed the model 10 times, each with a distinct random seed. We then computed the average relative error (%) of the inferred parameter values over these 10 runs and reported this average alongside the corresponding average computational time in Table 2. For the pharmacokinetics inverse problem, in PINNs, we utilized Adam optimization with Nc = 500 and a learning rate (lr) of 1×10−4, and we conducted training for 50,000 iterations. Notably, in this context, the application of self-adaptive loss-balancing weights was deemed unnecessary, and the two-phase training method was not employed. We performed the computational experiments for PINNs on NVIDIA GeForce RTX 3090 GPUs, which are powered by NVIDIA's 2nd-generation RTX Ampere architecture. The GPU has 10,496 cores and is endowed with 24 GB of GDDR6X memory. The PINNs parameter setup is shown in Table 3.
Refer to Table 1 for X-TFC hyperparameters.
The first and second numbers in the ‘Number of Iterations’ row represent the iterations during the primary and secondary training stages using Adam optimization. The third number corresponds to the training stage utilizing L-BFGS. The first and second numbers in the ‘Architecture of Neural Networks’ indicate the width and depth, respectively.
Since X-TFC uses a domain decomposition technique, we report the number of iterations needed by the iterative least squares for each sub-domain, with an iteration tolerance set equal to 1e-06. The X-TFC results reported in Tables 2 and 4 are obtained with a specific neural network hyperparameter setup, which is reported so that the results are readily reproducible. With a proper ablation study and domain decomposition, we can reduce these errors by several orders of magnitude, as shown in Tables A and B in S1 Text. The tuning hyperparameters are N, the number of points per sub-domain; L, the number of neurons; and tstep, the length of each sub-domain. The setups for each simulation are reported in Table 5; these simulations were run on an Intel(R) Xeon(R) W-2255 CPU @ 3.70GHz machine.
Comparison between X-TFC and PINNs performance via MAE, RMSE, RE, and computational time for different numbers of data points. The first number in the ‘# of Iter.’ column for PINNs represents the iterations during the primary training stage using Adam optimization, while the second number corresponds to the training stage utilizing L-BFGS.
GPUs, renowned for their inherently parallel architecture, excel in efficiently distributing specific computations across a multitude of cores. As the volume of data points grows, the potential for enhanced parallelization efficiency becomes evident, potentially resulting in reduced computation times. Indeed, computational times may decrease as the number of data points increases when employing GPUs, as illustrated in Table 2 for the results of the PINNs method, which we ran on GPUs.
In the gray-box identification test case for the Pharmacokinetics model, we aim to obtain the unknown right-hand-side term h(t) of the first ODE of the system (2). X-TFC and PINN results and performance for a simulation of 50 hours are shown in Table 4. Performance is evaluated via the Mean Absolute Error (MAE),

MAE = (1/N) Σi=1..N |ĥ(ti) − h(ti)|,

the Root Mean Squared Error (RMSE),

RMSE = sqrt[ (1/N) Σi=1..N (ĥ(ti) − h(ti))² ],

and the Relative Error (RE),

RE = ‖ĥ − h‖2 / ‖ĥ‖2,

where ĥ(t) and h(t) are the exact and learned solutions, respectively. Also for these test cases, we can see that both methods perform a good inversion of the unknown term h(t) given a few data samples. Fig 6A shows the learned concentrations in time of the three state variables B, G, and U for the X-TFC and PINNs solutions vs. the exact solution (given by 50 data points), while the learned function h(t) is plotted in Fig 6B.
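These metrics can be computed with a short helper like the one below (a sketch; the relative error is taken as an L2-norm ratio, matching our reconstruction of the definitions above).

```python
import numpy as np

def gray_box_metrics(h_exact, h_learned):
    err = h_learned - h_exact
    mae = np.mean(np.abs(err))                          # mean absolute error
    rmse = np.sqrt(np.mean(err**2))                     # root mean squared error
    re = np.linalg.norm(err) / np.linalg.norm(h_exact)  # relative (L2) error
    return mae, rmse, re
```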
As presented in Tables 2 and 4, our comparative analysis reveals valuable insights into the performance of the X-TFC and PINNs methods when applied to the same problem with varying data sizes within the same time range. For smaller datasets (e.g., 10 data points), the PINNs method can achieve better accuracy, especially for the gray-box test case, showing its inherent strength in handling sparse datasets when approximating complex functions, due to the high expressivity of the deep neural network. Conversely, as the dataset size increases, the accuracy of the X-TFC method improves substantially. Its computational speed, a distinct advantage, allows it to effectively capitalize on larger datasets. With more data points, the X-TFC method produces increasingly accurate results, eventually surpassing the accuracy achieved by the PINNs method. Despite its initial accuracy advantage, PINNs reach a point where further increasing the dataset size does not significantly improve accuracy with the same setup, while still maintaining good performance. This is probably due to the optimization error, and overcoming this limitation may involve architectural enhancements, such as increasing the neural network's depth, employing different optimization algorithms, or implementing alternative techniques. In contrast, the X-TFC method continues to benefit from additional data, showcasing its scalability and adaptability. In summary, for problems with small datasets, the PINN method excels in providing accurate solutions. For larger datasets, the X-TFC method becomes increasingly competitive, offering the potential for superior accuracy with adequate computational resources.
Finally, we evaluate the performance of the two NN-based models on noisy data, simulating a more realistic scenario. We perturb 100 synthetic data points with uniformly distributed random noise at six different noise levels, n = [1%, 2%, 3%, 4%, 5%, 10%], as follows:

ŷnoisy(ti) = ŷ(ti) (1 + n εi),        (19)

where εi is a random variable following a uniform distribution. In Table 6, the performance of X-TFC and PINNs in retrieving the missing term h(t) is reported in terms of MAE, RMSE, RE, number of iterations, and computational time. The X-TFC results are obtained without domain decomposition, to avoid overfitting in the solution, using 100 collocation points, 100 neurons, and a least-squares tolerance of 1e-06, in 0.05 seconds. For PINNs, the previous framework design is kept to handle noisy data. The comparison of the PINNs and X-TFC solutions with the exact solution, for a noise level of 0.05, is presented in Fig 7A. Additionally, Fig 7B compares the results obtained from the X-TFC and PINN methods for the unknown term h(t). For all noise levels, we can find h(t) with good accuracy using both NN-based methods, keeping errors low; the errors increase with increasing noise, as expected.
Pharmacokinetics model: (A) Comparison between exact solution B, G, and U and solution of PINNs and X-TFC with noisy data (noise std = 0.05). (B) Comparison between exact solution vs. X-TFC and PINNs solutions for unknown term h(t) with noisy data (noise std = 0.05).
Comparison between X-TFC and PINNs performance via MAE, RMSE, RE, and computational time for different values of noise.
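A sketch of the perturbation in Eq (19), under the multiplicative-uniform-noise form we reconstructed above, is:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(data, level):
    # Multiplicative uniform noise (assumed form of Eq (19)).
    eps = rng.uniform(-1.0, 1.0, size=data.shape)
    return data * (1.0 + level * eps)

t = np.linspace(0.0, 10.0, 100)
clean = 0.1 * np.exp(-0.72 * t)          # e.g., the G compartment
for level in (0.01, 0.02, 0.03, 0.04, 0.05, 0.10):
    noisy = perturb(clean, level)        # one perturbed dataset per noise level
```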
4.2 Ultradian Endocrine model
The results of the parameter discovery test case for the Ultradian Endocrine model are reported in Table 7, as the absolute difference between the nominal and inferred values of the parameters. Our simulations were conducted for the discovery of five parameters. However, the PINNs algorithm proved to be very effective in system identification, discovering up to 21 parameters of the ultradian endocrine model using only data for G and Ip. As presented in [30], using only 360 data points for G, the PINNs algorithm was able to discover 17 parameters accurately, which is challenging and not possible for the X-TFC algorithm to do with a small amount of data on only one state variable.
The performance of the two methods is given by the absolute difference between nominal values and inferred values. On the right, we also present computational times in seconds.
With X-TFC, we can retrieve the parameters already in the first sub-domain. Further iterations of the algorithm might produce higher errors; thus, more careful hyperparameter selection and initialization of the parameter and output-weight initial guesses at each sub-domain need to be carried out. The Levenberg-Marquardt algorithm is employed to perform the non-linear least squares, allowing us to define the search range of the parameters. In the context of PINNs, the obtained results are contingent on the learning process. Notably, the neural network's capacity to learn effectively is closely tied to the temporal scope of the problem. Specifically, the neural network may not yield accurate approximations within a smaller time range, which corresponds to a reduced dataset size.
In the gray-box identification case, we aim to infer the two unknown terms f(t) and g(t) in the system of ODEs (5), from available data of the variables Ip and G. In Table 8, the MAE, RMSE, RE, and computational times are reported for both the X-TFC and PINNs frameworks, for different amounts of data points, from 360 to 1800 (i.e., data available every 5, 4, 3, 2, and 1 minutes), in a simulation of 1800 minutes. For X-TFC, a domain decomposition into several sub-domains is needed; thus, the number of iterations reported in the table refers to the average number of iterations in one sub-domain. The hyperparameters for the X-TFC neural networks, as well as the configuration of parameters for the PINNs, employed to generate the results presented in Table 8, are documented in Tables 9 and 10, respectively. The first three state variables of the model learned by X-TFC and PINNs are plotted vs. the exact solution in Fig 8A, while the two learned functions f(t) and g(t) are plotted in Fig 8B. In both figures, the overlap of the solutions of the two frameworks is clear.
X-TFC and PINNs performance in terms of MAE, RMSE, RE, number of iterations, and computational time for different numbers of data points.
The first and second numbers in the ‘Architecture of Neural Networks’ indicate the width and depth, respectively. The first and second numbers in the ‘Number of Iterations’ row represent the iterations during the primary and secondary training stages.
As evidenced by the data presented in Tables 7 and 8, encompassing both gray-box and inverse problem scenarios, and spanning across both this model and the pharmacokinetics model, a discernible pattern emerges concerning the impact of dataset size on method performance.
In the case of the X-TFC method, an increase in the number of data points leads to progressively more accurate results. However, it is noteworthy that when confronted with a relatively small dataset, the PINNs method exhibits superior performance, characterized by heightened accuracy and reduced absolute error. For instance, in Table 8, the PINNs method demonstrates better efficacy with merely 360 and 450 data points. Nevertheless, as the dataset grows, the X-TFC method surpasses PINNs in accuracy and computational efficiency.
In summary, the choice between the X-TFC and PINN methods should be made judiciously, with careful consideration of dataset size and noise levels. While the X-TFC method excels with larger datasets, the PINN method exhibits a unique strength in scenarios involving smaller datasets or noisy data, where it achieves greater accuracy.
4.3 Symbolic distillation of gray-box models recovered from X-TFC and PINNs methods
After training the X-TFC and PINNs models, we obtain a gray-box model for f(t), g(t), and h(t), parameterized by high-dimensional parameters. Therefore, we perform symbolic regression and fit compact closed-form analytical expressions to f(t), g(t), and h(t) independently by using PySR [33] and gplearn [34]. Both packages use a genetic algorithm to combine algebraic expressions stochastically. The employed method shares similarities with natural selection, as it assesses the “fitness” of each expression based on its simplicity and accuracy. In this study, we restrict the binary operations in the fitting process to +, −, and ×. In symbolic regression, the accuracy of recovered expressions is assessed through complexity, score, and loss. Complexity measures the intricacy of the discovered equations in terms of the number of terms, the mathematical operations, and the overall structure of the equations. Managing complexity is an important aspect of symbolic regression because overly complex equations can be difficult to interpret and may not generalize well to new data, leading to overfitting. The score is used to select the mathematical expressions that maximize or minimize the chosen scoring metric while considering different combinations of mathematical operations and constants. The loss refers to a mathematical function that quantifies the discrepancy between the predicted values generated by a symbolic expression and the actual observed values in the dataset.
We report the validation metrics for the model obtained from PySR as the variation in loss and score against the complexity of the symbolic expression. The loss function can be taken as the mean squared error (MSE) or the root mean squared error (RMSE) between actual and predicted outputs. The score is defined as the negative of the derivative of the log-loss with respect to the complexity. The complexity in PySR is defined as the number of nodes in an expression tree, irrespective of each node's content. In the PySR implementation, we chose the candidate model with the highest score among expressions whose loss is within a factor of 1.5 of the lowest loss achieved by the most accurate model. In gplearn, we observe the variation of the loss function against the length of the symbolic expression, and we choose the candidate model at the point where complexity increases but the loss remains stagnant.
4.3.1 Symbolic distillation of pharmacokinetics model.
We perform symbolic regression for the system (12), in particular for the learned term

h(t; θ2) ≈ hsym(G, B),        (20)

where we recover the expression hsym in terms of G and B using symbolic regression.
In Table 11, we show the closed-form symbolic models obtained from the packages PySR and gplearn for the gray-box models recovered from the X-TFC and PINNs approaches. From Table 11, it is evident that the symbolic models are in very good agreement with the true models. Validation metrics for the models obtained from PySR and gplearn are shown in Fig 9. In Fig 9A, we show the plots of loss and score against the complexity of the expressions for the symbolic models obtained from PySR. It is evident that as complexity increases, the scores remain constant for both PINNs and X-TFC, which indicates convergence of the candidate model. Similarly, the loss for the PINN approach converges very early, while the loss for the X-TFC method keeps decreasing as the complexity remains constant. Therefore, a candidate model with a complexity of 5 is appropriate and does not overfit. Fig 9B shows the validation metric of the symbolic model obtained from gplearn. Unlike PySR, gplearn provides the metric in terms of loss and length of expression as the population evolves. In Fig 9B, we plot the loss against the length of expression in the symbolic models. The candidate models for the PINN and X-TFC methods, shown in Table 11, correspond to lengths of 9 and 19, respectively. In Fig 10, we show the evolved tree of binary operations, obtained from gplearn, for the symbolic model recovered for hsym obtained from PINNs. It is to be noted that the number of nodes (9) in Fig 10 represents the length of expression in the symbolic model.
(A) represents the variation in loss and score of the symbolic models obtained from PySR with respect to the complexity of the expressions. Once convergence is achieved, the score remains constant as the complexity of the recovered expression increases, and this is the criterion for selecting the candidate symbolic expressions shown in Table 11. (B) represents the variation in loss of the symbolic models, obtained from gplearn, with respect to the length of expression. We choose lengths of expression 9 and 19 for PINNs and X-TFC, respectively. These lengths of expression correspond to the minimum loss for the regressed symbolic models with the closed-form expressions shown in Table 11.
It is to be noted that the number of nodes in the tree corresponds to the length of expressions, which is 9 for the PINNs method.
An evaluation of the framework for data affected by noise is reported in Sections 4.1 (Table 6, with comments at the end of the section) and 4.3.1 for the pharmacokinetics case. The performance has been evaluated for X-TFC, PINNs, and symbolic regression. We performed the symbolic regression for the noise levels of 1%, 2%, 3%, 4%, 5%, and 10%, sampled from a uniform distribution with zero mean (μ = 0) and variance σ². To accelerate the convergence of the symbolic regression, we use an L1 loss function with a regularizer chosen according to the noise level. Therefore, the loss function is defined as

(21)

where y is the vector of actual data, ŷ is the vector of predicted data, s is the scale of the noise, and I is the identity matrix of size n × n. The mathematical expressions distilled for different noise levels are reported in Table 12.
4.3.2 Symbolic distillation of X-TFC and PINNs for Ultradian Endocrine model.
The gray-box models for Ip and Ii are expressed as

f(t; θ2) ≈ fsym(Ip, Ii),        (22)
g(t; θ2) ≈ gsym(Ip, Ii).        (23)

Here, we discover the closed and compact forms of fsym(Ip, Ii) and gsym(Ip, Ii) using symbolic regression. In Table 13, we present the closed and compact forms of the symbolic models for fsym (PINNs and X-TFC) and gsym (PINNs and X-TFC) recovered by using PySR and gplearn. Table 13 shows a very good agreement between the symbolic models and the actual expressions of the system of ODEs. In Figs 11 and 12, we present the plots that show the variation in score and loss against the complexity of the recovered expressions for the models learned from X-TFC and PINNs, for the PySR and gplearn packages, respectively. The interpretation of Figs 11 and 12 is the same as that explained in Section 4.3.1. For example, in Fig 11, the convergence with PySR is achieved when the score remains constant while the complexity increases. In Fig 12A, the convergence with the gplearn framework for fsym is achieved at lengths of expression of 18 and 25 for PINN and X-TFC, respectively. However, for gsym, we see that convergence is achieved for lengths of expression of 13 and 18 for PINN and X-TFC, respectively. In Fig 13, we show the evolved tree of binary operations, obtained from gplearn, for the symbolic model recovered for gsym obtained from PINNs. It is to be noted that the number of nodes in the tree (13) in Fig 13 represents the length of expression in the symbolic model.
(A) fsym and (B) gsym are expressed by score and loss metrics against the complexity of the expressions recovered using PySR. It is to be noted that, in both the plots, once convergence is achieved, the score remains unchanged as complexity increases.
(A) fsym and (B) gsym are expressed by MSE loss against the length of the expressions recovered using gplearn and presented in Table 13. For fsym, we choose lengths of expression 18 and 25 for PINNs and X-TFC, respectively. However, for gsym, we choose lengths of expression 13 and 25 for PINNs and X-TFC, respectively.
Summary and discussion
This paper presents a comprehensive framework named AI-Aristotle, which combines two neural network-based methods (X-TFC and PINNs) with two symbolic regression techniques to address the challenging tasks of parameter discovery and gray-box identification in Systems Biology problems.
Our framework was evaluated on two benchmark problems: the pharmacokinetics drug absorption model and the ultradian endocrine model describing glucose-insulin interactions. The results demonstrated the capability of both X-TFC and PINNs to accurately estimate parameters even with limited data, showcasing their potential for model calibration in real-world scenarios. In the gray-box identification simulations, our framework successfully discovered the missing terms in the differential equations governing the systems. The learned functions exhibited high accuracy even with a small number of data points. This ability to identify gray-box terms is essential for improving model fidelity and understanding complex systems where some underlying mechanisms are not fully known. We further distilled the learned neural network models using two symbolic regression algorithms, providing interpretable mathematical expressions. This process enhances the transparency and usability of the models, facilitating their integration into scientific research and decision-making processes.
Our study has unveiled a noticeable trend in how dataset size affects the performance of the different methods. For the X-TFC method, increasing the number of data points leads to progressively improved results. However, when dealing with relatively small datasets, the PINN method outperforms X-TFC in accuracy. This superiority can be attributed to PINNs' efficiency in handling sparse datasets and approximating complex functions with fewer data points. As the dataset size expands, the X-TFC method overtakes PINNs in both accuracy and computational efficiency. In particular, the latter occurs because of the use of least-squares optimization as a solver instead of back-propagation. It seems that the optimization error dominates in PINNs; hence, no further improvement can be achieved even for more data points. Thus, when choosing between the X-TFC and PINN methods, careful consideration of dataset size and required computational time is paramount. Since X-TFC is more sensitive to the hyperparameters selected before training, an ablation study to find the best hyperparameters is required to achieve the desired accuracy. This does not add any serious computational expense, as X-TFC is extremely fast.
We performed the distillation of the gray-box models obtained by using the PINNs and X-TFC methods. Symbolic regression provided compact, closed-form expressions for the PINN- and X-TFC-based surrogates. To show the robustness of the recovered symbolic expressions, we used the PySR and gplearn packages and recovered almost identical expressions for the Pharmacokinetics and Ultradian Endocrine models. At the implementation level, we find that PySR is a more robust and efficient framework than gplearn; for example, for the problems we considered here, PySR takes 10 minutes on a CPU, while gplearn takes up to one hour. Also, PySR requires less effort in tuning the hyperparameters of the model to perform the symbolic regression. The robustness of PySR is due to the implementation of simulated-annealing-based mutation of a tree of binary expressions [33], which is not present in the gplearn framework.
The proposed framework can be applied to a broad range of physical phenomena to estimate the governing parameters and to identify the mathematical expressions of the missing parts of partially known physics. Both PINNs and X-TFC are effective and generalizable for solving problems and dynamical systems involving both ODEs [29, 44, 49–57] and PDEs [31, 58–63], in fields such as rarefied-gas dynamics, optimal control, epidemiology, radiative transfer, chemical kinetics, and many others. X-TFC, in Ref. [15], proved to be efficient and robust in solving stiff problems in the field of chemical kinetics, also for large-scale problems in terms of the number of ODEs (the air pollution POLLU problem with 20 ODEs) and in terms of time horizon (the Belousov-Zhabotinsky reaction), thanks to the domain decomposition technique, outperforming traditional numerical methods. Likewise, in the PINNs framework, we can split the operator, as shown in Ref. [64] for a stiff biological neural model, to alleviate the issue of the stiffness of the ODEs, choosing the operator splitting approach between the Strang [65] and the Godunov [66] splitting.
Thus, the AI-Aristotle framework is generalizable and applicable to many scientific disciplines beyond systems biology. The same holds for symbolic regression techniques, which have been widely used to discover physical laws in fields such as Alzheimer’s disease modeling [36], chaotic systems [21], and wind speed forecasting [67].
Supporting information
S1 Text. Supplementary information file, including supplementary Tables A-E.
X-TFC hyperparameters setup for Pharmacokinetics parameter discovery improvement. Ablation study of X-TFC for Ultradian Endocrine model. Ablation study of PINNs for Ultradian Endocrine model.
https://doi.org/10.1371/journal.pcbi.1011916.s001
(PDF)
References
- 1. Tarantola A. Inverse problem theory and methods for model parameter estimation. SIAM; 2005.
- 2. Rico-Martinez R, Anderson J, Kevrekidis I. Continuous-time nonlinear signal processing: a neural network based approach for gray box identification. In: Proceedings of IEEE Workshop on Neural Networks for Signal Processing. IEEE; 1994. p. 596–605.
- 3. Brunton SL, Proctor JL, Kutz JN. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences. 2016;113(15):3932–3937. pmid:27035946
- 4. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological). 1996;58(1):267–288.
- 5. Donoho DL. Compressed sensing. IEEE Transactions on Information Theory. 2006;52(4):1289–1306.
- 6. Udrescu SM, Tegmark M. AI Feynman: A physics-inspired method for symbolic regression. Science Advances. 2020;6(16):eaay2631. pmid:32426452
- 7. Cornelio C, Dash S, Austel V, Josephson TR, Goncalves J, Clarkson KL, et al. Combining data and theory for derivable scientific discovery with AI-Descartes. Nature Communications. 2023;14(1):1777. pmid:37045814
- 8. Douven I. The art of abduction. MIT Press; 2022.
- 9. Broløs KR, Machado MV, Cave C, Kasak J, Stentoft-Hansen V, Batanero VG, et al. An approach to symbolic regression using Feyn. arXiv preprint arXiv:2104.05417. 2021.
- 10. Wilstrup C, Kasak J. Symbolic regression outperforms other models for small data sets. arXiv preprint arXiv:2103.15147. 2021.
- 11. Christensen NJ, Demharter S, Machado M, Pedersen L, Salvatore M, Stentoft-Hansen V, et al. Identifying interactions in omics data for clinical biomarker discovery using symbolic regression. Bioinformatics. 2022;38(15):3749–3758. pmid:35731214
- 12. Andras P. Random projection neural network approximation. In: 2018 International Joint Conference on Neural Networks (IJCNN). IEEE; 2018. p. 1–8.
- 13. Schmidt WF, Kraaijveld MA, Duin RPW, et al. Feed forward neural networks with random weights. In: International Conference on Pattern Recognition. IEEE Computer Society Press; 1992.
- 14. Igelnik B, Pao YH. Stochastic choice of basis functions in adaptive function approximation and the functional-link net. IEEE Transactions on Neural Networks. 1995;6(6):1320–1329.
- 15. De Florio M, Schiassi E, Furfaro R. Physics-informed neural networks and functional interpolation for stiff chemical kinetics. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2022;32(6).
- 16. Fabiani G, Galaris E, Russo L, Siettos C. Parsimonious physics-informed random projection neural networks for initial value problems of ODEs and index-1 DAEs. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2023;33(4).
- 17. Galaris E, Fabiani G, Gallos I, Kevrekidis I, Siettos C. Numerical bifurcation analysis of PDEs from lattice Boltzmann model simulations: a parsimonious machine learning approach. Journal of Scientific Computing. 2022;92(2):34.
- 18. de Franca FO, de Lima MZ. Interaction-transformation symbolic regression with extreme learning machine. Neurocomputing. 2021;423:609–619.
- 19. de França FO. A greedy search tree heuristic for symbolic regression. Information Sciences. 2018;442:18–32.
- 20. Köktürk-Güzel BE, Beyhan S. Symbolic regression based extreme learning machine models for system identification. Neural Processing Letters. 2021;53(2):1565–1578.
- 21. De Florio M, Kevrekidis IG, Karniadakis GE. AI-Lorenz: A physics-data-driven framework for black-box and gray-box identification of chaotic systems with symbolic regression. arXiv preprint arXiv:2312.14237. 2023.
- 22. Kemeth FP, Alonso S, Echebarria B, Moldenhawer T, Beta C, Kevrekidis IG. Black and gray box learning of amplitude equations: Application to phase field systems. Physical Review E. 2023;107(2):025305. pmid:36932491
- 23. Lovelett RJ, Avalos JL, Kevrekidis IG. Partial observations and conservation laws: Gray-box modeling in biotechnology and optogenetics. Industrial & Engineering Chemistry Research. 2019;59(6):2611–2620.
- 24. Quach M, Brunel N, d’Alché Buc F. Estimating parameters and hidden variables in non-linear state-space models based on ODEs for biological networks inference. Bioinformatics. 2007;23(23):3209–3216. pmid:18042557
- 25. Wandy J, Niu M, Giurghita D, Daly R, Rogers S, Husmeier D. ShinyKGode: an interactive application for ODE parameter inference using gradient matching. Bioinformatics. 2018;34(13):2314–2315. pmid:29490021
- 26. Loos C, Krause S, Hasenauer J. Hierarchical optimization for the efficient parametrization of ODE models. Bioinformatics. 2018;34(24):4266–4273. pmid:30010716
- 27. Lee S, Psarellis YM, Siettos CI, Kevrekidis IG. Learning black- and gray-box chemotactic PDEs/closures from agent-based Monte Carlo simulation data. Journal of Mathematical Biology. 2023;87(1):15.
- 28. Raissi M, Perdikaris P, Karniadakis GE. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics. 2019;378:686–707.
- 29. Yazdani A, Lu L, Raissi M, Karniadakis GE. Systems biology informed deep learning for inferring parameters and hidden dynamics. PLoS Computational Biology. 2020;16(11):e1007575. pmid:33206658
- 30. Daneker M, Zhang Z, Karniadakis GE, Lu L. Systems biology: Identifiability analysis and parameter identification via systems-biology-informed neural networks. In: Computational Modeling of Signaling Networks. Springer; 2023. p. 87–105.
- 31. Schiassi E, Furfaro R, Leake C, De Florio M, Johnston H, Mortari D. Extreme theory of functional connections: A fast physics-informed neural network method for solving ordinary and partial differential equations. Neurocomputing. 2021;457:334–356.
- 32. Virgolin M, Pissis SP. Symbolic regression is NP-hard. arXiv preprint arXiv:2207.01018. 2022.
- 33. Cranmer M. Interpretable machine learning for science with PySR and SymbolicRegression.jl. arXiv preprint arXiv:2305.01582. 2023.
- 34. Stephens T. gplearn: Genetic programming in Python, with a scikit-learn inspired API. 2015. Available from: https://github.com/trevorstephens/gplearn.
- 35. Kiyani E, Shukla K, Karniadakis GE, Karttunen M. A framework based on symbolic regression coupled with eXtended physics-informed neural networks for gray-box learning of equations of motion from data. arXiv preprint arXiv:2305.10706. 2023.
- 36. Zhang Z, Zou Z, Kuhl E, Karniadakis GE. Discovering a reaction–diffusion model for Alzheimer’s disease by combining PINNs with symbolic regression. Computer Methods in Applied Mechanics and Engineering. 2024;419:116647.
- 37. Barnes B, Fulford GR. Mathematical modelling with case studies: a differential equations approach using Maple and MATLAB. vol. 25. CRC Press; 2011.
- 38. Sturis J, Polonsky KS, Mosekilde E, Van Cauter E. Computer model for mechanisms underlying ultradian oscillations of insulin and glucose. American Journal of Physiology-Endocrinology And Metabolism. 1991;260(5):E801–E809. pmid:2035636
- 39. Albers DJ, Elhadad N, Tabak E, Perotte A, Hripcsak G. Dynamical phenotyping: using temporal analysis of clinically collected physiologic data to stratify populations. PLoS ONE. 2014;9(6):e96443. pmid:24933368
- 40. Mortari D. The theory of connections: Connecting points. Mathematics. 2017;5(4):57.
- 41. De Florio M, Schiassi E, D’Ambrosio A, Mortari D, Furfaro R. Theory of functional connections applied to linear ODEs subject to integral constraints and linear ordinary integro-differential equations. Mathematical and Computational Applications. 2021;26(3):65.
- 42. Mortari D. Least-squares solution of linear differential equations. Mathematics. 2017;5(4):48.
- 43. Huang GB, Zhu QY, Siew CK. Extreme learning machine: theory and applications. Neurocomputing. 2006;70(1-3):489–501.
- 44. Schiassi E, De Florio M, Ganapol BD, Picca P, Furfaro R. Physics-informed neural networks for the point kinetics equations for nuclear reactor dynamics. Annals of Nuclear Energy. 2022;167:108833.
- 45. Xiang Z, Peng W, Liu X, Yao W. Self-adaptive loss balanced physics-informed neural networks. Neurocomputing. 2022;496:11–34.
- 46. McClenny LD, Braga-Neto UM. Self-adaptive physics-informed neural networks. Journal of Computational Physics. 2023;474.
- 47. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
- 48. Liu DC, Nocedal J. On the limited memory BFGS method for large scale optimization. Mathematical Programming. 1989;45(1):503–528.
- 49. Nath K, Meng X, Smith DJ, Karniadakis GE. Physics-informed neural networks for predicting gas flow dynamics and unknown parameters in diesel engines. Scientific Reports. 2023;13(1):13683. pmid:37607951
- 50. Zhai W, Tao D, Bao Y. Parameter estimation and modeling of nonlinear dynamical systems based on Runge–Kutta physics-informed neural network. Nonlinear Dynamics. 2023;111(22):21117–21130.
- 51. Stiasny J, Chevalier S, Chatzivasileiadis S. Learning without data: Physics-informed neural networks for fast time-domain simulation. In: 2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). IEEE; 2021. p. 438–443.
- 52. Schiassi E, D’Ambrosio A, Johnston H, De Florio M, Drozd K, Furfaro R, et al. Physics-informed extreme theory of functional connections applied to optimal orbit transfer. In: Proceedings of the AAS/AIAA Astrodynamics Specialist Conference, Lake Tahoe, CA, USA; 2020. p. 9–13.
- 53. De Florio M, Schiassi E, Ganapol BD, Furfaro R. Physics-informed neural networks for rarefied-gas dynamics: Thermal creep flow in the Bhatnagar–Gross–Krook approximation. Physics of Fluids. 2021;33(4).
- 54. De Florio M, Schiassi E, Ganapol BD, Furfaro R. Physics-informed neural networks for rarefied-gas dynamics: Poiseuille flow in the BGK approximation. Zeitschrift für angewandte Mathematik und Physik. 2022;73(3):126.
- 55. De Florio M, Schiassi E, Furfaro R, Ganapol BD, Mostacci D. Solutions of Chandrasekhar’s basic problem in radiative transfer via theory of functional connections. Journal of Quantitative Spectroscopy and Radiative Transfer. 2021;259:107384.
- 56. De Florio M, Schiassi E, Calabrò F, Furfaro R. Physics-informed neural networks for 2nd order ODEs with sharp gradients. Journal of Computational and Applied Mathematics. 2024;436:115396.
- 57. Schiassi E, De Florio M, D’Ambrosio A, Mortari D, Furfaro R. Physics-informed neural networks and functional interpolation for data-driven parameters discovery of epidemiological compartmental models. Mathematics. 2021;9(17):2069.
- 58. Mowlavi S, Nabi S. Optimal control of PDEs using physics-informed neural networks. Journal of Computational Physics. 2023;473:111731.
- 59. Kharazmi E, Zhang Z, Karniadakis GE. hp-VPINNs: Variational physics-informed neural networks with domain decomposition. Computer Methods in Applied Mechanics and Engineering. 2021;374:113547.
- 60. Lou Q, Meng X, Karniadakis GE. Physics-informed neural networks for solving forward and inverse flow problems via the Boltzmann-BGK formulation. Journal of Computational Physics. 2021;447:110676.
- 61. Chen X, Yang L, Duan J, Karniadakis GE. Solving inverse stochastic problems from discrete particle observations using the Fokker–Planck equation and physics-informed neural networks. SIAM Journal on Scientific Computing. 2021;43(3):B811–B830.
- 62. Schiassi E, D’Ambrosio A, Furfaro R. Bellman neural networks for the class of optimal control problems with integral quadratic cost. IEEE Transactions on Artificial Intelligence. 2022.
- 63. Sun Y, Sengupta U, Juniper M. Physics-informed deep learning for simultaneous surrogate modeling and PDE-constrained optimization of an airfoil geometry. Computer Methods in Applied Mechanics and Engineering. 2023;411:116042.
- 64. Shekarpaz S, Zeng F, Karniadakis G. Splitting physics-informed neural networks for inferring the dynamics of integer- and fractional-order neuron models. arXiv preprint arXiv:2304.13205. 2023.
- 65. Strang G. On the construction and comparison of difference schemes. SIAM Journal on Numerical Analysis. 1968;5(3):506–517.
- 66. Godunov SK, Bohachevsky I. Finite difference method for numerical computation of discontinuous solutions of the equations of fluid dynamics. Matematičeskij Sbornik. 1959;47(3):271–306.
- 67. Alaoui Abdellaoui I, Mehrkanoon S. Symbolic regression for scientific discovery: an application to wind speed forecasting. In: 2021 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE; 2021. p. 01–08.