Fig 1.

The neural network architecture of Ref. [33]’s π-net V1 is shown on the left.

On the right, a worked example shows how a 1-dimensional input x propagates symbolically through the network architecture. The circles with the * symbol represent a layer that is the Hadamard product of the layer’s inputs. The boxes labeled L represent standard linear layers without any activation functions. This neural network architecture has no standard activation functions such as tanh or ReLU, which makes it interpretable.
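The idea in this caption can be illustrated with a minimal sketch. The exact wiring of π-net V1 is defined in Ref. [33] and is not reproduced here, so the recurrence below (each linear layer acts on the input and is multiplied elementwise into the running hidden state) is an assumption, and the names `init_params` and `poly_net` are hypothetical. It only shows that linear layers combined by Hadamard products yield a polynomial in the input.

```python
import numpy as np

def init_params(rng, in_dim, hidden, out_dim, order):
    """Draw random weights for `order` linear layers on the input plus one output layer."""
    branches = [(rng.normal(size=(hidden, in_dim)), rng.normal(size=hidden))
                for _ in range(order)]
    head = (rng.normal(size=(out_dim, hidden)), rng.normal(size=out_dim))
    return branches, head

def poly_net(params, x):
    """Forward pass with linear layers and Hadamard products only (no tanh/ReLU)."""
    branches, (W_out, b_out) = params
    W, b = branches[0]
    h = W @ x + b                      # degree 1 in x
    for W, b in branches[1:]:
        h = (W @ x + b) * h            # elementwise product raises the degree by one
    return W_out @ h + b_out           # final linear layer mixes the hidden units

# Usage: a third-order network on a 1-dimensional input behaves as a cubic in x.
rng = np.random.default_rng(0)
params = init_params(rng, in_dim=1, hidden=4, out_dim=1, order=3)
xs = np.linspace(-1.0, 1.0, 9)
ys = np.array([poly_net(params, np.array([x]))[0] for x in xs])
coeffs = np.polyfit(xs, ys, deg=3)     # expanded coefficients, highest degree first
```

Because a cubic fit reproduces the network output at every sampled point, the network can be read as an explicit polynomial whose expanded coefficients are the quantities of interest.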

Fig 2.

For the univariate cubic polynomial f(x) = 1 + x + 2x² + 4x³, a third-order Bayesian polynomial neural network was trained with the No-U-Turn Sampler (NUTS) algorithm.

The kernel density estimates for the posterior distributions of the weights and biases of the polynomial neural network are shown. The panes are sequentially ordered (left-to-right, top-to-bottom) from the first layer to the last layer in the neural network. There is no legend associated with the colors in the figure. The colors are only used to distinguish between posterior distributions of the parameters.

Fig 3.

For the univariate cubic polynomial f(x) = 1 + x + 2x² + 4x³, a third-order Bayesian polynomial neural network was trained with a) the Laplace approximation, b) Markov Chain Monte Carlo with the No-U-Turn Sampler (NUTS) algorithm, and c) Variational Inference. For comparison, d) Bayesian linear regression was also performed on the training data. The kernel density estimates for the posterior distributions of the polynomial coefficients are shown (left) along with their predictions and credible intervals (right). For the left column, the true values of the parameters are shown in the legend. The columns share the same legend.

Fig 4.

For the univariate cubic polynomial f(x) = 1 + x + 2x² + 4x³, a third-order Bayesian polynomial neural network was trained with Markov Chain Monte Carlo.

To assess the convergence and mixing of the Markov chain, we show the trace plot for all of the expanded polynomial coefficients.

Fig 5.

For the univariate cubic polynomial f(x) = 1 + x + 2x² + 4x³, a third-order Bayesian polynomial neural network was trained with Markov Chain Monte Carlo.

To demonstrate that the Markov chain is sufficiently long, we show the MCMC results for the following sample sizes: a) 500 samples, b) 1000 samples, and c) 10000 samples.

Fig 6.

For the univariate cubic polynomial f(x) = 1 + x + 2x² + 4x³, a third-order Bayesian polynomial neural network was trained with the Laplace approximation, Markov Chain Monte Carlo with the No-U-Turn Sampler (NUTS) algorithm, and Variational Inference.

For comparison, Bayesian linear regression was also performed on the training data. We repeated the inference methods for 100 distinct datasets and calculated the fraction of the datasets in which the 90% and 95% credible intervals captured the true parameter value. The 90% and 95% confidence intervals for the 90% and 95% coverage fractions are also shown.
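The coverage calculation described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the posterior draws come from a synthetic conjugate-normal toy problem rather than a polynomial neural network, the credible intervals are equal-tailed, and the confidence interval on the coverage fraction uses a normal-approximation (Wald) binomial interval, which the caption does not specify. The names `coverage_fraction` and `wald_interval` are hypothetical.

```python
import numpy as np

def coverage_fraction(samples, true_value, level=0.95):
    """Fraction of datasets whose equal-tailed credible interval contains true_value.

    samples: array of shape (n_datasets, n_draws) with posterior draws of one parameter.
    """
    alpha = 1.0 - level
    lower = np.quantile(samples, alpha / 2.0, axis=1)
    upper = np.quantile(samples, 1.0 - alpha / 2.0, axis=1)
    covered = (lower <= true_value) & (true_value <= upper)
    return covered.mean()

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion (assumed method)."""
    half_width = z * np.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Synthetic stand-in: 100 datasets, known noise, flat prior on the mean.
rng = np.random.default_rng(0)
true_mu, sigma, n_obs, n_datasets, n_draws = 2.0, 1.0, 20, 100, 2000
data = rng.normal(true_mu, sigma, size=(n_datasets, n_obs))
post_mean = data.mean(axis=1)
post_sd = sigma / np.sqrt(n_obs)
samples = rng.normal(post_mean[:, None], post_sd, size=(n_datasets, n_draws))

frac_95 = coverage_fraction(samples, true_mu, level=0.95)
ci_low, ci_high = wald_interval(frac_95, n_datasets)
```

For a well-calibrated posterior, `frac_95` should sit near 0.95, and the interval from `wald_interval` indicates how much of any deviation is attributable to using only 100 datasets.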

Fig 7.

The fitted Gaussian process regression model trained on the noisy Lotka Volterra Oscillator data was used as initial conditions for the neural ODE’s integration training trajectories.

Fig 8.

For the Lotka Volterra Oscillator, we show the kernel density estimates for the posterior distributions of the polynomial coefficients obtained with a.) the Laplace Approximation, b.) Markov Chain Monte Carlo, and c.) Variational Inference. For comparison purposes, we also show the case for d.) Approximate Bayesian Computation on a normal ODE. The true value of the coefficients is shown in the legend. The legend is shared for each of the columns.

Fig 9.

For the Lotka Volterra Oscillator, we show the predictive performance of a Bayesian polynomial neural ODE trained using a) the Laplace Approximation, b) Markov Chain Monte Carlo, and c) Variational Inference. The solid red and blue dots indicate the training data, solid green lines indicate the true ODE model, dashed lines indicate the predictive mean model, and shaded regions indicate 95% and 99.75% credible intervals.

Fig 10.

For the Lotka Volterra Oscillator, a second-order Bayesian polynomial neural network was trained with Markov Chain Monte Carlo.

To assess the convergence and mixing of the Markov chain, we show the trace plot for all of the expanded polynomial coefficients.

Table 1.

To assess the convergence of Markov Chain Monte Carlo inference on the Lotka Volterra Oscillator, we show the Geweke diagnostic number for each of the parameters.

The values were calculated with Eq 23. At a significance level of 0.05, the two-tailed critical values of the t distribution are ±1.964.
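As a rough illustration of a Geweke-style check, the sketch below compares the mean of an early segment of a chain with the mean of a late segment. Eq 23 is not reproduced in this caption, so the formula here is an assumption: it uses the first 10% and last 50% of the chain and plain sample variances, whereas other formulations use spectral density estimates to account for autocorrelation. The name `geweke_score` is hypothetical.

```python
import numpy as np

def geweke_score(chain, first=0.1, last=0.5):
    """Standardized difference between early and late segment means of one chain."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    early = chain[: int(first * n)]
    late = chain[n - int(last * n):]
    # Assumed standard error: independent-sample variances of the two segments.
    std_err = np.sqrt(early.var(ddof=1) / early.size + late.var(ddof=1) / late.size)
    return (early.mean() - late.mean()) / std_err

# Usage on synthetic chains (not the paper's MCMC output).
rng = np.random.default_rng(0)
stationary = rng.normal(size=5000)                   # no drift in the mean
drifting = stationary + np.linspace(0.0, 5.0, 5000)  # mean shifts along the chain
score_stationary = geweke_score(stationary)
score_drifting = geweke_score(drifting)
```

A score whose magnitude stays below the critical value quoted in the caption gives no evidence that the early and late parts of the chain differ, while a chain that is still drifting produces a large magnitude.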

Fig 11.

The fitted Gaussian process regression model trained on the noisy Damped Oscillator data was used as initial conditions for the neural ODE’s integration training trajectories.

Fig 12.

For the Damped Oscillator, we show the kernel density estimates for the posterior distributions of the polynomial coefficients obtained with a.) the Laplace Approximation, b.) Markov Chain Monte Carlo, and c.) Variational Inference. For comparison purposes, we also show the case for d.) Approximate Bayesian Computation on a normal ODE. The true value of the coefficients is shown in the legend. The legend is shared for each of the columns.

Fig 13.

For the Damped Oscillator, we show the predictive performance of a Bayesian polynomial neural ODE trained using a) the Laplace Approximation, b) Markov Chain Monte Carlo, and c) Variational Inference. The solid red and blue dots indicate the training data, solid green lines indicate the true ODE model, dashed lines indicate the predictive mean model, and shaded regions indicate 95% and 99.75% credible intervals.

Fig 14.

For the Lorenz Attractor, we show the kernel density estimates for the posterior distributions of the polynomial coefficients obtained with a.) the Laplace Approximation, b.) Markov Chain Monte Carlo, and c.) Variational Inference. The true value of the coefficients is shown in the legend. The legend is shared for each of the columns.

Fig 15.

For the Lorenz Attractor, we show the predictive performance of a Bayesian polynomial neural ODE trained using the Laplace Approximation.

The solid red, blue, and orange dots indicate the training data, solid green lines indicate the true ODE model, dashed lines indicate the predictive mean model, and shaded regions indicate 95% and 99.75% credible intervals.

Fig 16.

For the Lorenz Attractor, we show the predictive performance of a Bayesian polynomial neural ODE trained using Markov Chain Monte Carlo.

The solid red, blue, and orange dots indicate the training data, solid green lines indicate the true ODE model, dashed lines indicate the predictive mean model, and shaded regions indicate 95% and 99.75% credible intervals.

Fig 17.

For the Lorenz Attractor, we show the predictive performance of a Bayesian polynomial neural ODE trained using variational inference.

The solid red, blue, and orange dots indicate the training data, solid green lines indicate the true ODE model, dashed lines indicate the predictive mean model, and shaded regions indicate 95% and 99.75% credible intervals.

Fig 18.

It is common for a domain expert to understand part of a system’s underlying mechanisms but to have an incomplete model.

Given an incomplete model, a neural ODE can learn the missing terms from the ODE model that best fit the observed data. We have removed two of the terms from the Lotka Volterra model and tested the neural ODE’s ability to learn the missing terms. We show the kernel density estimates for the posterior distributions of the polynomial coefficients obtained with a.) the Laplace Approximation, b.) Markov Chain Monte Carlo, and c.) Variational Inference. The true value of the coefficients is shown in the legend. The legend is shared for each of the columns.
