Missing data in amortized simulation-based neural posterior estimation

doi:10.1371/journal.pcbi.1012184

Fig 1.

Illustration of the workflow combining BayesFlow with missing data encoding.

Upfront training phase (left): Parameters θ ∼ π(θ) are sampled from the prior to simulate complete data sets x_1:N. Then, missing entries are randomly selected and encoded according to one of the three approaches “Insert c” (here c = −1), “Augment by 0/1” (here c = −1), and “Time labels”. The BayesFlow network is trained on such data sets with missing values using an online learning algorithm. Amortized inference (right): Experimentally observed incomplete data are processed using the preferred encoding approach (here “Augment by 0/1”, c = −1) and passed through the pre-trained BayesFlow network in its inverse direction. This yields representative samples from the posterior conditioned on the available data. The cost of the upfront training is amortized over inference on arbitrarily many incomplete data sets.
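The three encodings named in this caption can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the indicator convention (1 = observed, 0 = missing), and the reading of “Time labels” as dropping missing entries and pairing each remaining value with its time point are assumptions that the caption does not spell out.

```python
import numpy as np

def insert_c(x, missing, c=-1.0):
    """'Insert c': overwrite missing entries with the dummy value c."""
    x = np.asarray(x, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    return np.where(missing, c, x)

def augment_by_indicator(x, missing, c=-1.0):
    """'Augment by 0/1': insert c and attach a binary indicator per entry.

    Assumed convention: 1 = observed, 0 = missing.
    """
    missing = np.asarray(missing, dtype=bool)
    filled = insert_c(x, missing, c)
    indicator = (~missing).astype(float)
    return np.stack([filled, indicator], axis=-1)  # shape (n, 2)

def time_labels(x, t, missing):
    """'Time labels' (assumed reading): keep observed entries only,
    each paired with the time point at which it was measured."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    keep = ~np.asarray(missing, dtype=bool)
    return np.stack([x[keep], t[keep]], axis=-1)  # shape (n_observed, 2)

# Hypothetical data set with three observations, the second one missing.
x = [0.2, 0.5, 0.7]
t = [0.0, 1.0, 2.0]
missing = [False, True, False]

enc_insert = insert_c(x, missing)
enc_augment = augment_by_indicator(x, missing)
enc_time = time_labels(x, t, missing)
```

The first two encodings keep the input length fixed, whereas the assumed “Time labels” variant produces inputs of variable length, so the network consuming it has to accept sequences of different sizes.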


Fig 2.

Posterior approximations for the conversion reaction model with n_x = 3 observations.

Two test data sets at parameters [−0.98, −0.66] (Data set 1, top, n_⌀ = 1) and [−0.71, −0.54] (Data set 2, bottom, n_⌀ = 2) are shown. In Data set 2, no informative data are available, such that the posterior must equal the prior. All three encodings yield near-perfect posterior approximations for this simple problem.


Fig 3.

Increased robustness through binary indicator augmentation in case of ambiguous dummy values c = 0.5 for the conversion reaction model with n_x = 3 observations.

Left: The approach “Insert 0.5” sees a data set in which only the second observation is missing. However, the network misinterprets the signal 0.501 as another missing value. Hence, the estimated posterior is wrong, and the third available data point is not fitted by the re-simulated trajectories. Middle: The approach “Augment by 0/1” correctly identifies the value 0.5 in the second entry as a missing value and the value in the third entry as a signal. Consequently, the estimated posterior is correct, and the re-simulated trajectories fit the third data point, but not the second one. Right: With a changed binary indicator, the approach “Augment by 0/1” correctly interprets the value 0.5 in the second and third entries as missing, despite 0.5 being a plausible data value for the third entry.
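The ambiguity described in this caption can be reproduced with a small NumPy sketch. The data values, the helper `encode`, and the indicator convention (1 = observed, 0 = missing) are hypothetical; the sketch uses the limiting case of a measurement that equals the dummy value exactly, rather than the value 0.501 shown in the figure.

```python
import numpy as np

c = 0.5  # dummy value that is also a plausible measurement

def encode(x, missing, c, augment):
    """Fill missing entries with c; optionally attach a 0/1 indicator.

    Assumed convention: 1 = observed, 0 = missing.
    """
    x = np.asarray(x, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    filled = np.where(missing, c, x)
    if not augment:
        return filled
    return np.stack([filled, (~missing).astype(float)], axis=-1)

# Case A: second entry missing, third entry is a real measurement of 0.5.
case_a = ([0.9, 0.0, 0.5], [False, True, False])
# Case B: second and third entries are both missing.
case_b = ([0.9, 0.0, 0.0], [False, True, True])

# With "Insert 0.5" alone, both cases produce the same network input.
same_without_indicator = np.array_equal(
    encode(*case_a, c, augment=False), encode(*case_b, c, augment=False)
)
# With the binary indicator, the two cases remain distinguishable.
same_with_indicator = np.array_equal(
    encode(*case_a, c, augment=True), encode(*case_b, c, augment=True)
)
```

Here `same_without_indicator` is `True` and `same_with_indicator` is `False`: without the indicator column, the two situations collapse to identical inputs, so no network could tell them apart.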


Fig 4.

Comparison of loss behavior for the sinusoidal model with variable data set length.

(a) Epoch-averaged loss over all 300 training epochs. (b) Loss in the last 20 iterations of the final epoch. Our missing data handling approach based on binary indicator augmentation converges better than the original BayesFlow method, both globally (a) and at the level of individual iterations (b).


Fig 5.

Results for the sinusoidal model with uniformly sampled missing time steps.

Top: Posterior distributions. Bottom: Posterior predictive checks. Two data sets at parameters [0.2, −0.4] (Data set 1, left, n_⌀ = 15) and [0.95, 0.1] (Data set 2, right, n_⌀ = 20) are shown.


Fig 6.

Results for the SIR ODE model.

Top: Posterior distributions. Bottom: Posterior predictive checks displaying the means of noise-corrupted simulations and their centered 90% credible intervals. Three data sets at ground truth parameters [−0.8, −1.4] (Data set 1, left, n_⌀ = 15), [−1.0, −1.7] (Data set 2, middle, n_⌀ = 10) and [−0.5, −1.3] (Data set 3, right, n_⌀ = 5) are shown.


Fig 7.

Parameter-dependent missingness for a modified conversion reaction model.

Left: Visualization of the data generation process. Right: Posterior approximations using the encoding “Augment by 0/1” for three data sets at ground truth parameters [−0.75, 0.6] (Data set 1), [−0.6, 0.3] (Data set 2) and [−1.1, 0.7] (Data set 3).
