Distilling identifiable and interpretable dynamic models from biological data

doi:10.1371/journal.pcbi.1011014

Fig 1.

Workflow of the methodology.

Scenario (I) (solid black lines only): data-driven full model discovery from (time-series) data with no prior knowledge. We apply SINDy-PI and test the SIO of the discovered candidate model (CM). If the CM is not FISPO, we reparameterize it. Next, we check whether the model is interpretable; if not, we reformulate it via symbolic manipulation. The result is a FISPO, interpretable model, M*. Scenario (II) (solid black lines plus the dashed, dark orange lines in the lower part): model discovery from (time-series) data with good prior knowledge. In this scenario we seek model (in)validation and/or refinement. We have a prior model (PM), which we want to compare with an alternative candidate model discovered from data (CM). To this end, we check the SIO of the PM and reparameterize it if needed. In parallel, we apply SINDy-PI to the data to obtain a CM and make sure it is FISPO (reparameterizing it if not). Then, if needed, we use model reformulation techniques to obtain interpretable versions (M* and PM*). Lastly, we perform a comparative analysis.
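
The control flow of the two scenarios can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' implementation: every callable (`discover`, `is_fispo`, `reparameterize`, `is_interpretable`, `reformulate`, `compare`) is a hypothetical placeholder for the corresponding step in the caption (for example, `discover` would wrap SINDy-PI and `is_fispo` an SIO test), and the model representation is left open.

```python
from typing import Any, Callable

Model = Any  # placeholder type: the caption does not fix a model representation


def make_fispo_and_interpretable(
    model: Model,
    is_fispo: Callable[[Model], bool],
    reparameterize: Callable[[Model], Model],
    is_interpretable: Callable[[Model], bool],
    reformulate: Callable[[Model], Model],
) -> Model:
    """Apply the two repair steps from the caption, each only if needed."""
    if not is_fispo(model):          # SIO test failed: reparameterize
        model = reparameterize(model)
    if not is_interpretable(model):  # not interpretable: symbolic reformulation
        model = reformulate(model)
    return model


def scenario_one(data: Any, discover: Callable[[Any], Model], **steps) -> Model:
    """Scenario (I): discover a CM from data, then repair it into M*."""
    cm = discover(data)              # hypothetical wrapper around SINDy-PI
    return make_fispo_and_interpretable(cm, **steps)


def scenario_two(
    data: Any,
    prior_model: Model,
    discover: Callable[[Any], Model],
    compare: Callable[[Model, Model], Any],
    **steps,
) -> Any:
    """Scenario (II): repair the PM into PM*, obtain M*, then compare them."""
    pm_star = make_fispo_and_interpretable(prior_model, **steps)
    m_star = scenario_one(data, discover, **steps)
    return compare(pm_star, m_star)  # hypothetical comparative analysis
```

Passing the steps as callables keeps the sketch independent of any particular identifiability or symbolic toolbox; each one can be replaced by whatever tool is used in practice.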

Table 1.

Main features of the case studies: Relevant references and main characteristics of the models considered in the case studies.

The fourth and fifth rows show the maximum degree of N(x) and D(x) in Eq 4. The last row indicates if the original (ground truth, GT) model is fully identifiable and observable (FISPO).

Fig 2.

Lorenz case study.

Structural accuracy: on the left, active terms in ξ (non-zero terms of the prior model PM in black, and of the inferred model M* in green). Parameter accuracy: center, matching parametric ODEs for PM and M*. Predictive accuracy: on the right, time evolution of the different states (x₁, x₂ and x₃) of the PM and M* models.

Fig 3.

Immunity model.

Structural unidentifiability in CM (unidentifiable parameters in red, identifiable parameters in blue) leads to the same output dynamics when different parameterizations are considered, as can be seen in CM2. In contrast, the reformulation M* is FISPO and therefore there is a unique set of parameters compatible with the output measurements.
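
The effect described in this caption can be reproduced with a deliberately simple toy model, which is an assumption for illustration and not the immunity model of the paper: dx/dt = -(k1·k2)·x with measured output y = x. Because k1 and k2 enter only through their product, different parameterizations give exactly the same output, whereas the reparameterized model with the single parameter k = k1·k2 is pinned down by the data.

```python
import math

TIMES = (0.0, 0.5, 1.0, 2.0)  # arbitrary sampling times for the illustration


def output(k1, k2, x0=1.0, times=TIMES):
    """Output y(t) = x(t) of the toy model dx/dt = -(k1 * k2) * x, x(0) = x0.

    Uses the closed-form solution x(t) = x0 * exp(-k1 * k2 * t).
    """
    rate = k1 * k2
    return [x0 * math.exp(-rate * t) for t in times]


def output_reparam(k, x0=1.0, times=TIMES):
    """Same toy model after the reparameterization k = k1 * k2."""
    return [x0 * math.exp(-k * t) for t in times]


# Two different parameterizations of the unidentifiable model ...
y_a = output(2.0, 3.0)
y_b = output(1.0, 6.0)

# ... produce the same output, so data cannot tell them apart.
indistinguishable = all(math.isclose(a, b) for a, b in zip(y_a, y_b))

# In the reparameterized model, a different k gives a different output.
distinguishable = output_reparam(6.0) != output_reparam(5.0)
```

Here `indistinguishable` and `distinguishable` both evaluate to `True`: the original parameters cannot be recovered individually, but their product can.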

Fig 4.

Immunity case study.

Structural accuracy: on the left, active terms in ξ (non-zero terms of the prior model PM in black, and of the inferred model M* in green). Parameter accuracy: center, matching parametric ODEs for PM and M*. Predictive accuracy: on the right, time evolution of the different states (x₁ and x₂) of the PM and M* models.

Fig 5.

Bacterial case study.

Structural accuracy: on the left, active terms in ξ (non-zero terms of the prior model PM in black, and of the inferred model M* in green). Parameter accuracy: center, matching parametric ODEs for PM and M*. Predictive accuracy: on the right, time evolution of the different states (x₁ and x₂) of the PM and M* models.

Fig 6.

Microbial case study.

Structural accuracy: on the left, active terms in ξ (non-zero terms of the prior model PM in black, and of the inferred model M* in green). Parameter accuracy: center, matching parametric ODEs for PM and M*. Predictive accuracy: on the right, time evolution of the different states (x₁ and x₂) of the PM and M* models.

Fig 7.

Crypt case study.

Structural accuracy: on the left, active terms in ξ (non-zero terms of the prior model PM in black, and of the inferred model M* in green). Parameter accuracy: center, matching parametric ODEs for PM and M*. Predictive accuracy: on the right, time evolution of the different states (x₁, x₂ and x₃) of the PM and M* models.

Fig 8.

Yeast-Glycolysis case study.

Structural accuracy: on the left, active terms in ξ (non-zero terms of the prior model PM in black, and of the inferred model M* in green). Due to the large number of terms in ξ, the candidate functions are not shown. Parameter accuracy: center, matching parametric ODEs for PM and M*. Predictive accuracy: on the right, time evolution of the different states (x₁, x₂, x₃, x₄, x₅, x₆ and x₇) of the PM and M* models.
