Hybrid Models and Biological Model Reduction with PyDSTool

The PyDSTool software environment is designed to develop, simulate, and analyze dynamical systems models, particularly for biological applications. Unlike the engineering application focus and graphical specification environments of most general purpose simulation tools, PyDSTool provides a programmatic environment well suited to exploratory data- and hypothesis-driven biological modeling problems. In this work, we show how the environment facilitates the application of hybrid dynamical modeling to the reverse engineering of complex biophysical dynamics; in this case, of an excitable membrane. The example demonstrates how the software provides novel tools that support the inference and validation of mechanistic hypotheses and the inclusion of data constraints in both quantitative and qualitative ways. The biophysical application is broadly relevant to models in the biosciences. The open source and platform-independent PyDSTool package is freely available under the BSD license from http://sourceforge.net/projects/pydstool/. The hosting service provides links to documentation and online forums for user support.


Introduction
There are now many simulation and visualization software packages available for individual application domains across the biosciences, and several general purpose packages for analyzing dynamical systems. However, less attention has been paid to the tools needed to develop models for complex phenomena through inference and reverse engineering of natural systems [1][2][3]. The unique scientific focus of the PyDSTool software package [4] (current version provided in Supplemental Text S4) is providing integrated simulation, informatic, and diagnostic tools to support a forward-looking modeling methodology for the biosciences. This fills a niche in the biosciences where mathematical tools such as dimension reduction, qualitative dynamical systems theory and bifurcation analysis can be better integrated into modeling workflows. To enhance this support, PyDSTool is open source, and is extensible and inter-operable with existing applicationspecific tools rather than competing with them. As such, PyDSTool fills a niche that is between large-scale simulation tools, capable of efficiently handling thousands of variables (e.g., Neuron [5], NEST [6], VCell [7] or Bio-SPICE [8]), graphically interactive environments that are better suited for relatively small numbers of variables (e.g., XPPAUT [9], DsTool [10] and CONTENT [11]), and lightweight scripting in Matlab [12] or SciPy [13]. This niche is detailed below by considering some unmet challenges to model development.
Model development in the biosciences requires tools such as simulation, visualization, data analysis, diagnostics and validation, sensitivity analysis, and parameter optimization [14]. There is no rigorous methodology for combining these tools effectively, making development an esoteric and often ad hoc process that blends intuition and exploration [15]. In addressing this shortcoming, systematic and computer-assisted approaches to model development have received attention in the software engineering and business management literature [16,17], but they typically focus on graphically-designed workflows for integrating large quantities of data analysis with large scale simulation, and on management of sharing, collaboration, and provenance [18][19][20]. Previous work has suggested that a computer-assisted approach to identifying models using qualitative representations can bridge the scale from microscopic to macroscopic models and can guide users to develop heterogeneous and multi-level representations that assist in comprehending complex mechanisms [21][22][23][24][25][26][27][28][29][30][31]. However, an integrated and general-purpose software environment has not yet emerged in response to this need. In response, PyDSTool has been developed as a platform for prototyping model development principles gleaned from other fields.
At a more technical level, PyDSTool provides a range of general purpose simulation and analysis tools with similar functionality to existing packages. A detailed feature comparison is provided on the website [32] and is summarized here. A major difference from other packages is the programmatic and interactive aspects of the PyDSTool library within the Python environment. For instance, PyDSTool provides high-level compositional model-building tools involving symbolic expressions and modular component templates. This helps users construct large models efficiently and in a mathematically natural fashion compared to graphical approaches or de novo coding directly in C, Fortran, Java, Python, etc. The rich representation of model structure also facilitates the sophisticated manipulation of models both interactively and algorithmically.
The symbolic tools provided allow Jacobian functions to be defined automatically for smooth dynamical systems, making their analysis more efficient. While computer algebra systems such as Maple [33] and Mathematica [34] provide more advanced symbolic tools, they do not provide fast numerical integration and bifurcation tools, and they are not intended for intensive numerical exploration of high-dimensional, nonlinear dynamical systems.
PyDSTool exhibits superior performance in numerical integration over other high-level environments because it automatically generates low-level C code from user model descriptions. The lowlevel code is linked dynamically with C or Fortran solvers and reloaded as a dynamic linked library for transparent user access within Python. Comparable high-level programmatic environments such as Matlab or Python-based packages such as SciPy, Brian [35], or PySCeS [36], rely on the less efficient high-level implementation of model codes, the results of which are accessed via an expensive 'call-back' interface from low-level integrators. In addition, Matlab is neither free nor open source, and does not provide convenient model description and analysis tools for working with dynamical systems models. This article describes an example of how the PyDSTool environment supports flexible and extensible workflows to be built over a class hierarchy designed for development and analysis of dynamical models in scientific applications, as well as for interoperability between packages and algorithms. Example workflows suited to this environment include: (1) model creation specified by code that later adaptively modifies the model structure or parameters (e.g., when interfaced with optimization algorithms); (2) adaptive batches of simulations and analyses that post-process simulation results to determine which simulations to perform next; and (3) exploration of model properties and live prototyping of model development code. Further examples are discussed throughout the text.

Using hybrid systems for reduction
To illustrate some of the unique benefits of PyDSTool to the modeler, we will discuss an example that combines elements of the workflows (1) and (3) above using reduction techniques and hybrid dynamical systems (see [37] and references therein). Hybrid dynamical systems are defined later, but loosely speaking they are made up of smooth vector fields (such as those defined by differential equations) that are punctuated by discrete changes. This makes them especially useful for modeling systems that display modularity in functional state or inherent structure. We will discuss a measurement of modularity in Results.
In contrast to dimension reduction methods that aim to increase simulation efficiency (e.g., see [38]), reduction to hybrid system models has been demonstrated as a tool for inferring the key causal mechanisms underlying a high-dimensional or otherwise complex phenomenon, and for validating the resulting hypotheses with a dynamical system of lower dimension or greater simplicity (e.g., [39][40][41][42]). In lieu of a more developed qualitative theory of nonlinear dynamics for hybrid systems, explicitly simulated representations of the reduced dynamics are needed to ensure that the hypotheses are both logically self-consistent and consistent with experimental data. (E.g., see [43] for an example of exploring this issue in biomechanical modeling.) In addition, algorithms have recently been developed that assist with the inference of mechanistic relationships in ordinary differential equation (ODE) models, and also with the systematic and semi-automated construction of the resulting reduced descriptions in terms of hybrid systems [44][45][46].
A simple example of reducing an ODE in this way is the replacement of a smooth sigmoidal function having a steep slope with a piecewise-constant or piecewise-linear step function in contexts that are not sensitive to the details of the smooth slope (see [47]). This reduction can be understood in terms of functional modularity because, in this situation, the strong nonlinearity of the function ensures effective decoupling between the input and output when the input is sub-threshold. The validation of the reduction in an explicit context tests the hypothesis that the details of the function's transition were not mechanistically relevant. The integrate-and-fire neuron described in Supplementary Text S1 uses a similar replacement of smooth spiking dynamics with a discrete, instantaneous reset.
Multiple time scale systems such as the Van der Pol oscillator or chemical reaction systems with explicit small parameters are classic examples of systems that can be reduced to hybrid systems [48][49][50]. The models are studied as singularly perturbed systems, from which a quasi-steady state approximation and similar techniques obtain a 'fast-slow' reduction. This reduction is generally in the form of a set of differential-algebraic equations (DAEs) with domain consistency conditions, and can be seen as a piecewise-local dimension reduction of the model. Thus, it can be simulated and numerically analyzed using a hybrid model formalism [44].
We demonstrate hybrid model reduction in a scenario involving the space-clamped Hodgkin-Huxley (HH) formalism for neural action potentials (AP) [51]. This is a common biophysical model based on the first-order kinetics of ion transport across a cell membrane, and reflects an equation structure inherently similar to many models in systems biology (two recent examples that use PyDSTool can be found in [52,53]). Although the HH model has been analyzed extensively by mathematical and numerical means, PyDSTool provides a novel opportunity to algorithmically derive, specify, analyze, and validate a reduced and explicit description of an AP, from which we claim that superior insight into its biophysical mechanism is possible.

Design and Implementation
The PyDSTool package is a library-based environment written primarily in Python, utilizing the numpy [54], SciPy [13], and matplotlib [55] packages. A few optional dependencies are a C and Fortran compiler and the SWIG interfacing software (http:// swig.org), which are only necessary to run simulations at their fastest or to run the bundled AUTO continuation software [56]. The avoidance of non-Python external dependencies simplifies the installation process on any operating system, which is described in full via the 'Getting Started' link from http://pydstool.sourceforge. net.

Core classes
PyDSTool is unique in providing a variety of high-level data types ('classes') and library functions that closely mimic mathematical counterparts in dynamical systems theory and provide intuitive functionality. For example, domains and numeric intervals are represented by the Interval class, for which membership, intersection, endpoint testing, etc., are simple operations. Similarly, numerical arrays are extended to become Pointsets, incorporating several features: named fields instead of indices for accessing variables, an associated independent variable that may parameterize the data, and metadata labels that can be indexed and cross-referenced. Pointsets are further abstracted to Trajectories (parameterized, smooth curves), which further add a transparent layer of domain checking and interpolation that allows numerically computed data to be treated as continuous, when appropriate. Among others, such classes provide intuitive abstractions that allow users to more efficiently express their mathematical ideas in new algorithms, or to naturally specify complex meta-model constraints.
At the lowest level, PyDSTool supports simulations of ordinary differential equations (ODEs), differential-algebraic equations (DAEs), and discrete mappings [57]. Few comparable dynamical systems packages support DAEs, which are useful in hybrid modeling, and rarely support hybrid dynamics beyond simple case-based 'switch' or 'if' statements or Heaviside functions. PyDSTool users can specify dynamics using evolution equations or explicit functions of time or state. The range of possible formalisms for specifying dynamics is supported by the Generator abstract class, which creates Trajectory objects on demand. There are several ODE solver implementations supported: the adaptive time step solvers Dopri and Radau (an implicit solver that is well-suited to stiff systems and also supports DAEs) [58], a 4th-order Runge-Kutta fixed time-step method, and a wrapping of VODE (via SciPy) [59]. All solvers support arbitrary-precision event detection with a simple Event class, which is crucial for defining hybrid systems and is missing from many application-specific simulators. PyDSTool is modular and can be extended to support other solvers.
Bundled toolboxes provide special functionality such as phase plane analysis, model reduction, optimization, data analysis, and templates and interfaces for application-specific modeling and third-party software. For instance, users who install the PySCeS [36] or SloppyCell [60] systems biology packages have the option to create models by exporting from those packages (e.g., based on SBML definitions [61]). Equally, with the NineML Python API installed [62,63], many forms of neural models can be imported directly. Alternatively, models can be prepared directly in PyDSTool using modular constructors and symbolic expressions, or by writing raw text definitions. An export option to the ADOL-C periodic solver in Matlab is also provided [64].

Hybrid model implementation
We take a practical approach to implementing hybrid systems (sometimes known as composite models in other fields) in PyDSTool that is most applicable to biophysical models, where smooth dynamics are primary and are punctuated by finite numbers of discrete events. This differs from the majority of existing simulation platforms, which typically focus on physical models for engineering applications with many parallel discrete event processes mixed with smooth dynamics. Hybrid models in PyDSTool can be built from sub-models that mix discrete mappings, ODEs, DAEs, preset trajectories, or any other embedded code that can produce a Trajectory object.
There are many formalisms for hybrid systems, but the 6-tuple of Simić et al. is adequate for our purposes [65]: H~Q,E,D,X ,G,R ð Þ . We do not take a formal approach here, and it is sufficient to describe these elements informally and direct the reader to the reference for details. Q is a finite set of discrete states of the system, which we will refer to as regimes. The regime transition graph is given by nodes from Q and edges from E. In each regime, a sub-model is defined from a corresponding ndimensional vector field from the set X over a domain from the set D. Transition events for the edges in E are indicated by 'guards' from the set G, which are n{1 dimensional sub-manifolds in each domain. We will define these guards by zero-crossing functions on those domains. Finally, there is an optional resetting map associated with each edge, taken from the set R, which discretely changes the state variables on a regime transition. Consistent with some formalisms, we allow a subset (often just one) of the n state variables (known as 'indicator' variables) to be discrete and therefore constant during each regime.
PyDSTool uses three essential code elements to define a hybrid model: (1) a hierarchy of component sub-models, (2) a mixture of zero-crossing events and global self-consistency conditions, and (3) transition rules between the sub-models that are applied on occurrence of an associated event or condition failure. During simulation of a sub-model, a terminal event may occur that stops the trajectory generation (see the Tutorial in Supplemental Text S1). In such a scenario, the transition rules for the stopped submodel are applied to the final state to choose the next sub-model. The final state may also be mapped to a new value before becoming the initial condition for the next sub-model.
A user may define a hybrid model with only one sub-model, such that a terminal event or maximum elapsed time defined for that sub-model's regime will be associated with a transition back to the same sub-model, typically after a discrete change to the state is applied. An example of this is using a simple threshold-crossing event to signal an action potential (AP) in an integrate-and-fire neuron model [66], after which a discrete change is applied to reset the membrane potential. (This model's implementation is described further in Supplemental Texts S1 and S2.) Figure 1 summarizes the implementation structure of hybrid models as a hierarchy of HybridModel objects: other HybridModels may reside at nodes while NonHybridModels (that are nonhierarchical and can only wrap Generators) reside at the leaves of the tree. In between sub-models and their parent are ModelInterface (MI) objects. MIs are generic wrappers that filter, transform or otherwise post-process the output of a sub-model ( Figure 2). For the simplest hybrid models (see Text S1 for an example), MIs are neutral and no external conditions are necessary, but their high level of generality allows validation of hybrid model selfconsistency conditions (see Results) and facilitates qualitative model optimization algorithms (see Future Directions). MIs do these tasks by encoding data-and hypothesis-driven constraints. Additionally, they can make optimization more robust using internal failure recovery. For example, recovery code can safeguard attempts by a standard optimization routine to test parameter values in the model that may be inconsistent with its definition. For instance, such code can catch a problem and return special values of the objective function rather than an error condition.

Scaffolding concepts and implementation in PyDSTool
In this sub-section, we introduce a broad conceptual framework for the established analytical approach of piecewise-reduced models, and focus on the validation of the analysis by implementing the result as a hybrid systems reduction within PyDSTool.
Modularity of systems (more accurately, 'near decomposability') can be described as the occurrence of clusters of state variables that are highly inter-coupled but sparsely or weakly coupled to external variables [2,67,68]. Figure 3 schematically describes a spatial or structural form of the conceptual reduction framework used here, in which each inferred module can be analyzed, reduced and tested separately, and then further tuned in the context of the whole system. Following similar steps, Figure 4 summarizes the approach for a model that exhibits modular functional patterns that change over time. The modules form the basis of each sub-model of a hybrid model reduction. Reduction approaches are complicated when the effective decoupling between groups of variables is time-or state-dependent, as in the singularly perturbed systems mentioned earlier, and in the main example described below.
In its focus on supporting the exploration and understanding of emergent dynamics across scales, PyDSTool permits embedding of data playback, simplified model components, or even analytical tools as surrogates inside MI objects, thereby creating a hybrid model of heterogeneous component types. For instance, an embedded optimization tool could interface with regular model components to infer a low-dimensional characterization (e.g. a functional curve fit or a reduced-order model) of the dynamics of some part of a larger process [40,41] (Figures 3 and 4). This exploratory approach can be described as 'scaffolding,' and permits users to focus on developing specific parts of a model in a contextually appropriate fashion. Some components may be temporary placeholders while others are refined, e.g. fitted to experimental data, such that the eventual goal may be to unify the model as an entirely ODE-based (non-hybrid) model. Similarly, scaffolding can allow model order reductions to be targeted to specific temporal or state sub-domains, increasing the degree of reduction possible. A more advanced example is replacing a dynamic variable with a surrogate time series that plays back experimental or simulated data (known as an 'external input' variable in PyDSTool) and ignoring feedback to it from other coupled variables.
The scaffolding idea has not been exploited in computational software previously, but is conceptually related to using code stubs and testing with surrogate data in software engineering, and to the mathematical 'buffering' principle for tackling complex models of biochemical systems [69].

Example: The Hodgkin-Huxley action potential
We present a hybrid system implementation of the reduced Hodgkin-Huxley (HH) action potential (AP) derived from a 'dominant scale' analysis [70]. Previously, rigorous mathematical approaches to this system have yielded asymptotically-valid DAEs with only implicit dynamics for some slow variables, whereas we derive a fully explicit reduced-order model of the dynamics throughout an AP [46,71]. Details of the ODEs for this system and their analysis can be found in these references, and for reasons of space we briefly summarize the mathematical setup. Coding details of this hybrid model specification can be found in Supplemental Texts S1 and S3.
There are four state variables in the HH model: v for the membrane potential and three ionic channel gating variables, two for the fast sodium (m, h) and one for delayed-rectifier potassium (n). These are given by differential equations for their first order kinetics, and are only coupled with the equation for v in a hub-like graph with non-symmetric coupling rules. The validity of a reduced regime determined by dominant scale analysis is determined by the truth of the defining assumptions. These include controls on the relative time scales of the variables (each available in explicit algebraic form) and the relative scales of dominance calculated as the quantities Y s for s[C~fm,h,n,l,ag, which includes the passive input terms for the leak conductance and an applied current I app (l and a are dummy variables that are 0 or 1 depending on their inclusion in a regime). The Y s are essentially sensitivities of the quasi-static resting potential of v with respect to changes in each other variable, and are available in explicit algebraic form. The 'active' set of variables in a regime can now be defined as for some user-defined scale tolerance sw1. A terminal event will indicate that a variable has left A during the regime when the above inequality becomes an equality (a zero-crossing), with a similar event for variables joining A. Post-processing of trajectories is required to determine exactly which variable left or joined A (see Supplemental Text S1).
Dominant scale analysis of a periodic orbit indicated four reduced regimes that capture the essential dynamics of qualitatively distinct parts of the AP cycle relative to v [70]. The sub-models for the regimes do not need to be decomposed further into sub-models, and so are implemented by NonHybridModels containing single Generator objects. ..n represent 'internal' model interface objects that wrap n sub-models (n §1). If global consistency conditions are applied to these sub-models, then n 'external' model interface objects, eMI 1...n , may also be provided. Each iMI may either contain a non-hybrid or another hybrid model object (an example is shown). Non-hybrid models combine with a GeneratorInterface to make a thin wrapper for Generator objects, ensuring API-compatibility with hybrid model objects and other MIs and thus promoting interchangeability. Conditions in each eMI specify a target combination of truth or falsity of one or more constituent features. The features measure properties of the corresponding iMI and compare them to those in some external data such as a user-imposed logical template or experimental data. doi:10.1371/journal.pcbi.1002628.g001 Each sub-model contains v and a different set of active inputs requiring ODEs for zero or more truly dynamic state variables, resulting in a DAE. The simplified equations and their consistency conditions form the scaffolding for this sub-model. There is no coupling to v from the variables not in the active set, but we must track any changes they make because they are 'slaved' to v. The validation method requires that we simulate the non-active 'shadow variables' in order to correctly determine when the sub-model's assumptions become inconsistent with the original model [44]. For this situation, PyDSTool naturally supports sub-models having differing dimension by allowing dummy variables to be included in a sub-model. Thus all sub-models must formally include all state variables in one form or other.
Taken together as a hybrid model, the piecewise-local reductions form a formal hypothesis about the key mechanistic relationships between variables during an AP. The inferred mechanistic description is formalized as a sequence of domain and transition rules called a 'template' (e.g., see Table 1 of [46]). Such clear and specific insights give an advantage for hybrid modeling over other forms of model reduction. For parameters corresponding to a fast inhibitory interneuron, the trajectories computed with this hybrid model are almost indistinguishable from that of the full model over a wide range of applied currents. For instance, the periods of oscillation match closely ( Figure 5) as I app is increased, except around the onset of AP oscillations from a steady state (I app &0:3). Figure 6 summarizes the similarity of APs in a more sophisticated dominant scale analysis that determined hybrid models from four HH-type neurons all matching the same template, thereby validating the proposed mechanism [46]. Greater mismatches between reduced and original model trajectories indicate weakening self-consistency conditions in the reductions and provide a diagnostic focus for refinement of the mechanisms at work.

Availability and Future Directions
The software and its source code are publicly available, anonymously and for free, under the BSD license: http:// sourceforge.net/projects/pydstool/. The download is accompanied by a test suite, documentation of the API, and a link to online documentation at http://pydstool.sourceforge.net. Sourceforge provides user forums for feedback about software use, bug reporting, etc. Interfaces to specialized modeling and simulation packages will continue to be developed in collaboration with interested users. This example assumes a model with hub-like connectivity, exhibiting multiple scale dynamics, and a periodic behavior (period T), but a similar process can be described for non-periodic dynamics. State variables are shown by boxes and their inter-coupling by lines. A) Dominant scale analysis identifies Regime I over some time window ½t 0 ,t 1 ) (indicated on the blue time axis) within which a subset of the variables (yellow oval) are the most influential on the system's output; the other connections are effectively weak (dashed lines). B) The internal dynamics of the resulting sub-model for the regime (yellow puzzle piece) is analyzed in the context of known input and output conditions alongside the full model under equivalent conditions, and the parameters and contextual conditions for the reduction are tuned to maximize the accuracy of this representation over ½t 0 ,t 1 ). C) The consistency of the sub-model with the full dynamics beyond t[½t 0 ,t 1 ) is tested for the generation of accurate cyclic behavior over a period ½t 0 ,T T) forT T&T, allowing for further refinement. D) The process in A-C is repeated for other regimes, creating four consecutive sub-models in this example. These should form a self-consistent cycle of entry and exit conditions (indicated by matching puzzle pieces) such that from the composition emerges a periodic behavior closely matching that of the full model. doi:10.1371/journal.pcbi.1002628.g004 This work reinforces the idea that a reduction should take place in the context of a particular phenomenon that the model captures, and that constraints from that context can be imposed as part of a computational reduction process. The hybrid model reduction methodology has determined an explicit mechanistic model for the sodium and potassium dynamics of the action potential process using a sequence of low-dimensional approximations with no a priori assumptions or formal asymptotic limits. This decompositional approach is intended to facilitate a more sophisticated investigation of underlying mechanisms in complex dynamics than the comparatively naive approaches of brute-force parameter sweeps and large-scale simulation. It also expected to lead to more effective methods of designing complex dynamics in continuous dynamical systems.
The design of PyDSTool facilitates qualitative and multiobjective optimization techniques, which are increasingly recognized as important aspects of biological modeling [3,72,73]. Work in progress extends the use of PyDSTool classes to develop the concept of scaffolding further into model optimization and analysis applications [74].

Supporting Information
Text S1 Tutorial for Hybrid Model Implementation in PyD-STool. (PDF) Text S2 Syntax-highlighted code for the demonstration script IF_squarespike_model.py. (PDF) Text S3 Syntax-highlighted code for the demonstration script HH_DSSRTtest.py. (PDF) Text S4 Complete source code for the PyDSTool package (version 0.88.120504). Includes API documentation and help files linking to web pages. This file is identical to the current public release on Sourceforge.net. (ZIP) Acknowledgments I thank S. Nolen and the anonymous reviewers for their constructive comments.