PEtab—Interoperable specification of parameter estimation problems in systems biology

Reproducibility and reusability of the results of data-based modeling studies are essential. Yet, so far, there has been no broadly supported format for the specification of parameter estimation problems in systems biology. Here, we introduce PEtab, a format that facilitates the specification of parameter estimation problems using Systems Biology Markup Language (SBML) models and a set of tab-separated value files describing the observation model and experimental data as well as the parameters to be estimated. We have already implemented PEtab support in eight well-established model simulation and parameter estimation toolboxes with hundreds of users in total. We provide a Python library for validation and modification of a PEtab problem and currently 20 example parameter estimation problems based on recent studies.


SBML model definition
The model must be specified as valid SBML. There are no further restrictions.

Condition table
The condition table specifies parameters, or initial values of species and compartments for specific simulation conditions (generally corresponding to different experimental conditions). This is specified as a tab-separated value file in the following way: Condition names are arbitrary strings to describe the given condition. They may be used for reporting or visualization.

${parameterOrSpeciesOrCompartmentId1}
Further columns may be global parameter IDs, IDs of species or compartments as defined in the SBML model. Only one column is allowed per ID. Values for these condition parameters may be provided either as numeric values, or as IDs defined in the SBML model, the parameter table or both.

${parameterId}
The values will override any parameter values specified in the model.

${speciesId}
If a species ID is provided, it is interpreted as the initial concentration/amount of that species and will override the initial concentration/amount given in the SBML model or given by a preequilibration condition. If NaN is provided for a condition, the result of the preequilibration (or initial concentration/amount from the SBML model, if no preequilibration is defined) is used.
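The override rules for parameter and species columns can be illustrated with a minimal sketch. It is not part of the PEtab library: the column names `conditionId` and `conditionName`, the entity IDs `k1` and `A`, the model default values, and the helper `resolve_condition` are all assumptions made for illustration. For brevity, the sketch applies the NaN fallback to every column, although the text above defines it for species only.

```python
import csv
import io
import math

# Hypothetical condition table. "conditionId" and "conditionName" are assumed
# column names; k1 (a parameter) and A (a species) are made-up model IDs.
CONDITION_TSV = (
    "conditionId\tconditionName\tk1\tA\n"
    "ctrl\tControl\t0.5\tNaN\n"
    "drug\tDrug added\t1.5\t10\n"
)

# Hypothetical values as they would be read from the SBML model.
MODEL_DEFAULTS = {"k1": 1.0, "A": 2.0}


def resolve_condition(row, defaults, preeq_state=None):
    """Return the values to simulate with for one row of the condition table."""
    resolved = dict(defaults)
    for column, raw in row.items():
        if column in ("conditionId", "conditionName"):
            continue
        try:
            value = float(raw)
        except ValueError:
            # Not numeric: keep the ID so it can be looked up in the
            # SBML model or the parameter table later.
            resolved[column] = raw
            continue
        if math.isnan(value):
            # NaN: use the preequilibration result if there is one,
            # otherwise keep the value from the SBML model.
            if preeq_state is not None and column in preeq_state:
                resolved[column] = preeq_state[column]
            continue
        # A numeric entry overrides the value from the model.
        resolved[column] = value
    return resolved


rows = list(csv.DictReader(io.StringIO(CONDITION_TSV), delimiter="\t"))
conditions = {r["conditionId"]: resolve_condition(r, MODEL_DEFAULTS) for r in rows}
```

In this sketch, the `ctrl` condition keeps the model value of `A` because its entry is NaN and no preequilibration state is passed, while `drug` overrides both `k1` and `A`.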

${compartmentId}
If a compartment ID is provided, it is interpreted as the initial compartment size.

Measurement table

observableParameters
This field allows overriding or introducing condition-specific versions of output parameters defined in the observation model. The model can define observables (see below) containing placeholder parameters, which can be replaced by condition-specific dynamic or constant parameters. Placeholder parameters must be named observableParameter${n}_${observableId}, with n ranging from 1 (not 0) to the number of placeholders for the given observable, without gaps. If the observable specified under observableId contains no placeholders, this field must be empty. If it contains n > 0 placeholders, this field must hold n semicolon-separated numeric values or parameter names. A trailing semicolon must not be added.
Different lines for the same observableId may specify different parameters. This may be used to account for condition-specific or batch-specific parameters. This will translate into an extended optimization parameter vector.
All placeholders defined in the observation model must be overwritten here. If there are no placeholders used, this column may be omitted.
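The placeholder rules can be made concrete with a short sketch that maps one observableParameters entry to the placeholders of an observable. The function `map_placeholders`, the observable ID `obs_a`, the formula, and the parameter name `scaling_batch1` are hypothetical and serve only as an illustration. This is one possible reading of the rules, not the reference implementation.

```python
import re


def map_placeholders(observable_id, formula, overrides_field):
    """Map the placeholders in `formula` to the entries of one measurement row."""
    pattern = re.compile(
        r"\bobservableParameter(\d+)_" + re.escape(observable_id) + r"\b"
    )
    indices = sorted({int(m.group(1)) for m in pattern.finditer(formula)})
    n = len(indices)
    # Placeholders must be numbered 1..n without gaps.
    if indices != list(range(1, n + 1)):
        raise ValueError("placeholders must be numbered from 1 without gaps")

    field = overrides_field.strip()
    if n == 0:
        if field:
            raise ValueError("observable has no placeholders: field must be empty")
        return {}
    if field.endswith(";"):
        raise ValueError("a trailing semicolon is not allowed")

    entries = [entry.strip() for entry in field.split(";")]
    if len(entries) != n or any(not entry for entry in entries):
        raise ValueError(f"expected {n} semicolon-separated entries")

    mapping = {}
    for i, entry in enumerate(entries, start=1):
        try:
            value = float(entry)  # numeric override
        except ValueError:
            value = entry  # otherwise treated as a parameter name
        mapping[f"observableParameter{i}_{observable_id}"] = value
    return mapping


# Hypothetical observable formula with two placeholders.
FORMULA = "observableParameter1_obs_a * A + observableParameter2_obs_a"
mapping = map_placeholders("obs_a", FORMULA, "scaling_batch1;0.1")
```

Here the first placeholder is replaced by the parameter `scaling_batch1`, which would then become part of the extended optimization parameter vector, and the second by the constant 0.1.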

noiseParameters [NUMERIC, STRING OR NULL, OPTIONAL]
The measurement standard deviation or NaN if the corresponding sigma is a model parameter.
Numeric values or parameter names are allowed. The same rules apply as for observableParameters.

datasetId [STRING, OPTIONAL]
The datasetId is used to group measurements into datasets. This is typically the case for data points which belong to the same observable, the same simulation and preequilibration condition, the same noise model, the same observable transformation and the same observable parameters. This grouping makes it possible to use the plotting routines which are provided in the PEtab repository.

replicateId [STRING, OPTIONAL]
The replicateId can be used to distinguish replicates with the same datasetId, which is helpful for plotting, e.g., error bars.
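As a sketch of how datasetId and replicateId might be used when preparing a plot, the following example collects one time series per replicate and computes the mean and sample standard deviation across replicates at each time point. The column names `time` and `measurement` and all values are assumptions made for illustration. The code is independent of the plotting routines in the PEtab repository.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical measurement table; "time" and "measurement" are assumed
# column names, and all values are made up.
MEASUREMENT_TSV = (
    "observableId\ttime\tmeasurement\tdatasetId\treplicateId\n"
    "obs_a\t0\t1.0\tdose_response\trep1\n"
    "obs_a\t0\t3.0\tdose_response\trep2\n"
    "obs_a\t10\t2.0\tdose_response\trep1\n"
    "obs_a\t10\t4.0\tdose_response\trep2\n"
)

rows = list(csv.DictReader(io.StringIO(MEASUREMENT_TSV), delimiter="\t"))

# One time series per (datasetId, replicateId), e.g. for plotting single replicates.
series = defaultdict(list)
# All replicate values per (datasetId, time), e.g. for error bars.
pooled = defaultdict(list)
for row in rows:
    t, y = float(row["time"]), float(row["measurement"])
    series[(row["datasetId"], row["replicateId"])].append((t, y))
    pooled[(row["datasetId"], t)].append(y)

# Mean and sample standard deviation across replicates at each time point.
summary = {
    key: (
        statistics.mean(values),
        statistics.stdev(values) if len(values) > 1 else 0.0,
    )
    for key, values in pooled.items()
}
```

The `summary` dictionary then holds one (mean, standard deviation) pair per dataset and time point, which is the information an error-bar plot needs.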

Observables table
Parameter estimation requires linking experimental observations to the model of interest. Therefore, one needs to define observables (model outputs) and respective noise models, which represent the measurement process. Since parameter estimation is beyond the scope of SBML, there exists no standard way to specify observables and respective noise models in SBML. Therefore, in PEtab observables are specified in a separate table.

Parameter table

parameterId
The ID of the parameter described in this row. This has to match the ID of a parameter specified in the SBML model, a parameter introduced as override in the condition table, or a parameter occurring in the observableParameters or noiseParameters column of the measurement table (see above).

parameterName [STRING, OPTIONAL]
Parameter name to be used, e.g., for plotting. Can be chosen freely. May or may not coincide with the SBML parameter name.

parameterScale [lin|log|log10]
Scale of the parameter to be used during parameter estimation.

lowerBound [NUMERIC]
Lower bound of the parameter used for optimization. Optional if estimate==0. Must be provided in linear space, independent of parameterScale.

upperBound [NUMERIC]
Upper bound of the parameter used for optimization. Optional if estimate==0. Must be provided in linear space, independent of parameterScale.
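Because bounds are given in linear space while estimation may run on a transformed scale, a tool has to convert between the two. The following minimal sketch shows such a conversion. The function names and the example row are hypothetical, and treating `log` as the natural logarithm is an assumption that the text above does not state.

```python
import math


def to_parameter_scale(value, scale):
    """Convert a linear-space value to the given parameterScale."""
    if scale == "lin":
        return value
    if scale == "log":
        return math.log(value)  # assumption: "log" means natural logarithm
    if scale == "log10":
        return math.log10(value)
    raise ValueError(f"unknown parameterScale: {scale!r}")


def from_parameter_scale(value, scale):
    """Convert a value on the given parameterScale back to linear space."""
    if scale == "lin":
        return value
    if scale == "log":
        return math.exp(value)
    if scale == "log10":
        return 10.0 ** value
    raise ValueError(f"unknown parameterScale: {scale!r}")


# Hypothetical row of the parameter table, bounds given in linear space.
row = {"parameterId": "k1", "parameterScale": "log10",
       "lowerBound": 1e-3, "upperBound": 1e3}

# Bounds as an optimizer working on the parameter scale would see them.
scaled_bounds = (
    to_parameter_scale(row["lowerBound"], row["parameterScale"]),
    to_parameter_scale(row["upperBound"], row["parameterScale"]),
)
```

For the example row, the optimizer would search between roughly -3 and 3 on the log10 scale. Note that logarithmic scales require strictly positive bounds.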

nominalValue [NUMERIC]
Parameter value to be used if the parameter is not subject to estimation (see estimate below). Must be provided in linear space, independent of parameterScale. Optional unless estimate==0.
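The dependencies between estimate, the bounds, and nominalValue can be checked mechanically. The sketch below is a hypothetical helper, not the validation routine of the PEtab Python library. It assumes that each row is available as a dictionary of strings, that empty strings denote missing entries, and that estimate takes only the values 0 and 1.

```python
ALLOWED_SCALES = {"lin", "log", "log10"}


def check_parameter_row(row):
    """Return a list of problems found in one row of the parameter table."""
    problems = []
    if row.get("parameterScale") not in ALLOWED_SCALES:
        problems.append("parameterScale must be lin, log or log10")

    estimate = row.get("estimate", "").strip()
    if estimate not in ("0", "1"):  # assumption: estimate is either 0 or 1
        problems.append("estimate must be 0 or 1")
        return problems

    # Bounds are needed for estimated parameters; a nominal value is
    # needed for parameters that are kept fixed.
    required = ["lowerBound", "upperBound"] if estimate == "1" else ["nominalValue"]
    for column in required:
        if not row.get(column, "").strip():
            problems.append(f"{column} is required when estimate=={estimate}")
    return problems


# Hypothetical rows: a fixed parameter and an estimated one without bounds.
fixed = {"parameterId": "k1", "parameterScale": "log10", "estimate": "0",
         "nominalValue": "0.5", "lowerBound": "", "upperBound": ""}
free = {"parameterId": "k2", "parameterScale": "lin", "estimate": "1",
        "nominalValue": "", "lowerBound": "", "upperBound": ""}

fixed_problems = check_parameter_row(fixed)
free_problems = check_parameter_row(free)
```

In this example, the fixed parameter passes, while the estimated parameter is reported twice because both bounds are missing.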