Table 1.

The various namespaces used in this work are referenced through the abbreviation prefixes shown in this table.

Fig 1.

UML class diagram for Process Run Crate.

The central class is the s:CreateAction, which represents the execution of an application. It links to the application itself via s:instrument, to the entity that executed it via s:agent, and to its inputs and outputs via s:object and s:result, respectively. In this and the following figures, classes and properties are shown with prefixes to indicate their origin. Some inputs (and, less commonly, outputs) are not stored as files or directories, but are passed to the application (e.g., via a command line interface) as values of various types (e.g., a number or a string). In this case, the profile recommends a representation via s:PropertyValue. For simplicity, we left out the rest of the RO-Crate structure (e.g., the root s:Dataset) and attributes (e.g., s:startTime, s:endTime, s:description, s:actionStatus). In this UML class notation, diamond ♢ arrows indicate aggregation and regular arrows indicate references; * indicates zero or more occurrences and 1 indicates a single occurrence. The term prefix s: represents the namespace https://schema.org/.
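
The relationships in this caption can be illustrated with a minimal sketch that builds the corresponding JSON-LD entities as Python dictionaries. All identifiers, names and values below (`#app`, `#user`, `input.txt`, `output.txt`, `#threshold`) are hypothetical and chosen only for illustration. As in the figure, the root s:Dataset and the rest of the RO-Crate structure are left out, so this is not a complete crate.

```python
import json

# Hypothetical entities, for illustration only.
application = {"@id": "#app", "@type": "SoftwareApplication", "name": "example application"}
agent = {"@id": "#user", "@type": "Person", "name": "Example User"}
input_file = {"@id": "input.txt", "@type": "File"}
output_file = {"@id": "output.txt", "@type": "File"}

# An input that is not a file: a value passed to the application,
# e.g. on the command line, represented as a PropertyValue.
threshold = {"@id": "#threshold", "@type": "PropertyValue", "name": "threshold", "value": 10}

# The central CreateAction: one execution of the application.
action = {
    "@id": "#run-1",
    "@type": "CreateAction",
    "instrument": {"@id": application["@id"]},  # the application that was run
    "agent": {"@id": agent["@id"]},             # who executed it
    "object": [{"@id": input_file["@id"]}, {"@id": threshold["@id"]}],  # inputs
    "result": [{"@id": output_file["@id"]}],    # outputs
}

graph = [application, agent, input_file, output_file, threshold, action]
print(json.dumps(graph, indent=2))
```

Entities refer to one another by `@id` rather than by nesting, which is the flattened style used in RO-Crate metadata.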

Fig 2.

Diagram of a simple workflow where the head and sort programs were run manually by a user.

The executions of the individual software programs are connected by the fact that the file output by head was used as input for sort, documenting the computational flow in an implicit way. Such executions can be represented with Process Run Crate. The term prefix s: represents the namespace https://schema.org/.
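
As a rough sketch of how this implicit flow could be recovered, the snippet below describes the two executions as s:CreateAction entities and looks for files that are the s:result of one action and the s:object of another. The file names (`lines.txt`, `selection.txt`, `sorted_selection.txt`), the identifiers and the helper `implicit_flow` are hypothetical and not part of the profile.

```python
# Two manual executions; identifiers and file names are hypothetical.
head_run = {
    "@id": "#head-run",
    "@type": "CreateAction",
    "instrument": {"@id": "#head"},
    "object": [{"@id": "lines.txt"}],
    "result": [{"@id": "selection.txt"}],
}
sort_run = {
    "@id": "#sort-run",
    "@type": "CreateAction",
    "instrument": {"@id": "#sort"},
    "object": [{"@id": "selection.txt"}],
    "result": [{"@id": "sorted_selection.txt"}],
}


def implicit_flow(actions):
    """Return (producer, file, consumer) triples where one action's
    result is used as another action's object."""
    produced_by = {}
    for action in actions:
        for result in action.get("result", []):
            produced_by[result["@id"]] = action["@id"]
    links = []
    for action in actions:
        for obj in action.get("object", []):
            producer = produced_by.get(obj["@id"])
            if producer is not None and producer != action["@id"]:
                links.append((producer, obj["@id"], action["@id"]))
    return links


print(implicit_flow([head_run, sort_run]))
# [('#head-run', 'selection.txt', '#sort-run')]
```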

Fig 3.

UML class diagram for Workflow Run Crate.

The main differences from Process Run Crate are the representation of formal parameters and the fact that the workflow is expected to be an entity with types s:MediaObject (File in RO-Crate JSON-LD), s:SoftwareSourceCode and bioschemas:ComputationalWorkflow. Effectively, the workflow belongs to all three types, and its properties are the union of the properties of the individual types. In this profile, the execution history (retrospective provenance) is augmented by a (prospective) workflow definition, giving a high-level overview of the workflow and its input and output parameter definitions (bioschemas:FormalParameter). The inner structure of the workflow is not represented in this profile. In the provenance part, individual files (s:MediaObject) or arguments (s:PropertyValue) are then connected to the parameters they realise. Most workflow systems can consume and produce multiple files, and this mechanism helps to declare each file’s role in the workflow execution. The filled diamond ♦ indicates composition, the empty diamond ♢ indicates aggregation, and other arrows indicate relations. The term prefixes are defined in Table 1.
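
A minimal sketch of this structure is given below. It assumes that a data entity points to the parameter it realises through the `exampleOfWork` property; the caption does not name this property, so it should be read as an assumption. The file names, identifiers and the helper `roles` are likewise hypothetical.

```python
# The workflow carries all three types at once (hypothetical file name).
workflow = {
    "@id": "workflow.cwl",
    "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
    "input": [{"@id": "#param-in"}],
    "output": [{"@id": "#param-out"}],
}

# Prospective side: formal parameter definitions.
param_in = {"@id": "#param-in", "@type": "FormalParameter", "name": "input_file"}
param_out = {"@id": "#param-out", "@type": "FormalParameter", "name": "report"}

# Retrospective side: actual files, each linked to the parameter it realises.
# "exampleOfWork" is an assumed property name, not stated in the caption.
data_in = {"@id": "data.csv", "@type": "File", "exampleOfWork": {"@id": "#param-in"}}
data_out = {"@id": "report.txt", "@type": "File", "exampleOfWork": {"@id": "#param-out"}}

run = {
    "@id": "#wf-run",
    "@type": "CreateAction",
    "instrument": {"@id": workflow["@id"]},
    "object": [{"@id": data_in["@id"]}],
    "result": [{"@id": data_out["@id"]}],
}


def roles(entities):
    """Map each data entity to the name of the formal parameter it realises."""
    names = {
        e["@id"]: e["name"]
        for e in entities
        if e.get("@type") == "FormalParameter"
    }
    return {
        e["@id"]: names[e["exampleOfWork"]["@id"]]
        for e in entities
        if "exampleOfWork" in e
    }


print(roles([workflow, param_in, param_out, data_in, data_out, run]))
# {'data.csv': 'input_file', 'report.txt': 'report'}
```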

Fig 4.

UML class diagram for Provenance Run Crate.

In addition to the workflow run, this profile represents the execution of individual steps and their related tools. The prospective side (the execution plan) is shown by the workflow listing a series of s:HowToSteps, each linking to the s:SoftwareApplication that is to be executed. The bsp:input and bsp:output parameters for each tool are described in a similar way to the overall workflow parameters in Fig 3. The retrospective provenance side of this profile includes each tool execution as an additional s:CreateAction, with a similar mapping to the realised parameters as s:MediaObject or s:PropertyValue, allowing intermediate values to be included in the RO-Crate even if they are not workflow outputs. The workflow execution is described in the same way as in the Workflow Run Crate profile, with an overall s:CreateAction (the workflow outputs will typically also appear as outputs from inner tool executions). An additional s:OrganizeAction represents the workflow engine execution, which orchestrated the steps from the workflow plan through corresponding s:ControlActions that spawned the tool’s execution (s:CreateAction). A single workflow step may have had multiple such executions (e.g., array iterations). Not shown in the figure: s:actionStatus and s:error, which indicate the step/workflow execution status. The filled diamond ♦ indicates composition, the empty diamond ♢ indicates aggregation, and other arrows indicate relations. The term prefixes are defined in Table 1.
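
The following sketch illustrates how one step can be associated with several tool executions. The caption names the classes but not the properties that connect them, so the linking properties used here (`workExample` from step to tool, `instrument` and `object` on the s:ControlAction, `object` and `result` on the s:OrganizeAction) are assumptions, as are all identifiers and the helper `executions_per_step`.

```python
from collections import defaultdict

# Prospective side: one step of the plan and the tool it runs (hypothetical ids).
tool = {"@id": "#sort-tool", "@type": "SoftwareApplication"}
step = {"@id": "#step-sort", "@type": "HowToStep", "workExample": {"@id": tool["@id"]}}

# Retrospective side: the step was executed twice (e.g. array iterations).
tool_runs = [
    {"@id": "#sort-run-1", "@type": "CreateAction", "instrument": {"@id": tool["@id"]}},
    {"@id": "#sort-run-2", "@type": "CreateAction", "instrument": {"@id": tool["@id"]}},
]

# One ControlAction per tool execution, tying it back to the planned step.
# The linking properties are assumed, not taken from the caption.
controls = [
    {
        "@id": f"#control-{i}",
        "@type": "ControlAction",
        "instrument": {"@id": step["@id"]},
        "object": {"@id": tool_run["@id"]},
    }
    for i, tool_run in enumerate(tool_runs, start=1)
]

# The engine execution that orchestrated the steps.
engine_run = {
    "@id": "#engine-run",
    "@type": "OrganizeAction",
    "object": [{"@id": c["@id"]} for c in controls],
    "result": {"@id": "#wf-run"},  # the overall workflow CreateAction
}


def executions_per_step(control_actions):
    """Group tool executions by the workflow step that spawned them."""
    by_step = defaultdict(list)
    for control in control_actions:
        by_step[control["instrument"]["@id"]].append(control["object"]["@id"])
    return dict(by_step)


print(executions_per_step(controls))
# {'#step-sort': ['#sort-run-1', '#sort-run-2']}
```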

Fig 5.

Venn diagram of the specifications for the various RO-Crate profiles.

Process Run Crate specifies how to describe the fundamental classes involved in a computational run, and thus is the basis for all profiles in the WRROC collection. Workflow Run Crate inherits the specifications of both Process Run Crate and Workflow RO-Crate. Provenance Run Crate, in turn, inherits the specifications of Workflow Run Crate (and in a sense includes multiple Process Runs for each step execution, but within a single Crate).

Table 2.

Workflow Run Crate implementations.

Table 3.

Summarised results of our qualitative analysis of Provenance Run Crates generated with runcrate.

Table 4.

Mapping from Workflow Run RO-Crate to equivalent W3C PROV concepts using SKOS [40].

For instance, s:CreateAction has broader match prov:Activity, meaning that prov:Activity is more general. Prefix prov: https://www.w3.org/ns/prov#.
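
As an illustration, the single mapping mentioned in this caption can be written as an RDF statement using skos:broadMatch. The sketch below expands the prefixed names and prints the statement in N-Triples form. The helpers `expand` and `to_ntriples` are hypothetical, only the one mapping named in the caption is included, and the PROV namespace is written with the http scheme used in the W3C PROV-O specification.

```python
PREFIXES = {
    "s": "https://schema.org/",
    "prov": "http://www.w3.org/ns/prov#",  # scheme as in the W3C PROV-O specification
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Only the mapping given in the caption; the table itself lists the others.
MAPPINGS = [("s:CreateAction", "skos:broadMatch", "prov:Activity")]


def expand(curie):
    """Expand a prefixed name such as 's:CreateAction' into a full IRI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(mappings):
    """Render (subject, predicate, object) prefixed names as N-Triples lines."""
    return [
        f"<{expand(s)}> <{expand(p)}> <{expand(o)}> ."
        for s, p, o in mappings
    ]


for line in to_ntriples(MAPPINGS):
    print(line)
```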
