
Fig 1.

Singularity Hub user workflow.

The user creates a specification file to build an image (top left), and pushes this to a version controlled repository (top right). The repository is connected to Singularity Hub by way of a webhook, which sets up and sends a job to a queue to run in the cloud (bottom right). After a series of checks for job priority, the job is deployed to create an instance “builder” that clones the repository, builds the image from the specification file, and sends the completed image to cloud storage (bottom left). Complete metadata is sent back to Singularity Hub to alert the user of the build completion, and the image is available for use by way of the Singularity software and Singularity Hub API.
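The webhook-to-builder flow described above can be sketched as a minimal producer/consumer pair. All names and payload fields below are illustrative assumptions, not Singularity Hub's actual API:

```python
import queue

# Minimal sketch of the webhook -> queue -> builder flow; the payload
# fields ("repository", "commit") and storage format are assumptions.
build_queue = queue.Queue()

def handle_webhook(payload):
    """Called when the version-controlled repository pushes a change:
    package a build job and place it on the queue."""
    job = {
        "repo": payload["repository"],
        "commit": payload["commit"],
        "spec": "Singularity",  # specification file expected in the repo
    }
    build_queue.put(job)
    return job

def builder_worker(storage):
    """Cloud 'builder' instance: take one job, clone and build (elided
    here), and record the finished image with its metadata in storage."""
    job = build_queue.get()
    image_uri = f"storage://{job['repo']}@{job['commit']}"
    storage[image_uri] = {"repo": job["repo"], "commit": job["commit"]}
    return image_uri
```

In the real system the queue, priority checks, and cloud storage are managed services; the sketch only shows the hand-off order the caption describes.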

Fig 2.

Container tree: Each container is stored with a list of folders and files that render into an interactive tree for the user, both in the Singularity Hub web interface, and on the user’s local command line using the Singularity Hub software.
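The flat listing of folders and files can be folded into a nested structure of the kind such a tree renders from. The function below is a sketch for illustration; the Hub's actual storage format and renderer are not shown in the caption:

```python
def build_tree(paths):
    """Fold a flat list of file paths into a nested dict of
    folder -> children, the shape an interactive tree view renders.
    (Illustrative sketch; not the Hub's actual data format.)"""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return tree
```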

Fig 3.

The collection tree provides the researcher with an immediate comparison of the latest version of every container across Singularity Hub, offering an easy way to find similar containers using the software and files inside as the metric for comparison.

In the example above, a gray node represents a group of containers, and a red node a single container. The user can hover over a node to see all the containers that are represented.

Fig 4.

Operating system estimation: Each container is compared to a set of 46 operating systems, including multiple versions of Ubuntu, CentOS, Debian, openSUSE, Alpine, BusyBox, Fedora, and others.

In the example above, the user is highlighting one of the columns to inspect the score; the build shown is for a container bootstrapping a CentOS 6 image.
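Estimation of this kind amounts to scoring the container against each candidate base OS and taking the best match. The sketch below uses simple file-path overlap as the score; the actual metric and file sets used by the Hub are assumptions here:

```python
def os_scores(container_files, os_bases):
    """Score a container (set of file paths) against each candidate
    base OS file set. Plain path overlap is an assumed stand-in for
    the Hub's actual scoring metric."""
    return {name: len(container_files & base) / len(base)
            for name, base in os_bases.items()}

def estimate_os(container_files, os_bases):
    """Return the best-matching OS name along with all scores."""
    scores = os_scores(container_files, os_bases)
    return max(scores, key=scores.get), scores
```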

Table 1.

Builder metadata.

Table 2.

Levels of reproducibility.

Fig 5.

Reproducibility assessment algorithm: A comparison between two containers reduces to comparing the members of their tar streams, first by an md5sum of the tar member itself and then, in the case of a mismatch, by content hash (non-root owned) or a size heuristic (root owned).

The final counts of overlapping versus differing files are then used to calculate an information coefficient over the subset of files selected by a filter (Levels of Reproducibility of Containers) to describe the similarity of the two containers.
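The decision procedure in the caption can be sketched as follows. The `Member` fields, the Dice-style form of the coefficient, and the restriction to shared paths are assumptions for illustration; the paper's exact formula and filters are not reproduced here:

```python
from collections import namedtuple

# One tar-stream member: md5 over the tar member itself, md5 over its
# content alone, byte size, and whether it is root owned.
Member = namedtuple("Member", "member_md5 content_md5 size root")

def members_match(a, b):
    """Decide whether two tar members count as the same file, per the
    caption: member md5 first, then content hash or size heuristic."""
    if a.member_md5 == b.member_md5:       # fast path: members identical
        return True
    if not (a.root or b.root):             # non-root owned: content hash
        return a.content_md5 == b.content_md5
    return a.size == b.size                # root owned: size heuristic

def similarity(container_a, container_b):
    """2 * |shared| / (|A| + |B|) over paths present in both containers;
    a Dice-style coefficient, assumed here as the 'information
    coefficient' the caption mentions."""
    if not container_a and not container_b:
        return 1.0
    shared = sum(members_match(container_a[p], container_b[p])
                 for p in set(container_a) & set(container_b))
    return 2.0 * shared / (len(container_a) + len(container_b))
```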

Table 3.

Analysis repository collections.

Table 4.

Builder consistency.

Fig 6.

Reproducibility level gradients: As an assessment of metric interpretability, we calculated each metric comparing a complete operating system (e.g., BusyBox) against versions of itself with files removed one at a time until none remained.

As we remove files, starting with the newest by time-stamp (from right to left), similarity between the full container and its comparator decreases until we reach a score of 0.0. The metrics for RUNSCRIPT and LABELS, being the newest files in the image, return a perfect score of 1.0 when the files are present in both images, which occurs only at the far right of the plot.
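The gradient experiment can be sketched as the loop below. It uses set overlap with a Dice-style score as a stand-in for the paper's metrics (an assumption), since shared paths in this experiment have identical content:

```python
def similarity(a, b):
    """Dice-style overlap between two sets of file paths. Set overlap
    suffices here because shared files are byte-identical copies."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))

def removal_gradient(files_by_mtime):
    """files_by_mtime: paths ordered oldest -> newest. Remove the
    newest remaining file at each step and score the shrinking
    container against the full one, yielding a decreasing curve."""
    full = set(files_by_mtime)
    remaining = list(files_by_mtime)
    scores = []
    while remaining:
        remaining.pop()  # drop the newest remaining file
        scores.append(similarity(full, set(remaining)))
    return scores
```

Plotted right to left, these scores reproduce the monotone decay to 0.0 that the caption describes.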

Fig 7.

Hierarchical clustering of Singularity Hub public images (accessed March 28, 2017) with Ward’s metric.

We observed that containers generally clustered first on the level of operating systems (dark blue) in the order of Debian, Ubuntu, and then CentOS and BusyBox.
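Agglomerative clustering with Ward's linkage can be sketched in pure Python via the Lance-Williams update. The toy distance matrix below is invented for illustration (distances derived from pairwise container similarity as 1 − score is an assumption); the paper's analysis presumably used a standard library implementation:

```python
import math

def ward_cluster(dist, k):
    """Agglomerative clustering with Ward's linkage, merging the two
    closest clusters (Lance-Williams distance update) until k remain.
    dist: symmetric matrix of pairwise distances between items."""
    n = len(dist)
    clusters = {i: {i} for i in range(n)}          # cluster id -> members
    d = {(i, j): float(dist[i][j])
         for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d[(a, b)] if a < b else d[(b, a)]

    while len(clusters) > k:
        ids = sorted(clusters)
        i, j = min(((a, b) for x, a in enumerate(ids) for b in ids[x + 1:]),
                   key=lambda p: get(*p))
        ni, nj, dij = len(clusters[i]), len(clusters[j]), get(i, j)
        for m in ids:
            if m in (i, j):
                continue
            nm = len(clusters[m])
            dim, djm = get(i, m), get(j, m)
            # Lance-Williams update for Ward's linkage
            new = math.sqrt(((ni + nm) * dim ** 2 + (nj + nm) * djm ** 2
                             - nm * dij ** 2) / (ni + nj + nm))
            d[(min(i, m), max(i, m))] = new
        clusters[i] |= clusters.pop(j)
    return list(clusters.values())
```

On a toy matrix where two Debian-like containers sit close together and a CentOS-like one sits far away, the Debian pair merges first, matching the OS-level clustering described above.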
