Nine quick tips for software containerization

doi:10.1371/journal.pcbi.1014197

Fig 1.

Containerization as a sequence of decisions across the computational research lifecycle.

The diagram illustrates five stages of computational research practice, from initial design and environment construction to workflow integration, execution, and long-term sharing. The nine tips presented in this article are mapped onto these stages as key decision points. This lifecycle perspective emphasizes that containerization is not a single technical step but a series of strategic choices that influence reproducibility, maintainability, security, and reuse throughout a research project.


Fig 2.

Example of container integration in a Nextflow workflow.

This code snippet demonstrates how containers can be specified for individual processes within a workflow management system. The container directive isolates the alignment tool (here, BWA version 0.7.17) in its own environment; other pipeline steps can use different containers. This modular approach allows updating a single tool without rebuilding the entire computational environment, improving both maintainability and reproducibility.
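The figure itself is Nextflow code and is not reproduced here. As a rough illustration of the same idea in plain Python, the sketch below pins one container image per pipeline step and shows that changing one step's image leaves the others untouched. The step names, image tags, file names, and helper functions are all hypothetical and are not taken from the article; only BWA 0.7.17 comes from the caption.

```python
# Hypothetical step names and image tags, used only for illustration.
STEP_CONTAINERS = {
    "align": "example/bwa:0.7.17",
    "sort": "example/samtools:1.17",
}


def step_command(step, tool_args, containers=STEP_CONTAINERS):
    """Return the `docker run` argv for one step, using that step's own image."""
    return ["docker", "run", "--rm", containers[step], *tool_args]


def update_step(containers, step, image):
    """Return a copy of the mapping with a single step's image replaced."""
    if step not in containers:
        raise KeyError(f"unknown step: {step}")
    return {**containers, step: image}


# Upgrade only the sorting tool; the alignment step keeps its pinned image.
upgraded = update_step(STEP_CONTAINERS, "sort", "example/samtools:1.19")
print(step_command("align", ["bwa", "mem", "ref.fa", "reads.fq"], upgraded))
print(step_command("sort", ["samtools", "sort", "aln.bam"], upgraded))
```

The commands are only built and printed, not executed, so the sketch runs without Docker installed. In a real workflow manager the per-process container declaration plays the role of the mapping above.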


Fig 3.

Mounting external data to a Docker container at runtime.

This command illustrates the separation between software (inside the container) and data (external to the container). The -v flag binds the host directory /path/to/data to the container’s /data directory, making external files accessible to the containerized analysis without embedding them in the image. This approach keeps images small, portable, and free from data privacy or licensing concerns while maintaining clear provenance of required inputs.
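A minimal sketch of how such a command could be assembled is shown below, assuming a hypothetical image named example/analysis:1.0 and a hypothetical helper function; only the -v flag and the /path/to/data and /data paths come from the caption. The read-only suffix (:ro) is an optional addition here, not something the caption specifies.

```python
import shlex


def docker_run_command(image, host_dir, container_dir="/data", args=(), read_only=True):
    """Build a `docker run` argv that bind-mounts host_dir at container_dir."""
    # -v takes HOST_PATH:CONTAINER_PATH, optionally followed by :ro for read-only.
    mount = f"{host_dir}:{container_dir}" + (":ro" if read_only else "")
    return ["docker", "run", "--rm", "-v", mount, image, *args]


# "example/analysis:1.0" is a hypothetical image name used for illustration.
cmd = docker_run_command("example/analysis:1.0", "/path/to/data", args=["ls", "/data"])

# Print the shell-quoted command for inspection instead of running it.
print(shlex.join(cmd))
# To execute it on a machine with Docker: subprocess.run(cmd, check=True)
```

Mounting the input directory read-only is one way to make sure the analysis cannot modify the original data; outputs would then go to a separate writable mount.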
