From ideal to practical: Heterogeneity of student-generated variant lists highlights hidden reproducibility gaps

doi:10.1371/journal.pcbi.1013552

Fig 1.

Steps of the pipelines.

Each pipeline includes trimming and duplicate-marking steps; the remaining steps have alternative options. Each group created 12 VCF files (two aligners, two base recalibration options, and three variant calling algorithms).
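
The count of 12 follows from crossing the three choices. A minimal sketch of that enumeration, assuming placeholder aligner names (`aligner_A`, `aligner_B`; the caption does not name the aligners), with the 'YB'/'NB' recalibration labels and the caller names taken from the captions of Figs 3 and 4:

```python
from itertools import product

# Placeholder aligner names: hypothetical, the caption does not name them.
ALIGNERS = ["aligner_A", "aligner_B"]
# 'YB' = base recalibration applied, 'NB' = not applied (labels from Fig 3).
RECALIBRATION = ["YB", "NB"]
# Variant callers named in the captions of Figs 4 and 5.
CALLERS = ["Mutect", "Strelka", "SomaticSniper"]


def pipeline_configurations():
    """Return every (aligner, recalibration, caller) combination."""
    return list(product(ALIGNERS, RECALIBRATION, CALLERS))


configs = pipeline_configurations()
print(len(configs))  # 2 * 2 * 3 = 12
```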

Table 1.

Variant counts by group and pipeline configuration.

Fig 2.

PCA results.

PCA of each pipeline configuration together with the high-confidence variant list, grouped by aligner and variant calling algorithm. ‘SS’ stands for the SomaticSniper variant calling algorithm.

Fig 3.

Performance scores of variant lists collected from the students.

Distribution of scores for each pipeline combination across the 132 VCF files. ‘YB’ and ‘NB’ denote applying and not applying base recalibration, respectively. ‘SS’ stands for the SomaticSniper variant calling algorithm.

Fig 4.

Performance comparison of variant callers using local installation versus Docker container.

The box plots display the distributions of precision, recall, and F1-score for the variant callers (Mutect, Strelka, and SomaticSniper), categorized by installation method (local vs. Docker).
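
Precision, recall, and F1-score can be computed by comparing a call set with a reference set. A minimal sketch, assuming variants are represented as (chromosome, position, ref, alt) tuples and that a high-confidence variant list serves as the reference; the function name `score_calls` and the toy data are hypothetical and are not taken from the study:

```python
def score_calls(called, truth):
    """Return (precision, recall, F1) for a call set against a reference set."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # variants found in both sets
    fp = len(called - truth)   # called but absent from the reference
    fn = len(truth - called)   # in the reference but not called
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (hypothetical): four reference variants, three calls, two shared.
truth = [("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
         ("chr2", 300, "C", "T"), ("chr3", 400, "T", "G")]
called = [("chr1", 100, "A", "T"), ("chr2", 300, "C", "T"),
          ("chr4", 500, "G", "A")]

precision, recall, f1 = score_calls(called, truth)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Representing variants as tuples in sets keeps the comparison exact-match only; normalizing indel representations before comparison is outside the scope of this sketch.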

Fig 5.

Performance comparison of variant callers using two operating environments: Linux and Windows Subsystem for Linux (WSL).

The box plots display the distributions of precision, recall, and F1-score for the variant callers (Mutect, Strelka, and SomaticSniper), categorized by operating environment (Linux vs. WSL).

Fig 6.

Box plots of time spent on different stages of the variant calling process versus the corresponding F1-score.

The stages analyzed include downloading data, mapping, variant calling, filtering, and analysis.

Fig 7.

P-values obtained from ANOVA.

A value lower than 0.05 indicates a statistically significant effect on the outcome.
