Fig 1.
ROC curves of the three methods HMMcopy, Ginkgo, and CopyNumber.
(a) Coarse-grained analysis results and (b) fine-grained analysis results. For each method, the results based on three thresholds of correctness are plotted. For HMMcopy, nu, which controls the suggested degree of freedom between states, was set to 0.01 (rightmost), 0.1, 2.1 (the tool’s default), 4, 10, and 20 (leftmost). For Ginkgo, alpha, which controls the significance level for accepting a change point, was set to 1e-1000 (rightmost), 1e-100, 1e-10, 1e-5, 1e-4, 1e-3, 1e-2 (the tool’s default), 0.02, and 0.05 (leftmost). The dots corresponding to the values 1e-5 and 1e-10 overlap in the coarse-grained analysis. For CopyNumber, gamma, which is the weight of the penalty on changing a state, was set to 40 (rightmost, and the tool’s default), 10, 5, 4, 3, 2, and 1 (leftmost).
Fig 2.
Computational requirements of Ginkgo, HMMcopy, and CopyNumber.
Results are for analyzing a 1000-cell dataset on an Intel(R) Xeon(R) CPU E5-2650 v2 with a clock speed of 2.60 GHz. The left and right panels show running time (in log10 of seconds) and memory consumption (in log10 of KB), respectively. Running time and memory were recorded for the parameter values described in Fig 1. Because Ginkgo’s running time increases more than twofold for α = 0.05, we treated it as an outlier and did not include this data point in the plot.
Fig 3.
Recall and precision of Ginkgo, HMMcopy, and CopyNumber as functions of the ploidy.
The ploidy level is varied and the results are based on the (a) coarse-grained and (b) fine-grained analyses. The ploidies of the simulated data were 1.5, 2.1, 3.0, 3.8, and 5.3.
Fig 4.
Recall and precision of Ginkgo, HMMcopy, and CopyNumber as functions of the coverage.
The coverage is varied and the results are based on the (a) coarse-grained and (b) fine-grained analyses. The coverages are varied to mimic those produced by MALBAC, DOP-PCR, TnBC, and bulk sequencing.
Fig 5.
Comparison of HMMcopy, Ginkgo and CopyNumber on Sample 102 in [43].
(a) Venn diagram of the breakpoints from Ginkgo, HMMcopy, and CopyNumber. Breakpoints from two methods are counted as overlapping if they are within 400,000 bp of each other. (b) Distribution of the copy number changes (under a parsimony analysis) per bin based on the copy number profiles obtained by HMMcopy for the seven samples. (c) Distribution of the copy number changes (under a parsimony analysis) per bin based on the copy number profiles obtained by Ginkgo for the seven samples. For (b) and (c), a maximum parsimony tree was inferred from the copy number profiles of the cells, and the minimum number of copy number changes per bin along all the branches of the tree was computed by parsimony analysis. The percentages of the bins with each number of copy number changes are plotted.
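The overlap rule used for panel (a) can be illustrated with a minimal sketch. It is not the authors' code. The input layout (method, then chromosome, then a list of breakpoint positions), the function names `has_match` and `support_counts`, and the choice to treat a distance of exactly 400,000 bp as overlapping are all assumptions made for illustration. Because matching within a tolerance is not transitive, the sketch tallies each method's breakpoints by the set of methods that support them; how ties of this kind were resolved for the actual Venn diagram is not stated in the caption.

```python
from bisect import bisect_left
from collections import Counter

TOLERANCE_BP = 400_000  # overlap distance stated in the caption


def has_match(pos, sorted_positions, tol=TOLERANCE_BP):
    """Return True if some position lies within tol bp of pos (inclusive, an assumption)."""
    i = bisect_left(sorted_positions, pos - tol)
    return i < len(sorted_positions) and sorted_positions[i] <= pos + tol


def support_counts(calls, tol=TOLERANCE_BP):
    """Tally each method's breakpoints by the set of methods that agree with them.

    calls: {method: {chromosome: [breakpoint positions in bp]}} (hypothetical layout).
    Returns {method: Counter({frozenset(supporting methods): count})}.
    """
    # Sort once so that each lookup is a binary search.
    sorted_calls = {
        method: {chrom: sorted(positions) for chrom, positions in chroms.items()}
        for method, chroms in calls.items()
    }
    result = {}
    for method, chroms in sorted_calls.items():
        tally = Counter()
        for chrom, positions in chroms.items():
            for pos in positions:
                supporters = {method}
                for other, other_chroms in sorted_calls.items():
                    # Breakpoints are only compared within the same chromosome.
                    if other != method and has_match(pos, other_chroms.get(chrom, []), tol):
                        supporters.add(other)
                tally[frozenset(supporters)] += 1
        result[method] = tally
    return result


# Toy breakpoints, invented for illustration only (not data from the paper).
example_calls = {
    "HMMcopy": {"chr1": [1_000_000, 5_000_000]},
    "Ginkgo": {"chr1": [1_300_000], "chr2": [1_000_000]},
    "CopyNumber": {"chr1": [1_500_000, 5_400_000]},
}
example_counts = support_counts(example_calls)
```

In the toy input, the Ginkgo breakpoint at 1,300,000 on chr1 is within tolerance of calls from both other methods, while the HMMcopy call at 1,000,000 and the CopyNumber call at 1,500,000 are 500,000 bp apart and do not match each other directly. This is the non-transitivity that any Venn-style count has to resolve by some convention.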