Fig 1.
Flow chart illustrating the overall experimental design.
Exome sequencing was performed on matched tumor-normal DNA from five breast cancer patients, followed by somatic variant calling with nine different somatic variant callers. The union of the calls reported by the nine callers, excluding intronic and intergenic positions, was included in a capture reagent, and targeted deep sequencing was performed. Variant calling was then repeated on the deep sequencing data. The two datasets were compared to a set of manually curated high-confidence somatic mutations, which were obtained through manual inspection of the data.
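The target-selection step of this design can be sketched as a set union followed by an annotation filter. This is a minimal illustration, not the pipeline used in the study: the caller names, the variant tuples, and the region labels ("exonic", "intronic", "intergenic") are hypothetical placeholders, and the choice to keep unannotated variants is an assumption.

```python
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]

# Hypothetical annotation labels; real annotation tools use their own vocabularies.
EXCLUDED_REGIONS = {"intronic", "intergenic"}


def capture_targets(calls_by_caller: Dict[str, Set[Variant]],
                    region_of: Dict[Variant, str]) -> Set[Variant]:
    """Union of all callers' calls, minus intronic and intergenic positions.

    Variants without an annotation are kept (an assumption of this sketch).
    """
    union: Set[Variant] = set()
    for calls in calls_by_caller.values():
        union |= calls
    return {v for v in union if region_of.get(v) not in EXCLUDED_REGIONS}


# Toy input: two hypothetical callers and three distinct variants.
calls = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "caller_b": {("chr2", 200, "G", "C"), ("chr3", 300, "T", "TA")},
}
regions = {
    ("chr1", 100, "A", "T"): "exonic",
    ("chr2", 200, "G", "C"): "intronic",
    ("chr3", 300, "T", "TA"): "exonic",
}
targets = capture_targets(calls, regions)
```

In this toy input the chr2 variant is dropped because it is annotated as intronic, even though both callers report it.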
Fig 2.
Total number of somatic mutations called by the nine somatic variant callers in the exome sequencing data of five breast cancer samples.
SNV and indel calls are shown in the left and right panels, respectively.
Fig 3.
Pairwise comparisons of the nine studied variant callers in five breast cancer samples, with exome sequencing and deep sequencing data in the upper and lower panels, respectively. The matrix depicts the agreement among the variant callers. In each row, the numbers give the fraction of calls made by that row's caller that are also reported by each of the other callers. For instance, in the first row (EBCall), Mutect reports 51% of the calls reported by EBCall. The deep sequencing panel includes only positions covered at a minimum of 200x in both the tumor and the normal sample. The color reflects the degree of agreement, with the highest color intensity depicting the highest agreement between two callers.
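The row-wise agreement described in this caption can be sketched as follows. The function name and toy call sets are hypothetical. The sketch only illustrates why the matrix is asymmetric: each row is normalized by the number of calls made by that row's caller.

```python
from typing import Dict, Hashable, Optional, Set


def pairwise_agreement(
    calls_by_caller: Dict[str, Set[Hashable]]
) -> Dict[str, Dict[str, Optional[float]]]:
    """matrix[row][col] = fraction of row's calls that col also reports.

    A row whose caller made no calls has no defined fraction and is filled with None.
    """
    callers = sorted(calls_by_caller)
    matrix: Dict[str, Dict[str, Optional[float]]] = {}
    for row in callers:
        row_calls = calls_by_caller[row]
        matrix[row] = {}
        for col in callers:
            if not row_calls:
                matrix[row][col] = None
            else:
                shared = len(row_calls & calls_by_caller[col])
                matrix[row][col] = shared / len(row_calls)
    return matrix


# Toy input: caller_a makes 4 calls, caller_b makes 8, and 2 are shared.
toy_calls = {
    "caller_a": {1, 2, 3, 4},
    "caller_b": {3, 4, 5, 6, 7, 8, 9, 10},
}
agreement = pairwise_agreement(toy_calls)
```

Here `agreement["caller_a"]["caller_b"]` is 0.5 while `agreement["caller_b"]["caller_a"]` is 0.25, so a caller that reports many positions can cover most of a smaller call set without the reverse being true.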
Fig 4.
Impact on variant calling of increased sequencing depth.
The impact of increased sequencing depth on SNV and indel calling is shown in the left and right panels, respectively. The number of positions called in the exome sequencing data only, in the validation data only, and in both data sets is depicted in blue, green and red, respectively. This analysis includes only regions that are successfully covered (at least 200x) in the deep sequencing data.
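The three categories in this figure amount to a set partition restricted to well-covered positions. The sketch below assumes that the 200x threshold applies to both the tumor and the normal sample, as stated for Fig 3. The position keys and depth values are invented for illustration.

```python
from typing import Dict, Set

MIN_DEPTH = 200  # coverage threshold stated in the caption


def partition_calls(exome_calls: Set[str],
                    deep_calls: Set[str],
                    tumor_depth: Dict[str, int],
                    normal_depth: Dict[str, int],
                    min_depth: int = MIN_DEPTH) -> Dict[str, Set[str]]:
    """Split calls into exome-only, deep-only and shared, among covered positions."""
    covered = {
        pos for pos in exome_calls | deep_calls
        if tumor_depth.get(pos, 0) >= min_depth
        and normal_depth.get(pos, 0) >= min_depth
    }
    exome = exome_calls & covered
    deep = deep_calls & covered
    return {
        "exome_only": exome - deep,
        "deep_only": deep - exome,
        "both": exome & deep,
    }


# Toy input; depths refer to the deep sequencing data.
exome_calls = {"chr1:100", "chr1:200", "chr1:300"}
deep_calls = {"chr1:200", "chr1:300", "chr1:400", "chr1:500"}
tumor_depth = {"chr1:100": 250, "chr1:200": 300, "chr1:300": 150,
               "chr1:400": 500, "chr1:500": 220}
normal_depth = {"chr1:100": 210, "chr1:200": 400, "chr1:300": 300,
                "chr1:400": 199, "chr1:500": 200}
parts = partition_calls(exome_calls, deep_calls, tumor_depth, normal_depth)
```

Positions chr1:300 and chr1:400 are excluded in this toy example because one of the two samples falls below the threshold.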
Fig 5.
Concordance of called positions.
Concordance of called positions in exome sequencing data and deep sequencing data is shown in the upper and lower panels, respectively. SNVs and indels are depicted in the left and right panels, respectively.
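One common way to summarize concordance is to count, for each called position, how many callers report it, and then tabulate those counts. The caption does not define the measure, so this reading is an assumption. The function name and toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Set


def concordance_histogram(calls_by_caller: Dict[str, Set[Hashable]]) -> Counter:
    """Map 'number of supporting callers' -> 'number of positions with that support'."""
    support: Counter = Counter()
    for calls in calls_by_caller.values():
        # Each caller contributes at most one vote per position.
        support.update(set(calls))
    return Counter(support.values())


# Toy input with three hypothetical callers.
toy = {
    "caller_a": {1, 2, 3},
    "caller_b": {2, 3},
    "caller_c": {3, 4},
}
histogram = concordance_histogram(toy)
```

In the toy data, positions 1 and 4 are each reported by one caller, position 2 by two callers, and position 3 by all three.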
Fig 6.
Variant caller sensitivity for detecting the manually curated mutations is shown for SNVs and indels in the left and right panels, respectively. The y-axis depicts the number of variant calls. The dark and light grey bars represent calls in the exome and targeted deep sequencing data, respectively.
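Sensitivity against a curated truth set is the number of curated mutations a caller recovers, optionally divided by the size of the curated set. The sketch below returns both the count (the quantity on the y-axis) and the fraction. The names and data are hypothetical.

```python
from typing import Hashable, Set, Tuple


def sensitivity(calls: Set[Hashable], curated: Set[Hashable]) -> Tuple[int, float]:
    """Return (curated mutations detected, fraction of the curated set detected)."""
    if not curated:
        raise ValueError("curated set must not be empty")
    detected = len(calls & curated)
    return detected, detected / len(curated)


# Toy input: 4 curated mutations; the caller finds 3 of them plus 1 extra call.
curated_set = {"m1", "m2", "m3", "m4"}
caller_calls = {"m1", "m2", "m3", "x9"}
n_detected, fraction = sensitivity(caller_calls, curated_set)
```

Calls outside the curated set, such as "x9", do not affect sensitivity. They would matter for precision, which this sketch does not compute.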
Fig 7.
Calling patterns of the somatic variant callers.
Hierarchical cluster analysis of mutations called by the somatic variant callers in exome and deep sequencing data is shown in the left and right panels, respectively. Each red line represents a called somatic mutation.
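A hierarchical clustering of callers by their call sets can be sketched with a small agglomerative procedure. The caption does not state the distance or linkage used, so Jaccard distance and average linkage are assumptions chosen for illustration. The caller names and call sets are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, List, Set, Tuple

Cluster = Tuple[str, ...]


def jaccard_distance(a: Set[Hashable], b: Set[Hashable]) -> float:
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_callers(
    calls_by_caller: Dict[str, Set[Hashable]]
) -> List[Tuple[Cluster, Cluster, float]]:
    """Average-linkage agglomerative clustering; returns merges in the order made."""

    def linkage(c1: Cluster, c2: Cluster) -> float:
        # Mean Jaccard distance over all cross-cluster pairs of callers.
        dists = [jaccard_distance(calls_by_caller[x], calls_by_caller[y])
                 for x in c1 for y in c2]
        return sum(dists) / len(dists)

    clusters: List[Cluster] = [(name,) for name in sorted(calls_by_caller)]
    merges: List[Tuple[Cluster, Cluster, float]] = []
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy input: caller_a and caller_b overlap heavily; caller_c shares nothing with them.
toy_calls = {
    "caller_a": {1, 2, 3, 4},
    "caller_b": {1, 2, 3, 5},
    "caller_c": {7, 8, 9},
}
merges = cluster_callers(toy_calls)
```

The first merge joins caller_a and caller_b, which share three of five distinct positions. caller_c joins last at the maximum distance, since it has no calls in common with the other two.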
Table 1.
The nine somatic variant callers, settings and post-call filtering used in the study.