Fig 1.
Overview of Refphase algorithm.
a) Refphase creates a minimum consistent segmentation across the single-sample segmentations input for each tumour. b) In each segment in which at least one sample had allelic imbalance in the tumour input, an optimal reference sample for phasing is determined. c) The phasing of each reference sample is derived from its BAF. d) This phasing is then applied to the BAFs of all non-reference samples. e) Allele-specific copy numbers are re-estimated for each sample using the reference phasing, and the most parsimonious phasing solution along each chromosome is then chosen by horizontal phasing optimisation. f) In each segment, event categories relative to the input ploidy of the corresponding sample are defined using LogR values. g) Tumour-level events are called and intra-tumour heterogeneity metrics are calculated.
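Steps a), c) and d) can be illustrated with a minimal Python sketch. It is not the Refphase implementation: the function names are hypothetical, segments are assumed to be half-open (start, end) intervals that tile one chromosome, and a simple BAF cut-off of 0.5 stands in for whatever phasing rule Refphase actually applies.

```python
def minimum_consistent_segmentation(per_sample_segments):
    """Split one chromosome at every breakpoint seen in any sample.

    per_sample_segments: list (one entry per sample) of lists of
    (start, end) tuples. Assumption: each sample's segments tile the
    same region without gaps.
    """
    breakpoints = set()
    for segments in per_sample_segments:
        for start, end in segments:
            breakpoints.add(start)
            breakpoints.add(end)
    ordered = sorted(breakpoints)
    # Consecutive breakpoints delimit the consistent segments.
    return list(zip(ordered[:-1], ordered[1:]))


def phase_from_reference(reference_baf):
    """Assign each heterozygous SNP to haplotype 'A' or 'B' from the
    reference sample's BAF (assumed rule: BAF > 0.5 means 'A')."""
    return ["A" if baf > 0.5 else "B" for baf in reference_baf]


def apply_phasing(sample_baf, haplotypes):
    """Mirror the BAF of 'B' SNPs so that every value in the sample
    is expressed relative to haplotype 'A'."""
    return [baf if hap == "A" else 1.0 - baf
            for baf, hap in zip(sample_baf, haplotypes)]


# Two samples with different breakpoints on the same chromosome.
segments = minimum_consistent_segmentation(
    [[(0, 100), (100, 300)], [(0, 200), (200, 300)]]
)

# The reference sample has clear imbalance; the other sample is noisier.
haplotypes = phase_from_reference([0.70, 0.30, 0.65, 0.35])
phased = apply_phasing([0.40, 0.60, 0.45, 0.55], haplotypes)
```

In this toy example the second sample's unphased BAF scatters symmetrically around 0.5, whereas after phasing every SNP falls below 0.5, which is the effect that makes subtle imbalance easier to detect.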
Fig 2.
Detection of mirrored subclonal allelic imbalance and parallel evolution.
a) LogR and BAF tracks in chromosomes 1 to 6 from tumour CRUK0034. LogR tracks show LogR values as light grey points. The black line shows the median LogR within a minimum consistent segment. BAF tracks show phased BAF as either orange “A” haplotype points or blue “B” haplotype points. Unphased BAF values are shown as light grey points. b) SCNA summary tracks. (top) Gains relative to ploidy: a full-height grey bar indicates that a gain is identified in every sample from tumour CRUK0034, a light yellow background indicates the presence of a subclonal gain, and the height of the stacked darker yellow bar indicates the proportion of samples in which a subclonal gain is present. (middle) Presence of MSAI between at least two samples, shown by purple fill. (bottom) Presence of parallel gains, shown by red fill. c) MSAI detected in tumour CRUK0034 affecting chromosome 4. d) Parallel evolution of chromosome arm 5p gain in tumour CRUK0034. e) Schematic of the copy number states related to MSAI and parallel evolution.
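The idea behind the MSAI track in panel b) can be sketched as follows. This is an illustrative simplification, not the Refphase test: the function name, the input format (mean phased BAF per sample within one segment, expressed as the haplotype A fraction) and the fixed tolerance of 0.05 are all assumptions made for this example.

```python
def has_msai(mean_phased_baf, tolerance=0.05):
    """Return True if a segment shows mirrored subclonal allelic imbalance.

    mean_phased_baf: dict mapping sample name to the mean phased BAF
    (haplotype A fraction) in one segment.
    tolerance: hypothetical margin around 0.5 below which a sample is
    treated as balanced.
    """
    a_major = [s for s, baf in mean_phased_baf.items() if baf > 0.5 + tolerance]
    b_major = [s for s, baf in mean_phased_baf.items() if baf < 0.5 - tolerance]
    # MSAI requires at least one sample favouring each haplotype.
    return bool(a_major) and bool(b_major)


mirrored = has_msai({"R1": 0.66, "R2": 0.34, "R3": 0.50})
same_direction = has_msai({"R1": 0.66, "R2": 0.70})
balanced = has_msai({"R1": 0.50, "R2": 0.52})
```

A segment is flagged only when the imbalance points in opposite directions in at least two samples; imbalance in the same direction in every sample is not mirrored.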
Fig 3.
Analysis of CRUK0063 multi-sample and multi-time point NSCLC case.
a) MEDICC2 phylogeny. Multi-sample reference phased allele-specific copy number output from Refphase can be passed directly to MEDICC2 to produce a phylogenetic reconstruction. b) SCNA summary tracks for all samples (primary and metastases) from patient CRUK0063. c) BAF profiles and SCNA summary tracks for the primary samples from CRUK0063, indicated by a red border. d) SCNA summary tracks and BAF profiles from five post-mortem metastatic samples, indicated by a blue border. Sample BAF tracks are ordered by their position in the MEDICC2 phylogenetic reconstruction. Green boxes, arrowheads and associated Roman numerals highlight selected examples of MSAI on chromosome arms 2p, 6q, 4p and 10q, described in the main text. WGD indicates whole genome doubling events inferred by MEDICC2.
Fig 4.
Low cancer cell fraction SCNA detection and cohort-level analysis for the Sottoriva et al. [46] and Brastianos et al. [47] cohorts.
a) Barplots showing the proportion of the genome affected by MSAI in each tumour sample in the pan-cancer cohort, grouped by tumour type. b) Barplots showing the proportion of the genome with allelic imbalance that was identified using multi-sample reference phasing and that was previously undetected using ASCAT. Tumour samples in the pan-cancer cohort are arranged by tumour type. Light blue bars represent the proportion of the genome in each sample affected by previously undetected allelic imbalance that did not alter the previously estimated integer allele-specific copy number. Dark blue bars represent instances in which newly detected allelic imbalance resulted in a new integer copy number state being estimated. c) Barplots representing the estimated cancer cell fraction of each sample in the pan-cancer cohort, grouped by cancer type. d) Unphased BAF and integer copy number states across the genome from two samples analysed using ASCAT. e) Phased BAF and haplotype-specific copy number states across the genome from two samples analysed using Refphase. f) Line plot showing the proportion of simulated copy-neutral loss of heterozygosity events identified at differing non-cancer proportions of a sample using ASCAT (red) or Refphase (blue).
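Why copy-neutral loss of heterozygosity becomes hard to detect as the non-cancer proportion rises (panel f) can be seen from the standard mixture model for the expected BAF of a heterozygous SNP. The sketch below is an illustration only, not the simulation used for the figure; the function name and the assumption of a diploid (1|1) normal compartment are ours.

```python
def expected_baf(n_a, n_b, purity):
    """Expected B allele frequency at a heterozygous SNP.

    n_a, n_b: tumour copy numbers of the A and B alleles.
    purity: fraction of cancer cells in the sample (0 to 1).
    Assumption: non-cancer cells are diploid with one copy of each allele.
    """
    b_copies = purity * n_b + (1.0 - purity) * 1.0
    total_copies = purity * (n_a + n_b) + (1.0 - purity) * 2.0
    return b_copies / total_copies


# Copy-neutral LOH (2|0) at decreasing purity: the shift away from
# 0.5 shrinks to purity / 2 and approaches the noise level of the data.
shift_by_purity = {
    purity: 0.5 - expected_baf(2, 0, purity)
    for purity in (1.0, 0.5, 0.2, 0.1)
}
```

At a purity of 0.1 the expected BAF shift is only 0.05, which is difficult to separate from noise in unphased data; phasing against a reference sample pools SNPs on the same haplotype and so makes such small shifts easier to recover.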
Fig 5.
Cohort-level analysis of cancer evolution for the Sottoriva et al. [46] and Brastianos et al. [47] cohorts.
a) Barplot showing the number of samples per tumour, coloured by WGD clonality status: clonal WGD (dark orange), subclonal WGD (light orange), and non-WGD (blue). b) Barplot showing the proportion of the genome classified as affected by clonal SCNA (grey) and subclonal SCNA (yellow) in the pan-cancer cohort. c) Barplot showing the proportion of the genome affected by parallel evolution of SCNAs relative to ploidy or LOH (red) and parallel evolution of 2|1 copy number states (dark blue). d) BAF of heterozygous SNPs across the genome from tumour samples from colorectal adenocarcinoma U. SNPs are coloured orange and blue according to the phased haplotype they are assigned to, while SNPs that could not be phased are coloured grey. Regions of the genome demonstrating parallel evolution of 2|1 and 1|2 copy number states are highlighted with a dark blue outline. e) Schematic demonstrating whole genome doubling and independent subsequent copy number loss events revealed by MSAI.