Histological Transformation and Progression in Follicular Lymphoma: A Clonal Evolution Study

Background Follicular lymphoma (FL) is an indolent, yet incurable B cell malignancy. A subset of patients experience an increased mortality rate driven by two distinct clinical end points: histological transformation and early progression after immunochemotherapy. The nature of tumor clonal dynamics leading to these clinical end points is poorly understood, and previously determined genetic alterations do not explain the majority of transformed cases or accurately predict early progressive disease. We contend that detailed knowledge of the expansion patterns of specific cell populations plus their associated mutations would provide insight into therapeutic strategies and disease biology over the time course of FL clinical histories. Methods and Findings Using a combination of whole genome sequencing, targeted deep sequencing, and digital droplet PCR on matched diagnostic and relapse specimens, we deciphered the constituent clonal populations in 15 transformation cases and 6 progression cases, and measured the change in clonal population abundance over time. We observed widely divergent patterns of clonal dynamics in transformed cases relative to progressed cases. Transformation specimens were generally composed of clones that were rare or absent in diagnostic specimens, consistent with dramatic clonal expansions that came to dominate the transformation specimens. This pattern was independent of time to transformation and treatment modality. By contrast, early progression specimens were composed of clones that were already present in the diagnostic specimens and exhibited only moderate clonal dynamics, even in the presence of immunochemotherapy. Analysis of somatic mutations impacting 94 genes was undertaken in an extension cohort consisting of 395 samples from 277 patients in order to decipher disrupted biology in the two clinical end points. We found 12 genes that were more commonly mutated in transformed samples than in the preceding FL tumors, including TP53, B2M, CCND3, GNA13, S1PR2, and P2RY8. Moreover, ten genes were more commonly mutated in diagnostic specimens of patients with early progression, including TP53, BTG1, MKI67, and XBP1. Conclusions Our results illuminate contrasting modes of evolution shaping the clinical histories of transformation and progression. They have implications for interpretation of evolutionary dynamics in the context of treatment-induced selective pressures, and indicate that transformation and progression will require different clinical management strategies.


Specimen acquisition and sample preparation
The whole-genome sequencing (WGS) cohort consists of 41 patients that were selected to fall into three groups: 1) A "Transformed" group of 15 patients (TFL) diagnosed with FL and subsequent or concomitant (patient FL1014) transformation to large cell lymphoma; these cases were selected irrespective of type of treatment received. 2) A "Progressed" group of 6 patients (PFL) that were diagnosed with FL and subsequently experienced progressive disease without histological evidence of transformation; 5 out of these 6 patients progressed within 2.5 years after starting first-line immuno-chemotherapy with R-CVP (rituximab, cyclophosphamide, vincristine and prednisone). 3) A "Good outcome" group of 20 patients (NPFL) who were diagnosed with FL and did not experience progression for at least 5 years. All cases were selected irrespective of clinical stage, grade and t (14;18) translocation status in order for the cohort to be reflective of the clinical and pathological heterogeneity that is inherent to FL. Samples with a tumor content of less than 50% and available frozen single cell suspensions were flow-sorted to purify tumor (CD19+ kappa or lambda+ CD3-) and germline cells (CD19-kappa or lambda-CD3+). Germline DNA was obtained from flow-sorted CD3+ lymphocytes or from peripheral blood cells. All germline samples were confirmed to be free of tumor contamination by the absence of PCR-amplifiable patient-specific t (14;18) and/or VDJ rearrangements.
The capture sequencing cohort refers to the samples from 277 patients (39 patients overlapping with the WGS cohort), in which germline DNA was available for 80 patients. These patients were divided into three groups: 1) A "Transformed" group of 159 patients (sample at primary FL timepoint available in 128 cases, sample at transformed FL timepoint available in 149 cases and samples from both timepoints available in 118 cases). 2) A "Progressed" group of 41 FL samples from patients who presented with early progressive disease within 2.5 years after starting immuno-chemotherapy with R-CVP. Progression was defined as radiological evidence of progressive disease and requirement for initiation of second-line therapy. 3) A "Good outcome" group of 84 patients without progression for at least 5 years after either observation or R-chemotherapy and R maintenance.
For both whole-genome sequencing and capture sequencing, DNA was extracted from frozen tissue or single cell suspensions using Qiagen DNA/RNA AllPrep kits. Timepoint specificity is indicated for each sample by the su xes -T1 for the primary timepoint (by definition FL) or -T2 for the secondary timepoint in the TFL or the PFL cohorts (transformed or treatment-resistant FL). Clinical annotation for all samples can be found in S2 Table, S6 Table, and S5 Table. 2 Pathology All samples were centrally reviewed by expert hematopathologists at the BC Cancer Agency (AM, PF, KT, RDG), and the following histopathologic co-variates were recorded: FL grade (1, 2 or 3A), histological diagnosis at transformation (DLBCL, composite or B-cell lymphoma not otherwise classifiable (BCLU)), cell of origin for all TFL cases with a DLBCL histology, and immunohistochemistry for expression of TP53, IRF4, CD8, and B2M. Composite histology was defined as any evidence of underlying low grade lymphoma in a sample that concomitantly harbored large cell lymphoma. The Lymph2Cx assay was performed as previously described [1,2], with the exception that it was applied in 4 cases to RNA extracted from fresh-frozen blocks, using 100 ng as input. Immunohistochemical stained slides for the T cell marker CD8 (antibody clone C8/144B, Dako, catalogue number M7103) were scanned with an Aperio ScanScope XT at 20x magnification.
Analysis was performed using the Aperio ImageScope viewer (v12.1.0; Aperio Technologies). Only cores and areas containing tumor were scored by applying the Positive Pixel Count algorithm with an optimized color saturation threshold. Any staining was considered positive and the number of positive pixels was divided by the total pixel count. Scores from both cores were subsequently averaged and multiplied by 100 to obtain the percentage of positive pixels. Images from tissue cores stained for CD8 and B2M (rabbit polyclonal antibody, Dako, catalogue number A0072) were taken using a Nikon Eclipse 80i microscope equipped with a Nikon DS-Ri1 camera and NIS Elements Imaging Software, D3.10.

Availability of sequencing data
Genome data is available at the European Genome-phenome Archive (http://www.ebi.ac.uk/ega) under accession number EGAS00001001709.

Library construction, alignment, and filtering
Whole genome sequencing (WGS) libraries were constructed from genomic DNA using PCR-free library construction protocols, with the exception of libraries from cases FL1001, FL1002, FL2001, and FL2002 that were constructed during an earlier phase of the project using PCR-containing protocols. Libraries were sequenced on Illumina HiSeq 2500 instruments, generating on average 1822261237 paired-end sequence reads of 100-125 length per library. Libraries were aligned and processed using the standard Canada's Michael Smith Genome Sciences Centre WGS alignment pipeline. The BWA (v0.5.7) aligner [3] was used to align the paired-end reads to the human reference genome GRCh37 (http://www.bcgsc.ca/downloads/genomes/ 9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa) . PCR duplicates were marked using Picard tools (v1.126) using MarkDuplicates and sequencing statistics (S2 Fig; S1 Table) were collected using CollectWgsMetrics and CollectMultipleMetric. No joint realignment of the reads within this cohort was performed.
While assessing the number and type of somatic single nucleotide variant (sSNV) substitutions across all our samples, it became apparent that samples FL1005T1, FL1009T1, FL1009T2, FL1012T1, FL1012T2, FL1014T1, FL1014T2, FL2005T1, FL2005T2 and FL3014T1 had an overwhelming representation of C to A substitutions at low allelic ratios (between 10-15%). The cases appeared to be randomly a ected. The low allelic ratio C to A substitutions were documented to be artifactual by amplicon sequencing in sample FL1007T1 (data not shown) and deemed to be compatible with oxidative damage during DNA shearing, as described by Costello et al. [4] We therefore filtered these positions out by applying allelic ratio filters for C to A substitutions in the a ected samples (filter set at 0.20 for FL1005T1, FL1009T1, FL1009T2, FL1012T2, FL1014T1, FL3014T1 and at 0.25 for FL1012T1, FL1014T2, FL2005T1, FL2005T2).
All sequencing data will be deposited at the European Genome-phenome Archive (http://www.ebi.ac. uk/ega) under accession number EGAS00001001709.

Somatic single nucleotide variant and small insertion-deletion predictions
The full bioinformatics workflow for predicting somatic single nucleotide variants (sSNVs) from the whole genome sequencing data is shown in S14 Fig. MutationSeq [5] (v4.1.0) was used to predict sSNVs for each tumor-normal pair using the parameters "-m model_v4.0.2.npz". To increase the sensitivity for sSNVs, Strelka [6] (v1.0.13) was also used to predict sSNVs for each tumor-normal pair using the default parameters.
Mutations reported by MutationSeq and/or Strelka (i. e. appeared in the passed.somatic.snvs.vcf file) were aggregated to form a timepoint-specific list of candidate sSNV positions. For TFL and PFL patients, a patient-centric candidate list of sSNV positions was generated by aggregating the candidate sSNVs positions across both timepoints (i.e. T1 + T2 candidate sSNVs). For NPFL, the patient-centric candidate list is equivalent to the T1 candidate list as there is only one timepoint sample.
MutationSeq was then re-run specifically interrogating the patient-centric sSNV candidate list across both timepoints (for TFL and PFL patients) and single timepoint for NPFL patients. This e ectively allowed us to retrieve sSNV information at candidate positions predicted by MutationSeq and Strelka in a consistent output format, and also across timepoints. Final putative timepoint centric sSNVs lists were constructed based on the following criterion: 1) the sSNV had a MutationSeq probability >= 0.9 and MutationSeq filter field = "PASS", 2) the sSNV was predicted by Strelka and MutationSeq filter field = "PASS", or 3) sSNV was predicted by Strelka, MutationSeq filter = "INDL" and MutationSeq probability >= 0.9. SnpE (v3.5) was then used to annotate each sSNV with respect to the canonical transcript with the parameters "-canon -no-downstream -no-intergenic -no-upstream" using the GRCh37.72 SnpE database.
In the scenario where a position may have multiple SnpE e ects, the e ect with the most impact was chosen. The e ect impact was ordered as follows from lowest to highest: 1) intragenic, 2) intron, 3) exon, The same runs of Strelka for predicting sSNVs also predicted small somatic insertions and deletions (sIndels) with the passing results placed into the passed.somatic.indels.vcf file. These indels all passed the Strelka internal filters that are listed at https://sites.google.com/site/strelkasomaticvariantcaller/home/ somatic-variant-output. SnpE was used to annotate each sIndel with the same parameters as used in the sSNV annotations. For downtream analyses, coding sIndel e ects were considered to be: 1) codon change plus codon deletion, 2) codon change plus codon insertion, 3) codon deletion, 4) codon insertion, 5) frame shift, 6) splice site acceptor, 7) splice site donor, and 8) splice site region.

Tumor content estimation
To estimate the tumor content of each tumor-normal pair, we used the sSNV results from the whole genome sequencing data. More specifically, we applied a variational bayes binomial mixture model clustering (VBBMM) on the sSNV allele count data of each patient. For TFL and PFL patients, we clustered in Next, we identified the cluster most representative of the clonally dominant diploid heterozygous mutations in each patient. To calculate the sample-specific tumor content, we took the mean variant allele fraction (VAF) of that cluster in that sample and multiplied by 2. These tumor/normal content estimation values were used as input into PyClone [7] and TITAN [8]. For TitanCNA, there were a few exceptions where the tumor/normal content estimations from sSNV data are not used. See "Somatic copy number alteration prediction" in the Supplemental Appendix for specific details on these samples.

Somatic copy number alteration prediction
The full bioinformatics workflow for predicting somatic copy number alterations (sCNA) from the WGS data is shown in S17 Fig available as part of the SnpE package). Read count data for the reference and variant alleles were then retrieved for all these snp positions in both the tumor and normal bam files. Reads considered PCR duplicates and non-unique were filtered "samtools view -F 1024" and not considered in the allele read counting.
HMMcopy [9] (v0.1.1) was used to generate coverage wig files for the tumor and normal samples using a window size of 1000 base pairs ("readCounter -w 1000"). Additionally, HMMcopy was also used to calculate GC content of the GRCh37 genome ("gcCounter -w 1000"). Finally, a GRCh37 mappability file was generated by first running HMMcopy's "generateMap -w 35" to generate a BigWig file which is used as input into HMMcopy's "mapCounter -w 1000" to generate a final mappability wig file.
The tumor-normal pair's read count, coverage data, and normal content estimations (see "Tumor content estimation" in the Supplemental Appendix for details on estimation) along with the GRCh37 GC content and mappability data were used as input into TITAN [8] bioconductor R package (named TitanCNA; v1.5.7) to predict sCNAs. The TitanCNA "loadDefaultParameters" function was used with the parameters "copyNumber = 4, numberClonalClusters = 1. . . 5, symmetric = TRUE" followed with setting ploidy = 2 to get the initial parameters. The numberClonalClusters was set to a value between 1 to 5 (more details below) Data was filtered for low and high depth positions ("filterData with parameters minDepth = 10, maxDepth = 200").  (FL1001T2, FL1007T2, FL1008T2, FL1012T2, FL2008T1, FL3004T1, FL3006T1, FL3012T1,   FL3015T1), the optimal model selection was not a pragmatic solution. In these pairs, we investigated the model for every numberClonalClusters value and selected the model selection with the most pragmatic results. For FL1001T1 and FL3012T1, the TITAN results were uninterpretable indicating that TITAN was having di culty converging on a pragmatic solution. For these patients, we ran TITAN across a range of normal content initialization values (i.e. 0.1 to 1) to determine if normal content was contributing to the non-convergence. We discovered that a normal content of 1 as an initialization value gave a pragmatic solution and chose to go forward with the results of this initialization value. The final parameters values selected are listed in S7 Table. The output TITAN results were then converted into TITAN segments using the "createTITANsegment-  Table. To increase our specificity for sCNAs, we masked our segment data with a copy number variant mask (mask is listed in S8 Table). This mask was constructed using the gold standard data from the Database of Genomic Variants (vJuly 2015), peripheral blood samples [10], and normal breast tissues (METABRIC [11]).
TITAN segments were overlapped with the copy number variation mask and any segment that overlapped with >= 25% of its length with any mask segment was labeled as a mask segment and assigned a neutral summary state.
To assign gene-centric copy number, genes coordinates (Ensembl v72) were overlapped with the TITAN segment coordinates. Genes were considered overlapping with a TITAN segment if >= 50% of the gene overlapped with the segment and then would be assigned the state of the overlapping TITAN segment.

Somatic structural rearrangement prediction
Destruct [12] (v0.2.0) was used to predict structural rearrangements for each patient. For TFL and PFL patients, the normal, T1 and T2 samples were used as input into Destruct to simultaneously predict rearrangements across all samples. For NPFL patients, the normal and T1 samples were used. The following filters were applied for Destruct predictions: 1) 0 read in the matching normal sample, 2) distance to any other breakpoint is > 50 basepairs, 3) log likelihood > ≠20, 4) minimum template length > 120, 5) matescore AE 10, 6) both rearranged partners must be on an autosomal or sex chromosome, 7) not found as a variant in the database of genomic variants database, and 8) number of split reads > 0 if the rearrangement does involve the IGH locus. When considering rearrangements that may a ect a gene, the breakpoint must be inside the gene itself and not upstream or downstream of the gene. A rearrangement was considered to be shared if there was Ø 1 read in both timepoints and timepoint specific if it contained 0 reads in one timepoint.

Digital droplet PCR
Digital droplet PCR (ddPCR) was performed on selected pairs of T1 and T2 biopsies. As controls, we used 200ng of genomic DNA from 3 reactive lymph node samples from unrelated individuals and the data from these samples was pooled. For the test samples, we used a minimum of 600ng and 200ng of genomic DNA for T1 and T2 samples, respectively. Primers and FAM-or HEX-tagged probes were designed by Integrated DNA Technologies (IDT) and their sequences are reported in S9 Table. PCR was performed in droplets generated using the Biorad QX200 droplet generator and each 20 µL reaction mix contained 900nM forward and reverse primers, 250nM FAM and HEX probes and 5 units of HindIII, in addition to 1x ddPCR Supermix for Probes (No dUTP) (Biorad), genomic DNA and water. For each pair of probes, the optimal annealing temperature was determined using a temperature gradient. PCR conditions were as follows: 95C for 10 minutes, (94C for 30 seconds, optimal annealing temperature for 90 seconds) x 39, 98C for 10 minutes. Droplets were assessed for FAM or HEX fluorescence using the Biorad QX200 droplet reader. In all instances, we verified that no signal for either wild-type or mutant DNA could be detected in the non-template control wells. We clustered results from T2 samples using Gaussian mixture models and the R package mclust (v5.1), setting the number of mixture components to 4 and initializing hierarchical clustering on a random sample of 1000 datapoints. If clustering using this approach did not reveal 4 distinct clusters corresponding do empty droplets, wild-type only droplets, mutant only droplets or double positive droplets, we repeated the clustering until an appropriate solution was found. The seed of the first successful clustering was then set for all other samples corresponding to the same case. As DNA degradation with time potentially competes with the detection of rare alleles, we only considered single, mutant signal-positive droplets to be positive events, as described in Wong et al. [13].
6 Targeted deep amplicon sequencing data analysis 6

.1 Selection of positions for deep amplicon sequencing
For all cases, we selected at least 192 predicted mutations to be taken forward for targeted deep amplicon sequencing. These mutations included all non-synonymous (S18 Fig b) and synonymous coding sSNVs (S18 Fig c) as well as coding sIndels. As this sum fell typically short of 192, we backfilled the list of positions for deep amplicon sequencing to a total of 192 by also including non-coding sSNVs that were proportionally selected from mutational clusters generated from each patient's sSNV VAF (S18 Fig d). These clusters are the same clusters identified through the VBBMM used for tumor content estimation (S18 Fig a).

Primer design, library construction, and alignment
Primers were designed using an in-house automated primer design pipeline using the following input parameters for Primer3: [14,15]

Somatic single nucleotide variant validation
Reads were fitered out if they either 1) aligned > 10 base pairs away from an amplicon's start or end position, In some normal samples, there was evidence of contaminating tumor DNA leading to the presence of the variant allele in the matching normal. To deal with these situations, we performed a one-tailed fisher exact test on the tumor and normal read counts testing if the allelic ratio was higher in the tumor. A significance level of < 0.05 was set to specify if a position was somatic. The mean ± standard deviation validation rate (precision) was 96.3% ± 5.4%.

Clonal dynamics analysis
For clonal dynamics analysis, we chose to use sSNVs as genetic markers for clonal lineage tracking. There were 2 major reasons for this: 1) the mutational burden of sSNVs is far higher (7133.29 ± 3107.02) than sIndels (512.63 ± 296.67) giving us an adequate number of genetic markers for clonal lineage tracking, 2) using sSNV as a genetic marker for clonal lineage tracking is much less technically challenging than using other genetic markers (e.g. indels). Specifically, the first step in the process of clonal analysis requires the ascertainment of accurate VAFs. This in turn requires accurate alignment of indels which is known to be a di cult problem in the field of sequence aligners. As a result, indel VAF measurements are generally more susceptible to technical noise which in turn a ects cellular prevalence, clonal prevalence and ultimately the clonal lineage tracking.

Estimation of mutational cellular prevalences
For each TFL and PFL patient, we inferred the mutational cellular prevalence of each validated sSNV (aka. proportion of cancer cells with sSNV) for both the T1 and T2 sample. Specifically, any mutation that was validated in a T1 and/or T2 sample ("Somatic single nucleotide variant validation" in the Supplemental Appendix; S19 Fig a-b) was used as an input into PyClone (v0.12.7). For NPFL patients, only the T1 sample was considered. PyClone requires for each sSNV: 1) read count of the variant allele, 2) read count for the reference allele, and 3) major and minor copy number. For 1) and 2), these data were retrieved from the targeted deep amplicon sequencing ("Targeted deep amplicon sequencing data analysis" in the Supplemental Appendix). The major and minor copy number data of each feature was taken from the TITAN segments ("Somatic copy number alteration prediction" in the Supplemental Appendix). In addition to these inputs for each sSNV, the tumor content estimated from the sSNV data ("Tumor content estimation" in the Supplemental Appendix) was also used. Any sSNV without matching copy number data or in a homozygous deleted region is not considered for PyClone analysis.
The PyClone "build_mutations_file" with the parameters "-var_prior parental_copy_number" was used to construct sample-centric mutation yaml files. Patient configuration yaml files were constructed manually with sample-specific tumor content estimations and "error_rate = 0.001" along with the following parameters: These yaml files were then used as input into the "analyze" function of PyClone with the parameters "-seed 1000". The "build_table" function with the parameter "-burnin 50000" was then used to retrieve the estimated mutational cellular prevalence of each sSNV (S19 Fig c). As some sSNVs had large uncertainty in their estimated mutational cellular prevalence, we performed a credible interval analysis using the "CI_filter.py" script from PostPy (v0.1) using the parameters "-b 50000 -i 90 -r 0". Any sSNVs with a 90% credible interval larger than 0.3 in any sample of a patient were filtered out. In addition, positions in the IGH location with low coverage were also filtered out. PyClone was then re-run without the filtered sSNVs.

Estimation of clonal prevalence and phylogenies
sSNVs with similar mutational cellular prevalences were clustered by PyClone. We next calculated each cluster's mean cellular prevalence by taking the mean of cellular prevalence of all sSNVs in the cluster (S19 Fig d). After filtering out singleton clusters (i.e. clusters with only a single sSNV) and clusters with subclonal subclonal copy number changes, cluster means were then used as input into Citup [17] (v0.2) to estimate clonal prevalences and phylogenies (S19 Fig d-e).

Filtering of PyClone clusters for clonal phylogeny construction
The following filters were applied to PyClone clusters to remove clusters before input into Citup: 1) all singleton PyClone clusters (i.e. clusters with only a single sSNV), 2) clusters with subclonal copy number changes as these result in inaccurate VAF and mutational cellular prevalence estimations, 3) clusters that violated phylogenetic construction rules. Specific details on each filtered patient cluster are described below.
For FL1005, cluster 1 has a subclonal deletion that results in the variant alleles being deleted in subset of the tumor cells. This results in a decrease in VAF and mutational cellular prevalence resulting in a di erent cluster. Similarly, cluster 2 has a subclonal deletion that results in the reference alleles being deleted in subset of the tumor cells. This results in an increase in VAF and mutational cellular prevalence resulting in a di erent cluster. The sSNVs in these two clusters should belong in the ancestor cluster 3.
For FL2007, cluster 1 has subclonal amplifications of the reference alleles resulting in a decrease in VAF and mutational cellular prevalence resulting in a di erent cluster. The sSNVs in this cluster should belong in the ancestor cluster 3.
For FL2008, cluster 5 has two sSNVs chr18:75515528 and chr6:2606119 both of which have a subclonal copy number change. Specifically, chr18:75515528 has a subclonal deletion that deletes the variant allele (T).
While chr6:2606119 has a subclonal copy number gain that results in the reference allele (C) being gained.
The sSNVs in this cluster should belong in the ancestor cluster 6. Cluster 4 violates the additive rule as the sum of the T2 mutational cellular prevalences of clusters 3 and 10 are greater than cluster 4.

Neutral evolution modeling
Genetic drift trajectory modeling was performed using the Wright-Fisher model [18]. Under this model, the following assumptions are made: 1) generations do not overlap, and 2) each copy of an allele in a generation is independently drawn from the previous generation at random, and 3) the number of cells does not change between generations. Simulations were initialized with 10000 diploid cells and a starting mutant VAF of 0.01 for TFL and 0.5 for PFL patients. Each simulation was run for 10000 generations and a total of 1000 simulations were run. K-means clustering, ranging from 2 to 10 clusters (k), was used to cluster together simulations that exhibit similar genetic drift trajectories. To decide on the optimal number of clusters, the total within sum of squares vs. k-clusters was plotted and the optimal number of cluster was selected based on the elbow of the curve. This resulted in 5 and 4 optimal clusters for TFL and PFL patients respectively.
After identifying the cluster with the genetic drift trajectory most similar to the observed patterns in TFL patients, the proportion of simulations of that trajectory was used as the background trajectory rate.
This rate was then used as input into the binomial exact test to quantify the probability of 13 out of 15 TFL patients following the observed patterns.
9 Capture-based targeted sequencing

Capture panel and library construction
We selected 86 genes (S10 Table) for capture-based targeted sequencing, based on the following 6 criteria: 1) recurrence in FL of >5%, [19,20], 2) recurence in DLBCL >5% (our own data) or reported to be consistently mutated in Burkitt lymphoma across studies, [21][22][23], 3) genes significantly mutated (q <0.05) in the compiled dataset from our study and others [24,25] applying the MutSigCV algorithm [26] (v1.4), 4) T1 gene mutations associated with early transformation/progression (<5 years) versus no progression (for at least 5 years) based on our data and external cohorts [24,25], 5) genes that were found to be T2-specific in at least 3 cases from our and external cohorts, [24,25] and 6. IL4R, PTPN1, NOTCH2 and RFX5. Furthermore, we selected 20 genes, 12 of which overlapped with the 86 above-mentioned genes, to assess mutations in areas that are targets of somatic hypermutation (S11 Table). Libraries were constructed from either 500ng of fresh-frozen genomic DNA or 200ng of FFPET-derived genomic DNA, and captured using custom SureSelectXT2 baits (Agilent). Captured libraries were pooled to a maximum of 46 libraries per pool and each pool was sequenced on one Illumina HiSeq lane, generating 125bp indexed reads (V4 chemistry). In total, the capture space was 452, 129 base pairs. Analysis details of capture-based sequencing are provided in the supplementary materials.
Full sequencing statistics can be found in S12 Table.
• filtering out of putative SNPs: for cases with available germline DNA sequenced (n = 80), variants were filtered out if they were present in the germline samples; for cases without available germline DNA sequenced, SNVs were filtered out if they were found in at least 2 normal (unrelated) samples or in dbSNP (v147); -for 8 cases without germline DNA sequenced by targeted capture sequencing, SNPs were filtered out using matching germline whole-genome sequencing libraries; variants that occurred more than 4 times and had a mean variant allele frequency > 0.45 and < 0.55, with a standard deviation of < 0.1 were filtered out.
• any variant that occurred within a distance of < 10 base pairs of an indel was filtered out.
• putative artifacts were filtered out, defined as as variants that occurred more than 5 times in either fresh-frozen or FFPET samples, with a mean variant allele frequency of < 0.15 or > 0.9; or variants that occured in more than 10% of either fresh-frozen of FFPET samples.
sSNVs were annotated with SnpEFF and if multiple e ects were called for a single position, the e ect with the putatively greatest impact was chosen as described above for SNV calling in our WGS cases.
For samples with matching germline, indels were called using Strelka (v1.0.13) and the output was not filtered. For samples without matching germline, indels were called using VarScan and germline variants were filtered out using dbSNP (v137) and the 1000 genomes. Artifacts were filtered out in an analogous fashion to sSNVs.

Copy number analysis
Copy number analysis was performed using CNVkit [27] (v0.7.11) using the workflow depicted in S20 Fig. Only the FF samples were usable for this analysis as the fragmentation of FFPET samples resulted in poor quality copy number estimates (data not shown). SNPs were generated using VarScan and restricted to positions found in dbSNP (v137). These SNPs were then used as input to generate parental copy number in CNVkit. Copy number segments were projected onto gene sca ords (Ensembl v72) to generate gene-centric copy number estimates.

Determination of mutational cellular prevalence
This analysis was restricted to samples from those 32 patients in which fresh-frozen samples had been sequenced from both timepoints by capture-based sequencing and that had not been subjected to WGS.
Mutational cellular prevalence was determined per sSNV using PyClone, using the following inputs: • variant allele frequency from repeat MutationSeq run, providing variant allele frequencies even if the variant had not been called by our SNV prediction pipeline; • tumor content estimated from clustering of variant allele frequencies across both samples from each patient; • copy number status determined using CNVkit.
sSNVs were filtered out of samples from both timepoints for a given patient if the standard deviation of the estimate of cellular prevalence was > 0.1 in any of the two samples.

Code availability
The software used in this project, along with their version indicated with "v" prefix, is listed below. URL addresses are provided to indicate where to obtain the software.