Validation of Next Generation Sequencing Technologies in Comparison to Current Diagnostic Gold Standards for BRAF, EGFR and KRAS Mutational Analysis

Next Generation Sequencing (NGS) has the potential to become an important tool in clinical diagnosis and therapeutic decision-making in oncology owing to its enhanced sensitivity in DNA mutation detection, faster sample turnaround than current gold standard methods and the capacity to sequence a large number of cancer-driving genes at once. We aim to test the diagnostic accuracy of current NGS technology in the analysis of mutations that represent current standard-of-care, and its reliability in generating concomitant information on other key genes in human oncogenesis. Thirteen clinical samples (8 lung adenocarcinomas, 3 colon carcinomas and 2 malignant melanomas) already genotyped for EGFR, KRAS and BRAF mutations by current standard-of-care methods (Sanger Sequencing and q-PCR) were analysed for mutations in the same three genes using two NGS platforms, and in an additional 43 genes using one of these platforms. The results were analysed using closed platform-specific proprietary bioinformatics software as well as open third-party applications. Our results indicate that the existing format of the NGS technology performed well in detecting the clinically relevant mutations stated above but, in its current design, may not be reliable for a broader unsupervised analysis of the wider genome. Our study represents a diagnostically led validation of the major strengths and weaknesses of this technology before consideration for diagnostic use.


Introduction
Molecular cancer diagnostics in clinical practice is constantly and rapidly evolving. With the need to identify standard-of-care mutations in companion diagnostics to predict therapeutic response, cancer treatment has been revolutionised.
Since the 1970s, the Sanger method [1] has been the gold standard for mutation analysis in cancer diagnostics; however, its low throughput, relatively low sensitivity, long turnaround time and overall cost [2] have called for new paradigms. Next Generation Sequencing (NGS) sequences millions of DNA segments in a massively parallel fashion and, in principle, offers possible lower costs, increased workflow speed and enhanced sensitivity in mutation detection [3]. As whole-genome sequencing may be unaffordable for routine diagnostics, the targeted sequencing of exon coding regions or a subset of 'genes of interest' offered by NGS is an attractive proposition a priori [4]. When compared with single-gene analysis, the use of NGS in diagnostics would allow the analysis of more than one therapeutic avenue as well as the generation of other valuable information for research purposes. To be a plausible option, a) NGS technologies must be as efficient as the current detection methods in the diagnosis of those single genes that currently represent standard-of-care; and b) the extra information generated must be of sufficient quality to consider alternative therapies or be accepted for downstream research endeavours. Such validations should include, at least, the V600E mutation in the BRAF gene, indicating which malignant melanoma patients respond effectively to vemurafenib treatment [5,6]; mutations of the EGFR gene to predict which lung adenocarcinomas respond to tyrosine kinase inhibitor (TKI) treatment, primarily those identified in amino acid (aa)719 exon 18, exon 19 deletions, aa768 exon 20 and aa858 exon 21 [7,8]; and KRAS mutations in exon 2 (codons 12 and 13) to predict lack of response to targeted monoclonal antibodies in colorectal cancer [9]. To test the presumed advantage of NGS versus single-gene approaches, the bench validation must be accurately executed and major challenges in bioinformatic analysis met.
Indeed, the clinical utility of NGS has been described in other disease settings. For example, NGS has been reported to be as reliable as Sanger sequencing in the detection of a range of mutations associated with hereditary cardiomyopathy. The authors concluded that targeted NGS of a disease-specific subset of genes matches the quality of Sanger sequencing and can therefore be reliably implemented as a stand-alone diagnostic test [10].
Here, we test the validity of the current technical designs for the NGS analysis of BRAF, EGFR and KRAS (and of more than 40 other key oncogenes), exploring all the bench and bioinformatics analytical variables to confirm the possible application of these technologies for routine cancer diagnostics.

Materials and Methods
(See S1 for choice of clinical materials, DNA extraction protocols, sequencing analysis by Sanger and q-PCR platforms, general sequencing workflow for NGS analysis and ethical framework of the study). Thirteen aliquots of tumour DNA extracted from formalin-fixed-paraffin-embedded (FFPE) malignant melanoma, lung adenocarcinoma and colon carcinoma, and genotyped by Sanger/q-PCR sequencing for BRAF, EGFR and KRAS status, respectively, were obtained from the Northern Ireland Biobank following ethical approval (NIB12-0049). The data set has been deposited in NCBI SRA http://www.ncbi.nlm.nih.gov/sra (SRP023265).

IonTorrent sequencing
PGM sequencing was performed according to IonTorrent protocols using 10 ng DNA (Table S1), Ion AmpliSeq™ Cancer Panel primer pool and Ion AmpliSeq™ Library Kit 2.0 Beta (Life Technologies) for whole-exon-sequencing of BRAF, EGFR and KRAS and targeted "hot-spot" regions in 43 other cancer-related genes (http://tools.invitrogen.com/content/sfs/brochures/IonAmpliSeq_CancerPanel_Flyer_CO32201_06042012.pdf). Template preparation was performed on the Ion OneTouch™ system for 100 bp libraries (Life Technologies). QCs were performed using the Ion Sphere™ Quality Control Kit (according to the protocol) ensuring that 10-30% of template positive Ion spheres were targeted in the emPCR reaction. Prior to loading onto 314 chips, sequencing primer and polymerase were added to the final enriched Ion spheres.

GS Junior sequencing
GS Junior Titanium Fusion primers for BRAF (exon 15) and KRAS (exon 2) were designed incorporating 8 Roche multiplex identifier (MID) barcodes. BRAF and KRAS libraries were prepared adhering to the Roche Amplicon Library Preparation manual. For EGFR, libraries were prepared using the EGFR18-21MastR kit (Multiplicom), according to the accompanying protocol. Clonal amplification onto DNA capture beads was performed manually adhering to the emPCR Amplification manual-Lib A (Roche). After DNA library bead enrichment, adaptor-specific sequencing primers were annealed and libraries were sequenced according to the Sequencing Manual on the GS Junior (Roche).

Bioinformatics analysis
IonTorrent 'closed' bioinformatics. Analysis was performed using IonTorrent Version (V) 2.0.1 (ID.1-ID.10 and ID.12) and reanalysed by V2.2 upgrade (all samples). The HG19 reference was used for alignment. For 314 chip sequencing a threshold of ≥200,000 final quality library reads was applied.
GS Junior 'closed' bioinformatics. Analysis was performed using Roche 454 Amplicon Variant Analyzer (AVA) V2.7 for BRAF and KRAS. HG19 BRAF and KRAS regions were used as the alignment reference, respectively. Multiplicom provided scripts for analyzing their EGFR18-21MastR Kit sequencing results. A threshold of ≥50,000 final quality library reads was applied for all.
Variants obtaining a frequency of detection ≥5% were considered in the analysis.
'Open' CLC Genomics Workbench. V5.5 was employed to comparatively analyse data generated by IonTorrent software (PGM) and AVA (GS Junior). HG19 was downloaded within CLC, incorporating tracks from the COSMIC database. Alignment was carried out by 2 methods: a loose alignment setting to identify large base changes, for example deletions, and a stringent alignment setting for quality based variant detection (QBVD) of single nucleotide variants (SNVs), both capped at 5% mutation frequency. Three QBVD thresholds were applied: (i) the lowest coverage and (ii) the second lowest coverage required to detect standard-of-care variants. A third arbitrary threshold (iii) at ~2-fold higher coverage than (ii) was also employed. Coverage equated to (i) = 71 (aa719 exon 18; EGFR), (ii) = 259 (aa858 exon 21; EGFR) and (iii) = 500, meaning any SNVs with coverage equal to or above these thresholds were included in CLC analysis and compared. For IonTorrent, four sets of data were retrieved:

EGFR analysis
Lung adenocarcinoma (ID.3-ID.10). Deletion in exon 19 affecting aa745_750 was detected in 25% of patients (ID.7 and ID.10) and was concordant across Sanger, q-PCR (Cobas), IonTorrent (V2.2) and Roche GS Junior platforms (mean coverage depth >2500, Table 1). A variant in exon 20 G/A aa803 was detected in ID.10 by IonTorrent but disregarded, as reanalysis by CLC and subsequent GS Junior sequencing did not confirm the base call; it therefore did not meet our variant 'passed' criteria outlined in Materials and Methods. Lowering the stringency of the bioinformatic analysis allowed the detection of standard-of-care TKI-sensitising mutations [7,8] that were not detected by single-gene analysis and thus are likely to represent false-positive results. For example, in ID.9, a SNV was detected in exon 18 G/T aa719 by IonTorrent and, as a result of applying the lowest threshold, this SNV was also detected by CLC re-analysis at a coverage of 89. By employing more stringent thresholds (QBVDii = 259; QBVDiii = 500), this standard-of-care mutation was not detected. Variants that 'passed' the 2/4 analysis criteria but were not identified by the gold standard Sanger/q-PCR methods were found in ID.4, ID.5 and ID.9 by IonTorrent (and CLC re-analysis) at the sensitizing mutation EGFR exon 20 G/T aa768 [8]. This finding was discordant with GS Junior and Sanger/q-PCR. Sequencing coverage averaged 477 (for all 3 QBVD thresholds). Of note, when applying the highest threshold (QBVDiii = 500) only, this standard-of-care mutation would have been called in ID.4 alone. A variant 'passed' in 4/4 NGS analyses was the important activating mutation in ID.9 at exon 21 T/G aa858 (also identified by Sanger/q-PCR). Importantly, by applying the highest QBVDiii threshold of 500, this key SNV would only have been called in 1/4 analyses of ID.9. Variants 'passed' in 4/4 NGS analyses were the two standard-of-care mutations in exon 18 G/T aa719 and exon 20 G/T aa768 in ID.8, concordant with Sanger/q-PCR (Table 1).
Even when applying the most stringent threshold (QBVDiii = 500), both therapeutic mutational targets were called. In addition, two silent germline polymorphisms [10,11] were detected in 4/4 NGS analyses: a SNV at exon 21 C/T aa836 in ID.9 and a SNV at exon 20 G/A aa787 in all lung adenocarcinoma samples. Frequencies of EGFR SNVs were concordant between IonTorrent and re-analysis by CLC (rho.c.est = 0.9957) and similarly between GS Junior's AVA data and CLC_AVA (rho.c.est = 0.9953), verifying a strong overlap between platform-specific proprietary software and open third-party tools. Concordance between IonTorrent software and GS Junior's AVA pipeline for matched EGFR SNV frequencies was poor (rho.c.est = 0.7718).
Malignant melanoma (ID.1 and ID.2). The therapeutic TKI target in EGFR exon 18 aa719 was detected in ID.2 by IonTorrent and CLC_IonTorrent analysis, albeit at the least stringent threshold of 71. No other mutations in EGFR were considered as they have previously been described as silent germline polymorphisms [10,11] or did not meet the variant 'passed' criteria (exon 20 aa804) (Table 1).
Colorectal carcinoma (ID.11-ID.13). In 1/3 clinical samples, IonTorrent and CLC_IonTorrent analysis identified a SNV at exon 18 G/T aa719 however application of the two highest stringency thresholds excluded this base call. Again, the germline silent mutation in EGFR exon 20 aa787 was returned from the analysis in all colorectal carcinoma samples in this study and has been reported by others [12] (Table 1).
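The text does not state how the rho.c.est values quoted above were computed. The label resembles the point estimate of Lin's concordance correlation coefficient as reported by some statistical packages, and the following minimal sketch assumes that this is the statistic intended. The function name `lins_ccc` and the example frequencies are hypothetical and are not study data.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumption: this is the statistic behind the 'rho.c.est' values; the
    source does not say so explicitly. Uses 1/n (population) moments.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with >= 2 pairs")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("coefficient undefined for constant, identical data")
    return 2 * cov_xy / denominator


# Hypothetical variant frequencies (%) for the same SNVs from two pipelines.
pipeline_a = [10.0, 20.0, 30.0]
pipeline_b = [11.0, 21.0, 31.0]  # same ranking, constant offset of +1

identical = lins_ccc(pipeline_a, pipeline_a)
shifted = lins_ccc(pipeline_a, pipeline_b)
```

Unlike Pearson's correlation, the coefficient penalises systematic offsets between two pipelines, so `shifted` falls below 1 even though the two series are perfectly correlated.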

BRAF and KRAS analysis
Malignant melanoma (ID.1 and ID.2). BRAF: The standard-of-care mutation, exon 15 A/T aa600 [5] was detected in both malignant melanoma clinical samples by the q-PCR method. This finding was concordant with sequence data generated from the IonTorrent NGS platform (ID.1 and ID.2) and GS Junior (ID.2 only), Table 2. Unfortunately, the clinical sample, ID.1, was exhausted and analysis with 454 GS Junior platform could not be completed.
KRAS: Although not meeting the variant 'passed' criteria, it is noteworthy that the clinically relevant KRAS mutation in exon 2 C/T aa12 was called (IonTorrent only) in ID.1, though at a low frequency of 5.1%. This variant was not detected by q-PCR sequencing of KRAS, Table 2.
KRAS: The important therapeutic KRAS mutation in colon cancer was detected in ID.3 and ID.6 (exon 2 aa12), two EGFR wildtype samples. Furthermore, the detection of KRAS mutations by NGS was concordant with the gold standard methods, Table 2. Within exon 2 of the KRAS gene, the GS Junior AVA software (and CLC_AVA) but not IonTorrent, called other variants with frequencies ≤7%, including a SNV at exon 2 aa14 with a COSMIC ID (http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=bygene&ln=KRAS&start=4&end=20&coords=AA%3AAA) (Table 2). Interestingly, these multiple variants were only present in ID.8 and ID.9, both of which harbour several EGFR activating mutations (Table 1).
Colorectal carcinoma (ID.11-ID.13). All colorectal carcinoma samples had been reported as BRAF and KRAS wildtype by q-PCR sequencing. For KRAS analysis, this was concordant across all technologies, however the BRAF mutation at codon 600 was called in 1/3 colorectal carcinoma samples though at a low coverage equal to 4.38% and only by 1/4 analysis (IonTorrent,   Table 3). Even at the highest stringency threshold (QBVDiii = 500; Table 3), TP53 SNVs were called in 62.5% of samples. This is improbable, as when we compare this frequency with the COSMIC database for TP53 mutations in lung adenocarcinoma, the results are discordant, Table 3. Five genes were flagged as mutant in all 8 lung adenocarcinoma samples namely RET, APC, FGFR3, NPM1 and PDGFRA when the least stringent QBVD threshold was applied (QBVDi = 71). By applying the highest threshold (QBVDiii = 500), still 100% of samples had mutations in PDGFRA and APC, while 37.5% and 75% of patient samples had FGFR3 and RET mutations, respectively. For each of the five genes, the findings are markedly discordant with COSMIC frequencies (Table 3) and were regarded as false-positives. Additionally, SNVs in NPM1 were disregarded as detection was within a homopolymer region, a documented caveat of the IonTorrent variant calling software [13]. Here, the NGS methodology needs further software and chemistry improvements. The importance of applying thresholds was addressed in relation to other genes on the panel. A low stringency level (QBVDi = 71) allowed the detection of STK11, DAPK2, CDKN2A, CDKN2B, HIP1 and CSF1R SNVs in one or two (CDKN2B only) of the patient samples. Again, these variants have not been considered in the final gene profile for lung adenocarcinomas as they are discordant with that reported by the COSMIC database ( Figure 1, Table 3). 
Other SNVs that were detected, even by a highly stringent threshold approach, but excluded when referenced against the COSMIC database included PIK3CA (62.5% of samples vs 2.2% COSMIC frequency), KIT (37.5% vs 0.3%), KDR (37.5% vs 4%), ABL1 (37.5% vs 0.8%), NOTCH (25% vs 1.5%), FGFR2 (62.5% vs 1.1%) and ATM (50% vs 5%). Mutations that could be genuine but would require further investigation in a larger patient cohort are AKT1, FGFR1 and ERBB2, each occurring in 1/8 of the clinical samples and are reported as low frequency occurring mutations in lung adenocarcinoma by the COSMIC database ( Figure 1, Table 3).
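One way to make the comparison against COSMIC more explicit is a binomial tail probability: if a gene were truly mutated at its COSMIC frequency, how likely would it be to see it mutated in at least the observed number of the 8 lung adenocarcinoma samples? The sketch below is illustrative only and was not part of the study's analysis. It assumes independent samples and that the COSMIC frequency applies to this cohort; the function name `binomial_tail` is hypothetical, and the counts are derived from the percentages quoted above (for example, 62.5% of 8 samples = 5).

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_SAMPLES = 8  # lung adenocarcinoma samples ID.3-ID.10

# gene: (samples with a called SNV, COSMIC frequency as a fraction),
# taken from the percentages quoted in the text.
observed = {
    "PIK3CA": (5, 0.022),
    "KIT": (3, 0.003),
    "KDR": (3, 0.04),
    "ABL1": (3, 0.008),
    "NOTCH": (2, 0.015),
    "FGFR2": (5, 0.011),
    "ATM": (4, 0.05),
}

tail_probabilities = {
    gene: binomial_tail(count, N_SAMPLES, cosmic_freq)
    for gene, (count, cosmic_freq) in observed.items()
}

for gene, probability in sorted(tail_probabilities.items()):
    print(f"{gene}: P(>= observed | COSMIC frequency) = {probability:.2e}")
```

Under these assumptions every listed gene has a tail probability below 1%, which is consistent with the decision to treat these calls as probable over-calling rather than genuine mutations.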
Colorectal carcinoma (ID.11-ID.13). Analysis of ID.12 was carried out using IonTorrent V2.01 and V2.2 as this sample was sequenced prior to the software upgrade, hence included in the heat map generated in Figure 1. ID.11 and ID.13 have been analysed by V2.2 only, Table 4. Due to the limited numbers available, we adjusted our inclusion criteria for the additional genes interrogated by the Ion AmpliSeq panel. In each of the colorectal carcinoma samples, SNVs were considered if a) detected in 2/3 samples tested and b) called by both IonTorrent analysis and CLC re-analysis. With this approach, FGFR3, PDGFRA, APC, RET, ATM and TP53 were flagged; however, experience in the larger lung adenocarcinoma cohort (Table 3) may call into question the reliability of the former 4 genes, Table 4. The results of ATM and TP53 do not allow analytical comment within this small sample number.
Malignant melanoma (ID.1 and ID.2). As above, we adjusted the inclusion criteria. SNVs were considered if detected in both malignant melanoma (BRAF mutant) samples. Genes included FGFR3, PDGFRA, APC, RET, NPM1 and PIK3CA. As before, the reliability of the former 4 genes is questionable. NPM1 was disregarded as the mutation was flagged in a homopolymer region. PIK3CA is a likely true mutation identified by the Ion AmpliSeq panel (Figure 1) and requires future validation.

Threshold filtering
The gene information obtained from interrogation of the Ion AmpliSeq panel was represented as a box plot (Figure S1) demonstrating the importance of threshold application in SNV detection in NGS. The lines represent each of the QBVD thresholds (i = 71, ii = 259, iii = 500) and the proportion of gene SNVs that are filtered according to the stringency level applied. Figure 2 demonstrates the relevance of 'filtering by threshold application' of COSMIC SNVs in some of those patients with standard-of-care mutations in EGFR (ID.9), KRAS (ID.3 and ID.6) and BRAF (ID.1 and ID.2). For example, in ID.9 the lowest threshold level of detection for large deletions calls variants in 151 gene regions in the Ion AmpliSeq cancer panel, 14 of which have been referenced in the COSMIC database. By applying an internal SNV-only detection capability in CLC, the number of SNVs called in the full panel was reduced to 109 (14 COSMIC references still remained). As expected, application of the QBVD thresholds (i = 71, ii = 259 and iii = 500) resulted in a decrease in the number of SNVs detected from 26 to 14 to 9, and those that were COSMIC tracked reduced from 6 to 4 to 1, respectively. An interesting observation of this threshold approach is that clinically important mutations in KRAS (ID.3 and ID.6) and BRAF (ID.1 and ID.2) were still detected when QBVDiii (500) was applied; however, the clinically relevant mutation in EGFR would not have been reported by applying this threshold (Figure 2).
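The threshold filtering described above can be sketched as a simple filter on coverage and variant frequency. This is a minimal illustration, not the CLC Genomics Workbench implementation: the `Variant` record, the function names and all example records are hypothetical, and only the three coverage thresholds (71, 259, 500) and the 5% frequency cut-off are taken from the text.

```python
from dataclasses import dataclass

# Coverage thresholds and frequency cut-off as stated in the text.
QBVD_THRESHOLDS = {"i": 71, "ii": 259, "iii": 500}
MIN_FREQUENCY_PERCENT = 5.0


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one called SNV."""
    gene: str
    coverage: int             # reads covering the position
    frequency_percent: float  # variant allele frequency
    in_cosmic: bool           # whether the SNV is tracked in COSMIC


def filter_variants(variants, min_coverage, min_frequency=MIN_FREQUENCY_PERCENT):
    """Keep variants at or above both the coverage and frequency cut-offs."""
    return [
        v for v in variants
        if v.coverage >= min_coverage and v.frequency_percent >= min_frequency
    ]


def summarise(variants):
    """Per threshold: (SNVs retained, of which COSMIC-tracked)."""
    summary = {}
    for label, min_coverage in QBVD_THRESHOLDS.items():
        kept = filter_variants(variants, min_coverage)
        summary[label] = (len(kept), sum(v.in_cosmic for v in kept))
    return summary


# Invented example records, not study data.
example = [
    Variant("GENE_A", coverage=89, frequency_percent=6.0, in_cosmic=True),
    Variant("GENE_B", coverage=259, frequency_percent=20.0, in_cosmic=True),
    Variant("GENE_C", coverage=800, frequency_percent=40.0, in_cosmic=False),
    Variant("GENE_D", coverage=600, frequency_percent=3.0, in_cosmic=False),
]

example_summary = summarise(example)
```

In this toy data set, raising the threshold from (i) to (iii) removes the two COSMIC-tracked variants with modest coverage, mirroring the trade-off seen for the EGFR mutation in ID.9: a stricter threshold suppresses probable false positives but can also discard a genuine, clinically relevant call.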

Discussion
Generation of numerous DNA reads from significant portions of the genome in little time will transform the way we interrogate DNA in cancer diagnostics. The sooner NGS is fully fit for this purpose, the easier it will be to interrogate numerous possible drug targets per patient in a time-sensitive manner, and thus, design broader short-term and long-term therapeutic strategies.
In our opinion, the current study has 4 main points of interest. Firstly, NGS is reliable in detecting known standard-of-care mutations with good sensitivity and specificity within our small sample panel. For example, deletions in EGFR exon 19 and SNVs in EGFR exon 18 aa719, exon 20 aa768 and exon 21 aa858 in lung adenocarcinoma [8]; KRAS SNVs in exon 2 aa12 in 50% of wildtype EGFR lung adenocarcinoma [14] and BRAF mutations at exon 15 aa600 in DNA from malignant melanoma [15] were all accurately identified by NGS, concordant with conventional mutation detection methods.
Secondly, NGS called other mutations in EGFR, KRAS and BRAF that represent standard-of-care but were undetected by Sanger/q-PCR methods. This may be due to a) an increased sensitivity of NGS or b) a lack of specificity of NGS. For example, our preliminary dilution sensitivity tests for NGS, prior to the validation of the technology, allowed us to indicate that the standard-of-care mutation in EGFR exon 21 aa858 was detected at 1% in a mix of wildtype/mutant DNA from a cell line (data not shown); however other mutations were also detectable at this level, suggesting that the sensitivity assays are unlikely to reflect DNA extracted from FFPE, thus making the direct correlation of NGS sensitivity with that calculated for Sanger and q-PCR approaches, 10% and 5% respectively, questionable. In any case, it is likely that many of these new mutations are not genuine and thus further refinement of the technology is necessary. Thirdly, the need for a better NGS technology is also a consequence of the results obtained with the other 43 genes. Again, it was out of the scope of this work to Sanger sequence every mutation identified in the NGS analysis, and this is indeed one of the limitations of our study. However, the approximation to COSMIC tells us that for many of them, the current technology may be over-calling mutations. This, which may be acceptable for discovery studies where significant downstream validations need to take place, is not appropriate in the context of routine cancer diagnostics.
Fourthly, our study is a clear example of how the application of new technologies to patient care will be dictated by bioinformatics approaches as much as wet-bench related work. The importance of the bioinformatics threshold approach in identifying credible results is a clear illustration of this and calls for the presence of molecular diagnostic bioinformaticians embedded in future reference molecular diagnostic operations.
No doubt as NGS technologies (and bioinformatics tools) evolve, accuracy will be enhanced, thereby meeting our two provisos: a) NGS technologies are as efficient as the current detection methods in the diagnosis of those single genes that, for a given cancer type, represent standard-of-care; and b) the extra information that is generated in the process is of sufficient quality to consider alternative therapies or be accepted for future research endeavours. In future validations of NGS technology, one must weigh the added benefit of discovering new mutations against the potential false positives that can result from altering the threshold. The importance of applying thresholds has been investigated here. In the situation where we observe lower frequency (than the QBVDiii = 500), it is likely that the mutation occurs in a small population of tumour cells or that the actual sample contained many stromal cells, for example, thereby diluting the mutation frequency. A benefit of NGS is that the technology is sensitive enough to detect mutations at low frequency and in mixed tumour DNA samples; in such cases the threshold must be lowered to detect them. We believe that the sequencing of tumour samples for diagnostics must be carried out in parallel with the sequencing of an adjacent histologically normal sample, the latter acting as a baseline reference that should eliminate false positives, reveal germline mutations in both samples and finally reveal the true mutational profile of that tumour sample. Investment in sequencing precision, accuracy, reliability and bioinformatics will accelerate NGS integration into clinical cancer diagnostics either as a parallel tool with conventional sequencing methods or, in time, as a stand-alone approach to mutation detection.

Figure S1 The boxplot represents the distribution of mutation variants, by coverage, obtained using the Ion AmpliSeq Cancer Panel and analyzed by IonTorrent V2.2 and CLC_V2.2. The lines represent QBVD thresholds (i, ii, iii) demonstrating the number of variants filtered depending on the level of detection applied. (TIFF)