Development and validation of a multigene variant profiling assay to guide targeted and immunotherapy selection in solid tumors

We present data on the analytical validation of a multigene variant profiling assay (CellDx) that provides actionable indications for selection of targeted and immune checkpoint inhibitor (ICI) therapy in solid tumors. CellDx includes Next Generation Sequencing (NGS) profiling of gene variants in a targeted 452-gene panel as well as determination of Tumor Mutation Burden (TMB), Microsatellite Instability (MSI), Mismatch Repair (MMR) and Programmed Cell Death Ligand 1 (PD-L1) status. Validation parameters included accuracy, sensitivity, specificity and reproducibility for detection of Single Nucleotide Alterations (SNAs), Copy Number Alterations (CNAs), Insertions and Deletions (Indels), gene fusions, MSI and PD-L1. Cumulative analytical sensitivity and specificity of the assay were 99.03% (95% CI: 96.54%–99.88%) and 99.23% (95% CI: 98.54%–99.65%) respectively, with 99.20% overall accuracy (95% CI: 98.57%–99.60%) and 99.7% precision, based on evaluation of 116 reference samples. The clinical performance of CellDx was evaluated in a subsequent analysis of 299 clinical samples, in which 861 unique mutations were detected, of which 791 were oncogenic and 47 were actionable. Indications in MMR, MSI and TMB for selection of ICI therapies were also detected in the clinical samples. The high specificity, sensitivity, accuracy and reproducibility of the CellDx assay make it suitable for clinical application in guiding selection of targeted and immunotherapy agents in patients with solid organ tumors.


Introduction
The process of carcinogenesis traces back to the progressive accumulation of genomic alterations which lead to abnormalities in the genetic landscape, such as chromosomal and gene rearrangements, gene amplifications, deletions, aneuploidy, as well as loss-of-function or gain-of-function mutations [1]. Evaluation of these molecular landmarks of the malignancy in tumor tissue can provide crucial therapeutic guidance for selection of cancer-specific as well as  [2], selection of the pan-cancer drug Larotrectinib in malignancies which harbor NTRK fusions [3] or selection of the immune checkpoint inhibitor (ICI) Pembrolizumab in multiple cancers based on PD-L1 expression [4], microsatellite instability (MSI) or deficiency in Mismatch Repair (dMMR) genes [5] as well as tumor mutation burden (TMB) [6]. Sensitive and accurate detection of these molecular features is crucial for therapy selection in order to avoid risks of treatment failure owing to inaccurate assays. It is therefore equally imperative that molecular investigations undergo stringent validation before clinical adoption. Next generation sequencing (NGS) is a significant technological advancement which provides sensitive, accurate, high-throughput evaluation of variations in multiple genes and continues to evolve as a platform of choice for cancer diagnostic applications. The key advantages of NGS-based gene profiling are the ability to simultaneously evaluate multiple (hundreds of) genes in the same run, in a short interval of time and with a low DNA input requirement. Several NGS-based diagnostic assays find potential and actual applications in the clinical setting [7]. The United States Food and Drug Administration (US-FDA) has approved several NGS-based companion diagnostic assays which identify gene alterations to guide selection of targeted and ICI therapies [8][9][10].
In the present study, we report the validation of the CellDx tumor profiling assay, which includes NGS profiling of gene alterations (SNAs, CNAs, Indels, Gene Fusion and TMB), Capillary electrophoresis (CE) for Microsatellite instability (BAT-25, BAT-26, NR-21, NR-24, and MONO-27), IHC for detection of PD-L1 (28-8 and 22C3) expression and IHC for MMR status (MLH1, MSH2, MSH6, PMS2). Validation of the CellDx assay established the analytical and clinical sensitivity, specificity, reproducibility and limit of detection based on 122 samples and the real-world utility by a subsequent analysis of 299 samples from cancer patients.

Samples and standards
A total of 421 samples were used for analytical validation and evaluation of real-world clinical performance of the CellDx assay. Analytical validation was performed on 122 reference samples including Formalin Fixed Paraffin Embedded (FFPE) tumor tissue, tumor DNA or tumor RNA (S1 Table), which were obtained from various sources such as the College of American Pathologists (CAP), the European Molecular Genetics Quality Network (EMQN), the Coriell Institute for Medical Research (CIMR) as well as various commercial providers. For all reference and commercial samples, the manufacturer's Certificate of Analysis was used as confirmation of sample characteristics. For Proficiency Testing (PT) samples obtained from CAP or EMQN, the sample specification documents were used as confirmation. For samples with insufficient information, such as clinical samples with previously detected variants, appropriate orthogonal testing was performed to ascertain the variant. The samples obtained were determined to be appropriate for each assay type, i.e., NGS, MMR, MSI and PD-L1. In addition to the reference samples, 299 clinical specimens (S2 and S3 Tables) were obtained to assess the clinical performance of the assay.

Ethics statement
All patients provided signed informed consent for the publication of deidentified data and results. The process of obtaining patient samples was in accordance with all regulatory and ethical guidelines including ICH-GCP and the Declaration of Helsinki. The use of patient  in the form of salaries for authors  [DA1, DP, NS, RP, VD, SA, NY, RD1, SK, NJ, PM, DA2, SP, HB, SS, AN, DS, PD, AS], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the 'author contributions' section.

Assay designing and content
The workflow and overall components of the CellDx assay are illustrated in Fig 1.

Library preparation and sequencing
The library for each sample was prepared using the Ion AmpliSeq 452-gene panel. The assay required 10-100 ng of RNA/DNA from FFPE tissue samples. RNA samples were first converted to cDNA using the SuperScript™ VILO™ cDNA Synthesis Kit (Thermo Fisher, USA), followed by library preparation using the Ion AmpliSeq Library Kit Plus (Thermo Fisher, USA) with the 452-gene panel. The targets were amplified by thermal cycling as per the manufacturer's recommendations on a GeneAmp® PCR System 9700 (Applied Biosystems, USA). After target amplification, the amplification reactions were combined and treated with the FuPa reagent, and the amplicons were then ligated to multiplexing Ion Code barcodes (Thermo Fisher, USA). Libraries were then purified with JetSeq™ Clean magnetic beads (Bioline, USA) and amplified using amplification master mix (Thermo Fisher, USA). Prepared libraries underwent quality control (QC) on an E-Gel™ 2% agarose gel (Thermo Fisher, USA) and were analyzed on an Agilent 2100 Bioanalyzer (Agilent Technologies, USA). Purified libraries were quantified on a QuantStudio 12K Flex Real-Time PCR system (Thermo Fisher, USA) using the Ion Library TaqMan Quantitation Kit (Thermo Fisher, USA). Each prepared library was diluted to 100 pmol/L so that equimolar amounts of each library were added to the emulsion PCR, for a final total molarity ranging from 8 to 10 pM. The emulsion PCRs (Ion OneTouch™ 2) were carried out using the Ion PI™ Hi-Q™ OT2 200 Kit (Thermo Fisher, USA). After emulsion PCR, template-positive Ion Sphere Particles (ISPs) were enriched using streptavidin magnetic beads. After ISP enrichment, each library was sequenced using the Ion PI™ Hi-Q™ Sequencing 200 Kit (Thermo Fisher, USA). The enriched ISPs were loaded onto an Ion PI v3 Chip (Thermo Fisher, USA) and sequenced on an Ion Proton semiconductor sequencer, which acquired the sequencing data and generated BAM and FASTQ files.

Sequencing data analysis
Raw data analysis was performed using Torrent Suite Software version 5.10 (Thermo Fisher, USA) along with Ion Reporter version 5.10 (Thermo Fisher, USA) with default analysis parameters. For DNA sequencing, raw reads were aligned to the human reference genome hg19 using the Torrent Mapping Alignment Program (TMAP) plug-in with default parameters, and variant calling was performed using the Torrent Variant Caller (TVC) plug-in version 5.10 (Thermo Fisher, USA). A variant was called only at a minimum sequencing depth of 500X, an allelic frequency of at least 2.5% and at least 20 variant reads. All fusions with read counts ≥120 were considered positive. Ingenuity Variant Analysis software version 5.6 (Qiagen, Germany) was used for annotation, and Ion Reporter version 5.10 (Thermo Fisher, USA) was used for copy number and gene fusion detection in their respective workflows with default parameters. For clinical samples, all detected variants were mapped to the CIViC [13] (https://civicdb.org/) and OncoKB [14] (https://oncokb.org/) clinical annotation databases.
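The calling cutoffs described above can be expressed as a small filter. The following is only an illustrative sketch: the record type and function names are hypothetical and are not part of Torrent Suite or Ion Reporter, and the fusion cutoff is assumed to be inclusive (read count of 120 or more).

```python
from dataclasses import dataclass

# Cutoffs as stated in the text above.
MIN_DEPTH = 500          # minimum sequencing depth (X)
MIN_VAF = 0.025          # minimum allelic frequency (2.5%)
MIN_VARIANT_READS = 20   # minimum number of variant-supporting reads
MIN_FUSION_READS = 120   # minimum read count for a positive fusion (assumed inclusive)


@dataclass
class SmallVariantCall:
    """Hypothetical record for one SNA/indel call (illustration only)."""
    name: str
    depth: int          # total reads covering the position
    variant_reads: int  # reads supporting the variant allele

    @property
    def vaf(self) -> float:
        return self.variant_reads / self.depth if self.depth else 0.0


def passes_small_variant_cutoffs(call: SmallVariantCall) -> bool:
    """True only if depth, allelic frequency and read support all meet the cutoffs."""
    return (call.depth >= MIN_DEPTH
            and call.vaf >= MIN_VAF
            and call.variant_reads >= MIN_VARIANT_READS)


def fusion_is_positive(read_count: int) -> bool:
    """True if the fusion read count reaches the positivity cutoff."""
    return read_count >= MIN_FUSION_READS


# Made-up example calls, not data from the study.
print(passes_small_variant_cutoffs(SmallVariantCall("variant_A", 800, 40)))   # 5% VAF
print(passes_small_variant_cutoffs(SmallVariantCall("variant_B", 1000, 20)))  # 2% VAF
```

Note that the three cutoffs interact: at exactly 500X depth, the 20-read requirement corresponds to a 4% allelic frequency, so the 2.5% frequency cutoff only becomes the limiting criterion at depths of 800X and above.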

Mismatch repair (MMR) by Immunohistochemistry (IHC)
A panel of ready-to-use antibodies (DAKO EnVision FLEX primary mAb: anti-MLH1 clone ES05, anti-MSH2 clone FE11, anti-MSH6 clone EP49 and anti-PMS2 clone EP51) was used to determine MMR protein expression in 10 samples. FFPE tissue blocks were used to prepare 3-4 μm tissue sections on poly-L-lysine coated slides (Leica, Germany), which were placed on a hot plate at 60 °C for 1 hour. Deparaffinization, rehydration, antigen retrieval, antigen blocking and staining were performed in an automated slide staining system (Leica, Germany) as per the manufacturer's instructions. Each IHC run contained a positive control. Post-IHC, the slides were dehydrated and mounted using Distyrene, Plasticizer, Xylene (DPX) mountant. Results were interpreted by an experienced pathologist under light microscopy (Leica, Germany).

Assay sensitivity and specificity
The analytical sensitivity was defined as the ability of the CellDx assay to detect known variants in NGS as well as

Accuracy
Accuracy [23] was assessed from the sensitivity (true positives, TP, and false negatives, FN) and specificity (true negatives, TN, and false positives, FP) assessments on 110 samples. Accuracy was defined as the proportion (%) of TP and TN among the sum total of TP, TN, FP and FN. The 95% CI was estimated using the MedCalc diagnostic test evaluation calculator.
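The accuracy definition and its confidence interval can be sketched as follows. This is an assumption-laden illustration rather than the calculator's own code: it assumes the 95% CI is an exact (Clopper-Pearson) binomial interval, and the helper names are hypothetical. Only the Python standard library is used.

```python
from math import exp, lgamma, log


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Proportion of correct calls (TP + TN) among all calls."""
    return (tp + tn) / (tp + tn + fp + fn)


def _pmf(i: int, n: int, p: float) -> float:
    """Binomial probability P(X = i), computed in log space to avoid overflow."""
    log_prob = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return exp(log_prob)


def _solve(f, target: float, increasing: bool) -> float:
    """Bisection on p in (0, 1) for a monotone function with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson interval for k successes in n trials (assumed CI method)."""
    lower = 0.0 if k == 0 else _solve(
        lambda p: sum(_pmf(i, n, p) for i in range(k, n + 1)),  # P(X >= k)
        alpha / 2, increasing=True)
    upper = 1.0 if k == n else _solve(
        lambda p: sum(_pmf(i, n, p) for i in range(0, k + 1)),  # P(X <= k)
        alpha / 2, increasing=False)
    return lower, upper
```

For example, `exact_ci(tp + tn, tp + tn + fp + fn)` returns the lower and upper bounds for accuracy as proportions, which can be multiplied by 100 for comparison with the reported percentages.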

Assay precision
The repeatability and reproducibility of the CellDx assay were evaluated using 31 samples. Two operators processed the same samples independently and each operator performed each assay twice. Repeatability and reproducibility were assessed for the AmpliSeq 452-gene panel, MMR, MSI and PD-L1. Precision was assessed by calculating positive concordances between pairwise inter- and intra-user comparisons. Positive pairwise concordance was defined as the fraction of positive results in agreement among the total positive results between the replicates [23][24][25].
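One possible reading of the positive pairwise concordance defined above is sketched below. It assumes that "positive results in agreement" means calls shared by both replicates and that "total positive results" means calls seen in either replicate; averaging over all replicate pairs is also an assumption. The function names and the example calls are hypothetical.

```python
from itertools import combinations


def positive_pairwise_concordance(a: set, b: set) -> float:
    """Shared positive calls divided by all positive calls seen in either replicate."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_concordance(replicates: dict) -> float:
    """Average concordance over every pair of replicates (intra- and inter-operator)."""
    pairs = list(combinations(replicates.values(), 2))
    return sum(positive_pairwise_concordance(a, b) for a, b in pairs) / len(pairs)


# Made-up example: two operators, two runs each; one run misses a fusion call.
replicates = {
    "operator1_run1": {"v1", "v2", "v3", "fusion_x"},
    "operator1_run2": {"v1", "v2", "v3", "fusion_x"},
    "operator2_run1": {"v1", "v2", "v3", "fusion_x"},
    "operator2_run2": {"v1", "v2", "v3"},
}
print(mean_pairwise_concordance(replicates))
```

Restricting the pairs to runs from the same operator gives a repeatability estimate, while restricting them to runs from different operators gives a reproducibility estimate.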

Preliminary clinical feasibility study
Clinical feasibility of the CellDx assay was explored in a preliminary study based on evaluation of patient-derived tumor samples (fresh tissue or FFPE blocks) for detection of actionable features in DNA/RNA, as well as for determining the status of TMB, MMR/MSI and PD-L1. The study was intended to determine the prevalence of clinically significant and actionable variants in patient samples. This preliminary study was based on remnant archived patient samples with appropriate consent from patients to use deidentified samples for research, development and validation purposes as well as for publication of sample-derived data. No patient underwent any additional invasive procedure to obtain samples for this study. FFPE tissue samples, fresh tissue samples or 4-5 core needle biopsy specimens from 299 cancer patients were used and processed as per the protocol mentioned above.

NGS data analysis
Nucleic acids isolated from 20 tissue types, PT (CAP) and EMQN samples and commercial samples were used for library preparation. Library yields for all DNA and RNA samples exceeded the minimum requirement of 100 pmol/L irrespective of sample type, indicating that the method was compatible with all sample types. NGS data obtained for 83 samples from 11 sequencing runs of the 452-gene panel were used for analytical validation. The 452-gene panel produced a median of 8,808,657 reads per sample (range, 1.45 M to 30.33 M), a median read length of 111 bp (range, 90 to 125 bp), a median of 94.0% on-target reads and a median data uniformity of 96.0%. Uniformity was defined as the proportion (%) of target bases covered by at least 0.2x the mean base-read depth [26]. With a targeted mean depth of >500X across all samples in the 11 runs, 99.1% of amplicons in the 452-gene panel were covered at >100X (Fig 2). NGS run statistics are provided in S5 Table. Sample-wise variants are provided in S6 Table.
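The uniformity definition above translates directly into a short calculation. The sketch below is illustrative only; the function name and the example depths are hypothetical, and per-base depths are assumed to be available as a plain list.

```python
from typing import Sequence


def uniformity(base_depths: Sequence[int], fraction_of_mean: float = 0.2) -> float:
    """Percentage of target bases with depth >= fraction_of_mean x mean depth."""
    if not base_depths:
        raise ValueError("no target bases supplied")
    mean_depth = sum(base_depths) / len(base_depths)
    cutoff = fraction_of_mean * mean_depth
    covered = sum(1 for depth in base_depths if depth >= cutoff)
    return 100.0 * covered / len(base_depths)


# Made-up depths: mean is 800X, so the cutoff is 160X and one base falls below it.
print(uniformity([1000, 900, 1100, 50, 950]))  # 80.0
```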

Limit of detection for NGS
The LOD of the assay was based on two parameters: the minimum tumor content and the lowest detectable VAF in tumor DNA. Two FFPE tissue samples were used for evaluating the LOD for 4 types of variants, i.e., CNA, SNA, indel and fusion. The tumor content of the tissue samples was determined by a pathologist, and DNA and RNA were isolated. To evaluate the LOD of variant calling, DNA was serially (2-fold) diluted with wild type CEPH DNA to simulate 70%, 35%, 17.5% and 8% tumor content and to estimate the lowest VAF for 3 variant types, i.e., SNA, CNA and indel.
DNA obtained from tissue with 70% tumor content was known to harbor 37 copies of CCNE1 (CNA), TSC1 c.2065C>T at 47% VAF (SNA) and FANCI c.1641_1642delTA (small indel) at 3.7% VAF. The default calling parameters (2.5% VAF detection threshold and 20 variant-supporting reads) were not applied for the LOD assessments. Across the serially diluted samples, the VAF of the TSC1 c.2065C>T SNA ranged from 47.00% to 1.27%, and the CCNE1 copy number (first 3 dilutions) ranged from 37 to 4. Non-linearity in VAF was observed across the FANCI c.1641_1642delTA indel dilutions (3.7% to 0.9%) and is speculated to arise from the proportion of wild type DNA reads in this region.
Similarly, RNA obtained from tumor tissue with 60% tumor content was known to have 2028 fusion read counts of EML4-ALK. The RNA was serially (2-fold) diluted with RNA from a healthy sample to estimate the detection limit for gene fusions. Gene fusions were detected from 2028 down to 298 read counts. The LOD observations were consistent with the default detection limits of the data analysis pipeline. Based on these observations, the minimum tumor content (MTC) was defined as ≥8% for SNAs and indels, and ≥25% for CNAs and gene fusions, at VAF ≥4% (Table C in S7 Table). Based on the LOD evaluation, the threshold for calling SNAs and indels was set at 2.5% VAF for a sample with 8% tumor content, the threshold for calling CNAs was set at a gain of 3 copies or loss of 1 copy for a sample with 25% tumor content, and the minimum read count for calling fusions was set at >120 for a sample with 25% tumor content. Significant, i.e., therapeutically actionable, variants with VAF <2.5% are confirmed as positive or negative by ddPCR.
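The 2-fold dilution design described in this section implies a simple expectation for each dilution step, sketched below. This is an idealized illustration only: it assumes the diluent carries no copies of the variant and that each step exactly halves both tumor content and VAF, which real samples need not follow (as the indel non-linearity noted above shows). The function name is hypothetical.

```python
def dilution_series(start_tumor_pct: float, start_vaf_pct: float, steps: int) -> list:
    """Return (tumor content %, expected VAF %) for successive 2-fold dilutions."""
    series = []
    tumor, vaf = start_tumor_pct, start_vaf_pct
    for _ in range(steps):
        series.append((tumor, vaf))
        tumor /= 2  # each step mixes the sample 1:1 with wild type material
        vaf /= 2
    return series


# Starting values from the text: 70% tumor content, TSC1 c.2065C>T at 47% VAF.
for tumor, vaf in dilution_series(70.0, 47.0, 4):
    print(f"tumor content {tumor:.2f}%  expected VAF {vaf:.3f}%")
```

Under these assumptions the fourth level has a nominal tumor content of 8.75%, which corresponds to the level reported as 8% in the text.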

Analytical sensitivity and specificity
Analytical sensitivity and specificity of the CellDx assay were determined on 110 samples. Among the 56 samples which were analyzed by NGS, 1286 variants were detected in 289 genes, including 1211 SNAs, 34 indels (26 small and 8 large), 10 CNAs and 31 gene fusions. Review of BAM files using the Integrative Genomics Viewer [27] indicated that among these 1286 variants, 1284 variants were above the LOD threshold for the respective Variant Allele Frequencies (VAF) and 2 variants were below the LOD (EGFR p.L858R at 2% AF and MET at 2.8 copies).
NGS yielded 9 false positive (FP) calls and 1099 true negative (TN) calls, which indicated a specificity of 99.19%.

Accuracy
Accuracy of CellDx was determined from analysis of 110 samples which included 51 known positives for sensitivity and 59 known negatives for specificity. Overall accuracy of the CellDx assay was based on detection of all TP variants (N = 204), TN variants (N = 1161), FP variants (N = 9) and FN variants (N = 2). The overall accuracy of the CellDx assay was 99.20% (95% CI: 98.57%–99.60%) (Table 1; S7 Table). The accuracy of each investigation is also provided in S7 Table.
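As a cross-check, the overall point estimates can be recomputed from the four counts given above. The sketch uses the standard definitions of sensitivity, specificity and accuracy; the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard point estimates from a 2x2 table of calls."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Counts reported in the text: TP = 204, TN = 1161, FP = 9, FN = 2.
metrics = diagnostic_metrics(tp=204, tn=1161, fp=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
# sensitivity: 99.03%
# specificity: 99.23%
# accuracy: 99.20%
```

These values agree with the cumulative sensitivity, specificity and accuracy reported for the assay.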

Precision (repeatability and reproducibility)
Precision, including intra-operator concordance (repeatability) and inter-operator concordance (reproducibility), was determined for NGS, PD-L1, MMR and MSI. Precision was determined from 31 known positive samples, of which 9 were evaluated for NGS, 6 for MSI, 4 for MMR and 12 for PD-L1 (IHC). For NGS, replicates included library preparation, sequencing and data analysis. For PD-L1 and MMR, replicates encompassed sample preparation through staining and interpretation of results. For MSI, replicates encompassed DNA isolation through interpretation of results. For NGS, 276 variants were evaluated.
MSI STR markers were successfully detected in both replicates, and the MMR and PD-L1 analyses were reproducible. Inability to detect a fusion (SLC34A2-ROS1) in one replicate translated into 99.7% overall concordance (Table 1; S7 Table).

Preliminary clinical feasibility
The CellDx assay was used to evaluate 299 tissue samples from patients with known cancer to determine molecular variants by NGS as well as TMB, MSI/MMR and PD-L1 status. Among the 133 samples which were profiled by NGS, significant somatic variants (pathogenic, likely pathogenic, VUS) were detected in 96.99% (129/133) of samples. A total of 4,666 reportable variants were detected, including 3,355 CNAs (37% loss; 63% gain), 1,161 SNAs, 129 indels and 21 gene fusions. In all, 852 unique mutations were detected, of which 784 were oncogenic and 47 were actionable, i.e., an indication for selection of a targeted anticancer drug (S8 Table). The reported variants had a median VAF of 20.15%, and 99.98% of variants were detected at ≥4.0% AF. The most frequently detected alterations were in TP53 (25%) and PIK3CA (7%). Gene fusions were detected in 7% of patients (21/299) and mostly involved EIF3E, NCOA4, PTPRK, ESR1, FGFR3 and MYB.

Discussion
Precision oncology aims to provide personalized treatment options based on patient-derived de novo evidence which can be obtained from multi-analyte, multi-variant evaluation of the tumor. A prior single-institution retrospective study has shown that tumors in the majority of patients have biologically actionable alterations [28], of which several can be therapeutically targeted with approved agents, while others are in various phases of clinical trials. It is well accepted that multigene variant profiling of tumor tissue samples can guide selection of targeted and ICI therapies [29][30][31][32][33][34] as well as predict individuals who are more likely to respond to (or not respond to) systemic anticancer therapies. While several companion and complementary diagnostic assays which guide selection of anticancer agents or predict likely responders have been approved by the US FDA [10], these are based on univariate analysis and apply to a single drug. The CellDx assay, on the other hand, is a multigene variant profiling assay that provides comprehensive profiling of actionable and therapeutically relevant tumor vulnerabilities.
The CellDx assay was developed and has been validated for detection of SNAs, CNAs, indels and gene fusions by NGS, status of MSI by CE, and status of PD-L1 as well as MMR by IHC. The high overall analytical sensitivity (99.03%) of the CellDx assay resulted from high individual sensitivities of NGS, IHC and CE for all variants tested. The high sensitivity implies little or no risk of missing actionable variants. The CellDx assay demonstrated an overall specificity of 99.23%, as well as an accuracy of 99.20% and a precision of 99.7%, indicating suitability for clinical use.
The clinical feasibility of CellDx was explored using 299 clinical samples to identify molecular features on NGS as well as MSI, PD-L1 and MMR status. Actionable findings in the 299 clinical samples were conveyed to the respective clinicians, but the patients were not followed up to determine whether the findings were used to guide further therapeutic directions or to evaluate treatment outcomes; these aspects were beyond the scope of the present manuscript.
Accurate detection of prognostic and predictive biomarkers can identify patients more likely to benefit from targeted and ICI therapies. The use of gene profiling for selection of targeted anticancer therapies based on gene variants is already well accepted. In addition, PD-L1, MMR and TMB are more recently developed biomarkers which guide ICI therapy selection. Targeting the immune checkpoint proteins (PD-L1 or PD-1) with inhibitory mAbs is a treatment strategy in multiple cancers [32,33]. The expression of PD-1 and PD-L1 proteins was considered to be associated with response rate to ICI [29,32,33], and immunohistochemistry (IHC) profiling of PD-L1 status is routinely used to identify patients likely to benefit from ICI therapies [34]. More recent studies appear to indicate that elevated TMB, rather than PD-L1 expression, is a more accurate predictor of treatment response [32][33][34][35]. Prior studies have also shown the association of tumors deficient for mismatch repair (dMMR) with higher response to ICI therapies [36]. The status of 4 MMR proteins, MLH1, MSH2, MSH6 and PMS2, as well as the loss of nuclear expression (LNE) leading to dMMR status [30,31] due to either germline or somatic mutation or inactivation by hypermethylation, is determined by IHC [32]. dMMR status is associated with the accumulation of mutations in microsatellite regions, leading to microsatellite instability (MSI). Somatic mutations leading to dMMR have been shown to be associated with increased TMB and MSI [33][34][35][36][37], the latter being another predictive biomarker for response to ICI therapies [35].
In a routine setting, patients usually undergo evaluation of single variants at each instance for therapy selection; e.g., in NSCLC, evaluation of EGFR mutations, ALK fusions, PD-L1 and MMR status is usually not performed simultaneously, which extends the time to treatment. Similarly, in CRC, evaluation of RAS mutations, PD-L1, MMR and MSI is not performed simultaneously in the routine clinical setting. CellDx is advantageous in evaluating all variants and indicated therapy options at the outset, which can enable appropriate therapy selection. CellDx may be perceived as a comprehensive companion and complementary diagnostics solution for selection of anticancer agents in labeled as well as label-agnostic settings; the latter is especially helpful for clinicians who are considering clinician's choice of treatments in refractory cancers where Standard of Care (SoC) treatment options are exhausted or unviable. Findings of CellDx not only directly identify treatment options, but also help to stratify patients as likely to respond or less likely to respond; the former is significant in guiding treatment selection in labeled as well as label-agnostic settings, while the latter can help avoid selection of futile treatments which yield no benefit to the patient but add to the cumulative toxicity.
In summary, the ability of the CellDx assay to detect and report actionable variants via NGS (for SNAs, CNAs, small indels, large indels and gene fusions) as well as to determine the status of TMB, MSI, MMR and PD-L1 has direct and significant clinical utility in cancer management. The study findings sufficiently establish the high sensitivity, specificity, accuracy and reproducibility expected for clinical adoption of this assay.