Towards decision-making using individualized risk estimates for personalized medicine: A systematic review of genomic classifiers of solid tumors

Recent advances in the understanding of the genetic underpinnings of cancer offer the promise to customize cancer treatments to the individual through the use of genomic classifiers (GCs). At present, routine clinical utilization of GCs is uncommon and their current scope and status, in a broad sense, are unknown. As part of a registered review (PROSPERO 2014:CRD42014013371), we systematically reviewed the literature evaluating the utility of commercially available GCs by searching Ovid Medline (PubMed), EMBASE, the Cochrane Database of Systematic Reviews, and CINAHL on September 2, 2014. We excluded articles involving pediatric malignancies, non-solid or non-invasive cancers, hereditary risk of cancer, non-validated GCs, and GCs involving fewer than 3 biomarkers. A total of 3,625 studies were screened, but only 37 met the pre-specified inclusion criteria. Of these, 15 studies evaluated outcomes and clinical utility of GCs through clinical trials, and the remainder through the use of mathematical models. Most studies (29 of 37) were specific to hormone-receptor positive breast cancer, whereas only 4 studies evaluated GCs in non-breast cancer (prostate, colon, and lung cancers). GCs have spurred excitement across disciplines in recent decades. While several GCs have been validated, the general quality of the data is weak. Further research, including prospective validation, is needed, particularly for the non-breast cancer GCs.


Introduction
Over the past 30 years, there have been substantial advances in our knowledge of the genetic underpinnings of cancer. The increase in this knowledge, and in the technology to evaluate it, has generated tremendous excitement because of its potential to customize therapies at the patient-specific level and deliver on the promise of personalized medicine. There is an increasing emphasis on "precision oncology" or "genomics-driven oncology" [1,2], with individualized therapy strategies driven by molecular "-omics" information. A genomic classifier (GC) offers the opportunity to select patients most likely to respond to therapy, based on stratification of probability of a clinical outcome according to a DNA or RNA expression signature [3,4]. This provides the potential to intensify therapy in patients with high-risk disease, improving cure rates, and avoid the 'overtreatment' of patients with biologically low-risk disease that historical, clinical, or histopathologic criteria cannot otherwise distinguish. Since the mid-2000s, several commercially available breast cancer GCs have been approved for coverage by Medicare & Medicaid [5]. Population-based research has identified increasing utilization rates of GCs among breast cancer patients, with concordant reduction in the proportion of women with hormone receptor positive cancer receiving chemotherapy [6]. Recent series estimate that 18% of women with breast cancer in the U.S. undergo the 21-gene recurrence score assay, which is only one of many [7]. Comparatively, there has been surprisingly little clinical implementation of GCs for other solid tumors.
Additional research is needed to deliver on the promise of GCs for solid tumors [2,8]. Despite the promise of genomics-driven cancer medicine, its clinical implementation is limited by a relative lack of prospective evidence regarding genomic assay validation and clinical performance [9]. The availability of strong evidence from well-designed, prospective trials is a significant challenge and rate-limiting step in the development of GCs [3].
Our purpose was to describe the current state of GCs and delineate areas of research that could validate their routine use in the clinic. We systematically review and report the current evidence evaluating the utility of commercially available GCs for solid tumors of adults. Our study describes the outcomes and clinical utility measures of GCs as studied through clinical trials or the use of mathematical models.

Methods and materials
As part of a systematic review registered in the PROSPERO international prospective register of systematic reviews (PROSPERO 2014:CRD42014013371), we conducted literature database searches of Ovid Medline (PubMed), EMBASE, the Cochrane Database of Systematic Reviews, and CINAHL on September 2, 2014. The MeSH search criteria are provided in the supporting information (S1 File), but generally include terms associated with genomic and/or personalized cancer care. We restricted the search to articles reported in English. This resulted in 3,815 articles with 190 duplicates (3,625 unique articles, Fig 1). The PRISMA checklist is provided in the supporting information (S2 File).
Two investigators independently reviewed manuscript titles and abstracts to identify original data studies that involved the use of validated GCs to demonstrate clinical utility. Clinical utility is demonstrated when the test is shown to improve clinical outcomes and/or alter clinical decisions. Studies were required to involve solid tumors, adult patients (≥ 18 years old), and GCs with 3 or more biomarkers. Manuscripts involving pediatric malignancies, non-solid or non-invasive tumors (e.g., leukemia, ductal carcinoma in situ, etc.), hereditary risk of cancer, non-validated GCs, and GCs involving fewer than 3 biomarkers were excluded (Fig 1). In addition, manuscripts were reviewed independently by the two investigators for quality by applying the general principles of the Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK) checklist items [11]. A third investigator served to resolve all coding disagreements. Each included manuscript was assessed for clinical site, assay(s) used, number of patients or simulated patients, the specific clinical population the results apply to, the methodology, the main contribution of the study, and the country of origin. Data extraction was completed using a pre-defined spreadsheet; one investigator performed the data extraction while a second investigator reviewed the spreadsheet to confirm correct data extraction. We present a figure to show the timeline of when the studies were published, the number of patient specimens included in each clinical study (shown by dot size), and when the major GCs became commercially available (Fig 2).

Results
We identified 3,625 manuscripts for title review according to the above methods. As shown in Fig 1, this was reduced to 2,302 manuscripts for abstract review and 1,119 studies for full text review. After this final review, 37 manuscripts were included. A total of 273 abstracts and 55 manuscripts needed a third investigator to resolve coding disagreements. Tables 1 and 2 depict the key characteristics of each included study. Table 1 provides a summary of the breast cancer studies while Table 2 presents studies for all other types of cancer. Of the 37 studies, 15 studies evaluated outcomes and clinical utility of GCs through clinical trials, and the remainder through the use of mathematical models. Fig 2 depicts the timeline of the publication of the studies and the date relevant GCs became commercially available [12-17]. Each dot represents a study, with green dots for modeling studies and orange dots for clinical studies. The dot diameter for clinical studies corresponds to the number of patients in the study. In general, breast cancer GCs were developed and commercially available earlier than GCs for other cancers.
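The screening counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the review's methods: the per-stage exclusion counts are derived here by simple subtraction (Fig 1 may break them down differently), and the disagreement rates assume that the 273 abstracts and 55 manuscripts are fractions of the 2,302 abstracts and 1,119 full texts reviewed. The function names are illustrative only.

```python
# Counts taken from the Methods and Results text.
RETRIEVED = 3815
DUPLICATES = 190
STAGES = [
    ("title review", 3625),
    ("abstract review", 2302),
    ("full-text review", 1119),
    ("included", 37),
]


def exclusions(stages):
    """Return (stage name, number of records not advanced to the next stage)."""
    return [
        (name, count - next_count)
        for (name, count), (_, next_count) in zip(stages, stages[1:])
    ]


def disagreement_rate(disputed, reviewed):
    """Fraction of reviewed records that needed a third investigator.

    Assumption: the denominator is the number of records reviewed at that stage.
    """
    return disputed / reviewed


# Unique records after de-duplication should match the title-review count.
unique_records = RETRIEVED - DUPLICATES

for stage, dropped in exclusions(STAGES):
    print(f"{stage}: {dropped} records not advanced")
print(f"abstract disagreements: {disagreement_rate(273, 2302):.1%}")
print(f"full-text disagreements: {disagreement_rate(55, 1119):.1%}")
```

Under these assumptions, roughly one in eight abstracts and one in twenty full-text manuscripts required adjudication.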

Breast cancer
Thirty-three (89%) of the studies evaluated breast cancer, and of these, 29 (89%) were specific to hormone-receptor positive breast cancer, and 31 (94%) concerned the Oncotype DX® GC (Table 1). Among the trials concerning breast cancer GCs, 13 (39%) concern the clinical validation of GCs, mostly through the testing of prospectively collected tissue banks and evaluation of various clinical outcomes (overall survival, cancer recurrence, pathologic response to neoadjuvant therapy, etc.).
Two studies presented comparisons of multiple GCs. Iwamoto et al. compared six distinct assays for breast cancer (MammaPrint, Oncotype DX®, a 76-gene signature assay, mitotic kinase prognostic score, MKI67 mRNA expression, and molecular subtype). They demonstrated that the assays generally performed similarly in their abilities to predict 5-year overall survival, progression-free survival, and pathologic complete response [25]. Kelly et al. compared the Oncotype DX® GC to the PAM50 Breast Cancer Intrinsic Classifier™ and demonstrated general agreement between the two [28].
Of the breast cancer GC articles, 20 (61%) are based on mathematical models and generally concern cost-effectiveness. The main type of mathematical model used is a Markov model, a state-transition model used to simulate the health outcomes and costs for a cohort of patients. Every included article demonstrated that the use of GCs in breast cancer was cost-effective in a variety of reimbursement models (Table 1). In addition, 4 articles demonstrated that the use of GCs in breast cancer altered decisions regarding the recommendation for or against adjuvant therapy.   [54], other articles regarding this classifier did not meet our predefined inclusion criteria.
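To illustrate the kind of Markov (state-transition) cohort model that the cost-effectiveness studies in this section rely on, the following is a minimal sketch. It does not reproduce the model of any included study: the three health states, the transition probabilities, the per-cycle costs and utilities, and the 3% discount rate are all hypothetical values chosen for illustration, and the function names are our own.

```python
from typing import List, Tuple

# Hypothetical health states, in the order used by every list below.
STATES = ("disease_free", "recurrence", "dead")


def run_markov_cohort(
    transition: List[List[float]],
    cost_per_cycle: List[float],
    utility_per_cycle: List[float],
    start: List[float],
    cycles: int,
    discount_rate: float = 0.03,  # hypothetical annual discount rate
) -> Tuple[float, float]:
    """Return (discounted cost, discounted QALYs) per patient.

    transition[i][j] is the per-cycle probability of moving from state i
    to state j; each row must sum to 1.
    """
    n = len(start)
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each transition row must sum to 1")
    dist = list(start)  # fraction of the cohort in each state
    total_cost = 0.0
    total_qaly = 0.0
    for t in range(cycles):
        weight = 1.0 / (1.0 + discount_rate) ** t
        total_cost += weight * sum(d * c for d, c in zip(dist, cost_per_cycle))
        total_qaly += weight * sum(d * u for d, u in zip(dist, utility_per_cycle))
        # Advance the cohort one cycle through the transition matrix.
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return total_cost, total_qaly


def icer(cost_new: float, qaly_new: float, cost_old: float, qaly_old: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if qaly_new == qaly_old:
        raise ValueError("strategies have equal QALYs; ICER is undefined")
    return (cost_new - cost_old) / (qaly_new - qaly_old)


# Hypothetical inputs for a single strategy.
P = [
    [0.95, 0.04, 0.01],  # disease_free -> disease_free / recurrence / dead
    [0.00, 0.80, 0.20],  # recurrence   -> ...
    [0.00, 0.00, 1.00],  # dead is an absorbing state
]
COSTS = [500.0, 20000.0, 0.0]
UTILITIES = [0.9, 0.6, 0.0]
```

In a cost-effectiveness comparison, `run_markov_cohort` would be called once per strategy (for example, GC-guided versus non-guided adjuvant therapy, each with its own inputs), and `icer` would then be applied to the two results.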

Discussion
Our results provide a summative analysis of the current state of the clinical research supporting the validation of GCs in patients with some solid tumors. While there are several commercially available GCs, the bulk of the existing published data are evaluations of breast cancer GCs.
While breast cancer is a common malignancy that usually requires multimodality therapy, cure rates for most women with breast cancer are already high. Regardless, there is a subset of patients with breast cancer who go on to die from their disease, and GCs are poised to identify these patients and potentially cure them. The development of GCs for more commonly fatal diseases such as locally advanced lung cancer or glioblastoma multiforme may have limited clinical utility, since most patients with poor-prognosis cancers will receive the most intensively validated therapy and a decision aid may not be clinically relevant for personalized decisions.
Interestingly, all of the articles in this systematic review concern the decision for (or against) adjuvant chemotherapy following definitive surgical resection. For most cancers, however, there are multiple therapies that could be informed through GCs. In head and neck cancer, for example, many patients undergo definitive surgical resection and adjuvant therapy while others receive definitive chemoradiotherapy (without surgery), without clear existing evidence as to which (if either) improves outcomes for patients. As another example, it is unlikely that the superiority (or inferiority) of radical prostatectomy over radiotherapy will ever be established, but it is possible that GCs could serve to define a subset of patients that would be better served with either therapy.
GCs have positioned themselves in a gap in cancer care that has obsessed researchers for decades. On one side are diseases that have targetable, gene-specific mutations (e.g., ALK-rearranged non-small cell lung cancer) and on the other side are markedly heterogeneous diseases where only non-discriminatory therapies have effect. GCs have the ability to fill this gap by analyzing numerous genes and weighting them based on their ability to drive cancer recurrence and metastasis, alerting physicians when more intensive or alternative therapy is warranted.
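The idea of weighting numerous genes into a single risk estimate can be sketched as a linear score with cutoffs. This is an illustration only and not the algorithm of any commercial GC: the gene names, weights, and cutoffs below are hypothetical placeholders, and real assays define their own normalization and scoring rules.

```python
from typing import Dict


def weighted_risk_score(expression: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of weight * expression over every gene in the (hypothetical) signature."""
    missing = set(weights) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(weights[gene] * expression[gene] for gene in weights)


def risk_group(score: float, low_cutoff: float, high_cutoff: float) -> str:
    """Map a continuous score to a risk category using two cutoffs."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "intermediate"
    return "high"


# Hypothetical signature: positive weights raise the score, negative weights lower it.
WEIGHTS = {"GENE_A": 1.0, "GENE_B": -0.5, "GENE_C": 0.25}
sample = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 4.0}

score = weighted_risk_score(sample, WEIGHTS)
group = risk_group(score, low_cutoff=1.0, high_cutoff=3.0)
```

Comparing how two such classifiers categorize the same patients amounts to cross-tabulating their `risk_group` outputs.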
Several trends across GCs should be noted, including that the majority of patients included in these studies are Caucasian. Baseline genetic heterogeneity between racial groups could have an impact on the external validity of these tests, and further research in this area is needed before broad application of any genetic test is appropriate across a diverse population. Finally, relatively few of the studies included comparisons of multiple GCs; as noted by Hunter [55], more research is needed to compare how risk categorizations differ between GCs.

Conclusion
GCs promise an era of precise, personalized cancer care. While several GCs have been accepted for clinical use (particularly in breast cancer), our review demonstrates that there is a relatively limited number of studies available to provide supportive evidence of clinical utility. We await the prospective validation of several of the alternative GCs for other solid