Identification and Validation of a Multigene Predictor of Recurrence in Primary Laryngeal Cancer

Purpose Local recurrence is the major manifestation of treatment failure in patients with operable laryngeal carcinoma. Established clinicopathological factors cannot sufficiently predict patients that are likely to recur after treatment. Additional tools are therefore required to accurately identify patients at high risk for recurrence. This study attempts to identify and independently validate gene expression models, prognostic of disease-free survival (DFS) in operable laryngeal cancer. Materials and Methods Using Affymetrix U133A Genechips, we profiled fresh-frozen tumor tissues from 66 patients with laryngeal cancer treated locally with surgery. We applied Cox regression proportional hazards modeling to identify multigene predictors of recurrence. Gene models were then validated in two independent cohorts of 54 and 187 patients (fresh-frozen and formalin-fixed tissue validation sets, respectively). Results We focused on genes univariately associated with DFS (p<0.01) in the training set. Among several models comprising different numbers of genes, a 30-probe set model demonstrated optimal performance in both the training (log-rank, p<0.001) and 1st validation (p = 0.010) sets. Specifically, in the 1st validation set, median DFS as predicted by the 30-probe set model, was 34 and 80 months for high- and low-risk patients, respectively. Hazard ratio (HR) for recurrence in the high-risk group was 3.87 (95% CI 1.28–11.73, Wald's p = 0.017). Testing the expression of selected genes from the above model in the 2nd validation set, with qPCR, revealed significant associations of single markers, such as ACE2, FLOT1 and PRKD1, with patient DFS. High PRKD1 remained an unfavorable prognostic marker upon multivariate analysis (HR = 2.00, 95% CI 1.28–3.14, p = 0.002) along with positive nodal status. Conclusions We have established and validated gene models that can successfully stratify patients with laryngeal cancer, based on their risk for recurrence. It seems worthy to prospectively validate PRKD1 expression as a laryngeal cancer prognostic marker, for routine clinical applications.


Introduction
Laryngeal cancer is the eleventh most common type of cancer in men worldwide. Every year 52,000 cases are newly diagnosed in Europe and 10,000 in the United States [1,2]. Despite the latest advances in diagnostic and therapeutic techniques, the majority of patients still recur after treatment [3]. Established clinicopathological factors cannot sufficiently predict patients that will recur. Additional factors are therefore required to accurately identify patients with poor prognosis. Expression profiling has been successfully used in the stratification of cancer patients with unfavorable prognosis [4,5,6,7]. Previous studies in head and neck cancer patients have linked gene expression profiles to nodal status [8,9,10], distant metastases [11,12] and disease-free survival [13,14,15,16]. While these studies provided great insight into the molecular complexity of head and neck cancer they did not identify a robust gene profile. The clinical use of these models has been limited by the large number of genes, the small-sized datasets and the lack of reproducibility and independent validation. Moreover, none of these studies focused exclusively on laryngeal cancer.
In the present study, we sought to identify genes prognostic of recurrence in patients with primary laryngeal cancer. The endpoint of our analyses was disease-free survival (DFS). We profiled tumor samples from two separate cohorts of patients using global gene expression profiling. Using the first cohort as a training set, we identified several prognostic gene models, which were then validated in the second cohort of patients. In order to further validate our results, we profiled selected genes of our models for relative expression with quantitative real time-polymerase chain reaction (qRT-PCR) in an independent cohort of patients.

Study population
Our study comprised of 307 patients diagnosed with primary squamous cell laryngeal cancinoma (training set: 66, 1 st validation set: 54 and 2 nd validation set: 187 patients), with median follow-up of 52, 33 and 89 months, respectively. All patients underwent surgical removal of the tumor at the Otorhinolaryngology Department of the AHEPA Hospital in Thessaloniki, Greece, between 1985 and 2008. Postoperative administration of radiation therapy was decided by the treating physician and most often was given to patients with positive tumor margins, patients with T4 tumors or those with T1/T2 supraglottic tumors for whom a prophylactic elective lymphadenectomy was not done. None of the patients received systemic therapy as part of their initial treatment. This is the current standard of care in many European countries [17], however this may not be the case in other countries, like the US, where the main focus is the preservation of the larynx with concurrent chemoradiotherapy, preceded in select cases by induction chemotherapy [18]. Follow-up included physical examination, every three months for the first three years and every six months thereafter. Imaging examinations were performed, as indicated by symptoms and physical examination. Detailed demographics and clinical characteristics for the patients with valid gene data are listed in Table 1, while individual patient data are shown as in Table S1.

Tumor specimens
Fresh-frozen tumor tissue samples, from patients comprising the training and 1 st validation sets, were prospectively collected at the time of surgery, from 2004 to 2008, were immediately frozen in liquid nitrogen and stored in 280uC until processing. Formalinfixed paraffin-embedded (FFPE) tumor tissue samples, from patients comprising the 2 nd validation set, were retrospectively collected (patients treated between 1985 and 2008). The latter were fixed in formalin for at least 6 hours before being embedded in paraffin. Laryngeal tumors were histologically assessed and verified in all cases, including the fresh-frozen tissue samples.

Ethics Statement
Fresh-frozen and FFPE tumor tissue samples were collected according to protocols approved by the Institutional Review Board of the ''AHEPA'' Hospital and the Bioethics Committee of the Aristotle University of Thessaloniki, School of Medicine. Written informed consent for the scientific use of biological material was obtained from all patients comprising the training and 1 st validation sets and from the patients of the 2 nd validation set treated after 2004. Waiver of consent was obtained from the Bioethics Committee for patients treated before 2004 and for whom FFPE tumor tissue samples needed to be retrospectively collected. All clinical investigations related to the present study have been conducted according to the principles expressed in the Declaration of Helsinki.
RNA isolation from fresh-frozen tissue and global gene expression profiling RNA isolation from fresh frozen tumor specimens was performed using the RNeasy protocol (Qiagen, Hilden, Germany), as previously described [19]. RNA quantity was determined by measuring UV absorbance at 260 and 280 nm, while RNA quality was assessed using an Agilent 2100 Bioanalyzer RNA 6000 LabChip kit (Agilent Technologies, Palo Alto, CA). RNA was reverse transcribed, labeled and hybridized to Affymetrix (Santa Clara, CA) HG-U133A arrays, as previously described [19]. Experiments concerning the training and 1 st validation sets were carried out separately, at different time points, at Siemens Healthcare Diagnostic Products (Cologne, Germany). The gene expression data have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) and are available through the GEO Series accession number GSE27020. The following link has been created to allow review of GSE27020: http://www.ncbi.nlm.nih.gov/ geo/query/acc.cgi?token = dxcdxksauqicizy&acc = GSE27020

RNA isolation from FFPE tumor tissue samples and qRT-PCR
All investigations involving FFPE tissue samples were performed at the Laboratory of Molecular Oncology, Hellenic Foundation for Cancer Research, Aristotle University of Thessaloniki School of Medicine. To validate the prognostic value of genes derived from our microarray analysis, we used a separate cohort of patients with available tumor tissue material. This sample group included 187 FFPE tissue samples that were macrodissected upon previous histological evaluation to contain .50% tumor cells. RNA was isolated after complete overnight tissue lysis using Trizol-LS (Life Technologies, Paisley, UK), as previously described [20] and was reverse transcribed with Superscript III (Life Technologies), according to the manufacturer's instructions. cDNAs were normalized at 25 ng/ul and stored at 220uC until use. mRNA expression was investigated with FAM-labeled TaqManH Gene Expression Assays in duplex reactions involving a primer-limited VIC-labeled reference assay for glucuronidase beta (GUSB, assay # Hs00939627_m1), as an endogenous template control. GUSB was preferred over usually applied endogenous controls because no pseudogenes have, as yet, been reported for this gene. Additionally, it has been identified as one among the best preserved mRNA targets in FFPE tissues [21]. qPCR mRNA target selection included evaluation of the 30 probes of the U133A signature for gene duplicates, gene characteristics (valid gene identities, type of gene, recorded splice variants that would not be distinguished by the probe on the array or by the qPCR assay), as well as a parametric p-value of ,0.05 for fold-change in gene expression. Out of the 30 U133A probes, 23 remained valid for qRT-PCR application ( Table 2). For these 23 targets, we searched for pre-made TaqManH Gene Expression Assays (Applied Biosystems) that would match the target sequences detected by the corresponding probes in the U133A array. It was possible to retrieve 16 such assays (Table 2). Duplex 10 ul reactions were run in duplicates, each containing 50 ng of template, in an ABI PRISM 7900HT system (Applied Biosystems/Life Technologies) on a 384-well block under default conditions involving 45 amplification cycles, along with a reference RNA sample (TaqManH Control Total RNA, cat. no 4307281, Applied Biosystems) and no-template controls. The reference RNA was used as a positive plate control and for the evaluation of assay performance among runs (inter-run validation). Two of the 16 selected assays yielded deltaRQ values of .1.5 among different runs and, hence, failed inter-run validation and were not further evaluated (Table 2).
Cycle thresholds (CT, corresponding to Cq in MIQE guidelines) for each target and for the endogenous reference were automatically obtained at default conditions and relative quantification (RQ) was calculated in a linear mode [22] by subtracting (45-avg deltaCT), whereby 45 was the total amplification cycle number and avg dCT = average [(CT target)2(CT GUSB)] for duplicates. Eligibility criteria for further sample evaluation included GUSB CT values of ,38 for each reaction and deltaRQ for each duplicate (intra-run variation) of ,0.8. For 3 assays no amplification curves were obtained for the FFPE and the reference mRNA samples, while for an additional assay high intra-run RQ value differences were observed in 87% of samples (Table 2). Based on the above filtering steps for assay and sample eligibility, it was finally possible to evaluate RQ results for 10 genes only.

Statistical analysis
Prognostic gene expression models were developed exclusively in the training set. DFS was measured from the time of diagnosis until verified disease progression or death. Alive patients without verified disease progression were censored at the date of last contact. Genes selected had to be univariately associated with DFS (p,0.01, Cox proportional hazard model). The algorithm fits proportional hazards models to relate DFS to each gene, one gene at a time, and provides a p value for each gene, testing the hypothesis that DFS is independent of the expression level of the particular gene. Genes found to be associated with DFS in the training set were then ranked based on their absolute hazard ratio value, provided by the algorithm. Prognostic gene models, comprising different numbers of top ranking genes, were developed using the supervised principal component survival algorithm [23]. The algorithm computes principal components and performs Cox proportional hazard regression analysis to calculate a regression coefficient (weight) for each principal component. A supervised principal component model is developed to provide a prognostic index for each patient of the study. A high prognostic index corresponds to a high value of hazard of recurrence. To evaluate the predictive value of this method, we used Leave-One-Out-Cross-Validation, where each case is omitted and the entire analysis is performed using the rest of the samples. In order to directly apply these models to the 1 st validation set, we normalized the training and the 1 st validation sets, using the empirical Bayes (EB) method [24]. The method uses an algorithm designed to adjust for the non-biological experimental variation (''batch effect'') between different datasets. It reduces inter-laboratory variation, as well as technical differences due to the utilization of different platforms and methodological approaches. After normalization, we directly applied the gene models to the 1 st validation set without any modifications. Kaplan-Meier curves and log-rank tests were used to estimate and compare the survival distributions in patients at high-and low-risk of recurrence. All reported p values are two-sided. Cox proportional hazard analysis was used for univariate analysis and multivariate adjustment for known prognostic factors. Statistical analysis was performed using the BRB-ArrayTools developed by Dr. Richard Simon and the BRB-ArrayTools Development Team and the SPSS statistical package, version 18.0, (IBM Corporation, Armonk, NY).
We used the unsupervised ''Subclass Mapping'' (Submap) method [25] to evaluate the molecular correspondence of patients with favorable and unfavorable prognosis between the training set and the 1 st validation set. This method bi-directionally assesses the association of predefined subtypes in multiple independent datasets, despite their technical variation. The algorithm provides the calculation of a p value to demonstrate the likelihood of molecular similarity between the different subclasses, it is implemented in the GenePattern software (Version 3.0, Broad Institute, Cambridge, MA) and can be accessed at http://www. broad.mit.edu/genepattern/ Gene set analysis (GSA) was utilized to detect gene network deregulation characteristic of groups of patients with good or poor prognosis [26]. Using publicly available data, we then predicted oncogenic pathway activation status in each patient of the training and 1 st validation sets. We applied gene expression models, previously developed and validated in vitro, to estimate the probability of pathway activation in each sample [27]. Finally, using Bayesian probit regression models we assigned to each patient a probability of pathway activation.

Identification and validation of prognostic classifiers using gene expression profiling
The flowchart of our study is shown in Figure 1 (consort diagram). We analyzed primary laryngeal tumors from 66 patients (training set) and 54 patients (1 st validation set) using global gene expression profiling. After evaluating the quality of the microarray data, we excluded 7 and 4 technical outliers from the two sets, respectively. For some of the genes, expression was evaluated using two different probe sets. Prognostic probe set models were identified exclusively in the training set. After excluding one fourth of the least variant genes, we focused on genes associated with DFS (Wald's p,0.01). We then ranked the 253 probe sets found to be significantly associated with DFS, based on their Cox regression coefficient. We identified several prognostic probe set models consisting of as many as 250 to as few as 20 probe sets, which performed equally well in the training set.
Subsequently, we applied these multigene predictors directly to the 1 st validation set. The 30-probe set model performed the best in the validation set (the model with the fewest genes demonstrating the highest statistical difference in DFS between high-and lowrisk patients). Median DFS for the groups of patients with unfavorable and favorable prognosis, as predicted by the 30-probe set model, was 34 and 80 months, respectively (log-rank, p = 0.010). The hazard ratio (HR) for recurrence in the high-risk group versus the low-risk group was 3.87 (95% CI 1.28-11.73, Wald's p = 0.017). Kaplan-Meier curves for all probe set models in the training and 1 st validation sets can be found in File S1. Concordance between the risk assignments both in the training and 1 st validation sets based on the different classifiers was high, 81-87% (Cramer V test = 0.62 to 0.75). Annotations of all 30 probe sets included in our model are shown in Table 3, while a graphical representation of the gene expression patterns of the 30-probe set model in the high-and lowrisk patients is shown in Figure 2a. We observed that in this model, two genes (ACE2 and MAP4K1) were represented by two probe sets. In order to avoid the effect of weighting these two genes twice as much compared to each of the other individual probe sets, representing a single gene, we retested our model in the training and 1 st validation sets using a single probe set for each gene. In the training set, statistical significance in 28-gene model remained identical (log-rank test, p,0.001, Figure 3a) with the one based on the 30-probe set model. Median DFS in the 1 st validation set, as predicted by the 28-gene model, was 34 and 80 months, respectively (log-rank, p = 0.029, Figure 3b). The hazard ratio (HR) for recurrence in the high-risk group versus the low-risk group was 4.38 (95% CI 1.06-7.01, Wald's p = 0.036).
To further validate the prognostic significance of these profiles, we applied our 28-gene model to a publicly available cohort of patients with early stage laryngeal cancer [28]. Due to technical variation between the two datasets, 23 out of the 28 genes of our model were used to stratify patients based on their risk for recurrence. Despite the technical and biologic limitations (different platforms, different number of genes, early vs all-stage disease) our model maintained its prognostic significance. Median DFS for the groups of patients with unfavorable and favorable prognosis, as predicted by the 23-gene model, was 118 and 161 months, respectively (log-rank, p = 0.011, Figure S1). The HR for recurrence in the high-risk group versus the low-risk group was 4.37 (95% CI 1.24-15.34, Wald's p = 0.022).
Molecular homology of patients with favorable and unfavorable prognosis in the training and 1 st validation sets Our 28-gene profile appears to not solely be a collection of prognostic genes but to actually capture the underlying biology of the tumors. To demonstrate the molecular homogeneity of highrisk tumors, we used subclass mapping (Submap), a method that assesses molecular similarity of predefined groups belonging to different datasets. We indeed illustrated that groups of patients with poor and good prognosis in the training set share the same biological patterns with the respective groups in the validation set, above and beyond the expression of specific genes. Charts displaying good ''molecular match'' of the high-and low-risk patients are shown in Figure 2b.

Independent prognostic significance of the 28-gene model
We were interested in demonstrating the independent prognostic significance of our multigene predictor. We included in the analysis stage and grade, the only known prognostic factors, for which data were available for the patients of our study. We have also included radiation therapy, since it is known to significantly reduce local recurrence. In multivariate analysis, our 28-gene model maintained borderline prognostic significance in the 1 st validation set. HR for recurrence in the high-risk group was 2.67 (95% CI 0.99-7.22, Wald's p = 0.05) (details are shown in Table 4).

Pathway analysis in the high-and low-risk groups
In order to gain some additional insight into the biological processes in patients with favorable and unfavorable prognosis, we performed Gene Set Analysis (GSA). We explored gene networks and biological themes, as described by the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Database. We indeed identified a wide range of pathways, differentially expressed between the two groups of patients (GSA Goeman's global test p values,0.01). We focused on pathways deregulated both in the training and in the 1 st validation sets. Table 5 presents selected pathways of interest, while the full list of pathways can be found as in Table S2. Several of these pathways have previously been shown to play an important role in head and neck cancer progression. Interestingly, we observed that genes of the focal adhesion (FA) pathway [29], shown to be prognostic in our dataset, as well as in head and neck cancer, successfully stratified our patients based on their risk of recurrence (details in Figure 4).

Oncogenic pathway activation patterns in individual patients
In addition to the aforementioned pathway analysis results that derived from evaluating groups of patients, we sought to explore pathway activation in individual patients. We used previously developed and validated ''in vitro'' gene expression ''read-outs'' to identify activation of known oncogenic pathways in each patient of the training and 1 st validation sets. We investigated the Src, Ras, b-catenin and E2F3 pathways, which have previously been shown to be associated with survival in other types of cancer [27]. Interestingly, we demonstrated that patients with poor prognosis more often had tumors characterized by Ras pathway activation, odds ratio (OR) = 2.92 (95% CI 1.33-6.39, Wald's p,0.01), while patients with good prognosis more often had tumors exhibiting Src and b-catenin pathway activation, OR = 4.54 (95% CI 1.64-12.50, p = 0.003) and 2.63 (95% CI 1.16 to 5.88, p = 0.03), respectively. In addition, we performed Cox proportional hazards survival regression and observed that the Ras pathway activation was found to be associated with poor prognosis, HR = 2.55 (95%   Figure 5. Finally, we performed multivariate analysis, including the 28-gene model and the Ras pathway and found that the predictor maintained its independent prognostic significance (Wald's p = 0.001), as did the oncogenic pathway (Table 6).

qPCR validation of a subset of genes from the 28-gene predictor
The aforementioned analyses were performed using freshfrozen tissue samples. However, this approach has certain methodological limitations when it comes to daily clinical practice. Thus, we attempted to validate our multigene predictor using qPCR on FFPE tumor tissue samples, a more easily applicable methodology. As described in the Materials and Methods section, however, it was possible to reliably investigate in our FFPE sample series the expression of only 10 of the 28 genes in the original predictor. The descriptive characteristics of RQ values obtained from the informative samples per assay are described in Table 7 (means, medians, SD, min, max). At first, we attempted to cluster continuous RQ values for all these genes. However, hierarchical clustering did not yield results comparable to the 28-gene predictor in a meaningful way, with the majority of high and low RQ values clustered in the opposite direction (Figures 6a and  b). This was of no surprise, because we only examined 1/3 of the genes probed in the array signature, while fold-change in the expression of these genes was very narrow and could easily be reversed with a different method (qPCR vs. array hybridization) or a different type of material (FFPE vs. fresh-frozen tissues). Detailed analytical and statistical approaches for the behavior of each qPCR assay vs. the array probe in matched FFPE vs. fresh-frozen samples would be needed for the clarification of this discrepancy, however such an in-depth analysis was beyond the scope of the present study. Therefore, we next applied pre-determined profiling for the investigation of possible effects of the genes tested with qPCR on patient DFS. For this purpose, we transformed continuous RQ values into binary parameters (high/low expres-sion). Based on the narrow fold-change of gene expression between the good and bad prognosis groups in the 28-gene predictor, we used median RQ values to classify high and low expression for the 10 evaluable genes. Log-rank testing for these genes as single markers yielded associations with outcome similar to those observed for the corresponding genes in the U133A signature (high/low patterns comparable with up-and down-regulation of gene expression, respectively, for 6 of the 10 genes), some of which were significant (PRKD1) or showed a trend for significance (ACE2, FLOT1) ( Table 8). Hierarchical clustering of continuous RQ values for the latter three genes yielded two groups of patients  with significantly different DFS (Figures 6c and d). The expected gene expression pattern was present in the correct context with outcome, but was absent in the majority of cases.
Application of the good prognosis high/low gene expression patterns to the binary transformed RQ values revealed no single tumor with the expected pattern for all 10 genes. However, profiling of the three significant genes only revealed a strong association with patient outcome (univariate COX regression, overall Wald's p = 0.012). In particular, tumors expressing ACE2 high, FLOT1 and PRKD1 low (expected good prognosis profile according to the 30-gene predictor, n = 20) were indeed associated with a significantly longer DFS, as compared to tumors expressing ACE2 low, FLOT1 and PRKD1 high (expected bad prognosis profile, n = 21) (good vs. bad profile, HR = 0.24, 95% CI 0.09-0.63, Wald's p = 0.003). As with hierarchical clustering for the three genes, the majority of tumors (n = 108) did not fit into the above profiles and were of intermediate prognosis with a trend to perform better than the tumors expressing the bad profile (intermediate vs. bad HR = 0.58, 95% CI 0.33-1.03, Wald's p = 0.063). Similar results were obtained upon profiling PRKD1 with ACE2 (univariate Cox, overall Wald's p = 0.006; expected good vs. bad profile HR = 0.35, 95% CI 0.18-0.68, p = 0.002), and PRKD1 with FLOT1 (univariate Cox, overall Wald's p = 0.013; expected good vs. bad profile HR = 0.39, 95% CI 0.21-0.73, Wald's p = 0.003) ( Figure 5, Table 8). A more detailed subclassification of tumors according to more combinations of high/ low for the above genes did not yield significant associations with outcome.
With respect to clinicopathological parameters, no association was observed for age, alcohol consumption, smoking habits, histological grade, stage, or lymph node status, with ACE2, FLOT1 and PRKD1 mRNA as single binary variables, as clusters, or as pre-determined profiles. High ACE2 expression was observed more often in tumors without lymph node involvement Binary ACE2, FLOT1, PRKD1 variables, ACE2/FLOT1/ PRKD1, ACE2/PRKD1 and FLOT1/PRKD1 profiles, as well as ACE2/FLOT1/PRKD1 clusters were also tested for their effect on patient outcome in multivariate models, along with age, alcohol consumption, smoking status, lymph node status, extent of surgery and post-operative radiation. High PRKD1 mRNA expression as a single marker (HR = 2.00, 95% CI 1.28-3.14, Wald's p = 0.002) and positive lymph node status (HR = 4.00, 95% CI 2.22-7.37, Wald's p,0.001) independently predicted for unfavorable DFS, while patients that had undergone total laryngectomy had decreased risk for relapse (HR = 0.55, 95% CI 0.31-0.95, Wald's p = 0.036), as compared to all other surgical approaches.

Discussion
Our group has previously identified prognostic gene models in patients with early stage (T1N0M0, T2N0M0) laryngeal cancer [28] by using Illumina expression profiling (Illumina, CA) in 56 FFPE tissue samples. While this former study provided promising information on the genetic profile of early stage laryngeal cancer, herein we sought to expand our research to operable laryngeal cancer beyond stage. The present study employed a different platform for expression profiling (Affymetrix) and two independent validation sets. We observed that the prognostic profiles from our previous work do not share the same genes with the profiles presented here. However, the present 28-gene prognostic profile successfully stratified patients of the previously published early stage laryngeal cancer cohort with respect to disease-free survival ( Figure S1). We believe therefore, that even though these profiles comprise of different genes, they capture the underlying biology of the tumors, which correlates with their aggressive behavior.
Such a prognostic signature in laryngeal cancer patients has important clinical implications. Patients with unfavorable prognosis, as identified by our multigene predictor, might gain great Treatment efficacy of established therapeutic modalities could be further improved by novel therapeutic approaches. In an attempt to provide hypotheses on tumor progression, as well as novel therapeutic concepts, we explored the molecular biology of laryngeal tumors. First, using Submap, we illustrated that high-and low-risk tumors in the training set are molecularly homogeneous to their respective groups in the validation set. Then, using large-scale gene expression analysis, we demonstrated that tumors of high-risk patients are characterized by deregulation of several pathways of interest. Many of these gene networks have already been identified to be prognostic in cancer. Specifically, deregulation of the TGF-beta [30,31], VEGF [32,33,34], mTOR [35], Wnt [36], Hedgehog [37,38] and insulin [39] signaling pathways has been found to be associated with aggressive disease in squamous cell carcinomas.
Based on these data, the Hellenic Cooperative Oncology Group (HeCOG) proceeded with evaluating the prognostic value of the VEGF [40], EGFR [41] and insulin [42] signaling pathways in laryngeal cancer. Prospective validation of the prognostic role of the aforementioned pathways will promote the accurate identification of patients at high risk for recurrence who may necessitate a more aggressive treatment.
A number of studies also suggest that many of the genes playing a key role in these pathways may serve as novel therapeutic targets. For instance, in vivo studies in mice, targeting the mTOR [43] and IGF [44] molecules, have shown promising results for head and neck cancer treatment. Finally, it is important to outline the presence of the ''Focal Adhesion'' pathway in the gene set analysis results both in the training and 1 st validation sets [29]. Thurlow et al have previously demonstrated the prognostic significance of this network, in patients with head and neck cancer.
To further explore networks in laryngeal cancer, we sought to examine pathway activation status in each individual patient of our study. The Ras pathway appeared to be more frequently activated in tumors from high-risk patients. More importantly, Figure 6. Independent validation of selected genes from the 28-gene model. Hierarchical clustering of RQ values from the genes tested in the 2 nd validation set (FFPE samples). In panels a and b, RQ values of the 10 applicable genes were evaluated. Two major clusters were identified (a) that were shown to have a significant effect on patient DFS (b). Cluster 1 was associated with a better prognosis compared to cluster 2. However, the majority of these genes were clustered in the opposite direction than that expected from the 28-gene classifier. In panels c and d, clustering of the 3 genes with individually significant associations yielded 2 patient groups with distinct outcome; in comparison to the original 28-gene classifier, correct patterns of gene expression were present in the corresponding good and bad prognosis groups. Red and green colors in panels a and c denote high and low expression, respectively. doi:10.1371/journal.pone.0070429.g006 there was a significant survival difference between patients with Ras pathway activation and those without. It has already been suggested that the Ras pathway plays a growth promoting role in head and neck cancer [45]. These data indicate that members of the Ras pathway may be valuable targets for chemotherapeutic interventions in laryngeal cancer.
In the present study, the identification and independent validation of multigene predictors was performed using freshfrozen tumors from patients with resectable laryngeal cancer. However, gene expression profiling using microarray technology has several limitations that have as yet hampered its introduction into daily clinical practice. These include the need for fresh-frozen tissue, the lack of reproducibility and external validation, the complexity of microarray data analysis and cost. In an effort to transcribe our prognostic signature into clinically applicable markers, we tested genes from this classifier on a series of routinely processed FFPE samples from laryngeal cancer patients with similar disease characteristics at presentation. However, in order to evaluate the effectiveness of this validation step, as it is presented here, we need to consider that the 2 nd validation set differed from the other two study groups in the type of tissue material used and the method applied for gene expression profiling. FFPE RNA quantification results may differ in comparison to those from frozen tissues [15,46], while microarray expression signatures are usually filtered down to a few genes for qPCR applications [47]. As a result, it is challenging to reproduce the significance obtained with array classifiers. For example, Mirisola et al developed a 4gene classifier out of 213 genes distinguishing laryngeal carcinomas with and without relapse, but managed to reproduce only one of these gene markers with qPCR [15]. Herein, we applied very strict criteria for the selection of PCR targets in order to obtain profiles that would reliably reflect the ones included in the original U133 model. It should also be noted that, in comparison to the training and 1 st validation cohorts, patients in the 2 nd validation cohort experienced a higher incidence of recurrence. With all the above restrictions and reservations it was possible to validate profiles for 3 of the 28 genes in the array predictor, i.e., ACE2, FLOT1 and PRKD1. Despite the fact that 49-74% of tumors would fall into the intermediate category of undetermined prognosis, combinations of these markers might serve for distinguishing laryngeal cancer patients who would remain disease-free for more than 10 years or who would experience early relapse. Importantly, out of these three genes, PRKD1 was revealed to be an independent predictor for outcome. Hence, testing for PRKD1 expression only might, in fact, be more efficient for predicting laryngeal cancer prognosis than testing for all three genes. These are novel findings, since none of ACE1, FLOT1 and PRKD1 have previously been investigated in laryngeal or head and neck cancer, neither is it known whether these genes are functional in the normal laryngeal epithelium. The latter findings however, should be viewed as hypothesis generating rather than definitive.
From the biological point of view, the expression patterns of ACE2, FLOT1 and PRKD1 in association with good or bad prognosis are largely in line with the proposed anti-growth and anti-tumor roles for ACE2 [48,49]; with tumor promoting roles for FLOT1 [50,51]; and, with pro-metastatic roles for PRKD1 [52,53,54]. It should be noted, however, that the differences observed in the expression of ACE2, FLOT1 and PRKD1 in favourable and unfavourable laryngeal carcinomas were not clearcut in the array predictor, which may reflect the lack of direct genomic alterations within the corresponding gene sequences. Determining RQ value cut-offs for prospective use remains a challenge with all qPCR expression markers of this type. Based on the present results, it seems worthy pursuing PRKD1 expression at the protein level with immunohistochemistry for routine practice, provided that validated antibodies for this marker will become commercially available. One potential criticism of the analyses presented here is that patients have been treated over a long period of time and that treatment modalities might have changed over these years. However, laryngeal cancer is a rare disease and it would be difficult to acquire a large number of samples from patients treated over a shorter period of time. Another limitation is that patients included in this study have been treated in a singe institution. For this reason, one might question the applicability of our multigene predictors in diverse patient populations. Large-scale multicenter studies are therefore necessary to confirm the applicability of our prognostic models.
Moreover, all of our patients were treated with surgery, while approximately 40% of the patients also received radiotherapy. Of note, current treatment guidelines in many European countries [17], in early stages of laryngeal cancer, include surgical resection with or without adjuvant radiotherapy, which renders the findings of the present study both timely and clinically relevant. It is unclear however, whether patients with an unfavorable prognostic profile would be better treated with chemoradiotherapy, as given in other countries including the US [18], or other approaches. Prospective studies are needed in order to clarify this issue.

Conclusions
To summarize, laryngeal cancer is a heterogeneous disease. Patients with similar clinicopathological features may have a different outcome. It is important therefore, to accurately identify which patients with laryngeal cancer will recur, in order to use more aggressive and effective modalities. Our model accurately identified patients at high-risk for recurrence in the training set, as well as, in two independent validation sets. This model does not appear to solely be a collection of prognostic genes, but provides insight into disease mechanisms and potential therapeutic targets. The prognostic and biological significance of ACE2, FLOT1 and especially PRKD1 merits prospective validation, in order for these factors to possibly serve as independent prognostic markers for recurrence in patients with surgically resected squamous-cell carcinoma of the larynx. Figure S1 External validation of our 28-gene model using publicly available data. Kaplan-Meier survival estimates for high-and low-risk patients, as predicted by our 28-gene model in a separate cohort of patients with early stage laryngeal cancer.

(DOC)
File S1 Kaplan-Meier survival estimates for high-and low-risk patients, as defined by prognostic models comprising 20 to 250 probe sets. (DOC)