Molecular prognosticators in clinically and pathologically distinct cohorts of head and neck squamous cell carcinoma—A meta-analysis approach

Head and neck squamous cell carcinomas (HNSCC) includes multiple subsites that exhibit differential treatment outcome, which is in turn reflective of tumor stage/histopathology and molecular profile. This study hypothesized that the molecular profile is an accurate prognostic adjunct in patients triaged based on clinico-pathological characteristics. Towards this effect, publically available micro-array datasets (n = 8), were downloaded, classified based on HPV association (n = 83) and site (tongue n = 88; laryngopharynx n = 53; oropharynx n = 51) and re-analyzed (Genespring; v13.1). The significant genes were validated in respective cohorts in The Cancer Genome Atlas (TCGA) for correlation with clinico-pathological parameters/survival. The gene entities (n = 3258) identified from HPV based analysis, when validated in TCGA identified the subset specifically altered in HPV+ HNSCC (n = 63), with three genes showing survival impact (RPP25, NUDCD2, NOVA1). Site-specific meta-analysis identified respective differentials (tongue: 3508, laryngopharynx: 4893, oropharynx: 2386); validation in TCGA revealed markers with high incidence (altered in >10% of patients) in tongue (n = 331), laryngopharynx (n = 701) and oropharynx (n = 404). Assessment of these genes in clinical sub-cohorts of TCGA indicated that early stage tongue (MTFR1, C8ORF33, OTUD6B) and laryngeal cancers (TWISTNB, KLHL13 and UBE2Q1) were defined by distinct prognosticators. Similarly, correlation with perineural/angiolymophatic invasion, identified discrete marker panels with survival impact (tongue: NUDCD1, PRKC1; laryngopharynx: SLC4A1AP, PIK3CA, AP2M1). Alterations in ANO1, NUDCD1, PIK3CA defined survival in tongue cancer patients with nodal metastasis (node+ECS-), while EPS8 is a significant differential in node+ECS- laryngopharyngeal cancers. In oropharynx, wherein HPV is a major etiological factor, distinct prognosticators were identified in HPV+ (ECHDC2, HERC5, GGT6) and HPV- (GRB10, EMILIN1, FNDC1). Meta-analysis in combination with TCGA validation carried out in this study emphasized on the molecular heterogeneity inherent within HNSCC; the feasibility of leveraging this information for improving prognostic efficacy is also established. Subject to large scale clinical validation, the marker panel identified in this study can prove to be valuable prognostic adjuncts.


Introduction
Accurate molecular prognosticators predictive of survival in patients diagnosed with cancer can be an invaluable adjunct to the existing clinical and pathological parameters. In head and neck squamous cell carcinoma (HNSCC), advanced stage tumors, nodal metastasis, and presence of aggressive pathological features (perineural (PNI)/lymphovascular invasion (LVI)) are poor prognosticators indicating an increased risk of metastatic disease and reduced disease free/overall survival [1][2][3]. Notwithstanding the adoption of multi-modal treatment strategies based on these parameters, survival has not improved [4], indicating the need for additional and more accurate factors to improve disease management. Currently, TNM staging is a primary parameter for triaging the patients based on prognosis, however stage-independent prognostic impact attributed to the presence of pathological parameters such as PNI, angiolymphatic invasion (ALI) and LVI [5,6] indicates the need for a multi-parameter assessment. Whether additional consideration of underlying molecular parameters can further improve prognostication needs to be investigated.
Prognostic impact of the clinico-pathological parameters is site-dependent; recent studies have identified differential impact of the pathological parameters in the prognosis of tongue, buccal and laryngopharyngeal cancers possibly owing to the underlying molecular pathways involved in the carcinogenesis and progression of the disease [7][8][9]. Additionally, Human papilloma virus (HPV) -associated etiology is an increasingly evident prognostic factor; oropharynx being the most commonly associated site [10] [11] [12]. Among pathological parameters, nodal metastasis is a significant prognosticator in HNSCC; the primary tumor site, presence of PNI, extra-capsular spread (ECS), being additional contributing factors [2][3][4]13,14]. PNI is a factor that can determine lymph node involvement, recurrence, disease-free and overall survival necessitating appropriate treatment management if susceptibility is detected. Elective neck dissection is prescribed for T1/T2N0 PNI+ patients [15,16]. Additionally, PNI was identified as a prognostic factor in early stage tongue cancer, while LVI was a significant parameter in late stage buccal mucosal cancers [3]. These studies point out to etiological/biology-driven differences in the survival impact of the existing clinical and pathological parameters; identifying the underlying biological parameters may be significant in improving their prognostic impact.
Advancement in the technologies and high throughput global profiling have led to the identification of a repertoire of candidate markers that can qualify as molecular prognosticators [17][18][19][20]. These markers can be predictive (to evaluate the likelihood of benefit from a specific clinical intervention) or prognostic (to evaluate the patient's overall outcome). A cataloguing of the candidate markers that correlated with the aetiology and pathological parameters in a site-dependent manner, might provide insights into the molecules associated with treatment response/survival in each site and will therefore be more specific. This study proposed that molecular markers can serve as accurate adjuncts to existing prognostic parameters in head and neck cancers. Towards this effect, this study planned to mine the existing high throughput data to identify a panel of potential markers associated with etiological/clinical/pathological features in the three major sites in HNSCC; oral cavity, laryngopharynx and oropharynx. pipelines i) HPV associated meta-analysis gene list was validated in the HPV+ and HPVcohort of TCGA ii) meta-analysis derived significant gene list obtained from each site was validated in the site-specific cohort in the TCGA HNSCC database. In both cases, the validation was carried out with regard to incidence of gene alteration (percentage alteration), differential profiling (mRNA expression status using z-score) and prognostic efficacy (correlation with survival) (Fig 1).
The mRNA expression z-scores (RNA Seq V2 RSEM) of significant gene entities were downloaded from TCGA HNSCC. The patient samples were categorized into HPV+/HPVand within the sites further classified based on clinical and pathological condition. As a first tier analysis, the significance of the z-score distribution between the different groups; HPV, clinical stages (stage I-II vs stage III-IV) and pathological conditions (ALI, PNI, Node, ECS) was evaluated using the t-test unpaired unequal variance (p<0.05). The second tier analysis was carried out with the significant gene set to assess the heterogeneity of expression within each sub cohort of patient. This was assessed by evaluating the percentage of patients showing alteration in the gene (z-score threshold: ±2) in each cohort and difference in level of expression in the altered set (t-test unpaired unequal variance) (Fig 1).
Prognostic efficacy was assessed with markers identified at two levels i) mRNA z-score based differentials ii) markers with higher incidence of alterations in patients (>20%) (Fig 1). Over all work flow of site specific meta-analysis. Raw data (Affymetrix U133 Plus 2.0 microarray) of head and neck cancer series from publically available databases were downloaded and re-analyzed in Genespring statistical software. During analysis, post data upload onto the software, normalization (RMA) was carried out and samples grouped either into HPV positive/negative or based on sites (tongue, laryngopharyngeal and oropharyngeal cancer). In each site, further sample grouping was carried out into tumor and normal. Gene level experiment was performed in the software and PCA/clustering analysis was carried out to remove the discordant samples. Significant gene entities were identified (fold change >2.0, p-value <0.05) from each analysis/experiment. The annotation of the significant genes was carried out by TOPPFUN gene enrichment analysis (GO analysis/pathways). For validation, The Cancer Genome Atlas (TCGA) Head and Neck squamous cell carcinoma (HNSCC) cohort was classified based on HPV (HPV +/-) and site (tongue, laryngopharyngeal and oropharyngeal cancer). Patient cohorts in each site were further classified based on stage/pathological parameters and/or HPV. The significant gene entities identified from meta-analysis were validated in their respective cohorts; differential mRNA expression (z-score) and association with survival (OS/DFS) were the main endpoints.
The markers were assessed for their survival impact in patients within the TCGA cohort categorized based on HPV and clinical stage (early and late stage), nodal metastasis (with and without extracapsular spread) and pathological (perineural invasion, angiolymphatic invasion, positive lymph node hematoxylin and eosin staining microscopy count, extracapsular spread pathologic) parameters within each site. For each analysis, TCGA samples without definite information for these parameters were excluded from the study. The contribution of these markers to the prognostic efficacy of these parameters (disease-free (DFS), overall survival (OS)) was assessed. The significant marker panel (p-value <0.04) were also evaluated for the prognostic efficacy in the different sub cohorts of patients in each site (Kaplan Meier plots; log rank test, p<0.05) (Fig 1).

Data mining and meta-analysis
A total of 8 datasets, wherein analysis was carried out using the Affymetrix platform (U133 plus 2.0) were included in this study (S1 File). The samples were categorized based on HPV status and then site. Among the series downloaded, two series had information on HPV status; a total of 97 tumor samples (HPV positive n = 12; Negative n = 85) were selected, 83 included in the final analysis after PCA/clustering analysis and set of 3258 gene entities were identified.
The tongue, laryngopharyngeal and oropharyngeal cancer samples (both tumor and adjacent normal) were separated out from the 8 series for analysis; a total of 192 samples (tumors (T): 154, normal (N): 38) were collected from the series, which included tongue (T = 62; N = 26), laryngopharyngeal (T = 46, N = 7) and oropharyngeal (T = 46; N = 5) cancers. The site-specific samples were grouped and analyzed separately in the GeneSpring for the identification of significant gene entities. PCA analysis removed the outliers and the samples that qualified were taken for further analysis in each site (tongue = 73, laryngopharyngeal = 48, oropharyngeal cancer = 43). Site-specific meta-analysis lead to the identification of the statistically significant gene entities (fold change (FC)> 2.0, p-value<0.05) specific to each site; 3508 in tongue samples, 4893 from laryngopharyngeal and 2386 from oropharyngeal cancers.
A comparison across the different sites indicated that 698 genes were common across all the 3 sites (S2 File). The significant gene entities from each site, when cross compared with the previously published database of 181 gene entities (up regulated: 168; down regulated: 13) [21] indicated that 66.3% (n = 119/181) of the genes in the database were identified in tongue cancer, while 109/181 genes (60.22%) were common with laryngopharyngeal subset and 95/181 genes (52.48%) were common with oropharyngeal cancer.
The genes were validated in the TCGA HNSCC cohort with HPV information (n = 271; HPV+: 26, HPV-: 245) for their percentage alteration and the mRNA expression status; a total of 833 genes were altered in more than 5% and 267 genes in >10% of the cohort (S4A, S4B and S4C File). In order to document differences in alteration status between the HPV+/HPVcohorts, the alteration status and expression levels of this subset (n = 267) were assessed separately in each cohort. Analysis to identify the genes exclusively altered in the HPV+ cohort, indicated that alterations in 63/267 genes were specific to the HPV+ (altered in >10% of cohort) with minimal (altered in <5% of the cohort) or no alterations in the patients negative for HPV (Fig 2A; S4D File). Among these, ZNF541 (Zinc Finger Protein 541) showed the highest alteration in 50% of HPV+ cases as compared to 0.5% in HPV-cases. The other top genes altered in HPV positive cohort includes SYNGR3 (Synaptogyrin 3, 42% vs 1.2%), MAP7D2 (MAP7 Domain Containing 2, 42% vs 2%), RELB (RELB Proto-Oncogene, NF-KB Subunit, 42% vs 2%), and ZNF488 (Zinc Finger Protein 488, 35% vs 3%) (Fig 2A). Assessment of the mRNA expression levels (z-score) indicated that this cohort of 63 genes showed significant differential expression between the two cohorts (p<0.04) (Fig 2B-2D; S4D File). The survival impact of these genes in patients positive for HPV (n = 26) was assessed; RPP25 showed association with overall survival (p = 6.545e-3), NUDCD2 (NudC Domain Containing 2) was associated with DFS (p = 2.415e-03), while NOVA1 (Neuro-Oncological Ventral Antigen 1) was a good prognosticator for overall survival in patients positive for HPV (p = 0.048) (Fig 2E-2G; S4E File).

Markers associated with early stage tongue cancer.
In order to validate the gene entities in an independent cohort of patients, the TCGA patient cohort was used. The alteration status was evaluated in tongue cancer-specific patient cohort of TCGA (n = 132) to identify the highly prevalent genes; 331 genes were altered in at least 10% of the cohort, while 56 genes were altered in more than 20% (S5D File). To delineate the expression of these markers in different clinical stages, the TCGA tongue cancer patients were further classified into early/ late stages and the expression profiling compared. mRNA z-score based differential profiling of the genes (n = 331) between the early and late stage cohorts indicated 14 genes as statistically significant differentials (p = >0.05) (S6A File); JAM2 (Junctional Adhesion Molecule) showing a significant alteration (FC>2 fold; p = 0.05) in the early stage patients ( Table 1, Fig 3A). Marker profile in HPV+ HNSCC in terms of alterations, differential expression and prognostic efficacy. Comparison of the significant gene entities across HPV positive and negative cohort for their percentage alteration identified a total of 63 genes as highly altered in HPV positive cohort as compared to HPV negative cohort (A). Assessment of expression levels (mRNA z-score) indicated that all these genes were significant differentials (p<0.05) between the two cohorts; the profile of top 3 genes (both alterations and z-score); ZNF541 (50%), SYNGR3 (42%), MAP7D2 (42%) in HPV positive TCGA cohort is represented as boxplots. (B-D). The selected markers (n = 63) were analysed for their significant association with disease free and overall survival in patients with HPV positive cancer in TCGA. KM analysis indicated that NUDCD2 (E) was associated with DFS (p = 2.415e-03), RPP25 (F; p = 6.545e-3) and NOVA1 (G; good prognosticator, p = 0.048) were associated with OS. (P-value; ���� p<0.00005). https://doi.org/10.1371/journal.pone.0218989.g002 Evaluation of prognostic efficacy in early stage patients indicated that none of them were associated with survival.
As an additional effort to identify significant prognosticators, prognostic impact (overall/ disease-free survival) was also investigated using the gene set altered in higher percentage of patients (>20%) in i) the overall tongue cancer cohort ii) early stage cohort iii) late stage cohort. In the overall tongue cohort, 4 genes showed an association for DFS and/or OS (p<0.05) individually and in combination. A combination of NUDCD1 (NudC domain containing 1) ( Table 2) Table 2; Fig 3D-3F; S6B File), although the marker combination was not significant. Interestingly, this analysis also revealed that NUDCD1 that was significantly correlated with all tongue cancer patients, was relevant (p<0.05) in advanced cancers (stage III-IV; n = 81) (S6B File), not in patients with early cancers.

Markers with prognostic efficacy in patients with pathologically distinct tongue cancer.
To identify the genes associated with severe pathological parameters in tongue cancer, the cohort of 331 genes was validated in the TCGA tongue cancer patients with perineural invasion (PNI, n = 65) or angiolymphatic invasion (ALI, n = 26). A panel of 17 genes were differentials in PNI+ or ALI+ as compared to the negative patients (p<0.05) (S6A File); SMURF1 (SMAD Specific E3 Ubiquitin Protein Ligase 1) (Fig 3G) was significantly altered in PNI + patients and a panel of 3 genes were significantly altered in ALI+ patients (FC>±2; p<0.05) ( Table 1; Fig 3H). None of these genes showed a survival impact. Prognostic efficacy assessed with genes having alterations in a larger cohort of patients (>20%) in the multiple pathological sub-cohorts (PNI+, ALI+, node+, node+/ECS+ and node+/ECS-) identified 4 genes in PNI + patients (n = 62); NUDCD1 (Fig 3I and 3J), TFG (TRK-Fused Gen), TMEM267 (Transmembrane Protein 267) and BRIX1 (BRX1, Biogenesis of Ribosomes) as significant in survival prediction (OS and/or DFS; p<0.05), while PRKCI (Protein kinase C, iota) ( Fig 3K) WDR70 (WD repeat domain 70) and BRIX1 were predictors in patients with ALI+ (n = 26) ( Table 2; S6B File) although the marker combination was not significant in both the parameters.  A similar investigation into markers associated extracapsular spread (ECS) in tongue cancer indicated a panel of 17 differentials in node+ ECS+ patients (p<0.05) (S6A File). However, none of these genes were significantly altered or showed any significant survival impact. In next level analysis of the prognostic efficacy of genes with alterations in >20% of patients of node positive patients (n = 68), NUDCD1, PUF60 (poly-U binding splicing factor 60KDa), ANO1 (Anoctamin 1) and RHOA (Ras Homolog Family Member A) showed an association with OS/DFS (S6B File). These genes showed no correlation with survival in negative patient cohort. Analysis of the sub cohorts with/without ECS (ECS+: 24; ECS-: 39) indicated that NUDCD1 and ANO1 were significant prognosticators ( Table 2; S6B File) in the ECS-cohort. In this sub-cohort, patient with overexpression of the marker combination, ANO1, NUDCD1 (p = 4.285e-3) and OTUD6B (p = 3.293e-3) were individually associated with overall survival in early stage tongue cancer. Comparison of the differentials based on mRNA expression levels (z-score) between patients positive for pathological parameters (PNI, ALI, nodal metastasis) in tongue cancer TCGA cohort indicated that in the PNI+ cases, SMURF1 was upregulated (G), while the gene LY6E was altered maximum in ALI+ cases (H). KM plot analysis in PNI+ patients (I and J) indicated that NUDCD1 showed an association with poor DFS (p = 0.023) and OS (p = 1.888e-3), while in ALI+ cases (K) PRKCI was associated with poor OS (p = 0.018). Analysis in the node +/ECS-cohort (L and M) of TCGA revealed that a combination of ANO1, NUDCD1 and PIK3CA was a poor prognosticator (DFS p = 0.0236 and OS p = 1.419e-4). Analysis in a further defined cohort of early stage tongue cancers positive for PNI or ALI, identified OTUD6B as a poor prognosticator (N and O). (P-value; � p<0.05).
https://doi.org/10.1371/journal.pone.0218989.g003   Table 2; S6B File). Multiple level classification including stage and pathological parameters indicated that in the early stage patients with poor prognosis (PNI+ and/or ALI+), OTUD6B was an additional prognosticator (p<0.05) ( Table 2; Fig 3N and 3O). 3.4.1 Markers associated with early stage laryngopharyngeal cancers. These significant gene entities were validated in the TCGA HNSC laryngopharyngeal specific samples (n = 127) in terms of prevalence, differential expression and prognostic efficacy. Assessment of alteration status identified that 701 genes were altered in mRNA expression in >10% of patients and 137 genes were altered in at least 20%. The expression of the 701 genes was looked into in the TCGA laryngopharyngeal subset (n = 127) classified based on stage; 82 genes were differentially regulated (p<0.05) between the early/late stage laryngopharyngeal cancers, out of which 19 genes showed a significant alteration (FC >±2) in the expression levels (Table 3; Fig 4A-4F; S8A File). These differential genes were assessed for their impact on overall and disease free survival; PRRC2B (Proline Rich Coiled-Coil 2B, p = 0.0235) and CBLL1 (Cbl proto-oncogene, E3 ubiquitin protein ligase-like 1, p = 0.0121) impacted overall survival of early stage laryngopharyngeal patients (Fig 4G and 4H; S8B File).

Markers correlating with pathological parameters.
The genes (n = 701) were assessed for differential expression in the laryngopharyngeal patients triaged based on Table 3. Differential mRNA z-score expression profiling between the altered cases in association with clinico-pathological parameters in laryngopharyngeal cancer. Site specific HNSCC prognosticators  (Fig 6D and 6E). PAK2 was associated with disease free survival (p = 0.033; Fig 6F) in ALI-cases (S8B File). Additional analysis (genes with alterations in >20% patients) in ALI+ patients revealed that RYK, AP2M1 and IL1RAP that relevant in the overall cohort were significantly predictors of OS and DFS (Table 4; S9A File). Among these markers, the overexpression of the combination of RYK and IL1RAP in ALI+ patients, led to low median survival (17.12 vs 108.87 months; p = 0.0006) and low DFS (29.2 vs NA months; p = 0.02) when compared with the cohort without alterations (Fig 6G and 6H; S9B File). The assessment of differential prognosticators of early stage laryngopharyngeal patients with PNI/ALI+ was not carried out due to less number of patients in this cohort.

Molecular prognosticators of oropharyngeal cancers
Meta-analysis identified 2386 genes (Up = 1767; Down = 619) in oropharynx samples (S10A File) (n = 51). GO analysis carried out to evaluate the functional classes indicated that in  Site specific HNSCC prognosticators showed association with ALI-in DFS alone (F; p = 0.033). Prognostic assessment of the subset of meta-analysis based differentials altered in >20% of the patients identified the combination of RYK and IL1RAP to be associated with poor survival (G and H). EPS8 was the only differential in node in node+ECS-patients with no survival impact (I) validation of the differentials associated with node positive with/without ECS indicated that EPS8 was the only differential (downregulated) in node+/ECS+ cohort with no association with survival. Prognostic efficacy (subset altered in >20% patients) indicated that STK3, TMEM267 and PSMD2 were the prognosticators of poor survival (J and K). (P-value; � p<0.05; �� p<0.005). https://doi.org/10.1371/journal.pone.0218989.g006 Site specific HNSCC prognosticators showed maximum representation (S10B File). Analysis also identified N-glycosylation by oligosaccharyl transferase (71.42%; n = 5), Guanine ribonucleotide biosynthesis IMP = > GDP, GTP (46.15%; n = 6), Glycolysis (Embden-Meyerhof pathway), glucose = > pyruvate (44%; n = 25) and ECM-receptor interaction (36.58%; n = 30) were significant pathways in KEGG, while TFAP2 (AP-2) family regulates transcription of cell cycle factors (100%; n = 5), Antagonism of Activin by Follistatin (100%; n = 4) and Phosphorylation of proteins involved in the G2/M transition by Cyclin A:Cdc2 complexes 100%; n = 3) were major pathways in Reactome (S10C File). Validation in the TCGA oropharyngeal cancer cohort (n = 32) indicated that 404 genes were altered at the mRNA expression level in at least 10% of the cohort, while a sub-set of 73 genes showed a high prevalence and were altered in >20% of the patients (S11A File). Analysis of these genes in terms of z-score analysis and prognosis in correlation with stage/pathological parameters and survival analysis was not carried out due to the low number of samples in the sub-categories.
3.5.1 Markers correlating to HPV-associated oropharyngeal cancers. The markers were assessed for their relevance in HPV associated oropharyngeal cancers; the gene set was validated for alterations/survival impact in the TCGA oropharyngeal cohort segregated according to HPV status. Among the 404 gene set, 128 genes were altered in positive cohort (n = 15), while 131 were altered in the negative cohort (n = 17) (S11B and S11C File). Comparison of the alteration status across the two cohorts indicated that 51 genes in HPV+ and 55 in HPVwere either altered only in the respective cohorts or showed a high percentage alteration (S11D File). Assessment of the mRNA expression level (z-score) differences in these genes indicated that 31 genes in HPV+ cohort ( Fig 7A) and 27 genes in HPV-cohort showed significant difference in the expression levels between the two cohorts (p<0.04).

Discussion
Heterogeneity in treatment outcome is a major aspect of concern in solid tumors. In head and neck cancers (HNSCC), diverse behavioral and molecular etiologies drive tumors in anatomically different sites that further vary in the underlying biological basis, histology and treatment response. Consequently, with treatment currently being administered based on clinical/histological parameters, accurate prognostication is still a rising challenge. Given the availability of multi-modality treatments as first line therapy, designing treatment strategies based on accurate understanding of treatment outcome is imperative; identification of molecular markers that can improve accuracy of prognosis is hence of extreme significance. The cataloging of biomarkers that correlate to tumor biology, its clinicopathological characteristics and evaluation of their prognostic ability is an essential step towards identifying candidate markers of clinical utility [8,24]. Significant advances in global profiling technologies has improved the understanding of the mechanisms underlying the disease and thereby generating a repertoire of biomarkers that can be assessed/validated for their clinical relevance. This study attempted to identify clinically relevant, molecular prognosticators pertaining to etiology, site, stage and pathological severity, by the meta-analysis approach.
Etiology based categorization was carried out in terms of HPV associated disease, HPV being the major cause of viral associated disease in HNSCC. Although oropharyngeal cancers are the major site with HPV-associated etiology, recent studies have attributed a role in oral cancers too [11] [25] [26]. In our study, although 63 genes, including RELB, the upregulation of which leads to TRAF regulated over expression of HPV E6 protein [27], only three showed a survival impact. RPPP25, NUDCD2 were poor prognosticators, while, interestingly, NOVA1 was a good prognosticator in patients with HPV. NOVA1 is known to be involved in the downregulation of E6 and E7 proteins in HPV associated cancers [28] and is hence indicated as a good prognosticator in HPV+ patients. These genes can be possible prognosticators in HPV+ cancers, subject to large scale clinical validation.
Site-specific meta-analysis of publicly available data identified a subset of genes that were common across the three major sites of HNSCC (tongue, laryngopharynx, oropharynx); the pathways that were primarily enriched included focal adhesion and proteoglycans in cancer, and cell cycle, mitotic pathway, which are well known in various cancers [29][30][31][32][33][34][35][36][37]. Differential expression profiling of the tongue cancer cohort triaged based on the stage, pathology identified multiple candidate markers; JAM2, SMURF1, LY6E, MFN1 and SUPT16H. JAM2 is reported to be instrumental for metastatic progression in breast and colon cancer [38,39], while others are known regulators of migratory/invasive properties in many cancers [31,[40][41][42][43][44]. Analysis in laryngopharyngeal cohort also identified markers of early stage disease with PNI/ALI which included MFN1, PAK2, PIK3CA, FXR1 (differentials), PRRC2B and CBLL1 a significantly upregulation in oropharyngeal HPV+ cases (B-F). The selected markers (n = 31) were analysed for their significant association with disease free and overall survival in oropharyngeal HPV+ cohort in TCGA; KM plot analysis identified that a combination of ECHD2, GGT6 and HERC5 (all genes uniquely altered in the HPV+ cohort) showed a significant association with overall survival (G; p = 0.0321). Similar validation of the genes in the HPVcohort identified 27 genes altered; mRNA expression levels (z-score) of the top 5 genes [FADD (41%), FKBP9 (24%), FLNA (24%) DNMT3B (18%) and GLO1 (18%)] (H-L) indicated that they were significantly upregulated in the HPV + cohort. Assessment of prognostic efficacy in HPV-cohort showed that a combination of GRB10, EMILIN1, FNDC1, FOXF2 and HLX showed a significant association with overall survival (M; p = 2.939e-6). (P-value; � p<0.05; �� p<0.005; ��� p<0.0005). https://doi.org/10.1371/journal.pone.0218989.g007 Site specific HNSCC prognosticators (with prognostic efficacy), all of which were previously identified to play a major role in head and neck oncogenesis as well as in other cancers [45][46][47][48][49][50][51][52] EPS8, identified in ECS-laryngopharyngeal patients in this study, is known to be involved in the proliferation apoptosis, adhesion and migration in other cancers including HNSCC [53][54][55][56]. The correlation of these markers with the clinically/pathologically classified sub-cohorts (ALI+/PNI+) indicated their possible relevance as predictive panel, with a subset of genes providing survival impact. A comparison of the prognosticators identified for tongue and laryngopharynx indicated that AP2M1 and IGF2BP2 are common across sites, with deregulation of AP2M1, a protein trafficking molecule, signifying contradictory effects in tongue and laryngeal cancer. NUDCD1 also known as CML66, located on the chromosome 8q23, was one of the significant unique poor prognosticator in advanced tongue cancers along with MTFR1, IGF2BP2, TSTA3; genes that associated with survival in tongue cancer, many of these genes have been identified as involved in tumorigenesis, metastases, immune suppression in solid tumors [57][58][59][60]. The prognosis of laryngeal cancers could be improved by addition of markers such as IL1RAP, LANCL2, RYK, and SLC33A1. IL1RAP, interleukin involved in synthesis of pro inflammatory proteins [61], RYK a member receptor protein tyrosine kinases have been reported in drug resistance, cell motility, anchorage independent cell growth and other tumorigenic properties [62][63][64].
In tongue and laryngopharyngeal cancers, advanced stage of the disease signifies poor prognosis. Nevertheless, studies have also shown a subset of early stage patients that have an extremely poor prognosis [65,66]. Biomarkers that can further classify these patients based on outcome can be immensely valuable in this context. Early stage tongue cancer patients were significantly classified into patients with poor/good survival based on alterations in AP2M1, CTBP1, and MTFR1 all of whom were associated with prognosis in other studies [60,67,68]. Additionally OTUD6B alterations, which was prognostic in advanced stage (III-IV) patients was also significant predictor of survival in early stage patients with perineural invasion/angiolymphatic invasion (Stage I-II/PNI+/ALI+), indicating that this marker is a clear indicator of advanced disease. Expectedly, the prognosticators of early stage laryngopharyngeal cohort were completely unique and included TWISTNB and UBE2Q1, previously designated poor prognosticators in breast and hepatocellular cancers [69,70].
Pathological parameters that include perineural invasion, angiolymphatic invasion, and extracapsular spread signify poor prognosis; assessment of markers associated with these parameters, identified the adjunct predictors. Interestingly, there were no common significant predictor of ALI/PNI across the two sites (MFN1 was a differential but did not show relevance as a prognosticator). In PNI/ALI+ tongue cancer, the panel of TFG (tumor suppressor gene) [71], TMEM267, BRIX1 (good prognosticators), NUDCD1 and PRKCI (poor prognosticators) together could categorize the patients based on survival; PRKC1 expression is known to be associated with metastasis and poor prognosis in esophageal/ovarian cancer [72,73]. The predictor panel in patients with PNI/ALI in laryngopharyngeal cancer consisted of AP2M1, LANCL2, PFN2, RPN1, MAP3K13, WWTR1 and IL1RAP; WWTR1 being a transcriptional coactivator regulating cell proliferation, differentiation, survival and apoptosis in oral cancer cells [74,75], while LANCL2 is associated with the EGFR pathway in many cancer [76,77]. The association of these markers with PNI and ALI indicated their possible role in the processes apart from their significance as adjunct prognosticators. Assessment of ECS, an additional prognosticator in advanced stage disease, indicated a distinct set of markers in tongue cancer (EXT1, GMPS, TSTA3), while in laryngopharyngeal cancers a majority of the markers were common those associated with late stage disease. Notably, in tongue cancer patients without ECS, the markers of advanced disease were poor prognosticators indicating their relevance as adjunct prognosticators in the absence of clinical/pathological parameters.
The oropharyngeal cohort was assessed separately, given the high prevalence of HPV-associated etiology. Low sample numbers precluded extensive validation in TCGA patient cohort; nevertheless a distinct alteration pattern of candidate genes was observed between HPV+ and HPV-cohorts of oropharyngeal cancers. A distinct set of prognosticators were identified for HPV+ and HPV-oropharyngeal cancers; ECHDC2, GGT6, HERC5 were showed to be associated with HPV+; HERC5 has a known anti-viral activity [78] [79] and its possible role in HNSCC HPV associated cancer is yet to be studied. On the other hand alterations in GRB10, EMILIN1, FNDC1, FOXF2 and HLX, determined poor prognosis in the HPV-cohort; this subset is involved in multiple carcinogenic processes in different solid tumors [80] [81] [82] [83] [84] [85].
The clinical relevance of molecular prognosticators is contentious, given the vast repertoire of data, and the lack of adequate validation to accurately pin-point the beneficial patient cohort. In view of the extremely clinical, pathological, cellular and molecular heterogeneity inherent in all cancers including head and neck cancers, the utility of the biomarkers need to be customized to the sub-categories of patients with definite clinico-pathological parameters prior to validation. This study was an attempt to leverage the existing high throughput studies to specify the predictive/prognostic marker pattern that correlate to the various clinic-pathological sub-types in tongue/ laryngopharyngeal/oropharyngeal cancers, the most common sub-sites of head and neck cancer. Although a common thread of pathways/biomarkers was observed, the distinct marker subset that represented each of the sub-cohorts emphasizes the point of discussion. The primary limitation was that this study was confined to three sub sites; it can be expanded to other sites in order to enable accurate triaging of patients based on risk and thereby provide appropriate treatment. Marker panels that classified the sub-cohorts with historically good prognosis, such as the patients with early stage disease, patients without extra-capsular spread, can prove to be invaluable candidates to improve on the current prognostic indicators (S2 Fig). Distinct and large scale clinical validation is mandatory prior to the adoption of these markers into a clinical setting; nevertheless this study points to the need of customized marker mapping in patients and provides a database of annotated candidates for subsequent validation.