High ELF4 expression in human cancers is associated with worse disease outcomes and increased resistance to anticancer drugs

The malignant phenotype of tumour cells is fuelled by changes in the expression of various transcription factors, including well-studied proteins such as p53 and Myc. Despite significant progress, little is known about several other transcription factors, including ELF4, and how they help shape the oncogenic processes in cancer cells. To this end, we performed a bioinformatics analysis to understand in detail how variations in ELF4 expression in human cancers relate to disease outcomes and to the drug responses of cancer cells. Here, using ELF4 mRNA expression data from 9,350 samples from The Cancer Genome Atlas pan-cancer project, we identify two groups of patient tumours across 32 different human cancers: those that expressed high levels of ELF4 transcripts and those that expressed low levels. We show that patients segregated into these two groups have different clinical outcomes. Further, we find that tumours that express high ELF4 mRNA levels tend to be of a higher grade, afflict a significantly older patient population and have a significantly higher mutation burden. By analysing the dose-response profiles of 612 well-characterised human cancer cell lines to 397 anti-cancer drugs, we discover that cell lines that expressed high levels of ELF4 mRNA transcripts are significantly less responsive to 129 anti-cancer drugs and significantly more responsive to only three drugs: dasatinib, WH-4-023, and ponatinib, all of which, remarkably, target the proto-oncogene tyrosine-protein kinase SRC and tyrosine-protein kinase ABL1. Collectively, our analyses show that, across the 32 different human cancers, patients afflicted with tumours that overexpress ELF4 tended to have a more aggressive disease that is also more likely to be refractory to most anti-cancer drugs, a finding upon which novel strategies for tumour categorisation, treatment, and prognosis could be devised.


Introduction
Genetic alterations in several transcription factors lead to genome-wide transcription changes that drive the malignant phenotypes [1][2][3]. Although many of these transcription factors, such as ELF4, a member of the E74-like factor (ELF) family, play essential physiological roles in healthy proliferating cells, a series of genetic aberrations within the genes coding for these factors results in oncogenic behaviour of healthy cells [4,5]. However, we know far less about how the expression of ELF4 affects the aggressiveness of disease across many human cancer types and their response to drug perturbations. Furthermore, despite the keen interest in determining how perturbations of ELF4 relate to specific aspects of malignant cells, compared to other transcription factors such as TP53 (PubMed score of 46333.87) and MYC (13294.24), ELF4 (35.41) remains among the least studied transcription factors in human cancers [6,7]. Analysis of ELF4 in various human cancers has revealed that it plays essential roles in cellular differentiation, proliferation, and apoptosis in cancers of the prostate and breast [8], and that it is paradoxically associated with both oncogenic activity [3,8] and tumour suppressor roles [9]. Also, mutations in the coding sequence and mRNA expression variations of ELF4 have been reported in various human cancers [3,5,10]. These and other recent studies [4,7,11,12,13] have also generated a new appreciation of how aberrant ELF4 influences cancer development, the anti-cancer drug response of tumours, and the disease treatment outcomes.
Our current understanding of ELF4 and other cancer genes has largely been facilitated by large cancer profiling projects such as The Cancer Genome Atlas (TCGA) [14] and the International Cancer Genome Consortium [15]. Data generated by these large cancer profiling projects have guided the identification of frequently altered cancer genes and cancer type-specific regulators. Additionally, large-scale drug response screening projects, such as the Genomics of Drug Sensitivity in Cancer (GDSC) project [16], have been valuable in providing well-characterised cancer cell lines as models of disease for both drug discovery and the evaluation of drug action dependencies of cancer cells [17][18][19][20].
Here, we investigate the mRNA expression variations of ELF4 across pan-cancer TCGA tumours to identify the changes that have meaningful clinical value. By linking this information with the drug response profiles from the GDSC, we also investigate variations in the drug response of cancer cell lines that are associated with ELF4 mRNA levels. Besides enabling the identification and selection of the most suitable anti-cancer drugs to treat tumours with different ELF4 expression signatures, a multiscale understanding of ELF4 across different cancer types will likely yield better prediction of disease outcome.

Methods
The study protocol was approved by the University of Cape Town; Health Sciences Research Ethics Committee IRB00001938. The analyses in this study utilised publicly available datasets collected by the TCGA, CCLE and GDSC from consenting participants. Here, all analyses were performed following the relevant policies, regulations, and guidelines provided by the TCGA, CCLE and GDSC to analyse their datasets and report the findings.
We accessed and processed a pan-cancer TCGA project dataset derived from 9,350 patients afflicted with 32 distinct human cancers [21]. These datasets include uniformly processed mRNA expression data, gene mutation data, and comprehensive deidentified clinical data for all patients.
Then we segregated the patient tumours across the pan-cancer studies into two groups: those that expressed a higher level of ELF4 mRNA transcripts and those that expressed a lower level of ELF4 mRNA transcripts. To achieve this, we first applied z-normalisation to the measured ELF4 mRNA transcript levels across all the tumours. Then we considered those tumours with an ELF4 mRNA z-score > 0 to have high ELF4 levels (3,887 tumours) and those tumours with an ELF4 mRNA z-score < 0 to have low ELF4 levels (5,463 tumours).
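The grouping rule above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function name, the sample IDs and the expression values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator or which expression scale was used.

```python
from statistics import mean, pstdev


def split_by_zscore(expression):
    """Split samples into (high, low) lists by the sign of their z-score.

    expression: dict mapping a sample ID to its ELF4 mRNA expression value.
    """
    values = list(expression.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; the text does not say
    if sigma == 0:
        raise ValueError("expression values are constant; z-scores are undefined")
    high, low = [], []
    for sample_id, value in expression.items():
        z = (value - mu) / sigma
        if z > 0:
            high.append(sample_id)
        elif z < 0:
            low.append(sample_id)
        # z == 0 is left unassigned, mirroring the strict inequalities in the text
    return high, low


# Hypothetical expression values, for illustration only
toy_expression = {"S1": 2.0, "S2": 4.0, "S3": 9.0, "S4": 1.0}
high, low = split_by_zscore(toy_expression)
print("high-ELF4:", high)
print("low-ELF4:", low)
```

Because the cut-off is the mean rather than the median, the two groups need not be of equal size, which is consistent with the unequal group sizes reported above.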

Survival analysis
We used the clinical information provided by The Cancer Genome Atlas and the corresponding TCGA sample IDs to match the patient samples to the appropriate clinical outcomes and sample features. Then we used the Kaplan-Meier approach to estimate and compare the duration of the overall survival and disease-free survival periods between patients whose tumours expressed high levels of ELF4 mRNA transcripts and those whose tumours expressed low levels [22]. Furthermore, within each of the 32 cancer types, we used the Kaplan-Meier method to compare the duration of the overall survival periods between patients afflicted with high-ELF4 tumours and those afflicted with low-ELF4 tumours (see S1 File).
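As an illustration of the Kaplan-Meier product-limit estimator referred to above, the following Python sketch computes a survival curve and the median survival time from follow-up times and event indicators. The function names and the toy data are hypothetical, and the sketch does not include the statistical comparison between groups.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up duration of each patient (e.g. in months)
    events: 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which survival is at or below 0.5; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data, for illustration only
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
print(median_survival(toy_curve))  # 12
```

When the curve never falls to 0.5, `median_survival` returns `None`, which corresponds to an undefined median survival in a group where more than half of the patients are still alive at the end of follow-up.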

Cancer grades across tumours that expressed high and low levels of ELF4
We matched the TCGA sample IDs to the TCGA IDs provided in the clinical sample information to link the tumours that expressed either low or high ELF4 levels to the samples' clinical features. Then we performed a Chi-squared test to compare the distribution of tumours of various grades between the tumours that expressed high ELF4 levels and those that expressed low ELF4 levels.
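For readers who want to see how such a comparison works, the sketch below runs a Pearson Chi-squared test and computes an odds ratio on a 2 × 2 table. It assumes, purely for illustration, that grades are collapsed into high-grade versus low-grade; the counts are hypothetical and no continuity correction is applied.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Rows: high-ELF4, low-ELF4. Columns: high-grade, low-grade.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the Chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]]."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only
chi2, p_value = chi_squared_2x2(30, 10, 20, 40)
print(chi2, p_value, odds_ratio(30, 10, 20, 40))
```

With more than two grade categories the same statistic generalises to a k × 2 table with k − 1 degrees of freedom, in which case the closed-form p-value used here no longer applies.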

Distribution of tumours that expressed high and low ELF4 across cancer types
Across each of the 32 TCGA pan-cancer analyses, we counted the absolute number of tumours that expressed high ELF4 transcript levels and those that expressed low ELF4 transcript levels. We also obtained the overall percentage of tumours that expressed higher or lower mRNA transcript levels of ELF4 across each cancer type.

Comparing median patients' age, the number of mutations and fraction of the genome altered across the tumour groups
Across the two groups of tumours that expressed high ELF4 and low ELF4, we obtained each patient's age. Furthermore, we calculated the number of gene mutations and the fraction of the genome altered in each tumour. Then we used the nonparametric Mann-Whitney U-test for equality of medians to compare the median age, the median number of mutations per tumour and the median fraction of the genome altered per tumour between the two groups [23].
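The Mann-Whitney U-test used here can be sketched in Python as follows. This is an illustrative implementation using the large-sample normal approximation with a tie correction and no continuity correction, which may differ in detail from the MATLAB routine used in the study; the ages shown are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for sample x, z statistic, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        mid_rank = (i + 1 + j) / 2.0   # average of ranks i+1 .. j
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all values are identical")
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, z, p_value


# Hypothetical patient ages, for illustration only
ages_high = [63, 70, 58, 66, 72]
ages_low = [50, 55, 61, 49, 57]
u_stat, z_stat, p_val = mann_whitney_u(ages_high, ages_low)
print(u_stat, z_stat, p_val)
```

The normal approximation is only reliable for reasonably large groups; for samples as small as the toy data an exact test would normally be preferred.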

Dose-response of cancer cell lines that expressed high and low ELF4 levels
We obtained 244,656 dose-response profiles of 612 cancer cell lines to 397 anti-cancer drugs from the Genomics of Drug Sensitivity in Cancer (GDSC) database [16]. The cell lines within the GDSC database represent 31 different human cancers. We also obtained the mRNA expression levels of the ELF4 gene in these 612 cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE) [24]. Using the mRNA expression data from the CCLE, we applied the approach described above (for the primary TCGA tumours) to segregate the cell lines into two groups: 1) those with high ELF4 transcript levels and 2) those with low ELF4 transcript levels. Then, for each of the 397 anti-cancer drugs, we used the Welch test to compare the mean z-score-transformed IC50 values between the cell lines with high ELF4 levels and those with low ELF4 levels (also see S2 File).
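The Welch comparison applied to each drug can be sketched as below. The code computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two groups of IC50 z-scores; the function name and values are hypothetical. Converting the statistic to a p-value requires the Student's t distribution, which is not available in the Python standard library and is therefore left out of this sketch.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's unequal-variance t statistic and approximate degrees of freedom."""
    n1, n2 = len(x), len(y)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    v1 = variance(x) / n1  # squared standard error of the mean of x
    v2 = variance(y) / n2  # squared standard error of the mean of y
    if v1 + v2 == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical IC50 z-scores for one drug, for illustration only
ic50_high = [1.0, 2.0, 3.0, 4.0]
ic50_low = [0.0, 0.5, 1.0, 0.5, 0.0]
t_stat, dof = welch_t(ic50_high, ic50_low)
print(t_stat, dof)
```

A positive statistic here means a higher mean IC50 in the first group, that is, lower sensitivity to the drug. In a screen of 397 drugs this function would be called once per drug, and the resulting p-values would then be corrected for multiple testing.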

Differential gene expression and pathway analysis
We used the moderated Student's t-test employing the negative binomial model to identify differentially expressed mRNAs between PAAD, BRCA, SARC, KIRC and LUAD tumours that expressed higher ELF4 and those that expressed lower ELF4 (see S3 File) [25,26]. Additionally, pathway enrichment analyses were performed using lists of significantly up-regulated genes in PAAD, BRCA, SARC, KIRC and LUAD tumours that expressed high ELF4. These lists were used to query Enrichr to return Reactome pathways enriched in the high-ELF4 tumour samples (Fig 5, S4 File).
We also used a custom MATLAB script to create the connected network that links ABL1, SRC and ELF4 based on known protein-protein interactions extracted from the Kinase Enrichment Analysis database [27], the Chromatin-immunoprecipitation Enrichment Analysis database [28], the UCSC super pathway databases, and the Database of Protein, Genetic and Chemical Interactions (BioGrid) [29].

Statistical analyses
All statistical analyses were performed in MATLAB 2020a. Fisher's exact test was utilised to assess associations between categorical variables. The Welch test for normally distributed data and the Mann-Whitney U-test for non-normally distributed data were used to compare continuous variables. Here, the data distributions were assessed using the Kolmogorov-Smirnov test for goodness of fit [30]. Statistical tests were two-sided and considered significant at p < 0.05 with the Benjamini-Hochberg correction applied where multiple comparisons were involved.
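To make the multiple-comparison step concrete, the following Python sketch implements the standard Benjamini-Hochberg step-up adjustment. It is an illustration rather than the MATLAB routine used in the study, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
print(adj_p, significant)
```

Each adjusted value is the smallest false discovery rate at which the corresponding test would be declared significant, so comparing it with 0.05 applies the threshold stated above.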

The expression of ELF4 transcript varies across human tumours
We obtained and analysed a TCGA pan-cancer dataset comprising the mRNA expression levels of the ELF4 gene and comprehensive deidentified clinical information. The TCGA collected these datasets from 9,350 patients with 32 distinct human cancers. We used z-score normalisation to segregate the patient tumours into two groups: those with higher levels of ELF4 (which we named the "high-ELF4 tumours"; 3,887 samples) and those with lower levels of ELF4 ("low-ELF4 tumours"; 5,463 samples; see the Methods section).
We examined whether the two groups of cancers were associated with different clinical outcomes. Remarkably, using the Kaplan-Meier method [22], we found that the duration of the overall survival (OS) period for patients with high-ELF4 tumours (OS = 51.9 months) was significantly shorter (log-rank test; p = 7.99 × 10^-64) than that for patients with low-ELF4 tumours (OS = 114.7 months; Fig 1A). Correspondingly, we observed that the disease-free survival (DFS) periods were significantly shorter (log-rank test; p = 1.07 × 10^-07) for the patients with high-ELF4 tumours than for the patients with low-ELF4 tumours (Fig 1B).
These findings reveal an association between the expression levels of ELF4 and the clinical outcomes across various cancer types.

Disease factors associated with ELF4 expression
To understand why patients with tumours that express higher levels of ELF4 tended to have worse clinical outcomes (or a more aggressive disease), we assessed the distribution of various tumour grades across the two categories of tumours. Here, we found a significant association between high-ELF4 tumours and high-grade tumours (Fisher's exact test; odds ratio = 6.1, p = 4.5 × 10^-47; Fig 1C).
We found that the median age of the patients with tumours that expressed higher levels of ELF4 (median = 63 years) was significantly higher than that of patients with tumours that expressed lower levels of ELF4 (median = 58 years; Wilcoxon rank-sum test, Z = 119.2, p < 1 × 10^-300; Fig 1D). We then compared the median number of mutations per tumour across the two groups. Here, we found a higher mutation burden in tumours that expressed higher levels of ELF4 (median = 83 mutations per tumour) than in those that expressed lower levels of ELF4 (median = 38 mutations per tumour; Z = 119.2, p < 1 × 10^-300; Fig 1E). Correspondingly, we found a higher fraction of the genome altered in the high-ELF4 tumours (median = 0.2605 per tumour) than in the low-ELF4 tumours (median = 0.1499; Z = 200, p < 1 × 10^-300; Fig 1F). To further explore the relationship between patients' age and the mutation load of their tumours, we measured the Pearson's linear correlation coefficient between the patients' age and the number of mutations found within their tumours. Here, we found that these were significantly positively correlated (Pearson's correlation = 0.32; p = 9.27 × 10^-205; Fig 1D).
Collectively, these results show that high-ELF4 tumours, compared to low-ELF4 tumours, tend to be of a higher grade, afflict a significantly older patient population and have a significantly higher mutation burden.

The effect of ELF4 expression varies by cancer type
Next, we compared the duration of the overall survival periods between patients with tumours that expressed high and low levels of ELF4 across the 32 cancers. Here, for 19 out of 32 cancer types, we found that the overall survival periods were shorter for patients with high-ELF4 tumours than those with low-ELF4 tumours (Fig 2A; S1 File). Among these nineteen cancer types are kidney renal clear cell carcinoma, kidney renal papillary cell carcinoma and liver hepatocellular carcinoma. Surprisingly, for six cancer types, we found that the overall survival periods were longer for patients with high-ELF4 tumours than those afflicted by low-ELF4 tumours. These included, among others, urothelial bladder carcinoma, and adrenocortical carcinoma. For another seven cancer types, including diffuse large B-cell lymphoma and kidney chromophobe, we could not establish the overall influence of the ELF4 expression, as the median OS periods were undefined (i.e., more than 50% of the patients were alive by the reporting time) for both the high-ELF4 cancer patients and the low-ELF4 cancer patients.
Furthermore, we explored how the tumours of each cancer type were distributed between the two pan-cancer groups (high-ELF4 and low-ELF4). We found that ELF4 expression varies between cancer types; for example, brain lower-grade gliomas (99.9% of the tumours) and prostate adenocarcinomas (99.6%) tended to express lower ELF4 levels, whereas acute myeloid leukaemias (92.7%) and oesophageal adenocarcinomas (92.8%) tended to express higher levels of ELF4 (Fig 2B).
Overall, these findings show that while the expression of ELF4 is associated with increased disease aggressiveness of most cancer types, there are some exceptions.

ELF4 expression is associated with significant resistance to most anticancer drugs
Since it is virtually impossible to test primary patient tumours that express either high or low ELF4 levels against many anti-cancer drugs, we used cancer cell lines to test for differences in drug response. From the GDSC database, we collected the dose-response profiles of 612 cancer cell lines to 397 small molecule inhibitors [16], and we obtained their ELF4 mRNA expression levels from the CCLE database [24]. We categorised the cell lines into two groups: 1) those that expressed high levels of ELF4 mRNA transcripts (we refer to these as the "high-ELF4 cell lines") and 2) those that expressed low levels of ELF4 mRNA transcripts (the "low-ELF4 cell lines"; see the Methods section). Here, the categories of cancer cell lines whose ELF4 mRNA levels correspond to those of cancer cells from the patient groups could be used to examine how ELF4 levels in the patients' tumour cells are likely to influence the effectiveness of particular anti-cancer drugs.
Therefore, we compared the z-score-normalised half-maximal inhibitory concentration (IC50) values between the two categories of cell lines for 397 anti-cancer drugs. Remarkably, we uncovered differences between the high-ELF4 cell lines and the low-ELF4 cell lines in their observed dose-response to 130 drugs (Fig 3A).
Here, our results indicate that the expression of ELF4 is not only associated with the clinical outcomes of the patients but might also be a relevant variable for predicting the response of tumours to various anti-cancer agents.
To elucidate the mechanism by which dasatinib, WH-4-023, and ponatinib may regulate ELF4, we constructed a prior-knowledge network that connects ABL1, SRC and ELF4 (Fig 5A). Here, we found that ELF4 acts downstream of ABL1 and SRC, with MDM2, PML, TP63 and CDK2 acting as intermediary signalling proteins (Fig 5B).
Furthermore, we attempted to determine whether any signalling pathways were enriched among the overexpressed genes of pancreatic adenocarcinomas (PAAD), kidney renal clear cell carcinomas (KIRC), breast invasive carcinomas (BRCA) and lung adenocarcinomas (LUAD) that expressed higher ELF4 compared to those that expressed lower ELF4 (see S3 File). Here, strikingly, we found that the overexpressed genes in the high-ELF4 PAAD, KIRC, BRCA and LUAD tumours were all enriched for the following pathways: 1) Defective C1GALT1C1 causes Tn polyagglutination syndrome, 2) Defective GALNT12 causes colorectal cancer, 3) Defective GALNT3 causes familial hyperphosphatemic tumoral calcinosis, 4) O-linked glycosylation of mucins (Homo sapiens), and 5) Termination of O-glycan biosynthesis (Fig 5C-5F, also see S4 File). These results corroborate published observations that link aberrant glycosylation to the increased resistance of cancer cells to drug perturbation [31][32][33].

Discussion
We examined the relationship between the mRNA expression level of ELF4 in thousands of primary tumours and cancer cell lines of 32 different cancer types and both the clinical outcomes and the likely anti-cancer drug responses. Others have studied ELF4 in the context of tumorigenesis in cancers of the skin, breast and prostate [3,10,12,13]. To the best of our knowledge, our study is the first to characterise the consequences of ELF4 expression variations across many distinct cancer types based on so many (9,350) primary tumours.
We showed that clinically relevant subtypes of patient tumours, across many cancer types, could be achieved by simply dividing the tumours into two categories based entirely on the expression of ELF4: those tumours which express high ELF4 levels and those that express low ELF4 levels. Just as others have shown that the expression of various genes, including kinases such as the phosphoinositide 3-kinase and transcription factors such as p53, can have clinical implications [34][35][36][37][38], we show here that patients with low-ELF4 tumours tend to have significantly better clinical outcomes than patients with high-ELF4 tumours. Our results suggest that the tumour's ELF4 expression levels are directly correlated with the aggressiveness of cancer.
Our analyses indicate that high-ELF4 tumours are more common among older patients, and that these tumours also tend to have a higher mutation load (Fig 1D-1F). The frequency of somatic mutations increases with age in both non-cancerous and cancerous tissue [39][40][41][42]; therefore, we suggest that the link between the expression of ELF4 and the mutation burden could be exploited to establish therapeutics that are more effective in treating specific cancers. Whereas ELF4 is virtually undruggable, we could employ a network analysis approach to identify the upstream regulatory proteins that could be targeted to treat tumours that overexpress ELF4 [43,44]. Furthermore, tumours with more mutations are more aggressive and respond poorly to most anti-cancer drugs [45]. Accordingly, we discovered higher gene mutation rates in high-ELF4 tumours than in low-ELF4 tumours, which could explain the poorer outcomes of the patients afflicted with high-ELF4 tumours. Taken together, we suggest that a combination of factors, including the higher tumour grade, the higher mutation burden and the older patient population, may in part explain why patients afflicted with high-ELF4 tumours have worse survival outcomes.
There is a link between the expression of specific genes in cancer cells and variations in the response of these cells to drug perturbation; the expression levels of these genes can thus impact disease treatment outcomes [24,46,47,48,49,50]. With the advent of personalised medicine, the aim is increasingly to predict the drugs to which a tumour is most likely to respond [51,52]. Here, to surmount the impracticality of testing hundreds of individual drugs on a specific tumour, cell lines whose genetic features resemble those of the tumour are convenient for inferring the drug responses of that tumour [16,24,53,54]. Accordingly, by probing the drug response profiles of cancer cell lines, we showed that high-ELF4 tumours and low-ELF4 tumours may respond differently to many anti-cancer drugs. More specifically, compared to the low-ELF4 tumours, the high-ELF4 tumours tended to be significantly more responsive to only three drugs (dasatinib, WH-4-023 and ponatinib) out of 397 anti-cancer drugs. Since dasatinib and ponatinib both also target the ABL kinase, and dasatinib, WH-4-023 and ponatinib all target the SRC kinase, our finding suggests that the high-ELF4 cancers are likely to signal through the SRC and ABL kinases.
Our hypothesis is supported by previous studies showing that the loss of ELF4 causes a profound abrogation of BCR-ABL-associated Chronic Myeloid Leukemia [20] and that the Promyelocytic Leukemia (PML) gene interacts directly with ELF4 [55]. Furthermore, our analysis of the prior-knowledge network that connected the SRC and ABL kinases to ELF4 showed that MDM2, PML and TP63 are the likely intermediary proteins that link these kinases to ELF4 [55,56]. Here, we suggest that many other cellular proteins interact with ELF4, but these interactions remain unknown since ELF4 is not as extensively studied (25 known interactors in the BioGrid database) as other proteins such as p53 (3,077 known interactors) and c-MYC (7,325 known interactors) [29]. Therefore, many more experiments and computational analyses are required to elucidate the various pathways and mechanisms by which SRC and ABL1 affect the activity of ELF4.
This finding implies that, aside from the high-ELF4 tumours being significantly more aggressive than the low-ELF4 tumours, patients with high-ELF4 tumours may also exhibit worse clinical outcomes owing to the refractory nature of these tumours to most anti-cancer drugs. Additionally, since refractory tumours require more aggressive treatment protocols with higher drug doses, it would follow that patients with high-ELF4 tumours are likely to be exposed to such treatments and would thus tend to experience more adverse drug effects that might unfavourably impact their survival [57][58][59][60].

Conclusion
Altogether, we have shown that the expression of ELF4 varies across tumours of the 32 cancer types investigated and that there is a link between the expression levels of ELF4 in tumours and the age of patients, the mutation burden, and the clinical outcomes. Further, our analyses of the dose-response profiles of well-characterised cell lines have revealed that the expression levels of ELF4 in tumours may be related to the varied responses of cancer cells to drug perturbation, a finding that could help shape precision medicine.
Supporting information
S1 File. All the cancer studies, the abbreviation of each cancer study and the number of samples. Within-cancer-type comparisons of the duration of overall survival for patients with tumours that expressed high and low ELF4 levels.