Abstract
Oesophageal adenocarcinoma (OAC) is the 7th most common cancer in the United Kingdom (UK) and remains a significant health challenge. This study presents a proteomic analysis of seven OAC donors, complementing our previous neoantigen identification study of their human leukocyte antigen (HLA) immunopeptidomes. Our small UK cohort was selected from donors undergoing treatment for OAC. We used label-free mass spectrometry proteomics to compare OAC tumour tissue to matched normal adjacent tissue (NAT) and to quantify expression of 3552 proteins. We identified differential expression of a number of proteins previously linked to OAC and other cancers, including common markers of tumourigenesis and immunohistological markers, as well as enrichment of processes and pathways relating to RNA processing and the immune system. Our findings also offer insight into the role of protein stability in the generation of an OAC neoantigen we previously identified. These results provide independent corroboration of existing oesophageal adenocarcinoma biomarker studies that may inform future diagnostic and therapeutic research.
Citation: Nicholas B, Bailey A, McCann KJ, Walker RC, Johnson P, Elliott T, et al. (2025) Comparative analysis of protein expression between oesophageal adenocarcinoma and normal adjacent tissue. PLoS ONE 20(3): e0318572. https://doi.org/10.1371/journal.pone.0318572
Editor: Alexis G. Murillo Carrasco, Instituto do Cancer do Estado de Sao Paulo / University of Sao Paulo, BRAZIL
Received: October 12, 2024; Accepted: January 19, 2025; Published: March 12, 2025
Copyright: © 2025 Nicholas et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD054428 and 10.6019/PXD054428
Funding: This study was supported by a Cancer Research UK Centres Network Accelerator Award Grant (C328/A21998). Instrumentation in the Centre for Proteomic Research is supported by the Biotechnology and Biological Sciences Research Council (BM/M012387/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Oesophageal adenocarcinoma (OAC) accounts for about 2% of all cancer diagnoses in the UK, with an increase in 10-year survival from 4% to 12% in the last 50 years [1]. Treatment options centre on resection of the oesophagus in early-stage OAC, and chemo- or radiotherapy combined with surgery for later-stage OAC [2]. Previously we presented proof-of-concept findings using mass spectrometry proteomics to identify human leukocyte antigen (HLA) presented neoantigens from a cohort of OAC donors as targets for cancer vaccines [3]. In the UK population, the most recent data for proportions of tumour stage at diagnosis were I = 5.8%, II = 11.4%, III = 24.4%, IV = 38% and 20.4% unknown. 69% of diagnoses were in men and 31% in women, with a median age at diagnosis of 71 [4]. Here we present results from a small selected sample of 7 donors from the underlying OAC population, all of whom were undergoing surgery as part of their treatment. They comprised tumour stages II = 28.6%, III = 57.1% and IV = 14.3%, and our samples were all from men, with a lower median age at diagnosis of 68 (S1 Table). Rather than population-level inferences, our focus here was on characterising tumour tissues. We performed a comparative proteomic analysis of OAC tumour tissue and matched normal adjacent tissue (NAT). Using label-free quantification (LFQ) of bottom-up mass spectrometry proteomics, we sought to identify differential expression of proteins (DEP) between OAC and NAT that may inform diagnosis and treatment options.
Results
We quantified 3552 proteins across 7 patients using label-free quantification (LFQ) [5,6], yielding protein identifications from the normalised top 3 peptide intensities (S2 Table). To confirm that we could distinguish between OAC and NAT tissues using protein expression, we performed Principal Component Analysis (PCA) using the normalised top 3 peptide intensities of the 500 most variable proteins (Fig 1) [7]. The PCA yielded clear separation between OAC and NAT along PC1, which accounted for 45% of the variance between the tissues. However, whilst the NAT samples were tightly grouped, the tumour samples were more dispersed, indicating some heterogeneity between tumours, most notably for donor EN-430-11 (Fig 1A). Plotting the PCA loadings to examine the proteins driving the separation towards the OAC samples indicated three proteins: Keratin, type I cytoskeletal 18 (K1C18), Anterior gradient protein 2 homolog (AGR2) and Gamma-interferon-inducible lysosomal thiol reductase (GILT) (Fig 1B). Over 90% of the variation between the matched OAC and NAT samples was accounted for by the first 10 principal components (Fig 1C).
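The variable-selection and variance-decomposition steps described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the DESeq2/PCAtools workflow used in the study. The intensity matrix is randomly generated, and the function name `pca_top_variable` is hypothetical.

```python
import numpy as np

def pca_top_variable(intensities, n_top=500):
    """PCA on the n_top most variable rows of a proteins x samples matrix.

    Returns (scores, explained): scores is samples x components, and
    explained is the fraction of total variance carried by each component.
    """
    # Rank proteins by variance across samples and keep the top n_top.
    variances = intensities.var(axis=1, ddof=1)
    keep = np.argsort(variances)[::-1][:n_top]
    subset = intensities[keep]

    # Samples become observations; centre each protein (column) on zero.
    x = subset.T
    x = x - x.mean(axis=0)

    # Singular value decomposition yields the principal components.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u * s
    explained = s**2 / np.sum(s**2)
    return scores, explained

# Hypothetical data (not from the study): 1000 proteins x 14 samples,
# with 100 proteins elevated in the first 7 samples.
rng = np.random.default_rng(0)
data = rng.normal(loc=20.0, scale=1.0, size=(1000, 14))
data[:100, :7] += 3.0

scores, explained = pca_top_variable(data, n_top=500)
cumulative = np.cumsum(explained)  # values a scree plot would summarise
```

The cumulative sum of `explained` corresponds to the quantity summarised in a scree plot such as Fig 1C.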
(A) PCA of normalised top 3 peptide intensities of 500 most variable proteins. OAC (red) & NAT (grey). Samples are numbered with donor identifier. (B) PCA plot with the loadings from proteins contributing to PC1 and PC2. (C) Scree plot of contribution of principal components to total variance.
We then grouped the samples according to OAC or NAT and calculated differential protein expression (DEP) [8]. Of the 3552 proteins, we found 419 DEPs for OAC and 40 DEPs for NAT at thresholds of a log2 fold-change of greater than 1 and p-value of less than 1% (Fig 2A). These and the other thresholds used here are necessarily arbitrary and chosen to balance being conservative whilst not over-excluding information. The data without thresholds is provided in Supporting Information S3 Table.
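As an illustration of how the two thresholds combine, the sketch below labels a protein from its log2 fold change and p-value. It only demonstrates the thresholding step, not the DEqMS statistics. The protein names and values are hypothetical, and the convention that a positive fold change means higher expression in OAC is an assumption of the example.

```python
def classify_protein(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.01):
    """Label a protein as higher in OAC, higher in NAT, or not significant.

    Assumes positive log2 fold change means higher expression in OAC.
    """
    if p_value >= p_cutoff:
        return "not significant"
    if log2_fc > fc_cutoff:
        return "OAC"
    if log2_fc < -fc_cutoff:
        return "NAT"
    return "not significant"

# Hypothetical (log2 fold change, p-value) pairs; not values from the study.
examples = {
    "protein_a": (2.4, 0.0005),   # large change, small p-value
    "protein_b": (-1.8, 0.004),   # lower in tumour
    "protein_c": (0.6, 0.0001),   # significant but small change
    "protein_d": (3.0, 0.2),      # large change but not significant
}
labels = {name: classify_protein(fc, p) for name, (fc, p) in examples.items()}
```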
(A) Volcano plot of differentially expressed proteins for OAC vs NAT. Proteins are labelled with gene names. Thresholds are represented by dotted lines at p-value of 1% and log2 fold change of 1. (B) Heatmap of DEPs below a FDR of 2% (n = 92). Colour bar shows log2 fold change rescaled as z-scores, i.e., each unit from zero represents one standard deviation from the row average value for each protein.
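The row-wise z-score rescaling mentioned in the caption can be sketched as below. This is an illustrative Python version rather than the pheatmap code used for the figure. The input values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev

def row_z_scores(values):
    """Rescale one protein's values across samples to z-scores."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1); an assumption
    if sd == 0:
        # A constant row has no spread; report zeros rather than divide by 0.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

row = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values for one protein
z = row_z_scores(row)
```

After rescaling, each row has mean zero and unit standard deviation, so colours are comparable between proteins with very different absolute abundances.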
As indicated by the PCA, K1C18 and AGR2 were the most differentially expressed proteins in OAC. We identified high expression in OAC of Endoplasmic reticulum chaperone BiP (HSPA5), Deoxynucleoside triphosphate triphosphohydrolase SAMHD1 (SAMHD1) and Rho GDP-dissociation inhibitor 2 (ARHGDIB). Other notable OAC DEPs were Cell division cycle 5-like protein (CDC5L), Metalloproteinase inhibitor 1 (TIMP1), Matrix metalloproteinase-9 (MMP9) and Lysosome-associated membrane glycoprotein 1 (LAMP1) (S3 Table). Notable in NAT was high expression of Protein-glutamine gamma-glutamyltransferase E (TGM3) and Heat shock protein beta-1 (HSPB1).
Fig 2B focuses on the most statistically significant DEPs (92 proteins below an FDR of 2%) across the OAC cohort. We found high expression in OAC of proteins relating to cell structure, such as Keratin, type II cytoskeletal 8 (K2C8); RNA processing and protein folding, such as Nucleolar RNA helicase 2 (DDX21) and Peptidyl-prolyl cis-trans isomerase (FKB11); and the immune system, such as antigen peptide transporters 1 & 2 (TAP1, TAP2), Mucin-1 (MUC1) and Intercellular adhesion molecule 1 (ICAM1). Collectively, these are all proteins that impact cellular phenotype and immunoregulation. Donor EN-430-11, an outlier on the PCA plot, has relatively low expression of the proteins K1C18, AGR2 and GILT that drive the separation between OAC and NAT in the PCA (Fig 1A).
A striking observation concerns donor EN-454-11 and Nucleolar protein 58 (NOP58), which was generally highly expressed in OAC except in EN-454-11 (Fig 2B, S2 Table). We previously reported direct observation of a putative neoantigen eluted from tumour HLA for EN-454-11, derived from mutation G95R in NOP58 [3]. Using DDGun [9] we calculated a change in the Gibbs free energy of unfolding (∆∆G) between the wild type and G95R mutant NOP58 protein of −0.5 kcal/mol, indicating a decrease in the stability of NOP58 expressed by donor EN-454-11 (Supporting Information).
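For reference, the sign convention implied by this interpretation can be written as below. This is a sketch of the convention assumed here, in which negative values are destabilising and ΔG denotes the Gibbs free energy of unfolding. The DDGun documentation should be consulted for the exact definition used by the tool.

```latex
\Delta\Delta G
  = \Delta G_{\mathrm{unfold}}^{\mathrm{G95R}} - \Delta G_{\mathrm{unfold}}^{\mathrm{WT}}
  \approx -0.5\ \mathrm{kcal/mol} < 0
```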
Finally we performed functional analysis using 232 OAC DEPs (log2 fold-change greater than 2 and below FDR 5%). We identified greatest enrichment for pathways of biological processes relating to RNA processing, particularly mRNA splicing (Fig 3A, S4 Table). For enrichment of Reactome pathways [10] we also found changes in RNA processing, but also in immunological pathways, specifically neutrophil degranulation and antigen processing and presentation (Fig 3B, S4 Table).
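Over-representation analyses of this kind are commonly based on a cumulative hypergeometric test, sketched below in plain Python. The function name and all counts are hypothetical. g:Profiler additionally applies its own multiple-testing correction and background definition, neither of which is reproduced here.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_annotated, n_query, n_overlap):
    """P(X >= n_overlap) when n_query proteins are drawn from n_background,
    of which n_annotated belong to the pathway of interest."""
    total = comb(n_background, n_query)
    upper = min(n_annotated, n_query)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: a background of 3552 proteins, a query list of 232,
# a pathway with 150 members in the background, 30 of them in the query.
p = hypergeom_enrichment_p(3552, 150, 232, 30)
```

With these made-up counts the expected overlap by chance is roughly 10 proteins, so an observed overlap of 30 gives a very small p-value.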
(A) Enriched GO Biological Processes. The top 25 pathways are shown. (B) Enriched Reactome pathways. Statistical significance level indicated by the -log10 p-value on the x-axis. Proteins were selected using thresholds for OAC DEPs above a log2 fold change 2 and below a FDR of 5% (n = 232).
Discussion
Previously we used proteogenomic analysis to identify patient-specific neoantigens arising from OAC mutations as therapeutic targets [3]. Here we compared the proteomes of OAC tissues to NAT from seven patients of the same cohort. This was a small study, limiting the extent to which our findings can be generalised. For example, epithelial cell adhesion molecule (EPCAM) was not uniformly expressed across all samples, so we are unable to confirm reports of high expression of EPCAM as a putative OAC biomarker [11,12]. The other main limitation of our design is that we are unable to compare these differential protein expression observations to other information that might support or discount their value, such as gene expression or comparison to oesophageal squamous cell carcinoma tissue.
The principal value of the results presented here is the quantification of more than 3,500 proteins, of which nearly 500 were differentially expressed, as a resource for other OAC researchers. Amongst the OAC DEPs we found high expression of common markers of tumourigenesis, including AGR2 and keratin K1C18 [13]. Along with the antigen processing related protein GILT, these proteins drove the separation between OAC and NAT in the Principal Component Analysis. AGR2 is a known unfavourable prognostic marker in renal and liver cancer and OAC [11,13,14]. Moreover, the role of AGR2 in tumourigenesis in the oesophagus has previously been seen in higher gene expression of AGR2 in the OAC precursor condition Barrett’s oesophagus with respect to NAT [15], and increased expression of AGR2 in fibroblast cells was seen to promote tumour growth in mice [16]. We also observed differential expression in OAC of putative immunohistological markers BiP (HSPA5), SAMHD1 (SAMHD1) and Rho GDP-dissociation inhibitor 2 (ARHGDIB) [11]. Other DEPs of interest included the G2/M checkpoint related protein Cell division cycle 5-like protein (CDC5L), a putative target for checkpoint inhibition [17], and putative prognostic biomarkers in colorectal, breast and ovarian cancers: Metalloproteinase inhibitor 1 (TIMP1), Matrix metalloproteinase-9 (MMP9) [18] and Lysosome-associated membrane glycoprotein 1 (LAMP1) [19]. We identified differential expression in NAT of Protein-glutamine gamma-glutamyltransferase E (TGM3) and Heat shock protein beta-1 (HSPB1), consistent with previous reports [11,20].
Additionally, the finding that the G95R mutation decreases the stability of NOP58 suggests that for donor EN-454-11, NOP58 is more likely to be a defective ribosomal product (DRiP) [21,22]. On the one hand, this may explain the decreased expression seen in Fig 2B for donor EN-454-11; on the other hand, it may have increased the probability of the NOP58 neoantigen presentation that we previously observed [3]. It suggests utility in quantifying the effects of single nucleotide polymorphisms on protein stability and subsequent processing when identifying putative DRiP-derived neoantigens.
Our functional analysis corresponds with previous reports characterising changes in OAC tissues in identifying enriched biological processes and Reactome pathways for mRNA processing and antigen processing [17]. Additionally Reactome pathway enrichment of the neutrophil degranulation pathway is indicative of inflammation, a known risk factor in cancer [23].
Overall, these observations offer independent corroboration of, and contrast to, existing studies seeking to identify biomarkers or targets for more effective OAC-specific treatments, together with a catalogue of additional putative biomarkers of OAC that can be validated in larger cohorts in the future. Moreover, our NOP58 observation suggests another parameter, protein stability, that can be used in the prediction of putative neoantigens for personalised therapies.
Materials and Methods
Ethics statement
Informed written consent was provided for participation by all individuals. Ethical approval for this study was granted by the Proportionate Review Sub-Committee of the North East – Newcastle & North Tyneside 1 Research Ethics Committee (Reference 18/NE/0234). This study was approved by the University of Southampton Research Ethics Committee. For the study presented here samples were accessed on 2nd October 2018, and only authors RCW and TJU had access to information that could identify individual participants during or after data collection.
Tissue preparation
Subjects diagnosed with OAC were recruited to the study (see S1 Table for clinical characteristics). Tumours were excised from resected oesophageal tissue post-operatively by pathologists and processed either for histological evaluation of tumour type and stage, or snap frozen at −80°C.
Protein extraction and digestion
Snap frozen tissue samples were briefly thawed and weighed prior to 30s of mechanical homogenization (using disposable probes, Fisher, UK) in 4mL lysis buffer (0.02M Tris, 0.5% [w/v] IGEPAL, 0.25% [w/v] sodium deoxycholate, 0.15 mM NaCl, 1mM ethylenediaminetetraacetic acid (EDTA), 0.2 mM iodoacetamide supplemented with EDTA-free protease inhibitor mix). Homogenates were clarified for 10 min at 2000g, 4°C and then for a further 60 min at 13,500g, 4°C.
Protein concentration of tissue lysates was determined by BCA assay, and volumes equivalent to 100 mg of protein were precipitated using methanol/chloroform as previously described [24]. Pellets were briefly air-dried prior to resuspension in 6 M urea/50 mM Tris-HCl (pH 8.0). Proteins were reduced by the addition of 5 mM (final concentration) DTT and incubated at 37°C for 30 min, then alkylated by the addition of 15 mM (final concentration) iodoacetamide and incubated in the dark at room temperature for 30 min. 4 µg Trypsin/LysC mix (Promega, UK) were added and the sample incubated for 4 h at 37°C, then 6 volumes of 50 mM Tris-HCl pH 8.0 were added to dilute the urea to < 1 M, and the sample was incubated for a further 16 h at 37°C. Digestion was terminated by the addition of 4 µL of TFA, and the sample clarified at 13,000×g for 10 min at RT. The supernatant was collected and applied to Oasis Prime microelution HLB 96-well plates (Waters, UK) which had been pre-equilibrated with acetonitrile. Peptides were eluted with 50 µL of 70% acetonitrile and dried by vacuum centrifugation prior to resuspension in 0.1% formic acid.
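The urea dilution step can be checked with simple arithmetic, as in the sketch below. It assumes that "6 volumes" means six times the sample volume of urea-free buffer and ignores the small volumes of the other reagents added. The function name is hypothetical.

```python
def diluted_molarity(initial_molar, sample_volumes, diluent_volumes):
    """Concentration after mixing a sample with solute-free diluent."""
    return initial_molar * sample_volumes / (sample_volumes + diluent_volumes)

# 1 volume of sample in 6 M urea plus 6 volumes of urea-free buffer.
urea_after = diluted_molarity(6.0, 1, 6)  # 6/7 M, i.e. below 1 M
```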
Mass spectrometry proteomics
8 µg of peptides per sample were separated by an Ultimate 3000 RSLC nano system (Thermo Scientific, UK) using a PepMap C18 EASY-Spray LC column, 2 µm particle size, 75 µm × 75 cm column (Thermo Scientific, UK) in buffer A (H2O/0.1% Formic acid) and coupled on-line to an Orbitrap Fusion Tribrid Mass Spectrometer (Thermo Fisher Scientific, UK) with a nano-electrospray ion source.
Peptides were eluted with a linear gradient of 3–30% buffer B (acetonitrile/0.1% formic acid) at a flow rate of 300 µL/min over 200 min. Full scans were acquired in the Orbitrap analyser in the scan range 300–1,500 m/z using the top speed data dependent mode, performing an MS scan every 3 second cycle, followed by higher energy collisional dissociation (HCD) MS/MS scans. MS spectra were acquired at a resolution of 120,000, RF lens 60% and an automatic gain control (AGC) ion target value of 4.0e5 for a maximum of 100 ms. MS/MS scans were performed in the ion trap; HCD fragmentation was induced at an energy setting of 32% with an AGC ion target value of 5.0e3.
Proteomic data analysis
Raw spectrum files were analysed using Peaks Studio 10.0 build 20190129 [5,25] and the data processed to generate reduced charge state and deisotoped precursor and associated product ion peak lists, which were searched against the UniProt database (20,350 entries, 2020-04-07) plus the corresponding mutanome for each sample (~1,000–5,000 sequences) and a contaminants list in unspecific digest mode. Parent mass error tolerance was set at 10 ppm and fragment mass error tolerance at 0.6 Da. Variable modifications were set for N-term acetylation (42.01 Da), methionine oxidation (15.99 Da) and carboxyamidomethylation (57.02 Da) of cysteine. A maximum of three variable modifications per peptide was set. The false discovery rate (FDR) was estimated with decoy-fusion database searches [5] and identifications were filtered to 1% FDR. Data were deposited in PRIDE [26].
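To make the relative (ppm) precursor tolerance concrete, the sketch below computes a signed ppm error and checks it against a 10 ppm limit. The function names and example masses are hypothetical and are not taken from the Peaks software. The 0.6 Da fragment tolerance, by contrast, is an absolute window.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6

def within_precursor_tolerance(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the observed mass lies within +/- tol_ppm of the theoretical."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm

# Hypothetical precursor with a theoretical mass of 800.0000 Da.
theoretical = 800.0
half_window = theoretical * 10.0 / 1e6  # 0.008 Da either side at 10 ppm

accepted = within_precursor_tolerance(800.006, theoretical)  # 7.5 ppm
rejected = within_precursor_tolerance(800.010, theoretical)  # 12.5 ppm
```

Because the tolerance is relative, the absolute window widens in proportion to precursor mass.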
Differential protein expression
Label-free quantification was performed using the Peaks Q module of Peaks Studio [5,6], yielding matrices of protein identifications quantified by their normalised top 3 peptide intensities. The resulting matrices were filtered to remove any proteins for which there were more than two missing values across the samples. Differential protein expression was then calculated with DEqMS using the default parameters [8].
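The missing-value filter described above can be sketched as follows. This is an illustrative Python version, not the code used in the study. The protein identifiers and intensities are hypothetical, and representing a missing value as `None` is an assumption about the data format.

```python
def filter_missing(matrix, max_missing=2):
    """Keep proteins with at most max_missing missing values.

    matrix maps protein ID -> list of intensities, with None for missing.
    """
    return {
        protein: values
        for protein, values in matrix.items()
        if sum(v is None for v in values) <= max_missing
    }

# Hypothetical intensities across four samples per protein.
raw = {
    "P1": [10.2, 11.0, 9.8, 10.5],    # complete
    "P2": [None, 8.1, None, 7.9],     # two missing: kept
    "P3": [None, None, None, 6.0],    # three missing: removed
}
kept = filter_missing(raw)
```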
Principal component analysis of the normalised top 3 peptide intensities was performed using DESeq2 [7] and PCAtools [27].
Results were visualised using EnhancedVolcano [28], pheatmap [29] and ggplot2 [30].
Functional analysis
Functional enrichment analysis was performed with g:Profiler [31] using default settings for Homo sapiens, modified to exclude GO electronic annotations. Protein IDs were used as inputs.
Supporting Information
S1 Table. Patient information for the 7 male donors in this study
https://doi.org/10.1371/journal.pone.0318572.s001
(CSV)
S2 Table. Peaks normalised top 3 peptide intensities.
https://doi.org/10.1371/journal.pone.0318572.s002
(CSV)
References
- 1. Oesophageal cancer statistics. 2022. Available from: https://www.cancerresearchuk.org/health-professional/cancer-statistics/statistics-by-cancer-type/oesophageal-cancer
- 2. Smyth EC, Lagergren J, Fitzgerald RC, Lordick F, Shah MA, Lagergren P, et al. Oesophageal cancer. Nat Rev Dis Primers. 2017;3:17048. pmid:28748917
- 3. Nicholas B, Bailey A, McCann KJ, Wood O, Walker RC, Parker R, et al. Identification of neoantigens in oesophageal adenocarcinoma. Immunology. 2023;168(3):420–31. pmid:36111495
- 4. CancerData. Available from: https://www.cancerdata.nhs.uk/incidence_and_mortality
- 5. Zhang J, Xin L, Shan B, Chen W, Xie M, Yuen D, et al. PEAKS DB: de novo sequencing assisted database search for sensitive and accurate peptide identification. Mol Cell Proteomics. 2012;11(4):M111.010587. pmid:22186715
- 6. Lin H, He L, Ma B. A combinatorial approach to the peptide feature matching problem for label-free quantification. Bioinformatics. 2013;29(14):1768–75. pmid:23665772
- 7. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. pmid:25516281
- 8. DEqMS. Available from: http://bioconductor.org/packages/DEqMS/.
- 9. Montanucci L, Capriotti E, Frank Y, Ben-Tal N, Fariselli P. DDGun: an untrained method for the prediction of protein stability changes upon single and multiple point variations. BMC Bioinform. 2019;20(Suppl 14):335. pmid:31266447
- 10. Milacic M, Beavers D, Conley P, Gong C, Gillespie M, Griss J, et al. The reactome pathway knowledgebase 2024. Nucleic Acids Res. 2024;52(D1):D672–8. pmid:37941124
- 11. O’Neill JR, Pak H-S, Pairo-Castineira E, Save V, Paterson-Brown S, Nenutil R, et al. Quantitative shotgun proteomics unveils candidate novel esophageal adenocarcinoma (EAC)-specific Proteins. Mol Cell Proteomics. 2017;16(6):1138–50. pmid:28336725
- 12. O’Neill JR, Yébenes Mayordomo M, Mitulović G, Al Shboul S, Bedran G, Faktor J, et al. Multi-omic analysis of esophageal adenocarcinoma uncovers candidate therapeutic targets and cancer-selective posttranscriptional regulation. Mol Cell Proteomics. 2024;23(6):100764. pmid:38604503
- 13. Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347(6220):1260419. pmid:25613900
- 14. Chevet E, Fessart D, Delom F, Mulot A, Vojtesek B, Hrstka R, et al. Emerging roles for the pro-oncogenic anterior gradient-2 in cancer development. Oncogene. 2013;32(20):2499–509. pmid:22945652
- 15. Hao Y, Triadafilopoulos G, Sahbaie P, Young HS, Omary MB, Lowe AW. Gene expression profiling reveals stromal genes expressed in common between Barrett’s esophagus and adenocarcinoma. Gastroenterology. 2006;131(3):925–33. pmid:16952561
- 16. Wang Z, Hao Y, Lowe AW. The adenocarcinoma-associated antigen, AGR2, promotes tumor growth, cell migration, and cellular transformation. Cancer Res. 2008;68(2):492–7. pmid:18199544
- 17. Liu W, Xie L, He Y-H, Wu Z-Y, Liu L-X, Bai X-F, et al. Large-scale and high-resolution mass spectrometry-based proteomics profiling defines molecular subtypes of esophageal cancer for therapeutic targeting. Nat Commun. 2021;12(1):4961. pmid:34400640
- 18. Jiang H, Li H. Prognostic values of tumoral MMP2 and MMP9 overexpression in breast cancer: a systematic review and meta-analysis. BMC Cancer. 2021;21(1):149. pmid:33568081
- 19. Marzinke MA, Choi CH, Chen L, Shih I-M, Chan DW, Zhang H. Proteomic analysis of temporally stimulated ovarian cancer cells for biomarker discovery. Mol Cell Proteomics. 2013;12(2):356–68. pmid:23172893
- 20. Li L, Jiang D, Zhang Q, Liu H, Xu F, Guo C, et al. Integrative proteogenomic characterization of early esophageal cancer. Nat Commun. 2023;14(1):1666. pmid:36966136
- 21. Gestal-Mato U, Herhaus L. Autophagy-dependent regulation of MHC-I molecule presentation. J Cell Biochem. 2024;125(11):e30416. pmid:37126231
- 22. Holly J, Yewdell JW. Game of Omes: ribosome profiling expands the MHC-I immunopeptidome. Curr Opin Immunol. 2023;83:102342.
- 23. Mollinedo F. Neutrophil degranulation, plasticity, and cancer metastasis. Trends Immunol. 2019;40(3):228–42. pmid:30777721
- 24. Bligh EG, Dyer WJ. A rapid method of total lipid extraction and purification. Can J Biochem Physiol. 1959;37(8):911–7. pmid:13671378
- 25. Tran NH, Zhang X, Xin L, Shan B, Li M. De novo peptide sequencing by deep learning. Proc Natl Acad Sci U S A. 2017;114(31):8247–52. pmid:28720701
- 26. Perez-Riverol Y, Csordas A, Bai J, Bernal-Llinares M, Hewapathirana S, Kundu DJ, et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 2019;47(D1):D442–50. pmid:30395289
- 27. Blighe K, Lun A. PCAtools: everything principal components analysis. 2024.
- 28. Blighe K, Rana S, Lewis M. EnhancedVolcano: publication-ready volcano plots with enhanced colouring and labeling. 2024.
- 29. Kolde R. Pheatmap: Pretty heatmaps. 2019. Available: https://CRAN.R-project.org/package=pheatmap
- 30. Wickham H. ggplot2: Elegant graphics for data analysis. 2016. Available: https://ggplot2.tidyverse.org
- 31. Kolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H. gprofiler2: an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Res. 2020;9:ELIXIR-709.