Fig 1.
Flow chart of the research methodology.
This study comprised data acquisition, preprocessing, differential gene expression analysis, weighted gene co-expression network analysis (WGCNA), gene ontology (GO) enrichment, and transcription factor analysis to distinguish the molecular and immunological signatures of different esophagitis subtypes.
Fig 2.
Volcano plots illustrating differential gene expression between healthy controls and patients with various esophageal conditions.
(A) Conventional EoE (ce) shows a large number of significantly upregulated and downregulated genes, with genes such as ALOX15, LOC105375905 and POSTN exhibiting relatively large fold changes. (B) EoE-like esophagitis (el) presents moderate differential expression, with genes such as LOC105375905 and SLC6A19. (C) Lymphocytic esophagitis (ly) shows fewer significant changes, with notable upregulated genes such as LOC105375905, TRIM15 and ZIC4. (D) Nonspecific esophagitis (ns) displays a distinct expression pattern, with significant upregulation or downregulation of genes such as LOC105375905, SLC6A19 and C11orf86. Red dots represent genes that meet the significance thresholds for both p-value and fold change, while blue and green dots represent genes that meet only one of the two thresholds. Note: Apparent differences in the contour (“belt shape”) of the plots reflect the inherent distribution of differential expression within each patient group rather than methodological variation. (E) Heatmap showing the expression profiles of the top 50 differentially expressed genes (DEGs) across samples. Rows represent individual genes, and columns correspond to patient samples from the various conditions. The color gradient indicates expression levels, with red denoting upregulation and blue denoting downregulation.
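The two-threshold coloring described for panels A-D can be sketched as follows. This is a minimal illustration, not the study's pipeline: the cutoffs (`fc_cutoff`, `p_cutoff`), the function names, and the gene labels are assumptions, and the caption does not state which of blue or green corresponds to which single threshold, so the sketch uses neutral category names.

```python
import math

def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Assign a volcano-plot category to one gene.

    fc_cutoff and p_cutoff are illustrative assumptions, not the
    thresholds used in the study.
    """
    passes_fc = abs(log2_fc) >= fc_cutoff
    passes_p = p_value < p_cutoff
    if passes_fc and passes_p:
        return "both"      # the category drawn in red in the figure
    if passes_p:
        return "p_only"    # meets the p-value threshold only
    if passes_fc:
        return "fc_only"   # meets the fold-change threshold only
    return "neither"

def volcano_point(log2_fc, p_value):
    """Return the (x, y) coordinates conventionally used in a volcano plot."""
    return log2_fc, -math.log10(p_value)

# Hypothetical genes as (log2 fold change, p-value) pairs.
genes = {
    "geneA": (3.2, 1e-8),
    "geneB": (0.4, 1e-4),
    "geneC": (-2.1, 0.3),
    "geneD": (0.1, 0.8),
}
categories = {name: classify_gene(fc, p) for name, (fc, p) in genes.items()}
```

Because the y-axis is the negative log of the p-value, genes passing both thresholds occupy the upper left and upper right corners of each plot.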
Fig 3.
Enrichment analysis of biological processes for different esophageal conditions.
The x-axis represents the gene ratio, the dot size indicates the number of genes associated with each term, and the color reflects the adjusted p-value. (A) Biological processes in conventional EoE (CE) reveal significant activation of processes such as cytoplasmic translation, immune response, and keratinization, while suppression involves translational initiation and regulation. (B) EoE-like esophagitis (EL) displays enrichment in translation activator activity and signaling receptor activity. (C) Lymphocytic esophagitis (LE) highlights activation of processes related to mitochondrial ATP synthesis, oxidative phosphorylation, and nervous system development, with suppression of trans-synaptic signaling. (D) Nonspecific esophagitis (NS) shows activation of immune-related processes, including immunoglobulin production and adaptive immune response. (E) Jaccard similarity matrix of GO cellular component terms. The heatmap displays the pairwise similarity of cellular component GO terms among nonspecific esophagitis (nons_cc), EoE-like esophagitis (eoel_cc), lymphocytic esophagitis (lymp_cc), and conventional EoE (ceoe_cc). The similarity scores range from 0 (blue, low similarity) to 1 (red, high similarity). The matrix reveals distinct clustering patterns, with a closer relationship between nonspecific and EoE-like esophagitis than between the other conditions.
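The Jaccard similarity shown in panel E is the size of the intersection of two term sets divided by the size of their union. A minimal sketch, assuming each condition is represented by a set of enriched GO term identifiers; the term names below are placeholders, not results from the study, and only the condition labels are taken from the figure.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of terms."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)

def jaccard_matrix(term_sets):
    """Pairwise Jaccard similarity as a nested dict keyed by condition label."""
    labels = list(term_sets)
    return {
        x: {y: jaccard(term_sets[x], term_sets[y]) for y in labels}
        for x in labels
    }

# Placeholder GO term identifiers (hypothetical, for illustration only).
term_sets = {
    "nons_cc": {"term1", "term2", "term3"},
    "eoel_cc": {"term2", "term3", "term4"},
    "lymp_cc": {"term5"},
    "ceoe_cc": {"term1", "term5", "term6"},
}
similarity = jaccard_matrix(term_sets)
```

In this toy input, `nons_cc` and `eoel_cc` share two of four distinct terms and so score 0.5, while disjoint sets score 0.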
Fig 4.
Construction of co-expression modules by WGCNA.
(A, B) Network topology analysis was performed across a range of soft-threshold powers, and the adjacency matrix was defined using a soft threshold of β = 9. (C) Clustering dendrogram of genes, with dissimilarity based on topological overlap, together with the assigned module colors. (D) Module-trait relationship heatmap illustrating the correlation between gene modules and the conditions: healthy controls (normal), EoE-like esophagitis (el), nonspecific esophagitis (ns), lymphocytic esophagitis (ly), and conventional EoE (ce). Rows correspond to modules, and columns correspond to conditions. Each cell reports the correlation and its p-value, with red indicating positive correlation and blue indicating negative correlation. The intensity of the color reflects the strength of the correlation, and asterisks indicate the associated p-values: *p < 0.05, **p < 0.01, and ***p < 0.001. (E) The correlation between gene signatures for conventional EoE and module membership in the lightgreen module.
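Soft thresholding raises the pairwise gene correlation to the power β so that weak correlations are pushed toward zero while strong ones are retained. A minimal sketch with β = 9, assuming an unsigned network, a_ij = |cor(x_i, x_j)|^β; the caption does not state whether a signed or unsigned network was used, and the expression values and gene labels below are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def soft_threshold_adjacency(expr, beta=9):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    The unsigned form is an assumption; expr maps a gene label to its
    expression values across samples.
    """
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes}
        for g in genes
    }

# Hypothetical expression profiles across four samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with g1
    "g3": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with g1
}
adjacency = soft_threshold_adjacency(expr, beta=9)
```

With β = 9, a correlation of 0.8 shrinks to an adjacency of about 0.13, whereas a correlation of 1 stays at 1, which illustrates how the power emphasizes strong co-expression.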
Fig 5.
Hub gene identification from selected gene modules using the degree centrality algorithm of the cytoHubba plugin for Cytoscape, filtered to the top 10 genes by degree.
(A) Hub gene signature for the MElightgreen module, showing key genes such as IL13, NTRK1, BDNF and IGF1R that play a significant role in the pathophysiology of conventional EoE. (B) Hub gene signature for the MElightsteelblue1 module, showing important genes such as STAT1, CXCL10 and CXCL8 that play major roles in the pathophysiology of lymphocytic esophagitis. (C) Hub gene signature for the MEwhite module, including key genes such as CDK1, CCNB1, CENPA, RAD51 and CDC20, which are involved in cell proliferation and epithelial turnover. This module shows a positive correlation with conventional EoE and a negative correlation with EoE-like esophagitis.
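Ranking by degree centrality amounts to counting the distinct neighbors of each node and keeping the highest-scoring nodes. The sketch below illustrates that idea only; it is not cytoHubba's implementation, and the function name, edge list, and node labels are hypothetical.

```python
from collections import Counter

def top_hubs(edges, k=10):
    """Return the k nodes with the highest degree in an undirected network.

    Duplicate edges and self-loops are ignored; ties are broken
    alphabetically by node label.
    """
    unique_edges = {tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for a, b in unique_edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]

# Hypothetical interaction edges within one module.
edges = [
    ("n1", "n2"),
    ("n1", "n3"),
    ("n1", "n4"),
    ("n2", "n3"),
    ("n3", "n2"),  # duplicate of (n2, n3), counted once
    ("n4", "n4"),  # self-loop, ignored
]
hubs = top_hubs(edges, k=2)
```

Here `n1` has three distinct neighbors and ranks first, followed by `n2`, which ties with `n3` on degree and wins on label order.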
Fig 6.
Machine learning (ML) identification of key biomarkers in EoE and its variants.
(A) Principal component analysis (PCA) clustering of samples into 3 groups based on gene expression data. The first principal component (PC1) explains 88.43% of the variance, while the second principal component (PC2) accounts for 7.76%. (B-E) Machine learning-based identification of key biomarkers associated with conventional EoE (CE) and lymphocytic esophagitis (Lym) using Random Forest classification. Receiver operating characteristic (ROC) curves are shown for selected genes: (B) POSTN, (C) DNAH11, (D) NFE2, and (E) PRR15L. The area under the ROC curve (AUROC) is 1.00 for CE, indicating perfect discrimination for this condition, and 0.00 for Lym, indicating that these genes do not positively identify lymphocytic esophagitis (an AUROC of 0.00 corresponds to a fully inverted ranking, whereas chance-level performance is 0.5). The dashed line represents random classification (AUROC = 0.5).
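The AUROC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample. A minimal sketch of that pairwise definition follows; the scores and labels are invented and do not reproduce the study's Random Forest outputs.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 for the positive class and 0 for the negative class;
    tied scores count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return correct / (len(positives) * len(negatives))

# Hypothetical scores for four samples.
scores = [0.9, 0.8, 0.2, 0.1]
high_scores_positive = auroc(scores, [1, 1, 0, 0])  # positives score highest
low_scores_positive = auroc(scores, [0, 0, 1, 1])   # positives score lowest
uninformative = auroc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0])
```

When the same scores are evaluated against complementary labels, the two AUROC values sum to 1, so 1.00 for one class and 0.00 for the other are mirror images of the same ranking; a score that carries no information yields 0.5.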
Fig 7.
Transcription factor activity in EoE and its variants.
(A) Heatmap displaying transcription factor activity across samples. Rows represent individual samples grouped by condition, and columns represent transcription factors. Conditions are annotated on the left with color coding: red for conventional EoE (ce), blue for EoE-like esophagitis (el), green for lymphocytic esophagitis (ly), orange for nonspecific esophagitis (ns), and gray for healthy controls (normal). The color gradient in the heatmap reflects transcription factor activity levels, with red indicating higher activity and blue indicating lower activity. Clustering reveals distinct patterns of transcription factor activity associated with specific esophageal conditions. (B) Top 10 transcription factors by condition. The bar plots display the mean activity of the top 10 transcription factors in conventional EoE (ce, red), EoE-like esophagitis (el, olive), lymphocytic esophagitis (ly, green), nonspecific esophagitis (ns, purple), and healthy controls (normal, blue). Each plot highlights transcription factors such as FOXA1, CREB1, and STAT3, which show variable activity across conditions.
Fig 8.
Expression of POSTN (A) and DNAH11 (B) across esophageal conditions.
Both genes exhibit significantly higher expression in conventional EoE (ce) than in EoE-like (el), lymphocytic (ly), and nonspecific esophagitis (ns) or healthy controls (normal). POSTN shows markedly elevated levels, while DNAH11 shows more moderate expression differences across the conditions.