Fig 1.
DE-podo, differentially expressed podocyte marker gene; LASSO, least absolute shrinkage and selection operator; RFE, recursive feature elimination; RF, random forest; GLM, generalized linear model; SVM, support vector machine; XGBoost, extreme gradient boosting; DKD, diabetic kidney disease; RT-qPCR, real-time quantitative polymerase chain reaction.
Fig 2.
Dimension reduction and cell annotation in the scRNA-seq dataset GSE131882.
(A) Cell distribution of the DKD and control groups in the UMAP plot. (B) Ten cell types were annotated in the scRNA-seq dataset GSE131882, including collecting duct cells, endothelial cells, epithelial cells, ITGA1+ cells, loop of Henle cells, mt-rich cells, podocytes, proximal tubule cells, and vascular cells. (C) Proportions of each cell type in the DKD and control groups. (D) Expression levels of the marker genes used for annotation. scRNA-seq: single-cell RNA sequencing; DKD: diabetic kidney disease; UMAP: uniform manifold approximation and projection.
Fig 3.
Characteristics of biological processes involved in DKD.
(A) Volcano plot of differential BP in DKD identified by GSEA. Grey dots represent BP without significant differences, green dots represent BP for which only the NES value is significant, and red dots represent BP for which both the NES value and Adj.P.value are significant. (B) Consistency analysis of GSEA and GSVA. Grey dots represent BP that did not meet the screening condition (|log2FC| > 0.15), red dots represent upregulated BP, and blue dots represent downregulated BP. (C) NES values and Adj.P.values of the top 8 BP in GSEA. (D) Heatmap of GSVA scores of the top 8 BP. DKD: diabetic kidney disease; BP: biological processes; GSEA: gene set enrichment analysis; NES: normalized enrichment score; GSVA: gene set variation analysis.
Fig 4.
Cell-type marker genes and functional enrichment analysis.
(A) Heatmap of the relative expression levels of representative cell-type marker genes. (B) KEGG pathway enrichment analysis of each cell type. (C) GO-BP enrichment analysis of each cell type. KEGG: Kyoto Encyclopedia of Genes and Genomes; GO-BP: gene ontology biological processes.
Fig 5.
Unveiling cellular heterogeneity through podocyte subcluster analysis.
(A) Podocytes were divided into three distinct subclusters, labeled 0, 1, and 2. (B) Heatmap of the expression levels of the top ten marker genes of each subcluster. (C) Representative marker genes of subcluster 0 (PODXL, PTPRO, and PTPRQ) and subcluster 2 (KCNJ16, LRP2, and WNK1). (D) GO-BP enrichment analysis of subcluster marker genes. The top seven terms are shown. GO-BP: gene ontology biological processes.
Fig 6.
Possible LR interactions between endothelial cells and podocytes in DKD.
(A) The average expression levels of ligands in endothelial cells and podocytes. (B) Confidence of LR interactions. The closer the score is to 1, the higher the credibility of the interaction for that LR pair. (C) Ligands and their target genes. The heatmap shows the regulatory potential scores between ligands and targets. (D) The average expression levels of receptors in endothelial cells and podocytes. (E) The average expression levels of targets in endothelial cells and podocytes. (F) Two LR pairs and their corresponding ligand and receptor cells, presented as an iTALK circos plot. The circos plots show all putative gains or losses of cell–cell interaction events in DKD via LR pair signaling between podocyte subclusters and endothelial cells. (G) The expression levels of GAS6, PTH1R, PTHLH, and TYRO3 in the DKD-related glomerular bulk RNA-seq dataset GSE96804. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001 by Student’s t test. LR: ligand–receptor; bulk RNA-seq: bulk RNA sequencing.
Fig 7.
Differential expression analysis of podocytes in scRNA-seq.
(A) Volcano plots of podocyte DEGs between the DKD group and the control group. Grey dots represent genes without significant differential expression, orange dots represent upregulated genes, and blue dots represent downregulated genes. (B) Identification of DE-podos with a Venn diagram. Blue represents DEGs; orange represents podocyte marker genes. (C) Information concerning DE-podos. pct.1, percentage of podocytes in which the gene is detected; pct.2, percentage of non-podocyte cells in which the gene is detected; pct.diff, difference between pct.1 and pct.2. (D) UMAP plot of DE-podo expression and cellular localization. Each dot corresponds to a single cell, with color intensity reflecting the relative gene expression level: red denotes high expression and green denotes low expression. (E) Dot plot of the average expression levels of DE-podos. (F) Ridge plot of the relative expression levels of DE-podos. (G) Identification of hub transcription factors targeting DE-podos using the Cytoscape plugin iRegulon. (H) Heatmap of the activity of the most variable TFs across cell types. scRNA-seq: single-cell RNA sequencing; DEGs: differentially expressed genes; DE-podo: differentially expressed podocyte marker gene; UMAP: uniform manifold approximation and projection; TF: transcription factor.
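The quantities in panels B and C can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the gene labels (GENE_A and so on), and the toy expression values are hypothetical, and a gene is assumed to count as "detected" in a cell when its expression value is greater than zero. Values are returned as fractions; multiply by 100 for percentages.

```python
def detection_stats(expr, is_podocyte):
    """Return (pct1, pct2, pct_diff) for one gene.

    expr        -- per-cell expression values for the gene
    is_podocyte -- parallel list of booleans, True for podocytes
    Assumption: a gene is "detected" in a cell when its value is > 0.
    """
    in_group = [v for v, p in zip(expr, is_podocyte) if p]
    out_group = [v for v, p in zip(expr, is_podocyte) if not p]
    pct1 = sum(v > 0 for v in in_group) / len(in_group)    # podocytes
    pct2 = sum(v > 0 for v in out_group) / len(out_group)  # other cells
    return pct1, pct2, pct1 - pct2


def de_podos(degs, podocyte_markers):
    """Overlap of the Venn diagram: DEGs that are also podocyte markers."""
    return sorted(set(degs) & set(podocyte_markers))


# Toy data (hypothetical): four podocytes followed by four other cells.
expr = [2.0, 0.0, 1.5, 3.1, 0.0, 0.0, 0.4, 0.0]
is_podocyte = [True] * 4 + [False] * 4
stats = detection_stats(expr, is_podocyte)

# Hypothetical gene labels, used only to show the set intersection.
overlap = de_podos(["GENE_A", "GENE_B", "GENE_C"], ["GENE_B", "GENE_C", "GENE_D"])
```

In this toy case the gene is detected in three of four podocytes and one of four other cells, so pct.diff is 0.5.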
Fig 8.
Identification and validation of hub DE-podos in bulk RNA-seq.
(A) Selection of feature genes via LASSO regression. Four feature genes with non-zero coefficients were selected at the optimal lambda value. (B) Selection of feature genes via the RFE algorithm. Two feature genes were selected at the optimal accuracy. (C) Selection of feature genes via the RF algorithm. Three feature genes were selected at the optimal accuracy. (D) Venn diagram showing the intersection of the hub DE-podos selected by LASSO, RFE, and RF. (E) ROC curves of diagnostic models constructed using the two hub DE-podos based on the GLM, XGBoost, and SVM algorithms. (F) ROC analysis of the diagnostic performance of six DE-podos. (G) Expression validation of hub DE-podos in the GSE96804 dataset (left) and the GSE142025 dataset (right). (H) RT-qPCR validation of ARHGEF26. (I–J) Western blot images and quantification of ARHGEF26 expression in podocytes treated with normal control or high-glucose (HG) medium. Data are presented as mean ± SEM. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001 by Student’s t test (G–J). DE-podo: differentially expressed podocyte marker gene; bulk RNA-seq: bulk RNA sequencing; LASSO: least absolute shrinkage and selection operator; RFE: recursive feature elimination; RF: random forest; ROC: receiver operating characteristic; GLM: generalized linear model; SVM: support vector machine; XGBoost: extreme gradient boosting; HG: high glucose.
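As a rough illustration of panels D and F, the sketch below intersects the gene lists returned by several feature-selection methods and computes the area under the ROC curve for a single gene used as a classifier. It is an assumption-laden example rather than the study's code: the gene labels, scores, and sample labels are invented, and the AUC is computed with the rank-based (Mann-Whitney) formulation instead of any particular package.

```python
def hub_genes(*selections):
    """Genes chosen by every feature-selection method (Venn intersection)."""
    common = set(selections[0])
    for genes in selections[1:]:
        common &= set(genes)
    return sorted(common)


def roc_auc(scores, labels):
    """AUC as the probability that a case scores higher than a control.

    scores -- expression values or model outputs, one per sample
    labels -- 1 for case (e.g. DKD), 0 for control
    Ties between a case and a control count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical selections mirroring the set sizes in panels A-C (4, 2, 3).
lasso = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
rfe = ["GENE_A", "GENE_B"]
rf = ["GENE_A", "GENE_B", "GENE_C"]
hubs = hub_genes(lasso, rfe, rf)

# Hypothetical scores for three cases followed by three controls.
auc = roc_auc([0.9, 0.8, 0.4, 0.3, 0.7, 0.2], [1, 1, 1, 0, 0, 0])
```

With these toy inputs, eight of the nine case-control pairs are ranked correctly, so the AUC is 8/9.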
Fig 9.
Graphical summary of the results of this study.