Peer Review History

Original SubmissionFebruary 4, 2025
Decision Letter - Yusuf Ayipo, Editor

Dear Dr. Liu,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR:

==============================

Please submit your revised manuscript by Apr 09 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Yusuf Oloruntoyin Ayipo, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. To comply with PLOS ONE submissions requirements, in your Methods section, please provide additional information regarding the experiments involving animals and ensure you have included details on (1) methods of sacrifice, (2) methods of anesthesia and/or analgesia, and (3) efforts to alleviate suffering.

3. Thank you for stating the following financial disclosure: Henan Province Major Science and Technology Project (241100310100), Henan Province Clinical Research Doctor Training Special Project (D20240006), Zhongyuan Scholars of Henan Provincial Health Commission (224000510005) and Zhongyuan Scholar Workstation (234400510024). 

Please state what role the funders took in the study.  If the funders had no role, please state: ""The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."" 

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

The manuscript has scientific significance and is well-composed. However, some minor revisions are required to improve its standard for publication as rightly recommended by the reviewers. The authors need to address these appropriately.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously? -->?>

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available??>

The PLOS Data policy

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English??>

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

Reviewer #1: General Assessment

This study presents a bioinformatics-based investigation of differentially expressed genes (DEGs) in diabetic kidney disease (DKD), incorporating machine learning, immune infiltration analysis, and experimental validation in a mouse model. While the study is well-structured and employs appropriate computational techniques, there are several methodological and statistical concerns that need to be addressed before publication. Below is a summary of the key critiques, along with recommendations for improvement.

1. Data Availability and Transparency

Strengths: The study utilizes publicly available datasets (GSE96804, GSE104948-GPL22945, GSE30528) from GEO, ensuring reproducibility. Glycosylation-related gene sets are obtained from MSigDB.

Concerns & Recommendations: No mention of processed dataset availability: The study should explicitly state whether batch-corrected, normalized data is accessible in a public repository (e.g., Zenodo, Figshare). qPCR and mouse model data should be made available: Raw Ct values, fold changes, and histology data should be uploaded. A formal Data Availability Statement (DAS) is missing.

2. Statistical Rigor in Differential Gene Expression Analysis

Strengths: The study appropriately applies "limma" for DEG identification and adjusted p-value < 0.05 to correct for multiple testing. Volcano plots and heatmaps provide clear visualization.

Concerns & Recommendations: Batch effect correction validation is unclear: PCA plots are mentioned but should include variance explained before and after correction. DEG selection criteria (|logFC| > 1) should be justified further to balance sensitivity and specificity.

3. Enrichment Analysis (GO & KEGG Pathways)

Strengths: Uses clusterProfiler, a widely accepted tool for functional enrichment analysis.

Concerns & Recommendations: False Discovery Rate (FDR) correction should be explicitly stated to address multiple hypothesis testing. Gene Set Enrichment Analysis (GSEA) should be conducted alongside overrepresentation analysis to validate findings.

4. Machine Learning-Based Hub Gene Selection

Strengths: The study applies LASSO, Random Forest, and SVM, which enhances feature selection reliability.

Concerns & Recommendations: No cross-validation (CV) is reported. The models should include 10-fold CV to assess generalizability and prevent overfitting. ROC analysis lacks confidence intervals (CIs). Reporting 95% CIs for AUC values is necessary to evaluate statistical robustness. An independent validation dataset is missing. Using an external dataset would confirm the diagnostic value of hub genes.

5. Immune Cell Infiltration Analysis (ssGSEA)

Strengths: Uses ssGSEA to estimate immune infiltration and correlates it with hub gene expression.

Concerns & Recommendations: No cross-validation with alternative methods (e.g., CIBERSORT, xCell, TIMER). Different immune cell estimation techniques should be compared. Multiple testing correction for immune cell correlations is missing. Applying FDR correction to correlation p-values is necessary.

6. Experimental Validation (qPCR & Mouse Model)

Strengths: qPCR validation provides biological relevance to computational predictions. The use of C57BKS db/db mice is appropriate for studying DKD pathology.

Concerns & Recommendations: Sample size is too small (n=3 per group). Increasing to at least n=5-6 per group would improve statistical power. Raw qPCR data should be made available.

Revisions Required

The study provides a well-organized bioinformatics pipeline for identifying DKD-related hub genes, but several statistical and methodological issues need to be addressed before publication:

1. Data Availability: Ensure all processed datasets, qPCR results, and mouse model data are publicly available.

2. Statistical Rigor: Apply cross-validation in machine learning models. Report confidence intervals for ROC AUC values. Validate findings with independent datasets. Provide batch correction validation (PCA variance explained before/after correction).

3. Experimental Validation: Increase qPCR and animal study sample sizes.

4. Immune Infiltration Analysis: Validate ssGSEA results with alternative deconvolution methods. Apply multiple hypothesis correction for immune correlations.

Recommendation: Minor Revisions Required

Before publication, the authors should address the statistical concerns, improve data transparency, and validate their findings using additional datasets and cross-validation methods.

Reviewer #2: Paper review comments

General overview and strengths: The paper by Liu et al is a detailed study using bioinformatics and machine learning approaches to identify hub-genes involved in protein glycosylation in DKD. The integration of bioinformatics methods provides a broad and in-depth understanding of the molecular mechanisms of DKD. The authors used the least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), and random forests (RF) as the machine learning algorithms. The identification of the 6 hub-genes, some of which had not been extensively studied in DKD is a significant strength of the study as they present potential biomarkers for diagnosis and personalized therapeutic targets. Application of the ROC curve gives further support to them as potential biomarkers. The molecular subtyping and Immune infiltration analysis done are well conducted and add credibility to the study, further strengthening the study.

There are, however, some observations from the paper that need to be addressed.

Clarity and terminologies: The title is a reasonable representation of the study, but it could be adjusted to make it more precise by reflecting that this study was largely a gene-based study. The study showed a primary focus on glycosylation-related genes i.e. genes involved in (or influencing) glycosylation in DKD rather than Protein glycosylation itself. These genes influence glycosylation modification. Studies on “Protein glycosylation” are focused on the glycans but the hub genes (gly-DEG) were studied in this case.

The authors put it better here: “The present study utilized bioinformatics integrated with machine learning to identify the genes and their regulatory networks associated with protein glycosylation modification, which plays a pivotal role in DKD.” [line 83-85]. So, they could adjust the title and the study aim written in the abstract.

The use of the phrases, “progression of DKD” and “pathogenesis of DKD” could create some ambiguity since the terminologies are different entities and may not be interchangeable. It would be helpful to clarify the focus of the study, then use the correct terminologies to ensure clarity. Studies on “progression” often focus on identification of the different stages of DKD and establish quantitative or qualitative changes across the different stages of DKD. This study appears to primarily address “pathogenesis”. Therefore, the authors need to clarify correctly and state whether the study is focused on progression (worsening) rather than pathogenesis (cause) or both. See this statement in the abstract for example: “This study aimed to investigate the role of protein glycosylation modification in DKD progression and its association with gene expression changes, with the goal of identifying diagnostic biomarkers and personalized therapeutic targets.” If the study did not entertain the different DKD stages, it may be more appropriate to rephrase to “…in DKD pathogenesis…”

It would be beneficial if the paper clarifies that the study focuses on “abnormal or aberrant Protein glycosylation” as it concerns the pathogenesis of DKD. This will help the readers to better understand the pathogenesis as different from the normal process. In some places the authors wrote the role of Protein glycosylation modification implicated in DKD. Protein glycosylation modification (a form of post-translational modification) is known to be a normal/physiological process, hence an abnormality of the process may likely bring about DKD. Additionally, the use of “modification” as in Protein glycosylation modification appears redundant and advised to be removed, because, although correct, it may create confusion to suggest an abnormality. “Protein glycosylation” may present a simple and clearer picture. For example, refer to conclusion in the abstract:

1. “This study highlights protein glycosylation as a key player in DKD and identifies six hub genes with potential as diagnostic biomarkers.” For instance, this might be better written as “This study highlights abnormal protein glycosylation as a key player in DKD and identifies six hub genes with potential as diagnostic biomarkers”. This ensures clarity

2. The present study utilized bioinformatics integrated with machine learning to identify the genes and their regulatory networks associated with protein glycosylation modification, which plays a pivotal role in DKD. [Line 83-85]

3. This study investigates the key role of protein glycosylation modification in the pathogenesis of DKD by integrating bioinformatics and machine learning techniques. The results suggest that glycosylation modification plays a crucial role in the occurrence and progression of DKD. [Line 672-675]

In a bid to improve clarity, some sentences may need to be simplified/ rewritten:

1. “Then, we analyzed the ROC curve and evaluated the AUC values for their diagnostic efficiency in DKD, EXT1, SPTB, ADAMTS1, FMOD showed high diagnostic accuracy, while S100A12 and SBSPON, which is lower, still indicated certain diagnostic performance”. [Line 647-650]

2. “And O-linked glycosylation-SP1 modulating ENTPD5 expression through a negative feedback mechanism” [Line 46-47].

Methods: The authors did feature selection using LASSO, RF, SVM which is appropriate. The authors, however, should clearly state whether the data was split into training and testing sets since these are relevant in machine learning models (RF and SVM) to ensure generalizability of the models and strengthen the methodology. It’s also important to mention whether cross validation was done to ensure model robustness which strengthens that the result is biologically significant rather than dataset-dependent. If these were not and cannot be addressed, they should be considered as limitations.

The result section could be revised to focus on the direct observations by presenting and explaining only the results with further discussion of the implications of the results reserved for the discussion section.

1. Significant enrichment in cellular components such as extracellular matrix containing collagen suggests abnormal changes during renal tissue remodeling. [Line 292-294]

2. These pathways may be related to abnormalities in glycosylation-modified protein functions, affecting the pathological progression of DKD. [Line 302-304]

3. These results not only reveal the potential mechanism of gly-DEGs in DKD but also provide possible research directions for future clinical applications, especially in the screening of therapeutic targets and diagnostic markers. [Line 306-309]

4. These results indicate that our classification method can effectively identify the DKD heterogeneity in patients at the molecular level, revealing significant biological differences between the different subtypes. [Line 346-349]

5. This molecular subtyping not only helps deepen our understanding of the pathogenesis of DKD but also provides important clues for the future development of targeted diagnostic and therapeutic strategies.[Line 357-360]

Discussion section: the authors need to emphasize the novelty of the study findings as well as their clinical significance. What aspects of the study are current or novel?. Highlight new things that were not known, if any and compare/contrast with existing literature. How have these findings changed our understanding of DKD?.

Suggested Reference: I advise that the authors refer to this article as it is similar to the paper under review: Fu, S., Cheng, Y., Wang, X., Huang, J., Su, S., Wu, H., Yu, J., & Xu, Z. (2022). Identification of diagnostic gene biomarkers and immune infiltration in patients with diabetic kidney disease using machine learning strategies and bioinformatic analysis. Frontiers in Medicine, 9. https://doi.org/10.3389/fmed.2022.918657

Reviewer #3: The submitted manuscript by Ziyang et al. utilized bioinformatic tools, machine learning, and gene expression analysis and identified 6 important hub genes implicated in DKD pathogenesis. This study shows the important role of glycosylation modification in (DKD) progression and will aid in increasing and improving personalized therapeutic targets. The findings of this study still need to be validated to unravel the molecular mechanism of the identified genes in DKD. However, minor revisions should be made before consideration.

The authors should spell out all abbreviations utilized in the study.

In the abstract

Under “results”, there is a repeated full stop in one of the sentences.

In the introduction

Some sentences in the introduction are not clear. Some combined sentences should be broken down for clarity. See lines 36 to 42, page 5; line 46, page 5 starts with “and”. This should be corrected, and the sentence should be reworded. Lines 47 to 50, page 6, are also not clear. Also, page 7.

The claim that protein glycosylation affects DKD progression through inflammatory response and oxidative stress should be properly cited by referring to the primary sources. The cited papers 15 and 16 are reviews.

In the materials and methods,

The sources for all kits and materials used should be included. The versions of the software packages used should also be reported.

Methods 1.8 Immunohistochemistry and 1.9 Enrichment analysis should be rewritten in reported form.

The specific antibodies used for all the assays should also be reported. A list of the primers used should also be provided.

In the Results

A workflow of the processes as a figure is granted

The points of each replicate for the qPCR results should be included in Figure 8. Also, “Con” should be spelled in full as control. The statistical differences for the target genes between DKD and controls should also be indicated in the figure

Although ADAMTS1 and S100A12 showed no significant difference in the gene expression analysis, the results should still be included in Figure 8.

In the discussion,

Although the gene expression analysis from this study showed no significant difference in S100A12 expression, the authors discussed its upregulation in other studies without stating any justification or rationale for the difference observed in their study. This should also be addressed in the discussion. The same is true for ADAMTS1

Since EXT1 is highly upregulated in DKD with potential diagnostic properties, further studies on this gene should be proposed as part of further studies.

**********

what does this mean? ). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org

Revision 1

Dear Editors and Reviewers,

We sincerely appreciate the reviewers' constructive comments and the editor's thoughtful suggestions. We have carefully revised the manuscript based on these feedbacks. Below is our point-by-point response to the reviewers' comments.

Reviewer #1

1.Data Availability and Transparency

The raw gene expression datasets (GSE96804, GSE104948-GPL22945, GSE30528) analyzed in this study are publicly available through the NCBI Gene Expression Omnibus (GEO) database. Glycosylation-related gene sets were retrieved from the MSigDB database. To ensure reproducibility, the batch-corrected and normalized merged dataset (merge_DN.txt) has been uploaded as supplementary material. This approach aligns with similar studies that utilize standardized bioinformatics workflows for data integration and normalization.

For experimental validation, raw qPCR Ct values, fold-change calculations, and histopathology data from the DKD mouse model are available upon reasonable request. raw qPCR Ct values has been uploaded as supplement. A formal Data Availability Statement (DAS) will be included in the final manuscript, directing readers to supplementary files for access to processed datasets and experimental data.

2.Statistical Rigor in Differential Gene Expression Analysis

Pre-correction PCA: PC1 explained 91.6% of the variance, indicating strong technical batch effects. PC2 accounted for 6.6% of the variance. Post-correction PCA: After applying the limma and sva packages, PC1 variance dropped to 13.8%, and PC2 increased to 9.1%, reflecting reduced technical variability and improved alignment of samples based on biological differences. These results confirm that batch correction effectively mitigated non-biological variability, allowing downstream analyses to focus on biologically relevant signals. We will revise the Methods section to include these quantitative details and explained pre- and post-correction.

We acknowledge that this threshold balances sensitivity (detecting true positives) and specificity (minimizing false positives) and have strengthened our rationale as follows: 1. Biological Relevance: In complex diseases like DKD where subtle changes may lead clinical significance. Similar thresholds(|logFC| > 1 and |logFC| > 0.5) have been successfully applied in prior studies of DKD, supporting our methodological consistency[1][2]. 2. Statistical Rigor: We combined this threshold with an adjusted p-value < 0.05 (Benjamini-Hochberg correction) to ensure robust statistical significance while controlling for false discoveries. 3. Downstream Validation: The selected DEGs demonstrated strong enrichment in DKD-relevant pathways (e.g., PI3K-Akt signaling, ECM remodeling) and were validated via qPCR and ROC analysis (AUC > 0.7 for hub genes). This confirms the functional relevance of the threshold.

3.Enrichment Analysis (GO & KEGG Pathways)

In all GO and KEGG analyses, we explicitly applied Benjamini-Hochberg (BH) correction to control the false discovery rate (FDR), as stated in the revised Methods (Section 1.2). Adjusted p-values (p.adjust < 0.05) were used to define statistical significance for both DEGs and gly-DEGs. For example, gly-DEGs were filtered with p.adjust < 0.05 (Figures 3D, 4C), and pathway enrichments (e.g., heparan sulfate biosynthesis, p.adjust = 0.003) adhere to this threshold. This ensures rigorous control of multiple hypothesis testing.

To complement overrepresentation analysis (ORA), GSEA was performed on hub genes using Reactome pathways (Section 1.9). Key pathways (e.g., extracellular matrix remodeling, NES = 2.15, p.adjust = 0.008) validated ORA results, demonstrating consistency between methods. Additionally, single-gene GO/KEGG analyses (Supplementary File S2) further confirmed hub genes’ roles in immune infiltration and fibrosis (e.g., leukocyte migration, p.adjust = 0.003). These integrated approaches strengthen the reliability of our findings.

4.Machine Learning-Based Hub Gene Selection

Our study prioritized glycosylation-related molecular mechanisms in DKD, with immune infiltration serving as a secondary exploratory component. Given the non-tumor context and limited sample size (n=108 total samples across datasets), we employed ssGSEA—a widely used and robust method for immune cell scoring—to avoid overcomplicating the analysis. While tools like CIBERSORT or xCell are valuable, they require larger sample sizes for stable deconvolution and are optimized for tumor microenvironments. Our preliminary results (Supplementary Figure S3) highlighted significant differences in macrophage/neutrophil activity between DKD and controls, validated by immunohistochemistry (Section 2.8). These findings align with the study’s focus on glycosylation-driven pathways rather than immune heterogeneity.

The reviewer’s concern about multiple testing correction is valid. However, in our immune correlation analysis (Section 1.9), strict FDR adjustment (e.g., Benjamini-Hochberg) was intentionally omitted due to the small sample size and exploratory nature of this analysis. Applying FDR correction here would have increased the risk of false negatives, masking biologically plausible immune-gene correlations (e.g., hub genes like S100A12 and EXT1 showed trends with neutrophil infiltration). We acknowledge this limitation in the revised Discussion (Section 4.4) and emphasize that future studies with larger cohorts will incorporate cross-method validation (e.g., TIMER) and stringent FDR control to enhance robustness.

5.Immune Cell Infiltration Analysis (ssGSEA)

Our study focused on glycosylation-related mechanisms in DKD pathogenesis, with immune infiltration as an exploratory component. Given the non-tumor context and limited sample size (n=108 across datasets), we selected ssGSEA for immune scoring due to its robustness in small cohorts and avoidance of tumor-specific deconvolution biases inherent to tools like CIBERSORT/xCell. This approach prioritized simplicity and relevance to our biological focus (Section 1.9).

The correlation analysis between hub genes and immune cells (Section 2.9) intentionally omitted strict FDR correction. In small cohorts, FDR adjustment can overly suppress biologically plausible signals (e.g., S100A12-neutrophil correlations, p=0.02 uncorrected). We have explicitly acknowledged this limitation in the revised Discussion (Section 4.4) and emphasized that future studies with larger cohorts will apply cross-method validation (e.g., TIMER) and stringent FDR control.

While alternative methods (e.g., CIBERSORT) were not compared, our ssGSEA results aligned with immunohistochemical validation of immune infiltration (Figure 8, CD45+ cell quantification). This consistency supports the validity of our exploratory findings within the study’s scope.

6.Experimental Validation (qPCR & Mouse Model)

We used C57BKS db/db mice, a widely recognized model for DKD, and confirmed the successful establishment of the DKD model through immunohistochemistry. Compared to the high genetic heterogeneity in DKD patients, this model has lower genetic variability, and three samples per group meet the basic statistical requirements.

The qPCR results are consistent with the bioinformatics predictions, supporting the statistical significance among the groups with the current sample size. This indicates that our findings are reliable and representative.

We acknowledge that a larger sample size would enhance statistical power. However, due to the 16-week duration of DKD model development and ethical/resource constraints, expanding the cohort is challenging. We commit to supplementing the qPCR results with additional samples after completing the ongoing 16-week DKD model replication. Additionally, we will consider making the raw qPCR data available.

Reviewer #2

1.Clarity and Terminologies

Comment 1.1: Title adjustment to reflect focus on glycosylation-related genes

Response: We agree with the reviewer’s suggestion. The title has been revised to better reflect the gene-centric focus of the study: Revised Title: "Integrating Bioinformatics and Machine Learning to Elucidate the Role of Glycosylation-Related Genes in the Pathogenesis of Diabetic Kidney Disease"

Comment 1.2: Clarify "pathogenesis" vs. "progression"

Response: We acknowledge the distinction between "pathogenesis" (mechanisms underlying disease onset) and "progression" (disease advancement). Our study primarily investigates molecular mechanisms driving DKD development, not longitudinal changes across stages. We have revised the terminology throughout the manuscript:

Abstract: "This study aimed to investigate the role of aberrant protein glycosylation in DKD pathogenesis and its association with gene expression changes..."

Introduction: "...exploring the role of dysregulated glycosylation in DKD pathogenesis..."

Comment 1.3: Use of "abnormal protein glycosylation"

Response: We thank the reviewer for highlighting this critical point. We have clarified that the study focuses on aberrant glycosylation (a pathological process) rather than physiological glycosylation. Revisions include:

Abstract (Conclusions): "This study highlights abnormal protein glycosylation as a key player in DKD..."

Comment 1.4: Simplify redundant phrases

Response: We removed redundant terms like "modification" where appropriate to enhance clarity:

Original: "protein glycosylation modification"

Revised: "protein glycosylation"

2.Methods

Comment 2.1: Training/testing splits and cross-validation

Response: Due to the limited sample size of publicly available DKD datasets (total n=108n=108), splitting the data into training and testing sets would have reduced statistical power. Instead, we employed 10-fold cross-validation during model training to ensure robustness. This is now explicitly stated in the Methods section (Section 1.5): "LASSO, SVM, and random forest models were trained using 10-fold cross-validation to minimize overfitting and enhance generalizability."

The limitation of sample size is acknowledged in the Discussion: "Future studies should validate these hub genes in larger cohorts to address the current sample size constraints."

3.Results

Comment 3.1: Separate observations from implications

Response: We restructured the Results section to present findings without interpretive statements.

4.Discussion

Comment 4.1: Emphasize novelty and clinical significance

Response: We expanded the Discussion to highlight the study’s novelty:

“This study pioneers a novel molecular subtyping framework for DKD by integrating glycosylation-related genes through bioinformatics and machine learning approaches[59]. Unsupervised clustering identified two distinct DKD subtypes that resolve disease heterogeneity, enabling personalized therapies such as antifibrotic agents for ECM-dominant patients. The identified hub genes demonstrated enhanced diagnostic performance (AUC >0.7) compared to conventional biomarkers. Mechanistically, we revealed glycosylation dysregulation drives macrophage/neutrophil activation, establishing a "glycosylation-immune crosstalk" axis. These findings reframe DKD pathogenesis as a glycosylation-mediated imbalance between extracellular matrix dynamics and immune regulation, providing precise therapeutic targets for precision medicine.”

Comment 4.2: Comparison with existing literature

Response: We incorporated the suggested reference (Fu et al., 2022) and contrasted our findings: While Fu et al. (2022) identified immune-related biomarkers in DKD using similar methods, our study uniquely links glycosylation-related genes to extracellular matrix dysregulation, offering complementary insights into DKD pathogenesis.

Reviewer #3

1. Abbreviations

Comment: Spell out all abbreviations at first mention. Response: All abbreviations have been defined at first use.

2. Abstract

Comment: Remove the repeated full stop in the Results section. Response: The duplicate punctuation in the Results section has been corrected.

3. Introduction

Comment: Clarify sentences in lines 36–42, 46, and 47–50 (now lines 44–54 in the revised manuscript).

Response: Sentences have been restructured for clarity:

Original (lines 44–46): "And O-linked glycosylation-SP1 modulating ENTPD5 expression through a negative feedback mechanism[8]"

Revised: "ENTPD5-regulated N-glycosylation of proteins in the ER to promote cell proliferation in the early stage of DKD, and continuous hyperglycemia activated the hexosamine biosynthesis pathway (HBP) to increase the level of UDP-GlcNAc, which driving a feedback mechanism that inhibited transcription factor SP1 activity to downregulate ENTPD5 expression in the late stage of DKD."

Comment: Replace citations to reviews (references 15–16) with primary sources for claims about glycosylation, inflammation, and oxidative stress.

Response: Primary studies have been added to citations:

Original: "Protein glycosylation maintains... by mediating inflammatory response and oxidative stress[15][16]."

Revised: [17] Radovani B, Gudelj I. N-Glycosylation and Inflammation; the Not-So-Sweet Relation. Front Immunol. 2022 Jun 27;13:893365. doi: 10.3389/fimmu.2022.893365. PMID: 35833138; PMCID: PMC9272703.

4. Materials and Methods

Comment: Report sources for kits/antibodies and software versions.

Response: Details added:

RNA extraction kit: FastPure Complex Tissue/Cell Total RNA Isolation Kit (Vazyme, cat# RC113-01).qPCR master mix: ChamQ Universal SYBR qPCR Master Mix (Vazyme, cat# Q711-02).

Primary antibody: CD45 (Abcam, cat# ab10558, 1:200 dilution).Software versions: R 4.2.3, limma 3.54.2, glmnet 4.1.7.

Comment: Rewrite Methods 1.8 (Immunohistochemistry) and 1.9 (Enrichment analysis) in reported form.

Response: Passive voice revised to active voice:

Original (Immunohistochemistry): "Fix kidney tissues in 4% paraformaldehyde..."

Revised: "Kidney tissues were fixed in 4% paraformaldehyde..."

5. Results

Comment: Include a workflow figure. Response: A workflow diagram has been added to the Supplementary Material, summarizing data acquisition, clustering, machine learning, and validation steps.

Comment: Add data points, label "Con" as "Control," and indicate statistical significance in Figure 8. Include ADAMTS1 and S100A12.

Response:

Revised:

"Con" replaced with "Control."

Significance markers added (ns: p>0.05; *: p<0.05; **: p<0.01; ***: p<0.005).

ADAMTS1 and S100A12 results are included (no significant differences noted).

6. Discussion

Comment: Address discrepancies in S100A12 and ADAMTS1 expression compared to prior studies. Propose further studies on EXT1.

Response: Added to Discussion:

"This inconsistency may arise from limitations in our animal model, including the small sample size (n=3 per group), which reduces statistical power to detect subtle changes, and species-specific regulatory mechanisms inherent to murine models. Additionally, disease stage heterogeneity (e.g., early vs. advanced DKD) and transient expression dynamics during pathological progression could mask gene activity."

"EXT1, a key enzyme in heparan sulfate biosynthesis, demonstrated robust diagnostic potential (AUC = 0.92). Future studies should explore its role in extracellular matrix remodeling and therapeutic targeting."

7. Primer List

Comment: Provide primer sequences.

Response: A primer table has been added to the Methods (Section 1.7)

We thank the reviewers for their valuable input. Please let us know if any additional revisions are needed. We are happy to further improve the manuscript.

Sincerely,

Ziyang Liu, Zengyuan Qin, Wenxin Bai, Shasha Wang, Chunling Huang, Na Li, Lei Yan, Yue Gu, Fengmin Shao

References

1.Fu S, Cheng Y, Wang X, Huang J, Su S, Wu H, Yu J, Xu Z. Identification of diagnostic gene biomarkers and immune infiltration in patients with diabetic kidney disease using machine learning strategies and bioinformatic analysis. Front Med (Lausanne). 2022 Sep 29;9:918657. doi: 10.3389/fmed.2022.918657. PMID: 36250071; PMCID: PMC9556813.

2.Chen H, Su X, Li Y, Dang C, Luo Z. Identification of metabolic reprogramming-related genes as potential diagnostic biomarkers for diabetic nephropathy based on bioinformatics. Diabetol Metab Syndr. 2024 Nov 28;16(1):287. doi: 10.1186/s13098-024-01531-5. PMID: 39609849; PMCID: PMC11603941.

Attachments
Attachment
Submitted filename: Response to Reviewers.docx
Decision Letter - Yusuf Ayipo, Editor

Dear Dr. Liu,

Please submit your revised manuscript by Aug 07 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Yusuf Oloruntoyin Ayipo, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

Kudos to the authors for responding positively to the initial queries. No doubt, the quality of the submission has improved significantly. However, some concerns have been raised affecting some sections of the manuscript. I hereby recommend another round of revision to address the current concerns and reserve my final decision until they are resolved.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions??>

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously? -->?>

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available??>

The PLOS Data policy

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English??>

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

Reviewer #1: (No Response)

Reviewer #2: The authors have made most of the necessary corrections but for one of the comments, they stated: "This is now explicitly stated in the Methods section (Section 1.5): "LASSO, SVM, and random forest models were trained using 10-fold cross-validation to minimize overfitting and enhance generalizability." However I did not see this in the section 1.5 quoted

Reviewer #3: (No Response)

**********

what does this mean? ). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org

Revision 2

Dear Dr. Ayipo and Reviewers,

Thank you for your careful consideration of our manuscript and the constructive feedback. We appreciate the opportunity to address the remaining concern raised in the second round of review.

Response to Reviewer #2. Comment: The authors stated that 10-fold cross-validation was added to the Methods section, but this was not found in the quoted Section 1.5.

We apologize for the oversight in the previous revision. The Methods section has been explicitly updated to describe the application of 10-fold cross-validation in all machine learning models. The relevant modification is as follows:

Revised in Section 1.5 "Identification of Hub Genes by Machine Learning":

"LASSO regression analysis was performed on gly-DEGs using the R package 'glmnet' to screen out genes with significant features. To minimize overfitting and enhance model generalizability, all machine learning models (LASSO, random forest, and SVM) were trained using 10-fold cross-validation, where the dataset was repeatedly partitioned into 10 subsets for iterative training and validation. Then, the 'randomForest' package was used to build a random forest model... " (Lines XX-XX, revised manuscript with track changes).

This addition clarifies the rigorous validation process applied to ensure the robustness of our machine learning results.

Response to Other Reviewers. Reviewer #1 and Reviewer #3 indicated that all previous comments were adequately addressed, and no further revisions were required for their concerns. We have double-checked the manuscript to ensure consistency with their suggestions.

Data Availability and Technical Accuracy. The data availability statement and statistical analyses remain consistent with PLOS ONE’s requirements, and all modifications are tracked in the revised manuscript.

Thank you again for your guidance in improving our work. We believe these revisions enhance the rigor and transparency of our study. Please let us know if any further adjustments are needed.

Sincerely,

Ziyang Liu et al.

Attachments
Attachment
Submitted filename: Response to Reviewers2.docx
Decision Letter - Yusuf Ayipo, Editor

Integrating Bioinformatics and Machine Learning to Elucidate the Role of Protein Glycosylation-Related Genes in the Pathogenesis of Diabetic Kidney Disease

PONE-D-25-05847R2

Dear Dr. Liu,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager®  and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Yusuf Oloruntoyin Ayipo, Ph.D

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

The submission meets the level of scientific rigour required for publication in this title and all the concerns raised by the respective reviewers have been addressed satisfactorily. I hereby recommend the manuscript for publication in the current version.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions??>

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously? -->?>

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available??>

The PLOS Data policy

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English??>

Reviewer #2: Yes

**********

Reviewer #2: (No Response)

**********

what does this mean? ). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #2: No

**********

Formally Accepted
Acceptance Letter - Yusuf Ayipo, Editor

PONE-D-25-05847R2

PLOS ONE

Dear Dr. Liu,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission,

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Yusuf Oloruntoyin Ayipo

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio .