Peer Review History

Original Submission
August 12, 2025
Decision Letter - Muhammad Ahsan, Editor

PONE-D-25-43831

H-NGPCA: Hierarchical clustering of data streams with adaptive number of clusters and adaptive dimensionality

PLOS ONE

Dear Dr. Migenda,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 08 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Muhammad Ahsan, Ph.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following financial disclosure:

 [This research was funded by the German Federal Ministry of Education and Research (BMBF) in the project VIP4PAPS, grant number 03VP10031. The sole responsibility for the content of this publication lies with the authors.]. 

Please state what role the funders took in the study.  If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

If this statement is not correct you must amend it as needed.

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

3. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance. We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process.

4. Please include a separate caption for each figure in your manuscript.

5. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise. 

Additional Editor Comments (if provided):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This manuscript introduces H-NGPCA, a hierarchical clustering algorithm for data streams that adaptively determines both the number of clusters and their local dimensionality. The method combines centroid-based (Neural Gas), model-based (PCA), and hierarchical clustering in a fully online manner. The work is technically sound, well-motivated, and addresses an important challenge in streaming data analysis. The experimental evaluation is thorough, and the visualizations effectively illustrate the algorithm’s behavior. However, several aspects require clarification and improvement to strengthen the contribution and ensure reproducibility.

The pseudo-code in Appendix B (Algorithm 2) is a valuable addition, but the main text does not sufficiently explain the flow of the algorithm. For example, the interaction between the tree traversal (lines 6–16) and the update steps (lines 17–27) is unclear. A high-level summary in Section 3.1 would help readers understand the end-to-end process before diving into details.

The authors rightly note that many online algorithms lack implementations, so they compare against offline methods. However, this puts H-NGPCA at a disadvantage. For fairness, consider including a streaming variant of a classic method (e.g., Streaming K-Means), even if it is not adaptive. In addition, I suggest that the authors discuss the applications of the method in more detail. See https://doi.org/10.1016/j.compbiomed.2023.107244 and DOI: 10.1109/ACCESS.2020.2970838.
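For orientation, the kind of non-adaptive streaming baseline mentioned above can be sketched as sequential (MacQueen-style) k-means, which updates one centroid per incoming point. This is an illustrative sketch only, not the baseline used in the manuscript; the function name `sequential_kmeans` and the toy stream are assumptions made for this example.

```python
import numpy as np

def sequential_kmeans(stream, k):
    """One-pass k-means: each point updates only its nearest centroid."""
    centroids, counts = [], []
    for x in stream:
        x = np.asarray(x, dtype=float)
        if len(centroids) < k:
            # Assumption: the first k points seed the centroids.
            centroids.append(x.copy())
            counts.append(1)
            continue
        dists = np.linalg.norm(np.stack(centroids) - x, axis=1)
        j = int(np.argmin(dists))
        counts[j] += 1
        # Running-mean update with learning rate 1 / counts[j].
        centroids[j] += (x - centroids[j]) / counts[j]
    return np.stack(centroids), counts

# Hypothetical toy stream alternating between two well-separated groups.
stream = [(0, 0), (10, 10), (1, 0), (11, 10), (0, 1), (10, 11), (1, 1), (11, 11)]
centroids, counts = sequential_kmeans(stream, k=2)
```

Unlike H-NGPCA as described in the reviews, this sketch needs the number of clusters `k` in advance and does not estimate any local dimensionality, which is why it would serve only as a fixed, non-adaptive point of comparison.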

Tables 1 and 2 show that H-NGPCA performs well, but the standard deviations are missing. Including these would help assess the stability of the results across multiple runs.
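A minimal sketch of the requested reporting, assuming one score per independent run, is given below. The scores are hypothetical placeholders, not values from the manuscript, and the helper name `summarize_runs` is an assumption for illustration.

```python
import numpy as np

def summarize_runs(scores):
    """Return the mean and sample standard deviation (ddof=1) over runs."""
    scores = np.asarray(scores, dtype=float)
    return scores.mean(), scores.std(ddof=1)

# Hypothetical scores from three repeated runs of one method on one dataset.
mean, std = summarize_runs([0.8, 0.9, 1.0])
table_entry = f"{mean:.2f} ± {std:.2f}"
```

Using `ddof=1` gives the sample standard deviation, which is the usual choice when the runs are treated as a sample of possible outcomes.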

The claim that H-NGPCA outperforms BIRCH and Affinity Propagation in some cases is supported, but the discussion should emphasize that these are offline methods. The online capability of H-NGPCA is a significant advantage that should be highlighted more clearly.

The complexity analysis in Appendix E.3 is detailed and correct. However, the per-data-point complexity does not account for the dimensionality adjustment (Algorithm 1), which involves Gram-Schmidt and eigenvalue regression. This should be included in the analysis.

The authors acknowledge that H-NGPCA cannot handle three collinear clusters (Fig. 12) and lacks a merge mechanism. These are significant limitations. A brief discussion of how a merge mechanism could be integrated (e.g., via a quality measure pruning branches) would strengthen the paper.

The algorithm currently uses five hyperparameters. While this is fewer than many streaming algorithms, a table showing the values used for each dataset (or a justification for fixed values) would aid reproducibility.

Reviewer #2: The paper is well-structured and clearly written, presenting a timely and relevant study on H-NGPCA. The hierarchical extension is explained coherently and supported by appropriate experimental validation. Here are some suggestions to further enhance the paper.

Abstract

• The abstract could be strengthened by more explicitly highlighting the novelty of the proposed method. Additionally, presenting key quantitative results would give readers a clearer and more concrete understanding of the contribution and its significance.

Introduction

• The introduction clearly outlines the problem statement; however, the review of related work is somewhat limited. A more thorough discussion of existing dimensionality reduction techniques (e.g., Kernel PCA, Autoencoders, Deep Learning–based methods, t-SNE, UMAP) is recommended. Furthermore, the manuscript should more explicitly articulate the advancement of H-NGPCA over conventional NGPCA, providing a stronger justification for the incorporation of the hierarchical structure.

• A comparison of these dimensionality reduction techniques, along with a clear justification of their relevance, would strengthen the discussion.

• To further strengthen the problem statement, it is advisable to include supporting references that validate and contextualize the identified research gap.

Methodology

• The methodological description is detailed and logically presented, with adequate mathematical formalization.

• Parameter choices, computational aspects, and algorithmic flow are well explained, ensuring reproducibility.

Results and Discussion

• The experimental evaluation is comprehensive, demonstrating clear improvements over conventional PCA and related methods.

• Results are supported by appropriate benchmarking and statistical validation.

Conclusion

• The conclusion effectively summarizes the contributions and highlights the practical implications of the work.

• Future directions are appropriately suggested, adding value to the study.

Reviewer #3: This paper proposes a new clustering algorithm, H-NGPCA, for data streams. It combines three ideas: a hierarchical structure, a centroid-based clustering algorithm (NG), and online principal component analysis within each cluster/unit. The proposed algorithm is parameter-free in clustering, meaning it can adaptively decide the number of clusters and the number of components in each cluster. Experimental results demonstrate its advantages over state-of-the-art clustering algorithms.

Overall, this paper is well presented. However, I have some comments for improvement in the attached PDF file.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachments
Attachment
Submitted filename: review-PONE-D-25-43831.pdf
Revision 1

We thank the reviewers and editor for their careful reading of our manuscript, and for their constructive comments.

We attached a file to provide a detailed response to each point raised.

Attachments
Attachment
Submitted filename: Response to Reviewers.pdf
Decision Letter - Muhammad Ahsan, Editor

H-NGPCA: Hierarchical clustering of data streams with adaptive number of clusters and adaptive dimensionality

PONE-D-25-43831R1

Dear Dr. Migenda,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Muhammad Ahsan, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: All comments have been addressed, so I recommend that this paper be published in PLOS ONE in its current version.

Reviewer #2: (No Response)

Reviewer #3: It is good to see that the authors have incorporated the suggestions from me and the other two reviewers. The manuscript looks ready for publication to me.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

Formally Accepted
Acceptance Letter - Muhammad Ahsan, Editor

PONE-D-25-43831R1

PLOS One

Dear Dr. Migenda,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS One. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Muhammad Ahsan

Academic Editor

PLOS One

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.