Peer Review History
| Original Submission July 19, 2024 |
|---|
|
PONE-D-24-28122 Examining heterogeneity in dementia using data-driven unsupervised clustering of cognitive profiles PLOS ONE Dear Dr. KUMAR, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. ============================== Review of PONE-D-24-28122
Overall, I think this paper is very useful and well-written. I recommend it for publication after the above 5 questions have been answered. ============================== Please submit your revised manuscript by Oct 12 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Matthew Cserhati, Ph.D Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 2. Thank you for stating in your Funding Statement: "The preparation of this report was supported by the Centene Corporation contract (P19-00559) for the Washington University-Centene ARCH Personalized Medicine Initiative. " Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now. Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement. 
Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf. 3. Thank you for stating the following in the Acknowledgments Section of your manuscript: "The preparation of this report was supported by the Centene Corporation contract (P19-00559) for the Washington University-Centene ARCH Personalized Medicine Initiative. " We note that you have provided funding information that is currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: "The preparation of this report was supported by the Centene Corporation contract (P19-00559) for the Washington University-Centene ARCH Personalized Medicine Initiative. " Please include your amended statements within your cover letter; we will change the online submission form on your behalf. 4. In the online submission form, you indicated that [Anonymized data not published within this article will be made available by request from any qualified investigator.]. All PLOS journals now require all data underlying the findings described in their manuscript to be freely available to other researchers, either 1. In a public repository, 2. Within the manuscript itself, or 3. Uploaded as supplementary information. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. 
If your data cannot be made publicly available for ethical or legal reasons (e.g., public availability would compromise patient privacy), please explain your reasons on resubmission and your exemption request will be escalated for approval. Additional Editor Comments: Review of PONE-D-24-28122 1. Section 2.4: how valuable do you think those data are from patients with single visits? Do you not think that single visits may just be anecdotal evidence, and thus have less statistical power? According to section 3.1, this covers 953 patients, more than half of all patients. Would the analysis be different if you excluded those patients with one visit only? With only one visit you cannot measure transition between stages (section 3.6). 2. Section 3.1: please describe what ICD is and what these codes mean in the text for those who do not know what they mean. 3. Section 3.3.1: why did you select k=10 optimal clusters? According to the scree plot the elbow portion seems to be at k=5. 4. Section 3.4: how can we tell from figure 4 which severity category (mild, severe, etc.) each cluster (C1–C10) belongs to? 5. In figure 5 clusters C5 and C7 look like they have similar composition of CDR 0.5 and 1, why not unite these 2 clusters? Same thing for C6 and C10. Overall, I think this paper is very useful and well-written. I recommend it for publication after the above 5 questions have been answered. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 2.
Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: No Reviewer #3: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: This paper uses a recent clustering algorithm (SillyPutty) to process and analyze dementia in a data set that includes many patients and several years of information. 
The authors deal with limitations that characterize previously published work and successfully provide results while considering the heterogeneity of the patients and their symptoms. Exploiting easily obtained measurements and features allows for a clear definition of dementia subtypes, as the adopted and enhanced data clustering approach specifies. Moreover, with their data organization and post-processing procedure, the authors explore and manage to connect the subtypes with the progression of dementia over time. It is not easy to find flaws in this interesting and high-quality work. The text is very well organized and easy to read. The work is technically sound and uses state-of-the-art methods and data of great scientific interest. The experimental results and the conclusions drawn are very clear. This work advances the current state-of-the-art and provides a strong motivation for applying the presented methodology at a significantly larger scale. Minor comments: Some typos must be corrected in the final version of the paper (e.g. patents instead of patients). The naming of the clusters (C1-C10) is based on the results of the clustering method. It might be beneficial if these names are mapped to a new set of names (e.g. D1-D10), where the index is following the dementia severity (i.e. C2->D1, C9->D2). CRAN also includes density-based clustering algorithms (e.g. DBSCAN) that do not require the user to specify the number of clusters. They might not be suitable for the specific dataset, but some reference to those algorithms, at least, would be useful. The results (percentages) in Table 2 (Cluster demographics) probably need clarification: For example, the percentage of White Race in Table 1 is 77%, but the overall corresponding percentage in Table 2 is much higher (its lowest value is 83.3% for variable C8). Is this due to the "baseline visits" of Table 1 and the "all visits" for Table 2?
Reviewer #2: The authors argue that the growth of Electronic Health Records has opened up new data-driven approaches to the heterogeneous trajectories associated with dementia. In their view, existing approaches have been limited by an emphasis on expensive neuroimaging and single time-point assessment. They, conversely, investigate the possibilities of trajectories in more widespread cognitive assessment tasks. Using the health records of visitors to a university memory clinic over a number of years, they first use unsupervised learning to identify different clusters of scores on a multidimensional assessment instrument and then recreate trajectories in terms of transitional probabilities between these clusters. I think the authors make a good case for the value of this project. If useful information on dementia trajectories can be extracted from widely-used tests, it would provide an efficient way to ground better assessments of risk and (potentially) personalization of treatments. I also applaud the authors' clear and concise explanation of their approach and procedures. The paper was to the point, focused, and enjoyable to read. 1) In terms of minor points, there were a couple of places where things could be explained a little further. a. The description of measures on pg 5 is not easy to understand; I was unclear if the history / neurological examination, the cognitive assessment battery and the CDR were three separate things or not. The list on pg 6 makes it much clearer, so I'd suggest moving some of this information up to the earlier section. b. Similarly, describing the dataset in the materials section without giving sample sizes (which came in the Results section) threw me a bit. I think it would be more comprehensible if the size of the data is mentioned earlier. c. The data period on pg 5 seems to be five years, but on pg 8 is described as six years. d. 
The rationales for clustering on CDR components (pg 10) and for choosing 10 as the optimal number of clusters (pg 11) could be filled out a bit more. Right now a few considerations are given, but I didn't quite understand what the deciding factor was in these decisions. e. The strategy for examining cluster transitions was under-described. First, I didn't see mention of positive transitions (ie from more severe to milder clusters) - an examination of Table S4 suggests these are quite rare, so I assume it was a deliberate analytic choice to focus on the more common case of increasingly severe impairments. Second, the analysis is currently at the level of raw descriptive statistics; it would be useful to translate these into transitional probabilities and perform inferential testing. 2) However, I feel the greatest weakness of the paper was in not making the case for its clustering approach (as compared to other modelling approaches). The authors make a comparison of different clustering algorithms, but not different types of model. There are a couple of salient alternative approaches, depending on the key research goal. If the aim is to better understand trajectories, latent growth curve methods or hierarchical linear models may be relevant. If the goal is to predict risk, supervised learning methods such as regularized regression may have value (predicting changes from visit to visit). The key value of a clustering approach over these approaches is its ability to model heterogeneity; rather than a single spectrum of severity, clustering can capture common profiles of scores across a number of dimensions. However, broadly, the resulting clusters look like a spectrum of increasing severity (see eg Table 3); there is some heterogeneity (eg transitions from mild to moderate CDR scores were more likely for those with more functional impairment), but this seems minor and reduces as the condition gets more severe.
As such, I wonder what added value the clustering approach is providing in this particular case. Why not just model global CDR scores as the key outcome of interest, and then use CDR component and cognitive battery scores as additional predictors? Ultimately it seems like this is where the clustering analysis ends up. Reviewer #3: After thoroughly reviewing the manuscript, I would like to commend the authors for their innovative approach and the valuable insights provided into dementia heterogeneity. However, several key methodological and analytical aspects require significant enhancement to strengthen the validity and impact of the findings. The current approach, while promising, would benefit greatly from incorporating additional analyses and more rigorous justifications. I recommend a major revision to address these critical points and improve the overall robustness and clarity of the study. 1. The authors chose to use the six CDR components as features for clustering rather than the cognitive assessment scores, based on observation that CDR components better differentiate the patient clusters. But t-SNE is primarily a tool for data visualization (reducing data dimensionality to visualize the structure in a way that preserves the local distances between points), it is not a traditional feature selection method and doesn’t provide a direct measure of the importance or variance explained by different features. I suggest the authors consider supplementing with PCA. If PCA on both cognitive scores and CDR components show that the first few principal components of the CDR data explain a significant proportion of the variance (more so than the cognitive scores), then this would quantitatively support using CDR components as the primary features for clustering. 
Or conversely, if (cognitive scores) or (cognitive scores + CDR components) explained more variance, then the authors might reconsider their features to include those that contribute most to distinguishing between patient subtypes. 2. The decision to combine SillyPutty with hierarchical clustering is well-justified based on previous studies. But can the authors provide more details on the initialization parameters, the number of iterations, and how convergence was assessed etc.? 3. The SillyPutty approach is innovative, but I think it’s important to understand how it compares with other established clustering methods more widely recognized in the ML community. E.g. k-means, which is a common baseline method in clustering tasks with presumably much smaller computational costs than SillyPutty, or DBSCAN which is suitable for irregularly shaped clusters and can handle noise or outliers well (relevant in clinical data where not all data points neatly fit into clusters). An informative addition would be to compare the average silhouette width of SillyPutty against DBSCAN, k-means etc, so we know whether potentially simpler methods can achieve comparable results. If SillyPutty shows clear advantages over other methods in the context of dementia research, this could position it as a valuable tool for other researchers/clinicians. Conversely, if other methods perform similarly or better in some respects, then SillyPutty could be more useful in specific scenarios or with certain types of data. 4. Relying solely on silhouette width might give an incomplete picture of the clustering. Including multiple metrics e.g. DBI, Dunn, would make the manuscript’s argument for the effectiveness of SillyPutty more convincing by showing rigorous evaluation from multiple angles, not just a single metric. 5.
Treating each visit as an independent data point in the clustering process is potentially problematic, because in a clinical context (especially with chronic and progressive conditions like dementia), the cognitive status of a patient at one visit is likely influenced by their status at previous visits. The authors should acknowledge the potential limitations of this assumption of independence, and discuss how this might affect the interpretation of their findings. The authors could also consider highlighting alternative approaches like HMM as avenues for future research, which would demonstrate a forward-thinking approach and acknowledge the complexities of analyzing longitudinal clinical data. 6. The manuscript identifies heterogeneity in cognitive profiles, particularly in the early stages of dementia. I believe the discussion could be more nuanced in exploring e.g. why certain clusters exhibit different progression risks despite similar initial profiles. What about potential confounding factors that could influence these findings, such as comorbidities, medication effects or SES? 7. The study's findings have implications for personalized dementia care, particularly in identifying patients at higher risk of rapid progression. And the manuscript could better connect these findings to specific clinical interventions or decision-making processes, e.g., how could identifying "progressive MCI" subtypes influence treatment plans or monitoring strategies? ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: Yes: Panagiotis Hadjidoukas Reviewer #2: No Reviewer #3: No ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 1 |
|
Examining heterogeneity in dementia using data-driven unsupervised clustering of cognitive profiles PONE-D-24-28122R1 Dear Dr. KUMAR, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Matthew Cserhati, Ph.D Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1.
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #3: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #3: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #3: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #3: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #3: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #3: (No Response) ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #3: No ********** |
| Formally Accepted |
|
PONE-D-24-28122R1 PLOS ONE Dear Dr. KUMAR, I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team. At this stage, our production department will prepare your paper for publication. This includes ensuring the following: * All references, tables, and figures are properly cited * All relevant supporting information is included in the manuscript submission, * There are no issues that prevent the paper from being properly typeset If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps. Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. If we can help with anything else, please email us at customercare@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Matthew Cserhati Academic Editor PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.