Peer Review History
Original Submission: January 30, 2025
PONE-D-25-02516
Behavior Classification: Introducing Machine Learning Approaches for Classification of Sign-Tracking, Goal-Tracking and Beyond
PLOS ONE

Dear Dr. Huppé-Gourgues,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, the reviewers and I feel that it has merit but will require some revisions to meet PLOS ONE’s publication criteria. We invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

ACADEMIC EDITOR:
Please submit your revised manuscript by Apr 04 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols . We look forward to receiving your revised manuscript. Kind regards, Rita Fuchs Academic Editor PLOS ONE Journal Requirements: 1. When submitting your revision, we need you to address these additional requirements. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse. 3. 
Thank you for stating the following financial disclosure: [This research was supported by the Natural Sciences and Engineering Research Council of Canada Graduate Scholarship to C.G. and by the Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC RGPIN-2018-06285) to F.H-G.]. Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed. Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf. 4. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance. We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions.
Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The authors put forward a comparison of k-Means clustering and a derivative-based method for classifying subgroups of Pavlovian conditioning: goal tracking, sign tracking, and indeterminate. The authors do well at outlining the current limitations of how PC subgroups are classified. However, without a ground truth it is difficult to assess the success of any classification method. Therefore the authors rely on arguments of features like stability and a call for future work to carefully consider classification methods rather than following a ‘standard’ or ‘predefined’ cutoff. Although the authors rigorously explore two new methods for classification, there are some details left unclear and theoretical considerations that should be resolved before publication. Data availability: The authors state that all data will be made available after acceptance, but there is currently no URL, accession number, or DOI associated with the data. Major points: Lines 121-123: Authors argue that the Meyer cutoff values result in uneven numbers of subjects across groups and thus will complicate hypothesis testing. It is unclear that the statistical inconvenience warrants altering the metric which may be accurately classifying subjects (i.e., is there a principled reason to believe that the subgroups should have the same number of subjects?) Line 288: The authors use the ksdensity function to approximate their density probabilities. It seems that the assumption is that the data will be bimodal and not normally distributed. In this case the bandwidth chosen for the function could have a large impact on the results. What bandwidth did the authors use and why? If they tried other bandwidths, did that have meaningful shifts in their results? Last, the data is bounded [-1 1] but it is not clear if the authors used a bounded support with ksdensity. 
The code provided suggests that the default bandwidth and support were used. Lines 343-347: I’m unclear what is being said here. Do the authors mean that Days 1 and 2 combined were different from all other days individually, but were themselves not different from each other? Likewise, were days 3 and 4 individually different from day 6? Perhaps a visualization of these data such that significance bars can be seen would help clarify the differences (e.g., violin plots with lines and asterisks above the data significantly different from each other). Last, it is not clear how these individual differences were detected; post-hoc t-tests? Throughout the analysis the authors sometimes pool data (e.g., line 354) and sometimes take the average (e.g., line 367). It would be helpful to provide a rationale for when one method is more appropriate than the other. Why was pooled data used for the k-means clustering, but not the derivative method? The frequencies of observations (lines 412-431) are reported in terms of whether significant differences were found, but the actual frequencies are not reported in either a table or figure. Lines 449-450: the authors claim that the two methods “...primarily affect the classification of individuals located in close proximity to group boundaries”. What is this based on? Figure 5: my interpretation of this figure is that the authors are showing the mean and 95% CI PCA score for each group after classification with the three methods. If this is the case, then I am not clear on how there could be PCA scores classified as, for example, GT by the K-means clustering on day 1 if the K-means cutoff value for GT on day 1 is <-0.44 (table 1) while the 95% CI goes above -0.4. If these classifications were obtained from the pooled clustering, then the problem seems even worse as that cutoff was <-0.22 which encompasses likely all the IN data and a large chunk of ST. 
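The bandwidth-sensitivity concern raised above (the Line 288 comment) can be made concrete: the location of the density valley between two modes, and hence any cutoff derived from it, can shift with the smoothing bandwidth. A minimal sketch in Python with SciPy on synthetic scores (all values hypothetical; the original analysis used MATLAB's ksdensity, not this code):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical bimodal PCA-Index-like scores bounded in [-1, 1]:
# one mode near -0.6 (GT-like) and one near +0.6 (ST-like).
rng = np.random.default_rng(0)
scores = np.clip(np.concatenate([rng.normal(-0.6, 0.2, 60),
                                 rng.normal(0.6, 0.2, 60)]), -1, 1)

grid = np.linspace(-1, 1, 401)

def valley(bw_factor):
    """Location of the density minimum between the two modes for a
    given bandwidth factor (a multiplier on the rule-of-thumb width)."""
    kde = gaussian_kde(scores, bw_method=bw_factor)
    dens = kde(grid)
    inner = (grid > -0.4) & (grid < 0.4)  # search between the modes
    return grid[inner][np.argmin(dens[inner])]

# A narrow versus a wide bandwidth can place the valley (and therefore
# any minimum- or derivative-based cutoff) at different points.
print(valley(0.2), valley(1.0))
```

Note that gaussian_kde, like ksdensity with default settings, does not impose bounded support; probability mass smoothed past ±1 can also shift the estimated minima, which is the bounded-support concern above.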
The authors try a 3D k-means clustering, but state “...there were not three definite clusters…” (lines 480-481). How was this determined? If only visually, could one not make a similar argument for the distributions seen in Figures 5 and 6? Purely visually I might say there are perhaps 2 groups instead of 3. Further, did the authors try any non-linear methods like SVM to classify either the computed PCA scores or the 3 variables in the 3D clustering? The authors claim that the derivative method “...seems to provide a middle ground.” (line 510). It would be nice to see some exploration of the theoretical basis of this. Given two normal distributions, what percent of the data will fall in a tail defined by the derivative method? In other words, assuming normalcy, the first derivative will always be at the same z-score (±1) and thus cut off about 16% of the distribution. Is there a principled reason for believing that this cutoff will better approximate biologically relevant subgroups? Granted, these assumptions won’t hold in non-normal distributions which seem to be prevalent in these data. What do the authors think will be the consequence in bimodal distributions? Will the derivative find approximate half-way points between the two underlying distributions? Is that what we should expect to best separate the groups? A consistent motivation the authors give is the “high variability” (e.g., line 705) in the literature in defining subgroups in Pavlovian conditioning. However, it does not seem that their methods will necessarily solve this problem. Especially as the authors acknowledge that there is likely no one best method, even within a lab, and that cutoff values should be determined for each cohort.
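The “±1 z-score, about 16%” arithmetic above is easy to verify numerically. A short sketch under the stated normality assumption (Python/SciPy; purely illustrative):

```python
import numpy as np
from scipy.stats import norm

# For a standard normal density f, the first derivative is f'(z) = -z * f(z),
# and its magnitude peaks at the inflection points z = +/-1. A cutoff placed
# at the steepest slope therefore always lands one SD from the mean.
z = np.linspace(-4, 4, 8001)
slope = -z * norm.pdf(z)

z_star = abs(z[np.argmax(np.abs(slope))])  # location of steepest slope
tail = norm.sf(z_star)                     # mass beyond that cutoff

print(round(z_star, 3), round(tail, 4))   # prints: 1.0 0.1587
```

Under normality the derivative-based cutoff is thus pinned at one standard deviation, trimming roughly 16% per tail; as noted above, that guarantee does not carry over to the bimodal or skewed distributions prevalent in the actual data.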
They do mention that PCA scores tend to stabilize as sample sizes increase and on later conditioning days, but there should be some acknowledgement that either the k-means clustering or derivative method may still lead to variability in cutoff values across labs and experiments. Minor points: Line 65: By “positive outcome” are the authors referring to valence, an outcome in which something is added, or something else? Line 145: “...PCA scores tend to distribute in a bimodal, U-shaped fashion…” are there citations to back this up? The authors themselves later argue that the bimodality may depend on day of conditioning and number of subjects. Line 175: The number of each sex used for the modeling sample is not stated. Line 175: What other experiments did these rats contribute to? If they are published, provide citations, if not, provide some details to allay concerns that the other experiments may influence Pavlovian conditioning. Missing details on validation sample: were these rats also used in other experiments? were they trained during the same time frame as the modeling sample (June 2021 - July 2022)? Line 190: what is the breakdown by sex of the two cohorts in the validation sample? Line 203: assuming that each operant box only had one lever being used, were rats randomly assigned to a left or right lever in equal numbers? Line 234: “Each contact…” do the authors mean any kind of contact or a full lever press? Line 235-237: “We noted…the total number of consumed pellets” How was this determined? As a subtraction of pellets delivered and pellets remaining in the dispenser? Line 243: What do the authors mean by “response” if not lever presses or food cup entries? Line 265: Earlier in the methods the validation sample size was given as 34 rats (line 177) Line 341: Why was sex not considered a factor in the analysis of the modeling sample? Table 1: a row should be added for the results of the pooled days 1–6 k-Means clustering.
Line 433: I’m surprised the authors did not note that, despite both methods extracting “unstable cutoff values”, there were also very similar values. Lines 466-467: Says “...pooled data from all six days…” while the figure legend for figure 6 specifies “...Days 5 and 6…” (line 475). Line 471: “...delimited around ±0.3…”; doesn’t Table 1 say >0.31 and <-0.21 for pooled days 5 and 6? Line 497: “The validation sample did not include two parabolas…”; please provide a figure showing this. Lines 713-714: “These variations may be related to subclusters of PCA Index scores and could be further explored with cluster analysis.” Are the authors suggesting more than 3 clusters may exist? If so, how do they propose they are defined/explored given that k-means was already having issues with reliably finding 3 clusters? Reviewer #2: The manuscript by Godin and Huppe-Gourgues describes two approaches for behavioral characterization of individual variability. They apply these approaches to the sign-tracker/goal-tracker model and compare them to the classification method that has traditionally been applied to this model, the Pavlovian conditioned approach (PCA) index. The results suggest that the newly described methods of classification are effective tools for identifying sign-trackers and goal-trackers and may be especially useful for small sample sizes. Although the differences between these approaches and the traditional PCA index method were not striking, the newly described approaches could potentially provide a more standardized classification framework. This is a well-written, timely, and important manuscript that could be improved with attention to the points raised below. Line 28: The authors define ST and GT as sign-tracking and goal-tracking, respectively, but should consider defining as “sign-trackers” and “goal-trackers” to keep consistent with the relevant literature.
Line 32: The authors might consider changing “PCA Index” to “PavCA Index” so the readers do not confuse this terminology with principal components analysis (PCA). Line 66: The authors should note that predictive cues can act as incentive stimuli, but they do not always and often just for a subset of individuals (i.e., sign-trackers). Line 95-96: It is a misnomer to describe the PCA Index as a ratio of head entries to lever presses, when it is actually a composite index score, as described in the subsequent sentence. Line 166: It doesn’t seem all that common to food-restrict animals prior to assessing Pavlovian conditioned approach behavior in the sign-tracker/goal-tracker literature. Further, the effects of food-restriction on sign-tracking/goal-tracking behavior have been reported (e.g., Anderson et al., 2013 PMCID: PMC3845669 DOI: 10.1016/j.bbr.2013.09.021; Boakes Chapter, 1977). Given this, and the fact that the food-restriction data does not add much to the current manuscript, it is recommended that this dataset be removed from the current manuscript. Alternatively, it should be presented at the end of the manuscript, not at the beginning (see more below). Lines 173-175: The number of males and females in the n=189 sample should be specified. Line 201: What does PLA stand for? Line 264: It is stated here that the validation data n=58; whereas earlier in the methods (line 177) it states that n=34 for this sample. Please clarify. Figure 1: As suggested above, it is suggested that the effects of food availability either be removed or moved to a different section of the manuscript. It would be beneficial to the reader and the models if the initial data presentations focused on the typical Pavlovian conditioned approach metrics illustrated and analyzed with sex and session as variables. Further, the data should be illustrated to better align with the descriptions in the text.
For example, if sex differences are being analyzed/reported, then males and females should be shown in the same graphs. Lines 365-371: It is not clear why the new approaches were applied to early training days, given that the behavioral phenotypes don’t emerge or become stable until later training days. Further explanation for this approach is needed. In addition, further description of what the centroids and point-to-centroid distances refer to would be helpful. Finally, if there were a way to graphically illustrate the k-means cluster data (i.e., that shown in Table 1), that might also be helpful to the reader (as was done for the derivative method). Lines 413-423: It would be helpful to illustrate the data described here as histograms and to state the actual number of observations that differed between the different classification methods. Lines 430-436: Further description of the “center” value and how that should be interpreted is warranted. It would be helpful if a few sentences were added to this section to better explain what these findings actually mean. Lines 483-485: What does latency score refer to here? Is it the latency difference score between the lever and food cup? Line 508: It is not clear on what basis the authors claim that the +/- 0.5 cutoff value provides “less optimal classification”. What makes this approach less optimal in this case? Figure 8: Do the data shown in Figure 8B include both food groups? It would be ideal if the validation group was tested under the same conditions as the model group. Lines 522-529: As described above, it would be helpful to state how many observations (i.e., animals) were differentially classified based on the different methods. Line 553: What does “accurate classification” mean in this case? Line 564: Which trends supported the sex differences referred to here?
Lines 565-570: Other studies have suggested that female rats tend to be skewed more towards sign-trackers than male rats (e.g., Hughson et al., 2019, PMCID: PMC6382850 DOI: 10.1038/s41598-019-39519-1). Thus, it seems like it would be important to assess the models separately in each sex and see how each compares. Line 618: It is likely the pretraining to the food cup that skews early behavior towards goal-tracking, and this should be noted. Lines 661-662: Further guidance is needed to clarify what the authors suggest should be done to select cutoff methods using one approach versus another. Line 713: It is not clear what the authors are referring to in the Lovic et al. paper (reference 39) wherein it seems that STs were generally more impulsive on tests of impulsive action relative to GTs. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free.
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
Revision 1
Behavior Classification: Introducing Machine Learning Approaches for Classification of Sign-Tracking, Goal-Tracking and Beyond
PONE-D-25-02516R1

Dear Dr. Huppé-Gourgues,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Rita Fuchs
Academic Editor
PLOS ONE

Additional Editor Comments (optional): The authors did a great job addressing the comments of the reviewers. I think this paper contributes an important new approach to studying the sign-tracking/goal-tracking phenomenon.

Reviewers' comments:
Formally Accepted
PONE-D-25-02516R1
PLOS ONE

Dear Dr. Huppé-Gourgues,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of Dr. Rita Fuchs
Academic Editor
PLOS ONE
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.