Peer Review History

Original Submission: June 2, 2021
Decision Letter - Jean Daunizeau, Editor, Samuel J. Gershman, Editor

Dear Ms Lee,

Thank you very much for submitting your manuscript "Point-estimating observer models for latent cause detection" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Jean Daunizeau

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately:

[LINK]

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: The authors introduce and describe a novel experimental paradigm for inferring the presence of a ‘food feeder in a park’ where some nuisance variables (location of the feeder and pigeon association) make exact inference intractable. The authors then compare Bayesian and point-estimating models using model comparison. It appears that some models involving marginalization over the latent variables cannot be rejected based on the choice data alone, so I think that some of the statements (such as the title) could be somewhat interpretative and not directly supported by the data. Nevertheless, the work is very strong in comparing models that marginalize against models that make assumptions to simplify computations, and it is thus of high value.

It seems that perfect Bayesian estimation is contrasted with point estimation. Another way to simplify marginalization is to use priors that strongly constrain the size of the sum to be carried out. If my reading is correct, no model in the paper uses such priors, and it might be unclear why they are not used within the general context of testing multiple models simultaneously.

Traditionally, the London bombing problem has been treated in a different way, in my understanding: one looks at the distribution of counts per bin, and tests whether the distribution is Poisson or not. Would something like this (which is a classical test) be something that participants might be using?
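The classical approach alluded to here is a dispersion test: under a Poisson null, the variance-to-mean ratio of the per-bin counts is close to 1, whereas a localized source (a feeder) inflates it. A minimal sketch, with made-up count values purely for illustration:

```python
import statistics

def dispersion_index(counts):
    # Variance-to-mean ratio of per-bin counts: near 1 under a
    # Poisson null, substantially above 1 when counts pile up in
    # a few hotspot bins.
    return statistics.variance(counts) / statistics.mean(counts)

# Illustrative counts only (not from the experiment):
uniform_bins = [4, 5, 6, 5, 4, 6, 5, 4, 5, 6]    # very regular, index < 1
hotspot_bins = [0, 0, 1, 0, 20, 0, 1, 0, 22, 1]  # two crowded bins, index >> 1
```

A participant using such a rule would only need a rough sense of whether the spread of counts exceeds their average, rather than a full marginalization over feeder locations.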

The effects of N in Fig. 3 are very strong and relevant. Recent work by Schustek and Moreno-Bote (Nat Communications, 2019) also shows a very strong effect of sample size N on responses and confidence reports, supporting the notion that subjects do not use overly simple heuristics to solve hard inference problems. This paper could be commented on in the discussion.

Related to the above, the sample size used goes up to N=16, well above the subitizing regime, but models do not seem to incorporate the possibility of numerosity estimation errors.

The agglomerative clustering algorithm looks like a good model candidate due to its simplicity (although other models that require marginalization are as good as this one, as the authors indicate). This model seems to be sequential in the way clusters are built, which means that studying eye movements could be a very relevant direction to explore. Isn't model comparison rather limited by the use of choice responses, with no RTs, gaze data, or confidence estimates?
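For context, the sequential merging the reviewer refers to can be sketched generically as single-linkage agglomeration on 1-D positions; this is an illustration of the scheme, not the authors' implementation, and `stop_dist` is a hypothetical merge threshold:

```python
def agglomerative_1d(points, stop_dist):
    # Greedy single-linkage clustering of 1-D positions: repeatedly
    # merge the closest pair of clusters (adjacent ones, since the
    # points are sorted) until the nearest pair is farther apart
    # than stop_dist.
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > 1:
        gaps = [(clusters[i + 1][0] - clusters[i][-1], i)
                for i in range(len(clusters) - 1)]
        d, i = min(gaps)
        if d > stop_dist:
            break
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters
```

Because each merge is a discrete sequential step, the order of merges is exactly the kind of internal trajectory that gaze data could, in principle, expose.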

Reviewer #2: In this paper the authors provide a very thorough set of analyses and comparisons to test a range of models of the computations that might be going on within subjects in a cognitive task. The task is a great example of the sorts of computations all animals must do in everyday life, while being simple enough that the potential solutions can be analyzed. The work of the authors is very rigorous, perhaps to the point that so many possibilities are being compared that the main thread can be lost amid the large number of results. Therefore, since I think the work is valuable, my suggestions/critiques revolve around increasing readability, in particular by highlighting and explaining the important findings more.

Specific comments:

Given the number of results and figures, I think it is important that they are cited clearly in the text, in order, alongside a sentence giving the main finding of each figure. For example, I do not see Figures 5 or 6 cited in the text, and others appear to be cited out of the numerical order in which they appear.

Some of the figures, specifically Figs 8 and 9 along with the corresponding styles in the supplement have so many lines that the text is hard to read and to scan across to the corresponding curve. I suggest a bit more spacing between rows and clearer – probably larger – fonts are needed.

p.2 the sentence explaining nuisance variables would benefit from being broken down and expanded. E.g. What is the “variable of interest” (an example would be nice – I’m still not sure what it means in the actual experiment, presumably the binary variable “C”). It would also not hurt to add a sentence explaining how a “generative model” is not just any model, since it is a bit of a buzzword/phrase these days.

p.7 “following a bimodal distribution” – please explain why “feeder present trials” would have a bimodal distribution of the numbers of affiliated pigeons, since the process appears to be Binomial which has a unimodal distribution.

p.8 “one strong contender for winning model” – I guess you mean the model that best fits the data rather than the one that results in optimal performance? (also either “a” or “the” is missing).

Figure 6: even for feeder-absent trials the log-likelihood is mostly above zero, indicating feeder present in my understanding. I think this is why the “decision threshold” needs to be optimized to match the data, but this seems like a big problem with this and many of the other models that should be stated clearly (if you expected this, then say why) and taken up in the discussion.

In particular, in Supp Fig 18 it seems that for whole families of models the decision criterion is either always positive or always negative. To help the reader, these qualitative differences across models should be explained; that is, there must be a reason why some models are biased toward positive and others toward negative likelihood ratios. The explanation on p.16 (e.g. different motor costs) is unsatisfactory, as that would show up as a behavioral bias rather than a model bias, and in any case it does not say why different families of models would err in different directions (or why the Bayesian model needs a specific N-dependence, from the point of view of what is going on within the model, not just the observed output).

That is to say, irrespective of behavior, what is going on in the models to require such biases? Or are different “d” values only needed in the model to explain behavioral biases, with all models producing greater accuracy at a d of zero? A lot more explanation is needed concerning this issue.

Discussion: I would like to see a bit more focus on what is new/surprising, why some calculations are feasible in the brain and not others, which computations (e.g. agglomerative clustering) have any evidence for them in other literature, and so on. Are there ways of producing stimuli more artificially that would hinder one of the strong contender models more than another? As of now it does not seem like there is a strong takeaway. My understanding is that the full Bayesian method would be intractable anyway, so I am not sure it was ever a serious contender, but it is good to see some evidence disfavoring it here. Again, some more discussion about the “false priors” results is needed. Do these really rescue the Bayesian model? Is there any reason to think that the subjects would know the exact priors and so not produce false priors? If not, perhaps the “false priors” are optimal given the data in some sense; would that not make them a “more Bayesian” version than the standard one?

As you can see, with the wealth, even overabundance, of tests and results, a lot of questions are raised, and the paper would be more satisfactory if a sizable subset of these questions were discussed and addressed in the text.

Finally, looking at the GitHub repository with the code, the ReadMe is insufficient to help anyone run the code (there are so many files I suspect it would be practically impossible for anyone to run it, so without instructions it is in practice unavailable).

Minor issues:

p. 5 “Family B” is used when I think you are still describing Family C (right after the eq.).

top of p.12 a weird symbol instead of “<” for the p-value

Eq.23: I think N_0 and N_1 should be defined though one can guess they are numbers non/affiliated with the feeder in a given trial.

Before eq.24 a missing close parenthesis

p.20 “see … box” it would be nice to tell the reader where to find the box as it is not nearby.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No: See above -- present but impractical to use without instructions.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

References:

Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.

If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Revision 1

Attachments
Attachment
Submitted filename: Pigeon reviewer response.pdf
Decision Letter - Jean Daunizeau, Editor, Samuel J. Gershman, Editor

Dear Ms Lee,

We are pleased to inform you that your manuscript 'Point-estimating observer models for latent cause detection' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Jean Daunizeau

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************************************************

Formally Accepted
Acceptance Letter - Jean Daunizeau, Editor, Samuel J. Gershman, Editor

PCOMPBIOL-D-21-01027R1

Point-estimating observer models for latent cause detection

Dear Dr Lee,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Andrea Szabo

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.