Peer Review History
Original Submission: June 1, 2022 |
---|
PONE-D-22-15831 Probabilistic Coherence, Logical Consistency, and Bayesian Learning: Neural Language Models as Epistemic Agents PLOS ONE Dear Dr. Gregor Betz, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by Nov 13 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Dr. Anu Sayal, Ph.D. Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 2. Please update your submission to use the PLOS LaTeX template. The template and more information on our requirements for LaTeX submissions can be found at http://journals.plos.org/plosone/s/latex. 3. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section. 4. Thank you for stating the following in the Acknowledgments Section of your manuscript: This work is supported by the Helmholtz Association Initiative and Networking Fund on the HAICORE@KIT partition. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. 
We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: This work is supported by the Helmholtz Association Initiative and Networking Fund on the HAICORE@KIT partition. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Please include your amended statements within your cover letter; we will change the online submission form on your behalf. 5. Thank you for stating the following in your Competing Interests section: NO Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now This information should be included in your cover letter; we will change the online submission form on your behalf. 6. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide. 7. Please amend your list of authors on the manuscript to ensure that each author is linked to an affiliation. Authors’ affiliations should reflect the institution where the work was done (if authors moved subsequently, you can also list the new affiliation stating “current affiliation:….” as necessary). 8. 
Please ensure that you refer to Figure 1,2,3,4,5,6,7,8,13,14,15,16 and 17 in your text as, if accepted, production will need this reference to link the reader to the figure. 9. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. 
If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: This is an impressive and interesting article, making (as far as I know) novel points, both conceptual and technical, about the way in which the probabilistic and Bayesian coherence of NLMs may be assessed. I suspect it warrants publication, but I have a few conceptual questions that it may help for the authors to clarify before publication. 1) What exactly is the force of the Rationality Hypothesis? As the authors say in defense of it, it is an existential hypothesis about what a possible NLM might compute, in the right circumstances. But neural nets are Turing complete, meaning of course there will be SOME net that computes any function we like, including a function that outputs probabilistically coherent credences and adjusts them in a Bayesian fashion. So what exactly is the form of the hypothesis that is uncertain? Is the idea that such a neural net may require too different an architecture (or training regimen) from standard NLMs to count as an NLM?
I think it'd help readers understand the point of the paper better to have a bit more discussion on this front, though obviously there are difficult issues here and I don't mean to hold the publication of the paper hostage to settling all of them perfectly clearly! Perhaps here's another way to put the conceptual worry. The authors motivate their proposal by suggesting that it bears on whether neural nets may eventually exhibit AGI, since AGI agents must be rational. But it's trivially easy to construct programs that meet all the authors' metrics perfectly: simply define a (non-neural-net) program that outputs a probability function over state descriptions in a Boolean language, and calculates probabilities of arbitrary sentences by sums of their state descriptions. (Of course, this is computationally HARD, since the number of state descriptions explodes with the number of atomic sentences; but it's clearly programmable.) The fact that we can define such a Bayesian program doesn't seem to provide much if any evidence that it's possible to generate an AGI using a similar architecture. So why think the neural net case is any different? 2) Somewhat relatedly, I was surprised that the language did not contain Boolean operations. After all, that's what I'd initially expect was needed for a neural net to be "learning to be logically coherent". Why was this choice made? Is there a principled difficulty with running similar analyses with a more complex language, or is it simply an implementational difficulty? A bit more discussion would be helpful. 3) Why exactly does the belief elicitation protocol make sense, wherein we get the network's credences as its probabilistic prediction for the completion of a masked string? Imagine you had a HUMAN doing this task: reading a bunch of sentences, and then predicting the next sentence.
Clearly their prediction wouldn't necessarily line up with their degree of belief that the sentence is true; rather it's something closer to their degree of belief that "in this context, this corpus will present me with this string", or something like that. Is the thought that if they fully trust the corpus, that would be equivalent to their credence? I'm not sure that's right either. Difficult issues, we won't get to the bottom of them here. But more discussion for this way of measuring, and what it presupposes, would be nice. 4) The measures of deviation from probabilistic coherence are interesting, and touch on a not-huge-but-growing literature on what the proper way to define such measures is. It'd be nice to have some comparisons of the authors' metrics to other ones; I suspect they will have important theoretical differences, but perhaps will not deviate too much in practice. (Though it'd be interesting if they did!) Some places to look for literature on this: - Staffel 2015, "Measuring the overall incoherence of credence functions" - Staffel and de Bona, "Graded Incoherence for Accuracy-Firsters" 5) I don't understand the transitivity constraint on page 18. It seems to be saying that if {A,B} entails C, then Pr(C) must be at least as great as Pr(A)*Pr(B). Right? But that is not a constraint of probabilistic coherence, and is violated whenever A and B are negatively relevant. E.g., suppose we have a coin that's either biased 90% for H or biased 90% for T, and we're 50-50 on which it is. Then Pr(Heads first toss) = 0.5 and Pr(Tails second toss) = 0.5, but Pr(Heads first, Tails second) = 0.09 (if H-biased, it's 0.9*0.1; if T-biased, it's 0.1*0.9; the average of those is 0.09). The appendix mentioned something about independence, but why should that be assumed in this context? 6) The discussion of the presentation of new evidence went by a bit quick for me (but perhaps I was reading quickly/less-carefully at that point, so don't put too much weight on this).
Intuitively, the networks are getting new evidence at the earlier stages too: namely, when they're learning what the beliefs of the informant generating the sentences are. So conceptually I found it strange that presenting new evidence later is any different than the earlier stages. Is it just because at the earlier stage we're letting the network piggyback on the informant's logical coherence, and later we want to see if they can maintain coherence on their own? Perhaps just a word or two more on this would help tired readers understand what's happening. Anyways, thanks for the thought-provoking paper! Reviewer #2: This paper examines how well certain types of natural language models can acquire characteristics of Bayesian (epistemic) rationality. The paper uses RANKERS and a simple artificial language to test properties such as general probabilistic coherence and updating by conditionalization. The results are intriguing enough for me to recommend publication, but I overall was not fond of how the paper was written. The opening few pages state broad and grand theses that are ill-defined. Indeed, the Rationality Hypothesis (nominally the main subject of the paper) is stated in vague and unclear language and not made at all clear until 4 pages later (p. 6 of the ms). Some connections the authors draw were also very underspecified. For instance, on p. 3 line 63, the authors discuss Systems 1 and 2 in psychology, and it was entirely opaque to me how this investigation related to these concepts. So, I'd like to see the authors being a little less grand and a little more precise in the opening pages. I have some reservations as well about the conclusions. For example, the authors use an extremely limited artificial language and then present the Ranker with fully coherent text and probe its partial beliefs.
It isn't especially surprising if a system with such an artificial language that is presented a variety of opposing, but fully consistent passages ends up in the convex hull of the views it's presented with. While the results are interesting, I think the significance is a bit limited. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Kevin Dorst Reviewer #2: No ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
Revision 1 |
Probabilistic Coherence, Logical Consistency, and Bayesian Learning: Neural Language Models as Epistemic Agents PONE-D-22-15831R1 Dear Dr. Gregor Betz, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Anu Sayal, Ph.D. Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. 
Reviewer #2: All comments have been addressed Reviewer #3: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #2: Yes Reviewer #3: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #2: Yes Reviewer #3: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #2: Yes Reviewer #3: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #2: Yes Reviewer #3: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. 
You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #2: I think the article is presented well enough and interesting enough to merit publication at this point. Reviewer #3: It is evident that the authors have made good efforts to address the reviewing comments, and I would recommend the revised paper to be considered for publication. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #2: No Reviewer #3: No ********** |
Formally Accepted |
PONE-D-22-15831R1 Probabilistic coherence, logical consistency, and Bayesian learning: neural language models as epistemic agents Dear Dr. Betz: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Anu Sayal Academic Editor PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.