Peer Review History

Original Submission: May 17, 2020
Decision Letter - Alireza Soltani, Editor, Samuel J. Gershman, Editor

Dear Dr. Yang,

Thank you very much for submitting your manuscript "A Recurrent Neural Network Framework for Flexible and Adaptive Decision Making based on Sequence Learning" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Alireza Soltani

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: This manuscript describes an RNN based on GRU units, trained to solve a fairly complex sequential probabilistic task. Specifically, a sequence drawn from an alphabet of 10 symbols (each of which is probabilistically associated with a correct Left/Right response) is presented until a choice is made. The optimal strategy is to integrate information across symbols until enough evidence for a decision is available. As the authors point out, this task has similarities with natural language processing, hence the use of a GRU. After training, the RNN solves the task on untrained sequences. The units of the RNN seem to qualitatively resemble the behavior of neurons recorded in animal experiments using similar tasks, including response, log-likelihood, and “urgency” neurons. The analyses and simulations seem well executed; however, a weakness is that the conclusions do not make clear how the results improve our understanding of brain function. Some reference is made to capturing properties of the basal ganglia, but there is no direct, clear link.
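For readers unfamiliar with this class of task, the integrate-to-bound strategy the reviewer describes can be sketched in a few lines. The per-symbol weights and the decision bound below are illustrative assumptions, not the values used in the manuscript:

```python
import random

# Hypothetical per-symbol weights: log-likelihood ratios in favour of Left.
# These values and the bound are illustrative, not the manuscript's.
WEIGHTS = [0.9, 0.7, 0.5, 0.3, 0.1, -0.1, -0.3, -0.5, -0.7, -0.9]

def simulate_trial(bound=2.0, max_symbols=100, seed=0):
    """Present random symbols and accumulate their log-likelihood weights
    until the total evidence crosses +bound (Left) or -bound (Right)."""
    rng = random.Random(seed)
    evidence = 0.0
    for t in range(1, max_symbols + 1):
        evidence += WEIGHTS[rng.randrange(len(WEIGHTS))]
        if abs(evidence) >= bound:
            break
    return ('Left' if evidence > 0 else 'Right'), t
```

The reaction time here is simply the number of symbols seen before the bound is crossed, which is the quantity the reviewer's later questions about commutativity and recency weighting concern.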

Can the authors make any clear-cut experimental predictions? For example, are the authors predicting that animals attempt to predict both the subsequent stimuli and the reward with equal weighting (as implemented by their loss function)?

The authors attempt to make a link to the biology, making a few general statements about similarities to the basal ganglia and a link between GRUs and basal ganglia circuitry. This seems counterintuitive to me for a number of reasons, including the fact that the basal ganglia circuitry is all inhibitory. Regarding the potential biological relevance, it would be important to determine whether gated units (which can have infinitely long memory) are needed. Indeed, since no parametric studies are presented, it would be helpful to do so, and one way this could be accomplished is to compare GRU performance with ReLU performance. That is, is this a task that requires the essentially infinite long-term memory of gated units, or can it be achieved equally well by more realistic ReLUs?
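The memory distinction behind the GRU-versus-ReLU question can be illustrated with a toy, input-free recurrence. This is a minimal sketch; the gate bias and recurrent matrix below are arbitrary choices for illustration, not values from the manuscript:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(h, bz=-10.0):
    """One input-free GRU update. A strongly negative update-gate bias (bz)
    drives z toward 0, so h carries over almost unchanged step after step --
    the 'essentially infinite' memory the reviewer refers to."""
    z = sigmoid(np.full_like(h, bz))   # update gate ~ 0: hold memory
    r = sigmoid(np.zeros_like(h))      # reset gate = 0.5
    h_tilde = np.tanh(r * h)           # candidate state (recurrent matrix = I)
    return (1 - z) * h + z * h_tilde

def relu_step(h, U):
    """One input-free ReLU update: memory survives only through U, so a
    contractive recurrent matrix makes the trace decay geometrically."""
    return np.maximum(0.0, U @ h)

h_gru = h_relu = np.ones(3)
U = 0.5 * np.eye(3)
for _ in range(50):
    h_gru = gru_step(h_gru)
    h_relu = relu_step(h_relu, U)
# h_gru stays close to 1; h_relu collapses toward 0
```

Of course, a trained ReLU network can place recurrent eigenvalues near 1 to hold information longer, which is exactly why the parametric comparison the reviewer requests is informative rather than decidable a priori.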

Does performance exhibit a commutative property? That is, is the reaction time the same for shape sequences 1,2,3 and 3,2,1 (controlling for cases in which one shape basically solves the task)? This is an important question because, from a mathematical stance, it should be the case, but biologically speaking it is probably not, because animals are generally heavily biased by recency effects. Relatedly, is the model weighting all evidence equally or preferentially weighting more recent evidence (i.e., forgetting early evidence)?

Something that is a bit glossed over in the presentation is that the network is not actually making a decision: there seems to be a postprocessing stage in which information from the network is analyzed to see whether a criterion is reached, and then the simulation is terminated. This is a bit misleading, as many people will be left with the impression that the RNN is autonomously making a reaction-time decision. Thus, this postprocessing stage should be strongly emphasized in the results.

Apparently, the model was trained with Adam, but no information is given about this critical component of the training procedure.

It would be helpful to show some raw data of model activity during task performance, as is often shown for standard RNN models.

The performance measure does not allow the reader to understand how well the model is integrating. How well can it perform looking only at the last N shapes?

Line 42. “framework”.

Line 92. “ranging”

Line 280. Just “Performance” (“behavior” gives the impression there were animal studies).

Line 735. “reported THAT lesions”

Reviewer #2: In the manuscript titled “A Recurrent Neural Network Framework for Flexible and Adaptive Decision Making based on Sequence Learning”, Zhang and colleagues used a natural language processing (NLP) framework and trained a network of gated recurrent units in four different experimental settings. Specifically, similar to NLP networks that are trained to predict text sequences, the authors trained their networks to take inputs in the form of event sequences (sensory and reward-outcome events) and predict future events through supervised learning. Networks were trained on a probabilistic reasoning task, a multi-sensory integration task, a confidence/post-decision wagering task, and a two-step task. The authors showed that the networks can learn to perform the tasks and exhibit behavior similar to that observed in animals. Additionally, the authors found units whose activity resembled that of neurons recorded in different areas of the brain.

Overall, I found the paper suitable for publication in PLOS Computational Biology and the proposed framework interesting. However, some details of the training procedure and analysis require further justification. Moreover, strong conclusions are drawn without enough supporting information, which needs to be addressed. Please find my comments below:

Major concerns:

1) Page 14, lines 267-270: Why was a drift-diffusion model (DDM) with a collapsing boundary used to simulate the choices used for training the network? It sounds circular to train the network with a certain model and then observe units that resemble the key parts of that model. How much do these observations depend on the model behind the choice behavior? Why not train the network with the optimal algorithm? The authors need to clarify this.
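For context, the kind of generative model the reviewer is questioning can be sketched as follows. This is a minimal illustration; the drift, noise, and collapse time constant are arbitrary values, not those in the manuscript:

```python
import math
import random

def ddm_collapsing(drift=0.1, noise=1.0, b0=2.0, tau=20.0,
                   dt=1.0, max_t=200, seed=0):
    """Drift-diffusion with an exponentially collapsing decision bound.
    The shrinking bound plays the role of an urgency signal: late in the
    trial, less accumulated evidence suffices to trigger a choice."""
    rng = random.Random(seed)
    x = 0.0
    for step in range(1, max_t + 1):
        t = step * dt
        bound = b0 * math.exp(-t / tau)               # bound shrinks over time
        x += drift * dt + noise * math.sqrt(dt) * rng.gauss(0, 1)
        if x >= bound:
            return 'upper', t
        if x <= -bound:
            return 'lower', t
    return 'none', max_t
```

The circularity concern is that choices generated by such a model carry its signatures (e.g., urgency-like dynamics), so finding those signatures in a network trained on its output is not independent evidence for them.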

2) Page 20, lines 408-409: The authors’ definitions of Iwhen and Iwhich are very confusing to me. These measures do not seem to be doing what they are supposed to do. For example, if a unit has (WLT, WRT, WFT) respectively equal to (5, 2, 3.5), this unit is +which (5-2=3) but apparently not sensitive to fixation at all ((5+2)/2-3.5=0), which does not make sense. Why not pass them through a psychometric function and use their relative weights to compare units?
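The counterexample is easy to verify numerically, assuming the indices are defined as Iwhich = WLT - WRT and Iwhen = (WLT + WRT)/2 - WFT (inferred from the arithmetic in the comment, not quoted from the manuscript):

```python
def i_which(w_lt, w_rt, w_ft):
    """Selectivity for choice direction (Left vs Right target weight)."""
    return w_lt - w_rt

def i_when(w_lt, w_rt, w_ft):
    """Sensitivity to fixation relative to the average target weight."""
    return (w_lt + w_rt) / 2 - w_ft

# The example unit: strongly +which, yet Iwhen = 0 even though
# the fixation weight (3.5) is clearly nonzero.
print(i_which(5, 2, 3.5))  # 3
print(i_when(5, 2, 3.5))   # 0.0
```

The problem the example exposes is that Iwhen measures the fixation weight only relative to the mean target weight, so a unit with a substantial absolute fixation weight can still score zero.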

3) Page 22, Figure 6: Can the authors explain why the connection weights for +/-when and +/-which units are not symmetric with respect to 0? Wouldn't this show that not only the weights but also the activity of these units should be considered?

4) Page 30, lines 624-627: I don’t understand why only rewarded trials are used to train the network. What happens if all trials are fed into the network during training? Unrewarded sequences carry the same amount of information about the task as rewarded trials.

5) Page 32, lines 671-673, & pages 37-39: Even though the loss function of the trained network is not that of reinforcement learning, the network has access to the reward and to its estimate of reward at each time point, and is required to match its actions to rewarding ones (the necessary pieces of a reinforcement learning agent). So what makes the authors think that the network is not using some variant of RL to solve the task? Recent studies have focused on the relation between these two frameworks [1-2]. The authors cannot make such a strong claim considering the structure of their network.

6) In all these experiments, only a single network is trained and used to calculate the average and SE values. How consistent are these results across different trained networks? What percentage of the trained networks shows the mentioned behaviors?

Minor concerns:

1) Page 3, lines 48-49: There seem to be a few words missing or incorrect in this sentence:

“Therefore, we build a recurrent neural network framework based on the gated recurrent units and test how it matches experimental findings in four exemplar tasks, each focusing on a different aspect of decision making and learning.”

2) Page 7, last paragraph: Please add a more detailed summary of your results in this section. The language is very vague for the introduction.

3) Page 9, line 147: “Our network framework contains three layers … ”

4) I did not find any information on the stopping criterion for the training procedure. Was training stopped after the error went below a certain threshold? If so, please add it to the related section.

5) Please add a more detailed caption to Figures 4 and 11.

6) Page 21, line 427: “The where and when group units only overlap rarely … ”

7) Page 21, line 429: Please report the prevalence of each group separately.

8) Page 22, Figure 6a: Please add a horizontal line for 0 to both panels.

9) Page 26, Figure 8b: Please add information on the error bars to the caption.

10) Page 28, line 587: Please report the prevalence of these units in your model.

References:

[1]. Luketina, J., Nardelli, N., Farquhar, G., Foerster, J., Andreas, J., Grefenstette, E., ... & Rocktäschel, T. (2019). A survey of reinforcement learning informed by natural language. arXiv preprint arXiv:1906.03926.

[2]. Jiang, Y., Gu, S. S., Murphy, K. P., & Finn, C. (2019). Language as an abstraction for hierarchical deep reinforcement learning. In Advances in Neural Information Processing Systems (pp. 9414-9426).

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: None

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

Revision 1

Attachments
Attachment
Submitted filename: Response to reviewers.docx
Decision Letter - Alireza Soltani, Editor, Samuel J. Gershman, Editor

Dear Dr. Yang,

Thank you very much for submitting your manuscript "A Recurrent Neural Network Framework for Flexible and Adaptive Decision Making based on Sequence Learning" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, provided that you modify the manuscript according to the review recommendations.

Importantly, please add a few sentences explaining the results of analyses in response to Reviewer #2. Please note that we will not send the manuscript to review again and the final decision will be made at the editorial level. In addition, in accordance with the journal policy, please also make your data and/or code available if you have not done so already.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Alireza Soltani

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately:

[LINK]

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have done a reasonable job in addressing my concerns.

Fig 1a should read “Pavlovian”

Reviewer #2: I would like to thank the authors for responding to my questions. I have a recommendation to add:

RE 2.3 and 2.4: I suggest the authors add a few sentences to the main manuscript explaining these points.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-materials-and-methods

Revision 2

Attachments
Attachment
Submitted filename: Response to reviewers.docx
Decision Letter - Alireza Soltani, Editor, Samuel J. Gershman, Editor

Dear Dr. Yang,

We are pleased to inform you that your manuscript 'A Recurrent Neural Network Framework for Flexible and Adaptive Decision Making based on Sequence Learning' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Alireza Soltani

Associate Editor

PLOS Computational Biology

Samuel Gershman

Deputy Editor

PLOS Computational Biology

***********************************************************

Formally Accepted
Acceptance Letter - Alireza Soltani, Editor, Samuel J. Gershman, Editor

PCOMPBIOL-D-20-00841R2

A Recurrent Neural Network Framework for Flexible and Adaptive Decision Making based on Sequence Learning

Dear Dr Yang,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Kaitlin Butler

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.