Predicting takeover response to silent automated vehicle failures

Callum Mole; Jami Pekkanen; William Sheppard; Tyron Louw; Richard Romano; Natasha Merat; Gustav Markkula; Richard Wilkie

doi:10.1371/journal.pone.0242825

Peer Review History

Original Submission
August 8, 2020
3 Sep 2020 Decision Letter - Feng Chen, Editor

PONE-D-20-24803
Predicting takeover response to silent automated vehicle failures
PLOS ONE

Dear Dr. Mole,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 18 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,
Feng Chen
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We noted in your submission details that a portion of your manuscript may have been presented or published elsewhere. "A version of the paper has been released as a preprint." Please clarify whether this publication was peer-reviewed and formally published. If this work was previously peer-reviewed and published, in the cover letter please provide the reason that this work does not constitute dual publication and should be included in the current manuscript.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

******

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

******

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

******

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1:

1. It is recommended that pictures of the driving simulator, the simulated driving route and the simulated driving scenario, as well as a flow chart of how the experiment was carried out, be added to enhance the reader's knowledge of the experiments carried out. The figure of the experimental scenario given in this paper shows that the experimental scenario differs greatly from the real road conditions, how to ensure that the results of the experiment are meaningful in this experimental state?

2. In this paper, reaction time is a very important parameter, but the definition of reaction time in this paper is vague, so it is recommended to clearly define the reaction time and explain its practical significance.

3. As the experiment progresses, the driver gradually adapts to the simulated driving scenario, producing a certain learning effect, when the driver may become more sensitive or more sluggish to the silent failure stimulus. It is proposed to explain how this paper is a scientifically based experimental approach to reduce the impact of driver learning effects on experimental results.

4. The title of this paper is “Predicting takeover response to silent automated vehicle failures”, therefore it is suggested that the description of the key performance of the predictive model be added to the conclusion section as appropriate to echo the theme of this paper and to enable the reader to quickly understand the key findings of this study in predicting response.

5. SI Figure 3 lacks a quantitative description of "The marginal means (dots) and standard deviations (lines) for RT and Lane Position", and given the statistical importance of the mean in describing the state, it is recommended that the magnitude of this statistical value be supplemented with an appropriate analysis of the value.

Reviewer #2:

The topic of this paper is interesting and important. The methods sound. The results are meaningful and useful. There are several suggestions to improve this paper.

1. More information of the participants is needed, for example, the driving experience.

2. The structure of this paper is not so formal.

3. One table of the statistical information of the results is suggested.

4. One paper about the driving simulator experiment of the steering performance under sudden situation maybe is useful for this paper.

[1] “Examining the safety of trucks under crosswind at bridge-tunnel section: A driving simulator study”, Tunnelling and Underground Space Technology, 2019, 92, 103034.

******

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review?** For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

https://doi.org/10.1371/journal.pone.0242825.r001
Revision 1
21 Oct 2020 Author Response

Reviewer #1:

1. It is recommended that pictures of the driving simulator… and the simulated driving scenario,

We agree that it would be illustrative to include a picture of the driving simulator. Unfortunately, we do not have an up-to-date picture, and our laboratories currently remain closed to us as the University does not yet deem them covid-secure. If a picture is required we can investigate with the University management.

… the simulated driving route, and the simulated driving scenario…

The driving route and stimuli are described clearly in the methods:

“The road geometry across all conditions began with a straight section of 16 m length (2 s), followed by a constant curvature bend of 80 m radius (either leftwards or rightwards). The road width was 3 m. The road was rendered using a semi-transparent grey texture. The ground plane of the virtual environment was textured with ‘Brownian noise’ (as per 72, Fig 1E), which has been shown to elicit similar gaze behaviours to on-road driving (72). Vehicle speed was kept constant at 8 m s⁻¹ (≈18 mph).”

Additionally, throughout the manuscript we take pains to describe the simulated driving scenario in considerable detail, including in multiple graphics (see Fig 1A & B; Fig 1 Caption, Experiment, Silent Failures Selection). These descriptions were tested on three non-scientists to check for clarity and understanding. In all three tests the layperson was able to articulate, without prompts, the driving simulator scenario back to the first author.

In previous drafts of the manuscript we included a birds-eye view of some sample trajectories, which might be the type of graphic that the reviewer is suggesting. We initially thought that it would be instructive, but we found that the scale needed to include the full track made it difficult to see the shape of trajectories, so in the end we decided that these graphics were unhelpful because the track could be easily described with words (straight section followed by a constant curvature bend).

…as well as a flow chart of how the experiment was carried out…

In Fig 1 we described in detail the procedure of the experiment. The manuscript already has a large number of figures, and we are reluctant to add more. We realise that some details were placed in figure captions that may have been confusing if one was to read only the Procedure section. We have now added to the procedure to clarify that the trials were done as a sequence: “The locomotor component of each trial was 15 s, after which the scene was reset (in SupAuto) or the ACMT task was shown (Fig 1E).”

…be added to enhance the reader's knowledge of the experiments carried out. The figure of the experimental scenario given in this paper shows that the experimental scenario differs greatly from the real road conditions, how to ensure that the results of the experiment are meaningful in this experimental state?

We feel that a strength of our approach is the high degree of experimental control over the visual stimuli and reliable and repeatable conditions. This necessarily comes at the expense of ecological validity since real world driving is highly varied and variable. We deliberately constrained the visual stimuli so that the only sources of perceptual information were the road edges and the optic flow from the ground texture. By removing extraneous features that could serve as possible distractions and gaze fixation candidates, we are able to assess the perceptual-motor behaviour more rigorously, rather than including spurious gaze behaviours that would confound interpretation (e.g. during less critical failures drivers may look to irrelevant scene objects rather than the road ahead, therefore delay takeover due to not looking rather than due to accumulating perceptual error more slowly).

In the manuscript we acknowledge the limitations in the following section of the discussion:

“The predictions in Fig 5 help to illustrate the potential benefits of using generative models for regression analysis in this domain. There are several reasons why drivers may have detected failures more quickly in the present highly-controlled experiment compared to noisy real-world driving conditions: there was no traffic (35), participants experienced many failure repetitions (33; 68; 22; 20), and gaze was directed forwards because there were few visual distractions (34). Relaxing any of these constraints could increase the predicted P(Exit) (Fig 5B & SI Fig 4). It should be noted that it is also possible that detection of AV failure could have been artificially slowed by the lack of vestibular cues (we used a fixed-based simulator) and no vehicular sounds (which prevented interference with the ACMT task), both of which can contribute to successful driving (69) and could provide a signal that there has been AV failure”

2. In this paper, reaction time is a very important parameter, but the definition of reaction time in this paper is vague, so it is recommended to clearly define the reaction time and explain its practical significance.

We agree that reaction time is an important metric in the field. However, we also contend that the literature places too much emphasis on reaction time, and instead should report contextualising metrics such as time-to-line-crossing (TLC; an argument which we make in the introduction). Our manuscript therefore uses TLC as the primary metric. That being said, both TLC and RT are related in almost all real-world scenarios (though the mapping depends on the context). A strength of the current experimental design is that one can be derived directly from the other (i.e. TLC at takeover = TLC at failure – RT).

We now include additional clarification at the beginning of the section ‘Detecting Failures: TLC at Takeover’ that “The timestamp of when the driver pulled the paddle shifter behind the steering wheel was taken as the takeover moment”. TLC at failure corresponds to the TLC at the time when the failure was introduced (the failure onset; this is described in the manuscript). Therefore, the reaction time is this timestamp minus the failure onset time.

3. As the experiment progresses, the driver gradually adapts to the simulated driving scenario, producing a certain learning effect, when the driver may become more sensitive or more sluggish to the silent failure stimulus. It is proposed to explain how this paper is a scientifically based experimental approach to reduce the impact of driver learning effects on experimental results.

The experimental blocks (SupAuto; SupAuto+ACMT) were counterbalanced, and the trials within each block were randomly interleaved (we have now added a clarifying sentence – “Within each block conditions were randomly interleaved.” in the Procedure). Therefore, though learning/fatigue effects within each participant might be expected, these would not have systematically mapped on to specific conditions, so they are not a confound in the interpretation of our results.

We specifically highlight the possibility of learning effects in the last couple of sentences in the discussion:

“As an example, consider for a moment trying to account for the predictable nature of the current experiment. Drivers who were faced with unpredictable planned takeovers have been estimated to be around 1 s slower than drivers who had previously experienced (and therefore will have some expectation of) a planned takeover (5). A further 1 s delay (giving a safety threshold of 1.5 s) would mean more than 75% of AV failures result in lane exits for the specified scenarios (Fig 5B).”

4. The title of this paper is “Predicting takeover response to silent automated vehicle failures”, therefore it is suggested that the description of the key performance of the predictive model be added to the conclusion section as appropriate to echo the theme of this paper and to enable the reader to quickly understand the key findings of this study in predicting response.

Thank you, we agree that in our attempts at succinctness we may have made the key conclusions hard to parse quickly. The second paragraph of the conclusion concerns the predictive model, and now reads as follows (changes highlighted):

“Using bayesian hierarchical models, criticality (TLC) at takeover was ably predicted using a gaussian distribution where the mean and standard deviation both increased as failure severity decreased. Furthermore, the magnitude of steering response was related to the criticality at takeover through a power law, with highly critical takeover producing increasingly large corrections and less critical takeovers tending towards minimal corrections. Hierarchical modelling of both the mean and variability of TLC showed that both within- and between-individual variability should be taken into account when predicting safety boundaries, and also when developing mechanistic models for virtual testing. These methods allow for applied simulations of hypothetical failures, providing a lower-bound estimate of the probability that a driver would exit the road before taking over control of an automated vehicle that has failed. The lower-bound is not negligible (about 1/100 failures, rising quickly for critical failures), and the probability is expected to rise rapidly when additional sources of delays are incorporated (e.g. due to traffic, or surprising failures not tested in this manuscript). This modelling should be a cause for concern when considering the widespread plans to adopt AV systems.”

5. SI Figure 3 lacks a quantitative description of "The marginal means (dots) and standard deviations (lines) for RT and Lane Position", and given the statistical importance of the mean in describing the state, it is recommended that the magnitude of this statistical value be supplemented with an appropriate analysis of the value.

To enable the reader to better assess the magnitude of the differences between conditions we now add one-sample t-tests comparing the differences between cognitive load conditions to zero, for RT, steering wheel angle, and lane position (i.e. for both SI Fig 2 and SI Fig 3). These are highlighted in the manuscript.

Reviewer #2:

The topic of this paper is interesting and important. The methods sound. The results are meaningful and useful. There are several suggestions to improve this paper.

1. More information of the participants is needed, for example, the driving experience.

Thank you. We report that 17/19 participants had driving licenses, for an average of 6 years. Unfortunately the length of license is the only information on driving experience we have. That being said, we nevertheless do not feel that considerable driving experience is an important aspect of the study, or that our pattern of results could be explained by driver (in)experience. The participants only needed to control a steering wheel, and monitor when to take over control of a vehicle. In our highly controlled scenario this behaviour is akin to a simple perceptual-motor error detection task, which is quickly learned. There are no traffic rules or complex driving situations to negotiate, for which experience might be beneficial. Furthermore, we offer practice with the driving simulator, and our highly controlled experimental conditions allow us to quantify (and control for) individual participant variability.

2. The structure of this paper is not so formal.

We agree that the structure of our paper uses the Results-First format, which differs from the many papers that describe the Methods before the Results. We chose this format because a quick reader may obtain the core understanding from Fig 1 and the Results sections. The Materials and Methods section provides more detail for the interested reader, but is non-essential for the core flow of the manuscript. Instead of breaking up the flow from the Introduction to the Results, we chose to put the Methods at the end.

3. One table of the statistical information of the results is suggested.

Thank you. We have tried to produce a single table with all the results, however, we found that the single large table was difficult to understand, since the model parameters have different interpretations depending on the measure. Further, since the first results section (TLC) is quite large, a single table will be quite spatially distant when the second results section (SWAMax) is reached, causing difficulty for the reader referring back. Since PLOS ONE allows two tables, we think that it is clearer to the reader to separate the metrics into two tables and hope you agree with our rationale.

4. One paper about the driving simulator experiment of the steering performance under sudden situation maybe is useful for this paper.

[1] “Examining the safety of trucks under crosswind at bridge-tunnel section: A driving simulator study”, Tunnelling and Underground Space Technology, 2019, 92, 103034.

We thank the reviewer for highlighting this interesting paper, which the authors had not seen. However, though the paper concerns driving responses to sudden perturbations, the drivers are in manual control the entire drive. This scenario has fundamental differences to monitoring an automated vehicle, so due to the long reference list (we already have 79 references) we have decided to omit this paper from the manuscript in favour of similar papers that are more relevant to silent failures of automation.

Attachments
Attachment
Submitted filename: Response_to_Reviewers.pdf

https://doi.org/10.1371/journal.pone.0242825.r002
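
The relation between reaction time (RT) and TLC given in the response to Reviewer #1, point 2, can be illustrated with a short sketch. It only restates the two identities from that response (RT = takeover timestamp minus failure onset time; TLC at takeover = TLC at failure minus RT). The `Trial` class, its field names, and the numbers are hypothetical and chosen for illustration; they are not taken from the authors' data or analysis code.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One hypothetical trial; field names are illustrative assumptions."""
    failure_onset_s: float   # time at which the silent failure was introduced
    takeover_s: float        # time at which the paddle shifter was pulled
    tlc_at_failure_s: float  # time-to-line-crossing at failure onset


def reaction_time(trial: Trial) -> float:
    """RT = takeover timestamp minus failure onset time."""
    return trial.takeover_s - trial.failure_onset_s


def tlc_at_takeover(trial: Trial) -> float:
    """TLC at takeover = TLC at failure minus RT."""
    return trial.tlc_at_failure_s - reaction_time(trial)


# Made-up values, not data from the study.
trial = Trial(failure_onset_s=6.0, takeover_s=7.5, tlc_at_failure_s=4.0)
print(reaction_time(trial))    # 1.5
print(tlc_at_takeover(trial))  # 2.5
```

Because each quantity follows from the other once TLC at failure is known, reporting TLC at takeover carries the same information as RT while also conveying how critical the situation was when the driver intervened.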
10 Nov 2020 Decision Letter - Feng Chen, Editor

Predicting takeover response to silent automated vehicle failures
PONE-D-20-24803R1

Dear Dr. Mole,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Feng Chen
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

******

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

******

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

******

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

******

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Through simulation experiments, the paper studies and predicts the response to the takeover of silent automatic vehicle faults, and puts forward the corresponding prediction model. The research makes sense. In the previous comment reply, the author has given a comprehensive explanation and improvement to the experimental process, the structure of the paper and the result statistics. I suggest that the manuscript give a supplement and explanation to the manuscript.

Reviewer #2: (No Response)

******

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review?** For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

https://doi.org/10.1371/journal.pone.0242825.r003
Formally Accepted
16 Nov 2020 Acceptance Letter - Feng Chen, Editor

PONE-D-20-24803R1
Predicting takeover response to silent automated vehicle failures

Dear Dr. Mole:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Feng Chen
Academic Editor
PLOS ONE

https://doi.org/10.1371/journal.pone.0242825.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.