Peer Review History

Original Submission
April 5, 2022
Decision Letter - Darrell A. Worthy, Editor

PONE-D-22-10081
Some performance considerations when using multi-armed bandit algorithms in the presence of missing data
PLOS ONE

Dear Dr. Chen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I received one review from an expert in the field. The reviewer has a generally positive impression of your manuscript, but lists several points that need to be addressed. If you feel you can address these concerns, then I invite you to submit a revision.

Please submit your revised manuscript by Jun 26 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Darrell A. Worthy, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

3. Thank you for stating the following financial disclosure:

"This research was supported by the NIHR Cambridge Biomedical Research Centre (BRC1215-20014), the NIHR Maudsley Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health and Social Care (DHCS). SSV received funding from the UK Medical Research Council (MC_UU_00002/15). DSR received funding from the Biometrika Trust and the UK Medical Research Council (MC_UU_00002/14). KML received funding from the National Institute for Health Research (NIHR Research Professorship, Professor Richard Emsley, NIHR300051)."

Please state what role the funders took in the study.  If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." 

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: High-level overview: This paper evaluates empirically standard multi-armed bandit algorithms in view of practical considerations involving missing data in the context of A/B testing for clinical trials. The "Missing at Random" (MAR) model is posited for missing data. The authors focus on the standard two-armed model (treatment/control) under Bernoulli outcomes and the "fraction of total assignments to treatment" and the "total number of successes" as the performance metrics of interest. Under the MAR model, some binary outcomes for one or both arms may be (independently) unobservable potentially with different probabilities, but the total number of samples from each is known at all times; this is the "missing data" problem investigated in the paper. Sensitivity/robustness of different algorithms to MAR data (in terms of stated performance metrics) is assessed empirically in a hypothesis testing setup under the null of "zero treatment effect." The same exercise is also repeated for the "mean imputation" model for unobservable (missing) outcomes.
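The setting the reviewer summarises can be sketched in a short simulation. The following is an illustrative sketch, not the authors' actual code: a two-armed Bernoulli bandit run with Thompson sampling, where each outcome is independently unobserved (MAR) with an arm-specific probability, and the two stated metrics are returned. All names and parameter values are hypothetical, and unobserved outcomes are simply ignored rather than imputed.

```python
import numpy as np

def simulate_ts_with_missingness(p=(0.5, 0.5), miss=(0.2, 0.2), horizon=200, seed=None):
    """Two-armed Bernoulli Thompson sampling under MAR missingness.
    Each realised outcome is unobserved with probability miss[arm];
    unobserved outcomes do not update the Beta posterior, but the
    number of pulls of each arm remains known."""
    rng = np.random.default_rng(seed)
    alpha, beta = np.ones(2), np.ones(2)      # Beta(1, 1) priors
    pulls = np.zeros(2, dtype=int)
    successes = 0
    for _ in range(horizon):
        arm = int(np.argmax(rng.beta(alpha, beta)))   # posterior sampling
        pulls[arm] += 1
        outcome = rng.random() < p[arm]
        successes += outcome                          # metric: total successes
        if rng.random() >= miss[arm]:                 # outcome observed?
            alpha[arm] += outcome
            beta[arm] += 1 - outcome
    # metrics: fraction of assignments to treatment (arm 1), total successes
    return pulls[1] / horizon, successes
```

Under the null (`p=(0.5, 0.5)`), repeated runs of this sketch exhibit the variability in the treatment-allocation fraction that the reviewer refers to below.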

Scope: This is an empirical paper that focuses on the measurement/benchmarking of the impact of MAR data on key performance metrics of commonly used bandit algorithms for A/B testing.

Review of extant literature: I think the following articles (and appropriate references therein) should be cited since they are related to the theme of this paper:

1. Why adaptively collected data have negative bias and how to correct for it [Nie et al., AISTATS 2018].

2. Are sample means in multi-armed bandits positively or negatively biased? [Shin et al., NeurIPS 2019].

3. Accurate inference for adaptive linear models [Deshpande et al., ICML 2018].

4. Online Multi-Armed Bandits with Adaptive Inference [Dimakopoulou et al., NeurIPS 2021].

Aforementioned works investigate theoretically the directions, causes, implications and mitigations for biases in bandit algorithms resulting from sample-adaptivity, and are highly relevant to this paper, especially in the context of mean imputation for missing data. Cited references may be able to provide a theoretical explanation for many of the empirical observations made in this submission. In addition, reference [56] (bibliography) provides a theoretical explanation for the "imbalanced" behavior of RTS (Raw Thompson Sampling) under the null, observable also in experimental results in this submission. The same reference also provides an explanation for the behavior of UCB1 (Auer et al., 2002) under the null; these aspects should be elucidated in detail in view of the numerical experiments conducted in this submission.

Miscellaneous: Line 243 -- Shouldn't it be "t" instead of "T" in the expression for \beta? Otherwise the algorithm would correspond to a different version of UCB.
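For reference, the UCB1 index of Auer et al. (2002) uses the logarithm of the current round t in its exploration bonus; substituting the fixed horizon T yields a different UCB variant, which is the reviewer's point. A minimal sketch (function name is illustrative):

```python
import math

def ucb1_index(sample_mean, n_pulls, t):
    """UCB1 index (Auer et al., 2002). The exploration bonus grows with
    the *current* round t, not the horizon T."""
    if n_pulls == 0:
        return math.inf   # force one initial pull of every arm
    return sample_mean + math.sqrt(2.0 * math.log(t) / n_pulls)
```

At each round the algorithm pulls the arm with the largest index; because the bonus increases with t while n_pulls is fixed, every arm is revisited infinitely often.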

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

We thank the reviewer for pointing out the mentioned theoretical references. We agree that including these references is a very helpful addition, as they are highly relevant to our paper, in particular in explaining the results seen when using mean imputation for missing data. We have now incorporated all of the references into our paper in a new Section 4.3 (‘The impact of biased estimates on mean imputation’), as summarised in the 'Response to Reviewers' letter. This section is the major revision we have made and is a direct response to the reviewer’s insightful feedback, which allowed us to better explain our results. We have also updated our abstract to better present our main observations in light of our reading of the suggested references.

1. Nie et al. (2018) prove that the bias of the sample mean for any fixed arm at any fixed time is negative when the sampling strategy satisfies two conditions called ‘Exploit’ and ‘Independence of Irrelevant Options’ (IIO). In addition, they suggest two ways of targeting the biased estimate by modifying the data collection procedure. We discuss the mean imputation results in light of this explanation of negative bias in our revised manuscript, and we mention this paper in Lines 674, 676, and 743.
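The negative bias that Nie et al. (2018) prove can be reproduced with a few lines of Monte Carlo. The sketch below is hypothetical illustration, not the manuscript's code: it uses a greedy rule, which satisfies the ‘Exploit’ and ‘IIO’ conditions, on two identical Bernoulli arms, and estimates the bias of arm 0's sample mean.

```python
import numpy as np

def greedy_sample_mean_bias(p=0.5, horizon=50, reps=2000, seed=0):
    """Monte Carlo estimate of the bias of arm 0's sample mean under a
    greedy rule on two identical Bernoulli(p) arms. Greedy sampling
    satisfies the 'Exploit' and 'IIO' conditions of Nie et al. (2018),
    so the sample mean is negatively biased."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(reps):
        sums, counts = np.zeros(2), np.zeros(2)
        for t in range(horizon):
            if t < 2:
                arm = t                              # pull each arm once
            else:
                arm = int(np.argmax(sums / counts))  # greedy (ties -> arm 0)
            x = rng.random() < p
            sums[arm] += x
            counts[arm] += 1
        errors.append(sums[0] / counts[0] - p)       # estimation error, arm 0
    return float(np.mean(errors))                    # average error = bias
```

Intuitively, an arm whose early draws happen to be failures stops being sampled, so unluckily low sample means are "frozen in", producing the negative bias.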

2. Shin et al. (2019) show theoretically that in many typical MAB settings, sample means have two contradictory sources of bias: negative bias from ‘optimistic sampling’ and positive bias from ‘optimistic stopping/choosing’. This work not only provides a broader discussion than the settings of Nie et al. (2018) and our submission, suggesting directions for future research, but also extends the formula for negative bias given in Bowden and Trippa (2017), which only applied to randomised adaptive sampling rules. The formulae in both references give some insight into the magnitude of the bias under different multi-armed bandit algorithms. We discuss this in combination with our experimental results to provide additional interpretations in Lines 674, 678, and 691 of the revised manuscript.

3. Deshpande et al. (2018) discuss the simple case of multi-armed bandits without covariates, where the ordinary least squares estimates correspond to the sample means of each arm, and propose to decorrelate the OLS estimator. Although this estimator is not used for imputation in our submission, we discuss how it could be an avenue for future investigation into the imputation of missing data in Line 745 of the revised manuscript.

4. Dimakopoulou et al. (2021) propose Doubly-Adaptive Thompson Sampling (DATS), which harnesses the strengths of adaptive inference estimators to ensure sufficient exploration in the initial stages of learning, together with the effective exploration-exploitation balance provided by the TS mechanism. This debiasing technique is not used in our submission, but we discuss how it could be a way of handling the problem of biased estimates for some specific algorithms (i.e., TS) in Line 748 of the revised manuscript.

We also thank the reviewer for highlighting Kalvit and Zeevi (2021) (Reference [56]), which discusses the sampling behaviour of TS and UCB in the ‘large gap’ (i.e., ‘well-separated’) and ‘small gap’ (i.e., ‘worst-case’) regimes. The latter matches the scenarios under the null in our experimental investigations. For this reason, this reference helps to explain the ‘incomplete sampling’ of TS (termed ‘random selection’ in our submission) from a theoretical perspective. This behaviour differs from the ‘complete learning’ behaviour of UCB (i.e., inducing a ‘balanced’ allocation under the null), which is also seen in our experimental results. We have modified the related discussions to include this reference as an explanation of the sampling behaviour of TS and UCB in Lines 300, 318, and 335 of the revised manuscript.
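The contrast between TS's incomplete sampling and UCB's balanced allocation under the null can be seen in a small experiment. This is an illustrative sketch under assumed parameter values, not the manuscript's code: it records, across independent runs with two equal Bernoulli arms, the fraction of pulls given to arm 1 under each algorithm.

```python
import numpy as np

def allocation_fractions(algo, p=0.5, horizon=200, reps=200, seed=0):
    """Fraction of pulls allocated to arm 1 under the null (two equal
    Bernoulli(p) arms), across independent runs. algo is 'ts' or 'ucb1'."""
    rng = np.random.default_rng(seed)
    fracs = []
    for _ in range(reps):
        sums, counts = np.zeros(2), np.zeros(2)
        a, b = np.ones(2), np.ones(2)          # Beta posteriors for TS
        for t in range(1, horizon + 1):
            if algo == "ts":
                arm = int(np.argmax(rng.beta(a, b)))      # posterior sampling
            elif t <= 2:
                arm = t - 1                               # UCB1: pull each arm once
            else:
                idx = sums / counts + np.sqrt(2.0 * np.log(t) / counts)
                arm = int(np.argmax(idx))                 # UCB1 index
            x = rng.random() < p
            sums[arm] += x
            counts[arm] += 1
            a[arm] += x
            b[arm] += 1 - x
        fracs.append(counts[1] / horizon)
    return np.array(fracs)
```

Across runs, the UCB1 fractions concentrate near 0.5 while the TS fractions are widely dispersed, matching the ‘balanced’ versus ‘random selection’ behaviours discussed above.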

Attachments
Attachment
Submitted filename: Response.pdf
Decision Letter - Darrell A. Worthy, Editor

Some performance considerations when using multi-armed bandit algorithms in the presence of missing data

PONE-D-22-10081R1

Dear Dr. Chen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

I sent your manuscript back to the original reviewer, and they felt all their comments had been adequately addressed.  Therefore, I am happy to accept your manuscript for publication.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Darrell A. Worthy, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

Formally Accepted
Acceptance Letter - Darrell A. Worthy, Editor

PONE-D-22-10081R1

Some performance considerations when using multi-armed bandit algorithms in the presence of missing data

Dear Dr. Chen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Darrell A. Worthy

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.