Video-based interventions to improve self-assessment accuracy among physicians: A systematic review

Chandni Pattni; Michael Scaffidi; Juana Li; Shai Genis; Nikko Gimpaya; Rishad Khan; Rishi Bansal; Nazi Torabi; Catharine M. Walsh; Samir C. Grover

doi:10.1371/journal.pone.0288474

Abstract

Purpose

Self-assessment of a physician’s performance in both procedure and non-procedural activities can be used to identify their deficiencies to allow for appropriate corrective measures. Physicians are inaccurate in their self-assessments, which may compromise opportunities for self- development. To improve this accuracy, video-based interventions of physicians watching their own performance, an experts’ performance or both, have been proposed to inform their self-assessment. We conducted a systematic review of the effectiveness of video-based interventions targeting improved self-assessment accuracy among physicians.

Materials and methods

The authors performed a systematic search of MEDLINE, Embase, EBM reviews, and Scopus databases from inception to August 23, 2022, using combinations of terms for “self-assessment”, “video-recording”, and “physician”. Eligible studies were empirical investigations assessing the effect of video-based interventions on physicians’ self-assessment accuracy with a comparison of self-assessment accuracy pre- and post- video intervention. We defined self-assessment accuracy as a “direct comparison between an external evaluator and self-assessment that was quantified using formal statistical analysis”. Two reviewers independently screened records, extracted data, assessed risk of bias, and evaluated quality of evidence. A narrative synthesis was conducted, as variable outcomes precluded a meta-analysis.

Results

A total of 2,376 papers were initially retrieved. Of these, 22 papers were selected for full-text review; a final 9 studies met inclusion criteria for data extraction. Across studies, 240 participants from 5 specialties were represented. Video-based interventions included self-video review (8/9), benchmark video review (3/9), and/or a combination of both types (1/9). Five out of nine studies reported that participants had inaccurate self-assessment at baseline. After the intervention, 5 of 9 studies found a statistically significant improvement in self-assessment accuracy.

Conclusions

Overall, current data suggests video-based interventions can improve self-assessment accuracy. Benchmark video review may enable physicians to improve self-assessment accuracy, especially for those with limited experience performing a particular clinical skill. In contrast, self-video review may be able to provide improvement in self-assessment accuracy for more experience physicians. Future research should use standardized methods of comparison for self-assessment accuracy, such as the Bland-Altman analysis, to facilitate meta-analytic summation.

Citation: Pattni C, Scaffidi M, Li J, Genis S, Gimpaya N, Khan R, et al. (2023) Video-based interventions to improve self-assessment accuracy among physicians: A systematic review. PLoS ONE 18(7): e0288474. https://doi.org/10.1371/journal.pone.0288474

Editor: Sorana D. Bolboacă, Iuliu Hațieganu University of Medicine and Pharmacy: Universitatea de Medicina si Farmacie Iuliu Hatieganu, ROMANIA

Received: November 3, 2022; Accepted: June 28, 2023; Published: July 13, 2023

Copyright: © 2023 Pattni et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files.

Funding: Funding/Support: CMW holds an Early Researcher Award from the Ontario Ministry of Research and Innovation. The funders had no role in the design and conduct of the review, decision to publish and preparation, review, or approval of the manuscript.

Competing interests: Other disclosures: Rishad Khan has received research grants from AbbVie and Ferring Pharmaceuticals and research funding from Pendopharm. Samir C. Grover has received research grants and personal fees from AbbVie and Ferring Pharmaceuticals, personal fees from Takeda, education grants from Janssen, and has equity in Volo Healthcare. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Introduction

Self-assessment, the evaluation of one’s own performance compared to a standard, can be used to identify specific deficiencies in a physician’s performance of relevant procedural and non-procedural skills. Ultimately, this would allow for appropriate corrective measures [1–3]. The current literature, however, suggests that physicians are inaccurate in their self-assessments, which may compromise opportunities for self-development [4,5]. In particular, self-assessment accuracy, the extent to which one’s own evaluation corresponds to an external standard, is inaccurate among medical subspecialty physicians and surgeons alike [5–8]. Inaccurate self-assessment may impede opportunities for professional growth, allowing for mistaken overconfidence in one’s own performance [9,10].

To mitigate the risks from inaccurate self-assessment and optimize opportunities for self-development, video-based interventions have been proposed. Video-based interventions involve physicians watching their own performance, an experts’ performance or both, in order to inform their self-assessment [11–13]. It has been proposed that these interventions work by facilitating feedback to the individual via visual cues, which allows for more informed assessment [12]. To date, however, a comprehensive evaluation of these interventions and their outcomes has not been conducted, as there is no systematic review on the topic [14].

We conducted a systematic review of the effectiveness of video-based interventions targeting improved self-assessment accuracy among all physicians, in order to inform educational and clinical practices as well as provide suggestions for future research.

Materials and methods

We prospectively registered this systematic review in the PROSPERO International Prospective Register of Systematic Reviews (CRD42020161954) and followed reporting guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [15].

Definitions

We defined self-assessment accuracy as a direct comparison between an external evaluator and self-assessment that was quantified using formal statistical analysis. Specifically, we considered “accurate self-assessment” according to how it was defined in the included papers.

Data sources and searches

Using a search strategy developed by an academic health sciences librarian (N.T.), we systematically searched four bibliographic databases—Ovid MEDLINE (including Epub ahead of print, in-process, and other unindexed citations), Ovid Embase, EBM Reviews, and Scopus—from database inception through August 23, 2022. We conducted an initial database search on September 17, 2019, and updated it on August 11, 2020, August 20, 2021 and August 23, 2022. Search terms included the concepts of “self-assessment”, “video recording” and “physicians”, while also using a combination of subject headings and key words (see S1 File). Additionally, we searched the grey literature through August 2022 using the following: metaRegister of controlled trials (active and archived registers); the WHO ICTRP portal (http://apps.who.int/trialsearch/); Clinical Trials.gov (https://clinicaltrials.gov/); and the Cochrane review database. Finally, we searched abstracts and proceedings of major medical education meetings including AMEE: The Association for Medical Education in Europe Conference (2015–22), the International Conference on Residency Education (2014–22); and the Canadian Conference on Medical Education (2018–22).

Eligibility criteria

We included all randomized controlled trials (RCTs) and non-randomized before-after studies investigating the effect of video-based interventions to improve self assessment accuracy among physicians regardless of career stage (i.e. post-graduate trainees/residents/fellows and/or practicing physicians). We only examined studies in which the majority of participants were physicians, as we aimed to represent our findings among only fully licensed, independent practitioners. (Note: we only included studies that had medical students if the total proportion of medical students was <50% among all participants and the remainder of the participants were attending physicians and/or post-graduate trainees). We excluded studies if they only examined external or self-assessment without comparison to one another, assessed non-procedural tasks (e.g. communication skills), did not use a video-based intervention, was not a primary study (e.g. review), only involved medical students and/or non-physicians, published only in abstract form, or if they were published in any language other than English. We made no exclusions based on the date of publication.

Study selection

All retrieved studies were screened in a 2-stage process, beginning with titles and abstracts, for which inter-rater agreement was 92.3% (κ = 0.36), then proceeding to full text before determining eligibility. At each level, the studies were screened independently by two authors (CP, MAS) using Colandr, an open-source, research and evidence synthesis tool [16,17]. Chance-corrected agreement on inclusion after full-text screening between raters was high at 95.4% (κ = 0.91). Any discrepancies were resolved by consensus discussion and adjudication in the presence of a third reviewer (NG).

Data extraction and synthesis

Two authors (CP, MAS) independently extracted the information from all eligible studies using a standardized data extraction sheet with Excel 2016 (Microsoft Corporation, Redmond, Washington). We extracted data on the following:

Study demographics, including title, authors, publication date, journal of publication, study design (observation or interventional), task (procedural or non-procedural and exact task), study setting (clinical or simulated);
Participant and external observer details including, number of participants, experience level, subspecialty, metric of inter-rater agreement among external observers (e.g. Cohen’s kappa)
Type of video interventions and assessment, including self-video and/or benchmark/expert video), self/external assessment tools and validity evidence of the tools, timeframe of intervention with respect to post-intervention self-assessment
Type of outcomes, including statistical method for comparison of self-assessment accuracy, self-assessment accuracy at baseline with associated study metric, and impact of video-based intervention on self-assessment accuracy with p-values where provided.

Any disagreements were resolved by consensus discussion. If a study was missing data, we contacted the authors of that study in an effort to obtain the missing data.

Data analysis

The primary outcome was the impact of the video-based intervention on self-assessment accuracy. Due to the differences in the types of outcomes in the included studies, a meta-analysis was not possible; therefore, we conducted a narrative synthesis of the data.

Risk-of-bias and quality assessment

Two reviewers (NG, JL) independently assessed risk of bias for each included study using the Quality Assessment of Diagnostic Accuracy 2 (QUADAS-2) tool for diagnostic accuracy within each study [18], which was modified for self-assessment accuracy (see S2 File). The quality of each study was also rated using the Medical Education Research Study Quality Instrument (MERSQI) [19]. Two reviewers (MAS, CP) evaluated the following six domains of all included studies: study design; sampling; type of data; validity evidence for evaluation instrument scores; data analysis; and outcome. Using a weighting score for each section, an overall score that ranged between 5 and 18 was calculated.

Results

Our initial search yielded 2,376 original articles, which included 279, 409, and 194 papers from subsequent updates, respectively. There were 1,302 duplicates that were removed. At the title and abstract screening stage, 1, 074 studies were evaluated. Additionally, the grey literature search did not identify any relevant records. The most common reason for the exclusion of records at the time of the abstract was due to lack of defined self-assessment accuracy. At the full-text screening stage, 22 studies were evaluated and 13 of these were excluded, with no disagreements between authors (for rationale of excluded full texts, see S3 File). Ultimately, 9 studies met inclusion criteria and were included in this systematic review. The search flow with details regarding full-text article exclusion is summarized in Fig 1.

Download:

Fig 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow chart illustrating the identification and selection process for studies included in this meta-analysis.

https://doi.org/10.1371/journal.pone.0288474.g001

Study characteristics

The characteristics of included studies are outlined in Table 1. All 9 articles were interventional studies published between 1998 and 2019 [12,20–27], which included a total of 240 participants. All studies were single-center and 3 were randomized controlled trials while the remainder were non-randomized interventional studies. The most common specialty was General Surgery in 4/9 studies, one of which focused on Bariatric surgery while the remaining 3 represented core General Surgery. In terms of experience levels among the 240 participants, there were 17 (7.1%) medical students, 217 (90.4%) post-graduate trainees, and 6 (2.5%) attending physicians. All studies used a pre-post study design, with most using a single arm design (6/9). In terms of the type of skill, 5/9 studies investigated self-assessment of procedural skills including gastroscopy⁷, sigmoidoscopy¹², laparoscopic surgery [21,25], and ultrasound-guided central venous catheter insertion into the internal jugular vein (US CVC IJ).⁹ The remaining 4/9 studies were non-procedural, focused on tasks related to simulated patient encounters [24,28], resuscitations¹⁰ and mock oral examinations [22].

Download:

Table 1. Characteristics for studies included in qualitative synthesis (n = 9).

https://doi.org/10.1371/journal.pone.0288474.t001

Characteristics of the interventions

Among the nine studies, the most common type of intervention was self-video review as evaluated in 8 studies, followed by benchmark video review in 4 studies and combination of benchmark and self-video review in 3 studies. Three of nine studies allowed for playback review with fast-forwarding and/or rewinding throughout the video intervention. One study (1/9) allowed for experts to choose specific video segments for each participant to review in order to illustrate constructive feedback. Three (3/9) studies reported a delay between the task and video review. Overall, there was a paucity of information regarding the length of the video and characteristics of the video review in the majority of studies.

Assessment of outcomes

The most common type of outcome assessment was an assessment tool with prior validity evidence, as used in 4 (4/9) studies [12,21,25,26]. External assessment was most commonly performed by experts who were unblinded to participant identity/group assignment (5/9); one (1/9) study used computer metrics on an endoscopy simulator. There was no time delay between the intervention and assessment of outcome in 6 (6/7) studies.

Impact of video interventions on self-assessment accuracy

At baseline, 5 (5/9) studies found that there was inaccurate self-assessment among participants; three (3/5) of these studies focused on procedural skill. Of these 5 studies, 2 (General Surgery) reported under-estimation of skill, 1 (Neurology) reported over-estimation, and the remaining 2 reported indeterminate variations in self-assessment accuracy.

After the video-based intervention, 5 (5/9) studies found that there was a statistically significant improvement in self-assessment accuracy. In terms of intervention type, benchmark video review was effective in 3 (3/9) studies, a combination of self-video and benchmark video review was effective in 1 study (1/9) and self-video review was effective in 2 (2/9) studies. Among the studies in which the benchmark video review was effective, 2 were procedural (General Surgery, Gastroenterology) and 1 was non-procedural (General Surgery). In contrast, both (2/9) studies in which self-video review was effective were from General Surgery, only 1 of which was procedural.

There were 2 (2/9) studies that directly compared different types of video-based interventions using two different participant arms. In one study, benchmark video review led to a significant improvement in self-assessment accuracy when compared directly to the combination group of both benchmark video review and self video review; no significant differences were observed when benchmark video review was compared to self-video review [12]. In the other study, benchmark video review led to a significant improvement in self-assessment accuracy compared to self-video review and to simulator practice only [31].

Additionally, there was another study that compared the use of two video-based interventions within the same group over time, wherein self-assessment accuracy was measured after self-video review and then again after review of four benchmark performances [32]. This study found that self assessment accuracy only improved after self-video review and not benchmark video review.

Risk of bias and study quality

Risk of bias assessments are summarized in Fig 2. Eight (8/9) studies had a low risk of bias and one (1/9) study had a high risk of bias. In the one study with high risk of bias, there was no formal randomization of participants noted and there was a potential for bias in how the reference standard used, as participants reviewed the video with the assessors, which could have introduced bias. MERSQI scores ranged from 11.5 to 13, suggesting a overall high study quality [33] (see Table 2).

Download:

Fig 2. Quality assessment of diagnostic accuracy studies 2 (QUADAS-2) risk of bias assessment.

https://doi.org/10.1371/journal.pone.0288474.g002

Download:

Table 2. Quality assessment for included studies (n = 9).

https://doi.org/10.1371/journal.pone.0288474.t002

Discussion

In this systematic review of video-based interventions to improve self-assessment accuracy of physicians, we found that most interventions were effective, as reported in 5 studies of the 9 included studies [12,20–27]. Video-based interventions used either videos of participants’ own performances (i.e., self-video review) or, less commonly, expert performances (i.e., benchmark video review), or, least commonly of all, a combination of both. Overall, benchmark video review was the most effective intervention. The primary strength of our study is that it is the first systematic review on interventions of any kind aimed at improving self-assessment accuracy among physicians. Additionally, our search was comprehensive, as it included six databases and used broad search criteria.

One explanation for the finding that videos, especially benchmark video review, can be effective at improving self-assessment accuracy is that it addresses deficiencies outlined by the Dunning-Kruger effect. In particular, individuals who have less experience or training may not be able to develop an accurate schema of their performance. This discrepancy, however, can be mitigated by providing an external point of reference (e.g. viewing a benchmark performance). This explanation is reinforced by our finding that benchmark video review was the most effective type of intervention, as it significantly improved self-assessment accuracy in almost all of the studies that used it [12,20,27]. Furthermore, in two studies that directly compared with an intervention involving self-video review (or a combination of benchmark video review with self-video review) [27,34], benchmark video review was found to be more effective. Of note, however, one study that found that benchmark video review did not significantly improve self-assessment accuracy; in this same study, self-video review used prior to the benchmark did indeed lead to more accurate assessments [21].

The differences in the effectiveness of benchmark videos compared to self-review videos may be contingent on physician experience levels. In particular, the one study which found self-video review more effective than benchmark-video review, was conducted among senior surgical residents [21]. The authors posited that this finding was likely due to the experience level of the residents–that is, a benchmark video did not provide them with any new information [21]. Rather, the residents likely already had an accurate working schema of their abilities that was only improved when they had an objective record of their own performances (i.e., self-video review) to inform their self-assessments. To this end, the effectiveness of benchmark videos is likely influenced by prior experience, with more novice physicians benefiting most. This is reflective of the process of “modelling”, which is well described in the psychological literature [35]. More specific to medical education, the use observation of videos to learn clinical skills has been demonstrated in several studies. For example, video-recorded performances, even when containing errors, supply learners with an idea of the variability of a performance’s scope [36].

We note the study limitations. Primarily, we could not quantify the impact of videos on self-assessment with a meta-analysis due to the lack of standardized outcome measures and different types of tasks in the included studies. In particular, the Bland-Altman analysis, a well described approach in the method comparison literature [37], is preferred, as this can allow for meta-analytic summation [8,38]. Similarly, we could not account for the potential confounding effect of accurate self-assessment at baseline, as noted in four studies. In addition, most studies did not quantify intervention length, which restricted our ability to determine whether length of the video interventions influences outcomes. The other characteristic we could not control for was physician level of experience and specialty that preceded the intervention, especially as the former has shown to have a significant impact on physician self-assessment [8]. Third, we may have missed relevant studies due to language bias, as we only included English-language abstracts, and due to a lack of universal definition of self-assessment accuracy. Last, we cannot definitively determine whether publication bias was present, as there may be additional studies with negative findings that were not yet published.

Conclusions

In closing, our systematic review highlights several implications for medical education. First, benchmark video review may enable physicians to improve self-assessment accuracy, especially for those with limited experience in performing a particular clinical skill. Similarly, self-video review may be able to inform more experienced physicians’ self-assessments. Second, due to the absence of similar analyses that precluded a meta-analysis, we suggest that future studies use standardized methods of comparison for self-assessment accuracy. Finally, it would be helpful for future research to delineate the effect of video-based interventions on both self-assessment accuracy and skill acquisition–that is, whether there is a relationship between improving self-assessment accuracy on skill acquisition. Quantifying this relationship would advance the objective impact of accurate self-assessment on clinical practice.

Supporting information

S1 Checklist. PRISMA 2020 checklist.

https://doi.org/10.1371/journal.pone.0288474.s001

(PDF)

S1 File. Search strategy.

https://doi.org/10.1371/journal.pone.0288474.s002

(DOCX)

S2 File. QUADAS-2 tool modified for self-assessment accuracy studies.

https://doi.org/10.1371/journal.pone.0288474.s003

(DOCX)

S3 File. Reference list for excluded studies from full-text review, with respective rationale for exclusion.

https://doi.org/10.1371/journal.pone.0288474.s004

(DOCX)

References

1. Colthart I, Bagnall G, Evans A, Allbutt H, Haig A, Illing J, et al. The effectiveness of self-assessment on the identification of learner needs, learner activity, and impact on clinical practice: BEME Guide no. 10. Med Teach. 2008;30: 124–145. pmid:18464136
- View Article
- PubMed/NCBI
- Google Scholar
2. Eva KW, Regehr G. Self-Assessment in the Health Professions: A Reformulation and Research Agenda. Acad Med. 2005;80: S46–S54. pmid:16199457
- View Article
- PubMed/NCBI
- Google Scholar
3. Sargeant J, Armson H, Chesluk B, Dornan T, Eva K, Holmboe E, et al. The processes and dimensions of informed self-assessment: a conceptual model. Acad Med. 2010;85: 1212–1220. pmid:20375832
- View Article
- PubMed/NCBI
- Google Scholar
4. Davis DA, Mazmanian PE, Fordis M, Van Harrison R, Thorpe KE, Perrier L. Accuracy of Physician Self-assessment Compared: A Systematic Review. J Am Med Assoc. 2006;296: 1094–1102.
- View Article
- Google Scholar
5. Nayar SK, Musto L, Baruah G, Fernandes R, Bharathan R. Self-Assessment of Surgical Skills: A Systematic Review. J Surg Educ. 2020;77: 348–361. pmid:31582350
- View Article
- PubMed/NCBI
- Google Scholar
6. Davis DA, Mazmanian PE, Fordis M, Van Harrison R, Thorpe KE, Perrier L. Accuracy of physician self-assessment compared with observed measures of competence: a systematic review. Jama. 2006;296: 1094–1102. pmid:16954489
- View Article
- PubMed/NCBI
- Google Scholar
7. Blanch-Hartigan D. Medical students’ self-assessment of performance: Results from three meta-analyses. Patient Educ Couns. 2011;84: 3–9. pmid:20708898
- View Article
- PubMed/NCBI
- Google Scholar
8. Scaffidi MA, Li J, Genis S, Tipton E, Khan R, Pattni C, et al. Accuracy of self-assessment in gastrointestinal endoscopy: A systematic review and meta-analysis. Endoscopy. 2022; 176–185. pmid:36162425
- View Article
- PubMed/NCBI
- Google Scholar
9. Schultz K, McGregor T, Pincock R, Nichols K, Jain S, Pariag J. Discrepancies Between Preceptor and Resident Performance Assessment: Using an Electronic Formative Assessment Tool to Improve Residents’ Self-Assessment Skills. Acad Med. 2022;97: 669–673. pmid:33951681
- View Article
- PubMed/NCBI
- Google Scholar
10. Cassam Q. Diagnostic error, overconfidence and self-knowledge. Palgrave Commun. 2017;3.
- View Article
- Google Scholar
11. Vyasa P, Willis RE, Dunkin BJ, Gardner AK. Are General Surgery Residents Accurate Assessors of Their Own Flexible Endoscopy Skills? J Surg Educ. 2017;74: 23–29. pmid:27522346
- View Article
- PubMed/NCBI
- Google Scholar
12. Scaffidi M, Walsh C, Khan R, Parker C, Al-Mazroui A, Abunassar M, et al. Influence of video-based feedback on self-assessment accuracy of endoscopic skills: a randomized controlled trial. Endosc Int Open. 2019;07: E678–E684. pmid:31061880
- View Article
- PubMed/NCBI
- Google Scholar
13. Hawkins SC, Osborne A, Schofield SJ, Pournaras DJ, Chester JF. Improving the accuracy of self-assessment of practical clinical skills using video feedback The importance of including benchmarks. Med Teach. 2012;34: 279–284. pmid:22455696
- View Article
- PubMed/NCBI
- Google Scholar
14. Colthart I, Bagnall G, Evans A, Allbutt H, Haig A, Illing J, et al. The effectiveness of self-assessment on the identification of learner needs, learner activity, and impact on clinical practice: BEME Guide no. 10. Med Teach. 2008;30: 124–145. pmid:18464136
- View Article
- PubMed/NCBI
- Google Scholar
15. Shamseer L, Moher D, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: Elaboration and explanation. BMJ. 2015;349: 1–25. pmid:25555855
- View Article
- PubMed/NCBI
- Google Scholar
16. Harrison H, Griffin SJ, Kuhn I, Usher-smith JA. Software tools to support title and abstract screening for systematic reviews in healthcare: an evaluation. 2020;3: 1–12.
- View Article
- Google Scholar
17. Kahili-Heede M, Hillgren KJ. Colandr. J Med Libr Assoc. 2021;198. pmid:34629986
- View Article
- PubMed/NCBI
- Google Scholar
18. Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies. Ann Intern Med. 2011; 253–260.
- View Article
- Google Scholar
19. Cook DA, Reed DA. Appraising the Quality of Medical Education Research Methods: The Medical Education Research Study Quality Instrument and the Newcastle-Ottawa Scale-Education. Acad Med. 2015;90: 1067–1076. pmid:26107881
- View Article
- PubMed/NCBI
- Google Scholar
20. Martin D, Regehr G, Hodges B, McNaughton N. Improving self-assessment of family practice residents using benchmark videos. Academic Medicine. 1998. pp. 1201–1206.
- View Article
- Google Scholar
21. Ward M, MacRae H, Schlachta C, Mamazza J, Poulin E, Reznick R, et al. Resident self-assessment of operative performance. Am J Surg. 2003;185: 521–524. http://dx.doi.org/10.1016/S0002-9610%2803%2900069-2. pmid:12781878
- View Article
- PubMed/NCBI
- Google Scholar
22. Kozol R, Giles M, Voytovich A. The value of videotape in mock oral board examinations. Curr Surg. 2004;61: 511–514. pmid:15475107
- View Article
- PubMed/NCBI
- Google Scholar
23. Sadosty AT, Bellolio MF, Laack TA, Luke A, Weaver A, Goyal DG. Simulation-based emergency medicine resident self-assessment. J Emerg Med. 2011;41: 679–685. pmid:21835571
- View Article
- PubMed/NCBI
- Google Scholar
24. Schellenberg KL, Schofield SJ, Fang S, Johnston WS. Breaking bad news in amyotrophic lateral sclerosis: the need for medical education. Amyotroph Lateral Scler Frontotemporal Degener. 2014;15: 47–54. https://dx.doi.org/10.3109/21678421.2013.843711. pmid:24245652
- View Article
- PubMed/NCBI
- Google Scholar
25. Herrera-Almario GE, Kirk K, Guerrero VT, Jeong K, Kim S, Hamad GG. The effect of video review of resident laparoscopic surgical skills measured by self- and external assessment. Am J Surg. 2016;211: 315–320. pmid:26590043
- View Article
- PubMed/NCBI
- Google Scholar
26. Wittler M, Hartman N, Manthey D, Hiestand B, Askew K. Video-augmented feedback for procedural performance. Med Teach. 2015; 1–6. pmid:26383586
- View Article
- PubMed/NCBI
- Google Scholar
27. Vyasa P, Willis RE, Dunkin BJ, Gardner AK. Are General Surgery Residents Accurate Assessors of Their Own Flexible Endoscopy Skills? J Surg Educ. 2016;74: 23–29. pmid:27522346
- View Article
- PubMed/NCBI
- Google Scholar
28. Martin D, Regehr G, Hodges B, McNaughton N. Using videotaped benchmarks to improve the self-assessment ability of family practice residents. Acad Med. 1998;73: 1201–1206. pmid:9834705
- View Article
- PubMed/NCBI
- Google Scholar
29. Vassiliou MC, Feldman LS, Andrew CG, Bergman S, Leffondré K, Stanbridge D, et al. A global assessment tool for evaluation of intraoperative laparoscopic skills. Am J Surg. 2005;190: 107–113. pmid:15972181
- View Article
- PubMed/NCBI
- Google Scholar
30. Walsh CM, Ling SC, Mamula P, Lightdale JR, Walters TD, Yu JJ, et al. The gastrointestinal endoscopy competency assessment tool for pediatric colonoscopy. J Pediatr Gastroenterol Nutr. 2015;60: 474–480. pmid:25564819
- View Article
- PubMed/NCBI
- Google Scholar
31. Vyasa P, Willis RE, Dunkin BJ, Gardner AK. Are General Surgery Residents Accurate Assessors of Their Own Flexible Endoscopy Skills? J Surg Educ. 2017;74: 23–29. pmid:27522346
- View Article
- PubMed/NCBI
- Google Scholar
32. Ward M, MacRae H, Schlachta C, Mamazza J, Poulin E, Reznick R, et al. Resident self-assessment of operative performance. Am J Surg. 2003;185: 521–524. pmid:12781878
- View Article
- PubMed/NCBI
- Google Scholar
33. Reed DA, Cook DA, Beckman TJ, Levine RB, Kern DE, Wright SM. Association Between Funding and Quality of Published Medical Education Research. JAMA—J Am Med Assoc. 2007;298: 1002–1009. pmid:17785645
- View Article
- PubMed/NCBI
- Google Scholar
34. Causada-Calo NS, Gonzalez-Moreno EI, Bishay K, Shorr R, Dube C, Heitman SJ, et al. Educational interventions are associated with improvements in colonoscopy quality indicators: a systematic review and meta-analysis. Endosc Int Open. 2020;08: E1321–E1331. pmid:33015334
- View Article
- PubMed/NCBI
- Google Scholar
35. Ashford D, Davids K, Bennett SJ. Observational Modeling Effects for Movement Dynamics and Movement Outcome Measures Across Differing Task Constraints: A Meta-Analysis. J Mot Behav. 2006;38: 185–205. pmid:16709559
- View Article
- PubMed/NCBI
- Google Scholar
36. Domuracki K, Wong A, Olivieri L, Grierson LEM. The impacts of observing flawed and flawless demonstrations on clinical skill learning. Med Educ. 2015;49: 186–192. pmid:25626749
- View Article
- PubMed/NCBI
- Google Scholar
37. Zaki R, Bulgiba A, Ismail R, Ismail NA. Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: A systematic review. PLoS One. 2012;7: 1–7. pmid:22662248
- View Article
- PubMed/NCBI
- Google Scholar
38. Tipton E, Shuster J. A framework for the meta-analysis of Bland–Altman studies based on a limits of agreement approach. Stat Med. 2017;36: 3621–3635. pmid:28664537
- View Article
- PubMed/NCBI
- Google Scholar

[ref1] 1. Colthart I, Bagnall G, Evans A, Allbutt H, Haig A, Illing J, et al. The effectiveness of self-assessment on the identification of learner needs, learner activity, and impact on clinical practice: BEME Guide no. 10. Med Teach. 2008;30: 124–145. pmid:18464136
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Eva KW, Regehr G. Self-Assessment in the Health Professions: A Reformulation and Research Agenda. Acad Med. 2005;80: S46–S54. pmid:16199457
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Sargeant J, Armson H, Chesluk B, Dornan T, Eva K, Holmboe E, et al. The processes and dimensions of informed self-assessment: a conceptual model. Acad Med. 2010;85: 1212–1220. pmid:20375832
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref4] 4. Davis DA, Mazmanian PE, Fordis M, Van Harrison R, Thorpe KE, Perrier L. Accuracy of Physician Self-assessment Compared: A Systematic Review. J Am Med Assoc. 2006;296: 1094–1102.
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref5] 5. Nayar SK, Musto L, Baruah G, Fernandes R, Bharathan R. Self-Assessment of Surgical Skills: A Systematic Review. J Surg Educ. 2020;77: 348–361. pmid:31582350
View Article
PubMed/NCBI
Google Scholar

[17] View Article

[18] PubMed/NCBI

[19] Google Scholar

[ref6] 6. Davis DA, Mazmanian PE, Fordis M, Van Harrison R, Thorpe KE, Perrier L. Accuracy of physician self-assessment compared with observed measures of competence: a systematic review. Jama. 2006;296: 1094–1102. pmid:16954489
View Article
PubMed/NCBI
Google Scholar

[21] View Article

[22] PubMed/NCBI

[23] Google Scholar

[ref7] 7. Blanch-Hartigan D. Medical students’ self-assessment of performance: Results from three meta-analyses. Patient Educ Couns. 2011;84: 3–9. pmid:20708898
View Article
PubMed/NCBI
Google Scholar

[25] View Article

[26] PubMed/NCBI

[27] Google Scholar

[ref8] 8. Scaffidi MA, Li J, Genis S, Tipton E, Khan R, Pattni C, et al. Accuracy of self-assessment in gastrointestinal endoscopy: A systematic review and meta-analysis. Endoscopy. 2022; 176–185. pmid:36162425
View Article
PubMed/NCBI
Google Scholar

[29] View Article

[30] PubMed/NCBI

[31] Google Scholar

[ref9] 9. Schultz K, McGregor T, Pincock R, Nichols K, Jain S, Pariag J. Discrepancies Between Preceptor and Resident Performance Assessment: Using an Electronic Formative Assessment Tool to Improve Residents’ Self-Assessment Skills. Acad Med. 2022;97: 669–673. pmid:33951681
View Article
PubMed/NCBI
Google Scholar

[33] View Article

[34] PubMed/NCBI

[35] Google Scholar

[ref10] 10. Cassam Q. Diagnostic error, overconfidence and self-knowledge. Palgrave Commun. 2017;3.
View Article
Google Scholar

[37] View Article

[38] Google Scholar

[ref11] 11. Vyasa P, Willis RE, Dunkin BJ, Gardner AK. Are General Surgery Residents Accurate Assessors of Their Own Flexible Endoscopy Skills? J Surg Educ. 2017;74: 23–29. pmid:27522346
View Article
PubMed/NCBI
Google Scholar

[40] View Article

[41] PubMed/NCBI

[42] Google Scholar

[ref12] 12. Scaffidi M, Walsh C, Khan R, Parker C, Al-Mazroui A, Abunassar M, et al. Influence of video-based feedback on self-assessment accuracy of endoscopic skills: a randomized controlled trial. Endosc Int Open. 2019;07: E678–E684. pmid:31061880
View Article
PubMed/NCBI
Google Scholar

[44] View Article

[45] PubMed/NCBI

[46] Google Scholar

[ref13] 13. Hawkins SC, Osborne A, Schofield SJ, Pournaras DJ, Chester JF. Improving the accuracy of self-assessment of practical clinical skills using video feedback The importance of including benchmarks. Med Teach. 2012;34: 279–284. pmid:22455696
View Article
PubMed/NCBI
Google Scholar

[48] View Article

[49] PubMed/NCBI

[50] Google Scholar

[ref14] 14. Colthart I, Bagnall G, Evans A, Allbutt H, Haig A, Illing J, et al. The effectiveness of self-assessment on the identification of learner needs, learner activity, and impact on clinical practice: BEME Guide no. 10. Med Teach. 2008;30: 124–145. pmid:18464136
View Article
PubMed/NCBI
Google Scholar

[52] View Article

[53] PubMed/NCBI

[54] Google Scholar

[ref15] 15. Shamseer L, Moher D, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: Elaboration and explanation. BMJ. 2015;349: 1–25. pmid:25555855
View Article
PubMed/NCBI
Google Scholar

[56] View Article

[57] PubMed/NCBI

[58] Google Scholar

[ref16] 16. Harrison H, Griffin SJ, Kuhn I, Usher-smith JA. Software tools to support title and abstract screening for systematic reviews in healthcare: an evaluation. 2020;3: 1–12.
View Article
Google Scholar

[60] View Article

[61] Google Scholar

[ref17] 17. Kahili-Heede M, Hillgren KJ. Colandr. J Med Libr Assoc. 2021;198. pmid:34629986
View Article
PubMed/NCBI
Google Scholar

[63] View Article

[64] PubMed/NCBI

[65] Google Scholar

[ref18] 18. Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies. Ann Intern Med. 2011; 253–260.
View Article
Google Scholar

[67] View Article

[68] Google Scholar

[ref19] 19. Cook DA, Reed DA. Appraising the Quality of Medical Education Research Methods: The Medical Education Research Study Quality Instrument and the Newcastle-Ottawa Scale-Education. Acad Med. 2015;90: 1067–1076. pmid:26107881
View Article
PubMed/NCBI
Google Scholar

[70] View Article

[71] PubMed/NCBI

[72] Google Scholar

[ref20] 20. Martin D, Regehr G, Hodges B, McNaughton N. Improving self-assessment of family practice residents using benchmark videos. Academic Medicine. 1998. pp. 1201–1206.
View Article
Google Scholar

[74] View Article

[75] Google Scholar

[ref21] 21. Ward M, MacRae H, Schlachta C, Mamazza J, Poulin E, Reznick R, et al. Resident self-assessment of operative performance. Am J Surg. 2003;185: 521–524. http://dx.doi.org/10.1016/S0002-9610%2803%2900069-2. pmid:12781878
View Article
PubMed/NCBI
Google Scholar

[77] View Article

[78] PubMed/NCBI

[79] Google Scholar

[ref22] 22. Kozol R, Giles M, Voytovich A. The value of videotape in mock oral board examinations. Curr Surg. 2004;61: 511–514. pmid:15475107
View Article
PubMed/NCBI
Google Scholar

[81] View Article

[82] PubMed/NCBI

[83] Google Scholar

[ref23] 23. Sadosty AT, Bellolio MF, Laack TA, Luke A, Weaver A, Goyal DG. Simulation-based emergency medicine resident self-assessment. J Emerg Med. 2011;41: 679–685. pmid:21835571
View Article
PubMed/NCBI
Google Scholar

[85] View Article

[86] PubMed/NCBI

[87] Google Scholar

[ref24] 24. Schellenberg KL, Schofield SJ, Fang S, Johnston WS. Breaking bad news in amyotrophic lateral sclerosis: the need for medical education. Amyotroph Lateral Scler Frontotemporal Degener. 2014;15: 47–54. https://dx.doi.org/10.3109/21678421.2013.843711. pmid:24245652
View Article
PubMed/NCBI
Google Scholar

[89] View Article

[90] PubMed/NCBI

[91] Google Scholar

[ref25] 25. Herrera-Almario GE, Kirk K, Guerrero VT, Jeong K, Kim S, Hamad GG. The effect of video review of resident laparoscopic surgical skills measured by self- and external assessment. Am J Surg. 2016;211: 315–320. pmid:26590043
View Article
PubMed/NCBI
Google Scholar

[93] View Article

[94] PubMed/NCBI

[95] Google Scholar

[ref26] 26. Wittler M, Hartman N, Manthey D, Hiestand B, Askew K. Video-augmented feedback for procedural performance. Med Teach. 2015; 1–6. pmid:26383586
View Article
PubMed/NCBI
Google Scholar

[97] View Article

[98] PubMed/NCBI

[99] Google Scholar

[ref27] 27. Vyasa P, Willis RE, Dunkin BJ, Gardner AK. Are General Surgery Residents Accurate Assessors of Their Own Flexible Endoscopy Skills? J Surg Educ. 2016;74: 23–29. pmid:27522346
View Article
PubMed/NCBI
Google Scholar

[101] View Article

[102] PubMed/NCBI

[103] Google Scholar

[ref28] 28. Martin D, Regehr G, Hodges B, McNaughton N. Using videotaped benchmarks to improve the self-assessment ability of family practice residents. Acad Med. 1998;73: 1201–1206. pmid:9834705
View Article
PubMed/NCBI
Google Scholar

[105] View Article

[106] PubMed/NCBI

[107] Google Scholar

[ref29] 29. Vassiliou MC, Feldman LS, Andrew CG, Bergman S, Leffondré K, Stanbridge D, et al. A global assessment tool for evaluation of intraoperative laparoscopic skills. Am J Surg. 2005;190: 107–113. pmid:15972181
View Article
PubMed/NCBI
Google Scholar

[109] View Article

[110] PubMed/NCBI

[111] Google Scholar

[ref30] 30. Walsh CM, Ling SC, Mamula P, Lightdale JR, Walters TD, Yu JJ, et al. The gastrointestinal endoscopy competency assessment tool for pediatric colonoscopy. J Pediatr Gastroenterol Nutr. 2015;60: 474–480. pmid:25564819
View Article
PubMed/NCBI
Google Scholar

[113] View Article

[114] PubMed/NCBI

[115] Google Scholar

[ref31] 31. Vyasa P, Willis RE, Dunkin BJ, Gardner AK. Are General Surgery Residents Accurate Assessors of Their Own Flexible Endoscopy Skills? J Surg Educ. 2017;74: 23–29. pmid:27522346
View Article
PubMed/NCBI
Google Scholar

[117] View Article

[118] PubMed/NCBI

[119] Google Scholar

[ref32] 32. Ward M, MacRae H, Schlachta C, Mamazza J, Poulin E, Reznick R, et al. Resident self-assessment of operative performance. Am J Surg. 2003;185: 521–524. pmid:12781878
View Article
PubMed/NCBI
Google Scholar

[121] View Article

[122] PubMed/NCBI

[123] Google Scholar

[ref33] 33. Reed DA, Cook DA, Beckman TJ, Levine RB, Kern DE, Wright SM. Association Between Funding and Quality of Published Medical Education Research. JAMA—J Am Med Assoc. 2007;298: 1002–1009. pmid:17785645
View Article
PubMed/NCBI
Google Scholar

[125] View Article

[126] PubMed/NCBI

[127] Google Scholar

[ref34] 34. Causada-Calo NS, Gonzalez-Moreno EI, Bishay K, Shorr R, Dube C, Heitman SJ, et al. Educational interventions are associated with improvements in colonoscopy quality indicators: a systematic review and meta-analysis. Endosc Int Open. 2020;08: E1321–E1331. pmid:33015334
View Article
PubMed/NCBI
Google Scholar

[129] View Article

[130] PubMed/NCBI

[131] Google Scholar

[ref35] 35. Ashford D, Davids K, Bennett SJ. Observational Modeling Effects for Movement Dynamics and Movement Outcome Measures Across Differing Task Constraints: A Meta-Analysis. J Mot Behav. 2006;38: 185–205. pmid:16709559
View Article
PubMed/NCBI
Google Scholar

[133] View Article

[134] PubMed/NCBI

[135] Google Scholar

[ref36] 36. Domuracki K, Wong A, Olivieri L, Grierson LEM. The impacts of observing flawed and flawless demonstrations on clinical skill learning. Med Educ. 2015;49: 186–192. pmid:25626749
View Article
PubMed/NCBI
Google Scholar

[137] View Article

[138] PubMed/NCBI

[139] Google Scholar

[ref37] 37. Zaki R, Bulgiba A, Ismail R, Ismail NA. Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: A systematic review. PLoS One. 2012;7: 1–7. pmid:22662248
View Article
PubMed/NCBI
Google Scholar

[141] View Article

[142] PubMed/NCBI

[143] Google Scholar

[ref38] 38. Tipton E, Shuster J. A framework for the meta-analysis of Bland–Altman studies based on a limits of agreement approach. Stat Med. 2017;36: 3621–3635. pmid:28664537
View Article
PubMed/NCBI
Google Scholar

[145] View Article

[146] PubMed/NCBI

[147] Google Scholar

Figures

Abstract

Purpose

Materials and methods

Results

Conclusions

Introduction

Materials and methods

Definitions

Data sources and searches

Eligibility criteria

Study selection

Data extraction and synthesis

Data analysis

Risk-of-bias and quality assessment

Results

Study characteristics

Characteristics of the interventions

Assessment of outcomes

Impact of video interventions on self-assessment accuracy

Risk of bias and study quality

Discussion

Conclusions

Supporting information

S1 Checklist. PRISMA 2020 checklist.

S1 File. Search strategy.

S2 File. QUADAS-2 tool modified for self-assessment accuracy studies.

S3 File. Reference list for excluded studies from full-text review, with respective rationale for exclusion.

References