Abstract
Background
Independent medical evaluations (IMEs) are commonly requested to provide an assessment of impairment; however, these assessments show poor inter-rater reliability. One potential contributor is symptom exaggeration by patients, who may feel pressure to emphasize their level of impairment to qualify for incentives. This study explored the prevalence of symptom exaggeration among IME examinees in North America, which if common may represent an important consideration for improving the reliability of IMEs.
Methods
We searched CINAHL, EMBASE, MEDLINE and PsycINFO from inception to July 08, 2024. We included observational studies that used a known-group design or multi-modal determination method. Paired reviewers independently assessed risk of bias and extracted data. We performed a random-effects model meta-analysis to estimate the overall prevalence of symptom exaggeration and explored potential subgroup effects for sex, age, education, clinical condition, and confidence in the reference standard. We used the GRADE approach to assess the certainty of evidence.
Results
We included 44 studies with 46 cohorts and 9,794 patients. The median of the cohort mean ages was 40 years (interquartile range [IQR] 38–42). Most cohorts included patients with traumatic brain injuries (n = 31, 67%) or chronic pain (n = 11, 24%). Prevalence of symptom exaggeration across studies ranged from 17% to 67%. We found low certainty evidence suggesting that studies with a greater proportion of women (≥40%) may be associated with higher rates of exaggeration (47%, 95%CI 36–58) vs. studies with a lower proportion of women (<40%) (31%, 95%CI 28–35; test of interaction p = 0.02). Possible explanations include biological differences, greater bodily awareness, or higher rates of negative affectivity. We found no significant subgroup effects for type of clinical condition, confidence in the reference standard, age, or education.
Conclusion
Symptom exaggeration may occur in almost 50% of women and in approximately a third of men undergoing IMEs. The high prevalence of symptom exaggeration among IME attendees provides a compelling rationale for clinical evaluators to formally explore this issue. Future research should establish the reliability and validity of evaluation criteria for symptom exaggeration and develop a structured IME assessment approach.
Citation: Darzi AJ, Wang L, Riva JJ, Morsi RZ, Charide R, Couban RJ, et al. (2025) Prevalence of symptom exaggeration among North American independent medical evaluation examinees: A systematic review of observational studies. PLoS One 20(6): e0324684. https://doi.org/10.1371/journal.pone.0324684
Editor: Thiago P. Fernandes, Federal University of Paraiba, BRAZIL
Received: July 31, 2024; Accepted: April 28, 2025; Published: June 25, 2025
Copyright: © 2025 Darzi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Background
In 2022, Statistics Canada found that 8.0 million Canadian adults reported a disability [1] and in 2020, 64.4 million Americans reported living with disability [2]. Individuals suffering from a disabling injury or illness may be eligible to receive financial compensation and services based on their level of impairment. Determinations of impairment often rely on independent medical evaluations (IMEs), which are requested by a third party, such as an insurance company or employer, and conducted by a clinician who is not part of the patient’s regular medical team [3]. Underlying this process is the concern that treating clinicians may have difficulty providing impartial assessments of their patients [4,5]. Such concerns are supported by a trial that randomized 5,888 individuals in Norway to an independent assessment or usual care and found 29% of IMEs recommended less sick leave than the treating physician (68% the same, and 3% a longer duration) [6].
Despite their widespread use and far-reaching consequences, the consistency and reliability of IMEs have been challenged. The most recent systematic review found that clinical experts assessing the same patients often disagreed on whether they were disabled from working (median inter-rater reliability 0.45) [7]. Although this review suggested that standardization of the assessment process may improve the reliability of IMEs [7], two subsequent studies failed to support this hypothesis [8]. Another potential source of variability in IME assessments is symptom exaggeration [3]. IME assessors may focus too narrowly on a biomedical model to explain symptoms, without giving sufficient attention to psychosocial and work-related factors that may influence how individuals present their symptoms [3,9].
Patients referred for IMEs often present with subjective complaints (e.g., mental illness, chronic pain) and may feel pressure to emphasize their level of impairment to qualify for wage replacement benefits, time off work, or other incentives [3,10,11]. Patients’ presentation may also be affected if they perceive the assessor as representing the referring agency rather than their interests. Whether or not IME assessors consider symptom exaggeration has the potential to lead to very different conclusions; however, the prevalence of exaggeration among IME attendees is uncertain, and individual studies report rates as low as 17% [12] or as high as 67% [13]. Moreover, terms such as exaggeration, malingering, and over-reporting are defined inconsistently across studies, making it difficult to distinguish intentional deception from psychological amplification of distress [4,14]. We undertook the first systematic review of observational studies to explore the prevalence of symptom exaggeration among IME examinees in North America.
Methods
We conducted our systematic review in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and Meta-analysis of Observational Studies in Epidemiology (MOOSE) checklists [15,16] (see S1 and S2 Checklists in the supplemental material). We registered our protocol on the Open Science Framework (Registration DOI: https://doi.org/10.17605/OSF.IO/64V2B) [17]. After registration but prior to data analysis, we added five meta-regressions/subgroup analyses to explore variability among studies reporting the prevalence of symptom exaggeration: (1) proportion of female participants, (2) older age, (3) level of formal education, (4) clinical condition, and (5) level of confidence in the reference standard used in the approach for evaluating symptom exaggeration.
Data sources and searches
An experienced medical librarian (RJC) developed database-specific search strategies (S1 Table) and conducted a systematic search in CINAHL, EMBASE, MEDLINE and PsycINFO, from inception through July 08, 2024. We included English, French or Spanish studies to reduce language bias. The search strategies were developed using a validation set of known relevant articles and included a combination of MeSH headings and free text key words, such as malinger* or litigation or litigant or “insufficient effort” and “independent medical examination” or “independent medical evaluation” or “disability” or “classification accuracy”. We did not use any filters for our searches to maximize sensitivity. We screened the reference lists of all included studies for additional eligible articles.
Study selection
Six reviewers screened the titles and abstracts of all retrieved citations, independently and in duplicate, and subsequently the full texts of potentially eligible studies, using standardized and pre-tested forms [18]. A third senior reviewer resolved disagreements when necessary.
Eligible studies: (i) enrolled individuals presenting for an IME in North America, (ii) in the presence of an external incentive (e.g., insurance claims), and (iii) assessed the prevalence of symptom exaggeration using a known-group design or multi-modal determination method [19,20]. As there is no single reliable and valid criterion (reference standard) in the literature for assessing symptom exaggeration, we included known-group study designs that defined their reference standard based on criteria incorporating both clinical findings and performance on psychometric testing to classify individuals as exaggerating (within diagnostic test terminology, the target positive group) or not exaggerating (the target negative group) their symptoms [21,22].
Examples of two commonly used known-group designs are the Slick, Sherman, and Iverson criteria for malingered neurocognitive dysfunction [23] and the Bianchini, Greve, and Glynn criteria for malingered pain-related disability [24]. We excluded studies that used only below-chance scores on symptom validity tests as an indicator of symptom exaggeration, since below-chance scores are infrequent and likely to result in underestimates [25–27]. We restricted our focus to North America as there may be important differences between IMEs conducted in North America, where social insurance for disability is limited, and in Europe, where social insurance is prominent. In cases where multiple studies had population overlap, we included only the study with the larger sample size.
Data extraction and risk of bias assessment
Teams of paired reviewers abstracted data independently and in duplicate from all eligible studies using standardized, pre-tested forms. We prefaced data abstraction with calibration exercises to optimize consistency and accuracy of extractions. For all identified studies, the reviewers abstracted the following data: name of first author, year of publication, participant demographics, referral source(s), criteria for establishing symptom exaggeration and reference standard, and the prevalence of symptom exaggeration. After completing training and calibration, pairs of reviewers independently evaluated risk of bias for each included study. They used key criteria tailored to known-group designs, which were developed and pre-tested in collaboration with research methodologists. These criteria included: (i) representativeness of the study population, (ii) validity of outcome assessment (including whether the index test was administered without knowledge of the reference standard, and confidence in the reference standard), (iii) whether those with and without symptom exaggeration were similar across age groups and education level, and (iv) loss to follow-up (≥20% was considered high risk of bias). The response options for all the above risk of bias items included “definitely yes”, “probably yes”, “probably no” and “definitely no”. Also, we evaluated whether the criteria for establishing symptom exaggeration had been shown reliable and valid. We resolved disagreements by consensus or with the help of a third senior reviewer.
We categorized the reference standard and rated our confidence in it as either: (i) ‘weak’ when the study declared a known-group design but its only criterion for identifying symptom exaggeration was below-chance performance on forced-choice symptom validity testing, without any corroborating clinical observations or inconsistencies in medical records. For example, a patient with a mild ankle sprain labeled as exaggerating exclusively because they failed a below-chance forced-choice test of pain threshold, with no clinical exam or review of documented pain or functional abilities; (ii) ‘moderate’ where most patients exaggerating symptoms were identified by forced-choice symptom validity testing results, but some cases could be confirmed using other credible indicators. For example, a claimant insists they cannot remember simple details of their daily routine (e.g., the route to their kitchen), yet is casually observed navigating complex tasks with no apparent cognitive difficulty; or (iii) ‘strong’ where exaggeration was determined by either forced-choice symptom validity testing results or other credible clinical evidence. For example, a clinical finding that would classify a patient presenting with persistent post-concussive complaints after a very mild head injury as exaggerating symptoms would include claims of remote memory loss (e.g., loss of spelling ability).
Data synthesis and analysis and certainty in the evidence assessment
We used a random-effects model to pool data for the prevalence of symptom exaggeration among IME examinees and a Freeman-Tukey double arcsine transformation to stabilize the variance [28,29]. This transformation avoids producing confidence intervals (CIs) that include values lower than 0% or greater than 100% [28,29]. We used the DerSimonian and Laird method [30] to pool estimates of symptom exaggeration based on the transformed values and their variances, and then the harmonic mean of sample sizes for back‐transformation to the original units of proportions [31].
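For illustration, the pooling steps described above (Freeman-Tukey double arcsine transform, DerSimonian-Laird random-effects weighting, and back-transformation using the harmonic mean sample size) can be sketched in Python. The study counts below are hypothetical, not data from this review, and the authors performed their analyses in Stata; this is an illustrative re-implementation of the standard formulas.

```python
import math

def ft_transform(events, n):
    """Freeman-Tukey double arcsine transform of a proportion; variance 1/(n + 0.5)."""
    t = math.asin(math.sqrt(events / (n + 1))) + math.asin(math.sqrt((events + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)

def dl_pool(ts, vs):
    """DerSimonian-Laird random-effects pooling on the transformed scale."""
    k = len(ts)
    w = [1.0 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sw
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, ts))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # method-of-moments between-study variance
    w_star = [1.0 / (v + tau2) for v in vs]
    pooled = sum(wi * ti for wi, ti in zip(w_star, ts)) / sum(w_star)
    return pooled, tau2

def ft_back(t, n_harmonic):
    """Miller's back-transformation to a proportion, using the harmonic mean sample size."""
    s = math.sin(t)
    inner = 1.0 - (s + (s - 1.0 / s) / n_harmonic) ** 2
    return 0.5 * (1.0 - math.copysign(1.0, math.cos(t)) * math.sqrt(max(0.0, inner)))

# Hypothetical (exaggerating cases, sample size) pairs -- not data from this review.
studies = [(40, 120), (55, 200), (30, 90), (70, 150)]
ts, vs = zip(*(ft_transform(x, n) for x, n in studies))
pooled_t, tau2 = dl_pool(list(ts), list(vs))
n_h = len(studies) / sum(1.0 / n for _, n in studies)  # harmonic mean of sample sizes
prevalence = ft_back(pooled_t, n_h)
```

The back-transformation keeps the pooled estimate and its confidence limits within the 0–100% range, which a naive inverse of the arcsine scale would not guarantee for extreme proportions.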
We assessed the certainty of evidence based on the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach [32]. This approach considers risk of bias, indirectness, inconsistency, imprecision, and small study effects, to appraise the overall certainty of evidence as high, moderate, low, or very low [32]. We estimated that if 20% of IME attendees presented with symptom exaggeration, that would be sufficiently frequent to justify formal evaluation for exaggeration by IME evaluators. Therefore, we rated down for imprecision if the 95%CI associated with the prevalence of symptom exaggeration included 20%. When there were at least 10 studies contributing to meta-analysis, we evaluated small study effects by visual inspection of the funnel plot for asymmetry and calculation of Egger’s test [33].
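Egger's test amounts to regressing each study's standardized effect on its precision and testing whether the regression intercept differs from zero; funnel-plot asymmetry shifts the intercept away from the origin. A minimal sketch, using hypothetical effect sizes and standard errors rather than the review's data:

```python
import math

def egger_test(effects, ses):
    """Egger's regression test for small-study effects: regress the standardized
    effect (effect / SE) on precision (1 / SE) and examine the intercept."""
    y = [e / s for e, s in zip(effects, ses)]
    x = [1.0 / s for s in ses]
    k = len(x)
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (k - 2)  # residual variance of the OLS fit
    se_intercept = math.sqrt(s2 * (1.0 / k + mx ** 2 / sxx))
    return intercept, intercept / se_intercept  # intercept and its t-statistic (k - 2 df)

# Hypothetical transformed prevalences and standard errors -- not the review's data.
effects = [0.50, 0.60, 0.40, 0.55, 0.45]
ses = [0.10, 0.20, 0.15, 0.12, 0.18]
intercept, t_stat = egger_test(effects, ses)
```

An intercept t-statistic compared against a t-distribution with k − 2 degrees of freedom yields the p-values reported in the Results (e.g., P = 0.13 for the overall analysis).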
Subgroup analyses, meta-regression, and sensitivity analyses
We assessed heterogeneity across studies contributing to our pooled estimate of symptom exaggeration using both a statistical test and visual inspection of forest plots. We did not calculate I² as it can be misleading when estimates of precision are very narrow due to large sample sizes. Instead, we estimated the between-study variance with tau-squared (τ²), which provides an absolute measure of heterogeneity. We considered τ² < 0.05 as low, 0.05–0.1 as moderate, and >0.1 as substantial heterogeneity [34].
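The thresholds above amount to a simple classification rule. A sketch of the review's convention (the upper boundary is assumed inclusive at 0.1, consistent with "&gt;0.1 as substantial"):

```python
def classify_tau2(tau2):
    """Classify between-study heterogeneity by tau-squared, using the
    thresholds adopted in this review (0.1 assumed inclusive for 'moderate')."""
    if tau2 < 0.05:
        return "low"
    if tau2 <= 0.1:
        return "moderate"
    return "substantial"
```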
We assessed the variability between studies based on five hypotheses. We assumed a higher prevalence of symptom exaggeration with: (1) greater strength of the reference standard, (2) higher proportion of female participants, (3) older age, (4) lower level of formal education, and (5) higher risk of bias on a component-by-component basis. We also explored for subgroup effects based on type of clinical condition but did not pre-specify an anticipated direction of association. We conducted subgroup analyses if there were two or more studies in each subgroup, and evaluated credibility of significant subgroup effects using ICEMAN criteria [35].
We performed meta-regression to explore the relationship between the proportion of women, severity of the presenting complaint, mean age, and years of formal education, with the prevalence of symptom exaggeration. If meta-regression suggested an association, we used visual inspection of the associated scatterplot to estimate a threshold and conducted subgroup analysis. We performed all analyses using Stata software version 16.0 [36]. All comparisons were two-tailed, with a threshold P-value of 0.05.
Ethics approval and consent to participate
We did not require ethics approval for this systematic review and meta-analysis due to our sole use of already published data.
Systematic review update
Considering the speed at which studies exploring the prevalence of symptom exaggeration among IME attendees are published, we plan to update this review within the next five years [37].
Results
Of 20,405 unique citations identified in our search, 44 English-language studies reporting on 46 cohorts and 9,794 patients were eligible for review (Fig 1). None of the studies had overlapping cohorts. In S5 Table we detail the included and excluded studies, with reasons, at full-text screening. Of the 46 cohorts, 67% (n = 31) reported on patients with traumatic brain injuries (TBI) with or without mixed neurological diseases, 24% (n = 11) on chronic pain patients, and 9% (n = 4) on other populations, including toxic exposure (n = 1) [38], personal injury claimants who were not further described (n = 1) [39], patients with memory impairment (n = 1) [13], and claimants reporting cognitive dysfunction following exposure to occupational and environmental substances (n = 1) [40]. In terms of criteria used to identify individuals who were exaggerating symptoms, 61% (n = 28) of cohorts relied on the Slick, Sherman, and Iverson criteria for probable malingered neurocognitive dysfunction [23], 24% (n = 11) on the Bianchini criteria [24], and 15% (n = 7) used other criteria, such as those proposed by Greiffenstein, Gola, and Baker [41], Nies and Sweet [22], or Lees-Haley methods [42] (Table 1).
Risk of bias
Only 32% of studies (14 of 44) described their sampling method; of these, 13 used consecutive sampling and one used random sampling to identify IME referrals. All studies reported minimal missing data (<5%). Most studies (n = 29, 64%) showed similar age and education characteristics between exaggerating and non-exaggerating groups. No study explicitly stated that IME assessors administered the index test without knowledge of the reference standard. We had moderate confidence in the reference standard used by most studies (n = 35, 80%). None of the known-group designs used to evaluate symptom exaggeration provided evidence of reliability and validity testing; however, the psychometric properties of the forced-choice tests administered in eligible studies have undergone formal evaluation (S2 and S4 Tables).
Prevalence of symptom exaggeration and additional analyses
The prevalence of symptom exaggeration ranged from 17% to 67%, median 33% (inter-quartile range: 25–44), and the pooled prevalence was 35% (95% confidence interval [CI]: 31–39) (low certainty evidence) (Fig 2). However, we found a significant subgroup effect, of low to moderate credibility, that studies with a higher proportion of women (≥40% vs. < 40%) may be associated with higher rates of exaggeration: 47% (95%CI 36–58) vs. 31% (95%CI 28–35) (test of interaction p = 0.02; Fig 2, Tables 2 and S3). We did not detect any evidence of small study effects for the overall prevalence of symptom exaggeration (Egger’s test P = 0.13; S2 Fig) nor for the subgroup of studies with <40% women (Egger’s test P = 0.16; S2 Fig).
We found no significant subgroup effects for type of clinical condition (mild TBI versus chronic pain versus other conditions), confidence in the reference standard, age, or education (S3–S5 Figs). Meta-regression showed no association between prevalence of symptom exaggeration and age, level of education, or severity of presenting complaint, but did suggest an association with the proportion of female participants (S1, S6 and S7 Figs). We present all extracted data per study in S6 Table.
Discussion
Our systematic review and meta-analysis of observational studies found low certainty evidence, rated down due to risk of bias and inconsistency, that symptom exaggeration may be common among individuals attending for IMEs in North America, affecting approximately 1 in 3 assessments. The prevalence of symptom exaggeration was higher in studies that enrolled a greater proportion of female attendees (47%) vs. a lower proportion of female attendees (31%).
Relation to other studies
This is the first systematic review to summarize the extent of symptom exaggeration among IME attendees in North America. A previous survey of 131 US board-certified neuropsychologists conducting forensic work found that, on average, they estimated 30% of examinees claiming personal injury, disability, or workers’ compensation presented with symptom exaggeration. However, estimated prevalence varied considerably by diagnosis, from an average of 41% for mild head injuries to 2% for vascular dementia [80]. Our review found no evidence for differences in the prevalence of symptom exaggeration based on clinical condition, but most patients among studies eligible for our review presented with either mild TBI or chronic pain.
Although our review focused on IMEs in North America, data from other regions also suggest high rates of symptom exaggeration. An observational study in Spain reported that of 1,003 participants (61.5% female), drawn from unselected undergraduates, advanced psychology students, the general population, forensic psychologists, and forensic/legal medicine physicians, one-third reported having feigned symptoms or illness [81]. Data from Germany and the Netherlands suggest that one‐fifth to one‐third of clients in forensic or insurance contexts exhibit symptom overreporting [82]. Further, a Swiss study found that 28% to 34% of individuals undergoing medico‐legal evaluations demonstrated probable or definite symptom exaggeration [83].
Our finding suggesting that women are more likely to exaggerate symptoms vs. men is supported by a systematic review of 175 studies that found women report more bodily distress and more numerous, more intense, and more frequent somatic symptoms than men [84]. Reasons for this discrepancy are uncertain, but may include biological differences, greater bodily vigilance and awareness, and higher rates of negative affectivity vs. men [84]. When symptoms are disproportionate to objective pathology, clinicians should inquire about other factors. For example, women are more likely to experience intimate partner violence than men [85,86], and pain patients who report lifetime traumatic events experience greater pain severity [87].
Studies eligible for our review used different strategies and approaches for assessing the prevalence of symptom exaggeration. The National Academy of Neuropsychology (NAN) and the American Academy of Clinical Neuropsychology (AACN) have emphasized the use of a multimethod approach to assess symptom and performance validity, including clinical interviews, medical records, medical investigations in certain cases, behavioural observations, and symptom and performance validity tests [88]. However, specific guidance is not provided on which symptom and performance validity tests should be used, when they should be conducted, or how they should be interpreted [89].
Strengths and limitations
Our study has several methodological strengths including (1) restricting our eligibility criteria to studies employing a known group design or multi-modal approach to assess symptom exaggeration, (2) subgroup analysis and assessment consistent with current best practices [35,90], and (3) use of the GRADE approach to evaluate the certainty of evidence.
In terms of limitations, we restricted our review to IMEs conducted in North America, and eligible studies focused mainly on chronic pain and TBI. The generalizability of our findings to other jurisdictions, contexts, and clinical conditions is uncertain. We were unable to explore the effect of cultural variability on the prevalence of symptom exaggeration, as we found no studies within our inclusion criteria that addressed this issue. We did not find evidence for a subgroup effect based on confidence in the reference standard; however, there may have been insufficient variability to identify an association, as almost all studies used a reference standard in which we rated moderate confidence. Another limitation of our review is the absence of a compelling reference standard for symptom exaggeration. Furthermore, even within the same reference standard, operationalization can be variable, which may affect prevalence. Another limitation of the primary studies is the lack of stratification of the prevalence of symptom exaggeration according to possible effect modifiers, such as sex. Doing so would facilitate within-study subgroup analyses, which are less subject to confounding than between-study subgroup analyses. A final major limitation of the current evidence is that none of the known-group approaches for evaluating symptom exaggeration have undergone reliability and validity testing.
Implications for future research and practice
Failure to identify the contribution of symptom exaggeration to examinees’ complaints not only compromises the reliability and validity of independent assessments but may also adversely impact patient care by medicalizing psychosocial issues [91–93]. Our findings suggest that symptom exaggeration is common among patients attending for IMEs; however, we rated down the certainty of evidence due to the uncertain psychometric properties of the criteria used to evaluate exaggeration. An urgent research priority is the evaluation of the inter-rater reliability of known-group and multi-modal systems for appraising symptom exaggeration. Validation of such assessment systems is also critical and extremely challenging, but indirect evidence of validity could be acquired by evaluating accuracy in distinguishing between volunteers who were or were not exaggerating symptoms.
Future research should investigate how cultural factors affect IME outcomes, with attention to language barriers, health beliefs, and potential biases among both examinees and assessors. Another research priority is the development and validation of a structured and comprehensive approach to identify symptom exaggeration in IME assessments. Such an approach should consider observed versus reported abilities, findings of other providers, self-reported history that is discrepant with documented history, and administration of validated tests. A further consideration for research and practice is the use of symptom validity tests that focus on malingering (e.g., Test of Memory Malingering [TOMM], Lees-Haley Fake Bad Scale [FBS]), which imply intent. Clinicians are, understandably and appropriately, hesitant to assign a label of malingering; reasons include the challenges associated with determining intent and the risk of litigation [94]. To circumvent these issues, we would suggest the use of the less value-laden term ‘symptom exaggeration’.
Conclusion
Symptom exaggeration may occur in almost 50% of women and in approximately a third of men undergoing IMEs. Assessors should evaluate symptom exaggeration when conducting IMEs using a multi-modal approach that includes both clinical findings and validated tests of performance effort, and avoid conflation with malingering which presumes intent. Priority areas for future research include establishing the reliability and validity of current evaluation criteria for symptom exaggeration, and development of a structured IME assessment approach that includes consideration of symptom exaggeration.
Supporting information
S3 Table. ICEMAN criteria to assess credibility of subgroup effect of female % and prevalence.
https://doi.org/10.1371/journal.pone.0324684.s003
(DOCX)
S4 Table. Psychometric properties of tests included in symptom exaggeration criteria with list of references.
https://doi.org/10.1371/journal.pone.0324684.s004
(DOCX)
S5 Table. Included and excluded studies at full text screening with reasons.
https://doi.org/10.1371/journal.pone.0324684.s005
(DOCX)
S6 Table. Data extracted from included studies.
https://doi.org/10.1371/journal.pone.0324684.s006
(DOCX)
S1 Fig. Meta-regression for proportion of females among 42 studies (p = 0.16).
https://doi.org/10.1371/journal.pone.0324684.s007
(DOCX)
S2 Fig. a- Funnel plots of overall prevalence (Egger’s test p = 0.13) and b- prevalence in subgroup of studies with female proportion <40% (Egger’s test p = 0.16).
https://doi.org/10.1371/journal.pone.0324684.s008
(DOCX)
S3 Fig. Subgroup analysis for type of conditions (test of interaction p = 0.95).
https://doi.org/10.1371/journal.pone.0324684.s009
(DOCX)
S4 Fig. Subgroup analysis for confidence in reference standard (test of interaction p = 0.84).
https://doi.org/10.1371/journal.pone.0324684.s010
(DOCX)
S5 Fig. Subgroup analysis for similar age and/or education between groups (test of interaction p = 0.47).
https://doi.org/10.1371/journal.pone.0324684.s011
(DOCX)
S6 Fig. Meta-regression for average age among 46 cohorts (p = 0.18).
https://doi.org/10.1371/journal.pone.0324684.s012
(DOCX)
S7 Fig. Meta-regression for average education level among 45 cohorts (p = 0.65).
https://doi.org/10.1371/journal.pone.0324684.s013
(DOCX)
S1 Checklist. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Checklist.
https://doi.org/10.1371/journal.pone.0324684.s014
(DOCX)
S2 Checklist. Meta-analysis of Observational Studies in Epidemiology (MOOSE) checklist.
https://doi.org/10.1371/journal.pone.0324684.s015
(DOCX)
Acknowledgments
We would like to thank Michael Bagby from the Departments of Psychology and Psychiatry at University of Toronto for his contributions to the initial discussions around conceptualization and design of this study. No financial compensation was provided to any of these individuals.
References
- 1. Statistics Canada. Canadian Survey on Disability, 2017 to 2022; 2023. Available from: https://www150.statcan.gc.ca/n1/daily-quotidien/231201/dq231201b-eng.htm
- 2. Disability and Health Data System (DHDS) [Internet]; 2020 [cited 2023 Jan 16]. Available from: https://dhds.cdc.gov/SP?LocationId=59&CategoryId=DISEST&ShowFootnotes=true&showMode=&IndicatorIds=STATTYPE,AGEIND,SEXIND,RACEIND,VETIND&pnl0=Chart,false,YR5,CAT1,BO1,AGEADJPREV&pnl1=Chart,false,YR5,DISSTAT,PREV&pnl2=Chart,false,YR5,DISSTAT,AGEADJPREV&pnl3=Chart,false,YR5,DISSTAT,AGEADJPREV&pnl4=Chart,false,YR5,DISSTAT,AGEADJPREV
- 3. Martin DW. Independent medical evaluation: a practical guide. Springer; 2018.
- 4. Ebrahim S, Sava H, Kunz R, Busse JW. Ethics and legalities associated with independent medical evaluations. CMAJ. 2014;186(4):248–9. pmid:24491474
- 5. Gill D, Green P, Flaro L, Pucci T. The role of effort testing in independent medical examinations. Med Leg J. 2007;75(Pt 2):64–71. pmid:17822166
- 6. Mæland S, Holmås TH, Øyeflaten I, Husabø E, Werner EL, Monstad K. What is the effect of independent medical evaluation on days on sickness benefits for long-term sick listed employees in Norway? A pragmatic randomised controlled trial, the NIME-trial. BMC Public Health. 2022;22(1):400. pmid:35216560
- 7. Barth J, de Boer WE, Busse JW, Hoving JL, Kedzia S, Couban R. Inter-rater agreement in evaluation of disability: systematic review of reproducibility studies. BMJ. 2017;356.
- 8. Kunz R, von Allmen DY, Marelli R, Hoffmann-Richter U, Jeger J, Mager R, et al. The reproducibility of psychiatric evaluations of work disability: two reliability and agreement studies. BMC Psychiatry. 2019;19(1):205. pmid:31266488
- 9. Bachmann M, de Boer W, Schandelmaier S, Leibold A, Marelli R, Jeger J, et al. Use of a structured functional evaluation process for independent medical evaluations of claimants presenting with disabling mental illness: rationale and design for a multi-center reliability study. BMC Psychiatry. 2016;16:271. pmid:27474008
- 10. Boskovic I, Gallardo CT, Vrij A, Hope L, Merckelbach H. Verifiability on the run: an experimental study on the verifiability approach to malingered symptoms. Psychiatr Psychol Law. 2018;26(1):65–76. pmid:31984064
- 11. Rumschik SM, Appel JM. Malingering in the psychiatric emergency department: prevalence, predictors, and outcomes. Psychiatr Serv. 2019;70(2):115–22. pmid:30526343
- 12. Greve KW, Heinly MT, Bianchini KJ, Love JM. Malingering detection with the Wisconsin Card Sorting Test in mild traumatic brain injury. Clin Neuropsychol. 2009;23(2):343–62. pmid:18609328
- 13. Costa D. Psychiatric detection of exaggeration in reports of memory impairment. J Nerv Ment Dis. 1999;187(7):446–8. pmid:10426467
- 14. Walczyk JJ, Sewell N, DiBenedetto MB. A review of approaches to detecting malingering in forensic contexts and promising cognitive load-inducing lie detection techniques. Front Psychiatry. 2018;9:700. pmid:30622488
- 15. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015;4(1):1. pmid:25554246
- 16. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283(15):2008–12. pmid:10789670
- 17. BSCI I. Open Science Framework.
- 18. DistillerSR. Data management software. Ottawa (ON): Evidence Partners; 2011.
- 19. Rogers R. Clinical assessment of malingering and deception. Guilford Press; 2008.
- 20. Rogers R, Kropp PR, Bagby RM, Dickens SE. Faking specific disorders: a study of the Structured Interview of Reported Symptoms (SIRS). J Clin Psychol. 1992;48(5):643–8. pmid:1401150
- 21. Heilbronner RL, Sweet JJ, Morgan JE, Larrabee GJ, Millis SR, Conference Participants. American Academy of Clinical Neuropsychology Consensus Conference Statement on the neuropsychological assessment of effort, response bias, and malingering. Clin Neuropsychol. 2009;23(7):1093–129.
- 22. Nies KJ, Sweet JJ. Neuropsychological assessment and malingering: a critical review of past and present strategies. Arch Clin Neuropsychol. 1994;9(6):501–52. pmid:14590999
- 23. Slick DJ, Sherman EM, Iverson GL. Diagnostic criteria for malingered neurocognitive dysfunction: proposed standards for clinical practice and research. Clin Neuropsychol. 1999;13(4):545–61. pmid:10806468
- 24. Bianchini KJ, Greve KW, Glynn G. On the diagnosis of malingered pain-related disability: lessons from cognitive malingering research. Spine J. 2005;5(4):404–17. pmid:15996610
- 25. Aguerrevere LE, Greve KW, Bianchini KJ, Ord JS. Classification accuracy of the Millon Clinical Multiaxial Inventory-III modifier indices in the detection of malingering in traumatic brain injury. J Clin Exp Neuropsychol. 2011;33(5):497–504. pmid:21424973
- 26. Cook RJ, Farewell VT. Conditional inference for subject‐specific and marginal agreement: two families of agreement measures. Can J Stat. 1995;23(4):333–44.
- 27. Rogers R. Clinical assessment of malingering and deception. 2009.
- 28. Freeman MF, Tukey JW. Transformations related to the angular and the square root. Ann Math Statist. 1950;21(4):607–11.
- 29. Nyaga VN, Arbyn M, Aerts M. Metaprop: a Stata command to perform meta-analysis of binomial data. Arch Public Health. 2014;72:1–10.
- 30. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7(3):177–88. pmid:3802833
- 31. Miller JJ. The inverse of the freeman-tukey double arcsine transformation. Am Stat. 1978;32(4):138.
- 32. Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol. 2011;64(4):383–94. pmid:21195583
- 33. Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ. 1997;315(7109):629–34. pmid:9310563
- 34. Rücker G, Schwarzer G, Carpenter JR, Schumacher M. Undue reliance on I(2) in assessing heterogeneity may mislead. BMC Med Res Methodol. 2008;8:79. pmid:19036172
- 35. Schandelmaier S, Briel M, Varadhan R, Schmid CH, Devasenapathy N, Hayward RA, et al. Development of the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN) in randomized controlled trials and meta-analyses. CMAJ. 2020;192(32):E901–6. pmid:32778601
- 36. StataCorp LLC. Stata statistical software: release 16. College Station (TX): StataCorp; 2019.
- 37. Garner P, Hopewell S, Chandler J, MacLehose H, Akl EA, Beyene J, et al. When and how to update systematic reviews: consensus and checklist. BMJ. 2016;354.
- 38. Greve KW, Springer S, Bianchini KJ, Black FW, Heinly MT, Love JM. Malingering in toxic exposure: classification accuracy of Reliable Digit Span and WAIS-III Digit Span scaled scores. Assessment. 2007;14(1):12–21.
- 39. Lees-Haley PR, English LT, Glenn WJ. A Fake Bad Scale on the MMPI-2 for personal injury claimants. Psychol Rep. 1991;68(1):203–10. pmid:2034762
- 40. Greve KW, Bianchini KJ, Black FW, Heinly MT, Love JM, Swift DA, et al. The prevalence of cognitive malingering in persons reporting exposure to occupational and environmental substances. Neurotoxicology. 2006;27(6):940–50. pmid:16904749
- 41. Greiffenstein MF, Gola T, Baker WJ. MMPI-2 validity scales versus domain specific measures in detection of factitious traumatic brain injury. Clin Neuropsychol. 1995;9(3):230–40.
- 42. Lees-Haley PR. Provisional normative data for a credibility scale for assessing personal injury claimants. Psychol Rep. 1990;66(3):1355.
- 43. Suhr J, Tranel D, Wefel J, Barrash J. Memory performance after head injury: contributions of malingering, litigation status, psychological factors, and medication use. J Clin Exp Neuropsychol. 1997;19(4):500–14. pmid:9342686
- 44. van Gorp WG, Humphrey LA, Kalechstein AL, Brumm VL, McMullen WJ, Stoddard MA, et al. How well do standard clinical neuropsychological tests identify malingering? A preliminary analysis. J Clin Exp Neuropsychol. 1999 Apr;21(2):245–50. pmid:10425521
- 45. Sweet JJ, Wolfe P, Sattlberger E, Numan B, Rosenfeld JP, Clingerman S, et al. Further investigation of traumatic brain injury versus insufficient effort with the California Verbal Learning Test. Arch Clin Neuropsychol. 2000;15(2):105–13. pmid:14590555
- 46. Greve KW, Bianchini KJ, Mathias CW, Houston RJ, Crouch JA. Detecting malingered performance on the Wechsler Adult Intelligence Scale. Validation of Mittenberg’s approach in traumatic brain injury. Arch Clin Neuropsychol. 2003;18(3):245–60. pmid:14591458
- 47. Lu PH, Boone KB, Cozolino L, Mitchell C. Effectiveness of the Rey-Osterrieth Complex Figure Test and the Meyers and Meyers recognition trial in the detection of suspect effort. Clin Neuropsychol. 2003;17(3):426–40. pmid:14704893
- 48. Barrash J, Suhr J, Manzel K. Detecting poor effort and malingering with an expanded version of the Auditory Verbal Learning Test (AVLTX): validation with clinical samples. J Clin Exp Neuropsychol. 2004;26(1):125–40. pmid:14972700
- 49. Heinly MT, Greve KW, Bianchini KJ, Love JM, Brennan A. WAIS digit span-based indicators of malingered neurocognitive dysfunction: classification accuracy in traumatic brain injury. Assessment. 2005;12(4):429–44. pmid:16244123
- 50. Curtis KL, Greve KW, Bianchini KJ, Brennan A. California verbal learning test indicators of Malingered Neurocognitive Dysfunction: sensitivity and specificity in traumatic brain injury. Assessment. 2006;13(1):46–61. pmid:16443718
- 51. Etherton JL, Bianchini KJ, Ciota MA, Heinly MT, Greve KW. Pain, malingering and the WAIS-III Working Memory Index. Spine J. 2006;6(1):61–71. pmid:16413450
- 52. Greve KW, Bianchini KJ, Love JM, Brennan A, Heinly MT. Sensitivity and specificity of MMPI-2 validity scales and indicators to malingered neurocognitive dysfunction in traumatic brain injury. Clin Neuropsychol. 2006;20(3):491–512. pmid:16895861
- 53. Greve KW, Bianchini KJ, Doane BM. Classification accuracy of the test of memory malingering in traumatic brain injury: results of a known-groups analysis. J Clin Exp Neuropsychol. 2006;28(7):1176–90. pmid:16840243
- 54. Greve KW, Bianchini KJ. Classification accuracy of the Portland Digit Recognition Test in traumatic brain injury: results of a known-groups analysis. Clin Neuropsychol. 2006;20(4):816–30. pmid:16980264
- 55. Ardolf BR, Denney RL, Houston CM. Base rates of negative response bias and malingered neurocognitive dysfunction among criminal defendants referred for neuropsychological evaluation. Clin Neuropsychol. 2007;21(6):899–916. pmid:17886149
- 56. Greve KW, Bianchini KJ, Roberson T. The Booklet Category Test and malingering in traumatic brain injury: classification accuracy in known groups. Clin Neuropsychol. 2007;21(2):318–37. pmid:17455021
- 57. Henry GK, Enders C. Probable malingering and performance on the Continuous Visual Memory Test. Appl Neuropsychol. 2007;14(4):267–74. pmid:18067423
- 58. O’Bryant SE, Engel LR, Kleiner JS, Vasterling JJ, Black FW. Test of memory malingering (TOMM) trial 1 as a screening measure for insufficient effort. Clin Neuropsychol. 2007;21(3):511–21. pmid:17455034
- 59. Aguerrevere LE, Greve KW, Bianchini KJ, Meyers JE. Detecting malingering in traumatic brain injury and chronic pain with an abbreviated version of the Meyers Index for the MMPI-2. Arch Clin Neuropsychol. 2008;23(7–8):831–8. pmid:18715751
- 60. Curtis KL, Thompson LK, Greve KW, Bianchini KJ. Verbal fluency indicators of malingering in traumatic brain injury: classification accuracy in known groups. Clin Neuropsychol. 2008;22(5):930–45. pmid:18756393
- 61. Greve KW, Lotz KL, Bianchini KJ. Observed versus estimated IQ as an index of malingering in traumatic brain injury: classification accuracy in known groups. Appl Neuropsychol. 2008;15(3):161–9. pmid:18726736
- 62. Ord JS, Greve KW, Bianchini KJ. Using the Wechsler Memory Scale-III to detect malingering in mild traumatic brain injury. Clin Neuropsychol. 2008;22(4):689–704. pmid:17853130
- 63. Greve KW, Ord J, Curtis KL, Bianchini KJ, Brennan A. Detecting malingering in traumatic brain injury and chronic pain: a comparison of three forced-choice symptom validity tests. Clin Neuropsychol. 2008;22(5):896–918. pmid:18756391
- 64. Henry GK, Heilbronner RL, Mittenberg W, Enders C, Domboski K. Comparison of the MMPI-2 restructured Demoralization Scale, Depression Scale, and Malingered Mood Disorder Scale in identifying non-credible symptom reporting in personal injury litigants and disability claimants. Clin Neuropsychol. 2009;23(1):153–66. pmid:18609325
- 65. Greve KW, Bianchini KJ, Etherton JL, Ord JS, Curtis KL. Detecting malingered pain-related disability: classification accuracy of the Portland Digit Recognition Test. Clin Neuropsychol. 2009;23(5):850–69. pmid:19255913
- 66. Greve KW, Curtis KL, Bianchini KJ, Ord JS. Are the original and second edition of the California Verbal Learning Test equally accurate in detecting malingering? Assessment. 2009;16(3):237–48. pmid:19098280
- 67. Greve KW, Etherton JL, Ord J, Bianchini KJ, Curtis KL. Detecting malingered pain-related disability: classification accuracy of the test of memory malingering. Clin Neuropsychol. 2009;23(7):1250–71. pmid:19728222
- 68. Greve KW, Ord JS, Bianchini KJ, Curtis KL. Prevalence of malingering in patients with chronic pain referred for psychologic evaluation in a medico-legal context. Arch Phys Med Rehabil. 2009;90(7):1117–26. pmid:19577024
- 69. Bortnik KE, Boone KB, Marion SD, Amano S, Ziegler E, Cottingham ME, et al. Examination of various WMS-III logical memory scores in the assessment of response bias. Clin Neuropsychol. 2010;24(2):344–57. pmid:19921593
- 70. Curtis KL, Greve KW, Brasseux R, Bianchini KJ. Criterion groups validation of the Seashore Rhythm Test and Speech Sounds Perception Test for the detection of malingering in traumatic brain injury. Clin Neuropsychol. 2010;24(5):882–97. pmid:20486016
- 71. Greve KW, Bianchini KJ, Etherton JL, Meyers JE, Curtis KL, Ord JS. The Reliable Digit Span test in chronic pain: classification accuracy in detecting malingered pain-related disability. Clin Neuropsychol. 2010;24(1):137–52. pmid:19816837
- 72. Ord JS, Boettcher AC, Greve KW, Bianchini KJ. Detection of malingering in mild traumatic brain injury with the Conners’ Continuous Performance Test-II. J Clin Exp Neuropsychol. 2010;32(4):380–7. pmid:19739010
- 73. Roberson CJ, Boone KB, Goldberg H, Miora D, Cottingham M, Victor T, et al. Cross validation of the b Test in a large known groups sample. Clin Neuropsychol. 2013;27(3):495–508. pmid:23157695
- 74. Bianchini KJ, Aguerrevere LE, Guise BJ, Ord JS, Etherton JL, Meyers JE, et al. Accuracy of the Modified Somatic Perception Questionnaire and Pain Disability Index in the detection of malingered pain-related disability in chronic pain. Clin Neuropsychol. 2014;28(8):1376–94. pmid:25517267
- 75. Guise BJ, Thompson MD, Greve KW, Bianchini KJ, West L. Assessment of performance validity in the Stroop Color and Word Test in mild traumatic brain injury patients: a criterion-groups validation design. J Neuropsychol. 2014;8(1):20–33. pmid:23253228
- 76. Patrick RE, Horner MD. Psychological characteristics of individuals who put forth inadequate cognitive effort in a secondary gain context. Arch Clin Neuropsychol. 2014;29(8):754–66. pmid:25318597
- 77. Aguerrevere LE, Calamia MR, Greve KW, Bianchini KJ, Curtis KL, Ramirez V. Clusters of financially incentivized chronic pain patients using the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF). Psychol Assess. 2018;30(5):634–44. pmid:28627924
- 78. Bianchini KJ, Aguerrevere LE, Curtis KL, Roebuck-Spencer TM, Frey FC, Greve KW, et al. Classification accuracy of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2)-Restructured form validity scales in detecting malingered pain-related disability. Psychol Assess. 2018;30(7):857–69. pmid:29072481
- 79. Curtis KL, Aguerrevere LE, Bianchini KJ, Greve KW, Nicks RC. Detecting malingered pain-related disability with the pain catastrophizing scale: a criterion groups validation study. Clin Neuropsychol. 2019;33(8):1485–500. pmid:30957700
- 80. Mittenberg W, Patton C, Canyock EM, Condit DC. Base rates of malingering and symptom exaggeration. J Clin Exp Neuropsychol. 2002;24(8):1094–102. pmid:12650234
- 81. Puente-López E, Pina D, López-López R, Ordi HG, Bošković I, Merten T. Prevalence estimates of symptom feigning and malingering in Spain. Psychol Inj Law. 2023;16(1):1–17. pmid:35911787
- 82. Merten T, Dandachi-FitzGerald B, Hall V, Bodner T, Giromini L, Lehrner J, et al. Symptom and performance validity assessment in European countries: an update. Psychol Inj Law. 2022;15(2):116–27. pmid:34849185
- 83. Plohmann AM, Hurter M. Prevalence of poor effort and malingered neurocognitive dysfunction in litigating patients in Switzerland. Z Neuropsychol. 2017.
- 84. Barsky AJ, Peekna HM, Borus JF. Somatic symptom reporting in women and men. J Gen Intern Med. 2001;16(4):266–75. pmid:11318929
- 85. Lövestad S, Krantz G. Men’s and women’s exposure and perpetration of partner violence: an epidemiological study from Sweden. BMC Public Health. 2012;12:945. pmid:23116238
- 86. Umubyeyi A, Mogren I, Ntaganira J, Krantz G. Women are considerably more exposed to intimate partner violence than men in Rwanda: results from a population-based, cross-sectional study. BMC Womens Health. 2014;14:99. pmid:25155576
- 87. Nicol AL, Sieberg CB, Clauw DJ, Hassett AL, Moser SE, Brummett CM. The association between a history of lifetime traumatic events and pain severity, physical function, and affective distress in patients with chronic Pain. J Pain. 2016;17(12):1334–48. pmid:27641311
- 88. Bush SS, Heilbronner RL, Ruff RM. Psychological assessment of symptom and performance validity, response bias, and malingering: official position of the Association for Scientific Advancement in Psychological Injury and Law. Psychol Inj Law. 2014;7(3):197–205.
- 89. Sweet JJ, Heilbronner RL, Morgan JE, Larrabee GJ, Rohling ML, Boone KB, et al. American Academy of Clinical Neuropsychology (AACN) 2021 consensus statement on validity assessment: Update of the 2009 AACN consensus conference statement on neuropsychological assessment of effort, response bias, and malingering. Clin Neuropsychol. 2021;35(6):1053–106. pmid:33823750
- 90. Sun X, Briel M, Walter SD, Guyatt GH. Is a subgroup effect believable? Updating criteria to evaluate the credibility of subgroup analyses. BMJ. 2010;340:c117. pmid:20354011
- 91. Häuser W, Fitzcharles MA. Facts and myths pertaining to fibromyalgia. Dialogues Clin Neurosci. 2022.
- 92. Koesling D, Bozzaro C. Chronic pain as a blind spot in the diagnosis of a depressed society: on the implications of the connection between depression and chronic pain for interpretations of contemporary society. Med Health Care Philos. 2022:1–10.
- 93. Burke MJ, Silverberg ND. New framework for the continuum of concussion and functional neurological disorder. BMJ Publishing Group Ltd and British Association of Sport and Exercise Medicine; 2024.
- 94. Weiss KJ, Van Dell L. Liability for diagnosing malingering. J Am Acad Psychiatry Law. 2017;45(3):339–47.