
Ruling out pulmonary embolism across different healthcare settings: A systematic review and individual patient data meta-analysis

  • Geert-Jan Geersing ,

    Contributed equally to this work with: Geert-Jan Geersing, Toshihiko Takada

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    G.J.Geersing@umcutrecht.nl

    Affiliation Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands

  • Toshihiko Takada ,

    Contributed equally to this work with: Geert-Jan Geersing, Toshihiko Takada

    Roles Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

    Affiliations Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands, Department of General Medicine, Shirakawa Satellite for Teaching And Research (STAR), Fukushima Medical University, Fukushima, Japan

  • Frederikus A. Klok,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Supervision, Writing – review & editing

    Affiliation Department of Medicine, Thrombosis and Haemostasis, Dutch Thrombosis Network, Leiden University Medical Center, Leiden, the Netherlands

  • Harry R. Büller,

    Roles Conceptualization, Data curation, Supervision, Writing – review & editing

    Affiliation Department of Medicine, Amsterdam University Medical Center, Amsterdam Cardiovascular Sciences, Amsterdam, the Netherlands

  • D. Mark Courtney,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Department of Emergency Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America

  • Yonathan Freund,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Sorbonne University, Emergency Department, Hôpital Pitié-Salpêtrière, Assistance Publique—Hôpitaux de Paris, Paris, France

  • Javier Galipienzo,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Service of Anesthesiology, MD Anderson Cancer Center Madrid, Madrid, Spain

  • Gregoire Le Gal,

    Roles Conceptualization, Data curation, Investigation, Writing – review & editing

    Affiliation Department of Medicine, University of Ottawa, Ottawa Hospital Research Institute, Ottawa, Canada

  • Waleed Ghanima,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Department of Medicine, Østfold Hospital Trust, Norway and Institute of Clinical Medicine, University of Oslo, Oslo, Norway

  • Jeffrey A. Kline,

    Roles Conceptualization, Data curation, Investigation, Writing – review & editing

    Affiliation Department of Emergency Medicine, Wayne State School of Medicine, Detroit, Michigan, United States of America

  • Menno V. Huisman,

    Roles Conceptualization, Data curation, Investigation, Supervision, Writing – review & editing

    Affiliation Department of Medicine, Thrombosis and Haemostasis, Dutch Thrombosis Network, Leiden University Medical Center, Leiden, the Netherlands

  • Karel G. M. Moons,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliations Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands, Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands

  • Arnaud Perrier,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Division of Angiology and Hemostasis, Geneva University Hospitals and Faculty of Medicine, Geneva, Switzerland

  • Sameer Parpia,

    Roles Data curation, Formal analysis, Investigation, Methodology, Writing – review & editing

    Affiliations Department of Oncology, McMaster University, Hamilton, Canada, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada

  • Helia Robert-Ebadi,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Division of Angiology and Hemostasis, Geneva University Hospitals and Faculty of Medicine, Geneva, Switzerland

  • Marc Righini,

    Roles Conceptualization, Data curation, Investigation, Supervision, Writing – review & editing

    Affiliation Division of Angiology and Hemostasis, Geneva University Hospitals and Faculty of Medicine, Geneva, Switzerland

  • Pierre-Marie Roy,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation UNIV Angers, UMR (CNRS 6015—INSERM 1083) and CHU Angers, Department of Emergency Medicine, F-CRIN InnoVTE, Angers, France

  • Maarten van Smeden,

    Roles Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands

  • Milou A. M. Stals,

    Roles Data curation, Formal analysis, Investigation, Writing – review & editing

    Affiliation Department of Medicine, Thrombosis and Haemostasis, Dutch Thrombosis Network, Leiden University Medical Center, Leiden, the Netherlands

  • Philip S. Wells,

    Roles Conceptualization, Data curation, Investigation, Writing – review & editing

    Affiliation Department of Medicine, University of Ottawa, Ottawa Hospital Research Institute, Ottawa, Canada

  • Kerstin de Wit,

    Roles Conceptualization, Data curation, Investigation, Writing – review & editing

    Affiliations Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada, Department of Emergency Medicine, Queen’s University, Kingston, Canada

  • Noémie Kraaijpoel,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    Affiliation Department of Medicine, Amsterdam University Medical Center, Amsterdam Cardiovascular Sciences, Amsterdam, the Netherlands

  • Nick van Es

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Medicine, Amsterdam University Medical Center, Amsterdam Cardiovascular Sciences, Amsterdam, the Netherlands


Abstract

Background

The challenging clinical dilemma of detecting pulmonary embolism (PE) in suspected patients is encountered in a variety of healthcare settings. We hypothesized that the optimal diagnostic approach to detect these patients in terms of safety and efficiency depends on underlying PE prevalence, case mix, and physician experience, overall reflected by the type of setting where patients are initially assessed. The objective of this study was to assess the capability of ruling out PE by available diagnostic strategies across all possible settings.

Methods and findings

We performed a literature search (MEDLINE) followed by an individual patient data (IPD) meta-analysis (MA; 23 studies), including patients from self-referral emergency care (n = 12,612), primary healthcare clinics (n = 3,174), referred secondary care (n = 17,052), and hospitalized or nursing home patients (n = 2,410). Multilevel logistic regression was performed to evaluate diagnostic performance of the Wells and revised Geneva rules (each combined with a fixed D-dimer threshold or with a threshold adapted to age or pretest probability [PTP]), the YEARS algorithm, and the Pulmonary Embolism Rule-out Criteria (PERC). All strategies were tested separately in each healthcare setting. Following studies done in this field, the primary diagnostic metrics estimated from the models were the “failure rate” of each strategy (i.e., the proportion of missed PE among patients categorized as “PE excluded”) and “efficiency” (defined as the proportion of patients categorized as “PE excluded” among all patients). In self-referral emergency care, the PERC algorithm excludes PE in 21% of suspected patients at a failure rate of 1.12% (95% confidence interval [CI] 0.74 to 1.70), whereas the failure rate increases to 6.01% (4.09 to 8.75) in patients referred to secondary care at an efficiency of 10%. In patients from primary healthcare and those referred to secondary care, strategies adjusting D-dimer to PTP are the most efficient (range: 43% to 62%) at a failure rate ranging between 0.25% and 3.06%, with higher failure rates observed in patients referred to secondary care. For this latter setting, strategies adjusting D-dimer to age are associated with a lower failure rate ranging between 0.65% and 0.81%, yet are also less efficient (range: 33% to 35%). For all strategies, failure rates are highest in hospitalized or nursing home patients, ranging between 1.68% and 5.13%, at an efficiency ranging between 15% and 30%.
The main limitation of the primary analyses was that the diagnostic performance of each strategy was compared in different sets of studies since the availability of items used in each diagnostic strategy differed across included studies; however, sensitivity analyses suggested that the findings were robust.

Conclusions

The capability of safely and efficiently ruling out PE of available diagnostic strategies differs for different healthcare settings. The findings of this IPD MA help in determining the optimum diagnostic strategies for ruling out PE per healthcare setting, balancing the trade-off between failure rate and efficiency of each strategy.

Author summary

Why was this study done?

  • Pulmonary embolism (PE; i.e., clots in pulmonary vessels) is a potentially fatal condition, and patients suspected of having this condition are encountered in many different healthcare settings.
  • To help physicians with ruling out PE without additional imaging tests, several diagnostic strategies exist, consisting of clinical items and a blood test (D-dimer testing), with different approaches to interpret this D-dimer test, i.e., using a fixed threshold, an age-adjusted manner, or adjusting D-dimer interpretation to a pretest probability (PTP) of PE.
  • However, it remains unknown how each diagnostic strategy performs in different healthcare settings like emergency care, primary healthcare, secondary hospital care, and inpatient care.

What did the researchers do and find?

  • The researchers searched and collected individual patient data (IPD) of existing studies that can be used to evaluate the performance of diagnostic strategies to exclude the possibility of PE.
  • By analyzing the data of over 35,000 patients suspected of PE from 23 studies, the researchers validated the performance of diagnostic strategies for suspected PE across different healthcare settings.
  • In healthcare settings with a higher prevalence of PE, compared to those with a lower prevalence, each diagnostic strategy tended to miss more patients with PE (i.e., less safe) and identified fewer patients in whom PE could be ruled out without imaging (i.e., less efficient), notably for strategies with a variable D-dimer interpretation.

What do these findings mean?

  • The performance of diagnostic strategies varied considerably across different healthcare settings due to the difference in patient characteristics and prevalence of PE.
  • Our findings can be used to choose the optimum diagnostic strategies in each healthcare setting, balancing the trade-off between decreasing unnecessary imaging studies and missing patients with PE.

Introduction

Pulmonary embolism (PE) is one of the most difficult diagnoses in clinical medicine, encountered daily in a variety of healthcare settings [1,2]. Due to potentially fatal consequences of missing PE [3,4], physicians tend to perform diagnostic imaging tests even when PE is considered not the most likely diagnosis. Some argue against this low threshold for diagnostic workup since such overtesting can lead to unnecessary radiation exposure, cost, and potential adverse events related to the use of contrast media [5]. At the same time, it has been argued that PE should be suspected more often to prevent potentially life-threatening delay in diagnosis [6].

To help physicians with this clinical dilemma, various diagnostic strategies for ruling out PE have been developed over time, all consisting of a set of clinical variables that are often combined with a blood test to detect clot degradation, i.e., D-dimer [7,8]. Given the differences in case mix and underlying prevalence of PE, it is likely that each diagnostic strategy has different merits across different healthcare settings [9,10]. Nevertheless, evidence on the performance of the currently available diagnostic strategies across different healthcare settings is limited, notably for settings like primary healthcare or inpatient care.

Hence, we performed a comprehensive systematic review followed by an individual patient data (IPD) meta-analysis (MA) to explore the performance of diagnostic strategies for PE across a variety of healthcare settings. The secondary aim of this study was to investigate the relationship between PE prevalence and the diagnostic performance measures of each strategy.

Methods

Throughout this paper, we adhere to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Individual Participant Data (PRISMA-IPD) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy (PRISMA-DTA) guidance on systematic reviews including IPD, where applicable [11,12]. The checklists are available in Tables A, B, and C in S1 Checklists. Ethical approval including written informed consent was obtained in each original study, and analyses described in this paper on optimizing diagnostic strategies for suspected PE were aligned with the informed consent as provided by individual patients in each study. Therefore, no additional ethical approval was required for this MA.

Protocol registration

This study was preregistered in the PROSPERO registry (see https://www.crd.york.ac.uk/prospero ID 89366), and the protocol has been published [13].

Diagnostic strategies under evaluation

Based on a previous systematic review [14] and discussion among experts, we a priori selected 11 existing diagnostic strategies for evaluation. An overview of these index strategies is shown in Table A in S1 Text. The 2 most commonly used clinical decision rules for pretest probability (PTP) assessment, the Wells and revised Geneva rules [14], are to be combined with D-dimer testing, with D-dimer interpretations either using a fixed cutoff (using either qualitative or quantitative D-dimer testing), adjusted to PTP, or adjusted to age [15,16]. The YEARS algorithm is a simplified version of the Wells rule with PTP-adjusted D-dimer [17]. The Pulmonary Embolism Rule-out Criteria (PERC) algorithm, which comprises 8 clinical items, was also evaluated [18]. This strategy differs from the other diagnostic strategies as it was originally developed for excluding PE in patients with a low clinical impression of PE. Therefore, following earlier studies, the PERC algorithm was validated in combination with (i) a Wells rule of 4 points or less; or (ii) physician’s gestalt considering PE unlikely (“low gestalt”). The PERC algorithm could only be evaluated for the settings “self-referral” emergency care and referred secondary care due to missing information on oxygen saturation in most of the studies in the other settings.
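The three ways of interpreting D-dimer described above can be sketched as follows. This is a minimal illustration, not the definitions used in this study (those are given in Table A in S1 Text). The cutoff values are assumptions based on commonly cited conventions: a fixed cutoff of 500 µg/L, an age-adjusted cutoff of age × 10 µg/L in patients older than 50 years, and a PTP-adjusted cutoff of 1,000 µg/L when PTP is low and 500 µg/L otherwise. The function names are hypothetical, and the sketch covers only the D-dimer step, not the clinical decision rule that precedes it.

```python
from typing import Literal

Strategy = Literal["fixed", "age_adjusted", "ptp_adjusted"]


def d_dimer_cutoff(strategy: Strategy, age_years: int, low_ptp: bool) -> float:
    """Return an illustrative D-dimer cutoff in ug/L (assumed values)."""
    if strategy == "fixed":
        return 500.0
    if strategy == "age_adjusted":
        # Assumed convention: age x 10 ug/L above 50 years, else 500 ug/L.
        return age_years * 10.0 if age_years > 50 else 500.0
    if strategy == "ptp_adjusted":
        # Assumed convention: higher cutoff only when PTP is low.
        return 1000.0 if low_ptp else 500.0
    raise ValueError(f"unknown strategy: {strategy}")


def d_dimer_negative(d_dimer: float, strategy: Strategy,
                     age_years: int, low_ptp: bool) -> bool:
    """True if the D-dimer value falls below the strategy's cutoff."""
    return d_dimer < d_dimer_cutoff(strategy, age_years, low_ptp)


# Hypothetical patient: 72 years old, low PTP, D-dimer 650 ug/L.
example = {
    s: d_dimer_negative(650.0, s, age_years=72, low_ptp=True)
    for s in ("fixed", "age_adjusted", "ptp_adjusted")
}
print(example)
```

For this hypothetical patient, the same D-dimer value is read as positive under the fixed cutoff and negative under the two adjusted cutoffs, which is the mechanism by which adjusted strategies gain efficiency.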

Study eligibility, identification, and selection

The process of study selection for the IPD-MA was described in detail in the protocol [13]. In short, to retrieve eligible studies, MEDLINE was first searched from January 1, 1995 to August 25, 2016 (this was recently updated until November 1, 2021). Studies were eligible if they (1) had a prospective or cross-sectional design and included patients with clinically suspected PE (in diagnostic research of venous thromboembolism [VTE], prospective cohort studies are common because VTE is often defined by clinical follow-up in patients in whom a PTP of VTE is deemed unlikely); (2) assessed the variables to validate at least one of the diagnostic strategies under evaluation; (3) included a clear description of the source of patient enrolment or clinical healthcare setting; (4) objectively confirmed VTE diagnosis (i.e., PE or deep vein thrombosis) with an established reference test method (either imaging [computed tomography pulmonary angiography (CTPA), ventilation–perfusion lung scan, or digital subtraction angiography] or clinical follow-up of at least 1 month); and (5) included at least 50 patients with confirmed VTE. Full-text screening was performed independently by 2 pairs of authors (GJG and NK, and FAK and NvE), and 40 potentially eligible papers were identified. With all principal investigators from these 40 retrieved studies invited, the results of this literature search were discussed during a meeting at the International Society on Thrombosis and Haemostasis (ISTH) conference in Berlin in 2017. The search results were complemented by asking those experts in the field of diagnosing VTE whether they knew of any additional datasets eligible for this IPD-MA.

Risk of bias assessment across studies

Three pairs of authors (GJG and TT, NvE and NK, and FAK and MAMS), who were not involved in the original studies, independently assessed each eligible study for potential sources of bias and applicability concerns using the QUADAS-2 tool [19]. Any disagreements were solved by discussion within each pair and subsequently between the pairs.

Healthcare settings

We defined the following 4 categories of healthcare settings in which patients suspected of PE are typically encountered:

  1. Self-referral emergency care: Patients typically present themselves without a referral by a general physician or specialist. This setting is characterized by a (very) low PE prevalence (i.e., around 5%) among patients with clinically suspected PE and has relatively good access to additional imaging or laboratory workup. Given that the studies performed in this setting emphasized preselection of patients who need to undergo D-dimer testing, and were thus not explicitly designed to evaluate a clinical decision rule for patients with a clear suspicion of PE, we only validated the PERC algorithm in this setting.
  2. Primary healthcare: Outpatient or community healthcare clinics where patients are investigated by a general practitioner, family doctor, or general internist who needs to decide on the need for further referral or diagnostic testing, with relatively restricted access to laboratory or imaging workup. The PE prevalence is usually low to intermediate (i.e., between 5% and 15%).
  3. Referred secondary care: In this setting, patients are referred (mostly by general practitioners, family doctors, or general internists) based upon a clear clinical suspicion of PE. In this setting, the PE prevalence in suspected patients is intermediate to relatively high (i.e., between 15% and 25%).
  4. Hospitalized or nursing home care: In this setting, patients are either hospitalized or in nursing homes, reflecting more severe and progressive illness with a high risk of PE. PE prevalence in the suspected population is typically high (i.e., above 25%).

To categorize each study into 1 of the 4 settings, expert panel members (GJG, FAK, MAMS, NK, and NvE) independently grouped each study and discussed disagreements until they reached a consensus. For studies that were performed in more than 1 setting (e.g., including both outpatients and inpatients), each patient was categorized based on the information provided by the principal investigators.

Data collection and harmonization

Principal investigators of eligible studies were asked to provide their original, anonymized datasets. These datasets were then harmonized by adjusting coding and definition of each variable using a template developed for this IPD-MA; see Table B in S1 Text.

Outcomes

The primary outcomes were diagnostic indices, i.e., failure rate and efficiency of each diagnostic strategy across different healthcare settings. Failure rate, which is a frequently applied measure for diagnostic safety in the VTE domain, was defined as the proportion of missed PE patients among those categorized as “PE excluded” by each diagnostic strategy. Efficiency of a strategy was defined as the proportion of patients categorized by the strategy as “PE excluded” among all patients. Additionally, we also estimated the traditional diagnostic indices, sensitivity and specificity.
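The four indices defined above can be written out from a simple cross-tabulation of strategy result against final diagnosis. The sketch below uses hypothetical counts and crude proportions only; it does not reproduce the multilevel models used in this study, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Cross-tabulation of strategy result against final diagnosis."""
    excluded_with_pe: int         # PE missed by the strategy
    excluded_without_pe: int      # correctly categorized as "PE excluded"
    not_excluded_with_pe: int     # PE cases sent on for imaging
    not_excluded_without_pe: int  # non-PE patients sent on for imaging

    @property
    def excluded(self) -> int:
        return self.excluded_with_pe + self.excluded_without_pe

    @property
    def total(self) -> int:
        return self.excluded + self.not_excluded_with_pe + self.not_excluded_without_pe


def failure_rate(c: Counts) -> float:
    """Missed PE among patients categorized as 'PE excluded'."""
    return c.excluded_with_pe / c.excluded


def efficiency(c: Counts) -> float:
    """Patients categorized as 'PE excluded' among all patients."""
    return c.excluded / c.total


def sensitivity(c: Counts) -> float:
    return c.not_excluded_with_pe / (c.not_excluded_with_pe + c.excluded_with_pe)


def specificity(c: Counts) -> float:
    return c.excluded_without_pe / (c.excluded_without_pe + c.not_excluded_without_pe)


# Hypothetical counts for 1,000 suspected patients (not study data).
counts = Counts(excluded_with_pe=2, excluded_without_pe=398,
                not_excluded_with_pe=98, not_excluded_without_pe=502)
print(failure_rate(counts), efficiency(counts),
      sensitivity(counts), specificity(counts))
```

Note that failure rate is conditioned on the strategy result, whereas sensitivity is conditioned on the final diagnosis, so the two need not move together when prevalence changes.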

Missing data

Summary of missing data in each study is shown in Table C in S1 Text. Within each study, missing values were imputed using multiple imputation techniques with chained equations with all available variables, except for variables missing in more than 80% of patients in the study [20]. The detail of imputation procedure is described in S1 Text.
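The exclusion rule for sparsely observed variables can be illustrated as below. This sketch only selects which variables would enter an imputation model under the 80% rule; it does not perform multiple imputation with chained equations. The variable names and values are hypothetical, and missing values are assumed to be coded as `None`.

```python
from typing import List, Mapping, Optional, Sequence


def missing_fraction(values: Sequence[Optional[float]]) -> float:
    """Fraction of patients in whom the variable is missing (None)."""
    return sum(v is None for v in values) / len(values)


def imputable_variables(
    study: Mapping[str, Sequence[Optional[float]]],
    max_missing: float = 0.80,
) -> List[str]:
    """Keep variables missing in at most `max_missing` of patients."""
    return [
        name for name, values in study.items()
        if missing_fraction(values) <= max_missing
    ]


# Hypothetical study with 5 patients (not study data).
study = {
    "heart_rate": [80.0, None, 95.0, 110.0, 72.0],         # 20% missing
    "oxygen_saturation": [None, None, None, None, 97.0],   # 80% missing
    "d_dimer": [None, None, None, None, None],             # 100% missing
}
print(imputable_variables(study))
```

A variable missing in exactly 80% of patients is retained here because the rule excludes only variables missing in more than 80%.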

Statistical analyses

The statistical analysis plan is described in detail in S1 Text. To evaluate the diagnostic performance of each strategy across different healthcare settings, we used multilevel logistic regression models [21,22]. In models for failure rate and efficiency, a random effect for the intercept was applied to account for clustering of observations within studies. In models for sensitivity and specificity, we used univariate random effects modeling due to nonconvergence issues encountered in bivariate random effects modeling [23]. By using these models, the diagnostic performance measures were estimated with 95% confidence intervals (CIs). In addition, between-study heterogeneity was assessed by calculating 95% prediction intervals (PIs), which indicate the performance that can be expected when the diagnostic strategy is applied in a new study [24]. Forest plots were drawn to visualize the failure rate and efficiency for the different strategies across different healthcare settings. In addition, the range of failure rate and efficiency of each diagnostic strategy in included studies was visualized with I² [25].
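The difference between a CI and a PI can be made concrete with a small sketch on the logit scale. It assumes that a random-intercept logistic model has already been fitted elsewhere and has returned a pooled logit, its standard error, and the between-study standard deviation of the random intercept. The input values are placeholders, and the normal quantile 1.96 is a simplification (a t-distribution is often used for PIs); the exact procedure used in this study is described in S1 Text.

```python
import math
from typing import Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def back_transformed_interval(logit_mean: float, spread: float,
                              z: float = 1.96) -> Tuple[float, float]:
    """Interval on the logit scale, back-transformed to a proportion."""
    return expit(logit_mean - z * spread), expit(logit_mean + z * spread)


def ci_and_pi(logit_mean: float, se: float, tau: float):
    """Return (CI, PI); the PI adds between-study variance tau**2."""
    ci = back_transformed_interval(logit_mean, se)
    pi = back_transformed_interval(logit_mean, math.sqrt(se ** 2 + tau ** 2))
    return ci, pi


# Placeholder inputs (not study estimates): pooled proportion of 1%,
# standard error 0.2 and between-study SD 0.5 on the logit scale.
ci, pi = ci_and_pi(logit(0.01), se=0.2, tau=0.5)
print(ci, pi)
```

Because the PI adds the between-study variance to the uncertainty of the pooled estimate, it is always at least as wide as the CI and widens as heterogeneity grows.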

Although our primary aim was to evaluate the performance of diagnostic strategies across different healthcare settings, the categorization of healthcare settings by the expert panel might still be arbitrary. Therefore, we assessed the relationship between failure rate and efficiency with underlying PE prevalence in each study as well, as this was deemed one of the most important distinctive characteristics of different healthcare settings. In accordance with a previous systematic review [26], log-transformed prevalence was added as a continuous covariable to the aforementioned multilevel logistic regression models. The relationship between PE prevalence and failure rate or efficiency of each strategy was plotted to graphically illustrate the impact of PE prevalence on these outcomes.
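The form of such a model can be sketched as a logistic curve in log-transformed prevalence. The sketch below shows only the fixed-effect part; the random intercept per study is omitted. The coefficients are placeholders chosen for illustration, not estimates from this study, and prevalence is assumed to be expressed as a proportion with a natural logarithm.

```python
import math


def predicted_proportion(intercept: float, slope: float,
                         prevalence: float) -> float:
    """Predicted failure rate or efficiency at a given PE prevalence.

    Implements expit(intercept + slope * log(prevalence)), with
    prevalence given as a proportion strictly between 0 and 1.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    linear_predictor = intercept + slope * math.log(prevalence)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Placeholder coefficients (not study estimates); a positive slope means
# the predicted proportion rises with prevalence.
for prevalence in (0.05, 0.10, 0.25):
    print(prevalence, predicted_proportion(-3.0, 0.5, prevalence))
```

With a positive slope the curve describes an outcome that increases with prevalence, as reported for failure rate; a negative slope would describe an outcome that decreases with prevalence, as reported for efficiency.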

Finally, given that the availability of items used in each diagnostic strategy differed across included studies, the diagnostic performance of each strategy was estimated in different sets of studies. This inherently makes comparisons of each strategy indirect, and, therefore, we performed additional sensitivity analyses including only studies in which all diagnostic strategies can be calculated. Such an analysis yields a direct comparison among diagnostic strategies.

All analyses were performed using R, version 3.6.3 (R Foundation for Statistical Computing, www.R-project.org), particularly using the lme4 package.

Results

The systematic literature search identified 3,892 unique studies [13]. After applying the eligibility criteria and scrutinizing original data files and publications, a total of 23 studies were selected to be included in this IPD-MA for a total of 35,248 unique patients suspected of PE; see Fig A in S1 Figs. Risk of bias of included studies was generally scored as low; see Fig B in S1 Figs.

Study and patient characteristics

A summary of the included studies is shown in Table D in S1 Text. Studies were published between 2000 and 2019. A total of 5 studies were conducted in self-referral emergency care (N = 12,612; mean prevalence 7%), 4 in primary healthcare (N = 3,174; mean prevalence 9%), 14 in referred secondary care (N = 17,052; mean prevalence 20%), and 9 studies included patients who were hospitalized or in nursing homes (N = 2,410; mean prevalence 24%). Detailed patient characteristics in each healthcare setting are shown in Table 1.

Table 1. Patient characteristics across different healthcare settings.

https://doi.org/10.1371/journal.pmed.1003905.t001

Accuracy of different diagnostic strategies across healthcare settings

Fig 1 shows the failure rate and efficiency of the diagnostic strategies across healthcare settings. The ranges of failure rate and efficiency in the included studies are shown with I² in Fig C in S1 Figs. Sensitivity and specificity of the 11 diagnostic strategies across healthcare settings are shown in Table 2. All strategies had a sensitivity higher than 90% in all settings (range: 93.3% to 99.6%), while specificity decreased in healthcare settings with higher PE prevalence (range: 7.9% to 67.4%).

Fig 1. Forest plot of failure rate and efficiency of the diagnostic strategies across healthcare settings.

CI, confidence interval; (C)PTP, (clinical) pretest probability; DD, D-dimer; N, number of patients; PERC, Pulmonary Embolism Rule-out Criteria; PI, prediction interval; PTP, pretest probability.

https://doi.org/10.1371/journal.pmed.1003905.g001

Table 2. Sensitivity and specificity of diagnostic strategies across healthcare settings.

https://doi.org/10.1371/journal.pmed.1003905.t002

Self-referral emergency care

The PERC algorithm was evaluated in combination with a Wells rule ≤4 points or “low gestalt.” Failure rate was 1.12% (95% CI 0.74 to 1.70) for the PERC algorithm combined with a Wells rule ≤4 points and 0.90% (95% CI 0.54 to 1.48) for the PERC algorithm combined with “low gestalt.” Efficiency was higher for the PERC algorithm combined with a Wells rule ≤4 points (21%) than for the PERC algorithm combined with “low gestalt” (13%).

Primary healthcare

The failure rate ranged from 0.13% (95% CI 0.03 to 0.62) for the Wells rule with a fixed D-dimer cutoff to 0.69% (95% CI 0.31 to 1.52) for the Wells rule with a qualitative or fixed D-dimer cutoff, while efficiency ranged from 38% (95% CI 25 to 52) for the Wells rule with a fixed D-dimer cutoff to 62% (95% CI 48 to 74) for the Wells rule with PTP-adjusted D-dimer.

Referred secondary care

In general, strategies with PTP-adjusted D-dimer (i.e., YEARS and Wells or revised Geneva rule combined with PTP-adjusted D-dimer) showed a higher failure rate than the others, with nonoverlapping 95% CIs: Failure rate was 2.10% (95% CI 1.59 to 2.75) for YEARS, 3.06% (95% CI 2.47 to 3.78) for the Wells rule with PTP-adjusted D-dimer, and 2.95% (95% CI 2.34 to 3.71) for the revised Geneva rule with PTP-adjusted D-dimer. Among the others, the failure rate ranged from 0.32% (95% CI 0.17 to 0.60) to 1.17% (95% CI 0.79 to 1.74). Efficiency of the strategies using PTP-adjusted D-dimer was higher than that of the others, again with nonoverlapping 95% CIs.

Evaluation of the PERC algorithm in combination with a Wells rule of ≤4 points yielded a failure rate of 6.01% (95% CI 4.09 to 8.75) with a corresponding efficiency of 10% (95% CI 7 to 14).

Hospitalized or nursing home care

The failure rate ranged from 1.68% (95% CI 0.65 to 4.25) for the Wells rule with age-adjusted D-dimer to 5.13% (95% CI 2.57 to 9.93) for the revised Geneva rule with a qualitative or fixed D-dimer cutoff, while efficiency ranged from 15% (95% CI 12 to 19) for the Wells rule with a fixed D-dimer cutoff to 30% (95% CI 25 to 35) for the Wells rule with PTP-adjusted D-dimer. The failure rate of all strategies showed wide overlapping 95% CIs.

Association between PE prevalence and failure rate/efficiency of diagnostic strategies under evaluation

The relationship between PE prevalence and failure rate or efficiency is visualized in Figs 2 and 3, respectively. In general, as PE prevalence increased, both failure rate and efficiency became poorer (i.e., higher failure rate and lower efficiency).

Fig 2. The relationship between the prevalence of PE and failure rate of each diagnostic strategy.

Gray shaded area shows 95% CI, and light gray shaded area shows 95% PI. CI, confidence interval; (C)PTP, (clinical) pretest probability; DD, D-dimer; PE, pulmonary embolism; PERC, Pulmonary Embolism Rule-out Criteria; PI, prediction interval; PTP, pretest probability.

https://doi.org/10.1371/journal.pmed.1003905.g002

Fig 3. The relationship between the prevalence of PE and efficiency of each diagnostic strategy.

Gray shaded area shows 95% CI, and light gray shaded area shows 95% PI. CI, confidence interval; (C)PTP, (clinical) pretest probability; DD, D-dimer; PE, pulmonary embolism; PERC, Pulmonary Embolism Rule-out Criteria; PI, prediction interval; PTP, pretest probability.

https://doi.org/10.1371/journal.pmed.1003905.g003

Sensitivity analyses allowing direct comparisons

Two sensitivity analyses were performed for direct comparisons. First, we included only patients in whom all diagnostic strategies can be calculated. Due to the lack of studies allowing for such a direct comparison of all strategies, we could include only referred secondary care patients in this sensitivity analysis (N = 6,736). Second, as the PERC algorithm is different from the other strategies as it is used in only patients with a very low PTP, we have also included patients in whom all diagnostic strategies except the PERC algorithm can be calculated (including N = 11,307 in the referred secondary care and N = 1,142 in hospitalized or nursing home care). In both types of sensitivity analyses, we found very similar inferences which supported the robustness of the primary analyses; see Figs D and E in S1 Figs.

Discussion

In this large, comprehensive international study including over 35,000 patients suspected of PE in various healthcare settings, we validated the performance of diagnostic strategies for suspected PE. We observed that the performance of these strategies varied considerably across different healthcare settings, likely due to the difference in case mix and (thus) PE prevalence. Our findings provide strong evidence on the optimum diagnostic strategies for PE suspicion per care setting, balancing the trade-off between missing PE cases and decreasing unnecessary referrals or follow-up.

Clinical implications

Our interpretation of the findings is as follows. The PERC algorithm is safe in self-referral emergency care, allowing physicians to forgo additional testing for PE (notably including D-dimer) in about 1 in every 5 patients when combined with a low clinical impression of PE being present, which confirms previous findings [27,28]. In the other settings, as this algorithm appears not to be safe, the use of a diagnostic strategy followed by D-dimer testing is preferred.

In primary healthcare, strategies with PTP-adjusted D-dimer showed equal safety and higher efficiency than those with a fixed or age-adjusted D-dimer cutoff, making them overall an attractive diagnostic strategy. However, in referred secondary care, strategies with PTP-adjusted D-dimer also had a better efficiency but showed a considerably higher failure rate—ranging between 2.10% and 3.06%—compared to those with age-adjusted D-dimer, which ranged from 0.65% to 0.81%.

Finally, in hospitalized or nursing home care, the observed failure rate was higher than that for the other settings, ranging between 1.68% and 5.13%. Moreover, as clearly observed in wide 95% CIs and PIs, the precision of our inferences was not sufficient to draw firm conclusions in this setting.

When deciding what diagnostic strategy to use, it should be acknowledged that no diagnostic strategy in patients suspected of PE will be completely safe, i.e., yield a “failure rate” of 0%. In fact, even CTPA, which is used as the “reference standard” for PE in modern clinical medicine, is not perfectly safe: the cumulative VTE incidence at 3 months after a normal CTPA (i.e., the “failure rate” of CTPA) was reported to be 1.20% (95% CI 0.48 to 2.60) [29]. Accordingly, it could be argued that any diagnostic strategy with a failure rate around 1% to 2% is as safe as referring all patients for CTPA, and this safety threshold is generally considered the adequate standard provided by the ISTH. Nevertheless, this safety threshold is dependent on case mix, exemplified by a higher cumulative VTE incidence at 3 months following a normal CTPA in patients with a high PTP (6.3%; i.e., patients with risk factors such as cancer, previous VTE, and immobilization). Thus, the acceptable threshold of a failure rate could be higher in healthcare settings that include more high-risk patients (i.e., high PE prevalence) than in those including more low-risk patients (i.e., low PE prevalence). Such a prevalence-adjusted threshold of failure rate indeed has been proposed by the ISTH [9]. If this were applied to each healthcare setting in this IPD-MA for illustrative purposes, the acceptable threshold of failure rate would range between 0.71% and 1.86% in self-referral emergency care, between 0.72% and 1.87% in primary healthcare, between 0.78% and 1.93% in referred secondary care, and between 0.80% and 1.95% in hospitalized or nursing home care. In that case, the optimum strategy (i.e., the most efficient strategy with an acceptable failure rate) may be the PERC algorithm in emergency care, a PTP-adjusted D-dimer strategy in primary healthcare, and an age-adjusted strategy in referred secondary care, while no strategy showed an acceptable failure rate in hospitalized or nursing home care.
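The selection rule described above (the most efficient strategy among those with an acceptable failure rate) can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the failure rates and efficiencies in the example are invented values rather than results of this study, and whether the threshold is compared with a point estimate or with the upper limit of a confidence interval is left as an assumption for the user to set.

```python
from typing import Dict, Optional, Tuple

# Strategy name -> (failure rate in %, efficiency in %).
StrategyResults = Dict[str, Tuple[float, float]]


def optimum_strategy(results: StrategyResults, max_failure_rate: float) -> Optional[str]:
    """Return the most efficient strategy whose failure rate does not exceed the threshold.

    Returns None when no strategy has an acceptable failure rate.
    """
    acceptable = {
        name: eff for name, (fail, eff) in results.items() if fail <= max_failure_rate
    }
    if not acceptable:
        return None
    return max(acceptable, key=acceptable.get)


# Hypothetical values for illustration only; they are NOT results from this study.
example_setting: StrategyResults = {
    "fixed D-dimer cutoff": (0.6, 25.0),
    "age-adjusted D-dimer": (0.8, 30.0),
    "PTP-adjusted D-dimer": (2.5, 45.0),
}

# 1.93% is the upper illustrative threshold quoted above for referred secondary care.
print(optimum_strategy(example_setting, max_failure_rate=1.93))
# A stricter threshold can leave no acceptable strategy at all.
print(optimum_strategy(example_setting, max_failure_rate=0.5))
```

Raising or lowering `max_failure_rate` shows how the preferred strategy shifts with the safety threshold a physician considers acceptable for a given setting.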

Nevertheless, as these prevalence-adjusted thresholds are proposed only for planning diagnostic studies rather than for use in clinical practice [9], physicians need to set the acceptable threshold of failure rate for their own setting and standards and subsequently choose the optimum diagnostic strategy, likely dictated by clinical context. We believe that our findings can aid that clinical decision-making by balancing the trade-off between safety and efficiency, tailored to the specific setting and case mix in which physicians work and encounter patients suspected of PE. Furthermore, when combined with other factors relevant to the clinical setting in which they are applied (e.g., patient perceptions and demands, availability of imaging studies, and the benefits and costs associated with different recommendations), our findings could be a useful basis for developing a clinical guideline for the diagnosis of PE.

This large-scale international study included over 35,000 patients suspected of PE, coming from a variety of healthcare settings. In addition, we used state-of-the-art statistical methods to quantify the diagnostic performance of currently available diagnostic strategies. Some aspects of this study nevertheless need specific attention.

First, the availability of items used in each diagnostic strategy differed across included studies. As such, in the primary analyses, the diagnostic performance of each strategy was compared in different sets of studies. Accordingly, we added the sensitivity analyses for a direct comparison of the diagnostic strategies, which yielded very similar results supporting the robustness of the primary analyses.

Second, although we defined the categorization of healthcare settings through thorough discussion among expert panel members, it could still be arbitrary. Thus, we analyzed the relationship between failure rate or efficiency and PE prevalence. We found that both failure rate and efficiency became poorer as PE prevalence increased, which supported the robustness of our main finding that the performance of each diagnostic strategy became poorer in healthcare settings with higher PE prevalence.

Third, the YEARS algorithm and the Wells rule with PTP-adjusted D-dimer (PEGeD) were less safe in this IPD-MA than in their original studies [15,17]. In most of the included studies, the reference standard for PE was a combination of imaging tests and clinical follow-up, with the decision to refer for imaging guided by the diagnostic strategy under evaluation. Diagnostic strategies that adapt the D-dimer cutoff to PTP, such as YEARS and PEGeD, are more efficient than the other strategies, i.e., they exclude PE without imaging in more patients. Accordingly, when these strategies are applied retrospectively in other studies, more of the patients in whom they would have excluded PE will have undergone imaging, rather than clinical follow-up alone, as the reference standard than was the case in their derivation studies. This likely led to the inclusion of small, possibly clinically insignificant clots among the missed PE cases in patients in whom PE would have been considered excluded by a negative PTP-adjusted D-dimer strategy. This hypothesis is supported by data showing that PE detected by the original Wells rule with a fixed D-dimer cutoff included more subsegmental PE than PE detected by the PTP-adjusted YEARS algorithm [30]. Unfortunately, detailed information about the localization and extent of diagnosed PE was not available in this IPD dataset.

Fourth, as shown in Table D in S1 Text, different types of D-dimer assay were used in the included studies, which could be a source of between-study heterogeneity. In addition, the performance of diagnostic strategies in each healthcare setting could be affected by the variation in D-dimer testing (e.g., the skill of laboratory technicians or the timing of the blood test in relation to patient presentation), which we could not explore in this IPD.

Finally, the studies included in our IPD-MA were conducted between 2000 and 2019. Over those 20 years, the performance of D-dimer testing and imaging studies has evolved. Hence, although we consider the trends in failure rate and efficiency of the diagnostic strategies in our findings to be valid and representative, the applicability of our findings to today’s patients should be interpreted with some caution.

Conclusions

The performance of available diagnostic strategies for patients with suspected PE varied considerably across different healthcare settings. The findings of this large-scale study indicate which is the optimum diagnostic strategy for ruling out PE per care setting, balancing the trade-off between missing PE cases and decreasing unnecessary referrals or follow-up.

Supporting information

S1 Checklist.

Includes Table A PRISMA-IPD Checklist, Table B PRISMA-DTA Checklist, and Table C PRISMA-DTA for Abstracts Checklist. PRISMA-DTA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy; PRISMA-IPD, Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Individual Participant Data.

https://doi.org/10.1371/journal.pmed.1003905.s001

(DOCX)

S1 Text.

Includes a detailed statistical analyses plan (including references), Table A Diagnostic strategies under evaluation, Table B Data template, Table C Summary of missing data in each study, and Table D Summary of included studies.

https://doi.org/10.1371/journal.pmed.1003905.s002

(DOCX)

S1 Fig.

Includes Fig A Flow of studies, Fig B Risk of bias assessment, Fig C The range of failure rate and efficiency of the diagnostic strategies with I² statistics, Fig D Sensitivity analysis including only studies in which all diagnostic strategies can be calculated, and Fig E Sensitivity analysis including only studies in which all diagnostic strategies except PERC algorithm can be calculated. PERC, Pulmonary Embolism Rule-out Criteria.

https://doi.org/10.1371/journal.pmed.1003905.s003

(DOCX)

References

  1. Huisman MV, Barco S, Cannegieter SC, Le Gal G, Konstantinides SV, Reitsma PH, et al. Pulmonary embolism. Nat Rev Dis Primers. 2018;4:18028. Epub 2018/05/18. pmid:29770793.
  2. Konstantinides SV, Meyer G, Becattini C, Bueno H, Geersing GJ, Harjola VP, et al. 2019 ESC Guidelines for the diagnosis and management of acute pulmonary embolism developed in collaboration with the European Respiratory Society (ERS). Eur Heart J. 2020;41(4):543–603. Epub 2019/09/11. pmid:31504429.
  3. den Exter PL, van Es J, Erkens PM, van Roosmalen MJ, van den Hoven P, Hovens MM, et al. Impact of delay in clinical presentation on the diagnostic management and prognosis of patients with suspected pulmonary embolism. Am J Respir Crit Care Med. 2013;187(12):1369–73. Epub 2013/04/18. pmid:23590273.
  4. Hendriksen JM, Koster-van Ree M, Morgenstern MJ, Oudega R, Schutgens RE, Moons KG, et al. Clinical characteristics associated with diagnostic delay of pulmonary embolism in primary care: a retrospective observational study. BMJ Open. 2017;7(3):e012789. Epub 2017/03/11. pmid:28279993; PubMed Central PMCID: PMC5353317.
  5. Prasad V, Rho J, Cifu A. The diagnosis and treatment of pulmonary embolism: a metaphor for medicine in the evidence-based medicine era. Arch Intern Med. 2012;172(12):955–8. Epub 2012/04/05. pmid:22473672.
  6. Carrier M, Klok FA. Symptomatic subsegmental pulmonary embolism: to treat or not to treat? Hematology Am Soc Hematol Educ Program. 2017;2017(1):237–41. Epub 2017/12/10. pmid:29222261; PubMed Central PMCID: PMC6142620.
  7. Willich SN, Chuang LH, van Hout B, Gumbs P, Jimenez D, Kroep S, et al. Pulmonary embolism in Europe—Burden of illness in relationship to healthcare resource utilization and return to work. Thromb Res. 2018;170:181–91. Epub 2018/09/11. pmid:30199784.
  8. Tritschler T, Kraaijpoel N, Le Gal G, Wells PS. Venous Thromboembolism: Advances in Diagnosis and Treatment. JAMA. 2018;320(15):1583–94. Epub 2018/10/17. pmid:30326130.
  9. van Es N, van der Hulle T, van Es J, den Exter PL, Douma RA, Goekoop RJ, et al. Wells Rule and d-Dimer Testing to Rule Out Pulmonary Embolism: A Systematic Review and Individual-Patient Data Meta-analysis. Ann Intern Med. 2016;165(4):253–61. Epub 2016/05/18. pmid:27182696.
  10. Dronkers CEA, van der Hulle T, Le Gal G, Kyrle PA, Huisman MV, Cannegieter SC, et al. Towards a tailored diagnostic standard for future diagnostic studies in pulmonary embolism: communication from the SSC of the ISTH. J Thromb Haemost. 2017;15(5):1040–3. Epub 2017/03/16. pmid:28296048.
  11. McInnes MDF, Moher D, Thombs BD, McGrath TA, Bossuyt PM, and the P-DTAG, et al. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement. JAMA. 2018;319(4):388–96. Epub 2018/01/25. pmid:29362800.
  12. Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, et al. Preferred Reporting Items for Systematic Review and Meta-Analyses of individual participant data: the PRISMA-IPD Statement. JAMA. 2015;313(16):1657–65. Epub 2015/04/29. pmid:25919529.
  13. Geersing GJ, Kraaijpoel N, Buller HR, van Doorn S, van Es N, Le Gal G, et al. Ruling out pulmonary embolism across different subgroups of patients and healthcare settings: protocol for a systematic review and individual patient data meta-analysis (IPDMA). Diagn Progn Res. 2018;2:10. Epub 2019/05/17. pmid:31093560; PubMed Central PMCID: PMC6460525.
  14. Hendriksen JM, Geersing GJ, Lucassen WA, Erkens PM, Stoffers HE, van Weert HC, et al. Diagnostic prediction models for suspected pulmonary embolism: systematic review and independent external validation in primary care. BMJ. 2015;351:h4438. Epub 2015/09/10. pmid:26349907; PubMed Central PMCID: PMC4561760.
  15. Kearon C, de Wit K, Parpia S, Schulman S, Afilalo M, Hirsch A, et al. Diagnosis of Pulmonary Embolism with d-Dimer Adjusted to Clinical Probability. N Engl J Med. 2019;381(22):2125–34. pmid:31774957.
  16. Schouten HJ, Koek HL, Oudega R, Geersing GJ, Janssen KJ, van Delden JJ, et al. Validation of two age dependent D-dimer cut-off values for exclusion of deep vein thrombosis in suspected elderly patients in primary care: retrospective, cross sectional, diagnostic analysis. BMJ. 2012;344:e2985. Epub 2012/06/08. pmid:22674922; PubMed Central PMCID: PMC3368485.
  17. van der Hulle T, Cheung WY, Kooij S, Beenen LFM, van Bemmel T, van Es J, et al. Simplified diagnostic management of suspected pulmonary embolism (the YEARS study): a prospective, multicentre, cohort study. Lancet. 2017;390(10091):289–97. pmid:28549662.
  18. Kline JA, Mitchell AM, Kabrhel C, Richman PB, Courtney DM. Clinical criteria to prevent unnecessary diagnostic testing in emergency department patients with suspected pulmonary embolism. J Thromb Haemost. 2004;2(8):1247–55. Epub 2004/08/12. pmid:15304025.
  19. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36. Epub 2011/10/19. pmid:22007046.
  20. Janssen KJ, Donders AR, Harrell FE Jr, Vergouwe Y, Chen Q, Grobbee DE, et al. Missing covariate data in medical research: to impute is better than to ignore. J Clin Epidemiol. 2010;63(7):721–7. Epub 2010/03/27. pmid:20338724.
  21. Debray TP, Riley RD, Rovers MM, Reitsma JB, Moons KG, Cochrane IPDM-aMg. Individual participant data (IPD) meta-analyses of diagnostic and prognostic modeling studies: guidance on their use. PLoS Med. 2015;12(10):e1001886. Epub 2015/10/16. pmid:26461078; PubMed Central PMCID: PMC4603958.
  22. Riley RD, Ensor J, Snell KI, Debray TP, Altman DG, Moons KG, et al. External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges. BMJ. 2016;353:i3140. Epub 2016/06/24. pmid:27334381; PubMed Central PMCID: PMC4916924.
  23. Simel DL, Bossuyt PM. Differences between univariate and bivariate models for summarizing diagnostic accuracy may not be large. J Clin Epidemiol. 2009;62(12):1292–300. Epub 2009/05/19. pmid:19447007.
  24. Debray TP, Moons KG, Abo-Zaid GM, Koffijberg H, Riley RD. Individual participant data meta-analysis for a binary outcome: one-stage or two-stage? PLoS ONE. 2013;8(4):e60650. Epub 2013/04/16. pmid:23585842; PubMed Central PMCID: PMC3621872.
  25. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327(7414):557–60. Epub 2003/09/06. pmid:12958120; PubMed Central PMCID: PMC192859.
  26. Lucassen W, Geersing GJ, Erkens PM, Reitsma JB, Moons KG, Buller H, et al. Clinical decision rules for excluding pulmonary embolism: a meta-analysis. Ann Intern Med. 2011;155(7):448–60. Epub 2011/10/05. pmid:21969343.
  27. Freund Y, Cachanado M, Aubry A, Orsini C, Raynal PA, Feral-Pierssens AL, et al. Effect of the Pulmonary Embolism Rule-Out Criteria on Subsequent Thromboembolic Events Among Low-Risk Emergency Department Patients: The PROPER Randomized Clinical Trial. JAMA. 2018;319(6):559–66. Epub 2018/02/17. pmid:29450523; PubMed Central PMCID: PMC5838786.
  28. Penaloza A, Soulié C, Moumneh T, Delmez Q, Ghuysen A, El Kouri D, et al. Pulmonary embolism rule-out criteria (PERC) rule in European patients with low implicit clinical probability (PERCEPIC): a multicentre, prospective, observational study. Lancet Haematol. 2017;4(12):e615–e21. pmid:29150390.
  29. van der Hulle T, van Es N, den Exter PL, van Es J, Mos ICM, Douma RA, et al. Is a normal computed tomography pulmonary angiography safe to rule out acute pulmonary embolism in patients with a likely clinical probability? A patient-level meta-analysis. Thromb Haemost. 2017;117(8):1622–9. Epub 2017/06/02. pmid:28569924.
  30. van der Pol LM, Bistervels IM, van Mens TE, van der Hulle T, Beenen LFM, den Exter PL, et al. Lower prevalence of subsegmental pulmonary embolism after application of the YEARS diagnostic algorithm. Br J Haematol. 2018;183(4):629–35. Epub 2018/09/11. pmid:30198551; PubMed Central PMCID: PMC6282699.