Ruling out pulmonary embolism across different healthcare settings: A systematic review and individual patient data meta-analysis

Background
The challenging clinical dilemma of detecting pulmonary embolism (PE) in suspected patients is encountered in a variety of healthcare settings. We hypothesized that the optimal diagnostic approach to detect these patients in terms of safety and efficiency depends on underlying PE prevalence, case mix, and physician experience, overall reflected by the type of setting where patients are initially assessed. The objective of this study was to assess the capability of ruling out PE by available diagnostic strategies across all possible settings.

Methods and findings
We performed a literature search (MEDLINE) followed by an individual patient data (IPD) meta-analysis (MA; 23 studies), including patients from self-referral emergency care (n = 12,612), primary healthcare clinics (n = 3,174), referred secondary care (n = 17,052), and hospitalized or nursing home patients (n = 2,410). Multilevel logistic regression was performed to evaluate the diagnostic performance of the Wells and revised Geneva rules, each combined with a fixed D-dimer threshold or a threshold adapted to age or pretest probability (PTP), of the YEARS algorithm, and of the Pulmonary Embolism Rule-out Criteria (PERC). All strategies were tested separately in each healthcare setting. Following studies done in this field, the primary diagnostic metrics estimated from the models were the "failure rate" of each strategy (i.e., the proportion of missed PE among patients categorized as "PE excluded") and "efficiency" (defined as the proportion of patients categorized as "PE excluded" among all patients). In self-referral emergency care, the PERC algorithm excludes PE in 21% of suspected patients at a failure rate of 1.12% (95% confidence interval [CI] 0.74 to 1.70), whereas in patients referred to secondary care the failure rate increases to 6.01% (4.09 to 8.75) at an efficiency of 10%. In patients from primary healthcare and those referred to secondary care, strategies adjusting D-dimer to PTP are the most efficient (range: 43% to 62%) at a failure rate ranging between 0.25% and 3.06%, with higher failure rates observed in patients referred to secondary care. For this latter setting, strategies adjusting D-dimer to age are associated with a lower failure rate ranging between 0.65% and 0.81%, yet are also less efficient (range: 33% to 35%). For all strategies, failure rates are highest in hospitalized or nursing home patients, ranging between 1.68% and 5.13%, at an efficiency ranging between 15% and 30%. The main limitation of the primary analyses was that the diagnostic performance of each strategy was compared in different sets of studies since the availability of items used in each diagnostic strategy differed across included studies; however, sensitivity analyses suggested that the findings were robust.

Conclusions
The capability of available diagnostic strategies to rule out PE safely and efficiently differs between healthcare settings. The findings of this IPD MA help in determining the optimum diagnostic strategies for ruling out PE per healthcare setting, balancing the trade-off between failure rate and efficiency of each strategy.


Author summary
Why was this study done?
• Pulmonary embolism (PE; i.e., clots in pulmonary vessels) is a potentially fatal condition, and patients suspected of having this condition are encountered in many different healthcare settings.
• To help physicians with ruling out PE without additional imaging tests, several diagnostic strategies exist, consisting of clinical items and a blood test (D-dimer testing), with different approaches to interpret this D-dimer test, i.e., using a fixed threshold, an age-adjusted threshold, or a threshold adjusted to the pretest probability (PTP) of PE.
What did the researchers do and find?

PLOS MEDICINE
• The researchers searched and collected individual patient data (IPD) of existing studies that can be used to evaluate the performance of diagnostic strategies to exclude the possibility of PE.
• By analyzing the data of over 35,000 patients suspected of PE from 23 studies, the researchers validated the performance of diagnostic strategies for suspected PE across different healthcare settings.
• In healthcare settings with a higher prevalence of PE, compared to those with a lower prevalence, each diagnostic strategy tended to miss more patients with PE (i.e., less safe) and identified fewer patients in whom PE could be ruled out without imaging (i.e., less efficient), notably for strategies with a variable D-dimer interpretation.

Introduction
Pulmonary embolism (PE) is one of the most difficult diagnoses in clinical medicine, encountered daily in a variety of healthcare settings [1,2]. Due to potentially fatal consequences of missing PE [3,4], physicians tend to perform diagnostic imaging tests even when PE is considered not the most likely diagnosis. Some argue against this low threshold for diagnostic workup since such overtesting can lead to unnecessary radiation exposure, cost, and potential adverse events related to the use of contrast media [5]. At the same time, it has been argued that PE should be suspected more often to prevent potentially life-threatening delay in diagnosis [6].
To help physicians with this clinical dilemma, various diagnostic strategies for ruling out PE have been developed over time, all consisting of a set of clinical variables that are often combined with a blood test to detect clot degradation, i.e., D-dimer [7,8]. Given the differences in case mix and underlying prevalence of PE, it is likely that each diagnostic strategy has different merits across different healthcare settings [9,10]. Nevertheless, evidence on the performance of the currently available diagnostic strategies across different healthcare settings is limited, notably for settings like primary healthcare or inpatient care.
Hence, we performed a comprehensive systematic review followed by an individual patient data (IPD) meta-analysis (MA) to explore the performance of diagnostic strategies for PE across a variety of healthcare settings. The secondary aim of this study was to investigate the relationship between PE prevalence and the diagnostic performance measures of each strategy.

Methods
Throughout this paper, we adhere to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Individual Participant Data (PRISMA-IPD) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy (PRISMA-DTA) guidance on systematic reviews including IPD, where applicable [11,12]. The checklists are available in Tables A, B, and C in S1 Checklists. Ethical approval including written informed consent was obtained in each original study, and analyses described in this paper on optimizing diagnostic strategies for suspected PE were aligned with the informed consent as provided by individual patients in each study. Therefore, no additional ethical approval was required for this MA.

Protocol registration
This study was preregistered in PROSPERO (see https://www.crd.york.ac.uk/prospero, ID 89366), and the protocol has been published [13].

Diagnostic strategies under evaluation
Based on a previous systematic review [14] and discussion among experts, we a priori selected 11 existing diagnostic strategies under evaluation. The overview of these index strategies is shown in Table A in S1 Text. The 2 most commonly used clinical decision rules for pretest probability (PTP) assessment, the Wells and revised Geneva rules [14], are to be combined with D-dimer testing, with D-dimer interpretations either using a fixed cutoff (using either qualitative or quantitative D-dimer testing), adjusted to PTP, or adjusted to age [15,16]. The YEARS algorithm is a simplified version of the Wells rule with PTP-adjusted D-dimer [17]. The Pulmonary Embolism Rule-out Criteria (PERC) algorithm, which comprises 8 clinical items, was also evaluated [18]. This strategy differs from the other diagnostic strategies as it was originally developed for excluding PE in patients with a low clinical impression of PE. Hereto, following earlier studies, the PERC algorithm was validated in combination with (i) a Wells rule of 4 points or less; or (ii) physician's gestalt considering PE unlikely ("low gestalt"). The PERC algorithm could only be evaluated for the settings "self-referral" emergency care and referred secondary care due to missing information on oxygen saturation in most of the studies in the other settings.
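The three ways of interpreting D-dimer described above can be illustrated with a short Python sketch. The numeric cutoffs are assumptions taken from the commonly cited descriptions of these approaches (500 ng/mL fixed; age × 10 ng/mL in patients older than 50 years; 1,000 ng/mL when PTP is low or no YEARS item is present, otherwise 500 ng/mL). The definitions actually used in this IPD-MA are those in Table A in S1 Text, and the function and argument names are hypothetical.

```python
def d_dimer_threshold(strategy, age=None, low_ptp=None):
    """Return an illustrative D-dimer cutoff in ng/mL for a given strategy.

    strategy: "fixed", "age_adjusted", or "ptp_adjusted" (hypothetical labels).
    age: patient age in years (needed for "age_adjusted").
    low_ptp: True if pretest probability is low (needed for "ptp_adjusted").
    """
    if strategy == "fixed":
        return 500.0
    if strategy == "age_adjusted":
        if age is None:
            raise ValueError("age is required for the age-adjusted strategy")
        # Assumed rule: age x 10 ng/mL above 50 years, otherwise the fixed cutoff.
        return float(age * 10) if age > 50 else 500.0
    if strategy == "ptp_adjusted":
        if low_ptp is None:
            raise ValueError("low_ptp is required for the PTP-adjusted strategy")
        # Assumed rule: a higher cutoff is allowed when pretest probability is low.
        return 1000.0 if low_ptp else 500.0
    raise ValueError(f"unknown strategy: {strategy}")


def d_dimer_negative(d_dimer_ng_ml, threshold_ng_ml):
    """A D-dimer result below the cutoff counts as negative."""
    return d_dimer_ng_ml < threshold_ng_ml
```

For example, under these assumed cutoffs a 72-year-old patient with a D-dimer of 650 ng/mL would test negative with the age-adjusted threshold (720 ng/mL) but positive with the fixed threshold. Whether PE is then considered excluded also depends on the clinical decision rule, which this sketch does not model.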

Study eligibility, identification, and selection
The process of study selection for the IPD-MA was described in detail in the protocol [13]. In short, to retrieve eligible studies, MEDLINE was first searched from January 1, 1995 to August 25, 2016 (this search was later updated until November 1, 2021). Studies were eligible if they (1) had a prospective or cross-sectional design and included patients with clinically suspected PE (in diagnostic research of venous thromboembolism [VTE], prospective cohort studies are common because VTE is often defined by clinical follow-up in patients in whom the PTP of VTE is deemed unlikely); (2) assessed the variables needed to validate at least one of the diagnostic strategies under evaluation; (3) included a clear description of the source of patient enrolment or clinical healthcare setting; (4) objectively confirmed VTE diagnosis (i.e., PE or deep vein thrombosis) with an established reference test method (either imaging [computed tomography pulmonary angiography (CTPA), ventilation-perfusion lung scan, or digital subtraction angiography] or clinical follow-up of at least 1 month); and (5) included at least 50 patients with confirmed VTE. Full-text screening was performed independently by 2 pairs of authors (GJG and NK, and FAK and NvE), and 40 potentially eligible papers were identified. All principal investigators of these 40 retrieved studies were invited to a meeting at the International Society on Thrombosis and Haemostasis (ISTH) conference in Berlin in 2017, where the results of this literature search were discussed. The search results were complemented by asking those experts in the field of diagnosing VTE whether they knew of any additional datasets eligible for this IPD-MA.

Risk of bias assessment across studies
Three pairs of authors (GJG and TT, NvE and NK, and FAK and MAMS), who were not involved in the original studies, independently assessed each eligible study for potential sources of bias and applicability concerns using the QUADAS-2 tool [19]. Any disagreements were solved by discussion within each pair and subsequently between the pairs.

Healthcare settings
We defined the following 4 categories of healthcare settings in which patients suspected of PE are typically encountered:
i. Self-referral emergency care: Patients typically present themselves without a referral by a general physician or specialist. This setting is characterized by a (very) low PE prevalence (i.e., around 5%) among patients with clinically suspected PE and has relatively good access to additional imaging or laboratory workup. Given that the studies performed in this setting focused on preselecting patients who need to undergo D-dimer testing, rather than on evaluating a clinical decision rule for patients with a clear suspicion of PE, we only validated the PERC algorithm in this setting.
ii. Primary healthcare: Outpatient or community healthcare clinics where patients are investigated by a general practitioner, family doctor, or general internist who needs to decide on the need for further referral or diagnostic testing, with relatively restricted access to laboratory or imaging workup. The PE prevalence is usually low to intermediate (i.e., between 5% and 15%).
iii. Referred secondary care: In this setting, patients are referred (mostly by general practitioners, family doctors, or general internists) based upon a clear clinical suspicion of PE. In this setting, the PE prevalence in suspected patients is intermediate to relatively high (i.e., between 15% and 25%).
iv. Hospitalized or nursing home care: In this setting, patients are either hospitalized or in nursing homes, reflecting more severe and progressive illness with a high risk of PE. PE prevalence in the suspected population is typically high (i.e., above 25%).
To categorize each study into 1 of the 4 settings, expert panel members (GJG, FAK, MAMS, NK, and NvE) independently grouped each study and discussed disagreements until they reached a consensus. For studies that were performed in more than 1 setting (e.g., including both outpatients and inpatients), each patient was categorized based on the information provided by the principal investigators.

Data collection and harmonization
Principal investigators of eligible studies were asked to provide their original, anonymized datasets. These datasets were then harmonized by adjusting coding and definition of each variable using a template developed for this IPD-MA; see Table B in S1 Text.

Outcomes
The primary outcomes were diagnostic indices, i.e., failure rate and efficiency of each diagnostic strategy across different healthcare settings. Failure rate, which is a frequently applied measure for diagnostic safety in the VTE domain, was defined as the proportion of missed PE patients among those categorized as "PE excluded" by each diagnostic strategy. Efficiency of a strategy was defined as the proportion of patients categorized by the strategy as "PE excluded" among all patients. Additionally, we also estimated the traditional diagnostic indices, sensitivity and specificity.
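As a minimal illustration of these two definitions, the following Python sketch computes the crude failure rate and efficiency from per-patient results. The function name and input layout are hypothetical, and the crude proportions shown here are not the model-based pooled estimates reported in this paper.

```python
def failure_rate_and_efficiency(patients):
    """Compute crude failure rate and efficiency.

    patients: iterable of (pe_excluded, has_pe) boolean pairs, where
    pe_excluded is the strategy's classification and has_pe is the
    reference-standard result.
    """
    patients = list(patients)
    n_total = len(patients)
    n_excluded = sum(1 for excluded, _ in patients if excluded)
    n_missed = sum(1 for excluded, has_pe in patients if excluded and has_pe)

    # Efficiency: patients classified as "PE excluded" among all patients.
    efficiency = n_excluded / n_total if n_total else float("nan")
    # Failure rate: missed PE among patients classified as "PE excluded".
    failure_rate = n_missed / n_excluded if n_excluded else float("nan")
    return failure_rate, efficiency
```

For instance, if a strategy classifies 210 of 1,000 patients as "PE excluded" and 2 of those 210 turn out to have PE, the efficiency is 21% and the failure rate is 2/210, or about 0.95%.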

Missing data
Summary of missing data in each study is shown in Table C in S1 Text. Within each study, missing values were imputed using multiple imputation techniques with chained equations with all available variables, except for variables missing in more than 80% of patients in the study [20]. The detail of imputation procedure is described in S1 Text.

Statistical analyses
The statistical analysis plan is described in detail in S1 Text. To evaluate the diagnostic performance of each strategy across different healthcare settings, we used multilevel logistic regression models [21,22]. In models for failure rate and efficiency, a random effect for the intercept was applied to account for clustering of observations within studies. In models for sensitivity and specificity, we used univariate random effects modeling due to nonconvergence issues encountered in bivariate random effects modeling [23]. By using these models, the diagnostic performance measures were estimated with 95% confidence intervals (CIs). In addition, between-study heterogeneity was assessed by calculating 95% prediction intervals (PIs), which indicate the performance that can be expected when the diagnostic strategy is applied in a new study [24]. Forest plots were drawn to visualize the failure rate and efficiency for the different strategies across different healthcare settings. In addition, the range of failure rate and efficiency of each diagnostic strategy in included studies was visualized together with the I² statistic [25].
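The difference between a CI and a PI in a random-intercept logistic model can be sketched as follows. This Python example assumes that the model has already been fitted and has returned a pooled intercept on the logit scale, its standard error, and the between-study standard deviation. It uses a normal approximation for both intervals, which is an assumption of this sketch and may differ from the exact procedure in S1 Text (for example, a t-distribution is often used for PIs). All names are hypothetical.

```python
import math


def expit(x):
    """Inverse logit: map a log-odds value to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def summarize_random_intercept(beta0, se, tau, z=1.96):
    """Back-transform a pooled logit estimate to a proportion with CI and PI.

    beta0: pooled intercept (log-odds), se: its standard error,
    tau: between-study standard deviation of the random intercept.
    """
    estimate = expit(beta0)
    # CI reflects uncertainty in the pooled mean only.
    ci = (expit(beta0 - z * se), expit(beta0 + z * se))
    # PI additionally includes between-study variation (approximate).
    half_width = z * math.sqrt(se ** 2 + tau ** 2)
    pi = (expit(beta0 - half_width), expit(beta0 + half_width))
    return estimate, ci, pi
```

When the between-study standard deviation is zero, the PI coincides with the CI; the more heterogeneous the studies, the wider the PI becomes relative to the CI.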
Although our primary aim was to evaluate the performance of diagnostic strategies across different healthcare settings, the categorization of healthcare settings by the expert panel might still be arbitrary. Therefore, we assessed the relationship between failure rate and efficiency with underlying PE prevalence in each study as well, as this was deemed one of the most important distinctive characteristics of different healthcare settings. In accordance with a previous systematic review [26], log-transformed prevalence was added as a continuous covariable to the aforementioned multilevel logistic regression models. The relationship between PE prevalence and failure rate or efficiency of each strategy was plotted to graphically illustrate the impact of PE prevalence on these outcomes.
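To make the role of the prevalence covariable concrete, the following Python sketch evaluates the fixed part of such a model, logit(p) = intercept + slope × log(prevalence), at a given prevalence. The use of the natural logarithm of prevalence expressed as a proportion is an assumption of this sketch, the coefficients in the usage note are invented for illustration only, and the random study effect is omitted.

```python
import math


def predicted_proportion(prevalence, intercept, slope):
    """Predict a proportion (e.g., failure rate) at a given PE prevalence.

    prevalence: PE prevalence as a proportion strictly between 0 and 1.
    intercept, slope: fixed-effect coefficients on the logit scale
    (hypothetical values, not estimates from this study).
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    linear_predictor = intercept + slope * math.log(prevalence)
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

With an invented positive slope, for example `predicted_proportion(0.20, -2.5, 0.8)` versus `predicted_proportion(0.05, -2.5, 0.8)`, the predicted failure rate is higher at the higher prevalence; a negative slope would give the opposite pattern, as would be expected for efficiency.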
Finally, given that the availability of items used in each diagnostic strategy differed across included studies, the diagnostic performance of each strategy was estimated in different sets of studies. This inherently makes comparisons of each strategy indirect, and, therefore, we performed additional sensitivity analyses including only studies in which all diagnostic strategies can be calculated. Such an analysis yields a direct comparison among diagnostic strategies.
All analyses were performed using R, version 3.6.3 (R Foundation for Statistical Computing, www.R-project.org), particularly using the lme4 package.

Results
The systematic literature search identified 3,892 unique studies [13]. After applying the eligibility criteria and scrutinizing original data files and publications, a total of 23 studies were selected to be included in this IPD-MA for a total of 35,248 unique patients suspected of PE; see Fig

Study and patient characteristics
A summary of the included studies is shown in Table D in S1 Text. Studies were published between 2000 and 2019. A total of 5 studies were conducted in self-referral emergency care (N = 12,612; mean prevalence 7%), 4 in primary healthcare (N = 3,174; mean prevalence 9%), 14 in referred secondary care (N = 17,052; mean prevalence 20%), and 9 studies included patients who were hospitalized or in nursing homes (N = 2,410; mean prevalence 24%). Detailed patient characteristics in each healthcare setting are shown in Table 1. All strategies had a sensitivity higher than 90% in all settings (range: 93.3% to 99.6%), while specificity decreased in healthcare settings with higher PE prevalence (range: 7.9% to 67.4%); see Table 2.

Self-referral emergency care
The PERC algorithm was evaluated in combination with a Wells rule ≤4 points or "low gestalt." Failure rate was 1.12% (95% CI 0.74 to 1.70) for the PERC algorithm combined with a Wells rule ≤4 points and 0.90% (95% CI 0.54 to 1.48) for that with "low gestalt." Efficiency was higher for the PERC algorithm combined with a Wells rule ≤4 points (21%) than for that with "low gestalt" (13%).

Primary healthcare
The failure rate ranged from 0.13% (95% CI 0.03 to 0.62) for the Wells rule with a fixed D-dimer cutoff to 0.69% (95% CI 0.31 to 1.52) for the Wells rule with a qualitative or fixed D-dimer cutoff, while efficiency ranged from 38% (95% CI 25 to 52) for the Wells rule with a fixed D-dimer cutoff to 62% (95% CI 48 to 74) for the Wells rule with PTP-adjusted D-dimer.

Referred secondary care
In general, strategies with PTP-adjusted D-dimer (i.e., YEARS and the Wells or revised Geneva rule combined with PTP-adjusted D-dimer) showed a higher failure rate than the other strategies, without overlap in their 95% CIs: the failure rate was 2.10% (95% CI 1.59 to 2.75) for YEARS, 3.06% (95% CI 2.47 to 3.78) for the Wells rule with PTP-adjusted D-dimer, and 2.95% (95% CI 2.34 to 3.71) for the revised Geneva rule with PTP-adjusted D-dimer. Among the other strategies, the failure rate ranged from 0.32% (95% CI 0.17 to 0.60) to 1.17% (95% CI 0.79 to 1.74). Efficiency of the strategies using PTP-adjusted D-dimer was higher than that of the other strategies, without overlap in their 95% CIs. Evaluation of the PERC algorithm in combination with a Wells rule of ≤4 points yielded a failure rate of 6.01% (95% CI 4.09 to 8.75) with a corresponding efficiency of 10% (95% CI 7 to 14).

Hospitalized or nursing home care
The failure rate ranged from 1.68% (95% CI 0.65 to 4.25) for the Wells rule with age-adjusted D-dimer to 5.13% (95% CI 2.57 to 9.93) for the revised Geneva rule with a qualitative or fixed D-dimer cutoff, while efficiency ranged from 15% (95% CI 12 to 19) for the Wells rule with a fixed D-dimer cutoff to 30% (95% CI 25 to 35) for the Wells rule with PTP-adjusted D-dimer. The failure rate of all strategies showed wide overlapping 95% CIs.

Association between PE prevalence and failure rate/efficiency of diagnostic strategies under evaluation
The relationship between PE prevalence and failure rate or efficiency is visualized in Figs 2 and 3, respectively. In general, as PE prevalence increased, both failure rate and efficiency became poorer (i.e., higher failure rate and lower efficiency).

Sensitivity analyses allowing direct comparisons
Two sensitivity analyses were performed for direct comparisons. First, we included only patients in whom all diagnostic strategies could be calculated. Due to the lack of studies allowing for such a direct comparison of all strategies, we could include only referred secondary care patients in this sensitivity analysis (N = 6,736). Second, because the PERC algorithm differs from the other strategies in that it is used only in patients with a very low PTP, we also included patients in whom all diagnostic strategies except the PERC algorithm could be calculated (N = 11,307 in referred secondary care and N = 1,142 in hospitalized or nursing home care). Both sensitivity analyses yielded very similar inferences, which supported the robustness of the primary analyses; see Figs D and E in S1 Figs.

Discussion
In this large, comprehensive international study including over 35,000 patients suspected of PE in various healthcare settings, we validated the performance of diagnostic strategies for

Clinical implications
Our interpretation of the findings is as follows. The PERC algorithm is safe in self-referral emergency care, making it possible to forgo additional testing for PE (notably including D-dimer) in about 1 in every 5 patients when combined with a low clinical impression of PE, which confirms previous findings [27,28]. In the other settings, as this algorithm appears not to be safe, the use of a diagnostic strategy followed by D-dimer testing is preferred. In primary healthcare, strategies with PTP-adjusted D-dimer showed equal safety and higher efficiency than those with a fixed or age-adjusted D-dimer cutoff, making them overall an attractive diagnostic strategy. However, in referred secondary care, strategies with PTP-adjusted D-dimer also had a better efficiency but showed a considerably higher failure rate, ranging between 2.10% and 3.06%, compared to those with age-adjusted D-dimer, which ranged from 0.65% to 0.81%. Finally, in hospitalized or nursing home care, the observed failure rate was higher than that for the other settings, ranging between 1.68% and 5.13%. Moreover, as clearly observed in wide 95% CIs and PIs, the precision of our inferences was not sufficient to draw firm conclusions in this setting.
When deciding what diagnostic strategy to use, it should be acknowledged that no diagnostic strategy in patients suspected of PE will be completely safe, i.e., yielding a "failure rate" of 0%. In fact, even CTPA, which is used as the "reference standard" for PE in modern clinical medicine, is not perfectly safe, as the cumulative VTE incidence at 3 months after a normal CTPA (i.e., the "failure rate" of CTPA) was reported to be 1.20% (95% CI 0.48 to 2.60) [29]. Accordingly, it could be argued that any diagnostic strategy with a failure rate around 1% to 2% is as safe as referring all patients for CTPA, and this safety threshold is generally considered the adequate standard provided by the ISTH. Nevertheless, this safety threshold is dependent on case mix, exemplified by a higher cumulative VTE incidence at 3 months following a normal CTPA in patients with a high PTP (6.3%; i.e., patients with risk factors such as cancer, previous VTE, and immobilization). Thus, the acceptable threshold of a failure rate could be higher in healthcare settings that include more high-risk patients (i.e., high PE prevalence) than in those including more low-risk patients (i.e., low PE prevalence). Such a prevalence-adjusted threshold of failure rate indeed has been proposed by the ISTH [9]. If this was applied to each healthcare setting in this IPD-MA for illustrative purposes, the acceptable threshold of failure rate should range between 0.71% and 1.86% in self-referral emergency care, between 0.72% and 1.87% in primary healthcare, between 0.78% and 1.93% in referred secondary care, and between 0.80% and 1.95% in hospitalized or nursing home care, respectively.
In that case, the optimum strategy (i.e., most efficient strategy with acceptable failure rate) may be the PERC algorithm in emergency care, a PTP-adjusted D-dimer strategy in primary healthcare, and an age-adjusted strategy in referred secondary care, while no strategy showed an acceptable failure rate in hospitalized or nursing home care. Nevertheless, as these prevalence-adjusted thresholds are proposed only for planning diagnostic studies rather than for the use in clinical practice [9], physicians need to set the acceptable threshold of failure rate for their own setting and standards and subsequently choose the optimum diagnostic strategy, likely dictated by clinical context. We believe that our findings can be used to aid that clinical decision-making, balancing the trade-off between safety and efficiency, and tailored to the specific setting and case mix where they work and encounter patients suspected of PE. Furthermore, by combining with various factors (e.g., patient perceptions and demands, availability of imaging studies, and benefit/cost associated with different recommendations) in a clinical setting where it is applied, our findings could be a useful basis for developing a clinical guideline for the diagnosis of PE.
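The prevalence-adjusted threshold ranges quoted above can be reproduced with a simple linear rule, sketched below in Python. The linear form and its coefficients (intercepts of 0.67 and 1.82 and a slope of 0.00528 per percentage point of prevalence) are assumptions of this sketch: they were chosen because, with the mean prevalences reported in this study (7%, 9%, 20%, and 24%), they return the ranges given in the text. The ISTH guidance [9] remains the authoritative source for the actual definition, and the function name is hypothetical.

```python
def prevalence_adjusted_threshold(prevalence_pct):
    """Return an illustrative (lower, upper) acceptable failure rate in percent.

    prevalence_pct: PE prevalence in percent (e.g., 20 for 20%).
    The coefficients are assumptions chosen to match the ranges quoted
    in the text, not values taken from the ISTH guidance itself.
    """
    slope = 0.00528
    lower = 0.67 + slope * prevalence_pct
    upper = 1.82 + slope * prevalence_pct
    return round(lower, 2), round(upper, 2)


# Mean PE prevalence per healthcare setting as reported in this study.
settings = {
    "self-referral emergency care": 7,
    "primary healthcare": 9,
    "referred secondary care": 20,
    "hospitalized or nursing home care": 24,
}

for name, prevalence in settings.items():
    lower, upper = prevalence_adjusted_threshold(prevalence)
    print(f"{name}: {lower:.2f}% to {upper:.2f}%")
```

The sketch also shows how little the threshold moves across settings: an increase in prevalence from 7% to 24% raises the upper bound by less than 0.1 percentage points.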
This large-scale international study included over 35,000 patients suspected of PE, coming from a variety of healthcare settings. In addition, we used state-of-the-art statistical methods to quantify diagnostic performance of currently available diagnostic strategies. For full appreciation, however, some aspects of this study need specific attention.
First, the availability of items used in each diagnostic strategy differed across included studies. As such, in the primary analyses, the diagnostic performance of each strategy was compared in different sets of studies. Accordingly, we added the sensitivity analyses for a direct comparison of the diagnostic strategies, which yielded very similar results supporting the robustness of the primary analyses.
Second, although we defined the categorization of healthcare settings through extensive discussion among expert panel members, it could still be arbitrary. Thus, we analyzed the relationship between failure rate or efficiency and PE prevalence. We found that both failure rate and efficiency became poorer as PE prevalence increased, which supported the robustness of our main finding that the performance of each diagnostic strategy became poorer in healthcare settings with higher PE prevalence.
Third, the YEARS algorithm and the Wells rule with PTP-adjusted D-dimer (PeGED) were less safe in this IPD-MA than in their original studies [15,17]. In most of the included studies, the reference standard for PE was a combination of imaging tests and clinical follow-up, with the decision to refer for imaging guided by the diagnostic strategy under evaluation. However, diagnostic strategies adapting D-dimer to PTP, such as YEARS and PeGED, are more efficient than the other strategies. Accordingly, when these diagnostic strategies are applied retrospectively in other studies, more patients will have had imaging rather than clinical follow-up as the reference standard, compared to their derivation studies. This approach likely led to the inclusion of small, possibly insignificant clots in the proportion of missed PE cases among those in whom PE could be considered excluded based on a negative PTP-adjusted D-dimer strategy. This hypothesis is supported by data showing that PE detected by the original Wells rule with a fixed D-dimer cutoff included more subsegmental PE than PE detected by the PTP-adjusted YEARS algorithm [30]. Unfortunately, detailed information about the localisation and extent of diagnosed PE was not available in this IPD dataset.
Fourth, as shown in Table D in S1 Text, different types of D-dimer assay were used in the included studies, which could be a source of between-study heterogeneity. In addition, the performance of diagnostic strategies in each healthcare setting could be affected by the variation in D-dimer testing (e.g., the skill of laboratory technicians or the timing of the blood test in relation to patient presentation), which we could not explore in this IPD.
Finally, the studies included in our IPD-MA were conducted between 2000 and 2019. Over those 20 years, the performance of D-dimer testing and imaging studies has evolved. Hence, although we consider the trends of failure rate and efficiency of the diagnostic strategies in our findings to be valid and representative, the validity of our finding in today's patients should be interpreted with some caution.