The authors have declared that no competing interests exist.

Conceived and designed the experiments: TPD KGM GAZ HK RDR. Performed the experiments: TPD RDR. Analyzed the data: TPD KGM HK RDR. Contributed reagents/materials/analysis tools: TPD KGM GAZ HK RDR. Wrote the paper: TPD KGM GAZ HK RDR.

A fundamental aspect of epidemiological studies concerns the estimation of factor-outcome associations to identify risk factors, prognostic factors and potential causal factors. Because reliable estimates for these associations are important, there is a growing interest in methods for combining the results from multiple studies in individual participant data meta-analyses (IPD-MA). When there is substantial heterogeneity across studies, various random-effects meta-analysis models are possible that employ a one-stage or two-stage method. These are generally thought to produce similar results, but empirical comparisons are few.

We describe and compare several one- and two-stage random-effects IPD-MA methods for estimating factor-outcome associations from multiple risk-factor or predictor finding studies with a binary outcome. One-stage methods use the IPD of each study and meta-analyse using the exact binomial distribution, whereas two-stage methods reduce evidence to the aggregated level (e.g. odds ratios) and then meta-analyse assuming approximate normality. We compare the methods in an empirical dataset for unadjusted and adjusted risk-factor estimates.

Though often similar, on occasion the one-stage and two-stage methods provide different parameter estimates and different conclusions. For example, the estimated effect of erythema and its statistical significance differed for a one-stage (OR = 1.35,

When planning an IPD-MA, the choice and implementation (e.g. univariate or multivariate) of a one-stage or two-stage method should be prespecified in the protocol as occasionally they lead to different conclusions about which factors are associated with outcome. Though both approaches can suffer from estimation challenges, we recommend employing the one-stage method, as it uses a more exact statistical approach and accounts for parameter correlation.

A fundamental aspect of epidemiological studies concerns the estimation of associations between independent variables (factors) and dependent variables (outcomes). Outcomes may include disease onset, disease presence (diagnosis), disease progression (prognosis), and death. Independent variables may include potential causal factors that help unravel the pathophysiology or causal pathway of the outcome under study, but also non-causal predictors or risk indicators that enhance timely detection or prediction of the outcome, perhaps as part of a risk prediction model

When IPD are available, meta-analysis is usually performed using a two-stage approach

In a recent empirical evaluation using a meta-analysis of 24 randomised trials of antiplatelets to prevent preeclampsia, Stewart

The aim of this article is to describe and empirically evaluate possible one-stage and two-stage IPD-MA models for synthesizing (causal or predictive) factor-outcome association estimates across multiple studies where a continuous or binary factor is of interest in relation to a binary outcome. It is therefore similar in spirit to a recent description of methods for meta-analysis of time-to-event outcomes

Deep vein thrombosis (DVT) is a blood clot that forms in a vein, usually in the lower leg or thigh. The clot, or part of it, can break off and be carried through the bloodstream to the lungs, where it can cause a blockage (pulmonary embolism) that prevents oxygenation of the blood and may cause death. The presence or absence of DVT can ultimately be established using repeated leg ultrasound, which requires patient referral, is burdensome for the patient, and costs time and money. Hence, it is desirable to predict the presence or absence of DVT without referring patients for more cumbersome testing, using instead predictors that are easy to obtain from patient history, physical examination and simple blood assays. For this reason, various studies in patients with a suspected DVT have aimed to estimate which factors, out of a range of candidate factors, are indeed associated with the presence or absence of DVT; in other words, which factors are useful diagnostic predictors of the probability that a patient truly has DVT.

A previous systematic review collected the IPD of patients with a suspected DVT from 13 studies (

Study | N | ddimdich = 1 | notraum = 1 | coag = 1 | eryt = 1 | sex = 1 | malign = 1 | par = 1 |

1 | 1028(131) | 472 (117) | 743 (104) | 19 (3) | 382 (52) | 376 (66) | 54 (15) | 13 (2) |

2 | 814(318) | 598 (313) | 307(146) | 86 (43) | 35(17) | |||

3 | 153 (26) | 103 (16) | 51(15) | 73 (10) | 7 (4) | 12 (1) | ||

4 | 1756(411) | 910 (387) | 1497 (361) | 68 (20) | 654(192) | 224 (84) | 101(35) | |

5 | 791 (126) | 572 (91) | 650 (111) | 191(31) | 301(59) | 38 (8) | 112(18) | |

6 | 1075(190) | 424 (161) | 857 (158) | 52 (17) | 471 (97) | 55 (25) | 50 (11) | |

7 | 429 (61) | 153 (28) | 47 (17) | 12 (2) | ||||

8 | 325 (52) | 214 (51) | 57 (11) | 128 (24) | 12 (5) | 14 (2) | ||

9 | 1295(289) | 897 (276) | 1098 (257) | 467(137) | 81 (34) | 178(37) | ||

10 | 436 (42) | 82 (5) | 145 (20) | 26 (8) | 13 (2) | |||

11 | 541 (121) | 266 (108) | 373 (92) | 14 (4) | 144(38) | 238 (62) | 99 (47) | 34 (13) |

12 | 550 (55) | 210 (27) | 50 (17) | 12 (1) | ||||

13 | 809 (42) | 324 (21) | 55 (10) | 27 (5) |

Study | N | tend = 1 | leg = 1 | calfdif3 = 1 | pit = 1 | vein = 1 | altdiagn = 1 | surg = 1 |

1 | 1028(131) | 562 (69) | 232 (46) | 311 (72) | 704(108) | 155 (28) | 669 (22) | 81 (11) |

2 | 814 (318) | 541 (237) | 169 (89) | 353 (186) | 419(196) | 127 (57) | 217 (58) | 75 (34) |

3 | 153 (26) | 82 (19) | 51 (20) | 59 (19) | 73 (22) | 25 (8) | 74 (3) | 24 (10) |

4 | 1756(411) | 664 (238) | 607 (251) | 426 (210) | 950(272) | 283 (92) | 906 (92) | 198(77) |

5 | 791 (126) | 572 (90) | 353 (79) | 322 (79) | 490 (85) | 155 (32) | 300 (43) | 105(25) |

6 | 1075(190) | 494 (118) | 217 (67) | 303 (106) | 357(100) | 43 (16) | 448 (26) | 168(45) |

7 | 429 (61) | 203 (41) | 30 (13) | 96 (33) | 87 (24) | 33 (5) | 176 (17) | 25 (6) |

8 | 325 (52) | 161 (31) | 47 (21) | 93 (33) | 97 (29) | 39 (9) | 114 (8) | 16 (6) |

9 | 1295(289) | 924 (208) | 583 (164) | 556 (194) | 799(193) | 257 (82) | 782 (98) | 181(54) |

10 | 436 (42) | 222 (28) | 168 (28) | 66 (13) | 91 (18) | 1 (0) | 119 (5) | 58 (9) |

11 | 541 (121) | 239 (74) | 152 (67) | 162 (63) | 270 (86) | 38 (16) | 313 (22) | 96 (34) |

12 | 550 (55) | 176 (21) | 83 (16) | 114 (30) | 251 (40) | 28 (8) | 245 (16) | 39 (7) |

13 | 809 (42) | 258 (22) | 75 (8) | 153 (18) | 196 (17) | 32 (3) | 399 (9) | 45 (4) |

This section describes the framework for random-effects IPD-MA modeling of risk factor (predictor finding) studies with a binary outcome. The framework distinguishes two sources of data: IPD and aggregate data (AD). IPD consist of patient-level factor values (covariates) and outcomes, whereas AD consist of study-level summaries such as the estimated log odds ratios and corresponding standard errors for the factor-outcome associations reported

In a two-stage method, the IPD are first analyzed separately in each study using an appropriate statistical method for binary outcome data. For example, when a single risk factor is of interest, the logistic regression model is:
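As a minimal sketch in assumed notation (the symbols are illustrative and need not match the authors' own), let $y_{ij}$ be the binary outcome and $x_{ij}$ the factor value of patient $j$ in study $i$. The first stage fits a separate logistic model in each study, and the second stage pools the resulting estimates $\hat\beta_i$, with standard errors $s_i$, under approximate normality:

```latex
\begin{align}
\text{Stage 1:}\quad & \operatorname{logit}\{\Pr(y_{ij}=1)\} = \alpha_i + \beta_i x_{ij}, \\
\text{Stage 2:}\quad & \hat\beta_i \sim N(\beta_i,\, s_i^2), \qquad \beta_i \sim N(\beta,\, \tau_\beta^2).
\end{align}
```

Here $\exp(\beta)$ is the summary odds ratio and $\tau_\beta$ the between-study standard deviation of the factor effect. This is the univariate form; a bivariate second stage would instead pool $(\hat\alpha_i, \hat\beta_i)$ jointly, using their within-study and between-study correlation.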

In a one-stage method, the IPD from all studies are modeled simultaneously whilst accounting for the clustering of subjects within studies. The one-stage IPD-MA framework is a (multilevel) logistic regression model with random effects. Different specifications are possible, as now described.
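One such specification, written as a sketch in assumed notation (not necessarily the authors' parameterisation), places random effects on both the intercept and the factor effect and allows them to correlate:

```latex
\begin{align}
y_{ij} &\sim \text{Bernoulli}(p_{ij}), \\
\operatorname{logit}(p_{ij}) &= (\alpha + a_i) + (\beta + b_i)\, x_{ij}, \\
\begin{pmatrix} a_i \\ b_i \end{pmatrix} &\sim
N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_\alpha^2 & \rho\,\tau_\alpha\tau_\beta \\
\rho\,\tau_\alpha\tau_\beta & \tau_\beta^2 \end{pmatrix} \right).
\end{align}
```

Fixing $\rho = 0$ assumes independent random effects, and replacing $\alpha + a_i$ by a separate fixed intercept $\alpha_i$ per study gives a stratified-intercept variant that avoids estimating $\rho$ altogether.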

The models described so far summarize unadjusted factor-outcome associations. Although these models are fairly straightforward to implement, it is well known that factor-outcome associations are often influenced by extraneous variables that render exposure groups incomparable. This situation may, for instance, arise when associations are estimated from cohort and cross-sectional studies (prognostic research) or when treatment-by-patient-characteristic interactions occur (intervention research). In addition, several authors have recommended that each factor should be studied for its incremental (causal or predictive) value beyond established risk factors

For the two-stage method, multivariable logistic regression models are estimated in each study:

The fully random-effects one-stage model with multiple risk factors is specified as follows:

Alternatively, a

Finally, it is possible to reduce the number of random effects by stratifying the intercepts and/or predictors for which a summary estimate is not of interest. For example, one-stage

Stratification on all confounders may, however, not always be feasible due to sample size constraints. For this reason, we generally recommend modeling separate intercept terms and assuming random effects for all predictor effects (thereby reducing model complexity at the cost of additional assumptions). The underlying rationale is that accurate estimates for confounding parameters are usually not required. Although this simplification may introduce bias in all parameter estimates, baseline risks are likely most affected because they capture all unexplained variation. A non-parametric modeling approach for the intercept terms may thus better accommodate model misspecification.

In the two-stage methods, the first-stage model (logistic regression in each study) is estimated using maximum likelihood (ML). In the second stage, the AD meta-analysis models are estimated using, for example, the method of moments (MOM) or restricted maximum likelihood (REML)
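The method-of-moments option for the second stage can be illustrated with a short sketch. It uses the DerSimonian-Laird moment estimator for a univariate random-effects meta-analysis, assumed here as one common MOM choice; the source does not state which implementation was used, and the function name and inputs are hypothetical.

```python
import math


def pool_random_effects_mom(estimates, std_errors):
    """Univariate random-effects pooling with a moment estimator of tau^2.

    estimates  : study-specific log odds ratios (first-stage output)
    std_errors : their standard errors
    Returns (pooled estimate, its standard error, tau^2).
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the moment estimator, truncated at zero.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2


# Hypothetical log odds ratios and standard errors from three studies.
pooled, se_pooled, tau2 = pool_random_effects_mom([0.2, 0.5, 0.9], [0.2, 0.25, 0.3])
```

REML replaces the closed-form moment step with an iterative likelihood maximisation, so the two can give different estimates of the between-study variance when few studies are available.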

One-stage methods involve the estimation of a mixed effects (multilevel) model which is often high dimensional

In this section, we illustrate the benefits, limitations and differences of one-stage and two-stage methods in the DVT data. For all case studies, in the two-stage models we used MLE in the first stage and MLE, REML or MOM in the second stage. For the one-stage models we used adaptive Gauss-Hermite Quadrature with 1 (Laplacian approximation) and 5 quadrature points.
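To make the role of quadrature points concrete, the sketch below approximates the marginal log-likelihood contribution of a single study under a random-intercept logistic model using ordinary (non-adaptive) 5-point Gauss-Hermite quadrature. It is a simplification: the analyses described here used adaptive quadrature, which recentres and rescales the nodes per study. The data, parameter values and function names are hypothetical.

```python
import math


def gauss_hermite_5():
    """Nodes and weights of 5-point Gauss-Hermite quadrature (weight exp(-x^2))."""
    root10 = math.sqrt(10.0)
    inner = math.sqrt((5.0 - root10) / 2.0)
    outer = math.sqrt((5.0 + root10) / 2.0)
    nodes = [-outer, -inner, 0.0, inner, outer]

    def hermite4(x):
        # Physicists' Hermite polynomial H_4.
        return 16.0 * x ** 4 - 48.0 * x ** 2 + 12.0

    # w_k = 2^(n-1) n! sqrt(pi) / (n^2 H_{n-1}(x_k)^2) with n = 5
    scale = 2 ** 4 * math.factorial(5) * math.sqrt(math.pi) / 25.0
    weights = [scale / hermite4(x) ** 2 for x in nodes]
    return nodes, weights


def log_expit(z):
    """Numerically stable log of the logistic function."""
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))


def study_marginal_loglik(alpha, beta, tau, cells, nodes, weights):
    """Log marginal likelihood of one study (binomial coefficients omitted).

    cells: iterable of (x, events, total), one entry per level of the factor.
    The random intercept a ~ N(0, tau^2) is integrated out numerically.
    """
    log_terms = []
    for node, weight in zip(nodes, weights):
        a = math.sqrt(2.0) * tau * node  # change of variable for N(0, tau^2)
        ll = math.log(weight / math.sqrt(math.pi))
        for x, events, total in cells:
            eta = alpha + a + beta * x
            ll += events * log_expit(eta) + (total - events) * log_expit(-eta)
        log_terms.append(ll)
    # Log-sum-exp keeps the sum stable for large studies.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


nodes, weights = gauss_hermite_5()
# Hypothetical study: 20/50 events with the factor, 10/60 without.
cells = [(1, 20, 50), (0, 10, 60)]
value = study_marginal_loglik(-1.5, 1.0, 0.5, cells, nodes, weights)
```

A full one-stage fit sums such contributions over studies and maximises over the model parameters; with correlated random intercepts and slopes the integral becomes two-dimensional, which is one reason more quadrature points can lead to convergence difficulties.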

In the first case study, we performed meta-analyses to estimate the

In the second case study, we performed meta-analyses to investigate the risk factor

For all models, we calculated

All models were implemented in R 2.15.1 using Linux Mint 14 Nadia (MATE 64-bit) and incorporated the packages

Results in

Risk Factor | Model | Estimation | β | S.E.(β) | τ | ρ | OR | 95% CI | 95% PI | p-value |

MLE | 2.76 | 0.15 | 0.30 | 0.52 | 15.86 | 11.73 to 21.45 | 6.98 to 36.06 | <0.001 | ||

REML | 2.78 | 0.17 | 0.33 | 0.28 | 16.10 | 11.64 to 22.27 | 6.48 to 40.00 | <0.001 | ||

MLE | 2.87 | 0.15 | 0.25 | 17.69 | 13.15 to 23.80 | 8.65 to 36.20 | <0.001 | |||

REML | 2.89 | 0.17 | 0.31 | 17.97 | 12.88 to 25.06 | 7.49 to 47.04 | <0.001 | |||

ddimdich (8 ) | 3 | MOM | 2.89 | 0.17 | 0.32 | 17.98 | 12.87 to 25.13 | 7.43 to 43.54 | <0.001 | |

3 | MLE 1QP | 2.87 | 0.15 | 0.28 | 0.07 | 17.70 | 13.14 to 23.86 | 8.08 to 38.78 | <0.001 | |

3 | MLE 5QP | 2.85 | 0.14 | 0.25 | 0.58 | 17.35 | 13.15 to 22.89 | 8.62 to 34.91 | <0.001 | |

4 | MLE 1QP | 2.88 | 0.15 | 0.29 | 17.79 | 13.15 to 24.07 | 8.00 to 39.56 | <0.001 | ||

4 | MLE 5QP | 2.85 | 0.19 | 0.41 | 17.37 | 12.06 to 25.01 | 5.74 to 52.55 | <0.001 | ||

5 | MLE 1QP | 2.92 | 0.11 | 0.00 | 18.55 | 15.01 to 22.93 | 14.24 to 24.17 | <0.001 | ||

5 | MLE 5QP | 2.92 | 0.11 | 0.00 | 18.46 | 14.94 to 22.81 | 14.17 to 24.04 | <0.001 | ||

MLE | 0.37 | 0.14 | 0.26 | −0.47 | 1.45 | 1.11 to 1.90 | 0.76 to 2.75 | 0.007 | ||

REML | 0.38 | 0.14 | 0.29 | −0.45 | 1.46 | 1.10 to 1.93 | 0.71 to 2.98 | 0.009 | ||

MLE | 0.33 | 0.13 | 0.23 | 1.38 | 1.06 to 1.80 | 0.76 to 2.51 | 0.016 | |||

REML | 0.33 | 0.14 | 0.27 | 1.39 | 1.05 to 1.84 | 0.71 to 2.73 | 0.020 | |||

MOM | 0.33 | 0.13 | 0.24 | 1.38 | 1.06 to 1.80 | 0.76 to 2.52 | 0.016 | |||

par (13 ) | 3 | MLE 1QP | 0.32 | 0.13 | 0.23 | −0.37 | 1.38 | 1.07 to 1.79 | 0.77 to 2.48 | 0.013 |

3 | MLE 5QP | |||||||||

4 | MLE 1QP | 0.29 | 0.13 | 0.21 | 1.33 | 1.03 to 1.71 | 0.78 to 2.27 | 0.026 | ||

4 | MLE 5QP | |||||||||

5 | MLE 1QP | 0.28 | 0.13 | 0.19 | 1.32 | 1.03 to 1.70 | 0.79 to 2.21 | 0.026 | ||

5 | MLE 5QP | |||||||||

MLE | 0.32 | 0.15 | 0.10 | 1.00 | 1.37 | 1.02 to 1.84 | 0.13 to 13.97 | 0.036 | ||

REML | 0.32 | 0.16 | 0.13 | 1.00 | 1.38 | 1.01 to 1.87 | 0.10 to 18.23 | 0.043 | ||

MLE | 0.30 | 0.14 | 0.00 | 1.35 | 1.03 to 1.77 | 0.23 to 7.87 | 0.030 | |||

REML | 0.44 | 0.28 | 0.39 | 1.55 | 0.90 to 2.66 | 0.00 to 664.30 | 0.115 | |||

MOM | 0.42 | 0.25 | 0.33 | 1.52 | 0.93 to 2.47 | 0.01 to 303.63 | 0.094 | |||

eryt (3 ) | 3 | MLE 1QP | 0.31 | 0.15 | 0.10 | 1.00 | 1.37 | 1.02 to 1.83 | 0.14 to 13.02 | 0.037 |

3 | MLE 5QP | 0.31 | 0.15 | 0.10 | 1.00 | 1.37 | 1.02 to 1.83 | 0.14 to 13.04 | 0.037 | |

4 | MLE 1QP | 0.33 | 0.17 | 0.14 | 1.39 | 1.01 to 1.92 | 0.09 to 22.31 | 0.046 | ||

4 | MLE 5QP | 0.33 | 0.17 | 0.14 | 1.39 | 1.01 to 1.93 | 0.09 to 22.62 | 0.046 | ||

5 | MLE 1QP | 0.30 | 0.14 | 0.00 | 1.35 | 1.03 to 1.77 | 0.23 to 7.80 | 0.029 | ||

5 | MLE 5QP | 0.30 | 0.14 | 0.00 | 1.35 | 1.03 to 1.77 | 0.23 to 7.80 | 0.029 | ||

MLE | 0.10 | 0.18 | 0.20 | −1.00 | 1.11 | 0.78 to 1.57 | 0.35 to 3.52 | 0.574 | ||

REML | 0.10 | 0.19 | 0.23 | −1.00 | 1.10 | 0.76 to 1.60 | 0.31 to 3.97 | 0.595 | ||

MLE | −0.02 | 0.15 | 0.00 | 0.98 | 0.73 to 1.31 | 0.52 to 1.86 | 0.898 | |||

REML | −0.02 | 0.15 | 0.00 | 0.98 | 0.73 to 1.31 | 0.52 to 1.86 | 0.898 | |||

MOM | −0.02 | 0.15 | 0.00 | 0.98 | 0.73 to 1.31 | 0.52 to 1.86 | 0.898 | |||

oachst (4 ) | 3 | MLE 1QP | 0.08 | 0.17 | 0.18 | −1.00 | 1.09 | 0.77 to 1.53 | 0.37 to 3.22 | 0.629 |

3 | MLE 5QP | |||||||||

4 | MLE 1QP | −0.03 | 0.15 | 0.00 | 0.98 | 0.73 to 1.31 | 0.51 to 1.85 | 0.866 | ||

4 | MLE 5QP | |||||||||

5 | MLE 1QP | −0.03 | 0.15 | 0.00 | 0.97 | 0.72 to 1.30 | 0.51 to 1.84 | 0.830 | ||

5 | MLE 5QP | |||||||||

MLE | 0.21 | 0.16 | 0.22 | 0.98 | 1.24 | 0.91 to 1.68 | 0.62 to 2.47 | 0.172 | ||

REML | 0.22 | 0.17 | 0.29 | 0.82 | 1.24 | 0.88 to 1.75 | 0.52 to 2.95 | 0.218 | ||

MLE | 0.26 | 0.15 | 0.15 | 1.29 | 0.97 to 1.72 | 0.75 to 2.23 | 0.078 | |||

REML | 0.26 | 0.16 | 0.23 | 1.29 | 0.94 to 1.78 | 0.63 to 2.65 | 0.116 | |||

MOM | 0.26 | 0.16 | 0.21 | 1.29 | 0.95 to 1.76 | 0.66 to 2.51 | 0.103 | |||

coag (7 ) | 3 | MLE 1QP | 0.19 | 0.16 | 0.23 | 1.00 | 1.20 | 0.88 to 1.64 | 0.59 to 2.46 | 0.241 |

3 | MLE 5QP | |||||||||

4 | MLE 1QP | 0.21 | 0.16 | 0.22 | 1.24 | 0.90 to 1.69 | 0.61 to 2.51 | 0.186 | ||

4 | MLE 5QP | |||||||||

5 | MLE 1QP | 0.22 | 0.13 | 0.01 | 1.25 | 0.97 to 1.61 | 0.90 to 1.74 | 0.083 | ||

5 | MLE 5QP |

Zero-cells occurred in two studies for factor
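In a two-stage analysis, a zero cell makes the study-specific log odds ratio undefined unless the 2×2 table is adjusted. The sketch below shows the usual 0.5 continuity correction for a single binary factor; the function name is hypothetical, and the example counts are taken from the descriptive table on the assumption that bracketed numbers are patients with DVT and that no factor values are missing.

```python
import math


def log_odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed,
                   correction=0.5):
    """Log odds ratio and standard error from a 2x2 table.

    Adds `correction` to every cell only when at least one cell is zero.
    """
    a = events_exposed
    b = n_exposed - events_exposed
    c = events_unexposed
    d = n_unexposed - events_unexposed
    if min(a, b, c, d) == 0:
        a, b, c, d = (cell + correction for cell in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return log_or, se


# Study 10, factor vein: 1 patient with the factor (0 with DVT), 436 patients (42 with DVT).
zero_cell = log_odds_ratio(0, 1, 42, 436 - 1)
# Study 1, factor ddimdich: 472 with the factor (117 with DVT), 1028 patients (131 with DVT).
regular = log_odds_ratio(117, 472, 131 - 117, 1028 - 472)
```

The corrected estimate carries a very large standard error, so the study receives little weight in the second stage. A one-stage model needs no such adjustment because it works with the binomial likelihood directly.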

As previously described, only the full one- and two-stage models (Model 1 & Model 3) estimate a parameter for the correlation between random effects. Results in

Risk factor | Model | Estimation | β | S.E.(β) | τ | OR | p-value |

MLE | 2.62 | 0.18 | 0.40 | 13.67 | <0.001 | ||

REML | 2.64 | 0.20 | 0.44 | 13.80 | <0.001 | ||

MLE | 2.67 | 0.15 | 0.25 | 14.48 | <0.001 | ||

REML | 2.69 | 0.17 | 0.33 | 14.75 | <0.001 | ||

C | MLE 1QP | 2.70 | 0.18 | 0.39 | 14.81 | <0.001 | |

ddimdich (10 ) | C | MLE 5QP | 2.70 | 0.18 | 0.40 | 14.83 | <0.001 |

D | MLE 1QP | 2.67 | 0.16 | 0.33 | 14.42 | <0.001 | |

D | MLE 5QP | 2.69 | 0.14 | 0.22 | 14.74 | <0.001 | |

E | MLE 1QP | 2.72 | 0.11 | 0.00 | 15.25 | <0.001 | |

E | MLE 5QP | 2.72 | 0.11 | 0.00 | 15.25 | <0.001 |

It is possible to avoid estimating correlation between random effects without assuming independence by using a stratified one-stage model, for example where a separate intercept is estimated for each study (Model 5) and, in the adjusted analyses, where predictors not of key interest are also stratified. Results indicate that the estimation of a separate intercept for each study (Model 5) tends to decrease the standard errors and between-study heterogeneity of factor-outcome associations (unless between-study correlations are +1 or −1). This, in turn, resulted in smaller prediction intervals for estimated odds ratios. For instance, the prediction interval for the unadjusted OR of
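The link between heterogeneity and the width of a prediction interval can be sketched as follows. The formula assumed here is the common approximation β ± t × sqrt(τ² + SE²), with t taken from a t-distribution on (number of studies − 2) degrees of freedom; it is an assumption rather than a documented detail of the analysis, and the caller supplies the critical value so that only the standard library is needed.

```python
import math


def prediction_interval_or(beta, se_beta, tau, t_crit):
    """Approximate prediction interval for the odds ratio in a new study.

    beta    : pooled log odds ratio
    se_beta : standard error of the pooled log odds ratio
    tau     : between-study standard deviation of the log odds ratio
    t_crit  : critical value of the t-distribution (caller-supplied)
    """
    half_width = t_crit * math.sqrt(tau ** 2 + se_beta ** 2)
    return math.exp(beta - half_width), math.exp(beta + half_width)


# ddimdich, univariate two-stage MLE row: beta = 2.87, SE = 0.15, tau = 0.25.
# With 8 studies, the 97.5th percentile of t on 6 degrees of freedom is about 2.447.
lower, upper = prediction_interval_or(2.87, 0.15, 0.25, 2.447)
```

Because τ enters the half-width directly, a model that estimates τ close to zero yields a prediction interval only slightly wider than the confidence interval, whereas a small number of studies inflates it through the t critical value.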

One-stage models were estimated with 1 and 5 quadrature points, and sometimes suffered from convergence problems (e.g.

We have described several random-effects IPD-MA models that implement a one-stage or two-stage method, where one desires to evaluate a potential causal (risk) factor or predictor of outcome. We detailed how they can be estimated and also extended to adjust for other factors. Despite the conventional belief that one-stage and two-stage methods yield similar conclusions

Thus, importantly the choice of IPD-MA method may actually influence the conclusions about which factors are thought to be risk factors. This makes it desirable to pre-specify in a study protocol what meta-analysis method will be used, to avoid unjustified post-hoc analyses being performed to achieve statistical significance. We generally recommend that the one-stage method should be used. This method models the exact binomial distribution of the data in each study, and does not require a continuity correction when (partial) separation occurs

Although we focused on IPD-MA of prognostic factors in this article, the two-stage methods can also be applied when only AD are available for the included studies. These methods are usually preferred because sharing of IPD is often unfeasible due to, for instance, confidentiality agreements. Results from our empirical example demonstrate that the full two-stage model, which when pooling the AD accounts for heterogeneity of baseline risk and risk factors and for their within-study and between-study correlation, tends to yield the results most consistent with the one-stage models. The full two-stage method is a bivariate meta-analysis, which, by additionally using the correlation between parameter estimates, is known to have benefits over a univariate meta-analysis

In summary, the choice of one-stage or two-stage method for performing a random-effects IPD-MA may influence the statistical identification of risk factors (predictors) for a binary outcome. When the number of studies in the meta-analysis is large and the number of events in each study is not small, we agree with Stewart
