A Case Study of Discordant Overlapping Meta-Analyses: Vitamin D Supplements and Fracture

Background: Overlapping meta-analyses on the same topic are now very common, and discordant results often occur. To explore why discordant results arise, we examined a common topic for overlapping meta-analyses: vitamin D supplements and fracture.

Methods and Findings: We identified 24 meta-analyses of vitamin D (with or without calcium) and fracture in a PubMed search in October 2013, and analysed a sample of 7 meta-analyses in the highest ranking general medicine journals. We used the AMSTAR tool to assess the quality of the meta-analyses, and compared their methodologies, analytic techniques and results. Applying the AMSTAR tool suggested the meta-analyses were generally of high quality. Despite this, there were important differences in trial selection, data extraction, and analytical methods that were only apparent after detailed assessment. In total, 25 trials were included in at least one meta-analysis. Four meta-analyses included all eligible trials according to the stated inclusion and exclusion criteria, but the other 3 meta-analyses "missed" between 3 and 8 trials, and 2 meta-analyses included apparently ineligible trials. The relative risks used for individual trials differed between meta-analyses for total fracture in 10 of 15 trials, and for hip fracture in 6 of 12 trials, because of different outcome definitions and analytic approaches. The majority of differences (11/16) led to more favourable estimates of vitamin D efficacy compared to estimates derived from unadjusted intention-to-treat analyses using all randomised participants. The conclusions of the meta-analyses were discordant, ranging from strong statements that vitamin D prevents fractures to equally strong statements that vitamin D without calcium does not prevent fractures.

Conclusions: Substantial differences in trial selection, outcome definition and analytic methods between overlapping meta-analyses led to discordant estimates of the efficacy of vitamin D for fracture prevention. Strategies for conducting and reporting overlapping meta-analyses are required, to improve their accuracy and transparency.


Introduction
The number of meta-analyses published in recent years has dramatically increased [1,2]. Partly, this is because systematic reviews and meta-analyses are considered the highest level of evidence, but it is also relatively easy to undertake and publish a meta-analysis [3]. However, many meta-analyses are not novel, and either reproduce or extend earlier analyses on the same topic, i.e. they are overlapping. In a random sample of meta-analyses that were published in 2010 and included randomised trials, 67% had at least one other overlapping meta-analysis [3]. Overlapping meta-analyses may report discordant results and conclusions, particularly as the number of such analyses increases. The consequences of this include contradictory recommendations for clinical practice, confusion amongst clinicians and their patients, and public disenchantment with clinical science.
Discordant meta-analyses have been reported previously for a variety of interventions [4][5][6][7][8], and recommendations for assessing such meta-analyses are available [9]. These recommendations focus on the methods and quality of the review, both of which have become much more standardised since the recommendations were proposed. The effect of vitamin D supplements on fracture is the subject of a large number of meta-analyses [10]. In 2012, an individual patient data meta-analysis was the 21st meta-analysis published on this topic, but identified only 14 relevant randomised controlled trials [11]. Two recent clinical guidelines on vitamin D [12,13], based on meta-analyses of the same clinical trials by independent groups, reached very different conclusions [14,15]. As a case study of overlapping meta-analyses, we conducted a detailed review of meta-analyses of vitamin D and fracture. We investigated differences between the meta-analyses by applying recommendations for comparing discordant overlapping meta-analyses. We focused on the quality and methodology of the meta-analyses with regard to trials included, trial data utilised, analytic approaches, and conclusions, and considered the implications these differences have for clinical practice, interpretation of existing meta-analyses and performance of future analyses.

Ethics statement
Ethical approval was not required for this work.
We searched PubMed in October 2013 for systematic reviews and meta-analyses of vitamin D with fracture as an outcome (S1 Appendix). We identified 24 meta-analyses, and analysed the most recent meta-analysis in each of the highest ranking general medical journals (Ann Intern Med, BMJ, Cochrane Database Syst Rev, JAMA, JAMA Intern Med, Lancet, NEJM). We chose this sample of meta-analyses because they are likely to have been conducted to the highest standard, as well as being the most closely scrutinised during peer review and post-publication. Thus, we analysed 5 trial-level and 2 patient-level meta-analyses on the effect of vitamin D with or without calcium on fracture [16][17][18][19][20][21][22].
Jadad and colleagues recommended assessing discordant systematic reviews in 6 domains: the clinical question asked, study selection and inclusion, data extraction, study quality, ability to combine studies, and statistical methods [9]. Following this approach, the quality of each meta-analysis was assessed using the AMSTAR tool [23], and the inclusion and exclusion criteria, endpoints, trials included, and data on hip and total fracture outcomes for each contributing study were extracted by one author (MB) and checked by a second (AG). Differences were resolved by consensus. Some trials reported data for non-vertebral fracture rather than total fracture. In this situation, we used data for non-vertebral fracture when total fracture data were not available. Only one meta-analysis described the reasons for exclusion of individual trials in detail [18]. For each of the other meta-analyses, we assessed whether trials that were not included in the meta-analysis were eligible for inclusion according to the published inclusion and exclusion criteria, and tried to determine why the trial was not included. Where data for the efficacy of vitamin D on fracture outcomes for a trial differed between meta-analyses, we tried to determine the reason. The recommended approach to analysis of a randomised controlled trial is an unadjusted intention-to-treat analysis using all randomised participants with data from the final study timepoint [24]. We considered the result from this approach to be the best estimate of treatment efficacy.
An analysis restricted to those participants who completed the study was termed a "per-protocol analysis." Finally, we compared the conclusions of the meta-analyses, with each author independently rating the strength of the conclusions toward the use of vitamin D supplementation to prevent fracture on a three point scale (positive/mixed/negative toward vitamin D supplementation) and on a scale from 1 to 5 (1 most negative, 5 most positive toward vitamin D supplementation). These ratings were based solely on the conclusions of the meta-analysis, and did not consider the data or analyses used in the meta-analyses.

Results
Table 1 shows the characteristics of the 7 meta-analyses and the trials included in each meta-analysis. The number of included trials in each meta-analysis ranged from 7 to 20. Of the 25 trials included in any of the meta-analyses, 6 were included in only 1 meta-analysis, 3 in 2 meta-analyses, 2 in 3 meta-analyses, 3 in 4 meta-analyses, 7 in 5 meta-analyses, and 4 in 6 meta-analyses. No trial was included in all meta-analyses. The number of trials that met criteria for inclusion but were not included in each meta-analysis ranged from 0 to 8 trials: 4 meta-analyses included all trials that met the eligibility criteria [16,17,18,20], with 2 meta-analyses "missing" 3 trials [19,22], and 1 missing 8 trials [21]. Of the 8 trials that were missing from at least 1 meta-analysis, 4 were missed in 1 meta-analysis, 2 in 2 meta-analyses, and 2 in 3 meta-analyses. Two meta-analyses [16,17] included 1 and 3 trials, respectively, that did not appear to meet the stated eligibility criteria (Table 1). In both cases, other trials were not included in the meta-analyses despite having similar design to the included trials that appeared ineligible.

Table 1. Characteristics of 7 included meta-analyses and trials included or excluded in each meta-analysis.
Meta-analyses (columns): Bischoff-Ferrari, 2005 [16]; Tang, 2007 [17]; Bischoff-Ferrari, 2009 [19]; Avenell, 2009 [18]; DIPART, 2010 [20]; Chung, 2011 [21]; Bischoff-Ferrari, 2012 [22]. Row headings include: Level of data.
X = study not included in meta-analysis. Reasons for non-inclusion: NDA, eligible for inclusion but no patient-level data available; agent, did not compare vitamin D plus calcium with placebo; size, study smaller than inclusion criteria allowed; age, age outside inclusion criteria; uncertain, unknown reason for exclusion; duration, duration of study less than inclusion criteria allowed; design, design did not meet inclusion criteria; controls, untreated control group; date, after search date; IM, intramuscular administration.

Table 2 shows the quality assessment of each meta-analysis. Generally, the meta-analyses were of high quality, although none of the meta-analyses reported every AMSTAR item, and some of the methods used in 3 meta-analyses [19,21,22] were of uncertain validity. Reporting of AMSTAR items was less common in the 2 patient-level meta-analyses [20,22].

Tables 3 and 4 show data from each trial used in each meta-analysis for hip fracture and total fracture, respectively. For hip fracture, the relative risk differed between meta-analyses for 6 of 12 trials for which data were reported in two or more meta-analyses, and for total fracture, the relative risk differed between meta-analyses in 10 of 15 trials. Tables 3 and 4 show the reasons for the differences in relative risks between the meta-analyses, which are summarised in Table 5. Many of the differences arose when results obtained using analyses other than the recommended approach (unadjusted intention-to-treat analysis of all participants with data from the final study timepoint) were used in a meta-analysis, with the most common example being the use of a per-protocol analysis. The majority of the differences (4 of 6 for hip fracture, 7 of 10 for total fracture) led to more favourable estimates of the efficacy of vitamin D on hip or total fracture being used for individual trials than if the recommended approach was applied. In general, the Cochrane review [18] was most likely to use the recommended approach, and the relative risks used for each study in that review are the most conservative.
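The inclusion and omission counts reported for Table 1 can be cross-checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative, and the total number of trial-by-meta-analysis inclusions is merely what those counts imply, not a figure reported in the paper.

```python
# Number of trials (value) included in exactly k of the 7 meta-analyses (key),
# as stated in the text. No trial was included in all 7.
trials_by_inclusion_count = {1: 6, 2: 3, 3: 2, 4: 3, 5: 7, 6: 4, 7: 0}

total_trials = sum(trials_by_inclusion_count.values())
# Each trial contributes one inclusion per meta-analysis that used it.
implied_inclusions = sum(k * n for k, n in trials_by_inclusion_count.items())

# Eligible-but-omitted trials, counted per meta-analysis:
# 4 meta-analyses omitted none, 2 omitted 3 trials each, 1 omitted 8.
omitted_per_meta_analysis = [0, 0, 0, 0, 3, 3, 8]
# The same omissions counted per trial:
# 4 trials omitted once, 2 omitted twice, 2 omitted three times.
trials_by_omission_count = {1: 4, 2: 2, 3: 2}

omissions_by_meta_analysis = sum(omitted_per_meta_analysis)
omissions_by_trial = sum(k * n for k, n in trials_by_omission_count.items())

print("trials:", total_trials)
print("implied inclusions across 7 meta-analyses:", implied_inclusions)
print("omissions counted by meta-analysis:", omissions_by_meta_analysis)
print("omissions counted by trial:", omissions_by_trial)
```

Both ways of counting the omissions give the same total, so the two descriptions of "missed" trials in the text are internally consistent.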
The differences between meta-analyses in relative risks for individual trials were most prominent for the total or non-vertebral fracture endpoint. Non-vertebral fracture has commonly been reported in trials, but is often used interchangeably with total fracture. Some meta-analyses adopted this approach [17], whereas others carried out separate analyses for total fracture and for non-vertebral fracture [18]. In several meta-analyses, there were inconsistent approaches to handling data (Table 3). Participants with hip fracture were added to those with all non-vertebral fracture for 2 trials [25,31] in one meta-analysis [17], and for one of these two trials [31] in two meta-analyses [16,19], effectively counting hip fractures twice. In 2 meta-analyses [16,19], the primary endpoint was non-vertebral fracture but, when this endpoint was not reported, the authors used different estimates of non-vertebral fractures for different individual trials. Thus Table 4 shows that for different trials the authors used total fracture; total fracture [...] where only subsets of fractures were used (hip fracture, or hip/wrist/forearm fracture), data on total fractures were available. One meta-analysis utilised one fracture endpoint for each study determined hierarchically in descending order from total fracture, hip fracture, and non-vertebral fracture [21]. For the resulting meta-analysis, total fracture was used for 10 trials, hip fracture for 4 trials, and non-vertebral fracture for 2 trials. Hip fracture is only a small subset of total fracture, and for all 4 trials where hip fracture was used, data on the broader endpoint of non-vertebral fracture were used in other meta-analyses.

Table 6 shows the conclusions from the meta-analyses. Some of the conclusions differ substantially. For example, in 3 meta-analyses Bischoff-Ferrari and colleagues conclude that higher doses but not lower doses of vitamin D prevent fractures [16,19,22], whereas 3 other meta-analyses concluded that vitamin D, used without calcium supplements, does not prevent fractures, regardless of the dose [18,20,21]. Table 6 shows that our assessment of the strength of the conclusions in favour of vitamin D supplements ranged from mixed to strongly positive, with a median score of ≥4 for 6 of the 7 meta-analyses. The meta-analysis that most closely adhered to the recommended approach to analysis of a randomised controlled trial and fulfilled the most items in the AMSTAR tool had the least positive conclusion [18].

Notes to Table 2 (quality assessment of each meta-analysis):
Data combined appropriately using random-effects models, but studies grouped according to received dose (treatment dose * adherence). The advisability and validity of this approach is uncertain.
c Data combined appropriately using random-effects models, but data for hip fracture were used for 4/16 trials when the primary endpoint assessed was total fracture.
d Data combined appropriately using Cox proportional-hazards models, but method of assessment of vitamin D intake differed between treatment and control groups.
e Partly funded by a manufacturer of vitamin D supplements.

Notes to Tables 3 and 4 (trial data used in each meta-analysis for hip fracture and total fracture):
Study excluded: study met meta-analysis exclusion criteria. Study not included: study did not meet exclusion criteria but not included in meta-analysis. Data not included: data not included in meta-analysis but available from other meta-analyses or primary publication.
Abbreviations: Vit D, vitamin D; RR, relative risk; CI, confidence interval; NS, not stated; CaD, co-administered calcium and vitamin D.
Bold text indicates differences in relative risks for individual studies between meta-analyses.
a Factorial/multi-arm studies permitting multiple comparisons between randomised groups.
b Data are for total fracture unless otherwise indicated. Individual trial data for fracture were not reported in DIPART 2010 [20] or Bischoff-Ferrari 2012 [22]. No data on total or non-vertebral fracture were reported in any meta-analysis for Bischoff 2003 [33] and Bischoff-Ferrari 2010 [48].
Reasons for differences in reported data between meta-analyses: [...]

Table 5. Reasons for differences in results between meta-analyses, and effects on estimate of efficacy of vitamin D on fracture.
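The effect of adding participants with hip fracture to those with non-vertebral fracture, as described in the Results, can be made concrete with a small worked example. The trial below is entirely hypothetical (the counts and the `relative_risk` helper are assumptions for illustration, not data from any included trial); it shows only that counting hip fractures twice changes the relative risk whenever hip fractures are distributed differently between arms than other fractures.

```python
def relative_risk(events_treat, n_treat, events_ctrl, n_ctrl):
    """Risk in the treatment arm divided by risk in the control arm."""
    return (events_treat / n_treat) / (events_ctrl / n_ctrl)

# Hypothetical trial with 1000 participants per arm (illustrative numbers only).
n_per_arm = 1000
nonvertebral = {"treatment": 100, "control": 110}  # already includes hip fractures
hip = {"treatment": 20, "control": 30}             # subset of the counts above

# Correct handling: hip fractures are already part of non-vertebral fractures.
rr_correct = relative_risk(
    nonvertebral["treatment"], n_per_arm, nonvertebral["control"], n_per_arm
)

# Double counting: hip fractures are added on top of non-vertebral fractures.
rr_double_counted = relative_risk(
    nonvertebral["treatment"] + hip["treatment"], n_per_arm,
    nonvertebral["control"] + hip["control"], n_per_arm,
)

print(f"RR, non-vertebral fracture:        {rr_correct:.3f}")
print(f"RR, hip fractures counted twice:   {rr_double_counted:.3f}")
```

In this made-up example the double-counted estimate looks more favourable because the hypothetical hip fracture counts favour treatment more than the other fractures do; with a different distribution the shift could go the other way.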

Discussion
Among overlapping meta-analyses of vitamin D and fracture, there were substantial differences in the trials included, the data used from each trial, the analytical approach adopted, and the conclusions drawn, despite the meta-analyses being of high quality and published in the highest ranking medical journals. Only 4 meta-analyses included all eligible trials, with the number of "missed" trials ranging from 3 to 8 in the other 3 meta-analyses. Two meta-analyses included trials that did not appear to meet eligibility criteria, while excluding other trials of similar design. The relative risks used for individual trials varied between meta-analyses, with differences being more common for total fracture (67% of trials) than for hip fracture (50% of trials). The differences in relative risks led to more favourable estimates of the efficacy of vitamin D compared to analyses using recommended analytic approaches on 11/16 (69%) occasions. The conclusions of the meta-analyses were discordant, ranging from strong statements that vitamin D prevents fractures to equally strong statements that vitamin D used without calcium does not prevent fractures. All meta-analyses were favourable toward prescribing of vitamin D for fracture prevention, although in some meta-analyses the recommendations were restricted to certain subgroups, or to co-administration of vitamin D with calcium supplements. The reasons for the differences between the meta-analyses for the trials included and the data used (Table 5) were often not readily apparent. An explanation as to why trials were not included was provided in only one meta-analysis. Fracture data for 3 of the trials included in the Cochrane review were unpublished and obtained for that review [18]. It is not clear whether the authors of other meta-analyses sought these unpublished data, or why, once published in the Cochrane review, they were not included in later meta-analyses. On a similar note, the Cochrane review authors clarified ambiguous reporting of treatment group numbers in the primary publication for one study with the lead author of the study [31] (Tables 3 and 4), whereas in the other meta-analyses, incorrect denominators for both treatment groups were used.
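The proportions quoted in this paragraph follow directly from the counts given in the Results. As a quick check, the sketch below recomputes them; the dictionary and function names are illustrative only.

```python
# (trials whose relative risk differed between meta-analyses, trials compared)
differing = {"hip fracture": (6, 12), "total fracture": (10, 15)}
# (differences giving a more favourable estimate, all differences)
favourable = {"hip fracture": (4, 6), "total fracture": (7, 10)}

def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

for endpoint, (n_differing, n_trials) in differing.items():
    print(f"{endpoint}: {n_differing}/{n_trials} = {percent(n_differing, n_trials)}%")

favourable_total = sum(n for n, _ in favourable.values())
differences_total = sum(d for _, d in favourable.values())
print(
    f"more favourable: {favourable_total}/{differences_total} = "
    f"{percent(favourable_total, differences_total)}%"
)
```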
The reasons for the differences in relative risks between meta-analyses can only be deduced by detailed, careful examination of the meta-analyses and the primary publications. One trial-level meta-analysis did not report the number of participants with fracture or the number of participants in each treatment group for individual trials [17] and neither patient-level analysis [20,22] reported these data or relative risks for individual trials. The absence of this information limits verification of the accuracy of the data and analyses undertaken. For patient-level analyses where data are censored at an earlier timepoint [20], this information is very important because the results at the earlier timepoint may differ from those of the overall trial. For example, in a patient-level meta-analysis of vitamin D and mortality [51], data for the Women's Health Initiative trial [41] were censored at 3 years, restricting the number of deaths to about 25% of those occurring in the trial, and providing a more favourable effect estimate than that for the entire follow-up. Several meta-analyses used data from per-protocol analyses for individual studies. The recommended analysis for a randomised controlled trial is an unadjusted intention-to-treat analysis including all available data from all randomised participants [24]. We think the same principle applies in meta-analyses of randomised controlled trials. When only per-protocol data are reported for trials, the Cochrane handbook recommends performing sensitivity analyses to explore differences between results from intention-to-treat approaches (that assume participants lost to follow-up did not have an event) and results using per-protocol data [52]. None of the meta-analyses performed such sensitivity analyses. None of the meta-analyses provided sufficient details to permit a reader to understand if data from individual trials could be incorporated in the meta-analysis in different ways (such as using all fractures versus using only low-trauma fractures), and none compared their handling of the data with previous meta-analyses.
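The kind of sensitivity analysis described above can be sketched for a single trial that reports only per-protocol data. The counts below are hypothetical and the function name is an assumption for illustration; the sketch simply contrasts a per-protocol relative risk (completers as denominators) with an intention-to-treat style estimate in which all randomised participants form the denominators and those lost to follow-up are assumed not to have had an event.

```python
def relative_risk(events_treat, n_treat, events_ctrl, n_ctrl):
    """Risk in the treatment arm divided by risk in the control arm."""
    return (events_treat / n_treat) / (events_ctrl / n_ctrl)

# Hypothetical trial reporting fractures only among completers (illustrative numbers).
randomised = {"treatment": 1000, "control": 1000}
completers = {"treatment": 800, "control": 900}
fractures_in_completers = {"treatment": 60, "control": 88}

# Per-protocol: denominators are participants who completed the study.
rr_per_protocol = relative_risk(
    fractures_in_completers["treatment"], completers["treatment"],
    fractures_in_completers["control"], completers["control"],
)

# Intention-to-treat style: denominators are all randomised participants,
# assuming no fractures among those lost to follow-up.
rr_itt_assumed = relative_risk(
    fractures_in_completers["treatment"], randomised["treatment"],
    fractures_in_completers["control"], randomised["control"],
)

print(f"per-protocol RR:                       {rr_per_protocol:.3f}")
print(f"ITT RR (lost assumed event-free):      {rr_itt_assumed:.3f}")
```

When the two estimates diverge, as they do in this made-up example because drop-out differs between arms, reporting both lets a reader judge how much the pooled result depends on the choice.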
The methodological differences between meta-analyses influenced the conclusions drawn from them. Each of the 3 meta-analyses that considered trials of vitamin D with calcium supplements separately from trials of vitamin D [18,20,21] concluded that vitamin D alone does not prevent fractures, regardless of dose. However, the 3 meta-analyses by Bischoff-Ferrari and colleagues that assessed vitamin D with or without calcium concluded that higher doses of vitamin D prevent fractures [16,19,22]. We have several reservations about the conclusions of these 3 meta-analyses. As highlighted in Tables 3 and 4, these meta-analyses used more favourable effect estimates for vitamin D for individual trials than estimates obtained using unadjusted intention-to-treat analysis of all participants with data from the final study timepoint. Most trials categorised as high dose vitamin D studied co-administered calcium and vitamin D, but the benefits were attributed to vitamin D. Three large trials of high dose vitamin D [40,45,50] were excluded or only included in secondary analyses because of their study design, but all had relative risks for fracture with vitamin D greater than 1, essentially excluding clinically significant benefits on fracture prevention for vitamin D (Tables 1, 3 and 4). Finally, 2 of these meta-analyses made questionable assumptions about received vitamin D doses and focused on treatment adherence analyses [19,22], methodology that has been criticised [53]. In our view, the Cochrane review [18] is the most detailed and comprehensive, receives the highest rating using the AMSTAR tool for quality assessment, includes the broadest range of trials, and utilises the recommended intention-to-treat approach with the most conservative effect estimates. We think the meta-analyses in the Cochrane review are the most reliable with the greatest external validity, i.e. the results are most generalisable to the wider population.
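How the choice of per-trial data feeds through to a pooled estimate can be illustrated with a minimal random-effects calculation. The sketch below uses the DerSimonian-Laird estimator, which is one common random-effects method and is an assumption here (the meta-analyses reviewed are described only as using random-effects or Cox models). All trial counts are hypothetical, and the only difference between the two inputs is that one trial contributes per-protocol rather than intention-to-treat data.

```python
import math

def log_rr_and_variance(events_t, n_t, events_c, n_c):
    """Log relative risk and its approximate (large-sample) variance."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance

def dersimonian_laird(effects, variances):
    """Pooled effect, its standard error, and tau^2 (between-trial variance)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, math.sqrt(1 / sum(re_weights)), tau2

def pooled_rr(trials):
    """Pooled relative risk for (events_t, n_t, events_c, n_c) tuples."""
    effects, variances = zip(*(log_rr_and_variance(*t) for t in trials))
    pooled_log_rr, _, _ = dersimonian_laird(effects, variances)
    return math.exp(pooled_log_rr)

# Hypothetical trials: (fractures, participants) for treatment, then control.
itt_data = [(90, 1000, 100, 1000), (50, 600, 52, 600), (200, 2500, 205, 2500)]
# Same trials, but the first contributes per-protocol data instead.
mixed_data = [(60, 800, 88, 900), (50, 600, 52, 600), (200, 2500, 205, 2500)]

rr_itt = pooled_rr(itt_data)
rr_mixed = pooled_rr(mixed_data)
print(f"pooled RR, ITT data for all trials:      {rr_itt:.3f}")
print(f"pooled RR, one per-protocol substitution: {rr_mixed:.3f}")
```

In this illustration a single substituted trial is enough to move the pooled relative risk further below 1, which is the direction of the shift the review reports for most of the differences it identified.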
We followed the approach recommended in 1997 for assessment of discordance amongst systematic reviews [9]. The widespread use of checklists for reporting meta-analyses, such as the PRISMA checklist [54], means that later meta-analyses have become more standardised and of higher quality. This is reflected in Table 2, which shows the high level of reporting of AMSTAR items used to assess meta-analysis quality. However, despite the apparent high quality of these meta-analyses, there were important differences in 3 of the domains that give rise to discordant meta-analyses [9]: study selection and inclusion, data extraction, and statistical methods. The differences are not readily apparent unless each meta-analysis is scrutinised in considerable detail. Thus, it is very likely that the casual reader, and even an expert reviewer, will not notice the differences or understand why the results of overlapping meta-analyses differ. Many of the methodological differences are based on subjective decisions made by the authors. Since different researchers make different judgements on these methodological issues, their decisions and the reasoning behind the decisions should be reported. In Table 7, we propose additions to guidelines for the reporting of overlapping meta-analyses to facilitate their interpretation. They may also decrease redundant overlapping meta-analyses [3], by requiring authors to clearly identify previous publications and make apparent what the new meta-analysis adds to existing knowledge. An important limitation of our analysis is that it is limited to a single topic and a sample of meta-analyses published in high-impact journals, but it seems likely that the weaknesses in methodology and reporting we found will be present in other overlapping meta-analyses.
There are important clinical consequences arising from discordant conclusions from overlapping meta-analyses. They engender confusion among clinicians and patients, and foster public disenchantment with biomedical research, exemplified by the statement often used in the media that "the experts can't make up their minds". Another specific possibility is that patients taking vitamin D supplements in the hope of preventing fractures might be falsely reassured that they are improving their skeletal health by the reporting of positive meta-analyses with methodological weaknesses or limited generalisability.
In summary, this detailed review reveals substantial differences between overlapping meta-analyses of vitamin D and fracture published in the highest ranking general medical journals, despite all meta-analyses generally being assessed as high quality using the AMSTAR tool. The reasons for these differences were often not readily apparent, but the differences led to more favourable estimates of the efficacy of vitamin D compared to estimates obtained using recommended analytic approaches. From this specific example, it is possible to propose additional guidelines for reporting meta-analyses, in order to create greater accuracy and transparency, especially amongst overlapping meta-analyses that report discordant results.
Supporting Information
S1 Appendix. List of 24 meta-analyses or systematic reviews of vitamin D and fracture identified in October 2013 search of PubMed using the terms vitamin D; fracture or osteoporosis; systematic review or meta-analysis. doi:10.1371/journal.pone.0115934.s001 (DOCX)

Table 7. Suggestions for improved reporting of overlapping meta-analyses.
Abstract: State article is an overlapping meta-analysis