Distribution and Epidemiological Characteristics of Published Individual Patient Data Meta-Analyses

Background Individual patient data meta-analyses (IPDMAs) are regarded as the gold standard in clinical evaluations. We investigated the distribution and epidemiological characteristics of published IPDMA articles. Methodology/Principal Findings IPDMA articles were identified through comprehensive literature searches of PubMed, Embase, and the Cochrane Library. Two investigators independently conducted article identification, data classification and extraction. Data related to the article characteristics were collected and analyzed descriptively. A total of 829 IPDMA articles indexed until 9 August 2012 were identified. The number of published IPDMA articles grew by an average of 3.7 articles per year. Malignant neoplasms (267 [32.2%]) and circulatory diseases (179 [21.6%]) were the most frequently occurring topics. Each IPDMA article included a median of 8 studies (interquartile range, IQR 5 to 15) involving a median of 2,563 patients (IQR 927 to 8,349). Among the 829 IPDMA articles, 229 (27.6%) did not perform a systematic search to identify related studies. In total, 207 (25.0%) sought and included individual patient data (IPD) from the “grey literature”. Only 496 (59.8%) successfully obtained IPD from all identified studies. Conclusions/Significance The number of IPDMA articles has increased over the past few years, and these articles mainly focus on cancer and circulatory diseases. Our data indicate that literature searches, the inclusion of grey literature, and data availability were inconsistent among IPDMA articles, so biases may arise. Thus, decision makers should not uncritically accept all IPDMAs.


Introduction
Meta-analysis is a crucial tool in evidence-based medicine because it quantitatively combines results from relevant studies on specific clinical topics, such as treatment effectiveness [1,2]. Meta-analysis produces results with increased statistical power and minimized bias by integrating data from different studies [3]. Clinicians, treatment guideline developers and medical policy makers often use up-to-date high-quality meta-analyses to support clinical strategies [4].
Meta-analyses are conducted using either aggregate data (AD) or individual patient data (IPD) [5,6]. AD meta-analyses (ADMAs) are based on group-level results of studies [7], whereas IPD meta-analyses (IPDMAs) collect and integrate individual data from the researchers of the original primary studies [8]. IPDMA is generally believed to have advantages over ADMA because IPDMA applies consistent inclusion and exclusion criteria across IPD, thus increasing data sensitivity and specificity with detailed data analysis [2,9,10]. Therefore, IPDMA is considered the gold standard in meta-analyses [5,11].
The number of meta-analyses has increased significantly over the past few years, and most meta-analyses were ADMAs [12]. Many studies have documented the characteristics of ADMA articles, such as publication year, study design, and number of studies included [4,13]. However, studies on prevailing trends and epidemiological characteristics of IPDMA articles are relatively few and are still based on several convenience samples of IPDMAs [14–17]. Meta-analysts and clinicians may be unaware of the general trends, prevailing distributions, qualities, and epidemiological characteristics of published IPDMA articles in their respective fields. Moreover, detailed information on the data identification and collection process is required because such information may affect the completeness of the data [16,18].
This work investigates the distribution and epidemiological characteristics of IPDMA articles indexed until 9 August 2012. This survey on published IPDMA articles may provide important epidemiological information for meta-analytical researchers and clinicians of evidence-based medicine.

Definition
An article was classified as an IPDMA article if it stated that individual-level data across multiple studies were collected and pooled from original studies.

Search
We developed a search strategy that combines IPD keywords and the five balanced search terms of Montori to search for IPDMA articles [19]. PubMed, Embase and the Cochrane Library were searched. The detailed search strategy is given in File S1. In addition, the search strategy developed by Riley et al. [5] was used to identify additional eligible IPDMA articles. We also screened the reference lists of all potentially included full-text IPDMAs. No limitation was placed on the year of publication so as to increase the search sensitivity. The latest search was conducted on 9 August 2012.

Eligibility of IPDMA Articles
No restriction was placed on disease types under investigation or study design. Methodological articles, review protocols and review overviews were excluded. Conference abstracts for which full text articles could not be retrieved were excluded. Non-English articles were also excluded. The most recent articles were included in case of obvious duplication.

Screening
Two authors independently assessed potentially relevant articles for eligibility. The decision on possible inclusion or exclusion of a study was initially based on the study title, abstract, and then on the full text of articles. Disagreement between the two researchers was resolved by consensus or by consulting a third reviewer if a consensus was not reached.

Data Extraction and Classification
Data on the epidemiological characteristics of all IPDMA articles were extracted using a form comprising 17 questions, covering items such as publication year, number of included studies and patients, how reviewers identified the studies, and what proportion of requested studies actually provided raw data.
Journals that published IPDMA articles were classified by subject category and by impact factors, according to the Thomson Reuters (ISI) Web of Knowledge in 2011 [20]. Impact factors of the journals were divided into three groups: ≥10, ≥5 but <10, and <5. The funding sources were classified into five categories as follows: no funding, non-profit sources (such as government or universities), profit sources (such as pharmaceutical companies), mixed, and unclear. The focus of IPDMA articles was classified into three categories according to the following primary objectives: therapeutic (IPDMA articles studied the effect of treatment or prevention of specific diseases or health conditions) [4], prognosis [17], and others. Diseases cited in IPDMA articles were classified according to the 10th Revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10) [21]. Studies in IPDMA articles were classified into three categories according to the following methodological design: randomized controlled trials (RCTs), observational studies (cohort or case control studies, or mixed), and others.
Classification and data extraction were independently conducted by two investigators. Discrepancies were resolved through consensus or by consulting a third reviewer if the two investigators failed to reach a consensus.

Statistical Analysis
Descriptive analyses were conducted. Collected data were summarized based on frequencies, median, and interquartile range (IQR). All analyses were conducted using SPSS (version 18.0 for Windows).
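As an illustration of the descriptive summaries mentioned above, the following minimal Python sketch computes a median and interquartile range. It is not the SPSS procedure used in the study: the data are hypothetical, the function name `summarize` is an assumption, and quartile conventions differ between software packages, so values may not match SPSS output exactly.

```python
from statistics import median, quantiles


def summarize(values):
    """Return (median, first quartile, third quartile) for a list of numbers."""
    # method="inclusive" treats the data as the whole population of interest;
    # other packages (including SPSS) may use a different quartile definition.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical numbers of included studies per article (not the paper's data).
studies_per_article = [3, 5, 6, 8, 8, 10, 15, 22, 40]

med, q1, q3 = summarize(studies_per_article)
print(f"median {med} (IQR {q1} to {q3})")
```

The same function could be applied to any count extracted per article, such as the number of patients or authors.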

Search
The flowchart of the literature search for IPDMA articles is shown in Figure 1. The initial search identified 12,700 citations from PubMed, Embase and the Cochrane Library. A total of 664 abstracts were considered potentially eligible after screening the titles and/or abstracts. The search based on the strategy of Riley et al. yielded 313 additional eligible abstracts [5]. A total of 977 abstracts were evaluated further. However, only 837 full texts were retrieved after 140 conference abstracts were excluded. Screening the reference lists of the 837 full texts identified 26 additional potentially eligible studies. Eventually, 34 articles were excluded after further scrutiny. The final count of eligible IPDMA articles included in the study was 829 (the list of the 829 IPDMA articles is given in File S2).
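The counts in the selection flow can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this paragraph; the variable names are illustrative and do not come from the article.

```python
# Counts reported in the text for the article selection flow.
database_abstracts = 664            # potentially eligible after title/abstract screening
riley_abstracts = 313               # additional abstracts from the Riley et al. strategy
conference_abstracts_excluded = 140
from_reference_lists = 26
excluded_after_scrutiny = 34

abstracts_evaluated = database_abstracts + riley_abstracts
full_texts_retrieved = abstracts_evaluated - conference_abstracts_excluded
final_included = full_texts_retrieved + from_reference_lists - excluded_after_scrutiny

print(abstracts_evaluated, full_texts_retrieved, final_included)
```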

Epidemiological Description of Published IPDMA Articles
The distribution of all identified 829 IPDMA articles against the year of publication is presented in Figure 2, which shows an annual average of 31.9 (829/26) IPDMA articles. A regression was fitted for the number of IPDMA articles against year of publication. A slope of 3.7 (P<0.001) indicates an average growth of 3.7 IPDMA articles per year (Figure 2).

Factors of the IPDMA Reviewers that may Affect the Completeness of Data
Table 2 summarizes the potential factors associated with the completeness of data in the 829 IPDMA articles. A total of 497 (60.0%) IPDMA articles clearly stated that systematic searches were performed to identify relevant studies, whereas 103 of the remaining 332 (40.0%) IPDMA articles did not state how the studies were searched, and 229 (27.6%) identified the studies based on a selective or non-systematic approach.
A total of 792 (95.5%) IPDMA articles requested IPD from all identified studies. Nevertheless, only 496 (59.8%) obtained IPD from all requested studies. Thirty-seven (4.5%) of the IPDMA articles did not clearly report whether they requested all studies for IPD.
Among the abovementioned 497 (60.0%) IPDMA articles for which a systematic search was performed, only 190 (38.2%) obtained IPD from all requested studies, whereas 277 (55.7%) did not obtain IPD from all eligible studies. Thirty (6.1%) IPDMA articles did not state whether IPD was obtained from all requested studies. A total of 788 (95.1%) IPDMA articles provided information on the total number of studies with and without IPD (for studies without IPD, information was additionally extracted from the corresponding AD), which enables the determination of the proportion of studies providing IPD within each article. Of the 788 IPDMA articles, 582 obtained IPD from 80% or more of the total studies. The percentage of studies providing IPD within an IPDMA article had a median of 100% (IQR 77.8% to 100%). In one IPDMA, however, the investigators contacted the authors of 190 articles to request IPD, but only 6 (3.2%) agreed to provide their raw data [22].
A total of 584 (70.4%) IPDMA articles provided information on the total number of patients with and without IPD, from which the proportion of patients with IPD can be calculated. Of the 584 IPDMA articles, 525 obtained 80% or more of the total original IPD. The percentage of patients with IPD ranged from 4.6% [23] to 100%, with a median of 100% (IQR 100% to 100%).
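The per-article proportions described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `ipd_share` and the list of articles are hypothetical, except that the last pair mirrors the case of 6 of 190 studies providing IPD cited in the text.

```python
def ipd_share(n_with_ipd, n_without_ipd):
    """Percentage of units (studies or patients) for which IPD were obtained."""
    total = n_with_ipd + n_without_ipd
    if total == 0:
        raise ValueError("the article reports no studies or patients")
    return 100.0 * n_with_ipd / total


# Hypothetical (studies with IPD, studies without IPD) pairs, one per article.
# The last pair mirrors the 6 of 190 studies mentioned in the text.
articles = [(8, 0), (7, 2), (3, 4), (12, 1), (6, 184)]

shares = [ipd_share(with_ipd, without_ipd) for with_ipd, without_ipd in articles]

# Number of articles that obtained IPD from 80% or more of their studies.
n_high = sum(share >= 80.0 for share in shares)

print([round(share, 1) for share in shares], n_high)
```

The same calculation applies at the patient level by passing the numbers of patients with and without IPD.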

Discussion
Our study shows that an increasing number of IPDMA articles are published yearly. This increase may be attributed to strong support for sharing IPD among researchers in recent years from organizations such as Cochrane and from high-impact general medical journals such as BMJ and Annals of Internal Medicine [24]. A recent survey found that only 24% of the meta-analysts who sought IPD failed to obtain any [12]. However, the number of IPDMA articles remains far smaller than that of ADMA articles [12,14]. Among all published articles on meta-analyses, the proportions of IPDMA articles and ADMA articles are 4% and 96%, respectively [12]. Nevertheless, with the increase in investigator demand for shared data and the willingness of trialists to encourage data sharing [24], the number of IPDMA articles is likely to continue increasing.
Cancer has historically been the most prevalent topic in IPDMA articles. Of 34 IPDMA articles published before 1996, 19 (55.9%) were in the cancer field [8]. IPDMA articles on diseases of the circulatory system have increased rapidly, although cancer remains the most frequent topic. In our study, approximately one-third (267 [32.2%]) of the IPDMA articles are in the cancer field, and more than one-fifth (179 [21.6%]) are in the field of circulatory system diseases. Together, IPDMA articles on these two disease groups comprised more than half of all IPDMA articles.
IPDMAs are more time-consuming and normally require more human resources than ADMAs [16]. This study shows that the median number of authors of IPDMA articles is eight, twice that of ADMA articles [4]. IPDMA articles are also more likely to be funded by profit sponsors. In our study, 129 (15.6%) of 829 IPDMA articles are supported by profit sponsors. Previous studies show that only 2.3% of the ADMA articles received sponsor support [4]. In the cancer field, 32.2% of IPDMA articles are supported by profit sponsors, whereas only 11% of ADMA articles have sponsor support [4].
Our study finds that each IPDMA article includes a median of 8 studies (IQR 5 to 15) and a median of 2,563 patients (IQR 927 to 8,349). Previous studies reported that ADMA articles included a median of 16 studies (IQR 7 to 30) and a median of 1,112 patients (IQR 322 to 3,750) [4]. The number of studies in IPDMA articles is smaller than that in ADMA articles. One potential reason for this difference is that some IPDMA articles do not perform systematic searches to identify all relevant studies. However, the number of patients in IPDMA articles is larger than that in ADMA articles. For IPDMA articles, a larger quantity of sensitive studies is derived from systematic searches that often temporize a higher degree of specificity of individual patient data because of clinically more consistent inclusion and exclusion criteria.
Our study finds that 37.9% of IPDMA articles are based on survival data. A previous study showed that only 4.0% of ADMA articles use survival data [13]. IPDMA has some advantages over ADMA because of its intrinsic attributes. One advantage is that IPDMAs are more flexible than ADMAs in conducting analyses both clinically and statistically, particularly in dealing with survival outcomes. Survival data identify whether and when an outcome (e.g., death) has occurred [25]. Survival analysis is critical in evaluating therapeutic effects and prognosis in the cancer field [2].
Our results suggest that selection bias may affect the completeness of the data. A total of 332 (40.0%) IPDMA articles did not clearly state that systematic searches were performed to identify relevant studies (103 did not state how they searched for studies, whereas 229 identified the studies based on a selective or non-systematic approach). Selection bias is a potential concern for IPDMA articles that did not perform a systematic literature search to identify relevant studies. For example, Davidson et al. [26] published an IPDMA article comparing biphasic insulin aspart 30 (BIAsp 30) with biphasic human insulin 30 (BHI 30). They searched only the databases of a pharmaceutical company and included six trials in the meta-analysis of major hypoglycemia. The overall OR estimate was 0.45 (95% CI 0.22 to 0.93), which suggests that the likelihood of major hypoglycemia was significantly lower with BIAsp 30 than with BHI 30. However, in an ADMA article [27], the authors conducted a systematic search that included nine trials. The overall RR estimate was 0.66 (95% CI 0.31 to 1.41), which was not statistically significant.
IPDMA articles should also address publication bias. Our results show that 334 (40.3%) IPDMA articles reported seeking IPD from studies in the “grey literature” and 207 (25.0%) integrated results from the “grey literature” into their meta-analyses. A total of 376 (45.5%) IPDMA articles reported that they did not seek “grey literature”. Given that IPDMAs included a higher number of patients but a smaller number of studies than ADMAs, IPD meta-analysts are more likely to seek IPD from large published studies than from small unpublished studies. Small unpublished studies may be omitted either because IPD meta-analysts cannot obtain the original data from them or because the meta-analysts select large studies while neglecting to perform systematic searches. The omission of small unpublished studies may result in an exaggeration of the risk estimate [28]. Publication bias is more likely to occur when only large published studies are sought. Hence, we strongly suggest that future IPDMAs seek unpublished studies. If IPD from small unpublished studies are unavailable, IPD meta-analysts can collect the related AD and include them in their estimation. Future IPD meta-analysts should also assess publication bias using funnel plots because such investigations are still rare in IPDMAs [18].
Moreover, bias in IPDMA articles may arise from data unavailability. Our study found that 582 (70.2%) IPDMAs obtained IPD for 80% or more of the total studies from which IPD were sought. Previous studies on the availability of IPD found that 79% of 142 IPDMA articles published until 2005 obtained IPD for 80% or more of the total studies [5], and 67% of 31 IPDMA articles published between 2007 and 2009 obtained IPD for 80% or more of the total studies [18]. Data availability bias is a potential concern if the meta-analysts fail to obtain IPD from all requested studies [29]. For example, Choy et al. [30] performed an IPDMA to compare stapled ileocolic anastomoses with handsewn methods, with overall anastomotic leak as the primary end point. Seven RCTs were identified, and IPD were sought from these studies, but IPD were obtained from only three of the seven. The fixed-effects meta-analysis of the three RCTs with IPD gave an OR of 0.18 (95% CI 0.03 to 1.03; I² = 0%). This result indicated that the stapled method was not significantly associated with a lower overall anastomotic leak rate. When the four RCTs that did not provide IPD were included, the fixed-effects meta-analysis of all seven trials (three with IPD and four with AD) showed an OR of 0.48 (95% CI 0.24 to 0.95; I² = 21%), which indicated that stapled ileocolic anastomoses were associated with a lower overall anastomotic leak rate. Consequently, studies without IPD can affect the conclusions. We suggest that AD be collected and added to the analysis if IPD meta-analysts cannot obtain IPD from all requested studies.
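The contrast between the two pooled estimates in the Choy et al. example can be illustrated by back-calculating an approximate z statistic and two-sided p value from each reported OR and 95% CI. The Python sketch below is a standard approximation on the log scale, not the method used in the cited articles; the function names are assumptions, and because the published limits are rounded, the results are only approximate.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided p value for a ratio measure
    (OR or RR) from its point estimate and 95% CI, on the log scale."""
    # The CI width on the log scale equals 2 * z_crit * standard error.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided p value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely on one side of the null value 1."""
    return lower > 1.0 or upper < 1.0


# Values reported in the text for the Choy et al. example.
z_ipd, p_ipd = z_and_p_from_ci(0.18, 0.03, 1.03)   # three RCTs with IPD
z_all, p_all = z_and_p_from_ci(0.48, 0.24, 0.95)   # all seven RCTs (IPD plus AD)

print(ci_excludes_one(0.03, 1.03), round(p_ipd, 3))
print(ci_excludes_one(0.24, 0.95), round(p_all, 3))
```

Under these assumptions, the interval from the three IPD trials includes 1 whereas the interval from all seven trials does not, which matches the change in conclusion described in the text.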

Strengths
IPDMAs are considered the gold standard in supporting clinical decision making. This article is the first cross-sectional study to use a comprehensive search to examine the distribution and epidemiological characteristics of published IPDMA articles.

Limitations
This study has several limitations. First, our data were based on the information reported in published IPDMA articles. The original review authors were not directly contacted, so some details may have been omitted (e.g., some publications may have received sponsorship that was not reported). Second, we excluded 140 conference abstracts for which full-text articles could not be retrieved and which therefore did not provide detailed information for data extraction. Given that most of these abstracts will be published as full-text journal articles, the database should be updated in the future. Third, direct comparisons of the characteristics of IPDMAs and ADMAs may be inappropriate because the screening conditions of this study differ from those of previous studies. However, these comparisons are discussed only to a limited extent in this study. Moreover, the conclusions of this study were drawn from the cross-sectional data collected from IPDMAs, rather than from comparisons between the results of this study and those of previous studies.

Conclusions
This study provides a survey of published IPDMA articles in terms of prevailing distribution and epidemiological characteristics. The number of IPDMA articles is increasing yearly. IPDMA articles on cancer and circulatory diseases comprise more than half of the total IPDMA articles. Meta-analysts mainly focus on therapeutic questions and less on prognosis and other topics. Systematic searches are not always performed. IPD from grey literature are usually not included. IPD are often unavailable. Selection bias, publication bias, and data availability bias in IPDMAs should be considered and emphasized. Decision makers should be aware of the potential biases in IPDMAs before accepting their results.

Supporting Information
File S1 Search strategies.