Premature Discontinuation of Prospective Clinical Studies Approved by a Research Ethics Committee – A Comparison of Randomised and Non-Randomised Studies

Background: Premature discontinuation of clinical studies affects about 25% of randomised controlled trials (RCTs), raising concerns about the waste of scarce research resources. The risk of discontinuation of non-randomised prospective studies (NPSs) remains unclear.
Objectives: To compare the proportion of discontinued studies between NPSs and RCTs that received ethical approval.
Methods: We systematically surveyed prospective longitudinal clinical studies that were approved by a single research ethics committee (REC) in Freiburg, Germany, between 2000 and 2002. We collected study characteristics, identified subsequent publications, and surveyed investigators to elucidate whether a study was discontinued and, if so, why.
Results: Of 917 approved studies, 547 were prospective longitudinal studies (306 RCTs and 241 NPSs). NPSs were on average smaller than RCTs, more frequently single centre and pilot studies, and less frequently funded by industry. NPSs were less frequently discontinued than RCTs: 32/221 (14%) versus 78/288 (27%; p<0.001, missing data excluded). Poor recruitment was the most frequent reason for discontinuation in both NPSs (36%) and RCTs (37%).
Conclusions: Compared with RCTs, NPSs were at lower risk of discontinuation. Measures to reliably predict, sustain, and stimulate recruitment could prevent the discontinuation of many RCTs and also of some NPSs.


Introduction
In recent years, many methodological investigations and guidance documents about randomised controlled trials (RCTs) have been published, whereas problems in the planning and conduct of other prospective clinical studies such as non-randomised controlled trials, single arm trials or cohort studies (non-randomised prospective studies, NPSs) are less well known.
A major reason for the failure of RCTs is premature discontinuation; about one quarter of planned RCTs are prematurely discontinued, mostly due to recruitment problems [1][2][3][4][5]. Discontinuation is rarely reported to research ethics committees (RECs), and about half of discontinued RCTs remain unpublished [1,3]. This is a serious ethical concern for volunteering participants and society at large, as it represents a waste of scarce resources, a loss of valuable research data, and missed opportunities to learn from failure [1,2,6].
Similar ethical implications would apply to NPSs. However, their premature discontinuation has not yet been investigated in depth. (Retrospective or cross-sectional studies are not discussed in this paper because the mechanisms for discontinuation differ from those of prospective studies; for instance, they cannot be discontinued for benefit, harm, futility, or slow recruitment.) It is unknown whether the risk of and reasons for discontinuation of NPSs differ from those of RCTs. Pilot or feasibility NPSs conducted on selected populations may be less prone to discontinuation due to poor recruitment than confirmatory RCTs, which need to achieve a certain sample size to establish the effectiveness of a treatment. A study of a sample of cardiovascular trials registered in ClinicalTrials.gov showed that, besides funding by federal agencies and behavioural therapies, a single arm study design was associated with a lower risk of early termination due to poor recruitment [7]. Furthermore, studies suggest that it is easier to recruit participants into NPSs [8,9]. On the other hand, NPSs may more frequently be exploratory in nature, including first-in-human studies or early pharmacological studies [7]. Such studies are typically surrounded by more uncertainty concerning the potential benefits and harms of an intervention for patients, and may thus bear a higher risk of discontinuation.
The aim of this analysis was to study the discontinuation of clinical studies that were approved by a REC, and to compare the prevalence of and reasons for discontinuation between NPSs and RCTs.

Methods
We had access to all study protocols submitted to the REC of the University of Freiburg, Germany, from 2000 to 2002, including RCTs and NPSs [10,11]. If a study protocol described two or more studies, we considered each study separately.
We included studies that 1) enrolled patients or healthy volunteers (hereafter referred to as 'participants'), 2) collected baseline data after initiation of the study (prospective studies), and 3) had at least one follow-up time point regardless of the time elapsed since baseline (longitudinal studies). We excluded retrospective longitudinal and cross-sectional studies because our main outcome, "discontinuation of recruitment and/or follow-up", would not apply. Furthermore, we excluded studies if we knew with certainty that they were never started or if they were still on-going at the time of data collection (Fig 1). We considered a study on-going if investigators indicated this in response to our survey and if results had not been published.
We used the following definitions to classify NPSs (Fig 1):
1. Controlled trial: Participants are systematically assigned to two or more parallel exposure (intervention) groups defining the study arms. An exposure (intervention) could occur at a single time point or be repeated/continued over a defined period of time.
2. Single arm trial: All participants are assigned to one exposure (intervention) group.
3. Cohort study: Participants are not actively assigned to exposure groups; those are defined by observed characteristics or exposures.
Details of data collection have been described previously [10,11]. REC files included local application forms, the study protocols with amendments, and correspondence with the REC. We extracted information about study centre status (multicentre international/national, single centre), sample size, pilot study status, medical field, inclusion criteria, and source of funding (industry/non-industry). To classify funding, we considered whether the protocol clearly named the sponsor, prominently displayed a company or institution logo, mentioned the affiliations of the protocol authors, or included statements about data ownership or publication rights, or about full funding by industry or public funding agencies. If a private company provided only some study material (e.g. the experimental drug) and academic investigators wrote the study protocol, we did not consider the study industry-sponsored.
We used pre-piloted standardized forms for all extracted information and provided detailed written data extraction instructions. To minimize extraction errors, we conducted formal calibration exercises with all data extractors and extracted 30% of the RCTs in duplicate. NPS data were extracted by one investigator (PO). All database entries were checked for plausibility, and about one third of the data was cross-checked by a second investigator (AB), so we are confident that we achieved a high level of accuracy. If the investigator in charge could not decide how to extract data (e.g. when classifying studies by design), the issue was discussed with a second investigator. If data extraction still remained unclear, a third investigator (SS) was involved to reach consensus.
In a second step, we established a specific search strategy for each study protocol to identify subsequent publications, using relevant keywords from the protocol such as the experimental drug, study name or acronym, studied disease or condition, or the names of the applicants. We contacted the applicants of all included protocols by personalised letter, including a questionnaire, to confirm the publications we had identified and to ask for additional publications (S1 Appendix). We also asked about premature discontinuation and the reason(s) thereof. The survey of investigators was conducted in 2007 for protocols of the year 2000, in 2010 for protocols of the years 2001 and 2002, and in 2013 for all RCTs. The survey response rate in Freiburg was 90.0% for both RCTs and non-RCTs. We grouped the reasons for study discontinuation provided by the investigators into the following categories: poor recruitment, benefit, harm, futility, lack of funding, or other reasons (including administrative reasons such as retirement or change of institute of the principal investigator, or disagreement with the sponsor).
Our primary analysis compared the proportion of discontinued studies between NPSs and RCTs. We calculated proportions based on complete cases (excluding studies with missing status information) and conducted sensitivity analyses assuming that a) unclear/missing studies were completed, b) unclear/missing studies were discontinued, or c) half of the unclear/missing studies were discontinued. The assumption that unclear/missing studies were more likely to be discontinued was based on the observation that unknown status was associated with non-publication, which in turn is known to be associated with discontinuation [1]. In another sensitivity analysis, we excluded studies stopped for harm, benefit, or futility; the rationale was to enable a 'fairer' comparison because cohort studies cannot be stopped for these reasons. We used the chi-square test to test for differences in proportions (based on the assumption that studies approved in Freiburg are a random sample of studies approved in comparable jurisdictions and of future studies) and a type I error of 5% as the threshold for statistical significance.
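The primary comparison and the missing-data sensitivity scenarios described above can be sketched from the counts reported in this paper (32/221 discontinued NPSs versus 78/288 discontinued RCTs, out of 241 and 306 approved studies). A minimal Python sketch, using only the standard library; the scenario counts for missing studies (20 NPSs, 18 RCTs) are derived from these totals for illustration and are not the published Table 3 figures:

```python
import math

def chi2_2x2(disc_a, n_a, disc_b, n_b):
    """Pearson chi-square test (1 df) for a 2x2 table of
    discontinued vs. completed studies in two groups."""
    a, b = disc_a, n_a - disc_a   # group A: discontinued, completed
    c, d = disc_b, n_b - disc_b   # group B: discontinued, completed
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 df, chi2 is the square of a standard normal deviate,
    # so the two-sided p-value is the normal tail probability.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Complete cases: 32/221 NPSs vs. 78/288 RCTs discontinued.
chi2, p = chi2_2x2(32, 221, 78, 288)
print(f"complete cases: chi2={chi2:.2f}, p={p:.4f}")  # p < 0.001

# Studies with unknown status: 241 - 221 = 20 NPSs, 306 - 288 = 18 RCTs.
scenarios = {
    "a) missing completed":         (32, 241, 78, 306),
    "b) missing discontinued":      (32 + 20, 241, 78 + 18, 306),
    "c) half missing discontinued": (32 + 10, 241, 78 + 9, 306),
}
for name, args in scenarios.items():
    chi2_s, p_s = chi2_2x2(*args)
    print(f"{name}: chi2={chi2_s:.2f}, p={p_s:.4f}")
```

In practice one would use a statistics package (e.g. scipy.stats.chi2_contingency) rather than this hand-rolled formula; the sketch merely shows that the reported complete-case difference (14% versus 27%) is significant at the stated 5% level under the paper's sampling assumption.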
To explore potential differences across countries and jurisdictions, we compared study characteristics and discontinuation between the RCTs approved in Freiburg and RCTs approved in Canada (Hamilton) and Switzerland (Basel, Lausanne, Zurich, and Lucerne) during the same time period. Details of this cohort of 1017 RCTs were reported earlier [1,12].

Results

Included studies
We identified 917 studies that were approved by the REC in Freiburg (Fig 2). After excluding studies that were never started, still on-going, in vitro or other non-human studies, or of cross-sectional or retrospective design, our final data set for analysis comprised 547 prospective longitudinal studies. Of those, 306 were RCTs and 241 were NPSs (27 controlled trials, 158 single arm trials, and 56 cohort studies).

Study discontinuation
Overall, NPSs were less frequently discontinued than RCTs (14% versus 27%, missing excluded, p<0.001) (Tables 2 and 3). Sensitivity analyses using different assumptions for missing data only slightly changed these proportions and the differences between NPSs and RCTs. In the sensitivity analysis excluding studies stopped for harm, benefit, or futility, the difference between NPSs and RCTs was no longer statistically significant (p = 0.057, missing excluded; see Table 3). Poor recruitment was the most frequent reason for discontinuation in both NPSs (37%) and RCTs (36%) (Table 2). Completion status was very similar in RCTs approved in Freiburg and in Canada or Switzerland.
Note to Table 2: Numbers in brackets are proportions (column %) based on complete cases (excluding unclear/missing); see Table 3 for proportions. Other reasons for discontinuation included: administrative reasons; retirement or change of institute of the applicant; lack of staff resources; lack of flexibility of the system; logistical problems (interdisciplinary study organisation and logistics); too few participants; larger and more meaningful studies conducted by other researchers; failed pilot phase; insufficient test material; study data deleted during maintenance work by the technician; change of study design or topic; termination of the liver transplant programme at the University Medical Center Freiburg by the government; and a shift in the research focus of the study investigator. Abbreviations: NPSs, non-randomised prospective studies; RCTs, randomised controlled trials.

Discussion
We compared the risk of discontinuation between NPSs and RCTs in a sample of clinical studies approved by a German REC. Overall, NPSs were at lower risk of discontinuation than RCTs (14% versus 27%, p<0.001). The difference was robust to sensitivity analyses using different assumptions for missing data; however, when we excluded studies stopped for harm, benefit, or futility, the difference in the proportion of discontinued studies diminished and was no longer significant. This is largely because cohort studies are not stopped for harm, benefit, or futility.
A strength of our study is that we had access to the full REC correspondence from a period of three years. Consequently, we could take into account relevant information from the application form, the study protocol with its amendments, patient information sheets and additional correspondence. In addition, we achieved a high response rate in our survey. The included RCTs were remarkably similar to RCTs that were approved at the same time by five RECs in Switzerland and Canada. This increases our confidence that the findings will be generalizable to clinical research settings in other high-income countries.
A main limitation of our study was that little information was available for some studies, e.g. due to missing documents or poor reporting. When a study protocol was missing, we extracted relevant information from application forms and patient information sheets. While all RCTs conducted in the jurisdiction of the REC needed to obtain ethical approval, the regulations were less strict for NPSs between the years 2000 and 2002. We could not quantify the number of NPSs that were never assessed by the REC. Consequently, it is possible that our sample of NPSs overrepresented more challenging studies for which ethical approval was deemed critical and that it is not representative of NPSs in general. Finally, when we applied statistical tests and formulated our conclusions, we implicitly regarded the studies approved in Freiburg as representative of clinical studies approved in other jurisdictions. This assumption is supported by the fact that the characteristics of RCTs in Freiburg were very similar to those of RCTs in Switzerland and Canada (Table 1). However, results may differ for studies performed in jurisdictions where unique challenges exist, such as developing countries.
Only a few previous studies have reported on the risk of discontinuation in NPSs. We screened two systematic reviews of methodological studies on research approved by RECs or included in trial registries [13,14]. One study followed a cohort of 367 studies (137 RCTs and 230 studies of other designs) approved by a single REC in Oxford, UK, between 1984 and 1987 [15]. It did not investigate differences between study designs because of the low number of discontinued studies. Similar to our study, poor recruitment was the main reason for study discontinuation across designs. Compared with completed studies, discontinued studies were more likely to be non-comparative (33% vs 27%) and single centre (83% vs 76%). Another study was based on cardiovascular studies registered in ClinicalTrials.gov [7]. Again, slow recruitment was the main reason for study discontinuation. In addition, a single arm design was independently associated with a lower risk of recruitment failure. Our sample did not include enough single arm trials to explore this potential association.

Conclusions
Our analysis suggests that NPSs may be at lower risk for discontinuation than RCTs. Poor recruitment was the main reason for study discontinuation in both RCTs and NPSs. Measures to reliably predict, stimulate and sustain recruitment performance may prevent the discontinuation of many NPSs and RCTs in the future [16][17][18].