Frequency of multiple changes to prespecified primary outcomes of clinical trials completed between 2009 and 2017 in German university medical centers: A meta-research study

Background

Clinical trial registries allow assessment of deviations of published trials from their protocol, which may indicate a considerable risk of bias. However, since entries in many registries can be updated at any time, deviations may go unnoticed. We aimed to assess the frequency of changes to primary outcomes in different historical versions of registry entries, and how often they would go unnoticed if only deviations between published trial reports and the most recent registry entry are assessed.

Methods and findings

We analyzed the complete history of changes of registry entries in all 1746 randomized controlled trials completed at German university medical centers between 2009 and 2017, with published results up to 2022, that were registered in ClinicalTrials.gov or the German WHO primary registry (German Clinical Trials Register; DRKS). Data were retrieved on 24 January 2022. We assessed deviations between registry entries and publications in a random subsample of 292 trials. We determined changes of primary outcomes (1) between different versions of registry entries at key trial milestones, (2) between the latest registry entry version and the results publication, and (3) changes that occurred after trial start with no change between the latest registry entry version and the publication (so that assessing the full history of changes is required for their detection). We categorized changes as major if primary outcomes were added, dropped, changed to secondary outcomes, or secondary outcomes were turned into primary outcomes. We also assessed (4) the proportion of publications transparently reporting changes and (5) characteristics associated with changes. Of all 1746 trials, 23% (n = 393) had a primary outcome change between trial start and latest registry entry version, with 8% (n = 142) being major changes.
Primary outcomes in publications were different from the latest registry entry version in 41% of trials (120 of the 292 sampled trials; 95% confidence interval (CI) [35%, 47%]), with major changes in 18% (54 of 292; 95% CI [14%, 23%]). Overall, 55% of trials (161 of 292; 95% CI [49%, 61%]) had primary outcome changes at any timepoint over the course of a trial, with 23% of trials (67 of 292; 95% CI [18%, 28%]) having major changes. Changes only within registry records, with no apparent discrepancy between the latest registry entry version and the publication, were observed in 14% of trials (41 of 292; 95% CI [10%, 19%]), with 4% (13 of 292; 95% CI [2%, 7%]) being major changes. One percent of trials with a change reported this in their publication (2 of 161 trials; 95% CI [0%, 4%]). An exploratory logistic regression analysis indicated that trials were less likely to have a discrepant registry entry if they were registered more recently (odds ratio (OR) 0.74; 95% CI [0.69, 0.80]; p<0.001), were not registered on ClinicalTrials.gov (OR 0.41; 95% CI [0.23, 0.70]; p = 0.002), or were not industry-sponsored (OR 0.29; 95% CI [0.21, 0.41]; p<0.001). Key limitations include some degree of subjectivity in the categorization of outcome changes and inclusion of a single geographic region.

Conclusions

In this study, we observed that changes to primary outcomes occur in 55% of trials, with 23% of trials having major changes. They are rarely transparently reported in the results publication and often not visible in the latest registry entry version. More transparency is needed, supported by deeper analysis of registry entries to make these changes more easily recognizable.

Protocol registration: Open Science Framework (https://osf.io/t3qva; amendment in https://osf.io/qtd2b).
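As an aside on the abstract's interval estimates: the proportion CIs quoted above can be reproduced with a simple normal-approximation (Wald) interval. The authors' exact method is not stated in this excerpt, so the following is an illustrative sketch under that assumption, not a description of their analysis code:

```python
from math import sqrt

def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) 95% CI for a binomial proportion."""
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# 120 of 292 sampled trials had a discrepant primary outcome in the publication
lo, hi = wald_ci(120, 292)
print(f"{120/292:.0%} (95% CI [{lo:.0%}, {hi:.0%}])")  # 41% (95% CI [35%, 47%])
```

Under this assumption, the same formula also reproduces the reported [14%, 23%] interval for the 54 of 292 trials with major changes.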


1 GENERAL
Please justify the inclusion of studies published only from Germany.

Thank you! We now explain why we used a sample from Germany in the Introduction and the Methods (Data sources and sample): "We are aware of only two studies that examined changes to primary outcomes over the course of the entire registry version history. Both used relatively small and selective samples (either trials published in ICMJE journals or trials registered on ClinicalTrials.gov covered by the Food and Drug Administration Amendments Act) over a four-year time period [30,31]. We are not aware of studies quantifying the frequency of changes that may be 'invisible' and potentially unnoticed.
In this study, we aimed to provide an in-depth analysis of changes to primary outcomes using a comprehensive nationwide sample over 9 years that represents the current practice of trial registration in academic research." (Introduction)

"We based our analyses on previous work of our group that provided the highly granular and large-scale data on both registry entries and accompanying results publications required to address our aims. Two large datasets [32,33] of randomized controlled trials covering a period of 9 years of trial completion plus a period of 5 years of publication tracking were used for this project. The trials were completed at German university medical centers between 2009 and 2017 and had been registered in either ClinicalTrials.gov or the German World Health Organization primary registry (German Clinical Trials Register or Deutsches Register Klinischer Studien; DRKS). Both registries have an accessible history of changes. The corresponding results publications were identified in manual searches up to 2022, ensuring an at least five-year period after trial completion and complementing the often inconsistent references in the trial registry [34]." (Methods)

2 GENERAL
As you report an analysis of existing studies as metadata (also noting the use of 'Numbat Systematic Review Manager' for data extraction), we suggest that you consider reporting your study according to PRISMA for SRMAs instead of STROBE. The PRISMA guidelines can be found at the EQUATOR site, http://www.equator-network.org/reportingguidelines/prisma/. We acknowledge that some aspects of PRISMA reporting may not apply here.
Please provide the completed PRISMA checklist. When completing the checklist, please use section and paragraph numbers, rather than page/line numbers, as these often change in the event of publication. Please add the following statement, or similar, to the Methods: "This study is reported as per the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline (S1 Checklist)."

We have carefully perused PRISMA and ensured that all applicable items are adequately reported in the revised version of the manuscript; we are submitting these files along with our revisions. We would prefer to also use the STROBE guideline as we feel it adds important details. We have employed the STROBE guideline in many other similar cross-sectional meta-research studies, including a recent publication from our team in PLOS Medicine: Franzen DL, Carlisle BG, Salholz-Hillel M, Riedel N, Strech D (2023) Institutional dashboards on clinical trial transparency for University Medical Centers: A case study. PLOS Medicine 20(3): e1004175. https://doi.org/10.1371/journal.pmed.1004175

3 GENERAL
Please temper the language used throughout, including the use of the word 'hidden' in the title, and ensure only objective reporting of your findings and their implications.
Thank you for highlighting this important point! It was not our intention at all to suggest 'malfeasance', and we have changed the word in the title to 'invisible'. Please note that by using the term 'hidden', we did not mean to imply that researchers purposely hide important information in registry histories, but rather that the information is hidden from view.

INTRODUCTION
Please address past research and explain the need for and potential importance of your study. Indicate whether your study is novel and how you determined that. If there has been a systematic review of the evidence related to your study (or you have conducted one), please refer to and reference that review and indicate whether it supports the need for your study.

We now changed the text and our citation style accordingly to meet the journal requirements, and changed the square brackets to round brackets for numbering of, for example, objectives.

INTRODUCTION
Has there been a systematic review of the evidence related to your study (or have you conducted one)? If so, please refer to and reference that review and indicate whether it supports the need for your study.
Please see our response to your previous comment (#24).

METHODS and RESULTS
Line 142 – please place citation in square brackets.

We changed this (please also see comment #27).

METHODS and RESULTS
Line 143 – 'interventional studies': please replace with 'randomized controlled trials'.

We changed this accordingly.

METHODS and RESULTS
As above, please justify the inclusion of data from these specified time points.

Thank you! We now better explain our approach in the Methods, see comment #1.

As above, please justify the inclusion of only studies originating from Germany
This project required highly granular data on publications, which are available to us for Germany. We now better explain our approach (see comment #1 above), and also now explicitly discuss this geographic restriction in the Discussion: "Fourth, although we used high quality underlying data reflecting German clinical research output, we only assess a single geographic area, partly reflecting EU/German registration policies. Still, many registration policies are international, and our results point to the possibility that technical details of the registries rather determine whether outcome changes occur." While we have no reason to believe that the registration of study outcomes is substantially different in other countries, we agree that our research needs to be replicated in other settings, and we now better highlight this limitation. The phenomenon of invisible outcome changes has, however, not been described before.

METHODS and RESULTS
Line 167, 213 – please use an alternative word for 'downloaded'; 'extracted' perhaps? Please check and amend throughout.

Thank you for pointing this out. Since we already use the term 'extract' when we talk about extraction of primary outcomes from the dataset (in accordance with common language in systematic reviews and in the now applied PRISMA guidance), we exchanged 'downloaded' with 'retrieved' (as used by, e.g., PRISMA).

METHODS and RESULTS
Line 170 onwards – please reserve the use of square brackets for in-text reference callouts.

Changed accordingly (please also see #27).

METHODS and RESULTS
Line 213 onwards – details of the software packages would be better placed earlier, at line 165, where you describe data extraction and processing.

Changed accordingly.

METHODS and RESULTS
Line 216 – suggest reporting according to PRISMA instead of STROBE.
We now also use PRISMA (please also see #2).

DISCUSSION
Please remove the sub-heading 'Conclusion' from the end of the Discussion.

Changed accordingly.

DISCUSSION
As above, please refrain from using the word 'hidden'.

REFERENCES
For in-text reference callouts, please place citations in square brackets.
Changed accordingly (please also see #27). We made these changes accordingly.

TABLES
Suggest presenting the trial characteristics as Table 1, followed by the definitions of discrepancies as Table 2.
Thank you for this suggestion! We changed the order of the tables accordingly.

TABLES
Table 1 (outcome discrepancies) – these seem rather vaguely defined; suggest additional nuance to the categorization. When reporting minor discrepancies, the word 'significant' is used, implying a major discrepancy, and is also not very quantified. Please revise this table, including more nuanced detail of the categorization applied.
Thank you for this important comment. We fully agree that the word 'significant' in combination with minor discrepancies is confusing! We removed it.
We also revised the table substantially, refer to related literature using a similar categorization, and provide more details in the Methods: "Reviewers assessed changes to the primary outcomes between the four key trial milestones within the registry (Figure 1), using a classification system that categorizes changes to primary outcomes as either major changes (based on an existing classification system [40]) or minor changes (adapted from [21] and [41], for details see […]"

[…] were trials where no matching journal had been found using our automated approach. We now manually identified the respective journal matches, and thus the number of 'others' has been reduced from 395 to 74. Please note that due to these changes, we updated our regression models (no relevant changes; conclusions remain the same).

TABLES
Table 3 – for the main outcome measures, please indicate whether your analyses are adjusted or unadjusted, and if adjusted, please present the unadjusted analyses for comparison. Please report p as p<0.001 and, where higher, the exact p value. Please define the meaning of the lettering in bold typeface.

The analyses in Table 3 do not include statistical tests or models, so we have no p-values to report. We now explain the meaning of boldface lettering in the caption, thank you for bringing this up!

FIGURES
Please ensure that all figures are associated with a caption that clearly describes the figure content without the need to refer to the manuscript text. Please ensure all abbreviations and the meaning of dots, lines and bars are clearly defined for the reader.
We have carefully revisited all figures, legends and captions to ensure these issues are adequately addressed.

FIGURES
Figure 1 – if the end point in this flow diagram is 'journal publication', does the earlier 'publication' refer to the protocol publication? Please clarify/revise for the reader.

The figure has been substantially revised following the suggestions by the Editors and Reviewers. Additionally, we have moved it to the supplement (now as Supplementary Figure 1). The different widths of the shaded grey areas and the lines represent the 'flow' of changes: for example, the largest grey bar between the column on the left and the column in the middle indicates that the majority of trials with no changes in the registry between study start and study completion also have no changes in the registry between completion and publication. We have explained this in the caption. We agree that this figure is challenging. To improve it, we have now added a scale to show the percentages, changed the naming of the colored bars, and changed the legend. We have also created an online, interactive version: https://martin-rh.shinyapps.io/InvisibleOutcomeChangesFigure/

FIGURES
Figure 4 – please provide the unit of measurement for the 'discrepancies across timepoints' total number. The label of the y axis could be closer to the graph; please revise.
Thank you for your comments! Following your feedback, as well as feedback from the reviewers, we have now completely reworked Figure 4 (now Figure 3).

SUPPLEMENTARY FIGURES
Please ensure that all figures are associated with a caption that clearly describes the figure content without the need to refer to the manuscript text.
We made the requested changes to our caption and legend.

SUPPLEMENTARY FIGURES
Please ensure all abbreviations and the meaning of dots, lines and bars are clearly defined for the reader.
We made the requested changes to our caption and legend.

SUPPLEMENTARY FIGURES
Please ensure axis labels are placed appropriately alongside the relevant axis.
We changed this accordingly.

FIGURES
Please ensure that all axes have a defined unit of measurement, e.g., total number.

The agreement for registry–publication results seems fairly low – this is presumably prior to discussion to resolve disagreements? This does seem a limitation to applying these results in practice for identifying whether there were inconsistencies when reviewing future publications: some reviewers may identify an issue and some may not. Perhaps this could be mentioned in the discussion.
This is an excellent point, thank you! Yes, we report the inter-rater agreement prior to the consensus discussion. However, our detailed and fine-grained rating terminology (we differentiate 8 categories of changes) is probably not necessary for the peer review process, where the larger changes may matter more.
We now briefly add in the Discussion section that the subjectivity in the assessment may nevertheless translate to the peer review process: "However, conflicting primary outcome definitions and conflicting ratings were resolved in discussions between the reviewers (and, where necessary, a third reviewer), and excluding unclear situations in the above-mentioned sensitivity analysis did not change the main findings. Still, this could become an issue in practice if, for example, different peer reviewers disagree in their assessment of whether an outcome change has happened or not."

61 Discussion
I presume the authors did not extract information on whether results were significant with the discrepant outcomes. This would be interesting information if it exists (e.g., trials with discrepant outcomes were more likely to report significant results) as it would imply that this might involve cherry-picking.
Thank you for this suggestion. We did extract this information for some of the sampled studies, but a thorough analysis would be beyond the scope of the current project – we would think that the significance of the original outcome would also be relevant. We plan to explore this in a subsequent project.

62 Supplementary Table 2
It would be useful to provide the 95% CI for the odds ratios in this table.

We have added the confidence intervals, thank you for this suggestion! (See also comment #56 from the Editor.)
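For readers wondering how such odds-ratio CIs are obtained: they are conventionally computed on the log-odds scale and then exponentiated. The sketch below back-solves the implied standard error from the registration-year odds ratio reported in the abstract (OR 0.74, 95% CI [0.69, 0.80]); this is a reverse-engineered illustration under a Wald assumption, not the authors' actual analysis code:

```python
from math import exp, log

def or_ci_from_coef(b: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio and 95% Wald CI from a logistic-regression coefficient and its SE."""
    return exp(b), exp(b - z * se), exp(b + z * se)

# Back out the implied SE from the reported interval, then reconstruct it
b = log(0.74)                                  # log-odds coefficient per registration year
se = (log(0.80) - log(0.69)) / (2 * 1.96)      # implied standard error
odds_ratio, lo, hi = or_ci_from_coef(b, se)
print(f"OR {odds_ratio:.2f}; 95% CI [{lo:.2f}, {hi:.2f}]")  # OR 0.74; 95% CI [0.69, 0.80]
```

Computing on the log scale keeps the interval asymmetric around the OR itself, which is why CI bounds for odds ratios are never simply OR ± margin.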

GENERAL
This is an interesting meta-research study investigating discrepancies between clinical trial protocols in registries vs published reports.
Thank you for this very helpful feedback.
Overall, the findings are likely to be of interest to trialists and other readers, though the manuscript and reporting could be improved in a number of areas. Please see comments below.

GENERAL
The scope of this study is substantial, with a large number of outcomes, statistical tests, and presentation in figures that are not common in most fields of research. In addition, the writing uses a number of assumed terms and knowledge, which may be problematic for the broad readership of this journal. And the writing is hard to follow in places.
We carefully revised our manuscript and hope it is easier to follow now.

GENERAL
* Suggest editing the manuscript to simplify wording and tighten the writing, remove/minimise jargon, and clearly define terms at first use.
We have carefully revised our text, aiming to make it more readable, removing jargon, and providing more examples and illustrative language.

GENERAL
* Consider displaying data using figures that are more recognisable in medical research. If not possible, substantially expand the figure legends to orientate figures to readers who may not be research-active in this area.
We agree that these figures are uncommon. We put great thought into the visualization of the data and have now adapted the figures to make them more useful for medical/clinical readers. We have extensively explained the Figures and their legends (see the more detailed comments below).

ABSTRACT
L64. 'primary outcome discrepancies' can read as jargon: consider wording as, e.g., 'differences between primary outcomes reported in trial registries and in published reports (i.e., primary outcome discrepancies)'.

We revised this accordingly across the manuscript and now state: […]. We adapted the flow of sentences in that section.

INTRODUCTION
L126-129: It seems unclear what value these statements add to this paragraph, that there is variability between registered protocols and trial reports. Consider dropping them, or revise to connect the ideas more strongly to the sentences above.
We reworded this paragraph and deleted a sentence.

INTRODUCTION
L130-137. The large number of aims and their lengthy articulation seem to dilute the strength of the key message of this study and make the paper look overambitious. Consider whether the aims can be simplified to focus the reader on key aspects of this study.
Thank you for this feedback! We have changed the section as follows: "In this study, we aimed to provide an in-depth analysis of changes to primary outcomes using a comprehensive nationwide sample over 9 years that represents the current practice of trial registration in academic research. Specifically, we aimed to…"

METHODS
L152, 158. Why would a study be regarded as 'completed' if the trial has an 'Unknown status'? How can investigators be sure it was actually completed?
If a trial's registry entry is not updated for a certain amount of time, it is automatically set to 'Unknown' on ClinicalTrials.gov. However, the trials included in our sample were both past their primary completion date and had a results publication.

METHODS
L185. Please explain clearly 'If the reviewers could not find the correct publication, they excluded the publication'.

We changed the text to explain this better: "For feasibility, we drew a random sample of 300 trials with accompanying results publications, from which three reviewers (MRH, MH, SY) extracted primary outcomes. Eight articles were excluded from the sample because they were not the main results publications (i.e., the final sample size was 292)."

METHODS
L205. 'proportions': should be percentages?
We rephrased the text accordingly.

METHODS
L209. Why are these variables regarded as 'predictors'? Are future events being predicted from preceding knowledge on study phase, sponsor, ... intervention?
We took the perspective of a reviewer or user of the trial publication who has the question "Which factor indicates that in this study an outcome change may exist?".However, to avoid confusion, we now use more neutral language and exchanged the terms with 'output variable' and 'input variables'.

RESULTS
L230. Consider expanding on 'changed to or from a secondary outcome'.

We changed the wording to "primary outcomes were added, dropped, changed to secondary outcomes, or secondary outcomes were turned into primary outcomes".

RESULTS
L230. Percentages are reported as integer values in the Results text, but to 2 decimal places in the Tables. Please make this consistent: report percentages in Tables as integers. It is unlikely that precision of a percentage to 2 decimal places is meaningful.
We changed all percentages with decimal places to integers, including supplementary tables and figure captions.

RESULTS
L259. I may have missed it, but the positive result for trials registered on ClinicalTrials.gov seems an unusual finding and does not seem to be sufficiently explained. Is this finding because most trials tend to be registered on this registry anyway?
We added more context to the Discussion section about this and state: "The observation that the frequency of outcome changes declined over time may be explained by recent discussion on clinical trial reporting practices. Regarding the type of registry, the association might be explained by better usability of the ClinicalTrials.gov platform, which might facilitate more frequent changes to registry entries."

Note: any qualitative difference in patterns in figures can be interpreted to have a meaning; otherwise, no difference is needed. So if differences are shown but there is no meaning, it leaves the impression that the figure is sloppy and rushed, not clear and informative.
We carefully followed the reviewer's advice and now give more explanation, aiming to make the figures standalone and self-explanatory. We also included an example (suggested by another reviewer) and updated our figure legend to reflect the suggested changes.

FIGURES
Thank you! We now completely revised these Figures, which we believe makes them easier to understand. We have also added a more thorough caption to these Figures and provide the reasoning for our number of 41 trials with invisible changes.

Reviewer 3
CRITICAL General
I found this difficult to follow, I'm afraid. On this reading, I was left suspecting that readers will think: "Is this a German problem?", "Is this an old problem?" and/or "Is this a real problem?" I didn't fully understand what had been done nor, often, why. The comments are all intended constructively and I hope they come across that way. Below are listed detailed comments.
Thank you for your very clear and constructive feedback! We have addressed all the comments to improve the context and clarity of the project. We have specifically highlighted that the problem is new (without claiming primacy, as suggested by the Editor), underlined the relevance, and explained in more detail why our sample is from Germany and our reasons to believe the results can generalize beyond this geographic region and time period.

103 General
I'm afraid I didn't get the importance of this work. What am I missing? Is the suggestion that people don't change outcome measures if they need to, or that they report better, or that registers or journals should take some action?

We have rephrased both our Abstract and our Introduction section to make the importance clearer.

General
I thought the terminology felt harsh and misleading, particularly "discrepancy" and "hidden". It's only discrepant if the paper differs from the most recent registry entry. It's only hidden if the change doesn't appear in the registry. (I spend most of my career researching how to improve the way we do clinical trials – there's lots of scope for improvement – so I was surprised by how the wording of this manuscript provoked a defensive reaction in me.) This requires a big rethink, including in Table 1.
We appreciate your very helpful feedback! In line with comments by other reviewers, we exchanged the word 'hidden' with 'invisible', and hope it comes across as less accusing (which was not our intention). Please note that 'discrepancy' is a common term for changes to primary outcomes, which is also used in other publications on the topic. Still, we now consistently use the word 'change'.
We also changed Table 1 to make it more understandable.

MODERATE Background
Why is it a problem to look through the history of changes? Isn't this the point of a history of changes?
The problem is that the history of changes is not easily accessible and readable. We have now further highlighted this in our Discussion and state: "At the same time, in their current form, historical versions of registry entries are not easily accessible and readable, which makes checking them tedious."

Throughout
I always understand that an "outcome" is how the patient does and an "outcome measure" is how you measure the outcome. An endpoint would be the same as an outcome measure. The use of "outcome" (as a shorthand for "outcome measure"?) here seemed confusing.
We fully understand. However, the shorter version (e.g., 'outcome switching' instead of 'outcome measure switching') is quite established in related literature and discussions on the topic (please see, e.g., the quotes from CONSORT that we report in the Introduction), and in our view it facilitates readability. Still, taking this comment into account, we now explain this at the beginning of the manuscript: "outcome measures (or 'outcomes', for short)".

Methods
:: Text ref :: "We based our analyses on two published datasets (from the IntoValue projects, 31,32) containing all interventional studies completed at German University Medical Centres between 2009 and 2017."
:: Comment :: Chosen for convenience rather than relevance?

Thank you for pointing us to this possible source of misunderstanding. We now better describe why we chose this sample, see comment #1 from the editor.

Methods
:: Text ref :: "The trials had been registered in either ClinicalTrials.gov or the Deutsches Register Klinischer Studien (DRKS), which is the WHO primary trial registry for Germany."
:: Comment :: How many were registered in both? I suspect quite a few. I would have liked the results broken down by registry: some have better support for updating entries than others.
In our dataset, 11 out of 1402 trials registered on ClinicalTrials.gov have a cross-registration in the DRKS (0.78%), and 4 out of 344 trials registered in the DRKS have a cross-registration on ClinicalTrials.gov (1.16%).

Methods
:: Text ref :: "We included any study that: 1) has a registry entry in either the ClinicalTrials.gov or the DRKS database, 2) was completed between 2009 and 2017 according to the trial status described as 'Completed', 'Unknown status', 'Terminated', or 'Suspended' (ClinicalTrials.gov), or 'Recruiting complete, follow-up complete', 'Recruiting stopped after recruiting started', or 'Recruiting suspended on temporary hold' (DRKS), 3) reported in the registry that a German University Medical Centre was involved (i.e., mentioned as responsible party, lead/primary sponsor, principal investigator, study chair, study director, facility, collaborator, or recruitment location; for definitions see 31,32), 4) has published results, i.e., a full-text publication of the trial results was found by the search methods described in the IntoValue protocols (31,32), 5) is registered as a randomised trial."
:: Comment :: First, are these joined together with "and" statements or "or" statements? I suspect the former, but it's not clear. Second, the selection criteria confuse "completing accrual" with "completing the trial". Which is actually meant?

Thank you for pointing this out! We mean "and" – we clarify this in the revision. We define trial completion according to the trial status (e.g., 'Completed', 'Unknown status', 'Terminated', or 'Suspended' in ClinicalTrials.gov).

Methods
:: Text ref :: "For feasibility, we drew a random sample of 300 publications, from which three reviewers (MRH, MH, SY) extracted primary outcomes."
:: Comment :: I accept you couldn't do all of them, but why 300?

Based on our experience in the previous rating of within-registry changes, 200 trials per reviewer (due to duplicate assessment) was the maximal feasible amount. This sample still represents 17% of
the total trials assessed.

Methods
:: Text ref :: "If the reviewers could not find the correct publication, they excluded the publication…"
:: Comment :: If it's not the right publication, how could discrepancy be judged?
Our work was based on a highly granular and large-scale dataset of both registry entries and accompanying results publications, which had been created in a previous project of our group. The results publications were identified by extensive manual searches. Still, there were some publications in the dataset which had been misclassified as results publications. In those very few cases, if no other results publication could be found by searching the trial title, the trials (n=8) were excluded. We modified the text to make this clearer (see also comment #80).

112 Results
:: Text ref :: "Of 1746 trials, 393 (23%) had an outcome discrepancy reported within the registry (Table 3; Figure 3), 142 trials (8%) had major discrepancies, i.e., a primary outcome was newly added, dropped, or changed to or from a secondary outcome."
:: Comment :: When? This is really important for the documented-history vs unreported-discrepancy distinction.
For this summary statistic, we counted every change after first patient inclusion up to the latest entry. Additional information on changes between the key timepoints is visualized in Supplementary Figure S1 (formerly: Figure 3) and Figure 3 (formerly: Figure 4) and is described in the modified main text: "Of 1746 trials, 393 (23%) had a primary outcome change between trial start and latest registry entry version, i.e., within registry entry versions between key trial milestones (Table 3; Figure S1). Minor changes were found in 318 (18%) trials. 142 trials (8%) had major changes, i.e., primary outcomes were added, dropped, changed to secondary outcomes, or secondary outcomes were turned into primary outcomes. Of these major changes, 66 (4%) happened between first patient inclusion (start date) and trial completion date, 49 (3%) between completion and publication date, and 36 (2%) between the publication date and the latest registry entry."

113 Results
Why look only at the results paper? Many trials will publish a methods paper too. The choices may be better documented in these papers (where there is space) than in results papers (where journals, particularly the well-read journals, are very restrictive in space).
We chose to assess results publications because current reporting guidelines such as CONSORT clearly state that deviations from the protocol should be reported in the results publication. Since outcome changes are among the most critical protocol deviations in trials, we agree with the CONSORT view. However, we also agree that methods papers or protocol papers may include such information, and we have now added this to the Discussion: "Second, we only assessed reporting of outcome changes in the main results publications of the respective trials. These changes could also have been published in a protocol or methods paper. We assessed the results publications as they have higher impact, and because current reporting guidelines such as CONSORT state that deviations from the protocol should be reported in the results publication."

Results :: Text ref :: "Out of 292 randomly selected trials among the 1746 trials, 120 (41%; 95% CI [35%, 47%]) had a discrepancy between the latest registry entry and the publication, of which 54 (18%; 95% CI [14%, 23%]) were major and 75 (26%; 95% CI [21%, 31%]) were minor discrepancies." :: Comment :: The 120/292 is the interesting part of this manuscript. This should be the focus. The rest reads a bit like noise to me. Broken down by registry would be good.
Thank you for highlighting this issue. We agree that the prevalence of outcome changes between the latest registry entry and the publication is an important finding. However, this problem is already known, whereas we believe that raising awareness of the prevalence of 'hidden' or 'invisible' changes is the finding that deserves the most focus in our manuscript. Results broken down by registry are in Supplementary Tables 1 and 2 but would not change the interpretation of our main findings.

Discussion :: Text ref :: "The ICMJE policy states that any changes to the registration should be explained by authors in the publication (4)." :: Comment :: Why "the publication" rather than "as part of the publications"? ICMJE aren't always right, of course. They don't enforce CONSORT, they are too strict on authors, and they got data sharing wrong, so I suspect "a" rather than "the" should be fine.
Thank you for highlighting this. We made that change.

Discussion :: Text ref :: "Still, some trials change their outcomes at two or even three different timepoints over the course of the trial." :: Comment :: A reminder of examples here would be good. I have been involved recently with some very long-term trials. During the course of the trials, external projects showed that earlier outcome measures could be used as a surrogate for the existing primary outcome measures, so the trial team (without reference to accumulating comparative data, after considering the implications for sample size and timelines, and after peer review) formally amended the outcome measures. This saved a couple of years and helped patients sooner. The rationale for the changes was documented in methodology papers, and these were referenced in the main results papers. These trials were registered both in ClinicalTrials.gov, which allows changes to be captured, and ISRCTN.com, which doesn't allow updates (or didn't at the time). How would these examples be judged here?
To make the practice of changing outcomes at multiple timepoints more understandable, we updated Figure 1 to include an example. Thank you for pointing this out! We would like to stress that we absolutely do not argue against outcome changes per se (we agree that they may well be necessary and have encountered such situations in our own trials as well). We only highlight the problem of undisclosed outcome changes. We fully acknowledge this aspect and have now added a short paragraph to the Conclusion section: "To conclude, primary outcome changes are very common in clinical research. Such changes are often, in one of four cases, 'invisible' in the trial registries and almost never reported in publications. While changes to primary outcomes are not unusual in clinical trials and may well be justified and sometimes even desirable, undisclosed changes are a problem." Regarding methods papers, please see comment #113.

Discussion :: Text ref :: "Interestingly, we found not only cases in which primary outcomes gained more detail, but also cases in which primary outcomes were described with less detail than before." :: Comment :: I don't really know what this means.
We clarified Table 2 (formerly: Table 1), which provides the terminology of outcome changes that we use, to resolve this issue.

Figure 1 :: Comment :: I didn't really understand this. Why does the blue box extend beyond the dotted line?
We have reworked Figure 1 to be clearer (see also comment #99 of Reviewer 2) and added a better description in the caption.

Figure 2 :: Comment :: This is not well presented. Sometimes a splitting line means the patients go down just one of the paths, and other times it means they go down both.
To make our flowchart more understandable, we combined everything into one single flow and added some descriptions, hoping that this makes the flowchart easier to follow.

Figure 3 :: Comment :: No scale, no definitions, no timelines. I completely don't understand this.
We extensively revised the figure (see also comment #49 from the Editor and comment #101 from Reviewer 2).

Table 2 :: Comment :: I would have liked a column for all trials (as shown), an extra one for the 300 samples, and another for the ones selected as odd.
We added an extra column to the table, containing the numbers for the random sample.

Table 3 :: Comment :: What is a post-publication discrepancy? What differs from what in those instances? Is this a change in the registry after a first results paper, in anticipation of another longer-term paper?
Thank you for this remark.We extended our explanation in the table captions.
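As a side note on the interval estimates quoted earlier in this letter (e.g., 120 of 292 trials, 41%, 95% CI [35%, 47%]): the manuscript text shown here does not state which confidence interval method was used, but a simple normal-approximation (Wald) interval reproduces the quoted bounds after rounding to whole percentages. A minimal sketch, in which the `wald_ci` helper and the 1.96 critical value are our own assumptions rather than anything taken from the manuscript:

```python
import math

def wald_ci(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% normal-approximation (Wald) CI for a proportion k/n."""
    p = k / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# The three proportions quoted from the Results section (n = 292):
for k, label in [(120, "any discrepancy"), (54, "major"), (75, "minor")]:
    lo, hi = wald_ci(k, 292)
    print(f"{label}: {k}/292 = {k/292:.0%}, 95% CI [{lo:.0%}, {hi:.0%}]")
```

A Wilson score interval would be a common alternative for proportions of this size and yields nearly identical bounds here; the sketch only illustrates that the quoted intervals are internally consistent.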

Supp Tables
The lack of totals makes these difficult to use. What are these p-values? And are they separate by row, rather than looking for patterns across predictors?
We have revised our tables, including some added detail on what the p-values mean.
The figure has been substantially revised following the suggestions by the Editors and Reviewers. We clarified the text in the figure to reflect that 'journal publication' refers to the registry entry version at the time of journal publication. Aiming to substantially improve clarity, we ensured that the figures are better harmonized, and we now closely align Figure 1 with Figure 3 (the former Figure 4) with regard to the colors, descriptions, and symbols used.

FIGURES Figure 3 :: Comment :: Figure 3 is striking but rather confusing. It details percentages but does not give a scale to determine the percentage. What do the different widths of the shaded grey areas and lines represent? Perhaps a key for the colored bars could be provided as a legend for clarity.

FIGURES Fig 2 :: Comment ::
* Why are the lower 3 boxes in grey but the others in black? Make consistent.
* In 'Excluded trials':
  * rephrase 'publication' with 'published'
  * why were time limits of 2 and 5 years chosen? Was this explained in the text?
  * by definition, a clinical trial that is not randomised or does not have an intervention can't be a 'trial'. Does this suggest that observational studies are registered on ClinicalTrials.gov? Would it be more accurate to use 'study' here?
  * 'samples' should be 'registries'?
* Also, expand the legend to provide details on the information in the boxes.

Thank you!
* We have now made all boxes black and have, in accordance with another reviewer's comments, changed the flowchart to one single stream.
* We rephrased the text.
* We have also clarified the text parts regarding our exclusion criteria.
* We agree that our wording was misleading. Yes, non-interventional and non-randomized studies seem to be registered both on ClinicalTrials.gov and DRKS. We made that clear and changed our wording to be more precise.
* 'samples': We meant trials that were part of both research projects underlying the dataset we used and thus duplicated in the combined dataset. We changed the wording to reflect this.
We also expanded the legend, as requested.

FIGURES Fig 3.
Very hard to follow. At present, the figure is so unintuitive and uninformative that it seems better to drop it than to keep it.
* Does the height of the bars have meaning? Are bars drawn to scale?

Thank you for your feedback! While we would generally be open to dropping the figure, we believe that it still conveys a lot of information; thus, we moved it to the supplement.

Thank you, we have indeed not reported this adequately. We have revised the penultimate paragraph of the Introduction, in which we put our research in context, and now explicitly state that we have done a systematic literature search. This also included some backward citation tracking. We have now uploaded our search filters and the results of the structured data extractions to the project's OSF repository and cite this accordingly (https://osf.io/za9rx/). In the overview paragraph, we describe that comparisons of registry data to publications are often based on unclear or varying (historical) versions of trial registrations. The manuscript now reads: "We systematically searched for studies that have assessed the frequency of outcome changes between registries and publications [7]. We identified several analyses which widely agree that this problem is very common, with a median estimated prevalence of 31% (interquartile range (IQR): 17-45%) of clinical trials affected [8]. However, most studies did not report which registry entry version they used [9-17], or reported that they only assessed changes between the publication and one single registry entry version (i.e., the latest available entry [18-26], the latest entry before trial completion [27], the first available entry [28], or the first entry during the active phase [29])."

INTRODUCTION :: Comment :: Please remove the phrase 'cherry-picking'.

We removed the phrase and replaced it with 'selective reporting of only statistically significant results'.

ABSTRACT L64 :: Comment :: The first sentence also seems odd: it is worded as "to assess … discrepancies within registry records …" but does not state what the registry records are discrepant to. Please refine.

INTRODUCTION L98 :: Comment :: The word 'critical' is mentioned 3 times in the first 6 lines, which seems an overuse of the word and dilutes its seriousness. Please rephrase to keep the emphasis on what is important and de-emphasise less important points.
We determined changes of primary outcomes (1) between different versions of registry entries at key trial milestones, (2) between the latest registry entry version and the results publication, and …".

INTRODUCTION L122-124 :: Comment :: The flow between the parts of this sentence is not clear: 'most studies published to date did not XXX, assessed XXX, or the latest entry ...' does not make sense. Please fix.