I have read the journal’s policy and the authors of this manuscript have the following competing interests: Dr. Malcolm Macleod received a grant from the NC3Rs as salary support; Dr. Manoj Lalu reports support from The Ottawa Hospital Anesthesia Alternate Funds Association and from a University of Ottawa Junior Research Chair in Innovative Translational Research during the conduct of the study; and Dr. Emily Sena reports support from Stroke Association (SAL-SNC 18\1003) during the conduct of the study.
In an effort to better utilize published evidence from animal experiments, systematic reviews of preclinical studies are becoming increasingly common, along with the methods and tools to appraise them (e.g., the SYstematic Review Center for Laboratory animal Experimentation [SYRCLE] risk of bias tool). We performed a cross-sectional study of a sample of recent preclinical systematic reviews (2015–2018), examined a range of epidemiological characteristics, and used a 46-item checklist to assess reporting details. We identified 442 reviews published across 43 countries in 23 different disease domains that used 26 animal species. Reporting of key details to ensure transparency and reproducibility was inconsistent across reviews and within article sections. Items were most completely reported in the title, introduction, and results sections of the reviews, and least completely reported in the methods and discussion sections. Less than half of the reviews reported that a risk of bias assessment for internal and external validity was undertaken, and none reported methods for evaluating construct validity. Our results demonstrate that a considerable number of preclinical systematic reviews investigating diverse topics have been conducted; however, their quality of reporting is inconsistent. Our study provides the justification and evidence to inform the development of guidelines for conducting and reporting preclinical systematic reviews.
A cross-sectional study of a sample of recent preclinical systematic reviews reveals deficiencies in reporting and provides the justification and evidence to inform the development of specific guidelines for conducting and reporting preclinical systematic reviews.
Systematic reviews and meta-analyses are essential tools for synthesizing evidence in a transparent and reproducible manner [
The popularity of preclinical systematic reviews has been growing over the past decade. Groups such as the Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies (CAMARADES,
The prevalence and state of reporting of preclinical systematic reviews has not been formally evaluated since 2014 (by Mueller and colleagues) [
The protocol for this study was posted on Open Science Framework (
All preclinical systematic reviews that investigated interventions using
We included systematic reviews that met at least 3 of the 4 following statements according to the 2009 PRISMA statement [
We defined preclinical as research investigating medically relevant interventions conducted using nonhuman models, with the intention of progressing to testing efficacy in human participants prior to approval. Models were limited to
A comprehensive literature search was developed and conducted in conjunction with an information specialist. We searched MEDLINE via Ovid, Embase via Ovid, and Toxline for preclinical systematic reviews of
The literature search results were uploaded to Distiller Systematic Review Software (DistillerSR, Evidence Partners, Ottawa, Canada). DistillerSR is cloud-based software that manages references and provides customized reports for accurate review. Titles, abstracts, and full text were screened for inclusion by 2 independent reviewers using the eligibility criteria outlined above. Disagreements were resolved through consensus or by a third party, if necessary. Where titles and abstracts appeared to meet the inclusion criteria or where there was uncertainty, we reviewed the full text. Prior to the formal full-text screening, a calibration test of 13 systematic reviews was performed to refine the screening form to ensure no misinterpretation of the eligibility criteria. Inter-rater agreement was assessed (Cohen’s kappa coefficient). The reasons for article exclusion at the full-text level were recorded. The study selection process was documented using the 2009 PRISMA flow diagram.
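Inter-rater agreement of the kind described above is commonly quantified with Cohen's kappa, which compares the observed agreement between two screeners against the agreement expected by chance. A minimal sketch follows; the screening decisions shown are invented for illustration and are not data from this study.

```python
# Hypothetical sketch: Cohen's kappa for two screeners' include/exclude calls.
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the two raters judged independently
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)


# Illustrative decisions from two independent screeners (not real data)
a = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
b = ["include", "exclude", "include", "include", "exclude", "exclude"]
kappa = cohens_kappa(a, b)
```

Values near 1 indicate near-perfect agreement, values near 0 indicate agreement no better than chance; the thresholds used to interpret kappa during screening calibration are a matter of convention.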
After identifying all eligible preclinical systematic reviews, we extracted data in duplicate with conflicts resolved by consensus or a third party. Prior to the formal data extraction, a pilot test on 13 reviews was performed to refine the data form and to ensure a high level of inter-rater agreement. The extracted study characteristics included details about the publication (corresponding author’s name, their contact information, the country their institution was located in, and publication year), the animal species investigated, and the disease domain being investigated (e.g., cardiovascular disease). We extracted the number of
We next evaluated the quality of reporting in a random sample of 25% of the included systematic reviews. This sample size was chosen based on available resources. Reviews were selected using the embedded randomize function in DistillerSR. Our aim was to assess the reporting in preclinical systematic reviews; thus, we only selected studies in which the majority of data were derived from preclinical
To create the checklist for the reporting assessment, we consulted PRISMA 2009 [
Items were arranged by the following manuscript sections: title, introduction, methods, results, discussion, and other. We did not evaluate items that are specific to review abstracts, as guidelines for abstracts vary substantially by journal. Items were assessed in each review as being reported (“yes”) or not (“no”), or as not applicable to the review (“NA”). If a review did not contain quantitative data (i.e., no meta-analysis), “NA” was selected for all items relating to quantitative data/analysis (e.g., methods for extracting numerical data from reports). One exception was an assessment of the review’s main objective/question (which would ideally be presented in a population, intervention, comparison, outcome [PICO] format). This could be scored as a “yes,” “no,” or “partial,” where “partial” represented some, but not all, relevant PICO items of the question being stated. This checklist was piloted and refined by 2 independent reviewers to improve its utility in practical application before using it to assess systematic review reporting. The checklist can be found in
The collected data are presented using descriptive statistics (total counts, medians, and ranges), as well as narratively when appropriate.
Our searches identified a total of 2,356 records (2015 to 2018, inclusive,
PRISMA, Preferred Reporting Items for Systematic reviews and Meta-Analyses.
Twenty-seven percent of reviews were published in 2015, 15% in 2016, 20% in 2017, and 38% were published in 2018. Corresponding authors resided in 43 different countries (
No systematic reviews were published in the countries with gray coloring. The map was created using Tableau software.
| Category | Characteristic | Number (%), n = 442 |
|---|---|---|
| Year of publication | 2015 | 118 (27) |
| | 2016 | 66 (15) |
| | 2017 | 90 (20) |
| | 2018 | 168 (38) |
| Source of funding | Government | 163 (37) |
| | Academia | 78 (18) |
| | Foundation/charity | 73 (17) |
| | Pharmaceutical company | 13 (3) |
| | Hospital | 10 (2) |
| | Unfunded | 86 (20) |
| | Not reported | 108 (24) |
| Number of funding sources* | 1 | 174 (70) |
| | 2 | 55 (22) |
| | 3 | 15 (6) |
| | 4 | 4 (2) |
| Number of included publications | <10 | 87 (20) |
| | 10–100 | 318 (72) |
| | 100–300 | 28 (6) |
| | >300 | 4 (1) |
| | Not reported | 5 (1) |
*Percent calculation out of the number of funded reviews that reported the source (
The median number of primary publications included in the preclinical systematic reviews was 24 (range: 1 to 1,342). Eighty-two (19%) reviews contained data from both human studies (clinical trials and case studies) and
The structure of the pyramid and the positioning of each species/class/family reflect their frequency and do not reflect hierarchy. Values represent frequency (%). The figure was created using non-copyrighted biological silhouettes retrieved from
The preclinical systematic reviews investigated 23 different disease domains (
| Category | Characteristic | Number (%), n = 442 |
|---|---|---|
| Type of disease domain | Musculoskeletal system and connective tissue | 74 (17) |
| | Nervous system | 70 (16) |
| | Cardiovascular system | 66 (15) |
| | Endocrine, nutritional, and metabolic diseases | 42 (10) |
| | Cancer | 38 (9) |
| | Toxicology | 38 (9) |
| | Mental and behavior | 37 (8) |
| | Genitourinary system | 24 (5) |
| | Skin and subcutaneous tissue | 20 (5) |
| | Digestive system | 20 (5) |
| | Critical illness | 18 (4) |
| | Infectious and parasitic diseases | 17 (4) |
| | Respiratory system | 13 (3) |
| | Pain and analgesia | 13 (3) |
| | General and whole-body health | 10 (3) |
| | Conditions originating in the perinatal period | 9 (2) |
| | Pharmacokinetic, biological activity, and dose–response | 5 (1) |
| | Blood and immune disorders | 6 (1) |
| | Eye | 6 (1) |
| | Mouth | 6 (1) |
| | Congenital malformations | 5 (1) |
| | Surgery and imaging techniques | 4 (0.9) |
| | Auditory system | 1 (0.2) |
| Number of disease domains per review | 1 | 363 (82) |
| | 2 | 69 (16) |
| | 3 | 7 (2) |
| | >3 | 3 (0.7) |
Two-hundred and thirty-nine (54%) reviews reported pharmacological interventions, and 203 (46%) reported non-pharmacological interventions. Pharmacological interventions included substances like synthetic drugs, vaccines, and organic molecules. Within the 203 reviews that had a non-pharmacological intervention, 46 (10%) were cell therapies, 44 (10%) were surgery or invasive procedures, and 21 (5%) were medical physics interventions (e.g., ultrasound therapies) (
| Intervention, Number (%) of n = 442 | Subgroup | Number (%) of n = 442 |
|---|---|---|
| Pharmacological, 239 (54) | NA | |
| Non-pharmacological, 203 (46) | Cell therapy | 46 (10) |
| | Surgery or invasive procedures | 44 (10) |
| | Medical physics | 21 (5) |
| | Dietary interventions | 13 (3) |
| | Blood transfusions or modifications | 11 (3) |
| | Animal model validation | 10 (2) |
| | Tactile stimulus interventions | 10 (2) |
| | Exercise and physical activity | 10 (2) |
| | Oxygen therapy | 7 (2) |
| | Gene therapy | 7 (2) |
| | Other | 24 (5) |
To assess the completeness of reporting within preclinical systematic reviews, we selected a random sample of 110 articles (25% of the 442 identified reviews): 64 of which evaluated pharmacological interventions and 46 evaluated non-pharmacological interventions (
Many (92; 84%) of the reviews indicated that the report was a “systematic review” in the title, while approximately half (54; 49%) reported that the review contained animal experiments in the title. Forty-five (41%) reviews reported both of these elements in the title. Within the introduction, most reviews described the human disease or health condition being modeled (104; 95%) and described the biological rationale for testing the intervention (106; 96%). Eighty (73%) reviews explicitly stated the review question(s) addressed (
PICO, population, intervention, comparison, outcome.
Twenty-two reviews (20%) reported a protocol had been developed
In the methods section, 76% of reviews reported a full or a representative search strategy, and 72% described the study screening/selection process—of which 18 reviews (10%) reported the platform used to screen. Roughly two-thirds (62%) of reviews stated the number of independent screeners, while less than half (44%) reported the number of reviewers extracting data. Half of the reviews (49%) reported the methods and tool to measure study quality/risk of bias, while no (0%) reviews described methods for assessing construct validity (i.e., potential relevance to human health) [
CAMARADES, Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies; DistillerSR, Distiller Systematic Review Software; SYRCLE, SYstematic Review Center for Laboratory animal Experimentation.
In the results section, almost all (106; 96%) of the sampled reviews reported the number of included studies/publications, and 44% reported the number of independent experiments included in the analysis. The majority of reviews (86%) included a flow diagram of the study selection process, and details such as study characteristics, animal species, and animal models were generally well reported. For quality assessment measures, less than half (46%) reported the results of a risk of bias assessment, and 25% reported the results of assessing publication bias or that this assessment was not possible/done (
PRISMA, Preferred Reporting Items for Systematic reviews and Meta-Analyses.
Within the discussion section, a minority of reviews (31%) discussed the impact of the risk of bias of the primary studies. Sixty-five percent of reviews discussed the limitations of the primary studies and the conclusions that can be drawn from them, while 56% discussed the limitations of the review itself. Twenty-one (19%) reviews reported on data sharing (
Of the 110 reviews, 44 (40%) performed a quantitative analysis. These 44 quantitative reviews investigated 17 of the 23 disease domains, most commonly cardiovascular system disorders (9 reviews; 20%), followed by musculoskeletal and connective tissue disorders and nervous system disorders (7; 16% each). Twenty-five (60%) quantitative reviews investigated pharmacological interventions, and 19 (40%) investigated non-pharmacological interventions. Characteristics of the 44 quantitative reviews are found in
The following reporting items were specific to the quantitative reviews; for reviews that did not perform a quantitative analysis, these items were scored “NA” (as described in the methods). Twenty-two (50% of 44) quantitative reviews described methods for extracting numerical data from primary studies (e.g., how data were extracted from graphical format, which is common in preclinical experimental studies). The majority (40; 95%) of quantitative reviews reported the methods for synthesizing the effect measure and methods for assessing statistical heterogeneity between studies. Fourteen reviews (32% of 44) reported methods for any data transformation needed to make extracted data suitable for analysis. Sixteen reviews (36% of 44) reported methods for handling shared control groups, and 13 (30% of 44) described methods for handling effect sizes over multiple time points, 2 common features of preclinical experimental studies. Of the 35 reviews that reported a subgroup or sensitivity analysis in the results, 33 reported the methods for these analyses. Within the results section, the confidence intervals of outcomes and a measure of heterogeneity were reported by 88% and 84% of quantitative reviews, respectively, while 29% of reviews in the sample reported the results of a subgroup or sensitivity analysis (
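For readers less familiar with the quantitative machinery these items refer to, the following minimal sketch illustrates an inverse-variance fixed-effect pooled estimate together with Cochran's Q and the I² heterogeneity statistic that most of the sampled reviews reported. The effect sizes and standard errors below are invented for illustration and do not come from any included review.

```python
# Hypothetical sketch: inverse-variance fixed-effect pooling with Cochran's Q
# and I-squared; inputs are illustrative, not data from this study.
import math


def pool_fixed(effects, ses):
    """Pool per-study effects weighted by inverse variance."""
    weights = [1 / se**2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: percentage of variability beyond what chance would produce
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # 95% confidence interval under a normal approximation
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, i2


effects = [0.42, 0.30, 0.55]  # illustrative standardized mean differences
ses = [0.10, 0.15, 0.12]      # illustrative standard errors
pooled, ci, i2 = pool_fixed(effects, ses)
```

Handling shared control groups and repeated time points, the two preclinical-specific complications noted above, requires adjustments (e.g., splitting the control group's sample size across comparisons) that this sketch deliberately omits.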
Of the 110 reviews, 46 (42%) explicitly mentioned following/using a reporting guideline or provided a completed reporting checklist. Forty-five of these reported using the PRISMA 2009 statement, and one used the Meta-analyses Of Observational Studies in Epidemiology (MOOSE) checklist.
This review provides a comprehensive characterization of preclinical systematic reviews and an evaluation of their reporting. Our results suggest that systematic review methodology is being applied to a diverse range of preclinical topics and that production of these reviews is increasing. Compared to the last assessment of preclinical systematic reviews, published in 2014, the number of preclinical systematic reviews has nearly doubled in just 4 years. However, we identified that the reporting of methodology and results is suboptimal. Without complete and transparent reporting of methods, it is likely not possible to gauge the trustworthiness of the results, a major limitation for any research project.
Established guidelines exist for both reporting
Encouragingly, some of the reporting items assessed in previous reporting assessments have improved in our evaluation. Mueller and colleagues [
The improvements we see in our review provide evidence that these initiatives may have contributed to better reporting quality in preclinical systematic reviews; however, significant deficits still exist, with half of the sampled reviews failing to report a risk of bias assessment. In addition, items that are unique to preclinical systematic reviews (e.g., construct validity [
In addition to the development of reporting guidelines, other initiatives must be considered to improve the state of reporting and quality of preclinical systematic reviews. Journals, funders, and reviewers could contribute to this improvement by advocating and appealing for data sharing, registration of protocols
A strength of this study is our use of a sensitive search strategy to identify systematic reviews of
A potential limitation is the timeframe of our sample, as we included reviews from 2015 to 2018 inclusively. Although we chose this sample to capture the state of reporting in preclinical systematic reviews after the previous assessment in 2014, we acknowledge that the state of reporting may have changed from 2018; however, it is important to note that no major initiatives to address systematic review reporting have occurred since that time.
Additionally, to assess the state of reporting, we selected a sample of 25% of the identified reviews. Moreover, we applied eligibility criteria for inclusion in the reporting assessment with the aim of ensuring the reviews were predominantly preclinical
In addition, some of the items in our reporting assessment may not be generalizable to other forms of preclinical systematic reviews (e.g., those not focused on therapeutic interventions).
Our results show that the number of preclinical systematic reviews has continued to increase since a previous assessment published in 2014. Although reporting quality shows some signs of improvement, significant room for improvement remains. This echoes the past state of reporting within clinical systematic reviews, where historically poor reporting hampered their quality and potentially limited their utility. To address insufficient reporting and improve transparency in clinical reviews, the PRISMA statement was developed. Although observational studies suggest that the adoption of PRISMA has improved the reporting of systematic reviews, it does not fully accommodate reviews of preclinical animal experiments, likely owing to the unique differences between clinical and animal research. Our results provide the rationale for a preclinical extension of PRISMA and highlight the areas of most deficient reporting; they will inform the development of a PRISMA preclinical systematic review extension.
We thank Drs. Olavo Amaral, Carlijn Hooijmans, Bin Ma, and Daniele Wikoff for their co-leadership on the larger research program to generate a PRISMA Extension for Preclinical
Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies
Distiller Systematic Review Software
Meta-analyses Of Observational Studies in Epidemiology
population, intervention, comparison, outcome
Preferred Reporting Items for Systematic reviews and Meta-Analyses
SYstematic Review Center for Laboratory animal Experimentation
Dear Dr Lalu,
Thank you for submitting your manuscript entitled "Epidemiology and Reporting Characteristics of Preclinical Systematic Reviews" for consideration as a Meta-Research Article by PLOS Biology.
Your manuscript has now been evaluated by the PLOS Biology editorial staff, as well as by an academic editor with relevant expertise, and I'm writing to let you know that we would like to send your submission out for external peer review. Please accept my apologies for the delay in providing you with an initial decision.
However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.
Please re-submit your manuscript within two working days, i.e. by Oct 28 2020 11:59PM.
Login to Editorial Manager here:
During resubmission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit
Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.
Given the disruptions resulting from the ongoing COVID-19 pandemic, please expect delays in the editorial process. We apologise in advance for any inconvenience caused and will do our best to minimize impact as far as possible.
Feel free to email us at
Kind regards,
Roli Roberts
Roland G Roberts, PhD,
Senior Editor
PLOS Biology
Dear Dr Lalu,
Thank you very much for submitting your manuscript "Epidemiology and Reporting Characteristics of Preclinical Systematic Reviews" for consideration as a Research Article at PLOS Biology. Your manuscript has been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by three independent reviewers.
You'll see that the reviewers are broadly positive about your study, but between them they raise a number of concerns about possible coding artefacts, the need to assess trends over time, more discussion of remedies, clarifications of the motivation and some choices made, and the need for an update to the data (we note that the search date is nearly two years ago). These issues will need to be addressed for further consideration.
In light of the reviews (below), we will not be able to accept the current version of the manuscript, but we would welcome re-submission of a much-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent for further evaluation by the reviewers.
We expect to receive your revised manuscript within 3 months.
Please email us (
**IMPORTANT - SUBMITTING YOUR REVISION**
Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:
1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.
*NOTE: In your point by point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually, point by point.
You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.
2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Related" file type.
*Re-submission Checklist*
When you are ready to resubmit your revised manuscript, please refer to this re-submission checklist:
To submit a revised version of your manuscript, please go to
Please make sure to read the following important policies and guidelines while preparing your revision:
*Published Peer Review*
Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:
*PLOS Data Policy*
Please note that as a condition of publication PLOS' data policy (
*Blot and Gel Data Policy*
We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data:
*Protocols deposition*
To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see:
Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.
Sincerely,
Roli Roberts
Roland G Roberts, PhD,
Senior Editor,
PLOS Biology
*****************************************************
REVIEWERS' COMMENTS:
Reviewer #1:
[identifies himself as Kieron Rooney]
Edits:
Typographical:
Line 120: delete "was defined" which is repetitive
Lines 199 - 205 and figure 1. There are distinct differences in the numbers reported in the text and figure that need to be checked and corrected. For example - line 199 states 1549 papers were excluded, however figure 1 identifies 1585; line 200 states 807 papers were retrieved but the figure says 771; line 201 stated 325 papers were excluded but figure 1 states 329
Line 209 states 14% table 1 says 15%
Line 229 states 21 different disease domains; Table 2 identifies 23
Editorial:
Lines 146 - 148 and elsewhere for example line 168. When describing how the articles were screened and excluded etc it would be good to specifically identify which authors participated in the various tasks rather than just say "two authors …" or "we".
Line 236 and 237: the use of the word "had" seems so lax and "reported" would be more accurate representation.
Lines 238 - 241 represent the relative % of occurrence to the 203 non-pharmacological sub-set yet table 3 is relative to the total 442 excluded. I don't really care which is used, but only one reference point should be used for consistency.
Figure 4: I think the relative number of the final 110 that are pharmacological and non-pharmacological should be identified.
Line 337: the sentence that ends at "increase" on line 376 is broken by the inclusion of "compared to a previous review published in 2014". I think this first half of line 337 could be deleted.
Specific comments requiring response:
Lines 266-267 present a result in a different manner / context in which the checklist was implemented and this raises a concern regarding the process in which multiple authors were stated as implementing the assessment with consensus or third review. For example, we are given a result here of "eligible intervention timing (prevention vs rescue)" Yet in the checklist the qualifier "prevention vs rescue" is in regard to "intended goal of the intervention". I am concerned then that assessors of the articles may have scored this item differently as a result of inconsistent procedure. Can the authors please confirm if item 12 was assessed consistently between authors AND was the "prevention vs rescue" element assessed regarding whether it appeared within the context of intervention timing or intended goal of the intervention. Finally, can the authors also confirm how the potential for such a discrepancy was confirmed prior to submission.
Checklist item 6 states to assess only whether or not a protocol was registered a priori however the results on lines 261-262 identify that it was also identified if a database was indicated. In this instance then item 6 should be updated as identifying 1) was a protocol registered and 2) is the database or ID provided so as to better represent how it must have been implemented by the authors here.
Checklist item 20 - construct validity is identified as being described 0% in all studies. However, I wonder if this is simply an artefact of the checklist placing this item in the methods section. As an author of pre-clinical SRs reporting on interventions related to human clinical conditions, this is an item my colleagues and I have referred to in the discussion in the absence of a strong tool for assessing construct validity. The authors of this manuscript identify in the discussion that this concept is an emerging area of assessment. I wonder if the result presented here was re-assessed with a critical eye over attempts by authors of the SR to comment on construct validity in the discussion would identify a different outcome to be reported in this manuscript? Could the authors comment on why that item was placed only in the methods, and whether or not - even if the SRs reviewed included comment on construct validity in discussion - they scored 0 and whether or not reviewing the discussion sections of the included articles now and reporting on item 20 as an alternate discussion item could be a worthwhile adventure.
Lines 298-312 and figure 5E identify that the checklist items presented as supplementary 4 are a conflated list of two checklist - there are some questions apparently only relevant to SRs with quantitative analysis. This should have been more explicitly articulated in the method section and then also in the checklist items that are only relevant to some SRs but not all should be clearly identified. If this work is (as the authors seem to hope) to inform a PRISMA extension, then it is vital that any potentially "irrelevant" questions are clearly articulated.
Lines 298-312 sub-group analysis: I would like to at least know how many of the 42 reviews now included in a sub-analysis implemented a pharmacological or non-pharmacological intervention. The potential need for a re-analysis / updated presentation of the results in this section then may be needed or at least commented on if the authors suspect their results here may have been impacted by the distribution across those interventions. Further, an identification of the disease domains for the 110 and 42 should be provided so one can tell if the results presented may have been impacted by a non-representative sub-group relative to the original 442.
Checklist item ?? - I do not have a number to cross reference to as this item does not yet exist, but I'd have appreciated a specific question that asks whether or not the quantitative analysis in SRs grouped species or performed sub-group analyses as the contribution of species differences can often be ignored in analyses of heterogeneity.
Lines 364 - 375 strengths and limitations and, lines 121-122, and the general issue of including an item on construct validity. I have a concern regarding how the determination of the purpose of the SRs included was performed. It is not clear, if the authors of the original SR had to explicitly state they performed the SR for the purpose of informing construct validity / pre-clinical scope / translation to humans OR if it was the authors of this manuscript that inferred that purpose on the SR. For example, many pre-clinical SRs of interventions that would have met this inclusion criteria have been conducted NOT for the explicit purpose of informing human clinical practice, but rather to synthesise the current existing evidence to inform a future study design - best model, most appropriate species, etc. I would like to know if the authors of this manuscript specifically assessed if the originating SR authors explicitly stated an intent that the SR would enhance or inform translation to humans, or if that was inferred by the current authors. If it was simply assumed by the current authors then this should be included as a limitation of the analysis since it is possible that within the 75% of SRs not assessed there are better conducted SRs as their purpose was different.
Reviewer #2:
[identifies herself as Miranda W. Langendam]
The authors of this manuscript performed a cross sectional study of a sample of recent preclinical systematic reviews and examined characteristics of the systematic reviews, including reporting characteristics. To describe the reporting characteristics, they developed and used a 46-item checklist. The manuscript is informative and well-written, and addresses an important topic in preclinical research.
Some aspects of the manuscript puzzle me, however, and I have some concerns.
First, from the Introduction it is unclear what the rationale is for a preclinical PRISMA extension. Please describe in what respects preclinical systematic reviews differ from clinical systematic reviews, and why these differences require additional reporting items.
Second, the authors developed a 46-item reporting checklist based on the PRISMA 2009 checklist. Please report what the additional items were. How was it decided to include these items, and what was the methodological process?
Overall it is unclear if the 46-item checklist should be seen as the PRISMA extension, or how the presented checklist will be developed further, making the results in the current manuscript preliminary. The protocol suggests a large research project, how does this manuscript fit in the larger project?
Third, did the authors analyze trends over time in the characteristics, and if so, what were the results? Is there any improvement? Did they analyze the associations between the general characteristics (for example the type of intervention) and reporting?
Fourth, the authors draw too strong a conclusion. The authors conclude that although a considerable number of preclinical systematic reviews have been conducted, their quality and rigour are inconsistent. The authors describe only whether some quality-related characteristics, for example risk of bias assessment, were reported. They did not assess whether the correct risk of bias instrument was used and applied in an appropriate way. In other words, reporting, not methodological quality, was assessed.
Minor concerns:
Line 330: Many of the included reviews reported that they followed systematic review reporting guidelines: what guidance were they referring to?
The search date is March 2019, please consider an update of the search.
Non-English systematic reviews are excluded. Please elaborate if this could cause bias, as we know that many non-English systematic reviews are published.
In the manuscript there is no reference to S4, the checklist.
Line 120: typo, please delete 'was defined'.
Regarding the title: it may be me, but I find the term 'epidemiology' a bit odd when the manuscript is about characteristics of systematic reviews.
Reviewer #3:
[identifies himself as David Mellor]
The authors provide a useful snapshot of reporting quality of a population of preclinical systematic reviews. This information is vital to the community of researchers who are monitoring such changes over time. Below I provide some recommendations for modest improvements and hope to see this work published in the near future.
An important improvement to this paper would be to more fully compare it to previous work (notably, Mueller et al 2014) and to evaluate items that have improved, not changed, or perhaps decreased in the past 6 years. You begin to do this in lines 340-344 by comparing 3 items between these two time periods, but I think a more comprehensive comparison would be very helpful to users of this information. Ideally, a table that compares the items included by you, by Mueller, and the rates of each item in common would be helpful. I did not make a side-by-side comparison between the two checklists, but this would have the further benefit of making such a comparison possible and would provide the opportunity for you to explain any differences between the two. This comparison would make later suggestions and next directions more meaningful.
The authors note four possible reasons for the relative improvement in the past 6+ years (again, I would like to see a better comparison over time to really see if there is much change): increased funding, leadership provided by CAMARADES and SYRCLE, establishment of PROSPERO, and journal endorsement of PROSPERO. I think that those are all reasonable explanations, but is there any more explanation possible? Without a good sense of the magnitude of the trend, it is hard to ascertain how much explanation is really needed or expected, but if the authors provide more comparison and find substantial change, some more explanation of the possible causes would be helpful. Even if that is not possible, the authors could acknowledge that.
I believe the article would be improved with stronger assertions on steps that authors, funders, reviewers, and journals should take to further improve the situation. Some possibilities could include journal requirements for systematic reviews to more fully report details (journal mandates are sometimes effective at changing behaviors such as data sharing).
Data availability: The authors provide their complete search strategy, checklist, and protocol. However, I was not able to easily find any raw, paper-level data. Even if the authors removed the identifying information from the file (which I don't think would be necessary, but if they want to prevent any single paper from being identified as "worst" then perhaps they could do so), such a file could be helpful. For example, correlations between excluded items could reveal meaningful patterns for future work. If the authors did provide it and I am missing it, please forgive me, although I would recommend adding it to the shared drive along with a code book or README file to explain any odd variables.
I regularly sign my reviews. I hope that the provided comments are helpful to the authors.
Sincerely, David Mellor, Director of Policy Initiatives, Center for Open Science
Minor comments
Line 99: "The prevalence and quality of preclinical systematic reviews has not been formally evaluated since 2014." It is unclear if the authors are referring to Mueller et al 2014.
Line 106: It would probably be helpful to point directly to the protocol.
Line 120: Typo: "We defined preclinical was defined as research…."
Line 123: Please justify the exclusion of in vitro and ex vivo studies (presumably this is to make the work comparable with Mueller et al 2014, but please state this).
Line 162: "We next evaluated the quality of reporting in a random sample of 25% included systematic reviews. This sample size was chosen based on available resources." It would be helpful for other researchers to provide information on time required to score the 110 articles (either here or around line 246). Not providing this information would be acceptable (perhaps time cannot be reasonably estimated if it was not tracked during the screening), as it is a bit beyond the scope of normal reporting expectations, but would nonetheless be a useful piece of information for others.
Line 225, Figure 3: It's unclear why this information is provided in a pyramid shape. The shape suggests more meaning than I think is intended by the authors. Of course, rat is the most common animal, thus at the top of the pyramid, but is not necessarily in a different class or tier than mouse, with nearly as many animals. I think the visual of the animal outlines is a nice way to summarize the information but rearranging into a circle or square would remove any suggestion that the levels are meaningful (beyond simple frequency).
Line 261: "Twenty-two reviews (20%) reported a protocol had been developed a priori…"
Dear Dr Lalu,
On behalf of my colleagues and the Academic Editor, Lisa Bero, I'm pleased to say that we can in principle offer to publish your Meta-Research Article "Epidemiology and Reporting Characteristics of Preclinical Systematic Reviews" in PLOS Biology, provided you address any remaining formatting and reporting issues. These will be detailed in an email that will follow this letter and that you will usually receive within 2-3 business days, during which time no action is required from you. Please note that we will not be able to formally accept your manuscript and schedule it for publication until you have made the required changes.
Please take a minute to log into Editorial Manager at
PRESS: We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with
We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit
Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Biology.
Sincerely,
Roli Roberts
Roland G Roberts, PhD
Senior Editor
PLOS Biology