The predictive validity of peer review at the National Institutes of Health (NIH) has not yet been demonstrated empirically. It might be assumed that the most efficient and expedient test of the predictive validity of NIH peer review would be an examination of the correlation between percentile scores from peer review and bibliometric indices of the publications produced from funded projects. The present study used a large dataset to examine the rationale for such a study and to determine whether it would satisfy the requirements for a test of predictive validity. The results show significant restriction of range in the applications selected for funding. Furthermore, those few applications that are funded despite slightly worse peer review scores are neither selected at random nor representative of other applications in the same range. The funding institutes also negotiate with applicants to address issues identified during peer review. Therefore, the peer review scores assigned to the submitted applications, especially for those few funded applications with slightly worse peer review scores, do not reflect the changed and improved projects that are eventually funded. In addition, citation metrics by themselves are not valid or appropriate measures of scientific impact. Using bibliometric indices on their own to measure scientific impact would likely increase the inefficiencies and problems with replicability already largely attributed to the current over-emphasis on those indices. Therefore, retrospective analyses of the correlation between percentile scores from peer review and bibliometric indices of the publications resulting from funded grant applications are not valid tests of the predictive validity of peer review at the NIH.
Citation: Lindner MD, Nakamura RK (2015) Examining the Predictive Validity of NIH Peer Review Scores. PLoS ONE 10(6): e0126938. https://doi.org/10.1371/journal.pone.0126938
Academic Editor: Neil R. Smalheiser, University of Illinois-Chicago, UNITED STATES
Received: February 23, 2015; Accepted: March 30, 2015; Published: June 3, 2015
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Data Availability: All relevant, de-identified data are uploaded to the NIH web site for the Center for Scientific Review at the following URL: http://public.csr.nih.gov/aboutcsr/NewsAndPublications/News/Documents/DEIDENTIFIEDDATASETUSINGZSCORES.XLSX.
Funding: The authors have no support or funding to report.
Competing interests: The authors have declared that no competing interests exist.
The National Institutes of Health (NIH) supports the training, development and research of more than 300,000 full-time scientists working throughout the United States, which stimulates economic activity, produces new businesses and products, and advances basic and clinical biomedical research [1,2]. The NIH has supported the majority of landmark studies that led to Nobel prizes, and more than 67% of all citations in scientific articles refer to studies funded primarily by the NIH. In addition, every $1 of NIH funding produces $2.2–$2.6 or more in economic activity [1,2,5], more than half of all the studies cited in patents filed in the biomedical field were funded primarily by the NIH [4,6,7], and many of the new drugs and biologics produced are based on research funded by the NIH [8,9], all of which contribute to significant increases in the length and quality of life [10,11].
Since the end of WWII, the NIH has largely relied on a peer review process to allocate extramural funds: scientists submit applications for funding, and other scientists who are active in the same fields—their peers—are recruited by the NIH to review those applications and determine the quality, scientific merit and potential impact of the research described in those applications. Despite clear historical evidence that NIH-funded research has produced a wide range of significant and beneficial effects, the scientific community has become increasingly critical of the peer review process that the NIH relies on to allocate funds [13–15], and in fact, the predictive validity of peer review has not yet been empirically demonstrated.
Perhaps the most efficient and expedient test of the predictive validity of NIH peer review would be a retrospective analysis of the correlation between percentile scores from peer review as the predictor and bibliometric indices as the criteria. The percentile score is used by each institute as the primary influence when deciding which applications to fund, and scientists are evaluated by their academic institutions for their scientific impact based on bibliometric indices: numbers of publications and citations. Decisions about hiring, retention, promotion, tenure and compensation of academic scientists have been based on bibliometric indices for decades [17–30], and these readily-available quantitative bibliometric indices could easily be used to determine the impact of funded research projects.
A number of investigators have suggested that percentile scores and/or bibliometric indices are appropriate variables for determining the predictive validity of peer review [31–35], but there has not yet been a careful examination of whether retrospective studies using those variables are appropriate for the assessment of the predictive validity of peer review at NIH. Tests of the predictive validity of a screening procedure are dependent on the inclusion of all cases or a sample of cases selected at random across the full range of the population that is screened. Such correlational studies are also dependent on the use of a valid criterion measure (e.g., a valid measure of scientific impact) on the same elements or cases evaluated with the screening procedure. The present study was conducted to determine if retrospective analyses of funded research applications, using percentile scores from peer review as the predictor variable and bibliometric indices as criterion measures of scientific impact, satisfy the requirements for tests of predictive validity.
The R01 is the most commonly used mechanism for funding research at the NIH. It typically supports a discrete, specified, and circumscribed project of up to 5 years' duration, to be performed by applicants in an area representing their specific interests and competencies, based on the mission of the NIH (see http://grants.nih.gov/grants/funding/r01.htm). In 2007 and 2008, 45,874 new R01 applications were considered for funding by the funding institutes at NIH; peer review of 87% (39,888) of those applications was managed by the Center for Scientific Review (CSR). Of those 39,888 applications, 1.4% were incomplete, contained errors, or were reviewed in meetings that did not assign percentile scores. The remaining 39,337 records from the NIH IMPAC II database were included in the dataset analyzed in the present study.
Each application is reviewed in-depth by three assigned reviewers, and the average of the preliminary scores from those assigned reviewers is used to determine which applications will be discussed by the full committees in the review meetings. Usually, about half of the applications reviewed by each committee—the applications with the better average preliminary scores—are discussed in each review meeting. For the applications discussed in the review meeting, all eligible committee members assign an overall impact score, and the average of those scores is used to calculate the percentile score. An application’s percentile score is the percentage of all applications reviewed by a study section with average overall impact scores better than or equal to that application (for a more detailed description of the review and scoring procedure see ).
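The percentile calculation described above can be sketched in a few lines. This is a simplified illustration only; details of the actual NIH procedure, such as pooling scores across a base period of review rounds, are assumed away here.

```python
def percentile_score(impact_score, all_scores):
    """Percentage of a study section's applications whose average overall
    impact score is better than (i.e., lower) or equal to the given score.
    Under NIH scoring, lower impact scores are better."""
    n_better_or_equal = sum(1 for s in all_scores if s <= impact_score)
    return round(100 * n_better_or_equal / len(all_scores), 1)

# A hypothetical study section with ten scored applications
scores = [1.8, 2.0, 2.3, 2.5, 3.0, 3.1, 3.4, 4.0, 4.5, 5.0]
print(percentile_score(2.3, scores))  # -> 30.0 (3 of 10 scored 2.3 or better)
```

Note that this relative scoring is what produces the restriction-of-range problem discussed later: funding decisions draw almost entirely from the low end of this scale.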
Records for awarded applications included the percentile score, the total budget requested, the total budget committed by the funding institute at the time of the award, the requested duration of the project, the approved duration of the project, and the number of days between the review meeting and the date the award was issued. This dataset includes applications assigned to all the institutes at the NIH, representing all areas of research, including applications that were discussed and not discussed, and applications that were funded and not funded.
Overall, 48% of the applications were discussed (n = 19,049), and 17.4% (n = 6,830) were funded. As expected, most of the applications with the best percentile scores were funded, and fewer and fewer applications were funded as the percentile scores increased. For example, 95–96% of applications with scores in the top 10 percentile were funded, but only 86% of the applications in the 10.1–15 percentile range and only 57% of the applications in the 15.1–20 percentile range were funded (Fig 1A).
[A] Percent of applications funded decreases as peer review percentile scores increase. Approximately 95% of all applications with peer review percentile scores in the 0.1–10.0 range are funded, but only 3.2% of the applications with peer review scores in the 35.1–40 percentile range are funded. [B] Cumulative percentage of all funded applications with increasing peer review percentile scores. Almost 50% of all funded applications have peer review percentile scores in the 0.1–10 range, and 97% of all funded applications have peer review percentile scores equal to or less than 30. [C] Number of applications reviewed for each application funded increases as peer review percentiles increase. Almost every application reviewed with peer review percentile scores in the 0.1–10.0 range is funded, but only one of every 6 applications reviewed with peer review percentile scores in the 25.1–30.0 range is funded.
Among the funded applications, almost half had scores in the top 10 percentile, and 97% of all funded applications had scores in the top 30 percentile (Fig 1B). Only 3% of funded applications were beyond the 30th percentile, including several at greater than the 50th percentile.
Among applications in the 25.1–30 percentile range, 6 applications were reviewed for each application that was funded; for applications with percentile scores in the 30.1–35 percentile range, 16 applications were reviewed for each application funded; and for applications in the 35.1–40.0 percentile range, more than 32 applications were reviewed for each application funded (Fig 1C).
Approximately 15% of the funded applications were selected ‘out of order’. In other words, 15% of the funded applications in the present study would not have been funded if the funding decision had been based purely on peer review scores; applications with better peer review scores would have been funded instead. The percentage of applications funded varies widely between the different institutes at the NIH, ranging in 2007–2008 from less than 10% at some institutes to more than 30% at others, and the proportion of awards made ‘out of order’ in terms of peer review scores also varies widely, ranging from only 3% at the National Institute on Aging to more than 30% at some of the smallest funding institutes (see Table 1).
In addition, analyses of variance (ANOVA) of the awarded applications showed that what was funded was different from what had been initially submitted. A 5 x 2 ANOVA on the 6,830 awarded applications including percentile scores from peer review (i.e., 0.1–10, 10.1–20, etc.) and budget (i.e., requested vs. awarded) as factors in the analysis, treating budget as a repeated measure, revealed that awarded budgets were significantly smaller than requested budgets, F(1,6825) = 769.3, p < 0.0001 (Fig 2A). Furthermore, as percentile scores increased, the difference between the requested and the awarded budgets increased. The percentile score by budget interaction was statistically significant, F(4,6825) = 36.49, p < 0.0001.
A 5 x 2 ANOVA on the 6,830 awarded applications including percentile scores from peer review (i.e., 0.1–10, 10.1–20, etc.) and project duration (i.e., proposed vs. approved) as factors in the analysis, treating project duration as a repeated measure, revealed that approved project durations were significantly shorter than proposed project durations, F(1,6825) = 602.43, p < 0.0001 (Fig 2B). As percentile scores increased, the difference between the proposed and the approved project durations increased. The percentile score by duration interaction was statistically significant, F(4,6825) = 104.11, p < 0.0001.
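The repeated-measures structure of these analyses can be illustrated with simulated data. The numbers below are entirely hypothetical, and this is a simplified equivalent rather than the exact 5 × 2 mixed ANOVA reported above: with a two-level repeated factor (requested vs. awarded), the budget main effect corresponds closely to testing whether the mean requested-minus-awarded difference departs from zero, and the score-by-budget interaction to a one-way ANOVA on those difference scores across percentile bins.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical data: five percentile bins, with the requested-minus-awarded
# budget cut growing across bins, as reported for the real applications.
bins = []
for i in range(5):
    n = 200
    requested = rng.normal(250_000, 40_000, n)
    awarded = requested - rng.normal(10_000 + 8_000 * i, 5_000, n)
    bins.append(requested - awarded)  # per-application difference scores

diffs = np.concatenate(bins)

# Budget main effect: were budgets cut on average?
t, p_main = stats.ttest_1samp(diffs, 0.0)

# Percentile-bin x budget interaction: does the size of the cut
# differ across percentile bins?
f_inter, p_inter = stats.f_oneway(*bins)

print(f"main effect t = {t:.1f} (p = {p_main:.3g}); "
      f"interaction F = {f_inter:.1f} (p = {p_inter:.3g})")
```

With effects of the size built into the simulation, both tests are overwhelmingly significant, mirroring the pattern reported for the real dataset.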
In addition, the delay from the review meeting until the notice of award increased as the percentile scores increased, F(4,6825) = 283.05, p < 0.0001 (Fig 2C).
This study analyzed a large set of new (Type 1) competing R01 Research Project Grant applications reviewed at the NIH Center for Scientific Review in FY 2007 and 2008: 39,337 applications were analyzed, 48% of those applications were discussed and 17.4% were funded. Most of the applications with the best percentile scores were funded. As the percentile scores increased, fewer and fewer applications were funded and more and more applications were reviewed for each application that was funded. In addition, funded projects were different from the applications that were submitted and evaluated during peer review, and the differences between the initial peer reviewed applications and the projects that were eventually funded increased as the peer review scores increased. Furthermore, the delay from the review meeting until the notice of award increased as the percentiles increased.
The results of the present study demonstrate several reasons why retrospective studies of funded applications might fail to detect a strong linear relationship between peer review estimates of potential impact and subsequent measures of actual impact. First, tests of the predictive validity of a screening procedure are dependent on the inclusion of the full population or a sample of cases selected at random across the full range of the population that is screened. However, only a small percentage of NIH applications are funded, and those funded applications are restricted to only a portion of the entire range of applications reviewed: 97% of funded applications have peer review scores at the 30th percentile or better. Such restriction of range confounds studies of predictive validity. For example, Thorndike developed a personnel screening test to determine which applicants were more likely to successfully complete pilot training school. Scores on the screening test were correlated with successful completion of pilot training at r = 0.64. However, once training was limited to the 13% of applicants with the best scores on the screening test, the restriction of range reduced the apparent correlation between test scores and successful completion of pilot training from r = 0.64 to only r = 0.18.
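The attenuation in Thorndike's example is easy to reproduce in simulation (a sketch, not his original data): generate a predictor and criterion correlated at r ≈ 0.64, then recompute the correlation after keeping only the top 13% on the predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulate a screening score (x) and an outcome (y) correlated at ~0.64,
# mirroring the full applicant pool in Thorndike's pilot-selection example.
r_true = 0.64
x = rng.standard_normal(n)
y = r_true * x + np.sqrt(1 - r_true**2) * rng.standard_normal(n)

r_full = np.corrcoef(x, y)[0, 1]

# Restrict to the top 13% of predictor scores, as when only the
# best-scoring applicants are admitted (or applications funded).
cutoff = np.quantile(x, 0.87)
mask = x >= cutoff
r_restricted = np.corrcoef(x[mask], y[mask])[0, 1]

print(f"full-range r = {r_full:.2f}, restricted r = {r_restricted:.2f}")
```

Simple truncation on a single variable does not capture every feature of the historical example (where selection operated on additional variables and the attenuation was even stronger), but the direction and rough magnitude of the effect are clear: the restricted correlation falls far below the full-range value even though the underlying relationship is unchanged.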
In addition to restriction of range, the present study highlights the fact that peer review is only the first level of review at NIH, and provides evidence of the significance of the second round of review conducted by the funding institutes, which consider the initial peer review scores as only part of their funding decisions. The funding institutes at NIH identify what they feel are the best applications that most deserve to be funded by closely examining the applications, the peer review scores and the written critiques. Consistent with suggestions that at least 8% of the NIH budget should be allocated to discretionary funding of high-risk, high-reward research managed by program managers at the funding institutes, approximately 15% of the funded applications are selected ‘out of order’ by the funding institutes. In other words, 15% of the funded applications in the present study would not have been funded if the funding decision had been based purely on peer review scores; applications with better peer review scores would have been funded instead.
Especially among those applications with worse peer review scores, the funding institutes identify and ‘cherry-pick’ those few applications or parts of applications that they believe stand out from the rest of the applications in the same range, and they negotiate with applicants to address issues identified during peer review (see ‘Negotiation of Competing Award’ at http://grants.nih.gov/grants/managing_awards.htm). The extent of those negotiations and changes is reflected in the increasing differences between proposed and awarded budgets and project durations, and the increasing times from review to award. Critical information can be added, weaknesses in the design can be corrected and flawed or unnecessary experiments can be dropped.
This is not evidence of inappropriate or unethical behavior on the part of the funding institutes. They not only have the authority but also an obligation to identify and fund the best projects. However, this means that projects are selected, revised and funded in a way that is not amenable to rigorous retrospective examination of the predictive validity of peer review scores. In addition to severe restriction of range, funded projects, especially those with worse peer review scores, are neither selected at random nor representative of other applications in the same range, and many of them differ from the projects that were initially reviewed. But the peer review scores remain unchanged: they are not revised to reflect the changes and improvements made during negotiations with the funding institutes. Therefore, it is not appropriate to expect that the peer review scores assigned to the original applications should correlate with measures of the impact of the funded projects that were revised and improved before they were eventually funded.
In addition to the issues related to peer review scores discussed above, there is also a problem with the use of citation metrics as measures of scientific impact. Citation metrics by themselves are neither accepted as valid nor widely used for that purpose. Reward systems shape behavior, and the current reward system based primarily on primacy of discovery and numbers of publications and citations has shaped standards of practice in ways that facilitate achievement of those objectives [25,40–45]. For example, scientists have little incentive and rarely replicate reports from other investigators [46–49] or publish results that fail to support their own hypotheses [50–55].
Papers reporting novel, robust and statistically significant effects are more likely to be published in high-impact journals and to be more highly cited, and in order to produce novel, robust, statistically significant effects, scientists often approach their research as advocates, intent on producing and publishing results that confirm and support their hypotheses [56–60] without including adequate methodological controls to prevent their unconscious biases from affecting their results [61–65]. Scientists also often conduct a large number of small studies and use exploratory analytical techniques that virtually ensure the identification of robust, novel phenomena and hypotheses but fail to report the use of exploratory, post hoc analyses or conduct the appropriate confirmatory studies [66–71].
In addition, scientists select other papers to cite based primarily on their rhetorical utility, to persuade their readers of the value and integrity of their own work: papers are not selected for citation primarily based on their relevance or validity [72,73]. Even the father of the Science Citation Index (SCI), Eugene Garfield, noted that citations reflect the ‘utility’ of the source, not its scientific elegance, quality or impact. Authors cite only a small fraction of relevant sources [75,76], and studies reporting robust, statistically significant results that support the author’s agenda have greater utility and are cited much more often than equally relevant studies that report small or non-statistically-significant effects [75–81].
Given the emphasis on primacy of discovery and bibliometric indices, these strategies are clearly beneficial for individual scientists and do not constitute research misconduct; but, it is becoming more and more widely recognized that they produce problems of reproducibility of scientific findings and are therefore a source of waste and inefficiency. Replication studies are rarely conducted [46–49], valid but small or non-statistically significant results are often not detected [83,84] or published [50–55], unnecessary studies are conducted because previous research results have not been published or cited [85,86], and uncontrolled biases lead to the publication of a large number of false positives [65,70,87–89]. In the vast majority of publications reporting novel phenomena, the effects are exaggerated or invalid [48–50,90–93], and high citation numbers do not provide assurance of the quality or the validity of the results [94,95]. Even studies reporting robust effects and cited more than 1,000 times are often not valid or reproducible [96,97].
Moreover, the evidence suggests that the magnitude of this problem is growing. With more and more highly qualified scientists, the ‘publish or perish’ culture continues to become ever more competitive, demanding larger and larger numbers of publications and citations in order to be hired, retained, promoted and well-compensated [98–104]. Clearly, further increasing the emphasis on the numbers of publications and citations produced by funded applications would not increase productivity, but would likely increase the problems already largely attributed to the current over-emphasis on bibliometric indices.
Instead, there is a growing consensus that scientific progress and productivity can best be increased by providing incentives to increase the integrity of the scientific literature, largely in ways that will reduce the number of publications produced and the proportion of studies reporting robust, novel effects that tend to be most highly cited [82,86,105,106]. Suggested changes to standards of practice include conducting full literature reviews and meta-analyses, citing all relevant studies, not just those that support the author’s rationale; conducting power analyses and using adequate sample sizes to detect expected effects; including and clearly communicating quality controls to prevent bias from affecting the results; clearly distinguishing between planned analyses and post-hoc exploratory analyses; and publishing results of studies in a timely manner, even if the results fail to support the investigator’s hypotheses or detect any statistically significant effect. In response, the leadership at NIH has initiated changes in the NIH peer review process to incentivize quality and reproducibility over quantity, and one estimate suggests that such changes could significantly increase scientific productivity.
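As one concrete example of the suggested practices, a prospective power analysis fixes the sample size needed to detect an expected effect before data collection. This sketch uses statsmodels; the effect size and thresholds are illustrative assumptions, not values from the present study.

```python
from statsmodels.stats.power import TTestIndPower

# Per-group sample size needed to detect a medium standardized effect
# (Cohen's d = 0.5) with 80% power in a two-sided t-test at alpha = 0.05.
n_per_group = TTestIndPower().solve_power(
    effect_size=0.5, power=0.80, alpha=0.05, alternative="two-sided"
)
print(round(n_per_group))  # ~64 participants per group
```

Running the analysis in reverse (solving for power given a feasible sample size) can likewise reveal, before any data are collected, that a planned study is too small to detect the expected effect.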
An appropriate test of the predictive validity of the peer review process has not yet been conducted. Such a test would need to include funded projects selected at random across the entire range of applications, and the projects would have to be conducted without changes or improvements based on issues identified during the peer review process. The impact of those applications would also have to be based on measures that appropriately value the impact and validity of the results. Citation numbers alone are not appropriate for that purpose, in part because citation numbers are often higher for studies that report exaggerated or invalid results [90,91,108]. Further increasing the emphasis on the numbers of publications and citations produced by funded applications, as some studies have suggested [32,34], would exacerbate the waste and inefficiencies already attributed to the current over-emphasis on bibliometric indices. Instead, the leadership at NIH has initiated changes in the NIH peer review process to incentivize quality and reproducibility over quantity, and one estimate suggests that such changes could significantly increase scientific productivity.
Disclaimer: The views expressed in this article are those of the authors and do not necessarily represent those of CSR, NIH or the US Dept. of Health and Human Services.
Conceived and designed the experiments: MDL. Analyzed the data: MDL. Wrote the paper: MDL RKN.
- 1. Ehrlich E. An economic engine: NIH research, employment, and the future of the medical innovation sector. United for Medical Research. 2011.
- 2. Makomva K, Mahan D. In your own backyard: how NIH funding helps your state's economy. Families USA. 2008.
- 3. Tatsioni A, Vavva E, Ioannidis JPA. Sources of funding for Nobel Prize-winning work: Public or private? FASEB J. 2010;24: 1335–1339. pmid:20056712
- 4. Zinner DE. Medical R&D at the turn of the millennium. Health Aff. 2001;20: 202–209. pmid:11558704
- 5. Tripp S, Grueber M. Economic impact of the Human Genome Project. Battelle Memorial Institute. 2011.
- 6. Narin F, Hamilton KS, Olivastro D. The increasing linkage between U.S. technology and public science. Research Policy. 1997;26: 317–330.
- 7. McMillan GS, Narin F, Deeds DL. An analysis of the critical role of public science in innovation: The case of biotechnology. Research Policy. 2000;29: 1–8.
- 8. Stevens AJ, Jensen JJ, Wyller K, Kilgore PC, Chatterjee S, Rohrbaugh ML. The role of public-sector research in the discovery of drugs and vaccines. N Engl J Med. 2011;364: 535–541. pmid:21306239
- 9. Chatterjee SK, Rohrbaugh ML. NIH inventions translate into drugs and biologics with high public health impact. Nat Biotechnol. 2014;32: 52–58. pmid:24406928
- 10. Arias E. United States life tables, 2009. National Vital Statistics Reports. 2013;62. pmid:24979975
- 11. Manton KG, Gu X, Lamb VL. Change in chronic disability from 1982 to 2004/2005 as measured by long-term changes in function and health in the U.S. elderly population. Proceedings of the National Academy of Sciences of the United States of America. 2006;103: 18374–18379. pmid:17101963
- 12. Mandel R. A Half Century of Peer Review, 1946–1996. Miami, FL: HardPress Publishing. 1996.
- 13. Kaplan D. POINT: Statistical analysis in NIH peer review—Identifying innovation. FASEB J. 2007;21: 305–308. pmid:17267383
- 14. Kirschner M. A perverted view of "impact". Science (New York, N Y). 2013;340: 1265. pmid:23766298
- 15. Nicholson JM, Ioannidis JP. Research grants: Conform and be funded. Nature. 2012;492: 34–36. pmid:23222591
- 16. Demicheli V, Di Pietrantonj C. Peer review for improving the quality of grant applications. Cochrane Database of Systematic Reviews. 2007.
- 17. Katz DA. Faculty Salaries, Promotions, and Productivity at a Large University. The American Economic Review. 1973;63: 469–477.
- 18. Salthouse TA, McKeachie WJ, Lin YG. An Experimental Investigation of Factors Affecting University Promotion Decision: A Brief Report. The Journal of Higher Education. 1978;49: 177–183.
- 19. Hamermesh D, Johnson G, Weisbrod B. Scholarship, Citations and Salaries: Economic Rewards in Economics. Southern Economic Journal. 1982;49: 472–481.
- 20. Sheldon PJ, Collison FM. Faculty review criteria in tourism and hospitality. Ann Tour Res. 1990;17: 556–567.
- 21. Street DL, Baril CP. Scholarly accomplishments in promotion and tenure decisions of accounting faculty. J Account Educ. 1994;12: 121–139.
- 22. Moore WJ, Newman RJ, Turnbull GK. Reputational capital and academic pay. Econ Inq. 2001;39: 663–671.
- 23. Seipel MMO. Assessing publication for tenure. J Soc Work Educ. 2003;39: 79–88.
- 24. Adler NJ, Harzing AW. When knowledge wins: Transcending the sense and nonsense of academic rankings. Acad Manage Learn Educ. 2009;8: 72–95.
- 25. MacDonald S, Kam J. Aardvark et al.: Quality journals and gamesmanship in management studies. J Inf Sci. 2007;33: 702–717.
- 26. Franzoni C, Scellato G, Stephan P. Changing incentives to publish. Science (New York, N Y). 2011;333: 702–703. pmid:21817035
- 27. Shao J, Shen H. The outflow of academic papers from China: Why is it happening and can it be stemmed? Learn Publ. 2011;24: 95–97.
- 28. O'Keefe S, Wang TC. Publishing pays: Economists' salaries reflect productivity. Soc Sci J. 2013;50: 45–54.
- 29. Fairweather JS. Beyond the Rhetoric: Trends in the Relative Value of Teaching and Research in Faculty Salaries. The Journal of Higher Education. 2005;76: 401–422.
- 30. Arthur MD. What is a Citation Worth? The Journal of Human Resources. 1986;21: 200–215.
- 31. Berg JM. Productivity Metrics and Peer Review Scores [Web log post]. 6-2-2014. Available: http://loop.nigms.nih.gov/2011/06/productivity-metrics-and-peer-review-scores/
- 32. Danthi N, Wu CO, Shi P, Lauer M. Percentile ranking and citation impact of a large cohort of national heart, lung, and blood institute-funded cardiovascular R01 grants. Circulation Research. 2014;114: 600–606. pmid:24406983
- 33. Gallo SA, Carpenter AS, Irwin D, McPartland CD, Travis J, Reynders S, et al. The validation of peer review through research impact measures and the implications for funding strategies. PLoS ONE. 2014;9: e106474. pmid:25184367
- 34. Kaltman JR, Evans FJ, Danthi NS, Wu CO, DiMichele DM, Lauer MS. Prior publication productivity, grant percentile ranking, and topic-normalized citation impact of NHLBI cardiovascular R01 grants. Circ Res. 2014;115: 617–624. pmid:25214575
- 35. Scheiner SM, Bouchie LM. The predictive power of NSF reviewers and panels. Frontiers Ecol Envir. 2013;11: 406–407.
- 36. Thorndike RL. Personnel Selection: Test and Measurement Techniques. New York, NY: John Wiley and Sons, Inc. 1949.
- 37. Nunnally JC. Psychometric Theory. New York: McGraw-Hill Book Company. 1978.
- 38. Johnson VE. Statistical analysis of the National Institutes of Health peer review system. Proc Natl Acad Sci U S A. 2008;105: 11076–11080. pmid:18663221
- 39. National Academy of Sciences, National Academy of Engineering and Institute of Medicine. Rising Above the Gathering Storm: Energizing and Employing America for a Brighter Future. Washington, D.C.: The National Academies. 2007.
- 40. Van Dalen HP, Henkens K. Intended and unintended consequences of a publish-or-perish culture: A worldwide survey. J Am Soc Inf Sci Technol. 2012;63: 1282–1293.
- 41. Lawrence PA. Lost in publication: How measurement harms science. Ethics Sci Environm Polit. 2008;8: 9–11.
- 42. Lawrence PA. The politics of publication. Nature. 2003;422: 259–261. pmid:12646895
- 43. Abbott A, Cyranoski D, Jones N, Maher B, Schiermeier Q, Van Noorden R. Metrics: Do metrics matter? Nature. 2010;465: 860–862. pmid:20559361
- 44. Martinson BC, Anderson MS, De Vries R. Scientists behaving badly. Nature. 2005;435: 737–738. pmid:15944677
- 45. Anderson MS, Ronning EA, De Vries R, Martinson BC. The perverse effects of competition on scientists' work and relationships. Sci Eng Ethics. 2007;13: 437–461. pmid:18030595
- 46. Bornstein RF. Publication politics, experimenter bias and the replication process in social science research. J Soc Behav Pers. 1990;5: 71–81.
- 47. Collins HM. Changing order: replication and induction in scientific practice. Chicago: University of Chicago Press. 1992.
- 48. Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003;33: 177–182. pmid:12524541
- 49. Vineis P, Manuguerra M, Kavvoura FK, Guarrera S, Allione A, Rosa F, et al. A field synopsis on low-penetrance variants in DNA repair genes and cancer susceptibility. J Natl Cancer Inst. 2009;101: 24–36. pmid:19116388
- 50. Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012;483: 531–533. pmid:22460880
- 51. Benatar M. Lost in translation: treatment trials in the SOD1 mouse and in human ALS. Neurobiol Dis. 2007;26: 1–13. pmid:17300945
- 52. Dickersin K, Chan S, Chalmers TC, Sacks HS, Smith H Jr. Publication bias and clinical trials. Control Clin Trials. 1987;8: 343–353. pmid:3442991
- 53. Dickersin K, Min YI. Publication bias: the problem that won't go away. Ann N Y Acad Sci. 1993;703: 135–146. pmid:8192291
- 54. Renkewitz F, Fuchs HM, Fiedler S. Is there evidence of publication biases in JDM research? Judgment and Decision Making. 2011;6: 870–881.
- 55. Su J, Li X, Cui X, Li Y, Fitz Y, Hsu L, et al. Ethyl pyruvate decreased early nuclear factor-kappaB levels but worsened survival in lipopolysaccharide-challenged mice. Crit Care Med. 2008;36: 1059–1067. pmid:18176313
- 56. Mahoney MJ, DeMonbreun BG. Psychology of the scientist: An analysis of problem-solving bias. Cognitive Therapy and Research. 1977;1: 229–238.
- 57. Mahoney MJ. Publication Prejudices: An Experimental Study of Confirmatory Bias in the Peer Review System. Cognitive Therapy and Research. 1977;1: 161–175.
- 58. Mahoney MJ, Kimper TP. From ethics to logic: a survey of scientists. In: Scientist as Subject: The Psychological Imperative. Clinton Corners, NY: Percheron Press. 2004. pp. 187–193.
- 59. Mahoney MJ. Scientist as Subject: The Psychological Imperative. Clinton Corners, NY: Percheron Press. 2004.
- 60. Mitroff II. The Subjective Side of Science: A Philosophical Inquiry into the Psychology of the Apollo Moon Scientists. New York, NY: Elsevier. 1974.
- 61. Cardon LR, Bell JI. Association study designs for complex diseases. Nat Rev Genet. 2001;2: 91–99. pmid:11253062
- 62. Colhoun HM, McKeigue PM, Davey Smith G. Problems of reporting genetic associations with complex outcomes. Lancet. 2003;361: 865–872. pmid:12642066
- 63. Dirnagl U. Bench to bedside: the quest for quality in experimental stroke research. J Cereb Blood Flow Metab. 2006;26: 1465–1478. pmid:16525413
- 64. van der Worp HB, Sena ES, Donnan GA, Howells DW, Macleod MR. Hypothermia in animal models of acute ischaemic stroke: a systematic review and meta-analysis. Brain. 2007;130: 3063–3074. pmid:17478443
- 65. Lindner MD. Clinical attrition due to biased preclinical assessments of potential efficacy. Pharmacol Ther. 2007;115: 148–175. pmid:17574680
- 66. Hartwig F, Dearing BE. Exploratory data analysis. Beverly Hills: Sage Publications. 1979.
- 67. Hoaglin DC, Mosteller F, Tukey JW. Understanding robust and exploratory data analysis. New York: John Wiley and Sons, Inc. 1983.
- 68. Ioannidis JP. Microarrays and molecular research: noise discovery? Lancet. 2005;365: 454–455. pmid:15705441
- 69. Pocock SJ. Clinical trials with multiple outcomes: a statistical perspective on their design, analysis, and interpretation. Control Clin Trials. 1997;18: 530–545. pmid:9408716
- 70. Simmons JP, Nelson LD, Simonsohn U. False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science. 2011;22: 1359–1366. pmid:22006061
- 71. Kerr NL. HARKing: Hypothesizing after the results are known. Pers Soc Psychol Rev. 1998;2: 196–217. pmid:15647155
- 72. Brooks TA. Private acts and public objects: an investigation of citer motivations. Journal of the American Society for Information Science. 1985;36: 223–229.
- 73. Gilbert GN. Referencing as persuasion. Social Studies of Science. 1977;7: 113–122.
- 74. Garfield E. Is citation analysis a legitimate evaluation tool? Scientometrics. 1979;1: 359–375.
- 75. Robinson KA, Goodman SN. A systematic examination of the citation of prior research in reports of randomized, controlled trials. Ann Intern Med. 2011;154: 50–55. pmid:21200038
- 76. Greenberg SA. How citation distortions create unfounded authority: Analysis of a citation network. BMJ. 2009;339: 210–213.
- 77. Schrag M, Mueller C, Oyoyo U, Smith MA, Kirsch WM. Iron, zinc and copper in the Alzheimer's disease brain: a quantitative meta-analysis. Some insight on the influence of citation bias on scientific opinion. Prog Neurobiol. 2011;94: 296–306. pmid:21600264
- 78. Chapman S, Ragg M, McGeechan K. Citation bias in reported smoking prevalence in people with schizophrenia. Aust N Z J Psychiatry. 2009;43: 277–282.
- 79. Jannot AS, Agoritsas T, Gayet-Ageron A, Perneger TV. Citation bias favoring statistically significant studies was present in medical research. J Clin Epidemiol. 2013;66: 296–301. pmid:23347853
- 80. Kjaergard LL, Gluud C. Citation bias of hepato-biliary randomized clinical trials. J Clin Epidemiol. 2002;55: 407–410. pmid:11927210
- 81. Gotzsche PC. Reference bias in reports of drug trials. Br Med J (Clin Res Ed). 1987;295: 654–656. pmid:3117277
- 82. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet. 2009;374: 86–89.
- 83. Bakker M, van Dijk A, Wicherts JM. The Rules of the Game Called Psychological Science. Perspectives on Psychological Science. 2012;7: 543–554.
- 84. Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14: 365–376. pmid:23571845
- 85. Chapman SJ, Shelton B, Mahmood H, Fitzgerald JE, Harrison EM, Bhangu A. Discontinuation and non-publication of surgical randomised controlled trials: Observational study. BMJ (Online). 2014;349.
- 86. Chalmers I, Bracken MB, Djulbegovic B, Garattini S, Grant J, Gülmezoglu AM, et al. How to increase value and reduce waste when research priorities are set. Lancet. 2014;383: 156–165. pmid:24411644
- 87. Counsell CE, Clarke MJ, Slattery J, Sandercock PA. The miracle of DICE therapy for acute stroke: fact or fictional product of subgroup analysis? BMJ. 1994;309: 1677–1681. pmid:7819982
- 88. Ioannidis JP, Ntzani EE, Trikalinos TA, Contopoulos-Ioannidis DG. Replication validity of genetic association studies. Nat Genet. 2001;29: 306–309. pmid:11600885
- 89. Munafo MR, Stothart G, Flint J. Bias in genetic association studies and impact factor. Mol Psychiatry. 2009;14: 119–120. pmid:19156153
- 90. Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2: e124. pmid:16060722
- 91. Ioannidis JP. Why most discovered true associations are inflated. Epidemiology. 2008;19: 640–648. pmid:18633328
- 92. Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011;10: 712. pmid:21892149
- 93. Steward O, Popovich PG, Dietrich WD, Kleitman N. Replication and reproducibility in spinal cord injury research. Exp Neurol. 2012;233: 597–605. pmid:22078756
- 94. Reinstein A, Hasselback JR, Riley ME, Sinason DH. Pitfalls of using citation indices for making academic accounting promotion, tenure, teaching load, and merit pay decisions. Issues Account Educ. 2011;26: 99–131.
- 95. Browman HI, Stergiou KI. Factors and indices are one thing, deciding who is scholarly, why they are scholarly, and the relative value of their scholarship is something else entirely. Ethics Sci Environm Polit. 2008;8: 1–3.
- 96. Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA. 2005;294: 218–228. pmid:16014596
- 97. Trikalinos TA, Ntzani EE, Contopoulos-Ioannidis DG, Ioannidis JP. Establishment of genetic associations for complex diseases is independent of early study findings. Eur J Hum Genet. 2004;12: 762–769. pmid:15213707
- 98. Moore WJ, Newman RJ, Turnbull GK. Do academic salaries decline with seniority? J Labor Econ. 1998;16: 352–366.
- 99. Balogun JA, Sloan PE, Germain M. Core values and evaluation processes associated with academic tenure. Percept Mot Skills. 2007;104: 1107–1115. pmid:17879644
- 100. Youn TIK, Price TM. Learning from the experience of others: The evolution of faculty tenure and promotion rules in comprehensive institutions. J High Educ. 2009;80: 204–237.
- 101. Graber M, Launov A, Wälde K. Publish or perish? The increasing importance of publications for prospective economics professors in Austria, Germany and Switzerland. Ger Econ Rev. 2008;9: 457–472.
- 102. Pilcher ES, Kilpatrick AO, Segars J. An assessment of promotion and tenure requirements at dental schools. J Dent Educ. 2009;73: 375–382. pmid:19289726
- 103. Fanelli D. Do pressures to publish increase scientists' bias? An empirical support from US states data. PLoS ONE. 2010;5.
- 104. Fanelli D. Negative results are disappearing from most disciplines and countries. Scientometrics. 2012;90: 891–904.
- 105. Ioannidis JPA, Greenland S, Hlatky MA, Khoury MJ, Macleod MR, Moher D, et al. Increasing value and reducing waste in research design, conduct, and analysis. Lancet. 2014;383: 166–175. pmid:24411645
- 106. Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JPA, et al. Biomedical research: Increasing value, reducing waste. Lancet. 2014;383: 101–104. pmid:24411643
- 107. Collins FS, Tabak LA. NIH plans to enhance reproducibility. Nature. 2014;505: 612–613. pmid:24482835
- 108. Young NS, Ioannidis JP, Al-Ubaydli O. Why current publication practices may distort science. PLoS Med. 2008;5: e201. pmid:18844432