The citation rate for articles is viewed as a measure of their importance and impact; however, little is known about what features of articles are associated with higher citation rate.
We conducted a cohort study of all original articles, regardless of study methodology, published in the Lancet, JAMA, and New England Journal of Medicine, from October 1, 1999 to March 31, 2000. We identified 328 articles. Two blinded, independent reviewers extracted, in duplicate, nine variables from each article, which were analyzed in both univariable and multivariable linear least-squares regression models for their association with the annual rate of citations received by the article since publication. A two-way interaction between industry funding and an industry-favoring result was tested and found to be significant (p = 0.02). In our adjusted analysis, the presence of industry funding and an industry-favoring result was associated with an increase in annual citation rate of 25.7 (95% confidence interval, 8.5 to 42.8) compared to the absence of both industry funding and industry-favoring results. Higher annual rates of citation were also associated with articles dealing with cardiovascular medicine (13.3 more; 95% confidence interval, 3.9 to 22.3) and oncology (12.6 more; 95% confidence interval, 1.2 to 24.0), articles with group authorship (11.1 more; 95% confidence interval, 2.7 to 19.5), larger sample size and journal of publication.
Citation: Kulkarni AV, Busse JW, Shams I (2007) Characteristics Associated with Citation Rate of the Medical Literature. PLoS ONE 2(5): e403. doi:10.1371/journal.pone.0000403
Academic Editor: Peter Bacchetti, University of California, San Francisco, United States of America
Received: January 30, 2007; Accepted: March 30, 2007; Published: May 2, 2007
Copyright: © 2007 Kulkarni et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: No funds were received for this analysis. Dr. Jason Busse is funded by a Canadian Institutes of Health Research Fellowship Award.
Competing interests: The authors have declared that no competing interests exist.
The dissemination of important research findings through the medical community begins with publication in peer-reviewed journals, but is continued through citation of the original work in subsequent publications. The number of citations received by an article is viewed as a marker for the importance of the original research and is reflected in the impact factor of journals in which the original paper was published. The impact factor is calculated as the mean number of citations received in a year for all articles published in the journal in the previous 2 years.
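The impact factor definition above can be written as a one-line calculation. The following is a minimal sketch; the function name and the example journal figures are hypothetical and chosen only to illustrate the arithmetic.

```python
def impact_factor(citations_this_year, articles_prev_two_years):
    """Mean citations received this year per article from the previous 2 years."""
    if articles_prev_two_years <= 0:
        raise ValueError("need at least one article in the 2-year window")
    return citations_this_year / articles_prev_two_years


# Hypothetical journal (numbers invented for illustration only):
# 600 articles published over the previous 2 years, cited 7,980 times this year.
example_if = impact_factor(7980, 600)
print(round(example_if, 1))
```

Because the impact factor is a mean over all articles in a journal, a small number of very highly cited articles can raise it considerably.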
Although reference bias in the medical literature (the tendency to over-represent studies with positive findings) has been well established, limited work has been done to determine what variables affect the number of citations an original paper will receive. We therefore undertook a study to determine what factors were associated with an increased rate of citation using a cohort of articles published in leading medical journals. In particular, we examined whether certain variables that have been empirically linked to study quality or bias had a positive or negative impact on subsequent citation rate.
Selection of Journals and Articles
Our cohort included all original research papers published in the 6-month period between October 1, 1999 and March 31, 2000 in the three general medical journals with the highest impact factors according to the Institute for Scientific Information's Journal Citation Report (ISI-JCR): the New England Journal of Medicine (NEJM), the Journal of the American Medical Association (JAMA), and the Lancet. We included all articles under the following table of contents headings, regardless of study methodology: “Original Articles” in NEJM, “Original Contributions” in JAMA, and “Original Research–Articles” in Lancet. We excluded all other articles, including editorials, review articles, special articles, case reports, and research letters.
Two reviewers (AVK and JWB) trained in health research methodology extracted data independently and in duplicate, for the following variables: 1) the journal in which the article appeared (NEJM, JAMA, or Lancet) and the month of publication; 2) study design (randomized trial, prospective observational study, retrospective study, survey study, or meta-analysis); 3) clinical category of the article, defined as the medical subspecialty to which the main conclusion of the article was most applicable: anesthesiology, cardiovascular, dermatology, endocrinology, gastroenterology, general medicine, infectious disease, musculoskeletal, nephrology, neurology, obstetrics/gynecology, oncology, ophthalmology, pediatrics, psychiatry, or respirology; 4) whether the author by-line for the article included group authorship; 5) country in which the research was performed (defined as the country or countries in which research participants were recruited or, for research which did not use research participants, e.g., meta-analyses, the country of the corresponding author); 6) sample size of the study (in cases of meta-analysis, the sample size was taken as the total number of patients in all analyzed studies); 7) if an industry-affiliated drug or device was under investigation and whether the results favored the intervention or not; 8) declared industry funding; and 9) if the study had been reported in the lay media. Reviewers resolved discrepancies by discussion.
Industry was defined as for-profit companies and excluded all government agencies and non-profit private agencies. Industry funding was considered present if there was any acknowledgement of direct industry support for the research study (including direct funding of the study or supplying of drugs or medical devices). This did not include author-declared conflicts arising from having received individual consultant fees, for example.
In cases where studies explored the efficacy of an industry-affiliated device or drug, two reviewers (AVK and JWB) independently evaluated whether the results would be considered favorable to industry. There is no standardized definition of positive results, and we considered study results favorable to industry if study findings suggested beneficial health effects or absence of expected adverse health effects with regard to the intervention under study. Disagreement was resolved through discussion. To explore the reliability of assessing industry-favoring status, prior to data extraction the same two reviewers independently evaluated 20 studies selected at random from our cohort using a computer-based random number generator and found very good inter-observer reliability (kappa = 0.80).
To gauge public interest in each study, we searched the Associated Press health news wire during the 6-month period following publication of each article to determine if the study had been reported by the lay media. All data were extracted prior to determination of our primary outcome measure, the number of citations received.
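For readers unfamiliar with the kappa statistic, the sketch below shows how Cohen's kappa could be computed for two raters. The ratings are made up, not the reviewers' actual assessments; they represent one of many agreement patterns over 20 studies that would give a kappa of 0.80.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for 20 studies (True = judged industry-favoring).
rater_1 = [True] * 10 + [False] * 10
rater_2 = [True] * 9 + [False] + [True] + [False] * 9  # 2 disagreements
print(round(cohens_kappa(rater_1, rater_2), 2))
```

Here the raters agree on 18 of 20 studies (0.90) against a chance agreement of 0.50, so kappa is (0.90 − 0.50) / (1 − 0.50).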
Outcome Measure Assessment
The primary outcome measure (annual rate of citation) was defined as the number of citations received per year since publication. Approximately five years (ranging from 57 months to 63 months) after we assembled our cohort, we conducted a citation search using the Institute for Scientific Information's (ISI) electronic version of Science Citation Index (http://isiknowledge.com) for each article, using a cited reference search, to determine the number of times the article had subsequently been cited in the medical literature. All citation searches were carried out in a single one-week period in December 2004. A citation is counted by ISI if an article appears in a reference list in any of the approximately 8,700 journals indexed by ISI. This would include reference lists associated with scientific papers, editorials, letters, or general interest articles. The initial query was performed by two of us independently (AVK and JWB) using the first author's name or group authorship name, journal title, and year of publication. If this query failed to yield any citations for an article, we conducted a search for the study title to limit misclassification of an article as having zero subsequent citations.
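The outcome definition can be illustrated with a short calculation. This is a sketch only: measuring follow-up in whole months is an assumption made here for illustration, and the citation count in the example is hypothetical.

```python
def annual_citation_rate(total_citations, months_since_publication):
    """Citations received per year of follow-up since publication."""
    if months_since_publication <= 0:
        raise ValueError("follow-up must be a positive number of months")
    years = months_since_publication / 12
    return total_citations / years


# Hypothetical article: 120 citations counted 60 months after publication.
print(annual_citation_rate(120, 60))
```

Dividing by follow-up time puts articles published at the start and end of the 6-month window on a comparable scale, since follow-up ranged from 57 to 63 months.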
We performed all analyses using SPSS 13.0 statistical software (SPSS Inc., Chicago, IL). Amongst the 16 subgroups within the clinical category variable, only those with at least 20 articles each were retained as distinct subgroups for analysis; all other subgroups were combined into “others”. The country in which the research was performed was analyzed as either exclusively/partially in the United States or exclusively outside of the United States. Because of the highly skewed distribution of sample size (mean of 53,310, but ranging from 1 to 3.3 million), we used a log10-transformation for this analysis. As funding source and study conclusions have been shown to be associated (so-called “sponsorship bias”), we decided, a priori, to test declared industry funding and industry-favoring results for interaction. We calculated the median and mean (with associated standard deviation [SD] or 95% CI) annual citation rate for all articles.
We used linear least-squares regression with the annual rate of citation as the dependent variable to explore associations. Each of the independent variables was initially tested in a univariable regression model. The F-test was used to calculate the level of significance and we included variables in our multivariable model if their level of significance was p<0.10 or they substantially altered the significance of another variable in the model. We used a step-forward method for entry into our multivariable analysis, in order from lowest p-value to highest. A variable was considered statistically significant if it had a p-value<0.05 in the final multivariable model. Multicollinearity was deemed concerning if the variance inflation factor for any independent variable was greater than 5.
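The two data preparation steps described above (collapsing small clinical categories and log10-transforming sample size) can be sketched as follows. The analysis itself was run in SPSS, so this Python version is only an illustration; the function names are hypothetical, and the neurology count in the example is invented.

```python
import math
from collections import Counter


def collapse_rare_categories(categories, min_count=20, other_label="other"):
    """Keep categories with at least min_count articles; pool the rest."""
    counts = Counter(categories)
    return [c if counts[c] >= min_count else other_label for c in categories]


def log10_sample_size(n):
    """log10-transform a study's sample size to reduce right skew."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return math.log10(n)


# Cardiovascular (n = 57) and oncology (n = 30) match the reported counts;
# the neurology count of 5 is a made-up example of a small subgroup.
labels = ["cardiovascular"] * 57 + ["oncology"] * 30 + ["neurology"] * 5
print(Counter(collapse_rare_categories(labels)))

# The reported extremes of sample size, 1 and 3.3 million, after transformation.
print(log10_sample_size(1), round(log10_sample_size(3_300_000), 2))
```

On the log10 scale the observed range of 1 to 3.3 million compresses to roughly 0 to 6.5, so a few very large studies no longer dominate the fit.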
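To illustrate the multicollinearity check, the sketch below computes the variance inflation factor in the simplest case of two predictors, where it reduces to 1 / (1 − r²) with r the Pearson correlation between them. With more predictors, r² is replaced by the R² from regressing each predictor on all the others. The data and function names are hypothetical, and this is not the SPSS procedure the authors ran.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up predictors: the first pair is uncorrelated, the second nearly collinear.
print(vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1]))
print(vif_two_predictors([1, 2, 3, 4], [2, 4, 6, 9]) > 5)
```

A VIF of 1 means a predictor is uncorrelated with the others; values above the threshold of 5 used here would flag a predictor whose coefficient is estimated imprecisely.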
Our literature search generated 328 articles that were grouped into the following clinical categories: infectious disease (n = 62), cardiovascular (n = 57), oncology (n = 30), general medicine (n = 29), and obstetrics/gynecology (n = 25), leaving 125 articles assigned to “other”. Ninety-two (28.0%) studies were randomized and 68 (20.7%) were group authored (either exclusively or in addition to named individual authors). The majority of studies were performed either partly or exclusively in the United States (54.0%, 177 of 328). Eighty-two articles (25.0%) declared industry funding, of which approximately half (n = 42) reported industry-favoring results. Thirty-four studies reported industry-favoring results, but were not industry-funded. Ninety-seven articles (29.6%) had been reported by the Associated Press (Table 1).
Our 328 eligible articles were cited a total of 38,381 times and the annual rate of citation ranged from 1.0 to 392.9 (median 14.1; mean 23.8, SD = 31.6).
Univariable regression models using annual rate of citation as the dependent variable yielded p-values<0.10 for all independent variables, except month of publication (p = 0.50) (Table 1). The variance inflation factors of all independent variables were less than 2.1, suggesting that multicollinearity was not a concern. Graphical examination of residuals against predicted values did not suggest a violation of the linearity assumption for the independent variables.
The following variables were retained in our multivariable regression model: industry funding, industry-favoring result, clinical category of article, group authorship, journal of publication, and sample size (Table 1).
Based on our a priori hypothesis, a two-way interaction between industry funding and industry-favoring result was tested and found to be significant (p = 0.02). Therefore, if a study was industry funded, an industry-favoring result was associated with a significantly higher annual citation rate (an increase of 21.7; 95% CI = 9.2 to 34.3). However, if a study was not industry funded, a favorable result was not associated with a significant difference in annual citation rate (an increase of 2.5; 95% CI = −8.2 to 13.2). The unstandardized regression coefficients presented in Table 1 represent the difference in the annual citation rate between the subgroup and the reference category. Our model explained approximately 20% of the variance (adjusted R2 = 0.20) in annual citation rates of our cohort.
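The contrasts reported above can be related to the coefficients of a regression model with an interaction term. The sketch below assumes conventional 0/1 coding for both indicators, that is, rate = b0 + b_funded·F + b_favoring·R + b_interaction·F·R + other terms, and it assumes that the 25.7 contrast quoted in the abstract comes from the same adjusted model. The function name and the implied coefficients are illustrative derivations, not values reported by the authors.

```python
def implied_coefficients(favoring_if_funded, favoring_if_unfunded, both_vs_neither):
    """Back out main-effect and interaction coefficients from three contrasts.

    Under 0/1 coding:
      favoring_if_unfunded = b_favoring
      favoring_if_funded   = b_favoring + b_interaction
      both_vs_neither      = b_funded + b_favoring + b_interaction
    """
    b_favoring = favoring_if_unfunded
    b_interaction = favoring_if_funded - favoring_if_unfunded
    b_funded = both_vs_neither - favoring_if_funded
    return b_funded, b_favoring, b_interaction


# Contrasts quoted in the text: 21.7 (funded), 2.5 (not funded), 25.7 (both vs neither).
b_funded, b_favoring, b_interaction = implied_coefficients(21.7, 2.5, 25.7)
print(round(b_funded, 1), round(b_favoring, 1), round(b_interaction, 1))
```

Under these assumptions, most of the 25.7-citation difference would sit in the interaction term rather than in either main effect, which is consistent with the finding that an industry-favoring result mattered only when the study was industry funded.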
Our analysis of a consecutive cohort of 328 original articles published in leading general medical journals found that declared industry funding with industry-favoring results, articles reporting data related to oncology or cardiovascular medicine, group authorship, higher impact journal of publication, and larger sample size were associated with higher rates of subsequent annual citation. Studies that declared industry funding and reported industry-favoring results were associated with the largest increase in annual citation rate.
Limitations and Strengths
Our review has potential limitations. Despite our aggressive search strategy it is possible that some citations were missed, and the difficulty in accurately retrieving citations of group-authored articles, in particular, has been documented. However, this is likely to affect only a relatively small proportion of articles (only 10 articles in our sample were exclusively group-authored), which would be unlikely to substantially alter our main results. Further, we found group authorship was associated with more citations, which provides additional assurance that our search strategy was successful.
We did not assess self-citation, which has been associated with increased frequency of subsequent citation. As well, we assumed all subsequent citations to be quantitatively equal and we did not assess the context in which the citation appeared. For example, there may be differences between studies that are cited in a positive fashion versus those that are cited in a critical or negative fashion.
Despite including many potentially relevant independent variables, our final model only accounted for a moderate amount of the variability in the citations received (adjusted R2 = 0.20). Our model, however, was able to provide more explanation of the variance in citation frequency than the previous model by Callaham et al. (pseudo-R2 = 0.14) and this is likely due to our inclusion of declared industry funding and industry-favoring results as variables. In fact, when these variables are removed from our model, the adjusted R2 falls to 0.15.
Our multivariable analysis highlights some of the limitations in the interpretation of the impact factor. For example, using our data, the difference in the annual citation rate between articles appearing in the highest and lowest impact journals in our sample (NEJM and Lancet) was 16.3, roughly in keeping with the 15.8 difference in their 2001 impact factors (29.1 and 13.3, respectively). However, the adjusted difference in annual citation rate was approximately 10.0 (95% CI = 1.7 to 18.3) (see Table 1), highlighting the fact that the impact factor is attributable to more than just the journal of publication.
Our work has additional strengths. Our cohort of 328 articles is the result of a systematic search. Our data collection was comprehensive and careful, including independent judgment and abstraction of data at all stages conducted by methodologically trained reviewers, and use of targeted, relevant analyses. Our results are not, however, generalizable to articles published in periodicals aside from the 3 high-impact general medical journals we reviewed.
The rate of citation is used to calculate journal impact factors, which are viewed as a sign of journal importance and prestige. Subsequent citation and journal impact factor are commonly used as criteria for academic promotions within universities and the works of more accomplished researchers, including Nobel laureates, receive more citations than the works of other researchers. Citation of articles is also an essential component of the dialogue of medical research, a dialogue which occurs largely within the pages of peer-reviewed journals. By reiterating published research, citation serves to further the influence of its results.
In a review of emergency medicine papers, Callaham et al. found that the impact factor of the publishing journal was associated with the largest increase in citation rates. Their study included a broader range of journal impact factors (ranging from 0.23 to 24.5) than our study, which was limited to only three very high impact factor journals (ranging from 13.3 to 29.1 in 2001). In our analysis, there was an association between journal and citation rate and this was in the expected direction, with articles in the higher impact journals having a higher rate of citation.
The impact factor of a journal has empirically been shown to be associated with article quality in some studies but not in others. In our adjusted analysis, larger sample size was associated with a higher citation rate while the design of the study was not. Some authors have described an association between citations and newsworthiness; however, the presence of an Associated Press news story (an indicator of newsworthiness and general public interest) did not demonstrate a significant enough association with citation rate to be included in our final multivariable model: when added to the existing multivariable model, it was associated with 5.5 more citations per year (95% CI = −2.2 to 13.2, p = 0.20).
The incidence of group authorship in the medical literature has steadily increased over the last two decades. In our analysis, group-authored articles received approximately 11.1 more citations per year than articles with only individually named authors, a result consistent with previous findings by Dickersin et al. One could hypothesize that papers with group authorship are potentially larger studies, of higher methodological rigor, and of possible greater general interest. However, our multivariable analysis attempted to correct for such confounding variables. We did not study the effect of self-citation, which may account for up to 20% of subsequent citations. It can be hypothesized that with group authorship (and, therefore, a greater number of authors) the potential impact of self-citation may be greater, thereby at least partly accounting for the higher citation rate.
The potential bias associated with industry-sponsored research has been suggested in previous works that have found an association between industry funding and the reporting of favorable results and lower methodological quality. Friedberg et al. found that pharmaceutical company sponsorship of economic analyses was associated with reduced likelihood of reporting unfavorable results. Djulbegovic et al. reported that industry-funded trials more often compared innovative treatments to either a placebo arm or no therapy, resulting in a higher proportion of such studies favoring the new intervention. This type of research has generally concentrated on examining the association between industry funding and study results. However, the next step in the dissemination of results is through their subsequent citation, and Patsopoulos et al. have recently shown that the proportion of most frequently cited articles funded by industry has been increasing.
After controlling for a number of other independent variables, our analysis found that studies with declared industry funding received approximately 22 more citations per year only if their results were industry-favoring. The influence of 22 additional citations per year certainly appears to be substantial when put in the context of the impact factors of the most cited journals in general medicine (which range from 10.4 to 44.0 for the top five journals in 2005). Therefore, the added influence appears to be the quantitative equivalent of having an extra publication in a high-impact journal. These extra citations may have the effect of amplifying the results of these studies in the medical literature.
In our analysis, large trials, with group authorship, industry-funded, with industry-favoring results, in oncology or cardiology were associated with greater subsequent citations. Declared industry funding with industry-favoring results was associated with the largest increase in annual citation rate. The medical community should be aware of the potential for these studies and their results to have greater impact in the subsequent medical literature.
We thank Debra Levin and Reza Mazaheri for contributing to the data collection for this manuscript.
Conceived and designed the experiments: JB AK. Performed the experiments: JB AK IS. Analyzed the data: JB AK. Wrote the paper: JB AK.
- 1. Garfield E (1996) How can impact factors be improved? BMJ 313: 411–413.
- 2. Gotzsche PC (1987) Reference bias in reports of drug trials. BMJ 295: 654–656.
- 3. Schmidt LM, Gotzsche PC (2005) Of mites and men: reference bias in narrative review articles: a systematic review. J Fam Pract 54: 334–338.
- 4. Ravnskov U (1992) Cholesterol lowering trials in coronary heart disease: frequency of citation and outcome. BMJ 305: 15–19.
- 5. Ravnskov U (1995) Quotation bias in reviews of the diet-heart idea. J Clin Epidemiol 48: 713–719.
- 6. Callaham M, Wears RL, Weber E (2002) Journal prestige, publication bias, and other characteristics associated with citation of published studies in peer-reviewed journals. JAMA 287: 2847–2850.
- 7. Phillips DP, Kanter EJ, Bednarczyk B, Tastad PL (1991) Importance of the lay press in the transmission of medical knowledge to the scientific community. N Engl J Med 325: 1180–1183.
- 8. Healy D, Cattell D (2003) Interface between authorship, industry and science in the domain of therapeutics. Br J Psychiatry 183: 22–27.
- 9. Olson CM (1994) Publication bias. Acad Emerg Med 1: 207–209.
- 10. Bekelman JE, Li Y, Gross CP (2003) Scope and impact of financial conflicts of interest in biomedical research: A systematic review. JAMA 289: 454–465.
- 11. Lesser LI, Ebbeling CB, Goozner M, Wypij D, Ludwig DS (2007) Relationship between funding source and conclusion among nutrition-related scientific articles. PLoS Med 4: e5.
- 12. Belsley DA, Kuh E, Welsch RE (1980) Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley & Sons.
- 13. Dickersin K, Scherer R, Suci ES, Gil-Montero M (2002) Problems with indexing and citation of articles with group authorship. JAMA 287: 2772–2774.
- 14. MacKinnon L, Clarke M (2002) Citation of group-authored papers. Lancet 360: 1513–1514.
- 15. Gami AS, Montori VM, Wilczynski NL, Haynes RB (2004) Author self-citation in the diabetes literature. CMAJ 170: 1925–1927.
- 16. Fassoulaki A, Paraskeva A, Papilas K, Karabinis G (2000) Self-citations in six anaesthesia journals and their significance in determining the impact factor. Br J Anaesth 84: 266–269.
- 17. Garfield E, Welljams-Dorof A (1992) Of Nobel class: a citation perspective on high impact research authors. Theor Med 13: 117–135.
- 18. Lee KP, Schotland M, Bacchetti P, Bero LA (2002) Association of journal quality indicators with methodological quality of clinical research articles. JAMA 287: 2805–2808.
- 19. Weeks WB, Wallace AE, Kimberly BC (2004) Changes in authorship patterns in prestigious US medical journals. Soc Sci Med 59: 1949–1954.
- 20. Als-Nielsen B, Chen W, Gluud C, Kjaergard LL (2003) Association of funding and conclusions in randomized drug trials: a reflection of treatment effect or adverse events? JAMA 290: 921–928.
- 21. Bhandari M, Busse JW, Jackowski D, Montori VM, Schunemann H, et al. (2004) Association between industry funding and statistically significant pro-industry findings in medical and surgical randomized trials. CMAJ 170: 477–480.
- 22. Kjaergard LL, Als-Nielsen B (2002) Association between competing interests and authors' conclusions: epidemiological study of randomised clinical trials published in the BMJ. BMJ 325: 249–252.
- 23. Friedberg M, Saffran B, Stinson TJ, Nelson W, Bennett CL (1999) Evaluation of conflict of interest in economic analyses of new drugs used in oncology. JAMA 282: 1453–1457.
- 24. Djulbegovic B, Lacevic M, Cantor A, Fields KK, Bennett CL, et al. (2000) The uncertainty principle and industry-sponsored research. Lancet 356: 635–638.
- 25. Patsopoulos NA, Ioannidis JP, Analatos AA (2006) Origin and funding of the most frequently cited papers in medicine: database analysis. BMJ 332: 1061–1064.