Integrative Approach to Quality Assessment of Medical Journals Using Impact Factor, Eigenfactor, and Article Influence Scores

Background: Impact factor (IF) is a commonly used surrogate for assessing the scientific quality of journals and articles. There is growing discontent in the medical community with the use of this quality assessment tool because of its many inherent limitations. To help address such concerns, Eigenfactor (ES) and Article Influence scores (AIS) have been devised to assess the scientific impact of journals. The principal aim was to compare how temporal trends in IF, ES, and AIS affect the rank order of leading medical journals.
Methods: The 2001 to 2008 IF, ES, AIS, and number of citable items (CI) of 35 leading medical journals were collected from the Institute for Scientific Information (ISI) and the http://www.eigenfactor.org databases. The journals were ranked based on the published 2008 ES, AIS, and IF scores. Temporal score trends and variations were analyzed.
Results: In general, the AIS and IF values provided similar rank orders. Using ES values resulted in large changes in the rank orders, with higher rankings being assigned to journals that publish a large volume of articles. Since 2001, the IF and AIS of most journals increased significantly; however, the ES increased in only 51% of the journals in the analysis. Conversely, 26% of journals experienced a downward trend in their ES, while the rest (23%) experienced no significant changes. This discordance between temporal trends in IF and ES was largely driven by temporal changes in the number of CI published by the journals.
Conclusion: The rank order of medical journals changes depending on whether IF, AIS or ES is used. All of these metrics are sensitive to the number of citable items published by journals. Consumers should thus consider all of these metrics rather than IF alone in assessing the influence and importance of medical journals in their respective disciplines.


Introduction
The impact factor (IF), which is a score calculated each year by the Institute for Scientific Information (ISI), is widely considered one of the leading proxies for evaluating the quality, importance, and influence of medical journals in their respective disciplines (Science Citation Index, Journal Citation Report. Institute for Scientific Information, www.isinet.com). [1] Medical editors frequently use the IF as a performance index of their journal and as a means of ranking their journals relative to their peers. [2,3,4,5] Some journals use the IF to "advertise" their quality and to entice potential authors to submit high-quality papers to them. Promotion committees of academic institutions commonly use the IF to judge the quality of publications of applicants for promotion and tenure, and departmental chairs may use it in the hiring and assessment of new recruits. [6] Increasingly, however, there is discontent with the IF as a tool for determining the "quality" and "prestige" of journals [7,8]. One reason is that the distribution of citations is highly skewed, with fewer than 20% of the articles accounting for more than 50% of the total number of citations of journals and with many articles never receiving any citations [9,10]. Moreover, IF only counts the number of citations without taking into account the source of the citations (i.e., citations from prestigious journals are worth no more than citations from lower-tier journals) or making any allowance for the "citation culture" between journals and across disciplines [7]. It is also now well recognized that a journal's IF can be increased by reducing the number of original research papers, by increasing the number of editorials (which are not counted in the denominator of IF) and review papers (which receive on average twice as many citations as original articles) [9,11], and by encouraging self-citations [7,11].
Original research papers, however, are the main "engines" for generating new knowledge and, by decreasing their publication rate, journals may be impeding the dissemination of scientific knowledge and curtailing scientific discourse. Over time, this may increase the IF but paradoxically reduce the overall influence of these journals on the scientific community as fewer scientists and clinicians read the journal. To address these and other concerns with the IF, other instruments, including those that take into account the quality as well as the quantity of citations, have been proposed [12,13,14]. This concept was first proposed by Pinski and Narin [15], who suggested that journals should be ranked according to their eigenvector centrality in a citation network. With the recent success of Google's ranking system for web pages, this concept has been modified to include algorithms based on a PageRank system [13]. Although there are several different algorithms in use, the two that have gained the most attention in recent years are Scimago Journal Rank (SJR) (http://www.scimagojr.com/index.php) and Eigenfactor score (ES) (http://eigenfactor.org/), both of which use an iterative weighting system to calculate a summary index that reflects both the "quality" and the "quantity" of citations received by these journals based on a PageRank algorithm [12,15]. Despite the differences in the way in which weight-based and non-weight-based methods are derived, studies have shown that in any given year, scores based on a PageRank algorithm correlate well with those based on traditional IF and produce similar rank orders of medical journals [14,16]. However, it is not known whether the temporal trends in these scores produce similar or differential rank orders of these journals.
Since ES is at least in part dependent on the number of citable items published by journals in any given year [17,18], by reducing the publication rate, it is possible for a journal to increase IF without changing its ES (and vice versa). Thus, the primary aim of the present study was to determine the changes in IF and ES across the major general and sub-specialty medical journals over the past 8 years.

Selection of Journals
We decided a priori to evaluate the temporal trends in the impact factor (IF) and Eigenfactor Score (ES) in 35 general and subspecialty clinical journals between 2001 and 2008. We chose this timeframe to mitigate the influence of name changes of journals on the IF and ES calculations and to ensure comparability of data across the journals. To ensure reasonable representation of journals from each discipline, we chose the three most highly ranked journals per discipline as determined by the 2008 IF, except for respiratory medicine and endocrinology, in which four rather than three journals were selected. We did this to mitigate the potential effect of overlap of content and audience of journals in the "respiratory system" and "critical care medicine" categories (e.g., the American Journal of Respiratory and Critical Care Medicine is listed in both categories) and to ensure adequate representation of non-diabetic papers (and audience) in "endocrinology", as the top two journals under this category were diabetes-focused (e.g., Diabetes and Diabetes Care). From the Thomson Reuters Journal Citation Reports (http://admin-apps.isiknowledge.com/JCR/JCR?PointOfEntry=Home&SID=3EIG4M34Amad@6eKDPA) and the Eigenfactor.org (http://eigenfactor.org/) websites, two independent reviewers (JR, DS) abstracted data on the IF, ES, citable items and Article Influence Score (AIS) for these journals. The data were imported into an Excel spreadsheet and any disagreements were resolved by iteration and consensus.
The journals that were evaluated included Annals of Neurology

Impact Factor
The IF is published by ISI each year for all indexed journals and is calculated based on a three-year period. It reflects the average number of times that papers are cited up to two years following publication. For example, the 2009 IF for a journal would be calculated by taking the number of times articles (original, reviews, proceedings or notes) published in 2007 and 2008 were cited in 2009 and dividing this number by the total number of articles, reviews, proceedings, guidelines or consensus statements that were published in this journal in 2007 and 2008. Editorials and letters to the editors are generally excluded from the denominator but can be counted in the numerator of the impact factor. In general, review articles, consensus statements and clinical guidelines are cited more frequently than original articles [19].
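The calculation described above can be sketched in a few lines. This is a minimal illustration, not ISI's implementation: the function name `impact_factor` and all of the numbers are hypothetical, and the second call simply assumes that a journal drops some original articles (which shrinks the denominator) while losing proportionally fewer citations.

```python
def impact_factor(citations_in_year: int, citable_items_prior_two_years: int) -> float:
    """Two-year impact factor as described in the text.

    citations_in_year: citations received in year Y by items the journal
        published in years Y-1 and Y-2 (the numerator).
    citable_items_prior_two_years: articles, reviews and other citable items
        published in years Y-1 and Y-2 (the denominator; editorials and
        letters are generally excluded from it).
    """
    if citable_items_prior_two_years <= 0:
        raise ValueError("the journal must have at least one citable item")
    return citations_in_year / citable_items_prior_two_years


# Hypothetical journal: 600 citable items drawing 6000 citations.
baseline = impact_factor(6000, 600)

# Same hypothetical journal after publishing 150 fewer citable items,
# assuming (for illustration only) that it loses 600 citations.
fewer_items = impact_factor(5400, 450)

print(baseline, fewer_items)  # 10.0 12.0
```

Under these assumed numbers the IF rises from 10 to 12 even though the journal receives fewer citations in total, which is the denominator effect discussed in the Introduction.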

Eigenfactor Score (ES)
For each of these journals, we retrieved data on the ES from http://www.eigenfactor.org. ES is calculated based on a complex algorithm that takes into account not only the quantity of citations but also their "quality" by assigning weights to the source of the citations. The full details of the algorithm can be found at http://www.eigenfactor.org/methods.htm. In brief, the algorithm assigns quality scores to journals by creating a citation network in which journal articles are first randomly selected. The citation lists from these retrieved articles are then used by the network to select the next set of journals. The citation lists from this batch of journals are then used by the network to select the third set of journals. This process continues indefinitely, creating a hierarchical ranking of journals based on the frequency of citations. The network assumes that journals that are highly cited are of high quality, while those that are infrequently cited are of lower quality. Importantly, the ES has no denominator. Thus, journals that publish a lot of articles have a higher ES than those that publish very few articles if the average quality of the published articles is similar between these journals.
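The random walk through citation lists described above can be approximated by a PageRank-style power iteration over a journal-by-journal citation matrix. The sketch below is a simplified illustration under stated assumptions, not the published Eigenfactor algorithm: it uses a uniform teleportation term with an assumed damping factor of 0.85, ignores the citation window and article-count weighting, and the function name `influence_scores` and the three-journal matrix are hypothetical.

```python
def influence_scores(citations, damping=0.85, iterations=200):
    """Return one influence weight per journal; the weights sum to 1.

    citations[i][j] is the number of citations from journal j to journal i.
    """
    n = len(citations)
    # Ignore journal self-citations so a journal cannot boost its own weight.
    z = [[0.0 if i == j else float(citations[i][j]) for j in range(n)]
         for i in range(n)]
    # Normalize each column so that it describes where journal j's
    # outgoing citations go; a journal citing nobody is spread uniformly.
    col_sums = [sum(z[i][j] for i in range(n)) for j in range(n)]
    h = [[z[i][j] / col_sums[j] if col_sums[j] > 0 else 1.0 / n
          for j in range(n)] for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Follow a citation with probability `damping`, otherwise jump
        # to a journal chosen uniformly at random.
        scores = [damping * sum(h[i][j] * scores[j] for j in range(n))
                  + (1.0 - damping) / n for i in range(n)]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical citation counts (rows: cited journal, columns: citing journal).
matrix = [
    [0, 30, 20],   # journal A is cited 30 times by B and 20 times by C
    [10, 0, 5],    # journal B
    [5, 5, 0],     # journal C
]
scores = influence_scores(matrix)
print([round(s, 3) for s in scores])
```

In this toy network journal A ends up with the largest weight because it attracts most of the citations from the other two journals, and citations coming from A in turn carry more weight than those coming from C.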

Article Influence Score (AIS)
Article Influence™ Score (AIS) is derived from ES and is conceptually similar to IF in that it has a numerator as well as a denominator (i.e., the number of citable papers), except that it uses ES (rather than the total number of citations) as the numerator. Thus, unlike IF, where all citations are counted equally regardless of their source, in AIS each citation is weighted by the "quality" of the citing journal, resulting in greater weight for citations that come from highly cited journals and less weight for those from poorly cited journals. To facilitate interpretation, the AIS is normalized so that the mean article in the Journal Citation Reports® has an AIS of 1.00.
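One way to picture the normalization described above is to divide each journal's share of the total ES by its share of all published articles, so that the article-weighted mean comes out at 1.00. The sketch below assumes that reading; the function name `article_influence`, the two journals, and their values are hypothetical and are not taken from the study data.

```python
def article_influence(es_by_journal, articles_by_journal):
    """Per-article influence: share of total ES divided by share of articles."""
    total_es = sum(es_by_journal.values())
    total_articles = sum(articles_by_journal.values())
    return {
        name: (es_by_journal[name] / total_es)
        / (articles_by_journal[name] / total_articles)
        for name in es_by_journal
    }


# Hypothetical two-journal universe.
es = {"High volume": 0.6, "Low volume": 0.4}
articles = {"High volume": 3000, "Low volume": 1000}

ais = article_influence(es, articles)
for name, value in ais.items():
    print(name, round(value, 2))

# Weighting each journal's AIS by its share of articles recovers a mean of 1.
mean_ais = sum(ais[name] * articles[name] for name in ais) / sum(articles.values())
print(round(mean_ais, 2))
```

With these assumed values the high-volume journal has the larger ES but the smaller AIS, which mirrors the way ES rewards publication volume while AIS, like IF, is a per-article measure.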

Statistical Analysis
The journals were ranked based on the published 2008 ES, AIS, and IF scores. We also retrieved the 2001 to 2008 ES, AIS, and IF scores and the number of citable items (CI) in order to determine the temporal trends in these values. The statistical significance of the temporal trends was determined using a chi-square test for trend. A p-value of less than 0.05 was considered statistically significant. All analyses were conducted using SAS version 9.1 (Cary, N.C.).
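The ranking step can be illustrated with a short sketch that ranks the same journals separately by each metric and then compares the resulting positions. The journal names and values below are hypothetical placeholders rather than study data, and the helper `rank_by` is an assumed name, not part of the SAS analysis.

```python
def rank_by(metric_values):
    """Return {journal: rank}, where rank 1 is the highest value.

    Ties are broken alphabetically by journal name.
    """
    ordered = sorted(metric_values, key=lambda name: (-metric_values[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical scores for three journals.
journals = {
    "Journal A": {"IF": 30.0, "AIS": 12.0, "ES": 0.20},
    "Journal B": {"IF": 12.0, "AIS": 5.0, "ES": 0.45},
    "Journal C": {"IF": 8.0, "AIS": 3.0, "ES": 0.10},
}

ranks = {
    metric: rank_by({name: scores[metric] for name, scores in journals.items()})
    for metric in ("IF", "AIS", "ES")
}

for name in journals:
    print(name, {metric: ranks[metric][name] for metric in ranks})
```

In this made-up example the IF and AIS orderings agree, while the ES ordering promotes Journal B, the kind of shift the comparison of rank orders is meant to reveal.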

AIS, IF, and ES
The 2008 ES, AIS, and IF values of selected medical journals are shown in Table 1. Of the evaluated journals, the overall leader was the New England Journal of Medicine irrespective of the metric used to measure quality. However, the rankings for the remaining journals changed depending on the score that was used. For instance, using the traditional IF score, the 2nd leading journal in 2008 was JAMA, followed by the Lancet, J Clin Oncol and J Natl Cancer I. In general, the AIS and IF values provided similar rank orders, with a few notable exceptions including J Neurosci, which was ranked 11th based on AIS and 20th based on IF, and J Allergy Clin Immunol, which was ranked 20th based on AIS and 14th based on IF.
Using ES values resulted in large changes in the rank order of the selected journals. While the N Engl J Med retained the top spot, J Neurosci took over the 2nd spot on the list, followed by Circulation, Lancet and JAMA. In general, journals that published a lot of [...] (figure 1A and 1B). In general, however, journals with a high number of citable items displayed higher ES values than those that had a small number of citable items.
[Table 1 (fragment; 2008 ES, AIS, and IF values, rank in parentheses): J Clin Oncol 0.34752 (6), 4.164 (7), 17.157 (4); J Am Coll Cardiol 0.22767 (7), 3.727 (10), ...]

Trends in IF, ES, AIS, and CI Between 2001 and 2008
Since 2001, the IF of 77% (27/35) of the journals included in this analysis increased significantly (Table 2). Only J Neurosci experienced a significant decline in IF. In the remaining journals, the IF did not change significantly over time. In contrast, only 51% (18/35) of the journals increased their ES values over the 8 years, while 26% (9/35) of the journals experienced a decline in their ES values (Table 3). The discordance between the temporal trends in IF and ES was largely driven by the temporal changes in the number of citable items published by each of the journals (see figure 2A). In 20% of the journals, the number of citable items increased and in another 20% the number of citable items decreased over time. In the remaining 60%, the number of citable items did not change significantly (Table 4; figure 2A). In general, as the number of citable items decreased, the IF of the journals increased, though this relationship did not reach statistical significance (p = 0.132), largely due to the extreme effect of the New England Journal of Medicine, whose IF score increased by 21 in the absence of any significant change in the number of citable items over the 8 years of the study. The removal of the New England Journal of Medicine from this analysis, however, led to a significant relationship between the temporal trends in CI and IF (figure 2B; p = 0.05). There were journals whose IF score and number of citable items both increased during this period of time (see Tables 2 and 4) [...] (Table 5).

Discussion
There is no universally accepted metric for assessing the "quality" and "influence" of journals in the scientific community. In the Journal Citation Reports, ISI provides several attributes for assessing quality, including total citations, IF, ES, and AIS. Of these, the most widely used metric is the IF. However, the major shortcoming of IF is that it is sensitive to the number of original research papers published per year. Because review papers and guidelines are in general cited more often than original papers, journals can increase their IF by publishing fewer original papers (and more review papers). Paradoxically, however, because original research is the primary engine for generating new scientific knowledge (or validating existing knowledge), by reducing the publication rate of research articles, journals' influence on the scientific discourse of their discipline may decrease. ES is an attempt to capture the "influence" of medical journals on the scientific discourse generated in their [...] These data indicate that IF and ES in particular can produce dissimilar results and thus highlight the importance of using multiple metrics rather than just one in assessing the performance of journals and the impact and influence they have on their respective fields of study. Our data are consistent with those of Chew et al [20], who showed that the IF of the seven top-ranked general medical journals rose considerably between 1994 and 2005 but the [...] [21] and that both usage- and citation-based measures are needed to understand the scientific impact of journals. It should also be noted that, in our analysis, although IF and AIS values are calculated differently, they nonetheless produced similar rank orders of journals, suggesting that the weighting system of AIS does not significantly modify the performance status of the journals. The major discrepancies occurred only when the denominator of AIS was removed (yielding ES values), which highlights the importance of the quantity of publications in the determination of the scientific impact of journals.
There are important implications of these data. Firstly, it is essential that authors take into account not only the IF of journals when deciding where to send their paper but also the ES, as journals with a high IF but a low ES may have low readership and little influence on their respective field, although in general papers that are highly accessed and viewed are cited more frequently than those that have limited access [22,23,24]. Secondly, IF must be viewed in the context of other metrics such as ES and AIS, which take into account not only the quantity but also the quality of the citations. Thirdly, the rise of the journal IF over the past decade likely reflects the increase in the citation rate of papers published in these journals. However, it is possible that in some journals, the rise in their IF may in part reflect a reduction in the number of original articles published per year. The potential paradox is that by doing so these journals may be limiting their influence. Thus, as with individual researchers, journals should use IF in conjunction with other metrics such as ES in assessing the relevance and "impact" of their journals in their respective field.
There were limitations to this study. Firstly, ES was used as a surrogate for the "influence" of journals. However, this metric has never been fully validated for this outcome. In the same vein, IF has never been fully validated as a measure of "quality", though it is widely used in this fashion. Secondly, there are other conventional metrics of journal quality, such as immediacy index and citation half-life, and PageRank-based metrics, such as the SCImago journal rank indicator [25], that were not considered in the present analysis. Thirdly, an important aspect of understanding the influence of journals is to determine the size and make-up of the readership, which was not done in the present study. Some [22,23,24] but not all [26] studies suggest that papers that are viewed more frequently receive higher citation rates than those that are accessed infrequently. Fourthly, we did not determine the reasons for the rise and fall of IF, ES and citable items in these journals. A previous study suggested that the temporal increases in IF for certain journals may reflect several factors, including active recruitment of "high-impact" papers by journal editors, acceleration of the review and publication process, early on-line publication of accepted articles, media promotion of articles and journals, and the increase in the number of journals included in the ISI database [20]. The reasons for the fall in the citable items for certain journals are also unclear. Some explanations include journals becoming more selective about the articles that they accept, and re-design of journals leading to fewer pages [20]. Whatever