A Comprehensive Survey of Retracted Articles from the Scholarly Literature

Background: The number of retracted scholarly articles has risen precipitously in recent years. Past surveys of the retracted literature each limited their scope to articles indexed in PubMed, though many retracted articles are not indexed there. To understand the scope and characteristics of retracted articles across the full spectrum of scholarly disciplines, we surveyed 42 of the largest bibliographic databases for major scholarly fields, along with publisher websites, to identify retracted articles, and examined various trends among them.

Results: We found 4,449 scholarly publications retracted from 1928 to 2011. Unlike Math, Physics, Engineering and the Social Sciences, the percentages of retractions in Medicine, Life Science and Chemistry exceeded their percentages among Web of Science (WoS) records. Retractions due to alleged publishing misconduct (47%) outnumbered those due to alleged research misconduct (20%) or questionable data/interpretations (42%); the total exceeds 100% because some retraction notices listed multiple justifications. Retraction/WoS record ratios vary among author affiliation countries. Though retractions are widespread, only minuscule percentages of publications for individual years, countries, journals, or disciplines have been retracted. Fifteen prolific individuals accounted for more than half of all retractions due to alleged research misconduct, and strongly influenced all retraction characteristics. The number of articles retracted per year increased by a factor of 19.06 from 2001 to 2010, though excluding repeat offenders and adjusting for growth of the published literature reduces this to a factor of 11.36.

Conclusions: Retracted articles occur across the full spectrum of scholarly disciplines. Most retracted articles do not contain flawed data, and the authors of most retracted articles have not been accused of research misconduct. Despite recent increases, the proportion of published scholarly literature affected by retraction remains very small. Articles and editorials discussing retractions, or their relation to research integrity, should always consider individual cases in these broad contexts. However, better mechanisms are still needed for raising researchers' awareness of the retracted literature in their field.


Introduction
The number of articles retracted each year has increased precipitously in recent years. Prior studies, which mainly focused on the medical literature, found that article retractions and lapses in research integrity impact both the published literature and the evolution of scientific knowledge [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]. However, the severity of the problem has been a matter of debate. On one hand, some feel that the rise in retractions reflects a very real and pressing issue: Although retracted publications represent only a minuscule percentage of the total literature [1], [2], surveys of researchers have suggested that only a fraction of research misconduct cases are caught and publicly discredited [3], [4]. Furthermore, the results from retracted articles continue to be cited as valid [5][6][7]. In an attempt to determine the potential extent of the problem, a crude estimate of ''potentially retractable'' articles, based on retraction frequencies in high-impact journals, was conducted [1], though it has since been criticized as too simplistic [8].
In contrast, other authors argue that since articles can be retracted for a variety of reasons, the recent rise in retractions may not actually reflect a ''crisis of scientific integrity'' which may be superficially suggested by the raw numbers: For example, past surveys found that despite an increasing number of retractions due to misconduct [9], [10], more articles had been retracted due to unintentional errors [11]. For this reason, some have argued that article retraction should generally be uncoupled from the stigma of ''misconduct'' [12]. They argue that if retractions are to be used as a proxy for measuring misconduct, then retraction, or ''unpublication,'' should be a last resort, reserved for only the most egregious offences [13], [14].
Recent systematic studies have attempted to characterize retraction notices, retracted publications and the role of misconduct in order to gain a more thorough understanding of the impact of retractions and the true extent of misconduct in scholarly research. However, studies published from 1992-2011 were somewhat limited in scope due to a sole reliance on the literature indexed in the PubMed database (Table S1), with the largest dataset examined to date comprising only 871 retracted publications [9].
In order to gain a broader perspective on the phenomenon of article retractions, in both the medical and non-medical literature, we identified 4,449 formally retracted scholarly publications from 42 bibliographic databases and major publisher websites representing a broad spectrum of scholarly fields. Our analysis investigated various attributes among these articles, such as their distributions across disciplines, geographic location of author affiliations, the justifications and authorities calling for retraction, and temporal trends. Trends in these attributes can shed light on why publications are being retracted at current rates, in both the medical and non-medical literature, and foster the development of more effective methods for curbing the rising retraction rates.

Data Sources
Retracted scholarly articles were identified using bibliographic databases, major publisher websites, non-publisher journal aggregators (such as J-STOR), and search engines. To compile a comprehensive list of retracted articles, a wide variety of data sources must be consulted for several reasons. Bibliographic databases vary with respect to the journals they index ''cover-to-cover,'' the journals they index only selectively, and their policies regarding retroactively marking existing records for retracted publications [21], [22]. Due to the latter two factors, even if database X covers journal Y, identifying the retracted articles from that journal in the database may not be possible. One example is the Inspec database, which covers a wide range of engineering journals but does not index retraction notices or retroactively mark retracted article records in any discernible way.
Thus, 42 data sources were consulted from May-June 2011 using the queries indicated in Table 1. The highest-yielding sources were re-queried in Aug 2011 to capture the dozens of articles retracted in the interim. Search terms for retrieving either the retracted articles or the retraction notices were used, as cases where only one of the two could be identified in a given data source were common. The 42 data sources consulted represent the largest broad-scope scholarly literature databases, the most comprehensive sources which focus on major fields of study, and several which cover specialized literature that is largely excluded from other sources. An example of the latter is Global Index Medicus-IMSEAR (SEARO), which yielded many retracted articles from Southeast Asian medical journals that are not indexed in any of the other sources.

Criteria for Considering an Article ''retracted''
Here we consider a ''retracted'' article to be one that has been explicitly ''retracted'' or ''withdrawn'' via a notice, erratum, corrigendum, editorial note, ''rectification,'' or other such editorial notification vehicle. We included cases of ''partial retraction'' by such notices, where retraction applied to only a portion of the publication, such as a single figure with questionable data which may or may not be central to the main point(s) of the publication. Articles identified as problematic but not explicitly retracted (such as a simple ''statement of duplicate publication,'' pairs of original and ''corrected and republished'' articles, or articles mentioned in ''editorial expression of concern'' notices but not yet retracted) were not included in this study.

Compiling the Master list of Retracted Articles
Compiling the list of retracted articles and their corresponding retraction notices began with the PubMed query (''retracted publication''[pt] OR ''retraction of publication''[pt]). Each retracted article citation in the results output was matched with the citation for its corresponding retraction notice, based on either data from the PubMed records or consultation of the notice to determine which article(s) it retracted. The list of paired retracted article/notice citations was imported into an Excel spreadsheet and sorted by full journal title, volume, and pagination of the original article. The results of additional queries of PubMed and the other data sources listed in Table 1 were sorted by ''journal title'' (for databases offering such an option), and all results were manually screened against the growing retracted article list. Duplicates and ''off-topic'' hits were discarded, while citations for clearly identified retracted articles and retraction notices that were not already on the list were added. Terms such as ''retraction'' and ''withdrawal'' are used in many contexts other than article retractions, such as ''retraction'' of an airplane's landing gear or ''alcohol withdrawal.'' Thus, many queries yielded high proportions of ''off-topic'' hits, such as the WoS query for ''TI = retract*'', which yielded 7,925 records on 07 Aug 2011 (Table 1, line 26). Additional data sources were queried until the vast majority of retracted articles retrieved were already on the list, indicating a point of diminishing returns. At that point, queries of the websites of individual journals with large numbers of retractions (Table 1, part C) verified that we had identified all retracted articles in them from the data sources consulted. In summary, the 42 data sources yielded 4,449 scholarly articles retracted between 1928 and 2011, which were subjected to further analysis.
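The screening workflow described above, checking each new query result against the growing master list keyed by journal, volume, and pagination, can be sketched as follows. This is a minimal illustration with hypothetical record fields, not the authors' actual tooling (they used Excel):

```python
# Sketch of the master-list screening step (hypothetical record structure):
# each candidate citation from a database query is checked against the
# growing list of known retracted articles, keyed by journal title,
# volume, and pagination of the original article.
def make_key(citation):
    """Normalize a citation to the (journal, volume, pages) key used for matching."""
    return (citation["journal"].strip().lower(),
            citation["volume"],
            citation["pages"])

def screen(candidates, master):
    """Add clearly identified retracted articles not already on the master list."""
    known = {make_key(c) for c in master}
    added = []
    for c in candidates:
        key = make_key(c)
        if key not in known:          # discard duplicates already on the list
            master.append(c)
            known.add(key)
            added.append(c)
    return added

master = [{"journal": "J. Example", "volume": 12, "pages": "34-40"}]
hits = [
    {"journal": "J. Example", "volume": 12, "pages": "34-40"},   # duplicate
    {"journal": "Acta Test", "volume": 3, "pages": "101-105"},   # new entry
]
new_items = screen(hits, master)
```

In practice the ''point of diminishing returns'' the authors describe corresponds to `screen` returning fewer and fewer new items per data source queried.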

Metadata
In order to characterize various attributes of the retracted articles, we were able to obtain 4,244 of the retracted articles and the retraction notices for 4,232 of them. In the remaining cases, the article, its retraction notice, or both were not available to us in hard copy or online. Because our survey yielded only 21 articles retracted prior to 1980 and 2011 was only a partial year, our analysis focused on the retracted articles published from 1980-2010. Previous smaller-scale studies have examined various attributes of retracted articles from the medical literature (Table S1), such as trends in the number of articles retracted over time, across disciplinary fields, and by country of authorship, in addition to determining the frequency of various justifications for retraction and the authorities involved in calling for the retraction of articles. We sought to examine these attributes among a more comprehensive and multi-disciplinary set of retracted articles. Table 2 lists the categories used to describe the attributes of the retracted articles that we analyzed in this study. The information given in retraction notices was taken at face value, and no attempt was made to independently verify the accuracy of the statements made in the notices.

Table 1. Forty-two data sources were consulted to locate retracted articles. [Table body not reproduced here; part A lists multi-publisher databases and journal aggregators.]

For comparisons between different years, disciplines, journals, and authorship countries, data are presented as both raw counts and either ratios or percentages which are normalized by some measure of their relative sizes.

Attributes Studied
Table 2. Attributes of article retraction cases analyzed in this study.
1. Justification for retraction. Many retraction notices list multiple retraction justifications for a single article. For the purposes of this study, the justifications were divided into categories of Publisher error, Author error, Other and Unspecified. These broad categories were then further divided, resulting in a total of 15 justification categories. Among them:
iii. Distrust data or interpretations, meaning that the data or interpretations as published are no longer considered valid or reliable by some or all of the authors. This category is dominated by cases of unexplained data irreproducibility or experimental artifacts discovered post-publication, and excludes cases of ''data falsification or fabrication'' covered by category 1.b.i.1 above.
1.c. Other, including scenarios where the results of a crucial supporting article were retracted, or statements such as ''breach of ethics'' or ''data irregularities'' which were too vague to allow proper assignment to any of the more specific categories.
3. Scholarly fields. The retracted publications were assigned to scholarly fields based on Web of Science (WoS) categories assigned to the journals in which they are published. The number of articles in each WoS category was then tallied. These figures were then summed by assigning WoS categories to one or more of the 12 broad fields in Fig. 2. In addition, the impact factor of each journal listed in the 2010 edition of Thomson Reuters' Journal Citation Reports was obtained.
4. Country of affiliation. Defunct country names were combined as appropriate to reflect current United Nations-recognized countries. The European Union (EU-27) category included retracted articles with at least one author from one of the 27 countries comprising the EU as of 2011. In some cases where original articles were not available, author affiliation countries were obtained from WoS, the only major database which includes all author affiliation addresses. In total, there were 102 countries represented.
5. Year. Articles were categorized by the year of publication and the year of retraction. For articles retracted prior to volume/pagination assignment, the date posted online was used, when known.
6. Full or partial retraction. Some notices only retracted a portion of a publication, such as a single figure with questionable data which may or may not be central to the main point(s). These were considered ''partial'' retractions, to distinguish them from full retractions. These ''partial retractions'' represented 3% of the articles included in our analyses.
doi:10.1371/journal.pone.0044118.t002

Disciplinary and journal distributions. An objective system for assigning articles to various fields of study is required to compare retraction rates among different scholarly disciplines. WoS covers all scholarly fields and includes query features for easily obtaining data on the relative proportions of articles published in specific disciplines. The retracted publications we identified were assigned to scholarly fields based on the ''Web of Science Categories'' assigned to the journals in which they are published. Only those journals indexed by WoS were assigned WoS Categories; the retracted articles in the remaining journals were excluded from those analyses which rely on additional WoS data. For each of the WoS Categories with at least one retracted article, the sum of retracted articles in the journals assigned to that category gives the number of retracted articles in the category.
However, the size of the published literature varies among fields of study; therefore, the raw counts must be normalized to allow direct comparison of the proportions of articles which are retracted in each field. Percentages of records in each WoS Category for a WoS query of ''publication year = 2010'' were obtained using the ''Web of Science Categories'' option of the ''Analyze Results'' feature of WoS. These percentages were then compared with the percentages of the retracted articles which were assigned to these same categories. The WoS Categories were also grouped into 12 broad fields. The retracted articles in each WoS Category were summed and the sums were converted to percentages of all retracted articles. Results from the WoS query of ''publication year = 2010'' were grouped into the same 12 broad fields, and converted to percentages of all records for each field. Because previous studies of retracted articles focused on the literature indexed in PubMed, we also compared the number of ''PubMed retractions'' to ''non-PubMed retractions'' for each year from 1980-2010, with ''PubMed retractions'' defined as those articles that are marked with ''Retracted publication'' in the publication type field.
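The normalization described above can be sketched as a short calculation. The counts below are illustrative placeholders, not the study's actual figures; the comparison logic (percentage of retracted articles per category versus percentage of all 2010 WoS records per category) follows the text:

```python
# Sketch of the per-category normalization (illustrative counts only):
# a category whose share of retracted articles exceeds its share of all
# 2010 WoS records has a disproportionately high retraction rate.
retracted_counts = {"Anesthesiology": 115, "History": 2}        # hypothetical
wos_2010_counts = {"Anesthesiology": 5654, "History": 40000}    # hypothetical

total_retracted = sum(retracted_counts.values())
total_wos = sum(wos_2010_counts.values())

def pct(count, total):
    """Convert a raw count to a percentage of the given total."""
    return 100.0 * count / total

comparison = {
    cat: (pct(retracted_counts[cat], total_retracted),
          pct(wos_2010_counts[cat], total_wos))
    for cat in retracted_counts
}
overrepresented = [cat for cat, (r, w) in comparison.items() if r > w]
```

The same comparison, applied to the 12 broad fields, underlies the observation in the Results that Medicine, Chemistry, Life Sciences and Multidisciplinary Sciences have higher retraction rates than their share of the literature would predict.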
Justification for retraction. Various justifications have motivated the decisions to retract scholarly articles. Determining the relative proportions of these justifications shows their impact on the published literature. Many surveys of the retracted literature (see references in Table S1) use justification categories dictated by the focus of the study. For example, a study of ''scientific fraud'' [10] divided all justifications into ''Fraud'' and ''Error'' categories, and included breaches of publishing ethics (e.g., plagiarism, duplicate publication, authorship disputes) in the latter group. In this study, we separately quantified several overlapping justification categories (Table 2) which are relevant to different issues. For example, the category total for ''Alleged data manipulation'' shows the scope of this particular issue. On the other hand, determining the overall extent of unreliable published data requires a count of all articles with data questioned due to either ''Alleged data manipulation'' or ''Distrust data or interpretations'' (the latter including artifacts or unexplained irreproducibility). Because multiple justifications are often given for retraction of a single article, the counts for individual categories cannot simply be added together.
From our initial list of justification categories, some were refined as logical groupings emerged among the retraction notices consulted, and all notices previously assigned to a category which changed were reassessed. For example, we initially included separate categories for ''known artifact'' and ''unexplained irreproducibility,'' but we decided to merge them into one category of ''Distrust data or interpretations'' after we noticed varying degrees of certainty expressed in different cases. None of the retraction notices in this study specifically mentioned ghostwriting or guestwriting, activities receiving increasing attention in recent years [23].
Repeat offenders. While compiling the list, we noticed that large numbers of retracted articles were sometimes associated with a single author. Some of these cases involved an extensive list of publications in a single retraction notice, while in others, retraction notices were scattered among the individual journals involved. To determine the influence of specific individuals on total retraction counts, the author lists for all 4,449 retracted articles were subjected to a Pivot Table analysis in Excel. Author names yielding more than 15 hits were then surveyed to determine how many articles were attributable to a single individual, based on field of study and institutional affiliation.
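The tally described above, an Excel Pivot Table over author names followed by manual follow-up on high-count names, can be sketched as a simple frequency count. The article records and threshold below are illustrative (the study's manual-check threshold was more than 15 hits):

```python
# Sketch of the repeat-offender tally: count author-name occurrences
# across all retracted articles (as a Pivot Table would), then flag
# names above a threshold for manual verification against field of
# study and institutional affiliation. Data here are hypothetical.
from collections import Counter

articles = [
    {"authors": ["A. One", "B. Two"]},
    {"authors": ["A. One"]},
    {"authors": ["C. Three"]},
]
name_counts = Counter(name for art in articles for name in art["authors"])

THRESHOLD = 1   # the study surveyed names with more than 15 hits
flagged = [name for name, n in name_counts.items() if n > THRESHOLD]
```

Note that the manual follow-up step is essential: a common name can aggregate several distinct individuals, which is why the study disambiguated by field and affiliation before attributing articles to one person.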

Retraction rates per year and ''publication inflation''. Increasing numbers of article retractions per year have been noted in recent studies of the PubMed literature [1], [2], [5], [9][10][11], [19], [24][25][26][27][28][29][30]. However, the published literature itself has also been growing, and most retractions occur within 4 years of publication (Table S2). Therefore, the growth of sequential 4-year sums of WoS record counts can represent the growth of the pool of articles which includes most of the articles retracted in a given year. The change in retraction rate, adjusted for this factor, was assessed by comparing the change over time in the ratio of retractions per year to the corresponding 4-year WoS record sum.

Geographic distribution of authors. Prior studies have discussed geographic trends in the authorship of retracted articles [15], [20]. Since WoS lists all author affiliation addresses (rather than just that of the corresponding author), we were able to obtain the countries of all author affiliations for 4,244 retracted articles. Pivot Table analysis in Excel yielded counts for each of the 101 countries. To determine any changes in geographic distribution over the years, counts for individual retraction years from 1980-2010 were obtained for the geographic entities with the highest counts (USA, EU-27, China, India, Japan and South Korea) using Excel Pivot Tables. Because the scientific output of these regions varies, raw counts were then divided by the number of WoS records for each country-year combination. These figures were obtained by querying WoS for individual years (e.g., ''PY = 2010'') and using the ''Countries/Territories'' option of the ''Analyze Results'' feature on the results for each year. The resulting ratios approximate the relative proportions of published articles for each country which have been retracted. However, since WoS does not index the total scholarly output of any country, these should not be considered absolute ratios of retracted articles.
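The ''publication inflation'' adjustment described above amounts to comparing a raw growth factor with one computed on per-pool rates. The numbers below are illustrative, not the study's actual counts; the arithmetic mirrors the reported drop from a raw factor of 19.06 to an adjusted factor of 11.36:

```python
# Sketch of the "publication inflation" adjustment with hypothetical
# numbers: the raw factor compares retraction counts directly, while the
# adjusted factor first divides each year's count by the size of the pool
# of recent literature (a sequential 4-year sum of WoS record counts).
retractions = {2001: 40, 2010: 760}              # hypothetical counts
pool = {2001: 4_000_000, 2010: 6_500_000}        # hypothetical 4-year WoS sums

raw_factor = retractions[2010] / retractions[2001]

rate = {year: retractions[year] / pool[year] for year in retractions}
adjusted_factor = rate[2010] / rate[2001]
# Because the literature pool grew, the adjusted factor is smaller than
# the raw factor, as in the study's 19.06 -> 11.36 reduction.
```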
Retraction authorities. To better understand the roles of various authorities in the retraction of articles, for each case where the retraction notice was obtained, the authorities specifically mentioned in the notice were recorded based on the categories in Table 2. We obtained the retraction notices for 4,232 articles (95.1% of 4,449 total), and specific authorities were mentioned in the notices for 3,510 retracted articles (82.9% of 4,232).

Journal Distribution
Retractions were identified in 1,796 unique journal titles, including 59 (64%) of the 92 research journals with a 2010 ISI Impact Factor of 9.000 or higher (Table S3). WoS Category assignments were available for 1,522 (85%) of the 1,796 different journals which included at least one retracted article. From a total of 244 possible WoS Categories, retracted articles were found in 201 of them (Fig. S1). Among these 201 WoS Categories, ratios of the number of retracted articles from 1980-2010 to the number of 2010 WoS records vary from 0.00005 for History to 0.02034 for Anesthesiology (Fig. S1), indicating a higher retraction rate, relative to the size of the published literature, in the latter category. This ratio also varies among subdisciplines within Medicine, from 0.00021 for Substance Abuse to 0.02034 for Anesthesiology (Fig. S1).
The percentages of all retracted articles which are in the broad fields of Medicine, Chemistry, Life Sciences and Multidisciplinary Sciences are higher than the percentages of articles in these fields among all 2010 WoS records (Fig. 2). In contrast, percentages of all retracted articles in Engineering & Technology, Social Sciences, Mathematics, Physics, Agriculture, Earth & Space Sciences, Ecology & Natural Resources and Humanities are lower than their WoS percentages (Fig. 2). These two observations suggest that the former 4 fields have higher retraction rates, based on size of the published literature in each field, than the latter 8 fields.

Justifications for Article Retraction
The retraction notices for 4,232 publications were obtained. Of these, the notices for 601 (14%) did not state why the publication was retracted. For the remaining 3,631, the counts of articles fitting each justification category in Table 2 are shown in Fig. 3. Alleged research misconduct was mentioned as a motivation for retraction of 20% of these articles, while 42% were motivated by questionable data or interpretations, whether due to alleged fraud, legitimate artifacts, unexplained irreproducibility, or re-interpretation of conclusions in the light of new facts. Publishing misconduct, primarily plagiarism and author-initiated duplicate publication, accounted for 47% of the retractions. All forms of publisher error represented 9%. These percentages add up to over 100% because some notices gave more than one justification for the retraction of one article.
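The reason these percentages exceed 100% can be made concrete with a small multi-label tally. The notices and labels below are illustrative, but the counting scheme (tallying per justification label rather than per article) matches the method described:

```python
# Sketch of multi-label justification counting: each notice may carry
# several justification labels, so category percentages are computed per
# label against the number of articles, and can sum past 100%.
notices = [
    {"plagiarism"},                         # publishing misconduct only
    {"data_fabrication", "distrust_data"},  # two justifications, one article
    {"distrust_data"},
]
counts = {}
for labels in notices:
    for label in labels:
        counts[label] = counts.get(label, 0) + 1

total_articles = len(notices)
percentages = {k: 100.0 * v / total_articles for k, v in counts.items()}
# sum(percentages.values()) exceeds 100 whenever any notice has >1 label
```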

Proliferation of Retractions
The year of retraction could be determined for 3,490 of the 4,449 retracted articles; this information was lacking for many online-only retraction notices. The number of articles retracted per year increased by a factor of 19.06 from 2001 to 2010 (Table 5). Since most retractions occur within 4 years of publication (Table S2), adjusting for the growth of the published literature and excluding the repeat offenders reduces this increase to a factor of 11.36 (Table 5).

Country Count Trends
The yearly distribution of the retraction of articles by authors from the European Union (EU-27) and the top 5 non-EU countries by total retraction count is shown in Fig. 5. The USA and EU-27 clearly accounted for most retractions prior to 2005. Thereafter, the numbers from the Asian countries, particularly China, began to increase dramatically (Fig. 5A). The dashed lines represent counts excluding the articles from the repeat offenders (Table 4).

Retraction Authorities and ''editorial expressions of concern''
The retraction notices for 4,232 articles were consulted. Those for 722 (17.1%) of them did not mention the authority calling for the retraction. Of the remaining 3,510, over half (1,970, 56.1%) mentioned either some or all of the authors, and a similar number (2,088, 59.5%) explicitly mentioned either the publisher, ''the journal'' or editor(s) (Table S4). Information provided by an investigation at the authors' home institution or employer was mentioned in 358 (10.2%) as part of the decision to retract. Interestingly, this group included 276 (38%) of all 725 retractions due to alleged research misconduct. In contrast, 129 mentioned investigations by non-institutional watchdog agencies, such as the US Department of Health and Human Services Office of Research Integrity (ORI).
When editors feel evidence is sufficient to question published data, but not (yet) sufficient to retract an article, they may issue an ''editorial expression of concern'' notice. As of 2 Aug 2011, such notices were found for 58 different articles, with 40 (70%) published since 2008 (Fig. S2). Although retraction usually occurs a few years post-publication ([6] and Table S2), roughly 1/3 (19/58) of the articles mentioned in these notices had been formally retracted by 2 Aug 2011. Like retraction notices [1], ''expression of concern'' notices often appear in high-impact journals, with 24/58 (40%) in BMJ, Lancet, N Engl J Med, PNAS or Science. One of these, a highly-cited article [31], had 740 total WoS citations, including 187 for 2009-2012.

[Table footnote: The count of 532 articles since 1980 is an underestimate, since WoS only includes articles from volume 4 onward for this journal, and it is not covered by PubMed. This is a Korean-language journal to which we have no access, so the reason for this large percentage is not known.]

Discussion
Central Role of ''repeat offenders''
Steen [2] termed authors with multiple retractions ''repeat offenders.'' His survey of 742 retractions (from PubMed, 2000-2010) yielded 7 authors with 5 or more retractions, and the top 2 authors with a combined total of 32 retracted articles [2]. This study, which included all scholarly fields, found 9 authors with 20 or more retractions each, and 434 retractions among the top 15 authors/groups (Table 4). These few aberrant cases are clearly ''outliers'' in a scientific community of millions. The repeat offenders identified in this study are globally distributed, with 7 from North America, 3 from Europe and 5 from Asia. Some authors have suggested that the stigma of misconduct should be decoupled from instances of article retraction [12], and the case of M. Quik, G. Goldstein and collaborators provides a good example. They discovered a potentially result-altering contaminant in their chemical standards after 15 articles had been published, and promptly sent retraction letters to the journals involved, e.g. [32].
The most extreme ''repeat offender'' cases appear frequently in editorials, e.g. [33][34][35][36][37], and the data presented here provide a broader perspective in which their actions should be considered. The ''soul-searching'' in recent anesthesiology editorials on the Reuben and Boldt cases, e.g. [35], [36], [38], is reminiscent of cardiology editorials from the 1980s on the Slutsky and Darsee cases, e.g. [39], [40]. The Reuben and Boldt cases prompted a special 30-page editorial section on misconduct in the Mar 2011 issue of Anesthesiology [36], in which one author asked ''whether there is a specialty-related propensity for anesthesiologists to … commit academic fraud, or … [are Reuben and Boldt] … merely a statistical blip?'' [38]. The data here strongly suggest the latter. Reuben and Boldt's 98 articles retracted from Anesthesiology journals alone accounted for 85% of all 115 articles that have ever been retracted from Anesthesiology journals. That only 17 articles had been retracted from all journals in this field prior to the Reuben case in 2009 argues strongly against rampant misconduct in anesthesiology. However, another dramatic case is currently unfolding in the field of anesthesiology, in which 193 published articles are at risk of being retracted [41].
The impact of repeat offenders on retraction counts for individual countries complicates direct cross-country comparisons. Their contributions among the top countries are: Schön, Slutsky, Reuben, and Darsee, 101 for the USA; Mori and Matsuyama, 40 for Japan; Chiranjeevi, 19 for India; Zhong and Liu, 72 for China; and Boldt and Hermann, 110 for Germany (Table 4). Subtracting their contributions blunts the retraction spikes for China in 2010 and the EU-27 in 2011 (Fig. 5A). A Lancet editorial used Zhong and Liu as examples to call on China to ''reinvigorate standards for teaching research ethics'' [42]. However, it could have cited the other repeat offenders to make equally dramatic cases for the inadequacy of standards in the USA, Japan, India, or Germany. These repeat offender cases should always be considered as the anomalies that they are, rather than held up as examples of the general state of research integrity.
The repeat offenders also dramatically skew results for individual journals (Zhong and Liu, Acta Crystallogr E, Table 3), subdisciplines (Boldt and Reuben, Anesthesiology; Zhong and Liu, Crystallography, Fig. S1) and years. Note that 10 of the top 15 cases have come to light since 2005 (Table 4), and the repeat offenders account for 15% of all 2010 retractions and 24% of the 2011 retractions catalogued thus far (Table 4).

Figure 3. Justifications for retraction stated in the notices consulted, which accounted for 4,232 retracted articles. Only 20% of articles were retracted due to research misconduct, while more than twice that many were retracted due to publishing misconduct. Note that 42% were retracted because of ''questionable data or interpretations.'' Percentages are based on the 3,631 (= 4,232 − 601) notices which stated the justification. doi:10.1371/journal.pone.0044118.g003

Rising Retraction Rates
While previous studies have documented the increasing number of retractions per year [1], [2], [5], [9][10][11], [19], [24][25][26][27][28][29][30], the inclusion of non-PubMed literature here yielded dramatically higher absolute numbers (Fig. 4). The growth of retracted non-PubMed literature in recent years (Fig. 1) underscores its importance in understanding the current impact of retracted literature on science as a whole, as well as non-medical scholarly fields of study. Despite the recent increases, retracted articles remain only fractions of a percent among all articles published in a given year or scholarly field. Some retracted articles undoubtedly eluded our search queries, so their actual number is higher than reported here. Queries of additional data sources, particularly those which thoroughly cover publications in narrow fields of study or from specific geographic regions, could reveal additional retracted articles. However, we are confident that a large proportion of articles which have been retracted were identified using all of the sources we consulted.
Various explanations for increasing numbers of retractions have been suggested. The repeat offenders and overall growth of the published literature are important contributors (Table 5); however, a dramatic rise in annual retraction counts in the past 10 years remains even after these factors are taken into account. Some debate on rising retraction rates focuses on whether ''more cases are occurring'' or simply ''more cases are being caught'' due to improved tools such as plagiarism-detecting software and the Déjà vu database [43] (http://spore.vbi.vt.edu/dejavu/browse/, accessed 19 Mar 2012). Some journals now scan all submissions using CrossCheck, e.g., [44]. Technological advances enable cut-and-paste plagiarism and en masse submission of multiple articles. An outstanding case of the latter involved ''… duplication of a paper that has already appeared in at least nine other publications'' [45]! A companion article is in preparation which discusses in greater detail the likely motivations and the possible steps that may reverse, or at least slow, the rising rate of retractions due to research and publishing misconduct. Another contributor is the recent emergence of articles retracted while ''in press'', i.e. those available to the research community on the publisher's website but retracted prior to volume, issue, and page assignment. This category, which did not exist 10 years ago, included 780 (17.5%) retractions in this survey. PubMed creates database records for such ''in press'' articles, while WoS does not. Thus records for hundreds of ''in press'' retracted articles exist in PubMed but not WoS (though this policy difference is not expected to affect the ratios in Fig. 4).

[Table footnotes displaced into the text: (2) According to the IEEExplore database, this author has allegedly fabricated data in 39 publications and in 14 additional publications with co-authors. (3) The 72 retractions of these two authors represent 34% of China's 210 retractions for 2010 and 8.9% of all 811 retractions for China. (4) These four authors account for 101 (7.5%) of all 1,355 USA retractions. It is noteworthy that Dr. Schön's retractions include 10 articles from Science and 7 from Nature.]
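The kind of text-overlap screening that plagiarism-detection tools perform can be illustrated with a toy fingerprinting sketch. This is only a minimal illustration of the general approach (shared word n-grams); it is not the actual algorithm used by CrossCheck or the Déjà vu database:

```python
def ngrams(text, n=5):
    # Lowercase word 5-grams are a common fingerprinting unit:
    # long enough to be distinctive, short enough to survive edits.
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_score(doc_a, doc_b, n=5):
    # Jaccard similarity of the two documents' n-gram sets:
    # 1.0 for identical texts, 0.0 for texts sharing no n-gram.
    a, b = ngrams(doc_a, n), ngrams(doc_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Real screening systems add normalization (punctuation, citations, boilerplate sections) and compare each submission against a large corpus, flagging pairs whose score exceeds a threshold for human review.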

Retractions Widespread among Disciplines
While the correlation between Impact Factor and number of retractions or ''retraction index'' [46] has been noted previously in the medical literature, this survey found retractions in 64% of all research journals with a 2010 ISI Impact Factor of 9.000 or greater (Table S3). Although some editors resist retracting articles, even when faced with overwhelming evidence of fraud, e.g., [47], the fact remains that flawed research has slipped through the peer review process at most of the top journals in science and medicine. The use of WoS Categories allows objective disciplinary assignment of individual articles, though it is an imperfect system [48] due to topical mixing among the articles in a given journal. For example, the journal Cell is assigned only to the category of ''Cell Biology,'' though it contains individual articles that are relevant to Endocrinology, Oncology, Developmental Biology, etc.
The data generated in this study did not support the perception that research misconduct is primarily restricted to biomedical fields [49]. Large numbers of retracted articles, including those due to misconduct, are found outside of the medical literature (Figs. 1, 2). Chemistry and the Life Sciences, which overlap with Medicine in fields such as ''Cell Biology'' or ''Chemistry, Medicinal,'' exhibited disproportionately high retraction rates, similar to Medicine (Fig. 2). The higher proportion of database records marked as retracted in PubMed relative to WoS (Fig. 4) may reflect the lower retraction rates among the 8 other major disciplines in Fig. 2, which are all covered by WoS. However, many records in both databases fail to indicate the retracted status of articles. For example, some, e.g., [22], are marked as retracted in WoS but not in PubMed (when checked on 7 Jun 2012). Thus, determining the true proportions of retracted articles in these databases would require surveying records for all articles known to be retracted.

Full, Partial, Implicit, Explicit Retraction

Full retractions, i.e. retraction of the entire article, were called for in 4,120 (97%) of the 4,232 articles in our survey for which the retraction notice was obtained. The remaining 112 (3%) cases retracted only a portion of an article. However, many notices labeled as ''errata'' report serious errors in published data, effectively ''retracting'' the original data or figure(s) without using the word ''retract,'' e.g., [50]. On 7 Jun 2012, WoS included 295,957 records for errata, designated as publication type ''correction'' or ''correction, addition,'' with 12,189 for 2011 alone. The 112 ''partial retractions'' in our dataset could be considered ''errata''.
The 9% of retractions due to publisher error (Fig. 3) is an underestimate. This study included only cases where ''retraction notices'' or ''errata'' explicitly call for the ''retraction'' or ''withdrawal'' of a publication. ''Corrected and republished'' articles typically involve severe flaws introduced by the publisher. However, many notices for them do not state the obvious implication that the original version is ''retracted'' and should not be consulted or cited. This also applies to many ''duplicate publication'' notices. In PubMed, 2,361 such records are indicated by ''corrected and republished article'' (n = 1,391, 7 Jun 2012) or ''duplicate publication'' (n = 973) in the publication type field. Of the flawed original articles among the 1,391 ''corrected and republished'' PubMed articles, only 7 were found on the list of 4,449 ''retracted'' publications in this study. Thus, thousands of these ''implicit'' retractions exist in addition to the 4,449 ''explicit'' retractions in the dataset used here.
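Counts of this kind can be reproduced against PubMed's public E-utilities interface, which supports searching by publication type. The sketch below builds such a query URL and parses the count from an esearch-style XML response; the sample response shown is illustrative of the response format, not live data:

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def count_query_url(pub_type):
    # Build an esearch URL that returns only the number of PubMed
    # records carrying the given publication type.
    params = {"db": "pubmed",
              "term": f'"{pub_type}"[Publication Type]',
              "rettype": "count"}
    return f"{EUTILS}?{urlencode(params)}"

def parse_count(esearch_xml):
    # esearch reports the total hit count in the <Count> element.
    return int(ET.fromstring(esearch_xml).findtext("Count"))

# Illustrative response shape only (counts change as PubMed is updated):
sample = "<eSearchResult><Count>1391</Count></eSearchResult>"
```

Fetching `count_query_url("corrected and republished article")` and passing the response body to `parse_count` would give the current figure corresponding to the n = 1,391 reported above for 7 Jun 2012.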

Retraction Authority
The retraction authority categories used here include all of the parties mentioned as guiding the decision to retract in all of the notices we obtained. There is some disagreement over the authority or responsibility of editors to declare an article as ''retracted''. For example, one editor wrote: ''Journals cannot retract - that is the obligation of authors … We can repudiate our association with a study.'' [51]. Another suggests that allowing editors to take on the role of ethics police is ''poison'' to the scientific process [52]. Some editors retract articles without the authors' consent, in what they perceive to be clear cases of fraud or error, or due to a lack of response from authors, e.g., [53]. This practice is supported by the Committee on Publication Ethics guidelines [54]. When co-authors disagree over the appropriateness of a retraction, a ''retraction of authorship'' [55] or a lack-of-consensus statement [56] may be published.
Many authors have stressed the responsibility of institutions and employers in fostering the responsible conduct of research (and publication) by their staff, e.g., [57], [58]. Their importance in facilitating retractions due to research misconduct is apparent, since the notices for 38% of articles retracted over allegations of research misconduct mentioned information provided by institute/employer investigations, far more than mentioned off-site watchdog organizations such as ORI (Table S4). However, obtaining clear-cut evidence of poor data integrity is often difficult for a variety of reasons [47], [59], [60]. The ''expression of concern'' notice, e.g., [61], rapidly alerts the research community to serious doubts about published data, particularly in the early stages of investigation or cases with ambiguous outcomes. Such notices may be subsequently ''reaffirmed'' [62] or ''removed'' [63] as appropriate. These notices are likely to become more common in the future given their relatively recent appearance and several-fold increase in number (Fig. S2) since a 2008 study on their characteristics was published [64].

Citation of Retracted Articles
Efforts to inform researchers about retracted articles they may cite in the future have achieved only limited success [5]–[7]. However, the figures for citations of retracted articles may not be as alarming as they sound. Some citations do warn readers of an article's retracted status [6], [7], though ideally they all should. Since retraction does not automatically imply either ''fraudulent data manipulation'' or ''questionable data'' (Fig. 3), some authors defend the practice of citing valid data in retracted articles. For example, citing one of Schön's retracted Science articles, one author [65] noted: ''This paper has been retracted … yet contains legitimate and innovating ideas that are now generally accepted.'' In other cases, retraction can trigger a domino effect, resulting in the retraction of subsequent articles with conclusions dependent on the retracted data or interpretations, e.g., [66].
How can researchers stay informed of the constant march of retractions which may affect articles on which they rely for knowledge and cite? The role of publishers and bibliographic databases in properly marking retracted publications has been discussed [10]. The RetractionWatch (http://retractionwatch.wordpress.com) blog supplements the often minimal information retraction notices provide with that obtained directly from the authors, editors, or investigative committees involved. Passive online databases, such as the Retraction Database from Rutgers University (http://retract.rutgers.edu), are impractical as they require researchers to actively search for articles in their field. Solutions which do not require active and repetitive searching by researchers may be more effective. For example, CrossRef's CrossMark initiative (http://www.crossref.org/crossmark/) is designed to help researchers identify the latest, definitive, ''publisher-maintained'' version available for an article of interest, as multiple versions of articles are typically generated during the publishing process or through incorporation of information from errata or retraction notices. Another possibility could involve linking a truly comprehensive retraction database to widely used reference management software tools, such as EndNote. Such a tool would scan a researcher's personal EndNote library whenever opened, or when importing new references, and alert the user when a match between a known retracted article and a library entry is found.
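The matching step such a tool would perform can be sketched minimally as follows, assuming library entries expose a DOI field (a hypothetical data model; a real plug-in for a reference manager would also need to handle missing or malformed DOIs and fall back to fuzzy title matching):

```python
def flag_retracted(library, retracted_dois):
    """Return library entries whose DOI appears in the known-retracted set."""
    # DOIs are case-insensitive by specification, so normalize both sides
    # before comparing.
    retracted = {doi.lower() for doi in retracted_dois}
    return [entry for entry in library
            if entry.get("doi", "").lower() in retracted]

# Hypothetical usage: 'library' mimics records imported from a
# reference manager; 'retracted_dois' would come from a retraction database.
library = [{"doi": "10.1000/ABC", "title": "Flawed study"},
           {"doi": "10.1000/xyz", "title": "Sound study"}]
alerts = flag_retracted(library, ["10.1000/abc"])
```

Run on open and on import, such a check shifts the burden of monitoring retractions from the researcher to the software, in the spirit of CrossMark's ''publisher-maintained'' version checks.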

Conclusions
A very broad survey of the scholarly literature shows that retractions are widespread across disciplines and author affiliation countries; yet they represent only small fractions of a percent among all publications for any given field, country, journal or year. While retracted articles and research integrity have received considerable attention in the medical literature, similar proportions of articles in the Life Sciences and Chemistry have also been retracted. Only limited proportions of articles have been retracted due to alleged research misconduct (20%) or loss of faith in the data or interpretations as published (43%). These low proportions support the call for de-coupling the stigma of ''misconduct'' from article retraction, and partially explain the continuing citation of retracted articles which may contain data believed to be valid. The effect of a limited number of prolific individuals on retraction counts for particular subsets can be very dramatic. These repeat offenders and overall growth of the published literature account for a substantial portion of the increase in the number of retractions over the past 10 years; though annual retraction figures adjusted for these factors have still increased dramatically. The central role of local authorities in investigating and providing evidence in cases of alleged misconduct was mentioned in a high proportion (38%) of such cases. Articles and editorials discussing article retraction and the separate, but related, issue of research integrity should always consider them in these broad contexts.

Supporting Information

Figure S2 Yearly distribution of 58 ''editorial expression of concern'' notices from 30 different journals. Such notices are expressions of opinion when ambiguity prevents outright retraction or in the early stages of investigating cases of questionable data. (TIFF)

Table S1 Number of retracted articles included in previous large-scale studies of retraction attributes. Many narrowly-focused studies, e.g., [57], were excluded from this list. (DOCX)

Table S4 Authorities specifically mentioned in retraction notices as being involved in the decision to retract, with percentage of the 3,510 articles for which the notice specified any authorities (except as noted). (DOCX)