The Post-Embargo Open Access Citation Advantage: It Exists (Probably), It’s Modest (Usually), and the Rich Get Richer (of Course)

Many studies show that open access (OA) articles—articles from scholarly journals made freely available to readers without requiring subscription fees—are downloaded, and presumably read, more often than closed access/subscription-only articles. Assertions that OA articles are also cited more often generate more controversy. Confounding factors (authors may self-select only the best articles to make OA; absence of an appropriate control group of non-OA articles with which to compare citation figures; conflation of pre-publication vs. published/publisher versions of articles, etc.) make demonstrating a real citation difference difficult. This study addresses those factors and shows that an open access citation advantage as high as 19% exists, even when articles are embargoed during some or all of their prime citation years. Not surprisingly, better (defined as above median) articles gain more when made OA.


Introduction
All things being equal, that an article made freely available ought to get downloaded more than a comparable article that costs money to access seems obvious. We mistrust the intuitively obvious, though, largely because all things are rarely equal and confounding factors are not always easy to intuit. Besides, more downloads (and presumably more readers) may not be sufficient motivation for authors to make their articles open access (OA); researchers want their work to have a measurable impact on their peers. They want to advance their field, boost their careers by increasing their likelihood of receiving grants, and improve their chances for promotion... preferably all three.
Citations are the coin of the impact realm, so a citation advantage, not merely a download or readership advantage, is what matters to authors. A number of studies show that some kind of open access citation advantage (OACA) exists. For example, an oft-cited paper by Hajjem, Harnad and Gingras [1] found a 36-172% advantage, and the majority of studies (46/74) listed in [2] also report an OACA. Many of these same studies have also been challenged, though:
• One frequently expressed concern is that authors self-select only articles of higher than average quality to make OA, which would be expected to get more citations regardless of whether they were open (e.g., [3][4][5][6][7]).
• OA articles may have been previously available in working paper or pre-print versions that differ from their final published form. The resulting final publications may benefit from that early availability (e.g., [8][9][10]). Further, comparing the author's accepted manuscript for one article to the publisher's formatted and copy-edited version for another introduces a confounding variable in terms of quality, or at least perceived quality. A like-with-like comparison to determine whether an OACA truly exists is challenging in such scenarios.
• It can be difficult to find articles of any kind that have been open for long enough, after being previously closed, to show a meaningful effect on citation frequency [11].
• Finally, even if the above concerns are addressed, finding enough articles in a broad enough range of disciplines to draw a conclusion on an OACA has proved challenging. Studies in specific disciplines or single journals abound: 2/3 of the articles in [2] investigate citations and OA in only a single discipline, and over half of those studies use a small sample size or examine articles that were not open for long. But even when convincing, such studies are prone to being dismissed as special cases, peculiar to the particular discipline or even sub-discipline.
One way to address these concerns would be to find an appropriate control group for a large sampling of OA articles. This too is difficult; every article is (or at least should be!) unique, and even so-called hybrid journals, which make some but not all articles OA, do not offer a straightforward means for comparison because of the self-selection problem, since they require authors to pay an additional publication fee to make them open from the outset.

Methodology
In Deep Blue <deepblue.lib.umich.edu/documents>, the University of Michigan's institutional repository service, we have the equivalent of a random sample of thousands of OA articles from thousands of journals. Each article has the following characteristics: Prior to a known date (ranging from 2006 onward) these articles, since they are the final published version, were only available by subscription. After that date, they became freely available via Deep Blue. Meanwhile, other articles from the same journal issue as the now-OA article continued to only be available to subscribers. None of the OA articles were self-selected; authors did not choose to deposit the articles in question in Deep Blue, since they were opened via blanket licensing agreements between the publishers and the library.
By comparing citations to subscriber-only/now-open (opened) articles with the corresponding subscriber-only/still-subscriber-only (closed) articles in that journal issue before and after availability in Deep Blue, we can determine what effect opening them may have had, i.e. a post-embargo OACA.
The sample began with a random selection of 3,850 papers (peer-reviewed and review articles only; bibliographies, book reviews, corrections, discussions, editorials, letters, notes, etc. were not considered) with original publication dates ranging from 1990 to 2013. These were matched with the 89,895 corresponding articles which remained closed, using the specific journal issue as a proxy for comparability of subject matter and quality. (It is an imperfect proxy, of course, but as noted above, in theory each article is unique, so an exact like-for-like comparison is not possible.) Using data from Thomson Reuters' Web of Science and Journal Citation Reports databases, we get actual citations and can calculate the expected values of citations to an article for each issue. (Some opened articles had no corresponding peer-reviewed or review articles in that issue, and so were dropped from the sample. In issues where more than one opened article appeared, one was randomly selected to compare with the closed articles.)
The null hypothesis is
$$H_0:\ \text{ActualCitations}_{\text{after opening}} - \text{ExpectedCitations}_{\text{after opening}} = 0$$
In other words, assume that making an article OA has no effect and test that assumption. To test $H_0$ we need to know how many citations to the opened article we would expect ($E$) if it behaved like the other articles in its issue that remained closed. In our example, the difference is -5, meaning in this case the opened article had five fewer citations than expected.
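The test of the null hypothesis can be illustrated with a minimal sketch. Here O_c and O_o denote citations to the opened article while closed and after opening, and C_c and C_o the corresponding counts for the comparison articles that stayed closed. The sketch assumes, since the formula is not spelled out here, that the expected value scales the opened article's earlier citations by the growth seen in the closed articles, E = O_c × (C_o / C_c). The function names and the numbers are hypothetical, chosen only so that the difference comes out to -5.

```python
def expected_citations(o_c: float, c_c: float, c_o: float) -> float:
    """Expected post-opening citations for the opened article.

    ASSUMPTION (not stated in the text): E = O_c * (C_o / C_c), i.e. the
    opened article is expected to grow at the same rate as the closed
    comparison articles in its issue.
    """
    if c_c == 0:
        # The ratio is undefined here; never-cited articles need a separate rule.
        raise ValueError("C_c is zero; the ratio is undefined")
    return o_c * c_o / c_c


def actual_minus_expected(o_o: float, o_c: float, c_c: float, c_o: float) -> float:
    """Quantity tested under the null hypothesis: actual minus expected citations."""
    return o_o - expected_citations(o_c, c_c, c_o)


# Hypothetical numbers for illustration only.
difference = actual_minus_expected(o_o=10, o_c=10, c_c=8, c_o=12)
print(difference)  # -5.0
```

A negative value means the opened article was cited less often than its closed peers would predict; under the null hypothesis these differences should average out to zero across the sample.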
Looking at all the articles in aggregate is straightforward and, as described below, doing so produces large OACA percentages, similar in magnitude to those previously reported. However, a more appropriate approach is to focus on article-by-article differences, calculated three ways: each opened article is compared to the mean and median article in the journal issue in which it appeared, and also to equivalent articles in that issue. Recognizing that other definitions are possible, we define "equivalent articles" as follows:
• If one or more articles had the same number of citations as the opened article while all were closed, use those articles;
• if not, but there were articles with citation counts during the closed period that bracket that of the opened article, use those;
• if the opened article had more, or fewer, citations than any other article, use the mean for the issue as an equivalent (this is conservative, since it will tend to maximize the effect of outliers among the closed articles).
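The three-step definition of "equivalent articles" can be sketched as a small selection function. This is an illustration under stated assumptions, not the procedure actually used: the `Closed` record and the function name are hypothetical, and "bracket" is read here as the nearest citation count below and the nearest above that of the opened article.

```python
from typing import List, NamedTuple, Sequence


class Closed(NamedTuple):
    """A comparison article that stayed closed (hypothetical record layout)."""
    c_c: int  # citations while the opened article was still closed
    c_o: int  # citations after the opened article became open


def equivalent_articles(o_c: int, issue: Sequence[Closed]) -> List[Closed]:
    """Pick the closed articles treated as equivalent to the opened one."""
    # Rule 1: articles with exactly the same closed-period citation count.
    same = [a for a in issue if a.c_c == o_c]
    if same:
        return same
    # Rule 2: articles that bracket the opened article's count
    # (ASSUMPTION: nearest count below and nearest count above).
    lower = [a.c_c for a in issue if a.c_c < o_c]
    higher = [a.c_c for a in issue if a.c_c > o_c]
    if lower and higher:
        nearest = (max(lower), min(higher))
        return [a for a in issue if a.c_c in nearest]
    # Rule 3: the opened article is an extreme; fall back to the whole
    # issue, so that averaging the result gives the issue mean.
    return list(issue)


# Hypothetical issue with four closed articles.
issue = [Closed(0, 1), Closed(3, 4), Closed(5, 2), Closed(9, 9)]
print(equivalent_articles(3, issue))   # exact match on the count 3
print(equivalent_articles(4, issue))   # bracketed by the counts 3 and 5
print(equivalent_articles(20, issue))  # extreme value: whole issue
```

Returning the whole issue in the third case keeps the return type uniform; the caller averages whatever list comes back.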
Regardless of whether we use the median or the equivalent articles to calculate the expected value, the problem of dealing with never-cited articles comes up. (And it comes up often; in our sample the modal value for citations for all four groups of articles, $O_c$ through $C_o$, was 0.) These zeroes were handled as follows. If $C_c = 0$:
for medians, $E = \operatorname{median}(C_o) - \operatorname{median}(C_c)$
for equivalents, $E = \operatorname{mean}(\text{equivalent } C_o) - \operatorname{mean}(\text{equivalent } C_c)$
Also, if $E < 0$, be conservative (and realistic) and use 0, as this minimizes differences when $O_c = 0$ and $O_o > 0$.
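The zero-handling rules can be sketched as follows. Only the zero case and the floor at 0 are specified here; the branch used when C_c is non-zero is an assumption (a proportional model, E = O_c × C_o / C_c). The name `expected_value` is hypothetical, and its arguments `c_c` and `c_o` stand for the already-summarised comparison values (issue medians, or means of the equivalent articles).

```python
def expected_value(o_c: float, c_c: float, c_o: float) -> float:
    """Expected post-opening citations with the never-cited special case."""
    if c_c == 0:
        # Zero rule from the text: use a difference instead of a ratio.
        e = c_o - c_c
    else:
        # ASSUMPTION: proportional model, not stated in the text.
        e = o_c * c_o / c_c
    # Conservative floor from the text: never expect fewer than 0 citations.
    return max(e, 0.0)


# Hypothetical values for illustration only.
print(expected_value(o_c=4, c_c=0, c_o=3))  # zero rule applies
print(expected_value(o_c=4, c_c=2, c_o=3))  # assumed ratio applies
```

Switching from a ratio to a difference avoids dividing by zero while still letting the comparison articles set the expectation.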

Analysis
As mentioned above, an aggregate calculation is straightforward: find the overall mean (or median) values and find the percentage difference between the ratios. However, even leaving aside the large standard deviations (ranging from 24.96 to 51.41) of these aggregate values, we know their distributions are far from normal (again, the mode for all four values is zero, and no article can have fewer than zero citations), so an article-by-article calculation is more appropriate.
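One way to read the aggregate calculation is sketched below. It assumes the two ratios are the after/before citation ratio for the opened articles (O_o / O_c) and the same ratio for the closed articles (C_o / C_c), and that the percentage difference is taken relative to the closed ratio. The function name and the sample data are hypothetical.

```python
from statistics import mean, median
from typing import Callable, Optional, Sequence


def aggregate_oaca(
    o_c: Sequence[float],
    o_o: Sequence[float],
    c_c: Sequence[float],
    c_o: Sequence[float],
    summarise: Callable[[Sequence[float]], float] = mean,
) -> Optional[float]:
    """Percentage by which the opened ratio exceeds the closed ratio.

    Returns None when a summarised denominator is zero, in which case
    the result is indeterminate.
    """
    oc, oo, cc, co = (summarise(x) for x in (o_c, o_o, c_c, c_o))
    if oc == 0 or cc == 0 or co == 0:
        return None
    open_ratio = oo / oc
    closed_ratio = co / cc
    return 100.0 * (open_ratio - closed_ratio) / closed_ratio


# Hypothetical data: means are 3, 6, 2 and 3, so the ratios are 2.0 and 1.5.
print(aggregate_oaca([2, 4], [6, 6], [2, 2], [3, 3]))
# With medians, a zero median for O_c makes the result indeterminate.
print(aggregate_oaca([0, 0, 1], [1, 2, 3], [1, 1, 1], [2, 2, 2], summarise=median))
```

The `None` return reflects how a median of zero in the denominator leaves the median-based figure undefined.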
The differences between expected values and actual values were usually small. (For the means, 963/3850 ±1; for medians, 963/3850 ±1; and for equivalents, 1133/3850 ±1.) This is not surprising, given the modes and the conservative treatment of the never-cited articles described above. Figs 1-3 plot these distributions.
Was there a significant difference, and can we reject $H_0$? Because the sample size is large, the t- and Z-distributions are reasonable approximations of each other, so applying the usual formulae for standard deviation to these distributions offers a quantitative measure of the difference between actual and expected values, and confidence intervals for them. These are as follows:
[Table: differences between actual and expected citations, with confidence intervals; column: Overall]
Since the majority of articles in the overall sample were older (82% were more than five years old and 61% were more than ten years old), per Fig 4, most will have been past their prime citation years.
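The large-sample test described above can be sketched with the standard library alone. This is a generic Z-based confidence interval and two-sided p-value for the mean of the per-article differences (actual minus expected), offered as an illustration rather than the exact computation used; the function names and the sample differences are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def mean_difference_ci(
    diffs: Sequence[float], confidence: float = 0.95
) -> Tuple[float, float, float]:
    """Mean of the differences with a Z-based confidence interval."""
    m = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, m - z * se, m + z * se


def two_sided_p_value(diffs: Sequence[float]) -> float:
    """Two-sided p-value for the null hypothesis that the mean difference is 0."""
    se = stdev(diffs) / sqrt(len(diffs))
    if se == 0:
        # No variation: the mean is either exactly 0 or trivially different.
        return 1.0 if mean(diffs) == 0 else 0.0
    z = mean(diffs) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical per-article differences, far fewer than a real sample.
diffs = [1, -1, 2, 0, 3, -2, 1, 0]
m, low, high = mean_difference_ci(diffs)
p = two_sided_p_value(diffs)
print(m, low, high, p)
```

With only eight values the normal approximation is not appropriate; the data are there to show the mechanics, which rely on a sample in the thousands.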
So, a further refinement is possible by repeating the analysis for only the newest articles (from publication to eight years old), and limiting those considered to ones that were closed less than 85% of their lifespan. In this case the aggregate OACA for the mean becomes even larger: 665.9%. (Because the median $O_c$ is zero, the corresponding result for medians is indeterminate.) However, the results for the granular analysis for this smaller sample are more equivocal:
[Table: granular results for the smaller sample of newer articles; column: Overall]
First, in every case for the overall sample, the above-median articles show a greater gain than the rest (3.2% > -7.6%, 19.3% > 18.5%, and 6.8% > 3.6%). So, in the long run better articles gain more citations than expected by being made OA, adding weight to the results reported by Gargouri et al. [3]. (This does not appear to hold for the closest equivalents, though their high p-values make any conclusion about a gain or loss suspect.) Just like the $O_o - E$ mean value, where we find an open access disadvantage, these data appear intent on preventing us from making blanket statements.
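The restriction to newer articles described above (no more than eight years old, and closed for less than 85% of their lifespan) amounts to a simple filter. The sketch below assumes each article carries its age and the time it spent closed, both in years; the `Article` record and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Article:
    """Hypothetical record; field names are not from the original data set."""
    age_years: float     # time since publication
    years_closed: float  # time spent behind the subscription barrier


def recent_subsample(
    articles: Sequence[Article],
    max_age: float = 8.0,
    max_closed_fraction: float = 0.85,
) -> List[Article]:
    """Keep young articles that were closed for a small enough share of their life."""
    kept = []
    for a in articles:
        if a.age_years <= 0 or a.age_years > max_age:
            continue  # too old, or no measurable lifespan yet
        if a.years_closed / a.age_years < max_closed_fraction:
            kept.append(a)
    return kept


# Hypothetical articles: only the first passes both conditions.
sample = [Article(6, 3), Article(6, 5.5), Article(12, 2)]
print(recent_subsample(sample))
```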
Second, an OACA appears to exist for all but one case (opened articles compared to all other articles in the issue), but the confidence intervals are relatively small only in the case where we compare opened articles to their issue's medians.
It is arguable that the median is a better control group than the mean, since its use reduces the influence of outliers, and can correct for a potential bias in the sample as well: the opened articles come from one institution, and one that enjoys a good reputation for scholarship. So it is possible that, given that reputation, articles on the borderline of acceptance get more of a benefit of the doubt during the peer review process. Note, however, that the mean for $O_c$ was 6% more than $C_c$, while the median was 14% less, so an acceptance bias might be argued either way.
Taken together, these somewhat equivocal results lead to a short discussion of the limitations of the data, which are available in anonymized form via Deep Blue <deepblue.lib.umich.edu/data; doi:10.7302/Z2KH0K8V>.
• Though it was not possible here, it is clear that a multi-institution sample would be ideal, as would a sample not dominated by physical science, health science, and engineering articles (92% of the total).
• As discussed above, the age of the articles available for study is an additional factor, as the majority of those studied were past their peak citation years. The results for the medians in the smaller sample of more recently published articles mirror the broader analysis, but there appears to be a negative effect on the means. When coupled with the small positive OACA found for the larger sample, one possibility is that OA mainly extends the shelf-life of research. Other explanations are possible, of course, but a larger sample of articles (still young enough themselves, and closed for a short enough time) is needed to demonstrate the true size of an OACA during an article's prime.
• With regard to articles being closed for a short enough time, given the evidence that any embargo reduces citations [10], the ideal would be no time at all. The articles in this sample were all embargoed during some or all of their prime citation years, so they may not have achieved their full OA potential.
• The citation counts here are probably conservative. Because the data from Web of Science is only available on a yearly basis, citations to an opened article ($O_o$) will always be undercounted and citations to that same article while closed ($O_c$) will be over-counted by some unknown amount. This may not be offset by using a more fine-grained approach for the corresponding articles that remained closed during that time.
• Context also matters. While OA repositories are typically well-indexed, making their contents easy to find via e.g. Google searches, opening an article in an institutional repository is not the same as opening it within the context of the journal itself or via a discipline-specific repository. Making an article OA in context(s) more researchers consider a destination for conducting a literature search would likely produce more citation activity.
There are no doubt others, and despite these limitations, the data suggest additional angles from which to explore the effects of making articles OA. Some articles had only their metadata indexed by Google et al., while others had searchable full text as well. Looking at specific disciplines, the effect of journal impact factor, and changing the definition of what an equivalent (or closest equivalent) article is come immediately to mind as additional avenues worth exploring.

Conclusion
Removing self-selection bias, considering only published versions of articles, assuring that articles were open for a long enough time to allow for meaningful citation patterns to emerge, and working with a large sample address the main weaknesses in previous attempts to show an OACA.
When treating all the articles in aggregate, this study produced an OACA of comparable magnitude to previous studies. That large advantage shrinks when articles are treated individually and compared to close equivalents, but it doesn't disappear. This may be because outliers skew the results (the rich can always get richer, but there's a lower limit to how poorly an article can be cited), or it may be that the best we can do with the data we have is to produce boundary values for the OACA. So, even though effects found here are more modest than reported elsewhere, given the conservative treatments of the data and when viewed in conjunction with other OACA studies already done [2,12], the results lend support to the existence of a real, measurable, open access citation advantage with a lower bound of approximately 20%; absent a large body of published articles available as OA from the moment of publication onward it may be difficult to say more. The publishing environment we have right now cannot provide such a data set, and will not do so in the foreseeable future. The closest we have today are hybrid journals, but their OA articles are not only self-selected by authors (presumably because of higher than normal confidence in their quality), but also self-selected by authors who have the ability to pay to assure immediate OA, doubling down on that selection bias.
It is in authors' best interest to create a more open environment. Combining the results of treating each article individually with the aggregate numbers for mean-to-mean and median-to-median comparisons, we see that when an article benefits from being OA, it benefits a lot. In other words, where there is an OACA it is large relative to the cases where the effect of OA is zero. Judging by the difference found between the smaller group of newer articles and the overall analysis, it's plausible that OA also extends an article's impact further into the long tail of the citation curve. And the better the author, the stronger the interest in open access, since, as shown by the analysis of above-median articles, OA (unsurprisingly) benefits better articles the most.