PLoS One. 2010 Oct 18;5(10):e13636.
doi: 10.1371/journal.pone.0013636.

Self-selected or mandated, open access increases citation impact for higher quality research

Yassine Gargouri et al. PLoS One.

Abstract

Background: Articles whose authors have supplemented subscription-based access to the publisher's version by self-archiving their own final draft to make it accessible free for all on the web ("Open Access", OA) are cited significantly more than articles in the same journal and year that have not been made OA. Some have suggested that this "OA Advantage" may not be causal but just a self-selection bias, because authors preferentially make higher-quality articles OA. To test this we compared self-selective self-archiving with mandatory self-archiving for a sample of 27,197 articles published 2002-2006 in 1,984 journals.

Methodology/principal findings: The OA Advantage proved just as high for both. Logistic regression analysis showed that the advantage is independent of other correlates of citations (article age; journal impact factor; number of co-authors, references or pages; field; article type; or country) and highest for the most highly cited articles. The OA Advantage is real, independent and causal, but skewed. Its size is indeed correlated with quality, just as citations themselves are (the top 20% of articles receive about 80% of all citations).

Conclusions/significance: The OA advantage is greater for the more citable articles, not because of a quality bias from authors self-selecting what to make OA, but because of a quality advantage, from users self-selecting what to use and cite, freed by OA from the constraints of selective accessibility to subscribers only. It is hoped that these findings will help motivate the adoption of OA self-archiving mandates by universities, research institutions and research funders.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Open Access (OA) Self-Archiving Percentages for Institutions With Self-Archiving Mandates Compared to Non-Mandated, Self-Selected Controls.
As estimated from the portion of their yearly published article output that is indexed by Thomson-Reuters, in this 2006 sample at least 60% of each of the four mandated institutions' total yearly article output was self-archived and hence made OA, as mandated. The corresponding percentage OA among the control articles published in the same journal/year (but originating from other, presumably nonmandated institutions) was 15%, or close to the frequently reported global spontaneous baseline rate of about 15–20% for self-selected (nonmandated) self-archiving. In other words, about 15% of these papers were self-selectively self-archived when it was not mandated, whereas at least 60% were self-archived when it was mandated.
Figure 2. Log Citation Ratios Comparing the Yearly OA Impact Advantage for Self-Selected vs Mandatory OA 2002–2006.
O = OA article (Open Access); Ø = non-OA article (non-Open Access); M = Mandated OA; S = Self-Selected OA. Averages across the sample of four institutions with self-archiving mandates confirm the significantly higher citation counts for OA articles (symbolized here as “O”) compared to matched control non-OA articles (symbolized here as “Ø”) published in the same journal and year. They are compared as O/Ø log ratios in the seven comparisons. (The first comparison, O/Ø, for example, is the arithmetic mean of all the (log) ratios O/Ø for each of the 5 years.) OA articles are more highly cited irrespective of whether the OA is Self-Selected (S) or Mandated (M). The O/Ø Advantage is present for mandated OA (OM/ØS) and is of about the same magnitude irrespective of whether we compare the S ratios with the M ratios for the entire control sample (OS/Ø vs OM/Ø) or just compare S alone with M alone (OS/ØS vs OM/ØM). (The larger values for year 2006 are almost certainly due to the fact that 2006 was still too near to have stabilized at the time this analysis was conducted (2008–9); the analysis has since been extended for years 2006–2008, thereby stabilizing the data for 2006 and 2007, and yields the same results, always with the exception of the most recent year, which was 2008 in the most recent analysis.)
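The averaging described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the yearly mean citation counts are invented, `mean_log_ratio` is a hypothetical helper, and the sketch assumes that each year's ratio is the mean OA citation count divided by the mean non-OA count, with natural logarithms (the caption does not state the base).

```python
import math


def mean_log_ratio(oa_means, non_oa_means):
    """Arithmetic mean of the per-year log(O/Ø) ratios."""
    if len(oa_means) != len(non_oa_means) or not oa_means:
        raise ValueError("need one OA and one non-OA value per year")
    logs = [math.log(o / n) for o, n in zip(oa_means, non_oa_means)]
    return sum(logs) / len(logs)


# Hypothetical yearly mean citation counts for 2002..2006 (not from the paper).
oa_by_year = [12.0, 10.0, 8.0, 6.0, 4.0]
non_oa_by_year = [10.0, 8.0, 6.5, 5.0, 2.5]

advantage = mean_log_ratio(oa_by_year, non_oa_by_year)
```

A positive value means OA articles were cited more on average; a value of zero would mean no advantage.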
Figure 3. Distribution of citation counts (minus self-citations) for articles.
Citation counts are not normally distributed. Of our sample of 27,197 articles, 23% had zero citations; 51% had 1–5 citations; 12% had 6–10 citations; 8% had 11–20 citations; and 6% had 20+ citations. It is for this reason that a logistic analysis rather than an ordinary regression analysis was conducted. (Cf. Figure 6, which presents the distribution of average Journal Impact Factors – which are, roughly, average citation counts – for journals.)
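The binning behind these percentages might look like the sketch below. The function names are hypothetical, and because the caption's "11–20" and "20+" ranges overlap at exactly 20, the sketch assumes that a count of 20 belongs to the 11–20 bin.

```python
from collections import Counter

BIN_LABELS = ["0", "1-5", "6-10", "11-20", "20+"]

# Percentages reported in the caption, keyed by bin label.
REPORTED_PERCENT = {"0": 23, "1-5": 51, "6-10": 12, "11-20": 8, "20+": 6}


def citation_bin(count):
    """Map a citation count to one of the caption's five ranges."""
    if count < 0:
        raise ValueError("citation counts cannot be negative")
    if count == 0:
        return "0"
    if count <= 5:
        return "1-5"
    if count <= 10:
        return "6-10"
    if count <= 20:  # assumption: 20 falls in the 11-20 bin
        return "11-20"
    return "20+"


def bin_shares(counts):
    """Fraction of articles falling in each bin."""
    if not counts:
        raise ValueError("need at least one article")
    tally = Counter(citation_bin(c) for c in counts)
    return {label: tally[label] / len(counts) for label in BIN_LABELS}
```

The reported percentages sum to 100, so the five bins cover the whole sample.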
Figure 4. Exp(ß)-1 values for logistic regressions.
These comparisons are based on 4 models, each analyzing a different comparison range. For each comparison (e.g., 1–4 citations (lo) vs. 5–9 citations (med-lo)) an article is assigned zero if its citation count is in the lower of the two ranges and one if it is in the upper range. Then the model assigns the best-fitting weights to each of the fifteen predictor variables in their joint prediction of the citation counts. The weights are proportional to the independent contribution of each variable. (Only statistically significant weights are shown.) In most of the four citation range comparisons (zero/lo, lo/med-lo, lo/med-hi, lo/hi), citation counts are positively correlated with Age, Journal Impact Factor, Number of Authors, Number of References, Number of Pages, Science, Review, USA Author, OA, and Mandatedness. There is also a significant OA*Age interaction in the top and bottom range. (Citations grow with time; for age-matched articles, the OA Advantage grows even faster with time; Figure 5.) OA is a significant independent contributor in three of the four models and their citation ranges, especially in the lo/hi comparison.
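The zero/one coding and the Exp(ß)-1 quantity can be sketched as below. This is an illustration under assumptions, not the authors' procedure: the helper names are hypothetical, articles outside both ranges are assumed to be excluded from that comparison, and only ranges whose bounds appear in the captions (lo = 1–4, med-lo = 5–9, hi = 20+) are defined. In a logistic regression, Exp(ß)-1 is the proportional change in the odds of the upper range for a one-unit increase in the predictor.

```python
import math

# Ranges stated in the captions; None marks an open-ended upper bound.
LO = (1, 4)
MED_LO = (5, 9)
HI = (20, None)


def in_range(citations, bounds):
    """True if the citation count lies within the (low, high) bounds."""
    low, high = bounds
    return citations >= low and (high is None or citations <= high)


def code_article(citations, lower, upper):
    """Return 0 for the lower range, 1 for the upper range, else None.

    None means the article is left out of this comparison (an assumption).
    """
    if in_range(citations, lower):
        return 0
    if in_range(citations, upper):
        return 1
    return None


def odds_change(beta):
    """Exp(beta) - 1: proportional change in odds per unit of the predictor."""
    return math.expm1(beta)
```

For example, a coefficient of `math.log(2)` gives an odds change of 1, meaning the odds of being in the upper range double.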
Figure 5. Interaction between OA and article age.
Over and above the sum of the independent positive effects on citations of OA alone and of age alone, the size of this OA Advantage increases as articles get older. The interaction is illustrated here for the lo/hi (1–4/20+) citation range comparison (model M4) for articles that were from 3 years old (2006) to 7 years old (2002). (The comparison was made in 2009.)
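One way to read this interaction is as a logistic model with an OA term, an age term, and an OA×age product term. Under that reading, the odds ratio for OA versus non-OA articles of the same age is exp(ß_OA + ß_OA×Age · age), which grows with age whenever the interaction coefficient is positive. The sketch below uses invented coefficients purely for illustration; none of the numbers come from the paper.

```python
import math


def oa_odds_ratio(beta_oa, beta_interaction, age_years):
    """Odds ratio of OA vs non-OA articles at a given article age."""
    return math.exp(beta_oa + beta_interaction * age_years)


# Hypothetical coefficients, chosen only to show the shape of the effect.
BETA_OA = 0.10
BETA_OA_X_AGE = 0.05

# Article ages covered by the caption: 3 years (2006) to 7 years (2002).
ratios = {age: oa_odds_ratio(BETA_OA, BETA_OA_X_AGE, age) for age in range(3, 8)}
```

With a positive interaction coefficient, each additional year multiplies the OA odds ratio by the same factor, exp(ß_OA×Age).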
Figure 6. Distribution of Journal Impact Factors by Journal.
As with the distribution of individual article citation counts (Figure 3), the distribution of journal impact factors (average citation counts) is highly skewed. Most JIFs fall between 0 and 5, with the peak between 2 and 3, followed by a long, rapidly shrinking tail with very few journals having a JIF greater than 10.
Figure 7. Exp(ß)-1 values for logistic regressions (Lowest JIF Range: 0.0–0.63).
(See Figure 4 for explanation of analysis and interpretation.) In this lowest range of journal impact factors, the biggest factor contributing to citation in all citation range comparisons is article age. OA is an important contributor in the two upper range comparisons.
Figure 8. Exp(ß)-1 values for logistic regressions (JIF range 0.63–1.05).
(See Figure 4 for explanation of analysis and interpretation.) In the second lowest JIF range, article age continues to be the main factor in all four citation ranges, with OA emerging and growing in the top three.
Figure 9. Exp(ß)-1 values for logistic regressions (JIF range 1.05–1.78).
(See Figure 4 for explanation of analysis and interpretation.) In this middle range of JIFs, article age continues to be influential, and OA is a significant factor in three of the four citation ranges.
Figure 10. Exp(ß)-1 values for logistic regressions (JIF range 1.78–2.47).
(See Figure 4 for explanation of analysis and interpretation.) In this next-to-highest JIF range, OA has its effect only in the top range (lo/hi).
Figure 11. Exp(ß)-1 values for logistic regressions (JIF 2.47–29.96).
(See Figure 4 for explanation of analysis and interpretation.) In this, the highest JIF range, article age again increases citations in all ranges, whereas OA again has its effect only in the top range (lo/hi). (Note the anomalous effect of the “Review” variable; this is probably because it is confounded with the Reference count variable; when Review was removed in further analyses, the pattern of the other variables, and in particular OA, was unchanged.)
