Are open access fees a good use of taxpayers’ money?


 Butler et al. (2023) reported that five large commercial publishers (Elsevier, Sage, Springer-Nature, Taylor & Francis and Wiley) received $1.06 billion in publication fees between 2015 and 2018. Another three publishers were mentioned (Frontiers, MDPI and PLOS), but were not analyzed. The revenue was underestimated and the number of open access (OA) articles increased over the period of study. The five publishers analyzed charged (on average) $1,989 for gold OA articles and $2,905 for hybrid articles.
 
 
 https://www.webofscience.com/api/gateway/wos/peer-review/10.1162/qss_c_00305


1. INTRODUCTION
Butler, Matthias et al. (2023) reported that five large commercial publishers (Elsevier, Sage, Springer-Nature, Taylor & Francis, and Wiley) received $1.06 billion in publication fees between 2015 and 2018 (all monetary amounts are given in U.S. dollars unless stated otherwise). Another three publishers were mentioned (Frontiers, MDPI, and PLOS), but were not analyzed. The revenue was underestimated and the number of open access (OA) articles increased over the period of study. The five publishers analyzed charged (on average) $1,989 for Gold OA articles and $2,905 for hybrid articles.
In this letter, eight publishers are analyzed over a longer period of 9 years (2015-2023). The publishers are the same as those previously analyzed, as well as the three that were mentioned but not analyzed. The revenue of three of the publishers is estimated, as the data are too large to collect for this short letter, but a project that includes these publishers (and others) is currently under way. The other five publishers have their open access revenue estimated using article-level data, article processing charges (APCs), and/or the publishers' annual reports.
This letter provides a revenue estimate for eight publishers over a 9-year period (2015-2023). Given that the data are difficult to collect, and even then can only give a rough indication of revenue, it is argued that the publishers themselves should provide information for each article, so that the funders of open access articles (typically taxpayers) can not only access the research they fund (which is the primary motivation for open access) but also know how their funds are supporting open access charges, in much more detail than is available at the present time.

METHODOLOGY
The data collection process for the eight publishers is given below. The three publishers marked with an asterisk are too large to collect data for in this letter, but a larger project is currently being undertaken that will include these, and additional, publishers. For these three publishers, estimates of their revenue (drawing on the figures in Butler et al. (2023)) are given so that an overall figure for the eight publishers can be derived.
For four of the publishers, the number of papers published by each of its journals for each year is captured. This is not an easy task, as a bespoke capture plan (using a browser web scraping extension) had to be devised for each publisher. The current APC for each journal is also captured and this is discounted by 5% each year to provide an APC for previous years. The figure of 5% was chosen because random manual sampling suggests that this is not unreasonable. It would have been better to have captured the APC for each journal for each of the 9 years, but this is challenging to do and the data are unlikely to be available for every journal for all of the 9 years, so an estimated figure would still be required.
Once we have the number of articles published by a given journal in a given year, and an APC for each year, it is a simple matter to derive the yearly income for that journal and then sum over the journals to give an annual income for the publisher. We note that waivers are often mentioned on the web sites of open access journals, but there is no way of knowing whether a waiver has been awarded and, if so, how much, so these data could not be collected.
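The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this letter: the function name, the input layout, and the reading of "discounted by 5% each year" as a compound discount applied backwards from the latest year are all assumptions.

```python
def estimate_publisher_revenue(journals, latest_year=2023, first_year=2015,
                               annual_discount=0.05):
    """Estimate yearly APC revenue for one publisher.

    journals: list of dicts, each with
        "current_apc": the APC collected today (assumed to apply to latest_year)
        "articles":    dict mapping year -> number of articles published
    Waivers are ignored, because they cannot be observed.
    """
    revenue_by_year = {year: 0.0 for year in range(first_year, latest_year + 1)}
    for journal in journals:
        for year, n_articles in journal["articles"].items():
            if year not in revenue_by_year:
                continue  # outside the study period
            years_back = latest_year - year
            # Assumption: the 5% discount compounds for each year we go back.
            apc = journal["current_apc"] * (1 - annual_discount) ** years_back
            revenue_by_year[year] += n_articles * apc
    return revenue_by_year


# Hypothetical journal, for illustration only.
example = [{"current_apc": 2000.0, "articles": {2022: 10, 2023: 10}}]
revenue = estimate_publisher_revenue(example)
print(revenue[2022], revenue[2023])
```

A journal whose APC could not be found is simply given `current_apc = 0`, which matches the treatment described for such journals in this letter.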
In two cases (PLOS and Wiley), the publishers' annual reports are available. As the article-level data are also captured for PLOS, the revenue figure can be compared against the figures from the annual reports. In the case of Wiley, we rely solely on the annual reports, as it publishes 1,600 journals and has published over eight million articles; collecting that amount of data is outside the scope of this letter.
individual journal's homepage for details." Therefore, the current APC for each journal was manually collected. Several journals do not charge APCs, as these are supported by other means (e.g., a university or a trade association). For other journals it was not possible to find an APC, so it was set to zero. Of the 211 journals listed as being fully open access, 61 had their APCs set to zero. The number of articles published in these 211 journals between 2015 and 2023 is 121,742.
6. *Springer-Nature covers a number of different imprints 7,8
The main message, though, is that there is no reliable way to accurately estimate the revenue for a given publisher.
Table 3 shows a comparison between the "Publication fees, net" as reported in the PLOS financial statements and the estimated figures using article-level data that were collected from their web site. Between 2015 and 2017, the estimates are lower than the income reported by PLOS. For 2018-2021, the estimates are higher than the values reported by PLOS. Looking at 2015-2020, the difference in the values is 0.38%, which is negligible given the scale of the figures involved. The 2021 revenue figure differs by almost $6 million, a difference of 18.62%. This significant difference has yet to be explained, but it is positive to see the strong correlation across the other 8 years (just 0.38% difference) and even an overall difference of 2.41% suggests that the article-level calculation broadly agrees with the data in the financial statements, albeit compensating from 1 year to the next.
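The point about differences compensating from one year to the next can be illustrated with a short Python sketch. The yearly figures below are hypothetical and are not taken from Table 3; the definition of the percentage difference (relative to the reported figure) is also an assumption.

```python
def pct_difference(estimated, reported):
    """Percentage difference of an estimate relative to the reported figure."""
    return (estimated - reported) / reported * 100


# Hypothetical figures (in $ millions), for illustration only.
reported = {2015: 100.0, 2016: 100.0}
estimated = {2015: 90.0, 2016: 111.0}

# Year by year, the estimate is first too low and then too high.
per_year = {year: pct_difference(estimated[year], reported[year])
            for year in reported}

# Summed over the period, the under- and overestimates largely cancel.
aggregate = pct_difference(sum(estimated.values()), sum(reported.values()))

print(per_year)
print(aggregate)
```

In this made-up case the individual years are off by about 10% in either direction, while the aggregate differs by about half a percent, which is the pattern described for the PLOS comparison.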

Quantitative Science Studies 267

DISCUSSION
Only eight publishers are included in the analysis in this letter. There are many other (legitimate) publishers that publish open access papers, and including them would significantly increase the overall spend on open access fees.
There are other publishers that are more motivated by the financial side of publishing and have little regard for maintaining the integrity of the scientific archive. These are often referred to as predatory publishers (Kendall, 2021; Kendall & Linacre, 2022; Macháček & Srholec, 2022). OMICS (the only proven predatory publisher), for example, has not been included. Including the open access charges from these publishers would significantly increase the reported revenues.
In this analysis, only fully open access journals were considered. That is, hybrid journals, which offer both traditional and open access options, were excluded. If these were included, it would significantly add to the number of articles. It would also significantly increase the data collection challenges, not only because there would be significantly more papers to process, but also because identifying the papers that were published under an open access model might be challenging.
The data are incomplete. As evidenced in Butler et al. (2023), as well as noted here, collecting the data is a vast undertaking, and even then many assumptions and estimations have to be made. The estimates will never be fully accurate unless the publishers are willing to provide article-level data that include all necessary information (see Section 6).

COULD OPEN ACCESS FEES PROVIDE BETTER VALUE IF SPENT ELSEWHERE?
The catalyst for the open access model of publishing was a requirement for government-funded research to be available to those who pay for it, typically the taxpayer. A (perhaps unintended) consequence of this is that the taxpayer not only pays for the research to be carried out, but pays again to have that research published in a journal. Moreover, the information about how much has been spent on these publication costs is not easily available and those funds go to commercial organizations that have an eye on the bottom line and have shareholders to consider. Forbes 11 provides details of U.S. professorial salaries across a range of disciplines. They range between $92,250 p.a. and $133,950 p.a. Taking into account additional employment costs, if it is assumed that the average professorial salary is $200,000, the $5,420,269,346 of open access fees reported here could fund 27,101 professors.
A U.K. government report 12 says that the average student debt, at the completion of their course, is £45,600 ($57,583, using xe.com, February 10, 2024). The $5,420,269,346 of open access fees could remove this debt for 94,130 students. Another way of framing this is to consider the £9,250 ($11,681) p.a. fees that are paid by U.K. students. Over a 3-year course, they would pay £27,750 ($35,042). The open access revenue reported here would fund 154,678 students.
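The comparisons above are simple divisions of the total fee estimate by a unit cost. A minimal Python sketch follows; the function name is hypothetical and rounding to the nearest whole number is an assumption about how the figures were derived.

```python
TOTAL_OA_FEES_USD = 5_420_269_346  # total open access fees reported in this letter


def units_funded(total_usd, unit_cost_usd):
    """Number of units (professors, students) the total could fund."""
    return round(total_usd / unit_cost_usd)


professors = units_funded(TOTAL_OA_FEES_USD, 200_000)       # assumed cost per professor
debts_cleared = units_funded(TOTAL_OA_FEES_USD, 57_583)     # average U.K. student debt in $
courses_funded = units_funded(TOTAL_OA_FEES_USD, 35_042)    # 3 years of U.K. fees in $

print(professors, debts_cleared, courses_funded)
```

The first two results reproduce the 27,101 professors and 94,130 students quoted above. The third can differ by one from the quoted 154,678, presumably because the dollar amounts used here are already rounded conversions from pounds.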

CALL TO ACTION
1. Wiley and PLOS should be applauded for providing annual accounts, which include data about open access fee revenue. We would encourage other publishers to do this.
2. The fee paid to publish each individual paper should be noted on the published article, as well as in the metadata. The actual amount paid and the amount of any waiver should be provided.
3. Who paid the open access fee should be stated so that it is easier to calculate (for example) the investment made by a specific university or country. This is even more important when the authors come from different institutions and/or countries.
4. Governments should reflect on whether the money provided by their taxpayers is providing value for money by paying open access fees, or whether the funds would be better spent on supporting students, supporting more research, or even used in other government departments to support other urgent issues that the country faces.
5. Robust, peer-reviewed journals should be established that are run as not-for-profit entities. The open access fees that are currently spent with commercial publishers should easily cover the costs of these journals. The journals should be under the control of the funding agencies and it should be a condition of accepting a grant that the results of that research are published in those journals. As long as the peer-review processes are in line with those of the highly respected journals, these journals will quickly establish themselves as the leading journals in the discipline. This model would also be another way to combat the threat posed by predatory journals, as publicly funded research could not be published in those journals.
Of course there is also a need to ensure that the fees are proportionate to the service being provided and that publishers are not simply setting their fees as high as they believe they can get away with.
Frontiers is a fully open access publisher, so all of its journals are fully open access.
Sage has 211 fully open access journals 5. One of the Sage web pages says 6 "For pure gold open access journals, APCs vary from journal to journal, so please visit the
Butler et al. (2023) from its 228 journals. The current APCs (as at January 30, 2024) were collected for each journal. The number of articles for each year (2015-2023) was also captured. The total number of articles is 459,743.
3. *MDPI is a fully open access publisher, publishing about 430 journals. Between 1996 and 2022, it published one million articles 2. It is worth noting that if the average APC was $1,000 (which is almost certainly an underestimate), MDPI's total revenue would be $1 billion. Using the average APC ($1,989) from Butler et al. (2023), this would put the total revenue closer to $2 billion. A conservative estimate of $500,000 is used as their revenue for 2015-2023.
4. PLOS is a fully open access publisher. Data were collected on its 14 journals, although two are new and have not published any articles. The current APCs (as at January 29, 2024) 3 were collected for each journal, as well as the number of articles (2015-2023). The total number of articles is 207,935. PLOS provides an annual overview of its accounts 4, which shows its revenue from open access charges (see Table 3).
5.
3 https://plos.org/publish/fees/, accessed January 29, 2024 (archived at https://bit.ly/3HW0faZ).
4 https://plos.org/financial-overview/, accessed February 9, 2024 (archived at https://bit.ly/3ujovk5).
5 https://journals.sagepub.com/gold-open-access-journals-disciplines, accessed January 28, 2024 (archived at https://bit.ly/3HVv3bT).
6 https://us.sagepub.com/en-us/sam/author-information, accessed February 3, 2024 (archived at https://bit.ly/3OtoKji).
Butler et al. (2023)io, Springer, and Palgrave Macmillan). It offers 683 open access journals and more than 2,200 hybrid journals. It has published more than 124,000 fully open access articles 7. The data are too large to collect for this letter. The estimated revenue we use for this publisher is $1,179,348,760. This is double the estimate given in Butler et al. (2023). This will be an underestimate, as the figure in Butler et al. (2023) was underestimated (and was their lower estimate); their estimate was over a 4-year period (2015-2018) and this letter includes an additional 5 years (2019-2023); and the figures provided here assume no increase in the APCs, which is almost certainly not the case.
7. *Taylor & Francis has published over five million articles and publishes 364 open access journals. The data are too large to collect for this letter. The estimated revenue we use for this publisher is $153,531,114. The reasons for this are the same as given for Springer-Nature.
8. Wiley has published eight million articles across 1,600 journals 9. Accessing its annual reports 10 provides financial information on open access revenue. The 2015-2023 annual reports indicate that open access revenue was $505,235 million.
Table 1 shows the estimated revenue figures for three publishers for 2015-2018. The table compares the estimated revenue derived here (the Kendall column) and the figures presented by Butler et al. (the Butler column). For Elsevier and Sage, the Kendall revenues are higher than those of Butler et al. The Kendall estimate for Wiley is extracted from the annual reports and is lower than the Butler figure. Overall, the Kendall estimate is higher than the Butler et al. estimate.

Table 3. The PLOS revenue figures ($US) from their annual statements. Only figures up to 2021 were available at the time of writing.
Is this investment of billions of dollars providing value for money? To try to contextualize the amount that is spent on open access fees, we ask what other initiatives these funds could support.