Butler et al. (2023) reported that five large commercial publishers (Elsevier, Sage, Springer-Nature, Taylor & Francis, and Wiley) received $1.06 billion in publication fees between 2015 and 2018 (all monetary amounts are given in U.S. dollars unless stated otherwise). Another three publishers were mentioned (Frontiers, MDPI, and PLOS), but were not analyzed. The revenue was underestimated and the number of open access (OA) articles increased over the period of study. The five publishers analyzed charged (on average) $1,989 for Gold OA articles and $2,905 for hybrid articles.

In this letter, eight publishers are analyzed over a longer period of 9 years (2015–2023). The publishers are the same as those previously analyzed, as well as the three that were mentioned but not analyzed. The revenue of three of the publishers is estimated, as the data are too large to collect for this short letter, but a project that includes these publishers (and others) is currently under way. The other five publishers have their open access revenue estimated using article-level data, article processing charges (APCs), and/or the publishers’ annual reports.

This letter provides a revenue estimate for eight publishers over a 9-year period (2015–2023). Given that the data are difficult to collect, and even then can only give a rough indication of revenue, it is argued that the publishers themselves should provide information for each article. The funders of open access articles (typically taxpayers) could then not only access the research they fund (which is the primary motivation for open access) but also know how their funds are supporting open access charges, in much more detail than is available at present.

The data collection process for the eight publishers is given below. The three publishers marked with an asterisk are too large to collect data for this letter, but a larger project is currently being undertaken that will include these, and additional, publishers. For these three publishers, estimates of their revenue (drawing on the figures in Butler et al. (2023)) are given so that an overall figure for the eight publishers can be derived.

For four of the publishers, the number of papers published by each of its journals for each year is captured. This is not an easy task, as a bespoke capture plan (using a browser web scraping extension) had to be devised for each publisher. The current APC for each journal is also captured and this is discounted by 5% each year to provide an APC for previous years. The figure of 5% was chosen because random manual sampling suggests that this is not unreasonable. It would have been better to have captured the APC for each journal for each of the 9 years, but this is challenging to do and the data are unlikely to be available for every journal for all of the 9 years, so an estimated figure would still be required.

Once we have the number of articles published by a given journal in a given year, and an APC for each year, it is a simple matter to derive the yearly income for that journal and then sum up each journal to give an annual income for the publisher. We note that waivers are often mentioned on the web sites of open access journals, but there is no way of knowing if a waiver has been awarded and, if so, how much, so this data could not be collected.
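The calculation described above can be sketched as follows. This is a minimal illustration, not the code used for this letter: the `Journal` class, the function names, and the example figures are hypothetical. It assumes that the 5% discount compounds annually and that the captured APC applies to 2023; the text does not specify either point. Waivers are ignored, as they are in the estimates.

```python
from dataclasses import dataclass

ANNUAL_DISCOUNT = 0.05  # 5% per year, as described in the text


@dataclass
class Journal:
    name: str                          # hypothetical journal name
    current_apc: float                 # APC captured at collection time (US$)
    articles_by_year: dict[int, int]   # year -> number of articles published


def estimated_apc(current_apc: float, year: int, base_year: int = 2023) -> float:
    """Back-cast the APC for `year`.

    Assumes the discount compounds annually and that the captured APC
    applies to `base_year`; both are assumptions of this sketch.
    """
    years_back = base_year - year
    return current_apc * (1 - ANNUAL_DISCOUNT) ** years_back


def journal_revenue(journal: Journal, base_year: int = 2023) -> dict[int, float]:
    """Estimated income per year: articles published times the back-cast APC."""
    return {
        year: count * estimated_apc(journal.current_apc, year, base_year)
        for year, count in journal.articles_by_year.items()
    }


def publisher_revenue(journals: list[Journal], base_year: int = 2023) -> dict[int, float]:
    """Sum the per-journal estimates to give an annual figure for the publisher."""
    totals: dict[int, float] = {}
    for journal in journals:
        for year, income in journal_revenue(journal, base_year).items():
            totals[year] = totals.get(year, 0.0) + income
    return totals


# Made-up example data, for illustration only.
journals = [
    Journal("Journal A", 2000.0, {2022: 100, 2023: 120}),
    Journal("Journal B", 0.0, {2023: 50}),  # no APC found, so set to zero
]
totals = publisher_revenue(journals)
```

With the made-up data above, Journal A contributes 120 articles at $2,000 for 2023 and 100 articles at $1,900 for 2022, while Journal B contributes nothing because its APC is zero.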

In two cases (PLOS and Wiley), the publishers’ annual reports are available. As the article-level data are also captured for PLOS, the revenue figure can be compared against the figures from the annual reports. In the case of Wiley, we rely solely on the annual reports, as it publishes 1,600 journals and has published over eight million articles; collecting that amount of data is outside the scope of this letter.

  1. Elsevier has an open access FAQ page [1] from which you can download a list of journals, which indicates whether they are hybrid, open access, or traditional, along with the current APCs (as at January 23, 2024). Of the 2,702 journals listed, 834 are fully open access. The number of articles published in these 834 journals between 2015 and 2023 is 731,054.

  2. Frontiers is a fully open access publisher, so all of its journals are fully open access. Data were collected from its 228 journals. The current APCs (as at January 30, 2024) were collected for each journal. The number of articles for each year (2015–2023) was also captured. The total number of articles is 459,743.

  3. *MDPI is a fully open access publisher, publishing about 430 journals. Between 1996 and 2022, it published one million articles [2]. It is worth noting that if the average APC was $1,000 (which is almost certainly an underestimate), MDPI’s total revenue would be $1 billion. Using the average APC ($1,989) from Butler et al. (2023), this would put the total revenue closer to $2 billion. A conservative estimate of $500 million is used as its revenue for 2015–2023.

  4. PLOS is a fully open access publisher. Data were collected on its 14 journals, although two are new and have not published any articles. The current APCs (as at January 29, 2024) [3] were collected for each journal, as well as the number of articles (2015–2023). The total number of articles is 207,935.

    PLOS provides an annual overview of its accounts [4], which shows its revenue from open access charges (see Table 3).

  5. Sage has 211 fully open access journals [5]. One of the Sage web pages states [6]: “For pure gold open access journals, APCs vary from journal to journal, so please visit the individual journal’s homepage for details.” Therefore, the current APC for each journal was manually collected. Several journals do not charge APCs, as these are supported by other means (e.g., a university or a trade association). For other journals it was not possible to find an APC, so it was set to zero. Of the 211 journals listed as being fully open access, 61 had their APCs set to zero. The number of articles published in these 211 journals between 2015 and 2023 is 121,742.

  6. *Springer-Nature covers a number of different imprints [7,8] (BMC, Nature Portfolio, Springer, and Palgrave Macmillan). It offers 683 open access journals and more than 2,200 hybrid journals. It has published more than 124,000 fully open access articles [7]. The data are too large to collect for this letter. The estimated revenue we use for this publisher is $1,179,348,760, which is double the estimate given in Butler et al. (2023). This will be an underestimate for three reasons: the figure in Butler et al. (2023) was itself an underestimate (and was their lower estimate); their estimate covered a 4-year period (2015–2018), whereas this letter includes an additional 5 years (2019–2023); and the figures provided here assume no increase in the APCs, which is almost certainly not the case.

  7. *Taylor & Francis has published over five million articles and publishes 364 open access journals. The data are too large to collect for this letter. The estimated revenue we use for this publisher is $153,531,114. The reasoning behind this figure is the same as that given for Springer-Nature.

  8. Wiley has published eight million articles across 1,600 journals [9]. Its annual reports [10] provide financial information on open access revenue. The 2015–2023 annual reports indicate that open access revenue was $505,236,075 (about $505.2 million).
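The back-of-envelope MDPI figures in item 3 can be reproduced with a few lines. This is only an illustrative sketch: the constant and function names are hypothetical, and the article count covers 1996–2022 rather than the 2015–2023 window used elsewhere in this letter.

```python
MDPI_ARTICLES = 1_000_000  # articles published 1996-2022, as cited in item 3


def total_revenue(articles: int, average_apc: float) -> float:
    """Total fee revenue if every article paid the same average APC."""
    return articles * average_apc


# Scenario 1: an average APC of $1,000 (described in the text as an underestimate).
low_scenario = total_revenue(MDPI_ARTICLES, 1_000)

# Scenario 2: the average Gold OA APC reported by Butler et al. (2023).
butler_scenario = total_revenue(MDPI_ARTICLES, 1_989)

print(f"${low_scenario:,.0f}")     # $1,000,000,000
print(f"${butler_scenario:,.0f}")  # $1,989,000,000
```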

Table 1 shows the estimated revenue figures for three publishers for 2015–2018. The table compares the estimated revenue derived here (the Kendall column) and the figures presented by Butler et al. (the Butler column). For Elsevier and Sage the Kendall revenues are higher than those of Butler et al. The Kendall figure for Wiley is extracted from the annual reports and is lower than the Butler figure. Overall, the Kendall estimate is higher than the Butler et al. estimate. The main message, though, is that there is no reliable way to accurately estimate the revenue for a given publisher.

Table 1.

Estimated revenue ($US) of three publishers (2015–2018)

| # | Publisher | Butler | Kendall |
|---|---|---|---|
| 1 | Elsevier | 221,441,616 | 293,832,673 |
| 2 | Sage | 31,576,202 | 49,305,233 |
| 3 | Wiley | 141,316,332 | 120,717,000 |
| | Total | 394,334,150 | 463,854,906 |

Table 2 shows all eight publishers, covering 2015–2023. The revenue estimate for publishers 1–5 (in Table 2) is $3.587 billion. For the eight publishers, the estimated revenue is $5.420 billion. These figures will not be totally accurate—indeed, they may be quite far from the true values—but they do provide a good indication of the scale of spend on open access revenues across just eight publishers.

Table 2.

Estimated revenue ($US) of 8 publishers (2015–2023)

| # | Publisher | Kendall |
|---|---|---|
| 1 | Elsevier | 1,384,097,843 |
| 2 | Frontiers | 1,211,017,631 |
| 3 | PLOS | 327,526,349 |
| 4 | Sage | 159,511,574 |
| 5 | Wiley Limited | 505,236,075 |
| | Subtotal | 3,587,389,472 |
| 6 | MDPI | 500,000,000 |
| 7 | Springer Nature | 1,179,348,760 |
| 8 | Taylor & Francis | 153,531,114 |
| | Total | 5,420,269,346 |
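The subtotal and total in Table 2 can be checked by summing the rows. In this minimal sketch the variable names are hypothetical, and the Elsevier and MDPI figures are taken to be $1,384,097,843 and $500,000,000, the values consistent with the stated subtotal and total.

```python
# Publishers 1-5: estimated from article-level data or annual reports.
collected = {
    "Elsevier": 1_384_097_843,
    "Frontiers": 1_211_017_631,
    "PLOS": 327_526_349,
    "Sage": 159_511_574,
    "Wiley": 505_236_075,
}

# Publishers 6-8: rough estimates, as the data were too large to collect.
rough_estimates = {
    "MDPI": 500_000_000,
    "Springer Nature": 1_179_348_760,
    "Taylor & Francis": 153_531_114,
}

subtotal = sum(collected.values())
total = subtotal + sum(rough_estimates.values())

print(f"Subtotal: ${subtotal:,}")  # Subtotal: $3,587,389,472
print(f"Total:    ${total:,}")     # Total:    $5,420,269,346
```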

Table 3 compares the “Publication fees, net” reported in the PLOS financial statements with the estimates derived from the article-level data collected from its web site. Between 2015 and 2017, the estimates are lower than the income reported by PLOS. For 2018–2021, the estimates are higher than the values reported by PLOS. Over 2015–2020, the totals differ by 0.38%, which is negligible given the scale of the figures involved. The 2021 figures differ by about $6.65 million, a difference of 18.62%. This significant difference has yet to be explained, but it is encouraging to see the close agreement across the other 6 years (just 0.38% difference in total), and even the overall difference of 2.41% suggests that the article-level calculation broadly agrees with the data in the financial statements, albeit with under- and overestimates in individual years offsetting one another.

Table 3.

The PLOS revenue figures ($US) from their annual statements. Only figures up to 2021 were available at the time of writing.

| Year | From financial overview | Kendall | % difference |
|---|---|---|---|
| 2015 | 42,274,910 | 37,987,182 | −10.68 |
| 2016 | 36,772,796 | 33,790,076 | −8.45 |
| 2017 | 34,832,837 | 33,730,198 | −3.22 |
| 2018 | 31,663,670 | 33,272,712 | 4.96 |
| 2019 | 29,847,728 | 32,384,452 | 8.15 |
| 2020 | 32,428,621 | 35,861,769 | 10.05 |
| 2021 | 32,402,344 | 39,054,764 | 18.62 |
| Total | 240,222,906 | 246,081,152 | 2.41 |
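The method used for the “% difference” column is not stated. The reported values appear consistent with a symmetric percentage difference, that is, the difference between the two figures divided by their mean. The sketch below makes that assumption (the function name is hypothetical) and checks it against three rows of Table 3.

```python
def symmetric_pct_difference(reported: float, estimated: float) -> float:
    """Difference between two values as a percentage of their mean.

    Assumption: this is the formula behind the "% difference" column;
    it is inferred from the figures, not stated in the text.
    """
    mean = (reported + estimated) / 2
    return (estimated - reported) / mean * 100


# (from financial overview, Kendall estimate), taken from Table 3
rows = {
    "2015": (42_274_910, 37_987_182),
    "2021": (32_402_344, 39_054_764),
    "Total": (240_222_906, 246_081_152),
}

pct = {
    label: round(symmetric_pct_difference(reported, estimated), 2)
    for label, (reported, estimated) in rows.items()
}
print(pct)  # {'2015': -10.68, '2021': 18.62, 'Total': 2.41}
```

A symmetric measure treats neither the reported nor the estimated figure as the baseline, which seems a reasonable choice when both are subject to error.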

Only eight publishers are included in the analysis in this letter. There are many other (legitimate) publishers that publish open access papers, which would significantly increase the overall spend on open access revenues.

There are other publishers that are more motivated by the financial side of publishing and they have little regard for maintaining the integrity of the scientific archive. These are often referred to as predatory publishers (Kendall, 2021; Kendall & Linacre, 2022; Macháček & Srholec, 2022). OMICS (the only proven predatory publisher), for example, has not been included. Estimating the open access charges from these publishers will significantly increase the reported revenues.

In this analysis, only fully open access journals were considered. That is, those journals that offer both traditional and open access options were excluded. If these were included, it would significantly add to the number of articles. It would also significantly increase the data collection challenges, not only because there would be significantly more papers to process, but also because identifying which papers were published under an open access model might be challenging.

The data are incomplete. As evidenced in Butler et al. (2023), as well as noted here, to collect the data is a vast undertaking, and even then many assumptions/estimations have to be made. The estimates will never be fully accurate unless the publishers are willing to provide article-level data, which include all necessary information (see Section 6).

The catalyst for the open access model of publishing was a requirement for government-funded research to be available to those that pay for it, typically the taxpayer. A (perhaps unintended) consequence of this is that the taxpayer not only pays for the research to be carried out, but pays again to have that research published in a journal. Moreover, the information about how much has been spent on these publication costs is not easily available and those funds go to commercial organizations that have an eye on the bottom line and have shareholders to consider.

Is this investment of billions of dollars providing value for money? To try to contextualize the amount that is spent on open access fees, we ask what other initiatives these funds could support.

Forbes [11] provides details of U.S. professorial salaries across a range of disciplines. They range between $92,250 p.a. and $133,950 p.a. Taking into account additional employment costs, if it is assumed that the average professorial salary is $200,000, the $5,420,269,346 of open access fees reported here could fund 27,101 professors.

A U.K. government report [12] says that the average student debt, at the completion of their course, is £45,600 ($57,583, using xe.com, February 10, 2024). The $5,420,269,346 in open access fees could remove this debt for 94,130 students. Another way of framing this is to consider the £9,250 ($11,681) p.a. fees that are paid by U.K. students. Over a 3-year course, they would pay £27,750 ($35,042). The open access revenue reported here would fund the course fees of 154,678 students.
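The first two comparisons are simple divisions of the total estimate by a unit cost. A minimal sketch, with hypothetical constant names and the figures quoted above:

```python
TOTAL_OA_FEES = 5_420_269_346  # estimated revenue for the eight publishers (US$)
PROFESSOR_COST = 200_000       # assumed annual cost of one professor (US$)
STUDENT_DEBT_USD = 57_583      # average U.K. student debt, converted to US$

# Round to the nearest whole person.
professors = round(TOTAL_OA_FEES / PROFESSOR_COST)
students_debt_cleared = round(TOTAL_OA_FEES / STUDENT_DEBT_USD)

print(f"{professors:,} professors")                     # 27,101 professors
print(f"{students_debt_cleared:,} student debts paid")  # 94,130 student debts paid
```

The course-fee comparison works the same way, although its result is sensitive to how the converted dollar amount is rounded.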

  1. Elsevier and PLOS should be applauded for providing annual accounts, which include data about open access fee revenue. We would encourage other publishers to do this.

  2. The fee paid to publish each individual paper should be noted on the published article, as well as in the metadata. The actual amount paid and the amount of any waiver should be provided.

  3. Who paid the open access fee should be stated so that it is easier to calculate (for example) the investment made by a specific university or country. This is even more important when the authors come from different institutions and/or countries.

  4. Governments should reflect on whether the money provided by their taxpayers is providing value for money by paying open access fees, or whether the funds would be better spent on supporting students, supporting more research, or even used in other government departments to support other urgent issues that the country faces.

  5. Robust, peer-reviewed journals should be established that are run as not-for-profit entities. The open access fees that are currently spent with commercial publishers should easily cover the costs of these journals. The journals should be under the control of the funding agencies and it should be a condition of accepting a grant that the results of that research are published in those journals. As long as the peer-review processes are in line with the highly respected journals, these journals will quickly establish themselves as the leading journals in the discipline. This model would also be another way to combat the threat posed by predatory journals, as publicly funded research could not be published in those journals.
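To make recommendations 2 and 3 concrete, the sketch below shows one way article-level fee metadata could be structured and then aggregated. Everything in it is hypothetical: the `ArticleFeeRecord` class, its field names, the placeholder DOIs, and the amounts are illustrative assumptions, not an existing publisher schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticleFeeRecord:
    """Hypothetical per-article fee metadata (all field names are assumptions)."""
    doi: str            # article identifier
    list_apc: float     # advertised APC at the time of publication (US$)
    waiver: float       # amount waived or discounted (US$)
    payer: str          # institution or funder that paid the fee
    payer_country: str  # country of the payer

    @property
    def amount_paid(self) -> float:
        """Fee actually paid after any waiver."""
        return self.list_apc - self.waiver


def spend_by_country(records: list[ArticleFeeRecord]) -> dict[str, float]:
    """Total fees actually paid, grouped by the payer's country."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[record.payer_country] += record.amount_paid
    return dict(totals)


# Made-up records with placeholder DOIs, for illustration only.
records = [
    ArticleFeeRecord("10.0000/example.1", 2000.0, 500.0, "University A", "GB"),
    ArticleFeeRecord("10.0000/example.2", 1500.0, 0.0, "University B", "GB"),
    ArticleFeeRecord("10.0000/example.3", 3000.0, 3000.0, "Institute C", "US"),
]
spend = spend_by_country(records)
```

With records like these, the investment made by a given university or country becomes a simple aggregation, rather than an estimate built from scraped article counts and back-cast APCs.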

Of course there is also a need to ensure that the fees are proportionate to the service being provided and that publishers are not simply setting their fees as high as they believe they can get away with.

This is a letter in response to the Butler et al. article. As such, it is not a full research article. A full article is currently being prepared and will have much more detail than can be presented here. The limitations listed below will need to be considered for a more complete article.

  1. Given the widely different ways that the data are presented by the journals/publishers, it is challenging to collect the data and be fully confident that they are robust and complete. Indeed, it is almost certain that any data collection exercise is lacking and new tools may need to be developed to provide data sets that are robust.

  2. There is no one “go-to” place for every journal where the data can be collected from, other than the publisher itself. For example, indexes such as Scopus, Web of Science, or the Directory of Open Access Journals (DOAJ) do not hold information on every journal.

  3. Data are only collected at the journal level. It would be interesting to collect data about authors (noting that author disambiguation is an issue) and also countries, so that an estimate can be derived about how much a given country spends on open access fees.

  4. Where there are multiple authors across different institutions/countries it is challenging to know who paid the open access fee and how its costs should be attributed.

  5. Only fully open access journals have been considered here. Many open access articles are published in hybrid and/or transformative journals. Collecting information about open access articles published in hybrid journals adds another layer of complexity to the data collection task.

  6. There are models where institutions can enter an agreement with a publisher, enabling their researchers to publish in their journals. There is no information publicly available that shows the effect of these agreements on the underlying finances. Nor is there any information about whether a fee waiver was granted for a given paper.

  7. Many publishers are missing from this analysis. The publishers analyzed are some of the largest but there are hundreds (if not thousands) more that can be looked at.

The author has no competing interests.

No funding was received for the purposes of this research.

The data are not being made available at the present time as this is a letter in response to an article. A more complete paper is being prepared at which point the full data set will be made available.

[1] https://www.elsevier.com/open-access#4-faqs, accessed January 28, 2024 (archived at https://bit.ly/42eBEr8).

[2] https://www.mdpi.com/about/announcements/5130, accessed November 15, 2023 (archived at https://bit.ly/3MO4jwN).

[3] https://plos.org/publish/fees/, accessed January 29, 2024 (archived at https://bit.ly/3HW0faZ).

[4] https://plos.org/financial-overview/, accessed February 9, 2024 (archived at https://bit.ly/3ujovk5).

Butler, L.-A., Matthias, L., Simard, M.-A., Mongeon, P., & Haustein, S. (2023). The oligopoly’s shift to open access: How the big five academic publishers profit from article processing charges. Quantitative Science Studies, 4(4), 778–799.

Kendall, G. (2021). Beall’s legacy in the battle against predatory publishers. Learned Publishing, 34(3), 379–388.

Kendall, G., & Linacre, S. (2022). Predatory journals: Revisiting Beall’s research. Publishing Research Quarterly, 38(3), 530–543.

Macháček, V., & Srholec, M. (2022). Predatory publishing in Scopus: Evidence on cross-country differences. Quantitative Science Studies, 3(3), 859–887.

Author notes

Handling Editor: Vincent Larivière

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.