The speed with which biomedical specialists were able to identify and characterize COVID-19 was partly due to prior research with other coronaviruses. Early epidemiological comparisons with Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) also made it easier to predict COVID-19’s likely spread and lethality. This article assesses whether academic interest in prior coronavirus research has translated into interest in the primary source material, using Mendeley reader counts for early academic impact evidence. The results confirm that SARS and MERS research in 2008–2017 experienced anomalously high increases in Mendeley readers in April–May 2020. Nevertheless, studies learning COVID-19 lessons from SARS and MERS or using them as a benchmark for COVID-19 have generated much more academic interest than primary studies of SARS or MERS. Thus, research that interprets prior relevant research for new diseases when they are discovered seems to be particularly important to help researchers to understand its implications in the new context.
COVID-19 was first recognized because scientists already knew about coronaviruses: their shape and how to test for them. Their origins were also already known (zoonotic with specific animal carriers). Moreover, an understanding of virus mutations had led to an expectation that new coronaviruses could emerge and that their virulence could differ from those already found. Thus, while the virulence of COVID-19 and timing of its occurrence could not be predicted in advance, its emergence was a recognized possibility.
In addition, prior coronavirus research had identified a set of symptoms from previous outbreaks, tested a range of treatments, experimented with vaccines, and implemented preventative measures. Thus, biomedical and public health investigations of COVID-19 had a body of prior research to draw upon. Assessing the extent to which COVID-19 differs from prior diseases might help speed new biomedical and public health research, for example. This is made explicit in some papers, such as “Repurposing antivirals as potential treatments for SARS-CoV-2: From SARS to COVID-19” (Gómez-Ríos, López-Agudelo, & Ramírez-Malule, 2020). It seems likely that this is a general trend, so older coronavirus research will be attracting substantial new attention in 2020, but evidence is needed to confirm this. There are currently three known coronavirus diseases that can have a serious impact on humans. Other coronaviruses are mild in humans or only infect some species of animals.
SARS (Severe Acute Respiratory Syndrome) is caused by the coronavirus SARS-CoV (also known as SARS-CoV-1, SARSr-CoV). It was first identified in 2003, and there has been no outbreak since then. 8,437 people have been reported infected, with an 11% death rate (https://www.who.int/csr/sars/country/2003_07_11/en/).
MERS (Middle East Respiratory Syndrome) is caused by the MERS coronavirus MERS-CoV. It was first identified in Saudi Arabia in 2012 and, by January 2020, 2,500 people had been reported infected, with a 35% death rate (http://www.emro.who.int/health-topics/mers-cov/mers-outbreaks.html).
COVID-19 is caused by the coronavirus SARS-CoV-2 and emerged in December 2019. At the time of writing, it had infected many more people than the previous two coronaviruses, was more infectious for human-to-human transmission, and had a lower death rate but much higher death toll. It has previously been called 2019-nCoV and 2019/2020 novel coronavirus.
Despite the above-mentioned likelihood of prior coronavirus research being more useful in 2020, there is no evidence yet to check whether interest in research specific to SARS and MERS has increased due to COVID-19. A positive result—even though highly expected—would empirically validate the importance of ongoing research into diseases related to potential pandemics (e.g., coronaviruses, ebolaviruses, Flaviviridae viruses). This article addresses this issue and compares the current academic impact of COVID-19 research with prior coronavirus research to assess their current relative importance. It is not clear whether older coronavirus research would be more impactful. This seems possible because it may be more foundational and of higher quality due to more time to plan and execute. Conversely, research about COVID-19 may be more relevant to the 2020 pandemic. The focus here is on scholarly impact (i.e., citations from future research, or Mendeley readers as a proxy for this) rather than societal impact. In the special case of COVID-19 it is not yet clear whether scholarly impact would be a good indicator of societal impact, especially in the form of treatments, vaccines, or epidemiology. The research questions are as follows.
RQ1: Does SARS and MERS research from before 2020 have more scholarly impact in 2020 than expected for its age?
RQ2: Does SARS and MERS research from before 2020 have more scholarly impact in mid-2020 than early 2020 COVID-19 research?
2. BACKGROUND: BIBLIOMETRIC STUDIES OF CORONAVIRUSES
Some bibliometric studies have investigated the influence of coronavirus research, mostly characterizing the number and type of publications indexed in relevant scholarly databases. As coronaviruses have many variants and could be the primary focus of a paper or less central to a research project, each study has operationalized its sample in different ways and there is no single agreed method. All seem to have been produced in 2020 in the context of COVID-19.
A range of studies have shown that there is a rapid rate of COVID-19 research publishing and that both MERS and SARS are relevant to this emerging set. In detail, and discussed chronologically, the specific findings are as follows. One study found 8,732 articles and 1,028 reviews with the title term “coronavirus” by February 9, 2020 in the Web of Science (WoS). The Journal of Virology (9%) was the single most common source and both SARS and MERS were identified as relevant keywords (Tao, Zhou, et al., 2020). By February 29, 2020, 183 publications matching the query “COVID-19” were indexed in PubMed, a third of which reported original research (Lou, Tian, et al., 2020). In PubMed and the World Health Organization (WHO) COVID-19 research database, there had been 564 observational or interventional investigations about COVID-19 by March 18, 2020 (Chahrour, Assi, et al., 2020). An investigation of WoS publications matching a set of COVID-19 queries on April 1, 2020 found keywords related to both MERS and SARS to be associated, suggesting that early research had often made connections with the two prior diseases (Hossain, 2020). By April 7, the COVID-19 coverage of a range of scholarly databases had been expanding at an increasing rate since January 2020 (Torres-Salinas, 2020). On April 9, 12,109 papers had been indexed by Scopus matching the query “coronavirus*”, with sudden increases associated with each of SARS, MERS, and COVID-19 (Haghani, Bliemer, et al., 2020; see also Danesh & GhaviDel, 2020). A collection of 2,958 articles and 2,797 preprints from Scopus, arXiv, bioRxiv, and medRxiv by April 23, 2020 was created by a set of inclusive queries, with unspecified manual checking afterwards (Latif, Usman, et al., 2020). Topic modeling applied to this data set extracted sets of 10 topics for different slices of the data, but none included SARS or MERS. 
Scopus queries for “COVID-19” in titles, abstracts, or keywords 2 days later (April 25) found 3,513 documents and a keyword-based topic visualization included MERS (Hamidah, Sriyono, & Hudha, 2020). Also on April 25, the number of COVID-19 documents indexed by PubMed was experiencing exponential-like growth (Kambhampati, Vaishya, & Vaish, 2020). A similar investigation with a wider range of queries found both SARS and MERS represented in a topic map (Dehghanbanadaki, Seif, et al., 2020).
One prior bibliometric study has compared SARS, MERS, and COVID-19 papers until March 25, 2020. Similar document types were published for each disease, both SARS and MERS research volume had decreased over time, and COVID-19 research was more cited (higher field normalized citation counts). This paper adopted an inclusive search strategy, finding 7,272 SARS documents and 2,199 MERS documents (Hu, Chen, et al., 2020). Thus, this analysis may be dominated by studies that are related to the three diseases without being primarily about them.
Some research has compared bibliometric trends for a variety of diseases, including coronaviruses. A comparison of SARS, MERS, Avian Flu, Ebola, HIV/AIDS, Hepatitis B & C, Flu, and Swine Flu found that short-lasting epidemics were uniquely associated with rapid increases and declines in both publication volumes and citation rates around the critical years (Kagan, Moran-Gilad, & Fire, 2020). Another study compared COVID-19 with SARS, Ebola, Avian Flu (H1N1), and Zika publications in WoS by April 9, 2020 (adding papers from PubMed and the Chinese National Knowledge Infrastructure for COVID-19). It found that all four prior epidemics were associated with rapid increases in publication volumes, with slower declines. Research into all diseases covered a wide range of subject areas (Zhang, Zhao, et al., 2020).
The research design was to gather studies about coronaviruses from before 2020 and compare their scholarly impact (as reflected by their Mendeley reader counts) with that of studies published in 2020. There is a large curated relevant data set: the COVID-19 Open Research Dataset (CORD-19), which is a collection of papers designed for data mining and scientometrics, but takes a broad approach by including many publications not primarily about coronaviruses (Colavizza, Costas, et al., 2020). This was not used because of its broad remit. The scholarly database Dimensions was chosen in preference to WoS, PubMed, or Scopus for its rapid indexing of academic documents and wider coverage of COVID-19 than other scholarly indexes (Torres-Salinas, 2020; Kousha & Thelwall, 2020). For the basic sample, Dimensions was searched for documents matching “coronavirus” weekly from March 21, 2020 to May 30, 2020, using the query below. This single term was chosen, rather than a set of coronavirus-related keywords and phrases, to give a narrow focus on the virus. The earliest result was from 2008, which seems to be an API limitation, because the web version has results from 1950.
search publications for "coronavirus" return publications [basics + extras]
Mendeley was queried for each document matching the above query to count the number of registered readers of the document. Reader counts were checked each week, immediately after the Dimensions queries. Mendeley readers have moderate or strong correlations with citation counts in all or almost all academic fields (Thelwall, 2017b) and recent scholarly articles usually have at least one Mendeley reader (Zahedi, Costas, & Wouters, 2014). Early Mendeley reader counts have a high correlation with later Scopus citation counts (Thelwall, 2018) but appear about a year before them (Maflahi & Thelwall, 2018; Thelwall, 2017a), so they are preferable to citation counts as an indicator of early scholarly impact. This is also true for COVID-19 research (Kousha & Thelwall, 2020). Most people registering documents in Mendeley are academics or PhD students, although there are also some master’s students, librarians, and other professionals (Mohammadi, Thelwall, et al., 2015). People usually register an article in Mendeley because they have read it or intend to read it (Mohammadi, Thelwall, & Kousha, 2016), so all evidence points to it being a citation-like academic impact indicator, with a small element of educational impact. Mendeley is not ideal for comparing reader counts for articles published in different years (for RQ1) because the expansion or contraction of its user base can influence the reader counts for papers. This is especially true because articles are often added to users’ Mendeley libraries shortly after they are published, so older articles are less likely to attract Mendeley readers. Nevertheless, although Mendeley launched in 2008, an analysis in 2014 found that average reader counts were close to the subject maximum for articles published in 2009 (Thelwall & Sud, 2016), suggesting that it is reasonable to use Mendeley reader counts for this year as a scholarly impact indicator. 
Comparisons of reader counts between years therefore seem reasonable from 2009 onwards, although they are not precise due to unknown changes in the Mendeley user base since then. The same would be true for citations, as citation counts from any database depend on the number of journals indexed and the size of those journals.
Mendeley documents were searched for using an author/title query and a separate DOI query, with the results combined to identify readers of all variants of a document in Mendeley (Zahedi, Haustein, & Bowman, 2014). The Mendeley reader counts were not field normalized because the documents fit within a relatively narrow topic and trends over time are clearer with the raw reader count data. Articles not in Mendeley by March 21, 2020 were discarded to focus on research published by that date (the data collection start). Including later articles would dilute the results through new articles with few readers when first published.
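The combination step described above can be illustrated with a minimal sketch. It assumes that each query returns a list of records carrying a record identifier and a reader count; the field names `id` and `reader_count` and the function name are hypothetical, chosen for illustration, and are not taken from the study's software. Records found by both queries are counted once, and readers are summed over the distinct records (variants) of the document.

```python
def combined_reader_count(title_hits, doi_hits):
    """Sum readers over the distinct Mendeley records found by either query.

    Each hit is assumed to be a dict with hypothetical keys:
      "id"           - identifier of the Mendeley record
      "reader_count" - number of registered readers of that record
    """
    readers_by_id = {}
    for record in list(title_hits) + list(doi_hits):
        # A record returned by both queries is stored only once.
        readers_by_id[record["id"]] = record["reader_count"]
    return sum(readers_by_id.values())


# Illustrative (invented) records: "a" is found by both queries.
title_hits = [{"id": "a", "reader_count": 10}, {"id": "b", "reader_count": 3}]
doi_hits = [{"id": "a", "reader_count": 10}, {"id": "c", "reader_count": 2}]
total = combined_reader_count(title_hits, doi_hits)
```

Deduplicating by record identifier before summing avoids double counting the readers of a record that matches both the author/title query and the DOI query.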
Four subsets were extracted to identify the influence of different types of research. The first three sets of documents were based on the inclusion of a human coronavirus-related keyword in the title. Documents mentioning a keyword in the abstract rather than the title were excluded because they are less likely to focus on coronaviruses. For example, an abstract might mention coronaviruses as a potential application of a technique or a motivation, or as part of a related study (e.g., “Feline infectious peritonitis as a systemic inflammatory disease: Contribution of liver and heart to the pathogenesis” mentions in its abstract that the disease examined is induced by a feline coronavirus, but the article is of little relevance to coronavirus research). Many papers in the complete set of matches of the original Dimensions coronavirus query were about other viruses or about viruses in general, but mentioned coronaviruses as an example or as part of a list (e.g., “Fighting misconceptions to improve compliance with influenza vaccination among health care workers”). The fourth subset encapsulated any mention of coronaviruses generally or a specific human coronavirus in article titles. Although an article can be about these topics without containing a disease or virus name in the title, for example by being published before a formal name was assigned (e.g., “A pneumonia outbreak associated with a new coronavirus of probable bat origin”), or research targeting an aspect of the disease without needing to specify it (e.g., “The psychological impact of quarantine and how to reduce it: Rapid review of the evidence” and “First respiratory transmitted food borne outbreak?” from 2020), this method seemed to be effective at eliminating peripherally relevant papers.
SARS: Journal articles with “SARS”, “SARSr-CoV”, “SARS-CoV-1”, “SARS-CoV”, or “Severe Acute Respiratory Syndrome” in their titles. The results from 2020 were manually checked to remove false matches that were mentions of COVID-19 before it had been named.
MERS: Journal articles with “MERS”, “MERS-CoV”, or “Middle East Respiratory Syndrome” in their titles.
COVID-19: Journal articles containing “COVID-19”, “COVID19”, “COVID2019”, “SARS-CoV-2”, “2019-nCoV”, “2019 coronavirus”, “coronavirus disease 2019”, or “Wuhan” in their titles. The inclusion of “Wuhan” did not generate false matches because the original data set was captured with a coronavirus query.
Coronavirus: Journal articles containing any of the above or the word “coronavirus” in their titles. This encapsulates the three human coronaviruses and the generic name for the virus family.
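The four title-based subsets can be sketched as a simple keyword classifier. This is an illustration under stated assumptions rather than the procedure actually used: it assumes case-insensitive, whole-term matching (so that “MERS” does not match inside words such as “customers”), and it omits the manual checking of 2020 SARS matches described above. The function and variable names are hypothetical.

```python
import re

# Title keywords for each subset, as listed in the text above.
KEYWORDS = {
    "SARS": ["SARS", "SARSr-CoV", "SARS-CoV-1", "SARS-CoV",
             "Severe Acute Respiratory Syndrome"],
    "MERS": ["MERS", "MERS-CoV", "Middle East Respiratory Syndrome"],
    "COVID-19": ["COVID-19", "COVID19", "COVID2019", "SARS-CoV-2",
                 "2019-nCoV", "2019 coronavirus",
                 "coronavirus disease 2019", "Wuhan"],
}


def _pattern(terms):
    # Assumption: a term matches only when not embedded in a longer word.
    alternatives = "|".join(re.escape(t) for t in terms)
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


PATTERNS = {name: _pattern(terms) for name, terms in KEYWORDS.items()}
GENERIC = _pattern(["coronavirus"])


def classify_title(title):
    """Return the set of subsets whose keywords occur in the title."""
    subsets = {name for name, pat in PATTERNS.items() if pat.search(title)}
    # The fourth subset holds any of the above or the word "coronavirus".
    if subsets or GENERIC.search(title):
        subsets.add("Coronavirus")
    return subsets


example = classify_title(
    "Repurposing antivirals as potential treatments for SARS-CoV-2: "
    "From SARS to COVID-19"
)
```

Under these matching rules a title can fall into several subsets at once, which is consistent with the later observation that highly read SARS papers from 2020 also mention COVID-19 in their titles.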
Documents that did not match the Dimensions classification for journal articles were excluded to focus on the primary mechanism for disseminating coronavirus-related research. This excludes books, book chapters, and conference papers. Some of the documents classified in Dimensions as journal articles may have been news items, editorials, letters, short articles, or reviews rather than “standard” journal articles. These were retained because short-form contributions seem to play an important role in infectious disease research (Kousha & Thelwall, 2020).
The rate of increase for each subset was calculated as the percentage increase in average Mendeley readers over about 2 months: from the first date checked in April to the last date in May 2020. Although the start date could have been as early as March 21, 2020, the rate of increase was higher in the period to the end of March. April was chosen as the starting point as a conservative step, in case the end of March increase was due to a technical cause.
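The calculation can be sketched as follows, assuming that the reader counts of the same set of articles are available for the first April date and the last May date. The function name and the numbers in the usage example are invented for illustration.

```python
def percent_increase_in_average(first_counts, last_counts):
    """Percentage increase in the mean reader count between two dates.

    first_counts: reader counts of the articles on the first date checked
    last_counts:  reader counts of the same articles on the last date checked
    """
    first_avg = sum(first_counts) / len(first_counts)
    last_avg = sum(last_counts) / len(last_counts)
    return 100.0 * (last_avg - first_avg) / first_avg


# Invented counts for three articles in one subset and publication year.
increase = percent_increase_in_average([10, 20, 30], [12, 24, 36])
```

In this sketch the function would be applied separately to each subset and publication year.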
As captured by Dimensions, the volume of research about SARS has slowly decreased since 2011 (Figure 1), with a projected increase for 2020, given that the 2020 data contains only a quarter of a year (articles recorded in Mendeley on March 21, 2020). In contrast, the volume of research about MERS increased to 2015/16, then decreased, even allowing for the 2020 data containing only a quarter of a year.
The average readership for each of the three subsets of articles (excluding the COVID-19 subset) is very approximately constant across publication years, except for a substantially higher average number of readers for research published in 2020 (Figure 2). Other factors being equal, older articles should be more cited and have more readers because interest should accrue over time. Thus, the relatively static average numbers of readers per article until 2019 suggests a moderate tendency for newer articles to be more read in all three categories. In addition, articles published in 2020 attracted substantially more readers than articles published earlier. This gives a negative answer to RQ2: By mid-2020, coronavirus research from 2020 had already attracted more interest than coronavirus research from before 2020. For MERS, articles written in 2013, just after the disease was identified, tend to have attracted more readers, presumably due to their use in follow-up research.
The percentage increase in Mendeley readers over a 2-month period was calculated for each data set and year (Figure 7). Multiplying by 6 would give an estimated 12-month (annual) increase in Mendeley readers if the rate were constant over the year. For reference, a 17% increase over 2 months would equate to an annual increase of over 100%. For almost all data sets and years, the 2-month increase was above 17%, indicating an over 100% annual increase in Mendeley readers. This figure can be benchmarked against expected increases in Mendeley readers, based on prior information about the rate at which citations increase.
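The annualization arithmetic can be checked with a short sketch. The linear estimate (multiplying by 6) is the one described above; the compounded estimate is added here only for comparison and is not part of the analysis reported in this article. Function names are hypothetical.

```python
def annual_increase_linear(two_month_pct):
    """Annual estimate used in the text: six times the 2-month increase."""
    return two_month_pct * 6


def annual_increase_compound(two_month_pct):
    """Alternative (not used in the text): compound the 2-month rate six times."""
    return ((1 + two_month_pct / 100) ** 6 - 1) * 100


linear = annual_increase_linear(17)      # 17 * 6 = 102
compound = annual_increase_compound(17)  # higher than the linear estimate
```

Under either convention a 17% increase over 2 months corresponds to an annual increase of over 100%, so the threshold used above does not depend on the choice.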
Citations tend to continue to accumulate in the long term, with old articles increasing their total citation counts slowly. For example, the cited half-life of the most common source, the Journal of Virology, was 10.9 for 2019 according to the Clarivate Journal Citation Reports (the cited half-life has increased steadily from 7.1 in 2009), so the typical virology article should have attracted half of its final citation count after 11 years. Nevertheless, in the life sciences, the annual number of citations that a paper attracts seems to peak after 2 years, then gradually decrease (Adams, 2005; this is an old reference, but the average age of cited biomedical literature has been approximately constant since the 1950s: Larivière, Archambault, & Gingras, 2008) and in biology the peak for more cited papers published in 1990 is at 4 years, with the peak for less cited papers from 1990 being at 2 years (Parolo, Pan, et al., 2015), consistent with an overall average of 3 years to the citation peak. Thus, after 3 years, the annual percentage increase in total citations should be substantially below 100% (because the total number of citations could only double in years before the year with the highest annual increase). Similarly, after 4 years the annual percentage increase in total citations should be below 50% (because the previous 2 years would have had higher annual increases in citations). As Mendeley readers accumulate a year earlier than citations (Maflahi & Thelwall, 2018; Thelwall, 2017a), the annual Mendeley reader count percentage increases should be substantially below 100% after 2 years and below 50% after 3 years. Thus, SARS, MERS, and coronavirus research for every year from 2008 to 2017 has attracted an abnormally large increase in academic attention during April and May 2020, giving a positive answer to RQ1. This is especially the case for SARS research. 
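The bounds used in this argument can be written out explicitly. The notation below is introduced only for illustration, and the derivation assumes that annual citations do not increase after the peak year.

```latex
% Notation (introduced here for illustration only):
%   c_t = citations received in year t after publication
%   C_t = c_1 + ... + c_t = cumulative citations by the end of year t
%   p   = year in which annual citations peak
% Assumption: c_t is non-increasing for t >= p.
\begin{align*}
\text{growth in year } t &= \frac{c_t}{C_{t-1}}, \\
t = p+1:\quad & c_{p+1} \le c_p \le C_p
  \;\Rightarrow\; \frac{c_{p+1}}{C_p} \le 1 \quad (100\%), \\
t \ge p+2:\quad & C_{t-1} \ge c_{t-1} + c_{t-2} \ge 2c_t
  \;\Rightarrow\; \frac{c_t}{C_{t-1}} \le \tfrac{1}{2} \quad (50\%).
\end{align*}
```

With a citation peak at about 3 years, these bounds apply to citations after 3 and 4 years, respectively, and to Mendeley readers about a year earlier. The bounds are reached only in extreme cases, such as all citations arriving in the peak year, so typical values should be well below them.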
It is to be expected that the rate of increase is highest for the newest articles, with two anomalous exceptions (2009 and MERS in 2012—only one paper).
The reason for the high readership counts for SARS papers from 2020, compared to SARS papers from previous years (Figure 3), can be deduced by reading their titles (Appendix, Table A1). The top 17 SARS papers from 2020 also mentioned COVID-19 in their titles. Thus, the high readership rate for 2020 is probably due to SARS being mentioned in the context of its implications for COVID-19.
The reason for the high readership counts for MERS papers from 2020, compared to MERS papers from previous years (Figure 4), can also be deduced by reading their titles (Appendix, Table A2). For MERS, most articles mentioning COVID-19 (including its earlier names) in their titles have more readers than most articles not mentioning it, with some exceptions. Thus, again the high readership rate for MERS is partly due to research using it to illuminate COVID-19 properties, usually by comparisons or in the form of learning lessons from it. There are four exceptions in terms of highly read articles that do not mention COVID-19. Three are about treatments for MERS that mention the antiviral medication remdesivir, which has also been suggested elsewhere as a potential treatment for COVID-19. A review of MERS research is also widely read (604 readers). Only three papers mentioning COVID-19 (299, 222, 35 readers) are not in the top 22. One is a short editorial and the other two are short letters.
The main limitation of this study is its restriction to articles mentioning the diseases in their titles. Articles could be primarily about coronaviruses without including their names in the titles if they use an uncommon or early name variant. A second limitation is the use of Mendeley reader counts as a source of academic impact evidence. Mendeley may be less used in China and this could skew the results away from studies that were more read in China. Also, its use as an academic impact source may be misleading if users employ it for nonacademic purposes (such as personal safety) during the pandemic. The use of a 2-month period to assess changes is also a restriction, as the rate of increase of readers might speed or slow in the rest of 2020. It is also possible that changes in the uptake of Mendeley since 2009 have influenced the comparisons of readership counts between years, but it seems unlikely that such changes would be enough to affect the conclusions drawn.
The results show, apparently for the first time, that older SARS and MERS research has generated substantial new attention in April–May 2020, which is almost certainly due to new COVID-19 research. Nevertheless, SARS and MERS research from before 2020 has had far less academic impact, at least as reflected by Mendeley reader counts, than articles from 2020 that have reviewed or situated prior SARS and MERS research in the context of COVID-19. This issue does not seem to have been investigated for other groups of related diseases. Thus, while studies of SARS and MERS have informed COVID-19 research, this has occurred disproportionately through new articles that have explicitly made connections with COVID-19 or that have translated SARS and MERS research into implications for COVID-19. This underlines the value of this type of translating research.
Although the scholarly impact of pre-COVID-19 coronavirus research has apparently increased due to COVID-19, this increase is relatively small, given the huge number of COVID-19 publications in 2020 (by March 21; see Figure 1). The reason for this is unclear. The analysis above suggests that the value of pre-COVID-19 coronavirus research is partly channeled through articles translating it for COVID-19. It is also possible that early COVID-19 articles have sufficiently drawn upon earlier coronavirus research and moved forward with new COVID-19 findings, rendering some earlier studies no longer relevant for COVID-19. Short-form COVID-19 articles (e.g., letters, editorials, news reports in journals) may also deliberately focus on COVID-19 research to convey a succinct and clear message to a general audience.
The results confirm that older SARS and MERS research is proving useful for COVID-19 scholarship, confirming expectations. The Mendeley readership evidence also suggests that research interpreting SARS and MERS studies for COVID-19 performs a particularly useful role in academia. It is more read, on average, than the earlier primary studies that underpin it. This also suggests that the academic impact of pre-COVID-19 coronavirus research is underestimated by Mendeley reader counts because publishing scholars seem to prefer to read the interpreting articles rather than the original papers.
A practical suggestion from the findings is that, in future, when new diseases emerge that are variants of known diseases, researchers may need to prioritize publishing reviews of prior research and interpreting it in the context of the new disease. These reviews may save valuable time by reducing the need for academics and clinicians to rely on the source material to identify likely properties of the new disease. While this has happened for COVID-19, it may be useful to ensure that this is recognized as an important stage that should not be forgotten. Such reviews should be careful to draw out all the value of earlier research given the possibility that researchers will focus on papers about the new disease, with relevant earlier studies being relatively less read.
The author has no competing interests.
This research was not funded.
The processed data used to produce the graphs are available in the supplementary material (https://doi.org/10.6084/m9.figshare.12442616).
Handling Editor: Ludo Waltman