Abstract
For the last 50 years, the journal impact factor (IF) has been the most prominent of all bibliometric indicators. Since the first Journal Citation Reports was launched, the IF has been used, often improperly, to evaluate institutions, publications, and individuals. Its well-known, significant technical limitations have not detracted from its popularity, and they contrast with the lack of consensus over the numerous alternatives suggested as complements or replacements. This paper presents a percentile-distribution-based proposal for assessing the influence of scientific journals and publications that corrects several of the IF’s main technical limitations while using the same set of documents as the IF. Nearly 400 journals from the Library Science and Information Science and the Biochemistry and Molecular Biology categories were analyzed for this purpose. The results show that the new indicator retains many of its predecessor’s advantages and adds benefits of its own: It is more accurate, more resistant to gaming, more complete, and less influenced by the citation window or extreme observations.
1. INTRODUCTION
The best-known bibliometric indicator is the impact factor (IF) (Glänzel & Moed, 2002; Larivière & Sugimoto, 2019). A quick, simple search of the Web of Science (WoS) in October 2023 found over 2,300 papers with “impact factor” in their title. An equally rapid, cursory analysis of the document types involved reveals that 48% of the papers were editorials and 30% were articles. A category check shows that 45% of the publications about the IF were published in specialized medical journals, and only 7.3% were published in Library Science and Information Science journals. This demonstrates, first, the indicator’s huge popularity in the biomedical field and, second, its tendency to crop up in editorials, almost always in announcements of the source’s visibility, instead of as an instrument of scientific debate. In this work, visibility and influence are used interchangeably as synonyms of impact, all understood as resting on citation counts.
The IF’s popularity has skyrocketed, inspiring as much love among its defenders as hate among its opponents. The anti-IF camp is growing stronger and stronger, but it has not managed to lessen the indicator’s reputation or its prominence (Martin, 2016). The increase among critics is due mainly to misuse of the IF as an assessment instrument. This practice has created fallout for many facets of academic and scientific life, inspiring framework documents that stress the need to drop abusive IF-related practices (Declaration on Research Assessment; Coalition for Advancing Research Assessment; Leiden Manifesto). The negative effects of IF abuse and misuse, which affect researchers and journals alike, have been thoroughly documented, including their influence on changing publication and citation patterns due to institutional pressure in recruitment and promotion processes (Archambault & Larivière, 2009; Brown, 2007; Ioannidis & Thombs, 2019; Kurmis, 2003; Martin, 2016; Rushforth & de Rijcke, 2015; Sample, 2013). Curiously, many of the complaints about the IF as a yardstick for evaluating individual researchers or institutions come primarily from disciplines that have nothing to do with quantitative studies. Quantitative work has focused on trying to address the indicator’s methodological drawbacks. As the impact factor’s creator said, much of the criticism, of both IF use and methodological issues, is well founded (Garfield, 1979).
The first reference to the IF appeared in 1955, although the canonical definition of the term was published in 1972. In that first paper, Garfield linked the IF to earlier studies by Gross and Gross (1927) and Fussler (1949a, 1949b). These pioneering articles described how to use the bibliographic references printed in a journal (a chemistry journal in the former paper, and physics and chemistry journals in the latter) to create a list of the journals that a specialized library ought to subscribe to if it wanted to have good coverage of the scientific output in that discipline. The authors of these early papers felt, however, that the final list should be adapted to suit each institution’s local needs. These studies were based on the premise that a citation is associated with use of the cited material. They reasoned that such an indicator was needed due to the growing complexity and size of the scientific literature, all of which are factors affecting the identification, selection, and use of publications. In short, the method was presented as a tool to enable librarians and researchers to use the materials available in libraries more efficiently. In both cases major constraints were acknowledged. The first was that this method could fail to detect journals in which libraries might be interested. The second was that it is tricky to define the target field adequately.
Garfield also briefly mentioned Brodman’s (1944) criticism of Gross and Gross’s groundbreaking methodology from which Garfield himself drew inspiration. Brodman thought the proposal unscientific, although she acknowledged its usefulness to librarians in putting together bibliographic collections until a better method was presented.
Garfield defined his IF as a tool to improve communication between scientists and help scientists better share their ideas. Like Fussler, Garfield recognized the difficulty of choosing the right sources for correctly defining the field of study. He also came to terms with the fact that constructing the citation index as an index based on a small core of journals would mean missing many references to publications not covered by that index (Garfield, 1955).
Seventeen years later, Garfield published what he himself called his most significant paper. There he established the relationship among publications, journals, and citations as components of a communication system that facilitates the exchange of scientific and technical information (Garfield, 1972). Garfield said that citations help identify a journal’s importance or significance, although he did recognize that there are “random events” that may distort citations’ effectiveness, such as the publication of a highly cited paper having an inordinate influence on the ranking of the journal that published it. He also said that citation frequency reflects a journal’s value and the frequency of the journal’s use, although he admitted that there are valuable journals that are cited only infrequently. He noted that many variables apart from merit are at play in citation frequency and that it is nearly impossible to find a relationship between those variables and their effect on citation frequency. In the first IF formula, the impact factor was defined as “an average citation rate per published article.” The formula was almost identical to the one proposed by Martyn and Gilchrist in 1968 for evaluating British scientific journals, as Archambault and Larivière (2009) noted. Like Gross and Gross, and like Fussler, Garfield stated that the impact factor had potential value for librarians managing journal collections and for researchers who found it hard to select what to read. Garfield did expand on the traditional uses, though, and he introduced two new ones: as a means of enabling journal editors to assess their editorial policies and for “studies of science policy and research evaluation.”
A rereading of the original sources reveals the key points that help explain and understand how IF use began to deviate from what was envisioned at its creation. Ever since the Journal Citation Reports began to be published in 1975 (Heaney, 2023), the IF has been employed, by institutions as well as individuals, in ways that did not respect its originally intended use, thus showing just how poorly the tool was understood (Kurmis, 2003). Even now it is not unusual to come across calculations of an institution’s annual impact as the sum of the impacts of the journals where its papers have been published.
Seven years after his original publication describing the IF, Garfield countered some of the criticism. He acknowledged that citation analyses required a reasonable amount of methodological and interpretative effort to be useful. He insisted that citations provided information about a paper’s usefulness but not about the reasons for its usefulness; these could only be established through expert judgment. Furthermore, citation analyses did not replace qualitative evaluation; they only helped make it more objective. Consequently, any scientific evaluation based on citation analyses had to recognize that there was much about the meaning of citation rates that was not yet known, and citation ratios are especially vague bases for determining the quality of scientific papers. Moreover, the apparent simplicity of counting citations could mask the numerous subtleties associated with comparing citation counts (Garfield, 1979). Eleven years later, he spoke out again, saying that citation data had to be interpreted carefully and their limitations clearly understood when they were used for assessment purposes (Garfield, 1990).
Diverse technical and methodological inconsistencies have been identified among the limitations of the IF, including but not limited to: source selection criteria, lack of reproducibility, discrepancies between document types in numerator and denominator, the effects of document typology, an inability to represent the skewness of science, self-citations, obliteration, citation windows, utility vs. quality, geographic and language biases, uncited publications, the role played by outliers, abuse in the promotion of researchers, and indicator-inflation practices (Archambault & Larivière, 2009; Bornmann & Leydesdorff, 2017; Brown, 2007; Glänzel & Moed, 2002; Ioannidis & Thombs, 2019; Kiesslich, Beyreis et al., 2021; Kurmis, 2003; Martin, 2016; Moed & van Leeuwen, 1995; Mutz & Daniel, 2012; Neff & Olden, 2010; Seglen, 1997).
First of all, it is easy to see that there is a chasm here: On one side stands the clear statement of the limitations and safeguards involved in the use of citation analyses; on the other, the abusive use to which these tools are put by institutions, managers, and decision makers. Second, the IF has numerous methodological drawbacks from a purely technical standpoint. These drawbacks highlight the need for alternative methods that are free of such problems and can afford a more accurate evaluation of scientific publications’ influence.
2. OBJECTIVES
The objective of this paper is to present an indicator that is sensitive to skewed citation distributions and that reduces the effects of gaming the numerator and denominator of one of the most popular quotients in research evaluation, thereby addressing some of the IF’s best-known shortcomings.
To assess the validity of the new indicator, seven research questions have been posed. These questions concern the indicator’s technical aspects, its relationship with complementary indicators, and the use of citation analyses in the context of research evaluation. The questions are the following:
1. Can an indicator be found that maintains the IF’s main advantages and corrects the technical defects related to the skewed distribution of citations and the effect of outliers, thus providing a reliable representation of the complete distribution of the publications of one or more journals?
2. Can an indicator be calculated that is more robust to the influence of the citation window, journal age, and sample size?
3. Is a simpler, more intuitive, more informative alternative to the percentile-based proposals that have tried to correct the IF’s problems viable?
4. Can a graph be created for comparing journals in the same discipline while preserving the skewed nature of citation distributions?
5. Is there a suitable combination of indicators that can help characterize the comparative visibility of journals in the same discipline, enhancing the sensitivity of the result by incorporating subtleties not covered by the new indicator?
6. Will the new indicator allow journals in different disciplines to be compared?
7. Could the new indicator comply with all the “indicator design” recommendations and the characteristics defined for what is known as “responsible metrics”?
3. METHODOLOGY
3.1. Definition of Fields, Citation Windows, and Data Sources
The purpose of this paper is not only to unveil a new indicator, but also to present a clear methodology that can be reproduced using any data source compatible with citation studies. To achieve reproducibility and facilitate disciplinary comparability, the strategic decision was taken to use WoS categories and to establish the set of journals based on the original Journal Citation Reports (JCR) lists published in 2023. Thus, although the sources (OpenAlex and WoS) are very different, the results draw upon well-known, homogeneous lists. In this paper the categories of Library Science and Information Science (LIS, 86 journals) and Biochemistry and Molecular Biology (BIO, 315 journals) are analyzed. Even though more scientific areas and categories could be analyzed, these two were chosen as reference sets for pragmatic reasons.
With the starting set of journals established, it remained to determine the origin of the data. For that, the journals’ production and citation metadata from the OpenAlex November 2022 snapshot were used. This source was chosen because, in addition to being free of charge and facilitating reproducibility, OpenAlex affords full access to all the metadata needed for the analysis. So, the second stage consisted in combining the JCR journal list with the OpenAlex metadata. This combination eventually yielded 78 LIS journals and 296 BIO journals. The differences are due to journals that could not be located by their ISSN and journals for which no metadata were available.
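As an illustration, this ISSN-based matching step can be sketched in a few lines of pandas. The file names and column names below are hypothetical stand-ins, not the actual JCR export or OpenAlex schema:

```python
import pandas as pd

# Hypothetical files: a JCR category list and an OpenAlex source dump.
jcr = pd.read_csv("jcr_2022_lis.csv")          # columns: journal, issn
sources = pd.read_csv("openalex_sources.csv")  # columns: issn, source_id

# Normalize ISSNs before joining, so formatting quirks do not block matches.
for df in (jcr, sources):
    df["issn"] = df["issn"].str.strip().str.upper()

matched = jcr.merge(sources, on="issn", how="inner")
unmatched = jcr[~jcr["issn"].isin(sources["issn"])]
print(f"matched: {len(matched)}, unmatched: {len(unmatched)}")
```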
The third stage was to determine the production and citation windows. This point is controversial, because the IF’s original 2-year window has been widely criticized, as noted in the introduction. In fact, Clarivate now also publishes a 5-year impact factor alongside the original 2-year version. Both time windows were reproduced here, but for different purposes: The 2-year window was compared with its JCR counterpart, and the 5-year window was used to calculate the new indicator. The starting year for citations was 2022, so the production data covered 2017–2021 for the longer window and 2020–2021 for the shorter one. Because the OpenAlex snapshot dates from November, it does not include citations from papers published in December 2022. As in other databases, there is also a gap in the availability of 2022 papers, even in 2023.
3.2. Indicators
The fourth and final stage was the analysis of the resulting data set: 400,516 publications in BIO (187,066 in the 2-year window) and 29,496 publications in LIS (13,319 in the 2-year window). Visibility analyses were run for 6,289,715 citations in BIO (1,736,962 in the 2-year window) and 347,400 citations in LIS (94,492 in the 2-year window).
The first indicator calculated was the journal impact factor (JIF) for 2022 using the OpenAlex metadata. The Clarivate methodology was followed, but OpenAlex metadata were used to calculate the numerator and denominator. Obviously, the results of the OpenAlex and WoS IF lists are different, because the metadata are different. The similarities and differences in ranks between the two lists are shown in Section 4. The aggregate data are available in the Supplementary material.
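A minimal sketch of that calculation, under the simplifying assumption that per-paper citation counts received in 2022 have already been extracted into a table (file and column names are hypothetical; note that OpenAlex does not let us restrict the denominator to citable items, as discussed below):

```python
import pandas as pd

# One row per publication; cited_2022 = citations received during 2022.
pubs = pd.read_csv("openalex_pubs.csv")  # columns: journal, year, cited_2022

# IF 2022, 2-year window: citations received in 2022 by items published
# in 2020-2021, divided by the number of items published in 2020-2021.
window = pubs[pubs["year"].isin([2020, 2021])]
grouped = window.groupby("journal")
if_2022 = grouped["cited_2022"].sum() / grouped.size()
```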
The second indicator calculated was the new proposed indicator. The process was divided into these three steps:
First, the percentile partition for each category was calculated. Percentiles have already been used for this purpose by other researchers (Gorraiz, Ulrych et al., 2022; Leydesdorff & Bornmann, 2011; Mutz & Daniel, 2012). The procedure for establishing the percentiles was the one normally utilized by Clarivate (2018), SCImago, the CWTS Leiden Ranking, and Science-Metrix, to name just a few groups, which ranks citations according to discipline, year, and document type. OpenAlex assigns the term “article” to all publications in a journal and does not distinguish between reviews, letters, notes, editorials, etc. The implications of this fact are addressed in Section 5.
Second, unlike Leydesdorff and Bornmann’s weighted citation, Mutz and Daniel’s z-value transformation based on the standard normal distribution, or percentile shares (Jann, 2016), the new process included a second percentile partition. This second partition was calculated following the descending order of the percentiles found in the first step for each of the journals. In short, this method yields two percentiles for each publication: one corresponding to its position in the category according to its citations, document type, and year, and the other corresponding to its position in the complete distribution of papers of the journal where it was published.
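A rough sketch of the two partitions follows, using pandas percentile ranks as a simplified stand-in for the exact Clarivate-style percentile procedure. It assumes the pubs table from the previous sketch, extended with hypothetical category and citations columns:

```python
# First partition: percentile of each paper within its category and year.
# Document type is omitted because OpenAlex labels every journal item
# an "article".
pubs["p_cat"] = (pubs.groupby(["category", "year"])["citations"]
                     .rank(pct=True) * 100)

# Second partition: percentile of each paper within its own journal,
# ordered by the percentiles obtained in the first step.
pubs["p_journal"] = pubs.groupby("journal")["p_cat"].rank(pct=True) * 100
```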
Boyack (2004) defined two of the main benefits of using percentiles. First, percentiles provide stable normalization across time, enabling simultaneous comparison of publications from different years; second, they keep a few highly cited publications from distorting citation statistics. This kind of study falls within the category of a priori normalization (Glänzel, Schubert et al., 2011; Mutz & Daniel, 2012).
This second step offers three advantages over earlier approaches: It is less complex, it is more intuitive, and, most importantly, it not only enables journal visibility to be calculated considering all of a journal’s publications but also enables that whole distribution to be represented, as shown in Figure 1.
Third, the definitive indicator was calculated. For each publication, the difference between its percentile in the category and its percentile in the journal was found. If the result is 0, the publication’s visibility is the same in both partitions. If the result is positive, the publication’s visibility is higher in the category than in the journal. If, on the other hand, the result is negative, the publication’s visibility is greater in the journal than in the discipline. Consequently, the higher a journal’s positive mean value, the greater the visibility and influence of its individual publications in the field.
As Leydesdorff and Bornmann (2011) put it, “Impact has to be defined not as a distribution but as a sum.” They reasoned this way because “[t]he number of citations can be highly skewed and in this situation any measure of central tendency is theoretically meaningless.” Under this premise, for a given journal the differences between percentiles of all the journal’s publications were added up, and, to reduce the size effect, the sum was divided by the journal’s total number of publications, yielding the new indicator, the Real Influence (RI) (Eq. 1):

$$RI = \frac{\sum_{i=1}^{N} \left( \rho_{1i} - \rho_{2i} \right)}{N} \tag{1}$$

where ρ1i is the publication’s percentile in the category, ρ2i is the publication’s percentile in the journal, and N is the total number of publications in the journal.

A graph of the equation may prove more intuitive. Figure 1 shows the distribution of the visibility of the publications of the International Journal of Information Management (IJIM), the highest-ranked journal in the LIS category. Each point on the scatter plot represents an article. The x-axis indicates the article’s percentile in the category, and the y-axis its percentile in the journal. The diagonal line marks the expected value.
In this graph, the points below the diagonal line show the journal’s most visible publications in the category. The summation of the differences enables the resulting area to be calculated, showing the total visibility of the distribution. It is true that, as Leydesdorff and Bornmann say, “very different distributions can add up to the same impact.” For that reason, the graph of the proposed indicator furnishes a simple, informative, intuitive instrument that helps compare the real and full distributions of journals in the same category, to complement the mean value of each journal’s indicator. In this example, an article in the 50th percentile of the journal (y-axis) lies in the 91st percentile of the category (x-axis). In other words, 50% of the papers in the journal lie in the top 10% of the most cited LIS papers (green dotted line). The graph also illustrates how 5.05% of the publications in this journal are uncited publications (red dotted line).
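Continuing the same sketch, Eq. 1 and a Figure 1-style plot reduce to a few lines (the journal name is used only as an example, and the pubs columns are the hypothetical ones introduced above):

```python
import matplotlib.pyplot as plt

# Real Influence (Eq. 1): mean of (category percentile - journal percentile)
# over all of a journal's publications.
ri = (pubs["p_cat"] - pubs["p_journal"]).groupby(pubs["journal"]).mean()

# Figure 1-style scatter for a single journal.
j = pubs[pubs["journal"] == "International Journal of Information Management"]
plt.scatter(j["p_cat"], j["p_journal"], s=8)
plt.plot([0, 100], [0, 100], color="gray")  # diagonal: expected value
plt.xlabel("Percentile in category")
plt.ylabel("Percentile in journal")
plt.show()
```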
The third indicator was the proportion of uncited publications. Moed, van Leeuwen, and Reedijk (1999) termed this the “Uncited Factor” (UNCF).
The fourth indicator was the proportion of publications below the mean citation rate in the field (p_below_meancit). This indicator determines the proportion of papers in a journal whose citation count is below the mean for the category.
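Both complementary indicators follow directly from the same table. This sketch assumes, as in the percentile step, that the field mean is taken per category and year:

```python
# Uncited Factor (UNCF): share of a journal's papers with zero citations.
uncf = (pubs["citations"] == 0).groupby(pubs["journal"]).mean() * 100

# p_below_meancit: share of a journal's papers cited less than the mean
# citation rate of their category (and year).
cat_mean = pubs.groupby(["category", "year"])["citations"].transform("mean")
p_below = (pubs["citations"] < cat_mean).groupby(pubs["journal"]).mean() * 100
```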
The fifth and last indicator is the ranking difference between the RI and the IF (Diff rank), which shows how journals rise and fall and reveals their differences in performance. In this case, the IF was calculated for the same 5-year citation window, so as to work with the same set of documents as used for the RI.
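Given the ri series from the earlier sketch and a hypothetical if5 series holding the 5-year impact factor, the rank difference can be sketched as below; the sign convention matches Tables 3 and 4, where a positive value means the journal ranks higher under the RI than under the IF:

```python
import pandas as pd

def diff_rank(ri: pd.Series, if5: pd.Series) -> pd.DataFrame:
    """Rank difference between the RI and the 5-year IF per journal.
    Positive diff_rank = the journal ranks higher under the RI."""
    ranks = pd.DataFrame({"ri": ri, "if5": if5}).dropna()
    ranks["rank_ri"] = ranks["ri"].rank(ascending=False, method="min")
    ranks["rank_if5"] = ranks["if5"].rank(ascending=False, method="min")
    ranks["diff_rank"] = ranks["rank_if5"] - ranks["rank_ri"]
    return ranks
```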
These last three indicators help interpret the results and the visualizations of the RI, which are especially useful in comparing journals with similar values or explaining upward and downward variations between RI and IF rankings.
4. RESULTS
4.1. Impact Factor Comparison
First let us focus on the comparison of the IF found using metadata from two different sources. Table 1 lists the 30 journals with the highest IF according to the 2022 JCR in LIS and compares them with their equivalents in OpenAlex. In both cases, the original 2-year citation window was used. The results reveal some important issues about data sources, especially the relative newness of OpenAlex and the limitations of its main metadata supplier, CrossRef. As CrossRef itself acknowledges, its metadata are restricted to those data sets with DOIs that are available on DataCite. Other sources may not be adequately identified and validated (CrossRef, 2022). Furthermore, publishers’ data policies also have a significant effect, because many of them have rolled out tailored strategies targeting the specific needs of each of their journals, as regards the data scientists share as well as their publications’ metadata (Cousijn, Kenall et al., 2018).
Table 1. The 30 LIS journals with the highest JIF in the 2022 JCR (Web of Science), compared with their OpenAlex equivalents (2-year citation window)

Journal | WoS Rank | WoS Citations | WoS JIF | WoS Q | OA Rank | OA Citations | OA IF | OA Q
---|---|---|---|---|---|---|---|---
International Journal of Information Management | 1 | 19,865 | 21.0 | Q1 | 1 | 14,920 | 36.30 | Q1 |
Information and Management | 2 | 13,809 | 9.9 | Q1 | 5 | 3,774 | 13.43 | Q1 |
European Journal of Information Systems | 3 | 5,473 | 9.5 | Q1 | 3 | 2,089 | 15.95 | Q1 |
Information Processing and Management | 4 | 10,883 | 8.6 | Q1 | 2 | 8,574 | 16.03 | Q1 |
Telematics and Informatics | 5 | 8,586 | 8.5 | Q1 | 7 | 3,678 | 11.71 | Q1 |
Government Information Quarterly | 6 | 6,753 | 7.8 | Q1 | 8 | 1,976 | 11.49 | Q1 |
Journal of Management Information Systems | 7 | 9,097 | 7.7 | Q1 | 13 | 814 | 8.75 | Q1 |
MIS Quarterly | 8 | 29,364 | 7.3 | Q1 | 12 | 1,371 | 8.96 | Q1 |
Journal of Computer-Mediated Communication | 9 | 5,312 | 7.2 | Q1 | 6 | 665 | 11.88 | Q1 |
Journal of Knowledge Management | 10 | 9,844 | 7.0 | Q1 | 9 | 3,037 | 9.18 | Q1 |
Journal of Strategic Information Systems | 11 | 3,517 | 7.0 | Q1 | 15 | 456 | 8.00 | Q1 |
Journal of Enterprise Information Management | 12 | 3,684 | 6.5 | Q1 | 11 | 1,880 | 9.00 | Q1 |
Journal of Organizational and End User Computing | 13 | 1,362 | 6.5 | Q1 | 21 | 821 | 6.68 | Q2 |
Information Systems Journal | 14 | 3,865 | 6.4 | Q1 | 26 | 1,001 | 5.99 | Q2 |
Journal of the American Medical Informatics Association | 15 | 15,068 | 6.4 | Q1 | 4 | 10,164 | 14.58 | Q1 |
Information and Organization | 16 | 1,389 | 6.3 | Q1 | 10 | 399 | 9.07 | Q1 |
Journal of the Association for Information Systems | 17 | 4,885 | 5.8 | Q1 | 30 | 628 | 4.98 | Q2 |
International Journal of Geographical Information Science | 18 | 8,901 | 5.7 | Q1 | 14 | 2,502 | 8.18 | Q1 |
Journal of Information Technology | 19 | 2,516 | 5.6 | Q1 | 27 | 388 | 5.62 | Q2 |
Telecommunications Policy | 20 | 3,907 | 5.6 | Q1 | 19 | 1,887 | 7.49 | Q1 |
Information Systems Research | 21 | 12,947 | 4.9 | Q1 | 22 | 1,331 | 6.37 | Q2 |
Information Technology for Development | 22 | 1,554 | 4.8 | Q2 | 16 | 852 | 7.96 | Q1 |
Journal of Global Information Management | 23 | 1,414 | 4.7 | Q2 | 38 | 930 | 3.96 | Q2 |
Information Technology and People | 24 | 3,104 | 4.4 | Q2 | 34 | 1,011 | 4.60 | Q2 |
Journal of Health Communication | 25 | 6,850 | 4.4 | Q2 | 23 | 1,167 | 6.31 | Q2 |
International J. Computer-Supported Collab. Learn. | 26 | 1,166 | 4.3 | Q2 | 20 | 295 | 7.02 | Q1 |
Profesional de la Informacion | 27 | 2,530 | 4.2 | Q2 | 31 | 1,876 | 4.85 | Q2 |
MIS Quarterly Executive | 28 | 1,449 | 4.1 | Q2 | 46 | 156 | 3.32 | Q3 |
Social Science Computer Review | 29 | 3,645 | 4.1 | Q2 | 17 | 1,695 | 7.96 | Q1 |
Scientometrics | 30 | 20,613 | 3.9 | Q2 | 25 | 5,675 | 6.05 | Q2 |
The metadata referring specifically to citations affect indicator calculations. The reports on publisher participation in CrossRef (all time) show that 73% of Springer publications include reference lists, as do 71% of Elsevier publications, 70% of Wiley publications, 57% of Sage publications, 43% of Cambridge University Press publications, and 29% of Oxford University Press publications, to name just some of the leading publishers (CrossRef, 2023). At the journal level, 93% of the publications of the Journal of Informetrics contain references, and so do 84% of the publications of Scientometrics, 81% of the publications of the Journal of the Association for Information Science and Technology, 70% of the publications of the Information Systems Journal, 68% of the publications of the International Journal of Information Management, 62% of the publications of the Journal of Strategic Information Systems, 39% of the publications of Nature and El Profesional de la Información, and 33% of the publications of Science, to mention just some examples. The remaining percentages correspond to publications whose metadata contain no bibliographic references or that are unavailable. Such gaps in information have big effects, and the effects are considerable in cases like that of the Journal of Information Technology (JIT, 54% of publications with references), which exhibits a much lower total number of citations in OpenAlex (388) than in WoS (2,370). This suggests that JIT loses a great many citations from publications in other journals, and from JIT itself, that are not adequately covered in the source of our research.
While this point is not the only explanation of the considerable differences (bearing in mind that OpenAlex includes a larger volume of papers than WoS), it does showcase several significant things: the importance of data quality; the need to share all publication and citation metadata, in an adequately described state so they can be collected correctly; and above all the care and caution with which one must handle, interpret, and use the data from quantitative studies, which are valuable but are nonetheless useless without adequate qualitative supervision by experts.
Although it is impossible to obtain identical results when the starting metadata are so different, these differences and gaps in OpenAlex are not a drawback. First, the indicator’s methodological proposal is independent of the database used. Therefore, the method can be applied using any other data source (WoS, Scopus, Dimensions, etc.), as long as access to all the necessary metadata is available. Second, the differences help detect shortcomings in the sources and illustrate the effects that such problems have on the results. This goes to show what meticulous care analysts must take when processing data and disseminating results to avoid inappropriate uses or decisions. Even with all these constraints, the overall results are not very different, although there are some striking differences in ranks. Of the 30 journals listed in Table 1, only four differ in the OpenAlex ranking. These similarities are confirmed by running Spearman’s correlation on the entire distribution (JIF vs. IF-OpenAlex: 0.962), which shows a nearly perfect positive relationship between the two lists.
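For reference, the correlation can be reproduced with scipy, given a table holding both IF values per journal (the file and column names are hypothetical):

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical comparison table: one row per journal, both IF values.
df = pd.read_csv("if_comparison_lis.csv")  # columns: journal, jif_wos, if_openalex
rho, pval = spearmanr(df["jif_wos"], df["if_openalex"])
print(f"Spearman rho = {rho:.3f}")
```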
The data presented in Table 2 for BIO are similar to the LIS data. There are considerable differences caused by metadata diversity, compounded by the impossibility of differentiating document types in OpenAlex. In this case, only seven journals of the 74 included in the first quartile by WoS are different. Spearman’s correlation shows a very strong positive relationship between the two lists (JIF vs IF-OpenAlex: 0.944). The full lists are available in the Supplementary material.
Table 2. The highest-JIF BIO journals in the 2022 JCR (Web of Science), compared with their OpenAlex equivalents (2-year citation window)

Journal | WoS Rank | WoS Citations | WoS JIF | WoS Q | OA Rank | OA Citations | OA IF | OA Q
---|---|---|---|---|---|---|---|---
Nature Medicine | 1 | 139,574 | 82.90 | Q1 | 2 | 82,869 | 81.09 | Q1 |
Cell | 2 | 338,069 | 64.50 | Q1 | 1 | 130,008 | 98.87 | Q1 |
Signal Transduction and Targeted Therapy | 3 | 19,678 | 39.30 | Q1 | 5 | 23,800 | 31.95 | Q1 |
Molecular Cancer | 4 | 36,241 | 37.30 | Q1 | 3 | 16,401 | 47.96 | Q1 |
Molecular Plant | 5 | 24,114 | 27.50 | Q1 | 9 | 11,139 | 25.14 | Q1 |
Nature Structural and Molecular Biology | 6 | 30,445 | 16.80 | Q1 | 8 | 9,162 | 25.81 | Q1 |
Annual Review of Biochemistry | 7 | 22,840 | 16.60 | Q1 | 4 | 2,283 | 35.12 | Q1 |
Molecular Cell | 8 | 86,313 | 16.00 | Q1 | 7 | 23,794 | 26.00 | Q1 |
Trends in Microbiology | 9 | 18,957 | 15.90 | Q1 | 23 | 5,495 | 16.80 | Q1 |
Nucleic Acids Research | 10 | 282,987 | 14.90 | Q1 | 10 | 65,319 | 23.87 | Q1 |
Nature Chemical Biology | 11 | 30,143 | 14.80 | Q1 | 20 | 9,977 | 18.31 | Q1 |
Trends in Biochemical Sciences | 12 | 21,556 | 13.80 | Q1 | 24 | 4,488 | 15.97 | Q1 |
Progress in Lipid Research | 13 | 7,560 | 13.60 | Q1 | 15 | 1,378 | 19.97 | Q1 |
Trends in Molecular Medicine | 14 | 13,691 | 13.60 | Q1 | 25 | 4,485 | 15.79 | Q1 |
Cytokine and Growth Factor Reviews | 15 | 8,423 | 13.00 | Q1 | 6 | 4,098 | 27.50 | Q1 |
Experimental and Molecular Medicine | 16 | 13,641 | 12.80 | Q1 | 12 | 7,465 | 21.03 | Q1 |
Cell Death and Differentiation | 17 | 31,913 | 12.40 | Q1 | 11 | 11,869 | 22.14 | Q1 |
Natural Product Reports | 18 | 13,843 | 11.90 | Q1 | 26 | 4,098 | 15.64 | Q1 |
Plant Cell | 19 | 68,851 | 11.60 | Q1 | 34 | 9,695 | 14.07 | Q1 |
EMBO Journal | 20 | 70,290 | 11.40 | Q1 | 18 | 12,828 | 18.65 | Q1 |
Journal of Integrative Plant Biology | 21 | 10,416 | 11.40 | Q1 | 29 | 5,477 | 15.01 | Q1 |
Redox Biology | 22 | 24,642 | 11.40 | Q1 | 22 | 15,794 | 17.61 | Q1 |
Biochimica et Biophysica Acta-Reviews on Cancer | 23 | 9,303 | 11.20 | Q1 | 27 | 4,300 | 15.52 | Q1 |
Molecular Psychiatry | 24 | 33,254 | 11.00 | Q1 | 13 | 22,110 | 20.96 | Q1 |
Molecular Biology and Evolution | 25 | 64,994 | 10.70 | Q1 | 17 | 15,309 | 19.14 | Q1 |
Molecular Aspects of Medicine | 26 | 9,129 | 10.60 | Q1 | 40 | 1,874 | 12.92 | Q1 |
Molecular Systems Biology | 28 | 10,370 | 9.90 | Q1 | 19 | 3,373 | 18.53 | Q1 |
PLoS Biology | 29 | 41,783 | 9.80 | Q1 | 16 | 13,718 | 19.35 | Q1 |
Cell Systems | 30 | 8,614 | 9.30 | Q1 | 21 | 4,242 | 18.28 | Q1 |
Current Biology | 31 | 79,963 | 9.20 | Q1 | 46 | 23,042 | 11.06 | Q1 |
Examination of the citation metadata of the BIO journals in CrossRef indicates that, for example, 99% of the Antioxidants publications contain references, as do 96% of the Redox Biology publications, 89% of the Nature Chemical Biology publications, 79% of the Biochimica et Biophysica Acta-Reviews on Cancer publications, 71% of the Nature Medicine publications, 66% of the Molecular Biology and Evolution publications, 43% of the Nucleic Acids Research publications, and 29% of the Annual Review of Biochemistry publications. These percentages reveal that in the Biochemistry and Molecular Biology field citation is more frequent, that there are fewer publications without bibliographic references, and that the citation metadata of BIO journals have greater coverage than the citation metadata of LIS journals. This suggests that the big citation differences between fields cannot be explained by publisher policies alone.
4.2. Real Influence Versus Impact Factor
To compare the IF and the RI, the two indicators were calculated using the extended 5-year citation window, as already indicated in Section 3. Furthermore, this comparison is more consistent than the previous one, because the same set of OpenAlex metadata was used to determine the distributions of publications and journals for both indicators. So, how similarly do the two indicators rank journals? Table 3 presents the two lists for the 30 journals with the highest RI in LIS.
Table 3. The 30 LIS journals with the highest RI, compared with the impact factor calculated for the same 5-year window (IF-5)

Journal | RI Rank | RI | IF-5 Rank | IF-5 | Diff Rank
---|---|---|---|---|---
International Journal of Information Management | 1 | 31.76 | 1 | 48.17 | 0 |
Journal of Knowledge Management | 2 | 22.91 | 9 | 22.35 | 7 |
Government Information Quarterly | 3 | 22.24 | 3 | 26.83 | 0 |
Information Processing and Management | 4 | 19.74 | 12 | 19.44 | 8 |
Information and Management | 5 | 19.64 | 6 | 25.10 | 1 |
Journal of Management Information Systems | 6 | 18.92 | 5 | 25.28 | −1 |
Telematics and Informatics | 7 | 17.84 | 4 | 25.75 | −3 |
Journal of the American Medical Informatics Association | 8 | 17.72 | 10 | 21.42 | 2 |
International J. Computer-Supported Collab. Learn. | 9 | 17.43 | 17 | 15.86 | 8 |
Journal of Enterprise Information Management | 10 | 17.05 | 18 | 14.45 | 8 |
European Journal of Information Systems | 11 | 15.26 | 14 | 18.04 | 3 |
Journal of Computer-Mediated Communication | 12 | 15.08 | 8 | 24.02 | −4 |
Information Technology for Development | 13 | 13.63 | 21 | 13.31 | 8 |
MIS Quarterly | 14 | 12.23 | 7 | 24.17 | −7 |
Social Science Computer Review | 15 | 12.12 | 20 | 13.58 | 5 |
Information Technology and People | 16 | 11.46 | 29 | 10.68 | 13 |
Ethics and Information Technology | 17 | 11.42 | 23 | 12.60 | 6 |
Journal of Informetrics | 18 | 11.40 | 13 | 18.82 | −5 |
Information Systems Research | 19 | 10.65 | 19 | 14.44 | 0 |
Journal of Organizational and End User Computing | 20 | 10.38 | 26 | 11.99 | 6 |
Journal of Health Communication | 21 | 9.79 | 25 | 12.07 | 4 |
Journal of Strategic Information Systems | 22 | 8.92 | 2 | 30.25 | −20 |
Qualitative Health Research | 23 | 8.75 | 24 | 12.37 | 1 |
Scientometrics | 24 | 8.46 | 27 | 11.37 | 3 |
Information Systems Journal | 25 | 8.14 | 15 | 16.77 | −10 |
Information and Organization | 26 | 7.48 | 11 | 20.36 | −15 |
Knowledge Management Research and Practice | 27 | 6.87 | 37 | 8.14 | 10 |
International Journal of Geographical Information Science | 28 | 6.84 | 22 | 12.95 | −6 |
Online Information Review | 29 | 5.62 | 33 | 8.74 | 4 |
Journal of Information Technology | 30 | 5.37 | 16 | 16.12 | −14 |
Curiously enough, although the differences are considerable and can be attributed entirely to the different calculations used in each distribution, Spearman’s correlation remains strong (0.916). However, when only the 30 journals in Table 3 are studied, the Spearman correlation drops to moderate (0.684), demonstrating that the differences between the two indicators’ results are indeed considerable when the analysis includes only the publications of the most influential journals.
The correlation is even stronger for BIO journal indicators (0.923) than it is for LIS journals. Table 4 presents the 30 journals with the highest RI in the BIO specialty. As in the previous case, when only the 30 most influential journals are studied, the Spearman correlation drops to moderate (0.596). This confirms that the differences between the two indicators’ results are somewhat greater in BIO than in LIS. In the same vein, the higher the journals’ rankings, the more notable the differences in positions. This result is in line with previous findings on the strong influence of highly cited publications (outliers) on the JIF.
Table 4. The 30 BIO journals with the highest RI, compared with the impact factor calculated for the same 5-year window (IF-5)

Journal | RI Rank | RI | IF-5 Rank | IF-5 | Diff Rank
---|---|---|---|---|---
Annual Review of Biochemistry | 1 | 37.46 | 2 | 113.09 | 1 |
Molecular Cancer | 2 | 36.81 | 4 | 82.48 | 2 |
Molecular Cell | 3 | 27.00 | 6 | 54.29 | 3 |
Molecular Psychiatry | 4 | 26.64 | 16 | 34.76 | 12 |
Genome Research | 5 | 26.24 | 7 | 47.84 | 2 |
Redox Biology | 6 | 25.40 | 23 | 32.34 | 17 |
Cell | 7 | 25.26 | 1 | 129.19 | −6 |
Critical Reviews in Biochemistry and Molecular Biology | 8 | 25.16 | 25 | 30.11 | 17 |
Cell Death and Differentiation | 9 | 24.35 | 13 | 37.44 | 4 |
Molecular Systems Biology | 10 | 23.18 | 14 | 37.05 | 4 |
Oncogene | 11 | 22.88 | 27 | 29.10 | 16 |
Biochimica et Biophysica Acta-Reviews on Cancer | 12 | 22.60 | 34 | 25.48 | 22 |
Progress in Lipid Research | 13 | 22.41 | 10 | 40.11 | −3 |
Cellular and Molecular Life Sciences | 14 | 21.60 | 30 | 26.74 | 16 |
Matrix Biology | 15 | 21.58 | 17 | 34.44 | 2 |
PLoS Biology | 16 | 20.88 | 24 | 31.10 | 8 |
Molecular Plant | 17 | 20.61 | 12 | 37.98 | −5 |
Nucleic Acids Research | 18 | 20.41 | 5 | 55.36 | −13 |
EMBO Journal | 19 | 20.20 | 22 | 33.50 | 3 |
Signal Transduction and Targeted Therapy | 20 | 20.18 | 11 | 38.81 | −9 |
International Journal of Biological Macromolecules | 21 | 19.59 | 42 | 21.46 | 21 |
Nature Structural and Molecular Biology | 22 | 18.36 | 9 | 40.47 | −13 |
Experimental and Molecular Medicine | 23 | 18.00 | 26 | 29.77 | 3 |
Molecular Ecology Resources | 24 | 17.90 | 35 | 24.89 | 11 |
Molecular Aspects of Medicine | 25 | 17.79 | 19 | 34.26 | −6 |
Current Opinion in Structural Biology | 26 | 17.42 | 38 | 23.84 | 12 |
International Journal of Biological Sciences | 27 | 17.09 | 40 | 23.57 | 13 |
Current Opinion in Chemical Biology | 28 | 16.29 | 31 | 26.47 | 3 |
Antioxidants | 29 | 15.60 | 89 | 13.80 | 60 |
Cell Systems | 30 | 15.45 | 18 | 34.35 | −12 |
Figure 2 helps detect journals that undergo highly significant changes in position between the two lists. Some of them are analyzed in Section 4.4.
4.3. Real Influence, Uncited Factor, and Mean Citation
Additional indicators can be brought in to enhance the RI’s interpretability and usefulness. The IF is a journal’s mean number of citations per document. Analogously, the RI must be interpreted as the mean difference between a document’s percentile in the category and its percentile in the journal. Although, as stated before, the RI is more accurate than the IF because it uses the journal’s full distribution of publications, the average value can be very similar for very different publication distributions. For that reason, it is advisable to combine the indicator with the visualization of each journal’s full distribution, and with other indicators that help add context to its significance. Thus, the UNCF and p_below_meancit are employed in addition to the RI to illustrate complementary journal characteristics that help establish similarities and differences when only numerical indicators are available. The RI has also been calculated for the higher and lower percentiles, to help characterize higher or lower journal visibility regardless of the journal’s overall RI. For this purpose, RI t10 was calculated as the mean difference for the publications in the 10 most visible percentiles, and RI ≤ t50 as the mean difference for the publications in the 50 least visible percentiles (not including uncited publications).
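A sketch of both band-restricted variants follows, continuing the earlier pubs example. It assumes, as the reading of Figure 1 suggests, that the visibility bands are taken over the journal's own percentile axis, with higher percentiles meaning more citations; the exact band definition is our interpretation, not a specification from the text:

```python
def band_mean(g):
    # Mean percentile difference within the selected band of publications.
    return (g["p_cat"] - g["p_journal"]).mean()

grouped = pubs.groupby("journal")

# RI t10: publications in the journal's 10 most visible percentiles.
ri_t10 = grouped.apply(lambda g: band_mean(g[g["p_journal"] > 90]))

# RI <= t50: the 50 least visible percentiles, excluding uncited papers.
ri_le_t50 = grouped.apply(
    lambda g: band_mean(g[(g["p_journal"] <= 50) & (g["citations"] > 0)]))
```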
Table 5 presents the set of journals grouped by hierarchical clustering (Ward’s method with squared Euclidean distance) according to the similarities between indicators.
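The grouping can be reproduced along these lines with scipy, whose "ward" linkage minimizes within-cluster variance, the criterion usually paired with squared Euclidean distances (file and column names are hypothetical):

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical per-journal indicator table.
ind = pd.read_csv("lis_indicators.csv", index_col="journal")
# columns: RI, UNCF, p_below_meancit, RI_t10, RI_t50

Z = linkage(ind.values, method="ward")
ind["cluster"] = fcluster(Z, t=4, criterion="maxclust")  # four clusters, as in Table 5
```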
Table 5. LIS journal clusters (Ward’s method, squared Euclidean distance) by RI, UNCF, p_below_meancit, RI t10, and RI ≤ t50

Cluster | Journal | Rank | RI | UNCF | p_below_meancit | RI t10 | RI ≤ t50
---|---|---|---|---|---|---|---
C1 | International Journal of Information Management | 1 | 31.76 | 5.06 | 23.67 | 22.33 | 34.96
C2 | Journal of Knowledge Management | 2 | 22.91 | 5.48 | 42.42 | 8.44 | 34.22
C2 | Government Information Quarterly | 3 | 22.24 | 9.25 | 39.33 | 10.11 | 30.07
C2 | Information Processing and Management | 4 | 19.74 | 11.97 | 44.02 | 8.89 | 27.04
C2 | Information and Management | 5 | 19.64 | 12.68 | 44.29 | 9.75 | 28.16
C2 | Journal of Management Information Systems | 6 | 18.92 | 9.83 | 48.72 | 7.30 | 26.32
C2 | Telematics and Informatics | 7 | 17.84 | 11.04 | 48.26 | 7.90 | 24.26
C2 | Journal of the American Medical Informatics Association | 8 | 17.72 | 8.17 | 51.71 | 5.33 | 27.20
C2 | International J. Computer-Supported Collab. Learn. | 9 | 17.43 | 4.67 | 55.14 | −0.22 | 31.39
C2 | Journal of Enterprise Information Management | 10 | 17.05 | 6.38 | 55.05 | 3.91 | 28.17
C2 | European Journal of Information Systems | 11 | 15.26 | 11.52 | 56.38 | 6.79 | 23.39
C2 | Average | | 18.88 | 9.10 | 48.53 | 6.82 | 28.02
C2 | Coefficient of variation | | 12% | 30% | 12% | 43% | 11%
C3 | Information Technology for Development | 13 | 13.63 | 8.53 | 61.14 | 0.47 | 24.86
C3 | Social Science Computer Review | 15 | 12.12 | 9.12 | 63.54 | 2.48 | 21.99
C3 | Information Technology and People | 16 | 11.46 | 9.59 | 67.12 | −1.77 | 24.48
C3 | Ethics and Information Technology | 17 | 11.42 | 11.06 | 61.54 | 2.24 | 20.09
C3 | Journal of Informetrics | 18 | 11.40 | 10.14 | 65.52 | −0.60 | 22.46
C3 | Journal of Health Communication | 21 | 9.79 | 5.74 | 72.67 | −1.97 | 23.11
C3 | Qualitative Health Research | 23 | 8.75 | 8.38 | 73.34 | −2.70 | 21.69
C3 | Scientometrics | 24 | 8.46 | 9.86 | 71.77 | −2.35 | 19.88
C3 | Knowledge Management Research and Practice | 27 | 6.87 | 8.96 | 75.52 | −2.66 | 18.22
C3 | Online Information Review | 29 | 5.62 | 11.16 | 76.72 | −4.05 | 17.71
C3 | Average | | 9.95 | 9.25 | 68.89 | −1.09 | 21.45
C3 | Coefficient of variation | | 24% | 16% | 8% | 190% | 11%
C4 | Journal of Computer-Mediated Communication | 12 | 15.08 | 17.91 | 47.02 | 6.74 | 21.31
C4 | MIS Quarterly | 14 | 12.23 | 17.51 | 53.85 | 7.20 | 14.47
C4 | Information Systems Research | 19 | 10.65 | 19.15 | 56.97 | 1.49 | 18.20
C4 | Journal of Organizational and End User Computing | 20 | 10.38 | 21.31 | 55.19 | 5.15 | 12.19
C4 | Journal of Strategic Information Systems | 22 | 8.92 | 30.97 | 56.13 | 6.76 | 4.05
C4 | Information Systems Journal | 25 | 8.14 | 23.66 | 59.31 | 3.53 | 11.13
C4 | Information and Organization | 26 | 7.48 | 24.53 | 65.09 | 6.46 | 10.27
C4 | International Journal of Geographical Information Science | 28 | 6.84 | 21.29 | 65.60 | 0.63 | 12.65
C4 | Journal of Information Technology | 30 | 5.37 | 18.37 | 70.75 | 1.69 | 9.85
C4 | Average | | 9.45 | 21.63 | 58.88 | 4.41 | 12.68
C4 | Coefficient of variation | | 30% | 19% | 12% | 56% | 37%
The resulting clusters by indicator value similarity could be associated with epistemological differences and differences in communication and citation practices between the groups in the LIS category. Thus, C1 and C2 contain the journals focusing on information management. C4 holds the journals where information is analyzed primarily from a technological perspective. C3 is the most heterogeneous cluster, characterized by journals oriented toward other values of information (ethics, social implications, communication, or qualitative and quantitative analyses). These results are striking, because the first quartile is dominated by Information Science journals. The journals traditionally assigned to Library Science, on the other hand, are relegated to the following quartiles. This suggests that in recent years, considerable differences have arisen in the two specialties’ production and citation practices. A natural division of the category would facilitate more precise delimitation and better representation of the journals in both disciplines, reducing the effects of the differences detected.
For the UNCF, the average percentage of uncited papers is 30% in LIS, while it is 16.7% in BIO. The average proportion of publications below the mean for the discipline is 79.7% in LIS and 78.2% in BIO. Only 9% of LIS journals and 10.5% of BIO journals keep that proportion under 50%.
These results confirm known findings. First, citation habits vary greatly between disciplines, as revealed by the different proportions of uncited documents in BIO and LIS. Second, citation distributions are highly skewed in all disciplines: On average, only 20% of articles receive more citations than the average for their category in the two fields examined.
Furthermore, analysis of the correlations of the RI and the IF with the other indicators (UNCF, p_below_meancit, RI t10, and RI ≤ t50) suggests that the need for supplementary indicators is less pressing for the RI, which correlates significantly with all of them, than for the IF.
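As a sketch of how such a check can be run (with invented numbers standing in for the real per-journal indicator table), a rank correlation is the natural choice for skewed bibliometric data:

```python
import numpy as np
from scipy.stats import spearmanr

# One value per journal; these are toy numbers, not the study's data.
ri      = np.array([21.6, 18.9, 15.1, 9.9, 9.5, 5.6])
p_below = np.array([38.9, 48.5, 47.0, 68.9, 58.9, 76.7])

rho, p = spearmanr(ri, p_below)  # robust to skew: uses ranks, not raw values
print(f"Spearman rho(RI, p_below_meancit) = {rho:.2f} (p = {p:.3f})")
```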
Table 6 presents the set of BIO journals grouped into clusters according to their similarities in terms of these indicators.
| Cluster | Journal | Rank | RI | UNCF | p_below_meancit | RI t10 | RI ≤ t50 |
|---|---|---|---|---|---|---|---|
| C1 | Annual Review of Biochemistry | 1 | 37.46 | 2.92 | 15.21 | 33.62 | 28.42 |
| | Molecular Cancer | 2 | 36.81 | 1.81 | 16.59 | 31.15 | 31.54 |
| | Average | | 37.14 | 2.37 | 15.90 | 32.39 | 29.98 |
| | Coefficient of variation | | 1% | 23% | 4% | 4% | 5% |
| C2 | Molecular Cell | 3 | 27.00 | 7.15 | 27.18 | 23.23 | 17.72 |
| | Molecular Psychiatry | 4 | 26.64 | 5.35 | 31.57 | 16.93 | 23.52 |
| | Genome Research | 5 | 26.24 | 3.70 | 33.16 | 12.91 | 28.50 |
| | Redox Biology | 6 | 25.40 | 3.52 | 37.32 | 10.89 | 30.32 |
| | Cell | 7 | 25.26 | 6.91 | 32.28 | 28.33 | 14.73 |
| | Critical Reviews in Biochemistry and Molecular Biology | 8 | 25.16 | 2.33 | 39.54 | 9.11 | 32.26 |
| | Cell Death and Differentiation | 9 | 24.35 | 3.54 | 37.22 | 11.15 | 25.12 |
| | Molecular Systems Biology | 10 | 23.18 | 3.30 | 36.81 | 12.24 | 21.97 |
| | Average | | 25.40 | 4.48 | 34.39 | 15.60 | 24.27 |
| | Coefficient of variation | | 5% | 37% | 11% | 41% | 23% |
| C3 | Oncogene | 11 | 22.88 | 3.21 | 42.02 | 6.68 | 27.67 |
| | Biochimica et Biophysica Acta-Reviews on Cancer | 12 | 22.60 | 5.37 | 42.15 | 9.35 | 29.20 |
| | Cellular and Molecular Life Sciences | 14 | 21.60 | 3.15 | 45.88 | 7.08 | 27.91 |
| | PLoS Biology | 16 | 20.88 | 4.12 | 44.72 | 8.95 | 23.52 |
| | Nucleic Acids Research | 18 | 20.41 | 2.20 | 49.86 | 8.02 | 26.49 |
| | International Journal of Biological Macromolecules | 21 | 19.59 | 2.94 | 49.12 | 5.13 | 25.87 |
| | Experimental and Molecular Medicine | 23 | 18.00 | 3.09 | 55.15 | 6.31 | 24.94 |
| | Molecular Ecology Resources | 24 | 17.90 | 6.35 | 48.24 | 6.02 | 22.06 |
| | International Journal of Biological Sciences | 27 | 17.09 | 2.09 | 56.95 | 3.52 | 24.98 |
| | Antioxidants | 29 | 15.60 | 3.23 | 60.25 | 1.89 | 24.44 |
| | Average | | 19.66 | 3.58 | 49.43 | 6.30 | 25.71 |
| | Coefficient of variation | | 12% | 36% | 12% | 35% | 8% |
| C4 | Progress in Lipid Research | 13 | 22.41 | 16.88 | 29.87 | 17.48 | 12.22 |
| | Matrix Biology | 15 | 21.58 | 10.41 | 38.46 | 9.05 | 23.98 |
| | Molecular Plant | 17 | 20.61 | 8.71 | 38.19 | 15.77 | 16.72 |
| | EMBO Journal | 19 | 20.20 | 7.69 | 41.20 | 11.02 | 18.05 |
| | Signal Transduction and Targeted Therapy | 20 | 20.18 | 7.61 | 43.75 | 15.30 | 19.87 |
| | Nature Structural and Molecular Biology | 22 | 18.36 | 14.14 | 36.77 | 17.67 | 9.39 |
| | Molecular Aspects of Medicine | 25 | 17.79 | 13.03 | 46.36 | 10.81 | 18.43 |
| | Current Opinion in Structural Biology | 26 | 17.42 | 11.31 | 46.25 | 6.36 | 20.51 |
| | Current Opinion in Chemical Biology | 28 | 16.29 | 13.60 | 48.69 | 8.32 | 19.01 |
| | Cell Systems | 30 | 15.45 | 14.22 | 45.54 | 11.71 | 9.32 |
| | Average | | 19.03 | 11.76 | 41.51 | 12.35 | 16.75 |
| | Coefficient of variation | | 12% | 25% | 13% | 31% | 28% |
In the case of BIO, although Oncology and Molecular Biology applied to medical and clinical specialties prevail in clusters C1 and C3, and Biochemistry, Genetics, and Molecular Biology are the leading tags in clusters C2 and C4, each cluster consists of journals sharing homogeneous topic areas, with no clear differences in epistemology or in production and citation practices.
4.4. Journal Comparison
This final subsection is devoted to a comparison of several journals from the LIS and BIO categories. Figure 3 exemplifies the unequal distribution of percentile differences in the publications of two BIO journals whose RIs are similar precisely because the RI is the mean of those differences. In orange, the distribution of Antioxidants (rank = 29) shows that the journal's total value comes from an accumulation of considerable differences in the less visible percentiles. Cell Systems (rank = 30) presents a distribution in which the main differences occur in the most visible percentiles. The visualization matches the indicators in Table 6: Antioxidants' average RI for the less visible half of its papers (24.4) is higher than its overall RI (15.6), and its average in the 10 most visible percentiles is much lower (1.9). Cell Systems, by contrast, has an RI in the top 10 percentiles (11.7) close to its overall RI (15.4) and a lower RI in the less visible half (9.3).
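For concreteness, the following Python sketch shows how these quantities can be computed under our reading of the text: the RI as the mean of per-publication differences between category and journal percentiles, RI t10 restricted to the journal's 10 most visible percentiles, and RI ≤ t50 to its less visible half. The percentile convention (treatment of ties and of uncited papers) is an assumption; the paper's exact scheme may differ.

```python
import numpy as np

def percentile_ranks(values, reference):
    """Percentile (0-100) of each value within a reference distribution,
    taken here as the share of the reference lying at or below the value."""
    ref = np.sort(np.asarray(reference, dtype=float))
    v = np.asarray(values, dtype=float)
    return 100.0 * np.searchsorted(ref, v, side="right") / ref.size

def ri_indicators(journal_cites, category_cites):
    """RI and its top/bottom components for one journal."""
    p_cat = percentile_ranks(journal_cites, category_cites)
    p_jou = percentile_ranks(journal_cites, journal_cites)
    diff = p_cat - p_jou                        # per-publication difference
    return {
        "RI": diff.mean(),                      # overall mean difference
        "RI t10": diff[p_jou > 90].mean(),      # 10 most visible percentiles
        "RI <= t50": diff[p_jou <= 50].mean(),  # less visible half
    }
```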
Leydesdorff and Bornmann (2011) suggested a way to address these situations by weighting each difference according to its percentile. Their approach favors the higher-visibility publications, but, although citation is a proxy for the quality of a publication, the weighting process acts indiscriminately, over- or underrepresenting publications' value based entirely on their position. As already noted, the value of this indicator rests on a publication's use or usefulness; qualitative analyses are required to identify other values. Another alternative, proposed by Mutz and Daniel (2012), transforms skewed distributions into normal distributions using z-values, thus radically altering the original representation. In our case, we have chosen to present the data with as little distortion as possible and to add indicators for a more accurate characterization of the journal's rank, respecting the real citation data as much as possible.
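To illustrate the idea of percentile weighting only (this is a hypothetical variant of the sketch above, not Leydesdorff and Bornmann's exact scheme):

```python
import numpy as np

def weighted_ri(p_cat, p_jou):
    """Hypothetical variant: weight each percentile difference by the
    publication's category percentile, so differences among highly
    visible papers dominate the average."""
    p_cat = np.asarray(p_cat, dtype=float)
    p_jou = np.asarray(p_jou, dtype=float)
    return float(np.average(p_cat - p_jou, weights=p_cat / 100.0))

# Unweighted mean of (20, 10, -10) is 6.67; weighting by category
# percentile pulls the result toward the top papers: 11.67.
print(weighted_ri([90, 60, 30], [70, 50, 40]))
```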
Graphic comparison also helps explain the change in rank leadership in BIO, shown in Figure 4, despite the large difference in productivity between Annual Review of Biochemistry (171 publications) and Cell (3,315 publications). The differences in the areas traced by the curves below the diagonal line are obvious. The indicators in Table 6 confirm the visualization: The leading journal owes its position to the extremely high visibility of its publications in the highest percentiles, its very low proportion of uncited papers, and its high percentage of papers cited above the category average. These results illustrate the advantages of using percentiles, as described in Section 3.
By the same token, it is a simple matter to compare LIS journals. Figure 5 shows three journals: (a) the journal ranked first (International Journal of Information Management, IJIM), (b) a journal that rises from Q2 according to its IF to Q1 according to its RI (Journal of Informetrics, JoI), and (c) a journal that falls from Q2 according to its IF to Q3 according to its RI (El Profesional de la Información, EPI).
The indicators in Table 5 corroborate the sharp differences in the areas traced by the curves around the diagonal line. IJIM shows a substantial area below the diagonal; JoI's curve falls short of the expected visibility (lying above the diagonal) in the higher percentiles; and EPI has very few publications below the diagonal line, and only in the less visible percentiles.
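A plotting sketch along the following lines can reproduce the geometry of these figures. Everything here is an assumption for illustration: the data are synthetic, and the axis orientation is chosen so that, as in the text, the area below the diagonal corresponds to publications more visible in the category than their position within the journal would predict.

```python
import numpy as np
import matplotlib.pyplot as plt

def pct(values, reference):
    """Percentile (0-100) of each value within a reference distribution."""
    ref = np.sort(np.asarray(reference, dtype=float))
    return 100.0 * np.searchsorted(ref, values, side="right") / ref.size

rng = np.random.default_rng(0)
category = rng.lognormal(1.0, 1.2, 5000)   # synthetic skewed field distribution
journal = rng.lognormal(1.8, 1.0, 300)     # synthetic high-visibility journal

p_cat = pct(journal, category)             # position in the category
p_jou = pct(journal, journal)              # position within the journal
order = np.argsort(p_cat)

plt.plot([0, 100], [0, 100], "k--", lw=1)  # equal-visibility diagonal
plt.plot(p_cat[order], p_jou[order], label="synthetic journal")
plt.xlabel("Percentile in category distribution")
plt.ylabel("Percentile in journal distribution")
plt.legend()
plt.show()
```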
5. DISCUSSION AND CONCLUSIONS
5.1. About Citation Impact Indicators’ Role in Research Evaluation
For over 50 years, the impact factor has led the pack of bibliometric indicators, despite its many recognized technical and usage problems. The widespread acknowledgment of those limitations contrasts with its creators' and custodians' steadfast refusal to correct or minimize them. As a result, movements have been under way since 2012 to promote the rational use of quantitative studies (DORA, Leiden Manifesto, CoARA) and even to call for their elimination (Ioannidis & Thombs, 2019). The most plausible explanation for this inaction is the economic reward that comes with the IF's prestige (Archambault & Larivière, 2009). These rewards are directly proportional to the damage done to the scientific ecosystem by actors who have used the IF for purposes for which it was never intended. It is impossible to put a price tag on the damage caused by IF misuse or abuse, but the institutional and personal consequences have been studied (Shu, Liu, & Larivière, 2022).
In this sense, what was defined in 1972 as "perhaps the most important application" of the IF turns out to be contradictory in view of the indicator's wide-reaching, proven limitations. If the IF cannot reflect subtleties in citation, it is clearly too vague and inappropriate a tool for evaluating individuals or institutions, so why postulate it as a useful indicator in research evaluation? Research evaluation was the new, important application that distorted the way the IF was used and made it an unreliable tool from the start. For proof of the harmful effects of IF abuse, one has but to look at the difficulties hampering promotion and recognition in many countries and institutions (Archambault & Larivière, 2009; Hicks, Wouters et al., 2015; Larivière & Sugimoto, 2019), the pressure placed on young researchers to publish in highly ranked journals, alterations of publication and citation patterns (Brown, 2007), and misconduct by journals, institutions, and researchers scrambling for a place in the first quartile, to mention just a few examples (Kurmis, 2003; Martin, 2016).
Although Clarivate's analysts and experts know the IF is not a good instrument for research evaluation (Clarivate, 2023), the company understandably finds it hard to discard a tool that has earned it so much success and recognition. Furthermore, although Clarivate cannot be blamed for the misuse of its product, neither can it remain indifferent to such abuses (Archambault & Larivière, 2009). At the same time, the IF is an old, outdated indicator that, as Martin (2016) says, is the historical legacy of a time whose technical and technological difficulties were very different. The IF has not kept pace with technical and technological evolution; it remains as unchanging as it is prominent in the face of any more advanced, more accurate rival. Objectively, it is a tool as dangerous in unskilled hands as it is inaccurate and easy to manipulate.
The deterioration of the scientific reporting ecosystem cannot be chalked up entirely to the IF, but the IF is obviously one of the more pernicious agents of erosion. Directly or indirectly, it is behind the temptation that leads journals to value quantity over quality and to take advantage of the need that some national, regional, and local scientific systems have created with their recruitment, promotion, and recognition policies, unconsciously chorusing the famous mantra of “publish or perish” and the more recent “impact or perish” (Biagioli, 2016). Also, the last few years have seen the appearance of new publishers and numerous journals that display predatory behavior, meaning that they place article processing charge (APC)-generated income before the opinions of qualified reviewers. That is a different debate and ought to be addressed separately, but it shares some of the effects attributed to IF abuse, effects that have driven authors and editors away from factors that used to matter a great deal, such as a journal’s topic area, relevance for the field, fairness and rapidity of the editorial process, the probability of acceptance, publication lag, and the cost of publishing (Seglen, 1997).
5.2. Advantages of RI
The new method proposed in this paper answers the first three research questions in "Objectives" with a "Yes." First, it has several advantages over the IF and over earlier percentile-based proposals. Second, it aspires to be the long-sought, as-yet-unestablished alternative: one that improves on some of the IF's legitimate applications, abandons the uses to which the IF should never have been put, and corrects the IF's defects and limitations (Brodman, 1944; Kurmis, 2003; Martin, 2016).
To do so, using the same data set as the IF, the RI keeps some of the IF's documented advantages (e.g., comprehensibility, stability, robustness, and availability) while correcting some of its main technical defects. The RI offers greater accuracy, because it is based on a two-dimensional percentile partition; better resistance to vulnerabilities, manipulation, and fraud, because a few highly cited publications have a negligible effect on the indicator; greater completeness, because it takes all of a journal's publications into consideration and even enables the specific distribution of each aggregate to be represented; less influence from the citation window, publication age, or journal productivity, because it is size independent; and less susceptibility to the effect of outliers.
In addition, the RI involves less complexity than other percentile-based alternatives, such as I3, percentile shares, or z-values, because it normalizes gross citation counts without adjustment procedures, new scales, estimates, or probabilities. Another major advantage, which also contributes to the indicator's comprehensibility, is its visualization of the double percentile partition: a simple, intuitive, highly descriptive complement to the numerical RI. As with the h-index, the graphs show the combined distribution of citations (percentiles) and publications. In the case of the RI, they show at a glance the effect of publications that are cited often, sometimes, little, or never on each journal's distribution. Where the RI affords a substantial improvement over the h-index is that its distributions are size independent and age independent (Bornmann et al., 2012). Moreover, they extend the traditional use of percentiles to locate publications within the citation distribution of their field, combining it with the publications' position in the citation distribution of the analyzed aggregate (in this case, journals).
The RI also preserves another one of the advantages Glänzel and Moed (2002) attributed to the IF: It is a very easy indicator to replicate, regardless of the source used, provided that the replicating researcher can access the full citation metadata. In addition, because it is based on percentiles, it can be used in reference sets from diverse sources (Leydesdorff & Bornmann, 2011), such as Scopus, Google Scholar, patent databases, WoS, Dimensions, or OpenAlex, and with different size distributions.
The incorporation of additional indicators to better characterize each individual journal boosts the RI's accuracy and usefulness, because a single indicator would be hard pressed to represent prestige and usefulness based only on a journal's position within its disciplinary context (Kiesslich et al., 2021; Mutz & Daniel, 2012). This answers the fifth research question. In this sense, the correlation analyses show that the RI depends less critically on complementary indicators, which further suggests that it is more robust than the IF.
Moreover, the RI proposal is aligned with Waltman's (2016) recommendations for indicator design: It offers a distinct alternative with added value compared to the numerous indicators proposed thus far; it considers the theoretical principles of citation counts; it recognizes and respects appropriate uses of citation-based indicators; and, although grounded in the traditional use of citation counts, it explores a new open access source to create, share, openly debate, and improve tools for evaluating the visibility of publications and journals, so that the resulting indicators can be used sensibly in both research policy and evaluation, as Archambault and Larivière (2009) urged.
The RI also fulfills the other characteristics defined for responsible metrics: robustness, as demonstrated numerically and visually; humility, because robustness is not perfection, and the RI recognizes and exemplifies its own limitations; transparency, because it applies known, widely used methods, and its data set is publicly accessible free of charge; diversity, because it employs additional indicators to refine its meaning and application, in addition to considering and respecting disciplines' different practices and uses; and reflexivity, because, in light of its limitations, it lists the negative side effects that misuse could cause and how research managers, decision makers, research stakeholders, and policymakers can avoid them. In this sense, the RI's transparent data access favors robustness and the scientific community's confidence in quantitative analyses (Wilsdon, Allen, & Belfiore, 2015). This answers the seventh research question.
The main advantage of the RI and the double-percentile diagrams is probably precisely this: They illustrate and quantify, for a given journal, how each publication is positioned relative to the whole set of publications of the same journal and the whole set of publications of the same subject category. Consequently, the higher the (positive) RI, the more visible the journal's publications are in the category relative to their expected percentile in the journal's own distribution.
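In symbols (our formalization, not notation taken from the paper), for a journal with $n$ publications whose $i$th paper occupies percentile $P_i^{\mathrm{cat}}$ in the category distribution and $P_i^{\mathrm{jnl}}$ in the journal's own distribution, this reading gives

$$\mathrm{RI} = \frac{1}{n}\sum_{i=1}^{n}\left(P_i^{\mathrm{cat}} - P_i^{\mathrm{jnl}}\right),$$

so each positive term corresponds to a point on the favorable side of the diagonal in the double-percentile diagram.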
5.3. The RI’s Limitations
Despite all efforts to reduce its drawbacks, the RI still wrestles with some of the problems attributed to the IF, which are not very different from the methodological difficulties that were found almost 100 years ago by Gross and Gross (1927) and Fussler (1949a, 1949b) and documented again a little over a decade ago by Mutz and Daniel (2012). The RI is sensitive to topic classification. In this sense, article-level classifications, such as those suggested by Leydesdorff and Bornmann (2011), are preferable and would offer more accurate, fairer results than journal-level classifications.
Document types have also been shown to have clear, strong effects on the calculation of the RI. The RI treats the kinds of documents that tend to be highly cited (reviews) as being as relevant as the types normally considered noncitable (letters, editorials, news). The problem is exacerbated by the use of OpenAlex, which currently does not distinguish between subtypes but identifies all journal publications as articles. Nor does the RI address shortcomings attributable to data sources, such as geographic and linguistic biases, or errors and gaps in coverage or citations. The RI also still has trouble translating usefulness (or publication use) into quality: From a qualitative point of view, many citation-related factors mean that a more highly cited publication is not necessarily better than a less cited publication in the same field.
In addition, because the RI is the average of the differences in the visibility of a journal's publications, it does not by itself reveal the influence of each individual publication (Seglen, 1997). Visualization of the full distribution does, however, because it shows each publication's position in the context of the journal that published it and the category to which it belongs. This answers the fourth research question. As Leydesdorff and Bornmann (2011) indicate, assigning a percentile to each individual publication makes it possible to conduct analyses at the micro level (authors, departments, schools), the meso level (institutions, journals), and the macro level (countries, regions).
It is important to remember that, although percentiles enable disciplines to be compared, comparisons of journals from different categories are not always recommended. Yes, percentiles are size independent, but they may be insufficient to eliminate the effect caused by differences in publication and citation practices. This answers the sixth research question.
6. PRACTICAL IMPLICATIONS
The results suggest that, compared to the IF, the RI furnishes major improvements, important corrections, and new contributions. But the RI’s main beauty lies in its visual representation of the double percentile partition. It offers a clear, accurate, understandable tool. For example, in Figure 4, it is easy to grasp why the Annual Review of Biochemistry has a higher RI than Cell. The fact that the figure is easily understood does not mean it can be easily interpreted, however. For example, an inexperienced analyst might be tempted to assert that the Annual Review of Biochemistry is a better journal than Cell. Not only does its RI value say so, but it can be clearly seen in the curve drawn by the distribution of its publications. The analyst could not be more wrong. The Annual Review of Biochemistry, as its name indicates, only publishes reviews, and, as said before, reviews tend to receive a higher number of citations. This does not mean all reviews are highly cited; only those that the scientific community judges interesting and useful and therefore uses are highly cited. Reviews have the advantage of compiling the main publications on a topic and making it easier for a reader to gain quick familiarity with the different perspectives on that topic in a single document. However, from a purely scientific standpoint, reviews are hardly original.
An experienced analyst would not only know the role reviews play in the scientific process but would also know how to read the graph, which shows two journals that achieve high visibility not by chance but probably because they have editors in chief who are experts in the field, a professional editorial team, expert reviewers recruited from the cream of the international crop, and publications from authors with new, groundbreaking scientific proposals and ideas. A journal's success, seen as readership, does not depend on an indicator but on hard, intense, prolonged, well-done work. When an author uses IFs to choose which journal to publish in, that merely reveals the author's ignorance, the pressure the system puts the author under, or both at once. Many journals, likewise, take advantage of the JIF to promote their own (supposed) quality and attract new submissions. A librarian would also know that, budget allowing, the library ought to acquire both titles.
Likewise, Figure 5 illustrates how El Profesional de la Información falls to the third quartile by RI as compared to its newfound place in the second quartile according to its IF. An inexperienced editor would feel the immediate urge to reject the new indicator for lowering the journal’s rating so sharply. An inexperienced analyst would hasten to assert that the journal is significantly worse than the International Journal of Information Management or the Journal of Informetrics. An expert analyst, however, would know that El Profesional de la Información traditionally publishes its articles in Spanish, that its editorial policy is very strict about accepting bibliometric papers (which also tend to receive more citations), and that, as the journal’s name indicates, its publications are chosen more for their applicability to the professional field than for their scientific bearing, as happens in some medical and nursing journals. Because of these characteristics, this journal’s publications are not as visible, but to people working in the field, they are of great value, a value that citations cannot measure. Its articles are used heavily by university teachers and students, and young researchers frequently submit their first papers to it. A librarian knows that a specialized library’s collection cannot do without this journal.
This leads to a simple reflection. Waltman said in his 2016 review that “having technically sophisticated citation impact indicators is absolutely essential in some situations.” By the same token, the technical sophistication of the interpretation of the results yielded by citation indicators is also essential in some situations, and it must not be masked by the apparent clarity and comprehensibility of said results. In other words, although citation-based visibility indicators are adapted to end users’ needs and expectations, end users may be unable to use them correctly unless they are trained to do so.
Citation indicator misuse has led to an overreaction, probably fueled by misinterpretations of calls to action from international coalitions and declarations that promote good practices in citation indicator use, or by numerous researchers' suggestion that the IF be eliminated. This is what Torres-Salinas, Arroyo-Machado, and Robinson-Garcia (2023) call "bibliometric denialism," illustrated with several examples of how European public funding institutions have spread the concept. Dead dogs don't bite: This solution is about as effective as cutting down all the forests to prevent forest fires. Properly used quantitative indicators are vital, because they provide objectivity and reduce the high costs of qualitative evaluations. Basing research evaluation on qualitative aspects alone is not only slow and costly but can stray into something much more dangerous when subjectivity morphs into arbitrariness. In the third decade of the 21st century, the solution involves "metrics literacy" (Haustein, Woods et al., 2023). Efforts must be aimed at training people to use these tools, rejecting the idea that all instruments are self-explanatory, simple, and easy for any policymaker to wield to solve whatever management problem happens to be on the table in a matter of minutes. Quantitative analyses still provide fundamental, though not exclusive, support for qualitative reviews; their role has never been to replace the qualitative side (Wilsdon et al., 2015). Another point to keep in mind is that there are no bad indicators, only bad indicator uses.
Furthermore, if citation analyses and rankings are going to keep doing the harm described in the introduction, if they are going to keep giving editors an excuse for fraudulently boosting their visibility, if they are going to cause behavioral changes in author and journal citation and production, if they are going to force authors to engage in unethical practices (salami publication, self-citation, self-plagiarism, etc.), if they are going to keep fueling authors’ obsession over publication in and citation of high-quartile journals only, if they are going to keep being used as the sole tool for recruitment, promotion, recognition, or funding, if they are going to keep encouraging shortcuts and research fabrication or falsification, perhaps we should advocate rejecting the use of these indicators for research evaluation.
Although the RI seems to be a more objective, accurate, robust measurement than the IF, it cannot be used indiscriminately for any comparison, nor may its results be extrapolated to anything outside its strictly recommended uses. To do otherwise would be to repeat the mistakes made with the IF, which was conceived as a tool to improve communication among scientists and the sharing of their ideas but has become a major contaminant of the scientific ecosystem, modifying the way in which communication and idea sharing take place and unleashing undesired negative side effects (Osterloh & Frey, 2015).
To conclude, the RI does not aspire to become an all-in-one indicator, because it is currently impossible for any one measurement to capture all the complexity of science or the diverse values that can characterize a scientific publication. Although the advantages of the RI over the IF are numerous, it must never be forgotten that the RI too is an average. Fortunately, the representation of the double percentile partition helps place and characterize each individual publication within its disciplinary context. The RI demonstrates that a journal is as visible as all its publications, not just a few highly cited ones. The results suggest that this could be an important step forward in research metrics.
7. FURTHER RESEARCH
Future research must do the following:
- Calculate the RI with metadata from sources other than OpenAlex, to confirm the real incidence of the lack of document type details and of some journals' coverage problems in the results.
- Analyze the consequences of changes in journal positions, expanding to other scientific categories, such as Medicine, Physics, Chemistry, and Engineering, to complete this initial analysis.
- Evaluate the suitability of journal-level classifications, such as the JCR's, as opposed to probably more accurate, fairer solutions based on article-level classifications. In fact, the majority of the Library Science and Information Science journals in the first quartile of the 2023 JCR list show a marked inclination toward Management or Information Systems from a technological standpoint. Cross-fertilization can also be employed as a strategy for climbing the JCR list, if not in one category, then in another.
- Check whether analysis of full citation distributions would enable journals to be characterized according to distribution shape and unusual citation patterns to be detected.
- Determine what factors make a journal more visible and influential in its category (journal scope, citations received from other categories, editorial policy, open access, etc.).
- Apply the RI, with the necessary reservations, to different aggregates, such as countries, institutions, or individuals, to expand on the information currently offered by tools such as Clarivate's author impact beamplot.
ACKNOWLEDGMENTS
We would like to thank Professors Javier Ruiz-Castillo and Alonso Rodríguez-Navarro for their valuable comments and suggestions on a very early version of this paper. We would also like to thank two anonymous reviewers at Quantitative Science Studies for their helpful and valuable corrections and recommendations.
AUTHOR CONTRIBUTIONS
Antonio Perianes-Rodríguez: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Visualization; Writing—original draft; Writing—review & editing. Bianca S. Mira: Data curation; Investigation; Visualization; Writing—review & editing. Daniel Martínez-Ávila: Formal analysis; Investigation; Writing—review & editing. Maria Cláudia Cabrini Grácio: Investigation; Methodology; Writing—review & editing.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
This work was partially supported by the Comunidad de Madrid-Spain under the Multiannual Agreement with UC3M in the line of Excellence of University Professors, project number EPUC3M02. The doctoral dissertation of Bianca S. Mira is partially funded by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) Grant No. 88887.831119/2023-00.
DATA AVAILABILITY
The comprehensive raw data set used in this study is freely available at the OpenAlex website. The aggregated data sets are available as supplementary material at Zenodo with DOI https://doi.org/10.5281/zenodo.10869819 (Perianes-Rodríguez, Mira et al., 2024).
Notes
DORA: the first recommendation states “do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions” (https://sfdora.org/read/).
The third commitment of CoARA recommends “abandon inappropriate uses in research assessment of journal- and publication-based metrics, in particular inappropriate uses of Journal Impact Factor (JIF) and h-index” (https://coara.eu/agreement/the-agreement-full-text/).
REFERENCES
Author notes
Handling Editor: Vincent Larivière