The Academic Ranking of World Universities (ARWU) is one of the best-known university rankings, recognized for its objective and reproducible methodology. In contrast, the Global Ranking of Academic Subjects (GRAS), which ranks institutions by scientific subject and is also produced by Shanghai Ranking Consultancy (SRC), introduces methodological differences that deviate from the ARWU’s objectivity. The reason is that SRC’s Academic Excellence Survey is used to define two of the GRAS’s five indicators. Specifically, the Top indicator counts publications in journals that respondents identify as top tier in their field, and the Award indicator does the same for prizes. An examination of this survey suggests potential biases in participant selection and journal identification, most prominently an Anglo-Saxon bias. There is also a risk that journal selection is influenced in some cases by undisclosed conflicts of interest, such as respondents’ involvement in editorial committees. As a result, relying on surveys instead of established bibliometric standards can lead to inconsistency and subjectivity, especially if the surveys are not rigorously conducted. Such methodologies put at risk the trustworthiness of tools that are crucial for university policymaking.

Since the early 2000s, the emergence of various ranking systems has influenced university scientific policies (Dill & Soo, 2005). This influence has prompted universities to compete for higher positions in rankings (Kehm & Stensaker, 2009). Consequently, more institutions have engaged in the so-called “rankings game” (Grewal, Dearden, & Lilien, 2008). Some of the most noteworthy of these rankings are the Academic Ranking of World Universities (ARWU), QS’s World University Rankings, Times Higher Education’s World University Rankings, and U-Multirank. One of the main issues of debate has been the methodologies used in constructing rankings, and much criticism has been directed at those based on surveys (Bowman & Bastedo, 2011). The ranking that has received the most attention is the ARWU, which is prominently on the agenda of universities and university systems, serving as a key benchmark in their strategic planning and evaluation processes. It is published annually by Shanghai Ranking Consultancy (SRC), employs a method based on objective indicators, and is completely reproducible (Docampo, 2013). Recently, publishers of rankings have tried to improve their coverage in order to give a fuller picture of the world university system. To do so, general rankings have been complemented by others dedicated to scientific specialties or faculties. Hence, in 2017, SRC introduced a thematic area-specific ranking called the Global Ranking of Academic Subjects (GRAS).

The GRAS is calculated for 54 scientific areas. Ranking by subject offers universities that are excluded from the ARWU, or that struggle to improve their positions in it, the opportunity to demonstrate academic excellence in a specific field. However, in building this ranking, SRC changed how the indicators are calculated, diverging from the ARWU’s approach. Instead of relying solely on bibliometric indicators, the GRAS incorporates additional methods, such as surveys of a select group of experts, which could introduce biases not present in the ARWU’s strictly quantitative methodology. Three of the five GRAS indicators (First Journal Impact Factor Quartile (Q1), Category Normalized Citation Impact (CNCI), and International Collaboration (IC)) are obtained from InCites, but two (Top and Awards) are defined by a survey. Specifically, the Academic Excellence Survey asks participants to identify “top tier journals” and “credible international awards” within their respective fields. A journal or award enters the calculation of these indicators if it receives more than one vote and at least 50% of the total votes, or if it was selected in the previous edition. Thus, Top is defined as the number of articles that researchers from an institution have published in journals identified as “top journals” during the period 2016–2020 (2022 edition). In this edition, the survey selected 180 top journals across 52 academic subjects. Awards denotes the number of staff from an institution who have received a “significant award” in their academic subject since 1981.
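
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SRC’s implementation: the function name `select_top_journals`, its parameters, and the journal labels are hypothetical, and the 50% share is assumed to be computed over the number of respondents in the subject.

```python
def select_top_journals(votes, respondents, previous_selection=frozenset()):
    """Apply the stated rule: more than one vote AND at least 50% of the
    votes, OR already selected in the previous edition.

    votes: dict mapping a journal label to the number of votes it received.
    respondents: number of survey participants in the subject
        (assumption: this is the denominator of the 50% share).
    previous_selection: journals selected in the previous edition.
    """
    if respondents <= 0:
        raise ValueError("respondents must be a positive integer")

    # Journals from the previous edition are carried over unconditionally.
    selected = set(previous_selection)
    for journal, n_votes in votes.items():
        if n_votes > 1 and n_votes / respondents >= 0.5:
            selected.add(journal)
    return selected


# Hypothetical subject with six respondents and one carried-over journal.
example = select_top_journals(
    {"Journal A": 5, "Journal B": 2, "Journal C": 1},
    respondents=6,
    previous_selection={"Journal D"},
)
```

Under these assumptions, only “Journal A” passes both thresholds, and “Journal D” remains because it was selected before. A subject with a single respondent could not add any new journal, because one vote never satisfies the more-than-one-vote condition.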

Whatever their purpose, surveys should be well designed and draw on a representative, unbiased sample. After analyzing the sample of participants and the results of the Academic Excellence Survey from 2017 to 2020, serious doubts arise about its representativeness. For example, only professors from universities in the ARWU top 100 can participate in the survey, which generates an important geographic bias (Table S1, Supplementary material). This contributes to widening the gap between universities and magnifies the “Matthew effect” (Münch & Schäfer, 2014), reinforcing the stratification of the university system with only a small elite having a voice. In the first survey, in 2017, 211 university professors participated; in 2020, 735 participated. During this 4-year period, professors from 19 institutions in 15 countries participated; 56% of these institutions were in either the United States (43%) or the United Kingdom (12%). These institutions represent 1,747 respondents with a markedly skewed distribution: The top five countries account for 82% of the respondents, and the United States, Australia, and the United Kingdom alone account for 70%. Furthermore, the distribution of respondents by scientific area is far from homogeneous: Some fields have a significant number of respondents (e.g., Computer Science & Engineering and Economics have 105 and 92, respectively), but in others, only one or two professors vote. In 2020, at least 16 areas had three respondents or fewer (Table S2, Supplementary material).

This distribution of the vote may create a bias when identifying the main journals in academic subject areas, which is the principal objective of the Top indicator. Several instances of bias in the choice of journals have been identified, particularly a significant Anglo-Saxon bias: 90% of journals are from the United States (62%) or the United Kingdom (28%). Moreover, some high-ranked journals—according to Clarivate Analytics’ Journal Impact Factor (JIF)—are not selected. In 2020, of the 136 journals selected, 15 were indexed in the second JIF quartile (Q2) and three in the third JIF quartile (Q3). Another key problem is consensus on which journals are the best in each field. For example, the eight top journals in Mechanical Engineering received less than 25% of the vote, three received only 13%, one was indexed in Q2, and another in Q3 (Table S3, Supplementary material). A suspicious case is found in Aerospace Engineering with the Q2 journal AIAA Journal. In 2020, it garnered votes from five out of six participants (83%). However, two respondents in this field, affiliated with the Georgia Institute of Technology and the University of Colorado at Boulder, hold positions as an associate editor and the editor of AIAA Journal, respectively. Although their individual votes are unknown and the journal’s inclusion is supported by a majority, this situation highlights a lack of measures to prevent potential conflicts of interest that could bias the results.
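
Because the individual votes are unknown, the AIAA Journal case can only be bounded. The following sketch, which is an illustrative calculation and not part of the original analysis, computes the lowest and highest vote share the journal could retain if the two respondents with editorial roles were excluded. The function name and parameters are hypothetical.

```python
from fractions import Fraction


def share_bounds_without(votes_for, respondents, conflicted):
    """Return (lowest, highest) vote share after excluding `conflicted`
    respondents whose individual votes are unknown."""
    if not 0 <= votes_for <= respondents:
        raise ValueError("votes_for must lie between 0 and respondents")
    if not 0 <= conflicted < respondents:
        raise ValueError("at least one respondent must remain")

    non_voters = respondents - votes_for
    # Worst case: as many excluded respondents as possible voted for it.
    max_conflicted_for = min(conflicted, votes_for)
    # Best case: excluded respondents fill the non-voter slots first.
    min_conflicted_for = max(0, conflicted - non_voters)

    remaining = respondents - conflicted
    lowest = Fraction(votes_for - max_conflicted_for, remaining)
    highest = Fraction(votes_for - min_conflicted_for, remaining)
    return lowest, highest


# Figures from the text: five votes, six participants, two editors.
low, high = share_bounds_without(votes_for=5, respondents=6, conflicted=2)
```

With these figures the share would lie between 3/4 and 1, which is consistent with the observation that the journal’s inclusion is supported by a majority. The concern is the absence of safeguards, not this particular outcome.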

In short, the development and use of surveys as a substitute for recognized bibliometric standards can introduce an inconsistent, subjective approach and may sometimes be susceptible to manipulation. This undermines the credibility of a tool that is widely used in university policymaking. Resorting to surveys to determine the top journals is questionable when tools such as the Journal Citation Reports or Scimago Journal Rank provide quantifiable metrics based on comprehensive citation data, allowing anyone to identify the most influential journals with ease.

Enrique Herrera-Viedma: Conceptualization, Project administration, Supervision, Writing—review & editing. Wenceslao Arroyo-Machado: Data curation, Investigation, Methodology. Daniel Torres-Salinas: Conceptualization, Investigation, Methodology, Project administration, Writing—original draft, Writing—review & editing.

The authors have no competing interests.

All data are available in the main text or the Supplementary material.

Bowman, N. A., & Bastedo, M. N. (2011). Anchoring effects in world university rankings: Exploring biases in reputation scores. Higher Education, 61(4), 431–444.

Dill, D. D., & Soo, M. (2005). Academic quality, league tables, and public policy: A cross-national analysis of university ranking systems. Higher Education, 49(4), 495–533.

Docampo, D. (2013). Reproducibility of the Shanghai academic ranking of world universities results. Scientometrics, 94(2), 567–587.

Grewal, R., Dearden, J. A., & Lilien, G. L. (2008). The university rankings game: Modeling the competition among universities for ranking. The American Statistician, 62(3), 232–237.

Kehm, B. M., & Stensaker, B. (Eds.). (2009). University rankings, diversity, and the new landscape of higher education. Sense Publishers.

Münch, R., & Schäfer, L. O. (2014). Rankings, diversity and the power of renewal in science. A comparison between Germany, the UK and the US. European Journal of Education, 49(1), 60–76.

Author notes

Handling Editor: Vincent Larivière

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.
