The rise of responsible metrics as a professional reform movement: A collective action frames account

Abstract Recent years have seen a rise in awareness around “responsible metrics” and calls for research assessment reforms internationally. Yet within the field of quantitative science studies and in research policy contexts, concerns about the limitations of evaluative bibliometrics are almost as old as the tools themselves. Given that many of the concerns articulated in recent reform movements go back decades, why has momentum for change grown only in the past 10 years? In this paper, we draw on analytical insights from the sociology of social movements on collective action frames to chart the emergence, development, and expansion of “responsible metrics” as a professional reform movement. Through reviewing important texts that have shaped reform efforts, we argue that hitherto, three framings have underpinned the responsible metrics reform agenda: the metrics skepticism framing, the professional-expert framing, and the reflexivity framing. We suggest that although these three framings have coexisted within the responsible metrics movement to date, cohabitation between these framings may not last indefinitely, especially as the responsible metrics movement extends into wider research assessment reform movements.


INTRODUCTION
Recent years have seen a rise in discussions and campaigns around "responsible metrics" and wider calls for research assessment reforms. Since the 2010s, calls for change have accelerated, with reform champions calling on multiple research actors (researchers, universities, funders, policymakers, publishers, bibliometric producers, and so on) to alter their ways. Interventions such as the San Francisco Declaration on Research Assessment (DORA) (DORA, 2013), the Leiden Manifesto (Hicks, Wouters et al., 2015), and the Metric Tide (Wilsdon, 2016) are considered important statements that have sought to steer (or "nudge") research actors into reconsidering their relationship with quantitative performance indicators in research assessment.
More recently, umbrella labels such as "research assessment reform" and "responsible research assessment" have emerged, which align responsible metrics with other reform agendas, including open science, research integrity, and diversity and equity (Curry, de Rijcke et al., 2020; Wilsdon, 2021). Although the emerging assessment reform movement is transnational in its intended scope, it has only partially resonated among research actors internationally (Tijssen, 2020), and no doubt will not be on the radars of all research actors.

The paper is structured as follows. In the next section we introduce further the core concepts and arguments from the sociology of social movements on collective action frames. We then briefly outline our approach to reviewing assessment reform texts, before outlining in greater depth the master framing and the three framings that we argue have most informed and shaped the responsible metrics collective action frame. We then analyze how these framings have variously featured and interacted with one another in important texts associated with the responsible metrics movement (DORA, the Leiden Manifesto, the Metric Tide report), their receptions, and how, more recently, the responsible metrics movement has been extended and translated into broader reform agendas ("responsible research assessment") that encompass values and concerns from parallel reform movements such as research integrity and open science. We end by reminding readers that reform movement framings are rarely settled permanently but are subject to ongoing events, negotiations, and contestations.

HOW SOCIAL MOVEMENTS PURSUE CHANGES
In sociological research, a social movement is "a persistent and organized effort involving the mobilization of large numbers of people to work together to either bring about what they believe to be beneficial social change or resist or reverse what they believe to be harmful social change" (DeFronzo & Gill, 2019, p. 27). "Social movements" is a broad label incorporating many different forms of collective action with varying aims, from religious and cultural movements to revolutionary or resistance movements. One of the most common forms of social movement is the reform movement, which does not aim to overthrow an existing system or government, but is nonetheless dissatisfied enough to seek some significant economic, social, or political change to an existing order (DeFronzo & Gill, 2019).
The responsible metrics movement is not the only professional scientific reform movement to have emerged in recent times: Responsible research assessment, open science, and research integrity are other notable examples (Brundage & Guston, 2019; Penders, 2022). Penders (2022, p. 3) suggests that recent scientific reform movements tend to "foreground values that never were strangers to science such as honesty, transparency and accountability … yet seek[s] to embed them into changed or improved scientific processes and instruments." These movements call on researchers to restore or reconfigure the "etiquette" of their practices and will often appeal to core professional values (e.g., fairness, quality, accountability) and a sense that the current science system has gone awry, is "severely disturbed," or is perhaps even "broken" (Derksen & Field, 2022; Penders, 2022). Though the movement promises wider societal benefits (by generating more socially responsive and sustainable academic research systems), the targeted agents of responsible metrics reforms are fellow professionals and research system actors, with little interest, let alone heated controversy, likely to be stirred in wider society; hence the name professional reform movement (Frickel, 2004b, p. 459).
Our analysis draws on the concept of collective action frames (Benford & Snow, 2000; Frickel, 2004a) from social movements research to articulate how the responsible metrics reform agenda has evolved as it has gathered momentum internationally. Collective action frames are a means of analyzing the course and character of social movements, and are our core focus here, rather than (say) resource mobilization or political opportunity processes. Note also that our aim is not to determine whether the responsible metrics reform movement has helped to reconfigure research assessment practices on the ground in universities or funding agencies, a contentious issue for which there is mixed evidence (Chen & Lin, 2017; Pontika, Klebel et al., 2022; Rice, Raffoul et al., 2020).

Quantitative Science Studies 881
Collective action frame accounts conceptualize social movement actors as "signifying agents actively engaged in the production and maintenance of meaning for constituents, antagonists, and bystanders or observers" (Benford & Snow, 2000, p. 613). This signifying work is approached through the term "framing," a contentious process whereby a movement generates and promotes shared meanings ("interpretive frames") about social conditions and practices "that not only differ from existing ones but that may also challenge them" (Benford & Snow, 2000, p. 614). The products of this framing activity are known as collective action frames.
Collective action frames are constructed through the core tasks of diagnosis (problem identification and attribution) and prognosis (proposed solutions, correctives, or ways of negotiating the problems). These socially constructed vocabularies appeal to varying motives for others to follow the movement, including urgency, severity, propriety, or some combination thereof. There is of course a moral dimension to such framing elements, which often draw distinctions between good and bad, and between in-groups and out-groups. As social movements often straddle complex multiorganizational, sometimes multi-institutional fields, where other social movements also come and go, intramovement and intermovement conflicts can occur (Benford & Snow, 2000). Processes of attributing causality, blame, or responsibility often generate tensions and conflict within a movement, and even where there is agreement on the source of a problem, consensus does not necessarily follow on the appropriate correctives.
Framing processes evolve over time, particularly as a social movement expands across organizational networks and encounters new actors and discourses. Forms of frame mutation include frame amplification (some issues come to be highlighted or accented as more important than others), frame extension (a social movement's interests and frames extend beyond their original focus to incorporate issues and concerns that animate potential adherents and recruits) (Benford & Snow, 2000), and frame translation (whereby the issues and concerns of one social movement are presented as essential for other movements and interested actors to pass through in order to realize their own goals and interests) (Frickel, 2004a).
Together, this package of concepts provides a valuable means of accounting for how efforts to "responsibilize" bibliometrics have emerged and evolved over time. However, we do not seek to exaggerate the reach or successes of the responsible metrics movement: On the contrary, the collective action frames literature helps highlight the limitations of the movement's framings in resonating with certain actors and mobilizing them into action. For example, collective action frame theorists also point to counterframings, which resist, challenge, or simply ignore framings projected by a social movement. As such, collective action frame concepts help to make visible some of the ongoing struggles and limitations that the responsible metrics reform movement seems likely to face going forward, both within the movement and externally.

ANALYTICAL REVIEW APPROACH
This analytical review is based on desk-based research of publications, announcements, statements, reports, and websites pertaining to evaluative bibliometrics and research assessment reforms from the 1970s until today. Source materials were gathered as evidence of efforts by various actors to intervene in debates and discussions around appropriate uses of bibliometrics for evaluation purposes, and in some instances to set out an explicit agenda for change. The materials were identified through our own prior knowledge of evaluative bibliometrics, further reading and web searches, and discussions with colleagues and peers. This was an interpretive, rather than a rigid and systematic, approach to identifying texts, guided by whether they added meaning to our understanding of this phenomenon. A list of the position papers, statements, reviews, and resources promoting assessment reforms that were covered in this review is provided in the Appendix.
As researchers, we believe that data do not speak for themselves but need to be given meaning by analysts. Our analysis involved separate close readings of the selected texts, note taking, and shared online conversations and review sessions. We annotated texts with high-order concepts from collective action frames, such as diagnosis, prognosis, frame amplification, frame extension, frame translation, master-frame, and so on. This allowed us to map and chart the changing contours of the movement across texts over time. This was an iterative process that followed the constant comparative approach, involving moving backward and forward between the materials and our own readings, interpretations, discussions, and experiences as researchers to critically identify and sharpen the collective action frames. Our findings have been written up as a composite narrative across three sections: on the three collective action frames underpinning the responsible metrics movement; how these feature in key texts that have been central to the growth of the movement; and how the responsible metrics movement and its frames have extended and translated into a broader assessment reform movement.
Although we attempt to analyze these texts as sociologists, we cannot claim to be completely "detached" or "outsider" observers. As individuals we have worked more or less closely with some of the authors of the research assessment reform texts reviewed in this article, while working at CWTS, Leiden University (see Competing interests statement). We, like all researchers, are products of our surroundings, meaning that our work comes out of particular times and circumstances. If we had not been located where we were when some of these responsible metrics texts were produced, we perhaps would never have gained the curiosity or motivation to conduct this piece of research or formed some of the impressions that we did. Ultimately, we believe prior experiences, relations, and knowledge are not weaknesses to be played down but an inbuilt feature of interpretive social science research. Our primary objective in this analytical review is to critically apply the sociology of collective action frames to prominent statements and texts that support and symbolize this reform movement. In presenting this rich and complex history, we aim to privilege originality and meaningfulness over completeness and replicability (Boell & Cecez-Kecmanovic, 2014).

THE THREE FRAMES OF THE RESPONSIBLE METRICS MOVEMENT
We present three ideal types of framing, which we claim have shaped key texts and interventions in the responsible metrics movement (and the later, expanded research assessment reform movement). The three framings set out distinct problems and attributions ("diagnosis" in the collective action frames lexicon) and solutions ("prognosis") in relation to evaluative bibliometrics. Each prognosis proposes alternative values and practices as solutions to the evaluative "bibliometrics problem" (variously conceived). Each framing consists of clusters of ideas and values recurrently used together to help construct problems and solutions. The proposed solutions are rarely presented as being qualitatively new, but instead as values already present in practice (at least in part) yet marginalized or threatened by the current regime (cf. Penders, 2022, on the "rediscovery" narratives of contemporary science reform movements). In the texts we analyzed, two or more of these ideal types often coexisted.
Before outlining substantial differences between the three framings that cohabit under the "responsible metrics" umbrella, we argue that each has emerged over time in response to a master-framing: the managerial-realist framing.


The Managerial-Realist Framing of Evaluative Bibliometrics
In what we have termed the managerial-realist master frame, bibliometrics are projected as suitable and effective proxies for measuring research quality and impact, be they of individuals, groups, departments, institutes, research fields, or countries and regions. For accounts of the emergence of this way of thinking in 1970s science policy in select countries, see Chubin and Hackett (1990), Csiszar (2023), and Wouters (1999). We recognize this master-framing as part of a more general rational organizing myth in modern societies: that evaluating and comparing quality and impact can be rationally and efficiently administered across disciplines and organizations through standardized, objective measurement systems (Dahler-Larsen, 2012).
Numbers carry an aura of precision and rationality in modern societies (Desrosières, 1998;Porter, 1996), and have been depicted by social theorists as a defining form of knowledge within Modernity (e.g., Poovey, 1998).
Evaluative bibliometrics have promised more transparent, equitable, efficient, and truthful windows into scientific quality and impact than peer review. We label this rational myth a master-framing because it is not movement-specific but rather a generic frame (Benford & Snow, 2000) that features widely across multiple policy contexts, more so following the rise of "New Public Management" styles of governing in many countries since the 1980s (Dahler-Larsen, 2012). The managerial-realist framing of evaluative bibliometrics thus belongs to a story that some managers, policymakers, and scientists have projected about how evaluation processes and systems proceed, or how they wish them to proceed (Wouters, 1999). Although this framing has no doubt been advanced at times by zealous policymakers, bureaucrats, and academics, we see it as also being evoked strategically by its critics, as a major, urgent threat to good research evaluation and healthy research systems.
We will now set out the three action frames (metrics skepticism, professional-expert, and reflexive) that have shaped responsible metrics as a reform movement, and consider how each of them seeks to construct and counter the managerial-realist frame.

Metrics Skepticism Framing
The metrics skepticism framing encompasses many broad and specific concerns, united in being largely incredulous about the value of quantitative measurement of research performance, and in being critical of the politics and effects of bibliometrics on research systems. The specter of the managerial-realist framing has troubled actors in research systems since the early introduction of evaluative bibliometrics in the 1970s and 1980s, even when the presence of bibliometrics was relatively small scale compared to today (see, for example, Collins, 1985). Weingart (2005) notes there have long been backlashes from within scientific communities against efforts to introduce evaluative bibliometrics, pointing to methodological objections and sociological conflicts about power and control of research evaluations, which he claims have been perennial features of how bibliometric tools have been received by some academic researchers. Although multiple reasons have been advanced as to why quantitative performance indicators should not lead (or even inform) research evaluations, such criticisms were not able to find (or latch onto) a concerted reform movement until the responsible metrics movement emerged in the 2010s. By this time, a number of critical social science research-informed critiques of bibliometrics and their effects on evaluation practices had emerged in response to the rise of international rankings and the "audit explosion" of the 1990s and 2000s in many national research systems (Power, 1997).
Skepticism towards bibliometrics has often been informed by humanistic and social constructivist research philosophies, which privilege the language and methods of "social," "theory," "political," and "context" above "performance," "measure," and "results" (Kang & Evans, 2020). Research on audit cultures (Strathern, 2003) has been influential in arguing that in neoliberally oriented systems of new public management, quantitative indicators are used to exert forms of control over academic research and researchers' working lives. Commonly articulated fears of bibliometrics' adverse impacts on research systems include Goodhart's Law (when a measure becomes a goal, it ceases to be a good measure), the McNamara fallacy (making decisions on quantitative information at the expense of other inputs), and the streetlight effect (bibliometrics measure what can be counted, not what counts). Critically oriented social science literature has claimed that the expansion of performance measures has led to growing pressures on individuals, leading to burnout and unhealthy work cultures; the narrowing of research results and outputs at the expense of research diversity; and the marginalization of certain research topics, groups, and career paths that do not satisfy dominant performance models (de Rijcke, Wouters et al., 2016). Critical readings of the streetlight effect depict bibliometrics as narrowing the policy and managerial view of research down to visible performance indicators that uphold the existing political economy, shining light only on topics already hegemonic in the science system and the economy (Ràfols, 2019).
Though some within the arts, humanities, and social sciences have on occasion mobilized technical criticisms of bibliometrics to justify resistance, not least unsatisfactory database coverage of their research outputs (Franssen & Wouters, 2019), for the most part the object of critique tends to be quantification as a "logic" (e.g., Burrows, 2012; Shore & Wright, 2015). In such fields, quantitative performance indicators are often considered reductive and contrary to the richness, diversity, and multidimensionality of research quality (Franssen, 2022; Nästesjö, 2021), and as providing flawed representations of what is said to be measured (Aksnes, Langfeldt, & Wouters, 2019).
Independently of critical traditions in the social sciences and arts and humanities, prominent natural science spokespersons later emerged to criticize and warn of the corrosive effects of quantitative performance indicators in their fields (Alberts, Kirschner et al., 2014; Sample, 2013). The backlash against bibliometrics has thus come in many forms, from disparate professional communities. All have in common a preference for "qualitative" peer review and "expert judgement" over numbers, and position bibliometrics as risky objects that threaten the health and legitimacy of research systems. Skepticism towards the notion that bibliometrics have any value in assessing research has been an important framing input into debates and calls for responsible metrics.

Professional-Expert Framing
In response to the growing presence of, and backlashes towards, bibliometric tools in academic research systems, more concerted, focused debates emerged in the field of scientometrics in the 1990s (van Raan, 1996). As part of a "regulatory science" that both studies and produces quantitative indicators (Wouters, 1999), scientometricians saw it as their professional duty to educate wider science communities about these tools: to "set the record straight" against inflated managerial-realist claims for evaluative bibliometrics, while pushing back against full-blooded metrics skepticism. The professional-expert framing that has resulted is a largely technical diagnosis and prognosis, with rigorous, scientific methods believed to hold the key to addressing the social problem of evaluative bibliometrics (for a critique of this framing, see de Rijcke and Rushforth [2015]). Expertise in the professional-expert framing is binary: There is the expert professional knowledge of scientometricians about the design and properties of bibliometric tools and their limits, and then there is the expert epistemic knowledge of evaluators about a given domain of scientific research under assessment (Moed, 2007; van Raan, 1998). It is the responsibility of evaluators, as users of bibliometric tools, to educate themselves about the uses and limitations of the tools. Terminology such as "misuse" and "abuse" features noticeably in these accounts, suggesting that a duty of care is being flouted by some evaluators when using bibliometric tools (Hammarfelt & Rushforth, 2017).
Arguably the most notable recent example of professional-expert responses to criticisms of bibliometrics is the notion of "metric wiseness." According to this perspective, "all scientists should become knowledgeable about indicators used to evaluate them. They should know the publication-citation context, the correct mathematical formulas of indicators being used by evaluating committees, their consequences and how such indicators can be misused" (Rousseau, Egghe, & Guns, 2018, p. 1). Here the problem of bibliometric evaluation is attributed not to the measures and indicators in themselves, but to a general lack of knowledge of how to use them. What is called for is therefore not a general questioning of the methods employed, but for researchers to "become metric-wise" (Rousseau et al., 2018). Contrary to the metrics skepticism framing, which suggests weakening the influence of bibliometrics, this counterframing emphasizes that bibliometrics can be legitimate tools in research assessments, as long as they are of the "correct" kind (for a recent defense of this position, see Bornmann, Botz, and Haunschild (2023)).
Criticisms of bibliometrics are not ignored in such accounts; it is acknowledged that indicators used in citation analysis "… are not perfect; but neither is peer review" (Rousseau et al., 2018, p. 7). So-called "informed peer review," whereby quantitative indicators inform, but do not supplant, expert qualitative judgement, is presented as the optimal solution (Butler, 2007). Given this context, the terms "caution," "proficiency," and "competence" have become the watchwords of the professional-expert framing, more than the recently fashionable "responsibility" (Petersohn, 2021). Importantly, on this view, many of the problems of evaluative bibliometrics can be avoided if we trust the experts. Such views are expressed, for example, in relation to citation databases, where direct use is discouraged: The "own databases" built by "professional experts" should be consulted for evaluation purposes rather than the "raw data" provided by databases such as Web of Science or Scopus (Rousseau et al., 2018, p. 284). It should be acknowledged that the need for collegial influence and contextual understanding is emphasized in this account, and critical literature on the use of quantitative indicators is frequently cited. Yet the main conclusion is that "bibliometric expertise is needed, and counting is a necessity." The idea of "metric wiseness" appeals to the notion of specialists sharing their wisdom with the uninformed. In this account, the nonexperts, using for example "raw data," are to blame for misuse of indicators, a problem that increasing academics' knowledge and skills in bibliometrics will address. Hence, expertise, knowledge, and techniques are the answers to the current problems of bibliometric evaluation: No strong reform is needed, just an intensification of the current ambitions of the bibliometric community. Better metrics, combined with more educated users, hold the solutions to reform, rather than altogether abandoning or dismissing bibliometrics (Bornmann et al., 2023).

Reflexive Framing
Recently, the reflexive framing has gained momentum, influenced by parallel "responsibilization" movements such as Responsible Research and Innovation (Owen, Macnaghten, & Stilgoe, 2020). Both Responsible Research and Innovation and the responsible metrics movement share the common influence of science and technology studies, particularly research on technology assessment and public engagement. Like Responsible Research and Innovation, the reflexive framing towards bibliometrics emphasizes the need to guide evaluators and other research system actors and their practices from a distance, by setting out broad parameters of "responsible" action (Davies & Horst, 2015).
In the reflexive framing, bibliometric tools can in principle be used as supporting tools in evaluations, if good practices are kept constantly in mind. Codified standards and principles of good practice are important tools within the reflexive frame, and are part of a counterframing to what we have labeled the metrics skepticism framing, seeking to shift the terms of debate "from a defensive one (e.g., 'one cannot use these indicators in the humanities') to a specification of the conditions under which assessments can be accepted as valid, and the purposes for which indicators might legitimately be used" (Leydesdorff, Wouters, & Bornmann, 2016, p. 2132). Unlike the metrics skepticism framing, the reflexive framing is less concerned with legislating whether a given bibliometric tool is "good" or "bad" than with impressing upon evaluators a moral obligation to practice their evaluations with care (the professional duty to act as a custodian of the apparatus of evaluation) and responsiveness (willingness and motivation from within to acknowledge the ambiguity of research quality, enter into dialog with other actors, and listen to others' beliefs and practices) (Dorbeck-Jung & Shelley-Egan, 2013; Pellizzoni, 2004). Democratization is another dimension of responsible research and innovation (Owen et al., 2020) that has been translated into the reflexive framing of evaluative bibliometrics, via calls for diversifying the range of actors included in the construction and use of bibliometric tools (Ràfols, 2019). Last, inclusion (or diversity) is an important principle of the reflexive framing, particularly the imperative to widen what counts as valuable research contributions beyond publication and citation indicators.
In the reflexive framing, the appropriateness of bibliometrics can only be worked out in specific contexts by autonomous social agents committed to self-examining whether their practices are proceeding carefully and responsively. Bibliometrics are not one-size-fits-all tools that diffuse into evaluation settings but traveling standards to be translated by responsible evaluators to "fit the local context, resulting in heterogeneity and context-dependent spread" (Reymert, 2021, p. 53).

DORA
The San Francisco Declaration on Research Assessment (DORA) was one of the first organized attempts to raise awareness of problems associated with uses of research indicators for evaluative purposes and to call on research system actors to reform what its authors considered bad practices. Following a special session at the American Society for Cell Biology (ASCB) conference in December 2012, the DORA statement was written by a committee of attendees and released in May 2013. At the time of writing this review, DORA's text remains unchanged, calling on organizations and individuals to sign the declaration and commit to its principles for good evaluation. The statement includes 18 principles, some of which are general, and others aimed towards specific actors within the research system charged as accountable for change, including individuals, universities, funders, and publishers. Important general recommendations that signatories are expected to adhere to are not to use "… journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist's contributions, or in hiring, promotion, or funding decisions" and instead to "assess research on its own merits rather than on the basis of the journal in which the research is published" (DORA, 2013). Today DORA is viewed as a broad initiative for changing how research is evaluated, yet it was launched in the discipline-specific setting of cell biology. This context is a likely explanation for why the Journal Impact Factor (JIF), which is used across many fields but is often thought to have an elevated position in the biosciences (Rushforth & de Rijcke, 2015), is the main metric targeted in the text. DORA is not presented as an expert account but rather as an account from "concerned citizens" in the field. Nonetheless, some of the arguments that the DORA statement presents against the use of the JIF align with professional-expert framings of the indicator's problems, by commenting on technical and statistical deficiencies, not simply the consequences of misuse. Thus, one prognosis to counter the influence of the JIF is to "make available a range of article-level metrics." Yet, simultaneously, DORA's solution of reading articles qualitatively on their own merits, rather than relying on quantitative journal indicators, mirrors the reflexive framing insofar as it calls on evaluators' sense of moral responsibility to be responsive towards the content of each research contribution, rather than assuming that what constitutes "good" can be known independently of reading it. The statement calls for more inclusive definitions of what constitutes a worthwhile academic contribution (albeit the examples provided underscore the disciplinary origins of the statement and its imagined audience): "The outputs from scientific research are many and varied, including: research articles reporting new knowledge, data, reagents, and software; intellectual property; and highly trained young scientists." This statement now reads as an early precursor to later developments in the responsible research assessment movement to consider more holistic criteria of academic contributions (see Section 6).
Simultaneously, DORA can also be interpreted as a statement that is skeptical of metrics, insofar as it calls for abandonment of the JIF. The statement draws on many resources to criticize this measure, including technical arguments that we might associate with the professional-expert framing, though it is not clear that DORA is trying to recover the authority of bibliometric knowledge and expertise from "mis-uses." In our reading, the original DORA statement is difficult to characterize entirely within a skeptical framing, as it does not address whether bibliometrics per se are inappropriate, only the shortcomings of the JIF and "other" journal-based indicators (which it does not name) used to judge the merits of research articles.
The Declaration has taken on something of a life of its own since its publication, as a symbol for various arguments and calls to action. Since its publication, DORA has been interpreted as a specific critique of the JIF and its influence on science, and/or a more general critique of the use of metrics, which takes the JIF as its primary example but may also take aim at other contentious indicators such as the h-index (which the original DORA statement does not mention). Reception towards DORA has varied: There are clearly many enthusiastic supporters who follow its official Twitter (X) account, cite the statement positively, attend community engagement events, or have signed it as individuals (or encouraged their organizations to do so). There has also been criticism and counterframings of the Declaration by some within the world of biomedicine, including former Nature editor Philip Campbell (Anderson, 2013). A post by the founder of the influential Scholarly Kitchen blog countered DORA's diagnosis of the JIF's impacts upon the research system thus: "There's a deeper problem with the DORA declaration, which is an unexpressed and untested inference in their writing about how the impact factor may be relevant to academic assessment groups. They assert repeatedly, and the editorials expand on these assertions, that tenure committees and the like connect the impact factor to each individual article, as if the article had attained the impact stated. I don't believe this is true. I believe that the impact factor for a journal provides a modest signal of the selectivity of

Quantitative Science Studies 888
the journal - given all the journal titles out there, a tenure committee has to look for other ways to evaluate how impressive a publication event might be" (Anderson, 2013). Broadly speaking, though, DORA has become an important reference point and symbol of an emerging global reform movement, and signing it has become something many institutions have been keen to communicate to their stakeholders as evidence of their commitment to being (or becoming) responsible actors.

Leiden Manifesto
Compared to DORA, the Leiden Manifesto offers a more focused defense of evaluative bibliometrics in general. The Leiden Manifesto was published in Nature in 2015, written by a small group of scientometricians and science studies scholars following a meeting at the 2014 Science and Technology Indicators conference held in Leiden, the Netherlands. It laid out 10 good-practice principles for the appropriate uses of quantitative indicators, addressing audiences of evaluators and those being evaluated. As the authors themselves noted, most of the recommendations codified within the Manifesto were based on forms of technical and practical wisdom already well articulated within the field of scientometrics, but which had been, they argued, less well articulated beyond it. The text's introductory section provides a diagnosis of evaluative bibliometrics' influence upon academic research systems. This shares many of the narrative features of the skeptical framing, including concerns about the ubiquity of performance indicators, a graph on "impact factor obsession," and a strong assertion that "The problem is that evaluation is now led by the data rather than by judgement" (Hicks et al., 2015). Words such as obsession imply that quantitative measures can pose an irrational threat to reasoned, expert judgment (Leckert, 2021).
Despite setting out contemporary research systems' quantitative metrics problem in the introduction, the remainder of the text does not align with the skeptical framing's prognosis (whereby full-scale delegitimation and abandonment of metrics is warranted). The authors instead state that uses of bibliometrics for evaluation purposes are "usually well intentioned, not always well informed, often ill applied" - a statement that introduces the professional-expert framing of users' knowledge deficits as an important source of current bibliometric problems. Elements of the professional-expert framing put forward to address this diagnosis include recommendations that evaluators avoid "misplaced concreteness and false precision" of indicators, citing a well-known criticism of reporting the JIF to three decimal places, and that evaluators "account for variation by field in publication and citation practices." Calls for evaluations to ensure "robust statistics," "data quality," and "normalized indicators" also feature, as do terms such as "abuse" and "mis-application" - vocabulary that is common within the professional-expert frame.
The Manifesto combines information on technical limitations of bibliometrics with calls for evaluators to be more mindful of their limitations and effects within the research system, thereby eclectically combining the professional-expert and reflexive framings. As an intervention in a debate, the Leiden Manifesto can be read partly as professional-experts communicating prepackaged technical knowledge to lay-persons, while also supporting "self-scrutinization" (Leckert, 2021, p. 9869), with a logic of coproducing more responsible academic citizens (or citizen bibliometricians). Consistent with the reflexive frame, the 10 principles are presented as modest, simplified rules that can help actors go on in the complex world of research evaluation. A feature of the responsible research and innovation movement, which the authors of the Leiden Manifesto also draw on, is equating responsibility with accountability: "We offer this

distillation of best practice in metrics-based research assessment so that researchers can hold evaluators to account, and evaluators can hold their indicators to account" (Hicks et al., 2015). The text not only calls for appropriate use of bibliometric indicators, but also asks evaluators to be more inclusive of diverse indicators of research quality that go beyond bibliometrics, a feature that would later be amplified in assessment reform statements (see Section 6). Further instances of inclusiveness are calls to develop new indicators that better capture diverse forms of research, particularly outputs published in non-English-language outlets less well covered by existing bibliometric databases: "Metrics built on high-quality non-English literature would serve to identify and reward excellence in locally relevant research" (Hicks et al., 2015). There is a circumscribed form of democratization called for, in stating that those being evaluated via bibliometric methods and data should be able to check the data being used. However, calls for democratization are not extended to the peer review process (e.g., the Manifesto does not challenge what constitutes an expert, nor does it call for widening participation of nontraditional groups as expert reviewers).
The reflexive framing dimensions of the Manifesto have not resonated with all in the scientometrics field. David and Frangopol (2015) provided a counterframing to the Manifesto by doubling down on the message that users are the ones responsible for problems surrounding evaluative bibliometrics, not the tools or, by extension, the field of scientometrics. Another counterframing that emerged was the aforementioned "Metric-Wise" argument (Rousseau et al., 2018), whereby the expert scientometrician provides the lay evaluator with clear, solid knowledge to learn and to implement (with metric wiseness offered to reformers as an alternative umbrella label to organize under, instead of responsible metrics). So far, these counterframings have not restrained the global circulation of the Leiden Manifesto, which has been a relatively effective intervention compared with earlier attempts to address problems associated with evaluative bibliometrics that relied mostly on the professional-expert frame. This is likely in part because the genre of the manifesto of best-practice principles is a lighter and more resonant intervention for mainstream academic and policy audiences than the technical textbooks, arguments, and courses through which professional-expert framings have been predominantly mobilized.

The Metric Tide
The Metric Tide report was initially commissioned to address a policy question posed in 2014 by the Higher Education Funding Council for England (the body then responsible for distributing the United Kingdom's higher education funds): Should the next installment of the Research Excellence Framework (REF; the United Kingdom's periodic national research evaluation exercise) become metrics driven rather than peer review driven? Unlike the Leiden Manifesto, which assumes bibliometrics are already a general presence in manifold evaluation settings, the Metric Tide was written not only for evaluators but also for a policy and management audience, about what was to happen to a particular evaluation exercise scheduled for a particular time. In addition, the stakeholder consultation process led to an expansion of this scope, to review the impacts of metrics upon research systems more generally (Wilsdon, 2017).
The report contained an executive summary with recommendations and conclusions, which cited the report's commissioned independent literature review (see our Competing Interests statement) and supplementary correlation analysis sections as evidence. The executive summary concluded that the introduction of bibliometrics into the REF was not, for now, appropriate. In doing so, the executive summary draws on all three framings: the skeptical framing, the professional-expert framing, and the reflexive framing.

In delivering its conclusion, the report positioned itself as a spokesperson on behalf of the research community and its concerns about bibliometrics: "Across the research community, the description, production and consumption of 'metrics' remains contested and open to misunderstandings" (p. viii). In supporting this finding, the report cites DORA as an authority on the destructive effects of "narrow, poorly-defined indicators - such as journal impact factors" (p. viii). Like the Leiden Manifesto, the executive summary starts with a diagnosis of problems that is compatible with the metric skepticism framing in its tone: "Too often, poorly designed evaluation criteria are 'dominating minds, distorting behaviour and determining careers.' At their worst, metrics can contribute to what Rowan Williams, the former Archbishop of Canterbury, calls a 'new barbarity' in our universities" (p. iii).
Though a different genre of writing to the Leiden Manifesto, with a different history and target audience, the Metric Tide also did not advocate wholesale removal of bibliometrics but, in theory, left the door open for their use in future assessment exercises (after 2021), subject to certain conditions being met. The executive summary lays out five principles of Responsible Metrics (thereby coining the umbrella label that many have used to name the reform movement). These are mobilized as a set of standards that bibliometric indicators would need to meet in order to play a useful role in the REF (but never replace its peer review) - standards which, the report argued, they did not meet at that time. The executive summary explicitly links the five principles to the Responsible Research and Innovation agenda (Owen et al., 2020). However, we argue that some of the Metric Tide's principles chime with the professional-expert framing more than with the reflexive, Responsible Research and Innovation-style framing. The principle of robustness ("basing metrics on the best possible data in terms of accuracy and scope" (p. x)), for instance, aligns more closely with the professional-expert framing, and elsewhere the executive summary draws on a technical lens when citing evidence from its independently commissioned correlation analysis as to why metrics should not replace peer review. The report cites the Leiden Manifesto and endorses its principles, and shares the Leiden Manifesto's call for democratization: Data and indicators should be subjectable to scrutiny by those being evaluated. Like the Leiden Manifesto, though, it does not challenge the authority of expert judgment and who should count as a peer in peer review. Indeed, the report adopts a spokesperson position on behalf of the academic community in arguing that peer review should be retained in the REF and in research assessments at large: "despite its flaws and limitations, [peer review] continues to command widespread support across disciplines" (p. viii).
The fifth principle of reflexivity (unsurprisingly) resonates most strongly with the reflexive framing, with its emphasis on recognizing and anticipating systemic and performative effects of indicators, challenging the limitations of current tools, updating them, and being alert to development of new indicators.
Since the report's publication in 2015, the five principles of responsible metrics have perhaps been the elements that have circulated most widely (alongside the umbrella label Responsible Metrics). Universities in the United Kingdom were also encouraged to publish responsible metrics statements on their websites, publicly committing to these principles, and the Metric Tide subsequently led to the establishment of an independent sector-wide U.K. Forum for Responsible Metrics.

DORA, the Leiden Manifesto, and the Metric Tide Become Fellow Travelers
In the years following the publication of these three texts, champions and supporters of research assessment reforms have begun to refer to responsible metrics as a movement

(e.g., Curry, Gadd, & Wilsdon, 2022). Although we have shown that DORA, the Leiden Manifesto, and the Metric Tide contain differences in aims, audiences, and arguments, we would argue that in the years following their publication, their similarities have tended to be emphasized much more than their differences. Let us consider how the three texts have often come to be cited together in statements and policies found online. In the United Kingdom, the Wellcome Trust, for example, introduced its Open Access policy with a request (though not a requirement) that Wellcome-funded organizations publicly commit to assessing research outputs and other contributions based on their intrinsic merit, and discourage inappropriate metrics or proxies such as the title or impact factor of the journal in which work is published. The Wellcome Trust cites DORA's principles as a key text informing this policy, referencing two of its principles directly, before referring to "other equivalent declarations" such as the Leiden Manifesto (Wellcome Trust, 2018). As requested of universities by the Metric Tide, the University of Bristol published a responsible metrics statement that is exemplary of how one or more of DORA, the Leiden Manifesto, and the Metric Tide have come to be cited together: "This Policy Statement builds on a number of prominent external initiatives on the same task, including the San Francisco Declaration on Research Assessment (DORA), the Leiden Manifesto for Research Metrics and the Metric Tide report" (University of Bristol, n.d.).
Rhetorical practices of grouping together these three texts resemble the use of "concept symbols" in scientific articles (Small, 1978), whereby certain references become shorthand symbols for authoritative sources that lend credibility to an author's statement and over time cease being expanded upon in detail.In citing such texts together under headings such as Responsible Metrics Statement, universities and funders present themselves as having signed up to a burgeoning movement, symbolized by these three texts.Such groupings tend to present responsible metrics as a unified, established agenda with a set of shared values and principles, with frictions between metrics skepticism, professional-expert, and reflexive framings rendered invisible.
Since the late 2010s, a notable frame extension (Benford & Snow, 2000) of the responsible metrics reform movement has occurred, from the more specific focus on appropriate uses of bibliometrics into a widened framing of "responsible research assessment." A 2020 report by members of the Research on Research Institute (including authors of the DORA, Leiden Manifesto, and Metric Tide texts) defined responsible research assessment as "an umbrella term for approaches to assessment which incentivize, reflect and reward the plural characteristics of high-quality research, in support of diverse and inclusive research cultures" (Curry et al., 2020, p. 7). A large number of texts have emerged supporting this widened agenda (CoARA, 2022; EC, 2017; EU, 2022; LERU, 2022; UNESCO, 2021). Published in 2022, Harnessing the Metric Tide served as a follow-up to the 2015 Metric Tide report, in preparation for the United Kingdom's 2028 Research Excellence Framework, presenting a fine-tuning of the five responsible metrics principles from the original report and stating that any use of bibliometrics in research evaluations should be "judicious." Harnessing the Metric Tide, though, explicitly endorses the expansion of research assessment reform agendas from the narrower focus on responsible uses of bibliometrics towards ensuring research assessment aligns more widely "with intersecting movements to support more fruitful, inclusive and positive research cultures" (Curry et al., 2022, p. 13).

A prominent feature of how responsible research assessment texts perform frame extension is by referring to and incorporating goals and agendas from parallel scientific reform movements, such as open science, research integrity, and the societal relevance of research, as well as drives for equity, diversity, and inclusion and for improving academic working environments. For example, the European Agreement on research assessment reforms calls on assessments to "reward research behaviour underpinning open science practices such as early knowledge and data sharing as well as open collaboration within science and collaboration with societal actors where appropriate" (CoARA, 2022, p. 4). A recent position statement by the League of European Research Universities (LERU) draws assessment reform and open science causes together as related: "This paper complements other recent papers from LERU on inclusion, scientific integrity, societal impact and on the implementation of Open Science" (LERU, 2022, p. 6). Harnessing the Metric Tide calls for research assessment to better support, incentivize, and recognize shifts in academic culture around "issues of equality, diversity, bullying and harassment" (Curry et al., 2022, p. 13), while also reducing assessment burden. In responsible research assessment texts, bibliometric indicators and wider valuations of productivity and impact among scientific peers are widely cast as "traditional" approaches in need of modernizing - rhetoric that is common across various contemporary science reform movements (Penders, 2022).
Interactions between the responsible metrics movement and other reform agendas have also resulted in frame translation (Frickel, 2004a). Texts associated with research integrity and open science movements have, for example, mostly accepted the responsible metrics movement's diagnosis that perverse effects of bibliometrics are an obstacle to realizing their own reform goals. The UNESCO Recommendation on Open Science states: "Assessment of scientific contribution and career progression rewarding good open science practices is needed for operationalization of open science" (UNESCO, 2021, p. 27). Similarly, when a group of research integrity reform champions proposed the Hong Kong Principles in 2020, they accepted evaluative bibliometrics as an urgent problem blocking realization of their agenda: "We acknowledge … the global leadership of those working on the San Francisco Declaration on Research Assessment (DORA), the Leiden Manifesto, and other initiatives to promote the responsible use of metrics, which have laid the foundations for much of our work. The HKPs [Hong Kong Principles] are formulated from the perspective of the research integrity community. We, like the DORA signatories, strongly believe that current metrics may act as perverse incentives in the assessment of researchers" (Moher, Bouter et al., 2020, p. 2).
In responsible research assessment texts, DORA, the Leiden Manifesto, and the Metric Tide are frequently cited as part of the evidence base for why assessment practices need to be reformed.Yet when responsible research assessment statements cite these three texts, certain elements get amplified more than others.The European Agreement on Research Assessment Reform amplifies, for instance, the need to recognize a diverse range of outputs and contributions and to respect the diversity of disciplines (CoARA, 2022, p. 4).The Agreement, though, omits the Leiden Manifesto's call to reduce the dominance of the English language as the default language of academic publication.
Importantly, responsible research assessment texts amplify the message of the Leiden Manifesto and Metric Tide that bibliometrics ought to be given "license to continue," as long as they are used appropriately.Like DORA, the Leiden Manifesto, and the Metric Tide,

responsible research assessment texts also make strategic use of the three action framings that our analysis identifies in order to persuade. Part of the argumentative structure of the Leiden Manifesto and the Metric Tide taken forward by responsible research assessment texts is to set the scene (diagnosis) by being heavily critical of evaluative bibliometrics. Ultimately, though, responsible research assessment texts have tended to align with the Leiden Manifesto's and the Metric Tide's reflexive prognosis for addressing bibliometric-related problems: Appropriate use of metrics by self-aware agents is the way forward, not the abandonment of metrics. The position statement of the League of European Research Universities (LERU, 2022), for example, initially diagnoses problems through the metrics skepticism framing, arguing that bibliometrics are used in assessments because they are "so much easier than focusing on what really counts" (LERU, 2022, p. 5) and linking bibliometrics to injustices: "They [candidates with different profiles or career choices] may feel misjudged and wronged, because their strengths did not get the same weight in the assessment than the publication ratio" (p. 7). However, the LERU position statement then aligns with the Leiden Manifesto's reflexive framing of how to address problems with bibliometrics: "Although bibliometric data have played and still play an important role, LERU universities have always adopted a multidimensional perspective, where different dimensions of research performance and a variety of duties and responsibilities are taken into account for assessment" (p. 7, emphasis added). The primacy of expert judgement and "qualitative" peer review, with input from quantitative measures only where appropriate (rather than no bibliometrics at all), is a commonly repeated mantra throughout many position statements, agreements, and guidelines advocating assessment reforms (CoARA, 2022; EC, 2017; EU, 2022; LERU, 2022; UNESCO, 2021).

CONCLUDING REMARKS: HOW SECURE IS THE RELEGITIMATION OF EVALUATIVE BIBLIOMETRICS?
In recent years, calls for research assessment reforms have begun to gain greater attention in research policy in some contexts - especially in parts of the Global North (Tijssen, 2020). Many assessment reform statements cite and credit DORA, the Leiden Manifesto, and the Metric Tide as evidence of growing momentum for change. These texts have become symbols for a "responsible metrics" agenda, and have helped long-standing concerns about evaluative bibliometrics, long discussed in quantitative science studies research, gain wider attention.
Communicating practical wisdom accumulated within quantitative science studies communities via manifestos and statements has surely brought such concerns to a wider audience than more traditional academic communication channels such as specialized monographs, journals, and conference meetings. One of the major influences of the Leiden Manifesto and the Metric Tide on widening coalitions supporting assessment reforms has been to persuade them of the legitimacy of bibliometrics in assessments if used appropriately (DORA has also come to be associated with this prognosis, even though its original text is rather ambiguous on what the general role of bibliometrics in assessments should be). Currently the metric skepticism, professional-expert, and reflexive framings cohabit the responsible metrics movement. This combination of framings no doubt enables the responsible metrics movement to appeal to a broader audience than would be the case if only one of these framings were projected. Given that evaluative bibliometrics are often diagnosed as the reason for many ills in academic research systems, which frustrate the realization of certain other reform agendas, for how long can the metrics skepticism framing remain neutralized within expanding research assessment reform coalitions? This is an important matter of concern for the quantitative science studies community, as assessment reform movements continue to grow, evolve, and perhaps splinter (as expanding social movements often do). We therefore ask readers to consider: Can this cohabitation of framings hold out peacefully within assessment reform movements, and what might happen to evaluative bibliometrics if it does not?