Abstract
Citations are increasingly being used to evaluate institutional and individual performance, suggesting a need for rigorous research to understand what behaviors citations reflect and what these behaviors mean for the institution of science. To overcome challenges in accurately representing the citation generation process, we use postretraction citations to test competing theories under two different citation search processes, empirically testing predictions on the spread of retracted references. We find that retracted papers continue to be cited after retraction, and that these citations are more likely to come from audiences likely to be unfamiliar with the field of the retracted paper. In addition, we find this association to be much stronger among those citing high-status journals, consistent with scientists relying on a heuristic rather than an engaged citation search process. While the current policy debate on misinformation in science emphasizes increasing the visibility of retraction labels to discourage the use of such publications, we argue that institutional-level interventions may be more effective, as such interventions are more consistent with the heuristic citation process. As such citation behavior may not be limited to the case of postretraction citations, we discuss the implications for current science studies as well as science policy.
1. INTRODUCTION
Citations to scholarly publications are increasingly being used for research evaluations because they are perceived as an unobtrusive and objective measure of scientific impact (Biagioli, 2018; Hicks, Wouters et al., 2015). At the same time, there is continuing concern about the widespread use and institutionalization of citation metrics as evaluation standards (Biagioli & Lippman, 2020; Hicks et al., 2015; Moravcsik & Murugesan, 1975; Woolgar, 1991). Such continuing criticisms call for rigorous scholarly studies to improve our understanding of what citations actually measure (Bornmann & Daniel, 2008; Cronin, 1984; Kaplan, 1965). In other words, what motivations and behaviors do citations reflect, and, regardless of the motivations, what do these behaviors mean for the institutions of science, as well as for science policy, particularly science evaluation?
While there has been longstanding interest in understanding what citations actually measure (Bornmann & Daniel, 2008; Tahamtan & Bornmann, 2019), one challenge has been the difficulty of accurately illuminating citation practices with conventional research methods and data, as most existing studies have relied either on investigators’ close examinations of citation contexts or on authors’ self-reported explanations of their citation motivations. Such methods partly assume the ability of authors to make and recollect well-informed and rational judgments about the work they cite. Thus, instead of relying on our interpretation or authors’ recollected memories, we argue that examining citations to retracted references provides a unique opportunity to understand citation practices, because postretraction citation data allow us to identify a set of articles that should not have been cited but nevertheless were. In this regard, postretraction citations offer not only a unique opportunity to investigate abnormal citation behavior but also a strategic research site for theory testing, by comparing the predictions that competing theories make about the likelihood of such citations. We first review two dominant theories of citation motivations from the sociology of science: the normative and constructivist theories (Bornmann & Daniel, 2008; Cozzens, 1989; Cronin, 1984). To simplify the distinction, the normative theory views the function of citation as conferring credit on, and acknowledging indebtedness to, the original author (Kaplan, 1965; Merton, 1957). Meanwhile, constructivists view citation as a means to bolster scientific claims to convince audiences (Gilbert, 1977; Latour, 1987).
However, these citation motivations alone cannot explain why retracted references were cited, which suggests that we need to consider the citation generation process independently of citation motivation. We use the term “citation search” to represent the citation generation process, which encompasses accessing the published literature and eventually marking the use of that literature in the author’s argument with a citation. We are agnostic about when in the process this search occurs. The search part of a “citation search” can occur early in one’s training as a researcher, at an early stage in the focal project’s conceptualization, at random moments when perusing published works, or in a targeted manner to find a specific piece of knowledge to support an argument or perhaps to respond to a comment from a reviewer. However the search unfolds, the citation ultimately appears in the focal paper, and we use its appearance there as the marker of the search, making this citation search the object of our analysis (i.e., the process by which a particular reference goes from being in the published literature to being cited in the focal paper). We argue that citations to retracted papers can serve as a strategic site for understanding this citation search. We first consider the citation process that follows a pure form of citation search, where authors cite only after thoroughly reading papers. We call this an engaged citation search. At the other end, we consider a citation process where authors rely extensively on cues and signals that they think are useful in fulfilling their citation motivations. Inspired by the behavioral theory tradition (Cyert & March, 1963; Simon, 1997), we call this a heuristic citation search. One important insight from behavioral theory is that it can describe when authors are more likely to use heuristics and, more importantly, when such behavior can become overly mechanical, perhaps even to the point where authors cite materials without reading them. We then derive hypotheses that lead to competing predictions about the conditions under which retracted articles are more likely to be cited. In doing so, we use field distance (between retracted and citing articles) and journal visibility (high vs. ordinary journal impact factor) as theory-testing variables.
Our analysis is based on a set of retracted articles published from 1980 to 2016, obtained from Retraction Watch (2019), and corresponding metadata obtained from the Clarivate Analytics Web of Science. Based on 103,245 citing-cited article pairs from 2,123 retracted articles and 94,871 citing articles, we first show that, on average across time, 38% to 44% of citations to retracted articles were made after the retraction event. Operationalizing field distance with a natural language processing model, we show a strong association between postretraction citation and field distance. Furthermore, we find this association to be much stronger among those citing high-status journals, which supports a heuristic citation search model regardless of the author’s citation motivation. As predicted by the heuristic citation search model, some authors appear to superficially use high-status journals as a heuristic when searching for distant (hence more likely unfamiliar) knowledge. These findings are consistent with postretraction citations being at least partly driven by a process in which authors cite the paper as a marker for some point in their argument, relying on surface characteristics of the paper (it was published, it is related to point X, and it is in a reputable journal), perhaps without regard to the detailed contents of the paper and, in particular, without regard to whether the publication has been nullified by a retraction.
This paper is organized as follows. In Section 2, we review the existing theories and empirical studies on citations. In Section 3, we discuss postretraction citation and construct hypotheses that combine the existing theories of citation motivations with two different citation search models. After presenting our data and method in Section 4, we report our findings in Section 5. In Section 6, we discuss our findings, focusing on the implications for citation theory and policy intervention.
2. THE ROLE OF CITATION IN THE SOCIAL INSTITUTION OF SCIENCE
2.1. Two Dominant Theories on Citation
Contemporary scientific articles are characterized by the prominent use of citations to prior work, unlike, for example, newspaper opinion essays or literary works. Citation use can reveal important aspects of science as a social institution. First, some scholars view citations as a social device that establishes and maintains property rights and priority claims in science (Kaplan, 1965; Price, 1963; Zuckerman, 1987; Zuckerman & Merton, 1971). This view is described as the normative view due to its emphasis on the function of citations in maintaining the normative structure of science (Merton, 1957, 1973). According to the Mertonian norms of science, scientists are compelled to freely share their knowledge, and the social recognition of priority, in turn, serves as a primary means to compensate scientists for voluntarily sharing their findings with the public (Merton, 1957). Therefore, just as eponyms, prizes, and awards are used to maintain the collective memory of scientific discovery, citations can be used to maintain the norm of common ownership of scientific goods by rewarding original authors with social recognition (Kaplan, 1965). This social recognition can also lead to material rewards, such as jobs, promotions, and funding, providing an additional economic basis for this publication-citation normative reward system. From this perspective, citations can reflect the operation of the Mertonian norms in science. What does this mean for the institution of science? To the extent that citations accurately reflect the acknowledgment of one’s indebtedness for a scientific contribution, citation counts can be viewed as a valid measure of quality and impact.
Meanwhile, the constructivist schools in the sociology of science (Gilbert, 1977; Knorr-Cetina, 1981; Latour & Woolgar, 1979) question the normative interpretation of citation practices. Based on contemporary findings from citation context analyses (Chubin & Moitra, 1975; Moravcsik & Murugesan, 1975), which revealed multifaceted motivations for citations, Gilbert (1977) rejected the idea that recognition was the primary function of citations. He further argued that the presence of perfunctory and negative citations was not readily explained by the normative interpretation. Instead, citation practices were viewed as scientists’ attempts to persuade their peers by bolstering their scientific claims through embedding other people’s work into their texts (Gilbert, 1977; Latour, 1987), making the citation process a selective and strategic activity. For example, citations to highly cited and recognized works even when they have minor intellectual relevance would be better explained by the constructivist perspective. Because of this selective citation behavior, citations also reflect the process of a scientific claim transforming into a hard fact or black box (Gilbert, 1977; Latour, 1987; Latour & Woolgar, 1979). Such a scientific claim is less likely to be cited once it has become a “black box.” Thus, according to the constructivist interpretation, citations rather reflect strategies employed by scientists in constructing scientific knowledge and persuading their audience.
Therefore, some constructivist scholars show disdain for using simple citation counts to measure the quality or impact of a published article. For example, MacRoberts and MacRoberts (1987) criticized Cole and Cole (1972)’s use of citation counts to measure intellectual importance in investigating whether a few elite scientists disproportionately make scientific contributions. MacRoberts and MacRoberts (1987) reasoned that because scientists are more likely to cite the works of high-status scientists to bolster their claims, eminent scientists end up garnering excessive citations. They also argued that the use of citations is prone to many errors due to the multifaceted context and content of citations. In response, Zuckerman (1987) argued that such errors would need to be shown to be systematic, noting that eminent scholars may even be undercited because of our tendency to drop citations to “hard” facts. Moreover, she argued that citation motives and consequences are analytically distinct: Citations can have a variety of motives, but the very fact that they were stamped on the text suggests that the materials were read and had an influence on the authors. As seen from her quote below, she further questions whether the use of “argument from authority” can ever be devoid of any relevant cognitive materials.
What are the characteristics of those sources which can possibly be ‘persuasive’ citations in the clear sense of only providing ‘authority’ rather than relevant cognitive materials in support of the new work referring to it? Presumably, these authoritative sources have been assessed by the pertinent collectivity of peers as having made sound and consequential contributions. As Gilbert himself observes, it is the papers seen as “important and correct” which “are selected because the author hopes that the referenced papers will be regarded as authoritative by the intended audience.” (Zuckerman, 1987, p. 334).
2.2. Previous Empirical Findings
The empirical evidence for the normative and constructivist interpretations is far from settled (Bornmann & Daniel, 2008; Tahamtan & Bornmann, 2019). This debate has been addressed broadly through three different methodological approaches. One approach attempts to understand the citation context by closely examining both cited and citing documents. Some early works used this method to illuminate the multifaceted uses of citations (Chubin & Moitra, 1975; Frost, 1979; MacRoberts & MacRoberts, 1986; Moravcsik & Murugesan, 1975). Some of these studies substantiate the normative interpretation by showing that the majority of citations reflected research impacts and few citations were “negational” (Chubin & Moitra, 1975; Moravcsik & Murugesan, 1975). Yet the same studies also find nontrivial use of “perfunctory” citations (Gilbert, 1977; Latour, 1987), citations that were misquoted, wrong, or meaningless, questioning the normative interpretation. Moreover, MacRoberts and MacRoberts (1986) found that a large share of works that made significant contributions to a topic were never cited. They further argue that these “lost citations” disproportionately belong to low-status authors (MacRoberts & MacRoberts, 1987), thereby casting doubt on using citation analysis to measure scientific contribution. One of the difficulties of citation context analysis is that it requires painstakingly careful reading as well as a high level of field expertise to accurately infer citation contexts from both cited and citing documents (Bornmann & Daniel, 2008). The former burden is partly being addressed by the increasing diffusion of machine-readable documents and advances in natural language processing (Tahamtan & Bornmann, 2019), which have increasingly been used to scale citation context analysis (Berger, McDonough, & Seversky, 2017; Cohan, Ammar et al., 2019; Jurgens, Kumar et al., 2018; Teufel, Siddharthan, & Tidhar, 2006).
A second approach uses interviews or surveys to directly ask the original authors about their citation intents and functions (Brooks, 1985, 1986; Cano, 1989; Teplitskiy, Duede et al., 2022; Vinkler, 1987). This method has advanced our understanding of citation functions by revealing the heterogeneous and sometimes chaotic nature of citation uses. Yet one weakness of this method is its reliance on authors’ self-reports to identify citation motivations. Thus, it is not surprising that few previous studies based on this method identify behavior such as citing retracted papers, let alone provide useful answers about continued citations of nullified references. Finally, citation studies have also taken a statistical approach, examining the extent to which citations can be better predicted by normative or constructivist variables. The empirical evidence is again mixed. For example, based on a citation network of potential citing-cited pairs of publications from astrophysics, Baldi (1998) showed that the likelihood of citation increases with the content relevance and perceived quality of the work, thereby supporting the normative interpretation. At the same time, he finds no relationship between the status of the author and citation, but does find that women are less likely to be cited, thereby providing partial support for the constructivist interpretation. This latter finding is also supported by more recent bibliometric studies (Huang, Gates et al., 2020; Larivière, Ni et al., 2013), which show persistent gender inequality in citations even after controlling for relevant observable variables, suggesting the presence of particularistic standards governing the citation process (Fox, Whittington, & Linkova, 2017; Long & Fox, 1995). As such, while previous citation studies have illuminated the diverse usages of citations, the debate between the normative and constructivist interpretations of citation has not been settled. Our paper provides new insights into this longstanding debate by empirically presenting and examining an overlooked citation practice: citations to retracted references.
3. POSTRETRACTION CITATIONS
Retracted articles are nullified papers. Literally, the publication has been “undone” from the journal (Van Noorden, 2011). In other words, while the paper’s content still exists, the paper no longer exists as a publication. In most cases, the paper is retracted because there is some problem with the paper that implies it should not have been published to begin with. Hence, one could argue that such nullified papers should not then be used as a proper base for knowledge production. However, retracted articles often continue to be cited as if they were legitimate scientific findings (Bar-Ilan & Halevi, 2017; Bordignon, 2020; Budd, Sievert et al., 1999; Hamilton, 2019; Kochan & Budd, 1992; Pfeifer & Snodgrass, 1990), which raises concerns about the integrity of science across scientific communities (Campanario, 2000; Unger & Couzin, 2006).
One might argue that the paper’s content continues to exist and therefore can provide a legitimate basis for a citation: for example, because it contains an inspiring research question, or a particular finding that was unrelated to the retraction, or even that the plagiarized content is still useful even if copied from elsewhere. Here we argue that even if such motives would lead to “legitimate” citations to retracted papers, such a citation should at minimum include a caveat stating that the paper is being cited in service of point X despite the fact that the journal has nullified the publication. Furthermore, the paper should not be cited in its publication form (perhaps citing a preprint instead, for example), as the publication no longer officially exists. Still, to further explore this line of argument regarding citation practices, we will explore various subsets of citations to retracted papers to address their implications for different models of citation search.
In response to awareness of these citations to retracted papers, and to address concerns about such citation practices, previous studies (Bar-Ilan & Halevi, 2017; Davis, 2012; Garfield & Welljams-Dorof, 1990; Pfeifer & Snodgrass, 1990; Wager, Barbour et al., 2009) suggested various ways to increase the visibility of retractions, such as standardizing the retraction notice, clarifying the reasons for retraction, coordinating with nonpublisher platforms (such as Web of Science or PubMed), or implementing author alert systems. However, drawing from the literature above, it is plausible that citing retracted articles reflects search heuristics that lead to superficial or perfunctory citation practices. In fact, numerous prior studies suggest that many citations may not have been deeply engaged with by citing authors (Harzing, 1995, 2002; Harzing & Kroonenberg, 2016; Katz, 2006; Leng, 2020; Leung, Macdonald et al., 2017; MacRoberts & MacRoberts, 1986; Simkin & Roychowdhury, 2005; Vinkler, 1987). One additional reason to cite nullified papers is a negative citation (one that highlights that the paper is retracted). However, prior work finds that such negative citations are rare among postretraction citations (Bar-Ilan & Halevi, 2017; Bordignon, 2020; Hsiao & Schneider, 2021; Schneider, Ye et al., 2020). Hence, it is highly plausible that a substantial share of postretraction citations results from authors citing articles without attending to the retracted status of the paper. If lack of awareness is the primary reason behind citing retracted articles, we would expect to observe more postretraction citations from distant fields (Dinh, Sarol et al., 2019). The presence of postretraction citations suggests that we need to address the citation process separately from citation motivation. In the next section, we consider the normative and constructivist theories as two ideal-type citation motivations. In addition, inspired by the behavioral theory tradition, we consider that the citation process may lie between engaged and heuristic citation search processes. We then use this framework to derive hypotheses about the relationship between field distance and postretraction citation.
3.1. Citation Motivation and Citation Search Process
While normative and constructivist theories provide plausible reasons for why scientists cite what they cite, these theories fall short of explaining postretraction citations. From the normative perspective, authors would not cite a retracted article, as the priority norm would compel them to properly confer credit on the rightful producers; it would be absurd to give credit to the authors of works that have been nullified in the eyes of the scientific community. For constructivists, citing a retracted article would be like trying to convince peers with arguments based on flawed evidence. Thus, instead of relying solely on citation motivations, we examine the citation search process to understand why retracted articles continue to be cited. We consider that a citation search process may lie between two ends: engaged and heuristic citation searches. Ideally, authors would thoroughly read the paper and then integrate the paper’s content into their own argument and presentation of findings via the citation, whether their citation motivation is to confer credit on original authors (normative) or to bolster their claims by associating their works with the papers they cite (constructive). We categorize such a citation process as an engaged citation search. While particular citations may not adhere to such a strict citation process in practice, we argue that the engaged citation process represents an ideal type and norm for how academics should end up citing the work of others. However, it is not clear how common such “pure” engaged citations may be. One survey-based study suggests that about 75% of references were cited after authors had thoroughly read them (Vinkler, 1987), implying that about one-fourth of citations did not involve an engaged citation search. Furthermore, numerous other studies provide plausible evidence against the presumption that authors strictly adhere to the engaged citation process (Harzing, 1995, 2002; Harzing & Kroonenberg, 2016; Katz, 2006; Leng, 2020; Leung et al., 2017; MacRoberts & MacRoberts, 1986; Simkin & Roychowdhury, 2005).
Such an imperfect citation search is consistent with insights from the behavioral theory tradition (Cyert & March, 1963; Simon, 1997; Simon & March, 1958). As the citation search space expands with the exponential growth in the number of publications and the increasing rate at which scientists produce articles, combined with reviewers’ demands to thoroughly incorporate the existing literature into a paper’s argument, it may be unrealistic to expect scientists to thoroughly read all the papers they cite. Thus, to the extent that a citation search is more costly (for example, a cognitively distant search), we expect authors to rely more heavily on cues and signals that they think might provide useful information related to their citation motivations. For example, authors may rely on the visibility of journals, the number of citations, and the status and affiliated institutions of authors to guide their search. This information may further be used in selecting which of the papers found will be cited. We consider this type of citation search a heuristic citation search process. Recall that our concept of citation search incorporates the processes of both finding articles and incorporating those found into one’s paper. Such heuristics may guide both steps, which has implications for the probability that a retracted paper continues to be cited after the retraction event. Figure 1 illustrates how citation motivations can be classified as normative or constructivist, and how the citation search process can be driven by engaged or heuristic searches. Using this framework, we construct hypotheses for predicting citations to retracted references.
3.2. Field Distance and Postretraction Citations
3.2.1. Engaged citation search
We first derive hypotheses from each citation motivation when authors are conducting an engaged citation search. While citing retracted references is, by definition, antithetical to the engaged search, we can derive conditions under which retracted articles are more likely to be cited. According to normative theory, the institutional norms of science compel scientists to protect the priority of their peers (Kaplan, 1965; Merton, 1957). Articles that falsely cite prior works may not make it past a series of field gatekeepers (Kaplan, 1965; Zuckerman & Merton, 1971), such as editors and reviewers, who would likely consider such an act a violation of the social norms of recognition. Under this normative pressure, authors are more likely to be cautious when citing sources from their own fields, because a field is a unit in which the norms of science operate with greater pressure (as more people are likely to notice that the norm of acknowledging the help of others has been infringed (Kaplan, 1965)). Therefore, the likelihood of citing retracted articles would increase with the distance between the cited and citing authors’ fields. The same prediction can be derived from constructivist theory. In particular, those who view citation as a rhetorical device to convince peers would predict that citations to retracted articles are more likely to come from distant fields, because falsely citing references from one’s own discipline increases the chance that “opponents” detect the mistake, which would subsequently undermine the author’s scientific claims (Gilbert, 1977; Latour, 1987).
Therefore, based on both normative and constructivist theories of citation under the engaged citation search model, the following first hypothesis can be derived:
H1: Postretraction citations are more likely to come from distant fields than from proximate fields, such that E[post_cite ∣ distant_field] − E[post_cite ∣ proximate_field] = β1 > 0
3.2.2. Heuristic citation search
The heuristic citation search considers finding relevant citations as part of an information search process (Cyert & March, 1963; Simon, 1997; Simon & Newell, 1971) over a vast space of potentially citable articles. A citation search is more costly to the extent that an author is unfamiliar with a topic, which requires her to spend extra time and effort to identify and decide whether a particular source is relevant to her work. Thus, failure to properly evaluate cited materials is more likely to occur when searching for knowledge from distant fields, regardless of whether the motivation is to give proper credit (normative) or to enhance one’s claim (constructive). For both motivations, the heuristic search process yields the same prediction as the engaged search process: The rate of citations to retracted articles should increase with field distance. Therefore, hypothesis H1 should hold under either an engaged or a heuristic search process.
3.3. Visibility, Field Distance, and Postretraction Citations
3.3.1. Engaged citation search
Not all retracted articles have equal visibility. Indeed, the most high-profile retractions are of articles published in the most visible and prestigious journals, such as Science and Nature (Oransky & Marcus, 2021), as well as those involving severe research misconduct (Reich, 2009). Empirical evidence shows that highly visible articles experience a sharper decline in citations following retraction (Azoulay, Furman et al., 2015; Furman, Jensen, & Murray, 2012). Exploiting the visibility of retracted articles allows us to construct two competing hypotheses from the normative and constructivist theories. First, the constructivist approach views citations as a means to bolster scientific claims (Gilbert, 1977; Latour, 1987). In this sense, falsely citing a highly visible article, even one from a distant field, may pose a high risk of undermining the author’s scientific claims, because “opponents” are simply more likely to be aware of the article given its high visibility. The key point is that constructivists view the use of citations as a rhetorical device deployed in a “war of words.” While deploying many references in support of a claim may be equivalent to bringing in many “allies” that opponents must defeat, any mistakes in making references can also be used against the author (Latour, 1987). Thus, for retracted articles that are highly visible, such as those published in journals with a high journal impact factor (JIF), if the author is motivated in the way the constructivists argue, the relationship between field distance and postretraction citations would be weaker for high-JIF journals, because both authors and referees are more likely to be aware of the retraction.
H2A: [constructivist motive and visibility × distance interaction]: The field distance effect on postretraction citations will be weaker when citing retracted articles published in high JIF journals, such that β1,high_JIF − β1,low_JIF = β2 < 0
H2B: [normative motive and visibility × distance interaction]: Postretraction citations are more likely to come from distant fields than from proximate fields regardless of the JIF of the journals in which the retracted articles are published, such that β1,high_JIF − β1,low_JIF = β2 ≈ 0
3.3.2. Heuristic citation search
Searching for relevant citations from distant fields may be more costly due to authors’ limited expertise, experience, and time. One insight from the behavioral theory tradition is that we rely on heuristics to guide our search, particularly in contexts of high uncertainty and time pressure (Tversky & Kahneman, 1974). In the context of searching for relevant literature, we may rely on the perceived quality and status of journals, such as JIF, as a cue. For example, JIF, which was originally created as a heuristic for librarians (Garfield, 2006), can be deployed as a search heuristic by authors searching for relevant literature (Osterloh & Frey, 2020; Wooding, 2020), whether the motivation is to give credit (normative) or to bolster one’s claims (constructive). Such reliance on JIF would greatly reduce the mental effort expended searching for unfamiliar knowledge. Yet reliance on such a heuristic can become overly mechanical, to the point where, in extreme cases, authors may not even have read the paper they cite. In fact, previous studies have documented possible evidence of authors making references without reading them (Ball, 2002; Harzing, 1995, 2002; Hoerman & Nowicke, 1995; Simkin & Roychowdhury, 2005). For example, by tracking misprints in citations, Simkin and Roychowdhury (2005) constructed a mis-citation propagation model, which estimated that around 70–90% of citations are copied from the reference lists of other papers. More subtle evidence shows researchers mechanically responding to a faulty (unknown to them) recommendation algorithm by citing recommended works that have substantially lower cognitive relevance than uncited works that were not recommended (Kolympiris, Drivas et al., 2020). We argue that such citation behavior may be more common among citations to articles from distant fields that are published in high-JIF journals. That is, given that authors must bear significant costs in searching for distant knowledge, they are more likely to blindly “trust” works published in high-JIF journals due to their perceived higher status, just as hiring committees often rely on superficial uses of the JIFs of articles authored by job candidates (Biagioli & Lippman, 2020). Combining this argument with the first hypothesis, we would expect the association between field distance and postretraction citation to be stronger for those citing retracted articles from high-JIF journals, regardless of the citation motivation (normative or constructivist).
H2C: [heuristic search and visibility × distance interaction]: The field distance effect on postretraction citations will be stronger when citing retracted articles published in high JIF journals, such that β1,high_JIF − β1,low_JIF = β2 > 0
4. DATA AND METHOD
This section describes the construction of the data sets and the methods used to test our hypotheses. We construct a citation network data set from retracted articles and their citing articles. The population of retracted articles was obtained from Retraction Watch (2019), a nonprofit organization that monitors and collects data on retractions. To our knowledge, the Retraction Watch database provides the most comprehensive coverage of retracted articles and detailed information about them, including but not limited to titles, authors, retraction dates, and curated retraction reasons. The data set we obtained from Retraction Watch contains 18,525 articles retracted between 1980 and 2018. The database also provides unique identifiers, such as DOIs and PMIDs, for most of the articles. These identifiers were used to retrieve detailed bibliographic information from the Web of Science Core Collection. Of the 18,525 retracted articles, we identified 8,037 in the Web of Science database in this way.
From the 8,037 retracted articles, we retrieved bibliographic information on 198,674 citing articles (citations up to 2020) from the Web of Science. We removed articles (both retracted and citing) that were missing the title, abstract, or cited references fields, as these fields were used to calculate field distance. We also removed both citing and cited articles whose journals lacked JIF information, and we removed articles retracted after 2016 to ensure a postretraction citation window of at least 3 years. We limited our sample to original research articles that cited retracted articles; that is, we removed citing articles not categorized as “article,” “review,” or “proceedings paper” document types in the Web of Science database. We further removed self-citations to rule out alternative behavioral motivations for citing retracted articles. Finally, given that our identification relies on within-retracted-article variation, we restricted our sample to retracted articles that were cited at least 10 times by 2019. The resulting data set contains 103,245 citing-cited article pairs, which is the unit of our analysis. This data set contains 2,123 retracted articles published from 1980 to 2016 and 94,871 citing articles published from 1980 to 2019.
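For concreteness, the sketch below illustrates these filtering steps in pandas. All column names (e.g., citing_doc_type, is_self_citation) are hypothetical stand-ins for the Retraction Watch and Web of Science fields described above, not the authors’ actual pipeline.

```python
import pandas as pd

# One row per citing-cited pair; column names are hypothetical stand-ins
# for the Retraction Watch / Web of Science fields described in the text.
pairs = pd.read_csv("citing_cited_pairs.csv")

# Keep only original research outputs among citing articles.
pairs = pairs[pairs["citing_doc_type"].isin(
    ["article", "review", "proceedings paper"])]

# Drop pairs missing the fields needed for field distance or JIF.
pairs = pairs.dropna(subset=[
    "citing_title", "citing_abstract", "retracted_title",
    "retracted_abstract", "citing_jif", "retracted_jif"])

# Ensure a postretraction citation window of at least 3 years.
pairs = pairs[pairs["retraction_year"] <= 2016]

# Remove self-citations (any author overlap between the pair).
pairs = pairs[~pairs["is_self_citation"].astype(bool)]

# Identification relies on within-retracted-article variation, so keep
# retracted articles that were cited at least 10 times.
n_cites = pairs.groupby("retracted_id")["citing_id"].transform("count")
pairs = pairs[n_cites >= 10]
```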
4.1. Dependent Variable: Postretraction Citation
While previous empirical studies focused on the effects of retraction on various dimensions of scientific activity, such as the subsequent reputations of the focal papers, fields, and authors (Azoulay, Bonatti, & Krieger, 2017; Azoulay et al., 2015; Furman et al., 2012; Jin, Jones et al., 2019), we use citations to retracted articles to examine the role of field distance and JIF in generating continued citations to retracted articles. Thus, our dependent variable is a binary variable that takes the value of 1 if a citation was made after the retraction year and 0 if a citation was made before or during the retraction year. In other words, in each case we are comparing citations within a retracted paper to estimate the likelihood that the observed citation happened before or after the retraction event.
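Concretely, the dependent variable reduces to a year comparison, continuing the hypothetical columns from the sketch above:

```python
# post_cite = 1 if the citation was made strictly after the retraction year.
pairs["post_cite"] = (
    pairs["citing_year"] > pairs["retraction_year"]).astype(int)
```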
4.2. Field Distance
Operationalizing field distance is a central concern in testing our hypotheses. Following a conventional method commonly used in scientometrics and innovation studies, we transform the text information embedded in publication documents into a vector space. To the extent that the citing document and cited document share similar concepts, as measured by the similarity of their vector representations, we argue that the articles are cognitively similar. In our arguments, we assume that the more cognitively similar two papers are, the more likely it is that the citing authors were familiar with the contents of the cited document. While we are aware that this is not a perfect measure of field distance, we believe it is sufficient to argue that, for example, a paucity of shared concepts between sociology and materials science papers can be well captured by a large distance between the textual representations of documents from these two fields.
To measure the textual similarity between retracted and citing articles, we first transform the texts embedded in scientific documents into a vector space. Several methods are widely used for this task, including “one-hot” representation of texts as bag-of-words vectors and distributed representations of texts (Le & Mikolov, 2014; Mikolov, Sutskever et al., 2013) based on pretrained word-embedding vectors. The conventional word-embedding method is “context free” in the sense that the representation of words is invariant with respect to surrounding words. Meanwhile, recently developed contextual representation models, such as BERT, provide a more accurate representation of scientific texts by considering the ambiguous usages of words inferred from their association with neighboring words (Beltagy, Lo, & Cohan, 2019; Lee, Yoon et al., 2020). In this paper, we use SPECTER embeddings (Cohan, Feldman et al., 2020) for the semantic representation of our corpus. SPECTER is one of the latest language models optimized for the semantic representation of scientific documents. It is based on a contextual representation model (SciBERT) yet optimized for scientific documents by considering citation linkages during the training process. The authors of the SPECTER model provide a public API through which we accessed their pretrained model. By concatenating the title and abstract fields of our documents and encoding them with the pretrained SPECTER model, we obtained dense vectors with 768 dimensions for 100,580 articles (both retracted and citing articles). As robustness checks, we also replicated our results using a bag-of-words model and a vector representation based on cited Web of Science Subject Categories. The main results are qualitatively similar across the different measures of field distance (results available from the authors).
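A minimal sketch of this step, using the publicly released SPECTER checkpoint on Hugging Face (allenai/specter) rather than the authors’ API. The example papers are invented, and treating dist_embedding as the cosine distance between document vectors is our assumption for illustration.

```python
import torch
from scipy.spatial.distance import cosine
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/specter")
model = AutoModel.from_pretrained("allenai/specter")
model.eval()

def embed(title: str, abstract: str) -> torch.Tensor:
    """Encode title + abstract into a 768-dimensional SPECTER vector."""
    text = title + tokenizer.sep_token + abstract
    inputs = tokenizer(text, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        output = model(**inputs)
    # The [CLS] token embedding serves as the document representation.
    return output.last_hidden_state[0, 0, :]

# Invented example pair; dist_embedding is taken here as the cosine
# distance between the two document vectors (an illustrative assumption).
v_retracted = embed("A retracted paper title", "Its abstract ...")
v_citing = embed("A citing paper title", "Its abstract ...")
dist_embedding = cosine(v_retracted.numpy(), v_citing.numpy())
print(dist_embedding)
```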
4.3. Visibility
We use the JIF of the retracted articles, obtained from the Clarivate Analytics Journal Citation Reports (downloaded in 2018), to proxy the visibility of their retractions. High-JIF journals are highly visible. At the same time, high-JIF journals also carry significant status signals, such that many evaluators (wrongly) consider the JIF of an article an indicator of its future success (Biagioli & Lippman, 2020; Osterloh & Frey, 2020). We define a high JIF as being above the 75th percentile of the JIF distribution, which corresponds to a JIF above 18.43 for retracted articles’ journals and above 5.03 for citing articles’ journals. Note that we are not suggesting that authors specifically check the JIF of an article before referencing it in their work. Rather, we use JIF as an indirect indicator of the visibility and prestige of the journal in which a retracted article was published.
4.4. Controls
Our estimating models include several control variables. First, we account for the possibility that citing authors from countries different from those of the retracted article’s authors may be more likely to cite it, as they are less likely to be informed about the retraction. Previous studies have shown that geographical distance continues to act as a barrier to knowledge flow despite advances in communication technologies (Abramo, D’Angelo, & Di Costa, 2020; Matthiessen, Schwarz, & Find, 2002; Pan, Kaski, & Fortunato, 2012). While this localization can partly be explained by the geographical concentration of research activities (Wuestman, Hoekman, & Frenken, 2019), we posit that information flow may be hampered by national boundaries net of cognitive distance. We thus calculate the country distance between retracted and citing articles based on the sets of affiliated countries using the Jaccard index: Country distance is one minus the ratio of the size of the intersection to the size of the union of the affiliated country sets. We also include several citing-article-level control variables. First, it is plausible that common norms around publication practices differ in countries considered “peripheral” regions (Honig & Bedi, 2012; Lewellyn, Judge, & Smith, 2017; Walsh, Lee, & Tang, 2019). Thus, we include a binary variable that takes the value of 1 if a citing article includes any authors from the United States or Western Europe and 0 otherwise. The difference between core and periphery may also be observed via institutional hierarchy and status, so we include a dummy variable that takes the value of 1 if the affiliated organizations of a citing article are among the top 50 universities in the 2021 Times Higher Education Ranking. For a similar reason, we include the JIF of citing articles as a control. Lastly, we include the numbers of authors, affiliations, and countries of citing articles as controls, a standard practice for controlling for unobserved heterogeneity across different dimensions of team size (Liu, Jones et al., 2023).
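A small sketch of the country-distance computation (one minus the Jaccard index over the sets of affiliated countries); the example country sets are invented.

```python
def country_distance(countries_retracted: set, countries_citing: set) -> float:
    """One minus the Jaccard index of the two affiliation-country sets."""
    union = countries_retracted | countries_citing
    if not union:
        return 0.0
    intersection = countries_retracted & countries_citing
    return 1.0 - len(intersection) / len(union)

# Invented examples: no overlap yields the maximal distance of 1.
print(country_distance({"US", "DE"}, {"JP"}))   # 1.0
print(country_distance({"US", "DE"}, {"US"}))   # 1 - 1/2 = 0.5
```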
4.5. Empirical Specifications
Our first hypothesis examines the role of distance in driving continued citations of retracted articles. Specifically, it predicts that greater field distance generates citations to nullified references, which can be operationalized by comparing the probability of citing retracted articles when citing authors are from distant fields as opposed to proximate fields. This can be expressed by the following inequality: E[post_cite ∣ distant_field] > E[post_cite ∣ proximate_field]. Given that we operationalize the probability of citing a retracted article by measuring the proportion of citations made after retraction, a naïve comparison between the field distances of preretraction and postretraction citations may lead to biased estimates, because the citation generation process, such as the likelihood of being cited by articles from distant fields, may be highly influenced by subfield- and article-level characteristics, such as journal type and status. Thus, our main estimation model employs fixed effects for the retracted article. Meanwhile, our dependent variable, postretraction citation, is mechanically correlated with citation age (years elapsed since the publication of the retracted article). Yet citation age can also positively affect field distance, as it generally takes time for a published idea to diffuse to other fields. Therefore, without controlling for citation age, the positive correlation between field distance and citation age (a diffusion effect) can lead to overestimation of the positive association between field distance and postretraction citations. We address the citation age problem semiparametrically by including a set of citation age indicator variables in our estimation model. The inclusion of citation age is particularly useful, as our data set has substantial variation in the years it took for retracted articles to be retracted (see Figure 2(a)), which allows us to exploit cross-sectional variation within each citation age group to estimate field distance effects (see Figure 2(b)).
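The numbered estimating equation is not reproduced in this text, but based on the description above, the linear probability model can be sketched as follows (our notation, not necessarily the paper’s exact equation):

$$
\text{post\_cite}_{ij} = \beta_1\, \text{dist\_embedding}_{ij} + \gamma^{\top} X_j + \alpha_i + \sum_{k} \delta_k\, \mathbf{1}\{\text{age}_{ij} = k\} + \sum_{t} \tau_t\, \mathbf{1}\{\text{year}_j = t\} + \varepsilon_{ij}
$$

where i indexes retracted articles, j indexes citing articles, α_i is a retracted-article fixed effect, X_j collects the controls of Section 4.4, and standard errors are clustered on retracted articles. The interaction models add a term β2 (jif_retracted × dist_embedding); the main effect of jif_retracted is absorbed by the retracted-article fixed effects.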
5. RESULTS
5.1. Descriptive Statistics
Table 1 reports the descriptive statistics and descriptions of all variables used in the analysis. The mean of the post_cite variable is 0.384, meaning that around 38.4% of citations to retracted articles were made after retraction. Note that this value is an unweighted average of postretraction citations that does not account for the positively skewed distribution of citations received across the retracted articles in our data set (see Figure 3(a)). When we instead average the postretraction citation rate across the 2,123 retracted articles (the postretraction citation rate for each retracted article), the rate increases to 44.5%. Thus, given the positively skewed distribution of citations (see Figure 3), the postretraction citation rate is lower among retracted articles that received a large number of citations.
Table 1. Descriptive statistics

| Variable | Description | Variable level | Obs. | Mean | Std. dev. | Min. | Max. |
| --- | --- | --- | --- | --- | --- | --- | --- |
| post_cite | Postretraction citations | relational | 103,245 | 0.384 | 0.486 | 0 | 1 |
| dist_embedding | Field distance (SPECTER embedding) | relational | 103,245 | 0.294 | 0.114 | 0.038 | 0.997 |
| severe | Retraction due to severe misconduct | retracted article | 103,245 | 0.612 | 0.487 | 0 | 1 |
| jif_retracted | Impact factor of retracted article | retracted article | 103,245 | 13.974 | 13.874 | 0.429 | 55.873 |
| jif_citing | Impact factor of citing article | citing article | 103,245 | 4.763 | 5.347 | 0.000 | 115.840 |
| country_distance | Degree of country overlap | relational | 103,245 | 0.782 | 0.370 | 0 | 1 |
| top_50_org | Top 50 ranking institutions | citing article | 103,245 | 0.202 | 0.402 | 0 | 1 |
| team_size | Number of authors | citing article | 103,245 | 5.412 | 3.729 | 1 | 194 |
| org_size | Number of affiliations | citing article | 103,245 | 2.774 | 2.201 | 1 | 147 |
| multi_country | Number of countries | citing article | 103,245 | 1.291 | 0.712 | 1 | 29 |
| has_west | Has affiliation from Western countries | citing article | 103,245 | 0.679 | 0.467 | 0 | 1 |
| year_citing | Citation year | citing article | 103,245 | 2009.941 | 5.778 | 1980 | 2019 |
| age | Age of retracted article | relational | 103,245 | 5.340 | 4.131 | 0 | 39 |
In Figure 4(a), we plot the average postretraction citation rate across retracted articles against the year in which they were retracted. The solid line corresponds to the postretraction citation rate where postretraction refers to citations received 1 year after retraction. To provide a more conservative estimate, we also include a dotted line representing postretraction citations made 2 years after retraction. Figure 4(a) reveals a declining trend in the postretraction citation rate over time. However, this trend may be attributable to the longer citation windows of older articles compared to more recently retracted ones. Overall, our data indicate that retracted articles, on average across time, received about 38–44% of their citations after retraction. Meanwhile, a model-free comparison of pre- and postretraction citations reveals that citations made after retraction events come from more distant fields (see Figure 4(b)).
For ease of interpretation, we standardize the field distance variable in the analysis. We also transform the jif_retracted, jif_citing, team_size, and org_size variables into binary variables with a cut at the 75th percentile. Meanwhile, because most citing publications had affiliated organizations from a single country, we convert multi_country into a binary variable that takes the value of 1 if a citing article involves more than one country.
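These transformations are sketched below in pandas, continuing the hypothetical column names used earlier.

```python
# Standardize field distance into z-score units.
pairs["dist_embedding_z"] = (
    (pairs["dist_embedding"] - pairs["dist_embedding"].mean())
    / pairs["dist_embedding"].std())

# Binarize skewed variables at their 75th percentiles.
for col in ["jif_retracted", "jif_citing", "team_size", "org_size"]:
    pairs[col + "_high"] = (pairs[col] > pairs[col].quantile(0.75)).astype(int)

# Most citing articles involve a single country, so binarize instead.
pairs["multi_country_bin"] = (pairs["multi_country"] > 1).astype(int)
```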
Table 2 reports the correlation matrix of all variables used in our analysis, before the aforementioned variables are transformed into binary or standardized form. The correlation table shows a positive correlation between our dependent variable, post_cite, and field distance (dist_embedding). Meanwhile, the relatively high positive correlations of age with both post_cite and dist_embedding suggest that citation age must be incorporated into our model to avoid overestimating the association between postretraction citation and field distance.
Table 2. Correlation matrix

| | Variables | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | post_cite | 1.000 | | | | | | | | | | | | |
| 2 | dist_embedding | 0.091 | 1.000 | | | | | | | | | | | |
| 3 | severe | −0.132 | −0.012 | 1.000 | | | | | | | | | | |
| 4 | jif_retracted | −0.016 | 0.126 | 0.138 | 1.000 | | | | | | | | | |
| 5 | jif_citing | −0.142 | 0.013 | 0.060 | 0.115 | 1.000 | | | | | | | | |
| 6 | country_distance | 0.080 | 0.014 | −0.025 | −0.042 | −0.086 | 1.000 | | | | | | | |
| 7 | top_50_org | −0.081 | 0.025 | 0.027 | 0.066 | 0.161 | −0.122 | 1.000 | | | | | | |
| 8 | team_size | 0.046 | −0.074 | −0.006 | −0.045 | 0.046 | 0.055 | 0.069 | 1.000 | | | | | |
| 9 | org_size | 0.062 | 0.009 | −0.016 | −0.030 | 0.063 | 0.034 | 0.182 | 0.641 | 1.000 | | | | |
| 10 | multi_country | 0.010 | 0.030 | −0.009 | −0.011 | 0.066 | 0.076 | 0.172 | 0.351 | 0.550 | 1.000 | | | |
| 11 | has_west | −0.206 | 0.075 | 0.072 | 0.099 | 0.204 | −0.199 | 0.260 | −0.052 | 0.079 | 0.224 | 1.000 | | |
| 12 | year_citing | 0.359 | −0.012 | −0.085 | −0.243 | −0.148 | 0.119 | −0.088 | 0.138 | 0.163 | 0.079 | −0.273 | 1.000 | |
| 13 | age | 0.481 | 0.143 | 0.023 | −0.039 | −0.130 | 0.108 | −0.068 | 0.067 | 0.075 | 0.033 | −0.144 | 0.383 | 1.000 |
5.2. Regression Results
5.2.1. Are citations from distant fields (as opposed to proximate fields) more likely to be postretraction citations?
We report our OLS fixed effects estimates of Equation 2 in columns (1) and (2) of Table 3 (a sketch of this estimation appears below the table). All models include retracted-article fixed effects, as well as citation age and citing year indicator variables, which are not reported in the table. Column (1) of Table 3 reports the model estimated without control variables, and Column (2) reports our findings with the full set of control variables. These regression results suggest that postretraction citations are more likely to be made by articles from distant fields. In Column (1) of Table 3, the estimated coefficient of dist_embedding is 0.0106 (p < 0.01), which suggests that a one standard deviation increase in field distance is associated with around a 1 percentage point increase in the probability that a citation to a retracted article occurs after retraction. Considering that the mean postretraction citation rate is around 38.4%, a one standard deviation increase in field distance increases the postretraction citation rate by around 2.76% (1.06/38.4). Once we include the full set of control variables in Column (2), the dist_embedding coefficient increases slightly to 0.0110 (p < 0.01). Therefore, these results support our first hypothesis (H1), which predicted that postretraction citations are more likely to come from distant fields.
Table 3. OLS fixed effects estimates of postretraction citation

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| dist_embedding | 0.0106*** | 0.0110*** | 0.0071*** | 0.0077*** |
| | (0.002) | (0.002) | (0.002) | (0.002) |
| jif_retracted × dist_embedding | | | 0.0114*** | 0.0111*** |
| | | | (0.004) | (0.004) |
| country_distance | | 0.0050*** | | 0.0050*** |
| | | (0.001) | | (0.001) |
| jif_citing | | −0.0125*** | | −0.0123*** |
| | | (0.002) | | (0.002) |
| top_50_org | | −0.0071*** | | −0.0070*** |
| | | (0.003) | | (0.003) |
| team_size | | −0.0065*** | | −0.0065*** |
| | | (0.002) | | (0.002) |
| org_size | | 0.0052** | | 0.0053** |
| | | (0.003) | | (0.002) |
| multi_country | | 0.0044* | | 0.0044* |
| | | (0.003) | | (0.003) |
| has_west | | −0.0209*** | | −0.0208*** |
| | | (0.003) | | (0.003) |
| constant | −3.5577** | −3.7473** | −3.5231** | −3.7136** |
| | (1.798) | (1.848) | (1.777) | (1.828) |
| R² | 0.5178 | 0.5189 | 0.5180 | 0.5190 |
| Controls | No | Yes | No | Yes |
| Retracted articles | 2,123 | 2,123 | 2,123 | 2,123 |
| Observations | 103,245 | 103,245 | 103,245 | 103,245 |
Standard errors clustered by retracted article in parentheses. All models include retracted-article, citation-age, and citing-year fixed effects. * p < 0.1; ** p < 0.05; *** p < 0.01.
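For concreteness, the following sketch illustrates how a specification of this form could be estimated as a linear probability model with standard errors clustered by retracted article. The input file and column names are hypothetical stand-ins for our variables; this is an illustrative sketch, not our exact estimation code.

```python
# Minimal sketch of the linear probability model behind Table 3, columns
# (1)-(2). The input file and column names are hypothetical stand-ins for
# our variables; this is not our exact estimation code.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("citations.csv")  # one row per citing-cited article pair

formula = (
    "post_retraction ~ dist_embedding + country_distance + jif_citing"
    " + top_50_org + team_size + org_size + multi_country + has_west"
    " + C(retracted_id) + C(citation_age) + C(citing_year)"
)

# Standard errors clustered by retracted article, as in Table 3.
model = smf.ols(formula, data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["retracted_id"]}
)
print(model.params["dist_embedding"])

# Relative effect against the mean postretraction citation rate of 38.4%:
print(0.0106 / 0.384)  # ~0.0276, i.e., the ~2.76% reported for column (1)
```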
Before moving to our second hypothesis, we describe some notable findings from the control variables. First, postretraction citations are more likely to come from articles whose authors are affiliated with countries different from those of the retracted articles' authors. Postretraction citations are also less likely to come from articles published in high-JIF journals. For example, articles published in journals above the 75th percentile of the JIF distribution among citing articles (JIF above 5.03) were associated with around a 1.25 percentage point decrease in the probability of citing retracted articles (Column (2) of Table 3), equivalent to around a 3.26% (1.25/38.4) reduction in the postretraction citation rate. Retracted articles are also less likely to be cited by authors from high-ranking institutions (top_50_org) or scientifically "core" countries (has_west). In fact, whether a citing article had any authors from these "core" countries is one of the most predictive variables, with an estimated coefficient of −2.09 percentage points (Column (2) of Table 3).
5.2.2. Differential effects of field distance by JIF of retracted articles
We now test the competing hypotheses (H2A, H2B, and H2C) by examining whether the association between field distance and the postretraction citation rate varies between high-JIF and ordinary-JIF retracted articles, introducing an interaction between dist_embedding (field distance) and jif_retracted (an indicator for retracted articles published in journals above the 75th percentile JIF of 18.43). A negative interaction effect would support the prediction from constructivist theory under engaged citation search (H2A). A null result would be consistent with the normative theory under engaged citation search (H2B). Finally, a positive interaction effect would support the heuristic citation search process under either citation motivation (H2C). Columns (3) and (4) in Table 3 report the interaction models without and with control variables, respectively. All models are estimated with OLS and retracted-article fixed effects and include citation-age and citing-year indicator variables, which we omit from the table for space. As seen in Columns (3) and (4), the estimated coefficient for field distance is positive and statistically significant, indicating a positive association between field distance and the postretraction citation rate for retracted articles from ordinary-JIF journals. We also find a positive interaction effect (p < 0.01) between field distance and the JIF of retracted articles (Column (4)). The estimated interaction coefficient of 0.0111 implies that a one-standard-deviation increase in field distance is associated with an additional increase of roughly 1.11 percentage points in the postretraction citation rate (about a 2.89% increase; 1.11/38.4) when citing retracted articles published in high-JIF journals, on top of the 0.77 percentage point increase for those citing retracted articles published in ordinary journals.
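As a quick check on this interpretation, the implied field-distance slopes for ordinary- and high-JIF retracted articles can be recovered directly from the Column (4) coefficients; the sketch below simply reproduces the arithmetic in the text.

```python
# Implied field-distance slopes from Table 3, Column (4).
beta_dist = 0.0077         # dist_embedding main effect (ordinary-JIF articles)
beta_interact = 0.0111     # jif_retracted x dist_embedding interaction
mean_rate = 0.384          # mean postretraction citation rate

slope_ordinary = beta_dist                   # 0.0077 (0.77 pp per SD)
slope_high_jif = beta_dist + beta_interact   # 0.0188 (1.88 pp per SD)

# Relative changes against the mean postretraction citation rate:
print(slope_ordinary / mean_rate)   # ~0.020, a ~2.0% relative increase
print(beta_interact / mean_rate)    # ~0.029, the additional ~2.9% for high JIF
```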
We also report predicted probabilities of postretraction citation with respect to field distance for high-JIF and ordinary-JIF retracted articles. The predicted probabilities plotted in Figure 5 come from the same specification as Column (4) of Table 3, but estimated with a random-effects model around retracted articles. Figure 5 clearly shows that the association between field distance and the postretraction citation rate differs across high-JIF and ordinary-JIF retracted articles: the association is much stronger among citations to retracted articles from high-JIF journals. This evidence, together with the regression results in Table 3, provides strong support for hypothesis H2C, suggesting a citation search process consistent with the heuristic search model.
5.3. Robustness Tests
To test the robustness of our results, we reran the models in Table 3 using alternative measures of field distance: a bag-of-words vector measure and a distance measure based on the Web of Science Categories of referenced journals (results available from the authors). For the main effect of distance, the estimated coefficients from these two alternative measures are slightly smaller in magnitude but remain positive and statistically significant, supporting H1. For the test of H2C, we find positive interaction effects for both alternative measures, statistically significant for the Web of Science Categories-based measure but not for the bag-of-words measure. Hence, our results are qualitatively robust to alternative measures of field distance.
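For readers interested in the alternative measures, the following sketch illustrates one plausible construction of a bag-of-words distance between a citing and a cited article, under the assumption that each article is represented by its title and abstract text; the sample texts are hypothetical and our production preprocessing may differ.

```python
# Sketch of a bag-of-words field-distance measure between a citing and a
# cited article. Each article is assumed to be represented by its title and
# abstract; the sample texts below are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

citing_text = "deep learning methods for protein structure prediction"
cited_text = "stem cell differentiation pathways in regenerative medicine"

vectorizer = CountVectorizer(stop_words="english")
vectors = vectorizer.fit_transform([citing_text, cited_text])

# Field distance as one minus the cosine similarity of word-count vectors.
dist_bow = 1.0 - cosine_similarity(vectors[0], vectors[1])[0, 0]
print(dist_bow)  # 1.0 here, since the two texts share no content words
```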
We also consider various scenarios for how postretraction citations are generated and how they would affect our results and interpretations. Our analysis assumes that citing authors were unaware of the retraction when citing retracted articles (regardless of whether they read the paper). Here, we discuss three scenarios in which postretraction citation generation may violate this assumption. Note that these are all scenarios that might explain why postretraction citations persist. However, while the presence of these scenarios may increase the base rate of postretraction citations, they cannot explain why postretraction citation is predicted by field distance or, furthermore, by the interaction of field distance and journal status.
5.3.1. Deliberate citations of retracted articles
First, citing authors may be aware of the retraction but cite the article anyway because they believe some findings are still valid (Bar-Ilan & Halevi, 2017). We address this concern by exploiting the severity of retraction reasons provided by the Retraction Watch data set. We manually classified 95 curated retraction reasons into three categories (minor/major/severe) based on the severity of research misconduct (see Table S3 in the Supplementary material). For example, the minor misconduct category includes reasons such as "salami-slicing" or "plagiarism,"7 the major misconduct category includes "concerns/issues about data or results," and the severe misconduct category includes reasons such as the "fabrication of data" or "fabrication of results." If a retracted article in our data set had at least one "severe" retraction reason, we classified it as a "severe" retracted article. The idea is that deliberate citations of retracted articles are more likely when the cited articles were retracted for nonsevere reasons (Bar-Ilan & Halevi, 2017). In contrast, citing a paper even after retraction for a "severe" infraction suggests that a heuristic (rather than engaged) search may have produced the citing behavior. In Table S1 of the Supplementary material, we report regressions separately for "nonsevere" retracted articles (Column (1)) and "severe" retracted articles (Column (2)). For the "nonsevere" sample, which may be contaminated with deliberate postretraction citations, the interaction effects are not statistically significant (p > 0.1). Meanwhile, for the "severe" sample, the interaction effects are positive and statistically significant (p < 0.01). Moreover, the magnitude of this effect is greater than that estimated from the full sample, further substantiating the heuristic citation search prediction (H2C). These findings are consistent when we use our alternative measures of field distance (results available from the authors). A sketch of the severity-coding rule follows.
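The severity coding can be summarized as a mapping from Retraction Watch reason strings to severity levels, with any "severe" reason classifying the whole article as severe. The sketch below shows only the illustrative reasons named in the text; the full 95-reason mapping is in Table S3.

```python
# Sketch of the severity classification of retraction reasons. Only a few
# illustrative reasons are shown; the full mapping of 95 curated Retraction
# Watch reasons appears in Table S3 of the Supplementary material.
SEVERITY = {
    "salami slicing": "minor",
    "plagiarism": "minor",
    "concerns/issues about data": "major",
    "concerns/issues about results": "major",
    "fabrication of data": "severe",
    "fabrication of results": "severe",
}

def classify_article(reasons):
    """An article with at least one 'severe' reason is coded as severe."""
    levels = [SEVERITY.get(r.lower(), "unclassified") for r in reasons]
    return "severe" if "severe" in levels else "nonsevere"

print(classify_article(["Plagiarism", "Fabrication of Data"]))  # severe
```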
5.3.2. Citation context analysis
In addition, some authors may have cited the retracted articles negatively while acknowledging the retraction. In this case, a postretraction citation does not indicate an incidence of a "false" reference. The concern is whether such incidences are correlated with our main independent variable, field distance. We would expect such negative citations to be more likely when citing authors have a substantial understanding of the topic. Therefore, to the extent that there is such a negative relationship between field distance and the tendency to cite "negatively," and assuming that "negative" citations are more likely among postretraction citations, we would underestimate the field distance effect (meaning that correcting for this would produce even stronger evidence for our hypotheses). In any case, previous studies suggest that the incidence of negative citations among postretraction citations is rather low, making this source of bias unlikely (Bar-Ilan & Halevi, 2017; Bordignon, 2020; Schneider et al., 2020).
We conducted an additional analysis using a subset of full-text data to check whether citing articles were aware of the retraction. We retrieved full-text data for 13,441 articles citing 2,560 retracted articles (21,355 citing-cited article pairs) from the Microsoft Academic Graph database8. This data set allows us to examine how retracted references are actually cited. Here, we present 20 randomly selected citation contexts for retracted articles, drawn from the subset of postretraction citations made at least 2 years after retraction, focusing specifically on articles retracted for severe reasons (see Table S3 in the Supplementary material for the classification). As this sample of citation contexts in Table S4 of the Supplementary material shows, none of them cited the retracted articles negatively or explicitly mentioned the retracted status. Furthermore, these citations appear to cite the retracted paper primarily as a foundational piece of knowledge on which the citing author builds her argument, even though the paper had been retracted 2+ years earlier for severe reasons such as data falsification or fabrication. This suggests that the citing authors were not incorporating the citation for one of the "legitimate" reasons noted above.
Continuing in this vein, we then systematically analyzed the text around citations to examine whether citing authors explicitly mentioned the retracted status of the cited articles. Out of 4,777 postretraction citation contexts referring to 1,156 retracted papers, only 83 (1.74%) explicitly mentioned the words retraction, retracted, or retract. We also analyzed whether retracted papers are more likely to be coreferenced with other references in citing manuscripts. Our rationale is that if a retracted paper is cited deliberately, or as a "negative" citation, it is more likely to be a "standalone" citation; if the citation is made in a nonengaged manner, it is more likely to be coreferenced with other papers (appearing at the same location in the manuscript in a list of citations). We examined the distributions of the number of coreferences across the 13,441 citing articles that made post- and preretraction citations. For each citing article, citation contexts are assumed to be coreferenced if they share the exact same tokens. This procedure identifies 20,999 unique coreferences, from which we can compare the pooled distributions of articles citing before and after retraction. We find that around 59.3% of preretraction citations are standalone citations, compared with only around 52.1% of postretraction citations. Furthermore, as Figure S1 in the Supplementary material indicates, coreferences are more common among postretraction citations than preretraction citations. Our full-text analysis suggests that most retracted articles are cited as if they were legitimate knowledge, and we find no evidence that such nonengaged use becomes less common after retraction than before; a sketch of both checks follows.
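To make the two full-text checks concrete, the sketch below shows (a) a regular-expression test for explicit retraction mentions in a citation context and (b) the exact-token rule for treating citation contexts as coreferenced; the context strings and reference identifiers are hypothetical simplifications of the full-text records.

```python
# Sketch of the two full-text checks: explicit retraction mentions and
# coreference detection via identical citation-context tokens.
# The context strings and reference IDs are hypothetical.
import re
from collections import defaultdict

# (a) Does a citation context explicitly mention the retraction?
RETRACT_RE = re.compile(r"\bretract(?:ion|ed)?\b", re.IGNORECASE)

def mentions_retraction(context: str) -> bool:
    return bool(RETRACT_RE.search(context))

# (b) Within a citing article, contexts with exactly the same tokens are
# treated as coreferenced (i.e., cited together at the same location).
def coreference_groups(contexts):
    groups = defaultdict(list)
    for ref_id, text in contexts:
        key = tuple(text.lower().split())  # exact-token match
        groups[key].append(ref_id)
    return list(groups.values())

contexts = [  # hypothetical citing article with three referenced works
    ("ref1", "Prior work shows X [1, 2]"),
    ("ref2", "Prior work shows X [1, 2]"),
    ("ref3", "However, Y was retracted and remains disputed [3]"),
]
print(mentions_retraction(contexts[2][1]))  # True
for group in coreference_groups(contexts):
    print(group, "standalone" if len(group) == 1 else "coreferenced")
```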
5.3.3. Publication delay
We are also concerned that some authors may have read the paper they cite but were unaware of the retraction due to publication lag. For example, it is plausible that cited articles were not yet retracted when the citing authors incorporated them but were retracted during the review process of the citing articles. We reran the analyses using the same specifications as in Table 3 but excluding citations made in the first year after retraction. The findings are consistent in direction and statistical significance with the main regression results in Table 3 (results available from the authors).
6. DISCUSSION AND CONCLUSIONS
6.1. Postretraction Citation as a Window
In our paper, we consider that the citation generation process can be abstracted as lying between engaged and heuristic search processes. If authors with either normative or constructivist citation motivations relied on engaged citation search, postretraction citations may simply result from the author's unfamiliarity with the paper's retraction. Thus, under the engaged citation search model, we could interpret postretraction citations as the result of notification failure, which can partly be addressed by increasing the visibility of retraction notices. However, we also describe a heuristic citation search process, in which authors rely on heuristics to guide their citation search (Osterloh & Frey, 2020; Wooding, 2020), particularly when searching for knowledge in unfamiliar terrain. Researchers may rely on highly visible journals, such as journals with high JIF, when they need to cite unfamiliar knowledge. In a supplemental analysis of a random sample of Web of Science publications, we show a strong positive correlation between JIF and citing-cited field distance, such that more distant citations are more likely to be to high-JIF journals (results available from the authors). We further argue that such use of heuristics can become overly mechanical. That is, to the extent that exploring knowledge is costly due to unfamiliarity, some researchers may bet on the perceived status of journals, citing an article without deep engagement with its contents (Ball, 2002; Hoerman & Nowicke, 1995; Simkin & Roychowdhury, 2005), perhaps, for example, because they are drawing on others' citations to the paper. In the previous section, we showed that our findings are robust even after accounting for retraction lags and for citing authors who may have deliberately or negatively cited retracted articles. Thus, our evidence suggests that postretraction citations are not all "honest mistakes" or simple laziness stemming from researchers' failure to check retraction notices. Instead, part of the phenomenon appears to be a direct consequence of a systematic citation search behavior that may not require careful reading of the cited papers and, especially, of the use of highly visible journals as guideposts when encountering unfamiliar knowledge (Osterloh & Frey, 2020). More importantly, we argue that our postretraction citation analysis may illuminate citation practice more generally, as the citation search process (relying on heuristics rather than engaged search) is likely to generalize to citations more broadly. This seems more plausible than a theory in which authors use one process for drawing on retracted literature and a different one for nonretracted literature.
In addressing the long-standing question of what it means to cite a paper in science (Bornmann & Daniel, 2008), our findings provide evidence consistent with a citation search process that operates in addition to, or instead of, conventional understandings of citation practices. We are still left with a few important questions. Why would researchers use highly visible journals as a heuristic, and, more importantly, why would someone use them in a perfunctory manner? Although answering these questions is beyond the scope of our paper, we offer a few potential explanations. First, citing highly visible journals when searching unfamiliar terrain may be an acceptable response for boundedly rational individuals facing enormous search costs. With the increasing rate of publication, researchers simply cannot identify and evaluate all possibly relevant articles (Simon, 1997). The increasing reliance on article recommendation systems is meant partly to address this issue. In this sense, just as the JIF was originally developed to help librarians sort through a flood of information (Garfield, 2006), researchers may rely on the visibility of journals, as signaled by JIF, to identify which articles are potentially more important. Part of this behavior may also stem from how we tend to equate status with JIF. Papers published in high-JIF journals are perceived to be more legitimate by citing authors, reviewers, and future readers, which incentivizes researchers to cite articles from high-JIF journals. The use of JIF in this manner also extends beyond citation practices. For example, Biagioli and Lippman (2020) compare the use of JIF by academic hiring committees to how futures contracts work in financial markets: some evaluators judge published articles not on their contents but on a crude estimate of how many citations they are expected to generate in the future, even though the skewed nature of citation distributions renders such predictions ineffective (Larivière, Kiermer et al., 2016). What our paper shows is that such reliance can become overly mechanical, much as in Biagioli and Lippman's (2020) example.
This brings us back to an interesting debate between the normative and constructivist theories of citation. As discussed in Section 2.1, one heated debate was over whether citation based on "argument from authority," such as citing eminent authors or articles, constitutes a violation of the normative view (Leydesdorff, 1987; MacRoberts & MacRoberts, 1987; Zuckerman, 1987). However, if the authority comes from citing high-status journals, as our findings suggest, a citation can be generated irrespective of whether the citing authors have read or been influenced by the contents of the cited works (normative view) or by the position of the cited works and authors within the stratification structure of science (constructivist view). Thus, to the extent that the citation behavior we identify generalizes to citation practices at large, there exists a Matthew Effect of JIF (Larivière & Gingras, 2010): an independent channel by which high-JIF articles garner additional citations. Furthermore, to the extent that citations produce both symbolic and material rewards (jobs, promotions, etc.), such heuristics can lead to a misallocation of resources in science.
If such mechanical citation practices constitute a nontrivial share of actual citation counts, why have existing theories been unable to explain them? First, we argue that it is through analyzing postretraction citations that such citation behaviors can be uncovered. Second, and more importantly, we argue that existing citation theories may rest on an idealized notion, or a narrow definition, of a scientist, drawing evidence and accounts from selected scientific documents and field studies of what may now be considered the "core" of science, consistent with what we call the engaged search process. However, over the last century, just as there has been enormous growth in publication activity (Milojević, 2015; Price, 1963), the population of contributing authors has grown and diversified in terms of their roles in the production of science (Hackett, 1990; Hagstrom, 1964; Larivière et al., 2013; Milojević, Radicchi, & Walsh, 2018; Walsh & Lee, 2015), nationalities (Maisonobe, Grossetti et al., 2017; Zhou & Leydesdorff, 2006), and organizations (Hicks, 1995; Li, Youtie, & Shapira, 2015). The citation behaviors of this broader population of authors need not follow the existing descriptions, which may have been based on a narrow definition of a once homogeneous population of scientists. In fact, prior studies have shown that the increasing adoption of performance evaluation measures by institutions, mostly in developing countries, may have created perverse incentives to publish (Franzoni, Scellato, & Stephan, 2011), which may have led to increasing publication counts accompanied by many instances of research misconduct (Biagioli & Lippman, 2020; Biagioli, Kenney et al., 2019; Walsh et al., 2019). When publishing becomes merely a means to an end (Price, 1963; Shibayama & Baba, 2015), it is not difficult to expect citation to become a ceremonial practice. While the diversification of author demography may be partly to blame, given that our findings suggest that authors from non-Western countries and nonelite institutions were more likely to cite retracted articles, it is important to note that our main effects persist even after we control for these variables. Therefore, our findings may suggest that when researchers face conditions that incentivize scholarly communication as an end in itself, combined with increasing demands for productivity and a rapidly expanding knowledge base to account for, the conditions for severe bounded rationality and an increasing reliance on heuristics are created. Under such conditions, we are more likely to observe citation practices that do not reflect traditional notions of engaged citation.
6.2. Policy Implications
Our findings carry a few important policy implications. First, they bear on the continuing spread of false references in science, an issue that has become an important policy concern amid increasing misinformation and misuse of scientific knowledge (West & Bergstrom, 2021). To address this issue, we can examine different phases of the signaling pathway of false references. One solution is to intervene at the reader level, flagging retraction notices so that readers avoid directly citing retracted articles. To our knowledge, increasing retraction visibility has been the most widely suggested policy recommendation (Bar-Ilan & Halevi, 2017; Campanario, 2000; Cox, Craig, & Tourish, 2018; Da Silva & Bornemann-Cimenti, 2017; Schneider et al., 2020; Unger & Couzin, 2006). Indeed, while constructing our data set, we found that many journals and indexing databases, such as PubMed and Web of Science, fail to display retraction notices, suggesting room for policy intervention. However, this intervention rests on a strong, and likely fallacious, assumption: that postretraction citations are "honest mistakes," such that all authors who cited retracted articles read the paper yet were unaware of its retraction (engaged citation search). Given that postretraction citations may also be driven by authors whose heuristic search generates shortcuts in the process of incorporating citations into their papers, interventions at the reader level may not eradicate the spread of false references. In this regard, we argue that intervention at the journal level may be much more effective. Without placing an additional tax on already burdened reviewers and editors (West & Bergstrom, 2021), journals and publishers could implement an automated retraction detection system (Bar-Ilan & Halevi, 2017; Bornemann-Cimenti, Szilagyi, & Sandner-Kiesling, 2016). The idea is not for journals or publishers to forbid authors from citing retracted articles, but to flag them with a warning, as sketched below. Such a system could alert authors at several stages, as references can easily change during a paper's path to publication. Alternatively, it could be implemented at the final copyediting/proofing stage, when other information about the references (such as missing page numbers) is already routinely checked.
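As one illustration of the journal-level intervention we have in mind, the sketch below checks a manuscript's reference DOIs against a locally maintained list of retracted DOIs (e.g., exported from the Retraction Watch database) and emits warnings rather than rejections; the file name and column names are hypothetical.

```python
# Sketch of an automated retraction check at the journal level. Assumes a
# locally maintained CSV of retracted DOIs (e.g., exported from the
# Retraction Watch database); file name and column names are hypothetical.
import csv

def load_retracted_dois(path="retracted_dois.csv"):
    with open(path, newline="") as f:
        return {row["doi"].strip().lower() for row in csv.DictReader(f)}

def flag_retracted_references(reference_dois, retracted_dois):
    """Warn rather than forbid: authors may cite a retraction knowingly."""
    return [
        f"Reference {doi} appears to be retracted; please verify the "
        f"retraction notice and confirm the citation is intentional."
        for doi in reference_dois
        if doi.strip().lower() in retracted_dois
    ]

retracted = load_retracted_dois()
for warning in flag_retracted_references(["10.1000/example.123"], retracted):
    print(warning)
```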
Postretraction citation is one form of misinformation in science (West & Bergstrom, 2021), though it may carry no intent to deceive. Yet it is important to note that such behavior is not entirely due to honest mistakes or laziness; rather, it partly reflects systematic, and perhaps mechanical, use of high-JIF journals when citing distant knowledge. In this sense, the spread of false references in scholarly communication is systematic, and if we can extrapolate our findings to general citation practices, it may also involve the spread of irrelevant references or, in extreme cases, of references that do not exist (Harzing & Kroonenberg, 2016; Katz, 2006; Leng, 2020). In fact, the tendency of AI models such as ChatGPT to generate fictitious references (Walters & Wilder, 2023), combined with the tendency to copy references embedded in others' publications (secondary referencing), creates a risk that incorrect citations will proliferate in the literature through any of several processes. Therefore, the solutions mentioned above would partly reduce the circulation of retracted references, but they may do little to stop the spread of irrelevant but nonretracted references; addressing that would require substantial changes in publication practices (publication as an end in itself) and, in particular, in how we use journal rank as academic currency at both the institutional (Biagioli & Lippman, 2020) and individual (Larivière et al., 2016; Osterloh & Frey, 2020) levels. While engaged search may be the ideal-type practice, heuristic search may be common, and perhaps increasingly so as the burden of knowledge increases (Jones, 2009).
Last, this paper contributes to the continuing debate on how cumulative advantage, or the Matthew Effect (Merton, 1968), operates through the attributions given to high-JIF journals. Prior studies have suggested numerous mechanisms behind the cumulative advantage enjoyed by high-JIF journals, including increased visibility and status-seeking citation motivations (Drivas & Kremmydas, 2020; Larivière & Gingras, 2010; Traag, 2021). If our proposed citation behavior model can be extrapolated to general citation practice, we would expect some researchers to superficially cite references from high-JIF journals without thorough examination, especially when they are unfamiliar with the topic, compared to when they draw knowledge from familiar fields. While this does not imply that distant citations are necessarily perfunctory9, policies that aim to reward "broad" impacts should be implemented with caution, especially when constructing a scope-based impact measure of a publication using the interdisciplinarity of its citations.
ACKNOWLEDGMENTS
The authors would like to thank two reviewers for their helpful feedback, which has significantly enhanced our paper. We extend our gratitude to the participants of the 2022 Workshop on the Organisation, Economics, and Policy of Scientific Research (WOEPSR) at KU Leuven for their invaluable comments. We also thank Paula Stephan, Philip Shapira, Stasa Milojevic, Mary Fox, Juan Rogers, Cassidy Sugimoto, and Seokbeom Kwon for their critical and constructive comments and insights on our paper. We are grateful to Retraction Watch for generously supplying retraction data and to the Georgia Institute of Technology for granting access to the Clarivate Web of Science.
AUTHOR CONTRIBUTIONS
Seokkyun Woo: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing—original draft, Writing—review & editing. John P. Walsh: Conceptualization, Methodology, Writing—original draft, Writing—review & editing.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
No external funding was received for this study.
DATA AVAILABILITY
Our main bibliometric data are proprietary to Clarivate Analytics and therefore cannot be disclosed. However, we have made available the data sets and code for replicating our figures and regressions, which can be downloaded from the following link: (https://doi.org/10.5281/zenodo.10692463) (Woo, 2024).
Notes
Merton called this "obliteration by incorporation," which bears a resemblance to Latour's "black box" in this context.
One limitation of this finding lies in the measures of content relevance and perceived quality, which were proxied by the number of figures and tables per article and by the number of citations, respectively.
We thank the reviewers for these examples.
We replicated our analysis using various minimum-citation thresholds (2, 5, 20, 50, and 100). The results using these thresholds are consistent with our main analysis.
While the logistic regression model is generally preferred for a binary outcome variable, we use OLS for the following reasons. First, although the logistic regression model generally fits better than the linear model, the relationship between probability and log odds (which is a linear function of our covariates) is quasi-linear between probabilities of 0.2 and 0.8 (Long, 1997; Von Hippel, 2015), a range within which much of our postretraction citation probability falls (see the numerical sketch after this note). Second, for the interaction models testing our second set of hypotheses (H2A–H2C), OLS coefficients are more straightforward to interpret, whereas logistic regression requires calculating and reporting the range of marginal effects for the interaction across the data. Finally, our estimating model, which includes several sets of many indicator variables, makes estimating logistic regressions (both conditional and unconditional fixed-effects models) intractable due to the quasi-complete separation problem (Allison, 2008). We addressed this problem by combining the citation-age indicator variables into fewer categories. After dropping retracted articles without variation in the dependent variable (a condition for estimating logistic regression), we conducted both logistic and OLS regressions on this new data set, and the results were consistent with our main findings. We also ran a regression model that explicitly predicts citation distance using postretraction citation. As shown in Table S2 (Supplementary material), our results are consistent under this alternative specification: we find an increase in citation distance after the retraction event, and this effect is greater among articles citing retracted articles published in high-impact-factor journals.
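As a quick numerical illustration of the quasi-linearity claim above, the following sketch compares the logistic curve to its linear approximation at p = 0.5 over the 0.2-0.8 probability range; it is illustrative only and not part of our estimation.

```python
# Numerical check: the logistic curve is quasi-linear where 0.2 <= p <= 0.8.
import numpy as np

log_odds = np.linspace(np.log(0.2 / 0.8), np.log(0.8 / 0.2), 201)
p = 1.0 / (1.0 + np.exp(-log_odds))   # logistic probabilities

# Linear approximation at p = 0.5, where the slope is p(1 - p) = 0.25.
p_linear = 0.5 + 0.25 * log_odds

print(np.abs(p - p_linear).max())  # ~0.047: under 5 percentage points
```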
We are not arguing that plagiarism is a minor problem. Rather, we are coding this as minor with reference to its likely impact on the validity of the cited finding (as false attribution of authorship does not affect the content of the findings).
Our main finding suggests P(false_cite ∣ distant_cite, high_JIF) ≥ P(false_cite ∣ proximate_cite, high_JIF). However, this does not suggest P(false_cite ∣ distant_cite, high_JIF) ≥ P(true_cite ∣ distant_cite, high_JIF), because the probability of false citation is generally far smaller than the probability of true citation (and so is the joint probability of false citations, distant citations, and citing high-JIF articles).