In this article, we show and discuss the results of a quantitative and qualitative analysis of open citations of retracted publications in the humanities domain. Our study was conducted by selecting retracted papers in the humanities domain and marking their main characteristics (e.g., retraction reason). Then, we gathered the citing entities and annotated their basic metadata (e.g., title, venue, subject) and the characteristics of their in-text citations (e.g., intent, sentiment). Using these data, we performed a quantitative and qualitative study of retractions in the humanities, presenting descriptive statistics and a topic modeling analysis of the citing entities’ abstracts and the in-text citation contexts. Among our main findings, we noticed no drop in the overall number of citations after the year of retraction, with only a few citing entities mentioning the retraction or expressing a negative sentiment toward the cited publication. In addition, on several occasions, we noticed greater concern/awareness about citing a retracted publication among citing entities belonging to the health sciences domain than among those in the humanities and social sciences. Philosophy, arts, and history are the humanities areas that showed the greatest concern toward retraction.
Retraction is a way to correct the scholarly literature and alert readers to erroneous published material. A retraction should be formally accompanied by a retraction notice—a document that justifies such a retraction. Reasons for retraction include plagiarism, peer review manipulation, and unethical research (Barbour, Kleinert et al., 2009).
Several works in the past have studied and uncovered important aspects regarding this phenomenon, such as the reasons for retraction (Casadevall, Steen, & Fang, 2014; Corbyn, 2012), the temporal characteristics of the retracted articles (Bar-Ilan & Halevi, 2018), their authors’ countries of origin (Ataie-Ashtiani, 2018), and the impact factor of the journals publishing them (Campos-Varela, Villaverde-Castañeda, & Ruano-Raviña, 2020; Fang & Casadevall, 2011). Other works have analyzed authors with a higher number of retractions (Brainard, 2018), and the scientific impact, technological impact, funding impact, and Altmetric impact in retractions (Feng, Yuan, & Yang, 2020). Other studies focused on the retraction in the medical and biomedical domain (Campos-Varela, Villaverde-Castañeda, & Ruano-Raviña, 2020; Gaudino, Robinson et al., 2021; Gasparyan, Ayvazyan et al., 2014).
Scientometricians have also proposed several works on retraction based on quantitative data. For instance, several works (Azoulay, Bonatti, & Krieger, 2017; Lu, Jin et al., 2013; Mongeon & Larivière, 2016; Shuai, Rollins et al., 2017) focused on showing how a single retraction could trigger citation losses through an author’s prior body of work. Bordignon (2020) investigated the different impacts that negative citations in articles and comments posted on postpublication peer review platforms have on the correction of science, while Dinh, Sarol et al. (2019) applied descriptive statistics and ego-network methods to examine 4,871 retracted articles and their citations before and after retraction. Other authors focused on the analysis of the citations made before the retraction (Bolland, Grey, & Avenell, 2021) and on a specific reason for retraction, such as misconduct (Candal-Pedreira, Ruano-Ravina et al., 2020). The studies that considered only one retraction case usually also observed the in-text citations and the related citation context in the articles citing retracted publications (Bornemann-Cimenti, Szilagyi, & Sandner-Kiesling, 2016; Luwel, van Eck, & van Leeuwen, 2019; Schneider, Ye et al., 2020; van der Vet & Nijveen, 2016).
Although citation analysis concerning retraction has been done several times in Science, Technology, Engineering, and Mathematics (STEM) disciplines, less attention has been given to the humanities domain. One of the rare analyses done in the humanities domain was recently presented by Halevi (2020), who considered two examples of retracted articles and showed their continuous postretraction citations.
Our study seeks to expand the work concerning the analysis of citations of retracted publications in the humanities domain. By combining quantitative analysis (the quantification of citations and their related characteristics/metadata) with qualitative analysis (a subjective examination of aspects related to the quality of the citations, e.g., the reason for a citation based on the examination/interpretation of its in-text citation context), we aim to understand this phenomenon in the humanities, which has gained little attention in the past literature. In particular, the research questions (RQ1–RQ3) we aim to address are:
RQ1: How did scholarly research cite retracted humanities publications before and after their retraction?
RQ2: Did all the humanities areas behave similarly concerning the retraction phenomenon?
RQ3: What were the main differences in citing retracted publications between STEM disciplines and the humanities?
In this paper, we use a methodology developed to gather, characterize, and analyze incoming citations of retracted publications (Heibi & Peroni, 2022), adapted for the case of the humanities1. The citation analysis is based on collections of open citations (i.e., data are structured, separate, open, identifiable, and available) (Peroni & Shotton, 2018, 2020).
2. DATA GATHERING
The workflow followed to gather and analyze the data in this study is based on the methodology introduced in Heibi and Peroni (2022), briefly summarized in Figure 1. The first two phases of the methodology are dedicated to the collection and characterization of the entities that have cited the retracted publications. The third phase is focused on analyzing the information annotated in the first two phases to summarize quantitatively the data collected. The fourth and final phase applies a topic modeling analysis (Barde & Bainwad, 2017) on the textual information (extracted from the full text of the citing entities) and builds a set of dynamic visualizations to enable an overview and investigation of the generated topics. The data gathering of our study is detailed in the following sections.
2.1. Retraction in the Humanities
First, we wanted a descriptive statistical overview of retractions in the humanities as a function of crucial features (e.g., reasons for retraction) to help us define the set of retractions to use as input in the next phases. Thus, we queried the Retraction Watch database (https://retractiondatabase.org; Collier, 2011), searching for all the retracted publications labeled as humanities (marked with “HUM” in the database). Accordingly, the humanities domain considered in this work is based on the subject classification used by Retraction Watch (i.e., the subjects under the macro category “(HUM) Humanities”). Then we classified the results as a function of three parameters: the year of the retraction, the subject area of the retracted publications (architecture, arts, etc.), and the reason(s) for the retraction. We collected an overall number of 474 publications; the earliest retraction occurred in 2002 and the most recent in 2020.
As shown in Figure 2, we noticed an increasing trend throughout the years, with some exceptions. In particular, we observed that the highest number of retractions per year was 119 in 2010, probably due to an investigation and a massive retraction of several articles belonging to one author, Joachim Boldt (Brainard, 2018). When looking at the subject areas, we noticed that most of the retractions are related to arts and history, and plagiarism motives2 were by far the most representative ones, confirming the observation in Halevi (2020). Most of the retracted publications (88%) are of article type (i.e., labeled in Retraction Watch as either “Conference Abstract/Paper,” “Research Article,” or “Review Articles”). Book chapters/References represent 8% of the total, and the rest are “Commentary/Editorials” (1%), and other residual types (3%, e.g., letters, case reports, articles in press).
2.2. Retracted Publications Set and their Citations
As the focus of our study is on the analysis of citations of fully retracted publications, we excluded all the retracted publications collected in the previous step that did not receive at least one citation according to two open citation databases: Microsoft Academic Graph (MAG, https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/) (Wang, Shen et al., 2020) and OpenCitations’ (2020) COCI (https://opencitations.net/index/coci) (Heibi, Peroni, & Shotton, 2019). MAG is a knowledge graph that contains scientific publication records, citations, authors, institutions, journals, conferences, and fields of study. It also provides a free REST API service to search, filter, and retrieve its data. COCI is a citation index that contains details of all the DOI-to-DOI citation links retrieved by processing the open bibliographic references available in Crossref (Hendricks, Tkaczyk et al., 2020), and it can be queried using open and free REST APIs. We decided not to use other proprietary, nonopen databases because we aimed to make our workflow and results as reproducible as possible.
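As a minimal sketch of how these open citation data can be queried, the helper below builds the COCI API URL that lists the citations received by a given DOI (the endpoint path reflects COCI's documented v1 REST API at the time of writing; the function name is ours):

```python
from urllib.parse import quote

COCI_API = "https://opencitations.net/index/coci/api/v1"

def citations_url(doi):
    """URL of the COCI endpoint listing every citation received by a DOI."""
    return f"{COCI_API}/citations/{quote(doi, safe='/')}"
```

Fetching this URL (e.g., with `urllib.request`) returns, at the time of writing, one JSON record per incoming citation, including the citing DOI.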
After querying COCI and MAG3, we found that 85 retracted items (out of 474) had at least one citation (2,054 citations). We manually checked the data set for possible mistakes introduced by the collections. Indeed, some of the citing entities identified in MAG did not include a bibliographic reference to any of the retracted publications; in other cases, the retracted publication under consideration was not cited in the body of the citing entity (although present in its reference list), or the citing entity’s type did not correspond to a scholarly publication (e.g., bibliography, retraction notice, presentation, data repository). There was also one article retracted for duplication, “The Nature of Creativity” by Sternberg (2006), that received 1,050 citations. This retracted article contains a substantial amount of content published by the same author in several of his previous works; it was the fourth retracted article by this author, who used to cite himself at a high rate while not doing enough to encourage diversity in psychology research. We decided to exclude it from our study to reduce bias in the results. Following these considerations, the final number of retracted publications considered was 84, involving a total number of 935 unique citing entities. As shown in the bubble chart in Figure 3, most of the citing entities (i.e., 891) were included in MAG; 388 were included in COCI; and the two collections shared 344 entities.
Although the retracted items identified so far were all in the humanities domain according to the categories specified in Retraction Watch, an item might have other nonhumanities subjects associated with it. Sometimes, these nonhumanities subjects might be more representative of the content of the retracted document and, thus, they might generate an unwanted bias for the rest of the analysis. For instance, consider the retracted article “The good, the bad, and the ugly: Should we completely banish human albumin from our intensive care units?” (Boldt, 2000). In Retraction Watch, the subjects associated with it were medicine and journalism. Yet, when we checked the full text of the article, we noticed that the argumentation related to journalism is minimal and, as such, the article should not be considered as belonging to humanities research.
To avoid considering these peculiar publications in our analysis, we devised a mechanism to help us evaluate the affinity of each retracted item to the humanities domain. We assigned to each retracted item in the list (84) an initial score of 1, named hum_affinity—this value ranges from 0 (i.e., very low) to 5 (i.e., very high). The final value of hum_affinity for each retracted item is calculated as follows:
1. We assigned to each retracted item additional subject categories obtained by searching the venue where it was published in external databases—we used the Scimago classification (https://www.scimagojr.com/) for journals and the Library of Congress Classification (LCC, https://www.loc.gov/catdir/cpso/lcco/) for books/book chapters.
2. If both the Retraction Watch subjects and those gathered in step (1) included at least one subject identifying a discipline in the humanities, we added 1 to the hum_affinity of that item.
3. If all the Retraction Watch subjects are part of the humanities domain, we added another 1 to the hum_affinity of that item.
4. If the title of the retracted item has a clear affinity to the humanities (e.g., “The origins of probabilism in late scholastic moral thought”), we added another 1 to the hum_affinity of that item.
5. Finally, we provided a subjective score of −1, 0, or 1 based on the abstract of the item. For instance, we assigned 1 to the abstract of the retracted article of Mößner (2011): “… This paper aims at a more thorough comparison between Ludwik Fleck’s concept of thought style and Thomas Kuhn’s concept of paradigm. Although some philosophers suggest that these two concepts ….”
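The scoring rules above can be condensed into a short function, given here as a hypothetical sketch (the boolean inputs, and the assumption that the subjective abstract score is simply added to the running total, are ours rather than the authors' code):

```python
def hum_affinity(shared_hum_subject, all_rw_subjects_hum,
                 title_clearly_hum, abstract_score):
    """Compute the 0-5 humanities-affinity score of a retracted item.

    shared_hum_subject:  both the Retraction Watch subjects and the
                         venue-derived subjects include a humanities one
    all_rw_subjects_hum: all Retraction Watch subjects are humanities
    title_clearly_hum:   the title has a clear humanities affinity
    abstract_score:      subjective -1/0/+1 judgment on the abstract
    """
    assert abstract_score in (-1, 0, 1)
    score = 1  # initial score assigned to every retracted item
    score += int(shared_hum_subject)
    score += int(all_rw_subjects_hum)
    score += int(title_clearly_hum)
    score += abstract_score
    return score
```

For instance, an item whose subjects all belong to the humanities, with a clearly humanities-related title and a positively judged abstract, reaches the maximum score of 5.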
The pie chart in Figure 3 shows how we classified the retracted publications and those citing them according to their hum_affinity score. To narrow our analysis and reduce bias, we decided to consider only the retracted publications (and their corresponding citing entities) having a medium or high hum_affinity score (i.e., ≥ 2). Twelve retracted publications were excluded from the analysis (i.e., hum_affinity < 2), along with their 257 citations. A list of the excluded retracted publications is available in the Zenodo repository (Heibi & Peroni, 2021b). At the end of this phase, the final number of retracted items we considered was 72, with 678 citing entities.
2.3. Annotating the Citation Characteristics
Once the 72 retracted items and their related 678 citing entities were collected, we wanted to characterize such citing entities with respect to their basic metadata and full-text content.
2.3.1. Gathering citing entities metadata
For each citing entity, we retrieved its basic metadata (i.e., DOI (if any), year of publication, title, venue ID (ISSN/ISBN), and venue title) via the COCI and MAG REST APIs. Then, using the Retraction Watch database, we annotated whether the citing entity was fully retracted as well.
We also classified the citing entities into areas of study and specific subjects, following the Scimago Journal Classification (https://www.scimagojr.com/), which uses 27 main subject areas (medicine, social sciences, etc.) and 313 subject categories (psychiatry, anatomy, etc.). We searched for the titles and IDs (ISSN/ISBN) of the venues of publication of all the citing entities and classified them into specific subject areas and subject categories. For books/book chapters, we used the ISBNDB service (https://isbndb.com/) to look up the related Library of Congress Classification (LCC, https://www.loc.gov/catdir/cpso/lcco/), and then we mapped the LCC categories into a corresponding Scimago subject area using an established set of rules detailed in Heibi and Peroni (2022).
2.3.2. Extracting textual content features
We extracted the abstract of each citing entity and all its in-text citations of the retracted publications in our set, marking the reference pointers to them (i.e., the in-line textual devices used to refer to bibliographic references), the section where they appear, and their citation context4. The citation context is based on the sentence that contains the in-text reference pointer (i.e., the anchor sentence), plus the preceding and following sentences5. The definition of this citation context is based on the study of Ritchie, Robertson, and Teufel (2008). We annotated the first-level sections containing the in-text citations with their type, using the categories “introduction,” “method,” “abstract,” “results,” “conclusions,” “background,” and “discussion” listed in Suppe (1998) when the section’s rhetorical role was clear from its title; otherwise, we used three other residual categories—“first section,” “middle section,” and “final section”—depending on their position in the citing entity.
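This three-sentence window can be sketched as follows (our own formulation of the Ritchie et al.-style context, assuming the section text has already been split into a list of sentences):

```python
def citation_context(sentences, anchor_index):
    """Return the anchor sentence plus its immediate neighbors.

    `sentences` holds the sentences of a section; `anchor_index` points at
    the sentence containing the in-text reference pointer. Boundary cases
    (a pointer in the first or last sentence) yield a two-sentence context.
    """
    start = max(0, anchor_index - 1)
    return sentences[start:anchor_index + 2]
```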
Then, we manually annotated each in-text citation with three main features: the citation sentiment conveyed by the citation context, whether the citation context mentioned the retraction of the cited entity, and the citation intent. The annotation of the citation sentiment is inspired by the classification proposed in Bar-Ilan and Halevi (2017), and we marked each in-text citation with one of the following values:
positive, when the retracted publication was cited as sharing valid conclusions, and its findings could also have been used in the citing entity;
negative, if the citing entity cited the retracted publication and addressed its findings as inappropriate and/or invalid; and
neutral, when the author of the citing entity referred to the retracted publication without including any judgment or opinion regarding its validity.
Then, we annotated each citing entity with yes/no depending on whether any in-text citation context we gathered from it did/did not explicitly mention the fact that the cited entity was retracted. Finally, we annotated the intent of each in-text citation. The citation intent (or citation function) is defined as the authors’ reason for citing a specific publication (e.g., the citing entity uses a method defined in the cited entity). To label such citation functions, we used those specified in the Citation Typing Ontology (CiTO, https://purl.org/spar/cito) (Peroni & Shotton, 2012), an ontology for the characterization of factual and rhetorical bibliographic citations. We used the decision model developed and adopted in Heibi and Peroni (2021a) to decide which citation function to select when labeling an in-text citation. Figure 4 shows part of the decision model; it presents the case in which the intent of the citation is “Reviewing and eventually giving an opinion on the cited entity” and the citation function is part of one of the following groups: “Consistent with,” “Inconsistent with,” or “Talking about.”
3. RESULTS AND ANALYSIS
We have produced an annotated data set containing 678 citing entities and 1,020 in-text citations of 72 retracted publications. We have published a dedicated web page (https://ivanhb.github.io/ret-analysis-hum-results/) embedding visualizations that enable the readers to view and interact with the results, also available in Heibi and Peroni (2021b).
In the following sections, we introduce some important concepts adopted in the description and organization of our results. Then we show the results of quantitative and qualitative analyses of all the data we collected.
3.1. Data Organization
We defined three periods to distribute the citations of retracted publications:
Period P-Pre—from the year of publication of the retracted work to the year before its full retraction (the year of the retraction is not part of this period).
Period P-Ret—the year of the full retraction.
Period P-Post—from the year after the full retraction to the year of the last citation received by the retracted publication, according to the citation data we gathered.
Each citing entity falls under one of the above three periods. The two periods P-Pre and P-Post were split into fifths, labeled “[−1.00, −0.61],” “[−0.60, −0.21],” “[−0.20, 0.20],” “[0.21, 0.60],” and “[0.61, 1.00].” When a citing entity is part of either P-Pre or P-Post, it is also part of a specific fifth, which identifies how close or far that entity is to or from the events defining the period.
The division into fifths helped us define a uniform time span to locate the citing entities independently of the year of retraction of the work they cite and the publication years of the citing and cited entities6. For instance, if an entity A published in 2011 had cited a retracted publication R published in 2002, fully retracted in 2012, then A is part of the last fifth (i.e., “[0.61, 1.00]”) of P-Pre. This means that A has cited R in the last fifth, immediately before the formal retraction of R.
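The mapping from years to fifths can be sketched as follows (our reconstruction of the scheme: a citing year is normalized linearly onto [−1, 1] within its period and then bucketed into the five labeled fifths; the function name and boundary handling are assumptions):

```python
def fifth_label(citing_year, period_start, period_end):
    """Return the labeled fifth of a period (P-Pre or P-Post) a citing year falls in."""
    labels = ["[-1.00, -0.61]", "[-0.60, -0.21]", "[-0.20, 0.20]",
              "[0.21, 0.60]", "[0.61, 1.00]"]
    if period_end == period_start:  # one-year period: central fifth
        return labels[2]
    # normalize the year to [-1, 1] across the period
    pos = -1 + 2 * (citing_year - period_start) / (period_end - period_start)
    for upper, label in zip((-0.605, -0.205, 0.205, 0.605), labels):
        if pos <= upper:
            return label
    return labels[4]

# E.g., for A (2011) citing R (published 2002, retracted 2012),
# P-Pre spans 2002-2011 and A falls in the last fifth:
fifth_label(2011, 2002, 2011)  # "[0.61, 1.00]"
```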
3.2. Descriptive Statistics
We have classified the distribution of the citing entities in the three periods (i.e., P-Pre, P-Ret, and P-Post) as a function of the humanities disciplines used in Retraction Watch, as shown in Figure 5. Religion was the discipline that received the highest number of citations (375), and history had the highest number of retracted items (20).
In Figure 6 we have classified the entities citing a retracted publication in each discipline according to their subject areas. Arts and humanities and Social sciences (AH&SS) were highly represented in both the P-Pre and P-Post periods of almost all the retracted publications’ disciplines. However, we noticed some exceptions to this rule in P-Pre in Journalism (10% of citing entities were AH&SS publications), P-Post in Arts (13% AH&SS publications), and P-Pre and P-Post of Architecture (no AH&SS publications in either period).
Because we expected, as also highlighted in previous studies (e.g., Ngah & Goi, 1997), that a good part of the citations of humanities publications come from AH&SS publications, we decided to look more deeply into the obtained results before moving on to the next stage. As shown in Figure 5, we noticed that Journalism behaves completely differently from the other disciplines. Indeed, the citing entities in Journalism cited three retracted publications: two with a hum_affinity of 3, and one with a hum_affinity of 2. The latter article was “Personality, stress and disease: Description and validation of a new inventory” (Grossarth-Maticek & Eysenck, 1990). This article has 130 citations (almost 95% of all the citations in Journalism). Retraction Watch labeled this article with two additional subject areas, Public Health and Safety and Sociology; Journalism is therefore its only humanities subject. A further investigation of the full text of the paper revealed that this article is highly related to the health sciences, and journalism has a marginal (almost absent) relevance in it. Given these findings, we felt that this article could represent a significant bias in our analysis. Therefore, to limit its impact on the results, we decided to exclude it from our analysis.
As a further check, we investigated all the retracted publications in all the humanities disciplines in Figure 6 that received less than 20% of their citations from Arts and humanities publications in either P-Pre or P-Post. Arts and Architecture are the two disciplines falling into this category. After a manual check, we detected the article “A systematic review on postimplementation evaluation models of enterprise architecture artefacts” (Nikpay, Ahmad et al., 2020), classified under Architecture; however, on reading its full text, we found little evidence supporting this labeling, as it is a computer science study. Therefore, we decided to also exclude this article from our analysis.
After this data refinement, our final data were reduced to 546 citing entities and 786 in-text citations of 70 retracted publications. Considering the final data and the classification of the retracted publications based on their humanities discipline, we investigated another aspect: In Figure 7 we have plotted the total number of citations gained by each humanities discipline as a function of the number of years passed after the date of retraction. This trend is compared to the average time of retraction for each humanities discipline. From Figure 7, we noticed that on average disciplines such as religion and philosophy reported their peak in the year before their retraction, while this trend is the opposite for history, arts, and architecture.
To infer other interesting statistics regarding the obtained results, we treated the citing entities and the in-text citations they contain as two different classes, and we present descriptive statistics of these two classes in the following subsections.
3.2.1. Citing entities
We examined the distribution of the citing entities to retracted publications as a function of two features: the periods (i.e., P-Pre, P-Ret, and P-Post), further classified into those that mentioned the retraction or for which we could not access the full text; and their subject areas. The results are shown in Figure 8.
The number of citing entities before the retraction (192, period P-Pre) was lower than the number of citing entities after the retraction (260, period P-Post). Along P-Pre and P-Ret, we noticed a continuous increment in the overall number of citing entities, which suddenly started decreasing after the first fifth of P-Post, yet the numbers were in line with those observed in the third and fourth fifths of P-Pre. The last fifth of P-Post is an exception to the declining trend, with an unexpected high peak. This result was due to the fact that 27 retracted items received only one citation in P-Post and, in these cases, that citation always represented the last citation received, which is the final border of P-Post.
The full text of 8.42% of the citing entities was not accessible. For those for which we successfully retrieved the full text, our results showed that a relatively low percentage mentioned the retraction of the cited entity—2.25% of the total number of citing entities in P-Ret and P-Post.
Looking at their subject areas, we noticed that the citing entities started to spread into a higher number of subject areas (i.e., an additional nine) in P-Post compared to P-Pre, where the residual category Others contained 16% of the citing entities. The Arts and humanities subject area had a similar percentage throughout all three periods (22.94%, 18.42%, and 18.14%), and it represents, together with Social sciences, the two most representative subject areas in P-Ret and P-Post. We also noticed an important drop in Psychology, from 15.41% in P-Pre to 4.42% in P-Post.
3.2.2. In-text citations
We focused on the distribution of the in-text citations as a function of three features: the periods (i.e., P-Pre, P-Ret, and P-Post); the citation intent; and the section containing the in-text citation. The results of the three distributions have been further classified according to the in-text citation sentiment (i.e., negative/neutral/positive), as shown in Figure 9.
The overall trend in the number of in-text citations during the three periods was close to the one we observed for the citing entities (shown in the previous section), although the differences between P-Pre and P-Post were even more marked. As introduced in the previous section, the peak in the last fifth of P-Post was due to the retracted items receiving only one citation in P-Post. Even though the overall percentage of negative citations was low, they had a greater presence in P-Pre (4.5%). Generally, most in-text citations were tagged as neutral, and very few were positive (0.75%).
The citation intents “obtains background from” and “cites for information” were the two most dominant ones in the three periods, and they represented 31.29% and 22.64% of the total number of in-text citations, respectively. The citation intent “cites for information” increased its presence moving from 17.8% in P-Pre to 27.20% in P-Post.
Considering the citation sections, we can clearly see that the in-text citations were mostly located in the “Introduction” section in all three periods. The in-text citations in the “Introduction” section decreased considerably after P-Ret, moving from 30.15% in P-Pre to 22.13% in P-Post. In contrast, the in-text citations contained in the “Discussion” section show an increasing trend, from 6.87% in P-Pre to 15.20% in P-Post.
3.3. Topic Models of Citing Entities’ Abstracts and their Citation Contexts
Topic modeling is a statistical approach for automatically discovering the topics (each represented as a set of words) that occur in a collection of documents. We applied it to our data to understand how the topics evolved over time and whether this evolution depended, in some way, on the retraction of the publications considered.
A standard workflow for building a topic model is based on three main steps: tokenization, vectorization, and topic model creation. The topic model we built is based on the Latent Dirichlet Allocation (LDA) model (Jelodar, Wang et al., 2019). In the tokenization process, we converted the text into a list of words by removing punctuation, unnecessary characters, and stop words, and we also decided to lemmatize and stem the extracted tokens. In the second step, we created a vector representation of each document from the generated tokens using a Bag-of-Words (BoW) model (Brownlee, 2019), which we considered appropriate for our study given our direct experience in previous work (Heibi & Peroni, 2021a) and the suggestions by Bengfort, Bilbro, and Ojeda (2018) on the same issue. Finally, to build the LDA topic model, we determined the number of topics in advance using a popular method based on the topic coherence score, as suggested in Schmiedel, Müller, and vom Brocke (2019), which measures the degree of semantic similarity between the high-scoring words in a topic.
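The first two steps can be illustrated with a plain-Python sketch (stemming and lemmatization are omitted, and the stop-word list is a tiny illustrative subset; this is not the code actually used in the study):

```python
import re
from collections import Counter

STOP_WORDS = {"the", "is", "a", "an", "of", "and", "in", "to", "for"}

def tokenize(text, extra_stop_words=frozenset()):
    """Lowercase the text, strip punctuation, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    stop = STOP_WORDS | set(extra_stop_words)
    return [w for w in words if w not in stop]

def bag_of_words(documents, extra_stop_words=frozenset()):
    """Build one sparse term-frequency vector (a Counter) per document."""
    return [Counter(tokenize(d, extra_stop_words)) for d in documents]
```

The resulting document vectors would then feed an LDA implementation (e.g., gensim's `LdaModel`), with the number of topics chosen by computing a coherence score over a range of candidate values and keeping the best-scoring one.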
We built and executed two LDA topic models, one using the abstracts of the entities citing the retracted publications (with 16 topics), named TM-Abs, and another using the citation contexts where the in-text reference pointers to retracted publications were contained (with 20 topics), named TM-Cits. To create the topic models, we used MITAO (Ferri, Heibi et al., 2020) (https://github.com/catarsi/mitao), a visual interface to create a customizable visual workflow for text analysis. With MITAO, we have generated two visualizations: Latent Dirichlet Allocation Visualization (LDAvis) (Sievert & Shirley, 2014) for an overview of the topic modeling results, and Metadata-Based Topic Modeling Visualization (MTMvis) for a dynamic and interactive visualization of the topics based on customizable metadata.
3.3.1. Citing entities’ abstracts
The total number of available abstracts in our data set was 509. We extended the list of MITAO’s default English stop words (“the”, “is”, etc.) with ad hoc stop words devised for our study, such as “method,” “results,” and “conclusions,” which represent the typical words that might be part of a structured abstract.
Figure 10 shows the topic distribution represented in the two-dimensional space of LDAvis. Using the LDAvis interface, we set the parameter λ to 0.3 to determine the weight given to the probability of a term under a specific topic relative to its lift (Sievert & Shirley, 2014), and retrieved the 30 most relevant terms of each topic. We gave an interpretation and a title to each topic by analyzing its related terms; we do not report these analyses here due to space constraints, but they are available in Heibi and Peroni (2021b). Topic 6 (“Leadership organization, and management”) was the dominant topic. The topics were distributed in four main clusters, as shown in Figure 10:
one composed of topics 2 (“Sociopolitical issues related to leadership”) and 6, concerning issues related to leadership, work organization, and management from a sociopolitical point of view;
a large one composed of topics 1 (“Sociopolitical issues possibly related to Vietnam”), 4 (“History of the Jewish culture”), 5 (“Music and psychological diseases”), 11 (“Family and religion”), etc., treating several subjects from different domains close to the social sciences, political science, and psychology; and
another two clusters composed of one topic each: topic 16 (“Geography and climatic issues”) and topic 3 (“Colonial history”).
Figure 11 shows the chart generated using MTMvis. We plotted the topic distribution as a function of the three periods. At first analysis, we noticed that topics 6 and 16 increased their share across the three periods. On the other hand, topics 1 and 11 decreased their percentage throughout the three periods.
3.3.2. In-text citation contexts
The total number of in-text citation contexts in our data set, used as input to produce the second topic model, was 786. As we did with the abstracts, we defined and used a list of ad hoc stop words, which included all the given and family names of the authors of the cited publications.
Figure 12 shows the topics represented in the two-dimensional space of LDAvis. As we did for the abstracts’ topic modeling, we set λ to 0.3 and interpreted each topic by analyzing its 30 most relevant terms (Heibi & Peroni, 2021a, 2021b). In this case, we noticed that the topics overlap less and are more distributed along the whole axis of the visualization. Topic 12 (“Leadership, organization, and management”) is the most representative (11.7%) and was very distant from the other topics. The bottom right part of the graphic—with topics 2 (“Countries in conflict”), 15 (“War and terrorism”), 17 (“War and history”), 18 (“History of Europe”), and 20 (“War and army conflicts”)—is mostly close to history studies, especially the discussion of armed conflicts. The top part of the graphic contains several single-topic clusters, such as topics 5 (“Gender social issues”) and 9 (“Geography and climatic issues”).
Figure 13 shows the chart generated using MTMvis, where we plotted the topic distribution as a function of the three periods. We noticed a continuous decrease in topics 7 (“Family and religion”) and 18 across the three periods. Topic 3 (“Drugs/alcohol and psychological diseases”) dropped sharply immediately after P-Ret. On the other hand, we noticed an increase in topics 5, 9, and 11 (“Music and psychological diseases”), although the latter topic had a higher percentage in P-Ret than in P-Post.
4. DISCUSSION AND CONCLUSION
In this section, we address separately each of our research questions RQ1–RQ3 presented in Section 1. We conclude the section by discussing the limits of our work and by sketching out some future work that might help us overcome these issues.
4.1. Answering RQ1: Citing Retracted Publications in the Humanities
It seems that, on average, retracted publications in the humanities did not have a drop in citations after their retraction (Figure 8), and only 2.25% of the citing entities (five Arts and humanities publications and three related to health sciences subject areas, e.g., medicine, psychology, and nursing) mentioned the retraction in the citation context. In addition, we noticed that the negative perception of a retracted work, although limited in our data, happened before its retraction if the cited entity had a low affinity to the humanities domain. The fact that we reported few negative citations in P-Post is consistent with other studies (Bordignon, 2020; Luwel et al., 2019; Schneider et al., 2020).
Citing entities that mentioned the retraction usually discussed the cited entity rather than using it for background material or generic informative claims (Figure 14). Most of the in-text citations marked as discusses occurred in the Discussion section (as shown in Figure 15), and from TM-Cits we noticed the emergence of topic 6 (“The retraction phenomenon”) in Discussion sections only in P-Post. In other words, the retraction was not mentioned in the Discussion section before it happened, and the retraction event might have triggered greater discussion by the citing entities.
From the distribution of the subject areas of the citing entities over the three periods (Figure 8), we noticed that Social sciences and Arts and humanities had almost the same percentages in the P-Ret and P-Post periods, which is less than their percentages in P-Pre, suggesting that the retraction event did have an impact on these subject areas. However, other subject areas, such as Psychology, decreased in P-Ret and even more in P-Post, which may indicate a higher concern in these subject areas toward the citation of retracted publications. This is supported by the TM-Abs topic distribution for the citing entities assigned to Psychology (Figure 16), with a clear decrease in the topics related to the health sciences, such as topics 10 and 11, whereas others, such as topics 6 and 9 (close to sociohistorical discussions with no relation to the health sciences), increased their presence in P-Ret and P-Post. In other words, not only did the overall number of citing entities from the health sciences domain decrease after the retraction, but their subject areas also moved from the health sciences domain to subjects closer to the Social sciences and Arts and humanities domains.
4.2. Answering RQ2: Citation Behaviors in the Humanities
As shown in Figure 6, Religion and History had very similar distribution patterns. In both, the citing entities belonging to Social sciences had an important decrease in P-Post, and for that period, the TM-Cits of these entities does not include topic 3 (“Drugs/alcohol and psychological diseases”) for Religion and topic 7 (“Family and religion”) for History. We can speculate that Social sciences studies significantly reduced their percentage due to a higher concern toward sensitive social subjects such as healthcare, family, and religion.
Arts had the highest number of citations in P-Post, although we observed an important drop in the Arts and humanities citing entities, in favor of subject areas such as Medicine, Nursing, and Engineering (Figure 6). On the other hand, for Philosophy we had a completely different situation: Citing entities labeled as Arts and humanities increased considerably in P-Post at the expense of citing entities from Psychology. For the Arts discipline, topic 11 (“Music and psychological diseases”) of TM-Cits is the reason for the positive trend of P-Post. In other words, the arts (and especially music) had been discussed in relation to psychological and medical diseases.
In Figure 17, we show the distribution of topic 6 (“The retraction phenomenon”) as a function of the three periods, considering the four humanities disciplines with the highest number of citing entities. Topic 6 increased considerably in P-Post for Philosophy and had a steady trend for Religion, whereas History and Arts had a peak in P-Ret and a lower, yet relatively high, percentage in P-Post. These results might suggest that the entities citing retracted publications in Philosophy, Arts, and History (which, following the results of the topic modeling analysis, produced topics close to STEM disciplines) were those showing major concerns toward the retraction; in the case of History and Arts, this started from the year of the retraction.
Considering these hypotheses, we can interpret the fact that History and Arts reached their peak of citations after the year of retraction (Figure 7) as a sign of awareness/acknowledgment of the retraction rather than unconscious use of the retracted publications, at least for part of these citations.
4.3. Answering RQ3: Comparing STEM and the Humanities
Our findings showed that the retraction of humanities publications did not have a negative impact on the citation trend (Figure 8). According to prior studies, the opposite trend was observed in other disciplines, such as biomedicine (Dinh et al., 2019) and psychology (Yang & Qi, 2020). However, studies such as Heibi and Peroni (2021a) and Schneider et al. (2020) also observed cases in the health sciences domain where a single or a few popular retractions were characterized by an increase in citations after the retraction. This might suggest that the discipline of the retracted publication is not the only central factor to consider for predicting the citation trend after the retraction. Other factors might play a crucial role, such as the popularity of and media attention to the retraction case, as discussed in the studies by Mott, Fairhurst, and Torgerson (2019) and Bar-Ilan and Halevi (2017).
The work by Bar-Ilan and Halevi (2018) analyzed the citations of 995 retracted publications and found the same growing trend in citations in the postretraction period. However, they did not analyze the retractions according to different and separate disciplines. As such, we might consider their results as representing a general trend of retracted publications, which confirms the general observations we derived from our data. In addition, considering the results we obtained for the specific humanities disciplines, it seems that the potential threats and damage from retracted materials have been perceived more seriously by citing entities when the retracted publications were linked to a sensitive area of study and to the STEM domain. This final observation highlights the different behaviors that might occur when a retracted publication manifests a higher relation to STEM.
4.4. Limitations and Future Developments
There are some limitations in our study that may have introduced some biases. First, compared to other fields of study, bibliographic metadata in the humanities have limited coverage in well-known citation databases (Hammarfelt, 2016). This fact led to some limitations when applying a citation analysis in the humanities domain (Archambault & Larivière, 2010). In this regard, a coverage analysis and comparison of the citations in the humanities domain in COCI and MAG might be highly valuable. Other data sources, such as OpenAlex (Priem, Piwowar, & Orr, 2022), a free and open catalog of the world’s scholarly papers, researchers, journals, and institutions, could be considered. Pragmatically, as far as our study is concerned, we undoubtedly collected fewer citing entities than those that had in fact cited the retracted publications. In addition, we considered only open citation data; therefore, the citation coverage could improve significantly with the addition of nonopen citation data. The availability of a larger amount of data could have strengthened and improved the quality of our results.
The selection of the retracted publications was another crucial issue, because we faced two major problems: some inconsistencies in the data provided by Retraction Watch and the presence of retracted publications labeled as humanities that, on close analysis, actually belonged to a different discipline. The first descriptive statistical results, our manual check, and the definition of the humanities affinity score helped us limit the biases of these two issues. However, we could improve the approach adopted by using additional services such as Elsevier’s ScienceDirect—as done in Bar-Ilan and Halevi (2018)—and increasing the threshold of the humanities affinity level to exclude border cases.
A citation analysis concerning retraction in the humanities domain has rarely been discussed in the past; therefore, the discussion of our results included a comparison with similar works that considered different domains or retraction cases. Such works either did not address the humanities domain or were based on a single retraction case or a limited set of them. Works that considered other domains did not include most of the features we analyzed here (e.g., the citation intent), which made the comparison with them difficult. We hope that this study, and others to be done in this field, can lead to a better comparison and an improved understanding of the retraction phenomenon in the humanities domain.
We would like to thank the editor and the reviewers for the time and effort necessary to review the paper. We sincerely appreciate all the valuable suggestions, which helped us improve the quality of the paper.
Ivan Heibi: Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing—Original draft, Writing—Review & editing. Silvio Peroni: Conceptualization, Project administration, Supervision, Validation, Writing—Review & editing.
The authors have no competing interests.
This work has been partially funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No 101017452 (OpenAIRE-Nexus).
The data produced in this work (i.e., inputs, annotations, and results) are published and available on Zenodo (Heibi & Peroni, 2021b).
We have not described the methodology adopted in full here due to space constraints.
A complete list of reasons accompanied by a description is provided by Retraction Watch at https://retractionwatch.com/retraction-watch-database-user-guide/retraction-watch-database-user-guide-appendix-b-reasons/.
We used their REST APIs in June 2021 to retrieve citation information.
If we could not access the full text of a citing entity (e.g., due to paywall restrictions), the corresponding entity was still considered in our data set. However, we did not use it for the qualitative postanalysis described in Sections 3.2.2 and 3.3. Details about the number of entities for which we could not retrieve the full text are introduced in Section 3.2.1.
Exceptions to this rule (e.g., when the anchor sentence is the last one of a paragraph) are discussed in Heibi and Peroni (2022).
A detailed explanation regarding the calculation of the periods is discussed in Heibi and Peroni (2022).
Handling Editor: Ludo Waltman