Abstract
Scientific breakthroughs possess the transformative potential to reshape research trajectories and scientific paradigms. However, there is limited systematic evidence on how these breakthroughs influence the evolution of scientific knowledge. Building on the concepts of disruption and consolidation in science, we categorize forward-citing papers into two distinct categories: Disruptive Citing Papers (DCP) and Consolidating Citing Papers (CCP). Analyzing the dynamic patterns of DCP and CCP in Nobel Prize–winning papers, we find that in the early postpublication phase, scientific breakthroughs generate more consolidating citations than disruptive citations. Additionally, CCP in this early phase demonstrate higher scientific impact. However, in the long-term phase, scientific breakthroughs generate more disruptive citations, with DCP often involving larger and more diverse teams. Linguistic analysis also uncovers nuanced differences between CCP and DCP. Furthermore, the dynamic patterns of knowledge flow in scientific breakthroughs differ significantly from control groups. Collectively, our results reveal that scientific breakthroughs initially consolidate knowledge before disrupting it in later phases, offering profound insights into the mechanisms driving scientific progress.
1. INTRODUCTION
Science, as a multifaceted and dynamic endeavor, can be conceptualized as a complex, self-organizing, and evolving multiscale network (Fortunato, Bergstrom et al., 2018; Wu, Kittur et al., 2022; Zeng, Shen et al., 2017). Within this network, scientific knowledge interconnects through formal and informal information exchanges (Fronczak, Mrowinski, & Fronczak, 2022), academic discourse, research methodologies, tool systems, and the process of learning (de Solla Price, 1963). The process of scientific discovery within this intricate system is shaped by a myriad of factors, including serendipitous discoveries, experimental frameworks, and theoretical paradigms (Merton, 1973).
The science of science (Fortunato et al., 2018; Liu, Jones et al., 2023) has made remarkable progress, generating heightened interest in using large-scale data to analyze the structure of citation networks, investigate knowledge flow patterns, and uncover the fundamental mechanisms underlying knowledge creation and dissemination (Liu et al., 2023; Zeng et al., 2017). While some studies have characterized different types of citing papers (Catalini, Lacetera, & Oettl, 2015) through manual annotation (Kang & Evans, 2020) and machine learning algorithms (Kunnath, Herrmannova et al., 2022; Meng, Varol, & Barabási, 2024; Yousif, Niu et al., 2019), limited attention has been given to the disruptive and consolidating perspective based on network structure.
Significant contributions from Funk and Owen-Smith (2017) introduced a novel framework for quantifying the disruptive nature of scholarly papers, enhancing our understanding of disruptive citation patterns. This framework explores deep citation networks and introduces the Consolidation-Disruption (CD) index, a tool for assessing the extent to which both patents and scientific papers either consolidate or disrupt existing trends or disciplinary domains (Bu, Waltman, & Huang, 2021; Chen, Shao, & Fan, 2021; Lin, Evans, & Wu, 2022; Yang, Hu et al., 2023b). Despite the widespread adoption of disruptive and consolidating impact concepts across various academic fields (Chu & Evans, 2021; Li, Tessone, & Zeng, 2024; Lin, Frey, & Wu, 2023; Park, Leahey, & Funk, 2023; Wu, Wang, & Evans, 2019; Xu, Wu, & Evans, 2022), a notable gap remains in our understanding of knowledge flow patterns in scientific papers viewed through the lens of disruption and consolidation.
This study aims to fill this critical void by proposing an innovative approach to identify disruptive and consolidating knowledge flow patterns. We focus on the context of Nobel-winning scientific breakthroughs, as these breakthroughs exhibit enduring and consistent knowledge flow patterns (Wuestman, Hoekman, & Frenken, 2020; Xu, Luo et al., 2022), enabling more precise quantification. Furthermore, Nobel-winning breakthroughs are often catalysts for future innovation and transformation (Farys & Wolbring, 2021; Fortunato, 2014; Li, Yin et al., 2020). By examining knowledge flow patterns through the lenses of disruption and consolidation, we aim to provide evidence on how scientific breakthroughs shape the trajectory of future knowledge.
Specifically, we define Disruptive Citing Papers (DCP) as papers that cite the focal paper but none of its references. In contrast, Consolidating Citing Papers (CCP) are papers that cite both the focal paper and at least one of its references. By examining the distinctions between DCP and CCP in the context of scientific breakthroughs, our analysis aims to provide insights into the mechanisms governing knowledge creation and dissemination. This approach offers a fresh perspective on how such breakthroughs disrupt or develop established paradigms. Building on these considerations, this study addresses the following research questions:
RQ1: How can we analyze knowledge flow in citation networks from the perspective of disruption and consolidation? How can we categorize forward citations into disruptive and consolidating citations?
RQ2: Do scientific breakthroughs exhibit different patterns of disruption or consolidation in the early postpublication phase versus the long-term phase? How do scientific breakthroughs generate disruptive and consolidating knowledge in forward citation networks?
RQ3: What are the differences between DCP and CCP of scientific breakthroughs in terms of scientific impact, team patterns, and word usage? How do such differences evolve in different phases?
RQ4: Do the observed patterns of disruptive and consolidating knowledge flows in scientific breakthroughs differ from control group papers?
This study contributes to the academic literature by introducing a novel perspective on knowledge flow and conducting a comprehensive analysis of the disruptive and consolidating impacts of scientific breakthroughs. This enriches our understanding of knowledge dynamics in citation networks. The research demonstrates that scientific breakthroughs initially consolidate scientific knowledge, but disrupt it in the long term. It provides insights into evaluating potentially disruptive papers and raises concerns about using short-term citation windows for assessing their disruptive potential. The varying patterns of DCP and CCP suggest that the impact nature and team composition of scientific breakthroughs undergo complex transformations in response to the evolving knowledge landscape.
2. RELATED WORKS
2.1. Nobel Prizes as Scientific Breakthroughs
The growth of scientific studies over time demonstrates an exponential surge, highlighting the dynamic nature of scientific progress (Fortunato et al., 2018). However, it is important to recognize that science does not follow a linear or continuous trajectory (Barabási, Song, & Wang, 2012), as it is often driven by a small fraction of transformative research endeavors, i.e., scientific breakthroughs (Azoulay, Graff-Zivin et al., 2018b; Wei, Li, & Shi, 2023). These transformative studies possess distinctive characteristics such as “accidental discoveries,” “formidable challenges,” and “paradigm shifts,” often giving rise to the emergence and evolution of nascent fields (Azoulay, Fuchs et al., 2018a; Winnink, Tijssen, & van Raan, 2019; Yang, Zhao, & Deng, 2024b).
Scientific breakthroughs often serve as catalysts for natural phenomena, transformative changes in human lifestyles, technological advancements, and socioeconomic structures (Wuestman et al., 2020; Xu et al., 2022b). Within citation networks, scientific breakthroughs occupy central positions in the broader scientific system, establishing intricate connections with numerous successor nodes and other nodes associated with significant advancements (Battiston, Musciotto et al., 2019). While conventional science typically aligns with prevailing paradigms, scientific breakthroughs often arise by unveiling or resolving scientific problems that deviate from established norms (Kuhn, 1962). These changes foster scientific revolutions, engender new fields of study, and reshape our collective understanding of the world. Winnink et al. (2019) proposed the knowledge topology of scientific breakthrough, known as Charge-Challenge-Chance, in which breakthroughs characterized by Challenge-Chance often stem from serendipitous discoveries that address scientific problems or challenges that elude comprehension within the existing paradigm. From the perspective of complex systems, scientific breakthroughs act as events that induce a “phase transition” within the scientific landscape, modifying both internal and external factors associated with scientific paradigms (Comin, Peron et al., 2020).
Harriet Zuckerman's seminal works provide a solid foundation for the analysis of Nobel-winning papers (Zuckerman, 1967, 1977, 1992). Nobel-winning papers have long been regarded as exemplifying scientific breakthroughs, as the esteemed Nobel Prize, widely recognized as the pinnacle of scientific recognition, aligns with the concept of transformative advancements (Farys & Wolbring, 2021; Fortunato, 2014; Li et al., 2020; Szell, Ma, & Sinatra, 2018). Recent studies have focused on identifying and predicting Nobel-winning scientific breakthroughs by examining local network structural characteristics (Min, Bu et al., 2021), local structural entropy (Xu et al., 2022), and key node identification measures (Mariani, Medo, & Lafond, 2019) within the citation network. However, there is limited systematic evidence on how these breakthroughs influence the evolution of scientific knowledge.
2.2. Disruptive and Consolidating Perspective
The genesis of novel knowledge in scientific advancement is fundamentally intertwined with the vast reservoir of preexisting wisdom (Lee, Kempes, & West, 2024), a principle eloquently encapsulated by the metaphor of “standing on the shoulders of giants” (Jo, Liu, & Wang, 2022). Conversely, the trajectory of scientific progress also necessitates disruptive innovation (Bower & Christensen, 1995), which generates new areas of inquiry by dismantling established paradigms (Kuhn, 1962). Within this context, disruption and consolidation emerge as pivotal dimensions of scientific knowledge flow (Lin et al., 2022; Yang et al., 2023b). Disruptive knowledge flow denotes a paradigmatic shift, signaling the emergence of entities that fundamentally deviate from the preceding state of affairs (Arthur, 2007; Mokyr, 1990). In contrast, consolidating knowledge flow represents the recognition and refinement of existing paradigms, ensuring the continuous improvement and validation of established knowledge bases, which fortifies the foundation upon which future research is built (Zeng et al., 2017).
In the domain of citation networks, knowledge flows can be dichotomized as either consolidating or disruptive, grounded in the foundational tenets articulated by Kuhn (1962), whose scholarly contributions resonate profoundly with the notion of challenging and transcending established paradigms as an integral facet of scientific progress. The crux of this perspective lies in the assertion that scientific advancement often germinates from anomalies that defy prevailing paradigms, and transformative breakthroughs manifest through the application of novel ideas or innovative methodologies aimed at resolving these enigmatic anomalies (Lin et al., 2022). Citing links within the knowledge network function as indicators of the forward and backward impact of papers and patents, thus forming a sprawling network of knowledge flow (Newman, 2012). The disruptive and consolidating nature of knowledge flow can be quantified through a meticulous analysis of the linkages within this network.
In recent years, the concept of disruptive and consolidating impact of publications has sparked extensive discussion. The CD index introduced by Funk and Owen-Smith (2017) furnishes a quantitative metric for appraising the disruptive and consolidating impact of technologies, particularly through an examination of the intricate citation network structure. It facilitates a precise evaluation of the extent to which a patent consolidates or disrupts its disciplinary paradigm. Building upon this foundational work, Wu et al. (2019) expanded the application of the CD index to the domain of scientific literature. Rooted in the concepts of disruption and consolidation, Chen et al. (2021) proposed a bifurcation of scientific research impact into two dimensions: disruptive impact and consolidating impact. Bu et al. (2021) and Chen et al. (2021) have also advanced a dual framework for assessing the disruptive and consolidating impact of papers.
It is important to note that disruptiveness is a distinct concept from scientific breakthroughs. Scientific breakthroughs can either disrupt existing paradigms or further develop them. For example, the discovery of DNA represents a highly disruptive study, while the discovery of RNA is a consolidating research effort (Park et al., 2023). Similarly, the proposal and verification of gravitational waves both constitute groundbreaking research. The former disrupts and introduces new paradigms, whereas the latter develops and consolidates existing paradigms (Wu et al., 2019). Despite these distinctions, there is a paucity of systematic evidence on the patterns through which Nobel-winning scientific breakthroughs disrupt or consolidate scientific paradigms. Moreover, the quantitative analysis of disruptive and consolidating knowledge flow in scientific breakthroughs has not been systematically undertaken. This study aims to address these gaps by scrutinizing the characteristics of Disruptive and Consolidating Citing Papers (DCP and CCP), with the overarching goal of advancing our understanding of the roles played by disruptive and consolidating knowledge flows in science.
3. METHODS AND DATA
3.1. Methodology
In the realm of scientific knowledge, citation links serve as crucial conduits for the flow of information. This study builds upon existing methodologies for assessing disruptive and consolidating impacts (Funk & Owen-Smith, 2017; Wu et al., 2019; Yang et al., 2023a) to propose an innovative approach. This approach delineates the citations of a focal publication into two distinct categories: disruptive and consolidating citations.
As illustrated in Figure 1(a), a focal paper (FP) in the citation network, together with its reference set R = {r1, r2, …, rm} and citation set C = {c1, c2, …, cn}, forms a coherent knowledge flow subnetwork that provides deep citation information about the forward and backward nodes in the citation network. To explore the implicit relationships among articles, we also consider the set of papers that cite the elements of R, denoted as RC = {rc1, rc2, …, rck}. Notably, a node in RC may point to multiple articles in R. The consolidating citations can thus be represented as CC = RC ∩ C, and the disruptive citations as DC = C \ CC. This approach divides citing papers into two categories: Disruptive Citing Papers (DCP) and Consolidating Citing Papers (CCP). Subsequently, we can quantitatively assess the temporal structure of the dual impact exerted by the disruptive and consolidating citing papers, respectively.
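The set definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the input format (a dictionary mapping each paper ID to the set of IDs it cites), the function name `split_citing_papers`, and the toy network are all hypothetical.

```python
from typing import Dict, Set, Tuple


def split_citing_papers(
    focal: str, references: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (DC, CC) for a focal paper.

    `references` maps a paper ID to the set of paper IDs it cites
    (hypothetical input format chosen for this sketch).
    """
    # R: reference set of the focal paper
    R = references.get(focal, set())
    # C: papers that cite the focal paper
    C = {p for p, refs in references.items() if focal in refs}
    # RC: papers that cite at least one reference of the focal paper
    RC = {p for p, refs in references.items() if p != focal and refs & R}
    CC = RC & C   # consolidating: cite FP and at least one of its references
    DC = C - CC   # disruptive: cite FP but none of its references
    return DC, CC


# Toy network (made-up IDs, not data from the study)
toy = {
    "FP": {"r1", "r2"},
    "c1": {"FP"},          # cites FP only
    "c2": {"FP", "r1"},    # cites FP and one of its references
    "c3": {"FP", "x"},     # cites FP and an unrelated paper
    "k1": {"r2"},          # cites a reference but not FP
}
dcp, ccp = split_citing_papers("FP", toy)
print(sorted(dcp), sorted(ccp))
```

Paper `k1` belongs to RC but not to C, so it falls into neither category; it only matters when a full CD index is computed.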
Illustration of DCP and CCP in the citation network of a focal paper (FP). (a) Illustration of the consolidating citations and disruptive citations associated with the focal paper (FP). (b), (c) Dynamic patterns of DCP and CCP in an exemplary case.
To illustrate this methodology, we selected an exemplary paper, “Ordering, metastability and phase transitions in two-dimensional systems” (Kosterlitz & Thouless, 1973), which received the Nobel Prize in Physics in 2016. Figures 1(b) and (c) demonstrate the dynamic patterns of DCP and CCP within the citation network of this paper. As the citation network evolves over time, new citation links can be categorized as either disruptive or consolidating, depending on whether they connect to the references cited by the focal paper. At the initial stages, CCP may dominate the citation network, while over time, DCP may gain prominence. This methodology provides a nuanced understanding of the evolving impact of scientific publications by distinguishing between citations that consolidate existing knowledge and those that disrupt it.
It is noteworthy that the emergence of citing papers transpires organically and in a self-organizing manner. Our approach fundamentally adopts an ex-post perspective to disentangle the distinct groups of citations. Remarkably, the concept of CCP shares striking similarities with the notions of bibliographic coupling (Kessler, 1963) and cociting networks (Zhang & Zhu, 2022). In essence, CCPs cite the references of the focal paper, thereby enhancing the cociting links of Nobel-winning papers and enriching the structural motifs within these papers (Zeng, Fan et al., 2022). These perspectives have found widespread application in the analysis of diverse real-world systems, including transportation networks and social systems (Harush & Barzel, 2017; Yang et al., 2023a).
Our prior research has firmly established that the disruptive citations hold significant validity in evaluating both papers and scientists, whereas the consolidating citations exhibit less validity (Yang, Gong et al., 2024a). This perspective offers a novel approach to understanding how knowledge flows within the realm of science (Lin et al., 2022). It provides a robust framework for analyzing knowledge flow patterns, considering both disruptive and consolidating citation elements.
3.2. Citation Data
We utilize the Microsoft Academic Graph (MAG) data set (Wang, Shen et al., 2020), a comprehensive scientific publication database that records bibliographic information, authorship, author affiliations, and citation links for articles. Spanning from 1800 to 2021, MAG encompasses over 200 million documents. This collection comprises a wide spectrum of scholarly outputs, including journal articles, conference proceedings, preprints, and various other forms of research publications. The MAG data set offers detailed information for each paper, including publication year, paper DOI, categorizations, publication venues, disciplinary domains, authorship, citation networks, and other pertinent properties.
3.3. Nobel-Winning Papers
We employed the Nobel-winning papers data set provided by Li, Yin et al. (2019). Identifying the precise papers that have contributed to Nobel prizes is a complex task. To address this complexity, Li et al. (2019) adopted a comprehensive approach that includes papers cited within Nobel lectures and those published contemporaneously with the prize-winning research, subject to predefined inclusion criteria. This approach ensures the identification of papers that potentially played a role in fostering the scientific breakthroughs leading to the Nobel Prize. We matched 712 Nobel-winning papers in the MAG data set, spanning from 1887 to 2010.
Table 1 summarizes the essential attributes of these Nobel-winning papers. The average citation count for these papers is remarkably high, underscoring their groundbreaking nature and profound impact on the scientific community. The prize lag, i.e., the time between the publication of groundbreaking research and its recognition through the Nobel Prize, is long, averaging 16 to 18 years across fields. Such extended time frames may reflect the intricate patterns and theoretical underpinnings of scientific breakthroughs (Fortunato, 2014) as well as the time required for the widespread recognition and appreciation of groundbreaking research (Fortunato, 2014; Jones & Weinberg, 2011).
Description of Nobel-winning papers
Field | Count | Avg. citation | Avg. reference | Avg. publishing year | Avg. prize year | Avg. prize lag |
---|---|---|---|---|---|---|
Physics | 217 | 1,397 | 13 | 1954 | 1972 | 18.2 |
Chemistry | 220 | 1,873 | 21 | 1958 | 1974 | 16.0 |
Medicine | 275 | 1,493 | 17 | 1958 | 1974 | 16.7 |
4. RESULTS
4.1. Scientific Breakthroughs and Forward Citations
We begin by extracting all forward-citation links associated with Nobel-winning papers, resulting in a data set comprising 1,124,254 citing papers. Notably, no citation windows were set here; thus, the data set includes all citation links from the publication year of the Nobel-winning papers up to 2021. Given the log-normal distribution of citation-based data, we employed the Mann-Whitney U-test and Kolmogorov-Smirnov test to examine the differences between DCP and CCP across different fields and over time. As depicted in Figure 2, our results demonstrate that the average number of DCP for Nobel-winning papers is significantly higher than that of CCP (p values < 0.001). Among these forward-citing papers, a substantial proportion, 940,250 (83.6%), are categorized as DCP, while 184,004 (16.4%) are identified as CCP.
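As an illustration of the kind of rank-based comparison used here, the following sketch computes a two-sided Mann-Whitney U-test with only the Python standard library. It assumes a normal approximation with tie correction and no continuity correction, which may differ from the settings of the statistical package used in the study; the function name and the toy counts are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value).

    Uses midranks for ties and a normal approximation with tie
    correction (assumption: no continuity correction).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted(list(x) + list(y))
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sum_x = sum(rank_of[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # all observations identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p


# Toy per-paper counts (made-up numbers, not data from the study)
dcp_counts = [12, 30, 7, 45, 22]
ccp_counts = [3, 9, 7, 5, 11]
u_stat, p_value = mann_whitney_u(dcp_counts, ccp_counts)
print(u_stat, p_value)
```

A rank-based test is a natural choice for heavy-tailed citation counts because it does not assume normality of the underlying distributions.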
Distribution of DCP and CCP in Nobel-winning scientific breakthroughs. (a) The number of Nobel-winning papers published in each year. (b) The average citation counts (without citation windows) of Nobel-winning papers in each year. (c) The average CD index (using citations from the publication year to 2021) of Nobel-winning papers in each year. (d), (e) The average number of DCP and CCP of Nobel-winning papers by field and across different years. The significance of the difference is assessed using (d) the Kolmogorov-Smirnov test and (e) the Mann-Whitney U-test. Shaded regions represent 95% confidence intervals.
The downward trend of the average CD index (calculated using citations from the publication year to 2021) of Nobel-winning papers (Figure 2(c)), along with the diminishing gap between DCP and CCP over time (Figure 2(d)), reveals that even for scientific breakthroughs, their disruptive potential diminishes over time (Park et al., 2023). Our findings also indicate that the distributions of DCP and CCP in scientific breakthroughs are not uniform across different fields (Figure 2(e)). Specifically, Nobel-winning papers in Physics exhibit the lowest proportion of CCP, while Medicine breakthrough papers demonstrate the highest proportion of CCP and the lowest proportion of DCP. The higher proportion of CCP observed in the field of Medicine may be attributed to the nature of medical research, which often builds upon existing knowledge and previous discoveries (Gysi, do Valle et al., 2021). In contrast, Physics research may place a greater emphasis on novel and groundbreaking findings (Battiston et al., 2019), potentially resulting in a higher proportion of DCP.
4.2. Early Consolidation and Long-Term Disruption
The predominance of DCP in the long term indicates that scientific breakthroughs are more likely to challenge established paradigms and stimulate innovation. However, given the enduring impact and delayed recognition characteristic of scientific breakthroughs, we anticipate distinct patterns of disruption or consolidation in the early postpublication phase versus the long-term phase of breakthroughs. To explore these dynamics, we set the publication year of each Nobel-winning paper as the baseline (year 0) and grouped DCP and CCP by the number of years elapsed since that baseline.
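The grouping by temporal gap can be sketched as follows. This is a minimal illustration, assuming each citing paper is available as a (publication year, label) pair; the function name `counts_by_gap` and the toy records are hypothetical.

```python
from collections import Counter


def counts_by_gap(focal_year, citing):
    """Count DCP and CCP per year elapsed since the focal paper.

    `citing` is an iterable of (publication_year, label) pairs with
    label in {"DCP", "CCP"} (hypothetical input format).
    """
    counts = {"DCP": Counter(), "CCP": Counter()}
    for year, label in citing:
        gap = year - focal_year  # year 0 is the focal publication year
        if gap >= 0:             # ignore records dated before the focal paper
            counts[label][gap] += 1
    return counts


# Toy records (made-up, not data from the study)
records = [
    (1973, "CCP"), (1974, "CCP"), (1974, "DCP"),
    (1980, "DCP"), (1980, "DCP"),
]
by_gap = counts_by_gap(1973, records)
print(dict(by_gap["DCP"]), dict(by_gap["CCP"]))
```

Averaging these per-gap counts over all focal papers would yield annual curves of the kind compared in this section.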
Figure 3(a) illustrates a notable temporal evolution in the disparities between DCP and CCP. Upon publication, Nobel-winning papers initially attract more CCP than DCP (p < 0.01). However, as time elapses, the number of DCP steadily rises, while the number of CCP declines. After approximately 5 years, the annual count of CCP falls significantly below that of DCP. In the long term, beyond the early phase, scientific breakthroughs yield more disruptive citations than consolidating ones. This temporal shift reflects a transformation in how scientific breakthroughs contribute to the body of scientific knowledge. Initially, these breakthroughs appear to stimulate research that builds upon established foundations. Over time, however, there is an increasing emphasis on disruptive elements, leading to new knowledge or paradigm shifts generated by such breakthroughs.
Nobel-winning papers produce more consolidating citations in the early phase, but more disruptive citations in the long term. (a) Average annual number of DCP and CCP since the publication of Nobel-winning papers, with significance indicators for the difference between DCP and CCP numbers using the Mann-Whitney U-test for each year. Error bars depict 95% CI. (b) Proportion of Nobel-winning scientific breakthroughs measured as disruptive (CDy > 0) using a y-year citation window.
When employing various citation windows to compute the CDy index, which assesses accumulated DCP and CCP up to y years since publication of the focal paper, Figure 3(b) demonstrates that the likelihood of disruption in scientific breakthroughs increases over the temporal gap (y). This finding suggests that the disruptive nature of scientific breakthroughs intensifies over time. Initially, scientific breakthroughs consolidate scientific knowledge, but they ultimately contribute to its disruption in the long run.
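A windowed index of this kind can be sketched as below. The sketch assumes the commonly used form of the CD index, (n_i − n_j) / (n_i + n_j + n_k), where n_i counts papers citing only the focal paper (DCP), n_j counts papers citing both the focal paper and its references (CCP), and n_k counts papers citing only the references. The exact variant used in this study may differ, and the function names and toy years are hypothetical.

```python
from typing import Iterable, Optional


def cd_index(n_dcp: int, n_ccp: int, n_refs_only: int) -> Optional[float]:
    """CD index under the assumed form (n_i - n_j) / (n_i + n_j + n_k)."""
    total = n_dcp + n_ccp + n_refs_only
    if total == 0:
        return None  # undefined when there are no citing papers
    return (n_dcp - n_ccp) / total


def cd_y(
    focal_year: int,
    dcp_years: Iterable[int],
    ccp_years: Iterable[int],
    refs_only_years: Iterable[int],
    y: int,
) -> Optional[float]:
    """CD index using only papers published within y years of the focal paper."""
    cutoff = focal_year + y

    def within(years: Iterable[int]) -> int:
        return sum(1 for t in years if focal_year <= t <= cutoff)

    return cd_index(within(dcp_years), within(ccp_years), within(refs_only_years))


# Toy publication years (made-up, not data from the study)
dcp_years = [1974, 1990, 1991]
ccp_years = [1973, 1974]
refs_only_years = [1975]
for window in (5, 20):
    print(window, cd_y(1973, dcp_years, ccp_years, refs_only_years, window))
```

Under this form, the sign of the index depends only on whether accumulated DCP outnumber accumulated CCP within the window, which is why a paper can switch from consolidating to disruptive as the window widens.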
4.3. Dynamic Patterns of DCP and CCP
We then analyzed the impact patterns of DCP and CCP in Nobel-winning papers, as presented in Figure 4. It shows that DCP receive fewer citations than CCP (p < 0.001). Specifically, the average citation count for DCP is 43.1, whereas for CCP it is 64.9. To mitigate potential bias from variations in citation windows, we conducted an analysis of 10-year citation counts (Sinatra, Wang et al., 2016), which confirmed the robustness of our findings.
Dynamic scientific impact of DCP and CCP in Nobel-winning papers. (a), (b) Illustration of DCP and CCP for a Nobel-winning paper, including their respective probabilities of hit papers (top 1% highly cited papers within publication year and subfield). (c), (d) Comparison of the (c) citation count and (d) 10-year citation count of DCP and CCP, with statistical significance indicated by the Mann-Whitney U-test. Error bars depict 95% confidence intervals. (e), (f) Yearly comparison of the (e) citation count and (f) 10-year citation count of DCP and CCP since publication. Shaded areas depict 95% CI.
We further examined the dynamic comparison of citation counts of DCP and CCP since publication of Nobel-winning papers (Figures 4(e), (f)). This analysis reveals that during the early postpublication period of Nobel-winning papers, CCP consistently yield higher impact than DCP. However, this temporal advantage in scientific impact appears to diminish after this critical period.
The initial superiority of CCP in scientific impact may be attributed to their early linkage to Nobel-winning papers, benefiting from preferential attachment effects that generate higher future citations. Moreover, we observed a noticeable decline over time in the scientific impact of papers citing Nobel-winning papers, for both DCP and CCP. This underscores the importance of building upon existing knowledge and leveraging insights gained from previous breakthroughs, exemplified by the concept of “standing on the shoulders of giants” (Jo et al., 2022).
Our investigation into the distinctions in team characteristics between DCP and CCP in scientific breakthroughs focused on five dimensions of team patterns:
the number of authors (Wuchty, Jones, & Uzzi, 2007);
the unique number of institutions signifying cross-institutional collaboration (Jones, Wuchty, & Uzzi, 2008);
the unique number of countries indicating international collaborations;
gender diversity measured by the Shannon entropy of team gender distributions (Yang, Tian et al., 2022); and
field diversity measured by the Shannon entropy of team expertise distributions (Lin et al., 2023; Yang, 2024).
Gender is inferred using a statistical model based on first names (Liu, Xie et al., 2024; Van Buskirk, Clauset, & Larremore, 2023), and author expertise is defined as the most common second-level fields of study in MAG up to the publication year.
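The two entropy-based diversity measures in the list above can be illustrated with a short sketch. The logarithm base, the handling of authors with unknown labels, the function name, and the toy teams are assumptions made here; the source does not specify them.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, Optional


def shannon_entropy(labels: Iterable[Optional[Hashable]], base: float = 2.0) -> float:
    """Shannon entropy of the label distribution within one team.

    Assumptions for this sketch: log base 2 by default, and authors
    whose label is None (e.g. gender could not be inferred) are dropped.
    """
    counts = Counter(label for label in labels if label is not None)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    entropy = 0.0
    for c in counts.values():
        p = c / n
        entropy -= p * math.log(p, base)
    return entropy


# Toy teams (made-up, not data from the study)
gender_team = ["F", "M", "M", None]                      # one author with unknown gender
field_team = ["Optics", "Optics", "Genetics", "Statistics"]
print(shannon_entropy(gender_team))
print(shannon_entropy(field_team))
```

Entropy is zero for a homogeneous team and grows as members spread more evenly across categories, so it captures balance rather than the mere presence of a second category.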
The results presented in Figure 5 suggest that DCP are associated with larger and more diverse research teams compared to CCP. The average team sizes for DCP and CCP are 6.03 and 3.85, respectively (p < 0.001). The average unique numbers of institutions for DCP and CCP are 1.58 and 1.4 (p < 0.001), the average unique numbers of countries are 1.22 and 1.16 (p < 0.001), the average gender diversities are 0.25 and 0.24 (p < 0.001), and the average field diversities are 0.49 and 0.44 (p < 0.001).
Dynamic team Patterns of DCP and CCP in Nobel-winning papers. (a) Visualization of DCP and CCP in Nobel-winning papers. (b)–(f) Comparison of team patterns between DCP and CCP, including (b) the number of authors, (c) the unique number of institutions, (d) the unique number of countries, (e) the proportion of female researchers, and (f) the proportion of female researchers. Statistical significance is indicated by the Mann-Whitney U-test. Error bars depict 95% CI. (g)–(k) Yearly comparison of team patterns between DCP and CCP since publication. Shaded areas depict 95% CI.
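The group comparisons of team patterns rely on the Mann-Whitney U-test. The sketch below is a self-contained illustration of a two-sided test using the normal approximation with tie correction and no continuity correction. It is not the study's implementation, which is reported to use scipy, where `scipy.stats.mannwhitneyu` provides the library routine. The function name and the sample team sizes are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        ties = counts[value]
        rank_of[value] = position + (ties + 1) / 2
        position += ties

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:
        return u_x, 1.0  # all observations identical

    z = (u_x - mean_u) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical team sizes for illustration only.
dcp_team_sizes = [6, 7, 8, 9, 10]
ccp_team_sizes = [1, 2, 3, 4, 5]
u_stat, p_value = mann_whitney_u(dcp_team_sizes, ccp_team_sizes)
```

A rank-based test suits these variables because team size and collaboration counts are skewed, so comparing means under a normality assumption would be fragile.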
We further examined the yearly comparison of team patterns of DCP and CCP since the publication of Nobel-winning papers (Figures 5(g)–(k)). The comparison shows that the advantage in team size and diversity observed in DCP is most prominent in the long-term period. During the early phase, DCP exhibit smaller team sizes and less diverse collaborations than CCP. Over time, however, DCP demonstrate significantly larger team sizes and greater diversity.
This suggests that the nature of collaboration and team composition in the context of knowledge flows of the scientific breakthroughs undergoes complex transformations in response to the changing landscape of knowledge. The long-term disruptive context of Nobel-winning papers necessitates the involvement of larger and more diverse teams capable of harnessing diverse expertise, resources, and perspectives to propel scientific progress.
4.4. Control Group Analysis
To investigate whether patterns of disruptive and consolidating knowledge flows observed in Nobel-winning scientific breakthroughs extend to nonbreakthrough papers, we employed a control group methodology. We identified a cohort of control papers from the MAG database based on congruent publication timelines, issues, volumes, and the same discipline (OECD second-level classification) as their Nobel-winning counterparts. In total, our data set comprises 20,954 control group papers and their associated 2,511,685 forward-citing papers.
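The matching step can be sketched as follows, assuming each paper record carries its journal, publication year, volume, issue, and discipline. The `Paper` class, its field names, and the `select_controls` helper are hypothetical and do not reflect the MAG schema or the study's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str
    journal: str
    year: int
    volume: str
    issue: str
    field: str  # assumed to hold the OECD second-level discipline


def match_key(paper):
    """Attributes a control paper must share with the focal paper."""
    return (paper.journal, paper.year, paper.volume, paper.issue, paper.field)


def select_controls(focal, candidates):
    """Return candidates published alongside the focal paper, excluding the focal paper."""
    key = match_key(focal)
    return [
        p for p in candidates
        if p.paper_id != focal.paper_id and match_key(p) == key
    ]


# Hypothetical records for illustration only.
focal = Paper("n1", "Journal A", 1995, "12", "3", "Physical sciences")
candidates = [
    focal,
    Paper("c1", "Journal A", 1995, "12", "3", "Physical sciences"),
    Paper("c2", "Journal A", 1995, "12", "4", "Physical sciences"),
    Paper("c3", "Journal B", 1995, "12", "3", "Physical sciences"),
]
controls = select_controls(focal, candidates)
```

Matching on these attributes holds venue and publication timing fixed, but it leaves other differences between papers uncontrolled.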
As illustrated in Figure 6, our analysis of the control group papers reveals notable distinctions compared to our findings within the subset of Nobel-winning papers. Firstly, the distribution dynamics of DCP and CCP in control group papers exhibit distinct patterns. DCP significantly outnumber CCP immediately following the publication of focal papers. The probability of disruptive potential, as measured by the CDy index, remains consistently steady approximately 3 years postpublication. Secondly, DCP within control group papers consistently demonstrate significantly higher scientific impact compared to CCP. Thirdly, analysis of team characteristics reveals that DCP in control group papers consistently feature larger team sizes and more diverse collaboration patterns than CCP.
Patterns of DCP and CCP in control papers. (a) Illustration of the control group papers. (b), (c) Dynamic disruptive and consolidating patterns of the control papers. (e) Distributions of DCP and CCP. (f), (g) Dynamic scientific impact of DCP and CCP. (h), (i) Dynamic team size and team diversity of DCP and CCP. Statistical significance is indicated by the Mann-Whitney U-test. Error bars and shaded areas depict 95% CI.
These findings underscore that the patterns of DCP and CCP within control group papers differ markedly from those observed in Nobel-winning papers. This contrast highlights the unique dynamics of knowledge flow within scientific breakthroughs and emphasizes the necessity for tailored analytical approaches when examining and evaluating citation patterns of scientific breakthroughs.
4.5. Word Usage Patterns
Finally, we examined the linguistic traits evident in the titles and abstracts of the forward-citing papers of scientific breakthroughs. This linguistic analysis aims to offer deeper insights into the features that differentiate DCP and CCP. Figure 7 reveals significant disparities in title characteristics between DCP and CCP. Notably, CCP tend to have shorter titles and a lower frequency of verbs and nouns than DCP.
Title length and word usage of DCP and CCP in Nobel-winning papers. (a) Visualization of DCP and CCP of Nobel-winning papers. (b)–(e) Comparison of average title length, number of words, verbs and nouns between DCP and CCP. Statistical significance is indicated by the Mann-Whitney U-test. Error bars depict 95% CI. (f), (g) Frequency distribution of the top 10 most common verbs and nouns in titles and abstracts for DCP and CCP, respectively. Red boxes show the unique words of DCP or CCP.
Moreover, we conducted a detailed examination of the specific verbs and nouns utilized by DCP and CCP, thus illuminating the thematic discrepancies that underlie their content. We observe that DCP and CCP tend to employ different sets of verbs and nouns. This nuanced exploration of linguistic choices serves as a testament to the intricate nature of knowledge dissemination patterns within Nobel-winning scientific breakthroughs. It furnishes an indispensable layer of context, enriching our comprehension of how DCP and CCP contribute to the diverse landscape of scientific discourse.
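The title features and word rankings described in this section can be sketched as below. The sketch assumes that each title has already been tokenized and tagged with Penn Treebank part-of-speech tags, for example by `nltk.pos_tag`, and that title length is measured in characters. The function names and the hand-tagged example are hypothetical.

```python
from collections import Counter


def title_features(title, tagged_tokens):
    """Return title length, word count, and noun and verb counts for one title."""
    return {
        "n_chars": len(title),
        "n_words": len(title.split()),
        # Penn Treebank noun tags start with "NN", verb tags with "VB".
        "n_nouns": sum(tag.startswith("NN") for _, tag in tagged_tokens),
        "n_verbs": sum(tag.startswith("VB") for _, tag in tagged_tokens),
    }


def top_words(tagged_titles, tag_prefix, k=10):
    """Return the k most frequent lowercased words whose tag starts with tag_prefix."""
    counter = Counter(
        word.lower()
        for tagged_tokens in tagged_titles
        for word, tag in tagged_tokens
        if tag.startswith(tag_prefix)
    )
    return counter.most_common(k)


# Hand-tagged hypothetical title for illustration only.
title = "Graphene transistors enable flexible electronics"
tagged = [
    ("Graphene", "NNP"),
    ("transistors", "NNS"),
    ("enable", "VBP"),
    ("flexible", "JJ"),
    ("electronics", "NNS"),
]
features = title_features(title, tagged)
top_verbs = top_words([tagged], "VB")
```

Applying `top_words` separately to the DCP and CCP title sets, once with the prefix "VB" and once with "NN", yields the kind of top-10 lists that can then be compared for words unique to either group.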
5. DISCUSSION
5.1. Implications
This study contributes to the theoretical understanding of scientific breakthroughs and knowledge flow in several significant ways. First, by introducing a novel framework to categorize forward-citing papers into DCP and CCP, this study enriches the conceptualization of how scientific breakthroughs impact the evolution of scientific knowledge. The findings reveal that scientific breakthroughs initially consolidate scientific paradigms, followed by a phase where they disrupt established knowledge structures. This temporal evolution highlights the dynamic nature of scientific progress. It aligns with the concept of cumulative knowledge advancement (de Solla Price, 1963; Uzzi, Mukherjee et al., 2013), reinforcing the notion that scientific progress often relies on the iterative refinement and consolidation of existing knowledge. It also supports the theoretical proposition that disruptive impact plays a crucial role in scientific advancement (Kuhn, 1962), highlighting the transformative potential of Nobel-winning papers in driving scientific progress and underscoring the importance of disruptive ideas in fostering innovation and pushing the boundaries of knowledge.
Second, the analysis reveals temporal shifts in the impact and team characteristics associated with DCP and CCP. In the early postpublication phases, CCP exhibit higher scientific impact, likely driven by early recognition and the preferential attachment effect (Zeng et al., 2017). However, the long-term phase witnesses a rise in the impact of DCP, indicating the emergence of paradigm-shifting knowledge. This temporal analysis enhances our understanding of the evolving patterns of scientific knowledge flow. The more diverse team features associated with DCP align with theoretical perspectives that emphasize the role of collaboration in driving innovation (Jones et al., 2008) and the benefits of diverse team compositions in generating novel ideas (Lin et al., 2023; Wuchty et al., 2007; Yang et al., 2022). This suggests that the nature of collaboration and team composition in the context of knowledge flows in scientific breakthroughs undergoes complex transformations in response to the evolving landscape of knowledge.
The findings of this study hold practical implications for stakeholders in the scientific community. Recognizing the temporal evolution from consolidation to disruption, grant funding could be directed towards foundational research that both consolidates existing knowledge in the early phases and may disrupt paradigms in the long term. Research evaluators and institutions can refine their evaluation metrics to account for the dual nature of scientific impact observed in this study. Traditional citation-based metrics may be complemented with measures that capture the long-term transformative impact of breakthroughs.
5.2. Limitations and Future Avenues
While this study provides valuable insights, several limitations warrant consideration for future research: First, the findings are based on Nobel-winning papers and may not fully generalize to other types of scientific breakthroughs or disciplines. Future research could explore how these dynamics vary across different fields and types of breakthroughs. Second, we use data from the MAG database and employ specific methodologies to categorize citations. Further validation using alternative data sets and methodologies could strengthen the robustness and generalizability of the findings. Third, we do not fully explore how the dynamics of DCP and CCP may vary across different scientific disciplines beyond broad categorizations. Disciplinary differences in research practices, collaboration patterns, and citation behaviors could influence the observed patterns and warrant further investigation. Last, the control group analysis in this study does not account for many confounding factors and is not representative of typical scientific papers. It merely provides insights into the “nonbreakthrough” papers published alongside scientific breakthroughs within the same volume of the same journal. Consequently, we cannot draw counterfactual inferences from this control analysis, nor can we generalize the findings to normal scientific papers.
Future research should extend the analysis to include a wider array of scientific breakthroughs beyond Nobel-winning papers. Investigating different types of high-impact research across various fields could enhance the generalizability and applicability of the findings. Developing and validating new methodologies for classifying citations and measuring disruption and consolidation could enhance the robustness of future studies. Incorporating machine learning techniques, natural language processing, and alternative citation indices could provide more nuanced and accurate classifications. Future research can also undertake large-scale analyses to elucidate the universal patterns of knowledge flow in standard scientific papers, considering both disruption and consolidation perspectives.
AUTHOR CONTRIBUTIONS
Alex J. Yang: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing—original draft, Writing—review & editing. Sanhong Deng: Funding acquisition, Methodology, Project administration, Resources, Supervision, Validation, Writing—review & editing.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
This research was supported by the Open Fund for Innovative Evaluation from Fudan University, the International Joint Informatics Laboratory Fund, and the Fundamental Research Funds for the Central Universities.
DATA AVAILABILITY
The Nobel-winning papers data used in this study are open access at Harvard Dataverse (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/6NJ5RN). The MAG data set can be accessed through Microsoft Academic Services on Microsoft Learn (https://learn.microsoft.com/en-us/academic-services/graph/get-started-setup-provisioning). The version of the MAG data used in our study is the final (latest) version, published December 20, 2021.
Other data used in this study can be obtained by making reasonable requests.
The code is available at https://github.com/AlexJieYang/DCP-CCP. The K-S test analysis and Mann-Whitney U-test used in this study are based on scipy-1.10.1. Text segmentation and preprocessing are based on nltk-3. The figures are based on matplotlib-3.6.0 and seaborn-0.12. Note that Python-3.10 may be needed to replicate the code.
REFERENCES
Author notes
Handling Editor: Vincent Larivière