Abstract
The exponentially growing number of scientific papers stimulates a discussion on the interplay between quantity and quality in science. In particular, one may wonder which publication strategy may offer more chances of success: publishing lots of papers, producing a few hit papers, or something in between. Here we tackle this question by studying the scientific portfolios of Nobel Prize laureates. A comparative analysis of different citation-based indicators of individual impact suggests that the best path to success may rely on consistently producing high-quality work. Such a pattern is especially rewarded by a new metric, the E-index, which identifies excellence better than state-of-the-art measures.
1. INTRODUCTION
The number of scientific papers has been growing exponentially for over a century (Dong, Ma et al., 2017; Fortunato, Bergstrom et al., 2018). The number of papers per author has been relatively stable for a long time, but it has been increasing over the past decades (Dong et al., 2017), favored by the growing tendency of scientists to work in teams (Wuchty, Jones, & Uzzi, 2007).
Such increased productivity is incentivized by career evaluation criteria that typically reward large outputs, making scientists less risk averse when choosing research directions (Franzoni & Rossi-Lamastra, 2017). This, however, may come at the expense of the quality of research outcomes (Bornmann & Tekles, 2019; Sunahara, Perc, & Ribeiro, 2021). Indeed, it has been shown that the exponential growth of the number of publications corresponds to a much slower increase in the number of new or disruptive ideas (Chu & Evans, 2021; Milojević, 2015).
However, although scholars should focus on quality, it is unclear whether it is more rewarding to pursue rare hit papers, have a consistent track record of valuable outputs, or be in between these scenarios. Analyzing the careers of arguably the most successful class of scientists, Nobel Prize laureates, may help address this issue. In particular, we would like to check if there is a dominant path to success in the careers of such illustrious scholars.
To that effect, we consider a broad range of evaluation metrics that reward one-hit wonders alongside those that favor a consistent production of high-quality research and investigate their effectiveness in identifying Nobelists from within a more extensive set of similarly productive scientists. We find that the best-performing metrics are indeed the ones that prioritize a consistent stream of high-quality research.
The rest of this article is organized as follows. We first describe the data collection and curation in Section 2. Then, we briefly review several widely adopted impact metrics and introduce two new ones. In Section 3, we describe and discuss the two sets of experiments we used to check which of the two competing scenarios is more common. Finally, we give our conclusions in Section 4.
2. METHODS
2.1. Data
We consider three fields in which the Nobel Prize is awarded: Physics, Chemistry, and Physiology or Medicine (abbreviated henceforth as Medicine).
The publication records of scientists are obtained from two sources. For Nobelists, we use the hand-curated data set with explicit annotations for prize-winning papers (Li, Yin et al., 2019). As a baseline, we consider scientists with verified Google Scholar (GS) profiles tagged with Physics, Chemistry, Physiology, or Medicine as of May 2021.
We use the 2017 version of the Web of Science (WoS) database to compile the citation statistics of the articles. We deliberately gather data from different sources, as WoS and GS complement each other well. GS offers the possibility of obtaining accurate publication records of individual scientists without the need to perform name disambiguation (Radicchi & Castellano, 2013). WoS lets us reconstruct the citation history of individual papers. Both ingredients are necessary for the type of analysis that we perform in this paper.
We adopt a similar methodology to that of Sinatra, Wang et al. (2016) to match papers across databases. Given a paper written by author a in GS, we list the papers Pa in WoS authored by people with the same last name as a. From Pa, we select the paper p with the highest normalized Levenshtein similarity between the corresponding paper titles (Levenshtein, 1966). We consider it a successful match only if the similarity exceeds 90%; otherwise, we discard the paper from further analysis. Following this procedure, we matched 78.1% of the papers by Nobelists and 49.6% of the papers by baseline scientists. For our analysis, we only consider scientists who published their first paper after 1960 and have a portfolio with at least 10 papers. Detailed statistics are provided in Table 1.
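The title-matching rule above can be sketched in a few lines of Python. The function names and the lowercasing step are our own choices, not taken from the paper's code; the 90% threshold is the one stated in the text.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def title_similarity(t1: str, t2: str) -> float:
    # Normalized similarity in [0, 1]: 1 minus edit distance over the longer length.
    t1, t2 = t1.lower().strip(), t2.lower().strip()
    if not t1 and not t2:
        return 1.0
    return 1.0 - levenshtein(t1, t2) / max(len(t1), len(t2))

def best_match(gs_title: str, wos_titles: list, threshold: float = 0.9):
    # Return the WoS title most similar to the GS title,
    # or None if no candidate clears the 90% similarity cutoff.
    best = max(wos_titles, key=lambda t: title_similarity(gs_title, t), default=None)
    if best is None or title_similarity(gs_title, best) < threshold:
        return None
    return best
```

For instance, a WoS record whose title differs from the GS record by a single character clears the 90% cutoff for any title longer than 10 characters.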
2.2. Metrics
Let us consider a portfolio 𝒫 = {c1, …, cN} of N = |𝒫| papers, where ci is the number of citations of paper i, that collectively receive Ctot citations (i.e., Ctot = Σi ci). We consider the following metrics:
N: total number of papers.
Ctot: total number of citations.
Cavg: average number of citations (i.e., Cavg(𝒫) = Ctot/N).
Cmax: citations received by the most cited paper (i.e., Cmax(𝒫) = max{c1, ⋯, cN}).
H: H-index (i.e., the largest number H such that the top H cited papers have at least H citations each; Hirsch, 2005).
G: G-index (i.e., the largest number G such that the top G cited papers have at least G² citations combined; Egghe, 2006).
Q: Q-index, proposed by Sinatra et al. (2016), Q(𝒫) = exp[(Σi Θ(c10,i) log c10,i)/(Σi Θ(c10,i))], up to a constant factor, where Θ is the Heaviside function (i.e., Θ(x) = 1 if x > 0 and 0 otherwise), and c10,i is the citations gained by paper i within 10 years of publication. We normalize c10,i by dividing it by the average c10 of all papers published in the same discipline and year as paper i (Sinatra et al., 2016).
Q̃: a variant of the unnormalized Q-index, where we use the total number of citations ci instead of c10,i.
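The portfolio-level metrics above can be sketched compactly. The function names below are our own; the Q̃ sketch computes the geometric mean over cited papers, following the definition above.

```python
import math

def h_index(cits):
    # Largest H such that the top H cited papers have at least H citations each.
    ranked = sorted(cits, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, 1) if c >= rank)

def g_index(cits):
    # Largest G such that the top G cited papers have at least G^2 citations combined.
    running, g = 0, 0
    for rank, c in enumerate(sorted(cits, reverse=True), 1):
        running += c
        if running >= rank * rank:
            g = rank
    return g

def q_tilde(cits):
    # Geometric mean of citation counts, taken over cited papers only.
    cited = [c for c in cits if c > 0]
    if not cited:
        return 0.0
    return math.exp(sum(math.log(c) for c in cited) / len(cited))
```

For the portfolio [10, 8, 5, 4, 3], for example, h_index returns 4 and g_index returns 5.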
We observe that each of these measures has its own preferences when ranking portfolios: some, like Cmax, reward one-hit wonders, while others, like H, reward consistency. Since one of the goals of this work is to identify and differentiate Nobelists from baseline scientists, we argue that we need a new, simple, yet interpretable metric that covers the whole spectrum between these two extremes.
2.3. Citation Moment and E-Index
Given a publication portfolio 𝒫, one may consider the following extreme scenarios:
Citations are equally distributed among the papers, with each paper having Ctot/N citations.
A single paper accounts for all citations.
In the first case, there is a sustained production of work of similar quality, while the second represents a one-hit-wonder situation.
2.3.1. Citation moment
The citation moment of order α of a portfolio is the generalized mean of its citation counts, Mα(𝒫) = [(1/N) Σi ciᵅ]^(1/α). Depending on the value of α, Mα interpolates between the two extreme scenarios:
α → 0: Mα behaves like Q̃, as cᵅ ≈ 1 + α log c for small α; unlike Q̃, however, it accounts for uncited papers.
0 < α < 1: Mα is higher for balanced portfolios (i.e., ones with a more uniform distribution of citations).
α = 1: Mα becomes identical to Cavg.
α > 1: Mα is higher for unbalanced portfolios.
α → ∞: Mα closely imitates Cmax.
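The limiting behaviors listed above can be verified numerically. The sketch below implements Mα as a generalized mean (this functional form is our reading of the limits above, consistent with Table 2).

```python
def citation_moment(cits, alpha):
    # Generalized (power) mean of order alpha over all N papers,
    # uncited papers included (0 ** alpha == 0 for alpha > 0).
    n = len(cits)
    return (sum(c ** alpha for c in cits) / n) ** (1.0 / alpha)

balanced = [25, 25, 25, 25]     # Ctot = 100, evenly spread
one_hit  = [97, 1, 1, 1]        # Ctot = 100, concentrated on one paper

# alpha = 1 recovers the average; alpha < 1 favors the balanced portfolio;
# large alpha approaches the most cited paper's count.
m1_balanced = citation_moment(balanced, 1.0)      # 25.0
m_half_gap = citation_moment(balanced, 0.5) - citation_moment(one_hit, 0.5)
m_large = citation_moment(one_hit, 50.0)          # close to Cmax = 97
```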
2.3.2. E-index
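As a working definition consistent with the stylized value E = log n reported in Table 2, one may take the E-index to be the Shannon entropy of the citation shares pi = ci/Ctot: it vanishes for a one-hit wonder and reaches log N when citations are spread evenly over all papers. The exact functional form is an assumption here; a minimal sketch under that assumption:

```python
import math

def e_index(cits):
    # Assumed entropy form: E = -sum_i p_i * log(p_i), with p_i = c_i / Ctot.
    # Uncited papers contribute nothing to the sum (p log p -> 0).
    ctot = sum(cits)
    if ctot == 0:
        return 0.0
    return -sum((c / ctot) * math.log(c / ctot) for c in cits if c > 0)
```

Under this reading, exp(E) can be interpreted as the effective number of papers over which the portfolio's citations are spread.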
2.4. Behavior of Metrics on Stylized Portfolios
To better understand the behavior of the different metrics in our analysis, we consider a portfolio with n cited papers with Ctot/n citations each and N − n uncited papers. In Table 2, we show the values that several key metrics take in this case.
| Metric | Value |
|---|---|
| H | min{⌊Ctot/n⌋, n} |
| G | min{⌊√Ctot⌋, ⌊Ctot/n⌋, N} |
| Q̃ | Ctot/n |
| Mα | (n/N)^(1/α) · (Ctot/n) |
| E | log n |
We see that the citation moment Mα (for α ≠ 0, 1) and the G-index depend on all three parameters n, N, and Ctot, whereas the H-index and Q̃ depend only on the cited papers. So, for example, two portfolios with identical values of Ctot and n would have the same H-index, regardless of the number of uncited papers. Furthermore, even though the G-index depends on all three parameters, it does so in a somewhat undesirable way: a portfolio with more uncited papers may have a G-index greater than or equal to that of another portfolio with identical Ctot and n values. Ranking the portfolio with fewer uncited works (lower N − n) higher, as Mα does, seems more intuitive.
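The closed-form entries in Table 2 can be checked numerically. The sketch below builds a stylized portfolio and recomputes H, G, and E directly from their definitions (the entropy form of E is an assumption consistent with the table's log n entry).

```python
import math

def stylized(n, N, ctot):
    # n cited papers with ctot/n citations each, plus N - n uncited papers.
    return [ctot // n] * n + [0] * (N - n)

cits = stylized(n=5, N=20, ctot=500)   # each cited paper gets 100 citations
ranked = sorted(cits, reverse=True)

# H-index: largest H such that the top H papers have >= H citations each.
H = sum(1 for rank, c in enumerate(ranked, 1) if c >= rank)   # min(100, 5)

# G-index: largest G such that the top G papers have >= G^2 citations combined.
running, G = 0, 0
for rank, c in enumerate(ranked, 1):
    running += c
    if running >= rank * rank:
        G = rank                                              # min(22, 100, 20)

# E-index (assumed entropy form): -sum p_i log p_i with p_i = c_i / Ctot.
ctot = sum(cits)
E = -sum((c / ctot) * math.log(c / ctot) for c in cits if c > 0)
```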
3. RESULTS
In Figure 1, we plot Nobelists and baseline scientists according to their number of papers and the total number of citations. As expected, most Nobelists lie in the top right region, indicating high levels of both productivity and impact. However, there appear to be a few Nobelists in the top left, indicating that they only produced a handful of high-impact papers. To further illustrate this difference, we consider two Nobelists in Physics, David J. Gross (2004) and John M. Kosterlitz (2016), and plot their publication timelines in Figure 2. Gross has a consistent production of high-impact works, but Kosterlitz stands out for having a single big paper.
We now focus on two tasks: portfolio classification and future Nobelist identification.
3.1. Portfolio Classification
We test the performance of the metrics in distinguishing the portfolios of Nobelists from those of the baseline scientists. We consider two subtasks which we describe below. We use the area under the precision-recall curve (AUC-PR) in each task as the performance metric. This curve shows the trade-off between precision and recall at different thresholds. Bounded between 0 and 1, higher AUC-PR values indicate better classification performance. For random predictions, AUC-PR is the fraction of positive samples. AUC-PR is better suited for imbalanced data sets than the area under the receiver operating characteristic curve (ROC-AUC) (Saito & Rehmsmeier, 2015). Results for the ROC-AUC are reported in the Supplementary material and are consistent with the analysis done using AUC-PR.
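A self-contained sketch of the AUC-PR computation in its average-precision form is given below. This is our own implementation, not necessarily the one used in the analysis; it averages the precision observed at the rank of each positive sample.

```python
def average_precision(scores, labels):
    # Area under the precision-recall curve (average-precision form):
    # average, over positives ranked by decreasing score, of the precision
    # at each positive's rank.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, ap = 0, 0.0
    n_pos = sum(labels)
    for rank, i in enumerate(order, 1):
        if labels[i]:
            tp += 1
            ap += tp / rank
    return ap / n_pos if n_pos else 0.0
```

A scorer that ranks every positive above every negative attains an average precision of 1.0, while uninformative scores yield, in expectation, the fraction of positive samples, as stated above.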
Full. We use the entire portfolio of the scientists described in Section 2.1.
Preaward. We construct the preaward portfolio of Nobelists (i.e., the set of papers published before the year of the prize-winning paper), discarding portfolios with fewer than 10 papers. We find that 15 (27%), 28 (55%), and 22 (39%) of the Nobelists in Physics, Chemistry, and Medicine, respectively, satisfy this criterion.
Specifically, for a Nobelist who published their first paper in year y0 and wrote their prize-winning article in year yp, we consider the papers published and citations accrued between years y0 and yp − 1. We then pair the Nobelist with 20 baseline scientists who published their first papers around the year y0 and wrote at least 10 papers in their careers’ first yp − y0 years.
Optimal α selection. Recall that, unlike other measures, Mα has a tunable parameter α. Therefore, for each task, we record the performance of Mα across a range of α values and plot the results in Figure 3. We observe a slight dependence of the optimal α-value (α*) on the task and the field. We use the corresponding α* values while comparing the performance of Mα with other metrics. In each case, however, we find α* < 1, which indicates that portfolios are most separable when the metric prioritizes consistent impact.
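The α sweep can be illustrated on toy portfolios. Everything below is a hedged sketch: the portfolios are hypothetical, and m_alpha and average_precision are our own helpers mirroring the definitions above.

```python
def m_alpha(cits, alpha):
    # Generalized mean of order alpha (uncited papers included).
    return (sum(c ** alpha for c in cits) / len(cits)) ** (1.0 / alpha)

def average_precision(scores, labels):
    # AUC-PR in average-precision form.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, ap, n_pos = 0, 0.0, sum(labels)
    for rank, i in enumerate(order, 1):
        if labels[i]:
            tp += 1
            ap += tp / rank
    return ap / n_pos

# Hypothetical portfolios: "consistent" careers (label 1) vs. "one-hit" careers
# (label 0), with comparable total citations.
portfolios = [([50] * 10, 1), ([40] * 12, 1), ([60] * 8, 1),
              ([450] + [5] * 9, 0), ([480] + [0] * 11, 0), ([400] + [10] * 8, 0)]

# Sweep alpha over a grid and keep the value maximizing average precision.
best = max((a / 10 for a in range(1, 21)),
           key=lambda a: average_precision(
               [m_alpha(c, a) for c, _ in portfolios],
               [y for _, y in portfolios]))
```

On this toy data, the best-separating α lies below 1, echoing the behavior reported for the real portfolios.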
We record the metrics’ performance in Table 3. In the Supplementary material, we report the classification results on the American Physical Society (APS) bibliographic data set.
| Metric | Physics (Full) | Physics (PA) | Chemistry (Full) | Chemistry (PA) | Medicine (Full) | Medicine (PA) |
|---|---|---|---|---|---|---|
| N | 0.03 | 0.07 | 0.13 | 0.12 | 0.06 | 0.06 |
| Ctot | 0.21 | 0.15 | 0.43 | 0.34 | 0.52 | 0.24 |
| Cavg | 0.42 | 0.19 | 0.32 | 0.39 | 0.68 | 0.46 |
| Cmax | 0.24 | 0.12 | 0.25 | 0.21 | 0.49 | 0.18 |
| H | 0.12 | 0.16 | 0.44 | 0.36 | 0.50 | 0.24 |
| G | 0.15 | 0.15 | 0.41 | 0.33 | 0.48 | 0.17 |
| Q̃ | 0.30 | 0.19 | 0.32 | 0.41 | 0.67 | 0.48 |
| Q | 0.08 | 0.15 | 0.13 | 0.20 | 0.26 | 0.45 |
| Mα | 0.43 | 0.34 | 0.49 | 0.53 | 0.78 | 0.68 |
| E | 0.44 | 0.23 | 0.53 | 0.45 | 0.75 | 0.44 |
Metrics agnostic to the distribution of citations, such as the total number of papers N, the total citations Ctot, and the maximum citations Cmax, tend to perform worse than their distribution-aware counterparts in either task. We highlight the performance of three metrics: N, Cavg, and Cmax. N is consistently the worst performer because it accounts only for volume, not impact. Cavg is among the top performers when the whole portfolio is considered. We believe this is partly due to the nature of the distributions observed in Figure 1, where Nobelists are likely to accumulate higher-than-average citations over their careers. Its performance on the preaward portfolios is somewhat worse, probably because the post-award citation boost is excluded: winning the prize has been shown to provide a tangible boost to the overall visibility of a scientist, resulting in more citations (Inhaber & Przednowek, 1976). The number of citations of the most cited paper, Cmax, is among the worst performers, which suggests that the one-big-hit portfolio is not typical among Nobelists. This finding supports the idea that scientists win the Nobel Prize after years of consistent, high-quality work.
We now shift our focus to the other category of indicators (i.e., ones sensitive to the citation distributions). We find that H records mediocre performance despite rewarding consistency. Its dependence on productivity likely fails to account for the Nobelists with a few highly cited papers. The Q-index performs poorly. However, its variant, Q̃, fares considerably better, which is consistent with the fact that Q̃ is similar to Mα for small α.
Mα and E consistently rank in the top two positions. This further supports the hypothesis that Nobelists set themselves apart by producing a steady stream of high-impact work.
3.2. Identifying Future Nobelists
As a test of the predictive power of the metrics, we check whether we can identify scholars who received the Nobel Prize from 2018 to 2022 (i.e., the period not covered by our WoS data set). First, we note that our set of baseline scientists may be missing some of these new Nobelists, in which case we add them manually, provided they have a GS profile.
Then, for each metric, we construct a top 20 list of baseline scientists by ranking them in descending order of the metric and highlighting the Nobelists among them. We report the list for the E-index in the main text (Table 4); the remaining lists can be found in the Supplementary material.
| Rank | Physics | Chemistry | Medicine |
|---|---|---|---|
| 1 | H. Dai | H. Dai | S. Kumar |
| 2 | A. L. Barabási | J. Godwin | R. A. Larson |
| 3 | D. Finkbeiner | R. Ruoff | A. L. Barabási |
| 4 | P. McEuen | K. L. Kelly | G. L. Semenza |
| 5 | I. Bloch | H. Wang | A. S. Levey |
| 6 | A. Ashkin | M. Egholm | S. Paabo |
| 7 | U. Seljak | L. Umayam | R. A. North |
| 8 | S. Inouye | L. Zhang | A. Patapoutian |
| 9 | S. Manabe | R. Freeman | J. Goldberger |
| 10 | M. Tegmark | P. Cieplak | M. Snyder |
| 11 | J. R. Heath | G. Church | J. Magee |
| 12 | L. Verde | D. Macmillan | M. Houghton |
| 13 | S. G. Louie | G. Winter | G. Loewenstein |
| 14 | D. I. Schuster | J. Kuriyan | S. Via |
| 15 | N. D. Lang | J. R. Heath | R. Jaeschke |
| 16 | B. Hammer | E. H. Schroeter | G. Hollopeter |
| 17 | D. Holmgren | W. Lin | S. J. Wagner |
| 18 | M. Lazzeri | W. L. Jorgensen | V. V. Fokin |
| 19 | L. P. Kouwenhoven | J. Clardy | J. Allison |
| 20 | M. Buttiker | D. Zhao | B. Moss |
In Table 5, we show how many Nobelists appeared in the top 20 lists for each metric. E-index outperforms all other indicators, proving particularly effective for Medicine.
| Metric | Physics (9) | Chemistry (8) | Medicine (5) |
|---|---|---|---|
| N | 1 | 0 | 0 |
| Ctot | 2 | 0 | 3 |
| Cavg | 1 | 0 | 3 |
| Cmax | 0 | 0 | 2 |
| H | 2 | 1 | 2 |
| G | 2 | 0 | 3 |
| Q̃ | 1 | 1 | 3 |
| Q | 0 | 1 | 1 |
| Mα | 1 | 0 | 2 |
| E | 2 | 2 | 5 |
To further corroborate this conclusion, we matched each Nobelist with a baseline scientist with (nearly) identical N and Ctot values. In Figure 4, we plot the E-index of each Nobelist and matched baseline pair. We find that the E-index of Nobelists usually exceeds that of their matches. Some exceptions correspond to Nobelists with a low number of highly cited papers. Other outliers might be prominent scholars who have not yet received the award but might receive it in the future.
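The matching step above can be sketched as a nearest-neighbor search over (N, Ctot) profiles. The log-scale Euclidean distance used below is our own assumption; the text specifies only that matched pairs have (nearly) identical N and Ctot values.

```python
import math

def closest_match(target, candidates):
    # Pair a (N, Ctot) profile with the candidate profile at minimal
    # Euclidean distance in the (log N, log Ctot) plane, so that relative
    # (rather than absolute) differences drive the match.
    def dist(a, b):
        return math.hypot(math.log(a[0]) - math.log(b[0]),
                          math.log(a[1]) - math.log(b[1]))
    return min(candidates, key=lambda c: dist(target, c))
```

For example, a scientist with 100 papers and 5,000 citations is paired with the baseline profile closest to those values on a log scale.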
4. CONCLUSION
In this work, we searched for productivity patterns in excellent scientific careers. Specifically, we aimed to assess whether the output of high-profile scientists is more likely to be characterized by a low number of hit papers or by a consistent production of high-quality work. To address this question, we have examined the scientific portfolios of Nobel Prize winners in Physics, Chemistry, and Medicine and checked which citation-based metrics are most suitable to recognize them among a much larger number of baseline scholars. In addition, we introduced two new metrics, the E-index and Mα, that reward both consistency and high average impact (when α < 1).
We found that the best-performing metrics are the ones that peak when citations are distributed among a considerable number of works rather than being concentrated on a few hit papers. The E-index, in particular, proves especially effective in identifying future Nobelists. A portal for the calculation of E-index and other scores of individual performance can be found at e-index.net.
While there are Nobelists whose success relied on isolated hit papers, the most successful scientists usually stayed on top of their game for most of their careers.
ACKNOWLEDGMENTS
We acknowledge Aditya Tandon’s help in this study’s initial phase. This work uses WoS data by Clarivate Analytics provided by the Indiana University Network Science Institute and the Cyberinfrastructure for Network Science Center at Indiana University.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
This project was partially supported by grants from the Army Research Office (#W911NF-21-1-0194) and the Air Force Office of Scientific Research (#FA9550-19-1-0391, #FA9550-19-1-0354).
DATA AVAILABILITY
The data for Nobel laureates is available at Li et al. (2019). The disambiguated APS data set is available at Sinatra et al. (2016). The raw data set for the APS can be requested at https://journals.aps.org/datasets. The code is available at https://github.com/siragerkol/Consistency-pays-off-in-science. WoS data are not publicly available.
REFERENCES
Author notes
Handling Editor: Ludo Waltman