Abstract
This article aims to improve our understanding of scientometric data in a Benfordian context. Recently, Benford’s law has been used to detect scientific fraud. However, we need to better understand its application to scientometric data. Through the implementation of Benford’s law and the generalized Benford’s law, we propose a categorization of science products and metrics. To this end, we have performed chi-square, MAD, and Max tests on data sets from WoS and Scopus as well as on historical data. This enables us to better understand the behavior and characteristics of these objects in a Benfordian context, and invites us to discuss the nature of bibliometric indicators in this particular context.
1. BENFORD’S LAW IN THE FIELD OF INFORMETRICS
This law is surprising, as one would expect a uniform distribution of the different digits. Numerous publications have given different explanations for this unexpected regularity: Gauvrit and Delahaye (2008) recall the well-known fact (cf. for example, Diaconis, 1977) that the uniformity of the logarithmic mantissa log X of a strictly positive real random variable X is equivalent to Benford’s law on X. They propose a sufficient condition for a random distribution to generate numbers that satisfy this law: If a statistical distribution concentrates a significant number of low and high values and has a certain regularity, it roughly conforms to this law.
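As a point of reference, the following minimal sketch tabulates the first-digit probabilities in their usual form, log10(1 + 1/d), and illustrates the mantissa argument recalled above: if the fractional part u of log10 X is uniform on [0, 1), the first digit is d exactly when u falls in [log10 d, log10(d + 1)), an interval whose length is the Benford probability. The function names are illustrative and do not come from the cited works.

```python
import math

def benford_probability(d: int) -> float:
    """Benford probability that the first significant digit equals d (1..9)."""
    if not 1 <= d <= 9:
        raise ValueError("first significant digit must be in 1..9")
    return math.log10(1 + 1 / d)

def first_digit_from_mantissa(u: float) -> int:
    """First digit of X = 10**(k + u) for any integer k, with u in [0, 1)."""
    return int(10 ** u)

benford = {d: benford_probability(d) for d in range(1, 10)}
for d, p in benford.items():
    # Length of the mantissa interval that maps to first digit d.
    interval = math.log10(d + 1) - math.log10(d)
    print(f"digit {d}: {100 * p:.1f}% (mantissa interval length {interval:.4f})")
```

A uniform distribution of digits would give about 11.1% to each digit, whereas this form assigns roughly 30.1% to the digit 1 and 4.6% to the digit 9.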
The work of Delahaye (2012) and Kafri (2023) shows similarities between Zipf’s law and Benford’s law. Kafri (2023) shows that, under certain conditions, Eq. 2 is obtained by applying a Riemann sum to Zipf’s law.
1.1. Evolution of Benford’s Law
In 1976, Raimi wrote an article on Benford’s law called The First Digit Problem, in which he presented a first state of the art of the work (Raimi, 1976, p. 521). He presented 37 papers in chronological order, “deliberately omitting only those references to the problem which make no attempt to add to its understanding … among authors of which are mathematicians, statisticians, economists, engineers, physicists and amateurs.” He also proposed 15 additional bibliographical references which “[do] not refer explicitly to the problem.”
In his book, Nigrini returned to the works identified around Benford’s law (Nigrini, 2012, p. 293). In 2000, he identified 200 articles on Benford’s law. Eleven years later, he had identified 750 papers on the subject. He continued the reasoning of Raimi (1976) by suggesting a difference between statistics, which deals with real data, and mathematics. He pointed out that Benford’s law is of interest to both and motivates studies in both fields. He proposed a classification of the works in eight points with three main categories: papers that prove Benford’s law; those that have an approach to mathematical phenomena; and those that consider Benford’s law as a test, particularly in terms of fraud.
Benford’s law is remarkably widely applicable. It is verified in many collections of numbers enumerating objects of various origins: the number of inhabitants of cities, the distances between stars, the lengths of rivers, the prices in supermarkets, and the citations of journals in a database. Not only does Benford’s law cover many areas, but it is regularly used and cited in different fields. A quick survey of the 50 most cited articles dealing with Benford’s law in the Web of Science (WoS) shows that the fields concerned are, of course, fraud, whether financial or electoral (Mebane, 2011; Tam Cho & Gaines, 2007), but also electrical networks (Wei, Sundararajan et al., 2017); scientific data (Diekmann, 2007; Geyer & Williamson, 2004; Judge & Schechter, 2009); astronomy (Alexopoulos & Leontsinis, 2014); atomic physics (Pain, 2008) and quantum physics (Rane, Mishra et al., 2014); hydrology (Nigrini & Miller, 2007); image processing (Fu, Shi, & Su, 2007; Perez-Gonzalez, Heileman, & Abdallah, 2007); natural sciences (Sambridge, Tkalčić, & Jackson, 2010); and, more recently, epidemiology with COVID-19 (Lee, Han, & Jeong, 2020).
1.2. Integrity of Academic Research
Scientific integrity and trust are the cornerstones of scientific research, because of their relationship with constantly evolving knowledge. In recent years, Benford’s law has become a tool for detecting scientific fraud (Eckhartt & Ruxton, 2023). Reproducibility has long been a persistent problem in academic research, in contradiction with one of the fundamental principles of science. The growing number of false claims found in academic manuscripts is worrying: It goes against the very nature of science and calls the reproducibility of academic research into question. The work of Lazebnik and Gorlitsky (2023) has determined the rate of manipulation in academic research. Furthermore, Schumm, Crawford et al. (2023) offer an approach designed to detect issues in social science surveys when dealing with small sample sizes. An application of this law to medical data has been proposed by Hein, Schuepfer, and Konrad (2011). More recently, Gupta, Singh, and Banshal (2023) propose that Benford’s law can be used to define a framework for assessing the quality of altmetric data.
1.3. Benford Applied to Scientometrics
There are fewer studies in the field of scientometrics, but the theoretical developments and experiments are just as relevant. Over the last decade, several studies have been carried out on data from WoS and Scopus. One focuses on the number of articles, citations, and impact factor of WoS from 1998 to 2007 (Campanario & Coslado, 2011). A second study aims to compare this law on data from WoS and Scopus (Alves, Yanasse, & Soma, 2014).
Benford’s law generalizes to digits beyond the first. The work of Alves, Yanasse, and Soma (2016) focuses on Benford’s law for the second digit.
The work of Hürlimann (2015b) discussed parametric extensions of Benford’s law. For this purpose, the mean absolute deviation (MAD) test developed by Nigrini (2012) was used in the experiment. All of these studies tend to show that the data in these databases, which are at the source naturally produced by researchers, agree in verifying this law.
1.4. The Problem Addressed Is the Understanding of Bibliometric Data
While Benford’s law is well known and studied, there remains a question mark over the nature of the data studied in the field of scientometrics. Before experimenting with Benford’s law on bibliometric data for the purposes of fraud detection or the regulation of science, it is necessary to understand the nature of these indicators in a Benfordian context. The research question concerns the nature of scientometric data.
Indeed, we have found that previous studies of Benford’s law on scientometric data have not explained its behavior in relation to the nature of the scientometric information used. Campanario and Coslado (2011, p. 431) tested the distribution on articles, citations, and impact factor. Their conclusion is as follows: “We have no explanation for these differences.”
Egghe’s work based on Campanario’s data stated in the conclusion that “We consider this to be an interesting discovery, but we have no informetric explanation for it” (Egghe & Guns, 2012, pp. 1663, 1665). Alves’ work, published in 2014 following Campanario and Egghe, did not offer a discussion of the nature of the objects studied, but wondered about anomalies in the data sets and the ability to detect them (Alves et al., 2014). Despite these previous works, there is still a lack of understanding when it comes to reading Benfordian distributions.
Therefore, we propose a typology of the scientometric objects studied, based on the different approaches to a Benfordian distribution. To this end, we have constructed a new data set extracted from WoS from 1997 to 2019 and from Scopus from 1999 to 2019. We have also taken into account data from the work of Campanario and Coslado (2011) and Alves et al. (2014). This led us to a corpus of 181 distributions from different scientometric objects. The data set we used consists of unaggregated data. We have made all the analyzed distributions available: They are included in the Supplementary material and can be downloaded via Zenodo (Bertin & Lafouge, 2024).
The article is structured as follows: After recalling Benford’s law and its generalized form in Section 2, Section 3 describes the construction of the scientometrics data set from WoS, Scopus, and historical data.
Next, we present the tests used in this work. Two well-known tests were employed: the MAD test, developed specifically for this law, and the classic chi-square test. Furthermore, a third test, called Max, is proposed, which represents a compromise between the macro and micro tests.
Section 4 presents the results obtained and a discussion that proposes an analysis at the micro and macro levels. The results lead us to categorize the scientometric data in our study and to propose a typology of distributions based on the results. The distinction between macro and micro data and the tests used allow us to hypothesize about the classes that define the nature of the scientometric information used. It allows us to provide some answers about the nature of the scientometric information used and to better understand the Benfordian phenomena in a scientometric perspective.
Section 5 concludes the study with a summary of this work and of the main results obtained.
2. GENERALIZED BENFORD’S LAW AND ZIPFIAN FORM
2.1. Zipfian Form
In this formalism Benford’s law can be stated as follows: A digit d = 1, …, 9 (sources) produces the numbers (items) whose first digit is d. This scheme is the same in Zipf’s law. In this case the number of sources is finite. This scheme will allow us to modify how the law is written and, consequently, to offer a proposal for generalization.
2.2. Benford’s Law in Zipfian Form
This form led Egghe to generalize this law.
2.3. Statement of the Generalized Benford’s Law
The limit of the generalized law when β → 1 is thus equal to the original Benford’s law (Eq. 1). This result, which is stated without proof in Egghe’s article, reinforces the generalization in our view.
This generalized law can be perceived through the scale invariance of power laws (Raimi, 1976, p. 529; Pietronero, Tosatti et al., 2001).
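The limit β → 1 can be checked numerically. The sketch below assumes that the generalized law takes the power-law form usually associated with Pietronero, Tosatti et al. (2001), P(d) = ((d + 1)^(1−β) − d^(1−β)) / (10^(1−β) − 1); this form is an assumption made for illustration, as the equation itself is not reproduced in the text of this section, and the function name is hypothetical.

```python
import math

def generalized_benford(d: int, beta: float) -> float:
    """Assumed generalized first-digit probability for digit d and exponent beta."""
    if abs(beta - 1.0) < 1e-12:
        # Limiting case: the original Benford's law.
        return math.log10(1 + 1 / d)
    e = 1.0 - beta
    return ((d + 1) ** e - d ** e) / (10 ** e - 1)

for beta in (0.90, 0.99, 1.00, 1.01, 1.10):
    row = [round(100 * generalized_benford(d, beta), 2) for d in range(1, 10)]
    print(f"beta = {beta:.2f}: {row}")
```

Under this assumed form the probabilities sum to 1 for every β, because the numerators telescope to 10^(1−β) − 1, and values of β above 1 put more weight on the digit 1 than Benford’s law does.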
3. METHODOLOGY
The construction of the data set is based both on historical data, for the purposes of validation and reproducibility of this experiment, and on WoS and Scopus data, in order to have a set of scientometric objects such as the impact factor, h-index, number of journals, references, citations, and articles over different periods. In order to respond to the problem posed, namely the understanding of bibliometric objects in a Benfordian context, we will observe the behavior of these scientometric objects during the application of the tests. This observation invites us to reflect on the nature of scientometric objects through discussion.
3.1. Construction of the Scientometrics Data Set
For the construction of the data set that will be used for our experimentation, we retained the data set used by Alves et al. (2014). This is a compilation of the Campanario and Coslado (2011) and Alves et al. (2014) data sets used by Hürlimann (2015a) and Egghe and Guns (2012). The resulting data set contains the historical data as well as the new data we produced from WoS and Scopus.
To create the new data set to test Benford’s law, we collected for each WoS journal between 1997 and 2019 the impact factor and the total number of citations. For each Scopus journal between 1999 and 2019, we collected the h-index, the number of cumulative citations over three years, the number of references over the year, and the number of articles over the year. We also collected the ratio of the number of references to the number of articles. The ratio is a bibliometric indicator, as are the h-index and the impact factor.
For these 181 distributions of WoS and Scopus, we calculate the percentage of each first digit. The results are shown in Tables S1–S10 in the Supplementary material. It was necessary to change the scale to process the 23 distributions of impact factor: We multiply all these values by 10,000. For example, the journal Biologia, whose impact factor was 0.159 in 2000, now has a value of 1,590. Benford’s law has been tested on data extracted from the WoS and Scopus databases.
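The extraction of first-digit percentages can be sketched as follows. This is a minimal illustration, not the processing pipeline actually used: the function names are hypothetical, and skipping non-positive values is an assumption.

```python
from collections import Counter
from decimal import Decimal

def first_significant_digit(value) -> int:
    """Return the first non-zero digit of a strictly positive number."""
    if value <= 0:
        raise ValueError("value must be strictly positive")
    for digit in Decimal(str(value)).as_tuple().digits:
        if digit != 0:
            return digit
    raise ValueError("no non-zero digit found")

def first_digit_percentages(values):
    """Percentage of values whose first significant digit is d, for d = 1..9."""
    # Assumption: zero or negative values carry no first digit and are skipped.
    digits = [first_significant_digit(v) for v in values if v > 0]
    counts = Counter(digits)
    return {d: 100 * counts[d] / len(digits) for d in range(1, 10)}

# Rescaling example from the text: 0.159 becomes 1,590 after multiplying by 10,000.
scaled = round(0.159 * 10_000)
print(scaled, first_significant_digit(scaled))

sample = [0.159, 1.2, 23, 0.031, 1590]  # made-up values for illustration
print(first_digit_percentages(sample))
```

Multiplying by a power of ten leaves the first significant digit unchanged (0.159 and 1,590 both start with 1), so the rescaling only changes how the values are stored.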
3.2. Chi-Square, MAD and Max Tests
The optimal β is then chosen when Dβ is minimal with an error range of 0.01.
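A grid search with a step of 0.01 is one simple way to carry out such a selection. The sketch below rests on two assumptions that are not taken from the text of this section: the generalized law is given the power-law form ((d + 1)^(1−β) − d^(1−β)) / (10^(1−β) − 1), and the distance Dβ is replaced by a sum of squared deviations. The search range of 0.50 to 1.50 is also arbitrary.

```python
import math

def generalized_benford(d: int, beta: float) -> float:
    """Assumed generalized first-digit probability (power-law form)."""
    if abs(beta - 1.0) < 1e-12:
        return math.log10(1 + 1 / d)
    e = 1.0 - beta
    return ((d + 1) ** e - d ** e) / (10 ** e - 1)

def distance(observed: dict, beta: float) -> float:
    """Hypothetical stand-in for D_beta: sum of squared deviations (proportions)."""
    return sum((observed[d] - generalized_benford(d, beta)) ** 2 for d in range(1, 10))

def best_beta(observed: dict, lo: int = 50, hi: int = 150) -> float:
    """Grid search over beta = lo/100, ..., hi/100, i.e. with a step of 0.01."""
    grid = (k / 100 for k in range(lo, hi + 1))
    return min(grid, key=lambda b: distance(observed, b))

# Synthetic check: a distribution generated with beta = 1.08 is recovered.
observed = {d: generalized_benford(d, 1.08) for d in range(1, 10)}
beta_hat = best_beta(observed)
print(beta_hat)
```

Indexing the grid by integers (k / 100) avoids the drift that accumulates when 0.01 is added repeatedly in floating point.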
The values are expressed in percentages in all the tables. To measure the adequacy of the two theoretical distributions with the 181 observed distributions we use three tests.
Equation 13 is only valid for Benford’s law. It shows us the dependence on the number of journals: If two observed distributions are identical, one for 10,000 journals and the other for 20,000 journals, the χ2 will necessarily be in a ratio of 2.
This dependence of χ2 on the number of items (here journals) leads researchers to develop other tests.
We then carry out the classical χ2 goodness-of-fit test at the 5% significance level. The rejection threshold read from the table for eight degrees of freedom (Benford’s law) is 15.51, and for seven degrees of freedom (generalized Benford’s law) it is 14.07.
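The dependence of χ2 on the number of journals can be illustrated with the standard Pearson statistic written in terms of observed proportions, χ2 = N Σ (o_d − p_d)² / p_d. The sketch below uses a made-up observed distribution (Benford’s law shifted by 0.6 percentage points on digits 1 and 9); only the two rejection thresholds come from the text.

```python
import math

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CHI2_CRIT_BENFORD = 15.51      # 8 degrees of freedom, 5% level (quoted in the text)
CHI2_CRIT_GENERALIZED = 14.07  # 7 degrees of freedom, 5% level (quoted in the text)

def chi_square(observed: dict, expected: dict, n: int) -> float:
    """Pearson chi-square computed from proportions and the sample size n."""
    return n * sum((observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10))

# Made-up observed proportions: small shift from digit 9 to digit 1.
observed = dict(BENFORD)
observed[1] += 0.006
observed[9] -= 0.006

chi2_10k = chi_square(observed, BENFORD, 10_000)
chi2_20k = chi_square(observed, BENFORD, 20_000)
print(f"N = 10,000: chi2 = {chi2_10k:.2f}, rejected: {chi2_10k > CHI2_CRIT_BENFORD}")
print(f"N = 20,000: chi2 = {chi2_20k:.2f}, rejected: {chi2_20k > CHI2_CRIT_BENFORD}")
```

With identical proportions, the statistic doubles when the number of journals doubles, so in this example the same distribution is accepted at 10,000 journals and rejected at 20,000.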
The MAD is not a classical test as it does not depend on the size of the sample. It assumes that we know the FSD perfectly. Here, it is considered as a conformity indicator. It is also possible to use the mean χ2 (weighted least squares (WLS): chi-square divided by sample size). This measure is more of an empirical rule for choosing the best adjustment (Hürlimann, 2015b, p. 355). We should also mention the recent application (Cerqueti & Lupi, 2022) of another type of test for large samples: the severity principle.
Table 1 was provided by Nigrini to clarify conformity with the MAD (Nigrini, 2012, p. 160). We adopt these notations in all tables.
Table 1. MAD critical values and conformity to first significant digit (FSD)
MAD critical values | Conformity to Benfordian distribution | Abbreviation |
---|---|---|
MAD ≤ 6 × 10⁻³ | Close Conformity | C |
6 × 10⁻³ < MAD ≤ 12 × 10⁻³ | Acceptable Conformity | AC |
12 × 10⁻³ < MAD ≤ 15 × 10⁻³ | Marginal Conformity | MC |
MAD > 15 × 10⁻³ | Non Conformity | NC |
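The MAD and the conformity classes above can be computed as in the following sketch, which takes the MAD to be the mean of the nine absolute deviations between observed and Benford proportions. The function names and the example distribution are illustrative.

```python
import math

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def mad(observed: dict, expected: dict = BENFORD) -> float:
    """Mean absolute deviation over the nine first digits (proportions, not %)."""
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9

def conformity(mad_value: float) -> str:
    """Map a MAD value to the abbreviations C, AC, MC, NC of the table above."""
    if mad_value <= 0.006:
        return "C"
    if mad_value <= 0.012:
        return "AC"
    if mad_value <= 0.015:
        return "MC"
    return "NC"

# Made-up observed proportions: 0.6 percentage points moved from digit 9 to digit 1.
observed = dict(BENFORD)
observed[1] += 0.006
observed[9] -= 0.006
print(f"MAD = {mad(observed):.5f} -> {conformity(mad(observed))}")
```

The thresholds apply to proportions, so first-digit frequencies expressed in percentages must be divided by 100 before the MAD is classified. The sample size does not enter the computation.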
It is relevant at this step of the discussion to add a complementary test denoted Max for a better interpretation of the data.
This test constructed here is of the same nature as the one using the Z-statistic (Alves et al., 2016, p. 1492).
We note Δp = tα · .
We then carry out a goodness-of-fit test on Max at the 5% significance level. The calculation of Fmax is done in Tables 3–7.
In summary, these three tests are quite different, and all three seem necessary to validate the adjustments. The MAD test, specifically designed for Benford’s law, is unavoidable due to its widespread use in many studies. However, it is not a classical test and of course it is inoperative for the generalized Benford’s law. The chi-square test is the statistical test that seems to us the most relevant for this type of discrete law. However, as this law does not depend on any parameter, we have seen that it is very sensitive to the size of the population. Therefore, it seemed necessary to construct a third test. This test does of course depend on the size of the population, but its rejection threshold is linked to the square root of the population size and is therefore less sensitive.
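The square-root dependence of the Max threshold can be illustrated numerically. The Fmax values reported for WoS citations (in percent) are consistent with 100 × 1.96 / √N, where N is the number of journals. This is an observation about the tabulated numbers, not a restatement of the formula for Δp, which is not reproduced here. Taking tα = 1.96 and defining Max as the largest absolute deviation between observed and theoretical percentages are both assumptions.

```python
import math

T_ALPHA = 1.96  # assumed value of t_alpha (two-sided 5% normal quantile)

def fmax_percent(n_journals: int) -> float:
    """Assumed rejection threshold for Max, in percent."""
    return 100 * T_ALPHA / math.sqrt(n_journals)

def max_deviation_percent(observed_pct: dict, expected_pct: dict) -> float:
    """Assumed definition of Max: largest absolute gap over the nine digits."""
    return max(abs(observed_pct[d] - expected_pct[d]) for d in range(1, 10))

# Number of journals -> Fmax as reported for WoS citations in the tables.
reported = {6634: 2.41, 7146: 2.32, 7249: 2.30, 9644: 2.00, 10804: 1.89, 12872: 1.73}
for n, fmax in reported.items():
    print(f"N = {n}: reported {fmax:.2f}, 196 / sqrt(N) = {fmax_percent(n):.2f}")
```

Doubling the number of journals lowers this threshold by a factor of √2 only, whereas the χ2 statistic itself doubles for identical proportions.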
4. RESULTS AND DATA INTERPRETATION
This section is in three parts. The first part, microanalysis, focuses on distributions over time (i.e., the calculations are made year by year).
Our data is presented in Tables 3–6. Additionally, the results from historical data are presented in Table S7.
The second part proposes a macroanalysis, aggregating distribution data over several years, leading us to construct Table 2. This part also takes into account historical data. The final section presents a categorical organization of the scientometric data correlated with the tests used in this study. The results obtained, which will be discussed, are shown in Table 2.
In this study, we therefore examined 181 distributions. The distribution of the data is as follows. WoS provides 23 distributions for citations and 23 distributions for impact factor. The historical data set provides 10 distributions for citations, 10 distributions for articles and 10 for the impact factor. Finally, Scopus provides 21 distributions for citations, 21 for the h-index, 21 for the number of articles, 21 for the number of bibliographic references and 21 for the ratio. The 181 distributions analyzed can be found in Supplementary material: Tables S1–S10.
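The total of 181 follows from the breakdown above. The short tally below only restates the counts given in this paragraph; the dictionary keys are illustrative labels.

```python
# (source, object) -> number of yearly distributions, as listed in the text.
distributions = {
    ("WoS", "citations"): 23,
    ("WoS", "impact factor"): 23,
    ("historical", "citations"): 10,
    ("historical", "articles"): 10,
    ("historical", "impact factor"): 10,
    ("Scopus", "citations"): 21,
    ("Scopus", "h-index"): 21,
    ("Scopus", "articles"): 21,
    ("Scopus", "references"): 21,
    ("Scopus", "ratio"): 21,
}

by_source = {}
for (source, _obj), count in distributions.items():
    by_source[source] = by_source.get(source, 0) + count

total = sum(distributions.values())
print(by_source, total)
```

The yearly counts match the collection periods: 23 years for WoS (1997 to 2019) and 21 years for Scopus (1999 to 2019).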
The three previous tests are implemented. In all cases, as expected, we observe a hierarchy of validity between the three tests: If the χ2 test validates a fit, then the Max test also validates it, and when it makes sense, the MAD test is also validated, with “Close Conformity” or “Acceptable Conformity” as the critical value. We test the validity of the generalized Benford’s law in all cases. Remember that the studies cited concerning historical data only use the χ2 test (Campanario & Coslado, 2011; Egghe & Guns, 2012).
4.1. Microanalysis
4.1.1. WoS and Scopus data set
The results are in Tables 3–6. For each year, we tested the validity of Benford’s law and the generalized Benford’s law. For these tables the results are read as follows:
WoS corpora from 1997 to 2019 with number of citations and impact factor
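Year | Citations: Journals | Citations: MAD (Benford) | Citations: Fmax (Benford) | Citations: Max (Benford) | Citations: χ2 (Benford) | Citations: β (Benford G) | Citations: Max (Benford G) | Citations: χ2 (Benford G) | Impact factor: Journals | Impact factor: MAD (Benford) | Impact factor: Fmax (Benford) | Impact factor: Max (Benford) | Impact factor: χ2 (Benford) | Impact factor: β (Benford G) | Impact factor: Max (Benford G) | Impact factor: χ2 (Benford G) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|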
1997 | 6,634 | C | 2.41 | 0.84 | 8.45 | 1.01 | 0.8 | 8.33 | 6,537 | C | 2.42 | 0.58 | 4.31 | 0.97 | 0.44 | 2.57 |
1998 | 7,146 | C | 2.32 | 1.04 | 16.74 | — | — | — | 7,037 | C | 2.34 | 0.44 | 5.90 | 0.98 | 0.36 | 3.89 |
1999 | 7,249 | C | 2.30 | 0.65 | 5.58 | 0.98 | 0.48 | 4.35 | 7,142 | C | 2.32 | 0.74 | 14.78 | 0.97 | 0.76 | 12.78 |
2000 | 7,383 | C | 2.28 | 0.40 | 2.78 | 0.99 | 0.29 | 2.10 | 7,289 | C | 2.30 | 1.17 | 24.78 | 0.96 | 0.98 | 19.50 |
2001 | 7,434 | C | 2.27 | 0.72 | 8.54 | — | — | — | 7,332 | C | 2.29 | 1.66 | 35.14 | 0.95 | 1.42 | 27.8 |
2002 | 7,585 | C | 2.25 | 0.46 | 3.13 | 1.01 | 0.25 | 2.84 | 7,481 | AC | 2.27 | 1.13 | 40.37 | 0.96 | 1.17 | 27.31 |
2003 | 7,621 | C | 2.25 | 0.31 | 3.18 | 1.01 | 0.17 | 3.00 | 7,548 | C | 2.26 | 0.84 | 24.05 | 0.96 | 1.08 | 19.93 |
2004 | 7,681 | C | 2.24 | 0.33 | 2.91 | — | — | — | 7,621 | C | 2.25 | 1.39 | 19.96 | 1.01 | 1.20 | 19.44 |
2005 | 7,835 | C | 2.21 | 0.55 | 7.80 | 1.01 | 0.51 | 7.66 | 7,770 | C | 2.22 | 0.71 | 20.04 | 0.99 | 0.90 | 19.46 |
2006 | 7,934 | C | 2.20 | 0.63 | 12.08 | 1.02 | 0.53 | 10.68 | 7,893 | C | 2.21 | 2.25 | 41.27 | 1.06 | 1.57 | 30.14 |
2007 | 8,292 | C | 2.15 | 0.59 | 7.25 | 1.01 | 0.54 | 7.06 | 8,219 | AC | 2.16 | 3.20 | 48.81 | 1.09 | 1.05 | 21.66 |
2008 | 8,605 | C | 2.11 | 0.89 | 18.35 | — | — | — | 8,541 | AC | 2.12 | 3.29 | 64.26 | 1.11 | 0.88 | 18.83 |
2009 | 9,644 | C | 2.00 | 0.69 | 22.60 | 0.99 | 0.74 | 22.27 | 9,567 | AC | 2.00 | 3.22 | 84.14 | 1.11 | 1.34 | 40.60 |
2010 | 10,804 | C | 1.89 | 0.72 | 16.15 | 1.01 | 0.67 | 15.88 | 10,712 | AC | 1.89 | 2.79 | 62.80 | 1.10 | 0.68 | 14.98 |
2011 | 11,302 | C | 1.84 | 0.43 | 7.97 | 0.99 | 0.43 | 7.66 | 11,215 | AC | 1.85 | 2.26 | 91.30 | 1.09 | 1.00 | 27.33 |
2012 | 11,518 | C | 1.83 | 0.29 | 5.88 | 1.01 | 0.29 | 5.69 | 11,455 | AC | 1.83 | 2.83 | 91.22 | 1.11 | 0.94 | 35.20 |
2013 | 11,569 | C | 1.82 | 0.48 | 6.27 | — | — | — | 11,538 | AC | 1.82 | 3.09 | 109.23 | 1.14 | 0.96 | 34.70 |
2014 | 11,813 | C | 1.80 | 0.78 | 12.70 | 0.99 | 0.76 | 13.38 | 11,745 | AC | 1.81 | 2.88 | 109.54 | 1.12 | 1.23 | 53.29 |
2015 | 12,026 | C | 1.79 | 0.73 | 5.54 | 0.98 | 0.25 | 3.46 | 11,984 | MC | 1.79 | 3.97 | 256.80 | 1.16 | 1.28 | 55.72 |
2016 | 12,120 | C | 1.78 | 0.55 | 9.31 | 0.98 | 0.46 | 7.88 | 12,060 | MC | 1.78 | 3.56 | 371.07 | 1.18 | 2.36 | 100.12 |
2017 | 12,236 | C | 1.77 | 0.62 | 12.72 | 0.97 | 0.43 | 6.55 | 12,294 | NC | 1.77 | 4.52 | 371.07 | 1.23 | 2.64 | 104.79 |
2018 | 11,822 | C | 1.80 | 0.69 | 11.04 | 0.97 | 0.30 | 4.30 | 12,525 | NC | 1.75 | 4.28 | 435.14 | 1.24 | 3.43 | 146.76 |
2019 | 12,872 | C | 1.73 | 0.80 | 7.10 | 0.99 | 0.75 | 6.38 | 12,485 | NC | 1.75 | 4.45 | 459.07 | 1.25 | 3.51 | 139.32 |
Note. — indicates that there is no β that improves the result for the generalized Benford’s law.
Scopus corpora from 1999 to 2019 with number of cumulative citations over three years
Year | Journals | MAD (Benford) | Fmax (Benford) | Max (Benford) | χ2 (Benford) | β (Benford G) | Max (Benford G) | χ2 (Benford G) |
---|---|---|---|---|---|---|---|---|
1999 | 14,351 | C | 1.70 | 1.57 | 49.70 | 1.08 | 0.65 | 7.15 |
2000 | 14,457 | C | 1.63 | 1.26 | 30.22 | 1.06 | 0.67 | 9.16 |
2001 | 14,958 | C | 1.60 | 1.02 | 34.65 | 1.06 | 0.76 | 11.48 |
2002 | 15,666 | C | 1.57 | 0.90 | 39.51 | 1.06 | 0.70 | 14.56 |
2003 | 14,649 | C | 1.62 | 1.57 | 49.08 | 1.07 | 0.79 | 10.97 |
2004 | 17,420 | C | 1.54 | 0.80 | 31.54 | 1.05 | 0.68 | 16.15 |
2005 | 18,274 | C | 1.45 | 0.57 | 24.61 | 1.03 | 0.51 | 16.06 |
2006 | 19,738 | C | 1.40 | 0.50 | 16.95 | 1.04 | 0.87 | 13.15 |
2007 | 21,109 | C | 1.35 | 1.00 | 27.24 | 1.05 | 0.31 | 6.30 |
2008 | 22,659 | C | 1.30 | 1.14 | 22.83 | 1.04 | 0.27 | 4.78 |
2009 | 24,262 | C | 1.26 | 0.52 | 16.26 | 1.03 | 0.49 | 8.78 |
2010 | 26,104 | C | 1.21 | 0.93 | 23.63 | 1.04 | 0.21 | 3.66 |
2011 | 27,582 | C | 1.18 | 0.97 | 17.91 | 1.03 | 0.23 | 3.63 |
2012 | 28,865 | C | 1.15 | 0.65 | 18.64 | 1.02 | 0.27 | 12.37 |
2013 | 29,593 | C | 1.14 | 0.70 | 21.56 | 1.04 | 0.29 | 6.16 |
2014 | 30,014 | C | 1.13 | 0.79 | 22.28 | 1.04 | 0.19 | 4.18 |
2015 | 30,526 | C | 1.12 | 0.62 | 33.83 | 1.04 | 0.35 | 15.05 |
2016 | 31,099 | C | 1.11 | 0.43 | 12.27 | 1.02 | 0.25 | 5.23 |
2017 | 31,580 | C | 1.10 | 0.58 | 23.52 | 1.03 | 0.34 | 9.91 |
2018 | 22,659 | C | 1.13 | 0.62 | 22.83 | 1.04 | 0.27 | 4.78 |
2019 | 23,627 | C | 1.17 | 1.02 | 54.08 | — | — | — |
Note. — indicates that there is no β that improves the result for the generalized Benford’s law.
Scopus corpora from 1999 to 2019 with indicators: ratio & h-index
Year | Ratio: Journals | Ratio: MAD (Benford) | Ratio: Fmax (Benford) | Ratio: Max (Benford) | Ratio: χ2 (Benford) | Ratio: β (Benford G) | Ratio: Max (Benford G) | Ratio: χ2 (Benford G) | h-index: Journals | h-index: MAD (Benford) | h-index: Fmax (Benford) | h-index: Max (Benford) | h-index: χ2 (Benford) | h-index: β (Benford G) | h-index: Max (Benford G) | h-index: χ2 (Benford G) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1999 | 13,270 | NC | 1.70 | 9.93 | 1,848.3 | 1.35 | 8.78 | 1,260.5 | 17,113 | C | 1.50 | 1.71 | 56.34 | 0.95 | 1.48 | 38.55 |
2000 | 13,993 | NC | 1.66 | 9.91 | 1,769.4 | 1.29 | 8.89 | 1,302.0 | 17,547 | C | 1.48 | 1.73 | 65.35 | 0.94 | 1.45 | 40.13 |
2001 | 14,512 | NC | 1.65 | 9.54 | 1,933.1 | 1.31 | 8.47 | 1,454.5 | 18,105 | C | 1.46 | 1.69 | 71.21 | 0.93 | 1.36 | 36.76 |
2002 | 15,362 | NC | 1.58 | 9.93 | 2,208.2 | 1.28 | 8.90 | 1,787.3 | 19,165 | AC | 1.42 | 1.95 | 96.78 | 0.92 | 1.57 | 45.26 |
2003 | 15,430 | NC | 1.58 | 9.97 | 2,329.4 | 1.27 | 8.98 | 1,922.0 | 19,760 | C | 1.39 | 2.04 | 105.52 | 0.92 | 1.70 | 53.42 |
2004 | 16,481 | NC | 1.53 | 9.39 | 2,229.0 | 1.23 | 8.54 | 1,925.9 | 20,557 | AC | 1.37 | 1.96 | 91.60 | 0.93 | 1.63 | 48.92 |
2005 | 17,515 | NC | 1.48 | 9.17 | 2,306.1 | 1.25 | 8.25 | 1,910.6 | 22,004 | C | 1.32 | 1.95 | 84.20 | 0.94 | 1.67 | 52.41 |
2006 | 18,784 | NC | 1.43 | 7.97 | 2,327.4 | 1.23 | 7.72 | 1,984.9 | 23,638 | C | 1.27 | 1.94 | 93.28 | 0.94 | 1.60 | 59.95 |
2007 | 19,506 | NC | 1.40 | 8.30 | 2,582.2 | 1.23 | 7.94 | 2,222.3 | 25,405 | C | 1.23 | 1.94 | 101.86 | 0.95 | 1.70 | 72.93 |
2008 | 20,705 | NC | 1.36 | 8.10 | 2,814.4 | 1.23 | 7.85 | 2,427.1 | 27,501 | AC | 1.18 | 2.04 | 118.17 | 0.95 | 1.80 | 91.70 |
2009 | 22,327 | NC | 1.31 | 8.01 | 3,042.9 | 1.20 | 7.84 | 2,707.2 | 29,350 | AC | 1.14 | 2.26 | 167.02 | 0.95 | 2.02 | 133.60 |
2010 | 23,010 | NC | 1.29 | 7.86 | 3,212.9 | 1.20 | 8.48 | 2,944.6 | 31,048 | AC | 1.11 | 2.32 | 194.59 | — | — | — |
2011 | 24,187 | NC | 1.26 | 7.67 | 3,648.3 | 1.13 | 8.37 | 3,502.3 | 32,584 | AC | 1.09 | 2.57 | 239.39 | 0.93 | 2.19 | 165.97 |
2012 | 24,716 | NC | 1.25 | 7.55 | 3,880.6 | 1.08 | 8.38 | 3,818.7 | 33,433 | AC | 1.07 | 2.44 | 241.21 | 0.92 | 2.06 | 168.63 |
2013 | 24,980 | NC | 1.24 | 8.08 | 4,287.1 | 1.07 | 8.72 | 4,252.4 | 33,965 | AC | 1.06 | 2.39 | 268.81 | 0.91 | 1.09 | 149.60 |
2014 | 25,987 | NC | 1.21 | 8.25 | 4,579.6 | 1.01 | 8.73 | 4,579.4 | 34,701 | AC | 1.05 | 2.26 | 250.11 | 0.91 | 1.83 | 123.08 |
2015 | 26,048 | NC | 1.21 | 9.83 | 5,159.2 | 0.95 | 8.82 | 5,133.9 | 35,115 | AC | 1.05 | 2.18 | 246.08 | 0.91 | 1.75 | 125.88 |
2016 | 26,597 | NC | 1.20 | 10.35 | 5,580 | 0.91 | 8.4 | 5,509.9 | 35,505 | AC | 1.04 | 2.18 | 296.89 | 0.90 | 1.65 | 134.43 |
2017 | 25,093 | NC | 1.24 | 13.8 | 7,184.9 | 0.82 | 10.1 | 6,896.2 | 34,464 | AC | 1.06 | 1.86 | 248.66 | 0.90 | 1.38 | 111.47 |
2018 | 24,248 | NC | 1.36 | 16.9 | 8,570.2 | 0.9 | 14.53 | 8,213.9 | 32,447 | C | 1.09 | 1.41 | 128.18 | 0.94 | 1.12 | 72.49 |
2019 | 23,627 | NC | 1.27 | 17.3 | 8,439.9 | 0.7 | 10.6 | 7,645.4 | 30,142 | C | 1.13 | 1.11 | 55.48 | 0.97 | 0.97 | 45.54 |
Note. — indicates that there is no β that improves the result for the generalized Benford’s law.
Table 6. Scopus corpora from 1999 to 2019 with references and articles
Years | References: Journals | Benford MAD | Benford Fmax | Benford Max | Benford χ2 | Benford G β | Benford G Max | Benford G χ2 | Articles: Journals | Benford MAD | Benford Fmax | Benford Max | Benford χ2 | Benford G β | Benford G Max | Benford G χ2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1999 | 13,488 | AC | 1.69 | 1.74 | 79.5 | 0.93 | 1.65 | 53.9 | 14,734 | AC | 1.61 | 2.53 | 185.0 | 1.07 | 2.80 | 162.8 |
2000 | 13,994 | AC | 1.66 | 2.00 | 99.0 | 0.93 | 1.70 | 68.0 | 15,065 | AC | 1.60 | 3.03 | 183.9 | 1.06 | 2.80 | 163.3 |
2001 | 14,152 | AC | 1.63 | 2.38 | 104.4 | 0.93 | 2.05 | 77.8 | 15,844 | AC | 1.56 | 2.82 | 164.2 | 1.08 | 2.68 | 132.8 |
2002 | 16,812 | AC | 1.51 | 2.85 | 161.4 | 1.08 | 2.58 | 127.8 | 15,363 | AC | 1.58 | 2.46 | 160.8 | 0.9 | 2.08 | 99.0 |
2003 | 15,783 | AC | 1.56 | 2.12 | 132.3 | 0.92 | 2.09 | 93.9 | 17,068 | AC | 1.50 | 2.49 | 125.7 | 1.05 | 2.27 | 103.4 |
2004 | 16,481 | AC | 1.53 | 2.40 | 176.6 | 0.94 | 2.33 | 146.9 | 17,567 | AC | 1.48 | 2.65 | 184.2 | 1.08 | 2.34 | 138.8 |
2005 | 17,891 | AC | 1.46 | 2.59 | 175.7 | 0.95 | 2.36 | 153.3 | 18,858 | AC | 1.42 | 2.67 | 189.9 | 1.06 | 2.41 | 160.9 |
2006 | 18,784 | AC | 1.43 | 2.48 | 224.5 | 0.95 | 4.17 | 341.1 | 19,622 | AC | 1.40 | 2.73 | 267.9 | 1.07 | 3.05 | 235.1 |
2007 | 19,847 | AC | 1.39 | 2.56 | 213.1 | 0.95 | 2.55 | 191.6 | 20,598 | AC | 1.36 | 2.50 | 220.6 | 1.07 | 2.50 | 192.3 |
2008 | 21,008 | AC | 1.35 | 2.40 | 240.4 | 0.96 | 2.70 | 225.8 | 21,697 | AC | 1.33 | 2.97 | 291.3 | 1.05 | 2.99 | 272.0 |
2009 | 23,329 | AC | 1.28 | 2.67 | 250.6 | 0.95 | 2.44 | 230.9 | 23,407 | AC | 1.29 | 3.20 | 332.3 | 1.05 | 2.98 | 307.1 |
2010 | 23,308 | AC | 1.28 | 2.15 | 255.7 | 0.98 | 2.60 | 251.4 | 24,100 | MC | 1.26 | 3.29 | 380.2 | 1.05 | 3.13 | 365.6 |
2011 | 24,188 | AC | 1.26 | 2.26 | 270.1 | 0.96 | 2.57 | 207.9 | 25,089 | AC | 1.24 | 3.38 | 371.8 | 1.06 | 3.22 | 340.9 |
2012 | 24,717 | AC | 1.25 | 1.95 | 253.8 | 0.97 | 2.36 | 241.2 | 25,536 | AC | 1.22 | 3.29 | 365.6 | 1.06 | 3.29 | 339.9 |
2013 | 25,170 | AC | 1.24 | 2.07 | 249.7 | 0.97 | 2.38 | 237.1 | 25,975 | AC | 1.22 | 3.34 | 326.7 | 1.07 | 2.94 | 279.2 |
2014 | 26,001 | AC | 1.21 | 2.17 | 283.4 | 0.99 | 2.41 | 233.2 | 26,789 | AC | 1.20 | 2.97 | 321.8 | 1.06 | 2.71 | 279.6 |
2015 | 26,208 | AC | 1.22 | 2.66 | 295.5 | — | — | — | 26,968 | AC | 1.19 | 3.03 | 312.9 | 1.07 | 2.74 | 261.9 |
2016 | 26,599 | AC | 1.20 | 2.66 | 290.5 | — | — | — | 27,330 | AC | 1.18 | 3.14 | 318.4 | 1.06 | 2.88 | 281.8 |
2017 | 25,095 | AC | 1.24 | 2.62 | 237.6 | 0.91 | 2.13 | 236.8 | 25,805 | AC | 1.22 | 3.22 | 328.6 | 1.07 | 2.93 | 285.5 |
2018 | 24,352 | AC | 1.26 | 2.53 | 195.8 | 1.02 | 2.05 | 193.4 | 24,820 | AC | 1.24 | 3.07 | 331.6 | 1.05 | 2.86 | 306.2 |
2019 | 23,627 | C | 1.27 | 1.02 | 54.7 | 0.99 | 1.26 | 54.4 | 24,188 | C | 1.26 | 1.6 | 65.7 | 1.04 | 1.48 | 54.8 |
Note. — indicates that there is no β that improves the result for the generalized Benford’s law.
The green shaded boxes are those where Benford’s law is validated by χ2 for a 95% confidence level. The red shaded and bold boxes are those where Benford’s law is validated by the Max test for a 95% confidence level. We then test the generalized Benford’s law.
Column 7: value of β. When the column is empty, the generalized Benford’s law does not improve on the original; in this case, the optimal β is 1.
Columns 8 and 9: Max and χ2, calculated as before.
The purple shaded and underlined boxes are those where the generalized Benford’s law is validated by χ2 for a 95% confidence level. The red shaded and bold boxes are those where the generalized Benford’s law is validated by the Max test for a 95% confidence level.
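As a rough illustration of the three checks used in these tables, the following Python sketch computes first-digit counts and then the χ2 statistic, the MAD, and a maximum deviation. Several points are assumptions made for illustration only: the MAD labels use Nigrini's usual first-digit thresholds (0.006, 0.012, 0.015), the χ2 critical value is 15.507 (8 degrees of freedom, 95% level), and Max is taken to be the largest absolute gap, in percentage points, between observed and expected digit frequencies. All function names are hypothetical, and the threshold Fmax is meant to be read from the tables rather than recomputed.

```python
import math

# Benford's expected first-digit proportions for d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Chi-square critical value, 8 degrees of freedom, 95% level.
CHI2_CRIT_95_DF8 = 15.507


def first_digit(x):
    """Return the first significant digit of a nonzero number."""
    # In scientific notation the leading character is the first digit.
    return int(f"{abs(x):.15e}"[0])


def digit_counts(values):
    """Count first digits 1..9 over the strictly positive values."""
    counts = [0] * 9
    for v in values:
        if v > 0:
            counts[first_digit(v) - 1] += 1
    return counts


def chi_square(counts, expected=BENFORD):
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, expected))


def mad(counts, expected=BENFORD):
    """Mean absolute deviation between observed and expected proportions."""
    n = sum(counts)
    return sum(abs(c / n - p) for c, p in zip(counts, expected)) / 9


def max_deviation_pct(counts, expected=BENFORD):
    """Largest absolute deviation, in percentage points (assumed definition)."""
    n = sum(counts)
    return 100 * max(abs(c / n - p) for c, p in zip(counts, expected))


def mad_label(m):
    """Map a MAD value to C, AC, MC, or NC (assumed Nigrini thresholds)."""
    if m <= 0.006:
        return "C"
    if m <= 0.012:
        return "AC"
    if m <= 0.015:
        return "MC"
    return "NC"


# Toy input: the first 100 powers of two, a classic near-Benford sequence.
counts = digit_counts([2 ** k for k in range(100)])
print(counts, round(chi_square(counts), 2), mad_label(mad(counts)))
```

With real data, `values` would be one year of a scientometric variable (for example, citation counts per journal), and the Max value would be compared with the Fmax column of the corresponding table.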
The results can be summarized as follows. For the 46 WoS distributions, we have:
The total number of journal citations (Table 3): The MAD classification is always “Close Conformity”; the Max test agrees in all cases. The χ2 test validates 82% of cases.
For the generalized Benford’s law, the result is the same.
The impact factor (Table 3): The MAD classification is “Close Conformity” in only nine cases and “Non Conformity” in three cases; 39% of cases are validated by the Max test and only 13% by the χ2 test.
For the generalized Benford’s law, the χ2 result is unchanged, whereas the Max test improves and validates 80% of cases. β varies between 0.97 and 1.25.
The number of cumulative citations over three years (Table 4): The MAD classification is always “Close Conformity,” and the Max test agrees in all cases. Only one case is validated by the χ2 test.
For the generalized Benford’s law, the χ2 test validates the law in 80% of cases and the Max test in 100% of cases. β varies between 1.02 and 1.08.
h-index (Table 5): The MAD classification is “Close Conformity” in only nine cases and “Acceptable Conformity” in the others. The Max test validates only one case, and the χ2 test validates none.
For the generalized Benford’s law, the Max test validates four cases and the χ2 test none. β varies between 0.90 and 0.97.
Number of bibliographic references/Number of articles (ratio) (Table 5): The MAD classification is “Non Conformity.” The Max and χ2 tests never validate the law, and the generalized Benford’s law brings no change.
Number of bibliographic references (Table 6): The MAD classification is “Close Conformity” in 1 case and “Acceptable Conformity” in 20 cases. The Max and χ2 tests never validate the law, and the generalized Benford’s law brings no change.
Number of articles published (Table 6): The MAD classification is “Acceptable Conformity.” The Max and χ2 tests never validate the law. The generalized Benford’s law is validated in only one case, by the Max test.
4.1.2. Historical data set
We apply our analysis tools to the 2011 scientometric data (Campanario & Coslado, 2011) on the number of articles, the number of citations, and the impact factor of WoS journals from 1998 to 2007. The results are gathered in Table 7 and follow the same presentation as our results in Table 3. The test used by the authors is χ2. On this test, we reach the same conclusions as the authors with their data; the values differ slightly because we do not analyze the same number of journals, but where a value differs, the ranking remains very close. We then apply the MAD test, the Max test, and an adjustment with the generalized Benford’s law. The results for the citation and impact factor data sets between 1998 and 2007 are compared with those obtained from our own collection of WoS data. We were not able to collect the data set corresponding to the number of articles.
Citations (Table 7): The result is expected (see Table 3). The MAD critical value is “Close Conformity” and the Max test is valid in all cases. The χ2 test is valid in all cases.
Number of articles (Table 7): The MAD critical value is “Close Conformity” in one case and “Acceptable Conformity” in the other cases. The χ2 test is never significant. The Max test is significant in eight out of 10 cases. For the generalized law, the Max test is always validated and the χ2 test is significant in six cases. β varies between 0.87 and 0.92.
Impact factor (Table 7): Compared with the results in Table 3, the MAD and Max tests are almost identical. The χ2 test validates three cases, one more than for our data. The Max test validates all cases, as is the case with our data set. For the generalized law, the χ2 test is significant in five cases. β ranges from 0.97 to 1.10.
Table 7. Historical scientometric data set from Alves et al. (2014) and Campanario and Coslado (2011)
Data set | Year | Journals | Benford MAD | Benford Fmax | Benford Max | Benford χ2 | Benford G β | Benford G Max | Benford G χ2 |
---|---|---|---|---|---|---|---|---|---|
Articles | 1998 | 5,188 | AC | 2.72 | 1.4 | 27.8 | 0.91 | 1.02 | 14.72 |
Articles | 1999 | 5,283 | AC | 2.70 | 1.58 | 27.4 | 0.92 | 1.20 | 17.07 |
Articles | 2000 | 5,412 | C | 2.66 | 1.37 | 16.2 | 0.92 | 0.71 | 6.57 |
Articles | 2001 | 5,477 | AC | 2.65 | 2.15 | 38.1 | 0.89 | 1.03 | 10.68 |
Articles | 2002 | 5,607 | AC | 2.61 | 2.01 | 57.9 | 0.87 | 1.16 | 29.77 |
Articles | 2003 | 5,660 | AC | 2.60 | 2.47 | 43.5 | 0.88 | 0.96 | 9.63 |
Articles | 2004 | 5,722 | AC | 2.59 | 1.91 | 31.3 | 0.91 | 0.80 | 10.7 |
Articles | 2005 | 5,887 | AC | 2.55 | 1.97 | 41.5 | 0.90 | 1.11 | 16.77 |
Articles | 2006 | 5,981 | AC | 2.53 | 2.54 | 27.8 | 0.91 | 0.62 | 5.41 |
Articles | 2007 | 6,266 | AC | 2.48 | 2.50 | 31.3 | 0.90 | 0.79 | 5.02 |
Citations | 1998 | 5,467 | C | 2.65 | 1.12 | 15.1 | — | — | — |
Citations | 1999 | 5,550 | C | 2.63 | 1.00 | 7.1 | 0.96 | 0.50 | 4.11 |
Citations | 2000 | 5,696 | C | 2.60 | 0.54 | 4.5 | 0.98 | 0.38 | 3.34 |
Citations | 2001 | 5,752 | C | 2.58 | 0.48 | 5.2 | 0.99 | 0.51 | 4.75 |
Citations | 2002 | 5,876 | C | 2.55 | 0.35 | 3.1 | — | — | — |
Citations | 2003 | 5,907 | C | 2.55 | 0.49 | 3.5 | 0.98 | 0.34 | 2.28 |
Citations | 2004 | 5,968 | C | 2.54 | 0.34 | 3.0 | 0.98 | 0.40 | 1.94 |
Citations | 2005 | 6,088 | C | 2.51 | 0.73 | 11.2 | 0.99 | 0.72 | 11.97 |
Citations | 2006 | 6,166 | C | 2.49 | 0.53 | 9.7 | 1.01 | 0.49 | 9.25 |
Citations | 2007 | 6,417 | C | 2.44 | 0.61 | 8.4 | 1.01 | 0.67 | 8.26 |
Impact factor | 1998 | 5,378 | C | 2.67 | 0.61 | 6.6 | — | — | — |
Impact factor | 1999 | 5,467 | C | 2.65 | 0.64 | 11.3 | 0.98 | 0.91 | 10.5 |
Impact factor | 2000 | 5,607 | AC | 2.61 | 1.14 | 22.2 | 0.99 | 1.10 | 21.7 |
Impact factor | 2001 | 5,670 | C | 2.60 | 1.15 | 20.2 | 0.97 | 1.34 | 8.6 |
Impact factor | 2002 | 5,791 | C | 2.57 | 1.32 | 24.9 | 0.98 | 1.34 | 23.4 |
Impact factor | 2003 | 5,845 | C | 2.56 | 0.76 | 12.5 | 0.99 | 0.75 | 12.1 |
Impact factor | 2004 | 5,918 | C | 2.54 | 1.76 | 16.7 | 1.04 | 0.90 | 12.80 |
Impact factor | 2005 | 6,033 | C | 2.52 | 0.99 | 16.3 | 1.01 | 0.75 | 15.85 |
Impact factor | 2006 | 6,122 | AC | 2.50 | 0.82 | 39.3 | 1.06 | 1.52 | 27.84 |
Impact factor | 2007 | 6,359 | AC | 2.46 | 2.74 | 40.4 | 1.10 | 1.06 | 14.04 |
Note. — indicates that there is no β that improves the result for the generalized Benford’s law.
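The counts quoted for the “Articles” block of Table 7 can be re-derived from the tabulated values, as the short Python sketch below illustrates. It rests on two assumptions that this section does not state explicitly: a test is counted as validated when the statistic is strictly below its threshold (Max below Fmax, χ2 below the 95% critical value), and the generalized law is tested with 7 degrees of freedom (critical value about 14.067) because β is fitted, against 8 degrees of freedom (about 15.507) for Benford's law. The names `ARTICLES` and `tally` are hypothetical.

```python
CHI2_CRIT_DF8 = 15.507  # 95% level, Benford's law (8 degrees of freedom)
CHI2_CRIT_DF7 = 14.067  # 95% level, one fitted parameter (assumed for Benford G)

# (year, Fmax, Max, chi2, Max_G, chi2_G), copied from the "Articles" rows of Table 7.
ARTICLES = [
    (1998, 2.72, 1.40, 27.8, 1.02, 14.72),
    (1999, 2.70, 1.58, 27.4, 1.20, 17.07),
    (2000, 2.66, 1.37, 16.2, 0.71, 6.57),
    (2001, 2.65, 2.15, 38.1, 1.03, 10.68),
    (2002, 2.61, 2.01, 57.9, 1.16, 29.77),
    (2003, 2.60, 2.47, 43.5, 0.96, 9.63),
    (2004, 2.59, 1.91, 31.3, 0.80, 10.7),
    (2005, 2.55, 1.97, 41.5, 1.11, 16.77),
    (2006, 2.53, 2.54, 27.8, 0.62, 5.41),
    (2007, 2.48, 2.50, 31.3, 0.79, 5.02),
]


def tally(rows):
    """Count the years in which each test validates the law."""
    result = {"max": 0, "chi2": 0, "max_g": 0, "chi2_g": 0}
    for _year, fmax, mx, chi2, mx_g, chi2_g in rows:
        result["max"] += mx < fmax
        result["chi2"] += chi2 < CHI2_CRIT_DF8
        result["max_g"] += mx_g < fmax
        result["chi2_g"] += chi2_g < CHI2_CRIT_DF7
    return result


print(tally(ARTICLES))
# {'max': 8, 'chi2': 0, 'max_g': 10, 'chi2_g': 6}
```

Under these assumptions the tally matches the text: eight of 10 years for the Max test, none for χ2, and, for the generalized law, all 10 years for Max and six for χ2. With the 8-degrees-of-freedom threshold applied to the generalized law, 1998 (χ2 = 14.72) would also pass, giving seven, which is why the 7-degrees-of-freedom reading is assumed here.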
In summary, the results are almost identical for both data sets. The differences are due to the different number of journals analyzed. For example, for the impact factor in 2003, 7,548 journals are collected in one case and 5,845 in the other.
4.2. Macroanalysis
After an initial study of the years, we propose a macroanalysis of the 181 distributions by aggregating the temporal data. A digit-by-digit analysis (i.e., a column-by-column analysis of the 90 distributions in the appended tables) is carried out. We calculate the mean, median, and standard deviation of the nine distributions of digit values over the total number of years (23 for Scopus, 21 for WoS, 10 for historical data). We will define the Benfordian distribution of averages.
4.2.1. Distribution of averages
This distribution is all the more relevant if the mean is representative of the value of each digit over all periods. A quick examination of the tables in the appendices, where the mean, median, and standard deviation are calculated, shows that the distribution of values has a Gaussian appearance (median and mean close together) with a low standard deviation. Figure 2 shows the histogram of the 23 values of digit 1 for the WoS citations, whose average is 29.93. This leads us to calculate the average distributions for each data set.
Variation in the value of digit 1 of the WoS citation data set (see Table S7 in the Supplementary material).
To produce Table S7 in the Supplementary material, on which this analysis relies, we use these 90 distributions to construct the 10 average distributions. Each digit average is compared with the corresponding theoretical value. Except for the data set corresponding to the ratio Number of references/Number of articles (see Table S7), good conformity with the theoretical value is observed.
Only the MAD test is possible, as the number of journals varies from year to year. Except for the ratio data set, the other distributions are C or AC. The three citation distributions are of type C.
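The digit-by-digit aggregation described in this subsection can be sketched in Python as follows. The example is illustrative only: the three “years” are made-up placeholder distributions (Benford's proportions with small symmetric perturbations), not values from Table S7, and the function names are hypothetical. Only the MAD is computed on the averages, in line with the remark that the other tests are not applicable when the number of journals varies.

```python
import math
import statistics

# Benford's expected first-digit proportions for d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def digit_summary(yearly):
    """Per-digit mean, median, and standard deviation across years.

    yearly: list of 9-element lists of first-digit proportions, one per year.
    """
    columns = list(zip(*yearly))  # one column per digit
    return [
        {
            "digit": d,
            "mean": statistics.mean(col),
            "median": statistics.median(col),
            "stdev": statistics.stdev(col),
        }
        for d, col in enumerate(columns, start=1)
    ]


def mad_of_means(summary, expected=BENFORD):
    """MAD between the distribution of averages and the expected proportions."""
    return sum(abs(row["mean"] - p) for row, p in zip(summary, expected)) / 9


# Placeholder data: three hypothetical years that average out to Benford's law.
up = BENFORD.copy()
up[0] += 0.01
up[8] -= 0.01
down = BENFORD.copy()
down[0] -= 0.01
down[8] += 0.01
toy_years = [BENFORD, up, down]

summary = digit_summary(toy_years)
print(summary[0])
print(mad_of_means(summary))
```

Comparing the mean with the median for each digit, as in `summary`, is one simple way to check the “Gaussian appearance” mentioned above before relying on the averages.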
4.3. Discussion
This work has made it possible to apply Benford’s law and the various related tests to several bibliometric objects. Together with the proposed categorization, the macro- and microanalyses lay the foundations for a reflection on the nature of the scientometric objects used. The macroanalysis shows that the nine average distributions have MAD classifications of C or AC (see Table 2); the distribution of ratios is not taken into account.
In the microanalysis, Benford’s law was first tested with the three tests MAD, χ2, and Max. From a quantitative and global point of view, the results obtained for the 160 distributions show that:
154 distributions (96%) are of type C or AC;
58 distributions (36%) validate the χ2 test for the generalized Benford’s law; and
98 distributions (61%) validate the Max test for the generalized Benford’s law.
However, microanalysis allows us to refine the results and to observe differences between the different types of distributions. We then grouped the homogeneous distributions into three classes (see Table 2):
A first class: the 54 distributions of citations produced by scientific activity;
A second class: the 52 distributions relating to the number of references and the number of articles produced; and
Finally, a last class: the 54 distributions concerning the scientometric indicators h-index and impact factor, which are metrics of science.
The results clearly show that the three categories can be ranked according to the success of the statistical tests (χ2 and Max) (see Table 2).
Citations come first (100% Max, 56% χ2), indicators second (35% Max, 11% χ2), and references and articles third (17% Max, 0% χ2).
Second, we tested the generalized Benford’s law. The ranking remains unchanged. In all three categories, the generalized law improves the results. The range of variation of β is only 0.11. The most significant improvement is for items in the historical data set (Table 7).
How can we explain these differences in applicability? We have observed Benford’s law on a wide range of scientometric data. Some scientometric distributions do not verify Benford’s law, and these cases invite us to reflect on their nature. We can divide the scientometric objects into two main categories, namely Product of Science and Metrics of Science. The Product of Science class contains citations, references, and articles. The Metrics of Science class contains impact factor, h-index, and ratio. To understand this phenomenon, we discuss in turn citations, then references and articles, and finally bibliometric indicators.
4.3.1. Citations
Benford’s law applies particularly well to bibliographic citations.
Whatever the data set considered, the citation distributions follow Benford’s law (Gauvrit & Delahaye, 2008). Indeed, citations reflect the actual use of scientific articles by other researchers. Scientometrics has shown that citation distributions are typically very uneven: a few articles receive a large number of citations, while the majority receive far fewer.
4.3.2. References and articles
The number of references in an article, unlike the number of citations, is limited by the researcher’s practice: extremes are rare. For the number of articles in a journal, the constraint is imposed by the publishing process itself. Thus, for this class of scientometric elements, Benford’s law applies poorly. Although this may seem paradoxical at first glance, because the terms citation and reference are easily interchanged, it is important to understand that citations are generated naturally.
References and the number of articles, on the other hand, are limited by editorial practices. The length of an article is capped by editorial policy rather than arising naturally, and this in turn constrains the number of bibliographic references. These constraints also affect the distribution of significant digits, which partly explains the poor results obtained.
4.3.3. Bibliometric indicators
For bibliometric indicators, the applicability of Benford’s law depends on the specific characteristics of the bibliometric data in question.
The impact factor is a specific indicator that measures the frequency with which a journal’s articles are cited in other articles over a given period. It is calculated as the average number of citations received by a journal’s articles over a given period (e.g., 2 years). The h-index reflects both the number of publications and the number of citations of a researcher. The ratio does not follow Benford’s law. One possible interpretation is that references and the number of articles are limited by editorial practice. As the length of articles is limited, so is the use of references. However, we have not categorized the journals, whether paper, digital, or megajournal, and this work remains to be done.
4.3.4. Generalized Benford’s law
The generalized Benford’s law naturally gives better results when fitting a distribution. Indeed, Section 2 shows that the generalization consists in adding a parameter, as for Zipf’s law, which in its primitive form has no parameter. This improvement can be seen by comparing the χ2 and Max values for the two laws (see Tables 3–6). We also observe that the generalized Benford’s law is all the more relevant when the scientometric objects in a class already respect Benford’s law (see Table 2).
In our experiment, the generalized Benford’s law does not improve the results for distributions that are not Benfordian. It is important to point out that Benford’s law is not universal and does not apply to all types of scientometric data; see, for example, the distribution of the ratio. Indeed, distributions that are not constrained by the system or by humans conform better to Benford’s law. This partly explains the open questions in the literature (Alves et al., 2014; Campanario & Coslado, 2011; Egghe & Guns, 2012).
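To make the role of the extra parameter concrete, the Python sketch below fits β by a simple grid search that minimizes χ2. It assumes a common power-law form of the generalized law, P(d) = ((d+1)^(1−β) − d^(1−β)) / (10^(1−β) − 1), which reduces to Benford's law as β tends to 1; the exact expression used in Section 2 may differ, so this form, the grid bounds, and the function names are all assumptions made for illustration.

```python
import math


def generalized_benford(beta):
    """First-digit probabilities under an assumed power-law generalization."""
    if abs(beta - 1.0) < 1e-12:
        # Limit case: the classical Benford's law.
        return [math.log10(1 + 1 / d) for d in range(1, 10)]
    a = 1.0 - beta
    return [((d + 1) ** a - d ** a) / (10 ** a - 1) for d in range(1, 10)]


def chi_square(counts, probs):
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def fit_beta(counts, lo=0.50, hi=1.50, step=0.01):
    """Return the grid value of beta that minimizes the chi-square statistic."""
    steps = round((hi - lo) / step)
    grid = [round(lo + k * step, 2) for k in range(steps + 1)]
    return min(grid, key=lambda b: chi_square(counts, generalized_benford(b)))


# Synthetic check: counts generated from beta = 0.9 should be recovered.
synthetic = [100_000 * p for p in generalized_benford(0.9)]
print(fit_beta(synthetic))
```

Because β = 1 belongs to the grid, the fitted χ2 can never be worse than the Benford χ2 in this sketch, which is consistent with the remark that an added parameter naturally improves the fit.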
5. CONCLUSION
This paper focuses on scientometric objects and their behavior. It is important to consider the nature of scientometric observables. To this end, we have built a corpus that allows us to experiment with different tests. All the data produced are freely available for reproducibility purposes. We have also confirmed the results of our colleagues. We used the MAD, Max, and χ2 tests, applying Benford’s law and the generalized Benford’s law. The latter proves to be a useful generalization for certain scientometric objects, such as citations.
We have shown that Benford’s law applies particularly well to citations and, to a lesser extent, to bibliometric indicators. Benford’s law is not easily applicable to bibliographic references and articles. After proposing a categorization, we put forward an explanation of this phenomenon, together with the constraints that apply to these objects.
The next step in this work is to extend the analysis to other scientometric objects. Indeed, we can consider altmetrics, as suggested by the work of Gupta et al. (2023). Because this work follows an abductive approach grounded in observations of data sets, it would also be relevant to formalize mathematically the concept of constraint around scientometric objects.
ACKNOWLEDGMENTS
The authors would like to express their gratitude to the reviewers for their valuable feedback, which has been incorporated into this version of the manuscript.
AUTHOR CONTRIBUTIONS
Marc Bertin: Conceptualization, Data curation, Funding acquisition, Investigation, Project administration, Resources, Supervision, Validation, Visualization, Writing—original draft, Writing—review & editing. Thierry Lafouge: Conceptualization, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing—original draft, Writing—review & editing.
COMPETING INTERESTS
The authors have no competing interests.
DATA AVAILABILITY
The data used in this study are published in https://doi.org/10.5281/zenodo.12698510.
REFERENCES
Author notes
Handling Editor: Vincent Larivière