We test the feasibility of incorporating broad social, political, and governance indicators with standard metrics as a way to enrich assessment of national research capacity. We factor analyze two sets of variables for 174 countries from 2012 to 2021, one being traditional measures associated with national science and technology capacity, such as spending, and a second being broader social, political, and governance measures, such as academic freedom. As expected, two factors emerge, one for raw or “core” research capacity and the other indicating the wider governance context. Further analysis shows convergent validity within the two factors and divergent validity between them. The analysis also quantifies the contribution of each indicator to each factor. Nations rank differently for each factor and also when combined. Ranks vary as a function of the chosen aggregation method. As a test of the predictive validity of the capacity index, we find both factors to be associated with country-level field-weighted citation indices. Policymakers and analysts may find useful feedback from this approach to quantifying national research strength.

Numerous indicators exist to measure inputs, processes, outputs, and outcomes of scientific capacity, which are often used to assess the relative strength of nations. A common approach is to combine numerous indicators into aggregate indices. However, indices (e.g., the Global Competitiveness Index) have several drawbacks for policymakers. They often incorporate a great deal of collaborative international data and general economic data that obscure the scientific performance and productivity of a single nation. Moreover, bibliometric databases tend to have biases but remain the only viable way to conduct international comparisons of national scientific performance (Van Leeuwen, Moed et al., 2001). We offer an alternative approach to existing indices, one that combines data sources in a principled way to explore the prospects for developing a national index of scientific capacity as input to the policy process. The resulting index fills a need for insight into capacity to better empower assessment grounded in national contexts. We identify and index two sets of indicators as factors broadly understood as “raw capacity” and “governance.” To further test the validity of these factors, we use them as predictors of national scientific impact, measured by fractional field-weighted citation impact.

The approach follows Lundvall’s theory (2016) of national systems of innovation (NSI) where he characterizes the dynamic interaction of factors of innovation as having “core” and “wider context” elements (see box 9.8 in Lundvall’s work). We define Lundvall’s core as “raw capacity” within the research system with widely accepted indicators, and the “wider context” to include elements of national “governance,” which have been used less frequently. Our approach to index creation provides insight into the relative contributions of each indicator to distinct factors suggesting a more systematic understanding of the index creation process. Practitioners may find this approach to index generation useful when comparing national science and technology policies and their impacts, while scholars may find it useful to analyze elements of the global research system.

Using existing indicators of inputs, activities, outputs, and context, we seek to represent national capacity to conduct research. Capacity is understood as access to and ability to use resources, as opposed to simply an input or an output. We undertake this work for three reasons:

  • existing indices target economic strengths, not research capacity, making them less useful for national science policy purposes;

  • data from more countries have become more widely available, providing an opportunity to make a research index more useful; and

  • methods are needed to better compare nations on their capacity to conduct research.

Moreover, composite indices help policy analysis by making units comparable (Nardo & Saisana, 2009), and comparability is precisely what assessing national capacities requires. This broad acceptance of indices is attributed to their ability to “effectively encapsulate intricate and occasionally evasive matters across various domains” (Nardo & Saisana, 2009, p. 2), which is also our goal.

Following this introduction, the paper is divided into four parts. The literature review justifies the choice of indicators for this paper. Then, the methodology for constructing the index, the data chosen and applied, and the statistical analyses utilized are described. Following the discussion of the methodology and data, we present the results of statistical analyses. In the conclusion, the results are discussed focusing on the index’s applicability to future research, and the next steps for using the index to compare nations.

First, we compare existing innovation indices and explain why our proposed index fills a gap. Then, we draw from literature to present a conceptual framework that categorizes our variables into three broad categories:

  • raw scores of research capacity, Lundvall’s “core capacity”;

  • broad measures of social and political context, Lundvall’s “broader context”; and

  • outcomes of science, specifically, citation impact.

In contrast to a standard literature review approach, which presents testable hypotheses, our approach is to mine the literature for candidate indicators of national research capacity (see Supplementary material).

2.1. Existing Innovation Indices

Indices are a construct of broad interest. Indices of national innovation, competitiveness, and knowledge are compiled each year by different analytic organizations (e.g., World Bank) to provide input to business and public policy by ranking countries on baskets of variables. These reports often focus on economic growth, business, and commerce; they do not have the goal of decoupling the national from the international data nor do they disaggregate the role of public research. From a policy perspective, economic and competitiveness indices such as these have limited value as feedback to R&D policy support. Of the economically oriented indices, the most prominent are the Global Competitiveness Index (GCI), the Global Innovation Index (GII), and the Global Knowledge Index (GKI), each of which takes a slightly different approach to measuring innovation, but all of which include business data. The method proposed here does not include business, trade, financial, or market data as these features do not reflect science. Moreover, we do not include global data because we desire to measure national capacity. Our index does not draw upon these reports, but we summarize them here to contrast our approach with theirs.

  • The Global Competitiveness Index is published by the World Economic Forum in its Global Competitiveness Report (GCR); this work was most recently issued in 2019. The GCI draws upon 110 variables and a survey of businesspeople, covering 141 countries. GCI variables are grouped into “pillars” with nested indices covering human capital; market conditions; policy environment and enabling conditions; technology and innovation; and physical environment. The outcomes provide ranks of countries by their competitiveness.

  • The Global Innovation Index (GII) is copublished annually by Cornell University, INSEAD Business School, and the World Intellectual Property Organization (WIPO). The GII is a multilevel index with 81 variables, nested into subindices around seven pillars: institutions; human capital and research; infrastructure; market sophistication; business sophistication; knowledge and technology outputs; and creative outputs. GII covers the economies of 132 countries in the 2022 report.

  • The United Nations endorses the publication of the Global Knowledge Index (GKI) compiled by the Mohammed Bin Rashid Al Maktoum Knowledge Foundation (2022). GKI is a multilevel index combining 199 variables, nested in subindices, covering areas including preuniversity education; higher education; research, development, and innovation; information and communications technology; economy; and an enabling environment consisting of governance, socioeconomic, and health and environment measures. GKI covers 138 countries.

These reports mix national data (such as exports) with data that have a high degree of globalization (such as international patents), which does little to help science policymakers assess national policy impacts on research strengths. We seek to generate a measure of capacity that is disentangled from business and international data, which both obscure the national public research contribution.

Li, Zhang, and Liu (2020) developed a citation-based scientific capability identification (CISCI) tool to examine national capability in what they termed “dual science roles” by examining both cited and citing behavior in a networked structure. They find, for 158 countries, different contributing roles for different fields of science; they further show that these roles and rankings change over time. The United States, Canada, and the United Kingdom were consistently highly ranked by their tool, with rapid improvements for China. This admirable approach provides useful insights into national capacities, but we have the further goal of testing the wider context of governance, which is not covered by Li et al. (2020).

2.2. Measures of Research and Development Capacity

This section reviews the literature on the R&D indicators used in our index. May (1997) compared the scientific wealth of nations for high-performing countries by calculating the number of scholarly articles, along with citations to these articles, on a nation-by-nation basis. May found that 15 nations accounted for 81% of all scholarly articles, a group that has expanded since his article. King (2004) conducted an analysis similar to May's (1997), examining the national distribution of the top 1% of most highly cited papers, finding the United States to be dominant at that time, a finding that has also changed.

Previous studies of research capacity include Barro and Lee (1994), Martin (1996), and Frame (2005); Cole and Phelan (1999) identified and assessed the usefulness of indicators of knowledge-creating capacity, many of which are codified in the Frascati Manual (OECD, 2015). A consensus emerged around spending on research and development, and the number of trained individuals, as two core indicators of R&D capacity (Romer, 1989). Other studies have added regulatory quality and political stability (Furman, Porter, & Stern, 2002; Lundvall, 2016). Still others measured knowledge production (number of published articles) (May, 1997) and the number of domestic patent registrations (Lepori et al., 2008; Narin, 1994; Schmoch, 2004; Tong & Frame, 1994). Analysts often include the number of research-conducting academic and nonacademic research institutions in discussions of national capacity (Knack & Keefer, 1995). International cooperation, fractionally counted to attribute numbers to participating countries, is also commonly used as an indicator of global engagement and openness (OECD, 2015; Wagner & Jonkers, 2017).

Research and development expenditure is a major contributor to the knowledge economy, and it is always included in any measure of research capacity. Gross expenditure on research and development (GERD) is universally recognized in the literature as a key indicator of knowledge-creating capacity, and there is a strong correlation between R&D spending and economic growth (Howitt, 2000; Salter & Martin, 2001). Gulmez and Yardımcıoglu (2012) discovered positive effects of R&D spending on the income of 21 OECD member nations between 1990 and 2010, demonstrating that a 1% increase in R&D spending led to a 0.77% increase in economic growth. Other authors have studied the relationship between scientific investment and national development and discovered a significant correlation between R&D investment and both short-term and long-term growth in developing and developed nations (Gittleman & Wolff, 1995; Goel & Ram, 1994; Gumus & Celikay, 2015). Adams (1990), May (1997), and King (2004) demonstrate significant correlations between R&D expenditures and economic expansion.

Education and human resources are included in any assessment of national research capacity. In earlier attempts to measure national scientific activity, Frame (2005) emphasized the importance of education, while Barro (1991), citing work by Nelson and Phelps (1966), showed that nations with highly trained human capital were better able to absorb new products or ideas. Fedderke (2005) supports Romer (1990), noting that quality, rather than quantity, of human capital contributes to total productivity at the national level. Capacity for training, education, and knowledge transfer is also essential: Schofer, Ramirez, and Meyer (2000) showed that the size of scientific labor sources and training systems had a positive effect on national economic growth. Cole and Phelan (1999) identified a positive relationship between the number of research scientists and economic growth, and Barro (1991) showed that GDP growth is positively related to the availability of trained human capital.

Scholarly productivity is often used as an indicator of research capacity. The number of scholarly articles published is widely used as an indicator of the strength of a national research sector (Martin, 1996; OECD, 2015). Recent research by Miao, Murray et al. (2022) demonstrates a correlation between a nation’s scientific output and its economic growth and complexity (see also Ahmadpoor & Jones, 2017; Cilliers, 2005). Past productivity forms a basis for expectations of productive capacity in the future. This complements the findings of Cimini, Gabrielli, and Sylos Labini (2014), who discovered that OECD member states had more diverse research systems, when measured in articles, than developing countries (see also OECD, 2021).

Patent counts are used as indicators for technological development or entrenched knowledge (World Intellectual Property Organization Patentscope, 2020–2022). Crosby (2000) identified a positive correlation between the number of patents and economic growth, while Kogan, Papanikolaou et al. (2017) support Furman et al. (2002), demonstrating that the scientific content of the patent is positively correlated with the patent’s value to the economy. Patent offices often differentiate between national and international patents. Our index uses only national/residential patents.

2.3. Research on Social and Political Context

The importance of good governance and political stability for knowledge-based economic growth is well documented (Bäck & Hadenius, 2008). Rule of law and freedom from corruption are correlated with higher economic growth and expansion of a knowledge economy (Haggard, MacIntyre, & Tiede, 2008). Barro (1996) and Cole and Phelan (1999) showed that growth is correlated with the maintenance of the rule of law, free markets, small government consumption, and high human capital. Other studies have shown a positive relationship between political stability, technological change, and growth (Barro, 1991; Barro & Lee, 1994; Hall & Jones, 1999) and between democracy and growth (Barro, 1996). Berggren and Bjørnskov (2022) showed correlations between academic freedom and innovation. Whetsell et al. (2021) showed the relevance of democratic governance in predicting national performance in science: specifically, that the level of polyarchy, measured through Varieties of Democracy Project data (Coppedge, Gerring et al., 2011, 2023), is a significant correlate of field-weighted citation impact at the national level. Wang, Feng et al. (2021) show similar effects on technology.

There is mixed evidence regarding the role of regulations, standards, and enforcement in promoting research and development and science capacity. Some economists showed that regulatory burdens can hinder innovation, competitiveness, and national trade positions (Hahn & Hird, 1991). In contrast, Porter and van der Linde (1995), discussed in Blind (2012), suggest that, while ambitious environmental regulations may be costly for national industry at the outset, regulations may help to improve international competitiveness and increase exports of environmental technologies over the longer term. Our view aligns with that of Blind (2012)—that regulations and standards aid research and innovation—and we include the Coppedge et al. (2011) data on regulatory quality in our index.

The enforcement of intellectual property rights (IPR) has also garnered considerable research interest. According to Blind (2012, p. 393), innovation is supported by “...institutional regulations that ensure adequate enforcement of intellectual property rights.” Blind (2012) cites work demonstrating that IPR regulations have an advantageous effect on the R&D intensity of the G7 nations. Greenhalgh and Rogers (2010) demonstrated that IPR enforcement serves as an indicator of the quality of research. The rule of law facilitates the invention and innovation processes in both the public and private research sectors.

The ability of researchers to access new ideas, or diffusion, is supported as critical to research capacity and success across many parts of the literature: Björk and Magnusson (2009) explore the role of interactions among researchers as tied to innovation, while Lopez-Vega, Tell, and Vanhaverbeke (2016) ask where and how to search, focusing on the internet. The role of the internet in improving and enhancing search has received attention in high-level policy documents such as OECD’s report, “Economic and Social Benefits of Internet Openness” (OECD, 2016). We use the open internet measure drawn from the Varieties of Democracy data.

As a practical starting point for constructing composite indices, our guide was the OECD handbook on constructing composite indicators (Nardo & Saisana, 2009). The handbook suggests the following actions: (a) establish a theoretical framework; (b) choose variables; (c) impute missing data; (d) conduct multivariate analysis; (e) normalize data; (f) weight and aggregate data; and (g) present findings. We generally follow the Nardo and Saisana (2009) framework, with the added test of predictive validity after step (f). This section will discuss variable selection, data sources, and analysis techniques, as the theory development was covered in the previous section. All analysis was conducted in the R programming language (R Core Team, 2021).

3.1. Choice of Indicator Variables and Data Sources

Table 1 describes all the variables used in the analysis, identified from the literature and drawn from existing databases, and shows the variables chosen for potential selection into the proposed national research capacity index: research and development spending (RD) as a raw number (not GDP normalized); number of resident patent applications (ResPatents); number of academic institutions affiliated with publications (AcadInst); number of nonacademic research institutions affiliated with publications (NonAcadInst); number of unique authors listed on publications (Authors); number of publications fractionally counted by nation (Pubs); number of papers that are international collaborations, fractionally counted by nation (IntlPubs); open internet access (OpenInternet); rule of law (RuleLaw); regulatory quality (RegQuality); political stability (PolitStability); noncorruption (NonCorrupt); electoral democracy (Polyarchy); and academic freedom (AcadFreedom). Elsevier’s field-weighted citation index (FWCI), fractionally counted, is used as a dependent variable in the regression models.

Table 1. Variable descriptions and summary statistics

| Variable name | Description | Data source |
| --- | --- | --- |
| RD | Gross research and development spending: raw number | World Bank Indicators |
| ResPatent | Number of resident patent applications | World Bank Indicators |
| AcadInst | Number of academic institutions: paper affiliation | Scopus/Elsevier |
| NonAcadInst | Number of nonacademic institutions: paper affiliation | Scopus/Elsevier |
| Authors | Number of unique authors: paper affiliation | Scopus/Elsevier |
| Pubs | Number of publications: fractional count | Scopus/Elsevier |
| IntlPubs | Number of international co-pub papers: fractional count | Scopus/Elsevier |
| OpenInternet | Country approach to regulating/controlling Internet | Varieties of Democracy |
| RuleLaw | Rule of law: crime, judicial & contract effectiveness | World Bank Indicators |
| RegQuality | Regulatory quality: burden of regulation on markets | Varieties of Democracy |
| PolitStability | Political stability: probability of gov. destabilization | World Bank Indicators |
| NonCorrupt | Control of corruption: use of public power for private gain | World Bank Indicators |
| Polyarchy | Electoral Democracy Index | Varieties of Democracy |
| AcadFreedom | Academic Freedom Index | Varieties of Democracy |
| FWCI | Fractional Field Weighted Citation Index | Scopus/Elsevier |

Data were gathered from the World Bank Indicators (World Bank Databank, 2020–2022), available through the R package WDI (Arel-Bundock, 2022); the Varieties of Democracy Project (Coppedge et al., 2023), available through the R package vdemdata (Maerz, Edgell et al., 2020); and an Elsevier database (data were obtained through email communication with researchers at Elsevier).

3.2. Missing Data

Finding a balance between data coverage over the number of countries and comprehensiveness over important aspects of science is an inherent difficulty in the construction of an index. No index can be expected to include every variable on every country related to a subject. A requirement for more detailed data will necessarily lead to fewer nations, regions, or groups being included in the analysis. For example, developing countries typically collect and provide less data than developed countries, and the available statistical data may be less reliable. Because many existing indices cover smaller samples of countries, we sought to construct our index to include as many nations as possible. However, numerous variables of interest had low data coverage. This presents an interesting problem for research. High-quality measures are generally available for countries whose status is already well known and whose research systems are well developed, while data is lacking on those that are likely to experience the greatest change over time and whose status is of particular interest. For these reasons, missing data remains a persistent issue for studies of national scientific capacity.

Our missing data strategy is as follows. First, we focus on the most recent 10 years of available data, covering 2012 to 2021. Data from Scopus/Elsevier are comprehensive across almost all countries, so these data formed the base sample for the subsequent merging of data. The Varieties of Democracy (Coppedge et al., 2023) data also covered many countries. The World Bank Indicators had the lowest data coverage, specifically on RD and ResPatents. We did not include, for example, tertiary enrollment because of low data coverage. Among the sample, if data were only partially missing (available in some years but not others), we imputed the mean from the available country/year observations. Next, the Research and Development Expenditure (RD) and Resident Patents (ResPatent) series contain no zero values, and it is not clear whether absent entries represent true zeros or missing data. Multivariate imputation by chained equations (MICE) was used to impute values, implemented with the R package mice (Van Buuren & Groothuis-Oudshoorn, 2011). Imputation for both was based on the other capacity variables, AcadInst through IntlPubs, using the predictive mean matching option to produce five imputed data sets, of which the pooled mean was taken as the imputed value. Alternatively, the analyst may choose to omit missing data here or drop these two variables from the analysis altogether. These choices resulted in a sample of 174 countries.
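The within-country mean step can be sketched as follows. This is a Python/pandas stand-in for the paper's R workflow; the column names and values are illustrative, and the subsequent MICE step is not shown:

```python
import numpy as np
import pandas as pd

# Toy panel: each country has a variable observed in some years, missing in others.
panel = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year":    [2012, 2013, 2014, 2012, 2013, 2014],
    "RD":      [10.0, np.nan, 14.0, np.nan, 8.0, np.nan],
})

# Partial missingness: fill each country's gaps with that country's own mean
# over the years where the value is available.
panel["RD"] = panel.groupby("country")["RD"].transform(lambda s: s.fillna(s.mean()))
```

Country A's gap is filled with mean(10, 14) = 12, and country B's gaps with its single observed value, 8. Values that are missing in every year for a country would remain missing here and fall through to the MICE stage.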

3.3. Methods of Analysis

We apply exploratory factor analysis (EFA) to identify the underlying factor structure between all the variables. EFA is a statistical method used to identify unobserved “latent factors” that manifest in numerous observable indicators (Cudeck, 2000). It is commonly used to justify the reduction of numerous variables into aggregate indices. EFA computes the pairwise correlation matrix of a set of variables, then computes the eigenvalues and eigenvectors of the matrix, which are used to identify the amount of variance in the indicator variables explained by each factor (eigenvalues) and the direction of the relationship between the variables and the underlying factor (eigenvectors). In the present context, we use EFA to identify whether the indicators listed in Table 1 (excluding FWCI) represent a coherent underlying latent factor, called “national research capacity.” Practically, candidate variables are found in a variety of sources and formats, and they are often gathered for reasons unrelated to research capacity. As such, their relationship with one another becomes more important than what they represent individually. Because numerous variables measure essentially the same factor, EFA helps to economize the multiplicity of empirical indicators. Factor analysis allows us to make statements about the convergence or divergence of these empirical measurements as they relate to national research capacity. We use the R package psych to conduct EFA using the principal factor method and varimax rotation (Revelle, 2024), retaining two factors as indicated by the eigenvalue and eigenvector matrices (Grice, 2001).
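The eigendecomposition at the heart of this procedure can be sketched as follows. This is an illustrative Python stand-in (the paper's analysis used R's psych package) on synthetic data constructed to contain two latent factors, mirroring the capacity/governance structure:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two latent factors (think "capacity" and "governance"), each driving
# three noisy indicators -- a stylized stand-in for the Table 1 variables.
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([f1 + 0.3 * rng.normal(size=n) for _ in range(3)] +
                    [f2 + 0.3 * rng.normal(size=n) for _ in range(3)])

R = np.corrcoef(X, rowvar=False)        # pairwise correlation matrix
eigvals = np.linalg.eigh(R)[0][::-1]    # eigenvalues, descending

n_factors = int(np.sum(eigvals > 1))    # Kaiser criterion: retain eigenvalues > 1
explained = eigvals[:n_factors].sum() / eigvals.sum()
```

With this construction the decomposition recovers two dominant factors, each absorbing the shared variance of its three indicators, analogous to the two-factor result reported for the real data.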

The Cronbach’s alpha test, which examines the internal consistency and relatedness of a set of variables, is used to assess scale reliability (Revelle & Condon, 2019). This test is conducted after EFA to provide additional evidence that the variables identified by EFA have scale reliability prior to aggregation. Higher scores indicate greater internal consistency. In general, a value greater than 0.7 indicates adequate scale reliability. To aggregate variables into an index for the cross-sectional regression model, we chose the factor regression score extraction method. We used a summative index for the panel regression because there appear to be no established methods to generate factor regression scores in panel data.
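Cronbach's alpha can be computed directly from item variances. A minimal Python sketch (the paper used R's psych package; the data here are synthetic indicators of one latent trait):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=1000)
# Four noisy indicators of the same latent trait -> high internal consistency.
items = np.column_stack([latent + 0.5 * rng.normal(size=1000) for _ in range(4)])
alpha = cronbach_alpha(items)
```

Because all four items share most of their variance with the latent trait, alpha comes out well above the conventional 0.7 threshold for adequate scale reliability.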

Additionally, we wish to demonstrate the predictive validity of the index by testing its relationship with other well-established variables. To achieve this, we employ fractional Field Weighted Citation Impact (FWCI) to measure the impact of national research. FWCI is the ratio of the total citations received by the unit (country) to the total citations expected based on the average of the subject field, document type, and year. At the country level, the index is aggregated across all research domains. The data are further fractionalized in cases of international collaboration to represent country-specific contributions. FWCI has gained acceptance in the scientometrics literature as a valid indicator of citation impact, and fractional counting is a growing standard for analysis (Purkayastha, Palmaro et al., 2019; Sivertsen, Rousseau, & Zhang, 2019; Waltman & van Eck, 2015).
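As a toy illustration of the ratio just described (the numbers are invented, not Elsevier's, and the aggregation is simplified to two papers):

```python
# FWCI is the ratio of citations received to citations expected for papers
# of the same field, document type, and year. Fractional counting weights
# each paper by the focal country's share of authorship.
papers = [
    {"citations": 12, "expected": 8.0, "country_share": 0.5},  # intl collab, half-credited
    {"citations": 3,  "expected": 6.0, "country_share": 1.0},  # sole-country paper
]

received = sum(p["citations"] * p["country_share"] for p in papers)
expected = sum(p["expected"] * p["country_share"] for p in papers)
fwci = received / expected  # > 1 means above world average for comparable papers
```

Here the country receives 9 fractional citations against 10 expected, giving an FWCI of 0.9, slightly below the world baseline of 1.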

To examine the influence of the national research capacity index on FWCI, we employ Bayesian multilevel regression with the R package brms (Bürkner, 2017). This method allows us to account for the distinctions between regions and countries in our data (see also Huggins and Izushi (2008)), revealing the relationship between research capacity and research impact across the globe. Bayesian methods have found rapid acceptance as an alternative to the frequentist approach of conventional regression techniques that employ significance testing based on the p-value. In place of a binary evaluation of statistical significance, Bayesian methods generate credibility intervals for parameter estimates of interest, focusing the analyst on gradations of uncertainty (Gelman, Hill, & Vehtari, 2020).
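Schematically, the multilevel structure can be written as follows (our notation, not the authors' exact brms specification):

```latex
\mathrm{FWCI}_c \;=\; \beta_0 + \beta_1\,\mathrm{Capacity}_c + \beta_2\,\mathrm{Governance}_c + u_{r[c]} + \varepsilon_c,
\qquad u_r \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_c \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

where c indexes countries and r[c] is the region containing country c. Bayesian estimation places priors on the coefficients and variance components and reports posterior credibility intervals for each, rather than p-values.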

This section summarizes the results of the statistical analysis. First, we present descriptive statistics. Second, we show the results of the exploratory factor analysis. Finally, we present the Bayesian regression models that predict research impact using the index.

Prior to presenting the findings, it is useful to provide some additional remarks on the empirical methodology employed. Longitudinal data were collected for all variables spanning the years 2012 to 2021. To our knowledge, there are no established methods for conducting exploratory factor analysis (EFA) on panel data. Consequently, we aggregated all the data based on the within-country mean for the time frame to conduct the EFA. Practitioners and analysts may choose different time frames, or collapse data on different metrics than the mean, or construct indices year-by-year as the data become available.

Table 2 displays the list of indicators along with descriptive statistics for the variables analyzed. The n column shows the number of countries for which data are included; mean, sd, min, and max summarize each indicator's central tendency, spread, and range. Prior to conducting the EFA, the natural logarithm (+1) was applied to variables exhibiting significant skewness. All variables pertaining to raw research capacity, from research and development (RD) through international publications (IntlPubs), were transformed, resulting in more normally distributed variables. The variables pertaining to governance exhibited less skewed distributions and did not require log transformation.
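The transform is the simple log(x + 1). A synthetic illustration of its effect on skewness (Python rather than the paper's R; the data are invented, heavy-tailed counts standing in for the capacity variables):

```python
import numpy as np

rng = np.random.default_rng(2)
# Heavily right-skewed "count" style data, as raw publication or patent counts tend to be.
pubs = rng.lognormal(mean=5, sigma=2, size=10_000)

ln_pubs = np.log1p(pubs)  # natural log of (x + 1), as in the paper's ln_* variables

def skew(x):
    """Sample skewness: third standardized moment."""
    x = np.asarray(x)
    return np.mean(((x - x.mean()) / x.std()) ** 3)

raw_skew, log_skew = skew(pubs), skew(ln_pubs)
```

The raw series has extreme positive skewness driven by a few very large values; after the log(+1) transform the distribution is close to symmetric, which is what makes the transformed variables better behaved in the factor analysis.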

Table 2. Descriptive statistics of the entire data set, 2012–2021

| | n | mean | sd | min | max |
| --- | --- | --- | --- | --- | --- |
| ln_RD | 174 | 19.833 | 2.495 | 13.471 | 27.049 |
| ln_ResPatent | 174 | 5.065 | 2.558 | 0.693 | 13.899 |
| ln_AcadInst | 174 | 2.681 | 1.584 | | 7.295 |
| ln_NonAcadInst | 174 | 2.14 | 1.856 | | 7.985 |
| ln_Authors | 174 | 7.786 | 2.461 | 2.912 | 13.973 |
| ln_Pubs | 174 | 6.746 | 2.752 | 1.48 | 13.163 |
| ln_IntlPubs | 174 | 5.794 | 2.355 | 1.279 | 11.443 |
| OpenInternet | 174 | 0.353 | 1.546 | –3.572 | 2.372 |
| RuleLaw | 174 | 0.55 | 0.307 | 0.021 | 0.998 |
| RegQual | 174 | –0.127 | 1.002 | –2.33 | 2.045 |
| Stability | 174 | –0.207 | 0.947 | –2.747 | 1.48 |
| NonCorrupt | 174 | –0.124 | 1.01 | –1.677 | 2.272 |
| Polyarchy | 174 | 0.524 | 0.254 | 0.017 | 0.919 |
| AcadFreedom | 174 | 0.633 | 0.294 | 0.011 | 0.971 |
| FWCI | 174 | 0.781 | 0.255 | 0.267 | 1.606 |

A correlogram of all variables is shown in Figure 1. The pairwise correlations for each pair of variables are displayed in the figure's upper triangle. Each variable's distribution is shown on the diagonal, and each bivariate scatterplot appears in the lower triangle, along with fit lines that roughly depict the slope of the correlation. The figure displays two regions of higher correlations, where the governance measures and the raw capacity metrics each correlate more strongly among themselves. The figure also shows how both sets of metrics correlate with FWCI: the governance measures appear to correlate more strongly with FWCI than the raw capacity measures do, a point discussed further below.

Figure 1. Correlogram of all variables.

A preliminary test was performed to ensure that factor analysis would be a suitable tool for assessing the relationships among the variables. The Kaiser-Meyer-Olkin measure of sampling adequacy (MSA) quantifies the amount of shared variance among the items and indicates whether they are suitable for factor analysis; a value close to 1 suggests that the variables have a high level of common variance. The test yielded a value of 0.88, a score in the 0.80s range that Kaiser and Rice (1974) characterized as "meritorious."
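The overall MSA can be reproduced from the correlation matrix and its inverse, as in this sketch on synthetic data (the data and names are hypothetical; the formula is the standard KMO definition, not the paper's code):

```python
import numpy as np

def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(X, rowvar=False)
    Rinv = np.linalg.inv(R)
    # Partial correlations, obtained from the inverse correlation matrix.
    d = np.sqrt(np.outer(np.diag(Rinv), np.diag(Rinv)))
    P = -Rinv / d
    off = ~np.eye(R.shape[0], dtype=bool)          # off-diagonal entries only
    r2, p2 = (R[off] ** 2).sum(), (P[off] ** 2).sum()
    return r2 / (r2 + p2)                          # near 1 => factorable

# Synthetic data: two correlated blocks of three indicators each.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(200, 1)), rng.normal(size=(200, 1))
X = np.hstack([f1 + 0.5 * rng.normal(size=(200, 3)),
               f2 + 0.5 * rng.normal(size=(200, 3))])
msa = kmo(X)
```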

Following these results, eigenvalue decomposition was conducted, revealing three factors with eigenvalues greater than 1 (8.03, 3.3, and 1.13). However, the scree plot in Figure 2 shows that only two factors sit above the inflection point of the curve, while the third barely exceeds 1. Further, the third factor "loads" unevenly across the variables (loadings are discussed below). The scree plot also displays the cumulative percentage of variance accounted for by each successive factor, indicating that two factors explain roughly 81% of the variance in the variables.
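The retention decision can be sketched with a plain eigenvalue decomposition of the correlation matrix (synthetic two-factor data; the Kaiser criterion and cumulative percentages follow their standard definitions):

```python
import numpy as np

# Synthetic data with two latent dimensions across six indicators.
rng = np.random.default_rng(1)
f1, f2 = rng.normal(size=(300, 1)), rng.normal(size=(300, 1))
X = np.hstack([f1 + 0.4 * rng.normal(size=(300, 3)),
               f2 + 0.4 * rng.normal(size=(300, 3))])

R = np.corrcoef(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]       # values on a scree plot
cum_pct = np.cumsum(eigvals) / eigvals.sum() * 100   # cumulative % of variance
n_retained = int((eigvals > 1).sum())                # Kaiser criterion
```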

Figure 2. Scree plot of factor eigenvalues and cumulative percentage.

Next, the standardized factor loadings from the EFA are presented in Table 3. The first two columns show the associations between the variables and the underlying factors. The factor loadings reveal the degree to which each variable "loads" on the two factors identified by the EFA, ranging from –1 to 1. The variables ln_RD through ln_IntlPubs exhibit strong positive loadings on Factor One, all exceeding 0.7, and relatively low loadings on Factor Two. In contrast, OpenInternet through AcadFreedom show strong loadings on Factor Two, exceeding 0.71, and relatively low loadings on Factor One.

Table 3. Standardized loadings, communality, uniqueness, and complexity

Variable         Factor One   Factor Two   Communality   Uniqueness   Complexity
ln_RD            0.774        0.245        0.66          0.34         1.199
ln_ResPatent     0.706        0.025        0.499         0.501        1.002
ln_AcadInst      0.933        0.044        0.873         0.127        1.005
ln_NonAcadInst   0.881        0.351        0.899         0.101        1.309
ln_Authors       0.965        0.194        0.969         0.031        1.081
ln_Pubs          0.955        0.208        0.956         0.044        1.094
ln_IntlPubs      0.951        0.241        0.962         0.038        1.127
OpenInternet     0.051        0.775        0.603         0.397        1.009
RuleLaw          0.272        0.923        0.925         0.075        1.173
RegQual          0.441        0.786        0.812         0.188        1.572
Stability        0.105        0.719        0.528         0.472        1.043
NonCorrupt       0.382        0.768        0.735         0.265        1.466
Polyarchy        0.165        0.904        0.845         0.155        1.066
AcadFreedom      0.024        0.812        0.659         0.341        1.002

The general pattern of loadings aligns with Lundvall's theoretical framework of "core" and "wider context." Factor One corresponds to measures associated with core capacity, whereas Factor Two corresponds to measures of governance. A third factor was explored but showed relatively inconsistent loadings across both sets of variables and an eigenvalue just above 1, as discussed above. The loading pattern supports convergent validity within each of the two factors and divergent validity between them. However, RegQual, NonCorrupt, and ln_NonAcadInst show cross-loadings above 0.25 on their opposing factors, weakening divergent validity; for this reason, analysts may choose to drop these items.

The Communality column in Table 3 shows the proportion of variance in each variable that is explained by the overall factor model. For example, ln_AcadInst has a communality of 0.87, meaning 87% of its variance is accounted for by the extracted factors. The Uniqueness column is the complement of communality (1 minus communality), showing the proportion of variance not explained by the model. The Complexity column indicates the degree to which a variable is explained by more than one factor, with higher values indicating that multiple factors explain the variable. For example, ln_NonAcadInst has the highest complexity score, consistent with its cross-loading on Factor One and Factor Two. An analyst might wish to remove this variable depending on the research question.
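The last three columns of Table 3 follow directly from the loadings. As a check on two rows of Table 3 (the complexity measure here is Hofmann's index, which reproduces the reported values):

```python
import numpy as np

# Loadings (Factor One, Factor Two) for two rows of Table 3.
loadings = np.array([
    [0.774, 0.245],   # ln_RD
    [0.881, 0.351],   # ln_NonAcadInst
])

sq = loadings ** 2
communality = sq.sum(axis=1)        # variance explained by the factor model
uniqueness = 1.0 - communality      # complement of communality
complexity = communality ** 2 / (sq ** 2).sum(axis=1)   # Hofmann's index
# Rounds to Table 3's 0.66 / 0.34 / 1.199 for ln_RD and
# 0.899 / 0.101 / 1.309 for ln_NonAcadInst.
```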

Next, we present the results of the Cronbach's alpha test of scale reliability on the items loading separately on each factor. The test assesses the degree to which the items in a set are interrelated and suitable for aggregation. Cronbach's alpha ranges from 0 to 1, with higher values indicating higher average interitem reliability; a value of 0.7 or higher is typically considered acceptable for aggregation. The items loading on Factor One, representing raw capacity (ln_RD through ln_IntlPubs), yielded an overall alpha of 0.96 and a standardized alpha of 0.97. Similarly, the items loading on Factor Two, representing governance (OpenInternet through AcadFreedom), yielded an overall alpha of 0.85 and a standardized alpha of 0.94. Both sets of variables therefore exhibit high internal reliability, suggesting aggregation is appropriate.
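Cronbach's alpha itself is simple to compute; a minimal sketch on hypothetical item scores (not the paper's data):

```python
def cronbach_alpha(items):
    """Raw Cronbach's alpha; items is a list of equal-length score lists."""
    k, n = len(items), len(items[0])
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    # Total score per observation across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Three hypothetical items scored for five "countries".
items = [[1, 2, 3, 4, 5],
         [2, 2, 3, 5, 5],
         [1, 3, 3, 4, 6]]
alpha = cronbach_alpha(items)
```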

Next, we demonstrate how one might use such indices. To visualize how countries rank on raw capacity, governance, and their combination, Figure 3 shows three plots based on the factor regression scores extracted from the factor analysis. Factor regression scores are estimated values assigned to each observation (each country, in this context) based on the shared variance captured by the factors; they provide an economical way of generating aggregate representations of indexed variables. The left plot ranks countries from highest to lowest on core research capacity, the middle plot ranks countries on governance, and the right plot ranks the product (interaction) of the two indices. The scores were standardized before plotting. Table 4 lists the top 10 countries in each ranking shown in Figure 3.
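The combined ranking step can be sketched with hypothetical factor scores for five countries (note that a product index also rewards countries scoring low on both factors, a caveat analysts may wish to weigh):

```python
import math

# Hypothetical factor regression scores for five countries.
capacity   = {"A": 2.1, "B": 1.4, "C": 0.2, "D": -0.5, "E": -1.0}
governance = {"A": -0.8, "B": 1.6, "C": 1.0, "D": 0.3, "E": -0.2}

def standardize(scores):
    """Convert scores to z-scores (mean 0, sd 1)."""
    vals = list(scores.values())
    m = sum(vals) / len(vals)
    sd = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return {k: (v - m) / sd for k, v in scores.items()}

z_cap, z_gov = standardize(capacity), standardize(governance)
# Combined index: product (interaction) of the standardized scores.
combined = {k: z_cap[k] * z_gov[k] for k in capacity}
ranked = sorted(combined, key=combined.get, reverse=True)
```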

Figure 3. Factor score country ranks.
Table 4. Factor score country ranks

Capacity factor   Governance factor   Capacity × Governance
China             Luxembourg          United States
United States     Iceland             Germany
Japan             Estonia             Great Britain
Russia            Denmark             Japan
India             Norway              France
Germany           Finland             Australia
Great Britain     New Zealand         Canada
Brazil            Ireland             Switzerland
Turkey            Sweden              Netherlands
Iran              Switzerland         Spain

To illustrate how different aggregation methods produce different country rankings, the rankings are reconstructed using a simpler summative index (Figure 4 and Table 5). Some practitioners and scholars may find this approach more intuitive, but it includes more noise than factor regression scores, which capture only the common variance on the factor. Further, the summative index is useful in the longitudinal setting, as demonstrated later in the paper.
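The summative alternative simply sums standardized indicator values per country, as in this hypothetical sketch:

```python
import math

# Hypothetical raw values for three countries on three indicators.
raw = {
    "A": [9.0, 7.5, 8.2],
    "B": [5.0, 5.5, 5.1],
    "C": [1.0, 2.0, 1.4],
}
items = list(zip(*raw.values()))   # transpose: one tuple per indicator

def zscores(xs):
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))
    return [(x - m) / sd for x in xs]

z = [zscores(item) for item in items]
# Summative index: sum of a country's z-scores across indicators.
summative = {c: sum(z[i][j] for i in range(len(items)))
             for j, c in enumerate(raw)}
ranked = sorted(summative, key=summative.get, reverse=True)
```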

Figure 4. Summative index country ranks.
Table 5. Summative index country ranks

Capacity factor   Governance factor   Capacity × Governance
United States     Norway              United States
China             Sweden              Germany
Japan             Finland             Japan
Germany           Denmark             Great Britain
Great Britain     Switzerland         Canada
India             New Zealand         France
France            Luxembourg          Australia
South Korea       Iceland             Switzerland
Russia            Canada              Netherlands
Italy             Netherlands         Sweden

Finally, we test the predictive validity of the two indices, using Capacity and Governance as predictors of national scientific impact measured through fractional FWCI. First, the results of a cross-sectional (averaged by country over 2012–2021) Bayesian mixed-model regression are shown in Table 6, which tests the effects of Capacity and Governance on country-level research impact measured via FWCI. When using FWCI, it is common practice to filter out countries that have a very low publication count but a high FWCI due to collaboration. We therefore removed countries with fewer than 50 total publications, reducing the sample from 174 to 160 countries.
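In our notation (a sketch of the model structure, not the exact brms specification, whose priors the paper does not report), the cross-sectional model has the form:

```latex
\mathrm{FWCI}_{i} = \beta_0
  + \beta_1\,\mathrm{Capacity}_{i}
  + \beta_2\,\mathrm{Governance}_{i}
  + u_{r(i)} + \varepsilon_{i},
\qquad
u_{r} \sim \mathcal{N}\!\left(0,\sigma^2_{\mathrm{region}}\right),\quad
\varepsilon_{i} \sim \mathcal{N}\!\left(0,\sigma^2\right),
```

where r(i) indexes the geographic region of country i; the Region and Residuals rows of Table 6 summarize the posterior for the corresponding standard deviations.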

Table 6. Cross-sectional multilevel Bayesian regression, data collapsed 2012–2021

FWCI         Estimate   Std. Error   [95% cred. interval]
Intercept    −0.0759    0.2055       [−0.4797, 0.3307]
Capacity     0.0032     0.0013       [0.0007, 0.0057]
Governance   0.0054     0.0012       [0.003, 0.0078]
Region       0.1758     0.057        [0.098, 0.3145]
Residuals    0.1743     0.0101       [0.1558, 0.1954]

Table 6 shows that both Capacity and Governance have positive estimates on fractional FWCI. Furthermore, the standard errors are less than half the estimates and the credibility intervals for both do not include zero, indicating (in frequentist terminology) that the estimate is “statistically significant.” The model also controls for nesting within 10 different geographic regions. In short, both indices (Capacity and Governance) appear to be significant predictors of national scientific impact.

Table 7 shows the regression analysis conducted on the full panel data. Given the absence of a clear method for obtaining factor regression scores across temporal intervals, a summative composite measure was employed instead of extracting regression scores year by year. The model includes both country- and region-specific random intercepts, with countries geographically nested within regions, and includes Year as a continuous variable. The outcome is similar to the cross-sectional results: Table 7 displays positive estimates for Capacity and Governance, with credibility intervals that exclude zero. The region, country-within-region, and residual variance components also exhibit positive estimates whose credibility intervals exclude zero.

Table 7. Longitudinal multilevel Bayesian regression, 2012–2021

FWCI             Estimate   Std. Error   [95% cred. interval]
Intercept        −6.0532    2.7623       [−11.4272, −0.6195]
Capacity         0.0036     0.0013       [0.0011, 0.0061]
Governance       0.0161     0.0039       [0.0084, 0.0237]
Year             0.0033     0.0014       [0.0006, 0.0060]
Region           0.1614     0.0519       [0.0905, 0.2879]
Region:Country   0.1796     0.0114       [0.1583, 0.2032]
Residuals        0.1476     0.0028       [0.1423, 0.1532]

The present study introduces a composite index of a country's capacity to produce and conduct research based on a set of credible indicators. The convergent and divergent validity of the capacity index is tested in relation to a separate index representing governance. The two are then used to explore country rankings, which diverge between capacity and governance context: China, for example, moves from first on core capacity to last on governance among the countries analyzed. Both indices are then tested together to establish predictive validity, estimating their relative effects on national scientific impact; both show significant associations.

The dramatic difference between capacity and governance in some nations may indicate a discrepancy between relatively recent gains in capacity and the long-term sustainability of scientific systems. Autocratic nations may not be fully utilizing the emergent dynamic of self-organization within their workforces, often recognized as a pillar of scientific development (Whetsell et al., 2021). Such countries may gain short-term boosts in raw capacity from top-down programs, yet it remains to be seen whether authentic scientific performance can be sustained in the long term. As De Solla Price (1963) observed in reference to the USSR, rapid growth can be achieved in developing countries because the existing network of scientific activity has already been established. It remains an open question whether the gap between capacity and governance can be sustained long term, especially considering that both, not just capacity, are empirical predictors of scientific performance.

The results give policymakers and analysts the ability to compare nations against one another and to consider asymmetries between countries. Moreover, this approach may aid actions within the sphere of "science diplomacy," such as establishing scientific agreements or proposing ties. Policymakers sometimes lack clear insight into the underlying capacities of counterpart nations as they seek partners for scientific activities, particularly with regard to the least developed nations, and this index may help fill that gap.

The index may also be useful for countries wishing to promote their scientific investments and achievements. National capacity to conduct research and development can attract talent who wish to cooperate or collaborate, invest, or study in another country. Nations with higher research capacity attract students and researchers to their universities and research institutions. Governments approve investment into R&D to build research capacity to reach multiple goals, which may include systemic resilience, long-term viability, and national standing and prestige in science. Understanding the role of core capacity and wider context may aid policymakers as they consider ways to improve the development of “useful knowledge.”

The capacity index could certainly be improved in the future. The focus here is on the underlying latent factors that manifest as relationships between indicators and less about the specific indicators that go into an index. In social science, underlying causal mechanisms generate innumerable empirical indicators, and we must not lose sight of the forest for the trees. In this case, the forest represents the more-or-less stable relationship between indicators, while the indicators themselves represent the individual trees.

Additional research could seek to validate the index and assess the scope for trimming the indicator set or expanding it with candidates such as tertiary education levels, tax incentives, and infrastructure. Future research could also test the predictive validity of the index against other extant indices with similar data coverage. The index could become more useful over time as more data points are added. In addition, further research using the index in inferential models of a wide variety of outcomes, such as strategic behavior in the international system, may provide insights into the effects of the capacity of individual nations on their network of relationships.

We hope that the index acts as a useful tool for assessing current science capacity and encouraging international collaboration. But more importantly, we hope our approach may serve as a practical starting point for other scholars seeking to construct their own indices. We expect to use it for research to understand the influence of geopolitical factors on national growth and international collaboration. Furthermore, we expect to use the index to serve as a test for the role of public investment in the growth of capacity over time.

Special thanks to Edwin Horlings for comments on the analytical and conceptual approach. Thanks to Ian Helfrich and Lisa Fagan for comments on the statistical analysis. Thanks to Jeroen Baas at Elsevier for providing critical data. Thanks also to Carol Robbins, Sylvia Schwaab-Serger, John Jankowski, and the late Loet Leydesdorff for consultations and comments; and to Peter Zhang and Ken Poland for help and comments on data collection. An earlier monograph by Wagner, Brahmakulam et al. (2001) presents a similar approach to indexing national research capacity, so we acknowledge the previous RAND publication and the RAND Corporation. We are thankful to attendees at the conference session during our presentation at the Atlanta Conference on Science and Innovation Policy, 2023, for comments.

Caroline S. Wagner: Conceptualization; Data curation; Formal analysis; Writing – original draft; Writing – review & editing. Travis A. Whetsell: Conceptualization; Data curation; Formal analysis; Writing – original draft; Writing – review & editing.

The authors have no competing interests.

The authors received no funding to undertake this project.

All relevant code and data are available at Github: https://github.com/tawhetsell/National-Scientific-Capacity-Index.

1. In the 2000s, the World Bank created the Knowledge Economy Index, which grew from a report that described a Knowledge Assessment Methodology (Chen & Dahlman, 2004); the World Bank no longer publishes the KEI index.

2. Wagner and Jonkers (2017) compared nations on international engagement and found a positive relationship between open exchange and the impact and quality of science, supporting earlier work by Barro (1996).

3. Scopus abstracts entries in scholarly journals selected using quality criteria. The database has broader representation than Web of Science, including more non-English-language journals. The top journals in most countries are represented in Scopus, regardless of language. For example, Scopus claims over 90% coverage of serial publications from Japan, close to 90% from South Korea and Taiwan, and over 70% for China.

Adams, J. D. (1990). Fundamental stocks of knowledge and productivity growth. Journal of Political Economy, 98(4), 673–702.

Ahmadpoor, M., & Jones, B. F. (2017). The dual frontier: Patented inventions and prior scientific advance. Science, 357(6351), 583–587.

Arel-Bundock, V. (2022). WDI: World Development Indicators and other World Bank data. R package version 2.7.8. https://CRAN.R-project.org/package=WDI

Bäck, H., & Hadenius, A. (2008). Democracy and state capacity: Exploring a J-shaped relationship. Governance, 21(1), 1–24.

Barro, R. J. (1991). Economic growth in a cross-section of nations. Quarterly Journal of Economics, 106(2), 407–443.

Barro, R. J. (1996). Democracy and growth. Journal of Economic Growth, 1, 1–27.

Barro, R. J., & Lee, J. W. (1994). Sources of economic growth. In Carnegie-Rochester conference series on public policy (Vol. 40, pp. 1–46). North-Holland.

Berggren, N., & Bjørnskov, C. (2022). Political institutions and academic freedom: Evidence from across the world. Public Choice, 190(1–2), 205–228.

Björk, J., & Magnusson, M. (2009). Where do good innovation ideas come from? Exploring the influence of network connectivity on innovation idea quality. Journal of Product Innovation Management, 26(6), 662–670.

Blind, K. (2012). The influence of regulations on innovation: A quantitative assessment for OECD countries. Research Policy, 41(2), 391–400.

Bürkner, P. C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1), 1–28.

Chen, D. H. C., & Dahlman, C. J. (2004). Knowledge and development: A cross-section approach (Vol. 3366). Washington, DC: World Bank Publications.

Cilliers, P. (2005). Knowing complex systems. In K. Richardson (Ed.), Managing organisational complexity: Philosophy, theory, application (pp. 7–19). Greenwich, CT: Information Age Publishers.

Cimini, G., Gabrielli, A., & Sylos Labini, F. (2014). The scientific competitiveness of nations. PLOS ONE, 9(12), e113470.

Cole, S., & Phelan, T. J. (1999). The scientific productivity of nations. Minerva, 37, 1–23.

Coppedge, M., Gerring, J., Altman, D., Bernhard, M., Fish, S., … Teorell, J. (2011). Conceptualizing and measuring democracy: A new approach. Perspectives on Politics, 9(2), 247–267.

Coppedge, M., Gerring, J., Knutsen, C. H., Lindberg, S. I., Teorell, J., … Ziblatt, D. (2023). V-Dem [Country-Year/Country-Date] Dataset v13. Varieties of Democracy (V-Dem) Project.

Crosby, M. (2000). Patents, innovation and growth. Economic Record, 76(234), 255–262.

Cudeck, R. (2000). Exploratory factor analysis. In H. E. Tinsley & S. D. Brown (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 265–296). Cambridge, MA: Academic Press.

De Solla Price, D. J. (1963). Little science, big science. New York, NY: Columbia University Press.

Fedderke, J. W. (2005). Technology, human capital, and growth (No. 27). Economic Research Southern Africa. https://www.econrsa.org, accessed March 2023.

Frame, J. D. (2005). Measuring scientific activity in lesser developed countries. Scientometrics, 2(2), 133–145.

Furman, J. L., Porter, M. E., & Stern, S. (2002). The determinants of national innovative capacity. Research Policy, 31(6), 899–933.

Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and other stories. Cambridge, UK: Cambridge University Press.

Gittleman, M., & Wolff, E. N. (1995). R&D activity and cross-country growth comparisons. Cambridge Journal of Economics, 19(1), 189–207.

Goel, R. K., & Ram, R. (1994). Research and development expenditures and economic growth: A cross-country study. Economic Development and Cultural Change, 42(2), 403–411.

Greenhalgh, C., & Rogers, M. (2010). Innovation, intellectual property, and economic growth. Princeton, NJ: Princeton University Press.

Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6(4), 430–450.

Gulmez, A., & Yardımcıoglu, F. (2012). The relationship between R&D expenditures and economic growth in OECD countries: Panel cointegration and panel causality analyses (1990–2010). Maliye Dergisi, 163, 335–353.

Gumus, E., & Celikay, F. (2015). R&D expenditure and economic growth: New empirical evidence. Margin: The Journal of Applied Economic Research, 9(3), 205–217.

Haggard, S., MacIntyre, A., & Tiede, L. (2008). The rule of law and economic development. Annual Review of Political Science, 11, 205–234.

Hahn, R. W., & Hird, J. A. (1991). The costs and benefits of regulation: Review and synthesis. Yale Journal on Regulation, 8(1), 233–278.

Hall, R. E., & Jones, C. I. (1999). Why do some countries produce so much more output per worker than others? Quarterly Journal of Economics, 114(1), 83–116.

Howitt, P. (2000). Endogenous growth and cross-country income differences. American Economic Review, 90(4), 829–846.

Huggins, R., & Izushi, H. (2008). Benchmarking the knowledge competitiveness of the globe's high-performing regions: A review of the World Knowledge Competitiveness Index. Competitiveness Review: An International Business Journal, 18(1/2), 70–86.

Kaiser, H. F., & Rice, J. (1974). Little jiffy, mark IV. Educational and Psychological Measurement, 34(1), 111–117.

King, D. A. (2004). The scientific impact of nations. Nature, 430(6997), 311–316.

Knack, S., & Keefer, P. (1995). Institutions and economic performance: Cross-country tests using alternative institutional measures. Economics & Politics, 7(3), 207–227.

Kogan, L., Papanikolaou, D., Seru, A., & Stoffman, N. (2017). Technological innovation, resource allocation, and growth. Quarterly Journal of Economics, 132(2), 665–712.

Lepori, B., Barré, R., & Filliatreau, G. (2008). New perspectives and challenges for the design and production of S&T indicators. Research Evaluation, 17(1), 33–44.

Li, C., Zhang, E., & Liu, J. (2020). Analysis of countries' science capability in dual science roles. IEEE Access, 8, 14545–14556.

Lopez-Vega, H., Tell, F., & Vanhaverbeke, W. (2016). Where and how to search? Search paths in open innovation. Research Policy, 45(1), 125–136.

Lundvall, B. Å. (2016). National innovation systems and globalization. In P. Sampath & R. Narula (Eds.), The learning economy and the economics of hope (pp. 351–367). Anthem Studies in Economics and Development.

Martin, B. R. (1996). The use of multiple indicators in the assessment of basic research. Scientometrics, 36(3), 343–362.

May, R. M. (1997). The scientific wealth of nations. Science, 275(5301), 793–796.

Maerz, S., Edgell, A., Hellmeier, S., & Ilchenko, N. (2020). vdemdata—An R package to load, explore and work with the most recent V-Dem (Varieties of Democracy) and V-Party datasets. Varieties of Democracy (V-Dem) Project. https://www.v-dem.net/en/ and https://github.com/vdeminstitute/vdemdata.

Miao, L., Murray, D., Jung, W.-S., Larivière, V., Sugimoto, C. R., & Ahn, Y.-Y. (2022). The latent structure of national scientific development. Nature Human Behaviour, 6(9), 1454–1464.

Mohammed Bin Rashid Al Maktoum Knowledge Foundation. (2022). Global knowledge index. Dubai: Al Ghurair Printing and Publishing.

Nardo, M., & Saisana, M. (2009). OECD guide to composite indicators. Retrieved from https://www.oecd.org/els/soc/handbookonconstructingcompositeindicatorsmethodologyanduserguide.htm.

Narin, F. (1994). Patent bibliometrics. Scientometrics, 30(1), 147–155.

Nelson, R. R., & Phelps, E. (1966). Investment in humans, technological diffusion, and economic growth. American Economic Review, 61, 69–75.

OECD. (2015). Frascati manual 2015: Guidelines for collecting and reporting data on research and experimental development, the measurement of scientific, technological and innovation activities. Paris, France: OECD Publishing.

OECD. (2016). Economic and social benefits of internet openness. OECD Digital Economy Papers (No. 257). Paris: OECD Publishing.

OECD. (2021). Main science and technology indicators database. Retrieved from https://oe.cd/msti, March 2023.

Porter, M. E., & van der Linde, C. (1995). Toward a new conception of the environment competitiveness relationship. Journal of Economic Perspectives, 9(4), 97–118.

Purkayastha, A., Palmaro, E., Falk-Krzesinski, H. J., & Baas, J. (2019). Comparison of two article-level, field-independent citation metrics: Field-weighted citation impact (FWCI) and relative citation ratio (RCR). Journal of Informetrics, 13(2), 635–642.

R Core Team. (2021). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Revelle, W. (2024). psych: Procedures for psychological, psychometric, and personality research. Northwestern University, Evanston, Illinois. R package version 2.4.6. https://CRAN.R-project.org/package=psych

Revelle, W., & Condon, D. M. (2019). Reliability from α to ω: A tutorial. Psychological Assessment, 31(12), 1395–1411.

Romer, P. M. (1989). Increasing returns and new developments in the theory of growth. National Bureau of Economic Research Working Paper. https://www.nber.org/papers/w3098, accessed March 2024.

Romer, P. M. (1990). Capital, labor, and productivity. Brookings Papers on Economic Activity. Microeconomics, 1990, 337–367.

Salter, A. J., & Martin, B. R. (2001). The economic benefits of publicly funded basic research: A critical review. Research Policy, 30(3), 509–532.

Schmoch, U. (2004). The utility of patent indicators for evaluation. Plattform fteval - Forschungs- und Technologieevaluierung, 22, 2–10.

Schofer, E., Ramirez, F. O., & Meyer, J. W. (2000). The effects of science on national economic development, 1970 to 1990. American Sociological Review, 65(6), 866–887.

Sivertsen, G., Rousseau, R., & Zhang, L. (2019). Measuring scientific contributions with modified fractional counting. Journal of Informetrics, 13(2), 679–694.

Tong, X., & Frame, J. D. (1994). Measuring national technological performance with patent claims data. Research Policy, 23(2), 133–141.

van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67.

Van Leeuwen, T. N., Moed, H. F., Tijssen, R. J., Visser, M. S., & Van Raan, A. F. (2001). Language biases in the coverage of the Science Citation Index and its consequences for international comparisons of national research performance. Scientometrics, 51, 335–346.

Wagner, C. S., Brahmakulam, I., Jackson, B., Wong, A., & Yoda, T. (2001). Science & technology collaboration: Building capacity in developing countries? Santa Monica, CA: RAND Corporation. https://www.rand.org/pubs/monograph_reports/MR1357z0.html (accessed March 2023).

Wagner, C. S., & Jonkers, K. (2017). Open countries have strong science. Nature, 550(7674), 32–33.

Waltman, L., & van Eck, N. J. (2015). Field-normalized citation impact indicators and the choice of an appropriate counting method. Journal of Informetrics, 9(4), 872–894.

Wang, Q. J., Feng, G. F., Wang, H. J., & Chang, C. P. (2021). The impacts of democracy on innovation: Revisited evidence. Technovation, 108, 102333.

Whetsell, T. A., Dimand, A. M., Jonkers, K., Baas, J., & Wagner, C. S. (2021). Democracy, complexity, and science: Exploring structural sources of national scientific performance. Science and Public Policy, 48(5), 697–711.

World Bank Databank. (2020–2022). Retrieved from https://data.worldbank.org/.

World Intellectual Property Organization Patentscope. (2020–2022). Retrieved from https://patentscope.wipo.int/search/en/search.jsf (accessed March 2023).

Author notes

Handling Editor: Vincent Larivière

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.

Supplementary data