Abstract
We test the feasibility of incorporating broad social, political, and governance indicators with standard metrics as a way to enrich assessment of national research capacity. We factor analyze two sets of variables for 174 countries from 2012 to 2021, one being traditional measures associated with national science and technology capacity, such as spending, and a second being broader social, political, and governance measures, such as academic freedom. As expected, two factors emerge, one for raw or “core” research capacity and the other indicating the wider governance context. Further analysis shows convergent validity within the two factors and divergent validity between them. The analysis also quantifies the contribution of each indicator to each factor. Nations rank differently for each factor and also when combined. Ranks vary as a function of the chosen aggregation method. As a test of the predictive validity of the capacity index, we find both factors to be associated with country-level field-weighted citation indices. Policymakers and analysts may find useful feedback from this approach to quantifying national research strength.
1. INTRODUCTION
Numerous indicators exist to measure inputs, processes, outputs, and outcomes of scientific capacity, which are often used to assess the relative strength of nations. A common approach is to combine numerous indicators into aggregate indices. However, indices (e.g., the Global Competitiveness Index) have several drawbacks for policymakers. They often incorporate a great deal of collaborative international data and general economic data that obscure the scientific performance and productivity of a single nation. Moreover, bibliometric databases tend to have biases but remain the only viable way to conduct international comparisons of national scientific performance (Van Leeuwen, Moed et al., 2001). We offer an alternative approach to existing indices, one that combines data sources in a principled way to explore the prospects for developing a national index of scientific capacity as input to the policy process. The resulting index fills a need for insight into capacity to better empower assessment grounded in national contexts. We identify and index two sets of indicators as factors broadly understood as “raw capacity” and “governance.” To further test the validity of these factors, we use them as predictors of national scientific impact, measured by fractional field-weighted citation impact.
The approach follows Lundvall’s theory (2016) of national systems of innovation (NSI) where he characterizes the dynamic interaction of factors of innovation as having “core” and “wider context” elements (see box 9.8 in Lundvall’s work). We define Lundvall’s core as “raw capacity” within the research system with widely accepted indicators, and the “wider context” to include elements of national “governance,” which have been used less frequently. Our approach to index creation provides insight into the relative contributions of each indicator to distinct factors suggesting a more systematic understanding of the index creation process. Practitioners may find this approach to index generation useful when comparing national science and technology policies and their impacts, while scholars may find it useful to analyze elements of the global research system.
Using existing indicators of inputs, activities, outputs, and context, we seek to represent national capacity to conduct research. Capacity is understood as access to and ability to use resources, as opposed to simply an input or an output. We undertake this work for three reasons:
existing indices target economic strengths, not research capacity, making them less useful for national science policy purposes;
data from more countries have become more widely available, providing an opportunity to make a research index more useful; and
methods are needed to better compare nations on their capacity to conduct research.
Moreover, composite indices help policy analysis by making units comparable (Nardo & Saisana, 2009), and it would be helpful to compare national capacities. The broad acceptance of indices is attributed to their ability to “effectively encapsulate intricate and occasionally evasive matters across various domains” (Nardo & Saisana, 2009, p. 2), which is also our goal.
Following this introduction, the paper is divided into four parts. The literature review justifies the choice of indicators for this paper. Then, the methodology for constructing the index, the data chosen and applied, and the statistical analyses utilized are described. Following the discussion of the methodology and data, we present the results of statistical analyses. In the conclusion, the results are discussed focusing on the index’s applicability to future research, and the next steps for using the index to compare nations.
2. LITERATURE REVIEW
First, we compare existing innovation indices and explain why our proposed index fills a gap. Then, we draw from literature to present a conceptual framework that categorizes our variables into three broad categories:
raw scores of research capacity, Lundvall’s “core capacity”;
broad measures of social and political context, Lundvall’s “broader context”; and
outcomes of science, specifically, citation impact.
In contrast to a standard literature review approach, which presents testable hypotheses, our approach is to mine the literature for candidate indicators of national research capacity (see Supplementary material).
2.1. Existing Innovation Indices
Indices are a construct of broad interest. Indices of national innovation, competitiveness, and knowledge are compiled each year by different analytic organizations (e.g., World Bank) to provide input to business and public policy by ranking countries on baskets of variables. These reports often focus on economic growth, business, and commerce; they do not have the goal of decoupling the national from the international data nor do they disaggregate the role of public research. From a policy perspective, economic and competitiveness indices such as these have limited value as feedback to R&D policy support. Of the economically oriented indices, the most prominent are the Global Competitiveness Index (GCI), the Global Innovation Index (GII), and the Global Knowledge Index (GKI), each of which takes a slightly different approach to measuring innovation, but all of which include business data. The method proposed here does not include business, trade, financial, or market data as these features do not reflect science. Moreover, we do not include global data because we desire to measure national capacity. Our index does not draw upon these reports, but we summarize them here to contrast our approach with theirs.
The World Economic Forum publishes the Global Competitiveness Index in its Global Competitiveness Report (GCR); this work was most recently issued in 2019. The GCI draws upon 110 variables and a survey of businesspeople, covering 141 countries. GCI variables are grouped into “pillars” with nested indices covering human capital; market conditions; policy environment and enabling conditions; technology and innovation; and physical environment. The outcomes provide ranks of countries by their competitiveness.
The Global Innovation Index (GII) is copublished annually by Cornell University, INSEAD Business School, and the World Intellectual Property Organization (WIPO). The GII is a multilevel index with 81 variables, nested into subindices around seven pillars: institutions; human knowledge and research; infrastructure; market sophistication; business sophistication; knowledge and technology outputs; and creative outputs. GII covers the economies of 132 countries in the 2022 report.
The United Nations endorses the publication of the Global Knowledge Index (GKI) compiled by the Mohammed Bin Rashid Al Maktoum Knowledge Foundation (2022). GKI is a multilevel index combining 199 variables, nested in subindices, covering seven areas: preuniversity education; higher education; research, development, and innovation; information and communications technology; economy; enabling environment, consisting of governance, socioeconomic, and health and environment. GKI covers 138 countries.
These reports mix national data (such as exports) with data that have a high degree of globalization (such as international patents), which does little to help science policymakers assess national policy impacts on research strengths. We seek to generate a measure of capacity that is disentangled from business and international data, which both obscure the national public research contribution.
Li, Zhang, and Liu (2020) developed a citation-based scientific capability identification (CISCI) tool to examine national capability in what they termed “dual science roles” by examining both cited and citing behavior in a networked structure. They find, for 158 countries, different contributing roles for different fields of science; they further show that these roles and rankings change over time. The United States, Canada, and the United Kingdom were consistently highly ranked by their tool, with rapid improvements for China. This admirable approach provides useful insights into national capacities, but we have the further goal of testing the wider context of governance, which is not covered by Li et al. (2020).
2.2. Measures of Research and Development Capacity
This section reviews the literature on the R&D indicators that are used in our index. May (1997) compared the scientific wealth of nations for high-performing countries by calculating the number of scholarly articles, along with citations to these articles, on a nation-by-nation basis. May found that 15 nations accounted for 81% of all scholarly articles, a group that has expanded since his article. King (2004) conducted an analysis like that of May (1997), examining the national distribution of the top 1% of most highly cited papers and finding the United States to be dominant at that time, a finding that has also changed.
Previous studies of research include Barro and Lee (1994), Martin (1996), and Frame (2005); and Cole and Phelan (1999) identified and assessed the usefulness of indicators of knowledge-creating capacity, many of which are codified in the Frascati Manual (OECD, 2015). A consensus emerged around spending on research and development, and the number of trained individuals, as two core indicators of R&D capacity (Romer, 1989). Other studies have added regulatory quality and political stability (Furman, Porter, & Stern, 2002; Lundvall, 2016). Still others measured knowledge production (number of published articles) (May, 1997) and the number of domestic patent registrations (Lepori et al., 2008; Narin, 1994; Schmoch, 2004; Tong & Frame, 1994). Analysts often include the number of research-conducting academic and nonacademic research institutions in discussions of national capacity (Knack & Keefer, 1995). International cooperation, fractionally counted to attribute numbers to participating countries, is also commonly used as an indicator of global engagement and openness (OECD, 2015; Wagner & Jonkers, 2017).
Research and development expenditure is a major contributor to the knowledge economy, and it is always included in any measure of research capacity. Gross expenditure on research and development (GERD) is universally recognized in the literature as a key indicator of knowledge-creating capacity, and there is a strong correlation between R&D spending and economic growth (Howitt, 2000; Salter & Martin, 2001). Gulmez and Yardımcıoglu (2012) discovered positive effects of R&D spending on the income of 21 OECD member nations between 1990 and 2010, demonstrating that a 1% increase in R&D spending led to a 0.77% increase in economic growth. Other authors have studied the relationship between scientific investment and national development and discovered a significant correlation between R&D investment and both short-term and long-term growth in developing and developed nations (Gittleman & Wolff, 1995; Goel & Ram, 1994; Gumus & Celikay, 2015). Adams (1990), May (1997), and King (2004) demonstrate significant correlations between R&D expenditures and economic expansion.
Education and human resources are included in any assessment of national research capacity. In earlier attempts to measure national scientific activity, Frame (2005) emphasized the importance of education, while Barro (1991), citing work by Nelson and Phelps (1966), showed that nations with highly trained human capital were better able to absorb new products or ideas. Fedderke (2005) supports Romer (1990), noting that quality, rather than quantity, of human capital contributes to total productivity at the national level. Capacity for training, education, and knowledge transfer is also essential: Schofer, Ramirez, and Meyer (2000) showed that the size of scientific labor sources and training systems had a positive effect on national economic growth. Cole and Phelan (1999) identified a positive relationship between the number of research scientists and economic growth, and Barro (1991) showed that GDP growth is positively related to the availability of trained human capital.
Scholarly productivity is often used as an indicator of research capacity. The number of scholarly articles published is widely used as an indicator of the strength of a national research sector (Martin, 1996; OECD, 2015). Recent research by Miao, Murray et al. (2022) demonstrates a correlation between a nation’s scientific output and its economic growth and complexity (see also Ahmadpoor & Jones, 2017; Cilliers, 2005). Past productivity forms a basis for expectations of productive capacity in the future. This finding supports the findings of Cimini, Gabrielli, and Sylos Labini (2014), who discovered that OECD member states had more diverse research systems when measured in articles than developing countries (see also OECD, 2021).
Patent counts are used as indicators for technological development or entrenched knowledge (World Intellectual Property Organization Patentscope, 2020–2022). Crosby (2000) identified a positive correlation between the number of patents and economic growth, while Kogan, Papanikolaou et al. (2017) support Furman et al. (2002), demonstrating that the scientific content of the patent is positively correlated with the patent’s value to the economy. Patent offices often differentiate between national and international patents. Our index uses only national/residential patents.
2.3. Research on Social and Political Context
The importance of good governance and political stability for knowledge-based economic growth is well documented (Bäck & Hadenius, 2008). Rule of law and freedom from corruption are correlated with higher economic growth and expansion of a knowledge economy (Haggard, MacIntyre, & Tiede, 2008). Barro (1996) and Cole and Phelan (1999) showed that growth is correlated with the maintenance of the rule of law, free markets, small government consumption, and high human capital. Other studies have shown a positive relationship between political stability, technological change, and growth (Barro, 1991; Barro & Lee, 1994; Hall & Jones, 1999) and between democracy and growth (Barro, 1996). Berggren and Bjørnskov (2022) showed correlations between academic freedom and innovation. Whetsell et al. (2021) showed the relevance of democratic governance in predicting national performance in science, and Wang, Feng et al. (2021) show similar effects on technology. Specifically, Whetsell et al. (2021) showed that the level of polyarchy, measured through Varieties of Democracy Project data (Coppedge, Gerring et al., 2011, 2023), is a significant correlate of field-weighted citation impact at the national level.
There is mixed evidence regarding the role of regulations, standards, and enforcement in promoting research and development and science capacity. Some economists showed that regulatory burdens can hinder innovation, competitiveness, and national trade positions (Hahn & Hird, 1991). In contrast, Porter and van der Linde (1995), discussed in Blind (2012), suggest that, while ambitious environmental regulations may be costly for national industry at the outset, regulations may help to improve international competitiveness and increase exports of environmental technologies over the longer term. Our view aligns with that of Blind (2012)—that regulations and standards aid research and innovation—and we include the Coppedge et al. (2011) data on regulatory quality in our index.
The enforcement of intellectual property rights (IPR) has also garnered considerable research interest. According to Blind (2012, p. 393), innovation is supported by “...institutional regulations that ensure adequate enforcement of intellectual property rights.” Blind (2012) cites work demonstrating that IPR regulations have an advantageous effect on the R&D intensity of the former G7 nations. Greenhalgh and Rogers (2010) demonstrated that IPR enforcement serves as an indicator of the quality of research. The rule of law facilitates the invention and innovation processes in both the public and private research sectors.
The ability of researchers to access new ideas, or diffusion, is critical to research capacity and success, a point supported across many parts of the literature. Björk and Magnusson (2009) explore the role of interactions among researchers as tied to innovation, and Lopez-Vega, Tell, and Vanhaverbeke (2016) ask where and how to search, focusing on the internet. The role of the internet in improving and enhancing search has received attention in high-level policy documents such as OECD’s report, “Economic and Social Benefits of Internet Openness” (OECD, 2016). (We use the open internet measure drawn from the Varieties of Democracy data.)
3. VARIABLES, DATA, AND METHODS
As a practical starting point for constructing composite indices, our guide was the OECD handbook on constructing composite indicators (Nardo & Saisana, 2009). The handbook suggests the following actions: (a) establish a theoretical framework; (b) choose variables; (c) impute missing data; (d) conduct multivariate analysis; (e) normalize data; (f) weight and aggregate data; and (g) present findings. We generally follow the Nardo and Saisana (2009) framework with the added test of predictive validity after step (f). This section will discuss variable selection, data sources, and analysis techniques, as the theory development was covered in the previous section. All analysis was conducted in the R programming language (R Core Team, 2021).
3.1. Choice of Indicator Variables and Data Sources
Table 1 describes all the variables used in the analysis, which were identified from the literature and drawn from existing databases. The variables considered for the proposed national research capacity index are: research and development spending (RD) as a raw number (not GDP normalized); number of resident patent applications (ResPatent); number of academic institutions affiliated with publications (AcadInst); number of nonacademic research institutions affiliated with publications (NonAcadInst); number of unique authors listed on publications (Authors); number of publications fractionally counted by nation (Pubs); number of papers that are international collaborations, fractionally counted by nation (IntlPubs); open internet access (OpenInternet); rule of law (RuleLaw); regulatory quality (RegQuality); political stability (PolitStability); noncorruption (NonCorrupt); electoral democracy (Polyarchy); and academic freedom (AcadFreedom). Elsevier’s field-weighted citation impact (FWCI), fractionally counted, is used as a dependent variable in the regression models.
Table 1. Variable descriptions and summary statistics
Variable name | Description | Data source |
---|---|---|
RD | Gross research and development spending: raw number | World Bank Indicators |
ResPatent | Number of resident patent applications | World Bank Indicators |
AcadInst | Number of academic institutions: paper affiliation | Scopus/Elsevier |
NonAcadInst | Number of nonacademic institutions: paper affiliation | Scopus/Elsevier |
Authors | Number of unique authors: paper affiliation | Scopus/Elsevier |
Pubs | Number of publications: fractional count | Scopus/Elsevier |
IntlPubs | Number of international co-pub papers: fractional count | Scopus/Elsevier |
OpenInternet | Country approach to regulating/controlling Internet | Varieties of Democracy |
RuleLaw | Rule of law: crime, judicial & contract effectiveness | World Bank Indicators |
RegQuality | Regulatory quality: burden of regulation on markets | Varieties of Democracy |
PolitStability | Political stability: probability of gov. destabilization | World Bank Indicators |
NonCorrupt | Control of corruption: use of public power for private gain | World Bank Indicators |
Polyarchy | Electoral Democracy Index | Varieties of Democracy |
AcadFreedom | Academic Freedom Index | Varieties of Democracy |
FWCI | Fractional Field Weighted Citation Index | Scopus/Elsevier |
Data were gathered from the World Bank Indicators (World Bank Databank, 2020–2022), available through the R package WDI (Arel-Bundock, 2022); the Varieties of Democracy Project (Coppedge et al., 2023), available through the R package vdemdata (Maerz, Edgell et al., 2020); and Scopus, an Elsevier database (data were obtained through email communication with researchers at Elsevier).
3.2. Missing Data
Finding a balance between data coverage over the number of countries and comprehensiveness over important aspects of science is an inherent difficulty in the construction of an index. No index can be expected to include every variable on every country related to a subject. A requirement for more detailed data will necessarily lead to fewer nations, regions, or groups being included in the analysis. For example, developing countries typically collect and provide less data than developed countries, and the available statistical data may be less reliable. Because many existing indices cover smaller samples of countries, we sought to construct our index to include as many nations as possible. However, numerous variables of interest had low data coverage. This presents an interesting problem for research. High-quality measures are generally available for countries whose status is already well known and whose research systems are well developed, while data is lacking on those that are likely to experience the greatest change over time and whose status is of particular interest. For these reasons, missing data remains a persistent issue for studies of national scientific capacity.
Our missing data strategy is as follows. First, we focus on the most recent 10 years of available data, which resulted in a period from 2012 to 2021. Data from Scopus/Elsevier are comprehensive across almost all countries, so these data formed the base sample for the subsequent merging of data. The Varieties of Democracy (Coppedge et al., 2023) data also covered many countries. The World Bank Indicators had the lowest data coverage, specifically on RD and ResPatent. We did not include, for example, tertiary enrollment because of low data coverage. Among the sample, if there was only partial missing data (available in some years but not others), we imputed the mean from the available data for the country/year observations. Next, Research and Development Expenditure (RD) and Resident Patents (ResPatent) do not contain zero values. However, it is not clear whether these are simply missing data. Multivariate imputation by chained equations (MICE), implemented in the R package mice (Van Buuren & Groothuis-Oudshoorn, 2011), was used to impute values. Imputation for both was based on the other capacity variables, AcadInst through IntlPubs, using the predictive mean matching option to produce five imputed data sets, of which the pooled mean was taken as the imputed value. Alternatively, the analyst may choose to omit missing data here or drop these two variables from the analysis altogether. These choices resulted in a sample of 174 countries.
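The within-country mean imputation described above can be sketched as follows. The authors worked in R; this Python version is only an illustration, and the function name `impute_country_means` and the toy panel are hypothetical. It covers only the partial-missingness step, not the MICE step.

```python
from statistics import mean


def impute_country_means(panel):
    """Fill missing yearly values (None) with the country's own observed mean.

    panel maps country -> {year: value or None}. Countries with no observed
    value at all are returned unchanged, because a within-country mean cannot
    be computed for them (the paper handles such cases with MICE instead).
    """
    filled = {}
    for country, series in panel.items():
        observed = [v for v in series.values() if v is not None]
        if not observed:
            filled[country] = dict(series)
            continue
        country_mean = mean(observed)
        filled[country] = {
            year: (country_mean if value is None else value)
            for year, value in series.items()
        }
    return filled


# Hypothetical toy panel: country "A" is partially observed, "B" is never observed.
toy_panel = {
    "A": {2012: 10.0, 2013: None, 2014: 14.0},
    "B": {2012: None, 2013: None, 2014: None},
}
filled_panel = impute_country_means(toy_panel)
```

Country "A" receives its own mean for the missing year, while country "B" stays missing and would need a model-based method such as the MICE procedure the authors describe.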
3.3. Methods of Analysis
We apply exploratory factor analysis (EFA) to identify the underlying factor structure among all the variables. EFA is a statistical method used to identify unobserved “latent factors” that manifest in numerous observable indicators (Cudeck, 2000). It is commonly used to justify the reduction of numerous variables into aggregate indices. EFA computes the pairwise correlation matrix of a set of variables, then computes the eigenvalues and eigenvectors of the matrix, which are used to identify the amount of variance in the indicator variables explained by each factor (eigenvalues) and the direction of the relationship between the variables and the underlying factor (eigenvectors). In the present context, we use EFA to identify whether the indicators listed in Table 1 (excluding FWCI) represent a coherent underlying latent factor, called “national research capacity.” Practically, candidate variables are found in a variety of sources and formats, and they are often gathered for reasons unrelated to research capacity. As such, their relationship with one another becomes more important than what they represent individually. Because numerous variables measure essentially the same factor, EFA helps to economize the multiplicity of empirical indicators. Factor analysis allows us to make statements about the convergence or divergence of these empirical measurements as they relate to national research capacity. We use the R package psych (Revelle, 2024) to conduct EFA using the principal factor method and varimax rotation, retaining two factors as indicated by the eigenvalue and eigenvector matrices (Grice, 2001).
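To make the correlation-matrix and eigenvalue step concrete, the sketch below builds a Pearson correlation matrix for three made-up indicators and extracts its leading eigenvalue by power iteration. This is a simplification and not the authors' procedure: it decomposes the plain correlation matrix (unit diagonal) rather than applying the principal factor method with communalities, it performs no rotation, and all names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Pairwise correlation matrix for a list of indicator columns."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


def leading_eigenpair(matrix, iterations=500):
    """Approximate the largest eigenvalue and its unit eigenvector by power iteration."""
    n = len(matrix)
    vec = [1.0 / sqrt(n)] * n
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(p * p for p in prod))
        vec = [p / norm for p in prod]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    m_vec = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(v * m for v, m in zip(vec, m_vec)), vec


# Hypothetical indicator values for five countries (all positively related).
indicators = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 4, 5],
    [1, 3, 2, 5, 4],
]
corr = correlation_matrix(indicators)
leading_value, leading_vector = leading_eigenpair(corr)
# Share of total variance carried by the first component (trace of corr = 3).
variance_share = leading_value / len(indicators)
```

An eigenvalue well above 1 means that a single dimension accounts for more variance than any one standardized indicator, which is the kind of evidence used when deciding how many factors to retain.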
The Cronbach’s alpha test, which examines the internal consistency and relatedness of a set of variables, is used to assess scale reliability (Revelle & Condon, 2019). This test is conducted after EFA to provide additional evidence that the variables identified by EFA have scale reliability prior to aggregation. Higher scores indicate greater internal consistency. In general, a value greater than 0.7 indicates adequate scale reliability. To aggregate variables into an index for the cross-sectional regression model, we chose the factor regression score extraction method. We used a summative index for the panel regression because there appear to be no established methods to generate factor regression scores in panel data.
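As a minimal illustration of the reliability test, the sketch below computes Cronbach's alpha from the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed scale), where k is the number of items. The authors used R; the function name and toy scores here are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items, each a list with one score per country."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Summed scale score for each country (one total per row).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores on three indicators for five countries.
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 4, 5],
    [1, 3, 2, 5, 4],
]
alpha = cronbach_alpha(toy_items)
```

For these toy scores alpha is about 0.87, above the 0.7 threshold mentioned in the text; in practice the indicators would usually be placed on a common scale first, because alpha computed on raw scores is sensitive to differences in item variance.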
Additionally, we wish to demonstrate the predictive validity of the index by testing its relationship with other well-established variables. To achieve this, we employ fractional Field Weighted Citation Impact (FWCI) to measure the impact of national research. FWCI is the ratio of the total citations received by the unit (country) to the total citations expected based on the average of the subject field, document type, and year. At the country level, the index is aggregated across all research domains. The data are further fractionalized in cases of international collaboration to represent country-specific contributions. FWCI has gained acceptance in the scientometrics literature as a valid indicator of citation impact, and fractional counting is a growing standard for analysis (Purkayastha, Palmaro et al., 2019; Sivertsen, Rousseau, & Zhang, 2019; Waltman & van Eck, 2015).
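The ratio definition given above can be illustrated with a small worked example. The sketch assumes that each paper carries an actual citation count, an expected citation count for its field, document type, and year, and a fractional share per contributing country. The data layout, the function name, and the way shares are applied are illustrative assumptions, not Elsevier's implementation.

```python
def fractional_fwci(papers, country):
    """Ratio of fractionally counted actual citations to expected citations.

    Each paper is a dict with 'citations', 'expected', and 'shares', where
    'shares' maps country -> fraction of the paper credited to that country.
    """
    actual = sum(p["citations"] * p["shares"].get(country, 0.0) for p in papers)
    expected = sum(p["expected"] * p["shares"].get(country, 0.0) for p in papers)
    if expected == 0:
        raise ValueError(f"no expected citations attributed to {country!r}")
    return actual / expected


# Hypothetical papers: one single-country paper, one evenly shared collaboration.
toy_papers = [
    {"citations": 10, "expected": 5.0, "shares": {"A": 1.0}},
    {"citations": 4, "expected": 8.0, "shares": {"A": 0.5, "B": 0.5}},
]
fwci_a = fractional_fwci(toy_papers, "A")  # (10 + 2) / (5 + 4)
fwci_b = fractional_fwci(toy_papers, "B")  # 2 / 4
```

A value above 1 means the country's output is cited more than expected for comparable papers, and a value below 1 means it is cited less.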
To examine the influence of the national research capacity index on FWCI, we employ Bayesian multilevel regression with the R package brms (Bürkner, 2017). This method allows us to account for the distinctions between regions and countries in our data (see also Huggins & Izushi, 2008), revealing the relationship between research capacity and research impact across the globe. Bayesian methods have found rapid acceptance as an alternative to the frequentist approach of conventional regression techniques that employ significance testing based on the p-value. In place of a binary evaluation of statistical significance, Bayesian methods generate credible intervals for parameter estimates of interest, focusing the analyst on gradations of uncertainty (Gelman, Hill, & Vehtari, 2020).
4. RESULTS
This section summarizes the results of the statistical analysis. First, we present descriptive statistics. Second, we show the results of the exploratory factor analysis. Finally, we present the Bayesian regression models that predict research impact using the index.
Prior to presenting the findings, it is useful to provide some additional remarks on the empirical methodology employed. Longitudinal data were collected for all variables spanning the years 2012 to 2021. To our knowledge, there are no established methods for conducting exploratory factor analysis (EFA) on panel data. Consequently, we aggregated all the data based on the within-country mean for the time frame to conduct the EFA. Practitioners and analysts may choose different time frames, or collapse data on different metrics than the mean, or construct indices year-by-year as the data become available.
Table 2 displays the list of indicators along with descriptive statistics of the variables analyzed. First, n shows the number of countries for which data are included. Mean, sd, min, and max present descriptive statistics for each indicator. Mean gives the average value of each indicator across countries; several governance indicators have negative means. Standard deviation (sd) shows how widely the data points are dispersed around the mean. Min and max show the range of values for the variable. Prior to conducting the EFA, the natural logarithm (+1) was applied to variables exhibiting significant skewness. All variables pertaining to raw research capacity, ranging from research and development (RD) to international publications (IntlPubs), were transformed. This resulted in more normally distributed variables. The variables pertaining to governance exhibited less skewed distributions that did not require log transformation.
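The effect of the ln(x + 1) transformation on skewed count data can be seen in a short sketch. The counts below are invented for illustration, and the moment-based skewness helper is a hypothetical convenience function, not part of the authors' workflow.

```python
from math import log1p


def skewness(values):
    """Moment-based sample skewness: third central moment over variance^1.5."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((v - mean_value) ** 2 for v in values) / n
    m3 = sum((v - mean_value) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical publication counts spanning several orders of magnitude.
raw_counts = [0, 3, 12, 150, 2400, 98000]
# log1p(x) computes ln(1 + x), so a raw count of zero maps to zero.
log_counts = [log1p(v) for v in raw_counts]

skew_raw = skewness(raw_counts)
skew_log = skewness(log_counts)
```

Adding 1 before taking the logarithm keeps countries with a zero count in the sample, which is consistent with the minimum of 0 reported for ln_AcadInst and ln_NonAcadInst in Table 2.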
Table 2. Descriptive statistics of the entire data set, 2012–2021
Variable | n | mean | sd | min | max |
---|---|---|---|---|---|
ln_RD | 174 | 19.833 | 2.495 | 13.471 | 27.049 |
ln_ResPatent | 174 | 5.065 | 2.558 | 0.693 | 13.899 |
ln_AcadInst | 174 | 2.681 | 1.584 | 0 | 7.295 |
ln_NonAcadInst | 174 | 2.14 | 1.856 | 0 | 7.985 |
ln_Authors | 174 | 7.786 | 2.461 | 2.912 | 13.973 |
ln_Pubs | 174 | 6.746 | 2.752 | 1.48 | 13.163 |
ln_IntlPubs | 174 | 5.794 | 2.355 | 1.279 | 11.443 |
OpenInternet | 174 | 0.353 | 1.546 | -3.572 | 2.372 |
RuleLaw | 174 | 0.55 | 0.307 | 0.021 | 0.998 |
RegQual | 174 | -0.127 | 1.002 | -2.33 | 2.045 |
Stability | 174 | -0.207 | 0.947 | -2.747 | 1.48 |
NonCorrupt | 174 | -0.124 | 1.01 | -1.677 | 2.272 |
Polyarchy | 174 | 0.524 | 0.254 | 0.017 | 0.919 |
AcadFreedom | 174 | 0.633 | 0.294 | 0.011 | 0.971 |
FWCI | 174 | 0.781 | 0.255 | 0.267 | 1.606 |
A correlogram of the variables is shown in Figure 1. The pairwise correlations for each pair of variables are displayed in the figure’s upper section. Each variable’s distribution is represented on the diagonal. Each bivariate scatterplot is displayed in the figure’s lower section, along with fit lines that roughly depict the correlation’s slope. The figure displays two regions of higher correlations, where the governance measures and the raw capacity metrics each correlate more strongly among themselves. The figure also shows the correlation between these two sets of metrics and FWCI. It appears that the governance measures have a stronger correlation with FWCI than the raw capacity measures, which will be discussed further.
A preliminary test was performed to ensure that factor analysis would be a satisfactory tool to assess the relationship among the variables. The Kaiser-Meyer-Olkin measure of sampling adequacy (MSA) yielded a value of 0.88. A value close to 1 suggests that variables have a high level of common variance. Kaiser and Rice (1974) argued that a score in the 0.80s range is “meritorious.” In short, this measure quantifies the amount of shared variance among the items and indicates whether the items are suitable for factor analysis.
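For readers unfamiliar with the measure, the following is a minimal sketch of how an overall KMO value can be computed from a correlation matrix: it compares squared correlations with squared partial correlations, the latter obtained from the inverse of the correlation matrix. The 3 × 3 matrix is hypothetical, and the helper names `invert` and `kmo` are ours; this is not the routine used in the study.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix."""
    n = len(corr)
    inv = invert(corr)
    sum_r2 = 0.0  # squared off-diagonal correlations
    sum_p2 = 0.0  # squared off-diagonal partial correlations
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_p2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_p2)

# Hypothetical correlation matrix: three variables, all correlated at 0.5.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
overall_kmo = kmo(corr)
print(f"KMO = {overall_kmo:.3f}")
```

The measure approaches 1 when the partial correlations are small relative to the raw correlations, that is, when most of the covariation among items is shared rather than pairwise-specific.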
Following these results, eigenvalue decomposition was conducted, which revealed the presence of three factors with eigenvalues greater than 1 (8.03, 3.3, and 1.13, respectively). However, the scree plot in Figure 2 shows that only two factors sit above the inflection point of the curve, while the third factor’s eigenvalue is only marginally above 1. Further, the third factor appears to “load” unevenly across the variables (loadings are discussed below). The scree plot also displays the cumulative percentage of variance accounted for by each successive factor, indicating that two factors explain roughly 81% of the variance in the variables.
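The arithmetic behind the "roughly 81%" figure can be checked with a short sketch. It uses the three eigenvalues reported above and assumes that 14 standardized indicators entered the EFA (the indicators in Table 2 other than FWCI), so that the total variance equals 14; the variance shares of extracted and rotated factors may differ slightly from these raw eigenvalue shares.

```python
# Eigenvalues reported in the text for the first three factors.
eigenvalues = [8.03, 3.3, 1.13]

# Assumption: 14 standardized indicators entered the EFA, so the total
# variance to be explained equals 14.
n_variables = 14

# Kaiser criterion: factors with eigenvalue > 1.
kaiser_retained = [ev for ev in eigenvalues if ev > 1.0]

shares = [ev / n_variables for ev in eigenvalues]
cumulative = []
running = 0.0
for share in shares:
    running += share
    cumulative.append(running)

for i, (ev, s, c) in enumerate(zip(eigenvalues, shares, cumulative), start=1):
    print(f"Factor {i}: eigenvalue={ev:.2f}, share={s:.1%}, cumulative={c:.1%}")
```

Under this assumption the first two factors account for (8.03 + 3.3) / 14, or about 80.9% of the variance, while the third adds roughly 8 percentage points.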
Next, the standardized factor loadings from the EFA are presented in Table 3. The first two columns of the table show the associations between the variables and the underlying factors. The factor loadings reveal the degree to which each variable “loads” on the two factors identified by the EFA, with values ranging between –1 and 1. The variables ln_RD through ln_IntlPubs exhibit strong positive loadings on Factor One, all exceeding 0.7, and relatively low loadings on Factor Two. In contrast, OpenInternet through AcadFreedom show strong loadings on Factor Two, all exceeding 0.71, and relatively low loadings on Factor One.
Table 3. Standardized loadings, communality, uniqueness, and complexity
Indicator | Factor One | Factor Two | Communality | Uniqueness | Complexity |
---|---|---|---|---|---|
ln_RD | 0.774 | 0.245 | 0.66 | 0.34 | 1.199 |
ln_ResPatent | 0.706 | 0.025 | 0.499 | 0.501 | 1.002 |
ln_AcadInst | 0.933 | 0.044 | 0.873 | 0.127 | 1.005 |
ln_NonAcadInst | 0.881 | 0.351 | 0.899 | 0.101 | 1.309 |
ln_Authors | 0.965 | 0.194 | 0.969 | 0.031 | 1.081 |
ln_Pubs | 0.955 | 0.208 | 0.956 | 0.044 | 1.094 |
ln_IntlPubs | 0.951 | 0.241 | 0.962 | 0.038 | 1.127 |
OpenInternet | 0.051 | 0.775 | 0.603 | 0.397 | 1.009 |
RuleLaw | 0.272 | 0.923 | 0.925 | 0.075 | 1.173 |
RegQual | 0.441 | 0.786 | 0.812 | 0.188 | 1.572 |
Stability | 0.105 | 0.719 | 0.528 | 0.472 | 1.043 |
NonCorrupt | 0.382 | 0.768 | 0.735 | 0.265 | 1.466 |
Polyarchy | 0.165 | 0.904 | 0.845 | 0.155 | 1.066 |
AcadFreedom | 0.024 | 0.812 | 0.659 | 0.341 | 1.002 |
The general pattern of loadings aligns with Lundvall’s theoretical framework of “core” and “wider context.” Factor One appears to predict measures associated with core capacity, whereas Factor Two is associated with measures of governance. A third factor was explored but showed relatively inconsistent loadings across both sets of variables and an eigenvalue just above 1, as discussed above. The loading pattern provides support for convergent validity within the two separate factors and divergent validity between them. However, RegQual, NonCorrupt, and ln_NonAcadInst show the largest cross-loadings onto their respective opposing factors (0.441, 0.382, and 0.351), weakening the divergent validity. For this reason, analysts may choose to drop these items.
The Communality column in Table 3 shows the proportion of the variance in each variable that is explained by the overall factor model. For example, ln_AcadInst has a communality of 0.87, meaning 87% of its variance is accounted for by the factors extracted. The Uniqueness column shows the complement of the communality (1 minus the communality), that is, the proportion of the variance in the variable that is not explained by the overall model. The Complexity column indicates the degree to which a variable is explained by more than one factor: values near 1 indicate that a single factor dominates, and higher values indicate that more than one factor explains the variable. For example, ln_NonAcadInst has the highest complexity score among the capacity indicators (1.309), which is consistent with its cross-loading on Factor One and Factor Two; RegQual (1.572) and NonCorrupt (1.466) have the highest scores overall. An analyst might wish to remove such variables depending on the research question.
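The three diagnostic columns can be reproduced from the two loadings alone. The sketch below assumes orthogonal factors and uses Hofmann's complexity index, (Σa²)² / Σa⁴; the text does not name the complexity formula, but the Table 3 values are consistent with it up to rounding. The function name is ours.

```python
def loading_diagnostics(loadings):
    """Return (communality, uniqueness, complexity) for one variable.

    Assumes orthogonal factors, so communality is the sum of squared loadings.
    """
    squares = [a * a for a in loadings]
    communality = sum(squares)
    uniqueness = 1.0 - communality
    # Hofmann's complexity index: (sum a^2)^2 / sum a^4 (an assumption here).
    complexity = communality ** 2 / sum(s * s for s in squares)
    return communality, uniqueness, complexity

# Loadings copied from Table 3 as (Factor One, Factor Two).
table3 = {
    "ln_RD": (0.774, 0.245),
    "ln_NonAcadInst": (0.881, 0.351),
    "RegQual": (0.441, 0.786),
    "AcadFreedom": (0.024, 0.812),
}

for name, loadings in table3.items():
    h2, u2, com = loading_diagnostics(loadings)
    print(f"{name:15s} communality={h2:.3f} uniqueness={u2:.3f} complexity={com:.3f}")
```

A variable that loads on only one factor has complexity 1, and one that loads equally on two factors has complexity 2, which is why the cross-loading items stand out in this column.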
Next, we present the results of the Cronbach’s alpha test of scale reliability on the items loading separately on each factor. The test assesses the degree to which the items in a set are interrelated and are suitable for aggregation. Cronbach’s alpha ranges from 0 to 1, with higher values indicating higher average interitem reliability. Typically, a Cronbach’s alpha value of 0.7 or higher is considered acceptable for aggregation. The test results for the items loading on Factor One, representing raw capacity, including variables ln_RD through ln_IntlPubs, resulted in an overall alpha value of 0.96, and a standardized alpha value of 0.97. Similarly, items loading on Factor Two, representing governance, including variables OpenInternet through AcadFreedom, resulted in an overall alpha value of 0.85 and a standardized alpha value of 0.94. Tests for both sets of variables indicate moderate to high levels of internal reliability and suggest aggregation is appropriate.
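A minimal sketch of the two alpha statistics follows, using the textbook formulas: raw alpha from item and total-score variances, and standardized alpha from the mean inter-item correlation. The two-item data set is hypothetical and deliberately simple, and the function names are ours rather than those of any particular package.

```python
import math
from statistics import mean, variance

def cronbach_alpha(items):
    """Raw alpha. `items` is a list of k columns, each with n scores."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)

def standardized_alpha(items):
    """Alpha based on the mean inter-item correlation."""
    k = len(items)
    rs = [pearson(items[i], items[j])
          for i in range(k) for j in range(i + 1, k)]
    r_bar = mean(rs)
    return k * r_bar / (1 + (k - 1) * r_bar)

# Hypothetical items: perfectly correlated but measured on different scales.
items = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
]
raw_alpha = cronbach_alpha(items)
std_alpha = standardized_alpha(items)
print(f"raw alpha = {raw_alpha:.3f}, standardized alpha = {std_alpha:.3f}")
```

In this toy case the raw alpha falls below the standardized alpha only because the items are on different scales. A similar mechanism may contribute to the gap between the raw and standardized values for the governance items, whose scales differ in Table 2, although the text does not state this.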
Next, we shift toward demonstrating how one might utilize such indices. To visualize how countries rank on raw capacity, governance, and their combination, Figure 3 shows three plots that utilize the factor regression scores extracted from the factor analysis. Factor regression scores are estimated values assigned to each observation—each country, in this context—based on their shared variance captured by the factors. These provide an economical way of generating aggregate representations for indexed variables. The left plot shows countries ranked from highest to lowest on core research capacity. The middle plot shows countries ranked on governance. The right plot shows the ranking of the product (interaction) of the two indices. The scores were first standardized before plotting. Table 4 compares the top 10 country names in each plot represented in Figure 3.
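The ranking step can be sketched as follows. The text does not specify how the scores were standardized before multiplying, so this example assumes min-max rescaling to the interval [0, 1]; with z-scores, two below-average values would multiply to a positive product and could rank misleadingly high. The four countries and their scores are hypothetical.

```python
def min_max(values):
    """Rescale values to [0, 1] (an assumed standardization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def rank_desc(names, scores):
    """Return names ordered from highest to lowest score."""
    ordered = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ordered]

# Hypothetical factor scores for four fictional countries.
countries = ["A", "B", "C", "D"]
capacity = [2.0, 1.0, -1.0, -2.0]
governance = [-1.0, 2.0, 1.0, -2.0]

cap = min_max(capacity)
gov = min_max(governance)
product = [c * g for c, g in zip(cap, gov)]  # interaction of the two indices

capacity_rank = rank_desc(countries, cap)
governance_rank = rank_desc(countries, gov)
product_rank = rank_desc(countries, product)
print(capacity_rank, governance_rank, product_rank)
```

In this toy example, country A leads on capacity but drops behind B once governance is multiplied in, which is the kind of reordering visible between the columns of Table 4.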
Table 4. Factor score country ranks
Capacity factor | Governance factor | Capacity × Governance |
---|---|---|
China | Luxembourg | United States |
United States | Iceland | Germany |
Japan | Estonia | Great Britain |
Russia | Denmark | Japan |
India | Norway | France |
Germany | Finland | Australia |
Great Britain | New Zealand | Canada |
Brazil | Ireland | Switzerland |
Turkey | Sweden | Netherlands |
Iran | Switzerland | Spain |
To illustrate how different aggregation methods result in different country rankings, the rankings are reconstructed using a simpler summative index (represented in Figure 4 and Table 5). Some practitioners and scholars may find this approach more intuitive, but it also includes more noise in the index than factor regression scores, which include only the common variance on the factor. Further, the summative index is useful in the longitudinal data setting, as will be demonstrated later in the paper.
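One way to build such a summative index is sketched below: each indicator is standardized and the standardized values are summed per country with equal weights. Whether the study standardized indicators before summing is not stated in the text, so that step is an assumption, and the three countries and their values are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical indicator values for three fictional countries.
countries = ["A", "B", "C"]
indicators = {
    "ln_Pubs": [6.0, 8.0, 10.0],
    "ln_Authors": [7.0, 9.0, 11.0],
    "ln_RD": [18.0, 22.0, 20.0],
}

# Assumed step: standardize each indicator so no single scale dominates.
standardized = [zscores(column) for column in indicators.values()]

# Summative index: equal-weight sum across indicators for each country.
summative = [sum(values) for values in zip(*standardized)]

ranking = [name for name, _ in
           sorted(zip(countries, summative), key=lambda p: p[1], reverse=True)]
print(ranking)
```

Unlike factor regression scores, every indicator enters with the same weight here, including whatever variance is unique to that indicator, which is the source of the extra noise noted above.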
Table 5. Summative index country ranks
Capacity factor | Governance factor | Capacity × Governance |
---|---|---|
United States | Norway | United States |
China | Sweden | Germany |
Japan | Finland | Japan |
Germany | Denmark | Great Britain |
Great Britain | Switzerland | Canada |
India | New Zealand | France |
France | Luxembourg | Australia |
South Korea | Iceland | Switzerland |
Russia | Canada | Netherlands |
Italy | Netherlands | Sweden |
Finally, we move to test the predictive validity of the two indices. Capacity and Governance are used as predictors of national scientific impact, measured through fractional FWCI. First, the results of a cross-sectional (averaged by country over 2012–2021) Bayesian mixed-model regression are shown in Table 6, which tests the effects of Capacity and Governance on country-level research impact measured via FWCI. When using FWCI, it is common practice to filter out countries that have a very low publication rate but high FWCI due to collaboration. Thus, we removed countries with fewer than 50 total publications, taking the model from a sample size of 174 to 160.
Table 6. Cross-sectional multilevel Bayesian regression, data collapsed 2012–2021
FWCI | Estimate | Std. Error | 95% credible interval |
---|---|---|---|
Intercept | −0.0759 | 0.2055 | [−0.4797, 0.3307] |
Capacity | 0.0032 | 0.0013 | [0.0007, 0.0057] |
Governance | 0.0054 | 0.0012 | [0.003, 0.0078] |
Region | 0.1758 | 0.057 | [0.098, 0.3145] |
Residuals | 0.1743 | 0.0101 | [0.1558, 0.1954] |
Table 6 shows that both Capacity and Governance have positive estimates on fractional FWCI. Furthermore, the standard errors are less than half the estimates and the credible intervals for both do not include zero, indicating (in frequentist terminology) that the estimates are “statistically significant.” The model also controls for nesting within 10 different geographic regions. In short, both indices (Capacity and Governance) appear to be significant predictors of national scientific impact.
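The two checks described above can be verified directly from the Table 6 values. The sketch also compares the reported intervals with a symmetric normal approximation (estimate ± 1.96 standard errors); this approximation is only a rough cross-check, since posterior intervals from a Bayesian model need not be symmetric. The helper names are ours.

```python
# (estimate, std. error, lower bound, upper bound) copied from Table 6.
table6 = {
    "Capacity": (0.0032, 0.0013, 0.0007, 0.0057),
    "Governance": (0.0054, 0.0012, 0.0030, 0.0078),
}

def excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0

def normal_interval(estimate, se, z=1.96):
    """Symmetric normal-approximation interval (an approximation only)."""
    return estimate - z * se, estimate + z * se

for name, (est, se, lo, hi) in table6.items():
    approx_lo, approx_hi = normal_interval(est, se)
    print(f"{name}: SE < estimate/2: {se < est / 2}; "
          f"interval excludes zero: {excludes_zero(lo, hi)}; "
          f"normal approx: [{approx_lo:.4f}, {approx_hi:.4f}]")
```

For both predictors the approximate intervals agree with the reported ones to within rounding, which suggests the posteriors for these coefficients are close to symmetric.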
Table 7 shows the regression analysis conducted on the full panel data. Given the absence of a clear method for obtaining factor regression scores across temporal intervals, a summative composite measure was employed instead of extracting regression scores yearly. The present model also makes use of geographical nesting of countries within regions. The outcome of the model is similar to the results of the cross-sectional analysis. Table 7 displays positive estimates for Capacity and Governance, with credible intervals that exclude zero. The model includes both country- and region-specific random intercepts and includes Year as a continuous variable. The region, the country nested within the region, and the residual exhibit positive estimates, and their credible intervals also exclude zero.
Table 7. Longitudinal multilevel Bayesian regression, 2012–2021
FWCI | Estimate | Std. Error | 95% credible interval |
---|---|---|---|
Intercept | −6.0532 | 2.7623 | [−11.4272, −0.6195] |
Capacity | 0.0036 | 0.0013 | [0.0011, 0.0061] |
Governance | 0.0161 | 0.0039 | [0.0084, 0.0237] |
Year | 0.0033 | 0.0014 | [0.0006, 1.0006] |
Region | 0.1614 | 0.0519 | [0.0905, 0.2879] |
Region:Country | 0.1796 | 0.0114 | [0.1583, 0.2032] |
Residuals | 0.1476 | 0.0028 | [0.1423, 0.1532] |
5. DISCUSSION AND CONCLUSION
The present study introduces a composite index of a country’s capacity to produce and conduct research based on a set of credible indicators. The convergent and divergent validity of the capacity index is tested in relation to a separate index representing governance. The two indices are then used to explore country rankings, which diverge between capacity and governance context. China’s position, for example, moves from first in core capacity to last in governance among countries. Capacity and governance are then tested together to establish their predictive validity by estimating their relative effects on national scientific impact, with both showing significant associations.
The dramatic difference in capacity and governance among some nations may indicate a discrepancy between relatively recent gains in capacity and the long-term sustainability of scientific systems. Autocratic nations may not be fully utilizing the emergent dynamic of self-organization within their workforces, often recognized as a pillar of scientific development (Whetsell et al., 2021). Such countries may gain short-term boosts in raw capacity from top-down programs, yet it remains to be seen whether authentic scientific performance can be sustained in the long term. As De Solla Price (1963) observed in reference to the USSR, rapid growth can be achieved in developing countries because the existing network of scientific activity has already been established. It remains an open question whether the gap between capacity and governance can be sustained long term, especially considering that both, not just capacity, are empirical predictors of scientific performance.
The results give policymakers and analysts the ability to compare nations against one another, and perhaps to consider asymmetries between countries. Moreover, the approach may support “science diplomacy” when actions involve establishing scientific agreements or proposing ties. Policymakers sometimes lack clear insight into the underlying capacities of counterpart nations as they seek partners to participate in scientific activities. The index may help because the process of soliciting science agreements or proposing ties can at times be opaque, particularly with regard to the least developed nations.
The index may also be useful for countries wishing to promote their scientific investments and achievements. National capacity to conduct research and development can attract talent who wish to cooperate or collaborate, invest, or study in another country. Nations with higher research capacity attract students and researchers to their universities and research institutions. Governments approve investment into R&D to build research capacity to reach multiple goals, which may include systemic resilience, long-term viability, and national standing and prestige in science. Understanding the role of core capacity and wider context may aid policymakers as they consider ways to improve the development of “useful knowledge.”
The capacity index could certainly be improved in the future. The focus here is on the underlying latent factors that manifest as relationships between indicators and less about the specific indicators that go into an index. In social science, underlying causal mechanisms generate innumerable empirical indicators, and we must not lose sight of the forest for the trees. In this case, the forest represents the more-or-less stable relationship between indicators, while the indicators themselves represent the individual trees.
Additional research could seek to validate the index and assess the scope for trimming indicators or adding others, such as tertiary education levels, tax incentives, and infrastructure. Future research could also test the predictive validity of the index against other extant indexes with similar data coverage. The index could become more useful over time as more data points are added. In addition, further research using the index in inferential models on a wide variety of interesting outcomes, such as strategic behavior in the international system, may provide insights into the effects of the capacity of individual nations on their network of relationships.
We hope that the index acts as a useful tool for assessing current science capacity and encouraging international collaboration. But more importantly, we hope our approach may serve as a practical starting point for other scholars seeking to construct their own indices. We expect to use it for research to understand the influence of geopolitical factors on national growth and international collaboration. Furthermore, we expect to use the index to serve as a test for the role of public investment in the growth of capacity over time.
ACKNOWLEDGMENTS
Special thanks to Edwin Horlings for comments on the analytical and conceptual approach. Thanks to Ian Helfrich and Lisa Fagan for comments on the statistical analysis. Thanks to Jeroen Baas at Elsevier for providing critical data. Thanks also to Carol Robbins, Sylvia Schwaab-Serger, John Jankowski, and the late Loet Leydesdorff for consultations and comments; and to Peter Zhang and Ken Poland for help and comments on data collection. An earlier monograph by Wagner, Brahmakulam et al. (2001) presents a similar approach to indexing national research capacity, so we acknowledge the previous RAND publication and the RAND Corporation. We are thankful to attendees at the conference session during our presentation at the Atlanta Conference on Science and Innovation Policy, 2023, for comments.
AUTHOR CONTRIBUTIONS
Caroline S. Wagner: Conceptualization; Data curation; Formal analysis; Writing – original draft; Writing – review & editing. Travis A. Whetsell: Conceptualization; Data curation; Formal analysis; Writing – original draft; Writing – review & editing.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
The authors received no funding to undertake this project.
DATA AVAILABILITY
All relevant code and data are available on GitHub: https://github.com/tawhetsell/National-Scientific-Capacity-Index.
Notes
In the 2000s, the World Bank created the Knowledge Economy Index, which grew from a report that described a Knowledge Assessment Methodology (Chen & Dahlman, 2004); the World Bank no longer publishes the KEI index.
Wagner and Jonkers (2017) compared nations on international engagement and found a positive relationship between open exchange and the impact of and quality of science, supporting earlier work by Barro (1996).
Scopus abstracts entries in scholarly journals selected using quality criteria. The database has a broader representation than Web of Science, including more non-English-language journals. The top journals in most countries are represented in Scopus, regardless of language. For example, Scopus claims to have over 90% coverage of serial publications from Japan, close to 90% from South Korea and Taiwan, and over 70% for China.
REFERENCES
Author notes
Handling Editor: Vincent Larivière