This is a “bottom-up” paper in the sense that it draws lessons in defining disciplinary categories under study from a series of empirical studies of interdisciplinarity. In particular, we are in the process of studying the interchange of research-based knowledge between Cognitive Science and Educational Research. This has posed a set of design decisions that we believe warrant consideration as others study cross-disciplinary research processes.
Our overarching program of research addresses the interchange of research knowledge among disciplines (Porter, Schoeneck, Roessner, and Garner 2010). Our approach is empirical social science: We combine bibliometrics (the study of research publication patterns; De Bellis 2009) with “tech mining” (analyses of science and technology text content patterns; Porter and Cunningham 2005). In this paper, we present, as a case study of sorts, the design and measurement challenges we face in our studies of the “connections” between Cognitive Science and Educational Research. In examining connections, we focus on co-citation, that is, research publications that reference papers in both fields.
Let us briefly share our perspective on measuring interdisciplinarity and on the case study to be presented. Some thirty years ago, Chubin led a team that compiled a wonderful set of approaches to consider and facilitate interdisciplinary research (Chubin, Rossini, Porter, and Connolly 1986); research on interdisciplinarity has a legacy! In the late 1970s and 1980s, a professional association called “Interstudy” formed and hosted a series of conferences to share research on interdisciplinary research (cf. Epton, Payne, and Pearson 1984; Mar et al. 1985; Birnbaum-More et al. 1990). Research touched on definition and measurement of interdisciplinary research (IDR), and its facilitation and evaluation from many angles (cf. Chubin et al. 1984; Porter and Chubin 1985; Rossini and Porter 1981). NSF had an Office of Interdisciplinary Research that funded studies of IDR processes; ironically, it was shut down just as the agency began funding large interdisciplinary centers, the Engineering Research Centers (http://erc-assoc.org/), in the mid-1980s. Julie Klein offers rich perspective on the treatment of IDR over the decades (Klein 2008). Wagner et al. (2011) sum up knowledge on the efficacy of measuring IDR.
Our choice to focus on “connections” between Cognitive Science and Educational Research derives from our belief that they pose an especially interesting case for the study of interdisciplinarity, or lack thereof. These are two research communities engaging similar (frequently the same) questions about learning and social interaction at roughly the same grain size or level of explanation. Yet, the communities are surprisingly separate, typically located in different academic departments, and members of different scholarly societies.
Various observers worry that those communities do not engage each other as fully as is desirable. Yet, of particular interest to us as students of IDR is that this landscape appears to have been changing over the last few decades. There have been many attempts to foster connections between these fields, dating at least as far back as the creation of the National Institute of Education, a collaboration between the US National Science Foundation (NSF) and the US Department of Education in the 1970s, or the US Office of Naval Research and the James S. McDonnell Foundation’s Cognitive Studies for Educational Practice funding programs launched in the 1980s. Several actions occurring around the year 2000 suggest that the turn of the century may have been something of a watershed moment in bolstering connections between Cognitive Science and Educational Research. These include major efforts by funding organizations, such as the NSF’s Research on Learning and Education (ROLE) program launched in 1999 and its sister program, the Science of Learning Centers, launched in 2003, plus the Department of Education’s initiation of the Institute of Education Sciences (IES). Signal publications, especially the National Academies of Science’s 1999 publication How People Learn (Bransford, Brown, and Cocking 1999), are also reputed to have exerted notable influence in fostering this connection.
The following sections present our: 1) framing of the problem, 2) categorization of disciplines, and 3) measurement of IDR, followed by 4) discussion. This is a paper on the conceptual and methodological issues in measuring IDR. It draws heavily on the case of Cognitive Science–Educational Research connections to illustrate those issues. It does not present results of the bibliometric analyses in depth; we are addressing those in ongoing research and other publications (cf. Solomon et al. 2018; Youtie et al. 2017b).
2. Framing the Problem
2.1. Major Theoretical and Methodological Decisions
As is true of science in general, but especially true in the still inchoate field of “bibliometric analyses of interdisciplinarity,” the methodological and analytic decisions one makes are often theory-laden, whether the researchers who make them are aware of that fact or not. There are many ways one might approach such questions, and which approach one takes can have major implications for what is, or is not, of importance and how critical constructs are to be understood, as well as empirical implications for the kinds of results one will get. Even seemingly minor decisions concerning operationalization can render conclusions tautological or make some questions impossible to address. In this paper, we make explicit a number of such design decisions that researchers make, knowingly or not, using our program of research on Educational Research and Cognitive Science to illustrate. We do not mean to imply that this is a full set of such questions, nor certainly that our decisions about each were necessarily the best, but we believe that they serve to highlight the kinds of concerns careful and skeptical researchers should consider.
Briefly, the major design decisions we discuss in this paper are these:
What aspect of interdisciplinarity is of interest? Is it research output? Human capital? Social capital? Ideas? Methodology?
What disciplinary categories are of interest and how will disciplinary category membership be determined? Is it discrete or are there degrees of membership?
Relatedly, what level of analysis is most appropriate for addressing those aspects of interdisciplinarity that are of interest? Articles and papers? Journals? What data source is most appropriate and amenable?
How is interdisciplinarity to be measured? Using quantitative or qualitative metrics? What considerations of reliability or validity are warranted?
Other scope, sampling, and representativeness issues. Norms (patterns averaged across a field or time) as opposed to exceptional groups (harbingers of change)? Over what time scale? How are sampling issues addressed?
We begin with our central case study questions: How interdisciplinary is Educational Research, and how much has it engaged with Cognitive Science? In addressing our central research questions, we might, for example, track people, looking at patterns of collaborations and career paths. Alternatively, we might track ideas, concepts, or even methodologies, as they move from, say, the province of Cognitive Science to Educational Research, even influencing graduate training. We chose to approach this question through the research literature, using analyses of citation patterns as an indicator of influence. To assess change in research citation practices requires measuring the citation patterns among Cognitive Science, Educational Research, and closely related fields and sub-fields, which demands that we categorize publications into those disciplinary groupings, and then measure their connections.
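To make that last step concrete, the following minimal Python sketch tabulates category-by-category citation counts once journals have been assigned to disciplinary groupings. It is only an illustration under stated assumptions: the journal names, the lookup table, and the function names are hypothetical placeholders rather than our data or tooling, and citations are assumed to be available as pairs of citing and cited journal.

```python
from collections import Counter

# Hypothetical journal-to-category lookup (illustrative assignments only).
JOURNAL_CATEGORY = {
    "Journal A": "Educational Research",
    "Journal B": "Cognitive Science",
    "Journal C": "Educational Psychology",
}

def cross_citation_counts(citations, journal_category):
    """Count citations by (citing category, cited category).

    `citations` is an iterable of (citing_journal, cited_journal) pairs.
    Pairs involving a journal missing from the lookup are skipped.
    """
    counts = Counter()
    for citing, cited in citations:
        if citing in journal_category and cited in journal_category:
            counts[(journal_category[citing], journal_category[cited])] += 1
    return counts

def share_citing(counts, citing_category, cited_category):
    """Fraction of a citing category's classified citations going to cited_category."""
    total = sum(n for (src, _), n in counts.items() if src == citing_category)
    return counts[(citing_category, cited_category)] / total if total else 0.0

# Toy data: four classifiable citations from Journal A, two of them to Journal B.
citations = [
    ("Journal A", "Journal B"),
    ("Journal A", "Journal A"),
    ("Journal A", "Journal C"),
    ("Journal A", "Journal B"),
    ("Journal B", "Journal B"),
    ("Journal A", "Unknown Journal"),  # skipped: cited journal is unclassified
]
counts = cross_citation_counts(citations, JOURNAL_CATEGORY)
share = share_citing(counts, "Educational Research", "Cognitive Science")
print(share)
```

Note that the resulting share depends entirely on the category lookup, which is why the categorization decisions discussed in the following sections matter so much.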
2.2. Aspects of Interdisciplinarity
Interdisciplinarity can mean many things to many people, so it’s important to be clear which aspects of it are of most importance in addressing one’s questions. Obviously, care should be taken in making declarations about IDR. Many studies of IDR focus upon issues pertaining to human and social capital. For example, what researchers or research communities are engaged in a particular scientific endeavor and how do they interact? What is the composition of the teams that engage in IDR?1 What social or organization structures foster or hinder IDR? What kinds of social networks have been created? This work can be descriptive or evaluative, and itself draws on a range of literatures, from labor economics, to public policy, organizational behavior, social psychology, sociology, anthropology, and human factors.
One might also, or instead, focus upon the ideas that these different communities share through IDR. What are their theories, findings, methodologies, analyses, and research practices? And how do they come into collision or change with time and interaction? Or one might focus primarily upon research outputs—whether journal articles, books, chapters, conference proceedings, or other forms such as patents, curricula, or various outreach activities.
The larger question we address in our program of research is the extent to which Educational Research engages with the Cognitive Science community. We have pursued that in somewhat different ways in different projects. Our primary focus is on knowledge transfer in the research literature; therefore, we key on research outputs and citation patterns. We want to know the extent to which Educational Research has been influenced by Cognitive Science (and vice versa). Such a question is highly amenable to bibliometric analyses. Thus, we have studied the extent to which articles in Educational Research journals cite articles that had appeared in Cognitive Science journals. We also have looked at articles that cited or that were cited by How People Learn (HPL) to gain insight into the fields that influenced that National Academies report (Bransford et al. 1999) and were influenced by it. In other projects, we have engaged questions concerning human capital. For example, we looked at the disciplinary affiliations of the Principal Investigators (PIs) on awards made by NSF’s educational research funding programs (ROLE and REESE), as well as the disciplinary affiliations of the authors of HPL. Of course, the interactions of these IDR elements are also ripe for study. For example, in asking whether projects funded by the REESE program were likely to draw upon research appearing in Cognitive Science journals, one would very much like to know how proposal literature referencing corresponded with the disciplinary affiliations of the PIs, before drawing broad conclusions about how the fields have engaged one another. Our analysis of HPL crossed with yet a third factor: the actual ideas.
One can imagine an article appearing in an Educational Research journal citing HPL, ostensibly as a way of indicating that it had engaged the Cognitive Science literature, only to find that the actual concept referred to was already commonly known in Educational Research and therefore not really much of an indication of cross-disciplinary engagement (Solomon et al. 2018). As in all good scientific endeavors, the lesson is to be careful about making claims beyond the scope of what is warranted by the evidence. But that evidence is a product, in part, of choices made about categories to measure and how to do so.
3. Categorizing Disciplinarity
3.1. Determining Field of Categorization and Level of Analysis
Implicit already in this paper is the issue of what level of categorization is appropriate. One could analyze department-level academic “disciplines.” This has the benefit of mapping onto human and social capital concerns. Disciplines map to academic departments (albeit not neatly) that constitute institutional homes for researchers, and are producers and purveyors of curricula. Departments also formulate and administer degrees. Disciplinary societies help conform research outputs via means such as academic journals and professional conferences. Note that even at this level, definitional decisions can arise. The masthead for the Cognitive Science Society lists Education as one of its sub-disciplines. By contrast, in some universities, Cognitive Science is a sub-discipline of Education organizationally. Adopting either of these definitions renders the question of cross-disciplinary knowledge exchange tautological, despite the fact that there are communities of researchers calling themselves Cognitive Scientists and those calling themselves Educational Researchers who are largely sociologically distinct from each other, and read and contribute to surprisingly separate literatures. The key is to make such decisions coherently and explicitly.
Alternatively, one could focus at subordinate subfield levels, organized around more intellectually coherent content areas and with ostensibly more shared histories and levels of current interaction. For instance, later we’ll note that Physical Chemistry doesn’t usually stand apart as an academic discipline, but is obviously a more homogeneous unit than Chemistry. Conversely, one could aim at broader fields of scholarship—“meta-disciplines” if you will—such as the “physical sciences” counterposed to the “social sciences.” Or, perhaps, might some other multidisciplinary or hybrid grouping be essential to one’s research inquiries? Again, determining the appropriate level of categorization very much depends on how one’s IDR questions are pitched and what sort of claims one would like to make. We next consider five possible categorical levels pertinent to research knowledge interchange—looking first at Research Outputs and then at Researchers.
3.2. Research Outputs
Perhaps the most basic level is to consider discrete concepts (“memes” or ideas or discrete findings) or topic areas and their transfer among researchers. In our study of the influence of How People Learn (HPL), we identify major HPL concepts and theoretical approaches, and then look for evidence of their uptake in papers that cite HPL (Solomon et al. 2018). Such evidence entails analyses of the citing document text segments in proximity to the reference to HPL, requiring download of the full citing papers. An essential challenge is how to distinguish the concepts sufficiently cleanly to enable us to track their uptake. Such analyses are more challenging in this social science arena where the terminology has much in common with general language uses (as opposed to the case of analyses of highly technical fields). Given our interest in the extent to which ideas from Cognitive Science influenced researchers in Education, and vice versa, we distinguish those concepts in HPL already prevalent in the Education literature from those concepts common to the Cognitive Science literature and not Education.
A next level is the article as unit of analysis. Articles are not unitary; they can well be composed of multiple concepts, interesting methodological heritages and contributions, multiple findings, and studied interpretations. So, as we track research influence, citation of a given article could mean various things. This is even more the case for citations to longer pieces, such as National Academy reports (e.g., HPL) or books. A citation could be a tip of the hat to the broad relevance of a line of inquiry, a superficial or pro forma acknowledgement, or an indication of real engagement and influence of discrete concepts or findings.
How does one categorize articles to indicate fields? One could do it on the basis of content, though note the difficulties with that, just mentioned re: concepts. One could do it on the basis of the assigned disciplinary categories of the authors (though see the next subsection for a discussion of issues entailed in that effort). We are currently exploring categorization of articles based on human interpretation of article samples, in turn used as seed knowledge from which computer routines can then auto-classify large sets of articles. Even this endeavor requires one to decide whether to weight the field or fields from which the article derives (e.g., looking at the cited references), or one could attempt to determine an article’s intended audiences. All of this is fraught with coding challenges as well as conceptual risks. For example, what is one to do with an article written by a researcher trained as a Cognitive Scientist, possibly working in an Education college, drawing heavily on the Educational Psychology (one of several “Border fields” we consider) literature, but intended for an audience of Educational Researchers (based on its journal’s readership)?
Despite such challenges, we note that, in a separate effort to categorize Nano-Enabled Drug Delivery article abstract records, our colleagues reported that including journal information added no predictive power to other abstract record content (Ma et al., to appear). Boyack and Klavans (2010) compare alternative article-level analyses to identify research frontiers (a more fine-grained classification). Their investigation of nine article-to-article similarity matrices from the MEDLINE database abstract records supported PubMed’s approach to identify related authors, as a vehicle to prompt users’ possible interest in other articles (Boyack et al. 2011). For our purposes, these considerations point to the multidimensionality of categorization.
A third level is the journal (often referred to as a “source”—whether journal, conference, book, or website). Again, the central challenge is how a journal is to be assigned a disciplinary category. What is the systematic basis of such a decision—stated mission, or article content, or intended audience? Indeed, the aggregation of articles into journals poses a homogeneity challenge. For example, we were forced to confront that issue in classifying articles appearing in disciplinary as opposed to multidisciplinary journals. Consider the diversity of articles appearing in a given journal issue—at different tiers of “disciplinarity”:
Nature (multidisciplinary), or
PLOS BIOLOGY (biological, but diversified broadly across the field), or
Cell (biological, but somewhat more focused on certain subfields).
Anytime we use journal as unit of analysis, we are amalgamating differences. Leydesdorff and Rafols (2011) explore journal-level IDR measurement; Rafols and Leydesdorff (2009) explore alternative ways to classify journals; and Rafols et al. (2010) and Leydesdorff et al. (2016) offer journal-based science maps. Moreover, one must be careful about what citation by and of a journal at these different tiers of disciplinarity means. For example, Solomon, Carley, and Porter (2016) show that though the multidisciplinary journals Science and Nature contain articles from a diverse range of disciplines, the individual articles within them are not more multidisciplinary in the literatures that they cite nor in the disciplines that cite them than is true of top disciplinary journals. That is, an article on Cell Biology appearing in Nature does not influence a more diverse array of disciplines than does an article on Cell Biology appearing in Cell.
An anecdote may illuminate the somewhat whimsical nature of article/journal disciplinary categorization. In a case study, we asked Robert Nerem about how particular papers of their biomedical engineering lab came to be published in a biomedical or in an engineering oriented journal. He related tales for particular articles reflecting opportunistic invitations, balancing distribution, etc. In essence, a given article could as reasonably have been published in sources associated with either field (Roessner et al. 2013).
A fourth level is represented by Web of Science Categories (WoSCs). The Web of Science (WoS) is a leading database that indexes a broad spectrum of research literature [www.webofknowledge.com]. WoS uses these categories, whose number evolves slowly (227 as of early 2017), to categorize over 11,000 science and social science journals. Again, we confront a less-than-pristine aggregation. How homogeneous are the journals bunched into a given WoSC? WoS combines journal-to-journal citation frequency with expert field knowledge to compose and assign WoSCs. WoSC size varies considerably (e.g., 5 journals in Andrology vs. 228 in Biochemistry, as of 2015). More tellingly, observers allege that the WoSC assignment of an article, based on the journal in which it appears, is wrong approximately half the time (Boyack et al. 2007). For macro-scale mapping, that is not a huge concern, as the location of research concentrations on a map of all science is not precise, and errors in the assignment of journals to WoSCs are usually close (i.e., the article’s author might indicate the article is best considered in a nearby field; rarely, in a distant one). But for fine characterizations (e.g., our intent to draw distinctions among fields related to the science of learning) that is a grave concern. We deem use of WoSCs for our Cognitive Science–Educational Research analyses problematic (explored in the next section).
We also note that WoSCs, while a leading bibliometric choice, are not the only option. Scopus is a direct competitor of WoS, also covering nearly all research arenas, with more journals and proceedings in many of those. We favor WoS because we have developed our IDR metrics using WoSCs and we find the data cleaner. There are additional options. In our “connections” study, we also gather Google Scholar citations to HPL. We locate some 641 papers that cite HPL in WoS, though we might also have looked at the 17,000 citations in Google Scholar were we considering less research-oriented questions. The key is that one must consider the kinds of interpretations that are to be warranted. Implications drawn for cross-disciplinary exchange can sharply differ as based on those two data sources.
The issue of category assignment is partly an empirical and partly a theoretical issue with WoSCs. One must ask just what a category assignment indicates. As noted, WoS staff make these designations based on a combination of empirical journal cross-citation frequencies and judgments by panels of topical experts. Nonetheless, we found it advisable to modify the WoS assignments to suit the kinds of questions we wanted to ask. For example, the journal “Physics Education Research” is assigned by WoS to both the WoSC “EDUCATION, SCIENTIFIC DISCIPLINES” and to “PHYSICS.” On an intuitive basis this seems sensible as that journal’s educational content is assuredly physics-related. But, note that this dual assignment confounds analyses of research outputs (articles) appearing in “Physics Education Research.” Each citation thereto would count both as a link to the disciplinary literature (Physics) and to Discipline-Based Education Research (DBER) as reflected by the WoSC (Education, Scientific Disciplines). Our research question requires deeper parsing, as exemplified here by “Physics Education.” By creating correspondingly more specialized categories for Chemistry Education, Math Education, and so on, we can better compare how research in particular DBER fields connects to other DBER fields, and to other Educational Research work, and so on. It is an empirical question whether Physics Education researchers read articles in Chemistry Education journals.
Further questions include whether journals in the same WoSC share an audience, whether they constitute a scientific community of sorts. The answer is sometimes yes, sometimes no. One must be careful. In our project, we further disaggregated the WoSC “EDUCATION, SCIENTIFIC DISCIPLINES.” Indeed, many DBER (focusing on undergraduate level disciplinary education issues) journals also are assigned to the WoSC “EDUCATION & EDUCATION RESEARCH.” In the previous paragraph, we sought more precision in categorizing citations to articles in “Physics Education.” Similar issues arise in analyzing the research upon which “Physics Education” authors draw. Again, we desired greater specificity to distinguish citation within that DBER field (Physics Education) from citation to other DBERs (e.g., “Chem Ed”), and from citation to other educational research (e.g., journals aiming at preschool teachers).
The prior discussion concerned disaggregation of WoSCs. A fifth level can be based on aggregation of WoSCs. We note that there is no WoSC for Cognitive Science! Happily, we can create our own concatenations from WoSC data. In this case, we drew on expert advice and empirical evidence regarding journal co-citation, as well as the literature about Cognitive Science (e.g., the mission statement of the Cognitive Science Society, and bibliometric studies such as Goldstone and Leydesdorff 2006; Leydesdorff and Goldstone 2014). We categorized a journal as Cognitive Science if it was assigned to any of a range of experimental psychology WoSCs (e.g., cognitive, developmental, biological, mathematical, or social psychology) or to Artificial Intelligence, Linguistics, or Cognitive Neuroscience. We expressly did not include various branches of Psychology that were more clinical or therapy oriented, reasoning that they represented different literatures, different professional societies, and different communities. In effect, we might say these are “disciplines separated by a common name”: psychology.
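The aggregation rule just described can be sketched in a few lines of Python. This is a hedged illustration, not our production routine: the category labels below paraphrase the fields named in the text and are not the exact WoSC names or the complete list, the function name is hypothetical, and the rule that any single overlap suffices (even when a journal also holds an excluded category) is an assumption made for the sketch.

```python
# Illustrative labels only; NOT the exact WoSC names or the full list used in the study.
COGNITIVE_SCIENCE_WOSCS = {
    "Psychology, Experimental",
    "Psychology, Developmental",
    "Psychology, Biological",
    "Psychology, Mathematical",
    "Psychology, Social",
    "Artificial Intelligence",
    "Linguistics",
    "Cognitive Neuroscience",
}

def aggregate_category(journal_woscs, member_woscs=COGNITIVE_SCIENCE_WOSCS,
                       label="Cognitive Science", other="Other"):
    """Return `label` if the journal holds at least one member WoSC, else `other`.

    Assumption: any overlap suffices, even if the journal also holds an
    excluded category (e.g., a clinical psychology WoSC).
    """
    return label if set(journal_woscs) & member_woscs else other

in_cogsci = aggregate_category({"Linguistics", "Education and Educational Research"})
clinical_only = aggregate_category({"Psychology, Clinical"})
print(in_cogsci, "|", clinical_only)
```

A stricter variant could require that a journal hold no excluded category at all; which rule is appropriate depends on the research question.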
Sometimes, we are less interested in distinctions among the various disciplines, but more interested in higher level contrasts. For various purposes we have used “macro-disciplines” (clustering the 224 WoSCs into 19 larger groups, e.g., “biomedical sciences,” “materials sciences”). Some background: earlier we mentioned that inaccuracies in locating an article in the right WoSC, while frequent, tended to be “nearby.” Determination of nearness draws upon a year’s worth of WoS journal-to-journal cross-citation data (kindly provided by Thomson Reuters, now Clarivate). Our colleague, Loet Leydesdorff, consolidates those data to create matrices of WoSC by WoSC cross-citations. A statistic (cosine similarity) reflects “nearness” in the sense of relative frequency of one WoSC citing another. We use those results in various calculations, such as Integration scores (to be discussed). Then, our colleague, Ismael Rafols, factor analyzes those data to generate clusters of WoSCs that reflect reasonable granularity, as well as statistical association. The current set (based on 2015 WoS data) yields a reasonable set of 19 macro-disciplines (http://www.leydesdorff.net/wc15). We generate visualizations using these data. With a background of the 227 WoSCs as nodes located based on “nearness,” we overlay particular sets of research as a science overlay map. Concentrations of that research are shown as sized, colored nodes distinguishing the 19 macro-disciplines (Leydesdorff and Rafols 2009; Rafols et al. 2010; Carley et al. 2017). Example science overlay maps depicting the spread of cited WoSCs by Educational Research articles, vs. those by Cognitive Science articles, appear in the Supplement for this article.
We go further to group into four meta-disciplines. For example, in a study of a particular social science funding program, we mapped publications derived from their funded projects into the four meta-disciplines: “Psychology & Social Sciences,” “Biology & Medicine,” “Environmental Science & Technology,” and “Physical Science & Engineering” (Garner et al. 2013). This proved effective in communicating via science overlay maps that the funding program exerted extensive influence (indicated by citations) beyond the social sciences.
One additional note about assigning categories to research outputs: As you can see for WoSCs, a single output, whether an article or a journal itself, can be assigned to multiple categories. As noted, one could choose to assign a given category not on the basis of WoSC, but on the basis of who the researcher or authors are. But determining how to categorize the researchers engaged in particular research outputs, funding awards, or teams is no less complex. For example, if one wants to use researcher disciplinarity in order to determine the disciplinarity of some research output, such as a paper, the first question to ask is “Whom to categorize?” The lead author? The problem with that approach is that when there are multiple authors, it is not clear who is the lead contributor. Fields vary as to which in a chain of authors is the lead. In some fields, it is the first; in others, the last. And many journals now list the specific contributions of each of the authors. Should the authors be weighted in some a priori fashion based on those contributions? Similarly, when categorizing funding awards, should one attend only to the PI? Or should the co-PIs also be considered? What of the other senior personnel on the project? How does one proceed when determining the category of a team project or output? If one does identify researcher disciplinary categories, how should one use the results? For example, what do we make of a team that is 90% psychology and 10% education, vs. another that is 50-50?
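For readers unfamiliar with the statistic, here is a minimal Python sketch of cosine similarity applied to a toy cross-citation matrix. The three category names and all counts are invented for illustration. The choice to compare rows (each category's pattern of citations made) is an assumption of this sketch and may differ from how the matrices referenced above are actually processed.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for categories with no citations
    return dot / (norm_u * norm_v)

# Toy matrix: row i holds the citations made by category i to each category.
categories = ["Education", "Experimental Psychology", "Physics"]
cross_citations = [
    [50, 10, 1],
    [12, 60, 2],
    [1, 2, 80],
]

def similarity_matrix(matrix):
    """Pairwise cosine similarity between the rows of a square count matrix."""
    return [[cosine_similarity(r1, r2) for r2 in matrix] for r1 in matrix]

sim = similarity_matrix(cross_citations)
for name, row in zip(categories, sim):
    print(name, [round(x, 2) for x in row])
```

In this toy example, Education comes out "nearer" to Experimental Psychology than to Physics because their citing profiles overlap more; values close to 1 indicate near-identical profiles and values close to 0 indicate almost none in common.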
Further issues arise in the nature of categories used and assignment thereto. As indicated, journals can be assigned to more than a single WoSC, and some 40% are. In our “connections” study analyses, for many purposes we favor single assignment categories. For instance, in allocating 177 journals upon which we drew publications for analyses, if a journal were in both the general Educational Research WoSC and in another WoSC (e.g., Educational Psychology), we assigned it to the more specialized category. For certain analyses, we examined cross-citation between categories so constructed with the residual Educational Research journal subset of 31 journals solely in “Education and Educational Research.”2
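The single-assignment rule described above might be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the category strings are taken loosely from the text, and the alphabetical tie-break used when a journal holds several specialized categories is our own placeholder, since the rule as stated covers only the two-category case.

```python
GENERAL = "Education and Educational Research"  # label as written in the text

def single_assignment(woscs, general=GENERAL):
    """Pick one category for a journal, preferring specialized over general.

    Assumption: if several specialized categories remain, take the first
    alphabetically. A real study would need an explicit priority order.
    """
    woscs = set(woscs)
    if not woscs:
        raise ValueError("journal has no category assignments")
    specialized = sorted(woscs - {general})
    return specialized[0] if specialized else general

dual = single_assignment({GENERAL, "Educational Psychology"})
residual = single_assignment({GENERAL})
print(dual, "|", residual)
```

Journals for which the function returns the general label correspond to the residual subset described above, those assigned solely to the general Educational Research category.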
Rather than such exclusive, discrete assignment to categories, one could favor graded membership. This could take the form of scoring a research output as centrally, as opposed to peripherally, a member of a category. A project or team could be, to use the above example, 90% psychology and 10% education. There is no single answer, of course, to the issue of how to assign disciplinarity to a team. Again, it comes down to what kinds of research questions are to be addressed, using what data.
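As a minimal sketch of graded membership, the Python below computes a team's disciplinary shares from a list of member disciplines and, to contrast the 90/10 and 50/50 teams, a simple diversity index. The Simpson-style index (one minus the sum of squared shares) is a technique we swap in here purely for illustration; it is not a measure prescribed in this paper, and the function names and team data are hypothetical.

```python
from collections import Counter

def membership_shares(member_disciplines):
    """Return each discipline's share of a team as a fraction summing to 1."""
    counts = Counter(member_disciplines)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("team has no members")
    return {discipline: n / total for discipline, n in counts.items()}

def simpson_diversity(shares):
    """1 - sum of squared shares: 0 for a single-discipline team, higher when balanced."""
    return 1.0 - sum(p * p for p in shares.values())

# Hypothetical teams matching the proportions discussed in the text.
team_a = ["Psychology"] * 9 + ["Education"]        # 90% / 10%
team_b = ["Psychology"] * 5 + ["Education"] * 5    # 50% / 50%

shares_a = membership_shares(team_a)
shares_b = membership_shares(team_b)
print(shares_a, round(simpson_diversity(shares_a), 2))
print(shares_b, round(simpson_diversity(shares_b), 2))
```

Both teams would receive the same label under a discrete "multidisciplinary" category, whereas the graded shares keep the difference in balance visible.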
3.3. Researchers
The prior section considered issues in categorizing research outputs. It is no simpler when assigning researchers to disciplinary categories. First, is a team to be assigned membership in a single category, or by doing so does one lose potentially valuable data about multidisciplinary teams? Similarly, there might be much to learn from considering individuals to possibly be members of multiple categories (i.e., “IDR [interdisciplinary research] individuals”).
It is in part an empirical question as to what sources of data are used to determine disciplinarity and how they ought to be weighted. Data potentially available regarding researchers include the following:
When assigning disciplinarity to an individual on the basis of a single factor, the default is usually to consider the individual’s highest academic degree. But options arise on how to place those with highest degrees from multiple fields. Should one consider the first or the latest? One can also consider other degrees, postdoctoral fellowships, certificates, and non-standard degrees.
One can also consider the researcher’s current department (ignoring the further complexity, for now, of considering researchers in non-academic positions). Though that also leads quickly to considerations of multiple departmental affiliations, adjunct status, involvement in research centers, past affiliations, and so on.
One can also consider assigning discipline on the basis of the predominant WoSCs (better aggregated to macro-disciplines) in which the researcher publishes. Though beware of the potential tautology one risks in using such a means of assigning category while studying cross-category connections.
The professional societies to which researchers belong say something about how they self-identify. However, this information is not available in many databases, such as WoS. So, obtaining such data likely entails many challenges—finding personal websites, checking research sites such as ResearchGate, visiting departmental websites, and some form of individual surveying. Moreover, it is not clear what lack of membership, or lack of current membership, means. Some people are joiners!
One can derive graded scores of the extent to which an individual is a member of one or multiple disciplinary categories. It is an empirical question whether a researcher with a BA, MA, PhD, and postdoc in Chemistry who resides in a Chemistry department and is a member of the American Chemical Society cites different literature than does a departmental colleague with a background in Computer Science. Indeed, we found, when comparing publication and citation patterns (which literatures are cited and whether the source publications were in WoS or not) among Cognitive Scientists and Education Researchers, that current department was a better predictor than was highest degree, though degree was, to be sure, a factor (Solomon et al. 2014).
There are further practical considerations in using the above information to assign disciplinary category in studying research activities. One is that the above information is not fully available in WoS records or other databases. Moreover, even the information that is available can require further coding judgments. Departmental units vary in naming and composition. They also may not match one’s research needs. For example, Solomon et al. (2016) examined publications in physical chemistry, but the degrees and departmental affiliations of the authors were rarely of that grain size, more typically listing instead, “Chemistry” or “Division of Physical Sciences.” In our studies of Cognitive Science and Educational Research, we distinguished Cognitive Science from Psychoanalysis, yet a Department of Psychology might include both.
We note that the move toward use of unique researcher identification is gaining momentum. In particular, some nations, universities, and journals are moving to require researchers to obtain and use ORCID identifications (IDs) [3]. Over time that should help in obtaining information on authors and proposers. At present, WoS includes a field for authors’ IDs, including ORCID IDs and ResearcherIDs. The percentage of articles showing ORCID coverage is growing, but is still below the threshold for our purposes (e.g., on the order of 20-30% of articles have at least one such ID associated with them, as of 2017). In preliminary explorations, we find that ORCID coverage ranges widely by discipline, journal, and author nationality (Youtie et al. 2017a). Additionally, researchers vary in what personal information regarding degrees and affiliations they provide on their personal ORCID pages (orcid.org).
4. Measuring Interdisciplinarity
Wagner et al. (2011) set out to recommend metrics to assist NSF in its efforts to evaluate IDR and to identify factors contributing to IDR success. They concluded that measurement was not sufficiently well-developed to recommend particular indicators. Well-respected science policy analysts more recently reiterated the message of caution regarding research metrics, in general, not just pertaining to IDR, in the “Leiden Manifesto” (Hicks et al. 2015).
The National Academies Keck Futures Initiative (www.keckfutures.org) is a 15-year, $40 million program to boost IDR in the U.S. As an early part of that ongoing program, the National Academies issued a report whose definition of IDR has informed our efforts to operationalize measures of interdisciplinarity, and, by implication, what we consider to be disciplines:
“Interdisciplinary research (IDR) is a mode of research by teams or individuals that integrates information, data, techniques, tools, perspectives, concepts, and/or theories from two or more disciplines or bodies of specialized knowledge to advance fundamental understanding or to solve problems whose solutions are beyond the scope of a single discipline or area of research practice” (National Academies 2005, p. 188).
It requires integration of knowledge;
It specifies that such integration can be accomplished by teams, or by individuals, implying that collaboration may be conducive to IDR, but is not its defining characteristic;
The knowledge to be integrated can be of various forms: ideas (perspectives, concepts, theories), methods (tools, techniques), and/or information (data);
Inter-“disciplinary” reduces in practice to interchanges among bodies of specialized knowledge or research practices.
“Bodies of specialized knowledge or research practice” can be operationalized in terms of WoSCs that cluster science, social science, and arts and humanities journals into 244 research areas (at the level of sub-disciplines such as organic chemistry or educational psychology) [4]. We have developed a number of measures, and visualizations derived from them, to get at cross-disciplinary research knowledge attributes and transfer patterns (Porter et al. 2006, 2007, 2008). We next introduce IDR measures based on the WoSCs: Integration scores, Specialization scores, and Diffusion scores.
Integration and Specialization scores were developed to help measure changes in interdisciplinarity in support of the Keck Futures project evaluation. Integration addresses the diversity of references cited by a paper (or a proposal or an accumulation such as the set of papers published by a researcher or a center). The Integration score corresponds to a 3-part conceptualization of diversity (Stirling 2007). The basic process to calculate this score entails:
download the pertinent set of WoS abstract records, including cited references
import those into VantagePoint desktop text analysis software (see: www.theVantagePoint.com)
extract the cited source (i.e., journal title, conference proceedings, book, or other source) name; then clean and consolidate name variations
apply a thesaurus that associates WoS cited journals to their WoSCs (or to other categorizations)
run a script that calculates the Integration scores.
The Integration score incorporates “variety” (the number of WoSCs cited by the paper), “balance” (the distribution of cites among those WoSCs), and “disparity” (how dissimilar those WoSCs are from each other); it is akin to a Rao-Stirling index (Stirling 2007; Rafols and Meyer 2009). Similarity among WoSCs is based on the extent of cross-citation among their journals in a given year of WoS records (e.g., 2015; note prior discussion of the calculations).
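The calculation can be illustrated with a minimal Python sketch. It assumes the Integration score takes the Rao-Stirling-style form I = 1 - sum over i,j of s_ij * p_i * p_j, where p_i is the share of a paper's cited references falling in WoSC i and s_ij is the similarity between WoSCs i and j (with s_ii = 1). The function name, the category labels, and the similarity value are hypothetical and chosen only for illustration; this is not the VantagePoint script mentioned above.

```python
from collections import Counter


def integration_score(cited_wosc, similarity):
    """Return 1 - sum_ij s_ij * p_i * p_j for one paper.

    cited_wosc: list with one WoSC label per cited reference.
    similarity: dict mapping (wosc_a, wosc_b) pairs to a similarity in [0, 1];
                pairs that are missing are treated as similarity 0 (assumption).
    """
    counts = Counter(cited_wosc)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no categorized references, so no measurable diversity

    # p_i: proportion of references in each cited category
    shares = {wosc: n / total for wosc, n in counts.items()}

    weighted = 0.0
    for a, p_a in shares.items():
        for b, p_b in shares.items():
            if a == b:
                s = 1.0  # a category is fully similar to itself
            else:
                s = similarity.get((a, b), similarity.get((b, a), 0.0))
            weighted += s * p_a * p_b
    return 1.0 - weighted


# Hypothetical similarity value, for illustration only.
EXAMPLE_SIMILARITY = {("Experimental Psychology", "Education"): 0.2}
EXAMPLE_REFERENCES = ["Experimental Psychology", "Education"]
example_score = integration_score(EXAMPLE_REFERENCES, EXAMPLE_SIMILARITY)
```

Under this form, a paper citing a single WoSC scores 0, and the score rises as references spread more evenly over more categories that are less similar to one another.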
Specialization score analogously measures the diversity of publication WoSCs. For an individual paper, this holds little utility as a paper appears in a single journal that is usually associated with a single WoSC (again, we note that over 40% of journals are assigned multiple WoSCs). Specialization holds more interest when applied to a collective—e.g., a set of publications acknowledging support by a particular NSF program.
Diffusion score analogously measures the diversity of the papers citing a given body of research (e.g., a paper; the collected set of papers of a research center) (Carley and Porter 2012). An additional measure addresses the distance between WoSCs, as in those citing a given research output—e.g., comparing articles supported by the NSF Human and Social Dynamics (HSD) Program to a comparison group in terms of how distantly they are cited (i.e., how far apart a citing WoSC is from that of the journal of publication—Garner et al. 2014).
It is beyond our scope to review other approaches to measuring interdisciplinarity; we point to the review by Wagner et al. (2011) as especially catholic and cautionary. Our approach of using WoSCs as essential disciplinary categories in measuring IDR has issues, starting with one’s chosen “level of analysis,” as discussed in some depth previously. We note that the approach of scoring degree of interdisciplinarity (Integration, Specialization and Diffusion scores) can be extended to measure the three diversity components (variety, balance and disparity) separately to study their roles (Chavarro et al. 2014; Wang et al. 2015). The Science and Technology Indicators Conference in 2018 focused considerable attention on measuring separate diversity components (http://sti2018.cwts.nl/proceedings).
Conversely, for some research purposes, simpler measures may be cleaner for getting a sense of interdisciplinary “reach.” Simply counting the number or percentage of citations appearing in out-of-field journals can indicate the “reach” of a paper or set of papers (Solomon, Carley, and Porter 2016). For instance, does a given Educational Research paper reference any Cognitive Science papers at all? One could extend this to tally what percentage of papers in a particular field cite one or more articles from a particular other field (Youtie et al. 2017b). Or one might look at the proportion of references in a paper that were from a particular field. Schunn, Crowley, and Okada (1998) decided that a given field had had a major influence on a particular paper if at least 25% of its references were to articles appearing in journals from that field. This categorization allows one to ask fairly focused questions about disciplinary interaction. We adapted this criterion to assign papers in multidisciplinary journals (Science and Nature) to disciplines (Solomon et al. 2016).
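These simpler "reach" measures are easy to express in code. The following sketch assumes each paper is represented as a list of field labels, one per cited reference; the function names and the sample data are hypothetical, and the 25% default follows the Schunn, Crowley, and Okada (1998) criterion described above.

```python
def field_share(cited_fields, field):
    """Proportion of a paper's cited references that fall in `field`."""
    if not cited_fields:
        return 0.0
    return sum(1 for f in cited_fields if f == field) / len(cited_fields)


def is_major_influence(cited_fields, field, threshold=0.25):
    """True if at least `threshold` of the references are to `field`."""
    return field_share(cited_fields, field) >= threshold


def percent_papers_citing(papers, field):
    """Percentage of papers that cite at least one reference from `field`."""
    if not papers:
        return 0.0
    hits = sum(1 for cited_fields in papers if field in cited_fields)
    return 100.0 * hits / len(papers)


# Hypothetical Educational Research papers, one list of cited fields per paper.
SAMPLE_PAPERS = [
    ["Education", "Education", "Cognitive Science", "Education"],
    ["Education", "Education"],
    ["Cognitive Science", "Cognitive Science", "Education"],
    ["Education", "Sociology"],
]
sample_percent = percent_papers_citing(SAMPLE_PAPERS, "Cognitive Science")
sample_major = [is_major_influence(p, "Cognitive Science") for p in SAMPLE_PAPERS]
```

The threshold is a parameter rather than a constant because, as noted, the appropriate cutoff depends on the research question.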
Finally, Kwon et al. (2017) went one step further than Schunn et al. in defining various types of Knowledge Mediating publications (KMEDs), allowing one to focus on the knowledge interchange between specific fields, rather than globally. An Aggregating Type of KMED is a publication in which x% of its cited publications are from one field of interest and x% are from another, with x set at whatever threshold one determines, either theoretically or empirically, to be of interest. For example, one could set x=25% and determine what percentage of publications in a particular journal were Aggregating KMED for Cognitive Science and Educational Research. Similarly, they defined a Bridging type of KMED as a target publication for which x% of its cited references were to one literature (e.g., Cognitive Science) and x% of the publications that cited the target publication were in another literature (e.g., Education). In this sense, the publication served to bridge the literatures. Kwon et al.’s third KMED was the Diffusing type, defined as those publications for which x% of the publications citing it were from one field of interest and x% were from another.
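The three KMED types can be sketched as a small classifier. This is an illustration under stated assumptions, not Kwon et al.'s implementation: "x%" is read as "at least x%", the Bridging type is accepted in either direction between the two fields, and the function names and field labels are hypothetical.

```python
def share(fields, field):
    """Proportion of entries in `fields` equal to `field`."""
    if not fields:
        return 0.0
    return sum(1 for f in fields if f == field) / len(fields)


def kmed_types(cited, citing, field_a, field_b, x=0.25):
    """Return the set of KMED types a target publication satisfies.

    cited:  field labels of the publications the target cites.
    citing: field labels of the publications that cite the target.
    x:      threshold as a proportion (0.25 means 25%).
    """
    cites_a = share(cited, field_a) >= x
    cites_b = share(cited, field_b) >= x
    cited_by_a = share(citing, field_a) >= x
    cited_by_b = share(citing, field_b) >= x

    types = set()
    if cites_a and cites_b:
        types.add("aggregating")  # draws on both literatures
    if (cites_a and cited_by_b) or (cites_b and cited_by_a):
        types.add("bridging")     # draws on one, is taken up by the other
    if cited_by_a and cited_by_b:
        types.add("diffusing")    # is taken up by both literatures
    return types


# Hypothetical target publication: 30% of references to Cognitive Science,
# 40% of citing publications from Education.
example_cited = ["Cognitive Science"] * 3 + ["Other"] * 7
example_citing = ["Education"] * 4 + ["Other"] * 6
example_types = kmed_types(
    example_cited, example_citing, "Cognitive Science", "Education"
)
```

A publication can satisfy more than one type at once, which is why the sketch returns a set rather than a single label.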
5. The Case Study: Knowledge Interchange between Cognitive Science and Educational Research
The considerations posed in this paper have arisen as we framed and conducted research on this case. As said, this paper aims to present issues in assessing cross-disciplinary knowledge transfer conceptually. Here we offer a brief summary of our results in the ongoing case analysis.
The case analyses have focused on two main datasets: 1) a compilation of 32,121 WoS abstract records of articles published in 177 journals in the years 1994, 1999, 2004, 2009, and 2014 in Cognitive Science, Educational Research, and Border fields; and 2) 1,641 WoS records of papers citing How People Learn (HPL). Results from analyzing (1) indicate a pickup in citation of Cognitive Science research by Educational Researchers after 2000 (Youtie et al. 2017b). They also show an increase in attention to Border field articles, with interesting implications about Border field research providing a conduit between Cognitive Science and Educational Research (Youtie et al. 2017b). We continue to extend our analyses to probe more deeply into the cross-disciplinary connections appearing between the 32,121 papers and their cited references, as well as between those papers and the papers that cite them (Porter et al. submitted).
Our analyses of HPL (Solomon et al. 2018) show that HPL is highly multidisciplinary, both in the literature it cited and in the diversity of the publications citing HPL. In contrast, its reputation as a gateway for Educational Researchers to the Cognitive Science literature is questionable. We find that Education publications that cite HPL are no more likely to have Cognitive Science as a major influence than are benchmark Education publications, as calculated by Youtie et al. (2017b). Moreover, we found that Education publications are overwhelmingly more likely to refer to content in HPL that was already in the Education literature, and Cognitive Science publications were more likely to refer to content that was already in the Cognitive Science literature. This is a caution, to be sure, about what citation behavior indicates about cross-disciplinary connections. For those interested in specific findings, we welcome inquiries as we progress.
This paper has shared issues and experiences in identifying “disciplines” and the interchange of research knowledge among them. We see these as vital to further understanding of interdisciplinary research processes and outcomes. Here we step back to frame those issues and approaches. This paper’s goals are modest—to help readers recognize options and to offer considerations in pursuing those. We draw from case experiences, with aspirations of generalizability, but no pretense of completeness.
Studying interdisciplinarity begins with a choice of units. We mainly treat categorization of research outputs, with some attention to the alternative of categorizing researchers [5]. For research outputs, we focus on papers, particularly in the form of compilations of abstract records and metadata that describe the records provided by the Web of Science database [6]. We discuss treating papers’ disciplinarity at five levels of analysis: 1) particular concepts (as represented by terms or phrases), 2) articles, 3) sources (e.g., journals in which articles appear), 4) disciplinary groupings (we focus on Web of Science Categories), and for some purposes, 5) aggregations of those 227 WoSCs into 18 macro-disciplines, or even further into 5 meta-disciplines (those values reflect the 2015 rendition of science overlay mapping; see www.leydesdorff.net/wc15).
This would seem like enough structuring. However, in the case study of “connections” between Cognitive Science and Educational Research introduced here, we felt the need to adjust further. We confronted distinctions in researcher categories among academic affiliations: departments and/or disciplines (further noting different implications of industry, government, or other affiliations). Researcher categorization can also draw upon one’s education or affiliation (present or past). Another way is to designate one’s discipline based on where one publishes, primarily or collectively in some fashion.
Regarding research outputs further, we touch on concerns as we amalgamate to WoSCs, or further aggregations of those. We realize that we combine unlike articles, that in turn amalgamate multiple concepts. These choices are difficult; an overarching tradeoff takes place between the generalizability of treating “disciplines” by use of well-standardized categorizations vs. tailoring to capture disciplinary realities more accurately.
Here we introduce possibilities for measuring the diversity of articles based on the WoSCs of the journals they reference, but note that WoS offers a somewhat more aggregated option of “subject areas.” We gauge similarity among WoSCs based on their overall cross-citation patterns. That information enables calculation of Integration, Specialization and Diffusion scores, and visualization via science overlay maps (see the Supplement for illustrations and details). The WoS is the premier science database, but Elsevier’s Scopus is a strong competitor, with somewhat different coverage and classifications. Or, one could devise other ways to categorize research outputs. If one is considering disciplinarity measures “starting from scratch,” replicability poses an important criterion to weigh.
In an evolving field like IDR measurement, the design decisions on how one measures and analyzes interdisciplinarity have theoretical and empirical ramifications. Researchers need to pay attention to defining interdisciplinarity, operationalizing key IDR concepts, and determining categorical units; these decisions matter greatly. Particular attention needs to be paid to how to define the fields. The approach used must relate well to the kind of claims that the researcher seeks to make. For example, in measuring interdisciplinarity of research outputs or impacts, some metrics might be pertinent to the research question to be studied and others may be less so. One size doesn’t fit all. Sometimes simple raw counts of citations out of field are all that is necessary. Other times the number of out-of-field citations above a threshold may indicate a vital result. Integration or Specialization scores wouldn’t necessarily be the metric to use. In our study of Cognitive Science–Educational Research knowledge interchange, we are devising article-level indicators of such interchange (Kwon et al. 2017). The upshot is that the researcher should not simply apply a standard set of metrics and visualizations to address all IDR questions; rather, the data treatment/methods/metrics/visualizations should be tailored to the study’s research questions. The reproducibility of categories and measures, by others, is also essential.
It bears saying that we should not be misled into presuming that more interdisciplinarity (e.g., higher Integration scores) is better. Interdisciplinary processes, especially when combined with other approaches that increase research project complexity (e.g., cross-institutional collaboration, cross-national collaboration), bear considerable costs (Cummings and Kiesler 2005). The right balance of diversity in perspectives, knowledge, and methods, together with effective interpersonal dynamics, is the aim. Furthermore, the outputs of interdisciplinary research don’t generally exert greater influence, as measured by higher citation (Yegros-Yegros et al. 2015; Rafols et al. 2012; Wang et al. 2015), although we see indications of higher citation impact of papers that mediate knowledge transfer between Cognitive Science and Education (Kwon et al. 2017).
Even in our study of the connections between Educational Research and Cognitive Science, it is not clear to what extent Educational Research should be citing work in Cognitive Science, or vice versa. To be sure, in a research funding program whose explicit goal is to increase such connections, knowing how much knowledge transfer occurred and what the baselines in the field are is useful information. But it takes more focused, and likely multifactorial, research to address questions about how much or what kinds of connections support the most impactful research. And it is both a theoretical and empirical question to what extent one can generalize from the interactions of one group of disciplines to another. Lessons drawn about Cognitive Scientists and Educational Researchers will not necessarily apply to interactions between, say, Mathematicians and Geoscientists.
In conclusion, we herein share our reflections on a set of considerations involved in classifying and analyzing data pertaining to interdisciplinarity. The process of extracting these from our study has been enlightening. That effort reminds us of the accumulation of choices involved in our study and, consequently, the fragility of generalization from results obtained. In a research endeavor undergoing as many changes as is currently true with the study of interdisciplinarity, we believe that just making such challenges and decisions explicit can help move the field forward.
To those undertaking study of interdisciplinary processes, we wish all the best. We hope this treatment of disciplinary categorization helps inform your studies and welcome dialogue with you.
For a professional society keying on related issues: see http://www.scienceofteamscience.org/.
The ORCID iD (Open Researcher and Contributor ID) is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors and contributors. https://en.wikipedia.org/wiki/ORCID
Inclusion of arts and humanities boosts the number from 227 to 244, as of 2017, but these change slightly over time.
Other units include research proposals or awards.
Full texts are an alternative and other outputs, such as patents, warrant attention.
In ongoing work we are extending this to try to categorize other works as well using an algorithm based on terms appearing in the cited journal titles plus human judgment.
We forgo showing the Boundary map as that does not seem to add value.
This Supplement presents more detailed information on the categorization choices and analyses performed in our “connections” project. It is auxiliary information in support of the main paper.
In the “connections” project, we are analyzing a few different datasets. One set contains 32,121 WoS abstract records of articles that were published in 177 journals at 5 times (1994, 1999, 2004, 2009, 2014), along with their ∼1,345,000 cited references. A second dataset links to those 32,121 records; it consists of the ∼600,000 WoS abstract records that have cited those papers as of 2015. For those large datasets, we rely on information within the WoS records. So, here, we start by classifying the research outputs (i.e., the papers, not the researchers). We mainly use journal and WoSC level categorization.
Figure 1 presents our key analytical model to categorize Cognitive Science—Educational Research units of research knowledge. We built this by “tailoring” the WoSCs. For our purposes we modify how WoS groups journals as follows:
We decompose the large (230 journal) “Education and Educational Research” (Educational Research) category; e.g., “LTHCI” stands for a group of 23 Learning Technologies and Human-Computer Interface journals that we selected, of which 22 would also fall in the Educational Research WoSC.
Recall that WoS assigns about 40% of journals to multiple WoSCs. As per Figure 1, we separate 31 journals that are solely in Educational Research from others that we treat as primarily associated with the specialty shown (e.g., we consider 39 journals as Educational Psychology (Ed Psych), even though 15 of those are also found in the Educational Research WoSC).
We likewise separate the 40-journal WoSC “Education, Scientific Disciplines” into separate Science, Technology, Engineering, and Mathematics (STEM) specialties such as “ENG ED” (Engineering Education), of which one journal is also in Educational Research. This is so we can contrast how various Discipline-Based Education Research (DBER) specialties differ in their interactions with each other, Educational Research, Cognitive Science, and other disciplines.
We take keen interest in three “Boundary fields” somewhat between Cognitive Science and Educational Research: LTHCI, Ed Psych, and Applied Linguistics (7 journals, of which 6 are in Educational Research and 1 is also in Ed Psych).
We thus have 14 “CORE” categories; we collapse those in various ways (e.g., combining the 3 Boundary fields, or combining the 9 STEM-ED specialties with Educational Research).
In sum, we tailor the WoSCs to enable critical “disciplinary” comparisons of research publications and citations. This assigns the core 177 journals whose publications we pursue. It enables analyses of the citation patterns among Cognitive Science, Educational Research, Boundary fields, (if desired to separate) STEM-ED fields, and other fields. We collapse the “other” WoSCs into 14 macro-disciplinary groups for our comparisons—for instance, in examining how Math Ed interacts with Math, other STEM domains, other STEM-ED fields, Cognitive Science, and Educational Research.
Cross-Disciplinary Measurement Responses
Returning to our CORE project’s main research question, we want to ascertain the extent to which Cognitive Science and Educational Research cross-cite research in the other field. Recognizing that both address learning-related topics, we expect considerable overlap in terminology. Yet we also recognize that they have evolved as distinct “disciplines” for various reasons, perhaps more sociological than topical in nature.
As noted, we are pursuing categorization of research outputs into disciplinary units, not categorization of researchers. One factor contributing to that decision is that academic departmental affiliation is not unambiguous. Departments of Cognitive Studies may reside in Schools of Education. Conversely, the Cognitive Science Society lists Education as a component discipline. Our Boundary fields (e.g., Educational Psychology) can be represented in units on both sides of the Cognitive Science—Education divide.
Concept Level Categorization
We noted our attempt to track important themes emphasized in HPL by identifying key concepts, then looking for evidence of those in citing text segments.
A variant on concepts is to focus on words and phrases. We spent several months trying to categorize articles into Cognitive Science, Educational Research, or Boundary fields (epitomized by “Ed Psych”) based on content analyses of abstracts. We began by downloading paper abstracts from WoS for 2013. Focusing on the essential differentiation between Cognitive Science and Educational Research, we used the “Ed and Ed Research” WoSC as a starting point for Educational Research. For Cognitive Science, we explored records from multiple WoSCs with presumed Cognitive Science content. We variously explored four term fields: 1) author keywords (but coverage varied from 48% to 87% for the seven WoSCs engaged); 2) Keywords-Plus (a special WoS field based on cited titles, covering some 75% of the records); 3) title Natural Language Processing (NLP) phrases; and 4) abstract NLP phrases. We compiled the resulting term sets (using VantagePoint software) showing their frequency of occurrence. Based on our knowledge of the fields, we (Solomon, Youtie, and Porter) independently tagged what we felt were “core” terms for Cognitive Science and Educational Research, respectively. We compared lists and noted that our term selection would shift depending on whether the objective was to differentiate the two fields or to get at core concepts. This led to generation of a third list of “common” terms (to both Cognitive Science and Educational Research).
As we grappled to implement this approach, we tried additional empirical manipulations. For one, we ran a separate search on publication names containing “cognitive*” (excluding clinically related terms) and compared term prevalence there to our previous Cognitive Science sample. We also varied the extent of term processing (cleaning and consolidation to get at essential concepts). We included root Cognitive Science concepts, i.e., cognit, neuroscience, linguistic, psychology, behavioral, artificial intelligence, memory, percept, learning, brain. A similar process for ED added root terms: academic, college, learning, student. We enriched our Cognitive Science downloaded sample to 20,696 paper abstracts. We varied weighting of the four term fields and examined total instances of terms appearing in a given abstract record from WoS, as well as a count of how many distinct terms from our lists appeared in given records.
These term (words and phrases) compilations resulted in indicators of “Cognitive Science intensity” (and corresponding ED and common intensity scores). Disconcerting was the finding that exploring the journals with high occurrence of our Cognitive Science terms yielded weak coverage of what we perceive as core Cognitive Science journals. We decided not to use term prevalence to categorize papers into our three prime categories (Cognitive Science, Educational Research, and Boundary fields). Instead, as discussed above, we derived our categories for Educational Research, Boundary fields, and STEM-Ed specialties from WoSCs. And, as discussed below under “Journal Level Categorization,” we based our Cognitive Science journal set on analyses of what journals are heavily cited by papers in the journal, Cognitive Science.
The term lists do offer an interesting counterpoint in further analyses to explore content emphases and shifts. Our final versions consist of 158 Cognitive Science-oriented terms, 135 Ed-oriented terms, and 39 terms common to both fields.
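To make the idea of a term "intensity" indicator concrete, the following sketch counts weighted term instances and distinct terms in one record. It is only an illustration: the actual processing was done in VantagePoint, and the function name, the field weights, the substring matching of root terms, and the sample record are all assumptions.

```python
import re


def term_intensity(record, terms, weights):
    """Return (weighted total term instances, number of distinct terms found).

    record:  dict mapping a term field name (e.g., "title") to its text.
    terms:   list of root terms; matched as case-insensitive substrings,
             so "cognit" matches "cognitive" and "cognition" (assumption).
    weights: dict mapping field name to a weight; missing fields weigh 1.0.
    """
    weighted_total = 0.0
    found = set()
    for field, text in record.items():
        lowered = text.lower()
        for term in terms:
            hits = len(re.findall(re.escape(term.lower()), lowered))
            if hits:
                weighted_total += weights.get(field, 1.0) * hits
                found.add(term)
    return weighted_total, len(found)


# Hypothetical record, term list, and weights, for illustration only.
SAMPLE_RECORD = {
    "title": "Working memory and learning",
    "abstract": "Cognitive load affects memory in student learning.",
}
COGSCI_ROOT_TERMS = ["cognit", "memory", "percept"]
FIELD_WEIGHTS = {"title": 2.0, "abstract": 1.0}
sample_intensity = term_intensity(SAMPLE_RECORD, COGSCI_ROOT_TERMS, FIELD_WEIGHTS)
```

Reporting both numbers mirrors the two views described above: total instances reward repetition of a few terms, while the distinct-term count rewards breadth of vocabulary.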
Article Level Categorization
We generally adopt a journal level unit of analysis. That is, we do not pursue concepts or articles as our basic units; we, instead, strive to categorize articles based on the journals (or other sources) in which they were published. Put another way, we don’t differentiate articles within journals, with some exceptions.
In what fields should one place articles in multidisciplinary journals? In a separate investigation, we contrasted the extent of cross-disciplinary knowledge interchange between articles published in the multidisciplinary journals, Science and Nature, vs. those published in leading disciplinary journals (Solomon et al. 2016). We examined three fields (cell biology, physical chemistry, and cognitive science), for each of which we sampled articles published in one leading journal: Cell, Journal of Physical Chemistry, and Cognitive Science, respectively. We categorized articles appearing in those journals as being in that respective field. We generated comparable (same year) sets of articles associated with these three fields, published in Science/Nature.
We considered various ways to classify the Science and Nature articles, including assignment based on review of titles, keywords, or abstracts by disciplinary experts. The greater subjectivity of such judgments, combined with the sheer numbers of papers to be examined, rendered the approach less attractive (though not inappropriate). Coding based on assignment of authors to disciplines (based on departmental affiliation or degrees) was not pursued for reasons akin to those discussed previously. We adopted the approach of Schunn et al. (1998) to categorize articles based on the predominance of cited reference fields. Their reasoning was that if at least 25% of an article’s references were to a field, then that field should be considered a major influence.
In practice, we screened for Science/Nature articles for which at least 25% of their references, as indicated in WoS records, were to journals in Cell Biology, Physical Chemistry, or Cognitive Science. To do this we applied routines in VantagePoint software to extract the journal (source) name from each cited reference, standardize those journal names with the aid of a find/replace thesaurus, and then match them to WoSCs via another thesaurus [courtesy of Thomson Reuters (now Clarivate Analytics)].
We next built a thesaurus to group those resulting cited WoSCs into the three target disciplines (fields), consolidating as follows:
Cell Biology: Cell Biology; Biochemistry and Molecular Biology
Physical Chemistry: Physical Chemistry; Atomic, Molecular and Chemical Physics
Cognitive Science: Behavioral Sciences; Computer Science, Artificial Intelligence; Linguistics; Psychology; Applied Psychology; Developmental Psychology; Educational Psychology; Experimental Psychology; Mathematical Psychology; Multidisciplinary Psychology; Social Psychology.
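The grouping above, combined with the 25% screen, can be sketched as follows. The mapping reproduces the consolidation just listed; everything else is an assumption for illustration, including the function name, the input layout (one list of WoSCs per cited reference, since a journal may carry several WoSCs), and the rule that a reference counts once toward a discipline if any of its WoSCs maps there.

```python
WOSC_TO_DISCIPLINE = {
    "Cell Biology": "Cell Biology",
    "Biochemistry and Molecular Biology": "Cell Biology",
    "Physical Chemistry": "Physical Chemistry",
    "Atomic, Molecular and Chemical Physics": "Physical Chemistry",
    "Behavioral Sciences": "Cognitive Science",
    "Computer Science, Artificial Intelligence": "Cognitive Science",
    "Linguistics": "Cognitive Science",
    "Psychology": "Cognitive Science",
    "Applied Psychology": "Cognitive Science",
    "Developmental Psychology": "Cognitive Science",
    "Educational Psychology": "Cognitive Science",
    "Experimental Psychology": "Cognitive Science",
    "Mathematical Psychology": "Cognitive Science",
    "Multidisciplinary Psychology": "Cognitive Science",
    "Social Psychology": "Cognitive Science",
}


def major_disciplines(references, threshold=0.25):
    """Return {discipline: share} for disciplines at or above `threshold`.

    references: list with one entry per cited reference; each entry is the
                list of WoSCs assigned to that reference's journal.
    """
    if not references:
        return {}
    counts = {}
    for wosc_list in references:
        # A reference counts once per discipline, even if several of its
        # WoSCs map to the same discipline (assumption).
        disciplines = {
            WOSC_TO_DISCIPLINE[w] for w in wosc_list if w in WOSC_TO_DISCIPLINE
        }
        for discipline in disciplines:
            counts[discipline] = counts.get(discipline, 0) + 1
    total = len(references)
    return {d: n / total for d, n in counts.items() if n / total >= threshold}


# Hypothetical reference list for one Science/Nature article.
SAMPLE_REFERENCES = [
    ["Linguistics", "Experimental Psychology"],
    ["Psychology"],
    ["Computer Science, Artificial Intelligence"],
    ["Cell Biology"],
    ["Neurosciences"],
    ["Neurosciences"],
    ["Multidisciplinary Sciences"],
    ["Genetics and Heredity"],
]
sample_major_disciplines = major_disciplines(SAMPLE_REFERENCES)
```

Because the shares are computed over all cited references, including those outside the three target disciplines, an article can qualify for more than one discipline or for none.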
This illustrates several issues. First, exactly what should we use to constitute the “disciplines” under study? Even for the relatively clear-cut two here (Cell Biology and Physical Chemistry), there are non-trivial choices. One could certainly argue with each of these; e.g., “PChem” reflects different training and research questions, to a degree, than does Chemical Physics. And, we also have the composite WoSC, “Atomic, Molecular and Chemical Physics” included here. Likewise, Cell Biology has important distinctions from Biochemistry and Molecular Biology.
But the great challenge is what constitutes Cognitive Science? For one, note that we differ in this analysis than in other ones by including Ed Psych in Cognitive Science, whereas later we will separate it and treat it as a Boundary field between Cognitive Science and Educational Research. We assert that it is reasonable to operationalize a given “discipline” differently, depending on the nature of the analysis involved. We concede that this is less than ideal in that it messes up comparisons among studies that define disciplines differently. Here, our target is the comparison of citation patterns by and to Cognitive Science articles that are published in Science/Nature vs. in the journal, Cognitive Science. In the other study used as our primary example, we focus on citation pattern differences among Cognitive Science, Educational Research, and the Boundary fields, so where Ed Psych is located becomes critical.
WoS does not have a category for Cognitive Science journals as such. Cognitive Science is an emerging field drawing on such diverse fields as Artificial Intelligence, Linguistics, Neuroscience, Philosophy, and Psychology (Schunn, Crowley, and Okada 1998; Thagard 2005). We opted not to include Neuroscience, after exploring it as a possible fourth “discipline.” We addressed how to deal with the considerable Neuroscience overlap with Cognitive Science. We examined sets of Science/Nature articles to yield some four levels that we included in our Cognitive Science sample:
Least strongly Cognitive Science—having >= 25% of references in Cognitive Science-related WoSCs and also no more than 30% of references in Neuroscience
Next tier—having >= 25% of references in Cognitive Science-related WoSCs and a similar number of references in Neuroscience
Higher—having >= 25% of references in Cognitive Science-related WoSCs and fewer references in Neuroscience
Most strongly Cognitive Science—having >= 50% of references in Cognitive Science-related WoSCs and fewer references in Neuroscience
This reflected a tradeoff between a shortage of Cognitive Science articles in Science/Nature vs. the very plentiful Neuroscience articles in those journals. Separating Neuroscience was critical to this experimental comparison given our sense that constituencies and citation patterns vary greatly between it and Cognitive Science. We compared samples and found big differences between the 34 articles in our Science/Nature sample set and Neuroscience articles in terms of the journals they reference, the terminology of their abstract records, and the extent and concentrations of journals citing those articles.
For present purposes, distinguishing between Cognitive Science and Neuroscience illustrates the sensitivity of experimental conclusions to the choices made in disciplinary categorization. We would have liked to draw on only the highest of the four levels listed above, but we also wanted more than the 14 articles in our time period in Level 4. Levels 2 and 3 added 11, and we extended our criteria to pick up the 10 in Level 1.
Journal Level Categorization
Figure 1 presents our grouping of 177 journals into 14 CORE categories (“core” named after the NSF EHR Core Research [ECR] program funding this project; it addresses fundamental research in STEM Education). Here’s how we arrived at this. Our driving research question is whether Educational Research articles have increased their citation of Cognitive Science research. As our thinking coalesced, we sought to distinguish “Boundary fields” that seem to overlap Cognitive Science and Educational Research. We also wanted to treat the several STEM-ED sub-fields separately in certain analyses, collapsing them for other purposes.
We began by seeking to identify a set of core Cognitive Science journals. As treated in the previous “Article Level” section, WoS does not have a ready-made subject category for Cognitive Science. So we looked to determine what might constitute a viable Cognitive Science set. We turned to Goldstone and Leydesdorff’s (2006) seminal bibliometric analysis of the flagship journal Cognitive Science, and their later (Leydesdorff and Goldstone 2014) analysis. They use factor analyses to differentiate five “fields” heavily cited by 904 papers, finding relative constancy over time. They list journals that are highly cited in Cognitive Science. We modified this list slightly by including journals that have been introduced since 2006 and dropping journals that were expressly multidisciplinary (e.g., Science and Nature) or more clinical or biological in our judgment (relying heavily on Gregg Solomon). The resulting list yielded 42 Cognitive Science journals. We note that all 42 journals were assigned by the WoS to at least one of the WoS Categories relevant to Cognitive Science (e.g., Experimental Psychology, Linguistics, Artificial Intelligence, and Neuroscience). As discussed in the prior “Article Level” section, this entails judgment.
We used the WoSCs to select journals for the Boundary fields. For Educational Psychology we included journals coded with the WoS category: “Psychology, Education.” Learning Technology/Human Computer Interaction journals were those included in the WoS Category “Computer Science Interdisciplinary Applications.” And Applied Linguistics journals were those falling in the WoS categories “Linguistics” and “Language and Linguistics.” The 69 Boundary field journals include 39 from Educational Psychology, 23 from Learning Technology/Human Computer Interaction, and 7 from Applied Linguistics.
For STEM-ED, we divided journals in the WoSC “Education, Scientific Disciplines” into particular, small sub-fields (but for most analyses we collapse these either as STEM-ED or into Educational Research).
Figure 1 shows Educational Research as the remaining 31 journals of the WoSC “Education + Education Research” not captured by another of our 14 CORE categories. We do so to sharpen contrasts among the target “disciplines” to address that main research question – has Educational Research citation of Cognitive Science changed over time? Were we, for instance, to include the 22 of 23 LTHCI journals that WoS also locates in “Ed + Ed Research” we would heavily confound the categories. That said, we recognize that our categorization places a particular slant on the analyses.
Article vs. Journal Level Categorization
Given the multiple concerns raised so far regarding how best to categorize records by discipline (or such), we tried another way. Dual objectives were to validate our journal-based assignments (Figure 1) and seek potentially easier and more effective means.
We are developing a Bayesian auto-classifier capability in VantagePoint (Cassidy, under submission). This “learns” the distribution of words by category for a training set of records and then infers the category of additional records. Our training set was 300 of the 32,121 WoS abstracts. Those were assigned to the three main categories—Cognitive Science, Educational Research, Boundary fields. The auto-classifier processed the terms in the 300 records’ titles and abstracts. It then applied that knowledge to associate the other 31,821 records with one of the three categories.
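To make the idea concrete, here is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing. This is a generic textbook formulation, not the VantagePoint implementation, whose internals are not described here; the tokenizer, the function names, and the toy training records are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer keeping alphabetic tokens only (a simplification)."""
    return [t for t in text.lower().split() if t.isalpha()]


def train(records):
    """Learn per-category document counts and word counts from (text, category) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in records:
        tokens = tokenize(text)
        doc_counts[category] += 1
        word_counts[category].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the category with the highest log posterior for the given text."""
    doc_counts, word_counts, vocab = model
    n_docs = sum(doc_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen words
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / n_docs)  # log prior
        total_words = sum(word_counts[category].values())
        for tok in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[category][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy training set (invented text, for illustration only).
toy_training = [
    ("working memory attention priming", "Cognitive Science"),
    ("semantic memory recall priming", "Cognitive Science"),
    ("classroom teachers curriculum policy", "Educational Research"),
    ("teachers students school curriculum", "Educational Research"),
    ("learning motivation students memory", "Boundary"),
]
toy_model = train(toy_training)
```

With this toy model, `classify("attention and memory priming", toy_model)` favors Cognitive Science because those terms occur most often in its training text. A production classifier would likely add stop-word handling, phrase extraction, and a much larger training set.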
We compared the resulting assignments with the placement of those records based on journal titles (Table 1). A composite indicator, F1, combines precision (i.e., the share of records so classified that do belong) and recall (i.e., the share of records that should be included that are). Classification is best for Cognitive Science (F1 = 76%), indicative of reasonably unique vocabulary and of a training set containing double the number of Cognitive Science records as of either Educational Research or Boundary. Worst is Boundary (F1 = 44%), as expected in that it seems apt to share terminology with both Cognitive Science and Educational Research. Educational Research categorization is in between (F1 = 58%). We believe the better accuracy for Cognitive Science vs. Educational Research reflects more particular terms in heavy use in Cognitive Science, while Educational Research has both an inherently general vocabulary and diverse terms specific to subfields (e.g., Physics Ed; Special Ed).
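For readers less familiar with these indicators, the following short sketch computes precision, recall, and F1 (their harmonic mean) from a confusion matrix laid out as classifier assignment by actual category. The function name and the counts are hypothetical and are not the values of Table 1.

```python
def precision_recall_f1(confusion, category):
    """Compute precision, recall, and F1 for one category.

    confusion[assigned][actual] = number of records the classifier assigned to
    `assigned` whose journal-based category is `actual`.
    """
    true_pos = confusion[category][category]
    assigned_total = sum(confusion[category].values())            # row total
    actual_total = sum(row[category] for row in confusion.values())  # column total
    precision = true_pos / assigned_total if assigned_total else 0.0
    recall = true_pos / actual_total if actual_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented two-category counts, for illustration only.
toy_confusion = {
    "A": {"A": 6, "B": 2},  # assigned to A: 6 truly A, 2 truly B
    "B": {"A": 4, "B": 8},  # assigned to B: 4 truly A, 8 truly B
}
toy_precision, toy_recall, toy_f1 = precision_recall_f1(toy_confusion, "A")
```

Here precision for A is 6/8 and recall is 6/10. Because F1 is a harmonic mean, it is pulled toward the lower of the two values.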
| || ||Actual Journal Category|
| || ||Cognitive Science||Educational Research||Boundary|
|Classifier Assignment||Cognitive Science||10515||999||956|
It is interesting to note that the Cognitive Science and Educational Research journal-based categories are clearly delineated in terms of word distribution. This seems to imply that exchange of ideas occurs primarily in interdisciplinary sources. The next question is where Boundary journals sit on the spectrum. The matrix (Table 1) shows that items appearing in a Boundary journal are more likely to be “misclassified” (based on title and abstract terminology) as Educational Research than as Cognitive Science (37% vs. 13%). This suggests that Boundary field topical coverage, based on terminology, has more in common with Education than with Cognitive Science. Going the other way, Educational Research articles are more likely than Cognitive Science articles to be categorized as Boundary, based on terminology. Again, this suggests more commonality between Boundary and Education than between Boundary and Cognitive Science. Finally, we gauge the interaction between Cognitive Science and Educational Research by investigating how often one is mistaken for the other in term-based classification. Education documents were somewhat more likely to be assigned to Cognitive Science than vice versa (11% vs. 9%), which implies that the Educational Research field is borrowing more ideas from Cognitive Science than Cognitive Science is borrowing from Educational Research.
Journal-based assignment does match this term-based assignment to a reasonable degree. And term-based classification shows promise in exploring these cross-disciplinary engagements.
As described earlier, Integration scores reflect the diversity of the references cited by papers. We calculate these for references to papers published in WoS indexed journals.7 We wondered whether Cognitive Science, Educational Research, and Boundary papers differ in the degree of Integration (measured in this manner).
As an extreme example, take one Boundary paper, “Theories and the Good: Toward Child-Centered Gifted Education.” Of its 50 cited references, those to papers in any of our 177-journal set are all categorized as “Ed Psych,” which yields an Integration score of 0. At the other extreme, the 68 cited references of an Educational Research paper, “Self-plagiarism and unfortunate publication: an essay on academic values,” include papers in 26 WoSCs (e.g., Biochemical Research Methods, Ethics, Paleontology), yielding our highest Integration score of 0.85 (the maximum is 1).
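The two extremes can be reproduced in principle with a short sketch. We assume here the Rao-Stirling form commonly used for Integration scores (Porter and Rafols 2009), I = 1 − Σ s_ij p_i p_j, where p_i is the share of a paper's cited references in WoSC i and s_ij is the similarity between WoSCs i and j. The function name, the similarity lookup keyed by category pairs, and the sample values are hypothetical.

```python
def integration_score(wosc_counts, similarity):
    """Rao-Stirling style Integration score: 1 - sum_ij s_ij * p_i * p_j.

    wosc_counts: {WoSC name: number of cited references in that WoSC}
    similarity:  {frozenset({wosc_a, wosc_b}): similarity in [0, 1]};
                 missing pairs are treated as 0, and s_ii is taken as 1.
    """
    total = sum(wosc_counts.values())
    if total == 0:
        return 0.0
    shares = {c: n / total for c, n in wosc_counts.items()}
    weighted = 0.0
    for i, p_i in shares.items():
        for j, p_j in shares.items():
            s_ij = 1.0 if i == j else similarity.get(frozenset((i, j)), 0.0)
            weighted += s_ij * p_i * p_j
    return 1.0 - weighted


# All references in a single category: no diversity, score 0.
single_field = integration_score({"Educational Psychology": 50}, {})

# Two equally cited categories with an invented similarity of 0.5.
two_fields = integration_score(
    {"Ethics": 10, "Paleontology": 10},
    {frozenset(("Ethics", "Paleontology")): 0.5},
)
```

A paper citing a single WoSC scores 0, as in the first example above. The score rises as references spread across more categories and as those categories are less similar to one another.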
Table 2 summarizes our findings. The averages are similar, with 0.50 in the middle range of values we have observed in samples of diverse fields (Porter and Rafols 2009). However, with such large samples, the differences are statistically significant. We hesitate to make too much of the small differences, but finding that Boundary fields have the most diverse draw of references is consistent with our calling them “boundary” [including Ed Psych, Applied Linguistics, and LTHCI (Learning Technologies and Human Computer Interaction)].
|Category||# Records||Mean Integration Score||Standard Deviation|
Science Overlay Mapping
As introduced earlier, this sort of visualization provides another perspective on cross-disciplinarity. We depict research activity superimposed on a background map of the 224 WoSCs, using the same data used to estimate the diversity of cited references for our Cognitive Science, Educational Research, and Boundary field paper sets. Were we to do so for the two extreme examples just mentioned, the Ed Psych paper’s cited WoSCs would appear as a single spotlighted node. In contrast, the Educational Research paper would show activity in 26 nodes. If it cited many papers in one of those WoSCs, that colored node would be larger.
Here we map the collective publications noted in Table 2, showing the distributional pattern for Cognitive Science and Educational Research (Figure 2). (The map scaling differs as the relative numbers of cited WoSC instances range widely; e.g., the most cited by our 15,455 Cognitive Science papers is Psychology, 132,393 times, while the most cited by the 9,082 Educational Research papers is “Education and Educational Research,” 56,615 times.) Table 3 indicates the top 10 cited WoSCs by each field. Boundary fields do fall between the other two, more resembling Cognitive Science.8 For some analyses we treat the STEM Education categories separately (Figure 1); here they are collapsed into Educational Research. Note the citation of STEM WoSC journals by Educational Research, a sharp difference from Cognitive Science, visible in the figure but clearer in Table 3.
|Cited WoSCs||CogSci||Boundary||EdRes|
|EDUCATION & EDUCATIONAL RESEARCH||1||1|
|EDUCATION, SCIENTIFIC DISCIPLINES||2|
This work was supported by a grant from the US National Science Foundation, Directorate for Education and Human Resources (DRL-1348765) to Search Technology, Inc., with major involvement by Georgia Tech. While serving at the National Science Foundation, GS was supported by the IR/D program of the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Foundation.