Abstract
This article explores the global implementation of the FAIR Guiding Principles for scientific data management and stewardship, which provide that data should be findable, accessible, interoperable and reusable. The implementation of these principles is designed to lead to the stewardship of data as FAIR digital objects and the establishment of the Internet of FAIR Data and Services (IFDS). If implementation reaches a tipping point, the IFDS has the potential to revolutionize how data is managed by making machine- and human-readable data discoverable for reuse. Accordingly, this article examines the expansion of the implementation of the FAIR Guiding Principles, especially how and in which geographies (locations) and areas (topic domains) implementation is taking place. A literature review of academic articles published between 2016 and 2019 on the use of the FAIR Guiding Principles is presented. The investigation also includes an analysis of the domains in the IFDS Implementation Networks (INs). The investigation found that uptake of the FAIR Guiding Principles has been mainly in the Western hemisphere and that implementation has taken firm hold in the bio- and natural sciences. To achieve a tipping point for FAIR implementation, it is now time to ensure the inclusion of populations of non-European ancestry and of other scientific domains. Apart from raising issues of equal opportunity and genuine global partnership, a permanent European bias poses challenges with regard to the representativeness and validity of data and could limit the potential of the IFDS to reach across continental boundaries. The article concludes that, despite efforts to be inclusive, acceptance of the FAIR Guiding Principles and the IFDS in different scientific communities is limited and that there is a need to act now to prevent the momentum in the development and implementation of the IFDS from being dampened.
It is further concluded that policy entrepreneurs and the GO FAIR INs may contribute to making the FAIR Guiding Principles more flexible in including different research epistemologies, especially through the GO CHANGE pillar.
1. INTRODUCTION
Today, the traditional paradigm of limited use (and reuse) of data no longer fits the expectations of researchers and practitioners. This creates space for new ideas, such as those contained in the FAIR Guiding Principles for scientific data management and stewardship, to be put forward to fill this gap [1]. The collective starting point of the FAIR Guiding Principles dates back to a workshop in 2014, entitled “Jointly Designing a Data FAIRport”. The objective of this workshop was to unlock the potential of digital data by making it discoverable and usable for science through machine and human reading processes [1]. The idea is that public responsibility is needed to embed open and digital data-powered science guided by public interest [2]. After the workshop, FAIR experienced rapid development, gaining recognition from the European Union, G7, G20 and US-based Big Data to Knowledge (BD2K), as well as the African Research Cloud [2]. In 2017, the European Union adopted a resolution supporting FAIR and appointed a High Level Expert Group to advise on the establishment of a European Open Science Cloud (EOSC) [3]. In the following year, an international implementation strategy to realize the Internet of FAIR Data and Services (IFDS) was put in place [4,5].
The FAIR Guiding Principles establish a direction to facilitate digital data-driven science [1]. They establish a new paradigm that builds on digital data, protocols and technologies from the Internet and the “Internet of Things”, providing the basis for digital data-driven science. FAIR boldly presents a directional solution within a new paradigm of data-driven decision making, with evidence-based solutions driving developments “in almost every societal domain” [2]. FAIR gives a direction in which new practices should be developed, namely, that: (i) data should serve the public interest and public data should be governed by public policy, (ii) data-driven science should expand the collective knowledge in Europe and internationally, (iii) science should serve practical solutions and services, and (iv) scrutiny by, and the involvement of, citizens and the general public in knowledge discovery will democratize science and its use for services [2]. The idea behind the principles is that an open and collaborative approach is needed for a contemporary science that is serviceable to society and in which data are understood to be a global good and public resource, as opposed to the science of the previous century, which is seen as driven by privileged scientific elites, individualistically oriented aspirations and closed approaches. FAIR and open science bring research squarely and fully into the 21st century.
The primary objective is the expansion of the envisioned IFDS to all scientific disciplines, geographies and societies. The early implementation of the principles has been organized through FAIR Data Points (FDPs). The FDPs act as nodes in the IFDS: each provides a machine-readable semantic and software layer over data resources, which makes these resources discoverable and reusable for both humans and machines through relevant semantic networks. While FAIR is fully supported by the European Union as a future direction to guide the establishment of the IFDS, its success depends on its uptake in all geographies and domains. This article presents the results of an investigation into the uptake of FAIR in order to understand whether a global “tipping point” for FAIR has been attained.
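To make the idea of a machine-readable metadata layer more concrete, the following minimal Python sketch describes a dataset with terms from the W3C DCAT and Dublin Core vocabularies and checks that the fields a harvester would need in order to find and access the data are present. It is an illustration only and not the FAIR Data Point specification: the dataset, its identifier and URLs, the list of required fields and the `missing_fields` helper are all hypothetical.

```python
import json

# Hypothetical dataset description expressed as JSON-LD.
# The vocabulary prefixes (dcat:, dct:) refer to W3C DCAT and Dublin Core Terms;
# every value below is a made-up placeholder, not a real record.
record = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/clinic-visits-2019",
    "@type": "dcat:Dataset",
    "dct:title": "Clinic visits 2019 (illustrative)",
    "dct:identifier": "https://example.org/dataset/clinic-visits-2019",
    "dct:license": "https://creativecommons.org/licenses/by/4.0/",
    "dcat:keyword": ["health", "outpatient"],
    "dcat:distribution": {
        "@type": "dcat:Distribution",
        "dcat:accessURL": "https://example.org/data/clinic-visits-2019.csv",
        "dcat:mediaType": "text/csv",
    },
}

# Assumed minimum for this sketch: an identifier and title (findable),
# a distribution with an access URL (accessible) and a license (reusable).
REQUIRED = ("@id", "dct:title", "dct:identifier", "dct:license", "dcat:distribution")


def missing_fields(metadata, required=REQUIRED):
    """Return the required keys that are absent or empty in the metadata."""
    return [key for key in required if not metadata.get(key)]


# Serialize the record so that other software can parse it.
serialized = json.dumps(record, indent=2)
print(serialized)
print("missing:", missing_fields(record))
```

The point of the sketch is that the description, not the data file itself, is what software reads first: a record with no license or access URL may be perfectly readable to a person and still be unusable for automated discovery and reuse.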
2. IMPLEMENTATION OF FAIR IN DIFFERENT LOCATIONS
The number of articles citing the original article on the FAIR principles [1] stood at 1,570 (Google Scholar, 3 September 2019, 17:00). To investigate FAIR implementation, a literature review was conducted of 100 randomly selected academic journal articles citing the founding article [1] and published in the period 2016 to 2019. The articles were analyzed using a closed coding-labeling analysis, looking at implementation in terms of: (i) geography, (ii) science discipline (topic), and (iii) reported rate of success.
The results show that the FAIR Guiding Principles have been implemented mainly in European geographies (67%, counting Europe and Europe+ together) and to a lesser extent in American geographies (14%) (Figure 1), together accounting for 81% of implementation efforts. The Southern hemisphere is largely excluded from implementation efforts at this point.
The vast inequality in worldwide implementation is of concern. First of all, the enormous potential of open and FAIR-driven science could widen the digital divide, rather than help close it. Furthermore, for global challenges, such as the ambitions set in the Sustainable Development Goals (SDGs), a bias in studies might be dangerous. For example, Sirugo, Williams and Tishkoff [6] conclude that “the majority of studies of genetic association with disease have been performed on Europeans [and this] bias has important implications for risk prediction of diseases across global populations.” Dresser [7] identified how historical power relations shape patterns of inclusion and exclusion in medical research and pointed out that the gender dimension of this is reflected in the “white male” being accepted as the norm. The researchers argue that there is a need to include more diverse populations in empirical examples and theoretical reasoning. Their findings resonate with concerns that research biased towards men and populations of European ancestry limits the generalizability of research findings, which negatively affects the understanding and treatment of people with under-represented ancestries [8]. Research bias also negatively affects the ability to explain the causes of disease, which in turn affects the accuracy of treatment and care. Furthermore, interventions that have proven their value in, for instance, Europe may have unexpected negative side effects in other regions. This concern is relevant not only for research in the bio-sciences, but also in other disciplines.
3. REPRESENTATION OF RESEARCH DOMAINS
An analysis of the disciplines in which the FAIR Guiding Principles are implemented shows that the bio- and natural sciences are heavily overrepresented, constituting 95% of the sample, and that there appears to be extremely limited implementation of the principles in the social, political, law, humanities and other sciences (5%). The lack of adoption of the FAIR Guiding Principles outside the bio- and natural sciences is partially because those domains do not see FAIR data stewardship as offering a solution to a problem that needs to be solved. In these domains the problems to be solved are not data issues, or are data issues that the FAIR Guiding Principles cannot resolve.
Many articles in the bio-science sector aim to provide a practical guide for producing a high-quality description of biomedical datasets, i.e., scientific data management and stewardship, and for building and sustaining data infrastructure, focusing on how data infrastructure can make data machine actionable following the FAIR Guiding Principles. The large majority of implementation efforts, as reported, focus on data reuse. The implementation efforts reported examine various workflows around research data management with the FAIR Guiding Principles within labs at institutions. The articles analyzed report problems with data that are too poorly described (i.e., poor metadata) for others to understand what they are and how they were produced. This lack of contextual embedding in the data presents problems for their reuse.
Discipline | Africa | America | Europe | Europe+ | Other | Unknown | Total
---|---|---|---|---|---|---|---
Bio-science | 1 | 11 | 53 | 5 | 2 | 12 | 84
Natural science | 0 | 3 | 5 | 1 | 0 | 2 | 11
Social science | 0 | 0 | 3 | 0 | 1 | 1 | 5
Total | 1 | 14 | 61 | 6 | 3 | 15 | 100
Source: Created by authors, Stokmans & Basajja (2019)
Note: Europe+ denotes Europe together with other continental geographies, mainly the United States of America.
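The percentages quoted in the text can be recomputed from the table. The short Python sketch below does so from the counts as printed; treating Europe and Europe+ together as "European geographies" is an assumption about how the 67% figure was derived, and the function and variable names are illustrative.

```python
# Counts transcribed from the table above (rows: discipline, columns: geography).
GEOGRAPHIES = ["Africa", "America", "Europe", "Europe+", "Other", "Unknown"]
COUNTS = {
    "Bio-science": [1, 11, 53, 5, 2, 12],
    "Natural science": [0, 3, 5, 1, 0, 2],
    "Social science": [0, 0, 3, 0, 1, 1],
}


def column_totals(counts):
    """Sum each geography column across all disciplines."""
    return [sum(column) for column in zip(*counts.values())]


def share(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


row_totals = {discipline: sum(row) for discipline, row in COUNTS.items()}
geo_totals = dict(zip(GEOGRAPHIES, column_totals(COUNTS)))
n = sum(row_totals.values())  # sample size

# Assumption: "European geographies" groups Europe with Europe+.
europe_share = share(geo_totals["Europe"] + geo_totals["Europe+"], n)
america_share = share(geo_totals["America"], n)
bio_natural_share = share(
    row_totals["Bio-science"] + row_totals["Natural science"], n
)

print(f"sample size: {n}")
print(f"Europe incl. Europe+: {europe_share:.0f}%")
print(f"America: {america_share:.0f}%")
print(f"bio- and natural sciences: {bio_natural_share:.0f}%")
```

Under that grouping, the counts reproduce the 67%, 14% and 95% reported in the text. Note that 15 of the 100 articles have an unknown geography, so the regional shares are lower bounds rather than exact proportions.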
Combining data from implementation in different geographies and research domains shows an early adoption of the FAIR Guiding Principles by the biomedical research community. This concurs with the origin of FAIR being from “a group that was mainly perceived as coming from a bio-science background” [2] and the fact that an early adopter was ELIXIR [9], which “unites Europe's national bioinformatics Nodes into a single infrastructure that underpins the management and safeguarding the increasing volumes of data generated by publicly-funded research” [2].
Mons [2] recognizes the challenges to organizing data reuse across the disciplines, but he does not identify such problems as domain specific:
The problems that hinder data reuse in the life sciences – ambiguity of symbols, too many persistent identifiers for the same concept, semantic drift, and linguistic barriers, the description of the analytical methodologies, tools, and their capabilities, and the need for adequate and accurate citation – are all in various shades of severity, also problematic in other domains, such as the humanities or law. [2]
Mons [2] concludes that FAIR is “not a life sciences hobby” and that consortia in the various fields need to be developed in order to help expand the engagement with FAIR data stewardship in other scientific domains. It is critically important that the European Commission require all publicly-funded scientific research to use FAIR-based data stewardship to recognize that such data are generated as a public good, enabled and supported by public means. From 2020 onwards, research in all European academic institutions will be required to progressively introduce FAIR-based data stewardship and this will, therefore, affect all research domains.
4. IMPLEMENTATION NETWORKS
A second study was conducted into the establishment of GO FAIR Implementation Networks (INs). This was based on information provided by the GO FAIR International Support and Coordination Office (GFISCO). The INs stem from recognition that the introduction of FAIR-based data stewardship requires support. The GFISCO was set up in 2018 to coordinate activities aimed at developing a global Internet of FAIR Data and Services (IFDS). A critical part of this is the GO FAIR INs, which bring together FAIR communities. The INs provide the building blocks for the development of the IFDS by defining and creating materials and tools that are relevant to data stewardship in particular domains, support areas or geographies [10]. Given the 2020 target for FAIR data stewardship in all European-funded science, the question arises as to which domains (and geographies) are reflected in the INs.
The 33 INs listed in March 2019 were analyzed using the information in their establishment documents (“manifestos”), of which 12 were available. At the time of data analysis, 13 INs were listed as active, while the remainder were listed as interested or preparatory. In addition, a meeting of INs held in January 2019 was attended, which provided additional information on the direction of, and obstacles to, FAIR uptake from a global perspective. All 13 active INs are European or European+ initiatives. Figure 3 indicates that all topic domains are represented almost equally, except information technology (IT). There are 5 INs that focus on software for FAIR implementation.
The implementation of FAIR is managed through three pillars of activities: (i) socio-cultural change involving relevant stakeholders working towards alignment, leverage and synergy (GO CHANGE), (ii) training professional data stewards (GO TRAIN), and (iii) designing and building connective tools (GO BUILD) (www.go-fair.org/go-fair-initiative/strategy). The information about the 13 active INs indicates that 5 INs work on GO CHANGE, 11 on GO BUILD and 2 on GO TRAIN. Further analysis indicates that 4 INs work on more than one activity, and 9 INs work on GO BUILD, but not GO CHANGE.
5. ACCEPTANCE OF FAIR IMPLEMENTATION
While FAIR provides flexible guidelines, its implementation appears critically biased towards the bio- and natural sciences. This is despite efforts to integrate other domains, such as social sciences, law, economy and humanities [2]. The implementation of FAIR is also biased towards the Western and Northern hemispheres (the European Union and the United States of America) and focuses heavily on building software. This might be partially explained by the policy framework provided by the European Union and the funding available to support this [3]. However, this does not explain the bias among different sciences inside Europe. The relevance of the FAIR Guiding Principles for science depends on their potential to achieve inclusiveness [11,12]. Inclusiveness can be reached through the GO CHANGE pillar, but this pillar is not the focus of the majority of active INs.
Hence, there is a need to explore the barriers and opportunities that the worldwide implementation of the FAIR Guiding Principles may encounter. This analysis is approached from two angles. The first is the public policy agenda setting that has defined the architecture of the FAIR Guiding Principles. This is important as it shapes confidence in the relevance of FAIR as a framework for managing data through combined machine and human reading, and in whether that framework will continue to serve the public interest in the future. The second is the acceptance of the FAIR Guiding Principles by researchers and practitioners and their use of these principles in daily practice.
From the perspective of public policy, the uptake of the idea of the FAIR Guiding Principles for IFDS depends on the opening of a policy window to expand the agenda among a new generation so as to have a revolutionary impact on society. Kingdon [13] describes a “policy window” as a moment in time in which new ideas can reach the policy agenda. A policy window is time-bound – it opens, but it also closes. A policy window allows for new and unexpected ideas to be taken into consideration by policymakers and other policy entrepreneurs, sometimes because the old paradigm no longer seems to fit, opening space for new ideas whose time has come [14].
Kingdon [13] defines three streams towards a tipping point for new ideas to enter a policy agenda: the problem stream, the policy stream and the political stream. In the problem stream, issues are identified and acknowledged as a problem that needs solving (e.g., the informational value of data not being sufficiently or adequately utilized). In the policy stream, policy alternatives are developed in relation to solving certain problems (e.g., the conceptualization of the FAIR Guiding Principles as a solution to insufficiently and inadequately utilized data). In the political stream, political will is identified to address how a certain issue is shaped (e.g., the European Commission embraces the application of the FAIR Guiding Principles to research as a public good; see Dutch Techcentre for Life Sciences [15]; European Commission Expert Group on FAIR Data [16]; also see the consensus among African governments as demonstrated in the Digital Regional East African Community Health initiative). According to this perspective, the FAIR Guiding Principles provide a solution to the unsatisfying idea that research money is largely spent to generate data that is used only once for a specific project. Hence, the informational value of data is neither utilized sufficiently nor adequately. Now, due to digitalization, data can easily be stored, shared, integrated, and reused to generate more knowledge about the world [1], opening up many possibilities. Whether or not an idea is taken up depends on whether the three streams become synergetic at a certain point, leading to a policy window opening. In line with Birkland [17], we suggest that synergy between the three streams can be triggered by focusing events that “change the dominant issues on the agenda in a policy domain” (p. 1). Importantly, focusing events may lead to the mobilization of people and groups around an idea.
Worldwide implementation of FAIR can be hindered if a policy window does not open in geographies outside Europe. Hence, both the urgency of the problem that data are not sufficiently used and the notion of the FAIR Guiding Principles as a workable solution to this problem should be promoted worldwide.
The opening of a policy window can be promoted by “social entrepreneurs” who are actively engaged in Kingdon's three streams. Social entrepreneurs can identify challenges, which can include “simple” barriers such as: “I can't find or afford enough data scientists to analyze my data”; “data sources are not FAIR and require our finite resources to be data wranglers”; and “we have financial constraints”. Social entrepreneurs can raise the urgency of the identified problem, promote a specific idea (solution or policy) and activate the political will to embrace the suggested idea as appropriate to solve the identified problem. Gladwell's theory of a “tipping point” [18] is relevant here, as is the conceptualization of Wood [19], who relates the tipping point to the policy images that help shape a policy window: “such images are not randomly determined, but are shaped by the combined quality of the message, the persuasiveness of the messenger, and the context of the policy environment” [19]. Wood further argues that policy is “the product of an innovative idea coupled with an opportunistic group of entrepreneurs occurring in a receptive policy environment” [19]. To attain sufficient momentum, social entrepreneurs should therefore convince people who are active in one or more of the three streams to embrace the FAIR Guiding Principles.
It is important to note that each stream is also affected by people who are directly or indirectly affected by the acceptance of the FAIR Guiding Principles, such as researchers and practitioners, funders and publishers. The acceptance of the FAIR Guiding Principles not only depends on the quality of the message (content and source), but also on the context. According to dual process models, which are frequently used to study persuasive communication (MacInnis & Jaworski [20]; Chaiken & Trope [21], preface; Stanovich & West [22]; Kahneman [23]; Evans & Stanovich [24]), this context includes (among other things) the motivation as well as the beliefs and opinions of the receiver of the message. This includes perceptions on the epistemological paradigms of science that are regarded as relevant [29].
The motivation to listen to a message about FAIR, and the beliefs and opinions receivers have about data collection, use and reuse, may differ substantially. Motivation, which is defined as the willingness to process the information in a message [20], is affected by the perceived usefulness or instrumentality of the message, in this case the FAIR Guiding Principles. This perceived instrumentality is dependent on the recipient's perception of the role FAIR has within the social process of data management and data collection and the objectives the receiver has for the use of the FAIR Guiding Principles. These roles and objectives may be interlinked, but will probably differ between policymakers, data managers, researchers and practitioners. Social entrepreneurs within the agenda-setting arena should be aware of the different motivations and objectives that different stakeholders hold in accepting and using FAIR Guiding Principles to persuade stakeholders to embrace these principles and facilitate the multiple streams to create a policy window for the adoption of the principles. In this respect, the activity GO CHANGE is crucial in motivating potential users of FAIR Guiding Principles.
According to the dual process models, beliefs and opinions about data collection, use and reuse will affect the acceptance of the FAIR message. The beliefs and opinions of the receiver of the message can be regarded as a reference point from which the content of the communication is evaluated. If the content of the message does not correspond or resonate with the receiver's frame of reference, the receiver will probably not accept the message and may even formulate strong counterarguments to safeguard their own beliefs and opinions (contrast Hovland, Harvey & Sherif [25] and MacInnis & Jaworski [20] on the effect of communication). In this regard, it is assumed that researchers and practitioners have strong ideas about data collection, use and reuse, as researchers traditionally work with data and these beliefs are at the core of their identity and practices. Embedded assumptions unrelated to FAIR are also a factor [26]. Any discrepancy between the view of research on which the FAIR Guiding Principles are built and the opinions and beliefs of researchers and practitioners who are expected to use these principles can be measured as cultural entropy [27]. If cultural entropy is large, it will hinder the acceptance of the FAIR Guiding Principles. Hence, it is important that the activity GO CHANGE is initiated together with GO BUILD. The IFDS can only reach a tipping point if it receives broad global acceptance in most if not all of the research domains, which is currently not the case.
6. CONCLUSION
The FAIR Guiding Principles have experienced significant expansion in acceptance and implementation, although implementation is largely limited to the Western hemisphere and to the bio- and natural sciences (95% of articles reviewed). The societal implications of the IFDS could be larger than the impact of the Internet, as George Strawn observed at the close of the International GO FAIR Implementation Networks meeting [28]. However, in order to reach a worldwide tipping point, global coverage of INs across disciplines is necessary to guarantee population generalizability as well as the development of valid models to guide research and, in this way, enhance the usability and instrumentality of data reuse. Global coverage depends heavily on the acceptance of the FAIR Guiding Principles by all stakeholders and in particular by researchers and practitioners, who may have different ideas about what models and methods are relevant in a particular domain. There is also a need to create space to acknowledge diversity in research epistemologies and contextual differences regarding the acceptability of digital data. In addition, more diverse social entrepreneurs are needed in the GO FAIR movement to realize the IFDS, and GO CHANGE should include training INs to be social entrepreneurs. As social entrepreneurs developing the IFDS, the INs will have a crucial role to play in constructing a foundational architecture that is conducive to, and supports, FAIR data of relevance to different epistemological paradigms in academia.
AUTHOR CONTRIBUTIONS
M. van Reisen gave guidance on the content of this article based on her research on the applicability of FAIR, provided feedback to the other co-authors, and carried out the final review and editing of the article. M. Stokmans did the data analyses, wrote several paragraphs and gave feedback on other paragraphs. M. Basajja carried out the research and data analysis, and reviewed and edited the article. A. Ong'ayo contributed to the writing of several paragraphs with regard to the applicability of FAIR in the African context and gave feedback on other paragraphs. C.R. Kirkpatrick Nakazibwe contributed ideas and examples to the article. B. Mons provided critical ideas for the research undertaken for this article and also provided supervision and comments on the article.