Research and development are gradually becoming data-driven and the implementation of the FAIR Guidelines (that data should be Findable, Accessible, Interoperable, and Reusable) for scientific data administration and stewardship has the potential to remarkably enhance the framework for the reuse of research data. In this way, FAIR is aiding digital transformation. The ‘FAIRification’ of data increases the interoperability and (re)usability of data, so that new and robust analytical tools, such as machine learning (ML) models, can access the data to deduce meaningful insights, extract actionable information, and identify hidden patterns. This article aims to build a FAIR ML model pipeline using the generic FAIRification workflow to make the whole ML analytics process FAIR. Accordingly, FAIR input data was modelled using a FAIR ML model. The output data from the FAIR ML model was also made FAIR. For this, a hybrid hierarchical k-means (HHK) clustering ML algorithm was applied to group the data into homogeneous subgroups and ascertain the underlying structure of the data using a Nigerian-based FAIR dataset that contains data on economic factors, healthcare facilities, and coronavirus occurrences in all the 36 states of Nigeria. The model showed that research data and the ML pipeline can be FAIRified, shared, and reused by following the proposed FAIRification workflow and implementing technical architecture.

This content is only available as a PDF.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.

Article PDF first page preview

Article PDF first page preview