This article describes the development of a research intelligence tool for analyzing rare disease research in the Netherlands. To the best of our knowledge, this is the first tool that can surface and organize scientific output on rare diseases using established annotation and natural language processing mechanisms. We describe the track leading up to the development, including the strategic motivation and user needs; the proof-of-concept tool; the upscaling of the idea to a national collaboration project; the development of the final tool; and a usability evaluation with subsequent fine-tuning. The tool is a unique visualization that gives users the information they need within a few clicks, and it offers novel scientific indicators to characterize (relative) rare disease research activity. We discuss how insights derived from this tool can inform science policy, support decision-making, and help identify opportunities and potential collaborations. We also make recommendations for future developments, including a broadening of the scope, and discuss potential novel applications.

Rare diseases are defined as conditions that affect fewer than one in 2,000 people, or tumors that occur in fewer than six in 100,000 people per year (definitions by the European Commission1 and the Dutch government2). In the Netherlands, as in many other countries, rare diseases pose significant challenges for patients, healthcare providers, and researchers due to their low prevalence and complexity. According to the Netherlands Federation of University Medical Centers (NFU), there are approximately 8,000 rare diseases affecting around 1 million people in the Netherlands. Due to their rarity, many rare diseases are difficult to diagnose and treat. Patients often have to visit multiple specialists and undergo a battery of tests before receiving a proper diagnosis (if one is found at all), which can lead to delays in treatment and increased healthcare costs (Navarrete-Opazo, Singh et al., 2021).

Research on rare diseases is the study of the causes, diagnosis, treatment, and prevention of rare diseases. Due to the small number of patients affected by each rare disease, such research can be challenging. Funding for research on rare diseases is often limited, and it can be difficult to find enough patients to participate in clinical studies.

Rare diseases research involves a wide range of scientific disciplines, including genetics, epidemiology, pharmacology, and clinical medicine. The goal of research on rare diseases is to increase knowledge and understanding of these disorders, to improve the diagnosis and treatment of patients with rare diseases, and to ultimately find cures for these conditions.

Research on rare diseases can include basic scientific research, such as studies to understand the genetic and molecular basis of a disease, as well as applied research, such as clinical trials to test new treatments. It can also include epidemiological studies to understand the prevalence and distribution of a disease, and health services research to improve the delivery of care to patients with rare diseases.

To address the abovementioned challenges and to support patient care and research for rare diseases, European Reference Networks (ERNs) have been established by the European Union. These virtual networks bring together clinicians, researchers, and patient representatives to share knowledge, resources, and expertise; to facilitate discussion among experts; to identify common priorities; to improve quality of care; and to stimulate research for rare diseases.

On a national level, a plan3 for rare diseases was established in the Netherlands with the identification of the most important bottlenecks and recommendations to improve the situation for people living with a rare disease. One of the main results of the national plan for rare diseases is that the Dutch government has put in place a system and policy of national acknowledgment for centers of expertise on rare diseases. To become an expert center, an institution must formally apply and provide evidence of patient care, research activities, etc. Acknowledgment as an expert center is a boost for the viability and continuity of care and research and is a requirement for becoming part of the aforementioned ERNs.

Besides patient care, a (candidate) expert center is required to conduct (clinical and/or fundamental) scientific research in the field of the rare disease(s) for which recognition has been requested and to have published about the results. This makes insight into rare disease research activity highly relevant and offers a challenge to develop a research intelligence solution to provide strategic insights. Research intelligence analyses can be very insightful and are highly applicable to rare disease research (Iping, Cohen et al., 2021; Zampeta, Distel et al., 2022). They not only help an institution to get a grip on their own research portfolio but can also shed light on their national position and therefore the chances or necessity to get recognition for certain rare diseases. Furthermore, collaboration is often indispensable in research on rare diseases. Because expertise is so limited, specialists from different hospitals need to work together on tackling different aspects of a rare disease. All of this information supports researchers, clinicians, and advisors in the process of acknowledgment as expert centers, and of becoming part of larger networks.

Insight into rare disease research activity is not only highly valuable for practitioners but can also benefit patients, patient organizations, clinicians, and other practitioners outside research organizations. They can use this information to find experts on their rare disease and choose collaboration partners to jointly design research questions as part of citizen science. Clinicians and other practitioners outside research organizations are often limited in their access to research outcomes and research information. Being able to get a grasp of activity and output can benefit them in collaborations and referrals, ultimately also influencing the entire care system for patients with rare diseases.

The development of dashboards for clinical data relating to rare diseases has been described in the literature (Leutner, Bathelt et al., 2021; Vasseur, Zieschank et al., 2022) but nothing has been described yet for research on rare diseases.

The value of insight into rare disease research activity is clear, but the lack of suitable classifications in current research information systems and analytical tools is a barrier. There is, however, a nomenclature available that is considered to be the European standard: Orphanet4. Orphanet is a broad directory of information on rare diseases. It also maintains the Orphanet classification system, which is designed to organize rare diseases and disorders into a hierarchical structure based on their clinical and genetic features. The classification system has several levels, each of which provides progressively more detailed information about the disease or disorder. The first level is the disease group level, which groups together diseases that share similar clinical features. The second level is the disease level, which provides more specific information about individual diseases or disorders. Each disease is assigned a unique Orphanet identification number (ORPHAcode) to facilitate easy identification and tracking. The third level is the subtype level, which provides further detail about specific subtypes of a disease. Overall, the Orphanet classification system is a valuable tool for organizing and categorizing information about rare diseases, and it can help researchers, clinicians, and patients better understand and manage these conditions (e.g., Rath, Olry et al., 2012).

Using the terminology from Orphanet, we first attempted a simple approach based on one-to-one term matching between the research output of Erasmus MC and the rare diseases. We used Dimensions as the source for Erasmus MC output over a longer period, primarily because the registration of publication metadata in the institution’s Current Research Information System (CRIS) prior to 2018 was not up to quality standards, and secondarily because Dimensions offers easy export of metadata. In this simple approach, we employed standard substring matching, ignoring letter case and nonalphanumeric characters, to extract the set of research articles that refer to each Orphanet disease. Because the introduction and body of an article are likely to mention less relevant or even irrelevant Orphanet diseases, we applied the textual search only to the titles and abstracts of articles. This approach showed promising results and allowed us to uncover several interesting findings.
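A minimal sketch of this kind of matching is given below. It is an illustration only and not the code used in the study: the function names `normalize` and `match_diseases` are hypothetical, and collapsing each run of nonalphanumeric characters into a single space is just one possible reading of "ignoring nonalphanumeric characters".

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of nonalphanumeric characters to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def match_diseases(title: str, abstract: str, disease_names: list[str]) -> list[str]:
    """Return the disease names that occur as a substring of the title or abstract.

    Only title and abstract are searched, mirroring the restriction described
    in the text. Matching is insensitive to case and punctuation.
    """
    haystack = normalize(f"{title} {abstract}")
    matches = []
    for name in disease_names:
        needle = normalize(name)
        # Skip empty names: an empty string is a substring of everything.
        if needle and needle in haystack:
            matches.append(name)
    return matches
```

For example, a title containing "Cystic-Fibrosis" would be linked to the name "Cystic fibrosis", because both normalize to the same lowercase string.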

Despite the positive outcomes, we immediately noticed limitations in our methodology that called for more sophisticated data processing methods. One notable limitation was the failure to account for alternative names or aliases of each disease that might be present in the data. Omitting such variations may have caused us to overlook essential information, potentially leading to incomplete or biased findings. Additionally, the analysis did not consider the hierarchical relationships between Orphanet entities, which could have significantly affected the interpretation of the results.

To overcome the identified limitations, we approached Elsevier to set up a pilot around this topic, making use of their databases and algorithms, and with the help of Elsevier-based data scientists. This was facilitated as part of a national agreement:

“Inspired by the open science movement, research is becoming increasingly more open. To facilitate this, the Dutch research community (UNL, NFU, VH, NWO, and KNAW) and Elsevier started a collaboration5. Together they are developing new open science and open access services to contribute to Dutch open science. Ideas for new services are tested via pilot projects. These pilots are undertaken by researchers, librarians, research managers, research intelligence consultants and other staff from the participating institutions – the people who will also end up using these services” (from: https://epdos.nl).

To facilitate these pilots, a statement of work and framework documents were set up to ensure aspects of data ownership and openness. These documents are openly published on the aforementioned website. This specific pilot was approved because it addresses a gap in research intelligence for a specific target audience that has not been facilitated until now, and because it benefits open science by providing open insights into information on rare disease research activity (a part of the Scopus database). The agreement stipulates that NFU is the owner of the resulting work, and Elsevier can apply methods developed in this pilot for its own products. It has also been agreed that Elsevier will publish its methods openly so that in principle this can be reproduced. In various ways, this pilot contributes to open science in the form of open research information and open research methodology. Elsevier used its algorithms to recreate the work done by the Erasmus MC data scientist hired to improve the matching. The way in which the annotation engine and algorithm work was published by Azarbonyad, Afzal et al. (2023), which contains an extensive overview. Here, we will briefly summarize this to give a grasp of the methodology and avoid overlap with the article referenced.

The approach to identifying publications on rare diseases involves matching articles to specific diseases by aligning the concepts within the Orphanet taxonomy with the content of the articles. This matching process employs both exact and fuzzy matching techniques, utilizing disease names, their synonyms, and various text preprocessing methods to ensure accuracy. Evaluation of this method takes place through two avenues:

  1. A user study where publications on rare diseases detected by the method across four major University Medical Centers (UMCs) in the Netherlands are assessed by experts from these institutions for quality and accuracy. The findings of this study demonstrate the method's effectiveness in detecting publications relevant to the rare diseases of interest to these UMCs.

  2. Examination of a test set comprising articles with their corresponding rare diseases assigned to them, serving as a benchmark for measuring the method's performance in a document classification context. Results from this evaluation indicate a precision rate of 92%, a recall rate of 71%, and an F1 score of 80%. These outcomes underscore the method’s robustness in accurately assigning rare diseases to documents.
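The F1 score is the harmonic mean of precision and recall, so the three reported figures can be checked against each other. The short sketch below does this with the rounded percentages given above; the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported values: precision 92%, recall 71%.
f1 = f1_score(0.92, 0.71)
print(f"F1 = {f1:.1%}")  # consistent with the reported F1 of about 80%
```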

Overall, this method exhibits a high level of performance, ensuring reliable identification and categorization of publications related to rare diseases.
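To give an impression of how exact and fuzzy matching against disease names and synonyms can be combined, a simplified sketch follows. It is not the annotation engine described by Azarbonyad, Afzal et al. (2023): the character-similarity measure from Python's `difflib`, the threshold of 0.9, the function names, and the taxonomy layout (a code mapped to a list of names) are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse runs of nonalphanumeric characters to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fuzzy_contains(name: str, text: str, threshold: float = 0.9) -> bool:
    """True if some window of the text with as many words as `name` is similar enough to it."""
    name_tokens = normalize(name).split()
    text_tokens = normalize(text).split()
    n = len(name_tokens)
    if n == 0:
        return False
    target = " ".join(name_tokens)
    for i in range(len(text_tokens) - n + 1):
        window = " ".join(text_tokens[i:i + n])
        if SequenceMatcher(None, target, window).ratio() >= threshold:
            return True
    return False


def annotate(text: str, taxonomy: dict[str, list[str]], threshold: float = 0.9) -> dict[str, str]:
    """Map each matched code to 'exact' or 'fuzzy'.

    taxonomy: code -> [preferred name, synonym, ...] (hypothetical layout).
    """
    norm_text = normalize(text)
    hits = {}
    for code, names in taxonomy.items():
        normalized = [normalize(name) for name in names]
        if any(n and n in norm_text for n in normalized):
            hits[code] = "exact"      # a name or synonym occurs verbatim
        elif any(fuzzy_contains(n, norm_text, threshold) for n in normalized):
            hits[code] = "fuzzy"      # only a near match (e.g., a spelling variant)
    return hits
```

Including synonyms in the name list is what allows a publication that mentions only "Pompe disease" to be linked to the entry whose preferred name is different. The fuzzy step trades some precision for recall, which is why the threshold is kept high in this sketch.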

In the subsequent process, articles were assigned to institutions based on affiliation information in Scopus and SciVal using full counting. We use information from this source because not every medical institution in the Netherlands uses a research information system and the scope of this pilot went beyond just the academic institutions. There is no comprehensive national database containing all metadata that was accessible for this purpose. The methodology would, however, also work using different source systems of metadata.
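Under full counting, a publication counts once for every institution that appears in its affiliations, regardless of how many other institutions are listed. The sketch below illustrates the principle; the record layout (a dictionary with an `institutions` list) is a hypothetical simplification and not the Scopus or SciVal data model.

```python
from collections import Counter


def full_count(publications: list[dict]) -> Counter:
    """Count publications per institution using full counting.

    Each publication adds 1 to every distinct institution among its
    affiliations, so co-authored papers are counted once per institution.
    """
    counts = Counter()
    for pub in publications:
        # set() ensures that several authors from the same institution
        # do not inflate that institution's count for a single paper.
        for institution in set(pub["institutions"]):
            counts[institution] += 1
    return counts
```

A consequence of this choice is that institutional counts add up to more than the number of distinct publications whenever institutions collaborate, so shares across institutions can exceed 100% in total.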

With significantly improved results we drafted the official pilot proposal, which was approved by the steering committee of the collaboration in June 2022. It was agreed that the analytics tool to be developed should be open to everyone, and that the methodology behind it should be published and also accessible to anyone.

The next sections of this article will focus on the analytical tool that was designed based on the outcomes of the matching, and the subsequent evaluation and potential strategic use of the information.

The result of the algorithmic annotation is a set of publications from Dutch institutions over the period 2011–2022 in the Scopus database in which at least one ORPHAcode was found. Over 40,000 articles were identified. To make sense of all this information a dashboard (Figure 1) was designed using Tableau. The dashboard6 is published and openly available to everyone.

Figure 1.

The overview page of the dashboard.

The dashboard currently offers three dimensions, each providing a different view into the data. The first dimension gives an overview of all rare disease research in the Netherlands. A selection of metrics is provided to give the user an immediate impression of the size and impact of the research. The source of the metrics is Scopus (more information about this can be found in Elsevier (2019)). These metrics (Table 1) are consistently shown in all current dimensions of the dashboard.

Table 1.

Overview of presented metrics in the dashboard and their definition

Metric | Definition
Scholarly output | The number of publications on a rare disease per selected dimension in the dashboard
% of output out of total disease research | The percentage of articles of the selected dimension in the dashboard compared to the total number of articles in this dimension (share of research in the Netherlands)
Citation count | Total amount of citations received by the publications in the selected dimension of the dashboard
Citations per publication | Total average number of citations per publication received by the publications in the selected dimension of the dashboard
% of outputs in top 10% citation percentiles | The percentage of publications that are highly cited, having reached a particular threshold (top 10%) of citations received
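The definitions in Table 1 can be made concrete with a small sketch. It assumes a hypothetical record layout in which each publication carries a citation count and a precomputed flag stating whether it falls in the top 10% citation percentile; how that threshold is determined is a matter for the data source and is not reproduced here.

```python
def dashboard_metrics(selection: list[dict], total_in_dimension: int) -> dict:
    """Compute the Table 1 metrics for a selection of publications.

    selection: publications in the selected dimension, each a dict with the
        hypothetical keys 'citations' (int) and 'in_top10' (bool).
    total_in_dimension: total number of publications in this dimension
        (the denominator of the share of research in the Netherlands).
    """
    output = len(selection)
    citations = sum(pub["citations"] for pub in selection)
    top10 = sum(1 for pub in selection if pub["in_top10"])
    return {
        "scholarly_output": output,
        "share_pct": 100.0 * output / total_in_dimension if total_in_dimension else 0.0,
        "citation_count": citations,
        "citations_per_publication": citations / output if output else 0.0,
        "top10_pct": 100.0 * top10 / output if output else 0.0,
    }
```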

Because the potential users of the dashboard have very different backgrounds, the set of offered metrics is limited to avoid an overload of information. The overview page offers a visual containing spheres that are indicative of the number of rare diseases that were identified in the articles, and the relative size of the research that is done on each rare disease.

Hovering over a sphere opens a popup with information about the disease it represents (development of output and citations over the years). The large spheres in the center are rare diseases on which a lot of research is focused in the Netherlands, such as tuberculosis, malaria, preeclampsia, cystic fibrosis, and systemic lupus erythematosus. Some of these diseases, such as tuberculosis and malaria, are rare in the Netherlands but not necessarily in other parts of the world. The research done in the Netherlands on these diseases is indicative of our societal mission and of efforts directed towards the healthcare situation in developing countries. At the top of the page an overview of the publication set is given. This also lists the number of unique rare diseases that institutions in the Netherlands have contributed to. At this point this is about 2,500, which is only about 30% of the approximately 8,000 rare diseases currently known. There are many rare diseases for which no research in the Netherlands has been identified by our methods.

The final element of the overview page is a heat map of locations in the Netherlands where research into rare diseases takes place. Most research is concentrated in the seven UMCs, but universities, regional hospitals, and industrial or commercial partners are also active. An interesting example is the Jeroen Bosch hospital in ’s-Hertogenbosch, in the province of Noord-Brabant, which has published on Q fever, a disease of which the world’s largest outbreak occurred in this province in 2007. Other examples are technical universities or commercial parties involved in research towards technical advances in diagnostics or treatment.

In the second dimension the user can search for specific diseases by name or ORPHAcode, or by expert center, ERN, or patient organization, which provides overviews of a selection of diseases that are in scope of one of these units. When a disease or unit is selected, the table can be updated to present an overview of all institutions active in research on this rare disease, or selection of rare diseases, together with a number of metrics. Figure 2 shows an overview of the dashboard for a randomly chosen rare disease (in this case pituitary tumor).

Figure 2.

Detail page of the dashboard giving an overview of research in the Netherlands into pituitary tumors.

The heat map on the left shows where research in the Netherlands on pituitary tumors is concentrated, which is primarily in Rotterdam and Leiden. The metrics provided give insight into different dimensions of impact (volume, relative market share, and citations). The relative market share (% of outputs out of total disease research in the Netherlands) is an interesting indicator for anyone working on this disease who is considering, for instance, applying to become an expert center or looking for collaboration partners: It indicates whether such a step is potentially worthwhile. Citation impact can be viewed as the total number of citations, or as the percentage of outputs in the top 10% citation percentile, that is, the percentage of publications that are among the top 10% most cited publications. This dimension can also be used to find potentially interesting collaborators.

If an expert center, ERN, or patient organization is selected, the dashboard filters on the associated diseases, which can then be viewed individually or as a total for the selected unit, providing insight into the institutions that are most active in research on these diseases.

In the third dimension the user can select an institution to get an overview of research activity specific to this institution (Figure 3). Note that in this dimension higher nodes in the Orphanet taxonomy are included (not only the leaves). This is why categories with larger volumes of publications are shown here, in contrast with the overview page, which contains only the leaves of the Orphanet taxonomy. This choice was made based on feedback from users, for whom this level of information was very insightful. The same selection of metrics as before is provided on this page.
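Including higher nodes means that a publication annotated with a leaf disease also counts towards every group above it in the taxonomy. The sketch below shows one way to do such a roll-up without double counting a publication within a node. The data layout, the function names, and the allowance for more than one parent per node are assumptions for illustration, not a description of the dashboard's implementation.

```python
def ancestors(code: str, parents: dict[str, list[str]]) -> set[str]:
    """All nodes above `code`, following every parent link (cycle-safe)."""
    found = set()
    stack = [code]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def rollup(pub_codes: dict[str, set[str]], parents: dict[str, list[str]]) -> dict[str, int]:
    """Number of distinct publications per taxonomy node, leaves and higher nodes alike.

    pub_codes: publication id -> set of codes found in that publication.
    parents:   code -> list of parent codes (hypothetical layout).
    """
    pubs_per_node: dict[str, set[str]] = {}
    for pub_id, codes in pub_codes.items():
        nodes = set(codes)
        for code in codes:
            nodes |= ancestors(code, parents)
        for node in nodes:
            # Storing ids in a set counts each publication once per node,
            # even if it mentions several diseases under the same group.
            pubs_per_node.setdefault(node, set()).add(pub_id)
    return {node: len(pubs) for node, pubs in pubs_per_node.items()}
```

Because group nodes aggregate their descendants, their counts are at least as large as those of any single leaf below them, which explains the larger volumes visible in this dimension.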

Figure 3.

Detail page of the dashboard showing the research activity of Erasmus MC on rare diseases sorted by number of publications.

A particularly insightful use of this overview is to sort the list on the second metric (% of outputs out of total disease research). This ordering shows where an institution has unique expertise (or market share). For Erasmus MC the top of this list can be seen in Figure 4, using a cutoff of at least five publications to filter out diseases with too little output to be meaningful.

Figure 4.

Detail page of the dashboard showing the research activity of Erasmus MC on rare diseases sorted by % of outputs out of total disease research.

This sorted list shows rare diseases for which Erasmus MC has a very large market share, and which it should probably already have included in an expert center, assuming that patient care for these diseases is also provided at Erasmus MC or in close collaboration with another hospital.
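The sorting and cutoff described above amount to a filter followed by a ranking on relative share. A minimal sketch, assuming hypothetical row fields for the institution's own publication count and the national count per disease:

```python
def rank_by_share(rows: list[dict], min_publications: int = 5) -> list[dict]:
    """Rank diseases by an institution's share of national output.

    rows: dicts with the hypothetical keys 'disease', 'institution_pubs',
        and 'national_pubs' (distinct publications in the Netherlands).
    min_publications: cutoff below which a disease is left out, because a
        high share based on very few publications is not informative.
    """
    ranked = []
    for row in rows:
        if row["institution_pubs"] < min_publications or row["national_pubs"] <= 0:
            continue
        share = 100.0 * row["institution_pubs"] / row["national_pubs"]
        ranked.append({**row, "share_pct": share})
    return sorted(ranked, key=lambda r: r["share_pct"], reverse=True)
```

Without the cutoff, a disease with three publications that all come from one institution would top the list with a 100% share, which illustrates why a threshold is applied before ranking.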

To evaluate the first published beta version of the dashboard, we sent out a set of open questions to potential users of the tool with different perspectives from different institutions in the Netherlands. The most prominent group of users are researchers and clinicians involved in rare diseases, and active within expert centers. Therefore, the survey was sent out to a population in our own institution which we knew met these criteria. Two other UMCs were directly involved in this project. We used local contacts to approach potential users of the tool meeting these same criteria. In total, 15 researchers/clinicians responded to our questionnaire out of the 34 sent out. Some respondents did not fill out the entire questionnaire but gave general feedback. Most of the replies came from our own institution. We received three replies from the other institutions.

The questions for the researchers and clinicians revolved around the recognizability of the presented data, the usability of insights, and potential added value. A full overview of all questions and responses to the questionnaire is listed in Table S1 of the Supplementary material. Besides the questionnaire, we asked two advisors to use the dashboard with a specific use case of their own choice related to supporting expert center applications, portfolio management, and research collaborations and impact. The results of the questionnaire and the experiences of the advisors are discussed next.

5.1. Evaluation by Researchers/Clinicians

Most of the respondents were able to find information on the diseases they are involved in easily (“It was easy to find. The tools are clear and if you know in what European Reference Network you should be looking for the information it is not difficult to find”). It was noted, however, that some diseases are better known under a popular name than under their official name. An example is Pompe disease instead of glycogen storage disease. Two respondents indicated that they found it hard to navigate the dashboard, and were therefore not able to quickly find information relevant to them. Some of the respondents found the presented data recognizable: The overview of institutions active on the diseases they are involved in matched their expectations. However, many respondents mentioned that they want to see the publications behind the numbers presented in the dashboard (“I like the dashboard, it gives timely information on the number of publications from our center. Still, it would be nice to see the publication list by clicking on it”). This would provide more insight into the activity of other institutions and show what they are publishing about, which is important for potential future collaborations. Because such checks were not possible, some respondents doubted whether the information in the dashboard is complete. Some indicated that they had expected more publications from their institution to be linked to a rare disease. There can be several reasons for these discrepancies: The publication is matched at a different point in the ORPHA hierarchy, the publication is recent and not yet included in the dashboard, the matching failed, the publication lacks key terms that could be matched, or it uses an alternative name for the disease. However, most of the differences we checked were small and did not change the overall picture provided in the dashboard. For some diseases, this could be a bigger problem than for others. We recommend treating the information in the dashboard as directional, and verifying and refining it before using it for official decisions.

Most respondents indicated that after seeing the dashboard they had more information than they had before, but also that the current dashboard does not answer all the questions they are interested in. They were specifically interested in seeing the development of research in specific topics and from which institutions output is coming, and in using this information to find partners (“It is certainly informative which institutions publish how much on different topics, as it is sometimes surprising. Therefore, it is a useful tool to find partners.”). The potential here lies in broadening the scope of the dashboard. The landscape in the Netherlands is often known to the experts, but in Europe or worldwide it is much harder to get a grip on. The dashboard also does not show current collaborations or the effect of forming a network with the other centers in the Netherlands. It was mentioned that the basic citation information provided in the dashboard was interesting, but that without proper contextualization it could easily be misinterpreted (“For example, publications on (ultra) rare diseases may not be cited that much given the number of people/institutes working on it. This is especially the case for case reports. These are very important for the study of rare diseases but typically are not cited often. In addition, review articles may generate more citations but are less informative with regard to the extent an institute really works on a rare disease.”). Especially in rare diseases, where the size of fields differs very significantly, this could quickly lead to unfair comparisons.

The (potential) added value of the dashboard to the experts comes from providing a search tool specifically for rare diseases, smart matching of publications to rare diseases, showing the lay of the land for each rare disease, identifying collaboration partners, showing network performance, identifying chances for future research, and setting future ambitions. To achieve this potential, the experts note that being able to access the publications and see researcher names is essential (“The added value would be much bigger if there was a way to directly link to the publications itself—that would increase the value of the dashboard significantly.”). Showing collaborations within the data was also mentioned as a valuable addition. Finally, respondents asked for better search options in the dashboard, for example on keywords such as topic, rare disease, treatment type, and researcher. This would make it possible to locate the experts on a certain aspect of the rare disease field specifically and quickly. The benefit is that the search then runs within a set of research outputs that has already been curated for relevance to rare diseases using an algorithmic approach. This makes it more powerful than starting a search from scratch in another database (such as PubMed, Scopus, Web of Science, or OpenAlex), where identifying the relevant rare disease literature with free search terms and MeSH terms is difficult.

Overall, the experts are positive about the tool, but note that there are several elements to develop in order for the tool to provide more added value and secure more frequent use.

5.2. Evaluation by Advisors

We asked two advisors from our own institution to evaluate the usability of the dashboard, using two use cases of their own design related to supporting expert center applications, portfolio management, research collaborations, or impact. Their experiences are described next.

5.2.1. Use case 1

The second dimension of the dashboard (ORPHAcodes) was used to provide support during the national preapplication round for expert centers. Several departments made requests for acknowledgment of specific ORPHAcodes, some as new expert centers and some as an extension of their already government-endorsed expert centers. When a preliminary preapplication was submitted, every ORPHAcode was checked for research output in the dashboard. Attention was given to the number of publications and how this compares with other institutions (relative market share). This information was used to get an impression of whether applying for acknowledgment would be appropriate. In the review of the application, the number of publications in which the applying expert center has a leading role is taken into account. This specific information cannot currently be extracted from the dashboard. The information provided by the dashboard is therefore indicative and can be supplied to the expert center as support.

In some cases, the information from the dashboard cast doubt on whether the candidate expert center could fulfill the requirements for scientific research. In those cases, they were asked to explain their application in more detail to increase the quality of the application.

Some expert centers were reluctant to apply because they were uncertain whether their scientific contribution was substantial enough to support an application. The dashboard was used to find information on the number of outputs and relative market share, which ultimately stimulated the candidate expert center to apply. In this process, we found that it would have been helpful to be able to get a publication overview from the dashboard.

5.2.2. Use case 2

For strategic positioning and research portfolio management, the third dimension (Institution) of the dashboard was used. By using the ability to sort the list of ORPHAcodes on “% of outputs out of total disease research,” one can easily get an overview of diseases in which one’s institution has unique expertise. When sorting the list on “% of outputs in top 10% citation percentile” the user is also able to identify diseases on which their institution has remarkably high citation impact. The filter allows the user to put in a threshold, filtering out diseases with insignificant output numbers.

When a disease is selected, the user can navigate to the second dimension (ORPHAcodes) to see the full national landscape on that disease and identify which other institutions are active in research into this specific disease. It would add value here to see whether collaborations already exist between one’s own institution and another on a specific disease. This information would be useful for advising researchers looking for partners, and will become even more relevant if the scope of the data can be expanded to cover Europe. For diseases on which multiple institutions are active, the impact metrics can help to identify the most interesting potential collaboration partner. In this case, an overview of publications and researchers would be valuable.
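The wish to see existing collaborations on a disease could, for example, be met by counting co-publications. The Python sketch below assumes a hypothetical mapping from publication identifiers to the set of affiliated institutions for a single disease; the function name `copublication_counts` and the data are illustrative only, and this feature is not part of the dashboard described here.

```python
from collections import Counter


def copublication_counts(pub_institutions, own_institution):
    """Count, per partner institution, the publications shared with `own_institution`.

    `pub_institutions` maps a publication id to the institutions affiliated
    with it, for one disease (a hypothetical layout for this sketch).
    """
    counts = Counter()
    for institutions in pub_institutions.values():
        institutions = set(institutions)
        if own_institution in institutions:
            # Every other institution on the same publication is a partner.
            for partner in institutions - {own_institution}:
                counts[partner] += 1
    return counts


# Placeholder data for one disease.
disease_pubs = {
    "p1": {"Institution A", "Institution B"},
    "p2": {"Institution A", "Institution B", "Institution C"},
    "p3": {"Institution B", "Institution C"},
    "p4": {"Institution A"},
}
partners = copublication_counts(disease_pubs, "Institution A")
```

Sorting the result with `partners.most_common()` would list the institutions with which collaboration on this disease already exists, starting with the most frequent partner.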

The dashboard offers interesting insights into an institution’s research portfolio in rare diseases, the national position on individual rare diseases, and potential collaboration partners. The citation-based metric should be handled responsibly but can be indicative of areas of strength and potential where research efforts are valuable.

5.3. Improving the Dashboard After Evaluation

The evaluation gave us clear pointers for improvements to the dashboard. The most pressing one was the demand for access to the publications behind the numbers in the dashboard. This was the most important element for users to develop a better understanding of, and trust in, the data. As a result of the evaluation, this option was added to the dashboard, allowing the user to link out to the selected publications in Scopus (the source of publications that the dashboard was built on) and in PubMed, the latter to ensure open access to the data for all users. This also allows users to analyze publications further in other systems or download them from either Scopus or PubMed. It must be noted that the numbers of publications in Scopus and PubMed can differ slightly for reasons of database coverage.

To address the other suggestions for improvements coming from the evaluation and our own experiences, we are working on a roadmap for future development. Some suggestions are mentioned in the next section. In the near future, we intend to invite more researchers, clinicians, and advisors (preferably) from other institutions to also evaluate the improved dashboard using their own use cases. Also, we want to approach practitioners, patients, and patient organizations to find out whether and how the tool might be useful for their use cases.

The Rare Disease Research Dashboard is intended to be a comprehensive and up-to-date research intelligence tool on rare disease research in the Netherlands. It is regularly updated with new data and information as it becomes available and is designed to be user friendly and accessible to a wide range of stakeholders.

To the best of our knowledge, this is the first tool that can surface and organize scientific output on rare diseases using established annotation and natural language processing mechanisms. The visualization is unique and lets stakeholders and users retrieve, with a few clicks, the information they need for the aforementioned use cases. The dashboard can be used to identify research gaps, support evaluation procedures, and promote collaboration among researchers and other stakeholders. Research on rare diseases is often challenging due to the limited amount of data and resources available. This emphasizes the need for collaboration and data sharing (patient registries, biobanks, cohorts) in this field.

The development of this dashboard was demand driven. Initially, advisors concerned with the portfolio of expert centers at Erasmus MC identified the need for more strategic information, which was confirmed by clinicians and researchers working in the field. The clearest added value of the dashboard in the current form is for managing the internal portfolio and supporting clinicians and researchers in the process of applying to become an expert center.

6.1. Limitations and Responsible Use

The dashboard in its current form only takes into account journal articles indexed in Scopus. Though this is a more than solid representation of research activity on rare diseases, not all (scientific) journal articles are covered in this database because not all journals are indexed. For instance, journals addressing practitioners, which are often not in English, have less coverage in Scopus. Scopus does, however, have good coverage of Dutch-language journals such as Nederlands Tijdschrift voor Geneeskunde, the largest Dutch outlet aimed at medical practitioners. Though such articles are relevant, the volume of articles published in journals not covered by Scopus is not expected to influence the general overview presented in the dashboard.

Other forms of scientific output and effort are also not included in a database such as Scopus. This must be taken into account when looking at the information: the absence of publications does not mean that there is no clinical or even research activity in an institution. An interesting addition to the dashboard could be information on clinical trials, research funding distribution, and other types of research output, such as preprints and data sets. These could serve as a proxy for where current research hotspots are located, give insight into where potential treatment options are being investigated, and show where funding for research into rare diseases comes from. Often patient organizations and specific societal funds fund research projects, which deserve special recognition. Such information would also tell researchers and clinicians more about where potential funding is available.

Because the tool is built only on research output from the Netherlands, the conclusions and outcomes can only be viewed within the context of this country. To make more general statements, the scope of the tool would need to be broadened, preferably to cover the entire Scopus database (or other available databases).

The limitations of the algorithm (annotation engine) are discussed by Azarbonyad et al. (2023), but it should be emphasized that there can be multiple reasons why an article is missed in the matching process or is matched on incorrect grounds. In our evaluation process, we found that these issues were not significant enough to devalue the tool, but considering the large number of rare diseases, it is likely that the tool will not work for all of them.

Taking into account the limitations, it is important that the information presented in the dashboard is used responsibly. It can be used for exploration, inspiration, and support, but not as the sole ground for decision-making. When designing this tool, we thought carefully about the metrics that should be presented. We chose a small selection of metrics that can be used in all dimensions of the dashboard, are relatively easy to understand, and give a simple overview that serves the main purpose of the dashboard: to identify (relative) activity and stimulate collaborations. We refrain from statements about quality or impact because we do not believe that metrics alone can fully assess these aspects. The metrics together should give the user an idea of what is going on, and invite them to further investigate, reach out, or use them to complete narrative assessments.

While metrics can provide valuable insights, their use should take into account the context, the discipline, and the stage of research. It is also important to consider the purpose of the research and the audience for which the metrics are intended. Metrics should not be used in isolation to make decisions about research or researchers. Instead, they should be used as part of a broader evaluation process that includes qualitative assessments and expert judgments. Overall, the responsible use of research metrics requires careful consideration of their purpose, transparency in their use, cautious application, diversity in their selection, and integration into a broader evaluation process.

6.2. Opportunities for Future Developments

As shown in the evaluation process, there is still a lot of potential to explore. A dashboard at the European level would be very valuable to support efficient processes for ERN memberships and research collaboration, and could give insights for patients who currently cannot get access to treatment in their own country because of a lack of expertise. But there are other opportunities to think ahead and create a meaningful extension of the project.

One is related to rewarding and recognizing researchers for their contributions. Articles about rare diseases are usually cited less than other articles because of the overall limited research activity in these niche areas. There are various ways to create weighted versions of citation impact indicators. Looking into a “disease-weighted citation indicator” could help alleviate that problem and make rare disease research more rewarding, though it must be taken into consideration that the numbers of publications can be very small, making these types of weighted analyses potentially complicated.
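One possible form of such a “disease-weighted citation indicator,” by analogy with field-normalized citation indicators, is to divide each article's citation count by the average citation count of all articles on the same disease. The Python sketch below is only an illustration of that idea: the definition, the function name `disease_weighted_scores`, the record layout, and the minimum reference-set size are all assumptions, not an indicator that the dashboard provides.

```python
from statistics import mean


def disease_weighted_scores(pubs, min_reference_size=10):
    """Return {publication id: citations / mean citations for the same disease}.

    `pubs` is a list of dicts with the keys 'id', 'orpha_code', and
    'citations' (a hypothetical layout). The score is None when the
    reference set for a disease is smaller than `min_reference_size`
    or has no citations at all, because the ratio is then not meaningful.
    """
    citations_by_disease = {}
    for pub in pubs:
        citations_by_disease.setdefault(pub["orpha_code"], []).append(pub["citations"])

    scores = {}
    for pub in pubs:
        reference = citations_by_disease[pub["orpha_code"]]
        baseline = mean(reference)
        if len(reference) < min_reference_size or baseline == 0:
            scores[pub["id"]] = None
        else:
            scores[pub["id"]] = pub["citations"] / baseline
    return scores


# Placeholder data; the threshold is lowered to 2 only to keep the example small.
example_pubs = [
    {"id": "p1", "orpha_code": "ORPHA:X", "citations": 2},
    {"id": "p2", "orpha_code": "ORPHA:X", "citations": 6},
    {"id": "p3", "orpha_code": "ORPHA:Y", "citations": 50},
]
scores = disease_weighted_scores(example_pubs, min_reference_size=2)
```

A score above 1 means the article is cited more than the average article on that disease. The `None` result for the disease with a single publication reflects the caveat above: with very small publication numbers, a weighted indicator of this kind becomes unstable. A fuller version would probably also account for publication year and document type.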

Another important extension could be towards translational results of rare disease research. For example, showing how lab-based and clinical research projects have been translated into new therapies and drugs could incentivize funders and charities to invest more in this kind of research. Funders and charities could also use the tool to find diseases that are impacting patients but, for some reason, have not yet been researched extensively.

Rik Iping: Conceptualization, Data curation, Formal analysis, Investigation, Writing—original draft. Ilse Nederveen: Conceptualization, Data curation, Investigation, Writing—review & editing. Bijan Ranjbar-Sahraei: Data curation, Methodology, Writing—review & editing. Hosein Azarbonyad: Data curation, Methodology, Writing—review & editing. Max Dumoulin: Conceptualization, Investigation, Writing—review & editing. Georgios Tsatsaronis: Supervision, Writing—review & editing. Irene M. J. Mathijssen: Supervision, Writing—review & editing.

Max Dumoulin, Hosein Azarbonyad, and Georgios Tsatsaronis are employees of Elsevier, a commercial company specializing among other things in developing and selling research intelligence tools.

The other authors have no competing interests.

The tool was developed as a project approved under an innovation agreement between a number of Dutch public parties (universities and research organizations) and Elsevier, part of a larger deal on open science. This tool therefore is indirectly funded by the Dutch government and directly by Elsevier.

Azarbonyad, H., Afzal, Z., Iping, R., Dumoulin, M., Nederveen, I., … Tsatsaronis, G. (2023). Annotating and indexing scientific articles with rare diseases.

Iping, R., Cohen, A. M., Abdel Alim, T., van Veelen, M. C., van de Peppel, J., … Mathijssen, I. M. J. (2021). A bibliometric overview of craniosynostosis research development. European Journal of Medical Genetics, 64(6), 104224.

Leutner, L. A., Bathelt, F., Sedlmayr, B., Sedlmayr, M., & Zoch, M. (2021). Development of a dashboard for rare diseases - A technical case report. Studies in Health Technology and Informatics, 283, 78–85.

Navarrete-Opazo, A. A., Singh, M., Tisdale, A., Cutillo, C. M., & Garrison, S. R. (2021). Can you hear us now? The impact of health-care utilization by rare disease patients in the United States. Genetics in Medicine, 23(11), 2194–2201.

Rath, A., Olry, A., Dhombres, F., Brandt, M. M., Urbero, B., & Ayme, S. (2012). Representation of rare diseases in health information systems: The Orphanet approach to serve a wide range of end users. Human Mutation, 33(5), 803–808.

Vasseur, J., Zieschank, A., Göbel, J., Schaaf, J., Dahmer-Heath, M., … Storf, H. (2022). Development of an interactive dashboard for OSSE rare disease registries. Studies in Health Technology and Informatics, 293, 187–188.

Zampeta, F. I., Distel, B., Elgersma, Y., & Iping, R. (2022). From first report to clinical trials: A bibliometric overview and visualization of the development of Angelman syndrome research. Human Genetics, 141(12), 1837–1848.

Author notes

Handling Editor: Rodrigo Costas

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.
