Which types of online evidence show the nonacademic benefits of research? Websites cited in UK impact case studies

Abstract
While funders increasingly request evidence of the societal benefits of research, all academics in the UK must periodically provide this information to gain part of their block funding within the Research Excellence Framework (REF). The impact case studies produced in the UK are public and can therefore be used to gain insights into the types of sources used to justify societal impact claims. This study focuses on the URLs cited as evidence in the last public REF to help researchers and resource providers to understand what types can be used and the disciplinary differences in their uptake. Based on a new semiautomatic method to classify the URLs cited in impact case studies, the results show that there are a few key online types of source for most broad fields, but these sources differ substantially between subject areas. For example, news websites are more important in some fields than others, and YouTube is sometimes used for multimedia evidence in the arts and humanities. Knowledge of the common sources selected independently by thousands of researchers may help others to identify suitable sources for the complex task of evidencing societal impacts.


INTRODUCTION
Although knowledge-building is a core goal of much scholarship, it is important to assess the impacts of research outside academia when evaluators or funders need evidence of its societal impacts (Dinsmore, Allen, & Dolby, 2014; Thelwall, Kousha et al., 2015). This is because funders consider research findings to have added value when they benefit society, such as by influencing policy (Oliver, Innvar et al., 2014). Assessing these nonacademic impacts is difficult because there are many types and no systematic record of them. In contrast, academic impacts are partly trackable by citation indexes. To illustrate the variety of potential nonacademic impacts, 27 categories of impact within four broad areas (research-related, policy, service, societal) have been suggested to help health researchers describe the benefits of their research when writing impact narratives (Kuruvilla, Mays et al., 2006). At a finer-grained level, 100 indicators have been suggested for the policy, health, economic, teaching, and career development impacts of biomedical research alone (Guthrie, Krapels et al., 2017). General recommendations have also been provided for interpreting nonacademic indicators of research impacts (Wilsdon, Allen et al., 2015).
The UK Research Excellence Framework (REF) is an exercise that runs every 6-7 years, assessing scholarly and nonscholarly research achievements to allocate block grant research funding. It groups UK academic research into four broad disciplinary panels (A, B, C, and D), containing 36 field-based Units of Assessment (UoAs, see Supplementary Information Tables S1-S4 for a list) in the 2014 iteration. The REF assesses the nonacademic impacts of research primarily through impact case studies, which are structured evidence-based narrative claims of nonacademic impacts written by the groups of researchers evaluated. In the REF context, research impact has been defined as "an effect on, change or benefit to the economy, society, culture, public policy or services, health, the environment or quality of life, beyond academia" (Research Excellence Framework, 2014, p. 26). The weighting of the case studies for funding purposes has been increased from 20% in REF 2014 to 25% in REF 2021 (Research Excellence Framework, 2019). The mandatory "Sources to corroborate the impact" section of each impact case study contains citations to the evidence underpinning the narrative. These must be found by the researchers themselves, typically with the support of university impact support officers and digital resources, such as Altmetric.com (nonacademic citations to academic publications) and Overton.io (policy documents mentioning researchers).
A range of sources may be used as evidence of nonacademic impacts. These include government publications, regulations, legislation, policy documents, parliamentary reports, statistics, white papers, medical treatment information sheets, clinical guidelines, patents, standards, book reviews, and news stories. For example, an independent review of the role of metrics in research assessment in REF 2014 and future exercises has suggested that "citations from online 'grey' literature seem to be an additional useful source of evidence of the wider impact of research, but there do not seem to be any systematic studies of these" (Wilsdon et al., 2015, p. 38). These sources of nonacademic impact evidence cannot be easily captured through scientific databases and may need extensive searches on the web to locate, if they are online at all. Although there have been attempts to propose methods to capture different types of nonacademic impacts based on web citation searches (Kousha, 2019) and social media websites (Thelwall, Haustein et al., 2013), these have tended to focus on assessing the availability of information rather than its utility for evidencing nonacademic impacts. It is therefore important to identify sources commonly used by academics to corroborate their claims of nonscholarly impacts in different subject areas. This may help researchers and university impact support officers to build their cases and may help altmetrics providers or others to index the necessary sources.
Most previous studies of REF case studies have used text mining (e.g., King's College London and Digital Science, 2015; Parks, Ioppolo et al., 2018) or content analysis (e.g., Brook, 2018; Wilkinson, 2019) to identify the types of impacts claimed by researchers, rather than the types of evidence cited. In contrast, one (not peer-reviewed) study has listed the 40 websites most cited in impact case studies, broken down into four broad disciplinary groups (Digital Science, 2016) but did not analyze the cited URLs further. There seems to have been no large-scale assessment of the types of URLs cited in "Sources to corroborate the impact" evidence sections. The current study addresses this gap with a hybrid automatic and manual method to extract and classify the most cited of these URLs for 6,637 downloadable REF 2014 impact case studies across all 36 UoAs. The same method can also be used for the systematic classification of URLs cited in future impact case studies (e.g., REF 2021) or other similar large-scale exercises with URL citations outside the UK to understand their characteristics and disciplinary differences.

Text Mining Analyses of REF 2014 Impact Case Studies
Several text mining studies have assessed the narrative sections of impact case studies. A large-scale topic modeling analysis of the REF 2014 impact case studies found subject differences in the types of impact reflected in them. For instance, in medical and biological sciences (Panel A), about 20% of the case studies related to "Clinical guidance", whereas in the arts and humanities (Panel D) the most common topic was "Media" (26%) (King's College London and Digital Science, 2015). Another text mining analysis of REF 2014 impact case studies used seven categories (People, Economic, Reach, Significance, Prestige, Health, and Environment) to identify quantitative indicators of impact, finding that sentences matching the categories People (35%) and Economy (30%) were the most common (Parks et al., 2018). A further study classified the words in two sections ("Summary of impact" and "Details of impact") of the impact case studies into six categories: Education (22.8%), Public engagement (17%), Environmental and energy solutions (17.7%), Enterprise (11.8%), Policy (17.1%) and Clinical uses (13.7%). Differences between broad disciplines in types of impact were identified. For instance, in the Social Sciences (Panel C) over one-third (34%) of the identified impact types were classified as Museums and cultural heritage, whereas in the Life Sciences (Panel A) about half of the impact types were categorized as Public health policy (Terämä, Smallman et al., 2016).

Content Analyses of REF 2014 Impact Case Studies
Several content analyses have used human coders to classify aspects of the REF 2014 impact case studies. In terms of the types of documents cited, most case studies corroborate impact through at least one of Testimonial (80%) or Project report (78%) compared to Websites (30%) or Media (26%) (Hughes, Webber, & O'Regan, 2019).
The types of narrative impact claims found have differed greatly between disciplines. An analysis of the REF impact case studies from one university faculty in Health and Applied Sciences (n = 18) found impacts on Policy (e.g., policy reports and guidelines), Specific information and advice (e.g., online materials or toolkits), Research field (e.g., clinical trial procedures), and Patient interventions, protocols or standards of care (Wilkinson, 2019). For 162 impact case studies submitted to the Public Health, Health Services and Primary Care subpanel, three quarters (75%) had impacts on New or revised clinical guidelines and more than half influenced International, national or local policy (54%) or changed Clinical or public health practice (52%) (Greenhalgh & Fahy, 2015). For 194 REF 2014 impact case studies in Business and Management, impact claims mentioned "Specific actions by practitioners or policy-makers" (93%), "Specific and quantifiable results" (43%), "Indirect influence on the public" (31%), or "Direct influence on the public" (4%). One study used a different approach to select case studies to examine. Using selected Leadership, Governance, and Management keywords, 1,309 relevant impact case studies were identified. Their most common impacts were related to Government policy (52%), Training (47%), Impact on understanding (e.g., awareness, attitude, or behaviors) (39%), and Strategy (e.g., knowledge transfer, organizational development, or performance) (37%) (Morrow, Goreham, & Ross, 2017).
Different types of evidence can be presented to justify impact claims. Most of the 63 Arts REF impact case studies contained evidence of the number of people who attended an event (73%). Other common types of evidence were implementing policy or influencing policymakers, industry, or other activities (60%), media coverage (52%), the number of events in a festival or other relevant cultural program (52%), and benefit to artists, curators, and cultural institutions (51%). The study argued that it is particularly challenging to provide evidence for artistic impacts in the REF because it requires looking at the opinions or behaviors of a wide range of audiences (Brook, 2018).
Some content analyses have examined the sources of evidence cited. For 46 cancer trials impact case studies, most (93%) of the supporting evidence was from either clinical guidelines (e.g., National Institute for Health and Care Excellence, National Comprehensive Cancer Network, or European Society for Medical Oncology) or trial research published by medical journals (e.g., The Lancet, Journal of Clinical Oncology, New England Journal of Medicine) (Hanna, Gatting et al., 2020). Another analysis of 25 Library and Information Science (LIS) case studies found that the most frequent types of impact evidence identified were about Cultural and heritage preservation, Historical archives, and Informing government policy. The categories Workers, Policymakers, Companies/businesses, and Governments were most frequently mentioned as research beneficiaries (Marcella, Lockerbie, & Bloice, 2016).

Alternative Sources for Assessing Wider Impacts of REF Case Studies
Alternative indicators might help to evidence the societal impacts of publications submitted as research outputs or referenced in impact case studies within REF 2014. One study identified mentions of social media platforms in REF 2014 impact case studies (all sections) through 42 terms, finding that blogs (52%), podcasts (21%), and YouTube (25.6%) were more commonly mentioned in Panel D case studies (Arts and Humanities) than other main panels. However, in Panel A (Medicine, Health, and Life Sciences) about a quarter (23.7%) of social media mentions were for YouTube, whereas Google Scholar (46%) was commonly referenced in Panel B (Physical and Mathematical Sciences), despite being a primarily academic source (it was sometimes used to evidence the credentials of the researcher or the scholarly uptake of the research despite this not being assessed, e.g., https://impact.ref.ac.uk/casestudies/CaseStudy.aspx?Id=20952, https://impact.ref.ac.uk/casestudies/CaseStudy.aspx?Id=938). In Panel C (Social Sciences), blogs (about 40%) were most common (Jordan & Carrigan, 2018). Another investigation gathered six altmetric indicators (Twitter, Wikipedia, Facebook, blogs, news, and policy-related documents) for publications (with DOIs) submitted either as REF 2014 research output or publications cited in impact case studies to support the underpinning research, finding that the publications referenced in impact case studies tended to be mentioned more commonly in altmetrics sources than were publications submitted as REF research outputs (Bornmann, Haunschild, & Adams, 2019).
Although an early study of REF 2014 case studies found no obvious association between altmetric scores and REF impact scores (Ravenscroft, Liakata et al., 2017), a later investigation found a significant correlation between altmetric scores and expert peer review ratings of nonacademic impacts for publications (with DOIs) cited in the "Underpinning Research" sections of 1,469 REF 2014 impact case studies submitted under main panel B (Wooldridge & King, 2019).
It seems that only one study has assessed the frequency of URL citations from all impact case studies, reporting the 40 most cited websites (Digital Science, 2016, p. 30, Annex 4). This study did not classify URL types and did not use manual checking to exclude URLs mentioned for other reasons (e.g., archived copies of submitted REF impact case studies from https://www.wiki.ed.ac.uk/ and https://apps.lse.ac.uk/). It also did not merge all relevant types of cited URLs under one category (e.g., URL citations from all newspapers or news agencies and sources under the category "News and media"). Thus, this study has not given an overall picture of the types of URL cited in REF case studies.

RESEARCH QUESTIONS
The objective is to identify the main types of websites cited in REF 2014 impact case studies. This will shed light on how academics in all fields use online sources differently to reflect the nonacademic impacts of their research. The following questions address different aspects of this.
1. Which types of website (e.g., news and media, governmental, clinical guideline, or social media) are cited in UK REF impact case studies to evidence research impacts?
2. Which websites (e.g., BBC, UK Parliament, or NHS) are most frequently cited in the impact case studies in all broad fields and all 36 Units of Assessment?
3. Are there disciplinary differences in the answers to the above questions?

METHODS

The Data Set of REF 2014 Impact Case Study URL Citations
The metadata and full text of all 6,637 downloadable REF 2014 case studies were obtained from the main REF website in Excel format. Of the 6,975 impact case studies submitted to REF 2014, only 6,637 were downloadable from the REF database due to reuse and licensing arrangements (https://impact.ref.ac.uk/casestudies/FAQ.aspx). A program was designed and added to the free Webometric Analyst software (see https://lexiurl.wlv.ac.uk/, "Extract URLs from Impact Case Studies" option under "Citations") to automatically identify and extract URL citations from these impact case studies. The term "URL citation" in this article refers to mentions of URLs in the "Sources to corroborate the impact" section of impact case studies (see Figure 1). Only this section of case studies was used for analyses because researchers "should list sufficient sources that could corroborate key claims made about the impact of the unit's research" in it, such as "reports, reviews, web links or other documented sources of information in the public domain" (Research Excellence Framework, 2014, p. 54). The official case study template recommended an indicative maximum of 10 references in this section (Research Excellence Framework, 2014). The software extracted 32,196 raw URLs from all impact case studies based on mentions of http://, https://, or www. anywhere in the references to corroborate the impact.
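The extraction rule described above (any string starting with http://, https://, or www.) can be illustrated with a short sketch. This is not the Webometric Analyst implementation: the regular expression, the function name `extract_urls`, and the rule for trimming trailing punctuation are all assumptions made for illustration.

```python
import re

# Assumed pattern: a URL starts with http://, https://, or www. and runs
# until whitespace or a closing bracket/quote character.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"')\]]+", re.IGNORECASE)


def extract_urls(section_text):
    """Return URL-like strings found in a 'Sources to corroborate' section."""
    urls = []
    for match in URL_PATTERN.finditer(section_text):
        # Drop trailing punctuation that usually belongs to the sentence.
        url = match.group(0).rstrip(".,;:")
        if url:
            urls.append(url)
    return urls


# Hypothetical section text built from URLs mentioned in this article.
sample = (
    "1. BBC News report: https://www.bbc.co.uk/news/health-18366437. "
    "2. NICE guideline (www.nice.org.uk/guidance/cg71)."
)
print(extract_urls(sample))
```

A pattern this simple will miss URLs that are split by spaces or line breaks in the source text, so some manual checking of the output would still be needed.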

Data cleaning
An initial check of the 32,196 extracted URLs showed that 1,929 were from the link shortening websites tinyurl.com (1,055) or bit.ly (874). Hence, a program in Webometric Analyst was used to identify the redirected URLs (see "Get redirected URLs" under the "Service" menu), finding 1,871 (97%) of the ultimately cited URLs, which were used for analysis. However, manual checks of URLs containing the terms "REF," "impact," or "case study" revealed that 1,059 of the extracted URLs (mostly from a few universities) were either archived copies of submitted REF impact case studies (e.g., https://ref2014.inf.ed.ac.uk/impact/) or other uploaded files or relevant information about the submitted impact case studies that were inaccessible (https://apps.lse.ac.uk/impact/download/file/1194) and hence were excluded from the study. To have more unique and reliable cited URLs for analysis, duplicate URLs in case studies were excluded (e.g., see https://impact.ref.ac.uk/casestudies/CaseStudy.aspx?Id=38782), giving a final total of 29,830 URLs from all 36 UoAs (data is available via https://doi.org/10.6084/m9.figshare.14447295.v1).

Semiautomatic Classification of the Websites of the Cited URLs
An initial URL classification scheme was developed by checking the most cited websites (i.e., domain name or domain name ending) of the URL citations from all UoAs. For instance, manual checks showed that many cited URLs from impact case studies in the arts, humanities, and social sciences were from news and media (e.g., BBC News, the Guardian, and the Daily Telegraph) or governmental websites (e.g., UK government and UK parliament). In Clinical and Applied Medicine, health care organizations (e.g., the National Health Service) and clinical guidelines (e.g., NICE clinical guidelines) commonly documented research impacts. In Science and Engineering subject areas, commercial or business websites (e.g., Rolls-Royce or Apple) also frequently evidenced societal impacts. Nevertheless, the initial categories were subsequently modified to include new types of websites identified during the classification process. For instance, only one general category was first assigned for social media websites, but due to many cited URLs to online videos, it was split into two: Social Media and Blogs (e.g., Twitter, Facebook, WordPress) and Video and Photo Sharing websites (e.g., YouTube, Vimeo, Flickr). Moreover, in the arts and music a new category was added for artistic-related websites that could not be classified elsewhere (e.g., music, film, television, galleries, and museums). The URLs cited by impact case studies were eventually classified into 18 categories and eight broad areas, as shown below.

Initial automatic classification of cited URL websites
Because it was not practical to manually classify the websites of all 29,830 cited URLs extracted from the impact case studies, a systematic method was developed to automatically match the domains of the cited URLs (e.g., https://www.bbc.co.uk/news/health-18366437) against a manually curated list of relevant websites in predefined categories (e.g., bbc.co.uk in the category News and media). The relevant URLs for each category were identified and extracted from different sources such as DMOZ-The Directory of the Web (https://dmoz-odp.org/), Wikipedia lists of websites (e.g., https://en.wikipedia.org/wiki/List_of_intergovernmental_organizations), and top visited websites listed by alexa.com in different categories (e.g., https://www.alexa.com/topsites/category/Top/Reference/Encyclopedias/). Additional searches were carried out to identify reliable lists of websites for each category, such as the Webometrics Ranking of World Universities (https://www.webometrics.info/) for university websites worldwide, a list of UK healthcare organizations published by the NHS (https://www.england.nhs.uk/tis/our-members/), Ulrich's Periodicals Directory (https://www.ulrichsweb.com/) for academic journal websites, the Directory of Intellectual Property Offices (https://www.wipo.int/directory/en/urls.jsp) for URL citations to patents, or National and International Clinical Guidelines Organizations (https://www.openclinical.org/guidelines.html/) for clinical guidelines. A program was written and added to Webometric Analyst to match lists of domain names in one category against the URLs from the impact case studies (see "Copy all URLs from long results files that match list of domain names" option under "Utilities").
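The domain matching step can be illustrated with a minimal sketch that looks up progressively shorter suffixes of a URL's host name in a curated domain list. The tiny `CATEGORY_BY_DOMAIN` table, the function names, and the "Not classified" label are assumptions for illustration; the curated lists used in the study were far larger and the matching logic in Webometric Analyst may differ.

```python
from urllib.parse import urlparse

# Hypothetical extract of a curated domain list (domain -> category).
CATEGORY_BY_DOMAIN = {
    "bbc.co.uk": "News and media",
    "nice.org.uk": "Clinical guidelines",
    "youtube.com": "Video and photo sharing",
    "nao.org.uk": "UK government",
}


def host_of(url):
    """Return the lower-case host name of a URL, with or without a scheme."""
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to find the host
    return (urlparse(url).hostname or "").lower()


def classify_url(url, category_by_domain=CATEGORY_BY_DOMAIN):
    """Match the host, then each shorter domain suffix, against the list."""
    labels = host_of(url).split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in category_by_domain:
            return category_by_domain[candidate]
    return "Not classified"


print(classify_url("https://www.bbc.co.uk/news/health-18366437"))
print(classify_url("youtube.com/watch?v=cysjxHzCoh0"))
print(classify_url("https://docs.google.com/document"))
```

Matching whole dot-separated suffixes, rather than substrings, avoids assigning a host such as notbbc.co.uk to the bbc.co.uk category.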
The systematic classification of the cited URLs may be useful to assess how academics are documenting research impacts in terms of the types of online sources but does not provide contextual evidence about how the cited sources have been used; this needs manual content analysis. To make sure that the URL citations were reasonably classified into the predefined categories by the above automatic domain name-based method, the 20 most cited URLs in each of the 36 UoAs from the initial systematic classification were manually checked and reclassified if necessary (20 × 36 = 720 URLs). For instance, URLs for the National Audit Office (nao.org.uk) were first classified as UK organization URL citations, but the manual checks revealed that this organization is part of the UK government sector. The manual checking was based on visiting the websites and reading relevant sections, including "about us," "our mission," or "contact us," if necessary. Nevertheless, about 12% (3,545 out of 29,830) of the URLs cited in the case studies were not classified even after the manual checking phase. Unclassified URLs were more common in UoA 12-Aeronautical, Mechanical, Chemical and Manufacturing Engineering (21.4%), UoA 15-General Engineering (19.5%), UoA 11-Computer Science and Informatics (19.2%), and UoA 13-Electrical and Electronic Engineering (17.9%) than in UoA 2-Public Health, Health Services and Primary Care (5.6%), UoA 22-Social Work and Social Policy (5.7%), UoA 18-Economics and Econometrics (6.0%), and UoA 20-Law (6.2%). This is because in the engineering fields researchers may use a range of different industries, businesses, or manufacturing companies as evidence of nonacademic impacts. The 3,545 unclassified URLs were from 3,028 different websites, suggesting that each of these websites was rarely cited in impact case studies. For instance, the most common unclassified websites were docs.google.com (cited nine times), thefreelibrary.com (cited seven times), and scribd.com (cited six times). Table 1 gives examples of website reclassifications from this stage.
The 36 REF UoAs were combined into seven broad subjects for disciplinary analyses (Table 2). The REF classification of Units of Assessment across four main panels (A-D) was modified to represent results within more uniform broad subject areas. For instance, all artistic fields (including Art and Design: History, Practice and Theory and Music, Drama, Dance, and Performing Arts) and related humanities (e.g., English Language and Literature, History, Philosophy, and Law) in Panel D were combined to form the broad subject categories Arts and Humanities, respectively. Similarly, all relevant engineering, hard science subjects and medical relevant fields were merged to represent Engineering and Computer Science, Hard Sciences, and Medical Sciences and Healthcare.
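As a rough illustration of how candidates for the manual checking phase could be selected, the sketch below picks the most cited websites within each UoA and reproduces the arithmetic quoted in the text. The function name, the (UoA, website) record layout, and the toy records are assumptions; only the figures 20, 36, 3,545, and 29,830 come from the study.

```python
from collections import Counter, defaultdict


def top_sites_per_uoa(records, n=20):
    """Return the n most cited websites for each UoA.

    records: iterable of (uoa, website) pairs, one per URL citation.
    """
    counts = defaultdict(Counter)
    for uoa, site in records:
        counts[uoa][site] += 1
    return {
        uoa: [site for site, _ in counter.most_common(n)]
        for uoa, counter in counts.items()
    }


# Toy records, invented for illustration only.
records = (
    [("UoA 1", "nice.org.uk")] * 3
    + [("UoA 1", "who.int")] * 2
    + [("UoA 1", "nhs.uk")]
    + [("UoA 34", "tate.org.uk")]
)
print(top_sites_per_uoa(records, n=2))

# Figures reported in the text.
manual_checks = 20 * 36                   # items checked by hand
unclassified_pct = 100 * 3545 / 29830     # share left unclassified
print(manual_checks, round(unclassified_pct, 1))
```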

RESULTS
Considering the indicative maximum of 10 references to support wider impacts of research in the REF 2014 template (Research Excellence Framework, 2014), it is unsurprising that the average number of URL citations is less than seven for all UoAs (Figure 2). Nevertheless, in public health and other allied health professions, impact case studies tended to cite on average more online sources (6.0 to 6.1) than in other fields, such as most engineering subjects (2.6 to 3.8). This may reflect health information being increasingly public and online, in contrast to engineering research innovation documentation.
About a third of the cited URLs in the impact case studies were for other organizational websites (30%), with many others directed to news and media (19%) and government (17%) websites. Nevertheless, there are clear disciplinary differences (Figure 3). For instance, in Medical Sciences and Biological and Agricultural Sciences, URL citations of organizational websites were more numerous (40% and 32%, respectively), whereas in the Arts and Humanities, citations to news and media (28% and 26% respectively) were more common. In Engineering and Computer Science, more citations to commercial and business websites (26%) were identified. This confirms that broad fields tend to cite different types of online sources to evidence the impact of their research. Figure 4 gives more fine-grained details about the types of websites cited in the impact case studies. For instance, artistic contents (Music, Drama, Dance, and Performing Arts) were commonly cited in the Arts impact case studies (14.2%) and citations to the UK government and the UK parliament were most common in Social Science impact case studies (20.3% and 6.5%). UK healthcare organizations (e.g., NHS) and clinical guidelines (e.g., NICE) were more cited in Medical Sciences and Healthcare impact case studies (15% and 12% respectively) than in other fields. Perhaps surprisingly, citations to social media websites were relatively common in Humanities (7%) and Arts (5%) impact case studies.
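Percentages of this kind are shares of URL citations within each broad subject. A minimal sketch of that calculation is given below, assuming each classified URL citation is available as a (broad subject, category) pair; the function name and the toy records are invented and do not reproduce the study's data.

```python
from collections import Counter, defaultdict


def category_shares(records):
    """Return {subject: {category: percent of that subject's citations}}.

    records: iterable of (broad_subject, category) pairs.
    """
    counts = defaultdict(Counter)
    for subject, category in records:
        counts[subject][category] += 1
    shares = {}
    for subject, counter in counts.items():
        total = sum(counter.values())
        shares[subject] = {
            category: round(100 * k / total, 1)
            for category, k in counter.items()
        }
    return shares


# Toy records, invented for illustration only.
toy = [
    ("Arts", "News and media"),
    ("Arts", "News and media"),
    ("Arts", "News and media"),
    ("Arts", "Social media and blogs"),
    ("Medical Sciences and Healthcare", "Clinical guidelines"),
]
print(category_shares(toy))
```

Normalizing within each subject, rather than over all citations, is what makes subjects with very different numbers of case studies comparable.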
The 10 websites with the most citations from all impact case studies also vary in prevalence between broad subjects (Figure 5). In the Arts and Humanities, 6% and 5% of citations from the impact case studies, respectively, were to the BBC website and 3% and 4% to the Guardian. YouTube videos were also more cited in the Arts (3.5%) and Humanities (2.8%) than in other subjects. In Social Sciences, citations to the UK parliament (5%) and in Medical Sciences and Healthcare citations to the UK NHS (6.7%) and the National Institute for Health and Care Excellence (5.4%) were more common. This suggests that disciplinary-relevant online sources can be used to reflect the societal impacts of research. Tables S1-S4 in the Supplementary Information report similar information for the 36 UoAs.

DISCUSSION
The results show, for the first time, the main types of website cited in all REF impact case studies. The method can be used to identify the most common online sources provided for societal impact claims in the new REF 2021, informing evaluators about the norms of societal benefits of research across their own field judgements. This might be more useful in the arts, humanities, and social sciences, where academics may use nonstandard online sources such as news sources, multimedia information, and social media for evidencing research impact. Because many online sources used as evidence of nonacademic impacts have not been covered by altmetric platforms, future tools may capture and analyze societal impacts from wider online sources of impact. The results show the existence of substantial disciplinary differences in the websites cited, as might have been suspected from prior evidence of disciplinary differences in the types of impact claim made (e.g., Hanna et al., 2020; Marcella et al., 2016). There are many ways in which academics may cite online evidence of the societal impacts of their research in REF impact case studies. In this section, examples of the most frequently cited URLs are given across subjects to provide richer insights into the main quantitative findings above.
In main Panel A (Medicine, health and life sciences), the most common types of claimed evidence were from clinical guidelines or trials (nice.org.uk or clinicaltrials.gov) followed by the World Health Organization (who.int), the UK NHS (nhs.uk), and the UK government (defra.gov.uk, gov.uk). For instance, in Public Health, Health Services, and Primary Care about a third (32.9%), in Clinical Medicine over a quarter (26.4%), and in Allied Health Professions, Dentistry, Nursing, and Pharmacy less than a fifth (18.1%) of the claimed online evidence about benefits or wider impact of submitted impact case studies were from the above web sources. In Clinical Medicine, a wide range of clinical documents were used to demonstrate the benefits of medical research, such as changes in drug labels and guidelines (fda.gov/drugs/postmarket-drug-safety-information-patients-and-providers/information-abacavir-marketed-ziagen-and-abacavir-containing-medications), use in NICE clinical guidelines as treatment evidence (https://www.nice.org.uk/guidance/cg71) or citation by WHO guidelines for health policy (who.int/nutrition/publications/guidelines/potassium_intake_printversion.pdf).
In Agriculture, Veterinary, and Food Science, 8.5% of the URLs were from European Union websites, including the European Food Safety Authority (efsa.europa.eu), the European Medicines Agency (ema.europa.eu), and other relevant EU sections of food, farming, and fisheries websites. In Biological Sciences, some cited URLs (4%) were from ClinicalTrials.gov, where many mentioned clinical trials for the safety or efficacy of new drugs or treatments (e.g., clinicaltrials.gov/ct2/show/nct01844986 and clinicaltrials.gov/ct2/show/nct01712074).
In main Panel B (Physical sciences, engineering and mathematics), a combination of news, online videos, and specialized websites were frequently used as impact evidence in case studies. For instance, in Earth Systems and Environmental Sciences 5% were from environment,
In Art and Design History, Practice, and Theory many impact case studies cited information from galleries or museums, such as information about an exhibition (tate.org.uk/whats-on/tate-britain/exhibition/turner-and-masters) or an artistic object (tate.org.uk/art/research-publications/gaudier-brzeska-wrestlers) in the Tate Modern. Other relevant artistic information was also cited, such as a review of a painting exhibition (youtube.com/watch?v=mlsn4Za5-as), and specific galleries or exhibitions (e.g., whitechapelgallery.org/exhibitions/john-latham-anarchive/). In Music, Drama, Dance, and Performing Arts, 6% of the cited URLs were for online videos, such as a theatre play preview (youtube.com/watch?v=br9tafybBXM), music performances at a festival (youtube.com/watch?v=-Z6H8jpd1fU), an interview with a Professor of Music (youtube.com/watch?v=D1EUurZ4s98), a commercial racing game soundtrack, Need for Speed Shift 2: Unleashed (youtube.com/watch?v=mB6X3LGIT30), and a computer-generated light and sound music performance (youtube.com/watch?v=cysjxHzCoh0).

Limitations
This study has several limitations. The cited URLs studied here only include online sources to corroborate impact. This ignores all cited offline or unpublished sources (e.g., letters, emails, reports, statements) that may give different insights into the types of evidence used. Because there is no practical way to manually classify a large number of URLs, an ad hoc method was used to categorize the broad type of the cited URLs. Although the 720 most cited URLs from the initial systematic classification were manually checked and reclassified when necessary, about 12% of the URLs cited in the case studies were not classified. This was particularly common in Engineering and Computer Science, where a range of different commercial websites could be claimed as evidence of nonacademic benefits of engineering research. We could not find a practical method to classify these remaining websites, and these less common URLs may well give a different perspective. Moreover, this study did not assess the contexts or motivations for citing URLs and hence it is not clear how the online sources were used by the impact case studies. For instance, a news story cited by a clinical medicine case study might reflect a publicity claim (i.e., the news story is the impact) or may evidence uptake of an invention by a company (i.e., the news story reports the impact). Finally, the classification of the 20 most cited URLs in each of the 36 UoAs was not cross-checked by a second classifier and hence there could be disagreement about the characteristics of some websites, such as the National Audit Office (nao.org.uk), which is an independent watchdog scrutinizing public spending for parliament.

CONCLUSIONS
The results show that a wide range of nonacademic online sources have been used to corroborate the benefits of research, including news stories, online videos, government publications, parliamentary records, and social media websites, although there were disciplinary differences. In answer to the first research question, in Medical and Health Sciences, clinical guidelines and UK healthcare organizations were most frequently cited, whereas in the Arts and Humanities, news and media, and in Social Sciences, government and parliamentary publications were more commonly mentioned in impact case studies. In answer to the second and third research questions, there are large disciplinary differences in the websites most commonly cited in impact case studies across the 36 REF UoAs (Supplementary Information, Tables S1-S4). For instance, the NICE clinical guidelines were frequently mentioned in Clinical Medicine (8.3%), WHO in Public Health, Health Services, and Primary Care (10.5%), and European Union websites in Agriculture, Veterinary, and Food Science (8.5%) and Economics and Econometrics (5.9%). Similarly, in History many URLs were from BBC News (9.7%) and the Guardian (4%), whereas in Law the UK Parliament (10.2%) and Ministry of Justice (5.4%) were more frequently mentioned. In Chemistry, Physics, and Mathematical Sciences about 3% of the URL citations were to YouTube, indicating widespread disciplinary differences in the online sources used to evidence the wider impacts of research.