Open access and international coauthorship: A longitudinal study of the United Arab Emirates research output

Abstract
We investigate the interplay between open access (OA), coauthorship, and international research collaboration. Although previous research has dealt with these factors separately, there is a knowledge gap in how these interact within a single data set. The data include all Scopus-indexed journal articles published over 11 years (2009–2019) where at least one of the authors has an affiliation to a United Arab Emirates institution (30,400 articles in total). To assess the OA status of articles, we utilized Unpaywall data for articles with a digital object identifier, and manual web searches for articles without. There was consistently strong growth in publication volume counts as well as shares of OA articles across the years. The analysis provides statistically significant results supporting a positive relationship between a higher number of coauthors (in particular international) and the OA status of articles. Further research is needed to investigate potential explanatory factors for the relationship between coauthorship and increased OA rate, such as implementation of national science policy initiatives, varying availability of funding for OA publishing in different countries, patterns in adoption of various OA types in different coauthorship constellations, and potentially unique discipline-specific patterns as they relate to coauthorship and OA rate.


INTRODUCTION
Open access (OA) publishing in journals is growing globally, both as entire journals and on the article level, in particular through hybrid OA transformative agreements (Crawford, 2021; Jahn, Matthias, & Laakso, 2022). Repository self-archiving by authors is also a major enabler of OA to content that would otherwise only be accessible behind a paywall (Thibault, MacPherson et al., 2018). Science policy and practices for OA publishing have evolved unevenly from an international perspective, where many European countries have in recent years been advancing rapidly compared to the rest of the world. Research funders and higher education institutions (HEIs) in Europe are increasingly requiring that the publications produced by funded or affiliated researchers are made available OA immediately (cOAlition S, 2018; ROARMAP, n.d.). Although OA policies and practices are locally anchored to specific organizations and funding instruments, research is often conducted through international collaboration. Institutional requirements and possibilities for OA availability can thus also affect coauthors, even though their own circumstances do not require or enable OA to publications. Knowledge about how this phenomenon, coauthorship-induced OA, exists and has developed over time is lacking.
One key aspect that has contributed to the slow progress of knowledge development related to more intricate aspects of OA, such as the phenomenon of coauthorship-induced OA described in the previous paragraph, is the dearth of comprehensive basic data. Even though OA publishing has been growing strongly for over 20 years, a comprehensive central database for searching and retrieval of OA resources has still not been realized (Azadbakht & Schultz, 2020). The freely accessible Unpaywall database is currently the most comprehensive resource, but it is built around the fundamental principle that included articles have a digital object identifier (DOI), which not all journals use.
In a study on DOIs in the Web of Science (WoS) Core Collection and Scopus from 2005 to 2014, Gorraiz, Melero-Fuentes et al. (2016) observed that while 90% of all citable items in the Sciences and the Social Sciences in 2014 had a DOI, the percentage is about 50% in the Arts & Humanities. As this concerns journals within these indexes, the lack of DOIs for journals outside them can be assumed to be higher. Although the problem of lacking comprehensive data, or what Nguyen, Luczak-Roesch et al. (2022) refer to as "fitness for use," is something that also concerns data availability in scholarly journal publishing overall, with the selection of what data source one uses strongly shaping what the landscape looks like (Basson, Simard et al., 2022), for research on OA, this problem is heightened due to often having to rely on multiple layers of incomplete data to gain an overview of the situation. To counter these shortcomings, Xu, Yue et al. (2017) conclude that a multisource data fusion (MSDF) is "necessary and meaningful" in scientometrics. Overall, there is a need for more research on OA that also includes parts of the publication landscape that are omitted if only readily available data is used.
Considerable existing research is devoted to descriptive article-level growth analysis of OA utilizing Scopus and WoS, but less attention has been paid to the connection between openness and coauthorship, international collaboration, and journal host country. Using manual data enrichment, this study provides new insight into these phenomena with United Arab Emirates (UAE) research output providing the base data. The objective is to provide a granular analysis of research article output in the country, level of openness, and connection to international coauthorship.
The specific research questions that we seek answers to through the use of longitudinal data concerning journal article output which involves at least one UAE-affiliated author are:
1. What are the key OA characteristics of journal articles from UAE-affiliated authors?
(a) What are the shares of different OA types?
(b) What are the disciplinary differences in OA shares?
(c) Does the journal host country have a connection to OA availability?
(d) What are the most popular repositories for self-archiving?
2. How do different aspects of coauthorship interplay with OA shares of UAE-affiliated research output?
(a) How is coauthorship distributed globally?
(b) Does the number of coauthors have a connection to OA availability?
(c) Does the geographic region of coauthors have a connection to OA availability?
The UAE makes for an interesting case for the study of both research output and OA growth for a number of reasons. First, the UAE is a very young country, established only in 1971 and with its oldest university established in 1976. Although it does not have an old research tradition or a well-established science policy, it has made giant strides in transforming its research landscape. Al Marzouqi, Alameddine et al. (2019) revealed that UAE research productivity saw a 16-fold increase between 1998 and 2017, and the UAE Commission for Academic Accreditation (2022) currently lists 74 active higher education institutions. Second, the UAE research workforce includes a high share of expatriates and is thus transient by nature, but these researchers may also bring along their collaboration networks and thereby boost the UAE's coauthored publication output. Finally, we could not identify any mention of sources of article processing charge (APC) funding within the UAE across all resources analyzing extramural and intramural funding in the country. This is in stark contrast with countries that have a high OA uptake and highlights the unique characteristics of the UAE research environment. Alsharari (2018) notes the preoccupation of UAE universities with gaining recognition through international accreditation. He further adds that "Local and global rankings are assuming greater importance …." Research performance plays a major role in most university rankings and often relies on outputs in international journals, preferably high-ranked ones. It is this preoccupation with rankings (among other data quality aspects that are discussed in the methods section) that supports our choice of Scopus as a source of data, as it is a main resource of research output data for university rankings such as Quacquarelli Symonds (QS) and Times Higher Education (THE).
What is relatively unexplored is how the growth in OA journal articles of UAE-affiliated authors has developed over recent years and how that might be connected to changes in international coauthorship among these authors. By designing a study around this topic, we aim to improve the current level of knowledge regarding the influence of coauthorship on the OA status of articles. We also aim to expose the level of compromise that reliance on readily available OA data implies when investigating phenomena such as this.

Challenges to OA Analysis and Retrieval
The road to comprehensive study of OA is strewn with methodological options and associated tradeoffs that need to be considered. First, current data sources often fail to provide comprehensive coverage data on different types of OA, leading researchers to resort to manual data collection, which has implications for how studies are skewed towards certain languages, countries, and disciplines. Second, discovery and retrieval of OA sources is hampered by inconsistencies among the existing OA finding tools.
Although OA is the most mature branch of open science so far, measuring the OA share of journal articles is a complex task given the many variants of OA and the multiplicity of approaches, as well as the data sets used. Taubert, Hobert et al. (2019) illustrate this point with a listing of about 11 different OA types synthesized from existing OA research. Most bibliographic indexes do not capture data on all these OA variants, which can overlap with each other as multiple copies of publications are available through different channels over time, thus introducing a methodological challenge for bibliometric analysis. As most bibliographic databases are designed primarily for content retrieval purposes, bibliometric analysis of metadata can be just a secondary purpose (Hood & Wilson, 2003). Researchers often resort to extensive manual data collection to rectify gaps in the data (see e.g., Boufarss, 2020). Another issue with bibliometric analysis of international scope, whether or not it includes OA dimensions, is related to the biases in the two most commonly used data sources, namely WoS and Scopus. These two services contain biases in coverage related to disciplines, countries, and languages (Khanna, Ball et al., 2022; Mongeon & Paul-Hus, 2016). The bias towards English language publications also reported by Björk (2019) makes comprehensive bibliometric studies of non-English-speaking countries like the UAE skewed, as part of their research output is often underrepresented. In a recent comprehensive analysis of the leading sources of citation data, Martín-Martín, Thelwall et al. (2021) reveal that sources suffer from one of two main limitations: limited coverage in the case of Scopus or WoS; and limited search functionalities in the case of Google Scholar, Microsoft Academic, Dimensions, and OpenCitations COCI.
With this in mind, Scopus has the upper hand because of greater coverage than WoS and more metadata fields, enabling deeper analysis (Thelwall & Maflahi, 2022). This last argument is supported by Guerrero-Bote, Chinchilla-Rodríguez et al. (2021), who concluded that there is greater coverage at the level of countries and institutions in Scopus than in Dimensions even though the latter has overall greater data coverage than the former. However, as of writing, there are no competing, more inclusive services than Scopus and WoS that would offer the same level of curation for journal and article-level metadata concerning active peer-reviewed journals that fulfill some common baseline criteria, which means that they can still be very useful for various research purposes as long as the limitations and biases are acknowledged.
At a time when OA uptake is trending upward (Archambault, Amyot et al., 2014; Piwowar, Priem et al., 2018; Piwowar, Priem, & Orr, 2019), discovery and retrieval of OA resources has been an issue that many service providers have worked on improving (Azadbakht & Schultz, 2020; Dhakal, 2019; Willi Hooper, 2017). The heralded general objective of the OA movement to provide access to scholarship to anyone with internet access is not achieved if people cannot find OA versions of articles easily (Schultz, Azadbakht et al., 2019). OA discovery tools such as Unpaywall, Kopernio, OA button, and Lazy Scholar have tried to resolve this challenge, as demonstrated by Azadbakht and Schultz (2020), Duffin (2020), Else (2018), and Schultz et al. (2019). Willi Hooper (2017) reviewed Unpaywall as an OA finding tool, finding it advantageous compared with Google Scholar, which has accuracy issues and links to academic social networks (ASNs) that can have copyright compliance issues. This finding is shared by Dhakal (2019), who stressed Unpaywall's focus on legally available OA articles. Other merits of Unpaywall have also been emphasized by Dhakal (2019), such as the provided Simple Query Tool, the REST API, and the full database snapshot, which all facilitate establishing OA status for larger amounts of articles as long as a DOI can be provided for each.
Whether it is the unequivocal focus of OA studies on journal literature, absence of comprehensive data sources that cater for the different OA models and are unbiased, or unreliable discovery and retrieval tools, the challenges to OA studies abound.

Research Collaboration and OA
the Middle East, North Africa, and Turkey (MENAT) context, as they believe that the upsurge in publications is largely due to international collaboration.
While coauthorship is the most common indicator of research collaboration (Nguyen et al., 2022), drawing broad conclusions regarding the intensity and quality of research collaboration purely based on bibliometric data should be done with caution. Katz and Martin (1997) believe that "co-authorship is no more than a partial indicator of collaboration" as interinstitutional and international collaboration does not have to be a collaboration between individuals. A case in point is when a researcher lists two affiliations, indicating an overarching institutional collaboration. In fact, a "collision of collaboration and authorship" (Birnholtz, 2006) can even happen when collaboration breeds mass authorship or hyperauthorship, with some articles in physics, for example, listing thousands of coauthors (Kahn, 2018). Equally problematic is what Moustafa (2020) refers to as "octopus affiliations," referring to authors listing multiple affiliations. This could be in exchange for financial reward by institutions seeking to enhance their ranking or authors' desire to boost their reputation by associating themselves with prestigious institutions. These practices have deep implications for attribution and credit, ownership, and reputation. Glänzel and Schubert (2004) provide a detailed fundamental overview of coauthorship. They note that almost 20 years ago, one could already observe an overall trend in terms of decrease in single-author publications. This was counterbalanced by intensifying collaboration in all disciplines. In a study of coauthorship in different disciplines from 1900 to 2020, Thelwall and Maflahi (2022) reported a steady increase in the mean number of authors per article. 
Even though Glänzel and Schubert (2004) noted that this increase was a "global law" with all countries, regardless of the size of their publication output, having witnessed this growth, they observed that medium-sized or small countries had higher international copublications than large countries.
Benefits of coauthorship transcend the impact it can have on an individual author's or institution's scientific profile. Wagner, Whetsell et al. (2018) state that "the more internationally engaged a nation is in terms of co-authorships and researcher mobility, the higher the impact of scientific work." If this statement is accurate and with a high incoming mobility as demonstrated by El-Ouahi, Robinson-García, and Costas (2021) and with internationally coauthored publications of nearly 70% in 2015 (Moed, 2016), the UAE should record high scientific work impact. In fact, Al Marzouqi et al. (2019) reported an improvement in the percentage of articles from the UAE that were published in the top 10th percentile (by CiteScore) ranked journals and that this metric was higher than the average for Gulf Cooperation Council and Arab League countries.
Very little research has been published on the relationship between the number of authors and level of articles' openness. Though old and exploring a different aspect of OA, Eysenbach's (2006) research found OA articles to have a "higher number of authors." This could be attributed to two factors, namely higher self-archiving probability with more authors and increased potential of APC funding by one of the author's affiliations. Hajjem and Harnad (2007), in a study from around the same time frame, found that the number of authors among other factors "contributes an independent, statistically significant increment to the citation counts." Another challenge brought about by coauthorship is who bears the cost of publishing OA. In their study on OA costs, taking into account author roles and the number of authors in Germany, Bruns, Rimmert, and Taubert (2020) identified five payment models for APC payments: First author model, Reprint author model, Institutions contribute equally, Institutions contribute, weighted by the number of authors, and Institutions contribute, weighted by author-institution-combination. They conclude that these models result in different financial contributions and thus some are preferred by some institutions over others. Morillo (2020) looked more closely at the relationships between OA (based on Unpaywall data), funding types (national, international, EU funded), collaboration (national, international coauthorship), and citations for WoS articles published in 2017 in the disciplines of Immunology and Economics. One clear difference from the start was that the overall level of OA among the articles differed substantially between the two disciplines: 50% for Immunology and less than 15% for Economics. 
Although the studied factors are intertwined and influence each other in different ways, the author could conclude that the probability of an article being OA was significantly higher in Immunology when the study was EU funded and included international collaboration, and that OA status was positively connected to accrued citations. Each factor independently increased the probability of an article being OA, and particularly so when several of them were present for the same article. The trend was similar for Economics articles, but the overall strength was weaker due to the substantially lower OA uptake overall.
Based on research on the initial years of transformative agreements in Germany, Haucap, Moshgbar, and Schmal (2021) found a significant change in publication patterns among authors, where they more frequently select journals that are part of such agreements than journals that are outside of their coverage. Similar results were also recently found by Wenaas (2022) for articles from authors affiliated with Norwegian institutions. What does this mean for studies that relate to coauthorship and openness? OA grows by two mechanisms: directly as a consequence of outlets making articles open that would otherwise have been closed, and by stimulating authors to select journals that enable OA at no extra cost.

The UAE Landscape
Article output from Arab countries was slow in catching up but is quickly compensating for this latency as part of a global trend ending the dominance of the transatlantic research axis, which had a share of 75-80% of all academic research output (Adams et al., 2021). Adams et al. (2021) further state that the number of papers output from the MENAT region saw a 20-fold growth between 1981 and 2019. This translates into a move from 2% to 8% of global share. They also share the findings of Cavacini (2016) that research output from the region is dominated by Israel, Iran, Turkey, Saudi Arabia, and Egypt, which means that other countries, including the UAE, still play a marginal role in scientific production. In 2019, the UAE contributed only 15% of the Gulf Cooperation Council research productivity against 63% for Saudi Arabia (Ajayan, Balasubramanian, & Ramachandran, 2022).
The UAE research landscape presents some unique characteristics, including, but not limited to, the country being only around 50 years old (the oldest university being only around 46 years old), a highly transient research community with temporary residency status, and a nonhomogeneous multilingual population. All these factors have a direct impact on research output. However, the situation is set to change in the UAE as the national science policy is being geared towards increased scientific output (Boufarss & Laakso, 2020). This direction started with the launch of UAE Vision 2021, followed by the release of the UAE Innovation Strategy, the National Strategy for Higher Education 2030, the announcement of the National Advanced Sciences Agenda 2031, the Research and Development (R&D) Governance Policy, and finally by the recently launched Golden Visa scheme, aiming to attract and retain outstanding researchers. Furthermore, initiatives that aim to provide funding for research were launched recently and include, among many others, the Mohammed bin Rashid Al Maktoum Knowledge Foundation, the National Research Foundation, the Abu Dhabi Research and Development Authority, the Advanced Technology Research Council, and the Abu Dhabi Ghadan 21 Research and Development funds. The Research and Development Governance Policy lists among its aims to "foster an agile, robust national ecosystem for research and development in the UAE" and "set standards to improve research, elevate the performance of the national R&D activities." These policies and initiatives are likely to have had a visible impact on scientific research output. A Clarivate Analytics (2019) report estimated that UAE research articles indexed in the WoS Core Collection increased by 450% between 2008 and 2018. The same Clarivate report states that the UAE is part of the OA growth trend, with a gradual increase in the percentage of OA articles published in recent years.

METHODOLOGY AND DATA
Some of the most recent and comprehensive studies on national-level OA dimensions have been based on nationally collected and curated Current Research Information System (CRIS) data, which, when a country has such available, still provide breadth at the expense of standardized detail when it comes to, for example, affiliation metadata for all involved coauthors and OA type categorization (Pölönen, Laakso et al., 2020;Wenaas, 2022). In the absence of comprehensive local or regional indexes of journal articles with the required author-affiliation metadata for each article, we utilized Scopus as a source of data. Boufarss (2020) states that "regional indexes such as ARCIF and Arab Impact Factor are limited in their coverage of locally published journals." In fact, these two products are Journal Impact Factor indexes. Similarly, according to Ouahi (2021), the share of UAE journals in Clarivate's Arabic Citation Index is a mere 2%. This index is also highly biased with nearly 79% of records in Arts & Humanities, and Social Sciences categories and nearly 93% in Arabic language (Ouahi, 2021). The choice of Scopus is also supported by the perceived focus among UAE institutions on publications indexed mainly in Scopus and WoS services, as Boufarss and Laakso (2020) found that the greatest majority of HEIs consider their researchers' publishing in Scopus and WoS essential and a high priority.
The key steps of the data collection methodology are presented in Figure 1. We initially extracted from Scopus a list of articles published during a period of 11 years (2009 to 2019) and authored by researchers affiliated with UAE institutions, and imported the data into Microsoft Excel. Scopus data were extracted using SciVal in February 2020 and data for the year 2019 were appended in October 2021. A query for publications limited by country to the "United Arab Emirates" was performed. For the sake of focus on primarily peer-reviewed content, and comparability with other studies such as Piwowar et al. (2019), the query was further limited to articles, articles in press, business articles, and data papers. Our choice of this time frame initially emanated from a desire to analyze a decade of data, but was later expanded to 11 years. Our choice of 2009 was motivated by data in Ajayan et al. (2022) and Al Marzouqi et al. (2019), which indicated a big jump in UAE research article output in that year, and also by the general momentum for OA journal publishing globally, which was building up more seriously and involving several OA types around that time frame (Piwowar et al., 2018). Articles published in 2020 were excluded, as metrics were still at risk of being "incomplete" for that year at the time of data collection, particularly regarding self-archived materials, which are often under an embargo before they can be distributed on the web. Conference proceedings, books, and book chapters were excluded in an endeavor to have a consistent data set that could be analyzed for journal OA status. To enable the analysis of journal choice and possible relationship concerning language and geographical bias, we also enriched the data with the journal country using the ISSN Portal.
For the records without a DOI (2,133 articles), we matched them to DOIs in Crossref metadata using its Link References feature, or searched journal websites manually and appended a DOI whenever one was found. After this step, 297 articles remained for which a DOI could not be found. All records with a DOI were batch run through the Unpaywall Simple Query Tool and the resulting data were appended to those records. For the remaining articles that were published without DOIs, we manually collected OA status information, following the basic classification principles that Unpaywall uses, so as to have a uniform data set. To remain in line with Unpaywall data harvesting principles, OA resources in services such as ResearchGate and Academia.edu were excluded.
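The DOI-based lookup above was run through the Unpaywall Simple Query Tool. As a rough programmatic counterpart, the following minimal sketch builds a request URL for the Unpaywall REST API and reads the OA type out of a returned record. The helper names, the placeholder DOI and email, and the sample record are illustrative assumptions, and no network request is made.

```python
import json
from urllib.parse import quote

UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"

def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for one DOI; the API expects a contact email."""
    return f"{UNPAYWALL_BASE}{quote(doi.strip(), safe='/')}?email={quote(email)}"

def oa_status(record: dict) -> str:
    """Return the OA type of an Unpaywall-style record, defaulting to 'closed'."""
    if not record.get("is_oa"):
        return "closed"
    return record.get("oa_status", "closed")

# Hypothetical response body, reduced to the two fields used here.
sample = json.loads('{"doi": "10.1234/example", "is_oa": true, "oa_status": "gold"}')

print(unpaywall_url("10.1234/example", "you@example.org"))
print(oa_status(sample))
```

Records without a DOI cannot be looked up this way, which is why the remaining articles had to be classified manually.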
The data were then enriched with a coauthor affiliation region based on Scopus affiliation country data. The affiliation countries were grouped into six regions, namely Africa, Asia, Australasia, Europe, North America, and South America. It bears noting that one author might be affiliated to more than one institution or country through one article. From the perspective of this study this has been seen as an expression of international collaboration and aligned with the aims of the study and can be included as such rather than something that had to be fractionalized or cleaned out from the data. To give some scope for this data property, we calculated that 7,724 articles (25%) included more affiliations than the count of total authors in the metadata. Journal topic clusters were grouped into the five main Scopus subject areas (Multidisciplinary, Life Sciences, Health Sciences, Physical Sciences, and Social Sciences and Humanities) by mapping the All Science Journal Classification Codes (ASJC) field against the Scopus subject areas.
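The region enrichment and the affiliation-count check can be illustrated with a minimal sketch. The country-to-region lookup below is a small hypothetical subset (the study's full mapping is not reproduced here), and the field names and sample records are assumptions made for illustration.

```python
# Hypothetical, partial country-to-region lookup using the six regions named above.
REGION_OF = {
    "United Arab Emirates": "Asia",
    "Egypt": "Africa",
    "Germany": "Europe",
    "United States": "North America",
    "Brazil": "South America",
    "Australia": "Australasia",
}

def regions_for(countries):
    """Distinct regions among an article's affiliation countries (unknown countries are skipped)."""
    return sorted({REGION_OF[c] for c in countries if c in REGION_OF})

def share_with_extra_affiliations(articles):
    """Fraction of articles that list more affiliations than authors."""
    flagged = [a for a in articles if a["affiliation_count"] > a["author_count"]]
    return len(flagged) / len(articles)

# Made-up records for illustration only.
articles = [
    {"author_count": 3, "affiliation_count": 5, "countries": ["United Arab Emirates", "Germany"]},
    {"author_count": 2, "affiliation_count": 2, "countries": ["United Arab Emirates"]},
    {"author_count": 4, "affiliation_count": 3, "countries": ["United Arab Emirates", "Egypt", "Brazil"]},
    {"author_count": 1, "affiliation_count": 2, "countries": ["United Arab Emirates", "United States"]},
]

print(share_with_extra_affiliations(articles))
print(regions_for(articles[2]["countries"]))
```

Applied to the figures reported above, 7,724 of 30,400 articles corresponds to the stated share of about 25%.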
For the sake of clarity and disambiguation, the following basic definitions from Piwowar et al. (2018) are used for the classification of OA type: under an open license, often in exchange for payment of an APC.
• Bronze OA: articles provided and made available to read from the publishers' website but without a license, thus limiting their reuse rights to reading.
• Closed: an OA version of the article has not been found, also referred to as non-OA.
The Unpaywall data used in this study contain one OA type recorded per publication, and in cases where there were multiple versions available preference was given to recording the gold OA option. As such, the green OA share can be lower than actual availability because many articles available through that mechanism might also be available as a gold OA type.
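The effect of recording a single OA type per publication can be shown with a minimal sketch. Only the preference for gold is stated above; the relative order of the other types in `PRECEDENCE` is an assumption made here for illustration, as is the function name.

```python
# Assumed precedence: gold first (as stated in the text); the order of the rest is an assumption.
PRECEDENCE = ["gold", "hybrid", "bronze", "green"]

def recorded_oa_type(available_types):
    """Pick the single OA type to record from all types found for one article."""
    for oa_type in PRECEDENCE:
        if oa_type in available_types:
            return oa_type
    return "closed"

# An article that is both self-archived and published in an OA journal is counted as gold only,
# which is why green OA is undercounted relative to actual availability.
print(recorded_oa_type({"green", "gold"}))
print(recorded_oa_type({"green"}))
print(recorded_oa_type(set()))
```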
For statistical analysis, we utilized IBM SPSS 28.0. Dichotomous variables and presence of article attributes (article OA status, journal discipline categories, journal world region, article affiliation world region group) were dummy coded as 0 or 1 to enable analysis. For analysis involving absolute author counts or author affiliation distribution, outliers were excluded to make the analysis more representative of the majority of articles in the population. Articles with author counts outside of one standard deviation of the arithmetic mean of authors (14.76) were excluded in this case, which meant that articles with more than 159 authors were not considered (183 articles in total). Where this exclusion applies is mentioned in the results section; in all other cases, all articles are included in the analysis.
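The exclusion rule can be sketched as follows. This is a minimal illustration rather than the SPSS procedure itself: the use of the sample standard deviation, the function names, and the toy author counts are assumptions. Because the reported mean (14.76) is far smaller than the cutoff (159), the implied standard deviation is roughly 144, so only the upper bound removes any articles in practice.

```python
from statistics import mean, stdev

def within_one_sd(author_counts):
    """Keep author counts lying within one sample standard deviation of the arithmetic mean."""
    m = mean(author_counts)
    sd = stdev(author_counts)  # assumption: sample (n - 1) standard deviation
    return [n for n in author_counts if abs(n - m) <= sd]

def dummy_code(value, category):
    """Dummy code a categorical attribute as 1 (present) or 0 (absent)."""
    return 1 if value == category else 0

# Toy data: five ordinary articles and one hyperauthored article.
counts = [1, 2, 3, 4, 5, 3000]
print(within_one_sd(counts))
print([dummy_code(s, "closed") for s in ["gold", "closed", "green"]])
```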

RESULTS AND DISCUSSION
This part of the study presents and discusses the results of the analysis conducted on the dataset described in the methods and data section.
Figure 2 also shows that 54% (6,684) of all OA articles are provided as gold OA directly through journals. This is followed by green OA with 3,139 (25%). Bronze OA and hybrid OA account for 11% (1,409) and 9% (1,153) of OA articles respectively. This trend corroborates the conclusions of Piwowar et al. (2019) that gold OA spearheads the OA movement. When interpreting these numbers, it is important to reiterate that the Unpaywall data used here only provides one OA type recorded per publication, and in cases where there are multiple versions available preference is given to recording the gold OA option. As such, the green OA share is lower than actual availability because many articles available through that mechanism are also available as a gold OA type. In any case, it can be argued that this will not have much effect on the decreasing trend of green OA, which could be attributed to an increasing number of authors who publish gold OA articles not choosing to self-archive these already open articles.
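As a quick arithmetic check, the reported shares can be recomputed from the article counts given above. The sketch assumes that the four listed types make up all OA articles; the variable names are illustrative.

```python
# Article counts per OA type as reported in the text.
oa_counts = {"gold": 6684, "green": 3139, "bronze": 1409, "hybrid": 1153}

total_oa = sum(oa_counts.values())
shares = {oa_type: round(100 * n / total_oa) for oa_type, n in oa_counts.items()}

print(total_oa)
print(shares)
```

The rounded shares agree with the percentages in the text (54%, 25%, 11%, and 9%).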

What are the disciplinary differences in OA shares?
As Table 2 presents, articles involving UAE-affiliated authors were dominated by Physical Sciences, which accounts for 47% of all articles. This is probably driven by the research and development of the oil and gas industry. However, the highest percentage of OA was achieved by journals in multidisciplinary fields at 90% (e.g., including megajournals such as PLOS ONE and Scientific Reports). Health Sciences and Life Sciences achieved the next highest OA percentages, with 55% and 51% respectively. It also bears remembering in this context that the study only includes journal articles and does not include, for example, conference papers that might follow different dynamics regarding OA shares and have seen different changes over the 11-year observation period.
To more robustly explore whether OA status differed to a statistically significant degree between discipline categories of the publishing journal, we performed a Pearson chi-square association test. The relationship between these variables (article OA status and
As the results in Table 4 demonstrate, authors continue heading north, with the majority of articles published in journals from Europe and North America. Journals published in Europe alone account for about 56% of all articles published by UAE authors. North American journals published another 29% of the articles. This could be attributed to the big publishers being based in these regions (Mongeon & Paul-Hus, 2016), to the authors' pursuit of high impact and prestige, or to the trend of increasing globalization of research communication (Macháček, 2023). MENAT journals account for only 926 (3%) publications, of which 724 (78%) are OA and 202 (22%) are paywalled.
South American journals lead in the OA percentage of articles, with 84% of all articles being OA. This is followed by MENAT (78%), International organization (75%), Australasian (67%), Asian (66%), and African (64%) journals. European and North American journals are both at the bottom of the list with 36% and 37%. "International organization," in this context, represents journals published by an international organization and listed as such by the ISSN International Centre because those organizations do not have a national ISSN center.
We conducted a Pearson chi-square association test to establish whether the distribution of article OA status differs across journal host country categories. The relationship between these variables (article OA status and journal host country) was found to be significant, χ²(6, N = 34,000) = 1,461.186, p < .001. The results of the analysis showed that articles published by journals in
Performing document version and web location analysis for any OA type other than green would not be meaningful, as the copies should in those cases always be available from the publisher's website in their final peer-reviewed and copyedited form. For green OA, however, access can be provided through various document versions and can come from different types of web services around the world. With the data based on how Unpaywall has harvested different article versions, Table 6 shows that the submitted version of the manuscript accounts for almost half of all self-archived articles. Combined with the accepted version, this reaches around two-thirds of self-archived articles. The result that around a third of self-archived copies are the published version is surprising, as journal publishers in general do not allow posting of the published version (Laakso, 2014) unless the article has been published in an OA journal with a Creative Commons license that explicitly permits open distribution.
Studies have reported a limited number of institutional repositories (IRs) in the UAE (Boufarss, 2011; Boufarss & Laakso, 2020), and this study provides further evidence that actual use of the existing repositories is also low, at least as observed through this data set. Although IRs were the most common location of self-archived (green OA) articles, as demonstrated in Table 7, the vast majority of these were at institutions outside the UAE: OA copies located at UAE-based academic IRs amounted to a mere 36 of the 1,077 articles found at such locations. IRs were followed by subject-based repositories, namely arXiv and PMC, in frequency of use for self-archiving articles. These findings are quite surprising in light of the finding of Boufarss and Laakso (2020) that the majority of UAE HEIs mandate or encourage self-archiving in an IR, something which, based on these results, does not happen at least in the UAE-operated IRs.

How Do Different Aspects of Coauthorship Interplay with OA Shares of UAE-Affiliated Research Output?
To start unraveling the relationships between coauthorship, international collaboration, and OA status of articles, we made a summarizing longitudinal analysis of two indicators for articles with at least one UAE-affiliated author over the 11 years covered by the study: the average number of world regions covered by affiliations per article and the share of articles with at least one international affiliation. Table 8 presents the results of this analysis, showing consistent growth for both indicators over the years: the average number of world regions covered by the affiliations in the articles grew from 0.75 to 1.12, and the share of articles with at least one international affiliation grew from 59% to 72%.

How is coauthorship distributed globally?
To get a global summarizing perspective on coauthorship distribution, we grouped the affiliation data into world regions rather than individual countries, with the UAE separated out as the only individual country in order to enable inspection of national-only coauthorships. Figure 3 indicates that about 19% of all coauthored UAE articles were with other UAE authors. However, UAE authors also have a diversified collaboration portfolio with coauthors from all continents: around 80% of coauthored publications involve authors from other countries, surpassing the 70% reported by Moed (2016). With the exception of internal UAE coauthorship, the numbers shown on the map are not exclusive per continent; rather, they capture all instances of at least one coauthor from that continent. The highest instances of collaboration were recorded with Asia (26%), North America (20%), and Europe (19%), respectively. Similar intercontinental collaboration trends have been reported by Kozma and Calero-Medina (2019) among South African authors. This could be attributed to a range of factors, such as neocolonial ties; language, with English being the language of teaching and business in the UAE; and workforce dynamics, with immigrants from Asia being dominant and representing about half of the population (De Bel-Air, 2015). UAE university faculty statistics by nationality reported in Karabchuk, Shomotova, and Chmel (2022) indicate that about 89% of academics were expatriates in the academic year 2016/2017. A similar report by Bayanat.ae (n.d.) shows that only 12% of academics at Zayed University are UAE nationals. According to the same report, faculty hail from 62 different countries: 37% are from Asian countries, 25% are from the United States and Canada, and 22% are from European countries. These findings indicate that the UAE is part of the increasing international copublication trend reported by Glänzel and Schubert (2004).

Does the number of coauthors have a connection to OA availability?
To start with, we divided the articles into three groups based on the number of authors involved in each: one, two, or three or more. As Figure 4 shows, we found a generally higher OA share for articles with three or more authors throughout the years covered. It can also be seen that there has been a constant increase in OA percentage across the different coauthorship levels over the 11 years captured. In addition to the fact that the number of coauthored publications has been significantly higher than that of single-author articles throughout the last 11 years, the output of publications with multiple authors has seen strong growth during the same period across both OA and closed articles. It can also be observed that the OA rate is higher among multiauthor publications in recent years, with, for example, 52% OA for articles with three or more authors against 43% for single-author articles in 2019, 45% against 29% in 2018, and 45% against 34% in 2017.
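The grouping described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names and the `(year, n_authors, is_oa)` record layout are assumptions, and the sample records are hypothetical.

```python
from collections import defaultdict


def author_group(n_authors):
    """Map an author count to the three groups used in the text."""
    if n_authors < 1:
        raise ValueError("an article needs at least one author")
    if n_authors == 1:
        return "one"
    if n_authors == 2:
        return "two"
    return "three or more"


def oa_share_by_group(articles):
    """Return the OA share per (year, author group) as a fraction in [0, 1]."""
    totals = defaultdict(int)
    oa_totals = defaultdict(int)
    for year, n_authors, is_oa in articles:
        key = (year, author_group(n_authors))
        totals[key] += 1
        oa_totals[key] += int(is_oa)
    return {key: oa_totals[key] / totals[key] for key in totals}


# Hypothetical records: (publication year, number of authors, is OA).
sample = [
    (2019, 1, True), (2019, 1, False),
    (2019, 2, False),
    (2019, 5, True), (2019, 12, True), (2019, 3, False), (2019, 4, True),
    (2018, 1, False), (2018, 7, True),
]

group_shares = oa_share_by_group(sample)
for key in sorted(group_shares):
    print(key, f"{group_shares[key]:.0%}")
```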
Digging a bit deeper into this research question, a binomial logistic regression was performed to ascertain the effect of author count on the likelihood of an article being available OA. This analysis included two independent variables, author count (number of authors per article) and year (publication year), and one dependent variable, OA status (yes/no). We included the publication year in the model to account for the general growth in OA that can be seen over the observation years.
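To make the model structure concrete, here is a minimal sketch of a binomial logistic regression with the same two predictors, fitted by plain gradient ascent on the log-likelihood. It is not the procedure used in the study, which would normally rely on a statistics package; the data are synthetic and constructed so that the OA share rises with both author count and year, and all names are assumptions.

```python
import math


def standardize(values):
    """Center and scale a list of numbers to mean 0 and standard deviation 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def fit_logistic(rows, labels, lr=0.5, steps=500):
    """Fit P(OA) = sigmoid(b0 + b1*x1 + ...) by batch gradient ascent.

    Returns the coefficients with the intercept first.
    """
    n, k = len(rows), len(rows[0])
    weights = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
            err = y - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights


# Synthetic data: 10 articles per (author count, year) cell, with the number
# of OA articles in a cell rising with both predictors.
author_counts, years, labels = [], [], []
for a_idx, n_authors in enumerate([1, 2, 3, 5, 8]):
    for y_idx, year in enumerate([2009, 2012, 2015, 2019]):
        n_oa = 2 + a_idx + y_idx
        for i in range(10):
            author_counts.append(n_authors)
            years.append(year)
            labels.append(1 if i < n_oa else 0)

rows = list(zip(standardize(author_counts), standardize(years)))
intercept, b_authors, b_year = fit_logistic(rows, labels)

# Odds ratios per one standard deviation increase in each predictor.
print(math.exp(b_authors), math.exp(b_year))
```

Because the predictors are standardized, each coefficient describes the change in log-odds of OA per one standard deviation increase, so an odds ratio above 1 corresponds to the positive associations reported in the text.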
As described in Section 3, outliers were removed to improve analysis that involves absolute author counts. Author counts more than one standard deviation from the arithmetic mean number of authors (14.76) were excluded in this case, which meant that articles with more than 159 authors were not considered (183 articles in total). The logistic regression model was statistically significant, χ²(2) = 1,431.995, p < .001. The two predictor variables, number of authors and publication year, were both statistically significant. An increase in the number of authors as well as a later publication year were associated with an increased likelihood of an article being available OA. The finding that more authors per paper is associated with a higher likelihood of being OA is in line with the results of Morillo (2020) and Eysenbach (2006) for the disciplines they researched. The output of the analysis is presented in Table 9. However, the degree to which the included variables could explain the variation in the OA status of articles was relatively low. The model explained 6.3% (Nagelkerke R²) of the variance in OA status and correctly classified 62.4% of cases. Sensitivity was 17.4% and specificity was 92.9%. Negative predictive value was 62.4% and positive predictive value 37.5%. As a follow-up, we performed a Receiver Operating Characteristic (ROC) analysis of the discriminatory effects of the variables, with "Number of Authors" having an area under the curve of .593 and "Publication Year" having .566, which according to Hosmer, Lemeshow, and Sturdivant (2013) in general suggests poor discrimination that is not much better than a random classification.
So, although the test and its variables were significant, there are a lot of other factors also in play that should be explored in future studies.
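For readers less familiar with the classification measures quoted above, the following minimal sketch shows how they are derived from a confusion matrix and how an area under the ROC curve can be computed for a single predictor. The confusion matrix counts and author counts are hypothetical and do not reproduce the study's data; the function names are assumptions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive standard measures from confusion matrix counts.

    "Positive" here means an article classified as OA.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of OA articles found
        "specificity": tn / (tn + fp),   # share of closed articles found
        "ppv": tp / (tp + fp),           # predicted OA that is truly OA
        "npv": tn / (tn + fn),           # predicted closed that is truly closed
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive case scores higher than
    a random negative case, counting ties as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical confusion matrix for an OA classifier.
metrics = classification_metrics(tp=20, fp=10, tn=190, fn=80)

# Hypothetical author counts for OA and closed articles.
auc_authors = auc(pos_scores=[3, 5, 8], neg_scores=[1, 3, 4])

print(metrics)
print(round(auc_authors, 3))
```

An area of 0.5 corresponds to random classification, which is why values such as .593 and .566 indicate only weak discrimination.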
Based on this finding, we argue that one explanatory factor is the increased likelihood of at least one author being covered by an OA mandate that either covers APC expenses or ensures self-archiving of published research on behalf of all authors of the article. As these mandates and funding possibilities have become more common over time, we think that this also explains why more recent articles are more likely to be OA.
Table 10 shows a comparison between OA rate and intercontinental collaboration. It shows that Europe is a key player in the top 10 collaboration combinations with the highest OA rate. For this analysis we included two categories for articles that contain no international affiliations (one for single-authored articles with a UAE affiliation, and one for articles with multiple authors where there are only UAE affiliations) as a point of comparison to all the other categories, which contain different combinations of international affiliations. The OA percentage among articles with only UAE affiliations was 32% for single-authored and 38% for multiauthor articles, thus not falling far behind the average of 41% for all articles over the period of the study. The results seem to indicate that higher intercontinental collaboration is related to a higher OA rate.
To further explore the relationship between different coauthor affiliation world regions and the OA status of articles, we opted for a nonparametric Pearson chi-square test of association, here also using the modified data set that excluded the 183 articles with over 159 authors per article (N = 30,217). Because we are dealing with two dichotomous variables (OA status and presence of a specific author affiliation continent), and the same article can include several of the affiliation variables at once, a nonparametric test was chosen as the most suitable way to explore this dimension of the data.
The Pearson chi-square test of association found a statistically significant relationship between all affiliation categories and OA status, except for articles with an affiliation to Africa, where the results were not statistically significant. For articles with only national affiliations (only UAE affiliations), the share of articles with OA status was lower than expected. For articles that included affiliations to Europe, South America, Asia, Australasia, and North America (listed in descending order of effect size between the variables), the actual count of OA articles exceeded the expected distribution. Because Cramér's V as an indicator of relative effect size ranges between 0 and 1, much like traditional correlation analysis, we can deduce that while the results are statistically significant, the actual relative effect size is low, ranging between .037 and .079. The output of the analysis is provided in Table 11. These results support the notion that internationally coauthored articles in the data set are available OA to a higher degree, where the strongest effect was for articles that included a coauthor with an affiliation address in Europe.
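The combination of a significant test result and a small effect size can be illustrated with a minimal sketch of a Pearson chi-square test and Cramér's V for a 2×2 table (affiliation present or absent against OA or closed). The counts below are hypothetical and were chosen only to resemble the scale of the data set; the function name is an assumption. The p-value shortcut used here is valid for one degree of freedom only, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test and Cramér's V for a 2x2 contingency table.

    `table` is ((a, b), (c, d)) with rows = affiliation present/absent and
    columns = OA/closed. Returns (chi2, p_value, cramers_v).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, chi2 is the square of a standard normal
    # variable, so the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # For a 2x2 table min(rows - 1, cols - 1) = 1, so V = sqrt(chi2 / n).
    cramers_v = math.sqrt(chi2 / n)
    return chi2, p_value, cramers_v


# Hypothetical counts: 10,000 articles with a given affiliation region
# (5,200 OA) and 20,000 without it (9,800 OA).
chi2, p_value, cramers_v = chi_square_2x2(((5200, 4800), (9800, 10200)))
print(chi2, p_value, cramers_v)
```

With these hypothetical counts the test is significant at p < .001, yet Cramér's V is below .03, which mirrors the pattern in the text: with roughly 30,000 articles, even a weak association reaches statistical significance.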

CONCLUSIONS
For scientometric research, this study contributes to integrative method development for supporting research on diverse data dimensions of bibliometric data sets on a national and longitudinal scale. Drawing together central methodological elements from OA research, coauthorship research, and research on national-level output, this study also provides novel research results related to how the national and international intertwine when it comes to the journal article publishing space. For this data set, we could establish that having more authors is related to a higher probability of an article being available OA, and that more recent articles are also more likely to be available OA. The findings also show support for broad, multicontinent research being available OA to a higher degree than research only involving national authors. Though the explanatory power of the statistical model for identifying the most influential coauthor continent for relationship to an article being OA was weak overall, the highest effect
With regard to the national perspective and what the study contributes towards better understanding of the development of research in the UAE specifically, this study shows that UAE-affiliated journal research output saw strong increases in volume, international collaboration, and OA during the 11 years captured as part of this study. This has happened at the same time as the country took steps to establish a stronger science policy that emphasizes these aspects as central elements. How much of this change can be attributed to the impact of national science policy and how much to the global trends of growth, collaboration, and OA is hard to pin down and would require different data and methods to establish. However, distinguishable upsurges in the number of documents can be seen around the release times of the UAE Vision 2021 in 2010, the UAE Innovation Strategy in 2014, and the National Advanced Sciences Agenda 2031 in 2018.
Also worthy of mention in this context are the UAE federal government open data guidelines and the transformative "read and publish" agreements with major publishers, such as Cambridge and the American Chemical Society, signed by the two major research-leading public universities, Khalifa University and United Arab Emirates University. It is still relatively rare to see these agreements outside European institutions and library consortia, where they have become quite common, and this is a substantial step by the UAE toward facilitating immediate OA publication of research outputs.
As is expected from a country whose economy is primarily dependent on oil, our findings show that the highest number of articles were in the Physical Sciences. However, this subject area achieved the second lowest OA rate, 33%, after Social Sciences and Humanities. Apart from the articles in multidisciplinary journals, which recorded a notably high OA rate of 90%, Health Sciences and Life Sciences achieved shares of 55% and 51%, respectively. In terms of green OA publications, IRs and subject-based repositories are the main host locations of green OA articles, despite the limited number of repositories in the UAE. This would indicate a low level of use for such repositories in the UAE for self-archiving of journal article manuscripts; however, such repositories might be populated with other types of content.
We found that the UAE aligns with the global trend of coauthored articles being on the rise and that the share of OA among coauthored publications is higher. This suggests that either awareness of OA increases as the number of authors increases or the cost of publishing OA is shared, such that research projects with larger teams have access to more funds to pay APCs or are required to do so by funders, especially those with Plan S-aligned OA policies. We also found that the rate of OA is connected to the extent of intercontinental collaboration, with European coauthors especially being part of the top 10 collaboration combinations with the highest OA rate, even though the most frequent collaborations were with Asia and North America. This higher OA rate associated with European coauthorship can likely be attributed to the widespread adoption of Plan S and Horizon Europe principles in Europe. Further investigations need to be carried out on the factors contributing to the connection between collaboration and OA rate.
The study also included an element where the continent of the journal publisher was included as a variable, with results showing that North American and European journals have recruited the majority of articles published by UAE-affiliated researchers during the observation period. However, South American journals have published the highest percentage of OA articles. What bears remembering is that these results in particular are likely influenced by the Western-skewness of the Scopus index in terms of journal inclusion (Khanna et al., 2022; Rodrigues & Abadal, 2014; Tennant, 2020).
OA overall has changed a lot since 2009, and this is one thing that we consider this study also captured quite well from our own perspective of looking at the world through the window of the UAE. However, it is not without limitations. Through this study we observed the complexity of dealing with a rich bibliometric data set augmented with both OA status information and authorship world region categories. One can only inspect so many variables at a time and everything cannot be included in one study. Future studies could zoom in even further: for example, only on the development of specific OA types with similar national data sets, and at the same time identifying particular research funders from article-level metadata, thus being able to also include financial considerations of various models and science policy strategies into the mix. Because of the widespread acceptance of Scopus indexing as a measure of acceptance of research among UAE HEIs, as well as the strict requirements for detailed author affiliation metadata, this study used a data set extracted from Scopus. However, it would be beneficial to do a similar study on a larger scale with articles in other indexes, local journals, and other languages. Further studies could also be expanded to compare the situation in the UAE with other countries, as well as identifying who has funded OA for coauthored publications.