Abstract
Authorship is associated with scientific capital and prestige, and corresponding authorship is used in evaluation as a proxy for scientific status. However, there are no empirical analyses on the validity of the corresponding authorship metadata in bibliometric databases. This paper looks at differences in the corresponding authorship metadata in Web of Science (WoS) and Scopus to investigate how the relationship between author position and corresponding authors varies by discipline and country and analyzes changes in the position of corresponding authors over time. We find that both WoS and Scopus have accuracy issues when it comes to assigning corresponding authorship. Although the number of documents with a reprint author has increased over time in both databases, WoS indexed more of those papers than Scopus, and there are significant differences between the two databases in terms of who the corresponding author is. Although metadata is not complete in WoS, corresponding authors are normally first authors with a declining trend over time, favoring middle and last authors, especially in the Medical, Natural Sciences, and Engineering fields. These results reinforce the importance of considering how databases operationalize and index concepts such as corresponding authors, this being particularly important when they are used in research assessment.
1. INTRODUCTION
Authorship plays an important role in career progression, from an undergraduate student to professorship. Author order is usually used in the assessment of researchers’ scientific contributions (Bhandari, Guyatt et al., 2014; Hess, Brückner et al., 2015; Perneger, Poncet et al., 2017). There are, however, disciplinary differences in how author contributions are represented in the byline of scientific papers (Pontille, 2004). Although some disciplines order authors by decreasing order of contribution (Bu, Wang et al., 2020; Grando & Bernhard, 2003), most lab-based disciplines exhibit an inverted U shape, with first authors and last authors having performed the most significant contributions (Larivière, Desrochers et al., 2016; Larivière, Pontille, & Sugimoto, 2021). This inverted U shape is the most generalized distribution of credit assigned to authorship (Bhandari, Busse et al., 2004; Costas & Bordons, 2011). There are exceptions to those dominant trends—such as economics, mathematics and business, management and accounting—where researchers show a strong trend to sign in alphabetical order (Fernandes & Cortez, 2020; Wohlrabe & Bornmann, 2022).
Corresponding authorship is another role that is gaining relevance in many countries as an alternative or complement to assigning credit based on author order (Chinchilla-Rodríguez, Sugimoto, & Larivière, 2019; Moya-Anegón, Guerrero-Bote et al., 2013; Zhou & Leydesdorff, 2006). Corresponding authors take the lead in manuscript submission for the publication process, having primary responsibility for communication with the journal during the manuscript submission, peer review, and publication process, and typically ensure that all the journal’s administrative requirements are properly completed. Accordingly, the corresponding author should be available to respond to editorial queries in a timely way and be available after publication to respond to critiques of the work and cooperate with any requests from the journal for data or additional information should questions about the paper arise after publication (International Committee of Medical Journal Editors (ICMJE), 2017).
It is generally assumed that corresponding authors are senior researchers or group leaders with experience in the submission and publishing process of scientific research. They not only contribute to the paper significantly but also ensure that it goes through the publication process in a smooth and successful manner (see footnote 1). However, there is no clear consensus on the role that the corresponding author plays in terms of leadership (Willems & Plume, 2021), even though corresponding authorship is increasingly used and perceived in evaluation as a proxy for leadership (González-Alcaide & Gorraiz, 2018; Mattsson, Sundberg, & Laget, 2011; Wren, Kozak et al., 2007). Furthermore, little is known about the quality of the metadata used in scientific databases to analyze this role. Bibliometric databases include a field, often named reprint address, with which the corresponding author is identified.
The goal of this paper is twofold. First, we examine the validity of such a field as assigned by two different bibliometric databases. We focus our study on two of the major bibliometric databases, namely, Web of Science (WoS) and Scopus, as these tend to play an important role in research evaluation practices around the world. We make the comparison by working with an overlapping data set of records common to both databases. Second, we critically investigate the author position of corresponding authors according to discipline and country in WoS, paying special attention to trends over time. We then discuss the implications of our findings, both from a technical point of view and in relation to the use of this field in evaluation exercises, such as hiring, recruitment and promotion.
2. RELATED WORK
Evidence on what a corresponding author is and who in a research team should carry out the role is contradictory. For example, Weiss (2012) explicitly states that it is not appropriate for students and postdocs to perform the role, as they lack permanent positions and hence will not be able to respond effectively to information requests. Indeed, Teunis, Nota, and Schwab (2015) emailed corresponding authors from MEDLINE under the guise of a data request, showing that slightly more than half of researchers responded to the request. The higher proportion of undeliverable messages among basic/translational researchers is most likely explained by authors leaving an institution or changing their email address.
Examining fields covering the journals subscribing to the ICMJE’s guidelines in European countries, Mattsson et al. (2011) stated that in the Science Citation Index (SCI), the corresponding author is labeled as reprint author. Less than 60% of publications had a reprint author tag before 1998, but from 1998 onwards an average of 98% include the reprint address. They also found that the first author was more likely to be the corresponding author in small teams, but for larger teams it would be either the first or the last author, and they observed differences based on the type of collaboration. Corresponding authors tend to be last authors in internationally coauthored papers, but first authors tend to be corresponding authors in domestic publications.
At the international level, corresponding authorship has been taken as a proxy for leadership. Although research groups are organized around different structures when they collaborate with other external colleagues, they delegate the responsibility and authority to a researcher who acts as the main contributor and, by extension, to their affiliated country and institution. For example, corresponding address has been used to study leadership at the national level (Chinchilla-Rodríguez et al., 2019; Zhou & Leydesdorff, 2006). Greater presence as a first or corresponding author confers greater leadership; in contrast, absence from these roles could be associated with subordination or a secondary role (Chinchilla-Rodríguez, Miguel et al., 2018; Chinchilla-Rodríguez, Ocaña-Rosa, & Vargas-Quesada, 2016; González-Alcaide & Gorraiz, 2018).
The solution to the problem of how to count publications and credit authorship is neither clear nor generally accepted (Bornmann & Osorio, 2019; Frandsen & Nicolaisen, 2010; Gauffriau, 2017; Gauffriau & Larsen, 2005; Gauffriau, Larsen et al., 2007, 2008; Waltman, 2016), especially as disciplines have different publication practices and treat authorship differently. In principle, collaborative papers could be considered as an achievement for all authors involved, and thus full credit should be given to all of them (full counting). But the existence of disciplinary differences on collaboration suggests correcting for these differences to avoid inflation of authorship. One way of doing this is by using fractional counting, but this has a negative effect on the internationalization of performance in collaborations (Leydesdorff, 1988). Some studies have concluded that there are no significant differences between full and fractional counting (Liu, Yu et al., 2018). Furthermore, when differences are observed, the difficulties in interpreting the findings correctly increase (Park, Yoon, & Leydesdorff, 2016; Perianes-Rodriguez, Waltman, & van Eck, 2016). A recent proposal to find a balance between fractional and full counting is to calculate the square root of the fractional contribution of each author (Sivertsen, Zhang et al., 2022).
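To make the counting schemes discussed above concrete, the following minimal sketch computes the credit each author of a paper would receive under full counting, fractional counting, and the square-root variant. The function name `author_credit` and the scheme labels are illustrative assumptions, not terminology taken from the cited studies.

```python
import math


def author_credit(n_authors: int, scheme: str = "full") -> float:
    """Credit assigned to each author of a paper with n_authors authors.

    Scheme labels are illustrative:
      "full"       - every author receives one full credit
      "fractional" - one credit is split equally among the authors
      "sqrt"       - square root of the fractional share
    """
    if n_authors < 1:
        raise ValueError("a paper needs at least one author")
    if scheme == "full":
        return 1.0
    if scheme == "fractional":
        return 1.0 / n_authors
    if scheme == "sqrt":
        return math.sqrt(1.0 / n_authors)
    raise ValueError(f"unknown scheme: {scheme!r}")


if __name__ == "__main__":
    # Compare the three schemes for papers of increasing team size.
    for n in (1, 4, 16):
        credits = [author_credit(n, s) for s in ("full", "fractional", "sqrt")]
        print(n, [round(c, 3) for c in credits])
```

For a four-author paper, each author receives 1 under full counting, 0.25 under fractional counting, and 0.5 under the square-root variant, which illustrates how the latter sits between the two extremes.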
Huang, Lin, and Chen (2011) examined the difference between full counting, considering only first, only corresponding author, and fractional counting. They reported that less than 3% of the publications in their data set lacked metadata on corresponding author (in WoS) for the 1989–2008 period in physics. They concluded that there were large differences in the use of corresponding author by country. Moya-Anegón et al. (2013) used the corresponding author to give full credit to the country to which the corresponding author was affiliated. They found a strong relationship between first and corresponding author (in Scopus). This approach was also used to examine the relationship between guarantorship and international collaboration and their effect upon citation impact (Moya-Anegón, Guerrero-Bote et al., 2018).
The value of the corresponding author at the individual level, however, still seems to be disputed. In the late nineties, Laurance (2006) expressed his concerns about the necessity of a set of coherent authorship rules after being informed by his peers that the British Research Assessment Exercise gave greater credit to the last author than the other authors except for the corresponding author. Indeed, the criteria followed by some national funding agencies (Ancaiani, Anfossi et al., 2015; Buckle & Creedy, 2022) which evaluate and recognize merits for the promotion and tenure process tend to push for publication counts. This means that the structure of collaboration is not fairly rewarded. Furthermore, they tend to prioritize academic leadership in the byline of publications, leaving aside other roles (Robinson-Garcia, Costas et al., 2020). Hence, evaluators perceive corresponding authors as playing a bigger role than other authors (e.g., middle authors) (Wren et al., 2007) and the prestige of the last author tends to increase when also designated as corresponding (Bhandari et al., 2014).
Assuming that the designation as corresponding author is meaningful, some articles have examined the change in individual publication practices, such as an increase in the number of papers with more than one corresponding author (Liu et al., 2018). Other studies have sampled only corresponding authors to ask about the roots of their creative ideas, assuming that these authors were involved in the design of the work (Tahamtan & Bornmann, 2018), or to analyze statements on research contribution in order to study the degree of adherence to ICMJE authorship criteria in one biomedical journal (Marušić, Božikov et al., 2004).
Motivated by the scandal of fake reviews submitted with fake emails from noninstitutional accounts and the resulting retractions in one Springer journal (Stigbrand, 2017), Shen, Rousseau, and Wang (2018) examined whether the differences in institutional and noninstitutional email address influenced citation patterns. The email of the corresponding author from WoS (reprint address) or, in its absence, the email address of the first author in the list of authors was taken as the unit of analysis. They found that papers with an institutional email address receive more citations than others, agreeing with publishers who require authors to provide their institutional email. Wang and Wang (2017) looked at how collaborations between China and the European Union are established, examining whether corresponding authors are Chinese local, Chinese abroad, or non-Chinese. It seems that academic collaborations between China and the EU28 have been mainly set up by Chinese researchers, although the prevalence of Chinese corresponding authors may be the result of the incentive structure in China (Franzoni, Scellato, & Stephan, 2011; Fuyuno & Cyranoski, 2006; Quan, Chen, & Shu, 2017).
Author order has also been used to understand how different roles in academia are affected by gender (e.g., Ghiasi, Harsh, & Schiffauerova, 2018), observing an underrepresentation of women as authors in academic publications, and in more prestigious authorship positions (West, Jacquet et al., 2013). For instance, Boekhout, van der Weijden, and Waltman (2021) found that in biomedical disciplines, men are about 25% more likely than women to be last authors, suggesting that men tend to have more senior roles than women. Garg and Kumar (2014) looked at corresponding vs. other author roles by gender, showing that women tend to work in small teams, and they represent about a quarter of corresponding authors in some fields. Macaluso, Larivière et al. (2016) reported that the relationship between team size and proportional contribution to various tasks differs considering the gender of the corresponding author. Women appearing as first or corresponding authors are more likely to be associated with all tasks except contributing materials. In the case of male corresponding or first authors, these were more likely to be associated with all tasks except experimentation. Studies focused on gender and geographic location of first, last, and corresponding authorship (Fox, Ritchey, & Paine, 2018) found that female first authors were less likely to serve as corresponding authors in their papers. This difference increased with the degree of gender inequality in the author's home country. First authors from non-English-speaking countries were less likely to serve as corresponding authors, especially if the last author was from an English-speaking country.
Recently, there has been an increasing trend for including more than one corresponding author. Between 1999 and 2008 the percentage of papers with more than one corresponding author has steadily been on the rise. However, neither WoS nor Scopus provided this information. Hu (2009) argued that the fact that major databases do not mention “equal first authorship” has severe implications and that as more and more journals require disclosure of the exact contribution of each author, it should be considered in scientometric investigations. Since then, there have been studies analyzing this phenomenon in specific disciplines, such as biomedicine (Akhabue & Lautenbach, 2010; Hu, Rousseau, & Chen, 2010) or pharmacy and anesthesia (Huang, Neylon et al., 2020).
3. DATA AND METHODS
A total of about 33 million documents from WoS Core Collection and 43 million documents from Scopus were retrieved from the in-house version of the WoS and Scopus maintained at CWTS (Leiden University) for all document types, which is a common practice in these types of studies (see, for example, Huang et al., 2020; Martín-Martín, Orduña-Malea et al., 2018; Visser, van Eck, & Waltman, 2021). We used Digital Object Identifiers (DOIs) to match more than 23 million publications published between 1998 and 2017 (n = 23,426,742) from both databases. The matched data set represents 62% of all WoS publications and 54% of all Scopus publications (Figure 1).
Both databases are expanding the inclusion of DOIs over time, with the matched set representing 70% of WoS and 60% of Scopus in 2017 (Figure 1). The lower proportion of documents with DOIs in Scopus might be explained by differences in coverage. Scopus includes a wider representation of countries and languages than WoS (Moya-Anegón, Chinchilla-Rodríguez et al., 2007). DOI registration requires investment and infrastructure that may be lacking for some countries or institutions: According to the Scopus Content Coverage Guide, 60% of more than 5,000 international journals do not belong to the largest publishing groups, such as Elsevier and Springer (see footnote 2). This suggests that there are journals published by universities, associations, etc. that do not assign DOIs to their records. For a comprehensive database comparison, see Martín-Martín, Thelwall et al. (2021), Visser et al. (2021), and Gusenbauer (2022).
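The DOI-based matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run on the CWTS databases: the record layout (dicts with a `doi` key), the normalization rules, and the function names `normalize_doi` and `match_by_doi` are all hypothetical.

```python
def normalize_doi(doi):
    """Return a lowercased DOI without resolver prefix, or None if empty."""
    if not doi:
        return None
    doi = doi.strip().lower()
    # Strip common prefixes so that the bare DOI name is compared.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def match_by_doi(wos_records, scopus_records):
    """Match two lists of record dicts on normalized DOI.

    Returns a dict mapping DOI -> (wos_record, scopus_record).
    Records without a DOI are left unmatched.
    """
    wos_by_doi = {}
    for rec in wos_records:
        doi = normalize_doi(rec.get("doi"))
        if doi is not None:
            wos_by_doi.setdefault(doi, rec)  # keep the first record per DOI
    matched = {}
    for rec in scopus_records:
        doi = normalize_doi(rec.get("doi"))
        if doi in wos_by_doi and doi not in matched:
            matched[doi] = (wos_by_doi[doi], rec)
    return matched


if __name__ == "__main__":
    # Toy records, invented for illustration only.
    wos = [{"id": "W1", "doi": "10.1000/ABC"}, {"id": "W2", "doi": None}]
    scopus = [{"id": "S1", "doi": "https://doi.org/10.1000/abc"}]
    matched = match_by_doi(wos, scopus)
    print(len(matched), "matched;", 100 * len(matched) / len(wos), "% of WoS")
```

Dividing the number of matched records by the size of each source gives coverage shares of the kind reported for the matched data set.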
Bibliometric databases do not include metadata for corresponding author explicitly (Huang, Hsieh, & Lin, 2016). Rather, the reprint address is the indication of the author to whom correspondence should be addressed (see footnote 3). Therefore, we operationalize corresponding author as reprint author and will use these terms interchangeably. We calculate the number of authors for each published paper and consider author positions in the byline of all coauthored publications, namely, first, middle, last and corresponding author.
Each publication was assigned to one of four broad categories and 14 disciplines, which are used for the disciplinary breakdown of the numbers presented in this paper: medicine (MED), including biomedical research, clinical medicine, and health; natural sciences and engineering (NSE), comprising biology, chemistry, earth and space, engineering and technology, mathematics, and physics; social sciences (SS), comprising professional fields, psychology, and social sciences; and arts and humanities (AH).
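As an illustration of how byline positions can be derived, the sketch below classifies each author as first, middle, or last and reports the positions held by the reprint authors of a paper. Two choices are assumptions made here rather than rules stated in the text: the second author of a two-author paper is treated as last, and authors are matched by exact name string. The function names are hypothetical.

```python
def author_position(index: int, n_authors: int) -> str:
    """Classify a 1-based byline index as 'single', 'first', 'middle' or 'last'."""
    if n_authors < 1 or not 1 <= index <= n_authors:
        raise ValueError("index must lie between 1 and n_authors")
    if n_authors == 1:
        return "single"
    if index == 1:
        return "first"
    if index == n_authors:
        return "last"
    return "middle"


def corresponding_positions(byline, reprint_authors):
    """Positions held by the reprint (corresponding) authors of one paper.

    byline: author names in byline order
    reprint_authors: names flagged as reprint authors (may be empty)
    """
    n = len(byline)
    return [
        author_position(i, n)
        for i, name in enumerate(byline, start=1)
        if name in reprint_authors
    ]


if __name__ == "__main__":
    # Invented byline for illustration.
    print(corresponding_positions(["Author A", "Author B", "Author C"], {"Author C"}))
```

An empty result corresponds to a paper with no reprint author metadata, and a result with several entries corresponds to a paper with more than one corresponding author.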
4. RESULTS
Next, we report our main findings. This section is structured as follows. First, we compare corresponding authorship metadata as shown in WoS and Scopus. For this, we focus on a global analysis of the levels of disagreement between the information reported in each database. To validate our findings, we randomly select three different samples and manually inspect them. In the second section of the results, we focus on the information reported by WoS and we investigate differences in corresponding authorship by scientific field and geographic regions. We conclude the reporting of our findings by showcasing some specific countries.
4.1. Comparison of the Corresponding Author in Web of Science and Scopus
The number of documents with corresponding authors has increased steadily over time (Figure 2A). In the entire matched data set, on average about 97% of WoS documents contain at least one reprint author, whereas only 85% of Scopus documents have these metadata. The lower Scopus figure stems from fluctuations in the data before 2004 and after 2012, which seem to derive from indexing errors in specific physics journals. To understand the reasons behind these fluctuations, we manually inspected the sources of records with no corresponding author in Scopus but with at least one corresponding author in WoS. We observe that the corresponding author field of over 80% of records from journals such as Physical Review Letters, Physical Review D, and Physical Review B, among others, was not indexed in the early period. Between 2002 and 2012 the top journals for which the corresponding author field was not indexed changed, and the share of nonindexed records falls to around 50% for these journals. Also, these journals publish fewer papers per year (between 300 and 4,000 papers). From 2013 to 2017, the share of papers for which this field is not indexed increases again in journals such as The Astrophysical Journal (94% of its records do not include a corresponding author in 2016) and Proceedings of SPIE – The International Society for Optical Engineering (this journal produces around 14,000 papers a year; in 2012, 10% of its papers did not include a corresponding author, a share that increased to over 60% within the 2013–2017 period).
For those with reprint authors (Figure 2B), WoS starts indexing a significant number of documents with more than one reprint author from 2014 onwards—reaching 10% of our sample by 2016—and increasing at a more rapid pace than the inclusion of multiple reprint authors in Scopus.
Table 1 shows the position of the corresponding author in the author byline (first, middle, or last) for all publications matched with DOI in both databases and by number of coauthors (single vs. coauthored) related to the coverage of the corresponding author. For all publications, the percentage of documents with the same corresponding author in both databases is close to 86%. There are significant differences in documents where only one database identifies a corresponding author, and WoS registers corresponding authors in 12% of documents that Scopus does not; only 1% of documents have no corresponding author in both databases.
Publications | Same CA (%) | CA in WoS but not in Scopus (%) | CA in Scopus but not in WoS (%) | No CA in either database (%) |
---|---|---|---|---|
All | 85.70 | 12.19 | 1.05 | 1.06 |
Single authored | 79.55 | 10.51 | 5.38 | 4.56 |
Coauthored | 86.51 | 12.41 | 0.48 | 0.60 |
First | 47.60 | 9.79 | 1.52 | 41.10 |
Middle | 13.42 | 1.97 | 0.82 | 83.79 |
Last | 25.51 | 3.76 | 1.08 | 69.65 |
For publications with a single author (11.8% of all matched documents), nearly 80% have the same corresponding author in both databases, whereas significant differences remain in documents where only one database identifies the corresponding author. WoS always has a higher percentage of documents with corresponding authors than Scopus (10.5% and 5.4% respectively). Around 4.5% of documents do not register a corresponding author in either database.
For publications with more than one author and where the first, middle, or last author appears as corresponding, 48%, 13%, and 25% respectively of documents have the same corresponding author. WoS assigns a corresponding author to a larger number of unique documents than Scopus (second and third columns) and only 0.6% of documents have no corresponding author in either database.
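The breakdown reported in Table 1 can be sketched as a per-record comparison of the corresponding authors recorded by each database. The sketch below is an assumption-laden illustration: the category labels and function names are hypothetical, and it keeps a separate `different` category for records where both databases record a corresponding author but disagree on who it is, a case that Table 1 does not report as its own column.

```python
from collections import Counter


def agreement_category(wos_ca, scopus_ca):
    """Compare the corresponding-author sets of one matched record.

    wos_ca, scopus_ca: sets of author identifiers (empty if none recorded).
    Category labels are illustrative.
    """
    if not wos_ca and not scopus_ca:
        return "none"
    if wos_ca and not scopus_ca:
        return "wos_only"
    if scopus_ca and not wos_ca:
        return "scopus_only"
    return "same" if wos_ca == scopus_ca else "different"


def agreement_shares(pairs):
    """Percentage of records per category for an iterable of (wos_ca, scopus_ca)."""
    counts = Counter(agreement_category(w, s) for w, s in pairs)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


if __name__ == "__main__":
    # Invented matched records for illustration.
    pairs = [({"a1"}, {"a1"}), ({"a1"}, set()), (set(), {"a2"}), (set(), set())]
    print(agreement_shares(pairs))
```

Restricting the input to single-authored or coauthored records, or to records whose corresponding author holds a given byline position, would give the remaining rows of such a table.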
4.1.1. Validation
To verify how discrepancies between databases matched reality—that is, how was corresponding author originally assigned by the journals—we manually inspected three random samples. The first sample (Set 1) consisted of a random selection of 100 coauthored papers for which both databases reported the same corresponding author. The second sample (Set 2) included 100 publications for which Scopus reported a corresponding author but WoS did not. The third sample (Set 3) also included 100 papers for which WoS reported a corresponding author but Scopus did not. For each of these, the full text was manually examined to determine the validity of the identification of corresponding authorship. We looked into three items: whether a corresponding author was explicitly labeled; whether contact information was provided; and the author position of the corresponding author.
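Drawing the three validation samples can be sketched as a simple random draw of 100 records from each set. The text does not specify how the random selection was implemented, so the seed, the function name `draw_validation_samples`, and the use of record identifiers are assumptions made for illustration.

```python
import random


def draw_validation_samples(sets, size=100, seed=0):
    """Draw a simple random sample of `size` records from each named set.

    sets: dict mapping a set label to a list of record identifiers.
    A fixed seed makes the draw reproducible; the seed value is arbitrary.
    """
    rng = random.Random(seed)
    samples = {}
    for label, records in sets.items():
        if len(records) < size:
            raise ValueError(f"{label} has fewer than {size} records")
        # Sampling without replacement, so no record is inspected twice.
        samples[label] = rng.sample(records, size)
    return samples


if __name__ == "__main__":
    # Placeholder identifiers standing in for matched records.
    candidate_sets = {
        "Set 1": list(range(0, 5000)),
        "Set 2": list(range(5000, 6000)),
        "Set 3": list(range(6000, 9000)),
    }
    samples = draw_validation_samples(candidate_sets)
    print({label: len(ids) for label, ids in samples.items()})
```

Fixing the seed is a design choice that lets the same records be re-inspected later; it does not affect the randomness of the selection itself.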
Set 1.
In the cases where Scopus and WoS both agreed on corresponding first authors, this was indicated by either a single email address or a corresponding author indicator (100% in WoS and 97% in Scopus). For the three documents with no explicit indication of corresponding author, we found different document types (article, conference, and editorial material). However, Scopus had a higher proportion of documents with an email address than WoS (88% vs. 77%).
Both databases showed the same data in all cases, but there were four papers in which the corresponding author occupied a middle position. In that case, an email address was provided. However, in three cases, more than one email address was provided for both the last and middle authors, and in one case for all authors in WoS. In Scopus, for those publications with more than one email address (9% of our sample), first or last positions were clearly indicated, but WoS defaulted to the first listed email address. Therefore, middle corresponding authorships may be undercounted.
Similarly, in corresponding last authorships, WoS specifically indicated a corresponding author but Scopus missed this data in all but six of the 100 sampled records. In only one of those missing cases was an email address not provided. For those publications with more than one corresponding author, in all cases Scopus provided more than one email for each corresponding author, and WoS did so only for last authors. This reinforces the idea that WoS may be undercounting corresponding authorships.
Set 2.
Twenty-one of the papers sampled in Set 2 were not research articles—such as book reviews, corrections, editorials, and errata—with either no explicit author or a single author (7%). No email address was provided or corresponding author indicated. This suggests that Scopus might be more liberal in assigning a corresponding author to this front material. Only 27% of the other records did not explicitly state an email address for the corresponding author. In one case, there was an email address provided and for another a mailing address. In one instance, there was a collaborative author (STAR Collaboration). The remaining records had an author explicitly labeled as corresponding: 43% as first, 20% as middle, and 37% as last author. More than half of those with an explicitly labeled corresponding author did not have an email address provided on the manuscript. This may suggest that WoS is more likely to avoid assigning a corresponding author without an email address.
Set 3.
Twenty-four percent of the documents sampled in Set 3 were collaborative authorships from high-energy physics (e.g., The CMS Collaboration, ATLAS Collaboration). This comprises almost a quarter of the records where WoS identified a corresponding author but Scopus did not, suggesting that Scopus’ practices tend to ignore corresponding authorships in large-scale collaborations. Among these cases, two email addresses were for the network rather than an individual, and in seven cases an email address was not provided. In all these cases, the corresponding author occupied the first position, suggesting that WoS simply chose the first available email address for the corresponding author. Of the remaining records, 12 did not explicitly identify an email address for the corresponding author. When a clear corresponding author was listed, a single email address was provided in 53 cases, which was likely interpreted as corresponding. In 50 cases it was the first author, in 10 cases last author, and in six cases middle author. Multiple email addresses were provided for four of the records (in one case all authors, in one case first and last author, and in three cases the last two authors). In short, of those with a WoS corresponding author but no Scopus corresponding author, 53% were unambiguous upon examination. Table 2 summarizes the validation process.
Sample | WoS A | WoS B | WoS C | Scopus A | Scopus B | Scopus C |
---|---|---|---|---|---|---|
Set 1 (First) | 100 | 77 | 100 | 97 | 88 | 100 |
Set 1 (Middle) | 100 | 68 | 100 | 96 | 76 | 100 |
Set 1 (Last) | 100 | 94 | 100 | 94 | 99 | 100 |
Set 2 | 100 | 27 | 100 | | | |
Set 3 | 100 | 71 | F(76); M(6); L(16) | | | |
4.2. Country and Discipline Differences in Corresponding Authorship in WoS
Twenty percent of documents in WoS did not include reprint author metadata. Across all papers, corresponding authors appeared as first authors in 47%, as middle authors in 10%, and as last authors in 22%. Excluding the 20% of papers without reprint author metadata, 59% of papers are assigned to the first author, 13% to middle authors, and 28% to the last author (Figure 3). For those documents with at least one email, WoS assigns the corresponding author role to the first author in more than 55% of papers, and more than 61% have more than one email address. When at least one corresponding author appears in the author table, it is usually assigned to the first author (56%), whereas when more than one corresponding author appears, they are usually assigned to the middle author position (66%).
The distribution of papers over time with reprint address metadata shows that for nearly 28% of all papers in 1998, and 20% in 2018, there is no metadata for reprint address (Figure 4A). For single-authored papers, the share with reprint address metadata rises from 57% in 1998 to 67% in 2018, and for coauthored papers, percentages are higher (from 83% in 1998 to 85% in 2018; see footnote 4).
In Figure 4C, we can observe that WoS starts registering email addresses from 2001 onwards. From 2004 the recording seems consistent, but it remains incomplete for single-authored papers (in 2018, 21% of papers lack this information) (Figure 4B). Email addresses in the reprint address field have been completely recorded in recent years in collaborative papers (Figure 4C). In addition, WoS starts registering reprint author metadata consistently in 2005; more than one email address in collaborative papers in 2004, increasing steadily over time (more than 25% of papers in 2018); and more than one corresponding author per paper in 2016 (see footnote 5).
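The rescaling from shares of all papers to shares of papers with reprint author metadata can be checked with a short calculation. The sketch below uses the rounded percentages quoted in the text; the function name `renormalize` is hypothetical.

```python
def renormalize(shares, missing):
    """Rescale percentage shares after dropping a `missing` percentage of papers."""
    covered = 100.0 - missing
    if covered <= 0:
        raise ValueError("no papers left after exclusion")
    return {label: 100.0 * value / covered for label, value in shares.items()}


# Shares of all WoS papers by position of the corresponding author (from the text).
reported = {"first": 47.0, "middle": 10.0, "last": 22.0}
rescaled = renormalize(reported, missing=20.0)
for label, value in rescaled.items():
    print(f"{label}: {value:.2f}%")
# first: 58.75%
# middle: 12.50%
# last: 27.50%
```

Rounded half up, these values agree with the 59%, 13%, and 28% quoted for papers with reprint author metadata. The three input shares sum to 79% rather than 80% because they are themselves rounded.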
We also explored the position of corresponding authors over time (Figure 5). From 1998 to 2018, the most common position for corresponding authors is the first one, although it begins to decline in favor of the middle (more than 30% of papers) and last positions (more than 20%) (left panel).
When considering collaborative papers with reprint author metadata in all disciplines (right panel), the percentage of papers with corresponding author as first author declines by 46% over time (from 88% to 47%), and papers with the last author as corresponding author increase by a factor of four and middle authors increase their presence in WoS by a factor of six. It seems that correspondence was assigned by default to first authors, but more recently it has been assigned to last authors. In addition, middle authors are increasing at a higher rate than the rest. However, the percentage of papers with no corresponding author remains steady over time (around 20%).
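Because the trends above mix relative changes ("declines by 46%") with multiplicative ones ("by a factor of four"), a short sketch of both calculations may help when reading the figures. The function names are hypothetical, and only the 88% and 47% values are taken from the text.

```python
def relative_change(start, end):
    """Percentage change from start to end, relative to start."""
    if start == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return 100.0 * (end - start) / start


def growth_factor(start, end):
    """Ratio end / start, e.g. 4.0 means 'increased by a factor of four'."""
    if start == 0:
        raise ValueError("growth factor is undefined for a zero baseline")
    return end / start


# Share of collaborative papers with the first author as corresponding author.
print(f"{relative_change(88, 47):.1f}%")  # -46.6%
```

Note the difference between the two readings of the same shift: moving from 88% to 47% is a drop of 41 percentage points, but a relative decline of about 46.6%, which the text reports as 46%.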
4.2.1. Differences by field
Next, we explore the evolution over time of the percentage of papers by broad scientific fields (Figure 6). The percentages of papers with first author as corresponding author are decreasing over time in medicine (MED), natural science and engineering (NSE), and social sciences (SS). In arts and humanities (AH), first authorship shows two different phases: Until 2002 it represents around 60% of papers, with 40% of papers having no corresponding author. As of 2002 first authorship increases (up to 80%) and the share of papers with no corresponding author decreases significantly from 60% to 5%. NSE presents a higher decrease (around 40%) in first authorship as corresponding (from around 74% to 45%), favoring last and especially middle positions (growth rates of 96% and 322%, respectively). Papers with no corresponding author also decrease over time (from 6% to 1.7%). MED shows a lower decrease in first authorship than NSE (from 54% to 40%), with the highest proportion of papers for which last authors appear as the corresponding authors. This trend remains over time, with around 35% of papers in 2018. In SS, first authorship is the most common position (77% in 2018), with a slight decrease (10%) over time, and last and middle authorship increase their presence as corresponding authors by two and three times, respectively (15% and 10% respectively in 2018).
Figure 7 shows a heat map by discipline and type of author information metadata in collaborative papers in the left panel (sorted in descending order by the column ‘At least one email’). Overall, more than 39% of documents do not record at least one email address and 43% of papers do not have one corresponding author in the author table, which means that a large proportion of papers lack this information. There is a certain correspondence between papers having an email address and at least one corresponding author in the author table. Some discrepancies are observed in Psychology and especially in Arts and Humanities, where there is a high proportion of papers with at least one corresponding author and a low proportion of papers with at least one email address. The case of Physics is more balanced but particularly striking, showing low values in both variables (less than 50%).
Only 12% of papers register more than one email address; however, Mathematics (46%) and Humanities (29%), followed by Social Sciences, Professional Fields, and Engineering and Technology (around a quarter of papers), show higher percentages of papers with more than one email address in WoS. Papers with more than one corresponding author in the author table barely represent 2% in all disciplines, being more likely to appear in Chemistry, Biomedical Research, and Engineering.
In the right panel of Figure 7, a heat map shows author position by discipline in collaborative papers. First authors as corresponding authors have the highest proportion of papers in almost all disciplines, except for those related to the broad NSE and MED scientific fields. Biomedical Research (46%), Chemistry (38%), and Biology (28%) present higher values for papers with middle authors as corresponding authors, and last authorship as corresponding author is usual in Chemistry (23%) and Engineering and Technology (18%).
4.2.2. Differences by number of authors
Considering the number of authors per paper, the distribution of corresponding authors in distinct order positions is shown in Figure 8. The top-left panel shows that in all papers, first authorship is the most prevalent, decreasing over time (−30%) in favor of the middle (149%) and last (20%) positions, which evolve in parallel. The top-right panel shows authors appearing in coauthored papers with at least one corresponding author in the author table. In this case, there is no metadata of corresponding authorship for around 23% of authors. In 2006, there is a clear change of pattern, which is consistent with Figure 2 (WoS starts consistently registering reprint author metadata in papers from 2005 onwards). Corresponding authors in first position fall from 66% to 46% whereas authors who appear in the last position rise from around 23% in 1998 to 28% in 2018. The most evident shift is observed in authors appearing in middle positions, going from 11% to 30% in 2018 in MED, and middle authorship overtakes last authorship in NSE.
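As an illustration of how byline positions of this kind can be tabulated, the following minimal Python sketch classifies the corresponding author of each collaborative paper as first, middle, or last and computes the resulting shares. It assumes that "first" means position 1, "last" means the final position, and "middle" means any position in between; the operationalization used in the analysis may differ in detail. The function names and the toy records are hypothetical and are not taken from the data.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def byline_position(index: int, n_authors: int) -> str:
    """Classify a 1-based byline index in a paper with at least two authors."""
    if n_authors < 2:
        raise ValueError("collaborative papers have at least two authors")
    if not 1 <= index <= n_authors:
        raise ValueError("index must lie between 1 and n_authors")
    if index == 1:
        return "first"
    if index == n_authors:
        return "last"
    return "middle"


def position_shares(
    papers: Iterable[Tuple[Optional[int], int]]
) -> Dict[str, float]:
    """Percentage of papers by position of the corresponding author.

    Each paper is (corresponding_index, n_authors); an index of None
    stands for a record without corresponding author metadata.
    """
    counts: Counter = Counter()
    for index, n_authors in papers:
        label = "none" if index is None else byline_position(index, n_authors)
        counts[label] += 1
    total = sum(counts.values())
    return {label: 100.0 * count / total for label, count in counts.items()}


# Hypothetical toy records, for illustration only.
toy_papers = [(1, 3), (3, 3), (2, 5), (None, 4)]
print(position_shares(toy_papers))
```

Note that in two-author papers only the first and last categories can occur, so shares of middle authors are bounded by the share of papers with three or more authors.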
4.2.3. Differences by country
Next, we explore differences between countries and regions. Figure 9 shows the distribution of papers of the 100 most productive countries by the number of papers with first, last, and middle author as corresponding author (sorted ascending by first authors). It seems that some Asian and Latin American countries tend to accumulate a higher proportion of papers with middle and last authors as corresponding authors.
To have a better understanding of how the order position of corresponding authorship varies across countries, Figure 10 shows the distributions of countries by region according to the order position of the corresponding author. The highest proportion is observed for first authorship in all regions, with variations. The general pattern that emerges is that, for all groups, there appears to be a high concentration of first authorship, although we do observe extreme cases, such as South Korea in East Asia & Pacific (25% of first and 50% of middle position as corresponding authors). China, Taiwan, and Indonesia have a higher proportion of last and middle corresponding authorship and lower first corresponding authorship.
Indeed, there are some country differences. Even though there is no consensus about the status and meaning of the corresponding author among universities, publishers, and/or authors (Willems & Plume, 2021), some countries have gone so far as to monetize this position of leadership: Korea, China, and Pakistan all have governmentally funded incentive structures for those who are first and corresponding authors on papers in journals such as Science, Nature, and Cell (Franzoni et al., 2011; Fuyuno & Cyranoski, 2006; Quan et al., 2017). That suggests that different scientific cultures and incentives may also play a role in the choice of the corresponding author and, by extension, in the behavior of research groups that tend to adapt to evaluative research assessment. Hence, the validity of corresponding author metadata in major databases is important in order to correctly assign the position of authors in evaluation studies, and this should be further investigated in future studies (Figure 11).
5. DISCUSSION AND CONCLUDING REMARKS
Gaining authorship in a published paper is a prestigious endeavor that is sought by everyone in the academic research world (Cuschieri, 2022). Several studies have examined the relationship between corresponding author and author order. However, most of these studies just focus on a small portion of data, covering only limited research fields or time range, which may not be ultimately generalized to other situations (Yu & Yin, 2021).
In this study, we present an empirical analysis of the use of corresponding authorship in scientific publishing. As metadata for corresponding author is not explicitly reported in the Scopus and WoS databases (Hu, 2009), we use the reprint address field as the indication of the author to whom correspondence should be addressed. We observe that, over time, WoS and Scopus have increased the number of records for which they include reprint metadata. WoS has a higher percentage of papers that contain this information. But the percentage of documents with more than one corresponding author is higher in Scopus than in WoS. There are significant differences in documents where only one database identifies a corresponding author or the corresponding author is not the same. After manually inspecting some random samples, we observe that when multiple email addresses are provided, WoS will simply include one of the available emails, and Scopus ignores this information if the number of coauthors is extremely high (e.g., high-energy physics).
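The comparison categories mentioned in this paragraph (a corresponding author recorded in only one database, or different corresponding authors in each) can be expressed as a simple record-level rule. The Python sketch below is a hypothetical illustration under the assumption that each database yields, for a matched paper, a set of byline indices flagged as reprint authors; it is not the matching procedure used in the study, and the category labels are invented for the example.

```python
from typing import Iterable


def compare_sources(wos: Iterable[int], scopus: Iterable[int]) -> str:
    """Compare the corresponding authors recorded by two databases.

    Each argument holds the 1-based byline indices flagged as
    corresponding (reprint) authors; an empty collection means the
    database records no corresponding author for the paper.
    """
    wos_set, scopus_set = set(wos), set(scopus)
    if not wos_set and not scopus_set:
        return "neither"
    if not scopus_set:
        return "wos_only"
    if not wos_set:
        return "scopus_only"
    if wos_set == scopus_set:
        return "same"
    # Some, but not all, corresponding authors coincide.
    if wos_set & scopus_set:
        return "partial_overlap"
    return "different"


# Hypothetical matched records: (WoS indices, Scopus indices).
examples = [([1], [1]), ([1], [1, 4]), ([1], [4]), ([2], []), ([], [])]
for wos_idx, scopus_idx in examples:
    print(wos_idx, scopus_idx, compare_sources(wos_idx, scopus_idx))
```

Keeping a separate "partial_overlap" category is one way to account for the case described above in which one database retains a single address while the other records several.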
These two data sources are important in bibliometric studies and have often been compared with regard to the coverage of fields, countries, and languages (Gusenbauer, 2022; Mongeon & Paul-Hus, 2016; Singh, Singh et al., 2021), but are rarely used to analyze authors’ positions related to corresponding authorship. Hence, this study contributes to the literature, bringing insights about their indexation practices in corresponding authorship. We also acknowledge that much more work is still to be done in the future related to this comparison of the operationalization and coverage of corresponding authors in scientometric databases. Particularly relevant will be to expand comparisons with new and larger data sources in the field (OpenAlex, Dimensions, Lens.org, etc.).
We further explore changes in the position of corresponding authors over time in WoS, by fields and by countries. We found that reprint address metadata is not complete in either single-authored (more than 30%) or coauthored papers (15%). WoS starts consistently registering reprint author metadata from 2005 onwards and more than one reprint author from 2016.
We find that first authorship is the most common position holding the corresponding author role in all papers, although this trend is changing in favor of middle and last author positions, especially in MED and NSE fields. It seems that first authors were the corresponding authors by default, but middle authors appearing as corresponding authors are increasing at a higher rate than the rest, such as in NSE. Yet, the average percentage of papers with no corresponding author remains steady over time (around 20%). This appears to be related to the document type rather than to systematic biases in the database.
When considering the number of authors per paper, we found that close to a third of authors do not appear as corresponding authors. In line with the results of Milojevic, Radicchi, and Walsh (2018), our hypothesis is that technical staff might be behind this figure, which might have some effect in research evaluation assessments. Further research needs to be conducted in future studies. In addition, there are country differences in the percentages of position in the byline of corresponding author. Although first authorship is more likely to serve as corresponding author in most countries, there are exceptions, such as South Korea, China, Pakistan, and Taiwan, where the last and middle positions are more likely to appear as corresponding. This could be due to the introduction of incentives with regard to the corresponding author and seems to be consistent with other studies (Ding & Herbert, 2022).
5.1. Policy Implications
The complexity of evaluating intellectual contributions in increasingly interdisciplinary research and collaboration and the competitive environment of the labor market (Larivière et al., 2016) have important practical implications for scientists, funding providers, and research evaluators. Corresponding authorship has become an indication of seniority and leadership on the team, driven by incentive initiatives from funding agencies and research institutions rather than a particular set of responsibilities (Willems & Plume, 2021). The use of author order as a primary source of credit (Egghe, Rousseau, & van Hooydonk, 2000) can be problematic and has consequences for evaluation studies, as the inaccurate assessment of collaborators can harm the sustainability of scientific collaborations (Lu, Zhang et al., 2022; Wang, Ren et al., 2020); lead to a dramatic drop-out from scientific careers (Milojevic et al., 2018), especially in early career stages and for female researchers (Robinson-Garcia et al., 2020); and lead to unethical practices such as ghost, gift, and/or honorary authorship (Teixeira da Silva, 2021).
At the individual level, it seems that the greatest driver behind the selection of corresponding authorship in collaborative papers is the competitive environment in which researchers and institutions are now operating. To secure job opportunities and funding, researchers will use the role of corresponding author as a means to get credit regardless of their position on the author list (Willems & Plume, 2021). Recent decades have witnessed an increasing number of corresponding authors and equally contributing authors, creating stress if teamwork is not properly acknowledged in research evaluation exercises (Franzoni et al., 2011; Fuyuno & Cyranoski, 2006; Quan et al., 2017), journals (Drubin, 2014; Dubnansky & Omary, 2012; Omary, Wallace et al., 2015), and bibliographic databases (Hu, 2009). This study contributes to shedding light on the validity of corresponding authors in bibliographic databases, showing that there is no systematic and accurate standard to index this author position in two of the major bibliographic databases. Hence, studies focused on the figure of the corresponding author should be cautious in interpreting their findings.
At the country level, we show that incentives may play a role. The significant shift in the position of corresponding author in some countries also increases geographic inequalities, as authors providing funding will automatically adopt the corresponding author role, leaving other positions to the rest of their collaborators. In addition to the individual incentives for publishing as corresponding authors in some countries, universities are increasingly reaching agreements with publishers that specify that the corresponding author must be an employee of a participating university (Willems & Plume, 2021). However, not all researchers have access to the same resources (Chinchilla-Rodríguez et al., 2019), which leads to an underrepresentation of institutions from less developed countries (Gumpenberger, Hölbling, & Gorraiz, 2018; Powell, Johnson, & Herbert, 2020), and research publishing will be closed to those who cannot make an institutional or project money payment (Zhang, Wei et al., 2022), which raises new research questions to be further investigated.
Given the potential value of publications indexed in bibliographical databases and their use as bibliometric data sources for large-scale analyses in research assessment, research landscape studies, science policy evaluation, and university ranking (Baas, Schotten et al., 2020), and the consequences for the reward system of science (Butler, 2003; Crespo & Simoes, 2021; Hornibrook, 2012), it is important to assess their strengths and weaknesses (Bornmann, 2018; Guerrero-Bote, Chinchilla-Rodríguez et al., 2021; Mongeon & Paul-Hus, 2016) in order to guarantee the bibliometric relevance, completeness, and accuracy of the sources.
The results of this study are currently relevant because more bibliometric databases are being developed (e.g., Dimensions.ai and OpenAlex). How these databases conceptualize and operationalize specific metadata elements may differ substantially among them, and sometimes important metadata elements such as corresponding authors may even be overlooked (e.g., the current version of OpenAlex does not include corresponding author identification). We plan to continue studying these differences among data sources in a more complete study on the concept of corresponding authorship and how it is captured among the different databases. Such a study should provide better evidence to help researchers choose the databases that best suit their goals before drawing conclusions that may be used by policymakers and other stakeholders.
ACKNOWLEDGMENTS
We are grateful to Cassidy Sugimoto (School of Public Policy, Georgia Institute of Technology) for her fruitful discussion and feedback on an earlier draft of this paper presented at the 26th International Conference on Science, Technology and Innovation Indicators (STI 2022), Granada, Spain (Chinchilla-Rodríguez, Costas et al., 2022); Gali Halevi (Clarivate); and Andrew Plume (Elsevier) for their feedback in our validation study. We also thank reviewers for their comments, which have helped to improve the original manuscript.
AUTHOR CONTRIBUTIONS
Zaida Chinchilla-Rodríguez: Conceptualization, Formal analysis, Funding acquisition, Methodology, Writing—original draft, Writing—review & editing. Rodrigo Costas: Conceptualization, Data curation, Methodology, Supervision, Writing—review & editing. Nicolás Robinson-García: Conceptualization, Formal analysis, Funding acquisition, Methodology, Writing—review & editing. Vincent Larivière: Conceptualization, Data curation, Methodology, Supervision, Writing—review & editing.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
The authors acknowledge funding from the Spanish Ministry of Science and Innovation (RESPONSIBLE project PID2021-128429NB-I00 and COMPARE project PID2020-117007RA-I00). Nicolas Robinson-Garcia is funded by a Ramón y Cajal grant from the Spanish Ministry of Science and Innovation (REF: RYC2019-027886-I).
DATA AVAILABILITY
The data sets (Web of Science and Scopus) used for the analyses in the current study are not publicly available due to licensing clauses.
Notes
The WoS user guide (2021), https://support.clarivate.com/ScientificandAcademicResearch/s/article/Web-of-Science-Core-Collection-Explanation-of-Reprint-Address?language=en_US, provides some insight into the indexing practices. Beginning with 1998 data, we do not remove a duplicate address if it appears as both a research and a reprint address. If you want to count unique addresses, exclude <reprint_addresses> data. Prior to 1998, a research address that matches a reprint address is not included in the list of research addresses. To count unique addresses, create a table for all addresses and eliminate duplicates for all years. Then, on an ongoing basis, match addresses to the existing table and move the duplication.
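The counting procedure described in this note (build a table of all addresses, eliminate duplicates, then match new addresses against the existing table) can be sketched in Python as follows. This is an illustrative reading of the guidance, not Clarivate code: the normalization rule (collapsing whitespace and ignoring case) and the function names are assumptions, and the addresses are invented examples.

```python
from typing import Dict, List


def normalize(address: str) -> str:
    """Collapse whitespace and ignore case (an assumed matching rule)."""
    return " ".join(address.split()).casefold()


def add_unique(table: Dict[str, str], addresses: List[str]) -> int:
    """Add addresses not yet in the table and return how many were new.

    The table maps a normalized key to the first spelling seen, so that
    a reprint address duplicating a research address is counted once.
    """
    added = 0
    for address in addresses:
        key = normalize(address)
        if key and key not in table:
            table[key] = address
            added += 1
    return added


# Invented example record: the reprint address repeats a research address.
research = ["Univ A, Dept Phys, City X", "Univ B, Inst Chem, City Y"]
reprint = ["univ a,  dept phys, city x"]

table: Dict[str, str] = {}
add_unique(table, research)            # initial table of research addresses
new_count = add_unique(table, reprint)  # ongoing matching against the table
print(len(table), new_count)
```

Real address strings usually vary in more ways than case and spacing, so any production use would need a more careful matching rule than the one assumed here.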
Practices, however, are always changing. As noted on the Clarivate website (2021): Although many journals specify only one corresponding author, there is no limit to the number of contributors who may be designated to receive correspondence for a paper. As of January 27, 2016, multiple reprint addresses will be captured and displayed on the Web of Science Core Collection Full Record. For records indexed prior to January, 2016, only the first reprint address will be displayed. See https://support.clarivate.com/ScientificandAcademicResearch/s/article/Web-of-Science-Core-Collection-Explanation-of-Reprint-Address?language=en_US.
REFERENCES
Author notes
Handling Editor: Li Tang