Investigating the division of scientific labor using the Contributor Roles Taxonomy (CRediT)

Contributorship statements were introduced by scholarly journals in the late 1990s to provide more details on the specific contributions made by authors to research papers. After more than a decade of idiosyncratic taxonomies by journals, a partnership between medical journals and standards organizations has led to the establishment, in 2015, of the Contributor Roles Taxonomy (CRediT), which provides a standardized set of 14 research contributions. Using the data from Public Library of Science (PLOS) journals over the 2017–2018 period (N = 30,054 papers), this paper analyzes how research contributions are divided across research teams, focusing on the association between division of labor and number of authors, and authors’ position and specific contributions. It also assesses whether some contributions are more likely to be performed in conjunction with others and examines how the new taxonomy provides greater insight into the gendered nature of labor division. The paper concludes with a discussion of results with respect to current issues in research evaluation, science policy, and responsible research practices.


INTRODUCTION
Scientific authorship is regularly considered the primary currency in academia, whether for hiring, promotions, or priority disputes (Biagioli & Galison, 2003; Cronin, 2001; Pontille, 2004). Yet, from the 1950s onwards, issues have progressively been raised about the use of authorship for attributing scientific capital (Bourdieu, 2001). These issues can be grouped into three categories. The first relates to the increasing number of authors per article (Larivière, Sugimoto et al., 2015; Zuckerman, 1968). In some domains, such as clinical research, genomics, and high-energy physics, where articles often bear several hundreds or thousands of names in the byline, identifying respective contributions, and thus assessing individual researchers' contributions, is increasingly difficult. Second, with the rise of multidisciplinary projects, the meanings attributed to authorship, and to name ordering, have multiplied, with unintended consequences for authorship (Paul-Hus, Mongeon et al., 2017; Smith, Williams-Jones et al., 2020a, 2020b). The frictions of conventions have sown discord among the participants in research projects (Wilcox, 1998), and greater confusion has also prevailed among gatekeepers (Bhandari, Guyatt et al., 2014). Third, scientific research has regularly, and some may argue increasingly (Azoulay, Furman et al., 2015), been shaken by cases of fraud. In some alleged cases, all authors on a work under investigation have asked journals to remove their names from the byline. Contributorship statements not only specify the tasks performed, but also assume that the research process can be segmented into different acts that can be properly ascribed to individual contributors. The segmentation of scientific contributions was not introduced by contributorship, but rather emerged from researchers who proposed taxonomies in response to Moulopoulos et al.'s (1983) work. These idiosyncratic taxonomies differed in the number of contributions listed (from six to 15) and their degree of accuracy.
For example, "writing up the paper" was sometimes considered as one contribution, while in other taxonomies it was supplemented with "critical revision of manuscript," or even split into "writing the first draft of the paper," "writing later draft(s)," and "approving final draft" (Goodman, 1994).
Biomedical journals were the main drivers of new taxonomies. Two peculiarities have resulted from this. First, these taxonomies are characterized by research task contributions clearly specific both to the biomedical sciences ("collecting samples or specimens," "providing DNA probes") and to clinical research ("referred patients to study," "provision of study materials or patients"). Second, the number of contributions varies significantly from one journal to another, as do the contribution taxonomies themselves and their organization (Bates, Anic et al., 2004; Baerlocher, Gautam et al., 2009; McDonald, Neff et al., 2010). Journals request contributions in free-text form, organized as a predefined list of research tasks to choose from, or even as hierarchical items that make some contribution roles a prerequisite for others. As these taxonomies evolved, studies investigated the relationship between the structure of these forms, the number of contributions described, and the differences in perception among coauthors of the same article (Ilakovac, Fister et al., 2007; Ivaniš, Hren et al., 2008, 2011; Marušić, Bates et al., 2006).
Early taxonomies paved the way for large-scale empirical studies of authorship practices in science. For instance, Larivière, Desrochers et al. (2016) analyzed contributorship statements, divided into five contributions, for 87,002 papers published in all Public Library of Science (PLOS) journals, focusing on labor distributions across disciplines, authors' order, and seniority. They showed that the division of scientific labor is higher in medical research than in the natural sciences, and that in all domains but medicine, the most common task among authors was drafting and editing the manuscript. Results of this and subsequent analyses also showed a strong distinction between tasks performed and author characteristics: Younger researchers and women were more likely to perform technical contributions, whereas older, male researchers were more often associated with conceptual contributions. Authors' order was also strongly associated with the number of contributions: First authors were generally associated with the vast majority of contributions, followed by last authors, who generally were not involved in technical work, and then by middle authors, whose contributions were fewer and more likely to be technical. These findings were confirmed by Sauermann and Haeussler (2017), who analyzed more than 12,000 articles published between 2007 and 2011 in PLOS ONE. As with Larivière et al. (2016), they found that first and last authors were associated with more contributions than middle authors. In an examination of team size, they demonstrated that the number of contributions per author decreases with the number of authors, but remains stable for last authors. They complemented this analysis with a survey of 6,000 corresponding authors from these papers.
Their findings suggest that a majority of corresponding authors believe that contributorship statements provide more information about the contributions, but only a minority think that contributorship provides more information on the importance of contributions. Furthermore, they found that in one-fifth of papers, contributorship statements were determined by the corresponding author alone. Sauermann and Haeussler (2017) suggested that it is difficult to predict contributions based on author order alone. Corrêa, Silva et al. (2017), also using the PLOS ONE data set, confirmed the uncertain relationship between authors' order and the contributions made. Using a network-based approach, they found that the relationship becomes increasingly random as the number of authors per paper increases. They also provided evidence of how the division of labor increases with the number of authors, and showed that contributions can be grouped into three categories: those who write, those who perform data analysis, and those who conduct experiments.
These studies provided novel insight into the relationship between authorship and one coarse-grained contributorship taxonomy. However, the previously used five contributorship categories fail to account for the complexity of contemporary science. To address the need for a more refined taxonomy, an International Workshop on Contributorship and Scholarly Attribution was organized at Harvard in May 2012 at the initiative of the Wellcome Trust (IWCSA, 2012). One outcome was a pilot project involving publishers, funders, and scientists to design a cross-disciplinary standardized taxonomy for contributor roles and contribution types, which would be practicable for all scientific fields. The goal was to be interoperable with different databases and to reduce the many ambiguities that remained with earlier contributorship typologies. In the eyes of its promoters, this standardized taxonomy would not only codify the contributions of each researcher with fine granularity, allowing specific skills to be easily identified, but also rely on an infrastructure to manage the complex relationships between the information, its archiving, and its consultation in real time.
An initial prototype comprising 14 contributor roles was designed and tested among corresponding authors of work published in various (mostly biomedical) journals (Allen, Scott et al., 2014). Based on the positive results of this experiment, a partnership with two information industry standards organizations, the Consortia Advancing Standards in Research Administration Information (CASRAI) and the U.S.-based National Information Standards Organization (NISO), was established to achieve broader consultation and to refine the preliminary taxonomy. An updated version of the taxonomy was made public in 2015 under the name CRediT (Contributor Roles Taxonomy) to provide "a controlled vocabulary of contributor roles" (Brand, Allen et al., 2015) for published research outputs.
The introduction of CRediT provides more detail on the division of scientific labor than was available with previous contributorship taxonomies. First, not only may a given role be assigned to multiple contributors, but when this is the case, a degree of contribution may optionally be specified as "lead," "equal," or "supporting."¹ The granularity of contribution roles is thus more precise, and the same contribution role can be prioritized among contributors. Second, the 14 contribution roles go beyond the research tasks commonly identified in traditional authorship. They notably include various roles related to research data, such as "resources" (provision of study materials, reagents, materials, patients, laboratory samples, animals, etc.), "data curation" (annotation, scrubbing, and maintenance), "software" (programming, software development, designing computer programs, etc.), and "visualization" (preparation, creation, and/or presentation of the published work, specifically visualization/data presentation). Third, the writing process is divided into two main roles, "original draft" and "review and editing," introducing nuance into this primary contributorship role. With these improvements, CRediT is suited to account for both the division of scientific labor and the allocation of individual contributions. PLOS adopted CRediT in 2016 (Atkins, 2016). By the end of 2018, more than 30,000 articles had employed this new taxonomy. In this paper, we examine these articles to investigate whether the more fine-grained taxonomy provides a more nuanced portrait of the division of labor than was possible previously. More specifically, we examine how research contributions are divided across research teams, focusing on the association between the number of authors and the division of labor, and on the relationship between authors' position and the specific tasks performed.
We also consider the association between each of the 14 contributions, to assess whether some contributions are more likely to be performed in conjunction with others.
In their review of the taxonomy, Allen, O'Connell, and Kiermer (2019) identify how CRediT can be a useful tool in the science of science. As they state: "If we can understand how collaborations work and when, or how to optimize the best team mix, then we may be able to incentivize the sorts of behaviours and activities that can bring about and accelerate discovery" (p. 74). They particularly draw attention to issues of diversity in team composition and how contributorship studies can provide insights into how best to support women and early career researchers as they progress in science. Therefore, we also explore how the new taxonomy provides greater insight into the gendered nature of science, comparing this with the earlier PLOS typology.

Contribution / Definition

Conceptualization
Ideas; formulation or evolution of overarching research goals and aims.

Data curation
Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later reuse.

Formal analysis
Application of statistical, mathematical, computational, or other formal techniques to analyze or synthesize study data.

Funding acquisition
Acquisition of the financial support for the project leading to this publication.

Investigation
Conducting a research and investigation process, specifically performing the experiments, or data/evidence collection.

Methodology
Development or design of methodology; creation of models.
Project administration
Management and coordination responsibility for the research activity planning and execution.

Resources
Provision of study materials, reagents, materials, patients, laboratory samples, animals, instrumentation, computing resources, or other analysis tools.

Software
Programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components.

Supervision
Oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team.

Validation
Verification, whether as a part of the activity or separate, of the overall replication/reproducibility of results/experiments and other research outputs.

Visualization
Preparation, creation and/or presentation of the published work, specifically visualization/data presentation.

Writing-original draft
Preparation, creation and/or presentation of the published work, specifically writing the initial draft (including substantive translation).

Writing-review & editing
Preparation, creation and/or presentation of the published work by those from the original research group, specifically critical review, commentary or revision, including pre- or post-publication stages.
Since its release, a growing number of journals, such as BMJ, have adopted it or, in the case of major publishers, have seen some of their journals adopt it. By early 2019, more than 120 journals had implemented the taxonomy (Allen et al., 2019), a number that increased substantially at the end of 2019 with the adoption of the typology by 1,200 journals from Elsevier (Elsevier, 2019). Our analysis is based on one of these publishers, PLOS, which provided us with all of its contributorship information for papers published between June 15, 2017 and December 31, 2018 (N = 30,770). The data covered all PLOS journals and included publication date, Digital Object Identifier (DOI), journal name, author name as it appears on the paper, and the associated CRediT contributions for each author.³ Table 2 presents the characteristics of the data set. The bulk of the papers were published in the megajournal PLOS ONE (87.9%), which is the second largest megajournal (Siler, Larivière, & Sugimoto, 2020). Our data set contains comprehensive data for all journals with the exception of PLOS Biology, for which contributorship information could only be obtained for 13 papers.⁴ Important differences are observed in the mean number of authors per paper: PLOS Computational Biology has, on average, slightly fewer than five authors per paper, while PLOS Medicine has almost three times as many. However, the mean number of contributions per paper is quite constant across journals, with a maximum of 11.8 in PLOS Biology and a minimum of 10.6 in PLOS ONE. Given the strong focus on medical sciences of the multidisciplinary journal PLOS ONE (Siler et al., 2020) and of other PLOS journals, the results need to be interpreted as illustrative of the use of the CRediT taxonomy in those disciplines.
Contribution information provided by PLOS did not, however, contain author order; to obtain this information, we matched each PLOS paper with its record in our in-house version of Clarivate Analytics' Web of Science based on the DOI. This was feasible for 30,054 papers (97.7% of the PLOS data set; see Table 2 for percentages by journal), which included 222,938 authorships. For this subset of authors, who could be attributed an author order, we assigned a gender based on their given names. Such gender assignation of researchers has become a relatively standard practice and has been shown to achieve relatively high precision and recall (Karimi, Wagner et al., 2016; Santamaría & Mihaljević, 2018). In this paper, we used the algorithm developed in Larivière, Ni et al. (2013), which was created using several country-level lists of given names along with their gender. The algorithm has been tested for precision and was found to be 98.3% precise for men and 86.7% precise for women (see the supplementary material in Larivière et al. (2013) for more details). The algorithm assigned a gender to 82.2% of the authorships covered in this analysis (Table 3). This percentage varies by author order, however, with a higher proportion of last authors and a lower proportion of first authors assigned a gender.

³ This made the processing of contributorships much more straightforward than what is provided through the bulk download of the full text of papers in XML format (http://api.plos.org/text-and-data-mining/); see, for instance, Larivière et al. (2016). In this case, the full names of authors were provided, along with each contribution role, thereby facilitating the author-matching process.

⁴ A different editorial system for PLOS Biology made it difficult for PLOS to provide us with the data for this journal. Therefore, while the PLOS Biology contributorship data are included in the global analysis, individual data for the journal are not provided (i.e., Figures 1 and 2).
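The gender-assignment step can be illustrated with a minimal sketch: a lookup of given names against gendered name lists, with coverage computed as the share of authorships that received a label. The names, table entries, and function below are hypothetical illustrations, not the actual Larivière et al. (2013) implementation, which draws on many country-level lists and additional disambiguation rules.

```python
# Minimal sketch of name-based gender assignment (hypothetical data;
# the real algorithm uses many country-level name lists and extra rules).
NAME_LISTS = {
    "maria": "woman",
    "james": "man",
    "anne": "woman",
}

def assign_gender(given_name: str) -> str:
    """Return 'man', 'woman', or 'unknown' for a given name."""
    return NAME_LISTS.get(given_name.strip().lower(), "unknown")

# Coverage: share of authorships that received a gender label
authors = ["Maria", "James", "Xin"]
labels = [assign_gender(name) for name in authors]
coverage = sum(label != "unknown" for label in labels) / len(labels)
```

Names absent from every list remain "unknown", which is why coverage falls below 100% and varies with the cultural origin of given names.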
Female authorships represent 39.9% of the authorships in the PLOS data set to which a gender could be assigned, which is slightly greater than the percentage of female authorships found in the WoS for disciplines of the medical sciences (about 35%).

Figure 1 presents, for each PLOS journal, the percentage of papers on which each contribution appears. This provides an indication of the importance of each task across the spectrum of PLOS journals and, conversely, of the tasks that are not performed by any of the authors on a given paper. Nearly all papers had an author writing the original draft (99%), as well as authors reviewing and editing (96%) and conceptualizing (95%) them. This suggests that these remain essential research acts: All papers are conceptualized and written. The percentages of papers with at least one author contributing to formal analysis (91%), methodology (90%), and investigation (86%) are also very high, suggesting that empirical papers form the bulk of those published in these journals. The supervision task is present in 84% of papers; the 16% of papers without it likely do not include trainees as coauthors. Data curation is present in 79% of papers, although this percentage is higher in journals like PLOS Medicine, and 70% of papers contain project administration and funding acquisition, with the latter task accounting for a higher percentage in PLOS Pathogens and PLOS Genetics. Resources, validation, and visualization are present in about half of all papers. Software contribution appears in less than 40% of papers, except in PLOS Computational Biology, where it is found in almost three-quarters of papers.
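The per-paper percentages reported above count a paper as having a contribution if at least one of its authors performed it. A minimal sketch of that computation, assuming a toy set of (DOI, author, role) records rather than the actual PLOS data:

```python
from collections import defaultdict

# Toy (DOI, author, contribution) records; the real data has one row
# per author-contribution pair for each paper.
records = [
    ("10.1371/p1", "A", "writing - original draft"),
    ("10.1371/p1", "B", "software"),
    ("10.1371/p2", "C", "writing - original draft"),
]

# Collect the set of roles present on each paper
roles_by_paper = defaultdict(set)
for doi, _author, role in records:
    roles_by_paper[doi].add(role)

def share_of_papers_with(role: str) -> float:
    """Percentage of papers on which at least one author performed `role`."""
    n_papers = len(roles_by_paper)
    return 100 * sum(role in roles for roles in roles_by_paper.values()) / n_papers
```

Because the measure is per paper rather than per author, a role held by a single author counts the same as one held by the whole team; the author-level shares discussed in the Results use the same records but a different denominator.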

RESULTS
To assess the division of labor across authors, we compiled, for each journal, the percentage of authors who performed a given contribution. As shown in Figure 2, the majority of authors contribute to writing-review and editing (68%), as well as methodology (55%), investigation (53%), and conceptualization (51%). Worth mentioning is the fact that 95% of authors from PLOS Medicine contributed to the review and editing of the manuscript; this is likely due to the second criterion of the ICMJE, which states that all authors should have "[drafted] the work or [revised] it critically for important intellectual content" (International Committee of Medical Journal Editors, 2019, p. 2). All other CRediT contributions were, on average, performed by a minority of authors. Formal analysis, data curation, and validation were, on average, performed by 42-45% of authors across all PLOS journals, with higher percentages of authors contributing to formal analysis at PLOS Computational Biology and PLOS Genetics, as well as a higher share of authors contributing to validation at PLOS Computational Biology. Contrary to what was observed with the previous typology used by PLOS, where more than half of authors (and as many as 80% in the social sciences and physics, among others) had "written the paper," writing the original draft is a contribution performed by a much narrower percentage of authors (39% across all PLOS journals). Tasks typically performed by principal investigators (resources, supervision, project administration, and funding acquisition), as well as contributions that can be considered more specialized (visualization and software), are performed by a minority of authors (between 31% and 38%), with higher percentages for visualization and software at PLOS Computational Biology. Figure 3 shows the percentage of men and women, respectively, who have performed each specific CRediT contribution.
The newly adopted taxonomy reinforces some of the initial findings for gender, particularly the gendered divide between conceptual and empirical work: Although 57% of women contributed to the investigation, this percentage is 49% for men. A similar gap is observed for data curation. Men, on the other hand, are more likely to perform tasks associated with seniority, such as funding acquisition and supervision (30% more likely than women), as well as to contribute resources, software, conceptualization, and project administration. Although such differences are likely influenced by the fact that women academics are on average younger than men (McChesney & Bichsel, 2020), other studies have shown that gender differences in contributions remain constant with age as well as with the number of authors per paper.
A striking feature of CRediT compared to previous studies based on the PLOS typology concerns the writing of the manuscript. Under the previous PLOS typology, it appeared that men dominated the writing of the manuscript. However, the nuanced division between writing the original draft and reviewing and editing reveals a delineation between labor roles for men and women: Women are 6% more likely to have written the original draft, whereas men are 8% more likely to have reviewed and edited the manuscript. While these differences are not necessarily sizeable, the clear inversion of leading genders in the two contributions associated with writing is quite striking. This also demonstrates that the original finding obtained in Macaluso et al. (2016) was skewed by the ubiquity of the "review" portion of writing. Once the taxonomy isolated the original drafting of the text, women emerged as more likely to write the original draft. This suggests that the more nuanced taxonomy lends greater insight into contrasting divisions of labor.
Division of labor, furthermore, varies as a function of the number of authors. Figure 4 presents the percentage of authors who performed a given task, for papers with between 1 and 20 authors (N = 29,689 papers, 96.5% of the data set). Obviously, for single-authored papers, 100% of tasks are performed by a single author. As the number of authors increases, tasks are increasingly divided, although the extent to which they are varies as a function of the tasks involved. In other words, while some tasks are performed by a smaller proportion of authors as the number of authors increases, other tasks remain relatively stable once a certain threshold is met. For instance, the writing-review and editing task remains performed by a high percentage of authors (i.e., more than half), even when there are 20 authors on a paper. In a similar manner, the proportion of authors who contribute to investigation stabilizes once 10 authors are reached, with, again, about half of authors contributing to the task. Other tasks, however, are increasingly divided as the number of authors increases. For instance, the proportion of authors who perform supervision and writing of the original draft, among others, decreases steadily as the number of authors increases, which suggests, as shown in the inset, that these tasks remain performed by a few authors. More specifically, even in papers with 20 authors, only between three and four authors are involved in those two tasks.
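The team-size curves described above amount to grouping papers by author count and averaging, within each group, the share of authors credited with a role. A minimal sketch under an assumed toy format of (paper, number of authors, authors with the role) rows, not the actual PLOS data:

```python
from collections import defaultdict

# Toy rows for a single role (e.g., supervision):
# (paper id, number of authors, number of authors credited with the role)
rows = [
    ("p1", 2, 1),
    ("p2", 2, 2),
    ("p3", 10, 2),
]

# Group per-paper shares by team size
shares_by_size = defaultdict(list)
for _paper, n_authors, n_with_role in rows:
    shares_by_size[n_authors].append(n_with_role / n_authors)

# Mean percentage of authors performing the role, per team size
mean_share_by_size = {
    n: 100 * sum(shares) / len(shares) for n, shares in shares_by_size.items()
}
```

A role held by a near-constant number of authors (such as supervision) yields a share that falls roughly as 1/n, whereas a role whose headcount grows with the team (such as review and editing) yields a share that flattens out.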
As shown with the previous PLOS typology, there is a strong relationship between authors' order and tasks performed (Sauermann & Haeussler, 2017). Figure 5 presents the percentage of authors who have performed a given CRediT contribution, as a function of their order on the byline of the article (first, middle, last). Taken globally, the figure shows an inverse relationship between the tasks performed by first authors and those performed by last authors. More specifically, first authors are much more likely to write the original draft of the manuscript, curate the data, and perform the formal analysis, visualization, and investigation, as well as contribute to the methodology. Globally, the mean number of tasks contributed is highest for first authors, followed by last authors, and then by middle authors. Last authors, on the other hand, are much more likely to have contributed to supervision, funding, resources, and project administration. Conceptualization and reviewing and editing of the manuscript are performed by first and last authors in relatively similar proportions, although last authors are slightly more likely to have performed these tasks. There are no tasks that middle authors are more likely to perform than first and last authors. However, there are a few tasks in which their participation is relatively more important: They are more likely to contribute to supervision and resources than first authors, and more likely to contribute to data curation, investigation, and software than last authors. Figure 6 presents the contributions that are most likely to be associated with each other (i.e., performed by the same authors), as well as the asymmetry of these relationships. More specifically, it shows the percentage of authors who performed contribution A who have also performed contribution B.
For example, the figure shows that, although 93% of authors who contributed to funding acquisition have reviewed and edited the manuscript, only 46% of authors who reviewed and edited the manuscript have acquired funding. This relationship is among the most asymmetrical, as are the relationships between reviewing and editing, on the one hand, and software, project administration, visualization, resources, and supervision, on the other. This is not surprising: Reviewing and editing the manuscript is a task that most authors perform, irrespective of their other contributions to the manuscript. At the other end of the spectrum, funding acquisition is the contribution least associated with other tasks, except supervision and project administration. A similar phenomenon is observed for supervision, project administration, and resources. Software also has little relation to other tasks, except visualization.
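The asymmetry described above is a conditional share: among authors credited with contribution A, the percentage also credited with contribution B, which generally differs from the reverse direction. A minimal sketch with hypothetical author-role sets:

```python
# Hypothetical per-author sets of CRediT roles (illustrative only)
author_roles = {
    "author_1": {"funding acquisition", "writing - review & editing"},
    "author_2": {"writing - review & editing"},
    "author_3": {"funding acquisition", "writing - review & editing"},
    "author_4": {"software"},
}

def pct_a_also_b(role_a: str, role_b: str) -> float:
    """Of the authors credited with role_a, the % also credited with role_b."""
    with_a = [roles for roles in author_roles.values() if role_a in roles]
    if not with_a:
        return 0.0
    return 100 * sum(role_b in roles for roles in with_a) / len(with_a)

# The measure is asymmetric: pct(A -> B) != pct(B -> A) in general,
# because the two directions divide by different denominators.
fund_to_edit = pct_a_also_b("funding acquisition", "writing - review & editing")
edit_to_fund = pct_a_also_b("writing - review & editing", "funding acquisition")
```

In this toy data, every funder also edits (100%), but only two of the three editors also fund, reproducing in miniature the kind of asymmetry reported for the PLOS data.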

DISCUSSION
Our analysis has delved into the ways in which scientific labor is accounted for using a more refined contributorship taxonomy than was previously available. While confirming several previous findings (Corrêa et al., 2017; Larivière et al., 2016; Sauermann & Haeussler, 2017), the research has provided novel information on the composition and distribution of labor across teams. For example, contributorship information reveals the types of labor that are critical for producing scientific research: Almost all research articles include conceptualization, operationalization, and communication through writing. Deviations by discipline, however, reveal the importance of other, more niche tasks, such as visualization and software, acknowledged in certain domains. These findings suggest the need for greater heterogeneity in evaluation processes to reflect the importance of tasks by discipline. Privileging one type of labor will inevitably lead to inequities across disciplines in which specific tasks either go unperformed or remain unacknowledged through authorship and contributorship. Furthermore, both the heterogeneity of labor types and the number of contributions per paper suggest that mentoring and doctoral education may need to be reconfigured to address the changing composition of team science (Sugimoto, 2016).
The bureaucratization of science can be considered an inevitable consequence of the ubiquity of collaborative science (Larivière et al., 2015). As team size increases, the mean number of authors contributing to investigation, for instance, also increases, which suggests that the expansion of teams is largely a function of the increasing number of researchers who contribute to technical tasks, and of the acknowledgment that this contribution warrants authorship (Shapin, 1989). This is not associated with a concomitant rise in the number of those who write papers' first drafts or who supervise: There can only be a few supervisors and original authors, but there is a constant expansion in other forms of labor recognized through authorship (Pontille, 2016). As Shapin (1989) observed: "Scientists' authority over technicians typically means that it is the former who decide how officially to arrange the relationship, whether to 'make them' authors or coauthors, what counts as genuine knowledge as opposed to mere skill, and what technicians' work signifies in scientific terms" (p. 562). Our research suggests that, despite the steep increase in the number of authors, the number of scientific leaders remains small (Robinson-Garcia, Costas et al., 2020). Such division of labor and capital reinforces scientific hierarchies and cumulative advantages (Merton, 1968). Our investigation of current multiple authorship practices and contributorship distributions illuminates the selective attribution process among coauthors, wherein having one's name in an article byline does not equate to or result in a leadership position. Consequently, the growing proportion of "supporting authors" (Milojević, Radicchi, & Walsh, 2018) has strong implications for the composition of the scientific workforce.
The high proportion of data curation, present in 79% of papers, draws attention to a heavily overlooked labor role in science. The majority of articles involve this task, yet relatively little training is provided to doctoral students, and few scientists are prepared to engage in it. With the increasing prevalence of calls for open science (e.g., McKiernan, Bourne et al., 2016), it is essential that data be properly curated for better sharing and transparency. For example, several countries have established policies requiring the sharing of data created through funded research. Interviews with scientists, however, have revealed strong social and technical challenges to fulfilling these mandates (e.g., Borgerud & Borglund, 2020). Data curation work continues to be widely underresourced, despite increasing calls for data transparency (Leonelli, 2016) and the overwhelming importance of this work, as demonstrated by our analysis. Future work should ensure that data curation is both valued and supported in research environments.
Women are more likely to be associated with data curation, as well as with other technical work, such as investigation, which confirmed results obtained in previous analyses. However, CRediT provided a much more nuanced way to evaluate the conceptual vs. technical divisions identified in earlier research. Furthermore, and perhaps more importantly, the taxonomy elucidated a key difference within one of the main contribution types: writing. Whereas the original five categories contained a single writing category, in which men dominated, the new classification distinguished between editing and reviewing, on the one hand, and the much more labor-intensive writing of the first draft, on the other. In this distinction, the role of women emerged starkly. Given that women are underrepresented in first and last authorships, this is particularly striking and speaks to some of the underlying injustices in the division of labor and calculation of production (Penders & Shaw, 2020;Rossiter, 1993). This can be critical for the careers of women and other underrepresented minorities. As sociologist Mary Frank Fox (2005) observed: "…until we understand factors that are associated with productivity, and variation in productivity by gender, we can neither assess nor correct inequities in rewards, including rank, promotion, and salary […] because publication productivity operates as both cause and effect of status in science […] productivity reflects women's depressed rank and status, and partially accounts for it." It is no surprise, therefore, that junior scholars were the most concerned about their representation in contributorship statements and expressed the greatest desire for broad participation in these discussions (Sauermann & Haeussler, 2017). There is a considerable need for greater transparency about career lifecycles and for interoperability between systems (Cañibano, Woolley et al., 2019). The integration of CRediT and ORCID is a useful start.
It is clear from the data that contributorship provides a lens to add greater transparency to the exchange of capital for authorship. In addition to providing greater accountability for research, contributorship also sheds greater light on the flaws of the current system. Our work demonstrates a clear division of labor as team size increases, along with the corresponding isolation of certain contribution types. While this facilitates efficiency and may be necessary for certain types of research, it inevitably increases the chances of misconduct, mistakes, or fraud, given that several team members provide their contributions without direct oversight. One critical role, therefore, may be validation. However, this contribution was present in only 55% of papers (performed by 42% of authors). One may argue that these are merely idiosyncratic interpretations of the contributorship roles, where some authors may consider validation part of "investigation" or "formal analysis." However, the task definition is clear: "verification, whether as a part of the activity or separate, of the overall replication/reproducibility of results/experiments and other research outputs." The lack of validation in the PLOS papers reinforces the concerns of the "reproducibility crisis" (Baker, 2016). To address this, journals could require validation as a mandatory contribution type for empirical work. Contributorship statements are not without limitations. One strong concern at present is the assumed relationship between the actual labor and the indicator of this labor in contributorship statements. Undoubtedly, when scholars mutually ascribe the different tasks of CRediT to themselves, they maintain the opacity necessary to favor good working relationships between colleagues and teams. As criteria for authorship vary considerably across disciplines (Paul-Hus et al., 2017;Pontille, 2004, 2016), so too might the interpretation of contribution roles.
More research is necessary to understand whether CRediT provides a valid representation of the work performed. A related general concern has been raised by some clinical researchers and regulatory bodies regarding these expansive categories: although contributorship removes "much of the ambiguity surrounding contributions, it leaves unresolved the question of the quantity and quality of contribution that qualify for authorship" (International Committee of Medical Journal Editors, 2019). As with any system tied to capital, there is likely to be goal displacement as the taxonomy gains wider acceptance and use. For example, the disproportionately high share of PLOS Medicine authors associated with writing and editing may be less a disciplinary difference than an adherence to the ICMJE criteria. And, as some critics have emphasized, the contributorship procedure favors pharmaceutical firms, which, without having to feign intellectual involvement by appearing in an article byline, could now become "contributors" and thus avoid allegations of conflicts of interest (Matheson, 2011). This suggests that authors may modify their behavior in order to meet certain requirements, norms, or incentives. Further investigations are thus needed to explore such issues.
Despite the accountability it aspires to, any description of scientific contributions, even the fine-grained one provided by CRediT, can never be complete. As Sauermann and Haeussler (2017) noted, contributorship statements may reduce misconduct while simultaneously leading scientists to avoid association with those tasks that carry a greater potential for risk. Scientists may also transfer the familiar practices of ghost, guest, and gift authorship to contributorship. The systematic description of work does not, therefore, preclude invisibility, but only displaces it elsewhere. As a consequence, it leaves ghostwriting of articles and potential honorary contributorship in the backrooms of scientific research. Contributorship statements are not a panacea for the problems of authorship misconduct; they do, however, help distinguish the contributions that are sufficiently important to warrant authorship from those that are not. Issues with authorship are not an indication of problems inherent to the contributorship model, but symptomatic of a larger structural problem in the contemporary scientific community: the demand, by both policymakers and researchers themselves, for procedural ways of assessing excellence and scientific performance.

CONCLUSION
Over the last few decades, transparency in authorship and scholarly publishing has become increasingly discussed in academe. This is due to several interrelated phenomena. First, bibliometric evaluations have become widespread across countries, and have been applied to the promotion of individual researchers (Quan, Chen, & Shu, 2017) and to institutions, mostly through the ever-expanding university rankings (Debackere & Glänzel, 2004). Second, the rise in the number of PhD graduates, combined with the relatively stable number of faculty positions, is increasing competition among new graduates, who are ever more aware-as this is often made explicit-that publications are the currency that will allow them to find a position. The pressures wrought by this system have led to several authorship malpractices. There are flagrant acts of "civil disobedience" in authorship, such as adding humorous fictional coauthors, pets, or celebrities to a paper (Penders & Shaw, 2020). However, some new authorship issues are more pernicious, such as adding children as coauthors so that they can begin to build their publication record (Zastrow, 2019), and the growth of predatory publishing (Grudniewicz, Moher et al., 2019) and publication bazaars (Hvistendahl, 2013). These latter actions demonstrate how critical authorship is to the reward structure of science, and how misconduct can arise from these pressures to publish.
By fragmenting the scientific production process into clearly distinct tasks, CRediT was designed to transcend the customary rules specific to name ordering in scientific publications. Information about the conditions of production of research is being made available in each scientific article, and the systematic description of contributions according to CRediT is not limited to the authorship practices of a particular discipline. On the contrary, it can easily be adjusted to various kinds of division of scientific labor and their specific hierarchical principles across research teams (e.g., a team led by a leader, a project carried out among peers, a multicenter research project). In other words, CRediT is not at odds with the distinct authorship practices in place across disciplines. Rather, based on the traceability of individual performance, it provides additional information on the attribution process. At the same time, like other accounting devices (Strathern, 2000), the systematic description of contributions, especially through CRediT, comes with ambivalence. While it undoubtedly introduces greater transparency in both reward and accountability related to the division of labor involved in a published article, it simultaneously fuels an erosion of the trust at the root of scientific relations (Pontille, 2015). Put differently, the beneficiaries of the information made available-especially women and junior scholars-may become the potential victims of devices that facilitate monitoring and surveillance at the heart of scientific activity.
All of these elements have one point in common: the (sole) emphasis on scholarly publications as the criterion for research excellence. It seems that, along the way, we have forgotten what drives us, as researchers, to do what we do, and why our societies have chosen to support us in this endeavor: to discover new things. We have replaced a "taste for science" with a "taste for publication" (Osterloh & Frey, 2014). As Gingras (2018) argues, the meaning of scholarly publications has changed from a unit of (new) knowledge produced to an accounting-or accountability-unit. Directly related to CRediT, "the systematic description of contributions leads toward accounting management for scientific activity.
[…] As a divisible, accounting unit, each scientific act may even be associated with a specific amount" (Pontille, 2016: 122). In this way, contributorship does not dismantle performance-based rewards (Debackere & Glänzel, 2004;Sivertsen, 2010), but rather brings greater precision to the accounting.