Author self-citations are a controversial phenomenon. Some scholars maintain that they are a normal, even indispensable, part of scientific referencing practice, while others claim that they are frequently an expression of vanity and self-promotion. Citations are the basic data for citation network clustering, an important approach to creating bottom-up, data-driven, global taxonomic systems of research publications. The topical information content of self-citations is therefore of particular interest in this context. Since it is not yet known how author self-citations affect such systems, we study their influence on the topic quality of cluster solutions in a citation network. To do so, we experimentally re-weight self-citation edges, increasing and decreasing their link weights according to self-citation status and strength. As a case study, we investigate data on the field of astronomy and astrophysics. We assess the effects of these manipulations by evaluating the quality of the resulting clustering solutions against diverse external data containing meaningful topic-structural information, namely topical journal special issues, co-usage data from scientific literature database search logs, grant funding data, and intellectual paper-level classification assignments. We find that we can reliably improve clustering solution quality by emphasizing self-citation link weights.
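
The re-weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so everything here is an assumption made for illustration: the function names, the definition of self-citation strength as the share of the citing paper's authors who also appear on the cited paper, and the linear scaling by `factor`. It is not the authors' implementation.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]  # (citing paper id, cited paper id)


def self_citation_strength(citing: FrozenSet[str], cited: FrozenSet[str]) -> float:
    """Hypothetical strength measure: share of citing authors also on the cited paper."""
    if not citing:
        return 0.0
    return len(citing & cited) / len(citing)


def reweight_edges(
    edges: Iterable[Edge],
    authors: Dict[str, FrozenSet[str]],
    factor: float,
) -> Dict[Edge, float]:
    """Assign a weight to every citation edge.

    Assumed scheme: edges with no shared authors keep weight 1.0; edges with
    full author overlap get weight `factor`; partial overlap is interpolated
    linearly. factor > 1 emphasizes self-citations, factor < 1 de-emphasizes them.
    """
    weights: Dict[Edge, float] = {}
    for citing_id, cited_id in edges:
        strength = self_citation_strength(
            authors.get(citing_id, frozenset()),
            authors.get(cited_id, frozenset()),
        )
        weights[(citing_id, cited_id)] = 1.0 + (factor - 1.0) * strength
    return weights


# Toy data: p1 cites p2 (one of two authors shared) and p3 (no shared authors).
authors = {
    "p1": frozenset({"a", "b"}),
    "p2": frozenset({"a"}),
    "p3": frozenset({"c"}),
}
edges = [("p1", "p2"), ("p1", "p3")]

emphasized = reweight_edges(edges, authors, factor=2.0)
deemphasized = reweight_edges(edges, authors, factor=0.5)
```

Weights produced this way could then be passed to any clustering algorithm that accepts edge weights, and the resulting cluster solutions compared against external topic information as the abstract describes.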

Author notes

Handling Editor: Vincent Larivière

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.
