Abstract

Quantile regression presents a complete picture of the effects on the location, scale, and shape of the dependent variable's distribution at all quantiles, not just the mean. We focus on two challenges for citation count analysis by quantile regression: discontinuity and substantial mass points at lower counts. King and Song (2019) proposed a Bayesian hurdle quantile regression model for count data with a substantial mass point at zero. It uses quantile regression to model the nonzero data and logistic regression to model the probability of zeros versus nonzeros. We show that substantial mass points at low citation counts will almost certainly also affect parameter estimation in the quantile regression part of the model, much as a mass point at zero does. We update the King and Song model by shifting the hurdle point past the main mass points. The updated model delivers more accurate quantile regression for moderately to highly cited articles, especially at quantiles corresponding to values just beyond the mass points, and enables estimates of the extent to which factors influence the chances that an article will receive few citations. To illustrate the potential of this method, we apply it to simulated citation counts and to data from Scopus.
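
The two-part structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Bayesian model of King and Song (2019) or the updated model: the function names are hypothetical, the convention that counts at or below the hurdle form the "low" group is assumed, jittering with uniform noise is assumed as one common way to handle the discontinuity of counts, and the quantile part is reduced to an intercept-only fit that minimizes the check loss.

```python
import random


def split_at_hurdle(counts, hurdle):
    """Split counts into the two parts of a hurdle model.

    Assumption: counts <= hurdle form the "low" group. is_low is the binary
    response for the logistic part; above is the response for the
    quantile-regression part.
    """
    is_low = [1 if c <= hurdle else 0 for c in counts]
    above = [c for c in counts if c > hurdle]
    return is_low, above


def jitter(values, rng):
    """Add U[0, 1) noise so that discrete counts become continuous (assumed device)."""
    return [v + rng.random() for v in values]


def check_loss(residual, tau):
    """Quantile-regression check (pinball) loss for a single residual."""
    return residual * (tau - (1 if residual < 0 else 0))


def intercept_only_quantile(values, tau):
    """Return the element of values that minimizes the total check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))


# Hypothetical citation counts with mass points at 0, 1, and 2.
counts = [0, 0, 1, 1, 2, 5, 9, 14, 30, 41]
is_low, above = split_at_hurdle(counts, hurdle=2)
share_low = sum(is_low) / len(is_low)  # intercept-only analogue of the logistic part
median_above = intercept_only_quantile(above, tau=0.5)
smoothed = jitter(above, random.Random(0))  # continuous version of the above-hurdle counts
```

With covariates, the proportion would be replaced by a logistic regression on `is_low` and the intercept-only fit by a quantile regression on the above-hurdle counts; moving `hurdle` from 0 to a larger value is what shifts the mass points out of the quantile part.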

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.