Using a novel combination of methods and data sets from two national funding agency contexts, this study explores whether review sentiment can be used as a reliable proxy for understanding peer reviewer opinions. We measure reviewer opinions via their review sentiments on both specific review subjects and proposals’ overall funding worthiness with three different methods: manual content analysis and two dictionary-based sentiment analysis (SA) algorithms (TextBlob and VADER). We assess how reliably review sentiment detects reviewer opinions by its correlation with review scores and with proposals’ rankings and funding decisions. We find in our samples that review sentiments correlate positively with review scores or rankings, and the correlation is stronger for manually coded than for algorithmic results; manual and algorithmic results are overall correlated across different funding programs, review sections, languages, and agencies, but the correlations are not strong; and manually coded review sentiments can quite accurately predict whether proposals are funded, whereas the two algorithms predict funding success with moderate accuracy. The results suggest that manual analysis of review sentiments can provide a reliable proxy of grant reviewer opinions, whereas the two SA algorithms can be useful only in some specific situations.
Handling Editor: Ludo Waltman
Peer Review: https://publons.com/publon/10.1162/qss_a_00156