Scientific research plays a crucial role in advancing human civilization, thanks to the efforts of a multitude of individual actors. Their behavior is largely driven by individual incentives, both explicit and implicit. In this paper, we propose and validate a multi-agent model to study the complex system of scholarly publishing and investigate the impact of incentives on research output. We use reinforcement learning to make the behavior of the actors optimizable, and we guide their optimization with a reward signal that encodes the incentives. We consider various combinations of incentives and predefined behaviors and analyze their impact on both individual (h-index, impact factor) and overall indexes of research output. Our results suggest that, despite its simplicity, our model captures the main dynamics of the system. Moreover, we find that (a) most incentives tend to favor productivity over quality and (b) incentives related to the perceived reputation of journals tend to result in wasted research effort.