Climate action, vaccination resistance and social coordination in pandemics are some of the many social endeavours with uncertain, non-linear and long-term returns. The collective risk dilemma offers an excellent game-theoretical abstraction of such scenarios. In this dilemma, players make stepwise contributions to a public good over a fixed number of rounds and observe their payoff only once the game ends. The non-linearity of returns is modeled through a threshold that determines the risk of collective loss: players receive zero payoff if the collective threshold is not achieved. In an article recently published in the Journal of Simulation Practice and Theory, we introduce a novel population-based learning model in which a group of individuals facing a collective risk dilemma acquire their strategies over time through reinforcement learning while handling different sources of uncertainty. We show that the strategies learned with the model correspond to those observed in behavioral experiments, even in the presence of environmental uncertainty. Furthermore, we confirm that when participants are unsure about when the game will end, agents become more polarized and the number of fair contributions diminishes. The population-based on-line learning framework we propose is general enough to apply to a wide range of collective action problems and to arbitrarily large sets of available policies.
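
To make the payoff structure of the dilemma concrete, here is a minimal Python sketch of a single game. It is an illustration only, not the model from the article: the function name `crd_payoffs`, the endowment of 40, the threshold of 120 and the explicit `risk` probability are all assumptions chosen for the example. With `risk=1.0` the sketch reproduces the rule described above, where missing the threshold leaves every player with zero payoff.

```python
import random


def crd_payoffs(contributions, endowment=40, threshold=120, risk=1.0, rng=None):
    """Payoffs of one collective risk dilemma (illustrative sketch).

    contributions: one list per player holding that player's per-round
    contributions. All default parameter values are assumptions made
    for this example, not values taken from the article.
    """
    rng = rng or random.Random()
    totals = [sum(rounds) for rounds in contributions]
    if any(total > endowment for total in totals):
        raise ValueError("a player contributed more than the endowment")

    # Each player keeps whatever part of the endowment was not contributed.
    remaining = [endowment - total for total in totals]

    # Threshold reached: the collective loss is avoided.
    if sum(totals) >= threshold:
        return remaining

    # Threshold missed: with probability `risk` everyone ends with nothing.
    if rng.random() < risk:
        return [0] * len(remaining)
    return remaining


# Six hypothetical players each contribute 2 per round for 10 rounds.
fair_game = [[2] * 10 for _ in range(6)]
print(crd_payoffs(fair_game))
```

In this hypothetical setup the group total of 120 exactly meets the threshold, so each player keeps the remaining 20. If nobody contributes and `risk=1.0`, every player ends with 0.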