Abstract

We examine a social dilemma that arises with the advancement of technologies such as AI, where technologists can choose between a safe (SAFE) and a risk-taking (UNSAFE) course of development. SAFE is costlier and takes more time to implement than UNSAFE, allowing UNSAFE strategists to claim further significant benefits from reaching supremacy in a given technology. Collectively, SAFE is the preferred choice when the risk is sufficiently high, while UNSAFE is preferred otherwise. Given the advantage of risk-taking behaviour in terms of cost and speed, a social dilemma arises when the risk is not high enough to make SAFE the preferred individual choice, enabling UNSAFE to prevail even when it is not collectively preferred (leading to lower population or social welfare). We show that the range of risk probabilities where the social dilemma arises depends on many factors, the most important of which are the time-scale to reach supremacy in a given domain (i.e. short-term vs long-term AI) and the speed gained by ignoring safety measures. Moreover, given the more complex nature of this scenario, we show that incentives such as reward and punishment (for example, for the purpose of technology regulation) are much more challenging to supply correctly than in cooperation dilemmas such as the Prisoner's Dilemma and the Public Goods Game.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.