The Coevolution of Costly Heterogeneities and Cooperation in the Prisoner's Dilemma Game

This paper discusses the co-evolution of social strategies and an efficiency trait in spatial evolutionary games. The continuous efficiency trait determines how well a player can convert gains from a prisoner's dilemma game into evolutionary fitness. It is assumed to come at a cost proportional to its magnitude and this cost is deducted from payoff. We demonstrate that cost ranges exist such that the regime in which cooperation can persist is strongly extended by the co-evolution of efficiencies and strategies. We find that cooperation typically associates with large efficiencies while defection tends to pair with lower efficiencies. The simulations highlight that social dilemma situations in structured populations can be resolved in a natural way: the nature of the dilemma itself leads to differential pressures for efficiency improvement in cooperator and defector populations. Cooperators benefit by larger improvements which allow them to survive even in the face of inferior performance in the social dilemma. Importantly, the mechanism is possible with and without the presence of noise in the evolutionary replication process.

Models of the evolution of cooperation often build on the paradigmatic scenario described in the prisoner's dilemma.
In the simple one-off game two players are confronted with a simultaneous choice between two pure strategies, frequently labelled "C" (for cooperate) and "D" (for defect). Depending on the combination of choices, payoffs from the game are as follows. Mutual cooperation is rewarded with a payoff of R for both players; a player who plays "D" against "C" receives the temptation to defect T while the cooperator is paid the sucker's payoff S; and mutual defection results in a payment of P for both players. For the prisoner's dilemma the ranking of payoffs is T > R > P > S and 2R > T + S, such that the optimal choice for an individual who wants to maximize its own game outcome is always "D" while "C" is the optimal choice of a central planner interested in the good of the group.
A common explanation for the sustainability of cooperative strategies assumes positive assortment, such that strategies of the same type interact more often than they would in a well-mixed population, cf., e.g., (Eshel and Cavalli-Sforza, 1983; Nowak and May, 1992). Such positive assortment can be facilitated by 'network reciprocity' in structured populations (Nowak, 2006; Szabó and Fath, 2007). Especially since the classification of prototypical network structures, like scale-free and small-world type networks, evolutionary game theory in structured populations has attracted growing interest. An important discovery in this line of research has been that cooperative strategies can receive a strong boost in populations that are coupled by very heterogeneous networks (Santos et al., 2006b), but note the role of game participation costs in this effect (Masuda, 2007; Tanimoto and Yamauchi, 2010). Later work clarified that other types of heterogeneity, e.g. in the abilities of players to generate payoff (Perc and Szolnoki, 2008; Brede, 2011a) or in differing abilities of players to pass on strategies or adapt to neighbours (Szolnoki and Szabó, 2007; Wang and Perc, 2010; Perc and Wang, 2010; Tanimoto and Yamauchi, 2012), can give similar support for cooperation, even if the network of social interactions is regular. Some recent studies have started to focus on the question of how heterogeneity and game strategies can co-evolve, see (Perc and Szolnoki, 2010) for a review. The most prominent approach in the field is probably to study adaptive networks in which social interactions change at a timescale similar to that of the evolution of game strategies (Zimmermann and Eguíluz, 2005; Santos et al., 2006a; Van Segbroek et al., 2008; Cao et al., 2011). A crucial assumption in these models is that agents have the cognitive abilities to break off undesirable ties.
Other models have considered the co-evolution of slow and fast strategy pass, e.g. via considering age-dependent abilities of agents (Wang et al., 2012), the co-evolution of performance evaluation rules (Brede, 2013b) or reinforcement of the position of abilities of agents who successfully passed on their strategies in past interactions (Szolnoki and Perc, 2008; Szolnoki et al., 2010; Zhang et al., 2010). As noted in (Brede, 2013a), common to these approaches is an assumption of a dynamics similar to Hebbian learning (Hebb, 1949): successful interactions become stronger while unsuccessful interactions tend to decline in frequency. Whilst such processes based on Hebbian learning may be reasonable models of social interactions in many contexts, they still rest on ad-hoc assumptions (i.e. those of a Hebbian-like dynamics of system structure, or abilities to break unprofitable links (Van Segbroek et al., 2008), or some mechanism to influence group formation (Powers et al., 2011)) and do not provide a purely evolutionary framework that describes the co-evolution of system structure and social strategies via the same mechanism of evolution. Further, many of these models can only support cooperation if additionally constrained: in adaptive network models connectivities are typically held constant, in the ageing-based models maximum ages are imposed, and in the reinforcement model the rate of reinforcement is found to be required to be within a certain range for optimal support of cooperation.
A recent paper addresses this gap and proposes a model in which traits of slow and fast strategy pass of agents can co-evolve with social strategies to support cooperation (Brede, 2013a). The paper proposes a framework in which agents can enhance their abilities to pass on strategies, albeit at a cost. Considering the binary options of 'advertising' (i.e. investing in fast strategy spread at a cost) or not advertising (i.e. normal strategy spread at no cost), the study demonstrates that cost regimes exist such that cooperation can associate with costly fast strategy spread while this is not viable for defection. It is easy to understand why this is the case: in comparison to defectors, cooperators benefit from an investment to surround themselves with like types and thus they can afford to invest more in costly strategy pass than defectors. Hence, if strategy pass is costly enough, the usual competition between cooperators and defectors is replaced by a competition between fast spreading cooperators and slow spreading defectors, resulting in an evolutionary benefit to the former. The model of (Brede, 2013a) considers a binary choice (i.e. advertise or don't advertise). Further, the cooperation-supporting dynamics of (Brede, 2013a) relies on the crucial assumption of noise in strategy replication, without which the mechanism of costly advertising cannot operate, and the model is very sensitive to assumptions about joint inheritance of the advertising and the social strategies.
In this paper we consider a slightly altered modelling framework and illustrate that a co-evolutionary dynamics of agents whose abilities to generate payoff from the game are enhanced to different extents can support cooperation even without these key ingredients of (Brede, 2013a), i.e. without the assumption of binary strategies and without the assumption of noise in strategy replication.

Model
Consider a spatially distributed population of N = L × L agents that interact with their von Neumann neighbours. Agents are characterized by social strategies s ∈ {C, D} which they employ when playing a one-off prisoner's dilemma game with their neighbours. The game is parametrized in the conventional way via R = 1, T = 1 + r, S = −r and P = 0. As usual, the parameter r ∈ (0, 1) characterises the dilemma strength.
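As an illustration of this parametrization, the following minimal Python sketch checks the prisoner's dilemma conditions for a given r and accumulates the payoff of a single site against its four von Neumann neighbours on a torus. The function names and the list-of-lists lattice representation are assumptions made for this example only and are not taken from the paper.

```python
def pd_payoffs(r):
    """Payoffs (R, T, S, P) of the one-parameter prisoner's dilemma."""
    return 1.0, 1.0 + r, -r, 0.0


def is_prisoners_dilemma(R, T, S, P):
    """Check the ranking T > R > P > S and the condition 2R > T + S."""
    return T > R > P > S and 2 * R > T + S


def site_payoff(grid, x, y, r):
    """Accumulated payoff of site (x, y) on an L x L torus of 'C'/'D' entries."""
    R, T, S, P = pd_payoffs(r)
    L = len(grid)
    total = 0.0
    # von Neumann neighbourhood with periodic boundaries
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        other = grid[(x + dx) % L][(y + dy) % L]
        if grid[x][y] == "C":
            total += R if other == "C" else S
        else:
            total += T if other == "C" else P
    return total


# A lone defector in a sea of cooperators on a 5 x 5 torus, r = 0.1.
grid = [["C"] * 5 for _ in range(5)]
grid[2][2] = "D"
print(site_payoff(grid, 2, 2, 0.1))  # the defector collects T from each neighbour
print(site_payoff(grid, 2, 3, 0.1))  # an adjacent cooperator collects 3R + S
```

For any r ∈ (0, 1) the check returns True, since T = 1 + r > R = 1 > P = 0 > S = −r and 2R = 2 > T + S = 1.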
Every round an agent i earns payoff $\pi_i$ from interactions with all four spatial neighbours. Further, every agent i is characterised by a trait $\varepsilon_i$ that determines the efficiency with which it can convert payoff gleaned from the game into evolutionary fitness $f_i$ such that

$$f_i = (1 + b\,\varepsilon_i)(\pi_i - c\,\varepsilon_i). \tag{1}$$

The motivation for Eq. (1) is that every agent has a default mechanism to convert payoff into fitness at unit rate (represented by the term $\pi_i$ in the expansion of (1)). However, after playing the game it can also invest an amount $c\,\varepsilon_i$ into higher efficiency conversion. Hence, after playing the PD game, a cost $c\,\varepsilon_i$ is deducted from game payoff and then the remaining payoff is converted into fitness. An alternative model is that the cost of efficiency improvements is deducted after payoff is converted into fitness, resulting in

$$f_i = (1 + b\,\varepsilon_i)\,\pi_i - c\,\varepsilon_i.$$

Both models result in qualitatively similar dynamics and we focus on the first choice in this paper. In our model the trait $\varepsilon_i$ represents the 'biological machinery' to make use of payoff from the game, b measures its efficiency, and c is the cost (per unit of ε) to maintain it. In a biological context the cost of enhanced efficiencies could be seen as a cost to maintain a certain body mass; in a social context it might be associated with the maintenance of equipment or a cost to acquire certain skills. The assumption of a linear relationship between the size of the trait and the cost is made for simplicity. In general it might be more reasonable to assume a different monotonic non-linear relationship, but this will not alter qualitative results.
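The two conversion rules can be compared directly in code. The sketch below assumes that Eq. (1) reads f = (1 + bε)(π − cε), as the surrounding description suggests, and that the alternative rule reads f = (1 + bε)π − cε. The function names and default parameter values are illustrative assumptions.

```python
def fitness_cost_before(payoff, eps, b=2.0, c=1.0):
    """Assumed form of Eq. (1): the cost c*eps is deducted from the game
    payoff, and the remainder is converted into fitness at rate 1 + b*eps."""
    return (1.0 + b * eps) * (payoff - c * eps)


def fitness_cost_after(payoff, eps, b=2.0, c=1.0):
    """Assumed alternative rule: the payoff is converted first and the
    cost c*eps is deducted from the resulting fitness."""
    return (1.0 + b * eps) * payoff - c * eps


# A cooperator surrounded by four cooperators earns payoff 4.
for eps in (0.0, 0.5, 1.0):
    print(eps, fitness_cost_before(4.0, eps), fitness_cost_after(4.0, eps))
```

For ε = 0 both rules reduce to the default unit-rate conversion f = π. They differ by the term bcε², which is the extra penalty incurred when the cost is paid before conversion.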
We then carry out evolutionary simulations based on the following protocol:
• The lattice is seeded with random initial conditions, i.e. with probability 1/2 agents are assigned the social strategy C and with probability 1/2 social strategy D. Agents are also initialized with efficiency traits selected uniformly at random from the interval [0, 1].
• A focus agent i is picked at random from the population of N agents and one of its neighbours is selected at random as a reference agent j.
• The payoffs $\pi_i$ and $\pi_j$ of the focus and reference agents are evaluated and converted into evolutionary fitness according to Eq. (1).
• With probability

$$p(i \leftarrow j) = \frac{1}{1 + \exp\left[(f_i - f_j)/\kappa\right]} \tag{2}$$

the focus agent i will adopt the reference agent j's traits, i.e. the social strategy $s_j$ and the efficiency trait $\varepsilon_j$ of j. As introduced in (Szabó and Toke, 1998), the parameter κ in (2) gives the noise level in the process of strategy spread. For κ = 0 superior performers always replace worse performers; if κ > 0, less successful strategies also have an occasional chance to invade a neighbour's place. Note that the noise parameter is of importance to contrast the present results with those of (Brede, 2013a), because the results of (Brede, 2013a) are not robust in the limit κ → 0 of (up to neighbour selection) deterministic updating.
• The process of game play and replication is iterated until a quasistationary state is reached, and then average frequencies of cooperators and defectors and equilibrium averages over the efficiency trait are calculated from a sufficient number of further iterations.
• The entire experiment is then repeated a sufficient number of times to evaluate from how many random initial conditions cooperation could evolve.
Numerical results presented below are generally obtained from simulations on 200 × 200 tori with b = 2 and have been repeated at least 50 times to obtain estimates of the frequency of situations in which cooperation can arise.
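The protocol above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: it assumes Eq. (1) in the form f = (1 + bε)(π − cε), assumes the Fermi rule of Szabó and Toke for the adoption probability, treats κ = 0 as the deterministic limit of that rule (with probability 1/2 for exact ties), and uses a small default lattice to keep the run time short. All function names are hypothetical.

```python
import math
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # von Neumann neighbourhood


def payoff(strat, x, y, r):
    """Payoff of site (x, y) against its four neighbours on a torus."""
    L = len(strat)
    total = 0.0
    for dx, dy in NEIGHBOURS:
        other = strat[(x + dx) % L][(y + dy) % L]
        if strat[x][y] == "C":
            total += 1.0 if other == "C" else -r        # R or S
        else:
            total += (1.0 + r) if other == "C" else 0.0  # T or P
    return total


def fitness(pi, eps, b, c):
    """Assumed form of Eq. (1): cost deducted before conversion."""
    return (1.0 + b * eps) * (pi - c * eps)


def adoption_probability(f_i, f_j, kappa):
    """Fermi rule; kappa = 0 is treated as its deterministic limit."""
    if kappa == 0:
        return 1.0 if f_j > f_i else (0.0 if f_j < f_i else 0.5)
    z = (f_i - f_j) / kappa
    if z > 700.0:  # math.exp would overflow; the probability is ~0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def simulate(L=20, steps=20000, r=0.1, b=2.0, c=1.0, kappa=0.0, seed=0):
    """Run asynchronous updates and return (cooperator fraction, strat, eps)."""
    rng = random.Random(seed)
    # random initial conditions: C or D with probability 1/2, eps uniform in [0, 1]
    strat = [[rng.choice("CD") for _ in range(L)] for _ in range(L)]
    eps = [[rng.random() for _ in range(L)] for _ in range(L)]
    for _ in range(steps):
        x, y = rng.randrange(L), rng.randrange(L)   # focus agent i
        dx, dy = rng.choice(NEIGHBOURS)             # reference agent j
        u, v = (x + dx) % L, (y + dy) % L
        f_i = fitness(payoff(strat, x, y, r), eps[x][y], b, c)
        f_j = fitness(payoff(strat, u, v, r), eps[u][v], b, c)
        if rng.random() < adoption_probability(f_i, f_j, kappa):
            # both traits are copied together
            strat[x][y], eps[x][y] = strat[u][v], eps[u][v]
    n_c = sum(row.count("C") for row in strat) / (L * L)
    return n_c, strat, eps


if __name__ == "__main__":
    n_c, _, _ = simulate()
    print("final cooperator fraction:", n_c)
```

A single short run on a small lattice is only indicative. Reproducing the averages reported below would require the 200 × 200 lattice, long relaxation times and repeated runs from independent initial conditions.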

Results
This section describes and analyses numerical results obtained by simulations of the model introduced above. Figure 1 compares average trajectories of the co-evolution of the efficiency trait and social strategies for a case in which the efficiency trait is costly and another in which it comes for free. Both scenarios are for the case of noiseless replication, κ = 0. Notably, if efficiency is not costly, both cooperators (open boxes) and defectors (filled boxes) evolve to maximize efficiencies. Asymptotically, this results in a homogeneous system in which payoffs are scaled by a factor 1 + b and cooperative strategies cannot survive even for very low dilemma toughness (r = 0.01 in this case). Interestingly, however, one also notes that in the initial stages of the dynamics average efficiencies of cooperators grow faster than those of defectors. The reason for this is simple: as extinction pressures are larger on cooperators than on defectors, the evolutionary pressure on inefficient cooperators is also larger than on inefficient defectors (which, if favourably positioned, can occasionally generate more fitness than efficient defectors at less favourable locations). The delayed saturation of efficiencies of defectors and cooperators leads to a dynamics that is different from the usual evolution in the one-off game (cf. the filled circles in Fig. 1). Cooperators can initially gain an advantage by associating with larger efficiencies than defectors and hence they can recover from the initial decline which is caused by the assortment dynamics after starting from random initial conditions. However, the recovery of cooperation is stopped as defectors evolve towards saturation in the efficiency trait, and cooperators become extinct when a homogeneous state with ε = 1 is reached in the entire population.
The bottom panel of Fig. 1 contrasts the co-evolution of social strategies and efficiencies for c = 1 and a much more severe dilemma setting with r = 0.1 to the above scenario of a free efficiency trait with c = 0. For a better visual understanding, some snapshots of the evolution which correspond to important stages of the dynamics are illustrated in Fig. 2. One first notices the difference in the asymptotic states: for costly efficiencies cooperation can survive in a regime far beyond dilemma strengths which typically support cooperation in the spatial game with κ = 0. Concomitantly, cooperators associate with a saturated efficiency whereas defection is typically paired with much lower values of the efficiency trait. The course of the evolution is also different from the scenario with a free efficiency trait and proceeds in several stages. First, as is typical in the evolution of cooperation in spatial games, when strategies are randomly mixed cooperators are easily exploited by defectors and hence cooperation declines until assortment of like strategies is reached. In the process only cooperators with large efficiencies survive in small islands (Fig. 2, top right, which corresponds to the minimum in the number of cooperators in Fig. 1) in a sea of defectors. Within the sea of defectors low efficiency investments are favoured (as there is hardly any game payoff to be leveraged) and large efficiency defectors only survive in very small numbers when attaching to clusters of cooperators. In a second stage, large efficiency cooperators can expand into the sea of low efficiency defectors, conquering a very large share of the entire system (Fig. 2, bottom right, corresponding to the maximum in cooperation in Fig. 1). With some delay this allows large efficiency defectors to expand, and eventually a stationary balance of ordered arrangements of large and low efficiency defectors and large efficiency cooperators is reached (Fig. 2, bottom left).
These initial simulations illustrate an important point: a co-evolutionary dynamics of costly efficiencies and social strategies can allow cooperation to survive far beyond the regime normally supported by network reciprocity. The origin of the support mechanism is that evolution favours efficiency enhancements in clusters of cooperators. Since cooperators benefit from surrounding themselves with like strategies, paying a cost to surround themselves with other cooperators is an evolutionarily viable strategy that outcompetes the cooperate strategy that does not invest into efficiency enhancements. For defectors the situation is different. When not in contact with cooperators, defectors which invest into efficiency enhancements are outcompeted by defectors who don't. However, only efficient defectors manage to penetrate clusters of efficient cooperators, and thus a cyclic dominance (efficient cooperators beat inefficient defectors, but are beaten by efficient defectors, who are in turn outcompeted by inefficient defectors) similar to rock-paper-scissors (Szolnoki and Szabo, 2004), volunteering (Szabo and Hauert, 2002) or the advertising game of (Brede, 2013a) is created. As one would expect, the balance between the three competing strategies can be shifted by modifying the cost parameter. Interestingly, however, in a large cost regime high efficiency defectors can easily be pushed into extinction, and for low frequencies of recurring invasions cooperators can dominate the system over large time periods.
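The different pressures on cooperators and defectors can be made concrete with a rough local calculation. Assuming Eq. (1) in the form f = (1 + bε)(π − cε), full investment (ε = 1) yields more fitness than no investment (ε = 0) at the same payoff π only if (1 + b)(π − c) > π, i.e. if π > c(1 + b)/b. The Python sketch below evaluates this for b = 2, c = 1 and r = 0.1, the parameters of the bottom panel of Fig. 1. The function names are hypothetical, and the comparison ignores the spatial dynamics, so it only indicates the direction of the effect.

```python
def fitness(payoff, eps, b=2.0, c=1.0):
    """Assumed form of Eq. (1): cost deducted before conversion."""
    return (1.0 + b * eps) * (payoff - c * eps)


def investment_pays(payoff, eps=1.0, b=2.0, c=1.0):
    """True if investing eps gives more fitness than eps = 0 at equal payoff."""
    return fitness(payoff, eps, b, c) > fitness(payoff, 0.0, b, c)


r = 0.1
situations = {
    "cooperator inside a cooperator cluster": 4 * 1.0,
    "defector inside the sea of defectors": 0.0,
    "defector with one cooperator neighbour": 1 * (1.0 + r),
    "defector with two cooperator neighbours": 2 * (1.0 + r),
}
for label, pi in situations.items():
    print(label, "| payoff", pi, "| full investment pays:", investment_pays(pi))
```

Under these assumptions the threshold is π > 1.5: cooperators in clusters clear it comfortably, isolated defectors never do, and defectors clear it only while they border enough cooperators to exploit.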
For a more comprehensive investigation, Fig. 3 gives phase diagrams of the dependence of the frequency of cooperators $n_c$ on the dilemma toughness for low, intermediate, and high noise levels in strategy replication. Correspondingly, Fig. 4 gives the dependence of stationary average efficiencies on the dilemma toughness parameter r. Both the $n_c(r)$ curves in Fig. 3 and the $\varepsilon_C(r)$ and $\varepsilon_D(r)$ curves in Fig. 4 are given for various cost assumptions.
For the case of noiseless replication with κ = 0 several sharp transitions can be discerned. First, comparing curves for various cost choices, it is worth noting that cooperation and efficiencies can co-evolve for any cost c > 0. This is illustrated by the first panel in Fig. 3: whereas cooperation dies out for r > 0 for c = 0, cooperation can survive up to around r ≈ 0.45 if a small cost c = 0.0001 is included (in fact, in the limit κ → 0 in Eq. (2) any cost ensures that efficient defectors can be invaded by ε = 0 defectors, thus allowing the cyclical dominance mechanism to operate). As further illustrated in Fig. 5, this is different for κ > 0. The more noise in strategy propagation, the larger the cost required to allow cooperation to survive. On the one hand, larger costs help the evolution of cooperation since they make it easier for inefficient defectors to chase efficient defectors, hence reducing the pressure on efficient cooperators and allowing them to thrive. On the other hand, costs above some threshold make efficiency investments unviable for both cooperators and defectors. As a consequence, a range of costs exists for which cooperation is optimally supported. The dependence of cooperation on the cost of efficiency also includes a transition which demarcates a phase in which efficient defectors typically survive the initial stages of the dynamics from another phase in which they go extinct (cf. Fig. 5). When efficient defectors die out, the cyclical competition is replaced by a competition between efficient cooperators and inefficient defectors in which the former can dominate. Hence, for some cost range a state in which only cooperators survive is reached. This state is marked by homogeneity in agents' efficiencies, and hence it is not stable to the reinvasion of defectors. In fact, including invasions of agents with randomly selected strategies results in large amplitude oscillations between regimes in which cooperation dominates and regimes in which invading defectors can take over large parts of the system.
Second, for κ = 0 the $n_c$ versus r phase diagrams in Fig. 3 and the corresponding ε versus r diagrams in Fig. 4 show a number of sharp transitions in the r-dependencies. Whereas cooperators always evolve into a monochromatic population (not shown), the defector population tends to become separated into groups of defectors with high ($\varepsilon = \varepsilon_C$) and low (ε = 0) efficiency. When the dilemma strength is increased, proportions of low and high efficiency defectors shift, but the group-characteristic efficiency values ε = 0 and $\varepsilon = \varepsilon_C$ remain the same. The first order transitions in the $n_c(r)$ dependencies indicate critical values of the dilemma strength at which sudden shifts in the relative proportions of low and high efficiency defectors take place.

Figure 2: Typical snapshots in the arrangement of cooperators (blue) and defectors (red) at various stages of the co-evolution. Clockwise from top left to bottom left: initial conditions at t = 0, then snapshots at t = 42, t = 120, and the asymptotic state at t = 3000. The intensity of the color of the sites indicates the efficiency trait: dark blue corresponds to cooperators with large ε, light blue to cooperators with low ε, and dark red and light red refer to large and low ε defectors, respectively.
Most notable, in particular for larger costs, is the transition at which high efficiency defectors become extinct (i.e. at which $\varepsilon_D \approx 0$). The effect is similar to what we have discussed for the dependence of $n_c$ on c in Fig. 5 above: without the presence of high efficiency defectors, low efficiency defectors are outcompeted by high efficiency cooperators and the latter can dominate the population. Whereas efficiency investments generally decline with increasing dilemma toughness, maximum efficiencies are favoured by evolution in the now purely cooperative population, owing to the efficiency competition within it.
With the exception of smoother transitions between the various regimes, behaviour similar in principle to the case of κ = 0 is observed for intermediate and high levels of noise. The main difference is that more noise in strategy propagation requires larger costs of the efficiency trait for cooperation to persist.
Last, it is worthwhile examining whether the co-evolutionary mechanism is robust when strategy traits are inherited separately. To investigate this issue we consider an amended model in which the rules for passing on the social strategy s and the efficiency trait ε are modified. If a focus agent copies from a reference agent (i.e. according to Eq. (2)), then with probability $p_d$ only one of the two traits, either the efficiency trait or the social strategy, is imitated. In the opposite case, i.e. with probability $1 - p_d$, both traits are passed on simultaneously. Hence the new parameter $p_d$ quantifies the degree of disjoint strategy pass, with $p_d = 0$ corresponding to the previously considered model and $p_d = 1$ corresponding to completely disjoint strategy pass. Figure 6 illustrates some simulation experiments in which scenarios with $p_d > 0$ were explored. Clearly, disjoint strategy pass equalizes differences in efficiencies between cooperators and defectors, hence reducing support for cooperation. However, in contrast to the advertising game of Brede (2013a), cooperation can persist for rather substantial degrees of disjoint strategy transfer, thus adding robustness to previous results.
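The amended copying rule can be sketched as follows. The text does not state how the single trait is chosen when inheritance is disjoint, so the sketch assumes that the social strategy and the efficiency trait are picked with equal probability. The function name and the tuple representation of an agent are likewise assumptions made for this illustration.

```python
import random


def copy_traits(focus, reference, p_d, rng=random):
    """Return the focus agent's traits after a successful imitation event.

    focus and reference are (strategy, efficiency) tuples. With probability
    p_d only one trait is copied (assumed: each with probability 1/2);
    otherwise both traits are copied together.
    """
    s_i, eps_i = focus
    s_j, eps_j = reference
    if rng.random() < p_d:
        if rng.random() < 0.5:
            return (s_j, eps_i)   # only the social strategy is imitated
        return (s_i, eps_j)       # only the efficiency trait is imitated
    return (s_j, eps_j)           # joint inheritance of both traits


rng = random.Random(0)
print(copy_traits(("D", 0.0), ("C", 1.0), p_d=0.0, rng=rng))  # joint copy
print(copy_traits(("D", 0.0), ("C", 1.0), p_d=1.0, rng=rng))  # one trait only
```

With $p_d = 1$ the rule can create efficient defectors and inefficient cooperators directly from a single imitation event, which is one way to see why disjoint inheritance erodes the efficiency gap between the two strategies.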

Discussion and conclusions
In this paper we have considered a model for the co-evolution of social strategies and an efficiency trait that determines how well agents can convert gains from dilemma games into evolutionary fitness. Through a series of controlled simulation experiments we have demonstrated that the co-evolution of efficiencies and social strategies can add substantial support to cooperation if the efficiency costs are within a certain range. The maximum and minimum costs that demarcate this cost window depend on the noise in strategy replication, with lower noise generally allowing for a larger range of cooperation-supporting costs.
Even though based on a well-known cyclical dominance mechanism that has already been explored elsewhere (Szolnoki and Szabo, 2004; Szabo and Hauert, 2002), the present paper adds some significant extensions to the work of (Brede, 2013a). First, the change in the model from a trait that purely biases strategy spread to a trait that affects payoff generation adds an interesting aspect. The present paper demonstrates that pressures to enhance efficiencies are not the same for cooperators and defectors involved in evolutionary dilemmas on graphs. We demonstrate that the nature of the social game favours the evolution toward higher efficiencies in the cooperator population, and this, in turn, allows cooperation to survive.
Second, one might wonder whether the binary strategies imposed in (Brede, 2013a) constrain stationary states to settings that could not have been reached by the evolution of a continuous trait. We demonstrate here that this is not the case: a continuous efficiency trait can co-evolve with social strategies to support cooperation. The main difference compared to the binary setting is that stationary efficiency levels of the subpopulations self-organize to evolutionarily stable levels.
Third, the model presented in this paper demonstrates that the basic cooperation-supporting mechanism of (Brede, 2013a), i.e. that cooperators can afford to pay more for costly replication than defectors, is in fact more general than originally highlighted in the model based on learning and teaching. We show here that an equivalent mechanism based on costly efficiency improvements can also operate in evolutionary dynamics that are free of noise. For instance, some preliminary simulations indicate that qualitative results are robust for asynchronous updating based on 'imitate the best', for which no cooperation can survive in the standard spatial game (Huberman and Glance, 1993).

Figure 1: Co-evolution of social strategies and efficiency traits for (top) r = 0.01 and cost c = 0 and (bottom) r = 0.1 and cost c = 1 for κ = 0. Average trajectories for the density of cooperators $n_c$ and the average efficiency trait of cooperators $\varepsilon_C$ and defectors $\varepsilon_D$ have been calculated from sampling the stochastic dynamics of the evolution over 1000 independent runs on a 200 × 200 torus. For comparison the figure also contains the average evolution of cooperators in the standard one-off spatial game with κ = 0 and r = 0.01. Note the logarithmic scale for the time domain.

Figure 3: Dependence of the average stationary frequency of cooperators on the dilemma toughness r for noise levels in strategy updating κ = 0 (no noise), κ = 0.1 (low amount of noise), and κ = 1 (large amount of noise). The dependencies are given for a range of cost parameters c for the efficiency trait. Note that for c = 0 and κ = 0 and κ = 0.1 cooperation can only survive for r = 0 (open boxes).

Figure 5: Dependence of cooperation on the cost of efficiencies for r = 0.1 and several levels of noise in strategy propagation.

Figure 6: Dependence of cooperation on the dilemma toughness for various degrees of disjoint strategy pass $p_d$ for c = 0.0001 and κ = 0.