## Abstract

It is well known that cooperation cannot be an evolutionarily stable strategy for a non-iterative game in a well-mixed population. In contrast, structured populations favor cooperation, since cooperators can benefit each other by forming local clusters. Previous studies have shown that scale-free networks strongly promote cooperation. However, little is known about the invasion mechanism of cooperation in scale-free networks. To study microscopic and macroscopic behaviors of cooperators' invasion, we conducted computational experiments on the evolution of cooperation in scale-free networks where, starting from all defectors, cooperators can spontaneously emerge by mutation. Since the evolutionary dynamics are influenced by the definition of fitness, we tested two commonly adopted fitness functions: accumulated payoff and average payoff. Simulation results show that cooperation is strongly enhanced with the accumulated payoff fitness compared to the average payoff fitness. However, the difference between the two functions decreases as the average degree increases. As the average degree increases, cooperation decreases for the accumulated payoff fitness, while it increases for the average payoff fitness. Moreover, for the average payoff fitness, low-degree nodes play a more important role in spreading cooperative strategies than for the accumulated payoff fitness.

## 1 Introduction

The emergence of cooperation is one of the challenging problems in both the biological and the social sciences. Cooperators benefit others by incurring some costs to themselves, while defectors do not pay any costs. Therefore, cooperation cannot be an evolutionarily stable strategy for a non-iterative game in a well-mixed population. This relationship between cooperators and defectors is well parameterized in the prisoner's dilemma game (PD) [3]. In PD, two individuals decide whether to cooperate or defect simultaneously. They both obtain *R* for mutual cooperation or *P* for mutual defection as payoffs. If one selects cooperation and the other selects defection, the former receives *S* for being the “sucker” of the defection, while the latter receives *T* as a reward, which is the “temptation” to defect. The order of the four payoffs is *T* > *R* > *P* > *S* in PD.

Nowak and May were the first to reveal that spatial structure provides a viable mechanism for cooperation to evolve [18]. Recently, spatial structures have been mapped to suitable network topologies, and the evolution of cooperation has been investigated through the analysis of PD played on those network topologies [1, 14, 13, 19, 2, 24, 9, 10]. In this context, the effect of spatial structures required for the emergence of cooperation is referred to as network reciprocity, and it has come to be recognized as another of the enabling mechanisms for the emergence of cooperation [17].

On such spatial structures, cooperators can form clusters and thereby reduce the risk of exploitation by defectors. In particular, it has been reported that scale-free networks strongly promote the evolution of cooperation [24]. If a cluster of cooperators takes over a hub in a scale-free network, the payoffs for these cooperators are considerably higher than those for other individuals, and thus they can spread their cooperative strategy quickly to the entire network. In contrast, if a cluster of defectors takes over a hub, this cluster of defectors is quite vulnerable and is easily replaced by cooperators. These effects explain why cooperation is likely to evolve on scale-free networks.

However, it is still unknown how cooperator-dominated hubs could arise in a society that initially has few cooperators. It was assumed in [24] that the initial state of the social network consisted of half cooperators and half defectors. This setting has since been adopted by other studies on evolutionary models of cooperation, as described in [21, 20]. This assumption does not explain how a single cooperator that emerged by mutation could increase in number and invade a population initially filled with defectors. Such invasion dynamics have already been investigated for square lattice network topologies [11, 7]. In contrast, little is known about such invasion dynamics on scale-free networks, although the fixation probability on scale-free networks has been reported in [19, 12]. Recently, Miller and Knowles studied a model with a similar premise, in which the coevolution of strategy and network growth starting from a very small population is considered [15, 16]. Pinheiro et al. developed a “gradient of selection,” which numerically determines whether cooperation increases or decreases under a given fraction of cooperators (including the case of only one cooperator in a population) [22, 23]. However, the microscopic mechanism of cooperators' invasion is not obvious in those studies, because the analysis focuses only on fitness differences among all individuals without taking into account how and where cooperation invades a network.

Moreover, there is room for further investigation of how the evolution of cooperation depends on the accumulated payoff fitness, which is usually assumed in the previous studies mentioned above. The primary reason that cooperation is strongly promoted in scale-free networks is that cooperative hubs can gain extremely high payoffs compared to the other nodes because such hubs have a large number of connections to other cooperators. This effect of degree heterogeneity disappears if payoffs are averaged. Therefore, in the case that averaged payoffs are used [25, 27, 28, 26] or some costs are incurred to maintain links [13], the evolution of cooperation is strongly inhibited. Thus, the evolutionary dynamics are greatly influenced by the definition of fitness. The key question to be investigated is whether cooperation emerges from such harsh environments and how it spreads into a network, and this must be done by analyzing the microscopic behaviors of strategy evolution.

We performed computer simulations of the evolution of cooperation in scale-free networks where the initial population is all defectors but cooperators can spontaneously arise by mutation. In these simulations, we tested two commonly adopted fitness functions: accumulated payoff and average payoff, while the average degree of nodes was systematically varied. The purpose of this study is to reveal the microscopic and macroscopic behaviors of the cooperators' invasion into the network, and how they differ between the two fitness assumptions.

## 2 Models

We consider the evolutionary dynamics of cooperators' invasion in scale-free networks. The Barabási-Albert (BA) method is used to generate the initial networks in simulations [4]. Each generated network is then substantially randomized by the double-edge swap method, which preserves the original degree distribution, in order to remove artificial network properties that are known to occur in the BA model [5, 6]. Such randomized scale-free networks are also used in [25], and it is known that, in this case, cooperation is slightly inhibited compared to the original BA scale-free network. Self-loops and multiedges are avoided during the randomization.
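The network construction described above can be sketched as follows. This is a minimal pure-Python illustration rather than the code used for the reported simulations: the function names `barabasi_albert` and `double_edge_swap`, the adjacency-set representation, the seed configuration of the growth process, and the number of swaps are all assumptions, since the text states only that the networks were "substantially randomized."

```python
import random

def barabasi_albert(n, m, rng):
    """Grow a BA-style graph: each new node links to m distinct existing
    nodes chosen with probability proportional to their degree."""
    adj = {i: set() for i in range(n)}
    targets = list(range(m))   # assumption: the first new node links to all m seed nodes
    stubs = []                 # every node appears once per edge endpoint
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        stubs.extend(targets)
        stubs.extend([new] * m)
        chosen = set()
        while len(chosen) < m:          # sample m distinct degree-weighted targets
            chosen.add(rng.choice(stubs))
        targets = list(chosen)
    return adj

def double_edge_swap(adj, n_swaps, rng, max_tries=None):
    """Rewire edge pairs (a-b, c-d) into (a-d, c-b) in place.
    Degrees are preserved; self-loops and multiedges are rejected."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    max_tries = max_tries or 100 * n_swaps
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:          # pick the orientation of the second edge at random
            c, d = d, c
        if len({a, b, c, d}) < 4:       # shared endpoint: would create a self-loop or no-op
            continue
        if d in adj[a] or b in adj[c]:  # rewired edge already exists: would be a multiedge
            continue
        adj[a].remove(b); adj[b].remove(a)
        adj[c].remove(d); adj[d].remove(c)
        adj[a].add(d); adj[d].add(a)
        adj[c].add(b); adj[b].add(c)
        edges[i] = (a, d)
        edges[j] = (c, b)
        done += 1
    return done
```

With this growth rule the average degree is approximately 2*m* for large *n*, so, for example, `barabasi_albert(5000, 2, random.Random(0))` gives an average degree close to 4. The swap count passed to `double_edge_swap` is a free choice in this sketch.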

A network is made of *N* nodes occupied by individuals. Each node has its strategy classified as either C (cooperator) or D (defector). Initially, all individuals are defectors. Each node *i* plays the PD game with all of its *k*_{i} neighbors. The payoffs of the game are calculated as follows. Both individuals obtain *R* for mutual cooperation and *P* for mutual defection. If one selects cooperation and the other selects defection, the cooperator obtains *S* as the sucker outcome, and the defector obtains *T* as the reward for temptation to defect. The order of the four payoffs is *T* > *R* > *P* ≥ *S* in typical PD. In the case that *P* = *S*, the game is called the weak prisoner's dilemma. Following previous studies [18, 24], we set *P* = 0, *T* = *b*, *R* = 1, and *S* = 0, where *b* > 1 is the only control parameter. The payoff of individual *i* against its *k*_{i} neighbors is denoted by *p*_{i}. Here we consider two types of *p*_{i}: accumulated payoff and average payoff. The average payoff is obtained by dividing the accumulated payoff by *k*_{i}.
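As a concrete illustration of the two payoff definitions, the following sketch computes *p*_{i} for a single node under the weak PD setting given above (*R* = 1, *T* = *b*, *P* = *S* = 0). The function name `payoff`, the dictionary-based graph representation, and the `"C"`/`"D"` strategy labels are assumptions made for this example only.

```python
def payoff(i, adj, strategy, b, average=False):
    """Payoff of node i against all of its neighbours in the weak PD
    (R = 1, T = b, P = S = 0). Returns the accumulated payoff, or the
    accumulated payoff divided by the degree k_i if average=True."""
    R, S, T, P = 1.0, 0.0, b, 0.0
    total = 0.0
    for j in adj[i]:
        if strategy[i] == "C":
            total += R if strategy[j] == "C" else S
        else:
            total += T if strategy[j] == "C" else P
    if average:
        # An isolated node has no games to average over; treat its payoff as 0.
        return total / len(adj[i]) if adj[i] else 0.0
    return total

# A cooperating hub (node 0) with three cooperating leaves and one defecting leaf.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
strategy = {0: "C", 1: "C", 2: "C", 3: "C", 4: "D"}
hub_acc = payoff(0, adj, strategy, b=1.5)                  # 3.0
hub_avg = payoff(0, adj, strategy, b=1.5, average=True)    # 0.75
leaf_d = payoff(4, adj, strategy, b=1.5)                   # 1.5 under either definition
```

In this small example the cooperating hub outscores the defecting leaf under the accumulated payoff (3.0 versus 1.5) but not under the average payoff (0.75 versus 1.5), which is the degree-heterogeneity effect that the two fitness definitions are meant to contrast.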

We assume an asynchronous updating in our model as used in other recent models of evolutionary games, where the following operations are applied to each individual. At the beginning of each operation, one randomly selected individual *x* plays PD with its neighbors and obtains payoff *p*_{x}. Next, one randomly chosen neighbor of *x*, denoted by *y*, also plays PD with its neighbors and obtains payoff *p*_{y}. If *p*_{x} < *p*_{y}, individual *x* imitates individual *y*'s strategy with probability (*p*_{y} − *p*_{x})/[(*T* − *S*)*k*_{max}] [24], where *k*_{max} = max(*k*_{x}, *k*_{y}), for the accumulated payoff condition, and with probability (*p*_{y} − *p*_{x})/(*T* − *S*) for the average payoff condition. *k*_{max} is used for normalization to make the probability less than or equal to 1. Finally, another randomly selected individual *z* (which might be the same as *x* or *y*) flips its strategy (C will become D and D will become C) by mutation with probability *m*. These operations constitute one time step (usually called the “Monte Carlo step” [21]).

We regard *N* time steps as one generation, in which all individuals are selected once, on average, for the strategy update and mutation.
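The update rule and the definition of a generation can be sketched as below. This is an illustrative reading of the description above under the weak PD setting (*T* − *S* = *b*), not a reproduction of the original simulation code; the names `imitation_probability`, `monte_carlo_step`, and `run_generation` are hypothetical, and the sketch assumes every node has at least one neighbor.

```python
import random

def payoff(i, adj, strat, b, average):
    # Weak PD (R = 1, T = b, P = S = 0): only cooperating neighbours yield payoff.
    n_coop = sum(1 for j in adj[i] if strat[j] == "C")
    total = n_coop * (1.0 if strat[i] == "C" else b)
    return total / len(adj[i]) if average else total

def imitation_probability(px, py, kx, ky, b, average):
    """Probability that x copies y. T - S = b in the weak PD; the accumulated
    payoff condition is additionally normalised by k_max = max(kx, ky)."""
    if px >= py:
        return 0.0
    norm = b if average else b * max(kx, ky)
    return (py - px) / norm

def monte_carlo_step(adj, strat, nodes, b, m, average, rng):
    # Strategy update: random x compares itself with one random neighbour y.
    x = rng.choice(nodes)
    y = rng.choice(sorted(adj[x]))
    px = payoff(x, adj, strat, b, average)
    py = payoff(y, adj, strat, b, average)
    prob = imitation_probability(px, py, len(adj[x]), len(adj[y]), b, average)
    if rng.random() < prob:
        strat[x] = strat[y]
    # Mutation: an independently chosen node z flips with probability m.
    z = rng.choice(nodes)
    if rng.random() < m:
        strat[z] = "D" if strat[z] == "C" else "C"

def run_generation(adj, strat, b, m, average, rng):
    # One generation = N time steps, so each node is selected once on average.
    nodes = list(adj)
    for _ in range(len(nodes)):
        monte_carlo_step(adj, strat, nodes, b, m, average, rng)
```

Starting from `strat = {i: "D" for i in adj}` and calling `run_generation` repeatedly with *m* > 0 mirrors the all-defector initial condition described in the text. With *m* = 0 the all-defector state never changes, because every payoff is zero and no imitation can occur.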

## 3 Results

### 3.1 Macroscopic Dynamics of the Cooperators' Invasion

First, we focus on the macroscopic dynamics of the cooperators' invasion. We compared simulation results for the two fitness conditions (accumulated and average payoff fitnesses). For each fitness condition, we conducted simulations by varying two parameters: the temptation to defect (*b*, from 1.1 to 2.0) and the average node degree (⟨*k*⟩, from 4 to 16). The same tendency of the results holds for the larger ⟨*k*⟩ as for the smaller values. The other parameters used in the simulations were *N* = 5,000 (population size) and *m* = 0.005 (mutation probability). Results are shown in Figure 1, which clearly shows the effects of *b* and ⟨*k*⟩, as well as the difference between the two fitness conditions.

Basically, cooperation is greatly enhanced in the accumulated payoff fitness condition in comparison with the average payoff fitness condition, as previous studies have reported [24, 25].^{1} This is because cooperative hubs can gain much greater payoffs than others by having a large number of connections to other cooperators, as long as the temptation to defect, *b*, is not too large. However, our results show that, if *b* is large (e.g., *b* > 1.8, ), cooperators may not be able to occupy hub nodes and therefore cooperation is not promoted any more.

Moreover, in the accumulated payoff fitness condition, cooperation *decreases* as the average degree increases, which disagrees with what was originally reported in [24]. This disagreement is due to the difference in model settings: In the model used in [24], the initial population was filled with half cooperators and half defectors, so having a higher average degree made it easier for cooperators to form a cluster initially by themselves. Once such clusters are formed, they hardly ever collapse, because there is no mutation in [24]. In contrast, our model assumes that the initial condition is full of defectors, and that cooperators appear only by mutation. In this model setting, it is easier for cooperative clusters to form if the average degree is low, because a cooperator will be connected to fewer defectors. We also checked the usual case in which the initial population is filled with half cooperators and half defectors and found that the differences among the average degrees are weakened in such a case (results not shown).

In contrast, cooperation *increases* as the average degree increases in the average payoff fitness condition. Therefore, the difference between the two functions decreases as the average degree increases. This is because the probability of a rare cluster of cooperators being able to connect to other cooperators increases as the average degree increases. In that case, the chance for cooperators to survive increases. Note that this situation happens only when cooperation is not dominant, as in the initial stage of invasion. Once cooperation prevails, defectors can exploit cooperators more as the average degree increases, because the probability of being connected to cooperators increases in such a situation. Then, the situation asymptotically approaches a well-mixed population, which is harmful for cooperation.

In brief, our results showed an interesting difference between the two fitness conditions, in the effect of average node degrees. These findings rely on our unique model settings in which cooperators are initially nonexistent and they spontaneously arise by mutation.

### 3.2 Microscopic Dynamics of the Cooperators' Invasion

Next, we investigate the microscopic dynamics of the cooperators' invasion. We used a fixed average degree and *b* = 1.2 as a representative parameter setting throughout this subsection, because the general trend was consistent even if they were varied. Figure 2 shows histograms of strategy propagation events, plotted over the degrees of source and destination nodes. Cooperators spontaneously emerged by mutation and tried to invade a network of defectors. In that phase, if clusters of cooperators are formed by chance, the invasion is likely to succeed. This situation happens in the first 300–400 generations, as shown in Figure 3. After that, only minor changes take place. Therefore, in each case, the first 500 generations of 10 replicate simulations were recorded, because the first 500 generations are enough to observe the invasion of cooperation.

As seen in Figure 2a and b, only lower-degree nodes can change higher-degree nodes' strategies in the average payoff fitness condition. As a result, cooperation tends to spread more frequently from lower-degree nodes in the average payoff fitness condition (Figure 2a) than in the accumulated payoff fitness condition (Figure 2c). This is because the benefit of being hubs for cooperators disappears in the average payoff fitness condition, as discussed above. In contrast, a relatively wide range of node degrees can cause a change in the strategy of neighbors although the strategy of hub nodes cannot be changed in the accumulated payoff fitness condition (Figure 2c and d).

To investigate the effects of the local environment on the propagation of cooperation, we also plotted histograms of strategy propagation events against the degree of the source node and its neighbors' state ratio (1 = fully cooperative neighborhood, 0 = fully defective neighborhood) in Figure 4. In the accumulated payoff fitness condition, a node of any degree can, in general, change its neighbors' strategy. Moreover, as the neighbors' state ratio becomes greater, the frequency of strategy change tends to be greater, because a high cooperation ratio contributes to raising the fitness of the source node. In contrast, in the average payoff fitness condition, for a given state ratio of the neighbors, only nodes of low degree tend to cause a change in the strategy of their neighbors, as shown in Figure 4a. In the case of high source node degrees in Figure 4a, cooperators are easily invaded by defectors, because defectors can obtain a high payoff from such a high ratio of cooperators, so that there are no defectors with a high ratio of cooperating neighbors.
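One way such propagation events might be tallied is sketched below. The bookkeeping is an assumption made for illustration: the function names, the use of `collections.Counter`, and the choice of ten equal-width bins for the neighbors' state ratio are not taken from the text, which does not specify how the histograms in Figures 2 and 4 were binned.

```python
from collections import Counter

def coop_neighbour_ratio(i, adj, strat):
    """Fraction of node i's neighbours that cooperate
    (1 = fully cooperative neighbourhood, 0 = fully defective)."""
    nbrs = adj[i]
    return sum(1 for j in nbrs if strat[j] == "C") / len(nbrs)

def record_propagation(src, dst, adj, strat, by_degree, by_ratio, n_bins=10):
    """Record one event in which dst imitated src's strategy.
    by_degree is keyed by (source degree, destination degree);
    by_ratio is keyed by (source degree, bin of the source's neighbour ratio)."""
    k_src, k_dst = len(adj[src]), len(adj[dst])
    by_degree[(k_src, k_dst)] += 1
    ratio = coop_neighbour_ratio(src, adj, strat)
    bin_idx = min(int(ratio * n_bins), n_bins - 1)   # ratio == 1.0 falls in the last bin
    by_ratio[(k_src, bin_idx)] += 1

# Example: a hub with three cooperating leaves and one defecting leaf.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
strat = {0: "C", 1: "C", 2: "C", 3: "C", 4: "D"}
by_degree, by_ratio = Counter(), Counter()
record_propagation(0, 4, adj, strat, by_degree, by_ratio)   # hub passes C to leaf 4
```

In a full simulation, `record_propagation` would be called just before each successful imitation, and separate counter pairs would presumably be kept for cooperation and defection events so that the two can be plotted in separate panels.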

### 3.3 Case of the Donation Game

In all the simulations above, we adopted a *weak* PD setting where the sucker's payoff (*S*) is equal to the punishment (*P*), which is a common assumption made in many earlier studies (e.g., [18, 24]). However, this assumption does not create an incentive for cooperators to switch their strategy to defection when they play a game with defectors.

In order to test the robustness of our findings in a *strong* PD setting with *T* > *R* > *P* > *S* and 2*R* > *T* + *S*, we conducted another set of simulations using the *donation game* model. The donation game is a special class of strong PD in which each cooperator provides a benefit *b* to the other player while incurring a cost *c* to itself, with 0 < *c* < *b*. Thus, the payoff structure of the donation game is given by *T* = *b*, *R* = *b* − *c*, *P* = 0, and *S* = −*c* [8]. For simplicity, we varied just one parameter, *b*, from 1 to 2, while letting *c* = *b* − 1. Figure 5 shows the fraction of cooperators on scale-free networks in the simulations of the donation game. We find that cooperation is greatly inhibited in the donation game compared to the weak PD. Cooperation drops sharply in both fitness conditions, and the same relation between the average and accumulated payoff fitness conditions as in the weak PD can be seen only in the low range of *b* (*b* = 1.1). In the weak PD, the payoffs *P* and *S* are equal by definition, so cooperation is not greatly disadvantaged against defection. In contrast, in the donation game, defectors can easily exploit cooperators. Thus, cooperation can survive only within a limited range of *b*.
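The payoff structure of the donation game and the strong PD conditions stated above can be checked with a short sketch. The function names `donation_game_payoffs` and `is_strong_pd` are hypothetical and used only for this illustration.

```python
def donation_game_payoffs(b, c):
    """Payoffs of the donation game: a cooperator gives benefit b at cost c."""
    if not 0 < c < b:
        raise ValueError("the donation game requires 0 < c < b")
    return {"T": b, "R": b - c, "P": 0.0, "S": -c}

def is_strong_pd(p):
    """Check T > R > P > S together with 2R > T + S."""
    return p["T"] > p["R"] > p["P"] > p["S"] and 2 * p["R"] > p["T"] + p["S"]

# With c = b - 1 as in the text, b = 1.5 gives T = 1.5, R = 1.0, P = 0.0, S = -0.5.
example = donation_game_payoffs(b=1.5, c=0.5)
```

Under the choice *c* = *b* − 1, the reward stays at *R* = 1 as in the weak PD, while the sucker's payoff becomes *S* = 1 − *b*, which is negative for *b* > 1. This is what gives cooperators facing defectors an incentive to switch. Note that the constraint 0 < *c* holds strictly only for *b* > 1, so this sketch rejects the endpoint *b* = 1.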

## 4 Conclusion

It is known that scale-free networks strongly promote cooperation due to their heterogeneity. It is also known that this advantage is mostly lost when average payoffs are adopted. However, little is known about how cooperation can spread from initially rare cooperators in the two cases. Here we show that the fate of cooperation differs greatly between the average and accumulated payoff fitness conditions, depending on the average degree. In general, cooperation is promoted in the accumulated payoff fitness condition, as previously found. However, the difference between accumulated and average payoff decreases as the average degree increases, which is not commonly discussed in the previous studies. More importantly, as the average degree increases, cooperation decreases for the accumulated payoff fitness, while it increases for the average payoff fitness. This implies that the evolution of cooperation on a network depends significantly on how game players are rewarded through their play. Moreover, from the in-depth analysis of microscopic behaviors, we have shown that the relative importance of low-degree nodes for the evolution of cooperation is much higher in the case of the average payoff than in the previously studied case of the accumulated payoff, where hubs are known to play a major role in the propagation of cooperation.

## Acknowledgment

The authors thank Yoshiki Satotani for his comments on this work.

## Note

^{1} However, the final levels of cooperation are lower in our model than in those other models, because there are no cooperators in initial configurations and the clusters of cooperators can collapse due to mutation in our model.

## References

*arXiv*, 1605.02652

## Author notes

Contact author.

Department of Mathematical and Systems Engineering, Shizuoka University, Hamamatsu, 432-8561, Japan. E-mail: ichinose.genki@shizuoka.ac.jp

Center for Collective Dynamics of Complex Systems, Binghamton University, State University of New York, Binghamton, NY 13902-6000, USA. E-mail: sayama@binghamton.edu