Abstract
Simple distributed strategies that modify the behavior of selfish individuals in a manner that enhances cooperation or global efficiency have proved difficult to identify. We consider a network of selfish agents who each optimize their individual utilities by coordinating (or anticoordinating) with their neighbors, to maximize the payoffs from randomly weighted pairwise games. In general, agents will opt for the behavior that is the best compromise (for them) of the many conflicting constraints created by their neighbors, but the attractors of the system as a whole will not maximize total utility. We then consider agents that act as creatures of habit by increasing their preference to coordinate (anticoordinate) with whichever neighbors they are coordinated (anticoordinated) with at present. These preferences change slowly while the system is repeatedly perturbed, so that it settles to many different local attractors. We find that under these conditions, with each perturbation there is a progressively higher chance of the system settling to a configuration with high total utility. Eventually, only one attractor remains, and that attractor is very likely to maximize (or almost maximize) global utility. This counterintuitive result can be understood using theory from computational neuroscience; we show that this simple form of habituation is equivalent to Hebbian learning, and the improved optimization of global utility that is observed results from well-known generalization capabilities of associative memory acting at the network scale. This causes the system of selfish agents, each acting individually but habitually, to collectively identify configurations that maximize total utility.