This paper addresses the problem of learning cooperative strategies in swarm robotics. We are interested in heterogeneous swarms, in which each robot optimizes its individual gain. For some tasks, the optimal strategy requires cooperation and may be counter-selected in favor of a more stable but less efficient selfish strategy. To solve this problem, we introduce a partner choice mechanism whose conditions of operation are learned. This mechanism proves surprisingly efficient when the swarm size is large and the duration of interactions is long. Beyond evolutionary swarm robotics, the results we present are relevant to other distributed on-line learning methods for robotics, and offer a possible extension of existing evolutionary biology and social learning models.