Skip Nav Destination
Close Modal
Update search
NARROW
Format
TocHeadingTitle
Date
Availability
1-4 of 4
Thomy Phan
Close
Follow your search
Access your saved searches in your account
Would you like to receive an alert when new items match your search?
Sort by
Proceedings Papers
. isal2021, ALIFE 2021: The 2021 Conference on Artificial Life74, (July 18–22, 2021) 10.1162/isal_a_00399
Abstract
View Paper
PDF
This paper considers sustainable and cooperative behavior in multi-agent systems. In the proposed predator-prey simulation, multiple selfish predators can learn to act sustainably by maintaining a herd of reproducing prey and further hunt cooperatively for long term benefit. Since the predators face starvation pressure, the scenario can also turn in a tragedy of the commons if selfish individuals decide to greedily hunt down the prey population before their conspecifics do, ultimately leading to extinction of prey and predators. This paper uses Multi-Agent Reinforcement Learning to overcome a collapse of the simulated ecosystem, analyzes the impact factors over multiple dimensions and proposes suitable metrics. We show that up to three predators are able to learn sustainable behavior in form of collective herding under starvation pressure. Complex cooperation in form of group hunting emerges between the predators as their speed is handicapped and the prey is given more degrees of freedom to escape. The implementation of environment and reinforcement learning pipeline is available online.
Proceedings Papers
. isal2020, ALIFE 2020: The 2020 Conference on Artificial Life518-525, (July 13–18, 2020) 10.1162/isal_a_00273
Abstract
View Paper
PDF
This paper applies reinforcement learning to train a predator to hunt multiple prey, which are able to reproduce, in a 2D simulation. It is shown that, using methods of curriculum learning, long-term reward discounting and stacked observations, a reinforcement-learning-based predator can achieve an economic strategy: Only hunt when there is still prey left to reproduce in order to maintain the population. Hence, purely selfish goals are sufficient to motivate a reinforcement learning agent for long-term planning and keeping a certain balance with its environment by not depleting its resources. While a comparably simple reinforcement learning algorithm achieves such behavior in the present scenario, providing a suitable amount of past and predictive information turns out to be crucial for the training success.
Proceedings Papers
. isal2020, ALIFE 2020: The 2020 Conference on Artificial Life333-340, (July 13–18, 2020) 10.1162/isal_a_00267
Abstract
View Paper
PDF
Flocking or swarm behavior is a widely observed phenomenon in nature. Although the entities might have self-interested goals like evading predators or foraging, they group themselves together because a collaborative observation is superior to the observation of a single individual. In this paper, we evaluate the emergence of swarms in a foraging task using multi-agent reinforcement learning (MARL). Every individual can move freely in a continuous space with the objective to follow a moving target object in a partially observable environment. The individuals are self-interested as there is no explicit incentive to collaborate with each other. However, our evaluation shows that these individuals learn to form swarms out of self-interest and learn to orient themselves to each other in order to find the target object even when it is out of sight for most individuals.
Proceedings Papers
. isal2019, ALIFE 2019: The 2019 Conference on Artificial Life598-605, (July 29–August 2, 2019) 10.1162/isal_a_00226
Abstract
View Paper
PDF
In nature, flocking or swarm behavior is observed in many species as it has beneficial properties like reducing the probability of being caught by a predator. In this paper, we propose SELFish (Swarm Emergent Learning Fish), an approach with multiple autonomous agents which can freely move in a continuous space with the objective to avoid being caught by a present predator. The predator has the property that it might get distracted by multiple possible preys in its vicinity. We show that this property in interaction with self-interested agents which are trained with reinforcement learning solely to survive as long as possible leads to flocking behavior similar to Boids, a common simulation for flocking behavior. Furthermore we present interesting insights into the swarming behavior and into the process of agents being caught in our modeled environment.