Our minimize surprise method evolves swarm robot controllers using a task-independent reward for prediction accuracy. Because no specific task is rewarded during optimization, a variety of collective behaviors can emerge, as previous work has shown. So far, however, all generated behaviors have been static or repetitive, which makes sensor values easy to predict because the sensor input stays mostly constant. Our goal is to generate more dynamic behaviors, in which agents act differently as their sensor input changes. To this end, we modify the environment and the agents' capabilities, and we extend the minimize surprise reward with additional components that reward homing or curiosity. In preliminary experiments, these modifications yielded initial dynamic behaviors, providing a promising basis for future work.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.