Abstract
Our "minimize surprise" method evolves swarm robot controllers using a task-independent reward for prediction accuracy. Since no specific task is rewarded during optimization, various collective behaviors can emerge, as shown in previous work. So far, however, all generated behaviors have been static or repetitive, which makes sensor values easy to predict because the sensor input stays mostly constant. Our goal is to generate more dynamic behaviors in which agents change what they do in response to changes in sensor input. To this end, we modify the environment and the agents' capabilities, and we extend the minimize surprise reward with additional components that reward homing or curiosity. In preliminary experiments, these modifications produced initial dynamic behaviors, providing a promising basis for future work.
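As a rough illustration of the reward structure described above, the following minimal sketch computes a prediction-accuracy score over binary sensor values and blends it with one additional component. The function names, the linear weighting, and the `weight` parameter are assumptions made for illustration only; the abstract does not specify how the homing or curiosity terms are defined or combined.

```python
from typing import Sequence


def prediction_accuracy(
    predictions: Sequence[Sequence[int]],
    sensor_values: Sequence[Sequence[int]],
) -> float:
    """Fraction of binary sensor values that were predicted correctly.

    Each inner sequence holds the values for one time step.
    """
    correct = 0
    total = 0
    for predicted_step, observed_step in zip(predictions, sensor_values):
        for predicted, observed in zip(predicted_step, observed_step):
            # A binary prediction contributes 1 if it matches, 0 otherwise.
            correct += 1 - abs(predicted - observed)
            total += 1
    return correct / total if total else 0.0


def combined_reward(accuracy: float, extra: float, weight: float = 0.5) -> float:
    """Hypothetical linear blend of prediction accuracy and an extra term.

    `extra` stands in for a homing or curiosity score in [0, 1]; the blend
    and its default weight are assumptions, not the authors' formulation.
    """
    return (1 - weight) * accuracy + weight * extra


# Two time steps with two binary sensors each; one prediction is wrong.
accuracy = prediction_accuracy([[1, 0], [1, 1]], [[1, 0], [0, 1]])
reward = combined_reward(accuracy, extra=0.25, weight=0.5)
```

With prediction accuracy alone, a controller that keeps its sensor input constant can score highly, which is why an additional term is one plausible way to favor more dynamic behavior.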