Abstract
This paper considers adaptive control architectures that integrate active sensorimotor systems with decision systems based on reinforcement learning. One unavoidable consequence of active perception is that the agent's internal representation often confounds external world states. We call this phenomenon perceptual aliasing and show that it destabilizes existing reinforcement learning algorithms with respect to the optimal decision policy. A new decision system that overcomes these difficulties is described. The system incorporates a perceptual subcycle within the overall decision cycle and uses a modified learning algorithm to suppress the effects of perceptual aliasing. The result is a control architecture that learns not only how to solve a task but also where to focus its attention in order to collect necessary sensory information.