It has long been known that in some relatively simple reinforcement learning tasks, traditional strength-based classifier systems adapt poorly and generalise poorly. In contrast, the more recent accuracy-based XCS appears both to adapt and to generalise well. In this work, we attribute the difference to what we call strong overgeneral and fit overgeneral rules. We begin by developing a taxonomy of rule types and considering the conditions under which they may occur. To do so, we make an extreme simplification of the classifier system, which forces us toward qualitative rather than quantitative analysis. We begin with the basics, defining correct and incorrect actions, and then correct, incorrect, and overgeneral rules for both strength-based and accuracy-based fitness. The concept of strong overgeneral rules, which we claim are the Achilles' heel of strength-based classifier systems, is then analysed. We show that strong overgenerals depend on what we call biases in the reward function (or, in sequential tasks, the value function). We distinguish between strong and fit overgeneral rules, and show that although strong overgenerals are fit in a strength-based system called SB-XCS, they are not in XCS. Next, we show how to design fit overgeneral rules for XCS (but not SB-XCS) by introducing biases in the variance of the reward function, and thus that each system has its own weakness. Finally, we give some consideration to the prevalence of reward function and variance bias, and note that non-trivial sequential tasks have highly biased value functions.
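To make the contrast concrete, the following minimal sketch (not taken from the paper; the two-state reward values, the learning-rate and accuracy parameters, and the simplified XCS-style accuracy formula are all illustrative assumptions) shows how a biased reward function can produce a strong overgeneral: a rule that earns higher average strength than the correct rule it competes with, even though it acts incorrectly in one of its states.

```python
import random

BETA = 0.2                          # Widrow-Hoff learning rate (illustrative choice)
EPS0, ALPHA, NU = 10.0, 0.1, 5.0    # XCS-style accuracy parameters (assumed values)

def update(rule, reward):
    """One Widrow-Hoff update of a rule's strength (prediction) and error."""
    rule["error"] += BETA * (abs(reward - rule["strength"]) - rule["error"])
    rule["strength"] += BETA * (reward - rule["strength"])

def accuracy(rule):
    """XCS-style accuracy: high only when the prediction error is small."""
    if rule["error"] < EPS0:
        return 1.0
    return ALPHA * (rule["error"] / EPS0) ** -NU

# A biased reward function: the correct action earns 1000 in state s1
# but only 200 in state s2 (these magnitudes are illustrative assumptions).
overgeneral = {"strength": 0.0, "error": 0.0}  # matches s1 and s2, correct only in s1
correct_s2  = {"strength": 0.0, "error": 0.0}  # matches only s2, always correct

for _ in range(1000):
    state = random.choice(["s1", "s2"])
    if state == "s1":
        update(overgeneral, 1000.0)  # acts correctly in s1, earns the large reward
    else:
        update(overgeneral, 0.0)     # acts incorrectly in s2, earns nothing
        update(correct_s2, 200.0)    # the correct rule for s2 earns the small reward

# The overgeneral's strength (~500) exceeds the correct rule's (~200), so
# strength-based fitness favours it; its large prediction error gives it
# near-zero accuracy, so accuracy-based fitness (as in XCS) does not.
print(f"overgeneral: strength={overgeneral['strength']:.0f}, "
      f"accuracy={accuracy(overgeneral):.3f}")
print(f"correct s2 : strength={correct_s2['strength']:.0f}, "
      f"accuracy={accuracy(correct_s2):.3f}")
```

This mirrors, in toy form, the paper's qualitative point: the reward bias (1000 versus 200) is what lets the overgeneral dominate under strength, while the variance in its received rewards is exactly what accuracy-based fitness penalises.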
