Abstract
A large training set of fitness cases can critically slow down genetic programming if no appropriate subset selection method is applied. Such a method allows an individual to be evaluated on a smaller subset of the fitness cases. In this paper we propose a new subset selection method that takes the problem structure into account while remaining problem independent. To achieve this, information about the problem structure is acquired during the evolutionary search by creating a topology (relationship) on the set of fitness cases. The topology is induced by the individuals of the evolving population: the strength of the relation between two fitness cases is increased whenever an individual of the population is able to solve both of them. Our new topology-based subset selection method chooses a subset such that the fitness cases in it are as distantly related as possible with respect to the induced topology. We compare topology-based selection of fitness cases with dynamic subset selection and stochastic subset sampling on four different problems. On average, runs with topology-based selection show faster progress than the others.
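The two steps described in the abstract, strengthening the relation between fitness cases that one individual solves and then picking weakly related cases, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names `update_topology` and `select_subset`, the unit weight increment, and the greedy selection heuristic are all assumptions made for illustration, since the abstract does not specify these details.

```python
import itertools
import random


def update_topology(weights, solved_cases):
    """Strengthen the relation between every pair of cases solved by one individual.

    `weights` maps a pair (a, b) with a < b to a relation strength.
    The increment of 1 per individual is an assumption, not taken from the paper.
    """
    for a, b in itertools.combinations(sorted(solved_cases), 2):
        weights[(a, b)] = weights.get((a, b), 0) + 1


def select_subset(weights, n_cases, subset_size, rng):
    """Greedily pick cases that are weakly related to those already chosen.

    Hypothetical heuristic: start from a random case, then repeatedly add the
    case whose summed relation strength to the chosen cases is smallest
    (ties broken by the lower case index).
    """
    def strength(a, b):
        return weights.get((min(a, b), max(a, b)), 0)

    chosen = [rng.randrange(n_cases)]
    while len(chosen) < subset_size:
        candidates = [c for c in range(n_cases) if c not in chosen]
        best = min(candidates,
                   key=lambda c: (sum(strength(c, s) for s in chosen), c))
        chosen.append(best)
    return chosen


# Toy example: four fitness cases, two individuals.
weights = {}
update_topology(weights, {0, 1})      # first individual solves cases 0 and 1
update_topology(weights, {0, 1, 2})   # second individual solves cases 0, 1 and 2
subset = select_subset(weights, n_cases=4, subset_size=2, rng=random.Random(0))
```

In this toy example no individual solves case 3, so it is unrelated to every other case and the greedy step always includes it in a subset of size two. A real implementation would also need to decide how the topology is aged or reset over generations, which this sketch leaves out.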