Abstract
Modeling the behavior of algorithms is the realm of evolutionary algorithm theory. From a practitioner's point of view, theory must provide some guidelines regarding which algorithm/parameters to use in order to solve a particular problem. Unfortunately, most theoretical models of evolutionary algorithms are difficult to apply to realistic situations. However, in recent work (Graff and Poli, 2008, 2010), where we developed a method to practically estimate the performance of evolutionary program-induction algorithms (EPAs), we started addressing this issue. The method was quite general; however, it suffered from some limitations: it required the identification of a set of reference problems, it required hand picking a distance measure in each particular domain, and the resulting models were opaque, typically being linear combinations of 100 features or more. In this paper, we propose a significant improvement of this technique that overcomes the three limitations of our previous method. We achieve this through the use of a novel set of features for assessing problem difficulty for EPAs which are very general, essentially based on the notion of finite difference. To show the capabilities or our technique and to compare it with our previous performance models, we create models for the same two important classes of problems—symbolic regression on rational functions and Boolean function induction—used in our previous work. We model a variety of EPAs. The comparison showed that for the majority of the algorithms and problem classes, the new method produced much simpler and more accurate models than before. To further illustrate the practicality of the technique and its generality (beyond EPAs), we have also used it to predict the performance of both autoregressive models and EPAs on the problem of wind speed forecasting, obtaining simpler and more accurate models that outperform in all cases our previous performance models.