Abstract

All commonly used stochastic optimisation algorithms have to be parameterised to perform effectively. Adaptive parameter control (APC) is an effective method used for this purpose. APC repeatedly adjusts parameter values during the optimisation process for optimal algorithm performance. The assignment of parameter values for a given iteration is based on previously measured performance. In recent research, time series prediction has been proposed as a method of projecting the probabilities to use for parameter value selection. In this work, we examine the suitability of a variety of prediction methods for the projection of future parameter performance based on previous data. All considered prediction methods have assumptions the time series data has to conform to for the prediction method to provide accurate projections. Looking specifically at parameters of evolutionary algorithms (EAs), we find that all standard EA parameters with the exception of population size conform largely to the assumptions made by the considered prediction methods. Evaluating the performance of these prediction methods, we find that linear regression provides the best results by a very small and statistically insignificant margin. Regardless of the prediction method, predictive parameter control outperforms state of the art parameter control methods when the performance data adheres to the assumptions made by the prediction method. When a parameter's performance data does not adhere to the assumptions made by the forecasting method, the use of prediction does not have a notable adverse impact on the algorithm's performance.

1  Introduction

Evolutionary algorithms (EAs) have shown good performance and robustness in solving hard optimisation problems. In recent years, it has been acknowledged that the robustness of EAs can largely be attributed to the adjustability of the algorithm parameters (Bäck et al., 2000; Eiben and Schut, 2008; Michalewicz and Schmidt, 2007), which enable the optimisation procedure to adapt to different search spaces. That said, it comes as no surprise that the parameterisation of EAs greatly affects the performance of the algorithm (Aleti and Moser, 2013; Bäck et al., 2000; Eiben and Smit, 2011; Lobo, 2011; Michalewicz and Schmidt, 2007).

The parameterisation is usually the responsibility of the practitioner, who may not be an expert in EAs. Even after many years of research into EAs and other stochastic approaches, there are no straightforward parameterisation guidelines for interdisciplinary users of these methods (Arcuri and Fraser, 2011). General guidelines regarding the crossover rate recommend a divergent variety of values, such as 0.6 (De Jong, 1995), 0.95 (Grefenstette, 1986), or “any value in the range [0.75, 0.95]” (Schaffer et al., 1989). De Jong (1995) recommends a mutation rate equal to 0.001, whereas Grefenstette (1986) argues that the mutation rate should be less than 0.05.

Each of these conflicting guidelines may well be the ideal configuration for a particular search space. As the search space of a newly arising problem is typically unknown, general principles about EA parameter configurations for universal applicability can hardly be formulated (Mitchell, 1996), and EA parameterisation presents itself as an optimisation problem in its own right (Eiben and Schut, 2008). Pressed for time, practitioners tend to choose parameter values based on a few preliminary trials on the problem at hand. It has been demonstrated that different parameter settings are required not only for different problems but also for different optimisation stages of the same problem instance (Bäck et al., 2000; Cervantes and Stephens, 2009; Thierens, 2002). The “optimal” values for the mutation rate of an EA were studied using a set of benchmark problems (Cervantes and Stephens, 2009). It became clear that these values depend on the state of the search space and that any fixed mutation rate will produce suboptimal results. Time dependency has since been reported for several EA parameters—the mutation rate (Aleti and Moser, 2011; Aleti et al., 2012; Bäck and Schütz, 1996; Davis, 1991), the crossover rate (Aleti and Moser, 2011; Aleti et al., 2012; Davis, 1991), and the population size (Bäck et al., 2000; Lobo, 2011).

As prior tuning of an EA's parameter values does not provide optimal performance, a great number of researchers have since proposed to reset the parameter values of an EA repeatedly during the optimisation process (Bäck et al., 2000; De Jong, 2007; Eiben et al., 2007; Fialho et al., 2010; Michalewicz and Fogel, 2004), an approach known as parameter control. The intuitive motivation comes from the way the optimisation process unfolds from a more diffused global search, requiring parameter values responsible for the exploration of the search space, to a more focused local optimisation process, requiring parameter values which help with the convergence of the algorithm.

One of the most conspicuous ways of varying parameter values during the optimisation process is adaptive parameter control (Corne et al., 2002; Fialho et al., 2009; Hong et al., 2000; Igel and Kreutz, 2003; Thierens, 2002), in which properties of an EA optimisation process (typically the quality of the solutions produced) are monitored and the change in the properties is used as a signal to change the parameter values. The area of adaptive parameter control has been researched more actively in recent times (Aleti and Moser, 2011; Fialho et al., 2009; Hansen and Ostermeier, 2001; Thierens, 2005). Parameter values are assessed based on past performance and subsequently adapted for the next iteration of the algorithm.

In essence, adaptive parameter control methods assess the potential future success of parameter values based on the estimated effect of the parameter values in the last iterations (Fialho et al., 2010; Hong et al., 2002; Igel et al., 2005). Quality attribution mechanisms use this quality measure gained from recent performance (time t) to decide which value(s) to use in the immediate future. It is generally argued that the past effectiveness of parameter values provides the information necessary to project their chance of leading to good results at iteration t+1. In previous work (Aleti and Moser, 2011), we have suggested that using time series prediction to attribute a quality metric to the parameter values to choose from for the next iteration is beneficial to the performance of the algorithm. The predictive parameter control method used linear regression to forecast the quality of mutation and crossover rates, and outperformed the best-performing approaches with the same functionality: AP (Thierens, 2005), PM (Thierens, 2005), and DMAB (Fialho et al., 2009).

However, time series prediction methods require the data series to have a number of statistical properties, such as linearity of the relationship between dependent and independent variables, normality of the error distribution, independence of the errors, homoscedasticity, and stationarity of the mean and standard deviation. Different methods have been made available to accommodate different types of data. Typical EA parameters whose adaptation is meaningful to automate using predictive parameter control are mutation rate, crossover rate, mutation and crossover operators, population, and mating pool sizes. In the current work, we investigate the statistical properties of the quality of these parameter values recorded over time. The analysis of the statistical properties is employed to examine the suitability of a number of forecasting techniques for adaptive parameter value optimisation and provide recommendations on the adequacy of these methods for forecasting the quality to be expected from the use of certain parameter values. To demonstrate the benefits of predictive parameterisation, we present a comparison with state of the art quality attribution strategies—average quality attribution (AQA) and extreme quality attribution (EQA).

2  Adaptive Parameter Control for Evolutionary Algorithms

Adaptive parameter control describes the application of separate meta-optimisation methods which use feedback from the optimisation process to evaluate the effect of parameter value choices and adjust the parameter values over the iterations. The steps of adaptive parameter control as applied to an EA are depicted in Figure 1.

Figure 1:

Steps of adaptive parameter control in evolutionary algorithms.


The optimisation process of EAs begins with a set of initial solutions, which is evolved in iterations until a stopping criterion is reached. The new generations of solutions are created with the help of crossover, mutation, and selection operators. These operators, their probabilities, the size of the population, and the number of new solutions produced in every iteration, are examples of parameters that can be adjusted by a parameter control method. Every iteration, the solutions created are evaluated using the objective function(s) of the problem at hand. The results from this fitness evaluation process can be used by the adaptive parameter control methods to assess the effect of the parameter values on the performance of the optimisation algorithm. This feedback collection as a part of the parameter control method records the change in the properties of the population—usually fitness, occasionally also diversity—as the performance of the algorithm.

The feedback from the optimisation process is used by the parameter effect assessment strategy to estimate the effect of the parameter values on the change in solution properties. The approximated effect of the parameter values is used to estimate a quality value which is attributed to the parameter value to project the potential of these values in the next iteration(s). These quality attributions provide the probability distributions used to select the parameter values to be employed to create the next generation of solutions. The selection mechanism subsequently configures the components of an EA (variation, selection, and replacement operators).

State of the art adaptive parameter control methods can be classified into two main categories: methods that use (i) an AQA strategy (Igel et al., 2005; Thierens, 2005), where the quality of parameter values is estimated as the average of the qualities measured in the past iterations, or (ii) an EQA strategy (Thierens, 2005; Fialho et al., 2010), in which the best quality measured so far is used to estimate the overall quality of parameter values.

Usually, adaptive parameter control methods use a predefined time window w to approximate the quality of parameter values for the following iteration. Some methods use this time window to estimate the quality of parameter values as the average effect on algorithm performance (Igel et al., 2005). The probability matching (PM) technique (Thierens, 2005) estimates the quality of a parameter value based on a running average of past rewards. Rewards are allocated on the basis of the outcome of the optimisation process that the parameter value was used in. A minimum probability is enforced on values that do not receive rewards in order to maintain a nonzero probability. The motivation of a minimum value is the assumption that parameter values that do not perform well at present might be optimal in the future. PM has been criticised for the fact that the probability values resulting from the reward allocations poorly reflect the relative differences in algorithm performance when using the values. Values with vastly superior performance may only be differentiated by a marginal increase of the probability of being chosen in the next step.

Adaptive pursuit (AP) (Thierens, 2005) was conceived with the goal of improving the performance of PM by ensuring an appropriate difference in probabilities depending on experienced performance. After an iteration of the optimisation process, AP establishes the respective rewards for the parameter values used, but only applies the maximum reward to the value of the best-performing algorithm instance. All other values have their probabilities of future use diminished. A nonzero probability is enforced as a minimum probability.

Fialho et al. (2008, 2009, 2010) employ the maximum effect in w iterations to approximate the quality of parameter values in the preceding iteration t−1. A normalisation scheme is used, which divides parameter effects by the best effect achieved so far by any parameter value. The dynamic multi-armed bandit (DMAB) (Fialho et al., 2009) also employs a performance-based rewards approach, but here the probability of re-using a certain value is based on a tradeoff between the number of times the parameter value was used and the reward gained from its previous performance. Rather than using a rewards-based weighted adjustment of the probabilities, the DMAB completely recalculates the probabilities when a change in the rewards distribution is detected. DMAB uses the Page-Hinkley test to detect a change in the rewards distribution.

The most common way of attributing quality to the parameters is by using the improvement in specific properties of the generated solutions with respect to a reference solution sref. This reference point can be either the parents of the generated solutions (Fialho et al., 2009; Hong et al., 2002; Igel et al., 2003; Igel et al., 2005; Wong et al., 2003), the best solution so far (Davis, 1989; Eiben et al., 2004), or a specified quantile, for example, the median (Julstrom, 1995, 1997). For example, if a mutation rate of 0.01 is applied to create a solution $s'$ from its parent s, the effect of the mutation rate on the performance of the algorithm is equal to the fitness gain $f(s') - f(s)$.

Contemporary parameter control methods (Corne et al., 2002; Fialho et al., 2010; Igel et al., 2005; Whitacre et al., 2009) attribute the quality derived from recent performance directly to the parameter values as a basis for the choice to make in the next iteration (time t+1). One might argue that these approaches are one step behind, in that they represent the parameter value which is optimal for the previous iteration (time t). Ideally, we would use a setting optimised for the upcoming iteration (time t+1). It follows that an ideal parameter control method would attempt to predict successful parameter values for time t+1 based on previous performance.

3  Predictive Quality Attribution

Given a set $V = \{v_1, \dots, v_n\}$ of algorithm parameters, each with values $v_{i1}, \dots, v_{im}$, where each value $v_{ij}$ can be a discrete value or an interval of continuous values, the adaptive method has the task of deriving the optimal next value to optimise the influence of $v_i$ on the performance of the algorithm. As an example, when the mutation rate $v_i$ is dynamically adjusted using four intervals (m = 4), $v_{i2}$ stands for the second interval. In the discrete case of optimising the type of mutation operator, $v_{i2}$ stands for the second operator type. The probability of choosing values from a particular interval depends on the previous performance of values sampled from this interval. Values from intervals which have not been used have a small initial probability of being chosen. When a value is used but does not produce any above-average outcome, its probability of being chosen reverts to the initial value.

To explore the parameter space, k independent algorithm instances are started in parallel. Each optimisation process optimises the same problem instance. The processes set their adaptive parameters for an iteration at a time, choosing the respective value probabilistically from the distribution obtained from the performance-based prediction. At the end of an iteration, the average solution fitness among all instances is established. Each value interval keeps a history list of success rates from all iterations. Algorithm instances which produce above-average population fitness gain are considered successful. For each parameter value or interval, the number of times the value has been used and the number of times the value led to a success (above-average increase in population fitness) are established. The effect of each parameter value is calculated using Equation (1).
$$ e_t(v_{ij}) = \frac{s_t(v_{ij})}{u_t(v_{ij})} \tag{1} $$

where $u_t(v_{ij})$ denotes the number of times value (or interval) $v_{ij}$ has been used and $s_t(v_{ij})$ the number of times its use led to a success.
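As a minimal sketch of Equation (1) (function and variable names are illustrative, not from a specific implementation), the effect of a parameter value or interval can be computed as its success rate:

```python
# Effect of a parameter value/interval as in Equation (1): the fraction
# of its uses that produced an above-average increase in population
# fitness. Unused values are given a neutral effect of zero here.

def effect(successes, uses):
    """Observed effect e_t(v) = s_t(v) / u_t(v)."""
    if uses == 0:
        return 0.0
    return successes / uses

# Example: an interval used 8 times, 6 of which were successes.
print(effect(6, 8))  # 0.75
```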
Predictive quality attribution (PQA) applies forecasting techniques to predict the quality of parameter values to be applied in the parameter control strategy. The effects e of parameter intervals/types are recorded for each iteration, resulting in the time series $e_1, e_2, \dots, e_t$, where t is the current iteration. The time series is used to forecast the effect of the parameter intervals/types for the next iterations. The purpose of predictive modelling is to estimate the expected effect of a parameter interval/type in the future iterations. A predictive model in adaptive parameter control has the general form:

$$ \hat{e}_{t+1} = f(X) + \varepsilon \tag{2} $$

where $\hat{e}_{t+1}$ is the response, $X = (e_1, \dots, e_t)$ is a vector of predictors, and $\varepsilon$ is a random error. The different predictive models vary in the type of the response, which can be discrete or continuous, and the function f. In this paper, we consider predictive models with continuous response, where the response is the expected quality of parameter values/intervals. The probability of sampling from a range is the combined quality of its members, normalised by the summed qualities of all intervals.
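The normalisation of predicted qualities into a sampling distribution can be sketched as follows (names and example values are illustrative):

```python
import random

# Convert the predicted qualities of parameter intervals into selection
# probabilities by normalising over all intervals, then sample one
# interval from the resulting distribution (roulette-wheel selection).

def selection_probabilities(qualities):
    total = sum(qualities.values())
    if total == 0:
        # no information yet: fall back to a uniform distribution
        return {k: 1.0 / len(qualities) for k in qualities}
    return {k: q / total for k, q in qualities.items()}

def sample_interval(qualities, rng=random):
    probs = selection_probabilities(qualities)
    r, acc = rng.random(), 0.0
    for interval, p in probs.items():
        acc += p
        if r <= acc:
            return interval
    return interval  # guard against floating-point round-off

# Two mutation-rate intervals with predicted qualities 0.6 and 0.2:
print(selection_probabilities({"[0.001,0.249]": 0.6, "[0.25,0.49]": 0.2}))
```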

Forecasting necessitates the use of analytical methods to identify the relationship between future (predicted) effects of parameter values and past effects. The forecasting techniques explored in this work are the most prominent methods available for time series prediction, ranging from simple models such as linear regression (LR), simple moving average (SMA), and exponentially weighted moving average (EWMA) to more sophisticated formulations such as the autoregressive integrated moving average (ARIMA).

LR fits a linear equation to the observed data; hence, it makes the assumption that the data points have a linear relationship with time. If this is not the case, LR cannot be expected to provide accurate predictions and other alternatives have to be explored. The SMA model measures trends by smoothing the data using the mean of the data values. As a result, the trend in the data is extrapolated into a forecast. The assumption is that prior trend movements will continue. EWMA, on the other hand, focuses mostly on recent effects and captures shorter trends in the data. As a result, an EWMA is expected to reduce any lags in the simple moving average such that the fitting line is closer to the real values of the effects compared to a simple moving average. EWMA is vulnerable to sudden changes in the data, and hence is easily affected by noise. If this problem arises, it can be counteracted by increasing the length of the moving average term or using SMA instead. The ARIMA model combines a moving average model with an autoregressive model, the latter being a special case of the linear regression model. The ARIMA model is considered more powerful than LR, but also more complex.

Using the most sophisticated model available does not guarantee better results, as all models make assumptions about the data used for projection. For an optimal outcome, the parameter performance data must conform to the assumptions made by the forecasting method. It is necessary to investigate the data with regard to the assumptions to make recommendations of which forecasting methods are likely to produce well-parameterised algorithms. To this end, the analysis of the properties of the data is complemented with the performance of the algorithm when parameterised using the respective forecasting methods.

3.1  Linear Regression

Linear regression builds a linear mathematical model of past observations that can be used to forecast future values of a discrete time series. Regression models are based on the notion of regression to the mean, that is, if a time step is x standard deviations from the mean, the model predicts that the next value will be $r \cdot x$ standard deviations from the mean of the time series, where r is the correlation coefficient between time and parameter effects. This is an indication of the extent to which the linear model can be used to predict the deviation of the effect of a parameter value from the mean when the deviation of time t from the mean is known.

The correlation coefficient measures the degree of linear relatedness between the data and time on a scale of −1 to +1. The correlation coefficient is positive if both variables vary on the same side of the respective means, and negative if they vary on opposite sides. If the effect varies independently of the number of iterations, the correlation coefficient is equal to zero. The linear model for predicting the quality of a parameter value from t is equal to:
$$ \hat{e}_{t+1} = \bar{e} + r\,\frac{s_e}{s_t}\,\bigl((t+1) - \bar{t}\bigr) \tag{3} $$

where t is the total number of observations, $\bar{e}$ is the mean of the effects in the last t iterations and $s_e$ its standard deviation, and $\bar{t}$ is the mean of the time steps and $s_t$ its standard deviation at time t. The correlation coefficient r is equal to the average of the product of the standardised effect and time, calculated as follows:

$$ r = \frac{1}{t}\sum_{i=1}^{t} \left(\frac{e_i - \bar{e}}{s_e}\right) \left(\frac{i - \bar{t}}{s_t}\right) \tag{4} $$

The current parameter effect and time are standardised by expressing them in units of standard deviations from their own mean. The best fit of the model is calculated by minimising the mean squared error.
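Equations (3) and (4) translate into a short one-step forecasting routine; the sketch below assumes a simple regression of the recorded effects on the iteration index:

```python
import math

# One-step linear-regression forecast of a parameter value's effect:
# standardise time and effect, estimate the correlation coefficient r
# (Equation (4)), and project the deviation of the next effect from the
# mean as r times the standardised time deviation (Equation (3)).

def lr_forecast(effects):
    t = len(effects)
    times = list(range(1, t + 1))
    mean_e = sum(effects) / t
    mean_t = sum(times) / t
    sd_e = math.sqrt(sum((e - mean_e) ** 2 for e in effects) / t)
    sd_t = math.sqrt(sum((x - mean_t) ** 2 for x in times) / t)
    if sd_e == 0 or sd_t == 0:
        return mean_e  # constant series: forecast the mean
    r = sum((e - mean_e) / sd_e * (x - mean_t) / sd_t
            for e, x in zip(effects, times)) / t
    return mean_e + r * (sd_e / sd_t) * ((t + 1) - mean_t)

print(lr_forecast([0.1, 0.2, 0.3, 0.4]))  # ~0.5 for a perfectly linear trend
```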

3.2  Simple Moving Average

Due to the stochastic nature of approximate optimisation methods, the performance of the algorithm can display random variation with time. Therefore, a smoothing method could be useful to cancel, or at least reduce, the effect due to random variation. Averaging and exponential smoothing models, such as the SMA model, reduce the noise in the data. A simple moving average model uses equal weights for all the data points in the time series as follows:
$$ \hat{e}_{t+1} = \frac{1}{k} \sum_{i=t-k+1}^{t} e_i \tag{5} $$

where k is the “width” of the moving average and denotes the number of observations used to compute the average. The forecast is equal to the average of the last k data points in the time series, centred at $t - (k-1)/2$; the forecast will thus be delayed by $(k-1)/2$ time steps. The selection of the window width k depends on the type of prediction of interest, which can be short, intermediate, or long term. Short-term averages respond quickly to changes in the effect of the parameter values, while long-term averages are slow to react.
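A sketch of the SMA forecast of Equation (5), with an illustrative effect series:

```python
# Simple moving average forecast: the prediction for the next iteration
# is the unweighted mean of the last k observed effects (Equation (5)).
# If fewer than k observations exist, the whole series is averaged.

def sma_forecast(effects, k):
    window = effects[-k:]
    return sum(window) / len(window)

print(sma_forecast([0.4, 0.5, 0.3, 0.2, 0.1], k=3))  # mean of the last 3 effects
```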

3.3  Exponentially Weighted Moving Average

In the LR and SMA models, past observations are weighted equally. EWMA gives exponentially decreasing weights to past observations, that is, feedback taken from the current iteration is given more importance in forecasting the next parameter values than the earlier observations. The forecast of the future effect of parameter values is derived from the weighted sum of all observations. Mathematically, EWMA is the convolution of the data points with an exponentially smoothing weighting function defined as:
$$ \hat{e}_{t+1} = \alpha \sum_{i=0}^{t-1} (1-\alpha)^{i}\, e_{t-i} \tag{6} $$

where $\alpha$ is the smoothing factor in the range (0,1). It can be observed that the multiplicands of the data series decrease exponentially, approaching zero without reaching it. The smoothing factor controls the closeness of the interpolated value to the most recent observation. Large values of $\alpha$ reduce the level of smoothing. To determine the value of $\alpha$ we use the least-squares method, for which the mean of squared errors is minimised.
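In practice the weighted sum of Equation (6) is usually computed recursively; a sketch, initialising with the first observation (a common but not unique choice):

```python
# Exponentially weighted moving average forecast: recent effects receive
# exponentially larger weights. The recursive update
#   f = alpha * e + (1 - alpha) * f
# unrolls to the exponentially decaying weighted sum of Equation (6).

def ewma_forecast(effects, alpha):
    forecast = effects[0]  # initialise with the first observation
    for e in effects[1:]:
        forecast = alpha * e + (1 - alpha) * forecast
    return forecast

print(ewma_forecast([0.1, 0.2, 0.4], alpha=0.5))  # weights the recent 0.4 most heavily
```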

3.4  Autoregressive Integrated Moving Average

The use of ARIMA models is appropriate when the time series data show a stable trend with time and do not have many outliers. ARIMA can perform better than exponential smoothing techniques given a sufficient number of observations and a stable correlation between past observations. The ARIMA model is composed of three parts, which are calculated individually: the integrated (I) part, the autoregressive part (AR) and the moving average part (MA).

The integrated part I(d) is used if the data show evidence of nonstationarity, that is, the time series is not stable with time and exhibits a trend, and d is the number of nonseasonal differences. The stationarity of the time series is measured using the stationarity in mean and variance. The first step in the identification of the appropriate ARIMA model is the identification of a value for d, that is, the order(s) of differencing needed to make the time series stationary. The process of differencing subtracts each observation at time t−1 from the observation at time t.
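Differencing itself is straightforward to express in code (a sketch):

```python
# First-order differencing, the I(d) step of ARIMA: replace the series by
# the changes between consecutive observations, which removes a linear
# trend. Applying it d times removes polynomial trends of order d.

def difference(series, d=1):
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

print(difference([1, 3, 6, 10]))       # [2, 3, 4]
print(difference([1, 3, 6, 10], d=2))  # [1, 1]
```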

The AR(p) model is used to describe the correlation between the current value of the time series and p past values. The order p of the AR model describes the number of lags (past values) included in the model prediction, that is, the number of the autoregressive terms. In adaptive parameter control, AR(1) describes the correlation between the prediction of the effect at time t and the immediately preceding value in the time series, equal to $\hat{e}_t = \phi_1 e_{t-1} + \varepsilon_t$, where $\phi_1$ is the first lag (lag 1) autoregressive parameter, and $\varepsilon_t$ the error.

The moving average (MA(q)) part of the ARIMA model considers the time series as an unevenly weighted moving average of random errors $\varepsilon$, capturing the deviation of the current effect as a finite weighted sum of q previous deviations of the effects, plus a random error. In the first order MA(1), the error of the current value is directly correlated with the error of the immediate past observation, equal to $e_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}$, where $\varepsilon_t$, $\varepsilon_{t-1}$ are the errors at times t and t−1, and $\theta_1$ is the first order moving average coefficient. Both the AR model and the MA model can be extended to a higher order model in order to include more information than that of the immediate past observation.
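As an isolated illustration of the autoregressive part (not a full ARIMA fit, which would also estimate the I(d) and MA(q) components), a one-step AR(1) forecast can be sketched as:

```python
# One-step AR(1) forecast: estimate the lag-1 coefficient phi_1 from the
# mean-centred series via the lag-1 autocovariance, then predict the next
# deviation from the mean as phi_1 times the last observed deviation.

def ar1_forecast(effects):
    m = sum(effects) / len(effects)
    c = [e - m for e in effects]                # centred series
    num = sum(a * b for a, b in zip(c, c[1:]))  # lag-1 autocovariance
    den = sum(a * a for a in c)                 # variance
    phi1 = num / den if den else 0.0
    return m + phi1 * c[-1]

# An alternating series has strong negative lag-1 correlation.
print(ar1_forecast([1, -1, 1, -1, 1, -1]))
```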

4  Statistical Properties of Time Series Data

The field of statistics defines a number of properties which a data series has to have to be modelled by time series prediction methods. Different methods have been made available to accommodate different types of data. The relevant statistical properties are

  • Linearity of the relationship between dependent and independent variables,

  • Normality of the error (residual) distribution,

  • Independence of the errors (residuals),

  • Homoscedasticity, which defines the variance of the errors as constant with time, and

  • Stationarity of the mean and standard deviation.

Linear regression requires the first four properties; SMA and EWMA assume only the last two. It is generally accepted that ARIMA produces accurate predictions only after a threshold of 50 data points. In addition, ARIMA requires the data to meet all but the first assumption.

4.1  Testing for the Statistical Properties

For each of the five properties, we perform statistical tests which reveal to what degree the property exists in the given data series.

4.1.1  Linearity

Linearity can be determined visually by plotting the data. A more reliable method is to check for the statistical significance of the linearity by fitting a linear model and reporting the p value, which indicates the significance of the regression. A p value < .05 indicates that the coefficient belonging to the linear term in the model is nonzero, which implies that the time series is linear, with all data points having equal influence on the fitted line. The hypothesis is formulated as follows:

  • H0: The correlation coefficient is equal to zero.

  • H1: The correlation coefficient is not zero.
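The statistic behind this test can be computed directly from the series. The sketch below returns the t statistic $t = r\sqrt{(n-2)/(1-r^2)}$, which under H0 follows a t-distribution with n−2 degrees of freedom; obtaining the exact p value additionally requires the t-distribution CDF, available in statistical libraries.

```python
import math

# Compute the correlation coefficient r between the effects and the
# iteration index, and the associated t statistic for testing H0: r = 0.
# (r = +/-1 would make the statistic undefined; real EA feedback is noisy.)

def linearity_t_statistic(effects):
    n = len(effects)
    times = range(1, n + 1)
    mean_e = sum(effects) / n
    mean_t = sum(times) / n
    cov = sum((e - mean_e) * (x - mean_t) for e, x in zip(effects, times))
    var_e = sum((e - mean_e) ** 2 for e in effects)
    var_t = sum((x - mean_t) ** 2 for x in times)
    r = cov / math.sqrt(var_e * var_t)
    return r * math.sqrt((n - 2) / (1 - r * r))

# A clearly increasing series yields a large |t|; an oscillating series
# with no trend yields a t statistic near zero.
print(linearity_t_statistic([0.1, 0.25, 0.28, 0.45, 0.5]))
```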

4.1.2  Normality

The calculation of the confidence intervals for the model coefficients in linear regression is based on the assumptions of normally distributed errors. In terms of PQA, normality of errors implies that most of the effects of the parameter values are expected to fall close to the regression line and very few to fall far from the line. Since the estimation of model coefficients is based on the minimisation of the squared error, the presence of extreme data points can exert a disproportionate influence on the estimates. To check for the normality assumption, we use the Kolmogorov-Smirnov (KS) test (Gibbons and Chakraborti, 2003) which tests the null hypothesis that the errors of the effects are normally distributed. The null hypothesis is rejected if the p value is smaller than 0.05, which indicates that the data cannot be considered normally distributed. The hypothesis is formulated as follows:

  • H0: The data follow a normal distribution.

  • H1: The data do not follow a normal distribution.

4.1.3  Independence

Linear regression models make the assumption that the error terms are independent. One example of the violation of this assumption is autocorrelation, which occurs if each error term in the time series is related to its immediate predecessor. The Durbin-Watson (DW) statistical test measures the correlation between the error terms and their immediate predecessors. The D statistic tests the null hypothesis that the errors from a least-squares regression are not autocorrelated. The values of D vary in the interval [0, 4]. For independent error terms, the value of D is expected to be close to 2. The value of D tends to be smaller than 2 for positive autocorrelation between the terms, and greater than 2 for negative autocorrelation.

  • H0: The autocorrelation of errors is zero.

  • H1: The autocorrelation of errors is greater than zero.

The null hypothesis is rejected if the p value is less than .05.
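The D statistic itself has a simple closed form; a sketch on the residuals:

```python
# Durbin-Watson statistic: D = sum((e_t - e_{t-1})^2) / sum(e_t^2).
# D near 2 suggests independent errors; D well below 2 suggests positive
# autocorrelation, and D well above 2 suggests negative autocorrelation.

def durbin_watson(residuals):
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

print(durbin_watson([1, 1, 1, 1, -1, -1, -1, -1]))  # 0.5: positive autocorrelation
print(durbin_watson([1, -1, 1, -1]))                # 3.0: negative autocorrelation
```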

4.1.4  Homoscedasticity

Homoscedasticity refers to the assumption that the effects of parameter values exhibit a similar variance with time. If homoscedasticity is violated, it is difficult to measure the true standard deviation of the residuals, which can result in confidence intervals that are too wide or too narrow. In particular, if the variance of the errors increases with time, confidence intervals for predictions will tend to be unrealistically narrow. Heteroscedasticity may also have the effect of placing too much weight on a small subset of the data, that is, the subset where the error variance is the largest when estimating coefficients.

Residuals are tested for homoscedasticity using the Breusch-Pagan (BP) test, which regresses the squared residuals on the independent variables. The null hypothesis for the BP test is homoscedasticity. The alternative hypothesis is that the error variance varies with a set of regressors.

  • H0: The data exhibit a similar variance with time.

  • H1: The error variance varies with a set of regressors.

If the p value is less than .05, the null hypothesis is rejected, which means that the data are not homoscedastic.

4.1.5  Stationarity

A time series is stationary if the mean and standard deviation remain the same with time. Formally, the effects of parameter values are said to be stationary if the distribution of the time series measured during the time segment [1, t] is the same as the distribution of the effects shifted by some lag k, that is, in the time segment [1+k, t+k].

In the application of the PQA, this means that it does not matter when the effects are observed, since the distribution of the parameter effects does not depend on time. In the case of the mutation rate, the assumption is that the diffuse part of the search, which favours large mutation rate values, is followed by a more focused part, which favours lower values. Hence, a decreasing trend in the time series is expected.

To check for the stationarity of the effects of parameter values, we use the Kwiatkowski-Phillips-Schmidt-Shin (KPSS; Kwiatkowski et al., 1992) test, which tests the null hypothesis that the data are stationary, formulated as follows:

  • H0: The data are stationary.

  • H1: The data are not stationary.

If the p value is less than .05, the null hypothesis is rejected, which means that the data are not stationary.
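Short of running the full KPSS test (which requires its tabulated critical values), a rough first check of stationarity is to compare the mean and standard deviation of the two halves of the series; the function name here is illustrative:

```python
import math

# Informal stationarity check: split the effect series in half and
# compare the mean and (population) standard deviation of each half.
# A large shift between halves suggests the series is not stationary.

def half_stats(series):
    mid = len(series) // 2
    stats = []
    for half in (series[:mid], series[mid:]):
        m = sum(half) / len(half)
        sd = math.sqrt(sum((x - m) ** 2 for x in half) / len(half))
        stats.append((m, sd))
    return stats

# A trending series shows a clear shift in the half means.
print(half_stats([1, 2, 3, 10, 11, 12]))
```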

5  Design of Experiments

We are interested in investigating statistical properties of the time-series data obtained from the effects of parameter values over time. Typical EA parameters whose adaptation is meaningful to automate using parameter control are the mutation rate, crossover rate, the mutation and crossover operators, the population, and mating pool sizes. More specialised EA implementations may have additional parameters. In this work we limit the investigation to the parameters that EAs most commonly use.

The real-valued parameters (mutation rate, crossover rate, population size, and mating pool size) are discretised into intervals. The quality attribution targets the intervals, and the selection strategy samples new parameter values from the intervals chosen according to the quality attribution mechanism. The definition of the interval bounds is likely to affect the performance of the algorithm and potentially the properties of the performance data to analyse. Therefore, we perform the experiments twice: once with two intervals spanning the whole meaningful parameter range, and a second time with four intervals dividing the same range. All intervals are evenly spread within the range.
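
The discretisation step can be sketched as follows; the bounds below are illustrative, and Table 1 lists the exact ranges used in the experiments (whose interval endpoints are chosen to avoid overlap, e.g. [0.001,0.249] and [0.25,0.49]).

```python
def discretise(lower, upper, k):
    # Split a parameter range into k evenly spread intervals, e.g. the
    # mutation rate's meaningful range into two or four sub-ranges.
    width = (upper - lower) / k
    return [(lower + i * width, lower + (i + 1) * width) for i in range(k)]

print(discretise(0.0, 1.0, 4))      # four even intervals over [0, 1]
print(discretise(0.001, 0.49, 2))   # illustrative mutation-rate split
```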

It is obvious that the practitioner is still left with the task of defining the range bounds for meaningful values of tunable parameters. For most well-known parameters, bounds are given in the literature. When practitioners customise algorithms and in the process define new parameters, the definition of range bounds is an unavoidable task even when using adaptive parameter control methods. Inexperienced practitioners would best avoid this dilemma by adhering to well-known EAs which permit the application of known bounds as listed in Table 1.

Table 1:
Ranges/values of parameters analysed.
ParameterRanges/values
Mutation rate, 2 ranges [0.001,0.249], [0.25,0.49] 
Mutation rate, 4 ranges [0.001,0.1249], [0.125,0.249], [0.25,0.3749], [0.375,0.49] 
Crossover rate, 2 ranges [0.6,0.79], [0.8,0.99] 
Crossover rate, 4 ranges [0.6,0.69], [0.7,0.79], [0.8,0.89], [0.9,0.99] 
Population size, 2 ranges [20,60], [61,100] 
Population size, 4 ranges [20,40], [41,60], [61,80], [81,100] 
Mating pool size, 2 ranges [0.1,0.39], [0.4,0.69] 
Mating pool size, 4 ranges [0.1,0.249], [0.25,0.39], [0.4,0.549], [0.55,0.69] 
Mutation operator Single-point, uniform 
Crossover operator Single-point, uniform 
ParameterRanges/values
Mutation rate, 2 ranges [0.001,0.249], [0.25,0.49] 
Mutation rate, 4 ranges [0.001,0.1249], [0.125,0.249], [0.25,0.3749], [0.375,0.49] 
Crossover rate, 2 ranges [0.6,0.79], [0.8,0.99] 
Crossover rate, 4 ranges [0.6,0.69], [0.7,0.79], [0.8,0.89], [0.9,0.99] 
Population size, 2 ranges [20,60], [61,100] 
Population size, 4 ranges [20,40], [41,60], [61,80], [81,100] 
Mating pool size, 2 ranges [0.1,0.39], [0.4,0.69] 
Mating pool size, 4 ranges [0.1,0.249], [0.25,0.39], [0.4,0.549], [0.55,0.69] 
Mutation operator Single-point, uniform 
Crossover operator Single-point, uniform 

5.1  Benchmark Problems

The current work on parameterising EAs is necessarily experimental in nature. A set of complex benchmark problems ensures the results are representative of typical combinatorial optimisation processes in which EAs are often used. The quadratic assignment problem (QAP) is a well-known and very complex combinatorial optimisation problem. In the current work, we use the QAP in its single-objective and multi-objective formulations. Another difficult and equally well-known problem tailored to the optimisation behaviour of EAs is the royal road (RR) function.

5.1.1  Quadratic Assignment Problem

In the QAP (Koopmans and Beckmann, 1957), n facilities have to be allocated to n locations, such that the total cost is minimised and every location has only one assigned facility. The total cost is calculated as the flow between the facilities multiplied by the costs for placing the facilities at their respective locations. More formally, the problem is modelled with two matrices, representing the cost and the flow. The aim is to assign n utilities to n locations with minimal cost. The candidate assignments are evaluated according to the following cost function:
f(u) = Σ_{k=1}^{n} Σ_{l=1}^{n} Σ_{i=1}^{n} Σ_{j=1}^{n} C_{kl} D_{ij} u_{ki} u_{lj}    (7)
where
  • n is the number of facilities and locations,

  • Dij is the distance between location i and location j,

  • Ckl is the cost of the flow between neighbouring utilities k and l,

  • uik is 1 if utility i is assigned to location k, and 0 otherwise.

QAP does not allow multiple assignments to the same location, hence solutions are subject to the following constraints:
Σ_{k=1}^{n} u_{ik} = 1,   i = 1, …, n    (8)

Σ_{i=1}^{n} u_{ik} = 1,   k = 1, …, n    (9)

u_{ik} ∈ {0, 1},   i, k = 1, …, n    (10)
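
Under a permutation encoding, which satisfies the one-utility-per-location constraints of Equations (8)-(10) by construction, the cost function of Equation (7) can be sketched as follows (matrix values are illustrative):

```python
def qap_cost(perm, dist, flow):
    # perm[k] = location assigned to utility k. The cost sums, over all
    # pairs of utilities, their flow multiplied by the distance between
    # their assigned locations.
    n = len(perm)
    return sum(flow[k][l] * dist[perm[k]][perm[l]]
               for k in range(n) for l in range(n))

dist = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]  # distances between locations
flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]  # flows between utilities
print(qap_cost([0, 1, 2], dist, flow))  # 38
print(qap_cost([2, 1, 0], dist, flow))  # 34: a cheaper assignment
```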
Single-Objective Formulation
For the purpose of this experiment a well-known instance (TAI30A) from the QAPlib (Burkard et al., 1991) was used as a benchmark for the optimisation process whose parameter values are adaptively set. The scales of the fitness values vary considerably over different QAP instances. For ease of comparison, we normalise the fitness values according to Equation (11).
f_norm(x) = (f_max − f(x)) / (f_max − f_min)    (11)
where fmax is the maximum value and fmin is the minimum value that the fitness function f can take. Since QAP is a minimisation problem, the normalisation of the results converts it into a maximisation problem.
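
A minimal sketch of this normalisation, assuming Equation (11) maps lower cost to higher normalised fitness so that minimisation becomes maximisation:

```python
def normalise(f, f_min, f_max):
    # Maps a raw QAP cost into [0, 1]; lower cost -> higher normalised
    # fitness, so the minimisation problem becomes a maximisation one.
    return (f_max - f) / (f_max - f_min)

print(normalise(30, 20, 120))  # 0.9: close to the best possible cost
```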
Multi-Objective Formulation
The multi-objective model of QAP (mQAP) concerns itself with the flow of more than one type of item. A practical application may be a hospital layout problem, where the goal is to optimise the distances that doctors, visitors, and patients have to travel each day. The cost of assigning all facilities to locations is evaluated according to Equation (7) for each of the objectives. The task is to optimise all m objectives according to Equation (12).
min F(u) = (f_1(u), f_2(u), …, f_m(u))    (12)

The instances used for the current work were published by Knowles and Corne (2003). File names ending in rl denote instances designed to be real-like with rugged fitness landscapes.

5.1.2  The Royal Road Problem

The RR functions are parametrised landscapes, first introduced by Mitchell, Forrest, and Holland (Jones, 1994). They are especially devised to demonstrate that there exist problems which are easier to solve using a genetic algorithm (GA) than a hill-climbing algorithm.

The function is used to define a search task in which one wants to locate strings that produce high fitness values. The string is composed of 2^k nonoverlapping contiguous sections, each of length b + g, where b is known as the block and g is known as the gap. These sections form the first-level schemata, composed of short, low-order, highly fit schemata or building blocks. A schema s_i is a combination of ones, zeros, and the special character “*”, which is interpreted as “do not care.” Lower-level schemata are used to form intermediate-level higher-fitness schemata, which in turn are combined to create even higher-fitness schemata.

The fitness calculation of a string x proceeds in two steps, the part calculation and the bonus calculation. The overall fitness assigned to that string is the sum of these two calculations. The part calculation considers each block individually and assigns a fitness according to the number of 1s. The block fitnesses are later summed to form the total fitness of the part calculation. The fitness function RRP(x) for the part calculation of a string x is defined as:
formula
13
where
formula
14

The aim of the bonus calculation is to reward complete blocks and some combinations of complete blocks. Holland's approach assigns rewards for attaining a certain level. At the lowest level, rewards are given for complete blocks: the first complete block receives a fixed base reward, and any additional complete blocks receive a fitness of u.
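
The two-step fitness can be sketched as follows. The part calculation is simplified here to counting 1s per block, and the bonus constants u_star and u are illustrative stand-ins for Holland's values rather than the paper's exact settings.

```python
def royal_road_fitness(x, b, g, k, u_star=1.0, u=0.3):
    # Hedged sketch: the string x has 2**k sections of length b + g; the
    # first b bits of each section form the block, the trailing g bits
    # the gap. Part calculation: each block scores the number of 1s it
    # contains (a simplified reading of Eqs. 13-14). Bonus calculation:
    # the first complete block earns u_star, each further one earns u.
    blocks = [x[i * (b + g): i * (b + g) + b] for i in range(2 ** k)]
    part = sum(block.count('1') for block in blocks)
    complete = sum(1 for block in blocks if block == '1' * b)
    bonus = u_star + (complete - 1) * u if complete else 0.0
    return part + bonus

# Two sections (k=1) with block length 3 and gap length 2:
print(royal_road_fitness('1110011001', b=3, g=2, k=1))  # 5 ones + 1.0 bonus
```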

5.2  Data Collection

The data series we wish to analyse is the performance feedback data for the typical EA parameters identified. The data gathered are the success rate (Equation (1)) of the particular parameter or interval. The success rate is calculated as a ratio of the times the value or interval was used successfully to the number of times the value or interval was used. Successful usage is defined as the optimisation outcome producing above-average fitness values, where the average fitness is calculated across all new population members produced in the iteration under scrutiny.
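
The success-rate bookkeeping described above can be sketched as follows; the variable names and the toy fitness values are illustrative.

```python
def success_rate(offspring_fitness, used_by_value):
    # Success rate per parameter value/interval: the fraction of its
    # uses that produced above-average fitness, where the average is
    # taken over all offspring of the iteration.
    avg = sum(offspring_fitness) / len(offspring_fitness)
    rates = {}
    for value, indices in used_by_value.items():
        successes = sum(1 for i in indices if offspring_fitness[i] > avg)
        rates[value] = successes / len(indices)
    return rates

fitness = [0.2, 0.9, 0.5, 0.7, 0.1, 0.6]   # iteration average ~ 0.5
usage = {'low_mut': [0, 2, 4], 'high_mut': [1, 3, 5]}
print(success_rate(fitness, usage))
```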

To investigate the characteristics of the data, we ran as many parallel trials as we had discrete parameter values/intervals to investigate. In each trial, one of the parameters was the design factor and all others were set to fixed values which were identical in all parallel trials. It is clear that this procedure cannot eliminate possible interactions between certain parameter values. However, we can expect to reduce these interactions to a possible minimum. Every instance was granted 15,000 function evaluations. Every 150 function evaluations, the effect of the parameter intervals was recorded. In total, each experiment produces 100 data points for each parameter range. Approximate algorithms are not expected to deliver exact and repeatable results. Therefore, for the current data analysis, all trials were repeated 30 times for each parameter value.

Two example time series recording the success rate of the mutation rate in the ranges [0.001,0.249) and (0.25,0.49] are plotted in Figure 2. The scatter plots of the data (with time on the x axis) are useful for establishing trends in the time series. It can be observed that both time series are nonlinear. In fact, neither of them seems to exhibit any particular trend. Furthermore, the value-specific or interval-specific time series data from the optimisation process are tested for linearity, normality, independence, homoscedasticity, and stationarity using the statistical tests described in the previous sections. The results from the statistical tests that evaluate the linearity, normality of the residuals (KS test), independence of the residuals (DW test), homoscedasticity of the time series (BP test), and stationarity of the data (KPSS test) are shown in Table 2.

Figure 2:

Scatterplot of the time series recording the success rate of the mutation rate.


Table 2:
p value of statistical tests.
Parameter range                 Linearity test   KS test   DW test   BP test   KPSS test
Mutation rate [0.001,0.249)     0.1435           0.9684              <0.01     0.04
Mutation rate (0.25,0.49]       0.2366           0.4445              <0.01     0.1

The p value from fitting a linear model to the data helps decide whether there is a significant relationship between the variables in the linear regression model of the time series at a .05 significance level. The p values are greater than .05 for both ranges of the mutation rate, hence the null hypothesis is not rejected and we can say that there is no significant relationship between the variables of the regression model. It can be concluded that the time series in these two examples are not linear.

The KS test is employed to test the assumption of normally distributed errors. The p values for both time series are greater than .05, which means that the null hypothesis stating that the data follow a normal distribution is not rejected. To test for the independence of error terms we record the p value from the DW test, which checks whether the errors from a least-squares regression are autocorrelated. The null hypothesis stating that the autocorrelation of errors is zero is rejected if the p value is less than .05. The p values from the DW test for the two time series are not less than .05, hence the null hypothesis is not rejected. This indicates that the error terms are not autocorrelated. Residuals are tested for homoscedasticity using the BP test, with the null hypothesis that the data exhibit a similar variance with time; it is rejected with a p value less than .05 for both time series. Finally, the stationarity of the time series is analysed using the KPSS test. The p value from testing the time series of the mutation rate in the range [0.001, 0.249) is less than .05, which indicates that this time series is not stationary. In contrast, the time series of the mutation rate in the range (0.25, 0.49] is stationary, with a p value equal to .1.

6  Evaluation of Statistical Properties

Table 3 summarises the results from the statistical tests when the continuous values have been divided into two ranges. Table 4 lists the same outcomes in the case where four intervals were used. The discrete parameters are listed in Table 5. The percentages represent the runs among the 30 trials in which H0 was not rejected. We use these results to derive conclusions regarding the appropriateness of using forecasting techniques to predict the effects of the parameter values.

Table 3:
Characteristics of the effects of the continuous parameter values when the ranges are divided into two intervals. The percentages represent the percentage of times (of a total 30) H0 was not rejected.
Parameter        Problem      Linear (%)  Normal (%)  Independent (%)  Homoscedastic (%)  Stationary (%)
Mutation rate QAP  97 100  80 90  97 
[0.0010, 0.25] MQAP  97 100  93 93  93 
 Royal Road  93  90  87 80  80 
Mutation rate QAP 100 100  70 97 100 
[0.2505, 0.5] MQAP 100 100  73 93 100 
 Royal Road  93  93  93 87 100 
Crossover rate QAP 100 100  83 90  93 
[0.6, 0.799] MQAP 100 100  93 90  90 
 Royal Road  93  93  93 87  87 
Crossover rate QAP 100 100  83 97  93 
[0.8, 1.0] MQAP 100 100  83 90  90 
 Royal Road 100  97  80 87  90 
Population QAP 100 100 100 53  13 
size MQAP 100 100  90 50  13 
[20, 60] Royal Road  90  97  80 43  10 
Population QAP  33  73 100 77  13 
size MQAP  33  80  83 73  17 
[61, 100] Royal Road  30  97  80 87  13 
Mating pool QAP 100 100  77 93  97 
[0.1, 0.4] MQAP 100 100  80 93  97 
 Royal Road 100  97  73 87  90 
Mating pool QAP 100 100  80 87 100 
[0.41, 0.7] MQAP 100 100  80 90 100 
 Royal Road 100  97  80 87 100 
Table 4:
Characteristics of the effects of continuous parameter values with four ranges. The percentages represent the percentage of times (of a total 30) H0 was not rejected.
Parameter        Problem      Linear (%)  Normal (%)  Independent (%)  Homoscedastic (%)  Stationary (%)
Mutation rate QAP  97  97  97 90  80 
[0.001, 0.124] MQAP  97  97  97 90  80 
 Royal Road  90  90  87 80  77 
Mutation rate QAP 100 100 100 80  63 
[0.125, 0.249] MQAP  97  93  97 83  63 
 Royal Road 100  97  97 80  60 
Mutation rate QAP 100 100  97 90  87 
[0.25, 0.3749] MQAP 100  97  97 90  80 
 Royal Road  97 100 100 93  83 
Mutation rate QAP 100  90  97 87  80 
[0.375, 0.49] MQAP 100  90  93 87  83 
 Royal Road  97  77  93 83  80 
Crossover rate QAP 100 100  93 83  80 
[0.6, 0.69] MQAP  97 100  97 87  83 
 Royal Road 100  93  90 83  83 
Crossover rate QAP 100 100  97 90  73 
[0.7, 0.79] MQAP 100 100  97 93  73 
 Royal Road 100  97  93 80  70 
Crossover rate QAP 100 100  97 87  80 
[0.8, 0.89] MQAP 100 100  93 90  83 
 Royal Road 100  90  93 80  77 
Crossover rate QAP 100 100 100 83  93 
[0.9, 0.99] MQAP 100  97  97 90  90 
 Royal Road  97  90  97 77  83 
Population QAP 100  97 100 40  10 
size MQAP 100 100  97 50  13 
[20, 40] Royal Road  97  80  97 47  10 
Population QAP 100 100 100 40  3 
size MQAP  97  97 100 53  7 
[41, 60] Royal Road  97 100  97 40  10 
Population QAP 100 100 100 57  17 
size MQAP 100  97  97 50  13 
[61, 80] Royal Road  93  93  97 53  10 
Population QAP 100 100 100 53  47 
size MQAP 100  97 100 47  50 
[81, 100] Royal Road  97 100  90 50  43 
Mating pool QAP 100 100  90 80  63 
[0.1, 0.249] MQAP 100  97  93 83  70 
 Royal Road  97  93  93 80  70 
Mating pool QAP 100 100  97 83  97 
[0.25, 0.39] MQAP 100  97  97 90  90 
 Royal Road  93  93  97 80  90 
Mating pool QAP 100 100 100 80  70 
[0.4, 0.549] MQAP 100  97 100 83  83 
 Royal Road  97 100  97 87  83 
Mating pool QAP 100 100  97 80  77 
[0.55, 0.69] MQAP 100  97  97 80  77 
 Royal Road  90  93  93 83  73 
Table 5:
Characteristics of the effects of the discrete parameter values. The percentages represent the percentage of times (of a total 30) H0 was not rejected.
Parameter        Problem      Linear (%)  Normal (%)  Independent (%)  Homoscedastic (%)  Stationary (%)
Single-point QAP 100 100 70 93 97 
mutation MQAP 100  97 70 83 83 
 Royal Road  90  90 73 87 87 
Uniform QAP 100 100 77 93 93 
mutation MQAP 100  97 80 83 90 
 Royal Road  97  87 73 87 90 
Single-point QAP 100 100 77 87 93 
crossover MQAP 100  97 80 83 90 
 Royal Road  97  93 73 87 87 
Uniform QAP 100 100 77 97 93 
crossover MQAP 100  97 77 83 93 
 Royal Road 100  90 73 87 83 

In general, the time series data are linear with normally distributed errors. The lowest number of normally distributed residuals resulted from the experiment with the population size in the interval [61,100], in which the null hypothesis was rejected in 27% of the cases (cf. Table 3). The time series belonging to the population values in this range is also the only time series that was not linear in the majority of the cases (the null hypothesis was rejected in 70% of the runs).

The DW test used to verify the assumption of no correlation between the error terms of the time series shows that performance data of parameter values are independent for the majority of the cases, with the lowest percentage scored by the mutation rate in the interval [0.2505,0.5] (70% of the runs have independent errors).

The BP test was used to measure the homoscedasticity of the time series data. In general, homoscedasticity holds for the majority of the time series data, although for most parameter intervals the null hypothesis is confirmed in only 80-90% of the runs. The lowest percentages of homoscedastic runs occur in the experiments with the ranges [20,40] and [41,60] of the population size, in which only 40% of the runs are homoscedastic. Similar results were observed in the other ranges of the population size.

The KPSS test verifies the null hypothesis that the data are stationary. It can be observed that stationarity is a common feature in the effects of the single-point mutation, uniform mutation, single-point crossover, and uniform crossover operators. Similarly, in some of the intervals of the mutation rate ([0.0010, 0.25] and [0.2505, 0.5]), crossover rate ([0.6, 0.799], [0.8, 1.0], and [0.9, 0.99]) and mating pool size ([0.1, 0.4], [0.41, 0.7], and [0.25, 0.39]) the null hypothesis was not rejected in more than 90% of the runs with the mating pool size in the interval [0.41, 0.7] and mutation rate in the range [0.2505, 0.5] being the most stationary processes (100% of the runs are stationary).

However, the crossover rate in the interval [0.7, 0.79], the mutation rate in the interval [0.125, 0.249], and the mating pool size in the intervals [0.1, 0.249], [0.4, 0.549], and [0.55, 0.69] have a moderately low number of trials which confirm stationarity. Furthermore, less than 20% of the runs in all ranges of the population size confirm stationarity. The time series belonging to the population size in the interval [41, 60] has the lowest percentage of confirmation of the null hypothesis.

For the assessment of the suitability of the forecasting methods linear regression (LR), simple moving average (SMA), exponentially weighted moving average (EWMA), and autoregressive integrated moving average (ARIMA) based on statistical properties, it is necessary to reach a conclusion regarding the parameters themselves instead of the individual value options. For the comparison, the assumptions made by the considered forecasting models are summarised in Table 6.

Table 6:
Assumptions of the forecasting models.
         Linearity   Normality   Independence   Homoscedasticity   Stationarity
LR       ✓           ✓           ✓              ✓                  –
SMA      –           –           –              ✓                  ✓
EWMA     –           –           –              ✓                  ✓
ARIMA    ✓           ✓           –              ✓                  ✓

Fortunately, the characteristics of parameter value effects are reasonably consistent over the values or intervals of the same parameter. We also find that dividing the parameter value ranges into two versus four intervals causes no significant differences in the behaviour of the statistical properties.

The results flag stationarity as the least satisfied condition, where the null hypothesis can be confirmed for some intervals in only around 70% of the trials. Excluding population size for its problematic outcomes on many assumptions, most parameters’ interval-specific results are mixed regarding stationarity. This would suggest that the best choice for all of these is LR, alternatively ARIMA with an appropriate differencing method. The condition of independence of errors seems threatened only when larger intervals are used (Table 3). Larger intervals seem to complicate the attribution of causality of parameter effects. If we assume smaller intervals, independence of errors seems unproblematic.

Homoscedasticity has low values only in the case of the mating pool size intervals. A decision has to be made whether a homoscedastic result in around 80% of the cases is sufficient to regard the mating pool size effect data as homoscedastic. Since homoscedasticity is a prerequisite for the use of all considered models, we can only accept these low scores and investigate whether the use of the forecasting methods on the mating pool size leads to good overall results.

The parameter of population size scores poorly on both the test for homoscedasticity and the test for stationarity. As all suggested forecasting models require homoscedasticity, we must conclude that none of these methods can be expected to provide accurate performance predictions for population size values. Perhaps the least unsuitable method for this parameter is LR, for the reason that LR does not require stationarity.

Consequently, the resulting summary Table 7, which shows the suitability of forecasting methods for parameters, does not display a “+” against population size for any of the discussed forecasting methods. This indicates that the time series data of this parameter only partially follow the assumptions made by the forecasting models, which therefore cannot be recommended for use without reservation. All other parameters have been marked with “+” for all forecasting methods considered. This decision has been made notwithstanding the limitations discussed above.

Table 7:
Recommendations on the suitability of forecasting methods for parameters.
Parameter            LR   SMA   EWMA   ARIMA
Mutation rate        +    +     +      +
Crossover rate       +    +     +      +
Population size
Mating pool size     +    +     +      +
Mutation operator    +    +     +      +
Crossover operator   +    +     +      +

7  Performance Comparison of the Prediction Methods

The parameters mutation rate, crossover rate, mating pool size, and the mutation and crossover operators are all similar in their adherence to the assumptions made by the considered prediction models. From the outcome of the conformity tests we cannot definitively conclude which prediction model is best suited for each of them. Therefore, in this section, we apply all considered forecasting methods to each of the parameters except the population size, which conforms too poorly to several of the assumptions.
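
For concreteness, one-step-ahead forecasts with the three simpler models can be sketched as follows (ARIMA is omitted for brevity; the window and smoothing settings are illustrative, not the study's tuned values):

```python
def predict_next(series, method='lr', window=5, alpha=0.3):
    # One-step-ahead forecast of a success-rate series.
    if method == 'sma':                   # simple moving average
        w = series[-window:]
        return sum(w) / len(w)
    if method == 'ewma':                  # exponentially weighted MA
        s = series[0]
        for y in series[1:]:
            s = alpha * y + (1 - alpha) * s
        return s
    # 'lr': least-squares line through (t, y), evaluated at t = n.
    n = len(series)
    mt, my = (n - 1) / 2, sum(series) / n
    num = sum((t - mt) * (y - my) for t, y in enumerate(series))
    den = sum((t - mt) ** 2 for t in range(n))
    slope = num / den
    return my + slope * (n - mt)

series = [0.1 * i for i in range(10)]  # linear upward trend 0.0 .. 0.9
print(predict_next(series, 'lr'))      # follows the trend (about 1.0)
print(predict_next(series, 'sma'))     # lags behind: mean of last 5
```

On a trending series, LR extrapolates while SMA and EWMA lag, which mirrors the behaviour discussed in the results below only insofar as the series actually has a trend.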

To avoid overfitting of the forecasting models, we hold out a part of the data to test whether the model generalises well. We randomly split the data into a training and a testing set, and fit the model to the training set by minimising the squared error. The model is then evaluated on the testing set by measuring the squared error and comparing it to that of the training set.
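
The holdout check can be sketched as follows; the split fraction and seed are illustrative, as the exact split used in the study is not specified here.

```python
import random

def train_test_mse(points, test_fraction=0.3, seed=0):
    # Randomly reserve part of the (t, y) pairs, fit a least-squares
    # line to the rest, and report squared error on both sets; a test
    # error far above the training error suggests overfitting.
    rng = random.Random(seed)
    pts = points[:]
    rng.shuffle(pts)
    cut = int(len(pts) * (1 - test_fraction))
    train, test = pts[:cut], pts[cut:]
    mt = sum(t for t, _ in train) / len(train)
    my = sum(y for _, y in train) / len(train)
    num = sum((t - mt) * (y - my) for t, y in train)
    den = sum((t - mt) ** 2 for t, _ in train)
    b = num / den
    a = my - b * mt
    mse = lambda data: sum((y - (a + b * t)) ** 2 for t, y in data) / len(data)
    return mse(train), mse(test)

data = [(t, 0.02 * t + 0.1) for t in range(50)]
train_err, test_err = train_test_mse(data)
print(train_err, test_err)  # both near zero for noise-free linear data
```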

7.1  Experimental Settings

In the first instance, mutation rate, crossover rate, mating pool size, crossover operator, and mutation operator are set adaptively using predictive parameter control. The population size is not discussed in this section due to its problematic statistical properties. The parameter values or intervals used for these trials are listed in Table 1.

We performed 30 runs for each optimisation scheme with different quality attribution strategies. The parameter value effect is measured based on the fitness improvement over the previous iteration. In the multi-objective setting we use the hypervolume indicator, which has been identified as the best unary measure to reflect the fitness of a population in a multi-objective setting (Zitzler et al., 2003). The results are presented as the mean of 30 runs over 55 iterations. The problem instance used for QAP is BUR26E; for mQAP we used KC30-3fl-2rl. The first matrices of the Bur* problems contain average typists' typing times, whereas the second contain the frequencies of combinations of two letters. The multi-objective problem KC30-3fl-2rl has 30 facilities/locations, three objectives, and is the second real-like instance of this size created by Knowles and Corne (2003).
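
As an aside on the hypervolume indicator: for a two-objective minimisation front it reduces to the area dominated by the front up to a reference point, which can be sketched as below (the experiments here use a three-objective mQAP instance, for which the same idea applies volumetrically).

```python
def hypervolume_2d(points, ref):
    # Hypervolume of a two-objective minimisation front: the area
    # dominated by the front and bounded above by the reference point.
    # Assumes the points are mutually nondominated (the second
    # objective decreases as the first increases once sorted).
    front = sorted(points)          # ascending in the first objective
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

front = [(3, 1), (1, 3), (2, 2)]    # input order does not matter
print(hypervolume_2d(front, ref=(4, 4)))  # 6.0
```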

7.2  Results

The development of the population fitness averaged over 30 trials, plotted in Figure 3, shows that using LR for the parameter optimisation helps find better solutions at an earlier stage of the optimisation process. In the case of the QAP problems, single-objective and multi-objective, the quality achieved by LR is not attained by the other methods. The performances of all predictive models converge to the same quality on the RR problem. The optimisation scheme using ARIMA as a quality attribution scheme does not perform very well during the initial iterations, as could be expected given the requirement of a minimum of 50 data points.

Figure 3:

Performance development of the EA with different forecasting techniques.


On the other hand, certain ranges of the mutation rate, crossover rate, and mating pool size meet the stationarity assumption, which is a prerequisite of SMA, EWMA, and ARIMA (see Table 6), in only a moderate number of runs. As a result, these models may not fit the time series data we are controlling perfectly. This explains the weaker performance of the SMA, EWMA, and ARIMA models when compared with LR. As the differences in performance between the models are not statistically significant, the results suggest that LR can be regarded as representative of all considered prediction models.

8  Experimental Evaluation

The experimental validation of predictive parameter control using the considered techniques is carried out on all parameters regardless of their adherence to the statistical properties established in the previous sections. Practitioners will not, in general, have the leisure to perform tests on the statistical properties of parameters. We are therefore obliged to provide recommendations on how to proceed when assumptions are not met or the adherence to statistical assumptions is unknown. We provide a validation for all parameters using all forecasting techniques to establish the potential for adverse effects on the performance of the EA.

Having investigated the appropriate forecasting method to use for PQA, we investigate whether using prediction as part of adaptive parameter control is beneficial to the result quality. The PQA method using LR is compared to the average and extreme quality attribution schemes, which are currently the best-performing quality attribution strategies.

8.1  Benchmark Quality Attribution Strategies

The PQA strategy is compared to two prominent methods: an average quality attribution (AQA) and an extreme quality attribution (EQA) strategy. AQA is a relatively simple algorithm, which uses a time window of w iterations to approximate the quality of parameter values as their average effect on algorithm performance. The AQA is given in Equation (15).
q_t(v) = (1 / n_v) Σ_{i=t−w}^{t} e_i(v)    (15)
where e_i(v) is the effect of parameter value v in iteration i, assessed as the fitness difference between the solution produced by applying parameter value v and its parent. The sum of the effect values of v in the last w iterations is divided by the total number of solutions created using that parameter value, denoted n_v, normalising the quality of v.
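
The sliding-window averaging of AQA can be sketched as follows (the window size and effect values are illustrative):

```python
from collections import deque

class AQA:
    # Sketch of average quality attribution: the quality of a parameter
    # value is the mean effect over a sliding window of the last w
    # recorded applications.
    def __init__(self, w=10):
        self.effects = {}
        self.w = w

    def record(self, value, effect):
        self.effects.setdefault(value, deque(maxlen=self.w)).append(effect)

    def quality(self, value):
        es = self.effects.get(value)
        return sum(es) / len(es) if es else 0.0

aqa = AQA(w=3)
for e in [0.1, 0.4, 0.2, 0.3]:      # only the last three fit the window
    aqa.record('uniform_mutation', e)
print(aqa.quality('uniform_mutation'))  # (0.4 + 0.2 + 0.3) / 3 ≈ 0.3
```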
The credit attribution strategy shown in Equation (15) is susceptible to the scale of the fitness function. The EQA strategy proposed by Fialho et al. (2010), known as the area under the curve (AUC), addresses this problem by using a rank-based strategy. The method ranks the effects of parameter values in the recent w applications in descending order. Each parameter effect is assigned a rank value w_r, where r refers to the rank position of the effect value. The higher the rank position, the lower the rank value. The final rank value for each parameter effect is calculated as:
w_r = D^r (w − r)    (16)
where D ∈ [0, 1] is the decay factor, which emphasises the influence of the top-ranked effect values on the overall quality of that parameter value; hence the method belongs to the class of EQA strategies. Next, the area under the curve is calculated by going down the sorted list and drawing the receiver operating characteristic curve starting from the origin (Fialho et al., 2010). Since AUC uses only the ranks of the parameter effects, the method is quite robust with respect to hyperparameters and fitness transformations, as shown in the experimental evaluation by Fialho et al. (2010).
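
The rank-value step can be sketched as below; the decay form D^r · (w − r) is an assumption of this sketch based on the description above, and the subsequent area-under-the-curve construction is omitted.

```python
def rank_values(effects, D=0.5):
    # Hedged sketch of the rank-based credit in Fialho et al.'s AUC:
    # sort the recent effects in descending order and assign each rank
    # r a decayed value. The decay form D**r * (w - r) is an assumption
    # of this sketch; Eq. 16 gives the exact expression.
    w = len(effects)
    ranked = sorted(effects, reverse=True)
    return [(e, D ** r * (w - r)) for r, e in enumerate(ranked)]

for effect, value in rank_values([0.2, 0.9, 0.5, 0.7]):
    print(effect, value)   # top-ranked effects dominate the credit
```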

8.2  Experimental Settings

For the benefit of the comparison between PQA, AQA, and EQA, we control the mutation rate, the crossover rate, the mutation operator, the crossover operator, the population size, and the mating pool size with values/ranges depicted in Table 1.

The experimental study in Section 4 demonstrated that the effect of the population size during the optimisation time steps does not adhere to the assumptions made by the forecasting techniques. Hence we conduct a separate set of experiments for the population size and another for the mutation rate, the crossover rate, the mutation operator, the crossover operator, and the mating pool size. The latter set of experiments is performed to examine the performance of PQA when used to predict the quality of parameter values which conform to the assumptions of the forecasting techniques. The population size is investigated for an insight into the problem—whether the use of PQA can have adverse effects on the performance of the algorithm when the assumptions of the forecasting models are clearly not met.

Approximate algorithms are not expected to deliver exact, repeatable results, but to provide good approximate solutions where exact approaches cannot be devised. Hence, results concerning the performance of approximate algorithms such as EAs are usually reported as mean values over repeated trials. For a fair comparison, the generally accepted approach is to allow the same number of function evaluations for each trial; every trial is granted 90,000 function evaluations, and each run is repeated 30 times for every quality attribution scheme. To check for statistical differences, the results of the different quality attribution schemes are validated using the Kolmogorov-Smirnov (KS) nonparametric test (Pettitt and Stephens, 1977). The final solutions of each optimisation scheme are recorded, and the results are compared using the normalised fitness for the single-objective problems and the hypervolume indicator for the multi-objective problem.
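The two-sample KS statistic used in this validation is simply the maximum vertical distance between the two empirical distribution functions. The sketch below uses small synthetic samples, not the reported results:

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    distance between the two empirical distribution functions (EDFs)."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for v in a + b:                       # the EDFs only jump at data points
        f_a = bisect.bisect_right(a, v) / len(a)
        f_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(f_a - f_b))
    return d

# Two hypothetical sets of normalised final fitnesses over repeated runs:
pqa = [0.49, 0.50, 0.51, 0.52]
aqa = [0.47, 0.48, 0.49, 0.50]
print(ks_statistic(pqa, aqa))             # → 0.5
```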

8.3  Hyperparameters

All adaptive algorithms involve hyperparameters, which need to be tuned depending on the optimisation problem at hand. This defines another optimisation problem, which can become quite computationally expensive if we attempt an exhaustive exploration of the search space. However, the number of hyperparameters is usually lower than the number of algorithm parameters that should be controlled.

The tuning of the hyperparameter values was performed using a sequential parameter optimisation (SPO) (Bartz-Beielstein et al., 2005), which tests each hyperparameter combination using several runs. To decrease the number of tests required, we employ a racing technique, which uses a variable number of runs depending on the performance of the hyperparameter configuration. Hyperparameter configurations are tested against the best configuration so far, using at least the same number of function evaluations as employed for the best configuration. The results from the tuning process are shown in Table 8.
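The racing idea can be sketched as follows. This is a minimal illustration, not SPO's exact procedure: the drop threshold `margin`, the budgets, and all names are our assumptions.

```python
def race(configs, evaluate, min_runs=5, max_runs=30, margin=0.05):
    """Toy racing loop: every hyperparameter configuration receives at
    least min_runs evaluations; a challenger is abandoned early once its
    running mean falls clearly below that of the current best.
    evaluate(config) returns one (possibly noisy) performance sample."""
    scores = {c: [evaluate(c) for _ in range(min_runs)] for c in configs}
    mean = lambda c: sum(scores[c]) / len(scores[c])
    best = max(configs, key=mean)
    survivors = [best]
    for c in configs:
        if c == best:
            continue
        while len(scores[c]) < max_runs:
            scores[c].append(evaluate(c))
            if mean(c) < mean(best) - margin:   # clearly worse: stop early
                break
        else:                                   # survived the full race
            survivors.append(c)
    return best, survivors

# Deterministic toy objective: configurations closer to 0.7 score higher.
best, survivors = race([0.1, 0.5, 0.7], lambda r: 1.0 - abs(r - 0.7))
print(best, survivors)                          # → 0.7 [0.7]
```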

Table 8:
Hyperparameters of the three credit attribution strategies.
Benchmark        Hyperparameter   Value   Description
EQA              D                0.5     Decay factor
PQA, EQA         pmin             0.1     Minimum selection probability
EQA, AQA, PQA    W                100     Window size

The only hyperparameter of the PQA method is the minimum selection probability pmin, used for the exploration of new parameter values. For this hyperparameter, we employ the same setting as the AP and PM methods, as shown in Table 8.
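A probability-matching style selection rule with such a floor, in the spirit of Thierens (2005), can be sketched as follows; the exact update rules used by AP and PM differ in detail, and all names here are ours:

```python
def selection_probabilities(qualities, p_min=0.1):
    """Map parameter-value qualities to selection probabilities while
    enforcing a minimum probability floor: every value keeps at least
    p_min probability, so unpromising values are still explored."""
    k = len(qualities)
    total = sum(qualities)
    if total == 0:                          # no quality information yet
        return [1.0 / k] * k
    return [p_min + (1.0 - k * p_min) * q / total for q in qualities]

probs = selection_probabilities([0.6, 0.3, 0.1], p_min=0.1)
print(probs)   # every entry >= 0.1 and the entries sum to 1
```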

8.4  Benchmark Problems

The optimisation schemes are employed to optimise the RR problem, described in Section 5.1.2, four well-known instances of the QAP (CHR20A, BUR26E, TAI30A, and STE36B), and three instances of the multi-objective QAP (KC10-2fl-5rl, KC30-3fl-1rl, and KC30-3fl-2rl). CHR20A is known to be among the hardest of the CHR* problems (Christofides and Benavent, 1989). The first matrices of the BUR* problems contain average typists' typing times, whereas the second contain the frequencies of combinations of two letters. TAI*a problems are considered to have rugged fitness landscapes, whereas TAI*b instances are structured (Merz and Freisleben, 2000). STE36B is a variation of a computer component placement problem with the goal of minimising the lengths of the connecting wires (Steinberg, 1961). The multi-objective problem KC10-2fl-5rl has 10 facilities/locations and two objectives, and is the fifth real-like instance of this size created by Knowles and Corne (2003). By analogy, KC30-3fl-1rl and KC30-3fl-2rl have 30 facilities and three objectives.

8.5  Results

Figure 4 shows the results of the 30 trials in which the mutation rate, crossover rate, mating pool size, mutation operator, and crossover operator were controlled. The experimental results are clearly not normally distributed, but the mean and 25th percentile of the PQA method are consistently above the respective values of the benchmarks.

Figure 4:

Boxplots of the 30 trials which control all parameters except population size. PQA is compared to the benchmark quality attribution schemes AQA and EQA.

The means and standard deviations are listed in Table 9, which shows a significant difference between the result groups of AQA, EQA, and PQA. The mean performance of PQA is consistently above the averages of the benchmarks. The performance of EQA is clearly worse than that of the other two quality attribution strategies.

Table 9:
The means and standard deviations over 30 runs of each problem instance controlling all EA parameters except population size using different quality attribution schemes: Average Quality Attribution (AQA), Extreme Quality Attribution (EQA), and Predictive Quality Attribution (PQA).
                    Mean                           Standard deviation
Problem         AQA       EQA       PQA       AQA        EQA        PQA
CHR20A        0.4909    0.4845    0.4972    1.627E–02  1.306E–02  1.920E–02
BUR26E        0.2416    0.2395    0.2422    2.049E–03  1.893E–03  2.365E–03
TAI30A        0.4909    0.4845    0.4982    1.627E–02  1.306E–02  1.870E–02
STE36B        0.7978    0.7984    0.8308    7.823E–03  9.038E–03  8.704E–03
RR            0.0064    0.0063    0.0072    1.537E–04  6.3723E–03 1.473E–03
KC10-2fl-5rl  0.8062    0.8061    0.8067    1.040E–03  8.284E–04  9.555E–04
KC30-3fl-1rl  0.1984    0.1961    0.1995    3.255E–03  3.974E–03  3.806E–03
KC30-3fl-2rl  0.5199    0.5175    0.5223    7.312E–03  5.439E–03  4.821E–03

The gap between result qualities widens in favour of PQA as the problem difficulty increases. The smallest instances of the QAP (CHR20A) and mQAP (KC10-2fl-5rl) can be assumed to be the least challenging, and there the results are not as clearly in favour of PQA. The RR problem and the larger instances of the QAP (STE36B) and mQAP (KC30-3fl-1rl, KC30-3fl-2rl), being more complex than the smaller problem instances, are evidently solved to better quality using PQA.

To check for statistical significance of the results we used the KS nonparametric test (Pettitt and Stephens, 1977). PQA was compared with AQA and EQA under the null hypothesis of no difference between the performances. For each problem instance, the 30 result values of the repeated trials were submitted to the KS analysis; the outcomes are shown in Table 10. All KS tests, which establish differences between independent datasets without assuming normality, result in a rejection of the null hypothesis at a 95% confidence level, with a minimum d value of 0.2333. Hence we conclude that the superior performance of the PQA method is statistically significant.

Table 10:
The Kolmogorov-Smirnov test values for the 30 runs of each problem instance controlling all EA parameters except population size using different quality attribution schemes: Average Quality Attribution (AQA), Extreme Quality Attribution (EQA), and Predictive Quality Attribution (PQA).
              PQA versus AQA      PQA versus EQA
Problem          d       p           d       p
CHR20A        0.2333   0.042      0.3333   0.045
BUR26E        0.3000   0.048      0.3000   0.049
TAI30A        0.3294   0.046      0.3517   0.039
STE36B        0.9667   0.000      0.9667   0.000
RR            0.6333   0.000      0.7333   0.000
KC10-2fl-5rl  0.3333   0.045      0.4000   0.011
KC30-3fl-1rl  0.2897   0.016      0.4333   0.005
KC30-3fl-2rl  0.3433   0.035      0.4333   0.005

Given the outcomes of the tests for statistical properties, the population size was controlled individually. Figure 5 depicts the results of the 30 runs for each problem instance and quality attribution scheme. PQA is slightly outperformed by the EQA strategy in some of the problems. These results indicate that using forecasting techniques to predict the quality of the population choices for the future iterations may not always be beneficial.

Figure 5:

Performance improvement of the EA using different quality attribution strategies for controlling population size to solve the QAP, the mQAP, and the RR problem.

The means and standard deviations of the three quality attribution schemes are shown in Table 11. It can be observed that the mean performance of PQA is usually not higher than that of the other two quality attribution schemes. To check for a statistical significance of the difference in the performances of the quality attribution schemes we submitted the results from the 30 runs to the KS nonparametric test (Pettitt and Stephens, 1977), with the null hypothesis that there is no significant difference between the result sets.

Table 11:
The means and standard deviations for the 30 runs of each problem instance using different quality attribution schemes when controlling the population size: Average Quality Attribution (AQA), Extreme Quality Attribution (EQA), and Predictive Quality Attribution (PQA).
                    Mean                           Standard deviation
Problem         AQA       EQA       PQA       AQA       EQA       PQA
CHR20A        0.4660    0.4669    0.4651    4.85E–03  3.51E–03  4.26E–03
BUR26E        0.2460    0.2461    0.2459    8.92E–05  6.88E–05  1.31E–04
TAI30A        0.4977    0.4982    0.4962    3.57E–03  3.23E–03  5.94E–03
STE36B        0.7934    0.8164    0.7945    3.29E–03  4.22E–02  3.64E–03
RR            6.56E–03  6.57E–03  6.56E–03  2.14E–05  2.27E–05  1.82E–05
KC10-2fl-5rl  0.8065    0.8065    0.8065    2.74E–05  3.30E–06  2.74E–05
KC30-3fl-1rl  0.1936    0.1993    0.1974    5.68E–03  4.53E–03  6.74E–03
KC30-3fl-2rl  0.5754    0.5785    0.5745    4.92E–03  4.14E–03  3.67E–03

The null hypothesis was not rejected in any of the KS tests (see Table 12), which means that the superior performances of the AQA and EQA methods compared to the PQA strategy are not statistically significant.

Table 12:
The Kolmogorov-Smirnov test values for the 30 runs of each problem instance using different quality attribution schemes on experiments that control the population size: Average Quality Attribution (AQA), Extreme Quality Attribution (EQA), and Predictive Quality Attribution (PQA).
              PQA versus AQA      PQA versus EQA
Problem          d       p           d       p
CHR20A        0.2000   0.537      0.2667   0.200
BUR26E        0.2000   0.537      0.3000   0.109
TAI30A        0.3750   0.162      0.3750   0.162
STE36B        0.2500   0.633      0.3750   0.162
RR            0.2667   0.200      0.2667   0.200
KC10-2fl-5rl  0.3750   0.162      0.3750   0.162
KC30-3fl-1rl  0.3500   0.121      0.3542   0.113
KC30-3fl-2rl  0.2000   0.890      0.4000   0.136

This indicates that using PQA to forecast the effect of the population size does not have a negative effect on the performance of the algorithm, despite the fact that this parameter does not meet the assumptions made by the forecasting models. Hence, we expect that when new EA parameters are controlled, prediction can be used without a significant negative effect on the performance of the algorithm.

9  Conclusion

Controlling parameters adaptively relies on strategies that extract the effects of parameter values from the optimisation process, information which is subsequently used to attribute a measure of quality to the parameter values used. In previous work (Aleti and Moser, 2011), we suggested that using prediction to attribute a quality metric to the parameter values available for the next iteration benefits the performance of the algorithm. Many forecasting techniques exist for time series prediction; four of these were considered for use in PQA.

The suitability of each forecasting technique for predicting the effect of six EA parameters (mutation rate, crossover rate, population size, mating pool size, the type of crossover operator, and the type of mutation operator) was assessed. It was established that all considered parameters, with the exception of the population size, adhered, within a certain tolerance, to the assumptions required for the use of each of the considered forecasting techniques. Hence, all forecasting methods could reasonably be considered for the projection of future parameter quality in PQA.

Despite being one of the simplest forecasting strategies, LR has proven to be the most effective model to predict the effect of EA parameters, albeit by a negligible margin. In the experimental studies we have used the PQA method with LR to adjust the mutation rate, crossover rate, mating pool size, mutation operator, and crossover operator throughout the optimisation process carried out using different problem instances. The trials have demonstrated that PQA outperforms the quality attribution methods currently considered most successful.

Although the population size was not found to conform to the assumptions made by the forecasting techniques to a reasonable tolerance, the results of the experiments have shown that using the PQA strategy to control the population size did not have a significant negative effect on the performance of the EA. Based on the experiments with the population size, we argue that if EA parameters are controlled without prior testing for statistical properties, the use of PQA has a potential benefit without the expectation of any negative effects on the performance of the algorithm. The exploration of the best possible prediction method is unlikely to be affected by the discretisation, given that we employ two different settings.

References

Aleti, A., and Moser, I. (2011). Predictive parameter control. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 561–568.
Aleti, A., and Moser, I. (2013). Entropy-based adaptive range parameter control for evolutionary algorithms. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1501–1508.
Aleti, A., Moser, I., and Mostaghim, S. (2012). Adaptive range parameter control. In Proceedings of the Congress on Evolutionary Computation, pp. 1–8.
Arcuri, A., and Fraser, G. (2011). On parameter tuning in search based software engineering. In Symposium on Search Based Software Engineering. Lecture notes in computer science, Vol. 6956 (pp. 33–47). Berlin: Springer-Verlag.
Bäck, T., Eiben, Á. E., and van der Vaart, N. A. L. (2000). An empirical study on GAs without parameters. In Parallel problem solving from nature. Lecture notes in computer science, Vol. 1917 (pp. 315–324). Berlin: Springer-Verlag.
Bäck, T., and Schütz, M. (1996). Intelligent mutation rate control in canonical genetic algorithms. In Proceedings of the International Symposium on Foundations of Intelligent Systems. Lecture notes in artificial intelligence, Vol. 1079 (pp. 158–167). Berlin: Springer-Verlag.
Bartz-Beielstein, T., Lasarczyk, C., and Preuss, M. (2005). Sequential parameter optimization. In Proceedings of the IEEE Congress on Evolutionary Computation, pp. 773–780.
Burkard, R. E., Karisch, S. E., and Rendl, F. (1991). QAPLIB: A quadratic assignment problem library. European Journal of Operational Research, 55(1):115–119.
Cervantes, J., and Stephens, C. R. (2009). Limitations of existing mutation rate heuristics and how a rank GA overcomes them. IEEE Transactions on Evolutionary Computation, 13(2):369–397.
Christofides, N., and Benavent, E. (1989). An exact algorithm for the quadratic assignment problem on a tree. Operations Research, 37(5):760–768.
Corne, D., Oates, M. J., and Kell, D. B. (2002). On fitness distributions and expected fitness gain of mutation rates in parallel evolutionary algorithms. In Parallel problem solving from nature. Lecture notes in computer science, Vol. 2439 (pp. 132–141). Berlin: Springer-Verlag.
Davis, L. (1989). Adapting operator probabilities in genetic algorithms. In Proceedings of the International Conference on Genetic Algorithms, pp. 70–79.
Davis, L. (Ed.). (1991). Handbook of genetic algorithms. New York: Van Nostrand Reinhold.
De Jong, K. A. (1995). An analysis of the behavior of a class of genetic adaptive systems. Ph.D. thesis, University of Michigan, Ann Arbor, MI.
De Jong, K. A. (2007). Parameter setting in EAs: A 30 year perspective. In Parameter setting in evolutionary algorithms. Studies in computational intelligence, Vol. 54 (pp. 1–18). Berlin: Springer-Verlag.
Eiben, Á. E., Hinterding, R., and Michalewicz, Z. (2007). Parameter control in evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 3(2):124–141.
Eiben, Á. E., Marchiori, E., and Valkó, V. A. (2004). Evolutionary algorithms with on-the-fly population size adjustment. In Parallel problem solving from nature. Lecture notes in computer science, Vol. 3242 (pp. 41–50). Berlin: Springer-Verlag.
Eiben, Á. E., and Schut, M. C. (2008). New ways to calibrate evolutionary algorithms. In Advances in metaheuristics for hard optimization (pp. 153–177). Berlin: Springer.
Eiben, Á. E., and Smit, S. K. (2011). Parameter tuning for configuring and analyzing evolutionary algorithms. Swarm and Evolutionary Computation, 1(1):19–31.
Fialho, Á., Da Costa, L., Schoenauer, M., and Sebag, M. (2008). Extreme value based adaptive operator selection. In Parallel problem solving from nature. Lecture notes in computer science, Vol. 5199 (pp. 175–184). Berlin: Springer-Verlag.
Fialho, Á., Da Costa, L., Schoenauer, M., and Sebag, M. (2010). Analyzing bandit-based adaptive operator selection mechanisms. Annals of Mathematics and Artificial Intelligence, 60(1–2):25–64.
Fialho, Á., Schoenauer, M., and Sebag, M. (2009). Analysis of adaptive operator selection techniques on the royal road and long k-path problems. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 779–786.
Fialho, Á., Schoenauer, M., and Sebag, M. (2010). Toward comparison-based adaptive operator selection. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 767–774.
Gibbons, J. D., and Chakraborti, S. (2003). Nonparametric statistical inference. Boca Raton, FL: CRC Press.
Grefenstette, J. J. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 16(1):122–128.
Hansen, N., and Ostermeier, A. (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195.
Hong, T.-P., Wang, H.-S., and Chen, W.-C. (2000). Simultaneously applying multiple mutation operators in genetic algorithms. Journal of Heuristics, 6(4):439–455.
Hong, T.-P., Wang, H.-S., Lin, W.-Y., and Lee, W.-Y. (2002). Evolution of appropriate crossover and mutation operators in a genetic process. Applied Intelligence, 16(1):7–17.
Igel, C., and Kreutz, M. (2003). Operator adaptation in evolutionary computation and its application to structure optimization of neural networks. Neurocomputing, 55(1–2):347–361.
Igel, C., Wiegand, S., and Friedrichs, F. (2005). Evolutionary optimization of neural systems: The use of strategy adaptation. In Trends and applications in constructive approximation. International series of numerical mathematics, Vol. 151 (pp. 103–123). Basel, Switzerland: Birkhäuser.
Jones, T. (1994). A description of Holland's royal road function. Evolutionary Computation, 2(4):409–415.
Julstrom, B. A. (1995). What have you done for me lately? Adapting operator probabilities in a steady-state genetic algorithm. In Proceedings of the International Conference on Genetic Algorithms, pp. 81–87.
Julstrom, B. A. (1997). Adaptive operator probabilities in a genetic algorithm that applies three operators. In Proceedings of the ACM Symposium on Applied Computing, pp. 233–238.
Knowles, J., and Corne, D. (2003). Instance generators and test suites for the multiobjective quadratic assignment problem. In Proceedings of the International Conference on Evolutionary Multi-Criterion Optimization. Lecture notes in computer science, Vol. 2632 (pp. 295–310). Berlin: Springer-Verlag.
Koopmans, T. C., and Beckmann, M. (1957). Assignment problems and the location of economic activities. Econometrica, 25:53–76.
Kwiatkowski, D., Phillips, P. C. B., Schmidt, P., and Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics, 54(1–3):159–178.
Lobo, F. G. (2011). Idealized dynamic population sizing for uniformly scaled problems. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 917–924.
Merz, P., and Freisleben, B. (2000). Fitness landscapes, memetic algorithms, and greedy operators for graph bipartitioning. Evolutionary Computation, 8(1):61–91.
Michalewicz, Z., and Fogel, D. B. (2004). How to solve it: Modern heuristics. Berlin: Springer-Verlag.
Michalewicz, Z., and Schmidt, M. (2007). Parameter control in practice. In Parameter setting in evolutionary algorithms. Studies in computational intelligence, Vol. 54 (pp. 277–294). Berlin: Springer-Verlag.
Mitchell, M. (1996). An introduction to genetic algorithms. Cambridge, MA: MIT Press.
Pettitt, A. N., and Stephens, M. A. (1977). The Kolmogorov-Smirnov goodness-of-fit statistic with discrete and grouped data. Technometrics, 19(2):205–210.
Schaffer, J. D., Caruana, R. A., Eshelman, L. J., and Das, R. (1989). A study of control parameters affecting online performance of genetic algorithms for function optimization. In Proceedings of the International Conference on Genetic Algorithms, pp. 51–60.
Steinberg, L. (1961). The backboard wiring problem: A placement algorithm. SIAM Review, 3:37–50.
Thierens, D. (2002). Adaptive mutation rate control schemes in genetic algorithms. In Proceedings of the Congress on Evolutionary Computation, pp. 980–985.
Thierens, D. (2005). An adaptive pursuit strategy for allocating operator probabilities. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1539–1546.
Whitacre, J. M., Pham, T. Q., and Sarker, R. A. (2009). Use of statistical outlier detection method in adaptive evolutionary algorithms. CoRR, abs/0907.0595.
Wong, Y.-Y., Lee, K.-H., Leung, K.-S., and Ho, C.-W. (2003). A novel approach in parameter adaptation and diversity maintenance for genetic algorithms. Soft Computing, 7(8):506–515.
Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C. M., and Grunert da Fonseca, V. (2003). Performance assessment of multiobjective optimizers: An analysis and review. IEEE Transactions on Evolutionary Computation, 7:117–132.