This paper investigates constraint-handling techniques used in nonelitist single-parent evolution strategies for the problem of maximizing a linear function with a single linear constraint. Two repair mechanisms are considered, and the analytical results are compared to those of earlier repair approaches in the same fitness environment. The first algorithm variant applies reflection to initially infeasible candidate solutions, and the second repair method uses truncation to generate feasible solutions from infeasible ones. The distributions describing the strategies’ one-generation behavior are calculated and used in a zeroth-order model for the steady state attained when operating with fixed step size. Considering cumulative step size adaptation, the qualitative differences in the behavior of the algorithm variants can be explained. The approach extends the theoretical knowledge of constraint-handling methods in the field of evolutionary computation and has implications for the design of constraint-handling techniques in connection with cumulative step size adaptation.
In the field of evolutionary algorithms, approaches to constrained optimization problems make use of different types of constraint-handling techniques. These include penalty methods, repair mechanisms, and methods based on principles of multiobjective optimization. Notable contributions in this context include Oyman et al. (1999), Runarsson and Yao (2000), Mezura-Montes and Coello Coello (2005), and Kramer and Schwefel (2006). An overview of constraint-handling techniques is provided in Mezura-Montes and Coello Coello (2011).
Different approaches are usually evaluated by direct comparison of the performances on test functions considered to be difficult. A disadvantage is that the effects of the test environment on the performance of the algorithm are often difficult to interpret. Also, the configuration of specific strategy parameters is not obvious. That is, complex test functions do not contribute insight into the behavior of the algorithm on a microscopic level. In contrast to benchmark studies on large testbeds, this work concentrates on the analysis of the behavior of algorithms on a very simple class of test functions. Simple test environments often provide a description of local regions within more complex environments. This way, analytical results can be obtained, allowing for a deeper understanding of the influence of strategy parameters on algorithm performance.
Early contributions to constrained optimization with evolution strategies (ES) include the work of Rechenberg (1973), who investigated the performance of the (1+1)-ES for the axis-aligned corridor model. The same environment has been studied by Schwefel (1975), who examined the behavior of the (1,λ)-ES, and Beyer (1989) investigated the dynamics of the (1,λ)-ES for a constrained, discus-like function.
Recent work (see Arnold [2011a, 2011b, 2013]) examines the behavior of nonelitist ES using cumulative step size adaptation (CSA) on a linear problem with a single linear constraint. Two constraint-handling techniques are compared. It is found that the resampling of infeasible candidate solutions results in premature convergence of the strategy in the face of small constraint angles, that is, small angles between the objective function’s gradient direction and the normal vector of the constraint plane. This is due to short search paths that result in a systematic reduction of the step size. Handling constraints by projecting infeasible solutions onto the boundary of the feasible region allows setting the parameters of CSA in such a way that premature convergence is avoided for any constraint angle. This is due to successful projected candidate solutions predominantly lying in one direction, resulting in long search paths.
Further repair mechanisms exist. Among other methods, Helwig et al. (2013) suggest a reflection method as well as a hyperbolic approach, in conjunction with particle swarm optimization, to deal with infeasible candidate solutions. These approaches serve as the basis of the repair methods considered here. The first one reflects initially infeasible candidate solutions into the feasible region. The other repair mechanism truncates the mutation vector of infeasible offspring in such a way that the offspring is then located on the boundary of the feasible region.
The goal of this paper is to analyze these constraint-handling techniques in the context of the linear optimization problem suggested by Arnold (2013) and to examine their performance relative to the repair mechanisms considered in that paper. This extends the theoretical analysis of the behavior of the CSA-ES to two additional repair mechanisms. It reveals further insight into the potential of constraint-handling techniques to maintain sufficient population diversity to prevent premature convergence when approaching a constraint boundary.
The paper is organized as follows. In Section 2 the optimization problem, the algorithm, and the two suggested constraint-handling techniques are described. Section 3 investigates the behavior of the ES assuming that the mutation strength is kept constant. A simple zeroth-order model is established to characterize the average distance from the constraint plane. Section 4 considers cumulative step size adaptation within the strategy. The paper concludes with a discussion of the results and suggestions for future research in Section 5. For reasons of clarity and comprehensibility calculations that underlie Section 3 are presented in the appendix.
2 Problem Formulation and Algorithm
At this point, the general algorithm of the (1,λ)-CSA-ES dealing with problem (P) is introduced. It replaces infeasible offspring candidate solutions with feasible ones by applying a specific repair mechanism. The two repair mechanisms investigated in this paper are explained in the corresponding subsections. Beginning with a feasible candidate solution x, the (1,λ)-ES generates λ offspring by performing the following steps per iteration:
Generate λ offspring candidate solutions y_l = x + σz_l, l = 1, …, λ, where the mutation vectors z_l have independent standard normally distributed components; infeasible offspring are made feasible by the repair mechanism in use. Evaluate f(y_l) for all offspring, and let z denote the (possibly repaired) mutation vector of the best offspring, that is, of the offspring with the largest objective function value.
Replace the parental candidate solution x with the best offspring candidate solution according to x ← x + σz.
Modify the mutation strength using cumulative step size adaptation.
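The steps above can be sketched in code. The following is a minimal illustration rather than the paper's exact algorithm: it assumes a linear objective f(x) = x[0] to be maximized and leaves the repair step as a pluggable function; all names are chosen for illustration.

```python
import random

def one_generation(x, sigma, lam, repair, rng=random):
    """One iteration of a (1, lam)-ES with a pluggable repair step.

    Assumed setting (for illustration only): f(x) = x[0] is maximized;
    `repair` maps a possibly infeasible offspring back into the feasible
    region and returns the repaired offspring together with its
    (possibly repaired) mutation vector.
    """
    best = None
    for _ in range(lam):
        z = [rng.gauss(0.0, 1.0) for _ in x]           # isotropic mutation
        y = [xi + sigma * zi for xi, zi in zip(x, z)]  # offspring
        y, z = repair(x, sigma, y, z)                  # replace if infeasible
        if best is None or y[0] > best[0][0]:          # largest f(y) wins
            best = (y, z)
    return best  # new parent and the selected mutation vector
```

The subsequent CSA update of the mutation strength (step 3) would then be applied to the selected mutation vector returned here.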
The decision whether the mutation strength is decreased or increased depends on the sign of the numerator in Equation (3). The basic idea is that long search paths indicate that selected mutation vectors predominantly point in one direction and could be replaced with fewer but longer steps. A short search path suggests that the strategy is moving back and forth and thus should benefit from smaller step sizes. If the unconstrained setting with randomly selected offspring candidate solutions is considered (i.e., the ordering in step 2 of the algorithm is random), the search path has an expected squared length of N. In this case the mutation strength performs a random walk on a logarithmic scale; that is, in expectation the logarithm of the mutation strength does not change at all. It is thus not unexpected that CSA may fail under some conditions in constrained settings.
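As a concrete illustration of this mechanism, the following sketch implements one common form of the CSA update; the exact constants of the text's Equation (3) are assumed here rather than quoted.

```python
import math

def csa_update(s, z_sel, sigma, c, D, N):
    """One common formulation of cumulative step size adaptation.

    The sign of ||s||^2 - N decides the direction of the change: long
    search paths increase sigma, short ones decrease it.  Under purely
    random selection E[||s||^2] = N, so log(sigma) performs an unbiased
    random walk.  Constants c (cumulation) and D (damping) are assumed.
    """
    s = [(1.0 - c) * si + math.sqrt(c * (2.0 - c)) * zi
         for si, zi in zip(s, z_sel)]
    norm_sq = sum(si * si for si in s)
    sigma *= math.exp((norm_sq - N) / (2.0 * D * N))
    return s, sigma
```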
2.1 Repair by Reflection
In this section the repair mechanism denoted as reflection is considered. It enables the (1,λ)-ES to deal with a single linear constraint (see Figure 1). Initially infeasible offspring candidate solutions are mirrored into the feasible region of the search space: an infeasible offspring is replaced with its mirror image with respect to the constraint plane, leaving its distance from the plane unchanged.
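A minimal sketch of this repair step follows, under an assumed encoding in which the constraint plane passes through the origin with unit normal n and points with nonpositive inner product with n are feasible; this concrete encoding is an illustration, not the paper's notation.

```python
def reflect_repair(x, sigma, y, z, n):
    """Mirror an infeasible offspring y across the constraint plane.

    Assumed encoding: the plane passes through the origin with unit
    normal n, and a point is feasible iff its inner product with n
    is <= 0.  The repaired mutation vector is recomputed from the
    mirrored offspring.
    """
    d = sum(yi * ni for yi, ni in zip(y, n))        # signed distance
    if d > 0.0:                                     # offspring infeasible
        y = [yi - 2.0 * d * ni for yi, ni in zip(y, n)]
        z = [(yi - xi) / sigma for yi, xi in zip(y, x)]
    return y, z
```

Note that reflection preserves the offspring's distance from the constraint plane, only flipping it to the feasible side.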
2.2 Repair by Truncation
We now consider the second repair mechanism, which truncates the mutation vector of an initially infeasible offspring candidate solution at the edge of the feasible region of the search space. In the following we refer to this method as truncation. Like repair by reflection, truncation enables the (1,λ)-ES to deal with the previously introduced single linear constraint. The procedure is illustrated in Figure 2. Accordingly, in repair step 1b of the algorithm, the mutation vector of an infeasible offspring is scaled back such that the repaired offspring comes to lie exactly on the boundary of the feasible region.
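Under the same assumed encoding as for reflection (plane through the origin with unit normal n, feasible iff the inner product with n is nonpositive), a sketch of the truncation step reads:

```python
def truncate_repair(x, sigma, y, z, n):
    """Shorten the mutation vector of an infeasible offspring so that
    the repaired offspring lies exactly on the constraint plane.

    Assumed encoding: plane through the origin with unit normal n;
    feasible iff the inner product with n is <= 0.  Since the parent x
    is feasible and y = x + sigma * z is not, the step fraction t lies
    in [0, 1).
    """
    d = sum(yi * ni for yi, ni in zip(y, n))
    if d > 0.0:                                    # offspring infeasible
        t = -sum(xi * ni for xi, ni in zip(x, n)) / (
            sigma * sum(zi * ni for zi, ni in zip(z, n)))
        z = [t * zi for zi in z]                   # truncated mutation
        y = [xi + sigma * zi for xi, zi in zip(x, z)]  # on the boundary
    return y, z
```

In contrast to reflection, truncation strictly shortens the mutation vector, a property that matters for the step size adaptation analysis later in the paper.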
3 Behavior for Fixed Mutation Strength
In the following, a characterization of the distribution of offspring candidate solutions conditional on the parent's distance from the constraint plane is provided, and the expected step made in a single iteration of the algorithm is calculated for both repair mechanisms. Finally, the average normalized distance from the constraint plane realized by the respective repair methods when operating with constant mutation strength is investigated.
3.1 Single Step Behavior: Mutation
3.2 Single Step Behavior: Selection
3.3 Steady State Behavior
4 Mutation Strength Control
In this section we drop the assumption that the mutation strength of the strategy is fixed. That is, the strategy's step size is adapted using cumulative step size adaptation as described in Section 2. This affects the long-term behavior of the (1,λ)-ES in the constrained linear environment considered: the strategy either increases or decreases the mutation strength on average. With respect to the optimization problem (P), increasing the mutation strength is required, because decreasing it leads to premature convergence.
The theoretical results for the moments E[z1], E[z1²], E[z2], and E[z2²] can be validated by comparing them to measurements from experimental runs of the evolution strategies. With the Dirac model used to compute the average distance of the parental candidate solution from the constraint plane by solving Equation (22), the moments can be evaluated numerically. For reflection and truncation the resulting curves are plotted against the constraint angle θ in Figure 4. Comparing the curves with the corresponding values obtained experimentally in runs of the respective (1,λ)-ES with fixed mutation strength shows good visual agreement for the first moments E[z1] and E[z2]. The approximation quality of the Dirac model deteriorates for the second moments E[z1²] and E[z2²]. In the first case the approximation exhibits small deviations for highly acute constraint angles; however, the values in that scenario are generally small and have limited impact on the results. For E[z2²], deviations occur with increasing values of θ, but the agreement appears visually good for small constraint angles.
The failure of cumulative step size adaptation in the limit of small constraint angles (see Figure 4) can be explained as follows. The moments E[z1] and E[z1²] of the parental mutation vector's first component describe the behavior in the x1 direction. For small values of θ both moments assume very small values near zero, as both strategies operate in close vicinity to the constraint plane. In order to compensate for the small contribution of E[z1] and E[z1²] to the expected squared length of the search path on the x1-axis, the moments E[z2] and E[z2²] have to be large enough to ensure a positive expected logarithmic adaptation response. In terms of the first two moments of the z1 and z2 components, repair by reflection resembles the strategy that resamples infeasible solutions. Arnold (2013) found that the behavior in the x2 direction almost amounts to a random walk, namely, E[z2] ≈ 0 and E[z2²] ≈ 1. The same observation can be made for the strategy that applies reflection (see Figure 4). This can be explained by considering the way reflection acts on initially infeasible offspring candidate solutions. In the limit of very small θ, on average half of the offspring are infeasible and are reflected into the feasible region. On average, half of those that are reflected end up with a positive z2 component, and the other half with a negative z2 component. However, offspring reflected to a negative z2 component are, for small θ, only slightly more likely to be selected. That is, in contrast to the observations made for projection (see Arnold, 2013), the strategy using reflection does not exhibit a strong correlation between candidate solutions being successful and a large negative value of their z2 components. As a consequence, reflection behaves similarly to resampling.
With the strategy that truncates the mutation vectors of initially infeasible candidate solutions at the boundary of the feasible region, E[z1] and E[z1²] approach even smaller values. The observed moments E[z2] and E[z2²] tend to zero for decreasing constraint angles θ. Thus, they are not suitable to compensate for the low contribution of E[z1] and E[z1²] on the x1-axis. To prevent convergence to a nonstationary limit point, the strategy using truncation requires even lower choices of the cumulation parameter c. In the limit of very small θ, the probability of generating an initially infeasible offspring is again approximately 0.5. The region that provides positive z1 components after truncation shrinks considerably with decreasing constraint angle. That is, the mutation vectors of successful candidate solutions after truncation are on average very small. The same correlation that was observed in the case of projection can be inferred for truncation, namely, that successful offspring after truncation tend to be associated with negative z2 components. However, this correlation cannot compensate for the drawback of small mutation vectors. As a consequence, using repair by truncation within the (1,λ)-CSA-ES is not a well-suited constraint-handling method, especially for rather small constraint angles θ.
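The mechanism described in the last two paragraphs can be probed numerically. The following Monte Carlo sketch uses an assumed two-dimensional instantiation of the problem (maximize x1; a point is feasible iff its inner product with the unit normal n = (cos θ, sin θ) is nonpositive) with illustrative parameter values; it estimates the average first component of the selected mutation vector under either repair method. It is a toy probe, not the paper's experimental setup.

```python
import math
import random

def mean_selected_z1(mode, theta, lam=10, trials=2000, seed=42):
    """Estimate E[z1] of the selected mutation under 'reflect' or
    'truncate' repair in an assumed 2-D instantiation of problem (P)."""
    rng = random.Random(seed)
    n = (math.cos(theta), math.sin(theta))
    x = (-0.2 * n[0], -0.2 * n[1])          # parent close to the plane
    total = 0.0
    for _ in range(trials):
        best_y1, best_z = None, None
        for _ in range(lam):
            z = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
            y = (x[0] + z[0], x[1] + z[1])  # mutation strength sigma = 1
            d = y[0] * n[0] + y[1] * n[1]
            if d > 0.0:                     # infeasible offspring: repair
                if mode == "reflect":       # mirror across the plane
                    y = (y[0] - 2.0 * d * n[0], y[1] - 2.0 * d * n[1])
                    z = (y[0] - x[0], y[1] - x[1])
                else:                       # truncate onto the plane
                    t = 0.2 / (z[0] * n[0] + z[1] * n[1])
                    z = (t * z[0], t * z[1])
                    y = (x[0] + z[0], x[1] + z[1])
            if best_y1 is None or y[0] > best_y1:
                best_y1, best_z = y[0], z
        total += best_z[0]
    return total / trials
```

Comparing the estimates for shrinking θ illustrates how truncation yields ever smaller selected mutation vectors, in line with the discussion above.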
The accuracy of the predictions based on the Dirac model is verified in Figure 6. There, all curves are obtained from condition (25), for both reflection and truncation and for several offspring population sizes λ. Each mark in Figure 6 represents experimental results obtained from 10 independent runs of the respective evolution strategies with cumulative step size adaptation, initialized with fixed values for the parental candidate solution, the mutation strength, and the search path; the search space dimensionality N and the damping parameter D were held fixed as well. The runs were terminated once the mutation strength dropped below a small lower threshold or exceeded a large upper threshold. Very small step sizes suggest that stagnation is likely to ensue, whereas large step sizes point to continuing progress at increasing rates. The + symbol in Figure 6 indicates that at least 9 of the 10 runs terminated at the upper threshold; the × symbol indicates that at least 9 of the 10 runs terminated at the lower threshold. Neither symbol is present at a grid location in the plots if each termination criterion was satisfied in at least 2 of the 10 runs. Within the range of considered constraint angles and population size parameters, the agreement of our predictions based on condition (25) with the experimentally generated results is very good. Only for small population sizes does the strategy using truncation show deviations from the theoretical predictions. This points toward the Dirac model providing an insufficient approximation of the evolution strategy's behavior in the range of small constraint angles and small population sizes.
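The run classification protocol just described can be sketched as follows; the thresholds and the iteration cap are illustrative stand-ins, not the paper's actual values.

```python
def classify_run(update, sigma0=1.0, lo=1e-9, hi=1e9, max_gens=1_000_000):
    """Iterate a per-generation step size update until the mutation
    strength leaves [lo, hi]: a collapse below lo is read as stagnation,
    a blow-up above hi as continuing progress at increasing rates."""
    sigma = sigma0
    for _ in range(max_gens):
        sigma = update(sigma)
        if sigma < lo:
            return "stagnation"
        if sigma > hi:
            return "progress"
    return "undecided"
```

In the experiments, `update` would be one full generation of the CSA-ES; here any multiplicative step size dynamic can be plugged in.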
5 Summary and Conclusions
This paper investigated constraint-handling techniques for the (1,λ)-ES using cumulative step size adaptation on a linear problem with a single linear inequality constraint. Two different repair mechanisms were considered: the reflection of infeasible points into the feasible region and the truncation of mutation vectors at the boundary of the feasible region. As pointed out earlier, an interesting aspect of the simple scenario of a linear function with a single linear constraint is that it can be used to model microscopic properties of more general constrained problems.
The use of the simple Dirac model allowed for approximating the average distance of the parental candidate solution from the constraint plane, provided that the strategy runs with fixed mutation strength. With cumulative step size adaptation and the assumption that the evolution strategy operates in a stationary state, this approximation was then used to derive an expression for the expected logarithmic adaptation response. Hence, it was possible to calculate the maximal cumulation parameter c up to which premature convergence can be avoided by the (1,λ)-ES. The theoretical predictions were verified by experiments over a wide range of constraint angles and varying numbers of offspring. The validation revealed generally very good agreement.
While projection still allows for a positive logarithmic adaptation response for arbitrarily small constraint angles, neither reflection nor truncation turned out to enable the strategy to make sustained progress by continually increasing the mutation strength for small constraint angles. The reason for the better performance of projection is that it exhibits a strong correlation between an offspring being successful and a large negative value of its z2 component. This is not true for repair by reflection, since an initially infeasible offspring with negative z2 component is not necessarily successful after reflection. Thus, reflection shows a behavior similar to resampling. For truncation, the strong correlation between a successful offspring and a negative value of its z2 component is also present. However, on average, truncation generates short mutation vectors, and consequently the correlation turns out to be insufficient to make up for the short length. There is also empirical evidence in Beyer and Finck (2012) that for more complex constraints, repair by projection provides better performance than truncation and resampling.
The results provide insight into the interactions of constraint-handling techniques used in combination with cumulative step size adaptation. Possible future work includes the extension of the analysis to the more general multirecombinant (μ/μ, λ)-ES. Furthermore, the interplay of repair by reflection with other step size adaptation methods, such as mutative self-adaptation, remains to be investigated. A theoretical attempt to address problems with a nonlinear constraint was provided by Arnold (2014), who investigated a linear optimization problem with a conically constrained feasible region, using resampling to deal with infeasible candidate solutions. A future analysis of this type of problem using repair by projection seems reasonable. Likewise, investigations of nonlinear objective functions and further nonlinear constraints are worth considering.
This work was supported by the Austrian Science Fund (FWF) under grant P22649-N23.
Appendix Computation of the Distributions
This appendix computes the distribution of the mutation vectors’ first components z1 after the repair step for both constraint-handling techniques, that is, for repair by reflection as well as the repair method using truncation. Subsequently, the distribution of the mutation vectors’ z2 components conditional on z1 is computed for the respective strategies.
In the first step the probability distribution function of the z1 components of the mutation vector after reflection is calculated. Then we derive the first two moments about zero of its z2 components conditional on z1.
A.1.1 Distribution of the z1 Components after Reflection
A.1.2 Distribution of the z2 Components after Reflection
Now for repair by truncation, the probability distribution function of the z1 components of the mutation vector and the first two moments about zero of its z2 components conditional on z1 are computed.