Abstract

This paper investigates constraint-handling techniques used in nonelitist single-parent evolution strategies for the problem of maximizing a linear function with a single linear constraint. Two repair mechanisms are considered, and the analytical results are compared to those of earlier repair approaches in the same fitness environment. The first algorithm variant applies reflection to initially infeasible candidate solutions, and the second repair method uses truncation to generate feasible solutions from infeasible ones. The distributions describing the strategies’ one-generation behavior are calculated and used in a zeroth-order model for the steady state attained when operating with fixed step size. Considering cumulative step size adaptation, the qualitative differences in the behavior of the algorithm variants can be explained. The approach extends the theoretical knowledge of constraint-handling methods in the field of evolutionary computation and has implications for the design of constraint-handling techniques in connection with cumulative step size adaptation.

1  Introduction

In the field of evolutionary algorithms, approaches to constrained optimization problems make use of different types of constraint-handling techniques. These include penalty methods, repair mechanisms, and methods based on principles of multiobjective optimization. In this context, the works of Oyman et al. (1999), Runarsson and Yao (2000), Mezura-Montes and Coello Coello (2005), and Kramer and Schwefel (2006) deserve mention. An overview of constraint-handling techniques is provided by Mezura-Montes and Coello Coello (2011).

Different approaches are usually evaluated by direct comparison of the performances on test functions considered to be difficult. A disadvantage is that the effects of the test environment on the performance of the algorithm are often difficult to interpret. Also, the configuration of specific strategy parameters is not obvious. That is, complex test functions do not contribute insight into the behavior of the algorithm on a microscopic level. In contrast to benchmark studies on large testbeds, this work concentrates on the analysis of the behavior of algorithms on a very simple class of test functions. Simple test environments often provide a description of local regions within more complex environments. This way, analytical results can be obtained, allowing for a deeper understanding of the influence of strategy parameters on algorithm performance.

Early contributions to constrained optimization with evolution strategies (ES) include the work of Rechenberg (1973), who investigated the performance of the (1+1)-ES for the axis-aligned corridor model. The same environment has been studied by Schwefel (1975), who examined the behavior of the (1,λ)-ES; Beyer (1989) investigated the dynamics of an ES for a constrained discus-like function.

Recent work (see Arnold [2011a, 2011b, 2013]) examines the behavior of nonelitist ES using cumulative step size adaptation (CSA) on a linear problem with a single linear constraint. Two constraint-handling techniques are compared. It is found that the resampling of infeasible candidate solutions results in premature convergence of the strategy in the face of small constraint angles, that is, small angles between the objective function’s gradient direction and the normal vector of the constraint plane. This is due to short search paths that result in a systematic reduction of the step size. Handling constraints by projecting infeasible solutions onto the boundary of the feasible region allows setting the parameters of CSA in such a way that premature convergence is avoided for any constraint angle. This is due to successful projected candidate solutions predominantly lying in one direction, resulting in long search paths.

Further repair mechanisms exist. Among other methods, Helwig et al. (2013) suggest a reflection method as well as a hyperbolic approach in conjunction with particle swarm optimization in order to deal with infeasible candidate solutions. These approaches serve as the basis of the repair methods considered here. The first one reflects initially infeasible candidate solutions into the feasible region. The other repair mechanism truncates the mutation vector of infeasible offspring in such a way that the offspring is then located on the boundary of the feasible region.

The goal of this paper is to analyze these constraint-handling techniques in the context of the linear optimization problem suggested by Arnold (2013) and to examine their performance relative to those repair mechanisms considered in that paper. Thereby the theoretical analysis of the behavior of CSA-ES is extended by the analysis of two additional repair mechanisms. This reveals further insight into the potential of constraint-handling techniques to maintain a sufficient population diversity in order to prevent premature convergence when approaching a constraint boundary.

The paper is organized as follows. In Section 2 the optimization problem, the algorithm, and the two suggested constraint-handling techniques are described. Section 3 investigates the behavior of the ES assuming that the mutation strength is kept constant. A simple zeroth-order model is established to characterize the average distance from the constraint plane. Section 4 considers cumulative step size adaptation within the strategy. The paper concludes with a discussion of the results and suggestions for future research in Section 5. For reasons of clarity and comprehensibility calculations that underlie Section 3 are presented in the appendix.

2  Problem Formulation and Algorithm

Throughout this paper the problem of maximizing a linear function
formula
with a single linear constraint is investigated. More precisely, since no finite maximum exists, the task is one of amelioration rather than maximization. We assume that the gradient vector of the objective function forms an acute angle with the normal vector of the constraint plane. In that setting, the challenge for the evolution strategy is to increase its mutation strength, thereby accelerating long-term progress. Systematically reducing the mutation strength, as observed for the (1+1)-ES with success-probability-based step size adaptation as early as the 1970s (Schwefel, 1975), results in eventual stagnation of the strategy. Thus, the interplay of the step size adaptation mechanism and the constraint-handling technique is of central significance.
Without loss of generality, the coordinate system can be chosen in such a way that the origin lies on the constraint plane and that the x1-axis coincides with the gradient direction . Additionally, the x2-axis is chosen to lie in the two-dimensional subspace spanned by the gradient vector and the normal vector. The corresponding angle is referred to as the constraint angle . Constraint angles of interest lie in the interval . The unit normal vector of the constraint plane in terms of the considered coordinate system is . The signed distance of a point from the constraint plane is . An illustration of the problem definition can be found in Figure 1. The resulting optimization problem reads
formula
P
for some constant . Because of the choice of the coordinate system, the variables have no effect on either the objective function or the constraint inequality. Despite its simplicity the linear environment serves as a model for other constrained optimization problems in the immediate vicinity of the constraint plane (see Arnold [2013]).
Figure 1:

Linear objective function with a single constraint, displayed within the subspace spanned by the x1- and x2-axes. The shaded area is the feasible region. The parental candidate solution is at a distance from the constraint plane. The infeasible candidate solution is reflected into the feasible region along the normal vector of the constraint plane, resulting in the new point .


At this point, the general algorithm of the (1,λ)-CSA-ES dealing with problem (P) is introduced. It replaces infeasible offspring candidate solutions with feasible ones by applying a specific repair mechanism. The two repair mechanisms investigated in this paper are explained in the corresponding subsections. Starting from a feasible parental candidate solution, the (1,λ)-ES performs the following steps per iteration:

  1. For each of the λ offspring,

    1. Generate an offspring candidate solution by drawing an N-dimensional mutation vector with independent, standard normally distributed components (mean zero and unit covariance matrix), scaling it with the mutation strength, and adding it to the parental candidate solution:
      formula
    2. If the offspring candidate solution is infeasible, repair it:
      formula
    3. Evaluate the objective function values of all λ offspring, and let the (possibly repaired) mutation vector of the best offspring, that is, of the offspring with the largest objective function value, be retained.

  2. Replace the parental candidate solution with the best offspring candidate solution according to .

  3. Modify the mutation strength using cumulative step size adaptation.
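The per-iteration loop can be sketched as follows. The objective f(x) = x1, a constraint plane through the origin with unit normal in the x1-x2 subspace, and the sign convention for feasibility are assumptions taken from the problem description above; `project` is only a placeholder repair hook standing in for the mechanisms of Sections 2.1 and 2.2.

```python
import numpy as np

def signed_distance(x, theta):
    # Signed distance g(x) from the constraint plane through the origin
    # with unit normal n = (cos(theta), sin(theta), 0, ..., 0); the sign
    # convention (feasible iff g(x) >= 0) is an assumption of this sketch.
    return -(x[0] * np.cos(theta) + x[1] * np.sin(theta))

def project(x, y, sigma, theta):
    # Placeholder repair hook: translate the infeasible offspring y along
    # the normal vector onto the constraint boundary (repair by projection).
    n = np.zeros_like(y)
    n[0], n[1] = np.cos(theta), np.sin(theta)
    return y - np.dot(y, n) * n

def es_iteration(x, sigma, lam, theta, repair, rng):
    """One iteration of the (1,lam)-ES for f(x) = x1: mutate, repair
    infeasible offspring, and select the best candidate."""
    best = None
    for _ in range(lam):
        z = rng.standard_normal(x.size)       # step 1a: standard normal mutation
        y = x + sigma * z                     # scale by mutation strength, add
        if signed_distance(y, theta) < 0.0:   # step 1b: repair if infeasible
            y = repair(x, y, sigma, theta)
        if best is None or y[0] > best[0]:    # steps 1c and 2: keep largest f
            best = y
    return best
```

The repair hook receives both parent and offspring, since reflection and truncation (unlike the projection placeholder) depend on the parent's distance from the plane.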

The algorithm generates an offspring candidate solution by random sampling and, where necessary, additional repair. The sampling of an offspring candidate solution in step 1a involves generating a mutation vector with standard normally distributed components. Introducing the normalized distance of the parental candidate solution from the constraint plane,
formula
the admissibility of an offspring can be described in terms of the corresponding mutation vector . That is, if the condition
formula
1
holds, then the offspring is infeasible and repaired by the strategy to provide a candidate solution within the feasible region of the search space. Repair is performed in step 1b of the algorithm and is described for the respective constraint-handling mechanisms in the subsequent subsections.
In step 3 the algorithm adapts the strategy’s step size using cumulative step size adaptation. This mutation strength control mechanism was introduced by Ostermeier et al. (1994) and is popular because of its use in the CMA-ES proposed by Hansen and Ostermeier (2001). Cumulative step size adaptation compiles a search path, starting from an initial value, by implementing a fading record of past steps taken by the strategy according to
formula
2
The length of its memory is determined by the choice of the constant c, which is referred to as the cumulation parameter. CSA updates the mutation strength from generation t to generation t+1 according to the rule
formula
3
with damping parameter that scales the magnitude of the updates. Notice that update rule (3) differs from the original mutation strength update in Hansen and Ostermeier (2001) in that it adapts based on the squared length of the search path rather than on the search path’s length. With appropriately chosen parameters, both variants often show similar behavior. In accordance with Arnold (2013), this work uses the mutation strength update (3) because it simplifies the analysis of CSA.

The decision whether the mutation strength is decreased or increased depends on the sign of the numerator in Equation (3). The basic idea is that long search paths indicate that selected mutation vectors predominantly point in one direction and could be replaced with fewer but longer steps. A short search path suggests that the strategy is moving back and forth and should thus benefit from smaller step sizes. In the unconstrained setting with randomly selected offspring candidate solutions (i.e., if the ordering in step 2 of the algorithm is random), the search path has an expected squared length of N. In this case the mutation strength performs a random walk on a logarithmic scale; that is, in expectation the logarithm of the mutation strength does not change at all. It is thus not unexpected that CSA may fail under some conditions in constrained settings.
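A minimal sketch of the two CSA updates follows. The normalization sqrt(c(2-c)) for the fading record and the damping form 2DN in the exponent are customary choices and assumptions of this sketch; the paper's exact constants are not reproduced here.

```python
import numpy as np

def csa_update(s, sigma, z_sel, c, D):
    """One CSA step: fading record of selected steps (cf. Eq. (2)) and the
    squared-length mutation strength update (cf. Eq. (3))."""
    N = s.size
    # Fading record: old path decays by (1 - c), the selected mutation
    # vector enters with the customary normalization sqrt(c * (2 - c)).
    s = (1.0 - c) * s + np.sqrt(c * (2.0 - c)) * z_sel
    # ||s||^2 > N: steps point in consistent directions, lengthen sigma;
    # ||s||^2 < N: the strategy steps back and forth, shorten sigma.
    sigma = sigma * np.exp((np.dot(s, s) - N) / (2.0 * D * N))
    return s, sigma
```

Under random selection the expected squared path length equals N, so the update is neutral on average, matching the random-walk argument above.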

2.1  Repair by Reflection

In this section the repair mechanism referred to as reflection is considered. It enables the (1,λ)-ES to deal with a single linear constraint (see Figure 1). Initially infeasible offspring candidate solutions are mirrored into the feasible region of the search space. The repair step in the algorithm reads

formula
Reflection of an initially infeasible offspring candidate solution transforms the first two components of the corresponding mutation vector in the following way:
formula
4
and
formula
5
All other components of the mutation vector remain unchanged. Notice that the feasible point obtained by reflection resides at the same distance from the constraint plane as the initially infeasible solution.
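Under the coordinate conventions above, reflection amounts to an ordinary mirror operation across the constraint plane. The plane through the origin with unit normal n = (cos(theta), sin(theta), 0, ..., 0) is an assumption of this sketch; only the first two coordinates change, in line with Eqs. (4) and (5).

```python
import numpy as np

def reflect(y, theta):
    """Mirror an infeasible offspring y across the constraint plane.

    Assumes the plane contains the origin and has unit normal
    n = (cos(theta), sin(theta), 0, ..., 0), so only the first two
    coordinates are modified."""
    d = y[0] * np.cos(theta) + y[1] * np.sin(theta)   # y . n
    y = y.copy()
    y[0] -= 2.0 * d * np.cos(theta)
    y[1] -= 2.0 * d * np.sin(theta)
    return y
```

The mirrored point lies at the same distance from the plane as the original, but on the feasible side.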

2.2  Repair by Truncation

We now consider the second repair mechanism, which truncates the mutation vector of an initially infeasible offspring candidate solution at the boundary of the feasible region of the search space. In the following we refer to this method as truncation. Like repair by reflection, truncation enables the (1,λ)-ES to deal with the previously introduced single linear constraint. The procedure is illustrated in Figure 2. Accordingly, the repair step 1b in the algorithm can be formulated as

formula
Figure 2:

Repair by truncation applied to a linear objective function with a single constraint, displayed within the subspace spanned by the x1- and x2-axes. The shaded area is the feasible region. From the parental candidate solution at a distance an infeasible candidate solution is generated, and its mutation vector is truncated at the constraint plane, yielding the new point .


An offspring candidate solution is generated by random sampling and, if applicable, repair by truncation. Truncation modifies the standard normally distributed mutation vector of an initially infeasible offspring in the following way:
formula
6
and
formula
7
Again, all other components of the mutation vector are not affected by the repair step.
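A sketch of the truncation step under the same assumed sign conventions (plane through the origin, feasible iff x1·cos(theta) + x2·sin(theta) <= 0): the first two mutation components are rescaled so that the repaired offspring lands exactly on the constraint boundary, while all further components stay untouched, in line with Eqs. (6) and (7).

```python
import numpy as np

def truncate(x, z, sigma, theta):
    """Truncate the mutation vector z of an infeasible offspring of
    parent x so that the repaired offspring lies on the boundary."""
    n1, n2 = np.cos(theta), np.sin(theta)
    delta = -(x[0] * n1 + x[1] * n2) / sigma   # normalized parent distance
    zn = z[0] * n1 + z[1] * n2                 # mutation length along the normal
    z = z.copy()
    if zn > delta:                             # offspring infeasible, cf. (1)
        t = delta / zn                         # shrink factor in [0, 1)
        z[0] *= t
        z[1] *= t
    return z
```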

3  Behavior for Fixed Mutation Strength

As a first step, the strategy is investigated for a fixed mutation strength. That is, the CSA step of the algorithm is omitted throughout this section. The probability of an offspring candidate solution being feasible and the expected improvement in objective function value depend on the distance of the parental candidate solution from the constraint plane. Hence, the behavior of the ES can be completely described by means of the normalized distance. According to Arnold (2013), the evolution of the normalized distance is then characterized by the difference equation
formula
8
where superscripts indicate time, or the number of generations. The mutation vector appearing in the equation is the mutation vector after repair associated with the best offspring candidate solution in generation t.

In the following a characterization of the distribution of offspring candidate solutions conditional on is provided, and the expected step made in a single iteration of the algorithm is calculated considering both repair mechanisms. Finally, the average normalized distance from the constraint plane realized by the respective repair methods when operating with constant mutation strength is investigated.

3.1  Single Step Behavior: Mutation

The variation step of the evolution strategy consists of mutation and, when applicable, repair according to the specific constraint-handling approach used. Since initially infeasible offspring candidate solutions are repaired, the repair step alters the distribution of the originally normally distributed mutation vectors. The marginal distribution of the z1 components and the distribution of the z2 components conditional on z1 after repair are derived in the appendix for both repair mechanisms. Taking into account Equation (34), the marginal density of the z1 components after reflection reads
formula
9
Notice that in this context Φ denotes the cumulative distribution function of the standard normal distribution, and the abbreviation
formula
10
is used. According to Equation (36), the corresponding cumulative distribution function is
formula
11
With repair by truncation the respective marginal density of the z1 components is derived in Equation (55) and reads
formula
12
The corresponding cumulative distribution function is found in Equation (56) as
formula
13
In the case of truncation the abbreviation
formula
14
is used to improve the readability of the results.

3.2  Single Step Behavior: Selection

The purpose of this section is to characterize the expected step made by the strategy for both repair mechanisms. To this end, the first two components of the mutation vector that belongs to the best offspring candidate solution are considered, and their first two moments about zero
formula
15
are computed. Since the following formulas hold for both repair cases, superscripts indicating the constraint-handling method are omitted. Depending on the repair method applied, the first component is the λth (i.e., largest) order statistic of a sample of λ independent realizations of a random variable distributed according to the density in Equation (9) in the case of reflection, or according to the density in Equation (12) in the case of truncation. The probability density function of that component is thus
formula
16
By integration of Equation (16) the means of , can be computed as
formula
17
and
formula
18
Similarly, the second moments about zero of for equal
formula
19
and
formula
20
The moments , conditional on z1 are presented in the appendix for both constraint-handling methods. In the case of reflection they can be found in Equations (40) and (41). For repair by truncation the respective expressions are derived in Equations (60) and (61).
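The order-statistic integrals above can be cross-checked by plain Monte Carlo simulation. The sketch below estimates the first two moments of the selected z1 component under reflection repair; the sign conventions and the identification of the best offspring with the largest repaired z1 value follow the setup above but are assumptions of this sketch.

```python
import numpy as np

def selected_z1_moments(delta, theta, lam, trials, rng):
    """Estimate E[z1] and E[z1^2] of the mutation vector belonging to
    the best of lam offspring, with reflection repair at normalized
    parent distance delta (cf. Eqs. (17) and (19))."""
    n1, n2 = np.cos(theta), np.sin(theta)
    z1_best = np.empty(trials)
    for t in range(trials):
        z = rng.standard_normal((lam, 2))       # only z1 and z2 matter here
        zn = z[:, 0] * n1 + z[:, 1] * n2
        bad = zn > delta                        # infeasible offspring, cf. (1)
        z[bad, 0] -= 2.0 * (zn[bad] - delta) * n1   # reflect, cf. (4) and (5)
        z[bad, 1] -= 2.0 * (zn[bad] - delta) * n2
        z1_best[t] = z[:, 0].max()              # selection maximizes f = x1
    return z1_best.mean(), (z1_best ** 2).mean()
```

For a parent far from the plane the constraint is effectively inactive and the estimate approaches the unconstrained progress coefficient, the expected maximum of lam standard normal variates.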

3.3  Steady State Behavior

All results derived up to this point concern the algorithm’s behavior in single time steps and are conditional on the normalized distance from the constraint plane. Because the mutation strength is assumed to be held constant, the distribution of distance values approaches a stationary limit distribution when the algorithm is iterated. According to Meyer-Nieberg and Beyer (2012), an approximation of this limit distribution can be computed by expanding the unknown distribution in terms of its moments, calculating the resulting moments after an update, and imposing stationarity conditions demanding that the moments remain unchanged. This approach results in a system of as many equations as there are moments included in the computation. In a zeroth-order approximation, the limit distribution is characterized solely by its mean, neglecting higher-order moments. The limit distribution can thus be modeled as a (shifted) Dirac delta function, and by solving the corresponding stationarity condition
formula
21
for , one obtains an approximation of the average distance of the parental candidate solution from the constraint plane. Though considering higher-order moments may provide better approximations, the Dirac model turns out to provide a quite accurate description of the behavior of the constraint-handling mechanisms.
Taking the expected value of Equation (8) and using the stationarity condition (21) yields
formula
22
Integral representations of the two moments were derived in Section 3.2. Thus Equation (22) can be solved numerically for the average normalized distance. The results are illustrated in Figure 3 for both constraint-handling techniques. The average normalized distance of the parental candidate solution from the constraint plane realized by each repair mechanism is compared to results obtained by Arnold (2013) using resampling as well as repair by projection. In this context, resampling refers to the constraint-handling technique that repeatedly resamples infeasible candidate solutions until feasible offspring candidate solutions have been generated in each iteration of the algorithm. The repair mechanism referred to as projection simply translates initially infeasible candidate solutions along the normal vector of the constraint plane onto the boundary of the feasible region. In Figure 3(a) the average normalized distance realized by the (1,λ)-ES that repairs initially infeasible candidate solutions by reflection is plotted against the constraint angle. Figure 3(c) presents the respective results for the ES that truncates the mutation vector of an initially infeasible candidate solution at the boundary of the feasible set. Figures 3(b) and 3(d) illustrate the same information using logarithmic scales. For both strategies, that is, reflection as well as truncation, the average normalized distance increases with increasing constraint angle. Since it truncates infeasible candidate solutions at the boundary of the feasible region, the strategy using truncation tracks the constraint plane more closely than the strategy that uses reflection. For small constraint angles, the realized average normalized distances resemble those obtained by projection. The points in Figure 3 represent experimental data obtained by averaging over steps, or generations, of the respective evolution strategies described in Section 2 with fixed mutation strength.
When these data points are compared to the solid curves, the zeroth-order approximation provided by the Dirac model reveals good agreement for small constraint angles. For larger constraint angles, the approximation quality deteriorates. As the constraint angle becomes less acute, the strategy tracks the constraint plane at a greater distance, and variations in that distance become significant. In this situation the Dirac model is inappropriate. Making use of an exponential model, like the one proposed in Arnold and Brauer (2008), would likely increase the accuracy of the modeled distance distribution.
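The steady-state distances can also be estimated directly by iterating the strategy with fixed mutation strength and time-averaging the normalized distance, in the spirit of the data points in Figure 3. A minimal sketch for reflection repair, with assumed sign conventions and an arbitrary burn-in fraction:

```python
import numpy as np

def average_normalized_distance(theta, lam, sigma, steps, rng):
    """Iterate the fixed-sigma (1,lam)-ES with reflection repair in the
    x1-x2 subspace and average delta = g(x)/sigma over the trajectory."""
    n1, n2 = np.cos(theta), np.sin(theta)
    x = np.array([-1.0, 0.0])                    # feasible starting point
    deltas = []
    for t in range(steps):
        y = x + sigma * rng.standard_normal((lam, 2))
        d = y[:, 0] * n1 + y[:, 1] * n2          # y . n; infeasible if > 0
        bad = d > 0.0
        y[bad, 0] -= 2.0 * d[bad] * n1           # reflect into feasible region
        y[bad, 1] -= 2.0 * d[bad] * n2
        x = y[np.argmax(y[:, 0])]                # best offspring w.r.t. f = x1
        if t >= steps // 10:                     # discard burn-in transient
            deltas.append(-(x[0] * n1 + x[1] * n2) / sigma)
    return float(np.mean(deltas))
```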
Figure 3:

Average normalized distance from the constraint plane is plotted against the constraint angle for the -ES. (a) Numerical results of reflection are denoted by solid line. (b) Plot of the same information using logarithmic scales for both axes. (c) Numerically generated average distance realized by truncation is denoted by solid line. (d) Truncation results in logarithmic axes. In each part the predictions are compared to experimental results obtained by averaging over steps of the corresponding evolution strategies as well as to numerical results of repair by projection (dash-dotted line) and resampling (dashed line).


4  Mutation Strength Control

In this section we drop the assumption that the mutation strength of the strategy is fixed. That is, the strategy’s step size is adapted using cumulative step size adaptation as described in Section 2. This affects the long-term behavior of the (1,λ)-ES in the constrained linear environment considered in that the strategy either increases or decreases the mutation strength on average. With respect to the optimization problem (P), increasing the mutation strength is desired, as systematically decreasing it would lead to premature convergence.

The behavior of the (1,λ)-ES with cumulative step size adaptation applied to the constrained linear problem can be described by a nonlinear, stochastic dynamic system that depends on the parental candidate solution, the mutation strength, and the search path. Rather than trying to solve the complex problem of investigating the dynamics of the stochastic process, we follow the approach in Arnold (2013), where the logarithmic adaptation response of the strategy operating out of a stationary state is considered. The logarithmic adaptation response of the (1,λ)-ES applying CSA is defined as
formula
23
The sign of the logarithmic adaptation response indicates whether the mutation strength is increased or decreased; that is, if it yields positive values, then the mutation strength is increased, and vice versa. Experiments indicate that the resulting predictions of the strategy’s dynamic behavior are accurate for a broad range of parameter settings.
Assume that the strategy has been iterated with a fixed mutation strength for a sufficiently long time until time step t. Then the expected logarithmic adaptation response at time step t+1 is computed. The derivation of the expected value is carried out in Arnold (2013), resulting in
formula
24
This equation depends on the first two moments of the first and second components of the mutation vector that yields the best offspring candidate solution in the respective generation. These moments are specified in Equations (17)–(20) for both repair mechanisms considered. Notice that Equation (24) depends on the search space dimension only through the choice of the cumulation parameter c, which is usually specified as an N-dependent value.

The theoretical results for the moments can be validated by comparing them to measurements from experimental runs of the evolution strategies. Using the Dirac model to compute the average distance of the parental candidate solution from the constraint plane by solving Equation (22), the moments can be evaluated numerically. For reflection and truncation the resulting curves are plotted against the constraint angle in Figure 4. The comparison of the curves with the corresponding values obtained experimentally in runs of the respective (1,λ)-ES with fixed mutation strength shows good visual agreement for the first moments. The approximation quality of the Dirac model deteriorates for the second moments. For the second moment of the first component, the approximation exhibits small deviations for very acute constraint angles; however, the values in that scenario are generally small and have limited impact on the results. For the second moment of the second component, deviations occur with increasing constraint angle, but the agreement remains visually good for small constraint angles.

Figure 4:

Moments of the first and second components of are plotted against the constraint angle , for and . Results of repair by reflection are denoted by solid lines; moments obtained by truncation, by dashed lines. The discrete data points represent measurements obtained by averaging over runs of the -ES with fixed mutation strength .


By Equation (24), the logarithmic adaptation response decreases with increasing cumulation parameter. As c approaches 1, the second term decreases to zero. That is, depending on the first term, the expected logarithmic adaptation response may be negative, resulting in decreasing step sizes. In order to achieve a positive expected logarithmic adaptation response, the negative third term in Equation (24) has to be outweighed by the first two terms. This can be realized for c values that are sufficiently close to zero. Requiring a positive expected logarithmic adaptation response yields the condition
formula
25
for the cumulation parameter. Condition (25) allows for the computation of the maximal c value for which stagnation, that is, convergence to a nonstationary point, can be avoided. The calculation is performed by considering equality in (25), where the moments are computed by making use of the Dirac model for the average distance from the constraint plane. The results for the strategies using reflection and truncation are illustrated in Figure 5 and compared to the corresponding results for projection and resampling obtained by Arnold (2011a, 2011b). It can be observed that for the strategy that reflects initially infeasible candidate solutions, the maximal cumulation parameter c capable of avoiding premature convergence decreases to zero as the constraint angle becomes increasingly acute. For the strategies applying truncation and resampling, the same behavior can be deduced from Figure 5. That is, provided that the constraint angle is small enough, all three strategies will converge to a nonstationary limit point independent of the choice of the cumulation parameter c. This behavior had already been established for the strategy that resamples infeasible offspring candidate solutions. The approach of the maximal cumulation parameter toward a finite limit value for decreasing constraint angles, as observed by Arnold (2013), is realized only by the strategy that repairs infeasible candidate solutions by projection. Thus, only projection is able to ensure continuous progress for arbitrarily small constraint angles. Note that since the cumulation parameter satisfies c ≤ 1, values exceeding 1 are not shown in Figure 5.
Figure 5:

Maximum value of the cumulation parameter c for which convergence is avoided plotted against the constraint angle . The results for the repair mechanisms reflection, truncation, and projection, as well as constraint handling by resampling, are displayed for .


The failure of cumulative step size adaptation in the limit of small constraint angles (see Figure 4) can be explained as follows. The moments of the selected mutation vector’s first component describe the behavior in the x1 direction. For small constraint angles, both of those moments assume values near zero, as both strategies operate in close vicinity to the constraint plane. In order to compensate for the small contribution of these moments to the expected squared length of the search path on the x1-axis, the corresponding moments of the second component have to be large enough to ensure a positive expected logarithmic adaptation response. In terms of the first two moments of the z1 and z2 components, repair by reflection resembles the strategy that resamples infeasible solutions. Arnold (2013) found that the behavior in the x2 direction almost amounts to a random walk. The same observation can be made for the strategy that applies reflection (see Figure 4). This can be understood by considering the way reflection acts on initially infeasible offspring candidate solutions. In the limit of very small constraint angles, on average half of the offspring are infeasible and are reflected into the feasible region. On average, half of those that are reflected end up with a positive z2 component, and the other half with a negative z2 component. However, offspring reflected to a negative z2 component are, for small constraint angles, only slightly more likely to be selected. That is, in contrast to the observations made for projection (see Arnold [2013]), the strategy using reflection does not exhibit a strong correlation between candidate solutions being successful and a large negative value of the z2 component. As a consequence, reflection behaves similarly to resampling.

With the strategy that truncates the mutation vectors of initially infeasible candidate solutions at the boundary of the feasible region, the moments of the first component approach even smaller values. The observed moments of the second component approach zero for decreasing constraint angles. Thus they are not suitable to compensate for the low contribution of the first component’s moments on the x1-axis. To prevent convergence to a nonstationary limit point, the strategy using truncation requires even smaller choices of the cumulation parameter c. In the limit of very small constraint angles, the probability of generating an initially infeasible offspring is again approximately 0.5. The region that provides positive z1 components after truncation shrinks considerably with decreasing constraint angle. That is, the mutation vectors of successful candidate solutions after truncation are on average very small. The same correlation that was observed in the case of projection can be inferred for truncation, namely, that successful offspring after truncation tend to be associated with negative z2 components. But this correlation is not able to compensate for the drawback of small mutation vectors. As a consequence, repair by truncation is not a well-suited constraint-handling method for the (1,λ)-CSA-ES, especially for rather small constraint angles.

The accuracy of the predictions based on the Dirac model is verified in Figure 6. There, all curves are obtained from condition (25) for both reflection and truncation as well as for several offspring population sizes λ. Each mark in Figure 6 represents experimental results obtained from 10 independent runs of the respective evolution strategies with cumulative step size adaptation, initialized with fixed values of the mutation strength, the parental candidate solution, and the search path. Search space dimensionality and damping parameter were held fixed across all runs. The runs were terminated once the mutation strength σ assumed a value either smaller than a small lower threshold or greater than a large upper threshold. Very small step sizes suggest that stagnation is likely to ensue, whereas large step sizes point to continuing progress at increasing rates. The + symbol in Figure 6 indicates that at least 9 of the 10 runs terminated with σ below the lower threshold; the × symbol indicates that termination took place with σ above the upper threshold in at least 9 of the 10 runs. Neither symbol is present at a grid location in the plots if both termination criteria were each satisfied in at least 2 of the 10 runs. Within the range of considered constraint angles and population size parameters, the agreement of our predictions based on condition (25) with the experimentally generated results is very good. Only for small population sizes does the strategy using truncation show deviations from the theoretical predictions. This points toward the Dirac model providing an insufficient approximation of the evolution strategy’s behavior in the range of small constraint angles and small population sizes.
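The experimental protocol just described can be sketched as follows. The CSA update used here is a common textbook variant rather than necessarily the paper's exact normalization, and all parameter values (dimensionality, λ, c, damping, thresholds) are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np

def csa_es_run(theta, lam, c, D, N=40, sigma0=1.0, max_gens=2000,
               lo=1e-8, hi=1e8, seed=0):
    """Sketch of a (1, lambda)-ES with cumulative step size adaptation and
    repair by reflection, maximizing f(x) = x1 subject to x @ n >= 0 with
    n = (cos(theta), sin(theta), 0, ..., 0).  Assumed geometry and a
    textbook CSA rule; parameter values are illustrative only."""
    rng = np.random.default_rng(seed)
    n = np.zeros(N)
    n[0], n[1] = np.cos(theta), np.sin(theta)
    x = n.copy()                       # feasible start at distance 1 from the plane
    sigma, s = sigma0, np.zeros(N)
    for gen in range(max_gens):
        z = rng.standard_normal((lam, N))
        y = x + sigma * z
        d = y @ n
        bad = d < 0.0                  # reflect infeasible offspring into feasibility
        y[bad] = y[bad] - 2.0 * d[bad][:, None] * n
        z = (y - x) / sigma            # mutation vectors after the repair step
        best = np.argmax(y[:, 0])      # select the offspring with largest f = x1
        x = y[best]
        s = (1.0 - c) * s + np.sqrt(c * (2.0 - c)) * z[best]
        sigma *= np.exp(c / (2.0 * D) * (s @ s / N - 1.0))
        if sigma < lo or sigma > hi:   # stagnation vs. sustained progress
            break
    return sigma, gen + 1

sigma, gens = csa_es_run(theta=0.3, lam=10, c=0.05, D=1.0)
print(f"final mutation strength {sigma:.3e} after {gens} generations")
```

Scanning c over a grid of constraint angles and recording which threshold is hit in the majority of repeated runs reproduces the kind of +/× classification shown in Figure 6.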

Figure 6:

Experimental verification of the convergence behavior for the strategies using reflection and truncation. The solid lines represent the analytically obtained maximum cumulation parameter values c up to which premature convergence is avoided (see condition (25)). The c values are plotted against the constraint angle θ. The markers indicate runs of the respective (1, λ)-ES with cumulative step size adaptation. Runs in which the mutation strength is steadily decreased on average are marked by the + symbol, and cases in which the mutation strength is increased, by the × symbol.


5  Summary and Conclusions

This paper investigated constraint-handling techniques for the (1, λ)-ES using cumulative step size adaptation on a linear problem with a single linear inequality constraint. Two different repair mechanisms were considered: the reflection of infeasible points into the feasible region and the truncation of mutation vectors at the boundary of the feasible area. As pointed out earlier, an interesting aspect of the simple scenario of a linear function with a single linear constraint is that it can be used to model microscopic properties of more general constrained problems.

The use of the simple Dirac model allowed for approximating the average distance of the parental candidate solution from the constraint plane, provided that the strategy runs with fixed mutation strength. With cumulative step size adaptation and the assumption that the evolution strategy operates in a stationary state, this approximation was then used to derive an expression for the expected logarithmic adaptation response. Hence, it was possible to calculate the maximal cumulation parameter up to which premature convergence can be avoided by the (1, λ)-ES. The theoretical predictions were verified by experiments over a wide range of constraint angles with varying numbers of offspring. The validation revealed generally very good agreement.

While projection still allows for a positive logarithmic adaptation response for arbitrarily small constraint angles, neither reflection nor truncation turned out to enable the strategy to make sustained progress by continually increasing the mutation strength for small θ. The reason for the better performance of projection is that it exhibits a strong correlation between an offspring being successful and a large negative value of its z2 component. This is not true for repair by reflection, since an initially infeasible offspring with negative z2 component is not necessarily successful after reflection. Thus reflection shows a behavior similar to resampling. For truncation the strong correlation between a successful offspring and a negative value of its z2 component can also be found. However, on average, truncation generates short mutation vectors, and consequently the correlation turns out to be insufficient to make up for the short length. There is also empirical evidence in Beyer and Finck (2012) that for more complex constraints, repair by projection provides better performance than truncation and resampling.

The results provide insight into the interactions of constraint-handling techniques used in combination with cumulative step size adaptation. Possible future work includes the extension of the analysis to the more general multirecombinant (μ/μ, λ)-ES. Furthermore, the interplay of repair by reflection with other step size adaptation methods, such as mutative self-adaptation, remains to be investigated. A theoretical attempt to address problems with a nonlinear constraint was provided by Arnold (2014). There, a linear optimization problem with a conically constrained feasible region was investigated, using resampling to deal with infeasible candidate solutions. A future analysis of this type of problem using repair by projection seems reasonable. Investigations regarding nonlinear objective functions and further nonlinear constraints are also worth considering.

Acknowledgments

This work was supported by the Austrian Science Fund (FWF) under grant P22649-N23.

References

Arnold, D. V. (2002). Noisy optimization with evolution strategies. Dordrecht, Netherlands: Kluwer.

Arnold, D. V. (2011a). Analysis of a repair mechanism for the (1, λ)-ES applied to a simple constrained problem. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2011, pp. 853–860.

Arnold, D. V. (2011b). On the behaviour of the (1, λ)-ES for a simple constrained problem. In Proceedings of the 11th Workshop on Foundations of Genetic Algorithms, FOGA XI, pp. 15–24.

Arnold, D. V. (2012). On the behaviour of the (1, λ)-σSA-ES for a constrained linear problem. In Parallel Problem Solving from Nature, PPSN XII, pp. 82–91.

Arnold, D. V. (2013). Resampling versus repair in evolution strategies applied to a constrained linear problem. Evolutionary Computation, 21(3):389–411.

Arnold, D. V. (2014). On the behaviour of the (1, λ)-ES for conically constrained linear problems. Evolutionary Computation, 22(3):503–522.

Arnold, D. V., and Beyer, H.-G. (2001). Local performance of the (μ/μi, λ)-ES in a noisy environment. In W. Martin and W. Spears (Eds.), Foundations of genetic algorithms, 6, pp. 127–141. San Francisco: Morgan Kaufmann.

Arnold, D. V., and Brauer, D. (2008). On the behaviour of the (1+1)-ES for a simple constrained problem. In Parallel Problem Solving from Nature, PPSN X, pp. 1–10.

Beyer, H.-G. (1989). Ein Evolutionsverfahren zur mathematischen Modellierung stationärer Zustände in dynamischen Systemen. Unpublished doctoral dissertation, Hochschule für Architektur und Bauwesen, Weimar, Germany.

Beyer, H.-G. (2001). The theory of evolution strategies. Berlin: Springer.

Beyer, H.-G., and Finck, S. (2012). On the design of constraint covariance matrix self-adaptation evolution strategies including a cardinality constraint. IEEE Transactions on Evolutionary Computation, 16(4):578–596.

Beyer, H.-G., and Schwefel, H.-P. (2002). Evolution strategies: A comprehensive introduction. Natural Computing, 1(1):3–52.

Collange, G., Reynaud, S., and Hansen, N. (2010). Covariance matrix adaptation evolution strategy for multidisciplinary optimization of expendable launcher families. 2010 AIAA Meeting Papers, 15(11).

Hansen, N. (2006). The CMA evolution strategy: A comparing review. In J. A. Lozano, P. Larrañaga, I. Inza, and E. Bengoetxea (Eds.), Towards a new evolutionary computation: Advances on estimation of distribution algorithms, pp. 75–102. Berlin: Springer.

Hansen, N. (2008). Adaptive encoding: How to render search coordinate system invariant. In Parallel Problem Solving from Nature, PPSN X, pp. 205–214.

Hansen, N., and Ostermeier, A. (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195.

Helwig, S., Branke, J., and Mostaghim, S. (2013). Experimental analysis of bound handling techniques in particle swarm optimization. IEEE Transactions on Evolutionary Computation, 17(2):259–271.

Kramer, O., and Schwefel, H.-P. (2006). On three new approaches to handle constraints within evolution strategies. Natural Computing, 5(4):363–385.

Meyer-Nieberg, S., and Beyer, H.-G. (2012). The dynamical systems approach: Progress measures and convergence properties. In G. Rozenberg, T. Bäck, and J. Kok (Eds.), Handbook of natural computing, pp. 741–814. Berlin: Springer.

Mezura-Montes, E., and Coello Coello, C. A. (2005). A simple multi-membered evolution strategy to solve constrained optimization problems. IEEE Transactions on Evolutionary Computation, 9(1):1–17.

Mezura-Montes, E., and Coello Coello, C. A. (2011). Constraint-handling in nature-inspired numerical optimization: Past, present and future. Swarm and Evolutionary Computation, 1(4):173–194.

Ostermeier, A., Gawelczyk, A., and Hansen, N. (1994). Step-size adaptation based on non-local use of selection information. In Parallel Problem Solving from Nature, PPSN III, pp. 189–198.

Oyman, A. I., Deb, K., and Beyer, H.-G. (1999). An alternative constraint handling method for evolution strategies. In Proceedings of the 1999 IEEE Congress on Evolutionary Computation, pp. 612–619.

Rechenberg, I. (1973). Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart, Germany: Frommann.

Runarsson, T. P., and Yao, X. (2000). Stochastic ranking for constrained evolutionary optimization. IEEE Transactions on Evolutionary Computation, 4(3):274–283.

Runarsson, T. P., and Yao, X. (2005). Search biases in constrained evolutionary optimization. IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, 35(2):233–243.

Schwefel, H.-P. (1975). Evolutionsstrategie und numerische Optimierung. Unpublished doctoral dissertation, Technische Universität Berlin, Fachbereich Verfahrenstechnik, Berlin.

Appendix  Computation of the Distributions

This appendix computes the distribution of the mutation vectors’ first components z1 after the repair step for both constraint-handling techniques, that is, for repair by reflection as well as the repair method using truncation. Subsequently, the distribution of the mutation vectors’ z2 components conditional on z1 is computed for the respective strategies.

A.1  Reflection

In the first step the probability distribution function of the z1 components of the mutation vector after reflection is calculated. Then we derive the first two moments about zero of its z2 components conditional on z1.

A.1.1  Distribution of the z1 Components after Reflection
The distribution of the z1 components after reflection is the sum of the distribution of the corresponding components from immediately feasible offspring candidate solutions and the distribution of those candidates that are initially infeasible and thus repaired by reflection into the feasible area. Reflection transforms the mutation vector of infeasible candidate solutions in the following way (see Equations (4) and (5)):
formula
26
A feasible offspring candidate solution has to satisfy the feasibility condition determined by the normalized distance δ of the parental candidate solution to the constraint plane and by the constraint angle θ. The joint probability density of the initially feasible z1 and z2 components is the truncated normal density
formula
27
Notice that the factor in the denominator is the probability of an offspring being feasible:
formula
28
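The feasibility probability in Equation (28) can be checked by simple Monte Carlo sampling. The sketch below assumes, for illustration only, the geometry also used in Section 4: an offspring is feasible iff delta + z1·cos(theta) + z2·sin(theta) ≥ 0. Under that assumption the projection z1·cos(theta) + z2·sin(theta) of a standard normal vector is itself standard normal, so the feasibility probability is the normal CDF evaluated at delta; the values of delta and theta below are hypothetical.

```python
import math
import random

# Monte Carlo check: feasibility probability equals Phi(delta) under the
# assumed geometry (z1, z2 independent standard normal).
random.seed(4)
theta, delta = 0.7, 0.3          # hypothetical constraint angle and distance
trials = 200_000
feasible = sum(
    delta + random.gauss(0.0, 1.0) * math.cos(theta)
          + random.gauss(0.0, 1.0) * math.sin(theta) >= 0.0
    for _ in range(trials)
)
phi = 0.5 * (1.0 + math.erf(delta / math.sqrt(2.0)))   # standard normal CDF
print(feasible / trials, "vs", phi)
```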
Accordingly, the joint probability density of initially infeasible z1 and z2 components before reflection is a truncated normal density as well. It is described by
formula
29
The strategy reflects infeasible candidate solutions into the feasible region of the search space. That is, a transformation of the random variables x and y according to the repair step (26) applied to Equation (29) yields the joint probability density of initially infeasible z1 and z2 components after reflection. After simple rearrangements the transformation reads
formula
30
The Jacobian matrix of this transformation and its determinant are computed, and some straightforward simplifications are applied. The resulting joint probability density of the initially infeasible components after reflection is
formula
31
After the joint densities (27) and (31) are weighted with the probabilities of their occurrence, the overall joint probability density of the z1 and z2 components after the repair step (26) is
formula
32
for those x and y that satisfy the feasibility condition, and zero otherwise.
The marginal density of the z1 components after reflection is derived by integration over all feasible z2 values
formula
33
Solving the integral yields
formula
34
By introducing the abbreviation
formula
35
and integrating the marginal density with respect to x, the corresponding cumulative distribution function of the z1 components after reflection is
formula
36
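A numerical implementation of the cumulative distribution function (36) can be compared against an empirical CDF obtained by sampling. The sketch below again assumes the illustrative geometry used earlier (feasible iff delta + z1·cos(theta) + z2·sin(theta) ≥ 0, hypothetical delta and theta) and applies the reflection repair of Equation (26) to the infeasible samples.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, delta = 0.4, 0.5                         # hypothetical parameters
n = np.array([np.cos(theta), np.sin(theta)])
z = rng.standard_normal((300_000, 2))
d = delta + z @ n
bad = d < 0.0
z[bad] = z[bad] - 2.0 * d[bad][:, None] * n     # reflection repair (Equation (26))

z1 = np.sort(z[:, 0])
cdf = np.arange(1, z1.size + 1) / z1.size       # empirical CDF of z1 after repair
print("P(z1 <= 0) after reflection:", np.mean(z1 <= 0.0))
```

Plotting `cdf` over `z1` alongside Equation (36) provides a quick sanity check of the derivation.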
A.1.2  Distribution of the z2 Components after Reflection
The next step considers the distribution of the z2 components after reflection conditional on z1. On the basis of this distribution the first two moments about zero can be computed. The probability of an offspring being infeasible, and thus being in need of repair, conditional on z1 is obtained by computing the relative weight of the second term in the marginal density in Equation (34) and reads
formula
37
The contributions to the first two moments after reflection conditional on z1 that arise from initially feasible candidate solutions can be computed from the conditional probability density
formula
38
where
formula
39
denotes the marginal density function of the z1 components of initially feasible candidate solutions according to Equation (27). Taking into account Equation (38), the contributions to the first two moments equal
formula
40
and
formula
41
The next step is concerned with the calculation of the contributions caused by initially infeasible candidate solutions after reflection. To this end, the marginal density function of the z1 components of initially infeasible candidate solutions is derived from the joint density in Equation (31), yielding
formula
42
Thus the conditional density of the z2 components reads
formula
43
and the contributions from initially infeasible, and thus reflected, offspring to the first two moments of z2 can be computed as
formula
44
and
formula
45
Weighting the terms with the probabilities of their occurrence and applying some simplifications, one obtains
formula
46
and
formula
47
for the first two moments about zero of the z2 component of mutation vectors after reflection conditional on z1.

A.2  Truncation

Now for repair by truncation, the probability distribution function of the z1 components of the mutation vector and the first two moments about zero of its z2 components conditional on z1 are computed.

A.2.1  Distribution of the z1 Components after Truncation
The distribution after truncation also has two components: one from offspring that are initially feasible and one resulting from initially infeasible offspring candidate solutions. Truncation repairs these infeasible offspring by cutting off the mutation vector at the edge of the feasible region of the search space (see Section 2.2). This results in a transformation of mutation vectors described by
formula
48
Notice that the contribution from initially feasible candidate solutions to the distribution of the z1 components is independent of the repair method and thus equals the first addend in Equation (34):
formula
49
That is, only the derivation of the contribution from truncated offspring to the marginal density of the z1 components has to be carried out. The probability of an initially infeasible candidate solution having a z1 component less than some x is the sum
formula
50
formula
51
Computing the derivative with respect to x and applying some straightforward simplifications yields
formula
52
After solving the integral in (52) and introducing the abbreviation
formula
53
the contribution of initially infeasible offspring after truncation is obtained by
formula
54
Finally, combining (49) and (54) yields the marginal density of the z1 component of mutation vectors after repair by truncation:
formula
55
By integration of Equation (55) with respect to x, the cumulative distribution function of the z1 component reads
formula
56
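As for reflection, an empirical CDF obtained by sampling gives a reference against which a numerical implementation of Equation (56) can be plotted. The sketch below uses the same illustrative geometry (feasible iff delta + z1·cos(theta) + z2·sin(theta) ≥ 0, hypothetical delta and theta) and scales infeasible mutation vectors back onto the constraint plane as in Equation (48).

```python
import numpy as np

rng = np.random.default_rng(5)
theta, delta = 0.4, 0.5                        # hypothetical parameters
n = np.array([np.cos(theta), np.sin(theta)])
z = rng.standard_normal((300_000, 2))
proj = z @ n
bad = delta + proj < 0.0
t = np.ones(len(z))
t[bad] = -delta / proj[bad]                    # delta + t*(z @ n) = 0 on the plane
z = t[:, None] * z                             # truncation repair (Equation (48))

# repaired offspring lie in the closed feasible half-plane
assert np.all(delta + z @ n >= -1e-9)
z1 = np.sort(z[:, 0])
cdf = np.arange(1, z1.size + 1) / z1.size      # empirical CDF for Equation (56)
print("P(z1 <= 0) after truncation:", np.mean(z1 <= 0.0))
```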
A.2.2  Distribution of the z2 Components after Truncation
Next, the distribution of the z2 components of mutation vectors after truncation conditional on z1 is considered. The probability of an offspring being in need of repair conditional on z1 is the relative weight of the contribution from truncated offspring in the marginal density (55) and equals
formula
57
Initially feasible offspring are not truncated. The moments of their z2 components are described by Equations (40) and (41). Initially infeasible candidate solutions after truncation have a z2 component that is uniquely determined by their z1 component, as the repaired point lies on the boundary of the feasible region, and thus
formula
58
and
formula
59
Weighting both terms with the probabilities of their occurrence yields the moments of the z2 components of mutation vectors after truncation conditional on z1. After some straightforward simplifications these moments are
formula
60
and
formula
61