Active-set approaches are commonly used in algorithms for constrained numerical optimization. We propose that active-set techniques can beneficially be employed for evolutionary black-box optimization with explicit constraints and present an active-set evolution strategy. We experimentally evaluate its performance relative to those of several algorithms for constrained optimization and find that the active-set evolution strategy compares favourably for the problem set under consideration.

Constrained optimization involves the problem of minimizing an objective function $f:\mathbb{R}^n \to \mathbb{R}$ subject to constraints
$$g_i(x) \le 0 \quad \text{for } i \in \{1,2,\ldots,l\}$$
$$h_j(x) = 0 \quad \text{for } j \in \{1,2,\ldots,m\}.$$
The former are referred to as inequality constraints, the latter as equality constraints. The feasible region is the set $D = \{x \in \mathbb{R}^n \mid g_i(x) \le 0 \text{ for all } i \in \{1,2,\ldots,l\} \text{ and } h_j(x) = 0 \text{ for all } j \in \{1,2,\ldots,m\}\}$. An inequality constraint $g_i$ is said to be active (or tight) at $x \in D$ if $g_i(x) = 0$; it is said to be inactive at $x \in D$ if $g_i(x) < 0$. The active set $A(x) = \{i \in \{1,2,\ldots,l\} \mid g_i(x) = 0\}$ is the set of the indices of all inequality constraints active at $x \in D$.
Nocedal and Wright (2006) contend that one of the main challenges in solving constrained optimization problems lies in deciding which inequality constraints are active at an optimal solution $x^* \in D$ and which are not. Knowledge of the optimal active set $A^* = A(x^*)$ simplifies the search for $x^*$. Active-set approaches employ a working set $W \subseteq \{1,2,\ldots,l\}$ that they update iteratively. In each iteration, they solve the problem of minimizing the objective function subject to equality constraints
$$g_i(x) = 0 \quad \text{for } i \in W$$
$$h_j(x) = 0 \quad \text{for } j \in \{1,2,\ldots,m\}.$$
Any constraints $g_i$ with nonpositive associated Lagrange multipliers are removed from the working set and will thus be treated as inequality constraints in subsequent iterations. Any inequality constraints that are violated by the solution are added to the working set and will thus be treated as equality constraints. The length of the step is then adapted such that the iterate remains in the feasible region. The algorithm terminates when a solution $x^*$ that satisfies the Karush--Kuhn--Tucker conditions has been found, at which point, under certain conditions, the working set equals $A^*$.
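
The working-set update described above can be sketched as follows. This is an illustrative Python sketch; the data structures, names, and the tolerance are our assumptions, not part of any particular implementation.

```python
def update_working_set(working_set, multipliers, violations, tol=1e-8):
    """One working-set update step of a generic active-set method (sketch).

    working_set: set of inequality-constraint indices currently treated
                 as equalities.
    multipliers: dict mapping each index in the working set to the Lagrange
                 multiplier obtained from the equality-constrained subproblem.
    violations:  dict mapping every inequality index i to g_i(x) at the
                 subproblem solution.
    """
    # Drop constraints whose multipliers are nonpositive: they will be
    # treated as ordinary inequality constraints in subsequent iterations.
    kept = {i for i in working_set if multipliers[i] > 0.0}
    # Add inequality constraints violated by the subproblem solution:
    # they will be treated as equality constraints from now on.
    newly_active = {i for i, g in violations.items()
                    if g > tol and i not in kept}
    return kept | newly_active
```

For instance, a constraint with multiplier $-0.2$ is dropped while a violated constraint outside the working set is added.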

Black-box optimization refers to problems where the objective is a black box in the sense that objective function values of points can be obtained, but no further information is available. Gradients can only be approximated through finite differencing. Obtaining an objective function value may require running a possibly expensive computer simulation or even conducting a physical experiment. The total number of objective function evaluations is commonly used as a measure for the computational cost of a black-box optimization process.
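
The cost of finite differencing in the black-box setting can be made concrete with a short sketch: a forward-difference gradient estimate consumes $n + 1$ objective function evaluations, which is significant when each evaluation requires an expensive simulation. The function name and step size here are our choices for illustration.

```python
def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate of a black-box function f at x.

    Costs len(x) + 1 evaluations of f, illustrating why gradient-based
    methods are expensive when objective evaluations are costly."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        xh = list(x)
        xh[i] += h          # perturb one coordinate at a time
        grad.append((f(xh) - fx) / h)
    return grad
```
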

In constrained black-box optimization, the objective function is a black box in the sense described above. However, the constraint functions may or may not be black boxes. Le Digabel and Wild (2015) propose a taxonomy of constraints along the following four axes:

• Quantifiable or nonquantifiable. A quantifiable constraint is one where constraint function values represent degrees of violation or feasibility. An example of a nonquantifiable constraint is one where the constraint function returns a Boolean value.

• Relaxable or unrelaxable. Relaxable constraints allow obtaining meaningful objective function values for infeasible points; unrelaxable ones do not.

• Simulation based or a priori. Simulation-based constraints are black boxes and their evaluation may involve substantial cost. A priori constraints are cheap to evaluate, and gradients can be obtained at an insubstantial cost either directly or through finite differencing.

• Known or hidden. Hidden constraints are not specified in the problem formulation and may occur for example when a simulation crashes.

The axes are not mutually orthogonal as, for example, hidden constraints cannot be a priori. We refer to constraints that are a priori and quantifiable as explicit constraints. An example of an explicit constraint is a bound constraint, which can be evaluated easily and for which gradients are readily available.
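
A bound constraint illustrates the notion of an explicit constraint: both values and gradients are available exactly and at negligible cost. The following sketch expresses bound constraints in the $g_i(x) \le 0$ form used above; the function names are ours.

```python
def bound_constraints(lower, upper):
    """Return (g, grad_g) for the bound constraints lower <= x <= upper,
    expressed as g_i(x) <= 0.  Values and gradients are available exactly,
    which makes these constraints explicit in the sense used here."""
    def g(x):
        # l_i - x_i <= 0 and x_i - u_i <= 0 for each coordinate i
        return [l - xi for l, xi in zip(lower, x)] + \
               [xi - u for u, xi in zip(upper, x)]

    def grad_g(x):
        n = len(x)
        rows = []
        for i in range(n):  # gradient of l_i - x_i is -e_i
            rows.append([-1.0 if j == i else 0.0 for j in range(n)])
        for i in range(n):  # gradient of x_i - u_i is +e_i
            rows.append([1.0 if j == i else 0.0 for j in range(n)])
        return rows

    return g, grad_g
```
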

Most work on constrained optimization with evolutionary algorithms (EAs) does not distinguish between different types of constraints, and many algorithms compute both objective and constraint function values for all candidate solutions generated in the course of a run. However, if the constraints are explicit, then it is conceivable that algorithms may benefit from evaluating some points using the (cheap) constraint functions, but not the (expensive) objective function. If the constraints are unrelaxable, then the use of the constraint functions for guiding the search toward the feasible region is unavoidable. Among the few instances of EAs that consider nonlinear constraints as explicit is work by Arnold (2016, 2017), Sakamoto and Akimoto (2019), and Spettel and Beyer (2020).

The primary goal of this article is to show that the use of active-set techniques in an EA can result in a powerful tool for black-box optimization problems where constraints are explicit and objective function evaluations are costly. Toward that goal, we present an active-set evolution strategy for explicitly constrained black-box optimization. The article extends our preliminary work (Arnold, 2016, 2017) in two directions. First, it introduces several modifications and algorithmic tweaks that result in improved performance on a wider range of test problems. A detailed discussion of those modifications can be found in Appendix A. Second, it presents an extensive experimental comparison with other algorithms for constrained black-box optimization.

We provide a discussion of related work in Section 2. Section 3 describes our algorithm and discusses implementation details. Section 4 presents an experimental comparison of our algorithm with other approaches to constrained black-box optimization. Section 5 concludes with a brief discussion.

This section briefly introduces evolution strategy related terminology and discusses issues with step size adaptation resulting from the presence of constraints in Section 2.1. Benchmarks for the experimental evaluation and comparison of EAs for constrained optimization are discussed in Section 2.2. EAs for constrained optimization that are relevant in the context of this work are surveyed in Section 2.3.

### 2.1 Evolution Strategies, Step Size Adaptation, and Constraints

EAs are stochastic algorithms for black-box optimization. They iteratively apply variation and selection to populations of candidate solutions. Evolution strategies constitute a class of EAs where variation may involve recombination of more than a single candidate solution—implemented as some form of arithmetic averaging—and always involves mutation—most often implemented as adding a random vector drawn from a normal distribution with zero mean. Hansen et al. (2015) provide a comprehensive overview of evolution strategies. The simplest evolution strategy is referred to as (1 $+$ 1)-ES and does not employ recombination. In each iteration it generates a single offspring candidate solution from a single parental candidate solution by means of mutation. It selects the better of the two with respect to objective function value as the parental candidate solution for the next iteration.

The covariance matrix of the mutation distribution is adapted throughout a run of the algorithm. Covariance matrix adaptation evolution strategies (CMA-ES) have been introduced by Hansen and Ostermeier (2001) and adapt the full matrix. For well-conditioned problems, the covariance matrix may be restricted to a scalar multiple of the identity matrix, leaving only a single parameter to be adapted. That parameter is referred to as the step size parameter and for the (1 $+$ 1)-ES can be adapted by means of the 1/5th rule (Rechenberg, 1973): decrease the step size if fewer than one in five offspring are successful in the sense of being superior to their parent; increase it otherwise.
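
The 1/5th rule admits a compact implementation: increase the step size on success and decrease it on failure, with magnitudes chosen so that the step size is unchanged on average when exactly one in five offspring succeed. The sketch below follows the common multiplicative formulation $\sigma \leftarrow \sigma \exp((\mathbb{1}_{\text{success}} - 1/5)/d)$ with damping constant $d$; the specific form used by any given implementation may differ.

```python
import math

def one_fifth_rule(sigma, success, d):
    """Multiplicative 1/5th-rule step size update for the (1+1)-ES (sketch).

    One success increases log(sigma) by 0.8/d; one failure decreases it by
    0.2/d, so one success exactly offsets four failures."""
    if success:
        return sigma * math.exp(0.8 / d)
    return sigma * math.exp(-0.2 / d)
```

With a success rate of one in five, the step size performs an unbiased random walk on the log scale.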

While evolution strategies are most commonly used for unconstrained optimization, constrained problems were already considered in the seminal work of Rechenberg (1973) and Schwefel (1975). Schwefel showed that handling constraints in the (1 $+$ 1)-ES by simply discarding infeasible offspring can result in the 1/5th rule failing even for very simple constrained problems, as illustrated in Figure 1. For convex problems, near the boundary of the feasible region, the fraction of offspring that improve on their parent decreases as candidate solutions approach the optimal solution, leading to a systematic decrease in the step size and thus resulting in failure to converge to the optimal solution. Arnold and Brauer (2008) have studied this phenomenon quantitatively by analyzing the behaviour of the algorithm on a linear objective function with a single linear constraint. Arnold (2013) considers evolution strategies that in each iteration generate a batch of offspring, employ recombination, and use cumulative step size adaptation as proposed by Ostermeier et al. (1994) to control the step size parameter. He compares constraint handling through resampling infeasible candidate solutions with constraint handling that repairs infeasible offspring by projecting them onto the boundary of the feasible region, and finds that while the former is prone to convergence to non-optimal points, the latter mechanism is capable of continued progress on a linear objective with a single linear constraint. Hellwig and Arnold (2016) consider further repair techniques.
Figure 1:

Two-dimensional convex objective function with a single linear constraint. The dashed black lines are contour lines of the objective function; the solid line marks the constraint plane. The feasible region is shaded. The dashed red circles show magnifications of two areas straddling the boundary of the feasible region. The areas shaded in green are those parts where feasible points have objective function values lower than at the centre of the circles.


Considering Figure 1, an active-set approach would at some point add the constraint into the working set and henceforth treat it as an equality constraint. An active-set evolution strategy would then be constrained to searching on the constraint boundary, where it faces an unconstrained convex problem and is capable of linear convergence.

### 2.2 Evaluating Algorithms for Constrained Black-Box Optimization

In contrast to mathematical programming approaches that are developed with the aim of being able to guarantee convergence to stationary points under certain conditions, EAs are commonly evaluated and compared experimentally, using sets of test problems. Partly drawing from earlier collections, such as those assembled by Hock and Schittkowski (1981) and Floudas and Pardalos (1987), a set of eleven test problems for constrained evolutionary optimization was gathered by Michalewicz and Schoenauer (1996). That set was expanded to 24 test problems to form the basis of a competition held in connection with the Special Session on Constrained Real-Parameter Optimization at the 2006 IEEE Congress on Evolutionary Computation (Liang et al., 2006). Detailed information on these test problems, which we refer to as the CEC 2006 test set, can be found in Appendix B. More recent competitions held in connection with the IEEE Congress on Evolutionary Computation have used sets of constrained test problems with variable dimension (Mallipeddi and Suganthan, 2010; Wu et al., 2016). However, the numbers of constraints present in those problems are small, and the problems' characteristics differ significantly from those present in other commonly used test sets. The most recent competition employed test problems closer in character to the 2006 test set, but also included mixed-integer problems (Kumar et al., 2020).

A central factor in the experimental evaluation and comparison of algorithms is the choice of performance measure. Hansen et al. (2016) discuss the relative merits of fixed-target approaches versus fixed-budget approaches and argue for running times measured as the number of objective function evaluations required to reach target objective function values as “the only available measure with a generic, meaningful, and quantitative interpretation.” We adopt that measure in place of the fixed-budget approach used in the CEC benchmarks.

Mallipeddi and Suganthan (2010) state that “it has become impossible to demonstrate the superior performance of newly designed algorithms” with the CEC 2006 benchmark since it “has been solved satisfactorily by several methods.” However, the number of objective function evaluations required to solve the CEC 2006 test problems is large even for those algorithms that were the most successful in the competitions. We argue that optimal solutions can be obtained with much less computational effort if the constraints are a priori.

### 2.3 Algorithms for Constrained Black-Box Optimization

Several strategies exist for equipping EAs for unconstrained optimization with constraint handling techniques. These strategies often combine all constraint function values of a point into a single scalar constraint violation value. An order is defined on the product space of objective function and constraint violation values and is used for selection. An example is the $ε$ constrained method due to Takahama and Sakai (2006, 2010b). The order defined in that method initially prioritizes the minimization of objective function values over the minimization of constraint violation, but then progressively shifts more emphasis toward the latter. Variants of $ε$ constrained differential evolution have ranked first in both the CEC 2006 and CEC 2010 competitions. Takahama and Sakai (2006) successfully solve all but two of the problems from the CEC 2006 test set, but require numbers of objective function evaluations that range from the low thousands to the hundreds of thousands. The $ε$ constrained method has been used in evolution strategies by Hellwig and Beyer (2018), whose $ε$MAg-ES adapts the full covariance matrix of its mutation distribution.

Some of the algorithms that employ the $ε$ constrained method approximate the gradients of the constraint functions to repair some infeasible candidate solutions, and thus implicitly assume that the constraints are a priori, while others do not. All of them require the constraints to be relaxable. Crucially, the method requires the specification of a parameter that determines the duration of the phase during which selection for objective function values may take precedence over selection for constraint violation. Both Takahama and Sakai (2010a) and Hellwig and Beyer (2018) set the number of iterations after which the minimization of constraint violation is given absolute priority over the minimization of the objective function to 1,000. It is unclear how this parameter should be set if optimal solutions can be obtained with far fewer objective function evaluations. In the remainder of this section, we focus on work that does not assume large function evaluation budgets.
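
The order used by the $ε$ constrained method is commonly expressed as an $ε$-level comparison on (objective value, constraint violation) pairs. The sketch below is our paraphrase of that comparison, not code from any of the cited implementations.

```python
def eps_less(a, b, eps):
    """ε-level comparison of (objective value, constraint violation) pairs
    (sketch).  If both violations are at most eps, or the violations are
    equal, objective values decide; otherwise the smaller violation wins.
    As eps shrinks toward zero, emphasis shifts from the objective to
    feasibility."""
    fa, va = a
    fb, vb = b
    if (va <= eps and vb <= eps) or va == vb:
        return fa < fb
    return va < vb
```

With a large $ε$, a point with a better objective value beats a more feasible one; with a small $ε$, the ordering is reversed.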

Martínez and Sobral (2013) contend that many derivative-free approaches for constrained optimization are not efficient on “thin” domains (e.g., those that include nonlinear equality constraints) and that other algorithms do not exploit constraints being a priori. They propose Skinny as an algorithm that in a first phase strives to decrease infeasibility without evaluating the objective function while in a second phase minimizes the objective function on a relaxed feasible set. The admissible constraint violation is tightened gradually. Using test problems from Hock and Schittkowski (1981) they find that Skinny requires fewer objective function evaluations than an augmented Lagrangian method and solves more problems than a derivative-free trust region approach.

Regis (2014) proposes COBRA, an algorithm for inequality constrained black-box optimization that uses radial basis function based surrogate models for both the objective and constraint functions. The approach does not require the constraint functions to be a priori and proceeds in two phases. In the first phase it aims to find a feasible point that is then improved upon in the second phase. The strategy has several parameters that need to be set. Bagheri et al. (2017) introduce SACOBRA, a self-adaptive variant of COBRA that uses an EA for the online optimization of those parameters, removing the need for problem specific tuning. Bagheri et al. (2016) enable SACOBRA to solve equality and inequality constrained problems by using a conjugate gradient method to minimize the sum of (surrogate model approximated) equality constraint violations in each iteration.

Regis (2018) proposes CONOPUS, a surrogate model assisted particle swarm algorithm for inequality constrained black-box optimization. Like COBRA, the approach does not assume that the constraints are a priori and fits surrogate models to both the objective and constraint functions. In each iteration, those models are used to refine the best point located so far. The algorithm is demonstrated to outperform particle swarm variants that do not employ surrogate models. Two further surrogate model assisted approaches applicable in the face of simulation based inequality constraints are a differential evolution variant proposed by Wang et al. (2019) and an algorithm by Li and Zhang (2021). In both cases, the experimental evaluation on the inequality constrained subset of problems from the CEC 2006 test set employs function evaluation budgets much smaller than those used in connection with the CEC 2006 special session.

Sakamoto and Akimoto (2019) design ARCH as an algorithm specifically for black-box optimization problems with constraints that are quantifiable and a priori, but not required to be relaxable. Their algorithm involves a constraint handling technique similar to an approach for handling bound constraints in the CMA-ES introduced by Hansen et al. (2009). Infeasible points are not evaluated directly; instead, each is associated with the objective function value of the closest corresponding point in the feasible region, at which the violated inequality constraints are tight. This is akin to the projection of points in active-set evolution strategies, but differs in that in ARCH projection is performed only for the purpose of computing objective function values; infeasible points remain within the infeasible region. Selection employs rankings of the objective and constraint function values that are based on a linear combination with an adaptive coefficient. The resulting algorithm is shown to possess desirable invariance properties.

Both Spettel et al. (2019) and Spettel and Beyer (2020) propose interior-point evolution strategies that repair infeasible candidate solutions through projection onto the feasible region before evaluating them. The former algorithm is applicable only to linearly constrained problems, the latter only to equality constrained problems.

This section introduces a (1 $+$ 1)-ES that uses an active-set approach for constraint handling and is referred to as (1 $+$ 1)-AS-ES. It is the result of making several modifications to the strategies presented by Arnold (2016, 2017); differences between the algorithm variants are described in Appendix A. The algorithm employs a working set $W \subseteq \{1,2,\ldots,l\}$ of indices of inequality constraints that are treated as equality constraints. We write $S(W) = \{x \in D \mid g_i(x) = 0 \text{ for all } i \in W\}$ for the subset of the feasible region where all of the constraints in the working set are tight and refer to it as the reduced search space. Offspring candidate solutions are projected onto the reduced search space, thus ensuring their feasibility. If an offspring candidate solution $y \in \mathbb{R}^n$ that improves on the parental candidate solution is selected to replace the latter, then any constraints that are tight at $y$ and that are not yet present in $W$ are added to the working set. As it is possible that constraints not in an optimal active set $A^*$ are added to the working set at some point during an optimization process, constraints in $W$ need to periodically be considered for removal. This is accomplished by temporarily releasing a constraint from the working set and generating an offspring candidate solution for which that constraint is not tight. If the offspring improves on the parental candidate solution, then the constraint is removed from the working set.

### 3.1 Algorithm

A single iteration of the (1 $+$ 1)-AS-ES is described in Figure 2. The state of the algorithm is determined by the current candidate solution $x \in D$ and its corresponding objective function value $f_x = f(x)$, step size parameter $\sigma \in \mathbb{R}$, working set $W$, iteration number $t \in \mathbb{N}$, and an array $\mathbf{t} = (t_1, t_2, \ldots, t_l)$ that for each inequality constraint stores the iteration number when it was most recently considered for removal from the working set. The initialization of $x$ and $\sigma$ is problem-specific. $W$ is initialized to $A(x)$ and $\mathbf{t}$ to hold all zeros. Line 1 computes the effective dimension $n_{\mathrm{eff}}$ of the reduced search space by subtracting the dimension of the span of the normal vectors of the equality constraints as well as the inequality constraints in the working set, evaluated at $x$, from the dimension $n$ of the domain of the objective function. Lines 2 through 6 determine whether or not a constraint from the working set is to be considered for removal. We choose to release a constraint with a probability of 20 percent by sampling a standard uniform random variable $U_{0,1}$. This figure represents a compromise in that a probability too close to zero would result in extended phases of wasted effort in cases where there are constraints in the working set that are not in an optimal active set. Choosing a value much larger than 20 percent (e.g., 50 percent) would noticeably impact the rate of convergence for those problems where an optimal active set has been identified and the effective dimension of the reduced search space is nonzero. Optimal values differ from problem to problem and depend on the dimension of the optimally reduced search space as well as on the number and the nature of the constraints. The choice of 20 percent is found to be workable for all problems considered. A constraint is also released if $n_{\mathrm{eff}} = 0$, as in that case the projection of an offspring candidate solution onto $S(W)$ is likely to result in the same point as the parental candidate solution.
In either case, the constraint in $W$ that has not been considered for removal from the working set for the longest time is selected, with ties broken at random. Lines 7 through 10 generate an offspring candidate solution $y$ through mutation and immediate projection onto the reduced search space. Projection is accomplished by minimizing the squared Euclidean norm $\lVert y - (x + \sigma z) \rVert^2$ subject to $y \in S(W \setminus \{k\})$. As discussed below, projection may be unsuccessful and thus potentially necessitate sampling and projecting several points. If a constraint is being considered for removal from the working set, then points are sampled and projected until a candidate solution is found for which the constraint in question is not tight. In Line 11, the offspring candidate solution is evaluated. Notice that this is the only point in the algorithm where an objective function evaluation is required. Lines 12 through 22 implement selection and step size adaptation. If the offspring candidate solution has a lower objective function value than the parental one, then it replaces the latter and any constraints in $A(y)$ not yet present in $W$ are added to the working set. If a constraint was considered for release, then it is removed from $W$ in Line 18. Lines 16 and 21 implement the 1/5th rule as proposed by Kern et al. (2004), with the choice of damping constant made in accordance with Hansen et al. (2015). The step size is adapted only in those iterations where constraints are not considered for release. Lines 23 through 26 update the array storing the release times and the iteration counter.
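
The control flow of a single iteration can be condensed into the following heavily simplified Python sketch. The projection routine, the active-set oracle, the stand-in rule for selecting the constraint to release, and all names are our assumptions; the sketch omits projection failures, Lagrange multiplier tests, and the bookkeeping of release times.

```python
import math
import random

def as_es_iteration(x, fx, sigma, W, f, project, active_set, n_eff, d):
    """One iteration of a (1+1)-AS-ES, heavily simplified (sketch).

    project(point, working_set) is assumed to return a feasible point on the
    reduced search space S(working_set); active_set(y) returns the indices of
    the inequality constraints tight at y; d is the damping constant of the
    1/5th rule."""
    release = None
    if n_eff == 0 or random.random() < 0.2:
        # Release a constraint (stand-in: smallest index; the actual
        # algorithm picks the least recently considered one).
        release = min(W, default=None)
    target_W = W - {release} if release is not None else W
    z = [random.gauss(0.0, 1.0) for _ in x]        # isotropic mutation
    y = project([xi + sigma * zi for xi, zi in zip(x, z)], target_W)
    fy = f(y)                                      # the only objective evaluation
    if fy < fx:
        # Success: replace the parent and absorb newly tight constraints.
        x, fx, W = y, fy, target_W | active_set(y)
        if release is None:
            sigma *= math.exp(0.8 / d)             # 1/5th rule, success branch
    elif release is None:
        sigma *= math.exp(-0.2 / d)                # 1/5th rule, failure branch
    return x, fx, sigma, W
```

On an unconstrained sphere with an identity projection and an empty working set, repeated iterations of this sketch drive the objective value down, as expected of a (1 $+$ 1)-ES.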
Figure 2:

Single iteration of the (1 $+$ 1)-AS-ES.


### 3.2 Details

Numerical issues arise both due to the limited precision of floating point number representations on digital computers and possibly due to objective or constraint functions for some inputs returning values that are not real numbers. We have implemented the (1 $+$ 1)-AS-ES in MATLAB,1 which uses double-precision floating-point numbers. The presence of equality constraints necessitates the specification of a parameter $\delta > 0$ that determines the maximally admissible constraint violation and that is referred to as the constraint tolerance. An equality constraint $h(x) = 0$ is thus interpreted as $|h(x)| \le \delta$; an inequality constraint $g(x) \le 0$ as $g(x) \le \delta$. We refer to a solution that is feasible to within the constraint tolerance as a viable solution.
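
The viability test can be stated compactly. The following sketch is ours; the constraint functions are passed as plain callables for illustration.

```python
def is_viable(x, ineq, eq, delta=1e-8):
    """Feasibility to within the constraint tolerance delta (sketch):
    each inequality g(x) <= 0 is interpreted as g(x) <= delta, and each
    equality h(x) = 0 as |h(x)| <= delta."""
    return all(g(x) <= delta for g in ineq) and \
           all(abs(h(x)) <= delta for h in eq)
```
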

Projection in Line 9 of the algorithm in Figure 2 requires the minimization of a squared Euclidean norm as described above and can be accomplished by sequential quadratic programming. Sequential quadratic programming may or may not result in a viable solution, potentially necessitating more than a single iteration of the loop in Lines 7 through 10. Although not shown in Figure 2 so as not to impede readability, in some situations projection may repeatedly fail to generate a viable candidate solution that satisfies the conditions in Line 10. We abort an iteration of the algorithm after 400 unsuccessful tries. If a constraint was being considered for removal from the working set, then that constraint is simply removed with a probability of 20 percent, and we set $t_k \leftarrow \min_{l \in W \setminus \{k\}} t_l$ otherwise. As a result, the next iteration of the algorithm may see either another constraint or no constraint at all considered for removal from $W$. The activation of this safeguard is observed only on a small subset of the test problems considered in Section 4.
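
The retry logic around projection can be sketched generically. All callables below are illustrative stand-ins for sampling a mutated point, attempting projection, and checking the acceptance conditions of Line 10; they are not part of the actual implementation.

```python
def project_with_retries(sample, try_project, accepts, max_tries=400):
    """Retry loop around projection (sketch): sample a point, attempt to
    project it, and accept the result only if the given predicate holds;
    give up after max_tries attempts."""
    for _ in range(max_tries):
        y = try_project(sample())
        if y is not None and accepts(y):
            return y
    return None  # caller aborts the iteration and may drop a constraint
```
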

The MATLAB Optimization Toolbox implements two sequential quadratic programming algorithms in the fmincon function: active-set and sqp. The former can be observed to be faster on some problems. However, it may enter infinite loops or crash in runs where the constraint functions return values that are not real numbers. A further difference is that active-set interprets tolerances as absolute while sqp considers them relative. As a consequence, sqp may for some problems return solutions that do not satisfy the (absolute) constraint tolerance $\delta$. In order to benefit from the relative strengths of both algorithms, we first attempt to project using active-set. In order to avoid crashes and infinite loops, we catch exceptions and use an output function to abort runs in which values that are not real are observed. If projection using active-set fails, then sqp is employed instead. Moreover, fmincon can either be supplied with the gradients of the constraint functions or it can approximate them through finite differencing. Differences between the two options are most obvious where values that are not real are encountered. If constraint gradients are available explicitly, then we first attempt to use them; we switch to gradients that are approximated numerically if sequential quadratic programming with the former fails. Both active-set and sqp are used with default parameter settings. Only those solutions that satisfy the constraints to within the absolute constraint tolerance are used.

Finally, Line 14 of the algorithm in Figure 2 requires determining which constraints are newly tight. In order for a constraint $g(y) \le 0$ to be added to the working set, we require that the corresponding Lagrange multiplier obtained in the process of projection and returned by fmincon is strictly positive and that the value of the constraint function is within the constraint tolerance in that $|g(y)| < \delta$.

We experimentally evaluate the (1 $+$ 1)-AS-ES with the primary goal of showing that the use of active-set techniques in an EA can result in a powerful tool for solving black-box optimization problems where constraints are explicit and objective function evaluations are costly. For that purpose, we compare the performance of the (1 $+$ 1)-AS-ES with that of several deterministic algorithms that use active-set and other techniques for handling constraints as well as with a stochastic algorithm that does not assume the explicitness of the constraints and has been found to be successful in a small function-evaluation budget scenario. We first describe our testing environment in Section 4.1 and discuss the algorithms that the (1 $+$ 1)-AS-ES is compared with in Section 4.2. We then present results from the experimental comparison in Section 4.3.

### 4.1 Testing Environment

We employ the CEC 2006 test problem set gathered by Liang et al. (2006). We choose this set as its problems have been designed or selected to exhibit specific intricacies that pose difficulties for algorithms for constrained optimization and thus present a diverse set of challenges. None of the algorithms that we consider have been designed to be applicable to mixed-integer problems, thus precluding the use of the 2020 test set. Moreover, the widespread use of the 2006 test set makes it possible to put into perspective performance figures from a wide range of published papers. The problems of the 2006 set are well understood and optimal function values are known, making it possible to not simply present but to interpret experimental results.

Problems of the CEC 2006 test set are labelled from g01 to g24 and have bound constraints as well as further inequality and equality constraints in the bound constrained region. In accordance with the definition of constrained optimization in Section 1, we require that for all problems the feasible region is closed and that the objective function is well defined anywhere on the feasible region, necessitating changes to the definitions of problems g14 and g20. Those modifications are detailed in Appendix B, along with a summary of characteristics of the entire test set. Starting points for all algorithms are sampled uniformly within the bound constrained region and may or may not be feasible. Experiments are run with constraint tolerance $\delta = 10^{-8}$ unless noted otherwise.

Our primary tool for comparing algorithm performance is empirical cumulative running-time distributions, which accumulate data from multiple runs and for multiple objective function value targets in easy-to-interpret figures. The results are "anytime" in that they do not assume specific function evaluation budgets. Hansen et al. (2021) provide a comprehensive discussion of the use and relative merits of empirical cumulative running-time distributions when compared with other approaches to assessing the performance of black-box optimization algorithms.
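
The construction of such a distribution amounts to counting, for each evaluation budget, the fraction of (run, target) pairs attained within that budget. The sketch below is our simplification; real tooling additionally handles bootstrapping and restarts.

```python
def ecdf_points(running_times, budgets):
    """Empirical cumulative running-time distribution (sketch): for each
    budget, the fraction of (run, target) pairs reached within that many
    objective function evaluations.  Unreached pairs are recorded as None."""
    total = len(running_times)
    return [sum(1 for t in running_times if t is not None and t <= b) / total
            for b in budgets]
```
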

With the exception of SACOBRA, which samples random points with a small probability, we do not consider restart algorithms and report success rates instead in order to prevent termination criteria from impacting the results. A problem is considered solved to within target accuracy $\varepsilon > 0$ if a solution $x$ is found that is feasible to within the constraint tolerance and that has an objective function value $f(x) < f^* + \varepsilon$, where $f^*$ is the value of an optimal solution to the problem. Target objective function values for generating empirical cumulative running-time distributions are obtained for each problem by selecting twenty values that are logarithmically uniformly spaced between a minimum value of $f^* + 10^{-8}|f^*|$ and the median objective function value of 101 points uniformly sampled in the bound constrained domain and subsequently projected onto the feasible region.
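
The target-value construction can be sketched as follows. The function name is ours, and the sketch assumes $f^* \ne 0$ and a median reference value strictly above $f^*$.

```python
import math

def target_values(f_star, f_median, k=20):
    """Twenty target objective function values for one problem (sketch):
    offsets from f_star are logarithmically uniformly spaced between
    1e-8 * |f_star| (hardest target) and f_median - f_star (easiest).
    Assumes f_star != 0 and f_median > f_star."""
    lo = 1e-8 * abs(f_star)
    hi = f_median - f_star
    ratio = hi / lo
    return [f_star + lo * ratio ** (i / (k - 1)) for i in range(k)]
```
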

### 4.2 Algorithms

We compare the performance of the (1 $+$ 1)-AS-ES with those of several other algorithms: two sequential quadratic programming approaches and an interior point algorithm implemented in the MATLAB Optimization Toolbox, as well as Skinny and SACOBRA as described in Section 2.3. Approaches that employ the $ε$ constrained method for constraint handling are less suitable for use in a fixed-target comparison, as they require a schedule for the parameter that controls the constraint violation below which selection based on objective function values takes precedence over the minimization of constraint violation. With their default schedules, the algorithms of Takahama and Sakai (2010a) and Hellwig and Beyer (2018) may require thousands of objective function evaluations before generating feasible candidate solutions and are thus not competitive in a scenario where targets are to be attained with small numbers of function evaluations. While SACOBRA will be seen below to require about 220 objective function evaluations to reach 50 percent of all targets, $ε$MAgES does not attain that milestone until in excess of 20,000 function evaluations. Although ARCH is designed for nonlinearly constrained problems, Sakamoto and Akimoto (2019) provide test results only for problems with linear constraints. The evolution strategies of Spettel et al. (2019) and Spettel and Beyer (2020) are designed for the solution of linearly constrained and equality constrained optimization problems, respectively, and are thus not applicable to the full CEC 2006 test set.

The sequential quadratic programming approaches and the interior point algorithm are available through the fmincon interface of the MATLAB Optimization Toolbox and are referred to as active-set, sqp, and interior-point, respectively. All of them are deterministic, and some differences between active-set and sqp are described in Section 3.2. In order to prevent the algorithms from stopping before a viable solution of the required target accuracy is found, we set the optimality tolerance, step tolerance, and function tolerance parameters to zero and use an output function to stop when the termination criteria given in Section 4.1 are satisfied. Runs are aborted as unsuccessful if the number of objective function evaluations exceeds $10^{5}$ or if values that are not real numbers are encountered. In the latter case, the algorithms would not be able to make further progress even if allowed to continue running. Skinny is run with default parameter settings and uses the DFO/Algencan packages as described by Martínez and Sobral (2013). We use the SACOBRA version current as of this writing (Version 1.2) with default parameter settings. The active-set evolution strategy is initialized with a value for the step size parameter $σ$ equal to one-fifth of the minimum extent of the bound constrained region across the dimensions.
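The step size initialization described above amounts to the following (Python; the bound vectors in the example are illustrative):

```python
def initial_step_size(lower, upper):
    """Initial step size sigma for the (1+1)-AS-ES: one-fifth of the
    smallest extent of the bound constrained region across all
    dimensions."""
    return min(u - l for l, u in zip(lower, upper)) / 5.0

# Two-dimensional example: extents are 10 and 2, so sigma = 2 / 5 = 0.4.
print(initial_step_size([0.0, -1.0], [10.0, 1.0]))  # 0.4
```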

### 4.3 Results

All six algorithms have been used to solve the problems in the CEC 2006 test set with constraint tolerance $δ=10^{-8}$. For each algorithm/problem combination we have conducted 101 runs. This number is sufficient for Fisher's exact test to affirm that an absolute difference between observed success rates of ten percentage points or more affords at least 95 percent confidence that the underlying success probabilities differ.
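For illustration, the significance of such a difference can be checked with a stdlib implementation of the two-sided Fisher's exact test; the sketch below is not the paper's computation, and the example contingency table (success rates 1.00 versus 0.90 over 101 runs each, a difference that occurs several times in Table 1) is chosen for illustration:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]
    (rows: algorithms, columns: successes/failures), summing the
    probabilities of all tables with the same margins that are no more
    likely than the observed one."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, row1)

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    p_obs = prob(a)
    x_min, x_max = max(0, row1 + col1 - n), min(row1, col1)
    return sum(prob(x) for x in range(x_min, x_max + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# 101 runs each: 101/101 successes versus 91/101 successes.
print(fisher_exact_two_sided(101, 0, 91, 10) < 0.05)  # True
```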

Presumably as a result of generating models based on increasing amounts of data, runs of SACOBRA slow down with increasing numbers of iterations. Like Bagheri et al. (2016), we have limited each run to a maximum of 500 objective function evaluations. The corresponding empirical cumulative running-time distributions are thus shown only up to that point.2 Longer runs of SACOBRA would likely result in higher success rates for some of the problems. At the same time, the median number of objective function evaluations across successful runs for those problems would increase.

For the active-set evolution strategy, the safeguard described in Section 3.2 that results in aborting iterations in which the algorithm repeatedly fails to generate viable candidate solutions can be observed to kick in infrequently on problems g02 and g07, and frequently on g22, which is highly ill-conditioned and where the vast majority of projections fail due to numerical issues. It is not triggered on any of the remaining problems.

Performance data for two objective function value targets are summarized in Table 1. Each cell in the table contains results for target accuracy $ε=10^{-4}$ on the left and for $ε=10^{-8}$ on the right. The numbers in the upper row are the median numbers of objective function evaluations across all successful runs. The numbers in the lower row indicate the fraction of runs that were successful. It can be seen from the table that the active-set evolution strategy solves 15 of the 24 problems to within target accuracy $ε=10^{-8}$ in all runs. Of the 9 problems that the (1 $+$ 1)-AS-ES does not solve in all of the runs, g02, g08, g12, g13, g18, and g23 are multimodal, and those runs that do not locate the optimal solution converge to locally optimal solutions. The evolution strategy solves problem g03 to within target accuracy $ε=10^{-4}$ in all runs, but in some cases fails to attain a solution satisfying the higher target accuracy. A reduced constraint tolerance of $δ=10^{-9}$ would allow it to succeed in all runs. Problem g17 is multimodal, and 42 of the 101 runs converge to the merely locally optimal solution. However, about one-third of the runs fail to locate either of the two locally optimal solutions. Those runs approach the globally optimal solution but fail to locate it to within the required accuracy due to an unwarranted decrease of the step size parameter, likely as a result of the discontinuity of the objective function, which creates a ridge-like structure. Finally, one of the 101 runs on problem g20 fails. Problem g20 is the problem with the highest dimension and the largest number of constraints tight at the optimal solution. In the failed run, the algorithm does not add all of the tight constraints to its working set and as a result generates many candidate solutions that after projection are inferior to their parent, resulting in a systematic decrease of the step size that prevents convergence to the optimal solution.

Table 1:

Median runtimes and success rates for target accuracies $10^{-4}$ and $10^{-8}$.

| Problem | (1 $+$ 1)-AS-ES | active-set | sqp | interior-point | Skinny | SACOBRA |
| --- | --- | --- | --- | --- | --- | --- |
| g01 | 22/22 | 28/35 | 28/42 | 352/438 | 1582/1589 | 30/120 |
|  | 1.00/1.00 | 0.17/0.12 | 0.14/0.09 | 0.56/0.56 | 0.98/0.98 | 1.00/0.63 |
| g02 | —/— | —/— | —/— | —/— | —/— | —/— |
|  | 0.00/0.00 | 0.00/0.00 | 0.00/0.00 | 0.00/0.00 | 0.00/0.00 | 0.00/0.00 |
| g03 | 285/555 | 653/674 | 707/714 | 872/943 | —/— | 141/— |
|  | 1.00/0.87 | 0.71/0.67 | 0.39/0.39 | 1.00/1.00 | 0.00/0.00 | 0.97/0.00 |
| g04 | 18/18 | 24/30 | 24/30 | 61/66 | 186/378 | 18/45 |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 |
| g05 | 37/79 | 45/45 | 50/50 | 61/61 | 419/425 | 567/567 |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 0.51/0.51 |
| g06 | 5/5 | 36/36 | 37/37 | 104/104 | 207/207 | 8/23 |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 |
| g07 | 279/449 | 155/155 | 157/157 | 187/242 | —/— | 38/277 |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 0.00/0.00 | 1.00/0.82 |
| g08 | 98/183 | 66/72 | 64/68 | 50/57 | 75/112 | 149/245 |
|  | 0.54/0.54 | 0.10/0.10 | 0.20/0.20 | 0.73/0.73 | 1.00/1.00 | 0.85/0.74 |
| g09 | 250/523 | 341/341 | 290/290 | 280/332 | 641/1147 | 279/— |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 0.79/0.00 |
| g10 | 119/219 | 342/342 | 360/360 | 1302/1355 | 12809/12953 | 122/165 |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 0.90/0.90 | 1.00/1.00 | 0.66/0.49 |
| g11 | 27/78 | 21/21 | 21/21 | 27/27 | 144/144 | 95/97 |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 |
| g12 | 68/174 | 26/26 | 50/50 | —/— | 60/81 | 9/9 |
|  | 0.94/0.94 | 0.17/0.10 | 0.01/0.01 | 0.00/0.00 | 1.00/1.00 | 0.76/0.76 |
| g13 | 82/166 | 98/120 | 109/126 | 144/144 | 423/450 | 362/462 |
|  | 0.84/0.84 | 0.37/0.21 | 0.41/0.31 | 0.29/0.29 | 0.59/0.59 | 0.28/0.19 |
| g14 | 205/1885 | 376/578 | 284/415 | 177/335 | —/— | 136/— |
|  | 1.00/1.00 | 1.00/0.99 | 1.00/1.00 | 1.00/1.00 | 0.00/0.00 | 0.81/0.00 |
| g15 | 7/55 | 44/44 | 40/40 | 45/48 | 142/142 | 121/125 |
|  | 1.00/1.00 | 0.92/0.90 | 0.88/0.88 | 0.84/0.84 | 0.93/0.93 | 0.96/0.96 |
| g16 | 15/15 | 109/109 | 187/187 | 537/580 | 543/543 | 113/279 |
|  | 1.00/1.00 | 0.14/0.14 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 0.71/0.02 |
| g17 | 81/197 | —/— | 351/368 | 179/— | 654/722 | 259/— |
|  | 0.40/0.27 | 0.00/0.00 | 0.16/0.16 | 0.18/0.00 | 0.09/0.09 | 0.34/0.00 |
| g18 | 33/36 | 100/100 | 110/110 | 233/382 | 1333/1374 | 324/— |
|  | 0.74/0.74 | 0.67/0.64 | 0.67/0.67 | 0.74/0.72 | 1.00/1.00 | 0.03/0.00 |
| g19 | 257/380 | 176/240 | 176/226 | 384/496 | —/— | —/— |
|  | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 1.00/1.00 | 0.00/0.00 | 0.00/0.00 |
| g20 | 51/51 | 525/538 | 300/300 | —/— | 1023/1118 | —/— |
|  | 0.99/0.99 | 0.26/0.23 | 0.78/0.73 | 0.00/0.00 | 0.24/0.23 | 0.00/0.00 |
| g21 | 44/44 | —/— | 143/143 | 409/712 | 442/574 | 853/— |
|  | 1.00/1.00 | 0.00/0.00 | 0.01/0.01 | 0.76/0.70 | 0.63/0.41 | 0.02/0.00 |
| g22 | 88/88 | —/— | —/— | 3714/3980 | —/— | —/— |
|  | 1.00/1.00 | 0.00/0.00 | 0.00/0.00 | 0.40/0.40 | 0.00/0.00 | 0.00/0.00 |
| g23 | 75/75 | 160/170 | 160/160 | 271/302 | 877/901 | 191/229 |
|  | 0.97/0.97 | 0.91/0.90 | 0.97/0.97 | 0.95/0.95 | 0.96/0.96 | 0.79/0.72 |
| g24 | 17/17 | 18/21 | 21/21 | 27/42 | 170/170 | 20/175 |
|  | 1.00/1.00 | 0.44/0.43 | 0.52/0.43 | 0.42/0.42 | 1.00/1.00 | 1.00/0.27 |

With the exception of problem g02, which has such a large number of local minima that none of the algorithms locates the globally optimal solution in even a single run, and problem g17, for which it achieves the highest success rates among all of the algorithms considered, the (1 $+$ 1)-AS-ES locates the globally optimal solutions in the majority of runs. Embedding it in a simple restart scheme would allow it to locate globally optimal solutions with overwhelming probability.
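The benefit of such a restart scheme can be quantified: if a single run locates the globally optimal solution with probability $p$, then $k$ independent restarts all fail with probability $(1-p)^k$. A short sketch (Python; the per-run success rate in the example is taken from Table 1):

```python
import math

def restarts_needed(p_success, p_target=0.999):
    """Smallest number of independent runs needed so that at least one of
    them locates the global optimum with probability p_target, assuming
    each run succeeds independently with probability p_success."""
    return math.ceil(math.log(1 - p_target) / math.log(1 - p_success))

# Per-run success rate 0.40, as observed on g17 for target accuracy 1e-4:
print(restarts_needed(0.40))  # 14
```

Even for the hardest problems solved at all, a modest number of restarts thus drives the overall failure probability below one in a thousand.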

Across all problems, the active-set evolution strategy locates globally optimal solutions in 89 percent of runs for target accuracy $ε=10^{-4}$ and in 88 percent of runs for target accuracy $ε=10^{-8}$. The corresponding rates for active-set are 57 percent and 56 percent, for sqp 63 percent and 62 percent, for interior-point 70 percent and 69 percent, for Skinny 64 percent and 63 percent, and for SACOBRA 60 percent and 38 percent. The figures for SACOBRA are negatively impacted by the limit on the number of objective function evaluations that is not shared by the other algorithms. However, the relatively low success rate of SACOBRA is likely also a result of SACOBRA not exploiting that the constraint functions are known a priori and instead relying on inexact surrogate models.

Comparing the numbers of objective function evaluations across successful runs, Table 1 shows lower or equal median values for the evolution strategy than for all of the other algorithms on all of those problems where the effective dimension $n_{\mathrm{eff}}$ of the reduced search space at the optimal solution is zero. Wilcoxon rank sum tests confirm that the null hypothesis of equal medians is rejected at the 5 percent significance level in all of those cases, with three exceptions. For problem g04, SACOBRA performs similarly to the evolution strategy at target accuracy $ε=10^{-4}$, but is significantly slower at target accuracy $ε=10^{-8}$. For problem g21, the null hypothesis is not rejected for sqp. However, while all of the runs of the evolution strategy succeed, only a single run of the sequential quadratic programming approach does. For problem g24, both active-set and sqp perform similarly to the evolution strategy in those runs where they succeed, but have much lower success rates. In contrast, the evolution strategy requires relatively larger numbers of function evaluations where $n_{\mathrm{eff}}$ is large; there, the deterministic approaches, which approximate gradients through finite differencing and are equipped with techniques for handling ill-conditioning, in some cases have advantages.

Figure 3 presents empirical cumulative running-time distributions for all algorithms across all test problems. Figures 4 and 5 show distributions for individual problems. All of them plot the fraction of objective function value targets reached against the computational budget. Targets are computed as described in Section 4.1. As above, distributions are based on 101 runs for each of the algorithms and assume constraint tolerance $δ=10^{-8}$. The lines depicting the distributions for SACOBRA end shortly after 500 objective function evaluations due to the constraint on the computational budget for that algorithm.
Figure 3:

Empirical cumulative running-time distributions over all problems.

Figure 4:

Empirical cumulative running-time distributions for problems g01 to g12.

Figure 5:

Empirical cumulative running-time distributions for problems g13 to g24.


The (1 $+$ 1)-AS-ES dominates the empirical cumulative running-time distributions of the other algorithms in Figure 3 for any computational budget. In contrast to the other algorithms, it generates feasible candidate solutions from the first iteration and as a result reaches about 4 percent of the targets with a single objective function evaluation. The evolution strategy eventually reaches almost 90 percent of targets. The interior-point fmincon algorithm attains 71 percent, Skinny 70 percent, the sequential quadratic programming approaches implemented in fmincon 65 percent and 59 percent, and SACOBRA 60 percent, albeit with a much more limited computational budget. For most computational budgets of up to about 200 objective function evaluations SACOBRA is the algorithm that attains the second largest number of targets.

Looking at the individual empirical cumulative running-time distributions in Figures 4 and 5, the (1 $+$ 1)-AS-ES dominates the other algorithms on twelve out of the 24 problems (g01, g06, g10, g13, g15, g16, g17, g20, g21, g22, g23, and g24). For none of these does the effective dimension $n_{\mathrm{eff}}$ of the reduced search space at the optimal solution exceed two. The evolution strategy reaches all targets for problems g04, g05, g07, g09, g11, g14, and g19, but algorithms from the MATLAB Optimization Toolbox reach the more stringent targets with fewer objective function evaluations. Skinny is the only algorithm to reach all targets on multimodal problems g08 and g18. SACOBRA intermittently dominates the empirical cumulative running-time distributions for problems g03, g07, g12, and g14. We interpret this dominance to be a result of the use of surrogate models of the objective function. Nearly half of the empirical cumulative running-time distributions of SACOBRA suggest that a significant number of further targets could be attained if the constraint on the number of objective function evaluations were relaxed.

In summary, if it is assumed that the constraints are explicit, then the problems from the CEC 2006 test set can be solved with vastly fewer objective function evaluations than were available in the context of the competition. The (1 $+$ 1)-AS-ES is successful primarily on those problems where the effective dimension of the reduced search space at the optimal solution is low. If there are many unconstrained search directions at the optimal solution, then the relatively slow convergence of the evolution strategy puts it at a disadvantage in comparison with algorithms that are able to exploit the smoothness of most of the test problems. A large majority of the failures to achieve target objective function values can be traced to the multimodality of the problems.

The primary goal of this article is to show the potential usefulness of active-set techniques for evolutionary black-box optimization with explicit constraints. While active-set approaches are commonly used in mathematical programming, they have not received widespread use in evolutionary computation. We show that a simple evolution strategy that uses active-set techniques and that can be implemented in under 200 lines of code can solve problems from the CEC 2006 test set with numbers of objective function evaluations comparable with those required by both sequential quadratic programming and interior-point methods, and that for some problems it appears to exhibit greater robustness. The evolution strategy is two orders of magnitude faster than the algorithms successful in the CEC competitions, which did not assume that constraint function evaluations have negligible cost compared with objective function evaluations. Embedding the (1 $+$ 1)-AS-ES in a simple restart scheme would allow solving all of the problems from the test set with the exception of g02 with high probability and with comparatively few objective function evaluations. There are no fundamental obstacles to using active-set techniques in more sophisticated evolution strategies, such as those that employ covariance matrix adaptation or surrogate modelling approaches. If constraints are not explicit, then algorithms such as SACOBRA, which use vastly fewer constraint function evaluations than the (1 $+$ 1)-AS-ES, need to be considered.

This research was supported by the Austrian Science Fund FWF through grant P29651-N32 and by the Natural Sciences and Engineering Research Council of Canada (NSERC) through grant RGPIN-2020-04833. We are grateful to the anonymous reviewers for their constructive comments on an earlier version of the manuscript.

1

Code implementing the algorithm is available at web.cs.dal.ca/~dirk/AS_ES.tar.

2

SACOBRA may perform multiple objective function evaluations between checking the termination criteria, resulting in runs that exceed the mandated maximum number of objective function evaluations. As a consequence, the corresponding empirical cumulative running-time distributions may extend past 500 evaluations, but can no longer be considered accurate after that point.

Arnold, D. V. (2013). Resampling versus repair in evolution strategies applied to a constrained linear problem. Evolutionary Computation, 21(3):389–411.
Arnold, D. V. (2016). An active-set evolution strategy for optimization with known constraints. In Parallel Problem Solving from Nature, pp. 192–202.
Arnold, D. V. (2017). Reconsidering constraint release for active-set evolution strategies. In Genetic and Evolutionary Computation Conference (GECCO), pp. 665–672.
Arnold, D. V., and Brauer, D. (2008). On the behaviour of the (1 + 1)-ES for a simple constrained problem. In Parallel Problem Solving from Nature, pp. 1–10.
Bagheri, S., Konen, W., and Bäck, T. (2016). Equality constraint handling for surrogate-assisted constrained optimization. In IEEE Congress on Evolutionary Computation, pp. 1924–1931.
Bagheri, S., Konen, W., Emmerich, M., and Bäck, T. (2017). Self-adjusting parameter control for surrogate-assisted constrained optimization under limited budgets. Applied Soft Computing, 61:377–393.
Floudas, C., and Pardalos, P. (1987). A collection of test problems for constrained global optimization algorithms. Berlin: Springer Verlag.
Hansen, N., Arnold, D. V., and Auger, A. (2015). Evolution strategies. In J. Kacprzyk and W. Pedrycz (Eds.), Springer handbook of computational intelligence, pp. 871–898. Berlin: Springer Verlag.
Hansen, N., Auger, A., Brockhoff, D., Tušar, D., and Tušar, T. (2016). COCO: Performance assessment.
Hansen, N., Auger, A., Ros, R., Mersmann, O., Tušar, T., and Brockhoff, D. (2021). COCO: A platform for comparing continuous optimizers in a black-box setting. Optimization Methods and Software, 36(1):114–144.
Hansen, N., Niederberger, A. S. P., Guzzella, L., and Koumoutsakos, P. (2009). A method for handling uncertainty in evolutionary optimization with an application to feedback control of combustion. IEEE Transactions on Evolutionary Computation, 13:180–197.
Hansen, N., and Ostermeier, A. (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195.
Hellwig, M., and Arnold, D. V. (2016). Comparison of constraint handling mechanisms for the (1, λ)-ES on a simple constrained problem. Evolutionary Computation, 24(1):1–23.
Hellwig, M., and Beyer, H.-G. (2018). A matrix adaptation evolution strategy for constrained real-parameter optimization. In IEEE Congress on Evolutionary Computation, pp. 1–8.
Hock, W., and Schittkowski, K. (1981). Test examples for nonlinear programming codes. Berlin: Springer Verlag.
Kern, S., Müller, S. D., Hansen, N., Büche, D., Ocenasek, J., and Koumoutsakos, P. (2004). Learning probability distributions in continuous evolutionary algorithms—A comparative review. Natural Computing, 3(1):77–112.
Kumar, A., Wu, G., Ali, M. Z., Mallipeddi, R., Suganthan, P. N., and Das, S. (2020). A test-suite of non-convex constrained optimization problems from the real-world and some baseline results. Swarm and Evolutionary Computation, 56:100693.
Le Digabel, S., and Wild, S. M. (2015). A taxonomy of constraints in simulation-based optimization.
Li, G., and Zhang, Q. (2021). Multiple penalties and multiple local surrogates for expensive constrained optimization. IEEE Transactions on Evolutionary Computation, 25(4):769–778.
Liang, J. J., Runarsson, T. P., Mezura-Montes, E., Clerc, M., Suganthan, P. N., Coello, C. A. C., and Deb, K. (2006). Problem definitions and evaluation criteria for the CEC 2006 Special Session on Constrained Real-Parameter Optimization. Technical Report, Nanyang Technological University, Singapore.
Mallipeddi, R., and Suganthan, P. N. (2010). Problem definitions and evaluation criteria for the CEC 2010 Competition on Constrained Real-Parameter Optimization. Technical Report, Nanyang Technological University, Singapore.
Martínez, J. M., and Sobral, F. N. C. (2013). Constrained derivative-free optimization on thin domains. Journal of Global Optimization, 56(3):1217–1232.
Michalewicz, Z., and Schoenauer, M. (1996). Evolutionary algorithms for constrained parameter optimization problems. Evolutionary Computation, 4(1):1–32.
Nocedal, J., and Wright, S. (2006). Numerical optimization, 2nd ed. Berlin: Springer Verlag.
Ostermeier, A., Gawelczyk, A., and Hansen, N. (1994). Step-size adaptation based on non-local use of selection information. In Parallel Problem Solving from Nature, pp. 189–198.
Rechenberg, I. (1973). Evolutionsstrategie—Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart: Friedrich Frommann Verlag.
Regis, R. G. (2014). Constrained optimization by radial basis function interpolation for high-dimensional expensive black-box problems with infeasible initial points. Engineering Optimization, 46(2):218–243.
Regis, R. G. (2018). Surrogate-assisted particle swarm optimization with local search for expensive constrained optimization. In Bioinspired Optimization Methods and Their Applications, pp. 246–257.
Sakamoto, N., and Akimoto, Y. (2019). Adaptive ranking based constraint handling for explicitly constrained black-box optimization. In Genetic and Evolutionary Computation Conference (GECCO), pp. 700–708.
Schwefel, H.-P. (1975). Evolutionsstrategie und numerische Optimierung. PhD thesis, Technical University of Berlin, Department of Process Engineering.
Spettel, P., and Beyer, H.-G. (2020). Matrix adaptation evolution strategies for optimization under nonlinear equality constraints. Swarm and Evolutionary Computation, 54:100653.
Spettel, P., Beyer, H.-G., and Hellwig, M. (2019). A covariance matrix self-adaptation evolution strategy for linear constrained optimization. IEEE Transactions on Evolutionary Computation, 23(3):514–524.
Takahama, T., and Sakai, S. (2006). Constrained optimization by the ε constrained differential evolution with gradient-based mutation and feasible elites. In IEEE World Congress on Computational Intelligence, pp. 308–315.
Takahama, T., and Sakai, S. (2010a). Constrained optimization by the ε constrained differential evolution with an archive and gradient-based mutation. In IEEE Congress on Evolutionary Computation, pp. 1680–1688.
Takahama, T., and Sakai, S. (2010b). Efficient constrained optimization by the ε constrained adaptive differential evolution. In IEEE Congress on Evolutionary Computation, pp. 2052–2059.
Wang, Y., Yin, D.-Q., Yang, S., and Sun, G. (2019). Global and local surrogate-assisted differential evolution for expensive constrained optimization problems with inequality constraints. IEEE Transactions on Cybernetics, 49(5):1642–1656.
Wu, G., Mallipeddi, R., and Suganthan, P. N. (2016). Problem definitions and evaluation criteria for the CEC 2017 Competition and Special Session on Constrained Single Objective Real-Parameter Optimization. Technical Report, Nanyang Technological University, Singapore.

The algorithm described in Section 3 has evolved from and combines aspects of two prototypical variants proposed by Arnold (2016, 2017). All variants share the same fundamental design: an active-set approach implemented in a simple (1 $+$ 1)-ES. All variants project candidate solutions generated through mutation onto the feasible region before evaluating them. Constraints are added to the working set if a new best candidate solution has newly tight constraints. Constraints are periodically considered for removal from the working set in order to prevent convergence to solutions in reduced subspaces that do not contain optimal solutions.

The most significant difference between the variants is the mechanism for considering constraints for removal from the working set. That mechanism is highly important, as failure to remove constraints from the working set may result in convergence to non-optimal solutions. On the other hand, considering constraints for removal from the working set too frequently results in computational overhead, as each attempt incurs an objective function evaluation. The strategy described by Arnold (2016) suspends the use of the entire working set in an attempt to release constraints, either if the effective dimension of the reduced search space is zero or otherwise with a fixed probability. That simple approach allows effectively solving most of the test problems from g01 to g11, but results in occasional failure to converge to the optimal solution on problem g07. Suboptimal performance ensues when a constraint from the working set needs to be removed and the working set is relatively large. In that case, the chances of generating successful offspring by suspending the use of the entire working set are small, and many iterations may be needed to attain the optimal working set. On problem g07, numerical issues can sometimes be observed to arise before the constraint can be removed, resulting in failure to locate the optimal solution.

Arnold (2017) in addition considers individual constraints for release. Whenever the step size parameter has decreased by a certain factor or the effective dimension of the reduced search space is zero, the algorithm loops through all of the constraints in the working set and releases them one by one. If the effective dimension of the reduced search space is zero, then the step size parameter is reduced rapidly. The modifications not only address the issues observed with the earlier variant on problem g07 but can also be observed to result in significantly improved performance on some linearly constrained sphere functions. Advantages of the latter algorithm variant are most pronounced on problems where constraints need to be removed from relatively large working sets. However, the algorithm can be observed to behave suboptimally on several problems of the CEC 2006 test set. First, on multimodal and severely constrained problems, such as g01, the rapid reduction of the step size if the effective dimension of the reduced search space is zero further impairs the (1 $+$ 1)-ES's limited capacity for global search. And second, the rule used for triggering a review of all of the constraints present in the working set can be observed to result in unnecessary overhead for problems where there are many active constraints and the dimension of the search space is low.

The algorithm described in Section 3 only attempts to remove single constraints and regains the simplicity of the original variant. The conditions that trigger a constraint to be considered for release are those from Arnold (2016), and no rapid decrease of the step size parameter is needed. A further significant change in the algorithm is the reduction of the damping constant used in the adaptation of the step size from a value proportional to the effective dimension of the reduced search space to a value proportional to its square root. The value adopted in Section 3 is in line with the setting proposed by Hansen et al. (2015) and results in the ability to converge faster on those problems where the effective dimension of the reduced search space at the optimal solution is relatively large.
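The role of the damping constant can be illustrated with a success-based step size update; the functional form and the constant `c` below are illustrative assumptions, not the exact rule used by the algorithm:

```python
import math

def update_step_size(sigma, success, n_eff, c=0.5):
    """Illustrative sketch of success-based step size adaptation with a
    damping constant proportional to sqrt(n_eff). The 1/5th-rule-like
    update form and the constant c are assumptions for illustration."""
    d = 1.0 + c * math.sqrt(max(n_eff, 1))  # damping grows like sqrt(n_eff)
    factor = math.exp(1.0 / d) if success else math.exp(-0.25 / d)
    return sigma * factor
```

With damping proportional to $\sqrt{n_{\mathrm{eff}}}$ rather than $n_{\mathrm{eff}}$, the step size reacts more quickly to selection information when the reduced search space is high-dimensional, which is what enables the faster convergence noted above.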

Finally, some of the problems from g12 to g24 pose difficulties not present in the smaller test set considered in the earlier papers and may result in numerical issues when attempting to project points onto the reduced search space. These issues were not considered by Arnold (2016, 2017) and are addressed through the measures described in Section 3.2. As a result of the sum of these changes, the algorithm described in Section 3 can be used to solve all of the problems from the CEC 2006 test set with the exception of g02. Moreover, it can be observed to require smaller median numbers of function evaluations across successful runs than both preliminary variants on all but three of the problems from g01 to g11 (Wilcoxon rank sum test at the 5% significance level).

Liang et al. (2006) list 24 test problems for constrained nonlinear optimization that they have gathered from earlier sources and labelled from g01 to g24. In order to conform with the definition of constrained optimization problems in Section 1 and to ensure that optimal solutions exist, we require that for all problems the feasible region is a closed set and that the objective function is well defined everywhere on the feasible region. This necessitates changes to two problems:

• Problem g14 has lower bound constraints $0 < x_i$ for $i=1,2,\ldots,10$. We replace those constraints with constraints $0 \le x_i$ for $i=1,2,\ldots,10$. This change results in the objective function
$f(x)=\sum_{i=1}^{10} x_i\left(c_i+\ln\frac{x_i}{\sum_{j=1}^{10}x_j}\right)$
being undefined where one or more lower bound constraints are active. We continue the objective function to the constraint boundary by defining $0\ln 0=0$ (i.e., the contribution of the $i$th term to the sum is zero if the lower bound constraint is tight). None of the bound constraints is tight at the optimal solution.
• Problem g20 has constraints
$h_i(x)=\frac{x_{i+12}}{b_{i+12}\sum_{j=13}^{24}x_j/b_j}-\frac{c_i x_i}{40\,b_i\sum_{j=1}^{12}x_j/b_j}=0\quad\text{for } i=1,2,\ldots,12.$
Those constraint functions are undefined if $x_1=x_2=\cdots=x_{12}=0$ or $x_{13}=x_{14}=\cdots=x_{24}=0$. In order to ensure that the feasible region is closed, we replace the constraints with constraints
$h_i(x)=40\,b_i\,x_{i+12}\sum_{j=1}^{12}\frac{x_j}{b_j}-c_i\,x_i\,b_{i+12}\sum_{j=13}^{24}\frac{x_j}{b_j}=0\quad\text{for } i=1,2,\ldots,12.$
For zero constraint tolerance, points that are feasible or infeasible according to the original definition retain their feasibility or infeasibility.
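Both modifications can be made concrete in code. The sketch below takes only the structure of the functions from the problem definitions; the data vectors `c` and `b` are passed in as parameters, and the specific values used in the checks are generic placeholders rather than the problems' actual coefficients:

```python
import math


def g14_objective(x, c):
    """Objective of the modified problem g14, continued to the boundary
    via the convention 0*ln(0) = 0: a term whose lower bound constraint
    x_i >= 0 is tight contributes nothing to the sum."""
    s = sum(x)
    return sum(xi * (ci + math.log(xi / s))
               for xi, ci in zip(x, c) if xi > 0.0)


def g20_h_original(i, x, b, c):
    """Original equality constraint h_i of problem g20 (i in 0..11);
    undefined when either denominator sum vanishes."""
    s1 = sum(x[j] / b[j] for j in range(12))       # sum over x_1..x_12
    s2 = sum(x[j] / b[j] for j in range(12, 24))   # sum over x_13..x_24
    return x[i + 12] / (b[i + 12] * s2) - c[i] * x[i] / (40.0 * b[i] * s1)


def g20_h_modified(i, x, b, c):
    """Reformulated constraint with the denominators cleared; it is
    defined everywhere and equals the original constraint scaled by the
    positive factor 40*b_i*b_{i+12}*s1*s2 wherever the latter is defined,
    so at zero constraint tolerance both have the same zero set."""
    s1 = sum(x[j] / b[j] for j in range(12))
    s2 = sum(x[j] / b[j] for j in range(12, 24))
    return 40.0 * b[i] * x[i + 12] * s1 - c[i] * x[i] * b[i + 12] * s2
```

The scaling factor is strictly positive for feasible points with positive denominator sums, which is why feasibility and infeasibility are preserved by the reformulation.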

Some of the remaining problems have objective or constraint functions that are undefined at some points outside of the bound-constrained region. Others have constraint normals that diverge as the boundary of the feasible region is approached. We have not made any further modifications to address those issues and leave it to the algorithms to deal with them.
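To illustrate the second difficulty, consider a toy inequality constraint (not one of the CEC 2006 problems) whose normal grows without bound as the boundary of the feasible region is approached:

```python
import math


def toy_constraint(x1, x2):
    """Toy inequality constraint g(x) = x1 - sqrt(x2) <= 0, defined for
    x2 >= 0; purely illustrative, not taken from the test set."""
    return x1 - math.sqrt(x2)


def toy_constraint_gradient(x1, x2):
    """Gradient of g: the x2 component -1/(2*sqrt(x2)) diverges as
    x2 -> 0+, so the constraint normal blows up near that boundary."""
    return (1.0, -0.5 / math.sqrt(x2))


def gradient_norm(x1, x2):
    gx1, gx2 = toy_constraint_gradient(x1, x2)
    return math.hypot(gx1, gx2)
```

Algorithms that normalize or project along constraint normals can become numerically unstable near such boundaries, which is the kind of difficulty left to the algorithms here.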

Table 2 summarizes characteristics of the test problems, listing the dimension $n$ of the search space, the number $m$ of equality constraints, the number $l^*$ of inequality constraints active at the optimal solutions, the effective dimension $n^*_{\mathrm{eff}}$ of the reduced search space at those solutions, and optimal objective function values $f^*$. Problem g18 has multiple optimal solutions with different values of $l^*$ and $n^*_{\mathrm{eff}}$. The optimal objective function values are given to twelve significant digits and have been obtained analytically where possible, and otherwise by using sequential quadratic programming with a constraint tolerance of $\delta=10^{-15}$ started in close proximity to the known optimal solutions. Some of the optimal function values differ by small amounts from those reported by Liang et al. (2006), presumably because the latter were obtained with a higher constraint tolerance. Our values are stable in that increasing the constraint tolerance by a factor of ten has no impact on the values reported in Table 2. Larger deviations between our figures and those reported by Liang et al. (2006) can be observed for two problems. For problem g20, Liang et al. (2006) state that no feasible solution to the problem has been found. Ensuring that the feasible region is a closed set as detailed above results in there being a feasible solution with the objective function value indicated in the table. For problem g22, the accurate solution to the problem is better than the best solution found by Liang et al. (2006).

Table 2:

Properties of the CEC 2006 test problems.

$n$  $m$  $l^*$  $n^*_{\mathrm{eff}}$  $f^*$
g01 13 17 $-$15
g02 20 19 $-$0.803619104126
g03 10 $-$
g04 $-$30665.5386718
g05 5126.49810960
g06 $-$6961.81387558
g07 10 24.3062090682
g08 $-$0.0958250414180
g09 680.630057374
g10 7049.24802053
g11 0.75
g12 $-$
g13 0.0539498477703
g14 10 $-$47.7610908594
g15 961.715172130
g16 $-$1.90515525853
g17 8853.53989133
g18 8 to 11 0 to 2 $-$0.866025403784
g19 15 13 32.6555929502
g20 24 14 28 0.147466071547
g21 193.786925260
g22 22 19 236.370313315
g23 $-$400
g24 $-$5.50801327160

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode