## Abstract

We analyze the unrestricted black-box complexity of the Jump function classes for different jump sizes. For upper bounds, we present three algorithms for small, medium, and extreme jump sizes. We prove a matrix lower bound theorem which is capable of giving better lower bounds than the classic information theory approach. Using this theorem, we prove lower bounds that almost match the upper bounds. For the case of extreme jump functions, which apart from the optimum reveal only the middle fitness value(s), we use an additional lower bound argument to show that any black-box algorithm does not gain significant insight about the problem instance from the first fitness evaluations. This, together with our upper bound, shows that the black-box complexity of extreme jump functions is .

## 1  Introduction

To understand how evolutionary algorithms (and other black-box optimizers as well) behave when optimizing certain functions, one proves upper bounds for the problem’s difficulty (by constructing and studying various algorithms) and lower bounds (by studying how fast an algorithm can be in principle), which complement each other. Comparing these bounds helps researchers to evaluate how good today’s heuristics are and sometimes to construct better algorithms (Doerr et al., 2015).

In this work, we study the second question, that is, how fast in principle a black-box optimization algorithm can solve certain optimization problems. Droste et al. (2003) were the first to ask this question in the context of evolutionary algorithms. In their seminal paper (see also Droste et al. (2006) for the journal version), they introduced the notion of black-box complexity as a measure of problem difficulty. In simple words, the black-box complexity of an optimization problem is the (expected) number of function evaluations that are needed by an optimal black-box algorithm until it queries an optimum for the first time. As many randomized search heuristics such as evolutionary algorithms, ant colony optimization, or simulated annealing are black-box optimizers, the black-box complexity of a problem gives a lower bound on the performance of all these search heuristics.

While dormant for several years, the area of black-box complexity became very active from 2009 on, possibly spurred by the remarkable works of Anil and Wiegand (2009) as well as of Lehre and Witt (2010). Since then, several deep and surprising results on black-box complexities were achieved, so that now we reasonably well understand the black-box complexities of classic test functions, for example, for OneMax (Droste et al., 2006; Anil and Wiegand, 2009) or for LeadingOnes (Afshani et al., 2013) and of several combinatorial optimization problems like sorting, maximum clique, and the single-source shortest path problem (Droste et al., 2006), the minimum spanning tree problem (Doerr et al., 2013) and the partition problem (Doerr et al., 2014b). It was also observed that modified definitions of black-box complexity are able to study the influence of unbiasedness (Lehre and Witt, 2012; Rowe and Vose, 2011), ranking-basedness (Doerr and Winzen, 2014b), memory size (Doerr and Winzen, 2014a), parallel search (Badkobeh et al., 2014), or elitism (Doerr and Lengler, 2015).

In this article, we stay within the realm of classic black-box complexity of pseudo-Boolean functions; that is, we ask how many fitness evaluations an otherwise unrestricted algorithm needs to perform to find the optimum of a function (given in a black-box fashion) from a given problem class. With the OneMax test function class and the LeadingOnes class being studied, we turn to another important Jump test function class. These are test functions used as examples with scalable difficulty, because the fitness landscape has a large plateau of low fitness around the optimum. For a jump function with the jump size , this plateau consists of all search points with the Hamming distance from the optimum between 1 and . See Section 2 for a precise definition. Our motivation is both understanding the black-box complexity of this well-studied function class and using it as a trigger to develop a new method, in particular, to prove lower bounds for black-box complexities, where at the moment not much is known beyond the information theoretic argument of Droste et al. (2006).

Concerning the black-box complexities of jump functions, we observe that while jump functions tend to be difficult for many randomized search heuristics, their black-box complexity is not excessively large (for the restricted notion of unbiased black-box complexity, weaker and less precise results pointing in the same direction have been obtained in Doerr et al. (2014a); see Section 2). We show that when the jump parameter satisfies , the black-box complexity satisfies the same upper bound of that is the best known bound for the easy OneMax test function class. Note that is actually quite large, meaning that all search points with distance between 1 and from the optimum lie on the plateau of low fitness, making this a plateau of size and diameter . For even larger jump sizes, that is, , we show an upper bound of , where the asymptotic notation refers to tending to infinity. This upper bound does not make precise the leading constant when is constant, in particular, not for the extreme case when the jump function has all fitness levels on the plateau except for the “middle level” (for even n) or except for the two middle levels and (for odd n). For such extreme jump functions (and thus also for all others), we show an upper bound of .

These upper bounds are asymptotically of the right order of magnitude. This follows from the information theoretic argument (Theorem 4) in Droste et al. (2006): an optimization problem over a search space S such that each element of S is the unique solution to an instance of the problem has a black-box complexity of at least , where k is the maximum number of different answers a query can have. For jump functions with the jump size , this gives a lower bound of with the asymptotics being with respect to tending to infinity. Consequently, when , then our upper bounds and this lower bound are identical up to the leading constant. For smaller , they differ by at most a factor of . This is the same gap that exists for the black-box complexity of , which is an open problem pointed out (in a different context) already fifty years ago in the famous paper by Erdős and Rényi (1963). For constant-size values of , we also see a substantial gap between our upper bounds and the information theoretic lower bound. For example, in the case of an extreme jump function for even n, we have the three different fitness values , and n as possible answers to queries, and hence the lower bound is . For odd n, we have the four fitness values , and n, giving a lower bound of only. In both cases, our upper bound is and thus quite far away.

The reason is that the information theoretic argument pretends that at all times, all k answers may occur, and moreover, occur with similar frequency. This is clearly an overly optimistic view, as the following three examples show:

1. Once an optimal solution is found, the search stops. Hence such nodes of the decision tree have no children, even though some of them appear close to the root.

2. In an optimization problem with each search point being the unique optimum of exactly one instance, typically the different answers to a query do not occur with similar frequency as the optimal answer occurs at most once.

3. Correlations between the answers may also lead to a smaller information gain than assumed by the information theoretic lower bound. For simplicity, let us regard the OneMax problem, but the same effect exists for jump functions, most notably (as we will see) for extreme jump functions when n is odd. When the answer to the first query is received, we can predict the parity of answers for all subsequent queries, knowing only the parity of the number of one-bits in a query. Thus, when analyzing algorithms for solving this problem, we may safely assume that for every query (except for the first one), once the parity of one-bits is known, the number of different answers is reduced from $n + 1$ to at most $\lceil (n+1)/2 \rceil$.
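The parity effect described in the third example can be checked numerically. The following is a minimal sketch, assuming that a OneMax instance with hidden string `z` returns the number of positions in which the query agrees with `z`; the names `om_z` and `parity_offset` are illustrative and do not come from the original text.

```python
import itertools


def om_z(x, z):
    """Number of positions in which the query x agrees with the hidden string z."""
    return sum(xi == zi for xi, zi in zip(x, z))


def parity_offset(first_query, first_answer):
    """Value c such that every answer has the parity of (number of ones in x) + c."""
    return (first_answer - sum(first_query)) % 2


n = 5
z = (1, 0, 1, 1, 0)          # hidden optimum, unknown to the algorithm
first = (0,) * n             # an arbitrary first query
c = parity_offset(first, om_z(first, z))

# After the first answer, the parity of every later answer is determined
# by the parity of the number of one-bits in the query alone.
for x in itertools.product((0, 1), repeat=n):
    assert om_z(x, z) % 2 == (sum(x) + c) % 2
```

The check works because the Hamming distance between `x` and `z` has the same parity as the sum of their numbers of one-bits, so a single answer fixes the parity contribution of `z`.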

We significantly extend the information theoretic bound to allow taking care of such reasons for a smaller information gain, that is, to exploit these to prove stronger lower bounds. Our matrix lower bound theorem gives improved lower bounds for all black-box complexities of jump functions. In particular, for extreme jump functions, we raise the lower bounds from to when n is even and from to when n is odd. Note that the larger number of fitness values for n odd does now have a much smaller influence on the result than when using the classic information theoretic bound. While these results do determine the leading constant, we could not prove with our general lower bound theorem that the term in the upper bound is necessary. For this, we add an extra argument that shows that, as our upper bound suggests, indeed when optimizing an extreme jump function, the first fitness evaluations typically reduce the size of the solution space only by a constant factor. This argument together with the matrix lower bound proves that the black-box complexity of extreme jump function is indeed .

The rest of the article is structured as follows. In Section 2, we make precise the definitions of black-box complexity and jump functions. We also summarize the state of the art concerning optimization and complexity of jump functions. Section 3 is dedicated to upper bounds on Jump which are proven by giving the corresponding algorithms and discussing their complexity. In Section 4, the matrix lower bound theorem is formulated and proven. It resembles Theorem 4 from Droste et al. (2006), but is able to give better lower bounds under certain conditions. We apply this theorem in Section 5 to the Jump problem. Section 6 describes the refinement of the lower bound in the case of extreme jump sizes. Section 7 concludes the article, among others, with some open problems.

A preliminary version of a larger part of these results has appeared as a conference paper of Buzdalov et al. (2015). Apart from making a more detailed introduction, which was impossible in the conference paper format, we have refined the lower bound for extreme jump sizes up to to match the upper bound (see Section 6), which seems to be impossible by a straightforward application of the matrix lower bound theorem and required substantial additional effort.

## 2  Preliminaries

In this section, we define the notion of unrestricted black-box complexity, we make precise the definition of the jump functions we regard, and we review the existing results in runtime analysis and black-box complexity for jump functions.

### 2.1  Unrestricted Black-Box Complexity

The notion of black-box complexity was introduced by Droste et al. (2003) to investigate how difficult it is to solve a problem via general-purpose randomized search heuristics like evolutionary algorithms. In simple words, this black-box complexity (now sometimes called unrestricted black-box complexity to distinguish it from the more restricted variants developed later) is the number of fitness evaluations needed to solve a problem. Let us make this notion precise.

By a problem, we shall always mean a set of pseudo-Boolean functions (problem instances), with the implicit meaning that the task is to find a global maximum of such a function. A black-box algorithm A for is a (possibly randomized) algorithm that takes as input a function and tries to maximize it in a black-box fashion. This means that the algorithm has no access to an explicit description of f, but can only evaluate f at any search point . In this unrestricted black-box optimization model, we do not make any other limiting assumption on the computational power of the algorithm. In particular, the algorithm may store all previously regarded search points and their fitnesses and may conduct arbitrary computations with these.

As a performance measure, as is common for evolutionary algorithms, we take the number of fitness evaluations performed by the algorithm. More precisely, the cost T(A, f) of running A on f is the expected number of function evaluations performed until a global maximum of f is evaluated for the first time (this evaluation included). We take a worst-case view with respect to the problem instances and define the complexity of A on a problem $\mathcal{F}$ by

$$T(A, \mathcal{F}) = \sup_{f \in \mathcal{F}} T(A, f).$$

The black-box complexity of the problem $\mathcal{F}$, as in classic algorithmics, is the performance of the best possible algorithm for the problem, that is,

$$\mathrm{BBC}(\mathcal{F}) = \inf_{A} T(A, \mathcal{F}),$$

where A runs over all black-box algorithms for $\mathcal{F}$.

Let us give a brief example to illustrate these definitions. The most classic test function in evolutionary computation is the so-called OneMax function $\mathrm{OM}(x) = \sum_{i=1}^{n} x_i$, which simply counts the number of one-bits in x. To make this one instance a reasonable problem, in particular, to prevent discussing algorithms exploiting the obvious fact that $\mathrm{OM}$ has $(1, \dots, 1)$ as the unique optimum, we regard the problem consisting of all instances with fitness landscape isomorphic to $\mathrm{OM}$. These are all functions $\mathrm{OM}_z$, $z \in \{0,1\}^n$, which are defined by

$$\mathrm{OM}_z(x) = |\{ i \in \{1, \dots, n\} : x_i = z_i \}|, \tag{1}$$

that is, $\mathrm{OM}_z(x)$ counts the number of bit positions in which x and z agree. Again, it is obvious that z is the unique optimum of $\mathrm{OM}_z$; however, this is nontrivial to find out for an algorithm having no access to an explicit representation of $\mathrm{OM}_z$ such as Equation (1).

Let $\mathrm{OneMax}_n$ be the set of all $\mathrm{OM}_z$, $z \in \{0,1\}^n$. Let A be the randomized local search heuristic (start with a random search point and then repeat flipping one bit, evaluating the new search point, and keeping the better of parent and offspring). It is an easy exercise to show that A finds the optimum of any function $\mathrm{OM}_z$ in an expected number of $O(n \log n)$ fitness evaluations. Hence $T(A, \mathrm{OM}_z) = O(n \log n)$ for all z. Consequently, $T(A, \mathrm{OneMax}_n) = O(n \log n)$ and thus $\mathrm{BBC}(\mathrm{OneMax}_n) = O(n \log n)$. It is much harder to prove that there is a better black-box algorithm A for $\mathrm{OneMax}_n$ that has a complexity of $O(n / \log n)$, a result shown in the context of information theory (Erdős and Rényi, 1963), the context of the Mastermind guessing game (Chvátal, 1983), and in the evolutionary computation community (Anil and Wiegand, 2009). The first and second references as well as Droste et al. (2006) also prove that for any black-box algorithm A for $\mathrm{OneMax}_n$, there is a function $f \in \mathrm{OneMax}_n$ such that $T(A, f) = \Omega(n / \log n)$. Consequently, $\mathrm{BBC}(\mathrm{OneMax}_n) = \Theta(n / \log n)$.
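The randomized local search example can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `om_z` and `rls` are hypothetical, the algorithm is assumed to know the optimal fitness value n as a stopping criterion, and ties are accepted (which never occurs on OneMax instances, since a one-bit flip always changes the fitness).

```python
import random


def om_z(x, z):
    """Number of positions in which the query x agrees with the hidden string z."""
    return sum(xi == zi for xi, zi in zip(x, z))


def rls(f, n, rng):
    """Randomized local search; returns the optimum found and the number of evaluations."""
    x = [rng.randrange(2) for _ in range(n)]
    fx, evals = f(x), 1
    while fx < n:                    # assumption: the optimal value n is known
        y = x.copy()
        y[rng.randrange(n)] ^= 1     # flip one uniformly chosen bit
        fy = f(y)
        evals += 1
        if fy >= fx:                 # keep the better of parent and offspring
            x, fx = y, fy
    return x, evals


rng = random.Random(1)
z = [rng.randrange(2) for _ in range(20)]
best, evals = rls(lambda q: om_z(q, z), len(z), rng)
```

The counter `evals` includes the evaluation that first hits the optimum, matching the cost measure defined above. The function `rls` only calls `f` and never inspects `z`, which is the black-box restriction.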

It is a matter of ongoing discussion whether this is the best definition of problem difficulty for randomized search heuristics in evolutionary computation, and several alternative definitions have been proposed; none of them, however, could clearly be shown to be more appropriate. We refer the reader to the works cited in the introduction or the tutorial (Doerr and Doerr, 2014).

### 2.2  Jump Functions

The Jump function is another popular test function. While OneMax is used to analyze how evolutionary algorithms cope with easy optimization problems, Jump is intended to be a difficult function where elitist optimization needs to flip several bits at once. There are several similar definitions of jump functions. They all have in common that they are a OneMax function modified by giving all search points within a certain radius from the optimum a low fitness. Consequently, typical hill-climbers find it easy to come close to the optimum (namely up to a search point with Hamming distance from the optimum), but then have to struggle to jump over the valley of low fitness to the optimum.

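For black-box complexity analyses, the following definition of jump functions was suggested in Doerr, Kötzing et al. (2011). For $0 \le \ell < n/2$ and $z \in \{0,1\}^n$, let

$$\mathrm{Jump}_{n,\ell,z}(x) = \begin{cases} n, & \text{if } \mathrm{OM}_z(x) = n, \\ \mathrm{OM}_z(x), & \text{if } \ell < \mathrm{OM}_z(x) < n - \ell, \\ 0, & \text{otherwise,} \end{cases}$$

for all $x \in \{0,1\}^n$. We define the class $\mathrm{Jump}_{n,\ell}$ to consist of all functions $\mathrm{Jump}_{n,\ell,z}$ with $z \in \{0,1\}^n$. The special case of $\ell = \lfloor n/2 \rfloor - 1$, which is the maximum possible $\ell$ that does not zero out the middle fitness values (zeroing them out would give the so-called Needle function), is called extremeJump.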
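The definition can be made concrete with a short sketch. It assumes that a jump function returns n at the optimum, the OneMax value on the visible inner levels, and 0 on all blanked-out levels; the parameter name `ell` for the jump size and the helper names are illustrative assumptions.

```python
import itertools


def om_z(x, z):
    """Number of positions in which the query x agrees with the hidden string z."""
    return sum(xi == zi for xi, zi in zip(x, z))


def jump(x, z, ell):
    """Jump function with hidden optimum z and jump size ell (assumed form)."""
    n = len(z)
    v = om_z(x, z)
    if v == n or ell < v < n - ell:
        return v              # optimum or a visible inner fitness level
    return 0                  # blanked-out level at the low or the high end


def extreme_ell(n):
    """Largest jump size that keeps the middle fitness level(s) visible."""
    return n // 2 - 1


def value_set(n, ell):
    """All fitness values that occur for problem size n and jump size ell."""
    z = (1,) * n
    return {jump(x, z, ell) for x in itertools.product((0, 1), repeat=n)}
```

Under these assumptions, `value_set(6, extreme_ell(6))` contains three values and `value_set(5, extreme_ell(5))` contains four, in line with the counts of possible answers for even and odd n mentioned in the introduction.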

Our definition of the jump functions, which is also used in Doerr et al. (2014a, 2016) not only has the highest suboptimal fitness levels of the function blanked out, but also the fitness levels . This is to avoid the trivial solution that a black-box algorithm first minimizes the jump function with a performance essentially equal to optimizing and, once a solution x with fitness 0 is found, inverts x and (with high probability, as the algorithm might have started in a blanked-out region with nonzero probability) finds the optimum. A very similar definition for jump functions, also in the context of black-box complexity, was used in Lehre and Witt (2010). The only difference there is that for a jump value of , the authors blank out only the highest nonoptimal fitness values (but on the lower end, the fitness values , that is, levels, are blanked out). For contexts outside of complexity theory, usually one does not care about the fitness levels at the low end, because a heuristic building on the hill-climbing paradigm will rarely encounter these levels, and if so, not profit from having the exact fitness available. Such a definition was used, for example, in Droste et al. (2002). Recently, Jansen (2015) introduced yet another type of jump function that, roughly speaking, agrees with the definition of Droste et al. (2002) except that the optimum is located at an unknown point in the plateau (or deceptive region) that is a Hamming ball of radius . Thus this function class contains nonisomorphic fitness landscapes (whereas the previous classes contain just a single isomorphism type). This function class has a black-box complexity of order when is constant, very different from the previously discussed classes. 
It may be closer to what is the original intuition of the jump functions; however, several results other than black-box complexity results seem to fail for this class as well, for example, the proof that crossover can greatly speed up the optimization of jump functions (Jansen and Wegener, 2002).

### 2.3  Runtime Analysis and Black-Box Complexity of Jump Functions

Jump functions are generally difficult to optimize with classic randomized search heuristics. This was first shown by Droste et al. (2002) for their definition of jump functions and the (1 + 1) evolutionary algorithm, but it is quite easy to see that also many other classic randomized search heuristics have an optimization time of for any jump function definition that has the highest suboptimal fitness levels on reset to low fitness values. Jansen and Wegener (2002) showed that a steady state genetic algorithm using uniform crossover (however, only with rate ) can optimize the jump functions defined in Droste et al. (2002) in time when is constant. It is easy to see that this result also holds for all other definitions of jump functions discussed previously, with the exception of the definition of Jansen (2015). Here, in fact, his proof that the black-box complexity is implies that the use of crossover cannot give similar advantages as for the other types of jump functions.

While there are some black-box complexity analyses for jump functions, surprisingly none of them uses the most classic unrestricted black-box complexity model; in fact, they all use the unbiased model introduced in Lehre and Witt (2012). Here, only the restricted class of unbiased black-box algorithms is regarded. While the precise definition is nontrivial, roughly speaking, a black-box algorithm is called unbiased if it satisfies three properties:

1. Each search point must be chosen uniformly at random or must be created from previous search points via a variation operator.

2. All variation operators must be unbiased, that is, treat the bit-values 0 and 1 in a symmetric way and treat the bit positions in a symmetric way.

3. All actions of the algorithm apart from what happens inside a variation operator may depend only on the search history and the fitness values observed, but not on the bit-string representation of the individuals.

This algorithm model includes many classic randomized search heuristics, except, for example, those using one-point crossovers.

For this unbiased black-box complexity model, Doerr et al. (2014a, 2016) show the following results. If , an arbitrary constant, then the black-box complexity is . In the case of extreme jump functions for even n, that is , the black-box complexity is . It is clear that these results are valid for the unrestricted black-box complexity as well; however, our results are stronger in that they provide much more precise bounds (making the leading constant precise and in the case of extreme jump functions also the lower order terms up to order ) and regard wider ranges of (namely all values of ).

## 3  Upper Bounds for the Black-Box Complexity of Jump

In this section, we consider upper bounds for Jump. To ease reading in the main subsections, in Section 3.1 we collect several useful technical results. The subsequent Sections 3.2 to 3.4 then treat separately small, large, and extreme jump sizes.

### 3.1  Preliminaries

Lemma 1:

For sufficiently large n, for all and the cases of all even , it holds that .

Proof:

This is proven in Doerr, Johannsen, et al. (2011) as Statement 8.

Lemma 2:

For sufficiently large n, for a fixed , for and for taken uniformly at random, the probability for to be zero is at most .

Proof:

The value of for a random x has a binomial distribution with parameters n and . From Hoeffding’s inequality (see Hoeffding, 1963 or Theorem 1.11 in Doerr, 2011), for , the distribution function for binomial distribution is bound from above by . As a consequence, the probability for to be zero is at most .

The following result extends the black-box complexity analysis of Doerr, Johannsen, et al. (2011) for OneMax to Jump functions.

Theorem 1:

Assume that n is sufficiently large and . Let . Let X be a (multi-)set of elements from chosen randomly using uniform distribution and mutually independently. Then the probability that there exists a such that and for all , is at most .

Proof:

We define $A_d$ as the set of points which differ from z in exactly d positions, where .

We say that a point agrees with if . This means that or . The probability of the former does not exceed by Lemma 2. The latter holds if and only if x and y, as well as x and z, differ in exactly half of the bits in which y and z differ. To sum up, if , the probability for y to agree with a random x is at most for an odd d and at most for an even d. As for large enough n it holds that , the latter is at most .

Let p be the probability that there exists an such that y agrees with all . Then
After applying Lemma 1, we obtain
which is less than for sufficiently large n.

### 3.2  Upper Bound for Smaller Jump Sizes

Using Theorem 1, we easily find the following upper bound for the unrestricted black-box complexity of jump functions with small jump size. This bound is asymptotically equal to the best known upper bound for OneMax. Also note that, trivially, any lower bound for the black-box complexity of OneMax also holds for any jump function class with problem size n. Consequently, the best known lower bound for OneMax, which is a factor of 2 below these upper bounds, also holds for the jump functions regarded in this subsection. In a sense, these results show that for solving OneMax, the inner fitness levels alone are sufficient.

Theorem 2:

If , the unrestricted black-box complexity of is at most , where refers to .

Proof:

We use the same algorithm which is used in Doerr, Johannsen, et al. (2011) for proving the upper bound for OneMax. We select randomly and independently t queries such that and check if there exists a single optimum z which agrees with all these queries (a query q with an answer a agrees with an optimum z if the jump function with optimum z assigns the value a to q). If there is more than one possible z, then we repeat the whole procedure. The complexity of one invocation of this procedure equals t. The probability of not finding a unique optimum is at most by Theorem 1. Thus the complexity of the algorithm is at most .
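The sampling procedure from this proof can be sketched for very small n as follows. This is an illustration under stated assumptions rather than the algorithm as analyzed: the consistency check is done by brute force over all $2^n$ candidates (the black-box model only counts queries, not computation), the form of `jump` is the assumed one with value 0 on blanked-out levels, and the names `solve_by_random_sampling`, `ell`, and `t` are hypothetical.

```python
import itertools
import random


def jump(x, z, ell):
    """Jump function with hidden optimum z and jump size ell (assumed form)."""
    n = len(z)
    v = sum(xi == zi for xi, zi in zip(x, z))
    return v if v == n or ell < v < n - ell else 0


def solve_by_random_sampling(f, n, ell, t, rng):
    """Repeat rounds of t random queries until exactly one optimum is consistent."""
    evals = 0
    while True:
        queries = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(t)]
        answers = [f(q) for q in queries]
        evals += t
        # Candidates z that agree with every (query, answer) pair of this round.
        consistent = [
            z for z in itertools.product((0, 1), repeat=n)
            if all(jump(q, z, ell) == a for q, a in zip(queries, answers))
        ]
        if len(consistent) == 1:
            return consistent[0], evals


rng = random.Random(0)
hidden = (1, 0, 0, 1, 1, 0, 1, 0)
found, evals = solve_by_random_sampling(lambda q: jump(q, hidden, 1), 8, 1, 20, rng)
```

The true optimum is always among the consistent candidates, so a round fails only when some other candidate happens to agree with all t answers. Note that this sketch identifies the optimum without necessarily querying it; one further query would be needed to evaluate it.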

It is not difficult to see that, with slightly more care, Theorems 1 and 2 can be shown to hold also for larger values of , however, not for . For this reason, we do not pursue this direction. Instead, in the next subsection we propose an algorithm that works for even larger , but gives the same results that an extension of the previous theorem would have given in the small range above , where such an extension would have been possible.

### 3.3  Upper Bound for Larger Jump Sizes

For bigger , finding an optimum for can be reduced to finding optima of several jump functions with smaller dimensions and jump sizes, for which the algorithm from the previous section suffices.

Theorem 3:

For , the unrestricted black-box complexity of is at most , where and refer to .

Proof:

Let . We reduce our problem to the one of , where s is chosen such that . The algorithm is outlined in Figure 1.

Figure 1:

Algorithm for with .

First, the algorithm finds a maximum even s such that , which would allow applying Theorem 2 for solving . After that, the algorithm finds a string with exactly correct bits using random queries. The probability that is equal to for a random query is which is by Stirling’s formula. This means that the string x can be found in queries.

After finding x, the algorithm splits all bit indices into sets of size s in such a way that x and the answer agree in exactly half of the bits. To be precise, the last such set may have indices, in which case x and the answer agree in of its positions. This is done in lines 8–15 in Figure 1, where $b_i$ is the i-th such set. B, the set of yet undistributed bits, is always such that x and the answer agree in exactly indices of B.

To do that, the algorithm generates random subsets of size s and checks if they contain exactly correct bits, which is done by flipping the bits from the chosen subset and checking whether the fitness remains equal . If , the probability of choosing such a subset is
This gives an bound for one subset selection and an bound on entire process of finding subsets.

Next, the algorithm separately optimizes bits from each of the subsets $b_i$ using the algorithm for small Jump from Theorem 2 (lines 17–20 in Figure 1). If every query for a subproblem on bits from $b_i$ is forwarded to the main function f with all bits not from $b_i$ taken from x, the resulting subproblem becomes exactly a problem with the following corrections:

• from all nonzero answers, a value of needs to be subtracted;

• at the optimum of the subproblem, zero will be returned.

The latter correction, however, does not change the algorithm very much, because the algorithm from Theorem 2 does not actually query the optimum point. Line 19 in Figure 1 collects the partial answers one by one: it sets the bits of $a_i$ at the corresponding positions from $b_i$ to the previous partial answer and returns the updated value.

Assuming , , the complexity of the algorithm is at most:
However, due to the choice of s, it holds that , which finally results in .

### 3.4  Upper Bound for Extreme Jump

The algorithm from Theorem 3 cannot be applied to the extreme Jump function, because then k would be zero and the subproblems are extreme Jump problems as well. In this case we have to use another algorithm, which will be given in the proof of the following theorem.

Theorem 4:

The unrestricted black-box complexity of an extreme Jump problem is at most .

Proof:

As described in the proof of Theorem 3, one can find a point x such that in queries. After that, if one flips two bits, the value of f remains the same if and only if one of these bits was correct and the other was not.

We denote by the bitwise exclusive OR of the bit strings a and b having equal lengths. The algorithm queries for all , and if it equals , the value of bi is set to one, otherwise to zero. This results in queries. After that, if the first bit is correct, then is the answer, otherwise its inverse is the answer. One has to make a single query, , to determine which one is true. The complexity of this algorithm is .
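The procedure from this proof can be sketched as follows. This is a minimal illustration that assumes n is even and that the extreme jump function returns n at the optimum, n/2 on the middle level, and 0 elsewhere; the names `extreme_jump_even` and `solve_extreme_jump_even` are hypothetical, and the pairing of the first bit with every other bit is one way to realize the two-bit flips described above.

```python
import random


def extreme_jump_even(x, z):
    """Extreme jump function for even n (assumed form): only n/2 and n are visible."""
    n = len(z)
    v = sum(xi == zi for xi, zi in zip(x, z))
    return v if v in (n, n // 2) else 0


def solve_extreme_jump_even(f, n, rng):
    """Returns the optimum and the number of evaluations used."""
    evals = 0
    # Phase 1: sample random points until one lies on the middle level.
    while True:
        x = [rng.randrange(2) for _ in range(n)]
        fx = f(x)
        evals += 1
        if fx == n:
            return x, evals
        if fx == n // 2:
            break
    # Phase 2: flip bit 0 together with bit i. The fitness stays n/2 exactly
    # when one of the two bits is correct in x and the other is not.
    b = [0] * n
    for i in range(1, n):
        y = x.copy()
        y[0] ^= 1
        y[i] ^= 1
        fy = f(y)
        evals += 1
        if fy == n:
            return y, evals
        if fy == n // 2:
            b[i] = 1
    # Phase 3: if bit 0 of x is correct, x XOR b is the optimum;
    # otherwise the optimum is its bitwise complement.
    candidate = [xi ^ bi for xi, bi in zip(x, b)]
    evals += 1
    if f(candidate) == n:
        return candidate, evals
    complement = [1 - c for c in candidate]
    f(complement)                 # evaluate the optimum so that it is counted
    return complement, evals + 1
```

For example, with `z` a hidden list of ten bits, `solve_extreme_jump_even(lambda q: extreme_jump_even(q, z), 10, random.Random(0))` returns `z` together with the evaluation count. The second and third phases use at most n + 1 evaluations in this sketch; the number of evaluations in the first phase is random.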

## 4  The Matrix Lower Bound Theorem

In this section we present a new theorem which is similar to Theorem 4 from Droste et al. (2006) except that the nodes corresponding to queries are required to be split into several types.

Theorem 5:

Let S be the search space of an optimization problem, and for each there exists an instance such that s is a unique optimum. Let each query have one of T types, such that for any query q of the i-th type the following holds:

• there is exactly one answer to the query q which means that q is an optimum;

• there are at most answers such that the next query after such answer belongs to the j-th type.

Define , , to be a matrix such that:

• for (note the transposition);

• for ;

• for ;

• otherwise.

Let the first ever query in the optimization process be of type 1. Define to be a vector, , . Then the following statements are true:

1. is the maximum total number of possible queries with depth in , where depth of a root is equal to one.

2. The lower bound on the average depth of N nodes is where d is an integer such that .

3. The unrestricted black-box complexity of the considered optimization problem is not less than the lower bound on average depth of nodes.

Proof:

According to Yao’s minimax principle (Yao, 1977), the expected runtime of a randomized algorithm on any input is not less than the average runtime of the best deterministic algorithm over all possible inputs. Thus we construct a lower bound on the complexity of a randomized algorithm by constructing a lower bound on the average performance of any deterministic algorithm over all possible inputs. A deterministic algorithm can be represented as a (rooted) decision tree with nodes corresponding to queries and arcs going downwards corresponding to answers to these queries. A total lower bound on the average performance of deterministic algorithms, just as in Droste et al. (2006), is obtained by assigning different queries to different nodes of a tree such that their average depth is minimized, and then by considering all such trees and taking a minimum over them.

It should be noted that, if a (fixed) set of queries is to be assigned to nodes of a (fixed) rooted tree such that the average depth of these queries is minimized, an optimal assignment can be constructed in a greedy way: each query should be assigned to a free node with the minimum possible depth. Assume that an optimal assignment does not use at least one node a with depth d while using at least one node b with depth greater than d. Then one can move a query from the node b to the node a, which decreases the average depth, so the initial assignment is, in fact, not optimal.

Next, we show that, in order to minimize the average depth, one needs to consider only the complete tree, that is, a tree where for any query of the i-th type, for any j there are exactly answers, each leading to a query of the j-th type. Indeed, if an optimal assignment can be done for an incomplete tree, it can be done for the complete tree as well, because all the nodes of any incomplete tree are preserved in the complete tree.

For a complete tree with the constraints determined by the matrix A (as specified in the theorem’s statement) and with the root vertex of type 1, the number of vertices of type i and depth d (the root has the depth equal to 1) is exactly . In the matrix B, the next-to-last row is designed to collect the sum of all numbers of vertices at all previous depths (which is exactly how is defined), and the last row, in a similar manner, collects —the sum of ’s for all . In a more explicit way, can be expressed as:
where is the number of vertices of all types residing at the depth , so the expression is actually the sum of depths of all vertices up to the depth d:
and the expression is thus exactly the average depth of all such vertices.
If we consider arbitrary integer N, we can find an integer d such that . In this case, the total sum of depths of the first vertices is , and the next vertices have the depth of . The average depth is thus:

It is difficult to use this theorem straightaway, because the lower bound on the average depth of N vertices is not defined only in terms of N and the matrix A, but additionally requires finding the depth d up to which the complete tree contains at most N vertices. However, for several common use cases the bound can be brought into a more convenient form.
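When a closed form is not needed, the bound can also be evaluated numerically by filling the complete tree level by level. The following is a minimal sketch under stated assumptions: `A[i][j]` is taken to be the number of answers to a query of type i that lead to a query of type j, the root has type 0 (the first type) and depth 1, and the function name is hypothetical. It does not use the matrix B from the theorem; it simply counts vertices per depth and applies the greedy assignment.

```python
from fractions import Fraction


def average_depth_lower_bound(A, n_vertices):
    """Average depth of the n_vertices shallowest vertices of the complete tree.

    Assumption: A[i][j] is the number of answers to a type-i query that lead
    to a type-j query; the root has type 0 and depth 1.
    """
    n_types = len(A)
    level = [1] + [0] * (n_types - 1)  # vertices of each type at this depth
    depth, placed, total_depth = 1, 0, 0
    while placed < n_vertices:
        width = sum(level)
        if width == 0:
            raise ValueError("the complete tree is too small for all vertices")
        take = min(width, n_vertices - placed)  # greedy: fill shallow levels first
        placed += take
        total_depth += take * depth
        # Type counts at the next depth: row vector times the matrix A.
        level = [sum(level[i] * A[i][j] for i in range(n_types))
                 for j in range(n_types)]
        depth += 1
    return Fraction(total_depth, n_vertices)


# One query type with two nonterminating answers, seven search points.
print(average_depth_lower_bound([[2]], 7))  # 17/7
```

With a single type and one nonterminating answer (`[[1]]`) the tree degenerates into a path, which is the situation treated separately for k = 1.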

Theorem 6:

If there is only one type of query in Theorem 7, and such that , then for the search space S the lower bound on the average depth is at least .

Proof:
The value of yields the following result (intermediate computations omitted):

One can see that and .

Consider an equality . It follows that:
As for a given N, we need to find an integer d such that ; we need to round it down: .

Note that, if and , grows when d grows, as .

The expression for a lower bound on the average depth of N queries is at most:
Note that the classical result from Droste et al. (2006), the lower bound, is actually not greater than the given bound (note that differs from k used in the definition of Droste et al., 2006 because in the current context k does not include the answer corresponding to the optimum). Indeed, for :

For the case of k = 1, the lower bound is even stronger.

Theorem 7:

If there is only one type of query in Theorem 7, and , then for the search space S the lower bound on the average depth is at least .

Proof:

In this case one can show that and . The average depth for N is .
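As a sanity check for this case, the average depth can also be computed directly. This is our own short derivation, under the convention used above that the root has depth 1: with a single nonterminating answer per query, the complete tree is a path with exactly one vertex at each depth, so the N shallowest vertices occupy the depths 1 to N.

```latex
\frac{1}{N}\sum_{d=1}^{N} d
  \;=\; \frac{1}{N}\cdot\frac{N(N+1)}{2}
  \;=\; \frac{N+1}{2}.
```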

## 5  Lower Bounds for $\mathrm{Jump}_{n,\ell}$

First, we apply Theorem 8 directly to the Jump problem.

Theorem 8:

For any n and , the unrestricted black-box complexity of is at least .

Proof:

In , the search space has a size of . There are possible answers to a query, but one of them terminates the search process immediately, so . The result follows straightaway from Theorem 8.

Theorem 9:

The unrestricted black-box complexity of extreme Jump for even n is at least .

Proof:

It follows from Theorem 10 by assuming .

The presented bounds are already an improvement over the currently known bounds (say, for extreme Jump and even n, as follows from Droste et al., 2006). However, for odd n Theorem 10 reports , which is still quite far away from the best known algorithms. Fortunately, the Jump problem possesses a particular property, which can be used to refine the lower bounds using Theorem 7 with two types of queries.

Theorem 10:

For $\mathrm{Jump}_{n,\ell}$, define an answer to a query to be nontrivial if it is neither 0 nor n. Once the first nontrivial answer has been received, it is possible to determine a priori, for every subsequent query, the parity of any nontrivial answer to that query.

Proof:

Consider the optimum and a query. We introduce the following values:

• q00: number of positions with zeros in both the optimum and the query;

• q01: number of positions with zeros in the optimum and ones in the query;

• q10: number of positions with ones in the optimum and zeros in the query;

• q11: number of positions with ones in both the optimum and the query.

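The number of zeros in the optimum modulo 2, which is $(q_{00} + q_{01}) \bmod 2$, is fixed. The number of ones in the query modulo 2 is $(q_{01} + q_{11}) \bmod 2$, and the answer to the query modulo 2 is $(q_{00} + q_{11}) \bmod 2$. The following equality holds:

$$q_{00} + q_{11} \equiv (q_{00} + q_{01}) + (q_{01} + q_{11}) \pmod{2},$$

which means that the parity of the nontrivial answer is uniquely determined by the parity of the number of ones in the query.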

As a result, once an algorithm has received its first nontrivial answer, every subsequent query can have fewer possible answers, because nontrivial answers of the wrong parity are ruled out.
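The parity property can be verified exhaustively for small n. The sketch below assumes the usual definition of the jump function with a hidden optimum z: the answer is n if the query equals z, it is the number of agreeing positions if that number lies strictly between ℓ and n − ℓ, and it is 0 otherwise. This definition and all function names are assumptions of the example.

```python
from itertools import product


def jump(z, x, ell):
    """Jump value of the query x for the hidden optimum z (assumed definition)."""
    n = len(z)
    agree = sum(a == b for a, b in zip(z, x))
    if agree == n:
        return n
    return agree if ell < agree < n - ell else 0


def parity_is_predictable(n, ell):
    """Check that every nontrivial answer has the parity of zeros(z) + ones(x)."""
    points = list(product((0, 1), repeat=n))
    for z in points:
        zeros_in_optimum = z.count(0)
        for x in points:
            answer = jump(z, x, ell)
            if answer in (0, n):
                continue  # trivial answers are not covered by the statement
            if answer % 2 != (zeros_in_optimum + sum(x)) % 2:
                return False
    return True


print(parity_is_predictable(5, 1), parity_is_predictable(6, 2))
```

Since the number of zeros in the optimum is fixed, one nontrivial answer reveals its parity, and from then on the parity of the number of ones in a query determines the parity of any nontrivial answer to it.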

Using Theorem 12, we can define two types of queries to use with Theorem 7, namely, the queries that happened before and after a nontrivial answer.

Theorem 11:
The unrestricted black-box complexity of for odd n is at least:
Proof:
For odd n there are = possible answers: one answer equal to 0, one answer equal to n, and k pairs of nontrivial answers. For Theorem 7, the first type of query has nonterminating answers, and the second type of query, which occurs after one of nontrivial answers is received from a query of the first type, has only nonterminating answers. The value of is thus:
A problem of defining d in terms of N is more difficult this time: as , the equality cannot be easily solved in terms of d. Instead, we introduce a function such that the following equality holds:
We find the lower bound on the average depth , keeping in mind that grows as d grows and that for :
We can also obtain a good lower bound on by throwing out the part in the definition of above, which leads to . Together, For , it holds that and , which constitutes:
Theorem 12:

The unrestricted black-box complexity of extreme Jump for odd n is at least .

Proof:
For extreme Jump and odd n, . Then from Theorem 13 it follows that the lower bound is at least:
Theorem 13:
The unrestricted black-box complexity of for even n is at least:
Proof:
For even n there are possible answers (): one answer equal to 0, one answer equal to n, one answer equal to , and k more pairs of nontrivial answers. For Theorem 7, the first type of query has nonterminating answers, and the second type of query can have either or k nonterminating answers, depending on the parity of the number of ones in a query. As we cannot predict the parity for all possible algorithms, the maximum number of queries is limited to . The matrix B has the following form:
We omit the intermediate computations and just state that:
Following the same approach as in the proof of Theorem 13, we define such that and produce the following lower bound:
The lower bound on can be achieved from the value of by throwing out the part, which yields:
and, together:
Substitution of N with and with proves the theorem.

Note that Theorem 15 does not improve the bound for extreme Jump and even n—it remains equal to when one sets —because in this case the number of possible answers does not change after receiving the first nontrivial answer.

## 6  Refining the Lower Bound for Extreme Jump Sizes

For extreme jump functions, we gave a black-box algorithm finding the optimum in time , whereas the lower bound stemming from the matrix lower bound theorem proposed in Section 4 was . The gap between these two bounds is relatively small compared to most black-box complexity results—recall that, for example, for the OneMax problem not even the leading constant in the complexity is known. Nevertheless, it is an interesting question whether the term in the upper bound is necessary or only stems from insufficient proof methods. So far, the upper bound proof suggests that black-box algorithms optimizing extreme jump functions have an initial phase of rounds in which they gain relatively little information about the optimum. Only once they have found a search point with nonzero fitness, they become more efficient and solve the problem in roughly n additional iterations.

In this section, we show that the $\sqrt{n}$ term is indeed necessary; that is, the black-box complexity of the extreme jump functions is $n + \Theta(\sqrt{n})$. Our proof also shows that it cannot be avoided that the first $\Theta(\sqrt{n})$ fitness evaluations reveal little information about the optimum.

The remainder of this section is organized as follows. In Section 6.1, we reduce the problem of finding a lower bound for extreme Jump to a minimization problem over decision trees having a particular structure. This is where we exploit particular properties of optimizing extreme jump functions. In Section 6.2, we solve this minimization problem in a general form via a recursive argument. In Section 6.3, we minimize the obtained lower bound by choosing the best possible value for the parameter t of the minimization problem. Finally, in Section 6.4, we derive from the results of Section 6.3 our improved lower bounds for the black-box complexity of extreme jump functions.

### 6.1  Representing a Deterministic Algorithm for Extreme Jump

Consider an extreme jump problem, for simplicity here for even n (see Note 1). As we have argued in Section 4 already, a lower bound for the black-box complexity can be obtained by regarding the best average performance a deterministic black-box algorithm can have (where the average is taken over all instances, here over all extreme jump functions over $\{0,1\}^n$). Hence, let us consider a deterministic black-box algorithm for the extreme jump problem (and argue that its average performance cannot be too good).

Before starting this argument, we remind the reader that the extreme jump functions problem has the properties that (i) for each point of the search space there is exactly one problem instance having this as optimal solution and (ii) all instances have unique optimal solutions. Consequently, we can (and will) identify the problem instances with their unique optima.

A deterministic black-box algorithm gives rise (and in fact is equivalent) to the following type of decision tree. In a decision tree for a search space S (recall that we use the points of $S = \{0,1\}^n$ as representatives of the (unique) extreme jump functions with these optima), each node v is labeled with a subset $S_v \subseteq S$ (“remaining search space”); the root is labeled with S. Each internal node v is also labeled with a query $q_v$. If $q_v$ has the answer i for at least one $x \in S_v$, then v has an outgoing edge labeled i to a node w labeled with the set $S_w$ of all $x \in S_v$ having the answer i to the query $q_v$. Consequently, the labels $S_w$ of the children w of v form a partition of $S_v$. A node whose ingoing edge is labeled with the optimal answer (here n) has no children. These are the only nodes without children; this last requirement stems from the fact that we do not only require the black-box algorithm to “know” the optimal solution, but also to query it.

It is clear that each deterministic black-box algorithm gives rise to a decision tree and that each decision tree describes a black-box algorithm. By our observation that there is a bijection between the extreme jump functions and their unique optima, we see that at each leaf the set of possible solutions contains just a single element and that the average performance of the algorithm on a random instance is exactly the average depth of the leaves of the tree. Consequently, the black-box complexity of the extreme jump functions is the smallest possible average depth of the leaves of a decision tree.
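The correspondence between a deterministic algorithm and the average leaf depth of its decision tree can be made concrete for small n. The sketch below simulates one simple (and certainly not optimal) deterministic strategy: always query the first point that is still consistent with all previous answers. The definition of the extreme jump function used here (only the middle value or values and the optimum are visible) and all names are assumptions of the example.

```python
from itertools import product


def extreme_jump(z, x):
    """Extreme jump value of the query x for the hidden optimum z (assumed)."""
    n = len(z)
    agree = sum(a == b for a, b in zip(z, x))
    if agree == n:
        return n
    middle = {n // 2} if n % 2 == 0 else {n // 2, n // 2 + 1}
    return agree if agree in middle else 0


def queries_until_optimum(z):
    """Depth of the leaf reached by the strategy on the instance with optimum z."""
    n = len(z)
    candidates = list(product((0, 1), repeat=n))  # remaining search space
    queries = 0
    while True:
        q = candidates[0]  # deterministic choice of the next query
        answer = extreme_jump(z, q)
        queries += 1
        if answer == n:
            return queries
        # Keep exactly the optima that would have produced the same answer.
        candidates = [c for c in candidates if extreme_jump(c, q) == answer]


def average_queries(n):
    """Average leaf depth of the decision tree, taken over all 2**n instances."""
    instances = list(product((0, 1), repeat=n))
    return sum(queries_until_optimum(z) for z in instances) / len(instances)


print(average_queries(2))  # 2.0
```

Each filtering step corresponds to following one edge of the decision tree, and the list `candidates` plays the role of the remaining search space at the current node.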

By analyzing the structure of the decision trees for the extreme jump problem, we shall show improved lower bounds for their black-box complexity. To this aim, note first that if a query q gets the answer $n/2$, one knows that the remaining search space contains only the binary strings of length n which have exactly $n/2$ common bits with the query q. There are at most $\binom{n}{n/2}$ such strings (see Note 2). From this innocent observation, we derive that the decision tree cannot be very balanced. The crucial property to look at is the length of the maximal path starting at the root of the tree with all edges labeled with the answer 0. Figure 2 shows a decision tree for extreme Jump with this path highlighted as the trunk of the tree. Note that this trunk is formed by queries which are made before any nonzero answer is received. The branches which are above the trunk in Figure 2 are used when the answer of n is received; in these cases the algorithm immediately stops. The branches below the trunk correspond to the cases when the answer of $n/2$ is received (see Note 3). The values of $s_i$ are the remaining search space sizes for the corresponding branches. These values satisfy $s_i \le \binom{n}{n/2}$, as follows from above. The value of t is the total number of queries in the trunk, that is, the t-th query in the trunk cannot have an answer of 0.
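The size of a branch below the trunk can be confirmed by direct counting. The sketch assumes even n and that the only nontrivial answer of an extreme jump function is n/2; the function name is hypothetical. It counts the hidden optima that agree with a given query in exactly n/2 positions.

```python
from itertools import product
from math import comb


def consistent_after_middle_answer(q):
    """Number of optima z agreeing with the query q in exactly n/2 positions."""
    n = len(q)
    return sum(
        1
        for z in product((0, 1), repeat=n)
        if sum(a == b for a, b in zip(z, q)) == n // 2
    )


q = (0, 1, 1, 0, 1, 0)
print(consistent_after_middle_answer(q), comb(6, 3))  # 20 20
```

The count does not depend on the particular query, which is why a single quantity bounds the sizes of all branches below the trunk.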

Figure 2:

A decision tree of a deterministic algorithm to solve extreme Jump for even n. For odd n, the branches below the trunk appear in pairs for answers $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$.

Let the total search space size be s, then . If we fix both s and t, any lower bound l(s, t) on the average depth of the decision tree of an optimal deterministic algorithm with the fixed t will be a lower bound on the expected runtime of any black-box search algorithm with the fixed t. After that, we can optimize l(s, t) for fixed s by varying t to find the lower bound for any black-box search algorithm for solving the extreme Jump.

The average depth of a subtree of a size si is at least , which follows from Theorem 8. To obtain the desired lower bound, we minimize the average depth of the entire tree by finding the proper values for si. In a general form, this minimization problem is formalized and solved in Section 6.2.

### 6.2  Lower Bound for Fixed s and t in General Form

In this section, we construct a lower bound for a function f(s, t), which requires that and integer , where Q is a certain positive value. This function is defined as follows:
where . The aim of this function is to represent the sum of depths of all queries for the given s and t in a tree from Figure 2, provided that the $s_i$ are assigned optimally. The bounds for the parameter u in the minimization clause come from the domain of and the fact that u is actually the value for $s_1$, which cannot be negative and cannot exceed $\binom{n}{n/2}$; the latter (see Note 4) is denoted by Q.

For the sake of brevity, we define a function and a function . Note that and . As is convex downwards for every fixed p, the following lemma holds.

Lemma 3:
For and , the following inequality holds:
which turns to equality at .
Proof:
We rewrite as . By Jensen’s inequality,

The following theorem gives a lower bound for f(s, t) which we use later.

Theorem 14:
For , the following lower bound holds:
Proof:

We prove this theorem by induction on t.

Induction base: t = 2. By definition:
By Lemma 16, the minimization clause gets minimum at . As by definition of f, u is a feasible minimum point, so:

Induction step:. Figure 3 shows the feasible region for u and the regions corresponding to the same pieces from the definition of on the Cartesian plane with axes of and .

Figure 3:

Feasible region for u and pieces from the definition of .

We prove the induction step by parts which correspond to parts from the definition of f(s, t).

1. For the proof follows exactly the same scheme as in the induction base.

2. Consider and . These constraints correspond to lower triangles in Figure 3, examples of which are shown as triangle A and triangle C. The triangle C corresponds to the situation when and is used as x for . If x = 1, which corresponds to the triangle A, the first clause from the definition of should be used; however, in this case it is precisely equal to what would happen if x = 0 is substituted into the second clause. Hence we can evaluate both cases (triangles A and C) simultaneously.

   Denote the interval as . By definition:

   The minimum point for the expression in the minimization clause is, by Lemma 16, . This value is infeasible as , and all feasible values of u are on the same side of the minimum point, so the closest one should be used: . This results in the following bound:

3. Consider and . These constraints correspond to upper triangles in Figure 3, of which the triangle B is an example. We denote the interval as . By definition:

   The minimum point for the expression in the minimization clause is, by Lemma 16, . As and , this minimum point is infeasible. As all the feasible values of u are on the same side of the minimum point, the closest one should be used: . This results in the following bound:

4. As parts 2 and 3 correspond to the same values of x, but to different ranges of u, we need to take a minimum of them. We denote as the lower bound from part 2 and as the lower bound from part 3. To prove that the first one is always the minimum of the two, we should prove that . Here we denote as z, such that . Thus we have:

   Both parentheses blocks in the latter expression are actually instantiations of a function , the first block has j = 0, the second block has . The function , by Lemma 16, has a minimum point at , so it decreases for , which means that .

As follows from the results of part 4, the lower bound for f(s, t) at is always the expression from part 2. As for the last definition piece, , the expression is the only lower bound to consider, this proves the induction step and the whole theorem as well.

### 6.3  Minimizing the General Form by t

In this section, we use the lower bounds for f(s, t) from Theorem 17 and try to find a minimum of them by t. For the case we relax the lower bound: .

Consider the case . We replace x by as x was defined by . Thus:
The two last addends together can be simplified as:
Now, we have to minimize by t the following expression for the lower bound:
This is a polynomial of degree two, which has its minimum at . Note that ; that is, is feasible. We get the following expression by substituting :

### 6.4 Applying the Lower Bounds to Extreme Jump

To apply the results of previous sections to the extreme Jump problem, we first need to note that, while Sections 6.2 and 6.3 use as the minimum sum depth of all queries in a subtree of size u, the real value is . However, . We can estimate the lower bound on the black-box complexity of the extreme Jump as follows:

The search space size for extreme Jump is . For even n, . Note that , also note that .

We get the following expression for even n:

For odd n, as noticed before, which brings the same asymptotic.

Note that in both the odd and the even cases, the value for Q plugged in the expression for gives . Hence, when running any deterministic black-box algorithm for extreme jump functions, it cannot be avoided that, on average, only 0-answers are received for the first iterations, and consequently, the size of the remaining search space is still a constant fraction of the initial search space.
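The order of magnitude involved can be illustrated numerically. The sketch assumes even n, a search space of size s = 2^n, and Q equal to the central binomial coefficient, as in the branch-size bound of Section 6.1; the function name is hypothetical. By Stirling's formula the ratio s/Q is asymptotically $\sqrt{\pi n/2}$, that is, of order $\sqrt{n}$.

```python
from math import comb, pi, sqrt


def space_to_branch_ratio(n):
    """Ratio s / Q for even n, with s = 2**n and Q = C(n, n/2) (assumed)."""
    return 2**n / comb(n, n // 2)


for n in (10, 100, 1000):
    # Exact ratio next to the Stirling approximation sqrt(pi * n / 2).
    print(n, space_to_branch_ratio(n), sqrt(pi * n / 2))
```

Since every branch below the trunk holds at most Q points, on the order of s/Q such branches are needed before they can cover the whole search space.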

## 7  Conclusion

New black-box algorithms for solving the $\mathrm{Jump}_{n,\ell}$ problem are presented, giving the following upper bounds:

• for : , where is measured when ;

• for : , where and are measured when ;

• for : .

A new theorem for constructing lower bounds on the unrestricted black-box complexity of problems is proposed. The underlying idea is that the influence of particular answers on all subsequent queries can be formalized by assigning a type to each query and writing the relations in the form of a matrix. Several of the subsequent steps for constructing the lower bounds can be automated and performed using tools like Wolfram Alpha. We hope that this theorem can be used to obtain better lower bounds for other problems.

Using the proposed theorem, the lower bounds for $\mathrm{Jump}_{n,\ell}$ are updated:

• for even n: ;

• for odd n: .

For extreme Jump, the lower bounds produced by the matrix theorem are for even n and for odd n. We applied additional knowledge about the jump functions, namely, the fact that the search space size decreases significantly when the algorithm receives an answer equal to or . The lower bounds for extreme Jump were proven to be , which matches the upper bound. More precisely, the lower bound equals , while the upper bound equals , where for even n and for odd n.

In the case of large, but not extreme , the quotients at the highest terms coincide as well (when tends to infinity), because the ratio of the upper bound to the lower bound approaches:

These new methods to prove lower bounds are interesting beyond the particular application to jump functions, since there are very few lower bound proofs in black-box complexity theory beyond the information theoretic argument from Droste et al. (2006), while there are many challenging problems still open. Let us point out two.

For the black-box complexity of the $\mathrm{OneMax}$ class, we know that it is between $(1+o(1))\,n/\log_2 n$ and $(2+o(1))\,n/\log_2 n$. The true value is not known. This problem was already raised by Erdős and Rényi (1963). The lower bound stems from the information theoretic argument, building on the fact that each query has at most $n+1$ different answers. For a single query regarded independently, the information gain is much lower. In fact, for each query the vast majority of the instances have the answer lying in an interval of size around $\sqrt{n}$. For this reason, nonadaptive black-box algorithms such as asking random queries (but also all other algorithms asking queries independent of the previous answers) necessarily have a complexity of at least $(2+o(1))\,n/\log_2 n$; see again Erdős and Rényi (1963). To prove that this is also a lower bound for any black-box algorithm, one would need a lower bound technique that goes beyond the argument of counting the number of different answers.
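Both effects mentioned here can be made tangible with a few lines of arithmetic. The sketch assumes that the answer to a OneMax query is the number of positions in which the query agrees with the hidden string, so that there are n + 1 possible answers and, for a fixed query and a uniformly random hidden string, the answer is binomially distributed. The function names are hypothetical.

```python
from math import comb, log2


def information_bound(n):
    """log_{n+1}(2**n): queries needed if all n + 1 answers were equally useful."""
    return n / log2(n + 1)


def mass_near_middle(n, width):
    """Fraction of hidden strings whose answer to a fixed query lies within
    `width` of n/2."""
    lo, hi = n / 2 - width, n / 2 + width
    return sum(comb(n, k) for k in range(n + 1) if lo <= k <= hi) / 2**n


print(information_bound(1023))    # about 102.3 queries
print(mass_near_middle(100, 10))  # fraction of answers within sqrt(n) of n/2
```

For n = 100, more than 95% of the instances give an answer between 40 and 60, so a single query in isolation distinguishes far fewer cases than the count of n + 1 possible answers suggests.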

A second challenging lower bound problem is the unbiased k-ary black-box complexity of the $\mathrm{OneMax}$ function class. The k-ary unbiased black-box complexity is the unbiased black-box complexity defined in Section 2.3 with the additional restriction that the unbiased variation operators take at most k previous search points as arguments. There are upper bounds for this black-box complexity decreasing with k (Doerr, Johannsen, et al., 2011; Doerr and Winzen, 2014c), which indicates a possibly stronger power of higher arity operators. Unfortunately, no matching lower bounds confirming this exist. Worse, there are no lower bounds for k-ary black-box complexities of $\mathrm{OneMax}$ for $k \ge 2$ at all that are stronger than the information theoretic $\Omega(n/\log n)$. For $k = 1$, Lehre and Witt (2012) show a lower bound of $\Omega(n \log n)$ by, very roughly speaking, showing that a unary unbiased black-box algorithm for $\mathrm{OneMax}$ cannot do much better than taking the best-so-far search point and mutating it by flipping a certain number of bits.

While it is not clear how to use our matrix lower bound theorem for either of these two problems, we hope that our work does give some motivation to push the lower bound question beyond the classic information theory argument.

## Acknowledgments

This work was partially financially supported by the Government of the Russian Federation, Grant 074-U01. This research also benefited from the support of the FMJH Program Gaspard Monge in Optimization and Operation Research, and from the support to this program from EDF.

## References

Afshani, P., Agrawal, M., Doerr, B., Doerr, C., Larsen, K. G., and Mehlhorn, K. (2013). The query complexity of finding a hidden permutation. In Space-Efficient Data Structures, Streams, and Algorithms, pp. 1–11. Lecture Notes in Computer Science, Vol. 8066. Berlin: Springer.

Anil, G., and Wiegand, R. P. (2009). Black-box search by elimination of fitness functions. In Proceedings of Foundations of Genetic Algorithms, pp. 67–78.

Badkobeh, G., Lehre, P. K., and Sudholt, D. (2014). Unbiased black-box complexity of parallel search. In Parallel Problem Solving from Nature XIII, pp. 892–901. Lecture Notes in Computer Science, Vol. 8672. Berlin: Springer.

Buzdalov, M., Kever, M., and Doerr, B. (2015). Upper and lower bounds on unrestricted black-box complexity of $\mathrm{Jump}_{n,\ell}$. In Evolutionary Computation in Combinatorial Optimization, pp. 209–221. Lecture Notes in Computer Science, Vol. 9026.

Chvátal, V. (1983). Mastermind. Combinatorica, 3(3):325–329.

Doerr, B. (2011). Analyzing randomized search heuristics: Tools from probability theory. In Theory of Randomized Search Heuristics, pp. 1–20.

Doerr, B., and Doerr, C. (2014). Black-box complexity: From complexity theory to playing mastermind. In Proceedings of Genetic and Evolutionary Computation Conference Companion, pp. 623–646.

Doerr, B., Doerr, C., and Ebel, F. (2015). From black-box complexity to designing new genetic algorithms. Theoretical Computer Science, 567:87–104.

Doerr, B., Doerr, C., and Kötzing, T. (2014a). Unbiased black-box complexities of jump functions: How to cross large plateaus. In Proceedings of Genetic and Evolutionary Computation Conference, pp. 769–776.

Doerr, B., Doerr, C., and Kötzing, T. (2014b). The unbiased black-box complexity of partition is polynomial. Artificial Intelligence, 216:275–286.

Doerr, B., Doerr, C., and Kötzing, T. (2016). Unbiased black-box complexities of jump functions. Evolutionary Computation. Accepted for publication.

Doerr, B., Johannsen, D., Kötzing, T., Lehre, P. K., Wagner, M., and Winzen, C. (2011). Faster black-box algorithms through higher arity operators. In Proceedings of Foundations of Genetic Algorithms, pp. 163–172.

Doerr, B., Kötzing, T., Lengler, J., and Winzen, C. (2013). Black-box complexities of combinatorial problems. Theoretical Computer Science, 471:84–106.

Doerr, B., Kötzing, T., and Winzen, C. (2011). Too fast unbiased black-box algorithms. In Proceedings of Genetic and Evolutionary Computation Conference, pp. 2043–2050.

Doerr, B., and Winzen, C. (2014a). Playing Mastermind with constant-size memory. Theory of Computing Systems, 55(4):658–684.

Doerr, B., and Winzen, C. (2014b). Ranking-based black-box complexity. Algorithmica, 68(3):571–609.

Doerr, B., and Winzen, C. (2014c). Reducing the arity in unbiased black-box complexity. Theoretical Computer Science, 545:108–121.

Doerr, C., and Lengler, J. (2015). Elitist black-box models: Analyzing the impact of elitist selection on the performance of evolutionary algorithms. In Proceedings of Genetic and Evolutionary Computation Conference, pp. 839–846.

Droste, S., Jansen, T., Tinnefeld, K., and Wegener, I. (2003). A new framework for the valuation of algorithms for black-box optimization. In Proceedings of Foundations of Genetic Algorithms, pp. 253–270.

Droste, S., Jansen, T., and Wegener, I. (2002). On the analysis of the (1 + 1) evolutionary algorithm. Theoretical Computer Science, 276:51–81.

Droste, S., Jansen, T., and Wegener, I. (2006). Upper and lower bounds for randomized search heuristics in black-box optimization. Theory of Computing Systems, 39(4):525–544.

Erdős, P., and Rényi, A. (1963). On two problems of information theory. Magyar Tudományos Akadémia Matematikai Kutató Intézet Közleményei, 8:229–243.

Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30.

Jansen, T. (2015). On the black-box complexity of example functions: The real Jump function. In Proceedings of Foundations of Genetic Algorithms, pp. 16–24.

Jansen, T., and Wegener, I. (2002). The analysis of evolutionary algorithms—A proof that crossover really can help. Algorithmica, 34(1):47–66.

Lehre, P. K., and Witt, C. (2010). Black-box search by unbiased variation. In Proceedings of Genetic and Evolutionary Computation Conference, pp. 1441–1448.

Lehre, P. K., and Witt, C. (2012). Black-box search by unbiased variation. Algorithmica, 64:623–642.

Rowe, J., and Vose, M. (2011). Unbiased black box search algorithms. In Proceedings of Genetic and Evolutionary Computation Conference, pp. 2035–2042.

Yao, A. C.-C. (1977). Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symposium on Foundations of Computer Science, pp. 222–227.

## Notes

1. In this section, the footnotes explain how the proof is adapted to odd n where necessary.

2. For odd n, there are two such answers, $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$, each leaving room for $\binom{n}{\lfloor n/2 \rfloor}$ binary strings.

3. For odd n, each vertex has two children below the trunk corresponding to the answers $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$.

4. For odd n, the corresponding Q is $2\binom{n}{\lfloor n/2 \rfloor}$. For the sake of the current bound, the two branches for the answers $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$, each with the same maximum size $\binom{n}{\lfloor n/2 \rfloor}$, can be safely glued together.