We analyze the unrestricted black-box complexity of the Jump function classes for different jump sizes. For upper bounds, we present three algorithms for small, medium, and extreme jump sizes. We prove a matrix lower bound theorem that is capable of giving better lower bounds than the classic information theoretic approach. Using this theorem, we prove lower bounds that almost match the upper bounds. For the case of extreme jump functions, which apart from the optimum reveal only the middle fitness value(s), we use an additional lower bound argument to show that no black-box algorithm gains significant insight about the problem instance from its first Θ(√n) fitness evaluations. This, together with our upper bound, shows that the black-box complexity of extreme jump functions is n + Θ(√n).
To understand how evolutionary algorithms (and other black-box optimizers as well) behave when optimizing certain functions, one proves upper bounds for the problem’s difficulty (by constructing and studying various algorithms) and lower bounds (by studying how fast an algorithm can be in principle), which complement each other. Comparing these bounds helps researchers to evaluate how good today’s heuristics are and sometimes to construct better algorithms (Doerr et al., 2015).
In this work, we study the second question, that is, how fast in principle a black-box optimization algorithm can solve certain optimization problems. Droste et al. (2003) were the first to ask this question in the context of evolutionary algorithms. In their seminal paper---see also Droste et al. (2006) for the journal version---they introduce the notion of black-box complexity as a measure of problem difficulty. In simple words, the black-box complexity of an optimization problem is the (expected) number of function evaluations that are needed by an optimal black-box algorithm until it queries an optimum for the first time. As many randomized search heuristics such as evolutionary algorithms, ant colony optimization, or simulated annealing are black-box optimizers, the black-box complexity of a problem gives a lower bound on the performance of all these search heuristics.
While dormant for several years, the area of black-box complexity became very active from 2009 on, possibly spurred by the remarkable works of Anil and Wiegand (2009) as well as of Lehre and Witt (2010). Since then, several deep and surprising results on black-box complexities have been achieved, so that we now understand reasonably well the black-box complexities of classic test functions, for example, of OneMax (Droste et al., 2006; Anil and Wiegand, 2009) or LeadingOnes (Afshani et al., 2013), and of several combinatorial optimization problems such as sorting, maximum clique, and the single-source shortest path problem (Droste et al., 2006), the minimum spanning tree problem (Doerr et al., 2013), and the partition problem (Doerr et al., 2014b). It was also observed that modified definitions of black-box complexity make it possible to study the influence of unbiasedness (Lehre and Witt, 2012; Rowe and Vose, 2011), ranking-basedness (Doerr and Winzen, 2014b), memory size (Doerr and Winzen, 2014a), parallel search (Badkobeh et al., 2014), or elitism (Doerr and Lengler, 2015).
In this article, we stay within the realm of classic black-box complexity of pseudo-Boolean functions; that is, we ask how many fitness evaluations an otherwise unrestricted algorithm needs to perform to find the optimum of a function (given in a black-box fashion) from a given problem class. With the OneMax and LeadingOnes test function classes already well studied, we turn to another important class, the Jump test functions. These are test functions with scalable difficulty, because the fitness landscape has a large plateau of low fitness around the optimum. For a jump function with jump size ℓ, this plateau consists of all search points with Hamming distance from the optimum between 1 and ℓ. See Section 2 for a precise definition. Our motivation is both understanding the black-box complexity of this well-studied function class and using it as a trigger to develop new methods, in particular, for proving lower bounds on black-box complexities, where at the moment not much is known beyond the information theoretic argument of Droste et al. (2006).
Concerning the black-box complexities of jump functions, we observe that while jump functions tend to be difficult for many randomized search heuristics, their black-box complexity is not excessively large (for the restricted notion of unbiased black-box complexity, weaker and less precise results pointing in the same direction have been obtained in Doerr et al. (2014a); see Section 2). We show that when the jump parameter satisfies ℓ ≤ (1/2 − ε)n, the black-box complexity satisfies the same upper bound of (2 + o(1)) n/log₂ n that is the best known bound for the easy OneMax test function class. Note that such ℓ are actually quite large: all search points with Hamming distance between 1 and ℓ from the optimum lie on the plateau of low fitness, so this plateau has exponential size and linear diameter. For even larger jump sizes, we show a slightly weaker upper bound (with the asymptotic notation referring to n tending to infinity). This second bound does not make the leading constant precise when n − 2ℓ is constant, in particular, not for the extreme case in which the jump function reveals no fitness level on the plateau except for the middle level n/2 (for even n) or the two middle levels (n − 1)/2 and (n + 1)/2 (for odd n). For such extreme jump functions (and thus also for all others), we show an upper bound of n + O(√n).
These upper bounds are asymptotically of the right order of magnitude. This follows from the information theoretic argument (Theorem 4) in Droste et al. (2006): an optimization problem over a search space S such that each element of S is the unique solution to an instance of the problem has a black-box complexity of at least (1 − o(1)) log₂|S| / log₂ k, where k is the maximum number of different answers a query can have. For jump functions with jump size ℓ, this gives a lower bound of (1 − o(1)) n / log₂(n − 2ℓ + 1), with the asymptotics being with respect to n tending to infinity. Consequently, for larger jump sizes, our upper bounds and this lower bound are identical up to the leading constant. For smaller ℓ, they differ by at most a factor of 2. This is the same gap that exists for the black-box complexity of OneMax, an open problem pointed out (in a different context) already fifty years ago in the famous paper by Erdős and Rényi (1963). For constant-size values of n − 2ℓ, we also see a substantial gap between our upper bounds and the information theoretic lower bound. For example, in the case of an extreme jump function for even n, we have the three different fitness values 0, n/2, and n as possible answers to queries, and hence the lower bound is only n / log₂ 3 ≈ 0.63n. For odd n, we have the four fitness values 0, (n − 1)/2, (n + 1)/2, and n, giving a lower bound of n/2 only. In both cases, our upper bound is n + O(√n) and thus quite far away.
The reason is that the information theoretic argument pretends that at all times, all k answers may occur, and moreover, occur with similar frequency. This is clearly an overly optimistic view, as the following three examples show:
Once an optimal solution is found, the search stops. Hence such nodes of the decision tree have no children, even though some of them appear at a small distance from the root.
In an optimization problem where each search point is the unique optimum of exactly one instance, the different answers to a query typically do not occur with similar frequency, as the optimal answer occurs at most once.
Correlations between the answers may also lead to a smaller information gain than assumed by the information theoretic lower bound. For simplicity, let us regard the OneMax problem, but the same effect exists for jump functions, most notably (as we will see) for extreme jump functions when n is odd. Once the answer to the first query is received, we can predict the parity of the answers to all subsequent queries, knowing only the parity of the number of one-bits in a query. Thus, when analyzing algorithms for this problem, we may safely assume that for every query except the first one, the number of different possible answers is reduced from n + 1 to at most ⌈(n + 1)/2⌉.
We significantly extend the information theoretic bound so that it can take such reasons for a smaller information gain into account, that is, exploit them to prove stronger lower bounds. Our matrix lower bound theorem gives improved lower bounds for all black-box complexities of jump functions. In particular, for extreme jump functions, we raise the lower bounds from n/log₂ 3 to (1 − o(1))n when n is even and from n/2 to (1 − o(1))n when n is odd. Note that the larger number of fitness values for odd n now has a much smaller influence on the result than under the classic information theoretic bound. While these results determine the leading constant, we could not prove with our general lower bound theorem that the Θ(√n) term in the upper bound is necessary. For this, we add an extra argument showing that, as our upper bound suggests, when optimizing an extreme jump function, the first Θ(√n) fitness evaluations typically reduce the size of the solution space only by a constant factor. This argument together with the matrix lower bound proves that the black-box complexity of extreme jump functions is indeed n + Θ(√n).
The rest of the article is structured as follows. In Section 2, we make precise the definitions of black-box complexity and jump functions. We also summarize the state of the art concerning the optimization and complexity of jump functions. Section 3 is dedicated to upper bounds on Jump, which are proven by giving the corresponding algorithms and discussing their complexity. In Section 4, the matrix lower bound theorem is formulated and proven. It resembles Theorem 4 of Droste et al. (2006) but gives better lower bounds under certain conditions. We apply this theorem in Section 5 to the Jump problem. Section 6 describes the refinement of the lower bound in the case of extreme jump sizes. Section 7 concludes the article with, among other things, some open problems.
A preliminary version of a large part of these results appeared as a conference paper (Buzdalov et al., 2015). Apart from giving a more detailed introduction, which was impossible in the conference paper format, we have refined the lower bound for extreme jump sizes up to n + Ω(√n) to match the upper bound (see Section 6); this seems to be impossible by a straightforward application of the matrix lower bound theorem and required substantial additional effort.
In this section, we define the notion of unrestricted black-box complexity, we make precise the definition of the jump functions we regard, and we review the existing results in runtime analysis and black-box complexity for jump functions.
2.1 Unrestricted Black-Box Complexity
The notion of black-box complexity was introduced by Droste et al. (2003) to investigate how difficult a problem is to solve with general-purpose randomized search heuristics such as evolutionary algorithms. In simple words, this black-box complexity (now sometimes called unrestricted black-box complexity to distinguish it from more restricted variants developed later) is the number of fitness evaluations needed to solve the problem. Let us make this notion precise.
By a problem, we shall always mean a set F of pseudo-Boolean functions f : {0,1}ⁿ → ℝ (problem instances), with the implicit meaning that the task is to find a global maximum of such a function. A black-box algorithm A for F is a (possibly randomized) algorithm that takes as input a function f ∈ F and tries to maximize it in a black-box fashion. This means that the algorithm has no access to an explicit description of f, but can only evaluate f at arbitrary search points x ∈ {0,1}ⁿ. In this unrestricted black-box model, we make no other limiting assumptions on the computational power of the algorithm. In particular, the algorithm may store all previously regarded search points and their fitnesses and may conduct arbitrary computations with these.
Let OneMax be the class of all functions OneMax_z : {0,1}ⁿ → ℝ, x ↦ |{i | x_i = z_i}|, with z ∈ {0,1}ⁿ. Let A be the randomized local search heuristic (start with a random search point and then repeat flipping one bit, evaluating the new search point, and keeping the better of parent and offspring). It is an easy exercise to show that A finds the optimum of any function OneMax_z in an expected number of O(n log n) fitness evaluations. Hence T(A, OneMax_z) = O(n log n) for all z, where T(A, f) denotes the expected number of fitness evaluations of A on f until an optimum is queried for the first time. Consequently, the black-box complexity of OneMax is at most O(n log n). It is much harder to prove that there is a better black-box algorithm A for OneMax that has a complexity of only O(n/log n), a result shown in the context of information theory (Erdős and Rényi, 1963), in the context of the Mastermind guessing game (Chvátal, 1983), and in the evolutionary computation community (Anil and Wiegand, 2009). The first and second references as well as Droste et al. (2006) also prove that for any black-box algorithm A for OneMax, there is a z such that T(A, OneMax_z) = Ω(n/log n). Consequently, the black-box complexity of OneMax is Θ(n/log n).
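As a quick illustration (our own sketch, not part of the original analysis), the randomized local search heuristic described above can be implemented for OneMax-type instances as follows; all function and variable names are ours:

```python
import random

def onemax(z, x):
    """OneMax_z(x): number of positions in which x agrees with the hidden string z."""
    return sum(zi == xi for zi, xi in zip(z, x))

def rls(f, n, rng):
    """Randomized local search: flip one uniformly chosen bit, keep the better
    of parent and offspring.  Returns the number of fitness evaluations made
    until the optimum (fitness n) is queried."""
    x = [rng.randrange(2) for _ in range(n)]
    fx = f(x)
    evals = 1
    while fx < n:
        i = rng.randrange(n)
        y = list(x)
        y[i] ^= 1
        fy = f(y)
        evals += 1
        if fy >= fx:
            x, fx = y, fy
    return evals

rng = random.Random(1)
n = 64
z = [rng.randrange(2) for _ in range(n)]
evals = rls(lambda x: onemax(z, x), n, rng)
```

By a coupon-collector argument, the expected number of evaluations is O(n log n), which matches the upper bound discussed above.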
It is an ongoing discussion whether this is the best measure of problem difficulty for randomized search heuristics in evolutionary computation, and several alternative definitions have been proposed, none of which, however, could clearly be shown to be more appropriate. We refer the reader to the works cited in the introduction or to the tutorial by Doerr and Doerr (2014).
2.2 Jump Functions
The Jump functions form another popular class of test functions. While OneMax is used to analyze how evolutionary algorithms cope with easy optimization problems, Jump is intended as a difficult function on which elitist optimization needs to flip several bits at once. There are several similar definitions of jump functions. They all have in common that a OneMax function is modified by giving all search points within a certain radius around the optimum a low fitness. Consequently, typical hill-climbers find it easy to come close to the optimum (namely, up to a search point with Hamming distance ℓ from the optimum), but then struggle to jump over the valley of low fitness to the optimum.
Our definition of the jump functions, which is also used in Doerr et al. (2014a, 2016), not only has the ℓ highest suboptimal fitness levels of the function blanked out, but also the fitness levels 1, …, ℓ. This avoids the trivial solution in which a black-box algorithm first minimizes the jump function, with a performance essentially equal to that of optimizing OneMax, and then, once a solution x with fitness 0 is found, inverts x and (with high probability, as the algorithm might have started in a blanked-out region with nonzero probability) finds the optimum. A very similar definition of jump functions, also in the context of black-box complexity, was used in Lehre and Witt (2010). The only difference there is that for a given jump value, slightly different numbers of fitness levels are blanked out at the high and at the low end. For contexts outside of complexity theory, one usually does not care about the fitness levels at the low end, because a heuristic building on the hill-climbing paradigm will rarely encounter these levels, and if it does, it will not profit from having the exact fitness available. Such a definition was used, for example, in Droste et al. (2002). Recently, Jansen (2015) introduced yet another type of jump function that, roughly speaking, agrees with the definition of Droste et al. (2002) except that the optimum is located at an unknown point in the plateau (or deceptive region), which is a Hamming ball of radius ℓ. Thus this function class contains nonisomorphic fitness landscapes (whereas the previous classes contain just a single isomorphism type). This function class has a much larger black-box complexity when ℓ is constant, very different from the previously discussed classes.
It may be closer to the original intuition behind the jump functions; however, several results other than black-box complexity results seem to fail for this class, for example, the proof that crossover can greatly speed up the optimization of jump functions (Jansen and Wegener, 2002).
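To make the definition used in this article concrete, the following sketch (our own, derived from the verbal description above) expresses the jump fitness as a function of the Hamming distance to the optimum; the fitness levels 1, …, ℓ and the ℓ highest suboptimal levels are blanked out to 0:

```python
def jump(n, ell, dist_to_opt):
    """Jump_ell fitness as a function of the Hamming distance to the optimum.
    With om = n - dist_to_opt (the OneMax-like value), the value om is revealed
    only at the optimum itself and on the levels strictly between ell and
    n - ell; all other levels are blanked out to 0."""
    om = n - dist_to_opt
    if om == n or ell < om < n - ell:
        return om
    return 0
```

For even n and ℓ = (n − 2)/2, this is the extreme jump function: only the fitness values 0, n/2, and n can ever be observed.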
2.3 Runtime Analysis and Black-Box Complexity of Jump Functions
Jump functions are generally difficult to optimize for classic randomized search heuristics. This was first shown by Droste et al. (2002) for their definition of jump functions and the (1 + 1) evolutionary algorithm, but it is quite easy to see that many other classic randomized search heuristics also have a comparably large optimization time for any jump function definition in which the highest suboptimal fitness levels are reset to low fitness values. Jansen and Wegener (2002) showed that a steady-state genetic algorithm using uniform crossover (however, only at a small rate) can optimize the jump functions defined in Droste et al. (2002) in polynomial time when the jump size is constant. It is easy to see that this result also holds for all other definitions of jump functions discussed previously, with the exception of the definition of Jansen (2015). For the latter, in fact, his lower bound on the black-box complexity implies that the use of crossover cannot give advantages similar to those it gives for the other types of jump functions.
While there are some black-box complexity analyses for jump functions, surprisingly none of them uses the most classic unrestricted black-box complexity model; in fact, they all use the unbiased model introduced in Lehre and Witt (2012). Here, only the restricted class of unbiased black-box algorithms is regarded. While the precise definition is nontrivial, roughly speaking, a black-box algorithm is called unbiased if it satisfies three properties:
Each search point must be chosen uniformly at random or must be created from previous search points via a variation operator.
All variation operators must be unbiased, that is, treat the bit-values 0 and 1 in a symmetric way and treat the bit positions in a symmetric way.
All actions of the algorithm apart from what happens inside a variation operator may depend only on the search history and the fitness values observed, but not on the bit-string representation of the individuals.
This algorithm model includes many classic randomized search heuristics, but excludes, for example, those using one-point crossover.
For this unbiased black-box complexity model, Doerr et al. (2014a, 2016) show the following results. If ℓ ≤ (1/2 − ε)n for an arbitrary constant ε > 0, then the unbiased black-box complexity is polynomial. The same holds in the case of extreme jump functions for even n, that is, for ℓ = (n − 2)/2. It is clear that these results are valid for the unrestricted black-box complexity as well; however, our results are stronger in that they provide much more precise bounds (making the leading constant precise and, in the case of extreme jump functions, also the lower-order terms up to order √n) and regard wider ranges of ℓ (namely, all values of ℓ).
3 Upper Bounds for the Black-Box Complexity of Jump
For sufficiently large n, for all and the cases of all even , it holds that .
This is proven in Doerr, Johannsen, et al. (2011) as Statement 8.
For sufficiently large n, for a fixed ε > 0, for ℓ ≤ (1/2 − ε)n, and for x taken uniformly at random from {0,1}ⁿ, the probability for Jump_ℓ(x) to be zero is at most 2exp(−2ε²n).
The value of OneMax_z(x) for a random x has a binomial distribution with parameters n and 1/2. From Hoeffding’s inequality (see Hoeffding, 1963, or Theorem 1.11 in Doerr, 2011), for λ > 0, the probability that such a binomially distributed value deviates from its expectation n/2 by at least λ is bounded from above by 2exp(−2λ²/n). As Jump_ℓ(x) = 0 implies a deviation of at least n/2 − ℓ ≥ εn, the probability for Jump_ℓ(x) to be zero is at most 2exp(−2ε²n).
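The Hoeffding estimate used in this proof can be sanity-checked numerically against the exact binomial tail (our own check, not part of the original proof; the function names are ours):

```python
from math import comb, exp

def tail_prob(n, lam):
    """Exact P(|X - n/2| >= lam) for X ~ Bin(n, 1/2)."""
    return sum(comb(n, i) for i in range(n + 1)
               if abs(i - n / 2) >= lam) / 2 ** n

def hoeffding_bound(n, lam):
    """Two-sided Hoeffding bound 2 * exp(-2 * lam**2 / n)."""
    return 2 * exp(-2 * lam ** 2 / n)
```

For instance, for n = 100 and λ = 20, the exact tail is well below the Hoeffding bound, as it must be.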
The following result extends the black-box complexity analysis of Doerr, Johannsen, et al. (2011) for OneMax to Jump functions.
Assume that n is sufficiently large and . Let . Let X be a (multi-)set of elements from chosen randomly using uniform distribution and mutually independently. Then the probability that there exists a such that and for all , is at most .
We define A_d as the set of points that differ from z in exactly d positions.
We say that a point y agrees with a query x if the instances with optima y and z return the same answer to x. This means that either both answers are zero or the underlying OneMax values with respect to y and z coincide. The probability of the former does not exceed 2exp(−2ε²n) by Lemma 2. The latter holds if and only if x and y, as well as x and z, differ in exactly half of the bits in which y and z differ. To sum up, if y ∈ A_d, the probability for y to agree with a random x is at most 2exp(−2ε²n) for odd d, and at most 2exp(−2ε²n) + C(d, d/2)·2^{−d} for even d. As for large enough n the first summand is negligible, the latter is at most of order 1/√d.
3.2 Upper Bound for Smaller ℓ
Using Theorem 3, we easily obtain the following upper bound for the unrestricted black-box complexity of jump functions with small jump sizes ℓ. This bound is asymptotically equal to the best known upper bound for OneMax. Also note that, trivially, any lower bound for the black-box complexity of OneMax also holds for any jump function class with problem size n. Consequently, the best known lower bound for OneMax, which is a factor of 2 below these upper bounds, also holds for the jump functions regarded in this subsection. In a sense, these results show that for solving OneMax, the inner fitness levels alone are sufficient.
If ℓ ≤ (1/2 − ε)n for a constant ε > 0, the unrestricted black-box complexity of Jump_ℓ is at most (2 + o(1)) n/log₂ n, where the asymptotic notation refers to n tending to infinity.
We use the same algorithm that is used in Doerr, Johannsen, et al. (2011) for proving the upper bound for OneMax. We select randomly and independently t queries and check whether there exists a single optimum z that agrees with all these queries (a query q with an answer a agrees with an optimum z if the instance with optimum z answers a to the query q). If there is more than one possible z, we repeat the whole procedure. The cost of one invocation of this procedure is t queries. The probability of not finding a unique optimum is at most o(1) by Theorem 3. Thus the expected complexity of the algorithm is at most (1 + o(1)) t.
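For very small n, the strategy of this proof, querying random strings until a unique consistent optimum remains, can be simulated directly. The brute-force consistency check below is our own illustration (it is exponential in n and only serves as a demonstration); for simplicity, the answers are the underlying OneMax values, which on a jump function works as long as the queried values fall into the revealed band, which is what Theorem 3 quantifies:

```python
import random
from itertools import product

def onemax(z, x):
    return sum(zi == xi for zi, xi in zip(z, x))

def find_unique_optimum(n, answers):
    """Return the unique candidate z consistent with all (query, answer)
    pairs, or None if the candidates are not yet pinned down."""
    consistent = [z for z in product(range(2), repeat=n)
                  if all(onemax(z, q) == a for q, a in answers)]
    return consistent[0] if len(consistent) == 1 else None

def random_query_algorithm(z, t, rng):
    """Repeat batches of t uniformly random queries until they determine z."""
    n = len(z)
    queries = 0
    while True:
        batch = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(t)]
        queries += t
        answers = [(q, onemax(z, q)) for q in batch]
        opt = find_unique_optimum(n, answers)
        if opt is not None:
            return opt, queries

rng = random.Random(0)
z = (1, 0, 1, 1, 0, 0, 1, 0)
opt, queries = random_query_algorithm(z, t=12, rng=rng)
```

Since the true optimum is always consistent with all answers, the procedure returns exactly the hidden string once the candidate set has shrunk to size one.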
It is not difficult to see that, with slightly more care, Theorems 3 and 4 can be shown to hold also for somewhat larger values of ℓ, however, not for all ℓ up to the extreme case. For this reason, we do not follow this path. Instead, in the next subsection we propose an algorithm that works for even larger ℓ, but gives the same results that an extension of the previous theorem would have given in the small range where such an extension would have been possible.
3.3 Upper Bound for Larger ℓ
For larger ℓ, finding an optimum of Jump_ℓ can be reduced to finding the optima of several jump functions with smaller dimensions and jump sizes, for which the algorithm from the previous subsection suffices.
For , the unrestricted black-box complexity of is at most , where and refer to .
We reduce our problem to that of optimizing jump functions of a smaller dimension s, where s is chosen such that Theorem 4 applies. The algorithm is outlined in Figure 1.
First, the algorithm finds the maximum even s such that Theorem 4 is applicable for solving the subproblems of dimension s. After that, the algorithm finds a string x with exactly n/2 correct bits using random queries. The probability that the number of correct bits is exactly n/2 for a uniformly random query is C(n, n/2)/2ⁿ, which is Θ(1/√n) by Stirling’s formula. This means that the string x can be found using O(√n) expected queries.
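The Θ(1/√n) probability claimed here can be verified numerically: by Stirling’s formula, C(n, n/2)/2ⁿ ≈ √(2/(πn)) (our own check; function names are ours):

```python
from math import comb, pi, sqrt

def middle_level_prob(n):
    """Probability that a uniform bit string of even length n has exactly
    n/2 ones, i.e. C(n, n/2) / 2**n."""
    return comb(n, n // 2) / 2 ** n

def stirling_estimate(n):
    """Stirling approximation sqrt(2 / (pi * n)) of the same quantity."""
    return sqrt(2 / (pi * n))
```

Already for moderate n, the exact probability is within a few percent of the Stirling estimate.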
After finding x, the algorithm splits all bit indices into sets of size s in such a way that, on each set, x and the answer agree in exactly half of the bits. To be precise, the last such set may have fewer indices, in which case x and the answer agree in half of its positions. This is done in lines 8–15 in Figure 1, where b_i is the i-th such set. B, the set of yet undistributed bits, is maintained such that x and the answer agree in exactly half of the indices of B.
Next, the algorithm separately optimizes the bits of each subset b_i using the algorithm for small Jump from Theorem 4 (lines 17–20 in Figure 1). If every query for a subproblem on the bits from b_i is forwarded to the main function f with all bits not from b_i taken from x, the resulting subproblem becomes exactly a smaller Jump problem with the following corrections:
from all nonzero answers, the (fixed) number of correct bits outside b_i needs to be subtracted;
at the optimum of the subproblem, zero will be returned.
The latter correction, however, does not change the algorithm much, because the algorithm from Theorem 4 does not actually query the optimum point. Line 19 in Figure 1 collects the partial answers one by one: it sets the bits of a_i at the corresponding positions from b_i to the previous partial answer and returns the updated value.
3.4 Upper Bound for Extreme Jump
The algorithm from Theorem 5 cannot be applied to the case of the extreme Jump function, because then k would be zero and the subproblems would be extreme Jump problems as well. In this case we use another algorithm, which is given in the proof of the following theorem.
The unrestricted black-box complexity of the extreme Jump problem is at most n + O(√n).
As described in the proof of Theorem 5, one can find a point x such that f(x) = n/2 using O(√n) expected queries. After that, if one flips two bits of x, the value of f remains the same if and only if one of these bits was correct and the other was not.
We denote by a ⊕ b the bitwise exclusive OR of bit strings a and b of equal length, and by e_i the bit string that has a one only at position i. The algorithm queries f(x ⊕ e_1 ⊕ e_i) for all 2 ≤ i ≤ n, and if the value equals n/2, the bit b_i is set to one, otherwise to zero (with b_1 = 0). This results in n − 1 queries. After that, if the first bit of x is correct, then x ⊕ b is the answer, otherwise its bitwise inverse is. One has to make a single query, x ⊕ b, to determine which of the two is true. The complexity of this algorithm is n + O(√n).
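The main phase of this algorithm, the n − 1 two-bit flips plus the final disambiguating queries, can be simulated end to end (our own sketch; for simplicity, the starting point x with f(x) = n/2 is constructed directly rather than found by random sampling, and all names are ours):

```python
import random

def make_extreme_jump(z):
    """Extreme jump instance with hidden optimum z (length n even):
    only the fitness values 0, n/2, and n are ever revealed."""
    n = len(z)
    def f(x):
        om = sum(zi == xi for zi, xi in zip(z, x))
        return om if om in (n // 2, n) else 0
    return f

def solve_extreme_jump(f, x):
    """Given x with f(x) = n/2, recover the optimum with n - 1 pair flips
    plus at most two further queries."""
    n = len(x)
    def flipped(y, i, j):
        y = list(y)
        y[i] ^= 1
        y[j] ^= 1
        return tuple(y)
    # b[i] = 1 iff flipping bits 0 and i keeps the fitness at n/2,
    # i.e. exactly one of the two bits agrees with the optimum.
    b = [0] * n
    for i in range(1, n):
        if f(flipped(x, 0, i)) == n // 2:
            b[i] = 1
    # If bit 0 of x is correct, the optimum is x with the b-marked bits
    # flipped; otherwise it is the complement of that string.
    cand = tuple(xi ^ bi for xi, bi in zip(x, b))
    if f(cand) == n:
        return cand
    return tuple(1 - c for c in cand)

rng = random.Random(7)
n = 12
z = tuple(rng.randrange(2) for _ in range(n))
f = make_extreme_jump(z)
# A starting point agreeing with z in exactly n/2 positions.
x = tuple(zi if i < n // 2 else 1 - zi for i, zi in enumerate(z))
recovered = solve_extreme_jump(f, x)
```

The total number of queries in this phase is n − 1 plus at most two, which together with the O(√n) queries for finding x gives the claimed n + O(√n) bound.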
4 The Matrix Lower Bound Theorem
Let S be the search space of an optimization problem such that for each s ∈ S there exists an instance for which s is the unique optimum. Let each query have one of T types, such that for any query q of the i-th type the following holds:
there is exactly one answer to the query q that indicates that q is an optimum;
there are at most a_{ij} answers such that the next query after such an answer belongs to the j-th type.
Define A = (A_{ij}), 1 ≤ i, j ≤ T, to be a T × T matrix such that:
A_{ji} = a_{ij} for all 1 ≤ i, j ≤ T (note the transposition).
Let the first query in the optimization process be of type 1. Define D_d = A^{d−1} e₁ to be a vector, where e₁ = (1, 0, …, 0)ᵀ is the first basis vector of dimension T and d ≥ 1. Then the following statements are true:
‖D_d‖₁, the sum of the components of D_d, is the maximum total number of possible queries with depth d in the decision tree, where the depth of the root is equal to one.
The lower bound on the average depth of N nodes is (1/N)·(Σ_{j=1}^{d−1} j·‖D_j‖₁ + d·(N − Σ_{j=1}^{d−1} ‖D_j‖₁)), where d is the integer such that Σ_{j=1}^{d−1} ‖D_j‖₁ < N ≤ Σ_{j=1}^{d} ‖D_j‖₁.
The unrestricted black-box complexity of the considered optimization problem is not less than this lower bound on the average depth of N = |S| nodes.
According to Yao’s minimax principle (Yao, 1977), the worst-case expected runtime of a randomized algorithm is not less than the average runtime of the best deterministic algorithm over all possible inputs. Thus we construct a lower bound on the complexity of a randomized algorithm by lower-bounding the average performance of any deterministic algorithm over all possible inputs. A deterministic algorithm can be represented as a (rooted) decision tree with nodes corresponding to queries and arcs going downwards corresponding to answers to these queries. A lower bound on the average performance of deterministic algorithms is obtained, just as in Droste et al. (2006), by assigning different queries to different nodes of a tree such that their average depth is minimized, and then considering all such trees and taking the minimum over them.
It should be noted that, if a (fixed) set of queries is to be assigned to nodes of a (fixed) rooted tree such that the average depth of these queries is minimized, an optimal assignment can be constructed greedily: each query should be assigned to a free node of the minimum possible depth. Indeed, assume that an optimal assignment does not use some node a of depth d while using a node b of greater depth. Then one can move the query from node b to node a, which decreases the average depth, so the initial assignment is, in fact, not optimal.
Next, we show that, in order to minimize the average depth, one needs to consider only the complete tree, that is, the tree in which every query of the i-th type has, for each j, exactly a_{ij} answers leading to a query of the j-th type. Indeed, if an assignment is feasible for an incomplete tree, it is feasible for the complete tree as well, because all the nodes of any incomplete tree are preserved in the complete tree.
It is difficult to use this theorem straightaway, because the lower bound on the average depth of N vertices is not expressed in terms of N and the matrix A alone, but additionally requires finding the depth d that fulfils the balance condition of the theorem. However, for several common cases it is possible to make it more convenient.
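For a concrete matrix A, the bound of Theorem 7 can be evaluated numerically by growing the per-type query counts level by level and greedily filling the shallowest levels, in the spirit of the proof above (our own sketch; the function name is ours):

```python
def matrix_lower_bound(A, N):
    """Lower bound on the average depth of N queries in a decision tree whose
    per-type branching is limited by the (transposed) matrix A of Theorem 7:
    D_1 = e_1, D_{d+1} = A D_d, and the N queries greedily fill the
    shallowest available levels first."""
    T = len(A)
    D = [1] + [0] * (T - 1)          # one query of type 1 at depth 1
    remaining, total_depth, depth = N, 0, 1
    while remaining > 0:
        level = sum(D)               # max number of queries at this depth
        if level == 0:
            raise ValueError("tree exhausted before placing all N queries")
        used = min(level, remaining)
        total_depth += depth * used
        remaining -= used
        # next level: D[j] = sum over i of A[j][i] * D[i]
        D = [sum(A[j][i] * D[i] for i in range(T)) for j in range(T)]
        depth += 1
    return total_depth / N
```

With a single query type and k = 2 continuing answers, A = [[2]] yields the binary-tree bound: for N = 7 the depths 1, 2, 2, 3, 3, 3, 3 give an average of 17/7; with k = 1 every depth holds one query and the bound is (N + 1)/2.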
If there is only one type of query in Theorem 7, and such that , then for the search space S the lower bound on the average depth is at least .
One can see that and .
Note that, if and , grows when d grows, as .
For the case of k = 1, the lower bound is even stronger.
If there is only one type of query in Theorem 7, and k = 1, then for the search space S the lower bound on the average depth is at least (|S| + 1)/2.
In this case, one can show that ‖D_d‖₁ = 1 for all d, so each depth can contain at most one query. The average depth of N nodes is thus at least (1/N) Σ_{j=1}^{N} j = (N + 1)/2.
5 Lower Bounds for Jump
First, let us apply Theorem 8 directly to the Jump problem.
For any n and , the unrestricted black-box complexity of is at least .
In Jump_ℓ, the search space has size 2ⁿ. There are n − 2ℓ + 1 possible answers to a query, but one of them terminates the search process immediately, so k = n − 2ℓ. The result follows straightaway from Theorem 8.
The unrestricted black-box complexity of extreme Jump for even n is at least .
It follows from Theorem 10 by setting ℓ = (n − 2)/2.
The presented bounds are already an improvement over the previously known ones (for extreme Jump and even n, a lower bound of n/log₂ 3 follows from Droste et al., 2006). However, for odd n, Theorem 10 also yields only roughly n/log₂ 3, which is still quite far from the performance of the best known algorithms. Fortunately, the Jump problem possesses a particular property that can be used to refine the lower bounds using Theorem 7 with two types of queries.
For a query, define an answer to be nontrivial if it is neither 0 nor n. After receiving the first nontrivial answer, it is possible, for every subsequent query, to determine a priori the parity of any nontrivial answer.
Consider the optimum and a query. We introduce the following values:
q00: number of positions with zeros in both the optimum and the query;
q01: number of positions with zeros in the optimum and ones in the query;
q10: number of positions with ones in the optimum and zeros in the query;
q11: number of positions with ones in both the optimum and the query.
Any nontrivial answer equals q11 + q00 = n − q01 − q10, and q01 + q10 has the same parity as the sum of the numbers of one-bits in the query and in the optimum. Hence the first nontrivial answer reveals the parity of the number of one-bits in the optimum, after which the parity of every subsequent nontrivial answer can be determined in advance from the query alone. As a result, once an algorithm receives the first nontrivial answer, all subsequent queries have fewer possible answers.
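The parity argument can be checked by simulation on the underlying OneMax values (our own check; in the black-box setting, an algorithm of course sees these values only when they are nontrivial):

```python
import random

def om(z, x):
    """OneMax-type value: number of positions where x agrees with z."""
    return sum(zi == xi for zi, xi in zip(z, x))

rng = random.Random(3)
n = 11  # odd n
z = [rng.randrange(2) for _ in range(n)]

# Any answer equals q11 + q00 = n - q01 - q10, so its parity equals the
# parity of n + |x|_1 + |z|_1.  One answer therefore reveals |z|_1 mod 2.
first_query = [rng.randrange(2) for _ in range(n)]
first_answer = om(z, first_query)
z_parity = (first_answer + n + sum(first_query)) % 2

# From now on, the parity of every answer is predictable from the query.
for _ in range(200):
    x = [rng.randrange(2) for _ in range(n)]
    predicted = (n + sum(x) + z_parity) % 2
    assert om(z, x) % 2 == predicted
```

For odd n, the two nontrivial values (n − 1)/2 and (n + 1)/2 have different parities, so this prediction halves the number of possible nontrivial answers, which is exactly what the two-type refinement of Theorem 7 exploits.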
The unrestricted black-box complexity of extreme Jump for odd n is at least .
Note that Theorem 15 does not improve the bound for extreme Jump and even n: it remains equal to the bound of Theorem 10 with ℓ = (n − 2)/2, because in this case the number of possible nontrivial answers does not change after receiving the first nontrivial answer.
6 Refining the Lower Bound for Extreme Jump Sizes
For extreme jump functions, we gave a black-box algorithm finding the optimum in time n + O(√n), whereas the lower bound stemming from the matrix lower bound theorem of Section 4 was (1 − o(1))n. The gap between these two bounds is relatively small compared to most black-box complexity results; recall that, for example, for the OneMax problem not even the leading constant of the complexity is known. Nevertheless, it is an interesting question whether the O(√n) term in the upper bound is necessary or only stems from insufficient proof methods. So far, the upper bound proof suggests that black-box algorithms optimizing extreme jump functions have an initial phase of Θ(√n) rounds in which they gain relatively little information about the optimum. Only once they have found a search point with nonzero fitness do they become more efficient and solve the problem in roughly n additional iterations.
In this section, we show that the Θ(√n) term is indeed necessary; that is, the black-box complexity of the extreme jump functions is n + Θ(√n). Our proof also shows that it indeed cannot be avoided that the first Ω(√n) fitness evaluations reveal only little information about the optimum.
The remainder of this section is organized as follows. In Section 6.1, we reduce the problem of finding a lower bound for extreme Jump to a minimization problem over decision trees having a particular structure. This is where we exploit particular properties of optimizing extreme jump functions. In Section 6.2, we solve this minimization problem in a general form via a recursive argument. In Section 6.3, we optimize the obtained lower bound by choosing the best possible value for the parameter t of the minimization problem. Finally, in Section 6.4, we derive from the results of Section 6.3 our improved lower bounds for the black-box complexity of extreme jump functions.
6.1 Representing a Deterministic Algorithm for Extreme Jump
Consider an extreme jump problem, for simplicity here for even n.1 As we argued in Section 4 already, a lower bound for the black-box complexity can be obtained by regarding the best average performance a deterministic black-box algorithm can have (where the average is taken over all instances, here over all extreme jump functions over $\{0,1\}^n$). Hence, let us consider a deterministic black-box algorithm for the extreme jump problem (and argue that it cannot have too good an average performance).
Before starting this argument, we remind the reader that the extreme jump problem has the following properties: (i) for each point of the search space there is exactly one problem instance having this point as its optimal solution, and (ii) all instances have unique optimal solutions. Consequently, we can (and will) identify the problem instances with their unique optima.
A deterministic black-box algorithm gives rise (and in fact is equivalent) to the following type of decision tree. In a decision tree for a search space S (recall that we use the points of $S = \{0,1\}^n$ as representatives of the (unique) extreme jump functions with these optima), each node v is labeled with a subset $S_v \subseteq S$ (“remaining search space”); the root is labeled with S. Each internal node v is also labeled with a query $q_v \in \{0,1\}^n$. If $q_v$ has the answer i for at least one $z \in S_v$, then v has an outgoing edge, labeled i, to a node w labeled with the set $S_w$ of all $z \in S_v$ having the answer i to the query $q_v$. Consequently, the labels $S_w$ of the children w of v form a partition of $S_v$. A node whose ingoing edge is labeled with an optimal answer (here n) has no children. These are the only nodes without children; this last requirement stems from the fact that we require the black-box algorithm not only to “know” the optimal solution, but also to query it.
It is clear that each deterministic black-box algorithm gives rise to a decision tree and that each decision tree describes a black-box algorithm. By our observation that there is a bijection between the extreme jump functions and their unique optima, we see that at each leaf the set of possible solutions contains just a single element and that the average performance of the algorithm on a random instance is exactly the average depth of the leaves of the tree. Consequently, the black-box complexity of the extreme jump functions is the smallest possible average depth of the leaves of a decision tree.
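This equivalence can be made concrete in a few lines of Python. The sketch below (all names are ours, for illustration only) computes the average number of queries a deterministic rule needs over all instances, which by the above equals the average leaf depth of the induced decision tree; the naive enumeration rule produces a path-shaped tree:

```python
from itertools import product

def extreme_jump(z, x):
    """Extreme jump fitness for even n: n at the optimum z, n // 2 when
    x agrees with z in exactly n // 2 positions, and 0 otherwise."""
    n = len(z)
    agree = sum(a == b for a, b in zip(z, x))
    if agree == n:
        return n
    return n // 2 if agree == n // 2 else 0

def average_depth(n, algorithm):
    """Average number of queries `algorithm` makes over all 2^n instances;
    this equals the average leaf depth of its decision tree."""
    total = 0
    for z in product((0, 1), repeat=n):
        history = []                  # (query, answer) pairs: the tree path
        while True:
            x = algorithm(n, history)
            history.append((x, extreme_jump(z, x)))
            if x == z:                # the optimum must actually be queried
                break
        total += len(history)
    return total / 2 ** n

def exhaustive(n, history):
    """Naive rule: enumerate all strings in a fixed order, ignoring all
    answers.  Its decision tree is a path; average depth is (2^n + 1) / 2."""
    return tuple((len(history) >> i) & 1 for i in range(n))

print(average_depth(4, exhaustive))   # (2^4 + 1) / 2 = 8.5
```

Good algorithms correspond to much more balanced trees, and the argument below shows that for extreme jump such balance is impossible near the root.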
By analyzing the structure of the decision trees for the extreme jump problem, we shall show improved lower bounds for their black-box complexity. To this aim, note first that if a query q gets the answer $n/2$, one knows that the remaining search space contains only the binary strings of length n which have exactly $n/2$ bits in common with the query q. There are at most $\binom{n}{n/2}$ such strings.2 From this innocent observation, we derive that the decision tree cannot be very balanced. The crucial property to look at is the length of the maximal path starting at the root of the tree with all edges labeled with the answer 0. Figure 2 shows a decision tree for extreme Jump with this path highlighted as the trunk of the tree. Note that this trunk is formed by the queries which are made before any nonzero answer is received. The branches above the trunk in Figure 2 are used when the answer n is received; in these cases the algorithm immediately stops. The branches below the trunk correspond to the cases when the answer $n/2$ is received.3 The values $s_i$ are the remaining search space sizes for the corresponding branches. These values satisfy $s_i \le \binom{n}{n/2}$, as follows from the above. The value t is the total number of queries in the trunk; that is, the t-th query in the trunk cannot have the answer 0.
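The counting observation can be verified by direct enumeration for a small even n; here the middle answer n/2 and the bound C(n, n/2) are the values suggested by the even-n discussion above:

```python
from itertools import product
from math import comb

def ring_size(q):
    """Number of candidate optima z giving query q the middle answer,
    i.e., agreeing with q in exactly n // 2 positions."""
    n = len(q)
    return sum(
        1 for z in product((0, 1), repeat=n)
        if sum(a == b for a, b in zip(q, z)) == n // 2
    )

q = (0, 1, 1, 0, 1, 0, 0, 1)       # an arbitrary query for n = 8
print(ring_size(q), comb(8, 4))    # both are 70, independently of q
```

The count is attained with equality when nothing else is known about the optimum; earlier answers can only shrink it, hence the “at most”.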
Let the total search space size be s, then . If we fix both s and t, any lower bound l(s, t) on the average leaf depth of the decision tree of an optimal deterministic algorithm with this fixed t is also a lower bound on the expected runtime of any black-box search algorithm with this fixed t. After that, we can minimize l(s, t) over t for fixed s to obtain a lower bound valid for any black-box search algorithm solving extreme Jump.
6.2 Lower Bound for Fixed s and t in General Form
For the sake of brevity, we define a function and a function . Note that and . As is convex downwards for every fixed p, the following lemma holds.
The following theorem gives a lower bound for f(s, t) which we use later.
We prove this theorem by induction on t.
Induction step. Figure 3 shows the feasible region for u and the regions corresponding to the same pieces from the definition of on the Cartesian plane with axes of and .
We prove the induction step in parts corresponding to the pieces of the definition of f(s, t).
For the proof follows exactly the same scheme as in the induction base.
Consider and . These constraints correspond to lower triangles in Figure 3, examples of which are shown as triangles A and C. Triangle C corresponds to the situation when and is used as x for . If x = 1, which corresponds to triangle A, the first clause from the definition of should be used; however, in this case it is precisely equal to what one obtains by substituting x = 0 into the second clause. Hence we can evaluate both cases (triangles A and C) simultaneously.
- The minimum point for the expression in the minimization clause is, by Lemma 16, . This value is infeasible as , and all feasible values of u lie on the same side of the minimum point, so the closest one should be used: . This results in the following bound:
- Consider and . These constraints correspond to upper triangles in Figure 3, of which triangle B is an example. We denote the interval as . By definition:
- The minimum point for the expression in the minimization clause is, by Lemma 16, . As and , this minimum point is infeasible. As all the feasible values of u lie on the same side of the minimum point, the closest one should be used: . This results in the following bound:
- As parts 2 and 3 correspond to the same values of x but to different ranges of u, we need to take the minimum of the two. We denote by the lower bound from part 2 and by the lower bound from part 3. To prove that the former is always the smaller of the two, it suffices to show that . Here we denote by z, so that . Thus we have: , where the first block has j = 0 and the second block has . The function , by Lemma 16, has a minimum point at , so it decreases for , which means that .
As follows from part 4, the lower bound for f(s, t) at is always the expression from part 2. For the last piece of the definition, , the expression is the only lower bound to consider. This proves the induction step and hence the whole theorem.
6.3 Minimizing the General Form over t
In this section, we use the lower bounds for f(s, t) from Theorem 17 and minimize them over t. For the case we relax the lower bound: .
6.4 Applying the Lower Bounds to Extreme Jump
The search space size for extreme Jump is $s = 2^n$. For even n, $Q = \binom{n}{n/2}$. Note that $Q = \Theta(2^n/\sqrt{n})$, and also that $s/Q = \Theta(\sqrt{n})$.
For odd n, $Q = \binom{n}{\lfloor n/2 \rfloor}$, as noticed before, which yields the same asymptotics.
Note that in both the odd and the even cases, the value of Q plugged into the corresponding expression gives $\Omega(\sqrt{n})$. Hence, when running any deterministic black-box algorithm for extreme jump functions, it cannot be avoided that, on average, only 0-answers are received during the first $\Omega(\sqrt{n})$ iterations, and consequently the size of the remaining search space is still a constant fraction of the initial search space.
New black-box algorithms for solving the Jump problem are presented, giving the following upper bounds:
for : , where is measured when ;
for : , where and are measured when ;
for : .
A new theorem for constructing lower bounds on the unrestricted black-box complexity of problems is proposed. The underlying idea is that the influence of particular answers to queries on all subsequent queries can be formalized by assigning a type to each query and writing the relations in the form of a matrix. Several of the subsequent steps for constructing the lower bounds are automated and can be performed using tools like Wolfram Alpha. We hope that this theorem can be used to obtain better lower bounds for other problems.
Using the proposed theorem, the lower bounds for are updated:
for even n: ;
for odd n: .
For extreme Jump, the lower bounds produced by the matrix theorem are for even n and for odd n. We applied additional knowledge about the jump functions, namely the fact that the search space size decreases significantly when the algorithm receives an answer equal to $\lceil n/2 \rceil$ or $\lfloor n/2 \rfloor$. The lower bounds for extreme Jump were proven to be $n + \Omega(\sqrt{n})$, which matches the upper bound of $n + O(\sqrt{n})$. More precisely, the lower bound equals , while the upper bound equals , where $Q = \binom{n}{n/2}$ for even n and $Q = \binom{n}{\lfloor n/2 \rfloor}$ for odd n.
These new methods to prove lower bounds are interesting beyond the particular application to jump functions, since there are very few lower bound proofs in black-box complexity theory beyond the information theoretic argument from Droste et al. (2006), while there are many challenging problems still open. Let us point out two.
For the black-box complexity of the OneMax class, we know that it is between $(1 - o(1))\,n/\log_2 n$ and $(2 + o(1))\,n/\log_2 n$. It is not known where the truth lies. This problem was already raised by Erdős and Rényi (1963). The lower bound stems from the information theoretic argument, building on the fact that each query has at most $n + 1$ different answers. For a single query regarded independently, the information gain is much lower. In fact, for each query the vast majority of the instances have the answer lying in an interval of size $O(\sqrt{n \log n})$ around $n/2$. For this reason, nonadaptive black-box algorithms such as asking random queries (but also all other algorithms asking queries independent of the previous answers) necessarily have a complexity of at least $(2 - o(1))\,n/\log_2 n$; see again Erdős and Rényi (1963). To prove that this is also a lower bound for any black-box algorithm, one would need a lower bound technique that goes beyond the argument of counting the number of different answers.
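The concentration of answers can be checked numerically. For a uniformly random hidden instance, the answer to any fixed query is Binomial(n, 1/2)-distributed; the sketch below (function name and window widths are ours, for illustration) computes the exact fraction of instances whose answer falls within c·sqrt(n) of n/2:

```python
import math

def mass_near_half(n, c):
    """Exact fraction of instances (uniform over {0,1}^n) whose answer to a
    fixed query lies within c * sqrt(n) of n / 2; the answer follows a
    Binomial(n, 1/2) distribution with standard deviation sqrt(n) / 2."""
    r = int(c * math.sqrt(n))
    return sum(math.comb(n, k) for k in range(n // 2 - r, n // 2 + r + 1)) / 2 ** n

n = 10_000
print(mass_near_half(n, 1))   # about two standard deviations: most instances
print(mass_near_half(n, 3))   # all but a tiny fraction of instances
```

A window of width of order $\sqrt{n}$ thus already captures almost all answers, which is why a single nonadaptive query conveys noticeably fewer bits than the count of $n + 1$ possible answers suggests.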
A second challenging lower bound problem is the unbiased k-ary black-box complexity of the OneMax function class. The k-ary unbiased black-box complexity is the unbiased black-box complexity defined in Section 2.3 with the additional restriction that the unbiased variation operators take at most k previous search points as arguments. There are upper bounds for this black-box complexity decreasing with k (Doerr, Johannsen, et al., 2011; Doerr and Winzen, 2014c), which indicates a possibly stronger power of higher-arity operators. Unfortunately, no matching lower bounds confirming this exist. Worse, there are no lower bounds for the k-ary black-box complexities of OneMax for $k \ge 2$ at all that are stronger than the information theoretic $\Omega(n/\log n)$. For $k = 1$, Lehre and Witt (2012) show a lower bound of $\Omega(n \log n)$ by, very roughly speaking, showing that a unary unbiased black-box algorithm for OneMax cannot do much better than taking the best-so-far search point and mutating it by flipping a certain number of bits.
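For intuition, a unary unbiased variation operator is exactly an operator of the following shape: draw a flip radius k from some distribution, then flip k uniformly chosen positions. A minimal sketch (names are ours, not from the cited works):

```python
import random

def unary_unbiased(x, choose_k):
    """Apply a unary unbiased variation operator to x: draw a radius k,
    then flip k positions chosen uniformly at random without replacement.
    Unbiasedness means invariance under bit complementation and under
    permutations of the bit positions."""
    y = list(x)
    for i in random.sample(range(len(x)), choose_k()):
        y[i] ^= 1
    return y

# Standard bit mutation is the special case where k is Binomial(n, 1/n);
# here a fixed-radius example: flip exactly 3 of 20 bits.
child = unary_unbiased([0] * 20, lambda: 3)
print(sum(child))   # 3: three distinct positions were flipped
```

The Lehre and Witt result says, roughly, that no distribution over the radius k can push a unary unbiased algorithm on OneMax below order $n \log n$ queries.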
While it is not clear how to use our matrix lower bound theorem for either of these two problems, we hope that our work does give some motivation to push the lower bound question beyond the classic information theory argument.
This work was partially financially supported by the Government of the Russian Federation, Grant 074-U01. This research also benefited from the support of the FMJH Program Gaspard Monge in Optimization and Operation Research, and from the support to this program from EDF.
In this section, the footnotes will explain how the proof is adapted to odd n where necessary.
For odd n, there are two such answers, $\lceil n/2 \rceil$ and $\lfloor n/2 \rfloor$, each leaving room for $\binom{n}{\lfloor n/2 \rfloor}$ binary strings.
For odd n, each vertex has two children below the trunk, corresponding to the answers $\lceil n/2 \rceil$ and $\lfloor n/2 \rfloor$.
For odd n, the corresponding Q is $\binom{n}{\lfloor n/2 \rfloor}$. For the sake of the current bound, the two branches for the answers $\lceil n/2 \rceil$ and $\lfloor n/2 \rfloor$, each of the same maximum size $\binom{n}{\lfloor n/2 \rfloor}$, can be safely glued together.