Evolutionary multiobjective optimization for the classical vertex cover problem has been analysed in Kratsch and Neumann (2013) in the context of parameterized complexity analysis. This article extends the analysis to the weighted vertex cover problem, in which integer weights are assigned to the vertices and the goal is to find a vertex cover of minimum weight. Using an alternative mutation operator introduced in Kratsch and Neumann (2013), we provide a fixed parameter evolutionary algorithm with respect to OPT, the cost of an optimal solution for the problem. Moreover, we present a multiobjective evolutionary algorithm with standard mutation operator that keeps the population size polynomial by means of a suitable diversity mechanism and therefore finds a 2-approximation in expected polynomial time. We also introduce a population-based evolutionary algorithm which finds a $(1+\varepsilon)$-approximation in expected time .
1. Introduction
The area of runtime analysis has provided many rigorous new insights into the working behaviour of bio-inspired computing methods such as evolutionary algorithms and ant colony optimization (Auger and Doerr, 2011; Jansen, 2013; Neumann and Witt, 2010). In recent years, the parameterized analysis of bio-inspired computing has gained additional interest (Kratsch et al., 2010; Kratsch and Neumann, 2013; Sutton and Neumann, 2012; Sutton et al., 2014). Here, the runtime of bio-inspired computing is studied in dependence of the input size and additional parameters such as the solution size and/or other structural parameters of the given input.
One of the problems that has been studied extensively in the area of runtime analysis is the classical NP-hard vertex cover problem. Here, an undirected graph $G=(V,E)$ is given and the goal is to find a minimum set of vertices $V'\subseteq V$ such that each edge has at least one endpoint in $V'$. Friedrich et al. (2010) have shown that the single-objective evolutionary algorithm (1+1) EA cannot achieve a constant approximation ratio in expected polynomial time. Furthermore, they have shown that a multiobjective approach using Global Simple Evolutionary Multiobjective Optimizer (Global SEMO) gives a factor approximation for the wider classes of set cover problems in expected polynomial time. Further investigations regarding the approximation behaviour of evolutionary algorithms for the vertex cover problem have been carried out in Friedrich et al. (2009) and Oliveto et al. (2009). Edge-based representations in connection with different fitness functions have been investigated in Jansen et al. (2013) and Pourhassan et al. (2015) according to their approximation behaviour in the static and dynamic setting. Kratsch and Neumann (2013) have studied evolutionary algorithms and the vertex cover problem in the context of parameterized complexity (Downey and Fellows, 1999). They have shown that Global SEMO with a problem-specific mutation operator is a fixed parameter evolutionary algorithm for this problem (for details about fixed parameter evolutionary algorithms, see Kratsch and Neumann, 2013), and finds 2-approximations in expected polynomial time. Kratsch and Neumann (2013) have also introduced an alternative mutation operator and have proved that Global SEMO using this mutation operator finds a $(1+\varepsilon)$-approximation in expected time . Jansen et al. (2013) have shown that a 2-approximation can also be obtained by using an edge-based representation in the (1+1) EA combined with a fitness function formulation based on matchings.
In this article, we consider the weighted vertex cover problem where integer weights on the vertices are given and the goal is to find a vertex cover of minimum weight. We extend the investigations carried out in Kratsch and Neumann (2013) to the weighted minimum vertex cover problem. In Kratsch and Neumann (2013), multiobjective models in combination with a simple multiobjective evolutionary algorithm called Global SEMO are investigated. The secondary objective that is studied there is the solution for the LP relaxation of the problem, which helps the evolutionary algorithm construct LP-based approximation solutions. One key argument for the results presented for the (unweighted) vertex cover problem is that the population size is always upper bounded by . This argument does not hold in the weighted case. Therefore, we study how a variant of Global SEMO using appropriate diversity mechanisms is able to deal with the weighted vertex cover problem.
The focus of this article is on the expected time (number of fitness evaluations) of the algorithms to find good approximations of an optimal solution. The time complexity analysis is performed with respect to $n$, $W_{\max}$, and OPT, which denote the number of vertices, the maximum weight in the input graph, and the cost of the optimal solution, respectively. We first study the expected time until Global SEMO with standard mutation operator has found a 2-approximation in dependence of and . Afterwards, we analyse the expected time that Global SEMO requires to find a solution with expected approximation ratio $(1+\varepsilon)$ for this problem when the algorithm uses an alternative mutation operator. Furthermore, this article considers DEMO, a variant of Global SEMO, which incorporates $\varepsilon$-dominance (Laumanns et al., 2002) as a diversity mechanism. It is shown that DEMO finds a 2-approximation in expected polynomial time. Finally, a population-based approach is presented that obtains a solution that has approximation ratio $(1+\varepsilon)$ in expected time .
This article extends the conference version (Pourhassan et al., 2016) by giving complete proofs for a number of lemmata (Lemmata 4, 5, 11, and 12) which are not contained in the conference version. Furthermore, it analyses the expected time until Global SEMO with standard mutation operator has found a 2-approximation (Section 3.1), and provides a population-based approach that obtains a solution that has approximation ratio $(1+\varepsilon)$ in expected time (Section 5).
The outline of the article is as follows. In Section 2, the problem definition is presented as well as the classical Global SEMO algorithm and the DEMO algorithm. Runtime analysis for Global SEMO is presented in Section 3, with the standard mutation operator investigated in Section 3.1 for finding a 2-approximation, and the alternative mutation operator analysed in Section 3.2 for finding a $(1+\varepsilon)$-approximation. Section 4 includes the analysis that shows DEMO can find 2-approximations of the optimum in expected polynomial time. The population-based algorithm is defined and investigated for finding a $(1+\varepsilon)$-approximation in Section 5. Finally, in Section 6 we summarize and conclude.
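We consider the weighted vertex cover problem defined as follows. Given a graph $G=(V,E)$ with vertex set $V=\{v_1,\ldots,v_n\}$ and edge set $E=\{e_1,\ldots,e_m\}$, and a positive weight function $w\colon V\to\mathbb{N}^+$ on the vertices, the goal is to find a subset of vertices, $V_C\subseteq V$, that covers all edges and has minimum weight; that is, $e\cap V_C\neq\emptyset$ for all $e\in E$ and $\sum_{v\in V_C} w(v)$ is minimized. We consider the standard node-based approach; that is, the search space is $\{0,1\}^n$ and for a solution $x=(x_1,\ldots,x_n)$ the vertex $v_i$ is chosen iff $x_i=1$.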
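The problem has the following integer linear programming (ILP) formulation: minimize $\sum_{i=1}^{n} w(v_i)\,x_i$ subject to $x_i+x_j\ge 1$ for all $\{v_i,v_j\}\in E$ and $x_i\in\{0,1\}$ for all $i$. By relaxing the constraint $x_i\in\{0,1\}$ to $x_i\in[0,1]$, the linear program formulation of Fractional Weighted Vertex Cover is obtained. Hochbaum (1983) has shown that we can find a 2-approximation using the LP result of the relaxed weighted vertex cover. This can be done by including any vertex $v_i$ for which $x_i\ge\frac{1}{2}$.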
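The rounding step can be illustrated with a minimal sketch. It assumes a feasible fractional solution is already available (no LP solver is involved); the triangle instance and the helper names `is_fractional_cover`, `round_at_half`, and `cost` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def is_fractional_cover(edges, y):
    """Check the LP constraint y_u + y_v >= 1 for every edge (u, v)."""
    return all(y[u] + y[v] >= 1 for u, v in edges)


def round_at_half(y):
    """Select every vertex whose fractional value is at least 1/2."""
    return [1 if value >= Fraction(1, 2) else 0 for value in y]


def cost(weights, x):
    """Total weight of the vertices selected by the 0/1 vector x."""
    return sum(w for w, bit in zip(weights, x) if bit)


# Hypothetical instance: a triangle with weights 2, 3, 4.
edges = [(0, 1), (1, 2), (0, 2)]
weights = [2, 3, 4]
y = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)]  # a feasible fractional cover

assert is_fractional_cover(edges, y)
lp_cost = sum(w * value for w, value in zip(weights, y))
x = round_at_half(y)

# Each selected vertex has y_i >= 1/2, so rounding at most doubles the cost.
assert all(x[u] or x[v] for u, v in edges)
assert cost(weights, x) <= 2 * lp_cost
```

Every edge has an endpoint with value at least 1/2 in any feasible fractional solution, which is why the rounded vector is always a cover.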
We consider primarily multiobjective approaches for the weighted vertex cover problem. Given a multiobjective fitness function $f=(f_1,\ldots,f_d)\colon S\to\mathbb{R}^d$, defined on the solution set $S$, where all $d$ objectives should be minimized, we have $f(x)\le f(y)$ iff $f_i(x)\le f_i(y)$, $1\le i\le d$. We say that $x$ (weakly) dominates $y$ iff $f(x)\le f(y)$. Furthermore, we say that $x$ (strongly) dominates $y$ iff $f(x)\le f(y)$ and $f(x)\neq f(y)$. A solution is Pareto optimal if there is no solution that strongly dominates it. The set of Pareto optimal solutions is called the Pareto front.
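The two dominance relations translate directly into code. The following sketch works on objective vectors given as tuples; the function names are hypothetical.

```python
def weakly_dominates(fx, fy):
    """fx weakly dominates fy: fx is no worse in every (minimized) objective."""
    return all(a <= b for a, b in zip(fx, fy))


def strongly_dominates(fx, fy):
    """fx strongly dominates fy: weak dominance plus a different objective vector."""
    return weakly_dominates(fx, fy) and tuple(fx) != tuple(fy)


def pareto_front(points):
    """Keep the objective vectors that no other vector strongly dominates."""
    return [p for p in points
            if not any(strongly_dominates(q, p) for q in points)]


vectors = [(3, 0), (2, 1), (2, 2), (0, 3)]
front = pareto_front(vectors)  # (2, 2) is strongly dominated by (2, 1)
```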
We now introduce the objectives used in our multiobjective evolutionary algorithm. Let $G(x)$ be the graph obtained from $G$ by removing all edges covered by the vertices chosen by $x$. Formally, we have $G(x)=(V,E(x))$ where $E(x)=E\setminus\{e\in E\mid \exists\, v_i\in e\colon x_i=1\}$ (note that, to unify the search space, $G(x)$ keeps the same vertex set as $G$). Kratsch and Neumann (2013) investigated a multiobjective baseline algorithm called Global SEMO using the LP-value for $G(x)$ as one of the fitness values for the (unweighted) minimum vertex cover problem.
Our goal is to extend the analysis of the behaviour of multiobjective evolutionary algorithms to the weighted vertex cover problem. In order to do this, we modify the fitness function that was used in Global SEMO in Kratsch and Neumann (2013) to match the weighted version of the problem. We investigate the multiobjective fitness function $f(x)=(Cost(x),LP(x))$, where
$Cost(x)=\sum_{i=1}^{n} w(v_i)\,x_i$ is the sum of weights of selected vertices
$LP(x)$ is the value of an optimal solution of the LP for $G(x)$.
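A minimal sketch of this two-objective fitness for very small graphs follows. It assumes the first objective is the total weight of the selected vertices and the second is the LP value of the graph G(x) of still-uncovered edges. Instead of calling an LP solver, it enumerates half-integral assignments, which suffices because a half-integral optimal LP solution always exists (Balinski, 1970). The enumeration is exponential and meant only as an illustration; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF_INTEGRAL = (Fraction(0), Fraction(1, 2), Fraction(1))


def cost(weights, x):
    """Sum of the weights of the vertices selected by x."""
    return sum(w for w, bit in zip(weights, x) if bit)


def uncovered_edges(edges, x):
    """Edge set of G(x): edges with no selected endpoint."""
    return [(u, v) for u, v in edges if not x[u] and not x[v]]


def lp_value(edges, weights, x):
    """Optimal fractional cover value of G(x), by half-integral enumeration."""
    remaining = uncovered_edges(edges, x)
    active = sorted({v for edge in remaining for v in edge})
    best = None
    for values in product(HALF_INTEGRAL, repeat=len(active)):
        y = dict(zip(active, values))
        if all(y[u] + y[v] >= 1 for u, v in remaining):
            total = sum((weights[v] * y[v] for v in active), Fraction(0))
            if best is None or total < best:
                best = total
    return best


def fitness(edges, weights, x):
    """Two-objective fitness: (selected weight, LP value of G(x))."""
    return (cost(weights, x), lp_value(edges, weights, x))


# Hypothetical instance: a path on three vertices with weights 2, 3, 2.
path_edges = [(0, 1), (1, 2)]
path_weights = [2, 3, 2]
f_empty = fitness(path_edges, path_weights, (0, 0, 0))
f_first = fitness(path_edges, path_weights, (1, 0, 0))
```

On a unit-weight triangle the same function returns an LP value of 3/2 for the empty selection, which shows why the second objective can take half-integer values.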
We analyse Global SEMO with this fitness function using the standard mutation operator, which flips each bit with probability $1/n$. We also investigate Global SEMO using the alternative mutation operator introduced in Kratsch and Neumann (2013) (see Algorithm 2). With this mutation operator, the vertices that are adjacent to uncovered edges are included with probability $1/2$ in some steps.
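The main loop of Global SEMO with standard mutation can be sketched as follows. This is a simplified illustration under stated assumptions, not a transcription of Algorithm 1: the parent is chosen uniformly at random, each bit is flipped with probability 1/n, and the offspring is accepted unless some population member strongly dominates it. The fitness function is passed in as a parameter, and `toy_fitness` is a hypothetical stand-in that counts uncovered edges in place of the LP value.

```python
import random


def weakly_dominates(fa, fb):
    """fa is no worse than fb in every minimized objective."""
    return all(a <= b for a, b in zip(fa, fb))


def standard_mutation(x, rng):
    """Flip each bit independently with probability 1/n."""
    n = len(x)
    return tuple(1 - bit if rng.random() < 1.0 / n else bit for bit in x)


def global_semo(fitness, n, iterations, rng):
    """Keep a population of mutually nondominated solutions."""
    start = tuple(rng.randint(0, 1) for _ in range(n))
    population = {start: fitness(start)}
    for _ in range(iterations):
        parent = rng.choice(list(population))
        child = standard_mutation(parent, rng)
        f_child = fitness(child)
        # Reject the child if an existing solution strongly dominates it.
        if any(weakly_dominates(f, f_child) and f != f_child
               for f in population.values()):
            continue
        # Remove everything the child weakly dominates, then insert it.
        population = {x: f for x, f in population.items()
                      if not weakly_dominates(f_child, f)}
        population[child] = f_child
    return population


# Hypothetical stand-in fitness on a path with four vertices.
EDGES = [(0, 1), (1, 2), (2, 3)]
WEIGHTS = [2, 1, 1, 2]


def toy_fitness(x):
    selected_weight = sum(w for w, bit in zip(WEIGHTS, x) if bit)
    uncovered = sum(1 for u, v in EDGES if not x[u] and not x[v])
    return (selected_weight, uncovered)


final = global_semo(toy_fitness, n=4, iterations=2000, rng=random.Random(1))
```

Passing a function that returns the pair (Cost(x), LP(x)) instead of `toy_fitness` would give the two-objective model considered in this article.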
In the fitness function used in Global SEMO, both $Cost(x)$ and $LP(x)$ can be exponential with respect to the input size; therefore, we need to deal with exponentially many solutions, even if we only keep the Pareto front. One approach for dealing with this problem is using the concept of $\varepsilon$-dominance (Laumanns et al., 2002). The concept of $\varepsilon$-dominance has previously been proved to be useful for coping with exponentially large Pareto fronts in some problems (Horoba and Neumann, 2008; Neumann et al., 2011). Having two objective vectors and , $\varepsilon$-dominates , denoted by , if for all we have . Motivated by this approach, DEMO (Diversity Evolutionary Multiobjective Optimizer) has been investigated in Neumann and Reichel (2008) and Neumann et al. (2011), which we present in Algorithm 3. In this approach, the objective space is partitioned into a polynomial number of boxes in which all solutions $\varepsilon$-dominate each other, and at most one solution from each box is kept in the population. Here, we describe the concept of boxes and how we keep one solution for each box in detail, and in Section 4, we analyze DEMO.
The functions and partition the objective space into horizontal and vertical stripes, which we name rows and columns, and the whole boxing function partitions the objective space into boxes. A box can be denoted by , where and are values of and for the solutions in that box, respectively.
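One common way to obtain polynomially many boxes is a logarithmic grid on each objective. The sketch below assumes box indices of the form ceil(log_{1+eps}(1 + value)); this concrete formula, the parameter name `eps`, and the function names are assumptions made for illustration and may differ from the boxing function used by DEMO here.

```python
import math


def box_index(value, eps):
    """Row or column index of a nonnegative objective value on a log grid."""
    return math.ceil(math.log(1 + value) / math.log(1 + eps))


def box(f, eps):
    """Map a fitness pair (cost, lp) to its (column, row) box."""
    cost, lp = f
    return (box_index(cost, eps), box_index(lp, eps))


origin = box((0, 0), 1.0)
example = box((2, 4), 1.0)
# Nearby large values share an index, which is what bounds the number of boxes.
same_column = box_index(100, 0.5) == box_index(101, 0.5)
```

Under this assumption, an objective bounded by some value B yields at most about log_{1+eps}(1 + B) + 1 distinct indices, so the number of boxes grows logarithmically in the objective range.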
Note that two boxes and with and (or and ) can include search points that do not dominate each other; therefore, we may keep solutions from different boxes with same values of or . But if and , then all search points in dominate all search points in . Hence, we define dominance among boxes as: box dominates box , denoted by , if and .
In DEMO, only one nondominated solution can be kept in the population for each box, based on a predefined criterion. In our setting, among two solutions $x$ and $y$ from one box, $x$ is kept in $P$ and $y$ is discarded if $Cost(x)+2\cdot LP(x)\le Cost(y)+2\cdot LP(y)$. The reason behind this particular setting is that we aim to work on solutions under the constraint that $Cost(x)+2\cdot LP(x)\le 2\cdot OPT$, because by only adding vertices to these solutions, it is possible to obtain 2-approximate complete vertex covers.
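The box-level rules can be sketched as two small functions. The sketch assumes that a box dominates another when it is strictly smaller in both indices, and that the in-box criterion compares the weighted sum Cost + 2·LP, with ties resolved in favour of the first argument; both function names are hypothetical.

```python
def box_dominates(box_a, box_b):
    """box_a dominates box_b if it is strictly smaller in both indices."""
    return box_a[0] < box_b[0] and box_a[1] < box_b[1]


def weighted_sum(f):
    """Cost + 2 * LP for a fitness pair f = (cost, lp)."""
    cost, lp = f
    return cost + 2 * lp


def survivor(f_x, f_y):
    """Of two fitness pairs in the same box, keep the smaller weighted sum."""
    return f_x if weighted_sum(f_x) <= weighted_sum(f_y) else f_y


kept = survivor((3, 1), (1, 3))    # weighted sums 5 and 7
other = survivor((4, 2), (5, 1))   # weighted sums 8 and 7
```

The weighted sum is the quantity that must stay at most 2·OPT for a partial solution to be extendable to a 2-approximate cover, which motivates preferring the smaller value inside a box.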
Analysing the runtime of our evolutionary algorithms, we are interested in the expected number of rounds of the repeat loop until a solution of desired quality has been obtained. We call this the expected time until the considered algorithm has achieved its desired goal.
3. Analysis of Global SEMO
In this section, we analyse the expected time of Global SEMO to find good approximations for the weighted vertex cover problem in dependence of the input size and OPT. Before we present our analysis for Global SEMO, we state some basic properties of the solutions in our multiobjective model. The following theorem, shown by Balinski (1970), states that all basic feasible solutions of the LP relaxation of the weighted vertex cover, which are the extremal points or the corner solutions of the polyhedron that forms the feasible space, are half-integral.
Theorem 1 (Balinski, 1970). Each basic feasible solution $x$ of the LP relaxation of the weighted vertex cover is half-integral; that is, $x_i\in\{0,\frac{1}{2},1\}$ for all $i$.
As a result, there always exists a half-integral optimal LP solution for a vertex cover problem. In several parts of this article, we make use of this result. We establish the following two lemmata which we will use later on in the analysis of our algorithms.
Lemma 2. For any $x\in\{0,1\}^n$, $LP(x)\le LP(0^n)\le OPT$.
Proof. Let $y$ be the LP solution of $LP(0^n)$. The solution $0^n$ contains no vertices; therefore, $y$ is the optimal fractional vertex cover for all edges of the input graph. Thus, for any solution $x$, $y$ is a (possibly nonoptimal) fractional cover for $G(x)$; therefore, $LP(x)\le LP(0^n)$. Moreover, we have $LP(0^n)\le OPT$ as $LP(0^n)$ is the optimal value of the LP relaxation.
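The chain LP(x) ≤ LP(0^n) ≤ OPT can be checked exhaustively on a tiny instance. The sketch below is a brute-force illustration, not part of the proof: it enumerates 0/1 assignments to obtain OPT and half-integral assignments to obtain LP values (sufficient by the half-integrality theorem above). The triangle instance and the function name `cheapest_cover` are hypothetical.

```python
from fractions import Fraction
from itertools import product

INTEGRAL = (0, 1)
HALF_INTEGRAL = (Fraction(0), Fraction(1, 2), Fraction(1))


def cheapest_cover(edges, weights, x, values):
    """Cheapest assignment over `values` covering the edges x leaves uncovered."""
    remaining = [(u, v) for u, v in edges if not x[u] and not x[v]]
    best = None
    for y in product(values, repeat=len(weights)):
        if all(y[u] + y[v] >= 1 for u, v in remaining):
            total = sum(w * value for w, value in zip(weights, y))
            if best is None or total < best:
                best = total
    return best


# Hypothetical instance: a triangle with weights 2, 2, 3.
edges = [(0, 1), (1, 2), (0, 2)]
weights = [2, 2, 3]
empty = (0, 0, 0)

opt = cheapest_cover(edges, weights, empty, INTEGRAL)
lp_empty = cheapest_cover(edges, weights, empty, HALF_INTEGRAL)
lemma_holds = all(
    cheapest_cover(edges, weights, x, HALF_INTEGRAL) <= lp_empty <= opt
    for x in product((0, 1), repeat=len(weights))
)
```

On this instance the LP value of the empty selection is 7/2 while the optimal cover costs 4, so the second inequality can be strict.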
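Lemma 3. Let $x=(x_1,\ldots,x_n)$ be a solution and $y=(y_1,\ldots,y_n)$ be an optimal fractional solution for $G(x)$. If there is a vertex $v_i$ where $y_i\ge\frac{1}{2}$, mutating $x_i$ from 0 to 1 results in a solution $x'$ for which $LP(x')\le LP(x)-y_i\cdot w(v_i)\le LP(x)-\frac{1}{2}\cdot w(v_i)$.
Proof. The graph $G(x')$ is the same as $G(x)$ excluding the edges connected to $v_i$. Therefore, the solution $y'=(y_1,\ldots,y_{i-1},0,y_{i+1},\ldots,y_n)$ is a fractional vertex cover for $G(x')$ and has a cost of $LP(x)-y_i\cdot w(v_i)$. The cost of the optimal fractional vertex cover of $G(x')$ is at most as great as the cost of $y'$; thus $LP(x')\le LP(x)-y_i\cdot w(v_i)\le LP(x)-\frac{1}{2}\cdot w(v_i)$.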
We now analyse the runtime behaviour of Global SEMO (Algorithm 1) with the standard mutation operator, in dependence of OPT. We start by giving an upper bound on the population size of Global SEMO.
Lemma 4. The population size of Algorithm 1 is upper bounded by $2\cdot OPT+1$.
Proof. For any solution $x$ there exists an optimal fractional vertex cover which is half-integral (Theorem 1). Moreover, we are assuming that all the weights are integer values. Therefore, $LP(x)$ can only take $2\cdot LP(0^n)+1$ different values, because $LP(0^n)$ is an upper bound on $LP(x)$ (Lemma 2). For each value of $LP(x)$, only one solution is in $P$, because Algorithm 1 keeps nondominated solutions only. Therefore, the population size of this algorithm is upper bounded by $2\cdot LP(0^n)+1$, which is at most $2\cdot OPT+1$ due to Lemma 2.
For our analysis, we first consider the expected time of Global SEMO to reach a population which contains the empty set of vertices. Once included, such a solution will never be removed from the population as it is minimal with respect to the cost function.
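Lemma 5. The search point $0^n$ is included in the population in expected time of .
Proof. From Lemma 4 we know that the population contains at most $2\cdot OPT+1$ solutions. Therefore, at each step, there is a probability of at least $\frac{1}{2\cdot OPT+1}$ that the solution $x_{\min}$ is selected, where $x_{\min}=\arg\min_{x\in P} Cost(x)$.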
If , there must be vertices such as in where . Let be the improvement that happens on the minimum cost in at step . If all the 1-bits in solution flip to zero, at the same step or different steps, a solution will be obtained with , which implies that the expected improvement that flipping a randomly chosen 1-bit makes is at each step . Note that flipping 1-bits always improves the minimum cost and the new solution is added to the population. Moreover, flipping any 0-bits does not improve the minimum cost in the population and is not replaced with the new solution in that case.
The maximum value that $Cost(x_{\min})$ can take is bounded by $W_{\max}\cdot n$, and for any solution $x\neq 0^n$, the minimum value of $Cost(x)$ is at least 1. Using Multiplicative Drift Analysis (Doerr et al., 2012) with $s_0\le W_{\max}\cdot n$ and $s_{\min}\ge 1$, we can conclude that in expected time solution $0^n$ is included in the population.
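For reference, the multiplicative drift theorem of Doerr et al. (2012) can be stated as follows. The symbols $X_t$, $T$, and $\delta$ follow the usual presentation of that theorem and are introduced here only for illustration; $W_{\max}$ denotes the maximum vertex weight.

```latex
% Multiplicative drift theorem (Doerr, Johannsen, and Winzen, 2012).
% Let S be a finite set of positive numbers with minimum s_min, let (X_t)
% be random variables over S \cup \{0\}, and let T = \min\{t : X_t = 0\}.
\[
  \mathrm{E}\bigl[X_t - X_{t+1} \mid X_t = s\bigr] \ge \delta\, s
  \quad \text{for all } s \in S
  \qquad\Longrightarrow\qquad
  \mathrm{E}\bigl[T \mid X_0 = s_0\bigr] \le \frac{1 + \ln(s_0 / s_{\min})}{\delta}.
\]
% With s_0 \le n \cdot W_{\max} and s_{\min} \ge 1 this gives
\[
  \mathrm{E}[T] \le \frac{1 + \ln(n \cdot W_{\max})}{\delta}.
\]
```

The resulting runtime bound then depends only on the drift factor $\delta$, which is determined by the probability of selecting the right parent and of flipping a suitable bit.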
We now show that Global SEMO is able to achieve a 2-approximation efficiently as long as OPT is small.
Theorem 6. The expected number of iterations of Global SEMO until the population contains a 2-approximation is .
Proof. Let $x$ be a solution that minimizes $LP(x)$ under the constraint that $Cost(x)+2\cdot LP(x)\le 2\cdot OPT$. Note that this constraint holds for solution $0^n$ since $LP(0^n)\le OPT$, and according to Lemma 5, solution $0^n$ exists in the population in expected time of .
If $LP(x)=0$, then all edges are covered and $x$ is a 2-approximate vertex cover because we have $Cost(x)+2\cdot LP(x)\le 2\cdot OPT$ as the constraint. Otherwise, some edges are uncovered and any LP solution of $G(x)$ assigns at least $\frac{1}{2}$ to at least one vertex of any uncovered edge. Let $y$ be a basic LP solution for $G(x)$. According to Theorem 1, $y$ is a half-integral solution.
According to Lemma 2, for any solution $x$, we have $LP(x)\le OPT$. We also know that $LP(x)\ge 1$ for any solution $x$ which is not a complete cover, because the weights are positive integers. Using the method of Multiplicative Drift Analysis (Doerr et al., 2012) with $s_0\le OPT$ and $s_{\min}\ge 1$, in expected time of a solution $y$ with $LP(y)=0$ and $Cost(y)\le 2\cdot OPT$ is obtained, which is a 2-approximate vertex cover. Overall, since we have , the expected time of finding this solution is .
3.2. Improved Approximations by Alternative Mutation
In this section, we analyse the expected time of Global SEMO with an alternative mutation operator to find a $(1+\varepsilon)$-approximation.
Lemma 7. A solution $x$ fulfilling the two properties
there is an optimal solution of the LP for $G(x)$ which assigns 1/2 to each non-isolated vertex of $G(x)$
is included in the population of Global SEMO in expected time .
As the standard mutation occurs with probability in the alternative mutation operator, the search point which satisfies property 1 is included in the population in expected time of using the argument presented in the proof of Lemma 5. Let be a set of solutions such that for each solution , . Let be a solution such that .
If the optimal fractional vertex cover for assigns 1/2 to each nonisolated vertex of , then the conditions of the lemma hold. Otherwise, it assigns 1 to some nonisolated vertex, say . The probability that the algorithm selects and flips the bit corresponding to , is because the population size is (Lemma 4). Let be the new solution. We have , and by Lemma 3, . This implies that ; hence, is a Pareto Optimal solution and is added to the population .
Since (Lemma 2) and the weights are at least 1, assuming that we already have the solution in the population, by means of the method of fitness-based partitions, we find the expected time of finding a solution that fulfils the properties given above as . Since the search point is included in expected time , the expected time that a solution fulfilling the properties given above is included in is .
We now present the main approximation result for Global SEMO using the alternative mutation operator. The general idea of the analysis given in the following theorem is to partition the nonisolated vertices in into four subsets: , , , and , where denotes a solution satisfying the two properties given in Lemma 7. The precise definition of these four subsets are given in the proof of the theorem. For a new vertex cover obtained by the alternative mutation operator on , the analysis only considers the probability that all vertices of are chosen and no vertex of is chosen in the new solution . The quality of highly depends on this property (including all vertices of and no vertices of ), but it also depends on the vertices chosen from and , where the vertices in and are chosen randomly with probability by the alternative mutation operator. Thus the analysis finds the expected time until the event that a solution with the defined property is found, and the expected ratio of that solution is considered, based on the expectation that half of the vertices in and half of the vertices in are chosen by . In the following, we first present the formal definition of sets and and the mentioned property, and then we state the main theorem of this section.
Let be a solution that satisfies the two properties given in Lemma 7. Also, let be the set containing all nonisolated vertices in graph . Moreover, let be a vertex cover of with the minimum weight over all vertex covers of , and be the set containing all nonisolated vertices in . For a set of vertices, , we define . Let . Let be a numbering of the vertices in such that , for all . And let be a numbering of the vertices in such that , for all . We define
Property 9 (High-Quality solutions): We say that a solution has the property of a High-Quality solution if all vertices of are chosen and no vertex of is chosen in .
Theorem 9. The expected time until Global SEMO has obtained a solution with Property 9 (High-Quality solution) is . Moreover, the obtained solution has expected approximation ratio of $(1+\varepsilon)$.
By Lemma 7, a solution that satisfies the two properties given in Lemma 7 is included in the population in expected time of . Let , , , , , and for a vertex set be as defined in Definition 8. Due to property 2 of Lemma 7, ; therefore, . Also, let be as defined in Definition 8. Observe that , because is the minimum vertex cover of .
With probability , the algorithm Global SEMO selects the solution , and sets in the Alternative Mutation Operator. With , the probability that the bits corresponding to all vertices of are flipped, is , and the probability that none of the bits corresponding to the vertices of are flipped is . Also, the bits corresponding to the isolated vertices of are flipped with probability by the Alternative Mutation Operator; hence, the probability that none of them flips is . As a result, with probability , solution is selected, the vertices of are included, and the vertices of and isolated vertices are not included in the new solution . Due to the definition of Property 9, has the property of a High-Quality solution. Since , and also , the expected time until solution is found after reaching solution is .
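The alternative mutation operator discussed in this proof can be sketched as follows. The sketch rests on assumptions read from the surrounding description rather than on Algorithm 2 itself: a fair coin decides between the standard mode and the special mode; in the special mode, bits of vertices adjacent to uncovered edges are flipped with probability 1/2 and all other bits with probability 1/n. The function name and the `rng` parameter are hypothetical.

```python
import random


def alternative_mutation(x, edges, rng):
    """One application of the (assumed) alternative mutation operator."""
    n = len(x)
    special_mode = rng.random() < 0.5  # otherwise: plain standard mutation
    hot = set()
    if special_mode:
        # Vertices adjacent to an uncovered edge are the non-isolated
        # vertices of G(x); both endpoints are currently unselected.
        for u, v in edges:
            if not x[u] and not x[v]:
                hot.update((u, v))
    child = []
    for i, bit in enumerate(x):
        flip_probability = 0.5 if i in hot else 1.0 / n
        child.append(1 - bit if rng.random() < flip_probability else bit)
    return tuple(child)


# Hypothetical usage on a path with four vertices.
path = [(0, 1), (1, 2), (2, 3)]
offspring = alternative_mutation((0, 0, 1, 0), path, random.Random(7))
```

Because the endpoints of an uncovered edge are unselected, flipping such a bit always means including the vertex, which is why the special mode can add many useful vertices in a single step.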
Now we show that the second statement of the theorem holds. Note that the bits corresponding to vertices of and , are arbitrarily flipped in solution with probability by the Alternative Mutation Operator. Here, we show that for the expected cost and the LP value of , the following constraint holds: .
Case (I). . Then . Thus, and Inequality (1) holds true.
Now we analyze whether the new solution could be included in the population . If could not be included in , then there is a solution dominating ; that is, and . This implies . Therefore, after having a solution that fulfils the properties of Lemma 7 in , in expected time , the population would contain a solution such that .
Let contain all solutions such that , and let be the one that minimizes . With similar proof as we saw in Theorem 6 it is possible to show that at each step, on expectation improves by . Using Multiplicative Drift Analysis, we get the expected time to find a solution for which and .
Overall, the expected number of iterations of Global SEMO with alternative mutation operator, for getting a weighted vertex cover with expected approximation ratio , is bounded by .
4. Analysis of DEMO
Due to Lemma 4, with Global SEMO, the population size is upper bounded by , which can be exponential in terms of the input size. In this section, we analyse the other evolutionary algorithm, DEMO (Algorithm 3), that uses some diversity handling mechanisms for dealing with exponentially large population sizes. The following lemmata are used in the proof of Theorem 13.
Lemma 10. Let $W_{\max}$ be the maximum weight assigned to a vertex. The population size of DEMO is upper bounded by .
We here show that the size of the population is . Since the dominated solutions according to are discarded by the algorithm, none of the solutions in can be located in a box that is dominated by another box that contains a solution in . Moreover, at most one solution from each box is kept in the population; therefore, is at most the maximum number of boxes where none of them dominates another.
Let be the number of boxes that contain a solution of in the first column. Let be the smallest row number among these boxes. Observe that and the equality holds when the boxes are from rows down to . Any box in the second column with a row number of or above is dominated by the box of the previous column and row . Therefore, the maximum row number for a box in the second column, that is not dominated, is . With generalizing the idea, the maximum row number for a box in the column , that is not dominated, is , where for , is the number of boxes that contain a solution of in column .
Lemma 11. The search point $0^n$ is included in the population in expected time of .
From Lemma 10 we know that the population contains solutions. Therefore, at each step, there is a probability of at least that the solution is selected where .
If , we have , which means since the weights are greater than 0.
Therefore, in expected time of at most the new solution is obtained which is accepted by the algorithm because it is placed in a box with a smaller value of than all solutions in and hence not dominated. There are different values for ; therefore, the solution with is found in expected time of at most .
Lemma 12. Let be a search point such that and . There exists a 1-bit flip leading to a search point with and .
Theorem 13. The expected time until DEMO constructs a 2-approximate vertex cover is .
Consider solution that minimizes under the constraint that . Note that fulfils this constraint and according to Lemma 11, the solution will be included in in time .
If then covers all edges and by selection of we have , which means that is a 2-approximation.
In case , according to Lemma 12 there is a one-bit flip on that results in a new solution for which , while the mentioned constraint also holds for it. Since the population size is (Lemma 10), this 1-bit flip happens with a probability of and is obtained in expected time of . This new solution will be added to because a solution with cannot dominate with , and has the minimum value of among solutions that fulfil the constraint. Moreover, if there already is a solution, , in the same box as , it will be replaced by because ; otherwise, it would have been selected as .
There are at most different values for in the objective space, and since , . Therefore, the expected time until a solution is found so that and , is at most .
5. Diverse Population-Based EA
In this section, we introduce a population-based algorithm (see Algorithm 4) that keeps for each $k$, $0\le k\le n$, at most two solutions with $k$ selected vertices. This implies that the population size is upper bounded by $2(n+1)$. The two solutions kept in the population are chosen according to different weightings of the cost and the LP-value. For each solution $x$, let $|x|_1$ be the number of selected vertices in $x$. Algorithm 4 keeps a new solution $x'$ in the population if it minimizes or among the other solutions $x$ where $|x|_1=|x'|_1$. Algorithm 4 gives a detailed description.
Taking into account that the population size is upper bounded by $2(n+1)$ and considering in each step an individual with the smallest number of ones in the population for mutation, one can obtain the following lemma by standard fitness level arguments.
Lemma 14. The search point $0^n$ is included in the population in expected time of .
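The population update of such a diversity mechanism can be sketched as follows. The sketch assumes that each of the two kept solutions minimizes a weighted sum of the form Cost + λ·LP among the solutions with the same number of selected vertices; the two weights `lam_first` and `lam_second` are left as parameters because their concrete values are an assumption here, and all names are hypothetical.

```python
def scalarize(f, lam):
    """Weighted sum Cost + lam * LP of a fitness pair f = (cost, lp)."""
    cost, lp = f
    return cost + lam * lp


def update_population(population, x, f_x, lam_first, lam_second):
    """Keep, per number of ones, the best solution for each of two weightings."""
    slot = population.setdefault(sum(x), {})
    for key, lam in (("first", lam_first), ("second", lam_second)):
        current = slot.get(key)
        if current is None or scalarize(f_x, lam) < scalarize(current[1], lam):
            slot[key] = (x, f_x)
    return population


def population_size(population):
    """Number of distinct solutions currently stored."""
    return len({sol for slot in population.values() for sol, _ in slot.values()})


# Hypothetical usage: two solutions with two selected vertices each.
pop = {}
update_population(pop, (1, 1, 0), (4, 0), lam_first=2, lam_second=1)
update_population(pop, (0, 1, 1), (1, 2), lam_first=2, lam_second=1)
```

Since the number of selected vertices ranges from 0 to n and each value stores at most two solutions, the population in this sketch never exceeds 2(n+1) individuals.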
To show the main result for Diverse Population-Based EA, we will use the following lemma.
Lemma 15. A solution $x$ fulfilling the two properties
there is an optimal solution of the LP for $G(x)$ which assigns 1/2 to each non-isolated vertex of $G(x)$
is included in the population of the Diverse Population-Based EA in expected time .
By Lemma 14, solution is contained in the population in expected time , which satisfies the property 1 given above. Let be a set containing all solutions in that satisfy the property 1 given above.
Let be the solution of with the maximal number of 1-bits. If the optimal fractional vertex cover for assigns 1/2 to each non-isolated vertex of , then the second property also holds. If the optimal fractional vertex cover for assigns 1 to some nonisolated vertex, say , then the algorithm selects and flips exactly the bit corresponding to with probability . Let be the new solution. By selection of we know that is the only solution with one-bits; hence, added to .
Since the maximum value of is , after expected time of , there is a solution in the population that fulfils the properties given in the lemma.
We now show the main result for the Diverse Population-Based EA.
Theorem 16. The expected time until the Diverse Population-Based EA has obtained a solution that has approximation ratio $(1+\varepsilon)$ is .
By Lemma 15 we know that after expected time of , there is a solution, , in the population that fulfils the properties given in that lemma. With analysis similar to what we had in Theorem 9, we can show that a solution with is produced in expected time .
Now we see whether solution is added to population . If could not be added to , then there exists a solution such that and . Thus, the population already includes a solution such that .
Let be a set containing all solutions such that . Let such that .
Suppose that could not be included in , then there exists a solution in such that and , which contradicts the assumption that . Therefore, solution could be included in .
Observe that for any solution , if , then . Thus, after expected time of at most , the population could include a solution such that and , which is a $(1+\varepsilon)$-approximate weighted vertex cover.
Overall, the expected time in which the Diverse Population-Based EA finds a $(1+\varepsilon)$-approximate weighted vertex cover is bounded by .
6. Conclusion
The minimum vertex cover problem is one of the classical NP-hard combinatorial optimization problems. In this article, we have generalized previous results of Kratsch and Neumann (2013) for the unweighted minimum vertex cover problem to the weighted case, where in addition weights on the vertices are given. Based on the conference version of this article (Pourhassan et al., 2016), in Sections 3.2 and 4, we have investigated Global SEMO with alternative mutation operator for finding a $(1+\varepsilon)$-approximation, and studied the algorithm DEMO using the $\varepsilon$-dominance approach, showing that it reaches a 2-approximation in expected polynomial time. Furthermore, in this article we have shown that Global SEMO with standard mutation operator efficiently computes a 2-approximation as long as the value of an optimal solution is small. We have also presented a population-based approach with a specific diversity mechanism that reaches a $(1+\varepsilon)$-approximation in expected time .
Acknowledgments
This research has been supported by Australian Research Council grants DP140103400 and DP160102401, and National Natural Science Foundation of China grant 61802441.