Abstract

Evolutionary multiobjective optimization for the classical vertex cover problem has been analysed in Kratsch and Neumann (2013) in the context of parameterized complexity analysis. This article extends the analysis to the weighted vertex cover problem in which integer weights are assigned to the vertices and the goal is to find a vertex cover of minimum weight. Using an alternative mutation operator introduced in Kratsch and Neumann (2013), we provide a fixed parameter evolutionary algorithm with respect to OPT, the cost of an optimal solution for the problem. Moreover, we present a multiobjective evolutionary algorithm with standard mutation operator that keeps the population size polynomially bounded by means of a suitable diversity mechanism, and therefore finds a 2-approximation in expected polynomial time. We also introduce a population-based evolutionary algorithm which finds a $(1+\epsilon)$-approximation in expected time $O(n\cdot 2^{\min\{n,2(1-\epsilon)OPT\}}+n^3)$.

1. Introduction

The area of runtime analysis has provided many rigorous new insights into the working behaviour of bio-inspired computing methods such as evolutionary algorithms and ant colony optimization (Auger and Doerr, 2011; Jansen, 2013; Neumann and Witt, 2010). In recent years, the parameterized analysis of bio-inspired computing has gained additional interest (Kratsch et al., 2010; Kratsch and Neumann, 2013; Sutton and Neumann, 2012; Sutton et al., 2014). Here, the runtime of bio-inspired computing is studied in dependence of the input size and additional parameters such as the solution size and/or other structural parameters of the given input.

One of the classical problems that has been studied extensively in the area of runtime analysis is the NP-hard vertex cover problem. Here, an undirected graph is given and the goal is to find a minimum set of vertices $V'$ such that each edge has at least one endpoint in $V'$. Friedrich et al. (2010) have shown that the single-objective evolutionary algorithm (1 + 1) EA cannot achieve a constant approximation ratio in expected polynomial time. Furthermore, they have shown that a multiobjective approach using Global Simple Evolutionary Multiobjective Optimizer (Global SEMO) gives a factor $O(\log n)$ approximation for the wider class of set cover problems in expected polynomial time. Further investigations regarding the approximation behaviour of evolutionary algorithms for the vertex cover problem have been carried out in Friedrich et al. (2009) and Oliveto et al. (2009). Edge-based representations in connection with different fitness functions have been investigated in Jansen et al. (2013) and Pourhassan et al. (2015) according to their approximation behaviour in the static and dynamic setting. Kratsch and Neumann (2013) have studied evolutionary algorithms and the vertex cover problem in the context of parameterized complexity (Downey and Fellows, 1999). They have shown that Global SEMO with a problem-specific mutation operator is a fixed parameter evolutionary algorithm for this problem (for details about fixed parameter evolutionary algorithms, please refer to Kratsch and Neumann, 2013), and finds 2-approximations in expected polynomial time. Kratsch and Neumann (2013) have also introduced an alternative mutation operator and have proved that Global SEMO using this mutation operator finds a $(1+\epsilon)$-approximation in expected time $O(n^2\log n+OPT\cdot n^2+n\cdot 4^{(1-\epsilon)OPT})$. Jansen et al. (2013) have shown that a 2-approximation can also be obtained by using an edge-based representation in the (1 + 1) EA combined with a fitness function formulation based on matchings.

In this article, we consider the weighted vertex cover problem where integer weights on the vertices are given and the goal is to find a vertex cover of minimum weight. We extend the investigations carried out in Kratsch and Neumann (2013) to the weighted minimum vertex cover problem. In Kratsch and Neumann (2013), multiobjective models in combination with a simple multiobjective evolutionary algorithm called Global SEMO are investigated. The secondary objective that is studied there is the solution for the LP relaxation of the problem, which helps the evolutionary algorithm construct LP-based approximation solutions. One key argument for the results presented for the (unweighted) vertex cover problem is that the population size is always upper bounded by n+1. This argument does not hold in the weighted case. Therefore, we study how a variant of Global SEMO using appropriate diversity mechanisms is able to deal with the weighted vertex cover problem.

The focus of this article is on the expected time (number of fitness evaluations) of the algorithms to find good approximations of an optimal solution. The time complexity analysis is performed with respect to $n$, $W_{max}$, and $OPT$, which denote the number of vertices, the maximum weight in the input graph, and the cost of the optimal solution, respectively. We first study the expected time until Global SEMO with standard mutation operator has found a 2-approximation in dependence of $n$ and $OPT$. Afterwards, we analyse the expected time that Global SEMO requires to find a solution with expected approximation ratio $(1+\epsilon)$ for this problem when the algorithm uses an alternative mutation operator. Furthermore, this article considers DEMO, a variant of Global SEMO, which incorporates $\epsilon$-dominance (Laumanns et al., 2002) as a diversity mechanism. It is shown that DEMO finds a 2-approximation in expected polynomial time. Finally, a population-based approach is presented that obtains a solution with approximation ratio $(1+\epsilon)$ in expected time $O(n\cdot 2^{\min\{n,2(1-\epsilon)OPT\}}+n^3)$.

This article extends the conference version (Pourhassan et al., 2016) by giving complete proofs for a number of lemmata (Lemmata 4, 5, 11, and 12) that are not contained in the conference version. Furthermore, it analyses the expected time until Global SEMO with standard mutation operator has found a 2-approximation (Section 3.1), and provides a population-based approach that obtains a solution with approximation ratio $(1+\epsilon)$ in expected time $O(n\cdot 2^{\min\{n,2(1-\epsilon)OPT\}}+n^3)$ (Section 5).

The outline of the article is as follows. In Section 2, the problem definition is presented as well as the classical Global SEMO algorithm and DEMO algorithm. Runtime analysis for Global SEMO is presented in Section 3 with the standard mutation operator investigated in Section 3.1 for finding a 2-approximation, and the alternative mutation operator analysed in Section 3.2 for finding a (1+ɛ)-approximation. Section 4 includes the analysis that shows DEMO can find 2-approximations of the optimum in expected polynomial time. The population-based algorithm is defined and investigated for finding a (1+ɛ)-approximation in Section 5. At the end, in Section 6 we summarize and conclude.

2. Preliminaries

We consider the weighted vertex cover problem defined as follows. Given a graph $G=(V,E)$ with vertex set $V=\{v_1,\ldots,v_n\}$ and edge set $E=\{e_1,\ldots,e_m\}$, and a positive weight function $w: V\to\mathbb{N}^+$ on the vertices, the goal is to find a subset of vertices, $V_C\subseteq V$, that covers all edges and has minimum weight; that is, $\forall e\in E,\ e\cap V_C\neq\emptyset$ and $\sum_{v\in V_C}w(v)$ is minimized. We consider the standard node-based approach; that is, the search space is $\{0,1\}^n$ and for a solution $x=(x_1,\ldots,x_n)$ the vertex $v_i$ is chosen iff $x_i=1$.

The weighted vertex cover problem has the following Integer Linear Programming (ILP) formulation.
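$$\begin{array}{lll}\min & \sum_{i=1}^{n}w(v_i)\cdot x_i & \\ \text{s.t.} & x_i+x_j\ge 1 & \forall (v_i,v_j)\in E\\ & x_i\in\{0,1\} & 1\le i\le n.\end{array}$$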

By relaxing the constraint $x_i\in\{0,1\}$ to $x_i\in[0,1]$, the linear program formulation of Fractional Weighted Vertex Cover is obtained. Hochbaum (1983) has shown that we can find a 2-approximation using the LP result of the relaxed weighted vertex cover. This can be done by including any vertex $v_i$ for which $x_i\ge\frac{1}{2}$.
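
The rounding step can be illustrated on a tiny instance. The sketch below is not part of the original analysis: it assumes a hypothetical four-vertex path graph, and instead of calling an LP solver it enumerates all half-integral points, which suffices because basic feasible solutions of this LP are half-integral (Theorem 1 in Section 3). All function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def solve_lp_half_integral(n, edges, weights):
    """Brute-force the fractional vertex cover LP over {0, 1/2, 1}^n.

    Only practical for very small n; an LP solver would be used otherwise.
    """
    values = (Fraction(0), Fraction(1, 2), Fraction(1))
    best_value, best_y = None, None
    for y in product(values, repeat=n):
        if all(y[i] + y[j] >= 1 for i, j in edges):
            value = sum(w * yi for w, yi in zip(weights, y))
            if best_value is None or value < best_value:
                best_value, best_y = value, y
    return best_value, best_y


def round_lp_solution(y):
    """Select every vertex whose LP value is at least 1/2."""
    return [1 if yi >= Fraction(1, 2) else 0 for yi in y]


# Hypothetical toy instance: path v0 - v1 - v2 - v3 with weights 3, 1, 1, 3.
edges = [(0, 1), (1, 2), (2, 3)]
weights = [3, 1, 1, 3]

lp_value, y = solve_lp_half_integral(4, edges, weights)
cover = round_lp_solution(y)
cover_cost = sum(w for w, chosen in zip(weights, cover) if chosen)
print(lp_value, cover, cover_cost)
```

Every edge has an endpoint with LP value at least 1/2, so the rounded set is a cover, and rounding at most doubles each selected variable, which gives a cost of at most twice the LP value.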

We consider primarily multiobjective approaches for the weighted vertex cover problem. Given a multiobjective fitness function $f=(f_1,\ldots,f_d): S\to\mathbb{R}^d$, defined on the solution set $S$, where all $d$ objectives should be minimized, we have $f(x)\le f(y)$ iff $f_i(x)\le f_i(y)$ for all $1\le i\le d$. We say that $x$ (weakly) dominates $y$ iff $f(x)\le f(y)$. Furthermore, we say that $x$ (strongly) dominates $y$ iff $f(x)\le f(y)$ and $f(x)\neq f(y)$. A solution $x^*$ is Pareto optimal if there is no solution that strongly dominates it. The set of Pareto optimal solutions is called the Pareto front.
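
As a small illustration of these relations, the following sketch implements weak and strong dominance for minimization and filters a list of objective vectors down to its nondominated members. The function names and the sample points are made up for this example.

```python
def weakly_dominates(fx, fy):
    """f(x) <= f(y) component-wise (all objectives are minimized)."""
    return all(a <= b for a, b in zip(fx, fy))


def strongly_dominates(fx, fy):
    """Weak dominance together with f(x) != f(y)."""
    return weakly_dominates(fx, fy) and tuple(fx) != tuple(fy)


def pareto_front(vectors):
    """Objective vectors that no other vector strongly dominates."""
    unique = set(map(tuple, vectors))
    return sorted(
        v for v in unique
        if not any(strongly_dominates(u, v) for u in unique)
    )


# Hypothetical (Cost, LP) pairs.
points = [(0, 4), (1, 3), (2, 3), (3, 1), (3, 2), (5, 0)]
front = pareto_front(points)
print(front)  # [(0, 4), (1, 3), (3, 1), (5, 0)]
```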

We now introduce the objectives used in our multiobjective evolutionary algorithm. Let $G(x)$ be the graph obtained from $G$ by removing all edges covered by the vertices chosen by $x$. Formally, we have $G(x)=(V,E(x))$ where $V(x)=V\setminus\{v_i\mid x_i=1\}$ and $E(x)=E\setminus\{e\mid e\cap(V\setminus V(x))\neq\emptyset\}$ (note that to unify the search space, we let $G$ and $G(x)$ share the same vertex set). Kratsch and Neumann (2013) investigated a multiobjective baseline algorithm called Global SEMO using the LP-value for $G(x)$ as one of the fitness values for the (unweighted) minimum vertex cover problem.

Our goal is to expand the analysis on behaviour of multiobjective evolutionary algorithms to the weighted vertex cover problem. In order to do this, we modify the fitness function that was used in Global SEMO in Kratsch and Neumann (2013), to match the weighted version of the problem. We investigate the multiobjective fitness function f(x)=(Cost(x),LP(x)), where

  • $Cost(x)=\sum_{i=1}^{n}w(v_i)x_i$ is the sum of the weights of the selected vertices

  • LP(x) is the value of an optimal solution of the LP for G(x).

We analyse Global SEMO with this fitness function using the standard mutation operator flipping each bit with probability 1/n. We also investigate Global SEMO using the alternative mutation operator introduced in Kratsch and Neumann (2013) (see Algorithm 2). By this mutation operator, the vertices that are adjacent to uncovered edges are included with probability 1/2 in some steps.

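[Algorithm listings (images in the source): Algorithm 1, Global SEMO; Algorithm 2, the alternative mutation operator; Algorithm 3, DEMO.]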
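
Since the pseudocode of Algorithms 1 and 2 is not spelled out in the text, the following is a minimal sketch rather than the authors' listing. It assumes the usual Global SEMO scheme (pick a parent uniformly at random, mutate, discard the child if it is strongly dominated, otherwise insert it and drop everything it weakly dominates) and an alternative mutation operator as described in the proofs of Section 3.2: with probability 1/2 standard mutation is applied, otherwise bits of non-isolated vertices of G(x) flip with probability 1/2 and all other bits with probability 1/n. LP(x) is computed by brute force over half-integral points, so the sketch only suits very small graphs. All names and the toy instance are hypothetical.

```python
import random
from fractions import Fraction
from itertools import product

HALF_INTEGRAL = (Fraction(0), Fraction(1, 2), Fraction(1))


def uncovered_edges(x, edges):
    """Edges of G(x): those with no endpoint selected by x."""
    return [(i, j) for i, j in edges if not x[i] and not x[j]]


def lp_value(x, edges, weights):
    """LP(x), brute-forced over half-integral assignments (tiny graphs only)."""
    rest = uncovered_edges(x, edges)
    active = sorted({v for e in rest for v in e})  # non-isolated vertices of G(x)
    best = None
    for y in product(HALF_INTEGRAL, repeat=len(active)):
        val = dict(zip(active, y))
        if all(val[i] + val[j] >= 1 for i, j in rest):
            total = sum((weights[v] * val[v] for v in active), Fraction(0))
            if best is None or total < best:
                best = total
    return best


def fitness(x, edges, weights):
    """f(x) = (Cost(x), LP(x))."""
    cost = sum(w for w, bit in zip(weights, x) if bit)
    return (cost, lp_value(x, edges, weights))


def standard_mutation(x, rng):
    """Flip every bit independently with probability 1/n."""
    n = len(x)
    return tuple(bit ^ (rng.random() < 1 / n) for bit in x)


def alternative_mutation(x, edges, rng):
    """Assumed form of the alternative operator (see lead-in)."""
    if rng.random() < 0.5:
        return standard_mutation(x, rng)
    n = len(x)
    active = {v for e in uncovered_edges(x, edges) for v in e}
    return tuple(
        bit ^ (rng.random() < (0.5 if i in active else 1 / n))
        for i, bit in enumerate(x)
    )


def global_semo(edges, weights, iterations, rng, alternative=False):
    n = len(weights)
    start = tuple(rng.randint(0, 1) for _ in range(n))
    population = {start: fitness(start, edges, weights)}
    for _ in range(iterations):
        parent = rng.choice(list(population))
        if alternative:
            child = alternative_mutation(parent, edges, rng)
        else:
            child = standard_mutation(parent, rng)
        fc = fitness(child, edges, weights)
        # Discard the child if some member strongly dominates it.
        if any(all(a <= b for a, b in zip(f, fc)) and f != fc
               for f in population.values()):
            continue
        # Otherwise insert it and drop every member it weakly dominates.
        population = {p: f for p, f in population.items()
                      if not all(a <= b for a, b in zip(fc, f))}
        population[child] = fc
    return population


# Hypothetical toy instance: a path on five vertices.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
weights = [2, 1, 3, 1, 2]
final = global_semo(edges, weights, iterations=500, rng=random.Random(1))
for x, (cost, lp) in sorted(final.items(), key=lambda item: item[1]):
    print(x, cost, lp)
```

Solutions with LP value 0 in the final population are complete vertex covers; the remaining members trade lower cost against a larger uncovered part of the graph.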

In the fitness function used in Global SEMO, both $Cost(x)$ and $LP(x)$ can be exponential with respect to the input size; therefore, we need to deal with exponentially many solutions, even if we only keep the Pareto front. One approach for dealing with this problem is using the concept of $\epsilon$-dominance (Laumanns et al., 2002). The concept of $\epsilon$-dominance has previously been proved to be useful for coping with exponentially large Pareto fronts in some problems (Horoba and Neumann, 2008; Neumann et al., 2011). Given two objective vectors $p=(p_1,\ldots,p_m)$ and $q=(q_1,\ldots,q_m)$, $p$ $\epsilon$-dominates $q$, denoted by $p\preceq_\epsilon q$, if for all $i\in\{1,\ldots,m\}$ we have $p_i\le(1+\epsilon)q_i$. Motivated by this approach, DEMO (Diversity Evolutionary Multiobjective Optimizer) has been investigated in Neumann and Reichel (2008) and Neumann et al. (2011), which we present in Algorithm 3. In this approach, the objective space is partitioned into a polynomial number of boxes in which all solutions $\epsilon$-dominate each other, and at most one solution from each box is kept in the population. Here, we describe the concept of boxes and how we keep one solution for each box in detail, and in Section 4, we analyse DEMO.

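To implement the concept of $\epsilon$-dominance in DEMO, we use the parameter $\delta=\frac{1}{2n}$ and define the boxing function $b:\{0,1\}^n\to\mathbb{N}^2$ as:
$$b_1(x)=\left\lceil\log_{1+\delta}(1+Cost(x))\right\rceil,\qquad b_2(x)=\left\lceil\log_{1+\delta}(1+LP(x))\right\rceil.$$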

The functions b1 and b2 partition the objective space into horizontal and vertical stripes, which we name rows and columns, and the whole boxing function partitions the objective space into boxes. A box can be denoted by B=(a,b), where a and b are values of b1 and b2 for the solutions in that box, respectively.

Note that two boxes B=(a,b) and B'=(a',b') with a=a' and b<b' (or a<a' and b=b') can include search points that do not dominate each other; therefore, we may keep solutions from different boxes with same values of b1 or b2. But if a<a' and b<b', then all search points in B dominate all search points in B'. Hence, we define dominance among boxes as: box B=(a,b) dominates box B'=(a',b'), denoted by B<B', if a<a' and b<b'.

In DEMO only one nondominated solution can be kept in the population for each box, based on a predefined criterion. In our setting, among two solutions $x$ and $y$ from one box, $y$ is kept in $P$ and $x$ is discarded if $Cost(y)+2\cdot LP(y)\le Cost(x)+2\cdot LP(x)$. The reason behind this particular setting is that we aim to work on solutions $x$ under the constraint that $Cost(x)+2\cdot LP(x)\le 2\cdot OPT$, because by only adding vertices to these solutions, it is possible to obtain 2-approximate complete vertex covers.
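
The boxing function and the within-box selection rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes each coordinate of the boxing function is rounded up to the next integer, and it uses exact rational arithmetic to avoid floating-point issues at box boundaries. The function names are hypothetical.

```python
from fractions import Fraction


def box_index(value, delta):
    """Smallest integer k with (1 + delta)^k >= 1 + value.

    This equals ceil(log_{1+delta}(1 + value)); the rounding direction
    is an assumption of this sketch.
    """
    target = 1 + Fraction(value)
    index, power = 0, Fraction(1)
    while power < target:
        power *= 1 + delta
        index += 1
    return index


def box(cost, lp, n):
    """Box (b1, b2) of a solution with the given Cost and LP values."""
    delta = Fraction(1, 2 * n)
    return (box_index(cost, delta), box_index(lp, delta))


def keep_in_box(x_fit, y_fit):
    """Of two solutions in one box, keep y if Cost(y)+2LP(y) <= Cost(x)+2LP(x)."""
    (cost_x, lp_x), (cost_y, lp_y) = x_fit, y_fit
    return y_fit if cost_y + 2 * lp_y <= cost_x + 2 * lp_x else x_fit


# With n = 4 we have delta = 1/8.
print(box(0, 0, 4))               # (0, 0)
print(box(1, Fraction(1, 2), 4))  # (6, 4)
print(keep_in_box((21, Fraction(3)), (22, Fraction(2))))
```

Neighbouring cost values such as 21 and 22 fall into the same row for n = 4, which is how the boxing keeps the number of stored solutions small even when the objective values themselves are large.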

Analysing the runtime of our evolutionary algorithms, we are interested in the expected number of rounds of the repeat loop until a solution of desired quality has been obtained. We call this the expected time until the considered algorithm has achieved its desired goal.

3. Analysis of Global SEMO

In this section, we analyse the expected time of Global SEMO to find good approximations for the weighted vertex cover problem in dependence of the input size and OPT. Before we present our analysis for Global SEMO, we state some basic properties of the solutions in our multi-objective model. The following theorem shown by Balinski (1970) states that all basic feasible solutions of the LP relaxation of the weighted vertex cover, which are the extremal points or the corner solutions of the polyhedron that forms the feasible space, are half-integral.

Theorem 1:

Each basic feasible solution $x$ of the LP relaxation of the weighted vertex cover is half-integral; that is, $x\in\{0,1/2,1\}^n$ (Balinski, 1970).

As a result, there always exists a half-integral optimal LP solution for a vertex cover problem. In several parts of this article, we make use of this result. We establish the following two lemmata which we will use later on in the analysis of our algorithms.

Lemma 2:

For any $x\in\{0,1\}^n$, $LP(x)\le LP(0^n)\le OPT$.

Proof:

Let $y$ be an optimal LP solution with value $LP(0^n)$. The solution $0^n$ contains no vertices; therefore, $y$ is the optimal fractional vertex cover for all edges of the input graph. Thus, for any solution $x$, $y$ is a (possibly nonoptimal) fractional cover for $G(x)$; therefore, $LP(x)\le LP(0^n)$. Moreover, we have $LP(0^n)\le OPT$ as $LP(0^n)$ is the optimal value of the LP relaxation.

Lemma 3:

Let $x=(x_1,\ldots,x_n)$, $x_i\in\{0,1\}$, be a solution and $y=(y_1,\ldots,y_n)$, $y_i\in[0,1]$, be a fractional solution for $G(x)$. If there is a vertex $v_i$ where $y_i\ge\frac{1}{2}$, mutating $x_i$ from 0 to 1 results in a solution $x'$ for which $LP(x')\le LP(x)-y_i\cdot w(v_i)\le LP(x)-\frac{1}{2}w(v_i)$.

Proof:

The graph $G(x')$ is the same as $G(x)$ excluding the edges connected to $v_i$. Therefore, the solution $y'=(y_1,\ldots,y_{i-1},0,y_{i+1},\ldots,y_n)$ is a fractional vertex cover for $G(x')$ and has a cost of $LP(x)-y_iw(v_i)$. The cost of the optimal fractional vertex cover of $G(x')$ is at most as great as the cost of $y'$; thus $LP(x')\le LP(x)-y_iw(v_i)\le LP(x)-\frac{1}{2}w(v_i)$.

3.1. 2-Approximation

We now analyse the runtime behaviour of Global SEMO (Algorithm 1) with the standard mutation operator, in dependence of OPT. We start by giving an upper bound on the population size of Global SEMO.

Lemma 4:

The population size of Algorithm 1 is upper bounded by 2·OPT+1.

Proof:

For any solution $x$ there exists an optimal fractional vertex cover which is half-integral (Theorem 1). Moreover, we are assuming that all the weights are integer values. Therefore, $LP(x)$ can only take $2\cdot LP(0^n)+1$ different values, because $LP(0^n)$ is an upper bound on $LP(x)$ (Lemma 2). For each value of LP, only one solution is in $P$, because Algorithm 1 keeps nondominated solutions only. Therefore, the population size of this algorithm is upper bounded by $2\cdot LP(0^n)+1$, which is at most $2\cdot OPT+1$ due to Lemma 2.

For our analysis, we first consider the expected time of Global SEMO to reach a population which contains the empty set of vertices. Once included, such a solution will never be removed from the population as it is minimal with respect to the cost function.

Lemma 5:

The search point $0^n$ is included in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n))$.

Proof:

From Lemma 4 we know that the population contains at most $2\cdot OPT+1$ solutions. Therefore, at each step, there is a probability of at least $\frac{1}{2\cdot OPT+1}$ that the solution $x_{min}$ is selected, where $Cost(x_{min})=\min_{x\in P}Cost(x)$.

If $Cost(x_{min})>0$, there must be $k\ge 1$ vertices $v_i$ with $x_i=1$ in $x_{min}$. Let $\Delta_t$ be the improvement that happens on the minimum cost in $P$ at step $t$. If all the 1-bits in solution $x_{min}$ flip to zero, at the same step or different steps, the solution $0^n$ will be obtained with $Cost(0^n)=0$, which implies that the expected improvement that flipping a randomly chosen 1-bit makes is $\Delta_t=\frac{Cost(x_{min})}{k}$ at each step $t$. Note that flipping 1-bits always improves the minimum cost and the new solution is added to the population. Moreover, flipping any 0-bit does not improve the minimum cost in the population and $x_{min}$ is not replaced with the new solution in that case.

At each step, with probability at least $\frac{1}{e}$ only one bit flips. With probability $\frac{k}{n}$, the flipping bit is a 1-bit and makes an expected improvement of $\Delta_t=\frac{Cost(x_{min})}{k}$, and with probability $1-\frac{k}{n}$, a 0-bit is flipped with $\Delta_t=0$. We can conclude that the expected improvement of minimum cost, when only one bit of $x_{min}$ flips, is at least
$$\frac{k}{n}\cdot\frac{Cost(x_{min})}{k}=\frac{Cost(x_{min})}{n}.$$
Moreover, the algorithm selects $x_{min}$ and flips only one bit with probability at least $\frac{1}{(2\cdot OPT+1)\cdot e}$; therefore, the expected improvement of minimum cost is bounded from below by
$$E[\Delta_t\mid x_{min}]\ge\frac{Cost(x_{min})}{(2\cdot OPT+1)\cdot e\cdot n}.$$

The maximum value that $Cost(x_{min})$ can take is bounded by $W_{max}\cdot n$, and for any solution $x\neq 0^n$, the minimum value of $Cost(x)$ is at least 1. Using Multiplicative Drift Analysis (Doerr et al., 2012) with $s_0\le W_{max}\cdot n$ and $s_{min}\ge 1$, we can conclude that in expected time $O(OPT\cdot n(\log W_{max}+\log n))$ the solution $0^n$ is included in the population.
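
For readers who want the drift step spelled out, the calculation can be written as follows. The multiplicative drift theorem is used in its standard form: if the expected one-step decrease is at least a $\gamma$-fraction of the current value, the expected hitting time of 0 is at most $(1+\ln(s_0/s_{min}))/\gamma$. The symbol $\gamma$ is introduced here only to avoid a clash with the parameter $\delta$ used for DEMO.

```latex
\begin{align*}
\gamma &= \frac{1}{(2\cdot OPT+1)\cdot e\cdot n},
  \qquad s_0 \le W_{max}\cdot n, \qquad s_{min}\ge 1,\\
E[T] &\le \frac{1+\ln(s_0/s_{min})}{\gamma}
      \le e\cdot n\cdot(2\cdot OPT+1)\cdot\bigl(1+\ln(n\cdot W_{max})\bigr)\\
     &= O\bigl(OPT\cdot n(\log W_{max}+\log n)\bigr).
\end{align*}
```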

We now show that Global SEMO is able to achieve a 2-approximation efficiently as long as OPT is small.

Theorem 6:

The expected number of iterations of Global SEMO until the population $P$ contains a 2-approximation is $O(OPT\cdot n(\log W_{max}+\log n))$.

Proof:

Let $x$ be a solution that minimizes $LP(x)$ under the constraint that $Cost(x)+2\cdot LP(x)\le 2\cdot OPT$. Note that this constraint holds for solution $0^n$ since $LP(0^n)\le OPT$, and according to Lemma 5, solution $0^n$ exists in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n))$.

If $LP(x)=0$, then all edges are covered and $x$ is a 2-approximate vertex cover because we have $Cost(x)+2\cdot LP(x)\le 2\cdot OPT$ as the constraint. Otherwise, some edges are uncovered and any LP solution of $G(x)$ assigns at least $\frac{1}{2}$ to at least one vertex of any uncovered edge. Let $y=(y_1,\ldots,y_n)$ be a basic LP solution for $G(x)$. According to Theorem 1, $y$ is a half-integral solution.

Let $\Delta_t$ be the improvement that happens on the minimum LP value among solutions that fulfil the constraint at time step $t$. Also, let $k$ be the number of vertices that are assigned at least $\frac{1}{2}$ by $y$. Flipping only one of these vertices by the algorithm happens with probability at least $\frac{k}{e\cdot n}$. According to Lemma 3, flipping one of these vertices, $v_i$, results in a solution $x'$ with $LP(x')\le LP(x)-\frac{1}{2}w(v_i)$. Observe that the constraint $Cost(x')+2\cdot LP(x')\le 2\cdot OPT$ holds for solution $x'$. Therefore, $\Delta_t\ge y_i\cdot w(v_i)$, which is in expectation at least $\frac{LP(x)}{k}$ due to the definition of $LP(x)$. Moreover, at each step, the probability that $x$ is selected and only one of the $k$ bits defined above flips is at least $\frac{k}{(2\cdot OPT+1)\cdot e\cdot n}$. As a result we have:
$$E[\Delta_t\mid x]\ge\frac{k}{(2\cdot OPT+1)\cdot e\cdot n}\cdot\frac{LP(x)}{k}=\frac{LP(x)}{en(2\cdot OPT+1)}.$$

According to Lemma 2, for any solution $x$, we have $LP(x)\le OPT$. We also know that for any solution $x$ which is not a complete cover, $LP(x)\ge 1$ because the weights are positive integers. Using the method of Multiplicative Drift Analysis (Doerr et al., 2012) with $s_0\le OPT$ and $s_{min}\ge 1$, in expected time $O(OPT\cdot n\log OPT)$ a solution $y$ with $LP(y)=0$ and $Cost(y)+2\cdot LP(y)\le 2\cdot OPT$ is obtained, which is a 2-approximate vertex cover. Overall, since we have $OPT\le W_{max}\cdot n$, the expected time of finding this solution is $O(OPT\cdot n(\log W_{max}+\log n))$.

3.2. Improved Approximations by Alternative Mutation

In this section, we analyse the expected time of Global SEMO with an alternative mutation operator to find a (1 + ɛ)-approximation.

Lemma 7:

A solution x fulfilling the two properties

  1. $LP(x)=LP(0^n)-Cost(x)$ and

  2. there is an optimal solution of the LP for G(x) which assigns 1/2 to each non-isolated vertex of G(x)

is included in the population of Global SEMO in expected time $O(OPT\cdot n(\log W_{max}+\log n+OPT))$.

Proof:

As the standard mutation occurs with probability 1/2 in the alternative mutation operator, the search point $0^n$, which satisfies property 1, is included in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n))$ using the argument presented in the proof of Lemma 5. Let $P'\subseteq P$ be a set of solutions such that for each solution $x\in P'$, $LP(x)+Cost(x)=LP(0^n)$. Let $x_{min}\in P'$ be a solution such that $LP(x_{min})=\min_{x\in P'}LP(x)$.

If the optimal fractional vertex cover for $G(x_{min})$ assigns 1/2 to each nonisolated vertex of $G(x_{min})$, then the conditions of the lemma hold. Otherwise, it assigns 1 to some nonisolated vertex, say $v$. The probability that the algorithm selects $x_{min}$ and flips the bit corresponding to $v$ is $\Omega\left(\frac{1}{OPT\cdot n}\right)$ because the population size is $O(OPT)$ (Lemma 4). Let $x_{new}$ be the new solution. We have $Cost(x_{new})=Cost(x_{min})+w(v)$, and by Lemma 3, $LP(x_{new})\le LP(x_{min})-w(v)$. This implies that $LP(x_{new})+Cost(x_{new})=LP(0^n)$; hence, $x_{new}$ is a Pareto optimal solution and is added to the population $P$.

Since $LP(x_{min})\le OPT$ (Lemma 2) and the weights are at least 1, assuming that we already have the solution $0^n$ in the population, by means of the method of fitness-based partitions, we find the expected time of finding a solution that fulfils the properties given above as $O(OPT^2\cdot n)$. Since the search point $0^n$ is included in expected time $O(OPT\cdot n(\log W_{max}+\log n))$, the expected time until a solution fulfilling the properties given above is included in $P$ is $O(OPT\cdot n(\log W_{max}+\log n+OPT))$.

We now present the main approximation result for Global SEMO using the alternative mutation operator. The general idea of the analysis given in the following theorem is to partition the nonisolated vertices in $G(x)$ into four subsets, $S_1$, $S_2$, $T_1$, and $T_2$, where $x$ denotes a solution satisfying the two properties given in Lemma 7. The precise definitions of these four subsets are given in the proof of the theorem. For a new vertex cover $x'$ obtained by the alternative mutation operator on $x$, the analysis only considers the probability that all vertices of $S_1$ are chosen and no vertex of $T_1$ is chosen in the new solution $x'$. The quality of $x'$ depends strongly on this property (including all vertices of $S_1$ and no vertices of $T_1$), but it also depends on the vertices chosen from $S_2$ and $T_2$, where the vertices in $S_2$ and $T_2$ are chosen randomly with probability 1/2 by the alternative mutation operator. Thus the analysis finds the expected time until a solution $x'$ with the defined property is found, and the expected ratio of that solution is considered, based on the expectation that half of the vertices in $S_2$ and half of the vertices in $T_2$ are chosen by $x'$. In the following, we first present the formal definition of the sets $S_1$ and $T_1$ and the mentioned property, and then we state the main theorem of this section.

Definition 8:

Let $x$ be a solution that satisfies the two properties given in Lemma 7. Also, let $X$ be the set containing all nonisolated vertices in graph $G(x)$. Moreover, let $S\subseteq X$ be a vertex cover of $G(x)$ with the minimum weight over all vertex covers of $G(x)$, and $T$ be the set containing all nonisolated vertices in $X\setminus S$. For a set of vertices $X'$, we define $Cost(X')=\sum_{v\in X'}w(v)$. Let $OPT'=OPT-Cost(x)$. Let $s_1,\ldots,s_{|S|}$ be a numbering of the vertices in $S$ such that $w(s_i)\ge w(s_{i+1})$ for all $1\le i\le|S|-1$, and let $t_1,\ldots,t_{|T|}$ be a numbering of the vertices in $T$ such that $w(t_i)\ge w(t_{i+1})$ for all $1\le i\le|T|-1$. We define

  • $S_1=\{s_1,s_2,\ldots,s_\rho\}$, where $\rho=\min\{|S|,(1-\epsilon)\cdot OPT'\}$

  • $T_1=\{t_1,t_2,\ldots,t_\eta\}$, where $\eta=\min\{|T|,(1-\epsilon)\cdot OPT'\}$

Property 9 (High-Quality solutions): We say that a solution $x$ has the property of a High-Quality solution if all vertices of $S_1$ are chosen and no vertex of $T_1$ is chosen in $x$.

Theorem 10:

The expected time until Global SEMO has obtained a solution with Property 9 (High-Quality solution) is $O(OPT\cdot 2^{\min\{n,2(1-\epsilon)OPT\}}+OPT\cdot n(\log W_{max}+\log n+OPT))$. Moreover, the obtained solution has expected approximation ratio $(1+\epsilon)$.

Proof:

By Lemma 7, a solution $x$ that satisfies the two properties given in Lemma 7 is included in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n+OPT))$. Let $X$, $S$, $T$, $\rho$, $\eta$, and $Cost(X')$ for a vertex set $X'$ be as defined in Definition 8. Due to property 2 of Lemma 7, $\frac{1}{2}Cost(S)+\frac{1}{2}Cost(T)=LP(x)\le Cost(S)$; therefore, $Cost(T)\le Cost(S)$. Also, let $OPT'$ be as defined in Definition 8. Observe that $OPT'=Cost(S)$, because $S$ is the minimum vertex cover of $G(x)$.

With probability $\Omega\left(\frac{1}{OPT}\right)$, the algorithm Global SEMO selects the solution $x$ and sets $b=1$ in the Alternative Mutation Operator. With $b=1$, the probability that the bits corresponding to all vertices of $S_1$ are flipped is $\Omega\left(\left(\frac{1}{2}\right)^{\rho}\right)$, and the probability that none of the bits corresponding to the vertices of $T_1$ are flipped is $\Omega\left(\left(\frac{1}{2}\right)^{\eta}\right)$. Also, the bits corresponding to the isolated vertices of $G(x)$ are flipped with probability $\frac{1}{n}$ by the Alternative Mutation Operator; hence, the probability that none of them flips is $\Omega(1)$. As a result, with probability $\Omega\left(\frac{1}{OPT}\cdot\left(\frac{1}{2}\right)^{\rho+\eta}\right)$, solution $x$ is selected, the vertices of $S_1$ are included, and the vertices of $T_1$ and the isolated vertices are not included in the new solution $x'$. Due to the definition of Property 9, $x'$ has the property of a High-Quality solution. Since $\rho+\eta\le 2(1-\epsilon)\cdot OPT'\le 2(1-\epsilon)\cdot OPT$, and also $\rho+\eta\le n$, the expected time until solution $x'$ is found after reaching solution $x$ is $O(OPT\cdot 2^{\min\{n,2(1-\epsilon)OPT\}})$.
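
The step from the success probability to the expected waiting time is the usual geometric-distribution argument; written out, with $c>0$ denoting the constant hidden in the $\Omega$-notation (a symbol introduced only for this illustration):

```latex
\begin{align*}
p &\ge \frac{c}{OPT}\cdot\Bigl(\frac{1}{2}\Bigr)^{\rho+\eta}
  &&\text{(success probability of a single iteration)}\\
E[\text{waiting time}] &\le \frac{1}{p}
   \le \frac{1}{c}\cdot OPT\cdot 2^{\rho+\eta}
  &&\text{(expectation of a geometric random variable)}\\
 &= O\bigl(OPT\cdot 2^{\min\{n,\,2(1-\epsilon)OPT\}}\bigr)
  &&\text{(since } \rho+\eta\le\min\{n,\,2(1-\epsilon)\cdot OPT\}\text{)}.
\end{align*}
```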

Now we show that the second statement of the theorem holds. Note that the bits corresponding to vertices of $S_2=S\setminus S_1$ and $T_2=T\setminus T_1$ are arbitrarily flipped in solution $x'$ with probability 1/2 by the Alternative Mutation Operator. Here, we show that for the expected cost and the LP value of $x'$, the following constraint holds: $E[Cost(x')]+2\cdot LP(x')\le(1+\epsilon)\cdot OPT$.

Let $S'\subseteq S$ and $T'\subseteq T$ denote the subsets of vertices of $S$ and $T$ that are actually included in the new solution $x'$, respectively. In the following, we show that for the expected values of $Cost(S')$ and $Cost(T')$, we have:
$$E[Cost(S')]\ge(1-\epsilon)\cdot OPT'+E[Cost(T')].\tag{1}$$
Since the bits corresponding to the vertices of $S_2$ and $T_2$ are flipped with probability 1/2, for the expected values of $Cost(S')$ and $Cost(T')$ we have:
$$E[Cost(S')]=Cost(S_1)+\frac{Cost(S_2)}{2}=Cost(S_1)+\frac{Cost(S)-Cost(S_1)}{2}=\frac{1}{2}Cost(S)+\frac{1}{2}Cost(S_1)$$
and
$$E[Cost(T')]=\frac{1}{2}Cost(T_2).$$
If $\rho=|S|$, then $S_1=S$ and $Cost(S_1)=Cost(S)=OPT'$. If $\rho=(1-\epsilon)\cdot OPT'$, we have $Cost(S_1)\ge(1-\epsilon)\cdot OPT'$ since each vertex has a weight of at least 1. Using $Cost(S)=OPT'$ and the inequality above, we have
$$E[Cost(S')]\ge(1-\epsilon)\cdot OPT'+\frac{\epsilon\cdot OPT'}{2}.$$
We divide the analysis into two cases based on the relation between η and |T|.

Case (I). $\eta=|T|$. Then $T_2=T'=\emptyset$. Thus, $E[Cost(T')]=0$ and Inequality (1) holds true.

Case (II). $\eta=(1-\epsilon)\cdot OPT'<|T|$. Since $w(t_i)\ge w(t_{i+1})$ for $1\le i\le|T|-1$ and $Cost(T)\le Cost(S)=OPT'$, we have
$$Cost(T_2)\le\frac{|T|-\eta}{|T|}\,Cost(T)\le\frac{OPT'-(1-\epsilon)\cdot OPT'}{OPT'}\,Cost(T)\le\epsilon\cdot Cost(S)=\epsilon\cdot OPT'.$$
Thus, for the expected value of $Cost(T')$, we have
$$E[Cost(T')]=\frac{1}{2}Cost(T_2)\le\frac{\epsilon\cdot OPT'}{2}.$$
Summarizing the above analysis, we get that Inequality (1) holds. In the following, using Inequality (1), we prove that, in expectation, the new solution $x'$ satisfies the inequality $Cost(x')+2\cdot LP(x')\le(1+\epsilon)\cdot OPT$.
$$\begin{aligned}E[Cost(x')]+2\cdot LP(x')&=Cost(x)+E[Cost(S')]+E[Cost(T')]+2\cdot LP(x')\\&\le Cost(x)+E[Cost(S')]+E[Cost(S')]-(1-\epsilon)\cdot OPT'+2\cdot LP(x')\\&\le Cost(x)+2E[Cost(S')]-(1-\epsilon)\cdot OPT'+2\cdot(OPT'-E[Cost(S')])\\&=Cost(x)+(1+\epsilon)\cdot OPT'\\&=Cost(x)+(1+\epsilon)\cdot(OPT-Cost(x))\\&\le(1+\epsilon)\cdot OPT.\end{aligned}$$
The third inequality holds because the set S1 chosen by x is a subset of the optimal solution for G(x).

Now we analyse whether the new solution $x'$ could be included in the population $P$. If $x'$ could not be included in $P$, then there is a solution $x''$ dominating $x'$; that is, $LP(x'')\le LP(x')$ and $Cost(x'')\le Cost(x')$. This implies $Cost(x'')+2\cdot LP(x'')<Cost(x')+2\cdot LP(x')\le(1+\epsilon)\cdot OPT$. Therefore, after having a solution that fulfils the properties of Lemma 7 in $P$, in expected time $O(OPT\cdot 2^{\min\{n,2(1-\epsilon)OPT\}})$, the population would contain a solution $y$ such that $Cost(y)+2\cdot LP(y)\le(1+\epsilon)\cdot OPT$.

Let $P'$ contain all solutions $x\in P$ such that $Cost(x)+2\cdot LP(x)\le(1+\epsilon)\cdot OPT$, and let $x_{min}$ be the one that minimizes LP. With a proof similar to that of Theorem 6, it is possible to show that at each step, in expectation, $LP(x_{min})$ improves by at least $\frac{LP(x_{min})}{en(2\cdot OPT+1)}$. Using Multiplicative Drift Analysis, we get the expected time $O(OPT\cdot n\log OPT)$ to find a solution $y$ for which $LP(y)=0$ and $Cost(y)+2\cdot LP(y)\le(1+\epsilon)\cdot OPT$.

Overall, the expected number of iterations of Global SEMO with alternative mutation operator, for getting a weighted vertex cover with expected approximation ratio $(1+\epsilon)$, is bounded by $O(OPT\cdot 2^{\min\{n,2(1-\epsilon)OPT\}}+OPT\cdot n(\log W_{max}+\log n+OPT))$.

4. Analysis of DEMO

Due to Lemma 4, with Global SEMO, the population size is upper bounded by $O(OPT)$, which can be exponential in terms of the input size. In this section, we analyse the other evolutionary algorithm, DEMO (Algorithm 3), which uses a diversity handling mechanism for dealing with exponentially large population sizes. The following lemmata are used in the proof of Theorem 14.

Lemma 11:

Let $W_{max}$ be the maximum weight assigned to a vertex. The population size of DEMO is upper bounded by $O(n\cdot(\log n+\log W_{max}))$.

Proof:
The values that can be taken by $b_1$ are integer values between 0 and $\lceil\log_{1+\delta}(1+Cost(1^n))\rceil$ and the values that can be taken by $b_2$ are integer values between 0 and $\lceil\log_{1+\delta}(1+LP(0^n))\rceil$ (Lemma 2). Since $n\cdot W_{max}$ is an upper bound for both $Cost(1^n)$ and $LP(0^n)$, the number of rows and also the number of columns are bounded by
$$k=1+\left\lceil\log_{1+\delta}(1+n\cdot W_{max})\right\rceil\le 1+\left\lceil\frac{\log(1+n\cdot W_{max})}{\log(1+\delta)}\right\rceil=O(n\cdot(\log n+\log W_{max})).$$
The last equality holds because $\delta=\frac{1}{2n}$.

We here show that the size of the population is $P_{size}\le 2k-1$. Since the dominated solutions according to $f$ are discarded by the algorithm, none of the solutions in $P$ can be located in a box that is dominated by another box that contains a solution in $P$. Moreover, at most one solution from each box is kept in the population; therefore, $P_{size}$ is at most the maximum number of boxes where none of them dominates another.

Let $k_1$ be the number of boxes that contain a solution of $P$ in the first column. Let $r_1$ be the smallest row number among these boxes. Observe that $r_1\le k-k_1+1$ and the equality holds when the boxes are from rows $k$ down to $k-k_1+1$. Any box in the second column with a row number of $r_1+1$ or above is dominated by the box of the previous column and row $r_1$. Therefore, the maximum row number for a box in the second column that is not dominated is $r_1\le k-k_1+1$. Generalizing the idea, the maximum row number for a box in column $i$ that is not dominated is $r_{i-1}\le k-k_1-\cdots-k_{i-1}+i-1$, where for $1\le j\le k$, $k_j$ is the number of boxes that contain a solution of $P$ in column $j$.

The last column has $k_k\le r_{k-1}$ boxes, which gives us:
$$k_k\le r_{k-1}\le k-k_1-\cdots-k_{k-1}+k-1.$$
This implies that
$$k_1+\cdots+k_k\le 2k-1,$$
which completes the proof.
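
The bound $2k-1$ on the number of mutually nondominated boxes can be sanity-checked by exhaustive search on very small grids. The sketch below is an illustration only; the function names are hypothetical and the search is exponential in $k^2$, so it is meant for $k\le 3$.

```python
from itertools import combinations


def box_dominates(b, c):
    """Box b = (a, b) dominates box c = (a', b') if a < a' and b < b'."""
    return b[0] < c[0] and b[1] < c[1]


def max_nondominated_boxes(k):
    """Largest set of boxes in a k x k grid in which no box dominates another."""
    boxes = [(row, col) for row in range(k) for col in range(k)]
    for size in range(len(boxes), 0, -1):
        for subset in combinations(boxes, size):
            if not any(box_dominates(p, q) for p in subset for q in subset):
                return size
    return 0


for k in (1, 2, 3):
    print(k, max_nondominated_boxes(k), 2 * k - 1)
```

For these grid sizes the exhaustive maximum coincides with $2k-1$; one maximal family consists of the first row together with the first column.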
Lemma 12:

The search point $x_z=0^n$ is included in the population in expected time $O(n^3(\log n+\log W_{max})^2)$.

Proof:

From Lemma 11 we know that the population contains $P_{size}=O(n\cdot(\log n+\log W_{max}))$ solutions. Therefore, at each step, there is a probability of at least $\frac{1}{P_{size}}$ that the solution $x_{min}$ is selected, where $b_1(x_{min})=\min_{x\in P}b_1(x)$.

If $b_1(x_{min})=0$, we have $Cost(x_{min})=0$, which means $x_{min}=0^n$ since the weights are greater than 0.

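If $b_1(x_{min})\neq 0$, there must be at least one vertex $v_i$ with $x_i=1$ in $x_{min}$. Let $v_j$ be the vertex that maximizes $w(v_i)$ among the vertices $v_i$ with $x_i=1$. If $Cost(x_{min})=C$, then $w(v_j)\ge\frac{C}{n}$, because $n$ is an upper bound on the number of vertices selected by $x_{min}$. As a result, removing vertex $v_j$ from solution $x_{min}$ results in a solution $x'$ for which $Cost(x')\le C\cdot\left(1-\frac{1}{n}\right)$. Using this bound on $Cost(x')$, we have
$$\begin{aligned}(1+\delta)(1+Cost(x'))&\le 1+\delta+C\left(1-\frac{1}{n}\right)(1+\delta)\\&=1+\delta+C+C\left(\delta-\frac{1}{n}-\frac{\delta}{n}\right)\\&\le 1+C\delta+C+C\left(\delta-\frac{1}{n}-\frac{\delta}{n}\right)\\&=1+C+C\left(2\delta-\frac{1}{n}-\frac{\delta}{n}\right)\\&\le 1+C.\end{aligned}$$
The inequality that replaces $\delta$ by $C\delta$ holds because $C\ge 1$, and the last one holds because $\delta=\frac{1}{2n}$. From $(1+\delta)(1+Cost(x'))\le 1+C$ we can observe that
$$1+\log_{1+\delta}(1+Cost(x'))\le\log_{1+\delta}(1+C),$$
which implies $b_1(x')\le b_1(x_{min})-1$. Note that $x'$ is obtained by performing a 1-bit flip on $x_{min}$, which happens at each step with a probability of at least
$$\frac{1}{P_{size}}\cdot\frac{1}{n}\cdot\left(1-\frac{1}{n}\right)^{n-1}=\Omega\left(\frac{1}{n(\log n+\log W_{max})}\cdot\frac{1}{n}\right).$$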

Therefore, in expected time of at most $O(n^2(\log n+\log W_{max}))$ the new solution $x'$ is obtained, which is accepted by the algorithm because it is placed in a box with a smaller value of $b_1$ than all solutions in $P$ and hence not dominated. There are $O(n(\log n+\log W_{max}))$ different values for $b_1$; therefore, the solution $x_z=0^n$ with $b_1(x_z)=0$ is found in expected time of at most $O(n^3(\log n+\log W_{max})^2)$.
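
The key inequality of this proof, $(1+\delta)\left(1+C\left(1-\frac{1}{n}\right)\right)\le 1+C$ for integer $C\ge 1$ and $\delta=\frac{1}{2n}$, can also be checked exactly with rational arithmetic. The ranges below are arbitrary choices for this illustration, and the helper name is hypothetical.

```python
from fractions import Fraction


def lhs(C, n):
    """(1 + delta) * (1 + C * (1 - 1/n)) with delta = 1/(2n), computed exactly."""
    delta = Fraction(1, 2 * n)
    return (1 + delta) * (1 + C * (1 - Fraction(1, n)))


# Collect every (C, n) pair in the tested range that violates lhs <= 1 + C.
violations = [
    (C, n)
    for n in range(1, 40)
    for C in range(1, 200)
    if lhs(C, n) > 1 + C
]
print(violations)  # []
```

Algebraically, the difference between the two sides is $\delta-C\delta\left(1+\frac{1}{n}\right)$, which is nonpositive whenever $C\ge 1$.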

Lemma 13:

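Let $x \in P$ be a search point such that $\mathrm{Cost}(x) + 2 \cdot \mathrm{LP}(x) \le 2 \cdot \mathrm{OPT}$ and $b_2(x) > 0$. There exists a 1-bit flip leading to a search point $x'$ with $\mathrm{Cost}(x') + 2 \cdot \mathrm{LP}(x') \le 2 \cdot \mathrm{OPT}$ and $b_2(x') < b_2(x)$.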

Proof:
Let $y = \{y_1, \ldots, y_n\}$ be a basic half-integral LP solution for $G(x)$. Since $b_2(x) > 0$, we have $\mathrm{LP}(x) \ne 0$, so there must be at least one uncovered edge; hence, at least one vertex $v_i$ has $y_i \ge \frac{1}{2}$ in the LP solution $y$. Let $v_j$ be the vertex that maximizes $y_i w(v_i)$ among the vertices $v_i$, $1 \le i \le n$. Also, let $x'$ be the solution obtained by adding $v_j$ to $x$. Since the solutions $x$ and $x'$ differ only in the vertex $v_j$, we have $\mathrm{Cost}(x') = \mathrm{Cost}(x) + w(v_j)$. Moreover, according to Lemma 3, $\mathrm{LP}(x') \le \mathrm{LP}(x) - \frac{1}{2} \cdot w(v_j)$. Therefore,
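$$\mathrm{Cost}(x') + 2 \cdot \mathrm{LP}(x') \le \mathrm{Cost}(x) + w(v_j) + 2\left(\mathrm{LP}(x) - \frac{w(v_j)}{2}\right) \le \mathrm{Cost}(x) + 2 \cdot \mathrm{LP}(x) \le 2 \cdot \mathrm{OPT},$$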
which means solution $x'$ fulfils the mentioned constraint. If $\mathrm{LP}(x) = W$, then $y_j w(v_j) \ge \frac{W}{n}$, because $n$ is an upper bound on the number of vertices selected by the LP solution. As a result, using Lemma 3, we get $\mathrm{LP}(x') \le W \cdot \left(1 - \frac{1}{n}\right)$. Therefore, with an analysis similar to that in the proof of Lemma 12, we get:
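$$(1+\delta)\left(1+\mathrm{LP}(x')\right) \le 1 + \delta + W\left(1-\frac{1}{n}\right)(1+\delta) \le 1 + W.$$
This inequality implies
$$1 + \log_{1+\delta}(1+\mathrm{LP}(x')) \le \log_{1+\delta}(1+W).$$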
As a result, b2(x')<b2(x) holds for x', which is obtained by performing a 1-bit flip on x, and the lemma is proved.
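
For completeness, the chain behind the inequality $(1+\delta)(1+\mathrm{LP}(x')) \le 1 + W$ can be written out as follows. This is a sketch of the omitted steps, assuming $\delta = \frac{1}{2n}$ and $W \ge 1$; the latter holds for positive integer weights, since an uncovered edge $\{u, v\}$ forces $y_u + y_v \ge 1$ and hence an LP value of at least 1.

```latex
\begin{align*}
(1+\delta)\bigl(1+\mathrm{LP}(x')\bigr)
  &\le (1+\delta)\Bigl(1 + W\bigl(1-\tfrac{1}{n}\bigr)\Bigr) \\
  &= 1 + \delta + W + W\Bigl(\delta - \tfrac{1}{n} - \tfrac{\delta}{n}\Bigr) \\
  &\le 1 + W + W\Bigl(2\delta - \tfrac{1}{n} - \tfrac{\delta}{n}\Bigr)
     && \text{(using } \delta \le W\delta \text{ for } W \ge 1\text{)} \\
  &= 1 + W - \frac{W}{2n^2}
     && \text{(using } \delta = \tfrac{1}{2n}\text{)} \\
  &\le 1 + W .
\end{align*}
```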
Theorem 14:

The expected time until DEMO constructs a 2-approximate vertex cover is $O(n^3 \cdot (\log n + \log W_{\max})^2)$.

Proof:

Consider the solution $x \in P$ that minimizes $b_2(x)$ under the constraint that $\mathrm{Cost}(x) + 2 \cdot \mathrm{LP}(x) \le 2 \cdot \mathrm{OPT}$. Note that $0^n$ fulfils this constraint and, according to Lemma 12, the solution $0^n$ will be included in $P$ in expected time $O(n^3(\log n + \log W_{\max})^2)$.

If $b_2(x) = 0$, then $x$ covers all edges and by the selection of $x$ we have $\mathrm{Cost}(x) \le 2 \cdot \mathrm{OPT}$, which means that $x$ is a 2-approximation.

In case $b_2(x) \ne 0$, according to Lemma 13 there is a 1-bit flip on $x$ that results in a new solution $x'$ for which $b_2(x') < b_2(x)$, while the mentioned constraint also holds for it. Since the population size is $O(n \cdot (\log n + \log W_{\max}))$ (Lemma 10), this 1-bit flip happens with a probability of $\Omega(n^{-2} \cdot (\log n + \log W_{\max})^{-1})$, and $x'$ is obtained in expected time $O(n^2 \cdot (\log n + \log W_{\max}))$. This new solution will be added to $P$ because a solution $y$ with $\mathrm{Cost}(y) + 2 \cdot \mathrm{LP}(y) > 2 \cdot \mathrm{OPT}$ cannot dominate $x'$ with $\mathrm{Cost}(x') + 2 \cdot \mathrm{LP}(x') \le 2 \cdot \mathrm{OPT}$, and $x'$ has the minimum value of $b_2$ among solutions that fulfil the constraint. Moreover, if there already is a solution $x_{\mathrm{prev}}$ in the same box as $x'$, it will be replaced by $x'$ because $\mathrm{Cost}(x_{\mathrm{prev}}) + 2 \cdot \mathrm{LP}(x_{\mathrm{prev}}) > 2 \cdot \mathrm{OPT}$; otherwise, it would have been selected as $x$.

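There are at most $A = 1 + \frac{\log n + \log W_{\max}}{\log(1+\delta)}$ different values for $b_2$ in the objective space, and since $\delta = \frac{1}{2n}$, $A = O(n \cdot (\log n + \log W_{\max}))$. Therefore, the expected time until a solution $x''$ is found so that $b_2(x'') = 0$ and $\mathrm{Cost}(x'') + 2 \cdot \mathrm{LP}(x'') \le 2 \cdot \mathrm{OPT}$ is at most $O(n^3 \cdot (\log n + \log W_{\max})^2)$.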
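
The step from $A$ to $O(n \cdot (\log n + \log W_{\max}))$ rests on $\ln(1+\delta) \ge \delta/2$ for $0 < \delta \le 1$, which gives $\frac{1}{\log(1+\delta)} \le 4n$ for $\delta = \frac{1}{2n}$ when both logarithms use the same base. The following sketch evaluates both sides numerically; the function names are hypothetical and natural logarithms are used as an assumption.

```python
import math


def num_box_values(n, w_max):
    """A = 1 + (log n + log W_max) / log(1 + delta) with delta = 1 / (2n)."""
    delta = 1.0 / (2 * n)
    return 1 + (math.log(n) + math.log(w_max)) / math.log1p(delta)


def linear_bound(n, w_max):
    """Upper bound 1 + 4n (log n + log W_max), using ln(1 + d) >= d / 2 for 0 < d <= 1."""
    return 1 + 4 * n * (math.log(n) + math.log(w_max))


# Print A next to the explicit bound for a few instance sizes.
for n in (2, 10, 100, 1000):
    for w_max in (1, 10, 10**6):
        print(n, w_max, round(num_box_values(n, w_max), 1), round(linear_bound(n, w_max), 1))
```

Multiplying this number of box values by the $O(n^2 \cdot (\log n + \log W_{\max}))$ waiting time per improving step yields the bound stated in the theorem.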

5. Diverse Population-Based EA

In this section, we introduce a population-based algorithm (see Algorithm 4) that keeps, for each $k$ with $0 \le k \le n$, at most two solutions. This implies that the population size is upper bounded by $2n$. The two solutions kept in the population are chosen according to different weightings of the cost and the LP-value. For each solution $x$, let $|x|_1$ be the number of selected vertices in $x$. Algorithm 4 keeps a new solution $x'$ in the population if it minimizes $\mathrm{Cost}(z) + \mathrm{LP}(z)$ or $\mathrm{Cost}(z) + 2 \cdot \mathrm{LP}(z)$ among the solutions $z \in P$ with $|z|_1 = |x'|_1$. Algorithm 4 gives a detailed description.
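
The acceptance rule described above can be sketched as follows. This is only an illustration of the rule as stated in this paragraph, not a reproduction of Algorithm 4: the data layout, the names `try_insert` and `population_size`, and the tie-breaking in favour of the new solution are assumptions, and the cost and LP values are supplied by the caller as functions.

```python
def try_insert(population, x, cost, lp_value):
    """Offer solution x (a tuple of 0/1 bits) to the population.

    population maps a number of one-bits k to a dict with up to two entries,
    one per weighting of cost and LP value. Returns True if x was stored.
    """
    k = sum(x)
    slots = population.setdefault(k, {})
    accepted = False
    for name, factor in (("cost+lp", 1), ("cost+2lp", 2)):
        score = cost(x) + factor * lp_value(x)
        incumbent = slots.get(name)
        # Assumption: ties are resolved in favour of the new solution.
        if incumbent is None or score <= cost(incumbent) + factor * lp_value(incumbent):
            slots[name] = x
            accepted = True
    return accepted


def population_size(population):
    """Count the distinct stored solutions; there are at most two per level k."""
    return sum(len(set(slots.values())) for slots in population.values())
```

Because every level $k$ stores at most one solution per weighting, the population never holds more than two solutions with the same number of selected vertices.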

Taking into account that the population size is upper bounded by 2n and considering in each step an individual with the smallest number of ones in the population for mutation, one can obtain the following lemma by standard fitness level arguments.

Lemma 15:

The search point $0^n$ is included in the population in expected time $O(n^2 \log n)$.
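
The fitness level argument can be made explicit as follows. This is a sketch under the assumptions that parents are selected uniformly at random from a population of size at most $2n$ and that standard bit mutation with rate $1/n$ is used. If the solution with the fewest one-bits has $i \ge 1$ ones, it is selected with probability at least $\frac{1}{2n}$, and flipping exactly one of its $i$ one-bits produces a solution with $i - 1$ ones, which is accepted because no solution with that few ones is in the population.

```latex
\begin{align*}
E[T] &\le \sum_{i=1}^{n} \left( \frac{1}{2n} \cdot \frac{i}{n} \left(1 - \frac{1}{n}\right)^{n-1} \right)^{-1}
      \le \sum_{i=1}^{n} \frac{2 e n^2}{i}
      = 2 e n^2 H_n
      = O(n^2 \log n),
\end{align*}
```

where $H_n$ denotes the $n$-th harmonic number and $\left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{1}{e}$ is used in the second step.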

To show the main result for Diverse Population-Based EA, we will use the following lemma.

Lemma 16:

A solution x fulfilling the two properties

  1. $\mathrm{LP}(x) = \mathrm{LP}(0^n) - \mathrm{Cost}(x)$ and

  2. there is an optimal solution of the LP for G(x) which assigns 1/2 to each non-isolated vertex of G(x)

is included in the population of the Diverse Population-Based EA in expected time $O(n^3)$.

Proof:

By Lemma 15, the solution $0^n$, which satisfies property 1 given above, is contained in the population in expected time $O(n^2 \log n)$. Let $P' \subseteq P$ be the set of all solutions in $P$ that satisfy property 1.

Let $x_{\max}$ be the solution of $P'$ with the maximal number of 1-bits. If the optimal fractional vertex cover for $G(x_{\max})$ assigns $1/2$ to each non-isolated vertex of $G(x_{\max})$, then the second property also holds. If the optimal fractional vertex cover for $G(x_{\max})$ assigns 1 to some non-isolated vertex, say $v$, then the algorithm selects $x_{\max}$ and flips exactly the bit corresponding to $v$ with probability $\Omega(1/n^2)$. Let $x'$ be the new solution. By the choice of $x_{\max}$, we know that $x'$ is the only solution with $|x_{\max}|_1 + 1$ one-bits; hence, it is added to $P$.

Since the maximum value of $|x|_1$ is $n$, after expected time $O(n^3)$ there is a solution in the population that fulfils the properties given in the lemma.

We now show the main result for the Diverse Population-Based EA.

Theorem 17:

The expected time until the Diverse Population-Based EA has obtained a solution with approximation ratio $(1+\varepsilon)$ is $O(n \cdot 2^{\min\{n,\, 2(1-\varepsilon)\mathrm{OPT}\}} + n^3)$.

Proof:

By Lemma 16 we know that after expected time $O(n^3)$, there is a solution $x$ in the population that fulfils the properties given in that lemma. With an analysis similar to that of Theorem 9, we can show that a solution $x$ with $\mathrm{Cost}(x) + 2 \cdot \mathrm{LP}(x) \le (1+\varepsilon) \cdot \mathrm{OPT}$ is produced in expected time $O(n \cdot 2^{\min\{n,\, 2(1-\varepsilon)\mathrm{OPT}\}} + n^3)$.

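Now we consider whether solution $x$ is added to the population $P$. If $x$ could not be added to $P$, then there exists a solution $y \in P$ such that $|y|_1 = |x|_1$ and $\mathrm{Cost}(y) + 2 \cdot \mathrm{LP}(y) \le \mathrm{Cost}(x) + 2 \cdot \mathrm{LP}(x)$. Thus, the population already includes a solution $y$ such that $\mathrm{Cost}(y) + 2 \cdot \mathrm{LP}(y) \le (1+\varepsilon) \cdot \mathrm{OPT}$.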

Let $P'$ be the set of all solutions $x \in P$ such that $\mathrm{Cost}(x) + 2 \cdot \mathrm{LP}(x) \le (1+\varepsilon) \cdot \mathrm{OPT}$. Let $x_{\max} \in P'$ be such that $|x_{\max}|_1 = \max_{x \in P'} |x|_1$.

If $\mathrm{LP}(x_{\max}) = 0$, then solution $x_{\max}$ is a vertex cover for graph $G$. If $\mathrm{LP}(x_{\max}) > 0$, we construct a $(1+\varepsilon)$-approximate vertex cover from $x_{\max}$ as follows. There exists at least one vertex $v$ to which an optimal fractional vertex cover for $G(x_{\max})$ assigns a value of at least $1/2$. The algorithm selects the solution $x_{\max}$ and flips exactly the bit corresponding to the vertex $v$ with probability $\Omega(1/n^2)$. Let $y$ be the new solution. We have
$$\mathrm{Cost}(y) + 2 \cdot \mathrm{LP}(y) \le \mathrm{Cost}(x_{\max}) + 2 \cdot \mathrm{LP}(x_{\max}) \le (1+\varepsilon) \cdot \mathrm{OPT}.$$

Suppose that $y$ could not be included in $P$. Then there exists a solution $y'$ in $P$ such that $|y'|_1 = |y|_1$ and $2 \cdot \mathrm{LP}(y') + \mathrm{Cost}(y') \le 2 \cdot \mathrm{LP}(y) + \mathrm{Cost}(y) \le (1+\varepsilon) \cdot \mathrm{OPT}$, which contradicts the assumption that $|x_{\max}|_1 = \max_{x \in P'} |x|_1$. Therefore, solution $y$ is included in $P$.

Observe that for any solution $x$, if $|x|_1 = n$, then $\mathrm{LP}(x) = 0$. Thus, after expected time at most $O(n^3)$, the population $P$ includes a solution $y$ such that $\mathrm{Cost}(y) + 2 \cdot \mathrm{LP}(y) \le (1+\varepsilon) \cdot \mathrm{OPT}$ and $\mathrm{LP}(y) = 0$, which is a $(1+\varepsilon)$-approximate weighted vertex cover.

Overall, the expected time in which the Diverse Population-Based EA finds a $(1+\varepsilon)$-approximate weighted vertex cover is bounded by $O(n \cdot 2^{\min\{n,\, 2(1-\varepsilon)\mathrm{OPT}\}} + n^3)$.

6. Conclusion

The minimum vertex cover problem is one of the classical NP-hard combinatorial optimization problems. In this article, we have generalized previous results of Kratsch and Neumann (2013) for the unweighted minimum vertex cover problem to the weighted case, in which weights on the vertices are given in addition. Based on the conference version of this article (Pourhassan et al., 2016), in Sections 3.2 and 4, we have investigated Global SEMO with the alternative mutation operator for finding a $(1+\varepsilon)$-approximation, and studied the algorithm DEMO using the $\varepsilon$-dominance approach, showing that it reaches a 2-approximation in expected polynomial time. Furthermore, in this article we have shown that Global SEMO with the standard mutation operator efficiently computes a 2-approximation as long as the value of an optimal solution is small. We have also presented a population-based approach with a specific diversity mechanism that reaches a $(1+\varepsilon)$-approximation in expected time $O(n \cdot 2^{\min\{n,\, 2(1-\varepsilon)\mathrm{OPT}\}} + n^3)$.

Acknowledgments

This research has been supported by Australian Research Council grants DP140103400 and DP160102401, and National Natural Science Foundation of China grant 61802441.

References

Auger, A., and Doerr, B. (2011). Theory of randomized search heuristics: Foundations and recent developments. Singapore: World Scientific Publishing.

Balinski, M. (1970). On the maximum matching, minimum covering. In Proceedings of the Symposium on Mathematics and Programming, pp. 434–445.

Doerr, B., Johannsen, D., and Winzen, C. (2012). Multiplicative drift analysis. Algorithmica, 64(4):673–697.

Downey, R. G., and Fellows, M. R. (1999). Parameterized complexity. New York: Springer.

Friedrich, T., He, J., Hebbinghaus, N., Neumann, F., and Witt, C. (2009). Analyses of simple hybrid algorithms for the vertex cover problem. Evolutionary Computation, 17(1):3–19.

Friedrich, T., He, J., Hebbinghaus, N., Neumann, F., and Witt, C. (2010). Approximating covering problems by randomized search heuristics using multi-objective models. Evolutionary Computation, 18(4):617–633.

Hochbaum, D. S. (1983). Efficient bounds for the stable set, vertex cover and set packing problems. Discrete Applied Mathematics, 6(3):243–254.

Horoba, C., and Neumann, F. (2008). Benefits and drawbacks for the use of epsilon-dominance in evolutionary multi-objective optimization. In Proceedings of the Conference on Genetic and Evolutionary Computation (GECCO), pp. 641–648.

Jansen, T. (2013). Analyzing evolutionary algorithms: The computer science perspective. Natural Computing Series. New York: Springer.

Jansen, T., Oliveto, P. S., and Zarges, C. (2013). Approximating vertex cover using edge-based representations. In Proceedings of the Workshop on Foundations of Genetic Algorithms, pp. 87–96.

Kratsch, S., Lehre, P. K., Neumann, F., and Oliveto, P. S. (2010). Fixed parameter evolutionary algorithms and maximum leaf spanning trees: A matter of mutation. In Proceedings of the Conference on Parallel Problem Solving from Nature, pp. 204–213.

Kratsch, S., and Neumann, F. (2013). Fixed-parameter evolutionary algorithms and the vertex cover problem. Algorithmica, 65(4):754–771.

Laumanns, M., Thiele, L., Deb, K., and Zitzler, E. (2002). Combining convergence and diversity in evolutionary multiobjective optimization. Evolutionary Computation, 10(3):263–282.

Neumann, F., and Reichel, J. (2008). Approximating minimum multicuts by evolutionary multi-objective algorithms. In Proceedings of the Conference on Parallel Problem Solving from Nature, pp. 72–81.

Neumann, F., Reichel, J., and Skutella, M. (2011). Computing minimum cuts by randomized search heuristics. Algorithmica, 59(3):323–342.

Neumann, F., and Witt, C. (2010). Bioinspired computation in combinatorial optimization: Algorithms and their computational complexity. 1st ed. New York: Springer.

Oliveto, P. S., He, J., and Yao, X. (2009). Analysis of the (1 + 1)-EA for finding approximate solutions to vertex cover problems. IEEE Transactions on Evolutionary Computation, 13(5):1006–1029.

Pourhassan, M., Gao, W., and Neumann, F. (2015). Maintaining 2-approximations for the dynamic vertex cover problem using evolutionary algorithms. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), pp. 903–910.

Pourhassan, M., Shi, F., and Neumann, F. (2016). Parameterized analysis of multi-objective evolutionary algorithms and the weighted vertex cover problem. In Proceedings of the Conference on Parallel Problem Solving from Nature, pp. 729–739.

Sutton, A. M., and Neumann, F. (2012). A parameterized runtime analysis of simple evolutionary algorithms for makespan scheduling. In Proceedings of the Conference on Parallel Problem Solving from Nature, pp. 52–61.

Sutton, A. M., Neumann, F., and Nallaperuma, S. (2014). Parameterized runtime analyses of evolutionary algorithms for the planar Euclidean traveling salesperson problem. Evolutionary Computation, 22(4):595–628.