## Abstract

Evolutionary multiobjective optimization for the classical vertex cover problem has been analysed in Kratsch and Neumann (2013) in the context of parameterized complexity analysis. This article extends the analysis to the weighted vertex cover problem in which integer weights are assigned to the vertices and the goal is to find a vertex cover of minimum weight. Using an alternative mutation operator introduced in Kratsch and Neumann (2013), we provide a fixed parameter evolutionary algorithm with respect to $OPT$, the cost of an optimal solution for the problem. Moreover, we present a multiobjective evolutionary algorithm with standard mutation operator that keeps the population size in a polynomial order by means of a proper diversity mechanism, and therefore, manages to find a 2-approximation in expected polynomial time. We also introduce a population-based evolutionary algorithm which finds a $(1+\varepsilon)$-approximation in expected time $O(n\cdot 2^{\min\{n,2(1-\varepsilon)OPT\}}+n^3)$.

## 1. Introduction

The area of runtime analysis has provided many rigorous new insights into the working behaviour of bio-inspired computing methods such as evolutionary algorithms and ant colony optimization (Auger and Doerr, 2011; Jansen, 2013; Neumann and Witt, 2010). In recent years, the parameterized analysis of bio-inspired computing has gained additional interest (Kratsch et al., 2010; Kratsch and Neumann, 2013; Sutton and Neumann, 2012; Sutton et al., 2014). Here, the runtime of bio-inspired computing is studied in dependence of the input size and additional parameters such as the solution size and/or other structural parameters of the given input.

One of the classical problems that has been studied extensively in the area of runtime analysis is the NP-hard vertex cover problem. Here, an undirected graph is given and the goal is to find a minimum set of vertices $V'$ such that each edge has at least one endpoint in $V'$. Friedrich et al. (2010) have shown that the single-objective evolutionary algorithm (1+1) EA cannot achieve a constant approximation ratio in expected polynomial time. Furthermore, they have shown that a multiobjective approach using Global Simple Evolutionary Multiobjective Optimizer (Global SEMO) gives a factor $O(\log n)$ approximation for the wider class of set cover problems in expected polynomial time. Further investigations regarding the approximation behaviour of evolutionary algorithms for the vertex cover problem have been carried out in Friedrich et al. (2009) and Oliveto et al. (2009). Edge-based representations in connection with different fitness functions have been investigated in Jansen et al. (2013) and Pourhassan et al. (2015) according to their approximation behaviour in the static and dynamic setting. Kratsch and Neumann (2013) have studied evolutionary algorithms and the vertex cover problem in the context of parameterized complexity (Downey and Fellows, 1999). They have shown that Global SEMO with a problem-specific mutation operator is a fixed parameter evolutionary algorithm for this problem (for details about fixed parameter evolutionary algorithms, please refer to Kratsch and Neumann, 2013) and finds 2-approximations in expected polynomial time. Kratsch and Neumann (2013) have also introduced an alternative mutation operator and have proved that Global SEMO using this mutation operator finds a $(1+\varepsilon)$-approximation in expected time $O(n^2\log n+OPT\cdot n^2+n\cdot 4^{(1-\varepsilon)OPT})$. Jansen et al. (2013) have shown that a 2-approximation can also be obtained by using an edge-based representation in the (1+1) EA combined with a fitness function formulation based on matchings.

In this article, we consider the weighted vertex cover problem where integer weights on the vertices are given and the goal is to find a vertex cover of minimum weight. We extend the investigations carried out in Kratsch and Neumann (2013) to the weighted minimum vertex cover problem. In Kratsch and Neumann (2013), multiobjective models in combination with a simple multiobjective evolutionary algorithm called Global SEMO are investigated. The secondary objective that is studied there is the solution for the LP relaxation of the problem, which helps the evolutionary algorithm construct LP-based approximation solutions. One key argument for the results presented for the (unweighted) vertex cover problem is that the population size is always upper bounded by $n+1$. This argument does not hold in the weighted case. Therefore, we study how a variant of Global SEMO using appropriate diversity mechanisms is able to deal with the weighted vertex cover problem.

The focus of this article is on the expected time (number of fitness evaluations) of the algorithms to find good approximations of an optimal solution. The time complexity analysis is performed with respect to $n$, $W_{max}$, and $OPT$, which denote the number of vertices, the maximum weight in the input graph, and the cost of the optimal solution respectively. We first study the expected time until Global SEMO with standard mutation operator has found a 2-approximation in dependence of $n$ and $OPT$. Afterwards, we analyse the expected time that Global SEMO requires to find a solution with expected approximation ratio $(1+\varepsilon)$ for this problem when the algorithm uses an alternative mutation operator. Furthermore, this article considers DEMO, a variant of Global SEMO, which incorporates $\varepsilon$-dominance (Laumanns et al., 2002) as a diversity mechanism. It is shown that DEMO finds a 2-approximation in expected polynomial time. Finally, a population-based approach is presented that obtains a solution that has approximation ratio $(1+\varepsilon)$ in expected time $O(n\cdot 2^{\min\{n,2(1-\varepsilon)OPT\}}+n^3)$.

This article extends the conference version (Pourhassan et al., 2016) by giving complete proofs for a number of lemmata (Lemmata 4, 5, 11, and 12) that are not contained in the conference version. Furthermore, it analyses the expected time until Global SEMO with standard mutation operator has found a 2-approximation (Section 3.1), and provides a population-based approach that obtains a solution that has approximation ratio $(1+\varepsilon)$ in expected time $O(n\cdot 2^{\min\{n,2(1-\varepsilon)OPT\}}+n^3)$ (Section 5).

The outline of the article is as follows. In Section 2, the problem definition is presented as well as the classical Global SEMO algorithm and DEMO algorithm. Runtime analysis for Global SEMO is presented in Section 3 with the standard mutation operator investigated in Section 3.1 for finding a 2-approximation, and the alternative mutation operator analysed in Section 3.2 for finding a $(1+\varepsilon)$-approximation. Section 4 includes the analysis that shows DEMO can find 2-approximations of the optimum in expected polynomial time. The population-based algorithm is defined and investigated for finding a $(1+\varepsilon)$-approximation in Section 5. At the end, in Section 6 we summarize and conclude.

## 2. Preliminaries

We consider the weighted vertex cover problem defined as follows. Given a graph $G=(V,E)$ with vertex set $V=\{v_1,\ldots,v_n\}$ and edge set $E=\{e_1,\ldots,e_m\}$, and a positive weight function $w:V\to\mathbb{N}^+$ on the vertices, the goal is to find a subset of vertices, $V_C\subseteq V$, that covers all edges and has minimum weight; that is, $\forall e\in E, e\cap V_C\neq\emptyset$ and $\sum_{v\in V_C}w(v)$ is minimized. We consider the standard node-based approach; that is, the search space is $\{0,1\}^n$ and for a solution $x=(x_1,\ldots,x_n)$ the vertex $v_i$ is chosen iff $x_i=1$.

By relaxing the constraint $x_i\in\{0,1\}$ to $x_i\in[0,1]$, the linear program formulation of Fractional Weighted Vertex Cover is obtained. Hochbaum (1983) has shown that we can find a 2-approximation using the LP result of the relaxed weighted vertex cover. This can be done by including any vertex $v_i$ for which $x_i\geq\frac{1}{2}$.
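The rounding step can be sketched as follows. This is a minimal illustration, assuming a feasible fractional solution `y` is already available from some LP solver (not shown); the helper names `round_lp_solution` and `is_vertex_cover` are hypothetical and introduced only for this example.

```python
from fractions import Fraction

def round_lp_solution(y, threshold=Fraction(1, 2)):
    """Return the 0/1 vector that selects every vertex v_i with y_i >= 1/2."""
    return [1 if y_i >= threshold else 0 for y_i in y]

def is_vertex_cover(x, edges):
    """Check that every edge (u, v) has at least one selected endpoint."""
    return all(x[u] == 1 or x[v] == 1 for u, v in edges)

# Triangle graph: the all-1/2 vector is a feasible fractional cover.
edges = [(0, 1), (1, 2), (0, 2)]
y = [Fraction(1, 2)] * 3
x = round_lp_solution(y)
```

Every edge constraint of the LP forces at least one endpoint to have value at least $\frac{1}{2}$, so the rounded set covers all edges, and rounding at most doubles the contribution of each selected vertex to the cost.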

We consider primarily multiobjective approaches for the weighted vertex cover problem. Given a multiobjective fitness function $f=(f_1,\ldots,f_d):S\to\mathbb{R}^d$, defined on the solution set $S$, where all $d$ objectives should be minimized, we have $f(x)\leq f(y)$ iff $f_i(x)\leq f_i(y)$ for all $1\leq i\leq d$. We say that $x$ (weakly) dominates $y$ iff $f(x)\leq f(y)$. Furthermore, we say that $x$ (strongly) dominates $y$ iff $f(x)\leq f(y)$ and $f(x)\neq f(y)$. A solution $x^*$ is *Pareto optimal* if there is no solution that strongly dominates it. The set of objective vectors of the Pareto optimal solutions is called the *Pareto front*.
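The two dominance relations translate directly into code. The following sketch works on objective vectors given as tuples; the function names are hypothetical and chosen only for illustration.

```python
def weakly_dominates(fx, fy):
    """f(x) <= f(y) in every objective (all objectives are minimized)."""
    return all(a <= b for a, b in zip(fx, fy))

def strongly_dominates(fx, fy):
    """Weak dominance plus a difference in at least one objective."""
    return weakly_dominates(fx, fy) and tuple(fx) != tuple(fy)

def pareto_front(vectors):
    """Keep the vectors that no other vector strongly dominates."""
    return [p for p in vectors
            if not any(strongly_dominates(q, p) for q in vectors)]
```

For example, among the vectors `(3, 0)`, `(0, 3)`, `(2, 2)` and `(3, 3)`, only `(3, 3)` is strongly dominated, so the other three are mutually incomparable.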

We now introduce the objectives used in our multiobjective evolutionary algorithm. Let $G(x)$ be the graph obtained from $G$ by removing all edges covered by the vertices chosen by $x$. Formally, we have $G(x)=(V,E(x))$ where $V(x)=V\setminus\{v_i\mid x_i=1\}$ and $E(x)=E\setminus\{e\mid e\cap(V\setminus V(x))\neq\emptyset\}$ (note that to unify the search space, we keep the same vertex set for $G$ and $G(x)$). Kratsch and Neumann (2013) investigated a multiobjective baseline algorithm called Global SEMO using the LP-value for $G(x)$ as one of the fitness values for the (unweighted) minimum vertex cover problem.

Our goal is to expand the analysis of the behaviour of multiobjective evolutionary algorithms to the weighted vertex cover problem. In order to do this, we modify the fitness function that was used in Global SEMO in Kratsch and Neumann (2013) to match the weighted version of the problem. We investigate the multiobjective fitness function $f(x)=(Cost(x),LP(x))$, where

- $Cost(x)=\sum_{i=1}^{n}w(v_i)x_i$ is the sum of the weights of the selected vertices, and
- $LP(x)$ is the value of an optimal solution of the LP for $G(x)$.
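Both objectives can be evaluated on toy instances with a few lines of code. The sketch below is illustrative only: instead of calling an LP solver, `lp_value` enumerates all half-integral vectors, which is justified by the half-integrality of basic feasible solutions (Balinski, 1970) but takes time $3^n$ and is therefore usable only for very small graphs. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def cost(x, weights):
    """Cost(x): total weight of the vertices selected by the bit vector x."""
    return sum(w for w, bit in zip(weights, x) if bit == 1)

def uncovered_edges(x, edges):
    """E(x): the edges with no selected endpoint."""
    return [(u, v) for u, v in edges if x[u] == 0 and x[v] == 0]

def lp_value(x, edges, weights):
    """LP(x) by brute force over half-integral vectors (tiny graphs only)."""
    rest = uncovered_edges(x, edges)
    best = None
    # z_i in {0, 1, 2} encodes the fractional value y_i = z_i / 2.
    for z in product((0, 1, 2), repeat=len(x)):
        if all(z[u] + z[v] >= 2 for u, v in rest):
            value = Fraction(sum(w * zi for w, zi in zip(weights, z)), 2)
            if best is None or value < best:
                best = value
    return best
```

On a triangle with unit weights, the empty solution has $LP$ value $\frac{3}{2}$ while an optimal integral cover costs 2, which shows that the LP value can be strictly smaller than $OPT$.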

We analyse Global SEMO with this fitness function using the standard mutation operator flipping each bit with probability $1/n$. We also investigate Global SEMO using the alternative mutation operator introduced in Kratsch and Neumann (2013) (see Algorithm 2). By this mutation operator, the vertices that are adjacent to uncovered edges are included with probability $1/2$ in some steps.
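Algorithm 2 is not reproduced in this section, so the following is only a sketch reconstructed from how the operator is used in the analysis: with probability $1/2$ the standard mutation is applied ($b=0$); otherwise ($b=1$) the bit of every vertex adjacent to an uncovered edge is flipped with probability $1/2$ and every other bit with probability $1/n$. The function names and the `rng` parameter (any object with a `random()` method) are assumptions made for this example.

```python
import random

def standard_mutation(x, rng):
    """Flip each bit independently with probability 1/n."""
    n = len(x)
    return [bit ^ 1 if rng.random() < 1.0 / n else bit for bit in x]

def alternative_mutation(x, edges, rng):
    """Sketch of the alternative operator (assumed form, see lead-in)."""
    n = len(x)
    if rng.random() < 0.5:
        # b = 0: fall back to the standard mutation.
        return standard_mutation(x, rng)
    # b = 1: collect the vertices adjacent to an uncovered edge of G(x).
    touched = set()
    for u, v in edges:
        if x[u] == 0 and x[v] == 0:
            touched.update((u, v))
    # Non-isolated vertices flip with probability 1/2, the rest with 1/n.
    return [bit ^ 1 if rng.random() < (0.5 if i in touched else 1.0 / n) else bit
            for i, bit in enumerate(x)]
```

A call such as `alternative_mutation([0, 0, 0], [(0, 1)], random.Random(1))` returns a new bit list of the same length and leaves `x` unchanged.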

In the fitness function used in Global SEMO, both $Cost(x)$ and $LP(x)$ can be exponential with respect to the input size; therefore, we need to deal with exponentially many solutions, even if we only keep the Pareto front. One approach for dealing with this problem is using the concept of $\varepsilon$-dominance (Laumanns et al., 2002). The concept of $\varepsilon$-dominance has previously been proved to be useful for coping with exponentially large Pareto fronts in some problems (Horoba and Neumann, 2008; Neumann et al., 2011). Given two objective vectors $p=(p_1,\ldots,p_m)$ and $q=(q_1,\ldots,q_m)$, $p$ $\varepsilon$-dominates $q$, denoted by $p\preceq_{\varepsilon}q$, if for all $i\in\{1,\ldots,m\}$ we have $p_i\leq(1+\varepsilon)q_i$. Motivated by this approach, DEMO (Diversity Evolutionary Multiobjective Optimizer) has been investigated in Neumann and Reichel (2008) and Neumann et al. (2011), which we present in Algorithm 3. In this approach, the objective space is partitioned into a polynomial number of boxes in which all solutions $\varepsilon$-dominate each other, and at most one solution from each box is kept in the population. Here, we describe the concept of boxes and how we keep one solution for each box in detail, and in Section 4, we analyze DEMO.

The functions $b_1$ and $b_2$ partition the objective space into horizontal and vertical stripes, which we name *rows* and *columns*, and the whole boxing function partitions the objective space into boxes. A box can be denoted by $B=(a,b)$, where $a$ and $b$ are the values of $b_1$ and $b_2$ for the solutions in that box, respectively.

Note that two boxes $B=(a,b)$ and $B'=(a',b')$ with $a=a'$ and $b<b'$ (or $a<a'$ and $b=b'$) can include search points that do not dominate each other; therefore, we may keep solutions from different boxes with the same value of $b_1$ or $b_2$. But if $a<a'$ and $b<b'$, then all search points in $B$ dominate all search points in $B'$. Hence, we define dominance among boxes as: box $B=(a,b)$ dominates box $B'=(a',b')$, denoted by $B<B'$, if $a<a'$ and $b<b'$.
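The definition of $b_1$ and $b_2$ is not reproduced above. The sketch below assumes the logarithmic boxing commonly used with DEMO, namely $b_1(x)=\lceil\log(1+Cost(x))/\log(1+\delta)\rceil$ and $b_2(x)=\lceil\log(1+LP(x))/\log(1+\delta)\rceil$ for a parameter $\delta>0$; this form and all function names are assumptions made for illustration, not a quotation of Algorithm 3.

```python
import math

def box_index(value, delta):
    """Index of the stripe containing `value` (assumed logarithmic boxing)."""
    return math.ceil(math.log(1 + value) / math.log(1 + delta))

def box(cost_x, lp_x, delta):
    """Box B = (a, b) of a solution with objective vector (Cost(x), LP(x))."""
    return (box_index(cost_x, delta), box_index(lp_x, delta))

def box_dominates(B, B_prime):
    """B dominates B' iff it is strictly smaller in both coordinates."""
    a, b = B
    a_prime, b_prime = B_prime
    return a < a_prime and b < b_prime
```

Under this assumed boxing, index 0 is reached only for the value 0, so a solution with $b_2(x)=0$ has $LP(x)=0$ and is a complete cover.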

In DEMO only one nondominated solution can be kept in the population for each box, based on a predefined criterion. In our setting, among two solutions $x$ and $y$ from one box, $y$ is kept in $P$ and $x$ is discarded if $Cost(y)+2\cdot LP(y)\leq Cost(x)+2\cdot LP(x)$. The reason behind this particular setting is that we aim to work on solutions $x$ under the constraint that $Cost(x)+2\cdot LP(x)\leq2\cdot OPT$, because by only adding vertices to these solutions, it is possible to obtain 2-approximate complete vertex covers.

Analysing the runtime of our evolutionary algorithms, we are interested in the expected number of rounds of the repeat loop until a solution of desired quality has been obtained. We call this the expected time until the considered algorithm has achieved its desired goal.
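To make the notion of one round of the repeat loop concrete, the following sketch shows a Global SEMO style loop. Algorithm 1 is not reproduced in this section, so the acceptance rule used here (reject an offspring that is strongly dominated, otherwise insert it and remove everything it weakly dominates) is the commonly used formulation and should be read as an assumption; `fitness`, `steps` and `rng` are hypothetical parameters.

```python
import random

def weakly_dominates(fx, fy):
    return all(a <= b for a, b in zip(fx, fy))

def strongly_dominates(fx, fy):
    return weakly_dominates(fx, fy) and fx != fy

def global_semo(fitness, n, steps, rng):
    """Run a Global SEMO style loop for a fixed number of rounds."""
    x = tuple(rng.randint(0, 1) for _ in range(n))
    population = {x: fitness(x)}
    for _ in range(steps):  # one iteration = one round of the repeat loop
        parent = rng.choice(list(population))
        # Standard mutation: flip each bit independently with probability 1/n.
        child = tuple(b ^ 1 if rng.random() < 1.0 / n else b for b in parent)
        f_child = fitness(child)
        if any(strongly_dominates(f, f_child) for f in population.values()):
            continue  # offspring is dominated and therefore rejected
        # Remove every solution that the offspring weakly dominates.
        population = {y: f for y, f in population.items()
                      if not weakly_dominates(f_child, f)}
        population[child] = f_child
    return population
```

Each round performs exactly one fitness evaluation of the offspring, so counting rounds and counting fitness evaluations coincide up to the initial evaluation.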

## 3. Analysis of Global SEMO

In this section, we analyse the expected time of Global SEMO to find good approximations for the weighted vertex cover problem in dependence of the input size and OPT. Before we present our analysis for Global SEMO, we state some basic properties of the solutions in our multi-objective model. The following theorem shown by Balinski (1970) states that all basic feasible solutions of the LP relaxation of the weighted vertex cover, which are the extremal points or the corner solutions of the polyhedron that forms the feasible space, are half-integral.

Theorem 1 (Balinski, 1970). Each basic feasible solution $x$ of the LP relaxation of the weighted vertex cover is half-integral; that is, $x\in\{0,1/2,1\}^n$.

As a result, there always exists a half-integral optimal LP solution for a vertex cover problem. In several parts of this article, we make use of this result. We establish the following two lemmata which we will use later on in the analysis of our algorithms.

Lemma 2. For any $x\in\{0,1\}^n$, $LP(x)\leq LP(0^n)\leq OPT$.

Proof. Let $y$ be an optimal LP solution for $G(0^n)$. The solution $0^n$ contains no vertices; therefore, $y$ is the optimal fractional vertex cover for all edges of the input graph. Thus, for any solution $x$, $y$ is a (possibly nonoptimal) fractional cover for $G(x)$; therefore, $LP(x)\leq LP(0^n)$. Moreover, we have $LP(0^n)\leq OPT$ as $LP(0^n)$ is the optimal value of the LP relaxation. $\Box$

Lemma 3. Let $x=(x_1,\ldots,x_n)$, $x_i\in\{0,1\}$, be a solution and $y=(y_1,\ldots,y_n)$, $y_i\in[0,1]$, be an optimal fractional solution for $G(x)$. If there is a vertex $v_i$ where $y_i\geq\frac{1}{2}$, mutating $x_i$ from 0 to 1 results in a solution $x'$ for which $LP(x')\leq LP(x)-y_i\cdot w(v_i)\leq LP(x)-\frac{1}{2}w(v_i)$.

Proof. The graph $G(x')$ is the same as $G(x)$ excluding the edges connected to $v_i$. Therefore, the solution $y'=(y_1,\ldots,y_{i-1},0,y_{i+1},\ldots,y_n)$ is a fractional vertex cover for $G(x')$ and has a cost of $LP(x)-y_i w(v_i)$. The cost of the optimal fractional vertex cover of $G(x')$ is at most as great as the cost of $y'$; thus $LP(x')\leq LP(x)-y_i w(v_i)\leq LP(x)-\frac{1}{2}w(v_i)$. $\Box$

### 3.1. 2-Approximation

We now analyse the runtime behaviour of Global SEMO (Algorithm 1) with the standard mutation operator, in dependence of OPT. We start by giving an upper bound on the population size of Global SEMO.

Lemma 4. The population size of Algorithm 1 is upper bounded by $2\cdot OPT+1$.

Proof. For any solution $x$ there exists an optimal fractional vertex cover which is half-integral (Theorem 1). Moreover, we are assuming that all the weights are integer values. Therefore, $LP(x)$ can only take $2\cdot LP(0^n)+1$ different values, because $LP(0^n)$ is an upper bound on $LP(x)$ (Lemma 2). For each value of $LP$, only one solution is in $P$, because Algorithm 1 keeps nondominated solutions only. Therefore, the population size of this algorithm is upper bounded by $2\cdot LP(0^n)+1$, which is at most $2\cdot OPT+1$ due to Lemma 2. $\Box$

For our analysis, we first consider the expected time of Global SEMO to reach a population which contains the empty set of vertices. Once included, such a solution will never be removed from the population as it is minimal with respect to the cost function.

Lemma 5. The search point $0^n$ is included in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n))$.

Proof. From Lemma 4 we know that the population contains at most $2\cdot OPT+1$ solutions. Therefore, at each step, there is a probability of at least $\frac{1}{2\cdot OPT+1}$ that the solution $x_{min}$ is selected, where $Cost(x_{min})=\min_{x\in P}Cost(x)$.

If $Cost(x_{min})>0$, there must be $k\geq1$ vertices $v_i$ in $x_{min}$ with $x_i=1$. Let $\Delta_t$ be the improvement that happens on the minimum cost in $P$ at step $t$. If all the 1-bits in solution $x_{min}$ flip to zero, at the same step or different steps, a solution $0^n$ will be obtained with $Cost(0^n)=0$, which implies that the expected improvement that flipping a randomly chosen 1-bit makes is $\Delta_t=\frac{Cost(x_{min})}{k}$ at each step $t$. Note that flipping 1-bits always improves the minimum cost and the new solution is added to the population. Moreover, flipping any 0-bits does not improve the minimum cost in the population and $x_{min}$ is not replaced with the new solution in that case.

The maximum value that $Cost(x_{min})$ can take is bounded by $W_{max}\cdot n$, and for any solution $x\neq0^n$, the minimum value of $Cost(x)$ is at least 1. Using Multiplicative Drift Analysis (Doerr et al., 2012) with $s_0\leq W_{max}\cdot n$ and $s_{min}\geq1$, we can conclude that in expected time $O(OPT\cdot n(\log W_{max}+\log n))$ solution $0^n$ is included in the population. $\Box$
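The drift argument can be made explicit as follows. This is a sketch under the assumption that Algorithm 1 selects a parent uniformly at random and applies the standard mutation; the symbol $c$ for the drift factor is introduced here for illustration only. A fixed 1-bit of the minimum-cost solution is the only bit flipped with probability $\frac{1}{n}\left(1-\frac{1}{n}\right)^{n-1}\geq\frac{1}{e\,n}$, so summing over its 1-bits gives

$$E[\Delta_t]\;\geq\;\frac{1}{2\cdot OPT+1}\sum_{i:\,x_i=1}\frac{w(v_i)}{e\,n}\;=\;c\cdot Cost(x_{min}),\qquad c=\frac{1}{e\,n\,(2\cdot OPT+1)}.$$

The multiplicative drift theorem then bounds the expected time to reach cost 0 by

$$\frac{1+\ln(s_0/s_{min})}{c}\;\leq\;e\,n\,(2\cdot OPT+1)\bigl(1+\ln(W_{max}\cdot n)\bigr)\;=\;O\bigl(OPT\cdot n(\log W_{max}+\log n)\bigr).$$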

We now show that Global SEMO is able to achieve a 2-approximation efficiently as long as OPT is small.

Theorem 6. The expected number of iterations of Global SEMO until the population $P$ contains a 2-approximation is $O(OPT\cdot n(\log W_{max}+\log n))$.

Proof. Let $x$ be a solution that minimizes $LP(x)$ under the constraint that $Cost(x)+2\cdot LP(x)\leq2\cdot OPT$. Note that this constraint holds for solution $0^n$ since $LP(0^n)\leq OPT$, and according to Lemma 5, solution $0^n$ exists in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n))$.

If $LP(x)=0$, then all edges are covered and $x$ is a 2-approximate vertex cover because we have $Cost(x)+2\cdot LP(x)\leq2\cdot OPT$ as the constraint. Otherwise, some edges are uncovered and any LP solution of $G(x)$ assigns at least $\frac{1}{2}$ to at least one vertex of any uncovered edge. Let $y=(y_1,\ldots,y_n)$ be a basic LP solution for $G(x)$. According to Theorem 1, $y$ is a half-integral solution.

Let $k$ be the number of vertices $v_i$ with $y_i\geq\frac{1}{2}$. By Lemma 3, flipping one of these vertices, $v_i$, results in a solution $x'$ with $LP(x')\leq LP(x)-\frac{1}{2}w(v_i)$. Observe that the constraint $Cost(x')+2\cdot LP(x')\leq2\cdot OPT$ holds for solution $x'$. Therefore, $\Delta_t\geq y_i\cdot w(v_i)$, which is in expectation at least $\frac{LP(x)}{k}$ due to the definition of $LP(x)$. Moreover, at each step, the probability that $x$ is selected and only one of the $k$ bits defined above flips is at least $\frac{k}{(2\cdot OPT+1)\cdot e\cdot n}$. As a result, the expected improvement of $LP(x)$ in each step is at least $\frac{k}{(2\cdot OPT+1)\cdot e\cdot n}\cdot\frac{LP(x)}{k}=\frac{LP(x)}{e\cdot n\cdot(2\cdot OPT+1)}$.

According to Lemma 2, for any solution $x$, we have $LP(x)\leq OPT$. We also know that for any solution $x$ which is not a complete cover, $LP(x)\geq1$ because the weights are positive integers. Using the method of Multiplicative Drift Analysis (Doerr et al., 2012) with $s_0\leq OPT$ and $s_{min}\geq1$, in expected time $O(OPT\cdot n\log OPT)$ a solution $y$ with $LP(y)=0$ and $Cost(y)+2\cdot LP(y)\leq2\cdot OPT$ is obtained, which is a 2-approximate vertex cover. Overall, since we have $OPT\leq W_{max}\cdot n$, the expected time of finding this solution is $O(OPT\cdot n(\log W_{max}+\log n))$. $\Box$

### 3.2. Improved Approximations by Alternative Mutation

In this section, we analyse the expected time of Global SEMO with an alternative mutation operator to find a $(1+\varepsilon)$-approximation.

Lemma 7. A solution $x$ fulfilling the two properties

1. $LP(x)=LP(0^n)-Cost(x)$, and
2. there is an optimal solution of the LP for $G(x)$ which assigns $1/2$ to each nonisolated vertex of $G(x)$

is included in the population of Global SEMO in expected time $O(OPT\cdot n(\log W_{max}+\log n+OPT))$.

Proof. As the standard mutation occurs with probability $1/2$ in the alternative mutation operator, the search point $0^n$, which satisfies property 1, is included in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n))$ using the argument presented in the proof of Lemma 5. Let $P'\subseteq P$ be the set of solutions such that for each solution $x\in P'$, $LP(x)+Cost(x)=LP(0^n)$. Let $x_{min}\in P'$ be a solution such that $LP(x_{min})=\min_{x\in P'}LP(x)$.

If the optimal fractional vertex cover for $G(x_{min})$ assigns $1/2$ to each nonisolated vertex of $G(x_{min})$, then the conditions of the lemma hold. Otherwise, it assigns 1 to some nonisolated vertex, say $v$. The probability that the algorithm selects $x_{min}$ and flips the bit corresponding to $v$ is $\Omega\left(\frac{1}{OPT\cdot n}\right)$ because the population size is $O(OPT)$ (Lemma 4). Let $x_{new}$ be the new solution. We have $Cost(x_{new})=Cost(x_{min})+w(v)$, and by Lemma 3, $LP(x_{new})\leq LP(x_{min})-w(v)$. Since the selected vertices of any solution together with a fractional cover of its residual graph form a fractional cover of $G$, we also have $LP(x_{new})+Cost(x_{new})\geq LP(0^n)$. This implies that $LP(x_{new})+Cost(x_{new})=LP(0^n)$; hence, $x_{new}$ is a Pareto optimal solution and is added to the population $P$.

Since $LP(x_{min})\leq OPT$ (Lemma 2) and the weights are at least 1, assuming that we already have the solution $0^n$ in the population, by means of the method of fitness-based partitions, we find the expected time of finding a solution that fulfils the properties given above as $O(OPT^2\cdot n)$. Since the search point $0^n$ is included in expected time $O(OPT\cdot n(\log W_{max}+\log n))$, the expected time until a solution fulfilling the properties given above is included in $P$ is $O(OPT\cdot n(\log W_{max}+\log n+OPT))$. $\Box$

We now present the main approximation result for Global SEMO using the alternative mutation operator. The general idea of the analysis given in the following theorem is to partition the nonisolated vertices in $G(x)$ into four subsets: $S_1$, $S_2$, $T_1$, and $T_2$, where $x$ denotes a solution satisfying the two properties given in Lemma 7. The precise definitions of these four subsets are given in the proof of the theorem. For a new vertex cover $x'$ obtained by the alternative mutation operator on $x$, the analysis only considers the probability that all vertices of $S_1$ are chosen and no vertex of $T_1$ is chosen in the new solution $x'$. The quality of $x'$ highly depends on this property (including all vertices of $S_1$ and no vertices of $T_1$), but it also depends on the vertices chosen from $S_2$ and $T_2$, where the vertices in $S_2$ and $T_2$ are chosen randomly with probability $1/2$ by the alternative mutation operator. Thus the analysis finds the expected time until a solution $x'$ with the defined property is found, and the *expected* ratio of that solution is considered, based on the expectation that half of the vertices in $S_2$ and half of the vertices in $T_2$ are chosen by $x'$. In the following, we first present the formal definition of sets $S_1$ and $T_1$ and the mentioned property, and then we state the main theorem of this section.

Definition 8. Let $x$ be a solution that satisfies the two properties given in Lemma 7. Also, let $X$ be the set containing all nonisolated vertices in graph $G(x)$. Moreover, let $S\subseteq X$ be a vertex cover of $G(x)$ with the minimum weight over all vertex covers of $G(x)$, and $T$ be the set containing all nonisolated vertices in $X\setminus S$. For a set of vertices $X'$, we define $Cost(X')=\sum_{v\in X'}w(v)$. Let $OPT'=OPT-Cost(x)$. Let $s_1,\ldots,s_{|S|}$ be a numbering of the vertices in $S$ such that $w(s_i)\leq w(s_{i+1})$ for all $1\leq i\leq|S|-1$, and let $t_1,\ldots,t_{|T|}$ be a numbering of the vertices in $T$ such that $w(t_i)\geq w(t_{i+1})$ for all $1\leq i\leq|T|-1$. We define

- $S_1=\{s_1,s_2,\ldots,s_\rho\}$, where $\rho=\min\{|S|,\lceil(1-\varepsilon)\cdot OPT'\rceil\}$
- $T_1=\{t_1,t_2,\ldots,t_\eta\}$, where $\eta=\min\{|T|,\lceil(1-\varepsilon)\cdot OPT'\rceil\}$
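The construction of the four subsets can be sketched in code. The example assumes that the sets $S$ and $T$ and the value $OPT'$ are already known, which is the case only inside the analysis and not for the algorithm itself; the function name `split_sets` and the dictionary `weights` are hypothetical.

```python
import math

def split_sets(S, T, weights, opt_prime, eps):
    """Return (S1, S2, T1, T2) following the ordering in the definition."""
    bound = math.ceil((1 - eps) * opt_prime)
    rho = min(len(S), bound)
    eta = min(len(T), bound)
    # S is numbered by non-decreasing weight, T by non-increasing weight.
    s_sorted = sorted(S, key=lambda v: weights[v])
    t_sorted = sorted(T, key=lambda v: weights[v], reverse=True)
    return s_sorted[:rho], s_sorted[rho:], t_sorted[:eta], t_sorted[eta:]
```

The sets $S_1$ and $T_1$ thus contain the lightest vertices of $S$ and the heaviest vertices of $T$, respectively, and each has at most $\lceil(1-\varepsilon)\cdot OPT'\rceil$ elements.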

Property 9 (High-Quality solutions): *We say that a solution $x$ has the property of a High-Quality solution if all vertices of $S_1$ are chosen and no vertex of $T_1$ is chosen in $x$.*

Theorem. The expected time until Global SEMO has obtained a solution with Property 9 (High-Quality solution) is $O(OPT\cdot2^{\min\{n,2(1-\varepsilon)OPT\}}+OPT\cdot n(\log W_{max}+\log n+OPT))$. Moreover, the obtained solution has expected approximation ratio $(1+\varepsilon)$.

Proof. By Lemma 7, a solution $x$ that satisfies the two properties given in Lemma 7 is included in the population in expected time $O(OPT\cdot n(\log W_{max}+\log n+OPT))$. Let $X$, $S$, $T$, $\rho$, $\eta$, and $Cost(X')$ for a vertex set $X'$ be as defined in Definition 8. Due to property 2 of Lemma 7, $\frac{1}{2}Cost(S)+\frac{1}{2}Cost(T)=LP(x)\leq Cost(S)$; therefore, $Cost(T)\leq Cost(S)$. Also, let $OPT'$ be as defined in Definition 8. Observe that $OPT'=Cost(S)$, because $S$ is the minimum vertex cover of $G(x)$.

With probability $\Omega\left(\frac{1}{OPT}\right)$, the algorithm Global SEMO selects the solution $x$ and sets $b=1$ in the Alternative Mutation Operator. With $b=1$, the probability that the bits corresponding to all vertices of $S_1$ are flipped is $\Omega\left(\left(\frac{1}{2}\right)^{\rho}\right)$, and the probability that none of the bits corresponding to the vertices of $T_1$ are flipped is $\Omega\left(\left(\frac{1}{2}\right)^{\eta}\right)$. Also, the bits corresponding to the isolated vertices of $G(x)$ are flipped with probability $\frac{1}{n}$ by the Alternative Mutation Operator; hence, the probability that none of them flips is $\Omega(1)$. As a result, with probability $\Omega\left(\frac{1}{OPT}\cdot\left(\frac{1}{2}\right)^{\rho+\eta}\right)$, solution $x$ is selected, the vertices of $S_1$ are included, and the vertices of $T_1$ and isolated vertices are not included in the new solution $x'$. Due to the definition of Property 9, $x'$ has the property of a *High-Quality solution*. Since $\rho+\eta\leq2\lceil(1-\varepsilon)\cdot OPT'\rceil\leq2\lceil(1-\varepsilon)\cdot OPT\rceil$, and also $\rho+\eta\leq n$, the expected time until solution $x'$ is found after reaching solution $x$ is $O(OPT\cdot2^{\min\{n,2(1-\varepsilon)OPT\}})$.

Now we show that the second statement of the theorem holds. Note that the bits corresponding to vertices of $S_2=S\setminus S_1$ and $T_2=T\setminus T_1$ are arbitrarily flipped in solution $x'$ with probability $1/2$ by the Alternative Mutation Operator. Here, we show that for the expected cost and the LP value of $x'$, the following constraint holds: $E[Cost(x')]+2\cdot LP(x')\leq(1+\varepsilon)\cdot OPT$.

Case (I). $\eta=|T|$. Then $T_2=T'=\emptyset$. Thus, $E[Cost(T')]=0$ and Inequality (1) holds true.

Now we analyze whether the new solution $x'$ could be included in the population $P$. If $x'$ could not be included in $P$, then there is a solution $x''$ dominating $x'$; that is, $LP(x'')\leq LP(x')$ and $Cost(x'')\leq Cost(x')$. This implies $Cost(x'')+2\cdot LP(x'')\leq Cost(x')+2\cdot LP(x')\leq(1+\varepsilon)\cdot OPT$. Therefore, after having a solution that fulfils the properties of Lemma 7 in $P$, in expected time $O(OPT\cdot2^{\min\{n,2(1-\varepsilon)OPT\}})$, the population would contain a solution $y$ such that $Cost(y)+2\cdot LP(y)\leq(1+\varepsilon)\cdot OPT$.

Let $P'$ contain all solutions $x\in P$ such that $Cost(x)+2\cdot LP(x)\leq(1+\varepsilon)\cdot OPT$, and let $x_{min}$ be the one that minimizes $LP$. With a proof similar to that of Theorem 6, it is possible to show that at each step, in expectation, $LP(x_{min})$ improves by $\frac{LP(x_{min})}{e\cdot n\cdot(2\cdot OPT+1)}$. Using Multiplicative Drift Analysis, we get the expected time $O(OPT\cdot n\log OPT)$ to find a solution $y$ for which $LP(y)=0$ and $Cost(y)+2\cdot LP(y)\leq(1+\varepsilon)\cdot OPT$.

Overall, the expected number of iterations of Global SEMO with alternative mutation operator, for getting a weighted vertex cover with expected approximation ratio $(1+\varepsilon)$, is bounded by $O(OPT\cdot2^{\min\{n,2(1-\varepsilon)OPT\}}+OPT\cdot n(\log W_{max}+\log n+OPT))$. $\Box$

## 4. Analysis of DEMO

Due to Lemma 4, with Global SEMO, the population size is upper bounded by $O(OPT)$, which can be exponential in terms of the input size. In this section, we analyse the other evolutionary algorithm, DEMO (Algorithm 3), that uses some diversity handling mechanisms for dealing with exponentially large population sizes. The following lemmata are used in the proof of Theorem 13.

Lemma 10. Let $W_{max}$ be the maximum weight assigned to a vertex. The population size of DEMO is upper bounded by $O(n\cdot(\log n+\log W_{max}))$.

Proof. For every solution $x$ we have $Cost(x)\leq Cost(1^n)$ and $LP(x)\leq LP(0^n)$ (Lemma 2). Since $n\cdot W_{max}$ is an upper bound for both $Cost(1^n)$ and $LP(0^n)$, the number of rows and also the number of columns are bounded by $k=1+\left\lceil\frac{\log n+\log W_{max}}{\log(1+\delta)}\right\rceil=O(n\cdot(\log n+\log W_{max}))$, where $\delta=\frac{1}{2n}$ (see the proof of Theorem 13).

We here show that the size of the population is $P_{size}\leq2k-1$. Since the dominated solutions according to $f$ are discarded by the algorithm, none of the solutions in $P$ can be located in a box that is dominated by another box that contains a solution in $P$. Moreover, at most one solution from each box is kept in the population; therefore, $P_{size}$ is at most the maximum number of boxes where none of them dominates another.

Let $k_1$ be the number of boxes that contain a solution of $P$ in the first column. Let $r_1$ be the smallest row number among these boxes. Observe that $r_1\leq k-k_1+1$ and the equality holds when the boxes are from rows $k$ down to $k-k_1+1$. Any box in the second column with a row number of $r_1+1$ or above is dominated by the box of the previous column and row $r_1$. Therefore, the maximum row number for a box in the second column that is not dominated is $r_1\leq k-k_1+1$. Generalizing the idea, the maximum row number for a box in column $i$ that is not dominated is $r_{i-1}\leq k-k_1-\cdots-k_{i-1}+i-1$, where for $1\leq j\leq k$, $k_j$ is the number of boxes that contain a solution of $P$ in column $j$.

Lemma 11. The search point $x_z=0^n$ is included in the population in expected time $O(n^3(\log n+\log W_{max})^2)$.

Proof. From Lemma 10 we know that the population contains $P_{size}=O(n\cdot(\log n+\log W_{max}))$ solutions. Therefore, at each step, there is a probability of at least $\frac{1}{P_{size}}$ that the solution $x_{min}$ is selected, where $b_1(x_{min})=\min_{x\in P}b_1(x)$.

If $b_1(x_{min})=0$, we have $Cost(x_{min})=0$, which means $x_{min}=0^n$ since the weights are greater than 0.

Therefore, in expected time of at most $O(n^2(\log n+\log W_{max}))$ the new solution $x'$ is obtained, which is accepted by the algorithm because it is placed in a box with a smaller value of $b_1$ than all solutions in $P$ and hence is not dominated. There are $O(n(\log n+\log W_{max}))$ different values for $b_1$; therefore, the solution $x_z=0^n$ with $b_1(x_z)=0$ is found in expected time of at most $O(n^3(\log n+\log W_{max})^2)$. $\Box$

Lemma 12. Let $x\in P$ be a search point such that $Cost(x)+2\cdot LP(x)\leq2\cdot OPT$ and $b_2(x)>0$. There exists a 1-bit flip leading to a search point $x'$ with $Cost(x')+2\cdot LP(x')\leq2\cdot OPT$ and $b_2(x')<b_2(x)$.

By Lemma 3, $LP(x')\leq LP(x)-\frac{1}{2}\cdot w(v_j)$. Therefore,

By Lemma 3, we get $LP(x')\leq W\cdot\left(1-\frac{1}{n}\right)$. Therefore, with an analysis similar to that of Lemma 11, we get:

Theorem 13. The expected time until DEMO constructs a 2-approximate vertex cover is $O(n^3\cdot(\log n+\log W_{max})^2)$.

Proof. Consider the solution $x\in P$ that minimizes $b_2(x)$ under the constraint that $Cost(x)+2\cdot LP(x)\leq2\cdot OPT$. Note that $0^n$ fulfils this constraint and according to Lemma 11, the solution $0^n$ will be included in $P$ in expected time $O(n^3(\log n+\log W_{max})^2)$.

If $b_2(x)=0$, then $x$ covers all edges and by selection of $x$ we have $Cost(x)\leq2\cdot OPT$, which means that $x$ is a 2-approximation.

In case $b_2(x)\neq0$, according to Lemma 12 there is a one-bit flip on $x$ that results in a new solution $x'$ for which $b_2(x')<b_2(x)$, while the mentioned constraint also holds for it. Since the population size is $O(n\cdot(\log n+\log W_{max}))$ (Lemma 10), this 1-bit flip happens with a probability of $\Omega\left(n^{-2}\cdot(\log n+\log W_{max})^{-1}\right)$ and $x'$ is obtained in expected time $O(n^2\cdot(\log n+\log W_{max}))$. This new solution will be added to $P$ because a solution $y$ with $Cost(y)+2\cdot LP(y)>2\cdot OPT$ cannot dominate $x'$ with $Cost(x')+2\cdot LP(x')\leq2\cdot OPT$, and $x'$ has the minimum value of $b_2$ among solutions that fulfil the constraint. Moreover, if there already is a solution, $x_{prev}$, in the same box as $x'$, it will be replaced by $x'$ because $Cost(x_{prev})+2\cdot LP(x_{prev})>2\cdot OPT$; otherwise, it would have been selected as $x$.

There are at most $A=1+\left\lceil\frac{\log n+\log W_{\max}}{\log(1+\delta)}\right\rceil$ different values for $b_2$ in the objective space, and since $\delta=\frac{1}{2n}$, $A=O\left(n\cdot(\log n+\log W_{\max})\right)$. Therefore, the expected time until a solution $x''$ is found so that $b_2(x'')=0$ and $Cost(x'')+2\cdot LP(x'')\le 2\cdot OPT$ is at most $O\left(n^3\cdot(\log n+\log W_{\max})^2\right)$. $\Box$
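The bound on $A$ and the resulting runtime can be checked in a few lines. The sketch below relies on the elementary inequality $\ln(1+\delta)\ge\delta/2$ for $0<\delta\le 1$, which is not stated in the text, and on the fact that a ratio of logarithms does not depend on the base. The waiting time per improvement of $b_2$ is the reciprocal of the success probability $\Omega\left(n^{-2}\cdot(\log n+\log W_{\max})^{-1}\right)$ given above.

$$
\begin{aligned}
\frac{\log n+\log W_{\max}}{\log(1+\delta)} &= \frac{\ln(n W_{\max})}{\ln(1+\delta)} \le \frac{2\ln(n W_{\max})}{\delta} = 4n\ln(n W_{\max}),\\
A &\le 2+4n\ln(n W_{\max}) = O\bigl(n\cdot(\log n+\log W_{\max})\bigr),\\
A\cdot O\bigl(n^2\cdot(\log n+\log W_{\max})\bigr) &= O\bigl(n^3\cdot(\log n+\log W_{\max})^2\bigr).
\end{aligned}
$$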

## 5. Diverse Population-Based EA

In this section, we introduce a population-based algorithm (see Algorithm 4) that keeps, for each $k$ with $0\le k\le n$, at most two solutions. This implies that the population size is upper bounded by $2n$. The two solutions kept in the population are chosen according to different weightings of the cost and the LP-value. For each solution $x$, let $|x|_1$ be the number of selected vertices in $x$. Algorithm 4 keeps a new solution $x'$ in the population if it minimizes $Cost(z)+LP(z)$ or $Cost(z)+2\cdot LP(z)$ among the solutions $z\in P$ with $|z|_1=|x'|_1$.
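The acceptance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a transcription of Algorithm 4: the function name `update_population`, the dictionary layout, and the callables `cost` and `lp` (standing in for $Cost$ and $LP$) are hypothetical, and ties are assumed to be resolved in favour of solutions already stored.

```python
from typing import Callable, Dict, List, Tuple

Solution = Tuple[int, ...]  # bit string; x[i] == 1 means vertex i is selected


def update_population(
    pop: Dict[int, List[Solution]],
    x_new: Solution,
    cost: Callable[[Solution], float],
    lp: Callable[[Solution], float],
) -> bool:
    """Try to insert x_new into pop; return True if x_new is kept.

    Assumed layout: pop maps k = |x|_1 to the at most two solutions kept for
    that k, a minimizer of Cost + LP and a minimizer of Cost + 2 * LP.
    """
    k = sum(x_new)
    current = pop.get(k, [])
    if x_new in current:
        return False  # duplicate of a stored solution, nothing changes
    # Stored solutions come first, so min() prefers them on ties (assumption).
    candidates = current + [x_new]
    best_1 = min(candidates, key=lambda z: cost(z) + lp(z))
    best_2 = min(candidates, key=lambda z: cost(z) + 2 * lp(z))
    pop[k] = [best_1] if best_1 == best_2 else [best_1, best_2]
    return x_new in pop[k]


# Hypothetical (Cost, LP) values for three solutions with |x|_1 = 2.
values = {(1, 1, 0): (5.0, 2.0), (1, 0, 1): (4.0, 4.0), (0, 1, 1): (7.0, 0.5)}

pop: Dict[int, List[Solution]] = {}
for z in values:
    kept = update_population(pop, z, lambda s: values[s][0], lambda s: values[s][1])
    print(z, kept)
print(pop)
```

With these made-up values, the second solution is rejected because the first one is at least as good under both weightings, while the third is kept because it has the smaller value of $Cost+2\cdot LP$. Each level $k$ therefore never holds more than two solutions.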

Taking into account that the population size is upper bounded by $2n$ and considering in each step an individual with the smallest number of ones in the population for mutation, one can obtain the following lemma by standard fitness level arguments.

The search point $0^n$ is included in the population in expected time $O(n^2\log n)$.
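The fitness-level argument behind this bound can be spelled out as follows. The sketch assumes that the parent is chosen uniformly at random from a population of size at most $2n$ and that mutation flips each bit independently with probability $1/n$. Let $i$ be the smallest number of ones among the solutions in $P$. Selecting such a solution and flipping exactly one of its $i$ one-bits produces an offspring with $i-1$ ones, which is accepted because no solution with that number of ones is in $P$.

$$
\begin{aligned}
\Pr[\,i \to i-1\,] &\ge \frac{1}{2n}\cdot\frac{i}{n}\left(1-\frac{1}{n}\right)^{n-1} \ge \frac{i}{2en^2},\\
E[T] &\le \sum_{i=1}^{n}\frac{2en^2}{i} = 2en^2 H_n = O(n^2\log n),
\end{aligned}
$$

where $H_n$ denotes the $n$-th harmonic number.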

To show the main result for Diverse Population-Based EA, we will use the following lemma.

A solution $x$ fulfilling the two properties

1. $LP(x)=LP(0^n)-Cost(x)$ and
2. there is an optimal solution of the LP for $G(x)$ which assigns $1/2$ to each non-isolated vertex of $G(x)$

is included in the population of the Diverse Population-Based EA in expected time $O(n^3)$.

By Lemma 14, the solution $0^n$, which satisfies property 1 given above, is contained in the population after expected time $O(n^2\log n)$. Let $P'\subseteq P$ be the set of all solutions in $P$ that satisfy property 1.

Let $x_{max}$ be the solution of $P'$ with the maximal number of 1-bits. If the optimal fractional vertex cover for $G(x_{max})$ assigns $1/2$ to each non-isolated vertex of $G(x_{max})$, then the second property also holds. If the optimal fractional vertex cover for $G(x_{max})$ assigns 1 to some non-isolated vertex, say $v$, then the algorithm selects $x_{max}$ and flips exactly the bit corresponding to $v$ with probability $\Omega\left(\frac{1}{n^2}\right)$. Let $x'$ be the new solution. By the choice of $x_{max}$, $P$ contains no solution with $|x_{max}|_1+1$ one-bits that satisfies property 1; hence, $x'$ is added to $P$.

Since the maximum value of $|x|_1$ is $n$, after expected time $O(n^3)$ there is a solution in the population that fulfils the properties given in the lemma. $\Box$
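To make the two properties of the lemma concrete, the sketch below evaluates them on a tiny instance. It assumes that $G(x)$ consists of the edges not covered by $x$ and that $LP(x)$ is the optimal value of the fractional weighted vertex cover LP on $G(x)$. Because this LP always has a half-integral optimal solution, the sketch simply enumerates assignments in $\{0, 1/2, 1\}$, which is only practical for very small graphs. The graph, the weights, and all function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]
EPS = 1e-9  # tolerance for comparing floating-point objective values


def cost(x: Sequence[int], w: Sequence[float]) -> float:
    """Total weight of the vertices selected by x."""
    return sum(wi for xi, wi in zip(x, w) if xi)


def lp_optima(x: Sequence[int], edges: List[Edge], w: Sequence[float]
              ) -> Tuple[float, List[Dict[int, float]]]:
    """Return LP(x) and all optimal half-integral solutions, by brute force."""
    rest = [(u, v) for u, v in edges if not x[u] and not x[v]]  # edges of G(x)
    verts = sorted({v for e in rest for v in e})  # non-isolated vertices of G(x)
    best, optima = float("inf"), []
    for vals in product((0.0, 0.5, 1.0), repeat=len(verts)):
        y = dict(zip(verts, vals))
        if any(y[u] + y[v] < 1.0 for u, v in rest):
            continue  # some uncovered edge is not fractionally covered
        value = sum(w[v] * y[v] for v in verts)
        if value < best - EPS:
            best, optima = value, [y]
        elif abs(value - best) <= EPS:
            optima.append(y)
    return best, optima


def has_property_1(x, edges, w) -> bool:
    """Check LP(x) = LP(0^n) - Cost(x)."""
    zero = [0] * len(x)
    return abs(lp_optima(x, edges, w)[0]
               - (lp_optima(zero, edges, w)[0] - cost(x, w))) <= EPS


def has_property_2(x, edges, w) -> bool:
    """Check that some optimal LP solution assigns 1/2 to every non-isolated vertex."""
    _, optima = lp_optima(x, edges, w)
    return any(all(val == 0.5 for val in y.values()) for y in optima)


# Hypothetical instance: a star with centre 0 plus a triangle, unit weights.
edges = [(0, 1), (0, 2), (0, 3), (4, 5), (5, 6), (4, 6)]
w = [1.0] * 7
zero = (0, 0, 0, 0, 0, 0, 0)
centre = (1, 0, 0, 0, 0, 0, 0)  # select the vertex that the LP sets to 1
leaf = (0, 1, 0, 0, 0, 0, 0)    # select a vertex that the LP sets to 0

for name, x in [("zero", zero), ("centre", centre), ("leaf", leaf)]:
    print(name, has_property_1(x, edges, w), has_property_2(x, edges, w))
```

On this instance, $0^n$ satisfies property 1 but not property 2, since the LP assigns 1 to the centre of the star. Selecting the centre keeps property 1 and leaves only the triangle, for which the all-$1/2$ assignment is optimal. This mirrors the step in the proof where the bit of a vertex with LP value 1 is flipped. Selecting a leaf instead breaks property 1.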

We now show the main result for the Diverse Population-Based EA.

The expected time until the Diverse Population-Based EA has obtained a solution that has approximation ratio $(1+\varepsilon)$ is $O\left(n\cdot 2^{\min\{n,\,2(1-\varepsilon)OPT\}}+n^3\right)$.

By Lemma 15 we know that after expected time $O(n^3)$ there is a solution in the population that fulfils the properties given in that lemma. With an analysis similar to that of Theorem 9, we can show that a solution $x$ with $Cost(x)+2\cdot LP(x)\le(1+\varepsilon)\cdot OPT$ is produced in expected time $O\left(n\cdot 2^{\min\{n,\,2(1-\varepsilon)OPT\}}+n^3\right)$.

We now consider whether the solution $x$ is added to the population $P$. If $x$ is not added to $P$, then there exists a solution $y\in P$ such that $|y|_1=|x|_1$ and $Cost(y)+2\cdot LP(y)\le Cost(x)+2\cdot LP(x)$. Thus, the population already includes a solution $y$ such that $Cost(y)+2\cdot LP(y)\le(1+\varepsilon)\cdot OPT$.

Let $P'$ be the set of all solutions $x\in P$ such that $Cost(x)+2\cdot LP(x)\le(1+\varepsilon)\cdot OPT$, and let $x_{max}\in P'$ be such that $|x_{max}|_1=\max_{x\in P'}|x|_1$.

Suppose that $y$ could not be included in $P$. Then there exists a solution $y'$ in $P$ such that $|y'|_1=|y|_1$ and $2\cdot LP(y')+Cost(y')\le 2\cdot LP(y)+Cost(y)\le(1+\varepsilon)\cdot OPT$, which contradicts the assumption that $|x_{max}|_1=\max_{x\in P'}|x|_1$. Therefore, the solution $y$ can be included in $P$.

Observe that for any solution $x$, if $|x|_1=n$, then $LP(x)=0$. Thus, after expected time of at most $O(n^3)$, the population $P$ includes a solution $y$ such that $Cost(y)+2\cdot LP(y)\le(1+\varepsilon)\cdot OPT$ and $LP(y)=0$, which is a $(1+\varepsilon)$-approximate weighted vertex cover.

Overall, the expected time in which the Diverse Population-Based EA finds a $(1+\varepsilon)$-approximate weighted vertex cover is bounded by $O\left(n\cdot 2^{\min\{n,\,2(1-\varepsilon)OPT\}}+n^3\right)$. $\Box$

## 6. Conclusion

The minimum vertex cover problem is one of the classical NP-hard combinatorial optimization problems. In this article, we have generalized previous results of Kratsch and Neumann (2013) for the unweighted minimum vertex cover problem to the weighted case, in which weights on the vertices are given in addition. Based on the conference version of this article (Pourhassan et al., 2016), in Sections 3.2 and 4 we have investigated Global SEMO with the alternative mutation operator for finding a $(1+\varepsilon)$-approximation, and studied the algorithm DEMO using the $\varepsilon$-dominance approach, showing that it reaches a 2-approximation in expected polynomial time. Furthermore, in this article we have shown that Global SEMO with the standard mutation operator efficiently computes a 2-approximation as long as the value of an optimal solution is small. We have also presented a population-based approach with a specific diversity mechanism that reaches a $(1+\varepsilon)$-approximation in expected time $O\left(n\cdot 2^{\min\{n,\,2(1-\varepsilon)OPT\}}+n^3\right)$.

## Acknowledgments

This research has been supported by Australian Research Council grants DP140103400 and DP160102401, and National Natural Science Foundation of China grant 61802441.