## Abstract

The Steiner tree problem (STP) aims to determine a set of Steiner nodes such that the minimum spanning tree over these Steiner nodes and a given set of special nodes has the minimum weight; the problem is NP-hard. STP includes several important cases, one of which is the Steiner tree problem in graphs (GSTP). Many heuristics have been proposed for STP, and some of them have been proven to be approximation algorithms with performance guarantees. Since evolutionary algorithms (EAs) are general and popular randomized heuristics, it is important to investigate their performance on STP. Several empirical investigations have shown that EAs are efficient for STP. However, up to now, there has been no theoretical work on the performance of EAs for STP. In this article, we reveal that the (1+1) EA achieves a 3/2-approximation ratio for STP in a special class of quasi-bipartite graphs in expected runtime , where , , and are, respectively, the number of Steiner nodes, the number of special nodes, and the largest weight among all edges in the input graph. We also show that the (1+1) EA is better than two other heuristics on two GSTP instances, and that the (1+1) EA may be inefficient on a constructed GSTP instance.

## 1 Introduction

The STP problem, named after Jakob Steiner, seeks a minimum-weight tree spanning a given set of nodes, possibly by introducing some auxiliary nodes. All nodes in the given set are called special nodes. This problem is a fundamental NP-hard combinatorial optimization problem (Garey et al., 1977).

The STP problem has wide applications in many fields, such as circuit layout, network design, and the identification of subnetworks for a given set of seed genes or proteins (Sadeghi and Fröhlich, 2013).

Different ways of determining the weight between two nodes give rise to several special cases of the problem. In this article, we consider one important case of STP, that is, the Steiner tree problem in graphs (GSTP), which is also NP-hard (Hwang et al., 1992).

The GSTP problem: given an undirected graph and a weighting function , where and are, respectively, the sets of nodes and edges, and given a set of special nodes, the GSTP problem is to find a tree that spans all special nodes in and possibly some nodes from with the minimum sum of edge weights. We call such a tree the minimum weight Steiner tree.

For a GSTP problem, if is a metric, then we call such a GSTP problem a metric STP problem. The concept of metric will be defined in the next section.

If the number of special nodes is 2, that is, , then the GSTP problem is reduced to the “shortest path problem”; if , then it is reduced to the “minimum spanning tree problem.” Both can be efficiently solved (see, e.g., Dijkstra, 1959; Cheriton and Tarjan, 1976).

In this article, let , i.e., is the set of all Steiner nodes, , and . If there is no edge connecting two nodes in , then is called quasi-bipartite (Rajagopalan and Vazirani, 1999).

For NP-hard combinatorial optimization problems, including STP, it is widely believed that no polynomial-time algorithm exists. Thus, quite a few approximation algorithms have been developed for the STP problem over the past thirty years.

For GSTP, the first 2-approximation algorithm, a heuristic based on the minimum spanning tree, was presented by Takahashi and Matsuyama (1980) and Kou et al. (1981). Prömel and Steger (1997) proposed an approximation algorithm for GSTP, obtaining an improved approximation ratio of for any constant . Vazirani (2000) surveyed approximation algorithms for GSTP developed before 2000. More recently, Robins and Zelikovsky (2005) presented a heuristic algorithm that achieves a 1.55-approximation ratio for GSTP in general graphs and a 1.28-approximation ratio in quasi-bipartite graphs.

The EA is a randomized heuristic and a general-purpose problem solver, so it is natural to investigate the performance of this algorithm for the STP problem. Several experimental investigations have shown that genetic algorithms (GAs), which belong to the larger class of EAs, are efficient for STP (Hesser et al., 1989; Rabkin, 2002; Kapsalis et al., 1993; Haghighat et al., 2002). However, nothing is known in theory about the efficiency of EAs for STP.

In fact, extensive attention has been paid recently to the theoretical analysis of evolutionary algorithms’ performance on combinatorial optimization problems. These problems range from simple pseudo-Boolean functions (Jansen and Wegener, 2001; He and Yao, 2001; Droste et al., 2002; He and Yao, 2003), to classic combinatorial optimization problems such as minimum spanning tree problems (Neumann and Wegener, 2007), Eulerian cycle problems (Neumann, 2008), satisfiability problems (Zhou et al., 2009), minimum cut problems (Neumann et al., 2011), and Euclidean traveling salesperson problems (Sutton and Neumann, 2012).

Recently, the performance of EAs has been studied on the single-objective minimum spanning tree problem (Neumann and Wegener, 2007), a problem belonging to the complexity class P, and also on the multi-objective minimum spanning tree problem (Neumann, 2007; Qian et al., 2013), an NP-hard problem. The STP is another spanning tree-related optimization problem.

This article theoretically investigates how EAs perform on the STP problem, including their approximation ability. Since good approximate solutions are often sufficient in practice, the approximation performance analysis of EAs for NP-hard combinatorial optimization problems has recently become a hot topic (Giel and Wegener, 2003; Oliveto et al., 2009; Friedrich et al., 2010; Witt, 2005; Yu et al., 2012; Jansen et al., 2013). We reveal that the (1+1) EA is a 3/2-approximation algorithm for the metric STP problem in quasi-bipartite graphs when weights on edges are positive integer numbers and polynomially bounded. Investigations on two GSTP instances show that the (1+1) EA outperforms the so-called average weight heuristic and the heuristic based on the minimum spanning tree. On one constructed GSTP instance, the (1+1) EA may need exponential expected runtime to find the optimal solution.

The next section describes some definitions, notations, and algorithms discussed in this article, and Section 3 discusses the approximation performance of the (1+1) EA for the metric STP problem in a class of quasi-bipartite graphs. Section 4 investigates the performance of the (1+1) EA on GSTP instances. The last section concludes the article.

## 2 Preliminaries

First, we describe the concepts of subgraph and induced subgraph, which can be found in any textbook on graph theory (see Bondy and Murty, 2008).

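(**Subgraph**): Let $G = (V, E)$ and $G' = (V', E')$ be two graphs, where $V$ ($V'$) is the node set of $G$ ($G'$), and $E$ ($E'$) is the edge set of $G$ ($G'$). If $V' \subseteq V$ and $E' \subseteq E$, then $G'$ is a subgraph of $G$.

(**Induced subgraph**): Let $G = (V, E)$ and $G' = (V', E')$ be two graphs, where $V$ ($V'$) is the node set of $G$ ($G'$), and $E$ ($E'$) is the edge set of $G$ ($G'$). If $V' \subseteq V$ and $E' = \{(u, v) \in E \mid u, v \in V'\}$, then $G'$ is a subgraph induced by $V'$, which is denoted by $G[V']$.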

The weight of a tree is the sum of the weights of all edges in this tree; that is, . For induced subgraph , we denote the minimum spanning tree of as , and the weight of as ).

As mentioned earlier, we aim to find a subset of Steiner nodes for GSTP such that the minimum spanning tree over them and the special nodes has the minimum weight. Therefore, a subset of the set of Steiner nodes is a solution. Assume that the number of Steiner nodes is ; that is, . We sort all Steiner nodes in a fixed order. Thus, in the (1+1) EA a bit string represents a solution, where if Steiner node is selected, and otherwise. Thus, a bit string represents a subset of Steiner nodes, and vice versa. So, the terms bit string, solution, and a subset of Steiner nodes will be interchangeably used in this article.

Clearly, for a feasible solution which satisfies , the fitness value of is .

Fitness function (1) has to be minimized by the (1+1) EA. The first target is to find a feasible solution , and the second is to decrease the weight of the spanning tree over and .

The (1+1) EA for GSTP can be described as follows. Typically, the (1+1) EA accepts a new solution as long as it is not worse than the current one. In this article, the (1+1) EA accepts a new solution if and only if it is better than the current one. Jansen and Wegener (2001) showed that the criterion for accepting a new solution has an important effect on the behavior of the (1+1) EA for plateaus of constant fitness. In this sense, if a fitness function contains no plateaus, then the analysis of these two simple (1+1) EAs on this fitness function is the same.
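The loop described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the article's exact algorithm: the fitness used here (a penalty per extra connected component plus the weight of a minimum spanning forest) is an assumed stand-in for fitness function (1), and the names `forest_weight_and_components`, `fitness`, `one_plus_one_ea`, `penalty`, and `steps` are hypothetical. The standard bit-flip mutation with probability 1/m per bit is also an assumption; only the strict-improvement acceptance rule is taken from the text.

```python
import random


def forest_weight_and_components(nodes, weight):
    """Kruskal on the subgraph induced by `nodes`.

    Returns (weight of a minimum spanning forest, number of components).
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    total, components = 0, len(parent)
    induced = [(w, u, v) for (u, v), w in weight.items()
               if u in parent and v in parent]
    for w, u, v in sorted(induced):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
            components -= 1
    return total, components


def fitness(x, steiner, special, weight, penalty):
    """Assumed fitness: penalize disconnection first, then tree weight."""
    chosen = set(special) | {s for s, bit in zip(steiner, x) if bit}
    total, components = forest_weight_and_components(chosen, weight)
    return (components - 1) * penalty + total


def one_plus_one_ea(steiner, special, weight, steps, rng):
    """(1+1) EA that accepts an offspring only if it is strictly better."""
    m = len(steiner)
    p = 1.0 / m if m else 0.0           # assumed mutation rate 1/m
    penalty = sum(weight.values()) + 1  # larger than any spanning tree weight
    x = [rng.randint(0, 1) for _ in range(m)]
    fx = fitness(x, steiner, special, weight, penalty)
    for _ in range(steps):
        y = [bit ^ (rng.random() < p) for bit in x]  # flip each bit w.p. p
        fy = fitness(y, steiner, special, weight, penalty)
        if fy < fx:                     # strict improvement only
            x, fx = y, fy
    return x, fx


# Toy instance: Steiner node "s" is useful, Steiner node "t" is a costly leaf.
toy_weight = {("s", "a"): 1, ("s", "b"): 1, ("s", "c"): 1,
              ("a", "b"): 2, ("b", "c"): 2, ("a", "c"): 2, ("t", "a"): 5}
best, value = one_plus_one_ea(["s", "t"], ["a", "b", "c"], toy_weight,
                              steps=500, rng=random.Random(0))
print(best, value)
```

On the toy instance, selecting only `s` gives a spanning tree of weight 3, which is the unique minimum of the assumed fitness, so the strict acceptance rule never leaves that solution once it is reached.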

In this article, the runtime of the (1+1) EA refers to the number of fitness evaluations until some termination criterion is fulfilled. If the (1+1) EA can efficiently solve a problem, we are interested in the expected optimization runtime; otherwise, we care about what approximation performance guarantee it can efficiently achieve.

Consider that a randomized heuristic algorithm is used to find a minimum solution for a combinatorial optimization problem . If , where is the value of the solution obtained by for an instance of in an expected polynomial runtime and denotes the value of the global optimum of , then we say that algorithm achieves a -approximation solution (ratio) for problem .

The Steiner tree problem in quasi-bipartite graphs is also NP-hard (Chlebik and Chlebikova, 2002). For this problem, Rizzi (2003) proved that the following heuristic algorithm, called the iterated 1-Steiner heuristic (ISH), achieves a 3/2-approximation ratio.

For completeness, we describe two other heuristic algorithms for GSTP in the following.

The first heuristic algorithm for GSTP is the minimum spanning tree-based heuristic algorithm (MSTA), proposed by Takahashi and Matsuyama (1980) and independently by Kou et al. (1981).

Given an input graph , and a set of special nodes, MSTA first constructs a complete graph on , where the weight on each edge connecting two special nodes equals the weight of the shortest path between them in the input graph. Then, MSTA finds the minimum spanning tree on . By replacing edges in this minimum spanning tree with their corresponding shortest path in the input graph, MSTA constructs a subgraph of . Finally, MSTA finds a minimum spanning tree of this subgraph, and constructs a Steiner tree from the minimum spanning tree. The following describes MSTA.
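The steps above can be sketched in Python as follows. This is an illustrative sketch, not the article's Algorithm 3: the function names `shortest_paths`, `kruskal`, and `msta`, the edge-dictionary input format, and the final pruning of non-special leaves (one common way to turn the second minimum spanning tree into a Steiner tree) are assumptions. Node labels are assumed to be mutually comparable (for example, strings) and the input graph is assumed to be connected.

```python
import heapq


def shortest_paths(adj, source):
    """Dijkstra: distances and predecessor links from `source`."""
    dist, prev = {source: 0}, {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, prev


def kruskal(nodes, edges):
    """Minimum spanning tree, as a list of (w, u, v), of a connected graph."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def msta(weight, special):
    """Sketch of the MST-based heuristic; returns Steiner tree edges (w, u, v)."""
    special = sorted(special)
    adj, wt = {}, {}
    for (u, v), w in weight.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
        wt[frozenset((u, v))] = w
    # Complete graph on the special nodes, weighted by shortest-path lengths.
    paths = {s: shortest_paths(adj, s) for s in special}
    closure = [(paths[s][0][t], s, t)
               for i, s in enumerate(special) for t in special[i + 1:]]
    # Replace each edge of its minimum spanning tree by a shortest path.
    sub_edges = set()
    for _, s, t in kruskal(special, closure):
        prev, node = paths[s][1], t
        while node != s:
            p = prev[node]
            a, b = sorted((p, node))
            sub_edges.add((wt[frozenset((p, node))], a, b))
            node = p
    # Minimum spanning tree of the resulting subgraph.
    sub_nodes = {x for _, u, v in sub_edges for x in (u, v)}
    tree = kruskal(sub_nodes, sub_edges)
    # Assumed final step: repeatedly delete leaves that are not special nodes.
    keep = set(special)
    while True:
        degree = {}
        for _, u, v in tree:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        leaves = {v for v, d in degree.items() if d == 1 and v not in keep}
        if not leaves:
            return tree
        tree = [e for e in tree if e[1] not in leaves and e[2] not in leaves]
```

For example, on a star with center `s`, three special nodes at distance 10 from `s`, and direct edges of weight 19 between the special nodes, this sketch ignores `s` and returns a tree of weight 38, whereas the tree through `s` has weight 30.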

The second heuristic algorithm for GSTP, proposed by Rayward-Smith, is called the average distance heuristic (Rayward-Smith, 1983; Bern and Plassmann, 1989; Waxman and Imase, 1988). Since the concept of “distance” in this heuristic algorithm is equivalent to the concept of “weight” in this article, we call it the average weight heuristic (AWH), where means the weight on the edge connecting nodes and . The following description of AWH is taken from Bern and Plassmann (1989).

Finally, we end this section by defining a metric as follows.

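(**Metric**): A metric is a weight function $w: V \times V \rightarrow \mathbb{R}$ satisfying the following conditions, where $V$ is the set of nodes, and $\mathbb{R}$ is the set of real numbers.

$w(u, v) \geq 0$, for any $u, v \in V$;

$w(u, v) = 0$, if and only if $u = v$;

$w(u, v) = w(v, u)$, for any $u, v \in V$;

$w(u, v) \leq w(u, z) + w(z, v)$, for any $u, v, z \in V$.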

The first three conditions are usually satisfied in an edge-weighted undirected graph. The last condition is called the triangle inequality. If the weight on edges of a given undirected graph is a metric, then the edge weights satisfy triangle inequality; that is, the weight of an edge that forms a triangle with two other edges is less than or equal to the sum of the weights of the other two.
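As a small illustration of the four conditions (non-negativity, identity, symmetry, and the triangle inequality), the following sketch checks whether a weight function on a complete graph is a metric. The helper names `complete_weights` and `is_metric` and the dictionary representation are hypothetical choices made only for this example.

```python
from itertools import product


def complete_weights(nodes, edge_weight):
    """Expand undirected edge weights to all ordered pairs, with w(u, u) = 0."""
    w = {(u, u): 0 for u in nodes}
    for (u, v), value in edge_weight.items():
        w[u, v] = w[v, u] = value
    return w


def is_metric(nodes, w):
    """Check the four metric conditions for w given on all ordered pairs."""
    for u, v in product(nodes, repeat=2):
        if w[u, v] < 0:                  # non-negativity
            return False
        if (w[u, v] == 0) != (u == v):   # zero exactly on identical nodes
            return False
        if w[u, v] != w[v, u]:           # symmetry
            return False
    # Triangle inequality over all ordered triples.
    return all(w[u, v] <= w[u, z] + w[z, v]
               for u, v, z in product(nodes, repeat=3))


nodes = ["a", "b", "c"]
ok = complete_weights(nodes, {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 2})
bad = complete_weights(nodes, {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 3})
print(is_metric(nodes, ok), is_metric(nodes, bad))
```

In the second weight function, the edge between `a` and `c` has weight 3, which exceeds the sum 1 + 1 of the other two sides of the triangle, so the triangle inequality fails.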

## 3 The Approximation Performance Guarantee of the (1+1) EA for the Metric STP Problem in Quasi-Bipartite Graphs

In this section, we show that the (1+1) EA achieves a 3/2-approximation ratio for the metric STP problem in a special class of quasi-bipartite graphs as long as the largest edge weight is polynomially bounded.

The following lemma, proven by Rizzi (2003), shows that ISH described in Algorithm 2 produces a 3/2-approximation solution for the metric STP problem when the input graph is quasi-bipartite.

(**Lemma 4**, Rizzi, 2003): Let be a solution such that each Steiner node in connects at least three special nodes and for every , then is a 3/2-approximation solution to the metric STP problem, when the input graph is quasi-bipartite.

Further, let weights on edges be integer numbers. We now prove that starting with any initial solution the (1+1) EA achieves a -approximation solution for the metric STP problem in expected runtime , which is polynomial as long as is polynomially bounded.

(**Theorem 5**): Starting with any initial solution, the (1+1) EA achieves a 3/2-approximation ratio for the metric STP problem in expected runtime , when the input graph is quasi-bipartite and weights on edges are integer numbers.

For a solution , consider the Steiner nodes in . In there is no edge connecting two Steiner nodes, as is quasi-bipartite. Let denote that each Steiner node in connects at least three special nodes, and let denote that for every Steiner node .

We partition the solution space into two disjoint subspaces. One is ; the other is its complement set .

The main idea behind the proof is that if a solution is in subspace , then according to Lemma 4, is a 3/2-approximation solution to the metric STP problem in ; if is in subspace , then the fitness value of can be decreased by at least one in expected time , and thus can be efficiently transformed into a solution in subspace as long as the largest edge weight is polynomially bounded.

Note that holds for any solution , as is a metric. Thus, each is a feasible solution, implying that the fitness value of is .

Let be the current solution. If is not in , then it is in , i.e., holds.

If holds, then there is at least one Steiner node in connecting one or two special nodes. If it connects only one special node, then it is a leaf node in . In this case, removing and the edge incident to it will produce a Steiner tree covering all special nodes whose weight is less than . Hence, removing from results in solution such that . If it connects two special nodes, say and , then removing and two edges and and simultaneously connecting the two special nodes and with edge will produce a Steiner tree whose weight is less than that of the current one. This improvement follows from the triangle inequality. Hence, removing from can also result in solution such that .

Altogether, if holds, then the fitness value can be decreased by removing a Steiner node from .

If holds, then there is a Steiner node such that .

Altogether, if is in , that is, holds, then the event that removes one specific Steiner node from , or adds some specific Steiner node to , results in a new solution whose fitness value is less than . The probability of this event is , which implies that the expected time is . Since the weight of each edge is a positive integer, the fitness value of can be decreased by at least one in expected time .

For an arbitrary solution, the minimum spanning tree covering all special nodes contains at most edges, a bound attained when it contains all special nodes and all Steiner nodes. So the weight of this minimum spanning tree is at most . Hence, a 3/2-approximation solution can be found in expected runtime .

## 4 Performance Analysis of the (1+1) EA on GSTP Instances

In this section, we show that the (1+1) EA is better than two other heuristics on two instances. At the end of this section, another instance is constructed to show that the (1+1) EA cannot always be efficient for GSTP.

### 4.1. An Instance Where the (1+1) EA Outperforms the MSTA

Takahashi and Matsuyama (1980) constructed an instance of GSTP, which we call in this article, to show that the approximation ratio produced by the MSTA described in Algorithm 3 is tight.

As shown in Figure 1, the solid edges construct the minimum Steiner tree of , which covers all special nodes and Steiner node . Clearly, the weight of the minimum weight Steiner tree is . In this article, Steiner nodes and special nodes are represented by hollow and solid circles, respectively.

For , MSTA achieves a -approximation ratio. A complete graph is constructed by MSTA on the set of special nodes. Since the weight on each edge of the complete graph is 2, the total weight of the minimum spanning tree on is .

In this subsection, we show that the (1+1) EA can efficiently find the minimum weight Steiner tree of .

For , , thus a solution can be represented as , where corresponds to Steiner node ; corresponds to Steiner node ; ; corresponds to Steiner node . It is clear that the global optimum is . If , then is also a global optimum.

(**Theorem 6**): The (1+1) EA starting with any initial solution finds the global optimum of in expected runtime .

Note that is complete. Any solution is feasible, as the induced subgraph is connected. Let denote the current solution, and let .

The main idea behind the proof is that if contains a Steiner node from , then it can be removed from .

If contains Steiner nodes from , then each of them must be a leaf node incident to a special node by an edge in as is complete. The weight of this edge is 10. Removing any one of these Steiner nodes from will result in a solution whose fitness value is 10 less than that of . Similar to the OneMax analysis from Droste et al. (2002), the (1+1) EA accepts the mutations where the number of Steiner nodes from contained in is decreased, and the number of such Steiner nodes contained in decreases by at least one in each such mutation. So, at most such mutations will make contain no Steiner nodes from . If there are Steiner nodes coming from contained in , then the probability of such a mutation occurring is at least . So, the upper bound of the expected time until contains no Steiner nodes from is .

Now contains no Steiner node from . If contains Steiner node , then it is the global optimum. If contains neither Steiner node nor Steiner node from , then there are two cases to be considered. The first is that . In this case, is also a global optimum. The second is that . In this case, the global optimum will be found by adding Steiner node to . The probability of this event is , which implies the expected time is .

Altogether, the global optimum of can be found by the (1+1) EA starting with any initial solution in expected runtime .

For instance , MSTA can achieve only a -approximation solution; however, the (1+1) EA can efficiently find its global optimum. Therefore, the (1+1) EA outperforms MSTA on instance .

### 4.2. An Instance Where the (1+1) EA is Superior to AWH

This subsection compares the (1+1) EA with AWH described in Algorithm 4 on an instance which we call in this article. This instance is proposed by Waxman and Imase (1988) to show that for any , the weight of the Steiner tree found by AWH for is larger than times the weight of the minimum weight Steiner tree of .

For the sake of clarity, we first give some concepts related to instance .

In a tree, if the minimum number of edges that must be visited from the root node to a node is , then we say that the node is in layer . Clearly, the root node is in layer 0. A perfect binary tree is a tree where all leaf nodes are in the bottom-most layer and each non-leaf node has two children. If a node has two children and , then node is called the father node of and , and (respectively ) is called the brother node of (respectively ). A perfect binary tree of height refers to a perfect binary tree in which the minimum number of edges from the root node to a leaf node is . Denote by a perfect binary tree of height . Thus, there are layers in : layer 0, layer 1, , and layer , and the number of nodes in layer is . Altogether, there are nodes including all leaf nodes and all non-leaf nodes in .

We now describe instance . Given a positive integer number , is a perfect binary tree with a path connecting all leaf nodes. In , the set of special nodes contains all leaf nodes, and the set of Steiner nodes contains all non-leaf nodes. Therefore, in there are special nodes and Steiner nodes, i.e., . All nodes in are numbered starting from the root node layer by layer, and from left to right: , , , , . Figure 2 shows an example with .
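The structure just described (a perfect binary tree whose leaves are additionally joined by a path) can be sketched as follows. This sketch covers only the topology: edge weights are deliberately omitted, and the 1-based, heap-style numbering (node `v` has children `2v` and `2v + 1`) is an assumption chosen to match the layer-by-layer, left-to-right numbering. The names `build_structure` and `layer` are hypothetical.

```python
def build_structure(h):
    """Topology of a perfect binary tree of height h plus a path over its leaves.

    Nodes are numbered 1, 2, ... layer by layer and left to right (assumed
    1-based). Non-leaf nodes are Steiner nodes; leaf nodes are special nodes.
    """
    first_leaf = 2 ** h
    last = 2 ** (h + 1) - 1
    steiner = list(range(1, first_leaf))
    special = list(range(first_leaf, last + 1))
    # Each non-leaf node v has the two children 2v and 2v + 1.
    tree_edges = [(v, c) for v in steiner for c in (2 * v, 2 * v + 1)]
    # The path connects consecutive leaves from left to right.
    path_edges = [(v, v + 1) for v in range(first_leaf, last)]
    return steiner, special, tree_edges, path_edges


def layer(v):
    """Layer of node v under the 1-based numbering (the root is in layer 0)."""
    return v.bit_length() - 1


steiner, special, tree_edges, path_edges = build_structure(3)
print(len(steiner), len(special), len(tree_edges), len(path_edges))
```

A perfect binary tree of height h has 2^i nodes in layer i and 2^(h+1) - 1 nodes in total, of which 2^h are leaves, so the sketch yields 2^h special nodes and 2^h - 1 Steiner nodes.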

In this article, a subtree of refers to a perfect binary subtree of , which contains all the nodes and edges branching downwards from a given node (the root node of the subtree) till the leaf nodes, and we denote it by the root node of the subtree. For example, when , subtree is the perfect binary tree over Steiner nodes , , , and special nodes , , , , where is its root node. Let be a non-leaf node in layer , then the weight of the two edges leaving downward is , and the weight of the bottom edge connecting the two subtrees of is .

Let be the weight of a subtree of height , which equals the sum of weights on each edge in the subtree, and let be the weight of the path connecting all leaf nodes of a subtree of height , which is the sum of weights of all edges on this path. Clearly, , and .

The minimum weight Steiner tree of is the perfect binary tree . As shown in Figure 2, the solid edges construct the minimum weight Steiner tree of with weight of , which covers all special nodes and all Steiner nodes. The dashed edges construct the tree that may be produced by AWH, whose weight is .

While AWH may be trapped in the local optimum which contains no Steiner nodes, we will show that the (1+1) EA can efficiently find the minimum weight Steiner tree of .

The following analysis uses the drift theorem, which is described in Lemma 7.

First consider the case where . In this case, for all , i.e., the height of a subtree such that the weight of this subtree is less than the weight on the path connecting all its leaf nodes must be at least 2.

(**Theorem 8**): For with , the (1+1) EA starting with any initial solution finds the global optimum in expected runtime .

We first analyze the expected time that a feasible solution is found by the (1+1) EA starting with any initial solution. Then we utilize the drift theorem to derive the expected time until the global optimum is found once a feasible solution has been constructed.

Let be the current solution.

If the number of connected components in the subgraph induced by is greater than 1, that is, , then there must exist a connected component consisting of only Steiner node(s). Otherwise, all connected components contain special nodes, then these connected components can be connected by some proper edges on the path; that is, the number of connected components is 1, which contradicts the assumption that the number of connected components is greater than 1.

Such a connected component is either a tree consisting of Steiner nodes or only one isolated Steiner node. If it is the former, then removing a Steiner node which is a leaf node in this tree results in a cheaper tree. If it is the latter, removing the unique Steiner node deletes this component. Altogether, removing some Steiner nodes will decrease the fitness value, which can be accepted by the (1+1) EA. The probability of this event is , which implies that the expected time is . A connected component consisting of only Steiner node(s) will be deleted in expected time , as the number of Steiner nodes contained in such a connected component is . When all connected components only consisting of Steiner nodes are deleted, a feasible solution will be finally constructed. Since the number of such connected components is , a feasible solution will be found in expected time .

Next, we utilize the drift theorem to derive the expected time for the (1+1) EA to find the global optimum starting from any feasible solution.

Let , where is the feasible solution after iterations, is the weight of the minimum spanning tree of the subgraph induced by ; that is, and is the weight of the minimum spanning tree of the subgraph induced by and the global optimum. If , then the global optimum has been found and is the time to find the global optimum.

We first estimate the upper bound of . We have , since the weight of the minimum spanning tree of the subgraph induced by is not larger than the sum of weights on all edges of the input graph; that is, , and the weight of the minimum spanning tree of the subgraph induced by and the global optimum is , that is, . Thus, as is a positive integer number, that is, .

Then, we estimate the lower bound of . Note that , and is always nonnegative as long as is not the global optimum, since holds according to the acceptance condition of the (1+1) EA.

All feasible solutions of can be partitioned into two sets: and its complement . is the set of feasible solutions that do not contain at least one Steiner node in layer , and is the set of feasible solutions that contain all Steiner nodes in layer .

If , then at least one Steiner node in layer is not contained in . Assume that is such a Steiner node, and denote its father node and brother node by and , respectively. Now we consider the subtree whose height is 2.

Since is not contained in , not all the edges from subtree are contained in . Therefore, the weight of can be reduced by constructing subtree . There are four cases that need to be considered with respect to whether and are contained in . Among these four cases, the worst case is that neither nor is contained in , since in this case for constructing subtree three nodes , , and should be simultaneously added. This means that among these four cases, the lower bound of the probability of constructing subtree is . On the other hand, among these four cases, the lower bound of the reduction of is , which is the reduction of in the cases where is not contained in .

If , then all Steiner nodes in layer are contained in . Since is not the global optimum, there is at least one Steiner node not contained in . Assume that among all such Steiner nodes is in the largest layer, say layer ; that is, Steiner nodes in layers from to are all contained in . Then, can be decreased by through constructing a subtree of height by adding Steiner node to connect two subtrees of height , i.e., with probability .

Altogether, the global optimum of will be found in expected runtime .

Next, consider the case where . In this case, for all , that is, the height of a subtree such that the weight of this subtree is less than the weight on the path connecting all its leaf nodes should be at least 3.

Note that feasible solution belongs to either set or its complement , where is the set of feasible solutions that do not contain at least one Steiner node in layer or layer , and is the set of feasible solutions that contain all Steiner nodes of layer and layer . If , a subtree of height 3 can be constructed by simultaneously adding at most 7 Steiner nodes, and can be reduced by at least . While if , adding a Steiner node that is in the largest layer among all Steiner nodes not contained in , say layer , reduces by .

Similar to the proof of Theorem 8, we have the following theorem.

(**Theorem 9**): For with , the (1+1) EA starting with any initial solution finds the global optimum in expected runtime .

Finally consider the case where . In this case, for all , that is, the height of a subtree such that the weight of this subtree is less than the weight on the path connecting all its leaf nodes should be at least 4.

Note that feasible solution belongs to either set or its complement , where is the set of feasible solutions that do not contain all Steiner nodes in layers from to , and is the set of feasible solutions that contain all Steiner nodes in layers from to . If , a subtree of height 4 can be constructed by simultaneously adding at most 15 Steiner nodes, and can be reduced by at least . While if , adding a Steiner node that is in the largest layer among all Steiner nodes not contained in , say layer , reduces by .

Similar to the proof of Theorem 8, we have the following theorem.

(**Theorem 10**): For with , the (1+1) EA starting with any initial solution finds the global optimum in expected runtime .

Theorems 8, 9, and 10 show that the (1+1) EA can efficiently find the global optimum for with ; however, AWH may be trapped in the local optimum of . Therefore, the (1+1) EA is superior to AWH on .

In the proofs of Theorems 8, 9, and 10, we utilize the drift theorem and define the potential function as the difference between the fitness values of a feasible solution and the global optimum. It seems that these three theorems can also be proved by the fitness level method. However, defining the concrete fitness levels may be tedious in this instance. Thus, the utilization of the drift theorem is justified.

### 4.3. An Instance Where the (1+1) EA May Need Expected Exponential Optimization Runtime

In this subsection, we construct an instance called for which the expected optimization runtime of the (1+1) EA may be exponential.

Given three constant numbers , , and such that , thus , , where , , and . Obviously, the total number of Steiner nodes in is ; that is, . As shown in Figure 3, we construct in four steps. First, each pair of adjacent special nodes and is connected by an edge , whose weight is . Second, each pair of adjacent Steiner nodes and is connected by an edge , whose weight is , and each pair of adjacent Steiner nodes and is connected by an edge , whose weight is . Third, and are respectively incident to by edge of weight and incident to by edge of weight , and Steiner node is incident to special node by edge of weight . Finally, Steiner node is incident to special node by edge of weight . The solid edges construct the minimum weight Steiner tree of , which covers all special nodes and all Steiner nodes , and the weight of the minimum weight Steiner tree is . Therefore, the optimal solution of is .

The (1+1) EA needs an exponential expected runtime to find the optimal solution for when it starts with the all-zeros solution.

(**Theorem 11**): For , the (1+1) EA starting with the all-zeros solution finds the optimal solution in expected runtime .

Let . Note that the fitness value of the all-zeros solution is . If , then adding all Steiner nodes in to the all-zeros solution cannot be accepted by the (1+1) EA, as the weight of the minimum spanning tree of is not less than the fitness value of the all-zeros solution. However, when all Steiner nodes in are added, the weight of the minimum spanning tree of is , which is less than the fitness value of the all-zeros solution. So the (1+1) EA accepts only the event of simultaneously adding all Steiner nodes in to the all-zeros solution. The probability of this event is , which implies the expected runtime is .

Starting with any initial solution, the expected optimization runtime of the (1+1) EA may also be exponential.

(**Theorem 12**): For , the (1+1) EA starting with any initial solution finds the optimal solution in expected runtime .

We first prove that starting with any initial solution, the Steiner nodes from can be removed by the (1+1) EA in expected time . Then we prove that the optimal solution could be found by the (1+1) EA in expected time .

Let be the current solution, then is a feasible solution, as each Steiner node can be connected to a special node by an edge in the input graph. This is to say, .

If contains Steiner nodes from , then each of them must be a leaf node connecting with a special node in by an edge of weight . The fitness value can be decreased by through removing any one of such Steiner nodes from . So, the (1+1) EA accepts the mutations which decrease the number of Steiner nodes in coming from . Since the number of such Steiner nodes in will be decreased by at least one in such mutations, at most such mutations will make contain no Steiner nodes from . If there are Steiner nodes coming from contained in , then the probability of such mutations occurring is at least . Therefore, the expected time until contains no Steiner nodes from is .

After removing from all Steiner nodes which come from , if now contains all Steiner nodes from , then it is already the optimal solution. Otherwise, the optimal solution can be found by the (1+1) EA from in one step with probability at least . Therefore, the expected time is .

Theorems 11 and 12 show that the (1+1) EA may need expected exponential optimization runtime to find the optimal solution of .

## 5 Conclusions

We investigate the performance of evolutionary algorithms for Steiner tree problems in this article. We reveal that the (1+1) EA achieves an approximation ratio of 3/2 for the metric STP problem in a special class of quasi-bipartite graphs, which is also NP-hard. This exemplifies that evolutionary algorithms can be good approximation algorithms for NP-hard combinatorial optimization problems, even though they are randomized heuristic algorithms. While the Steiner tree problems are NP-hard, we find that the (1+1) EA efficiently finds the global optima of two GSTP instances where other heuristics may be trapped in local optima. However, on one constructed instance we show that the (1+1) EA may need an exponential expected optimization runtime, which implies that the (1+1) EA may not always be efficient for GSTP.

An open question is whether the (1+1) EA can achieve a better approximation ratio for the metric STP problem in the special class of quasi-bipartite graphs than the one we have obtained by simulating the iterated 1-Steiner heuristic; after all, it is a randomized heuristic. For the STP problem in general graphs, we still know nothing about the approximation performance of the (1+1) EA.

Moreover, little is known theoretically about the performance of the (1+1) EA on other cases of STP, such as rectilinear Steiner tree problems.

Population-based EAs, which use not only a mutation operator but also crossover, are used in practice. The performance of population-based EAs on combinatorial optimization problems has recently become a hot topic (Chen et al., 2009; Jansen et al., 2005; Chen et al., 2012; Doerr et al., 2012). Hence, analyzing the performance of population-based EAs for the STP problem would be interesting future work.

## Acknowledgments

The authors thank the anonymous reviewers for their constructive feedback and valuable comments. This article is supported by the National Natural Science Foundation of China (grant nos. 61472143 and 61562071), the Scientific Research Special Plan of Guangzhou Science and Technology Programme (grant no. 201607010045), and the Natural Science Foundation of Jiangxi Province (grant nos. 20151BAB217008 and 20151BAB207020).