Abstract

In this article, we propose a new evolutionary algorithm for multiobjective optimization called Global WASF-GA (global weighting achievement scalarizing function genetic algorithm), which falls within the aggregation-based evolutionary algorithms. The main purpose of Global WASF-GA is to approximate the whole Pareto optimal front. Its fitness function is defined by an achievement scalarizing function (ASF) based on the Tchebychev distance, in which two reference points are considered (both utopian and nadir objective vectors) and the weight vector used is taken from a set of weight vectors whose inverses are well-distributed. At each iteration, all individuals are classified into different fronts. Each front is formed by the solutions with the lowest values of the ASF for the different weight vectors in the set, using the utopian vector and the nadir vector as reference points simultaneously. Varying the weight vector in the ASF while considering the utopian and the nadir vectors at the same time enables the algorithm to obtain a final set of nondominated solutions that approximate the whole Pareto optimal front. We compared Global WASF-GA to MOEA/D (different versions) and NSGA-II in two-, three-, and five-objective problems. The computational results obtained permit us to conclude that Global WASF-GA gets better performance, regarding the hypervolume metric and the epsilon indicator, than the other two algorithms in many cases, especially in three- and five-objective problems.

1  Introduction

Many real decision problems have to deal with several conflicting criteria, which must be optimized simultaneously. In such cases, the traditional optimization approach, in which a single objective is optimized subject to a given set of constraints, is no longer applicable. Then, a multiobjective optimization problem is formulated, which consists of the simultaneous optimization of several objective functions that mathematically model the criteria, subject to several constraints that determine the feasible set of solutions. In most of the cases, it is impossible to find a solution that optimizes all the objectives at the same time. Instead, so-called Pareto optimal solutions exist, in which an improvement of one objective leads to a deterioration in at least one of the others. The set of all Pareto optimal solutions in the decision space is referred to as the Pareto optimal set, and its image in the objective space is called the Pareto optimal front.

A research field focused on solving multiobjective optimization problems is evolutionary multiobjective optimization (EMO) (Deb, 2001a; Coello et al., 2007). EMO algorithms work with a population of individuals and attempt to find a good approximation of the whole Pareto optimal front by applying several operators, such as selection, crossover, and mutation. They aim at, on the one hand, finding a set of nondominated solutions as close as possible to the Pareto optimal front (convergence) and, on the other hand, representing the whole Pareto optimal front (diversity). One of the main advantages of using EMO algorithms for solving multiobjective optimization problems is that they can manage variables and objective functions of different natures; that is, they can easily handle discontinuous and concave Pareto optimal fronts, and binary and integer-valued variables (Deb, 2001a; Tan et al., 2005; Coello et al., 2007; Deb, 2008a; Zhou et al., 2011). This fact is reflected by the large number of real multiobjective optimization problems solved by EMO algorithms belonging to different areas (see Deb, 2001a; Coello et al., 2007; Zhou et al., 2011, and the references therein).

Achieving convergence and diversity simultaneously in an EMO algorithm is not always easy, especially when handling a large number of objectives. In Pareto-based EMO algorithms (those using the Pareto dominance relation for the fitness evaluation), the number of nondominated solutions in a population becomes very large in many-objective problems. Then, the selection pressure of the Pareto dominance toward the Pareto optimal front decreases, since there is no room for new solutions to be included in the population. This slows down the convergence of the algorithm and reduces the diversity of the population (Khare et al., 2003; Deb and Saxena, 2006; Corne and Knowles, 2007; Ishibuchi et al., 2008a; 2008b). Despite this drawback, one of the most popular Pareto-based EMO algorithms, the nondominated sorting genetic algorithm (NSGA-II) proposed in Deb et al. (2002a), has been successfully applied to many real-life multiobjective optimization problems (Deb, 2001a; Zhou et al., 2011). NSGA-II uses an elite-preserving strategy and an explicit diversity-preserving mechanism, and it has stood out because of its fast nondominated sorting procedure for ranking solutions into several nondominated fronts for the selection of the best individuals.

It is known that fitness evaluations based on scalarizing functions have several advantages in comparison to those based on Pareto dominance, such as the scalability of problems with any number of objectives and high searchability. Also, scalarizing functions improve the computational efficiency for the fitness evaluation, especially for many-objective problems, because the computational effort does not increase exponentially with the number of objectives. In this regard, a group of algorithms based on scalarizing fitness functions are the so-called aggregation-based EMO algorithms. Instead of treating the multiobjective problem as a whole, these algorithms transform the original multiobjective optimization problem into a set of scalarizing (single-objective) optimization subproblems, which are typically solved independently from each other. During the search process, solutions are somehow compared to each other according to the values they reached for each scalarizing subproblem. Among the aggregation-based EMO algorithms, the multiobjective evolutionary algorithm based on decomposition (MOEA/D) suggested in Zhang and Li (2007) is currently one of the most promising algorithms. In MOEA/D, a set of well-spread normalized weight vectors is used to formulate a set of single-objective subproblems. On each subproblem, the same scalarizing function is considered with a different weight vector. Then, at each generation, the new population is composed by the best solution found so far for each subproblem, relying on the assumption that neighbour weighting vectors will produce neighbour solutions.

The EMO field is one of the most active research areas and is constantly in evolution. The works referred hereafter constitute a sample of some recent results published in the field. In Menchaca-Mendez and Coello (2015), the authors propose a new selection mechanism based on -dominance and suggest a new algorithm called GDE-MOEA (generational distance & -dominance multi-objective evolutionary algorithm). This new algorithm is based on the generational distance indicator and a technique based on Euclidean distances to improve the diversity in the population. Pilat and Neruda (2015) suggest a modification of the distribution of the weight vectors in MOEA/D through the introduction of user preferences into the search. These preferences are given by assigning binary preference values to the individuals, and new weight vectors are created that take account of these preferences. Another modification of MOEA/D is proposed in H. Li et al. (2015), where the authors consider two sets of weight vectors: one set is fixed, and another one is generated randomly with the intention of covering disconnected and degenerated Pareto optimal fronts. Also, Zapotecas-Martínez et al. (2015) present a new way for generating the set of weight vectors in MOEA/D. The proposed approach is based on a finite set of points with low discrepancy, which are projected onto the () simplex (which defines the set of weight vectors).

The main purpose of this article is to describe a new EMO algorithm that takes advantage of the strengths of NSGA-II and MOEA/D. The algorithm suggested is called Global WASF-GA, and it extends the idea behind the preference-based EMO algorithm named the weighting achievement scalarizing function genetic algorithm (WASF-GA), recently proposed in A. B. Ruiz et al. (2015b). Following a philosophy similar to that of WASF-GA, Global WASF-GA maintains a diverse set of trade-off solutions by considering, on the one hand, a predefined set of weight vectors and, on the other hand, by using the nadir point and a utopian point in an achievement scalarizing function. In general, the nadir vector is composed of the worst values that each objective function can take, and the utopian point is defined as an objective vector that dominates every Pareto optimal solution. Additionally, an achievement scalarizing function (ASF) (Wierzbicki, 1980), which is normally an extension of a distance function, transforms the original problem into a single-objective minimization problem by using a vector of weights and a reference point. When an ASF is minimized over the feasible set of solutions, in practice, the reference point is projected onto the Pareto optimal front in a direction defined by the inverses of the weights, and a Pareto optimal solution is obtained in this way (Miettinen, 1999).

Roughly speaking, in each generation of Global WASF-GA, the population of parents and offspring is divided into several fronts, as in NSGA-II. However, the classification is done according to the values that each individual takes on the ASF proposed in Wierzbicki (1980), for each one of the weight vectors in the predefined set, and considering simultaneously the nadir and a utopian points as reference points. Thus, we could say that Global WASF-GA falls into the group of aggregation-based EMO algorithms, given that it transforms the original problem into a set of scalarizing subproblems that minimize the ASF considered for each weight vector in the predefined set, using the nadir and utopian points at the same time. In each generation, each subproblem is somehow minimized, first, by sorting the individuals into several fronts according to their ASF values, and second, by selecting those in the lower-level fronts until completing the new population, which are the best ones with respect to each weight vector and nadir and utopian points. From the practical point of view, the Pareto optimal front is approximated by projecting the nadir and utopian points simultaneously while taking into account a set of projection directions (or search directions), which are defined by the set of weight vectors. Furthermore, the weight vectors in the predefined set are generated so that they define an evenly distributed set of projection directions. In that way, the nondominated solutions obtained by Global WASF-GA are also likely to be widely distributed along the Pareto optimal front in most problems.

The rest of this article is organized as follows. In Section 2, we introduce the main concepts and notations used in multiobjective optimization. The proposed algorithm Global WASF-GA is motivated and described in Section 3, pointing out the contribution of our method to the pool of existing EMO algorithms. The computational experiments carried out are shown in Section 4. We present a comparative study of Global WASF-GA using sets of weight vectors generated by means of different procedures, and we also analyse the performance of the new algorithm in comparison to NSGA-II and different versions of MOEA/D. Finally, the conclusion is drawn in Section 5.

2  Concepts and Notation

A multiobjective optimization problem is given by the following formulation:
formula
1
where is the vector of decision variables, , for (), are the objective functions that must be minimized simultaneously, and is the feasible set. The image of the feasible set in the objective space is called the feasible objective region, and it is defined by objective vectors for all .

Usually, there is no feasible solution that can minimize all of the objective functions at the same time, given that the objective functions are typically in conflict. Because of that, the concept of Pareto optimality arises. A solution is Pareto optimal if there does not exist another such as for all and for at least one index j. The corresponding objective vector is called a Pareto optimal objective vector. The set of all Pareto optimal solutions is called the Pareto optimal set and is denoted by E, and the set of all Pareto optimal objective vectors is called the Pareto optimal front and it is denoted by . Also, a solution is said to be weakly Pareto optimal for problem (1) if there does not exist another such that for all . The corresponding objective vectors are called weakly Pareto optimal objective vectors. Note that the Pareto optimal set is a subset of the set of weakly Pareto optimal solutions.

Besides, given , we say that dominates if and only if for all and for at least one index j. We refer to a set of solutions whose objective vectors do not dominate each other as a set of nondominated solutions. Obviously, the Pareto optimal set is a set of nondominated solutions, but it is not true that any set of nondominated solutions is formed by Pareto optimal solutions. Additionally, weakly dominates if and only if for all . Given two sets of objective vectors , we say that Z1(weakly) dominatesZ2 if every objective vector in Z2 is (weakly) dominated by at least one objective vector in Z1.

Other important concepts in multiobjective optimization are the ideal and the nadir objective vectors, also referred to as ideal and nadir points. Respectively, their components define the lower and upper bounds for the objective functions and represent the best and worst values that each objective function can reach in the Pareto optimal front. The ideal objective vector can be calculated by minimizing each objective function individually in the feasible set—that is, for all . The nadir objective vector can be obtained with for all . In practice, the nadir objective vector is difficult to calculate, and commonly it is estimated somehow (Ehrgott and Tenfelde-Podehl, 2003; Deb and Miettinen, 2010; Deb et al., 2010). Sometimes, a vector that dominates every Pareto optimal solution is required, what means that it is strictly better than the ideal objective vector. This vector is called a utopian objective vector or utopian point, denoted by , and can be defined by , where is a small real number, for all .

In order to generate a good approximation of the Pareto optimal front, Global WASF-GA considers an achievement scalarizing function (ASF) (Wierzbicki, 1980). Commonly, an ASF is a single real-valued function that combines the original objective functions of problem (1) and the preferences specified by a decision maker (DM). These preferences can be expressed by means of a reference point , where qi is a desirable value provided by the DM for fi, for every . By minimizing an ASF over the feasible set, the Pareto optimal solution that best satisfies the preferences of the DM is obtained (Miettinen and Mäkelä, 2002). For a reference point , a vector of strictly positive weights (i.e., for every ), and a real value , the ASF considered in Global WASF-GA, proposed in Wierzbicki (1980), is formulated as follows:
formula
2
For every , the weights assigned to each objective function fi can be defined as normalizing coefficients (F. Ruiz et al., 2009), or they can also have a preferential meaning (Luque et al., 2009). The coefficient , which must be a small positive value, is the so-called augmentation coefficient and is used to assure that the solution that minimizes (2) over S is a Pareto optimal solution of problem (1). Thus, if (in which case, (2) is the Tchebychev function), we can only assure that the solution that minimizes (2) over S is a weakly Pareto optimal solution of problem (1) (Miettinen, 1999). Besides, if the ranges of the objectives are in very different scales, the values of the objective functions must be normalized, for instance by dividing by in (2), for every .

In practice, minimizing (2) over the feasible set S means to project the reference point considered onto the Pareto optimal front in the direction defined by the inverses of the weights—that is, in the direction given by (for more details, see Steuer, 1986; Luque et al., 2015). Actually, if the reference point is achievable (which means that there is a feasible solution whose objective values achieve or improve the reference values), the objective values achieved by the minimal solution of (2) improve the given reference values as much as possible. Otherwise, if the reference point is not achievable, minimizing (2) is equivalent to the minmax distance.

It has been demonstrated that any Pareto optimal solution can be found by minimizing (2) over S using the ideal point as a reference point (or any objective vector that dominates it), and varying the weight vector in the whole weight vector space (Wierzbicki, 1986; Miettinen, 1999). This is also true if we consider the nadir point as a reference point (or any objective vector dominated by it) (Miettinen, 1999).

3  Global WASF-GA

As we previously said, Global WASF-GA is an extension of the preference-based EMO algorithm WASF-GA (A. B. Ruiz et al., 2015b), which aims at approximating the whole Pareto optimal front. For that, on the one hand, at each generation of Global WASF-GA, the individuals that are closer to the Pareto optimal front are highlighted by classifying them into different fronts according to the values they achieve on the ASF given in (2). Two reference points are used at the same time for that classification, the nadir point and a utopian point. The lower the values of (2) reached by a solution for one of these reference points, the more this solution is highlighted. On the other hand, in order to maintain diversity among the solutions, a set of weight vectors is used in the algorithm for the procedure to sort individuals. Somehow, each one of these weight vectors defines a search direction for new nondominated solutions close to the Pareto optimal front. In this way, the search process in Global WASF-GA is performed while taking into account multiple search directions at the same time.

In Section 2, we pointed out that if we consider a reference point that dominates the ideal point, such as a utopian point , every Pareto optimal solution can be generated by minimizing (2) using different weights (Wierzbicki, 1986; Miettinen, 1999). In the same way, if the reference point is the nadir point or any objective vector dominated by it, any Pareto optimal solution can be also found by minimizing (2) for different weights (Miettinen, 1999). Based on these assumptions, the main purpose of Global WASF-GA is to generate a set of well-distributed nondominated solutions that approximate the Pareto optimal front by minimizing the ASF given in (2) at each generation, using a utopian and nadir points as reference points simultaneously, and considering a predefined set of weight vectors. Thus, theoretically, any Pareto optimal solution could be found by Global WASF-GA. From the practical point of view, Figure 1 shows the performance of the proposed algorithm in a bi-objective problem. The arrows represent the projection directions determined by a set of weight vectors. It must be mentioned that a utopian point has been considered instead of the ideal point because, in practice, it may be easier to approximate the extremes of the Pareto optimal front by using a utopian point in (2). For the same reason, the nadir point has been slightly worse in practice, that is, we have used an objective vector dominated by the nadir point. However, for simplicity in the notation, we refer to this objective vector as .

Figure 1:

Performance of Global WASF-GA in a bi-objective optimization problem.

Figure 1:

Performance of Global WASF-GA in a bi-objective optimization problem.

Let be the number of weight vectors considered in the predefined set, N the population size (), h the current generation, Ph the population of individuals at generation h, Qh the population of offspring at generation h, and the nth front at generation h. Algorithm 1 contains the main steps of Global WASF-GA. We denote by the number of elements in a set A. As we explained in Section 2, when the ASF given in (2) is minimized over S for and , these two points are projected onto the Pareto optimal front in the direction defined by the inverses of the weights used. Then, it is important that the weight vectors used in Global WASF-GA define a well-spread set of projection directions. Therefore, these weight vectors must verify that the vectors formed by their inverse components, termed inverse weight vectors, are well-distributed in the weight vector space .

formula

In Algorithm 1, initially, a population of individuals is randomly created, and the nadir and ideal points are approximated using the worst and best objective function values achieved by the individuals in that population. Subsequently, the utopian point is calculated accordingly and the nadir point is slightly worsened (e.g., by adding to each component the amount used for computing the utopian point). At each generation h, a new population of offspring solutions Qh is created as usual, by applying selection, crossover, and mutation operators. If some component of the nadir or utopian point is improved by an individual in Qh, the two points are updated. Subsequently, the population Zh formed by parents and offspring (of size ) is divided into several fronts, as follows. The first front is formed, on the one hand, by the solutions in Zh with the lowest values of (2) for , taking into account half of the weight vectors (the odd order ones), and, on the other hand, by the solutions in Zh with the lowest values of (2) for , considering the other half of the weight vectors (the even order ones). These solutions are removed from Zh. Similarly, the second front is formed by the solutions in Zh with the next lowest values of (2) for and half of the weight vectors, and by the solutions in Zh with the next lowest values of (2) for and the other half of the weight vectors. This process continues until every individual in Zh has been classified. Finally, the population for the next generation consists of the N individuals in the lowest-level fronts. If the last front considered has more solutions than the remaining space in , the solutions from this front with the lowest values of (2) are selected.

It must be noted that, when new solutions are being selected to form a new front in Algorithm 1 (lines 17–22), it may happen that one solution reaches the minimum value of (2) for more than one weight vector. To maintain diversity in the population, repeating solutions in the same front is not allowed in that case. Thus, once a solution is selected for one weight vector, it cannot be chosen for another weight vector.

Constrained multiobjective optimization problems can be handled in Global WASF-GA as follows. At each generation h, first the feasible solutions in Zh can be classified into several fronts, as already explained in Algorithm 1 (lines 17–22), and second, the infeasible individuals in Zh can be divided into (individual) fronts according to their overall constraint violations, for example, following the procedure suggested in Deb et al. (2002a).

Our initial experiments have indicated that Global WASF-GA performs better when the utopian and nadir points are used at the same time than when only one of them is considered. Using both points simultaneously allows the search process to adapt to the shape of the Pareto optimal front.

As we previously said, the weight vectors considered in Global WASF-GA must verify that the vectors formed by their inverse components are well-distributed. Subsection 3.2 describes the procedure to generate a set of weight vectors with this property. Thus, by using this set of weight vectors, Global WASF-GA projects the nadir and utopian points onto the Pareto optimal front, taking into account a set of evenly distributed projection directions. Since the projection directions are widely distributed, the nondominated solutions obtained are also likely to be widely distributed along the Pareto optimal front in most problems. Finally, it should be noted that the weights used must be strictly positive, since the solution that minimizes (2) over S is assured to be Pareto optimal just in this case (Miettinen, 1999).

To show that different solutions are generated by using a weight vector and by using its inverse for the same reference point, let us consider the DTLZ2 test problem with three objective functions (Deb et al., 2002b), the reference point , and the weight vector . For this reference point, we can calculate four Pareto optimal solutions by minimizing over S the following functions: (solution 1) the Tchebychev function with for all ; (solution 2) the ASF given in (2) with for all ; (solution 3) the Tchebychev function with for all ; and (solution 4) the ASF given in (2) with for all . In Figure 2, all the solutions found have been plotted together with the Pareto optimal front of the DTLZ2 problem. In this case, solutions 1 and 2 match, and solutions 3 and 4 are also equal, due to the fact that minimizing over S the Tchebychev function and the function (2) produce the same solution in problems where every weakly Pareto optimal solution is also Pareto optimal (Miettinen, 1999).

Figure 2:

Different solutions found for in the DTLZ2 problem.

Figure 2:

Different solutions found for in the DTLZ2 problem.

Next, further comments about Global WASF-GA are discussed in Subsection 3.1, and the procedures to generate the weight vectors are described in Subsection 3.2.

3.1  Further Comments

Although Global WASF-GA is based on the WASF-GA algorithm, there are some differences between them. Mainly, Global WASF-GA aims at approximating the whole Pareto optimal front, while WASF-GA is a preference-based EMO algorithm that concentrates the search for nondominated solutions just on a region of interest of the Pareto optimal front, defined by a reference point given by the DM. Actually, this reference point is used to sort the individuals into several fronts. In contrast, Global WASF-GA does not introduce any preference information into the search process, and it considers two reference points (the utopian and nadir points) for classification of the individuals.

As introduced in Section 1, Global WASF-GA is similar to NSGA-II and MOEA/D. Hereafter, we point out the similarities and differences between Global WASF-GA and both of these algorithms.

On the one hand, NSGA-II is well-known by its fast nondominated sorting procedure for ranking solutions into several nondominated fronts. As is stated in Deb (2008b), its simplicity, modularity, and low number of initial parameters are probably the main reasons for the popularity of NSGA-II. However, it is based on Pareto dominance, which reduces the search ability of the algorithm in many-objective problems (Khare et al., 2003; Ishibuchi et al., 2008a; 2008b). Global WASF-GA makes use of the same underlying philosophy for sorting the individuals into several fronts, but the classification is not done according to Pareto dominance. Instead, an ASF is employed as a fitness function. In this way, Global WASF-GA takes advantage of the simplicity of NSGA-II, without relying on Pareto dominance and employing a scalarizing function. This avoids the convergence problems that arise from Pareto dominance and allows for the scalability of the problem in the presence of many objectives, which also improves the computational efficiency of the fitness evaluation. Additionally, Global WASF-GA does not use any crowding distance to maintain diversity in the population, as NSGA-II does. The diversity is instead enhanced by the use of a set of well-spread projection directions, and by not allowing repeating solutions in the same front.

On the other hand, MOEA/D is a high-performance algorithm, and Global WASF-GA is similar with this algorithm in two respects. First, both algorithms use a scalarizing function as a fitness function, and second, both of them simultaneously employ several weight vectors for searching new solutions (aggregation methods). As we previously mentioned, using scalarizing fitness functions makes these two algorithms especially suitable for problems with many objectives, since the problem can be easily scalarized for any number of objectives. Furthermore, the use of a set of weight vectors means that, in practice, both algorithms approximate the Pareto optimal front while taking into account multiple search directions at the same time.

However, the difference between most variants of MOEA/D and Global WASF-GA is the aggregation (scalarizing) function considered in each algorithm: Global WASF-GA uses the ASF given in (2) for the nadir and utopian points, and MOEA/D uses the weighted Tchebychev function for the ideal point. Actually, different scalarizing functions have been tested in MOEA/D (Zhang and Li, 2007; Ishibuchi et al., 2013; Sato, 2014), and even works such as Ishibuchi et al. (2009, 2010) have proposed the simultaneous use in MOEA/D of several aggregation functions, using different sets of weight vectors for each one. This is similar to what it is done in Global WASF-GA, because we also use different subsets of the weight vectors for the utopian point and the nadir point. It is convenient to point out that most versions of MOEA/D just consider one reference point in the scalarizing function (the ideal point), but the nadir and utopian points are simultaneously used in Global WASF-GA. This difference allows our algorithm to better adapt to the shape of the Pareto optimal front. In this regard, Sato (2014) proposed that an inverted PBI (penalty-based boundary intersection) approach be used in MOEA/D, which optimizes, for the nadir point, a maximization problem similar to the minimization one formulated for the ideal point in the PBI approach (Zhang and Li, 2007). The authors claimed that the part of the Pareto optimal front that may not be approximated by the PBI approach can be better approximated with the inverted PBI approach.

More recently, Jiang and Yang (in press) have proposed an improved MOEA/D in which, first, MOEA/D uses the Tchebychev function with the ideal point, as usual, and after some generations, the Tchebychev function is formulated for the nadir point if there are fewer solutions in the extreme regions of the Pareto optimal front than in the intermediate region. The authors state that “Thus, it is natural to achieve uniform solutions for a convex MOP by solving the scalar optimization subproblems in a reversed form” and “When handling an MOP, however, its convexity–concavity is not always known beforehand. This gives rise to the problem on how to properly choose the subproblem form.” Both Sato (2014) and Jiang and Yang (in press) reinforce the philosophy of Global WASF-GA of using the utopian and nadir points simultaneously to better cover the Pareto optimal front. But our algorithm differs from both of these proposals because it uses (2) for the utopian and nadir vectors simultaneously in the entire algorithm, by means of two subsets of weight vectors, while Sato (2014) only uses the nadir point and Jiang and Yang (in press) modularizes the search for new solutions by first using the ideal point and only later considering the nadir point. Anyway, it must be mentioned that Global WASF-GA and MOEA/D are aggregation-based algorithms in which different scalarizing functions can be used.

Taking into account the aggregation function, several reasons have led us to consider the ASF (2) in Global WASF-GA, from among all the aggregation functions that can be found in the literature. First, the weighted sum cannot handle problems with nonconvex Pareto optimal fronts, due to its inability to generate nonsupported Pareto optimal solutions, while (2) can generate every Pareto optimal solution, including the nonsupported ones (Miettinen, 1999). Second, the weighted Tchebychev function and (2) differ in the augmentation term, but this term assures that the minimal solution of (2) is Pareto optimal, avoiding producing weakly Pareto optimal solutions (Miettinen, 1999; Luque et al., 2015). On the other hand, regarding the PBI and inverted PBI approaches, Global WASF-GA takes the advantages of both approaches, since it uses (2) for the utopian and nadir points simultaneously. Lately, the NBI (normal boundary intersection) and Tchebychev approaches have been combined in the NBI-style Tchebychev approach (Zhang et al., 2010). This approach aims at overcoming the scaling problems associated with the Tchebychev approach and the difficulties of the NBI approach in MOEA/D due to the introduction of additional constraints. This case is in line with Figueira et al. (2010), where the same weight vector is fixed for all subproblems formulated for a set of evenly distributed reference points. However, the NBI-style Tchebychev approach was just formulated for the bi-objective case, and it cannot be easily extended to higher-dimensional problems, as is stated in Jiang and Yang (in press). Other aggregation functions differ from each other in the weight vectors considered (Liu et al., 2009; Jiao et al., 2013; Qi et al., 2014). The performance of Global WASF-GA for the weight vectors generated by different procedures is studied in Section 4.2.1.

With respect to the framework, differences between Global WASF-GA and MOEA/D are minimal. When the neighbourhood size is less than the population size in MOEA/D, this algorithm seeks the solution for each scalarizing subproblem (or for each weight vector) in its neighbourhood at each generation, while Global WASF-GA classifies all the solutions into different fronts according to their values on the ASF and selects the individuals in the lowest-level fronts. But recent works have proposed to fit the neighbourhood size to the population size. In this situation, both algorithms are very similar, with minor differences. If the population and neighbourhood sizes are equal, at each generation of MOEA/D and for each subproblem (or weight vector), the new solution generated replaces those solutions associated with the subproblems in the neighbourhood (its size is the population size) if it improves their corresponding aggregation function value. Then, on the one hand, the new solution may replace several solutions from the whole population, and on the other hand, the new solution may be selected as a parent to generate other solutions in the same generation. Since repeating solutions in the same front for different weight vectors is not allowed in Global WASF-GA,1 the same solution can appear only once in the population for the next generation in Global WASF-GA. It must be mentioned that there are new versions of MOEA/D, such as MOEAD-STM (K. Li et al., 2014), which also do not allow solutions to repeat, in order to maintain a higher diversity in the population. Besides, at each generation of Global WASF-GA, parents and offspring are combined, and the best solution for each weight vector is selected, taking into account all parents and offspring at once. This means that the offspring solutions are not used to generate new solutions in the same generation.

Regarding the number of weight vectors, MOEA/D requires as many weight vectors as the population size, but some works have suggested considering fewer weight vectors through the selection of some of them (Palmers et al., 2009). In Global WASF-GA, the number of weight vectors can be set below the population size (), although, in our computational tests, we have considered for both algorithms, to make a fair comparison with MOEA/D. Of course, can be less than N and, in that case, the approximation of the Pareto optimal front is obtained with more than one frontier in Global WASF-GA (). In an interactive version of WASF-GA (A. B. Ruiz et al., 2015a), this issue has been considered in the same way, and was too much lower than N.

In relation to the parameters needed by each algorithm, the neighbourhood size can influence the efficiency of MOEA/D, according some studies. On the basis of experimental results, Zhang and Li (2007) proposed to define the neighbourhood size as a large fraction of the population size, and recently, Ishibuchi et al. (2013) concluded that the specification of an appropriate neighbourhood size is problem-dependent. Global WASF-GA also needs to set some parameters ( in the ASF (2), and the real values , for , needed to compute the utopian point), but they can be set in general as and , , for any problem.

The augmentation coefficient is used to assure that the solution generated by minimizing (2) over the feasible set is Pareto optimal, and not just weakly Pareto optimal. The impact of considering in this ASF considering normalized values is insignificant for the algorithm’s performance (see Miettinen, 1999; Kaliszewski, 1994). However, if a bigger value of were used, the ASF given in (2) would be a combination of the and L1 metrics, which may not be useful for generating all Pareto optimal solutions (for more details, see Andre and Romero, 2008). That is why we suggest using .

On the other hand, regarding the values of for , it should be mentioned that it is proved theoretically that every Pareto optimal solution that can be generated (i.e., every Pareto optimal objective vector) can be obtained by minimizing the ASF given in (2) over the feasible set, considering the ideal point (or any point that dominates it) as a reference point and varying the weight vector (Miettinen, 1999). However, if a Pareto optimal objective vector has a component that matches with the corresponding ideal value, it would be required to consider a weight vector with the corresponding component equal to nil, which may produce inefficiencies. To avoid such a situation, we have considered the utopian point, in which each ideal value is improved slightly using , as we indicated in Section 2. For more details about the possible implications of considering the utopian vector instead of the ideal vector, see, for example, Steuer (1986). Furthermore, the nadir point has also been modified slightly to avoid the same situation. Thus, considering small values for set according to the ranges of the objective functions ( has been set as 1% of the difference between the corresponding nadir and ideal values) is very convenient for this reason, and its influence is also insignificant to the performance of the algorithm (this has been checked computationally).

3.2  Generation of the Weight Vectors

Let be the number of weight vectors needed in Global WASF-GA. Let us denote by A = the predefined set of weight vectors considered in Global WASF-GA, and by B = a representative set of weight vectors in . Once the vectors in the set B are generated, the weight vectors in A are defined as the inverse vectors of the ones in B. That is, for every and , the weight vector is calculated as follows:
formula
3

There are several procedures for generating a representative set B of vectors in (Scheffé, 1958, 1963; Wagner et al., 2013), but it is important to have in mind that whatever the procedure selected for generating B is, the weight vectors used in Global WASF-GA are the ones given in (3). In this study, the performance of Global WASF-GA is analysed with different procedures for the generation of the set B, which are described in what follows. It is important to highlight that the weight vectors generated by any of these procedures are suitable for solving any problem (with the same number of objectives) using Global WASF-GA.

From the practical point of view, considering the set is similar to considering the set with , a small positive amount.

3.2.1  Two-Objective Functions

In the bi-objective case, we have employed three ways to generate B:

  • Even Design. Let us consider weights that are evenly distributed as much as possible. For each , the weight vector is defined by:
    formula
    4
    where is a small positive value. We suggest using in order not to move too far from the extremes of the interval . However, this value can be changed if so desired. It can be observed that the value of defines the size of the step from one weight to the next in , which is . These weight vectors belong to the standard 1-simplex in , since for all (Scheffé, 1958).
  • Radial Design. These weight vectors are defined by means of polar coordinates (Vilenkin et al., 1965). Let us consider , a set of evenly distributed angles in . For each , the weight vector is obtained in this case as follows:
    formula
    5
    Observe that these weight vectors verify that their L2-norm is equal to 1—that is, for every .
  • WTB Design. Recently, Wagner et al. (2013) have proposed a new way of generating weight vectors for bi-objective problems (the name given to this procedure corresponds to the initial capital letters of the authors). Through a power function that depends on a parameter , this procedure generates weight vectors’ distributions that express increased preferences regarding the extremes of the Pareto optimal front. For , the distribution obtained is uniform, and the higher the parameter , the greater is the concentration of the weight vectors in the extremes.

3.2.2  Three or More Objective Functions

When , the generation of a set of weight vectors equally distributed along is not as straight forward. Let us see three ways to generate the vectors of weighs in B:

  • Even Design. In this procedure, a large enough set of weight vectors in , denoted by W, is first generated. For that, the possible values that can be assumed by the components of the vectors in W are initially obtained by calculating equidistant values in . Let and S be two positive real parameters, where is a small value used to avoid having weights equal to 0, and S is the size of the step between two consecutive values in . The equidistant values in are generated as follows:
    formula
    6
    where [ denotes the integer part of a number]. We suggest using and , although other values can be considered.Subsequently, we consider the set of weight vectors in in which each component is equal to one wr, for an index , and we normalize these vectors by dividing each component by the sum of all the components. Then, the set W is formed by the resulting normalized weight vectors. Finally, the most representative vectors of W are selected by dividing W into clusters using, for instance, the k-means clustering method (MacQueen, 1967). The centroid weight vectors of these clusters are the vectors of weights needed (the set B). Note that the set of weight vectors W can be used to obtain any number of weight vectors.
  • Radial Design. In this case, the weight vectors are defined by means of spherical coordinates (Vilenkin et al., 1965). Let be a set of angles belonging to . Let us assume that, for each , the angle can take any value in a set of evenly distributed values in , and let L be the number of evenly distributed values considered in . Then, for each combination of angles , the following vectors are calculated:
    formula
    7
    The number of weight vectors obtained is . Then, L must be set so that and, if , the most representative weight vectors in the set of vectors generated must be selected—for instance, by following the same procedure as in the even design.
  • Simplex Design. In this case, we consider the weight vectors in the standard -simplex generated using the -simplex-lattice design (Scheffé, 1958). This procedure generates a set of weight vectors. Then, L must be set so that and, if , the most representative weight vectors in the set of vectors generated must be selected—for instance, by following the same procedure as in the even design.

4  Computational Tests

To analyse the effectiveness of the new algorithm, we carried out different computational experiments for testing our algorithm in different scenarios in comparison to MOEA/D and NSGA-II. Global WASF-GA has been implemented in Java and incorporated to the jMetal framework (Durillo and Nebro, 2011), where implementations of the other algorithms are also available. This framework is an object-oriented Java-based framework for multiobjective optimization using metaheuristic algorithms.

The multiobjective test problems involved in our study are the following ones: two-objective problems from the families ZDT (Zitzler et al., 2000), WFG (Huband et al., 2007), LZ09 (H. Li and Zhang, 2009), and CEC09 (Zhang et al., 2008); problems with three objective functions from the families DTLZ (Deb et al., 2002b), WFG, and CEC09, and the three-objective problem LZ09F6 from the LZ09 family; and finally problems with five objectives from the WFG family.

To compare the performance of the algorithms, 30 independent runs per instance were performed for each algorithm. The quality of the final populations was measured using the hypervolume (HV) indicator proposed in Zitzler and Thiele (1999) and the unary epsilon additive (EPS) indicator suggested in Zitzler et al. (2003). The hypervolume of a population P can be defined as the hypervolume of the portion of the objective space that is dominated by the solutions in P and is bounded by a reference point dominated by all the solutions in P. It is better to have a hypervolume indicator as high as possible. For a minimization multiobjective problem and a reference set R of the Pareto optimal front, the unary additive epsilon indicator of P is the minimum value that must be added to all the objective vectors of the solutions in R, such that the resulting transformed set is weakly dominated by the objective vectors of the solutions in P. Having an EPS value less than or equal to 0 implies that P weakly dominates the reference set R, so the EPS indicator is to be minimized. The experimental results in Subsections 4.2.1, 4.2.2, 4.2.3, and 4.2.4 report the average HV and EPS indicator values in the 30 runs. According to this, a representative set R of the Pareto optimal front of each problem is required to compute the EPS indicator. For the problems with two and three objectives, the sets R used are the approximations of the true Pareto optimal fronts that are available in the jMetal framework. Since there are no Pareto optimal front approximations for the five-objective WFG problems in the jMetal framework, R has been obtained for these problems as follows. At the WFG website, there is a toolkit2 to generate sets of solutions for any WFG problem, taking as input the position and distance-related parameters, the number of objectives, and the number of solutions one desires to generate. In order to have well-distributed approximations of the Pareto optimal fronts, we generated 100,000 nondominated solutions using the WFG Toolkit for each test problem considered with five objectives, and then we clustered them to obtain 2,000 solutions, which constitute the representative set R for these problems. For clustering, we applied a modified version of the k-means clustering algorithm, which performs like the original algorithm but, once the centroids are obtained, the final representative of each cluster is the closest nondominated solution in the population to each centroid. This modification is needed to assure that, on each problem, the set R is formed by solutions generated by the WFG Toolkit. Finally, in all our experiments, the reference point used to compute the HV values was set as the nadir point of each problem (see Table 1), which was estimated from the Pareto optimal fronts available. In addition, the HV values have been normalized by taking into account the HV of R.

Table 1:
Nadir points used for calculating the HV indicator.
ProblemObj.Nadir pointProblemObj.Nadir point
ZDT1 (1.0, 1.0) DTLZ1 (0.49, 0.495, 0.5) 
ZDT2 (1.0, 1.0) DTLZ2 (1.0, 0.9998, 0.9998) 
ZDT3 (0.8520, 1.0) DTLZ3 (1.0, 0.9998, 0.9998) 
ZDT4 (1.0, 1.0) DTLZ4 (1.0, 0.9888, 0.9888) 
ZDT6 (1.0, 0.9212) DTLZ5 (0.7071, 0.7071, 0.9999) 
WFG1 (1.9999, 4.0) DTLZ6 (0.7071, 0.7071, 0.9999) 
WFG2 (1.9764, 4.0) DTLZ7 (0.86, 0.86, 6.0) 
WFG3 (1.9818, 3.9974) WFG1 (1.9944, 3.9996, 6.0) 
WFG4 (2.0, 3.9965) WFG2 (1.9433, 3.9651, 5.9998) 
WFG5 (1.9995, 3.9876) WFG3 (0.9999, 1.9998, 5.9998) 
WFG6 (2.0, 4.0) WFG4 (1.9991, 3.9999, 6.0) 
WFG7 (2.0, 4.0) WFG5 (1.9999, 3.9829, 5.98344) 
WFG8 (2.0, 4.0) WFG6 (1.9995, 3.9999, 6.0) 
WFG9 (1.9987, 3.971) WFG7 (1.9996, 3.9991, 6.0) 
LZ09_F1 (1.0, 1.0) WFG8 (1.9995, 3.9999, 6.0) 
LZ09_F2 (1.0, 1.0) WFG9 (1.9982, 3.9847, 5.9890) 
LZ09_F3 (1.0, 1.0) LZ09_F6 (1.0, 1.0, 1.0) 
LZ09_F4 (1.0, 1.0) CEC2009_UF8 (1.0, 1.0, 1.0) 
LZ09_F5 (1.0, 1.0) CEC2009_UF9 (1.0, 1.0, 1.0) 
LZ09_F7 (1.0, 1.0) CEC2009_UF10 (1.0, 1.0, 1.0) 
LZ09_F8 (1.0, 1.0) WFG1 (1.2254, 2.8551, 5.6331, 7.7653, 10.11062) 
LZ09_F9 (1.0, 1.0) WFG2 (1.5857, 2.5838, 5.5332, 7.7187, 9.9995) 
CEC2009_UF1 (1.0, 1.0) WFG3 (0.2499, 0.4998, 1.4995, 3.9988, 9.9974) 
CEC2009_UF2 (1.0, 1.0) WFG4 (1.6833, 3.3879, 5.6633, 7.9548, 9.9999) 
CEC2009_UF3 (1.0, 1.0) WFG5 (1.9442, 3.7301, 5.8585, 7.9329, 9.9646) 
CEC2009_UF4 (1.0, 1.0) WFG6 (1.8910, 3.6894, 5.928, 7.9772, 9.9999) 
CEC2009_UF5 (1.0, 1.0) WFG7 (1.7884, 3.7984, 5.8397, 7.9853, 9.9999) 
CEC2009_UF6 (1.0, 1.0) WFG8 (1.9521, 3.8443, 5.8933, 7.9827, 9.9999) 
CEC2009_UF7 (1.0, 1.0) WFG9 (1.7860, 3.6238, 5.8502, 7.9419, 9.9676) 
ProblemObj.Nadir pointProblemObj.Nadir point
ZDT1 (1.0, 1.0) DTLZ1 (0.49, 0.495, 0.5) 
ZDT2 (1.0, 1.0) DTLZ2 (1.0, 0.9998, 0.9998) 
ZDT3 (0.8520, 1.0) DTLZ3 (1.0, 0.9998, 0.9998) 
ZDT4 (1.0, 1.0) DTLZ4 (1.0, 0.9888, 0.9888) 
ZDT6 (1.0, 0.9212) DTLZ5 (0.7071, 0.7071, 0.9999) 
WFG1 (1.9999, 4.0) DTLZ6 (0.7071, 0.7071, 0.9999) 
WFG2 (1.9764, 4.0) DTLZ7 (0.86, 0.86, 6.0) 
WFG3 (1.9818, 3.9974) WFG1 (1.9944, 3.9996, 6.0) 
WFG4 (2.0, 3.9965) WFG2 (1.9433, 3.9651, 5.9998) 
WFG5 (1.9995, 3.9876) WFG3 (0.9999, 1.9998, 5.9998) 
WFG6 (2.0, 4.0) WFG4 (1.9991, 3.9999, 6.0) 
WFG7 (2.0, 4.0) WFG5 (1.9999, 3.9829, 5.98344) 
WFG8 (2.0, 4.0) WFG6 (1.9995, 3.9999, 6.0) 
WFG9 (1.9987, 3.971) WFG7 (1.9996, 3.9991, 6.0) 
LZ09_F1 (1.0, 1.0) WFG8 (1.9995, 3.9999, 6.0) 
LZ09_F2 (1.0, 1.0) WFG9 (1.9982, 3.9847, 5.9890) 
LZ09_F3 (1.0, 1.0) LZ09_F6 (1.0, 1.0, 1.0) 
LZ09_F4 (1.0, 1.0) CEC2009_UF8 (1.0, 1.0, 1.0) 
LZ09_F5 (1.0, 1.0) CEC2009_UF9 (1.0, 1.0, 1.0) 
LZ09_F7 (1.0, 1.0) CEC2009_UF10 (1.0, 1.0, 1.0) 
LZ09_F8 (1.0, 1.0) WFG1 (1.2254, 2.8551, 5.6331, 7.7653, 10.11062) 
LZ09_F9 (1.0, 1.0) WFG2 (1.5857, 2.5838, 5.5332, 7.7187, 9.9995) 
CEC2009_UF1 (1.0, 1.0) WFG3 (0.2499, 0.4998, 1.4995, 3.9988, 9.9974) 
CEC2009_UF2 (1.0, 1.0) WFG4 (1.6833, 3.3879, 5.6633, 7.9548, 9.9999) 
CEC2009_UF3 (1.0, 1.0) WFG5 (1.9442, 3.7301, 5.8585, 7.9329, 9.9646) 
CEC2009_UF4 (1.0, 1.0) WFG6 (1.8910, 3.6894, 5.928, 7.9772, 9.9999) 
CEC2009_UF5 (1.0, 1.0) WFG7 (1.7884, 3.7984, 5.8397, 7.9853, 9.9999) 
CEC2009_UF6 (1.0, 1.0) WFG8 (1.9521, 3.8443, 5.8933, 7.9827, 9.9999) 
CEC2009_UF7 (1.0, 1.0) WFG9 (1.7860, 3.6238, 5.8502, 7.9419, 9.9676) 

4.1  Parameters Settings

In all experiments carried out, the 30 independent runs were performed using the following parameters:

  • The population sizes were 200 individuals for the two-objective problems, 300 individuals for the three-objective problems, and 1,000 individuals for the five-objective problems.

  • 300, 400, and 600 generations were executed on each run for the two-, three-, and five-objective problems, respectively.

  • The SBX crossover operator (Deb, 2001b) was used in Global WASF-GA and NSGA-II, with a distribution index and a probability . In MOEA/D, the differential evolution operator (H. Li and Zhang, 2009) was set as the crossover operator in Experiments II and IV, in which case the default values proposed by H. Li and Zhang (2009) and Zhang et al. (2009) were used for the crossover ratio and the scale factor ( and ). In Experiment III, we considered the SBX operator as the crossover operator in MOEA/D, with and .

  • The polynomial mutation operator (Deb, 2001b) was considered in all of the algorithms, with a distribution index and a mutation probability , where n is the number of variables.

  • The stopping criterion used was the maximal number of generations on each case.

  • The number of weight vectors used in Global WASF-GA and in MOEA/D, denoted by , was set using the population size N—that is, in both algorithms.3 In Global WASF-GA, we used the sets of weight vectors described in Subsection 3.2. The vectors of weights used in MOEA/D for Experiments II and IV were generated according to Zhang and Li (2007).4 For Experiment III, MOEA/D was executed using the weight vectors generated by the even design described in Subsection 3.2.

  • The utopian objective vector used in Global WASF-GA was defined as , where is 1% of the difference between the corresponding nadir and ideal values, for every .

  • Additionally, MOEA/D used the following control parameters:

  • Number of weight vectors in the neighbourhood: T = 20.

  • Probability that parent solutions are selected from the neighbourhood: .

  • Maximal number of solutions replaced by each child solution: .

  • For the three-objective WFG problems, we used the values 4 and 2 for the distance- and position-related parameters, respectively. In the WFG problems with five objectives, we set them to 4 and 4, respectively.

4.2  Experimental Results

In what follows, we show and analyse the results obtained for the HV and EPS indicators in four different experiments in which Global WASF-GA was tested in different scenarios against MOEA/D and NSGA-II. We checked the performances of the algorithms in 29 two-objective problems, 20 three-objective problems, and nine five-objective problems.

4.2.1  Experiment I: Global WASF-GA with Different Sets of Weight Vectors

In this section, we test the performance of our algorithm for the different sets of weight vectors generated by the designs described in Subsection 3.2. As described there, the even, radial, and WTB designs are considered for two-objective problems, while the even, simplex, and radial designs are used for the rest of problems. Regarding the WTB design, the parameter that controls the distribution near the extremes of the Pareto optimal front has been set to and . It should be mentioned that gives a uniform distribution of weights equal to that of the even design in the bi-objective case.

For the HV indicator, the means and standard deviations for each test problem can be seen in Tables 2 and 3 and, for the EPS indicator, the means and standard deviations are given in Tables 4 and 5. In these tables, we have coloured in dark grey the design that has obtained the best mean, and in light grey the one with the second best mean.

Table 2:
Experiment I—Results for the HV indicator in the two-objective problems.
ProblemEvenRadialWTB ()WTB ()
ZDT1     
ZDT2     
ZDT3     
ZDT4     
ZDT6     
WFG1     
WFG2     
WFG3     
WFG4     
WFG5     
WFG6     
WFG7     
WFG8     
WFG9     
LZ09F1     
LZ09F2     
LZ09F3     
LZ09F4     
LZ09F5     
LZ09F7     
LZ09F8     
LZ09F9     
CEC09UF1     
CEC09UF2     
CEC09UF3     
CEC09UF4     
CEC09UF5     
CEC09UF6     
CEC09UF7     
ProblemEvenRadialWTB ()WTB ()
ZDT1     
ZDT2     
ZDT3     
ZDT4     
ZDT6     
WFG1     
WFG2     
WFG3     
WFG4     
WFG5     
WFG6     
WFG7     
WFG8     
WFG9     
LZ09F1     
LZ09F2     
LZ09F3     
LZ09F4     
LZ09F5     
LZ09F7     
LZ09F8     
LZ09F9     
CEC09UF1     
CEC09UF2     
CEC09UF3     
CEC09UF4     
CEC09UF5     
CEC09UF6     
CEC09UF7     
Table 3:
Experiment I—Results for the HV indicator in the three and five-objective problems.
ProblemObj.EvenSimplexRadial
DTLZ1    
DTLZ2    
DTLZ3    
DTLZ4    
DTLZ5    
DTLZ6    
DTLZ7    
WFG1    
WFG2    
WFG3    
WFG4    
WFG5    
WFG6    
WFG7    
WFG8    
WFG9    
LZ09F6    
CEC09UF8    
CEC09UF9    
CEC09UF10    
WFG1    
WFG2    
WFG3    
WFG4    
WFG5    
WFG6    
WFG7    
WFG8    
WFG9    
ProblemObj.EvenSimplexRadial
DTLZ1    
DTLZ2    
DTLZ3    
DTLZ4    
DTLZ5    
DTLZ6    
DTLZ7    
WFG1    
WFG2    
WFG3    
WFG4    
WFG5    
WFG6    
WFG7    
WFG8    
WFG9    
LZ09F6    
CEC09UF8    
CEC09UF9    
CEC09UF10    
WFG1    
WFG2    
WFG3    
WFG4    
WFG5    
WFG6    
WFG7    
WFG8    
WFG9    
Table 4:
Experiment I—Results for the EPS indicator in the two-objective problems.
ProblemEvenRadialWTB ()WTB ()
ZDT1     
ZDT2     
ZDT3     
ZDT4     
ZDT6     
WFG1     
WFG2     
WFG3     
WFG4     
WFG5     
WFG6     
WFG7     
WFG8     
WFG9     
LZ09F1     
LZ09F2     
LZ09F3     
LZ09F4     
LZ09F5