## Abstract

Pareto-based multi-objective evolutionary algorithms face grand challenges in solving many-objective optimization problems due to their inability to maintain both convergence and diversity in a high-dimensional objective space. Existing approaches usually modify the selection criteria to overcome this issue. In contrast, we propose a novel meta-objective (MeO) approach that transforms many-objective optimization problems into new optimization problems that are easier for Pareto-based algorithms to solve. MeO converts a given many-objective optimization problem into a new one that has the same Pareto optimal solutions and the same number of objectives as the original one. Each meta-objective in the new problem consists of two components, which measure the convergence and diversity performances of a solution, respectively. Since MeO only converts the problem formulation, it can be readily incorporated into any multi-objective evolutionary algorithm, including non-Pareto-based ones. In particular, it can boost Pareto-based algorithms' ability to solve many-objective optimization problems. Because the convergence and diversity performances of a solution are evaluated separately, traditional density-based selection criteria, for example, crowding distance, will no longer mistake a solution with poor convergence performance for a solution with a low density value. By penalizing a solution in terms of its convergence performance in the meta-objective space, Pareto dominance becomes much more effective for a many-objective optimization problem. A comparative study validates the competitive performance of the proposed meta-objective approach in solving many-objective optimization problems.

## 1  Introduction

A great number of real-world applications involve multiple objectives to be optimized simultaneously. These Multi-objective Optimization Problems (MOPs) are commonly seen in a variety of disciplines, such as industrial scheduling (Han et al., 2015, 2017; Li et al., 2018). The conflict of objectives implies that there is no single optimal solution to an MOP, rather a set of trade-off solutions, known as the Pareto optimal solutions.

Over the past two decades, a large number of Multi-Objective Evolutionary Algorithms (MOEAs) have been proposed. Compared to traditional mathematical programming techniques, MOEAs are particularly suited in searching for the Pareto optimal set in one single run. Generally, the task of MOEAs is to find solutions not only close to the Pareto optimal front (i.e., convergence measure), but also uniformly distributed (i.e., diversity measure).

Among various methods, Pareto-based MOEAs are the most popular ones, for example, Nondominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al., 2002) and Strength Pareto Evolutionary Algorithm 2 (SPEA2) (Zitzler et al., 2001), to name a few. These algorithms first adopt the Pareto dominance principle to select nondominated solutions which are always preferred for good convergence. Then a density-based selection criterion is employed to promote a good diversity among the solutions. Pareto-based MOEAs have been proven successful in solving a large number of MOPs.

However, for MOPs with more than three objectives, which are often referred to as Many-objective Optimization Problems (MaOPs) (He and Yen, 2016; Liu, Gong, Sun, and Jin, 2017; Liu et al., 2018b), the performance of Pareto-based MOEAs generally deteriorates appreciably. One main reason is that the Pareto dominance-based selection criterion becomes very ineffective: the proportion of nondominated solutions in the population rises considerably as the number of objectives increases, so the density-based criterion alone plays the crucial role in environmental and mating selection. However, the density-based criterion often fails in a high-dimensional space, since a solution far away from the Pareto optimal front often has a lower density value.

Intuitively, there are two approaches to improving the performance of Pareto-based MOEAs in solving MaOPs: (1) redefining the dominance relationship (Laumanns et al., 2002; Jaimes et al., 2010; Sato et al., 2011; Yang et al., 2013; He et al., 2014); (2) improving the diversity-based secondary selection criterion or replacing it with another selection criterion (Deb and Jain, 2013; Li et al., 2014; Zhang et al., 2015). These approaches modify either the Pareto dominance or density-based selection criterion, and have shown great potential in many-objective optimization.

In this study, different from the existing approaches of modifying the selection criteria in Pareto-based MOEAs, we propose a novel Meta-Objective approach (MeO for short) that modifies the many-objective optimization problem itself. By the proposed method, a many-objective optimization problem is transformed into a new optimization problem, which has the same Pareto optimal solution set as the original one, but is easier for Pareto-based MOEAs to solve. As mentioned above, the design goals of Pareto-based MOEAs are to utilize the Pareto dominance and density-based selection criteria to search for solutions with good convergence and diversity, respectively. However, these goals are often frustrated in many-objective optimization, due to the low efficiency of Pareto dominance and the misleading convergence information in density estimation. In our proposed meta-objective approach, these goals become easier to attain when solving the new optimization problem. To be specific, the new optimization problem has the same number of objectives as the original one. Each meta-objective in the new optimization problem consists of two components, which measure the convergence and diversity performances of a solution, respectively. The convergence component in each meta-objective is the same, and it represents the distance of a solution to the Pareto optimal front. The diversity components, on the other hand, differ across the meta-objectives and represent the positions of the solution according to different orientations; they are what make the meta-objectives conflict with one another. Since all the meta-objectives share the same convergence component, a solution closer to the Pareto optimal front, that is, with better convergence performance, is more likely to dominate other solutions in the meta-objective space than in the original objective space.
For the same reason, the density-based selection criterion can focus on the diversity components and will no longer be misguided by the convergence information. Therefore, Pareto-based MOEAs are better equipped to solve MaOPs by incorporating MeO.

The main contributions of this work can be summarized as follows:

First, a meta-objective method, termed MeO, is proposed, which transforms a given MaOP into a new one. Each meta-objective in the new MaOP has two components, which measure the convergence and diversity performances of a solution, respectively. Solving the new MaOP with an MOEA leads to a solution set with good overall performance. This method provides a new way of managing convergence and diversity in many-objective optimization, and it can be seamlessly integrated into any MOEA, including non-Pareto-based ones. Moreover, it is computationally cheap and does not require a set of predefined reference points or vectors.

Next, since the true Pareto optimal front is unknown, three alternatives to the convergence component are proposed, based on the Pareto rank method, the $Lp$ scalarizing method, and the double-rank method, respectively. The first two are feasible but ineffective. Therefore, the third one is recommended, with the following advantages: (1) it is theoretically consistent with the Pareto dominance criterion; (2) it provides an adequate selection pressure in the high-dimensional objective space; and (3) it is insensitive to the settings of its control parameters.

Last but most importantly, MeO provides a general extension of Pareto-based algorithms to many-objective optimization. By integrating MeO, the performance of Pareto-based MOEAs, for example, NSGA-II and SPEA2, is significantly boosted in solving MaOPs. On the one hand, the density-based selection criteria in these algorithms will no longer incorrectly estimate a solution's density value, since the convergence and diversity performances of a solution are evaluated separately in the meta-objective space. On the other hand, Pareto dominance becomes much more effective, since the meta-objective values of a solution are penalized in terms of its convergence performance.

The remainder of this article is organized as follows. In Section 2, fundamental definitions in multi-objective optimization are given for completeness, and the motivation of this work is elaborated. The proposed MeO is then described in detail in Section 3. Section 4 demonstrates the effectiveness of MeO by incorporating it into two representative MOEAs. The comparison of MeO with state-of-the-art MOEAs is given in Section 5. Section 6 concludes the article and provides pertinent observations and future research directions.

## 2  Preliminaries

### 2.1  Basic Definitions in Multi-Objective Optimization

Without loss of generality, the MOP considered in this study is formulated as follows:
$\min f(x) = \min \left(f_1(x), f_2(x), \ldots, f_M(x)\right), \quad \text{s.t.}\ x \in S \subset \mathbb{R}^n,$
(1)
where $x$ represents an $n$-dimensional decision vector in the space $S$, $f_m(x)$, $m=1,\ldots,M$, is the $m$-th objective to be minimized, and $M$ is the number of objectives. When $M>3$, this problem is known as an MaOP.

In multi-objective optimization, the following concepts have been well defined and widely applied.

Definition 1:

*Pareto dominance:* For any two different solutions to the problem in (1), $x_1, x_2 \in S$, if $\forall m=1,\ldots,M$, $f_m(x_1) \leq f_m(x_2)$, and $\exists i \in \{1,\ldots,M\}$, $f_i(x_1) < f_i(x_2)$, then $x_1$ dominates $x_2$, denoted as $x_1 \prec x_2$.
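The dominance check of Definition 1 translates directly into code. This minimal sketch assumes minimization and represents a solution simply by its objective vector:

```python
def dominates(f1, f2):
    """Return True if objective vector f1 Pareto-dominates f2
    (minimization): f1 is no worse in every objective and strictly
    better in at least one."""
    no_worse = all(a <= b for a, b in zip(f1, f2))
    strictly_better = any(a < b for a, b in zip(f1, f2))
    return no_worse and strictly_better
```

For instance, `(1, 2)` dominates `(2, 2)`, while `(1, 3)` and `(3, 1)` are mutually nondominated.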

Definition 2:

*Pareto optimal set:* For a solution to the problem in (1), $x^* \in S$, if there is no $x \in S$ satisfying $x \prec x^*$, then $x^*$ is a Pareto optimal solution. All such solutions form a set often called the Pareto optimal solution set ($PS$).

Definition 3:

*Pareto optimal front:* The image of the Pareto optimal solution set in the objective space is known as the Pareto optimal front ($PF$).

Definition 4:

*Ideal point:* The ideal point is $z^* = (z_1^*, \ldots, z_M^*)^T$, where $z_m^*$ is the infimum of $f_m(x)$, $m=1,\ldots,M$.
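Since the true infimum of each objective is unknown in practice, a common surrogate (a sketch, not specific to this article) is the componentwise minimum over the objective vectors evaluated so far:

```python
def ideal_point(F):
    """Estimate the ideal point z* from a finite set F of objective
    vectors: the componentwise minimum. During an evolutionary run,
    the running minimum of each objective serves as a surrogate for
    the true infimum."""
    M = len(F[0])  # number of objectives
    return [min(f[m] for f in F) for m in range(M)]
```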

### 2.2  Many-Objective Evolutionary Optimization

As mentioned in Section 1, many-objective optimization problems have more than three objectives. For traditional Pareto-based MOEAs, their ability to solve MaOPs is usually improved by the following two approaches. The first is redefining the dominance relationship. A large body of research along this line has been reported, for example, dominance area control (Sato et al., 2011), Pareto partial dominance (Jaimes et al., 2010), $ɛ$-dominance (Laumanns et al., 2002), and fuzzy Pareto dominance (He et al., 2014). These modified dominance relationships have been shown to improve the convergence of MOEAs in solving MaOPs. Recently, a Grid-based Evolutionary Algorithm (GrEA) (Yang et al., 2013) was developed based on the concept of $ɛ$-dominance and showed encouraging performance in many-objective optimization. The second is to improve the diversity-based secondary selection criterion or replace it with another selection criterion. In Laumanns et al. (2002), a diversity management operator is adopted to meet the requirement on diversity in selection. By shifting the positions of solutions in sparse regions, a Shift-based Density Estimation (SDE) was proposed to discard solutions with poor convergence (Li et al., 2014). NSGA-III, an extension of NSGA-II, was developed in Deb and Jain (2013); it employs a reference-point-based selection criterion to promote diversity. Recently, a Knee point-driven Evolutionary Algorithm (KnEA) (Zhang et al., 2015) was proposed, which prefers local knee solutions among nondominated ones.

In addition to enhancing Pareto-based MOEAs, decomposition-based MOEAs have been found to be a promising alternative for solving MaOPs. A well-regarded representative of decomposition-based algorithms is MOEA/D (Zhang and Li, 2007). In MOEA/D, a set of well-distributed reference vectors is first defined. Based on these reference vectors, a given MaOP is aggregated into a number of scalarizing functions. Individuals in a population are guided to search towards the Pareto optimal front in the directions specified by the reference vectors by minimizing the values of the scalarizing functions. MOEA/D has been demonstrated to be efficient in solving MaOPs. However, one known problem of MOEA/D is that uniformly distributed reference vectors do not necessarily lead to uniformly distributed solutions, particularly for problems with irregular (i.e., nonuniform) Pareto optimal fronts. Several methods that adaptively generate reference vectors have been proposed to address this issue (Qi et al., 2014; Liu, Gong, Sun, and Zhang, 2017). In addition, Ensemble Fitness Ranking with a Ranking Restriction scheme (EFR-RR) (Yuan et al., 2016) provides another means to achieve uniformly distributed solutions in a high-dimensional objective space, in which only solutions close to a reference vector in the objective space are further considered.

Indicator-Based Evolutionary Algorithms (IBEAs) (Zitzler and Künzli, 2004) are theoretically well-supported alternatives to Pareto-based MOEAs. They adopt a performance indicator that accounts for both the convergence and diversity of a solution, and solutions with the best indicator values are preferred. Among others, hypervolume is a widely used indicator in IBEAs. Unfortunately, the computational complexity of calculating hypervolume increases exponentially with the number of objectives, which makes it computationally prohibitive for MaOPs. To address this issue, HypE (Bader and Zitzler, 2011) uses Monte Carlo simulations to estimate hypervolume, where the accuracy of the estimated hypervolume can be traded off against the available computational resources. Recently, employing the R2 indicator, an improved Many-Objective Metaheuristic Based on the R2 Indicator (MOMBI-II) (Hernández Gómez and Coello Coello, 2015) was proposed to further enhance the efficiency of IBEAs in solving MaOPs.

From the above, we can see that various algorithms have been developed for solving MaOPs. On the other hand, if we can extract certain characteristics from an MaOP and transform it into a new optimization problem, it may become easier for existing algorithms to obtain the Pareto optimal solutions.

### 2.3  Meta-Objective Methods in Evolutionary Optimization

The meta-objective approach has been used in evolutionary optimization, where a meta-objective refers to a new formulation built from the original objective(s) or from characteristics extracted from the optimization problem. In the meta-objective approach, a new optimization problem is formed based on the meta-objective(s), which has the same or similar optimal solutions as the original problem. This means we can solve the original problem by solving the new one, especially when we lack effective means of solving the original one. For example, in constrained evolutionary optimization (Takahama et al., 2005; Wang and Cai, 2012), a solution's degree of constraint violation is regarded as a meta-objective. Optimizing this new objective can lead to feasible solutions to the original constrained optimization problem. As another example, in multimodal evolutionary optimization (Wang et al., 2015), the meta-objectives are constructed based on the original objective and the solutions' positions in the variable space. By solving the new multi-objective problem formed by these meta-objectives, multiple optimal solutions to the original multimodal optimization problem can be found.

Up to now, there exist very few meta-objective methods for many-objective evolutionary optimization. Very recently, a Bi-Goal Evolutionary approach (BiGE) (Li et al., 2015) was proposed for solving MaOPs, in which two meta-objectives are constructed in terms of the convergence and diversity performances of a solution. The meta-objective for convergence is the sum of all objective functions, and the meta-objective for diversity is a sharing function based on the niching technique. Experiments showed that a solution set with a well-balanced performance on the original MaOP can be achieved by optimizing the revised bi-objective optimization problem. However, the necessity of optimizing convergence and diversity as two conflicting objectives should be further investigated. In the ideal situation, only a single solution with both meta-objectives minimized would be obtained. This suggests that the original and new optimization problems do not have the same Pareto optimal set. Therefore, we may not achieve the whole Pareto optimal set of the original optimization problem by optimizing the new one.

In this study, we propose a novel meta-objective approach for many-objective optimization, where the meta-objectives are also constructed based on convergence and diversity. Different from Li et al. (2015), convergence and diversity are not regarded as conflicting objectives, but as components within every meta-objective. Each meta-objective has the same convergence component, so the Pareto optimal solutions to the original optimization problem have great advantages in the meta-objective space. In addition, the diversity components in the meta-objectives differ from one another, which leads to conflicts among the meta-objectives. In this way, the new optimization problem has a diverse Pareto optimal set, which is the same as that of the original optimization problem. We will prove this in Subsection 3.1.

### 2.4  Motivation

As mentioned in the last subsection, a meta-objective can be constructed based on characteristics extracted from the optimization problem. For a multi-objective optimization problem, the characteristic that distinguishes it from a single-objective optimization problem is that not only the convergence but also the diversity performance of solutions must be considered simultaneously by an optimizer. Therefore, constructing meta-objectives in terms of these two performances is a natural idea. In this study, we propose a meta-objective approach to convert the given MOP formulated by (1) into a new MOP. When constructing this new MOP, we must ensure that the meta-objectives conflict with each other. If they do not, the new MOP is inherently a single-objective optimization problem, and it would be difficult to achieve diverse solutions by solving it.

Recently, there has been a growing interest in balancing convergence and diversity in multi- and many-objective optimization (Adra and Fleming, 2011; Li et al., 2012, 2015; Yuan et al., 2016; Cheng et al., 2015). Some researchers argue that convergence and diversity are two requirements that conflict with each other. In particular, these requirements are formulated as a bi-objective optimization problem in Li et al. (2015). In fact, the rationale for balancing the convergence and diversity performances of solutions is that the selection criteria involve both performances. For instance, the PBI (Penalty-based Boundary Intersection) function in MOEA/D is composed of $d_1$ and $d_2$, which measure the distances of a solution to the origin (i.e., a convergence measure) and to the reference vector (i.e., a diversity measure), respectively. A solution with a smaller value of $d_1$ but a larger value of $d_2$ might then be preferred. Selecting such a solution is not helpful for maintaining diversity in a high-dimensional objective space. Therefore, MOEA/D may not perform well on some MaOPs, since it lacks a strategy for balancing $d_1$ and $d_2$. To overcome this shortcoming of MOEA/D, EFR-RR (Yuan et al., 2016) only considers solutions with smaller values of $d_2$. As mentioned in the last subsection, the meta-objective for convergence in BiGE (Li et al., 2015) is the sum of all the objectives. Since the contour curves of this meta-objective usually differ from the geometric shape of the Pareto optimal front, solely optimizing it yields solutions in specific regions, which means that it is inherently entangled with diversity information. This is why the meta-objective for convergence can be traded off against the other meta-objective for diversity. What actually conflicts is the diversity information in both meta-objectives, not convergence and diversity themselves.
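For concreteness, the $d_1$/$d_2$ decomposition used by the PBI function can be sketched as follows (the function and variable names are ours; $w$ denotes a reference vector and $z^*$ the ideal point):

```python
import math

def pbi_components(f, z_star, w):
    """Decompose f - z* along a reference vector w, as in MOEA/D's PBI
    function: d1 is the projection length onto w (a convergence
    measure), d2 the perpendicular distance to w (a diversity measure).
    PBI aggregates them as d1 + theta * d2 for a penalty parameter theta."""
    diff = [fi - zi for fi, zi in zip(f, z_star)]
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    d1 = abs(sum(di * wi for di, wi in zip(diff, w))) / norm_w
    proj = [d1 * wi / norm_w for wi in w]  # foot point on the reference vector
    d2 = math.sqrt(sum((di - pi) ** 2 for di, pi in zip(diff, proj)))
    return d1, d2
```

A solution lying exactly on the reference direction has $d_2 = 0$; solutions far from that direction can still have a small $d_1$, which is precisely the imbalance discussed above.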

From the above discussion, we notice that whether convergence and diversity conflict with each other depends on how they are defined. If solutions located exactly on the Pareto optimal front and uniformly distributed are regarded as having good convergence and diversity performances, then these two performances do not conflict with each other. For example, in Figure 1, when searching for an ideal solution set, improving the convergence or diversity performance of a solution set does not imply that the other performance must be degraded. In Figure 1a, the solutions remain uniformly distributed while approaching the true Pareto optimal front. Similarly, in Figure 1b, diversifying the archived solution set does not result in a degeneration of convergence performance. Therefore, convergence and diversity need not be balanced against each other; they can be optimized simultaneously.

Figure 1:

An illustration of searching for the ideal solution set by improving the convergence or diversity performance of a solution set. In this example, solid dots $A$–$D$ are the evenly distributed Pareto optimal solutions, and open dots $A'$–$D'$ are the solutions to be improved.


In this study, we suggest that what really conflict with each other are the positions of solutions in the objective space. In Wang et al. (2015), a multimodal optimization problem is converted into a multi-objective optimization problem based on the positions of solutions in the variable space. The tradeoff among these meta-objectives leads to multiple optimal solutions for the multimodal optimization problem. Hence, in this study, we build the meta-objectives based on the position of a solution in the objective space. Figure 2a provides an illustrative example, where $v_1=(1,0)$ and $v_2=(0,1)$ are vectors along the coordinate axes, $\theta_1$ ($\theta_2$) is the angle between $v_1$ ($v_2$) and $f(x)$, and $d(x)$ is the distance between $f(x)$ and the Pareto optimal front in the objective space. It is evident that when $\theta_1$ becomes larger, $\theta_2$ decreases, and vice versa. This means that $\theta_1$ and $\theta_2$ conflict with each other. Then, a new bi-objective optimization problem can be formulated as follows:
$\min f'(x) = \min(f_1'(x), f_2'(x)) = \min(\theta_1, \theta_2).$
(2)
Balancing these two objectives will lead to solutions well scattered between $v_1$ and $v_2$.
Figure 2:

A solution in the original objective and meta-objective spaces, respectively.


Besides, as discussed above, the convergence performance of a solution, that is, $d(x)$, should be simultaneously optimized. Integrating (2) with $d(x)$, we get:
$\min f'(x) = \min(f_1'(x), f_2'(x)) = \min(\theta_1 + \eta \cdot d(x),\ \theta_2 + \eta \cdot d(x)),$
(3)
where $\eta$ is a scaling factor. Optimizing $f_1'(x)$ means seeking solutions close not only to the $f_1$-axis but also to the Pareto optimal front, and optimizing $f_2'(x)$ has a similar meaning. Since both objectives contain $d(x)$, optimizing (3) will lead to solutions with $d(x)$ close to 0. Consequently, a solution set that is almost located on the Pareto optimal front and well distributed can be achieved.
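A sketch of (3) for the bi-objective case, assuming for illustration that the distance $d(x)$ to the Pareto optimal front is known (Subsection 3.2 develops practical surrogates):

```python
import math

def meta_objectives(f, z_star, d, eta=1.0):
    """Meta-objectives of Eq. (3): theta_m is the angle between the
    translated objective vector f - z* and the m-th coordinate axis;
    d is the (assumed known) distance to the Pareto optimal front,
    scaled by eta."""
    v = [fi - zi for fi, zi in zip(f, z_star)]
    norm = math.sqrt(sum(c * c for c in v))
    thetas = [math.acos(c / norm) for c in v]  # theta_1, theta_2, ...
    return [t + eta * d for t in thetas]
```

A point on the $f_1$-axis with $d=0$ maps to $(0, \pi/2)$: it is optimal for the first meta-objective and worst for the second, which is exactly the conflict the construction relies on.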

Two issues may arise in the above method. The first is that $d(x)$ cannot be calculated, since the true Pareto optimal front is actually unknown. To address this issue, in Subsection 3.2, we give three alternative measures for estimating the convergence performance of a solution, based on the Pareto rank, the $Lp$ scalarizing function, and the double-rank method. The first two are feasible but ineffective; thus we propose the third method, which combines the advantages of the first two while avoiding their disadvantages. The other issue is that the new optimization problem has the same number of objectives as the given one, which means that the new problem is also an MaOP. However, this issue is actually trivial. In Subsection 3.5, we explain why the new optimization problem can be easily solved by a traditional Pareto-based MOEA, even though the number of objectives can be large. Additionally, our experimental results in Sections 4 and 5 demonstrate the effectiveness of the proposed method.

## 3  The Proposed Method

### 3.1  The Meta-Objective Method (MeO) for MOPs

MeO converts the MOP formulated by (1) into the following optimization problem:
$\min f'(x) = \min \left(f_1'(x), f_2'(x), \ldots, f_M'(x)\right), \quad \text{s.t.}\ x \in S \subset \mathbb{R}^n,$
(4)
where
$f_m'(x) = f_m^D(x) + \eta \cdot f^C(x),$
(5)
$f_m^D(x)$, $m=1,\ldots,M$, is the diversity component, which measures the position of a solution, $x$, with respect to the $f_m$-axis in the objective space, and $f^C(x)$ is the convergence component, which reflects the distance between a solution and the Pareto optimal front.
In the following, we present the methods of calculating the two components. For the first component,
$f_m^D(x) = \cos^2 \theta(v_m, f(x)-z^*),$
(6)
where $v_m = (0, \ldots, 1, \ldots, 0)$ is the unit vector whose $m$-th component is 1, and $\theta(v_m, f(x)-z^*)$ is the angle between $v_m$ and $f(x)-z^*$. It is clear that $f_m^D(x) \in [0,1]$. $f_m^D(x)=0$ implies that $f(x)-z^*$ is perpendicular to the $f_m$-axis, while $f_m^D(x)=1$ indicates that $f(x)-z^*$ and the $f_m$-axis are collinear. Note that $\cos^2 \theta$ is adopted in the diversity component to place the Pareto optimal solutions on a hyperplane, as will be shown in Theorem 2.
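Because $v_m$ is a unit axis vector, $\cos \theta(v_m, f(x)-z^*)$ is simply the $m$-th coordinate of the normalized vector $f(x)-z^*$, so the diversity components of (6) reduce to squared normalized coordinates and always sum to 1. A minimal sketch:

```python
def diversity_components(f, z_star):
    """Diversity components of Eq. (6): f_m^D(x) is the squared cosine
    of the angle between f(x) - z* and the m-th coordinate axis, i.e.,
    the squared m-th coordinate of the normalized vector f(x) - z*."""
    v = [fi - zi for fi, zi in zip(f, z_star)]
    sq_norm = sum(c * c for c in v)
    return [c * c / sq_norm for c in v]
```

This sum-to-1 property is what underpins the proofs of Theorems 1 and 2.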
With respect to the second component,
$f^C(x) = d(r, f(x)),$
(7)
where $r$ is the point on the Pareto optimal front that is closest to $f(x)$, and $d(r, f(x)) \geq 0$ is the distance between $r$ and $f(x)$.

For an intuitive understanding, Figure 2b shows how a solution in Figure 2a is mapped from the original objective space to the meta-objective space for a bi-objective problem, where $d(x)$ is short for $d(r,f(x))$ and $\theta_m$, $m=1,2$, for $\theta(v_m, f(x)-z^*)$.

According to the definitions in Subsection 2.1, we have the following two theorems.

Theorem 1:

Any Pareto optimal solution to the original optimization problem defined in (1) is a Pareto optimal solution to the new optimization problem defined in (4).

Proof of Theorem 1:

Since the convergence component, $f^C(x^*)$, of a Pareto optimal solution to the original optimization problem must be zero, the theorem will be proven if we can show that when $f^C(x^*)=0$, $x^*$ is definitely a Pareto optimal solution to the new optimization problem. If this assertion did not hold, there would exist another solution, $x'$, that dominates $x^*$. According to Definition 1, $\sum_{m=1}^{M} f_m'(x') < \sum_{m=1}^{M} f_m'(x^*)$, i.e., $\eta M f^C(x') + \sum_{m=1}^{M} f_m^D(x') < \eta M f^C(x^*) + \sum_{m=1}^{M} f_m^D(x^*)$. Since $\sum_{m=1}^{M} f_m^D(x) = \sum_{m=1}^{M} \cos^2 \theta(v_m, f(x)-z^*) = \sum_{m=1}^{M} \frac{|f_m(x)-z_m^*|^2}{|f(x)-z^*|^2} = 1$ for any solution, it follows that $f^C(x') < f^C(x^*) = 0$. This contradicts the fact that $f^C(x') \geq 0$. Consequently, we can infer that no other solution dominates $x^*$. This means that $x^*$ is a Pareto optimal solution to the new optimization problem. We have thus proved the theorem.$\square$

Theorem 2:

The Pareto optimal front of the new optimization problem is located on the hyperplane $\sum_{m=1}^{M} f_m' = 1$.

Proof of Theorem 2:

According to Theorem 1, for any Pareto optimal solution, $x^*$, $f^C(x^*)=0$, and we get $\sum_{m=1}^{M} f_m^D(x^*) = 1$. This means that $x^*$ lies on the hyperplane $\sum_{m=1}^{M} f_m' = 1$ in the meta-objective space.$\square$

Theorem 1 implies that all Pareto optimal solutions to the original optimization problem can be obtained by successfully solving the new optimization problem. However, there exists a critical issue when using this method. Since the true Pareto optimal front is unavailable a priori, it is impractical to obtain $d(r,f(x))$ and $z^*$. In multi-objective evolutionary optimization, $z_m^*$ can be well replaced with the minimum of $f_m(x)$ observed during the evolution. Therefore, how to estimate $d(r,f(x))$ as the convergence component is the major issue. In view of this, we present three alternatives to the convergence component, described in detail in the following subsection.

### 3.2  Alternatives to the Convergence Component

In this subsection, we give three alternatives to compute the convergence component, which are based on the Pareto rank method (Deb et al., 2002), the $Lp$ scalarizing method (Wang, Zhang et al., 2016), and the double-rank method, respectively.

#### 3.2.1  The Pareto Rank Method

It is often believed that the Pareto rank is ineffective in distinguishing superior solutions when the number of objectives is large. However, when combined with a good diversity-based selection criterion, it is still capable of selecting solutions with good convergence. One of the most successful examples is NSGA-III (Deb and Jain, 2013). Therefore, in this study, the Pareto rank is regarded as one of the alternatives to the convergence component, represented as follows:
$\hat{f}^C(x) = rank(x) - 1,$
(8)
where $\hat{f}^C$ represents the alternative to $f^C$, and $rank(x)$ is the Pareto rank of $x$ in the current solution set, whose value is in the range $[1,N]$ ($N$ is the size of the solution set). Please refer to Deb et al. (2002) for the details of calculating $rank(x)$. $\eta$ defined in (5) will be set to 1 when using this method. Under this circumstance, a solution dominating others in the original optimization problem will be definitely superior to those solutions in the new optimization problem, since each diversity component is in the range of $[0,1]$. Solutions in the first Pareto rank have zero values of $\hat{f}^C$ and are assumed to be the Pareto optimal solutions. However, this method has a low efficiency, which results in a slow convergence rate for some optimization problems.
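The ranks in (8) can be obtained by iteratively peeling off nondominated fronts. The following is a simple $O(MN^2)$ sketch; NSGA-II's fast nondominated sorting computes the same ranks with better bookkeeping:

```python
def pareto_ranks(F):
    """Pareto ranks (1 = nondominated) of a list of objective vectors,
    so that the convergence component of Eq. (8) is rank(x) - 1.
    Fronts are peeled off one at a time (minimization assumed)."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    ranks = [0] * len(F)
    remaining = set(range(len(F)))
    r = 1
    while remaining:
        # indices not dominated by any other remaining solution
        front = {i for i in remaining
                 if not any(dominates(F[j], F[i]) for j in remaining if j != i)}
        for i in front:
            ranks[i] = r
        remaining -= front
        r += 1
    return ranks
```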

#### 3.2.2  The $Lp$ Scalarizing Method

The $Lp$ scalarizing method is commonly used in the decomposition- and preference-based MOEAs (Wang, Zhang et al., 2016; Zhang and Li, 2007; Goulart and Campelo, 2016). In this study, however, no weight vectors are predefined for the objectives, since all the objectives are considered equally important. The alternative to the convergence component based on this method is formulated as follows:
$\hat{f}^C(x) = \frac{f^S(x) - f^S_{min}}{f^S_{max} - f^S_{min}},$
(9)
where
$f^S(x) = \left( \sum_{m=1}^{M} \left(f_m(x) - z_m^*\right)^p \right)^{\frac{1}{p}},$
(10)
$f^S_{min}$ and $f^S_{max}$ represent the minimum and maximum values of $f^S(x)$ during the evolution, respectively. The contour curves of the $Lp$ scalarizing function vary according to the value of $p$. Figure 3 shows three examples. When $p=1$, $f^S(x)$ is the sum of all the objective values, and when $p=2$ (or $p=\infty$), $f^S(x)$ is the Euclidean (or Chebyshev) distance between the solution, $x$, and the ideal point. Intuitively, this convergence component will work well for a problem whose Pareto optimal front fits its contour curves. The solution with the smallest $Lp$ scalarizing function value is termed the optimal solution.
Figure 3:

The contour curves of the $Lp$ scalarizing method with different values of $p$.

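A sketch of the scalarizing value in (10), with all objectives weighted equally and the Chebyshev case handled as the explicit limit $p \to \infty$:

```python
def lp_scalarize(f, z_star, p=2.0):
    """Lp scalarizing value of Eq. (10). p = 1 gives the sum of the
    objective gaps to the ideal point z*, p = 2 the Euclidean distance,
    and p = inf the Chebyshev distance."""
    diffs = [fi - zi for fi, zi in zip(f, z_star)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)
```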

The main advantage of the $Lp$ scalarizing method is that it provides a larger selection pressure than the Pareto rank method. However, this method has the following three deficiencies. The first is that the conversion of objectives is not strictly consistent with the Pareto dominance criterion. In other words, a solution dominated in the original optimization problem may become nondominated in the new optimization problem and be preferentially selected. The second is that this method is very sensitive to the setting of the parameter $\eta$ defined in (5). Obtaining a proper setting of $\eta$ is non-trivial, if not impossible, especially when the problem has a complex Pareto optimal front. Last but not least, although Wang, Zhang et al. (2016) presented a strategy for adaptively selecting an appropriate value of $p$ for MOEA/D, a single scalarizing function cannot properly measure the solutions on the whole Pareto front in this study. The bias with respect to its contour curves may result in solutions located only in specific regions. An illustrative example is provided in Figure 4a. Solutions $A$–$J$ are candidates for a bi-objective optimization problem with a concave Pareto optimal front. Since the geometric shape of the Pareto optimal front is actually unknown, the $Lp$ scalarizing method with $p=1$ is simply chosen. Then solutions $A$, $B$, $I$, and $J$ will be preferred, since they have smaller $Lp$ scalarizing function values than the rest. Consequently, solutions in the final solution set may be overly crowded in edge regions. The double-rank method, presented in the next subsection, overcomes this issue (as shown in Figure 4b).

Figure 4:

The evaluation of the convergence performances of solutions, where $p=1$.


#### 3.2.3  The Double-Rank Method

Taking the advantages of the above two designs and addressing their disadvantages, we propose the third alternative to the convergence component based on the double-rank method, which is formulated as follows:
$f^C(x)=2·(rank(x)-1)+\frac{1}{n_{max}}·(rank_l(x)-1),$
(11)
where $rank_l(x)$ is the local rank of $x$ in its neighborhood in terms of the $Lp$ scalarizing function value. Solutions whose angles to $x$ are smaller than a threshold value, $θ_l$, are termed the neighbors of $x$. $θ_l$ is set to $C·π·N^{-\frac{1}{M-1}}$, where $N$ is the population size, $M$ is the objective number, and $C$ is a constant parameter. Increasing $N$ and reducing $M$ both decrease the neighborhood size. Based on our preliminary investigations, $C=0.22$ works well for most problems. For a problem whose Pareto front does not cover the whole objective space, the value of $C$ can be set a bit smaller to achieve better performance. The number of neighbors of $x$ is denoted as $n_l(x)$, and $n_{max}$ is the maximum of $n_l(x)$ over the current population.

When employing this method, $η$ can simply be set to 1. Only nondominated solutions with the smallest $Lp$ scalarizing function value among their neighbors have a zero value of $f^C$ and are termed the optimal solutions. This method has the following four advantages: (1) the process of converting objectives is strictly consistent with the Pareto dominance criterion due to $rank(x)$; (2) it offers a large selection pressure in the high-dimensional objective space due to $rank_l(x)$; (3) it is insensitive to the settings of parameters; although $θ_l$ is an adjustable parameter, it does not need to be tuned in most situations; and (4) the bias with respect to the contour curves of the $Lp$ scalarizing function can be neglected. Please see Figure 4b for an example. Compared to the method in Figure 4a, solutions in the non-edge regions will also be preferred, since some solutions (e.g., $D$, $F$, and $H$) have the smallest values of the $Lp$ scalarizing function in their neighborhoods. Additionally, it can be seen from Figure 4b that if $p$ is set to $∞$, $H$ will no longer be regarded as the best solution, since its neighbor, $G$, has a smaller value of the $Lp$ scalarizing function. However, if the population is infinite and the neighborhood size is infinitesimal, this discrepancy resulting from different settings of $p$ becomes negligibly small.
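The double-rank computation of (11) can be sketched as follows (a minimal Python sketch assuming minimization, normalized objectives, and the ideal point at the origin; all function names are ours):

```python
import math

def pareto_rank(objs):
    """1-based nondominated-sorting rank of each solution (minimization)."""
    n = len(objs)
    dominates = lambda a, b: all(x <= y for x, y in zip(a, b)) and a != b
    ranks, remaining, r = [0] * n, set(range(n)), 1
    while remaining:
        front = {i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)}
        for i in front:
            ranks[i] = r
        remaining -= front
        r += 1
    return ranks

def angle(a, b):
    """Angle between two (nonzero) objective vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def double_rank(objs, p=2, C=0.22):
    """Convergence component f^C of eq. (11) via the double-rank method."""
    N, M = len(objs), len(objs[0])
    theta_l = C * math.pi * N ** (-1.0 / (M - 1))            # neighborhood angle
    lp = [sum(f ** p for f in x) ** (1.0 / p) for x in objs]  # L_p to ideal point 0
    ranks = pareto_rank(objs)
    neigh = [[j for j in range(N) if angle(objs[i], objs[j]) < theta_l]
             for i in range(N)]                               # each x neighbors itself
    n_max = max(len(nb) for nb in neigh)
    # Local rank: 1-based position of x's L_p value among its neighbors.
    rank_l = [1 + sum(1 for j in neigh[i] if lp[j] < lp[i]) for i in range(N)]
    return [2 * (ranks[i] - 1) + (rank_l[i] - 1) / n_max for i in range(N)]
```

Nondominated solutions that are locally best in their neighborhood receive $f^C=0$, as stated above; dominated or locally worse solutions are pushed to larger values.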

The proposed double-rank method inherently utilizes the idea of niching (Liu et al., 2018a; Liu, Yen et al., 2018) and shares some common ideas with constrained decomposition-based methods, e.g., the localized weighted sum (Wang et al., 2018) and localized PBI (Wang, Ishibuchi et al., 2016) methods, in that they all evaluate solutions according to their scalarizing function values in their local areas. The main difference is that the double-rank method does not require any reference vector. Another difference is that the double-rank method allows solutions in the same rank to compete on equal terms, which further promotes diversity in the solution set.

### 3.3  Incorporating MeO into MOEAs

MeO can be integrated within any existing MOEA. Since generational MOEAs are the most common, we give a flow chart of incorporating MeO into them in Figure 5. We denote an MOEA combined with MeO as MeO+MOEA in this study, for example, MeO+NSGA-II. Specifically, the versions with the Pareto rank method, the $Lp$ scalarizing method, and the double-rank method are denoted as MeO$I$+MOEA, MeO$II$+MOEA, and MeO$III$+MOEA, respectively. The only difference between MeO+MOEA and the original MOEA is that individuals are evaluated in terms of their meta-objectives, as shown in the shaded parts of the flow chart. Note that in this study, when employing a normalization procedure to tackle a scaled problem, the original objective values are normalized according to the maximum and minimum original objective values in the current population before calculating the meta-objective values.

Figure 5:

The flow chart of MeO+MOEA.


In addition, the computational complexity analysis of MeO+MOEA is given in Section I in the Supplementary Material available at https://www.mitpressjournals.org/doi/suppl/10.1162/evco_a_00243.

### 3.4  Why Does MeO Work for MaOPs?

After the above description of MeO, one may wonder why MeO+MOEA can be effective for MaOPs when the MOEA is a traditional Pareto-based one and the new problem still has a large number of objectives. We explain the reasons under the following two situations: (1) selecting solutions with the same value of the convergence component; and (2) selecting solutions with different values of the convergence component.

#### 3.4.1  Situation 1

This usually happens when the Pareto rank method is employed as an alternative to the convergence component. In the original Pareto-based MOEAs, when solutions are in the same Pareto rank, the diversity-based selection criteria, for example, the crowding distance (Deb et al., 2002) or the $k$-nearest neighbor (Zitzler et al., 2001), may prefer solutions far from the Pareto optimal front since they are located in sparse regions. In other words, the poor convergence performance of a solution is mistaken for good diversity performance. This is one of the reasons why the Pareto-based MOEAs fail in solving MaOPs. In NSGA-III, this issue is solved by a new diversity-based selection criterion: the solutions are mapped onto a hyperplane and evaluated by their distances to predefined reference points. In the method proposed in this study, if different solutions have the same value of the convergence component, they are likewise located on a hyperplane, but in the meta-objective space. In this situation, only their diversity performances will be traded off. Therefore, the aforementioned diversity-based selection criteria (Deb et al., 2002; Zitzler et al., 2001) can also be effective. Figure 6 provides an illustrative example, where solutions $A$ to $E$ are nondominated. Solution $A$ will be estimated to have a low density value in the original objective space. In contrast, when employing the Pareto rank method as an alternative to the convergence component, all these solutions will be mapped onto a hyperplane in the meta-objective space, and solution $A$ will be estimated to have a high density value. Compared to NSGA-III, the main advantage of the proposed method is that any density-based selection criterion can be adopted and no reference points need to be predefined.
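To illustrate the hyperplane mapping, the sketch below assumes a simple diversity component, $f_i(x)/\sum_j f_j(x)$, so that solutions sharing a convergence value land on the hyperplane where the meta-objectives sum to a constant; the actual diversity component of MeO is defined earlier in the article and may differ from this assumption:

```python
def meta_objectives(objs, fc, eta=1.0):
    """Map each solution to a meta-objective space (a sketch: the diversity
    component is assumed to be f_i / sum_j f_j, so the diversity parts sum
    to one for every solution)."""
    meta = []
    for x, c in zip(objs, fc):
        s = sum(x)
        meta.append([eta * c + fi / s for fi in x])
    return meta

# Nondominated solutions (same convergence component f^C = 0) all land on
# the hyperplane sum(meta) = 1, so only diversity is traded off among them.
objs = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]
meta = meta_objectives(objs, fc=[0.0, 0.0, 0.0])
assert all(abs(sum(m) - 1.0) < 1e-9 for m in meta)
```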

Figure 6:

Density estimation of solutions.


#### 3.4.2  Situation 2

This often happens when the $Lp$ scalarizing method or the double-rank method is employed as an alternative to the convergence component. In this situation, solutions with large values of the convergence component will be penalized in the meta-objective space, whilst those with small values of the convergence component will have more chances to become the dominators. This effect is somewhat similar to that of $ε$-dominance, which has been proven capable of solving MaOPs (Hadka and Reed, 2012; Yang et al., 2013). When comparing two solutions which do not dominate each other in the objective space, if the difference between the convergence components of the two solutions is larger than a positive constant, i.e., $ε$, the one with the smaller value will dominate the other in the meta-objective space. Figure 7 provides an illustrative example. In Figure 7a, solutions $A$ and $B$ do not dominate each other in the original objective space, and $f^C(A)$ and $f^C(B)$ are their distances to the Pareto optimal front, respectively. If we only consider the diversity components of solutions $A$ and $B$, we obtain points $A'$ and $B'$ in Figure 7b. Assuming that $η=1$ in this case, after adding the convergence components, i.e., $f^C(A)$ and $f^C(B)$, they are mapped to $A$ and $B$ in the meta-objective space. In this mapping process, we notice that if the difference between $f^C(A)$ and $f^C(B)$ is larger than the vertical or horizontal distance (these two distances are the same) between $A'$ and $B'$, $B$ will be located in the domination area of $A$ in the meta-objective space. This means $ε$ is actually the vertical or horizontal distance between $A'$ and $B'$. From Figure 7, we can draw the following observation: for any two solutions $A$ and $B$, if $f^C(A)<f^C(B)$ and $η·|f^C(A)-f^C(B)|>ε$, solution $A$ will dominate $B$ in the meta-objective space. Consequently, the Pareto dominance becomes much more effective for MaOPs after incorporating MeO.
It is worth noting that $A'$ and $B'$ are in the range of $[0,1]$, and hence $ε$ is in the range of $(0,1]$. This is the reason why $η$ can be set to 1 in the Pareto rank method and the double-rank method. Under this circumstance, if solution $A$ dominates solution $B$ in the original objective space, $η·|f^C(A)-f^C(B)|$ will definitely be larger than $ε$. As a result, solution $A$ will necessarily dominate $B$ in the meta-objective space.
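The $ε$-dominance-like effect can be checked with a toy example (a sketch under the assumption that each meta-objective is the diversity component plus $η·f^C$; the numeric values are hypothetical):

```python
def meta(d, fc, eta=1.0):
    # Meta-objective: diversity component plus weighted convergence component.
    return [di + eta * fc for di in d]

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# A' and B' are mutually nondominated in the diversity (mapped) space ...
dA, dB = [0.40, 0.60], [0.50, 0.50]
assert not dominates(dA, dB) and not dominates(dB, dA)

# ... but A's convergence advantage eta*|fC(A)-fC(B)| = 0.25 exceeds the
# coordinate gap eps = 0.10 between A' and B', so A dominates B in the
# meta-objective space.
fcA, fcB = 0.05, 0.30
assert dominates(meta(dA, fcA), meta(dB, fcB))
```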

Figure 7:

Pareto dominance relationship of solutions.


In addition, SDE (Li et al., 2014) is another algorithm that penalizes solutions with poor convergence performances, where the penalization is considered in the density-based selection criterion. In SDE, solutions with poor convergence performances are shifted (mapped) to new locations close to other solutions, and thus have larger density values. SDE has been integrated with NSGA-II and SPEA2, respectively. However, it only works well with the latter, which is computationally very expensive. On the other hand, both NSGA-II and SPEA2 perform very well after being integrated with MeO.

## 4  Investigations on the Effect of MeO on Two Pareto-Based MOEAs

In this section, we validate the performance of MeO by integrating it into two Pareto-based MOEAs, that is, NSGA-II and SPEA2, which results in two new algorithms, denoted as MeO+NSGA-II and MeO+SPEA2, respectively. There are two reasons for choosing these two MOEAs. The first is that both are classic MOEAs and have been successfully applied to a wide range of optimization problems. The second is that both are generally believed to be incapable of solving MaOPs. The experiments in this section are divided into two parts. First, we compare MeO$I$+NSGA-II and MeO$I$+SPEA2 with their corresponding original versions on high-dimensional DTLZ benchmark problems. Following that, we further compare three different alternatives to the convergence component in the design of MeO+NSGA-II.

### 4.1  Test Problems, Performance Indicators, and Parameter Settings

We choose two widely used test suites, that is, DTLZ (Deb et al., 2005) and WFG (Huband et al., 2006), for empirical comparisons in this study. We consider these scalable problems with 4, 6, 8, and 10 objectives in this article. The number of variables is set to $M+4$ for DTLZ1, $M+19$ for DTLZ7, and $M+9$ for the other DTLZ problems. The distance- and position-related parameters are set to 24 and $M-1$, respectively, for the WFG problems. Please refer to Deb et al. (2005) and Huband et al. (2006) for detailed descriptions of the DTLZ and WFG suites.

In order to compare different algorithms on these test problems, we adopt the Inverted Generational Distance Plus (IGD$+$) (Ishibuchi et al., 2015) as a performance metric, which gives a comprehensive quantification of both the convergence and diversity of a solution set. It is weakly Pareto compliant and thus more accurate in evaluation than its original version, IGD. The smaller the IGD$+$ value of an algorithm, the better its performance. IGD$+$ requires a reference set, whose points are typically uniformly distributed on the true Pareto optimal front of a test problem. In this study, over 5,000 reference points are evenly sampled from the true Pareto optimal front of each test problem. Note that solutions and the reference points are normalized based on the true Pareto optimal front when calculating IGD$+$.
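IGD$+$ can be computed as follows (a sketch following the modified distance $d^+$ of Ishibuchi et al. (2015), omitting the normalization step mentioned above):

```python
import math

def igd_plus(solutions, reference):
    """IGD+ for minimization: average over reference points z of the minimum
    modified distance d+(z, a) = sqrt(sum_i max(a_i - z_i, 0)^2), which only
    counts the components where solution a is worse than z."""
    def d_plus(z, a):
        return math.sqrt(sum(max(ai - zi, 0.0) ** 2 for ai, zi in zip(a, z)))
    return sum(min(d_plus(z, a) for a in solutions)
               for z in reference) / len(reference)

# A set sitting exactly on the reference front scores 0; a set shifted away
# from the front scores > 0 (smaller is better).
ref = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
assert igd_plus(ref, ref) == 0.0
shifted = [[a + 0.1, b + 0.1] for a, b in ref]
assert igd_plus(shifted, ref) > 0.0
```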

The following parameter settings are adopted by all the compared algorithms. Simulated binary crossover and polynomial mutation are applied as the crossover and mutation operators, respectively, where both distribution indexes are set to 20. The crossover and mutation probabilities are 1.0 and $1/n$, respectively, where $n$ is the number of decision variables. The termination criterion is that the population has evolved for the predefined maximum number of generations. For DTLZ1, DTLZ3, DTLZ6, and WFG1, the maximum number of generations is set to 1,000, while 300 is used for the remaining test functions. These termination criteria are the same as those in recent studies (Li et al., 2015; Yang et al., 2013). The population size, $N$, is set to 120, 132, 156, and 275, when $M$ is 4, 6, 8, and 10, respectively. All these parameter settings are the same as those in Section 5 for a fair comparison. All the experimental results in this article are obtained by executing 50 independent runs of each algorithm on each optimization problem. The Wilcoxon's rank sum test is employed to determine whether one algorithm shows a statistically significant difference with the other, and the null hypothesis is rejected at a significance level of 0.05.
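The two variation operators above can be sketched as follows (a simplified, unbounded SBX variant and polynomial mutation, both with distribution index 20; the boundary handling used in full implementations is reduced to clamping here):

```python
import random

def sbx(p1, p2, eta=20.0, pc=1.0):
    """Simulated binary crossover (unbounded sketch), distribution index eta."""
    c1, c2 = list(p1), list(p2)
    if random.random() <= pc:
        for i in range(len(p1)):
            u = random.random()
            beta = (2 * u) ** (1 / (eta + 1)) if u <= 0.5 else \
                   (1 / (2 * (1 - u))) ** (1 / (eta + 1))
            # Children are symmetric around the parents' midpoint.
            c1[i] = 0.5 * ((1 + beta) * p1[i] + (1 - beta) * p2[i])
            c2[i] = 0.5 * ((1 - beta) * p1[i] + (1 + beta) * p2[i])
    return c1, c2

def poly_mutation(x, low, up, eta=20.0, pm=None):
    """Polynomial mutation; per-variable probability defaults to 1/n."""
    n = len(x)
    pm = pm if pm is not None else 1.0 / n
    y = list(x)
    for i in range(n):
        if random.random() <= pm:
            u = random.random()
            delta = (2 * u) ** (1 / (eta + 1)) - 1 if u < 0.5 else \
                    1 - (2 * (1 - u)) ** (1 / (eta + 1))
            y[i] = min(max(y[i] + delta * (up[i] - low[i]), low[i]), up[i])
    return y
```

A useful property for checking SBX is that each pair of children preserves the parents' per-variable sum.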

### 4.2  Comparisons among NSGA-II, SPEA2, MeO$I$+NSGA-II, and MeO$I$+SPEA2

In this subsection, we apply NSGA-II, SPEA2, and their new versions (i.e., MeO$I$+NSGA-II and MeO$I$+SPEA2 with the Pareto rank method as the convergence measure) to solve the DTLZ problems. Table 1 shows the mean values of IGD$+$ obtained by all competing algorithms on the DTLZ problems. The results in boldface indicate that MeO$I$+NSGA-II (MeO$I$+SPEA2) performs significantly better than its original version. In Table 1, the performance score (Bader and Zitzler, 2011) of each algorithm on each test problem is given in gray scale after the mean IGD$+$ value. A darker tone corresponds to a larger performance score. For a test problem, the performance score of an algorithm is the number of the competing algorithms which perform significantly worse than it according to IGD$+$. Moreover, we give the average performance score (APS) of each algorithm on all the test problems at the bottom of Table 1. Note that in this article, the notation DTLZ$i$-$M$ refers to DTLZ$i$ (i.e., $i∈\{1,2,\ldots,7\}$) with $M$ objectives (i.e., $M∈\{4,6,8,10\}$), and WFG$i$-$M$ is defined similarly.

Table 1:
Comparisons among NSGA-II, SPEA2, MeO+NSGA-II and MeO+SPEA2 based on the mean value of IGD$+$.

As can be seen from Table 1, the performances of both NSGA-II and SPEA2 are significantly boosted on most problems when integrating with MeO. These results show the effectiveness of MeO when solving MaOPs. However, MeO$I$+SPEA2 does not show significantly better performance than SPEA2 on some 4- and 6-objective test problems, for example, DTLZ1-4 and -6, DTLZ5-4, DTLZ6-4, and DTLZ7-4 and -6. In addition, neither MeO$I$+NSGA-II nor MeO$I$+SPEA2 performs better than its original version on some DTLZ5 and DTLZ6 test problems. The most likely reason is that the Pareto rank method does not provide a large enough selection pressure. The Pareto fronts of DTLZ5 and DTLZ6 only have two dimensions. Therefore, when solving DTLZ5 and DTLZ6 with any number of objectives, and some other test problems with a relatively smaller number of objectives (e.g., 4 or 6), MeO$I$+NSGA-II and MeO$I$+SPEA2 do not show a clear advantage over their original versions.

Since MeO$I$+SPEA2 not only performs worse than MeO$I$+NSGA-II according to APS but also is extraordinarily time-consuming (see Table 1 in the Supplementary Material), we further investigate the performance of MeO+NSGA-II with different convergence components in the next subsection.

### 4.3  Comparisons among MeO$I$+NSGA-II, MeO$II$+NSGA-II, MeO$III$+NSGA-II, and MeO$*$+NSGA-II

In this subsection, we investigate the performance of MeO+NSGA-II with three different alternatives to the convergence component. We employ two settings of $p$ (1 and 2) in MeO$II$+NSGA-II and MeO$III$+NSGA-II to observe their different behaviors. We also include the results obtained by the method that calculates the convergence component precisely (denoted as MeO$*$+NSGA-II), since the true Pareto fronts of DTLZ problems are known. Note that $η$ is set to 1 in all the designs.

Table 1 lists the mean values of IGD$+$ obtained by the different algorithms. According to Table 1, we can make the following observations: (1) MeO$III$+NSGA-II achieves satisfactory results on most problems. There is hardly any significant difference between the results obtained by MeO$III$+NSGA-II with $p=1$ and $p=2$, which implies that the performance of MeO$III$ is not sensitive to the parameter $p$. (2) MeO$II$+NSGA-II with $p=1$ performs significantly better on DTLZ1 than that with $p=2$. This is attributed to the fact that the geometrical shape of the Pareto optimal front of DTLZ1 perfectly fits the contour curves of the $Lp$ scalarizing function with $p=1$. Generally, MeO$II$+NSGA-II with $p=2$ shows better performance than that with $p=1$ according to APS. (3) MeO$II$+NSGA-II does not work as well as MeO$III$+NSGA-II on most problems. The reason is that $η$ is not tuned for each problem. However, MeO$II$+NSGA-II with $p=2$ performs very well on DTLZ4 and DTLZ5. (4) MeO$*$+NSGA-II achieves the best APS over all the problems. However, it is not as good as MeO$III$+NSGA-II on some test problems. One reason is that it behaves like MeO$II$+NSGA-II to some degree and thus is relatively sensitive to the setting of parameter $η$. Another reason is that the population can be far away from the true Pareto front at the beginning of the evolution. Precisely calculating the convergence component based on the true Pareto front may cause the population to converge into a small region, which degrades the diversity performance.

In addition, we show the Pareto fronts achieved by NSGA-II and MeO$III$+NSGA-II on some 3-objective DTLZ test problems in Figure 1 in the Supplementary Material for a visual understanding of MeO$III$+NSGA-II's performance.

## 5  Comparisons with the State-of-the-Art Algorithms

In this section, we compare MeO$III$+NSGA-II with $p=2$ with seven state-of-the-art MaOEAs, that is, BiGE (Li et al., 2015), SDE (Li et al., 2014), NSGA-III (Deb and Jain, 2013), EFR-RR (Yuan et al., 2016), MOMBI-II (Hernández Gómez and Coello Coello, 2015), GrEA (Yang et al., 2013), and KnEA (Zhang et al., 2015) on the DTLZ and WFG suites with 4, 6, 8, and 10 objectives.

### 5.1  Competing Algorithms

In this subsection, we briefly introduce the seven chosen algorithms. These algorithms cover all the main categories mentioned in Section 1 for many-objective optimization. All the competing algorithms as well as MeO are implemented by PlatEMO (Tian et al., 2017). The source code of MeO is available at https://github.com/yiping0liu.

(1) BiGE is a well-known meta-objective method designed for many-objective optimization. It simultaneously considers the convergence and diversity performances of a solution. Different from the MeO proposed in this article, BiGE transforms an MaOP into a bi-objective optimization problem; a solution set with good performance can be achieved by optimizing this bi-objective problem.

(2) SDE is a recently proposed method that modifies the density estimation strategy in traditional Pareto-based MOEAs. By shifting the positions of individuals in sparse regions, individuals with poor convergence are penalized to have a high density value, and then discarded during the evolution. In this method, the version that integrates SDE into SPEA2 (denoted as SPEA2+SDE) is employed for comparisons due to its better performance compared with other variants.
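The shifting idea of SDE can be sketched as follows (a simplified nearest-neighbor version of the shift-based density estimation; SPEA2+SDE itself uses SPEA2's $k$-th-nearest-neighbor estimator):

```python
import math

def sde_distance(p, q):
    """Shift-based distance (minimization): objectives where q outperforms p
    are shifted up to p's value before measuring the distance, so q's
    advantages are ignored when estimating p's density."""
    shifted = [max(qi, pi) for qi, pi in zip(q, p)]
    return math.sqrt(sum((si - pi) ** 2 for si, pi in zip(shifted, p)))

def sde_density(pop):
    """Distance to the nearest shifted neighbor; a small value means the
    solution is treated as crowded and hence penalized."""
    return [min(sde_distance(p, q) for j, q in enumerate(pop) if j != i)
            for i, p in enumerate(pop)]

# A dominated solution gets shift-distance 0 to a dominator: even though it
# sits in a sparse region, it is maximally penalized rather than preserved.
pop = [[0.0, 1.0], [1.0, 0.0], [3.0, 3.0]]
assert sde_density(pop)[2] == 0.0
```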

(3) NSGA-III is developed from NSGA-II, a popular Pareto dominance-based MOEA, to handle MaOPs. NSGA-III uses a reference-point-based selection criterion instead of the density-based criterion (i.e., the crowding distance) used in NSGA-II. When employing the Pareto rank method in MeO, both NSGA-III and MeO+MOEA map nondominated solutions onto a hyperplane, whereas no reference points need to be defined in the latter.

(4) EFR-RR is based on the framework of NSGA-II. However, it is inherently a decomposition-based method like MOEA/D. EFR-RR also concentrates on balancing convergence and diversity for MaOPs. It first considers the diversity performance of a solution; only when the perpendicular distance from the solution to the reference vector is small will its scalarizing function value be calculated. This idea has also been applied to MOEA/D, termed MOEA/D-DU. Since EFR-RR performs slightly better than MOEA/D-DU according to the original study, we choose the former for comparisons.

(5) MOMBI-II is an indicator-based algorithm that uses the R2 indicator to guide the search. The R2 indicator is an attractive alternative to hypervolume, due to its low computational cost and weak-Pareto compatibility. MOMBI-II takes two key aspects into account, that is, using the scalarizing function and statistical information about the population's convergence to select individuals.

(6) GrEA exploits the potential of the grid-based approach to deal with MaOPs. The grid dominance and the grid difference are introduced to determine the relationship between individuals in a grid environment. Three grid-based criteria are incorporated into calculating the fitness of an individual to distinguish individuals in both mating and environmental selection processes. Moreover, a fitness adjustment strategy is developed to avoid partial overcrowding as well as to guide the search towards various directions in the archive.

(7) KnEA is established under the framework of NSGA-II, and it gives a priority to the knee points among nondominated solutions. To identify local knee points, a niche method based on hyperboxes is employed, in which the niche size can be adaptively tuned in terms of information from the population at the previous generations.

### 5.2  Extra Parameter Settings

In this subsection, we give the settings of extra parameters used in the above algorithms. The common parameter settings are the same as those in Subsection 4.1.

To avoid the situation in which all reference points are located on the boundary of the Pareto optimal front for problems with a large number of objectives, the strategy of two-layered reference points (vectors) is used for NSGA-III, EFR-RR, and MOMBI-II. As a result, the population size of NSGA-III and EFR-RR cannot be arbitrarily specified. We set the population size (i.e., $N$) of all algorithms to be the same value for a fair comparison. The settings of parameters for controlling the number of reference points are listed in Table 2.
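The reference points can be generated with the simplex-lattice (Das-Dennis) construction sketched below; the layer parameters used in the example ($p1=3$ and $p2=2$ for $M=10$) are an assumption chosen so that the point count reproduces the population size $N=275$ used in the experiments:

```python
from itertools import combinations

def simplex_lattice(M, p):
    """Das-Dennis points: all M-vectors of nonnegative multiples of 1/p
    summing to one, enumerated via stars and bars."""
    if p == 0:
        return []
    pts = []
    for bars in combinations(range(p + M - 1), M - 1):
        prev, coords = -1, []
        for b in bars:
            coords.append((b - prev - 1) / p)   # gap size = star count
            prev = b
        coords.append((p + M - 2 - prev) / p)
        pts.append(coords)
    return pts

def two_layer_points(M, p1, p2):
    """Boundary layer plus an inner layer shrunk halfway toward the
    centre point (1/M, ..., 1/M)."""
    outer = simplex_lattice(M, p1)
    inner = [[0.5 * c + 0.5 / M for c in pt] for pt in simplex_lattice(M, p2)]
    return outer + inner

# Assumed layer settings for M = 10: C(12,3) + C(11,2) = 220 + 55 = 275 = N.
assert len(two_layer_points(10, 3, 2)) == 275
```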

Table 2:
The settings of parameters, $p1$ and $p2$, for controlling the number of reference points (population size) along the boundary of the Pareto optimal front and inside it, respectively.

| $M$ | $p1$ | $p2$ | $N$ |
| --- | --- | --- | --- |
| 4 |  |  | 120 |
| 6 |  |  | 132 |
| 8 |  |  | 156 |
| 10 |  |  | 275 |

As suggested in the original study of EFR-RR, the Chebyshev approach is chosen as the scalarizing method, the neighborhood size is set to $0.1N$, and $K$ is set to 2. Since the number of objectives and the population size are different from those of the original KnEA and GrEA, we adjust the threshold, $T$, in KnEA and the grid division, $div$, in GrEA according to the guidelines provided in the original studies to achieve the best performances of these algorithms. The settings of $T$ and $div$ for the DTLZ and WFG problems are listed in Table 3 in the Supplementary Material.

Table 3:
Comparisons between MeO$III$+NSGA-II and seven state-of-the-art algorithms regarding the mean value of IGD$+$.

### 5.3  Results and Discussions

Table 3 shows the mean values of IGD$+$ and the performance scores obtained by different algorithms on the DTLZ and WFG problems. A darker tone corresponds to a larger performance score. The average performance score (APS) of each algorithm on all the test problems is at the bottom of Table 3.

From Table 3 we can draw the following conclusions.

MeO$III$+NSGA-II shows a competitive performance on most problems. It achieves the best APS among the eight competitors and significantly outperforms the other algorithms on more test problems. However, it performs relatively poorly on some test problems, for example, WFG2 and WFG3, since the settings of $p$ and $θ_l$ are not ideal for these problems. This suggests that the performance of MeO+MOEA can be further enhanced by developing a method to better estimate the convergence component. One promising way is to adaptively tune $p$ and $θ_l$ for a given optimization problem. However, this is left for future study.

BiGE shows appealing results on some test problems, such as DTLZ5, WFG2, and WFG3, although it does not achieve any best result among the algorithms considered. BiGE works poorly on DTLZ1, DTLZ3, and DTLZ7. The reason is that the normalization procedure results in an improper niche size for these problems.

SPEA2+SDE shows a competitive performance on DTLZ1, DTLZ5, DTLZ6, DTLZ7, and WFG1. However, SPEA2+SDE is a very time-consuming design, whose runtime is no less than that of the original SPEA2. Please see Table 1 in the Supplementary Material for a detailed discussion. Although SDE can also be combined with NSGA-II, which is computationally efficient, the performance of NSGA-II+SDE is much worse. Please refer to Li et al. (2014) for more details.

NSGA-III works quite well on most test problems. However, its performance scales poorly with the increasing number of objectives on DTLZ5, DTLZ6, WFG1, and WFG3. This is attributed to the fact that the reference vectors employed in NSGA-III are uniformly distributed in the whole objective space, whereas the Pareto optimal solutions (or nondominated solutions during the evolution) of these problems are not.

EFR-RR can effectively deal with most problems. Although it has the same issue as NSGA-III, it generally performs better than NSGA-III on DTLZ5, DTLZ6, WFG1, and WFG3. The performance of both EFR-RR and NSGA-III can be further improved by integrating a strategy for adaptively generating the reference vectors (Qi et al., 2014).

MOMBI-II performs poorly on some scaled test problems, for example, DTLZ7 and WFG4 to 9, since it does not employ a normalization operator. In addition, it adopts the R2 indicator based on the reference vectors, which results in low time consumption but shares similar disadvantages with the decomposition-based algorithms.

GrEA achieves encouraging results on most problems. However, it is very sensitive to the choice of parameter $div$. In our experiments, we test a number of settings of $div$ for each instance to make sure that GrEA can obtain a satisfactory performance. If a method of adaptively adjusting $div$ can be further developed, the performance of GrEA will be considerably enhanced on various problems.

KnEA has an advantage in approximating the true Pareto optimal front by giving preference to the knee points among nondominated solutions. It performs very well on DTLZ7, WFG1, and WFG2. Similar to GrEA, KnEA needs a proper setting of the parameter, $T$. A smaller value of $T$ can lead to more solutions with good convergence.

To sum up, MeO$III$+NSGA-II is very competitive compared with the state-of-the-art MOEAs. Unlike KnEA and GrEA, MeO$III$+NSGA-II is relatively insensitive to its design parameter, $η$. Additionally, different from NSGA-III, EFR-RR, and MOMBI-II, it does not require any reference vectors. Finally, it is computationally much cheaper than SPEA2+SDE.

In addition, we show the results of hypervolume obtained by these algorithms on DTLZ and WFG test problems in Table 2 in the Supplementary Material. Readers can refer to them if interested.

## 6  Conclusion

In this article, we have proposed a novel meta-objective method for many-objective evolutionary optimization, which converts a given MOP into a new MOP. Each meta-objective in the new MOP contains a unique diversity component that represents the position of a solution according to a coordinate axis in the original objective space, which results in the conflict among the meta-objectives. Furthermore, all the meta-objectives contain the same convergence component that measures the distance between the solution and the Pareto optimal front. A well-converged and distributed solution set can be obtained by optimizing the new MOP, even if the number of objectives is large. This method provides a new way to manage the requirements of convergence and diversity in many-objective optimization.

Since the Pareto optimal front is actually unknown, we have proposed three alternatives to the convergence component based on the Pareto rank method, the $Lp$ scalarizing method, and the double-rank method, respectively. The first is effective in solving MaOPs but occasionally has a low efficiency for some optimization problems. The second provides a needed selection pressure but is very sensitive to the settings of parameters. The third is recommended by taking the advantages of the above two methods while avoiding their disadvantages at the same time.

In this study, we combine MeO with two classic Pareto-based MOEAs, that is, NSGA-II and SPEA2, which results in two new algorithms, denoted as MeO+NSGA-II and MeO+SPEA2. The performances of the two new algorithms are significantly better than those of their original counterparts when solving MaOPs. Additionally, MeO+NSGA-II performs slightly better than MeO+SPEA2 and is computationally much cheaper.

To demonstrate the effectiveness of MeO, we test MeO+NSGA-II with the double-rank method on the DTLZ and WFG problems in comparison with seven state-of-the-art designs, namely, BiGE, SDE, NSGA-III, EFR-RR, MOMBI-II, GrEA, and KnEA. The experimental results demonstrate that MeO+NSGA-II is very competitive among the chosen algorithms.

To further promote the performance of MeO+MOEA, better estimating the convergence component is a must. Therefore, our future work includes the development of a strategy of adaptively tuning the parameters, $θl$ and $p$, in each neighborhood for the double-rank method. In addition, combining MeO with other MOEAs, for example, indicator- and decomposition-based algorithms, and investigating the behavior of the new corresponding algorithms in many-objective optimization are of great interest.

## Acknowledgments

This work was jointly supported by National Natural Science Foundation of China with grant No. 61803192, 61876075, 61773384, 61763026, 61673404, 61573361, and 61503220, National Basic Research Program of China (973 Program) with grant No. 2014CB046306-2, National Key R&D Program of China with grant No. 2018YFB1003802-01, and China Scholarship Council with grant No. 201606420005.

## References

,
S. F.
, and
Fleming
,
P. J
. (
2011
).
Diversity management in evolutionary many-objective optimization
.
IEEE Transactions on Evolutionary Computation
,
15
(
2
):
183
195
.
,
J.
, and
Zitzler
,
E
. (
2011
).
HypE: An algorithm for fast hypervolume-based many-objective optimization
.
Evolutionary Computation
,
19
(
1
):
45
76
.
Cheng, J., Yen, G. G., and Zhang, G. (2015). A many-objective evolutionary algorithm with enhanced mating and environmental selections. IEEE Transactions on Evolutionary Computation, 19(4):592–605.

Deb, K., and Jain, H. (2013). An evolutionary many-objective optimization algorithm using reference-point based non-dominated sorting approach, part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 18(4):577–601.

Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197.

Deb, K., Thiele, L., Laumanns, M., and Zitzler, E. (2005). Scalable test problems for evolutionary multiobjective optimization. New York: Springer.

Goulart, F., and Campelo, F. (2016). Preference-guided evolutionary algorithms for many-objective optimization. Information Sciences, 329:236–255.
Hadka, D., and Reed, P. (2012). Diagnostic assessment of search controls and failure modes in many-objective evolutionary optimization. Evolutionary Computation, 20(3):423–452.
Han, Y., Gong, D., Jin, Y., and Pan, Q. (2017). Evolutionary multi-objective blocking lot-streaming flow shop scheduling with machine breakdowns. IEEE Transactions on Cybernetics. doi:10.1109/TCYB.2017.2771213

Han, Y., Gong, D., and Sun, X. (2015). A discrete artificial bee colony algorithm incorporating differential evolution for the flow-shop scheduling problem with blocking. Engineering Optimization, 47(7):927–946.

He, Z., and Yen, G. G. (2016). Visualization and performance metric in many-objective optimization. IEEE Transactions on Evolutionary Computation, 20(3):386–402.

He, Z., Yen, G. G., and Zhang, J. (2014). Fuzzy-based Pareto optimality for many-objective evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 18(2):269–285.

Hernández Gómez, R., and Coello Coello, C. A. (2015). Improved metaheuristic based on the R2 indicator for many-objective optimization. In Proceedings of the 2015 Conference on Genetic and Evolutionary Computation, pp. 679–686.

Huband, S., Hingston, P., Barone, L., and While, L. (2006). A review of multiobjective test problems and a scalable test problem toolkit. IEEE Transactions on Evolutionary Computation, 10(5):477–506.

Ishibuchi, H., Masuda, H., Tanigaki, Y., and Nojima, Y. (2015). Modified distance calculation in generational distance and inverted generational distance. In International Conference on Evolutionary Multi-Criterion Optimization, pp. 110–125.

Jaimes, A. L., Aguirre, H., Tanaka, K., and Coello, C. A. C. (2010). Objective space partitioning using conflict information for many-objective optimization. In Parallel Problem Solving from Nature, pp. 657–666.

Laumanns, M., Thiele, L., Deb, K., and Zitzler, E. (2002). Combining convergence and diversity in evolutionary multiobjective optimization. Evolutionary Computation, 10(3):263–282.

Li, J., Sang, H., Han, Y., Wang, C., and Gao, K. (2018). Efficient multi-objective optimization algorithm for hybrid flow shop scheduling problems with setup energy consumptions. Journal of Cleaner Production, 181:584–598.

Li, K., Kwong, S., Cao, J., Li, M., Zheng, J., and Shen, R. (2012). Achieving balance between proximity and diversity in multi-objective evolutionary algorithm. Information Sciences, 182(1):220–242.

Li, M., Yang, S., and Liu, X. (2014). Shift-based density estimation for Pareto-based algorithms in many-objective optimization. IEEE Transactions on Evolutionary Computation, 18(3):348–365.

Li, M., Yang, S., and Liu, X. (2015). Bi-goal evolution for many-objective optimization problems. Artificial Intelligence, 228:45–65.

Liu, Y., Gong, D., Sun, J., and Jin, Y. (2017). A many-objective evolutionary algorithm using a one-by-one selection strategy. IEEE Transactions on Cybernetics, 47(9):2689–2702.

Liu, Y., Gong, D., Sun, X., and Zhang, Y. (2017). Many-objective evolutionary optimization based on reference points. Applied Soft Computing, 50:344–355.

Liu, Y., Ishibuchi, H., Nojima, Y., Masuyama, N., and Shang, K. (2018a). A double-niched evolutionary algorithm and its behavior on polygon-based problems. In International Conference on Parallel Problem Solving from Nature, pp. 262–273.

Liu, Y., Ishibuchi, H., Nojima, Y., Masuyama, N., and Shang, K. (2018b). Improving 1by1EA to handle various shapes of Pareto fronts. In International Conference on Parallel Problem Solving from Nature, pp. 311–322.

Liu, Y., Yen, G. G., and Gong, D. (2018). A multi-modal multi-objective evolutionary algorithm using two-archive and recombination strategies. IEEE Transactions on Evolutionary Computation. doi:10.1109/TEVC.2018.2879406
Qi, Y., Ma, X., Liu, F., Jiao, L., Sun, J., and Wu, J. (2014). MOEA/D with adaptive weight adjustment. Evolutionary Computation, 22(2):231–264.
Sato, H., Aguirre, H., and Tanaka, K. (2011). Improved S-CDAS using crossover controlling the number of crossed genes for many-objective optimization. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pp. 753–760.

Takahama, T., Sakai, S., and Iwane, N. (2005). Constrained optimization by the $\epsilon$ constrained hybrid algorithm of particle swarm optimization and genetic algorithm. In AI 2005: Advances in Artificial Intelligence, pp. 389–400.

Tian, Y., Cheng, R., Zhang, X., and Jin, Y. (2017). PlatEMO: A MATLAB platform for evolutionary multi-objective optimization. IEEE Computational Intelligence Magazine, 12(4):73–87.

Wang, R., Ishibuchi, H., Zhang, Y., Zheng, X., and Zhang, T. (2016). On the effect of localized PBI method in MOEA/D for multi-objective optimization. In 2016 IEEE Symposium Series on Computational Intelligence, pp. 1–8.
Wang, R., Zhang, Q., and Zhang, T. (2016). Decomposition based algorithms using Pareto adaptive scalarizing methods. IEEE Transactions on Evolutionary Computation, early access.
Wang, R., Zhou, Z., Ishibuchi, H., Liao, T., and Zhang, T. (2018). Localized weighted sum method for many-objective optimization. IEEE Transactions on Evolutionary Computation, 22(1):3–18.

Wang, Y., and Cai, Z. (2012). Combining multiobjective optimization with differential evolution to solve constrained optimization problems. IEEE Transactions on Evolutionary Computation, 16(1):117–134.

Wang, Y., Li, H., Yen, G. G., and Song, W. (2015). MOMMOP: Multiobjective optimization for locating multiple optimal solutions of multimodal optimization problems. IEEE Transactions on Cybernetics, 45(4):830–843.

Yang, S., Li, M., Liu, X., and Zheng, J. (2013). A grid-based evolutionary algorithm for many-objective optimization. IEEE Transactions on Evolutionary Computation, 17(5):721–736.

Yuan, Y., Xu, H., Wang, B., Zhang, B., and Yao, X. (2016). Balancing convergence and diversity in decomposition-based many-objective optimizers. IEEE Transactions on Evolutionary Computation, 20(6):180–198.

Zhang, Q., and Li, H. (2007). MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Transactions on Evolutionary Computation, 11(6):712–731.

Zhang, X., Tian, Y., and Jin, Y. (2015). A knee point driven evolutionary algorithm for many-objective optimization. IEEE Transactions on Evolutionary Computation, 19(6):761–776.

Zitzler, E., and Künzli, S. (2004). Indicator-based selection in multiobjective search. In Parallel Problem Solving from Nature, pp. 832–842.

Zitzler, E., Laumanns, M., and Thiele, L. (2001). SPEA2: Improving the strength Pareto evolutionary algorithm. Technical Report, Eidgenössische Technische Hochschule Zürich (ETH), Institut für Technische Informatik und Kommunikationsnetze (TIK).