## Abstract

The most relevant property that a quality indicator (QI) is expected to have is Pareto compliance, which means that every time an approximation set strictly dominates another in a Pareto sense, the indicator must reflect this. The hypervolume indicator and its variants are the only unary QIs known to be Pareto-compliant, but there are many commonly used weakly Pareto-compliant indicators such as R2, IGD$+$, and $\epsilon+$. Currently, an open research area is the design of new Pareto-compliant indicators whose preferences are different from those of the hypervolume indicator. In this article, we propose a theoretical basis for combining existing weakly Pareto-compliant indicators, with at least one of them being Pareto-compliant, such that the resulting combined indicator is Pareto-compliant as well. Most importantly, we show that the combination of Pareto-compliant QIs with weakly Pareto-compliant indicators leads to indicators that inherit properties of the weakly compliant indicators in terms of optimal point distributions. The consequences of these new combined indicators are threefold: (1) they increase the variety of available Pareto-compliant QIs by correcting weakly Pareto-compliant indicators, (2) they introduce a general framework for the combination of QIs, and (3) they give rise to new selection mechanisms for multiobjective evolutionary algorithms in which it is possible to achieve/adjust desired distributions on the Pareto front.

## 1 Introduction

The quality assessment of Pareto front approximations^{1} (also known as approximation sets) has been a critical factor in comparing multi-objective evolutionary algorithms (MOEAs) (Coello Coello et al., 2007). When assessing an approximation set, three quality aspects have commonly been considered: convergence towards the Pareto front, spread, and diversity of solutions (Zitzler et al., 2000). The first evaluation method consisted of qualitative comparisons obtained by plotting the approximation sets (Horn et al., 1994). However, a visual comparison becomes insufficient as the number of objective functions, the number of MOEAs, and the cardinality of the approximation sets increase. To overcome this issue, researchers extended the Pareto dominance relation and its variants (which give a general notion of optimality) so that they can be applied to approximation sets. In this regard, given two objective vectors $\vec{u}, \vec{v} \in \mathbb{R}^m$, $\vec{u}$ *Pareto dominates* $\vec{v}$ (denoted as $\vec{u} \prec \vec{v}$) if and only if $u_i \leq v_i$ for all $i = 1, 2, \dots, m$ and there exists at least one index $j \in \{1, 2, \dots, m\}$ such that $u_j < v_j$. If $u_i \leq v_i$ for all $i$, it is said that $\vec{u}$ *weakly Pareto dominates* $\vec{v}$ (denoted as $\vec{u} \preceq \vec{v}$). If $u_i < v_i$ for all $i$, $\vec{u}$ *strongly Pareto dominates* $\vec{v}$ (denoted as $\vec{u} \prec\prec \vec{v}$). The extended Pareto dominance relations are shown in Table 1. An important drawback of these set-based binary relations is their inability to take into account the spread and diversity of solutions. To alleviate the issues of the two previous comparison methods, quality indicators (QIs) were proposed as a quantitative methodology focused on measuring the three main quality aspects previously indicated (Veldhuizen, 1999; Zitzler et al., 2003; Jiang et al., 2014; Liefooghe and Derbel, 2016; Li and Yao, 2019).
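As a minimal sketch of the three point-wise relations just defined (assuming minimization; the function names are ours, not from the article):

```python
def pareto_dominates(u, v):
    """u ≺ v: u is no worse in every objective and strictly better in at least one."""
    return all(ui <= vi for ui, vi in zip(u, v)) and any(ui < vi for ui, vi in zip(u, v))

def weakly_dominates(u, v):
    """u ⪯ v: u is no worse than v in every objective."""
    return all(ui <= vi for ui, vi in zip(u, v))

def strongly_dominates(u, v):
    """u ≺≺ v: u is strictly better than v in every objective."""
    return all(ui < vi for ui, vi in zip(u, v))
```

For example, `pareto_dominates((1.0, 2.0), (1.0, 3.0))` holds, while `weakly_dominates((1.0, 2.0), (1.0, 2.0))` holds without Pareto dominance holding.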

**Table 1:** Set-based extensions of the Pareto dominance relations to approximation sets $A$ and $B$.

| Relation | Description | Name |
|---|---|---|
| $A \preceq B$ | For every element $\vec{b} \in B$, there is at least one element $\vec{a} \in A$ such that $\vec{a} \preceq \vec{b}$. | Weak dominance |
| $A \vartriangleleft B$ | $A \preceq B$ and $B \npreceq A$. | Strict dominance |
| $A \prec B$ | $A \neq \emptyset$ and for every element $\vec{b} \in B$, there is at least one element $\vec{a} \in A$ such that $\vec{a} \prec \vec{b}$. | Strict elementwise dominance |
| $A \prec\prec B$ | $A \neq \emptyset$ and for every element $\vec{b} \in B$, there is at least one element $\vec{a} \in A$ such that $\vec{a} \prec\prec \vec{b}$. | Strong dominance |

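The set-based relations of Table 1 can be sketched directly from the point-wise relation (assuming minimization; the function names are ours):

```python
def weakly_dominates(u, v):
    """u ⪯ v: u is no worse than v in every objective (minimization)."""
    return all(ui <= vi for ui, vi in zip(u, v))

def set_weakly_dominates(A, B):
    """A ⪯ B: every b in B is weakly dominated by at least one a in A."""
    return all(any(weakly_dominates(a, b) for a in A) for b in B)

def set_strictly_dominates(A, B):
    """A ◃ B: A ⪯ B and not B ⪯ A."""
    return set_weakly_dominates(A, B) and not set_weakly_dominates(B, A)
```

For instance, with $A = \{(1,3), (3,1)\}$ and $B = \{(2,3), (3,2)\}$, every point of $B$ is weakly dominated by a point of $A$ but not vice versa, so $A \vartriangleleft B$.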

Quality indicators are set functions that assign a real value to one or more approximation sets simultaneously, according to specific preference information (Zitzler et al., 2003; Li and Yao, 2019). Mathematically, an $l$-ary QI is a function $I : A_1 \times \cdots \times A_l \to \mathbb{R}$, where each $A_j \subset \mathbb{R}^m$, $j = 1, \dots, l$, is a non-empty approximation set. Due to this definition, QIs impose a total order on the set $\Psi$ of all approximation sets related to a multiobjective optimization problem. Hence, this property makes QIs a remarkable option for comparing the performance of MOEAs. In the specialized literature, there are several QIs that aim to assess the convergence, spread, and uniformity of approximation sets (Li and Yao, 2019). QIs focused on measuring convergence have a noteworthy relevance since they have been used both to assess the performance of MOEAs and to design their selection mechanisms (Falcón-Cardona and Coello Coello, 2020). Regarding the assessment of MOEAs, *Pareto compliance* is a critical property of convergence QIs^{2} that allows them to reflect the order imposed by the $\vartriangleleft$-relation (see Table 1). It is worth noting that throughout the years, the term *Pareto compliance* has been commonly used. However, Hansen and Jaszkiewicz (1998) first named this property *compatibility* with an outperformance relation; Zitzler et al. (2003) denoted these indicators as $\vartriangleleft$-*complete*; and, finally, Zitzler, Knowles, et al. (2008) refined the term as *strict monotonicity*. In the following, we define the Pareto compliance and weak Pareto compliance properties, assuming, without loss of generality, that a lower indicator value implies better quality.

Property 1 (Pareto compliance): Given two approximation sets $A$ and $B$, a unary indicator $I$ is $\vartriangleleft$-compliant (Pareto-compliant) if $A \vartriangleleft B \Rightarrow I(A) < I(B)$.

Property 2 (Weak Pareto compliance): Given two approximation sets $A$ and $B$, a unary indicator $I$ is weakly $\vartriangleleft$-compliant (weakly Pareto-compliant) if $A \vartriangleleft B \Rightarrow I(A) \leq I(B)$.

Pareto compliance implies that every time an approximation set $A$ strictly dominates another set $B$, the indicator $I$ will reflect this situation by assigning a lower indicator value to $A$. In contrast, weak Pareto compliance only guarantees that $A$ is assigned a value at least as good as that of $B$. If a QI does not reflect the order imposed by the extended dominance relations at all, it is denoted as a non-Pareto-compliant indicator. The importance of Pareto-compliant indicators when assessing MOEAs lies in the fact that they cannot contradict the order structure imposed by the $\vartriangleleft$-relation (Zitzler et al., 2003). Hence, when comparing MOEAs, Pareto compliance avoids misleading conclusions regarding the use of Pareto dominance.

Among the plethora of convergence QIs currently available, the hypervolume indicator (HV) is the only unary QI that is Pareto-compliant (Zitzler, 1999; Zitzler et al., 2003; Zitzler, Knowles, et al., 2008). The HV measures the extent of the volume dominated by an approximation set and bounded by a user-supplied reference point that should be dominated by all points in the Pareto front approximation. For nonlinear Pareto front shapes, the set of size $\mu$ that approximates the solution to the HV-based subset selection problem presents a non-uniform distribution of objective vectors (Shang, Ishibuchi, He, et al., 2020). The consequences of these non-uniform optimal $\mu$-distributions are twofold: (1) HV penalizes uniform Pareto front approximations in comparison with certain non-uniform approximation sets, and (2) MOEAs that use HV-based selection mechanisms produce non-uniform approximation sets. To improve the uniformity of the optimal $\mu$-distributions of HV, several variants have been proposed: the weighted HV (Zitzler et al., 2007), the logarithmic HV (Friedrich et al., 2011), the free HV (Emmerich et al., 2014), and the transformation-based HV (Shang, Ishibuchi, Nan, et al., 2020). Moreover, these variants preserve the Pareto compliance property of HV. Additionally, some other QIs have been proposed that have preferences different from those of the HV but are only weakly Pareto-compliant or non-Pareto-compliant. For instance, the most noteworthy weakly Pareto-compliant QIs are R2 (Brockhoff et al., 2012), the Inverted Generational Distance plus (IGD$+$) (Ishibuchi et al., 2015), and the unary additive $\epsilon$ indicator ($\epsilon+$) (Zitzler et al., 2003), while IGD (Coello Coello and Cruz Cortés, 2005) and the Generational Distance (Veldhuizen, 1999) are non-Pareto-compliant indicators.

Due to the imperative need for new Pareto-compliant QIs whose preferences are significantly different from those of the HV, we propose in this article a methodology to generate new Pareto-compliant QIs. It is worth noting that this article provides both a theoretical and an experimental extension of the work introduced by Falcón-Cardona et al. (2019). Under this methodology, we combine one or more weakly Pareto-compliant indicators with at least one Pareto-compliant QI, using an order-preserving combination function. Under these conditions, we demonstrate that the combined indicators preserve the Pareto compliance property. Additionally, we show, through a preference analysis and the approximate optimal $\mu$-distributions, that our framework allows us to create Pareto-compliant QIs with preferences different from those of the HV in two ways: (1) by exploiting the conflict that sometimes exists between the preferences of indicators, such that the combined indicator shows intermediate preferences, and (2) by keeping the original preferences of the weakly Pareto-compliant QIs but using a correcting factor derived from the Pareto-compliant QI being used. Overall, the contributions of this article are the following:

- We provide guidelines for the construction of new Pareto-compliant QIs whose preferences are essentially different from those of the HV.

- Our proposed framework allows correcting weakly Pareto-compliant QIs, such as R2, IGD$+$, and $\epsilon+$, so that they become Pareto-compliant.

- We provide an empirical study of the optimal $\mu$-distribution of solutions generated by a steady-state MOEA based on some (selected) new Pareto-compliant QIs.

The remainder of the article is organized as follows. Section 2 defines the quality indicators that we use throughout the article. Previous related work is described in Section 3. Our mathematical framework for the construction of new Pareto-compliant QIs is introduced in Section 4. The experimental results using the combined indicators are presented in Section 5. Finally, the main conclusions and future work are described in Section 6.

## 2 Background

In the following, we provide the mathematical definitions of HV, R2, IGD$+$, and $\epsilon +$, which are extensively used throughout the article. In all cases, let $A$ denote an approximation set and $Z$ be a reference set.

HV measures the extent of the volume jointly dominated by the points in $A$ and bounded by a user-supplied reference point $\vec{z}_{ref}$. Currently, HV and the closely related logarithmic HV (Friedrich et al., 2011), weighted HV (Auger et al., 2009), free HV (Emmerich et al., 2014), and transformation-based HV (Shang, Ishibuchi, Nan, et al., 2020) are the only Pareto-compliant QIs known so far. The two main drawbacks of the HV are the following. First, unless $P = NP$, its computational cost increases super-polynomially with the number of objective functions (Bringmann and Friedrich, 2009, 2010). The other issue is related to $\vec{z}_{ref}$, since the preferences of HV strongly depend on it (Auger et al., 2009; Ishibuchi, Imada, et al., 2017). In other words, the specification of the reference point depends on the Pareto front shape. It has been shown that the distribution of points is often concentrated on the boundary and in knee-point regions.
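Although the general HV computation is expensive, for two objectives it reduces to a linear sweep over the points sorted by the first objective. A minimal sketch, assuming minimization and a function name of our choosing:

```python
def hv_2d(A, ref):
    """Hypervolume of a 2-objective approximation set A (minimization),
    bounded by the reference point ref."""
    # keep only points that strictly dominate the reference point
    pts = sorted({(x, y) for (x, y) in A if x < ref[0] and y < ref[1]})
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:                  # ascending in the first objective
        if y < prev_y:                # dominated points add nothing
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv
```

For example, `hv_2d([(1, 3), (2, 2), (3, 1)], (4, 4))` yields `6.0`, which exceeds the value for any set it strictly dominates, illustrating Pareto compliance in this simple case.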

The R2 indicator is a convergence-diversity QI that measures the average minimum utility value of the approximation set with respect to a set $W$ of weight vectors (Hansen and Jaszkiewicz, 1998; Brockhoff et al., 2012). Its computational cost is in $\Theta(m \cdot |W| \cdot |A|)$. Unlike the hypervolume indicator, the time complexity of R2 grows linearly with the number of objectives. Its time complexity is, however, proportional to the number of weight vectors,^{3} which has to grow exponentially if the number of objectives increases and the same sampling resolution is desired. A major conceptual difference with respect to the hypervolume indicator is that the R2 indicator does not require an anti-optimal reference point. Instead, it works with an ideal or utopian reference point. In many applications involving, for instance, error or cost minimization, there is a natural choice for an ideal point, but it is difficult to define an anti-optimal reference point. In such cases, the R2 indicator could be a better choice than the hypervolume.
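A common instantiation of the unary R2 indicator uses the weighted Tchebycheff utility with respect to an ideal point $\vec{z}^{*}$. A minimal sketch under that assumption (the function name is ours):

```python
def r2(A, W, z_star):
    """Unary R2 (lower is better), weighted Tchebycheff utility:
    R2 = (1/|W|) * sum over w in W of  min over a in A of  max_i w_i * |a_i - z*_i|."""
    total = 0.0
    for w in W:
        total += min(
            max(wi * abs(ai - zi) for wi, ai, zi in zip(w, a, z_star))
            for a in A
        )
    return total / len(W)
```

Note that each weight vector contributes the utility of its best-suited point in $A$, which is why the cost grows with $|W| \cdot |A|$ as stated above.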

A problem, however, arises from the fact that the R2 indicator is not Pareto-compliant but only weakly Pareto-compliant (Hansen and Jaszkiewicz, 1998; Brockhoff et al., 2012). This makes it possible for a set to have the same R2 indicator value as another set although it is dominated in the set order, or for sets to degenerate if this indicator is used as a guide in a Pareto optimization process. One might argue that these are rare cases, as they always involve shared coordinate values among points, and that in most cases the R2 indicator works well when comparing sets. In fact, in continuous unconstrained optimization, such cases might occur with low probability, but they are relatively likely to occur in discrete optimization and in cases in which box constraints are introduced.

Ishibuchi et al. (2015) proposed IGD$+$ as an improvement of the IGD indicator (Coello Coello and Cruz Cortés, 2005). Both QIs measure the convergence and diversity of solutions simultaneously. However, IGD$+$ is weakly Pareto-compliant while IGD is not Pareto-compliant (Bezerra et al., 2017). IGD$+$ measures the average distance from the reference set to the dominated space of the approximation set. Its computational cost is $\Theta(m \cdot |Z| \cdot |A|)$. A critical aspect is how to specify the reference set when no information is available about the true Pareto front $PF^*$ (Ishibuchi et al., 2014).
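IGD$+$ replaces the Euclidean distance of IGD with the distance $d^{+}(\vec{a}, \vec{z}) = \sqrt{\sum_i \max(a_i - z_i, 0)^2}$, which only counts the components in which $\vec{a}$ is worse than the reference point $\vec{z}$. A minimal sketch, assuming minimization (the function name is ours):

```python
import math

def igd_plus(A, Z):
    """IGD+ (minimization): average over z in Z of the minimum d+(a, z),
    where d+ ignores components in which a is already at least as good as z."""
    def d_plus(a, z):
        return math.sqrt(sum(max(ai - zi, 0.0) ** 2 for ai, zi in zip(a, z)))
    return sum(min(d_plus(a, z) for a in A) for z in Z) / len(Z)
```

A point that weakly dominates a reference point contributes a distance of zero to that reference point, which is what makes IGD$+$ weakly Pareto-compliant while plain IGD is not.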

The unary $\epsilon+$ indicator gives the minimum distance by which a Pareto front approximation needs to (or can) be translated in all dimensions of the objective space at once such that a reference set is weakly dominated. In consequence, this QI exclusively measures convergence to $PF^*$, and it is weakly Pareto-compliant. A remarkable aspect is that $\epsilon+$ does not require any parameters, although, as in the case of IGD$+$, a reference set has to be supplied. Additionally, $\epsilon+$ is not very sensitive to local changes in the solutions of $A$ (Bringmann et al., 2011).
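The unary additive $\epsilon+$ indicator can be written as $I_{\epsilon+}(A, Z) = \max_{\vec{z} \in Z} \min_{\vec{a} \in A} \max_i (a_i - z_i)$; a minimal sketch assuming minimization (the function name is ours):

```python
def eps_plus(A, Z):
    """Unary additive epsilon indicator (minimization): the smallest shift
    eps such that subtracting eps from every component of every a in A
    makes the reference set Z weakly dominated by A."""
    return max(
        min(max(ai - zi for ai, zi in zip(a, z)) for a in A)
        for z in Z
    )
```

A negative value indicates that $A$ already weakly dominates $Z$ with room to spare, and zero indicates exact weak dominance.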

Consider, as a first example, two approximation sets $A$ and $B$ such that $A \vartriangleleft B$, for which the HV, IGD$+$, and IGD^{4} values are calculated. It is worth noting that a lower value of IGD$+$ and IGD means higher quality, in contrast to the HV, which aims to maximize the dominated volume. Due to its Pareto compliance, the HV effectively shows that $A$ strictly dominates $B$ since $HV(A, \vec{z}_{ref}) = 0.781875$ and $HV(B, \vec{z}_{ref}) = 0.671250$. On the other hand, IGD$+$ cannot reflect that $A \vartriangleleft B$ since the same IGD$+$ value (0.125) is assigned to both approximation sets. In contrast, IGD determines that $B$ is better because $IGD(B, Z) = 0.125$ and $IGD(A, Z) = 0.167705$, even though $B$ is strictly dominated by $A$. Regarding the second example, let us consider a QI which we will call the "Zero indicator." It is defined as $Z : \Psi \to \mathbb{R}$ with $Z \equiv 0$. Clearly, for every $A, B \in \Psi$ such that $A \vartriangleleft B$, it holds that $Z(A) = Z(B)$; that is, $Z$ is weakly Pareto-compliant. Although indicators such as R2, IGD$+$, and $\epsilon+$ are more complex than $Z$ in a mathematical sense, all of them are, like the Zero indicator, only weakly Pareto-compliant. Based on the discussed examples, we can see that it is not enough to construct weakly and non-Pareto-compliant QIs, since they can lead to misleading conclusions when comparing approximation sets (Zitzler, Knowles, et al., 2008). Consequently, this is a motivation to construct new Pareto-compliant QIs.

## 3 Previous Related Work

In this section, we first discuss the different terms that have been used to denote the Pareto compliance property, and then briefly describe previous work on the employment of multiple QIs.

### 3.1 Pareto Compliance

Pareto compliance (defined in Property 1) is a characteristic of convergence QIs which allows them to reflect the order structure imposed by the $\vartriangleleft$-relation defined in Table 1. Over the years, this property has been named in different ways, namely, *compatibility*, *completeness*, and *strict monotonicity*. The origin of this property dates back to the work of Hansen and Jaszkiewicz (1998). In that paper, the authors defined the general notion of an *outperformance relation*^{5} $R$ as a set-based binary order relation to compare approximation sets. Based on the general outperformance relation, Hansen and Jaszkiewicz defined the term *compatibility*, which is currently known as Pareto compliance, as follows:

Compatibility with an outperformance relation (Hansen and Jaszkiewicz, 1998). A comparison method $C$ is *compatible* with an outperformance relation $R$ if, for each pair of approximation sets $A$ and $B$ such that $A\,R\,B$, $C$ will evaluate $A$ as being better than $B$.

Weak compatibility with an outperformance relation (Hansen and Jaszkiewicz, 1998). A comparison method $C$ is *weakly compatible* with an outperformance relation $R$ if, for each pair of approximation sets $A$ and $B$ such that $A\,R\,B$, $C$ will evaluate $A$ as being not worse than $B$.

The comparison method $C$ in the above definitions can be replaced by a quality indicator. Hence, assuming that a lower indicator value implies better quality, $A\,R\,B \Rightarrow I(A) < I(B)$ and $A\,R\,B \Rightarrow I(A) \leq I(B)$ imply that $I$ is compatible and weakly compatible, respectively. If $R$ is replaced by $\vartriangleleft$, we obtain Properties 1 and 2.

Following the definitions of Hansen and Jaszkiewicz, Zitzler et al. (2003) denoted the compatibility as *completeness* with respect to an arbitrary binary relation $R$. However, the main difference was the consideration of an indicator vector $\vec{I} = (I_1, \dots, I_k)$ instead of a single QI. Moreover, the comparison method was redefined as a function $C_{\vec{I},E}$ that maps a pair of approximation sets $(A, B)$ to $\{\mathrm{true}, \mathrm{false}\}$, where $E : \mathbb{R}^k \times \mathbb{R}^k \to \{\mathrm{true}, \mathrm{false}\}$ is an interpretation function. This newly defined comparison method aims to determine, by a Boolean value, whether $A$ is better than $B$. The comparison method $C_{\vec{I},E}(A, B) = E(\vec{I}(A), \vec{I}(B))$ is denoted as $R$-*complete* if, for any $A, B \in \Psi$, it holds that $A\,R\,B \Rightarrow C_{\vec{I},E}(A, B)$. This definition extended the compatibility of Hansen and Jaszkiewicz to consider multiple QIs when comparing approximation sets. We can easily derive the compatibility definition from the completeness one.

Finally, Zitzler, Knowles, et al. (2008) employed the term *strict monotonicity* to denote a QI for which the following holds: $\forall A, B \in \Psi : A \prec B \Rightarrow I(A) < I(B)$, where $\prec$ corresponds to the strict elementwise dominance in Table 1. According to Zitzler et al. (2003), $A \prec B \Rightarrow A \vartriangleleft B$; thus, for those pairs of approximation sets for which $\prec$ holds, we can employ the $\vartriangleleft$-relation as stated in the compatibility and completeness definitions.

### 3.2 Combination of Multiple Indicators

Even though quality indicators have been widely employed by the evolutionary multiobjective optimization community, the exploration of quantitative methods using multiple QIs has been scarce. In this section, we discuss some previous works in which the *combination of multiple indicators* was discussed. Particularly, we focus on the works of Zitzler et al. (2003), Knowles et al. (2006), Zitzler, Knowles, et al. (2008), and Zitzler et al. (2010).

When assessing MOEAs, it is a common strategy to employ several QIs to characterize different quality aspects of approximation sets. In this regard, Zitzler et al. (2003) proposed to analyze such combinations or, more formally, quality indicator vectors $\vec{I} = (I_1, \dots, I_k)$, to better interpret the results, based on multiple quality aspects, when comparing two approximation sets. To this aim, the authors proposed to use comparison methods based on interpretation functions $E$ (defined in the previous section) to analyze the indicator vectors. Depending on the several possibilities to define $E$, different claims could be produced when comparing two approximation sets. However, no theoretical properties of $\vec{I}$ or of the interpretation functions were provided.

Knowles et al. (2006) briefly explained that the combination of indicators, ideally using Pareto-compliant ones, could lead to powerful interpretations in contrast to employing a single QI. However, no mathematical definition was provided to support this claim. In a subsequent work, Zitzler, Knowles, et al. (2008) explained that the combination of indicators could allow us to overcome the difficult situation of finding an ideal indicator, that is, a QI being Pareto-compliant, scaling invariant, and cheap to compute. To this aim, one has to look for a way to combine the resulting indicator values. Hence, they were the first to suggest the use of a sequence of indicators to evaluate approximation sets.

Zitzler et al. (2010) mathematically defined a multi-indicator preference relation $R_S$ that sequentially applies preference relations based on a single quality indicator. $R_S$ utilizes a sequence $S = (\preceq_{I_1}, \preceq_{I_2}, \dots, \preceq_{I_k})$, where each $\preceq_{I_j}$ is a preference relation based on the indicator $I_j$ that compares the indicator values of two given approximation sets. The backbone of this proposal is to create a chain of refinements of indicator preferences to compare two given approximation sets and, in this way, break ties (i.e., cases where a QI cannot decide which approximation is better). Based on this idea, the authors proposed an algorithm for the evaluation of $A$ and $B$ similar to the non-dominated sorting algorithm (Deb et al., 2002). First, an index $j$ is set to 1 to use $\preceq_{I_1}$, i.e., the value of $I_1$ is calculated for both $A$ and $B$. If $I_j(A) = I_j(B)$, then $j$ is increased to point to the next indicator-based preference relation in the sequence, if it exists. In case $\preceq_{I_j}$ identifies an approximation set with better quality, the decision is returned. In consequence, the use of these indicator-based preference relations can increase the sensitivity of an order relation by solving cases of incomparability. Furthermore, Zitzler et al. proposed that the first $1 \leq t < k$ indicators be weakly Pareto-compliant and the remaining ones Pareto-compliant. This way, the Pareto-compliant QIs can be employed to refine the preferences of the weakly Pareto-compliant indicators.
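The sequential evaluation just described amounts to a lexicographic comparison over the sequence of indicators, where each later indicator only breaks ties left by the earlier ones. A minimal sketch, assuming unary indicators that are to be minimized (names are ours):

```python
def compare_by_sequence(A, B, indicators):
    """Return -1 if A is preferred, 1 if B is preferred, 0 if every
    indicator in the sequence ties; later indicators only break ties."""
    for I in indicators:
        ia, ib = I(A), I(B)
        if ia < ib:
            return -1
        if ib < ia:
            return 1
    return 0
```

For instance, pairing the Zero indicator of Section 2 (which always ties) with any tie-breaking indicator lets the second indicator decide, mirroring how a Pareto-compliant QI at the end of the sequence refines weakly Pareto-compliant ones.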

At this point, it is worth noting the main focus of these multi-indicator-based preference relations. Zitzler, Thiele, et al. (2008) and Zitzler et al. (2010) employed these preference relations to construct the Set Preference Algorithm for Multiobjective Optimization (SPAM), which uses populations as individuals instead of single decision vectors. The reason to use populations as individuals is that the indicator-based preference relations require a population (a set of objective vectors) as an input parameter. Internally, SPAM manages three populations: the main population $P$, the randomly mutated population $P'$, and the heuristically mutated population $P''$. $P$ is compared with $P'$ and $P''$, using the indicator-based preference relations, to determine which one possibly replaces $P$. When HV is used in the sequence of indicators, the high computational cost of HV-based MOEAs such as the $S$-Metric Selection Evolutionary Multiobjective Algorithm (SMS-EMOA) (Beume et al., 2007) is avoided because at most two HV values need to be calculated at each generation, and only when HV is required to refine the preferences. However, SPAM needs more function evaluations than a usual MOEA due to the utilization of three populations.

From the above discussion, it is clear that the indicator-based preference relations show remarkable advantages for deriving set-based MOEAs such as SPAM. Furthermore, another use of these binary relations could be the comparison of the Pareto front approximations generated by different MOEAs. This is possible since the indicator-based preference relations are basically a lexicographic order based on a sequence $S = (\preceq_{I_1}, \preceq_{I_2}, \dots, \preceq_{I_k})$. Thus, they impose a preorder^{6} on the set of all approximation sets. In spite of the possibility of using the indicator-based preference relations to compare MOEAs, there are some issues that should be pointed out. The preference relation on its own only determines the relationship between two given approximation sets; it does not numerically indicate the difference in quality between them. Even if we inspect the indicator values produced by the preference relation, there is a possibility that some of the comparisons do not employ the same $\preceq_{I_j} \in S$. Hence, we would have performance evaluations on different scales of measurement.

In the next section, we present a generalization of the multi-indicator-based preference relations proposed by Zitzler et al. (2010). Our proposal is a framework that combines multiple QIs in order to define new unary indicators. The resulting combined unary indicator merges the preferences of all the baseline QIs, defining new preferences and a common numerical scale of comparison to determine the difference in quality between Pareto front approximations. Another important aspect is that the user can specify the relative importance of each baseline QI in the combination. Furthermore, this framework allows the construction of new Pareto-compliant indicators by combining as many weakly Pareto-compliant indicators as needed, but including at least one Pareto-compliant QI to retain the Pareto compliance property. Due to the use of at least one Pareto-compliant QI, we refine the preferences of the weakly Pareto-compliant indicators in a way similar to that proposed by Zitzler et al.

## 4 New Pareto-Compliant Indicators

In this section, we propose a systematic framework for combining QIs, following the basic idea introduced by Falcón-Cardona et al. (2019). Additionally, we provide the mathematical arguments ensuring that, when combining QIs with specific properties, the resulting combined indicator is Pareto-compliant. This leads not only to new types of indicators but also proves to be a way to create new Pareto-compliant indicators with properties very different from those of the HV, in terms of both the distribution of points that they favor and the parameters provided by the user. In the following, we present the mathematical framework for the combination of QIs. Let $A$ be an approximation set in $\Psi$.

(Combination function): A combination function $C : \mathbb{R}^k \to \mathbb{R}$ assigns a real value to a vector $\vec{I}(A) = (I_1(A), I_2(A), \dots, I_k(A))$, where each $I_j(A)$ is the value of the $j$th unary indicator.

(Combined Indicator): Given an indicator vector $\vec{I}(A) = (I_1(A), I_2(A), \dots, I_k(A))$ and a combination function $C$, a combined indicator $I(A)$ is defined as $I(A) = C(\vec{I}(A))$.

(Compliant Indicator Vector): $\vec{I}(A) = (I_1(A), I_2(A), \dots, I_k(A)) \in Q$ is called a compliant indicator vector (CIV) if, for all $j = 1, \dots, k$, $I_j$ is weakly Pareto-compliant and there exists at least one index $t \in \{1, \dots, k\}$ such that $I_t$ is Pareto-compliant. $Q \subseteq \mathbb{R}^k$ is denoted as the quality space.

For the following theorem, let us assume, without loss of generality, that the unary indicators $I_1, \dots, I_k$ are to be minimized.

Consider two approximation sets $A$ and $B$ such that $A \vartriangleleft B$, let $\vec{I}$ be a CIV, and let $C$ be an order-preserving combination function, that is, $\vec{x} \prec \vec{y} \Rightarrow C(\vec{x}) < C(\vec{y})$. First, $A \vartriangleleft B \Rightarrow \vec{I}(A) \prec \vec{I}(B)$, because the Pareto-compliant indicators strictly improve while the weakly Pareto-compliant ones improve or stay equal. Moreover, since $C$ is order-preserving, $\vec{I}(A) \prec \vec{I}(B) \Rightarrow C(\vec{I}(A)) < C(\vec{I}(B))$. Hence, by transitivity of $\Rightarrow$ and considering that $I(A) = C(\vec{I}(A))$ and $I(B) = C(\vec{I}(B))$, it holds that $A \vartriangleleft B \Rightarrow I(A) < I(B)$; that is, $I$ is Pareto-compliant.

Theorem 1 provides a sufficient condition for constructing Pareto-compliant combined indicators on the basis of compliant indicator vectors. In other words, a combined indicator preserves the Pareto compliance property because of the use of order-preserving combination functions.

The condition of Theorem 1 is sufficient but not necessary. For instance, given $\vec{I}(A) = (I_1(A), I_2(A), \dots, I_k(A))$, where $I_1$ is Pareto-compliant and the $I_j$, $j = 2, \dots, k$, are not Pareto-compliant, the "combined" indicator $I(A) = C(\vec{I}(A)) = I_1(A)$ is also Pareto-compliant. Hence, there is a large number of possibilities to construct combined and compliant indicators.

There exist many combination functions that satisfy the conditions of Theorem 1. In this article, we focus on order-preserving utility functions $u : \mathbb{R}^k \to \mathbb{R}$ (Miettinen, 1999; Pescador-Rojas et al., 2017). A utility function (UF) is a model of the decision maker's preferences that assigns a utility value to each $k$-dimensional vector. Thus, a combination function $C$ can be defined in terms of these functions. Generally, UFs employ a convex weight vector $\vec{w} \in [0,1]^k$ such that $\sum_{i=1}^{k} w_i = 1$ and $w_i \geq 0$. However, for the combination of QIs, we need $w_i > 0$ for all $i = 1, \dots, k$ so that all QIs contribute to the combined indicator value. Based on the above, we denote $u_{\vec{w}}(\vec{I}(A))$ as a Pareto-compliant utility indicator (PCUI) if $u$ is order-preserving and $\vec{I}$ is a CIV.

## 5 Experimental Results

Throughout this section, we analyze six newly created PCUIs, emphasizing the distributions of points that they prefer on different Pareto fronts. To define the PCUIs, we focused our attention on two well-known order-preserving utility functions, namely, the weighted sum function (WS) and a slightly modified augmented Tchebycheff function (ATCH) (Pescador-Rojas et al., 2017). Let $\vec{I} \in \mathbb{R}^k$ be an indicator vector and $\vec{w} \in (0,1)^k$ be a convex weight vector. We define WS and ATCH as $WS_{\vec{w}}(\vec{I}) = \sum_{i=1}^{k} w_i I_i$ and $ATCH_{\vec{w}}(\vec{I}) = \max_{i=1,\dots,k} w_i I_i + \alpha \sum_{i=1}^{k} I_i$, respectively, where $\alpha > 0$ is a user-supplied parameter. Regarding ATCH, we modified its original definition by not taking the absolute value of the terms $w_i I_i$, so that the function is order-preserving on the whole of $\mathbb{R}^k$. Since a PCUI requires all its baseline QIs to be minimized, we exclusively consider $-HV$ in their definitions. The proposed PCUIs, defined in the following, correspond to Pareto-compliant versions of the indicators R2, IGD$+$, and $\epsilon+$:

$WS_{\vec{w}}(-HV, R2)$ and $ATCH_{\vec{w}}(-HV, R2)$;

$WS_{\vec{w}}(-HV, IGD+)$ and $ATCH_{\vec{w}}(-HV, IGD+)$;

$WS_{\vec{w}}(-HV, \epsilon+)$ and $ATCH_{\vec{w}}(-HV, \epsilon+)$.
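As a concrete illustration, the two utility functions can be written down directly. The sketch below uses placeholder indicator values (not data from the article) and an assumed $\alpha$; it also checks the order-preservation property that Theorem 1 relies on:

```python
def ws(indicator_values, weights):
    """Weighted sum utility: WS_w(I) = sum_i w_i * I_i (all QIs minimized)."""
    return sum(w * i for w, i in zip(weights, indicator_values))

def atch(indicator_values, weights, alpha=1e-3):
    """Modified augmented Tchebycheff utility:
    ATCH_w(I) = max_i { w_i * I_i } + alpha * sum_i I_i.
    Dropping the absolute value keeps it order-preserving on all of R^k.
    alpha=1e-3 is an assumed setting, not taken from the article."""
    return (max(w * i for w, i in zip(weights, indicator_values))
            + alpha * sum(indicator_values))

# Hypothetical indicator vectors I(A) = (-HV(A), R2(A)) for two sets A and B.
# HV is negated so that every component is to be minimized.
I_A = (-0.8, 0.30)   # placeholder values
I_B = (-0.7, 0.35)   # B is worse in both components
w = (0.0001, 0.9999)  # weight setting used later in Section 5.2

# Order preservation: I_A < I_B componentwise, so both utilities must rank A better.
assert ws(I_A, w) < ws(I_B, w)
assert atch(I_A, w) < atch(I_B, w)
```
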

In the following sections, we are interested in understanding the properties of the adopted PCUIs as performance assessment functions for comparing MOEAs. To this aim, we propose two main experiments. The first experiment compares how the PCUIs and their baseline QIs rank several state-of-the-art MOEAs that produce Pareto front approximations with specific distributions of objective vectors (we employed the Lamé and Mirror superspheres problems proposed by Emmerich and Deutz (2007)). The comparison is based on measuring the correlation of preferences between all the adopted PCUIs and QIs, following the methodology of Liefooghe and Derbel (2016). The second experiment analyzes the approximate optimal $\mu$-distributions related to the PCUIs to reinforce the preference information. To obtain these approximate optimal $\mu$-distributions, we set up a steady-state MOEA, similar to the $S$-Metric Selection Evolutionary Multiobjective Algorithm (SMS-EMOA) (Beume et al., 2007), that uses a PCUI as part of its density estimator. The proposed algorithm, denoted as PCUI-EMOA, is tested on MOPs from the Deb-Thiele-Laumanns-Zitzler (DTLZ) (Deb et al., 2005) and Walking-Fish-Group (WFG) (Huband et al., 2006) test suites, and their minus versions, DTLZ$^{-1}$ and WFG$^{-1}$ (Ishibuchi, Setoguchi, et al., 2017), respectively. It is worth noting that we turned off all the search difficulties of these problems to avoid issues related to PCUI-EMOA's search performance.

### 5.1 Weight Vector Effect

### 5.2 Analysis of Preferences

In this experiment, we are interested in analyzing how similarly the six adopted PCUIs and their baseline QIs rank different Pareto front approximations corresponding to the Lamé and Mirror superspheres problems with unimodal difficulty (Emmerich and Deutz, 2007). The correlation analysis is based on the methodology proposed by Liefooghe and Derbel (2016), whose main focus is the comparison of *rankings* of approximation sets obtained with each QI. The reason to use the Lamé superspheres is their scalability in objective space and the control they provide over the Pareto front shape. Regarding the latter, a parameter $\gamma$ controls the Pareto front geometry. For Lamé problems, $\gamma \in (0,1)$, $\gamma = 1$, and $\gamma > 1$ correspond to convex, linear, and concave Pareto front geometries, respectively. In the case of the Mirror problems, $\gamma \in (0,1)$ corresponds to a concave geometry, while $\gamma > 1$ is related to convex shapes. For both Lamé and Mirror problems, we employed $\gamma = 0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 6.00$ for 2, 3, and 4 objective functions.
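The ranking comparison underlying this methodology can be sketched with a small pure-Python Kendall's $\tau$ (the tau-a variant, without tie correction; the indicator values below are illustrative, not the article's data):

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a between two score lists over the same approximation
    sets: (concordant - discordant) pairs divided by the total number of pairs."""
    assert len(x) == len(y)
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical indicator values (to be minimized) assigned by two QIs
# to five approximation sets.
qi_a = [0.10, 0.20, 0.35, 0.50, 0.90]
qi_b = [0.12, 0.18, 0.40, 0.55, 0.80]  # induces the same ordering
print(kendall_tau(qi_a, qi_b))  # 1.0: identical rankings
```

Two QIs with identical rankings yield $\tau = 1$, fully reversed rankings yield $\tau = -1$, and unrelated rankings yield values near 0, matching the intervals reported in Figures 4 and 5.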

Following the med-Q sampling methodology of Liefooghe and Derbel, where a black-box MOEA (they employed NSGA-II) produces Pareto front approximations, we selected MOEAs that exhibit particular distribution characteristics to obtain a representative sample of the set $\Psi$. For each test instance, 30 independent executions were produced by each MOEA, where all the algorithms shared the same settings. In all cases, the size of the produced approximation sets was 120. To avoid performance issues of the selected MOEAs, all of them used Pareto optimal solutions as their initial population.^{7} Hence, during execution, the MOEAs explore these initial solutions to impose their own preferences. The adopted MOEAs are classified into five classes as follows:

Indicator-based MOEAs: SMS-EMOA (Beume et al., 2007), MOMBI2 (Hernández Gómez and Coello Coello, 2015), IGD$+$-MaOEA (Falcón-Cardona and Coello Coello, 2018), and $\Delta p$-MaOEA.^{8}

Pareto-based MOEAs: NSGA-II (Deb et al., 2002) and SPEA2 (Zitzler et al., 2001).

Reference set-based MOEAs: NSGA-III (Deb and Jain, 2014).

Decomposition-based MOEAs: MOEA/D (Zhang and Li, 2007).

Image analysis-based MOEAs: MOVAP (Hernández Gómez et al., 2016).

Regarding the assessment of the Pareto front approximations generated by the adopted MOEAs, we used the following settings for the QIs and PCUIs. We set $\vec{z}_{\mathrm{ref}} = (1.12, \dots, 1.12)$ to calculate HV for all test instances. A set of convex weight vectors (constructed by the Simplex-Lattice-Design method; Das and Dennis, 1998) was employed as the set $W$ for R2. The reference sets for IGD$+$ and $\epsilon+$ were constructed by collecting all the Pareto front approximations and, then, obtaining a set of 120 nondominated solutions using a subset selection based on the Riesz $s$-energy, as proposed by Falcón-Cardona et al. (2020). For all six adopted PCUIs, the weight vector was set as $\vec{w} = (0.0001, 0.9999)$, where 0.0001 is the weight associated with HV and 0.9999 is related to the weakly Pareto-compliant QI. This setting was employed to mostly preserve the preferences of the weakly Pareto-compliant QIs while producing Pareto-compliant results due to the use of HV as a correction factor.

#### 5.2.1 Correlation between PCUIs and Baseline QIs

Regarding the correlation analysis on Lamé problems in Figure 4, we draw the following conclusions. For problems with $\gamma = 0.75, 1.00, 1.50$, the PCUIs are consistently correlated with their baseline QIs (the correlation is stronger with the weakly Pareto-compliant indicator), regardless of the dimension of the objective space. Regarding $\gamma = 1.00$, Kendall's $\tau$ is in $(0.75, 1.0]$ in almost all comparisons, although for three- and four-objective MOPs, the correlation of all the QIs with HV weakens ($\tau \in (0.5, 0.75]$). This behavior can be explained by studies stating that indicators such as HV, R2, IGD$+$, and $\epsilon+$ have strongly correlated preferences on MOPs with linear Pareto front shapes (Jiang et al., 2014; Liefooghe and Derbel, 2016; Falcón-Cardona and Coello, 2019). For both highly convex and highly concave Pareto fronts, the correlation is strongly positive with the weakly Pareto-compliant QI (i.e., R2, IGD$+$, and $\epsilon+$) and, overall, there is no significant correlation with HV (i.e., $\tau \in [-0.5, 0.5]$). For three- and four-objective MOPs, all the PCUIs consistently present a strong correlation with their baseline weakly Pareto-compliant QIs. In contrast, they do not exhibit a significant correlation with HV in most cases, except for $\gamma = 0.75, 1.00, 1.50$. Overall, the results show a preference bias towards the weakly Pareto-compliant indicators. This is an effect of giving more importance to these indicators, that is, using $w_2 = 0.9999$. Hence, the preferences of the PCUIs are very similar to those of their baseline weakly Pareto-compliant indicators. In other words, the PCUIs exhibit preferences different from those of HV, but they are still Pareto-compliant regardless of the Pareto front shape and the dimensionality of the objective space.

For the Mirror problems in Figure 5, we obtain similar results. However, on these problems, whose Pareto fronts are inverted versions of those of the Lamé problems, the PCUIs tend to be even more correlated with their baseline weakly Pareto-compliant QIs. The only exception is the two-objective Mirror problem with $\gamma = 1.00$, where all the PCUIs and QIs are positively correlated. This result is consistent with that of the two-objective Lamé problem with the same $\gamma$ value, due to the identical Pareto front shape. For all the other test instances, the preferences of the PCUIs differ from those of HV because, in almost all cases, the $\tau$ correlation value lies in the interval $[-0.5, 0.5]$. A clear example of this claim is the four-objective Mirror problem with $\gamma = 6.00$, where no PCUI is correlated with HV, while all of them present a Kendall's $\tau \in (0.75, 1.0]$ with respect to their baseline weakly Pareto-compliant QIs.

In summary, there are two important points to emphasize. First, a setting of $\vec{w}$ that gives more importance to the weakly Pareto-compliant QI in the combination promotes the inheritance of its preferences by the PCUI, leaving HV just as a correction factor that makes the PCUI Pareto-compliant. This is a remarkable result since we can create new Pareto-compliant QIs whose preferences are mostly different from those of HV. Hence, we can increase the number of Pareto-compliant indicators that researchers can employ in different comparison situations (e.g., when comparing MOEAs that can produce evenly distributed solutions). The second point relates to the design of new selection mechanisms where the distribution of solutions can be adjusted depending on the QIs being combined. In other words, PCUIs could be employed to manipulate the distribution properties of MOEAs while maintaining the Pareto compliance property.

#### 5.2.2 Correlation between PCUIs

We analyzed the correlation between the preferences of all PCUIs to ensure that the combination does produce different indicators. Concerning both the Lamé and Mirror problems, the correlation analysis indicates that PCUIs based on the same weakly Pareto-compliant QI are strongly correlated with each other. In consequence, using WS or ATCH basically produces the same PCUI, although the two functions have different landscapes. In the next section, we show that, in some cases, the preferences of the indicators are in conflict, which promotes the formation of a Pareto front in the Quality Space. Hence, a PCUI could exploit this trade-off to show preferences similar to those of both of its baseline QIs or biased towards one of them.

Another remarkable conclusion is that the preferences of PCUIs based on different weakly Pareto-compliant QIs are, in general, independent. Hence, each class of PCUIs presents distinct preferences. This is explained by the correlations between R2-IGD$+$, R2-$\epsilon+$, and IGD$+$-$\epsilon+$, which are mostly independent, as shown in Figures 4 and 5. Additionally, due to the use of $\vec{w} = (0.0001, 0.9999)$, each PCUI inherits the preferences of its weakly Pareto-compliant QI. Hence, a PCUI behaves similarly to its weakly Pareto-compliant QI while maintaining the Pareto compliance property.

#### 5.2.3 Pareto Fronts in Quality Space

### 5.3 Analysis of Optimal $\mu $-Distributions

In this section, we investigate the optimal $\mu$-distributions of the selected PCUIs. To this aim, we considered the framework of SMS-EMOA, which uses a density estimator (DE) based on HV; in our case, a PCUI is employed in the DE instead. Algorithm 1 presents the general framework of our proposed PCUI-EMOA, whose main loop spans lines 2 to 12. At each generation, a new solution is created using genetic operators and then added to the population $P$ to create the temporary population $Q$. Then, in line 5, a set of ranks $R_1, \dots, R_t$ is created using the nondominated sorting algorithm (Deb et al., 2002), where $R_t$ contains the worst solutions according to the Pareto dominance relation. If $R_t$ has more than one solution, the individual contributions to the PCUI are computed to delete the worst-contributing solution in line 11. Finally, the Pareto front approximation is returned when the stopping criterion is fulfilled.
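The environmental-selection step just described can be sketched as follows. This is an illustrative reconstruction of Algorithm 1's survival step, not the article's implementation; `pcui` stands for any combined indicator applied to a set of objective vectors (smaller is better), and points are treated directly as objective vectors for minimization:

```python
def dominates(u, v):
    """Pareto dominance for minimization: u dominates v."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def last_front(points):
    """Peel nondominated fronts iteratively and return the last one (R_t)."""
    remaining = list(points)
    front = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(other, p) for other in remaining if other is not p)]
        remaining = [p for p in remaining if p not in front]
    return front

def steady_state_survival(population, offspring, pcui):
    """One survival step of a PCUI-EMOA-like loop: add the offspring, find the
    worst rank R_t, and delete the worst-contributing solution, i.e., the one
    whose removal leaves the best (smallest) PCUI value for the rest of R_t."""
    q = population + [offspring]
    rt = last_front(q)
    if len(rt) == 1:
        worst = rt[0]
    else:
        worst = min(rt, key=lambda p: pcui([s for s in rt if s is not p]))
    q.remove(worst)
    return q

# A dominated offspring is discarded, regardless of the (dummy) indicator used.
pop = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
survivors = steady_state_survival(pop, (3.0, 3.0), pcui=lambda pts: sum(a + b for a, b in pts))
assert (3.0, 3.0) not in survivors and len(survivors) == 3
```
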

We focused our attention on studying the approximate optimal $\mu$-distributions produced by PCUI-EMOA in comparison with those of four steady-state MOEAs based on the indicators HV, R2, IGD$+$, and $\epsilon+$, that is, SMS-EMOA, R2-EMOA, IGD$+$-MaOEA, and $\epsilon+$-MaOEA. The latter is similar to IGD$+$-MaOEA. Regarding PCUI-EMOA, we employed the six PCUIs of the previous section. Since all the adopted indicator-based MOEAs (IB-MOEAs) share the same structure, the parameter settings are the following. For all numbers of objective functions, the population size is 120. All MOEAs use simulated binary crossover and polynomial-based mutation as their genetic operators (Deb et al., 2002), where, in all cases, the crossover probability is set to 0.9, the mutation probability is $1/n$ ($n$ is the number of decision variables), and both the crossover and mutation distribution indices are set to 20. PCUI-EMOA employs the combination vector $\vec{w} = (0.5, 0.5)$ to look for the knee point of the Pareto front in quality space, that is, to generate distributions similar to both baseline QIs. We tested the adopted MOEAs on 14 MOPs from the DTLZ, WFG, DTLZ$^{-1}$, and WFG$^{-1}$ benchmarks for 2, 3, and 4 objective functions. We employed the problems DTLZ1, DTLZ2, DTLZ5, DTLZ7, WFG1, WFG2, WFG3, and their minus versions. It is worth noting that we turned off all the difficulties of these MOPs to avoid biasing the results, so that we can observe the approximate optimal $\mu$-distributions of the selected MOEAs. We adopted these MOPs since they possess Pareto fronts with different geometries, namely, linear, concave, convex, degenerate, mixed, disconnected, correlated with the simplex shape, and not correlated with it (Ishibuchi, Setoguchi, et al., 2017).

The study proposed here is focused on determining the similarity between the approximate optimal $\mu$-distributions produced by the six PCUI-EMOAs and those of the selected IB-MOEAs. For each test instance, the MOEAs were executed $N=30$ independent times. Thus, each one produced $N$ approximation sets for each MOP. We investigate the similarity between two sets of approximation sets produced by two MOEAs, using a similarity measure based on the Hausdorff distance, which we propose in the following.

$S(A,B)$ calculates the similarity of $A$ to (most) elements in $B$. It is worth noting that $S$ is a similarity measure, not a metric (distance function), since it is not symmetric. It attains small values if every set in $A$ is similar to at least half of the sets in $B$. It can, however, be the case that $B$ contains a number of sets that by no means resemble sets in $A$; in that sense, it is not symmetric. To make it symmetric, one would have to compute $S(B,A)$ as well and take the maximum of the two values, as is done in the Hausdorff--Pompeiu distance (at the level of point sets). In consequence, we propose the two-sided Hausdorff-based similarity measure, which is symmetric.
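Since the exact formula for $S$ is not reproduced here, the sketch below shows only its underlying building block: the classical Hausdorff--Pompeiu distance between two finite point sets, symmetrized exactly as described above (the maximum of the two directed distances):

```python
def euclid(a, b):
    """Euclidean distance between two objective vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hausdorff(A, B):
    """Classical Hausdorff--Pompeiu distance between finite point sets:
    the larger of the two directed distances max_a min_b d(a, b)."""
    def directed(X, Y):
        return max(min(euclid(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

# The directed distances generally differ, which is why symmetrization is needed.
print(hausdorff([(0.0, 0.0)], [(3.0, 4.0)]))  # 5.0
```

The article's $S$ operates one level up, on *collections* of approximation sets, and is symmetrized the same way; the directed/maximized structure shown here is the common core.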

In the case that we are given three sets of approximation sets $A,B,$ and $C$ and we would like to know if $A$ is similar to $B$, to $C$, to both, or to none of them, a classification function is required. Such classifier is given as follows.

Based on the classification function, we analyzed the similarities between the approximation sets produced by the PCUI-EMOAs and their corresponding IB-MOEAs that use the baseline indicators of each PCUI. Table 2 shows the results for all the considered test instances using $\epsilon = m/10$. Since all PCUI-EMOAs use $\vec{w} = (0.5, 0.5)$ as the combination weight vector for the order-preserving utility functions, our hypothesis is that the Pareto front approximations should be similar to those of both IB-MOEAs that employ the baseline indicators. This hypothesis holds in several cases related to the DTLZ and DTLZ$^{-1}$ problems. Nevertheless, for most of the WFG and WFG$^{-1}$ problems, PCUI-EMOA tends to produce approximation sets with particular distributions that are not similar to those of the baseline IB-MOEAs. This fact could be explained by the independence of preferences between HV and the weakly Pareto-compliant indicators on these MOPs. Considering the linear problems DTLZ1 and DTLZ1$^{-1}$, it is clear that in most cases the PCUI-EMOAs produce approximation sets similar to those of the IB-MOEAs using their baseline indicators. This result is explained by the correlation analysis of Section 5.2, where in almost all cases HV, R2, IGD$+$, and $\epsilon+$ are strongly correlated. The most important observation is related to the PCUI-EMOAs based on $WS_{\vec{w}}(-HV, R2)$ and $ATCH_{\vec{w}}(-HV, R2)$. On the one hand, SMS-EMOA produces uniformly distributed solutions on convex and linear Pareto fronts, with a bias towards the knee and boundaries of concave Pareto fronts. Additionally, SMS-EMOA presents good results on degenerate problems such as DTLZ5 and WFG3. On the other hand, R2-EMOA (Brockhoff et al., 2015) does not produce uniformly distributed solutions on convex Pareto fronts, but it does on linear and concave ones. Regarding degenerate MOPs, R2-EMOA does not produce good results since its weight vectors do not completely intersect the Pareto front.
Hence, SMS-EMOA and R2-EMOA have specific strengths and weaknesses depending on the MOP being tackled. We refer to *strengths* and *weaknesses* of an IB-MOEA (and, more specifically, to the underlying QI) as its capacity to generate (or failure to generate) Pareto front approximations with good diversity. For instance, SMS-EMOA does not produce a uniform distribution in DTLZ2 (which has a concave Pareto front) because of the inner preferences of the HV while R2-EMOA can produce uniformly distributed solutions in this problem.

| MOP | Dim. | $WS_{\vec{w}}(-HV,R2)$ | $ATCH_{\vec{w}}(-HV,R2)$ | $WS_{\vec{w}}(-HV,IGD+)$ | $ATCH_{\vec{w}}(-HV,IGD+)$ | $WS_{\vec{w}}(-HV,\epsilon+)$ | $ATCH_{\vec{w}}(-HV,\epsilon+)$ |
|---|---|---|---|---|---|---|---|
| DTLZ1 | 2 | Both | Both | Both | Both | Both | Both |
| DTLZ1 | 3 | Both | Both | Both | Both | Both | Both |
| DTLZ1 | 4 | Both | Both | Both | Both | Both | Both |
| DTLZ1$^{-1}$ | 2 | Both | Both | Both | Both | Both | Both |
| DTLZ1$^{-1}$ | 3 | Both | Both | Both | Both | Both | Both |
| DTLZ1$^{-1}$ | 4 | HV | HV | Both | Both | HV | None |
| DTLZ2 | 2 | Both | Both | Both | Both | Both | Both |
| DTLZ2 | 3 | R2 | R2 | Both | Both | Both | Both |
| DTLZ2 | 4 | R2 | R2 | None | None | None | None |
| DTLZ2$^{-1}$ | 2 | HV | HV | Both | Both | Both | Both |
| DTLZ2$^{-1}$ | 3 | HV | HV | None | None | None | None |
| DTLZ2$^{-1}$ | 4 | None | None | None | None | None | None |
| DTLZ5 | 2 | Both | Both | Both | Both | Both | Both |
| DTLZ5 | 3 | Both | Both | Both | Both | Both | Both |
| DTLZ5 | 4 | HV | HV | IGD$+$ | IGD$+$ | $\epsilon+$ | $\epsilon+$ |
| DTLZ5$^{-1}$ | 2 | HV | HV | Both | Both | Both | Both |
| DTLZ5$^{-1}$ | 3 | HV | HV | IGD$+$ | None | None | None |
| DTLZ5$^{-1}$ | 4 | None | None | None | None | None | None |
| DTLZ7 | 2 | Both | Both | Both | Both | Both | Both |
| DTLZ7 | 3 | None | None | IGD$+$ | None | None | None |
| DTLZ7 | 4 | None | None | None | None | None | None |
| DTLZ7$^{-1}$ | 2 | Both | Both | Both | Both | Both | Both |
| DTLZ7$^{-1}$ | 3 | R2 | R2 | Both | Both | Both | Both |
| DTLZ7$^{-1}$ | 4 | None | None | None | $\epsilon+$ | $\epsilon+$ | None |
| WFG1 | 2 | None | None | None | None | None | None |
| WFG1 | 3 | None | None | None | None | None | None |
| WFG1 | 4 | None | None | IGD$+$ | IGD$+$ | $\epsilon+$ | None |
| WFG1$^{-1}$ | 2 | R2 | R2 | None | None | None | None |
| WFG1$^{-1}$ | 3 | None | None | None | None | None | None |
| WFG1$^{-1}$ | 4 | None | None | None | None | None | None |
| WFG2 | 2 | None | None | None | None | None | None |
| WFG2 | 3 | None | None | None | None | None | None |
| WFG2 | 4 | None | None | None | None | None | None |


| MOP | Dim. | $WS_{\vec{w}}(-HV,R2)$ | $ATCH_{\vec{w}}(-HV,R2)$ | $WS_{\vec{w}}(-HV,IGD+)$ | $ATCH_{\vec{w}}(-HV,IGD+)$ | $WS_{\vec{w}}(-HV,\epsilon+)$ | $ATCH_{\vec{w}}(-HV,\epsilon+)$ |
|---|---|---|---|---|---|---|---|
| WFG2$^{-1}$ | 2 | None | None | None | None | None | None |
| WFG2$^{-1}$ | 3 | R2 | R2 | None | None | None | None |
| WFG2$^{-1}$ | 4 | None | None | None | None | None | None |
| WFG3 | 2 | None | None | None | None | None | None |
| WFG3 | 3 | Both | Both | Both | Both | Both | Both |
| WFG3 | 4 | None | None | IGD$+$ | IGD$+$ | $\epsilon+$ | $\epsilon+$ |
| WFG3$^{-1}$ | 2 | R2 | R2 | None | None | None | None |
| WFG3$^{-1}$ | 3 | Both | Both | Both | Both | Both | Both |
| WFG3$^{-1}$ | 4 | HV | HV | Both | Both | HV | HV |


### 5.4 Computational Complexity

Theorem 1 proposes the combination of as many weakly Pareto-compliant QIs as required with at least one Pareto-compliant indicator. Currently, the only Pareto-compliant QI available is the hypervolume indicator, whose computational cost increases super-polynomially with the number of objective functions (Bringmann and Friedrich, 2009, 2010). In terms of computational cost, it is clear that the runtime complexity of HV dominates those of R2, IGD$+$, and $\epsilon+$, which implies that a PCUI and, in general, all the QIs constructed by Theorem 1 inherit the cost of HV. Hence, one may argue that this is a clear disadvantage of the proposed method. However, we need to emphasize that even though we showed that a PCUI can be utilized to guide the evolutionary process of a MOEA (in fact, we used PCUI-EMOA to study the approximate optimal $\mu$-distributions of PCUIs), the primary focus of PCUIs is to increase the number of available Pareto-compliant indicators for performance assessment. In consequence, when using a PCUI to compare MOEAs, one may use fast HV calculation methods such as the Walking-Fish-Group (WFG) algorithm (While et al., 2012), which significantly reduces the computational time needed to obtain exact HV values. Although following Theorem 1 allows the construction of new Pareto-compliant QIs, this does not overcome the runtime complexity of the HV; on the other hand, it does not theoretically increase the overall computational cost either.^{9} Additionally, these combined indicators provide preferences different from those of the HV, controlled by the user through the weight vector $\vec{w}$. Hence, we argue that we obtain more advantages than drawbacks by using our proposed framework.

At this point, it is worth comparing the PCUIs and the multi-indicator-based preference relations proposed by Zitzler et al. (2010) in terms of computational cost. Due to the sequential application of the indicator-based preference relations ($\preceq_{I_j}$), a multi-indicator-based preference relation is likely to avoid the computational cost of calculating HV whenever a cheaper QI can resolve the comparison. In contrast, PCUIs always require the calculation of HV, in exchange for extra flexibility in controlling how the preferences are combined. Moreover, from the discussion throughout this article, it is possible to identify two distinct application paths. On the one hand, PCUIs are useful for the performance assessment of MOEAs, offering the Pareto compliance property together with preferences different from those of HV. On the other hand, the multi-indicator-based preference relations are a promising direction for constructing set-based MOEAs, as in the case of SPAM (Zitzler et al., 2010). In consequence, each approach is specialized in a different area of evolutionary multiobjective optimization. Nevertheless, it is worth noting that PCUIs and, in general, our proposed framework for the combination of multiple QIs represent a generalization of the multi-indicator-based preference relations proposed by Zitzler et al. (2010).

## 6 Conclusions and Future Work

In this article, we proposed to construct new Pareto-compliant indicators by combining existing QIs under specific conditions. In order to ensure the Pareto compliance property, it is mandatory to combine at least one Pareto-compliant indicator (such as the hypervolume indicator) with as many weakly Pareto-compliant QIs as required, using an order-preserving function. Following this framework, we proposed six new Pareto-compliant QIs based on the combination of HV with R2, IGD$+$, and $\epsilon+$, using the weighted sum and augmented Tchebycheff utility functions as order-preserving combination functions. We studied the properties of these six PCUIs from the perspectives of preferences and approximate optimal $\mu$-distributions. Our experimental results showed that the proposed PCUIs do exhibit preferences different from those of the HV, which implies that they can be used as an alternative to the HV to assess the performance of MOEAs while ensuring the Pareto compliance property. These results were supported by an analysis of approximate optimal $\mu$-distributions. Furthermore, when using a steady-state MOEA based on a PCUI, we observed that the non-uniform distributions produced by one of the baseline QIs of the PCUI could be compensated by the uniform-distribution strengths of another baseline QI. It is worth emphasizing that the main focus of PCUIs is the performance assessment of MOEAs. In this direction, we argued that even though a PCUI requires the calculation of the HV, the main advantage of PCUIs lies in their different preferences with respect to those of the HV. Additionally, PCUIs have the same theoretical runtime complexity as the HV while allowing one to draw conclusions supported by the Pareto compliance property. As part of our future work, we aim to provide a mechanism that adapts the combination vector depending on the geometrical features of the MOP.
Moreover, we aim to theoretically analyze the properties of PCUIs in order to define in which cases they can be used to provide better information. Finally, an analysis considering MOPs with more than four objective functions is also a task that we would like to undertake as part of our future work.

## Acknowledgments

The first author acknowledges support from CINVESTAV-IPN to pursue graduate studies in computer science and the 2018 IEEE CIS Graduate Student Research Grant. The third author gratefully acknowledges support from CONACyT grant no. 2016-01-1920 (*Investigación en Fronteras de la Ciencia 2016*) and from a SEP-Cinvestav grant (proposal no. 4). He was also partially supported by the Basque Government through the BERC 2022--2025 program and by Spanish Ministry of Economy and Competitiveness MINECO: BCAM Severo Ochoa excellence accreditation SEV-2017-0718.

## Notes

^{1}

A Pareto front approximation is a finite set of objective vectors that aims to represent a Pareto front.

^{2}

From this point onwards, convergence QIs will be denoted just as QIs.

^{3}

The Simplex-Lattice-Design method is usually employed to construct the set of weight vectors (Das and Dennis, 1998). Using this method, the number of weight vectors is the combinatorial number $N = \binom{H+m-1}{m-1}$, where $H \in \mathbb{N}$ is a user-supplied parameter that determines the number of divisions of the space, and $m$ is the number of objectives.

^{4}

Given an approximation set $A$ and a reference set $Z$, the IGD indicator is defined as follows: $IGD(A,Z) = \left( \frac{1}{|Z|} \sum_{\vec{z} \in Z} \min_{\vec{a} \in A} d(\vec{a}, \vec{z})^p \right)^{1/p}$, where $d$ is the Euclidean distance and $p > 0$ is a user-supplied parameter, usually set to 2.

^{6}

As far as the ordering of approximation sets is concerned, the indicator-based preference relations are not anti-symmetric, since different approximation sets may have equal indicator values.

^{7}

To generate the initial populations, we executed the algorithms several times to gather the best-found solutions into an archive. Then, each time an algorithm runs, it randomly selects a subset of the archive as its initial population.

^{8}

We proposed this algorithm based on the framework of IGD$+$-MaOEA (Falcón-Cardona and Coello Coello, 2018) but using the $\Delta p$ indicator as the density estimator.

^{9}

Even in practice, calculating as many weakly Pareto-compliant QIs as required does not add considerable overhead when computing a PCUI value.

## References

*Proceedings of the Tenth ACM SIGEVO Workshop on Foundations of Genetic Algorithms*

*European Journal of Operational Research*

*Proceedings of the 9th International Conference on Evolutionary Multi-Criterion Optimization*

*Fifth International Conference on Evolutionary Multi-Criterion Optimization*

*Computational Geometry---Theory and Applications*

*Proceedings of the 21st International Joint Conference on Artificial Intelligence*

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)*

*Evolutionary Computation*

*Genetic Programming and Evolvable Machines*

*Evolutionary algorithms for solving multi-objective problems*

*SIAM Journal on Optimization*

*IEEE Transactions on Evolutionary Computation*

*IEEE Transactions on Evolutionary Computation*

*Evolutionary multiobjective optimization. Theoretical advances and applications*

*Fourth International Conference on Evolutionary Multi-Criterion Optimization*

*Procedia Technology*

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)*

*2018 Parallel Problem Solving from Nature*

*ACM Computing Surveys*

*Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO)*

*IEEE Congress on Evolutionary Computation*

*Proceedings of the 2011 ACM/SIGEVO Foundations of Genetic Algorithms*

*European Journal of Operational Research*

*Evaluating the quality of approximations to the non-dominated set*

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)*

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)*

*Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence*

*IEEE Transactions on Evolutionary Computation*

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)*

*2014 IEEE Symposium on Computational Intelligence in Multi-Criteria Decision-Making*

*Proceedings of the 8th International Conference on Evolutionary Multi-Criterion Optimization*

*IEEE Transactions on Evolutionary Computation*

*IEEE Transactions on Cybernetics*

*Applied Soft Computing*

*ACM Computing Surveys*

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)*

*Nonlinear multiobjective optimization*

*Proceedings of the 9th International Conference on Evolutionary Multi-Criterion Optimization*

*IEEE Transactions on Evolutionary Computation*

*2020 IEEE Symposium Series on Computational Intelligence*

*IEEE Transactions on Evolutionary Computation*

*IEEE Transactions on Evolutionary Computation*

*IEEE Transactions on Evolutionary Computation*

*IEEE Transactions on Evolutionary Computation*

*Proceedings of the 4th International Conference on Evolutionary Multi-Criterion Optimization*

*Evolutionary Computation*

*Multiobjective optimization. Interactive and evolutionary approaches*

*EUROGEN 2001. Evolutionary Methods for Design, Optimization and Control with Applications to Industrial Problems*

*Parallel Problem Solving from Nature*

*IEEE Transactions on Evolutionary Computation*

*IEEE Transactions on Evolutionary Computation*