## Abstract

We present an empirical study of a range of evolutionary algorithms applied to various noisy combinatorial optimisation problems. There are three sets of experiments. The first looks at several toy problems, such as OneMax and other linear problems. We find that UMDA and the Paired-Crossover Evolutionary Algorithm (PCEA) are the only ones able to cope robustly with noise within a reasonable fixed time budget. In the second stage, UMDA and PCEA are tested on more complex noisy problems: SubsetSum, Knapsack, and SetCover. Both perform well under increasing levels of noise, with UMDA being the better of the two. In the third stage, we consider two noisy multiobjective problems (CountingOnesCountingZeroes and a multiobjective formulation of SetCover). We compare several adaptations of UMDA for multiobjective problems with the Simple Evolutionary Multiobjective Optimiser (SEMO) and NSGA-II. We conclude that UMDA, and its variants, can be highly effective on a variety of noisy combinatorial optimisation problems, outperforming many other evolutionary algorithms.

## 1 Introduction

Realistic optimisation problems are often affected by noisy fitness measurements. Recent theoretical analyses of evolutionary algorithms (EAs) on noisy problems defined in discrete spaces mostly focus on simple, classical benchmark problems such as OneMax. Harder and more realistic combinatorial problems such as Knapsack or SetCover, in the presence of noisy fitness evaluations, are not easily amenable to theoretical analysis. This article presents a thorough empirical analysis of an array of EAs on several simple and harder noisy combinatorial problems. The study attempts to identify which EAs to choose when solving realistic noisy combinatorial problems.

Noise may affect fitness evaluation in a number of ways. It may be considered as *prior*, in which the search point is randomly tampered with, and fitness evaluation is performed on the noisy search point. Alternatively, *posterior* noise (which will be the focus of our study) is where a random value gets added to the fitness of a search point during the optimisation process. The same search string would have a different fitness value each time it is being evaluated. With more complex combinatorial problems (e.g., ones with constraints) the noise may enter in different ways (e.g., in the evaluation of those constraints).

An early theoretical result by Droste (2004) examined the performance of the hill-climber $(1+1)$-EA on OneMax with prior noise. This was generalised to the $(\mu+\lambda)$-EA by Gießen and Kötzing (2016), showing that populations can help in both prior and posterior noise. They show the $(1+1)$-EA, however, can tolerate posterior Gaussian noise only when the variance is very small (less than $1/(4\log n)$). It has been recognised for a long time that the population size can affect the ability of an EA to handle noise (Goldberg et al., 1991; Rattray and Shapiro, 1997). A more recent theoretical study by Dang and Lehre (2015) shows that a low mutation rate enables a particular mutation-population algorithm to handle arbitrary posterior noise for the OneMax problem in polynomial time, although the bounds given are large. Similarly, the compact genetic algorithm (cGA) is shown to handle noise with (large) polynomial runtime (Friedrich et al., 2015). A better asymptotic runtime for OneMax with posterior Gaussian noise is proved for the Paired-Crossover Evolutionary Algorithm (PCEA), which uses only crossover and no mutation (Prügel-Bennett et al., 2015).

Of course, it is possible to handle noise simply by resampling the fitness of a potential solution many times, and taking the average as an estimate of the true fitness. Suppose the noisy problem is defined by taking a deterministic fitness function and adding Gaussian noise with mean 0 and variance $\sigma^2$. There is a general result (Akimoto et al., 2015) stating that if the runtime of a black-box algorithm on a problem with no noise is $T$, then $\sigma^2\log T$ samples are required at each step, leading to a runtime of $\sigma^2 T\log T$. In the case of OneMax, the most efficient algorithm (Anil and Wiegand, 2009) has a runtime of $\Theta(n/\log n)$. Using this algorithm with resampling gives a runtime for noisy OneMax of $\Theta(\sigma^2 n)$. By contrast, the PCEA algorithm (Prügel-Bennett et al., 2015), when $\sigma^2=n$, has a runtime of $O(n(\log n)^2)$, which is already faster than the resampling algorithm. It has been suggested by Doerr and Sutton (2019) that using the median rather than the mean provides a better estimate when resampling, but this is significant only when the variance is small (less than a constant).
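The effect of resampling can be illustrated with a minimal sketch. It assumes additive Gaussian posterior noise as described above; the function names and the numbers used (true fitness 50, $\sigma=10$, 100 samples) are hypothetical and chosen only for illustration.

```python
import random
import statistics

def noisy_measurement(true_fitness, sigma, rng):
    """One posterior-noise measurement: the true fitness plus N(0, sigma^2) noise."""
    return true_fitness + rng.gauss(0.0, sigma)

def resampled_estimate(true_fitness, sigma, k, rng):
    """Mean of k independent noisy measurements of the same search point."""
    return statistics.fmean(noisy_measurement(true_fitness, sigma, rng) for _ in range(k))

# Illustrative values only: averaging k = 100 samples should shrink the
# standard deviation of the estimate from sigma = 10 to about 10 / sqrt(100) = 1.
rng = random.Random(0)
estimates = [resampled_estimate(50.0, 10.0, 100, rng) for _ in range(2000)]
spread = statistics.pstdev(estimates)
```

The price of the reduced spread is that every step of the underlying algorithm now costs `k` fitness evaluations, which is what drives the $\sigma^2 T\log T$ runtime quoted above.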

A recent study by Rowe and Aishwaryaprajna (2019) of a new voting algorithm on OneMax shows a runtime of $O(n\log n)$ when the variance of the noise distribution is $\sigma^2=O(n)$, and of $O(\sigma^2\log n)$ when the noise variance is greater than this. This upper bound is the best proven runtime that we are aware of to date. Some empirical results show that the use of voting in population-based algorithms (UMDA, PCEA, and cGA) is effective for large population sizes. The voting algorithm is shown to solve the Jump function, which has a fitness gap next to the global optimum, in $O(n\log n)$ fitness evaluations with high probability for any gap size $m<(1-\alpha)n$, where the constant $\alpha\in[\frac{1}{2},1]$. The performance of the voting algorithm and the cGA is also analysed for other variants of the Jump function in the recent paper by Witt (2021). However, in our work, problems with fitness gaps have not been considered.

In this article, we are interested in whether any of the algorithms with polynomial theoretical runtimes for noisy OneMax would be capable of solving combinatorial problems with added noise in practice, when given a reasonable but fixed time budget.^{1} We proceed in three stages. First, we will experimentally compare a collection of algorithms on noisy OneMax and noisy Linear problems, to see which can find solutions within a reasonable amount of time (to be defined below), bearing in mind that the asymptotic bounds for some of these algorithms, while polynomial, are actually very large. Second, we will take those algorithms which pass this first test, and see how well they handle noise in three combinatorial problems: SubsetSum, Knapsack, and SetCover. We choose these, as they have a “packing” structure which might make them amenable to algorithms which can solve noisy OneMax efficiently. We generate random problem instances within the “easy” regime (so that the algorithms can be expected to solve them when there is no noise) and then empirically study how they degrade with added Gaussian noise.

There has been considerable research in recent decades on stochastic combinatorial optimisation problems, such as the stochastic knapsack problem (refer to Steinberg and Parks, 1979; Sniedovich, 1980; Ross and Tsang, 1989; Henig, 1990; Carraway et al., 1993; Fortz et al., 2013; and Bianchi et al., 2009 for a survey). In some stochastic knapsack formulations the profits are treated as random variables (Steinberg and Parks, 1979; Sniedovich, 1980; Carraway et al., 1993; Henig, 1990), and in others the distribution of the weights is known (Fortz et al., 2013). Approaches to handling stochastic problems therefore rely on access to more details of the problem, for example, a known distribution of the decision variables. In this article, however, we assume that the noise contribution to the fitness value is part of the black box, and so we do not have access to details such as the noise distribution, or the exact distribution of the decision variables.

In the last stage of our analysis, we look at noisy multiobjective problems. Initially, we analyse the performance of a collection of multiobjective algorithms on a toy multiobjective problem, COCZ, both without noise and with high levels of noise, and attempt to identify which algorithms perform better. We study the simple hill-climber algorithm SEMO, the popular NSGA-II, and some other algorithms designed on the basis of our previous experimental results. We compare our algorithms using the hypervolume performance indicator, which measures the spread of the nondominated solutions found within a reasonable time budget. We then formulate the noisy constrained SetCover problem as a multiobjective problem and empirically analyse the performance of the better algorithms on it.

It should be noted that in our empirical results, while error bars are not always shown, the Mann–Whitney test was used on all relevant comparisons, and results are significant at the 95% level unless explicitly indicated.

**Notation:** We use the convention that $[expr]$ equals 1 if $expr$ is true, and 0 otherwise.

## 2 Problem Definitions—Noisy Single Objective Problems

The problems studied in this article are defined on a Boolean search space of bit strings of length $n$. Let $N(0,\sigma )$ denote a random number drawn from a normal distribution with mean zero, and standard deviation $\sigma $, which will be freshly generated at each fitness function evaluation.

### 2.1 Unconstrained Single-Objective Noisy Problems

In generating random problem instances, we draw the weights uniformly at random from the range $1,\ldots,100$. Thus we avoid more extreme instances such as BinVal (in which $w_i=2^{i-1}$ for each $i=1,\ldots,n$). The reason for this is that when the distribution of weights is highly skewed, the addition of noise is irrelevant for those bits with very high weights, yet completely overwhelms bits with weights lower than the typical noise level. Thus most algorithms will find the more significant bits, and fail on the remainder.
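As a minimal sketch of the noise model, the following wraps a deterministic fitness function with freshly drawn Gaussian posterior noise at each evaluation. It assumes the standard definitions of OneMax (the number of ones) and of a weighted linear function (the weighted sum of the bits); the helper names are hypothetical.

```python
import random

def onemax(x):
    """Standard OneMax: the number of ones in the bit string."""
    return sum(x)

def weighted_linear(x, w):
    """Weighted sum of the bits, with one positive weight per position."""
    return sum(wi * xi for wi, xi in zip(w, x))

def with_posterior_noise(f, sigma, rng):
    """Return a noisy version of f that adds fresh N(0, sigma) noise on every call."""
    def noisy_f(x):
        return f(x) + rng.gauss(0.0, sigma)
    return noisy_f

rng = random.Random(0)
n = 10
weights = [rng.randint(1, 100) for _ in range(n)]  # uniform on 1,...,100
noisy_onemax = with_posterior_noise(onemax, 3.0, rng)
noisy_linear = with_posterior_noise(lambda x: weighted_linear(x, weights), 3.0, rng)
```

Evaluating `noisy_onemax` twice on the same string will in general return two different values, which is the defining property of posterior noise.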

SubsetSum can be seen as a generalisation of the WeightedLinear problem (in which the target is $\theta=0$). In our experiments, we generate instances by choosing weights uniformly at random from $1,\ldots,100$. We take the target to be two-thirds of the sum of the weights (we have run experiments for other choices of $\theta$ and found that they do not significantly affect the empirical observations).

### 2.2 Constrained Single-Objective Noisy Problems

The SetCover problem can be defined as a constrained single-objective problem, or as a single-objective problem with a penalty term. It can also be defined as a multiobjective problem (discussed later).

## 3 Algorithms Chosen for Noisy Single-Objective Optimisation

### 3.1 The ($1+1$)-EA

The $(1+1)$-EA uses a bitwise mutation operator that produces an offspring by flipping each bit of the parent string independently with probability $1/n$. This can be considered as a randomised or stochastic hill-climber which considers only one point in the search space at a time and proceeds by trying to find a point which has a superior function value. In each iteration, only one function evaluation takes place. The expected runtime of the $(1+1)$-EA solving the non-noisy OneMax is $O(n\log n)$. The runtime remains polynomial in the posterior Gaussian noise case for $\sigma^2<1/(4\log n)$, so we do not expect this algorithm to cope with anything but the smallest noise levels (Gießen and Kötzing, 2016).
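The description above can be sketched as follows. This is an illustrative reading rather than the exact experimental code: it assumes that the parent's stored (noisy) fitness value is reused, so that each iteration costs one evaluation, and that ties are accepted. The function name and signature are hypothetical.

```python
import random

def one_plus_one_ea(noisy_f, n, budget, rng):
    """(1+1)-EA with bitwise mutation rate 1/n, maximising noisy_f within a fixed budget."""
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = noisy_f(x)  # the parent's value is measured once and then reused
    for _ in range(budget - 1):
        # Flip each bit independently with probability 1/n.
        y = [b ^ (rng.random() < 1.0 / n) for b in x]
        fy = noisy_f(y)  # the single evaluation of this iteration
        if fy >= fx:  # assumption: the offspring replaces the parent on ties
            x, fx = y, fy
    return x
```

Under noise, a single lucky measurement can leave the stored parent value far above its true fitness, after which genuine improvements are rejected; this is one intuition for why the algorithm tolerates only very small variances.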

### 3.2 Mutation-Population Algorithm

It has long been recognised that populations can help an EA handle noise. The paper by Goldberg et al. (1991) developed a population sizing equation and instigated the adoption of variance-based population sizing. Rattray and Shapiro (1997) showed that, in the weak selection limit, the effects of Gaussian noise could be overcome by an appropriate increase of the population size. More recently, a population-based, non-elitist EA was analysed by Dang and Lehre to study how it optimises the noisy OneMax problem with uniform, Gaussian and exponential posterior noise distributions (Dang and Lehre, 2015, 2016). They used a recently developed fitness-level theorem for non-elitist populations to estimate the expected running time for these problems in a noisy environment. In the case of additive Gaussian noise $N(0,\sigma^2)$ with mutation rate $\chi/n=a/(3\sigma n)$ and population size $\lambda=b\sigma^2\ln n$ (where $a$ and $b$ are constants), the considered algorithm optimises the OneMax problem in expected time $O(\sigma^7 n\ln(n)\ln(\ln(n)))$. Similar results were shown for uniform and exponential noise distributions. Note that this is potentially very large when the noise is large (in excess of $n^{4.5}$ when $\sigma=\sqrt{n}$), although of course this is an upper bound, and we do not know the constants.

### 3.3 Compact Genetic Algorithm (cGA)

The compact GA (cGA) is an EDA, introduced by Harik et al. (1999). cGA is able to average out the noise and optimise the noisy OneMax problem in expected polynomial time, when the noise variance $\sigma^2$ is bounded by some polynomial in $n$, as suggested in Friedrich et al. (2015). The paper introduced the concept of graceful scaling, in which the runtime of an algorithm scales polynomially with noise intensity, and suggested that cGA is capable of achieving this. It is also suggested that there is no threshold point in noise intensity at which the cGA algorithm begins to perform poorly (by which they mean having super-polynomial runtime). They proved that cGA is able to find the optimum of the noisy OneMax problem with Gaussian noise of variance $\sigma^2$ after $O(K\sigma^2\sqrt{n}\log Kn)$ steps when $K=\omega(\sigma^2\sqrt{n}\log n)$, with probability $1-o(1)$. Note that this upper bound is in excess of $n^3$ when $\sigma=\sqrt{n}$.

### 3.4 Population-Based Incremental Learning (PBIL)

The algorithm PBIL, proposed by Baluja (1994), combines genetic algorithms and competitive learning for optimising a function. We have included this algorithm as it is in some ways similar to the cGA, so we might expect it to have similar performance. We are not aware of any theoretical analysis of this algorithm on noisy problems. The runtime of PBIL on OneMax is known to be $O(n^{3/2}\log n)$, for a suitable choice of $\lambda$ (Wu et al., 2017).

### 3.5 Univariate Marginal Distribution Algorithm (UMDA)

The Univariate Marginal Distribution Algorithm (UMDA) proposed by Mühlenbein (1997) belongs to the EDA schema. In some ways, it is therefore similar to cGA and PBIL. However, it can also be viewed as generalising the *genepool* crossover scheme, in which bits are shuffled across the whole population (within their respective string positions). We have included UMDA, then, to see if its behaviour is more like cGA and PBIL on the one hand (which emphasise an evolving distribution over bit values), or like PCEA on the other (which emphasises crossover). The UMDA algorithm initialises a population of $\lambda$ solutions, and sorts the population according to the fitness evaluation of each candidate solution. The best $\mu$ members of the population are selected to calculate the sample distribution of bit values in each position. The next population is generated from this distribution. There are two variants of UMDA, depending on whether the probabilities are constrained to stay away from the extreme values of 0 and 1, or not. It is known that if the population size is large enough (that is, $\Omega(\sqrt{n}\log n)$), then this handling of probabilities at the margins is not required (Witt, 2017). Since we will work with a large population (to match the PCEA algorithm described below), we will not employ margin handling, unless otherwise stated. In our experiments we will take $\mu=\lambda/2$. We are not aware of any theoretical results concerning UMDA on problems with posterior noise, but the runtime on OneMax is known to be $O(n\log n)$ for $\mu=\Theta(\sqrt{n}\log n)$; see Witt (2017).
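The generational loop described above can be sketched as follows. This is a minimal illustration, not the experimental code: the function name and signature are hypothetical, the initial population is assumed to be uniformly random, and the margin values $1/n$ and $1-1/n$ used when `margins=True` are a common convention assumed here rather than taken from the text.

```python
import random

def umda(noisy_f, n, lam, generations, rng, margins=False):
    """UMDA with truncation selection (mu = lam / 2); returns the final frequency vector."""
    mu = lam // 2
    p = [0.5] * n  # assumption: start from the uniform distribution over bit strings
    for _ in range(generations):
        # Sample lam candidates from the current product distribution.
        population = [[int(rng.random() < pi) for pi in p] for _ in range(lam)]
        # Sort by noisy fitness; the key is computed once per candidate.
        population.sort(key=noisy_f, reverse=True)
        selected = population[:mu]
        # Re-estimate the marginal frequency of a one at each position.
        p = [sum(ind[i] for ind in selected) / mu for i in range(n)]
        if margins:
            # Optional margin handling: keep frequencies away from 0 and 1.
            p = [min(max(pi, 1.0 / n), 1.0 - 1.0 / n) for pi in p]
    return p
```

Each generation uses `lam` fitness evaluations, so a fixed evaluation budget translates directly into a maximum number of generations.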

### 3.6 Paired-Crossover EA (PCEA)

Recently, the recombination operator has been suggested to be considerably beneficial in noisy evolutionary search. Prügel-Bennett et al. (2015) considered the problem of solving OneMax with noise of order $\sigma=\sqrt{n}$ and analysed the runtime of an evolutionary algorithm consisting only of selection and uniform crossover, the paired-crossover EA (PCEA). They show that if the population size is $c\sqrt{n}\log n$ then the required number of generations is $O(\sqrt{n}\log n)$, giving a runtime of $O(cn(\log n)^2)$, with the probability of failure at $O(1/n^c)$. The proof in that paper can be generalised to the case of $\sigma\geq\sqrt{n}$, to give a runtime of $O(\sigma^2\log n)$. It is not known what happens for lower levels of noise, though it is shown that in the absence of noise, PCEA solves OneMax in $O(n(\log n)^2)$.
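A minimal sketch of a selection-plus-uniform-crossover loop of this kind is given below. It reflects one reading of PCEA, stated here as an assumption: two parents are drawn at random, uniform crossover produces a complementary pair of children, and the child with the better noisy fitness enters the next population. The function name and signature are hypothetical.

```python
import random

def pcea(noisy_f, n, pop_size, generations, rng):
    """Paired-crossover EA sketch: uniform crossover plus a tournament on the child pair."""
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)  # two distinct population slots
            mask = [rng.random() < 0.5 for _ in range(n)]
            # Complementary children: each position goes to one child or the other.
            c1 = [ai if m else bi for ai, bi, m in zip(a, b, mask)]
            c2 = [bi if m else ai for ai, bi, m in zip(a, b, mask)]
            # Keep whichever child has the better noisy measurement.
            next_pop.append(c1 if noisy_f(c1) >= noisy_f(c2) else c2)
        pop = next_pop
    return pop
```

Note that no mutation is applied, so a bit value lost from the whole population cannot be recovered; this is why a sufficiently large population is needed.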

## 4 Experiments—Simple Noisy Problems

### 4.1 Noisy OneMax

We investigate the performance of the algorithms described above, in solving the noisy OneMax problem. In the literature, some theoretical proofs exist for the expected runtime of specific algorithms on solving the noisy OneMax problem with additive posterior Gaussian noise (Prügel-Bennett et al., 2015; Dang and Lehre, 2015; Akimoto et al., 2015; Friedrich et al., 2017; Lucas et al., 2017; Qian et al., 2018; Dang-Nhu et al., 2018; Doerr and Sutton, 2019; Doerr, 2020). We are interested in the algorithms' performances given a reasonable but fixed runtime budget across a wide range of noise levels, from $\sigma=0$ up to $\sigma=\sqrt{n}$.

To address the question of what constitutes a *reasonable* budget, we compared the known theoretical results of our algorithms on noisy OneMax. PCEA has the lowest proven upper bound on its runtime, compared with the other algorithms for which results exist. We therefore allowed each algorithm to have twice the number of fitness evaluations that PCEA requires (on average) to find the optimum, as a reasonable budget. The function evaluation budgets calculated in this way are given in Table 1.

| $\sigma$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| budget | 38392 | 41066 | 44477 | 50728 | 56851 | 64079 | 70736 | 79034 | 86078 | 93638 |

The population size for the PCEA is taken to be $10\sqrt{n}\log n$ according to the theoretical proofs and empirical study by Prügel-Bennett et al. (2015). According to the proofs by Dang and Lehre (2015), the population size $\lambda=\sigma^2\log n$ is chosen for the mutation-population algorithm. According to the paper by Friedrich et al. (2015), the parameter $K=7\sigma^2\sqrt{n}\log n$ is considered for cGA. In the presence of additive posterior noise, PBIL and UMDA have not yet been studied much. For PBIL, the population size is taken as $\lambda=10n$ (following the theoretical requirement of Wu et al., 2017). From these, we select the best $\mu=\lambda/2$ individuals. In the case of UMDA, the total number of generated candidates in a particular generation is chosen as $20\sqrt{n}\log n$, so that the effective population size is the same as for PCEA. All these parameter settings are retained for all of our experiments in simple and constrained noisy combinatorial optimisation problems.

### 4.2 Noisy WeightedLinear Problem

The WeightedLinear problem, as defined in Section 2, has only one global optimum when maximised, with fitness equal to the sum of all the weights. The OneMax problem is the special case of the WeightedLinear problem in which all the weights are equal to one. However, optimising the WeightedLinear problem is more difficult, as the bits with heavier weights are optimised in preference to the bits with lower weights.

| $\sigma$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| budget | 47096 | 46801 | 47704 | 48350 | 48682 | 49954 | 50876 | 51429 | 52794 | 53310 |

It is evident from the empirical results on these simple noisy problems that the uniform crossover-based PCEA and UMDA can cope with noise significantly better than the other algorithms. It is interesting to note that UMDA employs a mechanism similar to *genepool crossover*, where at each bit position the offspring bit is obtained by recombination of that bit across the whole parent population. We hypothesise that these two algorithms are therefore highly similar in operation.

## 5 Experiments—Noisy Combinatorial Problems

### 5.1 Noisy SubsetSum

Given the success of UMDA and PCEA on the noisy toy problems, and the failure of the others to cope with even modest levels of noise, we now move to the second stage of the study considering only UMDA and PCEA.

For the noisy SubsetSum problem, we consider problem sizes of 50, 100, 150, and 200 weights, each weight chosen uniformly at random between 1 and 100. For each problem size, 10 different instances are generated. The target $\theta$ is set to two-thirds of the sum of all the weights in the set. The additive Gaussian noise is centred at zero, with standard deviation an integral multiple of the mean of the weights, viz., $5\times\mathrm{mean}(W)$, $10\times\mathrm{mean}(W)$, $15\times\mathrm{mean}(W)$, and $20\times\mathrm{mean}(W)$.
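The instance generation and noise levels described above can be sketched as follows. The helper names are hypothetical, and the sketch covers only the instance set-up, not the SubsetSum fitness function itself.

```python
import random

def make_subset_sum_instance(n, rng):
    """Draw n weights uniformly from 1..100 and set the target to two-thirds of their sum."""
    weights = [rng.randint(1, 100) for _ in range(n)]
    theta = 2.0 * sum(weights) / 3.0
    return weights, theta

def noise_levels(weights, multiples=(5, 10, 15, 20)):
    """Standard deviations used for the noise: integral multiples of the mean weight."""
    mean_w = sum(weights) / len(weights)
    return [k * mean_w for k in multiples]

rng = random.Random(0)
# Ten instances for each of the four problem sizes.
instances = {n: [make_subset_sum_instance(n, rng) for _ in range(10)]
             for n in (50, 100, 150, 200)}
```

Since the mean weight is about 50, the smallest noise level already has a standard deviation of roughly 250, several times the largest single weight.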

### 5.2 Noisy Knapsack (Version 1)

When noise is added, neither algorithm finds the optimal solution, so we record the best solution found (as assessed by the non-noisy fitness function). For each problem instance, we plot (in Figures 6 and 10) the best solution found (averaged over 100 runs) as a fraction of the best solution ever encountered for that problem instance. This enables us to make meaningful comparisons between problem instances. The best known solution for each problem instance has a scaled fitness value of 1. Figures 7 and 11 show the time taken (on average) to locate the best found solution in each case. We can observe in Figures 6 and 7 that both algorithms can find good, though not optimal, solutions for NoisyKnapsackV1 with significant levels of noise.

Observations from the Mann–Whitney U-test show that UMDA is slightly better than PCEA with these parameter settings for larger noise levels.

### 5.3 Noisy Knapsack (Version 2)

When the measurements of the weights are uncertain, as well as those of the profits, this creates a more complex noise model for the Knapsack problem. In the first stage, the total weight of the proposed solution is compared against the capacity, and this is done with added noise. Hence it may be thought that the proposed solution is feasible when in fact it is not. If it is considered feasible, then the benefit (total profit) is calculated, again with added noise. The same problem instances are considered as in the previous version of the Knapsack problem.
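The two-stage evaluation described above might be sketched as follows. Several details are assumptions made only for illustration: separate standard deviations `sigma_w` and `sigma_p` for the weight and profit measurements, and a return value of `capacity - measured_weight` (a negative number) when the solution is judged infeasible. The function name is hypothetical.

```python
import random

def noisy_knapsack_v2(x, weights, profits, capacity, sigma_w, sigma_p, rng):
    """Noisy feasibility check on the total weight, then a noisy profit if judged feasible."""
    # Stage 1: the total weight is measured with noise and compared with the capacity.
    measured_weight = sum(w for w, xi in zip(weights, x) if xi) + rng.gauss(0.0, sigma_w)
    if measured_weight > capacity:
        # Hypothetical penalty for solutions judged infeasible (always negative).
        return capacity - measured_weight
    # Stage 2: the total profit is measured with independent noise.
    return sum(p for p, xi in zip(profits, x) if xi) + rng.gauss(0.0, sigma_p)
```

Because the feasibility decision itself is noisy, a truly infeasible solution whose weight is close to the capacity can occasionally be scored by its (noisy) profit, which is the extra difficulty this version introduces.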

Figures 8 and 10 depict how the best (non-noisy) solution varies for different problem sizes. This value is scaled with respect to the best value found when there is no noise. The Mann–Whitney U-test shows that the best solution achieved and corresponding runtime of UMDA is better than PCEA in these particular parameter settings. The runtime required to find these values is shown in Figures 9 and 11, and we see that UMDA finds its best solution considerably faster than PCEA.

### 5.4 Noisy ConstrainedSetCover and PenaltySetCover

The ConstrainedSetCover problem is solved by first finding feasible solutions and then minimising the number of selected sets. This lexicographic ordering is achieved in the selection mechanism of the considered algorithms.

In PCEA, the child with the fewest uncovered elements is selected. When both children have the same number of uncovered elements, the child with the smaller number of sets goes to the next population. In UMDA, the sorting of the population is based on the above-mentioned lexicographic ordering. We use margin handling in UMDA for all the following experiments in single-objective optimisation.
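The lexicographic ordering can be sketched with a tuple-valued sort key. For clarity the sketch is noise-free, since how the noise enters these two measurements is not restated here; the helper names and the representation of the instance as a list of Python sets over the universe `0..universe_size-1` are assumptions.

```python
def uncovered_count(x, sets, universe_size):
    """Number of universe elements not covered by the sets selected in bit string x."""
    covered = set()
    for chosen, s in zip(x, sets):
        if chosen:
            covered |= s
    return universe_size - len(covered)

def lexicographic_key(x, sets, universe_size):
    """Compare first on uncovered elements, then on the number of selected sets."""
    return (uncovered_count(x, sets, universe_size), sum(x))

def pcea_pick(child_a, child_b, sets, universe_size):
    """PCEA-style choice between two children under the lexicographic ordering."""
    return min(child_a, child_b, key=lambda c: lexicographic_key(c, sets, universe_size))

def umda_sort(population, sets, universe_size):
    """UMDA-style ranking of a population, best candidates first."""
    return sorted(population, key=lambda c: lexicographic_key(c, sets, universe_size))
```

Python compares tuples element by element, so the second component is consulted only when the first components are equal, which matches the tie-breaking rule described above.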

## 6 Noisy Combinatorial Multiobjective Problems

In this section, we empirically examine the performances of several evolutionary algorithms on noisy combinatorial multiobjective problems. Much of the previous work on multiobjective optimisation (especially in the context of noise) has concerned continuous problems (Goh et al., 2010; Shim et al., 2013; Fieldsend and Everson, 2015; Falcón-Cardona and Coello, 2020). In this article, we focus on discrete problems, but with additive (posterior) Gaussian noise.

The aim in multiobjective optimisation is to find the *Pareto optimal solution set*, where none of the objectives may be improved without worsening at least one of the other objectives. In the context of noisy multiobjective optimisation, the goal is to find the set of Pareto optimal solutions, as defined in the absence of noise; however, the challenge is that each time a comparison is made, noise is applied. This is particularly problematic for algorithms that make use of an *archive* of nondominated solutions, as it is easy for a solution to be incorrectly placed in the archive due to the noise.

In order to assess how successfully we have approximated the true Pareto optimal set, we measure the spread of a set of nondominated solutions on the basis of the frequently used *hypervolume* performance indicator (Zitzler and Thiele, 1998). Where we seek to minimise each objective, this is a measure of the area (or volume) of the region bounded below by a set of candidate solutions simultaneously and bounded above by a reference point $r$ in the objective space. The reference point $r$ is chosen to be the maximum value each objective function can attain in each corresponding dimension of the objective space; that is, $r=(\max f_1,\max f_2,\ldots,\max f_k)$. Conversely, for maximisation problems, we take the volume between the candidate set and a lower bounding reference point (in the case of non-negative objectives, it is common to take the origin as the reference point). We use hypervolume of the population as an indicator of the spread of the nondominated solutions in each generation of the considered algorithms.
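For two objectives that are both maximised, the hypervolume with respect to a lower reference point is the area of the union of the rectangles spanned by the reference point and each candidate. A minimal sketch, assuming the origin as the default reference point and using a hypothetical function name, is:

```python
def hypervolume_2d_max(points, ref=(0.0, 0.0)):
    """Area dominated by a set of 2-D points (both objectives maximised) above ref."""
    # Only points that strictly improve on the reference point contribute.
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: (-p[0], -p[1]))
    area = 0.0
    best_y = ref[1]
    # Sweep from the largest first objective downwards, adding one horizontal
    # strip each time the second objective improves.
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area
```

For the three mutually nondominated points $(3,1)$, $(2,2)$ and $(1,3)$ the strips have areas 3, 2 and 1, giving a hypervolume of 6; adding a dominated point such as $(1,1)$ leaves the value unchanged.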

The toy problem we consider is *Counting Ones Counting Zeroes* (COCZ), in which the first objective function counts the number of ones in a string, and the second objective function counts the number of ones in the first $m$ bits and the number of zeroes in the remainder. We seek to maximise both objectives.
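The two objectives described above can be written down directly. In the sketch below, the noisy variant adds independent Gaussian noise to each objective; that independence, like the function names, is an assumption made for illustration.

```python
import random

def cocz(x, m):
    """COCZ objectives for bit string x, both to be maximised."""
    f1 = sum(x)                                   # ones in the whole string
    f2 = sum(x[:m]) + sum(1 - b for b in x[m:])   # ones in the first m bits, zeroes in the rest
    return f1, f2

def noisy_cocz(x, m, sigma, rng):
    """COCZ with additive posterior Gaussian noise on each objective (assumed independent)."""
    f1, f2 = cocz(x, m)
    return f1 + rng.gauss(0.0, sigma), f2 + rng.gauss(0.0, sigma)
```

The two objectives agree on the first $m$ bits and conflict on the remainder, so the Pareto optimal strings are those whose first $m$ bits are all ones.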

## 7 Algorithms Chosen for Noisy Multiobjective Combinatorial Problems

### 7.1 Simple Evolutionary Multiobjective Optimiser (SEMO)

SEMO (Laumanns et al., 2004) is one of the simplest evolutionary algorithms designed for multiobjective optimisation in a discrete search space. To the best of our knowledge, it has not previously been used to solve noisy problems. SEMO is a simple population-based algorithm using one-bit mutation, and a variable population size (representing the current nondominated solutions found). The algorithm starts by adding an initial solution $x\in\{0,1\}^n$, chosen uniformly at random, to the population $P$. Then a solution $y$ is chosen randomly from $P$ and mutated with a one-bit flip to obtain $y'$. If $y'$ is dominated by anything in $P$ it is discarded. Otherwise it is added to $P$ and all the solutions that $y'$ dominates in $P$ are discarded. Then a new $y$ is chosen from $P$ and the process is repeated. One of the great challenges SEMO will face due to noisy dominance relations is that good solutions will often be discarded and bad solutions retained in $P$.
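The loop described above can be sketched as follows for maximisation problems. Two details are assumptions made for this illustration: population members are re-evaluated (with fresh noise) every time they are compared, and members with an objective vector equal to that of the new solution are removed along with the dominated ones, which keeps the population free of duplicates. The function names are hypothetical.

```python
import random

def dominates(a, b):
    """Strict Pareto dominance for maximisation: a is no worse everywhere and better somewhere."""
    return all(ai >= bi for ai, bi in zip(a, b)) and any(ai > bi for ai, bi in zip(a, b))

def weakly_dominates(a, b):
    """a is at least as good as b in every objective (includes equal vectors)."""
    return all(ai >= bi for ai, bi in zip(a, b))

def semo(noisy_objs, n, iterations, rng):
    """SEMO sketch with one-bit mutation and a variable-size population."""
    pop = [[rng.randint(0, 1) for _ in range(n)]]
    for _ in range(iterations):
        y = list(rng.choice(pop))
        y[rng.randrange(n)] ^= 1  # one-bit flip
        fy = noisy_objs(y)
        # Assumption: members are measured afresh for every comparison.
        scores = [noisy_objs(z) for z in pop]
        if any(dominates(fz, fy) for fz in scores):
            continue  # the offspring appears dominated and is discarded
        pop = [z for z, fz in zip(pop, scores) if not weakly_dominates(fy, fz)]
        pop.append(y)
    return pop
```

With noise, a single unlucky measurement of a good member is enough for it to appear dominated and be removed, which illustrates the difficulty noted above.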

### 7.2 Nondominated Sorting Genetic Algorithm—II (NSGA-II)

NSGA-II, by Deb et al. (2002), sorts the population into nondominated fronts in each generation. Based on nondominated sorting and using a crowding heuristic to break ties, the best half of individuals become the parent population of the next generation. In case of noisy function evaluations, nondominated sorting will be affected and worse solutions will appear in better nondominated fronts. We use the same algorithm structure as defined in Deb et al. (2002) except considering noisy function evaluations during the selection process.

### 7.3 Variants of Multiobjective Univariate Marginal Distribution Algorithm (moUMDA)

From our experiments in noisy single-objective combinatorial problems, UMDA and PCEA show significantly better performance in handling noise compared with the other algorithms we tried, with UMDA generally producing better quality solutions. From these results, we hypothesise that a multiobjective version of UMDA (denoted moUMDA) may be able to handle large levels of noise in noisy combinatorial multiobjective problems if proper diversification mechanisms are employed. In order to investigate this, we have considered several versions of moUMDA in our analysis with different diversification techniques.

Pelikan et al. (2005) introduced a version of UMDA to address multiobjective problems which used nondominated sorting in the selection mechanism. They also experimented with clustering methods, to help the algorithm generate solutions across the Pareto front. We have followed this idea, and studied several versions of UMDA adapted for multiobjective problems. Where nondominated sorting and crowding are used for selection, these are implemented identically to NSGA-II. We also consider making use of an archive, and in using hypervolume as a criterion in selection:

- **moUMDA without duplicates:** Uses nondominated sorting (with crowding to break ties) for selection. Maintains diversity by disallowing duplicates when generating the population. See Algorithm 2.
- **moUMDA with clustering:** Uses nondominated sorting (with crowding to break ties) for selection. Clusters the selected population members (using either K-means or Hierarchical Agglomeration), and produces a frequency vector for each cluster. Generates the next population from these, in proportion to the number of items within each cluster. See Algorithm 3.
- **moUMDA with Pareto archive:** Maintains an archive of nondominated solutions and uses this to generate the frequency vector for the next population. Uses nondominated sorting (with crowding to break ties) for selection, and updates the archive with the selected items. See Algorithm 4.
- **moUMDA with hypervolume comparison operator:** Uses binary tournament selection, comparing solutions initially by Pareto dominance. If neither dominates the other, then the one with the better hypervolume indicator value is selected. See Algorithm 5.

## 8 Experiments—Noisy Multiobjective Problems

Following the same strategy as for single-objective problems, we initially choose a wide range of evolutionary multiobjective algorithms to compare their performances on a toy problem: noisy CountingOnesCountingZeroes (COCZ). The algorithms considered for solving COCZ consist of SEMO, NSGA-II and several versions of multiobjective UMDA (moUMDA) as described above. Depending on their performances on this problem, we selected a smaller set of the better performing algorithms for the multiobjective noisy SetCover problem.

Some recent studies claim that multiobjective evolutionary approaches are useful in solving single objective optimisation problems (Segura et al., 2016). For example, the multiobjective version of SetCover could enable us to find good solutions to the original single-objective version (by looking at solutions generated which do not violate the constraints). Here, we consider whether this approach is also helpful in the context of noise.

### 8.1 Noisy CountingOnesCountingZeroes (COCZ)

The results shown in Figure 15 show that SEMO is the worst performing algorithm, even when there is no noise, and the performance degrades slightly as noise is increased. The Pareto Archive algorithm (PAmoUMDA) is the next worst. Although it does not degrade too much with added noise, it is still clearly worse than the other algorithms.

The remaining algorithms have rather similar performance, but we can still distinguish different behaviours by looking at the zoomed-in section of the plot in Figure 15. The version of moUMDA that uses the hypervolume comparison operator (moUMDAHCO) performs very well when there is little or no noise. However, its performance degrades considerably as the level of noise increases. The same is true for NSGA-II. When the noise reaches a standard deviation of $\sigma=15$, these two algorithms are the worst of the remaining ones.

The plain moUMDA and the version forbidding duplicates in the population both have the curious property that their performance improves with the presence of low levels of noise, and then degrades at higher levels of noise. We speculate that low levels of noise allow for much more diversity in the populations. At high levels of noise ($\sigma =15$) they are the best performing algorithms, along with the two versions of moUMDA that use clustering (moUMDA-Kmeans and moUMDA-HAC). The moUMDA with no duplicates is marginally the best overall at this level of noise.

### 8.2 Noisy Multiobjective SetCover

In this section, we compare the performance of three of our multiobjective algorithms, viz., NSGA-II, moUMDA with no duplicates allowed, and moUMDA employing K-means clustering, on the noisy multiobjective SetCover problem. We have chosen these algorithms based on their behaviour on COCZ, where they were amongst the best algorithms we tried. There being little to distinguish the two clustering methods, we have chosen to test just one of them (K-means clustering). We have selected the “no duplicates” version of moUMDA, as this gave a small advantage over the plain moUMDA. Finally, we have kept NSGA-II as it is a standard algorithm for any multiobjective problem.

## 9 Conclusion

We have empirically studied a range of evolutionary algorithms on a set of noisy problems. The $(1+1)$-EA, as expected, fails to cope with any degree of posterior noise. Interestingly, some algorithms (the mutation-population algorithm and cGA), for which there is a theoretical polynomial runtime for noisy OneMax, fail to be useful in practice compared with some other algorithms. PBIL performs somewhat similarly to cGA. The Paired-Crossover Evolutionary Algorithm handles noise well on both the simple test problems and the noisy combinatorial problems we have tried. Interestingly, UMDA also handles these cases well, with even slightly better performance than PCEA. This may be due to the fact that UMDA has a stronger selection method (truncation selection) than PCEA (which uses a tournament on pairs of offspring). Of course, parameter values for each could be tweaked to produce slightly different results; our key finding is that these are the only algorithms we have tried that seem remotely practical for such problems. It seems likely that UMDA's performance is due more to its relationship with crossover algorithms (such as genepool crossover) than to its nature as an EDA (such as PBIL).

We are not aware of any previously published results on noisy combinatorial multiobjective problems. We carefully selected a set of multiobjective algorithms on the basis of the performance on noisy COCZ and tested them on the noisy multiobjective SetCover. We observe that multiobjective UMDA with a simple diversity mechanism that allows no duplicate solutions in the population is effective at solving the noisy SetCover problem in both constrained and multiobjective forms. UMDA can also benefit from using a clustering approach when dealing with noisy multiobjective problems.

## Note

^{1} This article is an extended version of Aishwaryaprajna and Rowe (2019).