## Abstract

The multipopulation method has been widely used to solve dynamic optimization problems (DOPs) with the aim of maintaining multiple populations on different peaks to locate and track multiple changing optima simultaneously. However, to make this approach effective for solving DOPs, two challenging issues need to be addressed. They are how to adapt the number of populations to changes and how to adaptively maintain the population diversity in a situation where changes are complicated or hard to detect or predict. Tracking the changing global optimum in dynamic environments is difficult because we cannot know when and where changes occur and what the characteristics of changes would be. Therefore, it is necessary to take these challenging issues into account in designing such adaptive algorithms. To address the issues when multipopulation methods are applied for solving DOPs, this paper proposes an adaptive multi-swarm algorithm, where the populations are enabled to be adaptive in dynamic environments without change detection. An experimental study is conducted based on the moving peaks problem to investigate the behavior of the proposed method. The performance of the proposed algorithm is also compared with a set of algorithms that are based on multipopulation methods from different research areas in the literature of evolutionary computation.

## 1 Introduction

The multipopulation method has been widely used in evolutionary computation (EC) to locate and track multiple optima over environmental changes. Multipopulation methods, with proper enhancements, have the potential to be efficient methods to solve dynamic optimization problems (DOPs) because they have two advantages. Firstly, they are able to maintain the population diversity at the global level. The population diversity will always be guaranteed as long as populations distribute in different subareas in the fitness landscape even though all of them are converging. Secondly, they are able to track a set of optima rather than a single optimum, which will increase the possibility of tracking the changing global optimum. This is because one of the relatively good optima in the current environment has a high possibility of being the new global optimum in the next environment.

Although many algorithms based on multipopulation methods have been proposed to solve DOPs (Cruz et al., 2011; Nguyen et al., 2012), several fundamental challenging issues still remain to be addressed (Li and Yang, 2009; Yang and Li, 2010), for example, how to determine the number of populations, when to respond to changes, and how to maintain the population diversity without change detection. Some of these issues had not been discussed until recently (Li and Yang, 2009; Yang and Li, 2010), where a hierarchical clustering method is employed within particle swarm optimization (PSO) to create a set of populations that are distributed in different subareas of the fitness landscape. The clustering PSO (CPSO) algorithm proposed in Li and Yang (2009) and Yang and Li (2010), and its extended version (CPSOR) in Li and Yang (2012), attempt to address the challenging issues that arise when multipopulation methods are applied. However, far more effort is needed to solve these issues.

To enable multiple populations to adapt to changes in dynamic environments, this paper proposes an adaptive multi-swarm optimizer (AMSO). The motivation is to provide a method that permits maintaining population diversity and adapting the number of populations without requiring a change detection mechanism. The work in this paper builds on our previous research (Li and Yang, 2012; Yang and Li, 2010), which uses the clustering method (Li and Yang, 2009) to create multiple populations. However, this research makes several significant improvements over our previous research.

Firstly, the parameter settings in AMSO are not guided by problem information. In our previous algorithms, some key parameters are set by directly using information about the problem to be solved. For example, two key parameters in CPSOR (Li and Yang, 2012), namely the number of individuals and the diversity threshold, are determined by the number of peaks (optima) in the moving peaks benchmark (MPB; Branke, 1999; see Equations (2) and (3) in Section 2.2.1, respectively). For the MPB, the number of peaks is available. However, for other problems, such information may be unknown; for example, the generalized dynamic benchmark generator (GDBG; Li et al., 2011) contains a huge, unknown number of local optima. Owing to its diversity maintaining mechanism, AMSO does not use such problem information to guide parameter settings.

Although the population diversity maintaining mechanism in AMSO is similar to the idea in CPSOR (Li and Yang, 2012; both increase the population diversity when it drops to a threshold level), the mechanism in AMSO is adaptive while the one in CPSOR is not. In CPSOR, the number of individuals is simply restored to an initial number when the diversity drops to a threshold level. The total number of individuals and the threshold value determine the number of populations and the moment to increase diversity, respectively. However, they are fixed and do not adapt to changes, particularly in situations with an unknown number of peaks (see Figure 2 discussed later). In this paper, the number of populations and the moment to increase diversity are adaptive. This enables the proposed algorithm to efficiently use the available fitness evaluations to track more peaks than CPSOR (see Figure 5 later in this paper), and hence greatly improves the performance.

Secondly, like the CPSOR algorithm (Li and Yang, 2012), change detection is not needed in AMSO. However, change detection is needed for CPSO (Yang and Li, 2010) to trigger the population increase procedure. The performance of CPSO is seriously affected in environments where changes are hard to detect (see Table 9 later in this paper).

Thirdly, in our previous work (Li and Yang, 2012; Yang and Li, 2010), the results of some peer algorithms were collected from the papers where they were proposed, while in this paper, all the peer algorithms are implemented, and they are run and compared based on exactly the same dynamic environments and performance measurements.

The rest of this paper is organized as follows. Section 2 reviews some multipopulation methods developed in dynamic environments and discusses the difficulties in using multipopulation methods for DOPs. The idea of adaptively maintaining the population diversity without change detection and the proposed AMSO are described in Section 3. The experimental studies regarding the configuration, working mechanism, and comparison of AMSO with other algorithms on the MPB problem are presented in Section 4. Finally, conclusions are given in Section 5.

## 2 Multiple Populations in Dynamic Environments

Due to the advantages of implicitly maintaining diversity, multipopulation methods have been widely used in the literature of EC for solving DOPs.

### 2.1 Related Research for DOPs

Branke et al. (2000) proposed a self-organizing scouts (SOS) algorithm that has shown promising results on DOPs with many peaks. In SOS, the whole population is composed of a parent population that searches through the entire search space and child populations that track local optima. The parent population is regularly analyzed to check the condition for creating child populations, which are split off from the parent population.

Inspired by the SOS algorithm (Branke et al., 2000), a fast multi-swarm^{1} optimization (FMSO) algorithm was proposed by Li and Yang (2008) to locate and track multiple optima in dynamic environments. In FMSO, a parent swarm is used as a basic swarm to detect the most promising area when the environment changes, and a group of child swarms are used to search the local optimum in their own subspaces. Each child swarm has a search radius, and there is no overlap among all child swarms since they exclude each other. If the distance between two child swarms is less than their radius, then the whole swarm of the worse one is removed. This guarantees that no more than one child swarm covers a single peak. Another similar idea of hibernation multi-swarm optimization (HmSO) algorithm was introduced by Kamosi et al. (2010), where a child swarm will hibernate if it is no longer productive and will be woken up if a change is detected.

In the work of Jiang et al. (2009), swarms are dynamic and the size of each swarm is small. The whole population is divided into many small subswarms. The subswarms are regrouped frequently by using different regrouping schemes and information is exchanged among subswarms. Several accelerating operators are applied to improve the local search ability. Changes need to be detected and adjustments are performed once changes are detected.

An atomic swarm approach was adopted by Blackwell and Branke (2004) to track multiple optima simultaneously with multiple swarms in dynamic environments. An atomic swarm is composed of charged (or quantum) and neutral particles. The model can be depicted as a cloud of charged particles orbiting a contracting, neutral, PSO nucleus. In the approaches of Blackwell and Branke (2006), either charged particles (for the mCPSO algorithm) or quantum particles (for the mQSO algorithm) are used for maintaining the diversity of the swarm, and an exclusion principle ensures that no more than one swarm surrounds a single peak. An anti-convergence principle is also introduced to detect new peaks by sharing information among all subswarms. This strategy was experimentally shown to be efficient for the MPB.

Borrowing the idea of exclusion from mQSO (Blackwell and Branke, 2006), Mendes and Mohais (2005) developed a multipopulation differential evolution (DE) algorithm (DynDE) for DOPs. In their approach, a dynamic strategy for the mutation factor *F* and probability factor in DE was introduced. Recently, an enhanced version of mQSO was proposed by applying two heuristic rules to further enhance the diversity of mQSO in del Amo et al. (2010). One of the two rules is to increase the number of quantum particles and decrease the number of trajectory particles when a change occurs. The other rule is to reinitialize or pause the swarms that have bad performance.

A collaborative evolutionary swarm optimization (CESO) was proposed by Lung and Dumitrescu (2007). In CESO, two swarms, which use the crowding DE (CDE) (Thomsen, 2004) and the PSO model, respectively, cooperate with each other by a collaborative mechanism. The swarm using CDE is responsible for preserving diversity while the PSO swarm is used for tracking the global optimum. Competitive results were reported in Lung and Dumitrescu (2007). Thereafter, a similar algorithm, called evolutionary swarm cooperative algorithm (ESCA), was proposed by Lung and Dumitrescu (2010) based on the collaboration between a PSO algorithm and an evolutionary algorithm (EA). In ESCA, three populations using different EAs are used. Two of them follow the rules of CDE (Thomsen, 2004) to maintain the diversity. The third population uses the rules of PSO. Three types of collaborative mechanisms were also developed to transmit information among the three populations.

Parrott and Li (2004) developed a speciation based PSO (SPSO), which dynamically adjusts the number and size of swarms by constructing an ordered list of particles, ranked according to their fitness by a “good first” rule, with spatially close particles joining a particular species. At each generation, SPSO aims to identify multiple species seeds within a swarm. Once a species seed has been identified, all the particles within its radius are assigned to that same species. Parrott and Li (2006) also proposed an improved version with a mechanism to remove duplicate particles in species. Bird and Li (2006) developed an adaptive niching PSO (ANPSO) algorithm which adaptively determines the radius of a species by using the population statistics. Based on their previous work, Bird and Li (2007) introduced another improved version of SPSO using a least square regression (rSPSO). Recently, in order to determine niche boundaries, a vector-based PSO algorithm (Schoeman and Engelbrecht, 2009) was proposed to locate and maintain niches by using additional vector operations.
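The "good first" seed identification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPSO implementation: the function name `find_species_seeds`, the tuple representation of particles, and the convention that higher fitness is better are all choices made for this sketch.

```python
import math


def find_species_seeds(particles, fitness, radius):
    """Assign particles to species using a 'good first' ordering.

    particles: list of coordinate tuples (assumed representation).
    fitness:   list of floats, higher is better (assumed convention).
    radius:    species radius, supplied by the caller.
    Returns (seeds, assignment), where seeds is a list of particle
    indices and assignment maps each particle index to its seed index.
    """
    # Visit particles from best to worst fitness.
    order = sorted(range(len(particles)), key=lambda i: fitness[i], reverse=True)
    seeds = []
    assignment = {}
    for i in order:
        for s in seeds:
            # Join the first (fittest) seed whose radius covers this particle.
            if math.dist(particles[i], particles[s]) <= radius:
                assignment[i] = s
                break
        else:
            # No existing seed is close enough: this particle becomes a seed.
            seeds.append(i)
            assignment[i] = i
    return seeds, assignment
```

Because particles are visited in fitness order, each seed is the fittest member of its species, and the number of species follows from the spatial layout rather than being fixed in advance.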

An algorithm similar to SPSO (Parrott and Li, 2006), called PSO-CP, was proposed by Liu et al. (2010). In PSO-CP, the whole swarm is partitioned into a set of composite particles by a “worst first” principle, which is opposite to the “good first” rule used in SPSO. Each composite particle has a fixed membership of three particles (one pioneer particle and two elementary particles). Inspired by the composite particle phenomenon in physics, the elementary members in each composite particle interact via a velocity-anisotropic reflection (VAR) scheme to integrate valuable information. The idea behind the VAR scheme is to replace the worst particle with a reflection point with better fitness through the other two particles. The diversity of each composite particle is maintained by a scattering operator. An integral movement strategy, which aims to promote swarm diversity, moves the two elementary particles with the same velocity as the pioneer particle after the pioneer particle is updated.

The clustering PSO algorithm proposed by Li and Yang (2009) applies a hierarchical clustering method to divide an initial swarm into subswarms that cover different local regions. CPSO attempts to solve some challenging issues associated with multipopulation methods, for example, how to guide particles to move toward different promising subregions and how to determine the radius of subswarms. Recently, Li and Yang (2012) proposed a general framework for multipopulation methods in undetectable dynamic environments based on the clustering method used in Li and Yang (2009) and Yang and Li (2010). An algorithm called CPSOR was implemented using the PSO technique. The CPSOR algorithm shows superior performance compared with other algorithms, especially in dynamic environments where changes are hard to detect.

Recently, a new cluster-based differential evolution algorithm was proposed by Halder et al. (2013). In the algorithm, multiple populations are periodically generated by the *k*-means clustering method, and the number of clusters is decreased or increased by one over a time span according to the algorithm’s performance. When a cluster is converged, it is removed with the best individual stored in an external archive. When a change is detected, all the populations are restored to an initial size and reclustered.

A cultural framework was introduced in Daneshyari and Yen (2011) for PSO where five different kinds of knowledge, named situational knowledge, temporal knowledge, domain knowledge, normative knowledge, and spatial knowledge, respectively, are defined. The information is used to detect changes. Once a change is detected, a diversity-based repulsion mechanism is applied among particles as well as a migration strategy among swarms. The knowledge also helps in selecting leading particles at the personal, swarm, and global levels.

In Khouadjia et al. (2011), a multienvironmental cooperative model for parallel metaheuristics was proposed to handle DOPs that consist of different subproblems or environments. A parallel multi-swarm approach is used to deal with different environments at the same time by using different algorithms that exchange information obtained from these environments. The multi-swarm model was tested on a set of dynamic vehicle routing problems.

An adaptive PSO algorithm was proposed in Rezazadeh et al. (2011). In the proposed algorithm, the exclusion radius and inertia weight are adaptively adjusted by a fuzzy C-means (FCM) mechanism. A local search scheme is employed for the best swarm to accelerate the search progress. When the search areas of two subswarms overlap, the worse one is removed. To increase diversity, all normal particles are converted to quantum particles when a change is detected.

### 2.2 Difficulties in Determining the Number of Populations

The number of populations is a vital factor that affects the performance of an algorithm to locate and track the multiple peaks. However, determining a proper number of populations needed in a specific environment is a very difficult task. This is because the proper number of populations needed is mainly determined by the number of peaks in the fitness landscape. In addition, the distribution and shape of peaks may also play a role in configuring the number of populations. Generally speaking, the more peaks that are in the fitness landscape, the more populations that are needed. Several experimental studies (Blackwell and Branke, 2006; Mendes and Mohais, 2005; Yang and Li, 2010) have shown that the optimal number of populations is equal to the number of peaks in the fitness landscape for the MPB with a small number of peaks (i.e., less than 10 peaks). However, results in du Plessis and Engelbrecht (2012a) show that the optimal number of populations is not equal to the number of total peaks for the MPB with many peaks (i.e., more than 10 peaks). Although locating and tracking each peak by a single population is theoretically correct, it is not effective and hard to achieve in practice because only limited computational resources are available. In practice, a relatively low peak in the current environment usually has a very small chance to become the highest peak in a new environment, and thus, it will waste computational resources available to locate and track each peak by each population.

Intuitively, the optimal number of populations should be relevant to the number of promising peaks. The difficulty is how to figure out the number of such promising peaks in each specific environment. It becomes even harder when the number of peaks fluctuates or in cases where the number of peaks is unknown. In addition, how to determine the search radius for each population is also a difficult issue.

#### 2.2.1 Solutions So Far

To the best of our knowledge, little research regarding the above issue has been done so far. In the literature of multipopulation methods for DOPs, some researchers use predefined values for the number of populations and the radius of each population according to their empirical experience. For example, to effectively solve the MPB, 10 populations were suggested by Blackwell and Branke (2006) for the mQSO algorithm, the radius was set to 30 in SPSO (Parrott and Li, 2006), rSPSO (Bird and Li, 2007), and HmSO (Kamosi et al., 2010), and to 25 in FMSO (Li and Yang, 2008).

Blackwell and Branke (2006) set the exclusion radius of each population in mQSO according to the number of peaks:

$$r_{excl} = \frac{X}{2\,p^{1/D}}$$

where *X* is the range of the search space, *D* is the number of dimensions, and *p* denotes the number of peaks in the search space. Thereafter, several other researchers (del Amo et al., 2010; Mendes and Mohais, 2005) also adopted the same population radius on the MPB problem. In order to get an optimized number of populations, the CPSOR algorithm (Li and Yang, 2012) uses the number of peaks to estimate the total number of individuals, and the diversity threshold in CPSOR is likewise determined by the number of peaks.
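As a worked illustration of how a radius can be tied to the number of peaks, the sketch below assumes the exclusion-radius rule $r_{excl} = X / (2p^{1/D})$ associated with mQSO (Blackwell and Branke, 2006). The function name and the example values (a search range of 100, 5 dimensions, 10 peaks) are assumptions chosen for illustration only.

```python
def exclusion_radius(search_range, num_peaks, dims):
    """Exclusion radius under the assumed rule X / (2 * p**(1/D)).

    search_range: extent X of the search space in each dimension.
    num_peaks:    number of peaks p (must be known in advance).
    dims:         number of dimensions D.
    """
    return search_range / (2.0 * num_peaks ** (1.0 / dims))


if __name__ == "__main__":
    # Illustrative values only: X = 100, p = 10, D = 5 gives roughly 31.5.
    print(round(exclusion_radius(100.0, 10, 5), 2))
```

The rule needs the number of peaks as input, which is exactly the kind of problem information that is unavailable when the number of optima is unknown.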

Although the number of populations varies over the runtime in SOS (Branke et al., 2000), SPSO (Parrott and Li, 2004), and CPSO (Yang and Li, 2010), it is not adaptive as the total number of individuals is fixed during the whole run. One attempt at adapting the number of populations was made by Blackwell (2007) where mQSO was extended to a self-adaptive version, called self-adaptive multi-swarm optimizer (SAMO). The SAMO algorithm starts with a single free swarm (a free swarm is one that is patrolling the search space rather than converging on a peak). The number of free swarms will decrease when some of them are converging (a swarm is assumed to be converging when the neutral swarm diameter is less than a convergence diameter). If there is no free swarm, a new free swarm is created. On the other hand, a maximum number of free swarms (*n*_{excess}) is used to prevent too many free swarms from being created.

A similar population spawning and removing idea as used in SAMO was introduced and incorporated into a competitive differential evolution (CDE) algorithm (du Plessis and Engelbrecht, 2012b), which is called DynPopDE (du Plessis and Engelbrecht, 2012a), to address DOPs with an unknown number of optima. Different from the population converging criterion used in SAMO, a simple approach is used in DynPopDE where a population *k* is assumed to stagnate if there is no difference between the fitness of the best individual of two successive iterations (). If the stagnation criterion is met, a new free population will be created and the stagnated one will be reinitialized if it is an excluded population. To prevent too many populations from crowding in the search space, a population will be discarded when it is identified for reinitialization due to exclusion and .

One major issue of the above two adaptive algorithms is that the number of converging populations is not monitored. Therefore, more and more free populations will become converging populations regardless of the total number of peaks in the search space, which may be caused by an improper exclusion radius (one that depends on the number of populations *M*). For example, the average number of populations obtained by DynPopDE (du Plessis and Engelbrecht, 2012a) on an 80-peak MPB instance rises almost to 45 when the number of changes reaches 100 and is still growing (see Figure 4 in du Plessis and Engelbrecht, 2012a, and the fourth graph in Figure 2, shown later in this paper). Thus, the issue here is that the 45 peaks tracked by DynPopDE may not all be promising in terms of the probability of becoming new global optima when a change happens, and the performance would decrease because fitness evaluations are spent on tracking unpromising peaks. Another issue with the SAMO algorithm (Blackwell, 2007) is that the optimal value for parameter *n*_{excess} is problem-dependent (Blackwell, 2007; du Plessis and Engelbrecht, 2012a). For example, experimental results in Blackwell (2007) suggest that the optimal value of *n*_{excess} for the 10-peak MPB instance differs from that for the 200-peak MPB instance.

### 2.3 Difficulties in Maintaining Diversity in Dynamic Environments

So far, most EAs developed for DOPs either use some change detection methods (Li, 2004; Lung and Dumitrescu, 2007, 2010; Richter, 2009; Yang and Li, 2010) or predict changes assuming that changes have a pattern (Simoes and Costa, 2008). Once a change has been detected or predicted, different kinds of strategies are applied to increase the diversity, for example, random immigrant strategies (Li, 2004; Li and Yang, 2009; Lung and Dumitrescu, 2007, 2010; Yang and Li, 2010), or to reuse stored useful information assuming that the new environment is closely related to the current or a previous environment, for example, memory-based strategies (Branke, 1999). However, these strategies can be used efficiently only under one condition: changes must be successfully detected. This raises a common question: what can these algorithms do if they fail to detect changes? For example, reevaluation methods will fail to detect changes in a fitness landscape where only a part of it changes if all evaluators are in unchanged areas. An example of completely undetectable environments is noisy environments, where changes cannot be detected by reevaluation methods because the noise in every fitness evaluation will be misinterpreted as a change.

Maintaining diversity without change detection throughout the run is an interesting topic. In Grefenstette (1992), random individuals (called random immigrants) are created in every iteration. Three different mutation strategies were designed to control the diversity in Cobb and Grefenstette (1993). Sharing or crowding mechanisms in Cedeño and Rao Vemuri (1997) were introduced to ensure diversity. A genetic algorithm (GA), called thermodynamical GA (TDGA; Mori et al., 1996), was proposed to control the diversity explicitly via a measure called free energy. However, these methods are not effective because the continuous focus on diversity slows down the optimization process, as pointed out by Jin and Branke (2005).

Normally, maintaining diversity is achieved by the following three methods: (1) introducing new randomly generated individuals; (2) reactivating individuals via mutation operation with a large probability or a large mutation step; and (3) allowing some individuals to use specially designed rules to maintain diversity rather than to locate the global optimum. However, for the first and second methods, the difficulty is deciding when to increase diversity. For the third method, the problem is how to design effective rules for maintaining diversity. In addition, the waste of computational resources for the third method cannot be avoided due to the function of specialized individuals.

In fact, all the above difficulties concerned with maintaining diversity in dynamic environments can be attributed to one fundamental issue, which is how to actively adapt the whole population to changes. As we know, changes in dynamic environments are usually unpredictable. We cannot predict when, where, and what kind of changes will take place. Therefore, to efficiently solve DOPs, an algorithm should be able to actively learn the information about the changes.

## 3 Multipopulation Adaptation in Dynamic Environments

In order to make populations adaptable to changes, we use a clustering method to create populations. All populations use the same search operator to focus on local search. An overcrowding handling scheme is applied, if certain criteria are satisfied, to remove unnecessary populations and, hence, save computational resources. To find out proper moments to increase diversity without the aid of change detection methods, a special rule is designed according to the drop rate of the number of populations over a certain period of time. In order to introduce the proper number of active individuals that are needed in each specific environment, an adaptive method is developed according to the information collected from the whole populations since the last diversity-increasing point.

### 3.1 Preparation for Multipopulation Adaptation

Before introducing our population adaptation method, we do some preparatory work, including the introduction of a multipopulation generation scheme and an overlapping detection scheme.

#### 3.1.1 Multipopulation Generation

In order to divide the search space into several subareas without overlapping, we use the single linkage hierarchical clustering method proposed in Li and Yang (2009). In this method, the distance *d*(*i*, *j*) between two individuals *i* and *j* in the *D*-dimensional space is defined as the Euclidean distance between them:

$$d(i, j) = \sqrt{\sum_{k=1}^{D} \left(x_i^k - x_j^k\right)^2}$$

where $x_i^k$ is the *k*th coordinate of individual *i*. The distance between two clusters *t* and *s*, denoted *M*(*t*, *s*), is defined as the distance of the two closest individuals *i* and *j* that belong to clusters *t* and *s*, respectively. *M*(*t*, *s*) is formulated as follows:

$$M(t, s) = \min_{i \in t,\; j \in s} d(i, j)$$

Here, we assume that each peak in the fitness landscape has a cone shape. Therefore, the search area of a population *s* can be defined as a circle, and accordingly, its radius can be calculated as:

$$radius(s) = \frac{1}{|s|} \sum_{i \in s} d(i, s_{center})$$

where *s*_{center} is the central position of population *s* and |*s*| is the number of individuals in *s*. Note that, in this paper, the best individual of population *s* will be replaced with *s*_{center} if *s*_{center} is better than the best individual of population *s*.

Given an initial population with a number of individuals uniformly distributed in the fitness landscape, the clustering method works as follows. It first creates a list *G* of clusters with each cluster containing only one individual. Then, in each iteration, it finds the closest pair of clusters *t* and *s* among those pairs whose total number of individuals is not greater than a prefixed maximum subpopulation size, and, if successful, combines *t* and *s* into one cluster. This iteration continues until each cluster in *G* contains more than one individual. Finally, the cluster list *G* is appended to a global population list, which is empty initially. As a result, we obtain a certain number of populations that do not overlap each other.
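The clustering steps can be sketched as below. This is a minimal illustration, not the authors' implementation: individuals are assumed to be coordinate tuples, the names `cluster_distance`, `single_linkage_clusters`, and `max_subsize` are hypothetical, and stopping when no admissible pair remains is an assumption added so that the loop always terminates.

```python
import math


def cluster_distance(t, s):
    """Single-linkage distance: distance of the two closest members."""
    return min(math.dist(a, b) for a in t for b in s)


def single_linkage_clusters(points, max_subsize):
    """Merge closest cluster pairs while respecting a size cap.

    points:      list of coordinate tuples (assumed representation).
    max_subsize: maximum number of individuals allowed in one cluster.
    """
    # Start with one singleton cluster per individual.
    clusters = [[p] for p in points]
    # Continue while any cluster still contains a single individual.
    while any(len(c) == 1 for c in clusters):
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Skip pairs whose merged size would exceed the cap.
                if len(clusters[a]) + len(clusters[b]) > max_subsize:
                    continue
                d = cluster_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best is None:
            break  # assumption: stop if no admissible pair is left
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

The quadratic pair search is kept for readability; for the population sizes discussed here that cost is small compared with fitness evaluations.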

#### 3.1.2 Overlapping Detection Scheme

Generally speaking, overcrowded populations on a single peak should not be allowed, as computational resources are wasted when redundant individuals search the same peak. Overlapped populations searching on different peaks should be allowed, in order to encourage populations to track as many promising peaks as possible. In order to detect whether two populations involve a real overcrowding or overlapping situation, we adopt the following method introduced in Yang and Li (2010). If two populations *t* and *s* are within each other’s search area, an overlapping ratio between them is calculated as follows: we first calculate the percentage of individuals in *t* that are within the search area of *s* and the percentage of individuals in *s* that are within the search area of *t*, and then set the overlapping ratio to the smaller of these two percentages. The two populations *t* and *s* are combined only when the overlapping ratio is greater than a threshold value. In the combination process, if the number of individuals in the combined population is greater than the maximum subpopulation size, only that many of the best individuals are kept. It should be noted that the radius of *s* and *t* used in the overlapping check operation is their initial radius when *s* and *t* are first created by the clustering method rather than their current radius. It should also be noted that this method does not guarantee that every detection is able to identify a real overcrowding or overlapping situation.

In this paper, if the radius of a population is less than a small threshold value, the population is regarded as converged. A converged population will be removed from the population list, but its best individual is kept in a separate list.
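The overlapping ratio can be sketched as follows. This is an illustrative reading of the description, assuming that each search area is a ball given by a center and a radius; the function name and argument layout are hypothetical.

```python
import math


def overlap_ratio(t, s, t_center, s_center, t_radius, s_radius):
    """Smaller of the two cross-membership fractions between t and s.

    t, s:               lists of coordinate tuples (assumed representation).
    t_center, s_center: centers of the two search areas.
    t_radius, s_radius: radii of the two search areas (the text uses the
                        radii recorded when the populations were created).
    """
    # Fraction of t's individuals lying inside the search area of s.
    frac_t_in_s = sum(math.dist(x, s_center) <= s_radius for x in t) / len(t)
    # Fraction of s's individuals lying inside the search area of t.
    frac_s_in_t = sum(math.dist(x, t_center) <= t_radius for x in s) / len(s)
    return min(frac_t_in_s, frac_s_in_t)
```

Taking the smaller fraction makes the test conservative: two populations are merged only if each one substantially sits inside the other's area, which is less likely when they are tracking different nearby peaks.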
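A minimal sketch of the convergence check is given below. It assumes that the radius of a population is the mean distance of its individuals to their arithmetic-mean center and that fitness is maximized; the names `population_radius`, `remove_converged`, and `conv_threshold` are hypothetical.

```python
import math


def population_radius(pop):
    """Mean distance of individuals to the population center (assumed definition)."""
    dims = len(pop[0])
    center = tuple(sum(x[d] for x in pop) / len(pop) for d in range(dims))
    return sum(math.dist(x, center) for x in pop) / len(pop)


def remove_converged(populations, fitness, conv_threshold):
    """Split populations into active ones and archived best individuals.

    populations:    list of populations, each a list of coordinate tuples.
    fitness:        callable mapping an individual to a value (higher is better).
    conv_threshold: radius below which a population counts as converged.
    """
    active, archived_best = [], []
    for pop in populations:
        if population_radius(pop) < conv_threshold:
            # Drop the converged population but remember its best individual.
            archived_best.append(max(pop, key=fitness))
        else:
            active.append(pop)
    return active, archived_best
```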

### 3.2 Multipopulation Adaptation

Diversity loss is one of the major issues in applying EAs to solve DOPs (Blackwell, 2007). Multipopulation adaptation is an alternative approach to addressing the diversity loss issue: it aims to adaptively maintain the population diversity at the multipopulation level. To achieve this aim, two issues should be addressed: when to increase the population diversity once it gets low, and how many populations (created in this paper by clustering random individuals) should be introduced.

#### 3.2.1 The Moment to Increase Diversity

In order to illustrate how to find the proper moment to increase the population diversity at the multipopulation level, we carried out a preliminary experimental study on the MPB with the default settings (see Table 2 in Section 4.1.1) based on a nonadaptive algorithm introduced above, the CPSO algorithm (Yang and Li, 2010). The algorithm is informed when a change occurs, and at the same time the number of individuals is restored to its initial size. Then, the clustering method is applied to create populations. In the study, the initial number of individuals and the maximum subpopulation size were set to the suggested values of 100 and 7, respectively. Figure 1 presents the progress of the number of populations, the average radius, the number of peaks tracked, and the best error across five changes over a typical run. A peak is assumed to be tracked/found with the MPB if the distance from any individual to the peak is less than 0.01. Here, apart from current populations, converged populations are also counted along with the number of populations to show the converging behavior in each environment. The best error is the fitness difference between the best solution found since the last change and the global optimum.

From the top graph in Figure 1, the number of populations decreases as the search goes on in each environment due to the overlapping detection scheme introduced above. It eventually stays at a certain level in all five environments. Similar observations can be seen in the curves of average radius and best error. The number of peaks tracked increases as the search goes on in each environment. When the number of populations does not change, all of the populations enter a stable status, that is, all populations converge on different peaks. As a result, new peaks can no longer be found.

This can be validated from the results of the average radius and the number of peaks tracked. From the figure, the corresponding average radius of all populations almost drops to zero in the first two environments after the number of populations converges. For the other three environments, the corresponding average radius also decreases to very small values compared with the initial values. The number of peaks tracked no longer increases at a certain time after the number of populations becomes stable. This observation is an important clue indicating that when the number of populations converges, it is a proper moment to increase diversity. For example, in Figure 1, 3k is such a proper moment to increase the diversity by introducing new individuals as the drop rate of the number of populations decreases to almost zero and the number of peaks being tracked also converges. From that moment, as stated above, no new peaks can be found if no new individuals are introduced. Therefore, it is necessary to introduce new random individuals to explore new promising peaks, whether the environment changes or not.

The moment to increase diversity is identified by Equation (6), which monitors the drop rate of the number of populations over a recent period: diversity is increased when the drop rate at time *t* (measured in the number of fitness evaluations) falls below a threshold, where the length of the monitored period is a new trace gap parameter introduced in this paper. Note that although the drop rate decreases to zero as overlapping populations are gradually removed in each environment, as shown in Figure 1, we should not use zero as the threshold of the drop rate. There are several reasons for this. Firstly, the overlapping detection scheme cannot guarantee to detect and remove all overlapping populations as the search goes on, due to the difficulties stated above. Secondly, the evolutionary status of the populations at a given time may differ, since new populations are added repeatedly at each diversity increasing point. This makes it more difficult to detect overlapping populations. Thirdly, populations that have converged are removed in this paper. Because converged populations are removed at unknown time points, the threshold value for the drop rate cannot be zero. It should also be noted that the choice of the threshold of the drop rate affects the choice of the trace gap, and vice versa (the sensitivity of the trace gap will be studied later in Section 4.2.1). Although the threshold of the drop rate should not be zero, it should clearly be a very small value. Based on the above considerations, we use 0.002 as the threshold value for the drop rate in this paper (the choice was also based on our experimental results). Further, fixing the threshold makes it easy to perform the sensitivity analysis of the trace gap later.

It should be noted that the monitoring operation on the drop rate of the number of populations starts over once new random individuals are introduced, that is, populations evolve for at least one trace gap (in fitness evaluations) after an operation to increase diversity. To achieve this, a queue can be used to store relevant information at each iteration, including the number of populations and the number of fitness evaluations. We keep pushing the relevant information into the back of the queue at each iteration. An element is popped from the queue if the time difference between the front and back elements is larger than the trace gap. This way, the moment to increase population diversity can be identified by checking the difference in the number of populations between the front and back elements. The queue is cleared once new individuals are introduced, and the monitoring starts over.
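The queue-based monitoring can be sketched in Python as follows. This is only an illustration under stated assumptions: the drop rate is taken to be the decrease in the number of populations divided by the elapsed fitness evaluations over a window of at least one trace gap, and the class name `DropRateMonitor` and its methods are hypothetical. The default values of 1,500 and 0.002 are the trace gap and the drop-rate threshold reported in this paper.

```python
from collections import deque


class DropRateMonitor:
    """Sliding-window monitor for the drop rate of the number of populations."""

    def __init__(self, trace_gap=1500, threshold=0.002):
        self.trace_gap = trace_gap    # window length in fitness evaluations
        self.threshold = threshold    # drop-rate threshold
        self.history = deque()        # entries: (evaluations, number of populations)

    def record(self, evals, num_pops):
        """Push the current state; return True if diversity should be increased."""
        self.history.append((evals, num_pops))
        # Drop front entries while a newer entry is still at least one gap old.
        while len(self.history) > 1 and evals - self.history[1][0] >= self.trace_gap:
            self.history.popleft()
        front_evals, front_pops = self.history[0]
        span = evals - front_evals
        if span < self.trace_gap:
            return False              # not enough history since the last reset
        drop_rate = (front_pops - num_pops) / span   # assumed form of the drop rate
        return drop_rate < self.threshold

    def reset(self):
        """Clear the queue after new random individuals are introduced."""
        self.history.clear()
```

With these defaults, a window of 1,500 evaluations in which the number of populations falls by fewer than three triggers a diversity increase, after which `reset()` restarts the monitoring.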

#### 3.2.2 Adaptation of the Number of Populations

Another issue of population adaptation is how many random populations should be introduced when the population diversity needs to be increased. Intuitively, the optimal number of populations needed is related to the number of peaks in the fitness landscape. However, the relationship between them is hard to know even if we have prior knowledge of the number of peaks. And it will become harder to get such relationship in a situation where the number of peaks fluctuates. To address this issue, we introduce another rule. In order to explain our idea, we again conducted a preliminary experimental study on the MPB with different numbers of peaks over 100 changes with the CPSO algorithm (Yang and Li, 2010) in this section. For CPSO, the same parameter values were used as in Section 3.2.1.

Table 1 presents the average number of populations at the time before a change occurs over 30 runs. From Table 1, the average number of populations increases monotonically from 5 to 14 as the number of peaks increases from 5 to 100, even though the same number of individuals is used in all cases. Therefore, our idea is to use the changes in the number of populations to guide the decision on the number of populations to be introduced and hence to adapt the number of populations to changes where the number of peaks is unknown.

Number of peaks | 5 | 10 | 20 | 30 | 50 | 100 |
---|---|---|---|---|---|---|
Populations | 5.0 | 7.3 | 9.6 | 10.8 | 11.9 | 14.0 |

In AMSO, the number of populations to be increased depends on the number of random individuals to be generated for clustering. The number of random individuals to be generated (and hence the number of populations to be increased) is estimated as follows. Whenever a moment to increase diversity is identified by Equation (6), we compare the number of populations at the current increasing point with the number of populations at the last increasing point. If the current number is larger than the previous number, the total number of individuals (and hence the number of random individuals to be generated) will be increased; otherwise, if the current number is less than the previous number by a certain amount (which is set to three in this paper), the number of individuals is decreased in comparison with the number of individuals at the last diversity increasing point. In our experiments, we found that decreasing the number of individuals as soon as the current number of populations is less than the previous one would sometimes lead to a wrong decision. This is because a few peaks sometimes become invisible in the fitness landscape when changes occur, which has the same effect as the number of peaks actually being reduced. Once a wrong decision is made to decrease the number of individuals, it dramatically affects the performance in locating and tracking multiple peaks, as only a few peaks can be located and tracked with a small number of populations. However, a wrong decision to increase the number of individuals does not affect the performance too much, as the tracking will not be lost. Therefore, we apply a stricter condition on decreasing the number of individuals than on increasing it.

After all the conditions are checked, an estimated number of individuals for the following search is obtained using Algorithm 1. The number of individuals to be increased or decreased is determined by the difference between the current and previous numbers of populations. The larger this difference, the larger the number of individuals that will be added or removed, where the number is estimated as the base step size multiplied by the difference (see steps 5 and 8 in Algorithm 1). In this way, the number of populations can adapt to changes according to the feedback information of all the populations. Note that the optimal number of populations for each environment is not guaranteed.
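The rule described above can be sketched as follows. Algorithm 1 itself is not reproduced here, so this is a hedged reading of the prose rather than the algorithm as published: the function name, the strict inequality used for the decrease condition, and the behavior when neither condition holds (keep the previous number) are assumptions. The default values are taken from Table 3, and the final clamping follows the amendment step described in Section 3.3.

```python
def estimate_individuals(prev_individuals, cur_pops, prev_pops,
                         step=10, dec_threshold=3,
                         min_individuals=70, max_individuals=300):
    """Estimate the total number of individuals for the following search."""
    diff = cur_pops - prev_pops
    if diff > 0:
        # More populations survived than last time: explore with more individuals.
        estimate = prev_individuals + step * diff
    elif -diff > dec_threshold:
        # Assumption: decrease only when the drop exceeds the decrease threshold.
        estimate = prev_individuals - step * (-diff)
    else:
        # Assumption: otherwise keep the previous number of individuals.
        estimate = prev_individuals
    # Amend the estimate if it leaves the allowed range.
    return max(min_individuals, min(max_individuals, estimate))
```

For instance, with 100 individuals at the last increasing point, a rise from 10 to 12 populations gives 120 individuals, whereas a fall from 10 to 8 populations leaves the number unchanged because the decrease condition is stricter.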

We reiterate that the aim of this paper is to locate and track as many promising peaks as possible via multipopulation methods where each population locates a single peak and tracks its movement. The two issues discussed above are challenging, because two tradeoffs must be considered. One tradeoff is between the frequency of increasing populations and exploitation, and the other is between the number of populations to be increased and exploitation. Increasing populations frequently or increasing a large number of populations at each increasing moment is helpful to explore more promising peaks. However, increasing populations too frequently or increasing too many populations at each moment is harmful for populations to carry out exploitation since there are limited computational resources (i.e., evaluations) available before a change occurs.

### 3.3 Algorithm Implementation by Particle Swarm Optimization

In PSO, each particle *i* (a candidate solution) is represented by a position vector $\mathbf{x}_i$ and a velocity vector $\mathbf{v}_i$, which are updated in the version of PSO with an inertia weight (Shi and Eberhart, 1998) as follows:

$$v_i'^{d} = \omega v_i^{d} + \eta_1 r_1 \left(x_{pbest_i}^{d} - x_i^{d}\right) + \eta_2 r_2 \left(x_{gbest}^{d} - x_i^{d}\right)$$

$$x_i'^{d} = x_i^{d} + v_i'^{d}$$

where $x_i'^{d}$ and $x_i^{d}$ represent the current and previous position in the *d*th dimension of particle *i*, respectively; $v_i'^{d}$ and $v_i^{d}$ are the current and previous velocity of particle *i*, respectively; $\mathbf{x}_{pbest_i}$ and $\mathbf{x}_{gbest}$ are the best position found by particle *i* so far and the best position found by the whole swarm so far, respectively; $\omega$, $\eta_1$, and $\eta_2$ are constant parameters; and $r_1$ and $r_2$ are random numbers generated uniformly in the interval [0, 1]. Note that the maximum velocity of each particle is set to the initial search radius of its swarm.
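As an illustration, the standard inertia-weight update of Shi and Eberhart (1998) can be written in a few lines of Python. The function name and argument names are illustrative; the default inertia weight (0.6) and acceleration constants (1.7) are the values listed in Table 3, and the optional velocity clamp reflects the maximum velocity mentioned above.

```python
import random


def pso_step(position, velocity, pbest, gbest,
             omega=0.6, eta1=1.7, eta2=1.7, v_max=None, rng=random):
    """One inertia-weight PSO update for a single particle.

    Returns the new position and velocity as lists; inputs are not modified.
    """
    new_position, new_velocity = [], []
    for d in range(len(position)):
        r1, r2 = rng.random(), rng.random()   # uniform in [0, 1)
        v = (omega * velocity[d]
             + eta1 * r1 * (pbest[d] - position[d])
             + eta2 * r2 * (gbest[d] - position[d]))
        if v_max is not None:
            v = max(-v_max, min(v_max, v))    # clamp to the maximum velocity
        new_velocity.append(v)
        new_position.append(position[d] + v)
    return new_position, new_velocity
```

When a particle already sits on both its personal best and the swarm best, the attraction terms vanish and only the inertia term moves it, which is a convenient sanity check for an implementation.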

Once a particle, say particle *i*, gets improved, we iteratively check each dimension *d* of the best particle of the swarm and replace that dimension with the corresponding dimensional value of particle *i* with a probability *p*_{d} if the best particle is improved by doing so. The value of *p*_{d} is calculated as described in Algorithm 2. The introduction of the heuristic learning probability greatly saves function evaluations. This way, the best particle is able to learn useful information from those dimensions of a particle that has been improved.
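The dimension-wise learning step can be sketched as below. This is a rough illustration under explicit assumptions: the target of the learning is taken to be the swarm's best position, the problem is treated as maximization, and the learning probability `p_d` is passed in as a plain argument because its formula (Algorithm 2) is not reproduced here. The function name `learn_from` is hypothetical.

```python
import random


def learn_from(improved, best, fitness, p_d, rng=random):
    """Try to improve `best` by copying single dimensions from `improved`.

    Each dimension is tested with probability p_d, and a copied value is
    kept only if it raises the fitness (maximization is assumed).
    Every trial costs one fitness evaluation.
    """
    current = list(best)
    current_fit = fitness(current)
    for d in range(len(current)):
        if rng.random() >= p_d or current[d] == improved[d]:
            continue                      # skip this dimension
        trial = list(current)
        trial[d] = improved[d]
        trial_fit = fitness(trial)
        if trial_fit > current_fit:       # keep only improving replacements
            current, current_fit = trial, trial_fit
    return current, current_fit
```

A smaller `p_d` tests fewer dimensions and therefore spends fewer fitness evaluations, which is the saving that the heuristic learning probability is meant to provide.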

To implement AMSO with the ideas proposed above, we use the improved PSO with learning as a local search method for each population. Algorithm 4 summarizes the framework of AMSO. Initially, populations are obtained by clustering an initial random swarm. In the evolutionary process, all populations use the improved PSO (see Algorithm 3) to locate different optima simultaneously. Then, they undergo the overlapping and convergence check process, where redundant populations are removed. Before converged populations are discarded, their best individuals are saved into a list for later use. To increase the population diversity at a proper moment, Equation (6) is applied at every iteration to identify that moment. If the proper moment is found, an expected number of individuals is estimated. After that, the estimated figure is amended if it goes beyond the range of the maximum and minimum number of individuals. Finally, a random immigrants scheme is applied to introduce new random populations, which are obtained by clustering a temporary random population with the estimated number of individuals together with the members of the saved list.

## 4 Experimental Study

In this section, two groups of experiments are carried out to investigate the performance of the AMSO algorithm. The aim of the first group is to investigate the adaptability of AMSO in different perspectives in dynamic environments based on the MPB. In the second group of experiments, 12 multipopulation-based EAs are selected from the research areas of PSO, DE, GA, and hybrid algorithms. They are mCPSO (Blackwell and Branke, 2006), mQSO (Blackwell and Branke, 2006), SAMO (Blackwell, 2007), SPSO (Parrott and Li, 2006), rSPSO (Bird and Li, 2007), CPSO (Yang and Li, 2010), CPSOR (Li and Yang, 2012), and HmSO (Kamosi et al., 2010) from PSO, DynDE (Mendes and Mohais, 2005) and DynPopDE (du Plessis and Engelbrecht, 2012a) from DE, SOS (Branke et al., 2000) from GA, and ESCA (Lung and Dumitrescu, 2010) from the hybridization of DE and PSO. Comparison is conducted based on the MPB problem (Branke, 1999).

In order to use exactly the same fitness landscapes across all environmental changes for a fair comparison, all the peer algorithms involved in this paper were carefully implemented and examined according to their origins where they were proposed. Note that the PSO-CP algorithm has also been implemented, but the results could not be replicated and this algorithm is therefore omitted from the comparison.

### 4.1 Experimental Setup

#### 4.1.1 The MPB Problem

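For a *D*-dimensional landscape, the MPB problem (Branke, 1999) is defined as follows:

$$F(\mathbf{x}, t) = \max_{i=1,\ldots,p} \frac{H_i(t)}{1 + W_i(t) \sum_{j=1}^{D} \left(x_j - X_{ij}(t)\right)^2}$$

where $H_i(t)$ and $W_i(t)$ are the height and width of peak *i* at time *t*, respectively, and $X_{ij}(t)$ is the *j*th element of the location of peak *i* at time *t*. The *p* independently specified peaks are blended together by the max function. The position of each peak is shifted in a random direction by a vector $\mathbf{v}_i$ of a distance *s* (*s* is also called the shift length, which determines the severity of the problem dynamics), and the move of a single peak can be described as follows:

$$\mathbf{v}_i(t) = \frac{s}{\left|\mathbf{r} + \mathbf{v}_i(t-1)\right|} \left((1 - \lambda)\mathbf{r} + \lambda \mathbf{v}_i(t-1)\right)$$

where the shift vector $\mathbf{v}_i(t)$ is a linear combination of a random vector $\mathbf{r}$ and the previous shift vector $\mathbf{v}_i(t-1)$ and is normalized to the shift length *s*. The correlated parameter $\lambda$ is set to 0, which implies that the peak movements are uncorrelated.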

Note that different from the traditional MPB problem (Branke, 1999), two new features are introduced to make it more difficult to solve in this paper, namely, changes in the number of peaks, and changes in a part of the fitness landscape.

**Changes in the number of peaks.** The number of peaks is allowed to change to evaluate the performance of multipopulation methods in terms of the adaptation of the number of populations. If this feature is enabled, the number of peaks changes using one of the following formulas: where if , if , and the initial value of is one; *rand*(*a*, *b*) returns a random value in .

**Changes in a part of the fitness landscape.** A ratio of changing peaks to the total number of peaks is also introduced. This feature may cause algorithms that are based on change detection to fail.

The default settings and definition of the benchmark problem used in the experiments of this paper can be found in Table 2. The new features introduced above are disabled by default unless explicitly stated otherwise in this paper.

Parameter | Value |
---|---|
Number of peaks (*p*) | 10 |
Change frequency (u) | 5,000 function evaluations |
Height severity | 7.0 |
Width severity | 1.0 |
Peak shape | Cone |
Basic function | No |
Shift length (s) | 1.0 |
Number of dimensions (D) | 5 |
Correlation coefficient | 0 |
Percentages of changing peaks | 1.0 |
Noise | No |
Time-linkage | No |
Number of peaks change | No |
S | [0, 100] |
H | [30.0, 70.0] |
W | [1, 12] |
I | 50.0 |

#### 4.1.2 Performance Evaluation

Two performance measures are used in this paper. They are the offline error (*E*_{offline}) (Branke and Schmeck, 2003) and the best-before-change error (*E*_{BBC}). The offline error is the average of the best error found at each fitness evaluation. The best-before-change error is the average of the best error achieved at the fitness evaluation just before a change occurs.
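Both measures can be computed from a single trace of errors, as in the following Python sketch. It assumes that `errors[k]` holds the error of the solution evaluated at the (k+1)th fitness evaluation, that the environment changes every `change_frequency` evaluations, and that the trace covers a whole number of environments; the function name is illustrative.

```python
def error_measures(errors, change_frequency):
    """Return (offline error, best-before-change error) for one run."""
    if not errors or len(errors) % change_frequency != 0:
        raise ValueError("trace must cover a whole number of environments")
    best_so_far_sum = 0.0
    before_change = []
    best = float("inf")
    for k, err in enumerate(errors, start=1):
        best = min(best, err)             # best error since the last change
        best_so_far_sum += best
        if k % change_frequency == 0:     # last evaluation before a change
            before_change.append(best)
            best = float("inf")           # restart in the new environment
    offline = best_so_far_sum / len(errors)
    bbc = sum(before_change) / len(before_change)
    return offline, bbc
```

For the toy trace `[5, 3, 4, 6, 2, 7]` with a change every three evaluations, the best-so-far sequence is 5, 3, 3, 6, 2, 2, so the offline error is 3.5 and the best-before-change error is 2.5.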

#### 4.1.3 *t*-Test Comparison

To compare the performance of two algorithms at the statistical level, a two-tailed *t*-test with 58 degrees of freedom at a 0.05 level of significance was conducted between AMSO and each peer algorithm. The *t*-test result is given together with the average score value as a superscript letter *w*, *l*, or *t*, which denotes that the performance of AMSO is significantly better than, significantly worse than, or statistically equivalent to that of its peer algorithm, respectively.
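A comparison of this kind can be sketched with the Python standard library. The sketch assumes a pooled-variance (Student) *t*-test, which is what 58 degrees of freedom imply for two samples of 30 runs each, and it hard-codes the two-tailed 0.05 critical value for 58 degrees of freedom (approximately 2.0017). The function name and the mapping of the sign of the statistic to the letters are illustrative, with smaller errors counted as better.

```python
import math
from statistics import mean, variance

T_CRIT_DF58 = 2.0017  # approximate two-tailed 0.05 critical value, 58 d.o.f.


def compare_errors(amso_errors, peer_errors, t_crit=T_CRIT_DF58):
    """Return 'w', 'l', or 't' for AMSO versus a peer (smaller error is better).

    The default critical value is only valid for two samples of 30 runs.
    """
    n1, n2 = len(amso_errors), len(peer_errors)
    pooled = (((n1 - 1) * variance(amso_errors)
               + (n2 - 1) * variance(peer_errors)) / (n1 + n2 - 2))
    t_stat = ((mean(amso_errors) - mean(peer_errors))
              / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2)))
    if abs(t_stat) <= t_crit:
        return "t"                        # statistically equivalent
    return "w" if t_stat < 0 else "l"     # AMSO better / AMSO worse
```

For other numbers of runs, the critical value would need to be replaced by the one matching the corresponding degrees of freedom.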

#### 4.1.4 Configurations of AMSO

In AMSO, the number of populations and the moments to increase diversity are adaptive. However, in order to make AMSO adaptable to changes, several nonadaptive parameters are also introduced. Table 3 lists all the parameters of AMSO. Note that the constant values of most nonadaptive parameters were chosen through a systematic experimental study and are reasonable. For example, the convergence threshold radius is small enough for checking whether a population has converged. Making the parameters of PSO (the inertia weight and the two acceleration constants) adaptive may be helpful in dynamic environments. However, we do not investigate this aspect as it is not the main objective of this paper. To start the AMSO algorithm, the initial population size was set to 100 in all experiments unless otherwise stated. All the results obtained on the MPB problem are averaged over 30 independent runs.

Parameter | Type | Value |
---|---|---|
Overlapping ratio | Constant | 0.5 |
Convergence threshold | Constant | 1 |
Trace gap | Constant | 1,500 evaluations |
Population adjustment step size | Constant | 10 individuals |
Population decrease threshold | Constant | 3 |
Maximum individuals | Constant | 300 |
Minimum individuals | Constant | 70 |
Maximum individuals in a sub-pop | Constant | 7 |
PSO: inertia weight | Constant | 0.6 |
PSO: acceleration constants (both) | Constant | 1.7 |
Initial population size | Variable | 100 |
Number of populations | Adaptive | — |
Number of total individuals | Adaptive | — |
Population radius | Variable | — |
Frequency of diversity increase | Adaptive | — |

All the peer algorithms use the suggested configurations from the papers where they were proposed on the MPB problem. Table 4 presents the configurations regarding the population radius and the number of populations for all the involved algorithms. Note that the population radius is not applicable for ESCA.

Algorithm | Radius | Number of populations |
---|---|---|
SOS (Branke et al., 2000) | Constant | Variable |
DynDE (Mendes and Mohais, 2005) | Constant (31.5: Eq. (1)) | Constant (10) |
mCPSO (Blackwell and Branke, 2006) | Constant (31.5: Eq. (1)) | Constant (10) |
mQSO (Blackwell and Branke, 2006) | Constant (31.5: Eq. (1)) | Constant (10) |
SPSO (Parrott and Li, 2006) | Constant (30) | Variable |
rSPSO (Bird and Li, 2007) | Constant (30) | Variable |
SAMO (Blackwell, 2007) | Variable | Adaptive |
ESCA (Lung and Dumitrescu, 2010) | N/A | Constant (3) |
CPSO (Yang and Li, 2010) | Variable | Roughly constant (70/3) |
HmSO (Kamosi et al., 2010) | Constant (30) | Variable |
CPSOR (Li and Yang, 2012) | Variable | Roughly constant (Eq. (2)) |
DynPopDE (du Plessis and Engelbrecht, 2012a) | Variable | Adaptive |
AMSO (present work) | Variable | Adaptive |

### 4.2 Experimental Investigation of AMSO

In this section, the performance of AMSO is investigated with regard to several aspects, including the number of populations in dynamic environments with a variable number of peaks, the sensitivity to the trace gap parameter, and the ability to locate and track multiple peaks.

#### 4.2.1 Sensitivity Analysis of the Trace Gap Parameter

From Equation (6), the frequency of increasing diversity depends on the trace gap once the threshold of the drop rate is fixed. On the one hand, to make sure that the necessary population diversity is always guaranteed, the trace gap should not be too large, as a future change may take place at any time. On the other hand, the current check is already a postponed operation, because we have to give populations enough time to evolve into the converging status in order to achieve a precise estimation; therefore, the trace gap should not be too small either. To find a good value, we carried out an experimental study of AMSO with different trace gap values on the MPB problem with different numbers of peaks. Table 5 presents the offline errors, the best-before-change errors, the number of diversity increases per change, and the number of peaks tracked by AMSO over 30 runs.

Peaks | Measure (by trace gap) | 100 | 300 | 500 | 700 | 1,000 | 1,500 | 2,000 | 2,500 | 3,000 | 3,500 | 4,000 | 4,500 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | E_{offline} | 3.91^{w} | 3.96^{w} | 4.61^{w} | 4.56^{w} | 3.66^{w} | 2.45^{w} | 1.6 | 1.82^{t} | 1.69^{t} | 2^{w} | 1.87^{t} | 1.93^{w} |
1 | E_{BBC} | 0.0407 | 0.0691^{w} | 0.117^{w} | 0.168^{w} | 0.22^{w} | 0.13^{w} | 0.098^{w} | 0.114^{w} | 0.0687^{w} | 0.0652^{w} | 0.0479^{t} | 0.0542^{t} |
1 | Diversity increases per change | 5.29 | 4.35 | 3.17 | 2.68 | 2.45 | 2.11 | 1.92 | 1.71 | 1.53 | 1.44 | 1.36 | 1.28 |
1 | Peaks tracked | 0.99 | 0.981 | 0.97 | 0.955 | 0.941 | 0.965 | 0.974 | 0.971 | 0.981 | 0.982 | 0.986 | 0.984 |
2 | E_{offline} | 2.72^{w} | 2.55^{w} | 2.6^{w} | 2.47^{w} | 2.2 | 2.48^{w} | 2.53^{w} | 2.76^{w} | 2.91^{w} | 2.81^{w} | 2.6^{w} | 2.77^{w} |
2 | E_{BBC} | 0.791^{w} | 0.745^{w} | 0.611^{t} | 0.66^{t} | 0.503 | 0.707^{t} | 0.96^{w} | 1.18^{w} | 1.41^{w} | 1.24^{w} | 1.18^{w} | 1.28^{w} |
2 | Diversity increases per change | 5.2 | 4.15 | 3.25 | 2.71 | 2.28 | 1.74 | 1.66 | 1.5 | 1.36 | 1.22 | 1.14 | 1.06 |
2 | Peaks tracked | 1.76 | 1.79 | 1.79 | 1.79 | 1.81 | 1.78 | 1.58 | 1.5 | 1.46 | 1.47 | 1.45 | 1.42 |
7 | E_{offline} | 1.32^{w} | 1.38^{w} | 1.26^{t} | 1.33^{w} | 1.24 | 1.37^{w} | 1.4^{w} | 1.35^{w} | 1.4^{w} | 1.38^{w} | 1.35^{w} | 1.39^{w} |
7 | E_{BBC} | 0.223^{w} | 0.318^{w} | 0.203^{t} | 0.274^{w} | 0.153 | 0.231^{w} | 0.238^{w} | 0.212^{t} | 0.241^{w} | 0.267^{w} | 0.248^{t} | 0.298^{w} |
7 | Diversity increases per change | 5.49 | 4.2 | 3.62 | 2.88 | 2.36 | 1.69 | 1.38 | 1.21 | 1.08 | 1.01 | 0.94 | 0.862 |
7 | Peaks tracked | 5.95 | 5.93 | 6.06 | 6.01 | 6.34 | 6.3 | 6.28 | 6.28 | 6.35 | 6.24 | 6.31 | 6.25 |
10 | E_{offline} | 1.22^{t} | 1.21^{t} | 1.21 | 1.24^{t} | 1.25^{t} | 1.4^{w} | 1.43^{w} | 1.38^{w} | 1.36^{w} | 1.36^{w} | 1.43^{w} | 1.39^{w} |
10 | E_{BBC} | 0.109^{t} | 0.123^{t} | 0.108^{t} | 0.131^{t} | 0.107 | 0.13^{t} | 0.196^{w} | 0.146^{t} | 0.151^{t} | 0.175^{w} | 0.196^{w} | 0.196^{w} |
10 | Diversity increases per change | 2.44 | 2.19 | 2.19 | 2.34 | 1.73 | 1.53 | 1.35 | 1.21 | 1.12 | 1.03 | 0.956 | 0.892 |
10 | Peaks tracked | 9.22 | 9.21 | 9.21 | 9.29 | 9.25 | 9.25 | 9.12 | 9.23 | 9.18 | 9.07 | 9.08 | 9.09 |
20 | E_{offline} | 2.29^{w} | 2.12^{w} | 2.16^{w} | 2.02^{t} | 2.02^{t} | 1.97 | 2.04^{t} | 2.06^{t} | 2.04^{t} | 2.06^{t} | 2.02^{t} | 2.05^{t} |
20 | E_{BBC} | 1.47^{w} | 1.33^{w} | 1.34^{w} | 1.2^{w} | 1.17^{t} | 1.02 | 1.04^{t} | 1.03^{t} | 1.04^{t} | 1.09^{t} | 1.08^{t} | 1.13^{w} |
20 | Diversity increases per change | 1.25 | 1.1 | 1.12 | 1.16 | 1.08 | 1.29 | 1.16 | 1.06 | 0.983 | 0.96 | 0.897 | 0.875 |
20 | Peaks tracked | 10.9 | 11 | 11.1 | 11.7 | 12.3 | 13.5 | 13.7 | 13.8 | 13.7 | 13.6 | 13.3 | 13.5 |
30 | E_{offline} | 1.9^{w} | 1.96^{w} | 1.73^{w} | 1.68^{w} | 1.57^{w} | 1.48 | 1.54^{w} | 1.58^{w} | 1.59^{w} | 1.6^{w} | 1.55^{w} | 1.58^{w} |
30 | E_{BBC} | 0.971^{w} | 1.04^{w} | 0.792^{w} | 0.751^{w} | 0.617^{w} | 0.457 | 0.458^{t} | 0.49^{t} | 0.479^{t} | 0.541^{w} | 0.512^{w} | 0.527^{w} |
30 | Diversity increases per change | 0.918 | 0.784 | 0.904 | 0.943 | 0.783 | 1.11 | 1.11 | 1.02 | 0.945 | 0.93 | 0.891 | 0.857 |
30 | Peaks tracked | 14.6 | 13.8 | 15.2 | 15.7 | 16.9 | 18.9 | 19.2 | 19.1 | 19.4 | 18.7 | 18.7 | 18.7 |
50 | E_{offline} | 2.3^{w} | 2.34^{w} | 2.27^{w} | 2.11^{w} | 2.06^{w} | 1.95^{t} | 1.93^{t} | 1.9 | 1.91^{t} | 1.95^{w} | 1.92^{t} | 1.91^{t} |
50 | E_{BBC} | 1.33^{w} | 1.37^{w} | 1.29^{w} | 1.08^{w} | 1.04^{w} | 0.964^{t} | 0.913 | 0.928^{t} | 0.916^{t} | 0.968^{w} | 0.93^{t} | 0.947^{t} |
50 | Diversity increases per change | 0.89 | 0.755 | 0.872 | 0.885 | 0.884 | 1.04 | 1.22 | 1.09 | 1.04 | 0.982 | 0.943 | 0.877 |
50 | Peaks tracked | 16.6 | 15.9 | 16.7 | 18.6 | 19.6 | 20.7 | 21.9 | 21.3 | 21.4 | 21.2 | 21.1 | 21.1 |
100 | E_{offline} | 2.41^{w} | 2.57^{w} | 2.45^{w} | 2.43^{w} | 2.21^{w} | 2.12^{t} | 2.07^{t} | 2.1^{t} | 2.1^{t} | 2.06 | 2.15^{w} | 2.11^{t} |
100 | E_{BBC} | 1.47^{w} | 1.65^{w} | 1.55^{w} | 1.52^{w} | 1.26^{w} | 1.2^{w} | 1.11 | 1.16^{t} | 1.16^{t} | 1.16^{t} | 1.25^{w} | 1.22^{w} |
100 | Diversity increases per change | 1.07 | 0.888 | 0.912 | 0.917 | 1 | 1.13 | 1.24 | 1.1 | 0.962 | 1.04 | 0.877 | 0.893 |
100 | Peaks tracked | 18.7 | 17.3 | 18.1 | 19 | 21.6 | 22.4 | 23.4 | 22.9 | 22.8 | 22.1 | 21.4 | 21.4 |
200 | E_{offline} | 2.47^{w} | 2.47^{w} | 2.56^{w} | 2.43^{w} | 2.18^{w} | 1.94 | 2.08^{w} | 1.95^{t} | 2.06^{w} | 2.04^{t} | 2.06^{t} | 2.2^{w} |
200 | E_{BBC} | 1.53^{w} | 1.55^{w} | 1.62^{w} | 1.48^{w} | 1.26^{w} | 1.04 | 1.17^{w} | 1.05^{t} | 1.17^{w} | 1.13^{t} | 1.14^{w} | 1.28^{w} |
200 | Diversity increases per change | 0.736 | 0.737 | 0.629 | 0.65 | 0.921 | 1.17 | 0.968 | 1.04 | 0.847 | 0.808 | 0.691 | 0.559 |
200 | Peaks tracked | 18.1 | 17.6 | 17.7 | 19.1 | 21.5 | 24.1 | 22.2 | 23.1 | 21.6 | 21.6 | 22.4 | 20 |

From Table 5, the expected results can be observed, that is, the choice of the trace gap affects the performance of AMSO. A good choice of the trace gap seems to be instance-dependent. Based on the results, we suggest that a relatively large trace gap should be used for problems with a large number of optima, as AMSO achieves small *E*_{BBC} and *E*_{offline} errors with a large trace gap in most cases. In this paper, a trace gap of 1,500 (see Table 3) is used for the following experiments.

Two interesting results can also be observed from Table 5. Firstly, the average number of diversity increases per change generally decreases as the trace gap increases in cases where the number of peaks is less than 20. For the instances with a large number of peaks (e.g., 20 or more peaks), however, there is no such trend. For example, in the case with 200 peaks, the number of diversity increases per change decreases from 0.736 to 0.629 as the trace gap increases from 100 to 500, then it increases to 1.17 when the trace gap reaches 1,500, and then it decreases again as the trace gap increases further. Secondly, the larger the number of peaks tracked by AMSO, the better its performance. This is particularly evident in cases with many peaks. For example, the largest number of peaks tracked by AMSO in the 200-peak case is 24.1, which corresponds to the smallest offline error and the smallest best-before-change error. The explanation is that the more promising peaks an algorithm can track, the higher the probability that it will track the global optimum.

#### 4.2.2 Adaptation in the Number of Populations

Figure 2 compares the progress of the number of populations and the offline error between AMSO and three other algorithms (CPSOR, SAMO, and DynPopDE) on the MPB with different numbers of peaks. CPSOR is our previous algorithm, while SAMO and DynPopDE are two algorithms that adapt the number of populations. The optimal number of populations for a specific environment depends on the total number of peaks in the fitness landscape. Comparing the results of CPSOR and AMSO in the left graphs, we can see that AMSO shows much better adaptation capability than CPSOR. For example, the number of populations obtained by AMSO is slightly more than 10 in the 10-peak MPB case, whereas the number obtained by CPSOR is much larger than 10 over the whole run. In the case with 50 peaks, the number of populations achieved by CPSOR is similar to that of AMSO. For the instance with a variable number of peaks using Equation (14a), CPSOR shows hardly any adaptation in the number of populations, which even grows when the number of peaks drops. The number of populations obtained by AMSO, by contrast, generally changes in synchronization with the number of peaks.

Owing to this adaptation ability, AMSO performs much better than CPSOR in terms of the offline error. In the 10-peak MPB case, the average number of populations generated by CPSOR is about 25, which is much larger than the total number of peaks. Because the number of fitness evaluations in each change interval is limited, too many populations may be unable to exploit their local areas sufficiently before new populations are introduced. In this case, the offline error of CPSOR is much larger than that of AMSO, so its average offline error is also much worse (see the results in Table 6). The effect of the number of populations on the performance of AMSO and CPSOR can be further seen in the 50-peak MPB case. In the graph, CPSOR and AMSO use a similar number of populations after 250k evaluations, which leads them to achieve similar offline errors as well as similar best-before-change errors (see the results in Table 6). Again, in the case of a varying number of peaks, the gap between the offline errors of CPSOR and AMSO widens when the number of peaks reaches its lowest level, because CPSOR generates more populations than AMSO.
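The budget argument above can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the evaluations of one change interval are shared evenly among the populations and that a change occurs every 5,000 evaluations; neither assumption is stated in this section, and the function name is hypothetical.

```python
def evaluations_per_population(change_interval: int, num_populations: int) -> float:
    """Evaluations available to each population in one change interval,
    assuming (for illustration only) an even split of the budget."""
    if num_populations <= 0:
        raise ValueError("num_populations must be positive")
    return change_interval / num_populations


# Assumed change interval of 5,000 evaluations (illustrative value).
budget_10 = evaluations_per_population(5000, 10)  # about as many populations as peaks
budget_25 = evaluations_per_population(5000, 25)  # roughly CPSOR's average in the 10-peak case
print(budget_10, budget_25)
```

Under these assumptions, running about 25 populations instead of about 10 leaves each population well under half the evaluations per change interval, which is one way to read the larger offline error reported for CPSOR in the 10-peak case.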

| Peaks | Error | AMSO | CPSOR | CPSO | rSPSO | SPSO | mCPSO | mQSO | SAMO | DynDE | DynPopDE | ESCA | HmSO | SOS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | E_{offline} | 2.4 ± 0.71 | 6.1 ± 0.84 | 7.8 ± 1 | 4.1 ± 0.71 | 4.4 ± 0.72 | 41 ± 9.5 | 5.9 ± 0.98 | 4.2 ± 0.61 | 5.2 ± 1 | 0.44 ± 0.12 | 6.5 ± 0.93 | 4 ± 0.4 | 2.8 ± 1.6 |
| | E_{BBC} | 0.13 | 0.062 | 2.01 × | 1.6 | 1.9 | 39 | 3.2 | 2.9 | 1.3 | 0.034 | 4.9 | 2.2 | 1.5 |
| 2 | E_{offline} | 2.5 ± 0.37 | 5 ± 0.79 | 5.3 ± 0.45 | 2.6 ± 0.15 | 2.6 ± 0.2 | 14 ± 2.2 | 6.1 ± 0.46 | 2.9 ± 0.17 | 5.7 ± 0.41 | 1.2 ± 0.4 | 7.4 ± 0.8 | 2.6 ± 0.43 | 4.3 ± 0.93 |
| | E_{BBC} | 0.71 | 1.4 | 1.1 | 0.93 | 0.99 | 12 | 4.3 | 1.7 | 4.1 | 0.85 | 6.5 | 1.2 | 3.5 |
| 5 | E_{offline} | 1.6 ± 0.28 | 2.9 ± 0.34 | 4.2 ± 0.32 | 2.5 ± 0.29 | 2.5 ± 0.25 | 7.3 ± 1.2 | 2.6 ± 0.24 | 2.6 ± 0.17 | 1.9 ± 0.12 | 2 ± 0.68 | 13 ± 1.8 | 5.2 ± 0.75 | 6.5 ± 0.96 |
| | E_{BBC} | 0.42 | 0.66 | 1.1 | 1.5 | 1.5 | 6.3 | 1.6 | 1.6 | 1.1 | 1.7 | 13 | 4.4 | 6 |
| 7 | E_{offline} | 1.4 ± 0.15 | 3 ± 0.22 | 4 ± 0.28 | 2.3 ± 0.21 | 2.3 ± 0.17 | 5.3 ± 0.54 | 2.2 ± 0.11 | 2.2 ± 0.089 | 1.5 ± 0.064 | 1.6 ± 0.77 | 14 ± 1.8 | 3.9 ± 0.31 | 7 ± 1.5 |
| | E_{BBC} | 0.23 | 0.54 | 0.63 | 1.2 | 1.2 | 4.4 | 1.2 | 1.2 | 0.69 | 1.1 | 13 | 2.1 | 6.2 |
| 10 | E_{offline} | 1.4 ± 0.11 | 2.6 ± 0.2 | 4.5 ± 0.26 | 3.5 ± 0.41 | 3.6 ± 0.47 | 8.6 ± 1.1 | 2.8 ± 0.19 | 3 ± 0.15 | 1.5 ± 0.067 | 2.3 ± 0.79 | 15 ± 1.8 | 5.1 ± 0.31 | 8.6 ± 1.4 |
| | E_{BBC} | 0.13 | 0.36 | 1.3 | 2.2 | 2.3 | 7.7 | 1.7 | 2 | 0.68 | 1.7 | 14 | 3.6 | 7.8 |
| 20 | E_{offline} | 2 ± 0.19 | 2.6 ± 0.3 | 4 ± 0.16 | 4.3 ± 0.38 | 4.3 ± 0.38 | 8.6 ± 0.96 | 3.4 ± 0.24 | 3.2 ± 0.16 | 2.8 ± 0.61 | 2.3 ± 0.27 | 11 ± 1.5 | 4.2 ± 0.12 | 6.4 ± 0.98 |
| | E_{BBC} | 1 | 1 | 1.6 | 3.6 | 3.6 | 7.9 | 2.7 | 2.5 | 2.2 | 1.8 | 10 | 3.4 | 5.7 |
| 30 | E_{offline} | 1.5 ± 0.1 | 2 ± 0.14 | 3.5 ± 0.16 | 3.9 ± 0.25 | 4 ± 0.3 | 6.4 ± 0.72 | 3.8 ± 0.46 | 2.8 ± 0.1 | 3.1 ± 0.3 | 1.9 ± 0.28 | 9.9 ± 1.1 | 3.9 ± 0.11 | 6.1 ± 1.1 |
| | E_{BBC} | 0.46 | 0.6 | 1.3 | 3 | 3.1 | 5.7 | 2.9 | 2 | 2.5 | 1.4 | 9.1 | 3.3 | 5.3 |
| 50 | E_{offline} | 2 ± 0.16 | 2.4 ± 0.12 | 3.5 ± 0.13 | 4.3 ± 0.29 | 4.3 ± 0.31 | 6.4 ± 0.74 | 3.7 ± 0.19 | 3 ± 0.11 | 3.5 ± 0.28 | 2.1 ± 0.24 | 10 ± 1.6 | 4.1 ± 0.11 | 5.8 ± 1.2 |
| | E_{BBC} | 0.96 | 0.98 | 1.4 | 3.3 | 3.4 | 5.5 | 2.8 | 2.2 | 2.9 | 1.5 | 9.5 | 3.3 | 5 |
| 100 | E_{offline} | 2.1 ± 0.16 | 2.5 ± 0.1 | 3.2 ± 0.13 | 4.3 ± 0.3 | 4.4 ± 0.35 | 6.4 ± 0.46 | 4.2 ± 0.33 | 3.1 ± 0.11 | 3.9 ± 0.29 | 2.2 ± 0.33 | 11 ± 1.3 | 4.1 ± 0.11 | 6.1 ± 1 |
| | E_{BBC} | 1.2 | 1.2 | 1.5 | 3.5 | 3.5 | 5.6 | 3.4 | 2.4 | 3.2 | 1.6 | 9.6 | 3.6 | 5.3 |
| 200 | E_{offline} | 1.9 ± 0.17 | 2.3 ± 0.097 | 2.5 ± 0.091 | 4.5 ± 0.38 | 4.5 ± 0.47 | 6 ± 0.85 | 4.3 ± 0.39 | 2.9 ± 0.15 | 3.8 ± 0.35 | 2 ± 0.21 | 8.5 ± 0.7 | 3.4 ± 0.12 | 5.2 ± 0.89 |
| | E_{BBC} | 1 | 1.1 | 0.98 | 3.6 | 3.6 | 5.2 | 3.3 | 2.2 | 3.2 | 1.4 | 7.7 | 3 | 4.4 |
