Existing work on offline data-driven optimization mainly focuses on problems in static environments, and little attention has been paid to problems in dynamic environments. Offline data-driven optimization in dynamic environments is challenging because the distribution of the collected data varies over time, which requires both the surrogate models and the tracked optimal solutions to adapt over time. This paper proposes a knowledge-transfer-based data-driven optimization algorithm to address these issues. First, an ensemble learning method is adopted to train surrogate models that leverage the knowledge of data from historical environments while adapting to new environments. Specifically, given data in a new environment, a model is constructed with the new data, and the preserved models of historical environments are further trained with the new data. These models are then treated as base learners and combined into an ensemble surrogate model. After that, all base learners and the ensemble surrogate model are simultaneously optimized in a multitask environment to find optimal solutions of the real fitness functions. In this way, the optimization tasks in the previous environments can be used to accelerate the tracking of the optimum in the current environment. Since the ensemble model is the most accurate surrogate, we assign more individuals to the ensemble surrogate than to its base learners. Empirical results on six dynamic optimization benchmark problems demonstrate the effectiveness of the proposed algorithm compared with four state-of-the-art data-driven optimization algorithms. Code is available at https://github.com/Peacefulyang/DSE_MFS.git.

A common assumption underlying many optimization problems is that analytical objective and constraint functions exist. However, this assumption may not hold in many real-life situations, such as protein or molecule design (Brookes et al., 2019; Gaulton et al., 2012) and robot morphology design (Liao et al., 2019), where the objective function to be optimized cannot be described analytically (Jin and Sendhoff, 2009). In some cases, the objective or constraint values can be calculated only from a limited amount of collected data; such problems are known as data-driven optimization problems (Jin, 2016; Jin et al., 2019). Existing methods for solving data-driven optimization problems typically first train a surrogate model from the collected data and then use the surrogate model in place of the real objectives or constraints to search for optimal solutions. When no candidate solutions can be verified by the real objective or constraint functions during optimization, the problems are called offline data-driven optimization problems, as opposed to online data-driven optimization problems (Jin et al., 2019; Wang et al., 2016). Evolutionary algorithms (EAs) are popular tools for data-driven optimization (Wang et al., 2016; Chugh et al., 2019), and in this setting they are referred to as data-driven EAs. To solve online optimization problems, data-driven EAs generally replace expensive real fitness functions with low-cost surrogates to save computational time or resources. In contrast, offline data-driven EAs must rely solely on surrogates, since real fitness evaluations are unavailable in offline data-driven optimization problems.

Even though many data-driven EAs have been developed (Jin et al., 2021), most of them focus on problems in static environments and assume fixed real objective functions. However, complex real-world processes often operate in dynamic scenarios where the parameters of the real objective functions or constraints change over time (Jin and Branke, 2005; Mavrovouniotis et al., 2017). For example, in operational indices optimization of the beneficiation process, the raw ore type or equipment capacity may change during processing to meet market requirements (Ding et al., 2012). In these situations, the distribution of the collected data changes with time. We refer to offline data-driven optimization problems in such scenarios as offline data-driven optimization in dynamic environments (DynODD). Addressing DynODD requires adapting the surrogate models to the changing environments and tracking their optima. Recently, SAEF, a data-driven EA for online data-driven optimization problems in dynamic environments, was proposed by Luo et al. (2018). In SAEF, surrogate models are rebuilt in each new environment with the newly collected data; a memory scheme that reuses excellent solutions from past environments is then applied to initialize the population for fast convergence. However, surrogate models trained using only the data of the current environment are generally not accurate enough when the data are scarce. Meanwhile, an EA for DynODD must be able to track the moving optimum quickly and accurately. Thus, new optimization strategies for adapting to changes need to be designed.

As reported in Krawczyk et al. (2017) and Yazdani et al. (2021), in dynamic environments a new environment is usually related to its previous environments. Accordingly, the knowledge obtained from past environments may be helpful in the new environment, and techniques for extracting useful knowledge from historical environments have been widely investigated in both learning and optimization. For example, in concept drift learning problems (Krawczyk et al., 2017; Gomes et al., 2017; Lu et al., 2018), where the statistical properties of the generated data change over time, methods reuse the data or the learned models of historical environments to generate a high-quality model for the new environment (Alippi and Roveri, 2008; Street and Kim, 2001). In dynamic optimization problems (DOPs), evolutionary dynamic optimization algorithms use solutions from past environments to produce high-quality solutions for the new environment via memory schemes (Deb et al., 2007; Liu et al., 2010), prediction models (Cao et al., 2019; Muruganantham et al., 2015), or transfer learning methods (Liu et al., 2019; Jiang et al., 2020).

Inspired by these learning and optimization approaches in dynamic environments, this study aims to address the above challenges in DynODD by taking full advantage of historical knowledge. To this end, we first adopt concept drift learning approaches to construct surrogate models for the data collected from dynamic environments. Here, we employ an effective data stream ensemble (DSE) approach (Gomes et al., 2017) to track the drifting data. A DSE is a combination of a set of base learners, where the base learners approximate the objective functions of a selected number of past environments. After that, we search for optimal solutions of the real fitness functions by simultaneously optimizing the selected past base learners and the ensemble surrogate model in a multitask environment via a multitask EA, also known as a multifactorial evolutionary algorithm (MFEA). MFEA enables knowledge sharing among multiple optimization tasks, thereby improving the efficiency and/or effectiveness of solving each task (Gupta, Ong, and Feng, 2016). In this work, we optimize both the DSE surrogate model and its base learners, aiming to transfer useful knowledge acquired by the base learners to the DSE-assisted optimization. By doing so, we attempt to transfer knowledge of past environments to improve the search for the optimal solution of the current environment, thereby accelerating optimum tracking. Unlike the classical MFEA (Gupta, Ong, and Feng, 2016), this work keeps more individuals for the ensemble surrogate model than for its base learners, since we focus on optimizing the ensemble surrogate model. The proposed offline data-driven optimization algorithm is termed DSE-assisted MFEA, or DSE_MFS for short. The contributions of this work can be summarized as follows:

  1. We propose a knowledge-transfer-based method for offline data-driven evolutionary optimization in dynamic environments, where knowledge transfer is carried out in surrogate model construction and surrogate-assisted optimization procedures;

  2. We employ data stream ensemble learning to build DSE surrogate models in each environment of DynODD. The data stream ensemble learning combines base learners trained by historical data, thus being able to leverage the knowledge of historical environments;

  3. We simultaneously optimize the DSE surrogate and its base learners via MFEA to extract knowledge from historical environments and speed up the DSE surrogate-assisted optimization. In MFEA, the DSE surrogate is assigned more individuals than its base learners to improve its search ability.

The rest of the paper is organized as follows. The definition of DynODD, backgrounds of learning in dynamic environments, MFEA, and data-driven EAs are introduced in Section 2. The proposed algorithm is described in detail in Section 3. Experimental studies, simulation results, and discussions are presented in Section 4. Finally, the conclusions and possible future research on DynODD are outlined in Section 5.

The dynamic single-objective optimization problem can be generally defined as:
$$\min_{x \in \Omega} f(x, t), \quad t \in T \tag{1}$$
where $f(x,t)$ is the objective function, $\Omega$ is the search space, $t \in T$ is the time index, $x \in \mathbb{R}^D$ is the decision vector, and $D$ is the number of decision variables.
Figure 1 illustrates the DynODD under consideration in this work, in which data are continuously generated from real fitness functions. Once the environment changes, a small number of data pairs are collected in the $t$-th environment and denoted as a data chunk $B_t$. The size of $B_t$ may vary according to the specific application. In DynODD, the data chunks generated in different environments exhibit different dynamics resulting from the time-varying real fitness function, which leads to different data distributions across chunks. Once the data chunk of an environment is obtained, a data-driven optimization algorithm is required to respond quickly to these changes and find the optimal solution of each environment $f(x,t)$, denoted by $x_t^*$. Note that in DynODD, all available data are passively generated, and the optimizer cannot actively sample new data as is done in online data-driven optimization.
Figure 1:

Offline data-driven optimization in dynamic environments.
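
The setting can be made concrete with a small sketch of how data chunks might arrive. The objective `moving_sphere`, its drift parameter `shift_per_env`, and the helper `collect_chunk` are hypothetical stand-ins chosen for illustration; they are not the benchmark functions used in this work.

```python
import random

def moving_sphere(x, t, shift_per_env=0.5):
    """Hypothetical f(x, t): a sphere whose optimum drifts along the diagonal as t grows."""
    center = shift_per_env * t
    return sum((xi - center) ** 2 for xi in x)

def collect_chunk(t, n_samples, dim, rng, low=-5.0, high=5.0):
    """Return data chunk B_t as a list of (x, y) pairs observed in environment t."""
    chunk = []
    for _ in range(n_samples):
        x = [rng.uniform(low, high) for _ in range(dim)]
        chunk.append((x, moving_sphere(x, t)))
    return chunk

rng = random.Random(0)
# One small chunk per environment; an offline optimizer only ever sees these pairs
# and cannot query moving_sphere for new points of its own choosing.
chunks = [collect_chunk(t, n_samples=20, dim=2, rng=rng) for t in range(3)]
```

Because the optimum of this toy function moves with `t`, the input-output relationship differs from chunk to chunk, which is the kind of distribution change a DynODD algorithm has to cope with.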

2.1  Data-Driven EAs

Data-driven optimization problems are problems in which analytical functions for the objectives and constraints are unavailable. Optimization approaches for these problems therefore often rely on collected data (Jin et al., 2021). Surrogate-assisted optimization algorithms are the dominant methods for data-driven optimization problems. Their main idea is to build a surrogate model from the data and use it as the objective function to evaluate solutions. Various strategies for building and managing surrogate models, as well as search strategies, have been proposed in the past years (Jin and Sendhoff, 2009). In cases where simulation platforms are available, the optimization can be driven directly by the simulation platforms. Usually, multiple simulators with different evaluation accuracies and computational costs can be obtained for a given problem. To balance evaluation accuracy and computational cost, multifidelity optimization methods (Balabanov and Venter, 2004; Zimmer et al., 2021; Branke et al., 2016) and fidelity-adaptive strategies (Conn and Le Digabel, 2013) have been investigated in the past decades.

Offline data-driven EAs mainly focus on surrogate modeling, surrogate-assisted optimization, and model management that use the collected data efficiently and effectively, because no new data can be actively sampled (Jin et al., 2019, 2021). Offline data-driven EAs vary considerably according to the characteristics of the collected data. For example, Wang et al. (2016) proposed a multifidelity surrogate model management strategy to deal with problems involving a large amount of data, which would otherwise result in prohibitively large resource or computational costs in fitness evaluation. The algorithm reduces computational costs while preserving an acceptable optimal solution by switching between models of different fidelity. Chugh et al. (2017) suggested a local regression smoothing technique to preprocess the data before surrogate model construction, in order to handle small offline data-driven problems containing noise and outliers. They employed KRVEA (Chugh et al., 2018) to perform the optimization search based on the constructed models. To solve problems with small data, Guo et al. (2016) developed the NSGAII_GP algorithm, which consists of low-order polynomial regression (PR) and Gaussian process (GP) models. The low-order PR model serves as a real fitness function to generate synthetic data and thereby alleviate the data shortage, while the GP model performs the optimization search based on the offline and synthetic data. Wang et al. (2019) presented a selective ensemble surrogate algorithm for problems with limited data. Yang et al. (2019) developed a coarse-fine surrogate model, where the coarse surrogate guides the algorithm to find a promising subregion quickly, and the fine model focuses on leveraging good solutions according to the knowledge transferred from the coarse surrogate. They also suggested a reference-based averaging technique for generating final solutions. Li, Zhan, Wang, Jin, et al. (2020) presented boosting data-driven EAs. They first iteratively built new models and combined them as the surrogate model, and then proposed a localized data generation strategy to generate synthetic data and increase the amount of available data. Huang et al. (2021) introduced semisupervised learning into the optimization process and adopted a tri-training strategy to manage surrogate models. Li, Zhan, Wang, and Zhang (2020) proposed a perturbation-based ensemble surrogate that first generates diverse surrogates based on perturbed data and then selects some of the built surrogates to form an ensemble surrogate model.

In addition to the offline data-driven optimization methods that use supervised learning to construct surrogate models, other strategies concerning off-manifold, invalid, and low-scoring out-of-distribution (OOD) data have been proposed to achieve a robust surrogate model. For example, Model Inversion Networks (MIN) (Kumar and Levine, 2020) learn an inverse function instead of training a surrogate model, mapping objective values to decision variables, in order to avoid searching OOD solutions. Similar to MIN, a density estimator is employed in Brookes et al. (2019) to restrict the searched solution to be in distribution. In Trabucco et al. (2021), the sampled OOD data are applied to train a conservative surrogate model in order to prevent the surrogate model from overestimating these OOD data.

2.2  Learning under Concept Drift

The data in many applications are naturally generated in a stream fashion, where data arrive continuously; such data are known as data streams. When the data-generating process is nonstationary, the statistical distribution of the data stream changes. This phenomenon is called concept drift. An important aspect of learning under concept drift is designing adaptive mechanisms that adapt to the newly available data. One kind of adaptive mechanism retrains a single model in each new environment with the newest data. For example, Alippi and Roveri (2008) adopted a windowing strategy that preserves the most recent data for retraining the model, where the change ratio determines the window length. Adaptive-length windowing techniques were subsequently proposed, in which the window length is estimated according to the rate of change (Bifet and Gavalda, 2007) or an ICI-based refinement procedure (Alippi et al., 2013). An alternative is the weighting mechanism, which uses all available samples to retrain the model but weights each sample. The weights of the samples decrease linearly with time (Koychev, 2000) or are calculated based on decay functions (Cohen and Strauss, 2003). Different from the above two strategies, Aggarwal (2006) applied reservoir sampling to select a subset of data for retraining the new model.
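
The windowing and weighting mechanisms can be sketched in a few lines. The functions below are generic illustrations: the exact window-length rules and decay functions of the cited works are not given here, so the linear form, the exponential form, and the parameters `max_age` and `rate` are assumptions.

```python
import math
from collections import deque

def linear_decay_weights(ages, max_age):
    """Weight falls linearly from 1 (age 0) to 0 (age >= max_age)."""
    return [max(0.0, 1.0 - age / max_age) for age in ages]

def exponential_decay_weights(ages, rate):
    """Weight decays as exp(-rate * age); one possible decay function."""
    return [math.exp(-rate * age) for age in ages]

# Fixed-length sliding window: only the most recent samples are kept for retraining.
window = deque(maxlen=3)
for sample_id in range(5):
    window.append(sample_id)
```

After the loop, the window holds only the three newest sample identifiers, whereas the weighting functions keep every sample and merely reduce the influence of older ones.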

Recently, ensemble learning has attracted extensive attention for handling concept drift. The developed data stream ensemble (DSE) methods differ in how they select base classifiers and combine them (Krawczyk et al., 2017; Gomes et al., 2017; Lu et al., 2018). For example, the Streaming Ensemble Algorithm (SEA) (Street and Kim, 2001) creates a classifier with the newest data chunk and removes the poorest historical classifier if the maximum number of archived historical classifiers is reached. The performance of historical classifiers is measured by their mean square prediction errors on the current data chunk. Then, SEA combines all classifiers via a weighted sum scheme, in which the weights are calculated based on each classifier's performance. Accuracy Weighted Ensemble (AWE) differs from SEA in historical classifier selection. It maintains only classifiers whose prediction accuracies are higher than that of a randomly selected one. Alternatively, Accuracy Updated Ensemble (AUE2) (Brzezinski and Stefanowski, 2013) not only selects historical classifiers for combination but also incrementally updates these classifiers with the new data chunk. Sun et al. (2018) select a pool of individual classifiers with diversified qualities in order to cover different changes. Similarly, the diverse instance-weighting ensemble (DiwE) (Liu et al., 2020) maintains diversified classifiers according to their agreement on the probability of a regional distribution change.

Among these DSE methods, the accuracy updated ensemble algorithm (AUE2) (Brzezinski and Stefanowski, 2013) can react to different types of change. For this reason, we borrow the idea of AUE2 to build a surrogate model for DynODD. Figure 2 illustrates the structure of the ensemble suggested in AUE2, where the model of the $t$-th environment is built from the $K$ base learners $S_k$, $k=1,2,\ldots,K$, of the model $DSE_{t-1}$ of the $(t-1)$-th environment together with a new base learner $S_t$ for the $t$-th environment. Note that in AUE2, not only is $S_t$ created, but each base learner $S_k$ is also updated with the data chunk $B_t$ collected from the $t$-th environment. Specifically, AUE2 first creates a new base learner $S_t$ using $B_t$, aiming to adapt quickly to the changed function. Next, AUE2 discards the poorest base learners when their number exceeds the predefined maximum. Then, AUE2 incrementally updates the past base learners using the data chunk $B_t$. Finally, AUE2 reassigns the weights of the ensemble members to emphasize the members that are likely to be the most accurate in the current environment, a step known as the dynamic combiner. The weights are calculated based on the accuracy of the base learners on the data chunk $B_t$.
Figure 2:

Diagram of the AUE2 algorithm for learning the model of the t-th environment.

2.3  Evolutionary Dynamic Optimization

In solving dynamic optimization problems (DOPs), EAs usually use information from historical environments to adapt to the new environment instead of restarting the optimization, which can enhance search efficiency since the problems before and after a change are usually related (Nakano et al., 2015; Nasiri and Meybodi, 2016). Many approaches have been proposed for adapting to changes. For example, the work in Deb et al. (2007), Cobb (1990), and Liu et al. (2010) introduces hypermutation, which increases the mutation rate at the beginning of a new environment and then gradually decreases it so that the population can converge to the optimal location. The basic idea of such diversity enhancement is to introduce diversity into the converged population to help it escape from the current optimum, thereby facilitating the search for the new optimum. Reusing the optimal solutions of previous environments can accelerate convergence when changes are recurrent or periodic. For instance, the most diversified solutions (Yu and Suganthan, 2009) or the best solutions (Daneshyari and Yen, 2011) at a particular historical moment are injected into new populations. In addition, prediction-based approaches are often adopted when the changes exhibit a regular pattern. They first recognize the moving pattern from knowledge of previous environments and then use this pattern to initialize a population that is expected to be close to the optimum of the new environment. For example, the work in Cao et al. (2019), Muruganantham et al. (2015), and Hatzakis and Wallace (2006) estimates the moving path via support vector regression, Kalman filtering, and an autoregressive model, respectively. In recent years, transfer learning techniques have been employed to acquire knowledge in the previous environments and transfer it to the new environment to enhance search efficiency (Liu et al., 2019; Jiang et al., 2020).
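
As a simple illustration of hypermutation, the mutation rate can be written as a function of the number of generations elapsed since the last detected change. The geometric decay and the parameter values `base_rate`, `boosted_rate`, and `decay` below are assumptions made for this sketch; the cited works may use different schedules.

```python
def hypermutation_rate(gens_since_change, base_rate=0.01, boosted_rate=0.5, decay=0.7):
    """Mutation rate that jumps to boosted_rate at a change and decays toward base_rate."""
    return base_rate + (boosted_rate - base_rate) * decay ** gens_since_change

# Rates for the first few generations of a new environment.
schedule = [hypermutation_rate(g) for g in range(10)]
```

The rate starts at the boosted value right after a change, which reintroduces diversity, and approaches the base value as the population settles around the new optimum.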

2.4  Multitask Evolutionary Optimization Algorithm

The multifactorial evolutionary algorithm (MFEA) (Gupta, Ong, and Feng, 2016) is a multitask evolutionary optimization paradigm that concurrently solves multiple optimization problems (tasks) using one population. It is expected to enhance problem solving via knowledge transfer among tasks. As in traditional EAs, the procedures of MFEA include population initialization and evaluation, reproduction, offspring evaluation, and environment selection. The main differences between MFEA and traditional EAs are as follows (Gupta, Ong, and Feng, 2016):

  • In addition to its genes, each individual is assigned a skill factor τ during population initialization, which represents the task it is aligned with.

  • In evaluation processes, individuals are evaluated only by their aligned tasks according to the skill factor.

  • The reproduction in MFEA involves an assortative mating procedure and a vertical cultural transmission procedure. In particular, assortative mating allows individuals from different tasks to mate with each other to create offspring. Thereafter, the vertical cultural transmission assigns skill factors to the generated offspring according to their parents.

  • Environment selection is performed for each task independently to select individuals of the next generation.
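
The skill-factor bookkeeping described in the list above can be sketched as follows. The `Individual` class, the two toy tasks `sphere` and `shifted_abs`, and the `quotas` argument are hypothetical names introduced for illustration, and minimization is assumed.

```python
from dataclasses import dataclass

@dataclass
class Individual:
    genes: list
    skill_factor: int              # index of the task this individual is aligned with
    fitness: float = float("inf")

def sphere(genes):
    return sum(v * v for v in genes)

def shifted_abs(genes):
    return sum(abs(v - 1.0) for v in genes)

def evaluate(population, tasks):
    """Evaluate every individual only on the task given by its skill factor."""
    for ind in population:
        ind.fitness = tasks[ind.skill_factor](ind.genes)

def select_per_task(population, quotas):
    """Keep the best quotas[task] individuals of each task independently."""
    survivors = []
    for task, quota in quotas.items():
        group = [p for p in population if p.skill_factor == task]
        group.sort(key=lambda p: p.fitness)
        survivors.extend(group[:quota])
    return survivors
```

Evaluating each individual on a single task keeps the number of fitness evaluations per generation equal to the population size, regardless of how many tasks are solved together.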

Many multitask evolutionary optimization algorithms have recently been proposed to enhance positive knowledge transfer and reduce negative knowledge transfer among tasks. For example, Bali et al. (2019) proposed a data-driven multitasking approach that dynamically determines how much knowledge is transferred between tasks. Li et al. (2021) developed a meta-knowledge transfer strategy to leverage knowledge from heterogeneous multisource data for the target task. Wu et al. (2022) proposed an approach for knowledge transfer among heterogeneous tasks. On the one hand, they mapped the global best individual of the source task from its original search space to the search space of the target task via an optimization process to handle the difference in task dimension. On the other hand, they proposed an orthogonal transfer to find the best combination of different dimensions across the two heterogeneous tasks. We employ the simplest MFEA approach (Gupta, Ong, and Feng, 2016) in this work; the advanced knowledge transfer methods above are worth investigating for our DSE surrogate-assisted optimization in future work.

In this section, we first present the main framework of the proposed DSE_MFS. Then, we detail its two components, DSE surrogate construction and DSE surrogate-assisted MFEA. Finally, we analyze the computational complexity of the proposed DSE_MFS.

3.1  Overall Framework

A diagram of the proposed DSE_MFS is shown in Figure 3. As presented in the figure, the DSE surrogate model is adapted based on the incoming data once an environmental change occurs. Subsequently, a surrogate-assisted MFEA is performed to solve the optimization problem of the current environment together with the problems of some previous environments, thereby achieving knowledge transfer from the base learners to the DSE surrogate. Once the environment changes, the best solution among the individuals associated with the DSE surrogate is chosen and output as the solution to be implemented for the current environment. The following three subsections provide a detailed description of each main component of the presented algorithm.
Figure 3:

Diagram of the proposed algorithm for offline data-driven optimization in dynamic environments. It consists of two main components: (1) construction of the data stream ensemble surrogate and (2) multifactorial evolutionary optimization assisted by a data stream ensemble surrogate.

3.2  Data Stream Ensemble (DSE) Surrogate

The DSE surrogate is constructed using the AUE2 algorithm. AUE2 was developed to solve classification problems in which the base learner is a Hoeffding tree. In this work, we replace the Hoeffding tree with a radial-basis-function (RBF) network for regression tasks. The pseudocode of the revised AUE2 is presented in Algorithm 1.

[Algorithm 1: pseudocode of the revised AUE2 with RBF base learners]

At the $t$-th environment, the surrogate $DSE_{t-1}$ of the last environment contains a pool of base learners $S_k$, $k=1,2,\ldots,K$. These base learners are denoted as $S$; $B$ is the corresponding training data, comprising $B_k$, $k=1,2,\ldots,K$; and $K$ is the number of base learners in the current ensemble model. Given the training data $B_t$ generated at the $t$-th environment, all $K$ base learners $S_k$, $k=1,2,\ldots,K$, are tested on $B_t$. Accordingly, the root mean square errors (RMSEs) of the base learners are calculated as follows:
$$\mathrm{RMSE}_k = \sqrt{\frac{1}{|B_t|}\sum_{j=1}^{|B_t|}\left(y_j - \hat{y}_j\right)^2} \tag{2}$$
where $(x_j, y_j)$ is the $j$-th data sample in $B_t$, $\hat{y}_j$ is the prediction of $S_k$ on $x_j$, and $|B_t|$ is the number of samples in $B_t$.
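
The error of a base learner on the current chunk can be computed directly from this definition. In the sketch below, a base learner is assumed to be any callable that maps a decision vector to a predicted objective value; `constant_model` is a hypothetical stand-in for a trained RBF network.

```python
import math

def rmse(model, chunk):
    """Root mean square error of model on a chunk of (x, y) pairs."""
    squared_errors = [(y - model(x)) ** 2 for x, y in chunk]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def constant_model(x):
    """Hypothetical base learner that always predicts 2.0."""
    return 2.0

chunk_t = [([0.0], 1.0), ([1.0], 3.0)]
error_on_chunk = rmse(constant_model, chunk_t)
```
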
After that, a new base learner, denoted as $S_t$, is built using $B_t$. Meanwhile, the performance of the new member $S_t$ is validated on the current data chunk $B_t$. The RMSE of $S_t$, $\mathrm{RMSE}_r$, is calculated using leave-one-out cross-validation (Kearns and Ron, 1999). After the errors of all base learners are calculated, the weights of the base learners for the past environments $S_k$, $k=1,2,\ldots,K$, are assigned according to their RMSEs, because the RMSEs represent the accuracies of the base learners on the current data. In particular, a base learner with a larger RMSE is less accurate on the current data and is thus assigned a smaller weight. In addition, as the newest base learner is trained on the most recent data, it is treated as a “perfect” base learner and is assigned a larger weight than the other base learners. In summary, the weights of the historical base learners and the newest base learner are calculated according to Equations (3) and (4), respectively.
(3)
(4)
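
One weighting scheme that satisfies both properties stated above (a larger RMSE gives a smaller weight, and the newest learner receives the largest weight) is the inverse-error form used by the original AUE2, with the mean square error replaced by the RMSE. The sketch below adopts that form as an assumption; the function names and the small constant `eps`, which guards against division by zero, are illustrative and may differ from the exact expressions used in this work.

```python
def historical_weight(rmse_k, rmse_r, eps=1e-12):
    """Assumed AUE2-style weight of a historical base learner S_k."""
    return 1.0 / (rmse_r + rmse_k + eps)

def newest_weight(rmse_r, eps=1e-12):
    """Assumed AUE2-style weight of the newest base learner S_t."""
    return 1.0 / (rmse_r + eps)

# Example: RMSE_r = 0.5 for S_t, and two historical learners with different errors.
w_new = newest_weight(0.5)
w_good = historical_weight(0.2, 0.5)
w_poor = historical_weight(0.8, 0.5)
```

With these values, the newest learner receives the largest weight and the less accurate historical learner the smallest.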

If the number of existing base learners, $K$, is less than the predefined maximum number $K_{\max}$, the new member $S_t$ is simply added to the ensemble. Otherwise, the base learner with the largest RMSE among the $K$ base learners is replaced by $S_t$. Similar to Brzezinski and Stefanowski (2013), $K_{\max}$ is set to 10 in this work.
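
This archive rule can be sketched as follows. The representation of a base learner as a dictionary with an `rmse` entry and the function name `add_to_archive` are assumptions made for illustration.

```python
def add_to_archive(archive, new_learner, k_max=10):
    """Add new_learner; if the archive is full, replace the learner with the largest RMSE.

    Returns the removed learner, or None if nothing was removed.
    """
    if len(archive) < k_max:
        archive.append(new_learner)
        return None
    worst = max(range(len(archive)), key=lambda i: archive[i]["rmse"])
    removed = archive[worst]
    archive[worst] = new_learner
    return removed

archive = [{"name": "S1", "rmse": 0.3}, {"name": "S2", "rmse": 0.9}]
removed = add_to_archive(archive, {"name": "S3", "rmse": 0.1}, k_max=2)
```

In this example the archive is already full, so the learner with the largest error on the current chunk is dropped in favor of the new one.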

Apart from adjusting the weight of each base learner, AUE2 also updates all base learners except $S_t$ using $B_t$ (lines 18–21 in Algorithm 1). This is achieved by updating each base learner using a combination of its own training data and the most recent data chunk $B_t$. For example, base learner $S_k$ is updated using a combination of $B_k$ and $B_t$. Meanwhile, the training data of $S_k$ is replaced with the union of $B_k$ and $B_t$.

Finally, the output of the ensemble is aggregated as follows:
(5)

Note that the surrogate for assisting the optimization of the $t$-th environment is $DSE_t$, consisting of $S_k$, $k=1,2,\ldots,K$, and $S_t$. When $t=1$, however, $DSE_1$ is the same as $S_1$.
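
A minimal sketch of the aggregation step is given below. It assumes that the ensemble output is a weighted average of the base learner predictions, normalized by the sum of the weights so that the output stays on the scale of the objective values; the normalization and the function name `ensemble_predict` are assumptions of this sketch.

```python
def ensemble_predict(x, learners, weights):
    """Weighted average of base learner predictions (normalized weights assumed)."""
    total = sum(weights)
    return sum(w * s(x) for s, w in zip(learners, weights)) / total

learners = [lambda x: 1.0, lambda x: 3.0]
prediction = ensemble_predict([0.0], learners, [1.0, 3.0])

# With a single learner, as in the first environment, the ensemble reduces to that learner.
single = ensemble_predict([0.0], [lambda x: 4.0], [0.25])
```
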

3.3  DSE Surrogate-Assisted MFEA

The proposed DSE_MFS simultaneously solves $K+1$ problems, namely the DSE surrogate and its $K$ base learners, using MFEA (Gupta, Ong, and Feng, 2016). By doing this, the knowledge about the fitness landscapes of the previous problems acquired by the ensemble base learners can be transferred to accelerate the search for solutions of the current problem. Different from existing MFEAs (Gupta, Ong, and Feng, 2016; Ding et al., 2017; Bali et al., 2019; Gupta, Ong, Feng, and Tan, 2016) that treat all tasks as equally important, DSE_MFS aims to find the optimum of the current environment. Therefore, we assign a larger number of individuals to the current optimization problem than to each of the previous problems. Specifically, for a given population size $NP$, $NP/2$ individuals are assigned to the DSE-assisted optimization of the current problem, whereas $NP/(2K)$ individuals are assigned to each base learner, where $K$ is the number of base learners. All individuals in the population are randomly initialized and assigned to a task via a skill factor $\tau$. An individual $p_i$ is assigned to the $k$-th base learner if its skill factor $\tau_i = k$, $k \in \{1,2,\ldots,K\}$, and to the DSE surrogate if $\tau_i = K+1$.
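
The allocation of individuals to tasks can be illustrated with a short sketch. The function name `allocate_skill_factors` is hypothetical, and two details are assumptions: when $NP$ is not divisible by $2K$, the leftover individuals are given to the DSE task, and when there are no historical base learners all individuals go to a single task.

```python
def allocate_skill_factors(pop_size, num_base):
    """Return one skill factor per individual: 1..K for base learners, K+1 for the DSE."""
    if num_base == 0:
        # Assumed handling of the first environment, where the DSE is a single learner.
        return [1] * pop_size
    per_base = pop_size // (2 * num_base)
    factors = []
    for k in range(1, num_base + 1):
        factors.extend([k] * per_base)
    # The remaining individuals (NP/2 when NP is divisible by 2K) go to the DSE task.
    factors.extend([num_base + 1] * (pop_size - len(factors)))
    return factors

factors = allocate_skill_factors(100, 5)
```

For $NP = 100$ and $K = 5$, each base learner receives 10 individuals and the DSE surrogate receives 50.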

[Algorithm 2: pseudocode of offspring creation in DSE_MFS]

Like other MFEAs, knowledge transfer from previous problems to the current optimization problem is realized during reproduction, which includes assortative mating and vertical cultural transmission. Details of the offspring creation process in the proposed DSE_MFS are presented in Algorithm 2. For a pair of parents $p_a, p_b$ with skill factors $\tau_a, \tau_b$, the offspring are generated with crossover and mutation if the two parents come from the same task, and they are assigned to the same task as their parents (lines 3–5). If the parents are from different tasks and a randomly generated number is smaller than the predefined random mating probability (rmp), the offspring are also generated through crossover and mutation, so that knowledge is shared between the two tasks. Notably, the probability of assigning an offspring to a task differs from that in MFEA, where it is fixed. In this work, DSE_MFS adaptively calculates the probability for each task so that the number of individuals associated with each task remains constant. For example, if the numbers of individuals for tasks $\tau_a$ and $\tau_b$ are $N_a$ and $N_b$, respectively, then the probability of assigning an offspring to $\tau_a$ ($\tau_{ca}$) is $N_a/(N_a+N_b)$ and that of assigning it to $\tau_b$ ($\tau_{cb}$) is $N_b/(N_a+N_b)$. Otherwise, if the parents are from different tasks and the randomly generated number is not smaller than rmp, the offspring is created by mutating one of the parents and is assigned to the same task as that parent, as shown in lines 18–23. Once the offspring are generated and evaluated on their corresponding tasks, environment selection is performed for each task to select parents for the next generation.
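
The task-assignment logic can be sketched as follows. This is a simplified illustration under stated assumptions and not a transcription of Algorithm 2: it produces a single offspring per call, and the operators `uniform_crossover` and `gaussian_mutation`, as well as the helper names, are stand-ins chosen for brevity.

```python
import random

def uniform_crossover(genes_a, genes_b, rng):
    return [a if rng.random() < 0.5 else b for a, b in zip(genes_a, genes_b)]

def gaussian_mutation(genes, rng, sigma=0.1):
    return [g + rng.gauss(0.0, sigma) for g in genes]

def pick_task(task_a, task_b, n_a, n_b, rng):
    """Assign the offspring to task_a with probability N_a / (N_a + N_b)."""
    return task_a if rng.random() < n_a / (n_a + n_b) else task_b

def make_offspring(parent_a, parent_b, counts, rmp, rng):
    """Parents are (genes, skill_factor) pairs; counts maps task -> number of individuals."""
    genes_a, task_a = parent_a
    genes_b, task_b = parent_b
    if task_a == task_b:
        # Same task: crossover and mutation, offspring inherits the shared task.
        child = gaussian_mutation(uniform_crossover(genes_a, genes_b, rng), rng)
        return child, task_a
    if rng.random() < rmp:
        # Cross-task mating: the task is drawn in proportion to the task sizes.
        child = gaussian_mutation(uniform_crossover(genes_a, genes_b, rng), rng)
        return child, pick_task(task_a, task_b, counts[task_a], counts[task_b], rng)
    # No cross-task mating: mutate one parent and keep that parent's task.
    genes, task = parent_a if rng.random() < 0.5 else parent_b
    return gaussian_mutation(genes, rng), task
```

Drawing the offspring task in proportion to the task sizes means that the larger DSE task receives most of the cross-task offspring, which helps keep the number of individuals per task roughly constant.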

The DSE surrogate-assisted MFEA simultaneously optimizes multiple objectives, that is, the DSE surrogate model and its base learners, which makes the proposed DSE_MFS appear similar to dynamic multiobjective optimization algorithms. However, the two are very different: at each environment, DSE_MFS aims to find only one optimal solution, corresponding to the DSE surrogate model, with the assistance of optimizing the other objectives. In contrast, dynamic multiobjective optimization algorithms aim to find a set of Pareto-optimal solutions that trade off multiple objectives. Although the proposed algorithm focuses on single-objective DynODD, it can be extended to multiobjective DynODD in a straightforward way by building and managing surrogate models for each objective and replacing the multitask search algorithm with a multiobjective multitask search algorithm, such as Mo-MFEA (Gupta, Ong, Feng, and Tan, 2016) or ATO-MFEA (Yang et al., 2017). As analyzed in Gupta et al. (2018), an exchange of genetic material between tasks in the multitask environment enables each task to automatically leverage useful knowledge from other tasks while preserving its own best genes. Meanwhile, each base learner exhibits a specific search behavior because of the time-varying objective function it aims to approximate. Therefore, the base learners can continuously transfer knowledge of the previous problems to the current environment to make the search efficient.

3.4  Computational Complexity Analysis

Suppose there are T environments in the data generation process, and N data are generated in each environment. The number of center points in the RBF is C, the number of generations is G, and the population size of the MFEA is Q. Then, the time complexity of DSE_MFS, including surrogate model training, evaluation and selection of the best base learners, and surrogate model prediction, is analyzed as follows.

This work applies an RBF as the learning algorithm. If the number of centers of the RBF is C, the time complexity of training the RBF model is O(NC^2) (Du and Swamy, 2006). DSE_MFS keeps at most Kmax base learners in the archive and updates the surrogate models in each environment. In the first Kmax environments, the time complexity of training the DSE model in the k-th environment is (1+2+...+k)NC^2. After Kmax environments, the training time varies with the selected base learners. Older base learners need more computation time since they have been trained on much more data. In the worst case, all selected base learners from historical environments are the oldest ones, each of which holds N·Kmax data. Thus, the upper bound of the time complexity of training the DSE model in each environment is (Kmax-1)·Kmax·NC^2. Therefore, in T environments, the upper bound of the time complexity of training the surrogates is:
(6)
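As a rough numerical companion to the analysis above, the two per-environment terms stated in the text can simply be tallied over T environments. The sketch below does only that; it is an assumption about how the terms combine and is not claimed to reproduce Equation (6). The function name `training_cost_upper_bound` is hypothetical.

```python
def training_cost_upper_bound(T, N, C, K_max):
    """Tally the per-environment training costs stated in the text.

    Environments 1..K_max contribute (1 + 2 + ... + k) * N * C^2 each;
    every later environment contributes at most (K_max - 1) * K_max * N * C^2.
    The result is an operation count up to a constant factor.
    """
    unit = N * C ** 2
    warmup_envs = min(T, K_max)
    # (1 + 2 + ... + k) equals k * (k + 1) / 2.
    warmup = sum(k * (k + 1) // 2 for k in range(1, warmup_envs + 1)) * unit
    steady = max(T - K_max, 0) * (K_max - 1) * K_max * unit
    return warmup + steady


# Example with the settings used later in the paper (T = 60, chunks of
# 5D samples with D = 10, K_max = 10); C = 20 is an arbitrary placeholder.
print(training_cost_upper_bound(T=60, N=50, C=20, K_max=10))
```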

As for base learner selection, DSE_MFS needs to evaluate all base learners in the archive on the newly generated data, so the time complexity is O(NC(Kmax-1)Kmax). Meanwhile, DSE_MFS needs to sort all archived base learners, so the time complexity is O(Kmax log(Kmax)(T-Kmax)).
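The evaluate-then-sort step can be illustrated with a short sketch. It assumes that base learners are ranked by RMSE on the newest data chunk, which this section does not state explicitly; `select_base_learners` and the representation of a learner as a plain callable are hypothetical.

```python
import math


def rmse(model, X, y):
    """Root mean squared error of a callable model on (X, y)."""
    return math.sqrt(sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y))


def select_base_learners(archive, X_new, y_new, k):
    """Evaluate every archived learner on the new chunk and keep the k best.

    Scoring all learners is linear in the archive size, and the sort adds
    the K log K factor mentioned in the text.
    """
    ranked = sorted(archive, key=lambda m: rmse(m, X_new, y_new))
    return ranked[:k]
```

With at most Kmax learners in the archive, the sort is cheap compared with the evaluation on the new data.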

The time complexity of model prediction using an RBF is O(C). Thus, the total time complexity for model evaluations is:
(7)
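To make the O(C) prediction cost concrete, the following sketch evaluates an RBF network with one kernel evaluation per center. The Gaussian kernel, the shared `width`, and the function name `rbf_predict` are assumptions made for illustration; the toolbox used in the paper may use a different basis function.

```python
import math


def rbf_predict(x, centers, weights, width, bias=0.0):
    """Predict with a Gaussian RBF network: one kernel term per center."""
    total = bias
    for center, weight in zip(centers, weights):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        total += weight * math.exp(-sq_dist / (2.0 * width ** 2))
    return total
```

The loop runs C times per query, so evaluating a population of Q individuals for G generations on one model costs on the order of Q·G·C kernel evaluations.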

To investigate the performance of the proposed DSE_MFS algorithm in solving DynODD, we compare it with three state-of-the-art offline data-driven EAs for stationary optimization, namely, DDEA-SE (Wang et al., 2019), TT-DDEA (Huang et al., 2021), and BDDEA-LDG (Li, Zhan, Wang, Jin et al., 2020), and one online data-driven EA for dynamic optimization, SAEF (Luo et al., 2018). Among the compared algorithms, DDEA-SE is an ensemble surrogate-assisted offline data-driven EA that selects the best base learners from a large pool of base learners to serve as the surrogate. TT-DDEA introduces tri-training to update surrogate models with generated candidate solutions. BDDEA-LDG applies a boosting strategy to incrementally build surrogates and localized data generation to produce synthetic data. SAEF rebuilds surrogate models in each environment and uses a memory-based strategy to track the moving optimum. In SAEF, we apply a multipopulation particle swarm optimization-based method as the optimizer.

To implement offline data-driven EAs in DynODD, the data chunk generated from each environment is incrementally added to the training set. Then, the updated dataset is used to retrain each component. Since SAEF (Luo et al., 2018) is designed for online data-driven optimization, it needs real function evaluations when updating the surrogates. To adapt SAEF to offline DynODD, we remove the surrogate update part during the optimization. Among the seven algorithms in SAEF, we select the 1S_GP that uses GPs as surrogate models for comparison due to its effectiveness on ten-dimensional problems (Luo et al., 2018).

4.1  Experimental Settings

4.1.1  Benchmarks

The CEC 2009 benchmark problems for dynamic optimization, which are generated from a GDBG system (Li et al., 2008), are used to evaluate the performance of the compared algorithms. This test suite contains six problems, F1–F6, where F1 is a maximization problem and F2–F6 are minimization problems. The characteristics of the test instances are as follows:

  • F1: Rotation peak function with 12 peaks.

  • F2: Composition of ten sphere functions.

  • F3: Composition of ten Rastrigin functions.

  • F4: Composition of ten Griewank functions.

  • F5: Composition of ten Ackley functions.

  • F6: Hybrid composition function consisting of sphere, Rastrigin, Weierstrass, Griewank, and Ackley functions.
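For orientation, the basic functions that the compositions F2–F6 are built from can be written in their standard textbook forms, as sketched below. This is not the GDBG benchmark itself: the rotation, shifting, scaling, and composition weights defined by Li et al. (2008) are not reproduced here, and the Weierstrass function is omitted for brevity.

```python
import math


def sphere(x):
    return sum(v * v for v in x)


def rastrigin(x):
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


def griewank(x):
    quadratic = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i))
                        for i, v in enumerate(x, start=1))
    return quadratic - product + 1.0


def ackley(x):
    d = len(x)
    mean_sq = sum(v * v for v in x) / d
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / d
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)
```

All four functions have their global minimum of zero at the origin, which is why the benchmark shifts and rotates them to place the optimum elsewhere.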

According to the change type of control parameters in the function, which determines the position of the best solution, GDBG designs six environmental change types for each test problem, which are listed as follows:

  • C1: Small step change, where φ changes mildly between two consecutive environments.

  • C2: Large step change, where φ has a significant change between two consecutive environments.

  • C3: Random change, where φ changes randomly.

  • C4: Chaotic change, where the control parameters at two consecutive times are related nonlinearly.

  • C5: Recurrent change, where the control parameters change periodically.

  • C6: Recurrent change with noise, where the recurrent change of the control parameters is perturbed by noise.

4.1.2  Performance Metrics

We utilize E_BBC and STD (Li et al., 2008) as two metrics to assess the performance of compared algorithms on the benchmark problems, which are calculated by Equations (8) and (9):
(8)
(9)
where Run is the total number of independent runs, T is the number of environments in each independent run, and E^last_{i,j}(t) is the error recorded before the j-th dynamic change in the i-th independent run. E^last(t) is the absolute fitness difference between the best solution found by an algorithm, Xbest(t), and the optimal solution, X*(t), that is, E^last(t) = |f(Xbest(t)) - f(X*(t))|. For both metrics, a smaller value indicates better performance of the corresponding algorithm.
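One common reading of these two metrics is the mean and the standard deviation of the last errors pooled over all runs and environments. The sketch below follows that reading; the use of the sample standard deviation (dividing by Run·T - 1) is an assumption, and the function name `e_bbc_and_std` is hypothetical.

```python
import math


def e_bbc_and_std(last_errors):
    """Compute (E_BBC, STD) from last_errors[i][j].

    last_errors[i][j] is the error recorded just before the j-th change
    in the i-th run, so the outer list has Run entries of length T.
    """
    flat = [e for run in last_errors for e in run]
    n = len(flat)
    mean = sum(flat) / n
    variance = sum((e - mean) ** 2 for e in flat) / (n - 1)
    return mean, math.sqrt(variance)
```

For example, two runs with two environments each and errors `[[1, 2], [3, 4]]` give a mean error of 2.5.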

4.1.3  Parameter Settings

Parameters of the benchmark problems and the algorithms under comparison are as follows:

  • The total number of environments T in each independent run is set to T=60. The maximum generation of each environment, that is, the change frequency FEs, is set to 20.

  • The distribution indexes of simulated binary crossover and polynomial mutation in the EA are set to 20. The crossover probability and mutation probability are set to pc=1.0 and pm=1/D, where D is the number of decision variables. The random mating probability rmp is set to rmp=0.3, similar to MFEA.
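The variation operators named in these settings can be sketched in their basic textbook forms, with distribution index 20 and pm = 1/D as defaults. This is an illustrative sketch rather than the authors' implementation: it uses the unbounded variant of simulated binary crossover and handles bounds after mutation by clipping, both of which are assumptions.

```python
import random


def sbx_pair(p1, p2, eta=20.0, rng=random):
    """Simulated binary crossover applied to every variable (pc = 1.0)."""
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        u = rng.random()
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        c1.append(0.5 * ((1.0 + beta) * a + (1.0 - beta) * b))
        c2.append(0.5 * ((1.0 - beta) * a + (1.0 + beta) * b))
    return c1, c2


def polynomial_mutation(x, lower, upper, eta=20.0, pm=None, rng=random):
    """Polynomial mutation; each variable mutates with probability pm."""
    pm = 1.0 / len(x) if pm is None else pm
    y = list(x)
    for i in range(len(y)):
        if rng.random() < pm:
            u = rng.random()
            if u < 0.5:
                delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
            moved = y[i] + delta * (upper[i] - lower[i])
            y[i] = min(max(moved, lower[i]), upper[i])  # clip to the bounds
    return y
```

A large distribution index such as 20 keeps children close to their parents, so most of the exploration comes from population diversity and cross-task mating.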

4.1.4  Other Settings

RBF in the proposed DSE_MFS is constructed using the toolbox in Jekabsons (2009). Solutions obtained by the compared algorithms at each environment are re-evaluated using the real fitness functions before performance metrics are calculated. All experiments are executed on Matlab R2018a, Intel Xeon with 3.5 GHz CPU, Microsoft Windows 10 64-bit operating system.

4.2  Comparative Results on the Benchmark Problems

In this section, we compare the performance of DDEA-SE, TT-DDEA, BDDEA-LDG, SAEF, and DSE_MFS in solving DynODD (Li, Zhan, Wang, Jin et al., 2020). In this experiment, each data chunk consists of 5D samples generated using Latin hypercube sampling (LHS) (Stein, 1987). These data are used to train or update DSE surrogate models. The statistical E_BBC and STD results of the five compared algorithms on ten-dimensional benchmark problems over 20 independent runs are presented in Table 1. Wilcoxon's rank sum test at a 0.05 significance level is performed to denote the significance of differences between the compared algorithms and DSE_MFS. In the table, the best metric values are highlighted. Symbols “+,” “-,” and “=” indicate the corresponding algorithm performs significantly better than, worse than, and comparable to the proposed DSE_MFS, respectively.
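The data chunk generation mentioned above can be illustrated with a minimal Latin hypercube sampler that draws 5D points in a D-dimensional box. It is a sketch using only the standard library; the function name `latin_hypercube` and the bounds in the usage lines are assumptions, and the original experiments may rely on a different LHS implementation.

```python
import random


def latin_hypercube(n_samples, lower, upper, rng=random):
    """Draw n_samples points so that each variable hits every stratum once."""
    dim = len(lower)
    columns = []
    for d in range(dim):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent stratum order per dimension
        width = (upper[d] - lower[d]) / n_samples
        columns.append([lower[d] + (s + rng.random()) * width for s in strata])
    return [[columns[d][i] for d in range(dim)] for i in range(n_samples)]


# One data chunk of 5D samples for a ten-dimensional problem; the box
# [-5, 5]^D is assumed here for illustration.
D = 10
chunk = latin_hypercube(5 * D, [-5.0] * D, [5.0] * D)
```

Stratifying every variable spreads a small chunk more evenly over the search space than plain uniform sampling, which matters when only 5D points are available per environment.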

Table 1:

The statistical E_BBC and STD of the five compared algorithms over 20 runs.

| Prob | C | DDEA-SE E_BBC(STD) | TT-DDEA E_BBC(STD) | BDDEA-LDG E_BBC(STD) | SAEF E_BBC(STD) | DSE_MFS E_BBC(STD) |
| --- | --- | --- | --- | --- | --- | --- |
| F1 | C1 | 21.25(1.062)- | 34.45(1.169)- | 25.76(2.274e)- | 47.98(4.952)- | 22.52(3.100) |
| | C2 | 22.32(2.751)- | 18.50(3.079)+ | 24.31(0.792)- | 49.78(5.721)- | 20.88(5.424) |
| | C3 | 22.38(2.184)- | 21.21(0.811)- | 21.25(1.225)- | 40.51(4.456)- | 19.44(4.230) |
| | C4 | 24.27(1.446)- | 47.43(2.047)- | 24.47(1.796)- | 40.57(0.421)- | 23.22(1.524) |
| | C5 | 17.69(0.361)+ | 42.87(0.645)- | 23.23(0.774)- | 40.72(4.102)- | 18.37(5.232) |
| | C6 | 19.64(0.484)- | 35.10(0.675)- | 23.23(0.927)- | 40.27(4.219)- | 18.01(5.184) |
| F2 | C1 | 491.4(19.03)- | 1071(209.8)- | 519.1(18.23)- | 526.1(28.75)- | 440.7(18.80) |
| | C2 | 514.3(27.94)- | 1047(219.9)- | 519.1(18.23)- | 529.1(25.46)- | 454.0(25.21) |
| | C3 | 499.6(23.94)- | 1082(225.5)- | 534.5(19.23)- | 511.8(31.85)- | 446.2(20.13) |
| | C4 | 466.0(24.68)- | 1157(31.49)- | 463.1(18.23)- | 493.2(25.48)- | 405.5(17.11) |
| | C5 | 496.8(19.90)- | 825.3(10.85)- | 525.8(20.33)- | 541.7(27.74)- | 437.6(17.88) |
| | C6 | 494.3(26.25)- | 907.5(20.22)- | 523.4(14.92)- | 539.1(31.92)- | 450.7(24.69) |
| F3 | C1 | 1131(58.46)- | 1281(52.22)- | 1142(45.84)- | 1056(63.73)- | 1040(54.40) |
| | C2 | 1140(56.64)- | 1156(203.9)- | 1136(55.84)- | 1162(68.03)- | 1121(52.58) |
| | C3 | 1139(58.46)- | 1147(159.5)- | 1149(45.84)- | 1148(68.70)- | 1102(54.72) |
| | C4 | 1142(88.46)- | 1147(56.43)- | 1059(55.84)- | 1027(62.41)- | 989.8(61.79) |
| | C5 | 1171(68.46)- | 1160(67.74)- | 1145(65.84)- | 1069(75.71)- | 1033(59.07) |
| | C6 | 1131(58.46)- | 1140(68.66)- | 1123(65.84)- | 1073(68.04)- | 1041(58.72) |
| F4 | C1 | 874.3(30.17)- | 907.0(29.86)- | 580.8(28.14)- | 582.1(21.93)- | 508.1(20.86) |
| | C2 | 764.7(23.28)- | 809.7(31.35)- | 562.3(28.14)- | 587.5(21.07)- | 514.4(21.92) |
| | C3 | 764.7(13.28)- | 810.0(197.0)- | 599.8(18.14)- | 572.8(25.16)- | 509.0(21.89) |
| | C4 | 724.7(13.28)- | 720.4(29.23)- | 577.0(48.14)- | 551.6(25.06)- | 441.1(39.00) |
| | C5 | 664.7(23.28)- | 708.9(20.83)- | 575.8(28.23)- | 602.9(24.97)- | 519.5(29.47) |
| | C6 | 764.7(23.28)- | 805.6(20.05)- | 572.8(22.13)- | 596.6(26.69)- | 514.2(21.79) |
| F5 | C1 | 1964(42.14)- | 1956(35.50)- | 1913(44.31)+ | 1954(41.09)- | 1923(33.96) |
| | C2 | 1966(37.98)- | 1953(25.92)- | 1958(44.31)- | 1959(33.33)- | 1935(43.90) |
| | C3 | 1976(37.98)- | 1993(42.86)- | 1923(34.31)= | 1946(32.64)- | 1926(42.52) |
| | C4 | 1969(27.98)- | 1949(36.89)- | 1919(44.31)+ | 1940(28.09)- | 1931(43.02) |
| | C5 | 1986(27.98)- | 1983(25.79)- | 1932(34.31)= | 1962(22.13)- | 1934(32.50) |
| | C6 | 1978(27.98)- | 1975(25.14)- | 1927(44.31)+ | 1962(23.52)- | 1932(40.31) |
| F6 | C1 | 1414(85.82)- | 1413(76.25)- | 1351(82.62)- | 1262(60.26)- | 1173(59.83) |
| | C2 | 1452(72.06)- | 1431(67.92)- | 1291(72.62)- | 1273(66.82)- | 1176(62.63) |
| | C3 | 1414(65.82)- | 1423(67.44)- | 1301(72.62)- | 1261(69.07)- | 1169(62.70) |
| | C4 | 1314(75.82)- | 1374(70.15)- | 1251(72.62)- | 1240(70.33)- | 1060(91.83) |
| | C5 | 1314(55.82)- | 1286(573.4)- | 1246(52.62)- | 1273(63.02)- | 1186(68.57) |
| | C6 | 1414(75.82)- | 1432(677.6)- | 1346(72.62)- | 1268(60.58)- | 1181(61.23) |
| running time | | 8.635e+07 (s) | 5.376e+06 (s) | 3.270e+07 (s) | 1.113e+04 (s) | 4.752e+04 (s) |

The table shows that DSE_MFS achieves the minimum E_BBC and STD metric values on nearly all test problems among the five compared data-driven algorithms and is significantly outperformed by one of the compared algorithms on only five test instances. The results demonstrate the clearly superior performance of DSE_MFS over the other four compared algorithms in dealing with DynODD. The three offline data-driven algorithms, DDEA-SE, TT-DDEA, and BDDEA-LDG, achieve higher E_BBC than DSE_MFS. The underlying reason may be that, without an adaptation strategy, they cannot track the dynamic environments in DynODD. On closer observation, BDDEA-LDG performs the best among these three offline data-driven algorithms. This is because BDDEA-LDG applies a boosting strategy in surrogate model management, which can alleviate concept drift in building surrogate models. SAEF also performs much worse than DSE_MFS. This may be attributed to the fact that SAEF uses only the data chunk of the current environment to build surrogates, which may lead to much larger approximation errors on problems with small data chunks. The results of SAEF suggest that it is important to reuse historical data to alleviate data shortage in small data-driven optimization problems.

We list the runtime of each compared algorithm on each test instance averaged over 20 independent runs in Table 1. These results show that SAEF and DSE_MFS spend much less time than the other three offline data-driven EAs. This is reasonable as SAEF and DSE_MFS update models only when the environment changes and do not consider model management strategies during each environment, thus saving much computational time.

Statistical E_BBC and STD metric values of the five compared algorithms for different data chunk sizes are plotted in Figure 4. DSE_MFS performs particularly well when the chunk size is 5D. In contrast, the performance of the other four algorithms fluctuates only slightly with the amount of data, indicating that they are not very sensitive to the data chunk size.
Figure 4:

The E_BBC and STD obtained by DDEA-SE, TT_DDEA, BDDEA, and DSE_MFS on F6 with different dimensions and data chunk sizes.

Statistical results of the averaged errors (Avg_errors) over 20 runs between the best-found and the optimal solutions at each generation of the first 20 environments on F6 are reported in Figure 5. The figure shows that the errors of all compared algorithms generally decrease as the number of generations increases. This observation indicates that the surrogate models of all compared algorithms can guide the optimizer toward the optimal solution. In addition, DDEA-SE, TT-DDEA, and BDDEA-LDG, which are designed for static offline data-driven optimization problems, fluctuate more dramatically than SAEF and DSE_MFS. The underlying reason is that they incorporate adaptation strategies in neither surrogate model building nor the optimization process. Thus, they cannot quickly track the optimal solutions of new environments.
Figure 5:

The averaged errors over 20 runs between the best-found and the optimal solution at each generation of the first 20 environments on F6.


4.3  Further Comparative Studies

As introduced in Section 3, DSE_MFS contains two main components, DSE surrogate model construction and ensemble surrogate-assisted MFEA. In the following, we perform additional experiments on ten-dimensional F1 and F6 to examine the role of each component by comparing three variants of DSE_MFS.

  • SA_SOEA: SA_SOEA is a variant of DSE_MFS that replaces the multitask evolutionary optimizer with a traditional evolutionary algorithm and constructs the surrogate model using all data collected from the current and historical environments.

  • DSE_SOEA: DSE_SOEA is a variant of DSE_MFS that replaces the multitask evolutionary optimizer with a traditional evolutionary algorithm but builds surrogate models using DSE.

  • DSE_MPSO: DSE_MPSO is a variant of DSE_MFS that replaces the multitask evolutionary optimizer with a multipopulation particle swarm optimization algorithm proposed by Blackwell et al. (2008).

The statistical results of E_BBC and STD of the three variants on F1 and F6 are summarized in Table 2. As before, the symbols “+,” “-,” and “=” indicate that the corresponding variant is significantly better than, worse than, or comparable to DSE_MFS, respectively. The comparison results between SA_SOEA, DSE_SOEA, DSE_MPSO, and DSE_MFS are listed in the last row of Table 2, where l/t/w denotes that the variant in the current column loses l times, ties t times, and wins w times compared with the variant in the next column. DSE_MFS achieves the best performance among the compared variants; the three variants perform worse than or comparably to DSE_MFS on almost all the test instances. From the results in Table 2, we can make the following observations:

Table 2:

The statistical E_BBC and STD of DSE_MFS variants over 20 runs.

| Prob | C | SA_SOEA E_BBC(STD) | DSE_SOEA E_BBC(STD) | DSE_MPSO E_BBC(STD) | DSE_MFS E_BBC(STD) |
| --- | --- | --- | --- | --- | --- |
| F1 | C1 | 25.21(3.140)= | 23.91(3.179)= | 23.77(4.105)= | 22.52(3.100) |
| | C2 | 23.04(4.871)- | 22.11(4.333)- | 22.07(4.742)- | 20.88(5.424) |
| | C3 | 24.64(3.947)- | 21.01(4.270)- | 20.64(6.423)= | 19.44(4.230) |
| | C4 | 25.20(1.482)- | 24.93(1.549)= | 22.50(1.421)+ | 23.22(1.524) |
| | C5 | 20.36(5.136)- | 19.70(5.130)= | 21.03(5.077)- | 18.37(5.232) |
| | C6 | 21.82(5.199)- | 20.25(5.160)- | 20.57(5.073)- | 18.01(5.184) |
| F6 | C1 | 1456(64.72)- | 1250(60.31)- | 1294(66.76)- | 1173(59.83) |
| | C2 | 1348(72.53)- | 1260(65.85)- | 1283(74.97)- | 1176(62.63) |
| | C3 | 1439(75.37)- | 1251(74.15)- | 1279(82.52)- | 1169(62.70) |
| | C4 | 1445(76.08)- | 1159(71.19)- | 1251(67.59)- | 1060(91.83) |
| | C5 | 1458(75.29)- | 1267(66.55)- | 1293(58.73)- | 1186(68.57) |
| | C6 | 1451(66.71)- | 1263(69.24)- | 1293(56.92)- | 1181(61.23) |
| l/t/w | | 11/1/0 | 9/3/0 | 9/2/1 | |
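The l/t/w row can be cross-checked by counting the significance symbols in each column of Table 2. The short sketch below does this; the symbol strings are transcribed from the table in row order (F1 C1–C6, then F6 C1–C6), and the helper name `count_ltw` is hypothetical.

```python
def count_ltw(symbols):
    """Return (losses, ties, wins) from a string of '-', '=' and '+'."""
    return symbols.count("-"), symbols.count("="), symbols.count("+")


# Significance symbols per column of Table 2, rows F1 C1-C6 then F6 C1-C6.
table2_symbols = {
    "SA_SOEA": "=-----" + "------",
    "DSE_SOEA": "=--==-" + "------",
    "DSE_MPSO": "=-=+--" + "------",
}

for name, symbols in table2_symbols.items():
    losses, ties, wins = count_ltw(symbols)
    print(f"{name}: {losses}/{ties}/{wins}")
```

The counts match the 11/1/0, 9/3/0, and 9/2/1 entries reported in the last row.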

4.3.1  Effectiveness of the Data Stream Ensemble Surrogate

The comparison between SA_SOEA and DSE_SOEA (the results listed in the column of SA_SOEA) in Table 2 shows that DSE_SOEA significantly outperforms SA_SOEA on 11 of the 12 test instances and ties on the remaining one. To understand the reason behind this observation, we calculate the accuracy of the surrogate model in the two variants on F1 and F6 with data chunk sizes of 2D, 5D, 10D, and 15D in the presence of environmental change type C2. The surrogates in DSE_SOEA and SA_SOEA are denoted as the DSE surrogate and the SA surrogate, respectively. We first sample a data chunk using LHS in each environment and calculate the fitness of the samples using the real objective function. These solutions are used as the training data to construct the DSE and SA surrogates. Thereafter, 100,000 samples are randomly generated as the test data to calculate the RMSEs of the two trained models. The average RMSEs over the 60 environments on F1 and F6 with different data chunk sizes are plotted in Figure 6. The figure shows that the RMSEs of the two surrogates tend to decrease as the data chunk size increases. Meanwhile, the RMSE of the DSE surrogate is smaller than that of the SA surrogate. In other words, the DSE surrogate is more accurate than the SA surrogate, meaning that DSE learning improves the accuracy of the surrogate by leveraging the data from the previous environments, thereby improving the performance of DSE_SOEA.
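The accuracy check described above amounts to a Monte Carlo estimate of the RMSE between a surrogate and the real objective on randomly generated test points. A minimal sketch, assuming uniform sampling inside the box bounds and treating both models as plain callables (`surrogate_rmse` is a hypothetical name), is:

```python
import math
import random


def surrogate_rmse(surrogate, true_fn, lower, upper, n_test=100_000, rng=random):
    """Estimate the RMSE of a surrogate on uniformly random test points."""
    squared_error = 0.0
    for _ in range(n_test):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        squared_error += (surrogate(x) - true_fn(x)) ** 2
    return math.sqrt(squared_error / n_test)
```

Averaging this value over the 60 environments for each chunk size gives one point per curve in a plot such as Figure 6.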
Figure 6:

The average RMSEs of the SA and DSE surrogates over the 60 environments on F1 and F6 with different dimensions and data chunk sizes.


4.3.2  Effectiveness of the DSE-Assisted MFEA

The results of DSE_SOEA, DSE_MPSO, and DSE_MFS (in the columns of DSE_SOEA and DSE_MPSO) in Table 2 show that DSE_MFS performs better than both DSE_SOEA and DSE_MPSO. Thus, the variant that uses MFEA as the optimizer for the DSE surrogate outperforms the one that uses a traditional EA. This observation confirms that the optimization of the current problem can benefit from the knowledge of the problems in the previous environments. To further validate the effect of using MFEA, we record the best solutions found at each generation by DSE_SOEA and DSE_MFS on F1 (a maximization problem) and F6 (a minimization problem). The objective values of these solutions, evaluated by the DSE surrogates and by the real fitness functions and averaged over 20 independent runs for the 60 environments, are plotted in Figure 7. As shown in the figure, the solutions obtained by DSE_MFS are much better than those obtained by DSE_SOEA in terms of the DSE surrogate in most environments, indicating that DSE_MFS converges considerably faster and achieves a significantly better solution at the end of each environment than DSE_SOEA. The superiority of these solutions is confirmed when they are evaluated by the real objective functions, as also shown in Figure 7. In addition, DSE_MPSO uses only the DSE surrogate model as the fitness function and searches multiple subregions of the search space, whereas DSE_MFS uses the DSE surrogate model as the fitness function and its base learners as assistant fitness functions to guide the search. DSE_MFS outperforms DSE_MPSO on 9 out of 12 test instances and loses on only one. This result further confirms that the base learners help in searching for better solutions for the DSE surrogate model via knowledge transfer. The reason is that these base learners are similar to their DSE surrogate model.
Figure 7:

The solution obtained by DSE_SOEA and DSE_MFEA evaluated by the DSE surrogates and the real objective functions in the first 20 environments.


4.4  Parameter Sensitivity and Analysis

4.4.1  Sensitivity to the Maximum Number of Base Learners Kmax

In the proposed DSE_MFS, the maximum number of base learners, Kmax, influences both the accuracy of the DSE surrogates and the MFEA search, and therefore the performance of the algorithm. Here, we assess the sensitivity of DSE_MFS to Kmax using the ten-dimensional F1 and F6 as the test instances. Figure 8 presents the averaged mean obtained by DSE_MFS when Kmax is set to 1, 5, 10, 15, and 20. As shown in Figure 8, on F1 the averaged mean of DSE_MFS gradually decreases as Kmax increases while Kmax is less than 10, and suddenly increases when Kmax=20. On F6, DSE_MFS is relatively insensitive to Kmax and achieves the best overall performance when Kmax is around 10. Thus, we recommend Kmax=10.
Figure 8:

The E_BBC over 20 independent runs achieved by DSE_MFS with different Kmax.


4.4.2  Sensitivity to the Change Frequency FEs

The change frequency FEs influences the optimization time in each environment and thus the best-found solutions. Here, we examine the sensitivity of the compared algorithms to the change frequency on F1 and F6 instances with D=10. The results obtained by each algorithm when FEs={5,10,20,30,50} are plotted in Figure 9. The figure shows that the E_BBC values of all compared algorithms generally become smaller as FEs increases. This indicates that the best-found solutions are closer to the optimal solutions with larger FEs. This observation is reasonable, as the algorithms are likely to find better solutions with a larger computational budget.
Figure 9:

The E_BBC over 20 independent runs achieved by DDEA-SE, TT_DDEA, BDDEA, SAEF, and DSE_MFS with different FEs.


In this study, a data stream ensemble (DSE) assisted multifactorial evolutionary algorithm (MFEA) is proposed to solve dynamic offline data-driven optimization problems (DynODD). The DSE-assisted MFEA is able to leverage the knowledge of the problems in the previous environments to track the moving optimum quickly even if the amount of available data in each environment is small. The proposed algorithm, DSE_MFS, is compared with four state-of-the-art data-driven EAs on six dynamic optimization benchmarks to validate its performance. The empirical results demonstrate that DSE_MFS has achieved the best overall performance against the compared algorithms.

Although the performance of the proposed algorithm is encouraging, many challenges remain to be addressed. For example, this work reuses the knowledge acquired in previous environments in surrogate modeling and multitask optimization. Furthermore, investigating the use of surrogate models to generate a more informed initial population for preventing cold start is also interesting. Moreover, the dimension of the search space considered in this work is relatively low, and it will become much more challenging for high-dimensional problems. Finally, offline data-driven multi- and many-objective optimization in dynamic environments will also be important for future research.

Aggarwal
,
C. C.
(
2006
).
On biased reservoir sampling in the presence of stream evolution
. In
Proceedings of the 32nd International Conference on Very Large Data Bases
, pp.
607
618
.
Alippi
,
C.
,
Boracchi
,
G.
, and
Roveri
,
M.
(
2013
).
Just-in-time classifiers for recurrent concepts
.
IEEE Transactions on Neural Networks and Learning Systems
,
24
(
4
):
620
634
.
Alippi
,
C.
, and
Roveri
,
M.
(
2008
).
Just-in-time adaptive classifiers—Part II: Designing the classifier
.
IEEE Transactions on Neural Networks
,
19
(
12
):
2053
2064
.
Balabanov
,
V.
, and
Venter
,
G.
(
2004
).
Multi-fidelity optimization with high-fidelity analysis and low-fidelity gradients
. In
10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference
, p. 4459.
Bali
,
K. K.
,
Ong
,
Y.-S.
,
Gupta
,
A.
, and
Tan
,
P. S.
(
2019
).
Multifactorial evolutionary algorithm with online transfer parameter estimation: MFEA-II
.
IEEE Transactions on Evolutionary Computation
,
24
(
1
):
69
83
.
Bifet
,
A.
, and
Gavalda
,
R.
(
2007
).
Learning from time-changing data with adaptive windowing
. In
Proceedings of the 2007 SIAM International Conference on Data Mining
, pp.
443
448
.
Blackwell
,
T.
,
Branke
,
J.
, and
Li
,
X.
(
2008
).
Particle swarms for dynamic optimization problems
. In
Swarm intelligence
, pp.
193
217
.
Springer
.
Branke
,
J.
,
Asafuddoula
,
M.
,
Bhattacharjee
,
K. S.
, and
Ray
,
T.
(
2016
).
Efficient use of partially converged simulations in evolutionary optimization
.
IEEE Transactions on Evolutionary Computation
,
21
(
1
):
52
64
.
Brookes
,
D.
,
Park
,
H.
, and
Listgarten
,
J.
(
2019
).
Conditioning by adaptive sampling for robust design
. In
Proceedings of the 36th International Conference on Machine Learning
, Vol.
97
, pp.
773
782
.
Brzezinski
,
D.
, and
Stefanowski
,
J.
(
2013
).
Reacting to different types of concept drift: The Accuracy Updated Ensemble algorithm
.
IEEE Transactions on Neural Networks and Learning Systems
,
25
(
1
):
81
94
.
Cao
,
L.
,
Xu
,
L.
,
Goodman
,
E. D.
,
Bao
,
C.
, and
Zhu
,
S.
(
2019
).
Evolutionary dynamic multiobjective optimization assisted by a support vector regression predictor
.
IEEE Transactions on Evolutionary Computation
,
24
(
2
):
305
319
.
Chugh
,
T.
,
Chakraborti
,
N.
,
Sindhya
,
K.
, and
Jin
,
Y.
(
2017
).
A data-driven surrogate-assisted evolutionary algorithm applied to a many-objective blast furnace optimization problem
.
Materials and Manufacturing Processes
,
32
(
10
):
1172
1178
.
Chugh
,
T.
,
Jin
,
Y.
,
Miettinen
,
K.
,
Hakanen
,
J.
, and
Sindhya
,
K.
(
2018
).
A surrogate-assisted reference vector guided evolutionary algorithm for computationally expensive many-objective optimization
.
IEEE Transactions on Evolutionary Computation
,
22
(
1
):
129
142
.
Chugh
,
T.
,
Sindhya
,
K.
,
Hakanen
,
J.
, and
Miettinen
,
K.
(
2019
).
A survey on handling computationally expensive multiobjective optimization problems with evolutionary algorithms
.
Soft Computing
,
23
(
9
):
3137
3166
.
Cobb
,
H. G.
(
1990
).
An investigation into the use of hypermutation as an adaptive operator in genetic algorithms having continuous, time-dependent nonstationary environments
.
Technical report
.
Naval Research Lab, Washington DC
.
Cohen
,
E.
, and
Strauss
,
M.
(
2003
).
Maintaining time-decaying stream aggregates
. In
Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems
, pp.
223
233
.
Conn
,
A. R.
, and
Le Digabel
,
S.
(
2013
).
Use of quadratic models with mesh-adaptive direct search for constrained black box optimization
.
Optimization Methods and Software
,
28
(
1
):
139
158
.
Daneshyari
,
M.
, and
Yen
,
G. G.
(
2011
).
Dynamic optimization using cultural based PSO
. In
2011 IEEE Congress of Evolutionary Computation
, pp.
509
516
.
Deb
,
K.
, Udaya Bhaskara Rao, N., and
Karthik
,
S.
(
2007
).
Dynamic multi-objective optimization and decision-making using modified NSGA-II: A case study on hydro-thermal power scheduling
. In
International Conference on Evolutionary Multi-criterion Optimization
, pp.
803
817
.
Ding
,
J.
,
Chai
,
T.
,
Wang
,
H.
, and
Chen
,
X.
(
2012
).
Knowledge-based global operation of mineral processing under uncertainty
.
IEEE Transactions on Industrial Informatics
,
8
(
4
):
849
859
.
Ding
,
J.
,
Yang
,
C.
,
Jin
,
Y.
, and
Chai
,
T.
(
2017
).
Generalized multitasking for evolutionary optimization of expensive problems
.
IEEE Transactions on Evolutionary Computation
,
23
(
1
):
44
58
.
Du, K. L., and Swamy, M. (2006). Radial basis function networks. Neural Networks in a Softcomputing Framework, pp. 251–294.
Gaulton, A., Bellis, L. J., Bento, A. P., Chambers, J., Davies, M., Hersey, A., Light, Y., et al. (2012). ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Research, 40(D1):D1100–D1107.
Gomes, H. M., Barddal, J. P., Enembreck, F., and Bifet, A. (2017). A survey on ensemble learning for data stream classification. ACM Computing Surveys, 50(2):23.
Guo, D., Chai, T., Ding, J., and Jin, Y. (2016). Small data driven evolutionary multi-objective optimization of fused magnesium furnaces. In 2016 IEEE Symposium Series on Computational Intelligence, pp. 1–8.
Gupta, A., Ong, Y.-S., and Feng, L. (2016). Multifactorial evolution: Toward evolutionary multitasking. IEEE Transactions on Evolutionary Computation, 20(3):343–357.
Gupta, A., Ong, Y.-S., and Feng, L. (2018). Insights on transfer optimization: Because experience is the best teacher. IEEE Transactions on Emerging Topics in Computational Intelligence, 2(1):51–64.
Gupta, A., Ong, Y.-S., Feng, L., and Tan, K. C. (2016). Multiobjective multifactorial optimization in evolutionary multitasking. IEEE Transactions on Cybernetics, 47(7):1652–1665.
Hatzakis, I., and Wallace, D. (2006). Dynamic multi-objective optimization with evolutionary algorithms: A forward-looking approach. In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, pp. 1201–1208.
Huang, P., Wang, H., and Jin, Y. (2021). Offline data-driven evolutionary optimization based on tri-training. Swarm and Evolutionary Computation, 60:100800.
Jekabsons, G. (2009). RBF: Radial basis function interpolation for Matlab/Octave (Version 1.1). Riga Technical University, Latvia.
Jiang, M., Wang, Z., Guo, S., Gao, X., and Tan, K. C. (2020). Individual-based transfer learning for dynamic multiobjective optimization. IEEE Transactions on Cybernetics, 51(10):4968–4981.
Jin, Y. (2016). Data driven evolutionary optimization of complex systems: Big data versus small data. In Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion, pp. 1281–1282.
Jin, Y., and Branke, J. (2005). Evolutionary optimization in uncertain environments---A survey. IEEE Transactions on Evolutionary Computation, 9(3):303–317.
Jin, Y., and Sendhoff, B. (2009). A systems approach to evolutionary multiobjective structural optimization and beyond. IEEE Computational Intelligence Magazine, 4(3):62–76.
Jin, Y., Wang, H., Chugh, T., Guo, D., and Miettinen, K. (2019). Data-driven evolutionary optimization: An overview and case studies. IEEE Transactions on Evolutionary Computation, 23(3):442–458.
Jin, Y., Wang, H., and Sun, C. (2021). Data-driven evolutionary optimization. Springer.
Kearns, M., and Ron, D. (1999). Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Computation, 11(6):1427–1453.
Koychev, I. (2000). Gradual forgetting for adaptation to concept drift. In Proceedings of ECAI 2000 Workshop on Current Issues in Spatio-Temporal Reasoning, pp. 101–107.
Krawczyk, B., Minku, L. L., Gama, J., Stefanowski, J., and Woźniak, M. (2017). Ensemble learning for data stream analysis: A survey. Information Fusion, 37:132–156.
Kumar, A., and Levine, S. (2020). Model inversion networks for model-based optimization. Advances in Neural Information Processing Systems, 33:5126–5137.
Li, C., Yang, S., Nguyen, T., Yu, E. L., Yao, X., Jin, Y., Beyer, H., and Suganthan, P. (2008). Benchmark generator for CEC 2009 competition on dynamic optimization. Technical Report. Department of Computer Science, University of Leicester, UK.
Li, J.-Y., Zhan, Z.-H., Tan, K. C., and Zhang, J. (2021). A meta-knowledge transfer-based differential evolution for multitask optimization. IEEE Transactions on Evolutionary Computation, 26(4):719–734.
Li, J.-Y., Zhan, Z.-H., Wang, C., Jin, H., and Zhang, J. (2020). Boosting data-driven evolutionary algorithm with localized data generation. IEEE Transactions on Evolutionary Computation, 24(5):923–937.
Li, J.-Y., Zhan, Z.-H., Wang, H., and Zhang, J. (2020). Data-driven evolutionary algorithm with perturbation-based ensemble surrogates. IEEE Transactions on Cybernetics, 51(8):3925–3937.
Liao, T., Wang, G., Yang, B., Lee, R., Pister, K., Levine, S., and Calandra, R. (2019). Data-efficient learning of morphology and controller for a microrobot. In 2019 International Conference on Robotics and Automation, pp. 2488–2494.
Liu, A., Lu, J., and Zhang, G. (2020). Diverse instance-weighting ensemble based on region drift disagreement for concept drift adaptation. IEEE Transactions on Neural Networks and Learning Systems, 32(1):293–307.
Liu, R., Zhang, W., Jiao, L., Liu, F., and Ma, J. (2010). A sphere-dominance based preference immune-inspired algorithm for dynamic multi-objective optimization. In Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation, pp. 423–430.
Liu, X.-F., Zhan, Z.-H., Gu, T.-L., Kwong, S., Lu, Z., Duh, H. B.-L., and Zhang, J. (2019). Neural network-based information transfer for dynamic optimization. IEEE Transactions on Neural Networks and Learning Systems, 31(5):1557–1570.
Lu, J., Liu, A., Dong, F., Gu, F., Gama, J., and Zhang, G. (2018). Learning under concept drift: A review. IEEE Transactions on Knowledge and Data Engineering, 31(12):2346–2363.
Luo, W., Yi, R., Yang, B., and Xu, P. (2018). Surrogate-assisted evolutionary framework for data-driven dynamic optimization. IEEE Transactions on Emerging Topics in Computational Intelligence, 3(2):137–150.
Mavrovouniotis, M., Li, C., and Yang, S. (2017). A survey of swarm intelligence for dynamic optimization: Algorithms and applications. Swarm and Evolutionary Computation, 33:1–17.
Muruganantham, A., Tan, K. C., and Vadakkepat, P. (2015). Evolutionary dynamic multiobjective optimization via Kalman filter prediction. IEEE Transactions on Cybernetics, 46(12):2862–2873.
Nakano, H., Kojima, M., and Miyauchi, A. (2015). An artificial bee colony algorithm with a memory scheme for dynamic optimization problems. In 2015 IEEE Congress on Evolutionary Computation, pp. 2657–2663.
Nasiri, B., and Meybodi, M. R. (2016). History-driven firefly algorithm for optimisation in dynamic and uncertain environments. International Journal of Bio-Inspired Computation, 8(5):326–339.
Stein, M. (1987). Large sample properties of simulations using Latin hypercube sampling. Technometrics, 29(2):143–151.
Street, W. N., and Kim, Y. (2001). A streaming ensemble algorithm (SEA) for large-scale classification. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 377–382.
Sun, Y., Tang, K., Zhu, Z., and Yao, X. (2018). Concept drift adaptation by exploiting historical knowledge. IEEE Transactions on Neural Networks and Learning Systems, 29(10):4822–4832.
Trabucco, B., Kumar, A., Geng, X., and Levine, S. (2021). Conservative objective models for effective offline model-based optimization. In International Conference on Machine Learning, pp. 10358–10368.
Wang, H., Jin, Y., and Jansen, J. O. (2016). Data-driven surrogate-assisted multiobjective evolutionary optimization of a trauma system. IEEE Transactions on Evolutionary Computation, 20(6):939–952.
Wang, H., Jin, Y., Sun, C., and Doherty, J. (2019). Offline data-driven evolutionary optimization using selective surrogate ensembles. IEEE Transactions on Evolutionary Computation, 23(2):203–216.
Wu, S.-H., Zhan, Z.-H., Tan, K. C., and Zhang, J. (2022). Orthogonal transfer for multitask optimization. IEEE Transactions on Evolutionary Computation, 27(1):185–200.
Yang, C., Ding, J., Jin, Y., and Chai, T. (2019). Offline data-driven multiobjective optimization: Knowledge transfer between surrogates and generation of final solutions. IEEE Transactions on Evolutionary Computation, 24(3):409–423.
Yang, C., Ding, J., Tan, K. C., and Jin, Y. (2017). Two-stage assortative mating for multi-objective multifactorial evolutionary optimization. In 2017 IEEE 56th Annual Conference on Decision and Control, pp. 76–81.
Yazdani, D., Cheng, R., Yazdani, D., Branke, J., Jin, Y., and Yao, X. (2021). A survey of evolutionary continuous dynamic optimization over two decades---Part A. IEEE Transactions on Evolutionary Computation, 25(4):609–629.
Yu, E., and Suganthan, P. N. (2009). Evolutionary programming with ensemble of explicit memories for dynamic optimization. In 2009 IEEE Congress on Evolutionary Computation, pp. 431–438.
Zimmer, L., Lindauer, M., and Hutter, F. (2021). Auto-PyTorch: Multi-fidelity metalearning for efficient and robust AutoDL. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(9):3079–3090.