1-12 of 12
Mengjie Zhang
Journal Articles
Publisher: Journals Gateway
Evolutionary Computation 1–30.
Published: 18 July 2023
Abstract
Minimizing the number of selected features and maximizing the classification performance are two main objectives in feature selection, which can be formulated as a bi-objective optimization problem. Due to the complex interactions between features, a solution (i.e., feature subset) with poor objective values does not mean that all the features it selects are useless, as some of them, combined with other complementary features, can greatly improve the classification performance. Thus, it is necessary to consider not only the performance of feature subsets in the objective space but also their differences in the search space, to explore more promising feature combinations. To this end, this paper proposes a tri-objective method for bi-objective feature selection in classification, which solves a bi-objective feature selection problem as a tri-objective problem by considering the diversity (differences) between feature subsets in the search space as the third objective. The selection based on the converted tri-objective method can maintain a balance between minimizing the number of selected features, maximizing the classification performance, and exploring more promising feature subsets. Furthermore, a novel initialization strategy and an offspring reproduction operator are proposed to promote the diversity of feature subsets in the objective space and improve the search ability, respectively. The proposed algorithm is compared with five multi-objective-based feature selection methods, six typical feature selection methods, and two peer methods with diversity as a helper objective. Experimental results on 20 real-world classification datasets suggest that the proposed method outperforms the compared methods in most scenarios.
Includes: Supplementary data
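As an illustrative sketch only (not the paper's exact formulation), the three objectives for a feature subset could be evaluated as below; the Hamming-distance diversity measure and the `tri_objectives` helper are assumptions introduced here for illustration:

```python
def hamming(a, b):
    """Hamming distance between two binary feature masks."""
    return sum(x != y for x, y in zip(a, b))

def tri_objectives(mask, error, population):
    """Three objectives for one feature subset (all to be minimised).

    mask: binary list, 1 = feature selected
    error: classification error of this subset
    population: the other masks currently in the population
    """
    n_selected = sum(mask)  # objective 1: subset size
    # objective 3: diversity in the search space, here the mean Hamming
    # distance to the rest of the population (maximised, so negated)
    diversity = sum(hamming(mask, m) for m in population) / max(len(population), 1)
    return (n_selected, error, -diversity)

# toy usage with a population of two other masks
pop = [[1, 0, 1, 0], [0, 1, 1, 1]]
objs = tri_objectives([1, 1, 0, 0], 0.25, pop)
```

A multi-objective selection scheme (e.g., non-dominated sorting) would then compare such triples instead of the original pairs.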
Evolutionary Computation (2022) 30 (1): 99–129.
Published: 01 March 2022
Abstract
High-dimensional unbalanced classification is challenging because of the joint effects of high dimensionality and class imbalance. Genetic programming (GP) has potential benefits for high-dimensional classification due to its built-in capability to select informative features. However, when data are not evenly distributed, GP tends to develop biased classifiers that achieve a high accuracy on the majority class but a low accuracy on the minority class. Unfortunately, the minority class is often at least as important as the majority class, so it is important to investigate how GP can be effectively utilized for high-dimensional unbalanced classification. In this article, to address the performance bias issue of GP, a new two-criterion fitness function is developed, which considers two criteria: the approximation of the area under the curve (AUC) and the classification clarity (i.e., how well a program can separate the two classes). The values obtained on the two criteria are combined in pairs, instead of being summed together. Furthermore, this article designs a three-criterion tournament selection to effectively identify and select good programs to be used by the genetic operators to generate offspring during the evolutionary learning process. The experimental results show that the proposed method achieves better classification performance than the compared methods.
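A minimal sketch of the two criteria, kept as a pair rather than summed; the pairwise-AUC approximation and the `clarity` gap measure are assumed stand-ins for the paper's exact definitions:

```python
def auc_approx(minority_out, majority_out):
    """Approximate AUC: fraction of (minority, majority) output pairs
    where the program ranks the minority instance higher (ties = 0.5)."""
    pairs = [(a, b) for a in minority_out for b in majority_out]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs)
    return wins / len(pairs)

def clarity(minority_out, majority_out):
    """Gap between the two classes' mean program outputs, squashed to [0, 1]."""
    gap = (sum(minority_out) / len(minority_out)
           - sum(majority_out) / len(majority_out))
    return max(0.0, min(1.0, 0.5 + gap / 2.0))

def fitness(minority_out, majority_out):
    # returned as a pair (not a weighted sum), so that selection can
    # compare candidate programs criterion by criterion
    return (auc_approx(minority_out, majority_out),
            clarity(minority_out, majority_out))

f = fitness([0.9, 0.8], [0.1, 0.2])
```

A tournament selection could then prefer programs that win on both criteria, falling back to one of them to break ties.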
Evolutionary Computation (2021) 29 (3): 331–366.
Published: 01 September 2021
Abstract
The performance of image classification is highly dependent on the quality of the extracted features that are used to build a model. Designing such features usually requires prior knowledge of the domain and is often undertaken by a domain expert who, if available, is very costly to employ. Automating the process of designing such features can largely reduce the cost and effort associated with this task. Image descriptors, such as local binary patterns, have emerged in computer vision, and aim at detecting keypoints, for example, corners, line segments, and shapes, in an image and extracting features from those keypoints. In this article, genetic programming (GP) is used to automatically evolve an image descriptor using only two instances per class, by utilising a multitree program representation. The automatically evolved descriptor operates directly on the raw pixel values of an image and generates the corresponding feature vector. Seven well-known datasets were adapted to the few-shot setting and used to assess the performance of the proposed method, which was compared against six handcrafted image descriptors, one evolutionary computation-based image descriptor, and three convolutional neural network (CNN) based methods. The experimental results show that the new method significantly outperforms the competitor image descriptors and CNN-based methods. Furthermore, different patterns have been identified from analysing the evolved programs.
Evolutionary Computation (2021) 29 (1): 75–105.
Published: 01 March 2021
Abstract
Dynamic Flexible Job Shop Scheduling (DFJSS) is an important and challenging problem that can have multiple conflicting objectives. Genetic Programming Hyper-Heuristic (GPHH) is a promising approach for responding quickly to the dynamic and unpredictable events in DFJSS. A GPHH algorithm evolves dispatching rules (DRs) that are used within a so-called heuristic template to make decisions during the scheduling process. In DFJSS, there are two kinds of scheduling decisions: the routing decision, which allocates each operation to a machine, and the sequencing decision, which selects the next job to be processed by each idle machine. The traditional heuristic template makes both routing and sequencing decisions in a non-delay manner, which may have limitations in handling the dynamic environment. In this article, we propose a novel heuristic template that delays the routing decisions rather than making them immediately, so that all the decisions can be made under the latest and most accurate information. We propose three different delayed routing strategies and automatically evolve the rules in the heuristic template by GPHH. We evaluate the newly proposed GPHH with Delayed Routing (GPHH-DR) on a multiobjective DFJSS problem that optimises energy efficiency and mean tardiness. The experimental results show that GPHH-DR significantly outperforms the state-of-the-art GPHH methods. We further demonstrate the efficacy of the proposed heuristic template with delayed routing, which suggests the importance of delaying the routing decisions.
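A simplified sketch of the delayed-routing idea, not one of the paper's three actual strategies: operations wait in a shared pool, and a routing decision is made only when a machine actually becomes idle, using whatever (evolved) routing rule is supplied. All names and the dictionary-based job/machine representation are assumptions for illustration:

```python
def delayed_routing(ready_ops, machines, routing_rule):
    """Assign waiting operations only to machines that are idle right now,
    so each decision uses up-to-date workload information."""
    assignments = []
    for m in machines:
        if m["busy"] or not ready_ops:
            continue
        # score every waiting operation with the (evolved) routing rule;
        # lower score = higher priority
        op = min(ready_ops, key=lambda o: routing_rule(o, m))
        ready_ops.remove(op)
        assignments.append((op["id"], m["id"]))
    return assignments

# hand-written stand-in for a GP-evolved routing rule
rule = lambda o, m: o["proc_time"] + m["workload"]

ops = [{"id": "o1", "proc_time": 5}, {"id": "o2", "proc_time": 2}]
machines = [{"id": "m1", "busy": False, "workload": 3},
            {"id": "m2", "busy": True, "workload": 0}]
assignments = delayed_routing(ops, machines, rule)
```

In a non-delay template the same rule would have been applied once, at operation arrival, with possibly stale workload figures.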
Evolutionary Computation (2020) 28 (4): 531–561.
Published: 01 December 2020
Abstract
Clustering is a difficult and widely studied data mining task, with many varieties of clustering algorithms proposed in the literature. Nearly all algorithms use a similarity measure such as a distance metric (e.g., Euclidean distance) to decide which instances to assign to the same cluster. These similarity measures are generally predefined and cannot be easily tailored to the properties of a particular dataset, which leads to limitations in the quality and the interpretability of the clusters produced. In this article, we propose a new approach to automatically evolving similarity functions for a given clustering algorithm by using genetic programming. We introduce a new genetic programming-based method which automatically selects a small subset of features (feature selection) and then combines them using a variety of functions (feature construction) to produce dynamic and flexible similarity functions that are specifically designed for a given dataset. We demonstrate how the evolved similarity functions can be used to perform clustering using a graph-based representation. The results of a variety of experiments across a range of large, high-dimensional datasets show that the proposed approach can achieve higher and more consistent performance than the benchmark methods. We further extend the proposed approach to automatically produce multiple complementary similarity functions by using a multi-tree approach, which gives further performance improvements. We also analyse the interpretability and structure of the automatically evolved similarity functions to provide insight into how and why they are superior to standard distance metrics.
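A minimal sketch of the graph-based clustering step, assuming a hand-written similarity function in place of a GP-evolved one; the union-find clustering and the threshold parameter are illustrative assumptions, not the paper's exact procedure:

```python
def cluster_with_similarity(data, similarity, threshold):
    """Graph-based clustering: connect instances whose (evolved)
    similarity exceeds a threshold, then return connected components."""
    n = len(data)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity(data[i], data[j]) >= threshold:
                parent[find(i)] = find(j)  # union the two components

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# stand-in for an evolved similarity function that has implicitly done
# feature selection: it uses only features 0 and 2, ignoring feature 1
sim = lambda a, b: 1.0 / (1.0 + abs(a[0] - b[0]) + abs(a[2] - b[2]))
data = [(0, 9, 0), (0.1, 1, 0.1), (5, 9, 5)]
groups = cluster_with_similarity(data, sim, 0.5)
```

An evolved similarity function would replace `sim` with an expression tree combining the selected features.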
Evolutionary Computation (2020) 28 (4): 563–593.
Published: 01 December 2020
Abstract
Due to its direct relevance to post-disaster operations, meter reading, and civil refuse collection, the Uncertain Capacitated Arc Routing Problem (UCARP) is an important optimisation problem. Stochastic models are critical to study as they represent the real world more accurately than their deterministic counterparts. Although there have been extensive studies on solving routing problems under uncertainty, very few have considered UCARP, and none consider collaboration between vehicles to handle the negative effects of uncertainty. This article proposes a novel Solution Construction Procedure (SCP) that generates solutions to UCARP within a collaborative, multi-vehicle framework. It consists of two types of collaborative activities: one when a vehicle unexpectedly expends its capacity (route failure), and the other during the refill process. Then, we propose a Genetic Programming Hyper-Heuristic (GPHH) algorithm to evolve the routing policy used within the collaborative framework. The experimental studies show that the new heuristic with vehicle collaboration and a GP-evolved routing policy significantly outperforms the compared state-of-the-art algorithms on commonly studied test problems, especially on instances with larger numbers of tasks and vehicles. This clearly shows the advantage of vehicle collaboration in handling the uncertain environment, and the effectiveness of the newly proposed algorithm.
Evolutionary Computation (2020) 28 (2): 289–316.
Published: 01 June 2020
Abstract
The uncertain capacitated arc routing problem is of great significance for its wide applications in the real world. In this problem, variables such as task demands and travel costs are realised in real time, which may cause the predefined solution to become ineffective and/or infeasible. There are two main challenges in solving the problem. One is to obtain a high-quality and robust baseline task sequence, and the other is to design an effective recourse policy to adjust the baseline task sequence when it becomes infeasible and/or ineffective during execution. Existing studies typically tackle only one challenge (the other being addressed using a naive strategy), and no existing work optimises the baseline task sequence and the recourse policy simultaneously. To fill this gap, we propose a novel proactive-reactive approach, which represents a solution as a baseline task sequence plus a recourse policy. The two components are optimised under a cooperative coevolution framework, in which the baseline task sequence is evolved by an estimation of distribution algorithm and the recourse policy is evolved by genetic programming. The experimental results show that the proposed algorithm, called Solution-Policy Coevolver, significantly outperforms the state-of-the-art algorithms for the uncertain capacitated arc routing problem on the ugdb and uval benchmark instances. Through further analysis, we discovered that route failure is not always detrimental: in certain cases (e.g., when the vehicle is on the way back to the depot), allowing route failure can lead to better solutions.
Includes: Supplementary data
Evolutionary Computation (2019) 27 (3): 467–496.
Published: 01 September 2019
Abstract
Designing effective dispatching rules for production systems manually is a difficult and time-consuming task. In the last decade, the growth of computing power, advanced machine learning, and optimisation techniques has made the automated design of dispatching rules possible, and automatically discovered rules are competitive with, or outperform, existing rules developed by researchers. Genetic programming is one of the most popular approaches to discovering dispatching rules in the literature, especially for complex production systems. However, the large heuristic search space may restrict genetic programming from finding near-optimal dispatching rules. This article develops a new hybrid genetic programming algorithm for dynamic job shop scheduling based on a new representation, a new local search heuristic, and efficient fitness evaluators. Experiments show that the new method is effective regarding the quality of the evolved rules. Moreover, the evolved rules are also significantly smaller and contain more relevant attributes.
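To make the idea of a dispatching rule concrete: a rule is just a priority function over the jobs queued at a machine, which GP represents as an expression tree. The sketch below uses a hand-written rule in the spirit of classic "2PT + WINQ" style rules; the job attributes and field names are assumptions for illustration, not this article's representation:

```python
def dispatch(queue, rule):
    """Pick the next job from a machine's queue using a priority rule
    (what a GP individual would encode as an expression tree).
    Lower score = higher priority."""
    return min(queue, key=rule)

# illustrative hand-written rule: twice the processing time plus the
# work in the next queue (a commonly cited benchmark-style rule shape)
rule = lambda job: 2 * job["pt"] + job["winq"]

queue = [{"id": "a", "pt": 4, "winq": 1},
         {"id": "b", "pt": 2, "winq": 10}]
nxt = dispatch(queue, rule)
```

GP would search over expressions built from attributes like `pt` and `winq` instead of fixing the formula by hand.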
Evolutionary Computation (2017) 25 (2): 173–204.
Published: 01 June 2017
Abstract
A main research direction in the field of evolutionary machine learning is to develop a scalable classifier system to solve high-dimensional problems. Recently, work has begun on autonomously reusing learned building blocks of knowledge to scale from low-dimensional problems to high-dimensional ones. An XCS-based classifier system, known as XCSCFC, has been shown to be scalable, through the addition of expression tree-like code fragments, to a limit beyond standard learning classifier systems. XCSCFC is especially beneficial if the target problem can be divided into a hierarchy of subproblems, each of which is solvable in a bottom-up fashion. However, if the hierarchy of subproblems is too deep, XCSCFC becomes impractical because of the computational time needed and thus eventually hits a limit in problem size. A limitation of this technique is the lack of a cyclic representation, which is inherent in finite state machines (FSMs). However, the evolution of FSMs is a hard task owing to the combinatorially large number of possible states, connections, and interactions. Usually this requires supervised learning to minimize inappropriate FSMs, which for high-dimensional problems necessitates subsampling or incremental testing. To avoid these constraints, this work introduces a state-machine-based encoding scheme into XCS for the first time, termed XCSSMA. The proposed system has been tested on six complex Boolean problem domains: multiplexer, majority-on, carry, even-parity, count ones, and digital design verification problems. The proposed approach outperforms XCSCFA (an XCS that computes actions) and XCSF (an XCS that computes predictions) in three of the six problem domains, while its performance in the others is similar. In addition, XCSSMA evolved, for the first time, compact and human-readable general classifiers (i.e., solving any n-bit problem) for the even-parity and carry problem domains, demonstrating its ability to produce scalable solutions using a cyclic representation.
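The even-parity case illustrates why a cyclic representation scales: a two-state FSM solves the problem for any n, whereas a tree-based encoding grows with n. The sketch below is a hand-written example of such a machine, not the classifier XCSSMA actually evolved:

```python
def fsm_even_parity(bits):
    """A two-state finite state machine that returns 1 if a bit string
    has an even number of ones, for any length n -- the kind of compact
    cyclic solution a state-machine representation makes possible."""
    # transition table: state -> (next state on input 0, next state on input 1)
    transitions = {"even": ("even", "odd"),
                   "odd": ("odd", "even")}
    state = "even"  # zero ones seen so far is even parity
    for b in bits:
        state = transitions[state][b]
    return 1 if state == "even" else 0
```

The cycle (the machine revisits its two states indefinitely) is exactly what acyclic expression trees cannot express compactly.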
Evolutionary Computation (2016) 24 (1): 143–182.
Published: 01 March 2016
Abstract
In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently, we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and the complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate the two methods, and also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to the competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers over handcrafted features and those extracted by a recently developed GP-based method in most cases.
Evolutionary Computation (2014) 22 (4): 629–650.
Published: 01 December 2014
Abstract
Image pattern classification is a challenging task due to the large search space of pixel data. Supervised and subsymbolic approaches have proven accurate in learning a problem's classes. However, in the complex image recognition domain, there is a need to investigate learning techniques that allow humans to interpret the learned rules in order to gain insight into the problem. Learning classifier systems (LCSs) are a machine learning technique that has been minimally explored for image classification. This work develops the feature pattern classification system (FPCS) framework by adopting Haar-like features from the image recognition domain for feature extraction. The FPCS integrates Haar-like features with XCS, which is an accuracy-based LCS. A major contribution of this work is that the developed framework is capable of producing human-interpretable rules. The FPCS system achieved 91.1% accuracy on the unseen test set of the MNIST dataset. In addition, the FPCS is capable of autonomously adjusting the rotation angle in unaligned images; this rotation adjustment raised the accuracy of FPCS to 95%. Although this performance is competitive with equivalent approaches, it was not as accurate as subsymbolic approaches on this dataset. However, the interpretability of the rules produced by FPCS enabled us to identify the distribution of the learned angles (approximately normal), which would have been very difficult with subsymbolic approaches. The analyzable nature of FPCS is anticipated to be beneficial in domains such as speed sign recognition, where the underlying reasoning and confidence of recognition need to be human interpretable.
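Haar-like features themselves are standard and cheap to compute via an integral image (summed-area table), so any rectangle sum costs four lookups. The sketch below shows a two-rectangle vertical feature; the function names and the specific feature layout are illustrative, not the FPCS feature set:

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img over rows < y, cols < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_two_vertical(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# toy 2x4 image: bright on the left, dark on the right
img = [[1, 1, 0, 0],
       [1, 1, 0, 0]]
ii = integral_image(img)
f = haar_two_vertical(ii, 0, 0, 4, 2)
```

A strong positive response (here the left/right contrast) is the kind of value an XCS condition would threshold on.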
Evolutionary Computation (2014) 22 (1): 105–138.
Published: 01 March 2014
Abstract
Due-date assignment plays an important role in scheduling systems and strongly influences the delivery performance of job shops. Because of the stochastic and dynamic nature of job shops, the development of general due-date assignment models (DDAMs) is complicated. In this study, two genetic programming (GP) methods are proposed to evolve DDAMs for job shop environments. The experimental results show that the evolved DDAMs can make more accurate estimates than other existing dynamic DDAMs with promising reusability. In addition, the evolved operation-based DDAMs show better performance than the evolved DDAMs employing aggregate information of jobs and machines.
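To illustrate what "operation-based" means here: instead of estimating a job's flowtime in aggregate, each operation's flowtime is estimated separately and the estimates are summed. The sketch below is a hand-written stand-in; the load-inflation formula and all field names are assumptions for illustration, not the evolved DDAMs:

```python
def operation_based_due_date(release_time, operations, allowance):
    """Operation-based due-date assignment: estimate each operation's
    flowtime from its own processing time, inflated by the queue load
    on its target machine, then sum the per-operation estimates."""
    estimate = 0.0
    for op in operations:
        # per-operation allowance grows with the target machine's load
        estimate += op["proc_time"] * (1.0 + allowance * op["queue_load"])
    return release_time + estimate

# toy job with two operations; the second machine is currently idle
ops = [{"proc_time": 4, "queue_load": 2},
       {"proc_time": 3, "queue_load": 0}]
dd = operation_based_due_date(10, ops, 0.5)
```

A GP-evolved DDAM would replace the fixed inflation formula with an evolved expression over job and machine attributes.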