Abstract
The field of metaheuristics has a long history of finding inspiration in natural systems, starting from evolution strategies, genetic algorithms, and ant colony optimization in the second half of the 20th century. In recent decades, however, the field has experienced an explosion of metaphor-centered methods claiming to be inspired by increasingly absurd natural (and even supernatural) phenomena—several different types of birds, mammals, fish, and invertebrates, soccer and volleyball, reincarnation, zombies, and gods. Although metaphors can be powerful inspiration tools, the emergence of hundreds of barely discernible algorithmic variants under different labels and nomenclatures has been counterproductive to the scientific progress of the field, as it neither improves our ability to understand and simulate biological systems nor contributes generalizable knowledge or design principles for global optimization approaches. In this article we discuss some of the possible causes of this trend, its negative consequences for the field, and some efforts aimed at moving the area of metaheuristics toward a better balance between inspiration and scientific soundness.
1 Introduction
In 1865, August Kekulé proposed that the structure of benzene was a hexagonal ring of six carbon atoms, solving a problem that had confounded chemists for decades. Kekulé championed visual scientific creativity and mentioned that his inspiration came from a daydream about an Ouroboros, which is a symbol depicting a serpent or dragon eating its own tail. However, it is clear to anyone who has gone through even a basic course in organic chemistry that scientists do not discuss their work using snake anatomy terminology or try to come up with new compounds by carefully examining legendary reptiles. Despite the importance he attributed to visual creativity, Kekulé himself only went on record about his original inspiration in 1890, at a meeting held in his honor (Robinson, 2010). In a similar anecdote, Elias Howe is reported to have drawn inspiration for the needle design in his lock-stitch sewing machine from a nightmare in which he was threatened by cannibals with hollow-tip spears. Engineers, however, have never described their machines in anthropological terms or attempted to design better equipment by looking at the habits of isolated anthropophagous tribes. Howe himself is not known to have publicly discussed his inspiration, which only appeared in a family chronicle decades after the event (Draper, 1900; Windsor, 1905).
Throughout history, scientists and engineers have drawn inspiration from different sources: the natural world, dreams, or personal experiences. Ideas from biology and observations of natural processes have inspired interesting developments within computer science and engineering since at least the 1960s, suggesting, among other things, innovative ways to solve optimization problems (Beyer & Schwefel, 2002; Bremermann, 1962; Dorigo et al., 1996; Fogel & Fogel, 1995; Holland, 1975; Kennedy & Eberhart, 1995; Kirkpatrick et al., 1983). The development of these methods was often experiment-driven rather than theory-led, which was not surprising for a new field lacking an existing theoretical framework. Although the algorithms were in most cases described and discussed using metaphor-specific language, beyond what would be necessary for understanding the computational concepts being implemented, the elements of good scientific practice were present: An original idea would suggest a new method, which would be tested, refined, and compared against state-of-the-art approaches for the problems it was intended to solve. Attempts at theoretical development would be advanced, discussed, adopted, or refuted depending on their success in explaining the behavior of each method. This approach led to rapid development of metaheuristic methodologies, with excellent results on a variety of applied problems whose characteristics precluded the use of traditional mathematical programming methods.
This so-called initial phase of nature-inspired computation has its origins somewhat interwoven with those of Artificial Life (ALife) (Banzhaf & McMullin, 2012; Stein et al., 2021). Despite the difference in focus and approach, the two fields had—and, to some extent, still maintain—an interesting exchange of ideas and concepts. Developments in swarm and evolutionary computation (SaEC) not only draw from existing biological concepts but often go beyond the constraints of known biological reality in their pursuit of better problem-solving strategies, which can be easily connected to the ALife concept of “life-as-it-could-be” (Langton, 1998, p. 1). In the opposite direction, developments in SaEC are also known to feed back into ALife, not only in terms of better simulation and understanding of biological and lifelike phenomena (Lehman et al., 2020) but also in areas like evolutionary hardware (Eiben & Smith, 2015). As such, an understanding of what happened to the publication landscape of nature-inspired computation in the last two decades, as well as an awareness of recent initiatives aimed at bringing the field back to more methodologically sound grounds, can serve as a cautionary tale to researchers in the closely related field of ALife. This awareness may be particularly relevant for the emerging publication ecosystem around lifelike computing systems, which risks becoming attractive to the same sort of opportunistic publishing that took hold of considerable portions of the nature-inspired metaheuristics community unless countermeasures, such as clear editorial policies, are established. In the remainder of this article, we elaborate on what we perceive as the problem with the so-called age of the metaphors and some of the recent initiatives aimed at mitigating its damage to the field.
2 The Age of the Metaphors
The success of early nature-inspired metaheuristics led to attempts to find other phenomena that could provide insights for optimization. Around the late 1990s and early 2000s, this pursuit of insightful inspiration from natural processes started to transform into a different phenomenon: an increasing number of publications claiming to present revolutionary ideas or even “novel paradigms for optimization,” based on ever more obscure social, natural, or even supernatural metaphors.
Inspired by a “Cat Swarm Optimization” paper, in 2014, we started gathering examples of particularly absurd metaphors published in peer-reviewed venues in a humorous catalog named the Evolutionary Computation Bestiary (Campelo & Aranha, 2021). As the website started to attract attention, several colleagues contacted us to recommend entries based on new and progressively more bizarre examples. The raw number of different methods added to the Bestiary showed that this was (and remains) a growing and concerning phenomenon.
Figure 1 illustrates this point. Between 2000 and 2008, we see the publication of a few methods per year (including algorithms based on sheep flocks, musicians, plant saplings, parliament elections, and the big bang). This increased to an average of more than one per month between 2009 and 2013 (with methods referring to semi-intelligent water drops, group counseling, sports championships, fireflies, paddy fields, and mountain climbers), and then to an average of two new metaphor-based methods being published every month in the peer-reviewed literature after 2014 (including not only sharks, zombies, and volleyball but also reincarnation, four different whale-based and three distinct football-based methods, barnacles, chicken swarms, interior design and decoration, and several other equally outlandish ones).1
3 Why This Is a Problem
The sheer volume of papers following the same general pattern raises a few important questions. The first one is whether there really are hundreds of fundamentally different ways to build an optimizer. As of late 2021, the Bestiary listed more than 260 unique entries, with a backlog of tens of others—including elephant clans, gorilla troops, and Mexican axolotls—awaiting validation for inclusion. A recent comprehensive taxonomy of nature- and bio-inspired optimization approaches suggests as many as 360 unique metaphors in the metaheuristics literature (Molina et al., 2020). This massive number of distinct algorithms, each claiming to present a unique way to solve optimization problems (in most cases limited to continuous and unconstrained formulations), is at odds with the relatively simplistic structure of most of these techniques (once the unnecessary metaphor-heavy language is stripped), as well as with the existence of general algorithmic design patterns that generalize many of these techniques (de Armas et al., 2021; de Jong, 2006; Stegherr et al., 2020; Stegherr & Hähn, 2021).
This explosion of metaphor-centered methods has led to an intense fragmentation of the literature into tens, perhaps hundreds, of small, barely discernible niches. The use of metaphor-heavy language when proposing new methods is partly responsible for this, as it adds an unnecessary obstacle to comparing the algorithmic similarities of these methods at first glance. How should one compare the ability of a bird to drop a cuckoo egg from its nest to the behavior of a scouting bee? It takes a deeper reading to find out, for instance, that these two completely different descriptions refer to the same underlying computational action, namely, generating a new random solution when the search has stalled.
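Stripped of the metaphors, the shared operator fits in a few lines. The following Python sketch is our own minimal illustration: the function names, the stall counter, and the uniform resampling are illustrative choices, not code from either published method.

```python
import random

def random_solution(bounds):
    """Sample a uniformly random point inside the box constraints."""
    return [random.uniform(lo, hi) for lo, hi in bounds]

def restart_if_stalled(solution, stall_count, max_stall, bounds):
    """Replace a solution that has not improved for max_stall iterations.

    Cuckoo-based papers narrate this step as a host bird discovering and
    abandoning a foreign egg; bee-based papers, as a scout abandoning a
    depleted food source. Both reduce to this same restart operator.
    """
    if stall_count >= max_stall:
        return random_solution(bounds), 0  # fresh random solution, counter reset
    return solution, stall_count
```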
This pattern of reinventing the wheel is seen quite frequently in the metaphor-based optimization literature, as denounced by Sörensen (2013). For instance, careful analysis by Weyland (2010, 2015) showed that Harmony Search was nothing more than a special case of Evolution Strategies. Piotrowski et al. (2014) analyzed the novelty (or lack thereof) of the Black Hole algorithm, while Camacho-Villalón and colleagues (2018, 2020, 2022) did the same for the Intelligent Water Drops, Grey Wolf, Firefly, Bat, and Cuckoo algorithms. In all these cases, the conclusions were unequivocal—the “novel” algorithm did not in fact contain any novelty beyond the use of a metaphor-specific language, instead representing a simple instantiation of existing, well-known computational algorithms already in use. On the basis of our reading of the literature, we would expect to find the same pattern of repeated or reinvented ideas in many—if not most—metaphor-based methods, if subjected to similar scrutiny. Even in the few cases where new ideas may be found, they become tied to the specific nomenclature of the metaphor, instead of being described in a way that would allow analysis, comparison to other methods, and easier dissemination of the design principles to other works.
Another common issue is the generally poor methodological standard of the experimental results reported in many of these papers. Such problems are not exclusive to metaphor-centered methods; they have long pervaded an area without a strong statistical or methodological tradition, as documented since at least the mid-1990s (Barr et al., 1995; Campelo & Takahashi, 2019; Eiben & Jelasity, 2002; García-Martínez et al., 2017; Hooker, 1994, 1995). The field of metaheuristics has nonetheless been continuously improving its standards and developing better methodological practices (Bartz-Beielstein et al., 2020; Campelo & Wanner, 2020). Despite these advances, the experimental validation presented in the majority of metaphor-based papers continues to suffer from serious issues. These include long-identified problems (Campelo & Takahashi, 2019; Eiben & Jelasity, 2002; García-Martínez et al., 2017; Hooker, 1994, 1995), such as the following:
the almost exclusive focus on competitive testing, rather than on the underlying working principles of algorithms;
overfitting of methods and implementations to benchmark problems, rather than verifying whether estimated performance on an instance set generalizes to independent instances (see the sketch after this list);
the absence of well-defined underlying hypotheses being tested;
the exclusive use of very similar algorithms, that is, other metaphor-based approaches, as comparison baselines, instead of state-of-the-art methods for the specific problem class being investigated (this is sometimes aggravated in papers that test only against methods from the same very specific niche, such as comparing a method only against mammal-based algorithms—as if the source of the metaphor had any meaningful relationship with the algorithmic aspects of the method);
unbalanced tuning efforts between the proposed and competing algorithms.
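To illustrate the overfitting item above, the most basic safeguard is a holdout split: tune on one subset of problem instances and report performance only on instances unseen during tuning. The Python sketch below is generic, and every name in it (tune_fn, algorithm.solve) is a hypothetical placeholder rather than the API of any specific tool.

```python
import random

def tune_and_validate(algorithm, instances, tune_fn, seed=0):
    """Tune on half of the instances, report performance on the other half.

    Hypothetical interfaces: tune_fn(algorithm, subset) returns a tuned
    configuration; algorithm.solve(instance, config) returns a score.
    """
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    tuning_set, validation_set = shuffled[:half], shuffled[half:]

    config = tune_fn(algorithm, tuning_set)  # all tuning effort happens here
    # Report only on instances the tuning process never saw.
    return [algorithm.solve(inst, config) for inst in validation_set]
```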
Application-oriented venues are particularly vulnerable to being colonized by “novel” metaphor-based methods. This appears to happen for two main reasons. The first is lack of domain awareness: Researchers in application fields who look to metaheuristics for solutions to optimization problems get lost in the multitude of papers proposing methods with strange names, unclear connections to each other, and seemingly outstanding results. Often, the choice of which method to use is determined by which names appear most frequently or are cited most often. Chicco and Mazza (2020) discuss the difficulties application researchers face when evaluating metaheuristics in more detail. The second likely reason is exploitation: Metaphor-based method creators who may find it difficult to publish their research in more optimization-focused journals sometimes opt for submitting their “novel” methods to application-oriented venues, where reviewers are less likely to be familiar with the technical shortcomings and wider criticism of these methods, or sometimes even with basic concepts of optimization. In more exasperating cases, the algorithm is submitted to a journal in the area of its base metaphor. A recent example is a “COVID-19 optimization algorithm” published in a high-impact biomedical and health informatics journal, even though the method does not specifically address any issue related to these areas. The main arguments advanced to justify that particular paper, as presented in its abstract, can be briefly summarized:
COVID-19 is overloading hospitals and causing death.
COVID-19 must be contained, and social distancing must be ensured.
Therefore, we need an efficient optimizer capable of “solving NP-hard in addition to applied optimisation problems.”
This argument not only presents a clear non sequitur (“COVID-19 is a problem; therefore we need a new optimization algorithm.”) but also suggests a lack of understanding of basic aspects of optimization theory and practice. Despite that, the paper was published, which suggests that the reviewers themselves also lacked the particular skill set to detect these and other shortcomings of the work.
Another unfortunate result of this contamination is that optimization tracks of some application journals sometimes become dominated by cliques that keep publishing minute variations of bizarre methods with little oversight. Figure 2 illustrates part of this phenomenon, highlighting a prevalence of application-oriented journals among the venues where the first papers proposing metaphor-based methods have appeared.
4 Reasons for the Problem
The proliferation of metaphor-heavy algorithms in the metaheuristics literature is a multifaceted problem involving multiple actors with different motivations. Some factors, however, may be identified as potential contributors to this problem.
The first is a structure of perverse incentives that permeates the academic environment (Edwards & Roy, 2017). The pressure to “publish or perish,” coupled with a heavy focus on short-term results, to the detriment of a broader and more reflective scientific education in computer science and engineering degrees, tends to reward poor methodological standards and lead to a “natural selection of bad science” (Smaldino & McElreath, 2016, “Introduction”). In this context, publishing metaphor-based methods is perceived as a low-effort, low-risk process with high potential rewards, a perception that is fueled by “success stories” of authors who have built professional careers out of creating not one but often multiple metaphor-based methods. As an example, the 6 authors whose names appear most often in the Bestiary entries have each created between 6 and 10 different metaphor-based methods, and at least 40 authors have created two or more methods, as shown in Figure 3. These algorithms, despite having in some cases been shown to contain no novelty beyond the use of a new metaphor (Camacho-Villalón et al., 2018; Camacho-Villalón et al., 2020), have gathered tens of thousands of citations, a highly desirable prize in an academic culture obsessed with bibliometrics. Tzanetos and Dounias (2021) highlight this issue, focusing on clusters of metaphors proposed by the same research groups and showing the possibility that metaphors may be used to disguise the practice of “salami science” (Wawer, 2018), that is, slicing down a single scientific work into several smaller pieces to artificially inflate publication count.
The lack of a well-established statistical tradition in the field compounds the problem, leading to generally poor practices by authors and, in many cases, an inability of reviewers to pick up on the main methodological problems of some papers, resulting in a particular brand of “cargo cult science” (Feynman, 1974; Hanlon, 2013): work that emulates scientific practices—implementation of methods, running of tests, publication of papers, and so on—without representing an actual scientific process of defining, testing, and refining hypotheses to incrementally build generalizable knowledge about what works and what does not.
5 Potential Solutions
As suggested, the ongoing “age of metaphors” is a multifactorial, complex issue involving many different actors and incentives. Accordingly, a single, simple answer to this problem is unlikely to exist, and any definitive solutions will probably require a cultural shift over an entire field of knowledge. To that end, there have been multiple efforts to steer the area away from some of the worst practices documented in the preceding sections.
Potential solutions to the metaphor problem must begin by increasing awareness of the problems associated with developing algorithms focusing on the metaphor, rather than on the problem being solved. This article is an effort in this direction, but it is hardly the first. “Metaheuristics—the Metaphor Exposed” (Sörensen, 2013) is probably the highest-profile paper raising this issue, and it has become a focal point that inspired several later works discussing the proliferation of those methods. Fong et al. (2016) not only list common design patterns among metaheuristics but also show how improper experimentation is being used to claim spurious results in the metaphor-based literature. Works showing the lack of novelty in many of these methods (Camacho-Villalón et al., 2018, 2020, 2022; Piotrowski et al., 2014; Weyland, 2010, 2015) have also helped bring this issue to attention, raising the wider community’s awareness of these problems.
In parallel to criticizing the focus on metaphors, it is important to provide and disseminate more constructive alternatives to developing research on metaheuristics. A common approach in this direction is to recast search-based metaheuristic optimization as a framework of sequentially linked modules that modify one (or a few) core algorithmic structures. The concept of unified approaches and models for nature-inspired optimization algorithms precedes the heavy proliferation of metaphor-based methods, and it has been discussed in the literature at least since the mid-2000s (de Jong, 2006), with later authors suggesting a research agenda to approach the issues with metaphor-heavy methods (Swan et al., 2015). Other initiatives in that direction include Lones's (2020) description of a large number of metaphor-based optimizers using common, nonmetaphoric language, highlighting the similarities and differences among the algorithms, and de Armas et al.'s (2021) initial work on defining similarity metrics for metaheuristics, which can greatly simplify the analysis of methods and the investigation of which algorithms can be seen as particular cases of others.
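To make the modular view concrete, consider a single population-based loop whose steps are swappable components. The Python sketch below is our own minimal illustration of the idea, not the framework of any of the works cited here; all names are illustrative.

```python
from typing import Callable, List

Solution = List[float]
Population = List[Solution]

def metaheuristic(init: Callable[[], Population],
                  vary: Callable[[Population], Population],
                  select: Callable[[Population, Population], Population],
                  stop: Callable[[int], bool]) -> Population:
    """Generic population loop: initialize, then alternate variation
    and selection until the stopping criterion fires."""
    population = init()
    iteration = 0
    while not stop(iteration):
        offspring = vary(population)                 # e.g., mutation, recombination
        population = select(population, offspring)   # e.g., truncation, tournament
        iteration += 1
    return population
```

Under this view, many “novel” metaphor-based methods amount to particular choices of vary and select wrapped in a new vocabulary; renaming the variation step after wolves or fireflies changes the narrative, not the computation.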
Several authors have recently proposed taxonomies of search-based optimization methods, where several algorithms are explained by a unifying framework and its associated components (Molina et al., 2020; Stegherr et al., 2020; Stegherr & Hähn, 2021; Stork et al., 2020). Some of these works go so far as to describe specific code for the framework and its components and to use this code to reimplement some of the existing metaphor methods (Cruz-Duarte et al., 2020; de Armas et al., 2021). Once there is a framework to describe a generic metaheuristic and components to provide variation in the algorithm, a natural next step is to use automated processes to generate algorithmic variations better tailored to specific problem classes (Bezerra et al., 2015, 2020; Campelo et al., 2020).
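Continuing the sketch above, that automated step can be pictured as a search over component combinations, evaluated on training instances. Again, every name here is illustrative, and real configurators search far more efficiently than this exhaustive loop.

```python
from itertools import product

def auto_design(inits, varies, selects, stop, score):
    """Try every combination of components and keep the best one.

    score(population) is a hypothetical evaluation of the final
    population on a training set of problem instances (lower is
    better). Reuses metaheuristic() from the previous sketch.
    """
    best_combo, best_score = None, float("inf")
    for init, vary, select in product(inits, varies, selects):
        s = score(metaheuristic(init, vary, select, stop))
        if s < best_score:
            best_combo, best_score = (init, vary, select), s
    return best_combo
```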
A more aggressive approach to changing the current structure of incentives is the implementation of stricter editorial policies. This has recently become more common, with journals like the Journal of Heuristics, Evolutionary Computation, 4OR, ACM Transactions on Evolutionary Learning and Optimization, and Swarm Intelligence (Dorigo, 2016) adding specific statements to their publication policy documents, warning authors against submitting methods that fail to describe their contributions in metaphor-free, standard computational and/or mathematical terms. To help bring the issue to the attention of journal editorial boards, Aranha et al. (2022) have recently published and started to circulate an open letter to editors-in-chief of several venues, recommending that explicit editorial policies be put in place to prevent or mitigate the “colonization” problem described earlier. We hope that an editorial barrier to the publication of works that fail to reach some minimal methodological standards, coupled with the increase in awareness not only of these issues but also of alternative, more methodologically sound approaches to research in metaheuristics, may help gradually improve the quality of works developed in the field.
6 Final Remarks
In the last 20 years, the field of metaheuristics has seen a flood of metaphor-inspired methods that are neither novel (despite claims from the authors) nor based on metaphors that are particularly connected to optimization. Cataloging these methods through the Evolutionary Computation Bestiary, we have observed how this phenomenon has had a negative impact on the field, wasting the work of scientists and reviewers on methods that continuously reinvent the wheel, hiding sloppy or dubious practices, and confusing application researchers through the sheer quantity of similar-sounding optimization methods.
More concerted push-back from the metaheuristics (and wider optimization) research community has started to emerge in recent years. Several papers discussing the issues with metaphor-heavy optimization have recently appeared, and journals are beginning to enact policy changes to reject papers that provide no novelty other than a new metaphor. However, our experience tells us that change is likely to be slow. For instance, although the critical tone of the Bestiary is clearly stated in the repository, we are often contacted by authors of “novel” metaphor-based metaheuristics requesting that their work be listed. It has never been quite clear to us whether these authors fail to understand the critical tone of the page or if they assume that any exposition, however critical, would be a net positive for their work. Even when metaheuristics journals (it is hoped) cease to be breeding grounds for metaphor-based methods, this change will take time to spread to application venues, where groups that have specialized in the regular publication of new metaphors have managed to acquire a foothold.
It is important to highlight that, although the problems described in this work represent a challenge to the metaheuristics and related communities, they are by no means exclusive to those. In fact, a culture of “perverse incentives” in publication is common across many, perhaps most, academic disciplines (Edwards & Roy, 2017), which has resulted in damaging trends, such as the rise of predatory publishing (Bartholomew, 2014) and the reproducibility crisis (Baker, 2016). By employing the rise of metaphors in the metaheuristics literature as a case study in poor scientific practice, we hope the insights that we present herein can be useful to researchers working in fields that may be experiencing similar problems.
To conclude on a positive note, it is worth indicating that the increasing efforts by the community to address this problem may have helped steer the metaheuristics field toward more scientific practices. Recent works criticizing the metaphor phenomenon have focused on how to improve the experimental soundness, reproducibility, and standardization of new approaches, which, it is hoped, indicates that the full transition from the “age of metaphors” into what Sörensen et al. (2018) called the “scientific phase of metaheuristic research” may already be under way.
Acknowledgments
We thank the community that emerged around the Evolutionary Computation Bestiary for their suggestions, encouragement, and many interesting discussions over the years.
Notes
1. Direct citations of the papers describing the metaphor-based methods mentioned in this work are intentionally not provided. The original references are listed in Campelo and Aranha (2021) and can be easily found by searching for the name of the specific metaphor.