Abstract
In Perception Engines and Synthetic Abstractions, two generative AI art projects begun in 2018, Tom White experiments with visual abstraction to explore the indeterminacy of perception, interpretation, and agency. White’s AI systems produce images that human viewers will interpret as abstract artworks, but which also confront those viewers with the realization that what is here deliberately rendered indeterminate for them remains near-perfectly legible for AI-powered image recognition systems. This difference in perceptual and interpretive agency foregrounds an underlying politics of visual indeterminacy. White’s projects thus increase awareness of how machine vision—for example in automated online filtering systems—can narrow the horizon of what human audiences can see in an AI-driven digital cultural landscape, and of how, in the process, underlying biases are normalized and human viewers become habituated to the dramatic shrinking of perceivable/viewable online image content mediated by AI.
The artist and researcher Tom White works with AI and generative drawing systems to address issues surrounding the indeterminacy of perception, interpretation, and agency across human and algorithmic domains. Specifically, the two projects under discussion here invite human audiences to consider critical implications of the emergent powers of AI-driven image generation and image recognition systems. The first of these projects, Perception Engines (2018, ongoing), relies on training datasets widely used in AI-driven image recognition tools, and uses a drawing system designed by White to generate abstract images of discrete objects and organisms. These images are presented in such a way that they appear as visually indeterminate for human viewers, while registering as representational for mainstream AI-driven image recognition tools. The second project, Synthetic Abstractions (2018), expands on this technique and generates images based on more abstract concepts—such as NSFW (“not safe for work”)—that are used as classifiers in AI-driven online content filtering systems. Here, too, the ambition is to generate images that appear as abstract for human audiences but as representational for machine vision systems. Both projects confront human viewers with visual indeterminacy and provoke the question of how the human ability to interpret image content is impacted when what we see becomes subject to the increasing power of AI tools to filter, censor, or otherwise disrupt the circulation of visual information in the digital cultural realm (images from both series can be viewed online [1,2]).
White’s AI art projects could be regarded as exhilarating and uncanny indications of the coming of “creative AI.” But Perception Engines and Synthetic Abstractions go beyond a mere focus on the aesthetics of visual indeterminacy and additionally work to situate AI-driven image recognition tools as emergent algorithmic control systems. In doing so, they highlight what I have elsewhere discussed as the emergence, alongside new forms of expressive nonhuman agency, of new types of AI-enacted “human non-agency” [3]. White’s experimentation with AI-generated abstract art thus gestures toward a politics of indeterminacy in the domain of emerging machine vision and image recognition technologies.
In both Perception Engines and Synthetic Abstractions, White extends art historical traditions that draw on the concept of visual indeterminacy—from pointillism and expressionism to cubism and minimalism—in order to subvert anthropocentric assumptions about the legibility and interpretability of image content. The image series feature abstractions of discrete objects (e.g. a blow-dryer; see Fig. 1), organisms (e.g. a tick; see Fig. 2), or more complex motifs (such as images depicting sexual activities; e.g. Color Plate C). Human audiences would conventionally expect to be able to recognize such subjects quite easily, if the images feature at least a modicum of referential content resonating with culturally and art-historically imprinted perceptual and interpretive habits. But White’s projects highlight two important conceptual twists: first, that the complex aesthetic problem of creating, recognizing, and appreciating abstract art is here shared across the human-AI divide; and second, that the shapes and images that are here rendered abstract for human viewers remain near-perfectly legible for computational image recognition tools.
Tom White, Blow Dryer, two-color risograph, from the Perception Engines series, 2018. (© Tom White)
The resulting artworks are situated in a zone of visual indeterminacy and raise intriguing questions: Who (or what) can create visual abstraction? Who (or what) can interpret it as such? What sort of agency is required for and delineated by this ability? How are definitions of abstraction and indeterminacy in visual art impacted when nonhuman agency becomes involved in producing these effects? And finally, what critical stakes are raised when AI systems begin to override human agents’ interpretive faculties and develop a growing capacity for determining and controlling image content on their behalf?
Aesthetics of Visual Indeterminacy
Abstraction is a highly complex aesthetic problem. It tends to imply some common ground between artist and audience, which can take the form of a shared semantic register or of a shared notion of the relativity of perspective, interpretive frameworks, and signification. Frequently, visual abstraction signals a problematization, negation, or recalibration of representation and mimesis, yielding what Dario Gamboni has called “potential images” [4] whose content is rendered indeterminate. This resonates with Lonce Wyse’s more technical formulation wherein an abstract image is one that “is comprised of some features that form the basis for an identification as a real-world object but includes others that are generally not associated with the real-world object and lacks many that are” [5]. Kirk Varnedoe adds to these definitions the observation that generally it will not do to interpret abstract images simply as “pictures of nothing” [6]. Visual abstraction may well negate representation, or may at least appear to do so, but its purpose is often also to critique the various planes (aesthetic, sociocultural, political, etc.) on and between which creative expressions are produced and interpreted. Even in non- or anti-representational artworks, determinable (if ambiguous) connections between perception and interpretation therefore persist, leading to what Marjorie Perloff has called a “poetics of indeterminacy” [7].
Robert Pepperell coined the concept of visual indeterminacy [8–10] in reference to “a perceptual phenomenon occurring when a viewer is presented with a seemingly meaningful visual stimulus that denies easy or immediate identification” [11]. Lily Díaz describes the phenomenon as an established artistic technique; she invokes, as a particularly pertinent example, Cage’s Musicircus (first performed in 1967), which orchestrates indeterminate sonic experiences through complex configurations of a performance space [12]. Based on these conceptualizations, we can say that in the context of audiovisual arts, the phenomenon is experienced when a viewer can almost—but not quite—decode a perceived image, when the viewer is not sure of the basis on which a particular image interpretation is suggesting itself, or when the interpretation of an image strikes the viewer as inexplicably at odds with the image content. Put differently, visual indeterminacy can interfere with experientially, culturally, and socially determined expectations of how perception and interpretation function. Writing about AI art, Aaron Hertzmann accordingly describes visual indeterminacy as an effect in which image content defies precise interpretation and instead invites continuous investigation [13]. Overall, visual indeterminacy thus concerns images without clearly discernible content that provoke, for this very reason, divergent perceptual and interpretative responses. As Perloff notes, art practices that retain some “referential features” but that otherwise draw on visual indeterminacy to convey more or less precise meanings therefore carry ambiguities that are stimulating and provocative precisely because they are “impossible to resolve” [14].
For human audiences, to consider the (in)determinacy and interpretability of images in Perception Engines and Synthetic Abstractions means not merely to contemplate abstract art, but also to reflect on the operational logic of AI-driven image generation and image recognition. For instance, when we consider our response to the black-brown streaks, blobs, and lines of Tick (see Fig. 2; discussed further in the following section), we are provoked into recognizing that the aesthetics of indeterminacy enacted by White’s AI systems are carefully calibrated to challenge the habitual foundations of how the human perceptual apparatus interprets representational image content. An underlying politics of indeterminacy is thereby revealed: In a work such as Lime Dream (see Color Plate C; discussed in the penultimate section), the emphasis on the presumptive computability of human interpretive faculties points to the increasing agency of algorithmic systems to delimit and control perception, interpretation, and signification.
Tom White, Tick, two-color risograph, from the Perception Engines series, 2018. (© Tom White)
Perception Engines—The “Tickness” of Indeterminate Shapes and Colors
White began Perception Engines with the ambition to develop a type of computational creativity in which perception is the exclusive “engine” of expression [15]. The project title nods to links between the realm of human creativity, which habitually draws on existing material, and that of machine learning (ML), in which the rendering of new outputs often relies on the interpretation of large amounts of training data. ML tends to involve massive-scale iteration, for example when a generator system fine-tunes its image outputs based on feedback from a discriminator system, often with the aim of converging on the production of “successful” outputs interpreted by human audiences as original contributions to a given image class [16]. White’s approach builds on such work but introduces an important complication by calibrating the parameters that define “legibility” for nonhuman viewers.
Tick exemplifies this very well. Like all works in the Perception Engines and Synthetic Abstractions series, it was created by a neural network and an algorithmic drawing system of White’s design. White trained the neural network using existing datasets of photographic representations of ticks and primed the system for rendering outputs that elicit the strongest possible “tick” classifier response from a range of image recognition systems. The drawing system, in turn, features constraints that limit the neural network to the production of abstract images. This is achieved, for example, through system-inherent preferences for curved lines, nontextured color fields, and half-tone patterns, and by limiting the number of discrete elements that an image output is allowed to contain. White describes these constraints as the imposition of a simulated “aesthetic sensibility” that ensures stylistic consistency across the series [17]. Each image therefore results from a combination of the drawing system’s abstraction constraints and the neural network’s ambition to maximize “tick” classifier triggers. Importantly, this means that image legibility for machine vision systems is prioritized over human legibility. The image thus occupies what Perloff calls “a middle space between the mimetic and the non-objective” [18] and produces tensions between determinacy and indeterminacy. From the human perspective, there is no doubt that Tick is visually indeterminate: It never fully resolves into a concrete image of a tick, even though, thanks to its title, it may nevertheless be seen as evoking an essence of “tickness.” This interpretation takes on a striking quality when human viewers learn that ML-based image recognition systems, too, can recognize the “tickness” of this visually indeterminate image content.
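For readers curious about how such a system might be wired together, the short Python sketch below illustrates the kind of loop described above: a constrained renderer turns a small set of drawing parameters into an image, an ensemble of pretrained classifiers scores that image for the target class, and an optimizer adjusts the parameters to maximize the classifier response. This is a minimal illustration under stated assumptions, not White’s implementation: the renderer is a crude placeholder standing in for his drawing system, and the two torchvision models stand in for whichever recognition systems were actually used.

```python
# Minimal sketch of a "perception engine" loop -- an illustration, not White's code.
# Assumptions: the renderer below is a crude placeholder for a constrained drawing
# system (curves, flat color fields, limited element count); two pretrained
# ImageNet classifiers stand in for the ensemble of recognition systems.
# (Real use would also apply ImageNet normalization before classification.)
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, vgg16

TARGET_CLASS = 78  # "tick" in the standard ImageNet-1k labeling (illustrative)

classifiers = [
    resnet50(weights="IMAGENET1K_V1").eval(),
    vgg16(weights="IMAGENET1K_V1").eval(),
]

def render_drawing(params: torch.Tensor) -> torch.Tensor:
    """Placeholder renderer: maps 64 parameters to a smooth, low-frequency
    224x224 image, standing in for the abstraction constraints of the
    actual drawing system."""
    small = torch.sigmoid(params.view(1, 1, 8, 8)).repeat(1, 3, 1, 1)
    return F.interpolate(small, size=(224, 224), mode="bilinear", align_corners=False)

def ensemble_score(image: torch.Tensor) -> torch.Tensor:
    """Average probability that the classifiers assign to the target class."""
    probs = [torch.softmax(m(image), dim=1)[0, TARGET_CLASS] for m in classifiers]
    return torch.stack(probs).mean()

params = torch.randn(64, requires_grad=True)
optimizer = torch.optim.Adam([params], lr=0.05)

for step in range(500):
    image = render_drawing(params)
    loss = -ensemble_score(image)   # maximize the "tick" classifier response
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Even in this toy version the essential design choice is visible: the optimizer never manipulates free pixels, only the parameters of a constrained drawing, so whatever it finds remains abstract by construction while the classifiers are driven toward maximal confidence.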
Adversarial Images Between Abstraction and Generalization
White’s technique bears some resemblance to the creation of so-called adversarial images. The term refers to minutely distorted images that can serve to subvert the functioning of image recognition systems, and which have been used, for example, by artist-activists to undermine facial recognition tools and online content monitoring systems [19–21]. The key difference is that in White’s projects, what is being targeted is not machine vision, but instead the perceptual and interpretive abilities of human viewers. When White tested his outputs by showing them to image recognition tools trained on datasets to which his drawing system had no access, he realized that his approach works extremely well; to many commercial, state-of-the-art image recognition tools, the visually indeterminate artwork Tick looks more tick-like than any photograph of this little parasite [22].
This astonishing result raises interesting questions linking the aesthetic concept of abstraction to the computer science concept of generalization. In machine vision research, generalization concerns the problem of transferring the highly specific functions of ML systems to contexts for which they were not trained. For example, an image recognition algorithm trained to identify ticks in certain photographs can be considered to “generalize well” if it can also identify the animals in images that weren’t part of its training dataset. But how does abstraction relate to generalization? Is visual abstraction, despite its reliance on indeterminacy, a kind of generalization, since it can elicit highly specific responses from its audiences, including the recognition of objects, subjects, and affective experiences?
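In code terms, the hold-out test described above is easy to state: score the optimized drawing with a classifier that played no part in the optimization loop. The sketch below continues the hypothetical setup from the earlier example (reusing render_drawing, params, and TARGET_CLASS) and uses a held-out Inception v3 model purely as an illustration.

```python
# Hold-out check (illustrative): does the optimized drawing still register as
# "tick" for a classifier that was not part of the optimization ensemble?
import torch
import torch.nn.functional as F
from torchvision.models import inception_v3

held_out = inception_v3(weights="IMAGENET1K_V1").eval()

with torch.no_grad():
    image = render_drawing(params)  # final drawing from the earlier sketch
    resized = F.interpolate(image, size=(299, 299), mode="bilinear", align_corners=False)
    prob = torch.softmax(held_out(resized), dim=1)[0, TARGET_CLASS]

print(f"Held-out 'tick' probability: {prob.item():.3f}")
# A high score here is the ML analogue of "generalizing well": the abstraction
# transfers to a system it was never tuned against.
```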
As noted, in visual art contexts, it is easy for human audiences to perceive images as indeterminate yet recognizable. (For instance, many human observers will be able to identify the subject of Aleksandra Ekster’s 1916 painting Cityscape (Composition), even when its title isn’t available as a clue.) The pseudo-paradox of indeterminate-yet-recognizable images is conceptually trivial for human audiences, even though its implementation may be challenging for artists and difficult to replicate in AI-generated images. Similarly, the concept of generalization is trivial in the context of human perception, but it, too, remains a striking achievement when implemented well in AI contexts. This makes White’s entanglement of abstraction and generalization in Perception Engines brilliant and thought-provoking, since the project shows how accepted markers of visual indeterminacy can be generalized and how abstraction can be codified so that it becomes machine-readable. Does strong generalization, then, indicate “successful” abstraction, even though aesthetically speaking, abstraction and generalization might also be understood as incompatible? A work such as Tick problematizes humanist traditions of aesthetic abstraction and therefore invites reflection on the operational logic of image recognition technologies. As I argue in the following section, this problematization also extends to the parameters, criteria, and biases encoded in such technologies, and to the ways in which AI-based image generation as well as machine vision can instrumentalize the indeterminacy of human perception itself.
Synthetic Abstractions—Visual Indeterminacy, “Not Safe for Work”
As discussed, underlying White’s Perception Engines is a hack that ingeniously reverse-engineers contemporary machine vision technologies: a generative system capable of abstracting image content such that it registers as visually indeterminate for humans while generalizing as legible for image recognition tools. This draws attention to the politics inherent in many applications of such technologies. While Perception Engines asks human viewers simply to consider how visual abstraction is interpreted from a human and/or computational perspective, Synthetic Abstractions (2018) extends this concern by targeting mainstream machine vision systems that “are making decisions for us but we don’t exactly know why, or what the basis of these decisions is” [23]. In this way, White’s work interrogates the filtering and censoring functions encoded in and enabled by AI-based image recognition tools.
Lime Dream, a print from the Synthetic Abstractions series, exemplifies this very well (see Color Plate C). Like Tick, the work presents indeterminate image content that generalizes as perfectly legible across several neural network architectures. But whereas Tick shows a “concrete” subject that has been rendered abstract, Lime Dream triggers much more abstract classifiers, such as “NSFW” (“not safe for work”), “Explicit Nudity,” and “Racy” [24]. In mainstream online filtering systems developed and used by Google, Amazon, or Yahoo, such classifiers are used as the basis for removing online content from circulation. NSFW will strike human audiences as a notion that is itself subjective and highly indeterminate, but as White’s project shows, it can be encoded with great precision in abstract images, which will then be flagged as “inappropriate” content by mainstream image recognition tools. White recounts a striking anecdote in which, because of such automated flagging, he was unable to upload photographs taken at an exhibition of Synthetic Abstractions to the online photo-sharing platform Tumblr, likely because the images had been automatically interpreted as NSFW and found to be in violation of Tumblr’s terms of use [25]. This brings into sharp focus the growing power of image recognition tools and the disturbing implications of their autonomously derived and algorithmically enforced filtering decisions. (For discussion of another AI art project focusing on generative NSFW content, Jake Elwes’s 2016 video installation Machine Learning Porn, see [26].)
How might a human viewer’s perception and affective experience of Lime Dream be impacted when they learn how machine vision interprets the image? Does the abstract image effectively become NSFW because of the high certainty with which Google Safe Search identifies it as depicting “three-way sex”? White provides collectors of Synthetic Abstractions prints with detailed statistical information about how image recognition tools classify the images in question [27]. This making-visible of the classifiers adds a curious layer to White’s problematization of image perception, interpretation, and signification: It makes us realize that the images’ content becomes legible for human agents as NSFW only once this interpretation is spelled out in quasi-statistical terms. In other words: In order for human agents to be able to share the machine’s interpretation, they must first learn to see like a machine, “training” themselves on paratextual information concerning the classifiers triggered by the image.
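The kind of paratextual classifier report White supplies with the prints can be generated with off-the-shelf moderation APIs. The sketch below, offered purely as an illustration and not as a reconstruction of White’s workflow, queries Amazon Rekognition’s content-moderation endpoint (one of the commercial systems mentioned above) and prints the labels and confidence scores it returns; the filename is hypothetical, and AWS credentials are assumed to be configured locally.

```python
# Illustrative sketch: produce a per-label "NSFW report" for an image file using
# Amazon Rekognition's moderation endpoint. Not White's workflow; the filename
# is hypothetical and AWS credentials must already be configured.
import boto3

def moderation_report(path: str, min_confidence: float = 50.0) -> list[tuple[str, float]]:
    """Return (label, confidence) pairs such as ('Explicit Nudity', 97.3)."""
    client = boto3.client("rekognition")
    with open(path, "rb") as f:
        response = client.detect_moderation_labels(
            Image={"Bytes": f.read()},
            MinConfidence=min_confidence,
        )
    return [(label["Name"], label["Confidence"]) for label in response["ModerationLabels"]]

for name, confidence in moderation_report("lime_dream.png"):   # hypothetical filename
    print(f"{name}: {confidence:.1f}%")
```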
Imaging the Politics of Visual Indeterminacy
How is a human viewer’s agency—to see, to interpret—impacted when image recognition systems, designed explicitly to “protect” viewers by preventing the online circulation of certain content, begin to control the sphere of what can or cannot be perceived? Machine vision is considered to work well when it can generalize effectively based on inevitable simplifications of image content. But the humans who are meant to be protected from harmful content by AI filtering tools possess perceptual and interpretive faculties that are extremely subjective and highly individualized. Two human viewers may never agree about what makes any particular image “inappropriate”—which can be taken to suggest that interpretive generalization is, in effect, not possible. Does this make the algorithmic interpretation of Lime Dream as NSFW a faulty reading, and does it, by extension, invalidate other, similar interpretations?
In the divergence between the algorithmic classification of Lime Dream and human interpretations of the image, the indeterminacy of perception and the biases encoded in all interpretative events are rendered visible. In this sense, Synthetic Abstractions is a powerful reminder that ML-driven content filtering inevitably functions on the basis of biases encoded in and through underlying training datasets, learning protocols, and human operators. It reminds us that the determinations of image recognition tools will inevitably be at odds with the dynamism and subjectivity enacted through individual creative acts of interpretation and signification.
White’s work, in other words, demonstrates that image recognition tools cannot compute universally “true” interpretations. What they are capable of is merely to generalize biased interpretations that may render specific images inappropriate for certain viewers. In the process, the underlying biases are normalized, problematic decisions are amplified, and human viewers become habituated to the dramatically shrinking horizons of perceivable/viewable online image content mediated by AI. Consequently, emergent machine agency and its error-prone decision-making foreclose the human agency to make independent determinations regarding the appropriateness of image content.
One might object that generating abstract images is a poor modality for a concrete, impactful critique of AI bias. I would argue, on the contrary, that Synthetic Abstractions is even more powerful precisely because the images created by its neural network stubbornly refuse to resolve into human-legible, explicit representations of sex, violence, or anything else that could be considered NSFW. If human viewers can agree that Lime Dream should not be subject to censorship through ML-based filtering tools because it features no human-discernible offensive content, then they must also agree that any such filtering curtails both the expressive agency of those who create images later labeled inappropriate by AI, and the interpretive agency of those who are consequently prevented from seeing and interpreting the image in the first place.
When AI systems automatically interpret and censor images on behalf of human audiences to avoid the circulation of presumptively inappropriate content, power dynamics across human and computational domains are subject to troublesome shifts. Implemented in mainstream applications, machine vision technologies now have not only the power to perceive and interpret images with great nuance but also the autonomy to make determinations about whether to remove those images from our individual and shared fields of vision. This can happen without us fully understanding why (classifiers, their definitions, and training databases are often blackboxed); without us being able to decide whether we agree with the interpretation; and even without our awareness of the filtering process itself (many online filtering tools act before image content ever becomes visible online).
In Perception Engines and Synthetic Abstractions, White deploys the growing power of AI-based image generation and image recognition to set up a thought-provoking trap: By generating images that register as indeterminate for human viewers and simultaneously as perfectly legible for algorithmic image recognition tools, AI here reveals that its operational logic can be simultaneously successful (in the creation of abstract art for human audiences) and fundamentally flawed (in the classification and censoring of such artworks as NSFW). In the two projects I have discussed here, experimentation with the rich aesthetic traditions of visual abstraction thus frames a critical perspective on the politics of how indeterminacy seeps into questions of expression, perception, and signification in relation to AI.
Acknowledgments
I’m grateful to Ashley Scarlett (Alberta University of the Arts, Calgary) for inviting me to present elements of this work at “Contingent Systems: Art and/as Algorithmic Critique” (Illingworth Kerr Gallery, 2021), and to Tom White for his generosity in discussing his artistic practice with me. I also acknowledge financial support from Abertay University’s R-LINCS2 scheme toward making this article Open Access.
References and Notes
Color Plate C: The Politics of Visual Indeterminacy in Abstract AI Art
Tom White, Lime Dream, four-color ink screenprint, from the Synthetic Abstractions series, 2018. (© Tom White) (See the article in this issue by Martin Zeilinger.)