Good news, everyone! Science has finally, finally delivered us something it’s been long promising: the much-anticipated portal to hell, teeming with summonable demons.
It’s about goddamned time!
This is literally the reason for Marshall McLuhan’s beef with Northrop Frye. More on that later. First, here’s the deal.
These A.I. art generator thingies are trained on millions upon millions of images. Also, they can detect features of those images, including objects within them describable by human language. Roughly speaking, they can “read” what’s “in” a picture, and then reverse the process to generate images based on whatever prompts we “write.”
We can take the whole sum of millions and millions of images—photographs and artworks and whatever else—that these things have synthesized into coherency as some kind of unprecedented summation or unity of culture. Clearly there will be over-representation of lots of things—you know, the things we want pictures of, compared to others.
Well, according to Twitter user @mattskala (who has a PhD in Computer Science), this total synthesis of images, held together by some sorta natural-language semantic unity of words, can be thought of as existing within a high-dimensional Euclidean space. In other words, this total, unprecedented summation of culture has a geometric form: a shape, if you will. It’s like a body of culture within images, held together by language. Sort of like a quasi-organic virtual rhizome of archetypes.
Then he points out something very basic about geometry: if you want to calculate the furthest point from any given origin in, or on, a polyhedron, there are many, many fewer answers than there are possible origins.
Take this oval. For ANY possible point inside it, what spot on the oval is furthest away? Well, those spots will be clustered tightly in only two zones, at the tips.
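To make the oval claim concrete, here is a minimal Python sketch. It is a toy illustration and not anything taken from @mattskala's thread: the oval's proportions (3 by 1), the 200 random origins, and every function name are assumptions picked for the demo. It drops random points inside an oval, asks which sampled boundary point is farthest from each one, and reports how close to the tips those answers stay.

```python
import math
import random

A, B = 3.0, 1.0  # semi-axes of an illustrative oval (assumed values)

def boundary(n=1440):
    """Sample n points around the ellipse x^2/A^2 + y^2/B^2 = 1."""
    return [(A * math.cos(2 * math.pi * k / n), B * math.sin(2 * math.pi * k / n))
            for k in range(n)]

def random_interior_point(rng):
    """Rejection-sample a point uniformly from inside the ellipse."""
    while True:
        x, y = rng.uniform(-A, A), rng.uniform(-B, B)
        if (x / A) ** 2 + (y / B) ** 2 < 1:
            return x, y

def farthest_boundary_point(origin, edge):
    """Return the sampled boundary point farthest from origin."""
    return max(edge, key=lambda p: math.dist(origin, p))

rng = random.Random(0)  # fixed seed so the run is repeatable
edge = boundary()
origins = [random_interior_point(rng) for _ in range(200)]
farthest = [farthest_boundary_point(o, edge) for o in origins]

# How far from the two tips (x = +A and x = -A) do the answers stray?
smallest_abs_x = min(abs(x) for x, _ in farthest)
largest_abs_y = max(abs(y) for _, y in farthest)
print(f"every farthest point has |x| >= {smallest_abs_x:.3f} (tips are at |x| = {A})")
print(f"every farthest point has |y| <= {largest_abs_y:.3f} (oval reaches |y| = {B})")
```

Running it should show every answer sitting close to one of the two tips, however scattered the origins are.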
And THAT means that when you give one of these art-generator-thingies a negatively-defined prompt, the best it can answer is “furthest possible point from” within Euclidean space. It’s not “opposite”, it’s “furthest from.”
In other words, in Euclidean space (when we’re not talking about spheres), the negative operator is not one-to-one. For each unique input there is not a unique opposite output. It’s many-to-few. There are far fewer zones, or furthest reaches, within the geometry of the cultural body to reach to as the “furthest” from ANY given prompt. Fewer extremities within which to grope for a “not-that-thing,” whatever that thing is.
So when we ask the A.I. to give us the negative prompt (the not-something), we’re pushing it to converge within the furthest, extreme reaches of the shape of that quasi-organic virtual rhizome. Which is, remember, an archetypal snapshot of virtually-all human image-culture. And what do we find when we probe around those extremes?
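The many-to-few idea can be sketched in more than two dimensions as well. The Python below is a toy under loud assumptions: it is not how any actual image generator stores or negates prompts, the 8 dimensions are tiny compared with a real model, and the cloud is deliberately stretched along one axis (the `STRETCH` constant, a made-up value) so that it has "tips" like the oval. It draws a cloud of points, draws an equal number of query points, and counts how many different cloud points ever turn up as "the farthest one."

```python
import random

DIM = 8          # toy dimensionality (an assumption; real models use far more)
STRETCH = 6.0    # the cloud is elongated along axis 0, like the oval
N_POINTS = 300
N_QUERIES = 300

def sample(rng):
    """Draw one Gaussian point, stretched along the first axis."""
    v = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    v[0] *= STRETCH
    return v

def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def farthest_index(query, cloud):
    """Index of the cloud point farthest from the query."""
    return max(range(len(cloud)), key=lambda i: sq_dist(query, cloud[i]))

rng = random.Random(0)  # fixed seed so the run is repeatable
cloud = [sample(rng) for _ in range(N_POINTS)]
queries = [sample(rng) for _ in range(N_QUERIES)]

answers = [farthest_index(q, cloud) for q in queries]
distinct = set(answers)
print(f"{N_QUERIES} queries landed on {len(distinct)} distinct farthest points")
```

The count printed at the end should come out much smaller than the number of queries: lots of different "not-this" questions, only a handful of places to land.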
Twitter user @Supercomposite just found the same woman over and over again, surrounded by gruesome and macabre depictions of children, often dismembered. All of which are being summoned out of some presumably-tight cluster, or zone, within the geometric shape of the language describing the sum of all the images roughly-comprising culture.
These zones aren’t being created by some horror-fiction artist. This isn’t the depraved imagination of some edgy shocklord. This is some impersonal mathematical triangulation of the extremes of the collective imagination as represented by however-the-hell this A.I. found semantic and imagistic coherency across millions of images as describable in words. And it’s this:
And this is literally the furthest from anything you could ask for.
Basically, we’ve finally materially instantiated the Frye-model of archetypal culture. You can tell, because near-death-looking women with mutilated children are, I think we can agree, archetypally pretty-damn-close to the worst thing you could ever want. Like, nobody wants this. And even the A.I. model is telling us this.
Maybe, as an individual, you can imagine worse images. But this isn’t the individual mind, this is the collective mind. We’re seeing the limits of archetypal reductions in these images. You see, language isn’t rhizomic, like a ginger root sprouting in twisted pathways. And archetypes aren’t the base unit of our human experience of reality. Individual words and individual bodies are.
Humans do not work with archetypes. Stories do. Humans, in the ideal “humanities” sense, are bodies which speak and write words. And words create and renew the situations that we exist within as embodied beings. When we selectively recast our remembered experiences into stories, we are performing a great—and necessary—act of reduction. Of artistic creation. And when we tell stories, and capture life within media, we are also reducing embodied existence into much-more-rudimentary forms.
If I ask you for the thing that’s not a cool cat, maybe you’d say a hot dog. I mean, it makes sense logically. But that’s just one interpretation—using poetry and rhetoric, you could justify a million other possibly valid negatives of the phrase. If I asked you for the thing that’s not a male statue of liberty made of cotton candy on the moon, maybe you’d say… well you’d come up with something. You could come up with anything, really. Because you’re working within language. Each word, strung next to each other word, evokes a great deal of meaning. Now, if every negative query I gave you always came up with dead babies, well… I’d have some questions.
That lack of creativity is just as true of the reduction of human experience and psychology to archetypes and narratives.
In a review of Northrop Frye’s book Anatomy of Criticism, Marshall McLuhan wryly compared Frye’s archetypal view of literature, which he calls collective, to the “mental postures” of rhetoric amenable to individual thought. Just as the A.I. has abstracted, from our collective cultural images, recurring norms expressed in recurring terms, the archetypal analysis develops and perpetuates patterns of meaning at a level more brittle and normative than what any individual human can perceive and communicate. Archetypes are norms, a language of relatively-few symbols which must be taken or left as is, rearranged and remixed and redressed within a finite number of formulas. They’re collectivizing, converging into norms. The only way out of those norms is to be able to read enough words into them to know how to talk about them, and then take them apart and talk our way out of them!
This negative-prompt doorway to limitless A.I. pictures of archetypal hell is, itself, an archetypal image. Lots of people are talking, and will continue to talk, about this in literal terms of demonology. We can’t, however, conflate the archetypal appearances of things, easily amenable to our collective imagination, with the particulars of what’s going on. All the human languages, and specialisms, can provide analogies and avenues to talk our way through conundrums like this one, including math and literary theory, as I’ve tried to demonstrate.
This is a matter of media and eloquence! Not the reduction of reality to story. We’ve generated A.I. hell out of such reductions; it’s through increased nuance that we find our way out. Keep your wits.
edit (September 8, 2022): Twitter user Donald Watts helpfully clarifies my interpretation of the phenomenon:
To be clear Loab isn’t the result of lots of different negative prompts. It was just one negative prompt which was combined with various other images. The notable part is how strongly the Loab figure seemed to persist, suggesting it was unusually overfit…. But you wouldn’t get to Loab by just doing “opposite of a calculator” or whatever. It was specifically that first image in the thread.
The idea is that the number of vertices, extremities, “zones”, or what-have-you within which a negative prompt might land is far, far smaller than the number of prompts which might be given. That number could still be very big. It will be interesting to discover more of these extreme vertices, and the archetypes which dwell there, which are not discoverable by positively-defined prompts.
I just wanted to be the first to comment, even though I have no real comment. Maybe AI could make a reasonable comment up for me.
Do you know what the corresponding prompts are for the images generated? The prompts should accompany the images
According to the OP (@supercomposite), the original prompt was “DIGITA PNTICS skyline logo::-1” which led to all the images of the woman. Then, using the woman as an image input pulled all the gory stuff from the gory “zone” (as I’m calling it) out of which she comes. The OP is just probing that “zone” which was reached from that meaningless prompt.