The Midjourney Mess, part 2: Turing problems
Why the metrics by which we judge the success of creative AI failed to prepare us for this.
(This post was originally for subscribers only; I’ve chosen to release it, along with part one, for free as a part of the SFWA roundup of posts on AI and the creative industries.)
I'm thinking more about image- and text-generating AI, and one point I want to make is that this current mess isn't just a failure of ethics - it's a failure of evaluation more broadly. It's not just that the tech industry didn't think carefully enough about whether producing this kind of AI would be harmful; it's also that the other questions we asked ourselves - whether a creative AI was effective, whether it was "really" creative, and so on - were the wrong questions.
A lot of my PhD work was specifically about grappling with these questions. What does the word "creative" mean when we apply it to humans? What do psychologists of creativity say about how it works? When we try to call a machine creative, are we using the word in the same way? Is it really working in the same way, undergoing the same processes deep down, as a creative human?
Broadly speaking, there are four ways of defining creativity. There's creativity as a personality trait (hard to apply to computers for obvious reasons, though people have tried). There's creativity in terms of the creative process - from idea to planning to execution, revision, and polishing (which, as far as I can tell, none of the current crop of AI actually does - they're all exercises in "slap a very big neural network together and hope for the best"). There's creativity in terms of the novelty and quality of the finished product (we'll come back to this). And there's creativity as a social construct, in which an artist is deemed creative through their interactions with the broader community, which not only grants the artist social legitimacy but also constantly influences, and is influenced by, what the artist produces. (Even when an artist consciously rebels against some trend in the broader community, they’re still in some sense talking about what’s going on around them.)
Product is where the current crop of AI is showing up strongest. And we often talk about product using a Turing Test-like concept. Are the creative products of an AI sufficiently correct and interesting to be indistinguishable from a human artist's work?
A lot of ink has been spilled, in academia, on the appropriateness of tests like these. The prevailing view in computational creativity research is that a Turing-style test simply isn't good enough. It's relatively easy to make a derivative pastiche that can fool an untrained human, especially if that person can't interact with the computer, question it, and test its capabilities. In the original form of the Turing test, this kind of interaction is key to the whole thing. Besides, it would be very cool and interesting if a computer somehow made good art that was different from what a human could make! So “indistinguishability from humans” might not even be our goal.
In more recent AI, like ChatGPT or the various prompt-based image generators, a user actually can interact in real time - which leads to more ink being spilled on articles like this one by Gary Marcus, highlighting what kinds of interactions will expose the AI's weaknesses, and in particular, its lack of grounded understanding.
I actually enjoy articles like these. The process of testing an AI's limits is interesting for its own sake and sheds light on what's going on under the hood. It also helpfully pokes holes in some of the more outlandish techbro claims about these models - that they're already sentient, for example (which they most certainly are not).
But also? It doesn't fucking matter.
The applications that do the most damage - replacing human artists, stealing their work, breaching academic integrity by writing students' assignments and so forth - do not require the AI to be fully humanlike, "really" creative, or any of these other things the theorists have been arguing over. They don't require any specific process, nor any acceptance at all from the broader community of artists. In fact, the broader community of artists - as we’re now seeing - can be quite vehemently opposed to them, and they’ll keep on trucking.
All they require is for the system to churn out something more or less convincing, in response to the right kinds of prompts, often enough that a human can sift through the things it generates and pick one that serves the human's purpose in a vaguely okayish way.
At that point, it’s already to a company’s advantage to just go and use the vaguely okayish AI instead of having to pay any real artists.
Right now, every other, higher-minded criterion that we argued about in grad school feels moot.