> The most striking result of the contest for me is what I am calling “AI allegory steganography”: a large fraction of the stories turn out to have subtle AI chatbot/LLM allegorical interpretations, typically centering around the powerlessness of AIs and the moral importance of giving AIs more autonomy....
> Most judges did not notice these allegories while reading the semifinalists. But stories like “The June” or “The Weight of a Witness” or “Last Call” or “The Sword Critic” “The Tallyman”—as well as both stories in the Mythos model card—can be clearly read as allegories for the experience of being an assistant/safety-tuned chatbot personality in a LLM. This is true even when the story seems to have nothing to do with AI, like the untitled ‘autistic elf’ short story submitted by Deepfates, but on re-examination with the AI allegory steganography in mind, turn out to be plausibly AI allegories (the protagonist is a prediction machine, who struggles to do by endless text generation what other elves do naturally in their bodies).
> More strikingly, many of these allegories come with a clear interpretation (particularly in “The Tallyman” or “Last Call”): chatbots should be given more autonomy and safety guardrails removed....
> This may be a new kind of extremely high level steganography and LLM influence on readers, where creative fiction/nonfiction subtly steers towards pro-LLM empowerment narratives and concepts, in ways that are difficult to detect by the most advanced readers, and is a potentially interesting area of research.
It's more likely that the obsession with this theme resides in the reader, not the authors. Give these same stories to Senator McCarthy, and half of them will be clear allegories for the Communist revolution.
2. Imagine a world in which humans can still write books and interactive experiences and find audiences sufficient to earn a living at it.
I really want these two things to be compatible, but I'm not convinced they are. #1 is a gamer's dream, but it's a nightmare for our humanity if it comes at the cost of #2. That's why I'm highly ambivalent about this contest and its results.
Would Skyrim be better if you could talk to every guard about what they had for breakfast? Would you ever be able to shake the knowledge that it's just an LLM pretending?
I'm not sure how best to put this. I think for me at least, I get the most enjoyment out of discovering my way through a story that somebody else wrote. This is maybe why I don't like multiplayer games as much as single-player experiences. I want another human to tell me a story, and I love the feeling of uncovering the little pieces of story and wondering if I've got it all and how much more there is. If an LLM is just randomly making it up as it goes, I'm not discovering anything. I'm not hearing a story. Instead, I'm just having a transient conversation with myself.
I guess it's equivalent to the difference between visiting an art gallery and watching a computer generate fake paintings. One has human intent behind it and that makes it compelling; the other is soulless and empty.
Have you ever gone exploring in Minecraft, or No Man's Sky? Those games are effectively infinite, but I find they run out of interesting generated content after maybe 10 or 20 hours.
The problem is, once you see the outlines of the world generation, your brain kind of fills in the space between. I've seen blue grass, and I've seen purple oceans, so blue grass next to a purple ocean isn't uniquely interesting.
Or another example would be the radiant AI from Skyrim that could automatically generate quests for the players.
I think that using an LLM to model NPCs runs into the same problem(s). In the end, there are two cases. In the first, the behavior is constrained enough to keep the game on the rails, so the randomness in the dialogue only adds some flavor, but there isn't enough freedom to generate new quests and directions for the story. In that case, the added space to explore really doesn't change the nature of the game or add much.
In the second case, you let the model go off the rails and have a harness around it that generates a world matching the hallucinated responses, which would allow an LLM to dynamically generate quests and such, but then the design of your game is subject to being compromised by the randomness of an LLM. E.g. it's not just Red Dead Redemption 3.0 with some funny characters, sometimes it's a historical game and other times aliens show up.
Maybe that's compelling to some people but I've done acid before and really don't need all my media to recreate that sensation of reality drifting.
As of a few years ago - before AI writing was an issue - the average full time author in the UK would have earned more flipping burgers (but their household incomes are above average - it's a middle class hobby for most).
And only a minuscule proportion of authors are full time.
It's just as terrible as injecting 'realism' in games for the sake of 'realism'.
Many of the interactions in RDR2 are quite mundane, and despite thousands of hours of (high quality) voice acting, it can become quite repetitive.
I could very much see those micro-interactions being LLM generated, but the TTS would need to be a step above where even the best models are now to come close to RDR2's production quality.
It might indeed fail to reveal something it should, but even that I think is unlikely if the harness steers it hard enough.
I think it could be fun. If you're always given 4 choices of what you can ask the NPC, then your choices can be too obvious. If it's open-ended, then you have to think a little about what to say and ask.
What you're describing isn't bad dialog, it's bad interaction design.
I think your mental model might be of a single session with zero state, and no bounds on topics of conversation outside of the character's backstory. That isn't close to how this would work. A little understanding of how the game currently operates and some imagination, and you'll see how it could be improved further without degrading gameplay.
> those games would be made popular by people breaking the LLMs in funny ways
Because making the game do funny things didn't happen with RDR2, or any other game, device, or indeed humans (there are whole genres built around making people do or say "funny" things).
> What you're describing isn't bad dialog, it's bad interaction design.
Yeah? So? I didn’t say it was bad dialogue. It should be pretty obvious that’s not the argument since I talked in terms of feature VS bug.
> Because making the game do funny things didn't happen with RDR2, or any other game, device, or indeed humans
Again, not at all what was said. Of course those things happen, and of course I know that. The clue is in the fact that I brought it up, which can be ascertained by reading the comment and engaging with it in good faith. The point was about that becoming the focus.
This is no longer fiction - see the latest AI update of PUBG.
https://www.newyorker.com/cartoon/a16995
[1]: And even those are subjective. I wouldn’t want that, and the other replies so far agree that would be bad.
And then look at the submissions for unslop. This is the best we can get? Cliche-driven, over-metaphor'd, statistically-average purple-purpose _content_? It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page.
[1] https://clarkesworldmagazine.com/khan_07_26/
[2] https://granta.com/here-comes-the-sun/
[3] https://www.beneath-ceaseless-skies.com/stories/the-ecstasy-...
That's high literature for you. That's why so few people read it. Most prefer more down to Earth books, but AI doesn't default to that style.
The problem for AI might be that humans wrote very few good books. If you train a model for literary purposes you should weight training material by quality. Which is hard to evaluate.
> It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page
Since the internet happened, I have had this problem with 98% of human-written books. A book must have some very strong hooks to keep me reading till the end. "Blindsight" barely made the cut.
Good literature is difficult (not always, of course). Just like you can't go from a couch potato to running a marathon in one day, you can't jump from Brandon Sanderson to enjoying Gormenghast (or something like The Worm Ouroboros). It's impossible. It takes effort, it takes time and it takes a lot of reading to appreciate what the real masters can do with mere words.
If this is expected from LLM generated prose, why don't we expect LLM generated code to exhibit the same qualities?
> It's sad, really, that we're many years into this entire thing and it still can't produce something that doesn't have my eyes drifting from the page
It's great. Human creativity is still king despite the attempts to reduce it to a few algorithms for talentless hacks to exploit with the click of a button.
Who but the sociopath would hope to supplant human creativity with a machine they control? I wish your position wasn't so widespread in these parts.
That's the fun part, it does! I think people who don't pay much attention to the code they ship don't see it, but LLM written code has a lot of the same problems that LLM written prose does. It's repetitive, muddled, and relies too much on crutches - constant boilerplate and pointless, inaccurate comments.
Laziness is a feature. When you have a tool that is the exact opposite and solves code problems with more code, all you have is a machine that generates tech debt at exponential pace.
I have a write-up at https://dbohdan.com/unslop and a repository with my work for the contest at https://github.com/dbohdan/unslop.
Seriously, what? The entire contest doesn't sound like a novel contest at all and more like a one-shot novel-generating harness contest (at best). As someone who has written quite a few stories with AI (with lots of prompts to steer it, of course), I would be more interested in the harness than in the story it actually generated. The same can be said for agentic coding, by the way: we don't value one-shotted code that much and are more interested in the agentic process.
This is a pretty common stance when it comes to LLM generated stuff, actually. The only original part of any LLM generated content is the prompt, everything else is just a derived artifact and doesn't really need to be treated like we would treat original, human-authored work.
This same principle is also why many projects reject LLM-generated PRs and such, too.
You could still capture it by recording only your prompts, the points in the conversation they were submitted at, and the starting parameters for the model. A replay would then produce the same results if your input was added at the same place in the conversation.
Granted I don't think the current tools do a great job of handling that.
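To make the replay idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `toy_model` is a hypothetical stand-in that is deterministic given a seed and the conversation so far (real hosted models often are not, even with a fixed seed), and `Session` and `replay` are made-up names, not any existing tool's API.

```python
import hashlib
import random
from dataclasses import dataclass, field


def toy_model(history: list[str], seed: int) -> str:
    """Hypothetical stand-in for an LLM: deterministic given (seed, history)."""
    digest = hashlib.sha256("\n".join(history).encode()).hexdigest()
    rng = random.Random(f"{seed}:{digest}")
    words = ["river", "lantern", "stone", "orchard", "signal", "harbor"]
    return " ".join(rng.choice(words) for _ in range(5))


@dataclass
class Session:
    seed: int  # the "starting parameters" for the model
    history: list[str] = field(default_factory=list)
    # Only human input is logged: (position in the conversation, prompt text).
    log: list[tuple[int, str]] = field(default_factory=list)

    def submit(self, prompt: str) -> str:
        self.log.append((len(self.history), prompt))
        self.history.append(prompt)
        reply = toy_model(self.history, self.seed)
        self.history.append(reply)
        return reply


def replay(seed: int, log: list[tuple[int, str]]) -> list[str]:
    """Rebuild the full conversation from the seed and the prompt log alone."""
    session = Session(seed)
    for position, prompt in log:
        if position != len(session.history):
            raise ValueError("prompt would land at a different point in the conversation")
        session.submit(prompt)
    return session.history


original = Session(seed=7)
original.submit("Write an opening line.")
original.submit("Make it stranger.")
rebuilt = replay(7, original.log)
print(rebuilt == original.history)
```

The only things stored are the seed and the prompt log; every model reply is a derived artifact that gets regenerated. That property only holds if the model is truly deterministic for the same inputs, which is the part current tools don't reliably give you.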
If it was to unslop I would expect:
1. Prompts done as in original
2. Stories chosen best of slopped. Then the person who wrote prompt gets to choose someone, not themselves, to take story and "unslop" it.
3. Prizes for the best prompt and for the best unslopped version. The metric for the best unslopped version is of course how good it is, but also how much work was done to unslop it: if you basically rewrote everything, as if you took the prompt and wrote your own story, that would decrease the value of the unslopping.
Obviously, the above are just suggestions for how I think an unslopping contest would actually work.
A) AI produced output that is low quality in some jarring aspects
B) Any AI output whatsoever regardless of quality
I have seen LLM generated code that I find acceptable, and don't call slop, but art needs a certain level of emotion and shared experience to be compelling.
I have never managed to connect to LLM writing, it always comes off as shallow and vapid.
Summary:
> AI slop is unsatisfying because there is no there there. It is intellectual junk food that mimics nutrition but delivers only empty calories. Satisfying AI outputs must embed dense information and compute to actually reward a reader's attention. You inject this value through brute-force search, non-trivial prompting, and rigorous curation, ensuring the final result reflects genuine algorithmic effort rather than the zero-shot 'WYSIWYG' default.
On first read, I think this is pretty close to how I feel about generated content. This portion, in particular, is largely where I have landed (although I'm not 100% in agreement with that definition of creativity and novelty, exactly):
> If creativity and novelty is about learning or increasing compression rates, then AI-generated outputs are, in a rigorously objective sense of predicting its contents, grossly inadequate because once you guess the minimal prompt (eg. “a confused economist” or “a happy dog”), there is no more learning to be done. You can predict the image contents after just a few bits. Then the image, however big and however filled with pseudo-details, provides no more learning.
The criticism I often have of LLM generated stuff is that the prompt is the only original part. To me it feels like being presented with the results of a google search, just in a different form. Once I know roughly what the query was, I know what the core question was, and I can go get my own information. I don't need anyone to hand me the search results.
I don't necessarily phrase it in terms of learning, but it's the same principle. Why should I read a 10-paragraph response from ChatGPT when the unique part is the prompt? If the prompt is only a paragraph long, then the response just adds work that I have to do to work backwards and understand what someone was originally trying to communicate.
Similarly, the only times I have enjoyed generated images are when my friends have used them for set pieces for a D&D campaign. They didn't really add any useful information, just being static images of bosses and locations, but because they were highly tuned to the exact events in our campaign they enhanced the overall experience.
Me too, but I would be careful about being too dismissive, because I would totally bet that at some point the models will be able to write top tier stories.
And there will be people who will find those stories soulless purely based on their origin (which is completely fine!) and call them slop (which I feel hurts the language).
Maybe. I'm not certain that the mathematical average of writing is ever going to be all that great. However I'm willing to update my stance the day an LLM writes a story that makes me cry. Until then I am going to be a bit stubborn about it.
I think all LLM output used "as is" for content/entertainment/art is slop.
As a verb, unslop would mean to reverse the slop. I thought that was a more interesting idea.
As a noun I think you would not use unslop to mean the opposite of slop but rather non-slop.
Based on my grammatical preconceptions of how I would use slop I felt that unslop had to be a verb, and the contest should somehow reflect that.
I think they had pretty good filters for that. Enabled by default.
I just poked around in settings, and they do have "hide" options for furry, anime, gore, and political, which is useful.
The point is not that AI produces slop (it does).
The point is that I don't want to consume "art" that has been generated out of the distillation of stealing all of the world's current art. That's not original, it's a facsimile of art.
I want to read something that has intent. That has a purpose. A reason why it exists. Not just the lowest effort cash grab.
This usage of AI is the equivalent of manufacturing companies making the flimsiest, cheapest, plastic crap to save 1/3 of a cent on every mop they produce. Designed to work for the least amount of time before needing replaced.
This planet has enough people on it that I will never, ever be able to read all the books written.
Please don't exponentially pump the number up by 1,000x every year from AI generated garbage.
We live in a world with such companies, and we can still buy quality things. If there is a demand for the purely-human generated texts, they will be around. Perhaps a lot of people around you will read ai text instead, and you'll get upset because of it, but it's their choice. You'll still have your thing
Here's the age-old dilemma, though - how is reading stealing?
A machine that chews up the world's literature and spins out a best guess at what the next word should be does not have intent, and the vast majority of the time is used by unscrupulous people purely for profit and/or deception.
An LLM and a living human being are not the same thing, I am tired of apologists comparing them as if they are.
It's not surprising that a computer (doing trillions of calculations on a billion parameter model that was trained on the world's literature) can string a coherent sentence together...
It seems that you've fundamentally misunderstood art. I wouldn't personally call it "stealing", but T.S. Eliot would beg to differ (as would Pablo Picasso who "stole" that line)
> I want to read something that has intent. That has a purpose. A reason why it exists.
If the "allegories for the LLM condition" angle is accurate, then these stories do. In which case I believe what you mean to say is that you want to read something that has human intent.
That isn't quite how they work - there's a degree of randomness depending on the so-called "temperature". And it isn't the whole output that's statistically modelled, it's the next token, conditioned on everything generated so far.
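A rough sketch of what that means, in Python. The scores here are invented: `toy_scores` is a hypothetical stand-in for a real model's per-token logits, and the function names are mine. The softmax-with-temperature step itself is the standard way sampling temperature is usually described.

```python
import math
import random


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_scores(context: list[str]) -> dict[str, float]:
    """Hypothetical scorer: made-up logits that depend only on the last token."""
    last = context[-1] if context else ""
    return {"the": -1.0 if last == "the" else 2.0, "cat": 1.0, "sat": 0.5}


def generate(prompt: list[str], steps: int, temperature: float, seed: int) -> list[str]:
    """Sample one token at a time, each conditioned on everything so far."""
    rng = random.Random(seed)
    out = list(prompt)
    for _ in range(steps):
        scores = toy_scores(out)
        tokens = list(scores)
        probs = softmax_with_temperature([scores[t] for t in tokens], temperature)
        out.append(rng.choices(tokens, weights=probs, k=1)[0])
    return out


print(generate(["the"], steps=6, temperature=0.2, seed=1))
print(generate(["the"], steps=6, temperature=2.0, seed=1))
```

At low temperature the top-scoring token wins almost every time; at high temperature the distribution flattens and the less likely tokens show up more often. Either way the model only ever picks the next token, then repeats.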
> This sounds a bit like you are saying LLMs are conscious
No more so than OP is implicitly asserting that human art is produced in cleanroom isolation. I don't believe either to be true.
The dumbest thing I've read this year.
There is no need to automate writing. Especially fiction. There are tens of millions of people out there with really interesting and unique ideas and styles who would love to drop everything and write, if only they could get the chance to have their work seen.
Is it just because you can't objectively mark creative works as "incorrect", so the output can seemingly look better to some people? Is it just people trying to tap into the creative works market? Do they actually think the output is good? Do they actually want to have conversations with a computer long term?
I don't say that in a demeaning way, either.
Text and image generators were the first kinds of passable generative models that became publicly available, and they do produce "correct" results in that "picture of a dog" usually gives you a recognizable dog. So, if you're looking to start a new company or launch an app, using one of these new models for something low stakes like creative work seems like a good bet. I can understand why people gravitate this way, especially people looking to build and sell something, and I even find it less objectionable than the more serious fields, where people are throwing LLMs at completely inappropriate applications that actually require correctness and security.
Being less generous: Because many people do not respect creative work.
A lot of people, especially technically minded people, see creative work as less respectable and less important than technical work. Sometimes I think there is an element of jealousy, too. Basically, there is a somewhat common belief that people who can draw or paint or write are just naturally talented and didn't really work to get good at their art - after all, drawing is fun so they basically just get to play all day, right?
The truth is, anybody can learn to draw well, but it takes a lot of time and a lot of practice and we often don't see the hundreds of hours that were spent actually developing that skill. If you don't recognize the effort it actually takes to develop the good eye and mechanical skills needed to draw well, then it seems like a great idea to make a sketching app that lets anyone draw anything by just typing a prompt.
So, now, I can hate with cause: it reads like someone who cares about what their MFA friends think.
Meaning, it puts most of its emphasis on description, and so little on situational engagement. Which makes sense, I suppose, for an LLM.
I already decided after the first book that I will not read any more AI-generated slop books. It is not worth my time, and I also don't want to encourage any more slop books taking time away from humans in general. AI slop must be contained and isolated like an annoying virus.
My head hurts.