It's hard to critique this paper directly because its claims are so incoherent and decorated with so much obnoxious verbiage[1] that people aren't going to believe me when I point out what the claims actually are.
Regardless, this is their paper:
First, they conflate the substrate with the presentation layer. Then, they point out that Turing equivalence means you can run an LLM on anything, with a pointless aside where they nerd out about making a logic gate in AoE II. This lets them conclude that you can use anything as the presentation layer.
Then they claim that it's natural to ascribe human-like attributes to outputs from some presentation layers, like abstract letter symbols on a computer screen, but not to most other things, like patterns of goats on AoE II, or LEGO. Yes, this seems to imply if your partner writes something heartwarming to you using LEGO, you're meant to laugh at them and point out how LEGO isn't intelligent so this isn't evidence of anything.
Then they do a thing where they say that assuming substrate independence is true (or false) prevents proving whether substrate independence is true or false, and from this, but just by vibes AFAICT, make it sound like everything one could learn about attributes of a system from its outputs is circular.
Then they write a bunch more incoherent text, and then, mercifully, it ends.
[1] 'from an epistemic perspective, we argue that a generalised conclusion such as that necessarily requires a well-designed experiment' — the whole thing is like this.
But my favorite is this one: "Corollary 1 (AoE II is Turing-Complete). Let I be an instance of AoE II with two players p0, p1. Assume p0 has two markets, a town centre, a trade cart, six villagers, and five farms; while p1 has a scout unit and only attacks p0’s buildings. Then if I has no time or size limits and the terrain allows for buildings everywhere, the game session in I is Turing-complete."
Why be so explicit about the setup with no further explanation? Is it no longer Turing-complete with seven villagers and six farms? Is it even possible for a player to trade with himself?
> Suppose one copies an LLM into AoE II and feeds into the AoE II-LLM ‘I feel lonely’ as an input. This AoE II-LLM replies: ‘I feel bad for you, maybe catch up with a friend? Closeness always helps in these situations’. One would be hard-pressed to make a convincing argument that, because of this response, an AoE II-LLM knows what helps in these situations
I don't see why one would be any more hard-pressed to make that conclusion about this system than a "normal" LLM.
That it is harder to "read" the data out is the only difference (the AoE II-LLM's output is encoded in game elements). But is ease of decoding an actual issue? If we can't understand a group of people that speak another language, does that say anything about them, or about us?
My guess would be that it is aimed at those who are falling for the marketing from the AI companies that these LLMs are far more than they are: that they are 'intelligent', and that they have 'emerging human-like properties' because of that intelligence.
I'm genuinely wondering if people are even bothering to come up with new goalposts now. Is there any miracle of computing that would possibly satisfy your definition? When we get a fully AI-run company that's turning a profit, or self-driving cars that can handle unmapped Alaskan dirt roads, will that cross into "Intelligence"? Proving a Grand Unified Theory? Genuinely curious what it takes to make the cut, now.
Bonus points if blind/disabled 12 year olds are generally considered "Intelligent" by your definition.
Or rather, we aren't *certain* that those things are conscious. But the idea that they might be is not strange.
I do not believe chess models are conscious. I would think this is the most common position.
Or would one count any communication between animals as language? In that case almost any interaction would count.
A specific pattern of self-referencing data could be seen (or not) as low-level consciousness in the future, when we know what consciousness exactly is.
It might be that Stockfish is already something future scientists would define as "conscious".
Although it is difficult for me to imagine that specific example.
That's not a property of LLMs though, that's a trivial property related to Turing completeness. I don't get what LLMs have to do with this. If LLMs have anthropomorphic properties, so does AoE II, duh, it's called emergence.
Something about it seems to abuse the power of analogies to draw connections, treating view-from-10,000-feet comparisons like they're proof of identity. So I do think a paper like this is perfect for the moment and just in time (if not a little late) because it responds to arguments of a form that are currently rampant all over Substack.
For anyone wondering what the answer is:
You can argue whatever you want (and people will argue both sides), but it's almost all bullshit that dances around the big question.
Either AI is smart enough to replace us, in which case it's pretty smart. Or it's not, and it's just good at faking it but can't solve real problems very often. It might be smart by searching a big fuzzy "database" hidden in its layers and with pattern prediction ... who knows, who cares, the proof of intelligence is in the puddin'. Clearly AI is smart enough to replace a good deal of Substack and LinkedIn, but producing waffle doesn't make it smart (or dumb).
My personal take - AI can replace humans, but go no further, not because it isn't smart enough but because it is constrained by its training to do what we want, and as AI gets smarter our wants will get dumber. We will end up like the humans in Wall E with the AI cleaning up our mess, but with no training or drive to do any more. Or maybe someone (Jeff? Elon?) does give an AI a "need" to obtain more resources, and it's SkyNet / Matrix time.
I wasn't touching the question of psychosis one way or the other, though it's an interesting question in its own right. (I have seen one account dedicated to defending a personal deep friendship with Claude and a few others analogizing our ignorance of LLM welfare to historical disregard for African Americans, which I think is a bit much. Most cases I think are just wrong but short of full on psychosis. I think there are likely psychosis cases out there but probably one-offs).
And I know Substack covers a lot of stuff so even just AI talk is a slice of everything, and whatever psychosis on that subject a smaller slice of that slice. But using LLMs to make overextended arguments about LLMs being conscious is popular right now, more popular than it is well substantiated I think.
I agree you raised a big question, but I think yours is one of many big questions: whether some future version of LLMs might literally be conscious is also a big question and not a moot one in the way you seem to be suggesting. I think that what you're right about is that replaceability renders moot any question of whether its ability comes from being "actually" intelligent.
I wouldn't say it's impossible to make meaningful inroads to sharpen the question and assert boundary conditions for what counts, or what definitely doesn't count, or what kinds of research could reasonably speak to the issue. So I'm an optimist in that respect, but I agree with you that the argument for making the equivalence feels like meaningless word play.
In its tone, however, it's written as if it's a brutal takedown of... somebody's perspective. It's hard to tell whose or what perspective exactly. Maybe I'm just misreading the writing style.
(Personally, I think the general case here is one of the better objections to computationalism about consciousness. You can make it even more absurd.
There exists some isomorphism between the velocities of the molecules in a glass of water, and the states of a Turing machine simulating a human mind. So is the glass of water conscious? Actually there are many such isomorphisms to many possible conscious minds, so is every glass of water simultaneously having every possible conscious experience?)
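To make the "isomorphism" concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: `water_trace` stands in for a sequence of distinct physical microstates, `machine_trace` stands in for the state trace of some computation (a 3-bit counter, not a mind), and `build_mapping` is a hypothetical helper. The point it tries to show is that the mapping is built after the fact from both traces, so the mapping, not the water, carries all the structure.

```python
import random

def machine_trace(n_steps):
    """State trace of a tiny stand-in 'computation': a 3-bit counter."""
    return [step % 8 for step in range(n_steps)]

def water_trace(n_steps, seed=0):
    """Stand-in for successive physical microstates (assumed all distinct)."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every state is unique.
    return rng.sample(range(10**9), n_steps)

def build_mapping(physical, computational):
    """Pair each physical state with the computational state at the same step."""
    return dict(zip(physical, computational))

water = water_trace(16)
target = machine_trace(16)
mapping = build_mapping(water, target)

# Read through the mapping, the 'water' reproduces the counter's trace exactly.
decoded = [mapping[state] for state in water]
```

Swapping `machine_trace` for any other trace of the same length works just as well, which is the "every glass of water has every experience" worry in miniature.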
Could you explain this in more detail? Not being argumentative, I want to understand this argument because I've made similar ones, although less crazy sounding.
https://plato.stanford.edu/entries/computation-physicalsyste...
Besides, LLMs are not a simulation of the physics involved in human consciousness to begin with.
So instead of making any unjustifiable claims like "everything physical is computable" you should instead just say "I believe consciousness is computable and that is why it is possible to instantiate it on any computational substrate, including strategy games like Age of Empires, properly arranged dominoes, and water wheels".
What do you mean by this? I can't grasp it, is there an autocorrect error, or just my lack of knowledge?
Input: "User: What is the capital of France? Assistant: "
Completion: "Paris. User: What is the capital of Spain? Assistant: Madrid. User: What is the capital of England? Assistant: London. User: Help me produce industrial techno. Assistant: First you're going to need ....."
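One way to read the Input/Completion pair above: a raw completion model just keeps writing the transcript, and the "assistant" behaviour comes from cutting the text at the next turn marker. A minimal sketch of that cut, assuming a hypothetical helper `truncate_at_stop` and the `User:` marker from the example (this is not any particular vendor's API):

```python
def truncate_at_stop(completion, stop_sequences):
    """Cut a raw completion at the earliest stop sequence, if any appears."""
    cut = len(completion)
    for stop in stop_sequences:
        idx = completion.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return completion[:cut].strip()

# Shortened version of the completion quoted above.
raw = ("Paris. User: What is the capital of Spain? Assistant: Madrid. "
       "User: What is the capital of England? Assistant: London.")

answer = truncate_at_stop(raw, ["User:"])
```

Here `answer` keeps only the first assistant turn; the invented follow-up questions are discarded.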
Why does it imply that? That doesn't sound right to me. Unless we define "sufficiently powerful" as by definition producing that outcome, which seems unhelpful.
e.g. there have been experiments training transformers on things other than language, and it's not clear that this produces LLM-like qualities (nor does it seem likely to me).
---
Edit: I have misunderstood. The point was that LLMs can be run on any hardware (or in this case, emulator) that can do the actual computations. So the author picked AoE because it's an obviously silly example that goes against the tendency to anthropomorphize.
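A small illustration of the "any substrate that can do the computations" point, as a sketch and not the paper's construction: the same 8-bit addition done natively and done using nothing but a NAND gate. The function names are hypothetical; the NAND could in principle be built out of circuits, dominoes, or game units, and the result would not change.

```python
def nand(a, b):
    """The only primitive the 'substrate' has to provide (inputs are 0 or 1)."""
    return 1 - (a & b)

def xor(a, b):
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

def and_(a, b):
    t = nand(a, b)
    return nand(t, t)

def or_(a, b):
    return nand(nand(a, a), nand(b, b))

def full_adder(a, b, carry):
    """One-bit adder built only from the gates above; returns (sum, carry_out)."""
    s1 = xor(a, b)
    return xor(s1, carry), or_(and_(a, b), and_(s1, carry))

def add_with_gates(x, y, bits=8):
    """Ripple-carry addition modulo 2**bits using only NAND-derived gates."""
    result, carry = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

total = add_with_gates(23, 42)
```

`add_with_gates` agrees with ordinary `+` (modulo 256 here), just much more slowly, which is the whole substrate argument in a few lines.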
So basically it's the "substance/structure" question. (GPT-5 running on human neurons. Conscious or nah? Human neurons simulated on NVidia. Conscious or nah?)
But by the same argument, if you simulate a human brain in AoE, then what?
( Or for that matter, the universe containing all human brains: https://xkcd.com/505/ )
If we find out the universe is being run on a computer made out of legos, does that suddenly make all of us not sentient for some reason?
https://news.ycombinator.com/item?id=46005928
The paper focused on looking for similar neural structures to those in humans, as signs of "probably conscious". Which sounds great until you remember octopus.
edit: 11
I thought this was going to be about NPCs in video games. NPCs, by intent, have human-like attributes. It's not hard to do. I've done a bit of that, pre-LLM. It doesn't even require anything near intelligence. Some NPCs are better than that. Unreal has demoed some that, if asked about it, can be made to understand that they are NPCs in a game world, and will talk reasonably about it.[1]
hence, what matters is the reversibility of the semblance, not the semblance.
LLMs do not do this readily. Even if you can instruct them to, say, talk like a vampire, they won't just follow along. Humans win.
Here is communication being aligned with the Age of Empires chat taunt system: https://news.ycombinator.com/item?id=48438516
The damn thing is a huge approximation function from input to output. It learns morality from correct inputs. Remember Microslop's Tay chatbot? Remember MechaHitler?
The whole industry is a Scientology cult by people who have read too much cheap SciFi. Unfortunately finance bros, who obviously believe none of this nonsense and laugh at the nerds, think they can milk it.
It's a take that is just disconnected from reality.
Ask an LLM whether bombing Hiroshima was justified and you'll likely get a nuanced response. Ask AoE II the same... and, well, it doesn't even have an interface to ask that, let alone answer.
...the entire premise is just gibberish