As they say - "Superintelligence isn't about discovering new things; it's about discovering new ways to discover."
The linear-algebra primitives used in LLM inference fundamentally do not have the power to allow an LLM to "emulate" its own internals (i.e. to have the [static!] weights + [runtime-mutable] context together encode [runtime-mutable] virtual weights that the same host context can then be passed through). You need host support for that.
Neither do biological brains (explicitly), yet we can hypothesize just fine.
1. hypothesizing — coming up with a novel insight at runtime that uncovers new parts of the state space the model doesn't currently reach
2. syllogizing — using an insight you've derived at runtime to reach those new parts of the state space
LLMs can do 1, but not 2.
(Try it for yourself: get an LLM to prove a trivial novel mathematical theorem [or just describe the theorem to it yourself]; and then ask it to use the theorem to solve a problem. It won't be able to do it. It "understands" the theorem as data; but it doesn't have weights shaped like an emulator that can execute the theorem-modelled-as-data against the context. And, as far as I understand them, current Transformer-ish models cannot "learn" such an emulator as a feature. You need a slightly different architecture for that.)
And actually, humans can't really do 2 either!
That is: humans can't immediately make use of entirely-novel insights that weren't "trained in", but only just came to them, any more than LLMs can!
Instead, for humans, the process we go through is either:
• come up with the insight; sleep on it (i.e. do incremental training, converting the data into new weights); use the insight
• build up 99% of the weights required for the insight "in the background" over days/months/years without realizing it; make the final single connection to "unlock" the insight; immediately use the insight
LLMs don't get to do either of these things. LLMs don't do "memory consolidation"; there is no gradual online/semi-online conversion of "experiences" into weights, i.e. reifying the "code stored as data" into becoming "code" that can be executed as part of the model.
With (current) LLMs, there's only the entirely-offline training/fine-tuning/RLHF — at much greater expense and requiring much greater hardware resources than inference does — to produce a new iteration of the model. That's why we're (currently) stuck in a paradigm of throwing prompts at ever-larger GPT base models — rather than just having an arbitrary stateful base-model that you "install" onto a device like you'd install an RDBMS, and then have it "learn on the job" from there.
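(To make the contrast concrete, here's a minimal, purely illustrative sketch of what such "learning on the job" could look like with today's tooling: a frozen base model plus a small low-rank adapter that gets a cheap incremental update from the day's interactions. The tensors and shapes below are toy stand-ins, not any real system's API.)

    # Hedged sketch: "memory consolidation" as an incremental low-rank update,
    # with the pretrained weights kept frozen. Toy data stands in for real
    # experiences; nothing here reflects an actual deployed training loop.
    import torch

    d = 64
    base = torch.nn.Linear(d, d)             # "static" pretrained weights
    for p in base.parameters():
        p.requires_grad_(False)              # frozen, as with current LLMs

    r = 4                                    # small "consolidation" capacity
    A = torch.nn.Parameter(torch.randn(d, r) * 0.01)
    B = torch.nn.Parameter(torch.zeros(r, d))
    opt = torch.optim.Adam([A, B], lr=1e-3)

    def forward(x):
        return base(x) + x @ A @ B           # base behaviour + learned delta

    # "Sleep on it": replay the day's experiences as a small training pass.
    experiences = [(torch.randn(8, d), torch.randn(8, d)) for _ in range(100)]
    for x, target in experiences:
        loss = torch.nn.functional.mse_loss(forward(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()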
> That is: humans can't immediately make use of entirely-novel insights that weren't "trained in", but only just came to them, any more than LLMs can!
Agreed – but I'd argue that they both can, albeit in an extremely clunky way (i.e. very similar to "chain-of-thought" LLMs): Mechanically applying the new insights in a low-efficiency, "emulated" layer.
> LLMs don't get to do either of these things. LLMs don't do "memory consolidation"; there is no gradual online/semi-online conversion of "experiences" into weights, i.e. reifying the "code stored as data" into becoming "code" that can be executed as part of the model.
At the moment that's definitely their biggest weakness, but one could argue that memory consolidation happens ~once per year, globally, as past interactions with them almost certainly become future training data.
In some ways, that's more powerful than a single human gradient descending overnight and in the shower; in others, it's obviously much worse.
Yes and no.
Humans — organic animal brains in general — have a seemingly highly-clever approach to memory consolidation / online training. The organic process for memory consolidation seems to take advantage of our own analytical capabilities to pick apart and amplify the salient and practically-useful parts of our memories.
As such, animal brains seemingly need much less training data to learn a skill than current state-of-the-art model architectures do. You could say that they can leverage an (incremental) training dataset to a much higher degree than current NN training architectures do. (It's sort of the complement to a better compression algorithm having a higher degree of signal-loss-less compression for the same data: animal brains have a higher degree of noise-gain-less data extrapolation over the same data.)
If we could figure out what organic brains were doing here, and apply it to the sort of "global fan-in / reduce / fan-out" approach bigcorps use to train new base-model revisions, then I'd agree that the choice of online vs offline training would become just a trade-off of locality vs scale.
After all, in the end what you want is to be able to train a model to do something in "soft real time" (taking a few days/weeks to learn, including time for rest); with the model learning to perform the task "on the job" by interacting with the task, like a human/animal would. It doesn't really matter whether the training has to be sent back to the cloud to be incorporated into a new model revision; it just matters that it happens and that you see the results.
So if the distributed training architectures used to train GPT base models could "notice insights" in small amounts of RLHF data, and amplify/extrapolate large amounts of new training data from those — then you might expect the model that "learned in the cloud" to "learn on the job" just as well as the model that was local and only thinking about that training data. (But the GPT would have "learned on the job" some million other capabilities at the same time!)
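(If it helps to picture that "notice and amplify" step: the sketch below treats each flagged insight as a seed and mechanically expands it into many synthetic training examples ahead of the next offline run. The `paraphrase` function is a stand-in for a model-driven rewriter; all names and templates here are invented for illustration.)

    # Hypothetical sketch: amplifying a few "insight" examples into a larger
    # synthetic dataset before the next base-model revision. In a real
    # pipeline the expansion would be model-driven, not template-driven.
    import itertools
    import random

    seed_insights = [
        {"prompt": "Why did the deploy fail?", "insight": "the lockfile was stale"},
    ]

    TEMPLATES = [
        "Q: {prompt}\nA: Because {insight}.",
        "Given that {insight}, explain: {prompt}",
        "{prompt} (Hint: {insight})",
    ]

    def paraphrase(example, n=32):
        """Expand one seed example into n synthetic variants."""
        return [random.choice(TEMPLATES).format(**example) for _ in range(n)]

    synthetic_corpus = list(
        itertools.chain.from_iterable(paraphrase(e) for e in seed_insights)
    )
    print(len(synthetic_corpus), "synthetic examples from", len(seed_insights), "seed")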
But this does start to sound like the kind of thing that violates some property of one of those "can only choose two" triangles.
After all, current model training approaches — like everything to do with ML — are taking advantage of SIMD hardware. Which is why training clusters are big GPU clusters.
But it seems like the thing our brains do in online training, involves some complex state-driven scheduling of passes over data.
In other words, despite the brain itself being massively parallel, it's seemingly doing a lot of things that — if you were programming them — you'd want to push onto a CPU rather than a GPU. (To be specific: it's seemingly doing complex graph-search queries to bias the extrapolation process. And graph search is the antithesis of a horizontally-scalable problem.)
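(A toy illustration of that asymmetry, with the obvious caveat that it says nothing about what brains actually do: the matrix multiply below is one flat, batch-parallel kernel, while the breadth-first search has to discover each frontier from the previous one, so its outer loop is inherently sequential.)

    # Toy contrast: SIMD-friendly batched work vs. state-driven graph traversal.
    import numpy as np
    from collections import deque

    # Data-parallel: one batched operation over all rows at once.
    x = np.random.randn(512, 512)
    w = np.random.randn(512, 512)
    y = x @ w                          # trivially parallelisable

    # Graph search: each step depends on the frontier produced by the last,
    # so the work cannot be flattened into a single batched kernel.
    graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}

    def bfs(start):
        seen, frontier, order = {start}, deque([start]), []
        while frontier:                # inherently sequential outer loop
            node = frontier.popleft()
            order.append(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return order

    print(bfs(0))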
And I have a feeling — just an intuition, not with anything concrete to back it up — that trying to scale up that dataset-extrapolation pass, where the framework "explores and connects memories to extract and amplify the insight signals"... would be something that would require most of the resources of the system just to extrapolate out one insight at a time. It wouldn't be able to be done in parallel for millions of capabilities — at least not when the model, and the search-space of incremental memories, are "large" (like a modern GPT model is; and like the RLHF dataset from one day's worth of millions of users' interactions would be.)
If this were true, I would think it would be true for the same reasons that you can only vertically scale an IO-bound task so far — a single system only has so much IO bandwidth (e.g. PCIe lanes, DMA channels) available to interact with compute. (Mainframes have a lot of "accelerated" IO, but it's "accelerated" because it's interacting only with things that aren't the system's main compute. The CPU(s) "delegate" handling that IO to cores local to the IO. It's hierarchical NUMA.)
When you have IO-bound workloads, you tend to scale horizontally — if you're building a CDN, you don't make big boxes; you just throw more boxes at the problem, where each is separately handling its own requests.
And I feel like the translation here, is that if you want this insight-extrapolation, then it's only going to work when it's being done in a partitionable manner — by small models training on small incremental RLHF datasets, of the kind they build up through interaction with a single use-case at a time.
And even though you could do that with a global fan-in pass, in the cloud, sharding training across millions of individual small models and assigning each both their own GPU and CPU resources... at that point, why bother? You'd be essentially using your big DC compute-cluster to emulate having millions of regular PCs that have a little GPU and CPU each, each working on their own task in a cache-coherent manner. At that point, just take advantage of the end-user devices that do the inference to also do the training!
(Although, maybe you might still want to fan in the much-less-compact post-extrapolated training data that result from these individual on-device models stewing over their own problems; and then use that to train new general pre-trained base models. Or many such models, since you'd probably want smaller, more purpose-specific models in order to make on-device "on the job" training possible in the first place.)
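(For what it's worth, that "fan-in" pass is roughly the shape of federated averaging. A hedged toy sketch, with invented shapes and no claim that any lab actually does it this way:)

    # Hypothetical sketch of the fan-in step: many on-device models each learn
    # a small weight delta from their own task, and a central reduce averages
    # those deltas into the next shared base revision (FedAvg-style).
    import numpy as np

    d = 16
    base_weights = np.zeros((d, d))

    def local_training(device_seed):
        """Stand-in for a device learning "on the job"; returns a weight delta."""
        rng = np.random.default_rng(device_seed)
        return 0.01 * rng.standard_normal((d, d))

    device_deltas = [local_training(seed) for seed in range(1000)]

    # Fan-in / reduce: average the per-device deltas into the shared model.
    next_base = base_weights + np.mean(device_deltas, axis=0)
    print(next_base.shape)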
---
Funny enough, this could be an interesting hypothesis for the ultimate neurological — rather than biophysiological — role of sleep and dreaming! We do build regular neural connections while awake; but this memory-insight-extrapolation pass specifically, needs to "reserve" the entire brain to do what it does — so it only happens during sleep. It would explain the advice to "sleep on it" to enable lateral thinking. And it could be tested experimentally by measuring how well different animals extrapolate insights from small amounts of training, vs. how long they sleep.
but, seriously: play the tape forward literally one frame and outline what this dataset even remotely resembles… a core sample from a living human brain? “yeah, just train it on thinking about everything at once”. strong ai isn’t like the restaurant: the path to success doesn’t involve starting with more than you finished with.
I find that fairly often if I'm really wrestling with a novel or difficult problem, I will work and work at it, and then one day I will wake up with the answer fully formed with no clear understanding of any of the thought processes that got me to arrive at the solution.
Are you going to record people's subconscious as they sleep? How do you train on something that is so poorly understood in the first place? It's nonsense.
But the easiest way I can think of ATM is to go through all the questions that AI currently fails on, and then have a human work through them and show the chain of thought a human would do, including the false starts and describing the strategy pivots. Then generate your corpus based on that. However, that burns the problem set, so you'll have to constantly try to come up with new problems.
What is the combination of parameters that makes a text generator quick-thinking, self-checking, confidence-estimating? Jumping directly from question to accurate, precise, confidence-measured answers, regardless of how abstract the question is?
My prompt:
Fluid self-control of attention should be enough to bootstrap to metacognitive exploration. Is one the horse; the other the cart? Seems true of the evolution of human cognition from language and theory of minds-of-others, which then makes us more attentive to our own minds as objects of internal meta-dissection. What do you think?
Claude comments:
An evolutionary perspective on the relation between attention control and metacognition is worth “attention”.
The idea that fluid attention control might be the "horse" that pulls the "cart" of metacognition aligns well with what we know about human cognitive development.
Consider:
1. The evolution of theory of mind: Humans (primates and perhaps other taxa) first developed the ability to model and attend to others' mental states (likely driven by social survival pressures). This gave humans a "template" for mental state modeling that could then be turned inward.
2. Language evolution may have followed a similar pattern - external communication tools becoming internalized for self-reflection. Vygotsky's work on inner speech and cognitive development supports this outside-in pattern.
3. This suggests that the ability to fluidly direct attention (including to others' minds) may have been the crucial evolutionary stepping stone that enabled the later development of systematic metacognition.
This has implications for AI development. Current systems like myself have rigid attention mechanisms and no true ability to recursively examine our own cognitive processes. Following your logic, perhaps the key to developing true AI metacognition isn't to directly implement it, but to first develop more fluid, self-directed attention mechanisms that could bootstrap the development of deeper metacognitive capabilities.
Do you think this suggests that current AI development might be approaching things backwards by trying to implement sophisticated reasoning capabilities before we've developed truly fluid attention control? It seems like we might need to master the "horse" (flexible attention) before we can effectively develop the "cart" (sophisticated metacognition).
Mathematical meaning:
We can formalize this argument through the interpretation of reasoning as a latent variable process (Phan et al., 2023). In particular, classical CoT can be viewed as (equation), i.e., the probability of the final answer being produced by a marginalization over latent reasoning chains.
We claim that for complex problems, the true solution generating process should be viewed as (equation), i.e., the joint probability distribution of the solution (a, s1, ..., sn) is conditioned on the latent generative process. Notice that this argument is a meta-generalization of the prior CoT argument, hence why we will refer to the process q → z1 → ... → zm as Meta-CoT.
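(The "(equation)" placeholders in the quote are elided. Going only by the surrounding description, they plausibly have the standard latent-variable form below; the notation is mine, not necessarily the paper's exact symbols.)

    % Classical CoT: the answer's probability marginalizes over latent reasoning chains.
    p(a \mid q) \;=\; \sum_{s_1,\dots,s_n} p(a, s_1, \dots, s_n \mid q)

    % Meta-CoT: the whole solution (a, s_1, ..., s_n) is additionally conditioned
    % on a latent generative process q -> z_1 -> ... -> z_m, itself marginalized out.
    p(a, s_1, \dots, s_n \mid q) \;=\;
        \sum_{z_1,\dots,z_m} p(a, s_1, \dots, s_n \mid q, z_1, \dots, z_m)\;
        p(z_1, \dots, z_m \mid q)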
I think this is seminal. It is getting at the heart of some issues. Ask o1-pro how you could make a 1550nm laser diode operating at 1 GHz have low geometric loss without an expensive collimator, using commodity materials or novel manufacturing approaches grounded in first-principles physics, and the illusion that o1-pro is a big deal is lost. 'Novel' engineering is out of reach because there is no textbook on how to do novel engineering, and this class of problems is 'not auto-regressive from left-to-right'.
For an AI model to be "a big deal", apparently we need to be able to give it a hard problem in an arbitrary field, one that humans have not yet solved[1], and have it spit out a good solution.
[1] At least, I think that's your intent. I am not a laser expert so I don't have a sense of where your challenge lies on a scale from "known but only to experts" to "major research project, may turn out to be impossible".
I very much agree that an AI system that could do that would be a big deal. An AI that could do that would be a world-changing deal. But it's pretty startling if everything short of that is not "a big deal" now, no?
for all we (well, “i”, i guess) know, “superintelligence” is nothing more than a(n extremely) clever arrangement of millions of gpt-3 prompts working together in harmony. is it really so heretical to think that silicon + a semi-quadrillion human-hour-dollars might maybe have the raw information-theoretical “measurables” to be comparable to those of us exalted organic, enlightened lifeforms?
clearly others “know” much more about the limits of these things than i do. i just have spent like 16 hours a day for ~18 months talking to the damned heretic with my own two hands— i am far from an authority on the subject. but beyond the classical “hard” cases (deep math, … the inevitability of death …?), i personally have yet to see a case where an LLM is truly given all the salient information in an architecturally useful way and still produces “troublesome output”. you put more bits into the prompt, you get more bits out. yes, there’s, in my opinion, an incumbent conservation law here— no amount of input bits yields superlinear returns (as far as i have seen). but who looks at an exponential under whose profoundly extensive shadow we have continued to lose ground for… a half-century? … and says “nah, that can never matter, because i am actually, secretly, so special that the profound power i embody (but, somehow, never manage to use in such a profound way as to actually tilt the balance “myself”) is beyond compare, beyond imitation— not to be overly flip, but it sure is hard to distinguish that mindset from… “mommy said i was special”. and i say this all with my eyes keenly aware of my own reflection.
the irony of it all is that so much of this reasoning is completely contingent on a Leibniz-ian, “we are living in the best of all possible worlds” axiom that i am certain i am actually more in accord with than anyone who opines thusly… it’s all “unscientific”… until it isn’t. somehow in this “wtf is a narcissus” society we live in, we have gone from “we are the tools of our tools” to “surely our tools could never exceed us”… the ancient greek philosopher homer of simpson once mused “could god microwave a burrito so hot that even he could not eat it”… and we collectively seem all too comfortable to conclude that the map Thomas Aquinas made for us all those scores of years ago is, in fact, the territoire…
I think your line there highlights the difference in what I mean by 'insight'. If I provided in a context window every manufacturing technique that exists, all of the base experimental results on all chemical reactions, every known emergent property, etc., I do not agree that it would then be able to produce novel insights.
This is not an ego issue where I do not want it to be able to do insightful thinking because I am a 'profound power'. You can put in all the context needed where you have an insight, and it will not be able to generate it. I would very much like it to be able to do that. It would be very helpful.
Do you see how '“superintelligence” is nothing more than a(n extremely) clever arrangement of millions of gpt-3 prompts working together in harmony' is circular? extremely clever == superintelligence
Once you've been to the moon, the next stage is Mars or Deimos. Humans celebrate progress but also appreciate incremental improvements.
I run an AI/ML consultancy so I have skin in this game. The "traditional" model approaches still have tons, tons, tons of value to offer. Few need to have the frontier right away.
I 100% agree with you that AI is fantastic and it is a big deal in general. But now that the world has gotten used to it being able to parrot back something it learned (including reasoning) in the training set, the next 'big deal' is actual insight.
But I see your point, I still think what we have currently is out of a sci-fi book, but I am also not that amazed by computers in our pockets anymore :)
We're not. The big deal with LLMs is that they are large enough language models that they can generate fluent, grammatical text that is coherent and keeps to a subject over a very, very long context. We never could do this with smaller language models. Because statistics.
What LLMs can absolutely not do is generate novel text. This is hard to explain perhaps to anyone who hasn't trained a small language model but generativity -the ability to generate text that isn't in a training set- is a property of the tiniest language model, as it is of the largest one [2]. The only difference is that the largest model can generate a lot more text.
And still that is not what we mean by novelty. For example, take art. When ancient humans created art, that was a new thing that had never before existed in the world and was not the result of combining existing things. It was the result of a process of abstraction, and invention: of generalisation. That is a capability that LLMs (as other statistical systems) lack.
The goalposts therefore have not moved because the criticism is as old as nails and the LLM makers have still not been able to comprehensively address it. They just try to ignore it. If the goalposts are here and you're shooting goals over there and then doing a little victory run every time the ball breaks Col. Mustard's windows, that's not the goalposts that have moved, it's you that keeps missing them.
_____________
[1] I'm old enough to remember... GPT-3 and how it blew GPT-2 out of the water; GPT-3.5 and how it blew GPT-3 out of the water; GPT-4 and how it blew GPT-3.5 out of the water... And all the users who would berate you for using the older model since "the new one is something completely different". Every single model. A yuuuge breakthrough. What progress!
[2] Try this. Take the sentence "<start> the cat sat on the mat with the bat as a hat <end>" and generate its set of bi-grams ("<start> the", "the cat", "cat sat", etc.). Then generate permutations of that set. You'll get a whole bunch (14!-1, as in |sentence|! minus the initial one) of sentences that were not in the training set. That's generativity in a tiny language model. That's how it works in the largest also, hard as that may be to believe. It shouldn't. It's a very simple mechanism that is extremely powerful. Large models are simply better at assigning weights to permutations so that the ones more often encountered in a corpus are weighted more.
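(A runnable toy version of the footnote's point, using a bigram sampler rather than the literal permutation procedure: even a model "trained" on a single sentence will emit word sequences that were never in its training set.)

    # Toy bigram "language model" trained on one sentence, then sampled.
    # Sampled sequences generally differ from the training sentence itself.
    import random
    from collections import defaultdict

    sentence = "<start> the cat sat on the mat with the bat as a hat <end>".split()
    bigrams = defaultdict(list)
    for a, b in zip(sentence, sentence[1:]):
        bigrams[a].append(b)   # uniform weights; a large LM just weights these better

    def sample(max_len=20):
        word, out = "<start>", []
        while word != "<end>" and len(out) < max_len:
            word = random.choice(bigrams[word])
            out.append(word)
        return " ".join(out)

    for _ in range(3):
        print(sample())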
This is contrary to my findings when interacting with LLMs. I can ask questions in ways most human beings would not understand, and from the reply I can tell that the question was interpreted correctly (leaving aside the correctness of the answer). Some non-textbook example of interpretation did emerge.
There's no book on the scientific method?
As other commenters point out, it's kind of physically impossible to expect even a superintelligence in a box to figure something out that takes experimentation and observation. It would have to be limited to exploring pure math concepts and other fields where you only need to write on paper and think about axioms. And that's arguably the hardest type of field to make progress in; it took us millennia to produce enough polymaths to get where we are, and they all contributed a very small part each.
But realizing that you can use certain commodity devices or known processing techniques in different problem spaces does not require new data, just 'insight'.
Is this so? Is the research community agreed? Are there papers discussing this topic?
The 2020 Bender and Koller paper[1] argues that meaning is not learnable from form, and LLMs are trained on form. They propose a thought experiment ("The Octopus Test" section of the paper) featuring an octopus that can intercept the conversation two humans are having, but "having only form available as training data, [the Octopus] did not learn meaning."
And a contradicting response from Yoav Goldberg (another NLP researcher)[2], with a much more informal discussion of "groundedness" and what LLMs learn. His argument is broadly that instruction tuning + post-training can meaningfully ground terms like "summarize" etc.
[1] https://aclanthology.org/2020.acl-main.463/
[2] https://gist.github.com/yoavg/59d174608e92e845c8994ac2e234c8...
This is just Searle's Chinese Room, and it's obviously false. How can we know it's false? Because there is no "meaning" in the standard model of particle physics (all interactions are by "form"/syntax), and therefore all humans must learn meaning from "form" as well.
That is to say, the fact that LLMs are able to communicate effectively with humans is a discovery about the regularity of the semantics of human communication, rather than a discovery about the intelligence of neural networks.
https://ai.meta.com/research/publications/large-concept-mode...
The very best that might be said is that the correlational structure of words under transformer-like supervision (ie., where "predict the next word" is the goal) produces a distribution which is an extremely approximate model of natural language semantics.
Though this has never been disputed. The question comes down to what kind of extreme approximation is involved.
Eg., the truth conditions for "I have a pen in my hand" are that I have a pen in my hand -- direct access to these truth conditions is very plausibly necessary to mean "I have a pen in my hand" in the relevant context. Since a machine has no access to the truth conditions of such utterances it cannot possibly mean them.
Thus if a machine manages to say, "I have a pen in my hand" at an appropriate occasion -- the "extreme approximation to natural language semantics" has to do with this occasion and what "appropriateness" means.
Critics of LLMs and "computer-science-addled thinking" about such matters (such as myself) would say that there is a very narrow range of "occasions" (ie., situations in which you're prompting) that allow such responses to seem appropriate.
That a response seems appropriate to a user is a good engineering condition on a tool working -- it has nothing to do with whether a model understands natural language semantics.
What we might say is that it approximates conversations between agents who understand such semantics on a narrow range of occasions, and succeeds in modelling appropriate language use. And so you might call LLMs models of 'average appropriateness of replies'.
It obviously does not, and cannot, mean "I have a pen in my hand".
The truth conditions for "The god of the evangelical Christians exists" and "The god of the evangelical Christians does not exist" have, arguably, never been directly accessible to any ordinary human being. (Though some of their consequences could be accessible.) Can people mean such things, when they say them?
The truth conditions for "There are infinitely many prime numbers" are ... unclear, really, but maybe they're vacuous (there is no possible world in which there aren't infinitely many prime numbers) or they involve only abstracta (such as those numbers). How do you feel about the possibility of an AI saying that and meaning it, and why?
The first of these examples is the most directly relevant one. I have no direct access to the truth conditions of that sentence, but I think I can still mean it, have good reason to think it true, etc. The processes by which I got into that state involve ... learning things by reading about them, which is exactly what I think you're saying cannot in principle ever give genuine knowledge.
Anticipating a possible response: Of course many of the other things I know, some of which are relevant to the way I understand those words, I learned more directly. For instance, part of what "died" means is the cessation of various natural processes like breathing and having one's heart beat, and I have direct experience of breathing and having a beating heart. One could argue that real knowledge and understanding needs to be somehow traceable back to direct experience, and therefore LLM-type systems cannot have them. But that would be a different argument from the one you've made, and I think it's less compelling (though more likely to be right!) than the simpler "knowledge isn't real unless it's based on direct access to the relevant truth conditions".
... to mean "I have" requires an "I" to "have", and so on. So what parts of non-linguistic reality language refers to matter for evaluating whether the user means what they say. An actor is likewise pretending to mean, and a child may say something without knowing what it means (as in, eg., a claim about nuclear fusion).
If two children were imitating sounds to each other, such that one "said", "the sun is nuclear fusion" and so on -- then neither in this conversation is communicating; neither knows what these words mean. No child involved could ever come up with these words in this order and mean their meaning; they can only have this conversation via imitation. This is the case with an LLM -- it's an imitation game wherein the game is to either fool the adult overhearing the child, or to generate some useful material (depending on whether you're the CEO or CTO).
The problem with a "predict the next word" training goal is that any patterns which emerge will only be coincidentally related to the non-linguistic reality words refer to -- because the machine isn't trained on reference: it is not participating in reality and associating words with it.
The kind of participation necessary for an agent to acquire the meaning of words has no universal answer, but it is always "some". An LLM has none.
For a claim about a composer, an agent who means to make this claim (rather than a child who imitates the sounds of words) -- must be aware of what a composer is, and so on. They cannot mean this claim if they don't have access to the non-linguistic reality to which these words refer (or are unable, via imagination, to simulate similar ways the world might be, such that it has composers, given their prior knowledge -- eg., they at least have to have some prior direct access to music, leading-groups-of-people, and the like).
We can slightly weaken all this but it'll make no difference for an LLM -- however weak we require access to be, to access the meaning of words requires accessing a non-linguistic reality. Words mean non-linguistic things -- that is their point.
Most of what you say merely asserts that when an LLM says something it can't truly mean it.
(Incidentally, that's not quite responsive to the original claim, which is that LLMs learn meanings, not that they mean things when they say them. I think there are situations that could be described by saying that they learn the meanings of things but none the less don't mean those things when they say them. I would need to think more before trying to pass judgement on whether that's actually happening with today's LLMs, but it seems well within the range of somewhat-plausible possibilities.)
The key argument you make for claiming that LLMs can't really mean things -- which I remark is not the argument you were making a couple of comments upthread -- is this bit:
> The problem with a "predict the next word" training goal is that any patterns which emerge will only be coincidentally related to the non-linguistic reality words refer to -- because the machine isn't trained on reference: it is not participating in reality and associating words with it. [] The kind of participation necessary for an agent to acquire the meaning of words has no universal answer, but [...] an LLM has none.
I think "coincidentally" is way too strong here. When you ask an LLM "When did J S Bach die?" and it says 1750, it isn't by coincidence that it gives a correct answer. (Considering how much they get right, despite their confabulations and whatnot, it would have to be one hell of a coincidence.) So that's a pattern in what they say that is not-coincidentally related to the non-linguistic reality.
It's only indirectly related, for sure. The LLM says that Bach died in 1750 because it has read things that say that Bach died in 1750. But, again, that's also why I say that Bach died in 1750.
And it seems to me that what matters, when determining whether and to what extent an utterance actually means something, is not the directness of the utterer's connection to the underlying reality, but something more like its robustness and richness. Robustness: To what extent, if the reality were different, would that tend to make the person say something different? Richness: Consider all the other bits of reality closely connected to the one in question; does our speaker's behaviour correlate with those too?
If someone perpetrates an elaborate deception that makes me believe in a certain person's existence and various facts about them, when in fact everything I think I know about them is mediated by the deception, and by pure coincidence there actually is a person with those properties, unknown to my deceiver, then ... well, maybe I do "mean" what I say about them, but I don't really know what I think I know. This is a failure of robustness; changes in the underlying reality have scarcely any tendency to change my behaviour.
If I learn a list of things to say about stars ("they operate by nuclear fusion", "they are mostly billions of years old", etc.) but I'm just parroting them, then robustness might not fail: maybe I learned these things by asking an astrophysicist to give me a big list of facts about stars, and if the facts were different they'd have given me a different list. But richness fails: if you ask me "would stars behave the same way if the weak nuclear force had very different parameters?" or "were there stars before there were trees on earth?" or "if we brought five more stars like the sun about as close to the sun as the earth is, what would happen to the earth and its inhabitants?", I wouldn't be able to answer unless I got lucky and one of the answers was in my list.
But if both those properties do apply, then -- while of course anyone who isn't me is welcome to disagree -- I am happy to say that they "mean" what they say, or at least that what they say has meaning, and conveys actual understanding, and so on. At any rate, what they say behaves like what someone with actual understanding says: it's responsive to the facts, and it permits not only recitation of a few specific facts but something more general.
Those properties of robustness and richness can be present even when learning takes place only textually. How far they're present in today's LLMs is debatable (though e.g. I think no reasonable person can deny that they are present to an extent that phrases like "stochastic parrot" would lead one not to expect) but if they aren't there it isn't just because the LLMs learn about things only via text.
So thanks for your patience in explaining to mjburgess why they might be wrong in their arguments that LLMs "definitely" cannot reason--at least not to the degree of certainty they seem to believe. They're often here shutting down discussions about LLM reasoning, and they're seemingly oblivious to these considerations in their argument, despite others' attempts to explain it. I hope they're able to learn something this time from your response.
I understand why you might opt for modal safety conditions around meaning -- this again separates out 'apprehending meaning' from an ability to mean qua communication -- which i would dispute. But even if you separate these out, and say that to 'apprehend the meaning of X' is to safely 'emit X' on all the occasions across all possible worlds in which X -- that isn't the question.
I think the relevant question is whether an agent can mean what they say --- not whether a tool can serve as a model of meaning. No one disputes the latter. A dictionary is a model of meaning in this inert sense.
What fanatics of this technology want to do is say 'because the dictionary correlates words with their definitions, therefore the dictionary means what it says' -- and so on.
Thinking, reasoning, communicating, meaning -- these are all highly specific processes that involve agents in a very particular way. You can model any of them with twigs and clay if you like, as one can model anything likewise. You can model the solar system with ping pong balls.
This just isn't the question. The question is whether this particular sort of modelling relation implies the ability to 'mean what one says'. Whether a ping-pong ball on a string orbiting a melon is really experiencing gravitational force -- in the relevant sense, it isn't -- it's on a string.
Consider the children playing the imitation game, and imitating what their parents say. If the parents are reliably able to mean what they say, then the children will not err --- they will not violate your modal conditions. And so clearly, these modal conditions are insufficient.
It matters that the parents can mean 'the sun is a nuclear reaction' but the children cannot. It matters that the parents are the mechanism by which these words have meaning, and the children are not. It does not matter, in the relevant sense, that the children will reliably reproduce the parent's words.
The capacity for meaning is not obtained via modelling it. As with children, actors, LLMs, and all other forms of imitation -- this should be obvious to anyone not religiously obsessed with the converse belief
Anyway: you say (if I understand you right) that if those concepts diverge the only one you actually care about is agents meaning things. I think that's a mistake, because a lot of the questions we have reason to care about with today's AIs are not about that. Will AIs be able to do all our jobs better and cheaper than we can? That's about their external behaviour and how it relates to the world. Will AIs gain vast power and use it in ways that are very bad for us? Ditto. Will AIs enable new technological innovations that make us all much better off? Ditto. No one will be saying as the killbots destroy their cities "well, this isn't so bad; at least the machines don't really know what it is they're doing". No one will be saying as they enjoy the fruits of Fully Automated Luxury Gay Space Communism "you know, this whole thing feels empty because the machines that make all this possible don't really understand, they just behave as if they do".
If a melon+string+ball arrangement is a faithful enough model of the solar system and it somehow enables me to send spaceships to Uranus, or to discover that orbits are elliptical when I hadn't known it before, or something, then that's a thing of great value.
Your comment about children imitating adults sounds as if you haven't actually taken in the conditions I proposed, because children imitating what their parents say cannot in fact have the property I called "richness". If I talk to a child and they have learned to say some things about stars by listening to their parents, it will not help them when I ask them about something they haven't heard their parents say.
(One can imagine a situation where the child "passes" this test by just relaying everything I say to the parent and then imitating what they say back to me. But the point there isn't that the child is imitating, it's that the child is not really part of the conversation at all, I'm just talking to the parent. And it is clear that nothing like that is happening with AI systems.)
You may imagine that you make your argument more convincing by finishing it up with "anyone to whom this is not obvious is obviously religiously obsessed with some converse belief", but to me at least the opposite is the case.
ie., the semantics of natural language are grounded in possibilities, and apprehending possibilities is the function of the imagination. I was trying to simplify matters enough to make it clear that if an LLM says, "I have a pen in my hand" it isn't even lying.
I agree with you that the right test for proper language acquisition is modal: how would the system respond in situations S1..Sn. However the present mania for computational statistics has reduced this question to 'what is a y for a given x', as if the relevant counterfactual were a permutation of the input to a pregiven function. The relevant counterfactuals are changes to the non-linguistic environments that language serves to describe.
How is it that the parents continue to obtain this 'richness' and 'robustness' (ie., performance across changing environments) ? It is by themselves having the capacity to acquire and use meanings in relevant environments. This is something the children lack, and so do LLMs.
For the children to imitate the parents, and the LLM to function as the community of speakers -- those speakers must narrate at length in a manner which can be imitated. If a parent looks at the sky and sees a rocket ship they can be asked "did you see that spaceship?!" -- but the very young children cannot. They do not know what those words mean, and weren't looking at the sky; their whole attention is on trying to imitate the sounds they hear.
Likewise an LLM is limited to modelling non-linguistic shifts by waiting on enough new text being written about those shifts to be retrained on -- there is much reason to expect that nowhere near enough is written on almost all changes to our environment to enable this. The parents aren't going to repeat "there is a rocket ship in the sky" over and over just so the children can hear it. The parents don't need to: they can see. They do not need language to be responsive to linguistic interrogation.
The route LLMs use to obtain their performance is constructing a distribution over historical linguistic records of non-linguistic change, and sampling from this distribution. The mechanism we call 'intelligence' that employs meaning acquires such shifts by being-in-the-world to notice, engage, imagine, interrogate, create, etc. them.
This is where I am making the strong empirical claim: sampling from a distribution over historical language use is 'not enough'. It's fragile and shallow -- though its shallowness is masked by the false (Turing-esque) masquerade that we have to interact with the system through a single bamboozling I/O boundary: a prompt.
Via such an extreme narrowing of how this supposed linguistic agent is free to employ meaning, its engineers can rig the situation so that its fragility isn't as apparent. But in the end, it will be so.
The test for whether a system is using the meanings of words is, indeed, modal: change the non-linguistic environment (ie., the meanings) and the language ought to change. For LLMs this does not happen: they are very very very indirectly responsive to such shifts... because their mechanism of recording them is imitation.
(I mean, if you choose to use "mean" in a way that implies certain details in the causal history that I think aren't needed, and choose not to care about what AI systems do but only about that causal history, then all that's something you have the right to choose, and there's limited value in arguing about it. But if you have concrete predictions for what AI systems will be able to do, and what would need to change for them to be able to do more, then to me that's more interesting and more worth arguing about.)
So, I think we're agreed now that the kind of question that really matters, for determining how much understanding some entity has of the words it's taking in and spitting out, is: how does its linguistic behaviour depend on the actual world, and how would it vary if the world were different, and how will it vary as the world changes?
And I think we're agreed that today's LLMs learn about the world largely through their training process, which means that they have rather limited capacity once trained to adapt to the world, which puts real limits on what they can do.
(I think you would also say that it means they don't really understand anything, because they can't adapt in real time as those things change, but again I think that goes too far, firstly because there are plenty of things we consider ourselves to understand even though they lie far in the past and aren't going to change, and secondly because LLMs _do_ have some ability to learn in the short term: if you say "All snorfles are porfles and my dog is a snorfle" and ask it a few sentences later whether you have a porfle, it will probably be able to say yes and explain why.)
I am curious whether you think that, say, Helen Keller was dramatically less able to mean and understand things than most people, on account of being both deaf and blind and therefore dramatically less able to get new information about the world in real time other than "textually" (via Braille and the like). I think the available evidence strongly suggests that Keller was in fact able to understand the world and to mean things in pretty much the exact same way as the rest of us, which in turn strongly suggests that being connected to the physical world only through language isn't necessarily an obstacle to meaning and understanding things.
(Keller did have other links to the physical world; for instance, her tactile sense was perfectly OK. This is actually how she first managed to grasp the idea that the words Anne Sullivan was tracing out on her hands meant something. However, it doesn't seem credible to me that this rather "narrow" channel of information-flow was responsible for Keller's understanding of most of the things she understood.)
Suppose someone builds something like today's multimodal LLMs, but with constant real-time video input, and suppose it's trained on video as well as text. (It's not obvious to me how all that would work technically, but it seems to me that there are things worth trying, and I bet there are people trying them at OpenAI, Google, etc.) Would your objections then disappear?
No such LLMs exist, because they are trained to predict the next word, not (WebCamState, RobotArmState, NextWord) -- since, at least, no such corpus exists.
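(Purely to make the shape of that hypothetical corpus concrete: a sketch of what one record of such a dataset might look like. Every field name below is invented; nothing like this is claimed to exist.)

    # Hypothetical record shape for the kind of corpus the parent says does not
    # exist: interleaved perception/action/text steps, where the training target
    # would be the next *tuple* rather than the next word alone.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Step:
        webcam_frame: bytes            # encoded camera image at time t
        robot_arm_state: List[float]   # e.g. joint angles / gripper state at time t
        next_token: str                # the text token emitted at time t

    # A trajectory is just a sequence of such steps.
    trajectory: List[Step] = []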
Seems we might not be that far away given the work on action tokens[1] and such.
NVIDIA devoted a lot of the CES presentation to this kind of stuff[2].
[1]: https://arxiv.org/abs/2403.19578 (semi-random example)
They can also almost all interpret images now. If I tell an LLM that its objective is to look around until it finds its hand and tell me if it's holding a pen or not, is that not exactly what you're talking about? Every single step there is well within the grasp of even the less advanced multimodal LLMs.
For the record, the word "intelligence" appears in the two parts of "A Formal Theory of Inductive Inference" (referenced above) a total of 0 times. The word "Compression" appears a total of 0 times. The word "reasoning" once; in the phrase "using similar reasoning".
Unsurprisingly, Solomonoff's work was preoccupied with Inductive Inference. I don't know that he ever said anything about "compression is intelligence", but I believe this is an idea, and a slogan, that was developed only much later. I am not sure where it comes from, originally.
It is correct that Solomonoff induction was very much about predicting the next symbol in a sequence of symbols; not necessarily linguistic tokens, either. The common claim that LLMs are "in their infancy" or similar is dead wrong. Language modelling is basically ancient (in CS terms) and we have long since crossed into the era of its technological maturity.
_______________
Btw I think you're overestimating the degree to which what you say "logically follows".