I'm not sure anyone who works in the foundation-model space, and who doesn't directly depend on LLMs 'making it' for VC money, would claim differently. It is rather obvious at this point, but some companies are in too deep and not cash-rich enough, so they have to keep the LLM dream alive.
This is the problem. The vast majority of people over-hyping LLMs don't have even the most basic understanding of how simple LLMs are at their core (manifold-fitting the semantic space of the internet), and so can't understand why they are, theoretically, necessarily dead ends. This really isn't debatable for anyone with a full understanding of the training and basic dynamics of what these models do.
But, practically, it remains to be seen where the dead end with LLMs lies. I think we are clearly approaching plateaus both in academic research and in practice (people forget, or are unaware, how much benchmarks are being gamed), but even small practical gains remain game-changers in this space, and much of the progress and tradeoffs we actually care about can't be measured accurately yet (e.g. rapid development vs. "technical debt" from fast but not-understood, weakly-reviewed LLM code).
LLMs are, IMO, indisputably a theoretical dead end, and for that reason a practical dead end too. But we haven't hit that practical dead end yet.
In human terms, LLMs seem similar to talking without thinking, whereas we can also think as an activity separate from waffling on.
In AI research terms, DeepMind have done some interesting things with Mind Evolution and AlphaEvolve, the latter being the one that came up with a more efficient matrix multiplication algorithm.
(https://deepmind.google/research/publications/122391/
https://deepmind.google/blog/alphaevolve-a-gemini-powered-co...)
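(For anyone wondering what "a more efficient matrix multiplication algorithm" means in practice, here is a minimal sketch of the classic Strassen 2x2 scheme, which uses 7 multiplications instead of 8. This is not AlphaEvolve's algorithm, just an older result in the same spirit.)

    # Illustrative only: classic Strassen 2x2 scheme (7 multiplications instead of 8).
    # This is NOT AlphaEvolve's algorithm, just the kind of saving such work targets.
    import numpy as np

    def strassen_2x2(A, B):
        (a, b), (c, d) = A
        (e, f), (g, h) = B
        m1 = (a + d) * (e + h)
        m2 = (c + d) * e
        m3 = a * (f - h)
        m4 = d * (g - e)
        m5 = (a + b) * h
        m6 = (c - a) * (e + f)
        m7 = (b - d) * (g + h)
        return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                         [m2 + m4, m1 - m2 + m3 + m6]])

    A, B = np.random.rand(2, 2), np.random.rand(2, 2)
    assert np.allclose(strassen_2x2(A, B), A @ B)  # matches ordinary matmul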
If I had to steelman a counterargument, I'd handwave about RL and environments creating something greater than the semantic space of the internet, and then highlight the part you mention where we haven't reached a practical dead end. Maybe link out to the Anthropic interpretability work on models planning in advance, via poking at activations while working on a rhyming poem.
A good example would be trying to make an LLM trained on the entire internet do math proofs. Almost everything in its dataset tells it that the word "orthogonal" means "unrelated to", because that is how it is used colloquially. Only in the tiny fraction of math forums and resources it digested does the word actually mean something about the dot product, so clearly an LLM that does math well only does so by ignoring the majority of the space it was trained on. Similar considerations apply to using, e.g., vision-language models trained on "pop" images to facilitate the analysis of, say, MRI scans or LIDAR data. That we can make some progress in these domains tells us there is substantial overlap in the semantics, but it is obvious there are limits to this.
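(To spell out the contrast: the mathematical sense of "orthogonal" is about a zero dot product, nothing like the colloquial sense. A trivial check, with made-up example vectors:)

    # "Orthogonal" in the mathematical sense: the dot product equals zero.
    # (Colloquially the word just means "unrelated to".)
    import numpy as np

    u = np.array([1.0, 2.0])
    v = np.array([-2.0, 1.0])
    w = np.array([3.0, 1.0])
    print(np.dot(u, v))  # 0.0 -> u and v are orthogonal
    print(np.dot(u, w))  # 5.0 -> u and w are not orthogonal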
There is no reason to believe these (often irrelevant or incorrect) semantics learned from the entire web are going to help the LLM produce deeply useful math, MRI analysis, or LIDAR interpretation. Broadly, not all semantics useful in one domain are useful in another, and, more to the point, linguistic semantics clearly have limited relevance to much of what we consider intelligence (which includes visual, auditory, proprioceptive/kinaesthetic, and, arguably, mathematical abstractions). But it could well be that curve-fitting huge amounts of data from the relevant semantic space (e.g. feeding transformers enough Lean / MRI / LIDAR data) is in fact all we need, so that transformers are "good enough" for achieving most basic AI aims. It just is clearly the case that the internet can't provide all that data for all, or even most, domains.
EDIT: Also, Anthropic's write-ups are basically fraud if you actually understand the math. There is no "thinking ahead" or "planning in advance" in any sense: if pre-training sends you down certain paths, then yes, of course you can "already see" activations associated with future tokens; this is just what curve-fitting in N dimensions looks like, because there is nowhere else for the model to go. Actual thinking ahead means things like backtracking / backspace tokens, i.e. actually retracing your path, which current LLMs simply cannot do.
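(To make "backtracking / backspace tokens" concrete, here is a hypothetical sketch of a decoding loop with a genuine undo step. The "<bks>" token and the model.sample_next interface are invented for illustration and are not any real model's API.)

    # Hypothetical sketch of decoding with genuine backtracking.
    # BACKSPACE and model.sample_next are assumptions, not a real API.
    BACKSPACE = "<bks>"

    def decode_with_backtracking(model, prompt_tokens, max_steps=100):
        seq = list(prompt_tokens)
        for _ in range(max_steps):
            tok = model.sample_next(seq)  # assumed interface: sample one next token
            if tok == BACKSPACE and len(seq) > len(prompt_tokens):
                seq.pop()  # retrace: undo the previously generated token
            else:
                seq.append(tok)
        return seq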
There are probably good reasons why LLMs are not the "ultimate solution", but this argument seems wrong. Humans have to ignore the majority of their "training dataset" in tons of situations, and we seem to do it just fine.
LLMs could be a dead end, but aren't anywhere close to saturating the technology yet.
You have an ad hominem attack and your own personal anecdote, which are not an argument for LLMs.
The idea that LLMs will reach AGI is entirely speculative, not least because AGI is undefined and speculative.
Something is fundamentally missing with LLMs w.r.t. intelligence per watt. Assuming GPT-4 is around human-level intelligence, running it takes 2-4 H100s, so roughly the same intelligence delivered on kilowatts of silicon, and that doesn't include the rest of the computer.
That being said, we're willing to brute-force our way to a solution to some extent, so maybe it doesn't matter, but I'd say the fact that we humans don't use anywhere near that much energy is proof enough that we haven't perfected the architecture yet.
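(Rough back-of-the-envelope numbers, assuming ~700 W TDP per H100 and ~20 W for a human brain; both figures are approximate.)

    # Back-of-the-envelope: power draw of 2 H100s vs. a human brain.
    h100_tdp_w = 700   # approximate TDP per H100 (assumption)
    brain_w = 20       # commonly cited rough figure for a human brain
    n_gpus = 2         # low end of the 2-4 H100 estimate above
    ratio = (n_gpus * h100_tdp_w) / brain_w
    print(f"~{ratio:.0f}x the power of a brain, excluding the rest of the machine")  # ~70x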
The median monthly salary in Bangladesh is cheaper than the Cursor Ultra plan. And Cursor loses money.
An experienced developer in India makes around $20k.
I've been following Yann for years and in my opinion he's been consistently right. He's been saying something like this for a long time while Elon Musk and others breathlessly broadcast that scaling up would soon get us to AGI and beyond. Mark Zuckerberg bought into Musk's idea. We'll see, but it's increasingly looking like LeCun is right.
Also, almost everyone agrees the current architecture and paradigm, where you have a finite context (or a badly compressed one in Mamba / SSMs), is not sufficient. That plus lots of other issues. That said, scaling has delivered a LOT, and it's hard to argue against demonstrated progress.
> ...but you’d better not have him at the helm of something you expect to turn a profit on
I don't understand this distinction. Is anyone (besides NVDA) turning a profit on inference at this point?

No? LLMs are getting smarter and smarter. Only three years have passed since ChatGPT was released, and we already have models generating whole apps, competently working on complex features, solving math problems at a level only reached by a small percentage of the population, and much more. The progress is constant and the results are stunning. Really, it makes me wonder what sort of denial those who think this has been proven to be a dead end are in.
Right now the definition of AGI has been hijacked so much that it can mean absolutely anything.
A prime environment for snake oil salesmen like Altman and Musk.
No one has even given a rigorous definition of the I, much less the G qualifier.
Your argument says we should have flying cars by now, because cars kept on getting better.
LeCun says LLMs do text processing and so won't scale to AGI, just like a faster car can never fly (controllably).
Given that we have seen research from DeepSeek and Google on optimizing parts of the lower layers of deep neural networks, it's clear that a new form of AI needs to be created, and I agree that LeCun will be proven right.
Better to build that than to borrow tens of trillions to scale toward a false "AGI".
This is an absolutely crazy statement vis-a-vis reality and the fact that it’s so upvoted is an indictment of the type of wishful thinking that has grown deep roots here.
It is very clear, when you look at academic papers actually targeting problems specific to reasoning and intelligence (e.g. rotation invariance in images, adversarial robustness), that all the big companies are doing is fitting more data and spending more resources on human raters and the like to boost performance on (open) metrics, and that whatever actual gains are being made come only from milking what we know very well to be a limited approach. I.e. there are trivially basic problems that cannot be solved by curve-fitting models, which makes it clear that most current advances are indeed coming from curve (manifold) fitting. It just isn't clear how far we can exploit these current approaches, and in which domains that kind of exploitation is more than good enough.
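(A toy illustration of the curve-fitting point, my own and not from any cited paper: a fit that is near-perfect in-distribution can be wildly wrong just outside it.)

    # Toy example: degree-9 polynomial fit to sin(3x) on [-1, 1].
    # Excellent in-distribution, useless outside the training range.
    import numpy as np

    rng = np.random.default_rng(0)
    x_train = rng.uniform(-1, 1, 200)
    coeffs = np.polyfit(x_train, np.sin(3 * x_train), deg=9)

    x_in = np.linspace(-1, 1, 50)
    x_out = np.linspace(2, 3, 50)  # outside the training range
    print(np.abs(np.polyval(coeffs, x_in) - np.sin(3 * x_in)).max())    # tiny error
    print(np.abs(np.polyval(coeffs, x_out) - np.sin(3 * x_out)).max())  # huge error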
EDIT: Are people unaware Google Scholar is a thing? It is trivial to find modern AI papers that can be read without requiring access to a research institution. And HuggingFace, for example, collects trending papers (https://huggingface.co/papers/trending), and so on.
But the bizarre thing is, even though the productivity of SWEs is increasing, I don't believe there will be much movement on layoffs, because there isn't complete trust in LLMs; I don't see this changing either. In which case the LLM producers will need to figure out a way to increase the value of LLMs and get users to pay more.
And, again, this is ignoring all the technical debt of produced code that is poorly understood, weakly-reviewed, and of questionable quality overall.
I still think this all has serious potential for net benefit, and does now in certain cases. But we need to be clearer about spelling out where that is (webshit, boilerplate, language-to-language translation, etc) and where it maybe isn't (research code, legacy code, large codebases, niche/expert domains).
I am hopeful about LLMs for SWE, but the progress is currently contextual.
Even if LLMs could write great code with no human oversight, the world would not change overnight. Human creativity is still necessary to figure out what to build that will yield incremental benefits over what already exists.
The humans who possess such capability stand to win long-term; said humans tend to be those from the humanities and liberal arts.
Lol. This is the complete opposite of reality. You realize LeCun is memed for all his failed assertions about what LLMs cannot do? Look it up. You clearly have not been following closely, at all.
He has zero epistemic humility.
We don't know the nature of intelligence. His difficulties in scaling up his own research are a testament to this fact. This means we really have no theoretical basis on which to rest the claim that superintelligence cannot, in principle, emerge from LLM-adjacent architectures; how can we make such a statement when we don't even know what such a thing looks like?
We could be staring at an imperative definition of superintelligence and not know it, never mind that approximations to such a function could in principle be learned by LLMs (universal approximation theorem). It sounds exceedingly unlikely, but would you rather be comforted by false confidence, or be told the honest truth of what our current understanding of the sciences can tell us?
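(For reference, the universal approximation theorem being invoked here, due to Cybenko 1989 / Hornik 1991, says, informally:)

    % Universal approximation theorem, informal statement:
    % for any continuous $f$ on a compact $K \subset \mathbb{R}^n$ and any
    % $\varepsilon > 0$, there exists a one-hidden-layer network $g$ with a
    % non-polynomial activation such that
    \[ \sup_{x \in K} \lvert f(x) - g(x) \rvert < \varepsilon . \]

Note this is an existence result about approximation, not a statement about what can be learned from finite data.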
Karpathy is probably the most careful not to write off LLMs entirely, but he seems pretty skeptical.